Concordance-Based Data-Driven Learning Activities and Learning English Phrasal Verbs in EFL Classrooms

Document Type: Research Paper


1 Assistant Professor of TESOL, Shahid Beheshti University, Iran

2 M.A. in TEFL, Shahid Beheshti University, Iran


In spite of the highly beneficial applications of corpus linguistics in language pedagogy, it has not found its way into mainstream EFL. The major reasons seem to be teachers’ lack of training and the unavailability of resources, especially computers, in language classes. Phrasal verbs have been shown to be a problematic area of learning English as a foreign language due to their semantic opacity and the structural differences between English and learners’ first languages. To examine the pedagogic potential of corpus linguistics in the EFL context, the present study compared paper-based data-driven learning (DDL) activities, used as a substitute for online DDL activities, with activities based on dictionary entries in terms of their effect on learning phrasal verbs in both the short and the long run. To this end, the study adopted a quasi-experimental pretest-posttest control group design. The analysis of the data collected through an immediate posttest and a delayed posttest showed that the DDL activities led to greater improvements by the participants. Based on the results of the study, it is argued that paper-based DDL activities can be used effectively in EFL classes to enhance learning and help learners become more autonomous in their learning efforts.


Past decades have witnessed a tremendous interest in the use of language corpora and computer analysis tools for language education. What is more noticeable in the field of ELT, as mentioned by McEnery and Xiao (2011), is the indirect use of corpora. The direct use of corpora is still limited. The factors which limit it include, among others, access to computers and software, the knowledge and skills of teachers in mediating the use of corpora, and curricular requirements in terms of time and resources. One of the areas of indirect use of corpora which is more relevant to their direct use is syllabus design and materials development. Good examples of this category of use include McCarthy, McCarten, and Sandiford’s (2005) Touchstone book series, which makes use of the idea that developing materials based on frequency of use can expose learners to the vocabulary, grammar, and functions that they are likely to encounter in real life. A second example is Willis and Willis’ (1989) Collins COBUILD English Course, which uses a lexical syllabus based on the idea that knowledge of collocations is central to gaining competence in a language and fluency in its use (e.g. Cowie, 1994; Hoey, 2005).

The direct use of corpora involves a change in pedagogy which is assumed to be more useful in language learning for a number of reasons. For example, McEnery and Xiao (2011) take the three categories of the direct use of corpora (Leech, 1997), namely “teaching about” (using corpus linguistics as a subject), “exploiting to teach” (corpus-based teaching of courses in linguistics and language programs), and “teaching to exploit” (the use of corpora for data-driven learning, DDL), to argue that the methodology of DDL, which is relevant to the latter two categories, differs from the traditional “three Ps” methodology. Contrary to the top-down deductive approach of the “three Ps”, they argue, the “three Is” (Illustration, Interaction, and Induction) (Carter & McCarthy, 1995) or its equivalent “observation”, “classification”, and “generalization” (Johns, 1991) is a bottom-up inductive approach to learning. This approach is more conducive to independent learning because it makes learners agents in the learning process. It is therefore the case that the pedagogy of teaching learners to exploit corpora for learning purposes enables them to take control of the learning process.

As mentioned repeatedly in the literature (e.g. Boulton, 2009; Gaskell & Cobb, 2004; Scott & Tribble, 2006), the above-mentioned pedagogy paves the way for learners to adopt a more natural approach to learning as they adapt to the demands of the task of finding patterns in the data. In this process, it is more likely that they would use a number of cognitive processes summarized by O’Sullivan (2007) as predicting, observing, noticing, thinking, reasoning, analyzing, interpreting, reflecting, exploring, making inferences (inductively or deductively), focusing, guessing, comparing, differentiating, theorizing, hypothesizing, and verifying. The use of these processes on a regular basis can be considered the major source of independence from a learning perspective.

In spite of the potential advantages of this pedagogy for language learners, the use of corpora and its concomitant pedagogy has not yet gained ground in mainstream ELT, especially in the EFL context. Boulton (2010) gives three reasons for this state of affairs. The first reason is that the wider audience is still not convinced that the investment in terms of time, effort, money, and resources can make a meaningful difference. This has functioned as a barrier to DDL reaching its wider audience since the logical way of convincing the audience is to first show empirically that DDL makes meaningful differences and then to find ways of making it more user-friendly for teachers and learners. The second reason, Boulton continues, is that experimentation with corpus materials has been concentrated in higher education institutions, with the consequence of limiting its use to higher-level learners and more complex language items. Finally, the third reason mentioned by Boulton is the predominance of hands-on manipulation of corpus data. Hands-on manipulation is limiting in itself as it requires hardware and software, some level of technical expertise, and training for learners and teachers alike with regard to corpora and the way they work, the use of corpus data, and the use of software, all of which may not be available in mainstream language education institutions.

The above-mentioned reasons might explain the presumptions that have limited, to a large extent, the use of corpora to higher-level sophisticated learners and the need for sufficient training for both teachers and learners. Such training seems to be necessary as “a corpus is not a simple object, and it is just as easy to derive nonsensical conclusions from the evidence as insightful ones” (Sinclair, 2004, p. 2). However, as argued by Boulton (2009), although the use of any tool for learning, including dictionaries and concordances, involves some level of training, this should not give the impression that some aspects of these tools cannot be used with little training or even no training at all. One way to reduce the need for training is to increase teacher mediation in the process. As a preliminary stage to hands-on use of corpus material, Boulton (2010) proposed paper-based DDL materials in the form of worksheets prepared by the teacher in advance to be used in the classroom.

Boulton’s (2010) research provided results suggesting that paper-based DDL material has its uses alongside the dictionary as a traditional resource for learners of comparatively low language ability and regular teachers with little or no training in corpus linguistics. The outcome of this research seems promising given that the treatment lasted for only one hour of a ninety-minute session. Following the same line of research, the present study experimented with paper-based DDL material alongside dictionary resource material with lower-intermediate learners over an extended period of treatment focusing on phrasal verbs.



Corpus-based techniques like DDL have generated a lot of interest among those who are involved in second and foreign language teaching due to their strong theoretical background, their alignment with the philosophy of language teaching, and their practical value in language classes and with different groups of learners around the world. DDL activities lend themselves well to many different aspects of language. A case in point is teaching vocabulary. Researchers have tried to elaborate on the uses and advantages of using DDL or concordance lines in the classroom to teach vocabulary to second or foreign language learners (Cobb, 1997, 1999; Horst, Cobb, & Nicolae, 2005; Pickard, 1994; Stevens, 1991; Thurstun & Candlin, 1998). In most of these studies, researchers have tried to demonstrate how traditional vocabulary drills like gap-filling or matching exercises, which have the added value of being based on authentic data drawn from a corpus, enable learners to get the meaning of words by being exposed to multiple and at the same time novel contexts (Stevens, 1991; Thurstun & Candlin, 1998). Teachers and language educators have also tried to demonstrate the efficiency of online self-access vocabulary packs, which provide language learners with extensive vocabulary learning opportunities by using concordance data as an auxiliary to traditional vocabulary teaching materials (Horst, Cobb, & Nicolae, 2005; Pickard, 1994).

Researchers who have commented on the potential facilitative effects of corpus consultation on SLA processes believe that the same procedures which corpus linguists use to conduct descriptive studies of language can be taught to and used by language learners themselves to promote SLA processes (O’Keeffe, McCarthy, & Carter, 2007). For example, Aston (1995) argues that concordance lines expose learners to contextual repetition and variation of linguistic structures, promoting a process of synthesis and analysis of information on their part, which is the key to the acquisition process. In addition, other researchers such as Thurstun and Candlin (1998) and Conrad (2005) have noted that engaging students in corpus-based activities promotes noticing or consciousness-raising.

In addition to facilitating SLA, corpus-based activities are viewed as being consistent with a variety of CLT principles and its learning goals. First, concordance output exposes learners to linguistic phenomena in authentic contexts which they have to analyze and categorize inductively. The process is supposed to lead them to hypothesis testing as a precursor to learning the rules of the language on their own. Furthermore, the new role of the learner as researcher shifts the control of learning by the teacher to the student, making the classroom more student-centered while these activities are going on. Finally, corpus-based activities are thought to increase learner autonomy “as students are taught how to observe language and make generalizations rather than depending on a teacher” (Conrad, 2005, p. 402).

In spite of the potential advantages of corpus-based teaching and learning referred to in the literature (e.g. Johns, 1991; Sinclair, 1997), the range of experimentation with DDL is very limited. In a survey of the experiments conducted in the domain of DDL work, Chambers (2007) found that most of them were small-scale studies which were mostly qualitative in nature with limited applicability to language classes. In another more extensive survey of the same domain, Boulton (2009) found that most of the 39 empirical studies surveyed had focused on gauging participants’ perceptions toward DDL, rather than the real outcomes of their activities using DDL. In the rest of the studies, Boulton found that although they could be categorized among quantitative studies with real outcomes dealing with language learning, the results were not statistically significant or if they were, they could hardly be attributed to the impact of the treatment.

In order to extend the range of experimentation with DDL to lower-level learners and reduce the need for training, Boulton (2010) proposed paper-based DDL materials as a substitute for hands-on DDL work. A survey of experimentation with paper-based DDL is presented below. Ciesielska-Ciupek (2001) used paper-based materials as a supplement to a course book with junior school students in Poland. The tests showed positive results, but the measures were not statistically analyzed. In a study by Allan (2006) with 18 advanced English students enrolled in an exam preparation course in Ireland, the experimental group worked with paper-based corpus materials for 12 weeks. The participants had to work with the materials outside class. Statistical analyses showed that the experimental group outperformed the control group of five participants. Tian (2005), in a more elaborate quantitative study, worked with 50 participants. The experimental group worked with paper-based DDL activities from online news sources with some L2 items for five weeks. The control group of the same size was taught rules for the same L2 items. Although no significant differences were observed between the treatments and the levels, the DDL group made more substantial improvements. In another study, Koosha and Jafarpoor (2006) studied a group of Iranian language learners who were supposed to deduce prepositional collocations from concordances over 15 sessions. Two hundred participants, divided into two groups of 100 as control and experimental, worked with printed concordance lines. The results showed that the experimental group outperformed the control group in the use of the L2 elements. Yet another experimental study, conducted by Boulton (2008), examined the ability of lower-level L2 learners in France to work with DDL without training. The results showed that the lower-level participants succeeded in drawing relevant patterns from the concordance lines and used the patterns in new situations. In a follow-up study with the same participants, Boulton (2009) demonstrated that the participants were able to use concordance lines more effectively than traditional tools (i.e., dictionaries and grammar-usage manuals) while dealing with connectors. The results of a delayed posttest administered 10 days later showed that the participants improved significantly over the period between the pretest and the posttest. From the results of these few studies, in spite of their limitations, it can be concluded that paper-based materials can be considered a valuable teaching resource.


Dictionaries and Vocabulary Learning

Dictionaries are considered among the most important tools of language learning. According to Laufer (2009), without a dictionary, vocabulary learning is an amalgamation of guessing and ignoring unknown words. It is clear that when words are ignored, they are not likely to be remembered. Overuse of wild guessing is not successful either, and it often leads to non-retention or false retention of words, that is, retention of incorrect meanings (Laufer, 1997). In line with these claims, some studies have shown that when words are looked up in a dictionary, some of them are retained (Knight, 1994; Luppescu & Day, 1993), and that the retention of words looked up in a dictionary is better than that of words inferred from context (Mondria, 1993) or explained by the teacher (Hulstijn, Hollander, & Greidanus, 1996). This observation is the basis for the involvement load hypothesis proposed by Laufer and Hulstijn (2001), according to which creating a need for a word, making the user search for it, and including evaluation in the process would have a better effect on the retention of its meaning. Dictionary consultation provides room for all three conditions. Today’s ubiquity of electronic dictionaries has paved the way for easier use of this precious resource in language classrooms for vocabulary learning and teaching.


Phrasal Verbs

No doubt, mastering English phrasal verbs is a great challenge for L2 learners. Several reasons have been mentioned for the difficulty of learning phrasal verbs among which the ones mentioned frequently are their ubiquity in all registers, productivity, and syntactic and semantic complexity. Many recent studies have shown that these difficulties often lead to participants’ avoidance of phrasal verbs in writing and speaking (Dagut & Laufer, 1985; Hulstijn & Marchena, 1989; Laufer & Eliasson, 1993; Liao & Fukuya, 2004). A number of factors are assumed to contribute to this avoidance among which we can refer to the effect of context of language learning, interference from the participants’ L1, participants’ proficiency, and problems with interpreting their meanings (Ghabanchi & Goudarzi, 2012).

As phrasal verbs, consisting of a verb and a particle, act like single words, it is not always possible to interpret their meaning by analyzing their components. In fact, there are a large number of phrasal verbs whose meanings are not transparent, that is, not recoverable from the analysis of their constituent parts. They act like idioms, and “such idiomatic meanings make participants feel that they are difficult to learn and to use, although learners of English recognize their importance” (Cheon, 2006, p. 1). Liao and Fukuya (2004) divided phrasal verbs into two groups:

1. Literal phrasal verbs: whose meanings are vivid, that is, recoverable from the analysis of their parts (e.g., go in, walk out)

2. Figurative phrasal verbs: whose meanings are opaque, that is, unrecoverable from the analysis of their parts (e.g., walk off, act on)

It is the second category which poses a lot of problems for L2 learners. As Dagut and Laufer (1985), Laufer and Eliasson (1993), and later Liao and Fukuya (2004) have shown in their studies of Hebrew, Swedish, and Chinese participants, the difficulty involved in using the right phrasal verbs makes participants misinterpret received messages, avoid using these structures, and erroneously use one-word verbs in their place.

The problem of avoidance has attracted a lot of attention in different contexts and under different conditions. Dagut and Laufer (1985) studied phrasal verb use by advanced Hebrew-speaking L2 learners of English, whose L1 lacks such constructions. They chose 15 phrasal verbs preferred by native speakers and used them in their study to check whether they were also preferred by the Hebrew-speaking participants. The results showed that most of the participants avoided phrasal verbs and preferred one-word verbs instead. The authors attributed the result to the differences between the structure of the participants’ L1 and L2. In a later study, Hulstijn and Marchena (1989) hypothesized that learners with non-Germanic native languages would tend to avoid phrasal verbs because of the absence of such structures in their L1. The hypothesis predicted that Dutch L2 learners of English would not experience the same problem since “phrasal verbs are a peculiarity of the Germanic languages” (Waibel, 2007, p. 23). They tested two groups of Dutch participants and concluded that, in addition to the structural differences between L1 and L2, semantic difficulty may account for the challenges the participants had with mastering phrasal verbs, which made them use one-word verbs instead.

The avoidance of phrasal verbs has also been investigated in the Iranian context. Ghabanchi and Goudarzi (2012) analyzed the role of the type of phrasal verb, the type of test, and the proficiency of the learners in avoiding English phrasal verbs. They tested two groups of intermediate and advanced participants, used three types of tests (i.e., multiple choice, translation, and recall tests), and examined two types of phrasal verbs: literal and figurative. Their results indicated that the type of test and the type of phrasal verb affected the avoidance of phrasal verbs, but proficiency level did not. Thus, they concluded that the structural and semantic complexity of phrasal verbs accounted for their avoidance by Iranian L2 learners of English. In another study, Khatib and Ghannadi (2011) examined the effect of interventionist and noninterventionist methods on acquiring phrasal verbs and reducing the avoidance of these structures by Iranian L2 learners of English. They divided the participants into three groups: a noninterventionist control group, an experimental implicit group, and an experimental explicit group. The treatment lasted 10 sessions. Their results showed that the interventionist groups outperformed the noninterventionist group in both recognition and production of phrasal verbs.



Considering the difficulty of learning phrasal verbs due to the factors mentioned above, the search for a sound teaching technique to cope with the challenges of their teaching and learning in EFL and ESL settings is worth the effort. In line with this consideration and the ones mentioned with regard to the application of corpus-based DDL and dictionaries to vocabulary teaching, the purpose of the present study was to examine the efficacy of DDL activities versus the more traditional method of dictionary-based activities on participants’ achievement of phrasal verbs. The study sought to answer the following research question:

  1. Is corpus-based teaching of phrasal verbs more effective than their teaching by dictionary definitions in terms of immediate and delayed achievement?



The study was designed to compare the effect of teaching phrasal verbs through dictionary use and through DDL activities on the participants’ learning of these constructions in the short and the long run. To this end, the participants were divided into two groups: one group to work with dictionary definitions and the other with concordance lines. The data were collected by means of a pretest, an immediate posttest, and a delayed posttest. The participants had no access to computers or laboratories, so paper-based DDL materials in the form of handouts were used. The dictionary group had the same handout with dictionary definitions instead of concordance lines.



The participants were freshmen university students taking part in general English classes of a private language institute in a western central city in Iran. The reason these students had decided to register for a general English course was that they considered their English inadequate for IELTS or TOEFL certification purposes. Thirty-four students participated in the present study based on availability, 17 in one class to work with dictionary definitions and 17 in a second class to work with the paper-based DDL activities. The course book series used in the general English course was Top-notch 1B series (Saslow & Ascher, 2006). All the participants were male with Persian as their L1. They all consented to take part in the treatment sessions.

Their familiarity with English was typical of most Iranian university students: 7 years of compulsory English classes at secondary school, 2 to 4 hours a week, under a didactic English language curriculum emphasizing language elements with a special focus on the skill of reading. At university, students typically go through two EAP courses in the first year of their studies. In general, the level of English of the participants, as estimated by the institute’s placement procedure, was low-intermediate.


Materials and Instruments

The Pearson free-access website records 3,274 English idiomatic phrasal verbs (2012). Consistent with the purpose of the study and the assumption that the second category of phrasal verbs, i.e., idiomatic phrasal verbs, poses most of the problems for L2 learners, it was decided to choose the phrasal verbs from this category. One hundred idiomatic phrasal verbs were chosen from the database of this website using systematic random sampling. The site itself provides definitions for every phrasal verb listed. Because there were to be 14 sessions, and it had been decided to focus on five items each session, 70 phrasal verbs were needed. In order to make sure that the participants had no prior knowledge of these verbs, the 100 sampled verbs were given to the participants in both groups, who were asked to put a check mark against the verbs they knew and provide their equivalents in their L1. The results of this test made it clear that very few items were familiar to the participants. Based on the analysis of the results, 30 phrasal verbs were excluded from the list of 100. The 70 remaining phrasal verbs were transformed into multiple-choice test items. The developed test was piloted with a group of 30 students majoring in English translation at the city’s state university. The reliability of the final version of the test was 0.72. To obviate the practice effect, it was decided to use the final version of the test to generate two equivalent tests. To this end, the quiz builder software version 2.00 (Pro QuizV2, 2012) was used. This software can alter the order of the items and shuffle the alternatives.
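The sampling and form-generation steps described above can be illustrated with a minimal Python sketch. It is not the procedure or the software the researchers used: the function names, the placeholder verb list, and the item format (stem, options, index of the correct option) are assumptions made for illustration only.

```python
import random


def systematic_sample(population, n, rng):
    """Pick n items at a fixed interval after a random starting point."""
    step = len(population) // n      # sampling interval k
    start = rng.randrange(step)      # random start within the first interval
    return [population[start + i * step] for i in range(n)]


def shuffled_form(items, rng):
    """Build an equivalent test form by shuffling items and their options.

    Each item is a tuple (stem, options, index_of_correct_option).
    """
    form = []
    for stem, options, answer in items:
        order = list(range(len(options)))
        rng.shuffle(order)                       # new position -> old position
        new_options = [options[i] for i in order]
        form.append((stem, new_options, order.index(answer)))
    rng.shuffle(form)                            # reorder the items themselves
    return form


rng = random.Random(2012)                        # fixed seed for repeatability
verbs = [f"verb_{i}" for i in range(3274)]       # placeholder for the database
sample = systematic_sample(verbs, 100, rng)
print(len(sample), sample[:3])
```

With 3,274 entries and a sample of 100, the interval is 32, so every 32nd verb after the random start is selected. In `shuffled_form`, the answer key is carried along with each item so that the two forms remain scorable after shuffling.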

The teaching materials consisted of two handouts covering the same language items for both groups with two different types of activities:

  1. Paper-based DDL activities
  2. Dictionary definition activities

The first handout covered the 70 phrasal verbs through paper-based DDL activities, and the second covered them through dictionary definitions. The first handout contained 14 units. Each unit began with 5 multiple-choice items to introduce the phrasal verbs covered in that unit. Then these phrasal verbs were introduced through 10 concordance lines. True/false, matching, and gap-filling exercises, which followed the concordance lines, were designed with the aim of consolidating the meaning of the phrasal verbs.

The format of the DDL activities was similar to that of the activities designed by Thurstun and Candlin (1998), and the same procedure was followed in designing them. The only difference was that because the present study was conducted in an EFL context, more emphasis was put on noticing the meaning of phrasal verbs than on the ability to produce them. The concordance lines were derived from Mark Davies’ Corpus of Contemporary American English (COCA). COCA was considered a good choice because it is freely available on the web, has its own built-in concordancer, SARA, and is composed of the type of language the participants in the present study were being exposed to, i.e., American English.
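For readers unfamiliar with concordance lines, the following Python sketch shows how a keyword-in-context (KWIC) display of the kind printed on the DDL handouts can be produced from plain text. It is an illustration only: the `kwic` function and the sample sentences are hypothetical and are not taken from COCA or from the study's materials.

```python
import re


def kwic(text, phrase, width=30):
    """Return one keyword-in-context line per occurrence of phrase."""
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keyword forms a central column
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines


# Invented sentences standing in for corpus text
sample_text = (
    "The committee promised to act on the report within a month. "
    "Officers are required to act on any credible information. "
    "Few managers act on feedback from junior staff."
)
for line in kwic(sample_text, "act on"):
    print(line)
```

The sketch matches only the exact form of the phrase, so inflected forms such as "acted on" would need a broader pattern. Aligning the keyword in a central column is what lets learners scan the contexts on either side for recurring patterns.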

The handout for the dictionary definition activities followed the same format except that the 10 concordance lines were replaced by dictionary definitions and two example sentences. The dictionary definitions and example sentences were derived from the Pearson website mentioned above.

Data Collection Procedure

One of the researchers undertook the experiment. Because the participants were thinking of sitting for IELTS or TOEFL, they were very eager to receive whatever help the instructor deemed necessary. The first session was devoted to introducing the participants of both groups to the course. In this session, it was emphasized that two extra sessions per week had been scheduled for them in addition to the usual sessions of the course to boost their knowledge of vocabulary. It was also emphasized that these two sessions would be optional, that the exam results would be used for research purposes, and that those attending the extra sessions would receive feedback on their performance in the exams. Almost all of the language course participants volunteered to sign up for the extra sessions. The participants who signed up were given a consent form to complete. After the study was completed, they received feedback with regard to their performance in the exams.

The treatments were randomly assigned and the handouts were distributed. Because none of the participants in the DDL group had any experience of using DDL activities, a 30-minute training session was organized for them. The pretest was conducted in the first week, and the results convinced the researchers that the groups were homogeneous with regard to their familiarity with the selected phrasal verbs. The experimental sessions began in the second week. The 14 treatment sessions, lasting 90 minutes each, were extended over the whole summer term. In each session, the participants were instructed that a set of teaching procedures was supposed to be followed in covering each unit, and they were expected to actively take part in all phases and activities of the unit. For a DDL unit, the participants were asked to answer 5 multiple-choice items of the phrasal verbs to be covered in the unit. Then, they were asked to go through the concordance lines and then go back to the multiple-choice questions and re-examine their answers.

The same procedure was followed in the dictionary definition group in which the participants at first answered 5 multiple-choice items, and then they went through the dictionary definitions with the aim of checking and re-examining their answers to the multiple-choice questions. The two groups followed these procedures until all the handouts were covered. Soon after the work with the handouts was over, the participants took the immediate posttest. The two groups took part in this test. Because the study was to assess the two methods in terms of immediate as well as delayed comprehension of phrasal verbs, it was necessary to have a delayed posttest. Nearly five weeks after the immediate posttest, the second version of the final test was administered to both groups.


Data Analysis

The data were analyzed in two phases. First, they were tested for normality using the Kolmogorov-Smirnov test, which shows whether the data collected in a study are normally distributed. In the second phase of the analysis, a t test was run on the means of the immediate posttest and the delayed posttest to answer the research question.
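The first phase can be sketched in Python using only the standard library. This is a simplified illustration and not the software output reported in the study: the scores are hypothetical, the normal distribution is fitted to the sample itself (which makes the asymptotic p value conservative), and Z is assumed to follow the common convention Z = √n · D.

```python
import math
from statistics import NormalDist, mean, stdev


def ks_normality(scores):
    """Return (D, Z, p) for a one-sample K-S test against a fitted normal."""
    n = len(scores)
    fitted = NormalDist(mean(scores), stdev(scores))
    d = 0.0
    for i, x in enumerate(sorted(scores), start=1):
        cdf = fitted.cdf(x)
        # Largest gap between the empirical and the fitted CDF at this point
        d = max(d, i / n - cdf, cdf - (i - 1) / n)
    z = math.sqrt(n) * d
    # Asymptotic Kolmogorov tail probability (alternating series)
    p = 2 * sum((-1) ** (k - 1) * math.exp(-2 * (k * z) ** 2) for k in range(1, 101))
    return d, z, min(1.0, max(0.0, p))


# Hypothetical scores for one group of 17 learners (not the study's data)
scores = [31, 36, 38, 40, 41, 43, 44, 45, 46, 47, 48, 50, 52, 53, 55, 58, 62]
d, z, p = ks_normality(scores)
print(f"D = {d:.3f}, Z = {z:.3f}, p = {p:.3f}")
```

A p value above 0.05 means that the test gives no grounds for rejecting normality, which is the condition the t test in the second phase relies on.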




Before testing the hypotheses of the research, it was necessary to check the normality of the data. To check this, the Kolmogorov-Smirnov test was conducted. The results of this test are presented in Table 1.

As can be seen in Table 1, the p values exceed the significance level of 0.05. This result shows that none of the score distributions differs significantly from a normal distribution, and so it can be concluded that all the data used in the experiment were consistent with the normality assumption.


Table 1: Kolmogorov-Smirnov test for the normality of the data

Group and Test                       Std. Deviation    P Value    Kolmogorov-Smirnov Z
Dictionary Group Posttest
DDL Group Posttest
Dictionary Group Delayed Posttest
DDL Group Delayed Posttest

To answer the research question, a t test was run on the means of the two groups on the immediate posttest first. The results of this test are presented in Table 2.

As can be seen in Table 2, with t(32) = 2.85, p = .008, the mean difference was significant. The means (M = 53.12 for the DDL group and M = 43.76 for the dictionary definitions group) show that the difference is in favor of working with DDL activities. This suggests that using DDL activities was more effective than consulting dictionary definitions in terms of immediate comprehension.
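The form of the comparison reported here can be sketched as follows. The sketch assumes an independent-samples t test with pooled variance, which is consistent with the reported 32 degrees of freedom (17 + 17 − 2), but the source does not state which variant was run. The scores below are hypothetical and will not reproduce the study's statistics.

```python
import math
from statistics import mean, variance


def pooled_t(group_a, group_b):
    """Independent-samples t statistic with pooled variance; returns (t, df)."""
    n1, n2 = len(group_a), len(group_b)
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * variance(group_a) + (n2 - 1) * variance(group_b)) / df
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical posttest scores, 17 learners per group (not the study's data)
ddl = [38, 42, 45, 47, 49, 50, 52, 53, 54, 55, 56, 58, 59, 61, 63, 66, 70]
dictionary = [28, 33, 36, 38, 40, 41, 42, 43, 44, 45, 46, 47, 49, 50, 52, 55, 59]
t, df = pooled_t(ddl, dictionary)
print(f"t({df}) = {t:.2f}")
```

The sign of t depends only on which group is entered first, so a positive value in the text and a negative value in a table can describe the same difference.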


Table 2: T-value to check the effectiveness of both methods in teaching phrasal verbs (immediate posttest)

Group               Std. Deviation    Degree of Freedom    p value    t value
Dictionary Group                      32                   0.008      -2.85
DDL Group

A second t test was run on the means of the two groups in the delayed posttest. The results are presented in Table 3.

As can be seen in Table 3, with t(32) = 3.90, p = .001, the mean difference was significant. As the means show (M = 30.06 for the dictionary definitions group and M = 36.65 for the DDL group), this difference is in favor of the DDL group. Based on this result, we can conclude that teaching phrasal verbs based on the corpus-based DDL model was more effective than teaching them based on dictionary definitions in terms of delayed comprehension.

Table 3: T-value to check the effectiveness of both methods in teaching phrasal verbs (delayed posttest)

Group               Std. Deviation   Degree of Freedom   P Value   t value
Dictionary Group                     32                  0.001     -3.90
DDL Group
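
As a consistency check on the reported t tests, the two-tailed p value can be recovered from a t value and its degrees of freedom. The sketch below is one possible way to do this with only the Python standard library, integrating the Student t density with Simpson's rule. The function names are illustrative, and this is an assumption about how a reader might verify the figures, not the authors' procedure.

```python
import math


def t_pdf(x, df):
    """Probability density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)


def two_tailed_p(t, df, steps=20000):
    """Two-tailed p value for a t statistic.
    Integrates the density over [0, |t|] with Simpson's rule
    (steps must be even), then uses the symmetry of the distribution."""
    b = abs(t)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 1 - 2 * area


# Values reported in the text: t(32) = 2.85 and t(32) = 3.90.
for t_value in (2.85, 3.90):
    print(f"t(32) = {t_value}: two-tailed p = {two_tailed_p(t_value, 32):.4f}")
```

With 32 degrees of freedom, t = 2.85 yields a p value between .005 and .01, in line with the reported .008, and t = 3.90 yields a p value below .001. The sign of t only reflects the order in which the group means are subtracted, so it does not affect the two-tailed result.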






The study tested the difference between DDL activities and dictionary definitions in terms of their effect on learning figurative phrasal verbs, which have been empirically shown to be challenging for Iranian language learners (Ghabanchi & Goudarzi, 2012) because of their syntactic and semantic complexity. The two methods addressed the semantic difficulties of the phrasal verbs through concordance lines and dictionary definitions, respectively. The findings showed that the DDL activities were more effective than the dictionary definitions. The results are interesting because they provide evidence for the effectiveness of paper-based DDL activities in helping low-intermediate learners improve their knowledge and understanding of phrasal verbs. The language items for treatment were deliberately selected from among the problematic aspects of language, consistent with the common assumption among researchers in this area that problematic aspects are more in need of new techniques than less problematic items, which are adequately presented and practiced through traditional methods and techniques (see Boulton, 2009, 2010). The fact that the participants, who had almost no familiarity with concordance lines, managed to glean the meaning of phrasal verbs from concordance lines much better than from dictionary definitions suggests that paper-based DDL materials can be a better alternative to hands-on corpus work in situations where learners are of relatively low language proficiency and are new to DDL.

This finding further suggests that longer treatment periods are required for DDL work to be effective, since most studies conducted with very few treatment sessions have reported no difference between DDL activities and traditional activities. As an example of these studies, Boulton (2009) analyzed the ability of the research participants to extract the meaning of linking adverbials. The results showed gains for both the DDL group and the traditional reference material group between the pretest and posttest, though no significant difference was found between the two groups at the posttest. In a subsequent study, Boulton (2010) found that both the traditional and DDL treatment groups improved between the pretest and posttest; however, the posttest difference between the two groups was not significant. In both studies, the treatment lasted for one session.

The reasons for the superiority of DDL may be manifold. One reason may be that, in working with DDL activities, participants are flooded with an enormous amount of data, which increases the likelihood of incidental learning, although incidental learning by nature might account for only a small percentage of the total learning. Cobb (1997) found that consulting words in concordance lines led to small but consistent gains in participants’ knowledge of vocabulary. Bernardini (2002) and Gavioli (1997) explain that DDL activities allow participants to arrive at developmentally appropriate conclusions about the linguistic structure being analyzed. This might explain Ilse’s (1991) observation that the students took away less factual information at the end of a DDL lesson than they would have if the lesson had been taught in a traditional didactic way. Aside from the benefits attributed to DDL work in general, the mere exclusion of the computer from the scene may be considered an important factor in obtaining the results. DDL activities presented to participants in the form of printouts are limited compared to hands-on DDL work, but this may be considered an advantage (Aston, 1997; Chambers, 2007). This limitation reduces some of the cognitive burden the participants have to bear in tackling unlimited on-line data, especially at early stages of language learning, because it allows them to focus on a single element at a time. This is in contrast with the challenges they face while working on hands-on DDL activities because, in that case, the materials, the technology, and the method are all new and difficult for learners to manage, especially at lower levels (Gavioli, 2005). As a further advantage, paper-based DDL activities (Bernardini, 2002) help "technophobic" learners focus on language items. Additionally, these printed exercises are accessible outside the classroom, and participants can consult them on their own in their free time. This is not the case with hands-on DDL activities because the data on the web fade rapidly. In sum, from an educational point of view, as mentioned by Sun (2003), the learning curve in working with computers “is arduously steep, in that students tend to get confused easily about the concordance outputs; thus, they need either a stronger degree of teacher involvement, or to learn in a more structured environment” (p. 609). Introducing paper-based DDL materials seems to provide a “convenient way of introducing concordance-based methods and as preparation for using a full concordancer” (Johns, 1997, p. 113).

The advantages of DDL work mentioned above should not distract us from the results obtained by the dictionary group. Comparing the pretest results with the immediate and delayed posttest results of this group, we can observe that the dictionary group also made considerable gains, although not as large as those of the DDL group. The participants in this group started the activities with more confidence because they had the correct answer at hand and were not forced to work out the meaning. All they had to do was work with that meaning in all phases of the activity. After reading the definitions and the two example sentences, they were able to refer back to the test and re-examine their answers with greater confidence and ease. As the instructor verified, most of their answers were correct even without completing the set of planned activities before referring back to the multiple-choice questions. This may be attributable to the nature of consulting reference materials. In consulting a dictionary, learners have confidence in what they are doing, which helps mitigate their anxiety and frustration and, in effect, increases their motivation. Comparing the two methods of treatment in the present study, we can conclude that both are useful. However, the group working with concordance lines had the advantage of coping with language in its natural complexity. The considerable length of the treatment period might have helped them gain more confidence in dealing with the meaning of phrasal verbs and, as a result, outperform the other group on the immediate and delayed posttests.



The participants in the present study worked on one of the problematic aspects of the English lexicon, i.e., phrasal verbs, an aspect of language study which often leads to avoidance on the part of many English L2 learners from different language backgrounds. Furthermore, contrary to the common belief that DDL is appropriate only for advanced learners who have received extensive training beforehand, the participants in this study were not at a high level of language ability and received only a very limited amount of training. The study sought to overcome the major shortcoming of previous research in this domain by extending the treatment over one complete semester.

The procedure and the materials used in this study could have implications for language teachers as well as language learners. Today, with the availability of free corpora on the internet, it is not very difficult for teachers to bring corpora into their classrooms. Teachers can familiarize their students with corpus search and prepare them for the task by using printed handouts initially. This study also has implications for materials developers. By incorporating corpus-based or corpus-informed activities in teaching materials, materials developers can help learners learn vocabulary items in rich contexts. This experience is more likely to provide learners with insights into collocations and contextualized grammatical structures. The rich context of concordance lines within which the language items are introduced provides considerable opportunities for learners to broaden their lexical and grammatical awareness.

It is also worth mentioning that the intervention introduced in this study was intended to encourage the participants to work on vocabulary on their own, i.e., to boost their independent vocabulary learning. Therefore, instruction was limited to introducing the materials and supervising the participants while they did the activities. In fact, the materials were selected based on the assumption that, in vocabulary learning through DDL and through dictionary definitions, learners are supposed to work on their own and ultimately become independent learners capable of using available materials.

Although the present study has yielded some significant results, its design is not without flaws. A number of caveats should be observed regarding the reported results. First, learners benefit differently from different learning tools. This implies that identifying differential rates of learning might give a better picture of how concordance lines and dictionary definitions work in practice. However, the question of differential rates of learning was beyond the scope of the present study. Second, learners’ attitudes toward the tools they use are an important factor in their learning achievement. This factor was also beyond the scope of the study. Future research will be more convincing if researchers incorporate the above-mentioned factors into the design of their studies.


Allan, R. (2006). Data-driven learning and vocabulary: Investigating the use of concordances with advanced learners of English. Centre for Language and Communication Studies. Occasional Paper 66. Dublin: Trinity College.

Aston, G. (1995). Corpora in language pedagogy: Matching theory and practice. In G. Cook & B. Seidlhofer (Eds.), Principle and practice in applied linguistics (pp. 257-270). Oxford: Oxford University Press.

Aston, G. (1997). Involving learners in developing learning methods: Exploiting text corpora in self-access. In P. Benson & P. Voller (Eds.), Autonomy and independence in language learning (pp. 204-263). London: Longman.

Bernardini, S. (2002). Exploring new directions for discovery learning. In B. Kettemann & G. Marko (Eds.), Teaching and learning by doing corpus analysis (pp. 165-182). Amsterdam: Rodopi.

Boulton, A. (2008). Looking for empirical evidence for DDL at lower levels. In B. Lewandowska-Tomaszczyk (Ed.), Corpus linguistics, computer tools, and applications-state of the art (pp. 581-598). Frankfurt: Peter Lang.

Boulton, A. (2009). Testing the limits of data-driven learning: language proficiency and training. ReCALL, 21(1), 37-51.

Boulton, A. (2010). Data-driven learning: taking the computer out of the equation. Language Learning, 60(3), 534-572.

Carter, R. & McCarthy, M. (1995). Grammar and the spoken language. Applied Linguistics, 16(2), 141-158.

Chambers, A. (2007). Popularizing corpus consultation by language learners and teachers. In E. Hidalgo, L. Quereda, & J. Santana (Eds.), Corpora in the foreign language classroom (pp. 3-16). Amsterdam: Rodopi.

Cheon, Y. (2006). A pilot study in learning English phrasal verbs. Unpublished Master Thesis, University of Pittsburgh, Pennsylvania, United States.

Ciesielska-Ciupek, M. (2001). Teaching with the Internet and corpus materials: Preparation of ELT materials using the Internet and corpus resources. In B. Lewandowska-Tomaszczyk (Ed.), Practical applications in language corpora (pp. 521-531). Frankfurt: Peter Lang.

Cobb, T. (1997). From concord to lexicon: Development and test of a corpus-based lexical tutor. Unpublished doctoral dissertation, Concordia University, Montreal, Canada. Retrieved November 2012 from:

Cobb, T. (1999). Breadth and depth of vocabulary acquisition with hands-on concordancing. CALL, 12, 345-360. Retrieved 2012 from:

Conrad, S. (2005). Corpus linguistics and L2 teaching. In E. Hinkel (Ed.), Handbook of research in second language teaching and learning (pp. 393-409). Mahwah, NJ: Lawrence Erlbaum.

Cowie, A. P. (1994). Phraseology. In R. E. Asher (Ed.), The encyclopedia of language and linguistics (Vol.6). Oxford and New York: Pergamon Press.

Dagut, M., & Laufer, B. (1985). Avoidance of phrasal verbs: A case for contrastive analysis. Studies in Second Language Acquisition, 7, 73-79.

Gaskell, D., & Cobb, T. (2004). Can learners use concordance feedback for writing errors? System, 32(3), 301-319.

Gavioli, L. (1997). Exploring texts through the concordancer: Guiding the learner. In A. Wichmann, S. Fligelstone, T. McEnery, & G. Knowles (Eds.), Teaching and language corpora (pp. 83-99). London: Longman.

Gavioli, L. (2005). Exploring corpora for ESP learning. Amsterdam: John Benjamins.

Ghabanchi, Z., & Goudarzi, E. (2012). Avoidance of phrasal verbs in learner English: A study of Iranian students. World Journal of English Language, 2(2), 43-54.

Hoey, M. P. (2005). Lexical priming: A new theory of words and language. London, UK: Routledge.

Horst, M., Cobb, T., & Nicolae, I. (2005). Expanding academic vocabulary with an interactive on-line database. Language Learning and Technology, 9(2), 90-110.

Hulstijn, J. H., Hollander, M., & Greidanus, T. (1996). Incidental vocabulary learning by advanced foreign language students: The influence of marginal glosses, dictionary use, and reoccurrence of unknown words. The Modern Language Journal, 80(3), 327-339.

Hulstijn, J. H., & Marchena, E. (1989). Avoidance: Grammatical or semantic causes. Studies in Second Language Acquisition, 11, 241-55.

Ilse, W. (1991). Concordancing in vocational training. English Language Research Journal, 4, 103-113.

Johns, T. (1991). “Should you be persuaded”: two samples of data-driven learning materials. In T. Johns & P. King (Eds.), Classroom concordancing: English Language Research Journal, 4. University of Birmingham: Centre for English Language Studies.

Johns, T. (1997). Contexts: The background, development, and trialing of a concordance-based CALL program. In A. Wichmann, S. Fligelstone, T. McEnery, & G. Knowles (Eds.), Teaching and language corpora (pp. 100-115). Harlow: Addison Wesley Longman.

Khatib, M., & Ghannadi, M. (2011). Interventionist (explicit and implicit) versus non-interventionist (incidental) learning of phrasal verbs by Iranian EFL learners. Journal of Language Teaching and Research, 2(3), 537-546.

Knight, S. (1994). Dictionary use while reading: The effects on comprehension and vocabulary acquisition for students of different verbal abilities. The Modern Language Journal, 78(2), 285-298.

Koosha, M., & Jafarpour, A. (2006). Data-driven learning and teaching collocation of prepositions: The case of Iranian EFL adult learners. Asian EFL Journal Quarterly, 8(4), 192-209. Retrieved June 2012 from:

Laufer, B. (1997). The lexical plight in second language reading: Words you don’t know, words you think you know and words you can’t guess. In J. Coady & T. Huckin (Eds.), Second language vocabulary acquisition: A rationale for pedagogy (pp. 20-34). Cambridge: Cambridge University Press.

Laufer, B. (2009). Second language vocabulary acquisition from language input and from form-focused activities. Language Teaching, 42(3), 341-354.

Laufer, B., & Eliasson, S. (1993). What causes avoidance in L2 learning: L1-L2 difference, L1-L2 similarity or L2 complexity? Studies in Second Language Acquisition, 15, 35-48.

Laufer, B., & Hulstijn, J. (2001). Incidental vocabulary acquisition in a second language: The construct of task-induced involvement. Applied Linguistics, 22(1), 1-26.

Leech, G. (1997). Teaching and language corpora: A convergence. In A. Wichmann, S. Fligelstone, T. McEnery, & G. Knowles (Eds.), Teaching and language corpora (pp. 1-23). London: Longman.

Liao, Y., & Fukuya, Y. J. (2004). Avoidance of phrasal verbs: The case of Chinese learners of English. Language Learning, 54(2), 193-226.

Luppescu, S., & Day, R. (1993). Reading, dictionaries, and vocabulary learning. Language Learning, 43(2), 263-287.

McCarthy, M., McCarten, J., & Sandiford, H. (2005). Touchstone 1: Student’s book. Cambridge: Cambridge University Press.

McEnery, T., & Xiao, R. (2011). What corpora can offer in language teaching and learning? In E. Hinkel (Ed.), Handbook of research in second language teaching and learning, Volume II (pp. 364-380). New York: Routledge.

Mondria, J. A. (1993). The effects of different types of context and different types of learning activity on the retention of foreign language words. Paper presented at the 10th AILA World Congress of Applied Linguistics, Amsterdam.

O’Keeffe, A., McCarthy, M., & Carter, R. (2007). From corpus to classroom: Language use and language teaching. Cambridge: Cambridge University Press.

O’Sullivan, I. (2007). Enhancing a process-oriented approach to literacy and language learning: The role of corpus consultation literacy. ReCALL, 19(3), 269-286.

Pickard, V. (1994). Producing a concordanced-based self-access vocabulary package: Some problems and solutions. In L. Flowerdew & A. K. K. Tong (Eds.), Entering text (pp. 215-226). Hong Kong: Hong Kong University of Science and Technology Language Center.

Pro Quiz V2. (2012). Powered by Softon Technologies. Retrieved December 2012 from:

Saslow, J., & Ascher, A. (2006). Top Notch: English for Today’s World. New York: Pearson Education, Longman.

Scott, M., & Tribble, C. (2006). Textual patterns: Key words and corpus analysis in language education. Amsterdam: John Benjamins.

Sinclair, J. (1997). Corpus evidence in language description. In A. Wichmann, S. Fligelstone, T. McEnery, & G. Knowles (Eds.), Teaching and language corpora (pp. 27-39). London: Longman.

Sinclair, J. M. (2004). Introduction. In J. M. Sinclair (Ed.), How to use corpora in language teaching (pp. 1-10). Amsterdam: John Benjamins.

Stevens, V. (1991). Concordance-based vocabulary exercises: A viable alternative to gap-filling. English Language Research Journal, 4, 47-61.

Sun, Y. C. (2003). Learning process, strategies and Web-based concordancers: A case-study. British Journal of Educational Technology, 34(5), 601-613.

Thurstun, J., & Candlin, C. (1998). Concordancing and the teaching of the vocabulary of academic English. English for Specific Purposes, 17(3), 267-280.

Tian, S. (2005). The impact of learning tasks and learner proficiency on the effectiveness of data-driven learning. Journal of Pan-Pacific Association of Applied Linguistics, 9(2), 263-275.

Waibel, B. (2007). Phrasal verbs in learner English: A corpus-based study of German and Italian students. Unpublished doctoral dissertation, Albert-Ludwigs University, Freiburg.

Willis, D., & Willis, J. (1989). Collins COBUILD English course. London: Harper Collins.