Authors

1 Associate Professor of Applied Linguistics, Urmia University, Iran

2 M.A. in TEFL, Urmia University, Iran

Abstract

The pivotal role of listening comprehension in second/foreign language learning requires that researchers conduct studies which investigate factors that affect test takers’ performances. The present study set out to examine whether item modality (i.e., written vs. oral items) affects listening comprehension test performance. In addition, it investigated whether allowing test takers to take notes while listening would also affect their performances. To this end, two different tests, each containing 20 multiple choice items, were administered to 66 (35 female and 31 male) upper-intermediate EFL learners. The first test was administered to look into the role of item modality, and the second test was employed to investigate the effect of note-taking. The application of independent samples t-tests to analyze the data revealed that test takers performed better when the items were provided in written rather than oral form, and that test takers’ performances did not differ significantly when they were allowed to take notes. More detailed findings and implications are discussed in the paper.

Keywords: item modality, note-taking, listening test, EFL

 
Authors’ emails: k.sadeghi@urmia.ac.ir; m.zeinali1987@gmail.com
 
 
INTRODUCTION
Testing is an integral part of any teaching and learning process, and like other educational fields, English as a Foreign Language (EFL) education has long recognized testing as a major part of teaching. New perspectives on the use of English as an international language (EIL) have presented significant challenges to the field of language testing, with calls for change in assessment practices arising over the past decade (Jenkins, 2006). One of the skills for which constructing test items is demanding is listening comprehension, as in real-life contexts listeners cannot usually move backwards and forwards over what is being said in the way that they can do with a written text. In a listening test, the key concern is to evaluate the students’ comprehension, that is, to determine whether the students have grasped the intended message. It is therefore essential to decide on the conditions and operations that merit inclusion in a test of listening comprehension (Weir, 1990). In actual fact, the assessment of listening abilities is one of the least understood, least developed and yet one of the most important areas of language testing (Buck, 2007).
The issue is even more complex nowadays given the unprecedented diversity of testing methods and academic pathways available for international students (Taylor & Geranpayeh, 2011). In other words, among the many existing variables that are considered to affect test takers’ performance, one central issue is the effect of test methods and formats (Alderson, 2000; Bachman, 1990; Buck, 2007).
Besides the demanding nature of testing listening comprehension, there exist several factors that might affect test-takers’ performance. When test developers set out to design a listening comprehension test, they usually encounter, and have to account for, numerous factors that may influence test-takers’ performances, such as item format, speech rate, speaker accents, topic familiarity, etc. Considering this, the present study is, for one thing, concerned with the mode of presentation of multiple choice items in a listening comprehension test, that is, whether it makes a difference to present the items orally or in written form.
Focusing on different item formats, some studies conclude that allowing candidates to preview question stems enables them to make good use of planning, a meta-cognitive strategy, by directing their attention to relevant areas of the text (Wu, 1998; Yanagawa & Green, 2008). However, listening items in which the stem of the question is not shown on the paper or the screen have their own advocates, who believe that auditory memory does not need to be supported by visual aids. When it comes to listening instruction, numerous studies have looked at enhancing listening comprehension through various means of support, such as visual aids, advance organizers, captions, etc., with the overall conclusion that most of these forms of support facilitate listener comprehension and also have some positive psychological effects on listeners’ learning (Chang, 2009). Elekaei, Faramarzi and Biria (2015), for instance, investigated test-takers’ attitudes towards items with audio-only, pictorial and visual modality and found that students favoured picture-based (rather than visual) items over audio items. In support of Elekaei et al.’s (2015) findings, Basal, Gulozer and Demir (2015) compared the performance of Turkish EFL learners on items with audio and visual modality and found the performance on audio modality to be significantly higher than that on visual modality.
In addition to the modality of the item, another factor that may affect the listening performance of EFL learners is whether they are allowed to take notes during the listening test, which is the second concern of this study. The note-taking variable was considered in conjunction with modality in this research on the assumption that both variables involve similar cognitive processes in listening. In other words, while written item modality helps listeners to overcome the memory problem (which is evident with oral items), note-taking functions similarly by allowing the listener to keep a partial written record of the fading message, helping them to remember better and retrieve what may otherwise be unretrievable.
Some studies (e.g., Hale & Courtney, 1991) have found that note-taking almost always improves retention of aurally presented material when performance is measured with a recall test. In their studies, Hale and Courtney concluded that allowing students to take notes would lead them to better performance in listening tests. However, research suggests that note-taking may work differently for listeners of different proficiency levels. In his study with 257 participants who took an English as a second language placement exam, Song (2011) found that those with higher levels of proficiency benefited more from note-taking than listeners with lower proficiency levels, while some other studies have failed to find an effect (Carter & Van Matre, 1975; Dunkel, 1986); and still other researchers, like Aiken, Thomas, and Shennum (as cited in Song, 2011), have observed an interfering effect.
Given the widespread use of language proficiency tests administered throughout the world, and considering test-takers’ desire to gain satisfactory results in such tests as the scores are used to make life-changing decisions about them, there seems to be a need to better understand what affects candidates’ performances in such tests (as well as in less high-stakes assessments) in order to assist test-takers in obtaining desired results. Therefore, in designing such tests, besides the needs of the candidates, test-dependent factors, including item modality and allowing test-takers the opportunity to take notes, are areas which require further research attention, with the aim of giving listeners the chance to reveal their true listening competence and guarding them against memory problems, which can be compounded in an exam setting.
 
LITERATURE REVIEW
L2 listening tests should demonstrate that the test-taker has the ability to process language automatically, in real time (Buck, 2007). Thus, there is a need for the listener to automatize the listening process, and consequently there is a need to assess whether the listener can indeed comprehend spoken language automatically in real time. This presents a dilemma for testers in determining the mode of presenting the item stems and whether to allow the test-takers to take notes, since the first of these resources does not seem to exist in real-life situations, and the second has few realizations outside academic or formal encounters. Ideally, the item stems should be presented orally to the test-takers, because this is generally how spoken language is encountered in real life. Note-taking is considered a good strategy for keeping the points in mind in real life and in listening to lectures. However, the burden on listeners in an exam context is quite different from that in a real-life context, and it needs to be investigated whether providing support to listeners in the form of written items (as opposed to oral items) and allowing them the chance to take notes helps them to better reveal their listening ability in a test context. Below we provide a brief account of some studies conducted in this area before we introduce our project.
 
 
Item Modality
It has been argued that EFL learners need abundant support when processing auditory input (Chang, 2009). Numerous studies (e.g., Markham, Peter, & McCarthy, 2001; Stewart & Pertusa, 2004; Vandergrift, 2007) have looked at enhancing listening comprehension through various means of support such as visual aids, captions, etc. Most of these supports have been recognized as facilitative and have been shown to have positive psychological effects on listeners’ comprehension. However, in the realm of assessing listening, providing cognitive processing support to listeners in the form of written item modality has not received due attention.
A few studies have looked at the issue of modality, but diverse results have been reported. Yanagawa and Green (2008), for example, examined whether the choice of multiple choice item format led to differences in task difficulty and test performance. They studied three formats, two of which were Full Question Preview (used in tests such as TOEIC, which displays both the question stem and answer options on the question paper/screen) and Answer Option Preview (used in TOEFL, where answer options are displayed on the question paper/screen, but the questions are heard after the text). A total of 279 test-takers participated, and listening tests were administered using the different formats. The results indicated that listening comprehension test performance did vary significantly according to whether test-takers had been able to preview the question stem: allowing test-takers to preview only the answer options produced fewer correct answers than allowing them to preview both the question stem and answer options prior to listening. However, the authors suggested that although the cues provided in answer options did not facilitate comprehension, previewing them may encourage test takers to fall back on a lexical matching strategy.
Chang (2009) compared two modes of aural input: reading while listening versus listening only. The results of the study revealed that although students showed a strong preference for the reading/listening mode, they gained only 10% more with that mode. More than half of the students believed that reading while listening mode made listening tasks easier and more comprehensible.
In a study similar to ours, Wagner (2010) examined the effect of using visual components of spoken texts on listeners’ performance and their comprehension of aural information in a listening test. In his study, the two groups’ performance on an ESL listening test was compared. The control group took a listening test with audio-only texts. The experimental group took the same listening test, with the exception that test-takers received the input through the use of video texts. Analyses of the results indicated that the video (experimental) group performed better than the audio-only (control) group on the test, and the difference between their performances was statistically significant.
More recently, Rogowsky, Calhoun, and Tallal (2016) compared immediate and delayed comprehension (retention) of three groups of learners who either listened to an audio text (the preface and a chapter from a non-fiction book), read the original text on screen, or did both at the same time (dual modality). The findings revealed that no group outperformed the others at either Time 1 or Time 2, leading to the conclusion that input modality does not matter in comprehension. The comprehension test was, however, in written mode, and whether similar results could be obtained for listening comprehension has to be established by future research.
 
Note-taking
Note-taking is generally considered to promote the process of learning and retention, especially in the context of reading comprehension (Rahmani & Sadeghi, 2011). Over the years, research on note-taking has generated debates, and researchers have tried to implement studies to verify whether taking notes helps students improve their listening comprehension. A case in point is the study by Hale and Courtney (1991), who investigated the effect of note-taking on test-takers’ listening comprehension of TOEFL mini-talks. Hale and Courtney worked with two groups of international test-takers (563 students in total) who were preparing to take the TOEFL. One of the groups was free to take notes while listening to the text, whereas the test-takers in the other group were not allowed to take notes at all. The results revealed that allowing test-takers to take notes had little effect on their performance and, more interestingly, that taking notes could even impair performance in the listening test.
In a similar vein, Kobayashi (2005) was concerned with the question of whether the process of taking notes promotes the encoding of lecture or text information, and if so, how much and why. The results of his meta-analysis demonstrated that the overall effect of note-taking compared with no note-taking was positive but modest, which was somewhat inconsistent with the tenets of the encoding hypothesis, namely that note-taking enhances learning by stimulating note-takers to actively process the material and to relate it to their existing knowledge.
Carrell (2007) investigated the relationships between note-taking strategies and performance on three language assessment tasks. Her study employed 216 international test-takers (88 males and 128 females) ranging in listening comprehension proficiency from low-intermediate to high. The participants were tested and asked to take notes while listening to the talks, and the researcher analyzed the content of the notes as well as the candidates’ performances. The overall results revealed that the relationship is complex, depending upon the note-taking strategy and the task, although she found positive correlations between the number of total notations and task performance.
Likewise, Ching Ko (2007) in his study with fifteen university EFL students tried to explore test-takers’ perceptions of note-taking and analyze the effect of note-taking on students’ foreign language listening comprehension. The findings indicated that taking notes did not distract students from their listening process; but rather, it helped them pay more attention to the text. He concluded that with the help of note-taking, students can improve their listening performance through both enhancing recall and paying more attention to the listening text.
The brief literature above on the two variables of interest in this study (item modality and note-taking) reveals that although these variables are among the important test method facets that have the potential to affect listening performance in exam contexts, little research exists to indicate the role item modality and note-taking play in test-taking, and the small body of published research does not point in a uniform direction. In order to contribute to the existing literature in this important area of language testing, this study was planned to further our understanding of the links between item modality, note-taking, and performance in listening tests.
 
 
PURPOSE OF THE STUDY
The main purpose of the current research was to assess students’ ability to comprehend spoken language as it would typically occur in an academic setting. In other words, the study sought to find the effects of the modality of multiple choice items (oral versus written modality) and note-taking (whether it is allowed or not) on the performance of upper-intermediate EFL learners in taking listening tests.
More specifically the following research questions were posed for further scrutiny:
 
1. Does item modality (written vs. oral) have any significant effect on the listening performance of Iranian upper-intermediate EFL test-takers?
2. Does note-taking have any significant effect on the listening performance of Iranian upper-intermediate EFL test-takers?   
 
METHOD
Participants
A total of 66 upper-intermediate EFL learners (31 males and 35 females) within the age range of 18 to 25 took an institutional version of the PBT TOEFL, from among whom no one was excluded as an outlier (since they all enjoyed a similar proficiency level, with scores ranging from 62 to 85 out of 100). They were all upper-intermediate language learners taking English language courses at Shukuh-e-Iran language school; having attended English classes for the last three years, they had relatively high levels of English proficiency, including listening. The participants attended the same course (in different classes for males and females), and the institute placed them at the same level, confirming their homogeneity as revealed by the TOEFL scores.
 
Instrumentation
The following data elicitation tools were employed to measure participants' listening performance under four measurement conditions discussed above (oral versus written item modality and note-taking versus no-note-taking condition).
 
 
Listening Test 1
The first listening test was the listening section of an institutional PBT TOEFL. The test consisted of 20 mini-talks, each followed by a multiple choice question. The mini-talks were randomly selected from among 150 items provided in the Complete TOEFL Test section of Longman Preparation Course for the TOEFL Test by Deborah Phillips (2003), published by Pearson ESL. The items in this pack are claimed to be similar to real TOEFL items in terms of content and difficulty, which provides evidence for the test’s construct validity. In order to provide data for the first research question, two versions of this test were produced: the first version with written item modality (for both the stem and the options) and the second version with item stems in the oral mode (but with the options in the written mode). The reliability of the test, estimated via K-R 21, was 0.75.
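For readers curious about how the K-R 21 coefficient is obtained, the sketch below shows the formula applied to total scores on a 20-item dichotomously scored test. This is only an illustration in Python (the paper reports the coefficient without specifying the software used to compute it, and the score values below are hypothetical, not the study’s data).

import numpy as np

def kr21(total_scores, n_items):
    # Kuder-Richardson formula 21: uses only the number of items and the
    # mean and variance of total scores on dichotomously scored items.
    scores = np.asarray(total_scores, dtype=float)
    mean, var = scores.mean(), scores.var(ddof=1)
    k = n_items
    return (k / (k - 1)) * (1 - (mean * (k - mean)) / (k * var))

# Illustrative totals out of 20 (hypothetical):
print(round(kr21([12, 15, 17, 9, 14, 16, 13, 18, 11, 15], 20), 2))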
 
Listening Test 2
A second test of listening (based on the same sample tests as above) was employed to provide data for the second research question. The test consisted of two long conversations and three talks. For each conversation or talk, there were four multiple choice items that the students had to answer after listening. The texts used in this test ranged in length from 100 to 150 words. These texts and questions were selected randomly from among 20 talks and 20 long conversations in the Complete TOEFL Test section of Longman Preparation Course for the TOEFL Test and were assumed to be valid in content and difficulty as they represented real TOEFL items. The test was administered to the same participants as above in a different session. In administering the test, one group was not allowed to take notes, while the other group was instructed to take notes (using the note-taking sheets provided) while listening to the talks/conversations. K-R 21 was also used to estimate the test’s reliability, yielding an index of 0.79.
 
Listening Proficiency Test
In order to have a controlled level of listening proficiency and work with homogeneous participants, the Listening Section of an institutional version of TOEFL was administered at the beginning of the study. The test had 20 multiple choice items, and enjoyed a reliability index of 0.86.
 
Data Collection Procedure
The following steps were taken to conduct this study:
First, a listening proficiency test was administered to all upper-intermediate EFL learners at a language school (as mentioned above) to ensure that all the candidates enjoyed homogeneous listening ability. These learners were all studying the “Passages 1” book and were regarded as upper-intermediate by institute standards. The results of the proficiency test revealed (see above) that students were indeed homogeneous and of similar language proficiency (in listening). Then, to provide data for the first research question, thirty-three learners (16 males and 17 females) were selected randomly and took the first version of the test, that is, the test with oral item modality, while the other 33 testees took the second version with items in written modality. Subsequent to this, and in another session, the second listening test was administered to the same groups in a similar procedure, where one group was allowed to take notes and the other was not.
 
Data Analysis
To analyze the elicited data, the data were entered into SPSS (PASW Statistics 18), and two separate independent samples t-tests were run.
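As a minimal sketch of this analysis, the SciPy calls below run Levene’s test followed by an independent samples t-test. The authors used SPSS, so this is only an illustrative analogue, and the scores shown are hypothetical rather than the study’s data.

from scipy import stats

# Hypothetical scores for two groups (not the study's data).
group_a = [17, 16, 18, 15, 17, 16, 14, 18]
group_b = [12, 11, 13, 12, 10, 13, 14, 11]

# Levene's test checks the equal-variances assumption first.
levene = stats.levene(group_a, group_b)

# Independent samples t-test; equal_var mirrors which row of the SPSS output
# (equal variances assumed vs. not assumed) would be reported.
result = stats.ttest_ind(group_a, group_b, equal_var=levene.pvalue > .05)
print(levene.pvalue, result.statistic, result.pvalue)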
 
RESULTS
Results of the Normality Test
To ensure the homogeneity of the participants, the Listening Section of an institutional version of the TOEFL was utilized, as explained above. Table 1 shows the results of the tests of normality for the participants.
 
Table 1. Tests of normality for the proficiency test

                    Kolmogorov-Smirnov              Shapiro-Wilk
                    Statistic    df    Sig.         Statistic    df    Sig.
Exam Scores         .14          66    .06          .93          66    .07
As can be seen in the table above, the non-significant result (.06, which is greater than .05) indicates normality, which in turn suggests that the participants were homogeneous. Furthermore, Figure 1 presents the related box plot, which shows that there were no outliers among the participants.
 
 
Figure 1. Box plot for homogeneity of participants.
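For illustration only, the two normality tests reported in Table 1 can be approximated as follows; the scores are hypothetical and, unlike SPSS, the plain Kolmogorov-Smirnov call below does not apply the Lilliefors correction, so it is a rough analogue rather than the study’s procedure.

import numpy as np
from scipy import stats

scores = [68, 72, 75, 63, 80, 77, 70, 66, 74, 82, 69, 71]  # hypothetical

# Kolmogorov-Smirnov against a normal with the sample mean and SD.
ks = stats.kstest(scores, 'norm', args=(np.mean(scores), np.std(scores, ddof=1)))
# Shapiro-Wilk test of normality.
sw = stats.shapiro(scores)

# p > .05 on both tests is read as no evidence against normality.
print(round(ks.pvalue, 2), round(sw.pvalue, 2))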
 
Item Modality and Listening Comprehension
After ensuring the homogeneity of the participants, an independent samples t-test was run to find the answer to the first research question by comparing the mean scores of the groups which had different item modalities in the tests. Table 2 provides the independent samples t-test statistics.
 
Table 2. Independent samples t-test for test 1 (item modality variable)

                               Levene's Test          t-test for Equality of Means
                               F       Sig.      t      df     Sig.         Mean         Std. Error   95% CI    95% CI
                                                               (2-tailed)   Difference   Difference   Lower     Upper
Test Scores
  Equal variances assumed      1.19    .27       8.18   64     .00          4.69         .57          3.55      5.84
  Equal variances not assumed                    8.18   64     .00          4.69         .57          3.54      5.84
 
As shown in Table 2, the significance level for Levene’s Test is .27, which is larger than the cut-off of .05, meaning that the assumption of equal variances has not been violated. The significance level for the t-test (Sig. 2-tailed, p = .00) is less than .05, which indicates that there is a significant difference between the two groups in terms of item modality. Comparing the mean scores of the test-takers, it is evident that test-takers exposed to the written item modality (M = 16.64) did much better than those who experienced the oral presentation of the items (M = 11.94).
In addition, using the Eta squared formula, the effect size of this independent samples t-test was calculated, and the result (Eta squared = .51) indicates a large effect. Expressed as a percentage, 51 percent of the variance in listening test performance is explained by item modality. All this can be interpreted to mean that the modality of test items does have a significant effect on the listening performance of Iranian upper-intermediate EFL learners.
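The reported effect size follows directly from the t statistic and degrees of freedom in Table 2, since for an independent samples t-test Eta squared = t² / (t² + df); the short check below reproduces the value.

# Effect size for an independent samples t-test: eta squared = t^2 / (t^2 + df).
t, df = 8.18, 64
eta_squared = t ** 2 / (t ** 2 + df)
print(round(eta_squared, 2))  # 0.51, i.e., about 51% of the variance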
 
Note-taking and Listening Comprehension
In order to provide an answer to research question 2, another independent-samples t-test was used to compare the mean scores of the two groups of test-takers (with and without note-taking condition).
Table 3 reports the results of homogeneity of variances as well as t-test results. Since the significance level for Levene's test is less than .05 (p = .04), the assumption of homogeneity of variances is violated, so the second row is consulted for analysis of the results.
 
Table 3. Independent samples t-test for test 2 (note-taking variable)

                               Levene's Test          t-test for Equality of Means
                               F       Sig.      t      df      Sig.         Mean         Std. Error   95% CI    95% CI
                                                                (2-tailed)   Difference   Difference   Lower     Upper
Test Scores
  Equal variances assumed      4.16    .04       -.34   64      .73          -.30         .88          -2.06     1.46
  Equal variances not assumed                    -.34   58.69   .73          -.30         .88          -2.07     1.46
 
As can be seen in Table 3, the p value for the independent samples t-test is .73, which is greater than the cut-off of .05, revealing that there is not a significant difference between the mean scores of the two groups. Eta squared was also calculated and showed a very small value (Eta squared = .001). The mean score of the test-takers who were allowed to take notes while listening (M = 13.94) did not prove to be statistically different from the mean score of the test-takers who did not have the chance to take notes (M = 14.24). In other words, the mean difference between the two groups was -.3, too small a difference for statistical significance. Surprisingly, note-taking seems to have negatively affected listening performance, though to a non-significant degree.
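Because Levene’s test was significant for this comparison, the unequal-variances (Welch) row of the output is the one interpreted above. In SciPy terms this corresponds to setting equal_var=False, as in the sketch below (hypothetical scores, not the study’s data).

from scipy import stats

with_notes = [13, 15, 14, 12, 16, 13, 15, 14]      # hypothetical
without_notes = [14, 15, 13, 16, 14, 15, 12, 17]   # hypothetical

# Welch's t-test, used when the equal-variances assumption is violated.
welch = stats.ttest_ind(with_notes, without_notes, equal_var=False)
print(welch.statistic, welch.pvalue)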
 
DISCUSSION
This study set out with the aim of assessing the effects of written versus oral item modality and of note-taking on listening performance under test-taking conditions. The results revealed a positive effect of written item modality on listening performance and no significant effect for note-taking. These findings are elaborated further below.
 
Listening Test Performance and Item Modality
To answer the first research question, two groups of test-takers took a listening test with different item modalities (written versus oral). The results of the independent samples t-test (p = .00) revealed that there is a significant difference between the mean scores of the two groups. This means that listening test performance did vary significantly according to whether test-takers had the chance to view the item stem in writing, with the result that the test-takers who could view item stems outperformed those who received the item stems in oral format. The findings of the current study corroborate those of Wu (1998), who concluded that viewing the item stems as well as the options appeared to benefit advanced EFL test-takers. Of course, he links this benefit to an advanced level of language proficiency; the present study confirms that written item modality also benefits upper-intermediate test-takers (who enjoy more or less advanced proficiency).
Moreover, an inspection of Yanagawa and Green's (2008) study indicates that there was an apparent difference regarding item preview format. In their study, the results indicated differences between the full question preview (written item stem) condition and answer option preview (oral item stem) condition. The research found that test-takers were able to benefit from previewing the full questions rather than just previewing the options. In other words, it seems that the cues provided in the answer options did not facilitate comprehension to the same extent as the item stems did, a finding which our study adds support for. However, the findings of the current study do not seem to support those of Sherman (1997) who found no significant effect of item stem preview on test-takers’ performance.
The reason why test-takers do better when the stem of the item is revealed rather than hidden from them can be easily justified by referring to psychological aspects of the listening test. When test-takers have access to the item stems as well as the options while listening, they are psychologically more relaxed and feel more secure compared to the situation in which they do not have a visual record of the item stem and when the item stem is gone as soon as it is produced (in oral modality). Although this psychological stand is not supported by some studies (e.g., Buck, 1991; Sherman, 1997), the context in which the present study was carried out highly supports this position, since most Iranian learners are stressed when they take a test and this stress would increase if test-takers do not see the item stems on their sheets. Furthermore, the cues which are present in the stem of an item help test-takers have better understanding of the item and when these cues are presented in written modality, they are processed and retrieved more easily.
 
Listening Test Performance and Note-taking
This study was also an attempt to examine the effect of taking notes on listening test performance while test-takers listen to short talks or long conversations. Contrary to most findings of previous studies, our analysis did not detect any evidence for an effect of note-taking on test-takers’ performance in a listening test. A quick glance at Table 3 reveals that the p value of .73 suggests that allowing students to take notes had little effect on their performance compared to test-takers who were not allowed to take notes. Test-takers who did not take notes even gained slightly higher scores than those who took notes, an observation which implies that while busy taking notes, some candidates may lose track of the flow of speech and miss important pieces of information needed to answer some items.
Kobayashi’s (2005) meta-analytic study revealed that the effect of note-taking compared with no note-taking was moderately positive. Although the findings of the current study ran counter to our expectations as far as note-taking is concerned (as well as to Kobayashi’s observation), they are consistent with those of Hale and Courtney (1991). Hale and Courtney (with 563 students participating in their study) came to the conclusion that allowing students to take notes not only did not have any positive effects on their performance in a listening test but even impaired their listening performances; the results of their study also showed that examinees made little use of the note-taking opportunity. Another study, carried out by Chaudron, Cook, and Loschky (1988), investigated the relationship between note-taking (as opposed to no note-taking) and listening comprehension and found no significant relationship between the two. Dunkel’s (1986) study also corroborates the findings of the current study, as he generally concluded that the opportunity to take notes does not necessarily produce beneficial effects. Similarly, Zheng (1996) came to the conclusion that taking notes during a listening test distracted test-takers’ attention and did not help them perform better. Lin’s (2004) research also revealed that being allowed to take notes did not help students perform better in tests, and through further analysis the reason for this was identified as test-takers’ limited vocabulary capacity.
Contrary to our findings and those mentioned in the preceding paragraph, research has also demonstrated the potential benefits of note-taking (e.g., Carrell,  Dunkel, & Mollaun, 2004). Carrell et al.'s (2004) study showed a facilitating effect on L2 listening comprehension when a group of examinees (in that case with heterogeneous L1s) was allowed to take and refer to notes during mini-lecture listening. To provide theoretical support to such observations, Van Meter, Yokoi, and Pressley (1994) argue that the act of taking notes facilitates college students’ attending to the lecture, comprehension of the material to be learned, and the subsequent recall. Moreover, Ching Ko's (2007), Yeh's (2004), and Liu's (2001) studies offer additional evidence that taking notes while listening to a text (i.e., in a listening test) facilitates retention of the material and leads to better performance in a listening test.
One major justification for the lack of a note-taking effect on listening performance in this study may be that note-taking strategies were never taught to the participants before the study, nor were the candidates monitored to know whether they actually took any notes. Here we can refer to Dunkel’s (1986) finding that good notes are those that contain the most information in the fewest words; if a test-taker simply takes notes without attending to their quality, the notes are unlikely to have any positive effect. The effect of note-taking training is well documented in reading research (Rahmani & Sadeghi, 2011), but this line of inquiry needs to be followed in listening research to offer more insight into the nature of note-taking in listening.
It should also be highlighted that test-takers are selective in taking notes depending on their own note-taking styles. That is, it is probable that highly proficient listeners might not record much and, as a result, produce less complete notes, while less proficient listeners might write down as many idea units as they can. Another possible explanation for the results of the current study is that although the participants were homogeneous in terms of language proficiency, they might not have been homogeneous in other attributes of a good note-taker, such as general intelligence, speed of writing, the ability to take notes while listening (i.e., writing, reading, and listening simultaneously), etc. It can be claimed that note-taking is not inherently effective; it becomes so when it is used properly in a particular context, when the needed training is offered, and when the quantity and quality of the notes to be taken are already decided. Indeed, some learners may not know what they should focus on while taking notes and may jot down every word they hear. One cannot say how effective a hand tool is without knowing exactly what it is used for and when. Consequently, in the current research, the analysis of the data indicates that the overall encoding effect of note-taking is next to nothing.
 
CONCLUSION AND IMPLICATIONS
Studies have found that people spend 80% of their waking hours communicating, and according to research, at least 45% of that time is spent listening (Lawson, 2007). Therefore, it is important for individuals to be efficient listeners, and improving the listening ability of EFL learners is an essential task for language teachers, course administrators, and test designers, as well as for students themselves. Taking these points into consideration, the results of the present study have far-reaching theoretical and practical implications for EFL teachers, test developers, and curriculum designers. Regarding the item modality variable, teachers can help students learn how to concentrate on the text they listen to in contexts where no written text is provided, since in real-life contexts there would be no visual or written support while listening. In other words, teachers should try to teach listening rather than just exposing learners to listening tests. Concerning note-taking, EFL teachers can increase learners’ note-taking ability by focusing on and teaching the skills necessary to take adequate notes, such as identifying main ideas, transcription speed, etc. Moreover, there is a key implication for test constructors/developers: for example, the length of a listening text should be reasonable, especially under conditions in which test-takers are not permitted to take notes.
Like most other research studies, this study had some limitations. Factors such as the number of participants, their level of proficiency, and the time of the tests’ administration might impede the generalizability of the results to other contexts. Moreover, the study was limited in the number of test items, test formats, and features investigated. The results were elicited through administering two separate tests (one test for each independent variable); each test comprised 20 items and lasted about 20 to 25 minutes. The number of items in each test could have been larger, but considering factors such as time and students’ participation rate, it was decided to include 20 items in each test.
Further studies can be conducted by adding a qualitative part to the study which may delve into test-takers’ opinions and attitudes about the effectiveness of different modalities, as well as exploring the links between different learning styles/strategies and test-taking strategies and various test method facets. Further studies are also needed to compare the performance of test-takers who receive note-taking strategy training with those who do not. Also, it would be worthwhile to examine the content of the actual notes taken by test-takers to identify what type and quantity of note-taking are desirable for optimum listening performance.
 

Keywords

Testing is an integral part of any teaching and learning process and like other educational fields, English as a Foreign Language (EFL) education has long recognized testing as a major part of the teaching. New perspectives on the use of English as an international language (EIL) have presented significant challenges to the field of language testing, with calls for change in assessment practices arising over the past decade (Jenkins, 2006). One of the skills for which constructing test items is demanding is listening comprehension, as in real life contexts, listeners cannot usually move backwards and forwards over what is being said in the way that they can do in a written text. In a listening test, the key concern is to evaluate the students’ comprehension, that is, to determine whether the students have grasped the intended message. So, it is essential to decide on the conditions and operations that merit inclusion in a test of listening comprehension (Weir, 1990). In actual fact, the assessment of listening abilities is one of the least understood, least developed and yet one of the most important areas of language testing (Buck, 2007).

The issue is even more complex nowadays given the unprecedented diversity of testing methods and academic pathways available for international students (Taylor & Geranpayeh, 2011). In other words, among the many existing variables that are considered to affect test takers’ performance, one central issue is the effect of test methods and formats (Alderson, 2000; Bachman, 1990; Buck, 2007).

Besides the awkward nature of testing listening comprehension, there exist some factors that might affect test-takers’ performance. When test developers set out to design a listening comprehension test, they usually encounter, and have to account for, numerous factors that may influence test-takers’ performances, such as item format, speech rate, speaker accents, topic familiarity, etc. Considering this, the present study is, for one thing, concerned with the mode of presentation of multiple choice items in a listening comprehension test, that is, it makes a difference to present the items orally or in written form.

Focusing on different items format, some studies conclude that allowing candidates to preview question stems enables them to make good use of planning, a meta-cognitive strategy by directing their attention to relevant areas of the text (Wu, 1998; Yanagawa & Green, 2008). However, the listening items in which the stem of the question is not seen on the paper or the screen, have their own advocates who believe that auditory memory does not need to be supported by visual aids. When it comes to listening instruction, there are numerous studies that look at enhancing listening comprehension through various means of support, such as visual aids, advance organizers, captions, etc. with the overall conclusion that most of these forms of support have been found to facilitate listener comprehension and also to have some positive psychological effects on listeners’ learning (Chang, 2009). Elekaei, Faramarzi and Biria (2015), for instance, investigated test-takers’ attitudes towards items with audio-only, pictorial and visual modality and found that students favoured picture-based (rather than visual) items over audio items. I support of Elekaei et al.’s (2015) findings, Basal, Gulozer and Demir (2015) compared the performance of Turkish EFL learners on items with audio and visual modality and found the performance on audio modality to be significantly higher than that on visual modality.

In addition to the modality of the item, another factor that may affect listening performance of the EFL learners is whether they are allowed to take notes during the listening test, which is the second concern of this study. Note-taking variable was considered in conjunction with modality in this research on the assumption that both these variables involve similar cognitive processes in listening. In other words, while written item modality helps listeners to overcome the memory problem (which is evident in oral items), note-taking functions similarly by allowing the listener to have a partial written record of the lapsing message, helping to remember better and retrieve what may otherwise be unretrievable.

Some studies (e.g., Hale & Courtney, 1991) have found that note-taking almost always improves retention of aurally presented material when performance is measured with a recall test. In their studies, Hale and Courtney concluded that allowing students to take notes would lead them to a better performance in listening tests. However, research suggests that note-taking may work differently for listeners of different proficiency level. In his study with 257 participants who took English as a second language placement exam, Song (2011) found that those with higher levels of proficiency benefited more from note-taking compared to listeners with lower proficiency level, while some other studies have failed to find an effect (Carter & Van Matre, 1975; Dunkel, 1986); and still other researchers like Aiken, Thomas, and Shennum (as cited in Song, 2011) have observed an interfering effect. 

Given the widespread use of language proficiency tests administered throughout the world and considering test-takers’ desire to gain satisfactory results in such tests as the score are sued to make life-changing decisions on them, there seems to be a need to better understand what affects candidates’ performances in such tests (as well as in less high-stakes assessments) in order to assist test-takers in obtaining desired results. Therefore, in designing such tests, besides the needs of the candidates, test-dependent factors including item modality and allowing test-takers the opportunity to take notes are areas which require further research attention with the aim to provide listeners the chance to reveal their true listening competence and guard them against memory problems, which can be doubled in exam setting.

 

LITERATURE REVIEW

L2 listening tests should demonstrate that the test-taker has the ability to process language automatically, in real time (Buck, 2007).  Thus, there is a need for the listener to automatize the listening process, and consequently there is a need to assess if the listener can indeed comprehend spoken language automatically in real time. This presents a dilemma for testers, in determining the mode of presenting the item stems and allowing the test-takers to take notes or not, since the first of these resources does not seem to exist in real-life situations, and the second has few outside realizations (except for academic or formal encounters). Ideally, the item stems should be presented orally to the test-takers, because this is generally how spoken language is encountered in real life.  Note-taking is considered as a good strategy for keeping the points in mind in real life and in listening to lectures. However, the burden on listeners in an exam context is quite different from that in a real-life context, and it needs to be investigated whether providing support to listeners in the form of written items (as opposed to oral items) and allowing them the chance to take notes helps them to better reveal their listening ability in a test context. Below we provide a brief account of some studies conducted in this area before we introduce our project.  

 

 

Item Modality

It has been argued that EFL learners need abundant support when processing auditory input (Chang, 2009). Numerous studies (e.g., Markham, Peter, & McCarthy, 2001; Stewart & Pertusa, 2004; Vandergrift, 2007) have looked at enhancing listening comprehension through various means of support such as visual aids, captions, etc. Most of these supports have been recognized as facilitative and have been shown to have positive psychological effects on listeners’ comprehension. However, in the realm of assessing listening, providing cognitive processing support to listeners in the form of written item modality has not received due attention.

A few studies have looked at the issue of modality but diverse results have been reported. Yanagawa and Green (2008), for example, examined whether the choice of multiple choice item format led to differences in task difficulty and test performance. In their study, they studied three formats, two of which were Full Question Preview (used in tests such as TOEIC which displays both the question stem and answer options on the question paper/screen) and Answer Option Preview (used in TOEFL where answer options are displayed on the question paper/screen, but the questions are heard after the text). In their study, 279 test-takers participated and listening tests were administered using different formats. The results indicated that listening comprehension test performance did vary significantly according to whether test-takers had been able to preview the question stem. It was found that allowing test-takers to preview only the answer options produced fewer correct answers than allowing test-takers to preview both the question stem and answer options prior to listening. However, they suggested that although the cues provided in answer options did not facilitate comprehension, previewing them may encourage test takers to fall back on a lexical matching strategy.

Chang (2009) compared two modes of aural input: reading while listening versus listening only. The results of the study revealed that although students showed a strong preference for the reading/listening mode, they gained only 10% more with that mode. More than half of the students believed that reading while listening mode made listening tasks easier and more comprehensible.

In a study similar to ours, Wagner (2010) examined the effect of using visual components of spoken texts on listeners’ performance and their comprehension of aural information in a listening test. In his study, the two groups’ performance on an ESL listening test was compared. The control group took a listening test with audio-only texts. The experimental group took the same listening test, with the exception that test-takers received the input through the use of video texts. Analyses of the results indicated that the video (experimental) group performed better than the audio-only (control) group on the test, and the difference between their performances was statistically significant.

More recently, Rogowsky, Calhoun, and Tallal (2016) compared immediate and delayed comprehension (retention) of three groups of learners who either listened to an audio text (the preface and a chapter form a non-fiction book), or read the original text on screen or did both at the same time (dual modality). The findings revealed that in neither condition did readers/listeners outperform either at Time 1, or at Time 2, concluding that input modality does not matter in comprehension. The comprehension test was however in written mode and whether similar results could be obtained in listening comprehension has to be established by future research.

 

Note-taking

Note-taking is generally considered to promote the process of learning and retaining, especially in the context of reading comprehension (Rahmani & Sadeghi, 2011). Over the years, research on note-taking has generated debates, and researchers have tried to implement studies to verify whether taking notes is effective for students to improve their listening comprehension. A study conducted by Hale and Courtney (1991) who investigated note-taking effect on listening comprehension of test-takers in TOEFL mini-talks. In their study, Hale and Courtney had two groups of international test-takers (a total number of 563 students) who were getting ready to take part in TOEFL. In their study, one of the groups was free to take notes while the test-takers were listening to the text. However, the test-takers in the other group were not allowed to take notes at all. The results revealed that allowing test-takers to take notes had little effect on their performance, and more interestingly, allowing test-takers to take notes impaired their performance in the listening test.

In a similar vein, in a study conducted by Kobayashi (2005), the researcher was concerned with the question of whether the process of taking notes promotes the encoding of lecture or text information, and if so, how much and why. The results of his meta-analysis demonstrated that the overall effect of note-taking compared with no note-taking was positive but modest, which was somewhat inconsistent with the tenets of encoding hypothesis that note-taking enhances learning by stimulating note-takers to actively process the material and to relate it to their existing knowledge.

Carrell (2007) investigated the relationships between note-taking strategies and performance on the three language assessment tasks. Her study employed 216 international test-takers (88 males and 128 females) ranging in listening comprehension proficiency from low-intermediate to high. The participants were tested and were asked to take notes while listening to the talks. The researcher analyzed the content of the notes as well as the candidates’ performances. The overall results revealed that the relationship is complex, depending upon the note-taking strategy and the task. She found positive correlations between the number of total notations and task performance.

Likewise, Ching Ko (2007) in his study with fifteen university EFL students tried to explore test-takers’ perceptions of note-taking and analyze the effect of note-taking on students’ foreign language listening comprehension. The findings indicated that taking notes did not distract students from their listening process; but rather, it helped them pay more attention to the text. He concluded that with the help of note-taking, students can improve their listening performance through both enhancing recall and paying more attention to the listening text.

The above brief literature on two variables of interest in this study (item modality and note-taking) reveals that although these two variables are among those important test method facets that have the potential to affect listening performance in exam contexts, little research exists to indicate the role item modality and note-taking plays in test-taking, and the small body of published research does not point to a uniform direction. In order to contribute to the existing literature in this important area of language testing, this study was planned to further our understanding of the links between item modality, note-taking, and performance in listening tests.

 

 

PURPOSE OF THE STUDY

The main purpose of the current research was to assess students’ ability to comprehend spoken language as it would typically occur in an academic setting. In other words, the study sought to find the effects of the modality of multiple choice items (oral versus written modality) and note-taking (whether it is allowed or not) on the performance of upper-intermediate EFL learners in taking listening tests.

More specifically the following research questions were posed for further scrutiny:

 

1. Does item modality (written vs. oral) have any significant effect on the listening performance of Iranian upper-intermediate EFL test-takers?

2. Does note-taking have any significant effect on the listening performance of Iranian upper-intermediate EFL test-takers?   

 

METHOD

Participants

A total number of 66 upper-intermediate EFL learners (31 males and 35 females) within the age range of 18 to 25 took an institutional version of PBT TOEFL, from among whom no one was excluded as an outlier (since they all enjoyed a similar proficiency level, and their scores ranged between 62 to 85 out of 100). They were all upper-intermediate language learners who were taking English language courses in Shukuh-e-Iran language school; and having attended English classes for the last three years, they had relatively high levels of English proficiency, including listening. They participants attended the same course (in different classes for males and females) and the institute placed them at the same level, confirming their homogeneity as revealed by TOEFL scores.

 

Instrumentation

The following data elicitation tools were employed to measure participants' listening performance under four measurement conditions discussed above (oral versus written item modality and note-taking versus no-note-taking condition).

 

 

Listening Test 1

The first listening test was the listening section of an institutional PBT TOEFL. The test consisted of 20 mini-talks, each followed by a multiple choice question. The mini-talks were randomly selected from among 150 items provided in the Complete TOEFL Test section of Longman Preparation Course for the TOEFL Test by Deborah Phillips (2003) published by Pearson ESL. The items in this pack are claimed to be similar to real TOEFL in terms of content and difficulty, hence evidence for its construct validity. In order to provide data for the first research question, two versions of this test were produced: the first version with written item modality (for both the stem and the options) and the second version with item stems in the oral mode (but with the options in the written mode). K-R 21 was utilized to estimate the reliability of the test, which was estimated to be 0.75.

 

Listening Test 2

A second test of listening (based on the same sample tests as above) was employed to provide data for the second research question. The test consisted of two long conversations and three talks. For each conversation or talk, there were four multiple choice items that the students had to answer after listening to each conversation or talk. The texts used in this test ranged in length from 100 to 150 words. These texts and questions were selected randomly from among 20 talks and 20 long conversations in Complete TOEFL Test section of Longman Preparation Course for the TOEFL Test and were assumed to be valid in content and difficulty as they represented real TOEFL items. The test was administered to the same participants as above in a different session. In administering the test, one group was not allowed to take notes, while the other group was instructed to take notes (using the note-taking sheets provided) while listening to the talks/conversations. K-R 21 was also used to estimate the test’s reliability, and the results revealed a high index of reliability of 0.79.

 

Listening Proficiency Test

In order to have a controlled level of listening proficiency and work with homogeneous participants, the Listening Section of an institutional version of TOEFL was administered at the beginning of the study. The test had 20 multiple choice items, and enjoyed a reliability index of 0.86.

 

Data Collection Procedure

The following steps were taken to conduct this study:

First, a listening proficiency test was administered to all upper-intermediate EFL learners at a language school (as mentioned above) to select that all the candidates who enjoyed a homogeneous listening ability. These learners were all studying “Passages 1” book and were regarded as higher intermediate by institute standards. The results of the proficiency test revealed (see above) that students were indeed homogeneous and of similar language proficiency (in listening).  Then, to provide data for the first research question, thirty three learners (16 males and 17 females) were selected randomly and took the first version of the test, that is, the test with oral item modality while the other 33 testees took the second version with items in written modality.  Subsequent to this, and in another session of the treatment, the second listening test was administered to the same groups in a similar procedure where one group was allowed to take notes and the other was not.

 

Data Analysis

To analyze the elicited data, the data were entered into SPSS (Statistical Package for the Social Sciences; PASW Statistics 18), and two separate independent samples t-tests were run.
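For replication purposes, the same analysis (Levene's test, an independent samples t-test, and an eta squared effect size) could be approximated outside SPSS. The following minimal Python sketch using scipy is illustrative only; the two score lists are placeholders, not the study's data.

# A minimal sketch (not the authors' SPSS syntax) of the reported analysis:
# Levene's test for equality of variances, an independent samples t-test,
# and an eta squared effect size. The score lists are placeholders.
from scipy import stats

group_a = [16, 17, 15, 18, 16, 17, 14, 18]   # e.g., one group's scores (0-20), illustrative only
group_b = [12, 11, 13, 12, 10, 13, 12, 14]   # e.g., the other group's scores, illustrative only

# Levene's test: a non-significant p value (> .05) supports equal variances.
levene_stat, levene_p = stats.levene(group_a, group_b)

# Independent samples t-test; Welch's correction (equal_var=False) is used
# when Levene's test indicates unequal variances.
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=levene_p > .05)

# Eta squared effect size from t and the pooled degrees of freedom.
df = len(group_a) + len(group_b) - 2
eta_squared = t_stat ** 2 / (t_stat ** 2 + df)

print(f"Levene p = {levene_p:.3f}, t = {t_stat:.2f}, "
      f"p = {p_value:.3f}, eta squared = {eta_squared:.3f}")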

 

RESULTS

Results of the Normality Test

To ensure the homogeneity of the participants, the Listening Section of an institutional version of the TOEFL was utilized, as explained above. Table 1 shows the results of the normality tests for the participants' scores.

 

Table 1. Tests of normality for the proficiency test

 

             Kolmogorov-Smirnov                 Shapiro-Wilk
             Statistic    df    Sig.            Statistic    df    Sig.
Exam Scores  .14          66    .06             .93          66    .07

As can be seen in the table above, the non-significant result (p = .06, which is greater than .05) indicates that the scores were normally distributed, suggesting that the participants formed a homogeneous group. Furthermore, Figure 1 presents the related box plot, which shows that there were no outliers among the participants.

 

 

Figure 1. Box plot for homogeneity of participants.
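For completeness, the normality checks in Table 1 could be approximated with the following Python/scipy sketch; proficiency_scores is an illustrative placeholder name, and SPSS's Kolmogorov-Smirnov statistic uses the Lilliefors correction, so the results would not match exactly.

# Rough replication sketch of the Table 1 normality checks (not the authors' SPSS output).
import numpy as np
from scipy import stats

proficiency_scores = np.array([14, 15, 12, 16, 13, 17, 15, 14, 16, 13])  # placeholder values only

# Shapiro-Wilk test, directly comparable to the Shapiro-Wilk column in Table 1.
sw_stat, sw_p = stats.shapiro(proficiency_scores)

# Kolmogorov-Smirnov test against a normal distribution fitted to the sample.
# SPSS reports the Lilliefors-corrected version, so p values differ slightly.
mean, sd = proficiency_scores.mean(), proficiency_scores.std(ddof=1)
ks_stat, ks_p = stats.kstest(proficiency_scores, 'norm', args=(mean, sd))

print(f"Shapiro-Wilk: W = {sw_stat:.2f}, p = {sw_p:.3f}")
print(f"Kolmogorov-Smirnov: D = {ks_stat:.2f}, p = {ks_p:.3f}")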

 

Item Modality and Listening Comprehension

After ensuring the homogeneity of the participants, an independent samples t-test was run to answer the first research question by comparing the mean scores of the two groups which received different item modalities. Table 2 provides the independent samples t-test statistics.

 

Table 2. Independent samples t-test for test 1 (item modality variable)

 

                               Levene's Test for           t-test for Equality of Means
                               Equality of Variances
                               F       Sig.      t       df      Sig.         Mean     Std. Error   95% CI    95% CI
                                                                 (2-tailed)   Diff.    Diff.        Lower     Upper
Test Scores
  Equal variances assumed      1.19    .27       8.18    64      .00          4.69     .57          3.55      5.84
  Equal variances not assumed                    8.18    64      .00          4.69     .57          3.54      5.84

 

As shown in Table 2, the significance level for Levene's Test is .27, which is larger than the cut-off of .05, meaning that the assumption of equal variances has not been violated. The significance level for the t-test itself (Sig. 2-tailed, p = .00) is less than .05, indicating that there is a significant difference between the two groups in terms of item modality. Comparing the mean scores of the test-takers, it is evident that those exposed to the written item modality (M = 16.64) did much better than those who experienced the oral presentation of the items (M = 11.94).

In addition, using the eta squared formula, the effect size for this independent samples t-test was calculated; the result (eta squared = .51) indicates a large effect (by commonly used conventions, values of .14 and above count as large). Expressed as a percentage, 51 percent of the variance in listening test performance is explained by item modality. All this can be interpreted to mean that the modality of test items does have a significant effect on the listening performance of Iranian upper-intermediate EFL learners.
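For transparency, the effect size above follows the usual eta squared formula for an independent samples t-test; substituting the reported t and df from Table 2 gives:

\[ \eta^{2} = \frac{t^{2}}{t^{2} + df} = \frac{8.18^{2}}{8.18^{2} + 64} \approx .51 \]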

 

Note-taking and Listening Comprehension

In order to answer research question 2, another independent samples t-test was used to compare the mean scores of the two groups of test-takers (the note-taking and no-note-taking conditions).

Table 3 reports the results of the test of homogeneity of variances as well as the t-test results. Since the significance level for Levene's test is less than .05 (p = .04), the assumption of homogeneity of variances is violated, so the second row of the table (equal variances not assumed) is consulted in interpreting the results.

 

Table 3. Independent samples t-test for test 2 (note-taking variable)

 

                               Levene's Test for           t-test for Equality of Means
                               Equality of Variances
                               F       Sig.      t       df       Sig.         Mean     Std. Error   95% CI    95% CI
                                                                  (2-tailed)   Diff.    Diff.        Lower     Upper
Test Scores
  Equal variances assumed      4.16    .04       -.34    64       .73          -.30     .88          -2.06     1.46
  Equal variances not assumed                    -.34    58.69    .73          -.30     .88          -2.07     1.46

 

As can be seen in Table 3, the p value for the independent samples t-test is .73, which is greater than the cut-off of .05, revealing that there is no significant difference between the mean scores of the two groups. Eta squared was also calculated and showed a very small effect (eta squared = .001). The mean score of the test-takers who were allowed to take notes while listening (M = 13.94) did not prove to be statistically different from the mean score of the test-takers who did not have the chance to take notes (M = 14.24). In other words, the mean difference between the two groups was -.30, too small a difference to reach statistical significance. Surprisingly, note-taking seems to have negatively affected listening performance, although not to a statistically significant degree.

 

DISCUSSION

This study set out with the aim of assessing the effect of written versus oral item modality, and of note-taking, on listening performance under test-taking conditions. The results revealed a positive effect of written item modality on listening performance and no significant effect of note-taking. These findings are elaborated further below.

 

Listening Test Performance and Item Modality

To answer the first research question, two groups of test-takers took a listening test with different item modalities (written versus oral). The independent samples t-test (p = .00) revealed a significant difference between the mean scores of the two groups. This means that listening test performance varied significantly according to whether test-takers had the chance to view the item stems in writing, with those who could view the item stems outperforming those who received them in oral form. The findings of the current study corroborate those of Wu (1998), who concluded that viewing the item stems as well as the options appeared to benefit advanced EFL test-takers. Although Wu links this benefit to an advanced level of language proficiency, the present study suggests that written item modality also benefits upper-intermediate test-takers (who enjoy a more or less advanced level of proficiency).

Moreover, an inspection of Yanagawa and Green's (2008) study indicates an apparent difference regarding item preview format. In their study, the results showed differences between the full question preview (written item stem) condition and the answer option preview (oral item stem) condition: test-takers were able to benefit more from previewing the full questions than from previewing only the options. In other words, it seems that the cues provided in the answer options did not facilitate comprehension to the same extent as the item stems did, a finding which our study lends support to. However, the findings of the current study do not seem to support those of Sherman (1997), who found no significant effect of item stem preview on test-takers' performance.

The reason why test-takers do better when the stem of the item is revealed rather than hidden can be justified by referring to psychological aspects of the listening test. When test-takers have access to the item stems as well as the options while listening, they are psychologically more relaxed and feel more secure than in a situation where they have no visual record of the item stem and the stem is gone as soon as it is produced (as in the oral modality). Although this psychological stance is not supported by some studies (e.g., Buck, 1991; Sherman, 1997), the context in which the present study was carried out strongly supports this position, since most Iranian learners are stressed when they take a test, and this stress would increase if test-takers could not see the item stems on their sheets. Furthermore, the cues present in the stem of an item help test-takers understand the item better, and when these cues are presented in written modality, they are processed and retrieved more easily.

 

Listening Test Performance and Note-taking

This study was also an attempt to examine the effect of taking notes on listening test performance while test-takers listened to short talks or long conversations. Contrary to most findings of previous studies, our analysis did not detect any evidence of an effect of note-taking on test-takers' performance in a listening test. A glance at Table 3 shows that the p value of .73 suggests that allowing students to take notes had little effect on their performance compared to test-takers who were not allowed to take notes. Test-takers who did not take notes even gained slightly higher scores than those who took notes, an observation which implies that, while busy taking notes, some candidates may fall behind the flow of speech and lose important pieces of information needed to answer some items.

Kobayashi's (2005) meta-analytic study revealed that the effect of note-taking, compared with not taking notes, was moderately positive. Although the findings of the current study ran counter to our expectations as far as note-taking is concerned (as well as to Kobayashi's observation), they are consistent with those of Hale and Courtney (1991). In their study (with 563 students participating), Hale and Courtney came to the conclusion that allowing students to take notes not only did not have any positive effect on their performance in a listening test but actually impaired their listening performance. The results of their study also showed that examinees made little use of the opportunity to take notes. Another study, carried out by Chaudron, Cook, and Loschky (1988), investigated the relationship between note-taking (as opposed to no note-taking) and listening comprehension and found no significant relationship between the two. Dunkel's (1986) study also corroborates the findings of the current study; he generally concluded that the opportunity to take notes does not necessarily produce beneficial effects. Similarly, Zheng (1996) came to the conclusion that taking notes during a listening test distracted test-takers' attention and did not help them perform better. Lin's (2004) research also revealed that being allowed to take notes did not help students perform better in tests, and further analysis identified the reason as test-takers' limited vocabulary capacity.

Contrary to our findings and those mentioned in the preceding paragraph, research has also demonstrated the potential benefits of note-taking (e.g., Carrell, Dunkel, & Mollaun, 2004). Carrell et al.'s (2004) study showed a facilitating effect on L2 listening comprehension when a group of examinees (in that case with heterogeneous L1s) was allowed to take and refer to notes during mini-lecture listening. To provide theoretical support for such observations, Van Meter, Yokoi, and Pressley (1994) argue that the act of taking notes facilitates college students' attending to the lecture, comprehension of the material to be learned, and subsequent recall. Moreover, Ching Ko's (2007), Yeh's (2004), and Liu's (2001) studies offer additional evidence that taking notes while listening to a text (i.e., in a listening test) facilitates retention of the material and leads to better performance.

One major justification for the lack of a note-taking effect on listening performance in this study may be that note-taking strategies were never taught to the participants before the study. Nor were the candidates monitored to check whether they actually took any notes. Here we can refer to Dunkel's (1986) finding that good notes are the ones that contain the most information in the fewest words; if test-takers simply take notes without attending to their quality, note-taking is unlikely to have any positive effect. The effect of note-taking training is well documented in reading research (Rahmani & Sadeghi, 2011), but this line of inquiry needs to be followed in listening research to offer more insight into the nature of note-taking in listening.

It should also be highlighted that test-takers are selective in taking notes, depending on their own note-taking styles. That is, highly proficient listeners might not record much and, as a result, produce less complete notes, while less proficient listeners might write down as many idea units as they can. Another possible explanation for the results of the current study is that although the participants were homogeneous in terms of language proficiency, they might not have been homogeneous in other attributes of a good note-taker, such as general intelligence, speed of writing, and the ability to take notes at the same time as listening (i.e., writing, reading, and listening simultaneously). It can be claimed that note-taking is not inherently effective; it becomes so when it is used properly in a particular context, when the needed training is offered, and when the quantity and quality of the notes to be taken have already been decided. Indeed, some learners may not know what they should focus on while taking notes and may jot down every word they hear. One cannot say how effective a tool is unless one knows exactly what it is used for and when. Consequently, in the current research, the analysis of the data indicates that the overall encoding effect of note-taking is next to nothing.

 

CONCLUSION AND IMPLICATIONS

Studies have found that people spend 80% of their waking hours communicating, and according to research, at least 45% of that time is spent listening (Lawson, 2007). Therefore, it is important for individuals to be efficient listeners, and improving the listening ability of EFL learners is an essential task for language teachers, course administrators, and test designers, as well as for students themselves. Taking these into consideration, the results of the present study have far-reaching theoretical and practical implications for EFL teachers, test developers, and curriculum designers. Regarding the item modality variable, teachers can help students learn how to concentrate on the text they listen to in contexts where no written text is provided, since in real-life contexts there would be no visual or written support while listening. In other words, teachers should try to teach listening rather than just exposing learners to listening tests. Considering note-taking, EFL teachers can increase learners' note-taking ability by focusing on and teaching the skills necessary to take adequate notes, such as identifying main ideas, transcription speed, etc. Moreover, there is a key implication for test constructors/developers: for example, the length of a listening text, especially under conditions in which test-takers are not permitted to take notes, should be reasonable.

Like most other research studies, this study had some limitations. Factors such as the number of participants, their level of proficiency, and the time of the tests' administration might impede generalizability of the results to other contexts. Moreover, the study was limited in the number of test items, test formats, and features investigated. The results were elicited through administering two separate tests (one test for each independent variable); each test comprised 20 items and lasted about 20 to 25 minutes. The number of items in a single test could have been greater, but considering factors such as time and students' participation rate, it was decided to include 20 items in each test.

Further studies could add a qualitative component that delves into test-takers' opinions and attitudes about the effectiveness of different modalities, and explores the links between learning styles/strategies, test-taking strategies, and various test method facets. Further studies are also needed to compare the performance of test-takers who receive note-taking strategy training with those who do not. It would also be worthwhile to examine the content of the actual notes taken by test-takers to identify what type and quantity of note-taking are desirable for optimum listening performance.

REFERENCES

Alderson, J. C. (2000). Assessing reading. Cambridge: Cambridge University Press.
Bachman, L.F. (1990). Fundamental considerations in language testing. Oxford: Oxford University Press.
Basal, A., Gulozer, K. & Demir, I. (2015). Use of video and audio texts in EFL listening test. Journal of Education and Training Studies, 3(6), 83-89.
Buck, G. (1991). The testing of listening comprehension: An introspective study. Language Testing, 8, 67-91.
Buck, G. (2007). Assessing listening. Cambridge: Cambridge University Press.
Carrell, P. L. (2007). Note-taking strategies and their relationship to performance on listening com­prehension and communicative assessment tasks. (TOEFL Monograph Series No. MS-35). Princeton, NJ: ETS.
Carrell, P. L., Dunkel, P. A., & Mollaun, P. (2004). The effects of note-taking, lecture length, and topic on a computer-based test of ESL listening comprehension. Applied Language Learning, 14, 83-105.
Carter, J. F., & Van Matre, N. H. (1986).  Note-taking versus note having. Journal of Educational Psychology, 67, 900-904.
Chang, C. S. (2008). Enhancing second language listening competence through extensive listening activities: Its effectiveness, difficulties, and solutions. Taiwan: Bookman Books.
Chang, S. (2009). Gains to L2 listeners from reading while listening vs. listening only in comprehending short stories. System, 37, 652-663.
Chaudron, C., Cook, J., & Loschky, L. (1988). Quality of lecture notes and second language listening comprehension. Honolulu: University of Hawaii.
Ching Ko, H.S. (2007).  The impact of note-taking on university EFL learners’ listening comprehension. Journal of Cheng Shiu University, 20, 257-266.
Dunkel, P. (1986). The immediate recall of English lecture information by native and non-native speakers of English as a function of note-taking. Dissertation Abstracts International, 46, 78-93.
Elekai, A., Faramarzi, S., & Biria, R. (2015). Test-takers’ attitudes towards taking pictorial and visual modalities of listening comprehension test in an EFL context. Journal of Language Teaching and Research, 6(2), 308-316.
Hale, H.A., & Courtney, R. (1991). Note-taking and listening comprehension on the test of English as a foreign language. Princeton, NJ: Educational Testing Service.
Jenkins, J. (2000). The phonology of English as an international language: New models, new norms, new goals. Oxford: Oxford University Press.
Kobayashi, K. (2005). What limits the encoding effect of note-taking? A meta-analytic examination. Contemporary Educational Psychology, 30, 242-262.
Lawson, K. (2007). Importance of listening. Retrieved August 20, 2012, from http://www.lawsoncg.com/
Lin, T. Y. (2004). Effects of note-taking on EFL learners’ listening comprehension. Master’s thesis, National Taiwan Normal University, Taiwan.
Liu, Y. (2001). A cognitive study on the functions of note-taking and the content of notes taken in a context of Chinese EFL learners. Master’s Thesis, Guangdong University of Foreign Studies, China.
Markham, P., Peter, L., & McCarthy, T. (2001). The effects of native language vs. target language captions on foreign language students’ DVD video comprehension. Foreign Language Annals, 34, 439-445.
Rahmani, M., & Sadeghi, K. (2011). Effects of note-taking training on reading comprehension and recall. The Reading Matrix, 11(2), 116-128.
Rogowsky, B. A., Calhoun, B. M., & Tallal, P. (2016). Does modality matter? The effects of reading, listening and dual modality on comprehension. SAGE Open, 1-9. doi:10.1177/2158244016669550
Sherman, J. (1997). The effect of question preview in listening comprehension tests. Language Testing, 14, 185-213.
Slotte, V., & Lonka, K. (1999). Review and process effects of spontaneous note-taking on text comprehension. Contemporary Educational Psychology, 24, 1-20.
Song, M. (2011). Note-taking quality and performance on L2 academic listening test. Language Testing, 29(1), 67-89.
Stewart, M.A., & Pertusa, I.  (2004). Gains to language learners from viewing target language close-captioned films. Foreign Language Annals, 37, 438-447.
Taylor, L., & Geranpayeh, A. (2011). Assessing listening for academic purposes: Defining and operationalising the test construct. Journal of English for Academic Purposes, 10, 89-101.
Vandergrift, L. (2007). Recent developments in second and foreign language listening comprehension research. Language Teaching, 40, 191-210.
Van Meter, P., Yokoi, L., & Pressley, M. (1994).  College students' theory of note-taking derived from their perceptions of note-taking. Journal of Educational Psychology, 86 (3), 323-338.
Wagner, E. (2010). The effect of the use of video texts on ESL listening test-takers’ performance. Language Testing, 27(4), 493-513.
Weir, C. (1990).  Communicative language testing. London: Prentice Hall International.
Wu, Y. (1998). What do tests of listening comprehension test? A retrospective study of EFL test-takers performing a multiple-choice task. Language Testing, 15, 21-44.
Yanagawa, K., & Green, A. (2008). To show or not to show: The effects of item stems and answer options on performance on a multiple-choice listening test. System, 36, 107-122.
Yeh, M. C. (2004). The effects of motivation and listening strategies on the English listening comprehension of junior high students’ in Taiwan. Master’s thesis, National Kaohsiung Normal University, Taiwan.
Zhang, Y., & Elder, C. (2011). Judgments of oral proficiency by non-native and native English speaking teacher raters: Competing or complementary constructs? Language Testing, 28(1), 31-50.