Document Type : Research Paper
Author
Goudarz Alibakhshi, Associate Professor of TEFL, Allameh Tabataba’i University, Tehran, Iran
Abstract
Foreign language test anxiety (FLTA) affects learners' cognitive, emotional, and behavioral responses to language assessments. Despite extensive research on its causes and effects, a comprehensive understanding from the learners' perspective remains underexplored. This study aims to synthesize qualitative and mixed-methods research on the antecedents and consequences of FLTA, with a focus on learners’ experiences and perceptions. A systematic search was conducted across primary education and psychology databases for studies published from 1990 to 2025. The review adhered to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) and Enhancing Transparency in Reporting the Synthesis of Qualitative Research (ENTREQ) guidelines. Studies were appraised for methodological quality using the Critical Appraisal Skills Programme (CASP) qualitative checklist. Thematic synthesis and meta-ethnographic translation were used to integrate findings from the selected studies. The review identified key antecedents of FLTA, including learner-level factors (e.g., low self-efficacy, perfectionism, prior academic failure, and inadequate self-regulatory skills), test-design features (e.g., time pressure, difficulty, unfamiliar formats, and unclear criteria), and broader assessment cultures marked by high stakes, exam-driven teaching, and social pressures. The consequences of FLTA were multifaceted, ranging from cognitive interference and reduced performance to avoidance behaviors, reliance on short-term strategies, and negative impacts on long-term engagement with language learning. Positive changes were associated with alternative assessment practices, such as portfolio-based assessment and carefully implemented formative assessment. The findings inform practical recommendations for language teachers, curriculum designers, and policymakers to create assessment environments that reduce FLTA and support learner well-being.
Keywords
- foreign language test anxiety
- antecedents
- consequences
- qualitative systematic review
- language assessment
INTRODUCTION
Foreign language anxiety (FLA) has long been recognized as one of the most powerful affective brakes on second-language learning—shaping what learners notice, remember, and ultimately feel willing (or unwilling) to do with the target language (Horwitz et al., 1986). Within this broader construct, foreign language test anxiety (FLTA)—anxiety specifically tied to tests, exams, and assessment situations in the L2—has become especially consequential because many language tests operate as high-stakes gatekeeping mechanisms with tangible implications for learners’ academic progress and life trajectories (Botes et al., 2020; Tsai & Li, 2012; Zhang, 2019). Nevertheless, despite decades of research on language anxiety (Aydın et al., 2020; Chen, 2025; Gao & Zuo, 2025; Özdemir, 2025), we still know surprisingly little about how learners themselves narrate the antecedents and consequences of FLTA in their own words and lived experiences.
In this review, antecedents refer to the constellation of psychological, pedagogical, and assessment-related conditions that give rise to anxiety in testing situations, including learners’ self-beliefs, prior evaluative experiences, and perceptions of test demands. Consequences, in turn, denote the cognitive, emotional, behavioral, and academic effects that emerge once anxiety is activated, such as disrupted attentional control, impaired language processing, avoidance behaviors, diminished performance, and longer-term impacts on motivation and self-concept. Attending to learners’ subjective accounts of these processes is essential because it shifts the analytic lens from broad correlational patterns toward how anxiety is experienced, interpreted, and negotiated within specific assessment ecologies.
LITERATURE REVIEW
Foundational Evidence and its Limits for FLTA
Early conceptualizations framed foreign language classroom anxiety (FLCA) as a composite of communication apprehension, fear of negative evaluation, and test anxiety (Horwitz et al., 1986). A key merit of this foundational work is that it provided a coherent construct and vocabulary that enabled cumulative measurement and comparison across contexts. Building on this foundation, large-scale meta-analytic syntheses have repeatedly confirmed a robust, moderately negative association between FLA and language achievement across diverse settings and outcome measures (Botes et al., 2020; Teimouri et al., 2019; Zhang, 2019). Reviews focusing on FLCA further report consistent negative correlations with overall grades and skill-specific performance (reading, writing, listening, and speaking), commonly in the moderate range (r ≈ −.30 to −.40) (Botes et al., 2020; Liu & Guzmán, 2025; Zhang, 2019). Parallel scholarship on test anxiety likewise documents harmful associations with academic achievement, particularly within high-stakes exam contexts. Taken together, these findings firmly establish that anxiety is not a trivial “side variable” but a meaningful correlate of performance and learning outcomes.
At the same time, an important demerit of much of this evidence base—especially when the focus is FLTA—is its tendency to treat test anxiety as embedded within broader anxiety composites, rather than as a distinct, assessment-centered phenomenon with its own triggers and meanings. This matters because FLTA is often activated by test-specific pressures (e.g., time limits, unfamiliarity with the format, grading criteria, and perceived fairness) that may not be fully captured when studies focus primarily on classroom anxiety or on global achievement correlations (Horwitz et al., 1986; Zhang, 2019). As a result, the literature has been highly successful at demonstrating that “anxiety relates to outcomes,” while often remaining less informative about how learners experience the antecedent conditions that generate FLTA and why those conditions produce particular consequence patterns.
Theoretical Reframing Through Achievement Emotion Models
Recent conceptual developments have further advanced language anxiety research by reframing it through achievement emotion theories. Control-value theory (CVT) posits that achievement emotions such as anxiety arise from learners’ appraisals of control (e.g., perceived competence, task controllability) and value (e.g., intrinsic interest, importance, perceived cost) in achievement situations (Artino, 2012; Pekrun, 2006). A major merit of CVT-based work is that it offers a mechanism-oriented explanation for when and why anxiety spikes—an especially relevant lens for evaluative events where perceived stakes and perceived control are often heightened (Pekrun, 2006). Applied to L2 learning, CVT has inspired new measures and models that treat anxiety as one of multiple interrelated emotions rather than an isolated negative state (Shao, 2014). For example, Shao et al. (2023) validated the Achievement Emotions Questionnaire–Second Language Learning (AEQ-L2L), demonstrating distinct yet correlated profiles of enjoyment, anxiety, boredom, and other emotions in L2 classrooms and showing that anxiety is systematically associated with lower perceived control and more negative outcomes (Pekrun, 2006; Shao et al., 2023). Qualitative work under the CVT umbrella similarly suggests that learners’ narratives of control and value contribute to complex emotional constellations in which anxiety can coexist with enjoyment, pride, or hope during evaluative episodes (Liu & Guzman, 2025; Yu et al., 2022).
However, a notable limitation remains: even when theory becomes more sophisticated, the empirical emphasis often stays variable-centered, leaving learners’ narrated explanations of antecedents and consequences under-synthesized—particularly in test-centered contexts where perceptions of fairness, transparency, and support may be pivotal.
FLTA Within an Expanding but Fragmented Evidence Base
Second language anxiety research has now evolved into a multi-strand literature spanning conceptual reviews, correlational studies, intervention trials, and dynamic, longitudinal investigations (Papi & Khajavy, 2023). A state-of-the-art overview argues that L2 anxiety is among the most extensively studied affective factors in SLA, with major work clustered around construct definition, effects, and sources (Papi & Khajavy, 2023). Systematic reviews and meta-analyses have refined the understanding of correlates and malleability: negative links with academic performance (Chen et al., 2025), strong negative associations with foreign language self-efficacy (Zhou et al., 2023), and moderate negative relationships between writing anxiety and both writing performance and writing self-efficacy (Li, 2022). Longitudinal synthesis further suggests that FLA is dynamic and potentially malleable, shifting with changes in perceived control, instructional context, and assessment regimes (Sun et al., 2025).
Within this expanding landscape, FLTA has received increasing—but still fragmented—attention. Some quantitative studies extract a “test anxiety” factor from broader FLA measures and show that cognitive test anxiety predicts language test performance even when other anxiety facets are considered (Zhang, 2019). Qualitative and mixed-methods studies underscore the salience of high-stakes gatekeeping exams (e.g., entrance or exit assessments) as emotionally charged events where learners report somatic symptoms, intrusive worry, and catastrophic thoughts about failure (Aydın et al., 2020; Chen, 2024). Importantly, qualitative evidence also indicates that FLTA is not purely debilitating: learners describe both maladaptive outcomes (e.g., avoidance, sleep disruption, self-doubt) and compensatory coping (e.g., over-preparation, peer support), suggesting complex consequence pathways that can be missed by purely correlational designs (Aydın et al., 2020).
Recent qualitative and grounded theory research has begun to probe mechanisms underlying anxiety more broadly. Using interview and journal data from Chinese tertiary learners, Gao and Zuo (2025) identified interacting pressures—perceived stakes, institutional expectations, prior failures, teacher feedback style, and peer comparison—implicated in the emergence and maintenance of foreign language learning anxiety (Gao & Zuo, 2025). Related exploratory work in higher education suggests that anxiety about productive skills may intensify in evaluative situations (e.g., oral exams, graded presentations) and is shaped by perceived fairness, transparency, and teacher support (Alsuwaidi, 2019; Özdemir, 2025). These studies collectively hint that FLTA may be rooted in a layered ecology of antecedents, ranging from individual dispositions (e.g., perfectionism, self-efficacy, prior failure) to classroom practices (e.g., error correction, time pressure) and broader assessment cultures (e.g., high-stakes, norm-referenced systems) (Gao & Zuo, 2025; Zhang, 2019).
Consequences, Interventions, and the Remaining Gap
On the consequences side, meta-analytic evidence clearly indicates that anxiety in foreign language learning environments undermines performance and achievement (Botes et al., 2020; Chen et al., 2025; Teimouri et al., 2019; Zhang, 2019). Complementary reviews and primary studies further link L2 anxiety to reduced strategy use, impaired cognitive functioning under pressure, and broader impacts on well-being and identity (MacIntyre & Gardner, 1994; Papi, 2021, 2023; Papi & Khajavy, 2023). In exam contexts, learners’ accounts of “blank mind,” intrusive self-criticism, and post-test rumination are especially salient because they illustrate how FLTA can extend beyond immediate performance into cycles of chronic self-doubt and future avoidance (Aydın et al., 2020; Gao & Zuo, 2025). Meta-analytic findings on the strong negative association between FLA and self-efficacy further reinforce the possibility of reciprocal cycles in which perceived (in)competence both fuels and is reinforced by anxiety and underperformance (Wu & Li, 2017; Zhou et al., 2023).
At the intervention level, research aiming to reduce FLA reports modest but heterogeneous effects, varying by intervention type, duration, targeted anxiety component, and contextual constraints (Xiong & Zhang, 2024). Positive psychology–oriented approaches (e.g., guided reminiscing about past achievements) appear promising for improving emotional profiles while reducing anxiety (Jin & Zhang, 2021). Technology-focused reviews likewise report a mixed picture: some tools may reduce speaking anxiety by enabling lower-pressure practice, whereas surveillance-like assessment technologies may intensify test-related anxiety (Huang et al., 2025). Despite these advances—and several broad reviews of language anxiety (Fattahi Marnani, 2022; Tran, 2023; Yu, 2021)—the literature still lacks a focused qualitative synthesis of FLTA as lived and narrated by learners. Existing qualitative studies often remain single-site and difficult to integrate across contexts and theoretical frames, limiting their cumulative explanatory power for understanding how antecedents and consequences connect within real testing ecologies (Aydın et al., 2020; Chen, 2024; Gao & Zuo, 2025; Özdemir, 2025).
There are at least three reasons why a qualitative systematic review and meta-synthesis of FLTA antecedents and consequences is timely and necessary. First, CVT-based research underscores that emotions are embedded in learners’ interpretive frameworks of control, value, fairness, and identity—dimensions that quantitative scales only partially capture (Pekrun, 2006; Shao et al., 2023; Yu et al., 2022). Second, learners’ narratives suggest that FLTA consequences may extend beyond immediate score decrements to include longer-term avoidance, altered academic or career plans, and shifts in self-concept and relationships—patterns often most visible in qualitative evidence (Gao & Zuo, 2025; Huang et al., 2025; MacIntyre & Gardner, 1994). Third, in policy environments increasingly shaped by high-stakes testing and evolving assessment technologies, synthesizing how learners make sense of FLTA can inform more humane assessment design, targeted support mechanisms, and teacher development initiatives (Huang et al., 2025).
PURPOSE OF THE STUDY
Responding to the above-mentioned gaps, the present study conducts a qualitative systematic review and meta-synthesis of research on the antecedents and consequences of foreign language test anxiety (FLTA). Through qualitative evidence synthesis, the study aims to move beyond whether FLTA matters toward a richer understanding of how it is embedded in the lived ecology of language testing and how that ecology might be reshaped in more supportive, equity-oriented directions. Specifically, the following research questions guide the review:
- What are the antecedents of foreign language test anxiety?
- What are the consequences of foreign language test anxiety?
METHOD
Design
This study employed a qualitative systematic review with an interpretive evidence synthesis to integrate learner-reported evidence on the antecedents and consequences of foreign language test anxiety (FLTA). Consistent with core systematic review principles, we used a transparent search, explicit eligibility criteria, dual-stage screening, and an auditable synthesis process (Page et al., 2021). Reporting followed PRISMA 2020 and ENTREQ recommendations for qualitative evidence syntheses (Sohrabi et al., 2021; Tong et al., 2012). To generate higher-order conceptual insights while preserving the meaning of primary accounts, we conducted thematic synthesis (Thomas & Harden, 2008), informed by meta-ethnographic translation to support interpretive “theme-to-theme” comparison across studies (Luong et al., 2023; Noblit & Hare, 1988).
Eligibility Criteria
Eligibility criteria were specified a priori and then operationalized during pilot screening to ensure consistent application. Studies were included if they (a) explicitly examined FLTA or closely aligned constructs (e.g., test/exam anxiety in foreign language courses, high-stakes proficiency tests, language certification exams), (b) reported primary qualitative data generated from learners (e.g., interviews, focus groups, diaries, reflective journals, online narratives), and (c) presented qualitative findings about antecedents and/or consequences of anxiety in L2 assessment contexts. Mixed-methods studies were included only when qualitative data were clearly reported and separable. Studies focusing exclusively on first-language test anxiety or non-language subject tests were excluded unless the testing context involved a foreign/second language. Only peer-reviewed journal articles in English published from 1990 onward were included.
Information Sources and Search Strategy
Searches were conducted in Web of Science, Scopus, ERIC, and PsycINFO, complemented by (a) hand-searching relevant journals and (b) backward and forward citation tracking to identify additional eligible qualitative studies. Search strings were developed iteratively through pilot searches, combining terms for (1) language learning context, (2) anxiety, (3) testing/assessment, and (4) qualitative methods, consistent with guidance on qualitative review searching and indexing limitations (Booth et al., 2016). A typical string was:
(“foreign language” OR “second language” OR EFL OR ESL OR L2) AND (anxiety OR “test anxiety” OR “exam anxiety” OR “assessment anxiety”) AND (test* OR exam* OR assess* OR evaluat*) AND (qualitative OR interview* OR “focus group*” OR narrative* OR “open-ended” OR diary OR journal)
No additional methodological filters beyond qualitative keywords were imposed, given variability in database tagging of qualitative studies (Booth et al., 2016).
Study Selection
Study selection proceeded in two stages: title/abstract screening followed by full-text screening. To reduce ambiguity, two reviewers independently screened an initial subset and met to align interpretation of key concepts (e.g., what counted as “test anxiety” in an L2 context; what qualified as learner-generated qualitative evidence). Full texts of potentially eligible articles were then screened independently by two reviewers against the eligibility criteria. Disagreements were resolved through discussion, with unresolved cases adjudicated by a third reviewer. Reasons for full-text exclusion (e.g., not an L2 testing context; absence of qualitative data; anxiety not tied to evaluative assessment) were recorded to support transparent reporting (Page et al., 2021; Tong et al., 2012).
Quality Appraisal
Methodological quality was appraised using the CASP qualitative checklist (CASP, 2018). Two reviewers independently rated each study and compared judgments. Discrepancies were resolved through discussion; a third reviewer adjudicated when needed. Appraisal was used to contextualize confidence in findings rather than to automatically exclude studies, consistent with interpretive synthesis goals and transparency principles (Tong et al., 2012).
Data Extraction
To address common criticisms that qualitative extraction procedures are often underspecified, we implemented a structured extraction template alongside an explicit extraction protocol to ensure consistency, transparency, and traceability across studies. Specifically, the extraction form first captured core study descriptors, including author and year, country or setting, participant characteristics (e.g., educational level and sample size), target language, and assessment type (e.g., course exams, oral exams, proficiency or certification tests). Next, it recorded key elements of the assessment ecology, including perceived stakes (low vs. high), test format (written vs. oral), evaluation features (e.g., time limits and transparency of criteria), and relevant classroom or testing conditions.
We then extracted qualitative evidence units at two analytic levels: first-order constructs representing learners’ voices (verbatim quotations and narrative excerpts reported in findings) and second-order constructs representing authors’ interpretations (themes, categories, and interpretive claims). Finally, to enable systematic synthesis, each extracted evidence unit was entered into mapping fields and tagged as an Antecedent, Consequence, or Linking process, with the latter used when the source text explicitly connected a trigger to an outcome through a stated mechanism (e.g., perceived unfairness → worry → blank mind).
When an included study reported a quotation such as “During the oral exam I couldn’t speak; my mind went blank because I feared losing my scholarship,” we extracted it as a first-order evidence unit and entered it in three columns: (a) quotation, (b) immediate code(s), and (c) provisional classification. In this example, “feared losing my scholarship” was coded as high perceived stakes (Antecedent), “oral exam” as assessment format: oral/interactive (Context), and “my mind went blank / couldn’t speak” as cognitive interference/speech disruption (Consequence). If the same paper interpreted these accounts as showing that “high stakes amplify perceived failure cost, which increases cognitive worry and impairs retrieval,” that interpretive statement was extracted as a second-order construct and linked to the relevant quotations.
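To make the traceability of this procedure concrete, the sketch below shows one way such an extraction record could be represented as structured data. This is a minimal illustration only: the field names and the study label are assumptions introduced for the example, not the labels used on the actual extraction form.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class EvidenceUnit:
    """One first-order evidence unit (a learner quotation) and its provisional coding."""
    study: str                                   # author and year of the included study
    quotation: str                               # verbatim quotation reported in the findings
    coded_segments: List[Tuple[str, str, str]]   # (quoted segment, immediate code, classification)


# The worked example above, entered as a single record; the study label is hypothetical.
example = EvidenceUnit(
    study="Hypothetical included study",
    quotation=("During the oral exam I couldn't speak; my mind went blank "
               "because I feared losing my scholarship"),
    coded_segments=[
        ("feared losing my scholarship", "high perceived stakes", "Antecedent"),
        ("oral exam", "assessment format: oral/interactive", "Context"),
        ("my mind went blank / couldn't speak",
         "cognitive interference/speech disruption", "Consequence"),
    ],
)
```

Representing each record this way keeps the quotation, its immediate codes, and the provisional classification together, which is what allows later themes to be audited back to learners’ own words.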
Handling Mixed-methods Studies
For mixed-methods papers, only the qualitative sections (quoted data and qualitative themes) were extracted and synthesized. Quantitative results were recorded as contextual descriptors (e.g., sample characteristics; test type) but not merged into the qualitative coding unless the authors explicitly integrated them into qualitative interpretation.
Synthesis Procedures
Synthesis followed thematic synthesis (Thomas & Harden, 2008) in three iterative steps:
- Line-by-line coding: First-order constructs (learner quotations/narratives) and second-order constructs (author interpretations) were coded inductively. Codes were kept close to participants’ meanings at this stage (e.g., time pressure, unclear criteria, teacher harshness, peer comparison, catastrophic thinking, blank mind, avoidance).
- Descriptive themes: Codes were clustered into descriptive categories corresponding to (a) antecedents, (b) consequences, and (c) linking mechanisms (e.g., control/value appraisals, perceived fairness).
- Analytic themes: We then generated higher-order explanations that accounted for cross-study patterns and contradictions—for instance, distinguishing assessment-design antecedents (e.g., time pressure; unclear criteria) from sociocultural antecedents (e.g., gatekeeping stakes; normative comparison), and modeling how these cascaded into consequence pathways (e.g., worry → attentional disruption → performance decline → post-test rumination).
To strengthen interpretive integration, we incorporated meta-ethnographic translation (Noblit & Hare, 1988), comparing themes across studies to identify where concepts were effectively equivalent (reciprocal translation) and where they conflicted or varied by context (refutational translation), before producing a consolidated line-of-argument synthesis (Luong et al., 2023). Throughout synthesis, we maintained an audit trail linking analytic themes back to the extracted first-order quotations and second-order author interpretations to ensure transparency and traceability (Tong et al., 2012).
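As a purely illustrative sketch of the audit trail just described, the structure linking codes, descriptive categories, and analytic themes back to first-order quotations can be thought of as follows; the studies, codes, quotations, and theme label below are invented stand-ins rather than data from the included studies.

```python
from collections import defaultdict

# Illustrative coded units; each keeps its first-order quotation so that any theme
# built from it can be traced back to learners' own words.
coded_units = [
    {"study": "Study A", "code": "time pressure", "strand": "antecedent",
     "quote": "I kept watching the clock and lost my place."},
    {"study": "Study B", "code": "unclear criteria", "strand": "antecedent",
     "quote": "I never knew what the examiner actually wanted."},
    {"study": "Study C", "code": "blank mind", "strand": "consequence",
     "quote": "My mind just went blank during the exam."},
]

# Step 2: cluster line-by-line codes into descriptive categories (here, simply by strand).
descriptive_themes = defaultdict(list)
for unit in coded_units:
    descriptive_themes[unit["strand"]].append(unit)

# Step 3: an analytic theme states a higher-order explanation and records which units
# support it, preserving the theme-to-quotation audit trail.
analytic_theme = {
    "label": "Low perceived control under test-design pressure feeds cognitive interference",
    "supported_by": [(u["study"], u["quote"]) for u in coded_units],
}

for strand, units in descriptive_themes.items():
    print(strand, "->", [u["code"] for u in units])
print(analytic_theme["label"], "| evidence units:", len(analytic_theme["supported_by"]))
```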
RESULTS
Antecedents of FLTA
Across the studies in Table 1, antecedents cluster into learner-related appraisals, test/task properties, and assessment ecology (classroom + institutional). Within those, the most recurrent sub-antecedents are:
1) Learner-Level Antecedents
The learner-related antecedents of FLTA are multi-layered and tend to cluster around perceived vulnerability, evaluative threat, and maladaptive preparation patterns. First, learners frequently attribute their anxiety to perceived low proficiency and inadequate preparedness, describing themselves as “not ready,” lacking solid foundations, or feeling that their skills are too fragile to withstand exam pressure (Aydin, 2013; Aydın et al., 2021; Gursoy & Arman, 2016; Abusurra, 2023). Second, many accounts point to prior negative testing experiences and failure memory, whereby earlier poor scores and discouraging exam episodes shape anticipatory fear and create an expectation that failure will recur (Abusurra, 2023; Gursoy & Arman, 2016). Third, learners repeatedly highlight fear of negative evaluation, particularly when performance is perceived as publicly judged by teachers or peers, which activates embarrassment, shame, and concerns about competence display (Abusurra, 2023; Aydın et al., 2021). Fourth, the reviewed evidence suggests skill-specific vulnerability, such that anxiety may intensify in reading proficiency contexts or in productive skill situations where performance is visible and face-threatening (Abusurra, 2023; Tsai & Li, 2012). Fifth, several studies suggest that these vulnerabilities are sustained by maladaptive preparation orientations, including cramming, over-fixation on errors, and fluctuating confidence that varies with item type, which may temporarily increase effort but ultimately reinforce anxiety cycles (Aydin, 2013; Khoshhal, 2021).
2) Test and Task Design Antecedents
The test and task design antecedents of FLTA are primarily linked to features that heighten time-based pressure, uncertainty, and error salience during assessment. First, learners’ anxiety is frequently intensified by time pressure and pacing demands, particularly when they perceive the allotted time as insufficient or the test as requiring speeded performance, which increases the likelihood of rushing, losing control, and second-guessing responses (Khoshhal, 2021; Aydın et al., 2021; Tsai & Li, 2012).
Second, FLTA is amplified by unfamiliar or “tricky” formats, including novel item types and perceived unpredictability, because unfamiliarity reduces perceived controllability and makes learners feel they are being tested on “surprises” rather than learned content (Khoshhal, 2021; Tsai & Li, 2012; Aydin, 2013). Third, studies indicate that error-salient item types—such as error identification or error recognition formats—can be especially anxiety-provoking, as they foreground mistakes, intensify vigilance, and trigger self-doubt even among learners who might otherwise perform competently (Khoshhal, 2021; Aydin, 2013). Fourth, learners also report heightened anxiety when there is a perceived mismatch between instruction and assessment, meaning that what was taught, practiced, or emphasized in class does not align with what is ultimately tested, thereby undermining trust in the assessment and strengthening anticipatory worry (Aydın et al., 2021; Saha, 2014).
3) Assessment Ecology Antecedents (classroom/institution)
The assessment ecology antecedents of FLTA operate through classroom interactions and institutional testing cultures, shaping how learners interpret the meaning and consequences of assessment. First, learners frequently attribute anxiety to the teacher feedback and correction climate, particularly when feedback is experienced as harsh, punitive, or emotionally unsupportive, because such climates increase fear of making mistakes and strengthen evaluative threat (Aydın et al., 2021; Abusurra, 2023).
Second, FLTA is intensified when learners perceive low fairness, transparency, or clarity of criteria, such as ambiguous expectations or unclear scoring, because uncertainty about how performance will be judged reduces perceived control and increases rumination (Abusurra, 2023). Third, anxiety becomes especially pronounced under high-stakes consequences and accountability pressures, where grades, progression, or gatekeeping requirements amplify the perceived cost of failure and turn tests into consequential life events rather than routine evaluations (Saha, 2014; Gursoy & Arman, 2016).
Fourth, even when assessment is labeled formative, learners may still experience formative assessment pressure if tasks are monitored, feel consequential, or are enacted in an evaluative tone, indicating that perceived threat is shaped by implementation rather than labels alone (Bukhori et al., 2025). Fifth, the corpus also identifies a protective ecology factor—portfolio-oriented assessment—because distributing evidence across time and reframing evaluation away from one-shot performance can reduce threat and, in turn, lessen FLTA (Contreras-Soto et al., 2019).
Overall, these findings suggest that FLTA tends to emerge most strongly when perceived vulnerability (e.g., low readiness, fear of judgment, skill fragility) intersects with perceived threat embedded in the assessment environment (e.g., time pressure, error-salient formats, unclear criteria, and high-stakes cultures) (Aydın et al., 2020; Khoshhal, 2021; Tsai & Li, 2012; Gursoy & Arman, 2016; Abusurra, 2023).
Table 1. Selected studies on antecedents of FLTA
| Author(s), Year | Title | Journal |
| --- | --- | --- |
| Saha, 2014 | EFL Test Anxiety: Sources and Supervisions | Journal of Teaching and Teacher Education |
| Aydın et al., 2020 | Test Anxiety among Foreign Language Learners: A Qualitative Study | The Qualitative Report |
| Bukhori et al., 2025 | Test Anxiety During Formative Assessment in English Learning: Insights from Islamic Boarding School Students | Al-Manar: English and Arabic Journal |
| Contreras-Soto, Véliz-Campos, & Véliz, 2019 | Portfolios as a Strategy to Lower English Language Test Anxiety: The Case of Chile | International Journal of Instruction |
| Khoshhal, 2021 | Test Anxiety among English Language Learners: A Case of Vocabulary Testing Using Multiple-Choice Items and Error Identification Tests | REIRE Revista d’Innovació i Recerca en Educació |
| Aydın, İnceçay, & Karabacak, 2021 | A Descriptive Study on Test Anxiety among Foreign Language Learners | FIRE: Futuristic Implementations of Research in Education |
| Gursoy & Arman, 2016 | Analyzing Foreign Language Test Anxiety among High School Students in an EFL Context | Journal of Education and Learning |
| Aydin, 2013 | Factors Affecting the Level of Test Anxiety among EFL Students | IOSR Journal of Research & Method in Education |
| Abusurra & Shalandi, 2023 | EFL Students’ Experiences and Views of Test Anxiety | Democratic Arabic Center (Education Series) |
| Tsai & Li, 2012 | Test Anxiety and Foreign Language Reading Anxiety in a Reading-Proficiency Test | Journal of Social Sciences |
Consequences of FLTA
Across Table 2, consequences extend beyond “lower scores” into cognitive, behavioral, motivational, and longer-horizon effects.
(1) Performance and Achievement Consequences
Across quantitative and mixed-methods studies, FLTA is consistently associated with lower performance on language assessments, including vocabulary, grammar, reading, and listening. This pattern appears robust across different testing formats and institutional contexts, suggesting that anxiety operates as a broadly performance-undermining condition rather than a skill-bound nuisance. Importantly, the association is not presented as merely correlational “noise,” but as a meaningful performance constraint that can plausibly accumulate over repeated evaluative events, especially in settings where language tests function as progression requirements. Taken together, the evidence supports the interpretation that FLTA systematically depresses observable achievement indicators, making it a consequential barrier for learners navigating exam-driven language systems (Salehi & Marefat, 2014; Nihae & Chiramanee, 2014; Khoshhal, 2021; Tsai & Li, 2012; Wu & Lee, 2017).
(2) Cognitive-processing Consequences (mechanism layer)
The most consistently described mechanism is cognitive interference: anxious worry and intrusive self-evaluative thoughts consume attentional resources, disrupt retrieval, and compromise the efficiency of online language processing. Rather than simply “feeling nervous,” learners experience a cognitive bottleneck in which concentration becomes unstable and recall becomes less accessible under pressure. This is amplified when tests impose strict time limits or employ error-salient formats that invite constant self-monitoring and second-guessing. In such conditions, learners may read more slowly, re-check answers excessively, fixate on mistakes, or lose track of meaning while monitoring form—patterns that translate anxiety into measurable performance costs. Consequently, FLTA is best understood as a cognitive load amplifier that distorts the very processes tests are intended to sample (Aydın et al., 2021; Khoshhal, 2021; Nihae & Chiramanee, 2014; Tsai & Li, 2012).
(3) Behavioral Consequences
Behaviorally, FLTA is associated with both avoidance and maladaptive coping patterns that can maintain the problem over time. Avoidance may appear as procrastination, reduced engagement with preparation, reluctance to practice under test-like conditions, or avoidance of risk-taking in language use—behaviors that may provide short-term emotional relief but increase vulnerability at the next assessment. Conversely, some learners respond with intense compensatory routines such as over-preparation, rigid rehearsal, repeated checking, or hypervigilant monitoring. While these strategies can sometimes raise preparedness, they may also reinforce the belief that performance is fragile and that failure is catastrophic, thereby sustaining anxiety cycles across successive tests. Thus, FLTA influences not only what happens during the exam, but also how learners structure their learning and preparation in the weeks leading up to it (Aydın et al., 2021; Khoshhal, 2021).
(4) Motivational and Attitudinal Consequences
Motivationally, FLTA tends to erode learners’ engagement by lowering learning motivation and undermining the perceived value of test-linked coursework or requirements. When testing becomes the dominant frame, assessment may be experienced less as feedback and more as a threat, diminishing voluntary investment and shifting effort toward defensive strategies rather than developmental learning. Over time, repeated anxiety-laden exam experiences can produce more generalized negative attitudes toward English and assessment, in which the language becomes associated with judgment, vulnerability, and potential failure rather than competence and growth. This longer-horizon consequence is particularly important because it suggests FLTA may shape trajectories: anxious learners may disengage from elective opportunities, reduce participation, or avoid future assessment-heavy pathways, thereby indirectly constraining learning and attainment beyond any single test event (Wu & Lee, 2017; Aydın et al., 2020).
(5) Affective and Physiological Consequences
On the affective–physiological plane, FLTA is linked to pre-exam somatic strain and emotional depletion, including sleep disruption and heightened stress symptoms around exam periods. These effects matter because they can intensify the cognitive mechanisms described above: poorer sleep and sustained stress can reduce attentional control and working capacity, making intrusive worry more difficult to regulate during testing. Moreover, the anticipation of these symptoms can itself become a conditioned cue—learners may begin to expect bodily distress whenever a test approaches, which strengthens threat appraisals and increases vulnerability to anxiety spirals. In this way, FLTA becomes not only a momentary emotional reaction but also a recurring embodied experience that shapes how learners approach assessment seasons and how resilient they feel within evaluative language-learning environments (Aydın et al., 2020).
Table 2. Selected studies on the consequences of FLTA
| Author(s), Year | Title | Journal |
| --- | --- | --- |
| Salehi & Marefat, 2014 | The Effects of Foreign Language Anxiety and Test Anxiety on Foreign Language Test Performance | Theory and Practice in Language Studies |
| Nihae & Chiramanee, 2014 | Multiple-Choice and Error Recognition Tests: Effects of Test Anxiety on Test Performance | International Journal of English Language Education |
| Khoshhal, 2021 | Test Anxiety among English Language Learners: A Case of Vocabulary Testing Using Multiple-Choice Items and Error Identification Tests | REIRE Revista d’Innovació i Recerca en Educació |
| Tsai & Li, 2012 | Test Anxiety and Foreign Language Reading Anxiety in a Reading-Proficiency Test | Journal of Social Sciences |
| Wu & Lee, 2017 | The Relationships between Test Performance and Students’ Perceptions of Learning Motivation, Test Value, and Test Anxiety in the Context of the English Benchmark Requirement for Graduation in Taiwan’s Universities | Language Testing in Asia |
| Lee et al., 2015 | Effects of Audio-Visual Aids on Foreign Language Test Anxiety, Reading and Listening Comprehension, and Retention in EFL Learners | Perceptual and Motor Skills |
| Tasan, Mede, & Sadeghi, 2021 | The Effect of Pranayamic Breathing as a Positive Psychology Exercise on Foreign Language Learning Anxiety and Test Anxiety among Language Learners at the Tertiary Level | Frontiers in Psychology |
| Aydın et al., 2020 | Test Anxiety among Foreign Language Learners: A Qualitative Study | The Qualitative Report |
| Bukhori et al., 2025 | Test Anxiety During Formative Assessment in English Learning: Insights from Islamic Boarding School Students | Al-Manar: English and Arabic Journal |
| Contreras-Soto, Véliz-Campos, & Véliz, 2019 | Portfolios as a Strategy to Lower English Language Test Anxiety: The Case of Chile | International Journal of Instruction |
DISCUSSION
This qualitative systematic review synthesized learners’ accounts of the antecedents and consequences of foreign language test anxiety (FLTA). Across the included studies, FLTA emerged not as a narrow, intra-psychic trait but as a situated, context-sensitive achievement emotion that crystallizes at the intersection of learners' perceived vulnerability, assessment design, and the social-institutional stakes attached to language testing. The expanded findings strengthen a central claim: FLTA is best explained as an ecological cascade—assessment conditions and meanings (stakes, transparency, format, time pressure, feedback climate) shape appraisals and coping, which then reorganize cognition during tests and, over time, alter motivation, participation, and trajectories.
FLTA within the Broader FLA Landscape
Our synthesis aligns with foundational accounts that positioned foreign language classroom anxiety (FLCA) as a constellation of communication apprehension, fear of negative evaluation, and test anxiety (Horwitz et al., 1986). Learners’ narratives in the reviewed corpus suggest that test episodes intensify all three components simultaneously: performance is publicly or symbolically evaluated, errors carry exaggerated meaning, and outcomes are perceived as consequential. This convergence helps explain why FLTA often feels sharper than day-to-day classroom anxiety and why it is reliably linked to lower achievement in the broader quantitative literature (Teimouri et al., 2019; Zhang, 2019). Importantly, learners’ descriptions also support classic cognitive accounts in which anxiety undermines processing efficiency by consuming attentional resources and increasing self-monitoring (MacIntyre & Gardner, 1994). In other words, FLTA is not merely a negative feeling that co-occurs with weaker performance; learners repeatedly portray it as a force that actively changes how they think and respond during assessment.
At the same time, the expanded findings justify moving beyond “anxiety in isolation.” Studies capturing intervention or alternative assessment conditions suggest that test episodes can become emotional hotspots where anxiety coexists with more adaptive emotions when learners experience higher control (e.g., more preparation opportunities, clearer criteria, lower threat) and interpret the test as fairer and more learnable (Contreras-Soto et al., 2019; Lee et al., 2015; Tasan et al., 2021). This pattern is consistent with achievement-emotion perspectives: when perceived control increases and the value (stakes) remains high, anxiety may diminish without reducing engagement, allowing more productive emotional profiles to surface during evaluative moments. Thus, our results strengthen the argument that FLTA is not inevitable under testing; rather, it is partially produced by how testing is designed, explained, and socially enacted.
Multi-layered Antecedents: Why These Triggers Are Plausible
A key contribution of this review is demonstrating that learners experience FLTA antecedents as layered and interacting, rather than as single predictors. At the learner level, antecedents cluster around perceived low preparedness and low confidence, fear of negative evaluation, and carryover effects from prior negative testing experiences (Aydin, 2013; Aydın et al., 2020; Abusurra, 2023; Gursoy & Arman, 2016). These antecedents are compelling because they shape the very appraisal that a test is “uncontrollable”—and uncontrollability is precisely what learners report when they anticipate blanking out, being exposed, or failing despite effort. Notably, several studies imply that learners do not treat these vulnerabilities as fixed dispositions; rather, they view them as shaped by repeated feedback cycles and assessment histories, supporting an interpretation of FLTA as dynamically constructed over time (Aydın et al., 2020; Gursoy & Arman, 2016).
At the test-design level, the expanded extraction clarifies why specific task properties repeatedly trigger FLTA: time pressure, unfamiliar formats, and error-salient item types (e.g., error identification/recognition) intensify self-monitoring and uncertainty, which are fertile conditions for worry and intrusive thoughts (Khoshhal, 2021; Tsai & Li, 2012; Nihae & Chiramanee, 2014). These are not superficial complaints about tests being “hard”; they are design features that plausibly reduce learners’ perceived control and increase threat salience. The finding that error-focused item types can amplify anxiety is particularly useful because it helps distinguish between unavoidable challenge (construct-relevant difficulty) and avoidable threat (formats that foreground mistakes and invite second-guessing), a distinction that matters for ethically defensible language assessment.
At the classroom and institutional level, learners’ accounts justify treating FLTA as an assessment-culture outcome. Teacher feedback and correction climates, perceived transparency/fairness, and the social meaning of scores can amplify FLTA by turning assessment into a judgment of competence or status rather than a measure of learning (Aydın et al., 2020; Abusurra, 2023; Aydın et al., 2021; Gursoy & Arman, 2016). The new evidence also strengthens the point that “formative” assessment is not automatically low-anxiety: when formative tasks are experienced as monitored, consequential, or evaluative in tone, they can still elicit test anxiety (Bukhori et al., 2025). Conversely, portfolio-oriented approaches appear to reduce anxiety partly by redistributing evaluation across time and evidence, thereby lowering the sense that a single performance moment defines ability (Contreras-Soto et al., 2019). Together, these findings offer a coherent justification: FLTA escalates when assessment systems communicate high cost of failure under low perceived controllability, especially when social evaluation is salient.
Consequences: A Defensible Pathway from Anxiety to Outcomes
The synthesis of consequences becomes more convincing when framed as a mechanistic chain rather than a list of outcomes. First, multiple studies show that FLTA is associated with lower test performance across vocabulary, grammar, reading, and listening measures (Khoshhal, 2021; Nihae & Chiramanee, 2014; Salehi & Marefat, 2014; Tsai & Li, 2012; Wu & Lee, 2017). The plausibility of this association is strengthened by convergent evidence about how performance is impaired: learners describe cognitive interference—worry, intrusive thoughts, and self-monitoring—competing with attention and retrieval, particularly under time pressure or error-salient formats (Aydın et al., 2020; Khoshhal, 2021; Nihae & Chiramanee, 2014; Tsai & Li, 2012). This mechanism-level explanation provides the “reasonable justification” reviewers typically want: the findings are not merely that anxious students do worse, but that anxiety systematically changes test-time processing in ways that predict poorer performance.
Second, the expanded synthesis clarifies that consequences extend beyond the exam room into behavior and motivation. Learners report avoidance, procrastination, and reduced willingness to engage in test-like practice—behaviors that may temporarily regulate distress but increase vulnerability at the next assessment, creating a self-reinforcing loop (Aydın et al., 2020; Khoshhal, 2021). At the motivational level, FLTA is linked to reduced perceived test value and declining engagement in exam-linked coursework, contributing to longer-term negative attitudes toward English and assessment (Aydın et al., 2020; Wu & Lee, 2017). These consequences are theoretically coherent: if repeated evaluative episodes reliably produce threat and embarrassment, learners may rationally reduce exposure by disengaging or narrowing participation. Finally, the affective/physiological consequences reported (e.g., sleep disruption, pre-exam strain) further intensify cognitive vulnerability, plausibly worsening attention and retrieval during tests and increasing the likelihood of intrusive worry (Aydın et al., 2020). Overall, the evidence supports an ecological feedback model: antecedent conditions generate FLTA, which disrupts cognition and behavior, thereby undermining performance and motivation and, in turn, recreating the conditions for future FLTA.
Why Interventions Help
A particularly persuasive aspect of the corpus is that intervention and alternative-assessment studies do more than “reduce anxiety”—they also clarify what causes it. If portfolio assessment lowers FLTA and improves outcomes, this supports the inference that one-shot, high-threat testing conditions are not just correlated with anxiety but partially produce it (Contreras-Soto et al., 2019). Similarly, audio-visual supports appear to reduce anxiety while improving comprehension/retention outcomes, suggesting that anxiety is sensitive to scaffolding that increases perceived control and reduces processing overload during assessment (Lee et al., 2015). Breathing-based interventions show that physiological regulation can interrupt the anxiety cascade, supporting a multi-component view in which cognitive worry and somatic arousal jointly shape test performance (Tasan et al., 2021).
Evidence from formative assessment contexts further indicates that anxiety is not solely a function of “summative stakes” but of perceived evaluative threat, surveillance, and judgment—meaning that formative designs must be implemented with care to prevent them from inheriting the anxiety profile of summative tests (Bukhori et al., 2025). In short, the modifiability findings are not peripheral; they provide a strong empirical justification for the ecological account by showing that changing assessment conditions and providing regulation support changes the emotional outcome.
Methodologically, this review demonstrates why qualitative evidence synthesis is essential for understanding FLTA. Meta-analyses convincingly establish that anxiety correlates with achievement (Teimouri et al., 2019; Zhang, 2019), but they rarely reveal learners’ explanatory models of why anxiety emerges, which test features trigger it, and how it reorganizes cognition and behavior. By integrating studies that include learner accounts, the present synthesis surfaces fine-grained antecedents (e.g., error-salient formats; fairness/transparency concerns; formative assessment pressure) and longer-horizon consequences (e.g., motivational erosion; negative attitudes; physiological strain) that are easily lost when FLTA is treated as a subscale or residual variance. Theoretically, the review supports an appraisal-sensitive interpretation of FLTA: antecedents map coherently onto perceived controllability (preparedness, clarity, time, format familiarity) and perceived cost/value (stakes, judgment, gatekeeping), while consequences follow predictable pathways through cognitive interference and avoidance loops (Aydın et al., 2020; Bukhori et al., 2025; Contreras-Soto et al., 2019; Khoshhal, 2021; Lee et al., 2015; Nihae & Chiramanee, 2014; Tasan et al., 2021).
CONCLUSION AND IMPLICATIONS
The most defensible implications flow directly from the antecedent–mechanism–consequence chain. First, assessment designers should reduce avoidable threat by improving transparency (clear criteria, exemplars), minimizing unnecessary time pressure, and reconsidering error-salient formats when they inflate anxiety without strengthening construct representation (Khoshhal, 2021; Tsai & Li, 2012). Second, teachers can reduce social-evaluative threat by shifting feedback climates away from public judgment and toward supportive guidance, and by normalizing errors as developmental evidence rather than personal deficiency. Third, programs should consider integrating portfolio components or distributed assessment, not merely as “alternative assessment,” but as anxiety-sensitive design that increases perceived control while maintaining meaningful evaluation. Finally, learners can be supported through targeted regulation strategies—including simple physiological techniques—because the evidence suggests that interrupting arousal can reduce the downstream cognitive interference that harms performance.
Disclosure statement
No potential conflict of interest was reported by the authors.
ORCID
Goudarz Alibakhshi