
New data on text reading in English as a second language

The Wave 2 expansion of the Multilingual Eye-Movement Corpus (MECO)

Published online by Cambridge University Press:  12 March 2025

Victor Kuperman*
Affiliation:
McMaster University
Sascha Schroeder
Affiliation:
University of Goettingen
Cengiz Acartürk
Affiliation:
Jagiellonian University
Niket Agrawal
Affiliation:
Indian Institute of Technology Kanpur
Dominick M. Alexandre
Affiliation:
Universidade Federal do Ceará
Lena S. Bolliger
Affiliation:
University of Zurich
Jan Brasser
Affiliation:
University of Zurich
César Campos-Rojas
Affiliation:
Pontificia Universidad Católica de Valparaíso; Millennium Nucleus for the Science of Learning
Denis Drieghe
Affiliation:
University of Southampton
Dušica Filipović Đurđević
Affiliation:
University of Belgrade; University of Novi Sad
Luiz Vinicius Gadelha de Freitas
Affiliation:
Universidade Federal do Ceará
Sofya Goldina
Affiliation:
Université Paris Cité; National Research University Higher School of Economics Moscow
Romualdo Ibáñez Orellana
Affiliation:
Pontificia Universidad Católica de Valparaíso; Millennium Nucleus for the Science of Learning
Lena A. Jäger
Affiliation:
University of Zurich; University of Potsdam
Ómar I. Jóhannesson
Affiliation:
University of Iceland
Anurag Khare
Affiliation:
Indian Institute of Technology Kanpur
Nik Kharlamov
Affiliation:
Aalborg University
Hanne B. S. Knudsen
Affiliation:
Aalborg University
Árni Kristjánsson
Affiliation:
University of Iceland
Charlotte E. Lee
Affiliation:
University of Southampton
Jun Ren Lee
Affiliation:
National Taiwan Normal University
Marina P. T. Leite
Affiliation:
Universidade Federal de Minas Gerais
Simona Mancini
Affiliation:
Basque Center on Cognition, Brain and Language; Ikerbasque, Basque Foundation for Science
Nataša Mihajlović
Affiliation:
University of Novi Sad
Ksenija Mišić
Affiliation:
University of Belgrade
Miloslava Orekhova
Affiliation:
National Research University Higher School of Economics Moscow
Olga Parshina
Affiliation:
National Research University Higher School of Economics Moscow; Middlebury College
Milica Popović Stijačić
Affiliation:
University of Novi Sad; Singidunum University
Athanassios Protopapas
Affiliation:
University of Oslo
David R. Reich
Affiliation:
University of Potsdam
Anurag Rimzhim
Affiliation:
College of the Holy Cross
Rui Rothe-Neves
Affiliation:
Universidade Federal de Minas Gerais
Thais M. M. Sá
Affiliation:
Universidade Federal de Lavras
Andrea Santana Covarrubias
Affiliation:
Pontificia Universidad Católica de Valparaíso
Irina Sekerina
Affiliation:
College of Staten Island of the City University of New York
Heida M. Sigurdardottir
Affiliation:
University of Iceland
Anna Smirnova
Affiliation:
National Research University Higher School of Economics Moscow; University of Groningen
Priyanka Srivastava
Affiliation:
International Institute of Information Technology Hyderabad
Elisangela N. Teixeira
Affiliation:
Universidade Federal do Ceará
Ivana Ugrinic
Affiliation:
University of Oslo
Kerem Alp Usal
Affiliation:
Middle East Technical University
Karolina Vakulya
Affiliation:
University of Plymouth
João M. M. Vieira
Affiliation:
University of Southampton
Ark Verma
Affiliation:
Indian Institute of Technology Kanpur
Denise H. Wu
Affiliation:
National Central University
Jin Xue
Affiliation:
Beijing Institute of Technology; University of Science and Technology Beijing
Sunčica Zdravković
Affiliation:
University of Belgrade; University of Novi Sad
Junjing Zhuo
Affiliation:
University of Science and Technology Beijing; Northeast Normal University
Laoura Ziaka
Affiliation:
University of Oslo; Oslo University Hospital
Noam Siegelman
Affiliation:
Hebrew University of Jerusalem
*Corresponding author: Victor Kuperman; Email: vickup@mcmaster.ca

Abstract

This paper reports an expansion of the English as a second language (L2) component of the Multilingual Eye-Movement Corpus (MECO L2), an international database of eye movements during text reading. While the previous Wave 1 of the MECO project (Kuperman et al., 2023) contained English as an L2 reading data from readers with 12 different first language (L1) backgrounds, the newly collected dataset adds eye-tracking data on English text reading from 13 distinct L1 backgrounds (N = 660) as well as participants’ scores on component skills of English proficiency and information about their demographics and language background and use. The paper reports reliability estimates, descriptive statistics, and correlational analyses as means to validate the expansion dataset. Consistent with prior literature and the MECO Wave 1, trends in the MECO Wave 2 data include a weak correlation between reading comprehension and oculomotor measures of reading fluency and a greater L1-L2 contrast in reading fluency than in reading comprehension. Jointly with Wave 1, the MECO project includes English reading data from more than 1,200 readers representing a diversity of native writing systems (logographic, abjad, abugida, and alphabetic) and 19 distinct L1 backgrounds. We provide multiple pointers to new avenues by which L2 reading researchers can mine this rich, publicly available dataset.

Type
Data Report
Creative Commons
This is an Open Access article, distributed under the terms of the Creative Commons Attribution-NonCommercial-NoDerivatives licence (http://creativecommons.org/licenses/by-nc-nd/4.0), which permits non-commercial re-use, distribution, and reproduction in any medium, provided that no alterations are made and the original article is properly cited. The written permission of Cambridge University Press must be obtained prior to any commercial use and/or adaptation of the article.
Open Practices
Open data
Copyright
© The Author(s), 2025. Published by Cambridge University Press

While highly prolific, research into bilingualism and second language (L2) reading covers a relatively small subset of first (L1) and additional languages (for estimates, see, e.g., Melby-Lervåg & Lervåg, 2014; Siegelman et al., 2023). Arguably, the need for a broader coverage is felt particularly in the research stream that uses eye-tracking to study L2 reading behavior. Because of the relatively high cost of eye-tracking equipment, this type of experimentation is largely concentrated in high-income countries with developed scientific infrastructure (e.g., Godfroid, 2020). Thus, existing eye-tracking studies in this field are biased toward L2s that are official languages of so-called WEIRD (i.e., Western, Educated, Industrialized, Rich, and Democratic) societies and the L1s that are well represented among international university students or immigrants in WEIRD countries. More broadly, L2 reading research is in constant need of methodologically comparable, high-quality, empirical data (see discussions in De Bruin, 2019; Gullifer & Titone, 2020; Luk & Bialystok, 2013, among many others), and again this is arguably particularly true in research into eye movements in L2 given the still limited coverage of this line of research.

One recent approach to addressing these needs has emerged in the form of mega-studies that coordinate data collection across multiple labs worldwide, using comparable texts, reader populations, and procedures (see Brysbaert & Drieghe, 2024, for discussion). One such study (Kuperman et al., 2023) presents eye-tracking data on English text reading produced by N = 543 L1 speakers of 12 languages (for other examples, see Berzak et al., 2022; Cop, Dirix, Drieghe & Duyck, 2017; Siegelman et al., 2023; Sui, Dirix, Woumans & Duyck, 2023), along with several tests of component skills of English proficiency and rich demographic and language-background data. This study is one component of the MECO project, labeled MECO L2. Within the MECO project, the same participants (at a given data collection wave) produced eye-tracking data on text reading in their L1 (MECO L1; Siegelman et al., 2022) and in English, which enables within-participant comparisons of oculomotor behavior in one’s L1 and L2.

While the published Wave 1 of the MECO project in Kuperman et al. (2023) has already provided a solid expansion of the empirical base for studies of L1 and L2 reading, the current paper makes a further contribution to existing research needs, reporting new eye-tracking and skill test data on L2 reading that constitute Wave 2 of the MECO L2 project. The first major goal of the current paper is to expand the coverage of the MECO L2 project in terms of the language backgrounds represented in the database. Thus, here we report data from 16 samples, representing 13 distinct L1 backgrounds, contributing eye-tracking data on L2 English text reading, along with measures of English component skills and language and demographic background. Most samples in the current wave are from sites where participants’ L1 background is new to the MECO project: Specifically, we add a total of nine new samples of participants with seven L1 backgrounds not previously covered in MECO Wave 1, i.e., Basque, Brazilian Portuguese, Danish, Hindi, Icelandic, Mandarin (both simplified and traditional script), and Serbian. Each of these sites aimed to include a minimum of N = 45 usable participants, and, in most cases, samples met this threshold (see below). As a result, taken jointly, Waves 1 and 2 of MECO L2 bring the total of different L1 backgrounds in the English reading data from 12 (reported in Kuperman et al., 2023) to 19.
Given that all measures and procedures reported here are fully comparable with those in the previous Wave 1 of the project (Kuperman et al., 2023), the full database now presents researchers with an unprecedented opportunity to examine the determinants of English L2 proficiency and fluency across a very wide range of participant language backgrounds. The scope of the database enables tackling many novel theoretical questions, including, for example, questions about the links between eye-movement behavior and component skills of English reading and about the language distance between the L1 background of the reader and English as L2. Another benefit of the Wave 2 expansion is the addition of native readers of writing systems very different from the alphabetic system of English, i.e., Chinese and Hindi (in addition to the Korean Hangul and Hebrew abjad represented in Wave 1). This increased diversity of writing systems enables users of the MECO database to systematically study the effects of the writing system on English reading proficiency as well (e.g., Bialystok, McBride-Chang & Luk, 2005; Geva & Siegel, 2000). Clearly, the questions mentioned here are simply examples of the types of investigations made possible with the full MECO L2 data. The major goal of this paper is to present and validate this rich dataset and make it publicly available for researchers for secondary use in line with open science practices and mega-studies of reading and language.

A second crucial goal of Wave 2 of the MECO L2 project is to increase the sample size of the MECO L2 database to improve the statistical power of studies using the open dataset. As discussed in detail in Kuperman et al. (2023), the MECO database is structured to enable both bird’s-eye view types of analyses of similarities and differences in reading behavior across many language backgrounds as well as targeted analyses of data from specific sites, theoretically interesting L1 pairs/groups, or specific L1 families (see also Siegelman et al., 2022, for a related discussion in the context of the MECO L1 component). With the new addition of the current Wave 2 data to the MECO L2 component, researchers will now have access to an unprecedented number of N = 1,203 participants reading in English across the project’s two waves (i.e., adding N = 660 new participants to the N = 543 in Kuperman et al., 2023). This addition, combined with the improved cross-linguistic coverage discussed above, may also substantially improve analyses targeting specific L1 groups or typological families. In this context, note that the current Wave 2 data include additional data for two samples included in the Wave 1 project, the Turkish and Norwegian samples, where data collection of Wave 1 was interrupted by the closures related to the COVID-19 pandemic. Below we refer to these two samples as “appended samples.” The addition of new participants to these two samples was meant to bring their sample size in line with the target number of participants per MECO site (N = 45 or more, see below).

Finally, another goal of the current Wave 2 data is to make available several “replication samples,” i.e., data records that represent L1 backgrounds already found in Wave 1 but collected at different universities or countries. Thus, Waves 1 and 2 of MECO L2 jointly include three samples of participants with German as L1 (two from Germany and one from Switzerland), two with English (from Canada and the UK), two with Hindi (both from India), two with Russian (both from Russia), and two with Spanish as L1 (from Argentina and Chile). The replication samples enable methodologically important comparative analyses of multiple samples from the same language background. Such analyses make it possible to disentangle the effect of the language background and the effect of the specific sample, university, or country. Also, they can be used to determine whether readers with the same L1 background are more similar to one another in their English L2 reading proficiency than speakers with different L1 backgrounds.

With these goals in mind, in the current paper we present the MECO L2 Wave 2 data. We start by providing full information about the included participants, eye-tracking methodology and procedure, tests of component skills, and questionnaire data collected. We then follow with analyses of the reliability of the collected data as well as descriptive information regarding the distribution of basic eye-movement measures and measures of component skills across sites. These are meant to establish the collected database as a useful tool that can form the basis for secondary analyses in future research. We end with a few pointers to the future directions that L2 reading research can take with the help of the newly expanded MECO database.

Method

Participants

The present data on reading in English, labeled Wave 2 of the MECO L2 database, stem from 16 university-based eye-tracking laboratories in Asia, Europe, and South America. English was the first and dominant language for only one of the partner sites (UK), while the first and dominant language in other samples was the official language of university instruction (typically, the official language of the country). [Footnote 1] All participants were university students or (rarely) staff members. With the present emphasis on the typical English as L2 readers, we applied a screening procedure that also took place in Wave 1 of the project (Kuperman et al., 2023). Specifically, we excluded participants with uncharacteristically high English fluency in all but the UK-based sample, i.e., self-reported simultaneously bilingual participants (with English as one of the languages), majors in English language or literature, and individuals who have lived for more than 6 months in an English-speaking country. The ethics clearance was obtained by each participating site from the ethics research board of the corresponding institution or country. Complete details of participant recruitment, materials, procedure, and apparatus of the present study are highly compatible with those used during Wave 1 of MECO data collection (see Kuperman et al., 2023): Our description below draws relevant details from Kuperman et al.’s Methods section.

Table 1 lists the country and institution where the data were collected, sample size, and details regarding the participants’ compensation as well as the L1, age, and years of education of participants. Complete demographic information can be found in the project’s data repository (see the Data availability section). In total, the current Wave 2 of MECO includes 660 new participants with valid eye-tracking data.

Table 1. Information regarding participants in available samples

Note: The L2 data for the UK sample represented L1 reading of the same 12 English texts that all other participant samples read. Samples marked with * are appended samples from the same institutions collected during Wave 1. Note that some sites paid participants for the full experimental session, also including the L1 component of the study (i.e., per session), while other sites paid participants on an hourly basis (i.e., per hour).

Materials

The English passage reading eye-tracking task consisted of 12 texts in English, compiled from the training materials for the ACCUPLACER Reading test and the English as Second Language Reading Skills Test, i.e., the placement tests often taken by students in North American colleges. Each text, written in expository prose and dedicated to a historical person or natural phenomenon, came with two 4-alternative-forced-choice factual and inferential comprehension questions. Text lengths varied from 98 to 185 words (4–11 sentences). Texts and questions were presented to participants in a fixed order. Kuperman et al. (2023) report characteristics of the texts, including their length and readability. The Flesch-Kincaid grade level of readability showed that the texts were in the range expected of high school– and college-level reading (M = 10.56, SD = 2.68) and close to the range observed among advanced L2 learners of English in Crossley, Allen & McNamara (2011). The Coh-Metrix L2 readability score (M = 16.17, SD = 5.56) for MECO L2 texts approximated the mean values that Crossley et al. (2011) associated with readings for intermediate learners. These readability estimates thus suggest that the texts used are appropriate for our intermediate-to-advanced sample of English L2 readers. For further details, we refer readers to Kuperman et al. (2023).
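As an illustration of the readability index mentioned above, the sketch below computes the Flesch-Kincaid grade level from word, sentence, and syllable counts using the standard published formula. The syllable counter is a crude heuristic introduced here only for demonstration, and the function names are hypothetical; the values reported in the paper were not necessarily obtained this way.

```python
import re


def count_syllables(word: str) -> int:
    """Rough heuristic (an assumption): count vowel groups, drop a silent final 'e'."""
    word = word.lower()
    n = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith(("le", "ee")) and n > 1:
        n -= 1
    return max(n, 1)


def flesch_kincaid_grade(n_words: int, n_sentences: int, n_syllables: int) -> float:
    """Standard Flesch-Kincaid grade level formula."""
    return 0.39 * (n_words / n_sentences) + 11.8 * (n_syllables / n_words) - 15.59


def grade_for_text(text: str) -> float:
    """Tokenize naively and apply the formula; tokenization rules are assumptions."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return flesch_kincaid_grade(len(words), len(sentences), syllables)
```

Dedicated readability tools use dictionary-based syllable counts, so estimates from this heuristic can deviate from published values for the same text.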

Additional questionnaires and tests

Participants in all samples completed the same series of tests and questionnaires. This series included a battery assessing component skills in English (see below) and a nonverbal intelligence test (the Culture Fair Test-3 [CFT 20], subset 3 matrices, short version, form A, timed at 3 min; Weiß, 2006). Further, an abridged version of the Language Experience and Proficiency Questionnaire (LEAP-Q; Marian, Blumenfeld & Kaushanskaya, 2007) collected basic demographic and linguistic information about the participants’ use of and proficiency in L1 and English: e.g., the participants’ age and years of education, age when learning English began, and self-ratings of their proficiency in their L1 and in English. Note that, for simplicity, we designate English as L2 for all samples (except the UK), even though English may be a third language or an additional language for some samples or some participants. The full information collected through the questionnaire, including the age of acquisition and proficiency in each language, is available through the project’s repository (see below).

English reading comprehension and fluency are demonstrably contingent on the reader’s mastery of component skills of English language and reading proficiency (see reviews by Gillon, 2017; Jeon & Yamashita, 2014; Koda, 2005; Schmitt, 2008; Vandergrift, 2007, among others). The MECO L2 project taps into some of those component skills through administering an additional battery of six tests of individual differences. Test (1) was the Spelling Recognition test (adapted from Andrews & Hersch, 2010): In this test, items are presented in a list, and participants need to decide for each whether or not it is a correctly spelled word in English (i.e., mark each item as “correct” or “incorrect”). Half of the items are correctly spelled and the other half include spelling errors (e.g., seperate, benafit). Test (2) was a Vocabulary Knowledge test based on word recognition with multiple-choice questions (adapted from Nation & Beglar, 2007). For this test of the receptive knowledge of English, words are selected from a frequency-ranked list of 14,000 English lemmas, and 10 items are chosen from each 1,000 words in the ranked list to represent the respective frequency band. The test consists of a series of questions where a target word is embedded in a short nondefining context, and participants need to choose its correct definition from four options. Test (3) consisted of the assessment of motivation to excel in the task (using the Student Opinion Scale [SOS] questionnaire; Thelk, Sundre, Horst & Finney, 2009). The SOS includes 10 statements that participants are asked to rate from “1 = Strongly Disagree” to “5 = Strongly Agree” according to how they feel about each of them in relation to completing the current study.
Test (4) was the Lexical Test for Advanced Learners of English (LexTALE) with yes/no decisions (Lemhöfer & Broersma, 2012). It is an untimed lexical decision task, consisting of 60 trials: 40 words and 20 pseudowords. Tests (5) and (6) came from the Test of Word Reading Efficiency – Second Edition (TOWRE-2; Torgesen, Wagner & Rashotte, 2012), with one subtest for word naming (Sight Word Efficiency) and one subtest for pseudoword naming (Phonemic Decoding Efficiency). In each subtest, participants are required to read aloud as many items as possible from a list of words/pseudowords within a 45-s time limit. Altogether, these tests tap into the reader’s ability to associate sounds and letters of the written word (decoding); ability for word identification; spelling ability as a measure of orthographic learning and knowledge; vocabulary knowledge as a central ability for word recognition and comprehension; and an extra-linguistic motivational component. This battery of tests was again identical to that collected in the first wave of the project: We thus do not repeat the full details regarding the scoring and administration of tests as these are fully available in Kuperman et al. (2023) (supplementary material S2). The tasks in the battery were administered in the fixed order (1)–(6), after the completion of the main eye-tracking passage reading task. Tasks (1)–(3) were administered using an in-house web-based platform; task (4) was administered through the LexTALE website (http://www.lextale.com/); and tasks (5) and (6) were administered in the standard pencil-and-paper version.
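To make the structure of the LexTALE score concrete, the sketch below implements the averaged percent-correct score commonly used with this test, which weights the 40 words and 20 pseudowords equally. This is an illustrative assumption: the scoring actually applied in MECO is documented in the supplementary material of Kuperman et al. (2023) and may differ, and the function name is hypothetical.

```python
def lextale_score(words_correct: int, nonwords_correct: int,
                  n_words: int = 40, n_nonwords: int = 20) -> float:
    """Mean of percent-correct on words and on pseudowords (0-100 scale)."""
    if not 0 <= words_correct <= n_words:
        raise ValueError("words_correct out of range")
    if not 0 <= nonwords_correct <= n_nonwords:
        raise ValueError("nonwords_correct out of range")
    pct_words = 100 * words_correct / n_words
    pct_nonwords = 100 * nonwords_correct / n_nonwords
    # Averaging the two percentages corrects for the unequal item counts,
    # so answering "yes" to every item yields 50, not 67.
    return (pct_words + pct_nonwords) / 2
```

Averaging the two percentages, rather than pooling all 60 trials, prevents a "yes" bias from inflating the score.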

For different reasons (including administration errors, connectivity issues, and copyright restrictions, due to which a few sites opted out of the CFT 20 test or TOWRE), some participants in the MECO L2 sample do not have complete data in all verbal and background skill tests. We report details regarding the number of missing values in each test in supplementary material S1.

Procedure

Participants were tested individually. At the beginning of the experimental session, participants signed a consent form and completed the LEAP-Q. Then, participants completed an L1 reading task where they read 12 texts in their L1 silently for comprehension while their eye movements were recorded, followed by four yes/no comprehension questions after each text. Then, participants proceeded to a skill-test battery in L1, which included the CFT 20 and other tests of individual differences in L1. With the exception of the CFT 20 task, data collected during these stages of the experiment are reported elsewhere. The current paper reports data collected when participants proceeded to the English component of the project, i.e., the task of silently reading 12 texts in English for comprehension, while their eye movements were recorded. The reading task was followed by the battery of English individual differences tasks described above. The duration of the L2 eye-tracking reading task was roughly 20–30 min, and the individual differences battery took up to 30 min. [Footnote 2] The entire session lasted no more than 2 hr, and breaks were provided as requested. [Footnote 3] All data were collected by research assistants trained in eye-tracking data collection according to the protocols of their labs.

Apparatus

As outlined in Kuperman et al.’s (2023) Methods section, to record eye movements during reading, all participating laboratories used an EyeLink eye-tracker (SR Research, Kanata, ON, Canada). Labs had one of the EyeLink Portable Duo, EyeLink II, EyeLink 1000 or EyeLink 1000 Plus models. A sampling rate of 1,000 Hz was used at all sites but Serbia, where the EyeLink II was used with a sampling rate of 500 Hz. All sites used the same experimental procedure programmed in the Experiment Builder software (SR Research). A chin rest was used to minimize head movements. Calibration was performed using a series of nine fixed targets distributed around the display, followed by a 9-point accuracy test to validate eye position. Stimuli were viewed binocularly but eye-movement data were analyzed from only the self-reported dominant eye (the right eye in most participants). Before presenting the trial stimuli in the English eye-tracking reading task, a dot appeared on the monitor screen, slightly to the left of the first word in the passage. Once the participant had fixated on it, the trial began. This drift check took place at the beginning of each trial, and calibration was monitored by the experimenter throughout the task and was redone if necessary. Each of the 12 texts appeared on a separate screen. Participants were instructed to read the passages silently for comprehension and press the space bar when their reading of a passage was completed. A mono-spaced font (Consolas) was used, with a size generally ranging from 20 to 22 points (given variation in screen size and resolution at different testing sites) and 1.5 line spacing. In accordance with their local experimental setup, the German site in Zurich used a smaller font size of 10 with a lower resolution of 1280 × 1024. The refresh rate was set to 60 Hz at all sites.
For further specifications of the screen, font size, presentation settings, and apparatus at each participating site, see supplementary material S2. Each text was followed by two multiple-choice comprehension questions, shown on a separate screen one after another. Participants responded by choosing their answers using the number keys 1–4.

Data editing and cleaning

The popEye software was used to pre-process the eye-tracking data (implemented in R, version 0.8.1; Schroeder, 2019). During this process, fixations are automatically corrected on the vertical axis and assigned to lines. In the current Wave 2 of MECO, the “slice” algorithm was used, because it was shown to provide a substantial boost in assignment accuracy compared to the baseline algorithm used for Wave 1 (Glandorf & Schroeder, 2021). However, in the two appended samples that added participants to a Wave 1 sample (i.e., in Turkey and Norway) the baseline algorithm was used to maintain consistency within a site. Following this automatic procedure, members of the research team visually inspected the output of the software and assessed the quality of the resulting data. The assessment consisted of detecting texts in which fixations and text lines were misaligned (due to poor calibration). Such texts were removed from the data pool, as were participants with fewer than 5 (out of 12) usable texts with a high-quality eye-movement record. One more sample of English L2 reading collected in Beijing in simplified Chinese (label ch_s) used full justification rather than left justification of the texts. As a result, English letters in these texts were not monospaced (i.e., amounted to a different number of pixels). Since the popEye algorithm is not applicable to such cases, the Data Viewer, version 4.4.1 (SR Research Ltd), functionality was used to create interest area, fixation, and saccade reports for the ch_s sample. These reports were combined with the respective popEye outputs for other data samples. Vertical alignment of fixations with lines of text was determined through manual inspection and adjusted as needed. All other cleaning and trimming procedures were identical for ch_s and other samples. Table 1 reports the percent of remaining texts and word tokens (interest areas) after this data cleaning.
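The participant-level exclusion rule described above (at least 5 of 12 usable texts) can be sketched as a simple filter. The input layout, a list of (participant, text, usable) records produced by the visual inspection, and all names are hypothetical and do not reflect the structure of the popEye output.

```python
from collections import defaultdict

MIN_USABLE_TEXTS = 5  # threshold stated in the text: fewer than 5 of 12 leads to exclusion


def participants_to_keep(usable_flags):
    """usable_flags: iterable of (participant_id, text_id, is_usable) tuples."""
    usable_count = defaultdict(int)
    for participant_id, _text_id, is_usable in usable_flags:
        # Touch the key so participants with zero usable texts are still counted.
        usable_count[participant_id] += int(bool(is_usable))
    return {p for p, n in usable_count.items() if n >= MIN_USABLE_TEXTS}
```

Unusable texts of retained participants would still need to be dropped separately before computing any eye-movement measures.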

For the purposes of reliability and descriptive analyses below, further data cleaning involved removing data points that showed very short (<80 ms) first fixations, which are unlikely to provide sufficient time to complete visual uptake (see Warren, White & Reichle, 2009), or very long total fixation times (top 1% of the participant-specific distribution, all exceeding 3 s on the word). A total of 15,400 data points (2.0% of total) were removed, between 1.3% and 2.3% per language. Off-screen looks were incorporated in the passage-level variables (e.g., reading rate) but not in the word-level eye-tracking variables (see details on variables used, below).
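A minimal sketch of this trimming step follows, assuming one record per fixated word token with hypothetical keys `subj`, `firstfix_dur`, and `dur` (both durations in ms). How the top 1% is delimited (here, dropping everything above the value ranked just below the top floor(n/100) observations) is an assumption; the paper does not specify the percentile method.

```python
from collections import defaultdict

MIN_FIRST_FIX_MS = 80  # first fixations shorter than this are removed


def trim(rows):
    """Drop tokens with very short first fixations or extreme total fixation times."""
    durations = defaultdict(list)
    for r in rows:
        durations[r["subj"]].append(r["dur"])

    cutoff = {}
    for subj, durs in durations.items():
        durs.sort()
        n_drop = len(durs) // 100               # size of the top 1% (rounded down)
        cutoff[subj] = durs[len(durs) - n_drop - 1]  # largest value that is kept

    return [
        r for r in rows
        if r["firstfix_dur"] >= MIN_FIRST_FIX_MS and r["dur"] <= cutoff[r["subj"]]
    ]
```

Because the cutoff is computed per participant, slow readers are not penalized relative to fast ones, which matches the participant-specific trimming described above.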

Reading variables

A number of eye-movement variables are considered as measures of reading fluency (both in L1 and L2). In our description below, we closely follow Kuperman et al.’s (2023) variable definitions. The word-level variables include skipping (a binary index of whether the word was fixated upon at least once during the entire text reading, labeled as skip; see Footnote 4). For words that were fixated at least once, the following variables were defined: first fixation duration (the duration of the first fixation landing on the word, firstfix.dur); gaze duration (the summed duration of fixations on the word in the first pass, i.e., before the gaze leaves it for the first time, firstrun.dur); total fixation duration (the summed duration of all fixations on the word, dur); number of fixations on the word (nfix); refixation (a binary index of whether a word elicited more than one fixation in the first pass, refix); regression-in (a binary index of whether the gaze returned to the word after inspecting further textual material, i.e., to the right of the word in left-to-right orthographies, reg.in); and rereading (a binary index of whether the word elicited fixations after the first pass, i.e., after the gaze left the word for the first time, reread; see Footnote 5). See Inhoff and Radach (1998), Rayner (1998), and Godfroid (2020) for detailed discussion of these variables. At the participant level, the following measures of fluency were defined: reading rate (in words per minute, rate), and mean word-level variables (e.g., participant’s mean skipping rate, mean first fixation duration, etc.). Sentence and passage reading times as well as the number of fixations, skips, and regressions per sentence and passage can be found in the sentence- and passage-level reports, respectively, in the project’s data repository.
Finally, we gauged comprehension accuracy as the percent of correct responses to all 24 questions (acc). Computed variables are identical and backward compatible with variables in the first wave of the project, enabling future analyses on the aggregate data across Wave 1 and Wave 2 sites.
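To make the word-level definitions concrete, the sketch below derives them from a temporally ordered list of fixations. It is an illustration under stated assumptions, not the code used to produce the MECO reports: the input format (pairs of word index and duration) and the function name are hypothetical, dictionary keys use underscores in place of the dots in the variable labels, and regression-in is simplified to "a fixation on the word that directly follows a fixation on a word further to the right".

```python
def word_measures(fixations, n_words):
    """Compute word-level measures for one passage read by one participant.

    fixations: list of (word_index, duration_ms) in temporal order,
               with 0-based word indices (an assumed input format).
    """
    out = [
        {"skip": 1, "firstfix_dur": None, "firstrun_dur": None, "dur": 0,
         "nfix": 0, "refix": None, "reread": None, "reg_in": None}
        for _ in range(n_words)
    ]
    run_closed = [False] * n_words   # has the gaze already left the word once?
    prev = None
    for word, ms in fixations:
        m = out[word]
        # Moving to a different word ends the first pass on the previous word.
        if prev is not None and prev != word:
            run_closed[prev] = True
        if m["nfix"] == 0:           # first landing on this word
            m.update(skip=0, firstfix_dur=ms, firstrun_dur=0,
                     refix=0, reread=0, reg_in=0)
        if not run_closed[word]:
            m["firstrun_dur"] += ms  # still within the first pass
            if m["nfix"] >= 1:
                m["refix"] = 1
        else:
            m["reread"] = 1          # fixation after the first pass
        if prev is not None and prev > word:
            m["reg_in"] = 1          # gaze arrived from further right
        m["dur"] += ms
        m["nfix"] += 1
        prev = word
    return out
```

With measures in this form, the rereading time mentioned in Footnote 5 is simply `dur - firstrun_dur` for any fixated word.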

Tests of individual differences provide the following set of dependent variables: scores from the CFT test of nonverbal intelligence (cft) as well as scores on tests of spelling (spelling), vocabulary knowledge (vocabulary; see Footnote 6), motivation (motivation), LexTALE, sight word efficiency (towre: swe), and phonemic decoding efficiency (towre: pde; see details regarding the scoring of individual differences tests in Kuperman et al., 2023; supplementary material S2).

Results

Below, we first report reliability analyses of eye-tracking data and individual differences tests in the current Wave 2 and compare those estimates against reliability previously observed in Wave 1 of the MECO L2 project. We then present the descriptive statistics of the Wave 2 data and correlational analyses that pit eye movement measures against themselves and against the skill test scores. In all sets of analyses, we follow the analytical procedure of Kuperman et al. (2023) for comparability.

Reliability estimates

Eye-tracking data

The split-half reliability at the participant level for an eye-tracking measure reveals how stable that measure is given individual differences between participants. This reliability metric is the correlation between mean values for “odd” and “even” words within a participant. Specifically, we would calculate mean values for, say, gaze duration for words (i.e., interest areas) 1, 3, 5, etc. and words 2, 4, 6, etc. for each participant. Reliability can then be estimated as the correlation between the mean values for “odd” and “even” words across all participants in the sample. The participant-level reliability for reading rate was estimated using an intra-class correlation coefficient, measuring the degree of agreement in reading rate estimates across the 12 texts. In addition, a word-level reliability estimation was done at the word token level. This reliability is of interest for studies of the effect that word properties have on eye movements. For each word token in the MECO texts, mean values were calculated for each eye movement measure for “odd” and “even” participants separately. The resulting two sets of values were correlated across all word tokens to form a reliability estimate.
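The participant-level procedure described above can be sketched as follows, together with the standard Spearman-Brown correction, 2r / (1 + r), that is applied to split-half estimates. This is an illustrative sketch only: the input format (per participant, a list of word-number and value pairs covering fixated words) and the function names are assumptions, not the project's analysis code.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long lists of numbers."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def split_half_reliability(values_by_participant):
    """Return (raw r, Spearman-Brown corrected r) at the participant level.

    values_by_participant: dict mapping participant -> list of
        (word_number, value) pairs, word_number being the 1-based
        position of the interest area (an assumed layout).
    """
    odd_means, even_means = [], []
    for pairs in values_by_participant.values():
        odd = [v for w, v in pairs if w % 2 == 1]    # words 1, 3, 5, ...
        even = [v for w, v in pairs if w % 2 == 0]   # words 2, 4, 6, ...
        odd_means.append(sum(odd) / len(odd))
        even_means.append(sum(even) / len(even))
    r = pearson(odd_means, even_means)
    return r, 2 * r / (1 + r)
```

The word-token-level estimate follows the same logic with the roles swapped: for each word token, values are averaged over "odd" and "even" participants, and the two sets of means are correlated across tokens.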

Supplementary materials S3 and S4 provide a full report of the two types of reliability estimates (i.e., participant level and word token level). Similar to reliability analyses in Wave 1, all eye-movement measures in Wave 2 demonstrate extremely high reliability at the participant level (all Spearman-Brown corrected reliability estimates > .90). In line with Staub (2021), this finding indicates that the eye-movement measures faithfully reflect individual differences in English proficiency. As expected and in line with Kuperman et al. (2023), reliability at the word token level was considerably lower. Still, the average Spearman-Brown corrected reliability estimates, aggregated across sites and measures, were in the moderate range (mean r = .69, median = .72) as were the reliability estimates for most measures and samples. Again, reliability levels found in the Wave 2 data were highly comparable to those in the MECO Wave 1 (e.g., Spearman-Brown corrected reliability estimates r > .94 at the participant level and r > .6 at the item level in Kuperman et al., 2023) as well as the Ghent Eye-Tracking Corpus (GECO) database (between .6 and .9 in Cop et al., 2017).

Tests of component skills and comprehension accuracy

In addition to eye-movement measures, we calculated the reliability for scores in the online battery of English skill tests (spelling, vocabulary, and motivation; see Footnote 7) as well as for comprehension accuracy in the passage reading task. For comprehension, spelling, and motivation, we calculated both split-half reliability and Cronbach α. For the vocabulary knowledge task, we only calculated split-half because of the adaptive nature of this task, which means that different participants have data from different trials (see design details in Kuperman et al., 2023). Reliability estimates were calculated on the aggregated dataset (not broken down by site), as procedural differences across sites are not expected to have an impact on the data quality in these tests. The estimates are provided in supplementary material S5. Unsurprisingly, these estimates were highly similar to the ones reported in Kuperman et al. (2023); this is expected given the highly similar nature of participants in the two waves of the project. Specifically, reliability estimates for the four tests (spelling, motivation, vocabulary, and comprehension) were reasonable, with split-half estimates ranging from .64 to .75 and Cronbach α values of .61 to .73. In sum, MECO L2 data on reading fluency and comprehension as well as the test scores in component skills of English reading show acceptable to high levels of reliability, making the data eligible for a meaningful inferential analysis.
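For reference, Cronbach α can be computed from a complete participants-by-items score matrix with the textbook formula α = k / (k − 1) × (1 − Σ item variances / variance of total scores). The sketch below is a generic implementation under the assumption that every participant has a score on every item; the function name and input layout are hypothetical. That completeness requirement is also why the formula does not apply directly to an adaptive task in which participants see different trials.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Cronbach's alpha for complete data.

    scores: list of rows, one per participant; each row holds that
            participant's score on each of the k items (assumed layout).
    """
    k = len(scores[0])
    # Sample variance of each item across participants.
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    # Sample variance of participants' total scores.
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)
```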

Descriptive and correlation analyses

Figure 1 visualizes means and standard deviations of eye movement measures and comprehension accuracy by language sample. These estimates were obtained by first calculating the means for these variables by participants and then aggregating those by-participant means. Detailed data summaries, organized by variable and sample, are provided in the project’s repository. Figure 2 further shows the means and standard deviations of scores in the available measures of individual differences (including tests of component skills and nonverbal intelligence).
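The two-step aggregation described above (by-participant means first, then a summary of those means per sample) can be sketched as follows. The row layout with the keys `sample` and `subj` and the function name are assumptions for illustration; each sample needs at least two participants for the standard deviation to be defined.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def sample_summary(rows, measure):
    """Summarize one measure per sample from by-participant means.

    rows: list of dicts with hypothetical keys 'sample', 'subj', and
          the measure of interest (None where the measure is undefined).
    """
    # Step 1: collect word-level values per participant.
    per_subj = defaultdict(list)
    for r in rows:
        if r[measure] is not None:
            per_subj[(r["sample"], r["subj"])].append(r[measure])

    # Step 2: aggregate the by-participant means within each sample.
    per_sample = defaultdict(list)
    for (sample, _subj), vals in per_subj.items():
        per_sample[sample].append(mean(vals))

    return {
        s: {"mean": mean(m), "sd": stdev(m),
            "se": stdev(m) / sqrt(len(m)), "n": len(m)}
        for s, m in per_sample.items()
    }
```

Averaging within participants first gives each reader equal weight, so a participant who contributes more data points does not dominate the sample mean.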

Figure 1. Means of measures from the eye-tracking task across samples. Error bars stand for ± 1 SE. accuracy = percent comprehension answers correct; ba = Basque; bp = Brazilian Portuguese; ch_s = Chinese simplified; ch_t = Chinese traditional; da = Danish; en_uk = English (UK sample); ge_po = German (Potsdam sample); ge_zu = German (Zurich sample); hi_iiith = Hindi (Hyderabad sample); hi_iitk = Hindi (Kanpur sample); ic = Icelandic; n Fixations = number of fixations; no = Norwegian; refixation = likelihood of second fixation on the word; regressionIn = regression rate; rereading = likelihood of second pass; ru_mo = Russian (Moscow sample); skipping = skipping rate; se = Serbian; sp_ch = Spanish (Chile sample); tr = Turkish.

Figure 2. Means of measures of individual differences of English proficiency across samples. Error bars stand for ± 1 SE. ba = Basque; bp = Brazilian Portuguese; cft = score in the CFT test; ch_s = Chinese simplified; ch_t = Chinese traditional; da = Danish; en_uk = English (UK sample); ge_po = German (Potsdam sample); ge_zu = German (Zurich sample); hi_iiith = Hindi (Hyderabad sample); hi_iitk = Hindi (Kanpur sample); ic = Icelandic; no = Norwegian; ru_mo = Russian (Moscow sample); se = Serbian; sp_ch = Spanish (Chile sample); towre: pde = TOWRE, phonemic decoding efficiency subtest (pseudoword naming); towre: swe = TOWRE, sight word efficiency subtest (word naming); vocabulary = vocabulary knowledge (Groups 2-5); tr = Turkish.

A comprehensive analysis of these descriptive patterns and the cross-site differences and similarities that emerge from Figures 1 and 2 is beyond the scope of this paper. However, we do want to highlight a few important observations that again establish the quality of the MECO Wave 2 data. First, we note that there was substantial similarity across samples in terms of comprehension accuracy: Specifically, 11 out of the 16 samples showed comprehension accuracy in a similar range of 70% to 75% (a range far from ceiling performance). This picture is very much in line with MECO Wave 1 data (where 8 out of the previous 12 samples showed comprehension accuracy in a similar range). In contrast, and again in line with Kuperman et al. (2023), there was much more variability in oculomotor measures of fluency. This is true both within the different L2 samples and, most notably, in how estimates of oculomotor measures in the English L1 sample (i.e., in the UK) stand out among the sites where participants are L2 readers of English. Visual inspection of Figure 1 demonstrates that the English L1 readers (en_uk sample) had a faster reading rate, shorter and fewer fixations, a higher skipping rate, and a lower likelihood of refixations or rereading compared to most L2 samples. There were some L2 samples (e.g., two samples of German speakers) that sometimes approached the mean values of the English L1 sample in some eye movement measures. Yet no L2 sample showed as consistent a contrast with the majority of L2 samples as the sample of L1 English speakers. See also Siegelman et al. (2023) for evidence of the comprehension-fluency contrast in terms of L1-L2 differences and similarities.

Figure 2 further presents extensive variability in performance on tests of component skills across the sites, with the English L1 (en_uk) sample showing generally higher performance than other sites in these tests, with further expected variability among L2 sites. We further note that in both Figure 1 (i.e., eye-movement measures) and Figure 2 (i.e., component skill tests), it is hard to find, from a cursory look, a clear linguistic factor that maps directly onto the observed behavioral similarities and differences. Taken together, these observations replicate those from the Wave 1 data and open exciting avenues for systematic analyses of the determinants of oculomotor measures of L2 reading given various properties of participants’ L1 across the various language backgrounds and their component skills (see more in the General discussion section).

Lastly, we computed the correlations between eye-movement measures, accuracy, reading rate, and all individual differences tests on the aggregated dataset of participants from all Wave 2 samples (N = 660; Table 2). Correlational analyses like this speak to some of the central questions in L2 acquisition research, e.g., Does reading fluency correlate with reading comprehension and does individual variability in component skills of reading influence reading comprehension and fluency? They also help answer methodological questions about the inter-sample differences, potentially driven by variability in the nonverbal IQ and motivation to perform well in the task.

Table 2. Correlation table for reading measures (data aggregated across samples, N = 660)

Note: Accuracy = comprehension accuracy; cft = score in the CFT test; First fix duration = first fixation duration; n, fixations = number of fixations; Refixation = likelihood of second fixation on the word; Rereading = likelihood of second pass; towre: swe = TOWRE, Sight Word Efficiency subtest (word naming); towre: pde = TOWRE, phonemic decoding efficiency subtest (pseudoword naming); vocabulary = vocabulary knowledge (groups 2–5). Values above the diagonal show Pearson correlation coefficients; values below the diagonal show p values (p value shown as 0 stands for p < .001), and significant correlations (p < .05) appear in bold.

We again replicated four main correlational findings in Kuperman et al. (2023): (a) There were substantial correlations between the various eye-movement measures; (b) There were only weak correlations between comprehension accuracy and the oculomotor reading measures (|r| between .03 and .35); (c) Individuals with higher performance in the English component skill tests had more efficient eye-movement reading patterns (i.e., more skips, fewer and shorter fixations); and (d) CFT and motivation were only weakly correlated with eye-movement measures (|r| ≤ .15). These expected correlational patterns suggest, in line with Kuperman et al.’s (2023) data, that reading fluency (gauged by eye movements) and comprehension are only weakly related; proficiency in component skills of reading influences reading behavior; and the inter-sample variability in IQ and motivation did not strongly affect eye movement patterns. The alignment with Kuperman et al.’s report from MECO Wave 1 validates the current extension of the MECO L2 data.

General discussion

This paper reports an expansion of the English as L2 reading component of the MECO L2 (Kuperman et al., 2023). Wave 2 adds eye-tracking data on text reading in English, as well as scores from component skills of English proficiency, from 16 laboratories worldwide and 13 unique L1s. Tests of component skills of English proficiency include spelling, vocabulary knowledge, lexical decision, sight word efficiency, and phonemic decoding efficiency as well as nonverbal intelligence and motivation to excel in the task. Questionnaires offer additional insight into demographics of participants as well as their background and use of their L1 and English. Some of the samples in the present report increase the size of the samples collected in Wave 1 (“appended samples” in Turkish and Norwegian); some represent the languages already found in Wave 1 but recruit participants from a different country or university (“replication samples” in German, Russian, Spanish, and English); and the majority of the samples come from L1s new to the MECO project.

Jointly with Wave 1, the English-reading component of the MECO corpus currently encompasses readers of English with 19 distinct L1 language backgrounds, including English as L1 (Canada, UK) and, primarily, L2 readers of English. The language backgrounds of the readers of English in the MECO project incorporate a large typological and genetic variety of languages (Basque, Indo-European, Semitic, Sino-Tibetan, and Turkic language families) and writing systems (e.g., Chinese simplified logographic, Hebrew abjad, Hindi abugida, and several alphabets). Participants in the current MECO L2 component also contributed data in their L1, enabling within-participant comparisons: The L1 data are reported elsewhere. We note that MECO is an evolving and ongoing project, and its future releases (e.g., Wave 3) plan to further enrich this data resource with behavioral samples of L1 and L2 reading from readers of diverse languages and writing systems.

As is the case with any data resource, MECO has its limitations. Its samples are of relatively small size (around 50 participants), which limits cross-linguistic comparisons at the participant level. The battery of component skills of reading is lacking tests of several skills that are known to strongly contribute to L2 reading proficiency (e.g., L2 listening comprehension). Also, since availability of tests of individual differences varies drastically across languages, we do not administer a battery of tests for proficiency in L1, which is a factor of major influence on L2 reading proficiency.

Analyses in this paper demonstrate very high reliability of eye-movement data at the participant level and moderate to good reliability of eye-movement data at the word-token level, comprehension accuracy, and all tests of component skills. Thus, the data have adequate quality both for group-level comparisons and for the study of individual differences. Another validation of the quality of the Wave 2 data comes from the observed correlation patterns, which match those uncovered in Wave 1 of the MECO project (Kuperman et al., 2023). Among other findings, we found that the L1-L2 differences and the overall variability in English reading comprehension are minor relative to L1-L2 differences and variability in all measures related to reading fluency. While L1 English speakers demonstrate comprehension accuracy comparable to that in most L2 samples, they were much more fluent (shorter reading times, faster reading rate, etc.) than the L2 counterparts. This dissociation between reading comprehension and fluency, observed in Kuperman et al. (2023), is a fruitful topic for future research.

More broadly, the goal of the current paper is simply to establish the reliability and quality of the MECO Wave 2 data so that follow-up analyses can mine it in future studies into different facets of L2 reading. Multiple interesting avenues include (a) a comparison of English reading performance between speakers of the same language versus speakers of different languages (e.g., Does a specific L1 background have a footprint that makes, say, German readers of English more similar to one another than to English readers with other non-native backgrounds?); (b) the effect of the degree of similarity between the L1 background of the reader and English on a reader’s reading comprehension and fluency; and (c) the determinants of (various facets of) English proficiency and the relative contribution of skill tests (e.g., spelling, vocabulary knowledge), one’s L1 background and its similarity to English, and other participant-level characteristics. With the full MECO L2 data made freely available in the spirit of open science, we hope that these and many other questions are investigated by the community of researchers of L2 reading.

Supplementary material

The supplementary material for this article can be found at http://doi.org/10.1017/S0272263125000105.

Data availability statement

As with the previous release of MECO reported in Kuperman et al. (2023), the current Wave 2 release of MECO L2 includes full interest-area reports from usable participants and trials as well as passage- and sentence-level summaries. Also included are full data from individual differences tests in L2, the nonverbal IQ test, and the background questionnaire. Please refer to the project’s repository page at https://osf.io/q9h43/ for the full materials, the analysis code, and data. Note that the MECO L2 Wave 2 data can be easily aggregated with the Wave 1 data (i.e., data structures are similar), previously reported and made available in Kuperman et al. (2023). Data from both waves are available at the same Open Science Framework repository.

Acknowledgments

Research reported in this publication was supported by the following grants and organizations: Social Sciences and Humanities Research Council of Canada Partnered Research Training Grant, 895-2016-1008 (primary investigator [PI]: G. Libben); Social Sciences and Humanities Research Council of Canada Insight Grant, 435-2021-0657; Canada Research Chair (Tier 2; PI: V. Kuperman); German Federal Ministry of Education and Research, 01| S20043 (PI: L. A. Jäger); National Council for Scientific and Technological Development (Conselho Nacional de Desenvolvimento Científico e Tecnológico) of Brazil Project 316036/2021-8 (PI: R. Rothe-Neves); Fundação Cearense de Apoio ao Desenvolvimento Científico e Tecnológico; Obel Family Foundation Research Equipment Grant to Aalborg University, 2017 (PI: H. B. S. Knudsen); UK Research and Innovation Economic and Social Research Council South Coast Doctoral Training Partnership, ES/P000673/1; Project Fondecyt Regular by the National Research and Development Agency (ANID-CHILE), 1201440 (PI: R. Ibáñez Orellana); Project Fondecyt de Postdoctorado by the National Research and Development Agency (ANID-CHILE), 3210252 (PI: A. Santana Covarrubias); Chinese Language and Technology Center, National Taiwan Normal University, within the Higher Education Sprout Project framework by the Ministry of Education in Taiwan (PI: Y. T. Sung); Basic Research Program at the National Research University Higher School of Economics (HSE University); Ministry of Science, Technological Development and Innovation of the Republic of Serbia; Israel Science Foundation Grant, project 1034/23 (PI: N. Siegelman), and Azrieli Early Career Faculty Fellowship (PI: N. Siegelman).

We wish to thank the following individuals: Yaqian Bao, Itziar Basterra, Isidora Damjanović, Ainhoa Eguiguren, Amets Esnal, Brianna Griska-Macphee, Nora Hollenstein, Chia En Hsieh, Alexandra Jackson, Nadia Lana, Jolie Luk, Sriya Ravula, Evonne Syed, and Lucy Thomas.

Footnotes

* V. Kuperman, S. Schroeder, and N. Siegelman contributed equally to this work.

1 Although the first and dominant language was Hindi for the Indian samples, English was their official language of university-level instruction. Further, in India, many schools and higher educational institutes teach in English; therefore, most participants had already received education in English from primary school level onward. Also note that in the Basque Country, Spanish and Basque are both official languages of university instruction.

2 The UK sample completed these tests as part of their L1 individual-differences battery, so their testing session was shorter than in the other sites.

3 There was one exception to the described testing order. For logistic reasons, in Serbia, participants completed two separate testing sessions: The first consisted of the L2 (English) battery, including the eye-tracking L2 data collection, skills of individual differences, CFT, and LEAP-Q; and the second consisted of the L1 eye-tracking reading task and individual differences in L1.

4 The data we make available also include a variable (firstrun.skip) for whether the word was skipped during the first reading pass. While this variable finds more use in word and sentence reading, it is more problematic in studies of text reading. Quite often, readers begin with inspecting the length of the text to be read, so the first few fixations may land toward the middle or the end of a text passage: under a traditional definition, most words in such a scenario would be considered skipped, leading to massive data loss for the fixation analysis.

5 An alternative measure of rereading can be computed using the MECO data to examine not just whether a word was reread but also how long rereading took. This can be done by subtracting gaze duration from total reading time.

6 As discussed at length in Kuperman et al. (2023), two measures were computed based on the vocabulary knowledge test: One based on data across all available blocks (“thousands” 2–10) and another based on responses in earlier blocks only (“thousands” 2–5). As this is an adaptive task, with stopping rules at the end of each block (“thousand”), many participants had little to no data in later blocks. The adapted measure from thousands 2–5 focuses on parts of the test where most participants have substantial data and, indeed, was shown to be more reliable in both Kuperman et al. (2023) and our data (see the Reliability estimates section). Similar to Kuperman et al. (2023), we thus use the adapted measure throughout this paper. Both measures are available in the project’s repository for interested users.

7 Reliability could not be calculated for TOWRE as the test is based on a single word and a single pseudoword list. TOWRE scores are expected to be highly reliable, as reflected in previous reports of high test-retest reliability estimates (Torgesen et al., 2012). Previous reports also establish LexTALE as a reliable measure in L2-English participants (Lemhöfer & Broersma, 2012).

References

Andrews, S., & Hersch, J. (2010). Lexical precision in skilled readers: Individual differences in masked neighbor priming. Journal of Experimental Psychology: General, 139, 299.
Berzak, Y., Nakamura, C., Smith, A., Weng, E., Katz, B., Flynn, S., & Levy, R. (2022). CELER: A 365-participant corpus of eye movements in L1 and L2 English reading. Open Mind, 6, 41–50.
Bialystok, E., McBride-Chang, C., & Luk, G. (2005). Bilingualism, language proficiency, and learning to read in two writing systems. Journal of Educational Psychology, 97, 580.
Brysbaert, M., & Drieghe, D. (2024). The use of eye movement corpora in vocabulary research. Research Methods in Applied Linguistics, 3, 100093.
Cop, U., Dirix, N., Drieghe, D., & Duyck, W. (2017). Presenting GECO: An eyetracking corpus of monolingual and bilingual sentence reading. Behavior Research Methods, 49, 602–615.
Crossley, S. A., Allen, D. B., & McNamara, D. S. (2011). Text readability and intuitive simplification: Comparison of readability formulas. Reading in a Foreign Language, 23, 84–101.
De Bruin, A. (2019). Not all bilinguals are the same: A call for more detailed assessments and descriptions of bilingual experiences. Behavioral Sciences, 9, 33.
Geva, E., & Siegel, L. S. (2000). Orthographic and cognitive factors in the concurrent development of basic reading skills in two languages. Reading and Writing, 12, 1–30.
Gillon, G. T. (2017). Phonological awareness: From research to practice. Guilford Publications.
Glandorf, D., & Schroeder, S. (2021). Slice: An algorithm to assign fixations in multi-line texts. Procedia Computer Science, 192, 2971–2979.
Godfroid, A. (2020). Eye tracking in second language acquisition and bilingualism: A research synthesis and methodological guide. Routledge.
Gullifer, J. W., & Titone, D. (2020). Characterizing the social diversity of bilingualism using language entropy. Bilingualism: Language and Cognition, 23, 283–294.
Inhoff, A. W., & Radach, R. (1998). Definition and computation of oculomotor measures in the study of cognitive processes. In Underwood, G. (Ed.), Eye guidance in reading and scene perception (pp. 29–53). Elsevier Science.
Jeon, E. H., & Yamashita, J. (2014). L2 reading comprehension and its correlates: A meta‐analysis. Language Learning, 64, 160–212.
Koda, K. (2005). Insights into second language reading: A cross-linguistic approach. Cambridge University Press.
Kuperman, V., Siegelman, N., Schroeder, S., Acartürk, C., Alexeeva, S., Amenta, S., … & Usal, K. A. (2023). Text reading in English as a second language: Evidence from the Multilingual Eye-Movements Corpus. Studies in Second Language Acquisition, 45, 3–37.
Lemhöfer, K., & Broersma, M. (2012). Introducing LexTALE: A quick and valid lexical test for advanced learners of English. Behavior Research Methods, 44, 325–343.
Luk, G., & Bialystok, E. (2013). Bilingualism is not a categorical variable: Interaction between language proficiency and usage. Journal of Cognitive Psychology, 25, 605–621.
Marian, V., Blumenfeld, H. K., & Kaushanskaya, M. (2007). The Language Experience and Proficiency Questionnaire (LEAP-Q): Assessing language profiles in bilinguals and multilinguals. Journal of Speech, Language, and Hearing Research, 50, 940–967.
Melby-Lervåg, M., & Lervåg, A. (2014). Reading comprehension and its underlying components in second-language learners: A meta-analysis of studies comparing first-and second-language learners. Psychological Bulletin, 140, 409.
Nation, P., & Beglar, D. (2007). A vocabulary size test. The Language Teacher, 31, 9–13.
Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124, 372–422.
Schmitt, N. (2008). Instructed second language vocabulary learning. Language Teaching Research, 12, 329–363.
Schroeder, S. (2019). popEye - An integrated R package to analyse eye movement data from reading experiments. Journal of Eye Movement Research, 12, 92.
Siegelman, N., Elgort, I., Brysbaert, M., Agrawal, N., Amenta, S., Arsenijević Mijalković, J., … & Kuperman, V. (2023). Rethinking first language–second language similarities and differences in English proficiency: Insights from the ENglish Reading Online (ENRO) Project. Language Learning, 74, 249–294.
Siegelman, N., Schroeder, S., Acartürk, C., Ahn, H. D., Alexeeva, S., Amenta, S., … & Kuperman, V. (2022). Expanding horizons of cross-linguistic research on reading: The Multilingual Eye-movement Corpus (MECO). Behavior Research Methods, 54, 2843–2863.
Staub, A. (2021). How reliable are individual differences in eye movements in reading? Journal of Memory and Language, 116, 104190.
Sui, L., Dirix, N., Woumans, E., & Duyck, W. (2023). GECO-CN: Ghent Eye-tracking COrpus of sentence reading for Chinese-English bilinguals. Behavior Research Methods, 55, 2743–2763.
Thelk, A. D., Sundre, D. L., Horst, S. J., & Finney, S. J. (2009). Motivation matters: Using the Student Opinion Scale to make valid inferences about student performance. The Journal of General Education, 129–151.
Torgesen, J. K., Wagner, R. K., & Rashotte, C. A. (2012). Test of word reading efficiency–second edition (TOWRE-2). Pro-Ed.
Vandergrift, L. (2007). Recent developments in second and foreign language listening comprehension research. Language Teaching, 40, 191.
Warren, T., White, S. J., & Reichle, E. D. (2009). Investigating the causes of wrap-up effects: Evidence from eye movements and E–Z Reader. Cognition, 111, 132–137.
Weiß, R. H. (2006). CFT 20-R: Grundintelligenztest Skala 2 - Revision [Basic intelligence scale 2 with vocabulary knowledge test and sequential number sequence test]. Hogrefe.

Table 1. Information regarding participants in available samples

