(1) Context and motivation
In classroom settings, student engagement is widely recognized as a multidimensional rather than unitary construct (Fredricks et al., 2004; Salmela-Aro et al., 2021). It is most often theorized in terms of behavioral, emotional, and cognitive dimensions (Fredricks et al., 2004). In language education, however, a substantial body of work has extended this tripartite framing by emphasizing social and agentic aspects to capture the inherently interactional and participatory nature of language classrooms (e.g., Dincer et al., 2019; Philp & Duchesne, 2016; Svalberg, 2009).
Behavioral engagement refers to learners’ observable actions in academic activities such as paying attention, asking questions, and contributing to classroom discussions (Fredricks et al., 2004). It is considered the most visible form of engagement (Martin & Borup, 2022, as cited in López-Pernas & Saqr, 2024) and is a significant predictor of academic success (Skinner & Pitzer, 2012). The second dimension, cognitive engagement, extends beyond observable behaviors to encompass learners’ internal mental investment in learning (Fredricks et al., 2004). It involves the strategic effort, self-regulation, and metacognitive processes that students apply to understand complex ideas and monitor their learning (Fredricks et al., 2004; Skilling & Stylianides, 2023). Emotional engagement, in turn, involves learners’ affective responses to the learning experience. Positive emotions such as enjoyment and enthusiasm are key indicators of emotional engagement, whereas negative feelings such as boredom and anxiety signal disengagement or negative affective involvement (Reeve, 2012; Sang & Hiver, 2020). The classroom’s social ecology, including factors such as classroom climate, teacher support, and peer interactions, significantly shapes these feelings (Hu & Wang, 2023). As regards the social and agentic aspects, Sulis (2023, 2024) argues that, because engagement is intrinsically embedded in social contexts, the social dimension permeates all other facets and may not require separate classification. Embracing Sulis’s classification, the final dimension considered in the present study is therefore agentic engagement, which captures the proactive role students take in directing their learning process. Agentically engaged students contribute meaningfully to the flow of classroom instruction, ultimately facilitating the learning experiences not only of themselves but also of their peers (Patall, 2024; Zambrano et al., 2022). Agentic engagement is manifested through actions such as expressing preferences, seeking feedback, or suggesting modifications to tasks (Reeve, 2013; Reeve & Tseng, 2011).
Methodologically, student engagement has been examined through multiple approaches, including student self-report surveys, teacher reports/ratings, classroom observation, interviews, and experience sampling methods, each providing a different window into engagement as it unfolds in context (Fredricks & McColskey, 2012; Sulis, 2024). Nevertheless, self-report questionnaires remain the most common strategy in large-sample engagement research because they can be administered efficiently to large and diverse cohorts and can elicit cognitive and affective experiences that are not fully accessible through observation alone (Fredricks et al., 2004; Fredricks & McColskey, 2012). At the same time, the measurement literature cautions that engagement questionnaires vary in dimensional coverage and item content, and that overlap in how indicators are operationalized can limit cross-study comparability unless researchers clearly specify their engagement model and provide transparent validity evidence (Appleton et al., 2008; Fredricks & McColskey, 2012). These concerns are particularly salient for multidimensional engagement because measures are often worded broadly rather than anchored to specific tasks or situations, and similar items have sometimes been used to represent different engagement dimensions, features that can affect whether dimensions emerge as empirically distinguishable within a given dataset (Fredricks & McColskey, 2012). Recent synthesis work in language engagement research likewise documents substantial heterogeneity in methods and conceptual frameworks and calls for clearer definitions and operationalization to reduce ambiguity across studies (Hiver et al., 2024). 
In this light, adapting established engagement items to the instructional domain of L2 learning can be treated as a context-sensitizing step that clarifies the target of engagement and supports more interpretable inferences about engagement in classroom activity (Hiver et al., 2024; Philp & Duchesne, 2016).
Building on these conceptual and methodological considerations, the present dataset is intended to support both cumulative, instrument-based research and humanities-facing interpretations of classroom life. In applied linguistics, engagement is increasingly conceptualized as learners’ active and meaningful participation in language learning and use (Hiver et al., 2024) and as a situated process of heightened attention and involvement in classroom activity (Philp & Duchesne, 2016). By adapting engagement items to the domain of L2 learning and making both the instrument and its validation evidence transparent, this study is expected to provide a reusable measurement resource that can be triangulated with discourse-analytic, ethnographic, and sociocultural inquiries into how learners negotiate opportunities to participate, position themselves, and develop voice and agency as L2 learners in interaction. Such triangulation aligns with calls for integrated methodological approaches that combine multiple data sources and interpretive lenses, while also addressing documented challenges in construct–measurement alignment that can constrain cumulative theorizing in language engagement research (Hiver et al., 2024; Sulis, 2024).
(2) Dataset description
The dataset comprised 276 observations. Each row in the dataset represented a single student, and each column corresponded to an individual questionnaire item measuring engagement. The engagement questionnaire consisted of four dimensions: behavioral engagement (BE), cognitive engagement (CE), emotional engagement (EE), and agentic engagement (AE). The items were labeled as BE1 to BE5, CE1 to CE6, EE1 to EE5, and AE1 to AE5.
Repository location
Repository name
Figshare
Object name
Data_L2 Student Engagement.sav
Data_L2 Student Engagement.xlsx
Format names and versions
.sav
.xlsx
Creation dates
From 2025-05-07 to 2025-05-30
Dataset creators
Le Thi Diem Lan
Duong Minh Tuan
Language
All the variable names in the dataset are in English.
License
CC BY 4.0
Publication date
The dataset was published on Figshare on 2025-10-28.
(3) Method
(3.1) Participants
Participants included 276 undergraduate students majoring in English Studies at a university in the Mekong Delta region of Vietnam. Among them, there were 105 males (38.04%) and 171 females (61.96%), with ages ranging from 19 to 22 years (M = 20.57, SD = 1.10). Of the 276 participants, second-year students made up the largest group (N = 77, 27.90%), followed closely by third-year students (N = 75, 27.17%) and then fourth-year students (N = 65, 23.55%). First-year students comprised the smallest group (N = 59, 21.38%). All participants had at least ten years of experience studying English as a required subject in their formal education.
(3.2) Instrument
A survey questionnaire was employed as the primary instrument for data collection and dataset construction. Prior to its administration, a validation procedure was undertaken to establish its psychometric quality. The development process commenced with a qualitative design phase that specified the construct domain and drew on a critical review of engagement theory and existing measures to inform item generation and to confirm the need for an adapted instrument for the present context (Boateng et al., 2018). Also consistent with DeVellis’s (2017) guidance on scale development, an initial item pool was then generated to sample the content boundaries of the target construct, ensuring that items reflected the intended latent dimensions and did not extend beyond the construct definition. Building on this foundation, a 21-item engagement scale was constructed, comprising four subscales aligned with the four dimensions of engagement adopted in the present study (behavioral, cognitive, emotional, and agentic). All items were rated on a 6-point Likert scale ranging from 1 (“Strongly disagree”) to 6 (“Strongly agree”).
The scale was sourced from Reeve (2013) and Reeve and Tseng (2011), whose original instruments measured student engagement in general classroom contexts. To enhance contextual appropriateness for L2 learning, necessary adaptations were made. Specifically, the behavioral subscale, comprising five items, was adapted from Reeve (2013). One sample item was “When I am in English classes, I listen very carefully,” adjusted from “When I am in this class, I listen very carefully.” The remaining subscales drew on both Reeve (2013) and Reeve and Tseng (2011), resulting in six items for the cognitive subscale and five items each for the emotional and agentic subscales. Examples of these adaptations include the rephrasing of the cognitive item “Before I begin to study, I think about what I want to get done” as “Before I begin to study in English classes, I think about what I want to accomplish,” the modification of the emotional item “This class is fun” to “English classes are fun,” and the revision of the agentic item “I let my teacher know what I need and want” to “I tell my English teachers what I need and want.” Through these adaptations, the instrument maintained its conceptual rigor while gaining contextual relevance for measuring engagement in L2 learning environments. Table 1 presents the adapted L2 engagement scale and its sources.
Table 1
Adapted L2 engagement scale and its sources.
| QUESTIONNAIRE ITEM | ITEM CODE | SOURCES |
|---|---|---|
| Behavioral engagement | ||
| When I am in English classes, I listen very carefully. | BE1 | Reeve (2013) |
| I pay attention in English classes. | BE2 | |
| I try hard to do well in English classes. | BE3 | |
| In English classes, I work as hard as I can. | BE4 | |
| When I am in English classes, I participate in class discussions. | BE5 | |
| Cognitive engagement | ||
| Before I begin to study in English classes, I think about what I want to accomplish. | CE1 | Reeve and Tseng (2011); Reeve (2013) |
| When I study in English classes, I try to connect what I am learning with my own experiences. | CE2 | |
| I try to make all the different ideas fit together and make sense when I study in English classes. | CE3 | |
| I create my own examples to help me understand the important content I study in English classes. | CE4 | |
| When what I am working on is challenging to understand in English classes, I adjust my approach to learning the material. | CE5 | |
| As I study in English classes, I keep track of how much I understand, not just if I am getting the correct answers. | CE6 | |
| Emotional engagement | ||
| When I am in English classes, I feel curious about what we are learning. | EE1 | Reeve and Tseng (2011); Reeve (2013) |
| When we work on something in English classes, I feel interested. | EE2 | |
| When I am in English classes, I feel good. | EE3 | |
| I enjoy learning new things in English classes. | EE4 | |
| English classes are fun. | EE5 | |
| Agentic engagement | ||
| During English classes, I express my preferences and opinions. | AE1 | Reeve and Tseng (2011); Reeve (2013) |
| During English classes, I ask questions to help me learn. | AE2 | |
| I let my English teachers know what I am interested in. | AE3 | |
| I tell my English teachers what I need and want. | AE4 | |
| I offer suggestions to my English teachers on how to make classes better. | AE5 |
An expert review was conducted to evaluate the instrument’s content relevance and clarity. Four experts (two experienced EFL teachers and two TESOL specialists) reviewed the items and provided feedback, which informed minor wording revisions to improve readability. Overall, the panel judged the items to be appropriate indicators of the four intended engagement components and did not recommend adding new items, providing initial evidence of content validity based on expert judgment (Haynes et al., 1995; Sireci, 1998). A pilot study was then conducted with 32 students from the target population. To support comprehension and minimize construct-irrelevant variance associated with L2 reading demands, the questionnaire was administered in Vietnamese using a back-translation procedure to enhance equivalence (Brislin, 1970). Specifically, the survey was translated into Vietnamese and independently back-translated into English by two experienced EFL teachers. The back-translated versions were compared with the original, and discrepancies were documented at the item level and evaluated for lexical variation versus potential conceptual shift. Minor discrepancies, primarily lexical, were resolved through consensus discussion among the translators and the research team, with revisions designed to preserve the intended construct meaning while ensuring natural Vietnamese phrasing. Short follow-up interviews revealed that participants had no difficulty understanding the questionnaire. The pilot students were excluded from the subsequent main data collection to avoid potential biases that might affect the integrity of the dataset.
(3.3) Data collection
Prior to data collection, the dataset protocol was reviewed and approved by the Faculty of Foreign Languages’ Scientific and Educational Committee at Nam Can Tho University, which functioned as an alternative to an Institutional Review Board (Ref. No. C25.75). Data collection took place over approximately one week in the third semester of the 2024–2025 academic year. The questionnaire was distributed to the target participants, who were English major students. With the permission of the course instructors, one of the researchers visited several classes of the intended participants during regular class hours to collect data. Participants completed the questionnaire in a paper-and-pencil format under the supervision of the administering researcher. To guide them through the process, the first page of the survey contained a brief introduction to the data collection. Informed consent was obtained from all participants before they filled in the questionnaire. They were explicitly told that participation was voluntary, meaning they could withdraw at any time without consequences, and that all responses would be kept strictly confidential. Completing the questionnaire took approximately ten minutes. After removing seven invalid responses completed in under one minute, the final dataset comprised 276 valid cases.
(3.4) Data analysis
Data analyses were conducted using IBM SPSS Statistics (version 27) and JASP (version 0.95.3.0). The analytical process involved two main stages. In the first stage, preliminary data screening was performed to ensure completeness and to identify potential outliers using standardized z-scores and Mahalanobis distance. The assumptions for factor analysis were assessed through the Kaiser–Meyer–Olkin (KMO) measure of sampling adequacy and Bartlett’s test of sphericity. In the second stage, psychometric analyses were carried out to evaluate the measurement properties of the engagement scale. Within this stage, internal consistency reliability was first examined using Cronbach’s alpha (α) and McDonald’s omega (ω). The dimensional structure of the scale was then explored using Exploratory Factor Analysis (EFA), followed by Confirmatory Factor Analysis (CFA). Model fit in the CFA was evaluated using multiple indices, including the Root Mean Square Error of Approximation (RMSEA), Standardized Root Mean Square Residual (SRMR), Comparative Fit Index (CFI), Tucker–Lewis Index (TLI), Normed Fit Index (NFI), and Bollen’s Incremental Fit Index (IFI).
(4) Results
(4.1) Preliminary analyses
All data were screened for completeness, outliers, and normality. Descriptive analyses confirmed that no missing data were present among the 21 engagement items. Multivariate outliers were evaluated using Mahalanobis distance, computed across all 21 engagement items. Mahalanobis values ranged from 2.11 to 42.78 (M = 19.93, SD = 6.86). Following the guidelines of Tabachnick and Fidell (2013), cases with Mahalanobis distances exceeding the critical chi-square value at p < .001 (df = 21; χ² = 46.80) were classified as potential multivariate outliers. No cases exceeded this cutoff, indicating that the dataset was free of multivariate outliers. Univariate outliers were further examined using standardized z-scores; any case with a z-score below –3.29 or above +3.29 was considered an outlier (Tabachnick & Fidell, 2013). This step revealed no univariate outliers. The distribution of all 21 engagement items was examined for skewness and kurtosis to assess univariate normality. Skewness values ranged from –.145 to .508, and kurtosis values ranged from –1.163 to –.177, all within the acceptable range of ±2 (George & Mallery, 2019). These results indicate that the data approximated a normal distribution across all items.
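For readers who wish to apply the same screening to the published dataset, the procedure can be sketched in a few lines. The Python fragment below is illustrative only: it runs on simulated stand-in data rather than the published responses, and the variable names are hypothetical; the ±3.29 z-score rule and the χ²(21) critical value at p < .001 follow the text.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(0)
X = rng.normal(loc=4.0, scale=1.0, size=(276, 21))  # stand-in for the 21 Likert items

# Univariate screening: flag any value with |z| > 3.29 (Tabachnick & Fidell, 2013)
z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
univariate_flags = np.where(np.abs(z) > 3.29)

# Multivariate screening: squared Mahalanobis distance vs. the chi-square cutoff
diff = X - X.mean(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))
d2 = np.einsum('ij,jk,ik->i', diff, cov_inv, diff)  # squared distance per case
cutoff = chi2.ppf(1 - 0.001, df=21)                 # ~46.80, as reported
multivariate_flags = np.where(d2 > cutoff)[0]
```

A case is treated as a potential multivariate outlier only if its squared distance exceeds `cutoff`; in the published data, no case did.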
(4.2) Reliability
The internal consistency of the engagement scores was assessed using McDonald’s ω and Cronbach’s α coefficients for each dimension. The behavioral engagement (BE) subscale demonstrated excellent reliability (ω = .944, α = .950), as did cognitive engagement (CE) (ω = .957, α = .956), emotional engagement (EE) (ω = .948, α = .952), and agentic engagement (AE) (ω = .914, α = .917). The overall 21-item engagement scale also showed strong internal consistency (ω = .982, α = .956), indicating that the items reliably measured the intended constructs. In addition, corrected item–total correlations, which indicate the extent to which a particular item aligns with the total score of its respective subscale when it is excluded from that total (Nunnally & Bernstein, 1994), ranged from .745 to .903 across dimensions. These coefficients well exceeded the commonly used minimum threshold of .30 for corrected item–total correlations (Boateng et al., 2018; Field, 2009), suggesting that all items contributed meaningfully to their respective subscales and that none functioned as weak or misfitting indicators. Taken together, these findings indicated that the engagement questionnaire was a reliable instrument for assessing the multidimensional construct of student engagement within the current sample.
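As an illustration of the reliability indices used above, the sketch below computes Cronbach’s α and corrected item–total correlations for a single simulated five-item subscale (McDonald’s ω additionally requires factor loadings and is omitted). The data and names are hypothetical stand-ins, not the published responses.

```python
import numpy as np

rng = np.random.default_rng(1)
latent = rng.normal(size=(276, 1))
# five simulated items sharing one latent factor, standing in for BE1-BE5
items = latent + rng.normal(scale=0.5, size=(276, 5))

def cronbach_alpha(X):
    """alpha = k/(k-1) * (1 - sum of item variances / variance of the total)."""
    k = X.shape[1]
    return (k / (k - 1)) * (1 - X.var(axis=0, ddof=1).sum()
                            / X.sum(axis=1).var(ddof=1))

def corrected_item_total(X):
    """Correlation of each item with the sum of the remaining items."""
    total = X.sum(axis=1)
    return np.array([np.corrcoef(X[:, j], total - X[:, j])[0, 1]
                     for j in range(X.shape[1])])

alpha = cronbach_alpha(items)
r_it = corrected_item_total(items)
```

With strongly intercorrelated items, as here, α is high and every corrected item–total correlation clears the .30 threshold cited in the text.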
(4.3) Exploratory Factor Analysis (EFA)
The dataset’s suitability for factor analysis was first evaluated. The overall KMO measure of sampling adequacy was .953, exceeding the conventionally applied lower bound of .50 and indicating that the correlation matrix contained sufficient shared variance for factor extraction (Kaiser, 1970; Kaiser & Rice, 1974). Item-level measures of sampling adequacy were similarly high (.816–.967), supporting the inclusion of each item in the factor model (Kaiser & Rice, 1974). Bartlett’s test of sphericity was significant (χ²(210) = 5785.772, p < .001), rejecting the null hypothesis that the correlation matrix is an identity matrix and further supporting the factorability of the data (Bartlett, 1950). EFA was conducted using principal axis factoring with Promax rotation to allow for correlated factors. The rotated matrix of factor loadings (Table 2) and the scree plot (Figure 1) confirmed the extraction of four factors, consistent with the theoretical model of student engagement, which comprises the behavioral (BE), cognitive (CE), emotional (EE), and agentic (AE) dimensions. Factor 1 corresponded to CE, Factor 2 to EE, Factor 3 to BE, and Factor 4 to AE.
Table 2
EFA using principal axis factoring with Promax rotation.
| ITEM | FACTOR 1 | FACTOR 2 | FACTOR 3 | FACTOR 4 | UNIQUENESS |
|---|---|---|---|---|---|
| CE1 | .927 | .000 | .013 | –.082 | .200 |
| CE3 | .916 | –.025 | –.011 | .001 | .197 |
| CE4 | .883 | –.009 | –.012 | –.002 | .244 |
| CE2 | .853 | –.017 | .014 | .078 | .204 |
| CE5 | .848 | .062 | .011 | –.035 | .232 |
| CE6 | .821 | .047 | .007 | .082 | .190 |
| EE3 | –.024 | .938 | .041 | –.032 | .127 |
| EE5 | –.034 | .904 | –.001 | .038 | .188 |
| EE2 | .022 | .871 | –.018 | –.018 | .252 |
| EE4 | .067 | .846 | .015 | –.007 | .198 |
| EE1 | .034 | .835 | .021 | .022 | .225 |
| BE3 | .024 | –.017 | .928 | –.055 | .180 |
| BE5 | .093 | –.041 | .882 | –.016 | .186 |
| BE1 | –.085 | .041 | .871 | .004 | .272 |
| BE2 | .001 | .048 | .861 | .038 | .170 |
| BE4 | .007 | .019 | .847 | .043 | .217 |
| AE4 | –.036 | –.025 | –.037 | .888 | .290 |
| AE2 | –.005 | –.021 | –.005 | .854 | .295 |
| AE1 | –.020 | .011 | .069 | .834 | .254 |
| AE3 | .024 | .048 | –.037 | .816 | .304 |
| AE5 | .049 | –.007 | .024 | .749 | .386 |

Figure 1
Scree plot displaying the eigenvalues associated with each factor extracted through EFA.
The rotated solution accounted for 77.1% of the total variance. All factor loadings were above .70, indicating strong associations between items and their respective factors. Uniqueness values ranged from .127 to .386, suggesting that most variance in each item was explained by its factor. No items demonstrated problematic cross-loadings, supporting the distinctiveness of the four engagement dimensions. The inflection point at the fourth factor shown in Figure 1 supports the retention of the four distinct engagement dimensions considered for the present study.
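The two factorability checks reported above (KMO and Bartlett’s test of sphericity) can be reproduced from a correlation matrix alone. The sketch below, in Python with simulated four-factor data, is purely illustrative; it is not the analysis actually run in SPSS/JASP, though the degrees of freedom (210 for 21 items) match the report.

```python
import numpy as np
from scipy.stats import chi2

rng = np.random.default_rng(2)
n, p = 276, 21
factors = rng.normal(size=(n, 4))
assign = np.repeat(np.arange(4), [5, 6, 5, 5])   # BE, CE, EE, AE item counts
X = factors[:, assign] + rng.normal(scale=0.6, size=(n, p))

R = np.corrcoef(X, rowvar=False)

# Bartlett's test of sphericity: H0 is that R is an identity matrix
chi_sq = -(n - 1 - (2 * p + 5) / 6) * np.log(np.linalg.det(R))
df = p * (p - 1) // 2                             # 210, matching the report
p_value = chi2.sf(chi_sq, df)

# KMO: squared correlations relative to squared correlations plus squared
# partial (anti-image) correlations, off-diagonal entries only
S = np.linalg.inv(R)
partial = -S / np.sqrt(np.outer(np.diag(S), np.diag(S)))
off = ~np.eye(p, dtype=bool)
kmo = (R[off] ** 2).sum() / ((R[off] ** 2).sum() + (partial[off] ** 2).sum())
```

For data with a clear factor structure, as simulated here, Bartlett’s test is significant and KMO is well above the .50 lower bound, mirroring the pattern reported for the actual dataset.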
(4.4) Convergent and discriminant validity
Convergent validity was evaluated using the magnitude and significance of the standardized factor loadings and the Average Variance Extracted (AVE). All items loaded significantly on their intended latent factors, with standardized loadings ranging from .854 to .952 (p < .001). Residual variances were also small across items (.094–.271). The magnitude and statistical significance of these loadings were interpreted as evidence that the indicators converged on their intended constructs (Brown, 2015). AVE values were high (BE = .845, AE = .791, CE = .841, EE = .853), indicating that each construct accounted for a substantial proportion of variance in its indicators relative to measurement error (Fornell & Larcker, 1981). Discriminant validity was assessed using the heterotrait–monotrait (HTMT) ratio; all HTMT values were below .85, supporting the empirical distinctness of the four engagement dimensions (Henseler et al., 2015).
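The two validity indices above can be illustrated with a short sketch: AVE as the mean squared standardized loading per construct, and HTMT as the ratio of between-construct to within-construct item correlations. The loadings and simulated data below are hypothetical placeholders, not the fitted values from this study.

```python
import numpy as np

# AVE (Fornell & Larcker, 1981): mean squared standardized loading.
# Illustrative loadings only, not the published estimates.
loadings = np.array([0.90, 0.92, 0.93, 0.91, 0.89])
ave = np.mean(loadings ** 2)

def htmt(R, idx_a, idx_b):
    """HTMT (Henseler et al., 2015): mean between-construct item correlation
    over the geometric mean of the mean within-construct correlations."""
    hetero = R[np.ix_(idx_a, idx_b)].mean()
    def mono(idx):
        sub = R[np.ix_(idx, idx)]
        return sub[np.triu_indices(len(idx), k=1)].mean()
    return hetero / np.sqrt(mono(idx_a) * mono(idx_b))

# Simulated two-construct example with a latent correlation of .5
rng = np.random.default_rng(3)
f = rng.normal(size=(276, 2))
f[:, 1] = 0.5 * f[:, 0] + np.sqrt(0.75) * f[:, 1]
assign = np.array([0] * 5 + [1] * 5)
X = f[:, assign] + rng.normal(scale=0.5, size=(276, 10))
R = np.corrcoef(X, rowvar=False)
htmt_value = htmt(R, list(range(5)), list(range(5, 10)))
```

Because HTMT corrects for attenuation, the estimate recovers roughly the latent correlation (.5 here), comfortably below the .85 criterion applied in the text.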
(4.5) Confirmatory factor analysis (CFA)
CFA was conducted to evaluate the four-factor structure of the engagement scale. Using the diagonally weighted least squares estimator, the four-factor model demonstrated excellent fit to the data. The chi-square test of exact fit was nonsignificant, χ²(183) = 212.09, p = .069, indicating that the null hypothesis of exact fit was not rejected (Widaman & Thompson, 2003). Approximate-fit indices corroborated close fit: RMSEA = .024 and SRMR = .031 (Hu & Bentler, 1999). Incremental fit indices, which evaluate improvement in fit relative to a nested baseline model (Hu & Bentler, 1999), were also very high: CFI = .999; TLI = .998; NFI = .989; IFI = .999 (Widaman & Thompson, 2003). Taken together, these results indicated that the hypothesized four-factor model provided an excellent representation of the data (see Figure 2).
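The approximate- and incremental-fit indices above can be computed directly from the chi-square statistics. In the sketch below, the model chi-square and degrees of freedom are the reported values; the baseline (independence) model chi-square is not reported in the text, so the figure used here is a hypothetical placeholder for illustration only (the RMSEA, which needs no baseline, reproduces the reported .024).

```python
import math

n = 276
chi2_model, df_model = 212.09, 183   # reported four-factor model
# Hypothetical baseline (independence) model: chi-square is NOT reported in
# the text; df = p(p-1)/2 = 210 for 21 items.
chi2_base, df_base = 6000.0, 210

rmsea = math.sqrt(max(chi2_model - df_model, 0) / (df_model * (n - 1)))
cfi = 1 - max(chi2_model - df_model, 0) / max(chi2_base - df_base,
                                              chi2_model - df_model, 1e-12)
tli = (((chi2_base / df_base) - (chi2_model / df_model))
       / ((chi2_base / df_base) - 1))
nfi = (chi2_base - chi2_model) / chi2_base
```

With the reported χ²(183) = 212.09 and N = 276, `rmsea` evaluates to .024; the incremental indices (CFI, TLI, NFI) depend on the assumed baseline and are shown only to make the formulas concrete.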

Figure 2
Standardized CFA model.
This figure illustrates the standardized CFA model confirming the four-factor structure of the L2 student engagement scale, comprising behavioral, cognitive, emotional, and agentic dimensions. Observed variables load onto their respective latent constructs, with all standardized loadings and inter-factor correlations indicating strong relationships among the engagement components.
(5) Discussion
The psychometric results provide empirical support for treating student engagement in L2 classes as a multidimensional construct. In the broader engagement literature, engagement is most commonly conceptualized as comprising behavioral, emotional, and cognitive components (Fredricks et al., 2004), and reviews further emphasize that engagement measures vary substantially in how dimensions are operationalized and combined across studies (Appleton et al., 2008). Building on this foundational framing, the present four-factor pattern with the addition of agentic engagement is also consistent with work arguing that students’ proactive contributions to instruction constitute a distinct aspect capturing learners’ intentional attempts to shape the flow of teaching and learning (Reeve, 2012, 2013; Reeve & Tseng, 2011). The strong reliability estimates and salient factor loadings observed in this sample suggest that the adapted items functioned coherently as indicators of their intended subconstructs in a Vietnamese university EFL context.
Importantly, these findings should be interpreted in relation to how engagement is theorized and enacted in L2 learning. In language classrooms, engagement is inseparable from interaction and discourse because participation is realized through the organization of classroom talk and the opportunities learners have to initiate, respond, and negotiate meaning (Seedhouse, 2004; Walsh, 2011). Students’ BE is therefore most visible in moment-by-moment participation, indicated through taking turns, initiating, responding, and co-constructing meaning as classroom interaction unfolds (Seedhouse, 2004; Walsh, 2011). CE can be understood as strategic attention, monitoring, and problem-solving as learners work to comprehend and respond during interaction, including noticing gaps and addressing communication difficulties during negotiation work (Long, 1996; Pintrich, 2000). EE is embedded in learners’ situated experiences of challenge, risk, and affiliation in communicative activity and is shaped by achievement-related emotions, such as enjoyment and anxiety, that arise in classroom learning (Dewaele & MacIntyre, 2016; Pekrun, 2006). Accordingly, validated questionnaire data do not replace close analysis of classroom practice, including observation- and discourse-oriented accounts of how engagement is enacted in situated activity (Svalberg, 2009). Rather, questionnaires offer a complementary “wide-angle” perspective that enables large-sample comparisons across groups and contexts, and they are most informative when combined with interviews and classroom observation as part of a multi-method design that supports purposive qualitative follow-up and triangulation (Fredricks & McColskey, 2012; Philp & Duchesne, 2016).
Framed in humanities terms, the dataset can be read as evidence of learners’ situated orientations toward participation and agency within a local ecology of classroom meaning-making rather than as decontextualized traits (van Lier, 2004). From sociocultural and identity perspectives, engagement in language learning is bound up with learners’ access to participation in community practices, the meanings they attach to languages, and the interactional positions made available to them within unequal relations of power (Norton, 2013; Wenger, 1998). These concerns have long been central to applied linguistics and humanities-adjacent traditions that treat language learning as socially mediated practice and as a site where subjectivity and identity are negotiated (Kramsch, 2009; Lantolf & Thorne, 2006; Norton, 2013). The present dataset, therefore, has value not only for predictive models but also for interpretive scholarship seeking context-rich explanations of how engagement is enacted and experienced in classroom life. It can help researchers identify patterns of engagement that warrant deeper explanation and can be examined alongside classroom discourse analysis or ethnographic accounts of interaction and participation.
Finally, adapting engagement items to the explicit domain label English classes can be understood as a context-sensitizing step that clarifies the object of engagement and responds to documented measurement heterogeneity. Foundational reviews have noted limited consensus in how engagement is defined and substantial variability in how it is operationalized and measured, which can constrain cross-study comparability when measures are not tightly mapped onto their conceptual definitions (Appleton et al., 2008; Fredricks & McColskey, 2012). In language engagement research, methodological syntheses likewise document heterogeneous methods and conceptual frameworks and call for greater definitional and operational transparency to reduce ambiguity across studies (Hiver et al., 2024; Sulis, 2024). By providing item-level documentation and explicitly reporting analytic decisions, the present study aims to support more interpretable reuse of questionnaire scores, consistent with argument-based validity approaches that foreground the intended interpretations and uses of scores and the evidentiary basis for those inferences (Kane, 2013). At the same time, the meaning of questionnaire responses remains situated and should be interpreted alongside the local linguistic and institutional context in which they were produced (Norton, 2013; van Lier, 2004).
(6) Implications and limitations
This dataset offers a modest empirical contribution to applied linguistics and L2 education by providing a transparent, reusable measurement resource for examining engagement in English classes as behavioral, cognitive, emotional, and agentic participation. Because the instrument underwent content validation, piloting, exploratory and confirmatory factor analyses, and multiple reliability and validity checks, it can support replication, comparative work, and cumulative instrument-based research in L2 engagement. Researchers may use the dataset to (a) benchmark or refine emerging engagement measures, (b) test theoretically motivated extensions (e.g., links with motivation, emotion regulation, classroom interactional opportunities, or instructional support), and (c) conduct secondary analyses examining the interrelations among the four engagement dimensions across learner subgroups.
The dataset’s item-level documentation and detailed reporting of analytic decisions also make it suitable for methodologically informed synthesis, including psychometric reviews and meta-analytic work that examines the dimensional structure and measurement properties of L2 engagement instruments across contexts. In addition, the four-factor structure provides a practical starting point for measurement invariance testing across instructional settings or demographic groups, clarifying the extent to which engagement is configured similarly or differently across contexts and thereby strengthening cross-study comparability and cumulative theorizing.
Finally, consistent with humanities-facing approaches that treat engagement as situated participation in classroom life, the dataset can be triangulated with discourse-analytic, ethnographic, or sociocultural analyses. Used in this way, questionnaire profiles can help identify patterns warranting closer interpretive explanation (e.g., how learners enact agency, position themselves, or experience affective demands in interaction), while qualitative evidence can contextualize what questionnaire responses mean within a particular institutional and linguistic ecology. The dataset may also serve as a baseline for longitudinal or pedagogical intervention research, enabling pre–post comparisons and modelling changes in engagement across different teaching conditions.
Despite the methodological rigor of the dataset, several limitations should be acknowledged. First, because engagement was measured solely through self-reported questionnaires, the results may be affected by response bias. Future investigations should triangulate self-reports with complementary data sources (e.g., teachers’ assessments and classroom observations) to strengthen the validity of engagement measurement. Second, although the sample size (N = 276) was adequate for exploratory and confirmatory analyses, it was not large enough to support more stringent cross-validation (i.e., splitting into independent calibration and validation subsets) or fine-grained subgroup analyses, which also constrains the dataset’s standalone reuse. Consequently, EFA and CFA were conducted on the same sample, raising potential concerns about overfitting. Future studies should therefore use larger samples with more detailed participant descriptors to enable cross-validation, invariance testing across groups/contexts, and stronger cumulative reuse, particularly through pooling with compatible datasets that administer the same questionnaire in other settings.
Note
L2 refers to learners’ foreign or second language in this study.
Competing Interests
The authors have no competing interests to declare.
Author Contributions
Le Thi Diem Lan: Conceptualization, Data curation, Investigation, Methodology, Writing – original draft
Duong Minh Tuan: Data curation, Investigation, Methodology, Supervision, Writing – review & editing
