Contribution of machine learning to the analysis of grade repetition in Spain: A study based on PISA data
DOI
10.22550/2174-0909.4014
Abstract
Introduction: Grade repetition remains excessively common in Spain even though it is a controversial measure. To provide evidence that can help reduce it in compulsory education, this work examines in depth the PISA 2018 context indices most closely linked to the phenomenon. Method: Using the sample of Spanish students (n = 35 943), we applied an automatic machine learning method to select and rank the predictors, and multilevel logistic regression (student and school levels) to quantify the contribution of each one. Results: For each educational stage we obtained the 30 most significant contextual variables, which explain 65.5% of the variance in grade repetition in primary education and 55.7% in secondary education. Conclusions: The main indicators lie principally at the student level, which suggests that psychoeducational interventions based on individual support are more suitable than general policies. The findings point to measures that are potentially more efficient and equitable than grade repetition, aimed at, for example, the management of learning time or academic and career guidance, and to predictors with differential significance at each stage. Methodologically, the study contributes to improving the specification of predictive models.
Financiación | Funding
This article received no public or private funding.
Citación recomendada | Recommended citation
Constante-Amores, A., Arroyo-Resino, D., Sánchez-Munilla, M., & Asensio-Muñoz, I. (2024). Contribution of machine learning to the analysis of grade repetition in Spain: A study based on PISA data [Contribución del machine learning al análisis de la repetición escolar en España: un estudio con datos PISA]. Revista Española de Pedagogía, 82 (289), 539-562. https://doi.org/10.22550/2174-0909.4014
Licencia Creative Commons | Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
Palabras clave | Keywords
PISA, grade repetition, machine learning, contextual variables, multilevel logistic regression, compulsory education.