
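Contribución del machine learning al análisis de la repetición escolar en España: un estudio con datos PISA [Contribution of machine learning to the analysis of grade repetition in Spain: a study with PISA data]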




Introduction: Grade repetition is a controversial measure, yet its rate in Spain remains excessively high. To provide evidence that can help reduce it in compulsory education, this work examines in depth the PISA 2018 context indices most closely linked to this phenomenon. Method: Using the sample of Spanish students (n = 35 943), we applied an automatic machine learning method to select and rank the predictors, and multilevel logistic regression (students nested in centres) to quantify the contribution of each one. Results: For each educational stage we obtained the 30 most significant contextual variables, which explain 65.5% of the variance in grade repetition in primary education and 55.7% in secondary education. Conclusions: The main indicators are mostly at student level, which suggests the suitability of psychoeducational interventions based more on individual support than on general policies. This points to measures that are potentially more efficient and equitable than grade repetition, aimed at, for example, the management of learning time or academic and professional guidance, and to predictors whose significance is specific to each stage. Methodologically, the study contributes to improving the specification of predictive models.
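
The multilevel logistic model named in the Method can be illustrated with a minimal Python sketch, assuming a random-intercept specification with students at level 1 and centres at level 2. The abstract does not give the model equations, so the function names, the coefficients and the latent-variable intraclass correlation used here are illustrative assumptions, and none of the numbers come from the study.

```python
import math

# Level-1 residual variance of the standard logistic distribution (pi^2 / 3),
# used by the latent-variable approach to variance partitioning.
LOGISTIC_RESIDUAL_VARIANCE = math.pi ** 2 / 3


def repetition_probability(intercept, coefficients, predictors, centre_effect=0.0):
    """Probability of repetition for one student under a random-intercept logistic model.

    All arguments are hypothetical: `coefficients` and `predictors` are
    same-length sequences, `centre_effect` is the centre's random intercept.
    """
    if len(coefficients) != len(predictors):
        raise ValueError("coefficients and predictors must have the same length")
    linear_predictor = intercept + centre_effect + sum(
        b * x for b, x in zip(coefficients, predictors)
    )
    return 1.0 / (1.0 + math.exp(-linear_predictor))


def latent_icc(centre_variance):
    """Share of latent-scale variance that lies between centres."""
    if centre_variance < 0:
        raise ValueError("variance cannot be negative")
    return centre_variance / (centre_variance + LOGISTIC_RESIDUAL_VARIANCE)


# Made-up values for illustration only; they are not estimates from PISA 2018.
p = repetition_probability(-1.0, [0.8, -0.5], [1.0, 2.0], centre_effect=0.3)
icc = latent_icc(0.4)
print(f"P(repeat) = {p:.3f}, centre-level share of variance = {icc:.3f}")
```

A small centre-level share would be consistent with the conclusion that the main indicators sit at student level, although the abstract does not report this quantity.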

Funding

This article received no public or private funding.

Author Biography

Alexander Constante-Amores. Bachelor’s Degree in Pedagogy and Master’s Degree in Educational Research from the Universidad Complutense de Madrid (UCM). He is currently completing the Doctoral Programme in Education at the UCM. He is a lecturer at the Universidad Camilo José Cela and the Universidad Europea de Madrid, where he teaches subjects in the areas of Statistics, Biostatistics and Research Methods. He was an intern at the Instituto Nacional de Estadística [Spanish National Statistics Institute] and he is a member of the research group Medida y Evaluación de Sistemas Educativos (MESE) [Measurement and Evaluation of Educational Systems]. His line of research focuses on the evaluation of educational systems.


Delia Arroyo-Resino. Lecturer at the Universidad Complutense de Madrid. She holds an International Doctorate with an Outstanding Award from the Department of Research and Diagnostic Methods in Education at the Universidad Complutense de Madrid. She is a member of the research group Medida y Evaluación de Sistemas Educativos (MESE).


María Sánchez-Munilla. Bachelor’s Degree in Pedagogy and Master’s Degree in Methodologies in Behavioural Health Sciences from the Universidad Nacional de Educación a Distancia (UNED). She is currently completing the UCM Doctoral Programme in Education under a Formación del Profesorado Universitario (FPU) [University Teacher Training] predoctoral contract. She is a member of the research group Medida y Evaluación de Sistemas Educativos (MESE) and the Servicio de Evaluación y Diagnóstico en Educación (SEDE) [Educational Evaluation and Diagnosis Service] of the Faculty of Education at the UCM.


Inmaculada Asensio-Muñoz. Doctorate with an Outstanding Award from the Universidad Complutense de Madrid. Senior lecturer in the area of Research and Diagnostic Methods in Education, at the Department of Research and Psychology in Education in the Faculty of Education at the same university. She has extensive teaching experience in pedagogy and teacher training at undergraduate, master’s and doctoral levels. She is a member of the research group Medida y Evaluación de Sistemas Educativos (MESE) and, as an expert in educational research methodology, has worked on several funded research and innovation projects related to this subject during her professional career. Her publications focus on improving education and learning and, more generally, on the role of teaching and guidance.


Included in

Education Commons



Keywords

PISA, grade repetition, machine learning, contextual variables, multilevel logistic regression, compulsory education.