
Abstract

Since the start of the PISA evaluations, numerous studies have tried, metaphorically, «to separate the gold from the sand»: that is, to derive useful knowledge to guide educational practice and policy from the vast amount of data collected. However, research that uses data mining techniques to extract knowledge from the databases provided by the OECD has been less common. This paper analyses the context questionnaires from a metric perspective using a data mining methodology based on «regression trees». Its main goal is to discover how much value (how much «gold») there is in the items that make up these questionnaires, considering their use as predictors of the performance of Spanish students. The results provide a list of the items selected from the six questionnaires, together with their predictive value. The paper also offers a methodological approach to help improve the productivity of educational research derived from PISA.

This is the English version of an article originally printed in Spanish in issue 270 of the revista española de pedagogía. For this reason, the abbreviation EV has been added to the page numbers. Please cite this article as follows: Asensio Muñoz, I., Carpintero Molina, E., Expósito Casas, E., & López Martín, E. (2018). ¿Cuánto oro hay entre la arena? Minería de datos con los resultados de España en PISA 2015 | How much gold is in the sand? Data mining with Spain’s PISA 2015 results. Revista Española de Pedagogía, 76 (270), 225-245. doi: 10.22550/REP76-2-2018-02

Author Biographies

Inmaculada Asensio Muñoz has a PhD in Pedagogy from the Universidad Complutense of Madrid, for which she received a special doctoral prize. She is a Lecturer in the Department of Educational Research Methods and Assessment in the Faculty of Education of the Universidad Complutense of Madrid, and a member of the Measuring and Evaluating Educational Systems research group.

Elvira Carpintero Molina has a PhD in Educational Psychology from the Universidad Complutense of Madrid and is an Assistant Professor in the Department of Educational Research Methods and Assessment. She is a member of the Measuring and Evaluating Educational Systems research group and the Adaptive Pedagogy research group at the Universidad Complutense of Madrid.

Eva Expósito Casas has a PhD in Education from the Universidad Complutense of Madrid and is an Assistant Professor in the Department of Research Methods and Assessment in Education II at the Universidad Nacional de Educación a Distancia. She is a member of the Complutense’s Measuring and Evaluating Educational Systems research group (MESE) and the Educational Psychology Counselling and Counsellor Skills research group (GRISOP).

Creative Commons License

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
