Article title
Authors
Identifiers
Title variants
Publication languages
Abstracts
Automated essay evaluation is a widely used practical solution for replacing time-consuming manual grading of student essays. Automated systems are used in combination with human graders in various high-stakes assessments, where grading models are learned from essay datasets scored by different graders. Although the grading rules are standardized, human graders can unintentionally introduce subjective bias into their scores. Consequently, a grading model has to learn from data that represent a noisy relationship between an essay's attributes and its grade. We propose an approach that uses an explanation methodology and clustering to partition a set of essays into subsets that represent similar graders. The results confirm our assumption that learning from an ensemble of separate models can significantly improve the average prediction accuracy on artificial and real-world datasets.
Keywords
Publisher
Journal
Year
Volume
Pages
239--259
Physical description
Bibliography: 41 items, figures, tables, charts.
Contributors
author
- University of Ljubljana, Faculty of Computer and Information Science, Večna pot 113, Ljubljana, Slovenia
author
- University of Ljubljana, Faculty of Computer and Information Science, Večna pot 113, Ljubljana, Slovenia
Bibliography
- [1] Shermis MD, Burstein J. Introduction. In: Shermis MD, Burstein J (eds.), Automated essay scoring: A cross-disciplinary perspective, pp. xiii-xvi. Lawrence Erlbaum Associates, Mahwah, NJ, 2003.
- [2] Bridgeman B. Human Ratings and Automated Essay Evaluation. In: Shermis MD, Burstein JC (eds.), Handbook of Automated Essay Evaluation: Current Applications and New Directions, chapter 13, pp. 221-232. Routledge, New York, 2013. URL http://dx.doi.org/10.4324/9780203122761.ch13.
- [3] Lottridge SM, Schulz EM, Mitzel HC. Using Automated Scoring to Monitor Reader Performance and Detect Reader Drift in Essay Scoring. In: Shermis MD, Burstein J (eds.), Handbook of Automated Essay Evaluation: Current Applications and New Directions, chapter 14, pp. 233-250. Routledge, New York, 2013. URL https://www.taylorfrancis.com/books/e/9780203122761/chapters/10.4324/9780203122761-22.
- [4] Congdon PJ, McQueen J. The stability of rater severity in large-scale assessment programs. Journal of Educational Measurement, 2000. 37(2):163-178. doi:10.1111/j.1745-3984.2000.tb01081.x.
- [5] Hoskens M, Wilson M. Real-time feedback on rater drift in constructed-response items: An example from the Golden State Examination. Journal of Educational Measurement, 2001. 38(2):121-145. doi:10.1111/j.1745-3984.2001.tb01119.x.
- [6] Leckie G, Baird JA. Rater effects on essay scoring: A multilevel analysis of severity drift, central tendency, and rater experience. Journal of Educational Measurement, 2011. 48(4):399-418. doi:10.1111/j.1745-3984.2011.00152.x.
- [7] Myford CM, Wolfe EW. Monitoring rater performance over time: A framework for detecting differential accuracy and differential scale category use. Journal of Educational Measurement, 2009. 46(4):371-389. doi:10.1111/j.1745-3984.2009.00088.x.
- [8] Bejar II. A validity-based approach to quality control and assurance of automated scoring. Assessment in Education: Principles, Policy & Practice, 2011. 18(3):319-341. doi:10.1080/0969594X.2011.555329.
- [9] Williamson DM, Xi X, Breyer FJ. A Framework for Evaluation and Use of Automated Scoring. Educational Measurement: Issues and Practice, 2012. 31(1):2-13. doi:10.1111/j.1745-3992.2011.00223.x.
- [10] Attali Y. Validity and Reliability of Automated Essay Scoring. In: Shermis MD, Burstein JC (eds.), Handbook of Automated Essay Evaluation: Current Applications and New Directions, chapter 11, pp. 181-198. Routledge, New York, 2013. URL https://www.routledge.com/products/9780415810968.
- [11] Štrumbelj E, Kononenko I, Robnik Šikonja M. Explaining instance classifications with interactions of subsets of feature values. Data and Knowledge Engineering, 2009. 68(10):886-904. doi:10.1016/j.datak.2009.01.004.
- [12] Shermis MD, Burstein JC (eds.). Handbook of Automated Essay Evaluation: Current Applications and New Directions. Routledge, New York, 2013. ISBN-10:9780415810968, 13:978-0415810968.
- [13] Zupanc K, Bosnić Z. Automated essay evaluation with semantic analysis. Knowledge-Based Systems, 2017. 120:118-132. doi:10.1016/j.knosys.2017.01.006.
- [14] Page EB. The Imminence of... Grading Essays by Computer. Phi Delta Kappan, 1966. 47(5):238-243.
- [15] Burstein J, Tetreault J, Madnani N. The E-rater® Automated Essay Scoring System. In: Shermis MD, Burstein J (eds.), Handbook of Automated Essay Evaluation: Current Applications and New Directions, chapter 4, pp. 55-67. Routledge, New York, 2013.
- [16] Foltz PW, Streeter LA, Lochbaum KE, Landauer TK. Implementation and Applications of the Intelligent Essay Assessor. In: Shermis MD, Burstein J (eds.), Handbook of Automated Essay Evaluation: Current Applications and New Directions, chapter 5, pp. 68-88. Routledge, New York, 2013.
- [17] Schultz MT. The IntelliMetric Automated Essay Scoring Engine - A Review and an Application to Chinese Essay Scoring. In: Shermis MD, Burstein JC (eds.), Handbook of Automated Essay Evaluation: Current Applications and New Directions, chapter 6, pp. 89-98. Routledge, New York, 2013. doi:10.4324/9780203122761.ch6.
- [18] Mayfield E, Penstein-Rosé C. An Interactive Tool for Supporting Error Analysis for Text Mining. In: Proceedings of the NAACL HLT 2010 Demonstration Session. Los Angeles, CA, 2010 pp. 25-28.
- [19] Fazal A, Dillon T, Chang E. Noise Reduction in Essay Datasets for Automated Essay Grading. Lecture Notes in Computer Science, 2011. 7046:484-493. doi:10.1007/978-3-642-25126-9_60.
- [20] Brent E, Atkisson C, Green N. Time-shifted Collaboration: Creating Teachable Moments through Automated Grading. In: Juan AA, Daradoumis T, Caballe S (eds.), Monitoring and Assessment in Online Collaborative Environments: Emergent Computational Technologies for E-learning Support, pp. 55-73. IGI Global, 2010. doi:10.4018/978-1-60566-786-7.ch004.
- [21] Gutierrez F, Dou D, Fickas S, Griffiths G. Online Reasoning for Ontology-Based Error Detection in Text. On the Move to Meaningful Internet Systems: OTM 2014 Conferences, Lecture Notes in Computer Science, 2014. 8841:562-579. doi:10.1007/978-3-662-45563-0_34.
- [22] Nguyen HV, Litman DJ. Argument Mining for Improving the Automated Scoring of Persuasive Essays. In: Proceedings of the 32nd AAAI Conference on Artificial Intelligence. 2018.
- [23] Mayfield E, Rosé C. LightSIDE: Open Source Machine Learning for Text. In: Shermis MD, Burstein J (eds.), Handbook of Automated Essay Evaluation: Current Applications and New Directions, chapter 8, pp. 124-135. Routledge, New York, 2013.
- [24] Elder C, Barkhuizen G, Knoch U, von Randow J. Evaluating rater responses to an online training program for L2 writing assessment. Language Testing, 2007. 24(1):37-64. doi:10.1177/0265532207071511.
- [25] Engelhard GJ. Examining Rater Errors in the Assessment of Written Composition with a Many-Faceted Rasch Model. Journal of Educational Measurement, 1994. 31(2):93-112. doi:10.1111/j.1745-3984.1994.tb00436.x.
- [26] Eckes T. Rater types in writing performance assessments: A classification approach to rater variability, volume 25. 2008. doi:10.1177/0265532207086780.
- [27] Rezaei AR, Lovorn M. Reliability and validity of rubrics for assessment through writing. Assessing Writing, 2010. 15(1):18-39. doi:10.1016/j.asw.2010.01.003.
- [28] Štrumbelj E, Kononenko I. Explaining prediction models and individual predictions with feature contributions. Knowledge and Information Systems, 2014. 41(3):647-665. doi:10.1007/s10115-013-0679-x.
- [29] Abdi H, Williams LJ. Principal component analysis. Wiley Interdisciplinary Reviews: Computational Statistics, 2010. 2(4):433-459. doi:10.1002/wics.101.
- [30] Hartigan JA, Wong MA. Algorithm AS 136: A K-Means Clustering Algorithm. Journal of the Royal Statistical Society. Series C (Applied Statistics), 1979. 28(1):100-108. doi:10.1890/11-0206.1.
- [31] Lloyd SP. Least squares quantization in PCM. Technical Report RR-5497. Technical report, Bell Lab, 1957.
- [32] MacQueen J. Some methods for classification and analysis of multivariate observations. In: Proceedings of the Fifth Berkeley Symposium on Mathematical Statistics and Probability. 1967 pp. 281-297.
- [33] Kononenko I. Estimating attributes: Analysis and extensions of RELIEF. In: Proceedings of European Conference on Machine Learning (ECML’94). Catania, Italy, 1994 pp. 171-182. doi:10.1007/3-540-57868-4_57.
- [34] Gini C. Variabilità e Mutuabilità. Tipografia di Paolo Cuppini, Bologna, Italy, 1912.
- [35] Roobaert D, Karakoulas G, Chawla N. Information Gain, Correlation and Support Vector Machines. In: Guyon I, Nikravesh M, Gunn S, Zadeh LA (eds.), Feature Extraction: Foundations and Applications, volume 470, chapter 22, pp. 463-470. Springer Berlin Heidelberg, 2006. doi:10.1007/978-3-540-35488-8_23.
- [36] Dunn JC. Well-Separated Clusters and Optimal Fuzzy Partitions. Journal of Cybernetics, 1974. 4(1):95-104. doi:10.1080/01969727408546059.
- [37] Fowlkes EB, Mallows CL. A Method for Comparing Two Hierarchical Clusterings. Journal of the American Statistical Association, 1983. 78(383):553-569. doi:10.1080/01621459.1983.10478008.
- [38] Zupanc K, Bosnić Z. Advances in the field of automated essay evaluation. Informatica, 2015. 39(4):383-395.
- [39] Kanji GK. 100 Statistical Tests. SAGE Publications, London, Thousand Oaks, New Delhi, 3rd edition, 2006. ISBN-10:9788178297316, 13:978-8178297316.
- [40] Hyvarinen A. Independent component analysis: recent advances. Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences, 2013. 371(1984):20110534-20110534. doi:10.1098/rsta.2011.0534.
- [41] Lee DD, Seung HS. Algorithms for Non-negative Matrix Factorization. In: NIPS. MIT Press, 2000 pp. 556-562.
Notes
Record prepared with funds from MNiSW (the Polish Ministry of Science and Higher Education), agreement No. 461252, under the programme "Społeczna odpowiedzialność nauki" (Social Responsibility of Science), module: Popularisation of science and promotion of sport (2020).
Document type
Bibliography
YADDA identifier
bwmeta1.element.baztech-579f478e-5f14-45af-a7ce-90cba0358072