NATURAL LANGUAGE PROCESSING (NLP) APPLICATIONS FOR ERROR ANALYSIS IN LEARNING INDONESIAN FOR FOREIGN SPEAKERS (BIPA)

Keywords: BIPA; Error Analysis; Natural Language Processing

December 24, 2025

Growing global demand for Indonesian as a foreign language (BIPA) calls for systematic, scalable error analysis to guide pedagogical interventions, a task that manual correction cannot support at scale. This study aimed to develop and validate a specialized Natural Language Processing (NLP) framework that automatically classifies linguistic errors in BIPA learners' written output and produces a statistically generalizable error map for curriculum reform. The research used a corpus-based, developmental design: a BIPA-Optimized NLP Error Classification Pipeline was built and validated on a corpus of over 500,000 words. The model achieved an F1-score of 0.89. The analysis found a high error density (7.2 errors per 100 words), with Affix Misapplication the most persistent category (45% of all errors). Crucially, ANOVA showed no significant reduction in these errors across proficiency levels (p = 0.316), suggesting that exposure alone is insufficient to resolve them. The study concludes that the NLP pipeline provides the first objective diagnostic standard for BIPA pedagogy and that the difficulty is structural rather than a matter of limited exposure. These findings call for a shift toward systematic, targeted remediation focused on the most persistent error sub-types, enabling evidence-based curriculum development.
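The three summary figures reported above (error density per 100 words, the share of errors in each category, and the F1-score) can be illustrated with a minimal Python sketch. It is not the study's pipeline: the function names, the category labels, and all counts below are hypothetical toy values chosen only to show the arithmetic.

```python
from collections import Counter


def error_density(num_errors: int, num_words: int) -> float:
    """Return the number of errors per 100 words."""
    if num_words <= 0:
        raise ValueError("num_words must be positive")
    return 100.0 * num_errors / num_words


def category_shares(labels: list[str]) -> dict[str, float]:
    """Return each error category's fraction of all labelled errors."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {category: n / total for category, n in counts.items()}


def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the harmonic mean of precision and recall."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy data (hypothetical, not taken from the study's corpus).
toy_labels = ["affix"] * 5 + ["word_order"] * 3 + ["spelling"] * 2
toy_word_count = 200

density = error_density(len(toy_labels), toy_word_count)
shares = category_shares(toy_labels)
f1 = f1_score(tp=8, fp=2, fn=2)
```

With these toy inputs, 10 errors in 200 words give a density of 5 errors per 100 words, and the "affix" category accounts for half of the labelled errors. The same calculation applied to a real annotated corpus would yield figures comparable to the 7.2 per 100 words and 45% reported in the abstract.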