Development of an Indonesian-English Parallel Corpus for Translation and Comparative Linguistics Research

Computational Linguistics, Comparative Linguistics, Indonesian-English

May 15, 2025

Parallel corpora play a crucial role in translation studies, computational linguistics, and comparative linguistics. Although substantial parallel corpora have been developed for major languages such as English, comparable resources for Indonesian-English translation research remain limited. This study develops a comprehensive Indonesian-English parallel corpus designed to support translation research and linguistic comparison between the two languages. The corpus is intended as a foundational resource for further studies on machine translation, linguistic patterns, and cross-linguistic influence. The research adopts a corpus-driven methodology: the corpus is compiled from diverse sources, including literary texts, news articles, academic papers, and everyday discourse, to ensure broad representation of language use. It is annotated for both syntax and semantics, with a focus on aligning sentence structures and identifying key linguistic features in both languages. Analysis of the corpus reveals significant differences and similarities between Indonesian and English in sentence structure, word order, and translation equivalence. The findings highlight the potential of the corpus to support a range of linguistic research and translation studies. The corpus can serve as a valuable tool for improving the quality of machine translation systems, and it offers insight into the challenges of translating between Indonesian and English.