AI in the Corner: Evaluating Literary Translation with Artificial Intelligence in Higher Education Setting

Boualem BENGHALEM

Authors

Boualem BENGHALEM University of Ain Temouchent – Algeria https://orcid.org/0000-0002-6973-7655

Keywords:

artificial intelligence, literary translation, assessment, higher education, GPT-4

Abstract

This study investigates the effectiveness of artificial intelligence (AI) in evaluating literary translations within a higher education context. Literary translation, by nature, involves subjective judgment, cultural sensitivity, and stylistic interpretation, elements that resist standardized assessment. With growing interest in AI-driven educational tools, this study explores whether AI can serve as a reliable evaluator in translation pedagogy. Nine Master’s students specialising in literature and civilization were asked to translate Mahmoud Darwish’s poem كمقهى صغير هو الحب [Like a Small Café, That’s Love] from Arabic to English using various translation techniques studied in class. Each translation was first assessed by a human instructor and then evaluated using GPT-4-mini, a large language model developed by OpenAI. The AI was prompted to assign scores based on four criteria: accuracy, fluency and style, completeness, and grammar and mechanics. Results showed that although the AI consistently assigned lower scores (on average 7.1 points lower than the instructor), it maintained a moderate positive correlation (r = 0.64) with the instructor’s rankings, indicating relative reliability in differentiating performance. However, issues such as OCR errors and conservative scoring highlighted limitations in using AI for holistic literary assessment. The findings suggest that AI tools, when properly calibrated and used under human oversight, can enhance efficiency and provide formative feedback in translation instruction. Nevertheless, they are not yet suitable as standalone grading solutions for creative, interpretive tasks.