Automated test generation and marking using Local LLMs

Publisher

University of the Peloponnese (Πανεπιστήμιο Πελοποννήσου)

Abstract

This case study presents an exam creation and grading system built on Natural Language Processing (NLP) techniques and Llama 3.1. The system generates clear, grammatically correct questions in English and Greek from both short texts and long documents. It supports multiple question formats at several difficulty levels and keeps the generated questions semantically distinct to minimize redundancy. Grading uses a semantic similarity model to evaluate essay and open-ended responses; it awards partial credit and, with the help of Named Entity Recognition (NER), reduces bias caused by differences in phrasing or syntax. A key advantage is that the system runs locally on ordinary personal computers, without requiring specialized AI systems. It also provides feedback on graded responses. Evaluations using ROUGE, BLEU, diversity scores, and cosine similarity indicate that the system is effective and that it outperforms models such as BERT and T5 on educational assessment tasks.
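
The similarity-based grading with partial credit described in the abstract can be illustrated with a minimal sketch. This is not the thesis implementation: the bag-of-words vector is a toy stand-in for whatever embedding model the system uses, and the function names (`bow_vector`, `partial_credit`) and the thresholds `floor` and `full` are hypothetical values chosen only for illustration.

```python
import math
import re
from collections import Counter


def bow_vector(text: str) -> Counter:
    """Toy stand-in for a sentence embedding: lowercase word counts (assumption)."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty answer or reference shares nothing
    return dot / (norm_a * norm_b)


def partial_credit(reference: str, answer: str, max_points: float = 10.0,
                   floor: float = 0.2, full: float = 0.9) -> float:
    """Map similarity to points; floor and full are hypothetical thresholds."""
    sim = cosine_similarity(bow_vector(reference), bow_vector(answer))
    if sim <= floor:
        return 0.0
    if sim >= full:
        return max_points
    # Linear interpolation between the two thresholds gives partial credit.
    return round(max_points * (sim - floor) / (full - floor), 2)
```

Under these assumptions, an answer identical to the reference receives full marks, an unrelated or empty answer receives zero, and an answer that overlaps only in part lands in between. A real system would replace `bow_vector` with dense sentence embeddings so that paraphrases with little word overlap still score highly.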

Description

Μ.Δ.Ε. 74

Creative Commons license