Evaluating Lexical Proficiency in Neural Language Models

Abstract

We present a novel evaluation framework designed to assess the lexical proficiency and linguistic creativity of Transformer-based Language Models (LMs). We validate the framework by analyzing the performance of a set of LMs of different sizes, in both mono- and multilingual configurations, across tasks involving the generation, definition, and contextual usage of lexicalized words, neologisms, and nonce words. To support these evaluations, we developed a novel dataset of lexical entries for the Italian language, including curated definitions and usage examples sourced from various online platforms. The results highlight the robustness and effectiveness of our framework in evaluating multiple dimensions of LMs' linguistic understanding and, through the assessment of their linguistic creativity, offer insight into the lexical generalization abilities of LMs.

Publication
To appear in Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (ACL 2025), Vienna, Austria.
Alessio Miaschi
Full-time researcher (RTD) in Natural Language Processing