Abstract

Cruciverb-IT is the first shared task on Italian crossword solving, held at EVALITA 2026. The task comprises two subtasks: (1) answering individual crossword clues given the expected answer length, and (2) autonomously solving complete crossword grids of varying sizes. We release a dataset of approximately 410,000 Italian clue-answer pairs, along with automatically generated crossword grids ranging in size from 5×5 to 13×13. Five teams participated in the evaluation, submitting a total of 17 system runs. The best-performing system on Subtask 1 achieved 69% accuracy at rank 1 and an MRR of 0.72 using a retrieval-augmented LLM approach. The top system on Subtask 2 reached an average character accuracy of 92% and fully solved 34% of grids, using a fine-tuned encoder-decoder model paired with a constraint-driven depth-first search and ranking heuristics. The results show that, while modern approaches perform strongly on individual clues and smaller grids, solving larger crosswords remains an open problem: full-match performance decreases rapidly for grids larger than 5×5.
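
For readers unfamiliar with the metrics named above, the following is a minimal sketch of how accuracy at rank 1, MRR, character accuracy, and full match are commonly computed. It is not the task's official scorer: the function names, the input layout (one ranked candidate list per clue, grids as lists of row strings), and the use of `#` for black cells are all assumptions made for illustration.

```python
from typing import Sequence


def accuracy_at_1(ranked: Sequence[Sequence[str]], gold: Sequence[str]) -> float:
    """Fraction of clues whose top-ranked candidate equals the gold answer."""
    hits = sum(1 for cands, g in zip(ranked, gold) if cands and cands[0] == g)
    return hits / len(gold)


def mean_reciprocal_rank(ranked: Sequence[Sequence[str]], gold: Sequence[str]) -> float:
    """Mean of 1/rank of the gold answer; a clue contributes 0 if the answer is absent."""
    total = 0.0
    for cands, g in zip(ranked, gold):
        cands = list(cands)
        if g in cands:
            total += 1.0 / (cands.index(g) + 1)
    return total / len(gold)


def char_accuracy(pred_grid: Sequence[str], gold_grid: Sequence[str]) -> float:
    """Fraction of letter cells filled correctly; '#' marks a black cell (assumed)."""
    cells = [
        (p, g)
        for pred_row, gold_row in zip(pred_grid, gold_grid)
        for p, g in zip(pred_row, gold_row)
        if g != "#"
    ]
    return sum(p == g for p, g in cells) / len(cells)


def full_match(pred_grid: Sequence[str], gold_grid: Sequence[str]) -> bool:
    """True only if every cell of the predicted grid matches the gold grid."""
    return list(pred_grid) == list(gold_grid)


# Toy data, invented for illustration only.
ranked = [["ROMA", "RIMA"], ["PANE", "CANE"], ["SOLE"]]
gold = ["ROMA", "CANE", "LUNA"]
acc1 = accuracy_at_1(ranked, gold)
mrr = mean_reciprocal_rank(ranked, gold)
char_acc = char_accuracy(["CA#", "AN#"], ["CA#", "AT#"])
solved = full_match(["CA#", "AN#"], ["CA#", "AT#"])
```

Under these definitions, a grid can score high character accuracy while still failing the full-match criterion, since a single wrong cell is enough to break the match. This is consistent with the gap reported above between 92% character accuracy and 34% fully solved grids.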


Citation
@inproceedings{cruciverb-it2026,
    title = {Cruciverb-{IT} @ {EVALITA} 2026: {O}verview of the {C}rossword {S}olving in {I}talian {T}ask},
    author = {Ciaccio, Cristiano and Sarti, Gabriele and Miaschi, Alessio and Dell'Orletta, Felice and Nissim, Malvina},
    booktitle = {Proceedings of the {N}inth {E}valuation {C}ampaign of {N}atural {L}anguage {P}rocessing and {S}peech {T}ools for {I}talian. {F}inal {W}orkshop ({EVALITA} 2026)},
    publisher = {CEUR.org},
    year = {2026},
    month = {February},
    address = {Bari, Italy}
}