Andrea Santilli
Publications
Language Models are Injective and Hence Invertible
ICLR 2026
Transformer components such as non-linear activations and normalization are inherently non-injective, suggesting that different inputs …
Giorgos Nikolaou, Tommaso Mencattini, Donato Crisostomi, Andrea Santilli, Yannis Panagakis, Emanuele Rodolà
Revisiting Uncertainty Quantification Evaluation in Language Models: Spurious Interactions with Response Length Bias Results
ACL 2025 (Main)
Uncertainty Quantification (UQ) in Language Models (LMs) is key to improving their safety and reliability. Evaluations often use …
Andrea Santilli, Adam Golinski, Michael Kirchhof, Federico Danieli, Arno Blaas, Miao Xiong, Luca Zappella, Sinead Williamson
Mergenetic: a Simple Evolutionary Model Merging Library
ACL 2025 (Demo)
Model merging allows combining the capabilities of existing models into a new one—post hoc, without additional training. This has made …
Adrian Robert Minut, Tommaso Mencattini, Andrea Santilli, Donato Crisostomi, Emanuele Rodolà
MERGE3: Efficient Evolutionary Merging on Consumer-grade GPUs
ICML 2025
Evolutionary model merging enables the creation of high-performing multi-task models but remains computationally prohibitive for …
Tommaso Mencattini, Adrian Robert Minut, Donato Crisostomi, Andrea Santilli, Emanuele Rodolà
Escaping Plato's Cave: Towards the Alignment of 3D and Text Latent Spaces
CVPR 2025
Recent works have shown that, when trained at scale, uni-modal 2D vision and text encoders converge to learned features that share …
Souhail Hadgi, Luca Moschella, Andrea Santilli, Diego Gomez, Qixing Huang, Emanuele Rodolà, Simone Melzi, Maks Ovsjanikov
Preserving Privacy in Large Language Models: A Survey on Current Threats and Solutions
TMLR
Large Language Models (LLMs) represent a significant advancement in artificial intelligence, finding applications across various …
Michele Miranda, Elena Sofia Ruzzetti, Andrea Santilli, Fabio Massimo Zanzotto, Sébastien Bratières, Emanuele Rodolà
Camoscio: An Italian Instruction-tuned LLaMA
CLiC-it 2023 - 🏆 Best Student Paper
In recent years Large Language Models have improved the state of the art on several natural language processing tasks. However, their …
Andrea Santilli, Emanuele Rodolà
Accelerating Transformer Inference for Translation via Parallel Decoding
ACL 2023
Autoregressive decoding limits the efficiency of transformers for Machine Translation (MT). The community proposed specific network …
Andrea Santilli, Silvio Severino, Emilian Postolache, Valentino Maiorca, Michele Mancusi, Riccardo Marin, Emanuele Rodolà
Multimodal Neural Databases
SIGIR 2023
The rise in loosely-structured data available through text, images, and other modalities has called for new ways of querying them. …
Giovanni Trappolini, Andrea Santilli, Emanuele Rodolà, Alon Halevy, Fabrizio Silvestri
Latent Autoregressive Source Separation
AAAI 2023
Autoregressive models have achieved impressive results over a wide range of domains in terms of generation quality and downstream task …
Emilian Postolache, Giorgio Mariani, Michele Mancusi, Andrea Santilli, Luca Cosmo, Emanuele Rodolà
BLOOM: A 176B-Parameter Open-Access Multilingual Language Model
arXiv
Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language …
BIG-Science contributors including Andrea Santilli
PromptSource: An Integrated Development Environment and Repository for Natural Language Prompts
ACL 2022
PromptSource is a system for creating, sharing, and using natural language prompts. Prompts are functions that map an example from a …
Stephen Bach, Victor Sanh, Zheng Xin Yong, Albert Webson, Colin Raffel, BIG-Science contributors including Andrea Santilli
ACL 2022 Demo
Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
TMLR
Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their …
BIG-bench contributors including Andrea Santilli, Antonio Norelli, Emanuele Rodolà, Giambattista Parascandolo, Giorgio Mariani, Luca Moschella, Simone Melzi
Multitask Prompted Training Enables Zero-Shot Task Generalization
ICLR 2022
Large language models have recently been shown to attain reasonable zero-shot generalization on a diverse set of tasks (Brown et al., …
Victor Sanh, Albert Webson, Colin Raffel, Stephen H. Bach, BIG-Science contributors including Andrea Santilli
ICLR 2022 (Oral)
KERMIT: Complementing Transformer Architectures with Encoders of Explicit Syntactic Interpretations
EMNLP 2020
Syntactic parsers have dominated natural language understanding for decades. Yet, their syntactic interpretations are losing centrality …
Fabio Massimo Zanzotto, Andrea Santilli, Leonardo Ranaldi, Dario Onorati, Pierfrancesco Tommasino, Francesca Fallucchi
Unsupervised Source Separation via Bayesian Inference in the Latent Domain
arXiv preprint
State of the art audio source separation models rely on supervised data-driven approaches, which can be expensive in terms of labeling …
Michele Mancusi, Emilian Postolache, Giorgio Mariani, Marco Fumero, Andrea Santilli, Luca Cosmo, Emanuele Rodolà
Explanatory Learning: Beyond Empiricism in Neural Networks
arXiv preprint
We introduce Explanatory Learning (EL), a framework to let machines use existing knowledge buried in symbolic sequences – e.g. …
Antonio Norelli, Giorgio Mariani, Luca Moschella, Andrea Santilli, Giambattista Parascandolo, Simone Melzi, Emanuele Rodolà
A Kernel-based Approach for Irony and Sarcasm Detection in Italian
EVALITA 2018
This paper describes the UNITOR system that participated in the Irony Detection in Italian Tweets task (IronITA) within the context of …
Andrea Santilli, Danilo Croce, Roberto Basili
SyntNN at SemEval-2018 Task 2: is Syntax Useful for Emoji Prediction? Embedding Syntactic Trees in Multi Layer Perceptrons
SemEval 2018
In this paper, we present SyntNN as a way to include traditional syntactic models in multilayer neural networks used in the task of …
Andrea Santilli, Fabio Massimo Zanzotto