Andrea Santilli

Senior Research Engineer

NVIDIA

Biography

I am a Senior Research Engineer at NVIDIA, where I work on Large Language Model (LLM) evaluation. I hold a PhD in Computer Science from GLADIA at Sapienza University of Rome; my doctoral research focused on building effective, efficient, and reliable LLMs. Previously, I was a Research Scientist at Nous Research and at Apple (MLR team). See Experience below for the full list.

My current research focuses on the evaluation of Large Language Models. In the past, my work has spanned syntax in transformers (KERMIT), efficient decoding (Parallel Jacobi Decoding, which roughly doubles decoding speed and has been adopted by LMSYS), instruction tuning for LLMs (now a standard step in modern training pipelines), and LLM robustness and reliability (pub1, pub2). I have also explored instruction tuning for Italian (Camoscio), privacy in LLMs, audio LLMs, and multimodal neural databases. A full list of publications is available on my Google Scholar profile.

In the news: The BigScience project, to which I contributed, has been covered by outlets such as MIT Technology Review and The Washington Post. I was also featured in La Repubblica as one of the “500 Italians who matter in AI” (article in Italian). More recently, our work on LLM injectivity (aka the Pringle paper) received broad attention with roughly 5 million views!

If you would like to connect, feel free to reach out on X, LinkedIn, or through the contact form below.

Interests
  • Large Language Models
  • Natural Language Processing
  • Representation Learning
Education
  • PhD in Computer Science, 2025

    Sapienza University of Rome

  • MSc in Computer Science, 2020

    University of Roma Tor Vergata

  • BSc in Computer Science, 2018

    University of Roma Tor Vergata

Experience

NVIDIA
Senior Research Engineer
Mar 2026 – Present Zurich
Conducting research on Large Language Models evaluation within the Frontier AI Evaluation team.
Nous Research
Research Scientist
Mar 2025 – Jun 2025 Remote
Conducting post-training research on LLMs with a focus on enhancing their robustness, reliability, and alignment.
Apple
MLR Research Scientist
Apr 2024 – Oct 2024 Barcelona
Researched robustness and reliability of foundation models through uncertainty estimation in the MLR group, resulting in publications at ACL 2025 (Main) and the NeurIPS Safe Generative AI Workshop 2024.
BigScience - Hugging Face
Open Science Researcher
Jun 2021 – Jun 2022 Remote
Researcher in Hugging Face's BigScience workshop on large language models. Worked in the prompt-engineering working group, which introduced the now-popular instruction-tuning training paradigm. Three publications: T0, BLOOM, PromptSource.

Publications

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
TMLR
Language models demonstrate both quantitative improvement and new qualitative capabilities with increasing scale. Despite their …

Grants Awarded

Activation-Level Control for Reliable LLM Behavior
This project advances methods for making AI behavior more predictable and reliable under targeted interventions. It addresses a key safety challenge: reducing harmful or undesirable behaviors without introducing unintended side effects or degrading core capabilities. The work aims to improve the precision of behavioral control in advanced AI systems and to provide evaluation tools that clarify when interventions can be trusted to generalize. Status: In progress. Budget: $80,000
Our project on efficient Machine Translation (MT) was selected as the winner of the category 'Machine Learning Algorithms For Translation' among proposals submitted by world experts and professors (7% acceptance rate). We developed a novel decoding algorithm that speeds up autoregressive transformers by up to 2x and published the results at ACL 2023. PI: Andrea Santilli. Budget: €20,000
Multimodal Artificial Intelligence for 3D shape analysis, modeling and applications
Joint project on multimodal 3D and NLP applications between our research group GLADIA at Sapienza and Maks Ovsjanikov's group at Ecole Polytechnique. PI: Simone Melzi, Maks Ovsjanikov. Budget: €10,000

Contact