I’m an AI researcher at Hugging Face, leading our evaluation efforts and collabs (on LLMs/agents). The OpenEvals team maintains lighteval and the evaluation guidebook, and builds cool evaluations or helps the community build them. We previously worked on the Open LLM Leaderboard.
On the side, I give a hand to our AI for good/AI for science initiatives.
I enjoy programming, making science open and understandable, books, and delicious food. My motto would likely be: “So much to do, so little time”.
You’ll find me as clefourrier over the web (Twitter, LinkedIn, BlueSky, …), or you can reach me at myfirstname at 🤗 dot co. Open to both collabs and mentoring, within available bandwidth.
If you want a fast answer, better make it short and to the point ^^
Apr 2025: 🗞️ BusinessInsider: Figuring out which AI model is right for you is harder than you think
Apr 2025: 🗞️ VentureBeat: Beyond generic benchmarks: How Yourbench lets enterprises evaluate AI models against actual data
Apr 2025: 📜 Arxiv: YourBench: Easy Custom Evaluation Sets for Everyone
Mar 2025: 🎤 CNRS NLP working group: Panorama of LLM evaluations ✨
Mar 2025: 📝 Blog: Fixing the Open LLM Leaderboard with Math-Verify
Mar 2025: 🗞️ Epsiloon Magazine: IA: le quiz ultime
Feb 2025: 🗞️ France 24 TV: Tech24 on AI
Feb 2025: 🗞️ French AI Summit Conclusions: French LLM Leaderboard showcase
Feb 2025: ⭐ Finalist of the 2025 French Innovators Awards, AI section: 100 French scientists whose research changes our lives, by the French journal Le Point
Feb 2025: 📜 Arxiv: SmolLM2: When Smol Goes Big – Data-Centric Training of a Small Language Model
Jan 2025: 📝 Blog: CO2 emissions and model performance: Insights from the Open LLM Leaderboard
Dec 2024: 📜 Arxiv: Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation
Oct 2024: ⚙️ Release: Evaluation Guidebook ✨
Jul 2024: 🗞️ The Economist: How to tell which AI model is best
Jul 2024: 🎧 Latent Space Benchmarks 201 ✨
Jun 2024: 🗞️ VentureBeat: Hugging Face’s updated leaderboard shakes up the AI evaluation game
Jun 2024: 🗞️ La Tribune: Pour contrer la crise de l’évaluation des IA, Hugging Face rehausse les exigences
Jun 2024: 📝 Blog: Performances are plateauing, let’s make the leaderboard steep again
May 2024: 📝 Blog: Let’s talk about LLM evaluation
May 2024: ⭐ Invited to France’s top AI talents gathering at Elysée Event
May 2024: 📜 ICLR: GAIA: a benchmark for General AI Assistants ✨
Apr 2024: 🗞️ La Recherche: 2023, l’année des grands modèles de langue ouverts
Apr 2024: 🗞️ TechCrunch: Hugging Face releases a benchmark for testing generative AI on health tasks
Apr 2024: 📜 Arxiv: The Hallucinations Leaderboard – An Open Effort to Measure Hallucinations in Large Language Models
Feb 2024: ⚙️ Release: Lighteval ✨
Dec 2023: 📝 Blog: 2023, year of Open LLMs ✨
Dec 2023: 📝 Blog: Open LLM Leaderboard: DROP deep dive
Oct 2023: 📜 Arxiv: Zephyr: Direct Distillation of LM Alignment
Jun 2023: 📝 Blog: What’s going on with the Open LLM Leaderboard?
Apr 2023: ⚙️ Release: Open LLM Leaderboard
Mar 2023: 🎧 Parlons Tech: L’IA générative à la Loupe
Jan 2023: 📝 Blog: Introduction to Graph Machine Learning
Nov 2022: 📜 Arxiv: Bloom: A 176b-parameter open-access multilingual language model
Oct 2022: 📜 PhD: Neural Approaches to Historical Word Reconstruction
May 2022: 📜 ACL: Probing Multilingual Cognate Prediction Models ✨
Apr 2022: 📜 Arxiv: Entities, Dates, and Languages: Zero-Shot on Historical Texts with T0
Aug 2021: 📜 ACL: Can Cognate Prediction Be Modelled as a Low-Resource Machine Translation Task?
May 2020: 📜 LREC: Methodological Aspects of Developing and Managing an Etymological Lexical Resource: Introducing EtymDB-2.0
Feb 2020: 📜 Arxiv: The Alzheimer’s Disease Prediction Of Longitudinal Evolution (TADPOLE) Challenge: Results after 1 Year Follow-up
atm, evaluation of LLMs and agents (2023-now)
graph machine learning (2022)
reconstructing dead languages using neural networks (2019-2022)
neurodegenerative disease prediction from longitudinal data (2017-2018)
using 3D meshes and grids for geology and structural modeling (2014-2015)
It’s likely I’ll learn new things again! (robotics maybe? ¯\_(ツ)_/¯ )
This site is deliberately static and very lightweight, for ecology, accessibility, and this. I use markdown and pandoc. My logo was made by Alix Chagué.