I’m an AI researcher at Hugging Face, leading our LLM/agents evaluation efforts and collabs. My team and I maintain lighteval and the evaluation guidebook (resources for efficient LLM evaluation), and we build evaluations and help the community build their own. We previously worked on the Open LLM Leaderboard (11K open source models evaluated over 2 years). On the side, I give a hand to our AI for good initiatives.
Before Hugging Face, I did a PhD from 2019 to 2022 at Inria Paris, after working as a software engineer from 2015 to 2019.
I enjoy programming, and am interested in making science open and understandable, books, and delicious food. My motto would likely be: “So much to do, so little time”.
You’ll find me as clefourrier across the web (Twitter, LinkedIn, Bluesky, …), or you can reach me at myfirstname at 🤗 dot co.
As a main author:
As a contributor:
You’ll find my full publication list on Google Scholar.
Lightweight evaluation library for LLMs. Easy to use and contains most SOTA datasets!
Worldwide rankings and evaluation of open source LLMs, maintained by my team and me from 2023 to 2025. 11K models were evaluated. The repositories and datasets are kept for archival purposes.
Anything you need to know about LLM evaluations. Contains both starter resources and advanced troubleshooting.
Conlang generator. This one dates back to my PhD: it’s a phonetic lexicon generator and sound change applier! Provide your language shape (phonetics and phonotactics) and sound changes, and it will randomly generate a proto-language and its daughter languages according to your specifications!
It’s likely I’ll learn new things again! (robotics maybe? ¯\_(ツ)_/¯ )
This site is deliberately static and lightweight, for ecology, accessibility, and this. I use Markdown and pandoc.