Introducing the LEVANTE core tasks: open, cross-cultural measures of learning and development (ages 5–12)
Dr. Fionnuala O’Reilly & Prof. Michael C. Frank
January 14, 2026
Capturing how children learn and develop is difficult, especially at scale. If you're tracking growth over time, evaluating an intervention, or testing how a child's environment shapes development, you need measures that are efficient, reliable, valid, and usable across ages and contexts. In reality, the field is full of compromises: tasks that only work for a narrow age band, measures that don't translate cleanly across languages or cultures, and "gold standard" tools that sit behind licenses and are expensive to access. That's the gap we're trying to close with LEVANTE.
Today we’re sharing a preprint of our paper introducing the LEVANTE core tasks: a suite of nine psychometrically grounded behavioral tasks designed for children aged 5–12, covering language, mathematics, reasoning, executive function, and social cognition. Our aim is to provide a set of measures that can act as a shared yardstick for developmental research—across childhood, across contexts, and across cultures.
LEVANTE was designed to tackle these three challenges by building and validating a set of measures that work across the full 5–12 age range, transfer across languages and cultures, and are fully open access.
In this paper, we describe how we selected and adapted well-established tasks from the literature, re-implemented them on an open-source web platform, and evaluated initial feasibility, reliability, and validity using pilot data. Psychometric models based on item-response theory (IRT) are a central component of our approach: they allow us to calibrate item difficulty and estimate children’s ability on a common scale, enabling direct comparison across ages, sites, and administration modes.
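To make the IRT idea concrete, here is a minimal sketch of the two-parameter logistic (2PL) model, one standard IRT formulation. This is an illustration of the general technique, not LEVANTE's actual implementation; the function names are ours.

```python
import math

def p_correct(theta: float, a: float, b: float) -> float:
    """2PL IRT model: probability that a child with ability `theta`
    answers an item with discrimination `a` and difficulty `b` correctly.

    When theta == b, the probability is exactly 0.5; higher ability
    (or an easier item) pushes the probability toward 1.
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))
```

Because every child and every item sits on the same latent scale, a score estimated from one subset of items is directly comparable to a score estimated from another, which is what enables comparison across ages, sites, and administration modes.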
Our pilot phase included three sites on three continents and three administration modes (in-school, in-lab, and at-home). We present initial evidence from the nine tasks indicating that they (i) capture expected developmental change across childhood, (ii) produce scores that differentiate children across a range of ability levels (with performance patterns varying by task), and (iii) show early validity through associations with other measures in theoretically sensible directions.

A practical strength of LEVANTE is its use of computer adaptive testing (CAT): children are presented with items targeted to their ability level, allowing us to estimate skill level efficiently, often in just a few minutes, while maintaining accurate measurement across a range of abilities.

Although we use psychometric models to place task performance on a common scale, LEVANTE is not designed to estimate absolute differences between sites. As outlined above, sites differ in sampling, recruitment, and administration, so apparent performance gaps would be highly confounded and therefore uninterpretable.
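A common way CAT achieves this efficiency is maximum-information item selection: after each response, choose the unadministered item that is most informative at the current ability estimate. The sketch below shows that selection rule for 2PL items; it is a generic illustration under our own naming, not LEVANTE's production code.

```python
import math

def p_correct(theta: float, a: float, b: float) -> float:
    # 2PL IRT response probability (see the model above)
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta: float, a: float, b: float) -> float:
    # Fisher information of a 2PL item: a^2 * p * (1 - p).
    # Largest when the item's difficulty matches the child's ability.
    p = p_correct(theta, a, b)
    return a * a * p * (1.0 - p)

def next_item(theta: float, bank: list, administered: set) -> int:
    """Return the index of the most informative unadministered item.

    `bank` is a list of (discrimination, difficulty) tuples;
    `administered` holds indices of items already shown.
    """
    candidates = [i for i in range(len(bank)) if i not in administered]
    return max(candidates, key=lambda i: item_information(theta, *bank[i]))
```

Because each selected item is pitched near the child's current estimate, far fewer items are needed to reach a given measurement precision than with a fixed test form.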
The nine core tasks are still in development, and our approach is to centre psychometrics throughout design and refinement: we continuously monitor performance, systematically refine items, and share both the tools and the evidence as they evolve. Two near-term directions we’re pursuing are:
If you’d like to use LEVANTE or join our network, please get in touch.