DIEP seminar by Merijn Moody
A major debate in the philosophy of Artificial Intelligence is whether LLMs and other AI systems are merely "stochastic parrots" repeating training data, or whether they are capable of genuine creativity and understanding. Although Large Language Models are increasingly used in research, current benchmarks often fail to distinguish a model that truly understands physics and can be creative from one that is simply memorizing facts. This talk proposes a new benchmark framework for measuring the scientific understanding and creativity of LLMs. In this framework, models are tested on tasks ranging from standard textbook problems to complex coding challenges, such as the classification of particle collision events. Finally, a collaborative approach is proposed for continuously generating new tasks, so that the benchmark remains relevant over time.
Merijn Moody joined DIEP in October 2024 as a FAEME PhD student. He is developing a mathematical framework to identify and characterize emergent information structures in multivariate discrete data. Concretely, he is connecting high-order spin models (a complete family of statistical models for discrete data) with approaches from graph theory. In particular, he is investigating a connection between the partition function of high-order spin models and the Tutte polynomial on matroids.
If you wish to attend this seminar online, please send an email to r.lier@uva.nl to receive the Zoom link.