Freiburg Science Benchmark

About the project

We are a group of computer scientists from the University of Freiburg's Machine Learning Lab, and we are launching an ambitious project to establish a new benchmark for Large Language Models (LLMs) such as ChatGPT in the sciences. We invite you to join us, contribute to this pioneering initiative, and make a significant impact.

Our main objective is to create a comprehensive, expert-based benchmark for LLMs in science, covering difficulty levels from undergraduate to postdoctoral questions. The benchmark is intended to evaluate and improve the capabilities of state-of-the-art models for science applications such as summarizing papers, generating hypotheses, aiding data analysis, and facilitating interdisciplinary insights. We aim to publish the benchmark in a top-tier journal.
