Tackling the Reproducibility Problem in Biomolecular Simulation

Biomolecular simulations are vital for understanding how biological molecules behave, with applications spanning drug discovery, disease research, and personalised medicine. Yet, despite their importance, the field has long faced a reproducibility crisis. Too often, only the final results of simulations are published and shared, while the detailed methods, parameters, and steps that generated them remain undocumented. This makes it difficult to verify studies, prevents researchers from building on one another’s work, and often leads to duplication of effort. 

The COVID-19 pandemic highlighted the urgency of this issue. When UK groups attempted to pool simulation resources to accelerate drug discovery, they quickly discovered that missing information and a lack of shared infrastructure made collaboration nearly impossible. Without a cultural and technical shift, the community risked repeating the same mistakes in the face of future challenges. 

Laying the Foundations for Change

CCPBioSim, funded by UKRI-DRI and EPSRC and supported by STFC, has long worked to make biomolecular simulation more accessible by providing training, supporting software development, and building an inclusive research community. Under the leadership of Dr James Gebbie-Rayet and Dr Jas Kalayan, the team launched BioSimDR: pioneering tools and services designed to capture the full provenance of biomolecular simulations and share it with the wider research community.

The provenance tools automatically record every step of a simulation, from experimental structures and force field configurations to computational parameters, ensuring results are not just shared but also reproducible and reusable. This structured approach is creating a new research culture where simulations can be trusted, compared, and reused at scale, supporting everything from AI-driven drug discovery to advanced molecular modelling.
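To make the idea of a provenance record concrete, the following is a minimal sketch in Python. It is not the format or API used by the CCPBioSim tools: the `ProvenanceRecord` class, its field names, and the placeholder values are all hypothetical, chosen only to mirror the kinds of information mentioned above (structure source, force field, computational parameters, and the ordered steps of a run).

```python
import hashlib
import json
from dataclasses import asdict, dataclass, field


@dataclass
class ProvenanceRecord:
    """Hypothetical record of the inputs behind one simulation run."""

    structure_source: str  # identifier for the starting experimental structure
    force_field: str       # name of the force field configuration
    parameters: dict = field(default_factory=dict)  # computational settings
    steps: list = field(default_factory=list)       # ordered log of actions

    def add_step(self, description: str) -> None:
        """Append one workflow step, preserving the order it was performed."""
        self.steps.append(description)

    def to_json(self) -> str:
        """Serialise with sorted keys so identical records give identical text."""
        return json.dumps(asdict(self), sort_keys=True, indent=2)

    def fingerprint(self) -> str:
        """SHA-256 of the serialised record, usable to compare two runs."""
        return hashlib.sha256(self.to_json().encode("utf-8")).hexdigest()


# Placeholder values for illustration only; they do not refer to real entries.
record = ProvenanceRecord(
    structure_source="example-structure-id",
    force_field="example-force-field",
    parameters={"timestep_fs": 2.0, "temperature_K": 300.0},
)
record.add_step("prepare system")
record.add_step("run production simulation")

print(record.to_json())
print(record.fingerprint())
```

Under these assumptions, two runs described by the same structure, force field, parameters, and steps produce the same fingerprint, while changing any single setting produces a different one. That is one simple way a record could support comparing and verifying simulations.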

Partnership with PSDI

The collaboration between CCPBioSim and PSDI began in the pilot phase in 2022, when CCPBioSim joined as one of the original demonstrator projects. CCPBioSim’s long-standing expertise in biomolecular simulation, combined with PSDI’s emerging platform, made for a powerful partnership that demonstrated both the demand for and the possibilities of national-scale data infrastructure.

Since then, PSDI has been instrumental in helping the CCPBioSim pathfinder to expand from local efforts into resources that support the wider community. Together, they have developed and hosted a suite of tools and databases on the PSDI platform that capture the full provenance of biomolecular simulations, ensuring every step, parameter, and configuration can be traced, reused, and trusted.

Key PSDI-Hosted Resources

PSDI’s national infrastructure has given CCPBioSim the technical foundations and visibility to connect their community-driven workflows with international repositories and standards. By hosting BioSimDB and related tools, PSDI ensures that CCPBioSim’s work is not only accessible to UK researchers but also positioned to influence global practice in biomolecular data sharing. 

The partnership continues to grow, with ongoing work on metadata schemas, APIs for seamless data upload, and direct integration of CCPBioSim resources into the PSDI ecosystem, making biomolecular simulations more reproducible, transparent, and impactful than ever before.

Community Engagement and Training

Community involvement has been central to this collaboration. In September 2024, CCPBioSim held an in-person workshop in Sheffield, supported by PSDI, with training materials made openly available through cloud infrastructure. These resources have since been accessed in over 700 unique sessions.

The team also runs a popular webinar series, bringing together industry and academia to explore the latest developments in biomolecular simulation data. These events not only showcase progress but also gather feedback that shapes the tools and resources being developed. 

Looking Ahead

The partnership is now evolving to connect with large European initiatives, ensuring that UK simulation workflows are embedded within continental-scale infrastructure. In the near term, the focus is on finalising metadata standards, expanding BioSimDB, and enabling AI-driven approaches by ensuring data is both abundant and high quality. 
