Introduction

Biomolecular simulation addresses complex biological challenges, such as drug discovery and antimicrobial resistance. This case study highlights how inconsistent data storage and sharing practices make biomolecular simulations difficult to reproduce. The project aims to develop tools for capturing simulation provenance, including plugins for the GROMACS and Amber simulation engines, and an online database for storing workflows.

Challenges and Impact

The PSDI BioSim project addresses a critical challenge in biomolecular simulation: transforming the research culture to prioritize reproducibility. Despite its importance, the field often lacks standardized methods and infrastructure to thoroughly document and share the intricate steps of simulation workflows. This gap leaves researchers unaware of existing simulations, leading to duplicated efforts instead of building on prior work.

To solve this, the project focuses on developing data provenance tools that automatically record and archive every step of the simulation process in an accessible and shareable format. By ensuring that both the results and the underlying methods are shared, this initiative fosters transparency, accountability, and collaboration within the community. Ultimately, the PSDI BioSim project aims to make biomolecular research more sustainable by enabling researchers to track, share, and build upon data effectively—preventing the constant reinvention of the wheel.

BioSimDB stands out for its seamless integration into existing workflows, eliminating the need for researchers to manually document methodologies or enter data into separate repositories. By automating data capture and standardizing results behind the scenes, it reduces administrative burdens, allowing scientists to focus on their research. While biomolecular simulations are complex, the physics behind them is often simple; the true challenge lies in tracking the intricate processes involved. BioSimDB addresses this by recording every aspect of a simulation, from experimental structures and force field setups to computational parameters. This comprehensive provenance ensures research findings are not only verifiable but also reusable in ways that were previously unachievable. A centralized repository lets users share their workflows, which further prevents duplicated effort, saves time and computational resources, and democratizes simulation-based research.

BioSim Highlights

BioSim provides a collection of tools and services designed to transform biomolecular simulations by capturing and archiving the full provenance of each simulation. It meticulously records every step – from experimental structures to force field configurations and computational parameters – ensuring research is reproducible, transparent, and efficient. This approach helps scientists to avoid redundant efforts, improves the reliability of findings, and supports AI-driven drug discovery and molecular modelling. The resources provided by the project include:

  • BioSimDB (Biomolecular Simulations Database): Topology, trajectory, and AiiDA data provenance files from molecular dynamics simulations of biomolecules are stored in a repository specifically designed for their preservation and accessibility. The data within the BioSimDB repository has been created by the biomolecular simulation community for the community and is endorsed by CCPBioSim, a network of UK-based researchers who provide support and expertise to biomolecular simulators. Submissions are welcomed from the biomolecular simulation community.
  • Amber Plugin for AiiDA: The Amber plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations.
  • GROMACS Plugin for AiiDA: The GROMACS plugin for AiiDA aims to enable the capture and sharing of the full provenance of data when parameterising and running molecular dynamics simulations.
  • Demo BioSim Database With Local Web Interface: This demo application creates an SQLite database that can hold AiiDA archive files produced by several AiiDA plugins (currently aiida-gromacs and aiida-amber). The database can be used to store and explore archive files, and it gives a taste of how biomolecular MD simulation data provenance can be stored for multiple research projects.
  • Cloud Based Jupyter aiida-gromacs Demo/Training: Tutorials with pre-loaded user environments for using aiida-gromacs to produce provenance workflows for biomolecular simulations.
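
To make the idea behind the demo database more concrete, here is a minimal sketch of how archive files from different plugins might be catalogued in SQLite for several projects. It is not the demo application's actual schema or code: the table layout and the function names (open_catalogue, register_archive, archives_for_project) are assumptions made for illustration, and it relies only on the Python standard library.

```python
import hashlib
import sqlite3
from pathlib import Path


def open_catalogue(db_path=":memory:"):
    """Open (or create) a catalogue database and ensure the table exists.

    The schema below is hypothetical, not the one used by the BioSim demo.
    """
    conn = sqlite3.connect(db_path)
    conn.execute(
        """
        CREATE TABLE IF NOT EXISTS archives (
            id       INTEGER PRIMARY KEY,
            project  TEXT NOT NULL,
            plugin   TEXT NOT NULL,
            filename TEXT NOT NULL,
            sha256   TEXT NOT NULL UNIQUE
        )
        """
    )
    return conn


def register_archive(conn, project, plugin, archive_path):
    """Record one archive file and return its SHA-256 checksum.

    The UNIQUE checksum column means a file that is registered twice
    is stored only once.
    """
    path = Path(archive_path)
    digest = hashlib.sha256(path.read_bytes()).hexdigest()
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO archives (project, plugin, filename, sha256) "
            "VALUES (?, ?, ?, ?)",
            (project, plugin, path.name, digest),
        )
    return digest


def archives_for_project(conn, project):
    """Return (plugin, filename) pairs for a project, in insertion order."""
    rows = conn.execute(
        "SELECT plugin, filename FROM archives WHERE project = ? ORDER BY id",
        (project,),
    )
    return rows.fetchall()
```

Under these assumptions, calling register_archive(conn, "my-project", "aiida-gromacs", "run1.aiida") followed by archives_for_project(conn, "my-project") would list that archive under its project. Storing a checksum alongside the file name is one simple way to detect the same archive being submitted twice.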

Explore all the resources for BioSim via the BioSim (Biomolecular Simulations) Data Resources entry on our What We Provide page.
