The vision of PSDI
The data needs of research are growing at previously unimaginable rates, and the need for collaboration around data has never been clearer. Data cannot be considered simply an output of research, as it is itself a driver of further discovery. Experiments, observations, computations and simulations all generate data, and data flows form the very fabric of research in and across the physical sciences. But, for the most part, each physical science research infrastructure, from laboratory to large facility, has essentially its own data infrastructure.
Whilst centralised, data-centric infrastructures for collecting and reusing data can act as community hubs and drive new methods and discoveries, the current diversity of data infrastructures enables each platform to be tailored to the specific needs of its field. PSDI therefore aims to provide an additional layer of infrastructure that enables sharing of existing resources whilst ensuring that each can remain dedicated to its specific application.
PSDI will form a socio-technical data infrastructure that connects many of these systems across existing experimental and computational facilities. It aims to:
- Support multiscale modelling and multimodal research
- Leverage simulation data to drive experimental science and vice versa
- Surface data from many sources
- Provide reference-quality data
- Standardise, normalise and aggregate data and metadata
- Enable data to be exploited by AI methods
- Support workflows that automate data processing
- Provide a common platform to run models and codes from different sources
- Provide seamless access to performance compute for scaling up
- Enable software curation and publication
- Be a place for curation of legacy beyond individual projects
Statement of Need
The PSDI project was first discussed as part of the EPSRC Large Infrastructure Investments Statement of Need (SoN) call, which ran from late 2020 to early 2021. During this SoN exercise, a project team from STFC and the University of Southampton developed the outline plan for the PSDI. This included commentary on the ambition of the project and the strategic importance of investment in infrastructure for the physical sciences. The SoN exercise was well supported across the physical sciences community: contributions and backing from a wide range of projects and initiatives demonstrated both the community's need for such an initiative and its support for it.
The full text for our large infrastructure SoN can be found linked below:
The Statement of Need confirmed a widespread consensus in the community that investment in research data infrastructure is lagging behind investment in data sources, and identified an urgent need for integration of data and computational infrastructures. It identified four ‘pillars’ of user communities that would benefit from the proposed PSDI.
Pillar 1. Facilities, Institutes and Hubs – significant centralised national facilities and activities that serve a large number of researchers based on a common need.
Pillar 2. National Research Facilities – medium-scale centralised facilities operating at a world-leading level to perform research that cannot be addressed in a standard laboratory.
Pillar 3. Computational Initiatives – uniting those performing simulations with the communities and tools required to do so.
Pillar 4. Research Institutions, research groups and laboratories – community of institutions
Community Pillars identified in the SoN – non-exhaustive examples of initiatives within each pillar
Phase 1 – pilot activities
Following the SoN application, EPSRC asked the PSDI team to complete a proposal for a short pilot project. This pilot project was funded through the UKRI Digital Research Infrastructure (DRI) programme. The PSDI as a whole is a large undertaking, involving a wide range of stakeholders both within our proposed pillars and in the wider community, including data managers, data providers, system architects and many more roles that are crucial to underpinning our national data landscape. The pilot study is intended to expand on the ambitions set out in the SoN, and will consult the community widely on the scope and requirements of the PSDI. It will develop a roadmap for future investment in PSDI.
More information about the structure of the pilot project can be found on the pilot page and the individual work package pages. The pilot activities are running as a short, intense phase, and this portion of the work will complete on 31 March 2022.