Edinburgh Open Research Conference 2024

Jul 3, 2024

Dr Cerys Willoughby attended the 3rd annual Open Research Conference on behalf of PSDI. The conference focused on how Open Research can contribute to positive cultural change in research more broadly, with themes including next generation metrics, research integrity, and education and skills. It included contributors from across the UK and Europe, with a plenary discussion, a range of both full and lightning talks, and a poster session. The presenters came from a variety of backgrounds, including both sciences and humanities.

A wide variety of angles on cultural change were discussed across the day. The plenary panel largely focused on cultural change and open research as an ethical question. Promoting values and adopting integrity principles were considered essential for cultural change. The panel discussed differences in research cultures between disciplines, and how factors such as different training practices may contribute to them. It was noted that activities in the community to promote research integrity tend to be grassroots activities – undertaken by individuals in their own time – and the speakers advocated the need for more input into these activities from publishers and institutions. Research cultures and researcher behaviours vary across both disciplines and geographical boundaries, and the challenges posed by a lack of integrity in research were also discussed. Examples included plagiarism and the increasing use of technologies such as AI generation and software that can turn someone else’s words into new text. Such technologies can defeat plagiarism-detection software, and in the communities where these technologies are routinely used this behaviour is not perceived as a problem. This is exacerbated by the ‘publish or perish’ culture within academic research, where the importance of outputting large numbers of papers to secure tenure or gain promotions outweighs questions of integrity. Highlighting the principles of open research – and encouraging funders and publishers to focus on them – provides an opportunity to create a positive culture from the ground up.

Presentations in the conference were dominated by questions about diversity and inclusivity in research – who gets to do the research and whose stories are being told? The issues discussed were noted to be driven by the history of academic establishments themselves. Universities were set up to provide opportunities for a narrow demographic, and as a result other communities have historically been excluded. Although things have improved, other demographics are still underrepresented and may find it harder to get the opportunities to tell their stories or to get appropriate credit. With social media and the Web, it is easier to reach wider and larger audiences, but in academia “blogs don’t count”. Relying on papers and grants as the measures used in universities has a negative impact, preventing other cultures from emerging. Several presenters indicated that changes are needed not only to the way that research is done and communicated, but also to the way that stories and lived experiences are recorded in research. There is a need to ensure that knowledge presented in non-traditional forms – so-called ‘embodied knowledge’ – is appropriately captured and shared to maintain the voice of the storyteller.

Several of the presentations also talked about the importance of recognition and inclusion of technical specialists in research. There are many roles that are not traditional research roles, but which contribute significantly to the research process. However, these individuals are rarely included on author lists for publications and therefore receive insufficient recognition and credit for their work. In addition, the discussion about open research typically revolves around the academic process, with little insight provided into what this means for those in technical specialist or other research support roles. This leaves individuals excluded from the conversation and unclear as to how they can be involved. A variety of initiatives seek to address these problems, for example by providing tailored training in open research, ensuring that contributions by technical staff are credited appropriately in publications, including technical leads on grants, and creating technical specialist networks to share opportunities related to open research.

Training was also a topic discussed in many of the presentations, and education and training in data management were viewed as a vital route for communicating the importance and value of open research to researchers and their supervisors. Data Management Plans (DMPs) are driving the requirements for this training, but many individuals still have a limited awareness of data and think that the paper itself is the dataset. The different research cultures of different disciplines lead to variations in their level of understanding and perceived importance of data. In the humanities, researchers appear to have limited awareness of data and need help to identify the data in their own research, whilst in the sciences there appears to be better awareness, with trainers receiving more specific questions about the what, where and how of data management, and about important issues such as data protection and privacy. Across all disciplines, it was noted that repositories are not well understood, and are perceived only as a place to store data rather than as a publishing medium.

From a PSDI perspective, the discussions raised several interesting questions for our own activities, especially around how we engage with a broad audience:

  • How can we help to skill up different roles, including educating leaders?
  • How can we engage with the community to understand the different needs, requirements, and tasks in open science for different roles?
  • Should we define principles for PSDI around open research and ethics – especially around data deposition and data citation?
  • What kinds of events can we run to engage with the technical specialist and research support communities?
  • Are there non-traditional ways we can provide content or accept content on PSDI to meet the needs of different audiences?
  • How do we deliver knowledge in the most effective form for our audience?
  • How can we recognise contributions to PSDI? For example, should we mint DOIs for articles so that they can be cited and the authors can receive credit?

Modern scientific research workflows use a plethora of diverse software tools and file formats. Unfortunately, the file formats that one software tool can export are often incompatible with the formats required for import by another. Furthermore, the current capabilities for converting data between these different formats are often slow, unclear and error-prone. Data formats vary in their structure and in the amount of information they can represent, which makes conversion between specific formats complex and can result in information loss. PSDI’s Data Conversion Service (DCS) was created to address this challenge, offering researchers a single, trusted place to convert data formats while helping them understand the likely quality and limitations of different conversions.
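
As a rough illustration of how information loss can arise, the sketch below round-trips a small molecule through a plain XYZ-style file. It does not use the Data Conversion Service or any of its converters: the `Atom` and `Molecule` classes and the helper functions are hypothetical names invented for this example. The only format detail relied on is the plain XYZ layout (an atom count, a comment line, then one line per atom with an element symbol and three coordinates), which has no place for bonds or formal charges.

```python
from dataclasses import dataclass, field


@dataclass
class Atom:
    element: str
    x: float
    y: float
    z: float
    charge: int = 0  # formal charge; plain XYZ cannot store this


@dataclass
class Molecule:
    name: str
    atoms: list
    bonds: list = field(default_factory=list)  # (i, j, order); not storable in plain XYZ


def to_xyz(mol):
    """Write the molecule as plain XYZ text: count, comment line, then atoms."""
    lines = [str(len(mol.atoms)), mol.name]
    for a in mol.atoms:
        lines.append(f"{a.element} {a.x:.4f} {a.y:.4f} {a.z:.4f}")
    return "\n".join(lines) + "\n"


def from_xyz(text):
    """Read plain XYZ text back; bonds and charges cannot be recovered."""
    lines = text.strip().splitlines()
    count = int(lines[0])
    atoms = []
    for line in lines[2:2 + count]:
        element, x, y, z = line.split()
        atoms.append(Atom(element, float(x), float(y), float(z)))
    return Molecule(lines[1], atoms)


def lost_fields(original, round_tripped):
    """Name the kinds of information present before conversion but absent after."""
    lost = []
    if original.bonds and not round_tripped.bonds:
        lost.append("bonds")
    had_charge = any(a.charge for a in original.atoms)
    has_charge = any(a.charge for a in round_tripped.atoms)
    if had_charge and not has_charge:
        lost.append("charges")
    return lost


# Hydroxide ion: one O-H bond and a formal charge of -1 on oxygen.
hydroxide = Molecule(
    "hydroxide",
    [Atom("O", 0.0, 0.0, 0.0, charge=-1), Atom("H", 0.96, 0.0, 0.0)],
    bonds=[(0, 1, 1)],
)
print(lost_fields(hydroxide, from_xyz(to_xyz(hydroxide))))  # ['bonds', 'charges']
```

The coordinates survive the round trip, but the bond and the charge do not, which is the kind of limitation a researcher would want to know about before choosing a target format.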

Where the idea came from

The need for a Data Conversion Service was first identified during research carried out for the PSDI pilot phase at the University of Southampton, which was published in Digital Discovery. This research identified a recurring issue across the physical sciences: researchers were working with data that existed in many different formats, making collaboration and reuse difficult due to a lack of interoperability. It highlighted a clear need for “data format conversion between different data types in order to facilitate data exchange between different services, and to allow users to collaborate using common formats.”

A key conclusion of this work was that this issue, alongside many other interoperability challenges, could best be addressed by identifying existing software that already offers relevant functionality, and creating the infrastructure needed to allow these tools to work together.

Several converters had already been created by the scientific community to address some of these issues, such as Open Babel, although in their current form they were fragmented and offered little insight into conversion quality or potential information loss. Therefore, rather than creating another converter, PSDI’s focus shifted towards making better use of these existing software tools by bringing them together and exposing their capabilities more transparently.

As Dr. Samantha Pearman-Kanza, who was closely involved in shaping the early direction of the service, explains:

“Rather than simply creating another conversion tool, the focus was on making the best use of existing software and elevating their offerings. The aim was to help researchers understand what conversions were possible across different scientific data formats, which existing tools could be used, and where the use of these tools for certain conversions might involve compromises in data quality.”

From concept to working service

Early ideas explored a search interface that identified possible conversions and directed users to existing conversion software. This quickly evolved into a more researcher-friendly approach: integrating established converters directly into a single service and exposing their options in a consistent way.

Development was carried out by Research Software Engineers Dr. Ray Whorley, Dr. Bryan Gillis and Dr. Don Cruickshank, who initially prototyped the service as a small Python application before expanding it into a fully-fledged web service and suite of downloadable tools.

Reflecting on this evolution, Dr. Whorley says:

The service now incorporates widely used converters such as Open Babel, Atomsk and c2x. Users can upload files, choose input and output formats, apply available conversion options, and download both the converted file and a detailed log. Accessibility has been built in throughout, with users able to customise fonts, sizes and colour schemes.

The Data Conversion Service interface showing format selection, available converters and indicative conversion quality.

Supporting real research workflows

Alongside the web application, the team developed three downloadable tools: a local browser-based version, a command-line tool and a Python library. These are proving particularly valuable for researchers working with sensitive data or automated workflows.

As Dr. Whorley explains:

“The downloadable tools give researchers confidence that their data remains local, and they can be dropped straight into automated workflows.”

This flexibility allows the Data Conversion Service to support everything from quick, one-off conversions to large-scale, repeatable processing pipelines.

Supporting FAIR data and PSDI’s wider ecosystem

Interoperability is a core part of FAIR data practice, and the Data Conversion Service plays a key role in enabling it. Researchers often need to convert the output of one tool into a format that can be used by the next, or to revive legacy data stored in outdated formats. Our service helps reduce the technical barriers to doing both.

Looking ahead

Now that the Data Conversion Service is established, its future direction will be strongly shaped by user feedback. Researchers can report missing formats and conversions directly through the service, and suggestions are already influencing planned enhancements.

Alongside this, there is clear scope for closer integration between the Data Conversion Service and other PSDI tools and services, for example by enabling data transformed through the Data Revival Service (a service which takes scanned handwritten paper lab notebooks and converts them into machine-readable data) to be converted into a wider range of usable formats, or by generating chemical identifiers such as InChI or SMILES from a broader set of input formats for use in discovery services like Cross Data Search.

As Dr. Pearman-Kanza notes:

“The capacity to convert data between different formats is what really unlocks reuse across tools, across projects and across disciplines.”

Potential future developments also include support for conversions that require more than one input file, additional conversion tools, chained conversions where no direct route exists, data visualisation, and an API to enable integration with other platforms and services.
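
One way to think about chained conversions is as a route-finding problem: formats are nodes, each supported direct conversion is a one-way link, and a chain is a path from the input format to the output format. The sketch below uses a breadth-first search to find a route with the fewest steps. It is only an illustration, not a description of how the Data Conversion Service implements or plans to implement this feature. The function name `find_conversion_route` and the table `DIRECT_CONVERSIONS` are hypothetical, and the format pairs listed are made up for the example rather than taken from the capabilities of any real converter.

```python
from collections import deque

# Hypothetical table of direct conversions: source format -> formats reachable in one step.
DIRECT_CONVERSIONS = {
    "cif": {"xyz"},
    "pdb": {"xyz"},
    "xyz": {"mol"},
    "mol": {"inchi", "smiles"},
}


def find_conversion_route(direct, source, target):
    """Return a shortest list of formats from source to target, or None if no route exists."""
    if source == target:
        return [source]
    queue = deque([[source]])
    seen = {source}
    while queue:
        path = queue.popleft()
        # Sort neighbours so the result is deterministic when several routes tie.
        for next_format in sorted(direct.get(path[-1], ())):
            if next_format in seen:
                continue
            new_path = path + [next_format]
            if next_format == target:
                return new_path
            seen.add(next_format)
            queue.append(new_path)
    return None


route = find_conversion_route(DIRECT_CONVERSIONS, "cif", "inchi")
print(" -> ".join(route))  # cif -> xyz -> mol -> inchi
```

Preferring the route with the fewest steps is a reasonable starting assumption because every intermediate format is another opportunity to lose information. A fuller version might instead weight each link by an indication of conversion quality.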

A service built with researchers in mind

For the team, seeing the Data Conversion Service grow from an identified need into a live, widely usable tool has been deeply rewarding. The aim is to make data conversion clearer, more transparent and more inclusive, so researchers can spend less time wrestling with formats and software, and more time doing research.

As Dr. Pearman-Kanza puts it:

“If researchers can trust the conversion process and understand its limitations, they are better placed to make informed decisions about how their data can be used. This includes understanding when conversion is appropriate, what can be gained, and what might be lost, which is an important step towards better research practice overall.”


Try the Data Conversion Service

The Data Conversion Service is freely available to use and designed to fit a wide range of research needs, from quick, one-off conversions to integration within automated workflows. Researchers can explore the web-based service, download local tools, and provide feedback directly to help shape future development.

To get started, visit the live service, watch the short introduction video, explore the documentation, or download the tools to use locally within your own workflows.

Explore the Data Conversion Service and start converting your data with confidence.

 
