Scientific data sharing tools

Paul Jennings, Vrije Universiteit

Paul Jennings is Division Head of Molecular and Computational Toxicology in the Department of Chemistry and Pharmaceutical Sciences at the Vrije Universiteit Amsterdam and a co-editor-in-chief of Toxicology in Vitro. Here he gives his view on research data management.

Why are data management, data sharing and collaboration important for research?

New technology brings new tools and new opportunities in all branches of science, increasing the speed of acquisition and depth of knowledge. But this also means the amount of data is increasing at an unprecedented rate. It is no longer satisfactory to cherry-pick from these big data sets for scientific publications. The data must be made available to other colleagues so that they can frame different questions. But this needs a change in culture and a fundamental change in the way we document experiments and samples.

What would be the best way to do this?

Well, we already know that data dumps are not the long-term solution because the data in them are neither searchable nor findable. In general, data sharing in science is not conducted at any sophisticated level. We need better software solutions to trace data from source to publication, and on to storage and reuse.

Don’t scientists routinely share data?

No, unfortunately, it’s not in the culture for biologists and toxicologists to share raw data. I think it is a real problem with the culture: why should I share my data if nobody else does?

Is lack of sharing due to a technical problem?

Universities like ours have terabytes of online storage so, technically, there is no reason not to share data. But the issue is how you manage it. As biologists and toxicologists, we’re not trained in data management. We store our data on our hard drives and, if we want to share it with someone, we share files. But the person receiving the files may not understand what to do with them. We need a structure for organizing data and a consensus on what to store: Do we store raw data? Slightly analysed data? It depends completely on the data stream: for example, chemical analysis, high-content imaging and RNA sequencing all produce very different types of data. External scientists should be able to interrogate others’ data to check whether the conclusions reached are correct and also to ask new questions. Scientific data management processes really need to change because poor data management is beginning to hurt research. It’s one of the reasons for the reproducibility crisis in scientific research.

So, how do we change the culture?

It’s not going to be easy, but EU projects are making a good start. Data management used to be voluntary, but EU projects now require strong data management. I think this is where the EdelweissData™ platform comes in. Maybe today’s young scientists can take CSV files and rewrite them in a useful way, but EdelweissData™ would be fantastic for making that process automatic. If sharing is easy, people will be more willing and more likely to do it. Let’s change the scientific culture, and let’s do it together.

About Edelweiss Connect

Edelweiss Connect has developed computer software that helps combine data from many different sources and formats into a useable form for evaluation and re-use. In addition, we have developed other software programs that can use the collected data and data from in vitro tests to predict the safety of chemical compounds in the human body.
