A Shelf For Bookshelves
Published online at Lab Times.
Data sharing is a piece of cake in the cyber-world. Search engines display thousands of hits, leaving you baffled at the maze of articles and databases. The German initiative re3data.org puts an end to this confusion by standardizing data repositories.
As researchers, we all recognize the importance of keeping abreast of the latest findings in our field. We visit PubMed and other online databases on a daily basis, not just for fear of being scooped by a competitor but also because we constantly need to reframe hypotheses, identify gaps in our understanding or adopt a novel technique.
The increasing acceptance of the Open Access attitude has produced an unprecedented rise in the number of such databases, allowing quick and inexpensive access to myriad research data. But with all that access, how does a researcher choose where to deposit data or where to look for the right kind of data? This is where the “Registry of Research Data Repositories”, or re3data.org for short, comes in. Re3data.org was founded in 2012 by a team led by Heinz Pampel from the Open Access Coordination Office at the German Research Centre for Geosciences in Potsdam, Germany. Its main goal is to systematically categorise existing repositories to improve the visibility of reliable datasets. The registry is funded by the German Research Foundation (DFG) and is jointly led by GFZ German Research Centre for Geosciences, Berlin Humboldt University and Karlsruhe Institute of Technology.
Re3data.org is built on the premise that data repositories, which are tailored to serve different disciplines and have distinct styles reflecting their founding institutions, are very heterogeneous by nature. Launched as a single global portal to consolidate such varied data sets under a unified format, the project is meant to benefit researchers, publishers and institutions alike when it comes to data search and storage. Recently, a new collaboration between re3data.org, the European Open Access infrastructure OpenAIRE and BioSharing has been announced. BioSharing is an initiative to standardise repositories and manage data sharing.
At present, the growing registry lists about 600 data repositories spanning 140 disciplines and includes, for instance, ArrayExpress, providing functional genomics data; TAIR, the Arabidopsis Information Resource; and ORGIDS, the Rotterdam Glaucoma Imaging Data Sets. For database curators interested in joining, re3data.org sets out a detailed “list of metadata properties” to define the standards any data repository must meet to be eligible for an entry in the registry. This “vocabulary” describes the databases’ general scope, content and infrastructure, as well as their compliance with technical, metadata and quality standards. Guided by this vocabulary, moderators of database repositories can request their infrastructures to be added by filling out a URL suggest form. Inspected and approved databases are “identified by a green check mark”.
Besides a brief description of the database, re3data.org also provides information on the supporting institutions, submission and access policies, licenses and quality standards such as federal endorsements. The last feature permits the user to evaluate and compare the reliability of different database resources.
The portal is very user-friendly, allowing results to be filtered by subject, content type or country. Subjects vary from arts, humanities and law to construction engineering and natural sciences. Clicking on the field “Neuroscience”, for example, lists 12 repositories; a click on “Plant Sciences” gives 27 results. “Content-type” is a particularly useful filter when a user is interested solely in images, audiovisual data or source code, for example. Each repository has a set of icons next to its name that identify features such as access, certification, licensing policy and even openness to submissions, without the user having to go into the details.
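For readers who like to see an idea in code, filtering by subject, content type and country can be sketched roughly as follows. This is only an illustration in Python under stated assumptions: the Repository record, the filter_repositories function and the three placeholder entries are invented here and do not reflect re3data.org's actual data model, API or registry content.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Repository:
    """Hypothetical, simplified registry entry (not the real re3data.org schema)."""
    name: str
    subjects: frozenset
    content_types: frozenset
    country: str


def filter_repositories(repos, subject=None, content_type=None, country=None):
    """Return the repositories that match every filter that was supplied.

    A filter left as None is ignored, so calling the function without
    any filters returns all entries.
    """
    matches = []
    for repo in repos:
        if subject is not None and subject not in repo.subjects:
            continue
        if content_type is not None and content_type not in repo.content_types:
            continue
        if country is not None and repo.country != country:
            continue
        matches.append(repo)
    return matches


# Placeholder entries, invented purely for illustration.
REPOS = [
    Repository("Example Repository A", frozenset({"Plant Sciences"}),
               frozenset({"images", "source code"}), "DE"),
    Repository("Example Repository B", frozenset({"Neuroscience"}),
               frozenset({"images"}), "NL"),
    Repository("Example Repository C", frozenset({"Plant Sciences", "Neuroscience"}),
               frozenset({"audiovisual data"}), "DE"),
]

if __name__ == "__main__":
    # Combine two filters, much like ticking two boxes on the portal.
    for repo in filter_repositories(REPOS, subject="Plant Sciences", country="DE"):
        print(repo.name)
```

Each filter only narrows the list further, which is why combining subject and content type is a quick way to find, say, only the image collections in one field.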
“In the upcoming project phase the focus will be on improving usability and implementing new features. Among other things, the dialog with repositories’ operators will be supported by a workflow system,” writes Heinz Pampel in a guest post on the PLOS Tech Blog. And with secure funding until 2014, re3data.org will certainly do its share to promote the standardization of data repositories, and through it, “a culture of sharing, increased access and better visibility of research data”.
Photo: Bookshelves by Germán Poo-Caamaño via Creative Commons License