Finding & Reusing

What is open data, what is FAIR data?

Open data can be used and redistributed without restrictions. The permitted forms of reuse are indicated by open licenses, usually Creative Commons licenses [7].

As it is often not possible to make primary data completely openly available for various reasons (e.g. data protection or commercial exploitation interests), the aim when publishing data is to make it as FAIR as possible.

The FAIR principles state that research data should be published in a way that is findable, accessible, interoperable and reusable. Therefore, if sensitive data is to be made accessible, storage in a repository that only allows data to be used after prior authentication and authorization is in line with the FAIR principles.

This and other information can be found on the information platform forschungsdaten.info (German language only).
 

 

Where can I find reusable data?

Research data is often stored in repositories. Re3Data, OpenDOAR and RIsources, among others, are available for finding suitable repositories.

To search for research data directly, you can use the search engines DataCite Commons, the European Union Open Data Portal, B2FIND or Google Dataset Search, for example.

This and other information can be found on the information platform forschungsdaten.info (German language only).
 

 

Collecting & Processing

What is the purpose of a data management plan (DMP)? How can a DMP be created?

A data management plan (DMP) describes the handling of research data during and after the completion of a scientific project. A DMP is a useful tool that provides project management and all project participants with an overview of data storage and management over the entire duration and research data life cycle.

The creation of a DMP can be useful regardless of whether the aim is to publish the data, as gaps and uncertainties regarding the handling of the research data created can be comprehensively clarified at an early stage. The type of data, the storage space required for it and access rights during and after the end of the project are a few aspects that can be clarified and documented in a DMP.

Several research funding institutions require applications to include information on the handling of research data. In most cases no formal DMP is expected, but preparing one makes it easier to meet this requirement. The information platform forschungsdaten.info provides an overview of the funding institutions' requirements (German language only).

A common tool for creating a DMP is the Research Data Management Organizer (RDMO). Freely available installations are provided by forschungsdaten.info (German language only) and NFDI4Ing. One of the available login options is your ORCID iD.

The German Research Foundation (DFG) offers a checklist on the content of a DMP. The DMPonline platform, which can also be used to create DMPs, has a list of publicly accessible DMPs that can be used as examples (not quality-checked).

This and other information can be found on the information platform forschungsdaten.info (German language only).
 

 

How can data be organized?

Structuring, documenting, and securing research data systematically from the outset can significantly reduce the time and effort involved in day-to-day research work. This starts with basic decisions such as a clearly defined folder structure and naming scheme for the project that is binding for all participants. A folder hierarchy of no more than three levels is recommended. In addition, there should be a clear rule for deleting data after the end of the project. All of these rules should be documented in writing, communicated to all project participants and kept accessible at all times.
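The three-level recommendation can be checked automatically. The following is a minimal sketch, assuming the project lives in a single root folder; the function name folders_too_deep and the constant MAX_DEPTH are illustrative and not part of any tool mentioned here.

```python
from pathlib import Path

MAX_DEPTH = 3  # recommended maximum number of folder levels below the project root


def folders_too_deep(project_root, max_depth=MAX_DEPTH):
    """Return all folders nested more than max_depth levels below project_root."""
    root = Path(project_root)
    too_deep = []
    for path in root.rglob("*"):
        if path.is_dir():
            relative = path.relative_to(root)
            # The number of path components equals the nesting level.
            if len(relative.parts) > max_depth:
                too_deep.append(relative.as_posix())
    return sorted(too_deep)
```

Calling folders_too_deep("my_project") returns an empty list if the structure follows the recommendation, and otherwise lists the folders that should be flattened.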

For the collected data (records) themselves, it is crucial to make changes traceable through working version control. This can be achieved through a precise naming convention that includes a version number and the date of the change (ideally in the form YYYYMMDD). Final versions of data records should be marked accordingly. To guard against data loss, it is advisable to regularly save versions to a separate storage location, where they must not be deleted or changed.
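Such a naming convention might look as follows. This is only a sketch: the order of the parts, the "v" prefix, the two-digit version number and the "FINAL" marker are assumptions for illustration, and each project should define its own scheme.

```python
from datetime import date


def versioned_name(stem, version, changed_on, extension, final=False):
    """Build a file name from a stem, a version number and the date of change."""
    # Date in the form YYYYMMDD so that names sort chronologically.
    parts = [stem, f"v{version:02d}", changed_on.strftime("%Y%m%d")]
    if final:
        parts.append("FINAL")  # hypothetical marker for the final version
    return "_".join(parts) + "." + extension.lstrip(".")


print(versioned_name("survey", 3, date(2024, 1, 15), "csv"))
# prints: survey_v03_20240115.csv
```

Zero-padding the version number keeps files in the correct order when they are sorted alphabetically.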

This and other information can be found on the information platform forschungsdaten.info (German language only).
 

 

Publishing & Archiving

Why should research data be published?

The publication of research data is primarily a matter of good scientific practice and is addressed in several places in the German Research Foundation's (DFG) Code of Conduct for Safeguarding Good Research Practice.

Particularly relevant in this regard is Guideline 12: Documentation. It states that all information relevant to how a research result came about must be documented according to subject-specific standards so that the result can be assessed and reviewed. Measures must be taken to protect this information against manipulation, and third parties must be given access to it. In addition to research data, this also applies to methods, evaluation and analysis steps, and the source code of research software.

Guideline 13: Providing public access to research results explains at the outset that it may be justified in individual cases not to publish results. Researchers make the decision to publish on their own responsibility and on the basis of the practices of the respective subject area. Research data should be made accessible in reliable repositories in accordance with the FAIR principles (see also section "How can research data be published FAIR?").

The handling of research data is also regulated in Guideline 10: Legal and ethical frameworks, usage rights and Guideline 17: Archiving. The information platform forschungsdaten.info (German language only) provides an overview of the Code's statements on research data.
 

 

How can research data be published FAIR?

The FAIR principles state that research data should be published in a way that is findable, accessible, interoperable and reusable.

To ensure findability, the data must be described with rich metadata and assigned a persistent identifier (e.g. a DOI). The (meta)data must also be indexed in a search engine or database.

Access to the (meta)data should be possible via a standardized and open communication protocol that supports authentication and authorization where required. The metadata should remain accessible even if the actual research data is not (or no longer) available.

For interoperability, the languages and vocabularies used for the (meta)data should be documented. In addition, the (meta)data should be linked to other (meta)data in a meaningful way.

Reusability depends primarily on how well the data is described: the context in which the data was created must be documented, a license must be assigned and subject-specific standards must be followed.

GO FAIR provides this and other detailed information on the FAIR principles as well as explanations on their implementation.
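The findability and reusability requirements above can be illustrated with a small completeness check for a metadata record. This is a sketch under stated assumptions: the field names in REQUIRED_FIELDS are hypothetical and do not follow any particular repository schema, and the DOI is a placeholder.

```python
# Hypothetical minimal set of descriptive fields; real repositories
# define their own, usually more extensive, metadata schemas.
REQUIRED_FIELDS = ("identifier", "title", "creators", "publication_year", "license")


def missing_fields(record):
    """Return the required fields that are absent or empty in the record."""
    return [field for field in REQUIRED_FIELDS if not record.get(field)]


record = {
    "identifier": "10.1234/example-doi",  # placeholder DOI, not a real dataset
    "title": "Example measurement series",
    "creators": ["Doe, Jane"],
    "publication_year": 2024,
    "license": "CC-BY-4.0",
}

print(missing_fields(record))
# prints: []
```

A record that lacks, for example, a license or a persistent identifier would be reported before the dataset is submitted to a repository.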
 

 

Where can research data be published?

Articles in data journals are suitable for the pure description of complex and significant data sets. Similar to traditional research articles in scientific journals, they undergo a review process. The portal forschungsdaten.org provides an overview of data journals [8].

The dataset described in such an article should be published separately in a suitable repository. When selecting one, take into account subject-specific standards, the requirements of funding institutions and publishers, and the options for long-term archiving.

If a suitable subject-specific repository exists, the research data should be published there. The Re3Data and RIsources directories enable a subject-specific search for repositories.

If no suitable subject-specific repository is available, a generic repository such as Zenodo can be used.

This and other information can be found on the information platform forschungsdaten.info (German language only).
 

 

Which file formats are suitable for subsequent use and long-term archiving?

To ensure the long-term reusability of your research data, you should use file formats for publication that are compatible with various systems, can be archived for as long as possible and can be converted without loss.

Data is usually collected with specialized software that is established in the respective field and tailored to the collection method. Such software typically uses its own file formats. If it offers an export function that saves the data in an alternative file format better suited for publication according to the above criteria, this function should be used. The information platform forschungsdaten.info has compiled an overview (German language only) of file formats that are considered suitable or unsuitable for publication in the medium or long term.

If conversion to another file format is necessary, a decision must be made as to whether the conversion should be lossless, lossy or "meaningful", i.e. lossy but retaining all essential content. A lossy conversion can be preferable to a lossless one if a smaller file size is required, and a meaningful conversion may be sufficient in that case.
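One way to tell a lossless conversion from a lossy one is a round trip: convert the data, convert it back and compare the result with the original. The following sketch uses JSON serialization as the lossless example and rounding as the lossy one; the helper names and the sample values are illustrative assumptions.

```python
import json


def round_trip_is_lossless(data, convert, restore):
    """True if restoring the converted data reproduces the original exactly."""
    return restore(convert(data)) == data


measurements = [[1, 20.125], [2, 19.875]]  # made-up sample values

# Lossless: serializing to JSON text and parsing it again.
print(round_trip_is_lossless(measurements, json.dumps, json.loads))
# prints: True


def round_rows(rows):
    """A lossy step: rounding discards precision that cannot be restored."""
    return [[number, round(value, 1)] for number, value in rows]


# Lossy: there is no restore step that brings the dropped digits back.
print(round_trip_is_lossless(measurements, round_rows, lambda rows: rows))
# prints: False
```

Whether the rounded values still count as a "meaningful" conversion depends on the precision the research question actually needs.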

This and other information can be found on the information platform forschungsdaten.info (German language only).