
Research data terms

Some of the terms used in the context of data management plans and research data management might need explanation. We try to adhere to the following definitions.

For Norwegian translations, consult the NO-RDA Termliste for forskningsdatahåndtering.

Key research data management terms explained (alphabetical order, non-exhaustive)

Archiving

Engage in curation activity that ensures that records, objects, metadata and data are properly selected, stored, and can be accessed, and for which logical and physical integrity are maintained over time, including security and authenticity. CODATA RDM Terminology (version 2023) License: CC BY 4.0

Data archive

On these pages and in the affiliated data management plan template, archive is used for repositories.

Archive (noun): Curated collection or repository containing physical or digital static records, objects, metadata and data deemed suitable for permanent retention, set up and managed to established standards and models, such as ISAD(G), CoreTrustSeal, and the OAIS reference model, that ensure long term integrity, security, authenticity and accessibility of the records, objects, metadata and data. CODATA RDM Terminology (version 2023) License: CC BY 4.0

Data preservation

Preservation: An activity within archiving in which specific items of data are maintained over time so that they can still be accessed and understood through changes in technology. CODATA RDM Terminology (version 2023) License: CC BY 4.0

Data storage

Embargo

A defined period after deposit during which the data are not yet openly accessible. Once the period ends, the data are made available in an open archive.
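As a rough illustration of how an embargo relates a deposit date to a release date, the sketch below computes when an embargoed dataset would become openly available. The function names and the choice to express the embargo in days are assumptions made for this example; actual archives define their own embargo rules.

```python
from datetime import date, timedelta


def embargo_end(deposit_date: date, embargo_days: int) -> date:
    """Return the first day on which the data may be openly available.

    Assumption: the embargo is counted in calendar days from deposit.
    """
    return deposit_date + timedelta(days=embargo_days)


def is_openly_available(deposit_date: date, embargo_days: int, today: date) -> bool:
    """Return True once the embargo period has passed."""
    return today >= embargo_end(deposit_date, embargo_days)


# Hypothetical dataset deposited on 1 January 2024 with a 30-day embargo.
release = embargo_end(date(2024, 1, 1), 30)
```

Under these assumptions the data stay closed until the computed release date, even though they were deposited earlier.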

FAIR principles

FAIR data principles: Set of guiding principles to make data Findable, Accessible, Interoperable, and Reusable. CODATA RDM Terminology (version 2023) License: CC BY 4.0 The FAIR principles in detail

Information security

Metadata

Metadata: Data about data. It is data (or information) that defines and describes the characteristics of other data. It is used to improve the understanding and use of the data. CODATA RDM Terminology (version 2023) License: CC BY 4.0

Metadata standard

Metadata standard: High level, shared representation of the metadata elements related to a dataset, collection, or other digital object. May also provide an XML schema describing the format in which the elements should be stored. Typically, a standard XML format is defined using XML Schema or document type definition (DTD). Standards are typically ratified by national or international standards bodies. CODATA RDM Terminology (version 2023) License: CC BY 4.0
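To make the idea of a shared set of metadata elements stored in XML more concrete, the sketch below builds a tiny record using a few element names from the Dublin Core element set. The `record` wrapper element, the `build_record` helper and the sample values are hypothetical and only serve as illustration; a real repository would prescribe its own schema.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core element set.
DC_NS = "http://purl.org/dc/elements/1.1/"


def build_record(fields: dict) -> str:
    """Serialise a dict of Dublin Core element names and values to XML.

    The <record> wrapper is an assumption for this sketch, not part of
    any particular metadata standard.
    """
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")
    for name, value in fields.items():
        child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
        child.text = value
    return ET.tostring(root, encoding="unicode")


# Hypothetical sample values.
example = build_record({
    "title": "Hypothetical survey dataset",
    "creator": "Example, Researcher",
    "date": "2024-01-15",
})
```

Because every record uses the same element names, software that understands the standard can read records from different sources without custom handling.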

Persistent identifier (PID)

Persistent identifier: Long-lasting digital reference to an object that gives information about that object regardless of what happens to that object. Developed to address link rot, a persistent identifier can be resolved to provide an appropriate representation of an object whether that object changes its online location or goes offline. CODATA RDM Terminology (version 2023) License: CC BY 4.0

Examples:

Personal data

Personal data is any information that relates to an identified or identifiable living individual. Different pieces of information, which collected together can lead to the identification of a particular person, also constitute personal data. Personal data that has been de-identified, encrypted or pseudonymised but can be used to re-identify a person remains personal data and falls within the scope of the GDPR. Personal data that has been rendered anonymous in such a way that the individual is not or no longer identifiable is no longer considered personal data. For data to be truly anonymised, the anonymisation must be irreversible. The GDPR protects personal data regardless of the technology used for processing that data – it’s technology neutral and applies to both automated and manual processing, provided the data is organised in accordance with pre-defined criteria (for example alphabetical order). It also doesn’t matter how the data is stored – in an IT system, through video surveillance, or on paper; in all cases, personal data is subject to the protection requirements set out in the GDPR. European Commission: What is personal data?

Provenance, data lineage

Provenance: A type of historical information or metadata about the origin, location or the source of something, or the history of the ownership or location of an object or resource including digital objects. For example, information about the Principal Investigator who recorded the data, and the information concerning its storage, handling, and migration. CODATA RDM Terminology (version 2023) License: CC BY 4.0

Repository

On these pages and in the affiliated data management plan template, archive is used for repositories.

Repository: Physical or digital storage location that can house, preserve, manage, and provide access to many types of digital and physical materials in a variety of formats. Materials in online repositories are curated to enable search, discovery, and reuse. There must be sufficient control for the physical and digital material to be authentic, reliable, accessible and usable on a continuing basis. CODATA RDM Terminology (version 2023) License: CC BY 4.0

Research data

The term research data is defined in this policy to mean the registration/recording/reporting of numerical scores, textual records, images and sounds that are generated by or arise during research projects. These may, for example, be data that are generated through new analysis by combining existing secondary data, or entirely new data that are generated through new data collection. Research data are always a direct result of research activity, regardless of whether the data are based on secondary data or whether they are collected from scratch. The term secondary data is used in this policy to refer to data that already exist, regardless of the research to be conducted. These may comprise information collected for a different purpose (e.g. public administrative data, clinical data or weather data) or they may be physical or digital collections of objects and texts (e.g. libraries or data reused from previous projects such as text corpuses or other scientific collections). Information on the Internet may also be defined as secondary data in this context, and such information is highly heterogeneous. Data which are used as secondary data in research, but which have been collected, generated or processed by other researchers or research institutions than those conducting the research, will normally not be encompassed by the guidelines in this policy. The Research Council of Norway’s Policy for Open Access to Research Data

Research data: Data that are used as primary sources to support technical or scientific enquiry, research, scholarship, or artistic activity, and that are used as evidence in the research process and/or are commonly accepted in the research community as necessary to validate research findings and results. All other digital and non-digital content have the potential of becoming research data. Research data may be experimental data, observational data, operational data, third party data, public sector data, monitoring data, processed data, or repurposed data. CODATA RDM Terminology (version 2023) License: CC BY 4.0

Sensitive data

The term sensitive data is used when making data publicly available could put people, organisations, countries, and/or ecosystems at risk - this could be, for example, personal or commercial information, […]. Such data must be protected against unauthorized access, and therefore one should be cautious when dealing with potentially sensitive or sensitive information. RDMkit: Data sensitivity License: CC BY 4.0

Special category data

The GDPR defines special categories of personal data (‘[personal] sensitive data’) as:

  • personal data revealing racial or ethnic origin, political opinions, religious or philosophical beliefs;
  • trade-union membership;
  • genetic data, biometric data processed solely to identify a human being;
  • health-related data;
  • data concerning a person’s sex life or sexual orientation.

European Commission: What personal data is considered sensitive?