The acronym FAIR stands for the principles of Findability, Accessibility, Interoperability, and Reusability, which are increasingly adopted guidelines to improve the reuse of research and other scholarly data. Since the principles were first articulated in 2016 they have become the de facto benchmark for evaluating and improving how research outputs are shared. FAIR principles can be applied to:
- Data, or any digital object (such as can be found in an ORCID record);
- Metadata (information about that digital object, which can also be found in an ORCID record); and
- Research Infrastructure that enables the collection, storage, or exchange of data or metadata (such as the ORCID iD, ORCID records, and the registry in which they are stored).
FAIR principles have gained considerable momentum because many organizations are looking for ways to reduce their administrative burden and gain deeper insight into the research they facilitate or fund, and working with ORCID can help with both. In this post, we outline how ORCID adheres to the FAIR Data principles.
ORCID metadata are Findable
When a researcher has an ORCID iD, the data within their record is readable by both humans through our Registry, and machines through our APIs. One of the core goals of ORCID is to increase the discoverability of researchers by disambiguating them from all the other researchers with the same or even a similar name and definitively connecting them with their research contribution metadata (e.g. their scholarly record).
Specifically, ORCID addresses each of the FAIR findability principle components as follows:
- F1. Data are assigned a globally unique and persistent identifier: This is the core of our service. ORCID exists to provide a unique identifier assigned to people that disambiguates them from others with the same or similar names, allowing them to be findable by machine or human. ORCID iDs are expressed as a sixteen-digit unique identifier (the final character is a checksum and may be the letter X), e.g., https://orcid.org/0000-0001-5727-2427.
- F2. Data are described with rich metadata (defined by R1 below): ORCID supports rich data describing researchers and their contributions, including preferred and alternate name forms, education and employment information, funding and facilities awards, and works and publications of all kinds.
- F3. Data clearly and explicitly include the identifier of the data they describe: ORCID records by definition include the unique identifier (ORCID iD) of the person being described. ORCID also prioritizes the inclusion of resolvable persistent identifiers associated with the metadata items included in a record, such as DOIs for works, ROR IDs (from the Research Organization Registry) for affiliations, and Grant iDs for funding awards.
- F4. Data are registered or indexed in a searchable resource: ORCID records are held in ORCID’s fully searchable registry, and the registry is indexed by major search engines and scholarly platforms. The ORCID public data file enables the complete ORCID dataset to be indexed by any system that finds it useful to do so.
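As a small illustration of F1, the following sketch checks whether a string is a well-formed ORCID iD. It relies on a detail this post does not state: to our understanding, the final character of an iD is an ISO 7064 MOD 11-2 check character computed over the first fifteen digits. The function names and the regular expression are our own and are not part of any ORCID library.

```python
import re

# Accepts a bare iD or the full https://orcid.org/ URI form.
# The pattern is an assumption for illustration, not an official ORCID regex.
ORCID_PATTERN = re.compile(
    r"^(?:https?://orcid\.org/)?(\d{4}-\d{4}-\d{4}-\d{3}[\dX])$"
)


def check_character(base_digits: str) -> str:
    """Compute the ISO 7064 MOD 11-2 check character for 15 digits."""
    total = 0
    for ch in base_digits:
        total = (total + int(ch)) * 2
    result = (12 - total % 11) % 11
    return "X" if result == 10 else str(result)


def is_valid_orcid_id(value: str) -> bool:
    """Return True if value is well formed and its check character matches."""
    match = ORCID_PATTERN.match(value.strip())
    if match is None:
        return False
    compact = match.group(1).replace("-", "")
    return check_character(compact[:15]) == compact[15]


if __name__ == "__main__":
    # The example iD given in F1 above.
    print(is_valid_orcid_id("https://orcid.org/0000-0001-5727-2427"))
```

A check like this only confirms that an iD is syntactically plausible. It says nothing about whether the iD is actually registered, which still requires resolving it against the registry.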
ORCID metadata are Accessible
The ‘A’ in FAIR stands for Accessible, or more accurately, “accessible under well defined conditions.” Though ORCID is an acronym that begins with the word “open,” and our goal is to make as much data as possible openly and publicly available, we have also always been guided by the key founding principle of researcher control, and thus the accessibility of the data in the ORCID registry is subject to the record holder’s privacy settings. Record holders can choose to make individual items, such as their publications or data about their employment or education, available publicly, only to organizations that they trust, or to keep them private. In this way, the data in the ORCID registry can be described as “as open as possible, as closed as necessary.” ORCID enables data access in the following ways:
- A1. Data are retrievable by their identifier using a standardized communications protocol: ORCID iDs are expressed natively as resolvable HTTPS URIs. End users visiting these URIs in a standard web browser will see all of the public data available for the ORCID record in our registry’s user interface. ORCID also supports content negotiation on these URIs, allowing machine-readable access to the public data in a variety of standard web formats including XML, JSON, and RDF. In addition, ORCID provides rich search and retrieval capabilities via our Public API, which is a RESTful API that supports both XML and JSON.
- A1.1 The protocol is open, free, and universally implementable: The ORCID registry uses standard, free, and universally implementable web protocols and formats, including HTTPS, XML, and JSON. Our public API exposes all public data and can be accessed by anyone.
- A1.2 The protocol allows for an authentication and authorization procedure, where necessary: ORCID provides an authentication mechanism to allow organizations trusted by the record holder to retrieve additional non-public “trusted party” data and to add and update data in ORCID records. The authentication mechanism uses the standard OAuth 2.0 protocol to allow organizations to request, and record holders to grant, the necessary permissions.
- A2. Metadata are accessible, even when the data are no longer available: Given our principle of researcher control, ORCID record holders may delete all or part of the data in their records at any time. However, once registered, ORCID iDs remain in our registry even when the records are otherwise empty, and continue to be resolvable.
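The access paths described in A1 and A1.2 can be sketched with Python's standard library alone. The code below only builds requests and URLs and never sends anything over the network. Several specifics are assumptions on our part rather than statements from this post: the media type strings, the `https://orcid.org/oauth/authorize` endpoint, and the `/authenticate` scope reflect our reading of ORCID's public documentation and should be verified there, while the client ID and redirect URI are hypothetical placeholders.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint; confirm against ORCID's API documentation before use.
AUTHORIZE_ENDPOINT = "https://orcid.org/oauth/authorize"


def negotiated_request(orcid_uri: str, media_type: str) -> Request:
    """Build (but do not send) a GET request that asks the ORCID iD URI
    for a specific representation via the HTTP Accept header."""
    return Request(orcid_uri, headers={"Accept": media_type}, method="GET")


def authorization_url(client_id: str, redirect_uri: str, scope: str) -> str:
    """Build the URL a record holder would visit to grant permissions in an
    OAuth 2.0 authorization-code flow."""
    params = {
        "client_id": client_id,
        "response_type": "code",
        "scope": scope,
        "redirect_uri": redirect_uri,
    }
    return AUTHORIZE_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    # Content negotiation: same URI, different representation requested.
    req = negotiated_request(
        "https://orcid.org/0000-0001-5727-2427", "application/json"
    )
    print(req.get_method(), req.full_url, req.get_header("Accept"))

    # Hypothetical client ID and redirect URI, for illustration only.
    print(authorization_url(
        "APP-EXAMPLE", "https://example.org/callback", "/authenticate"
    ))
```

Keeping request construction separate from sending makes it easy to inspect exactly what would go over the wire. To perform the request, the `Request` object could be passed to `urllib.request.urlopen`.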
ORCID metadata are Interoperable
Since ORCID’s charge is to connect researchers and research, and since ORCID data are machine actionable, the FAIR principle of interoperability is where ORCID really shines. ORCID’s interoperable infrastructure can help accelerate knowledge discovery and increase the integrity, transparency, and reproducibility of research by encouraging FAIR Data Principles and Open Science practices through persistent identifiers and standardized, openly accessible data.
- I1. Data use a formal, accessible, shared, and broadly applicable language for knowledge representation: The ORCID data model is currently described as an XML Schema (XSD). The schema includes elements of other community-developed taxonomies including the CASRAI catalog of elements and the CRediT contributor roles taxonomy.
- I2. Data use vocabularies that follow FAIR principles: We work closely with our community to collaboratively design and reuse consistent languages and vocabularies to enable our systems to exchange data freely and reliably.
- I3. Data include qualified references to other data: As explained in F3, ORCID prioritizes the inclusion of all affiliated persistent identifiers relating to the data they describe, such as DOIs to the object, Grant iDs and all forms of resolvable identifiers. These identifiers are in turn included in the ORCID dataset and API responses creating an interlinked web of metadata between ORCID and other major scholarly metadata repositories.
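To make the idea of a qualified reference in I3 concrete, the sketch below extracts identifier type, value, and relationship from a work entry. The dictionary is a hand-written, made-up fragment: its field names follow our understanding of the JSON returned by the ORCID API, and the DOI and ISSN values are hypothetical placeholders, so treat the shape as an assumption to check against the current schema.

```python
# Hypothetical fragment shaped like a work entry; all values are made up.
work_fragment = {
    "external-ids": {
        "external-id": [
            {
                "external-id-type": "doi",
                "external-id-value": "10.1234/example",
                "external-id-relationship": "self",
            },
            {
                "external-id-type": "issn",
                "external-id-value": "0000-0000",
                "external-id-relationship": "part-of",
            },
        ]
    }
}


def qualified_references(item: dict) -> list[tuple[str, str, str]]:
    """Return (relationship, type, value) for each external identifier.

    The relationship is what makes the reference qualified: it states how
    the identified object relates to the item, not merely that a link exists.
    """
    ids = (item.get("external-ids") or {}).get("external-id") or []
    return [
        (
            entry.get("external-id-relationship"),
            entry.get("external-id-type"),
            entry.get("external-id-value"),
        )
        for entry in ids
    ]


if __name__ == "__main__":
    for relationship, id_type, value in qualified_references(work_fragment):
        print(f"{relationship}: {id_type} {value}")
```

The function tolerates items with no identifiers at all, returning an empty list, since not every entry in a record will carry a persistent identifier.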
ORCID metadata are Reusable
ORCID’s public data is designed to be maximally reusable. The public dataset is released under a CC0 waiver and includes detailed provenance metadata, allowing users of the data to determine its applicability and trustworthiness for their use case. Researchers experience greater ease as an increasing number of manuscript submission and grant application forms can be auto-populated when they log into those systems with their ORCID. This results in reduced administrative burden, both for the researcher and the organization they’re sharing their data with. Everyone spends less time re-entering data when it’s FAIR-enabled!
- R1.1. Data are released with a clear and accessible data usage license: ORCID’s public data is released under a CC0 Public Domain Dedication.
- R1.2. Data are associated with detailed provenance: The provenance of each and every assertion present in an ORCID record is disclosed via structured metadata, allowing users of the data to determine its applicability and trustworthiness for their use case. These mechanisms are what we call “trust markers,” and we discuss how to use them to interpret the trustworthiness of an ORCID record further in this blog post.
- R1.3. Data meet domain-relevant community standards: As a broadly adopted, community-governed service operated according to the principles of openness, trust, and inclusivity, ORCID has emerged as the community standard for person identifiers and associated metadata in research and scholarship. ORCID continually seeks input and advice from its stakeholders in order to ensure that their needs continue to be met.
From principles to practice
ORCID is committed to supporting and enabling the FAIR data principles while encouraging the broadest possible adoption, and is always free for researchers to use. ORCID aligns with best practices highlighted by a growing body of national and international commentary on FAIR enablement, such as the recent G20 Research Ministerial Declaration, which establishes a set of joint principles for research integrity and security. The declaration also calls out as a best practice “the development and maintenance of sustainable infrastructures to support the findability, accessibility, interoperability, and reusability of research data and other research-relevant digital objects from public funding free of charge at the point of use.”
Though ORCID predates the establishment of FAIR data principles by several years, we certainly were on a very similar wavelength when we articulated our own vision: a world where all who participate in research, scholarship, and innovation are uniquely identified and connected to their contributions across disciplines, borders, and time. The persistence of the ORCID registry helps keep data valuable for both current and future research.
ORCID’s Founding Principles also complement FAIR principles, especially where they intersect with the values of Open Research, such as the creation and stewardship of a permanent, clear, and unambiguous record of research and scholarly communication; enabling reliable attribution of authors and contributors; welcoming participation from any organization with an interest in research and scholarly communications; making our software openly available under an Open Source Software license; and the availability of no-charge APIs and services.
Keeping up with FAIR momentum
ORCID is a critical part of the research and scholarly data infrastructure, in large part because of the principles it shares with FAIR Data: Findability, Accessibility, Interoperability, and Reusability. In turn, organizations that want to embrace FAIR principles in their operations and systems can be assured that using ORCID is in harmony with FAIR Data goals. We believe that FAIR principles should and will continue to gain momentum as a major priority at every level across the entire research ecosystem. Ultimately this should allow more organizations to reduce their administrative burden and gain deeper insight into the research they facilitate or fund, two of the many benefits of working with ORCID.