Regardless of who is analyzing data to evaluate scholarly impact, be they a grant funder, a university administrator, a peer reviewer, or a librarian working on collection development, each person faces similar challenges in selecting and collecting data. They are particularly hampered by the challenges of data granularity and aggregation. To compensate, the research community has long relied on proxies of quality for assessment purposes, such as the impact factor of the journal in which an article is published. Those seeking to assess the impact of a particular item or body of scholarship need to be able to “tune” the systems of data gathering so that they can calculate the metrics of greatest interest. The use of persistent identifiers, such as ORCIDs and DOIs, can support such tuning.
Traditional scholarship is now available in a variety of forms and online locations. In the most traditional publication model, the final published paper might be available in the journal itself as well as from the publisher’s website. Beyond the mirror sites that a publisher might maintain around the world, many publishers transition their content to archiving services like JSTOR and may also have secondary distribution partnerships with full-text aggregation services, such as EBSCOHost Academic Search or Gale’s Academic OneFile. To complicate the picture further, papers may also circulate as preprints stored in local repositories, as shared copies, or even as unauthorized copies, from which usage is presently difficult or impossible to aggregate.
Further, this traditional scholarly communications model is strictly article-focused and does not include scholarly outputs that many researchers are now creating and sharing, such as data, software code, or blogs. Nor does the model include the various forms of interaction through citation systems, social media, or reuse, which could provide meaningful analytical data. Finally, this traditional process does not capture the valuable application of these materials through integration into e‑courses, inclusion in applications, adoption by the community, patent applications, or use in legislative processes, all of which one might consider components of overall impact.
The challenge of granularity
Regarding granularity, who hasn’t experienced the challenge where the thing you want isn’t available at the level you want it? You would like to purchase a single item, but it is only available in a bulk package. You may want one cable channel, but you have to subscribe to an entire tier of cable service. Similarly, in scholarly communications, assessment information has for too long been provided only at the journal or organizational level. However, outside of library collections assessment, the people interested in impact assessment most often want to know the performance of a particular researcher, the impact of a particular effort described in a single paper, or the success of a particular grant-funded project.
To extract the meaningful elements from the whole, each item of value must be uniquely identified. Until this century, identifiers had traditionally been assigned at the “package” level, e.g., the ISSN for a journal or the ISBN for a book. In 2000, we saw the first implementation of the Digital Object Identifier (DOI®), which quickly began to be used to identify individual scholarly articles. Until the launch of ORCID, there was no broadly accepted mechanism to unambiguously identify the authors of articles or researchers more generally. The availability and usage of these identifiers have paved the way for assessment tools and services to collect data at a much more granular level.
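To make the mechanics concrete, here is a minimal Python sketch of how article-level and researcher-level metadata can be retrieved once DOIs and ORCID iDs are in place. It assumes the public Crossref and ORCID REST endpoints and response shapes roughly as publicly documented; the field names, the example DOI, and the example ORCID iD are illustrative and should be verified against the current APIs before use.

```python
import requests

def crossref_metadata(doi):
    """Resolve a DOI to article-level metadata via the public Crossref API."""
    resp = requests.get(f"https://api.crossref.org/works/{doi}", timeout=30)
    resp.raise_for_status()
    msg = resp.json()["message"]
    return {
        "title": (msg.get("title") or [""])[0],
        "journal": (msg.get("container-title") or [""])[0],  # the "package" the article sits in
        "authors": [
            # Each contributor record may carry an ORCID iD, linking the article to a person.
            {"name": f"{a.get('given', '')} {a.get('family', '')}".strip(),
             "orcid": a.get("ORCID")}
            for a in msg.get("author", [])
        ],
    }

def orcid_works(orcid_id):
    """List titles of works attached to a researcher's public ORCID record."""
    resp = requests.get(
        f"https://pub.orcid.org/v3.0/{orcid_id}/works",
        headers={"Accept": "application/json"},
        timeout=30,
    )
    resp.raise_for_status()
    titles = []
    for group in resp.json().get("group", []):
        for summary in group.get("work-summary", []):
            title = ((summary.get("title") or {}).get("title") or {}).get("value")
            if title:
                titles.append(title)
    return titles

if __name__ == "__main__":
    # Illustrative identifiers only; substitute real ones to run against the live APIs.
    print(crossref_metadata("10.1234/example.doi"))
    print(orcid_works("0000-0002-1825-0097"))
```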
Data aggregation
The second challenge comes from the need to aggregate data at these more granular levels. Even if there were a willingness to share data among providers (itself a big challenge) and a simple, standardized format existed for aggregating those data (another significant problem), how can an analyst be assured of extracting all, and only, the relevant information related to the researcher in question? Aggregation of impact at the publication level serves publishers and libraries and can be a proxy of quality, but it is at best an imperfect measure of the component articles, and it is article-level measurement that is needed to assess a particular researcher or author. Many of the next generation of metrics initiatives, such as PIRUS, ImpactStory, Altmetric.com, and Plum Analytics, depend heavily on identifiers such as the DOI and ORCID to accurately aggregate their data.
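As an illustration of why shared identifiers make this aggregation tractable, the sketch below merges usage records from three hypothetical providers. The provider names, schemas, and counts are invented for illustration; the only assumption that matters is that every record carries a DOI and an ORCID iD, so totals can be rolled up per article and per researcher without fuzzy name matching.

```python
from collections import defaultdict

# Hypothetical exports from three providers. In practice each provider has its
# own schema; the point is that every record carries a DOI and an ORCID iD.
publisher_usage = [
    {"doi": "10.1234/abcd.1", "orcid": "0000-0001-0000-0001", "downloads": 420},
    {"doi": "10.1234/abcd.2", "orcid": "0000-0001-0000-0001", "downloads": 75},
]
repository_usage = [
    {"doi": "10.1234/abcd.1", "orcid": "0000-0001-0000-0001", "downloads": 130},
]
aggregator_usage = [
    {"doi": "10.1234/abcd.2", "orcid": "0000-0001-0000-0001", "downloads": 12},
]

def aggregate(*sources):
    """Roll usage up by article (DOI) and by researcher (ORCID iD)."""
    by_doi = defaultdict(int)
    by_orcid = defaultdict(int)
    for source in sources:
        for record in source:
            by_doi[record["doi"]] += record["downloads"]
            by_orcid[record["orcid"]] += record["downloads"]
    return by_doi, by_orcid

by_doi, by_orcid = aggregate(publisher_usage, repository_usage, aggregator_usage)
print(dict(by_doi))    # article-level totals across providers
print(dict(by_orcid))  # researcher-level totals across providers
```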
Linking scholarship with a network of persistent identifiers
As ORCID, DOI, institutional identifiers, and FundRef are adopted throughout the scholarly community and each item of scholarship can be linked into a network of persistent unique identifiers, the problems of aggregation and granularity become more tractable. Aggregating information on a specific researcher’s output becomes possible across all of the systems that maintain content. Similarly, all of the content published by the authors at a given institution or funded by a particular grant can be extracted more easily and more accurately from large collections. Even non-traditional content and non-traditional media can be linked and their usage aggregated if the metadata associated with those objects contain persistent identifiers. There has been notable work, led by ICSTI and CODATA, to advance the citation of data sets, which eventually developed into the Joint Principles on Data Citation. Adoption of these principles is ongoing, but the approach could be expanded to cover a range of other non-traditional outputs.
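In practice, such linked extraction can be approximated with a single query once the identifiers are embedded in the metadata. The sketch below uses the Crossref REST API with filters keyed by ORCID iD and funder identifier; the filter names and the example identifiers are assumptions based on the publicly documented API and should be checked before relying on them.

```python
import requests

CROSSREF = "https://api.crossref.org/works"

def works_by_orcid(orcid_id, rows=20):
    """Fetch DOIs of works whose metadata links a given ORCID iD (filter name assumed)."""
    resp = requests.get(CROSSREF, params={"filter": f"orcid:{orcid_id}", "rows": rows}, timeout=30)
    resp.raise_for_status()
    return [item.get("DOI") for item in resp.json()["message"]["items"]]

def works_by_funder(funder_id, rows=20):
    """Fetch DOIs of works whose metadata names a given funder identifier (filter name assumed)."""
    resp = requests.get(CROSSREF, params={"filter": f"funder:{funder_id}", "rows": rows}, timeout=30)
    resp.raise_for_status()
    return [item.get("DOI") for item in resp.json()["message"]["items"]]

# Illustrative identifiers: a public ORCID test record and an Open Funder
# Registry-style identifier, used here purely as placeholders.
print(works_by_orcid("0000-0002-1825-0097"))
print(works_by_funder("10.13039/100000001"))
```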
There is little doubt that the trend toward greater interest in and application of metrics in scholarly assessment will continue to accelerate. There is simply too much demand from funders, administrators, legislators, and taxpayers to justify the investments made in research and education. Researchers and their institutions should be interested in making impact metrics, and the systems that underlie them, as robust as possible to ensure that all the credit due them is properly attributed and counted.
Taking action in the research community
This expanded interest in metrics is one of the reasons that NISO has undertaken a project to establish standards and best practices related to new forms of assessment. The project, supported by a grant from the Alfred P. Sloan Foundation, seeks to develop consensus around altmetrics definitions, use cases, approaches to the assessment of non-traditional scholarly outputs, the technical infrastructure for altmetrics data exchange, and the adoption of persistent identifiers. Working groups to address these five themes are now forming.
Researchers can do their part to foster this environment by registering with ORCID, using their ID on all the content they produce, and insisting that publishers and repositories collect and publish IDs on articles and other scholarly resources. We can collectively work toward interoperability, data exchange, validity, and trust, but first the data needs to be available to be gathered. That starts with all of us: researchers, funders, universities, and publishers. If the ID isn’t submitted or collected, the data cannot be aggregated or disambiguated, and the impact of our work cannot be tracked and analyzed. It is in your interest as an author to have the credit tracked back to you.