Most principal investigators understand that funders now require publications to be freely available. The open-access mandate — driven by the 2022 OSTP Nelson memo and implemented across every US federal agency by the end of 2025 — is by now well understood. You deposit your accepted manuscript in PubMed Central or NSF-PAR. The paywall comes down. Box ticked.
What is less well understood is that "freely available" and "findable" are different things. And increasingly, it is findability — not just access — that funders are asking about, evaluating, and in some cases scoring.
Open access solves the paywall problem: can someone read this paper without paying? Findability solves the discovery problem: can someone — or something, like a search engine or an AI citation system — locate this paper in the first place? A paper deposited in PubMed Central with a garbled author name, no ORCID link, and an abstract that buries the key finding in the fourth sentence is technically open access. It is not, in any practical sense, findable.
This post maps the landscape of findability requirements as they stand in April 2026. The aim is not to alarm anyone — most of these requirements can be met in an afternoon — but to make the obligations visible, because many PIs are satisfying the access mandate while unknowingly falling short on the findability criteria that sit alongside it.
What "findable" actually means in FAIR
The word comes from the FAIR data principles, published in Scientific Data in 2016 and now referenced by virtually every major research funder. FAIR stands for Findable, Accessible, Interoperable, and Reusable. The "F" has four sub-principles:
- F1. (Meta)data are assigned a globally unique and persistent identifier — a DOI, an ORCID, a ROR ID.
- F2. Data are described with rich metadata — not just a title and author, but machine-readable descriptions that a search engine or database can parse.
- F3. Metadata clearly and explicitly include the identifier of the data they describe.
- F4. (Meta)data are registered or indexed in a searchable resource.
Note what this is and is not. It is not about whether someone can read the paper. It is about whether a system — Google Scholar, PubMed, Scopus, an AI retrieval engine — can locate the paper, correctly attribute it to its authors, and link it to the grant that funded it. These are infrastructure problems, not access problems. And they have infrastructure solutions.
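To make F1 through F3 concrete, here is a minimal sketch of what such a machine-readable record can look like, using schema.org's ScholarlyArticle vocabulary as one plausible encoding (repositories and indexers differ in the exact fields they expect). Every identifier in it is a placeholder, not a real DOI, ORCID, or ROR ID.

```python
import json

# Illustrative machine-readable description of a paper (schema.org ScholarlyArticle).
# All identifier values are placeholders.
record = {
    "@context": "https://schema.org",
    "@type": "ScholarlyArticle",
    "name": "Example title of the paper",                      # F2: rich metadata
    "identifier": "https://doi.org/10.xxxx/example",           # F1 + F3: persistent ID inside the metadata
    "datePublished": "2026-01-15",
    "author": [
        {
            "@type": "Person",
            "name": "Jane Q. Researcher",
            "sameAs": "https://orcid.org/0000-0000-0000-0000",  # F1: persistent identifier for the person
            "affiliation": {
                "@type": "Organization",
                "name": "Example University",
                "sameAs": "https://ror.org/00example00",        # F1: persistent identifier for the institution
            },
        }
    ],
    "abstract": "One-sentence statement of the core finding ...",
}

# F4 is satisfied not by the record itself but by its being harvested into a
# searchable index (PubMed, Google Scholar, OpenAlex, a repository catalogue).
print(json.dumps(record, indent=2))
```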
Which funders now require findability — and how they evaluate it
The table below covers the major US federal agencies plus two international programmes (Horizon Europe and the ERC) that frequently involve US-based researchers. Each entry focuses on the findability-specific requirements — the metadata, identifiers, and indexing obligations — not the broader open-access mandate.
| Funder / Programme | Findability requirement | How it is evaluated |
|---|---|---|
| NIH — Data Management & Sharing Plan | Plan must describe how data will be "findable and identifiable, i.e., via a persistent unique identifier or other standard indexing tools" (NOT-OD-21-014). Must address metadata standards and persistent identifiers (DOIs, ORCIDs, ROR IDs). NIH explicitly states plans should be consistent with the FAIR principles. | Reviewed by programme staff for acceptability. Can require modifications before award. Updated format effective May 25, 2026 (NOT-OD-26-046). |
| NIH — ORCID mandate | From January 25, 2026, all senior/key personnel must have an ORCID iD linked to their eRA Commons profile (NOT-OD-26-018). ORCID is a persistent identifier — the backbone of researcher findability across systems. | Hard gate. Applications without linked ORCIDs will not be accepted. |
| NSF — Data Management & Sharing Plan | Must describe how outputs will be "findable and accessible to the community in a form that links the data to adequate annotation." Individual directorates and divisions (e.g., DMR, ENG, STEM Education) have published their own DMSP guidance. | Scored. Reviewed as integral part of proposal under Intellectual Merit or Broader Impacts or both. New Research.gov tool releasing April 27, 2026. |
| DOE — OSTI metadata deposit | Researchers must submit full-text accepted manuscript plus associated metadata to OSTI — title, authors, publication date, DOI, funding information. New E-Link 2.0 (2025) built to better support persistent identifiers. | Metadata completeness checked at deposit. Affects discoverability within DOE PAGES. |
| USDA — PubAg deposit | All peer-reviewed publications to PubAg with associated metadata. Embargo eliminated as of December 31, 2025. | Compliance monitored by agency. |
| IES (Dept of Education) | Full-text to ERIC immediately upon publication for awards after October 1, 2024. Shifted from "data management plan" to "data sharing and management plan" (DSMP) — the language change foregrounds discoverability. | Reviewed by programme staff during FY2025–26 rollout. |
| AHRQ — R18 grants | Application must include a dissemination plan describing both academic and non-traditional means of reaching policymakers and clinicians. Product dissemination plan required for tools and reports. | Part of scored review. |
| Horizon Europe / ERC | Metadata must be "machine-actionable" — not just human-readable. Must follow standardised formats. Must include title, authors, date, venue, grant number, licensing. Metadata deposited under CC0 and must be findable independently of data access restrictions. | Compliance checked during project reporting. Applies to US-based co-PIs on Horizon grants. |
Two patterns are visible. First, every agency now names persistent identifiers and metadata as explicit requirements — not suggestions, not best practices, but items that are checked or scored. Second, the NSF model, where the data management plan is scored as part of the proposal itself, is the direction of travel. The NIH model, where programme staff review the plan separately, is widely expected to move toward reviewer-facing evaluation in future cycles.
The ORCID mandate deserves special attention
Of all the changes in 2025–2026, the NIH ORCID requirement (NOT-OD-26-018) is the one with the sharpest teeth. From January 25, 2026, every senior or key person listed on an NIH application must have an ORCID iD linked to their eRA Commons profile. This is not a recommendation. It is a hard gate: no linked ORCID, no application accepted.
ORCID is a persistent identifier for people, in the same way a DOI is a persistent identifier for objects. It solves the problem of author disambiguation — distinguishing between the hundreds of "J. Wang" or "S. Patel" entries in PubMed — and it allows automated systems to link a researcher's publications, datasets, grants, and affiliations into a single identity graph.
The practical consequence is that if your ORCID profile is incomplete — if it links to some of your publications but not others, if it lists an outdated affiliation, if it is not connected to your preprints — you are visible as a partial entity. The system knows you exist, but it cannot confidently attribute all your work to you. This matters for citation counts, for Google Scholar profile accuracy, and for the automated matching that funders increasingly use to track the outputs of their investments.
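A quick way to audit this is to pull your public ORCID record and compare the works attached to it against your own publication list. The sketch below queries ORCID's public API; the iD is a placeholder, and the response parsing reflects my reading of the API's JSON structure, so verify it against ORCID's documentation before relying on it.

```python
import requests

ORCID_ID = "0000-0000-0000-0000"  # placeholder; substitute your own iD

# ORCID's public API returns the works attached to an iD as JSON.
resp = requests.get(
    f"https://pub.orcid.org/v3.0/{ORCID_ID}/works",
    headers={"Accept": "application/json"},
    timeout=30,
)
resp.raise_for_status()
groups = resp.json().get("group", [])

print(f"Works attached to {ORCID_ID}: {len(groups)}")
for group in groups:
    for summary in group.get("work-summary", []):
        title = summary.get("title", {}).get("title", {}).get("value", "(no title)")
        print(" -", title)
```

If the count is lower than the number of papers you know you have published, the missing items are exactly the "partial entity" problem described above.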
Where repositories stop and findability problems begin
There is a common assumption that depositing a paper in PubMed Central or NSF-PAR satisfies the findability requirement. The deposit does satisfy the access mandate. It does not, by itself, solve every findability problem.
Consider what happens when a preprint goes up on bioRxiv. The server generates a landing page with the title, authors, abstract, and a DOI. Google Scholar's crawler visits the page and attempts to parse the bibliographic data. The Google Scholar inclusion guidelines are explicit about what can go wrong: "The most common cause of indexing problems is incorrect extraction of bibliographic data by the automated parser software." Mismatched author names, missing citation_title meta tags, abstracts that are not visible on the landing page, multiple papers bundled in a single PDF — all of these cause papers to be indexed incorrectly, or not indexed at all.
The same problem propagates to AI citation systems. When Google AI Overviews or Perplexity or ChatGPT with browsing looks for a source to cite, it evaluates candidate pages by how cleanly it can extract and attribute a factual claim. A preprint landing page with sparse metadata and a discursive abstract loses that contest to a secondary source — a review article, a press release, a blog post — that presents the same finding with clearer structure. The preprint is the primary source. It is just not the findable source.
This is not a theoretical concern. An analysis of AI Overview citation sources in early 2026 found that academic journals and research papers accounted for just 0.48 percent of all cited URLs in healthcare queries — despite AI Overviews appearing in 51 percent of those searches.
What "findable" looks like in practice — a checklist
The items below are not aspirational. They are the operational translation of what funders are asking for when they invoke FAIR's findability principle. Most can be completed in a single sitting.
Persistent identifiers
- Every paper and preprint has a DOI that resolves to the correct landing page (a short spot-check script follows this list).
- Every author on the paper has an ORCID iD, and those ORCIDs are linked on the preprint server and in the journal's author metadata.
- Your ORCID profile is current: it lists your affiliation, links to your publications (including preprints), and connects to your grant IDs.
- Your institution's ROR ID appears in the metadata where supported.
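The first two items can be spot-checked in a few lines: follow the DOI through the public resolver, then pull the metadata registered against it. The sketch below uses a placeholder DOI and assumes it is registered with Crossref (DataCite-registered DOIs would need the DataCite API instead).

```python
import requests

DOI = "10.xxxx/example"  # placeholder; replace with a real DOI

# 1. Does the DOI resolve? doi.org answers with a redirect to the landing page.
resolve = requests.head(f"https://doi.org/{DOI}", allow_redirects=True, timeout=30)
print("Resolves to:", resolve.url, "| status:", resolve.status_code)

# 2. What metadata is registered against it? (Crossref-registered DOIs only.)
meta = requests.get(f"https://api.crossref.org/works/{DOI}", timeout=30)
if meta.ok:
    msg = meta.json()["message"]
    print("Title:", (msg.get("title") or ["(none)"])[0])
    for author in msg.get("author", []):
        # An ORCID here means the publisher passed the iD through to the registry.
        print(" -", author.get("given", ""), author.get("family", ""),
              "| ORCID:", author.get("ORCID", "missing"))
else:
    print("Not found in Crossref; check DataCite or the registering agency.")
```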
Metadata quality
- The paper's landing page contains `citation_title`, `citation_author`, and `citation_publication_date` meta tags that Google Scholar can parse (a short script for checking this follows the list).
- The abstract is visible in full on the landing page — not behind a "show more" toggle, not split across multiple pages.
- Author names are consistent across the preprint, the published version, ORCID, and Google Scholar. ("G. Kumaran" vs. "Girishkumar Kumaran" vs. "G.K. Kumaran" creates three separate entities in automated systems.)
- The funding grant number appears in the acknowledgements and, where supported, in the structured metadata.
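Checking for the meta tags does not require guesswork: they sit in the page's HTML head and can be listed programmatically. The sketch below uses a hypothetical landing-page URL and the requests and beautifulsoup4 packages; it simply reports which Highwire-style `citation_*` tags are present, which is the bibliographic data Google Scholar's parser relies on for most publisher and preprint pages.

```python
import requests
from bs4 import BeautifulSoup  # pip install requests beautifulsoup4

URL = "https://example.org/preprint/landing-page"  # placeholder landing page

html = requests.get(URL, timeout=30).text
soup = BeautifulSoup(html, "html.parser")

# Collect every <meta name="citation_..."> tag on the page.
tags = {}
for meta in soup.find_all("meta"):
    name = meta.get("name", "") or ""
    if name.startswith("citation_"):
        tags.setdefault(name, []).append(meta.get("content", ""))

# The three tags named in the checklist above.
required = ["citation_title", "citation_author", "citation_publication_date"]
for tag in required:
    status = "OK" if tag in tags else "MISSING"
    print(f"{tag:30s} {status}  {tags.get(tag, '')}")

print("\nAll citation_* tags found:", sorted(tags))
```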
Indexing and registration
- The paper appears in Google Scholar when you search for its exact title. (If it does not, there is a metadata problem to diagnose.)
- The paper's Google Scholar entry correctly links to the full-text version.
- The paper appears in your Google Scholar profile and is attributed to you — not to a name variant or a co-author.
- If the paper has been deposited in an institutional repository, the repository entry links to the DOI of the version of record.
Abstract structure for retrieval
- The first two sentences of the abstract state the core finding with specific, extractable claims — not background context. Retrieval systems weight the opening of the abstract disproportionately.
- The abstract contains the key terms a researcher would search for, used naturally and in the first 200 words.
- If the abstract follows a structured format (Background / Methods / Results / Conclusions), the Results section leads with numbers, not hedged interpretations.
Formal scoring frameworks already exist
For PIs who want to document their findability compliance — whether for a DMS plan, a grant progress report, or a tenure dossier — there are structured evaluation tools.
The FAIR Data Maturity Model, published by the Research Data Alliance in 2020, defines maturity indicators for each FAIR principle. The findability indicators assess whether metadata is identified by persistent identifiers, whether those identifiers match recognised GUID schemes, and whether data is registered in searchable resources. Scores are assigned on a scale from "never" to "always."
Practical tools include the ARDC FAIR Self-Assessment Tool and the Pistoia Alliance FAIR Toolkit, both of which provide structured questionnaires that produce numeric findability scores. These are designed for self-assessment, but the output is the kind of before-and-after documentation that a funder or review committee can interpret: "At the time of deposit, this output scored 3/8 on findability indicators. After metadata remediation, it scores 7/8."
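That kind of before-and-after statement is simple to generate yourself. Here is a minimal sketch of the idea, using the checklist items from earlier in this post as hypothetical indicators; the names and the pass/fail values are illustrative, not the RDA's official indicator set.

```python
# Illustrative findability scorecard. The indicators mirror the checklist above;
# they are NOT the official RDA FAIR Data Maturity Model indicators.
indicators = {
    "DOI resolves to correct landing page": True,
    "All authors have linked ORCID iDs": True,
    "ORCID profiles list this paper": False,
    "ROR ID present in metadata": False,
    "citation_* meta tags present on landing page": True,
    "Abstract fully visible on landing page": True,
    "Indexed in Google Scholar under exact title": False,
    "Grant number in structured metadata": False,
}

passed = sum(indicators.values())
print(f"Findability indicators passed: {passed}/{len(indicators)}")
for name, ok in indicators.items():
    print(f"  [{'x' if ok else ' '}] {name}")
```

Run once at deposit and once after remediation, the two scores become the documentation a DMS plan update or progress report can cite.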
No funder currently requires a FAIR maturity score in a grant report. But the infrastructure for one exists, and the direction of travel — especially given the NSF model of scoring the DMSP as part of proposal review — suggests that quantitative findability assessment is a matter of when, not whether.
The tenure and promotion dimension
Outside the grant compliance context, findability has a quieter but equally consequential role in academic careers. A growing number of tenure and promotion committees now ask for citation metrics — h-index, i10-index, total citation counts — as part of the dossier. The most commonly used source for these metrics is Google Scholar, because it is free, self-updating, and covers preprints and conference papers that Scopus and Web of Science miss.
The dependency is circular: your Google Scholar citation count is only as complete as Google Scholar's ability to find and correctly attribute your papers. If a paper is not indexed — because the meta tags are wrong, or the preprint server rendered the landing page in a way the parser cannot read — it does not accumulate citations in the system your tenure committee is looking at. If your name appears as three different variants across your publications, your h-index is fragmented across three partial profiles.
This is the same findability problem described above, with a different stakeholder. The funder asks: can we track the outputs of our investment? The tenure committee asks: what is this researcher's impact? Both questions are answered by systems that depend on metadata quality and persistent identifiers. Both give incomplete answers when the findability layer is broken.
What this means for how you think about your papers
The shift is subtle but worth naming. For the past decade, the dominant question a PI asked about a publication was: where did I publish it? The journal's prestige, its impact factor, its readership — these were the levers of visibility. The paper went into the journal, and the journal handled everything else.
That model assumed readers found papers through journals. It was largely true in 2015. It is decreasingly true in 2026. Google Scholar, PubMed, AI-powered search tools, and institutional discovery systems now route a substantial fraction of readers to papers without passing through a journal's front page. The metadata on the paper itself — its title, its abstract, its author identifiers, its structured tags — is what these systems use to decide whether to surface it.
The practical consequence is that a paper's findability is no longer something the journal takes care of. It is something the authors have direct influence over, and increasingly, something funders expect them to manage. The FAIR findability principles are not an abstract framework. They are a description of what your DMS plan is supposed to address, what your ORCID profile is supposed to enable, and what your abstract and metadata are supposed to make possible.
None of this requires a budget. It requires attention — to the same metadata and identifiers that your funder is already asking you to plan for.
Frequently asked questions
What does "findable" mean in FAIR?
It has a specific technical definition: research outputs must be assigned globally unique and persistent identifiers (DOIs, ORCIDs), described with rich machine-readable metadata, and registered in searchable resources. A paper can be openly accessible but effectively invisible if its metadata is incomplete or its landing page cannot be parsed by search engines.
Does NIH require my research to be findable?
Yes. The NIH Data Management and Sharing Policy requires all applications generating scientific data to include a plan describing how data will be "findable and identifiable, i.e., via a persistent unique identifier or other standard indexing tools." From January 25, 2026, all senior/key personnel must also have a linked ORCID — a hard requirement without which applications will not be accepted.
Is the NSF data management plan scored by reviewers?
Yes. NSF reviews the DMSP as an integral part of the proposal, evaluated under Intellectual Merit or Broader Impacts or both. NSF's guidance states data should be "findable and accessible to the community in a form that links the data to adequate annotation." A weak plan can directly affect your proposal score.
Can a paper be open access but not findable?
Absolutely. A paper in PubMed Central satisfies the access mandate. But if the preprint version has mismatched author names, missing citation meta tags, or an abstract that buries the key findings, it will rank poorly in Google Scholar and be overlooked by AI citation systems. Access and findability are different problems with different solutions.
Are there formal scoring frameworks for findability?
Yes. The FAIR Data Maturity Model (Research Data Alliance, 2020) provides structured indicators with numeric scores. Tools like the ARDC FAIR Self-Assessment Tool and the Pistoia Alliance FAIR Toolkit provide automated scoring. These can produce before-and-after documentation suitable for a grant progress report or DMS plan update.
Want to know your paper's findability score?
Our 115-point audit includes a findability assessment against FAIR indicators — metadata quality, persistent identifier coverage, Google Scholar indexing status, and structured data completeness — with a before-and-after report you can attach to your DMS plan or grant progress report.
Submit your paper →