
About the data

By Euan
29 articles

About the SDG categories in Overton

*Describes how policy documents are linked to different SDGs.*

*Please note: this functionality was updated and improved in March 2025. If you would like more information on how we previously linked policy documents to SDGs, there is more information here.*

As well as topics and subject areas, Overton tries to map policy documents to one or more Sustainable Development Goals (SDGs), a set of 17 goals set up by the United Nations to serve as a framework for global development. The SDGs are often used as a quick way to group policy or research relating to a particular problem area, e.g. climate change, poverty or gender inequality. Each SDG is accompanied by targets which provide specific pathways to achieving the overall goal of a fairer and more sustainable world.

Overton uses a multi-label approach, which allows a single classifier to predict multiple categories at the same time. This means our classifier can assign a policy document to several SDGs and their corresponding targets at once. It takes the new document descriptions as input, uses ModernBERT (a language model known for its ability to understand the context of text), and organises the classification process hierarchically.

Each SDG is associated with a set of targets. For example, SDG 8 “Decent work and economic growth” has 12 targets (see https://en.wikipedia.org/wiki/Sustainable_Development_Goal_8), including “Diversify, innovate and upgrade for economic productivity” (target 8.2) and “Promote policies to support job creation and growing enterprises” (target 8.3). Our classifier first matches policy documents to these targets and then rolls them up to the parent SDG.

We created a large set of training data to ensure coverage of all targets, even for categories with limited data. The performance of our classifier was evaluated using precision (how often a predicted SDG or target is correct) and recall (how many of the correct SDGs and targets the classifier finds). The results were very promising.
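The roll-up from targets to parent SDGs can be sketched in a few lines. This is a minimal illustration rather than Overton's implementation: the function name `roll_up_targets`, the per-target score dictionary and the 0.5 threshold are all assumptions, and the scores simply stand in for whatever a multi-label classifier would output for each target.

```python
from collections import defaultdict


def roll_up_targets(target_scores: dict[str, float], threshold: float = 0.5) -> dict[int, list[str]]:
    """Keep targets whose score clears the threshold and group them by parent SDG.

    The threshold value is hypothetical and used only for illustration.
    """
    sdgs: dict[int, list[str]] = defaultdict(list)
    for target, score in target_scores.items():
        if score >= threshold:
            # A target ID such as "8.2" belongs to SDG 8.
            sdg_number = int(target.split(".")[0])
            sdgs[sdg_number].append(target)
    # Return a plain dict ordered by SDG number.
    return {sdg: sorted(targets) for sdg, targets in sorted(sdgs.items())}


# Made-up scores for a single policy document.
predicted = {"8.2": 0.91, "8.3": 0.77, "13.1": 0.64, "5.1": 0.12}
print(roll_up_targets(predicted))  # {8: ['8.2', '8.3'], 13: ['13.1']}
```

Because each target is scored independently, a single document can end up under several SDGs, which is the point of a multi-label setup.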

Last updated on Jun 25, 2026

About topics, entities, subject areas and COFOG

Information on the taxonomies that Overton uses to map topics, entities and subject areas to policy documents

Overton uses machine learning techniques to extract topics and entities from the full text of each policy document we index, and then tries to map them to a taxonomy to make browsing and analyzing them easier.

While Overton is generally language agnostic, these techniques are not, so only documents in a subset of languages (including English, French, Spanish, Chinese, Japanese and Russian) will have topics, entities and subject areas extracted. We use the English names for topics, subject areas and entities regardless of the source document language.

Topics

Topics are the main themes of a document. We analyze the phrases and entities used in the document and then compare them to data derived from pages on Wikipedia to find which ones have the most in common. The titles of Wikipedia pages that have a lot of overlap in language with the policy document are chosen by Overton as topics, so the set of possible topics is very broad. In total, Overton currently has over 650,000 topics connected to policy documents.

Topics are intended to capture all possibilities and variants within a set of results, which can be very relevant for some types of analysis such as topic clustering.

Browse Topics

Browse Topics by using the “Explore the Data” menu at the top of any Overton page and selecting “Browse all topics.” You can browse by ‘Subject Area’ or only view ‘Overrepresented topics.’

The page also provides a Topic Map. This is a map of the topics that Overton thinks are most often associated with the documents in your results set. Topics with darker backgrounds appear more frequently than we’d expect.

Searching with Topics

You might use the Topics filter as a first step when running queries to discover a large set of policy documents on a given subject, and then apply additional filters if the analysis you need to run requires them.

Because topics are exhaustive and fluid, we developed new filters that rely on defined taxonomies and allow us to assign policy documents to multiple categories. Users can also view all filter options in the web interface and select multiple options as needed. These filters are:

- the COFOG framework filter
- the SDG filter (which offers SDG target data on policy document records, which can then be exported in CSV or Excel)

For more tips, visit our Search using Topics page.

Subject areas

We look for subject areas the same way we do topics, but documents are matched against examples from each category in the IPTC’s MediaTopics controlled vocabulary instead of against Wikipedia pages. MediaTopics are the categories used by many newspapers and magazines to organize their articles. You can see the whole taxonomy in clickable tree format here.

Entities

Entities include the people, organisations, countries, and other proper nouns mentioned in a policy document. Overton performs entity extraction on the full text of policy documents. Our system uses a mix of part-of-speech tagging (which identifies which words are verbs, nouns, and so on) and pattern matching using a large dictionary of entities sourced from Wikipedia.

Classification of the Functions of Government (COFOG) topic

COFOG is a classification system to describe the broad objectives of government. We classify the majority of our policy documents into at least one of the COFOG divisions.

Our COFOG classifier uses a multi-label approach that enables a single model to predict multiple categories simultaneously. It takes our AI-generated document descriptions as input, then applies ModernBERT (a language model known for its contextual understanding of text) to organise the classification process hierarchically.

To ensure the accuracy of our classifier, we do not currently apply the following 4 subcategories to any policy documents:

- Basic research
- Transfers of a general character between different levels of government
- Economic affairs n.e.c.
- R&D Housing and community amenities

Last updated on Jun 25, 2026

Documents in different languages

What languages does Overton support?

Overton is largely language agnostic

Overton is largely language agnostic (with some caveats, see below) and documents are indexed, analyzed and made available for search regardless of the language or alphabet that they’re written in. To make browsing easier, document titles (data in the title field) are translated into English in the web application and API.

That said, there are three main caveats.

Quality of reference extraction when citing local policy sources

Overton breaks up documents into paragraphs and analyzes each in turn, using a heuristic-based approach to decide if it contains a valid reference. There are dozens of different heuristics, but some of them rely on matching keywords, identifying dates or otherwise spotting common referencing conventions. These heuristics work best on Western-style references, so it’s possible that Overton will miss some references in non-Western documents if they are citing local sources (esp. Chinese, Japanese & Arabic).

Translated references

When linking references to scholarly articles Overton often needs at least part of the article title to confirm a match. This can be a problem when article titles get translated, e.g. if a policy document is in Japanese and the authors have also translated the title of any cited English language scholarly articles into Japanese. That Japanese article title won’t match any scholarly articles in our publications database and so is ignored. In these cases Overton will miss matches that can’t be confirmed by other means (like a DOI or a link to the publisher website).

Topics and classification

Overton uses machine learning techniques to identify the key topics in each document and to assign documents broad categories (“Health”, “Education”, “Crime, Law and Justice” etc.). The algorithms we use support these languages:

- English
- Chinese
- Dutch
- French
- German
- Italian
- Japanese
- Polish
- Portuguese
- Russian
- Spanish
- Swedish
- Finnish
- Danish
- Norwegian

Last updated on Jun 25, 2026

Funding data in Overton

Overton gets funding data from OpenAlex

Overton allows you to see where scholarly articles funded by a given organisation are being cited in policy. But how do we know which articles are funded by whom?

Accurate information about who has funded what research is hard to come by in the scholarly world. While we previously got funding information from several sources, we now use OpenAlex, the same trusted dataset we already use for other publication metadata. OpenAlex uses stricter matching criteria and only includes funding data that can be reliably linked to publications. Changing our funding data source has made our funding data cleaner, more consistent, and easier to search across the platform.

What this means for funder organisations

Funding data is notoriously hard to track and not consistent across our different sources. Research doesn’t always reference its funding sources, or these aren’t available as part of the article metadata. Some sources require researcher input (like GTR), which can also be inconsistent.

We recommend using the DOI search to query articles that you know are funded by your organisation. We also recommend reviewing your funding data on OpenAlex. If the data in OpenAlex could be improved, you can work directly with OpenAlex on improving the quality and depth of your funder data (which would then be reflected in Overton).

For funding organisations that subscribe to Overton, there is an option of Overton creating a custom filter for your data in the platform. If you want to learn more, reach out to [email protected].

Last updated on Jun 25, 2026

How Overton collects and displays institution data

Overview

Overton analyses millions of policy documents to identify when they cite or mention research from academic and research institutions. To do this accurately, we need comprehensive, up-to-date information about research organisations worldwide.

What is ROR?

ROR (Research Organization Registry) is an open, community-led registry that provides unique identifiers for research organizations globally. It’s maintained collaboratively by a consortium of organizations committed to keeping research infrastructure open and sustainable. ROR includes:

- Universities and research institutes
- Government research agencies
- Healthcare and medical research organizations
- Related entities like departments, centers, and subsidiaries

Why we migrated from GRID to ROR

Previously, Overton used GRID (Global Research Identifier Database) to identify research institutions. However, GRID stopped receiving updates in Q4 2021, which meant:

- Institution names were increasingly out of date
- New organisations weren’t being added
- Organisational changes (mergers, renamings, restructures) weren’t reflected
- Related organisations were harder to track accurately

In 2026, we migrated to ROR because:

✓ Actively maintained: ROR is continuously updated with new institutions and changes
✓ More comprehensive: better coverage of institutional relationships and hierarchies
✓ Industry standard: increasingly adopted by funders, publishers, and research systems
✓ Community-led: open, transparent governance ensures long-term sustainability
✓ Better data quality: reduces the need for manual corrections and data cleanup

What changed for users?

New filters

As part of this change we updated some of the filters that were available as part of our document searching. We now allow users to filter documents by whether they:

- Cite a document published by a researcher who is affiliated with the research organisation
- Mention a researcher who is affiliated with the research organisation
- Cite or mention a work or research from a research organisation

This gives our users more control over how they filter Overton’s data to better understand where they are having impact within policy documents.

Improved accuracy

You’ll now see:

- Current institution names (not historical names from 2021)
- Better detection of citations from related organisations
- More comprehensive coverage of institutional affiliations

Saved searches

Existing saved searches using the old filters will continue to function. However, we recommend updating them to the new filters for optimal performance:

1. Navigate to your Saved Searches
2. Look for any using the old “Cites or mentions institution” filter
3. Update these to use “Citing research institutions”, “Mentioning research institutions” or “Cites or mentions research institution”

Understanding data discrepancies

When comparing results from before and after the migration, you may notice differences. Here are the most common reasons:

- Institution name changes: Organisations may have changed their official name since 2021. Results now appear under the current name.
- Better organisational relationship tracking: ROR provides more detailed information about institutional hierarchies, allowing us to better capture citations from departments, centers, and related entities.
- Improved affiliation matching: ROR’s comprehensive data allows us to more accurately match institutional affiliations in research papers to policy citations.
- Updated organisational structures: Mergers, restructures, and organisational changes since 2021 are now reflected in the data.

In most cases, you’ll see MORE citations

The majority of users will see an increase in citations when comparing searches before and after the migration. This isn’t because we’ve changed how we analyse documents. It’s because we can now more accurately detect and attribute institutional affiliations.

Changes to Affiliation Data in the API

Every API call returns three components: the query, the facet data, and the results. The updates described below affect each of these areas differently.

Facet data

The open_affiliations facet, which returns counts of institutions cited or mentioned across a set of policy documents, has been replaced with r_open_affiliations, reflecting our migration to ROR institution data. Existing API calls that reference open_affiliations will continue to function, but will not benefit from the updated ROR data. We recommend updating these calls to use r_open_affiliations to take advantage of the improvements. New API calls made through the platform will use r_open_affiliations by default.

Documents API results

There are no changes to results returned by the documents API. Affiliated institution data is not included in document results (only the DOI is), so no action is required here.

Articles API results

The affiliation information returned for each DOI in the articles API will be updated as part of this migration. If your workflows depend on this field, please review your integration to confirm compatibility with the new data format.

Still seeing issues?

If you notice specific problems with institution data, such as missing organisations, incorrect attributions, or other data quality issues, please let us know. While ROR significantly reduces these issues, we can still manually address specific cases when needed. Contact support.
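For integrations that read affiliation facets, one way to cope with the move from open_affiliations to r_open_affiliations is to prefer the newer facet and fall back to the older one. The sketch below is hypothetical: it assumes the facet data has already been parsed into a Python dict keyed by facet name, with each facet mapping institution names to counts. That shape, the helper name `affiliation_counts` and the sample data are assumptions for illustration and are not taken from the Overton API documentation.

```python
def affiliation_counts(facets: dict) -> dict:
    """Return institution counts, preferring the ROR-based facet when present.

    `facets` is assumed (hypothetically) to map facet names to
    {institution name: count} dictionaries.
    """
    for key in ("r_open_affiliations", "open_affiliations"):
        if key in facets:
            return facets[key]
    # Neither facet was requested or returned.
    return {}


# Made-up facet data for illustration only.
new_style = {"r_open_affiliations": {"Example University": 12}}
old_style = {"open_affiliations": {"Example University": 9}}

print(affiliation_counts(new_style))
print(affiliation_counts(old_style))
```

Checking the real response format against the API documentation before relying on a helper like this is advisable, since the field layout here is only a guess.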

Last updated on Jun 25, 2026

How are scholarly references matched in policy documents?

A description of how Overton finds scholarly references in full text

Policy documents don’t always (or even often) have a clearly laid out references section or bibliography, and typically don’t stick to a single referencing style like a more academic work would. This means that Overton has to be flexible in where citations might be found, and how they might be formatted.

We do this by breaking up the full text into chunks, typically paragraphs. Then, for each one:

- We build a set of “features”: characterizations of the text. Does it contain any italics? Does it look like it contains author names? Can we spot any journal names in it, or phrases common in reference strings (et al., op. cit. …)?
- The features are scored individually and then those scores are summed
- If the total score is higher than a specific threshold we try to identify and extract the different parts of the reference string: the source, the title, the year and so on
- We use these to search Crossref for scholarly works, or the Overton database itself for policy documents. We score the relevant search results from either system by similarity with the original paragraph, and if the similarity score is over a given threshold then we consider it a match

By adjusting the different score thresholds we’re able to control the **precision** (how accurate matches are, in aggregate) and **recall** (how many of the references that are actually present we manage to match) of the system.

There’s a delicate balance between these two characteristics. Typically we can either be extremely accurate but miss more references, or match all references at the expense of getting matches wrong more often. Overton errs on the side of accuracy, so we’re more likely to miss a reference than to get it wrong.

We target a minimum accuracy of >= 98% and minimum recall of >= 80% for scholarly documents across the entire database. In practice these numbers vary based on each source’s citation style and norms, and the observed recall is much higher (>= 95%) for most English language policy sources citing journal articles.

In general Overton performs better than systems like Altmetric when references are less formal, e.g. references don’t list volume and issue numbers, the title or authors are misspelled, or the citation is part of a sentence (“see Alice Smith in the Journal of X”).

Some types of references pose issues:

- Scholarly papers not indexed by Crossref: these cannot currently be matched by Overton, unless they are indexed separately as policy documents (e.g. because they were authored by a think tank or IGO)
- Scholarly papers in languages other than English: these make up a relatively small minority of scholarly papers indexed by Crossref, which makes similarity scoring much harder for various reasons. To remain accurate we use higher thresholds for non-English reference paragraphs, but this means that we miss more papers.
- Papers belonging to a series: occasionally a series of policy documents will be published with the same title, for example “Quarterly Report”. Overton uses authors and publication years in these cases to differentiate possible matches, but sometimes citations will accrue to the wrong version of the document.
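The feature-scoring step described in this article (score each feature, sum the scores, compare against a threshold) can be sketched as below. This is an illustrative toy, not Overton's actual heuristics: the four regular-expression features, their weights and the threshold of 2.5 are all assumptions made up for the example.

```python
import re

# Hypothetical features and weights, chosen only for illustration.
FEATURES = {
    "has_year": (re.compile(r"\b(19|20)\d{2}\b"), 1.0),
    "has_et_al": (re.compile(r"\bet al\b", re.IGNORECASE), 1.5),
    "has_author_initial": (re.compile(r"\b[A-Z][a-z]+,? [A-Z]\."), 1.0),
    "has_volume_or_pages": (re.compile(r"\d+\s*\(\d+\)|\d+[-–]\d+"), 1.0),
}

# Hypothetical cut-off: paragraphs scoring at or above it are treated as references.
THRESHOLD = 2.5


def reference_score(paragraph: str) -> float:
    """Sum the weights of every feature whose pattern appears in the paragraph."""
    return sum(weight for pattern, weight in FEATURES.values() if pattern.search(paragraph))


def looks_like_reference(paragraph: str) -> bool:
    """Decide whether a paragraph is worth parsing as a reference string."""
    return reference_score(paragraph) >= THRESHOLD


ref = "Smith, A. and Jones, B. et al. (2019). Policy impact of research. Journal of X, 12(3), 45-67."
prose = "The committee met in 2019 to discuss housing policy."

print(reference_score(ref), looks_like_reference(ref))      # 4.5 True
print(reference_score(prose), looks_like_reference(prose))  # 1.0 False
```

Raising `THRESHOLD` in a sketch like this trades recall for precision: fewer ordinary paragraphs are mistaken for references, but more informal references fall below the cut-off.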

Last updated on Jun 25, 2026

How does Overton classify policy sources?

Overton Index uses a comprehensive policy source taxonomy to classify policy sources. Our taxonomy system comprises 3 levels: sector, organisation type and function. Each source is categorised across all 3 tiers.

With a database like Overton, we need a taxonomy which accounts for the characteristics of our diverse policy sources, in order to fully support our users with their analyses. It must reflect how governments and other public bodies are structured in different countries, as well as scales of governance (local, regional, national, international). It must adequately reflect the nature of a policy source in a way that is understandable across languages and country contexts.

Where to find our policy source classification data

Users will be able to interact with the policy source taxonomy primarily via the filters, which allow you to refine your results based on characteristics of the policy source. Additionally, users will find taxonomy data in our exports.

How do we classify policy sources?

1. Policy source sector: Our primary sectoral classification indicates which sector(s) our sources are classed as: public, private or third sector
2. Policy source organisation: Our secondary classification system categorises our sources on the basis of whether they’re a government, think tank, NGO or IGO, in a similar way to our existing system.
3. Policy source function: Our tertiary classification system is intended to act as a series of broad descriptors, which describe the core functions of or services provided by the organisation. This is our biggest change, and perhaps the one we hope will be most useful in terms of ‘drilling down’ into sources. There may be more than one classification applied at this level, to offer a more nuanced and detailed idea of what a source is or does.
For example, an independent government-funded research centre may now appear in the database as:

| Sector | Organisation | Function |
| --- | --- | --- |
| Public Sector | Government | Arm’s Length Body, National Body, Research Centre |

Nuances of the classification system

The taxonomy has been designed to achieve a balance between being ‘globally agnostic’ and describing what the source is, what it does and its position within the governance system. At present, Index hosts more than 2,900 sources from 193 countries.

We encourage users to familiarise themselves with the definitions and data model to understand the nuances within the system, such as:

- University think tanks, research centres and so on are classed as third sector think tanks because they tend to be non-profit and mission-oriented.
- Not-for-profits, foundations, NGOs and the like are classed as third sector, even when self-described as a ‘private foundation’ or privately funded.
- Non-statutory bodies such as independent watchdogs and regulators are classed as third sector, where these operate independently and are not a formal part of government itself.
- Private sector organisations are specifically commercial or for-profit organisations which are not part of government.

Definitions

Source sector

| Sector | Definition |
| --- | --- |
| Public Sector | A public sector organisation is a government-operated or funded entity responsible for delivering essential public services and administering public resources, serving the needs of society and citizens. |
| Private Sector | A private sector organisation is a profit-driven entity owned by individuals or corporations, operating independently in competitive markets. |
| Third Sector | A third sector organisation, part of what is also known as the non-profit sector, is a non-governmental entity dedicated to social or charitable missions, funded by donations, grants, and volunteers. |

Source organisation type

| Organisation Type | Definition |
| --- | --- |
| Think Tank | Independent research organization providing policy analysis or seeking to influence public discourse and decision-making through the dissemination of research outputs and recommendations. |
| NGO | Nonprofit entity working on social, humanitarian, or environmental issues, often reliant on donations and volunteers. |
| Government | Administrative body responsible for governing a specific territory, enacting and enforcing laws, and providing public services. |
| IGO | Multinational entity formed by governments to address global issues and promote cooperation, such as the United Nations. |
| Legislative Body | Lawmaking institution responsible for crafting and enacting laws. |
| Judicial Body | A judicial body refers to an institution within the policy-making framework that interprets, enforces, and applies laws through rulings and judgments in legal disputes, ensuring the legal system is upheld. |

Source function

| Source Function | Definition |
| --- | --- |
| International Body | Authority of more than one nation responsible for law making, governance, and public service delivery within its jurisdiction. |
| National Body | The central authority of a nation responsible for law making, governance, and public service delivery within its jurisdiction. |
| Regional or State Level Body | A government entity that administers a specific geographic region within a larger nation, with authority over regional policies and services. |
| Municipal Body | Government at the municipal or community level, responsible for local governance, services, and infrastructure. |
| Mixed Roles | Refers to ‘aggregator’ style policy sources or organisations whereby the source holds multiple ‘child’ organisations within its indexed documents, for example large government websites housing documents published by several ministerial departments, agencies and committees. |
| Government Department | Administrative unit within government responsible for specific policy areas and services. |
| IGO Department or Agency | Administrative unit within government responsible for specific policy areas and services. |
| Arm’s Length Body | External government entity with specialised functions, often independent of direct ministerial control. |
| Government Agency | Administrative entity within the policy-making structure that is responsible for implementing, regulating, and enforcing specific laws, policies, or public programs as directed by legislative or executive authorities. |
| Public Service | Organization responsible for the delivery of essential services, such as healthcare or transportation, funded and operated by the government for the benefit of the public. |
| Financial Institution or Bank | A financial institution or bank is an entity that manages financial transactions, advises on fiscal/monetary policy and works within the economic system by facilitating capital flow and providing financial services. |
| Top-Level Authority | Refers primarily to organisations with a specialised, international mandate that coordinate efforts, set standards and/or provide expertise on global issues such as public health, environmental conservation or humanitarian aid. |
| Parliament, Senate or Congress | Legislative body responsible for creating and passing laws and representing citizens in governance; alternatively, the upper house of a bicameral legislature, often representing regions or states or equivalent. |
| Professional Network, Association, Union or Cooperative | A collective organisation formed by professionals or workers to advance common interests, standards, or rights. |
| Cultural Institution | Refers to any organisation that preserves, promotes, and disseminates cultural heritage, arts, and knowledge, such as museums, libraries, theatres, and galleries. |
| Healthcare Service, Body or Agency | Organisations providing medical care, health management, and public health services. |
| Court | Legal institution responsible for adjudicating disputes and administering justice. |
| Research Centre | An institution dedicated to systematic study, investigation, and analysis of various subjects, generating knowledge and expertise. |
| Auditor | An organisation responsible for examining and verifying financial records, practices, or compliance with regulations to ensure transparency and accountability. |
| Archive | Historical archive of data or documents. |
| Hansard or Legislative Transcripts | Official record of parliamentary proceedings, preserving debates, speeches, and decisions for historical reference. |
| Public Data Body or Statistics | Organisation responsible for collecting, managing, and providing access to public data, promoting transparency and informed decision-making. |
| Armed Forces | The armed forces are a nation’s military organisations responsible for defence, national security, and the execution of military operations, both domestically and internationally. |
| Committee | A committee is a source or organisation, often within a legislative or government context, tasked with evaluating specific issues, developing recommendations, or overseeing certain functions to inform decision-making processes. |
| Initiative, Programme or Project | A planned effort with specific goals and actions aimed at addressing a particular issue or achieving a set of objectives. |
| Food and Drug Safety | Food and drug safety refers to the regulatory framework and institutions responsible for ensuring that food products and pharmaceuticals are safe, effective, and compliant with health standards before reaching consumers. |
| Central Bank | National financial institution managing currency, monetary policy, and economic stability. |
| Development Bank | Financial institution focused on funding and supporting projects and initiatives aimed at economic and social development. |
| Research Council | A research council is a public or independent organization that funds, supports, and promotes scientific research and innovation in various fields to advance knowledge and inform policy or societal development. |
| Learned Society | Membership organisation of experts and scholars in a specific field, promoting knowledge sharing and research. |
| Citizen’s Assembly | Deliberative forum where citizens discuss and make recommendations on public policy. |
| Commission | A commission is a formal body established by a government, law or other organization to investigate, regulate, or oversee specific issues, often providing recommendations or decisions on policy matters. |
| Foundation or Charity | Nonprofit organisation with a mission to support causes, often through philanthropy. |
| Monitoring or Regulatory Body | An independent authority overseeing and enforcing rules and regulations within a specific sector or industry. |
| Consultancy | Professional service providing expertise and advice in various fields. |
| Religious Organisation | Source whose activities include promoting religious values, conducting research, and providing guidance on social, ethical, and policy issues from a faith-based perspective. |
| Policy Centre | A policy centre is an organisation focused on researching, analysing, and developing recommendations on public policy issues to inform decision-makers and influence legislative or governmental outcomes. |
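The three-tier taxonomy (one sector, one organisation type, one or more functions) maps naturally onto a small record type. The sketch below is only a way to picture the data model when working with exported data: the class name `SourceClassification`, its field names and the helper `has_function` are assumptions, not the column names used in Overton exports.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceClassification:
    """Hypothetical record for one policy source's taxonomy labels."""

    sector: str                 # e.g. "Public Sector"
    organisation_type: str      # e.g. "Government"
    functions: tuple[str, ...]  # one or more function labels


def has_function(source: SourceClassification, function: str) -> bool:
    """Check whether a source carries a given function label."""
    return function in source.functions


# The independent government-funded research centre example from this article.
example = SourceClassification(
    sector="Public Sector",
    organisation_type="Government",
    functions=("Arm’s Length Body", "National Body", "Research Centre"),
)

print(has_function(example, "Research Centre"))
print(has_function(example, "Court"))
```

Storing functions as a collection rather than a single value reflects the point that more than one classification may apply at the function level.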

Last updated on Jun 25, 2026

How does Overton find citation contexts?

When Overton finds a reference to a scholarly output in a policy document we try to show you where in the text it is being used. We do something very similar for citations to other policy documents. We call these citation contexts.

Citation contexts in more detail

The **citation** itself is the bit at the bottom of the example image. That’s the part that we’ve used to figure out which paper is being cited: in this case it’s one by Boldrin and Levine on *What’s Intellectual Property Good for?* We tell you where in the document we’ve found it (in this case page 10) so that you can go check the PDF for yourself.

The citation contexts are where that citation is used in the main body of the policy document:

[Image: example citation context]

Sometimes there aren’t any citation contexts

Citation contexts are only extracted when Overton finds a clear bibliography section in a policy document. Often policy documents lack these: the citation is the context. For example, many policy briefs will link directly to scholarly articles with hyperlinks in the main body of the text rather than splitting anything out at the end.

At the other extreme a document might just be one big bibliography, for example an appendix showing which papers were considered during a literature review.

It’s also not unusual for policy documents to have bibliography sections but then not refer to them in the text, or to forgo bibliographies in favour of numbered footnotes (which the citation context finder in Overton doesn’t currently support).

Supported referencing styles

The policy world uses a variety of different referencing styles (sometimes within the same document!). Overton looks at each document and tries to figure out what the best strategy to find citation contexts might be. If we’re not finding citation contexts in a document it could be because we don’t yet support the referencing style being used within it.
We currently support:

- Superscript numbers
  - Despite evidence to the contrary⁶⁻⁹
  - We draw on previous work¹,⁵,¹² and extend it with new cutting edge techniques²
- Numbers in square brackets or parentheses
  - Despite evidence to the contrary [6-9]
  - We draw on previous work (1,5,12) and extend it with new cutting edge techniques (2)
- Citation shortcodes in parentheses
  - Despite evidence to the contrary (Smith et al., Franklin)
  - We draw on previous work (Linz 2002, Rosalind et al 2020) and extend it (World Bank, 2018)
- Narrative cues. These rely on us finding specific phrasing, usually used in table captions, quotes and footnotes.
  - Despite evidence to the contrary – see Smith 2019
  - Figure adapted from Linz et al

We *don’t* currently support:

- Numbers in the text without brackets or parentheses
  - Despite evidence to the contrary 6-9 at least 12 people insisted
  - We draw on previous work 1,5,12 and extend it with 2 new cutting edge techniques 2

Humans are good at spotting which numbers are references and which aren’t in these examples because they can understand meaning and the context each number appears in, but Overton can’t: so we don’t try to find citation contexts when policy documents use this referencing style.

Matching item numbers or shortcodes to references

Once we’ve found a link between the text and an item in a bibliography we have to match them so that we can tell which scholarly output is actually being referred to. Overton will only do this if it is certain the match is correct: it will ignore any citation contexts that are ambiguous.

This means that there are scenarios where citation contexts won’t be matched:

- If the document is using a name shortcode (e.g. Smith et al) but there are multiple bibliography entries authored by Smith then that shortcode will be ignored.
- If the document is using a name and year shortcode (Smith 2020) but there are multiple bibliography entries authored by Smith and published in 2020 then that shortcode will be ignored. The exception to this is when letters are used after years: Overton supports e.g. Smith 2020a, Smith 2020b.
- More subtly: Overton first maps bibliography entries to DOIs. Afterwards, when resolving author / year shortcodes to a cited scholarly paper, the shortcodes are matched against the “gold standard” metadata for that DOI in Crossref and *not* the year and author name in the reference string used in the paper (we do this to avoid a different set of errors that come from parsing reference strings into different parts). If the policy document author used the wrong year for a paper then it may not match. If the author names match then Overton is forgiving of a year or two in either direction, but beyond that it will ignore the relevant shortcode.
- If a citing article is a journal article it may use superscript numbers to refer to author affiliations as well as bibliography entries. Overton is quite good at spotting and removing these but occasionally real citation contexts will be caught too.

Seen a document where you think contexts should have been found? Let [email protected] know and we’ll take a closer look at it for you.
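The "only match when certain" rule for author / year shortcodes can be illustrated with a short sketch. This is not Overton's code: the `BibEntry` class, the `resolve_shortcode` function, the two-year tolerance value and the choice to try an exact year before a near year are all assumptions made for the example, and letter suffixes such as 2020a are left out.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class BibEntry:
    """A bibliography entry already mapped to a DOI (hypothetical structure)."""
    doi: str
    surname: str  # first author surname, taken from the DOI's metadata
    year: int     # publication year, taken from the DOI's metadata


def resolve_shortcode(
    surname: str,
    year: Optional[int],
    entries: Sequence[BibEntry],
    year_tolerance: int = 2,  # assumed value for "a year or two"
) -> Optional[BibEntry]:
    """Return the single matching entry, or None if the shortcode is ambiguous."""
    by_name = [e for e in entries if e.surname.casefold() == surname.casefold()]
    if year is None:
        pool = by_name
    else:
        # Prefer an exact year; fall back to a near year only if none matches.
        pool = [e for e in by_name if e.year == year]
        if not pool:
            pool = [e for e in by_name if abs(e.year - year) <= year_tolerance]
    # Zero matches or several matches are both treated as "ignore".
    return pool[0] if len(pool) == 1 else None


bibliography = [
    BibEntry("10.1000/a", "Smith", 2020),
    BibEntry("10.1000/b", "Smith", 2020),
    BibEntry("10.1000/c", "Linz", 2002),
]
print(resolve_shortcode("Smith", 2020, bibliography))  # None: two candidates
print(resolve_shortcode("Linz", 2003, bibliography))   # the Linz 2002 entry
```

Returning `None` rather than guessing mirrors the behaviour described above: an ambiguous shortcode produces no citation context at all.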

Last updated on Jun 25, 2026

How does Overton find people mentioned in policy documents?

In addition to processing any citations to other policy or scholarly outputs, Overton tracks where it finds a researcher’s name in the full text of a policy document. We call these “People Mentions”.

[Image: A mention of Prof. Mariana Mazzucato from UCL]

See: What is a People mention?

How Overton finds People mentions

Finding people mentions in a policy document is a three-stage process.

Stage one – finding institution names

First we search for the names and name variants (e.g. “University College London”, “UCL”) of all the academic research institutions that we know about. We use GRID (now ROR) for our dictionary of institution names. If an institution isn’t in GRID it won’t be matched. We record the document, page and paragraph of where we saw each institution name appear.

Lots of institutions have very similar names (e.g. the University of Washington, Washington University) and most short acronyms appear more than once (e.g. LSE matches both the London and Lahore School of Economics). Overton doesn’t try to guess which option is the right one; instead it puts all possible matches forward to the second stage.

Stage two – finding researcher names

For each institution found we then build a dictionary of possible researcher names, using affiliation metadata from the journal articles and books we have in the database (i.e. that have been cited at least once in policy). We take some name variants into account. Specifically:

- If an author name has a full first name and initials we’ll look for a version without the initial(s) (e.g. given “Alice J Smith” we’ll also match “Alice Smith”)
  - We don’t do this when we only have an initial for the author’s first name
- If an author name has a double-barreled surname we’ll look for versions with and without a hyphen between the two parts

We then look for any of these researcher names in the paragraph(s) where we saw the relevant institution name, and in the paragraph immediately following it.
Any institution / researcher pairs found move on to the third stage.

Overton doesn’t disambiguate between researchers with the same name (Alice A Smith, Alice B Smith) at the same institution: if we saw “Alice Smith” in a policy document we’d map it to both researchers.

Generally we try to balance precision – mapping a mention to exactly one, correct person – and recall – finding everywhere that one person is mentioned. Finding names without middle initials reduces precision but improves recall: it finds many, many more mentions for most researchers.

We currently build our dictionary from journal articles and books that have been cited at least once in policy. This means that if a researcher has been engaged in policy but never had any of their works actually cited we won’t be looking for their name, so they won’t ever have any people mentions. We’re working on other approaches that won’t have this limitation.

Stage three – sanity checking

At this point Overton has what looks like a people mention but needs to verify it. It runs the relevant paragraph through a set of heuristics to make sure:

- The matched institution name is the institution itself and not a similarly named organization – e.g. Cambridge University Press
- The paragraph where we found the person mention isn’t actually a reference of some kind
- There aren’t too many combinations of people names and affiliations in the relevant text, which might lead to false positives

Browsing and searching people mentions

Any researcher name / affiliation pairs passing stage three are saved in the database and can be found in the People search.
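The name-variant rules from stage two can be sketched roughly as follows. This is an illustration only, not Overton's implementation: the `name_variants` function is hypothetical, it treats any single letter (with or without a trailing full stop) as an initial, and it only generates the unhyphenated form of a surname that is already hyphenated.

```python
def _is_initial(token: str) -> bool:
    """Treat a single letter, optionally followed by '.', as an initial."""
    return len(token.rstrip(".")) == 1


def name_variants(full_name: str) -> set[str]:
    """Return the name plus the variants described in stage two."""
    tokens = full_name.split()
    if not tokens:
        return set()
    variants = {" ".join(tokens)}

    # Full first name + middle initial(s): also look for the name without them.
    # Skipped when the first name itself is only an initial.
    if len(tokens) >= 3:
        first, middle, surname = tokens[0], tokens[1:-1], tokens[-1]
        if not _is_initial(first) and all(_is_initial(t) for t in middle):
            variants.add(f"{first} {surname}")

    # Double-barreled surname: look for it with and without the hyphen.
    for name in list(variants):
        *given, surname = name.split()
        if "-" in surname:
            variants.add(" ".join(given + [surname.replace("-", " ")]))
    return variants


print(sorted(name_variants("Alice J Smith")))      # ['Alice J Smith', 'Alice Smith']
print(sorted(name_variants("A J Smith")))          # ['A J Smith']
print(sorted(name_variants("Alice Smith-Jones")))  # ['Alice Smith Jones', 'Alice Smith-Jones']
```

Each variant would then be searched for in the paragraph containing the institution name and in the paragraph after it.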

Last updated on Jun 25, 2026

How does Overton generate document descriptions?

Overton generates descriptions and themes for all documents within our database using a large language model (LLM). We run these LLMs locally, on servers we own and host. Occasionally we might supplement our local models with cloud services: in these cases we ensure that any text we send to third parties isn’t kept or used for the purpose of training new models.

The descriptions are generated through a three-step process:

1. Parsing and cleaning text: The text of the document is parsed and cleaned to remove unnecessary elements such as new lines, extra spaces, and lines containing only numbers or symbols. Documents with fewer than 500 characters (approximately 3-4 paragraphs of text) after cleaning are skipped, as they don’t contain enough information to be accurately summarized.
2. Getting a document description from the LLM: The cleaned text, capped at 15,000 characters, is then passed to a large language model for processing. A structured prompt guides the model to generate a focused summary in English, excluding non-informative sections like disclaimers, formatting notes, and editorial content. References are omitted, and the model is instructed to focus on the document’s theme and description. Note that we’re telling the LLM to generate a description of the document and what it’s about, rather than to summarize key points and takeaways.
3. Quality checking: The generated description undergoes an automatic quality review to ensure it meets some minimum standards (is it a clearly readable paragraph? Are there any odd LLM-generated artifacts, like repeated phrases?). Summaries that do not pass this quality check are discarded to maintain consistency and clarity in output.
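As a rough illustration of step 1 and the length limits in step 2, the sketch below cleans raw text and decides whether it is long enough to send on. The function names, the exact cleaning rules and the choice to join lines with spaces are assumptions for the example; only the 500 and 15,000 character thresholds come from the description above.

```python
import re
from typing import Optional

MIN_CHARS = 500      # documents shorter than this after cleaning are skipped
MAX_CHARS = 15_000   # cap on the text passed to the language model


def clean_text(raw: str) -> str:
    """Collapse whitespace and drop lines that contain no letters at all."""
    kept = []
    for line in raw.splitlines():
        line = re.sub(r"\s+", " ", line).strip()
        if not line:
            continue
        # [^\W\d_] matches any letter; lines without one are numbers/symbols only.
        if not re.search(r"[^\W\d_]", line):
            continue
        kept.append(line)
    return " ".join(kept)


def prepare_for_llm(raw: str) -> Optional[str]:
    """Return the cleaned, capped text, or None if the document is too short."""
    cleaned = clean_text(raw)
    if len(cleaned) < MIN_CHARS:
        return None
    return cleaned[:MAX_CHARS]


print(clean_text("Hello   world\n\n42\n---\nNext line"))  # Hello world Next line
print(prepare_for_llm("too short"))                       # None
```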

Last updated on Jun 25, 2026

How does Overton know about author affiliations?

Details how Overton links scholarly books and articles back to specific researchers and their institutions

Overton gets affiliation data from OpenAlex, an open database of scholarly metadata for books and papers. OpenAlex in turn inherited historical affiliation data from Microsoft Academic, which Overton used until that service closed in December 2021. OpenAlex now gets affiliation data from “both structured and unstructured sources” – a mix of publisher websites and affiliation metadata in databases like Crossref and ORCID.

If you notice any discrepancies in your affiliation data or your institution’s data, please check OpenAlex and contact them to correct any errors.

Coverage

The quantity and quality of metadata for papers varies – for newer papers publishers often make it easy to see who the authors are and what institutions they are affiliated with, but this isn’t the case with older books and papers. Speaking very generally, about 80% of the scholarly papers seen by Overton have affiliation data available.

We’re able to help users improve this by automatically collecting data from their CRIS – though this is a custom service and typically requires some development work. It can also be improved in the medium/long term by asking authors to claim their work on their ORCID profiles.

What counts as a standalone institution?

Affiliations from OpenAlex are keyed to ROR IDs (Research Organization Registry identifiers are unique, open, and persistent identifiers for institutions and organisations). Overton migrated from using GRID to ROR for institutional data in 2026. ROR is actively maintained and continuously growing, making it the preferred standard for identifying research organizations.

Last updated on Jun 25, 2026

How does Overton know who authored a scholarly article?

Details how Overton finds author names for the scholarly books and papers cited by policy documents

When Overton finds a reference to a scholarly work in a policy document it creates an item for it in the Scholarly Articles tab. That item contains the scholarly work’s title, journal (if applicable!), published date and abstract. It also includes the work’s authors.

We get all of this information from Crossref, which is a cross-publisher membership organization built to share metadata about scholarly papers. Overton is a member of Crossref.

We use the authors list in two ways:

- It’s made searchable, and used to power the “By author” sidebar filter
- Where possible it is combined with affiliation data to power the People tab (to appear in the People tab we need both an author’s name and their institutional affiliation)

The data in Crossref comes directly from academic publishers, and typically reflects what the authors of a paper entered into that publisher’s manuscript tracking system. Unfortunately there are no hard and fast rules for how publishers ask for author names. Thus, if they’ve published lots of papers, the same person might appear in many different ways, e.g.

- Alice Smith
- A Smith
- Alice B Smith
- AB Smith
- A B Smith

Disambiguating author names (figuring out that a set of similar names are all the same person, based on their affiliation or subject area) is a hard problem to solve and neither Crossref nor Overton can currently do this in a robust way.

In practice this means that when you’re searching for a person’s scholarly outputs in Overton you may need to check the different variations of their name that they have previously used with different publishers. Alternatively you can search by pasting in the DOIs of articles that belong to them, or by searching by ORCID: you can do both of these things in the Scholarly Articles tab, using the “Search by DOI, ORCID, PMID or ISBN” button.

Last updated on Jun 25, 2026

How international are your sources?

Learn more about which sources we track and explore the extent of our international coverage.

Overview

In our efforts to build a global policy database, we’ve brought together policy documents from sources in 193 countries and territories and authored in 74 different languages.

[Image: Overton’s policy source coverage, June 2026]

Logged-in users can view a breakdown of policy documents by country directly in the app through the summary report. They can also explore our ‘Sources’ index to see the number of policy sources available by country or region.

While our coverage is broad, the volume of documents varies by location, with some countries contributing only a limited number of records. This is due to the limited availability of policy documents online in some regions.

A closer look at our coverage

As of August 2025, ~62% of the documents in Overton are from sources in the US, UK, Japan, Canada, Germany, or France. A further 13% of documents are from international organisations (IGOs like the United Nations, OECD or the World Bank).

There are a few reasons for this:

- Think tanks and NGOs are heavily concentrated in London, Washington D.C., New York and Brussels. Around 6% of the documents in Overton are from these kinds of sources.
- Governments are online to different degrees. Some governments don’t have the infrastructure needed to make documents available online or have other priorities.
- Many users look for local policy impact. In the UK, North America, New Zealand, Australia and parts of Europe we collect data at a state level for this reason. For other countries we focus on collecting documents at national government level.
- Local knowledge – sometimes we miss an important policy source because we’re not familiar with the way systems work in a particular country. If you think we’re missing something, please let us know.

Last updated on Jun 25, 2026

How is OpenAlex used in Overton?

OpenAlex is an open, freely available database that maps global research output and the relationships between it. It brings together information on scholarly works such as articles, books, and datasets, with details about authors, institutions, journals, funders and research topics.

Research metadata in Overton

Overton uses OpenAlex as a key source of structured research metadata. In particular, it helps us identify and organise information about academic publications, authors and their institutional affiliations, and research funders. This supports Overton’s ability to reliably match research outputs to the people and organisations behind them, ensuring accurate attribution and consistent identification.

If users notice an error in scholarly article data in Overton, it is likely that the error has carried over from OpenAlex. Researchers and organisations can also work directly with OpenAlex to optimise their institutional data or correct errors.

ROR

Both Overton and OpenAlex use ROR for institutional data. ROR provides a global, open system of unique IDs for research organisations like universities, hospitals, and labs. OpenAlex incorporates these ROR IDs to standardise how institutions are represented, which helps avoid confusion caused by name variations (for example, “MIT” vs. “Massachusetts Institute of Technology”).

Linking institutional data to a stable identifier like ROR makes it easier to:

- consistently identify the same organisation across datasets
- merge or compare institutional data reliably
- track research outputs and affiliations without ambiguity

Overton profiles

Users who want to set up their profile in Overton can conduct a name search powered by OpenAlex.

OpenAlex to find DOIs

Users can search in Overton with a list of their publications’ DOIs.

How are OpenAlex and Overton different?

OpenAlex and Overton both include research data, but serve different purposes and audiences.
OpenAlex is an open database of global scholarly information. Its goal is to comprehensively map academic research (papers, authors, institutions, citations, and topics) and make that data freely available for anyone to use. Overton is a specialised platform that focuses on policy documents including policy-to-policy citations and policy-to-scholarly-article citations. Overton users can see how research is used in policy and public decision-making. Instead of cataloguing all academic research, Overton specifically tracks academic work that is cited in policy documents. In short, OpenAlex describes the research world itself, while Overton shows how that research travels beyond academia into policy and real-world impact.

Last updated on Jun 25, 2026

How to reference Overton

This is intended as general, informal guidance for how to reference data from Overton. There is a wide range of referencing styles (some of which have multiple editions) which are used in academia and beyond and each style will require different elements to be included. As such, your librarians are your resident experts on referencing and should be consulted.

Referencing Overton as a database

Below you will find some Overton specific information regarding the common elements used to create a database reference. Again, the elements you need may depend on the reference style you are using, so please ask your librarian if you are unsure.

- Author: Overton
- Publisher: Open Policy Ltd.
- Location or place of publication: ‘Online’
- Date: use the date that the database was accessed

Below is an example of how to reference Overton as a database in Harvard Referencing Style.

Overton. (2024) Open Policy Ltd. Available at: https://app.overton.io/dashboard.php (Accessed: 18 March 2024)

Referencing policy documents found in Overton

Policy documents differ from academic articles in a lot of ways, from who produces them, to the review processes they go through, to even the length of the research. As such, we encourage you to check what elements you need for the reference style you are using and consult a librarian for further guidance on how to best structure your reference.

Here is some general information that may be helpful when creating a citation for a policy document.

- Authorship of policy documents can be tricky to ascertain. When in doubt, you may consider referring to the organisation that produced the policy document (e.g. World Health Organisation) as the author.
- There are also different types of policy documents in Overton including reports, clinical guidance, government transcripts etc. so ensure you’re aware of the type of document you are citing. You may want to look at the full-text of the document in order to ascertain the document type.
- Overton provides the link to the original webpage where the full-text of the policy document was found. You can use this link **or** the link to the document in Overton in your reference, but bear in mind that if you link to Overton, only other users of Overton will be able to access the record of the policy document.
- If you require ‘peer reviewed’ sources for your research, please be aware that policy documents do not necessarily undergo the same kind of peer review process used for scholarly article publishing. This isn’t to say the data is less reliable, but rather that the review processes may be different.

The following is an example of how to cite a policy document in Harvard Referencing Style.

World Health Organisation (2001) Planning a World Health Day Activity: Toolkit for organisers Available at: https://app.overton.io/document.php?policy_document_id=who-39f6b945bded778cc2dc5b11412b4bed (Accessed: 18 March 2024).

Referencing images from Overton

We are happy for users to use the images generated by our summary reports in their own research and reporting. If using the map found in our summary report, here are some things to note:

- The dots on the map represent the countries where the policy documents in your results set come from.
- If you have policy documents from the policy source ‘IGO,’ the map dot that represents ‘IGO’ will be located in the USA.
- We recommend including the clarifying table underneath the map image to ensure transparency of the data represented.

Example image reference

Overton (2024) Map Image for Summary Report on Documents matching the query ‘"AI" OR "Artificial Intelligence" OR "Machine learning"’ and connected to Northumbria University [figure] Available at: https://app.overton.io/documents.php?query=%22AI%22+OR+%22Artificial+Intelligence%22+OR+%22Machine+learning%22&open_affiliations=Northumbria+University&sort=relevance&format=report

If you need to include a figure caption to the image, this can be:

Overton, *‘Map of countries producing policy documents’* (2024)

RIS export to citation management tools

Overton doesn’t have automatic reference generation. If you use referencing tools such as Endnote, RefWorks or Zotero, you can use the ‘Export to RIS’ option. This will generate a file compatible with many citation management tools.

If you have a small number of documents you want to export from a full set of search results, you may want to utilise the ‘Tag this’ function. Tagging documents allows you to create virtual folders of specific results. Once you have tagged all relevant documents, you can simply click on the relevant ‘Tag’ from your ‘Your Tags and Highlights’ filter and export the relevant results from there.

Last updated on Jun 25, 2026

How we disambiguate policy documents

How Overton tries to avoid collecting the same document multiple times

Policy documents usually lack identifiers like ISBNs or DOIs that can be used to uniquely identify them, no matter where they are hosted. This can pose a problem when government websites change and documents are moved to different web addresses, or when the same document appears in multiple places on one site.

To avoid collecting the same document multiple times we regularly run a disambiguation process, where for each individual policy source we go through each document to check:

- Their title
- Their URL, and variations on that URL (https instead of http, without a “www.” at the front, with or without a trailing slash at the end etc.) for both the policy document landing page and any PDFs associated with it

… and compare them to all the other documents from the same source. If we find matches on one or both fields then these are potentially duplicates, so we look deeper at:

- Their publication date
- The “content hash” of the PDF files associated with the match
- The “content hash” of the front cover (the first page of the first PDF associated with the match)

… to decide if it’s a real duplicate – in which case it is removed – or not.

Real examples of matches found that are not real duplications include sources having multiple documents called “Annual Report” but with different publication years, or a set of documents all simply called “Memorandum”, some published on the same day, but that have different contents.

Same document, different language

Some policy sources publish in multiple languages – for example, UN agencies may produce a report in English, French, Spanish and Arabic. Overton doesn’t automatically detect that two documents are the same, just translated. Instead it relies on cues from the original policy source.
If documents in different languages have the same landing page (all of the different language options are listed on a web page representing that document) then Overton will merge them and treat them as a single document. If the different language versions have separate landing pages / are different entries in the source’s publication catalog then Overton will treat them as separate publications.

This is both helpful and unhelpful depending on your use case! Some users are keen to see which language versions are collecting citations, while others would prefer to merge citations into one object. We’re actively working on possible solutions to this issue.

Duplicates across policy sources

We don’t currently disambiguate across policy sources: the same document can appear twice as long as two different organizations host it in different places. This is by design as we often see, for example, think tanks commissioned to write reports for government departments (both groups then host a copy of the report), or documents authored by IGOs on government websites in developing countries. Leaving them in place makes browsing and reporting on a source by source basis easier.

By default any citations we see for duplicate documents are associated with the version hosted by whichever organization is mentioned or linked to in the citing reference. This does raise issues with some sources like the European Union Publications Office, PubMed Central and APO, which are primarily aggregators of content and rarely appear in the reference text: the citation numbers for documents on these sources are lower than they should be. We’re aware of this issue and plan to address it in a future update.
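The first pass of the within-source check (matching on title or on URL variations) could look something like the sketch below. It is an illustration under stated assumptions rather than Overton's code: the `Doc` structure, the title normalisation, and the use of SHA-256 for the "content hash" are all hypothetical choices.

```python
import hashlib
from dataclasses import dataclass
from urllib.parse import urlsplit


def normalise_url(url: str) -> str:
    """Ignore http/https, a leading 'www.' and any trailing slash."""
    parts = urlsplit(url.strip())
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[len("www."):]
    path = parts.path.rstrip("/")
    query = f"?{parts.query}" if parts.query else ""
    return f"{host}{path}{query}"


def content_hash(data: bytes) -> str:
    """Fingerprint of a file's bytes (hash algorithm assumed for the example)."""
    return hashlib.sha256(data).hexdigest()


@dataclass
class Doc:
    title: str
    url: str


def is_candidate_pair(a: Doc, b: Doc) -> bool:
    """True if two documents from one source share a title or a URL variant."""
    same_title = a.title.strip().casefold() == b.title.strip().casefold()
    same_url = normalise_url(a.url) == normalise_url(b.url)
    return same_title or same_url


old = Doc("Annual Report", "http://www.example.org/reports/annual/")
new = Doc("Annual report", "https://example.org/reports/annual")
print(is_candidate_pair(old, new))  # True
```

Candidate pairs would then go through the deeper comparison of publication dates and content hashes, which is what separates a true duplicate from two different documents that both happen to be called “Annual Report”.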

Last updated on Jun 25, 2026

The ari.org.uk dataset

What is the ari.org.uk database?

Overton maintains the ari.org.uk database in partnership with the Government Office for Science, the ESRC and Transforming Evidence. Areas of Research Interest (ARIs) are specific topics or issues that the government is interested in.

Accessing the data

Overton is responsible for API and bulk data access. If you need support or want to chat about the dataset, please reach out to us at [email protected]

You can browse the data at the ari.org.uk website, but it is also available for download or to access through a very simple API. All of the data is free and made available under the Open Government Licence.

Please let us know if you find the database useful – it helps build the case for keeping the ARIs current within the UK government departments and agencies.

The data model

The API and download share the same data model. Departments list a set of research priorities each year. Each research priority is called a “question” in our system. Questions are typically grouped by theme or topic, and often each question group will have a paragraph or two of extra background information.

Each question in the ari.org.uk database has the fields below. Here’s a breakdown of each field:

| Field | Description |
| --- | --- |
| questionId | A stable identifier for the question. |
| url | The URL for this question’s page on ari.org.uk |
| question | The text of the question itself. Note that it won’t always make sense without additional context from questionGroup and backgroundInformation. |
| isArchived | A boolean – if true then this question has been superseded by a more recent question and the department is no longer actively soliciting responses to it |
| department | The name of the department or agency asking the question |
| questionGroup | Departments typically group their questions by topic, and this is the name they’ve given to the group this question is in |
| backgroundInformation | Question groups often have an associated paragraph or two of background information. Note that this can sometimes be very broad. |
| publicationDate | The date of publication for this question, in YYYY-MM-DD format. |
| expiryDate | Departments may sometimes specify an expiry date for a question, after which point it becomes archived automatically (see isArchived above) |
| contactDetails | A free text field containing details of who to contact as a next step if you are interested in contributing answers or data to the question. |
| topics | An array of relevant topics from the IPTC’s MediaTopics taxonomy – see iptc.org for more detail. This taxonomy is primarily used by newspapers and magazines; it covers lots of different areas but not in an in-depth way. |
| fieldsOfResearch | An array of relevant academic subject areas from the Fields of Research taxonomy – see abs.gov.au for more detail. An academic subject area is “relevant” if a researcher with that background would be well suited to answering the question. |
| tags | An array of tags – freeform strings that either highlight specific keywords from the question or add extra keywords to make them more searchable. |
| relatedQuestions | An array of question IDs – these are questions from other departments that the system thinks are semantically similar to this one. |
| relatedUKRIProjects | An array of projects taken from the UKRI’s Gateway to Research database that the system thinks are relevant to this question. To be relevant the project description has to be semantically similar to the question, and/or the description suggests that the project lead will have expertise relevant to the question and may be a good candidate to contribute to it. |
| pageViewCount | The number of times a question’s page has been visited directly on ari.org.uk |

Access via the API

The API simply returns a paginated set of ARIs (“questions”) in JSON format. You can access it here: https://ari.org.uk/api/questions

Move to the next page using the page parameter: https://ari.org.uk/api/questions?page=2

The JSON result you’ll get has two sections, “data” and “meta”.

The meta section

This returns the total number of records in the dataset (this was e.g. 1,863 on 14th September, 2023), the page number you’re currently viewing, the total number of pages for your request and the URL to use to get to the next page (in meta -> pagination -> links -> next). If the next page link field isn’t there then you have reached the end of the results set.

The data section

This returns a set of up to 250 questions – the format is detailed above in the “data model” section.
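A minimal sketch of walking through every page of the API is shown below. It relies only on the response layout described above (a “data” list and a next-page URL at meta -> pagination -> links -> next); the function names are hypothetical, and the fetch step is passed in as a parameter so the paging logic can be tried out without a network connection.

```python
import json
from typing import Callable, Iterator
from urllib.request import urlopen

API_URL = "https://ari.org.uk/api/questions"


def fetch_json(url: str) -> dict:
    """Download one page of results and decode it as JSON."""
    with urlopen(url) as response:
        return json.load(response)


def iter_questions(
    start_url: str = API_URL,
    fetch: Callable[[str], dict] = fetch_json,
) -> Iterator[dict]:
    """Yield every question, following the 'next' link until it is absent."""
    url = start_url
    while url:
        payload = fetch(url)
        yield from payload.get("data", [])
        links = payload.get("meta", {}).get("pagination", {}).get("links", {})
        # No 'next' link means the end of the results set has been reached.
        url = links.get("next") if isinstance(links, dict) else None


if __name__ == "__main__":
    # Example: count questions that are still open (not archived).
    active = sum(1 for q in iter_questions() if not q.get("isArchived"))
    print(f"{active} active questions")
```

Because each page holds up to 250 questions, a full download takes only a handful of requests.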

Last updated on Jun 25, 2026

What are data notes?

*Overton is committed to transparency and uses data notes to alert users of possible limitations or biases within their search query or results.*

We work to keep Overton’s data accurate and complete, but it still includes limitations and potential biases. Learn more about our data collection methods and responsible metrics. We also guide users who are new to scholarly metadata and citation tools on when to interpret results with caution by using ‘data notes’.

How do data notes work?

Overton Index automatically generates data notes based on the filters you apply and your search results. Some searches won’t include data notes. When available, we show them at the bottom of the results page – click the Data notes bar to open them.

Each note includes three bullet points: background on the issue, why it matters, and how to address it. You may also see a link to a relevant help page for more details.

Priority of data notes

Some data notes provide useful background that doesn’t significantly affect further analysis. Others highlight biases or potential limitations you should consider within your analysis. We assign each data note a priority – low, medium, or high – and colour-code the note sidebar accordingly: red for high, yellow for medium, and green for low. When your search results in a high priority data note we’ll show the data notes bar at the top of the search page.

To see all our data notes along with an interpretation of the limitations, see the following pages: ‘Low priority data notes’, ‘Medium priority data notes’ and ‘High priority data notes’.

Hiding, adding or customising notes

We will keep adding new data notes whenever we find use cases that might be affected by spotty metadata or our data processing and structuring methods. If you think we are missing important notes or if you want to hide or customise data notes for your users, please contact us at [email protected].

Last updated on Jun 25, 2026

What are your criteria for adding new sources?

How new policy sources are identified and assessed

How do you decide what is and isn’t included?

The majority of sources that we track meet our minimum criteria:

- They are official government sources, think tanks, or IGOs
- They regularly publish policy documents (as we define them)
- The documents that they publish are publicly available

Some source types (e.g. NGOs engaging in policy work) are considered for inclusion on a case by case basis. We look at:

- Is the source often cited by official documents?
- Do they cite research or other policy documents?
- Do they regularly publish documents of interest, or just on an ad hoc basis? If the latter, are the published documents ever cited?
- Are the documents publicly available?

If you’re logged in to Overton then you can see a complete list of sources on the Sources page. Users can request that a specific source be added by emailing support.

How do you identify sources?

We spend a lot of time adding and maintaining policy sources inside Overton, but there’s always more to collect. We maintain a large list of candidate sources and the team picks new sources to add weekly. New candidates are also added because they’ve been requested by users, because they’re being cited by other policy documents, or because they belong to a category we’ve identified as a priority (“East African government sources”, “Environmental protection agencies” etc.).

Last updated on Jun 25, 2026

What is Overton’s coverage and how does it compare to other systems?

More information on the number of sources and documents we collect

Overton indexes more than 22M documents from more than 2,700 different policy sources, making it larger than similar systems. Overton was designed specifically for working with policy rather than tracking non-scholarly attention more broadly. The positive side of this trade-off is that we’re able to quickly add and process new policy sources and to do a better job of matching references in free text.

It’s important to note that a “policy source” in this context is a website or domain from which we are collecting documents. Usually a website (Overton policy source) includes documents from just one organisation, but this varies from country to country. For example, in the UK a single policy source, gov.uk, hosts documents from all of the government’s departments as well as many government agencies. Conversely, in Australia each government department hosts its own documents, so each one is tracked as a separate policy source.

Policy to policy citations

Uniquely, Overton also tracks citations within the policy literature rather than just from policy documents to scholarly articles. There are ~10M policy to policy citations in the database. They are kept in a separate index, but are searchable alongside the ~28M policy to DOI citations. This is a core part of our business and is how we’re able to work with government agencies, think tanks and NGOs who don’t always publish in academic books and journals. Our customers often have a mix of output types. For example, think tanks and universities may be publishing both scholarly works and policy briefs or reports.

Looking at coverage in different countries

You can use the number of sources in each country as a very broad indicator of how good coverage is, but if you’re focused on a specific geographical area then the best approach is to combine this datapoint with the volume of policy documents those sources produce.
Our regular data analyses show that government bodies account for around 42% of sources but author around 75% of policy documents, reflecting their broader remit, productivity and different publication practices. Consider this when comparing citation rates between countries and organisation types (you can use Overton’s policy source taxonomy to unpick those nuances).
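
To see why source counts alone can mislead, it helps to turn the two shares above into a rough documents-per-source comparison. The sketch below is back-of-envelope arithmetic only: it uses the approximate 42% and 75% figures quoted above, and the function name `docs_per_source_ratio` is our own illustration, not an Overton metric.

```python
# Approximate shares quoted in this article.
GOV_SHARE_OF_SOURCES = 0.42
GOV_SHARE_OF_DOCUMENTS = 0.75


def docs_per_source_ratio(source_share: float, document_share: float) -> float:
    """Compare average documents per source inside a group with the rest.

    Returns how many times more documents an average source in the group
    publishes than an average source outside it.
    """
    in_group = document_share / source_share
    out_of_group = (1 - document_share) / (1 - source_share)
    return in_group / out_of_group


ratio = docs_per_source_ratio(GOV_SHARE_OF_SOURCES, GOV_SHARE_OF_DOCUMENTS)
print(f"{ratio:.1f}")  # prints 4.1
```

On these figures an average government source publishes roughly four times as many documents as an average non-government source, which is why combining source counts with document volume gives a fairer picture of coverage.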

Last updated on Jun 25, 2026

Why are some authors not appearing in the People tab?

If a name returns no matches, or fewer citations than expected, in the People tab, there may still be other information about that author’s works in the database.

Data based on citations gets into the People tab after a two-step process:

1. We extract scholarly references from policy documents and map them to DOIs
2. We find affiliation data for those DOIs from the OpenAlex bibliographic database

For a person or article to appear in the People tab **both** steps must be completed successfully and return data. This happens for the large majority of authors and articles, but not all of them. If an author’s work appears in the database but we weren’t able to find affiliation data for it then it won’t be associated with that author’s name in the People tab. In the worst case none of the author’s works will have affiliation data, and their name may not appear at all.

Why?

Unfortunately we might miss some references, or be unable to match them to a DOI: you can read more about this on the How are scholarly references matched in policy documents? page. In particular, books aren’t always matched, as they often lack DOIs. This is something we’re working on.

Furthermore, it’s hard to get affiliation data for some papers. You can read more on the How does Overton know about author affiliations? page. As a rule of thumb around 80% of the papers cited in policy have affiliation data associated with them, but this is unevenly distributed, with larger publishers and journals tending to have higher percentages and smaller publishers and societies having lower percentages.

What can I do instead?

The gold standard way to find information about a set of articles in Overton is to use the “Search by DOI, ORCID, PMID or ISBN” button on the Scholarly Articles tab. To get a more complete picture of a researcher’s policy footprint we’d suggest combining two searches:

- First, get a list of the researcher’s articles as DOIs and run them through the “Search by DOI [..]” button. This will show you all of their articles that have been cited in policy at least once, and allow you to view the citing policy documents, create a report, export to Excel and so on
- Secondly, search for the researcher’s name, being as broad as possible (use initials instead of full first names). A search for “A Smith” will match “Alison Smith” and “Alison B Smith”. For more detailed tips see Searching names in Overton
- If necessary, filter the researcher names that come back using the “With Affiliation” filter in the sidebar of the People tab, so that you’re only viewing researchers with that name from your own institution
- Use the “Only people mentioned” filter in the sidebar to ignore citations to articles (which you already have from the first bullet point, above) and see just where that researcher has been mentioned by name in policy documents

Longer term fixes

If your institution has a well supported CRIS or institutional repository we may be able to pull in affiliation data from there to supplement and/or replace the data we get from OpenAlex. Please contact your account manager for more details.
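
The two-step process described in this article can be pictured as a pair of filters. The sketch below is purely illustrative and is not Overton’s implementation: the `Reference` class, the `people_tab_names` function, the DOIs and the author names are all hypothetical. It only shows why a work that fails either step never contributes a name to the People tab.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set


@dataclass
class Reference:
    """A scholarly reference extracted from a policy document (hypothetical)."""
    title: str
    doi: Optional[str]  # None when step 1 could not match the reference to a DOI


def people_tab_names(
    references: List[Reference],
    authors_by_doi: Dict[str, List[str]],
) -> Set[str]:
    """Return author names that survive both steps of the process."""
    names: Set[str] = set()
    for ref in references:
        if ref.doi is None:
            continue  # Step 1 failed: no DOI, which often happens with books.
        authors = authors_by_doi.get(ref.doi)
        if not authors:
            continue  # Step 2 failed: no affiliation data found for this DOI.
        names.update(authors)
    return names


# Hypothetical data: one book without a DOI and two papers with made-up DOIs.
references = [
    Reference("A book with no DOI", None),
    Reference("Paper with affiliation data", "10.1234/example.a"),
    Reference("Paper without affiliation data", "10.1234/example.b"),
]
authors_by_doi = {"10.1234/example.a": ["Alison Smith"]}

print(sorted(people_tab_names(references, authors_by_doi)))  # prints ['Alison Smith']
```

Only the second reference contributes a name. The other two works may still be in the database and findable by DOI or title, which is why the DOI search suggested above tends to give a more complete picture.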

Last updated on Jun 25, 2026