Open data — meta-directories (lists of lists)
This wiki post is a curated overview of directories, catalogues, registries and lists whose primary content is other open data sources: portals, repositories, databases, datasets, or collections of creative assets. It covers meta-directories only, that is, resources whose value lies in pointing to other open data sources. Individual datasets, portals, repositories and asset collections are described in a second wiki post below. Because this is a wiki post, anyone can edit it; please enrich or improve it where you can!
1. Cross-domain meta-directories
General-purpose directories that span many types of open data source. The broadest starting points.
- DataPortals.org — ~520 open data portals worldwide. Long-standing curated registry maintained by an international group of open data experts; covers local, regional and national levels.
- OpenDataSoft – Open Data Sources catalogue — 2,900+ portals worldwide, organised by geography. Generally regarded as the most comprehensive list.
- Open Data Inception — Geotagged map view of 1,600+ open data portals worldwide; built on the OpenDataSoft list but useful for browsing by location.
- DataCatalogs.org — Long-running curated catalogue of open data catalogues (now redirects to / merged with DataPortals.org).
- CKAN Portals listing — Index of CKAN-powered open data portals; CKAN underpins a large share of government data portals globally.
- PortalJS Data Portals listing — Modern catalogue of open data portals maintained by the PortalJS project (Datopian/CKAN ecosystem).
- List of open government data sites (Wikipedia) — Wikipedia-maintained country-by-country index of national, regional and municipal OGD portals.
- EasyData Open Data Portals Catalogus — Dutch-language catalogue of open data portals (NL-based, global in scope).
- Awesome Public Datasets — Topic-organised GitHub list of high-value public datasets; hundreds of entries, community-maintained.
- sindresorhus/awesome — The hub-of-hubs: 1,000+ topical “awesome” lists, several of which (datasets, transit, citizen science, ML) are themselves directories of open data sources.
- brandonhimpfen/awesome-open-data — Curated list of open data resources, tools and platforms across domains.
- CoolDatasets — Curated, lightly categorised collection of public datasets across topics.
- Open Data Impact Map — Global database of organisations that use open data (Center for Open Data Enterprise); useful for finding sectoral data users and sources.
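Many of the portals reachable through the CKAN listing above expose the CKAN Action API, so once a directory has pointed you to a portal you can often search it programmatically. The following is a minimal sketch, not a definitive client: the portal address `https://demo.example.org/` is a placeholder, the helper names are hypothetical, and the sample response is hand-written to mirror the usual `package_search` shape (`success`, `result.count`, `result.results`). Individual portals may restrict or customise the API.

```python
import json
from urllib.parse import urlencode


def package_search_url(portal_base, query, rows=5):
    """Build a CKAN Action API package_search URL for one portal."""
    params = urlencode({"q": query, "rows": rows})
    return f"{portal_base.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(response_text):
    """Return (total match count, list of titles) from a package_search response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported an unsuccessful request")
    result = payload["result"]
    # Fall back to the machine name when a dataset has no human-readable title.
    titles = [pkg.get("title") or pkg["name"] for pkg in result["results"]]
    return result["count"], titles


# Placeholder portal address (assumption, not a real portal).
url = package_search_url("https://demo.example.org/", "air quality", rows=2)

# Hand-written sample response, so the sketch runs without network access.
sample_response = json.dumps({
    "success": True,
    "result": {
        "count": 2,
        "results": [
            {"name": "air-quality-2023", "title": "Air quality 2023"},
            {"name": "air-quality-stations", "title": ""},
        ],
    },
})
count, titles = dataset_titles(sample_response)
```

To query a live portal, fetch `url` with any HTTP client and pass the response body to `dataset_titles`.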
2. Directories of government & intergovernmental portals
Meta-lists specifically of official government / IGO open data portals.
- List of open government data sites (Wikipedia) — (Also in §1.) The most complete country-by-country index of government portals.
- US City Open Data Census — Ranks US cities by their data-sharing policies; doubles as a navigation index to city portals.
- Open Data Monitor (EU, legacy) — Index/benchmark of European national open data portals (now largely dormant).
- NAPCORE — Coordination body for the EU’s 30+ mobility National Access Points; the authoritative directory of national transport-data portals (see also §7).
3. Registries of research data repositories (generalist)
Cross-disciplinary registries of repositories that hold research data.
- re3data.org – Registry of Research Data Repositories — 3,300+ research data repositories across all disciplines, with rich metadata (subjects, certifications, policies, APIs). Run by DataCite + KIT + Purdue + partners. The canonical registry for scientific repositories.
- FAIRsharing.org — Curated registry of data standards, databases and policies; 2,000+ databases catalogued with FAIR-compliance metadata.
- OpenDOAR — Global directory of ~6,000 academic open-access repositories (including data); operated by Jisc.
- Open Access Directory: Data Repositories — Wiki-maintained directory of data repositories, hosted by Simmons University.
4. Catalogues of databases within a domain
Meta-lists of the major databases in a specific field — the canonical “where are all the databases for X” references.
Life sciences & biomedical
- NAR Online Molecular Biology Database Collection — Curated catalogue of ~1,650 molecular-biology and bioinformatics databases, classified into 15 categories and 41 sub-categories; maintained alongside the annual Nucleic Acids Research Database Issue. The canonical meta-list for the life sciences.
- FAIRsharing (life sciences view) — (Also in §3.) Especially deep for biomedical databases and standards.
Astronomy
- VizieR (CDS) — The most complete library of published astronomical catalogues; ~24,000 catalogues and tables gathered by the Centre de Données astronomiques de Strasbourg. The reference meta-catalogue for astronomy.
- NASA Astrophysics Data System (ADS) — 15M+ records; indexes external data catalogues and archives alongside the literature.
Linguistics & language
- OLAC – Open Language Archives Community — International federation/meta-catalogue of dozens of language-resource archives (LDC, ELRA, AILLA, ELAR, etc.) searchable through one interface.
- CLARIN Virtual Language Observatory — Cross-archive discovery over the CLARIN network’s language resources.
Linked / semantic data
- Linked Open Data Cloud — Diagram and dataset of ~1,300 interlinked Linked Open Data datasets across nine domains (geography, government, life sciences, linguistics, media, etc.). Maintained by the Insight Centre for Data Analytics; CC BY.
Cultural heritage
- Europeana and DPLA — Each aggregates thousands of institutions, so each effectively functions as a meta-directory of cultural-heritage collections (also listed as sources in the companion wiki post).
5. Dataset search engines & aggregators (that index many sources)
Tools that don’t host data themselves but catalogue/index it across many sources.
- Google Dataset Search — Indexes tens of millions of datasets by reading schema.org metadata across the web.
- Google Public Data Explorer — Directory + visualiser over public-interest datasets (World Bank, OECD, IMF, US BLS, etc.).
- DataCite Commons — Search across ~60 million DOI-registered research outputs.
- OpenAIRE Explore — Cross-repository aggregator for European Open Science (~150M items).
- BASE (Bielefeld Academic Search Engine) — ~400 million documents across academic repositories, including datasets.
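Dataset search engines that rely on schema.org metadata look for structured descriptions embedded in web pages, commonly as JSON-LD inside a `script` tag. The sketch below illustrates that idea with the Python standard library only. It is a simplified illustration, not how any particular search engine works: the class and function names are hypothetical, the sample page is invented, and it only handles the case where `@type` is the plain string `Dataset`.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> tags."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.blocks = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        # Script text may arrive in several chunks, so accumulate it.
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            self.blocks.append(json.loads("".join(self._buffer)))


def find_datasets(html):
    """Return every JSON-LD object in the page whose @type is 'Dataset'."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    found = []
    for block in parser.blocks:
        items = block if isinstance(block, list) else [block]
        found.extend(
            item for item in items
            if isinstance(item, dict) and item.get("@type") == "Dataset"
        )
    return found


# Invented sample page (assumption) carrying one schema.org Dataset description.
sample_page = """
<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Dataset",
 "name": "Example rainfall records",
 "license": "https://creativecommons.org/licenses/by/4.0/"}
</script>
</head><body></body></html>
"""
datasets = find_datasets(sample_page)
```

Running the same function over a dataset landing page is a quick way to check whether that page publishes the kind of metadata these indexers can pick up.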
7. Directories of map, transport & traffic data
Meta-lists specific to geospatial and mobility data sources.
- NAPCORE — (Also in §2.) Directory of the EU’s 30+ mobility National Access Points.
- Mobility Database (MobilityData) — Catalogue of 6,000+ GTFS / GTFS-RT / GBFS public-transport feeds across 99+ countries.
- Transitland Atlas — Open feed registry of GTFS / GTFS-RT / GBFS / MDS feeds from 2,500+ operators across 55+ countries.
- OpenAddresses — Aggregates 2,600+ open government address sources worldwide (a directory of address datasets as much as a dataset).
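The GTFS feeds catalogued by the registries above are ZIP archives of CSV text files such as `routes.txt` and `stops.txt`. As a rough sketch of what you can do once a directory has led you to a feed, the example below lists the routes in one. The function name is hypothetical, and the tiny feed is built in memory purely so the example runs offline; a real feed would be downloaded from the URL the catalogue provides.

```python
import csv
import io
import zipfile


def list_routes(feed_bytes):
    """Return (route_id, route_short_name) pairs from a GTFS feed's routes.txt."""
    with zipfile.ZipFile(io.BytesIO(feed_bytes)) as feed:
        with feed.open("routes.txt") as raw:
            # utf-8-sig tolerates the byte-order mark some feeds include.
            text = io.TextIOWrapper(raw, encoding="utf-8-sig", newline="")
            return [
                (row["route_id"], row.get("route_short_name", ""))
                for row in csv.DictReader(text)
            ]


# Build a minimal, invented feed in memory (assumption, not real transit data).
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr(
        "routes.txt",
        "route_id,route_short_name,route_type\nR1,10,3\nR2,22,3\n",
    )

routes = list_routes(buffer.getvalue())
```

The same pattern applies to the other files in a feed; only the file name and the columns of interest change.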
9. Curated dataset lists for data journalism
Curated, regularly updated lists aimed at journalists and storytellers — strong for finding interesting rather than merely official datasets.
- Data Is Plural — Jeremy Singer-Vine’s weekly newsletter of useful/curious datasets, running since 2015; 1,750+ datasets, with a browsable archive as a “dataset of datasets.”
- Data Liberation Project — Initiative (now run by MuckRock + Big Local News) that obtains, documents and publishes hard-to-get government datasets of public interest.
- FiveThirtyEight Data — Index of the datasets behind FiveThirtyEight’s data journalism (politics, sports, science, economics), released as plain CSVs.
- BuzzFeed News GitHub — Data and analysis behind BuzzFeed News investigations.
- ProPublica Data Store — Datasets compiled and cleaned by ProPublica’s investigative team (many free, some priced).
- The Pudding — Datasets underlying The Pudding’s visual essays.
- Awesome Public Datasets — (Also in §1.) Widely used by data journalists as a starting point.
10. Directories of open design & creative assets
Meta-lists of openly licensed creative assets (icons, fonts, images, CC media).
- Open Source Design – Resources — Curated directory of openly licensed icons, fonts, images, CC media and design tools. The best single meta-list for creative open assets.
- Openverse — Search engine indexing 800 million+ openly licensed and public-domain images and audio files across hundreds of sources; WordPress Foundation successor to CC Search.
- Creative Commons Search — Meta-search across CC-licensed works.
11. Directories for data preservation & “data rescue”
Meta-lists / clearinghouses of preservation efforts (especially the 2025 US federal-data rescue).
- Data Rescue Project Portal — Clearinghouse and tracker indexing 1,000+ rescued US public datasets; co-run by IASSIST, RDAP and the Data Curation Network.
- Public Environmental Data Partners — Coalition directory of archived/mirrored environmental datasets.
12. Standards, metrics & community references
Not data sources, but the infrastructure and benchmarks that catalogue or rank them.
- Open Data Charter — International principles for open data publication.
- Open Knowledge Foundation — Organisation behind CKAN, the Open Data Handbook and the Global Open Data Index.
- Global Open Data Index — Country-level open-data openness ranking (now mostly historical).
- Open Data Maturity Report (EU) — Annual EU benchmarking of national open data maturity.
- Open Data Watch / ODIN — Global benchmark of national statistical-office data openness.