diff --git a/README.rst b/README.rst index 2ca0ea6..c42cc31 100644 --- a/README.rst +++ b/README.rst @@ -30,116 +30,114 @@ Other amazingly awesome lists can be found in `sindresorhus's awesome `_ [`Meta `_] +* |OK_ICON| `The global dataset of historical yields for major crops 1981–2016 - The Global Dataset of [...] `_ [`Meta `_] -* |OK_ICON| `Hyperspectral benchmark dataset on soil moisture - This dataset was [...] `_ [`Meta `_] +* |OK_ICON| `Hyperspectral benchmark dataset on soil moisture - This dataset was measured in a five-day [...] `_ [`Meta `_] -* |OK_ICON| `Lemons quality control dataset - Lemon dataset has been prepared to [...] `_ [`Meta `_] +* |OK_ICON| `Lemons quality control dataset - Lemon dataset has been prepared to investigate the [...] `_ [`Meta `_] -* |OK_ICON| `Optimized Soil Adjusted Vegetation Index - The IDB is a tool for working [...] `_ [`Meta `_] +* |OK_ICON| `Optimized Soil Adjusted Vegetation Index - The IDB is a tool for working with remote sensing [...] `_ [`Meta `_] -* |OK_ICON| `U.S. Department of Agriculture's Nutrient Database `_ [`Meta `_] +* |FIXME_ICON| `U.S. Department of Agriculture's Nutrient Database `_ [`Meta `_] -* |OK_ICON| `U.S. Department of Agriculture's PLANTS Database - The Complete PLANTS [...] `_ [`Meta `_] +* |OK_ICON| `U.S. Department of Agriculture's PLANTS Database - The Complete PLANTS Checklist is nearly 7 [...] `_ [`Meta `_] Biology ------- -* |FIXME_ICON| `1000 Genomes - The 1000 Genomes Project ran between 2008 and 2015, [...] `_ [`Meta `_] +* |FIXME_ICON| `1000 Genomes - The 1000 Genomes Project ran between 2008 and 2015, creating the largest [...] `_ [`Meta `_] -* |OK_ICON| `American Gut (Microbiome Project) - The American Gut project is the [...] `_ [`Meta `_] +* |OK_ICON| `American Gut (Microbiome Project) - The American Gut project is the largest crowdsourced [...] `_ [`Meta `_] -* |OK_ICON| `BCNB - There are WSIs of 1058 patients, part of tumor regions are [...] `_ [`Meta `_] +* |OK_ICON| `BCNB - There are WSIs of 1058 patients, part of tumor regions are annotated in WSIs. Except [...] `_ [`Meta `_] -* |OK_ICON| `Broad Bioimage Benchmark Collection (BBBC) - The Broad Bioimage Benchmark [...] `_ [`Meta `_] +* |OK_ICON| `Broad Bioimage Benchmark Collection (BBBC) - The Broad Bioimage Benchmark Collection (BBBC) [...] `_ [`Meta `_] * |OK_ICON| `Broad Cancer Cell Line Encyclopedia (CCLE) `_ [`Meta `_] -* |FIXME_ICON| `Cell Image Library - This library is a public and easily accessible [...] `_ [`Meta `_] +* |OK_ICON| `Cell Image Library - This library is a public and easily accessible resource database of [...] `_ [`Meta `_] -* |OK_ICON| `Complete Genomics Public Data - A diverse data set of whole human genomes [...] `_ [`Meta `_] +* |OK_ICON| `Complete Genomics Public Data - A diverse data set of whole human genomes are freely [...] `_ [`Meta `_] -* |OK_ICON| `CytoImageNet - A large-scale dataset of microscopy images. Contains [...] `_ [`Meta `_] +* |OK_ICON| `CytoImageNet - A large-scale dataset of microscopy images. Contains 890,737 total grayscale [...] `_ [`Meta `_] -* |OK_ICON| `EBI ArrayExpress - ArrayExpress Archive of Functional Genomics Data [...] `_ [`Meta `_] +* |OK_ICON| `EBI ArrayExpress - ArrayExpress Archive of Functional Genomics Data stores data from high- [...] `_ [`Meta `_] -* |FIXME_ICON| `EBI Protein Data Bank in Europe - The Electron Microscopy Data Bank [...] `_ [`Meta `_] +* |OK_ICON| `EBI Protein Data Bank in Europe - The Electron Microscopy Data Bank (EMDB) is a public [...] `_ [`Meta `_] -* |OK_ICON| `ENCODE project - The Encyclopedia of DNA Elements (ENCODE) Consortium is [...] `_ [`Meta `_] +* |OK_ICON| `ENCODE project - The Encyclopedia of DNA Elements (ENCODE) Consortium is an ongoing [...] `_ [`Meta `_] -* |OK_ICON| `Electron Microscopy Pilot Image Archive (EMPIAR) - EMPIAR, the Electron [...] `_ [`Meta `_] +* |OK_ICON| `Electron Microscopy Pilot Image Archive (EMPIAR) - EMPIAR, the Electron Microscopy Public [...] `_ [`Meta `_] * |OK_ICON| `Ensembl Genomes `_ [`Meta `_] -* |OK_ICON| `Gene Expression Omnibus (GEO) - GEO is a public functional genomics data [...] `_ [`Meta `_] +* |OK_ICON| `Gene Expression Omnibus (GEO) - GEO is a public functional genomics data repository [...] `_ [`Meta `_] * |OK_ICON| `Gene Ontology (GO) - GO annotation files `_ [`Meta `_] * |OK_ICON| `Global Biotic Interactions (GloBI) `_ [`Meta `_] -* |OK_ICON| `Harvard Medical School (HMS) LINCS Project - The Harvard Medical School [...] `_ [`Meta `_] +* |OK_ICON| `Harvard Medical School (HMS) LINCS Project - The Harvard Medical School (HMS) LINCS Center is [...] `_ [`Meta `_] -* |OK_ICON| `Human Genome Diversity Project - A group of scientists at Stanford [...] `_ [`Meta `_] +* |OK_ICON| `Human Genome Diversity Project - A group of scientists at Stanford University have [...] `_ [`Meta `_] -* |OK_ICON| `Human Microbiome Project (HMP) - The HMP sequenced over 2000 reference [...] `_ [`Meta `_] +* |OK_ICON| `Human Microbiome Project (HMP) - The HMP sequenced over 2000 reference genomes isolated from [...] `_ [`Meta `_] -* |OK_ICON| `ICOS PSP Benchmark - The ICOS PSP benchmarks repository contains an [...] `_ [`Meta `_] +* |OK_ICON| `ICOS PSP Benchmark - The ICOS PSP benchmarks repository contains an adjustable real-world [...] `_ [`Meta `_] * |OK_ICON| `International HapMap Project `_ [`Meta `_] * |FIXME_ICON| `Journal of Cell Biology DataViewer `_ [`Meta `_] -* |OK_ICON| `KEGG - KEGG is a database resource for understanding high-level functions [...] `_ [`Meta `_] - -* |FIXME_ICON| `MIT Cancer Genomics Data `_ [`Meta `_] +* |OK_ICON| `KEGG - KEGG is a database resource for understanding high-level functions and utilities of [...] `_ [`Meta `_] * |OK_ICON| `NCBI Proteins `_ [`Meta `_] -* |OK_ICON| `NCBI Taxonomy - The NCBI Taxonomy database is a curated set of names and [...] `_ [`Meta `_] +* |OK_ICON| `NCBI Taxonomy - The NCBI Taxonomy database is a curated set of names and classifications for [...] `_ [`Meta `_] -* |OK_ICON| `NCI Genomic Data Commons - The GDC Data Portal is a robust data-driven [...] `_ [`Meta `_] +* |OK_ICON| `NCI Genomic Data Commons - The GDC Data Portal is a robust data-driven platform that allows [...] `_ [`Meta `_] * |OK_ICON| `NIH Microarray data `_ [`Meta `_] -* |OK_ICON| `OpenSNP genotypes data - openSNP allows customers of direct-to-customer [...] `_ [`Meta `_] +* |OK_ICON| `OpenSNP genotypes data - openSNP allows customers of direct-to-customer genetic tests to [...] `_ [`Meta `_] -* |OK_ICON| `Palmer Penguins - The goal of palmerpenguins is to provide a great [...] `_ [`Meta `_] +* |OK_ICON| `Palmer Penguins - The goal of palmerpenguins is to provide a great dataset for data [...] `_ [`Meta `_] * |OK_ICON| `Pathguid - Protein-Protein Interactions Catalog `_ [`Meta `_] -* |OK_ICON| `Protein Data Bank - This resource is powered by the Protein Data Bank [...] `_ [`Meta `_] +* |OK_ICON| `Protein Data Bank - This resource is powered by the Protein Data Bank archive-information [...] `_ [`Meta `_] -* |OK_ICON| `Psychiatric Genomics Consortium - The purpose of the Psychiatric Genomics [...] `_ [`Meta `_] +* |OK_ICON| `Psychiatric Genomics Consortium - The purpose of the Psychiatric Genomics Consortium (PGC) is [...] `_ [`Meta `_] -* |OK_ICON| `PubChem Project - PubChem is the world's largest collection of freely [...] `_ [`Meta `_] +* |OK_ICON| `PubChem Project - PubChem is the world's largest collection of freely accessible chemical [...] `_ [`Meta `_] -* |OK_ICON| `PubGene (now Coremine Medical) - COREMINE™ is a family of tools developed [...] `_ [`Meta `_] +* |OK_ICON| `PubGene (now Coremine Medical) - COREMINE™ is a family of tools developed by the Norwegian [...] `_ [`Meta `_] -* |OK_ICON| `Sanger Catalogue of Somatic Mutations in Cancer (COSMIC) - COSMIC, the [...] `_ [`Meta `_] +* |OK_ICON| `Sanger Catalogue of Somatic Mutations in Cancer (COSMIC) - COSMIC, the Catalogue Of Somatic [...] `_ [`Meta `_] * |OK_ICON| `Sanger Genomics of Drug Sensitivity in Cancer Project (GDSC) `_ [`Meta `_] -* |OK_ICON| `Sequence Read Archive(SRA) - The Sequence Read Archive (SRA) stores raw [...] `_ [`Meta `_] +* |OK_ICON| `Sequence Read Archive(SRA) - The Sequence Read Archive (SRA) stores raw sequence data from [...] `_ [`Meta `_] -* |FIXME_ICON| `Stanford Microarray Data `_ [`Meta `_] +* |OK_ICON| `Stanford Microarray Data (Retired NOW) `_ [`Meta `_] * |OK_ICON| `Stowers Institute Original Data Repository `_ [`Meta `_] -* |OK_ICON| `Systems Science of Biological Dynamics (SSBD) Database - Systems Science [...] `_ [`Meta `_] +* |OK_ICON| `Systems Science of Biological Dynamics (SSBD) Database - Systems Science of Biological [...] `_ [`Meta `_] -* |FIXME_ICON| `The Cancer Genome Atlas (TCGA), available via Broad GDAC `_ [`Meta `_] +* |OK_ICON| `The Cancer Genome Atlas (TCGA), available via Broad GDAC `_ [`Meta `_] -* |FIXME_ICON| `The Catalogue of Life - The Catalogue of Life is a quality-assured [...] `_ [`Meta `_] +* |OK_ICON| `The Catalogue of Life - The Catalogue of Life is a quality-assured checklist of more than 1.8 [...] `_ [`Meta `_] -* |OK_ICON| `The Personal Genome Project - The Personal Genome Project, initiated in [...] `_ [`Meta `_] +* |OK_ICON| `The Personal Genome Project - The Personal Genome Project, initiated in 2005, is a vision and [...] `_ [`Meta `_] * |OK_ICON| `UCSC Public Data `_ [`Meta `_] * |OK_ICON| `UniGene `_ [`Meta `_] -* |OK_ICON| `Universal Protein Resource (UnitProt) - The Universal Protein Resource [...] `_ [`Meta `_] +* |OK_ICON| `Universal Protein Resource (UnitProt) - The Universal Protein Resource (UniProt) is a [...] `_ [`Meta `_] -* |OK_ICON| `Rfam - The Rfam database is a collection of RNA families, each [...] `_ [`Meta `_] +* |OK_ICON| `Rfam - The Rfam database is a collection of RNA families, each represented by multiple [...] `_ [`Meta `_] Chemistry --------- @@ -153,15 +151,15 @@ Climate+Weather * |FIXME_ICON| `Australian Weather `_ [`Meta `_] -* |OK_ICON| `Aviation Weather Center - Consistent, timely and accurate weather [...] `_ [`Meta `_] +* |OK_ICON| `Aviation Weather Center - Consistent, timely and accurate weather information for the world [...] `_ [`Meta `_] -* |FIXME_ICON| `Brazilian Weather - Historical data (In Portuguese) - Data related to [...] `_ [`Meta `_] +* |FIXME_ICON| `Brazilian Weather - Historical data (In Portuguese) - Data related to climate and weather [...] `_ [`Meta `_] * |OK_ICON| `Canadian Meteorological Centre `_ [`Meta `_] * |FIXME_ICON| `Climate Data from UEA (updated monthly) `_ [`Meta `_] -* |OK_ICON| `Dutch Weather - The KNMI Data Center (KDC) portal provides access to KNMI [...] `_ [`Meta `_] +* |OK_ICON| `Dutch Weather - The KNMI Data Center (KDC) portal provides access to KNMI data on weather, [...] `_ [`Meta `_] * |OK_ICON| `European Climate Assessment & Dataset `_ [`Meta `_] @@ -169,7 +167,7 @@ Climate+Weather * |OK_ICON| `Global Climate Data Since 1929 `_ [`Meta `_] -* |OK_ICON| `Charting The Global Climate Change News Narrative 2009-2020 - These four [...] `_ [`Meta `_] +* |OK_ICON| `Charting The Global Climate Change News Narrative 2009-2020 - These four datasets represent [...] `_ [`Meta `_] * |OK_ICON| `NASA Global Imagery Browse Services `_ [`Meta `_] @@ -187,7 +185,7 @@ Climate+Weather * |OK_ICON| `WU Historical Weather Worldwide `_ [`Meta `_] -* |OK_ICON| `Wahington Post Climate Change - To analyze warming temperatures in the [...] `_ [`Meta `_] +* |OK_ICON| `Wahington Post Climate Change - To analyze warming temperatures in the United States, The [...] `_ [`Meta `_] * |OK_ICON| `WorldClim - Global Climate Data `_ [`Meta `_] @@ -232,7 +230,7 @@ ComplexNetworks * |FIXME_ICON| `WSU Graph Database `_ [`Meta `_] -* |FIXME_ICON| `Community Resource for Archiving Wireless Data At Dartmouth - Contains [...] `_ [`Meta `_] +* |FIXME_ICON| `Community Resource for Archiving Wireless Data At Dartmouth - Contains datasets of pcap files [...] `_ [`Meta `_] ComputerNetworks ---------------- @@ -257,13 +255,13 @@ ComputerNetworks * |OK_ICON| `Internet-Wide Scan Data Repository `_ [`Meta `_] -* |FIXME_ICON| `MIRAGE-2019 - MIRAGE-2019 is a human-generated dataset for mobile traffic [...] `_ [`Meta `_] +* |FIXME_ICON| `MIRAGE-2019 - MIRAGE-2019 is a human-generated dataset for mobile traffic analysis with [...] `_ [`Meta `_] * |OK_ICON| `OONI: Open Observatory of Network Interference - Internet censorship data `_ [`Meta `_] * |OK_ICON| `Open Mobile Data by MobiPerf `_ [`Meta `_] -* |OK_ICON| `The Peer-to-Peer Trace Archive - Real-world measurements play a key role [...] `_ [`Meta `_] +* |OK_ICON| `The Peer-to-Peer Trace Archive - Real-world measurements play a key role in studying the [...] `_ [`Meta `_] * |OK_ICON| `Rapid7 Sonar Internet Scans `_ [`Meta `_] @@ -272,9 +270,9 @@ ComputerNetworks CyberSecurity ------------- -* |OK_ICON| `CCCS-CIC-AndMal-2020 - The dataset includes 200K benign and 200K malware [...] `_ [`Meta `_] +* |OK_ICON| `CCCS-CIC-AndMal-2020 - The dataset includes 200K benign and 200K malware samples totalling to [...] `_ [`Meta `_] -* |OK_ICON| `Traffic and Log Data Captured During a Cyber Defense Exercise - This [...] `_ [`Meta `_] +* |OK_ICON| `Traffic and Log Data Captured During a Cyber Defense Exercise - This dataset was acquired [...] `_ [`Meta `_] DataChallenges -------------- @@ -309,12 +307,12 @@ DataChallenges * |OK_ICON| `TunedIT - Data mining & machine learning data sets, algorithms, challenges `_ [`Meta `_] -* |OK_ICON| `Yelp Dataset Challenge - The Yelp dataset is a subset of our businesses, [...] `_ [`Meta `_] +* |OK_ICON| `Yelp Dataset Challenge - The Yelp dataset is a subset of our businesses, reviews, and user [...] `_ [`Meta `_] EarthScience ------------ -* |OK_ICON| `38-Cloud (Cloud Detection) - Contains 38 Landsat 8 scene images and their [...] `_ [`Meta `_] +* |OK_ICON| `38-Cloud (Cloud Detection) - Contains 38 Landsat 8 scene images and their manually extracted [...] `_ [`Meta `_] * |OK_ICON| `AQUASTAT - Global water resources and uses `_ [`Meta `_] @@ -324,7 +322,7 @@ EarthScience * |FIXME_ICON| `Earth Models `_ [`Meta `_] -* |OK_ICON| `Global Wind Atlas - The Global Wind Atlas is a free, web-based [...] `_ [`Meta `_] +* |OK_ICON| `Global Wind Atlas - The Global Wind Atlas is a free, web-based application developed to help [...] `_ [`Meta `_] * |OK_ICON| `Integrated Marine Observing System (IMOS) - roughly 30TB of ocean measurements `_ [`Meta `_] @@ -332,9 +330,9 @@ EarthScience * |FIXME_ICON| `Alabama Real-Time Coastal Observing System `_ [`Meta `_] -* |OK_ICON| `National Estuarine Research Reserves System-Wide Monitoring Program - [...] `_ [`Meta `_] +* |OK_ICON| `National Estuarine Research Reserves System-Wide Monitoring Program - long-term estuarine [...] `_ [`Meta `_] -* |OK_ICON| `Oil and Gas Authority Open Data - The dataset covers 12,500 offshore [...] `_ [`Meta `_] +* |OK_ICON| `Oil and Gas Authority Open Data - The dataset covers 12,500 offshore wellbores, 5,000 seismic [...] `_ [`Meta `_] * |OK_ICON| `Smithsonian Institution Global Volcano and Eruption Database `_ [`Meta `_] @@ -343,35 +341,35 @@ EarthScience Economics --------- -* |OK_ICON| `Asian Productivity Organization (APO) - The AEPM provides a graphic [...] `_ [`Meta `_] +* |OK_ICON| `Asian Productivity Organization (APO) - The AEPM provides a graphic dashboard view of [...] `_ [`Meta `_] -* |OK_ICON| `ASEAN Stats - The ASEANstatsDataPortal was first launched in June 2018. [...] `_ [`Meta `_] +* |OK_ICON| `ASEAN Stats - The ASEANstatsDataPortal was first launched in June 2018. The Portal is [...] `_ [`Meta `_] * |OK_ICON| `American Economic Association (AEA) `_ [`Meta `_] -* |OK_ICON| `Asian KLEMS - Asia KLEMS is an Asian regional research consortium to [...] `_ [`Meta `_] +* |OK_ICON| `Asian KLEMS - Asia KLEMS is an Asian regional research consortium to promote building [...] `_ [`Meta `_] -* |OK_ICON| `Harvard Atlas of Economic Complexity - A database for people to explore [...] `_ [`Meta `_] +* |OK_ICON| `Harvard Atlas of Economic Complexity - A database for people to explore global trade flows [...] `_ [`Meta `_] -* |OK_ICON| `BIS Financial Database - The files contain the same data as in the BIS [...] `_ [`Meta `_] +* |OK_ICON| `BIS Financial Database - The files contain the same data as in the BIS Statistics Explorer [...] `_ [`Meta `_] -* |OK_ICON| `Barro-Lee Education Attainment - Barro-Lee Educational Attainment Data [...] `_ [`Meta `_] +* |OK_ICON| `Barro-Lee Education Attainment - Barro-Lee Educational Attainment Data from 1950 to 2010. [...] `_ [`Meta `_] -* |OK_ICON| `CEPII Database - A database of the world economy, through its country and [...] `_ [`Meta `_] +* |OK_ICON| `CEPII Database - A database of the world economy, through its country and region profiles, in [...] `_ [`Meta `_] -* |OK_ICON| `EUKLEMS - EU KLEMS is an industry level, growth and productivity research [...] `_ [`Meta `_] +* |OK_ICON| `EUKLEMS - EU KLEMS is an industry level, growth and productivity research project. EU KLEMS [...] `_ [`Meta `_] * |FIXME_ICON| `EconData from UMD `_ [`Meta `_] * |FIXME_ICON| `Economic Freedom of the World Data `_ [`Meta `_] -* |OK_ICON| `Historical National Accounts - The datahub on Comparative Historical [...] `_ [`Meta `_] +* |OK_ICON| `Historical National Accounts - The datahub on Comparative Historical National Accounts [...] `_ [`Meta `_] * |OK_ICON| `Historical MacroEconomic Statistics `_ [`Meta `_] * |FIXME_ICON| `INFORUM - Interindustry Forecasting at the University of Maryland `_ [`Meta `_] -* |OK_ICON| `DBnomics – the world's economic database - Aggregates hundreds of [...] `_ [`Meta `_] +* |FIXME_ICON| `DBnomics – the world's economic database - Aggregates hundreds of millions of time series [...] `_ [`Meta `_] * |OK_ICON| `International Trade Statistics `_ [`Meta `_] @@ -381,19 +379,19 @@ Economics * |FIXME_ICON| `Jon Haveman International Trade Data Links `_ [`Meta `_] -* |OK_ICON| `Latin America KLEMS - LAKLEMS is a technical cooperation project financed [...] `_ [`Meta `_] +* |OK_ICON| `Latin America KLEMS - LAKLEMS is a technical cooperation project financed by the Inter- [...] `_ [`Meta `_] -* |OK_ICON| `Long-Term Productivity Database - The Long-Term Productivity database was [...] `_ [`Meta `_] +* |OK_ICON| `Long-Term Productivity Database - The Long-Term Productivity database was created as a [...] `_ [`Meta `_] -* |OK_ICON| `Maddison Project Database - The Maddison Project Database provides [...] `_ [`Meta `_] +* |OK_ICON| `Maddison Project Database - The Maddison Project Database provides information on comparative [...] `_ [`Meta `_] -* |OK_ICON| `National Transfer Accounts - The goal of the National Transfer Accounts [...] `_ [`Meta `_] +* |OK_ICON| `National Transfer Accounts - The goal of the National Transfer Accounts (NTA) project is to [...] `_ [`Meta `_] * |OK_ICON| `OpenCorporates Database of Companies in the World `_ [`Meta `_] * |OK_ICON| `Our World in Data `_ [`Meta `_] -* |OK_ICON| `Penn World Table - PWT version 10.0 is a database with information on [...] `_ [`Meta `_] +* |OK_ICON| `Penn World Table - PWT version 10.0 is a database with information on relative levels of [...] `_ [`Meta `_] * |FIXME_ICON| `SciencesPo World Trade Gravity Datasets `_ [`Meta `_] @@ -407,18 +405,18 @@ Economics * |OK_ICON| `UN Human Development Reports `_ [`Meta `_] -* |OK_ICON| `World Input-Output Database - World Input-Output Tables and underlying [...] `_ [`Meta `_] +* |OK_ICON| `World Input-Output Database - World Input-Output Tables and underlying data, covering 43 [...] `_ [`Meta `_] -* |OK_ICON| `World KLEMS - Analytical KLEMS-type data sets for a broad set of [...] `_ [`Meta `_] +* |OK_ICON| `World KLEMS - Analytical KLEMS-type data sets for a broad set of countries around the world. [...] `_ [`Meta `_] Education --------- * |OK_ICON| `College Scorecard Data `_ [`Meta `_] -* |OK_ICON| `New York State Education Department Data - The New York State Education [...] `_ [`Meta `_] +* |OK_ICON| `New York State Education Department Data - The New York State Education Department (NYSED) is [...] `_ [`Meta `_] -* |OK_ICON| `Program for International Student Assessement (PISA) - Contains 15-year- [...] `_ [`Meta `_] +* |OK_ICON| `Program for International Student Assessement (PISA) - Contains 15-year-old students' [...] `_ [`Meta `_] * |OK_ICON| `Student Data from Free Code Camp `_ [`Meta `_] @@ -435,31 +433,31 @@ Energy * |OK_ICON| `DEL - Domestic Electrical Load study datsets for South Africa (1994 - 2014) `_ [`Meta `_] -* |OK_ICON| `ECO - The ECO data set is a comprehensive data set for non-intrusive load [...] `_ [`Meta `_] +* |OK_ICON| `ECO - The ECO data set is a comprehensive data set for non-intrusive load monitoring and [...] `_ [`Meta `_] * |OK_ICON| `EIA `_ [`Meta `_] -* |OK_ICON| `Global Power Plant Database - The Global Power Plant Database is a [...] `_ [`Meta `_] +* |OK_ICON| `Global Power Plant Database - The Global Power Plant Database is a comprehensive, open source [...] `_ [`Meta `_] -* |FIXME_ICON| `HES - Household Electricity Study, UK `_ [`Meta `_] +* |OK_ICON| `HES - Household Electricity Study, UK `_ [`Meta `_] * |OK_ICON| `HFED `_ [`Meta `_] -* |OK_ICON| `MORED: a Moroccan Buildings’ Electricity Consumption Dataset - Since [...] `_ [`Meta `_] +* |OK_ICON| `MORED: a Moroccan Buildings’ Electricity Consumption Dataset - Since spring of 2019, a data [...] `_ [`Meta `_] -* |OK_ICON| `Marktstammdatenregister - The German Marktstammdatenregister (MaStR) is a [...] `_ [`Meta `_] +* |OK_ICON| `Marktstammdatenregister - The German Marktstammdatenregister (MaStR) is a database of all [...] `_ [`Meta `_] * |OK_ICON| `PEM1 - Proton Exchange Membrane (PEM) Fuel Cell Dataset `_ [`Meta `_] * |FIXME_ICON| `PLAID - The Plug Load Appliance Identification Dataset `_ [`Meta `_] -* |OK_ICON| `The Public Utility Data Liberation Project (PUDL) - PUDL makes US energy [...] `_ [`Meta `_] +* |OK_ICON| `The Public Utility Data Liberation Project (PUDL) - PUDL makes US energy data easier to [...] `_ [`Meta `_] * |FIXME_ICON| `REDD `_ [`Meta `_] -* |OK_ICON| `SYND - A synthetic energy dataset for non-intrusive load monitoring - [...] `_ [`Meta `_] +* |OK_ICON| `SYND - A synthetic energy dataset for non-intrusive load monitoring - With SynD, we present a [...] `_ [`Meta `_] -* |OK_ICON| `Smart Meter Data Portal - The Smart Meter Data Portal is part of the [...] `_ [`Meta `_] +* |OK_ICON| `Smart Meter Data Portal - The Smart Meter Data Portal is part of the National Science [...] `_ [`Meta `_] * |OK_ICON| `Tracebase `_ [`Meta `_] @@ -474,18 +472,18 @@ Energy Entertainment ------------- -* |OK_ICON| `Top Streamers on Twitch - This contains data of Top 1000 Streamers from [...] `_ [`Meta `_] +* |OK_ICON| `Top Streamers on Twitch - This contains data of Top 1000 Streamers from past year. `_ [`Meta `_] Finance ------- -* |OK_ICON| `BIS Statistics - BIS statistics, compiled in cooperation with central [...] `_ [`Meta `_] +* |OK_ICON| `BIS Statistics - BIS statistics, compiled in cooperation with central banks and other [...] `_ [`Meta `_] -* |OK_ICON| `Blockmodo Coin Registry - A registry of JSON formatted information files [...] `_ [`Meta `_] +* |OK_ICON| `Blockmodo Coin Registry - A registry of JSON formatted information files that is primarily [...] `_ [`Meta `_] * |FIXME_ICON| `CBOE Futures Exchange `_ [`Meta `_] -* |OK_ICON| `Complete FAANG Stock data - This data set contains all the stock data of [...] `_ [`Meta `_] +* |OK_ICON| `Complete FAANG Stock data - This data set contains all the stock data of FAANG companies from [...] `_ [`Meta `_] * |OK_ICON| `Google Finance `_ [`Meta `_] @@ -501,7 +499,7 @@ Finance * |OK_ICON| `Quandl `_ [`Meta `_] -* |OK_ICON| `SEC EDGAR - EDGAR, the Electronic Data Gathering, Analysis, and Retrieval [...] `_ [`Meta `_] +* |OK_ICON| `SEC EDGAR - EDGAR, the Electronic Data Gathering, Analysis, and Retrieval system, is the [...] `_ [`Meta `_] * |OK_ICON| `St Louis Federal `_ [`Meta `_] @@ -510,13 +508,13 @@ Finance GIS --- -* |OK_ICON| `Awesome 3D Semantic City Models - Collection of open 3D semantic city and [...] `_ [`Meta `_] +* |OK_ICON| `Awesome 3D Semantic City Models - Collection of open 3D semantic city and region models. `_ [`Meta `_] * |OK_ICON| `ArcGIS Open Data portal `_ [`Meta `_] * |OK_ICON| `Cambridge, MA, US, GIS data on GitHub `_ [`Meta `_] -* |OK_ICON| `Database of all continents, countries, States/Subdivisions/Provinces and [...] `_ [`Meta `_] +* |OK_ICON| `Database of all continents, countries, States/Subdivisions/Provinces and Cities - Database [...] `_ [`Meta `_] * |FIXME_ICON| `Factual Global Location Data `_ [`Meta `_] @@ -532,7 +530,7 @@ GIS * |OK_ICON| `GeoNames Worldwide `_ [`Meta `_] -* |OK_ICON| `Global Administrative Areas Database (GADM) - Geospatial data organized [...] `_ [`Meta `_] +* |OK_ICON| `Global Administrative Areas Database (GADM) - Geospatial data organized by country. Includes [...] `_ [`Meta `_] * |OK_ICON| `Homeland Infrastructure Foundation-Level Data `_ [`Meta `_] @@ -554,7 +552,7 @@ GIS * |OK_ICON| `Robin Wilson - Free GIS Datasets `_ [`Meta `_] -* |OK_ICON| `Shadow Accrual Maps - The repository contains the accumulated shadow [...] `_ [`Meta `_] +* |OK_ICON| `Shadow Accrual Maps - The repository contains the accumulated shadow information for New York [...] `_ [`Meta `_] * |OK_ICON| `TIGER/Line - U.S. boundaries and roads `_ [`Meta `_] @@ -577,7 +575,7 @@ Government * |OK_ICON| `Argentina (non official) `_ [`Meta `_] -* |FIXME_ICON| `Datos Argentina - Portal de datos abiertos de la República Argentina. [...] `_ [`Meta `_] +* |OK_ICON| `Datos Argentina - Portal de datos abiertos de la República Argentina. Encontrá datos públicos [...] `_ [`Meta `_] * |OK_ICON| `Austin, TX, US `_ [`Meta `_] @@ -615,7 +613,7 @@ Government * |OK_ICON| `DataBC - data from the Province of British Columbia `_ [`Meta `_] -* |OK_ICON| `Debt to the Penny - The Debt to the Penny dataset provides information [...] `_ [`Meta `_] +* |OK_ICON| `Debt to the Penny - The Debt to the Penny dataset provides information about the total [...] `_ [`Meta `_] * |OK_ICON| `Denver Open Data `_ [`Meta `_] @@ -627,7 +625,7 @@ Government * |OK_ICON| `EuroStat `_ [`Meta `_] -* |OK_ICON| `EveryPolitician - Ongoing project collating and sharing data on every [...] `_ [`Meta `_] +* |OK_ICON| `EveryPolitician - Ongoing project collating and sharing data on every politician. `_ [`Meta `_] * |OK_ICON| `Federal Committee on Statistical Methodology (FCSM) (formerly FedStats) `_ [`Meta `_] @@ -639,7 +637,7 @@ Government * |OK_ICON| `Gatineau, QC, Canada `_ [`Meta `_] -* |FIXME_ICON| `Germany `_ [`Meta `_] +* |OK_ICON| `Germany `_ [`Meta `_] * |OK_ICON| `Ghent, Belgium `_ [`Meta `_] @@ -661,7 +659,7 @@ Government * |OK_ICON| `Indonesian Data Portal `_ [`Meta `_] -* |OK_ICON| `Iowa - Welcome to the State of Iowa's data portal. Please explore data [...] `_ [`Meta `_] +* |OK_ICON| `Iowa - Welcome to the State of Iowa's data portal. Please explore data about Iowa and your [...] `_ [`Meta `_] * |OK_ICON| `Ireland's Open Data Portal `_ [`Meta `_] @@ -669,9 +667,9 @@ Government * |FIXME_ICON| `Istanbul Municipality Open Data Portal `_ [`Meta `_] -* |OK_ICON| `Italy - Il Portale dati.gov.it è il catalogo nazionale dei metadati [...] `_ [`Meta `_] +* |OK_ICON| `Italy - Il Portale dati.gov.it è il catalogo nazionale dei metadati relativi ai dati [...] `_ [`Meta `_] -* |OK_ICON| `Jail deaths in America - The U.S. government does not release jail by [...] `_ [`Meta `_] +* |OK_ICON| `Jail deaths in America - The U.S. government does not release jail by jail mortality data, [...] `_ [`Meta `_] * |OK_ICON| `Japan `_ [`Meta `_] @@ -709,7 +707,7 @@ Government * |OK_ICON| `Netherlands `_ [`Meta `_] -* |OK_ICON| `New York Department of Sanitation Monthly Tonnage - DSNY Monthly Tonnage [...] `_ [`Meta `_] +* |OK_ICON| `New York Department of Sanitation Monthly Tonnage - DSNY Monthly Tonnage Data provides [...] `_ [`Meta `_] * |OK_ICON| `New Zealand `_ [`Meta `_] @@ -731,7 +729,7 @@ Government * |OK_ICON| `Palo Alto, California, US `_ [`Meta `_] -* |OK_ICON| `OpenDataPhilly - OpenDataPhilly is a catalog of open data in the [...] `_ [`Meta `_] +* |OK_ICON| `OpenDataPhilly - OpenDataPhilly is a catalog of open data in the Philadelphia region. In [...] `_ [`Meta `_] * |OK_ICON| `Portland, Oregon `_ [`Meta `_] @@ -753,7 +751,7 @@ Government * |OK_ICON| `San Diego, CA `_ [`Meta `_] -* |OK_ICON| `San Antonio, TX - Community Information Now - CI:Now is a nonprofit [...] `_ [`Meta `_] +* |OK_ICON| `San Antonio, TX - Community Information Now - CI:Now is a nonprofit serving Bexar (San [...] `_ [`Meta `_] * |OK_ICON| `San Francisco Data sets `_ [`Meta `_] @@ -811,7 +809,7 @@ Government * |OK_ICON| `UK 2011 Census Open Atlas Project `_ [`Meta `_] -* |OK_ICON| `US Counties - This is a repository of various data, broken down by US [...] `_ [`Meta `_] +* |OK_ICON| `US Counties - This is a repository of various data, broken down by US county. While most of [...] `_ [`Meta `_] * |OK_ICON| `U.S. Patent and Trademark Office (USPTO) Bulk Data Products `_ [`Meta `_] @@ -831,28 +829,28 @@ Government * |OK_ICON| `Vienna, Austria `_ [`Meta `_] -* |FIXME_ICON| `Statistics from the General Statistics Office of Vietnam - Data in [...] `_ [`Meta `_] +* |FIXME_ICON| `Statistics from the General Statistics Office of Vietnam - Data in different categories are [...] `_ [`Meta `_] * |OK_ICON| `U.S. Congressional Research Service (CRS) Reports `_ [`Meta `_] Healthcare ---------- -* |OK_ICON| `AWS COVID-19 Datasets - We're working with organizations who make [...] `_ [`Meta `_] +* |OK_ICON| `AWS COVID-19 Datasets - We're working with organizations who make COVID-19-related data [...] `_ [`Meta `_] -* |OK_ICON| `COVID-19 Case Surveillance Public Use Data - The COVID-19 case [...] `_ [`Meta `_] +* |OK_ICON| `COVID-19 Case Surveillance Public Use Data - The COVID-19 case surveillance system database [...] `_ [`Meta `_] -* |OK_ICON| `Covid-19 non-processed data of Ecuador - It's a project which provides [...] `_ [`Meta `_] +* |OK_ICON| `Covid-19 non-processed data of Ecuador - It's a project which provides non-processed datasets [...] `_ [`Meta `_] -* |OK_ICON| `2019 Novel Coronavirus COVID-19 Data Repository by Johns Hopkins CSSE - [...] `_ [`Meta `_] +* |OK_ICON| `2019 Novel Coronavirus COVID-19 Data Repository by Johns Hopkins CSSE - This is the data [...] `_ [`Meta `_] -* |OK_ICON| `Coronavirus (Covid-19) Data in the United States - The New York Times is [...] `_ [`Meta `_] +* |OK_ICON| `Coronavirus (Covid-19) Data in the United States - The New York Times is releasing a series [...] `_ [`Meta `_] -* |FIXME_ICON| `COVID-19 Reported Patient Impact and Hospital Capacity by Facility - The [...] `_ [`Meta `_] +* |FIXME_ICON| `COVID-19 Reported Patient Impact and Hospital Capacity by Facility - The following dataset [...] `_ [`Meta `_] * |OK_ICON| `Composition of Foods Raw, Processed, Prepared USDA National Nutrient Database for Standard [...] `_ [`Meta `_] -* |OK_ICON| `The COVID Tracking Project - The COVID Tracking Project collects and [...] `_ [`Meta `_] +* |OK_ICON| `The COVID Tracking Project - The COVID Tracking Project collects and publishes the most [...] `_ [`Meta `_] * |OK_ICON| `EHDP Large Health Data Sets `_ [`Meta `_] @@ -862,7 +860,7 @@ Healthcare * |OK_ICON| `MeSH, the vocabulary thesaurus used for indexing articles for PubMed `_ [`Meta `_] -* |OK_ICON| `MeDAL - A large medical text dataset curated for abbreviation [...] `_ [`Meta `_] +* |OK_ICON| `MeDAL - A large medical text dataset curated for abbreviation disambiguation - Medical [...] `_ [`Meta `_] * |OK_ICON| `Medicare Coverage Database (MCD), U.S. `_ [`Meta `_] @@ -886,7 +884,7 @@ Healthcare * |OK_ICON| `World Health Organization Global Health Observatory `_ [`Meta `_] -* |OK_ICON| `Yahoo Knowledge Graph COVID-19 Datasets - The Yahoo Knowledge Graph team [...] `_ [`Meta `_] +* |OK_ICON| `Yahoo Knowledge Graph COVID-19 Datasets - The Yahoo Knowledge Graph team at Verizon Media is [...] `_ [`Meta `_] * |OK_ICON| `Informatics for Integrating Biology and the Bedside `_ [`Meta `_] @@ -901,25 +899,25 @@ ImageProcessing * |OK_ICON| `Affective Image Classification `_ [`Meta `_] -* |OK_ICON| `Airborne Object Detection and Tracking - The Airborne Object Tracking [...] `_ [`Meta `_] +* |OK_ICON| `Airborne Object Detection and Tracking - The Airborne Object Tracking (AOT) dataset is a [...] `_ [`Meta `_] * |OK_ICON| `Animals with attributes `_ [`Meta `_] -* |OK_ICON| `CADDY Underwater Stereo-Vision Dataset of divers' hand gestures - [...] `_ [`Meta `_] +* |OK_ICON| `CADDY Underwater Stereo-Vision Dataset of divers' hand gestures - Contains 10K stereo pair [...] `_ [`Meta `_] -* |OK_ICON| `Cytology Dataset – CCAgT: Images of Cervical Cells with AgNOR Stain [...] `_ [`Meta `_] +* |OK_ICON| `Cytology Dataset – CCAgT: Images of Cervical Cells with AgNOR Stain Technique - The dataset [...] `_ [`Meta `_] * |FIXME_ICON| `Caltech Pedestrian Detection Benchmark `_ [`Meta `_] -* |OK_ICON| `Chars74K dataset - Character Recognition in Natural Images (both English [...] `_ [`Meta `_] +* |OK_ICON| `Chars74K dataset - Character Recognition in Natural Images (both English and Kannada are available) `_ [`Meta `_] -* |OK_ICON| `Cube++ - 4890 raw 18-megapixel images, each containing a SpyderCube color [...] `_ [`Meta `_] +* |OK_ICON| `Cube++ - 4890 raw 18-megapixel images, each containing a SpyderCube color target in their [...] `_ [`Meta `_] -* |OK_ICON| `Densely Annotated Video Driving Data Set - This data set consists of 28 [...] `_ [`Meta `_] +* |OK_ICON| `Densely Annotated Video Driving Data Set - This data set consists of 28 video sequences of [...] `_ [`Meta `_] -* |OK_ICON| `Danbooru Tagged Anime Illustration Dataset - A large-scale anime image [...] `_ [`Meta `_] +* |OK_ICON| `Danbooru Tagged Anime Illustration Dataset - A large-scale anime image database with 3.33m+ [...] `_ [`Meta `_] -* |FIXME_ICON| `DukeMTMC Data Set - DukeMTMC aims to accelerate advances in multi-target [...] `_ [`Meta `_] +* |FIXME_ICON| `DukeMTMC Data Set - DukeMTMC aims to accelerate advances in multi-target multi-camera [...] `_ [`Meta `_] * |OK_ICON| `ETH Entomological Collection (ETHEC) Fine Grained Butterfly (Lepidoptra) Images `_ [`Meta `_] @@ -929,7 +927,7 @@ ImageProcessing * |OK_ICON| `GDXray - X-ray images for X-ray testing and Computer Vision `_ [`Meta `_] -* |OK_ICON| `HumanEva Dataset - The HumanEva-I dataset contains 7 calibrated video [...] `_ [`Meta `_] +* |OK_ICON| `HumanEva Dataset - The HumanEva-I dataset contains 7 calibrated video sequences (4 grayscale [...] `_ [`Meta `_] * |OK_ICON| `ImageNet (in WordNet hierarchy) `_ [`Meta `_] @@ -939,23 +937,23 @@ ImageProcessing * |OK_ICON| `KITTI Vision Benchmark Suite `_ [`Meta `_] -* |OK_ICON| `Labeled Information Library of Alexandria - Biology and Conservation - [...] `_ [`Meta `_] +* |OK_ICON| `Labeled Information Library of Alexandria - Biology and Conservation - Contains over 10 [...] `_ [`Meta `_] * |OK_ICON| `MNIST database of handwritten digits, near 1 million examples `_ [`Meta `_] -* |OK_ICON| `Multi-View Region of Interest Prediction Dataset for Autonomous Driving - [...] `_ [`Meta `_] +* |OK_ICON| `Multi-View Region of Interest Prediction Dataset for Autonomous Driving - Contains 16 driving [...] `_ [`Meta `_] * |OK_ICON| `Massive Visual Memory Stimuli, MIT `_ [`Meta `_] -* |OK_ICON| `Newspaper Navigator - This dataset consists of extracted visual content [...] `_ [`Meta `_] +* |OK_ICON| `Newspaper Navigator - This dataset consists of extracted visual content for 16,358,041 [...] `_ [`Meta `_] -* |OK_ICON| `Open Images From Google - Pictures with segmentation masks for 2.8 [...] `_ [`Meta `_] +* |OK_ICON| `Open Images From Google - Pictures with segmentation masks for 2.8 million object instances [...] `_ [`Meta `_] -* |OK_ICON| `RuFa - Contains images of text written in one of two Arabic fonts (Ruqaa [...] `_ [`Meta `_] +* |OK_ICON| `RuFa - Contains images of text written in one of two Arabic fonts (Ruqaa and Nastaliq [...] `_ [`Meta `_] * |OK_ICON| `SUN database, MIT `_ [`Meta `_] -* |OK_ICON| `SVIRO Synthetic Vehicle Interior Rear Seat Occupancy - 25.000 synthetic [...] `_ [`Meta `_] +* |OK_ICON| `SVIRO Synthetic Vehicle Interior Rear Seat Occupancy - 25.000 synthetic scenery's across ten [...] `_ [`Meta `_] * |FIXME_ICON| `Several Shape-from-Silhouette Datasets `_ [`Meta `_] @@ -974,11 +972,11 @@ ImageProcessing MachineLearning --------------- -* |OK_ICON| `All-Age-Faces Dataset - Contains 13'322 Asian face images distributed [...] `_ [`Meta `_] +* |OK_ICON| `All-Age-Faces Dataset - Contains 13'322 Asian face images distributed across all ages (from 2 [...] `_ [`Meta `_] -* |OK_ICON| `Audi Autonomous Driving Dataset - We have published the Audi Autonomous [...] `_ [`Meta `_] +* |OK_ICON| `Audi Autonomous Driving Dataset - We have published the Audi Autonomous Driving Dataset [...] `_ [`Meta `_] -* |OK_ICON| `B3FD - Facial age (and gender) estimation dataset with 375k images - The [...] `_ [`Meta `_] +* |OK_ICON| `B3FD - Facial age (and gender) estimation dataset with 375k images - The B3FD dataset is a [...] `_ [`Meta `_] * |OK_ICON| `Context-aware data sets from five domains `_ [`Meta `_] @@ -986,7 +984,7 @@ MachineLearning * |OK_ICON| `Discogs Monthly Data `_ [`Meta `_] -* |OK_ICON| `Fluorescent Neuronal Cells - By releasing this dataset, we aim at [...] `_ [`Meta `_] +* |OK_ICON| `Fluorescent Neuronal Cells - By releasing this dataset, we aim at providing a new testbed for [...] `_ [`Meta `_] * |OK_ICON| `Free Music Archive `_ [`Meta `_] @@ -996,7 +994,7 @@ MachineLearning * |OK_ICON| `Keel Repository for classification, regression and time series `_ [`Meta `_] -* |OK_ICON| `LLVIP - This dataset contains 30976 images, or 15488 pairs, most of which [...] `_ [`Meta `_] +* |OK_ICON| `LLVIP - This dataset contains 30976 images, or 15488 pairs, most of which were taken at very [...] `_ [`Meta `_] * |OK_ICON| `Labeled Faces in the Wild (LFW) `_ [`Meta `_] @@ -1014,11 +1012,11 @@ MachineLearning * |FIXME_ICON| `RDataMining - "R and Data Mining" ebook data `_ [`Meta `_] -* |OK_ICON| `Registered Meteorites on Earth `_ [`Meta `_] +* |FIXME_ICON| `Registered Meteorites on Earth `_ [`Meta `_] * |OK_ICON| `Restaurants Health Score Data in San Francisco `_ [`Meta `_] -* |OK_ICON| `TikTok Dataset - More than 300 dance videos that capture a single person [...] `_ [`Meta `_] +* |OK_ICON| `TikTok Dataset - More than 300 dance videos that capture a single person performing dance [...] `_ [`Meta `_] * |OK_ICON| `UCI Machine Learning Repository `_ [`Meta `_] @@ -1056,7 +1054,7 @@ NaturalLanguage * |OK_ICON| `The Big Bad NLP Database `_ [`Meta `_] -* |OK_ICON| `Blizzard Challenge Speech - The speech + text data comes from [...] `_ [`Meta `_] +* |OK_ICON| `Blizzard Challenge Speech - The speech + text data comes from professional audiobooks [...] `_ [`Meta `_] * |OK_ICON| `Blogger Corpus `_ [`Meta `_] @@ -1068,17 +1066,17 @@ NaturalLanguage * |OK_ICON| `DBpedia - Structured data from Wikipedia `_ [`Meta `_] -* |OK_ICON| `Dirty Words - With millions of images in our library and billions of [...] `_ [`Meta `_] +* |OK_ICON| `Dirty Words - With millions of images in our library and billions of user-submitted keywords, [...] `_ [`Meta `_] * |FIXME_ICON| `Flickr Personal Taxonomies `_ [`Meta `_] * |FIXME_ICON| `Freebase of people, places, and things `_ [`Meta `_] -* |OK_ICON| `German Political Speeches Corpus - Collection of political speeches from [...] `_ [`Meta `_] +* |OK_ICON| `German Political Speeches Corpus - Collection of political speeches from the German [...] `_ [`Meta `_] * |OK_ICON| `Google Books Ngrams (2.2TB) `_ [`Meta `_] -* |OK_ICON| `Google MC-AFP - Generated based on the public available Gigaword dataset [...] `_ [`Meta `_] +* |OK_ICON| `Google MC-AFP - Generated based on the public available Gigaword dataset using Paragraph Vectors `_ [`Meta `_] * |OK_ICON| `Google Web 5gram (1TB, 2006) `_ [`Meta `_] @@ -1086,9 +1084,9 @@ NaturalLanguage * |FIXME_ICON| `Hansards text chunks of Canadian Parliament `_ [`Meta `_] -* |OK_ICON| `LJ Speech - Speech dataset consisting of 13,100 short audio clips of a [...] `_ [`Meta `_] +* |OK_ICON| `LJ Speech - Speech dataset consisting of 13,100 short audio clips of a single speaker reading [...] `_ [`Meta `_] -* |FIXME_ICON| `M-AILabs Speech - The M-AILABS Speech Dataset is the first large dataset [...] `_ [`Meta `_] +* |FIXME_ICON| `M-AILabs Speech - The M-AILABS Speech Dataset is the first large dataset that we are [...] `_ [`Meta `_] * |OK_ICON| `Microsoft MAchine Reading COmprehension Dataset (or MS MARCO) `_ [`Meta `_] @@ -1102,9 +1100,9 @@ NaturalLanguage * |OK_ICON| `Multi-Domain Sentiment Dataset (version 2.0) `_ [`Meta `_] -* |OK_ICON| `No Language Left Behind (NLLB - 200vo) - Dataset based on Meta's metadata [...] `_ [`Meta `_] +* |OK_ICON| `No Language Left Behind (NLLB - 200vo) - Dataset based on Meta's metadata for mined bitext. [...] `_ [`Meta `_] -* |OK_ICON| `Noisy speech database for training speech enhancement algorithms and TTS [...] `_ [`Meta `_] +* |OK_ICON| `Noisy speech database for training speech enhancement algorithms and TTS models - Clean and [...] `_ [`Meta `_] * |FIXME_ICON| `Open Multilingual Wordnet `_ [`Meta `_] @@ -1130,9 +1128,9 @@ NaturalLanguage * |OK_ICON| `WordNet databases and tools `_ [`Meta `_] -* |OK_ICON| `Wordbank - Open, de-identified database of vocabulary development from [...] `_ [`Meta `_] +* |OK_ICON| `Wordbank - Open, de-identified database of vocabulary development from 84,138 children and [...] `_ [`Meta `_] -* |OK_ICON| `WorldTree Corpus of Explanation Graphs for Elementary Science Questions - [...] `_ [`Meta `_] +* |OK_ICON| `WorldTree Corpus of Explanation Graphs for Elementary Science Questions - a corpus of [...] `_ [`Meta `_] Neuroscience ------------ @@ -1157,7 +1155,7 @@ Neuroscience * |OK_ICON| `NeuroData `_ [`Meta `_] -* |FIXME_ICON| `NeuroMorpho - NeuroMorpho.Org is a centrally curated inventory of [...] `_ [`Meta `_] +* |FIXME_ICON| `NeuroMorpho - NeuroMorpho.Org is a centrally curated inventory of digitally reconstructed [...] `_ [`Meta `_] * |OK_ICON| `Neuroelectro `_ [`Meta `_] @@ -1169,7 +1167,7 @@ Neuroscience * |OK_ICON| `Study Forrest `_ [`Meta `_] -* |OK_ICON| `The Nencki-Symfonia EEG/ERP dataset - A high-density [...] `_ [`Meta `_] +* |FIXME_ICON| `The Nencki-Symfonia EEG/ERP dataset - A high-density electroencephalography (EEG) dataset [...] `_ [`Meta `_] Physics ------- @@ -1180,7 +1178,7 @@ Physics * |OK_ICON| `IceCube - South Pole Neutrino Observatory `_ [`Meta `_] -* |OK_ICON| `Ligo Open Science Center (LOSC) - Gravitational wave data from the LIGO [...] `_ [`Meta `_] +* |OK_ICON| `Ligo Open Science Center (LOSC) - Gravitational wave data from the LIGO Hanford and [...] `_ [`Meta `_] * |OK_ICON| `NASA Exoplanet Archive `_ [`Meta `_] @@ -1191,87 +1189,87 @@ Physics ProstateCancer -------------- -* |OK_ICON| `EOPC-DE-Early-Onset-Prostate-Cancer-Germany - Early Onset Prostate Cancer [...] `_ [`Meta `_] +* |OK_ICON| `EOPC-DE-Early-Onset-Prostate-Cancer-Germany - Early Onset Prostate Cancer - Germany. [...] `_ [`Meta `_] -* |OK_ICON| `GENIE - Data from the Genomics Evidence Neoplasia Information Exchange [...] `_ [`Meta `_] +* |OK_ICON| `GENIE - Data from the Genomics Evidence Neoplasia Information Exchange (GENIE) project of the [...] `_ [`Meta `_] -* |OK_ICON| `Genomic-Hallmarks-Prostate-Adenocarcinoma-CPC-GENE - Comprehensive [...] `_ [`Meta `_] +* |OK_ICON| `Genomic-Hallmarks-Prostate-Adenocarcinoma-CPC-GENE - Comprehensive genomic profiling of 477 [...] `_ [`Meta `_] -* |OK_ICON| `MSK-IMPACT-Clinical-Sequencing-Cohort-MSKCC-Prostate-Cancer - Targeted [...] `_ [`Meta `_] +* |OK_ICON| `MSK-IMPACT-Clinical-Sequencing-Cohort-MSKCC-Prostate-Cancer - Targeted sequencing of clinical [...] `_ [`Meta `_] -* |OK_ICON| `Metastatic-Prostate-Adenocarcinoma-MCTP - Comprehensive profiling of 61 [...] `_ [`Meta `_] +* |OK_ICON| `Metastatic-Prostate-Adenocarcinoma-MCTP - Comprehensive profiling of 61 prostate cancer [...] `_ [`Meta `_] -* |OK_ICON| `Metastatic-Prostate-Cancer-SU2CPCF-Dream-Team - Comprehensive analysis of [...] `_ [`Meta `_] +* |OK_ICON| `Metastatic-Prostate-Cancer-SU2CPCF-Dream-Team - Comprehensive analysis of 150 metastatic [...] `_ [`Meta `_] -* |OK_ICON| `NPCR-2001-2015 - Database from CDC's National Program of Cancer [...] `_ [`Meta `_] +* |OK_ICON| `NPCR-2001-2015 - Database from CDC's National Program of Cancer Registries (NPCR). The [...] `_ [`Meta `_] -* |OK_ICON| `NPCR-2005-2015 - Database from CDC's National Program of Cancer [...] `_ [`Meta `_] +* |OK_ICON| `NPCR-2005-2015 - Database from CDC's National Program of Cancer Registries (NPCR). The [...] `_ [`Meta `_] -* |OK_ICON| `NaF-Prostate - NaF Prostate is a collection of F-18 NaF positron emission [...] `_ [`Meta `_] +* |OK_ICON| `NaF-Prostate - NaF Prostate is a collection of F-18 NaF positron emission tomography/computed [...] `_ [`Meta `_] -* |OK_ICON| `Neuroendocrine-Prostate-Cancer - Whole exome and RNA Seq data of [...] `_ [`Meta `_] +* |OK_ICON| `Neuroendocrine-Prostate-Cancer - Whole exome and RNA Seq data of castration resistant [...] `_ [`Meta `_] -* |OK_ICON| `PLCO-Prostate-Diagnostic-Procedures - The Prostate Diagnostic Procedures [...] `_ [`Meta `_] +* |OK_ICON| `PLCO-Prostate-Diagnostic-Procedures - The Prostate Diagnostic Procedures dataset (95,837 [...] `_ [`Meta `_] -* |OK_ICON| `PLCO-Prostate-Medical-Complications - The Prostate Medical Complications [...] `_ [`Meta `_] +* |OK_ICON| `PLCO-Prostate-Medical-Complications - The Prostate Medical Complications dataset (3,350 [...] `_ [`Meta `_] -* |OK_ICON| `PLCO-Prostate-Screening-Abnormalities - The Prostate Screening [...] `_ [`Meta `_] +* |OK_ICON| `PLCO-Prostate-Screening-Abnormalities - The Prostate Screening Abnormalities dataset (10,527 [...] `_ [`Meta `_] -* |OK_ICON| `PLCO-Prostate-Screening - The Prostate Screening dataset (177,315 [...] `_ [`Meta `_] +* |OK_ICON| `PLCO-Prostate-Screening - The Prostate Screening dataset (177,315 records, 35,875 subjects, [...] `_ [`Meta `_] -* |OK_ICON| `PLCO-Prostate-Treatments - The Prostate Treatments dataset (13,409 [...] `_ [`Meta `_] +* |OK_ICON| `PLCO-Prostate-Treatments - The Prostate Treatments dataset (13,409 records, 7,614 subjects, [...] `_ [`Meta `_] -* |OK_ICON| `PLCO-Prostate - The Prostate dataset is a comprehensive dataset that [...] `_ [`Meta `_] +* |OK_ICON| `PLCO-Prostate - The Prostate dataset is a comprehensive dataset that contains nearly all the [...] `_ [`Meta `_] -* |OK_ICON| `PRAD-CA-Prostate-Adenocarcinoma-Canada - Prostate Adenocarcinoma - [...] `_ [`Meta `_] +* |OK_ICON| `PRAD-CA-Prostate-Adenocarcinoma-Canada - Prostate Adenocarcinoma - Canada. Collected by the [...] `_ [`Meta `_] -* |OK_ICON| `PRAD-FR-Prostate-Adenocarcinoma-France - Prostate Adenocarcinoma - [...] `_ [`Meta `_] +* |OK_ICON| `PRAD-FR-Prostate-Adenocarcinoma-France - Prostate Adenocarcinoma - France. Collected by ten [...] `_ [`Meta `_] -* |OK_ICON| `PRAD-UK-Prostate-Adenocarcinoma-United-Kingdom - Prostate Adenocarcinoma [...] `_ [`Meta `_] +* |OK_ICON| `PRAD-UK-Prostate-Adenocarcinoma-United-Kingdom - Prostate Adenocarcinoma - United Kingdom. [...] `_ [`Meta `_] -* |FIXME_ICON| `PROSTATEx-Challenge - Retrospective set of prostate MR studies. All [...] `_ [`Meta `_] +* |FIXME_ICON| `PROSTATEx-Challenge - Retrospective set of prostate MR studies. All studies included [...] `_ [`Meta `_] -* |OK_ICON| `Prostate-3T - The Prostate-3T project provided imaging data to TCIA as [...] `_ [`Meta `_] +* |OK_ICON| `Prostate-3T - The Prostate-3T project provided imaging data to TCIA as part of an ISBI [...] `_ [`Meta `_] -* |OK_ICON| `Prostate-Adenocarcinoma-Broad-Cornell-2012 - Comprehensive profiling of [...] `_ [`Meta `_] +* |OK_ICON| `Prostate-Adenocarcinoma-Broad-Cornell-2012 - Comprehensive profiling of 112 prostate cancer [...] `_ [`Meta `_] -* |OK_ICON| `Prostate-Adenocarcinoma-Broad-Cornell-2013 - Comprehensive profiling of [...] `_ [`Meta `_] +* |OK_ICON| `Prostate-Adenocarcinoma-Broad-Cornell-2013 - Comprehensive profiling of 57 prostate cancer [...] `_ [`Meta `_] -* |OK_ICON| `Prostate-Adenocarcinoma-CNA-study-MSKCC - Copy-number profiling of 103 [...] `_ [`Meta `_] +* |OK_ICON| `Prostate-Adenocarcinoma-CNA-study-MSKCC - Copy-number profiling of 103 primary prostate [...] `_ [`Meta `_] -* |OK_ICON| `Prostate-Adenocarcinoma-Fred-Hutchinson-CRC - Comprehensive profiling of [...] `_ [`Meta `_] +* |OK_ICON| `Prostate-Adenocarcinoma-Fred-Hutchinson-CRC - Comprehensive profiling of prostate cancer [...] `_ [`Meta `_] -* |OK_ICON| `Prostate Adenocarcinoma (MSKCC/DFCI) - Whole Exome Sequencing of 1013 [...] `_ [`Meta `_] +* |OK_ICON| `Prostate Adenocarcinoma (MSKCC/DFCI) - Whole Exome Sequencing of 1013 prostate cancer samples. `_ [`Meta `_] -* |OK_ICON| `Prostate-Adenocarcinoma-MSKCC - MSKCC Prostate Oncogenome Project. 181 [...] `_ [`Meta `_] +* |OK_ICON| `Prostate-Adenocarcinoma-MSKCC - MSKCC Prostate Oncogenome Project. 181 primary, 37 metastatic [...] `_ [`Meta `_] -* |OK_ICON| `Prostate-Adenocarcinoma-Organoids-MSKCC - Exome profiling of prostate [...] `_ [`Meta `_] +* |OK_ICON| `Prostate-Adenocarcinoma-Organoids-MSKCC - Exome profiling of prostate cancer samples and [...] `_ [`Meta `_] -* |OK_ICON| `Prostate-Adenocarcinoma-Sun-Lab - Whole-genome and Transcriptome [...] `_ [`Meta `_] +* |OK_ICON| `Prostate-Adenocarcinoma-Sun-Lab - Whole-genome and Transcriptome Sequencing of 65 Prostate [...] `_ [`Meta `_] -* |OK_ICON| `Prostate-Adenocarcinoma-TCGA-PanCancer-Atlas - Comprehensive TCGA [...] `_ [`Meta `_] +* |OK_ICON| `Prostate-Adenocarcinoma-TCGA-PanCancer-Atlas - Comprehensive TCGA PanCanAtlas data from 11k [...] `_ [`Meta `_] -* |OK_ICON| `Prostate-Adenocarcinoma-TCGA - Integrated profiling of 333 primary [...] `_ [`Meta `_] +* |OK_ICON| `Prostate-Adenocarcinoma-TCGA - Integrated profiling of 333 primary prostate adenocarcinoma samples. `_ [`Meta `_] -* |OK_ICON| `Prostate-Diagnosis - PCa T1- and T2-weighted magnetic resonance images [...] `_ [`Meta `_] +* |OK_ICON| `Prostate-Diagnosis - PCa T1- and T2-weighted magnetic resonance images (MRIs) were acquired [...] `_ [`Meta `_] -* |OK_ICON| `Prostate-Fused-MRI-Pathology - The Prostate Fused-MRI-Pathology [...] `_ [`Meta `_] +* |OK_ICON| `Prostate-Fused-MRI-Pathology - The Prostate Fused-MRI-Pathology collection is a combination [...] `_ [`Meta `_] -* |OK_ICON| `Prostate-MRI - The Prostate-MRI collection of prostate Magnetic Resonance [...] `_ [`Meta `_] +* |OK_ICON| `Prostate-MRI - The Prostate-MRI collection of prostate Magnetic Resonance Images (MRIs) was [...] `_ [`Meta `_] -* |OK_ICON| `Prostate-R - The R package 'ElemStatLearn' contains a prostate cancer [...] `_ [`Meta `_] +* |OK_ICON| `Prostate-R - The R package 'ElemStatLearn' contains a prostate cancer dataset from Stamey et [...] `_ [`Meta `_] -* |OK_ICON| `QIN-PROSTATE-Repeatability - The QIN-PROSTATE-Repeatability dataset is a [...] `_ [`Meta `_] +* |OK_ICON| `QIN-PROSTATE-Repeatability - The QIN-PROSTATE-Repeatability dataset is a dataset with [...] `_ [`Meta `_] -* |OK_ICON| `QIN-PROSTATE - The QIN PROSTATE collection of the Quantitative Imaging [...] `_ [`Meta `_] +* |OK_ICON| `QIN-PROSTATE - The QIN PROSTATE collection of the Quantitative Imaging Network (QIN) contains [...] `_ [`Meta `_] -* |OK_ICON| `SEER-YR1973_2015.SEER9 - The SEER November 2017 Research Data files from [...] `_ [`Meta `_] +* |OK_ICON| `SEER-YR1973_2015.SEER9 - The SEER November 2017 Research Data files from nine SEER registries [...] `_ [`Meta `_] -* |OK_ICON| `SEER-YR1992_2015.SJ_LA_RG_AK - The SEER November 2017 Research Data files [...] `_ [`Meta `_] +* |OK_ICON| `SEER-YR1992_2015.SJ_LA_RG_AK - The SEER November 2017 Research Data files from the San Jose- [...] `_ [`Meta `_] -* |OK_ICON| `SEER-YR2000_2015.CA_KY_LO_NJ_GA - The SEER November 2017 Research Data [...] `_ [`Meta `_] +* |OK_ICON| `SEER-YR2000_2015.CA_KY_LO_NJ_GA - The SEER November 2017 Research Data files from the Greater [...] `_ [`Meta `_] -* |OK_ICON| `SEER-YR2000_2015.CA_KY_LO_NJ_GA - The July - December 2005 diagnoses for [...] `_ [`Meta `_] +* |OK_ICON| `SEER-YR2000_2015.CA_KY_LO_NJ_GA - The July - December 2005 diagnoses for Louisiana from their [...] `_ [`Meta `_] * |OK_ICON| `TCGA-PRAD-US - TCGA Prostate Adenocarcinoma (499 samples). `_ [`Meta `_] @@ -1295,7 +1293,7 @@ PublicDomains * |OK_ICON| `CMU StatLab collections `_ [`Meta `_] -* |FIXME_ICON| `Data.World `_ [`Meta `_] +* |OK_ICON| `Data.World `_ [`Meta `_] * |FIXME_ICON| `Data360 `_ [`Meta `_] @@ -1303,7 +1301,7 @@ PublicDomains * |OK_ICON| `Google `_ [`Meta `_] -* |FIXME_ICON| `Grand Comics Database - The Grand Comics Database (GCD) is a nonprofit, [...] `_ [`Meta `_] +* |FIXME_ICON| `Grand Comics Database - The Grand Comics Database (GCD) is a nonprofit, internet-based [...] `_ [`Meta `_] * |FIXME_ICON| `Infochimps `_ [`Meta `_] @@ -1323,7 +1321,7 @@ PublicDomains * |OK_ICON| `Sample R data sets `_ [`Meta `_] -* |OK_ICON| `Stack Overflow Annual Developer Survey - Annual developer surverys full [...] `_ [`Meta `_] +* |OK_ICON| `Stack Overflow Annual Developer Survey - Annual developer surverys full data sets from 2011 [...] `_ [`Meta `_] * |OK_ICON| `StatSci.org `_ [`Meta `_] @@ -1369,7 +1367,7 @@ SearchEngines SocialNetworks -------------- -* |OK_ICON| `2021 Portuguese Elections Twitter Dataset - 57M+ tweets, 1M+ users - This [...] `_ [`Meta `_] +* |OK_ICON| `2021 Portuguese Elections Twitter Dataset - 57M+ tweets, 1M+ users - This dataset contains [...] `_ [`Meta `_] * |OK_ICON| `72 hours #gamergate Twitter Scrape `_ [`Meta `_] @@ -1377,19 +1375,19 @@ SocialNetworks * |OK_ICON| `Cheng-Caverlee-Lee September 2009 - January 2010 Twitter Scrape `_ [`Meta `_] -* |OK_ICON| `China Biographical Database - The China Biographical Database is a freely [...] `_ [`Meta `_] +* |OK_ICON| `China Biographical Database - The China Biographical Database is a freely accessible [...] `_ [`Meta `_] * |OK_ICON| `Clubhouse Dataset `_ [`Meta `_] -* |OK_ICON| `A Twitter Dataset of 40+ million tweets related to COVID-19 - Due to the [...] `_ [`Meta `_] +* |OK_ICON| `A Twitter Dataset of 40+ million tweets related to COVID-19 - Due to the relevance of the [...] `_ [`Meta `_] -* |OK_ICON| `43k+ Donald Trump Twitter Screenshots - This archive contains screenshots [...] `_ [`Meta `_] +* |OK_ICON| `43k+ Donald Trump Twitter Screenshots - This archive contains screenshots of 43,475 Donald [...] `_ [`Meta `_] * |OK_ICON| `EDRM Enron EMail of 151 users, hosted on S3 `_ [`Meta `_] * |OK_ICON| `Facebook Data Scrape (2005) `_ [`Meta `_] -* |OK_ICON| `Facebook Social Connectedness Index - We use an anonymized snapshot of [...] `_ [`Meta `_] +* |OK_ICON| `Facebook Social Connectedness Index - We use an anonymized snapshot of all active Facebook [...] `_ [`Meta `_] * |OK_ICON| `Facebook Social Networks from LAW (since 2007) `_ [`Meta `_] @@ -1407,7 +1405,7 @@ SocialNetworks * |OK_ICON| `Network Twitter Data `_ [`Meta `_] -* |OK_ICON| `Reddit Comments `_ [`Meta `_] +* |FIXME_ICON| `Reddit Comments `_ [`Meta `_] * |OK_ICON| `Skytrax' Air Travel Reviews Dataset `_ [`Meta `_] @@ -1415,7 +1413,7 @@ SocialNetworks * |FIXME_ICON| `SourceForge.net Research Data `_ [`Meta `_] -* |OK_ICON| `The Reddit COVID dataset - This dataset attempts to capture the full [...] `_ [`Meta `_] +* |OK_ICON| `The Reddit COVID dataset - This dataset attempts to capture the full extent of COVID-19 [...] `_ [`Meta `_] * |OK_ICON| `Twitch Top Streamer's Data `_ [`Meta `_] @@ -1429,7 +1427,7 @@ SocialNetworks * |OK_ICON| `UNIMI/LAW Social Network Datasets `_ [`Meta `_] -* |OK_ICON| `United States Congress Twitter Data - Daily datasets with tweets of 1100+ [...] `_ [`Meta `_] +* |OK_ICON| `United States Congress Twitter Data - Daily datasets with tweets of 1100+ accounts associated [...] `_ [`Meta `_] * |OK_ICON| `Yahoo! Graph and Social Data `_ [`Meta `_] @@ -1440,7 +1438,7 @@ SocialSciences * |OK_ICON| `ACLED (Armed Conflict Location & Event Data Project) `_ [`Meta `_] -* |OK_ICON| `Authoritarian Ruling Elites Database - The Authoritarian Ruling Elites [...] `_ [`Meta `_] +* |OK_ICON| `Authoritarian Ruling Elites Database - The Authoritarian Ruling Elites Database (ARED) is a [...] `_ [`Meta `_] * |OK_ICON| `Canadian Legal Information Institute `_ [`Meta `_] @@ -1466,7 +1464,7 @@ SocialSciences * |OK_ICON| `Global Religious Futures Project `_ [`Meta `_] -* |OK_ICON| `Gun Violence Data - A comprehensive, accessible database that contains [...] `_ [`Meta `_] +* |OK_ICON| `Gun Violence Data - A comprehensive, accessible database that contains records of over 260k [...] `_ [`Meta `_] * |OK_ICON| `Humanitarian Data Exchange `_ [`Meta `_] @@ -1486,9 +1484,9 @@ SocialSciences * |FIXME_ICON| `MacroData Guide by Norsk samfunnsvitenskapelig datatjeneste `_ [`Meta `_] -* |OK_ICON| `Mass Mobilization Data Project - The Mass Mobilization (MM) data are an [...] `_ [`Meta `_] +* |OK_ICON| `Mass Mobilization Data Project - The Mass Mobilization (MM) data are an effort to understand [...] `_ [`Meta `_] -* |OK_ICON| `Microsoft Academic Knowledge Graph - The Microsoft Academic Knowledge [...] `_ [`Meta `_] +* |OK_ICON| `Microsoft Academic Knowledge Graph - The Microsoft Academic Knowledge Graph is a large RDF [...] `_ [`Meta `_] * |OK_ICON| `Minnesota Population Center `_ [`Meta `_] @@ -1496,7 +1494,7 @@ SocialSciences * |OK_ICON| `Open Crime and Policing Data in England, Wales and Northern Ireland `_ [`Meta `_] -* |OK_ICON| `OpenSanctions - A global database of persons and companies of political, [...] `_ [`Meta `_] +* |OK_ICON| `OpenSanctions - A global database of persons and companies of political, criminal, or [...] `_ [`Meta `_] * |OK_ICON| `Paul Hensel General International Data Page `_ [`Meta `_] @@ -1528,7 +1526,7 @@ SocialSciences * |OK_ICON| `World Bank Open Data `_ [`Meta `_] -* |OK_ICON| `World Inequality Database - The World Inequality Database (WID.world) [...] `_ [`Meta `_] +* |OK_ICON| `World Inequality Database - The World Inequality Database (WID.world) aims to provide open [...] `_ [`Meta `_] * |FIXME_ICON| `WorldPop project - Worldwide human population distributions `_ [`Meta `_] @@ -1537,30 +1535,30 @@ Software * |OK_ICON| `FLOSSmole data about free, libre, and open source software development `_ [`Meta `_] -* |FIXME_ICON| `GHTorrent - Scalable, queryable, offline mirror of data offered through [...] `_ [`Meta `_] +* |FIXME_ICON| `GHTorrent - Scalable, queryable, offline mirror of data offered through the GitHub REST API. `_ [`Meta `_] * |OK_ICON| `Libraries.io Open Source Repository and Dependency Metadata `_ [`Meta `_] -* |OK_ICON| `Public Git Archive - a Big Code dataset for all – dataset of 182,014 top- [...] `_ [`Meta `_] +* |OK_ICON| `Public Git Archive - a Big Code dataset for all – dataset of 182,014 top-bookmarked Git [...] `_ [`Meta `_] -* |OK_ICON| `Code duplicates - 2k Java file and 600 Java function pairs labeled as [...] `_ [`Meta `_] +* |OK_ICON| `Code duplicates - 2k Java file and 600 Java function pairs labeled as similar or different by [...] `_ [`Meta `_] * |OK_ICON| `Commit messages - 1.3 billion GitHub commit messages till March 2019 `_ [`Meta `_] -* |OK_ICON| `Pull Request review comments - 25.3 million GitHub PR review comments [...] `_ [`Meta `_] +* |OK_ICON| `Pull Request review comments - 25.3 million GitHub PR review comments since January 2015 till [...] `_ [`Meta `_] -* |OK_ICON| `Source Code Identifiers - 41.7 million distinct splittable identifiers [...] `_ [`Meta `_] +* |OK_ICON| `Source Code Identifiers - 41.7 million distinct splittable identifiers collected from 182,014 [...] `_ [`Meta `_] Sports ------ -* |OK_ICON| `American Ninja Warrior Obstacles - Contains every obstacle in the history [...] `_ [`Meta `_] +* |OK_ICON| `American Ninja Warrior Obstacles - Contains every obstacle in the history of American Ninja [...] `_ [`Meta `_] * |FIXME_ICON| `Betfair Historical Exchange Data `_ [`Meta `_] * |OK_ICON| `Cricsheet Matches (cricket) `_ [`Meta `_] -* |OK_ICON| `Equity in Athletics - The Equity in Athletics Data Analysis Cutting Tool [...] `_ [`Meta `_] +* |OK_ICON| `Equity in Athletics - The Equity in Athletics Data Analysis Cutting Tool is brought to you by [...] `_ [`Meta `_] * |OK_ICON| `Ergast Formula 1, from 1950 up to date (API) `_ [`Meta `_] @@ -1572,7 +1570,7 @@ Sports * |OK_ICON| `Pinhooker: Thoroughbred Bloodstock Sale Data `_ [`Meta `_] -* |OK_ICON| `Pro Kabadi season 1 to 7 - Pro Kabadi League is a professional-level [...] `_ [`Meta `_] +* |OK_ICON| `Pro Kabadi season 1 to 7 - Pro Kabadi League is a professional-level Kabaddi league in India. [...] `_ [`Meta `_] * |OK_ICON| `Retrosheet Baseball Statistics `_ [`Meta `_] @@ -1580,14 +1578,14 @@ Sports * |OK_ICON| `Tennis database of rankings, results, and stats for WTA `_ [`Meta `_] -* |OK_ICON| `Transfermarkt Datasets - Clean, structured and automatically updated [...] `_ [`Meta `_] +* |OK_ICON| `Transfermarkt Datasets - Clean, structured and automatically updated football (soccer) data [...] `_ [`Meta `_] -* |OK_ICON| `USA Soccer Teams and Locations - USA soccer teams and locations. MLS, [...] `_ [`Meta `_] +* |OK_ICON| `USA Soccer Teams and Locations - USA soccer teams and locations. MLS, NWSL, and USL [...] `_ [`Meta `_] TimeSeries ---------- -* |OK_ICON| `3W dataset - To the best of its authors' knowledge, this is the first [...] `_ [`Meta `_] +* |OK_ICON| `3W dataset - To the best of its authors' knowledge, this is the first realistic and public [...] `_ [`Meta `_] * |OK_ICON| `Databanks International Cross National Time Series Data Archive `_ [`Meta `_] @@ -1597,7 +1595,7 @@ TimeSeries * |OK_ICON| `Time Series Data Library (TSDL) from MU `_ [`Meta `_] -* |OK_ICON| `Turing Change Point Dataset - Contains 42 annotated time series collected [...] `_ [`Meta `_] +* |OK_ICON| `Turing Change Point Dataset - Contains 42 annotated time series collected for the development [...] `_ [`Meta `_] * |OK_ICON| `UC Riverside Time Series Dataset `_ [`Meta `_] @@ -1652,12 +1650,12 @@ Transportation * |FIXME_ICON| `U.S. Freight Analysis Framework since 2007 `_ [`Meta `_] -* |OK_ICON| `U.S. National Highway Traffic Safety Administration - Fatalities since [...] `_ [`Meta `_] +* |OK_ICON| `U.S. National Highway Traffic Safety Administration - Fatalities since 1975 - Contains CSV [...] `_ [`Meta `_] eSports ------- -* |OK_ICON| `CS:GO Competitive Matchmaking Data - In this data set we have data about [...] `_ [`Meta `_] +* |OK_ICON| `CS:GO Competitive Matchmaking Data - In this data set we have data about the CSGO matchmaking [...] `_ [`Meta `_] * |OK_ICON| `FIFA-2021 Complete Player Dataset `_ [`Meta `_]