Awesome Public Datasets
=======================

.. image:: https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg
   :alt: Awesome
   :target: https://github.com/sindresorhus/awesome

.. |OK_ICON| image:: https://raw.githubusercontent.com/awesomedata/apd-core/master/deploy/ok-24.png
.. |FIXME_ICON| image:: https://raw.githubusercontent.com/awesomedata/apd-core/master/deploy/fixme-24.png

**NOTICE**: This repo is automatically generated by `apd-core <https://github.com/awesomedata/apd-core/tree/master/core>`_.
Please **DO NOT** modify this file directly. We have provided
`a new way <https://github.com/awesomedata/apd-core/blob/master/CONTRIBUTING.md>`_
to contribute to Awesome Public Datasets. `Join <https://join.slack.com/t/awesomedataworld/shared_invite/zt-dllew5xy-PJYi~mWUdY3hupohbmVZsA>`_ the `slack community <https://awesomedataworld.slack.com>`_ for further discussion.

* |OK_ICON| I am well.
* |FIXME_ICON| Please fix me.

This is a list of `topic-centric public data sources <https://github.com/awesomedata/awesome-public-datasets>`_
of high quality. They are collected and tidied from blogs, answers, and user responses.
Most of the data sets listed below are free; however, some are not.
Other amazingly awesome lists can be found in `sindresorhus's awesome <https://github.com/sindresorhus/awesome>`_ list.

.. contents:: **Table of Contents**

Agriculture
-----------

* |OK_ICON| `The global dataset of historical yields for major crops 1981–2016 - The [...] <https://doi.pangaea.de/10.1594/PANGAEA.909132>`_
* |OK_ICON| `Hyperspectral benchmark dataset on soil moisture - This dataset was [...] <https://doi.org/10.5281/zenodo.1227837>`_
* |OK_ICON| `Lemons quality control dataset - Lemon dataset has been prepared to [...] <https://github.com/softwaremill/lemon-dataset>`_
* |OK_ICON| `Optimized Soil Adjusted Vegetation Index - The IDB is a tool for working [...] <https://www.indexdatabase.de/db/i-single.php?id=63>`_
* |OK_ICON| `U.S. Department of Agriculture's Nutrient Database <https://www.ars.usda.gov/northeast-area/beltsville-md/beltsville-human-nutrition-research-center/nutrient-data-laboratory/docs/sr28-download-files/>`_
* |OK_ICON| `U.S. Department of Agriculture's PLANTS Database - The Complete PLANTS [...] <http://www.plants.usda.gov/dl_all.html>`_

Biology
-------

* |OK_ICON| `1000 Genomes - The 1000 Genomes Project ran between 2008 and 2015, [...] <http://www.1000genomes.org/data>`_
* |OK_ICON| `American Gut (Microbiome Project) - The American Gut project is the [...] <https://github.com/biocore/American-Gut>`_
* |OK_ICON| `Broad Bioimage Benchmark Collection (BBBC) - The Broad Bioimage Benchmark [...] <https://www.broadinstitute.org/bbbc>`_
* |OK_ICON| `Broad Cancer Cell Line Encyclopedia (CCLE) <http://www.broadinstitute.org/ccle/home>`_
* |OK_ICON| `Cell Image Library - This library is a public and easily accessible [...] <http://www.cellimagelibrary.org>`_
* |OK_ICON| `Complete Genomics Public Data - A diverse data set of whole human genomes [...] <http://www.completegenomics.com/public-data/69-genomes/>`_
* |OK_ICON| `EBI ArrayExpress - ArrayExpress Archive of Functional Genomics Data [...] <http://www.ebi.ac.uk/arrayexpress/>`_
* |OK_ICON| `EBI Protein Data Bank in Europe - The Electron Microscopy Data Bank [...] <http://www.ebi.ac.uk/pdbe/emdb/index.html/>`_
* |OK_ICON| `ENCODE project - The Encyclopedia of DNA Elements (ENCODE) Consortium is [...] <https://www.encodeproject.org>`_
* |OK_ICON| `Electron Microscopy Pilot Image Archive (EMPIAR) - EMPIAR, the Electron [...] <http://www.ebi.ac.uk/pdbe/emdb/empiar/>`_
* |OK_ICON| `Ensembl Genomes <https://ensemblgenomes.org/>`_
* |OK_ICON| `Gene Expression Omnibus (GEO) - GEO is a public functional genomics data [...] <http://www.ncbi.nlm.nih.gov/geo/>`_
* |OK_ICON| `Gene Ontology (GO) - GO annotation files <http://geneontology.org/docs/download-go-annotations/>`_
* |OK_ICON| `Global Biotic Interactions (GloBI) <https://github.com/jhpoelen/eol-globi-data/wiki#accessing-species-interaction-data>`_
* |OK_ICON| `Harvard Medical School (HMS) LINCS Project - The Harvard Medical School [...] <http://lincs.hms.harvard.edu>`_
* |OK_ICON| `Human Genome Diversity Project - A group of scientists at Stanford [...] <http://www.hagsc.org/hgdp/files.html>`_
* |OK_ICON| `Human Microbiome Project (HMP) - The HMP sequenced over 2000 reference [...] <http://www.hmpdacc.org/reference_genomes/reference_genomes.php>`_
* |OK_ICON| `ICOS PSP Benchmark - The ICOS PSP benchmarks repository contains an [...] <http://ico2s.org/datasets/psp_benchmark.html>`_
* |OK_ICON| `International HapMap Project <http://hapmap.ncbi.nlm.nih.gov/downloads/index.html.en>`_
* |FIXME_ICON| `Journal of Cell Biology DataViewer <https://rupress.org/jcb/pages/jcb-dataviewer>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Biology/Journal-of-Cell-Biology-DataViewer.yml>`_]
* |OK_ICON| `KEGG - KEGG is a database resource for understanding high-level functions [...] <http://www.genome.jp/kegg/>`_
* |OK_ICON| `MIT Cancer Genomics Data <http://www.broadinstitute.org/cgi-bin/cancer/datasets.cgi>`_
* |OK_ICON| `NCBI Proteins <http://www.ncbi.nlm.nih.gov/guide/proteins/#databases>`_
* |OK_ICON| `NCBI Taxonomy - The NCBI Taxonomy database is a curated set of names and [...] <http://www.ncbi.nlm.nih.gov/taxonomy>`_
* |OK_ICON| `NCI Genomic Data Commons - The GDC Data Portal is a robust data-driven [...] <https://gdc.cancer.gov/access-data/gdc-data-portal>`_
* |OK_ICON| `NIH Microarray data <ftp://ftp.ncbi.nih.gov/pub/geo/DATA/supplementary/series/GSE6532/>`_
* |OK_ICON| `OpenSNP genotypes data - openSNP allows customers of direct-to-customer [...] <https://opensnp.org/>`_
* |OK_ICON| `Palmer Penguins - The goal of palmerpenguins is to provide a great [...] <https://allisonhorst.github.io/palmerpenguins/>`_
* |OK_ICON| `Pathguide - Protein-Protein Interactions Catalog <http://www.pathguide.org/>`_
* |OK_ICON| `Protein Data Bank - This resource is powered by the Protein Data Bank [...] <http://www.rcsb.org/>`_
* |OK_ICON| `Psychiatric Genomics Consortium - The purpose of the Psychiatric Genomics [...] <https://www.med.unc.edu/pgc/downloads>`_
* |OK_ICON| `PubChem Project - PubChem is the world's largest collection of freely [...] <https://pubchem.ncbi.nlm.nih.gov/>`_
* |OK_ICON| `PubGene (now Coremine Medical) - COREMINE™ is a family of tools developed [...] <https://www.coremine.com/>`_
* |OK_ICON| `Sanger Catalogue of Somatic Mutations in Cancer (COSMIC) - COSMIC, the [...] <http://cancer.sanger.ac.uk/cosmic>`_
* |OK_ICON| `Sanger Genomics of Drug Sensitivity in Cancer Project (GDSC) <http://www.cancerrxgene.org/>`_
* |OK_ICON| `Sequence Read Archive (SRA) - The Sequence Read Archive (SRA) stores raw [...] <http://www.ncbi.nlm.nih.gov/Traces/sra/>`_
* |OK_ICON| `Stanford Microarray Data <http://smd.stanford.edu/>`_
* |OK_ICON| `Stowers Institute Original Data Repository <http://www.stowers.org/research/publications/odr>`_
* |OK_ICON| `Systems Science of Biological Dynamics (SSBD) Database - Systems Science [...] <http://ssbd.qbic.riken.jp>`_
* |OK_ICON| `The Cancer Genome Atlas (TCGA), available via Broad GDAC <https://gdac.broadinstitute.org/>`_
* |OK_ICON| `The Catalogue of Life - The Catalogue of Life is a quality-assured [...] <http://www.catalogueoflife.org/content/annual-checklist-archive>`_
* |OK_ICON| `The Personal Genome Project - The Personal Genome Project, initiated in [...] <http://www.personalgenomes.org/>`_
* |OK_ICON| `UCSC Public Data <http://hgdownload.soe.ucsc.edu/downloads.html>`_
* |OK_ICON| `UniGene <https://ftp.ncbi.nlm.nih.gov/repository/UniGene/>`_
* |OK_ICON| `Universal Protein Resource (UniProt) - The Universal Protein Resource [...] <http://www.uniprot.org/downloads>`_
* |OK_ICON| `Rfam - The Rfam database is a collection of RNA families, each [...] <https://docs.rfam.org/en/latest/database.html>`_

Climate+Weather
---------------

* |OK_ICON| `Actuaries Climate Index <http://actuariesclimateindex.org/data/>`_
* |OK_ICON| `Australian Weather <http://www.bom.gov.au/climate/dwo/>`_
* |OK_ICON| `Aviation Weather Center - Consistent, timely and accurate weather [...] <https://aviationweather.gov/adds/dataserver>`_
* |OK_ICON| `Brazilian Weather - Historical data (In Portuguese) - Data related to [...] <http://sinda.crn.inpe.br/PCD/SITE/novo/site/historico/index.php>`_
* |OK_ICON| `Canadian Meteorological Centre <http://weather.gc.ca/grib/index_e.html>`_
* |OK_ICON| `Climate Data from UEA (updated monthly) <http://www.cru.uea.ac.uk/data/>`_
* |OK_ICON| `Dutch Weather - The KNMI Data Center (KDC) portal provides access to KNMI [...] <https://data.knmi.nl/datasets>`_
* |OK_ICON| `European Climate Assessment & Dataset <https://www.ecad.eu/>`_
* |OK_ICON| `German Climate Data Center <https://cdc.dwd.de/portal/>`_
* |OK_ICON| `Global Climate Data Since 1929 <http://en.tutiempo.net/climate>`_
* |OK_ICON| `Charting The Global Climate Change News Narrative 2009-2020 - These four [...] <https://blog.gdeltproject.org/four-massive-datasets-charting-the-global-climate-change-news-narrative-2009-2020/>`_
* |OK_ICON| `NASA Global Imagery Browse Services <https://wiki.earthdata.nasa.gov/display/GIBS>`_
* |FIXME_ICON| `NOAA Bering Sea Climate <http://www.beringclimate.noaa.gov/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Climate+Weather/NOAA-Bering-Sea-Climate.yml>`_]
* |OK_ICON| `NOAA Climate Datasets <http://www.ncdc.noaa.gov/data-access/quick-links>`_
* |OK_ICON| `NOAA Realtime Weather Models <http://www.ncdc.noaa.gov/data-access/model-data/model-datasets/numerical-weather-prediction>`_
* |OK_ICON| `NOAA SURFRAD Meteorology and Radiation Datasets <https://www.esrl.noaa.gov/gmd/grad/stardata.html>`_
* |OK_ICON| `The World Bank Open Data Resources for Climate Change <http://data.worldbank.org/developers/climate-data-api>`_
* |OK_ICON| `UEA Climatic Research Unit <http://www.cru.uea.ac.uk/data>`_
* |OK_ICON| `WU Historical Weather Worldwide <https://www.wunderground.com/history/index.html>`_
* |OK_ICON| `Washington Post Climate Change - To analyze warming temperatures in the [...] <https://github.com/washingtonpost/data-2C-beyond-the-limit-usa>`_
* |OK_ICON| `WorldClim - Global Climate Data <http://www.worldclim.org>`_

ComplexNetworks
---------------

* |OK_ICON| `AMiner Citation Network Dataset <http://aminer.org/citation>`_
* |OK_ICON| `CrossRef DOI URLs <https://archive.org/details/doi-urls>`_
* |OK_ICON| `DBLP Citation dataset <https://kdl.cs.umass.edu/display/public/DBLP>`_
* |OK_ICON| `DIMACS Road Networks Collection <http://www.dis.uniroma1.it/challenge9/download.shtml>`_
* |OK_ICON| `NBER Patent Citations <http://nber.org/patents/>`_
* |OK_ICON| `NIST complex networks data collection <http://math.nist.gov/~RPozo/complex_datasets.html>`_
* |FIXME_ICON| `Network Repository with Interactive Exploratory Analysis Tools <http://networkrepository.com/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//ComplexNetworks/Network-Repository-with-Interactive-Exploratory-Analysis-Tools.yml>`_]
* |OK_ICON| `Protein-protein interaction network <http://vlado.fmf.uni-lj.si/pub/networks/data/bio/Yeast/Yeast.htm>`_
* |OK_ICON| `PyPI and Maven Dependency Network <https://ogirardot.wordpress.com/2013/01/31/sharing-pypimaven-dependency-data/>`_
* |OK_ICON| `Scopus Citation Database <https://www.elsevier.com/solutions/scopus>`_
* |OK_ICON| `Small Network Data <http://www-personal.umich.edu/~mejn/netdata/>`_
* |OK_ICON| `Stanford GraphBase <http://www3.cs.stonybrook.edu/~algorith/implement/graphbase/implement.shtml>`_
* |OK_ICON| `Stanford Large Network Dataset Collection <http://snap.stanford.edu/data/>`_
* |FIXME_ICON| `Stanford Longitudinal Network Data Sources <http://stanford.edu/group/sonia/dataSources/index.html>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//ComplexNetworks/Stanford-Longitudinal-Network-Data-Sources.yml>`_]
* |OK_ICON| `The Koblenz Network Collection <http://konect.uni-koblenz.de/>`_
* |OK_ICON| `The Laboratory for Web Algorithmics (UNIMI) <http://law.di.unimi.it/datasets.php>`_
* |OK_ICON| `UCI Network Data Repository <https://networkdata.ics.uci.edu/resources.php>`_
* |OK_ICON| `UFL sparse matrix collection <http://www.cise.ufl.edu/research/sparse/matrices/>`_
* |FIXME_ICON| `WSU Graph Database <http://www.eecs.wsu.edu/mgd/gdb.html>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//ComplexNetworks/WSU-Graph-Database.yml>`_]
* |OK_ICON| `Community Resource for Archiving Wireless Data At Dartmouth - Contains [...] <https://www.crawdad.org/>`_

ComputerNetworks
----------------

* |OK_ICON| `3.5B Web Pages from CommonCrawl 2012 <http://www.bigdatanews.com/profiles/blogs/big-data-set-3-5-billion-web-pages-made-available-for-all-of-us>`_
* |OK_ICON| `53.5B Web clicks of 100K users in Indiana Univ. <http://cnets.indiana.edu/groups/nan/webtraffic/click-dataset/>`_
* |OK_ICON| `CAIDA Internet Datasets <http://www.caida.org/data/overview/>`_
* |FIXME_ICON| `CRAWDAD Wireless datasets from Dartmouth Univ. <https://crawdad.cs.dartmouth.edu/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//ComputerNetworks/CRAWDAD-Wireless-datasets-from-Dartmouth-Univ..yml>`_]
* |OK_ICON| `ClueWeb09 - 1B web pages <http://lemurproject.org/clueweb09/>`_
* |OK_ICON| `ClueWeb12 - 733M web pages <http://lemurproject.org/clueweb12/>`_
* |OK_ICON| `CommonCrawl Web Data over 7 years <http://commoncrawl.org/the-data/get-started/>`_
* |OK_ICON| `Criteo click-through data <http://labs.criteo.com/2015/03/criteo-releases-its-new-dataset/>`_
* |OK_ICON| `Internet-Wide Scan Data Repository <https://scans.io/>`_
* |OK_ICON| `MIRAGE-2019 - MIRAGE-2019 is a human-generated dataset for mobile traffic [...] <http://traffic.comics.unina.it/mirage/>`_
* |OK_ICON| `OONI: Open Observatory of Network Interference - Internet censorship data <https://ooni.torproject.org/data/>`_
* |OK_ICON| `Open Mobile Data by MobiPerf <https://console.developers.google.com/storage/openmobiledata_public/>`_
* |OK_ICON| `The Peer-to-Peer Trace Archive - Real-world measurements play a key role [...] <http://p2pta.ewi.tudelft.nl/>`_
* |OK_ICON| `Rapid7 Sonar Internet Scans <https://sonar.labs.rapid7.com/>`_
* |OK_ICON| `UCSD Network Telescope, IPv4 /8 net <http://www.caida.org/projects/network_telescope/>`_

CyberSecurity
-------------

* |OK_ICON| `CCCS-CIC-AndMal-2020 - The dataset includes 200K benign and 200K malware [...] <https://www.unb.ca/cic/datasets/andmal2020.html>`_
* |OK_ICON| `Traffic and Log Data Captured During a Cyber Defense Exercise - This [...] <https://zenodo.org/record/3746129>`_

DataChallenges
--------------

* |OK_ICON| `Bruteforce Database <https://github.com/duyetdev/bruteforce-database>`_
* |OK_ICON| `Challenges in Machine Learning <http://www.chalearn.org/>`_
* |FIXME_ICON| `CrowdANALYTIX dataX <http://data.crowdanalytix.com>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//DataChallenges/CrowdANALYTIX-dataX.yml>`_]
* |FIXME_ICON| `D4D Challenge of Orange <http://www.d4d.orange.com/en/home>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//DataChallenges/D4D-Challenge-of-Orange.yml>`_]
* |OK_ICON| `DrivenData Competitions for Social Good <http://www.drivendata.org/>`_
* |OK_ICON| `ICWSM Data Challenge (since 2009) <https://www.icwsm.org/2018/datasets/datasets/#obtaining>`_
* |OK_ICON| `KDD Cup by Tencent 2012 <http://www.kddcup2012.org/>`_
* |OK_ICON| `Kaggle Competition Data <https://www.kaggle.com/>`_
* |OK_ICON| `Localytics Data Visualization Challenge <https://github.com/localytics/data-viz-challenge>`_
* |OK_ICON| `Netflix Prize <http://netflixprize.com/leaderboard.html>`_
* |OK_ICON| `Space Apps Challenge <https://2015.spaceappschallenge.org>`_
* |FIXME_ICON| `Telecom Italia Big Data Challenge <https://dandelion.eu/datamine/open-big-data/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//DataChallenges/Telecom-Italia-Big-Data-Challenge.yml>`_]
* |OK_ICON| `TravisTorrent Dataset - MSR'2017 Mining Challenge <https://travistorrent.testroots.org/>`_
* |FIXME_ICON| `TunedIT - Data mining & machine learning data sets, algorithms, challenges <http://tunedit.org/challenges/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//DataChallenges/TunedIT.yml>`_]
* |FIXME_ICON| `Yelp Dataset Challenge <http://www.yelp.com/dataset_challenge>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//DataChallenges/Yelp-Dataset-Challenge.yml>`_]

EarthScience
------------

* |OK_ICON| `38-Cloud (Cloud Detection) - Contains 38 Landsat 8 scene images and their [...] <https://github.com/SorourMo/38-Cloud-A-Cloud-Segmentation-Dataset>`_
* |OK_ICON| `AQUASTAT - Global water resources and uses <http://www.fao.org/nr/water/aquastat/data/query/index.html?lang=en>`_
* |OK_ICON| `BODC - marine data of ~22K vars <https://www.bodc.ac.uk/data/>`_
* |OK_ICON| `EOSDIS - NASA's earth observing system data <http://sedac.ciesin.columbia.edu/data/sets/browse>`_
* |FIXME_ICON| `Earth Models <https://earthmodels.org/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//EarthScience/Earth-Models.yml>`_]
* |OK_ICON| `Global Wind Atlas - The Global Wind Atlas is a free, web-based [...] <https://globalwindatlas.info/>`_
* |OK_ICON| `Integrated Marine Observing System (IMOS) - roughly 30TB of ocean measurements <https://imos.aodn.org.au>`_
* |OK_ICON| `Marinexplore - Open Oceanographic Data <http://marinexplore.org/>`_
* |OK_ICON| `Alabama Real-Time Coastal Observing System <http://mymobilebay.com>`_
* |OK_ICON| `National Estuarine Research Reserves System-Wide Monitoring Program - [...] <http://nerrsdata.org>`_
* |OK_ICON| `Oil and Gas Authority Open Data - The dataset covers 12,500 offshore [...] <https://data-ogauthority.opendata.arcgis.com/>`_
* |OK_ICON| `Smithsonian Institution Global Volcano and Eruption Database <http://volcano.si.edu/>`_
* |OK_ICON| `USGS Earthquake Archives <http://earthquake.usgs.gov/earthquakes/search/>`_

Economics
---------

* |OK_ICON| `American Economic Association (AEA) <https://www.aeaweb.org/resources/data>`_
* |OK_ICON| `EconData from UMD <http://inforumweb.umd.edu/econdata/econdata.html>`_
* |OK_ICON| `Economic Freedom of the World Data <http://www.freetheworld.com/datasets_efw.html>`_
* |OK_ICON| `Historical MacroEconomic Statistics <http://www.historicalstatistics.org/>`_
* |OK_ICON| `INFORUM - Interindustry Forecasting at the University of Maryland <http://inforumweb.umd.edu/>`_
* |OK_ICON| `DBnomics – the world's economic database - Aggregates hundreds of [...] <https://db.nomics.world/>`_
* |OK_ICON| `International Trade Statistics <http://www.econostatistics.co.za/>`_
* |OK_ICON| `Internet Product Code Database <http://www.upcdatabase.com/>`_
* |OK_ICON| `Joint External Debt Data Hub <http://www.jedh.org/>`_
* |OK_ICON| `Jon Haveman International Trade Data Links <http://www.macalester.edu/research/economics/PAGE/HAVEMAN/Trade.Resources/TradeData.html>`_
* |OK_ICON| `Long-Term Productivity Database - The Long-Term Productivity database was [...] <http://longtermproductivity.com/download.html>`_
* |OK_ICON| `OpenCorporates Database of Companies in the World <https://opencorporates.com/>`_
* |OK_ICON| `Our World in Data <http://ourworldindata.org/>`_
* |FIXME_ICON| `SciencesPo World Trade Gravity Datasets <http://econ.sciences-po.fr/thierry-mayer/data>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Economics/SciencesPo-World-Trade-Gravity-Datasets.yml>`_]
* |OK_ICON| `The Atlas of Economic Complexity <http://atlas.cid.harvard.edu>`_
* |OK_ICON| `The Center for International Data <http://cid.econ.ucdavis.edu>`_
* |FIXME_ICON| `The Observatory of Economic Complexity <http://atlas.media.mit.edu/en/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Economics/The-Observatory-of-Economic-Complexity.yml>`_]
* |OK_ICON| `UN Commodity Trade Statistics <https://comtrade.un.org/data/>`_
* |OK_ICON| `UN Human Development Reports <http://hdr.undp.org/en>`_

Education
---------

* |OK_ICON| `College Scorecard Data <https://collegescorecard.ed.gov/data/>`_
* |OK_ICON| `New York State Education Department Data - The New York State Education [...] <https://data.nysed.gov/downloads.php>`_
* |OK_ICON| `Student Data from Free Code Camp <https://github.com/freeCodeCamp/open-data>`_

Energy
------

* |OK_ICON| `AMPds - The Almanac of Minutely Power dataset <http://ampds.org/>`_
* |OK_ICON| `BLUEd - Building-Level fUlly labeled Electricity Disaggregation dataset <https://energy.duke.edu/content/building-level-fully-labeled-electricity-disaggregation-blued>`_
* |OK_ICON| `COMBED <http://combed.github.io/>`_
* |OK_ICON| `DBFC - Direct Borohydride Fuel Cell (DBFC) Dataset <https://github.com/ECSIM/dbfc-dataset>`_
* |OK_ICON| `DEL - Domestic Electrical Load study datasets for South Africa (1994 - 2014) <https://www.datafirst.uct.ac.za/dataportal/index.php/catalog/DELS>`_
* |OK_ICON| `ECO - The ECO data set is a comprehensive data set for non-intrusive load [...] <http://www.vs.inf.ethz.ch/res/show.html?what=eco-data>`_
* |OK_ICON| `EIA <http://www.eia.gov/electricity/data/eia923/>`_
* |OK_ICON| `Global Power Plant Database - The Global Power Plant Database is a [...] <http://datasets.wri.org/dataset/globalpowerplantdatabase>`_
* |OK_ICON| `HES - Household Electricity Study, UK <http://randd.defra.gov.uk/Default.aspx?Menu=Menu&Module=More&Location=None&ProjectID=17359&FromSearch=Y&Publisher=1&SearchText=EV0702&SortString=ProjectCode&SortOrder=Asc&Paging=10#Description>`_
* |OK_ICON| `HFED <http://hfed.github.io/>`_
* |OK_ICON| `PEM1 - Proton Exchange Membrane (PEM) Fuel Cell Dataset <https://github.com/ECSIM/pem-dataset1>`_
* |FIXME_ICON| `PLAID - The Plug Load Appliance Identification Dataset <http://plaidplug.com/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Energy/PLAID.yml>`_]
* |OK_ICON| `The Public Utility Data Liberation Project (PUDL) - PUDL makes US energy [...] <https://github.com/catalyst-cooperative/pudl>`_
* |OK_ICON| `REDD <http://redd.csail.mit.edu/>`_
* |OK_ICON| `SYND - A synthetic energy dataset for non-intrusive load monitoring - [...] <https://www.nature.com/articles/s41597-020-0434-6>`_
* |OK_ICON| `Smart Meter Data Portal - The Smart Meter Data Portal is part of the [...] <https://smda.github.io/smart-meter-data-portal>`_
* |OK_ICON| `Tracebase <https://github.com/areinhardt/tracebase>`_
* |OK_ICON| `Ukraine Energy Centre Datasets <https://ukrstat.org/en/operativ/menu/menu_e/energ.htm>`_
* |OK_ICON| `UK-DALE - UK Domestic Appliance-Level Electricity <https://jack-kelly.com/data>`_
* |OK_ICON| `WHITED <http://nilmworkshop.org/2016/proceedings/Poster_ID18.pdf>`_
* |OK_ICON| `iAWE <http://iawe.github.io/>`_

Entertainment
-------------
* |OK_ICON| `Top Streamers on Twitch - This contains data of Top 1000 Streamers from [...] <https://www.kaggle.com/aayushmishra1512/twitchdata> `_
2014-11-21 17:10:09 +08:00
Finance
2018-01-15 01:06:25 +08:00
-------
2018-01-15 01:04:07 +08:00
2020-08-14 07:29:34 +08:00
* |OK_ICON| `BIS Statistics - BIS statistics, compiled in cooperation with central [...] <https://www.bis.org/statistics/full_data_sets.htm>`_

* |OK_ICON| `Blockmodo Coin Registry - A registry of JSON formatted information files [...] <https://github.com/Blockmodo/coin_registry>`_

* |OK_ICON| `CBOE Futures Exchange <http://cfe.cboe.com/market-data/>`_

* |OK_ICON| `Complete FAANG Stock data - This data set contains all the stock data of [...] <https://www.kaggle.com/aayushmishra1512/faang-complete-stock-data>`_

* |OK_ICON| `Google Finance <https://www.google.com/finance>`_

* |OK_ICON| `Google Trends <http://www.google.com/trends?q=google&ctab=0&geo=all&date=all&sort=0>`_

* |FIXME_ICON| `NASDAQ <https://data.nasdaq.com/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Finance/NASDAQ.yml>`_]

* |OK_ICON| `NYSE Market Data <ftp://ftp.nyxdata.com/>`_

* |OK_ICON| `OANDA <http://www.oanda.com/>`_

* |FIXME_ICON| `OSU Financial data <http://fisher.osu.edu/fin/fdf/osudata.htm>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Finance/OSU-Financial-data.yml>`_]

* |OK_ICON| `Quandl <https://www.quandl.com/>`_

* |OK_ICON| `St Louis Federal <https://research.stlouisfed.org/fred2/>`_

* |OK_ICON| `Yahoo Finance <http://finance.yahoo.com/>`_

GIS
---
* |OK_ICON| `Awesome 3D Semantic City Models - Collection of open 3D semantic city and [...] <https://github.com/OloOcki/awesome-citygml>`_

* |OK_ICON| `ArcGIS Open Data portal <http://opendata.arcgis.com/>`_

* |OK_ICON| `Cambridge, MA, US, GIS data on GitHub <http://cambridgegis.github.io/gisdata.html>`_

* |OK_ICON| `Database of all continents, countries, States/Subdivisions/Provinces and [...] <https://www.back4app.com/database/back4app/list-of-all-continents-countries-cities>`_

* |OK_ICON| `Factual Global Location Data <https://places.factual.com/data/t/places>`_

* |OK_ICON| `IEEE Geoscience and Remote Sensing Society DASE Website <http://dase.grss-ieee.org>`_

* |OK_ICON| `Geo Maps - High Quality GeoJSON maps programmatically generated <https://github.com/simonepri/geo-maps>`_

* |OK_ICON| `Geo Spatial Data from ASU <http://geodacenter.asu.edu/datalist/>`_

* |OK_ICON| `Geo Wiki Project - Citizen-driven Environmental Monitoring <http://geo-wiki.org/>`_

* |OK_ICON| `GeoFabrik - OSM data extracted to a variety of formats and areas <http://download.geofabrik.de/>`_

* |OK_ICON| `GeoNames Worldwide <http://www.geonames.org/>`_

* |OK_ICON| `Global Administrative Areas Database (GADM) - Geospatial data organized [...] <https://gadm.org/>`_

* |OK_ICON| `Homeland Infrastructure Foundation-Level Data <https://hifld-geoplatform.opendata.arcgis.com/>`_

* |OK_ICON| `Landsat 8 on AWS <https://aws.amazon.com/public-data-sets/landsat/>`_

* |OK_ICON| `List of all countries in all languages <https://github.com/umpirsky/country-list>`_
* |OK_ICON| `National Weather Service GIS Data Portal <http://www.nws.noaa.gov/gis/>`_

* |FIXME_ICON| `Natural Earth - vectors and rasters of the world <https://www.naturalearthdata.com/downloads/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//GIS/Natural-Earth.yml>`_]

* |OK_ICON| `OpenAddresses <http://openaddresses.io/>`_

* |OK_ICON| `OpenStreetMap (OSM) <http://wiki.openstreetmap.org/wiki/Downloading_data>`_

* |OK_ICON| `Pleiades - Gazetteer and graph of ancient places <http://pleiades.stoa.org/>`_

* |OK_ICON| `Reverse Geocoder using OSM data <https://github.com/kno10/reversegeocode>`_

* |OK_ICON| `Robin Wilson - Free GIS Datasets <http://freegisdata.rtwilson.com>`_

* |OK_ICON| `TIGER/Line - U.S. boundaries and roads <https://www.census.gov/geo/maps-data/data/tiger-line.html>`_

* |OK_ICON| `TZ Timezones shapefile <http://efele.net/maps/tz/world/>`_

* |OK_ICON| `TwoFishes - Foursquare's coarse geocoder <https://github.com/foursquare/twofishes>`_

* |OK_ICON| `UN Environmental Data <http://geodata.grid.unep.ch/>`_

* |OK_ICON| `World boundaries from the U.S. Department of State <http://geonode.state.gov/layers/?limit=100&offset=0>`_

* |OK_ICON| `World countries in multiple formats <https://github.com/mledoze/countries>`_

Government
----------
* |OK_ICON| `Alberta, Province of Canada <http://open.alberta.ca>`_

* |OK_ICON| `Antwerp, Belgium <http://opendata.antwerpen.be/datasets>`_

* |FIXME_ICON| `Argentina (non official) <http://datar.noip.me/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Government/Argentina-non-official.yml>`_]

* |OK_ICON| `Datos Argentina - Portal de datos abiertos de la República Argentina. [...] <http://datos.gob.ar/>`_

* |OK_ICON| `Austin, TX, US <https://data.austintexas.gov/>`_

* |OK_ICON| `Australia (abs.gov.au) <http://www.abs.gov.au/AUSSTATS/abs@.nsf/DetailsPage/3301.02009?OpenDocument>`_

* |OK_ICON| `Australia (data.gov.au) <https://data.gov.au/>`_

* |OK_ICON| `Austria (data.gv.at) <https://www.data.gv.at/>`_

* |OK_ICON| `Baton Rouge, LA, US <https://data.brla.gov/>`_

* |OK_ICON| `Beersheba, Israel - Open Data Portal (Smart7 OpenData) <https://www.beer-sheva.muni.il/OpenData/Pages/default.aspx>`_

* |OK_ICON| `Belgium <http://data.gov.be/>`_

* |OK_ICON| `City of Berkeley Open Data <https://data.cityofberkeley.info/>`_

* |OK_ICON| `Brazil <http://dados.gov.br/dataset>`_

* |OK_ICON| `Buenos Aires, Argentina <http://data.buenosaires.gob.ar/>`_
* |OK_ICON| `Calgary, AB, Canada <https://data.calgary.ca/>`_

* |OK_ICON| `Cambridge, MA, US <https://data.cambridgema.gov/>`_

* |OK_ICON| `Canada <http://open.canada.ca/>`_

* |OK_ICON| `Chicago <https://data.cityofchicago.org/>`_

* |OK_ICON| `Chile <http://datos.gob.cl/dataset>`_

* |FIXME_ICON| `China <http://data.stats.gov.cn/english/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Government/China>`_]

* |OK_ICON| `Dallas Open Data <https://www.dallasopendata.com/>`_

* |FIXME_ICON| `DataBC - data from the Province of British Columbia <http://www.data.gov.bc.ca/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Government/DataBC.yml>`_]

* |OK_ICON| `Debt to the Penny - The Debt to the Penny dataset provides information [...] <https://fiscaldata.treasury.gov/datasets/debt-to-the-penny/debt-to-the-penny>`_

* |OK_ICON| `Denver Open Data <http://data.denvergov.org//>`_

* |OK_ICON| `Durham, NC Open Data <https://live-durhamnc.opendata.arcgis.com/>`_

* |OK_ICON| `Edmonton, AB, Canada <https://data.edmonton.ca/>`_

* |OK_ICON| `England LGInform <http://lginform.local.gov.uk/>`_

* |OK_ICON| `EuroStat <http://ec.europa.eu/eurostat/data/database>`_
* |OK_ICON| `EveryPolitician - Ongoing project collating and sharing data on every [...] <http://everypolitician.org/>`_

* |OK_ICON| `Federal Committee on Statistical Methodology (FCSM) (formerly FedStats) <https://nces.ed.gov/FCSM/index.asp>`_

* |OK_ICON| `Finland <https://www.opendata.fi/en>`_

* |OK_ICON| `France <https://www.data.gouv.fr/en/datasets/>`_

* |OK_ICON| `Fredericton, NB, Canada <http://www.fredericton.ca/en/citygovernment/Catalogue.asp>`_

* |OK_ICON| `Gatineau, QC, Canada <http://www.gatineau.ca/donneesouvertes/default_fr.aspx>`_

* |OK_ICON| `Germany <https://www-genesis.destatis.de/genesis/online>`_

* |OK_ICON| `Ghent, Belgium <https://data.stad.gent/explore>`_

* |FIXME_ICON| `Glasgow, Scotland, UK <https://data.glasgow.gov.uk/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Government/Glasgow-Scotland-UK.yml>`_]

* |OK_ICON| `Greece <http://www.data.gov.gr/>`_

* |OK_ICON| `Guardian world governments <http://www.guardian.co.uk/world-government-data>`_

* |OK_ICON| `Halifax, NS, Canada <https://www.halifax.ca/home/open-data>`_

* |OK_ICON| `Helsinki Region, Finland <http://www.hri.fi/en/>`_
* |OK_ICON| `Hong Kong, China <https://data.gov.hk/en/>`_

* |FIXME_ICON| `Houston, TX, US <http://data.houstontx.gov/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Government/Houston-TX-US.yml>`_]

* |OK_ICON| `Indian Government Data <https://data.gov.in/>`_

* |OK_ICON| `Indonesian Data Portal <http://data.go.id/>`_

* |OK_ICON| `Iowa - Welcome to the State of Iowa's data portal. Please explore data [...] <https://data.iowa.gov/>`_

* |OK_ICON| `Ireland's Open Data Portal <https://data.gov.ie/data>`_

* |OK_ICON| `Israel's Open Data Portal <https://data.gov.il>`_

* |OK_ICON| `Istanbul Municipality Open Data Portal <https://data.ibb.gov.tr>`_

* |OK_ICON| `Italy - Il Portale dati.gov.it è il catalogo nazionale dei metadati [...] <https://www.dati.gov.it/>`_

* |OK_ICON| `Jail deaths in America - The U.S. government does not release jail by [...] <https://www.reuters.com/investigates/special-report/usa-jails-graphic/>`_

* |OK_ICON| `Japan <http://www.e-stat.go.jp/SG1/estat/eStatTopPortalE.do>`_

* |OK_ICON| `Laval, QC, Canada <http://www.laval.ca/Pages/Fr/Citoyens/donnees.aspx>`_

* |OK_ICON| `Lexington, KY <http://data.lexingtonky.gov/>`_

* |OK_ICON| `London Datastore, UK <http://data.london.gov.uk/dataset>`_
* |FIXME_ICON| `London, ON, Canada <http://www.london.ca/city-hall/open-data/Pages/default.aspx>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Government/London-ON-Canada.yml>`_]

* |OK_ICON| `Los Angeles Open Data <https://data.lacity.org/>`_

* |OK_ICON| `Luxembourg - Luxembourgish Open Data Portal <https://data.public.lu/en/>`_

* |OK_ICON| `MassGIS, Massachusetts, U.S. <http://www.mass.gov/anf/research-and-tech/it-serv-and-support/application-serv/office-of-geographic-information-massgis/>`_

* |OK_ICON| `Metropolitan Transportation Commission (MTC), California, US <http://mtc.ca.gov/tools-resources/data-tools/open-data-library>`_

* |OK_ICON| `Mexico <https://datos.gob.mx/busca/dataset>`_

* |OK_ICON| `Mississauga, ON, Canada <http://www.mississauga.ca/portal/residents/publicationsopendatacatalogue>`_

* |OK_ICON| `Moldova <http://data.gov.md/>`_

* |OK_ICON| `Moncton, NB, Canada <http://open.moncton.ca/>`_

* |OK_ICON| `Montreal, QC, Canada <http://donnees.ville.montreal.qc.ca/>`_

* |OK_ICON| `Mountain View, California, US (GIS) <http://data-mountainview.opendata.arcgis.com/>`_

* |FIXME_ICON| `NYC Open Data <https://opendata.cityofnewyork.us/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Government/NYC-Open-Data.yml>`_]

* |OK_ICON| `NYC betanyc <http://betanyc.us/>`_

* |OK_ICON| `Netherlands <https://data.overheid.nl/>`_
* |OK_ICON| `New York Department of Sanitation Monthly Tonnage - DSNY Monthly Tonnage [...] <https://data.cityofnewyork.us/City-Government/DSNY-Monthly-Tonnage-Data/ebb7-mvp5>`_

* |OK_ICON| `New Zealand <http://www.stats.govt.nz/browse_for_stats.aspx>`_

* |OK_ICON| `OECD <https://data.oecd.org/>`_

* |FIXME_ICON| `Oakland, California, US <https://data.oaklandnet.com/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Government/Oakland-California-US.yml>`_]

* |OK_ICON| `Oklahoma <https://data.ok.gov/>`_

* |OK_ICON| `Open Data for Africa <http://opendataforafrica.org/>`_

* |OK_ICON| `Open Government Data (OGD) Platform India <https://data.gov.in/>`_
* |OK_ICON| `OpenDataSoft's list of 1,600 open data portals <https://www.opendatasoft.com/blog/2015/11/02/how-we-put-together-a-list-of-1600-open-data-portals-around-the-world-to-help-open-data-community>`_
* |OK_ICON| `Oregon <https://data.oregon.gov/>`_

* |OK_ICON| `Ottawa, ON, Canada <http://data.ottawa.ca/en/>`_

* |OK_ICON| `Palo Alto, California, US <http://data.cityofpaloalto.org/home>`_

* |OK_ICON| `OpenDataPhilly - OpenDataPhilly is a catalog of open data in the [...] <https://www.opendataphilly.org/>`_

* |OK_ICON| `Portland, Oregon <https://www.portlandoregon.gov/28130>`_

* |OK_ICON| `Portugal - Pordata organization <http://www.pordata.pt/en/Home>`_

* |OK_ICON| `Puerto Rico Government <https://data.pr.gov//>`_

* |FIXME_ICON| `Quebec City, QC, Canada <http://donnees.ville.quebec.qc.ca/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Government/Quebec-City-QC-Canada.yml>`_]
* |OK_ICON| `Quebec, Province of Canada <https://www.donneesquebec.ca/en/>`_

* |OK_ICON| `Regina, SK, Canada <http://open.regina.ca/>`_
* |OK_ICON| `Rio de Janeiro, Brazil <http://www.data.rio/>`_

* |OK_ICON| `Romania <http://data.gov.ro/>`_

* |OK_ICON| `Russia <http://data.gov.ru>`_

* |OK_ICON| `San Diego, CA <https://data.sandiego.gov>`_

* |FIXME_ICON| `San Antonio, TX - Community Information Now - CI:Now is a nonprofit [...] <http://cinow.info/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Government/San-Antonio-TX-US-Community-Information-Now.yml>`_]

* |OK_ICON| `San Francisco Data sets <http://datasf.org/>`_

* |OK_ICON| `San Jose, California, US <http://data.sanjoseca.gov/>`_

* |OK_ICON| `San Mateo County, California, US <https://data.smcgov.org/>`_

* |OK_ICON| `Saskatchewan, Province of Canada <http://opendatask.ca/data/>`_

* |OK_ICON| `Seattle <https://data.seattle.gov/>`_

* |OK_ICON| `Singapore Government Data <https://data.gov.sg/>`_

* |OK_ICON| `South Africa Trade Statistics <http://www.econostatistics.co.za/>`_

* |OK_ICON| `South Africa <http://www.statssa.gov.za/>`_

* |OK_ICON| `State of Utah, US <https://opendata.utah.gov/>`_

* |OK_ICON| `Switzerland <http://www.opendata.admin.ch/>`_
* |OK_ICON| `Taiwan gov <https://data.gov.tw/>`_

* |OK_ICON| `Taiwan <http://data.gov.tw/>`_

* |OK_ICON| `Tel-Aviv Open Data <https://opendata.tel-aviv.gov.il/en/Pages/home.aspx>`_

* |OK_ICON| `Texas Open Data <https://data.texas.gov/>`_

* |FIXME_ICON| `The World Bank <https://openknowledge.worldbank.org/handle/10986/2124>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Government/The-World-Bank.yml>`_]

* |FIXME_ICON| `Toronto, ON, Canada <https://portal0.cf.opendata.inter.sandbox-toronto.ca/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Government/Toronto-ON-Canada.yml>`_]

* |OK_ICON| `Tunisia <http://www.data.gov.tn/>`_

* |OK_ICON| `U.K. Government Data <https://data.gov.uk>`_

* |OK_ICON| `U.S. American Community Survey <https://www.census.gov/programs-surveys/acs/>`_

* |OK_ICON| `U.S. CDC Public Health datasets <https://www.cdc.gov/nchs/data_access/ftp_data.htm>`_

* |OK_ICON| `U.S. Census Bureau <http://www.census.gov/data.html>`_

* |OK_ICON| `U.S. Department of Housing and Urban Development (HUD) <http://www.huduser.gov/portal/datasets/pdrdatas.html>`_

* |OK_ICON| `U.S. Federal Government Agencies <http://www.data.gov/metrics>`_

* |OK_ICON| `U.S. Federal Government Data Catalog <http://catalog.data.gov/dataset>`_
* |OK_ICON| `U.S. Food and Drug Administration (FDA) <https://open.fda.gov/index.html>`_

* |OK_ICON| `U.S. National Center for Education Statistics (NCES) <http://nces.ed.gov/>`_

* |OK_ICON| `U.S. Open Government <http://www.data.gov/open-gov/>`_

* |OK_ICON| `UK 2011 Census Open Atlas Project <https://data.cdrc.ac.uk/product/cdrc-2011-census-open-atlas>`_

* |OK_ICON| `US Counties - This is a repository of various data, broken down by US [...] <https://github.com/evangambit/JsonOfCounties>`_

* |OK_ICON| `U.S. Patent and Trademark Office (USPTO) Bulk Data Products <https://www.uspto.gov/learning-and-resources/bulk-data-products>`_

* |FIXME_ICON| `Uganda Bureau of Statistics <http://www.ubos.org/unda/index.php/catalog>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Government/Uganda-Bureau-of-Statistics.yml>`_]

* |OK_ICON| `Ukraine <https://data.gov.ua/>`_

* |OK_ICON| `United Nations <http://data.un.org/>`_

* |OK_ICON| `Uruguay <https://catalogodatos.gub.uy/>`_

* |OK_ICON| `Valley Transportation Authority (VTA), California, US <https://data.vta.org/>`_

* |FIXME_ICON| `Vancouver, BC Open Data Catalog <http://data.vancouver.ca/datacatalogue/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Government/Vancouver-BC-Open-Data-Catalog.yml>`_]

* |OK_ICON| `Victoria, BC, Canada <http://opendata.victoria.ca/>`_

* |OK_ICON| `Vienna, Austria <https://open.wien.gv.at/site/open-data/>`_

* |FIXME_ICON| `Statistics from the General Statistics Office of Vietnam - Data in [...] <https://www.gso.gov.vn/Default_en.aspx?tabid=491>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Government/Vietnam.yml>`_]

* |OK_ICON| `U.S. Congressional Research Service (CRS) Reports <https://www.everycrsreport.com/>`_

Healthcare
----------
* |OK_ICON| `AWS COVID-19 Datasets - We're working with organizations who make [...] <https://dj2taa9i652rf.cloudfront.net/>`_

* |OK_ICON| `COVID-19 Case Surveillance Public Use Data - The COVID-19 case [...] <https://data.cdc.gov/Case-Surveillance/COVID-19-Case-Surveillance-Public-Use-Data/vbim-akqf>`_

* |OK_ICON| `2019 Novel Coronavirus COVID-19 Data Repository by Johns Hopkins CSSE - [...] <https://github.com/CSSEGISandData/COVID-19>`_

* |OK_ICON| `Coronavirus (Covid-19) Data in the United States - The New York Times is [...] <https://github.com/nytimes/covid-19-data>`_

* |OK_ICON| `COVID-19 Reported Patient Impact and Hospital Capacity by Facility - The [...] <https://healthdata.gov/dataset/covid-19-reported-patient-impact-and-hospital-capacity-facility?SorourMo/38-Cloud-A-Cloud-Segmentation-Dataset>`_

* |OK_ICON| `Composition of Foods Raw, Processed, Prepared USDA National Nutrient Database for Standard [...] <https://data.nal.usda.gov/dataset/composition-foods-raw-processed-prepared-usda-national-nutrient-database-standard-reference-release-27>`_

* |OK_ICON| `The COVID Tracking Project - The COVID Tracking Project collects and [...] <https://covidtracking.com/data>`_

* |OK_ICON| `EHDP Large Health Data Sets <http://www.ehdp.com/vitalnet/datasets.htm>`_

* |OK_ICON| `GDC - GDC supports several cancer genome programs for CCG, TCGA, TARGET etc. <https://gdc.cancer.gov/>`_

* |OK_ICON| `Gapminder World demographic databases <http://www.gapminder.org/data/>`_

* |OK_ICON| `MeSH, the vocabulary thesaurus used for indexing articles for PubMed <https://www.nlm.nih.gov/mesh/filelist.html>`_

* |OK_ICON| `MeDAL - A large medical text dataset curated for abbreviation [...] <https://github.com/BruceWen120/medal>`_

* |OK_ICON| `Medicare Coverage Database (MCD), U.S. <https://www.cms.gov/medicare-coverage-database/>`_

* |OK_ICON| `Medicare Data Engine of medicare.gov Data <https://data.medicare.gov/>`_
* |OK_ICON| `Medicare Data File <http://go.cms.gov/19xxPN4>`_

* |OK_ICON| `Number of Ebola Cases and Deaths in Affected Countries (2014) <https://data.humdata.org/dataset/ebola-cases-2014>`_

* |OK_ICON| `Open-ODS (structure of the UK NHS) <http://www.openods.co.uk>`_

* |OK_ICON| `OpenPaymentsData, Healthcare financial relationship data <https://openpaymentsdata.cms.gov>`_

* |OK_ICON| `PhysioBank Databases - A large and growing archive of physiological data. <https://www.physionet.org/physiobank/database/>`_

* |OK_ICON| `The Cancer Imaging Archive (TCIA) <https://www.cancerimagingarchive.net>`_

* |OK_ICON| `The Cancer Genome Atlas project (TCGA) <https://portal.gdc.cancer.gov/>`_

* |OK_ICON| `World Health Organization Global Health Observatory <http://www.who.int/gho/en/>`_

* |OK_ICON| `Yahoo Knowledge Graph COVID-19 Datasets - The Yahoo Knowledge Graph team [...] <https://github.com/yahoo/covid-19-data>`_

* |FIXME_ICON| `Informatics for Integrating Biology & the Bedside <https://www.i2b2.org/NLP/DataSets/Main.php>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Healthcare/i2b2.yml>`_]

ImageProcessing
---------------
* |OK_ICON| `10k US Adult Faces Database <http://wilmabainbridge.com/facememorability2.html>`_

* |OK_ICON| `2GB of Photos of Cats <https://www.kaggle.com/crawford/cat-dataset/version/2>`_
* |OK_ICON| `Adience Unfiltered faces for gender and age classification <http://www.openu.ac.il/home/hassner/Adience/data.html>`_
* |OK_ICON| `Affective Image Classification <http://www.imageemotion.org/> `_
* |OK_ICON| `Animals with attributes <http://attributes.kyb.tuebingen.mpg.de/> `_
* |OK_ICON| `CADDY Underwater Stereo-Vision Dataset of divers' hand gestures - [...] <http://caddy-underwater-datasets.ge.issia.cnr.it/> `_
* |OK_ICON| `Cytology Dataset – CCAgT: Images of Cervical Cells with AgNOR Stain [...] <https://arquivos.ufsc.br/d/373be2177a33426a9e6c/> `_
* |OK_ICON| `Caltech Pedestrian Detection Benchmark <http://www.vision.caltech.edu/Image_Datasets/CaltechPedestrians/> `_
* |OK_ICON| `Chars74K dataset - Character Recognition in Natural Images (both English [...] <http://www.ee.surrey.ac.uk/CVSSP/demos/chars74k/> `_
* |OK_ICON| `Cube++ - 4890 raw 18-megapixel images, each containing a SpyderCube color [...] <https://github.com/Visillect/CubePlusPlus> `_
* |OK_ICON| `Danbooru Tagged Anime Illustration Dataset - A large-scale anime image [...] <https://www.gwern.net/Danbooru> `_
* |FIXME_ICON| `DukeMTMC Data Set - DukeMTMC aims to accelerate advances in multi-target [...] <http://vision.cs.duke.edu/DukeMTMC/> `_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//ImageProcessing/DukeMTMC-Data-Set.yml> `_ ]
* |OK_ICON| `ETH Entomological Collection (ETHEC) Fine Grained Butterfly (Lepidoptera) Images <https://doi.org/10.3929/ethz-b-000365379> `_
* |OK_ICON| `Face Recognition Benchmark <http://www.face-rec.org/databases/> `_
* |FIXME_ICON| `Flickr: 32 Class Brand Logos <http://www.multimedia-computing.de/flickrlogos/> `_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//ImageProcessing/Flickr-32-Class-Brand-Logos.yml> `_ ]
* |OK_ICON| `GDXray - X-ray images for X-ray testing and Computer Vision <http://dmery.ing.puc.cl/index.php/material/gdxray/> `_
* |OK_ICON| `HumanEva Dataset - The HumanEva-I dataset contains 7 calibrated video [...] <http://humaneva.is.tue.mpg.de/> `_
* |OK_ICON| `ImageNet (in WordNet hierarchy) <http://www.image-net.org/> `_
* |OK_ICON| `Indoor Scene Recognition <http://web.mit.edu/torralba/www/indoor.html> `_
* |OK_ICON| `International Affective Picture System, UFL <http://csea.phhp.ufl.edu/media/iapsmessage.html> `_
* |OK_ICON| `KITTI Vision Benchmark Suite <http://www.cvlibs.net/datasets/kitti/> `_
* |OK_ICON| `Labeled Information Library of Alexandria - Biology and Conservation - [...] <http://lila.science> `_
* |OK_ICON| `MNIST database of handwritten digits, 70,000 examples <http://yann.lecun.com/exdb/mnist/> `_
* |OK_ICON| `Multi-View Region of Interest Prediction Dataset for Autonomous Driving - [...] <https://mediatum.ub.tum.de/1548761> `_
* |FIXME_ICON| `Massive Visual Memory Stimuli, MIT <http://cvcl.mit.edu/MM/stimuli.html> `_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//ImageProcessing/Massive-Visual-Memory-Stimuli-MIT.yml> `_ ]
* |OK_ICON| `Newspaper Navigator - This dataset consists of extracted visual content [...] <https://news-navigator.labs.loc.gov/> `_
* |OK_ICON| `Open Images From Google - Pictures with segmentation masks for 2.8 [...] <https://storage.googleapis.com/openimages/web/download.html> `_
* |OK_ICON| `RuFa - Contains images of text written in one of two Arabic fonts (Ruqaa [...] <https://github.com/mhmoodlan/arabic-font-classification/releases/tag/v0.1.0> `_
* |OK_ICON| `SUN database, MIT <http://groups.csail.mit.edu/vision/SUN/hierarchy.html> `_
* |OK_ICON| `SVIRO Synthetic Vehicle Interior Rear Seat Occupancy - 25.000 synthetic [...] <https://sviro.kl.dfki.de> `_
* |FIXME_ICON| `Several Shape-from-Silhouette Datasets <http://kaiwolf.no-ip.org/3d-model-repository.html> `_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//ImageProcessing/Several-Shape-from-Silhouette-Datasets.yml> `_ ]
* |OK_ICON| `Stanford Dogs Dataset <http://vision.stanford.edu/aditya86/ImageNetDogs/> `_
* |OK_ICON| `The Action Similarity Labeling (ASLAN) Challenge <http://www.openu.ac.il/home/hassner/data/ASLAN/ASLAN.html> `_
* |OK_ICON| `The Oxford-IIIT Pet Dataset <http://www.robots.ox.ac.uk/~vgg/data/pets/> `_
* |OK_ICON| `Violent-Flows - Crowd Violence / Non-violence Database and benchmark <http://www.openu.ac.il/home/hassner/data/violentflows/> `_
* |OK_ICON| `Visual Genome <http://visualgenome.org/api/v0/api_home.html> `_
* |OK_ICON| `YouTube Faces Database <http://www.cs.tau.ac.il/~wolf/ytfaces/> `_

MachineLearning
---------------

* |OK_ICON| `All-Age-Faces Dataset - Contains 13'322 Asian face images distributed [...] <https://github.com/JingchunCheng/All-Age-Faces-Dataset> `_
* |OK_ICON| `Audi Autonomous Driving Dataset - We have published the Audi Autonomous [...] <https://www.a2d2.audi/a2d2/en.html> `_
* |OK_ICON| `Context-aware data sets from five domains <https://github.com/irecsys/CARSKit/tree/master/context-aware_data_sets> `_
* |OK_ICON| `Delve Datasets for classification and regression <http://www.cs.toronto.edu/~delve/data/datasets.html> `_
* |OK_ICON| `Discogs Monthly Data <http://data.discogs.com/> `_
* |OK_ICON| `Free Music Archive <https://github.com/mdeff/fma> `_
* |OK_ICON| `IMDb Database <http://www.imdb.com/interfaces> `_
* |OK_ICON| `Iranis - A Large-scale Dataset of Farsi/Arabic License Plate Characters <https://alitourani.github.io/Iranis-dataset/> `_
* |OK_ICON| `Keel Repository for classification, regression and time series <http://sci2s.ugr.es/keel/datasets.php> `_
* |OK_ICON| `Labeled Faces in the Wild (LFW) <http://vis-www.cs.umass.edu/lfw/> `_
* |OK_ICON| `Lending Club Loan Data <https://www.lendingclub.com/info/download-data.action> `_
* |OK_ICON| `Machine Learning Data Set Repository <http://mldata.org/> `_
* |OK_ICON| `Million Song Dataset <http://labrosa.ee.columbia.edu/millionsong/> `_
* |OK_ICON| `More Song Datasets <http://labrosa.ee.columbia.edu/millionsong/pages/additional-datasets> `_
* |OK_ICON| `MovieLens Data Sets <http://grouplens.org/datasets/movielens/> `_
* |OK_ICON| `New Yorker caption contest ratings <https://github.com/nextml/caption-contest-data> `_
* |OK_ICON| `RDataMining - "R and Data Mining" ebook data <http://www.rdatamining.com/data> `_
* |FIXME_ICON| `Registered Meteorites on Earth <http://publichealthintelligence.org/content/registered-meteorites-has-impacted-earth-visualized> `_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//MachineLearning/Registered-Meteorites-on-Earth.yml> `_ ]
* |OK_ICON| `Restaurants Health Score Data in San Francisco <https://data.sfgov.org/Health-and-Social-Services/Restaurant-Scores-LIVES-Standard/pyih-qa8i?row_index=0> `_
* |OK_ICON| `UCI Machine Learning Repository <http://archive.ics.uci.edu/ml/> `_
* |OK_ICON| `Yahoo! Ratings and Classification Data <http://webscope.sandbox.yahoo.com/catalog.php?datatype=r> `_
* |OK_ICON| `YouTube-BoundingBoxes <https://research.google.com/youtube-bb/> `_
* |OK_ICON| `YouTube-8M <https://research.google.com/youtube8m/download.html> `_
* |OK_ICON| `eBay Online Auctions (2012) <http://www.modelingonlineauctions.com/datasets> `_

Museums
-------

* |OK_ICON| `Canada Science and Technology Museums Corporation's Open Data <http://techno-science.ca/en/data.php> `_
* |OK_ICON| `Cooper-Hewitt's Collection Database <https://github.com/cooperhewitt/collection> `_
* |OK_ICON| `Metropolitan Museum of Art Collection API <https://metmuseum.github.io/> `_
* |OK_ICON| `Minneapolis Institute of Arts metadata <https://github.com/artsmia/collection> `_
* |OK_ICON| `Natural History Museum (London) Data Portal <http://data.nhm.ac.uk/> `_
* |OK_ICON| `Rijksmuseum Historical Art Collection <https://www.rijksmuseum.nl/en/api> `_
* |OK_ICON| `Tate Collection metadata <https://github.com/tategallery/collection> `_
* |OK_ICON| `The Getty vocabularies <http://vocab.getty.edu> `_

NaturalLanguage
---------------

* |OK_ICON| `Automatic Keyphrase Extraction <https://github.com/snkim/AutomaticKeyphraseExtraction/> `_
* |OK_ICON| `The Big Bad NLP Database <https://datasets.quantumstat.com> `_
* |OK_ICON| `Blizzard Challenge Speech - The speech + text data comes from [...] <https://www.synsig.org/index.php/Blizzard_Challenge_2018> `_
* |OK_ICON| `Blogger Corpus <http://u.cs.biu.ac.il/~koppel/BlogCorpus.htm> `_
* |FIXME_ICON| `CLiPS Stylometry Investigation Corpus <http://www.clips.uantwerpen.be/datasets/csi-corpus> `_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//NaturalLanguage/CLiPS-Stylometry-Investigation-Corpus.yml> `_ ]
* |OK_ICON| `ClueWeb09 FACC <http://lemurproject.org/clueweb09/FACC1/> `_
* |OK_ICON| `ClueWeb12 FACC <http://lemurproject.org/clueweb12/FACC1/> `_
* |OK_ICON| `DBpedia - 4.58M things with 583M facts <http://wiki.dbpedia.org/Datasets> `_
* |OK_ICON| `Dirty Words - With millions of images in our library and billions of [...] <https://github.com/LDNOOBW/List-of-Dirty-Naughty-Obscene-and-Otherwise-Bad-Words> `_
* |OK_ICON| `Flickr Personal Taxonomies <http://www.isi.edu/~lerman/downloads/flickr/flickr_taxonomies.html> `_
* |FIXME_ICON| `Freebase of people, places, and things <http://www.freebase.com/> `_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//NaturalLanguage/Freebase-of-people-places-and-things.yml> `_ ]
* |OK_ICON| `German Political Speeches Corpus - Collection of political speeches from [...] <http://adrien.barbaresi.eu/corpora/speeches/> `_
* |OK_ICON| `Google Books Ngrams (2.2TB) <https://aws.amazon.com/datasets/google-books-ngrams/> `_
* |OK_ICON| `Google MC-AFP - Generated based on the public available Gigaword dataset [...] <https://github.com/google/mcafp> `_
* |OK_ICON| `Google Web 5gram (1TB, 2006) <https://catalog.ldc.upenn.edu/LDC2006T13> `_
* |FIXME_ICON| `Gutenberg eBooks List <http://www.gutenberg.org/wiki/Gutenberg:Offline_Catalogs> `_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//NaturalLanguage/Gutenberg-eBooks-List.yml> `_ ]
* |OK_ICON| `Hansards text chunks of Canadian Parliament <http://www.isi.edu/natural-language/download/hansard/> `_
* |OK_ICON| `LJ Speech - Speech dataset consisting of 13,100 short audio clips of a [...] <https://keithito.com/LJ-Speech-Dataset> `_
* |FIXME_ICON| `M-AILabs Speech - The M-AILABS Speech Dataset is the first large dataset [...] <http://www.m-ailabs.bayern/en/the-mailabs-speech-dataset/> `_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//NaturalLanguage/M-AILABS-Speech.yml> `_ ]
* |OK_ICON| `Microsoft MAchine Reading COmprehension Dataset (or MS MARCO) <http://www.msmarco.org/dataset.aspx> `_
* |OK_ICON| `Machine Comprehension Test (MCTest) of text from Microsoft Research <http://mattr1.github.io/mctest/> `_
* |OK_ICON| `Machine Translation of European languages <http://statmt.org/wmt11/translation-task.html#download> `_
* |FIXME_ICON| `Making Sense of Microposts 2013 - Concept Extraction <http://oak.dcs.shef.ac.uk/msm2013/challenge.html> `_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//NaturalLanguage/Making-Sense-of-Microposts-2013.yml> `_ ]
* |OK_ICON| `Making Sense of Microposts 2016 - Named Entity rEcognition and Linking <http://microposts2016.seas.upenn.edu/challenge.html> `_
* |OK_ICON| `Multi-Domain Sentiment Dataset (version 2.0) <http://www.cs.jhu.edu/~mdredze/datasets/sentiment/> `_
* |FIXME_ICON| `Noisy speech database for training speech enhancement algorithms and TTS [...] <https://datashare.is.ed.ac.uk/handle/10283/2791> `_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//NaturalLanguage/Noisy-Speech.yml> `_ ]
* |OK_ICON| `Open Multilingual Wordnet <http://compling.hss.ntu.edu.sg/omw/> `_
* |OK_ICON| `POS/NER/Chunk annotated data <https://github.com/aritter/twitter_nlp/tree/master/data/annotated> `_
* |FIXME_ICON| `Personae Corpus <http://www.clips.uantwerpen.be/datasets/personae-corpus> `_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//NaturalLanguage/Personae-Corpus.yml> `_ ]
* |OK_ICON| `SMS Spam Collection in English <http://www.dt.fee.unicamp.br/~tiago/smsspamcollection/> `_
* |OK_ICON| `SaudiNewsNet Collection of Saudi Newspaper Articles (Arabic, 30K articles) <https://github.com/ParallelMazen/SaudiNewsNet> `_
* |OK_ICON| `Stanford Question Answering Dataset (SQuAD) <https://rajpurkar.github.io/SQuAD-explorer/> `_
* |OK_ICON| `USENET postings corpus of 2005~2011 <http://www.psych.ualberta.ca/~westburylab/downloads/usenetcorpus.download.html> `_
* |OK_ICON| `Universal Dependencies <http://universaldependencies.org> `_
* |OK_ICON| `Webhose - News/Blogs in multiple languages <https://webhose.io/datasets> `_
* |OK_ICON| `Wikidata - Wikipedia databases <https://www.wikidata.org/wiki/Wikidata:Database_download> `_
* |OK_ICON| `Wikipedia Links data - 40 Million Entities in Context <https://code.google.com/p/wiki-links/downloads/list> `_
* |OK_ICON| `WordNet databases and tools <http://wordnet.princeton.edu/download/> `_
* |OK_ICON| `WorldTree Corpus of Explanation Graphs for Elementary Science Questions - [...] <http://www.cognitiveai.org/explanationbank> `_

Neuroscience
------------

* |OK_ICON| `Allen Institute Datasets <http://www.brain-map.org/> `_
* |OK_ICON| `Brain Catalogue <http://braincatalogue.org/> `_
* |OK_ICON| `Brainomics <http://brainomics.cea.fr/localizer> `_
* |FIXME_ICON| `CodeNeuro Datasets <http://datasets.codeneuro.org/> `_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Neuroscience/CodeNeuro-Datasets.yml> `_ ]
* |OK_ICON| `Collaborative Research in Computational Neuroscience (CRCNS) <http://crcns.org/data-sets> `_
* |OK_ICON| `FCP-INDI <http://fcon_1000.projects.nitrc.org/index.html> `_
* |OK_ICON| `Human Connectome Project <http://www.humanconnectome.org/data/> `_
* |OK_ICON| `NDAR <https://ndar.nih.gov/> `_
* |OK_ICON| `NIMH Data Archive <http://data-archive.nimh.nih.gov/> `_
* |OK_ICON| `NeuroData <http://neurodata.io> `_
* |OK_ICON| `NeuroMorpho - NeuroMorpho.Org is a centrally curated inventory of [...] <http://neuromorpho.org/> `_
* |OK_ICON| `Neuroelectro <http://neuroelectro.org/> `_
* |OK_ICON| `OASIS <http://www.oasis-brains.org/> `_
* |OK_ICON| `OpenNEURO <https://openneuro.org/public/datasets> `_
* |OK_ICON| `OpenfMRI <https://openfmri.org/> `_
* |OK_ICON| `Study Forrest <http://studyforrest.org> `_

Physics
-------

* |OK_ICON| `CERN Open Data Portal <http://opendata.cern.ch/> `_
* |OK_ICON| `Crystallography Open Database <http://www.crystallography.net/> `_
* |OK_ICON| `IceCube - South Pole Neutrino Observatory <http://icecube.wisc.edu/science/data> `_
* |OK_ICON| `LIGO Open Science Center (LOSC) - Gravitational wave data from the LIGO [...] <https://losc.ligo.org> `_
* |OK_ICON| `NASA Exoplanet Archive <http://exoplanetarchive.ipac.caltech.edu/> `_
* |OK_ICON| `NSSDC (NASA) data of 550 spacecraft <http://nssdc.gsfc.nasa.gov/nssdc/obtaining_data.html> `_
* |OK_ICON| `Sloan Digital Sky Survey (SDSS) - Mapping the Universe <http://www.sdss.org/> `_

ProstateCancer
--------------

* |OK_ICON| `EOPC-DE-Early-Onset-Prostate-Cancer-Germany - Early Onset Prostate Cancer [...] <https://dcc.icgc.org/projects/EOPC-DE> `_
* |OK_ICON| `GENIE - Data from the Genomics Evidence Neoplasia Information Exchange [...] <https://www.synapse.org/genie> `_
* |OK_ICON| `Genomic-Hallmarks-Prostate-Adenocarcinoma-CPC-GENE - Comprehensive [...] <http://www.cbioportal.org/study?id=prad_cpcg_2017> `_
* |OK_ICON| `MSK-IMPACT-Clinical-Sequencing-Cohort-MSKCC-Prostate-Cancer - Targeted [...] <http://www.cbioportal.org/study?id=prad_mskcc_2017> `_
* |OK_ICON| `Metastatic-Prostate-Adenocarcinoma-MCTP - Comprehensive profiling of 61 [...] <http://www.cbioportal.org/study?id=prad_mich> `_
* |OK_ICON| `Metastatic-Prostate-Cancer-SU2CPCF-Dream-Team - Comprehensive analysis of [...] <http://www.cbioportal.org/study?id=prad_su2c_2015> `_
* |OK_ICON| `NPCR-2001-2015 - Database from CDC's National Program of Cancer [...] <https://www.cdc.gov/cancer/uscs/public-use> `_
* |OK_ICON| `NPCR-2005-2015 - Database from CDC's National Program of Cancer [...] <https://www.cdc.gov/cancer/uscs/public-use> `_
* |OK_ICON| `NaF-Prostate - NaF Prostate is a collection of F-18 NaF positron emission [...] <https://wiki.cancerimagingarchive.net/display/Public/NaF+Prostate> `_
* |OK_ICON| `Neuroendocrine-Prostate-Cancer - Whole exome and RNA Seq data of [...] <http://www.cbioportal.org/study?id=nepc_wcm_2016> `_
* |OK_ICON| `PLCO-Prostate-Diagnostic-Procedures - The Prostate Diagnostic Procedures [...] <https://biometry.nci.nih.gov/cdas/plco/> `_
* |OK_ICON| `PLCO-Prostate-Medical-Complications - The Prostate Medical Complications [...] <https://biometry.nci.nih.gov/cdas/plco/> `_
* |OK_ICON| `PLCO-Prostate-Screening-Abnormalities - The Prostate Screening [...] <https://biometry.nci.nih.gov/cdas/plco/> `_
* |OK_ICON| `PLCO-Prostate-Screening - The Prostate Screening dataset (177,315 [...] <https://biometry.nci.nih.gov/cdas/plco/> `_
* |OK_ICON| `PLCO-Prostate-Treatments - The Prostate Treatments dataset (13,409 [...] <https://biometry.nci.nih.gov/cdas/plco/> `_
* |OK_ICON| `PLCO-Prostate - The Prostate dataset is a comprehensive dataset that [...] <https://biometry.nci.nih.gov/cdas/plco/> `_
* |OK_ICON| `PRAD-CA-Prostate-Adenocarcinoma-Canada - Prostate Adenocarcinoma - [...] <https://dcc.icgc.org/projects/PRAD-CA> `_
* |OK_ICON| `PRAD-FR-Prostate-Adenocarcinoma-France - Prostate Adenocarcinoma - [...] <https://dcc.icgc.org/projects/PRAD-FR> `_
* |OK_ICON| `PRAD-UK-Prostate-Adenocarcinoma-United-Kingdom - Prostate Adenocarcinoma [...] <https://dcc.icgc.org/projects/PRAD-UK> `_
* |OK_ICON| `PROSTATEx-Challenge - Retrospective set of prostate MR studies. All [...] <https://wiki.cancerimagingarchive.net/display/Public/SPIE-AAPM-NCI+PROSTATEx+Challenges> `_
* |OK_ICON| `Prostate-3T - The Prostate-3T project provided imaging data to TCIA as [...] <https://wiki.cancerimagingarchive.net/display/Public/PROSTATE-3T> `_
* |OK_ICON| `Prostate-Adenocarcinoma-Broad-Cornell-2012 - Comprehensive profiling of [...] <http://www.cbioportal.org/study?id=prad_broad> `_
* |OK_ICON| `Prostate-Adenocarcinoma-Broad-Cornell-2013 - Comprehensive profiling of [...] <http://www.cbioportal.org/study?id=prad_broad_2013> `_
* |OK_ICON| `Prostate-Adenocarcinoma-CNA-study-MSKCC - Copy-number profiling of 103 [...] <http://www.cbioportal.org/study?id=prad_mskcc_2014> `_
* |OK_ICON| `Prostate-Adenocarcinoma-Fred-Hutchinson-CRC - Comprehensive profiling of [...] <http://www.cbioportal.org/study?id=prad_fhcrc> `_
* |OK_ICON| `Prostate Adenocarcinoma (MSKCC/DFCI) - Whole Exome Sequencing of 1013 [...] <http://www.cbioportal.org/study?id=prad_p1000> `_
* |OK_ICON| `Prostate-Adenocarcinoma-MSKCC - MSKCC Prostate Oncogenome Project. 181 [...] <http://www.cbioportal.org/study?id=prad_mskcc> `_
* |OK_ICON| `Prostate-Adenocarcinoma-Organoids-MSKCC - Exome profiling of prostate [...] <http://www.cbioportal.org/study?id=prad_mskcc_cheny1_organoids_2014> `_
* |OK_ICON| `Prostate-Adenocarcinoma-Sun-Lab - Whole-genome and Transcriptome [...] <http://www.cbioportal.org/study?id=prad_eururol_2017> `_
* |OK_ICON| `Prostate-Adenocarcinoma-TCGA-PanCancer-Atlas - Comprehensive TCGA [...] <http://www.cbioportal.org/study?id=prad_tcga_pan_can_atlas_2018> `_
* |OK_ICON| `Prostate-Adenocarcinoma-TCGA - Integrated profiling of 333 primary [...] <http://www.cbioportal.org/study?id=prad_tcga_pub> `_
* |OK_ICON| `Prostate-Diagnosis - PCa T1- and T2-weighted magnetic resonance images [...] <https://wiki.cancerimagingarchive.net/display/Public/PROSTATE-DIAGNOSIS> `_
* |OK_ICON| `Prostate-Fused-MRI-Pathology - The Prostate Fused-MRI-Pathology [...] <https://wiki.cancerimagingarchive.net/display/Public/Prostate+Fused-MRI-Pathology> `_
* |OK_ICON| `Prostate-MRI - The Prostate-MRI collection of prostate Magnetic Resonance [...] <https://wiki.cancerimagingarchive.net/display/Public/Prostate-MRI> `_
* |OK_ICON| `Prostate-R - The R package 'ElemStatLearn' contains a prostate cancer [...] <https://web.stanford.edu/~hastie/ElemStatLearn/datasets/prostate.data> `_
* |OK_ICON| `QIN-PROSTATE-Repeatability - The QIN-PROSTATE-Repeatability dataset is a [...] <https://wiki.cancerimagingarchive.net/display/Public/QIN-PROSTATE-Repeatability> `_
* |OK_ICON| `QIN-PROSTATE - The QIN PROSTATE collection of the Quantitative Imaging [...] <https://wiki.cancerimagingarchive.net/display/Public/QIN+PROSTATE> `_
* |OK_ICON| `SEER-YR1973_2015.SEER9 - The SEER November 2017 Research Data files from [...] <https://seer.cancer.gov/data/seerstat/nov2017/> `_
* |OK_ICON| `SEER-YR1992_2015.SJ_LA_RG_AK - The SEER November 2017 Research Data files [...] <https://seer.cancer.gov/data/seerstat/nov2017/> `_
* |OK_ICON| `SEER-YR2000_2015.CA_KY_LO_NJ_GA - The SEER November 2017 Research Data [...] <https://seer.cancer.gov/data/seerstat/nov2017/> `_
* |OK_ICON| `SEER-YR2000_2015.CA_KY_LO_NJ_GA - The July - December 2005 diagnoses for [...] <https://seer.cancer.gov/data/seerstat/nov2017/> `_
* |OK_ICON| `TCGA-PRAD-US - TCGA Prostate Adenocarcinoma (499 samples). <http://www.cbioportal.org/study?id=prad_tcga> `_

Psychology+Cognition
--------------------

* |FIXME_ICON| `OSU Cognitive Modeling Repository Datasets <http://www.cmr.osu.edu/browse/datasets> `_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Psychology+Cognition/OSU-Cognitive-Modeling-Repository-Datasets.yml> `_ ]

PublicDomains
-------------

* |OK_ICON| `Ably Open Realtime Data <https://www.ably.io/hub/> `_
* |OK_ICON| `Amazon <http://aws.amazon.com/datasets/> `_
* |OK_ICON| `Archive.org Datasets <https://archive.org/details/datasets> `_
* |OK_ICON| `Archive-it from Internet Archive <https://www.archive-it.org/explore?show=Collections> `_
* |OK_ICON| `CMU JASA data archive <http://lib.stat.cmu.edu/jasadata/> `_
* |OK_ICON| `CMU StatLab collections <http://lib.stat.cmu.edu/datasets/> `_
* |OK_ICON| `Data.World <https://data.world> `_
* |FIXME_ICON| `Data360 <http://www.data360.org/index.aspx> `_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Data360.yml> `_ ]
* |OK_ICON| `Enigma Public <https://public.enigma.com/> `_
* |OK_ICON| `Google <http://www.google.com/publicdata/directory> `_
* |OK_ICON| `Grand Comics Database - The Grand Comics Database (GCD) is a nonprofit, [...] <https://www.comics.org> `_
* |FIXME_ICON| `Infochimps <http://www.infochimps.com/> `_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Infochimps.yml> `_ ]
* |OK_ICON| `KDNuggets Data Collections <http://www.kdnuggets.com/datasets/index.html> `_
* |FIXME_ICON| `Microsoft Azure Data Market Free DataSets <https://azuremarketplace.microsoft.com/en-us/marketplace/apps?source=datamarket&filters=pricing-free&page=1> `_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Microsoft-Azure-Data-Market-Free-DataSets.yml> `_ ]
* |OK_ICON| `Microsoft Data Science for Research <http://aka.ms/Data-Science> `_
* |OK_ICON| `Microsoft Research Open Data <https://msropendata.com/> `_
* |OK_ICON| `Open Library Data Dumps <https://openlibrary.org/developers/dumps> `_
* |FIXME_ICON| `Reddit Datasets <https://www.reddit.com/r/datasets> `_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/Reddit-Datasets.yml> `_ ]
* |FIXME_ICON| `RevolutionAnalytics Collection <https://packages.revolutionanalytics.com/datasets/> `_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//PublicDomains/RevolutionAnalytics-Collection.yml> `_ ]
* |OK_ICON| `Sample R data sets <http://stat.ethz.ch/R-manual/R-patched/library/datasets/html/00Index.html> `_
* |OK_ICON| `StatSci.org <http://www.statsci.org/datasets.html> `_
* |OK_ICON| `Stats4Stem R data sets (archived) <https://web.archive.org/web/20151024082129/http://www.stats4stem.org:80/data-sets.html> `_
* |OK_ICON| `The Washington Post List <http://www.washingtonpost.com/wp-srv/metro/data/datapost.html> `_
* |OK_ICON| `UCLA SOCR data collection <http://wiki.stat.ucla.edu/socr/index.php/SOCR_Data> `_
2018-01-15 01:04:07 +08:00
2018-04-11 01:06:33 +08:00
* |OK_ICON| `UFO Reports <http://www.nuforc.org/webreports.html> `_
2018-01-15 01:04:07 +08:00
2018-04-11 01:06:33 +08:00
* |OK_ICON| `Wikileaks 911 pager intercepts <https://911.wikileaks.org/files/index.html> `_
* |OK_ICON| `Yahoo Webscope <http://webscope.sandbox.yahoo.com/catalog.php>`_

SearchEngines
-------------

* |OK_ICON| `Academic Torrents of data sharing from UMB <http://academictorrents.com/>`_
* |FIXME_ICON| `DataMarket (Qlik) <https://datamarket.com/data/list/?q=all>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//SearchEngines/DataMarket-Qlik.yml>`_]
* |OK_ICON| `Datahub.io <https://datahub.io/dataset>`_
* |OK_ICON| `Domains Project - Sorted list of Internet domains <https://github.com/tb0hdan/domains>`_
* |OK_ICON| `Harvard Dataverse Network of scientific data <https://dataverse.harvard.edu/>`_
* |OK_ICON| `ICPSR (UMICH) <https://www.icpsr.umich.edu/web/pages/ICPSR/index.html>`_
* |OK_ICON| `Institute of Education Sciences <http://eric.ed.gov>`_
* |OK_ICON| `National Technical Reports Library <https://ntrl.ntis.gov/NTRL/>`_
* |OK_ICON| `Open Data Certificates (beta) <https://certificates.theodi.org/en/datasets>`_
* |OK_ICON| `OpenDataNetwork - A search engine of all Socrata powered data portals <http://www.opendatanetwork.com/>`_
* |OK_ICON| `Statista.com - statistics and Studies <http://www.statista.com/>`_
* |OK_ICON| `Zenodo - An open dependable home for the long-tail of science <https://zenodo.org/collection/datasets>`_

SocialNetworks
--------------

* |OK_ICON| `2021 Portuguese Elections Twitter Dataset - 57M+ tweets, 1M+ users - This [...] <https://github.com/msramalho/election-watch/blob/master/datasets/01_portuguese_presidential_elections_2021_01_24.md>`_
* |OK_ICON| `72 hours #gamergate Twitter Scrape <http://waxy.org/random/misc/gamergate_tweets.csv>`_
* |OK_ICON| `CMU Enron Email of 150 users <http://www.cs.cmu.edu/~enron/>`_
* |OK_ICON| `Cheng-Caverlee-Lee September 2009 - January 2010 Twitter Scrape <https://archive.org/details/twitter_cikm_2010>`_
* |OK_ICON| `China Biographical Database - The China Biographical Database is a freely [...] <https://projects.iq.harvard.edu/cbdb>`_
* |OK_ICON| `A Twitter Dataset of 40+ million tweets related to COVID-19 - Due to the [...] <https://zenodo.org/record/3723940>`_
* |OK_ICON| `43k+ Donald Trump Twitter Screenshots - This archive contains screenshots [...] <https://pikaso.me/blog/trump-twitter-archive>`_
* |OK_ICON| `EDRM Enron EMail of 151 users, hosted on S3 <https://aws.amazon.com/datasets/enron-email-data/>`_
* |OK_ICON| `Facebook Data Scrape (2005) <https://archive.org/details/oxford-2005-facebook-matrix>`_
* |OK_ICON| `Facebook Social Connectedness Index - We use an anonymized snapshot of [...] <https://data.humdata.org/dataset/social-connectedness-index>`_
* |OK_ICON| `Facebook Social Networks from LAW (since 2007) <http://law.di.unimi.it/datasets.php>`_
* |OK_ICON| `Foursquare from UMN/Sarwat (2013) <https://archive.org/details/201309_foursquare_dataset_umn>`_
* |OK_ICON| `GitHub Collaboration Archive <https://www.gharchive.org/>`_
* |OK_ICON| `Google Scholar citation relations <https://web.archive.org/web/20190522043016/http://www3.cs.stonybrook.edu/~leman/data/gscholar.db>`_
* |OK_ICON| `High-Resolution Contact Networks from Wearable Sensors <http://www.sociopatterns.org/datasets/>`_
* |OK_ICON| `Indie Map: social graph and crawl of top IndieWeb sites <http://www.indiemap.org/>`_
* |OK_ICON| `Mobile Social Networks from UMASS <https://kdl.cs.umass.edu/display/public/Mobile+Social+Networks>`_
* |OK_ICON| `Network Twitter Data <http://snap.stanford.edu/data/higgs-twitter.html>`_
* |OK_ICON| `Reddit Comments <http://files.pushshift.io/reddit/comments/>`_
* |OK_ICON| `Skytrax' Air Travel Reviews Dataset <https://github.com/quankiquanki/skytrax-reviews-dataset>`_
* |OK_ICON| `Social Twitter Data <http://snap.stanford.edu/data/egonets-Twitter.html>`_
* |OK_ICON| `SourceForge.net Research Data <http://www3.nd.edu/~oss/Data/data.html>`_
* |OK_ICON| `Twitch Top Streamer's Data <https://www.kaggle.com/aayushmishra1512/twitchdata>`_
* |OK_ICON| `Twitter Data for Online Reputation Management <http://nlp.uned.es/replab2013/>`_
* |OK_ICON| `Twitter Data for Sentiment Analysis <http://help.sentiment140.com/for-students/>`_
* |OK_ICON| `Twitter Graph of entire Twitter site <http://an.kaist.ac.kr/traces/WWW2010.html>`_
* |FIXME_ICON| `Twitter Scrape Calufa May 2011 <http://archive.org/details/2011-05-calufa-twitter-sql>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//SocialNetworks/Twitter-Scrape-Calufa-May-2011.yml>`_]
* |OK_ICON| `UNIMI/LAW Social Network Datasets <http://law.di.unimi.it/datasets.php>`_
* |OK_ICON| `United States Congress Twitter Data - Daily datasets with tweets of 1100+ [...] <https://github.com/alexlitel/congresstweets>`_
* |OK_ICON| `Yahoo! Graph and Social Data <http://webscope.sandbox.yahoo.com/catalog.php?datatype=g>`_
* |OK_ICON| `Youtube Video Social Graph in 2007,2008 <http://netsg.cs.sfu.ca/youtubedata/>`_

SocialSciences
--------------

* |OK_ICON| `ACLED (Armed Conflict Location & Event Data Project) <http://www.acleddata.com/>`_
* |OK_ICON| `Authoritarian Ruling Elites Database - The Authoritarian Ruling Elites [...] <https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/QZ9BSA>`_
* |OK_ICON| `Canadian Legal Information Institute <https://www.canlii.org/en/index.php>`_
* |FIXME_ICON| `Center for Systemic Peace Datasets - Conflict Trends, Polities, State Fragility, etc <http://www.systemicpeace.org/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//SocialSciences/Center-for-Systemic-Peace-Datasets.yml>`_]
* |OK_ICON| `Correlates of War Project <http://www.correlatesofwar.org/>`_
* |OK_ICON| `Cryptome Conspiracy Theory Items <http://cryptome.org>`_
* |FIXME_ICON| `Datacards <https://www.datacards.org/login/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//SocialSciences/Datacards.yml>`_]
* |OK_ICON| `European Social Survey <http://www.europeansocialsurvey.org/data/>`_
* |OK_ICON| `FBI Hate Crime 2013 - aggregated data <https://github.com/emorisse/FBI-Hate-Crime-Statistics/tree/master/2013>`_
* |FIXME_ICON| `Fragile States Index <http://fundforpeace.org/fsi/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//SocialSciences/Fragile-States-Index.yml>`_]
* |OK_ICON| `GDELT Global Events Database <http://gdeltproject.org/data.html>`_
* |OK_ICON| `General Social Survey (GSS) since 1972 <http://gss.norc.org>`_
* |OK_ICON| `German Social Survey <http://www.gesis.org/en/home/>`_
* |OK_ICON| `Global Religious Futures Project <http://www.globalreligiousfutures.org/>`_
* |OK_ICON| `Gun Violence Data - A comprehensive, accessible database that contains [...] <https://github.com/jamesqo/gun-violence-data>`_
* |OK_ICON| `Humanitarian Data Exchange <https://data.humdata.org/>`_
* |OK_ICON| `INFORM Index for Risk Management <http://www.inform-index.org/Results/Global>`_
* |OK_ICON| `Institute for Demographic Studies <http://www.ined.fr/en/>`_
* |OK_ICON| `International Networks Archive <http://www.princeton.edu/~ina/>`_
* |OK_ICON| `International Social Survey Program ISSP <http://www.issp.org>`_
* |OK_ICON| `International Studies Compendium Project <http://www.isacompendium.com/public/>`_
* |OK_ICON| `James McGuire Cross National Data <http://jmcguire.faculty.wesleyan.edu/welcome/cross-national-data/>`_
* |OK_ICON| `MIT Reality Mining Dataset <http://realitycommons.media.mit.edu/realitymining.html>`_
* |OK_ICON| `MacroData Guide by Norsk samfunnsvitenskapelig datatjeneste <http://nsd.uib.no>`_
* |OK_ICON| `Mass Mobilization Data Project - The Mass Mobilization (MM) data are an [...] <https://dataverse.harvard.edu/dataverse/MMdata>`_
* |OK_ICON| `Microsoft Academic Knowledge Graph - The Microsoft Academic Knowledge [...] <http://ma-graph.org>`_
* |OK_ICON| `Minnesota Population Center <https://www.ipums.org/>`_
* |OK_ICON| `Notre Dame Global Adaptation Index (ND-GAIN) <https://gain.nd.edu/our-work/country-index/download-data/>`_
* |OK_ICON| `Open Crime and Policing Data in England, Wales and Northern Ireland <https://data.police.uk/data/>`_
* |OK_ICON| `OpenSanctions - A global database of persons and companies of political, [...] <http://www.opensanctions.org/#downloads>`_
* |OK_ICON| `Paul Hensel General International Data Page <http://www.paulhensel.org/dataintl.html>`_
* |OK_ICON| `PewResearch Internet Survey Project <http://www.pewinternet.org/?post_type=dataset>`_
* |OK_ICON| `PewResearch Society Data Collection <http://www.pewresearch.org/data/download-datasets/>`_
* |FIXME_ICON| `Political Polarity Data <http://www3.cs.stonybrook.edu/~leman/data/14-icwsm-political-polarity-data.zip>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//SocialSciences/Political-Polarity-Data.yml>`_]
* |OK_ICON| `StackExchange Data Explorer <http://data.stackexchange.com/help>`_
* |OK_ICON| `Terrorism Research and Analysis Consortium <http://www.trackingterrorism.org/>`_
* |OK_ICON| `Texas Inmates Executed Since 1984 <http://www.tdcj.state.tx.us/death_row/dr_executed_offenders.html>`_
* |OK_ICON| `Titanic Survival Data Set <https://www.kaggle.com/c/titanic/data>`_
* |FIXME_ICON| `UCB's Archive of Social Science Data (D-Lab) <http://ucdata.berkeley.edu/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//SocialSciences/UCBs-Archive-of-Social-Science-Data-D-Lab.yml>`_]
* |OK_ICON| `UCLA Social Sciences Data Archive <https://dataverse.harvard.edu/dataverse/ssda_ucla>`_
* |OK_ICON| `UN Civil Society Database <http://esango.un.org/civilsociety/>`_
* |OK_ICON| `UPJOHN for Labor Employment Research <http://www.upjohn.org/services/resources/employment-research-data-center>`_
* |OK_ICON| `Universities Worldwide <http://univ.cc/>`_
* |OK_ICON| `Uppsala Conflict Data Program <http://ucdp.uu.se/>`_
* |OK_ICON| `World Bank Open Data <http://data.worldbank.org/>`_
* |OK_ICON| `World Inequality Database - The World Inequality Database (WID.world) [...] <https://wid.world>`_
* |OK_ICON| `WorldPop project - Worldwide human population distributions <http://www.worldpop.org.uk/data/get_data/>`_

Software
--------

* |OK_ICON| `FLOSSmole data about free, libre, and open source software development <http://flossdata.syr.edu/data/>`_
* |OK_ICON| `GHTorrent - Scalable, queryable, offline mirror of data offered through [...] <https://ghtorrent.org>`_
* |OK_ICON| `Libraries.io Open Source Repository and Dependency Metadata <https://doi.org/10.5281/zenodo.1068916>`_
* |OK_ICON| `Public Git Archive - a Big Code dataset for all – dataset of 182,014 top- [...] <https://github.com/src-d/datasets/tree/master/PublicGitArchive>`_
* |OK_ICON| `Code duplicates - 2k Java file and 600 Java function pairs labeled as [...] <https://github.com/src-d/datasets/tree/master/Duplicates>`_
* |OK_ICON| `Commit messages - 1.3 billion GitHub commit messages till March 2019 <https://github.com/src-d/datasets/blob/master/CommitMessages>`_
* |OK_ICON| `Pull Request review comments - 25.3 million GitHub PR review comments [...] <https://github.com/src-d/datasets/blob/master/ReviewComments>`_
* |OK_ICON| `Source Code Identifiers - 41.7 million distinct splittable identifiers [...] <https://github.com/src-d/datasets/tree/master/Identifiers>`_

Sports
------

* |OK_ICON| `American Ninja Warrior Obstacles - Contains every obstacle in the history [...] <https://data.world/ninja/anw-obstacle-history>`_
* |OK_ICON| `Betfair Historical Exchange Data <http://data.betfair.com/>`_
* |OK_ICON| `Cricsheet Matches (cricket) <http://cricsheet.org/>`_
* |OK_ICON| `Equity in Athletics - The Equity in Athletics Data Analysis Cutting Tool [...] <https://ope.ed.gov/athletics>`_
* |OK_ICON| `Ergast Formula 1, from 1950 up to date (API) <http://ergast.com/mrd/db>`_
* |OK_ICON| `Football/Soccer resources (data and APIs) <http://www.jokecamp.com/blog/guide-to-football-and-soccer-data-and-apis/>`_
* |OK_ICON| `Lahman's Baseball Database <http://www.seanlahman.com/baseball-archive/statistics/>`_
* |OK_ICON| `NFL play-by-play data - NFL play-by-play data sourced from: [...] <https://www.dolthub.com/repositories/Liquidata/nfl-play-by-play>`_
* |OK_ICON| `Pinhooker: Thoroughbred Bloodstock Sale Data <https://github.com/phillc73/pinhooker>`_
* |OK_ICON| `Pro Kabadi season 1 to 7 - Pro Kabadi League is a professional-level [...] <https://github.com/ranganadhkodali/Pro-Kabadi-season-1-7-Stats>`_
* |OK_ICON| `Retrosheet Baseball Statistics <http://www.retrosheet.org/game.htm>`_
* |OK_ICON| `Tennis database of rankings, results, and stats for ATP <https://github.com/JeffSackmann/tennis_atp>`_
* |OK_ICON| `Tennis database of rankings, results, and stats for WTA <https://github.com/JeffSackmann/tennis_wta>`_
* |OK_ICON| `USA Soccer Teams and Locations - USA soccer teams and locations. MLS, [...] <https://github.com/gavinr/usa-soccer>`_

TimeSeries
----------

* |OK_ICON| `3W dataset - To the best of its authors' knowledge, this is the first [...] <https://github.com/ricardovvargas/3w_dataset>`_
* |OK_ICON| `Databanks International Cross National Time Series Data Archive <http://www.cntsdata.com>`_
* |OK_ICON| `Hard Drive Failure Rates <https://www.backblaze.com/hard-drive-test-data.html>`_
* |OK_ICON| `Heart Rate Time Series from MIT <http://ecg.mit.edu/time-series/>`_
* |OK_ICON| `Time Series Data Library (TSDL) from MU <https://pkg.yangzhuoranyang.com/tsdl/>`_
* |OK_ICON| `Turing Change Point Dataset - Contains 42 annotated time series collected [...] <https://github.com/alan-turing-institute/TCPD>`_
* |OK_ICON| `UC Riverside Time Series Dataset <http://www.cs.ucr.edu/~eamonn/time_series_data/>`_

Transportation
--------------

* |OK_ICON| `Airlines OD Data 1987-2008 <http://stat-computing.org/dataexpo/2009/the-data.html>`_
* |FIXME_ICON| `Ford GoBike Data (formerly Bay Area Bike Share Data) <https://www.fordgobike.com/system-data>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Transportation/Bay-Area-Bike-Share-Data.yml>`_]
* |OK_ICON| `Bike Share Systems (BSS) collection <https://github.com/BetaNYC/Bike-Share-Data-Best-Practices/wiki/Bike-Share-Data-Systems>`_
* |OK_ICON| `Dutch Traffic Information <https://www.ndw.nu/en/>`_
* |OK_ICON| `GeoLife GPS Trajectory from Microsoft Research <http://research.microsoft.com/en-us/downloads/b16d359d-d164-469e-9fd4-daa38f2b2e13/>`_
* |OK_ICON| `German train system by Deutsche Bahn <http://data.deutschebahn.com/datasets/>`_
* |FIXME_ICON| `Hubway Million Rides in MA <http://hubwaydatachallenge.org/trip-history-data/>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Transportation/Hubway-Million-Rides-in-MA.yml>`_]
* |OK_ICON| `Montreal BIXI Bike Share <https://montreal.bixi.com/en/open-data>`_
* |OK_ICON| `NYC Taxi Trip Data 2009- <https://www1.nyc.gov/site/tlc/about/tlc-trip-record-data.page>`_
* |OK_ICON| `NYC Taxi Trip Data 2013 (FOIA/FOILed) <https://archive.org/details/nycTaxiTripData2013>`_
* |OK_ICON| `NYC Uber trip data April 2014 to September 2014 <https://github.com/fivethirtyeight/uber-tlc-foil-response>`_
* |OK_ICON| `Open Traffic collection <https://github.com/graphhopper/open-traffic-collection>`_
* |OK_ICON| `OpenFlights - airport, airline and route data <http://openflights.org/data.html>`_
* |OK_ICON| `Philadelphia Bike Share Stations (JSON) <https://www.rideindego.com/stations/json/>`_
* |OK_ICON| `Plane Crash Database, since 1920 <http://www.planecrashinfo.com/database.htm>`_
* |FIXME_ICON| `RITA Airline On-Time Performance data <http://www.transtats.bts.gov/Tables.asp?DB_ID=120>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Transportation/RITA-Airline-On.yml>`_]
* |FIXME_ICON| `RITA/BTS transport data collection (TranStat) <http://www.transtats.bts.gov/DataIndex.asp>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Transportation/RITA-BTS-transport-data-collection-TranStat.yml>`_]
* |OK_ICON| `Renfe (Spanish National Railway Network) dataset <https://data.renfe.com>`_
* |OK_ICON| `Toronto Bike Share Stations (JSON and GBFS files) <https://www.toronto.ca/city-government/data-research-maps/open-data/open-data-catalogue/#84045f23-7465-0892-8889-7b6f91049b29>`_
* |OK_ICON| `Transport for London (TFL) <https://tfl.gov.uk/info-for/open-data-users/our-open-data>`_
* |FIXME_ICON| `Travel Tracker Survey (TTS) for Chicago <http://www.cmap.illinois.gov/data/transportation/travel-tracker-survey>`_ [`fixme <https://github.com/awesomedata/apd-core/tree/master/core//Transportation/Travel-Tracker-Survey-TTS-for-Chicago.yml>`_]
* |OK_ICON| `U.S. Bureau of Transportation Statistics (BTS) <https://www.bts.gov/browse-statistical-products-and-data>`_
* |OK_ICON| `U.S. Domestic Flights 1990 to 2009 <http://academictorrents.com/details/a2ccf94bbb4af222bf8e69dad60a68a29f310d9a>`_
* |OK_ICON| `U.S. Freight Analysis Framework since 2007 <http://ops.fhwa.dot.gov/freight/freight_analysis/faf/index.htm>`_
* |OK_ICON| `U.S. National Highway Traffic Safety Administration - Fatalities since [...] <ftp://nhtsa.gov/FARS/>`_

eSports
-------

* |OK_ICON| `CS:GO Competitive Matchmaking Data - In this data set we have data about [...] <https://www.kaggle.com/skihikingkevin/csgo-matchmaking-damage>`_
* |OK_ICON| `FIFA-2021 Complete Player Dataset <https://www.kaggle.com/aayushmishra1512/fifa-2021-complete-player-data>`_
* |OK_ICON| `OpenDota data dump <https://blog.opendota.com/2017/03/24/datadump2/>`_

Complementary Collections
-------------------------

* `Data Packaged Core Datasets <https://github.com/datasets/>`_
* `Database of Scientific Code Contributions <https://mozillascience.org/collaborate>`_
* A growing collection of public datasets: `CoolDatasets. <http://cooldatasets.com/>`_
* DataWrangling: `Some Datasets Available on the Web <http://www.datawrangling.com/some-datasets-available-on-the-web>`_
* Inside-r: `Finding Data on the Internet <http://www.inside-r.org/howto/finding-data-internet>`_
* OpenDataMonitor: `An overview of available open data resources in Europe <http://opendatamonitor.eu>`_
* Quora: `Where can I find large datasets open to the public? <http://www.quora.com/Where-can-I-find-large-datasets-open-to-the-public>`_
* RS.io: `100+ Interesting Data Sets for Statistics <http://rs.io/100-interesting-data-sets-for-statistics/>`_
* StaTrek: `Leveraging open data to understand urban lives <http://caesar0301.github.io/posts/2014/10/23/leveraging-open-data-to-understand-urban-lives/>`_
* CV Papers: `CV Datasets on the web <http://www.cvpapers.com/datasets.html/>`_
* CVonline: `Image Databases <http://homepages.inf.ed.ac.uk/rbf/CVonline/Imagedbase.htm/>`_