diff --git a/README.rst b/README.rst
index 44379d7..7e403ee 100644
--- a/README.rst
+++ b/README.rst
@@ -5,28 +5,30 @@ Awesome Public Datasets
     :alt: Awesome
     :target: https://github.com/sindresorhus/awesome

+This is a list of `topic-centric public data sources `_
+in high quality. They are collected and tidied from blogs, answers, and user responses.
+Most of the data sets listed below are free, however, some are not.
+This project was hatched at `OMNILab `_ during my Ph.D. career, which is now part of `BaiYuLan Open AI community `_.
+Other amazingly awesome lists can be found in `sindresorhus's awesome `_ list.
+
+Special thanks to
+
 .. image:: https://raw.githubusercontent.com/awesomedata/apd-core/master/logo/baiyulan.PNG
     :alt: BaiYuLanAI
     :target: https://github.com/Bai-Yu-Lan

+**NOTICE**: This repo is automatically generated by `apd-core `_.
+Please **DO NOT** modify this file directly. We have provided a new way to `contribute to
+this repo `_.
+`Join `_
+the `slack community `_ for an instant touch of HQ data updates.
+
 .. |OK_ICON| image:: https://raw.githubusercontent.com/awesomedata/apd-core/master/deploy/ok-24.png
 .. |FIXME_ICON| image:: https://raw.githubusercontent.com/awesomedata/apd-core/master/deploy/fixme-24.png
-
-**NOTICE**: This repo is automatically generated by `apd-core `_.
-Please **DO NOT** modify this file directly. We have provided
-`a new way `_
-to contribute to Awesome Public Datasets. `Join `_ the `slack community `_ for more communication.
-

 * |OK_ICON| I am well.
 * |FIXME_ICON| Please fix me.

-`This list of a topic-centric public data sources `_
-in high quality. They are collected and tidied from blogs, answers, and user responses.
-Most of the data sets listed below are free, however, some are not.
-Other amazingly awesome lists can be found in `sindresorhus's awesome `_ list.
-
-

 .. contents:: **Table of Contents**

@@ -71,7 +73,7 @@ Biology

 * |OK_ICON| `Complete Genomics Public Data - A diverse data set of whole human genomes are freely [...] `_ [`Meta `_]

-* |FIXME_ICON| `CytoImageNet - A large-scale dataset of microscopy images. Contains 890,737 total grayscale [...] `_ [`Meta `_]
+* |OK_ICON| `CytoImageNet - A large-scale dataset of microscopy images. Contains 890,737 total grayscale [...] `_ [`Meta `_]

 * |OK_ICON| `EBI ArrayExpress - ArrayExpress Archive of Functional Genomics Data stores data from high- [...] `_ [`Meta `_]

@@ -251,7 +253,7 @@ ComplexNetworks
 ComputerNetworks
 ----------------

-* |FIXME_ICON| `3.5B Web Pages from CommonCrawl 2012 `_ [`Meta `_]
+* |OK_ICON| `3.5B Web Pages from CommonCrawl 2012 `_ [`Meta `_]

 * |OK_ICON| `53.5B Web clicks of 100K users in Indiana Univ. `_ [`Meta `_]

@@ -307,13 +309,13 @@ DataChallenges

 * |OK_ICON| `ICWSM Data Challenge (since 2009) `_ [`Meta `_]

-* |FIXME_ICON| `KDD Cup by Tencent 2012 `_ [`Meta `_]
+* |OK_ICON| `KDD Cup by Tencent 2012 `_ [`Meta `_]

-* |FIXME_ICON| `Kaggle Competition Data `_ [`Meta `_]
+* |OK_ICON| `Kaggle Competition Data `_ [`Meta `_]

 * |OK_ICON| `Localytics Data Visualization Challenge `_ [`Meta `_]

-* |FIXME_ICON| `Netflix Prize `_ [`Meta `_]
+* |OK_ICON| `Netflix Prize `_ [`Meta `_]

 * |OK_ICON| `Space Apps Challenge `_ [`Meta `_]

@@ -354,7 +356,7 @@ EarthScience

 * |OK_ICON| `USGS Earthquake Archives `_ [`Meta `_]

-* |FIXME_ICON| `Wellhead Protection Area (protection zone) prediction using breakthrough curves - This [...] `_ [`Meta `_]
+* |OK_ICON| `Wellhead Protection Area (protection zone) prediction using breakthrough curves - This [...] `_ [`Meta `_]

 Economics
 ---------
@@ -488,7 +490,7 @@ Energy
 Entertainment
 -------------

-* |FIXME_ICON| `Top Streamers on Twitch - This contains data of Top 1000 Streamers from past year. `_ [`Meta `_]
+* |OK_ICON| `Top Streamers on Twitch - This contains data of Top 1000 Streamers from past year. `_ [`Meta `_]

 Finance
 -------
@@ -499,7 +501,7 @@ Finance

 * |FIXME_ICON| `CBOE Futures Exchange `_ [`Meta `_]

-* |FIXME_ICON| `Complete FAANG Stock data - This data set contains all the stock data of FAANG companies from [...] `_ [`Meta `_]
+* |OK_ICON| `Complete FAANG Stock data - This data set contains all the stock data of FAANG companies from [...] `_ [`Meta `_]

 * |OK_ICON| `Google Finance `_ [`Meta `_]

@@ -909,7 +911,7 @@ ImageProcessing

 * |OK_ICON| `10k US Adult Faces Database `_ [`Meta `_]

-* |FIXME_ICON| `2GB of Photos of Cats `_ [`Meta `_]
+* |OK_ICON| `2GB of Photos of Cats `_ [`Meta `_]

 * |OK_ICON| `Audience Unfiltered faces for gender and age classification `_ [`Meta `_]

@@ -959,7 +961,7 @@ ImageProcessing

 * |OK_ICON| `Multi-View Region of Interest Prediction Dataset for Autonomous Driving - Contains 16 driving [...] `_ [`Meta `_]

-* |FIXME_ICON| `Massive Visual Memory Stimuli, MIT `_ [`Meta `_]
+* |OK_ICON| `Massive Visual Memory Stimuli, MIT `_ [`Meta `_]

 * |OK_ICON| `Newspaper Navigator - This dataset consists of extracted visual content for 16,358,041 [...] `_ [`Meta `_]

@@ -1397,7 +1399,7 @@ SocialNetworks

 * |OK_ICON| `China Biographical Database - The China Biographical Database is a freely accessible [...] `_ [`Meta `_]

-* |FIXME_ICON| `Clubhouse Dataset `_ [`Meta `_]
+* |OK_ICON| `Clubhouse Dataset `_ [`Meta `_]

 * |OK_ICON| `A Twitter Dataset of 40+ million tweets related to COVID-19 - Due to the relevance of the [...] `_ [`Meta `_]

@@ -1415,7 +1417,7 @@ SocialNetworks

 * |OK_ICON| `GitHub Collaboration Archive `_ [`Meta `_]

-* |OK_ICON| `Google Scholar citation relations `_ [`Meta `_]
+* |FIXME_ICON| `Google Scholar citation relations `_ [`Meta `_]

 * |OK_ICON| `High-Resolution Contact Networks from Wearable Sensors `_ [`Meta `_]

@@ -1435,7 +1437,7 @@ SocialNetworks

 * |OK_ICON| `The Reddit COVID dataset - This dataset attempts to capture the full extent of COVID-19 [...] `_ [`Meta `_]

-* |FIXME_ICON| `Twitch Top Streamer's Data `_ [`Meta `_]
+* |OK_ICON| `Twitch Top Streamer's Data `_ [`Meta `_]

 * |OK_ICON| `Twitter Data for Online Reputation Management `_ [`Meta `_]

@@ -1530,7 +1532,7 @@ SocialSciences

 * |FIXME_ICON| `Texas Inmates Executed Since 1984 `_ [`Meta `_]

-* |FIXME_ICON| `Titanic Survival Data Set `_ [`Meta `_]
+* |OK_ICON| `Titanic Survival Data Set `_ [`Meta `_]

 * |OK_ICON| `UCB's Archive of Social Science Data (D-Lab) `_ [`Meta `_]

@@ -1677,9 +1679,9 @@ Transportation
 eSports
 -------

-* |FIXME_ICON| `CS:GO Competitive Matchmaking Data - In this data set we have data about the CSGO matchmaking [...] `_ [`Meta `_]
+* |OK_ICON| `CS:GO Competitive Matchmaking Data - In this data set we have data about the CSGO matchmaking [...] `_ [`Meta `_]

-* |FIXME_ICON| `FIFA-2021 Complete Player Dataset `_ [`Meta `_]
+* |OK_ICON| `FIFA-2021 Complete Player Dataset `_ [`Meta `_]

 * |OK_ICON| `OpenDota data dump `_ [`Meta `_]