From 0d0117a88a7f8ba4d8053b4305e834dea25c2ad6 Mon Sep 17 00:00:00 2001
From: Xiaming Chen
Date: Sun, 18 Dec 2016 16:08:36 +0800
Subject: [PATCH] Update new image sets and three NLP sets

Images: Chars74K dataset and MNIST, NLP: Google MC-AFP, MS-MACRO, and MDST
---
 README.rst | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/README.rst b/README.rst
index 7146d69..e971eba 100755
--- a/README.rst
+++ b/README.rst
@@ -284,11 +284,13 @@ Image Processing
 * `2GB of Photos of Cats `_ or `Archive version `_
 * `Affective Image Classification `_
 * `Animals with attributes `_
+* `Chars74K dataset, Character Recognition in Natural Images (both English and Kannada are available) `_
 * `Face Recognition Benchmark `_
 * `ImageNet (in WordNet hierarchy) `_
 * `Indoor Scene Recognition `_
 * `International Affective Picture System, UFL `_
 * `Massive Visual Memory Stimuli, MIT `_
+* `MNIST database of handwritten digits, near 1 million examples `_
 * `Several Shape-from-Silhouette Datasets `_
 * `Stanford Dogs Dataset `_
 * `SUN database, MIT `_
@@ -343,11 +345,14 @@ Natural Language
 * `Flickr Personal Taxonomies `_
 * `Freebase.com of people, places, and things `_
 * `Google Books Ngrams (2.2TB) `_
+* `Google MC-AFP, generated based on the public available Gigaword dataset using Paragraph Vectors `_
 * `Google Web 5gram (1TB, 2006) `_
 * `Gutenberg eBooks List `_
 * `Hansards text chunks of Canadian Parliament `_
 * `Machine Comprehension Test (MCTest) of text from Microsoft Research `_
 * `Machine Translation of European languages `_
+* `Multi-Domain Sentiment Dataset (version 2.0) `_
+* `Microsoft MAchine Reading COmprehension Dataset (or MS MARCO) `_
 * `Personae Corpus `_
 * `SaudiNewsNet Collection of Saudi Newspaper Articles (Arabic, 30K articles) `_
 * `SMS Spam Collection in English `_