From 234197dffbf9743688280db06195552e312a7f06 Mon Sep 17 00:00:00 2001
From: "Jamin X. Chen"
Date: Sun, 11 Jan 2015 12:29:38 +0800
Subject: [PATCH] Update basic intro of NL category.

---
 README.rst | 34 +++++++++++++++++-----------------
 1 file changed, 17 insertions(+), 17 deletions(-)

diff --git a/README.rst b/README.rst
index bc90f74..f3825f4 100644
--- a/README.rst
+++ b/README.rst
@@ -234,20 +234,20 @@ Music
 Natural Language
 ----------------
 
-* `40 Million Entities in Context `_
-* `ClueWeb09 FACC `_
-* `ClueWeb12 FACC `_
-* `DBpedia `_
-* `Flickr personal taxonomies `_
-* `Google Books Ngrams `_
-* `Google Web 5gram, 2006 (1T) `_
-* `Gutenberg eBooks List `_
-* `Hansards `_
-* `Machine Translation `_
-* `SMS Spam Collection `_
-* `USENET corpus `_
-* `Wikidata `_
-* `WordNet `_
+* `ClueWeb09 FACC - Annotated English-language Web pages from the ClueWeb09 corpora. `_
+* `ClueWeb12 FACC - Annotated English-language Web pages from the ClueWeb12 corpora. `_
+* `DBpedia - Multi-domain ontology describing 4.58M “things” with 583M “facts”. `_
+* `Flickr Personal Taxonomies - Personalized tagging pictures with descriptive labels. `_
+* `Google Books Ngrams (2.2TB) - N-gram corpuses extracted from Google Books. `_
+* `Google Web 5gram (1TB, 2006) - 5-gram corpuses extracted from Web pages. `_
+* `Gutenberg eBooks List - Basic information about each eBook from Project Gutenberg. `_
+* `Hansards - 1.3M aligned text chunks from official records of Canadian Parliament. `_
+* `Machine Translation - The recurring translation task focusing on European languages. `_
+* `SMS Spam Collection - 5,574 real English messages, labled as being ham or spam. `_
+* `USENET corpus - A collection of public USENET postings between Oct 2005 and Jan 2011. `_
+* `Wikidata - Wikipedia databases available in JSON and XML formats. `_
+* `Wikipedia Links data - 40 Million Entities in Context. `_
+* `WordNet - Databases, associated packages and tools. `_
 
 
 Physics
@@ -314,11 +314,11 @@ Social Sciences
 * `Titanic Survival Data Set - Demographic information of Titanic passengers `_
 * `Twitter Graph - Crawled entire Twitter site including tweets, user profiles, relations `_
 * `UCB's Archive of Social Science Data (D-Lab) - Holdings of political, social and health areas `_
-* `UCLA Social Sciences Data Archive - A collection of social science data on the Web, e.g., DHS surveys `_
+* `UCLA Social Sciences Data Archive - A collection of social science data on the Web `_
 * `UNIMI/LAW Social Network Datasets - Social networks like amazon, LiveJournal, dblp and more `_
 * `Universities Worldwide - Links to 9307 Universities in 205 countries `_
 * `UPJOHN for Employment Research - Labor surveys, unemployment spells and more `_
-* `Yahoo Graph and Social Data - Web page hyperlink graph, user-group membership, IM friends etc. `_
+* `Yahoo Graph and Social Data - Web page graph, user-group membership, IM friends etc. `_
 * `Youtube Video Graph (2007,2008) - Video relations, uploaders, views, ratings and more `_
 
 
@@ -355,7 +355,7 @@ Transportation
 * `Transport for London (TFL) - Trip histories and networking statistics `_
 * `Travel Tracker Survey (TTS), Chicago, 1990, 2007-2008 `_
 * `U.S. Bureau of Transportation Statistics (BTS) `_
-* `U.S. Freight Analysis Framework - Freight movement among states since 2007 `_
+* `**U.S. Freight Analysis Framework** - Freight movement among states since 2007 `_
 
 
 Complementary Collections