mirror of
https://github.com/awesomedata/awesome-public-datasets.git
synced 2024-04-18 07:30:58 +08:00
commit
4cd159ba28
|
@ -350,6 +350,7 @@ Museums
|
|||
Natural Language
|
||||
----------------
|
||||
|
||||
* `POS/NER/Chunk annotated data <https://github.com/aritter/twitter_nlp/tree/master/data/annotated>`_
|
||||
* `Automatic Keyphrase Extraction <https://github.com/snkim/AutomaticKeyphraseExtraction/>`_
|
||||
* `Blogger Corpus <http://u.cs.biu.ac.il/~koppel/BlogCorpus.htm>`_
|
||||
* `CLiPS Stylometry Investigation Corpus <http://www.clips.uantwerpen.be/datasets/csi-corpus>`_
|
||||
|
@ -365,6 +366,8 @@ Natural Language
|
|||
* `Hansards text chunks of Canadian Parliament <http://www.isi.edu/natural-language/download/hansard/>`_
|
||||
* `Machine Comprehension Test (MCTest) of text from Microsoft Research <http://research.microsoft.com/en-us/um/redmond/projects/mctest/index.html>`_
|
||||
* `Machine Translation of European languages <http://statmt.org/wmt11/translation-task.html#download>`_
|
||||
* `Making Sense of Microposts 2013 - Concept Extraction <http://oak.dcs.shef.ac.uk/msm2013/challenge.html>`_
|
||||
* `Making Sense of Microposts 2016 - Named Entity rEcognition and Linking <http://microposts2016.seas.upenn.edu/challenge.html>`_
|
||||
* `Microsoft MAchine Reading COmprehension Dataset (or MS MARCO) <http://www.msmarco.org/dataset.aspx>`_
|
||||
* `Multi-Domain Sentiment Dataset (version 2.0) <http://www.cs.jhu.edu/~mdredze/datasets/sentiment/>`_
|
||||
* `Open Multilingual Wordnet <http://compling.hss.ntu.edu.sg/omw/>`_
|
||||
|
|
Loading…
Reference in New Issue
Block a user