NLP Guide

2017

This is an open guide to the noteworthy happenings in natural language processing in 2017.

Want to add or correct something? Send a pull request.

Research

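Advances in generalisable deep learning
Character-level models, no longer just for language identification and transliteration
Unsupervised approaches, e.g. cross-lingual word embeddings
Neural machine translation (NMT) beats statistical machine translation (SMT)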

ACL 2017 best papers

EFF AI Metrics - Written Language, Spoken Language
https://www.eff.org/ai/metrics

AI Index - Natural Language Understanding
https://aiindex.org/2017-report.pdf

“Poincaré Embeddings for Learning Hierarchical Representations” - Facebook
https://arxiv.org/pdf/1705.08039.pdf

“Attention Is All You Need” - Google
https://arxiv.org/abs/1706.03762

“One Model To Learn Them All” - Google
https://arxiv.org/abs/1706.05137

SQuAD leaderboard
https://rajpurkar.github.io/SQuAD-explorer/

“Interpreting neurons in an LSTM network” - YerevaNN
http://yerevann.github.io/2017/06/27/interpreting-neurons-in-an-LSTM-network/

Quora Question Pairs
https://www.kaggle.com/c/quora-question-pairs
Note: the competition had a data leakage problem.

An Adversarial Review of “Adversarial Generation of Natural Language” - Yoav Goldberg
https://medium.com/@yoav.goldberg/an-adversarial-review-of-adversarial-generation-of-natural-language-409ac3378bd7

Libs and APIs

pytorch/text - PyTorch

MXNet Sockeye - Amazon
MXNet Gluon - Amazon

InferSent - Facebook
ParlAI - Facebook
StarSpace - Facebook

fastText language identification - Facebook

AllenNLP - Allen Institute

ABBYY Real-Time Recognition SDK

spaCy 2.0 - new features and new languages

Natural Language API - Google - new languages and new services

Datasets

fastText pre-trained word vectors on Wikipedia for 294 languages

CLEVR
http://cs.stanford.edu/people/jcjohns/clevr/
https://arxiv.org/pdf/1612.06890.pdf
https://github.com/facebookresearch/clevr-iep
https://github.com/facebookresearch/clevr-dataset-gen

Events

ACL

EMNLP

EACL

NIST TAC

PyData

RAAIS

NIPS

Education

http://cs224n.stanford.edu / http://cs224d.stanford.edu
https://www.youtube.com/watch?v=OQQ-W_63UgQ&list=PL3FW7Lu3i5Jsnh1rnUwq_TcylNr7EkRe6

https://github.com/oxford-cs-deepnlp-2017/lectures

http://thestraightdope.mxnet.io/chapter05_recurrent-neural-networks/simple-rnn.html

http://pytorch.org/tutorials/beginner/deep_learning_nlp_tutorial.html

http://pytorch.org/tutorials/intermediate/seq2seq_translation_tutorial.html

https://blog.keras.io/a-ten-minute-introduction-to-sequence-to-sequence-learning-in-keras.html

Products

Google Translate with NMT - Google (research.googleblog.com, research.google.com, wiki)

Alice (Алиса) - Yandex

DeepL Translator - NMT, reportedly better than Google Translate for 7 languages - https://www.deepl.com/translator

AWS Translate API preview - Amazon - https://aws.amazon.com/translate/

Read more

https://nlp.stanford.edu/read/
http://mitp.nautil.us/article/170/last-words-computational-linguistics-and-deep-learning
http://nathan.ai
https://yerevann.github.io/
http://approximatelycorrect.com/category/natural-language-processing/
http://approximatelycorrect.com/2017/09/26/a-random-walk-through-emnlp-2017/
http://newsletter.ruder.io/
http://ruder.io/highlights-emnlp-2017/index.html
http://ruder.io/word-embeddings-2017/
https://explosion.ai/blog/
https://explosion.ai/blog/quora-deep-text-pair-classification
https://www.producthunt.com/@bittlingmayer/collections
https://www.reddit.com/r/LanguageTechnology/
https://plus.google.com/communities/112547995826249627629