coha.word_sgns / README.md
license: pddl

By-decade word2vec (SGNS, skip-gram with negative sampling) embeddings trained on the Corpus of Historical American English (COHA; genre-balanced American English, 1830s-2000s).

The .rds files are R numeric matrices with tokens as row names, so each row holds the embedding vector for one token.
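
Since each row is a token's vector, a typical use is ranking tokens by cosine similarity to a query token. The following is a minimal Python sketch of that idea, not code shipped with this repository. It runs on a small hand-written dictionary standing in for a loaded matrix; the tokens, the vector values, and the function names `cosine` and `nearest` are all illustrative assumptions. In R, the matrices themselves can be read with base `readRDS`.

```python
import math

# Toy stand-in for one decade's matrix: token (row name) -> vector (row).
# These values are made up for illustration; they are NOT from the dataset.
toy_embeddings = {
    "king": [0.9, 0.1, 0.0],
    "queen": [0.8, 0.2, 0.1],
    "apple": [0.0, 0.9, 0.4],
}


def cosine(u, v):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def nearest(embeddings, token, k=2):
    """Return the k tokens most similar to `token`, as (token, similarity) pairs."""
    query = embeddings[token]
    scores = [
        (other, cosine(query, vec))
        for other, vec in embeddings.items()
        if other != token
    ]
    # Highest similarity first.
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:k]


if __name__ == "__main__":
    for other, score in nearest(toy_embeddings, "king"):
        print(other, round(score, 3))
```

Running the same ranking against the matrices for different decades is one way to see how a token's neighbors shift over time.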

References:

William L. Hamilton, Jure Leskovec, and Dan Jurafsky. 2016. Diachronic Word Embeddings Reveal Statistical Laws of Semantic Change. ACL 2016. https://nlp.stanford.edu/projects/histwords/