These lexicons are created by manual annotation. The lexicons with real-valued scores are created using Best-Worst Scaling, which produces fine-grained yet highly reliable annotation values.
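Best-Worst Scaling annotations are commonly turned into real-valued scores by simple counting: the fraction of times a term is picked as best minus the fraction of times it is picked as worst. The sketch below illustrates only that counting step. The function name `bws_scores` and the toy annotations are hypothetical, and it is an assumption that the lexicons listed here were scored in exactly this way.

```python
from collections import Counter


def bws_scores(annotations):
    """Convert Best-Worst Scaling judgments to scores in [-1, 1].

    annotations: iterable of (items, best, worst) tuples, where `items`
    is the tuple of terms shown to the annotator and `best` / `worst`
    are the terms chosen as most and least positive.
    """
    best, worst, seen = Counter(), Counter(), Counter()
    for items, chosen_best, chosen_worst in annotations:
        seen.update(items)          # how often each term was shown
        best[chosen_best] += 1      # how often it was picked as best
        worst[chosen_worst] += 1    # how often it was picked as worst
    # Score = proportion of "best" picks minus proportion of "worst" picks.
    return {term: (best[term] - worst[term]) / seen[term] for term in seen}


# Toy data (hypothetical): two annotators judge the same 4-tuple.
toy_annotations = [
    (("good", "fine", "bad", "awful"), "good", "awful"),
    (("good", "fine", "bad", "awful"), "good", "bad"),
]
toy_scores = bws_scores(toy_annotations)
```

A term always picked as best gets 1, a term always picked as worst gets -1, and a term never picked either way gets 0.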
1a. NRC Word-Emotion Association Lexicon (also called NRC Emotion Lexicon or EmoLex)
Version: 0.92 (2010)
# of Terms: 14,182 unigrams (words); ~25,000 senses
Categories: sentiments; emotions
Association Scores: senses are rated as not associated, weakly, moderately, or strongly associated
Method of Creation: Manual, by crowdsourcing
Domain: General
Notes: A README, an interactive visualization, and a homepage are available for the lexicon. It is also available in over 40 other languages. The sense-level annotations provided by individual annotators for the eight emotions can also be obtained.
1b. NRC Emotion Intensity Lexicon (aka Affect Intensity Lexicon), created using Best-Worst Scaling.
2. NRC Valence, Arousal, Dominance Lexicon, created using Best-Worst Scaling
Version: 1 (2018)
# of Terms: ~20,000 terms
Categories: valence, arousal, dominance
Method of Creation: Manual, by crowdsourcing
Domain: General
Manually Created Sentiment Composition Lexicons

These lexicons include sentiment scores for two- and three-word expressions as well as scores for their constituent words.

1. Created using Best-Worst Scaling (aka MaxDiff)
Version: 1.0 (Feb. 2016)
# of Terms: ~3200 terms
Categories: sentiments: negative, positive
Association Scores: Real-valued score from -1 (most negative) to 1 (most positive)
Method of Creation: Manual, by crowdsourcing
Domain: General

2. Created using Best-Worst Scaling (aka MaxDiff)
Version: 1.0 (Feb. 2015)
# of Terms: ~1500 terms
Categories: sentiments: negative, positive
Association Scores: Real-valued score from -1 (most negative) to 1 (most positive)
Method of Creation: Manual, by crowdsourcing
Domain: Twitter

3. Created using Best-Worst Scaling (aka MaxDiff)
Version: 1.0 (Feb. 2016)
# of Terms: ~1200 terms
Categories: sentiments: negative, positive
Association Scores: Real-valued score from -1 (most negative) to 1 (most positive)
Method of Creation: Manual, by crowdsourcing
Domain: Twitter

Large Manually Created Word-Colour Association Lexicon

1. NRC Word-Colour Association Lexicon
Version: 0.92 (2011)
# of Terms: ~14,000 words; ~25,000 senses
Categories: colours: black, blue, brown, green, grey, orange, purple, pink, red, white, yellow
Association Scores: words: 0 (not associated) or 1 (associated); senses: not, weakly, moderately, or strongly associated
Method of Creation: Manual, by crowdsourcing on Mechanical Turk
Domain: General
Automatically Created Lexicons

These lexicons are automatically extracted from large amounts of text using co-occurrence information. For example, the Hashtag Emotion Lexicon is generated from tweets, and the score for a word–emotion pair quantifies the word's tendency to co-occur with the emotion-word hashtag. These lexicons are usually much larger than manually created ones and have higher coverage, especially of terms that are frequent in the corpus the lexicon is extracted from. However, the emotion scores can be less accurate than those in the manually created lexicons above.

Large Automatically Generated Word-Emotion Association Lexicon

1. NRC Hashtag Emotion Lexicon
Version: 0.2 (2013)
# of Terms: 16,862 unigrams (words)
Categories: emotions: anger, anticipation, disgust, fear, joy, sadness, surprise, trust
Association Scores: Real-valued score from 0 (not associated) to ∞ (maximally associated)
Method of Creation: Automatic, from tweets with emotion-word hashtags
Domain: Twitter
Corpus: The Hashtag Emotion Corpus (aka Twitter Emotion Corpus, or TEC) was used to create the lexicon.

Large Automatically Generated Word-Sentiment Association Lexicons

a. NRC Hashtag Sentiment Lexicon
Version: 1.0 (2013)
# of Terms: 54,129 unigrams; 316,531 bigrams; 308,808 pairs
Categories: sentiments: negative, positive
Association Scores: Real-valued score from -∞ (most negative) to ∞ (most positive)
Method of Creation: Automatic, from tweets with sentiment-word hashtags
Domain: Twitter

b. NRC Hashtag Affirmative Context Sentiment Lexicon and NRC Hashtag Negated Context Sentiment Lexicon
Version: 1.0 (2014)
# of Terms: Affirmative contexts: 36,357 unigrams; 159,479 bigrams. Negated contexts: 7,592 unigrams; 23,875 bigrams.
Categories: sentiments: negative, positive
Association Scores: Real-valued score from -∞ (most negative) to ∞ (most positive)
Method of Creation: Automatic, from tweets with sentiment-word hashtags. Separate entries for affirmative and negated contexts.
Domain: Twitter

c. Emoticon Lexicon, aka Sentiment140 Lexicon (note that this is a sentiment lexicon drawn from emoticons, not an emotion lexicon)
Version: 1.0 (2014)
# of Terms: 62,468 unigrams; 677,698 bigrams; 480,010 pairs
Categories: sentiments: negative, positive
Association Scores: Real-valued score from -∞ (most negative) to ∞ (most positive)
Method of Creation: Automatic, from tweets with emoticons
Domain: Twitter

d. Sentiment140 Affirmative Context Lexicon and Sentiment140 Negated Context Lexicon
Version: 1.0 (2014)
# of Terms: Affirmative contexts: 45,255 unigrams; 240,076 bigrams. Negated contexts: 9,891 unigrams; 34,093 bigrams.
Categories: sentiments: negative, positive
Association Scores: Real-valued score from -∞ (most negative) to ∞ (most positive)
Method of Creation: Automatic, from tweets with emoticons. Separate entries for affirmative and negated contexts.
Domain: Twitter

2. Yelp and Amazon Sentiment Lexicons

a. Yelp Restaurant Sentiment Lexicon
Version: 1.0 (2014)
# of Terms: 39,274 entries for unigrams (includes affirmative and negated context entries); 276,651 entries for bigrams
Categories: sentiments: negative, positive
Association Scores: Real-valued score from -∞ (most negative) to ∞ (most positive)
Method of Creation: Automatic, from customer reviews on Yelp.com
Domain: Restaurant
Notes: The Yelp Word–Aspect Association Lexicons are also made available.

b. Amazon Laptop Sentiment Lexicon
Version: 1.0 (2014)
# of Terms: 26,577 entries for unigrams (includes affirmative and negated context entries); 155,167 entries for bigrams
Categories: sentiments: negative, positive
Association Scores: Real-valued score from -∞ (most negative) to ∞ (most positive)
Method of Creation: Automatic, from customer reviews on Amazon.com
Domain: Laptop
3. Macquarie Semantic Orientation Lexicon
Version: 0.1 (2009)
# of Terms: 76,400 terms
Categories: sentiments: negative, positive
Association Scores: Binary distinction: negative or positive
Method of Creation: Automatic, using the structure of a thesaurus and affixes
Domain: General
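The co-occurrence scoring described for the automatically created lexicons can be illustrated with a small sketch. It assumes, as one plausible quantification, that a word's sentiment score is PMI(word, positive) minus PMI(word, negative), computed over tweets whose label comes from a hashtag or emoticon. The function name `pmi_sentiment_scores`, the `smoothing` parameter, and the toy tweets are hypothetical and are not taken from the lexicon documentation.

```python
import math
from collections import Counter


def pmi_sentiment_scores(labelled_tweets, smoothing=1.0):
    """Score words by PMI(word, positive) - PMI(word, negative).

    labelled_tweets: iterable of (tokens, label) pairs, where label is
    "positive" or "negative" (e.g. derived from a hashtag or emoticon).
    smoothing: add-k constant (an assumption of this sketch) that keeps
    scores finite for words seen with only one label.
    """
    word_counts = {"positive": Counter(), "negative": Counter()}
    totals = Counter()
    for tokens, label in labelled_tweets:
        word_counts[label].update(tokens)
        totals[label] += len(tokens)
    if totals["positive"] == 0 or totals["negative"] == 0:
        raise ValueError("need tokens from both positive and negative tweets")

    vocab = set(word_counts["positive"]) | set(word_counts["negative"])
    scores = {}
    for word in vocab:
        pos = word_counts["positive"][word] + smoothing
        neg = word_counts["negative"][word] + smoothing
        # The word's marginal probability cancels in the PMI difference,
        # leaving a log ratio of label-conditional frequencies.
        scores[word] = math.log2(
            (pos * totals["negative"]) / (neg * totals["positive"])
        )
    return scores


# Toy data (hypothetical): tokens plus the label implied by a hashtag.
toy_tweets = [
    (["great", "food"], "positive"),
    (["great", "day"], "positive"),
    (["awful", "food"], "negative"),
    (["awful", "day"], "negative"),
]
toy_pmi = pmi_sentiment_scores(toy_tweets)
```

Words that occur mostly in positive tweets get scores above 0, words that occur mostly in negative tweets get scores below 0, and words spread evenly get scores near 0. With `smoothing=0`, a word seen with only one label would have an unbounded score, which matches the -∞ to ∞ range listed for these lexicons.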