Determining Sentiment

You determine sentiment by using the SentimentAnalysis() function on text.

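The SentimentAnalysis() function first extracts the attributes (typically nouns) from the sentence, and then applies a sentiment score to each attribute. As the examples in this section show, a score is one of the following: 1 (positive sentiment), -1 (negative sentiment), or 0 (neutral sentiment).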

This provides a more granular analysis than just determining the sentiment for the sentence as a whole. Consider the following quote from Abraham Lincoln: "Force is all-conquering, but its victories are short-lived." If you were to score the sentiment of the sentence as a whole by averaging the sentiment of its parts, the result would be neutral.

=> select avg(t1.sentiment_score) as 'Average Sentiment' from (
        select sentimentAnalysis('Force is all-conquering, but its victories are short-lived.') 
        over (PARTITION BEST)
        ) as t1;
 
 Average Sentiment
-------------------
                 0
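
The arithmetic behind that neutral average can be reproduced outside the database. The following Python sketch is only an illustration and is not part of Vertica Pulse: it hard-codes the two attribute scores that SentimentAnalysis() returns for this sentence (1 for force, -1 for victories), and the function name average_score is hypothetical.

```python
def average_score(scores):
    """Return the arithmetic mean of a list of sentiment scores."""
    return sum(scores) / len(scores)

# Scores copied from the documented output for the Lincoln quote;
# this dictionary is a stand-in for the function's result set.
attribute_scores = {"force": 1, "victories": -1}

average_sentiment = average_score(list(attribute_scores.values()))
print(average_sentiment)  # the positive and negative scores cancel out
```

Because the one positive score and the one negative score cancel, the average hides the fact that the sentence expresses two opposing sentiments.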

Scoring the individual attributes of the sentence gives a more precise analysis than assigning a single score to the entire sentence. For example:

=> select sentimentAnalysis('Force is all-conquering, but its victories are short-lived.') over (PARTITION BEST);

 sentence | attribute | sentiment_score
----------+-----------+-----------------
        1 | force     |               1
        1 | victories |              -1

"Force" is scored with positive sentiment because it is "all-conquering". "Victories" is scored with negative sentiment because it is "short-lived".

Note: Vertica Pulse does not recognize personal pronouns (I, you, we, he, she, it, etc.) as attributes.

SentimentAnalysis() also extracts the sentiment from multiple sentences and returns the number of the sentence in which each attribute is found:

=> SELECT SentimentAnalysis('Force is all-conquering, but its victories are short-lived. Every good boy deserves fudge.') OVER(PARTITION BEST);
 sentence | attribute | sentiment_score
----------+-----------+-----------------
        1 | force     |               1
        1 | victories |              -1
        2 | boy       |               1
        2 | fudge     |               1
(4 rows)

"Boy" is scored with positive sentiment because he is good. "Fudge" is scored with positive sentiment because it is something that is deserved.
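
The sentence column makes it possible to summarize sentiment one sentence at a time. The Python sketch below is an illustration under stated assumptions, not a Pulse feature: the rows are copied by hand from the output above, and the function name average_by_sentence is hypothetical.

```python
from collections import defaultdict

# Rows copied from the documented output: (sentence, attribute, sentiment_score).
rows = [
    (1, "force", 1),
    (1, "victories", -1),
    (2, "boy", 1),
    (2, "fudge", 1),
]

def average_by_sentence(rows):
    """Group scores by sentence number and average each group."""
    scores = defaultdict(list)
    for sentence, _attribute, score in rows:
        scores[sentence].append(score)
    return {
        sentence: sum(values) / len(values)
        for sentence, values in scores.items()
    }

per_sentence = average_by_sentence(rows)
print(per_sentence)
```

With these rows, the first sentence averages to neutral and the second to positive, which a single average over all four attributes would blur together.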

Note: The sentence detector considers a period to mark the end of a sentence. Some abbreviations that use a period, such as Dr. or Mr., cause the sentence detector to end the sentence at the abbreviation.
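
To see why abbreviations cause trouble, consider a deliberately naive splitter that treats every period as a sentence boundary. This Python sketch is an assumption made for illustration only. It is not the sentence detector that Pulse uses, and the function name split_on_periods and the sample text are hypothetical.

```python
def split_on_periods(text):
    """Split text at every period, as a naive sentence detector would."""
    pieces = [piece.strip() for piece in text.split(".")]
    # Drop the empty piece left behind by the final period.
    return [piece for piece in pieces if piece]

sample = "Dr. Smith is a good doctor. The clinic is busy."
sentences = split_on_periods(sample)

for number, sentence in enumerate(sentences, start=1):
    print(number, sentence)
```

The sample text contains two sentences, but the splitter returns three pieces because the period in "Dr." ends the first piece early.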

SentimentAnalysis() also identifies attributes with neutral sentiment (a sentiment score of zero). For example:

=> SELECT SentimentAnalysis('Roses are red. Violets are blue.') OVER(PARTITION BEST);
 sentence | attribute | sentiment_score
----------+-----------+-----------------
        1 | roses     |               0
        2 | violets   |               0
(2 rows)

Both "roses" and "violets" receive neutral sentiment because being red or blue is not considered positive or negative in this context.

See the Pulse Cookbook for more examples of determining sentiment.