SentimentAnalysis
Provides a sentiment score for each attribute (noun) in a given body of text. Positive sentiment receives a positive integer score and negative sentiment receives a negative integer score. A score of 0 indicates that the sentiment for the attribute is neutral.
This function must be used with the OVER() clause. Use OVER(PARTITION BEST) for the best performance if the query does not require specific columns in the OVER() clause. Any valid PARTITION BY clause is acceptable. However, only a PARTITION BY clause that matches the segmentation clause of the table's projection provides optimum performance. You can improve performance by segmenting on the columns in the PARTITION BY clause.
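As a rough illustration of the segmentation advice, the following sketch assumes a hypothetical table named tweets with a user_id column. The table, its columns, and the choice of segmentation column are assumptions made for illustration only. Because the table is segmented on user_id, a PARTITION BY user_id clause matches the segmentation of its projection.

```sql
-- Hypothetical table; names and column sizes are illustrative only.
CREATE TABLE tweets (
    id      INT,
    user_id INT,
    text    VARCHAR(500)
)
SEGMENTED BY HASH(user_id) ALL NODES;

-- Partition on the same column that the projection is segmented on.
SELECT user_id, SentimentAnalysis(text) OVER(PARTITION BY user_id)
FROM tweets;

-- When the query needs no specific partitioning, let Vertica choose.
SELECT SentimentAnalysis(text) OVER(PARTITION BEST)
FROM tweets;
```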
Syntax
SentimentAnalysis(text [, 'language'] [ USING PARAMETERS [ whitelistonly = boolean ] [, filterlinks = boolean ] [, filterusermentions = boolean ] [, filterhashtags = boolean ] [, filterpunctuation = boolean ] [, filterretweets = boolean ] [, relatedwords = boolean ] [, adjustcasing = boolean ] [, language = string ] [, label='label'] [, granularity='ASD'] [, actionPattern = boolean ] ])
Note: language can be specified as an argument and/or as a parameter. When specified as both, the argument value supersedes the parameter value.
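To make the precedence rule concrete, here is a minimal sketch that supplies the language both ways. The combination of values is chosen only for illustration. According to the note above, the argument value ('spanish') should take effect rather than the parameter value ('english').

```sql
-- Language given as both an argument and a parameter.
-- Per the note above, the argument ('spanish') supersedes
-- the parameter ('english').
SELECT SentimentAnalysis('El zorro rapido brinco sobre el perro flojo.', 'spanish'
       USING PARAMETERS language='english')
       OVER(PARTITION BEST);
```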
Parameters
Argument | Description |
---|---|
text | The text to analyze. Limited to 65,000 bytes. |
whitelistonly | Optional. Default false. When set to true, only attributes defined in the whitelist user-dictionary are scored. Use this setting to limit your analysis to the objects of action patterns. |
filterlinks | Optional. Default false. When set to true, links are not included as attributes. |
filterusermentions | Optional. Default false. When set to true, Twitter user mentions (@username) are not included as attributes. |
filterhashtags | Optional. Default false. When set to true, Twitter hashtags (#hashtag) are not included as attributes. |
filterpunctuation | Optional. Default true. Filters any punctuation, other than @ and #, that occurs at the beginning of an attribute. |
filterretweets | Optional. Default false. When set to true, filters out the characters "RT" from re-tweets in attributes. |
relatedwords | Optional. Default false. When set to true, provides up to three words from the sentence that were used to help determine the sentiment of the attribute. |
adjustcasing | Optional. Default false. When set to true, all letters in the text are converted to uppercase before sentence detection. After performing sentence detection, Vertica converts all letters to lowercase. This option can help in cases where the original data is all in lowercase letters and Pulse is incorrectly identifying sentence boundaries. |
language | Optional. The language of the text to analyze, for example 'english' or 'spanish'. |
label | Optional. The label of the dictionaries that you want to use for sentiment analysis. If you do not include a label, Pulse uses the default dictionaries. |
granularity | Optional. The level or levels of sentiment analysis that you want to perform. You can specify any granularity level or combination of levels, for example 'ASD'. If you do not specify a granularity level, Pulse performs an attribute-level analysis. |
actionPattern | Optional. Default false. When set to true, checks for action patterns in the analyzed content. |
Examples
These examples show various ways you can use Pulse to detect user sentiment.
Query for sentiment in the following sentence.
SELECT SentimentAnalysis('The quick brown fox jumped over the lazy dog.') OVER(PARTITION BEST);
 sentence | attribute | sentiment_score
----------+-----------+-----------------
        1 | fox       |               1
        1 | dog       |              -1
(2 rows)
Query to identify the words that triggered the sentiment score.
SELECT SentimentAnalysis('The quick brown fox jumped over the lazy dog.' USING PARAMETERS relatedwords=true) OVER(PARTITION BEST);
 sentence | attribute | sentiment_score | related_word_1 | related_word_2 | related_word_3
----------+-----------+-----------------+----------------+----------------+----------------
        1 | fox       |               1 | quick          | lazy           |
        1 | dog       |              -1 | lazy           |                |
(2 rows)
Query with the language specified as an argument.
SELECT SentimentAnalysis('The quick brown fox jumped over the lazy dog.', 'english') OVER(PARTITION BEST);
 sentence | attribute | sentiment_score
----------+-----------+-----------------
        1 | fox       |               1
        1 | dog       |              -1
(2 rows)
Query with the language specified as a parameter.
SELECT SentimentAnalysis('The quick brown fox jumped over the lazy dog.' USING PARAMETERS language='english') OVER(PARTITION BEST);
 sentence | attribute | sentiment_score
----------+-----------+-----------------
        1 | fox       |               1
        1 | dog       |              -1
(2 rows)
Query for sentiment in Spanish text, with the language specified as an argument.
SELECT SentimentAnalysis('El zorro rapido brinco sobre el perro flojo.', 'spanish') OVER(PARTITION BEST);
 sentence | attribute | sentiment_score
----------+-----------+-----------------
        1 | zorro     |               1
        1 | perro     |              -1
(2 rows)
Query for sentiment in Spanish text, with the language specified as a parameter.
SELECT SentimentAnalysis('El zorro rapido brinco sobre el perro flojo.' USING PARAMETERS language='spanish') OVER(PARTITION BEST);
 sentence | attribute | sentiment_score
----------+-----------+-----------------
        1 | zorro     |               1
        1 | perro     |              -1
(2 rows)
Query with multiple granularity levels.
SELECT SentimentAnalysis('The camera takes great quality pictures but is expensive. It feels like a professional one.' USING PARAMETERS granularity='ASD') OVER();
 sentence | attribute        | sentiment_score | mixed
----------+------------------+-----------------+-------
          |                  |               1 | true
        1 |                  |               0 | true
        2 |                  |               1 | false
        1 | camera           |               1 |
        1 | quality pictures |               1 |
Query for action patterns, scoring only attributes defined in the whitelist.
SELECT SentimentAnalysis('Right after school on November 8th I will go to target, walmart, and best buy and buy #blueslidepark just for @MacMiller' USING PARAMETERS actionPattern=true, whitelistonly=true) OVER();
 sentence | attribute | sentiment_score | action       | action_pattern
----------+-----------+-----------------+--------------+----------------------------
        1 | walmart   |               1 | go to target | #action{$verb $prep $verb}
        1 | walmart   |               1 | go to target | #action{$verb to $verb}
(2 rows)
Getting Twitter User-Mentioned Sentiment
SELECT SentimentAnalysis('@company is great!') OVER(PARTITION BEST);
 sentence | attribute | sentiment_score
----------+-----------+-----------------
        1 | @company  |               1
(1 row)
Filtering Twitter User Sentiment
SELECT SentimentAnalysis('@company is great!' USING PARAMETERS filterusermentions=true) OVER(PARTITION BEST);
 sentence | attribute | sentiment_score
----------+-----------+-----------------
(0 rows)
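Combining Twitter Filters

The filter parameters can be combined in a single USING PARAMETERS clause, separated by commas as in the Syntax section. The following is a minimal sketch: the tweet text and the example.com link are made up for illustration, and no output is shown because the resulting attributes depend on the dictionaries in use.

```sql
-- Hypothetical tweet text; the parameters are those documented above.
-- User mentions, hashtags, links, and the "RT" marker are filtered
-- so that they are not returned as attributes.
SELECT SentimentAnalysis('RT @company: the new phone is great! #launch http://example.com'
       USING PARAMETERS filterusermentions=true,
                        filterhashtags=true,
                        filterlinks=true,
                        filterretweets=true)
       OVER(PARTITION BEST);
```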