About the Vertica Pulse Package
Vertica Pulse provides a suite of functions for analyzing and extracting sentiment from English- and Spanish-language text directly in your Vertica database.
Vertica Pulse features include:
- Attribute-based sentiment scoring - Pulse scores the sentiment of attributes in a sentence. Attributes are generally nouns and are automatically discovered by Pulse. Pulse typically scores sentiment on a scale from -1 (negative sentiment) to +1 (positive sentiment). A sentiment of 0 is considered neutral. Scoring individual attributes instead of the sentence as a whole provides a more granular analysis of the text. For example, consider the sentence "The quick brown fox jumped over the lazy dog." It would be difficult to score the sentiment of the sentence as a whole, but if you score the attributes fox and dog, you could say that the sentiment on the fox is positive (the fox is quick) and the sentiment on the dog is negative (the dog is lazy).
- Tuning to your domain - Pulse provides functionality to recognize attributes that are specific to your domain. For example, you can add the name of your product or company to a 'white_list' so that it is discovered by Pulse.
- Tuning of how sentiment is scored - Pulse includes user-dictionaries of words that are used to help score sentiment. You can alter these user-dictionaries to fine-tune the way your text is analyzed.
- Filtering of attributes you are not interested in - Pulse supports a special 'stop words' user-dictionary to indicate attributes that should not be analyzed. Alternatively, you can choose to score sentiment only on attributes defined in your white_list.
- Synonym mappings - Pulse provides customizable mappings so that you can map synonyms to a base word, and then normalize the analysis for the synonyms to the base word. For example, you can map Hewlett Packard to HP.
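The features listed above can be illustrated with a small, self-contained Python sketch. This is a toy model only, not Pulse's algorithm or API: the word lists, the function names (normalize, score_attributes), and the input format are all hypothetical, and the (attribute, descriptor) pairs are assumed to have been extracted already. It shows per-attribute scoring on a -1 to +1 scale, synonym normalization, stop-word filtering, and optional white-list-only scoring.

```python
# Toy illustration only: this is NOT Pulse's algorithm or API.
# Every name and word list below is hypothetical.

POSITIVE_WORDS = {"quick", "great", "reliable"}
NEGATIVE_WORDS = {"lazy", "slow", "broken"}
STOP_WORDS = {"thing"}                    # attributes that should not be analyzed
SYNONYMS = {"hewlett packard": "hp"}      # synonym -> base word


def normalize(attribute):
    """Lowercase an attribute and map it to its base word, if one is defined."""
    key = attribute.lower()
    return SYNONYMS.get(key, key)


def score_attributes(pairs, white_list=None):
    """Average the polarity of the descriptors attached to each attribute.

    `pairs` is a list of (attribute, descriptor) tuples. Attribute discovery
    itself is not shown here. If `white_list` is given, only attributes in
    that set are scored.
    """
    polarities = {}
    for attribute, descriptor in pairs:
        name = normalize(attribute)
        if name in STOP_WORDS:
            continue
        if white_list is not None and name not in white_list:
            continue
        word = descriptor.lower()
        if word in POSITIVE_WORDS:
            polarity = 1
        elif word in NEGATIVE_WORDS:
            polarity = -1
        else:
            polarity = 0                  # unknown descriptors count as neutral
        polarities.setdefault(name, []).append(polarity)
    # The mean of values in {-1, 0, 1} always stays within [-1, +1].
    return {name: sum(vals) / len(vals) for name, vals in polarities.items()}


pairs = [
    ("fox", "quick"), ("fox", "brown"), ("dog", "lazy"),
    ("Hewlett Packard", "reliable"), ("HP", "slow"), ("thing", "great"),
]
scores = score_attributes(pairs)
print(scores)  # {'fox': 0.5, 'dog': -1.0, 'hp': 0.0}
```

In this sketch the fox scores positive and the dog negative, the two spellings of HP are merged into one attribute before averaging, and the stop word is dropped. Calling score_attributes(pairs, white_list={"hp"}) restricts the result to the white-listed attribute.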
Vertica Pulse requires that Java and the Vertica Java Support Package are installed on all nodes in the Vertica cluster.
Depending on the version, Pulse supports either a single language (English or Spanish) or both. In multilingual versions, Pulse can analyze each row of text (for example, a tweet) in the language specified for that text as an argument, the language specified by the user as a parameter, or the default language. See Multilingual Pulse for details.
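One way to picture the language selection described above is as a fallback chain. The Python sketch below is an illustration under stated assumptions, not Pulse's implementation: the function name resolve_language, its parameter names, the precedence order (per-row argument first, then user parameter, then default), and the choice to skip unsupported values are all hypothetical. See Multilingual Pulse for the actual behavior.

```python
# Hypothetical sketch of a language fallback chain; not Pulse's API.
SUPPORTED_LANGUAGES = {"english", "spanish"}


def resolve_language(row_language=None, user_language=None, default="english"):
    """Return the first supported language among the candidates.

    Assumed precedence: the language given with the text row, then the
    language set by the user, then the default language.
    """
    for candidate in (row_language, user_language, default):
        if candidate and candidate.lower() in SUPPORTED_LANGUAGES:
            return candidate.lower()
    raise ValueError("no supported language available")
```

For example, resolve_language("Spanish", "english") returns "spanish", while resolve_language() falls back to "english".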