Normalization Map Effect on Results

Pulse applies the normalization map before it runs any sentiment analysis function on the text. When a function runs, Pulse replaces each synonym with its base word, so the results display the mapped words and not the original text. For example, when the normalization map contains both terms, Pulse maps 'Hewlett Packard' and 'Hewlett-Packard' (with a hyphen) to 'HP', which the results display in lowercase as 'hp'.
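The replacement step can be pictured with a small Python sketch. This is only an illustration of the idea, not how Pulse implements it: the dictionary NORMALIZATION_MAP and the function normalize() are hypothetical names, and the lowercase, whole-word matching is an assumption based on the lowercase attributes shown in the example results.

```python
import re

# Hypothetical in-memory stand-in for the normalization map: synonym -> base word.
# Entries are lowercase because the example results display lowercase attributes.
NORMALIZATION_MAP = {
    "hewlett-packard": "hp",
    "hewlett packard": "hp",
}


def normalize(text, mapping=NORMALIZATION_MAP):
    """Return text in lowercase with each mapped synonym replaced by its base word."""
    result = text.lower()
    # Replace longer synonyms first so that overlapping entries do not clash.
    for synonym in sorted(mapping, key=len, reverse=True):
        pattern = r"\b" + re.escape(synonym) + r"\b"
        result = re.sub(pattern, mapping[synonym], result)
    return result


print(normalize("Hewlett-Packard was founded in 1939."))
print(normalize("Hewlett Packard was started in a garage."))
```

Both spellings come out as the same base word, which is why the attribute column shows a single term after mapping.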

Before Mapping

The following example demonstrates sentiment analysis before mapping:

=> SELECT SentimentAnalysis('Hewlett-Packard was founded in 1939. 
Hewlett Packard was started in a garage in Palo Alto California')
OVER(PARTITION BEST);

 sentence |      attribute       | sentiment_score
----------+----------------------+-----------------
        1 | hewlett-packard      |               0
        2 | hewlett packard      |               0
        2 | garage               |               0
        2 | palo alto california |               0
(4 rows)

Insert Normalization Values and Load Map

You can add values to the normalization map using an INSERT statement. The following example demonstrates how to insert normalization values and load the map:

=> INSERT INTO pulse.normalization_en VALUES('HP', 'Hewlett-Packard');
=> INSERT INTO pulse.normalization_en VALUES('HP', 'Hewlett Packard');
=> COMMIT;

=> SELECT LoadMapping(standard_base, standard_synonym
USING PARAMETERS mapName='normalization') OVER()
FROM pulse.normalization_en;

You can also map multiple values to the same term using a $LIST parameter. The following example maps several alternate names for the city of Boston to the value 'Boston':

=> INSERT INTO normalization_en VALUES('Boston', '$LIST(BOS,beantown,the hub)');
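One way to read a $LIST value is as shorthand for several single-synonym rows that share a base word. The Python sketch below illustrates that reading only. The function expand_synonyms() is a hypothetical helper, and the comma-splitting and whitespace-trimming rules are assumptions, not documented Pulse behavior.

```python
def expand_synonyms(base, synonym):
    """Return (base, synonym) pairs, one per item of a '$LIST(a,b,c)' value.

    A value that is not wrapped in $LIST(...) yields a single pair.
    Assumption: items are separated by commas and surrounding spaces are ignored.
    """
    prefix, suffix = "$LIST(", ")"
    if synonym.startswith(prefix) and synonym.endswith(suffix):
        items = synonym[len(prefix):-len(suffix)].split(",")
        return [(base, item.strip()) for item in items if item.strip()]
    return [(base, synonym)]


for pair in expand_synonyms("Boston", "$LIST(BOS,beantown,the hub)"):
    print(pair)
```

Under this reading, the single $LIST row stands for three rows that each map one alternate name to 'Boston'.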

After Mapping

The mapping operation replaces the attributes with their counterparts from the normalization list and displays the base terms:

=> SELECT SentimentAnalysis('Hewlett-Packard was founded in 1939. 
Hewlett Packard was started in a garage in Palo Alto California')
OVER(PARTITION BEST);

 sentence |      attribute       | sentiment_score
----------+----------------------+-----------------
        1 | hp                   |               0
        2 | hp                   |               0
        2 | garage               |               0
        2 | palo alto california |               0
(4 rows)

The CommentAttributes() function also uses the normalization map and displays the base terms instead of the original text:

=> SELECT CommentAttributes('Hewlett-Packard was founded in 1939. 
Hewlett Packard was started in a garage in Palo Alto California')
OVER(PARTITION BEST);

 sentence |      attribute
----------+----------------------
        1 | hp
        2 | hp
        2 | garage
        2 | palo alto california
(4 rows)