Analyzing Comments for a Company or Product

Pulse allows you to analyze comments (such as tweets) for a particular company or product.

For example, imagine that the fictional company Pytell Corp has just released a new product called Owl-2. You want to analyze the sentiment of both the company and the product.

You've collected several tweets from Twitter about several companies and products into your database. However, for this analysis you only want to target tweets that have to do with Pytell Corpand/or Owl-2.

The dataset for this example is below:

create table tweets_sample(id int, author varchar(50), text varchar(400));

insert into tweets_sample values(400900, 'DramaBugs', 
'Pytell Corp has horrible customer support. On Hold 2 hours!'); insert into tweets_sample values(401200, 'Gemball',
'Owl-2 doesn''t fly!'); insert into tweets_sample values(403070, 'Postta',
'Pytell finally released Owl-2!'); insert into tweets_sample values(480920, 'Instana',
'Unboxing Owl-2 after work today! Stay Tuned!'); insert into tweets_sample values(434500, 'Dailydant',
'Owl-2 flies great! I like it!'); insert into tweets_sample values(450670, 'HelpfulBen',
'Owl-2 keeps crashing into things!'); insert into tweets_sample values(402092, 'Championtips',
'Owl-2 has solved our rodent infestation!'); insert into tweets_sample values(434950, 'Editone',
'Pytell fail? Reports of Owl-2 crashing through windows.'); insert into tweets_sample values(413956, 'CzarLatest',
'Pytell Corp''s Owl-2 just released!'); insert into tweets_sample values(459988, 'CelticMiss', 'I like Ponies!'); insert into tweets_sample values(403511, 'BuffDrama',
'I am afraid of small spiders.'); commit;
 
  1. Run SentimentAnalysis to get an idea of how Pulse is analyzing the data:

    SELECT author, SentimentAnalysis(text) OVER(PARTITION BY author, text) FROM tweets_sample ORDER BY attribute;

    author | sentence | attribute | sentiment_score --------------+----------+--------------------+----------------- DramaBugs | 1 | customer support | -1 Championtips | 1 | owl-2 | 0 HelpfulBen | 1 | owl-2 | -1 Instana | 1 | owl-2 | 1 CzarLatest | 1 | owl-2 | 0 Gemball | 1 | owl-2 | 0 Postta | 1 | owl-2 | 0 Dailydant | 1 | owl-2 | 1 Editone | 2 | owl-2 | -1 CelticMiss | 1 | ponies | 1 Editone | 2 | reports | -1 Championtips | 1 | rodent infestation | 0 BuffDrama | 1 | spiders | -1 Instana | 2 | tuned | 0 Editone | 1 | Pytell | -1 Postta | 1 | Pytell | 0 CzarLatest | 1 | Pytell corp | 0 DramaBugs | 1 | Pytell corp | -1 Editone | 2 | windows | -1 Instana | 1 | work today | 0 (20 rows)
  2. There are some attributes listed (ponies!) that do not apply to the analysis that you are doing. You can focus your analysis by adding whitelist entries and filtering on the whitelist. Insert whitelist entries for the company and product name into the standard whitelist:

    INSERT INTO pulse.white_list_en VALUES ('Pytell Corp');
    INSERT INTO pulse.white_list_en VALUES ('owl-2');
    commit;

    Reload the whitelist into Pulse. Loading a user-dictionary or mapping overwrites the existing user-dictionary or mapping:

    SELECT LoadDictionary(standard USING PARAMETERS listName='white_list') OVER() FROM pulse.white_list_en;
  3. Also, note that Pulse is not identifying all variations on the company name. There are also three obvious attributes for the product name ('Pytell', 'pytell corp). You can normalize these values by using a normalization mapping. Add the synonyms to the standard normalization mapping:

    insert into pulse.normalization_en values('Pytell', 'Pytell Corp');
    commit;
  4. Reload the normalization mapping to load the new values into Pulse:

    SELECT LoadMapping(standard_base, standard_synonym USING PARAMETERS mapName='normalization') OVER() FROM pulse.normalization_en;
    
  5. Run the query again to see how the normalization affects the results.

    Note that 'pytell corp' has been normalized to 'pytell' and Pulse is correctly identifying the synonyms and mapping them to the base term