Finding Associated Attributes

Once you've analyzed your tweets and stored them in a table (see Batch Analyzing Data as It Is Loaded), you can use the analyzed data to make quick comparisons, such as finding the attributes most often associated with another attribute.

For example, if your primary attribute is 'microsoft', you may want to determine which other attributes are used most often with the word 'microsoft' in the same tweet. This can be accomplished with the following SQL:

select t1.attribute, count(*), avg(t1.sentiment_score)
  from tweet_sentiment t1, tweet_sentiment t2
 where t1.id = t2.id
   and t1.attribute <> t2.attribute
   and t2.attribute = 'microsoft'
 group by t1.attribute
 order by count desc
 limit 5;

We get the following results from a data set of 25,000 PC Manufacturer tweets:

            attribute             | count |        avg
----------------------------------+-------+--------------------
 windows phone                    |    81 | 0.0238095238095238
 power data center                |    77 |   0.58974358974359
 wind project                     |    77 |                  0
 investment                       |    73 |                  0
 windows                          |    57 |  0.175438596491228

This query gives you additional insight into the scope of an attribute and may help explain why that attribute was scored the way it was.
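
To try the self-join on a small scale, the sketch below runs an equivalent query against an in-memory SQLite database from Python. It is only an illustration under assumptions: the rows are made-up toy data, the function names associated_attributes and build_demo_db are hypothetical, the column aliases cnt and avg_score are introduced here, and a tie-breaker on the attribute name is added so the ordering is deterministic. Only the table and column names (tweet_sentiment, id, attribute, sentiment_score) come from the query above.

```python
import sqlite3

# Hypothetical toy data: (tweet id, attribute, sentiment_score).
ROWS = [
    (1, "microsoft", 0.5), (1, "windows phone", 1.0),
    (2, "microsoft", 0.0), (2, "windows phone", 0.0), (2, "investment", 0.0),
    (3, "microsoft", -0.5), (3, "investment", 0.5),
    (4, "windows phone", 1.0),  # tweet without 'microsoft', so it is ignored
    (5, "microsoft", 0.0), (5, "windows phone", 0.5),
]

# Same self-join as the query above, with the primary attribute and the
# row limit passed as bound parameters instead of literals.
QUERY = """
    SELECT t1.attribute,
           COUNT(*)                AS cnt,
           AVG(t1.sentiment_score) AS avg_score
      FROM tweet_sentiment t1
      JOIN tweet_sentiment t2 ON t1.id = t2.id
     WHERE t2.attribute = ?
       AND t1.attribute <> t2.attribute
     GROUP BY t1.attribute
     ORDER BY cnt DESC, t1.attribute
     LIMIT ?
"""


def build_demo_db():
    """Create an in-memory table shaped like tweet_sentiment and load ROWS."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tweet_sentiment "
        "(id INTEGER, attribute TEXT, sentiment_score REAL)"
    )
    conn.executemany("INSERT INTO tweet_sentiment VALUES (?, ?, ?)", ROWS)
    return conn


def associated_attributes(conn, primary, limit=5):
    """Return (attribute, count, average score) rows co-occurring with primary."""
    return conn.execute(QUERY, (primary, limit)).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for attribute, cnt, avg_score in associated_attributes(conn, "microsoft"):
        print(f"{attribute:<20} {cnt:>5} {avg_score:>8.3f}")
```

Passing the primary attribute as a bound parameter rather than pasting it into the SQL string makes it easy to repeat the comparison for other attributes and avoids quoting problems with attribute names that contain apostrophes.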