Standard and Pareto's 500

Earlier this week, the folks over at Benzinga asked us which stocks were tweeted about the most in 2014; the answer won't surprise you, but the shape of the overall distribution might.  In social data, as in so many other arenas, the Pareto principle of an approximate "80-20" dynamic holds true: in 2014, 66% of Twitter "cashtagged" traffic* related to the most popular 20% of S&P 500 stocks. 

$AAPL alone accounted for 11% of the tweets -- more than the least popular 200 stocks combined.  On a graph that shows the % of traffic associated with each stock, the head of the curve is so narrow and the tail is so long that it can be hard to tell them apart from the axes.

That Apple, together with Facebook, Amazon, Google and Microsoft, accounts for more than 20% of the tweets is pretty remarkable (rounding out the top 10 were Netflix, Goldman Sachs, Yahoo, Bank of America and General Motors -- Tesla and Herbalife jump in if you don't limit it to the S&P 500). Of course, there is plenty of data to be mined out of that long tail as more and more people join the conversation -- we spend much of our time down there.  We'll tell you more about that in our next post.

* A "cashtag" is a popular way to specify a particular financial instrument; it is generally understood, for example, that "$IBM" refers to IBM's publicly traded stock, whereas "IBM" refers to the company itself.  The practice originated on StockTwits and is now common on Twitter.

Note on methods: For a variety of reasons, we restricted this analysis to tweets with only one cashtag, although most of the analysis does not vary much if you include tweets with multiple cashtags. We did exclude a few tickers that changed during the course of the year.  And we used Twitter as our data source.
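
For readers who want to try something similar on their own data, here is a rough sketch of the kind of calculation described above: keep only tweets with a single cashtag, count tweets per ticker, and measure what share of traffic goes to the most popular 20% of tickers. This is not our production pipeline. The regular expression, the function names and the sample tweets are all made up for illustration, and real cashtag matching is fussier than this (lowercase tickers and symbols with dots, for example, are ignored here).

```python
import re
from collections import Counter

# Assumption: a cashtag is "$" followed by 1-5 uppercase letters.
# Real-world matching rules are broader than this.
CASHTAG = re.compile(r"\$([A-Z]{1,5})\b")


def single_cashtag_counts(tweets):
    """Count tweets per ticker, keeping only tweets with exactly one cashtag."""
    counts = Counter()
    for text in tweets:
        tags = CASHTAG.findall(text)
        if len(tags) == 1:
            counts[tags[0]] += 1
    return counts


def top_share(counts, fraction=0.2):
    """Share of counted tweets that belongs to the most popular `fraction` of tickers."""
    ranked = sorted(counts.values(), reverse=True)
    if not ranked:
        return 0.0
    k = max(1, round(len(ranked) * fraction))  # at least one ticker in the "head"
    return sum(ranked[:k]) / sum(ranked)


# Invented sample tweets, for illustration only (not real data).
sample_tweets = [
    "$AAPL earnings tonight",
    "Long $AAPL here",
    "$AAPL vs $MSFT",               # two cashtags: excluded
    "$MSFT looks cheap",
    "IBM the company, no cashtag",  # no cashtag: excluded
    "$GM recall news",
    "$AAPL new highs",
    "$YHOO",
    "$NFLX",
]

counts = single_cashtag_counts(sample_tweets)
print(counts.most_common())
print(top_share(counts))
```

Run over a full year of tweets and restricted to S&P 500 tickers, `top_share(counts)` would be the analogue of the 66% figure quoted above, and `counts["AAPL"] / sum(counts.values())` the analogue of Apple's 11%.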