Twitter is still something of a mystery to those of us in advertising and marketing. Everyone thinks they need to be on top of it, but no one is completely sure how to use it. Even fewer people have an idea of how to measure whether or not they’re using it effectively. Most of the time brands think about Twitter like this: Create an account, start tweeting, and then measure success by looking at how many followers we have. But that doesn’t tell you the whole story. In fact, that tells you almost nothing.
UPDATE: This post, along with a short interview, was recently posted on the BBH Labs blog.
Anyone who’s used Twitter for just a few days will quickly discover that it’s a haven for spammers. But just how bad is the problem? Well, I have a dummy account I created about 18 months ago and it has 300 followers, despite the fact that I’ve never sent a single tweet. That should tell you, right off the bat, that the number of followers is something of a useless metric. It has no context. It’s just a number.
Almost all of the data stored about Twitter users and their tweets is public and can be pulled down from the API with relative ease. The Twitter API is easily accessible with cURL. This means that anyone who knows how to use a text editor can start pulling down heaps of data. The rest of this post isn’t about how to use cURL, per se, but rather thinking through some of the different ways we might use the massive amount of data Twitter makes available to us to draw insights and set better goals.
To illustrate this, we need a target. I’ve chosen @bbhlabs. @bbhlabs is the Twitter account for BBH Labs, the “marketing skunkworks” division of BBH. I chose this account for two reasons: (1) I admire their work; and (2) I wanted to see what the follower data looked like for an ad agency. Who follows those whose goal it is to encourage consumers to follow others?
At the time of writing this, BBH Labs had a little over 12,500 followers. Using the statuses/followers REST API Method I was able to quickly pull down the information for almost every one of their followers. Information like this:
- Profile Bio
- Profile Picture
- Web URL
- Privacy Settings
- # of Followers
- # of Friends (“following”)
- Account Creation Date
- # of Favorites
- UTC Offset
- Time Zone
- Per-tweet Geolocation Status
- Verified User Status
- # of Tweets
- And more… this is just what I thought was relevant
All of this information is public, for almost every follower (unless the account is private). Almost scary, right? It should also be noted that this is but a single API call. There are dozens of different API calls that cover everything from search to lists to retweets. Profile data is a very small subset of the total data available to play with.
After pulling down all of that information I was left with a massive XML file which needed to be parsed and formatted into a CSV file. If you want to pull follower information for 10k users, expect to be left with an XML file some 500k lines long. Programming knowledge sure makes parsing XML easier, but it’s not a prerequisite for this kind of analysis. Most of this can be done using cURL, a text editor, and simple functions like VLOOKUP in Microsoft Excel.
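For those who would rather script the XML-to-CSV step, here is a minimal Python sketch using only the standard library. The element names (`user`, `screen_name`, `followers_count`, and so on) and the sample document are assumptions for illustration, not a record of the file I actually worked with; adjust them to match whatever your XML contains.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Assumed element names, for illustration only; match these to your XML.
FIELDS = ["screen_name", "followers_count", "friends_count",
          "statuses_count", "protected"]


def users_to_rows(xml_text):
    """Return one dict per <user> element, keeping only FIELDS."""
    root = ET.fromstring(xml_text)
    rows = []
    for user in root.iter("user"):
        # findtext returns the default when a child element is missing
        rows.append({f: user.findtext(f, default="") for f in FIELDS})
    return rows


def rows_to_csv(rows):
    """Serialize rows to CSV text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical two-user sample standing in for the real follower file.
SAMPLE = """<users>
  <user><screen_name>alice</screen_name><followers_count>250</followers_count>
        <friends_count>180</friends_count><statuses_count>1200</statuses_count>
        <protected>false</protected></user>
  <user><screen_name>bob</screen_name><followers_count>3</followers_count>
        <friends_count>0</friends_count><statuses_count>0</statuses_count>
        <protected>true</protected></user>
</users>"""

if __name__ == "__main__":
    print(rows_to_csv(users_to_rows(SAMPLE)))
```

This loads the whole document into memory, which is fine for a sample. For a file hundreds of thousands of lines long, a streaming parser such as `xml.etree.ElementTree.iterparse` is probably the better fit.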
With the CSV file in hand, you can open it up in any number of applications and start sorting, slicing, pivoting and filtering the data until you find what you’re looking for. And we’ll get there, but before we start looking at numbers let’s have a look at something a little more visually compelling. What if you wanted to map everyone who follows you? How would you do that? Turns out it’s pretty easy. All you have to do is head on over to one of Google’s lab projects, Fusion Tables.
Fusion Tables is an incredibly powerful tool for statistics, analysis, and visualizations. Once you start to use it, and realize that it has an open API behind it, Microsoft Excel starts to look like a toy. One of the processes that Fusion Tables makes simple is geocoding massive amounts of location information. All you have to do is upload a CSV with a list of locations and Fusion Tables does most of the work for you. This is what allowed me to create the interactive map you see at the top of this post. It also allows you to create a heatmap, like this:
All the major cities are accounted for here, including some interesting finds when you look across the globe. (Note: There are a number of geocoding errors that I’ve not bothered to filter out.)
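One way to prepare a list of locations for a geocoding upload is to collapse the free-text profile locations into one row per distinct string, with a count. The sketch below is just that, a sketch: the sample values and the `location`/`followers` column names are hypothetical, and the light cleanup (trimming whitespace, ignoring case) is my assumption about what helps, not something any particular geocoder requires.

```python
import csv
import io
from collections import Counter


def tally_locations(locations):
    """Count followers per location after trimming whitespace and ignoring case."""
    counts = Counter()
    for raw in locations:
        cleaned = " ".join(raw.split())  # collapse runs of whitespace
        if cleaned:                      # skip blank profile locations
            counts[cleaned.lower()] += 1
    return counts


def to_csv(counts):
    """Write location,count rows, most common first, as CSV text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["location", "followers"])
    for loc, n in counts.most_common():
        writer.writerow([loc, n])
    return buf.getvalue()


# Hypothetical profile locations, including a blank and two near-duplicates.
sample = ["London", " london ", "New York, NY", "", "New  York, NY", "Tokyo"]

if __name__ == "__main__":
    print(to_csv(tally_locations(sample)))
```

Cleanup like this will not fix genuinely ambiguous or joke locations, so some geocoding errors should still be expected.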
You might look at this and say, “That’s neat. Who cares?” Well, consider this: One of the key problems with ARGs is that you never know who’s actually going to participate. You have to spend buckets in media to ensure that you reach the right people, who want to participate, and then you hope that they do something like go on Twitter and tweet about it. You’re also hoping that they have a lot of followers, so the message gets seen by as many people as possible. But you don’t have to hope. You can use information from Twitter to make recommendations with incredible accuracy. Not city level, but street level.
And here’s more food for thought: Since all of this information is public, you can draw down information on your competitors’ Twitter followers and target billboards, bus stop ads, and other out-of-home media directly at them, with street-level precision.
Strictly By The Numbers
Now back to the spreadsheets. One of the things I like to do when I look at follower data is find out how many “active” users there are. Of the 12,500+ people who follow BBH Labs, how many are spam? How many are inactive? How many people have a network of zero? There are any number of ways to do this, but I like to create filters around real-world usage patterns to get a slightly better idea of how big of an audience you’re actually reaching.
I applied the following four filters to the 12,601 people in my database:
- Private: = FALSE
- # of Followers: > 100
- # of Tweets: > 100
- # of Friends: > 10
How many people was I left with? 6,619, or about 53%. And while that might not mean much without a comparison, that’s really pretty good. To be clear: I’m not suggesting that the other 5,982 people following BBH Labs are spammers; just that they probably aren’t as valuable as those who passed our test. And when you think about the filter, it’s really not that strict. All it’s saying is that each follower has 100 people following them (50 of whom are probably spam), that they’ve been using Twitter long enough to send 100 tweets (even at one a day, you’ll hit 100 in a little over 3 months), and that they find at least 10 people interesting enough to follow.
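The same four filters are easy to express in code if a spreadsheet is not your thing. Here is a minimal Python sketch; the four follower records and their key names are made up for illustration, and only the thresholds come from the list above.

```python
# Hypothetical follower records; the keys are illustrative column names.
followers = [
    {"name": "a", "private": False, "followers": 250, "tweets": 1200, "friends": 180},
    {"name": "b", "private": True, "followers": 900, "tweets": 4000, "friends": 300},
    {"name": "c", "private": False, "followers": 12, "tweets": 0, "friends": 2000},
    {"name": "d", "private": False, "followers": 101, "tweets": 101, "friends": 11},
]


def is_active(f):
    """Public account with >100 followers, >100 tweets, and >10 friends."""
    return (not f["private"]
            and f["followers"] > 100
            and f["tweets"] > 100
            and f["friends"] > 10)


active = [f for f in followers if is_active(f)]
share = len(active) / len(followers)  # fraction of followers passing all four filters

if __name__ == "__main__":
    print(f"{len(active)} of {len(followers)} pass ({share:.0%})")
```

Keeping the thresholds in one small function makes it easy to tighten or loosen the definition of “active” and see how the share changes.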
While looking through the rest of the data I pulled out a few other simple statistics that are interesting to think about:
- Average # of followers: 1,746 | Median: 163
- Average # of friends: 982 | Median: 206
- Average # of tweets: 987 | Median: 247
- 6% of followers keep their tweets private
- 9% have per-tweet geolocation enabled
- 12 followers are “verified”
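As you can see from the gaps between the means and medians, not all followers are created equal: a small number of very large accounts pull the averages well above what a typical follower looks like.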
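To see why a mean and a median can sit so far apart, consider a small, made-up set of follower counts with a single very large account in it. The numbers below are hypothetical and are not drawn from the BBH Labs data.

```python
from statistics import mean, median

# Hypothetical follower counts: mostly small accounts plus one very large one.
follower_counts = [40, 85, 120, 163, 210, 300, 25000]

avg = mean(follower_counts)    # pulled upward by the single large account
mid = median(follower_counts)  # the middle value, unaffected by the outlier

if __name__ == "__main__":
    print(f"mean: {avg:.0f}, median: {mid}")
```

Here one account drags the mean to more than twenty times the median, which is the same pattern, in exaggerated form, as in the statistics above.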
There are an unlimited number of ways to look at this kind of data, and depending on what you’re looking for, probably a few surprises. My goal when I sat down to write this was not to thoroughly analyze BBH Labs (I’ve already gone too far), just to jot down some thoughts that might help others think beyond the follower count. I hope I’ve succeeded. Let me know in the comments.