I’m a bit of a data packrat. My hard drive is littered with piles of spreadsheets, CSV files, MySQL databases, and more, which comes in handy more often than you’d think. When Klout announced a major change to their algorithm on October 26, 2011, I knew I had to take a look and see how scores had changed – but I had to do it in a statistically valid way. I strive to avoid producing “studies” and “social media science” that would be labeled cringeworthy by folks like Tom Webster.
Luckily, I had a pool of old Klout data with original Twitter IDs from July lying around, so I was able to do a longitudinal study of Klout scores for the same set of IDs over time. Let’s see what changed.
Data disclosure: this pool of approximately 5,000 Twitter IDs was originally randomly chosen from my Twitter followers. My audience tends to skew towards marketing professionals, so bear that in mind – this audience is not representative of all Twitter users.
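If you want to try the same kind of before-and-after comparison, here’s a minimal Python sketch of the matching step: keep only the IDs that appear in both snapshots so every old score is paired with its new one. The function name `pair_scores`, the ID strings, and the score values are all made up for illustration; they are not taken from my dataset.

```python
def pair_scores(old_scores, new_scores):
    """Return {twitter_id: (old, new)} for IDs present in both snapshots."""
    # Only IDs found in both snapshots can be compared over time.
    shared_ids = old_scores.keys() & new_scores.keys()
    return {tid: (old_scores[tid], new_scores[tid]) for tid in shared_ids}


# Made-up illustrative values, not the actual dataset.
old_snapshot = {"id_a": 1, "id_b": 52, "id_c": 18, "id_d": 30}
new_snapshot = {"id_a": 10, "id_b": 46, "id_c": 12, "id_e": 40}

pairs = pair_scores(old_snapshot, new_snapshot)
for tid in sorted(pairs):
    old, new = pairs[tid]
    print(tid, old, new, new - old)  # ID, old score, new score, change
```

IDs that show up in only one snapshot (`id_d` and `id_e` here) are dropped, which is what keeps the comparison apples to apples.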
Here’s the basic line chart for old Klout scores:
Here’s the basic line chart for new Klout scores:
Take note that scores declined nearly linearly once you were past the short head in the old model. In the new model, there’s a change in inflection right around 35 or so, and then again around 15. Also take note that in Old Klout, scores could be as low as 1; in New Klout, scores bottom out at 10.
The change in the floor score changes the shape of the score distribution pretty significantly. Here’s the distribution of Old Klout scores:
You can see the pile of low level 1 scores at the very left. Now the same for New Klout:
The pile of level 1s is now piled up with the level 10s on the left side. For data quality purposes, this makes it VERY hard to distinguish between crap accounts (the old level 1s, which were a good indicator of bots) and people brand new to Twitter (usually the old level 10s). This is very unfortunate in itself.
Second, it almost looks like Klout tried to balance active, influential folks at around 45 on the new chart. To illustrate this, let’s filter out all scores below 11 on both data sets so that you can see people with at least some activity and/or influence.
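For anyone following along with their own data, the filtering and bucketing can be sketched roughly like this in Python. I’m assuming whole-number scores and a bin width of 5, which is an arbitrary choice for the example; `bucket_scores` is a hypothetical helper and the sample scores are invented.

```python
from collections import Counter


def bucket_scores(scores, floor=11, bin_width=5):
    """Drop scores below `floor`, then count the rest in fixed-width bins.

    Keys are the lower edge of each bin (11, 16, 21, ... by default).
    """
    kept = [s for s in scores if s >= floor]
    counts = Counter((s - floor) // bin_width * bin_width + floor for s in kept)
    return dict(sorted(counts.items()))


# Invented sample scores, not the actual dataset.
sample = [1, 10, 10, 11, 15, 16, 35, 45, 47]
histogram = bucket_scores(sample)
for lower_edge, count in histogram.items():
    print(f"{lower_edge}-{lower_edge + 4}: {'#' * count}")
```

Running the same function over the old and the new scores gives two histograms you can compare bin by bin.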
Two things leap out. First, if you were above 45 in Old Klout, it looks like you might have gotten a downgrade. Second, look at the low end – a lot more people moved from the second quartile to the left side in the algorithm change.
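One way to check the “above 45 got downgraded” impression with numbers rather than eyeballing a chart is to compute the share of previously high scorers whose score fell. This is only a sketch: `downgrade_share` is a hypothetical helper, the threshold of 45 comes from the observation above, and the paired scores are invented.

```python
def downgrade_share(pairs, threshold=45):
    """Share of accounts with an old score above `threshold` whose new score is lower.

    `pairs` is an iterable of (old_score, new_score) tuples.
    Returns None when no account was above the threshold.
    """
    high = [(old, new) for old, new in pairs if old > threshold]
    if not high:
        return None
    dropped = sum(1 for old, new in high if new < old)
    return dropped / len(high)


# Invented (old, new) pairs, not the actual dataset.
sample_pairs = [(60, 52), (50, 48), (47, 49), (30, 35), (46, 40)]
share = downgrade_share(sample_pairs)
print(f"{share:.0%} of accounts above 45 dropped")
```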
So with all of these changes, is there a “good” Klout score in the new model for my dataset? In the old model that was activity based, anything above 15 was probably not too bad – active users of Twitter. In the new model, 15 is one of the break points, but right around 35 is where you see scores really pick up for this sample set. If I were looking for “influencers” in the new scoring model, I might want to start looking at scores of 35 and up.
GREAT BIG HUGE WARNING: Remember that this is a biased, non-representative sample. I am most assuredly NOT saying that you should run out and update all your social media marketing PowerPoint slides with a shiny new “35 or bust” bullet point. What I am saying is that Klout now appears to have two tiers in their data – lower influence in the 11-15 range and higher influence in the 35-50 range.
Does that mean you’re a social media failure if you have a Klout score below 35? No. It could mean you’re not going to get access to as many of the perks in their perks program, but that’s about it for consequences of a score under 35 as far as I can tell. Beyond that, keep doing everything that is a generally accepted best practice on Twitter: share interesting stuff, have real conversations, be human, etc.
Do Klout scores matter? In the old model, they were based on activity and could be gamed fairly easily. I don’t have enough data for the new model yet (working on that) to see what aspects of social media practice correlate less or more strongly with the score, so there’s no way to tell if their algorithm is an improvement or not for the purposes of judging who is influential. That means for now, they’re not any less or more accurate than they were before the update, so put as little or as much faith in them as you did before until we have better data.
For those folks who are data junkies, you are welcome to download the anonymized CSV files for these two datasets here:
I’d love to hear about your conclusions in the comments.