Here’s a content marketing question to start your thinking: what would you assume the relationship is between average time on page and word count?
I would assume a linear relationship, right? More words on a page means more time to read, so time on page should rise in step with word count.
What if there wasn’t? What if that relationship didn’t exist?
For example, if you’ve got a page that’s 200 words and a page that’s 1,200 words, you would expect the average time on page for the 1,200-word page to be about six times the time on page for the 200-word page, yes?
The absence of that relationship might indicate that you’ve got a content quality problem. Why? Because if a page is longer and people don’t stick around, then they’re not interested in what that page is about. They bail out before they read the whole thing.
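One way to put a number on "is there a relationship" is a correlation coefficient. The sketch below is a minimal illustration, not the method used for this post: it computes Pearson's r on made-up word counts and times, once where time scales with length and once where it doesn't. All values and names here are hypothetical.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical pages: word counts and average time on page in seconds.
word_counts = [200, 400, 800, 1200]
proportional_times = [50, 100, 200, 300]  # time scales with length
flat_times = [60, 45, 70, 50]             # time ignores length

r_proportional = pearson_r(word_counts, proportional_times)
r_flat = pearson_r(word_counts, flat_times)
print(round(r_proportional, 2), round(r_flat, 2))
```

A value near 1 means longer pages reliably hold people longer; a value near 0 is the missing relationship described above.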
A Walkthrough Example
Let’s take a look at how this might play out. I’ve loaded my blog’s Google Analytics data and a count of the words on each page into a spreadsheet, sorted by sessions in descending order. Google Analytics doesn’t have word or sentence count data, but that’s easily obtained from the SEO tool of your choice or from any good content scraping utility (I wrote my own).
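If you'd rather script this step than do it in a spreadsheet, here is a rough sketch of the same join. The page paths, field names, and numbers are invented for illustration; your analytics export and word count source will have their own column names.

```python
# Hypothetical Google Analytics export: one row per page.
analytics_rows = [
    {"page": "/post-a", "sessions": 120, "avg_time_on_page": 95.0},
    {"page": "/post-b", "sessions": 940, "avg_time_on_page": 41.0},
    {"page": "/post-c", "sessions": 310, "avg_time_on_page": 180.0},
]
# Hypothetical word counts from an SEO tool or scraper, keyed by page path.
word_counts = {"/post-a": 350, "/post-b": 1400, "/post-c": 900}

def combine(analytics_rows, word_counts):
    """Attach a word count to each analytics row, drop pages without one,
    and sort by sessions in descending order."""
    combined = [
        {**row, "word_count": word_counts[row["page"]]}
        for row in analytics_rows
        if row["page"] in word_counts
    ]
    return sorted(combined, key=lambda row: row["sessions"], reverse=True)

pages = combine(analytics_rows, word_counts)
for page in pages:
    print(page)
```

Pages that appear in only one of the two sources are dropped here; whether that's the right call depends on how complete your word count data is.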
Next, let’s make a simple scatterplot of average time on page and word count, with a sub-dimension of number of sessions:
Already we see that there isn’t really a relationship between these two variables, and logically there should be if the content were all of the same quality. But it isn’t. Why is that? It’s because the pages aren’t the same quality. They’re not the same topic, not the same age, not the same writing quality. My blog is 14 years old as of 2021; it would be a bad thing if the writing quality of content from 2007 was the same as it is in 2021.
There are, of course, external factors to take into account as well. The audience has changed, search algorithms have changed, social media newsfeed algorithms (and social media channels) have changed. We can’t ignore those, but we also can’t do much about them.
Let’s take our data and make it a little easier to see by changing the axes from linear to logarithmic and putting some median lines on it:
Ah ha! Now we have four basic quadrants of content quality. In the lower left, we have content that has relatively few words and low time on page. That’s normal; those would be good pages to beef up, perhaps, especially those getting more traffic already.
In the upper left, we have pages with high time on page and low word counts. Those are definitely pages to take a look at and see if there are opportunities to improve them.
In the upper right, we have pages with high time on page and high word counts. These are the pages that are behaving as expected.
In the lower right, we have the problem pages – high word counts and low time on page. These are the pages people are simply not sticking around for.
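The four quadrants can also be assigned in code by comparing each page with the two medians. This is a sketch with made-up pages and hypothetical field names; the choice to count a value equal to the median as "high" is mine. Note that the logarithmic axes only change how the chart looks, not which side of a median line a page lands on.

```python
from statistics import median

def label_quadrants(pages):
    """Label each page by comparing it with the median word count and
    the median time on page. Values equal to a median count as high."""
    median_words = median(p["word_count"] for p in pages)
    median_time = median(p["avg_time_on_page"] for p in pages)
    labeled = []
    for p in pages:
        horizontal = "right" if p["word_count"] >= median_words else "left"
        vertical = "upper" if p["avg_time_on_page"] >= median_time else "lower"
        labeled.append({**p, "quadrant": f"{vertical} {horizontal}"})
    return labeled

# Hypothetical pages, one per quadrant.
pages = [
    {"page": "/short-skimmed", "word_count": 250, "avg_time_on_page": 30},
    {"page": "/short-sticky", "word_count": 300, "avg_time_on_page": 200},
    {"page": "/long-read", "word_count": 1500, "avg_time_on_page": 240},
    {"page": "/long-skipped", "word_count": 1800, "avg_time_on_page": 40},
]

labeled = label_quadrants(pages)
for p in labeled:
    print(p["page"], "->", p["quadrant"])
```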
Now that we know what pages are potentially problematic, we export them and start digging in:
- Is the page quality really bad?
- Is the writing so bad that it needs to be completely rewritten?
- Is the topic so far off what I want to be known for that the page should just be retired?
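For the export step, something like the following would pull out the lower-right pages and write them as CSV, highest traffic first, so the most visible problems get reviewed first. It assumes each page already carries a quadrant label; the field names and data are hypothetical, and it writes to an in-memory buffer where you would normally open a file.

```python
import csv
import io

def export_problem_pages(pages, out):
    """Write lower-right pages to `out` as CSV, highest traffic first.
    Returns the exported rows."""
    problems = sorted(
        (p for p in pages if p["quadrant"] == "lower right"),
        key=lambda p: p["sessions"],
        reverse=True,
    )
    writer = csv.DictWriter(
        out,
        fieldnames=["page", "sessions", "word_count", "avg_time_on_page"],
        extrasaction="ignore",  # skip fields not listed, such as the quadrant label
    )
    writer.writeheader()
    writer.writerows(problems)
    return problems

# Hypothetical, already-labeled pages.
pages = [
    {"page": "/long-skipped", "sessions": 80, "word_count": 1800,
     "avg_time_on_page": 40, "quadrant": "lower right"},
    {"page": "/long-read", "sessions": 500, "word_count": 1500,
     "avg_time_on_page": 240, "quadrant": "upper right"},
    {"page": "/long-ignored", "sessions": 650, "word_count": 2100,
     "avg_time_on_page": 25, "quadrant": "lower right"},
]

buffer = io.StringIO()
problems = export_problem_pages(pages, buffer)
print(buffer.getvalue())
```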
The good news is this analysis can be done in any modern spreadsheet software, combining the data from Google Analytics with data from an SEO tool. Try it for yourself, and see if you can shed some light on what content isn’t carrying its weight.