
Google Analytics: A Content Marketing Engagement Test

Here’s a content marketing question to start your thinking: what would you assume the relationship is between average time on page and word count?

I would assume a linear relationship, right? More words on a page means more time to read, so time on page should rise in step with word count.

What if there wasn’t? What if that relationship didn’t exist?

For example, if you’ve got a page that’s 200 words and a page that’s 1,200 words, you would expect the average time on page for the 1,200-word page to be about 6x the time on page for the 200-word page, yes?
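Here is a minimal sketch of that expectation in Python. The reading speed of 200 words per minute is purely an assumption for illustration, and the function name is made up; the point is that under any constant reading speed, the ratio between the two pages comes out the same.

```python
def expected_time_on_page(word_count, words_per_minute=200.0):
    """Seconds needed to read a page if every word is read at a steady pace.

    The 200 words-per-minute default is an illustrative assumption,
    not a figure from Google Analytics or from this post.
    """
    return word_count / words_per_minute * 60.0


short_page = expected_time_on_page(200)
long_page = expected_time_on_page(1200)

# The ratio depends only on the word counts, not on the assumed reading speed.
ratio = long_page / short_page
print(f"short: {short_page:.0f}s, long: {long_page:.0f}s, ratio: {ratio:.1f}x")
```

If the real time-on-page numbers for your pages are nowhere near that ratio, that gap is what the rest of this test is looking for.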

The absence of that relationship might indicate that you’ve got a content quality problem. Why? Because if a page is longer and people don’t stick around, then they’re not interested in what that page is about. They bail out before they read the whole thing.

A Walkthrough Example

Let’s take a look at how this might play out. I’ve loaded my blog’s Google Analytics data and a count of the words on each page into a spreadsheet, sorted by sessions in descending order. Google Analytics doesn’t have word or sentence count data, but that’s easily obtained from the SEO tool of your choice or from any good content scraping utility (I wrote my own).
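If you would rather script this step than build it by hand in a spreadsheet, something like the following could work. It is only a sketch: the two CSV snippets stand in for an analytics export and an SEO tool export, and the column names (`page`, `sessions`, `avg_time_on_page`, `word_count`) are hypothetical, so rename them to match whatever your tools produce.

```python
import csv
import io

# Hypothetical exports. Real column names will differ by tool.
ANALYTICS_CSV = """page,sessions,avg_time_on_page
/post-a,1200,95.0
/post-b,300,40.5
/post-c,2500,210.0
"""

WORD_COUNT_CSV = """page,word_count
/post-a,850
/post-c,1900
/post-d,400
"""


def load_rows(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


def combine(analytics_rows, word_rows):
    """Join the two exports on page path and sort by sessions, descending."""
    words_by_page = {row["page"]: int(row["word_count"]) for row in word_rows}
    combined = []
    for row in analytics_rows:
        page = row["page"]
        if page not in words_by_page:
            continue  # no word count for this page, so it can't be compared
        combined.append({
            "page": page,
            "sessions": int(row["sessions"]),
            "avg_time_on_page": float(row["avg_time_on_page"]),
            "word_count": words_by_page[page],
        })
    combined.sort(key=lambda r: r["sessions"], reverse=True)
    return combined


pages = combine(load_rows(ANALYTICS_CSV), load_rows(WORD_COUNT_CSV))
for p in pages:
    print(p)
```

Pages that appear in only one of the two exports are dropped here; whether that is the right call depends on how complete your crawl is.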

Content analysis

Next, let’s make a simple scatterplot of average time on page and word count, with a sub-dimension of number of sessions:

Content scatterplot

Already we see that there isn’t really a relationship between these two variables, and logically there should be if the content were all of the same quality. But it’s not. Why is that? It’s because the pages aren’t the same quality. They’re not the same topic, not the same age, not the same writing quality. My blog is 14 years old as of 2021; it would be a bad thing if the writing quality of content from 2007 was the same as it is in 2021.
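Eyeballing a scatterplot is one way to judge this; a correlation coefficient is another. The sketch below computes Pearson's r by hand so it needs nothing beyond the standard library. The numbers are invented for illustration and are not from my blog's data. A value near 1 would mean time on page rises in step with word count; a value near 0 means there is no linear relationship to speak of.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least two points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("a constant series has no defined correlation")
    return covariance / math.sqrt(spread_x * spread_y)


# Made-up values, one entry per page.
word_counts = [200, 400, 800, 1200, 1600]
avg_seconds = [45.0, 120.0, 60.0, 30.0, 90.0]

r = pearson_r(word_counts, avg_seconds)
print(f"Pearson r between word count and time on page: {r:.2f}")
```

Keep in mind that Pearson's r only measures linear association and is sensitive to outliers, so treat it as a companion to the chart rather than a replacement.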

There are, of course, external factors to take into account as well. The audience has changed, search algorithms have changed, social media newsfeed algorithms (and social media channels) have changed. We can’t ignore those, but we also can’t do much about them.

Let’s take our data and make it a little easier to see by changing the axes from linear to logarithmic and putting some median lines on it:

Content quadrants

Ah ha! Now we have four basic quadrants of content quality. In the lower left, we have content that has relatively few words and low time on page. That’s normal; those would be good pages to beef up, perhaps, especially those getting more traffic already.

In the upper left, we have pages with high time on page and low word counts. Those are definitely pages to look at for opportunities to improve them.

In the upper right, we have pages with high time on page and high word counts. These are the pages that are behaving as expected.

In the lower right, we have the problem pages – high word counts and low time on page. These are the pages people are simply not sticking around for.
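The four quadrants above can be assigned in a few lines of code by splitting on the median of each variable. This is a sketch under a couple of assumptions: the page records and field names are hypothetical, and a page sitting exactly on a median is counted on the low side, which is an arbitrary choice. The logarithmic axes only change how the chart is drawn; which side of a median line a page falls on is the same either way.

```python
from statistics import median


def assign_quadrants(pages):
    """Label each page by its position relative to the two median lines.

    Word count is the horizontal axis (left/right) and average time on
    page is the vertical axis (lower/upper). Ties count as the low side.
    """
    word_median = median([p["word_count"] for p in pages])
    time_median = median([p["avg_time_on_page"] for p in pages])
    labeled = []
    for p in pages:
        horizontal = "right" if p["word_count"] > word_median else "left"
        vertical = "upper" if p["avg_time_on_page"] > time_median else "lower"
        labeled.append({**p, "quadrant": f"{vertical} {horizontal}"})
    return labeled


# Hypothetical pages, one per quadrant.
sample = [
    {"page": "/short-skimmed", "word_count": 300, "avg_time_on_page": 30.0},
    {"page": "/short-sticky", "word_count": 400, "avg_time_on_page": 200.0},
    {"page": "/long-read", "word_count": 1500, "avg_time_on_page": 180.0},
    {"page": "/long-skipped", "word_count": 2000, "avg_time_on_page": 25.0},
]

for row in assign_quadrants(sample):
    print(row["page"], "->", row["quadrant"])
```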

What Next?

Now that we know which pages are potentially problematic, we export them and start digging in:

  • Is the page quality really bad?
  • Is the writing so bad that it needs to be completely rewritten?
  • Is the topic so far off what I want to be known for that the page should just be retired?
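For the export step, here is one possible sketch: pull out the high word count, low time on page group and write it out as CSV for review. The records and field names are hypothetical, and sorting by sessions so the busiest problem pages come first is my own assumption about how to prioritize, not a rule.

```python
import csv
import io
from statistics import median


def problem_pages(pages):
    """Pages above the median word count and at or below the median time on page."""
    word_median = median([p["word_count"] for p in pages])
    time_median = median([p["avg_time_on_page"] for p in pages])
    flagged = [
        p for p in pages
        if p["word_count"] > word_median and p["avg_time_on_page"] <= time_median
    ]
    # Assumption: review the highest-traffic problem pages first.
    return sorted(flagged, key=lambda p: p["sessions"], reverse=True)


def to_csv(rows):
    """Serialize the flagged pages as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["page", "sessions", "word_count", "avg_time_on_page"],
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical data for illustration only.
sample = [
    {"page": "/a", "sessions": 900, "word_count": 300, "avg_time_on_page": 30.0},
    {"page": "/b", "sessions": 500, "word_count": 400, "avg_time_on_page": 200.0},
    {"page": "/c", "sessions": 700, "word_count": 1500, "avg_time_on_page": 180.0},
    {"page": "/d", "sessions": 1200, "word_count": 2000, "avg_time_on_page": 25.0},
    {"page": "/e", "sessions": 300, "word_count": 1800, "avg_time_on_page": 40.0},
]

export = to_csv(problem_pages(sample))
print(export)
```

To save the result, write `export` to a file and open it in your spreadsheet, then work through the questions above page by page.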

The good news is this analysis can be done in any modern spreadsheet software, combining the data from Google Analytics with data from an SEO tool. Try it for yourself, and see if you can shed some light on what content isn’t carrying its weight.


Want to read more like this from Christopher Penn? Get updates here:

subscribe to my newsletter here

