As is customary this time of year, we look back at the year that was. What were the top posts of the year? What caught your eye, convinced you to click, and compelled you to share?
I’m a fan of looking at our marketing through more than one lens. This week, we’ll walk through my blog from a few different points of view, and see which perspectives make the most sense. We’ll conclude by using what we’ve learned to set a strategy for 2018.
Which Metrics Should We Focus On?
All the metrics we’ve examined so far are activities, from sharing to SEO to page visits. Nothing connects these activities to the bottom line yet. However, if we augment our data with page value, we start to see the greater impact of our content. Before we begin using page value, it’s important to understand Google’s perspective on how page value is computed. Essentially, page value is the value of a goal spread across the average number of pages users visit on their way to achieving that goal.
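To make that concrete: Google Analytics documents page value as (transaction revenue + total goal value) divided by the unique pageviews of the page. A quick sketch with hypothetical numbers:

```r
# Page value per Google Analytics' documented formula:
# (transaction revenue + total goal value) / unique pageviews of the page.
# All figures below are hypothetical.
transaction_revenue <- 1200  # e-commerce revenue from sessions that viewed the page
total_goal_value    <- 500   # goal value credited to those sessions
unique_pageviews    <- 850   # unique pageviews of the page

page_value <- (transaction_revenue + total_goal_value) / unique_pageviews
round(page_value, 2)  # 2
```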
Thus, pages which are part of more conversion paths overall are worth more than pages which are only occasionally part of the path to conversion. Let’s add page value into our basic metrics and see which pages are the most valuable in my blog for 2017.
Well, that didn’t clear anything up, did it? We now have four sets of metrics – search, social, traffic, and value – and very different pages leading the charts for each.
What should we do? How do we know what’s really important here? We could guess, certainly, or concoct a fanciful way of arranging these lists to tell a semblance of a story. It’s not difficult to imagine someone saying, “Well, clearly SEO is how people find things today, and social media is how we discuss what we find, so let’s find the top pages from our SEO data and map them to the social media pages…”
However, this makes a number of unproven assumptions. Instead, what we should do is some math. We should use supervised learning to determine what, out of all these factors and variables, truly contributes to page value.
The simplest way to look at this data is to build a correlation matrix, where we check every variable to see how it correlates with our target outcome, page value. We’ll use the R statistical programming language here and its cor() function to build our correlation matrix:
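As a sketch, here’s what that looks like in R. The data frame and its column names are hypothetical; the real data would come from Google Analytics and social/SEO tool exports:

```r
# Hypothetical per-page metrics; a real analysis would load these from
# Google Analytics and social/SEO tool exports.
pages <- data.frame(
  organic_searches = c(120, 45, 300, 10, 80),
  social_shares    = c(30, 200, 15, 5, 60),
  pageviews        = c(500, 700, 900, 50, 400),
  page_value       = c(2.50, 1.10, 4.75, 0.10, 1.90)
)

# cor() on a numeric data frame returns the full correlation matrix
corr_matrix <- cor(pages)

# The column we care about: how each metric correlates with page value
round(corr_matrix[, "page_value"], 2)
```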
Well, we’ve got a great big, colorful correlation matrix, but when we look at the page value row, we see very little meaningful correlation. Are we stuck?
Not necessarily. Page value, after all, is an aggregate metric. It’s the sum total of a user’s valuable experiences from their first encounter with us until they do something of value. Expecting a simple correlation to reveal a magic bullet answer is foolhardy. Only in the simplest of businesses could we expect such an occurrence.
If correlation doesn’t answer our question, what else might?
Multiple Linear Regression
Our next step is to perform what’s known as multiple linear regression. We attempt to find relationships between our target variable and its input variables. Using R’s lm() function and the leaps() function from the leaps package, we perform an iterative regression of every possible combination of our variables:
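A minimal base-R sketch of the idea (the leaps package does this far more efficiently) fits a model on every non-empty combination of predictors and compares adjusted R-squared, again on hypothetical per-page metrics:

```r
# All-subsets regression sketched in base R; leaps does this efficiently,
# but the underlying idea is simply "fit every combination of predictors".
pages <- data.frame(
  organic_searches = c(120, 45, 300, 10, 80, 220, 15, 95),
  social_shares    = c(30, 200, 15, 5, 60, 90, 12, 70),
  pageviews        = c(500, 700, 900, 50, 400, 850, 75, 300),
  page_value       = c(2.50, 1.10, 4.75, 0.10, 1.90, 3.60, 0.25, 2.10)
)

predictors <- setdiff(names(pages), "page_value")

# Fit lm() on every combination of predictors; record adjusted R-squared
results <- c()
for (k in seq_along(predictors)) {
  for (combo in combn(predictors, k, simplify = FALSE)) {
    fit <- lm(reformulate(combo, response = "page_value"), data = pages)
    results[paste(combo, collapse = " + ")] <- summary(fit)$adj.r.squared
  }
}

# Rank candidate models from best to worst fit
sort(results, decreasing = TRUE)
```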
In addition to producing nearly unreadable output, this analysis isn’t helpful. We can rule out certain variables (the thick black columns) from the model, but we’re left with a significant amount of messy information to interpret.
Additionally, this combination of regressions doesn’t take into account dependencies. Think about our behavior online. Not all interactions are equal. Some interactions are more important than others. Some are dependent on others. We cannot, for example, evince interest or desire for a product or service if we are unaware of it.
So, regression isn’t the answer. What could be?
A machine learning technique, albeit a simple one, known as the random forest is likely to help us solve this mystery. Random forests are another way to iterate through all our data and every combination, but instead of simply combining different metrics together as is, random forests help us understand dependencies better. Using the randomForest package in R, we construct a forest and ask the software which variables are most important to page value as an outcome:
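A sketch of that step with the randomForest package; the data frame and its column names (including bitly_clicks) are hypothetical stand-ins for the real exports:

```r
# Variable importance via the randomForest package; the data and column
# names (including bitly_clicks) are illustrative, not real exports.
library(randomForest)

set.seed(42)  # random forests are stochastic; fix the seed for reproducibility
pages <- data.frame(
  organic_searches = c(120, 45, 300, 10, 80, 220, 15, 95, 60, 140),
  social_shares    = c(30, 200, 15, 5, 60, 90, 12, 70, 25, 110),
  bitly_clicks     = c(40, 90, 55, 2, 35, 75, 8, 50, 20, 65),
  page_value       = c(2.50, 1.10, 4.75, 0.10, 1.90, 3.60, 0.25, 2.10, 1.30, 2.80)
)

# importance = TRUE computes permutation importance (%IncMSE): how much
# prediction error grows when a variable's values are shuffled
rf <- randomForest(page_value ~ ., data = pages,
                   importance = TRUE, ntree = 500)

importance(rf)   # numeric importance table
varImpPlot(rf)   # the chart of variable importance
```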
We have here a much easier-to-understand output – and one that’s almost prescriptive.
At the top of the forest, on the upper right, organic searches of a page are the top driver of page value. For pages where organic search isn’t the only way our audience finds our content, we see that total social media shares combined with organic searches provide a second tier of value. Beyond that, we see that bit.ly clicks matter as a tertiary driver of value.
We now have clear advice. To maximize page value, we should focus on increasing organic searches to our pages (most valuable pages first), followed by an emphasis on social media sharing with a bias towards clickthroughs (since I use bit.ly as my link shortener).
Next: Tackling Those Searches
Now that we’ve solved the mystery of what drives our page value, what makes the blog valuable, we move on to what’s next. What should I do to increase those organic searches, those social media shares, etc.? Stay tuned!