Search engine optimization, or SEO, has changed significantly in the past few years. Thanks to the power of machine learning and artificial intelligence, the only way to build a sustainable, long-term SEO strategy is to create content people want to share. To combat these new trends, we need to employ our own machine learning technology to find what works and perform modern SEO at scale.
In this post, we’ll look at steps 3 and 4: validation and selection.
Validating Our Phrase Research
We know our space fairly well, yes? From our lists of words and phrases, we have a sense of which are relevant and which are not, and we now have that narrowed-down list. Let’s use a common term in the business field, robotic process automation, as our example.
Using any keyword evaluation tool, such as the AdWords Planner, SpyFu, SEMrush, or Moz, we will evaluate the keywords along two dimensions: volume and difficulty.
Difficulty is how competitive a keyword is. If we want to achieve some level of visibility, we need to choose a theme or topic where we’re not fighting against massive, well-funded competitors if possible.
Volume is how much interest a keyword has from the audience. A keyword with no search volume is useless; we will be #1 for something no one cares about.
Let’s return to our favorite clustering algorithm, k-means clustering, and build out our data into 4 clusters:
- Low volume, low difficulty: maybe something we create on a rainy day
- Low volume, high difficulty: avoid
- High volume, high difficulty: maybe something we find an angle for later
- High volume, low difficulty: the gold mine where we can make an impact
More experienced data scientists may want to experiment with other clustering methods, such as hierarchical or distribution-based clustering. The advantage of k-means centroid clustering here is the ability to pre-define a set of 4 clusters (normally a disadvantage of centroid methods), which gives us actionable divisions of the data.
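The four-way split described above can be sketched in a few lines of Python. This is a minimal illustration, not the exact workflow used for this post: the keyword rows and their numbers are made up, the log-and-rescale step is an assumption added so that volume does not dominate difficulty, and the tiny `kmeans` helper uses a deterministic farthest-point start so the example is repeatable. In practice, a library implementation such as scikit-learn's `KMeans` would do the same job.

```python
import math

# Hypothetical export from a keyword tool: (phrase, monthly volume, difficulty 0-100).
# These numbers are invented purely for illustration.
KEYWORDS = [
    ("rpa software", 12000, 30),
    ("what is rpa", 9000, 25),
    ("robotic process automation", 15000, 85),
    ("rpa tools", 11000, 80),
    ("rpa for invoice matching", 90, 20),
    ("rpa for small law firms", 60, 15),
    ("rpa analyst report pdf", 80, 90),
    ("enterprise rpa vendor list", 110, 85),
]


def scale(values):
    """Min-max scale a list of numbers to the 0..1 range."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0  # avoid dividing by zero when all values match
    return [(v - lo) / span for v in values]


def kmeans(points, k=4, max_iter=100):
    """Basic k-means (Lloyd's algorithm) on a list of (x, y) tuples."""
    # Assumption: deterministic farthest-point initialization for repeatability.
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centroids))
        )
    labels = []
    for _ in range(max_iter):
        # Assign every point to its nearest centroid.
        labels = [
            min(range(k), key=lambda i: math.dist(p, centroids[i])) for p in points
        ]
        # Move each centroid to the mean of its members.
        updated = []
        for i in range(k):
            members = [p for p, lab in zip(points, labels) if lab == i]
            if members:
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                updated.append(centroids[i])  # keep an empty cluster where it was
        if updated == centroids:
            break
        centroids = updated
    return labels, centroids


def cluster_keywords(rows, k=4):
    """Cluster keywords on scaled (volume, difficulty) and name each cluster."""
    volume = scale([math.log10(v + 1) for _, v, _ in rows])  # volume is heavy-tailed
    difficulty = scale([d for _, _, d in rows])
    labels, centroids = kmeans(list(zip(volume, difficulty)), k)

    def name(centroid):
        # Assumption: a cluster is "high" on an axis if its centroid is past the midpoint.
        vol = "high volume" if centroid[0] >= 0.5 else "low volume"
        dif = "high difficulty" if centroid[1] >= 0.5 else "low difficulty"
        return f"{vol}, {dif}"

    return {kw: name(centroids[lab]) for (kw, _, _), lab in zip(rows, labels)}


for phrase, cluster in cluster_keywords(KEYWORDS).items():
    print(f"{phrase}: {cluster}")
```

On real data the clusters will not always line up neatly with the four quadrants, and two clusters can end up with the same name under this midpoint rule. Treat the names as a starting point for review rather than a final verdict.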
A Note On Volume
The example above uses just over one hundred keywords. A human could analyze that small a volume of keywords in a relatively short time; machine learning tools and statistical clustering are probably unnecessary for such a small data set.
However, once we begin to explore all the different topic areas of a business, our keyword list is likely to expand to the thousands, if not millions, of words and phrases. At that point, not only is validation through software a good idea, it’s necessary.
What does a failure of validation look like? Validation fails in two ways: black hole and red ocean.
Black hole failure: If a substantial share, or all, of our keywords come back with little to no volume, then we know we need to restart the process from the beginning. We’ve got an overall topic or theme that no one cares about: a black hole into which we pour effort that will never yield impact.
Red ocean failure: If a substantial share, or all, of our keywords are extremely difficult, then we must restart the process or refine our topic or theme. Chances are it’s too broad, and thus we will be unable to generate any significant impact against massive competition.
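The two failure modes lend themselves to a simple automated check. The sketch below is one way to express them, under stated assumptions: the cutoffs (`min_volume`, `max_difficulty`) and the `share` that counts as "substantial" are hypothetical values chosen for illustration, and would need tuning to the tool and market in question.

```python
def validation_failure(rows, min_volume=100, max_difficulty=70, share=0.8):
    """Return "black hole", "red ocean", or None for a keyword list.

    rows: iterable of (phrase, monthly volume, difficulty 0-100).
    All three thresholds are illustrative assumptions, not industry standards.
    """
    rows = list(rows)
    if not rows:
        raise ValueError("no keywords to validate")

    # Fraction of keywords with little to no search volume.
    low_volume = sum(1 for _, volume, _ in rows if volume < min_volume) / len(rows)
    # Fraction of keywords that are extremely difficult to rank for.
    too_hard = sum(1 for _, _, diff in rows if diff > max_difficulty) / len(rows)

    if low_volume >= share:
        return "black hole"  # nobody is searching for this theme
    if too_hard >= share:
        return "red ocean"   # the theme is too broad or too contested
    return None              # list passes validation


# Made-up sample lists to show each outcome.
NO_INTEREST = [("a", 10, 20), ("b", 0, 35), ("c", 40, 10), ("d", 5, 90), ("e", 500, 30)]
TOO_CROWDED = [("a", 5000, 90), ("b", 800, 85), ("c", 1200, 95), ("d", 300, 75), ("e", 900, 40)]
HEALTHY = [("a", 5000, 30), ("b", 800, 85), ("c", 50, 20), ("d", 300, 45), ("e", 900, 60)]

for label, sample in [("no interest", NO_INTEREST), ("too crowded", TOO_CROWDED), ("healthy", HEALTHY)]:
    print(label, "->", validation_failure(sample))
```

Checking volume first is a deliberate choice in this sketch: a list that is both unsearched and highly contested is reported as a black hole, since no amount of competitive effort helps if no one is looking.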
Once our keywords are clustered and validated, we begin with our green keywords, the most valuable ones, with high volume and relatively low difficulty. We’d then move to the yellow keywords, where the tradeoff between volume and difficulty is more significant. Finally, we’d look in the blue and red clusters for some opportunities, knowing they will be few and far between.
From here, we’re ready to begin the process of extraction, which we’ll cover in the next post. Stay tuned!
Want to read more like this from Christopher Penn? Get updates here: