One of the earliest parts of Google’s algorithm was PageRank, a network graph that looked at who was most linked to as a proxy for which sites should rank highest for a given search term. While PageRank has evolved along with the rest of Google’s algorithm, it’s still very much part of the company’s search DNA.
Which raises the question: why don’t more SEO tools display link graph data themselves? Many of them have the data in some fashion or format. Why don’t more technical SEO marketers use link graph data as part of their SEO strategy?
Let’s dig into this a bit more and see if we can come up with some answers.
What is a Network Graph?
First, let’s define a network graph. A network graph is essentially a graph of relationships, a diagram of how different entities relate to each other.
A network graph is simply a way to visualize these relationships:
Inside a network graph, you have two kinds of entities, nodes and edges. Nodes are the things themselves – people, websites, social media handles, whatever. Edges are the connections between the nodes. If I link to Trust Insights from my blog, that’s an edge. If Trust Insights links back to my site, that’s an edge, too. Edges can be one-directional or bi-directional.
In the example above, we see four sites. Site A has two links going out and none coming in. Site B has one link coming in and two links going out. Site C has two links coming in and one link going out. Site D has two links coming in and no links going out. In this very rudimentary example, the site that’s most authoritative here is Site D, if you were to use the most primitive form of the PageRank algorithm on this network graph.
In the case of SEO, the more sites that link back to my website, the more my site is perceived as authoritative and trusted by the network graph portion of Google’s algorithm. That’s why SEO folks have been saying for nearly two decades that building links to your website is a critical part of SEO, one of the most important things you can do.
How Has PageRank Changed?
PageRank used to be the heart of Google’s algorithm, the core of everything it did. Over the years, as black hat (malicious or unethical) SEO folks have tried to trick the network graph, from link spam to outright website hacks, Google’s algorithm has adapted like an immune system to devalue more and more of what marketers can do to influence the algorithm:
As early as 10 years ago, Google started rolling out massive changes that dramatically reduced the value of both black hat techniques and linking techniques that were too easy to game, like buying press releases.
Today, over 200 different data points go into Google’s search rankings, and from there its machine learning models engineer many more behind the scenes that neither we nor Google even fully understand, thanks to the nature of deep learning models. However, we do know that quality inbound links still do matter, still do strongly influence the model. Google technical folks have said as much in very recent interviews and on their podcasts.
What Do We Do With Network Graphs?
So how do we make use of this information? How do we turn a concept into a functional reality? Let’s look at applying network graphing theory to real data. Suppose I want to rank for the term “marketing analytics”. I’d go into my SEO tool of choice (use pretty much any major vendor, this part is all the same) and see who ranks for those terms:
So far, so good. Now the question is, what kinds of inbound links help Google recognize these sites as authoritative? To understand that, we need to extract who links to them. Most modern SEO tools will allow you to extract backlinks, or the sites that link to a website. So what we’d do is export all the sites who link to this list of the top 10-20 results; because some of them are quite large, we might want to filter the links to be specifically about analytics or isolate those publications which create content about analytics frequently; doing so dramatically reduces the amount of data we need to process.
Once we’ve narrowed down our huge collection of backlinks, we need to reformat them to a list of edges and a list of nodes, then feed that data to network graphing software. For non-programmers, the open-source application Gephi is probably the best bet. For programmers, chooes the appropriate libraries in the coding language of your choice; I’ve become a fan of tidygraphs for the R programming language.
Feed the nodes and edges list into your graphing application and visualize it. It should look something like this, in a visualization:
You’ll know you have it correct when you see a graph network that looks like a constellation, with a handful of really connected hubs – those are our starting sites we put into our software – and then all the sites that link to them, helping boost their authority.
If we switch to our data view and use a measure of centrality that’s mathematically close to what Google used for its PageRank, eigenvector centrality, we can then rank all the sites granting links to our targets to understand which ones are the most valuable and important:
This is now our to-do list, our punch list of sites to go acquire links from. Some of them will be very straightforward; creating content on Medium.com, for example, is very simple to do. Creating a new repo on Github and ensuring we have code freely available is straightforward. Other sites will require the help of an SEO firm or a PR agency to obtain placements, to obtain credible links.
Why Is This Better?
Most modern SEO tools have some kind of link building feature, some way of identifying which sites to approach for building links. However, often these features use algorithms that are substantially different than the way Google uses network graphs. They use much lower-computational methods like basic link intersections, which often tends to bubble up the same very high authority sites over and over again – the New York Times, Wikipedia, Forbes, Gartner, etc. While this is factually correct – obtaining high value links from the New York Times would be wonderful for any site – it’s not attainable by most companies.
By using a network graph with a similar algorithm to Google’s, we can explore in a more nuanced way what sites help boost authority, and then pursue them. While some of the publications are still top-tier (and thus difficult), many others are not and accept contributions of useful content.
This technique also helps focus your agencies and in-house teams’ efforts. While they try to land the big fish – like the New York Times – they can also focus on the relevant industry publications that appear authoritative. They’ll reap more for their efforts and in a shorter period of time than focusing only on top-tier sites alone.
Work with your analysts and programmers to follow the steps outlined above, and see how your link building efforts change – and hopefully become more productive and valuable.
You might also enjoy:
- Transformer les personnes, les processus et la technologie - Christopher S. Penn - Conférencier principal sur la science des données marketing
- It's Okay to Not Be Okay Right Now
- Marketing Data Science: Introduction to Data Blending
- How I Think About NFTs
- B2B Email Marketers: Stop Blocking Personal Emails
Want to read more like this from Christopher Penn? Get updates here:
Get your copy of AI For Marketers