
The Most Important Question in Attribution Analysis

One of the most important questions you can ask a prospective customer is one almost no one asks:

“How did you hear about us?” (or one of its many variations, like “What made you come in today?”) is a question we don’t ask nearly enough.

Why? Why don’t we ask this critical question, a question that is the linchpin of attribution modeling? After all, nothing cements attribution analysis more than answers to an unaided recall question. If you can’t remember how you heard of a company, then that company’s marketing is clearly not very good.

More important, asking people how they heard about us helps us understand our attribution models much better, because asking people what they remember accounts for interactions that may not be captured in digital marketing analytics.

So why isn’t this best practice universal? Here’s one reason companies don’t do this as often as they should: analyzing the answers can take some time if you’re collecting them correctly. Let’s look at an example.

Attribution Walkthrough

I’ve been collecting answers to this question for my newsletter for several years now:

Entry form

And this is why companies struggle to use this information:

Entry responses

The answers we get from a free-form response are wide and varied – so wide that analyzing them requires a decent amount of effort. Happily, you can use a huge assortment of tools to help categorize the answers; many of them will be semantically similar.

For example, in Excel, you could create a chained COUNTIF statement and tally up words for different categories. I do the same thing programmatically in the R programming language, but you don’t need to use programming software. Here’s an example of how I bucketed the different terms:

R list of terms

An example in Excel of one of these would be something like =COUNTIF(A2, "*spin sucks*") + COUNTIF(A2, "*gini*") in a cell in a column. COUNTIF matches text without regard to case, so there’s no need to lowercase the cell first. This will help you tag and categorize responses in a series of columns for further analysis.
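For readers who would rather script this step, here is a minimal Python sketch of the same keyword-bucketing idea. It is not the R code mentioned above; the bucket names and keyword lists are hypothetical, loosely based on the sources named in this post, and `categorize` is an illustrative helper name.

```python
# Hypothetical keyword buckets: the category names and terms are
# illustrative assumptions, not the author's actual list.
BUCKETS = {
    "spin sucks": ["spin sucks", "gini"],
    "everybody writes": ["everybody writes", "ann handley"],
    "referral": ["colleague", "friend", "coworker", "recommended"],
    "speaking": ["conference", "keynote", "spoke", "speaking"],
    "social media": ["twitter", "facebook", "linkedin"],
}


def categorize(response: str) -> list[str]:
    """Return every bucket whose keywords appear in the response.

    Matching is a case-insensitive substring test, the same idea as a
    wildcard COUNTIF. A response can land in more than one bucket;
    a response that matches nothing is labeled "unknown".
    """
    text = response.casefold()
    matches = [
        bucket
        for bucket, terms in BUCKETS.items()
        if any(term in text for term in terms)
    ]
    return matches or ["unknown"]
```

Like the wildcard formula, substring matching can overmatch on short terms: "gini" also appears inside "Virginia", so it is worth spot-checking the tagged rows before trusting the totals.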

Once we tabulate the results, we should end up with something that looks like this:

Results chart

This tells us several things:

  1. We’ve still got more work to do on the categories; there are more unknowns than any other single topic for this dataset.
  2. Three of the top five sources are sources where there won’t be digital attribution: referrals from a colleague/friend, Ann Handley’s book Everybody Writes, and speaking.
  3. Social media plays a fairly large role, larger than I’d expect.
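The tabulation behind a chart like this can be sketched in a few lines of Python. This assumes each response has already been tagged with one or more category labels; the function names and the sample labels are hypothetical placeholders, not the newsletter's real data.

```python
from collections import Counter


def tabulate(tagged: list[list[str]]) -> list[tuple[str, int]]:
    """Count how many responses fall in each category, largest first."""
    # set() ensures a response counts once per category even if a label repeats.
    counts = Counter(label for labels in tagged for label in set(labels))
    return counts.most_common()


def unknown_share(tagged: list[list[str]]) -> float:
    """Fraction of responses that matched no category at all."""
    if not tagged:
        return 0.0
    return sum(labels == ["unknown"] for labels in tagged) / len(tagged)


# Made-up placeholder labels, for illustration only.
sample = [
    ["referral"],
    ["unknown"],
    ["referral", "social media"],
    ["unknown"],
    ["unknown"],
]
print(tabulate(sample))
print(unknown_share(sample))
```

A high unknown share is the signal that the keyword buckets need another pass before the remaining categories can be trusted.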

Now, let’s take a look at a digital customer journey for newsletter subscriptions for the same period of time.

Digital attribution model

We note here that organic search is at the top of this particular model. Why is it so much more prominent here than in the version above, which uses user input?

Logically, if someone recommends something to you, what’s the first thing you’ll do? If someone says, “hey, you should check out Chris Penn’s newsletter”, what will you probably do?

Search for it

You will probably search for it. This exemplifies why surveying and asking people questions using unaided recall is so important for attribution models.

Take a moment to give this serious thought. If I think organic search is driving all my results – which by the digital model, it is – what action would I take? I’d optimize pages. Build links. Do guest posts. All the SEO tactics that are best practices, known, effective methods for generating inbound organic searches.

But I’d be wrong, wouldn’t I? Because colleagues and friends are referring me, Ann Handley’s book is referring me, speaking on stage is referring me to others. All of those referrals happen offline, so their natural output in a digital attribution model is organic search. The reality is, SEO isn’t working for me – referrals are! They’re just showing up as search because the referrals are in offline places.

The same is true for social media. On my digital attribution model, social media drives a handful of conversions. But in the survey data, it’s the fourth-largest source. Why? Why is there such a disparity?

Let’s look at a sample of some of the answers:

Social media answers

Well then. Some of these are Facebook groups, some of these are Twitter chats – and those are types of social media where there might not be a clickstream, a linear journey from click to click that we can follow. In fact, some of these are reputational answers, which means it’s entirely possible that they too came in from organic search. If you’ve ever seen something on Facebook or LinkedIn and then had to go search for it, you know exactly what is happening here.

By analyzing the responses people give me on my forms, I now know what’s driving the digital attribution model’s results, and I can calibrate my efforts accordingly. For example, I should be featuring Ann’s book more prominently if I want to drive newsletter subscribers.
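One rough way to put the two views side by side is to convert each set of counts into shares and look at the gap per channel. The sketch below is an assumption about how that comparison could be done, not the method used for the charts in this post; the channel names and every number in it are made-up placeholders.

```python
def shares(counts: dict[str, int]) -> dict[str, float]:
    """Convert raw counts into fractions of the total."""
    total = sum(counts.values())
    return {k: v / total for k, v in counts.items()} if total else {}


def share_gaps(survey: dict[str, int], digital: dict[str, int]) -> dict[str, float]:
    """Survey share minus digital share for every channel in either source.

    A positive gap means people recall the channel more often than the
    clickstream credits it; a negative gap means the reverse.
    """
    s, d = shares(survey), shares(digital)
    return {c: s.get(c, 0.0) - d.get(c, 0.0) for c in sorted(set(s) | set(d))}


# Made-up placeholder counts, for illustration only.
survey_counts = {"organic search": 10, "social media": 30, "referral": 60}
digital_counts = {"organic search": 70, "social media": 5, "referral": 25}

for channel, gap in share_gaps(survey_counts, digital_counts).items():
    print(f"{channel}: {gap:+.2f}")
```

Channels with a large negative gap, as organic search has in these placeholder numbers, are the ones most likely to be absorbing credit for offline referrals.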

Key Takeaway: Ask!

If your data collection on forms and other transactions does not include a freeform way to ask people how they heard about you, or what motivated them to do business with you, then half your attribution model may be missing.

Take time to implement this critical question in as many places as practical in your business, and then take the time to analyze the data. You’ll be surprised at what people remember about you – and you can use that data to calibrate your marketing efforts.

And a special thank you goes out to Ann Handley for Everybody Writes. If you don’t already subscribe to Ann’s newsletter, you should.

