You Ask, I Answer: New Insights from Old Data with Marketing Data Science?

Balabhaskar asks, “How can we use marketing data science to get more insights from the same old data or the few data points available because of privacy laws?”

Blending new data with old data, especially credible third-party data, is one solution. The second solution is feature engineering. Both are vital parts of exploratory data analysis.

Machine-Generated Transcript

What follows is an AI-generated transcript. The transcript may contain errors and is not a substitute for watching the video.

In today’s episode, Balabhaskar asks, “How can we use marketing data science to get more insights from the same old data, or the few data points available because of privacy laws?” This is a very common question, particularly in light of all the changes in privacy laws like GDPR and CCPA.

You have less overall data to work with.

So the question is, what can you do in place of that? There are two tactics you can take.

First, what data do you have available, and is there credible third-party data you can use to augment it? For example, say you have anonymous search data for your site: you have a visitor, and you don’t have any identifiable information about them, but you do know how they found your site, say through keyword searches.

Can you then go out and get third-party data, like SEO data or social conversation data, to add to that, to help forecast it, to blend it in and get additional insights?

For example, if you know that someone is coming to your site for espresso drinks, and you were to do some historical trend analysis to figure out when people like that are most interested, could you use that data to infer some behaviors about this person?

And if you had a content recommendation engine, could you present them the next two or three most relevant articles on your site to help entice them, to provide them value, things like that?
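To make the blending idea concrete, here is a minimal sketch in plain Python. Everything in it is an assumption for illustration: the visit records, the field names, and the monthly interest numbers are made up and stand in for your own analytics export and whatever third-party SEO or trend data you trust.

```python
from collections import defaultdict

# First-party data: anonymous visits with the search keyword that brought
# each visitor in (hypothetical records, no PII).
site_visits = [
    {"keyword": "espresso drinks", "landing_page": "/espresso-guide"},
    {"keyword": "espresso drinks", "landing_page": "/espresso-guide"},
    {"keyword": "cold brew", "landing_page": "/cold-brew"},
]

# Third-party data: monthly search interest per keyword. These index values
# are invented and stand in for an SEO or trends export.
third_party_interest = {
    "espresso drinks": {"2020-01": 62, "2020-02": 70, "2020-03": 88},
    "cold brew": {"2020-01": 20, "2020-02": 25, "2020-03": 41},
}


def blend(visits, interest):
    """Join visit counts per keyword with that keyword's peak-interest month."""
    counts = defaultdict(int)
    for visit in visits:
        counts[visit["keyword"]] += 1

    blended = []
    for keyword, visit_count in sorted(counts.items()):
        trend = interest.get(keyword, {})
        # Month with the highest interest; None if no third-party data exists.
        peak_month = max(trend, key=trend.get) if trend else None
        blended.append(
            {"keyword": keyword, "visits": visit_count, "peak_month": peak_month}
        )
    return blended


blended_rows = blend(site_visits, third_party_interest)
for row in blended_rows:
    print(row)
```

Each output row pairs something you already know (how many visits a keyword drove) with something you borrowed (when interest in that keyword peaks), which is the kind of input a forecast or a content recommendation could start from.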

Blending of third-party data is essential because, as you pointed out, we don’t have as much data as we used to.

And honestly, a lot of that data is questionable in terms of its usefulness anyway.

So that’s one part.

The second part, which is a lot more valuable, is to do feature engineering.

So in data science and in machine learning, feature engineering is the process of extracting new data from the data you already have.

Now, there’s some feature engineering that may or may not be terribly useful.

For example, if you do have somebody’s name, engineering the number of characters in the name is not super helpful; it’s not going to be a very good predictor.

But if you just have an email address, for example, what are the things that you can figure out from it? You can figure out the top-level domain, like .com, .us, or .au.

You can figure out the host: [email protected] is at TrustInsights.ai.

Then you can determine whether that domain is a corporate domain or a consumer domain, and from there you can start to engineer out what those things have in common.

If you have marketing automation software, what percentage of the leads in your marketing automation software are consumer domains like Gmail and Hotmail?
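As a rough sketch of what engineering features out of an email address can look like, the snippet below derives the host, the top-level domain, and a consumer-versus-corporate flag. The function name, the output field names, and the short list of consumer providers are all assumptions made for illustration; a real list of consumer domains would be much longer.

```python
# Hypothetical, deliberately short list of consumer mailbox providers.
CONSUMER_DOMAINS = {"gmail.com", "hotmail.com", "yahoo.com", "outlook.com"}


def email_features(email):
    """Derive new fields from an email address without any extra data.

    Assumes the input contains an "@"; validation is out of scope here.
    """
    # Everything after the last "@" is the host (domain) part.
    _, _, host = email.strip().lower().rpartition("@")
    # The top-level domain is whatever follows the final dot.
    tld = host.rsplit(".", 1)[-1] if "." in host else ""
    return {
        "host": host,
        "tld": tld,
        "domain_type": "consumer" if host in CONSUMER_DOMAINS else "corporate",
    }


consumer_example = email_features("Someone@Gmail.com")
corporate_example = email_features("lead@example.co.uk")
print(consumer_example)
print(corporate_example)
```

Treating every domain that is not on the consumer list as corporate is a simplification. Schools, ISPs, and nonprofits would all land in that bucket, so you may want more categories once you see what is in your own data.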

How do leads from consumer domains perform differently from, say, corporate domains? Do they close faster? Do they close better? With something like that, you’re engineering out an understanding of that data point from just the email address alone.

Do people who read your emails click on them more from a Gmail domain than a Hotmail domain, or less? What other content do they download? Do they download more content or less than, say, somebody with a corporate domain?

Doing that kind of data analysis gets you insights into the data without adding new data to it, because you’re already collecting the behavioral data.

One of the things that we’ve been saying for a while, ever since 2017 when GDPR was first on people’s minds, is that in marketing in general we have to get away from collecting too much personally identifiable information and focus on collecting the behavioral data that really matters.

What does somebody do with our stuff? How many pages on our website do they visit? If you have really good marketing automation, you can tell the number of sessions that an identified email has had on site.

And when you engineer out more and more of the data around behavior, you start to get a much more clear picture about the types of people who visit your site, the types of people who do stuff that you want them to do.

And you can then improve your targeting and your marketing from that.

For example, if you were to engineer this information out of your data and you found that people with Gmail addresses converted at the same rate as people with corporate email addresses, where you have an identifiable company behind it, you might start running Gmail ads through Google, because it clearly works.

Right, that’s an email domain that works really well.
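The comparison between consumer and corporate domains can be sketched in a few lines. The lead records below are invented, and the field names `email_host` and `converted` are assumptions about what a marketing automation export might contain.

```python
from collections import defaultdict

# Hypothetical lead records; real ones would come from your own system.
leads = [
    {"email_host": "gmail.com", "converted": True},
    {"email_host": "gmail.com", "converted": False},
    {"email_host": "hotmail.com", "converted": False},
    {"email_host": "example.com", "converted": True},
    {"email_host": "example.org", "converted": False},
]

CONSUMER_DOMAINS = {"gmail.com", "hotmail.com"}  # illustrative, not exhaustive


def conversion_rate_by_domain_type(records):
    """Return {domain_type: share of leads that converted}."""
    totals = defaultdict(int)
    wins = defaultdict(int)
    for record in records:
        is_consumer = record["email_host"] in CONSUMER_DOMAINS
        kind = "consumer" if is_consumer else "corporate"
        totals[kind] += 1
        wins[kind] += int(record["converted"])
    return {kind: wins[kind] / totals[kind] for kind in totals}


rates = conversion_rate_by_domain_type(leads)
print(rates)
```

With only a handful of leads the rates mean very little. The point is the shape of the analysis: one engineered field (domain type) plus one behavior you already track (conversion).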

If you find that a certain service provider, BellSouth for example, does well, you might look at a display network like StackAdapt to see where BellSouth users go, if that data is available.

But it’s that engineering of the data that gets you more information without violating anyone’s privacy and without violating any privacy laws.

You don’t need that information to know what it is that somebody is doing.

And I guess the third thing that I would add to this is knowing what data you have, knowing what data is available.

A lot of marketers don’t; a lot of marketers kind of see only the top level of stuff that’s available.

You know, how many users visited our website yesterday, or how many people clicked on yesterday’s email.

And they don’t dig in.

If you dig in under the surface, just in Google Analytics, take a moment to think about this.

How many data points, how many variables, do you think are available in Google Analytics? How many data points for one user? 50? 100? The answer is 510.

There are 510 unique, distinct data points, categorical and continuous variables, in Google Analytics for somebody with no personally identifiable information: 510 things you know, like time on site, time on page, average page depth, all these different pieces of information.

And if you have that information, and you can extract it and then use tools like IBM Watson Studio, R, Python, or any of the data science tools that are out there to do multiple regression on it, you can ask: Okay, who are the most valuable users? What do they have in common? How many pages do they visit? How long do they spend on site?

If you can do that level of analysis, you can come up with valuable insights as to the pages people visit, the places they go, all these things.

That’s where you’re going to get new insights from old marketing data.

That’s where you’re going to get more insights on the same old data, to go back to Balabhaskar’s original question.
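For a sense of what multiple regression on behavioral variables involves, here is a minimal sketch in plain Python that fits ordinary least squares through the normal equations. It is not how Watson Studio, R, or a Python statistics library does it internally, and in practice you would reach for one of those tools. The session data is made up, and the target is built from a known formula so the result can be checked by eye.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix so row operations apply to both sides at once.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution, from the last row upward.
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def multiple_regression(features, target):
    """Ordinary least squares; returns [intercept, coef_1, coef_2, ...]."""
    rows = [[1.0] + list(f) for f in features]  # leading 1.0 is the intercept
    width = len(rows[0])
    # Normal equations: (X^T X) beta = X^T y
    xtx = [
        [sum(r[i] * r[j] for r in rows) for j in range(width)]
        for i in range(width)
    ]
    xty = [sum(r[i] * y for r, y in zip(rows, target)) for i in range(width)]
    return solve_linear_system(xtx, xty)


# Made-up sessions: (pages viewed, minutes on site).
sessions = [(1, 2.0), (3, 4.0), (5, 3.0), (2, 8.0), (6, 9.0)]
# Target built as 2 + 3 * pages + 0.5 * minutes, purely for illustration.
value = [2 + 3 * pages + 0.5 * minutes for pages, minutes in sessions]

intercept, per_page, per_minute = multiple_regression(sessions, value)
print(round(intercept, 6), round(per_page, 6), round(per_minute, 6))
```

Because the target was generated from the formula in the comment, the fit should recover coefficients of approximately 2, 3, and 0.5. Real behavioral data will be noisy, and the coefficients then tell you which variables move with value rather than reproducing a formula.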

We don’t need a ton of PII; we shouldn’t have it anyway, since it’s a security risk.

If we’re clever and we have the proper tools, we can extract a lot of this information that will help us make our marketing better.

If you want to learn more about this particular topic, I would strongly recommend learning feature engineering; I think it’s an incredibly valuable discipline.

You will typically find it in the process of exploratory data analysis, or just before the creation of a model in machine learning.

And there are a number of courses and things out there that have these aspects.

The one I recommend to people most is IBM’s free Cognitive Class system.

If you go to cognitiveclass.ai, you can take courses for free and learn all this stuff, and even get the cute little certification stuff.

That’s fun.

But you’ll learn the techniques you need to know.

The challenging part of feature engineering is that you have to be the driver of the engineering; you have to know what it is you’re asking the software to do. You’ve got to imagine, so it is just as much creative as it is computational.

So you need the technology skills, but you also need the creative mindset to ask: What else could we infer about this data based on the characteristics that we have available? You need to know, for example, that you can take a date and blow it up into year, month, day, day of the week, day of the month, day of the quarter, day of the year, week of the month, week of the quarter, week of the year, etc.

You can engineer a tremendous amount of additional data.

It requires you to be creative in thinking about it.
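Blowing up a date into its parts is easy to sketch with Python’s standard library. The function and field names here are made up for illustration, and the week-of-month convention (days 1 to 7 are week 1, and so on) is an assumption; other conventions are just as valid.

```python
from datetime import date

DAY_NAMES = [
    "Monday", "Tuesday", "Wednesday", "Thursday",
    "Friday", "Saturday", "Sunday",
]


def date_features(d):
    """Expand one date into several derived calendar fields."""
    quarter = (d.month - 1) // 3 + 1
    quarter_start = date(d.year, 3 * (quarter - 1) + 1, 1)
    return {
        "year": d.year,
        "month": d.month,
        "day_of_month": d.day,
        "day_of_week": DAY_NAMES[d.weekday()],  # weekday(): Monday is 0
        "quarter": quarter,
        "day_of_quarter": (d - quarter_start).days + 1,
        "day_of_year": d.timetuple().tm_yday,
        # Assumed convention: days 1-7 are week 1, days 8-14 are week 2, ...
        "week_of_month": (d.day - 1) // 7 + 1,
        "week_of_year": d.isocalendar()[1],  # ISO 8601 week number
    }


features = date_features(date(2020, 5, 15))
for name, value in features.items():
    print(name, value)
```

One column (a date) becomes nine, and each of the nine can be tested as a predictor: does this audience convert more on Fridays, or late in the quarter?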

So, really good question.

You could spend a whole lot of time on feature engineering; you could spend days on it.

But those are some good starting points to take a look at.

If you have follow up questions, leave them in the comments box below.

Subscribe to the YouTube channel and the newsletter.

I’ll talk to you soon. Take care.

Want help solving your company’s data analytics and digital marketing problems? Visit TrustInsights.ai today and let us know how we can help you.

