Katherine asks, “What’s the first thing or set of processes you do when you receive new data from a customer?”
What follows is an AI-generated transcript. The transcript may contain errors and is not a substitute for watching the video.
Christopher Penn 0:13
In today’s episode, Katherine asks, what’s the first thing or set of processes you do when you receive new data from a customer? Probably exploratory data analysis.
Alright, exploratory data analysis is the data science and machine learning equivalent of looking in the fridge before you cook.
Right? So you open up the fridge, you look at what’s in there, and you say, okay, I’ve got chicken, I don’t have steak, I’ve got onions, but I don’t have peppers, I’ve got carrots, but I don’t have celery, and so on and so forth.
And what you’ve got in the fridge dictates what kinds of things you are or are not going to cook.
If you’ve got your heart set on steak, but there’s no beef in the fridge, you’re not having steak, right?
So when a customer hands over new data, the first thing is you look at it, you investigate it, you say, okay, what’s in the box? What did the customer give me? What condition is it in? Is it in good condition or bad condition? Are there lots of missing variables or missing data points? Are things labeled correctly? Does the data answer the question that the customer is trying to ask? That’s a critical part of this. If a customer says, I want to know social media ROI, and they provide no cost data, you can’t do social media ROI. There’s just no way to do that. You’ve got a substantial missing ingredient, like baking a loaf of bread when you’ve got no flour.
Now, you’re probably not baking bread there.
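As a rough illustration of the "missing ingredient" check, here is a minimal Python sketch. The question name, the required field names, and the delivered column list are all hypothetical; a real engagement would define its own requirements for each question.

```python
# Hypothetical mapping from a business question to the fields it needs.
# These names are assumptions for illustration, not a standard schema.
REQUIRED_FIELDS = {
    "social_media_roi": {"date", "channel", "revenue", "cost"},
}


def missing_fields(question, columns):
    """Return the required fields that the delivered data does not contain."""
    required = REQUIRED_FIELDS[question]
    return sorted(required - set(columns))


# Columns as they might arrive in a customer's export (illustrative only).
delivered = ["date", "channel", "impressions", "revenue"]

gaps = missing_fields("social_media_roi", delivered)
if gaps:
    print("Cannot answer the question yet; missing:", ", ".join(gaps))
else:
    print("All required fields are present.")
```

With no cost column in the delivery, the check reports the gap before any analysis time is spent.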
So that’s the first part is exploratory data analysis.
And that’s, you know, eight different parts.
So you have your goal and your purpose.
You have your data requirements and data collection. You have your initial analysis, like looking at it, your descriptive analytics, to see what kinds of dimensions and metrics are there. You do your data quality checks: what kind of condition is the data in? Then there’s requirements verification: you look at the data and go, okay, does this answer the question that’s being asked of it?
And if it doesn’t, you got to start over.
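The descriptive and data quality steps above can be sketched in a few lines of Python. This is one possible first pass, using only the standard library; the `profile` function and the sample rows are made up for illustration, and a real dataset would usually call for more checks than these.

```python
from statistics import mean, median


def profile(rows):
    """Summarize each column: missing count, plus basic stats if all values are numeric."""
    columns = sorted({key for row in rows for key in row})
    report = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        # Treat None and empty strings as missing.
        present = [v for v in values if v not in (None, "")]
        summary = {"missing": len(values) - len(present)}
        numeric = [
            v for v in present
            if isinstance(v, (int, float)) and not isinstance(v, bool)
        ]
        if present and len(numeric) == len(present):
            summary.update(
                min=min(numeric),
                median=median(numeric),
                mean=mean(numeric),
                max=max(numeric),
            )
        report[col] = summary
    return report


# Made-up sample rows standing in for a customer delivery.
rows = [
    {"channel": "email", "clicks": 120, "cost": 40.0},
    {"channel": "social", "clicks": None, "cost": 55.5},
    {"channel": "", "clicks": 95, "cost": 30.5},
]

for column, summary in profile(rows).items():
    print(column, summary)
```

A report like this shows at a glance which columns have gaps and whether the numeric ranges look plausible.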
After that, you’ll do prep, which is cleaning, centering, scaling, etc. You’ll probably do some feature engineering, where you create new features out of existing ones, like day of week or hour of day from a date. And then your modeling or your insights, depending on whether you’re going to be pushing a model into production or just doing an analysis. Those are the steps that are vital.
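To make the prep and feature engineering steps concrete, here is a small Python sketch. It assumes z-score standardization as the centering and scaling method, and ISO 8601 timestamp strings as input. Both are assumptions for illustration, and the function names are hypothetical.

```python
from datetime import datetime
from statistics import mean, pstdev

DAY_NAMES = ["Monday", "Tuesday", "Wednesday", "Thursday",
             "Friday", "Saturday", "Sunday"]


def standardize(values):
    """Center values on their mean and scale by the population standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant column carries no spread; return zeros instead of dividing by zero.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def date_features(timestamp):
    """Derive day-of-week and hour-of-day features from an ISO 8601 timestamp string."""
    dt = datetime.fromisoformat(timestamp)
    return {"day_of_week": DAY_NAMES[dt.weekday()], "hour_of_day": dt.hour}


# Illustrative inputs only.
print(standardize([10, 20, 30]))
print(date_features("2021-10-17T14:30:00"))
```

The new day-of-week and hour-of-day fields can then be used like any other column, for example to compare performance by weekday.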
Anytime you get new data, it’s like anytime you get a delivery of groceries, right? You have a company that does the shopping for you, and they drop off the box on your doorstep.
And the first thing you do is you open the box and go, okay, did they get my order right? I ordered apples and there’s pineapples.
Okay, that’s, that’s not helpful.
That’s where you start.
Because that will also help avoid failure later on.
If a customer hands you data and there’s something wrong with it, the sooner you catch that, the less time and money you waste, right, the less beating your head against the wall.
Or, worst case scenario, you think the data is fine, you run an analysis on it, you hand off the results to a customer, and it’s wrong.
And it might be wrong in a subtle way in a way that you don’t catch.
But then, you know, a month, a quarter, a year later, the customer’s like, hey, our business is going down.
Why? Well, because you made an analysis of bad data.
Right? It’s like you eat something that tastes fine, and the next day, you’re sick.
Well, yeah, you ate some food that was contaminated.
It seemed fine, and the next day, you find out that that was not the case.
Or if it was like a really bad mushroom, you might die 10 days later, because it liquefied your internal organs, which can happen.
So that’s the first, most important part: you’ve got to open up that fridge and look inside and see, what do we have? And can it make the things that we want to make? If you skip that part, if you skip the exploratory data analysis, you will be in a world of hurt, because at some point, you will be handed data that isn’t clean, that isn’t complete, that isn’t correct.
And you will use it and you will lament your choices.
I guarantee it.
So that’s the first and most important step to do before you do anything else.
Thanks for asking.
If you like this video, go ahead and hit that subscribe button.