Jessica asks, “I struggle with forming hypotheses. Do I need more data to get better?”
Data probably isn’t the problem. The lack of a well-defined question you want answered probably is. Consider what a valid hypothesis is within the domain of marketing data science: a testable, verifiably true or false statement about a single condition. If you’re struggling, you may be:
– questioning non-testable things
– questioning more than one thing at the same time
– questioning things which cannot be controlled
– questioning things not based in prior knowledge
What follows is an AI-generated transcript. The transcript may contain errors and is not a substitute for watching the video.
In today’s episode, Jessica asks, “I struggle with forming hypotheses. Do I need to get better data or more data?” Data probably isn’t the problem here.
If you’re struggling with hypothesis formation or creation, chances are that you haven’t got a well-defined question.
So remember, the first three steps of the scientific method are to ask a question that you actually want an answer to; to define the question, meaning what information you’ll need and what information you have; and then to formulate a valid hypothesis that you can test.
A lot of the time when something goes wrong in data science, it is because we don’t have a well-defined question.
We have a question that maybe is idle speculation.
We have a question that we don’t really have background in.
And so we end up just making stuff up.
And that obviously leads to terrible results.
When it comes to hypothesis formation, consider what a valid hypothesis is. Within the domain of marketing data science, it is a testable, verifiably true or false statement about a single condition.
There are broader definitions in the scientific community, but for the purposes of marketing data science and getting marketers to use the scientific method, that’s the definition we’re going to go with.
So think about that statement: a testable, provably true or false statement about a single condition.
What are the things that are going to go wrong? Obviously, trying to test multiple conditions.
So suppose you submit a hypothesis like, “If tweets on Tuesdays and emails longer than 1,000 characters engage users, then we should see our bounce rates go down and our conversions go up.”
There’s a whole lot going on in there.
And that is impossible.
Well, it’s not impossible, but it’s very difficult to prove that statement.
Compare that with saying, “If emails sent on Tuesdays get more engagement, then tomorrow’s email, sent on Tuesday, should get increased engagement compared to an email sent on a different day.”
That is something that is provably true or false about a single condition.
We’re going to test sending an email on Tuesdays.
So that’s one of the things that can go wrong.
And it’s one thing that a lot of marketers assume is perfectly fine to do when it’s not.
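To make the single-condition idea concrete, here is a minimal sketch of how the Tuesday email hypothesis could be checked once the emails have gone out. It assumes engagement means opens and uses a one-sided two-proportion z-test; that choice of test, the function name `two_proportion_z_test`, and the counts are all illustrative assumptions, not something from the episode.

```python
import math


def two_proportion_z_test(success_a, total_a, success_b, total_b):
    """One-sided z-test: is the rate in group A higher than in group B?

    Returns the z statistic and the p-value for P(Z >= z).
    """
    rate_a = success_a / total_a
    rate_b = success_b / total_b
    # Pooled rate under the null hypothesis that both groups share one rate.
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    z = (rate_a - rate_b) / std_err
    # Upper-tail probability of the standard normal distribution.
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Hypothetical counts: opens out of emails delivered.
tuesday_opens, tuesday_sent = 260, 1000
other_day_opens, other_day_sent = 220, 1000

z, p = two_proportion_z_test(
    tuesday_opens, tuesday_sent, other_day_opens, other_day_sent
)
print(f"z = {z:.2f}, one-sided p-value = {p:.3f}")
```

Only one thing varies here, the send day, so a small p-value speaks to that one condition and nothing else.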
A second way your questions and your hypotheses go wrong is questioning things that can’t be controlled, testing things that can’t be controlled.
Confounding variables and confounding data are one of the biggest problems in marketing data science.
If you are trying to do post hoc analysis, meaning that you’ve got some data and now you’re trying to analyze it, but you didn’t set up an experiment and you didn’t control the conditions around it, it’s going to be very difficult to turn that into something usable.
So let’s say you’re in Google Analytics, and you’re asking, “Why did website traffic go down last month?”
That’s a good question.
And you start coming up with all this analysis and these theories about what happened to, say, your email marketing. Well, was the pay-per-click team doing something different? Was the social team doing something different? Were they running ads? It’s much more difficult to do analysis after the fact than to set up a properly controlled experiment.
That’s number two.
The thing that will go wrong with your hypothesis is that you don’t set up controlled experiments, to the extent that you can, obviously, within large, complex websites and other digital marketing channels.
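One common way to control conditions up front is to decide who gets the change before any data comes in. The sketch below splits users into two groups by hashing their IDs, so the split does not depend on anything a person or another team did. The function name `assign_group`, the experiment label, and the 50/50 split are assumptions for illustration only.

```python
import hashlib


def assign_group(user_id: str, experiment: str = "email_send_day") -> str:
    """Deterministically place a user in 'treatment' or 'control'.

    Hashing the experiment name together with the user ID gives a stable,
    roughly even split that does not depend on user behavior.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    return "treatment" if int(digest, 16) % 2 == 0 else "control"


# Hypothetical subscriber IDs.
subscribers = [f"user-{i}" for i in range(1000)]
groups = [assign_group(s) for s in subscribers]
print("treatment:", groups.count("treatment"))
print("control:", groups.count("control"))
```

Because the assignment is deterministic, the same subscriber always lands in the same group, which keeps the comparison clean if the experiment runs over several sends.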
The third thing is questioning and trying to test non-testable things.
There are things you can’t test because the data is not available, or because, since fundamentally in marketing we’re dealing with human beings, there are some things that are so subjective that you can’t really test them.
Well, not scientifically.
A good example: “Everyone will fall in love at some time.” Say you’re a perfume company: “Everyone will fall in love at some time.”
Well, how do you define love? It is such a subjective topic that it’s really impossible to set up any kind of usable, testable, verifiable experiment, because we wouldn’t be able to agree on what that is.
Same for something as simple as temperature. Say, “It’s hot outside.”
Well, if you like cold weather, and your house is set at 58 in the wintertime, you clearly will think 70 degrees outside is hot.
Another person who loves hot weather may say of 95 out there, “It’s warm, but it’s not hot.”
Okay, so what’s hot? 113 is hot for them, in Fahrenheit.
And so it’d be very difficult to test a statement like, “Everybody loves hot weather.”
Well, how do you define hot? And the fourth way hypotheses go off the rails is testing things that are not based on prior knowledge.
Again, a hypothesis is something you’re trying to test based on an assumption or guess that you’ve made, which comes from existing data in your head: “I believe that red as a call to action works better than blue.”
So if we change the calls to action on our website from blue to red, we should see a 5% increase.
That’s based in some kind of prior knowledge even if it’s a guess.
But suppose you just start making things up: “I believe that using images of dancing clowns will increase conversion.”
Yes, you could test that.
But you’re questioning something that is not based on prior knowledge, and so it probably isn’t going to work out well.
Generally speaking, when you’re dealing with hypotheses, more data will not improve your hypothesis.
Defining what data you need will.
If I believe that red improves conversion over blue, then prior studies, prior information, and biology information about how the human eye perceives color would all be useful data points to collect, as would assessing what percentage of the population you have and their demographics, because color perception changes with age.
Those are things that would be useful to have available. It’s not more data per se; it is being more clear about the data that you need.
The best way to deal with hypothesis creation really is to look at that whole definition.
Is this a provably true or false statement about a single condition? That’s where I would start.
Most of the time, I would bet, if your hypotheses aren’t working out well, it’s because they are not in that format.
The lesson we learned in high school or secondary school was if-then statements: if this, then that.
If red is more stimulating than blue, then changing the buttons on the website to red should result in conversion rates that are 5% higher. That’s the best way to start forming hypotheses and get more comfortable with it.
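The if-then format can also be written down as a small structure, which forces you to name one condition, one metric, and one predicted change. This is only a sketch: the `Hypothesis` class, its field names, and the conversion rates are hypothetical, and it assumes the 5% figure means a relative lift rather than five percentage points.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hypothesis:
    condition: str         # the single thing being changed
    metric: str            # the single outcome being measured
    predicted_lift: float  # minimum relative lift, e.g. 0.05 for 5%

    def statement(self) -> str:
        return (
            f"If {self.condition}, then {self.metric} "
            f"should increase by at least {self.predicted_lift:.0%}."
        )


def relative_lift(control_rate: float, variant_rate: float) -> float:
    """Change in the variant rate, expressed as a fraction of the control rate."""
    return (variant_rate - control_rate) / control_rate


def is_supported(h: Hypothesis, control_rate: float, variant_rate: float) -> bool:
    """True when the observed lift meets or exceeds the predicted lift."""
    return relative_lift(control_rate, variant_rate) >= h.predicted_lift


red_button = Hypothesis(
    condition="we change the call-to-action buttons from blue to red",
    metric="the conversion rate",
    predicted_lift=0.05,
)
print(red_button.statement())
# Hypothetical observed conversion rates: blue (control) vs. red (variant).
print(is_supported(red_button, control_rate=0.040, variant_rate=0.043))
```

This check only compares the observed lift with the prediction. Whether the difference is larger than chance would explain still needs a significance test on the underlying counts.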
If you have follow-up questions on this important topic, leave a comment in the comments below.
Subscribe to the YouTube channel and the newsletter.
I’ll talk to you soon. Take care.
Want help solving your company’s data analytics and digital marketing problems?
Visit TrustInsights.ai today and let us know how we can help you.