Warning: this content is older than 365 days. It may be out of date and no longer relevant.

You Ask, I Answer: What Predictive Models Do You Work With?

Ben asks, “Predictive analytics seems like a big topic – what kinds do you work with most or find work best?”

Prediction falls into two general buckets, scoring and forecasting.

  • Scoring builds a model to understand, as best as possible, why something happened
    • The most common model is multiple linear regression, one of the oldest techniques in statistics (the least squares method behind it dates to the early 1800s)
    • The most well-known type is credit scoring
    • The goal is to answer the question, “What caused this?” and secondarily, “Is this likely to happen?”
  • Forecasting attempts to predict when something will happen
    • The most common model is ARIMA, popularized by Box and Jenkins in the 1970s
    • The most well-known type is financial forecasting
    • The goal is to answer the question, “When will this happen?”
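
To make the scoring bucket concrete, here is a minimal sketch of multiple linear regression in plain Python, fitted by ordinary least squares through the normal equations. The lead data, the two predictors (emails opened, pages viewed), and the function names `fit_linear_regression` and `score` are all hypothetical, invented for illustration; in practice you would more likely reach for a statistics library in R or Python than hand-roll the solver.

```python
def fit_linear_regression(rows, targets):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y."""
    # Prepend 1.0 to every row so the first coefficient is the intercept.
    X = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(X[0])
    # Build the augmented matrix [X^T X | X^T y].
    A = [
        [sum(x[i] * x[j] for x in X) for j in range(k)]
        + [sum(x[i] * y for x, y in zip(X, targets))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("predictors are collinear; cannot solve")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def score(coefficients, row):
    """Predicted outcome for one new record: intercept + sum(coef * value)."""
    return coefficients[0] + sum(c * v for c, v in zip(coefficients[1:], row))


# Made-up data: (emails_opened, pages_viewed) per lead. The targets were
# generated from y = 2 + 3*emails + 0.5*pages so the fit is easy to check.
leads = [(1, 2), (2, 1), (3, 5), (4, 3), (5, 8), (6, 4)]
lead_value = [6.0, 8.5, 13.5, 15.5, 21.0, 22.0]

coefficients = fit_linear_regression(leads, lead_value)
print("intercept and weights:", coefficients)
print("score for a new lead:", score(coefficients, (7, 6)))
```

The fitted weights are the "what is associated with this outcome?" half of scoring, and `score` is the "is this likely to happen?" half. A regression weight shows association in the data, which is not by itself proof of cause.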

Virtually every major predictive algorithm is available for free in open-source software like R and Python.
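
As a rough illustration of the forecasting bucket in Python, the block below is a deliberately stripped-down ARIMA(1,1,0)-style model: it differences the series once, fits a single autoregressive coefficient to the differences by least squares, and adds the forecast differences back onto the last observed value. It leaves out the moving-average terms, the constant, and the maximum-likelihood fitting that full ARIMA implementations (for example in statsmodels for Python or the forecast package for R) provide. The function name `forecast_arima_110` and the weekly search numbers are hypothetical.

```python
def forecast_arima_110(series, steps):
    """Minimal ARIMA(1,1,0)-style forecast with no constant term.

    1. Difference the series once (the "integrated" part, d = 1).
    2. Fit one AR coefficient on the differences by least squares (p = 1).
    3. Forecast future differences and add them back cumulatively.
    """
    if len(series) < 3:
        raise ValueError("need at least three observations")
    diffs = [b - a for a, b in zip(series, series[1:])]
    prev, curr = diffs[:-1], diffs[1:]
    denom = sum(p * p for p in prev)
    if denom == 0:
        raise ValueError("series has no variation to model")
    # Least-squares slope of diff[t] on diff[t-1], through the origin.
    phi = sum(p * c for p, c in zip(prev, curr)) / denom

    forecasts = []
    level, last_diff = series[-1], diffs[-1]
    for _ in range(steps):
        last_diff = phi * last_diff   # next predicted change
        level = level + last_diff     # undo the differencing
        forecasts.append(level)
    return phi, forecasts


# Made-up weekly search volume for a keyword (illustrative only).
weekly_searches = [120, 132, 141, 150, 158, 163, 171, 178]
phi, next_four_weeks = forecast_arima_110(weekly_searches, steps=4)
print("AR coefficient:", phi)
print("forecast:", next_four_weeks)
```

Any time series with a regular interval (daily revenue, weekly sessions, monthly leads) can be passed in the same way; strongly seasonal data would need a seasonal model rather than this sketch.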

Want to dig into predictive analytics more? Sign up for this free webinar on Thursday, May 10 at 2 PM Eastern; it will also be available on demand after the event is over.

Machine-Generated Transcript

What follows is an AI-generated transcript. The transcript may contain errors and is not a substitute for watching the video.

In today’s You Ask, I Answer, Ben asks, “Predictive analytics seems like a big topic. What kinds do you work with most or find work best?” It is absolutely a big topic, and I think it’s important that we define what predictive analytics means. In this case, we are using analytics data to feed statistical algorithms to predict whether something will happen or not. The statistics part is important because predictive analytics, in many ways, boils down to statistics: it is the probability that something is or is not going to happen. So when you take away all the fancy industry buzzwords, it is just math. It is statistics, and it is using those statistics to assign probabilities to outcomes.

Which means that, when you think about it, you’ve been doing a type of predictive analytics for a really long time. You’ve been consuming predictive analytics for your entire adult lifetime. Every time you check the weather, you are using predictive analytics. Now, whether the analytics are any good or not is a secondary question, but that’s a case of “When is something likely to happen?”

Predictive analytics falls into two buckets. There are two general kinds of predictive analytics that we would all use in marketing and business, and those two buckets are scoring and forecasting. Scoring is when you use all these systems and software to build a model to understand, as best as possible, why something happened. Really, the most common example of this is credit scoring: what constitutes someone being a credit risk versus what constitutes someone being the sort of person you would want to lend money to? We would use statistics and math to take a whole bunch of variables and try to find a pattern that says this combination of things means someone is a good risk, and this combination of things means someone is a bad risk.

Now, the most common technique, and probably the one you’ve seen if you’ve ever taken a statistics course in college or university, is called multiple linear regression, which is literally ancient; it is as old as statistics themselves [the least squares method behind it dates to the early 1800s], and you can get more and more complex based on that. The goal of scoring is to answer the question, “What caused this?” What caused this person to be a good credit risk or a bad credit risk? What caused this tax return to be fraudulent or not fraudulent? Secondarily, you would use this data to then predict, “Is this likely to happen?” We see this a lot with advanced CRMs when you fill out a form online: is this lead likely to become an opportunity? Is that opportunity likely to close? What are the characteristics, so that we can predict as early as possible: yes, invest your time here; don’t invest your time there.

Attribution modeling is another example, and this is where predictive and descriptive kind of overlap. If you were to go into your Google Analytics, you would try to build a descriptive model saying what drove leads, or what drove purchases, or what drove people coming to our store, and then use that as the basis for a predictive model. If we know that email is the driving channel, can we predict, based on that data, that we should send more email, send less email, or send email with different subject lines or emoji, things like that?

So that’s scoring. The second bucket is forecasting: when is something likely to happen? The most common model here is ARIMA. The name stands for autoregressive integrated moving average, and it’s from 1976 [the Box and Jenkins book first appeared in 1970, with a revised edition in 1976], from two data scientists, George Box and Jenkins, whose first name I can’t remember [Gwilym], but it’s called the Box-Jenkins approach.

Probably the most well-known consumer use of forecasting is the weather forecast: literally, when is it going to rain? Weather forecasting has certainly gotten better than it was in the old days; when I was growing up, it was literally throwing darts at a board. Now it is substantially better.

There are other types of forecasting for when something is likely to happen. People have been trying to apply predictive analytics, forecasting analytics, to the stock market since the stock market came around. That is not a good application of it, because there are so many hidden and interfering variables that making stock market predictions is very, very difficult, but other types of financial forecasting are certainly much more predictable. For marketers, predicting search volume is probably one of the most common uses, and certainly one of the most effective, because search data is generally pretty good. You can forecast on any time series data, so you can forecast on social media data, email data, your Google Analytics data, your marketing automation data, your sales CRM data. I did a project not too long ago with a casino, taking their daily slot machine revenues and forecasting them, because it’s time series data and it has some very strong cyclicality to it.

So the goal of forecasting is to answer the question, “When is this likely to happen?” When we know the what, based on our scoring model, we then use forecasting to decide the when.

The good news for every marketer out there is that most predictive algorithms, and certainly all the common ones, are available for free. They’re built right into open-source statistical software like R and Python, for example, and of course commercial systems like SPSS and MATLAB and all those things.

The tough part about predictive analytics is not the concepts; it’s the application of the concepts. Once you get trained up on all the different ways to do both scoring and forecasting, the next step is to learn how to decide which models work best. Almost every form of algorithm has some sort of error rate, error-checking rate, or probability indicator that tells you how reliable the model is, and that’s what really separates good from bad when it comes to predictive analytics. If there’s no expression of confidence interval or probability or error, then it’s not very good. I would be very cautious of any vendor that says “this is the prediction” without providing some kind of error rate. Maybe not like a calorie label on a food, but certainly the ability to explain “this is the error rate” or the p-value or something like that, at least for software and services that go to fellow data scientists. Now, if you’re selling to a business user, maybe the error rate could potentially just confuse people, but at the very least the vendor should be able to answer, “This is the likely error rate for this forecast.”
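
One hedged way to put a number on “how reliable is this forecast” is to hold back the most recent observations, forecast them, and measure the miss. The sketch below does that in plain Python with two common error measures, mean absolute error (MAE) and mean absolute percentage error (MAPE), using a naive “repeat the last value” forecast as a stand-in model. The daily revenue figures and the function names are made up for illustration.

```python
def mean_absolute_error(actual, predicted):
    """Average size of the miss, in the same units as the data."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percentage_error(actual, predicted):
    """Average miss as a percentage of the actual value (actuals must be non-zero)."""
    total = sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted))
    return 100 * total / len(actual)


def naive_forecast(history, steps):
    """Baseline model: assume every future value equals the last observed one."""
    return [history[-1]] * steps


# Made-up daily revenue. The last three days are held back as a test set.
daily_revenue = [410, 395, 430, 520, 610, 590, 400, 415, 440, 530]
history, holdout = daily_revenue[:-3], daily_revenue[-3:]

predicted = naive_forecast(history, steps=len(holdout))
print("MAE:", mean_absolute_error(holdout, predicted))
print("MAPE (%):", mean_absolute_percentage_error(holdout, predicted))
```

Whatever model produces the forecast, reporting a holdout error like this next to the prediction, and comparing it against a naive baseline, is the kind of error rate a vendor should be able to supply.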

In terms of where to get started, if you want to start learning this, pick up a statistics book, because that is the foundation of predictive analytics. Pick up Statistics For Dummies and go through it, or pick up R For Dummies, the book about the statistical programming language, because that will help you learn the concepts of statistics as you learn to apply them with the programming language. So Ben, great question, complex question.

We have a webinar that you can attend on predictive analytics coming up soon. I’ll put a link in the notes here, and it will be available on demand afterwards; that’s a much deeper dive into this topic. I’m looking forward to talking about more of this. If predictive analytics is of interest to you and you want to do it for your company, my company, Trust Insights, does that, and I’m happy to have a conversation about how we can help. Thanks for watching. As always, subscribe to the YouTube channel and the email newsletter. I’ll talk to you soon.



Want to read more like this from Christopher Penn? Get updates here:

subscribe to my newsletter here

