Product Review: IBM Watson Studio AutoAI

Warning: this content is older than 365 days. It may be out of date and no longer relevant.

Product Review: IBM Watson Studio AutoAI

Today, we’re reviewing IBM Watson Studio’s new AutoAI feature.

FTC Disclosure: My company, Trust Insights, is an IBM Registered Business Partner. We receive compensation for any purchases you make through us from IBM and thus I receive indirect financial benefit.

AutoAI is a type of automated machine learning, which promises to help companies get to working machine learning models faster. How well does it live up to this promise? Watch the video for a full review.

Key features:

  • Regression and classification
  • No code
  • Does basic data science tasks like feature extraction and hyperparameter optimization


  • Limited in the number of models it can choose from
  • Data must be good going in
  • Model can’t be edited yet in a more granular way

Product Review: IBM Watson Studio AutoAI

Can’t see anything? Watch it on YouTube here.

Listen to the audio here:

Download the MP3 audio here.

Machine-Generated Transcript

What follows is an AI-generated transcript. The transcript may contain errors and is not a substitute for watching the video.

In today’s episode we are reviewing the IBM Watson studio new feature called auto AI or auto automated artificial intelligence. Now, before we begin full FTC disclosure is my company trust insights is an IBM registered business partner. We receive compensation for any purchases you make from IBM through US and US I receive indirect financial benefits. So FTC closures disclosures out of the way, auto AI is a type of automated machine learning which promises to help companies get to working artificial intelligence models faster given the data set, instead of the time it takes to go from data cleaning, prep, feature engineering, hyper parameter, optimization, model experimentation, and then to production that can depending on your data set, take a really long time.

By the way, I promises to help companies do that in less time. Now, before we begin, there are both features and disadvantages of this approach. There is even though it can be very manual and tedious, there is some benefit. I’ve actually personally, there’s a ton of benefit to manually editing and cleaning your data set of

and doing the feature engineering because it helps you get to know your data. And so if it’s something as mission critical, at least the first time through, you want to have a human do that after that, absolutely use automated machine learning. And one of the questions that people have had over over the last few months is, you know, will this replace data scientists know, and we’ll talk a bit more about that. But it’s not going to replace data scientists anytime soon. It will help data scientists workload get a little bit easier for the most common types of machine learning models. So let’s get into the environment.

What we see here is Watson studio, and I’m going to fire up a an auto AI experiment. And we’ll do we’ll call this something auto test something. They’re really exciting.

And our first experiment where you see here, you can add in testing data that we are adding training and data I’m going to throw in, let’s, let’s do some Google Analytics data, right? Google Analytics data is a rich source of information. And it’s something that a lot of marketers have access to. So this is going to be things like users and sessions and balances, and I have goal completion. So that’s what I want.

All the way I’d help me do is help me understand, maybe a little bit better. What gets goal completions for my Google Analytics data?

See, I dropped the data set in and asked me what do you want to predict? What do you want to know more about Of course, I want to know more about goal completions. Now here’s a bit of a landmine in this because

Of the Way goal completions are structured is data set by pages, you know, one or two goals per page. So, auto a, I said, Hey, I think you’re trying to do a multi class classification. I’m actually not. This is again, why you can’t fully replace the data scientists with these software packages, because this is not a classification problem. This is a regression problem.

So choose that I can choose the error metric, which again, if you are a data scientist, these mean a lot to you, if you’re not a data scientist, just go with whatever is recommended.


this is a case where that was not the correct prediction type. So it’s going to run the experiment. And what you’ll see next is if the entire pipeline of what Watson is going to do with this data, it’s going to read it, it’s going to split it into three pieces. Generally speaking, when you’re doing model testing for AI you’re going to do you’re going to split your data into three pieces. 60% of it

Going to give them machine it’s going to try and learn from that and figure out, oh, well, this is your date, I’m going to try and learn what the patterns are, there’s 20% of it is going to be called test data. So once the machine first figures out, okay, I think this and this lead to conversions, it’s going to take the next 20% of the data set and test that, that conclusion out, see if that is is is true or not. And then there’s a third 20%, where it’s going to test the testing of the conclusion. This way, you avoid or you reduce the likelihood of what is called overfitting, where you make a prediction, that’s perfect. But then, when new data comes in, it goes off the rails. So it’s a split the holdout data, read the training data, do its own attempt at cleaning the data as best as it can.

None of the automated machine learning tools, zero of them on the market, do a really good job of cleaning data perfectly. Right? That’s still the data that goes in still

has to be good. And if it’s not good and it’s not in good shape models will simply not work. And this is true of Google auto ml. This is true of h2o Auto Mel This is true of IBM auto AI, it doesn’t matter whose system you’re using garbage in, garbage out, that’s going to be the truth forever just garbage. So, it then goes and chooses a model what kind of machine learning model would best fit this data?

We see here it is chosen x g boost regressive. So extra boost is one of those popular algorithms that can help find

what is likely causing regression or what it runs a regression model x g boost,

which stands for extreme gradient boosting is probably the the most popular machine learning model for doing any kind of regression to one a bunch of cackle competitions and it’s just one in the toolkit. Now this is where auto AI has some real benefit for

People who are trying to learn data science, I think this is a fantastic learning tool because you can see the choices it makes. And if you’re not familiar with the choice, and you go look it up and read up on it, but if you see the choice that makes any figure, okay, that’s an interesting choice. Why did it choose that? As it’s going through, you can see here on the on the bottom, as it makes each pipeline of learning, it tells you why it made those choices. So ranking based on the error that click on this pipeline data, you can see how it evaluated the model, the R squared error model information and the feature importance what it thinks is most likely to predict goal completions. And now it’s going to go through a few stages of really the the machine learning experimentation the exploratory process. So the first thing is to is hyper parameter optimize optimization.

Hyper parameter optimization is you

Really, really fancy way of saying it’s going to play with the settings. So an extra boost algorithm has a bunch of settings. It’s like It’s like an app on your phone. Right? An app on your phone has settings, you can make this you know the the app louder, we can turn off notifications or stuff.

A machine learning model is a piece of software. And therefore what it’s doing now is it’s testing it’s running simulations to test. Okay, what if I turn the brightness up on this

an extra boost to be like, How many? How many runs? Am I going to do? How many times Am I going to try this thing? How many different splits the data we’re going to make? Out of the box? You know, there are certain defaults in that and then what the software is doing is testing all of the defaults to see, do we get better error rates out of this test based on trying out different settings, once it does hyper parameter optimization is going to do feature engineering. And this is where

I’ve given this feedback to IBM, I think there’s a little bit of a bit of a misnomer. It does feature extraction. feature engineering is a five part process, right? So you have things like extraction, where it’s going to try and create new features from the existing data, which is exactly what this does. By doing things like multiplying columns together or dividing columns or adding or subtracting. There’s a part of feature engineering that called feature creation that we bring in net new data from the outside, it does not do that. So there’s only a limited type of feature engineering. And then it does another round of now that’s got more data to work with because it’s created these imputed columns to do another round of hyper parameter optimization. Now this will take probably 10 or 15 minutes, so we’re just going to pause here and and let it do its thing and come back when it’s finished baking.

We’re back it’s been a little more than an hour.

And what we can see here is that Watson has gone through and created four different machine learning pipelines, one with just a straight x g boost algorithm, one with some hyper parameter optimization, that tuning of all the knobs and dials on the different extra boost models. One with feature engineering done, and one with a second round of hyper parameter optimization after the feature engineering is done. So let’s take a look at see what what we see here we have four different pipelines. And with each of the enhancements, there’s the straight, I just analyze the data that you gave me and built a model on it. And then we see pipelines three and four have identical outcomes. The same route means squared error rate. One has feature engineering and hyper parameter optimization one has both. Let’s take a look at pipeline forces has the most number of things that have happened to it. We’ve got a small r squared. We’ve got the model information. We have a whole bunch of feature training

formations that have happened. You see it through here, there’s all these PCA that stands for

principal component analysis. It’s a way of reducing the number of total features because it means essentially, there’s too many for the machine to find a good conclusion from. And then of course, additional engineered features users, the difference free users and sessions and so on and so forth. So let’s go back and look at compare now at the quote number one model, which is pipeline three

yc pipeline three has the about the same art actually does have identical r squared,

same feature transformations

as the previous one.

And here it’s saying that new feature to which is the difference between users and entrances is the most important feature as moderately important with the score of point 31. for determining what, what drives or what predicts goal completions.

My Google Analytics data now if we were to look at let’s in fact look at what this similar setup would look like in a different programming language. This is a the language are and you can see in h2o which is the auto machine automated machine learning model that runs inside of our one of many do the exact same thing. There’s your your training the split data, there’s your testing this you’re running your models, then there’s the leaderboard with comparison of the different types of outcomes came up with and its outcome, which was averaged one page sessions and average session duration. Know what’s missing here. None of the hyper parameter optimization or the feature engineering has been done on this The h2o is auto ml literally just takes what you give it and it does its best, but it doesn’t do any of those extra steps. So what do you do with this right? You’ve got this thing what now what what what happens? You save this as a model

Inside your, your Watson studio environment and then you deploy the model using Watson machine learning that gives you a an API connection that you can then send additional data into this for and have the it score and predict like yes, will this convert or no will this not convert. And from that information, you would then build software, maybe build a special chatbot on your website that only pops up when certain conditions have been met, the ones that we see here, in in these models, maybe you use this to change your marketing strategy, if you know that the difference between users and Sessions is important in this model, maybe you use that information to figure out what kind of user what kind of

or what type what kind of

person or session is happening on your website, that you can then build additional features on your website, maybe different copy depending on what you can come up with.

So this is a useful tool for getting that model into into production. And being able to make use of it being able to, to hop, operationalize a lot of a lot of these insights.

There are still gaps. There are still things that I personally would have done differently.

But there are also things he has like, Huh, I never thought about doing that. So this is not only a good tool for getting your model into production, but also for learning from it and going okay, well, maybe, maybe I need to think differently about the types of data I include, we see that Watson is doing a lot of these mathematical operations on different variables. Okay, what if we include more variables? What if we include different variables? What if we pull more data out of Google Analytics or your Salesforce CRM or your HubSpot instance, or your social media monitoring tools doesn’t matter what but putting more data in will let the model have

More to work with now.

As you said at the beginning, this does not in any way replace a data scientist, there are still gotchas, there are still things that it can’t do within this, this framework there even still things that from a, a modeling perspective may not be the best choice of what’s available. For example, if you want to determine what drives conversions, there’s a particular model that I use a lot for trust insights customers called Markov chains. It’s not available in here. It’s not available in here and it’s something you have to build by hand. And that is a better way of doing attribution analysis. But this is not bad. But there are limitations to what auto AI can do. So


One, regression classification built right in no code. This is that’s a I think an important thing.

Does those basic data science has like feature extraction or hyper parameter optimization? I think those features are really useful for someone who’s learning data science. Someone who knows what kind of general outcome they want, they just want the machine to go through and build it quickly.

some disadvantages, obviously, you’re limited in the number of models. Watson auto way I can choose from there are limitations. Your data has to be good going in. You can you’ve seen here, even though it doesn’t feature engineering, it doesn’t preparation, it does not validate. Wow, this is really dumb data. Right? For example, at no point did it say, you know,

those are not the columns, Google Analytics, I would choose. The system simply doesn’t know that doesn’t have the domain expertise. You still have to provide that domain expertise, you still have to provide those insights. And the last thing, which I know according to the team, is coming at some point is the ability to go back and tune the model at any

In a more granular way, it’s not available in the platform yet. So

should you use this? It’s worth a try, especially since you can try it for free. If you go to IBM data platform data platform, cloud, IBM com, sign up for a free account, try it out, test it, see how it works. There are other features within Watson studio, you can also test out and try out.

Am I going to use this to replace all the work that I do trust insights? No.

But am I going to use this situationally as another tool in the toolkit? Absolutely. It’s one of those things that is worth doing, even if just to validate my own models to be able to look at it like when I’m looking at my, this auto ml model is did i do enough to engineer the data? The answer in this case? Probably not. Right? There are some more things that even I can learn from an add new features to the data sets that I already work with. So if you’re learning data science and

A great tool, if you know what you’re doing great tool. If you want to learn this great tool, give it a try doesn’t cost anything to get started. And again, back to FTC disclosure, we are an IBM registered business partner. So if you’d buy something from IBM through us, we do gain. We do have financial benefit. As always, leave your comments in the comments box below and subscribe to the YouTube channel and the newsletter. I’ll talk to you soon.

One help solving your company’s data analytics and digital marketing problems. This is trusted insights on AI today and let us know how we can help you

You might also enjoy:

Want to read more like this from Christopher Penn? Get updates here:

subscribe to my newsletter here

AI for Marketers Book
Take my Generative AI for Marketers course!

Analytics for Marketers Discussion Group
Join my Analytics for Marketers Slack Group!


Leave a Reply

Your email address will not be published. Required fields are marked *

Pin It on Pinterest

Share This