You Ask, I Answer: Choosing Data Science Software and Vendors

Kerry asks, “How do you choose data science software if you’re not already a data science expert? My company is evaluating some vendors and I’m not sure what to ask.”

By definition, a data science platform should be flexible enough and robust enough to fully embrace the classical scientific method. Be VERY suspicious of anyone advertising that their platform just gives you magic answers to your questions without going through the rigor of, well, science. It’s one thing to automate laborious pieces of work. It’s another thing to skip steps entirely. A kitchen stand mixer helps you with the laborious chore of mixing. No appliance maker in the world tells you their appliance helps you skip the process of mixing.

Be equally cautious of any platform that claims to do it all. The more it claims to do, the more difficult it will be to implement, and the further behind you may fall if the vendor doesn’t keep up with the latest.

I personally recommend learning how to use software like R or Python. Yes, it’s a bit like learning how to make a cake by forging your own pans and building your own oven, but you will know how everything works, and you will be able to iterate and update rapidly as technology changes. Their major cost is building or hiring expertise.


Machine-Generated Transcript

What follows is an AI-generated transcript. The transcript may contain errors and is not a substitute for watching the video.

In today’s You Ask, I Answer, Kerry asks: how do you choose data science software or a data science platform if you’re not already a data science expert? My company is evaluating vendors, and I’m not sure what to ask.

By definition, a data science platform should be enabling the process of data science, which in turn means that you should be following the scientific method. One of the things I’ve noticed in the software industry in general, in marketing software, and now in data science, is that there are a lot of companies making what I think are very suspicious claims.

Think about the scientific method, right? Let’s, in fact, bring that up here. This is the scientific method: you start with a question that you want to answer, you define your variables, then you predict and you make a hypothesis. And if you’re doing it well, there’s probably a null hypothesis as well as a regular hypothesis. Then you do your testing, you collect your data, you analyze it, you refine it, you observe the prediction in action, the hypothesis in action, to see if it was valid, and you start the process all over again. That’s the scientific method. It is centuries old.

And it is the way to do any form of science, but especially data science. One of the things I think is especially problematic is that you have a whole bunch of people now who are taking these sort of crash courses in data science. They’re learning the tools, which is important, and they’re learning a lot of the concepts that people didn’t learn in school, like statistics and such, but they’re not learning it from a scientific perspective. They’re learning it from a very narrow, purpose-built perspective: hey, you want to learn data science, here you’re going to learn just bioinformatics, or just marketing, or just operations, or just finance, and not the scientific method.

Be very suspicious of any vendor advertising that their platform just gives you magic answers, and whose platform doesn’t go through the rigor of science. It is one thing to automate some laborious pieces of work. When you’re doing, for example, exploratory data analysis and you need a tool to help summarize your data set, it totally makes sense to have a machine go through and do all the standard observations: how many missing values, mean, median, mode, all that stuff. Total sense. So automating laborious work is okay. If the vendor is saying, hey, you could skip these processes and go right to your answers? Nope, it does not work like that.

Can you imagine a maker of kitchen stand mixers and other kitchen appliances saying, hey, in the baking process, you can just skip mixing; our appliance is so magical, it’ll just make the bread for you and you don’t need to mix? Yes, there are such things as no-mix breads. They’re not very good.

A kitchen stand mixer helps you with the laborious chore of the mixing process, right? It’s not fun to sit there with a whisk and do that for 20 minutes.

But it doesn’t tell you that you can skip the process of mixing. You cannot skip that step.
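As an illustration of the kind of laborious step that is reasonable to automate, here is a minimal Python sketch of the standard observations mentioned above (missing values, mean, median, mode). The function name `summarize_column`, the use of `None` to mark a missing value, and the sample numbers are all assumptions made for this example; the numbers are invented and are not real data.

```python
import statistics


def summarize_column(values):
    """Return basic exploratory statistics for one column of numbers.

    Assumption of this sketch: None marks a missing value.
    """
    present = [v for v in values if v is not None]
    return {
        "count": len(values),
        "missing": len(values) - len(present),
        "mean": statistics.mean(present),
        "median": statistics.median(present),
        "mode": statistics.mode(present),
    }


# Invented illustrative numbers, not real data.
daily_visits = [120, 135, None, 120, 150, None, 160]
summary = summarize_column(daily_visits)
print(summary)
```

Note what this does and does not do: it saves the tedium of counting and averaging, but it answers no question on its own. You still have to decide what to ask of the data.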

In the scientific method, you have to do each of these steps, and you have to do them in order. You can’t start analyzing data if you don’t have a hypothesis. And a lot of people do that. A lot of people say, oh, I just know that this is the answer, just process the data. That’s not data science.

That’s the opposite of science. That’s incuriosity. You’ve already got a conclusion you want to prove, and you’re trying to back into that conclusion from your data, as opposed to asking, is this the right answer? You know, I suspect that Twitter engagements lead to conversions. Okay, that’s the start of a hypothesis. That’s a good question to ask. What data will you need? That’s the define step. You make a prediction: I predict that Twitter engagements lead to conversions. And then you can go and test and collect and analyze and refine and observe. That’s science.

It’s not even data science, that’s just science.
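To make the hypothesis-first order concrete, here is a rough Python sketch of testing the Twitter example. It assumes a null hypothesis that engagements and conversions are unrelated, and uses a one-sided permutation test on the Pearson correlation, which is only one of many possible tests and is my choice for illustration. The function names and the weekly numbers are invented for this example. A positive result would show an association, not proof that engagements cause conversions.

```python
import random
import statistics


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def permutation_p_value(xs, ys, trials=5000, seed=0):
    """One-sided permutation test of the null hypothesis 'no relationship'.

    Shuffles ys repeatedly and counts how often a shuffled correlation
    is at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = pearson_r(xs, ys)
    shuffled = list(ys)
    hits = 0
    for _ in range(trials):
        rng.shuffle(shuffled)
        if pearson_r(xs, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (hits + 1) / (trials + 1)


# Invented weekly figures, purely illustrative.
engagements = [12, 30, 45, 22, 60, 38, 51, 17]
conversions = [1, 3, 4, 2, 6, 3, 5, 2]

observed_r, p_value = permutation_p_value(engagements, conversions)
print(f"r = {observed_r:.2f}, p = {p_value:.4f}")
```

The point of the sketch is the order of operations: the prediction and the null hypothesis are written down before any data is analyzed, and the data is then allowed to contradict them.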

The second thing to be cautious of, especially with vendors, is a platform that claims to do it all. Think about even just this very simple process here.

The scientific method is very, very well defined, and in every one of these phases you’re doing something different, right? In the red section, you’re asking questions, you’re thinking about your data, and you may do some exploratory data analysis to help you formulate the question. Exploratory data analysis is a discipline; it’s a subset of data science. So you will want to, in that question and define phase, do your EDA with the tools of your choice. But that’s going to be very different from the tools you use for test, collect, and analyze, right?

You may apply similar statistical means, but it may be a very different approach. If you’re doing financial modeling, what you’re using for testing and collection of data will be very different than just, you know, pulling stuff off the shelf. When you are analyzing your data, you will use very different methods. If you are observing your data in action, you’ll use very different methods. Think about it from a marketing perspective: if you are trying to figure out what your brand awareness is, the exploratory work that you do may be using things like social media data or search data, but then your testing and your data collection may be using things like market research or surveys.

There is, at least in the marketing world, no one tool that does it all. For example, in Google’s analytics suite, there’s a ton of different tools, and you will use different tools as appropriate. If you have a data science product or platform that claims to do it all, the flip side is that it’s going to be more difficult to implement than a point solution for a particular task. It is also more subject to technical debt, which means that the vendor will have a harder time updating it to do everything than to do the one thing it does really well. And if your vendor doesn’t keep up, then you accumulate that technical debt in your organization, and it becomes very, very difficult to adapt to whatever the next thing is. So if you are today doing very, very basic linear regression modeling, it will be very difficult for you to switch over to, say, TensorFlow and neural network modeling if your vendor doesn’t have that flexibility.

Personally, I put a lot more value into learning software like R or Python and services like that. Yes, going back to the cake example, it’s a bit like learning how to make a cake by forging your own pans in your backyard iron forge and building your own oven.

It’s not for everybody. But you will learn how everything works, and you will learn how to iterate, how to update rapidly, and how to add new libraries to increase your knowledge store.

And it allows you to keep your technical debt to a minimum, because you’re always keeping things up to date; you’re maintaining your own code.

When you become a software developer, obviously, that poses a different set of tasks than buying something off the shelf. But if you are concerned about choosing the wrong vendor, particularly if something is very high risk or is a significant undertaking, you may want to explore the route of building it yourself. Chances are, especially if you’re not familiar with data science right now, there are requirements in the requirements-gathering process that you will only uncover later on down the road in the project: oh, we should have asked about that, and now this vendor you selected doesn’t have that. Whereas if you’re learning how to code, you say, all right, we need to code that into the thing as well. The major cost of these programming languages, of course, is building or hiring the expertise to do that. But that’s my personal preference. It is not for everybody, and by no means is it the right way. It’s just a perspective.

So learn data science, or at least learn the basics, and learn the scientific method, and then evaluate your vendors based on their rigor with respect to the scientific method, if you want to know how to get started evaluating vendors. So, great question, Kerry. Complicated question, but the answers are in how well a vendor adheres to process. Thanks for asking. As always, please subscribe to the YouTube channel and the newsletter, and I’ll talk to you soon. Take care.

If you want help with your company’s data and analytics, visit TrustInsights.com today and let us know how we can help you.



Want to read more like this from Christopher Penn? Get updates here:

subscribe to my newsletter here

