You Ask, I Answer: Citizen Data Scientists?

Jessica asks, “How do you feel about citizen data scientists?”

I love the theory, the concept, and to be sure, there are plenty of people who are data scientists that lend their expertise to causes and movements outside of their day jobs. But the question is, is a citizen data scientist someone who is a data scientist operating outside of work, or a citizen who becomes a data scientist?

Machine-Generated Transcript

What follows is an AI-generated transcript. The transcript may contain errors and is not a substitute for watching the video.

In today’s episode, Jessica asks, “How do you feel about citizen data scientists?” I feel fine.

In all seriousness, I love the theory, the concept.

The question is what are we talking about here? Because the citizen data scientist could be two things.

It could be a data scientist who is applying their skills and capabilities to solve problems for causes and social good outside of work, right? So there are data scientists who work at, say, a big bank during the daytime, and then in the evenings they’re so motivated or so excited to tackle a challenge for a cause that they go off and do that.

That’s one interpretation.

The second interpretation is a citizen, a normal person who has a cause they’re passionate about and wants to learn data science skills to help further that cause.

Both interpretations are equally valid.

There’s no wrong answer here.

And there are certainly plenty of people in the first group, people who are data scientists and are applying their efforts to causes, to champion things outside of work.

Those folks I’m not worried about. Those are the folks who know what they’re doing and have the skills, the training, and the tools they need to be able to lend their talents.

So that’s one group.

It’s the second category that I feel is difficult.

And here’s what I mean by that.

Data science is four sets of capabilities.

It is business skills.

It is technical skills, it is mathematical and statistical skills, and it is scientific skills.

Those are the four major categories of skills that you need to have as a data scientist to be effective at it.

You know, I joke that data scientists are so expensive because it’s four jobs for the price of one.

Each of those areas requires a certain level of competence to be effective.

If you are lending your expertise towards, say, a cause, presumably you have some background in that cause; you have some knowledge of it already.

But to be effective in data science, you need to have a good, deep understanding of the subject matter; you need to be something of a subject matter expert in it.

The technical skills we’ve discussed many times: the ability to write some code, the ability to use coding tools to get the machines to do what you want at the more advanced levels.

And I will caveat all this by saying that the fundamental underpinning is how we define a data scientist: someone who extracts meaningful insights from data using the scientific method.

So, of those four buckets of skills, the scientist part is actually the most important, because if you’re not doing the scientific method, hypothesis testing and such, validating experiments, creating reproducible results, then you’re not doing data science.

You may be doing data analysis, which is totally fine and really important.

You may be doing data analytics, you may be doing data engineering, but you’re not necessarily doing data science unless you’re using the scientific method.

But when we think about the common ways people ascribe data science skills to individuals, we think of those four buckets: business, technical, scientific, and mathematical. And the average person may not have enough background in those areas.

Now, they can learn. Absolutely they can learn. Anyone can learn data science; anyone can learn the underpinnings.

You can learn statistics, you can take Stats 101 again and again, you can learn how to code, you can learn probability, you can learn calculus, you can learn your cause really well.

But that’s typically not what people do.

Unless they are so invested in a cause that it becomes all-consuming, that it becomes their life.

And then yes, developing those skills and that passion does occur.

But for the most part, that’s not how I’ve seen people operate, and it’s not a knock on people.

It’s just that very few people can throw themselves at a cause so fiercely that they will, frankly, endure the months and months it will take to develop the skills needed in those areas: to spend six to 12 months learning how to write Python code or R code, to take the six to 12 months to learn how to work with SQL databases, and to learn probability and Bayesian network theory.

Do people do it? Yes.

Is it a lot of people? No.

And my hesitation with the second category of citizen data scientist is the line of knowing just enough to be dangerous, but not enough to know how dangerous. By that I mean: you love this cause, you believe in this cause, but you don’t have all the skills you need to be an effective data scientist, to know what’s likely to go wrong.

And you work for an organization that needs the help.

But because your skills are not complete, young Jedi, you mislead them.

You create incorrect analyses, you point them in the wrong direction, and you end up harming the thing that you’re trying to help.

Now, for some things, the amount of harm you could do is relatively low, right? If you are working for an organization and you’re helping them with, say, their email marketing analytics, like, “Hey, I want to help you make your emails better,” you’re probably not going to do something so drastic that it will cause the open rates to go to zero.

Right? You probably won’t impact them positively if you don’t know what you’re doing.

But for other causes and organizations, the stakes are higher. Actually, a really good example: during the whole pandemic, there was a whole group of folks who said, “We’re going to use machine learning and data science to find the ideal therapeutic to stop this pandemic,” and the results they produced were unimpressive.

But more important, the results they produced are actively harmful to some people.

And so you have a bunch of people who don’t have the domain expertise, trying to apply their technology skills, mostly technology skills, to a problem they don’t understand, and potentially causing harm to other people’s lives in an acute, immediate way. If you take this drug that they recommend, it probably will not do good things for you.

And so that’s my hesitation about that second category of citizen data scientist.

And even in the pharmaceutical example we were just talking about, those people have technical skills, but they don’t have the scientific skills, and they don’t have the domain knowledge to know that what they’re proposing is dangerous or potentially dangerous.

And so I would say citizen data scientists ideally are confined to areas where they can learn the skills, where they can skill up in each of those four areas, but are confined in such a way that if they come up with a wrong conclusion, the level of risk is low.

You really should not be doing any advanced data science tasks on things that are literally life and death.

Probably shouldn’t do it.

Help someone with their email marketing? Sure.

Help them understand their web analytics better? Sure.

Optimize the buttons on their pages? Sure. That’s low-risk stuff; you’re not going to blow up the world, you’re not going to kill anybody.

But I would say that I am hesitant to say that we should try to create an army of citizen data scientists without those guardrails. So, good question.

There’s a lot more to unpack here, because we do need more people with data science skills, and I don’t want to worry people so that they say, “Oh no, I’m not going to do it.”

No, please, absolutely pursue it in low-risk areas, so that if something goes wrong, you’re not going to cause any harm.

Absolutely pursue it to learn, to develop yourself professionally.

Absolutely do those things.

Just don’t apply it to life and death matters.

If you have follow-up questions, please leave them in the comments box below.

Subscribe to the YouTube channel and the newsletter.

I’ll talk to you soon. Take care.

Want help solving your company’s data analytics and digital marketing problems? Visit TrustInsights.ai today and let us know how we can help you.
