2022 Rewind: Making Sense of Seasonality and Predictive Analytics

This year, I had the chance to sit down with Dave Armlin of Chaos Search. Here’s what we chatted about:

Data Legends Podcast Episode 3, Christopher Penn

AI-powered summary created by GoCharlie:

Data At Scale: Unlocking The Power Of Big Data With TrustInsights.Ai
– I’m here with Christopher Penn, the co-founder and chief data scientist at TrustInsights.ai
– This program is about data at scale, looking at how to handle large sets of data efficiently
– Christopher shared that TrustInsights helps marketers make their data work better for them
– They focus on the 6 C’s of Data: cleanliness, completeness, correctness, comprehensiveness, chosen well, calculable/usable by both machines & humans
– They use various tools and technology to blend heterogeneous data sources into a normalized format so it can be used for reporting or machine learning tasks
– As an example, Christopher noted that they have dug deeper into popularity metrics, such as those for TikTok videos
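
The idea of blending heterogeneous sources into a normalized format can be sketched roughly as follows. This is only an illustration, not how Trust Insights actually does it: the field names (`video_id`, `posted`, `order_date`, `revenue_usd`), the sample rows, and the shared `(source, day, metric, value)` schema are all assumptions made up for the example.

```python
from datetime import date, datetime

# Hypothetical raw exports from two sources; every field name and value
# here is invented for illustration.
tiktok_rows = [
    {"video_id": "a1", "posted": "2022-03-01", "views": "1200"},
    {"video_id": "b2", "posted": "2022-03-02", "views": "950"},
]
sales_rows = [
    {"order_date": "03/01/2022", "revenue_usd": 310.5},
    {"order_date": "03/02/2022", "revenue_usd": 275.0},
]


def normalize_tiktok(row):
    """Map a TikTok-style row onto the shared (source, day, metric, value) schema."""
    return {
        "source": "tiktok",
        "day": date.fromisoformat(row["posted"]),
        "metric": "views",
        "value": float(row["views"]),  # views arrive as strings in this export
    }


def normalize_sales(row):
    """Map a sales-style row onto the same schema, parsing its US-style date."""
    return {
        "source": "sales",
        "day": datetime.strptime(row["order_date"], "%m/%d/%Y").date(),
        "metric": "revenue_usd",
        "value": float(row["revenue_usd"]),
    }


def blend(tiktok, sales):
    """Combine both sources into one list, ordered by day and then by source."""
    rows = [normalize_tiktok(r) for r in tiktok] + [normalize_sales(r) for r in sales]
    return sorted(rows, key=lambda r: (r["day"], r["source"]))


normalized = blend(tiktok_rows, sales_rows)
for row in normalized:
    print(row["day"], row["source"], row["metric"], row["value"])
```

Once every source shares one shape, the same rows can feed a report or a machine learning step without per-source special cases.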

From TikTok To AI: Harnessing The Power Of Data With The Right Tools
– Stakeholders often ask what impact a TikTok video has on sales, and tools such as Segment.io or Google Dataflow allow us to bring data sources together.
– We use open-source technology to build our product, including BigQuery and the AMP stack. The language we typically use is R, but many newer practitioners prefer Python.
– Chaos Search uses cloud object storage such as GCS (Google Cloud Storage) and S3 (Amazon Simple Storage Service). They also offer an Elasticsearch API for querying events and logs, with JDBC connections coming soon.
– For AI integration, they follow a consumer experience model: you push a button in an analytics tool and expect an answer right away. However, machine learning sometimes requires compute time before it can provide a response; this can take anywhere from minutes to hours, depending on the size of the data set.
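
The gap between "push a button" and "wait for compute" is often handled by submitting the work as a background job and collecting the answer when it is ready. The sketch below shows that pattern with Python's standard library. It is an assumption about how such a tool might be built, not a description of Chaos Search's product; `train_model` is a hypothetical stand-in whose short sleep simulates compute time.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def train_model(rows):
    """Hypothetical stand-in for a long-running ML job; the sleep simulates compute time."""
    time.sleep(0.1)
    return {"rows_seen": len(rows), "status": "done"}


def submit_job(executor, rows):
    """Start the job in the background and return a handle right away."""
    return executor.submit(train_model, rows)


with ThreadPoolExecutor(max_workers=1) as executor:
    job = submit_job(executor, list(range(1000)))
    # The caller gets control back immediately and could show a
    # "working on it" message here instead of blocking the interface.
    result = job.result(timeout=5)  # block only when the answer is needed

print(result)
```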

Unlock The Potential Of Data With Google Data Studio
– The way data is presented to customers varies based on their expertise with the software they are using.
– Google Data Studio reduces manual labor and human error in data entry.
– A visual can communicate what would otherwise take thousands of words, so using free tools like Google Data Studio is encouraged.
– Marketing data comes from unifying conceptual ideas rather than just unifying data. Most software available today focuses on descriptive analytics, which makes it difficult to do diagnostic analytics with it.

Data Privacy: Keeping Real Data Safe With Synthetic Solutions
– I recently learned about California’s new data privacy act that takes effect on January 1st, 2021 and the implications it has for companies used to sharing customer data.
– Under this legislation, customers must consent to having their data sold; otherwise, companies cannot share it.
– To comply with this law without violating user privacy, many businesses have begun creating synthetic marketing data: they build models of their original datasets and use those models to generate dummy records that stand in for personal information.
– This requires more sophisticated skills than are typically found in marketing professions today, and vendors may need to help with processing such complex algorithms and transformations.
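
A minimal sketch of the synthetic data idea, under strong simplifying assumptions: each numeric column is modeled independently as a normal distribution (real approaches also preserve relationships between columns), and the records, field names, and the `@synthetic.invalid` addresses are all invented for the example.

```python
import random
import statistics

# Hypothetical "real" customer records; values are invented for illustration.
real_customers = [
    {"email": "ann@example.com", "age": 34, "monthly_spend": 120.0},
    {"email": "bob@example.com", "age": 45, "monthly_spend": 80.5},
    {"email": "cy@example.com", "age": 29, "monthly_spend": 150.0},
    {"email": "di@example.com", "age": 52, "monthly_spend": 95.0},
]


def fit_model(rows, numeric_fields):
    """Summarize each numeric column by its mean and sample standard deviation."""
    model = {}
    for field in numeric_fields:
        values = [row[field] for row in rows]
        model[field] = (statistics.mean(values), statistics.stdev(values))
    return model


def synthesize(model, n, seed=0):
    """Draw n dummy records from the fitted summary; no real identifiers are copied."""
    rng = random.Random(seed)  # seeded so the output is reproducible
    rows = []
    for i in range(n):
        row = {"email": f"user{i}@synthetic.invalid"}
        for field, (mu, sigma) in model.items():
            row[field] = max(0.0, rng.gauss(mu, sigma))  # clamp negatives to zero
        rows.append(row)
    return rows


model = fit_model(real_customers, ["age", "monthly_spend"])
synthetic = synthesize(model, n=5)
for row in synthetic:
    print(row)
```

The synthetic rows follow the rough shape of the originals without carrying any individual's details, which is the property that makes them easier to share.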

Quilting Together Compliance: Understanding Privacy Laws For Your Business
– There is a patchwork quilt of privacy laws that marketers need to understand and abide by in order to keep their business compliant.
– One example is China’s Personal Information Protection Law (PIPL), which is similar to GDPR but carries much stricter penalties, including imprisonment for those found violating it while in China.
– It is important for businesses to not only know what data they have on hand, but also how they are using it.
– Tools like Chaos Search can help businesses classify and analyze data within their organization in order to maintain compliance with applicable regulations.
– Additionally, companies should be mindful of preventing misuse of their platform, as individuals may use data about protected classes without consent or authorization.

Data Governance: Stopping Inferred Class And Synthetic Variable Misuse
– I’m learning about inferred class issues and synthetic variables, which have the potential to be used in ways that are disallowed or unethical.
– SaaS vendors need to come up with tooling that helps prevent these kinds of problems.
– IBM is at the forefront of this issue, creating solutions that provide good governance over data and privacy concerns.
– Chaos Search strives to be a good citizen when it comes to using data ethically, staying in compliance with laws regarding data privacy and providing plumbing for customers where needed.
– Dual-use technologies related to AI, and data itself, can lead to misuse if not monitored properly.



