Almost Timely News, July 23, 2023: AI, South Park, and LLaMas

Almost Timely News: AI, South Park, and LLaMas (2023-07-23) :: View in Browser

Almost Timely News

👉 📕 Get my new book, The Woefully Incomplete Book of Generative AI, absolutely free of financial cost 📕

Content Authenticity Statement

100% of this newsletter was generated by me, the human. There’s no AI generated content at all this week.

Watch This Newsletter On YouTube 📺

Almost Timely News: AI, South Park, and LLaMas

Click here for the video 📺 version of this newsletter on YouTube »

Click here for an MP3 audio 🎧 only version »

What’s On My Mind: AI, South Park, and LLaMas

This week, let’s cover two important developments in the world of AI as I prepare to head for Cleveland to speak at the annual Marketing AI Conference, MAICON. First, let’s talk about South Park. Fable Studios released a paper this week (which I believe has not been peer-reviewed yet, so as with all such papers, take it with a grain of salt until it has been peer-reviewed) in which they used an ensemble of AI systems to replicate an episode of South Park.

The system, called SHOW-1 Showrunner Agents, is an ensemble of different AI systems:

SHOW-1 Architecture

The episode itself – from a consumer perspective – is okay. It’s not particularly funny, but it does capture the spirit well enough that if you saw it come up on the viewing device of your choice, you might just assume the writers had a bad day and produced an episode that was kind of a stinker.

This is the Turing test of entertainment: shown a piece of content, can a consumer tell whether it was machine-generated or not, and more important, would the consumer care?

If you read the paper and watch the demonstration video, a couple of notable points should leap out. First, the level of accuracy is very, very good. Because the system operators used the OpenAI models, they produced episodes that were mildly amusing but not the usual coarse South Park fare. Had they used an open source model like one of the many storyteller or roleplay models that are uncensored, they probably would have gotten much more South Park-like humor generated.

Second, it took their ensembled system a total of 3 hours and 15 minutes to assemble a 22 minute episode of South Park. For anyone who has ever worked in entertainment, this is absolutely mind-blowing. You can’t even get a writer’s room to agree on a plot in 3 hours, much less produce an entire episode from start to finish. Granted, that does NOT take into account the training time for the Stable Diffusion models; they had to assemble 1,200 characters and 600 background images, which would take a couple of days with modern GPUs (Graphics Processing Units). However, training such models in a production process would be a one-time up front cost, after which you would not need to do it again unless it started displaying unsatisfactory outputs.

The implications for the entertainment industry should be obvious – for a templated, systematic show like South Park or the Simpsons, for which there is ample training data (in some cases, decades of content), machines are perfectly capable of producing more of it. A truly original show would still need to be human-powered because the training data simply would not exist in sufficient quantity to have a machine make more of the same, but for a franchise with, say, 3-5 seasons of content? It’s more than realistic for a machine to just keep going. From a technical sophistication level, it’s unsurprising South Park was so easy to recreate; it’s not exactly complex imagery to replicate, compared to live action or even better-produced cartoons. But it’s still quite an accomplishment.

It’s even more stunning when you consider that the models used to manufacture this episode are basically toddlers. Stable Diffusion was released a little more than a year ago. Can you imagine having a 1 year old child who went from finger painting to drawing commercial animation by their first birthday? I can’t – but that’s exactly what happened, and exactly how fast the technology is moving.

There’s a bigger lesson here for all of us. Ensembled systems, like the SHOW-1 system, are inherently more capable than a single big system. By this, we mean that a collection of top-performing point solutions glued together is likely to outperform an all-in-one system. This matters because a lot of people think AI systems should be monolithic magicians, capable of doing everything and anything in one tool. Just as you’d probably find a vehicle that tried to combine the feature of a pickup truck, a U-Haul, a sports car, and an SUV to be less than satisfactory, the same is true of the average consumer’s expectations and understanding of what AI systems are.

The reality is that if you’re planning an AI strategy, plan from the start to have it be an ensembled system, a system of different tools – some AI, some not – chained together so that they work in harmony, in the same way that different appliances in a kitchen are used for the right purposes, at the right times by a master chef. This also saves an enormous amount of time, energy, and budget – again, you don’t have to invent one single monolithic solution. You can instead assemble best-of-breed point solutions that are orchestrated together, which means much less reinventing of the wheel or coercing language models to attempt tasks they’re simply not good at.

Okay, so that was big news point number one, and that’s still a pretty big news point. But there’s an even bigger news point number two. Meta (aka Facebook) released the second version of its LLaMa language model this week, and as promised, LLaMa 2 is commercially licensed. Its performance is significantly better than its predecessor and so good that on many common AI benchmarks, it scores as well as OpenAI’s GPT-3.5-Turbo, the default model that powers ChatGPT.

This is a really, really big deal. A huge deal. Let’s talk through why.

When you look at benchmark scores, LLaMa 2 is objectively amazing for the open source family of models – MPT, Falcon, etc. Comparing in families – 7B models, 30B models, etc. – LLaMa 2 beats out most other models on most benchmarks, sometimes very handily.

And there are instances where it does better than even the next family up; LLaMa 13B beats MPT 30B in a fair number of cases, and even Falcon 40B. That’s huge.

If you’re unfamiliar, the xB notation is the number of parameters, 13 billion parameters, 30 billion parameters, etc. If language models were Subway sandwiches, the number of parameters would be the density of toppings – light, medium, heavy, etc. Parameters are a tradeoff of accuracy and speed – the more parameters, the better the accuracy but the slower the model tends to be. Again, think of Subway sandwiches – less dense toppings means you get your sandwich faster, but it’s less satisfying.

LLaMa 2 makes a better sandwich, faster.

These open-source foundation models are starting to challenge the closed-source models as well. Meta’s benchmarks show that LLaMa 2 meets OpenAI’s GPT-3.5 on quality. GPT-4 still remains king of the hill, but Meta’s models are catching up REALLY fast – and with the power of the open source community, they have tens of thousands of developers on their side, tweaking and tuning their foundation models for all kinds of use cases.

As we’ve discussed previously, open-source models are incredibly valuable for any company wanting to deploy generative AI, especially inside their own software.

One of the biggest challenges of enterprise software management is version control. When you want to roll out a piece of software to thousands, tens of thousands, or even hundreds of thousands of employees around the world, or millions of customers, you want software that does what it’s supposed to do under normal operating conditions. And you want – and need – that software to do so in a reliable manner for years. There’s a reason why computers in production systems are often far behind even on basic operating system versions compared to the same operating system on consumer hardware. A company can’t afford to have unreliable software spread across the planet.

And these software lifecycles can be measured in years. Windows 10 is the dominant operating system on PCs; according to Statcounter, 71% of Windows installs are Windows 10, and it was released in 2015, 8 years ago.

When you look at the lifecycle of models provided by a company like OpenAI, you’ll notice that their lifecycles for models are very fast in an enterprise context. They just recently announced end of life for all their older models, requiring everyone to move to the Chat Completions API by January 2024, less than a year after making the Chat Completions API available. A year is a long time in AI, to be sure, but it’s the blink of an eye in enterprise software. For some large companies, it might take two years to roll out a piece of software; can you imagine having to roll out a major code update mid-rollout?

When you have a piece of software you can download and run on your own equipment, that’s a big deal. That means you have control over versioning. You have control over rollout. You have control over the user experience to a much greater extent because you and your corporate IT department decide when new versions get distributed. So Meta’s LLaMa 2 model means you could implement a large language model interface in your company and securely distribute local copies of it throughout your company and be assured of its stability until you’re ready to upgrade it on your own timetable.

The second major point in LLaMa 2’s favor is that until your application reaches hundreds of millions of monthly users (per the license terms) it’s free to use. If you’ve ever wondered why the Android operating system is in nearly every mobile device, it’s because Google made the decision to give it away as open source software and everyone ran with it. Meta is following that playbook – give away high quality software, knowing it will rapidly become the standard.

This also democratizes access to large language models. People who can’t afford OpenAI or Anthropic’s fees for their APIs – particularly in developing nations – can make use of LLaMa’s near state-of-the-art performance for free. That means all the capabilities of the free ChatGPT version are now in the hands of everyone on the planet, no credit card needed.

Why did Meta/Facebook do this? First, it means lots of innovation on their model – and because of the nature of open-source software, that essentially means they put hundreds of thousands of developers on their team, working for free to make improvements that they can then incorporate back into Meta’s internal models. Second, it helps prevent AI dominance. Meta sees the existential threat that big providers like OpenAI present. If they can release open source models that become de facto standards, then they don’t have to worry that eventually they’ll have to become OpenAI customers – and their model becomes the de facto standard. Indeed, within the first 24 hours, 301 derivatives of the LLaMa 2 model appeared on the Hugging Face AI model site.

For us, as consumers and businesses, the release of LLaMa 2 is a watershed moment because it means we can incorporate the model’s various versions into our own products without worrying about licensing or fees – and they’ll work even without Internet access. We can build one of the most capable models into every software package imaginable – accounting software, web developer software, movie-making software – any place where a natural language interface would be helpful. With LLaMa 2’s release, expect to see a language model interface in just about every piece of commercial software imaginable – and any software company not integrating it is going to be left behind very, very quickly.

Because it’s open-source and freely downloadable, we can also download it and fine-tune it with any number of cutting edge techniques to make it extremely good at specific tasks. We might use fine tuning techniques to strip away abilities we don’t need, like telling jokes, but make room to do a better of job of picking stocks or entering medical record data or detecting stress in a customer’s writing. LLaMa 2’s architecture is well-suited to fine-tuning and because it’s a piece of code you install on your computer, you have a lot of control over the tuning process.

It’s difficult to overstate the impact this model and its successors will have on the world of large language models and generative AI. For the technically savvy, it’s imperative you get good at working with models like these, because the ask will come sooner or later from your stakeholders to implement a language model in your products or services – including customer care – and these models will deliver quality work at nearly no additional cost. For the regular business user and consumer, you can and should expect to see language models embedded in everything, so understanding prompt engineering is critical to making the most of them in all the software you use.

Crafted and used intelligently and responsibly, these new models and new technologies herald greater capabilities, greater augmentations than ever before. Imagine being able to talk to nearly anything and get intelligible answers. Imagine being able to hold meaningful conversations with machines that are as fluent as conversations you hold with the average stranger. We are witnessing the acceleration into the Intelligence Revolution, an age where we humans are augmented and empowered by our machines. This is your time to shine, your time to be one of the leaders in the Intelligence Revolution, as a developer, manager, or creator, augmented by AI. The future is now.

Got a Question? Hit Reply

I do actually read the replies.

Share With a Friend or Colleague

If you enjoy this newsletter and want to share it with a friend/colleague, please do. Send this URL to your friend/colleague:

https://www.christopherspenn.com/newsletter

ICYMI: In Case You Missed it

Besides the newly-refreshed Google Analytics 4 course I’m relentlessly promoting (sorry not sorry), I recommend the livestream I did this week on Claude 2 and Whisper. Watch all the way to the end where I drop a sales technique with AI that will fry your brain.

Skill Up With Classes

These are just a few of the classes I have available over at the Trust Insights website that you can take.

Premium

Free

Get Back to Work

Folks who post jobs in the free Analytics for Marketers Slack community may have those jobs shared here, too. If you’re looking for work, check out these recent open positions, and check out the Slack group for the comprehensive list.

Advertisement: Bring My AI Talk To Your Company

📺 Click here to watch this ad in video format on YouTube

I’ve been lecturing a lot on large language models and generative AI (think ChatGPT) lately, and inevitably, there’s far more material than time permits at a regular conference keynote. There’s a lot more value to be unlocked – and that value can be unlocked by bringing me in to speak at your company. In a customized version of my AI keynote talk, delivered either in-person or virtually, we’ll cover all the high points of the talk, but specific to your industry, and critically, offer a ton of time to answer your specific questions that you might not feel comfortable asking in a public forum.

Here’s what one participant said after a working session at one of the world’s biggest consulting firms:

“No kidding, this was the best hour of learning or knowledge-sharing I’ve had in my years at the Firm. Chris’ expertise and context-setting was super-thought provoking and perfectly delivered. I was side-slacking teammates throughout the session to share insights and ideas. Very energizing and highly practical! Thanks so much for putting it together!”

Pricing begins at US$7,500 and will vary significantly based on whether it’s in person or not, and how much time you need to get the most value from the experience.

👉 To book a session, click here! 👈

What I’m Reading: Your Stuff

Let’s look at the most interesting content from around the web on topics you care about, some of which you might have even written.

Social Media Marketing

Media and Content

SEO, Google, and Paid Media

Advertisement: Business Cameos

If you’re familiar with the Cameo system – where people hire well-known folks for short video clips – then you’ll totally get Thinkers One. Created by my friend Mitch Joel, Thinkers One lets you connect with the biggest thinkers for short videos on topics you care about. I’ve got a whole slew of Thinkers One Cameo-style topics for video clips you can use at internal company meetings, events, or even just for yourself. Want me to tell your boss that you need to be paying attention to generative AI right now?

📺 Pop on by my Thinkers One page today and grab a video now.

Tools, Machine Learning, and AI

Analytics, Stats, and Data Science

All Things IBM

Dealer’s Choice : Random Stuff

Advertisement: Ukraine 🇺🇦 Humanitarian Fund

The war to free Ukraine continues. If you’d like to support humanitarian efforts in Ukraine, the Ukrainian government has set up a special portal, United24, to help make contributing easy. The effort to free Ukraine from Russia’s illegal invasion needs our ongoing support.

👉 Donate today to the Ukraine Humanitarian Relief Fund »

How to Stay in Touch

Let’s make sure we’re connected in the places it suits you best. Here’s where you can find different content:

Events I’ll Be At

Here’s where I’m speaking and attending. Say hi if you’re at an event also:

  • MAICON, Cleveland, July 2023 – use discount code TRUST150 to save $150 on the ticket
  • ISBM, Chicago, September 2023
  • Content Marketing World, DC, September 2023
  • Marketing Analytics and Data Science, DC, September 2023
  • Content Jam, Chicago, October 2023
  • MarketingProfs B2B Forum, Boston, October 2023
  • Social Media Marketing World, San Diego, February 2024

Events marked with a physical location may become virtual if conditions and safety warrant it.

If you’re an event organizer, let me help your event shine. Visit my speaking page for more details.

Can’t be at an event? Stop by my private Slack group instead, Analytics for Marketers.

Required Disclosures

Events with links have purchased sponsorships in this newsletter and as a result, I receive direct financial compensation for promoting them.

Advertisements in this newsletter have paid to be promoted, and as a result, I receive direct financial compensation for promoting them.

My company, Trust Insights, maintains business partnerships with companies including, but not limited to, IBM, Cisco Systems, Amazon, Talkwalker, MarketingProfs, MarketMuse, Agorapulse, Hubspot, Informa, Demandbase, The Marketing AI Institute, and others. While links shared from partners are not explicit endorsements, nor do they directly financially benefit Trust Insights, a commercial relationship exists for which Trust Insights may receive indirect financial benefit, and thus I may receive indirect financial benefit from them as well.

Thank You

Thanks for subscribing and reading this far. I appreciate it. As always, thank you for your support, your attention, and your kindness.

See you next week,

Christopher S. Penn


You might also enjoy:


Want to read more like this from Christopher Penn? Get updates here:

subscribe to my newsletter here


AI for Marketers Book
Take my Generative AI for Marketers course!

Analytics for Marketers Discussion Group
Join my Analytics for Marketers Slack Group!



Comments

One response to “Almost Timely News, July 23, 2023: AI, South Park, and LLaMas”

Leave a Reply

Your email address will not be published. Required fields are marked *

Pin It on Pinterest

Shares
Share This