Warning: this content is older than 365 days. It may be out of date and no longer relevant.

Topic Modeling for Marketers: Definition

Topic modeling is one antidote to the overwhelming volume of content created every day that marketers must understand. In this series, we’ll explore what topic modeling is, why it’s important, how it works, and some practical applications for marketing.

Part 2: What Is Topic Modeling?

Let’s begin by answering the question: what is topic modeling?

Here’s a great definition from KDNuggets:

Topic modeling can be described as a method for finding a group of words (i.e topic) from a collection of documents that best represents the information in the collection. It can also be thought of as a form of text mining – a way to obtain recurring patterns of words in textual material.

The easiest way to think of a topic model is as a word-based summary of a body of text. Think of how a table of contents outlines a book, or how a menu outlines the food at a restaurant. That’s essentially what a topic model does.

Topic models first came into use in the late 1990s, with Thomas Hofmann’s probabilistic latent semantic analysis. They’ve become more popular over the years as computing power has increased.

How Do Topic Models Work?

Topic models are a product of mathematical and statistical analysis. In essence, they assign numerical values to words, then look at the mathematical probabilities of those numerical values.

For example, consider this sentence:

I ate breakfast.

We could assign arbitrary numerical values to this sentence, such as I = 1, ate = 2, and breakfast = 3.

Now, consider this sentence:

I ate eggs for breakfast.

Reusing the previous numbers and assigning eggs = 4 and for = 5, we would have the sequence 1, 2, 4, 5, 3.

Next, consider this sentence:

Mary ate breakfast with me.

Assigning Mary = 6, with = 7, and me = 8, this would have the sequence 6, 2, 3, 7, 8.
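The numbering used for these three sentences can be sketched in a few lines of Python. This is only an illustration of the idea, not how any particular topic modeling package does it: the `tokenize` helper and the rule of handing out numbers in order of first appearance are assumptions made to match the example.

```python
import string

sentences = [
    "I ate breakfast.",
    "I ate eggs for breakfast.",
    "Mary ate breakfast with me.",
]

def tokenize(sentence):
    """Lowercase a sentence, strip punctuation, and split it into words."""
    cleaned = sentence.lower().translate(str.maketrans("", "", string.punctuation))
    return cleaned.split()

# word -> number, assigned in order of first appearance (an assumption
# chosen so the numbers match the example sentences)
vocabulary = {}
encoded = []
for sentence in sentences:
    ids = []
    for word in tokenize(sentence):
        if word not in vocabulary:
            vocabulary[word] = len(vocabulary) + 1
        ids.append(vocabulary[word])
    encoded.append(ids)

for ids in encoded:
    print(ids)
```

Note that this simple approach treats "I" and "me" as two different words, just as the example does.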

Put these sequences together:

1, 2, 3

1, 2, 4, 5, 3

6, 2, 3, 7, 8

We begin to see patterns in this table. The number 2 appears three times. The number 3 appears three times. The number 1 appears twice, and always next to the number 2. The number 3 moves around a bit: it ends the first two sequences but sits in the middle of the third.

This mathematical understanding of our text is how topic models work; statistical software measures features such as:

  • How often does a number (word) appear?
  • How often does a number (word) appear only within one document, but not in others?
  • How often do certain numbers (words) appear next to each other?
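To make these three questions concrete, here is a minimal sketch in Python that answers each one for the three number sequences from our example. It uses only the standard library's `Counter`; the variable names are made up for illustration, and real topic modeling software does considerably more than count.

```python
from collections import Counter

# The three encoded sentences from the example.
documents = [
    [1, 2, 3],
    [1, 2, 4, 5, 3],
    [6, 2, 3, 7, 8],
]

# 1. How often does each number (word) appear overall?
term_counts = Counter(token for doc in documents for token in doc)

# 2. Which numbers appear in only one document?
#    Count each number at most once per document, then keep those seen once.
doc_counts = Counter(token for doc in documents for token in set(doc))
only_in_one_doc = sorted(t for t, n in doc_counts.items() if n == 1)

# 3. How often do two numbers appear next to each other, in that order?
pair_counts = Counter(
    pair for doc in documents for pair in zip(doc, doc[1:])
)

print(term_counts[2], term_counts[3], term_counts[1])  # 3 3 2
print(only_in_one_doc)                                  # [4, 5, 6, 7, 8]
print(pair_counts[(1, 2)], pair_counts[(2, 3)])         # 2 2
```

The numbers 2 and 3 (ate, breakfast) show up in every sentence, while 4 through 8 each belong to a single sentence. Counts like these are the raw material that statistical methods build on.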

While this seems like a lot of work to analyze three sentences, the value of topic modeling lies in performing this kind of analysis on thousands or millions of sentences, especially when time matters.

For example, suppose we’re attending a major conference like Dreamforce or CES. If we want to participate in relevant conversations, we should know which topics are most on the minds of attendees. However, mega-events often generate hundreds or thousands of social media posts per hour. No human or even group of humans could reasonably keep up with the raw feed from such an event. A machine can.

Walking Through a Topic Model

In the next post in this series, we’ll explore the process of creating a topic model. Stay tuned!


Want to read more like this from Christopher Penn? Get updates here:

subscribe to my newsletter here

