GRAMMY Debates with IBM Watson

This week, I had the pleasure of sitting down with IBM Project Debater system lead Yoav Katz for an in-depth chat about how Project Debater has evolved since its debut at IBM THINK 2019 and how it’s being used for the GRAMMY Debates with Watson.

What Is IBM Project Debater?

For those unfamiliar, Project Debater is an IBM Research project to create a practical, conversational AI system that can hold its own in a verbal debate, academic style, with a human being. At its first major public appearance in 2019 at IBM’s THINK conference, Project Debater did indeed hold its own against a human debate champion onstage, on the topic of subsidizing kindergarten.

The core technologies used in that version of Project Debater have been extended to a new application: debate by crowd. For the GRAMMYS, IBM has opened up four debate by crowd opportunities for people to submit points of view in its GRAMMY Debates with Watson on these topics:

  • Billie Eilish is the biggest style icon in music
  • Music education should be mandatory in all K-12 schools
  • Prince is the most groundbreaking artist of all time
  • Virtual concerts are better experiences than live shows

If you have a position on any of these topics, submit your arguments here; just navigate down to Try It Yourself.

Why IBM Project Debater Matters

IBM’s Project Debater combines four sets of technologies – speech-to-text transcription, topic identification, argument synthesis, and text-to-speech production. For the GRAMMYS project, the topic identification piece is the part doing the work, performing what’s called abstractive summarization.

Abstractive summarization is exactly what it sounds like – a summary that restates the content in new words, rather than extracting pieces of it. In extractive summarization, we look for the most important words and phrases and more or less copy/paste them directly. In abstractive summarization, we may write without using any of the original words and phrases from our source data.
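To make the contrast concrete, here is a minimal sketch of the extractive side only, using nothing but Python's standard library. It is not how Project Debater works; the stopword list, the scoring rule (sum of word frequencies), and the function names are all assumptions made for illustration. Note that every sentence it returns is copied verbatim from the input, which is precisely what an abstractive system is not limited to.

```python
import re
from collections import Counter
from typing import List

# Hypothetical, deliberately tiny stopword list for illustration only.
STOPWORDS = {"a", "an", "and", "the", "is", "in", "of", "to", "it", "for", "with", "that"}


def content_words(text: str) -> List[str]:
    """Lowercase the text and return its words, minus stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def split_sentences(text: str) -> List[str]:
    """Split on sentence-ending punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extractive_summary(text: str, max_sentences: int = 1) -> List[str]:
    """Return the highest-scoring sentences, copied verbatim from the source."""
    sentences = split_sentences(text)
    freq = Counter(content_words(text))

    def score(sentence: str) -> int:
        # A sentence scores higher when it contains frequently used words.
        return sum(freq[w] for w in content_words(sentence))

    chosen = set(sorted(sentences, key=score, reverse=True)[:max_sentences])
    # Keep the chosen sentences in their original order.
    return [s for s in sentences if s in chosen]


if __name__ == "__main__":
    sample = (
        "Music helps children learn. "
        "Music classes help children focus and music builds creativity. "
        "Budgets are limited."
    )
    print(extractive_summary(sample))
```

An abstractive system would instead generate new sentences, which is why it needs a language generation model rather than a ranking function like this one.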

On the GRAMMY Debates with Watson site, we humans submit our arguments, and Project Debater ingests them to synthesize an abstractive summary of them. Here, for example, are the arguments for and against music education:

Project Debater summary


Greetings all. The following analysis is based on 329 arguments, contributed by people around the globe, identified as high-quality arguments supporting the notion that music education should be mandatory in all K-12 schools.

21 percent of the arguments argued that music in schools helps children develop better. Music education is an important aspect of providing children with a well-rounded education. When allowed to work in harmony with other subjects and areas of study, music helps children grow in self-esteem. Music education has been shown to enhance a students abilities in other disciplines, therefore learning music should be a required in public school to support greater overall achievement and knowledge. It allows a great development in children, allowing them to find their way. Music stimulates brain development in children. Music also integrates many different subjects. Music is a way to connect with other people and can relieve stress. Music education should be mandatory because it is factually proven that those who take music classes have better grades and reduced anxiety; this is crucial for students who are struggling.

Another recurring point, raised in 7 percent of the arguments, is that music education should be required because it encourages creativity! Music education nurtures and builds creativity, sharpness of thought and mind, establishes a process of innovation-driven thinking, and brings joy. Music education should be implemented in all educational systems, since it allows children to develop their passion and love for music. It has proven results in helping kids be more emotionally stable, as well as giving them a creative outlet. Music brings out creativity in children, helps with reading and math, and increases children’s attention spans.

7 percent of the arguments proposed that music enhances brain coordination and increases brain capacity. Music can lead to better brain development, increases in human connection, and even stress relief. Music helps logical thinking, and is thus useful. Using a different part of our brains gives greater control and balance; it is a good balance to our STEM focused curriculum. One of the most useful benefits of music education is the increased ability to process situations and find solutions mentally.

6 percent of the arguments mentioned that research shows that music training boosts IQ, focus and persistence. Music education in schools is of great benefit to children as it increases their memory, attention and concentration capacity. There is a heap of incontestable research showing that an education rich in music improves students’ cognitive function and academic performance. It is an important part of education in all K-12 schools at this stage since it trains children with great cognitive ability. Adolescents with music training have better cognitive skills and school grades and are more conscientious, open and ambitious.

To conclude, the above examples reflect the crowd’s opinions, that music education should be mandatory in all K-12 schools. Thank you for joining.


Greetings. The following analysis is based on 109 arguments submitted by people around the world, identified as high-quality arguments contesting the notion that music education should be mandatory in all K-12 schools.

22 percent of the arguments argued that music education can distract kids from really important subjects. STEM education should be a priority and music education takes away funding from more important subjects. There are more important topics such as economics and medicine; these subjects give basic knowledge to the students. Music should not be required at school because it can be very distracting for students. It should be considered optional, and students should focus on important topics such as grammar or mathematics.

Another 5 arguments conveyed that our taxes should not pay for non-essential education like music and art. Providing music education in K-12 schools is a waste of budget that could be invested in other more important areas such as physics, chemistry, mathematics and languages. Schools have limited budgets and the study of academic areas such as Math, English and Science need to be a priority.

4 arguments alluded that school districts do not have the funding needed for music education. Music education is prohibitively expensive. The poorest students cannot afford an expensive extracurricular activity like band; our tax dollars end up subsidizing privileged kids’ hobby. Music education puts too much of a strain on already limited resources. It requires funding that could be used to fund STEM programs instead. When budgets are stretched, there are other subject areas that schools should prioritize first – musical education should be kept as a subsidized, optional, after-hours extra.

To conclude, the above examples summarize the crowd’s arguments, opposing the notion that Music education should be mandatory in all K-12 schools. Thank you for joining.

Do you see how powerful this technology is at abstractive summarization, the ability to take in a lot of input and boil it down to relatively concise, understandable summaries?

This technology has applications far beyond debate topics. Abstractive summarization could, for example, ingest the entirety of your customer service inbox each day and provide a rollup summary of the key issues customers are facing, in an easy-to-read brief that would help you understand the frustrations customers are feeling.
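As a rough illustration of the shape such a rollup might take, here is a toy sketch that groups messages by issue and reports each issue's share, echoing the "21 percent of the arguments argued..." framing in the Project Debater output. It is a far simpler stand-in, not abstractive summarization: it only counts keyword matches, and the issue labels, keywords, and function names are hypothetical.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical issue labels and trigger keywords. A real system would learn
# these groupings rather than rely on a hand-written table.
ISSUE_KEYWORDS: Dict[str, List[str]] = {
    "billing": ["refund", "charge", "invoice"],
    "shipping": ["late", "delivery", "tracking"],
    "login": ["password", "locked", "sign in"],
}


def label_message(message: str) -> str:
    """Return the first issue whose keywords appear in the message."""
    text = message.lower()
    for issue, keywords in ISSUE_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return issue
    return "other"


def rollup(messages: List[str]) -> List[str]:
    """Summarize what share of the messages falls under each issue."""
    counts = Counter(label_message(m) for m in messages)
    total = len(messages)
    lines = [f"Based on {total} messages:"]
    for issue, n in counts.most_common():
        lines.append(f"{round(100 * n / total)} percent mentioned {issue} issues")
    return lines


if __name__ == "__main__":
    inbox = [
        "I was charged twice, please refund me",
        "My delivery is late",
        "Where is my tracking number?",
        "I love the product",
    ]
    print("\n".join(rollup(inbox)))
```

The counting step is the easy part; what a system like Project Debater adds is writing readable prose about each group in its own words.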

For content marketers, think of the amazing opportunities available to us to synthesize relevant, cogent new content from sources. Instead of simply parroting or replicating user-generated content, we could build entirely new content with these technologies. Imagine taking your top positive reviews for a product and synthesizing marketing copy from them, creating customer-centric, customer-led marketing content.

How Does IBM Project Debater Do This?

In my conversations with Yoav Katz, Manager of IBM Debating Technologies, we talked through the architecture of Project Debater in 2019 versus how it’s structured now. Back then, Project Debater was a monolithic system of 10 different AI engines all working together to process a single person’s human speech and create responses.

Today’s system, the one powering the GRAMMY Debates with Watson, is much more scalable. Broadly (because the details are confidential), Project Debater moved all its symbolic, rules-based AI up front to screen out junk, and completely changed out its neural engines on the back end, switching from LSTMs (long short-term memory neural networks) to transformers, the current state of the art in natural language processing and generation.

Any time you put something on the Internet open to the public, you’ll inevitably get trolls and jerks, so this system is a model for how we should think about deploying AI in production. Transformers – the advanced language processing models used in headline-making technologies like Google’s BERT, Facebook’s BART, and OpenAI’s GPT-3 – are incredible at natural language processing and generation, but they come at a computational cost that’s substantially higher than that of older technologies.

How much more? LSTMs run very well on small hardware; every time you use autocomplete on your smartphone, you’re using an LSTM. Transformers need beefy hardware; someone doing development at home needs hundreds, if not thousands, of dollars in hardware to run transformers efficiently and at scale. For a project like GRAMMY Debates with Watson, you’re talking thousands of virtualized server instances on IBM Cloud that have to scale up when demand gets high.

So IBM’s use of more primitive, rules-based AI up front to screen out hate speech, inappropriate content, and irrelevant submissions takes the load off the transformer engines, ensuring that only relevant content makes it into processing.
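The pattern of cheap rules first, expensive model second can be sketched in a few lines. The details of IBM's actual screening are confidential, so everything here is an assumption for illustration: the blocked terms, the topic terms, the minimum length, and the `expensive_model` callable that stands in for a transformer.

```python
import re
from typing import Callable, List

# Hypothetical screening rules; placeholders, not IBM's actual filters.
BLOCKED_TERMS = {"badword", "spamlink"}
TOPIC_TERMS = {"music", "school", "schools", "education", "students"}
MIN_WORDS = 4


def passes_rules(argument: str) -> bool:
    """Cheap, rules-based checks that run before any neural model."""
    words = re.findall(r"[a-z']+", argument.lower())
    if len(words) < MIN_WORDS:
        return False  # too short to be a real argument
    if any(w in BLOCKED_TERMS for w in words):
        return False  # inappropriate content
    return any(w in TOPIC_TERMS for w in words)  # must be on topic


def run_pipeline(arguments: List[str], expensive_model: Callable[[str], str]) -> List[str]:
    """Send only the submissions that survive the rules to the costly model."""
    kept = [a for a in arguments if passes_rules(a)]
    return [expensive_model(a) for a in kept]


if __name__ == "__main__":
    submissions = [
        "Music education helps students focus in school",
        "buy cheap watches now online",
        "badword music education is for losers",
        "yes",
    ]
    # str.upper stands in for the transformer; only one submission reaches it.
    print(run_pipeline(submissions, expensive_model=str.upper))
```

Because the rules cost almost nothing per submission, every item they reject is an inference call the expensive back end never has to make.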

Another key lesson Katz discussed with me was that the production model isn’t learning. IBM pre-trained and tested it, but the model itself isn’t doing any kind of reinforcement learning or active learning; our inputs have no impact on the model itself. This is an essential lesson for production AI. Why? Back in 2016, Microsoft deployed an experimental NLP model on a Twitter account, called Microsoft Tay. It was built on a reinforcement learning model that would take input from Twitter users to synthesize tweets.

The Internet being the Internet, trolls managed to spike Tay’s language model and turn it into a racist, pornographic account in under 24 hours.

Keeping Project Debater’s model static not only decreases its computational costs, it also insulates the model from bad actors on the Internet.
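One way to picture a static production model is as an object whose parameters are loaded once and then made read-only, so that nothing a user submits can alter them. The sketch below is a hypothetical toy (a word-weight scorer, not a transformer), and the class and function names are assumptions; it only illustrates the principle that inference reads the parameters and never writes them.

```python
from dataclasses import dataclass, FrozenInstanceError
from types import MappingProxyType
from typing import Dict, Mapping


@dataclass(frozen=True)
class StaticModel:
    """A toy model whose parameters cannot change after loading."""

    weights: Mapping[str, float]

    def score(self, text: str) -> float:
        # Inference only reads the weights; there is no update step.
        return sum(self.weights.get(w, 0.0) for w in text.lower().split())


def load_model(trained_weights: Dict[str, float]) -> StaticModel:
    """Copy the pre-trained weights and expose them through a read-only view."""
    return StaticModel(MappingProxyType(dict(trained_weights)))


if __name__ == "__main__":
    model = load_model({"music": 1.0, "creativity": 0.5})
    print(model.score("Music builds creativity"))

    try:
        model.weights["music"] = -100.0  # a troll tries to poison the model
    except TypeError:
        print("weights are read-only")

    try:
        model.weights = {}  # swapping the weights out wholesale also fails
    except FrozenInstanceError:
        print("model is frozen")
```

Retraining still happens, but offline and on vetted data, with a new model deployed deliberately rather than updated by whatever arrives from the public.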

Lessons in AI

What Project Debater’s GRAMMY Debates with Watson shows us is a great blueprint for deploying AI:

  • Build and tune your models up front
  • Put your lowest-cost processing technologies early in the process to reduce the input dataset
  • Insulate your production models from drift caused by suspicious inputs
  • Build using microservices architectures in a cloud environment so that your deployment can scale up faster to meet demand

Go ahead and try out GRAMMY Debates with Watson and see for yourself how it works – and how the underlying technologies might be useful to you.

FTC Disclosures

I am an IBM Champion, and my company, Trust Insights, is a Registered IBM Business Partner. Should you do business with IBM through us, I receive indirect financial benefit. IBM did not provide direct compensation for me to participate in or review GRAMMY Debates with Watson.

Want to read more like this from Christopher Penn? Get updates here:

subscribe to my newsletter here
