You Ask, I Answer: Reducing Generative AI Hallucinations?

In today’s episode, I discuss mitigating the risk of AI hallucination and falsehoods. Pure language models like GPT-4 can make convincing yet untrue claims. Tools like Bing and Google Bard cite sources so you can verify authenticity. Join me to explore best practices for reducing made-up responses from generative AI.


Machine-Generated Transcript

What follows is an AI-generated transcript. The transcript may contain errors and is not a substitute for watching the video.

Today’s episode of You Ask, I Answer was recorded in front of a live studio audience at the Digital Now conference in Denver, Colorado, in November 2023.

The session title was, appropriately, “You Ask, I Answer Live: Generative AI Q&A.”

Enjoy.

Other questions?

Yes.

It’s fine.

Right now I have a staff member who is taking content from content specialists and putting it into ChatGPT and saying, “Okay, write me 10 social media posts and advertising copy,” etc.

What I would really love is if we had a mechanism to implement the social posts and say, “Now schedule these posts, two per month over a three-month period,” etc.

Where are we in achieving that? And does it already exist?

It depends on whose platform you’re using. Agorapulse, for example, has that utility, where it has generative AI prompts and you can say, here’s what I want, and do the thing.

I don’t know if it fits that exact use case, but it will get you awfully close.

If it can’t, almost every platform has bulk import of some kind.

So again, what you would do is say, here’s the thing, make me social content, but you’re going to format the output as a CSV file.

Column one is the URL, column two is the social post, column three is the date and time I want it scheduled. Process this and provide me a link to download.

So that’s your prompt.

It will spit out the output that you want.

And now you can just bulk load that into your social scheduler.
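Because a language model can return malformed rows, it may be worth checking the CSV before bulk loading it. What follows is a minimal sketch, not any scheduler's actual import format: the column names (`url`, `post`, `scheduled_at`), the timestamp format, and the function name `validate_schedule_csv` are all assumptions made for illustration.

```python
import csv
import io
from datetime import datetime
from urllib.parse import urlparse

# Hypothetical column names; match them to your scheduler's bulk-import template.
EXPECTED_COLUMNS = ["url", "post", "scheduled_at"]
# Assumed timestamp format; your scheduler may require a different one.
TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M"


def validate_schedule_csv(csv_text):
    """Return (valid_rows, problems) for CSV text produced by a language model."""
    reader = csv.DictReader(io.StringIO(csv_text))
    if reader.fieldnames != EXPECTED_COLUMNS:
        return [], [f"unexpected header: {reader.fieldnames}"]

    valid_rows, problems = [], []
    for row_number, row in enumerate(reader, start=1):
        # Missing fields come back as None, so fall back to an empty string.
        url = urlparse(row["url"] or "")
        if url.scheme not in ("http", "https") or not url.netloc:
            problems.append(f"row {row_number}: bad URL {row['url']!r}")
            continue
        if not (row["post"] or "").strip():
            problems.append(f"row {row_number}: empty post text")
            continue
        try:
            datetime.strptime(row["scheduled_at"] or "", TIMESTAMP_FORMAT)
        except ValueError:
            problems.append(f"row {row_number}: bad date {row['scheduled_at']!r}")
            continue
        valid_rows.append(row)
    return valid_rows, problems


# Example model output with one good row and two bad ones.
sample = """url,post,scheduled_at
https://example.com/a,First post,2024-01-05 09:00
not-a-url,Second post,2024-01-19 09:00
https://example.com/c,Third post,next Tuesday
"""
rows, problems = validate_schedule_csv(sample)
print(len(rows), "rows ready to import")
for problem in problems:
    print(problem)
```

Only the rows that pass every check are returned for import; the rest are reported so you can fix them by hand or ask the model to regenerate them.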

Yeah.

What about the images? Most social posts have images now.

We’re all using them, if it’s not just the web link that generates the image.

So is there a mechanism for that as well, with a JPEG or a PNG?

It depends on your social scheduler.

If your social scheduler can take an image link, then you would put your visual assets on some publicly accessible server and then provide the reference links to those things.

That’s probably the easiest way to do that.
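As a rough sketch of the reference-link idea, the snippet below turns image filenames into public URLs that could go in an extra CSV column. The base URL `https://assets.example.com/social/` and the function name `image_link` are placeholders, and whether your scheduler accepts an image-link column at all is something to confirm in its documentation.

```python
from urllib.parse import quote, urljoin

# Hypothetical public location for your visual assets.
ASSET_BASE_URL = "https://assets.example.com/social/"


def image_link(filename, base_url=ASSET_BASE_URL):
    """Build a public URL for an image file, escaping spaces and other unsafe characters."""
    return urljoin(base_url, quote(filename))


# Example: filenames as they might sit in a shared folder.
for name in ["launch-banner.png", "spring sale.jpg"]:
    print(image_link(name))
```

Escaping the filename matters because a space or other special character in a link can cause an importer to reject the row or fetch the wrong address.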

And the AI tool you’re describing could grab the link on the website and…

Yeah.

There was one called Agorapulse.

Yes, they’re based in France.

I’ve been using them for years. All of the social scheduling tools right now are struggling to figure out how to integrate AI, because they all want to be able to say that they have it.

Most of them are putting some implementation of OpenAI’s software in it, but they haven’t really figured out yet how to integrate it into the product.

So that particular part of the industry is still a very nascent space.

Yes, here and then here.

So it sounds like so many people in the room are well along their AI journey and I am not.

The last session I was in, they mentioned that they had started down the process using GPT-4, got to the end of it, and said, “This is giving us untrue responses and we can’t make it work.”

I have to keep my data.

I mean, it’s legal guidance, so I can’t risk untrue responses.

Does that negate the use of AI? Can you take it back to pure search, or is there a better tool?

I would use Bing, because you’ll at least get citations for where it’s getting its information. Google Bard is the other one.

So let’s go into Bard here.

Identify some ways that derivative works retain their copyright and the conditions under which a derivative work would lose its copyright, such as a transformative work.

Cite relevant cases.

So one of the things that Bard in particular has, which they just added not too long ago, is a little button down here called, “Hey, are you lying?” It’s called “double check my response.”

But what it does is it then goes and crawls Google’s index catalog and it highlights in green, “Hey, this is where I found this information.”

And then this one here, this Goldsmith vs. Hearst, says, “I found content that differs. I think I lied.”

But this one here, in this case, the Google vs. Oracle America, it found a citation that you can then go and check out to make sure it’s true.
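The double-check idea can be illustrated with a toy sketch. This is not how Bard works internally; it only shows the general pattern of comparing what a model cited against text you retrieved yourself. The function name `double_check` and the source text are placeholders.

```python
def double_check(cited_cases, source_texts):
    """Label each cited case 'found' if its name appears in any source text, else 'not found'."""
    # Lowercase and collapse whitespace so formatting differences do not block a match.
    normalized_sources = [" ".join(text.lower().split()) for text in source_texts]
    results = {}
    for case in cited_cases:
        needle = " ".join(case.lower().split())
        found = any(needle in source for source in normalized_sources)
        results[case] = "found" if found else "not found"
    return results


# Case names as a model might cite them, and placeholder text standing in for retrieved sources.
cited = ["Google vs. Oracle America", "Goldsmith vs. Hearst"]
sources = ["Placeholder source text that mentions Google vs. Oracle America by name."]

for case, status in double_check(cited, sources).items():
    print(f"{case}: {status}")
```

A name match only tells you the case is mentioned somewhere in the sources. It does not confirm that the model described the ruling correctly, so you still have to open the citation and read it.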

So the search-based language models now have some level of, “Hey, here’s where I got this information from.” I would absolutely not use ChatGPT for finding relevant data because it just hallucinates.

And it’s not intentional, it’s not malicious.

The way it works is it’s pulling those word clouds and it finds associations that have the greatest strength and it assembles an answer.

In very early versions, when you asked a question like, “Who was president of the United States in 1492?” it would pull 1492: what are the words associated with that? Well, there’s this Christopher Columbus person.

It would pull United States: what are the words associated with that? And the president, well, that’s an important person.

So it would answer, “Christopher Columbus was president of the United States in 1492.” That’s factually completely wrong, but the statistical associations made it the logical answer.
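To make that concrete, here is a deliberately oversimplified sketch. Real language models work on learned probabilities over tokens and are far more complex; the association counts below are made up for illustration, and `most_associated` is a hypothetical name. The point is only that picking the strongest association, with no fact check, can produce a confident wrong answer.

```python
# Made-up association counts for illustration only; these are not real model weights.
ASSOCIATIONS = {
    "1492": {"Christopher Columbus": 90, "George Washington": 0},
    "united states": {"Christopher Columbus": 50, "George Washington": 60},
    "president": {"Christopher Columbus": 20, "George Washington": 60},
}


def most_associated(query_terms, associations=ASSOCIATIONS):
    """Return the candidate with the highest total association across the query terms."""
    scores = {}
    for term in query_terms:
        for candidate, strength in associations.get(term, {}).items():
            scores[candidate] = scores.get(candidate, 0) + strength
    if not scores:
        return None
    # Nothing here checks whether the winning candidate is factually possible.
    return max(scores, key=scores.get)


print(most_associated(["1492", "united states", "president"]))
```

With these numbers the strong link between “1492” and Columbus outweighs everything else, so Columbus wins even though the United States did not exist in 1492.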

So pure language models like Claude and like ChatGPT’s GPT-4 model, they have no fact checking, right? Whereas the search-engine-based ones have some citations.

So I would always use that anytime I need to say, “Where did you get this information?”

If you enjoyed this video, please hit the like button.

Subscribe to my channel if you haven’t already.

And if you want to know when new videos are available, hit the bell button to be notified as soon as new content is live.

[MUSIC]




Want to read more like this from Christopher Penn? Get updates here:

subscribe to my newsletter here

