You Ask, I Answer: Retrieval Augmented Generation for Tax Law?

In today’s episode, we dive into the intricacies of using generative AI in tax law. You’ll learn about the integration of new regulations into AI models, and the dual approach of fine-tuning and retrieval-augmented generation needed for accuracy. Discover the challenges and techniques involved in making AI adapt to the latest tax laws for 2023. Tune in for a detailed exploration of the advanced use cases of generative AI in legal contexts and how to effectively update and train these models.

Machine-Generated Transcript

What follows is an AI-generated transcript. The transcript may contain errors and is not a substitute for watching the video.

Christopher Penn: Someone asked on YouTube about a specific use case of generative AI involving tax law, and whether they can just upload the latest regulations to make a large language model answer current tax law questions for tax year 2023.

The answer is: sort of. Here’s why.

There are two main ways to improve the performance of a large language model: fine-tuning and retrieval-augmented generation.

Now, yes, there’s a bunch of other techniques, but these are the big two. Fine-tuning helps guide a model to change how it answers, and retrieval-augmented generation increases the overall latent space.

Go back to a previous issue of the Almost Timely newsletter if you want to learn more about latent space itself.

In non-technical terms, think of this like a library, right? Think of a large language model like a library; it’s a really big library.

If you had a library with no indexing system, with books just everywhere, you would have to wander around that library until you found the books you wanted.

That’s very slow, very inefficient, horrendously inefficient.

Now, if you taught someone, or you learned yourself, where in that maze of books the tax books are, and you provided maps and signs and guides and an indexing system, it would be a whole lot easier for someone to get to the tax books in the library on subsequent visits.

That’s fine-tuning, right? Fine-tuning is teaching a model how to get to specific kinds of answers, and return specific kinds of answers, much more effectively and correctly.

Retrieval-augmented generation adds more books to the library, right? If you want a book on 2023 tax law, and it’s not in the library yet, the library will give you the next best thing, which is probably a book on 2022 tax law.

If you’re trying to deal with new regulations from 2023, that is not super helpful, right? Because it’s old information. Retrieval-augmented generation allows you to say, “Hey, model, here’s the 2023 tax law. Add it to the library.”

And now the model has that information to draw on.
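The "add a book to the library" step can be sketched in a few lines of Python. This is a toy illustration under loud assumptions, not a production retrieval system: the `LIBRARY` contents are placeholder text (not real tax law), the word-overlap `score` stands in for a real embedding search, and the names `retrieve` and `build_prompt` are hypothetical helpers invented for this example.

```python
import re
from collections import Counter

# Hypothetical mini "library". All text is placeholder content, not real tax law.
LIBRARY = [
    {"year": 2022, "text": "Placeholder 2022 rule: the widget deduction limit is 100 units."},
    {"year": 2022, "text": "Placeholder 2022 rule: filing for widgets is due in spring."},
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score(question, doc):
    """Count how many question tokens also appear in the document (toy relevance)."""
    q = Counter(tokenize(question))
    d = Counter(tokenize(doc["text"]))
    return sum(min(q[t], d[t]) for t in q)

def retrieve(question, library, top_k=1):
    """Return the top_k documents with the highest overlap score."""
    ranked = sorted(library, key=lambda doc: score(question, doc), reverse=True)
    return ranked[:top_k]

def build_prompt(question, library):
    """Put the retrieved text in front of the question, as a RAG prompt would."""
    context = "\n".join(f"[{d['year']}] {d['text']}" for d in retrieve(question, library))
    return f"Answer using only this context:\n{context}\n\nQuestion: {question}"

question = "What is the 2023 widget deduction limit?"

# Before the 2023 document is added, the "next best thing" is a 2022 document.
print(retrieve(question, LIBRARY)[0]["year"])

# After adding the 2023 document, retrieval can surface it.
updated = LIBRARY + [
    {"year": 2023, "text": "Placeholder 2023 rule: the widget deduction limit is 150 units."}
]
print(retrieve(question, updated)[0]["year"])
print(build_prompt(question, updated))
```

The point of the sketch is only that the model can draw on whatever the retrieval step hands it; if the 2023 text is not in the library, the closest older text is what comes back.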

But here’s the thing about this YouTube comment, because it’s a good comment.

It’s a good question.

For this specific question of whether you can just add tax law to a model to have it answer questions about current tax law, the answer is probably not. You need to do both fine-tuning and retrieval-augmented generation.

Yes, you absolutely need to upload the new tax law.

That information has to be in the latent space, the model has to have knowledge of it.

But you may have specific questions about the new tax law that have not been seen before.

Maybe there’s a new regulation, a new law that was passed, that isn’t in previous models and wouldn’t be previously known. You would have to fine-tune the model to handle those new tax law questions, right? And if it was a change to the law, you would have to fine-tune the model to not only know the new law, but also, when it encountered probabilities in its index about the old version of the law, to know that that’s not valid anymore.
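One way to picture that fine-tuning step is as a set of training examples that pair questions with answers stating the new rule and explicitly retiring the old one. The sketch below is an assumption-heavy illustration: the `RULE_CHANGES` values are placeholders (not real tax law), the `prompt`/`completion` JSONL layout is a generic format chosen for the example rather than any particular vendor’s required schema, and `make_examples` and `to_jsonl` are hypothetical helper names.

```python
import json

# Hypothetical rule changes. Placeholder values, not real tax law.
RULE_CHANGES = [
    {
        "topic": "widget deduction limit",
        "old_year": 2022,
        "old_value": "100 units",
        "new_year": 2023,
        "new_value": "150 units",
    },
]

def make_examples(change):
    """Build two training examples: one for the new rule, one retiring the old rule."""
    topic = change["topic"]
    current = f"For tax year {change['new_year']}, the {topic} is {change['new_value']}."
    superseded = (
        f"The {change['old_year']} figure of {change['old_value']} "
        f"no longer applies for {change['new_year']}."
    )
    return [
        {"prompt": f"What is the {topic} for {change['new_year']}?", "completion": current},
        {
            "prompt": f"Is the {topic} still {change['old_value']}?",
            "completion": f"No. {superseded} {current}",
        },
    ]

def to_jsonl(changes):
    """Serialize every example as one JSON object per line."""
    rows = [example for change in changes for example in make_examples(change)]
    return "\n".join(json.dumps(row) for row in rows)

print(to_jsonl(RULE_CHANGES))
```

The second example in each pair is the part that matters for changed laws: it gives the model a pattern for recognizing the old answer and steering away from it, rather than only supplying the new fact.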

It’s not just as simple as adding more documents. Adding more documents on its own doesn’t get you there.

You need to do both.

This is where you get to advanced use cases for generative AI, because it’s not just as simple as adding more documents.

Certainly adding the 2023 documents is helpful, and it’s better than doing nothing.

But it’s probably not going to solve the problem.

It’s probably not going to answer the questions in a correct way.

That’s because all these machines are is probability and pattern generators.

All the patterns it knows are from previous versions.

So you need to not only change the knowledge, but you need to change how the machine knows where to get the knowledge and which knowledge to go get.

But it’s a really good question for understanding generative AI and what you need to do to make a language model do what you want.

So thanks for the question.

Talk to you soon.

If you enjoyed this video, please hit the like button.

Subscribe to my channel if you haven’t already.

And if you want to know when new videos are available, hit the bell button to be notified as soon as new content is live.
