LLMs are bringing so much to the table right now, with the potential to solve complex problems and automate tasks like never before. But getting the most out of these models requires new skills. And today, we’ve got just the expert to help you do that!
Meet John Berryman, a true leader in the field of LLMs. He is the Founder and Principal Consultant at Arcturus Labs, where he consults on LLM application development and is a specialist in prompt engineering. Before founding Arcturus, he was a senior machine learning researcher at GitHub, where he played a pivotal role in developing both the Copilot code completions product and the chat product. His most recent book is titled Prompt Engineering for LLMs: The Art and Science of Building Large Language Model–Based Applications.
In this episode of Scaling Tech, Arin and John dive deep into prompt engineering for LLMs. John shares practical techniques for best communicating with LLMs, making efficient prompts, and ultimately building LLM-powered applications.
Ready to level up your understanding of LLMs and prompt engineering? Don’t miss this episode!
Listen to the full episode on Spotify
Listen to the full episode on Apple Podcasts
Watch the video:
Key Insights are below
About Guest:
Name: John Berryman
What he does: He’s a Consultant in Large Language Model Application Development.
Company: Arcturus Labs
Where to find John: LinkedIn
Key Insights
⚡AI is moving towards more agency. As the technology advances, large language models are becoming more autonomous and can perform tasks on behalf of the user. However, they still require human involvement in the process. John explains, “What we’re kind of getting at the extreme of this is something that people are calling agency. Now, agency is another one of those terms. It’s going to be another one of those terms for a while. It’s kind of a buzzword. And when everyone says that, they have a different thing in mind. But when I think of agency with the technology we currently have on hand, I’m thinking about something that is allowing these language models to take multiple steps and do things on the behalf of the user but, at this point, still very much incorporates the user into the process.”
⚡Context is key for unlocking the full potential of AI models. As these models become more sophisticated, the importance of providing clear, structured context only increases. John explains, “It is completely expected that if you add more and more stuff, you’re not only going to saturate, it’s going to keep increasing in cost, but the amount of benefit that you provide is going to plateau. These models are doing better and better for more context. But empathy again. They’re quite a bit like humans. You give it so much stuff, and they end up having more and more trouble picking out the important stuff because you said, ‘Here’s everything that might be useful, solve this problem, I’m going to lunch.’ So crafting a good context is important for getting the most out of these models efficiently, with low latency and low cost.”
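The idea of crafting context rather than dumping in everything can be sketched in a few lines: rank candidate snippets by relevance and include only the best ones until a token budget is reached. This is a minimal illustration, not John’s actual method; `estimate_tokens`, the scores, and the snippets are all hypothetical stand-ins.

```python
def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English text.
    return max(1, len(text) // 4)

def build_context(snippets: list[tuple[float, str]], budget: int) -> str:
    """Keep the highest-scoring snippets that fit within `budget` tokens."""
    chosen = []
    used = 0
    for score, text in sorted(snippets, key=lambda s: s[0], reverse=True):
        cost = estimate_tokens(text)
        if used + cost > budget:
            continue  # skip anything that would blow the budget
        chosen.append(text)
        used += cost
    return "\n\n".join(chosen)

# Hypothetical (relevance score, snippet) pairs for a coding task.
snippets = [
    (0.9, "Relevant API docs for the function being edited."),
    (0.2, "Loosely related changelog entry."),
    (0.7, "The failing test case and its error message."),
]
context = build_context(snippets, budget=30)
```

The point of the sketch is the trade-off John describes: every snippet costs tokens (and latency, and money), so the low-relevance material is the first thing to cut.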
⚡How to get better results from AI models? The more detailed your approach with AI, the better. John suggests approaching it like you would a human—providing well-structured, chunked information for the best results. He says, “If you’re trying to transition from the user’s understanding of the problem space to the model’s understanding of the problem space, which is text and content like that, then the first step you do is retrieve the context. So our exercise in the book here is just lay out in front of you, maybe you can do it like the yellow post-it notes, every possible thing that might be of benefit for considering the answer to this problem. Think through it as if you’re instructing a junior coworker. How would you solve the problem? You can sort of treat them like humans.”
Episode Highlights
What is prompt engineering?
Prompt engineering is less about asking the right questions and more about providing the right context to guide the model’s responses.
John explains, “I think when you think of prompt engineering, the first thing that comes to mind is something like ChatGPT or even these generative image models. You type in something and you get it to generate some sort of output, and that’s a big part of it. That is where the name, after all, came from. But when I talk about prompt engineering, really, and perhaps it’s a bit of a misnomer at this point, but really, what I’m getting at is something a bit deeper than that. I’m getting at not only how you converse with these models to get the best behavior out of these models but also how you shape the application around it so that you have a good interaction between the user and this alien technology that we’ve created. How can you solve the problems for the user? So, building the whole application, really.”
Empathy is key to getting the best results from AI models
To get the best results from AI, it’s important to approach it with empathy—yes, it might sound a bit abstract, but it’s a key strategy. John explains how it works, “One of the things that we really tried to drive home in the book is again the need to empathize with the model. And so you think about the model, and you think about, ‘Well, what is it familiar with?’ Fortunately, it’s read the internet five times or something like that. So whenever you’re conversing with the model, whenever you’re trying to put something, here’s a way to think about it. Here’s a way to empathize with it. The original models and even the chat models, when you really dig down deeper, are just completions models. They see the top half of a document, the prefix, and they’re one token at a time predicting next word, next word, next word, next word. And so if you want it to have the best shot at making a sensible prediction, then you make sure that the top half of that document looks like something that it’s familiar with. And then you can depend on it to do a pretty good job.”
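The “top half of a document” idea can be made concrete with a small sketch: frame a code-completion task as the prefix of an ordinary source file, the kind of document a completions model has seen countless times in training. This is an illustrative toy, not GitHub’s actual Copilot prompt; the helper name and the example inputs are invented.

```python
def make_completion_prompt(path: str, signature: str, description: str) -> str:
    """Build a document prefix that looks like a familiar code file:
    a file-path comment, a function signature, and a docstring, leaving
    the body for the model to complete."""
    return (
        f"# File: {path}\n"
        f"{signature}\n"
        f'    """{description}"""\n'
    )

prompt = make_completion_prompt(
    path="utils/slug.py",
    signature="def slugify(title: str) -> str:",
    description="Convert a title to a URL-friendly slug.",
)
```

Because the prefix resembles real code from the model’s training data, a completions model has a much better shot at predicting a sensible function body than it would from a bare instruction.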
Human judgment is irreplaceable in AI
Despite significant advancements in AI, human judgment remains essential for ensuring accuracy and making meaningful decisions.
As John explains, ”Large language models as judges. That’s kind of a whole domain. We’re still trying to wrap our heads around this. Who’s judging the judges? They don’t always do a great job. But there are good rules of thumb about how to get the best judgments. And you tune your judge model so that it will most closely agree with your human judges and then eventually you reduce the workload on humans entirely and have something like that. But it’s like Hamel Husain often says, the most important thing is not the cool toys, the large language models to judge. The most important thing is that a human is actually looking at the data and making the human judgment. We’re still important.”
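Tuning a judge model so that “it will most closely agree with your human judges” implies measuring that agreement. Here is a minimal sketch of the validation step: compare the judge’s verdicts to human verdicts on a labeled sample. The labels and verdicts below are made up for illustration; real evaluations often use more robust agreement statistics than raw accuracy.

```python
def agreement_rate(judge_labels: list[str], human_labels: list[str]) -> float:
    """Fraction of examples where the judge matches the human verdict."""
    assert len(judge_labels) == len(human_labels), "label lists must align"
    matches = sum(j == h for j, h in zip(judge_labels, human_labels))
    return matches / len(human_labels)

# Hypothetical verdicts on five evaluated responses.
human = ["good", "bad", "good", "good", "bad"]
judge = ["good", "bad", "bad", "good", "bad"]

rate = agreement_rate(judge, human)  # 4 of 5 agree -> 0.8
```

Only once this rate is consistently high would you trust the judge to take over some of the human workload, which is John’s point: a human still has to look at the data first.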