How to Fix AI + Contracts Being So Expensive and Slow

Transcript

Hi, I’m Josh, I’m the founder and CEO of Nomio, and I have spent the last 7 years building and maintaining contract repositories and doing very little else, which I believe makes me the only person in the world who can make that claim.

In the last session, I talked about how, if you naively stick Claude or ChatGPT or similar on top of your big pile of contracts, you’re going to end up with a very expensive and slow mess.

And the reason for that is that you’re going to be processing so much material from all of your contracts. All of the text in those contracts is going to make every single question you ask extremely expensive, and it’s going to take a long time to answer each question, which only gets worse when you’ve got agents hammering your contracts way more frequently than anyone on your team would.

So, the way that we fix this is by trying to cut out looking at all of your contracts, or all of the text in your contracts, for every question that you ask.

So, what if we could only query the relevant parts of each contract for the question that we’ve got? There’s already a technique that is widely used for this, and it’s called RAG: Retrieval-Augmented Generation.

You might have heard the term Vector Database. If you’ve not heard either of those terms, don’t worry, this is how most of these systems work, under the hood. So, what you do is you split up a document into a bunch of chunks. Then you classify those chunks, based on what they contain, and then you make yourself a nice database of pieces that you can gather every time you’ve got a question. Rather than looking at all of them at once, you can just pick the relevant pieces.
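To make that split-and-classify step concrete, here’s a toy Python sketch. Simple keyword tagging stands in for a real embedding model, and every name here is hypothetical, not how any particular system actually does it:

```python
# Toy sketch: split a contract into chunks and tag each chunk with a rough
# clause topic. In a real system, an embedding model would replace the
# keyword lists; all names here are made up for illustration.

def chunk_contract(contract_id: str, text: str, chunk_size: int = 200) -> list[dict]:
    """Split one contract into fixed-size chunks and tag each by topic."""
    topics = {
        "term": ["renew", "term", "expiry"],
        "termination": ["terminate", "termination", "penalty"],
        "payment": ["invoice", "payment"],
    }
    chunks = []
    for start in range(0, len(text), chunk_size):
        piece = text[start:start + chunk_size]
        lowered = piece.lower()
        # A chunk can match several topics; keep all of them.
        tags = [t for t, kws in topics.items() if any(k in lowered for k in kws)]
        chunks.append({"contract": contract_id, "text": piece, "tags": tags})
    return chunks
```

Run over a whole repository, this gives you the database of tagged pieces described above, ready to be filtered by topic instead of read in full.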

So, say we have a question like, which contracts auto-renew? We can look at our pieces, and instead of processing all of the text in all of our contracts and racking up a big bill, we can just look at the term-related pieces in our collection of contract text.
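Here’s what that lookup might look like in Python, assuming a pre-built store of classified pieces; the store contents and field names are invented for illustration:

```python
# A hypothetical pre-built store of classified contract pieces. In a real
# system these come from the chunking step, with embeddings rather than tags.

store = [
    {"contract": "acme-msa", "tags": ["term"],
     "text": "This agreement auto-renews for successive one-year terms."},
    {"contract": "globex-saas", "tags": ["payment"],
     "text": "Invoices are payable within 30 days."},
]

def pieces_for(topic: str) -> list[dict]:
    """Fetch only the pieces tagged with this topic, not every contract."""
    return [p for p in store if topic in p["tags"]]

relevant = pieces_for("term")  # one piece examined, not the full corpus
```

The saving is exactly that `relevant` is a small subset: the model only ever sees the term-related text, not everything.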

Now, the first thing I want to make clear is that we’re admitting that we need this intermediate layer, once we get to this point.

We are saying, hold on, just sticking Claude directly on top of our contracts is not going to be sufficient. We need some sort of representation that we can query, instead of just querying our big pile of contracts.

And that is a very important step, because we’ve answered the question of ‘to database or not to database.’

And now we’re trying to figure out what should that database look like. So, this is a popular way of doing things, but it has a bunch of problems and things to think about.

And most of it comes back to the fact that we have very little control over what this database system actually decides is relevant to our question.

So remember that the whole reason it cuts down on cost and time taken is because when we ask a question, we have some mechanism to determine which subset of all of my stuff am I going to look at and retrieve in order to answer this question.

First off, we’ve got to think about how we actually ask the question. So here we’ve got a question that touches several different pieces across our contracts.

The system is sensitive to exactly how we ask the question. I might have neglected to mention ‘supplier’ in this question and just said, ‘Which active contracts come with early termination fees?’

Even by the way that this question is currently phrased, it’s not clear whether we’re talking about early termination penalties for the supplier or for the customer.

And you’re really beholden to how you or your colleagues ask the question with a system like this. Because it will give you different answers based on slight variations in questions.

So there’s already a danger there. And then you’ve got the retrieval challenge, which is opaque to anyone using the system.

How does it actually figure out which pieces that it holds are relevant to the question that you’re asking? Especially when the question is not that simple.

So, when you say: ‘Which active supplier contracts?’ I’ve got to first go and look at which of my contracts are active, and then, across those results, I’ve got to figure out which of those have some sort of early termination provision, and then, of those, which concern penalties.
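The multi-step narrowing that question implies is easy to sketch, assuming we already had structured fields to filter on, which is precisely what raw text retrieval does not give us; the field names and records below are hypothetical:

```python
# Sketch of the three-step narrowing, assuming structured fields existed.
# The records and field names are invented for illustration.

contracts = [
    {"id": "acme-msa",    "status": "active",  "early_termination": True,  "penalty": True},
    {"id": "globex-saas", "status": "active",  "early_termination": True,  "penalty": False},
    {"id": "initech-nda", "status": "expired", "early_termination": True,  "penalty": True},
]

active  = [c for c in contracts if c["status"] == "active"]  # step 1: active only
with_et = [c for c in active if c["early_termination"]]      # step 2: has early termination
answer  = [c["id"] for c in with_et if c["penalty"]]         # step 3: penalties specifically
# → ["acme-msa"]
```

Each step is a separate filter, and a similarity search over text chunks has no guarantee of performing all three correctly.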

And what happens is we can still very easily blow through that context window that this whole technique was designed to avoid.

And then we’ve got the problem of how to take all of the pieces that we’ve got, whether or not they blow through the context window, and assemble them to synthesise an answer.

Are we simply going to cut and paste all of these blobs of text from contracts? We’re probably going to have to annotate them.

Like, this blob of text comes from this contract, this blob of text comes from that contract. If we’re asking a question that requires us to see who those contracts are with, we need to make sure that we go and fetch the name of the counterparty for each of the pieces that we’re gathering.
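That annotation step might look something like this in Python, assuming each piece carries its contract ID and we keep a separate counterparty lookup; again, all names are made up:

```python
# Sketch of labelling raw pieces before handing them to the model, assuming
# a hypothetical counterparty lookup keyed by contract id.

counterparties = {"acme-msa": "Acme Corp", "globex-saas": "Globex Ltd"}

pieces = [
    {"contract": "acme-msa",
     "text": "Either party may terminate on 30 days' notice."},
    {"contract": "globex-saas",
     "text": "Early termination incurs a fee of three months' charges."},
]

def assemble_context(pieces: list[dict]) -> str:
    """Prefix every blob with where it came from and who the contract is with."""
    lines = []
    for p in pieces:
        who = counterparties.get(p["contract"], "unknown counterparty")
        lines.append(f"[{p['contract']}, with {who}] {p['text']}")
    return "\n".join(lines)
```

Without those prefixes, the model receives anonymous snippets and cannot reliably say which contract, or which counterparty, an answer refers to.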

The problem doesn’t end just because we’ve got the raw pieces.

We have to attach useful information to those raw pieces so that we can actually answer the question. And there is uncertainty at every step of this process.

So by the time you get your answer, you’ve gone through several steps, several lossy layers, where you’re not quite sure if you’ve asked the question in the right way. If the thing has interpreted your question properly. If it’s gone and fetched all of the correct pieces based on your question. And then if it’s gone and assembled those pieces in a way that Claude, who you’ve passed them to, can actually synthesise a good answer.

So much of why we have these problems is because we haven’t thought about the structure of our outputs in advance.

If we want our outputs to be useful, then we need them to be structured. There’s limited usefulness to just having a big list of clauses or a big prose output.

We want something like a table. But the problem is, with a system like this, we are leaving it to the person asking the question to also define the shape of the output that they want to see every single time they ask the question.

This actually increases the friction of asking a question because you now have to do a lot more work if you want the answer to that question to be useful.

But also, you’re dependent on the competence of the person asking the question to know exactly how to frame the answer to make it really useful to a business.

And then if you’re doing this every single time, how are you going to get the consistency that you need in order to build a business process off this kind of stuff?

How are you going to ensure that the other systems in your business that rely on the answers here are going to be able to ingest these answers in the same form every single time?

It all points to doing the work up front, with care, to define what your answers to the questions you have are going to look like, so that you can build a predictable process on top of it.

If you know what the shape of your answers is going to be before you ask any question, it becomes much, much easier to build a robust business process on top.

And so if you don’t do this and you simply solve the expensive and slow LLM problem with a layer like this, which is what lots and lots of systems do, then you’re essentially trying to build a business process on chance.

And that is not a good way to build a business process. Ultimately, it just won’t end up getting used.

So all of this effort and all of this expense will be for nothing. And to illustrate how bad the answers are going to be, I’ve used a picture of a broken tractor.

Alright, thank you very much.
