Recently I had a discussion on the topic of trust, and it got me thinking about large language models. I will come back to LLMs shortly, but first imagine the following situation:
You ask a real person for some bit of information, and the information they provide is false, but you don’t know it yet. After wasting some of your time and realizing that the information wasn’t correct, or was even malicious, you get back to the person and tell them so. The response depends heavily on the person and on their intentions, but my point is that people can lie, often unconsciously. An “unconscious lie” isn’t really a lie in the traditional sense; often it’s just caused by misremembering something. Still, it’s sometimes hard to draw the line between a person simply misremembering something and a person acting with malicious intent.
So, if a person gives you misinformation over and over again, your trust credit for them drops pretty quickly, and in the future you usually avoid asking them anything. And that brings us back to large language models.
LLMs usually don’t have any intent to misinform, but they drift quite often, especially when you’re exploring an area that has much less training data than others. If you’ve ever tried asking ChatGPT for help with a not-so-popular programming language, you may have noticed that the examples are often complete gibberish, containing syntax from a completely different language or calls to non-existent functions. But why doesn’t our trust credit for LLMs drop the way it does when we’re talking with a real person? And why don’t we assume that an LLM can, in fact, have malicious intent?
What I often think about is that LLMs are created by big corporations, because this kind of work requires a lot of funding. And such corporations can have, and in fact often do have, specific intentions, one of which is expanding their reach in the global market. Not so long ago, a lot of people talked about how Google was scared of ChatGPT, because soon no one would use their search engine, which they rely on for advertising. But for some reason, no one talked about how ChatGPT itself could be used for advertising.
What I mean is that once the technology behind LLMs is widespread enough, it will have advertisements integrated into its text, probably far more seamlessly than Google integrates ads into its search results. And when asking an LLM a question, we’ll have to remember that the LLM may in fact be trying to advertise something we don’t really want. Or it may straight up generate a response that diminishes one product while comparing it to another. And so forth.
I think a lot of this comes from the fact that we know an LLM is not a real human, and that it has a limited amount of knowledge. We know it is a relatively new breakthrough in natural-language technology, and that it can give faulty responses because of that. But because things are changing gradually, we may not notice the moment when these seemingly accidental faults become intentional. After all, even when the model gives us a wrong answer, we still come back to it, which we probably wouldn’t do if it were not a model but a human.
So my thinking here is that we should not give LLMs a discount when it comes to our trust credit.