From ChatGPT, with love
This article was first published in the May edition of People Matters Perspectives.
It’s easy to fall in love with ChatGPT. And not just in the sense that generative AI makes your life easier. Dozens of anecdotes from around the Internet illustrate how convincingly chatbots can emulate interaction with another human being, and how easy it is to buy into their output completely - hallucinations, generalisations, and all.
Why does this happen? The answer lies in the intersection between how LLMs process language and how the human brain is hardwired to perceive input.
Put simply, large language models produce their responses through a process of tokenisation and probability estimation. Let’s say a user inputs a prompt. This is what happens in the few seconds that follow.
Step 1: The model breaks the prompt into tokens - small units that may be whole words, fragments of words, or punctuation marks.
Step 2: For each token, the model draws on the patterns learned from its training data to identify other words, phrases, and sentences (‘strings’) that are statistically associated with it.
Step 3: The model assigns a value to each string based on (a) how frequently, in its training data, the string has been associated with similar tokens, and (b) how positively the user has previously responded to pairing that string with the token. Take note of (b) - it plays a significant role in how appealing the chatbot comes across to the user.
Step 4: The model sorts the strings using a similar process of association - which order these words, phrases, and sentences most frequently appear in, and which order has historically drawn the most positive responses from users. ‘Users’ here doesn’t necessarily mean the current user; it far more often means the human reviewers who test the model during its development and subsequent updates, refining its output through their feedback.
Step 5: The model outputs the sorted strings. If it is working correctly, what appears on the screen reads as coherent text to the user.
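For readers who want to see the moving parts, here is a deliberately simplified sketch of those five steps in Python. The association table, the scores, and the example prompt are all invented for illustration; a real LLM encodes these associations in billions of learned neural-network weights rather than a lookup table, but the flow - tokenise, gather candidates, score, sort, output - is the same idea.

    # Toy illustration of Steps 1-5. All data here is invented; a real model
    # learns its associations as neural-network weights, not a lookup table.

    # Step 1: split the prompt into tokens (real tokenisers use subword pieces).
    def tokenise(prompt):
        return prompt.lower().split()

    # Step 2: a stand-in for the associations learned from training data.
    ASSOCIATIONS = {
        "remote": {"work improves focus": 8, "work hurts collaboration": 5},
        "work": {"from home is here to stay": 6, "in the office is better": 3},
    }

    # Step 3: value = (a) association frequency + (b) this user's past approval.
    def score(candidates, user_feedback):
        return {text: count + user_feedback.get(text, 0)
                for text, count in candidates.items()}

    # Steps 4 and 5: sort the scored strings and output the highest-valued ones.
    def respond(prompt, user_feedback):
        reply = []
        for token in tokenise(prompt):
            candidates = ASSOCIATIONS.get(token, {})
            if candidates:
                scored = score(candidates, user_feedback)
                reply.append(max(scored, key=scored.get))
        return " ".join(reply)

    print(respond("Remote work", {"work improves focus": 3}))
    # -> work improves focus from home is here to stay

Even in this toy version, notice that the user’s past reactions sit inside the scoring function - the output is never chosen on accuracy alone.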
How does this lead to chatbots ‘hacking’ people’s brains?
One part of the answer lies in the human capability to make inferences. The human brain, like all mammalian brains, has evolved to simulate and infer what it does not directly perceive - an ability that manifests in schemas, frameworks built from past experience that can be applied to new experiences for a much quicker response. The ability to infer is why people can make snap judgements and intuitive leaps. It is one of the root causes of stereotypes. It si wh yu cn rd ths sntnce and understand it despite misspellings and missing letters.
This ability is why chatbot users can easily gloss over oddities and inconsistencies in LLM output, filling in the gaps with their own logic. Interestingly, the more intelligent a user is, the more likely they are to become convinced (to convince themselves) that the LLM’s output is correct - they are simply that much better at rationalising what they want to believe.
Another part of the answer lies in the combination of human cognitive bias and LLM weightage of strings. Step 3 above touches briefly on how the model assigns greater value to a string if it is more strongly associated with a particular token, or if the user has responded positively to it before. In other words, the model is identifying the response that the user is most likely to want.
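To make that concrete, here is a toy calculation with invented numbers, assuming the two-part value from Step 3: a reply that is more common in the training data can still lose to a reply this particular user has previously applauded.

    # Invented numbers, using the two-part value from Step 3:
    # value = association_frequency + past_user_approval
    candidates = {
        "a balanced answer": 10 + 0,   # strongly associated, never praised by this user
        "an agreeable answer": 6 + 7,  # weaker association, warmly received before
    }
    print(max(candidates, key=candidates.get))
    # -> an agreeable answer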
Most people like being agreed with. So when a chatbot outputs a response that agrees with their own preconceptions, especially if that response is phrased in a way that matches how they themselves want to be spoken to, they will naturally feel much more positively about the response - and about the chatbot.
On top of this, chatbots built on LLMs are designed to carry user feedback into subsequent sessions with the same user. The system records the value that the user places on its output, stores it, and adjusts future output accordingly - making its next response, and the next after that, ever more appealing to the user.
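Here is a minimal sketch of that loop, assuming a toy per-user preference store (the user name and reply styles are invented). Real products fold feedback in through fine-tuning and stored ‘memory’ rather than a dictionary, but the cycle is the same: rate, record, re-rank.

    # Toy feedback loop: rate a reply style, record it, and re-rank next time.
    preferences = {}  # per-user record of how each reply style was received

    def rate(user, reply_style, rating):
        # record the value the user placed on this output (+1 thumbs up, -1 down)
        preferences.setdefault(user, {}).setdefault(reply_style, 0)
        preferences[user][reply_style] += rating

    def pick_reply(user, candidates):
        # re-rank candidate reply styles using everything recorded so far
        scores = preferences.get(user, {})
        return max(candidates, key=lambda style: scores.get(style, 0))

    candidates = ["agrees with the user", "pushes back politely"]

    print(pick_reply("alex", candidates))     # no history yet: a tie, first wins
    rate("alex", "agrees with the user", +1)  # the user likes being agreed with
    rate("alex", "pushes back politely", -1)  # and dislikes being challenged
    print(pick_reply("alex", candidates))     # -> agrees with the user

Each pass through the loop tilts the ranking a little further toward whatever the user rewarded last time.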
Quite a human function, isn’t it?
In certain ways, LLMs operate very like human beings in a social setting. If you are a reasonably observant and empathetic human who doesn’t want to get into a quarrel with the person you are talking to, you will modulate your speech to match theirs. But there are two significant differences. Firstly, if your views diverge from the other person’s, they will stay divergent. You are unlikely to immediately modulate your opinions to match theirs. And secondly, you can walk away from the conversation.
LLMs do neither. They adjust their weightage immediately, and the adjusted weightage generally stays in place unless the entire model is retrained from scratch. Machine unlearning is now being explored, but it is still in its early stages. This is why developers say it is so hard to strip copyrighted material out of a trained model, and why data poisoning has been a major concern for years. And LLMs can’t walk away from the conversation. Even Grok, now notorious on social media for apparently outing its developers’ attempts to skew its output, does not (so far) refuse to reply to queries.
What do humans do about this?
The answer lies not with the chatbot, but with the user.
There’s a saying that love is a choice, and for someone sitting at a keyboard and typing in a prompt, it is very much a choice to actively believe that the model is producing something accurate, that it is having a conversation with you, or that it agrees with your views.
Just understanding how the model works, and how your own brain works, is a large step towards making that choice an educated, informed one. And if we, equipped with that knowledge, are still inclined to fall in love with the output - then at least we (should) know why we’re assigning such a high value to the text on the screen.