Forum: War Ensemble BBS

The LLMentalist Effect (AI & psychic's con)

From Ben Collver@bencollver@tilde.pink to comp.misc on Fri Apr 12 00:02:12 2024

The LLMentalist Effect
======================
by Baldur Bjarnason, July 4th, 2023

how chat-based Large Language Models replicate the mechanisms of a
psychic's con

For the past year or so I've been spending most of my time researching
the use of language and diffusion models in software businesses.

One of the issues during this research--one that has perplexed
me--has been that many people are convinced that language models, or
specifically chat-based language models, are intelligent.

But there isn't any mechanism inherent in large language models (LLMs)
that would seem to enable this and, if real, it would be completely unexplained.

LLMs are not brains and do not meaningfully share any of the mechanisms
that animals or people use to reason or think.

LLMs are a mathematical model of language tokens. You give an LLM text,
and it will give you a mathematically plausible response to that text.

There is no reason to believe that it thinks or reasons--indeed, every
AI researcher and vendor to date has repeatedly emphasised that these
models don't think.

There are two possible explanations for this effect:

1. The tech industry has accidentally invented the initial stages of a
completely new kind of mind, based on completely unknown
principles, using completely unknown processes that have no
parallel in the biological world.

2. The intelligence illusion is in the mind of the user and not in
the LLM itself.

Many AI critics, including myself, are firmly in the second camp. It's
why I titled my book on the risks of generative "AI" The Intelligence
Illusion.

<https://illusion.baldurbjarnason.com/>

For the past couple of months, I've been working on an idea that I think explains the mechanism of this intelligence illusion.

I now believe that there is even less intelligence and reasoning in
these LLMs than I thought before.

Many of the proposed use cases now look like borderline fraudulent pseudoscience to me.

The rise of the mechanical psychic
==================================
The intelligence illusion seems to be based on the same mechanism as
that of a psychic's con, often called cold reading. It looks like an
accidental automation of the same basic tactic.

By using validation statements, such as sentences that use the Forer
effect, the chatbot and the psychic both give the impression of being
able to make extremely specific answers, but those answers are in fact statistically generic.

<https://www.skepdic.com/forer.html>

The psychic uses these statements to give the impression of being able
to read minds and hear the secrets of the dead.

The chatbot gives the impression of an intelligence that is specifically engaging with you and your work, but that impression is nothing more
than a statistical trick.

This idea was first planted in my head when I was going over some of the statements people have been making about the reasoning of these "AI."

I first thought that these were just classic cases of tech bubble
enthusiasm, but no, "AI" has taken in a different crowd, and the
believers in the "AI" bubble sound very different from those of prior
bubbles.

* "This is real. It's a bit worrying, but it's real."

* "There really is something there. Not sure what to think of it, but
I've experienced it myself."

* "You need to keep your mind open to the possibilities. Once you do,
you'll see that there's something to it."

That's when I remembered, triggered by a blog post by Terence Eden on
the prevalence of Forer statements in chatbot replies: I have heard
this before.

<https://shkspr.mobi/blog/2023/02/how-much-of-ais-recent-success-is-due-to-the-forer-effect/>

This specific blend of awe, disbelief, and dread sounds like the
words of a victim of a mentalist scam artist--a psychic.

The psychic's con is a tried and true method for scamming people that
has been honed through the ages.

What I describe below is one variation. There are many variations, but
the core mechanism remains the same.

The Psychic's Con
=================
The audience is represented by a collection of characters. The
disinterested are periods. The interested are upper case O's.

_ _ _
. O . O . . . . O .
- - _ _ - _
. . . . . O O . . O
- - -

1. The Audience Selects Itself

Most people aren't interested in psychics or the like, so the initial
audience pool is already generally more open-minded and less critical
than the population in general. The chart now has different letters
to indicate that they are not of a single demographic.

R B B G G B

Y Y B G R Y

2. The Scene is Set

The initial audience is prepared. Lights are dimmed. The psychic is
hyped up. Staff research the audience on social media or through
conversation. The audience's demographics are noted. All the letters representing demographics not chosen are lower case.

_ _
r b b G G b
= -
y y b G r y
-

3. Narrowing Down the Demographic

The psychic gauges the information they have on the audience,
gestures towards a row or cluster, and makes a statement that sounds
specific but is in fact statistically likely for the demographic.
Usually at least one person reacts. If not, the psychic will imply
that the secret is too embarrassing for the "real" person to come
forward, reminds people that they're available for private readings,
and tries again. An at-symbol representing the psychic has an arrow
pointing to the letter that represents the mark.

- -
@ -> G G
= -
G
-

4. The Mark is Tested

The reaction indicates that the mark believes they were "read". This
leads to a burst of questions that, again, sound very specific but
are actually statistically generic. If the mark doesn't respond, the
psychic declares the initial read a success and tries again. The
mark's letter and the psychic's symbol have arrows pointing to each
other representing a loop.

->
@ G
<-

5. The Subjective Validation Loop

The con begins in earnest. The psychic asks a series of questions
that all sound very specific to the mark but are in reality just
statistically probable guesses, based on their demographics and prior
answers, phrased in a specific, highly confident way. The mark's
letter has exclamation marks.

!!!
G

6. "Wow! That psychic is the real thing!"

The psychic ends the conversation and the mark is left with the sense
that the psychic has uncanny powers. But the psychic isn't the real
thing. It's all a con.

1. Audience selection
=====================
Seers, tarot card readers, psychics, mind readers aren't all con
artists. Sometimes the "psychic" is open about it all just being
entertainment and isn't pretending to be able to contact spirits or
read minds. Some psychics do not have a profit motive at all, and
without the grift it doesn't seem fair to call somebody a con artist.

But many of them are con artists deliberately fooling people, and they
all operate using the same basic mechanisms that begin well before the
reading proper.

The audience is usually only composed of those already pre-disposed to
believe in psychic phenomena and those they have managed to drag with
them. Hardcore sceptics will almost always be in a very small minority
of the audience, which both makes them easy to manage and provides
social pressure on them to tone down their scepticism.

Those who attend are primed to believe and are already familiar with
the mythology surrounding psychics. All of which helps the psychic
manage expectations and frame the performance.

2. Setting the scene
====================
Usually the audience is reminded of the ground rules for how psychic
readings "work" at the start of the performance. They are helped by the popularisation of these rules by media, cinema, and TV.

Everybody now "knows" that:

* Readings usually begin murky and unclear.
* They then become clearer as the "connection" to the "spirit world"
gets stronger.
* Errors are expected. The "spirits" are often vague or hard to hear.
* Non-believers can weaken or even disrupt the connection.

Psychics also habitually research their audience, by mapping out their demographics, looking them up on social media, or even with informal
interviews performed by staff mingling with attendees before the
performance begins.

When the lights dim, the psychic should have a clear idea of which
members of the audience will make for a good mark.

3. Narrowing down
=================
The mark usually chooses themselves. The psychic makes a statement and
points towards a row, quickly altering their gesture based on somebody
responding visibly to the statement. This makes it look like they
pointed at the mark right from the beginning.

That way, the mark is primed from the start to believe the psychic.
They're off-guard. Usually a bit surprised and totally unprepared for
the quick burst of questions the psychic offers next. If those questions
land and draw the mark in, they are followed by the actual reading.
Otherwise, they move on and try again.

4. Testing the mark--Cold reading using subjective validation
=============================================================
The con--cold reading--hinges on a quirk of human psychology: if we
personally relate to a statement, we will generally consider it to be
accurate.

<https://en.wikipedia.org/wiki/Cold_reading>

This unfortunate side effect of how our mind functions is called
subjective validation.

<https://en.wikipedia.org/wiki/Subjective_validation>

Subjective validation, sometimes called personal validation effect,
is a cognitive bias by which people will consider a statement or
another piece of information to be correct if it has any personal
meaning or significance to them. People whose opinion is affected
by subjective validation will perceive two unrelated events (i.e.,
a coincidence) to be related because their personal beliefs demand
that they be related.

As a consequence, many people will interpret even the most generic
statement as being specifically about them if they can relate to what
was said.

The more eager they are to find meaning in the statement, the stronger
the effect.

The more they believe in the speaker's ability to make accurate
statements, the stronger the effect.

The basic mechanism of the psychic's con is built on the mark being
willing and able to relate what was said to themselves, even if it's unintentional.

5. The subjective validation loop using validation statements
=============================================================
The psychic taps into this cognitive bias by making a series of
statements that are tailored to be personally relatable--sound specific
to you--while actually being statistically generic.

These statements come in many types. I use "validation statements" here
as an umbrella term for all these various tactics.

Some common examples:

* Forer or Barnum statements are probably the most famous kind of
statement that plays into the subjective validation effect. Many of
these statements are inherently meaningless but are nonetheless
felt to be accurate by listeners. Most people will consider "you
tend to be hard on yourself" to be an accurate description of
themselves, for example.
<https://en.wikipedia.org/wiki/Barnum_effect>

* Vanishing negative is where a question is rephrased to include a
negative such as "not" or "don't". If the psychic asks "you don't
play the piano?" then they will be able to reframe the question as
accurate after the fact, no matter what the answer is. If you
answer negative: "didn't think so". Positive: "that's what I
thought."

* Rainbow ruse where the psychic associates the mark with both a
trait and its opposite. "You're a very calm person, but if provoked
you can get very angry."
<https://en.wikipedia.org/wiki/Cold_reading#The_rainbow_ruse>

* Statistical guesses.
Statements like "you have, or used to have, a scar on your left
leg or knee" apply to almost everybody. With enough knowledge of
common statistics, the psychic can make general statements that
sound incredibly specific to the mark.

* Demographic guesses.
Similar to statistical guesses, these are statements that are
common to a demographic but will sound very specific to the mark
that's listening.

* Unverifiable predictions.
Predictions like "somebody bears a strong ill will towards you but
they are unlikely to act on it" are impossible to verify, but will
sound true to many people.

* Shotgunning is one of the more common tactics where the psychic will
fire off a series of statements. The mark will find one of the
statements to be accurate and, due to how our minds work, will come
away only remembering the correct statement.
<https://en.wikipedia.org/wiki/Cold_reading#Shotgunning>
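
The arithmetic behind shotgunning and statistical guesses can be
sketched in a few lines of Python. This is an illustration only and is
not part of the original article: the function name `chance_of_a_hit`
and the per-statement hit rate used below are made-up assumptions, and
the statements are treated as independent, which real readings are not.

```python
def chance_of_a_hit(p_single: float, n_statements: int) -> float:
    """Probability that at least one of n independent statements lands.

    p_single is an assumed (hypothetical) chance that any one generic
    statement feels accurate to the mark.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if n_statements < 0:
        raise ValueError("n_statements must not be negative")
    # Complement rule: 1 minus the chance that every statement misses.
    return 1.0 - (1.0 - p_single) ** n_statements


if __name__ == "__main__":
    # Hypothetical 20% hit rate per statement.
    for n in (1, 5, 10, 20):
        print(n, round(chance_of_a_hit(0.2, n), 3))
```

Under those assumptions, a statement that lands only one time in five
still gives the psychic roughly a two-in-three chance of at least one
hit after five statements, and close to certainty after twenty.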

An important part of this process is the tone and bearing of the
psychic. They need to be confident, be quick in dismissing errors and
moving on when they make mistakes, and they need to be quick to read
people's expressions and body language and adjust their responses to
match.

6. The con is completed
=======================
At the end of the process, the mark is likely to remember that the
reading was eerily correct--that the psychic had an almost supernatural accuracy--which primes them to become even more receptive the next time
they attend.

This is where the con often becomes insidious: the effect becomes
stronger the more cooperative the mark is, and they often become more cooperative over time.

What's more, susceptibility has nothing to do with intelligence.

Somebody raised to believe they have a high IQ is more likely to fall
for this than somebody raised to think less of their own intellectual
capabilities. Subjective validation is a quirk of the human mind.
We all fall for it. But if you think you're unlikely to be fooled,
you will be tempted instead to apply your intelligence to "figure
out" how it happened. This means you can end up using considerable
creativity and intelligence to help the psychic fool you by coming up
with rationalisations for their "ability". And because you think you
can't be fooled, you also bring your intelligence to bear to defend the
psychic's claim of their powers. Smart people (or, those who think of
themselves as smart) can become the biggest, most lucrative marks.

Whereas the sceptic who thinks less of themselves is more likely to just
go:

"That's a neat trick. I don't know how you pulled it off. Must be very
clever."

And just move on.

Many psychics fool themselves
=============================
It isn't unusual for psychics to develop a practice of cold reading
subconsciously. The psychics themselves might not even be
aware of their own tactics.

<https://en.wikipedia.org/wiki/Cold_reading#Subconscious_cold_reading>

As Denis Dutton describes:

As a postgraduate student in pursuit of a scientific career, he
became intrigued with astrology. Though during this period he had
nagging doubts about the physical basis of astrology, he was
encouraged to continue with it by his many satisfied clients, who
invariably found his readings "amazingly accurate" in describing
their personal situations and problems. Not until he had one day
obtained such a gratifying reaction to a horoscope which, he
realized later, he had cast completely incorrectly, did he begin
slowly to understand the real nature of his activity: his great
success as an astrologer had nothing whatsoever to do with the
validity of astrology as a science. He had become, in fact, a
proficient cold reader, one who sincerely believed in the power of
astrology under the constant reinforcement of his clients. He was
fooling them, of course, but only after falling for the illusion
himself.

<http://www.denisdutton.com/cold_reading.htm>

There are many examples of this easily found once you start doing the
research. The mechanism is simple enough and already baked into people's preconceptions of how readings work so many psychics accidentally
develop the knack for it, meaning that they're not just conning the
person being read, they are also conning themselves.

This point will become important later.

The LLMentalist Effect
======================

_ _ _
. O . O . . . . O .
- - _ _ - _
. . . . . O O . . O
- - -

1. The Audience Selects Itself

People sceptical about "AI" chatbots are less likely to use them.
Those who actively disbelieve the possibility of chatbot
"intelligence" won't get pulled in by the bot. The most active
audience will be early adopters, tech enthusiasts, and genuine
believers in AGI who will all generally be less critical and more
open-minded. The characters now have different letters to indicate
that they are not of a single demographic, all overlaid by the word
'HYPE' and arrows indicating a prevailing atmosphere of hype.

^ ^ ^ ^ ^ ^
R|B|B|G|G|B|
| | | | | |
Y| HYPE Y|

2. The Scene is Set

Users are primed by the hype surrounding the technology. The chat
environment sets the mood and expectations. Warnings about it being
"early days" and "hallucinations" both anthropomorphise the bot and
provide ready-made excuses for when one of its constant failures is
noticed. All the letters representing demographics not chosen are
lower case.

_ _ _
r B b g G B
- _ _ - -
y y B G r y
- -

3. The Prompt Establishes the Context

Each user gives the chatbot a prompt and it answers. Many will either
accept the answer as given or repeat variations on the initial prompt
to get the desired result. They move on without falling for the
effect. But some users engage in conversation and get drawn in.
Various letters representing marks are connected via loop arrows with
at symbols representing the chatbot. The rest are lower case.

B| b G| g B|
^ v ^ v ^ v
|@ @ |@ @ |@

4. The Marks Test Themselves

The chatbot's answers sound extremely specific to the current context
but are in fact statistically generic. The mathematical model behind
the chatbot delivers a statistically plausible response to the
question. The marks that find this convincing get pulled in. The
mark's letter and the chatbot's symbol have arrows pointing to each
other representing a loop.

->
@ G
<-

5. The Subjective Validation Loop

The mark asks a series of questions and all of the replies sound like
reasoned answers specific to the context but are in reality just
statistically probable guesses. The more the mark engages, the more
convinced they are of the chatbot's intelligence. The mark's letter
has exclamation marks.

!!!
G

6. "Wow! This chatbot thinks! It has sparks of general intelligence!"

The mark is left with the sense that the chatbot is uncannily close to
being self-aware and that it is definitely capable of reasoning. But
it's nothing more than a statistical and psychological effect.

1. The audience selects itself
==============================
If you aren't interested in "AI", you aren't going to use an "AI"
chatbot, and if you try one, you're less likely to return.

This means that many of the avid users of these chatbots are
self-selected to be enthusiastic and open-minded about the field of AI
and the notion of Artificial General Intelligence (AGI)--that these technologies might lead to self-aware and self-improving reasoning
systems.

Those who are genuine enthusiasts about AGI--that this field is about
to invent a new kind of mind--are likely to be substantially more
enthusiastic about using these chatbots than the rest.

This parallels the audience selection for the psychic's con. Those who
believe in an afterlife and that it can be contacted by the living are substantially more likely to attend a psychic's reading than others.

2. Setting the stage
====================
Our current environment of relentless hype sets the stage and builds
up an expectation for at least glimmers of genuine intelligence. For
all the warnings vendors make about these systems not being general intelligences, those statements are always followed by either an implied
or an actual "yet". The hype strongly implies that these are "almost" intelligences and that you should be able to perceive "sparks" of
intelligence in them.

Those who believe are primed for subjective validation.

The warnings also play a role in setting the stage. "It's early days"
means that when the statistically generic nature of the response is
spotted, it's easily dismissed as an "error". Anthropomorphising
concepts such as using "hallucination" as a term help dismiss the fact
that statistical responses are completely disconnected from meaning
and facts. The hype and mythology of AI prime the audience to think
of these systems as persons to be understood and engaged with, all
but guaranteeing subjective validation.

3. The prompt establishes the context
=====================================
The initial prompt interaction is the first filter. Most will just take
the first answer and leave, or at most will repeat variations of their
prompt until they get the result they wanted. These interactions are
purely mechanical. The end-user is treating the chatbot merely as a
generative widget, so they never get pulled into the LLMentalist effect.

Some of the end-users, usually those who are more enthusiastic
about the prospect of "AI", begin to engage and get pulled into
"conversation" with a mathematical language model.

4. The mark tests themselves--subjective validation kicks in
============================================================
That conversation is the primary filter. Those who want to believe will
see the responses to their prompt as being both specifically about them
and intelligent. They are primed to see the chatbot as a person that
is reading their texts and thoughtfully responding to them. But that
isn't how language models work. LLMs model the distribution of words
and phrases in a language as tokens. Their responses are nothing more
than a statistically likely continuation of the prompt.

You give it text. It gives you a response that matches responses that
texts like yours commonly get in its training data set.
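
The idea of a "statistically likely continuation" can be illustrated
with a toy model. The sketch below is not how a real LLM is built:
real models use neural networks over subword tokens and long contexts,
whereas this stand-in only counts which word follows which in a tiny
text. The names `train_bigrams` and `continue_text` are hypothetical
and chosen purely for illustration.

```python
import random
from collections import Counter, defaultdict


def train_bigrams(corpus: str) -> dict:
    """Count which token follows which in the (toy) training text."""
    tokens = corpus.lower().split()
    follows = defaultdict(Counter)
    for current, nxt in zip(tokens, tokens[1:]):
        follows[current][nxt] += 1
    return follows


def continue_text(follows, prompt: str, length: int = 5, seed: int = 0) -> str:
    """Extend the prompt, sampling each next token by observed frequency."""
    rng = random.Random(seed)
    tokens = prompt.lower().split()
    if not tokens:
        return ""
    for _ in range(length):
        options = follows.get(tokens[-1])
        if not options:
            # The last token was never seen with a follower: stop here.
            break
        words = list(options)
        weights = [options[w] for w in words]
        tokens.append(rng.choices(words, weights=weights, k=1)[0])
    return " ".join(tokens)


if __name__ == "__main__":
    toy_corpus = (
        "you tend to be hard on yourself . "
        "you tend to be calm . "
        "you tend to worry ."
    )
    model = train_bigrams(toy_corpus)
    print(continue_text(model, "you tend", length=4))
```

Nothing in this toy inspects what the prompt means. It only looks up
what usually came next in its training text, which is the property the
paragraph above describes, scaled down drastically.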

Already, this is working along the same fundamental principle as
the psychic's con: the LLM isn't "reading" your text any more than
the psychic is reading your mind. They are giving you statistically
plausible responses based on what you say. You're the one finding ways
to validate those responses as being specific to you as the subject of
the conversation.

Because of how large the training data set is, the responses from
the chatbot will look extremely convincing and specific, even though
they are statistically generic. Once you've trained on most of the
past twenty years of the web, large collections of stolen ebooks, all
of Reddit, most of social media, and a substantial amount of custom interactions by low-wage workers, the model will have a response for
almost everything you can think of, or can use a variation of something
it's already seen.

These initial interactions can be quite compelling, especially if you're
a believer in "AI", but it is in the longer and repeated conversations
that the effect really begins to kick in.

5. The subjective validation loop--RLHF enters the picture
==========================================================
It's important to remember at this stage how Reinforcement Learning
from Human Feedback (RLHF) works.

<https://huggingface.co/blog/rlhf>

This is the method that vendors use to turn a raw language model into a
chatbot that can hold a conversation.

RLHF doesn't let the vendor make specific corrections to an LLM's
output. The method involves using human feedback to rank a variety of
texts generated by the model, usually following some other form of
fine-tuning. The ranked texts are in turn used to train a separate
reward model. It's this model that is responsible for the actual
Reinforcement Learning of the LLM. The reward model, coupled with
fine-tuning the LLM on collections of chats, is what turns the
borderline unhinged conversations of a regular model into the fluent
experience you see in systems such as ChatGPT.
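
The step from ranked texts to a reward model can be sketched as
follows. This is a simplified illustration under stated assumptions,
not any vendor's actual pipeline: the article does not say which loss
is used, so the pairwise logistic loss below is an assumption (it is
one commonly described choice for learning from comparisons), and the
names `ranking_to_pairs`, `pairwise_loss`, and `mean_ranking_loss` are
hypothetical.

```python
import math
from itertools import combinations


def ranking_to_pairs(ranked_texts):
    """Turn one human ranking (best first) into (preferred, rejected) pairs."""
    # combinations() keeps input order, so the better text comes first.
    return list(combinations(ranked_texts, 2))


def pairwise_loss(reward_preferred: float, reward_rejected: float) -> float:
    """-log(sigmoid(r_preferred - r_rejected)), computed stably.

    The loss is small when the reward model scores the preferred text
    higher, and large when it disagrees with the human ranking.
    """
    diff = reward_preferred - reward_rejected
    if diff >= 0:
        return math.log1p(math.exp(-diff))
    return -diff + math.log1p(math.exp(diff))


def mean_ranking_loss(ranked_texts, reward_fn) -> float:
    """Average pairwise loss of a reward function over one ranking."""
    pairs = ranking_to_pairs(ranked_texts)
    if not pairs:
        return 0.0
    total = sum(pairwise_loss(reward_fn(a), reward_fn(b)) for a, b in pairs)
    return total / len(pairs)


if __name__ == "__main__":
    # Made-up scores standing in for a reward model's output.
    scores = {"reply A": 2.0, "reply B": 0.5, "reply C": -1.0}
    ranking = ["reply A", "reply B", "reply C"]  # human order, best first
    print(mean_ranking_loss(ranking, scores.get))
```

Note what the training signal contains: only the relative order of
whole texts. Nothing in it says which sentence in a lower-ranked reply
was wrong or why.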

Because the feedback is based on rankings, it can't easily be based on
specific issues. If a model makes a false statement in a conversation,
that conversation gets a lower rank.

This lack of concrete specificity means that RLHF models in general
are likely to reward responses that sound accurate. As the
reward model is likely just another language model, it can't reward
based on facts or anything specific, so it can only reward output
that has a tone, style, and structure that's commonly associated with statements that have been rated as accurate.

Even the ratings themselves are suspect. Most, if not all, of the
workers who provide this feedback to AI vendors are low-paid workers
who are unlikely to have specialised knowledge relevant to the topic
they're rating, and even if they do, they are unlikely to have the time
to fact-check everything.

That means they are going to be ranking the conversations almost
entirely based on tone and sentence structure.

This is why I think that RLHF has effectively become a reward system
that specifically optimises language models for generating validation statements: Forer statements, shotgunning, vanishing negatives, and
statistical guesses.

In trying to make the LLM sound more human, more confident, and more
engaging, but without being able to edit specific details in its output,
AI researchers seem to have created a mechanical mentalist.

Instead of pretending to read minds through statistically plausible
validation statements, it pretends to read and understand your text
through statistically plausible validation statements.

The validation loop can continue for a while, with the mark constantly
doing the work of convincing themselves of the language model's
intelligence. Done long enough, it becomes a form of reinforcement
learning for the mark.

6. The marks become cheerleaders
================================
The most enthusiastic believers in an imminent AI revolution are
starting to sound very similar to long-time believers in psychics and mind-reading.

They come up with increasingly convoluted ideas and models to explain
why the impossible is possible. They become more and more dismissive of
fields of science and research that challenge their world view. Their
own statements become tinged with awe and dread.

And they keep evangelising. This is real!

Often followed by: This is dangerous!

Remember, the effect becomes more powerful when the mark is both
intelligent and wants to believe. Subjective validation is based on how
our minds work, in general, and is unaffected by your reported IQ.

If anything, your intelligence will just improve your ability to
rationalise your subjective validation and make the effect stronger.
When it's coupled with a genuine desire to believe in the con--that we
are on the verge of discovering Artificial General Intelligence--the
effect should be both irresistible and powerful once it takes hold.

This is why you can't rely on user reports to discover these issues.
People who believe in psychics will generally have only positive things
to say about a psychic, even as they're being bilked. People who believe
we're on the verge of building an AGI will only have positive things to
say about chatbots that support that belief.

It's easy to fall for this
==========================
Falling for this statistical illusion is easy. It has nothing to do with
your intelligence or even your gullibility. It's your brain working
against you. Most of the time conversations are collaborative and
personal, so your mind is optimised for finding meaning in what is said
under those circumstances. If you also want to believe, whether it's in psychics or in AGI, your mind will helpfully find reasons to believe in
the conversation you're having.

Once you're so deep into it that you've done a press tour and committed yourself as a public figure to this idea, dislodging the belief that we
now have a proto-AGI becomes impossible. Much like a scientist publicly
stating that they believe in a particular psychic, their self-image
becomes intertwined with their belief in that psychic. Any dismissal of
the phenomenon will feel to them like a personal attack.

The psychic's con is a mechanism that has been extraordinarily
successful at fooling people over the years. It works.

The best defence is to respond the same way as you would to a convincing psychic's reading: "That's a neat trick, I wonder how they pulled it
off?"

Well, now you know.

Once you're aware of the fallibility of how your mind works, you should
have an easier time spotting when that fallibility is being exploited, intentionally or not.

That brings us to an important question.

Is this intentional?
====================
Given that there are billions of dollars at stake in the tech industry,
it would be tempting to assume that the statistical illusion of
intelligence was intentionally created by people in the tech industry.

I personally think that's extraordinarily unlikely.

A popular response to various government conspiracy theories is that
government institutions just aren't that good at keeping secrets.

Well, the tech industry just isn't that good at software. This illusion
is, honestly, too clever to have been created intentionally by those
making it.

The field of AI research has a reputation for disregarding the value of
other fields, so I'm certain that this reimplementation of a psychic's
con is entirely accidental. It's likely that, being unaware of much of
the research in psychology on cognitive biases or how a psychic's con
works, they stumbled into a mechanism and made chatbots that fooled many
of the chatbot makers themselves.

Remember what I wrote above about psychics frequently having conned
themselves, that many of them aren't even aware of their own scam?

The same applies here. I think this is an industry that didn't
understand what it was doing and, now, doesn't understand what it did.

That's why so many people in tech are completely and utterly convinced
that they have created the first spark of true Artificial General
Intelligence.

This new era of tech seems to be built on superstition and pseudoscience
========================================================================
Once I started to research the possibility that LLM interactions were a variation on the psychic's con, I began to see parallels everywhere in
the field of "AI".

* Hooking a language model up to an MRI and claiming that it can read
minds.

* Claiming to be able to discern criminality based on facial
expressions and gait.

* Proposing magical solutions to health problems.

* Literal predictions of the future.

* Claiming to be able to discern the honesty of potential employees.

All of these are proposed applications of "AI" systems, but they are
also all common psychic scams. Mind reading, police assistance, faith
healing, prophecy, and even psychic employee vetting are all right out
of the mentalist playbook.

Even though I have no doubts that these efforts are sincere, it's
becoming more and more obvious that the tech industry has given itself wholesale to superstition and pseudoscience. They keep ignoring the
warnings coming from other fields and the concerns from critics in their
own camp.

Large Language Models don't have the functionality or features to make
up for this wave of superstition.

* "Hallucinations" are a pervasive flaw that's baked into how LLMs
work.
<https://needtoknow.fyi/card/hallucinations/>

* Summarisations are error-prone and prone to generalising about the
text being summarised.
<https://www.baldurbjarnason.com/2023/ai-summaries-unreliable/>

* Their "reasoning" is a statistical illusion.

* Their performance at natural language processing tasks is only
marginally better than that of smaller language models.
<http://opensamizdat.com/posts/chatgpt_survey/>

* They tend to memorise and copy text without attribution.
<https://needtoknow.fyi/card/copyright/>

Taken together, these flaws make LLMs look less like an information
technology and more like a modern mechanisation of the psychic hotline.

Delegating your decision-making, ranking, assessment, strategising,
analysis, or any other form of reasoning to a chatbot becomes the
functional equivalent of phoning a psychic for advice.

Imagine Google or another major tech company trying to fix their search
engine by adding a psychic hotline to their front page. That's what
they're doing with Bard.

* "Our university students can't make heads nor tails of our website.
Let's add a psychic hotline!"

* "We need to improve our customer service portal. Let's add a
psychic hotline!"

* "We've added a psychic hotline button to your web browser! No, you
can't get rid of it. You're welcome!"

* "Can't understand a thing in our technical docs? Refer to our fancy
new psychic hotline!"

The AI bubble is going to be a tough one to weather.

More on "AI"
============
I've spent some time writing about the many flaws of language models and generative "AI".

* I've written about how language models are a backward-facing tool
in a novelty-seeking industry and why I think using language models
for programming is a bad idea.
<https://softwarecrisis.dev/letters/ai-code-quality/>
<https://softwarecrisis.dev/letters/ai-and-software-quality/>

* "AI" summaries are inherently unreliable.
<https://www.baldurbjarnason.com/2023/ai-summaries-unreliable/>

* Their tendency towards shortcuts makes them dangerous in healthcare.
<https://www.baldurbjarnason.com/2023/ai-in-healthcare/>

* Most of the research indicating a productivity benefit to "AI" is,
at best, flawed and, at worst, completely detached from the
reality of modern office work.
<https://www.baldurbjarnason.com/2023/ignore-most-ai-research/>
<https://www.baldurbjarnason.com/2023/ai-research-again/>

* AI vendors have a history of pseudoscience and snake oil.
<https://www.baldurbjarnason.com/2023/beware-of-ai-snake-oil/>

* Even if you do think that a language model's unsolvable tendency
towards "hallucinations" doesn't disqualify the technology from
replacing search engines, the many security issues that language
models suffer from should. The "write a prompt; get the output"
model is inherently insecure. These systems are also vulnerable to
a form of keyword manipulation exploit that's impossible to prevent.
<https://softwarecrisis.dev/letters/prompts-are-not-fit-for-purpose/>
<https://softwarecrisis.dev/letters/google-bard-seo/>

I've come to the conclusion that a language model is almost always the
wrong tool for the job.

***
I strongly advise against integrating an LLM or chatbot into your
product, website, or organisational processes.
***

If you do have to use generative AI, either because it's a mandate from
above your pay grade or some other requirement, I have written a book
that's specifically about the issues with using generative "AI" for
work:

The Intelligence Illusion: a practical guide to the business risks of Generative AI.
<https://illusion.baldurbjarnason.com/>

It's only $35 USD for EPUB and PDF, which is only 15% of the $240 USD
cost of twelve months of ChatGPT Plus.

But, again, I'd much rather you just avoid using a language model in
the first place and save both the cost of the ebook and the ChatGPT subscription.

References on the Psychic's Con
===============================
* Cold reading (Wikipedia)
<https://en.wikipedia.org/wiki/Cold_reading>
* How to Become Psychic and Cold Read People
<http://positivelybrainwashed.com/
how-to-become-psychic-and-cold-read-people/>
* Derren Brown Cold Reading revealed
<https://secrets-explained.com/derren-brown/cold-reading>
* Cold reading (RationalWiki)
<https://rationalwiki.org/wiki/Cold_reading>
* 7 Tricks Psychics Bullshit People With That Everyone Should Know
<https://www.thrillist.com/culture/7-tricks-psychics-and-mediums-use-
how-psychics-use-cold-reading-the-forer-effect>
* Should You Believe in Psychics? Psychology and logic join forces to
debunk psychics (Psychology Today)
<https://www.psychologytoday.com/us/blog/hot-thought/201904/
should-you-believe-in-psychics>
* Motivated reasoning (Wikipedia)
<https://en.wikipedia.org/wiki/Motivated_reasoning>
* Cold Reading: How I Made Others Believe I Had Psychic Powers
<https://medium.com/@chris.kirsch/cold-reading-how-i-
made-others-believe-i-had-psychic-powers-dc184879d264>
* Cold reading (The Skeptic's Dictionary)
<https://www.skepdic.com/coldread.html>
* Subjective validation (The Skeptic's Dictionary)
<https://www.skepdic.com/subjectivevalidation.html>
* Subjective validation (Wikipedia)
<https://en.wikipedia.org/wiki/Subjective_validation>
* Coincidences: Remarkable or Random?
<https://skepticalinquirer.org/1998/09/
coincidences-remarkable-or-random/>
* Psychic Experiences: Psychic Illusions
<https://www.susanblackmore.uk/articles/
psychic-experiences-psychic-illusions/>
* Guide to Cold Reading
<https://www.skeptics.com.au/resources/articles/
guide-to-cold-reading-ray-hyman/>
* The Cold Reading Technique
<http://www.denisdutton.com/cold_reading.htm>
* Forer effect (The Skeptic's Dictionary)
<https://www.skepdic.com/forer.html>
* Tricks of the Psychic Trade (Psychology Today)
<https://www.psychologytoday.com/us/blog/speaking-in-tongues/
201201/tricks-the-psychic-trade>
* Psychic Scams
<https://www.aarp.org/money/scams-fraud/info-2022/psychic.html>
* Ten Tricks of the Psychics I Bet You Didn't Know (You Won't Believe #6!)
<https://skepticalinquirer.org/exclusive/
ten-tricks-of-the-psychics-i-bet-you-didnrsquot-know/>

From: <https://softwarecrisis.dev/letters/llmentalist/>

From nospam@nospam@example.net to comp.misc on Fri Apr 12 10:28:20 2024

From Newsgroup: comp.misc

Nice article. Thank you. I'm not in the wooo-club so I enjoyed a skeptical take on AI since the common opinion is that god will soon visit us.

On Fri, 12 Apr 2024, Ben Collver wrote:

The LLMentalist Effect
======================
by Baldur Bjarnason, July 4th, 2023

how chat-based Large Language Models replicate the mechanisms of a
psychic's con

For the past year or so I've been spending most of my time researching
the use of language and diffusion models in software businesses.

One of the issues that came up during this research--one that has
perplexed me--is that many people are convinced that language models,
or specifically chat-based language models, are intelligent.

But there isn't any mechanism inherent in large language models (LLMs)
that would seem to enable this and, if real, it would be completely unexplained.

LLMs are not brains and do not meaningfully share any of the mechanisms
that animals or people use to reason or think.

LLMs are a mathematical model of language tokens. You give an LLM text,
and it will give you a mathematically plausible response to that text.

There is no reason to believe that it thinks or reasons--indeed, every
AI researcher and vendor to date has repeatedly emphasised that these
models don't think.

There are two possible explanations for this effect:

1. The tech industry has accidentally invented the initial stages of a
completely new kind of mind, based on completely unknown
principles, using completely unknown processes that have no
parallel in the biological world.

2. The intelligence illusion is in the mind of the user and not in
the LLM itself.

Many AI critics, including myself, are firmly in the second camp. It's
why I titled my book on the risks of generative "AI" The Intelligence Illusion.

<https://illusion.baldurbjarnason.com/>

For the past couple of months, I've been working on an idea that I think explains the mechanism of this intelligence illusion.

I now believe that there is even less intelligence and reasoning in
these LLMs than I thought before.

Many of the proposed use cases now look like borderline fraudulent pseudoscience to me.

The rise of the mechanical psychic
==================================
The intelligence illusion seems to be based on the same mechanism as
that of a psychic's con, often called cold reading. It looks like an accidental automation of the same basic tactic.

By using validation statements, such as sentences that use the Forer
effect, the chatbot and the psychic both give the impression of being
able to provide extremely specific answers, but those answers are in
fact statistically generic.

<https://www.skepdic.com/forer.html>

The psychic uses these statements to give the impression of being able
to read minds and hear the secrets of the dead.

The chatbot gives the impression of an intelligence that is specifically engaging with you and your work, but that impression is nothing more
than a statistical trick.

This idea was first planted in my head when I was going over some of
the statements people have been making about the reasoning of these
"AIs".

I first thought that these were just classic cases of tech bubble
enthusiasm, but no: "AI" has drawn a different crowd, and the
believers in the "AI" bubble sound very different from those of prior
bubbles.

* "This is real. It's a bit worrying, but it's real."

* "There really is something there. Not sure what to think of it, but
I've experienced it myself."

* "You need to keep your mind open to the possibilities. Once you do,
you'll see that there's something to it."

That's when I remembered, triggered by a blog post by Terence Eden on
the prevalence of Forer statements in chatbot replies: I have heard
this before.

<https://shkspr.mobi/blog/2023/02/
how-much-of-ais-recent-success-is-due-to-the-forer-effect/>

This specific blend of awe, disbelief, and dread sounds just like the
words of a victim of a mentalist scam artist--a psychic.

The psychic's con is a tried and true method for scamming people that
has been honed through the ages.

What I describe below is one variation. There are many variations, but
the core mechanism remains the same.

The Psychic's Con
=================
The audience is represented by a collection of characters. The
disinterested are periods. The interested are upper case O's.

_ _ _
. O . O . . . . O .
- - _ _ - _
. . . . . O O . . O
- - -

1. The Audience Selects Itself

Most people aren't interested in psychics or the like, so the initial
audience pool is already generally more open-minded and less critical
than the population in general. The chart now has different letters
to indicate that they are not of a single demographic.

R B B G G B

Y Y B G R Y

2. The Scene is Set

The initial audience is prepared. Lights are dimmed. The psychic is
hyped up. Staff research the audience on social media or through conversation. The audience's demographics are noted. All the letters representing demographics not chosen are lower case.

_ _
r b b G G b
= -
y y b G r y
-

3. Narrowing Down the Demographic

The psychic gauges the information they have on the audience,
gestures towards a row or cluster, and makes a statement that sounds
specific but is in fact statistically likely for the demographic.
Usually at least one person reacts. If not, the psychic will imply
that the secret is too embarrassing for the "real" person to come
forward, remind people that they're available for private readings,
and try again. An at-symbol representing the psychic has an arrow
pointing to the letter that represents the mark.

- -
@ -> G G
= -
G
-

4. The Mark is Tested

The reaction indicates that the mark believes they were "read". This
leads to a burst of questions that, again, sound very specific but
are actually statistically generic. If the mark doesn't respond, the
psychic declares the initial read a success and tries again. The
mark's letter and the psychic's symbol have arrows pointing to each
other representing a loop.

->
@ G
<-

5. The Subjective Validation Loop

The con begins in earnest. The psychic asks a series of questions
that all sound very specific to the mark but are in reality just statistically probable guesses, based on their demographics and prior answers, phrased in a specific, highly confident way. The mark's
letter has exclamation marks.

!!!
G

6. "Wow! That psychic is the real thing!"

The psychic ends the conversation and the mark is left with the sense
that the psychic has uncanny powers. But the psychic isn't the real
thing. It's all a con.

1. Audience selection
=====================
Seers, tarot card readers, psychics, and mind readers aren't all con
artists. Sometimes the "psychic" is open about it all just being
entertainment and isn't pretending to be able to contact spirits or
read minds. Some psychics do not have a profit motive at all, and
without the grift it doesn't seem fair to call somebody a con artist.

But many of them are con artists deliberately fooling people, and they
all operate using the same basic mechanisms that begin well before the reading proper.

The audience is usually composed only of those already predisposed to
believe in psychic phenomena and those they have managed to drag with
them. Hardcore sceptics will almost always be in a very small minority
of the audience, which both makes them easy to manage and provides
social pressure on them to tone down their scepticism.

Those who attend are primed to believe and are already familiar with
the mythology surrounding psychics, all of which helps the psychic
manage expectations and frame the performance.

2. Setting the scene
====================
Usually the audience is reminded of the ground rules for how psychic
readings "work" at the start of the performance. Psychics are helped
here by the popularisation of these rules in media, cinema, and TV.

Everybody now "knows" that:

* Readings usually begin murky and unclear.
* They then become clearer as the "connection" to the "spirit world"
gets stronger.
* Errors are expected. The "spirits" are often vague or hard to hear.
* Non-believers can weaken or even disrupt the connection.

Psychics also habitually research their audience by mapping out its
demographics, looking people up on social media, or even having staff
conduct informal interviews while mingling with attendees before the
performance begins.

When the lights dim, the psychic should have a clear idea of which
members of the audience will make for a good mark.

3. Narrowing down
=================
The mark usually chooses themselves. The psychic makes a statement and
points towards a row, quickly altering their gesture when somebody
responds visibly to the statement. This makes it look like they
pointed at the mark right from the beginning.

That way, the mark is primed from the start to believe the psychic.
They're off-guard, usually a bit surprised, and totally unprepared for
the quick burst of questions the psychic offers next. If those questions
land and draw the mark in, they are followed by the actual reading.
Otherwise, the psychic moves on and tries again.

4. Testing the mark--Cold reading using subjective validation
=============================================================
The con--cold reading--hinges on a quirk of human psychology: if we personally relate to a statement, we will generally consider it to be accurate.

<https://en.wikipedia.org/wiki/Cold_reading>

This unfortunate side effect of how our mind functions is called
subjective validation.

<https://en.wikipedia.org/wiki/Subjective_validation>

Subjective validation, sometimes called personal validation effect,
is a cognitive bias by which people will consider a statement or
another piece of information to be correct if it has any personal
meaning or significance to them. People whose opinion is affected
by subjective validation will perceive two unrelated events (i.e.,
a coincidence) to be related because their personal beliefs demand
that they be related.

As a consequence, many people will interpret even the most generic
statement as being specifically about them if they can relate to what
was said.

The more eager they are to find meaning in the statement, the stronger
the effect.

The more they believe in the speaker's ability to make accurate
statements, the stronger the effect.

The basic mechanism of the psychic's con is built on the mark being
willing and able to relate what was said to themselves, even if it's unintentional.

5. The subjective validation loop using validation statements
=============================================================
The psychic taps into this cognitive bias by making a series of
statements that are tailored to be personally relatable--sound specific
to you--while actually being statistically generic.

These statements come in many types. I use "validation statements" here
as an umbrella term for all these various tactics.

Some common examples:

* Forer or Barnum statements are probably the most famous kind of
statement that plays into the subjective validation effect. Many of
these statements are inherently meaningless but are nonetheless
felt to be accurate by listeners. Most people will consider "you
tend to be hard on yourself" to be an accurate description of
themselves, for example.
<https://en.wikipedia.org/wiki/Barnum_effect>

* Vanishing negative is where a question is rephrased to include a
negative such as "not" or "don't". If the psychic asks "you don't
play the piano?" then they will be able to reframe the question as
accurate after the fact, no matter what the answer is. If you
answer negative: "didn't think so". Positive: "that's what I
thought."

* Rainbow ruse is where the psychic associates the mark with both a
trait and its opposite. "You're a very calm person, but if provoked
you can get very angry."
<https://en.wikipedia.org/wiki/Cold_reading#The_rainbow_ruse>

* Statistical guesses.
Statements like "you have, or used to have, a scar on your left
leg or knee" apply to almost everybody. With enough knowledge of
common statistics, the psychic can make general statements that
sound incredibly specific to the mark.

* Demographic guesses.
Similar to statistical guesses, these are statements that are
common to a demographic but will sound very specific to the mark
that's listening.

* Unverifiable predictions.
Predictions like "somebody bears a strong ill will towards you but
they are unlikely to act on it" are impossible to verify, but will
sound true to many people.

* Shotgunning is one of the more common tactics, where the psychic will
fire off a series of statements. The mark will find one of the
statements to be accurate and, due to how our minds work, will come
away only remembering the correct statement.
<https://en.wikipedia.org/wiki/Cold_reading#Shotgunning>
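
The arithmetic behind shotgunning and statistical guesses can be made
concrete with a minimal Python sketch. The hit rates below are made-up
illustrative numbers, not measured data, and the statements are assumed
to land independently of each other, which a real reading need not
satisfy. The function name is invented for this sketch.

```python
from math import prod

def chance_of_at_least_one_hit(hit_rates):
    """Probability that at least one statement lands, assuming independence."""
    return 1 - prod(1 - p for p in hit_rates)

# Hypothetical per-statement hit rates, chosen only for illustration.
burst = [0.3, 0.3, 0.3, 0.3, 0.3]

print(round(chance_of_at_least_one_hit(burst[:1]), 2))  # a single guess
print(round(chance_of_at_least_one_hit(burst), 2))      # a burst of five
```

Under these assumptions a single guess lands 30% of the time, while a
burst of five lands at least once about 83% of the time. The mark only
needs to remember the one that hit.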

An important part of this process is the tone and bearing of the
psychic. They need to be confident, be quick in dismissing errors and
moving on when they make mistakes, and they need to be quick to read
people's expressions and body language and adjust their responses to
match.

6. The con is completed
=======================
At the end of the process, the mark is likely to remember that the
reading was eerily correct--that the psychic had an almost supernatural accuracy--which primes them to become even more receptive the next time
they attend.

This is where the con often becomes insidious: the effect becomes
stronger the more cooperative the mark is, and they often become more cooperative over time.

What's more, susceptibility has nothing to do with intelligence.

Somebody raised to believe they have a high IQ is more likely to fall
for this than somebody raised to think less of their own intellectual capabilities. Subjective validation is a quirk of the human mind.
We all fall for it. But if you think you're unlikely to be fooled,
you will be tempted instead to apply your intelligence to "figure
out" how it happened. This means you can end up using considerable
creativity and intelligence to help the psychic fool you by coming up
with rationalisations for their "ability". And because you think you
can't be fooled, you also bring your intelligence to bear to defend the psychic's claim of their powers. Smart people (or, those who think of themselves as smart) can become the biggest, most lucrative marks.

Whereas the sceptic who thinks less of themselves is more likely to just
go:

"That's a neat trick. I don't know how you pulled it off. Must be very clever."

And just move on.

Many psychics fool themselves
=============================
It isn't unusual for psychics to develop a practice of cold reading
subconsciously. The psychics themselves might not even be
aware of their own tactics.

<https://en.wikipedia.org/wiki/Cold_reading#Subconscious_cold_reading>

As Denis Dutton describes:

As a postgraduate student in pursuit of a scientific career, he
became intrigued with astrology. Though during this period he had
nagging doubts about the physical basis of astrology, he was
encouraged to continue with it by his many satisfied clients, who
invariably found his readings "amazingly accurate" in describing
their personal situations and problems. Not until he had one day
obtained such a gratifying reaction to a horoscope which, he
realized later, he had cast completely incorrectly, did he begin
slowly to understand the real nature of his activity: his great
success as an astrologer had nothing whatsoever to do with the
validity of astrology as a science. He had become, in fact, a
proficient cold reader, one who sincerely believed in the power of
astrology under the constant reinforcement of his clients. He was
fooling them, of course, but only after falling for the illusion
himself.

<http://www.denisdutton.com/cold_reading.htm>

Many examples of this are easy to find once you start doing the
research. The mechanism is simple enough, and already so baked into
people's preconceptions of how readings work, that many psychics
accidentally develop the knack for it, meaning that they're not just
conning the person being read, they are also conning themselves.

This point will become important later.

The LLMentalist Effect
======================

_ _ _
. O . O . . . . O .
- - _ _ - _
. . . . . O O . . O
- - -

1. The Audience Selects Itself

People sceptical about "AI" chatbots are less likely to use them.
Those who actively disbelieve the possibility of chatbot
"intelligence" won't get pulled in by the bot. The most active
audience will be early adopters, tech enthusiasts, and genuine
believers in AGI who will all generally be less critical and more open-minded. The characters now have different letters to indicate
that they are not of a single demographic, all overlaid by the word
'HYPE' and arrows indicating a prevailing atmosphere of hype.

^ ^ ^ ^ ^ ^
R|B|B|G|G|B|
| | | | | |
Y| HYPE Y|

2. The Scene is Set

Users are primed by the hype surrounding the technology. The chat
environment sets the mood and expectations. Warnings about it being
"early days" and "hallucinations" both anthropomorphise the bot and
provide ready-made excuses for when one of its constant failures is
noticed. All the letters representing demographics not chosen are
lower case.

_ _ _
r B b g G B
- _ _ - -
y y B G r y
- -

3. The Prompt Establishes the Context

Each user gives the chatbot a prompt and it answers. Many will either
accept the answer as given or repeat variations on the initial prompt
to get the desired result. They move on without falling for the
effect. But some users engage in conversation and get drawn in.
Various letters representing marks are connected via loop arrows with
at symbols representing the chatbot. The rest are lower case.

B| b G| g B|
^ v ^ v ^ v
|@ @ |@ @ |@

4. The Marks Test Themselves

The chatbot's answers sound extremely specific to the current context
but are in fact statistically generic. The mathematical model behind
the chatbot delivers a statistically plausible response to the
question. The marks that find this convincing get pulled in. The
mark's letter and the chatbot's symbol have arrows pointing to each
other representing a loop.

->
@ G
<-

5. The Subjective Validation Loop

The mark asks a series of questions and all of the replies sound like reasoned answers specific to the context but are in reality just statistically probable guesses. The more the mark engages, the more
convinced they are of the chatbot's intelligence. The mark's letter
has exclamation marks.

!!!
G

6. "Wow! This chatbot thinks! It has sparks of general intelligence!"

The mark is left with the sense that the chatbot is uncannily close to
being self-aware and that it is definitely capable of reasoning. But
it's nothing more than a statistical and psychological effect.

1. The audience selects itself
==============================
If you aren't interested in "AI", you aren't going to use an "AI"
chatbot, and if you try one, you're less likely to return.

This means that many of the avid users of these chatbots are
self-selected to be enthusiastic and open-minded about the field of AI
and the notion of Artificial General Intelligence (AGI)--that these technologies might lead to self-aware and self-improving reasoning
systems.

Those who genuinely believe in AGI--that this field is about to invent
a new kind of mind--are likely to be substantially more enthusiastic
about using these chatbots than the rest.

This parallels the audience selection for the psychic's con. Those who believe in an afterlife and that it can be contacted by the living are substantially more likely to attend a psychic's reading than others.

2. Setting the stage
====================
Our current environment of relentless hype sets the stage and builds
up an expectation for at least glimmers of genuine intelligence. For
all the warnings vendors make about these systems not being general intelligences, those statements are always followed by either an implied
or an actual "yet". The hype strongly implies that these are "almost" intelligences and that you should be able to perceive "sparks" of intelligence in them.

Those who believe are primed for subjective validation.

The warnings also play a role in setting the stage. "It's early days"
means that when the statistically generic nature of the response is
spotted, it's easily dismissed as an "error". Anthropomorphising terms
such as "hallucination" help dismiss the fact
that statistical responses are completely disconnected from meaning
and facts. The hype and mythology of AI prime the audience to think
of these systems as persons to be understood and engaged with, all
but guaranteeing subjective validation.

3. The prompt establishes the context
=====================================
The initial prompt interaction is the first filter. Most will just take
the first answer and leave, or at most will repeat variations of their
prompt until they get the result they wanted. These interactions are
purely mechanical. The end-user is treating the chatbot merely as a generative widget, so they never get pulled into the LLMentalist effect.

Some of the end-users, usually those who are more enthusiastic
about the prospect of "AI", begin to engage and get pulled into "conversation" with a mathematical language model.

4. The mark tests themselves--subjective validation kicks in
============================================================
That conversation is the primary filter. Those who want to believe will
see the responses to their prompt as being both specifically about them
and intelligent. They are primed to see the chatbot as a person that
is reading their texts and thoughtfully responding to them. But that
isn't how language models work. LLMs model the distribution of words and phrases in a language as tokens. Their responses are nothing more than a statistically likely continuation of the prompt.

You give it text. It gives you a response that matches responses that
texts like yours commonly get in its training data set.
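
As a rough illustration of what "statistically likely continuation"
means, here is a toy bigram model in Python. It is a deliberately tiny
stand-in and not how production LLMs are built: those use neural
networks over sub-word tokens and far longer contexts. The corpus and
the function names are invented for this sketch.

```python
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Count which word follows which in the training text."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows

def continue_prompt(follows, prompt, max_words=5):
    """Extend the prompt by repeatedly picking the most frequent next word."""
    words = prompt.lower().split()
    if not words:
        return ""
    for _ in range(max_words):
        candidates = follows.get(words[-1])
        if not candidates:
            break  # nothing in the training data ever followed this word
        words.append(candidates.most_common(1)[0][0])
    return " ".join(words)

# Invented three-sentence "training set".
corpus = [
    "you tend to be hard on yourself",
    "you tend to be critical of yourself",
    "you tend to be hard on yourself at times",
]

model = train_bigrams(corpus)
print(continue_prompt(model, "You tend"))
```

With this corpus the prompt "You tend" is continued to "you tend to be
hard on yourself". The toy model knows nothing about the reader; it
only returns whichever continuation was most common in its training
text.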

Already, this is working along the same fundamental principle as
the psychic's con: the LLM isn't "reading" your text any more than
the psychic is reading your mind. They are giving you statistically
plausible responses based on what you say. You're the one finding ways
to validate those responses as being specific to you as the subject of
the conversation.

Because of how large the training data set is, the responses from
the chatbot will look extremely convincing and specific, even though
they are statistically generic. Once you've trained on most of the
past twenty years of the web, large collections of stolen ebooks, all
of Reddit, most of social media, and a substantial amount of custom interactions by low-wage workers, the model will have a response for
almost everything you can think of, or can use a variation of something
it's already seen.

These initial interactions can be quite compelling, especially if you're
a believer in "AI", but it is in the longer and repeated conversations
that the effect really begins to kick in.

5. The subjective validation loop--RLHF enters the picture
==========================================================
It's important to remember at this stage how Reinforcement Learning
from Human Feedback (RLHF) works.

<https://huggingface.co/blog/rlhf>

This is the method that vendors use to turn a raw language model into a chatbot that can hold a conversation.

RLHF doesn't let the vendor make specific corrections to an LLM's
output. The method involves using human feedback to rank a variety of
texts generated by the model, usually following some other form of fine-tuning. The ranked texts are in turn used to train a separate
reward model. It's this model that is responsible for the actual Reinforcement Learning of the LLM. The reward model, coupled with
fine-tuning the LLM on collections of chats, is what turns the
borderline unhinged conversations of a regular model into the fluent experience you see in systems such as ChatGPT.

Because the feedback is based on rankings, it can't easily be based on
specific issues. If a model makes a false statement in a conversation,
the most a rater can do is give that whole conversation a lower rank.

This lack of concrete specificity means that RLHF models in general
are likely to reward responses that sound accurate. As the
reward model is likely just another language model, it can't reward
based on facts or anything specific, so it can only reward output
that has a tone, style, and structure that's commonly associated with statements that have been rated as accurate.
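
The ranking step can be sketched in a few lines of Python. This is a
toy under loud assumptions, not any vendor's pipeline: the "reward
model" is a three-weight linear scorer rather than a language model,
the features are hand-picked surface cues, and the rankings are
invented. Because the scorer can only see tone and length by
construction, the sketch illustrates the argument above rather than
testing it. It fits the weights with a pairwise logistic
(Bradley-Terry style) loss, a common way to learn from preferences.

```python
import math

def features(text):
    """Hypothetical surface features: tone markers and length only."""
    lowered = text.lower()
    return [
        1.0 if "certainly" in lowered else 0.0,  # confident tone
        1.0 if "not sure" in lowered else 0.0,   # hedged tone
        len(text.split()) / 10.0,                # length in tens of words
    ]

def reward(weights, text):
    """Score a text as a weighted sum of its surface features."""
    return sum(w * x for w, x in zip(weights, features(text)))

def train_reward_model(ranked_pairs, steps=200, lr=0.5):
    """Fit weights so the preferred text in each pair scores higher."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(steps):
        for preferred, rejected in ranked_pairs:
            margin = reward(weights, preferred) - reward(weights, rejected)
            # Gradient step on the pairwise loss -log(sigmoid(margin)).
            push = 1.0 - 1.0 / (1.0 + math.exp(-margin))
            diff = [a - b for a, b in zip(features(preferred), features(rejected))]
            weights = [w + lr * push * d for w, d in zip(weights, diff)]
    return weights

# Invented rankings: the rater preferred the confident-sounding reply.
ranked_pairs = [
    ("Certainly. The capital of France is Paris.", "I am not sure, maybe Paris?"),
    ("Certainly. It works like this.", "Not sure how it works."),
]
weights = train_reward_model(ranked_pairs)

confident_but_false = "Certainly. The moon is made of cheese."
hedged_but_true = "I am not sure, but the moon is mostly rock."
print(reward(weights, confident_but_false) > reward(weights, hedged_but_true))
```

The final line prints True: having learned only from which reply was
ranked higher, this toy scorer prefers the confident false statement
over the hedged true one.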

Even the ratings themselves are suspect. Most, if not all, of the
workers who provide this feedback to AI vendors are low-paid workers
who are unlikely to have specialised knowledge relevant to the topic
they're rating, and even if they do, they are unlikely to have the time
to fact-check everything.

That means they are going to be ranking the conversations almost
entirely based on tone and sentence structure.

This is why I think that RLHF has effectively become a reward system
that specifically optimises language models for generating validation statements: Forer statements, shotgunning, vanishing negatives, and statistical guesses.

In trying to make the LLM sound more human, more confident, and more engaging, but without being able to edit specific details in its output,
AI researchers seem to have created a mechanical mentalist.

Instead of pretending to read minds through statistically plausible
validation statements, it pretends to read and understand your text
through statistically plausible validation statements.

The validation loop can continue for a while, with the mark constantly
doing the work of convincing themselves of the language model's
intelligence. Done long enough, it becomes a form of reinforcement
learning for the mark.

6. The marks become cheerleaders
================================
The most enthusiastic believers in an imminent AI revolution are
starting to sound very similar to long-time believers in psychics and mind-reading.

They come up with increasingly convoluted ideas and models to explain
why the impossible is possible. They become more and more dismissive of fields of science and research that challenge their world view. Their
own statements become tinged with awe and dread.

And they keep evangelising. This is real!

Often followed by: This is dangerous!

Remember, the effect becomes more powerful when the mark is both
intelligent and wants to believe. Subjective validation is based on how
our minds work, in general, and is unaffected by your reported IQ.

If anything, your intelligence will just improve your ability to
rationalise your subjective validation and make the effect stronger.
When it's coupled with a genuine desire to believe in the con--that we
are on the verge of discovering Artificial General Intelligence--the
effect should be both irresistible and powerful once it takes hold.

This is why you can't rely on user reports to discover these issues.
People who believe in psychics will generally have only positive things
to say about a psychic, even as they're being bilked. People who believe we're on the verge of building an AGI will only have positive things to
say about chatbots that support that belief.

It's easy to fall for this
==========================
Falling for this statistical illusion is easy. It has nothing to do with
your intelligence or even your gullibility. It's your brain working
against you. Most of the time conversations are collaborative and
personal, so your mind is optimised for finding meaning in what is said
under those circumstances. If you also want to believe, whether it's in psychics or in AGI, your mind will helpfully find reasons to believe in
the conversation you're having.

Once you're so deep into it that you've done a press tour and committed yourself as a public figure to this idea, dislodging the belief that we
now have a proto-AGI becomes impossible. Much like a scientist publicly stating that they believe in a particular psychic, their self-image
becomes intertwined with their belief in that psychic. Any dismissal of
the phenomenon will feel to them like a personal attack.

The psychic's con is a mechanism that has been extraordinarily
successful at fooling people over the years. It works.

The best defence is to respond the same way as you would to a convincing psychic's reading: "That's a neat trick, I wonder how they pulled it
off?"

Well, now you know.

Once you're aware of the fallibility of how your mind works, you should
have an easier time spotting when that fallibility is being exploited, intentionally or not.

That brings us to an important question.

Is this intentional?
====================
Given that there are billions of dollars at stake in the tech industry,
it would be tempting to assume that the statistical illusion of
intelligence was intentionally created by people in the tech industry.

I personally think that's extraordinarily unlikely.

A popular response to various government conspiracy theories is that government institutions just aren't that good at keeping secrets.

Well, the tech industry just isn't that good at software. This illusion
is, honestly, too clever to have been created intentionally by those
making it.

The field of AI research has a reputation for disregarding the value of
other fields, so I'm certain that this reimplementation of a psychic's
con is entirely accidental. It's likely that, being unaware of much of
the research in psychology on cognitive biases or how a psychic's con
works, they stumbled into a mechanism and made chatbots that fooled many
of the chatbot makers themselves.

Remember what I wrote above about psychics frequently having conned themselves, that many of them aren't even aware of their own scam?

The same applies here. I think this is an industry that didn't
understand what it was doing and, now, doesn't understand what it did.

That's why so many people in tech are completely and utterly convinced
that they have created the first spark of true Artificial General Intelligence.

This new era of tech seems to be built on superstition and pseudoscience
========================================================================
Once I started to research the possibility that LLM interactions were a variation on the psychic's con, I began to see parallels everywhere in
the field of "AI".

* Hooking a language model up to an MRI and claiming that it can read
minds.

* Claiming to be able to discern criminality based on facial
expressions and gait.

* Proposing magical solutions to health problems.

* Literal predictions of the future.

* Claiming to be able to discern the honesty of potential employees.

All of these are proposed applications of "AI" systems, but they are
also all common psychic scams. Mind reading, police assistance, faith healing, prophecy, and even psychic employee vetting are all right out
of the mentalist playbook.

Even though I have no doubts that these efforts are sincere, it's
becoming more and more obvious that the tech industry has given itself wholesale to superstition and pseudoscience. They keep ignoring the
warnings coming from other fields and the concerns from critics in their
own camp.

Large Language Models don't have the functionality or features to make
up for this wave of superstition.

* "Hallucinations" are a pervasive flaw that's baked into how LLMs
work.
<https://needtoknow.fyi/card/hallucinations/>

* Summarisations are error-prone and prone to generalising about the
text being summarised.
<https://www.baldurbjarnason.com/2023/ai-summaries-unreliable/>

* Their "reasoning" is a statistical illusion.

* Their performance at natural language processing tasks is only
marginally better than that of smaller language models.
<http://opensamizdat.com/posts/chatgpt_survey/>

* They tend to memorise and copy text without attribution.
<https://needtoknow.fyi/card/copyright/>

Taken together, these flaws make LLMs look less like an information technology and more like a modern mechanisation of the psychic hotline.

Delegating your decision-making, ranking, assessment, strategising,
analysis, or any other form of reasoning to a chatbot becomes the
functional equivalent to phoning a psychic for advice.

Imagine Google or a major tech company trying to fix their search engine
by adding a psychic hotline to their front page? That's what they're
doing with Bard.

* "Our university students can't make heads nor tails of our website.
Let's add a psychic hotline!"

* "We need to improve our customer service portal. Let's add a
psychic hotline!"

* "We've added a psychic hotline button to your web browser! No, you
can't get rid of it. You're welcome!"

* "Can't understand a thing in our technical docs? Refer to our fancy
new psychic hotline!"

The AI bubble is going to be a tough one to weather.

More on "AI"
============
I've spent some time writing about the many flaws of language models and generative "AI".

* I've written about how language models are a backward-facing tool
in a novelty-seeking industry and why I think using language models
for programming is a bad idea.
<https://softwarecrisis.dev/letters/ai-code-quality/>
<https://softwarecrisis.dev/letters/ai-and-software-quality/>

* "AI" summaries are inherently unreliable.
<https://www.baldurbjarnason.com/2023/ai-summaries-unreliable/>

* Their tendency towards shortcuts makes them dangerous in healthcare.
<https://www.baldurbjarnason.com/2023/ai-in-healthcare/>

* Most of the research indicating a productivity benefit to "AI" is,
at best, flawed and, at worst, completely detached from the
reality of modern office work.
<https://www.baldurbjarnason.com/2023/ignore-most-ai-research/>
<https://www.baldurbjarnason.com/2023/ai-research-again/>

* AI vendors have a history of pseudoscience and snake oil.
<https://www.baldurbjarnason.com/2023/beware-of-ai-snake-oil/>

* Even if you do think that a language model's unsolvable tendency
towards "hallucinations" doesn't disqualify the technology from
replacing search engines, the many security issues that language
models suffer from should. The "write a prompt; get the output"
model is inherently insecure. These systems are also vulnerable to
a form of keyword manipulation exploit that's impossible to prevent.
<https://softwarecrisis.dev/letters/prompts-are-not-fit-for-purpose/>
<https://softwarecrisis.dev/letters/google-bard-seo/>

I've come to the conclusion that a language model is almost always the
wrong tool for the job.

***
I strongly advise against integrating an LLM or chatbot into your
product, website, or organisational processes.
***

If you do have to use generative AI, either because it's a mandate from
above your pay grade or some other requirement, I have written a book
that's specifically about the issues with using generative "AI" for
work:

The Intelligence Illusion: a practical guide to the business risks of Generative AI.
<https://illusion.baldurbjarnason.com/>

It's only $35 USD for EPUB and PDF, which is only 15% of the $240 USD
cost of twelve months of ChatGPT Plus.

But, again, I'd much rather you just avoid using a language model in
the first place and save both the cost of the ebook and the ChatGPT subscription.

References on the Psychic's Con
===============================
* Cold reading (Wikipedia)
<https://en.wikipedia.org/wiki/Cold_reading>
* How to Become Psychic and Cold Read People
<http://positivelybrainwashed.com/
how-to-become-psychic-and-cold-read-people/>
* Derren Brown Cold Reading revealed
<https://secrets-explained.com/derren-brown/cold-reading>
* Cold reading (Rational Wiki)
<https://rationalwiki.org/wiki/Cold_reading>
* 7 Tricks Psychics Bullshit People With That Everyone Should Know
<https://www.thrillist.com/culture/7-tricks-psychics-and-mediums-use-
how-psychics-use-cold-reading-the-forer-effect>
* Should You Believe in Psychics? Psychology and logic join forces to
debunk psychics (Psychology Today)
<https://www.psychologytoday.com/us/blog/hot-thought/201904/
should-you-believe-in-psychics>
* Motivated reasoning (Wikipedia)
<https://en.wikipedia.org/wiki/Motivated_reasoning>
* Cold Reading: How I Made Others Believe I Had Psychic Powers
<https://medium.com/@chris.kirsch/cold-reading-how-i-
made-others-believe-i-had-psychic-powers-dc184879d264>
* Cold reading (Sceptic's Dictionary)
<https://www.skepdic.com/coldread.html>
* Subjective validation (Sceptic's Dictionary)
<https://www.skepdic.com/subjectivevalidation.html>
* Subjective validation (Wikipedia)
<https://en.wikipedia.org/wiki/Subjective_validation>
* Coincidences: Remarkable or Random?
<https://skepticalinquirer.org/1998/09/
coincidences-remarkable-or-random/>
* Psychic Experiences: Psychic Illusions
<https://www.susanblackmore.uk/articles/
psychic-experiences-psychic-illusions/>
* Guide to Cold Reading
<https://www.skeptics.com.au/resources/articles/
guide-to-cold-reading-ray-hyman/>
* The Cold Reading Technique
<http://www.denisdutton.com/cold_reading.htm>
* Forer effect (Sceptic's Dictionary)
<https://www.skepdic.com/forer.html>
* Tricks of the Psychic Trade (Psychology Today)
<https://www.psychologytoday.com/us/blog/speaking-in-tongues/
201201/tricks-the-psychic-trade>
* Psychic Scams
<https://www.aarp.org/money/scams-fraud/info-2022/psychic.html>
* Ten Tricks of the Psychics I Bet You Didn't Know (You Won't Believe #6!)
<https://skepticalinquirer.org/exclusive/
ten-tricks-of-the-psychics-i-bet-you-didnrsquot-know/>

From: <https://softwarecrisis.dev/letters/llmentalist/>

--- Synchronet 3.20a-Linux NewsLink 1.114

From Anton Shepelev@anton.txt@g{oogle}mail.com to comp.misc on Fri Apr 12 13:49:52 2024

From Newsgroup: comp.misc

Ben Collver quoted:

<https://softwarecrisis.dev/letters/llmentalist/>
[...]
LLMs are not brains and do not meaningfully share any of
the mechanisms that animals or people use to reason or
think.

LLMs are a mathematical model of language tokens. You give
a LLM text, and it will give you a mathematically
plausible response to that text.

There is no reason to believe that it thinks or
reasons--indeed, every AI researcher and vendor to date
has repeatedly emphasised that these models don't think.

What say ye to:

1. LLMs can play chess, that is, they understand the
rules of the game, because the whole training set is not
nearly sufficient if it were used merely
statistically:
<https://parrotchess.com/>

2. Emergent World Models and Latent Variable Estimation
in Chess-Playing Language Models:
<https://arxiv.org/abs/2403.15498>

I fear they cannot be explained by the Forer effect.
--
() ascii ribbon campaign -- against html e-mail
/\ www.asciiribbon.org -- against proprietary attachments
--- Synchronet 3.20a-Linux NewsLink 1.114

From Eric Pozharski@apple.universe@posteo.net to comp.misc on Sat Apr 13 16:03:54 2024

From Newsgroup: comp.misc

with <20240412134952.391fb054793f0d1946a29ce6@g{oogle}mail.com> Anton
Shepelev wrote:

Ben Collver quoted:

<https://softwarecrisis.dev/letters/llmentalist/>

*SKIP* [ 8 lines 2 levels deep]

There is no reason to believe that it thinks or reasons--indeed,
every AI researcher and vendor to date has repeatedly emphasised that
these models don't think.

What say ye to:

Oh, look at that. We've got True Believer. Useful feature -- sceptical
one.

1. LLMs can play chess, that is, they understand the rules of the
game, because the whole training set is not nearly sufficient if it
were used merely statistically: <https://parrotchess.com/>

(disclaimer: I'm not immersed and I can't be pressed to research this,
but) Simplest possible somewhat problematic play: KQ vs KR. By chess
theory, Black always loses. To delay the inevitable checkmate, Black
must never separate K and R.

This was approached from a different perspective: (a) set up all
possible combinations; (b) build a graph connecting each one to its
would-be previous combinations; (c) Black chooses the move that delays
checkmate the longest. Simple, eh?
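
The three steps above describe what is usually called retrograde analysis. A minimal sketch follows, using a tiny abstract game graph instead of real KQ vs KR positions, since enumerating legal chess positions would not fit here. The position names, the `solve` and `defender_move` helpers, and the graph itself are all hypothetical.

```python
def solve(moves, to_move, mates):
    """Return {position: plies until mate} for every position the attacker wins.

    moves:   position -> list of successor positions
    to_move: position -> "attacker" or "defender"
    mates:   positions where the defender is already checkmated
    """
    dtm = {pos: 0 for pos in mates}
    depth = 0
    while True:
        depth += 1
        newly_won = []
        for pos, succs in moves.items():
            if pos in dtm or not succs:
                continue
            if to_move[pos] == "attacker":
                # The attacker needs just one move into a won position.
                won = any(s in dtm for s in succs)
            else:
                # The defender is lost only if every move stays in the won set.
                won = all(s in dtm for s in succs)
            if won:
                newly_won.append(pos)
        if not newly_won:
            return dtm  # anything left unlabelled is not a forced win
        for pos in newly_won:
            dtm[pos] = depth


def defender_move(pos, moves, dtm):
    """Pick the move that delays mate longest; escaping the won set beats all."""
    return max(moves[pos], key=lambda s: dtm.get(s, float("inf")))


# Toy graph: "m" is mate; a3 <-> d3 is a loop the attacker cannot break.
moves = {
    "m": [],
    "a1": ["m"],
    "d2": ["a1"],
    "a2": ["d2"],
    "d1": ["a1", "a2"],
    "a3": ["d3"],
    "d3": ["a3"],
    "d4": ["a1", "a3"],
}
to_move = {p: ("attacker" if p.startswith("a") else "defender") for p in moves}

dtm = solve(moves, to_move, mates={"m"})
print(dtm)
print(defender_move("d1", moves, dtm), defender_move("d4", moves, dtm))
```

This sketch rescans every position on each pass, which is fine for a toy graph. Real tablebase generators work backwards through predecessor positions instead, as step (b) suggests.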

When NI faced this, it was a catastrophe. White expects an opponent
that would play by theory and intuition. They don't expect a graph. Then
Black makes unexpected moves (thus bringing havoc onto White's plans).
During that dance some combination repeats three times. Then, by chess
rules, it's a draw.

2. Emergent World Models and Latent Variable Estimation in
Chess-Playing Language Models: <https://arxiv.org/abs/2403.15498>

. ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -- I, for one, smell oxymoron.

Is it possible to build such a graph for all positions and combinations
of pieces? Turns out, yes, it's possible just by playing.

I fear they cannot be explained by the Forer effect.

For sceptics, the Forer effect is on the receiving end. It has nothing
to do with whatever is sold.
--
Torvalds' goal for Linux is very simple: World Domination
Stallman's goal for GNU is even simpler: Freedom
--- Synchronet 3.20a-Linux NewsLink 1.114
