• The LLMentalist Effect (AI & psychic's con)

    From Ben Collver@bencollver@tilde.pink to comp.misc on Fri Apr 12 00:02:12 2024
    From Newsgroup: comp.misc

    The LLMentalist Effect
    ======================
    by Baldur Bjarnason, July 4th, 2023

    how chat-based Large Language Models replicate the mechanisms of a
    psychic's con


    For the past year or so I've been spending most of my time researching
    the use of language and diffusion models in software businesses.

    One of the issues during this research--one that has perplexed
    me--has been that many people are convinced that language models, or
    specifically chat-based language models, are intelligent.

    But there isn't any mechanism inherent in large language models (LLMs)
    that would seem to enable this and, if real, it would be completely unexplained.

    LLMs are not brains and do not meaningfully share any of the mechanisms
    that animals or people use to reason or think.

    LLMs are a mathematical model of language tokens. You give an LLM text,
    and it will give you a mathematically plausible response to that text.

    There is no reason to believe that it thinks or reasons--indeed, every
    AI researcher and vendor to date has repeatedly emphasised that these
    models don't think.

    There are two possible explanations for this effect:

    1. The tech industry has accidentally invented the initial stages of a
    completely new kind of mind, based on completely unknown
    principles, using completely unknown processes that have no
    parallel in the biological world.

    2. The intelligence illusion is in the mind of the user and not in
    the LLM itself.

    Many AI critics, including myself, are firmly in the second camp. It's
    why I titled my book on the risks of generative "AI" The Intelligence
    Illusion.

    <https://illusion.baldurbjarnason.com/>

    For the past couple of months, I've been working on an idea that I think explains the mechanism of this intelligence illusion.

    I now believe that there is even less intelligence and reasoning in
    these LLMs than I thought before.

    Many of the proposed use cases now look like borderline fraudulent pseudoscience to me.

    The rise of the mechanical psychic
    ==================================
    The intelligence illusion seems to be based on the same mechanism as
    that of a psychic's con, often called cold reading. It looks like an
    accidental automation of the same basic tactic.

    By using validation statements, such as sentences that use the Forer
    effect, the chatbot and the psychic both give the impression of being
    able to make extremely specific answers, but those answers are in fact statistically generic.

    <https://www.skepdic.com/forer.html>

    The psychic uses these statements to give the impression of being able
    to read minds and hear the secrets of the dead.

    The chatbot gives the impression of an intelligence that is specifically engaging with you and your work, but that impression is nothing more
    than a statistical trick.

    This idea was first planted in my head when I was going over some of the statements people have been making about the reasoning of these "AI."

    I first thought that these were just classic cases of tech bubble
    enthusiasm, but no, "AI" has drawn a different crowd, and the
    believers in the "AI" bubble sound very different from those of prior
    bubbles.

    * "This is real. It's a bit worrying, but it's real."

    * "There really is something there. Not sure what to think of it, but
    I've experienced it myself."

    * "You need to keep your mind open to the possibilities. Once you do,
    you'll see that there's something to it."

    That's when I remembered, triggered by a blog post by Terence Eden on
    the prevalence of Forer statements in chatbot replies: I have heard
    this before.

    <https://shkspr.mobi/blog/2023/02/how-much-of-ais-recent-success-is-due-to-the-forer-effect/>

    This specific blend of awe, disbelief, and dread sounds like the
    words of victims of mentalist scam artists--psychics.

    The psychic's con is a tried and true method for scamming people that
    has been honed through the ages.

    What I describe below is one variation. There are many variations, but
    the core mechanism remains the same.

    The Psychic's Con
    =================
    The audience is represented by a collection of characters. The
    disinterested are periods. The interested are upper case O's.

    _ _ _
    . O . O . . . . O .
    - - _ _ - _
    . . . . . O O . . O
    - - -

    1. The Audience Selects Itself

    Most people aren't interested in psychics or the like, so the initial
    audience pool is already generally more open-minded and less critical
    than the population in general. The chart now has different letters
    to indicate that they are not of a single demographic.


    R B B G G B

    Y Y B G R Y


    2. The Scene is Set

    The initial audience is prepared. Lights are dimmed. The psychic is
    hyped up. Staff research the audience on social media or through
    conversation. The audience's demographics are noted. All the letters representing demographics not chosen are lower case.

    _ _
    r b b G G b
    = -
    y y b G r y
    -

    3. Narrowing Down the Demographic

    The psychic gauges the information they have on the audience,
    gestures towards a row or cluster, and makes a statement that sounds
    specific but is in fact statistically likely for the demographic.
    Usually at least one person reacts. If not, the psychic will imply
    that the secret is too embarrassing for the "real" person to come
    forward, remind people that they're available for private readings,
    and try again. An at-symbol representing the psychic has an arrow
    pointing to the letter that represents the mark.

    - -
    @ -> G G
    = -
    G
    -

    4. The Mark is Tested

    The reaction indicates that the mark believes they were "read". This
    leads to a burst of questions that, again, sound very specific but
    are actually statistically generic. If the mark doesn't respond, the
    psychic declares the initial read a success and tries again. The
    mark's letter and the psychic's symbol have arrows pointing to each
    other representing a loop.

    ->
    @ G
    <-

    5. The Subjective Validation Loop

    The con begins in earnest. The psychic asks a series of questions
    that all sound very specific to the mark but are in reality just
    statistically probable guesses, based on their demographics and prior
    answers, phrased in a specific, highly confident way. The mark's
    letter has exclamation marks.

    !!!
    G

    6. "Wow! That psychic is the real thing!"

    The psychic ends the conversation and the mark is left with the sense
    that the psychic has uncanny powers. But the psychic isn't the real
    thing. It's all a con.

    1. Audience selection
    =====================
    Seers, tarot card readers, psychics, mind readers aren't all con
    artists. Sometimes the "psychic" is open about it all just being
    entertainment and isn't pretending to be able to contact spirits or
    read minds. Some psychics do not have a profit motive at all, and
    without the grift it doesn't seem fair to call somebody a con artist.

    But many of them are con artists deliberately fooling people, and they
    all operate using the same basic mechanisms that begin well before the
    reading proper.

    The audience is usually only composed of those already pre-disposed to
    believe in psychic phenomena and those they have managed to drag with
    them. Hardcore sceptics will almost always be in a very small minority
    of the audience, which both makes them easy to manage and provides
    social pressure on them to tone down their scepticism.

    Those who attend are primed to believe and are already familiar with
    the mythology surrounding psychics, all of which helps the psychic
    manage expectations and frame the performance.

    2. Setting the scene
    ====================
    Usually the audience is reminded of the ground rules for how psychic
    readings "work" at the start of the performance. They are helped by the popularisation of these rules by media, cinema, and TV.

    Everybody now "knows" that:

    * Readings usually begin murky and unclear.
    * They then become clearer as the "connection" to the "spirit world"
    gets stronger.
    * Errors are expected. The "spirits" are often vague or hard to hear.
    * Non-believers can weaken or even disrupt the connection.

    Psychics also habitually research their audience, by mapping out their demographics, looking them up on social media, or even with informal
    interviews performed by staff mingling with attendees before the
    performance begins.

    When the lights dim, the psychic should have a clear idea of which
    members of the audience will make for a good mark.

    3. Narrowing down
    =================
    The mark usually chooses themselves. The psychic makes a statement and
    points towards a row, quickly altering their gesture based on somebody
    responding visibly to the statement. This makes it look like they
    pointed at the mark right from the beginning.

    That way, the mark is primed from the start to believe the psychic.
    They're off-guard. Usually a bit surprised and totally unprepared for
    the quick burst of questions the psychic offers next. If those questions
    land and draw the mark in, they are followed by the actual reading.
    Otherwise, they move on and try again.

    4. Testing the mark--Cold reading using subjective validation
    =============================================================
    The con--cold reading--hinges on a quirk of human psychology: if we
    personally relate to a statement, we will generally consider it to be
    accurate.

    <https://en.wikipedia.org/wiki/Cold_reading>

    This unfortunate side effect of how our mind functions is called
    subjective validation.

    <https://en.wikipedia.org/wiki/Subjective_validation>

    Subjective validation, sometimes called personal validation effect,
    is a cognitive bias by which people will consider a statement or
    another piece of information to be correct if it has any personal
    meaning or significance to them. People whose opinion is affected
    by subjective validation will perceive two unrelated events (i.e.,
    a coincidence) to be related because their personal beliefs demand
    that they be related.

    As a consequence, many people will interpret even the most generic
    statement as being specifically about them if they can relate to what
    was said.

    The more eager they are to find meaning in the statement, the stronger
    the effect.

    The more they believe in the speaker's ability to make accurate
    statements, the stronger the effect.

    The basic mechanism of the psychic's con is built on the mark being
    willing and able to relate what was said to themselves, even if it's unintentional.

    5. The subjective validation loop using validation statements
    =============================================================
    The psychic taps into this cognitive bias by making a series of
    statements that are tailored to be personally relatable--sound specific
    to you--while actually being statistically generic.

    These statements come in many types. I use "validation statements" here
    as an umbrella term for all these various tactics.

    Some common examples:

    * Forer or Barnum statements are probably the most famous kind of
    statement that plays into the subjective validation effect. Many of
    these statements are inherently meaningless but are nonetheless
    felt to be accurate by listeners. Most people will consider "you
    tend to be hard on yourself" to be an accurate description of
    themselves, for example.
    <https://en.wikipedia.org/wiki/Barnum_effect>

    * Vanishing negative is where a question is rephrased to include a
    negative such as "not" or "don't". If the psychic asks "you don't
    play the piano?" then they will be able to reframe the question as
    accurate after the fact, no matter what the answer is. If you
    answer negative: "didn't think so". Positive: "that's what I
    thought."

    * Rainbow ruse is where the psychic associates the mark with both a
    trait and its opposite. "You're a very calm person, but if provoked
    you can get very angry."
    <https://en.wikipedia.org/wiki/Cold_reading#The_rainbow_ruse>

    * Statistical guesses.
    Statements like "you have, or used to have, a scar on your left
    leg or knee" apply to almost everybody. With enough knowledge of
    common statistics, the psychic can make general statements that
    sound incredibly specific to the mark.

    * Demographic guesses.
    Similar to statistical guesses, these are statements that are
    common to a demographic but will sound very specific to the mark
    that's listening.

    * Unverifiable predictions.
    Predictions like "somebody bears a strong ill will towards you but
    they are unlikely to act on it" are impossible to verify, but will
    sound true to many people.

    * Shotgunning is one of the more common tactics, where the psychic will
    fire off a series of statements. The mark will find one of the
    statements to be accurate and, due to how our minds work, will come
    away only remembering the correct statement.
    <https://en.wikipedia.org/wiki/Cold_reading#Shotgunning>
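
    The arithmetic behind shotgunning and statistical guesses can be
    sketched in a few lines of Python. This is only an illustration under
    stated assumptions: the function name chance_of_a_hit is made up here,
    the 30% hit rate is an arbitrary example value rather than a measured
    one, and the statements are assumed to land independently of one
    another, which real readings will not strictly satisfy.

```python
def chance_of_a_hit(p_single: float, n_statements: int) -> float:
    """Probability that at least one of n statements lands with the mark.

    Assumes (for illustration only) that every statement has the same
    chance p_single of feeling accurate and that statements are independent.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if n_statements < 0:
        raise ValueError("n_statements must not be negative")
    # The only way to get no hit at all is for every statement to miss.
    return 1.0 - (1.0 - p_single) ** n_statements


if __name__ == "__main__":
    # 0.3 is a hypothetical per-statement hit rate, not a real statistic.
    for n in (1, 3, 5, 10):
        print(n, round(chance_of_a_hit(0.3, n), 3))
```

    Even a modest per-statement hit rate compounds quickly as statements
    are added, and the mark tends to remember only the one that landed.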

    An important part of this process is the tone and bearing of the
    psychic. They need to be confident, be quick in dismissing errors and
    moving on when they make mistakes, and they need to be quick to read
    people's expressions and body language and adjust their responses to
    match.

    6. The con is completed
    =======================
    At the end of the process, the mark is likely to remember that the
    reading was eerily correct--that the psychic had an almost supernatural accuracy--which primes them to become even more receptive the next time
    they attend.

    This is where the con often becomes insidious: the effect becomes
    stronger the more cooperative the mark is, and they often become more cooperative over time.

    What's more, susceptibility has nothing to do with intelligence.

    Somebody raised to believe they have high IQ is more likely to fall
    for this than somebody raised to think less of their own intellectual capabilities. Subjective validation is a quirk of the human mind.
    We all fall for it. But if you think you're unlikely to be fooled,
    you will be tempted instead to apply your intelligence to "figure
    out" how it happened. This means you can end up using considerable
    creativity and intelligence to help the psychic fool you by coming up
    with rationalisations for their "ability". And because you think you
    can't be fooled, you also bring your intelligence to bear to defend the psychic's claim of their powers. Smart people (or, those who think of themselves as smart) can become the biggest, most lucrative marks.

    Whereas the sceptic who thinks less of themselves is more likely to just
    go:

    "That's a neat trick. I don't know how you pulled it off. Must be very
    clever."

    And just move on.

    Many psychics fool themselves
    =============================
    It isn't unusual for psychics to develop a practice of cold reading
    subconsciously. The psychics themselves might not even be aware of
    their own tactics.

    <https://en.wikipedia.org/wiki/Cold_reading#Subconscious_cold_reading>

    As Denis Dutton describes:

    As a postgraduate student in pursuit of a scientific career, he
    became intrigued with astrology. Though during this period he had
    nagging doubts about the physical basis of astrology, he was
    encouraged to continue with it by his many satisfied clients, who
    invariably found his readings "amazingly accurate" in describing
    their personal situations and problems. Not until he had one day
    obtained such a gratifying reaction to a horoscope which, he
    realized later, he had cast completely incorrectly, did he begin
    slowly to understand the real nature of his activity: his great
    success as an astrologer had nothing whatsoever to do with the
    validity of astrology as a science. He had become, in fact, a
    proficient cold reader, one who sincerely believed in the power of
    astrology under the constant reinforcement of his clients. He was
    fooling them, of course, but only after falling for the illusion
    himself.

    <http://www.denisdutton.com/cold_reading.htm>

    There are many examples of this easily found once you start doing the
    research. The mechanism is simple enough, and already baked into
    people's preconceptions of how readings work, so many psychics accidentally
    develop the knack for it, meaning that they're not just conning the
    person being read, they are also conning themselves.

    This point will become important later.

    The LLMentalist Effect
    ======================

    _ _ _
    . O . O . . . . O .
    - - _ _ - _
    . . . . . O O . . O
    - - -

    1. The Audience Selects Itself

    People sceptical about "AI" chatbots are less likely to use them.
    Those who actively disbelieve the possibility of chatbot
    "intelligence" won't get pulled in by the bot. The most active
    audience will be early adopters, tech enthusiasts, and genuine
    believers in AGI who will all generally be less critical and more
    open-minded. The characters now have different letters to indicate
    that they are not of a single demographic, all overlaid by the word
    'HYPE' and arrows indicating a prevailing atmosphere of hype.

    ^ ^ ^ ^ ^ ^
    R|B|B|G|G|B|
    | | | | | |
    Y| HYPE Y|


    2. The Scene is Set

    Users are primed by the hype surrounding the technology. The chat
    environment sets the mood and expectations. Warnings about it being
    "early days" and "hallucinations" both anthropomorphise the bot and
    provide ready-made excuses for when one of its constant failures is
    noticed. All the letters representing demographics not chosen are
    lower case.

    _ _ _
    r B b g G B
    - _ _ - -
    y y B G r y
    - -

    3. The Prompt Establishes the Context

    Each user gives the chatbot a prompt and it answers. Many will either
    accept the answer as given or repeat variations on the initial prompt
    to get the desired result. They move on without falling for the
    effect. But some users engage in conversation and get drawn in.
    Various letters representing marks are connected via loop arrows with
    at symbols representing the chatbot. The rest are lower case.


    B| b G| g B|
    ^ v ^ v ^ v
    |@ @ |@ @ |@


    4. The Marks Test Themselves

    The chatbot's answers sound extremely specific to the current context
    but are in fact statistically generic. The mathematical model behind
    the chatbot delivers a statistically plausible response to the
    question. The marks that find this convincing get pulled in. The
    mark's letter and the chatbot's symbol have arrows pointing to each
    other representing a loop.

    ->
    @ G
    <-

    5. The Subjective Validation Loop

    The mark asks a series of questions and all of the replies sound like
    reasoned answers specific to the context but are in reality just
    statistically probable guesses. The more the mark engages, the more
    convinced they are of the chatbot's intelligence. The mark's letter
    has exclamation marks.

    !!!
    G

    6. "Wow! This chatbot thinks! It has sparks of general intelligence!"

    The mark is left with the sense that the chatbot is uncannily close to
    being self-aware and that it is definitely capable of reasoning. But
    it's nothing more than a statistical and psychological effect.

    1. The audience selects itself
    ==============================
    If you aren't interested in "AI", you aren't going to use an "AI"
    chatbot, and if you try one, you're less likely to return.

    This means that many of the avid users of these chatbots are
    self-selected to be enthusiastic and open-minded about the field of AI
    and the notion of Artificial General Intelligence (AGI)--that these technologies might lead to self-aware and self-improving reasoning
    systems.

    Those who are genuine enthusiasts about AGI--that this field is about
    to invent a new kind of mind--are likely to be substantially more
    enthusiastic about using these chatbots than the rest.

    This parallels the audience selection for the psychic's con. Those who
    believe in an afterlife and that it can be contacted by the living are substantially more likely to attend a psychic's reading than others.

    2. Setting the stage
    ====================
    Our current environment of relentless hype sets the stage and builds
    up an expectation for at least glimmers of genuine intelligence. For
    all the warnings vendors make about these systems not being general intelligences, those statements are always followed by either an implied
    or an actual "yet". The hype strongly implies that these are "almost" intelligences and that you should be able to perceive "sparks" of
    intelligence in them.

    Those who believe are primed for subjective validation.

    The warnings also play a role in setting the stage. "It's early days"
    means that when the statistically generic nature of the response is
    spotted, it's easily dismissed as an "error". Anthropomorphising
    terms such as "hallucination" help dismiss the fact
    that statistical responses are completely disconnected from meaning
    and facts. The hype and mythology of AI prime the audience to think
    of these systems as persons to be understood and engaged with, all
    but guaranteeing subjective validation.

    3. The prompt establishes the context
    =====================================
    The initial prompt interaction is the first filter. Most will just take
    the first answer and leave, or at most will repeat variations of their
    prompt until they get the result they wanted. These interactions are
    purely mechanical. The end-user is treating the chatbot merely as a
    generative widget, so they never get pulled into the LLMentalist effect.

    Some of the end-users, usually those who are more enthusiastic
    about the prospect of "AI", begin to engage and get pulled into
    "conversation" with a mathematical language model.

    4. The mark tests themselves--subjective validation kicks in
    ============================================================
    That conversation is the primary filter. Those who want to believe will
    see the responses to their prompt as being both specifically about them
    and intelligent. They are primed to see the chatbot as a person that
    is reading their texts and thoughtfully responding to them. But that
    isn't how language models work. LLMs model the distribution of words and phrases in a language as tokens. Their responses are nothing more than a statistically likely continuation of the prompt.

    You give it text. It gives you a response that matches responses that
    texts like yours commonly get in its training data set.
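
    The idea of a "statistically likely continuation" can be illustrated
    with a toy bigram model in Python. To be clear about the assumptions:
    this is not how a production LLM is built (those use neural networks
    over subword tokens, not word-pair counts), and the function names
    train_bigrams and continue_text as well as the three-sentence corpus
    are invented for this sketch. It only shows the general principle of
    continuing a prompt by sampling what tended to come next in the
    training text.

```python
import random
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count, for every word, which words followed it in the corpus."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        for current, following in zip(tokens, tokens[1:]):
            counts[current][following] += 1
    return counts


def continue_text(counts, prompt, length=5, seed=0):
    """Extend a non-empty prompt by sampling likely next words."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    tokens = prompt.lower().split()
    for _ in range(length):
        options = counts.get(tokens[-1])
        if not options:
            break  # the last word was never followed by anything
        words = list(options)
        weights = [options[word] for word in words]
        # Sample in proportion to how often each word followed the last one.
        tokens.append(rng.choices(words, weights=weights)[0])
    return " ".join(tokens)


if __name__ == "__main__":
    # Hypothetical miniature "training set" of generic statements.
    corpus = [
        "you tend to be hard on yourself",
        "you tend to be calm under pressure",
        "you tend to worry about the future",
    ]
    model = train_bigrams(corpus)
    print(continue_text(model, "you tend"))
```

    Nothing in the sketch inspects what the prompt means. The output
    depends only on which words followed which in the corpus.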

    Already, this is working along the same fundamental principle as
    the psychic's con: the LLM isn't "reading" your text any more than
    the psychic is reading your mind. They are giving you statistically
    plausible responses based on what you say. You're the one finding ways
    to validate those responses as being specific to you as the subject of
    the conversation.

    Because of how large the training data set is, the responses from
    the chatbot will look extremely convincing and specific, even though
    they are statistically generic. Once you've trained on most of the
    past twenty years of the web, large collections of stolen ebooks, all
    of Reddit, most of social media, and a substantial amount of custom interactions by low-wage workers, the model will have a response for
    almost everything you can think of, or can use a variation of something
    it's already seen.

    These initial interactions can be quite compelling, especially if you're
    a believer in "AI", but it is in the longer and repeated conversations
    that the effect really begins to kick in.

    5. The subjective validation loop--RLHF enters the picture
    ==========================================================
    It's important to remember at this stage how Reinforcement Learning
    from Human Feedback (RLHF) works.

    <https://huggingface.co/blog/rlhf>

    This is the method that vendors use to turn a raw language model into a
    chatbot that can hold a conversation.

    RLHF doesn't let the vendor make specific corrections to an LLM's
    output. The method involves using human feedback to rank a variety of
    texts generated by the model, usually following some other form of
    fine-tuning. The ranked texts are in turn used to train a separate
    reward model. It's this model that is responsible for the actual
    Reinforcement Learning of the LLM. The reward model, coupled with
    fine-tuning the LLM on collections of chats, is what turns the
    borderline unhinged conversations of a regular model into the fluent
    experience you see in systems such as ChatGPT.

    Because the feedback is based on rankings, it can't easily be based on
    specific issues. If a model makes a false statement in a conversation,
    that conversation gets a lower rank.
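
    A small Python sketch may help show why ranked feedback carries no
    information about specific errors. It assumes the commonly described
    setup in which a ranking is expanded into preferred/rejected pairs
    and the reward model is trained with a pairwise logistic loss. Whether
    any particular vendor does exactly this is an assumption, and the
    names ranking_to_pairs and pairwise_loss are made up for the example.

```python
import math
from itertools import combinations


def ranking_to_pairs(ranked_responses):
    """Expand a best-first ranking into (preferred, rejected) pairs.

    Note what is lost: each pair says only "this one was ranked higher",
    not which sentence in the rejected response was false.
    """
    indices = range(len(ranked_responses))
    return [
        (ranked_responses[i], ranked_responses[j])
        for i, j in combinations(indices, 2)
    ]


def pairwise_loss(reward_preferred, reward_rejected):
    """Logistic loss on the score gap between the two responses.

    The loss is small when the reward model already scores the preferred
    response higher, and large when it scores the rejected one higher.
    """
    gap = reward_preferred - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-gap)))


if __name__ == "__main__":
    # Hypothetical labels standing in for three ranked chatbot replies.
    pairs = ranking_to_pairs(["reply A", "reply B", "reply C"])
    for preferred, rejected in pairs:
        print(preferred, ">", rejected)
    print(pairwise_loss(2.0, 0.0), pairwise_loss(0.0, 2.0))
```

    The training signal in this sketch is a single relative judgement per
    pair of whole responses, which matches the point above that the
    feedback cannot easily target a specific false statement.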

    This lack of concrete specificity means that RLHF models in
    general are likely to reward responses that sound accurate. As the
    reward model is likely just another language model, it can't reward
    based on facts or anything specific, so it can only reward output
    that has a tone, style, and structure that's commonly associated with statements that have been rated as accurate.

    Even the ratings themselves are suspect. Most, if not all, of the
    workers who provide this feedback to AI vendors are low-paid workers
    who are unlikely to have specialised knowledge relevant to the topic
    they're rating, and even if they do, they are unlikely to have the time
    to fact-check everything.

    That means they are going to be ranking the conversations almost
    entirely based on tone and sentence structure.

    This is why I think that RLHF has effectively become a reward system
    that specifically optimises language models for generating validation statements: Forer statements, shotgunning, vanishing negatives, and
    statistical guesses.

    In trying to make the LLM sound more human, more confident, and more
    engaging, but without being able to edit specific details in its output,
    AI researchers seem to have created a mechanical mentalist.

    Instead of pretending to read minds through statistically plausible
    validation statements, it pretends to read and understand your text
    through statistically plausible validation statements.

    The validation loop can continue for a while, with the mark constantly
    doing the work of convincing themselves of the language model's
    intelligence. Done long enough, it becomes a form of reinforcement
    learning for the mark.

    6. The marks become cheerleaders
    ================================
    The most enthusiastic believers in an imminent AI revolution are
    starting to sound very similar to long-time believers in psychics and mind-reading.

    They come up with increasingly convoluted ideas and models to explain
    why the impossible is possible. They become more and more dismissive of
    fields of science and research that challenge their world view. Their
    own statements become tinged with awe and dread.

    And they keep evangelising. This is real!

    Often followed by: This is dangerous!

    Remember, the effect becomes more powerful when the mark is both
    intelligent and wants to believe. Subjective validation is based on how
    our minds work, in general, and is unaffected by your reported IQ.

    If anything, your intelligence will just improve your ability to
    rationalise your subjective validation and make the effect stronger.
    When it's coupled with a genuine desire to believe in the con--that we
    are on the verge of discovering Artificial General Intelligence--the
    effect should be both irresistible and powerful once it takes hold.

    This is why you can't rely on user reports to discover these issues.
    People who believe in psychics will generally have only positive things
    to say about a psychic, even as they're being bilked. People who believe
    we're on the verge of building an AGI will only have positive things to
    say about chatbots that support that belief.

    It's easy to fall for this
    ==========================
    Falling for this statistical illusion is easy. It has nothing to do with
    your intelligence or even your gullibility. It's your brain working
    against you. Most of the time conversations are collaborative and
    personal, so your mind is optimised for finding meaning in what is said
    under those circumstances. If you also want to believe, whether it's in psychics or in AGI, your mind will helpfully find reasons to believe in
    the conversation you're having.

    Once you're so deep into it that you've done a press tour and committed yourself as a public figure to this idea, dislodging the belief that we
    now have a proto-AGI becomes impossible. Much like a scientist publicly
    stating that they believe in a particular psychic, their self-image
    becomes intertwined with their belief in that psychic. Any dismissal of
    the phenomenon will feel to them like a personal attack.

    The psychic's con is a mechanism that has been extraordinarily
    successful at fooling people over the years. It works.

    The best defence is to respond the same way as you would to a convincing psychic's reading: "That's a neat trick, I wonder how they pulled it
    off?"

    Well, now you know.

    Once you're aware of the fallibility of how your mind works, you should
    have an easier time spotting when that fallibility is being exploited, intentionally or not.

    That brings us to an important question.

    Is this intentional?
    ====================
    Given that there are billions of dollars at stake in the tech industry,
    it would be tempting to assume that the statistical illusion of
    intelligence was intentionally created by people in the tech industry.

    I personally think that's extraordinarily unlikely.

    A popular response to various government conspiracy theories is that
    government institutions just aren't that good at keeping secrets.

    Well, the tech industry just isn't that good at software. This illusion
    is, honestly, too clever to have been created intentionally by those
    making it.

    The field of AI research has a reputation for disregarding the value of
    other fields, so I'm certain that this reimplementation of a psychic's
    con is entirely accidental. It's likely that, being unaware of much of
    the research in psychology on cognitive biases or how a psychic's con
    works, they stumbled into a mechanism and made chatbots that fooled many
    of the chatbot makers themselves.

    Remember what I wrote above about psychics frequently having conned
    themselves, that many of them aren't even aware of their own scam?

    The same applies here. I think this is an industry that didn't
    understand what it was doing and, now, doesn't understand what it did.

    That's why so many people in tech are completely and utterly convinced
    that they have created the first spark of true Artificial General
    Intelligence.

    This new era of tech seems to be built on superstition and pseudoscience
    ========================================================================
    Once I started to research the possibility that LLM interactions were a
    variation on the psychic's con, I began to see parallels everywhere in
    the field of "AI".

    * Hooking a language model up to an MRI and claiming that it can read
    minds.

    * Claiming to be able to discern criminality based on facial
    expressions and gait.

    * Proposing magical solutions to health problems.

    * Literal predictions of the future.

    * Claiming to be able to discern the honesty of potential employees.

    All of these are proposed applications of "AI" systems, but they are
    also all common psychic scams. Mind reading, police assistance, faith
    healing, prophecy, and even psychic employee vetting are all right out
    of the mentalist playbook.

    Even though I have no doubts that these efforts are sincere, it's
    becoming more and more obvious that the tech industry has given itself
    wholesale to superstition and pseudoscience. They keep ignoring the
    warnings coming from other fields and the concerns from critics in their
    own camp.

    Large Language Models don't have the functionality or features to make
    up for this wave of superstition.

    * "Hallucinations" are a pervasive flaw that's baked into how LLMs
    work.
    <https://needtoknow.fyi/card/hallucinations/>

    * Summarisations are error-prone and prone to generalising about the
    text being summarised.
    <https://www.baldurbjarnason.com/2023/ai-summaries-unreliable/>

    * Their "reasoning" is a statistical illusion.

    * Their performance at natural language processing tasks is only
    marginally better than that of smaller language models.
    <http://opensamizdat.com/posts/chatgpt_survey/>

    * They tend to memorise and copy text without attribution.
    <https://needtoknow.fyi/card/copyright/>

    Taken together, these flaws make LLMs look less like an information
    technology and more like a modern mechanisation of the psychic hotline.

    Delegating your decision-making, ranking, assessment, strategising,
    analysis, or any other form of reasoning to a chatbot becomes the
    functional equivalent to phoning a psychic for advice.

    Imagine Google or another major tech company trying to fix their search
    engine by adding a psychic hotline to their front page. That's what
    they're doing with Bard.

    * "Our university students can't make heads nor tails of our website.
    Let's add a psychic hotline!"

    * "We need to improve our customer service portal. Let's add a
    psychic hotline!"

    * "We've added a psychic hotline button to your web browser! No, you
    can't get rid of it. You're welcome!"

    * "Can't understand a thing in our technical docs? Refer to our fancy
    new psychic hotline!"

    The AI bubble is going to be a tough one to weather.

    More on "AI"
    ============
    I've spent some time writing about the many flaws of language models and
    generative "AI".

    * I've written about how language models are a backward-facing tool
    in a novelty-seeking industry and why I think using language models
    for programming is a bad idea.
    <https://softwarecrisis.dev/letters/ai-code-quality/>
    <https://softwarecrisis.dev/letters/ai-and-software-quality/>

    * "AI" summaries are inherently unreliable.
    <https://www.baldurbjarnason.com/2023/ai-summaries-unreliable/>

    * Their tendency towards shortcuts makes them dangerous in healthcare.
    <https://www.baldurbjarnason.com/2023/ai-in-healthcare/>

    * Most of the research indicating a productivity benefit to "AI" is,
    at best, flawed, and at worst, completely detached from the
    reality of modern office work.
    <https://www.baldurbjarnason.com/2023/ignore-most-ai-research/>
    <https://www.baldurbjarnason.com/2023/ai-research-again/>

    * AI vendors have a history of pseudoscience and snake oil.
    <https://www.baldurbjarnason.com/2023/beware-of-ai-snake-oil/>

    * Even if you do think that a language model's unsolvable tendency
    towards "hallucinations" doesn't disqualify the technology from
    replacing search engines, the many security issues that language
    models suffer from should. The "write a prompt; get the output"
    model is inherently insecure. These systems are also vulnerable to
    a form of keyword manipulation exploit that's impossible to prevent.
    <https://softwarecrisis.dev/letters/prompts-are-not-fit-for-purpose/>
    <https://softwarecrisis.dev/letters/google-bard-seo/>

    I've come to the conclusion that a language model is almost always the
    wrong tool for the job.

    ***
    I strongly advise against integrating an LLM or chatbot into your
    product, website, or organisational processes.
    ***

    If you do have to use generative AI, either because it's a mandate from
    above your pay grade or some other requirement, I have written a book
    that's specifically about the issues with using generative "AI" for
    work:

    The Intelligence Illusion: a practical guide to the business risks of Generative AI.
    <https://illusion.baldurbjarnason.com/>

    It's only $35 USD for EPUB and PDF, which is only 15% of the $240 USD
    cost of twelve months of ChatGPT Plus.

    But, again, I'd much rather you just avoid using a language model in
    the first place and save both the cost of the ebook and the ChatGPT
    subscription.

    References on the Psychic's Con
    ===============================
    * Cold reading (Wikipedia)
    <https://en.wikipedia.org/wiki/Cold_reading>
    * How to Become Psychic and Cold Read People
    <http://positivelybrainwashed.com/
    how-to-become-psychic-and-cold-read-people/>
    * Derren Brown Cold Reading revealed
    <https://secrets-explained.com/derren-brown/cold-reading>
    * Cold reading (Rational Wiki)
    <https://rationalwiki.org/wiki/Cold_reading>
    * 7 Tricks Psychics Bullshit People With That Everyone Should Know
    <https://www.thrillist.com/culture/7-tricks-psychics-and-mediums-use-
    how-psychics-use-cold-reading-the-forer-effect>
    * Should You Believe in Psychics? Psychology and logic join forces to
    debunk psychics (Psychology Today)
    <https://www.psychologytoday.com/us/blog/hot-thought/201904/
    should-you-believe-in-psychics>
    * Motivated reasoning (Wikipedia)
    <https://en.wikipedia.org/wiki/Motivated_reasoning>
    * Cold Reading: How I Made Others Believe I Had Psychic Powers
    <https://medium.com/@chris.kirsch/cold-reading-how-i-
    made-others-believe-i-had-psychic-powers-dc184879d264>
    * Cold reading (Sceptic's Dictionary)
    <https://www.skepdic.com/coldread.html>
    * Subjective validation (Sceptic's Dictionary)
    <https://www.skepdic.com/subjectivevalidation.html>
    * Subjective validation (Wikipedia)
    <https://en.wikipedia.org/wiki/Subjective_validation>
    * Coincidences: Remarkable or Random?
    <https://skepticalinquirer.org/1998/09/
    coincidences-remarkable-or-random/>
    * Psychic Experiences: Psychic Illusions
    <https://www.susanblackmore.uk/articles/
    psychic-experiences-psychic-illusions/>
    * Guide to Cold Reading
    <https://www.skeptics.com.au/resources/articles/
    guide-to-cold-reading-ray-hyman/>
    * The Cold Reading Technique
    <http://www.denisdutton.com/cold_reading.htm>
    * Forer effect (Sceptic's Dictionary)
    <https://www.skepdic.com/forer.html>
    * Tricks of the Psychic Trade (Psychology Today)
    <https://www.psychologytoday.com/us/blog/speaking-in-tongues/
    201201/tricks-the-psychic-trade>
    * Psychic Scams
    <https://www.aarp.org/money/scams-fraud/info-2022/psychic.html>
    * Ten Tricks of the Psychics I Bet You Didn't Know (You Won't Believe #6!)
    <https://skepticalinquirer.org/exclusive/
    ten-tricks-of-the-psychics-i-bet-you-didnrsquot-know/>

    From: <https://softwarecrisis.dev/letters/llmentalist/>
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From nospam@nospam@example.net to comp.misc on Fri Apr 12 10:28:20 2024
    From Newsgroup: comp.misc

    Nice article. Thank you. I'm not in the wooo-club so I enjoyed a skeptical take on AI since the common opinion is that god will soon visit us.

    On Fri, 12 Apr 2024, Ben Collver wrote:

    The LLMentalist Effect
    ======================
    by Baldur Bjarnason, July 4th, 2023

    how chat-based Large Language Models replicate the mechanisms of a
    psychic's con


    For the past year or so I've been spending most of my time researching
    the use of language and diffusion models in software businesses.

    One of the issues during this research--one that has perplexed
    me--has been that many people are convinced that language models, or
    specifically chat-based language models, are intelligent.

    But there isn't any mechanism inherent in large language models (LLMs)
    that would seem to enable this and, if real, it would be completely
    unexplained.

    LLMs are not brains and do not meaningfully share any of the mechanisms
    that animals or people use to reason or think.

    LLMs are a mathematical model of language tokens. You give an LLM text,
    and it will give you a mathematically plausible response to that text.

    There is no reason to believe that it thinks or reasons--indeed, every
    AI researcher and vendor to date has repeatedly emphasised that these
    models don't think.

    There are two possible explanations for this effect:

    1. The tech industry has accidentally invented the initial stages of a
    completely new kind of mind, based on completely unknown
    principles, using completely unknown processes that have no
    parallel in the biological world.

    2. The intelligence illusion is in the mind of the user and not in
    the LLM itself.

    Many AI critics, including myself, are firmly in the second camp. It's
    why I titled my book on the risks of generative "AI" The Intelligence Illusion.

    <https://illusion.baldurbjarnason.com/>

    For the past couple of months, I've been working on an idea that I think explains the mechanism of this intelligence illusion.

    I now believe that there is even less intelligence and reasoning in
    these LLMs than I thought before.

    Many of the proposed use cases now look like borderline fraudulent pseudoscience to me.

    The rise of the mechanical psychic
    ==================================
    The intelligence illusion seems to be based on the same mechanism as
    that of a psychic's con, often called cold reading. It looks like an accidental automation of the same basic tactic.

    By using validation statements, such as sentences that use the Forer
    effect, the chatbot and the psychic both give the impression of being
    able to make extremely specific answers, but those answers are in fact statistically generic.

    <https://www.skepdic.com/forer.html>

    The psychic uses these statements to give the impression of being able
    to read minds and hear the secrets of the dead.

    The chatbot gives the impression of an intelligence that is specifically engaging with you and your work, but that impression is nothing more
    than a statistical trick.

    This idea was first planted in my head when I was going over some of the statements people have been making about the reasoning of these "AI."

    I first thought that these were just classic cases of tech bubble
    enthusiasm, but no, "AI" has attracted a different crowd, and the
    believers in the "AI" bubble sound very different from those of prior
    bubbles.

    * "This is real. It's a bit worrying, but it's real."

    * "There really is something there. Not sure what to think of it, but
    I've experienced it myself."

    * "You need to keep your mind open to the possibilities. Once you do,
    you'll see that there's something to it."

    That's when I remembered, triggered by a blog post by Terence Eden on
    the prevalence of Forer statements in chatbot replies: I have heard this
    before.

    <https://shkspr.mobi/blog/2023/02/how-much-of-ais-recent-success-is-due-to-the-forer-effect/>

    This specific blend of awe, disbelief, and dread sounds like the
    words of the victims of mentalist scam artists--psychics.

    The psychic's con is a tried and true method for scamming people that
    has been honed through the ages.

    What I describe below is one variation. There are many variations, but
    the core mechanism remains the same.

    The Psychic's Con
    =================
    The audience is represented by a collection of characters. The
    disinterested are periods. The interested are upper case O's.

    _ _ _
    . O . O . . . . O .
    - - _ _ - _
    . . . . . O O . . O
    - - -

    1. The Audience Selects Itself

    Most people aren't interested in psychics or the like, so the initial
    audience pool is already generally more open-minded and less critical
    than the population in general. The chart now has different letters
    to indicate that they are not of a single demographic.


    R B B G G B

    Y Y B G R Y


    2. The Scene is Set

    The initial audience is prepared. Lights are dimmed. The psychic is
    hyped up. Staff research the audience on social media or through
    conversation. The audience's demographics are noted. All the letters
    representing demographics not chosen are lower case.

    _ _
    r b b G G b
    = -
    y y b G r y
    -

    3. Narrowing Down the Demographic

    The psychic gauges the information they have on the audience,
    gestures towards a row or cluster, and makes a statement that sounds
    specific but is in fact statistically likely for the demographic.
    Usually at least one person reacts. If not, the psychic will imply
    that the secret is too embarrassing for the "real" person to come
    forward, reminds people that they're available for private readings,
    and tries again. An at-symbol representing the psychic has an arrow
    pointing to the letter that represents the mark.

    - -
    @ -> G G
    = -
    G
    -

    4. The Mark is Tested

    The reaction indicates that the mark believes they were "read". This
    leads to a burst of questions that, again, sound very specific but
    are actually statistically generic. If the mark doesn't respond, the
    psychic declares the initial read a success and tries again. The
    mark's letter and the psychic's symbol have arrows pointing to each
    other representing a loop.

    ->
    @ G
    <-

    5. The Subjective Validation Loop

    The con begins in earnest. The psychic asks a series of questions
    that all sound very specific to the mark but are in reality just
    statistically probable guesses, based on their demographics and prior
    answers, phrased in a specific, highly confident way. The mark's
    letter has exclamation marks.

    !!!
    G

    6. "Wow! That psychic is the real thing!"

    The psychic ends the conversation and the mark is left with the sense
    that the psychic has uncanny powers. But the psychic isn't the real
    thing. It's all a con.

    1. Audience selection
    =====================
    Seers, tarot card readers, psychics, mind readers aren't all con
    artists. Sometimes the "psychic" is open about it all just being
    entertainment and isn't pretending to be able to contact spirits or
    read minds. Some psychics do not have a profit motive at all, and
    without the grift it doesn't seem fair to call somebody a con artist.

    But many of them are con artists deliberately fooling people, and they
    all operate using the same basic mechanisms that begin well before the reading proper.

    The audience is usually only composed of those already pre-disposed to believe in psychic phenomena and those they have managed to drag with
    them. Hardcore sceptics will almost always be in a very small minority
    of the audience, which both makes them easy to manage and provides
    social pressure on them to tone down their scepticism.

    Those who attend are primed to believe and are already familiar with
    the mythology surrounding psychics. All of which helps the psychic
    manage expectations and frame their performance.

    2. Setting the scene
    ====================
    Usually the audience is reminded of the ground rules for how psychic
    readings "work" at the start of the performance. They are helped by the popularisation of these rules by media, cinema, and TV.

    Everybody now "knows" that:

    * Readings usually begin murky and unclear.
    * They then become clearer as the "connection" to the "spirit world"
    gets stronger.
    * Errors are expected. The "spirits" are often vague or hard to hear.
    * Non-believers can weaken or even disrupt the connection.

    Psychics also habitually research their audience, by mapping out their demographics, looking them up on social media, or even with informal interviews performed by staff mingling with attendees before the
    performance begins.

    When the lights dim, the psychic should have a clear idea of which
    members of the audience will make for a good mark.

    3. Narrowing down
    =================
    The mark usually chooses themselves. The psychic makes a statement and
    points towards a row, quickly altering their gesture based on somebody
    responding visibly to the statement. This makes it look like they
    pointed at the mark right from the beginning.

    That way, the mark is primed from the start to believe the psychic.
    They're off-guard. Usually a bit surprised and totally unprepared for
    the quick burst of questions the psychic offers next. If those questions
    land and draw the mark in, they are followed by the actual reading.
    Otherwise, the psychic moves on and tries again.

    4. Testing the mark--Cold reading using subjective validation
    =============================================================
    The con--cold reading--hinges on a quirk of human psychology: if we
    personally relate to a statement, we will generally consider it to be
    accurate.

    <https://en.wikipedia.org/wiki/Cold_reading>

    This unfortunate side effect of how our mind functions is called
    subjective validation.

    <https://en.wikipedia.org/wiki/Subjective_validation>

    Subjective validation, sometimes called personal validation effect,
    is a cognitive bias by which people will consider a statement or
    another piece of information to be correct if it has any personal
    meaning or significance to them. People whose opinion is affected
    by subjective validation will perceive two unrelated events (i.e.,
    a coincidence) to be related because their personal beliefs demand
    that they be related.

    As a consequence, many people will interpret even the most generic
    statement as being specifically about them if they can relate to what
    was said.

    The more eager they are to find meaning in the statement, the stronger
    the effect.

    The more they believe in the speaker's ability to make accurate
    statements, the stronger the effect.

    The basic mechanism of the psychic's con is built on the mark being
    willing and able to relate what was said to themselves, even if it's unintentional.

    5. The subjective validation loop using validation statements
    =============================================================
    The psychic taps into this cognitive bias by making a series of
    statements that are tailored to be personally relatable--sound specific
    to you--while actually being statistically generic.

    These statements come in many types. I use "validation statements" here
    as an umbrella term for all these various tactics.

    Some common examples:

    * Forer or Barnum statements are probably the most famous kind of
    statement that plays into the subjective validation effect. Many of
    these statements are inherently meaningless but are nonetheless
    felt to be accurate by listeners. Most people will consider "you
    tend to be hard on yourself" to be an accurate description of
    themselves, for example.
    <https://en.wikipedia.org/wiki/Barnum_effect>

    * Vanishing negative is where a question is rephrased to include a
    negative such as "not" or "don't". If the psychic asks "you don't
    play the piano?" then they will be able to reframe the question as
    accurate after the fact, no matter what the answer is. If you
    answer negative: "didn't think so". Positive: "that's what I
    thought."

    * Rainbow ruse where the psychic associates the mark with both a
    trait and its opposite. "You're a very calm person, but if provoked
    you can get very angry."
    <https://en.wikipedia.org/wiki/Cold_reading#The_rainbow_ruse>

    * Statistical guesses.
    Statements like "you have, or used to have, a scar on your left
    leg or knee" apply to almost everybody. With enough knowledge of
    common statistics, the psychic can make general statements that
    sound incredibly specific to the mark.

    * Demographic guesses.
    Similar to statistical guesses, these are statements that are
    common to a demographic but will sound very specific to the mark
    that's listening.

    * Unverifiable predictions.
    Predictions like "somebody bears a strong ill will towards you but
    they are unlikely to act on it" are impossible to verify, but will
    sound true to many people.

    * Shotgunning is one of the more common tactics, where the psychic will
    fire off a series of statements. The mark will find one of the
    statements to be accurate and, due to how our minds work, will come
    away only remembering the correct statement.
    <https://en.wikipedia.org/wiki/Cold_reading#Shotgunning>
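
    The arithmetic behind shotgunning and statistical guesses can be
    sketched in a few lines. This is only an illustration under stated
    assumptions: each statement is treated as landing independently with
    the same probability, and the 20% figure and the function name are
    made up for the example, not taken from any study.

```python
def chance_of_at_least_one_hit(hit_probability: float, statements: int) -> float:
    """Probability that at least one of `statements` independent guesses lands.

    Assumes every guess has the same, independent chance of feeling accurate.
    """
    # The only way to get zero hits is for every single guess to miss.
    chance_all_miss = (1.0 - hit_probability) ** statements
    return 1.0 - chance_all_miss


if __name__ == "__main__":
    # Hypothetical per-statement hit rate of 20%.
    for count in (1, 5, 10, 20):
        print(count, round(chance_of_at_least_one_hit(0.2, count), 3))
```

    Under these assumptions, even weak guesses add up quickly: roughly two
    in three after five statements, and close to 99% after twenty. Since
    the mark tends to remember only the hit, the misses cost the psychic
    very little.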

    An important part of this process is the tone and bearing of the
    psychic. They need to be confident, be quick in dismissing errors and
    moving on when they make mistakes, and they need to be quick to read
    people's expressions and body language and adjust their responses to
    match.

    6. The con is completed
    =======================
    At the end of the process, the mark is likely to remember that the
    reading was eerily correct--that the psychic had an almost supernatural accuracy--which primes them to become even more receptive the next time
    they attend.

    This is where the con often becomes insidious: the effect becomes
    stronger the more cooperative the mark is, and they often become more cooperative over time.

    What's more, susceptibility has nothing to do with intelligence.

    Somebody raised to believe they have a high IQ is more likely to fall
    for this than somebody raised to think less of their own intellectual
    capabilities. Subjective validation is a quirk of the human mind.
    We all fall for it. But if you think you're unlikely to be fooled,
    you will be tempted instead to apply your intelligence to "figure
    out" how it happened. This means you can end up using considerable
    creativity and intelligence to help the psychic fool you by coming up
    with rationalisations for their "ability". And because you think you
    can't be fooled, you also bring your intelligence to bear to defend the
    psychic's claim of their powers. Smart people (or, those who think of
    themselves as smart) can become the biggest, most lucrative marks.

    Whereas the sceptic who thinks less of themselves is more likely to just
    go:

    "That's a neat trick. I don't know how you pulled it off. Must be very clever."

    And just move on.

    Many psychics fool themselves
    =============================
    It isn't unusual for psychics to develop a practice of cold reading
    subconsciously. The psychics themselves might not even be
    aware of their own tactics.

    <https://en.wikipedia.org/wiki/Cold_reading#Subconscious_cold_reading>

    As Denis Dutton describes:

    As a postgraduate student in pursuit of a scientific career, he
    became intrigued with astrology. Though during this period he had
    nagging doubts about the physical basis of astrology, he was
    encouraged to continue with it by his many satisfied clients, who
    invariably found his readings "amazingly accurate" in describing
    their personal situations and problems. Not until he had one day
    obtained such a gratifying reaction to a horoscope which, he
    realized later, he had cast completely incorrectly, did he begin
    slowly to understand the real nature of his activity: his great
    success as an astrologer had nothing whatsoever to do with the
    validity of astrology as a science. He had become, in fact, a
    proficient cold reader, one who sincerely believed in the power of
    astrology under the constant reinforcement of his clients. He was
    fooling them, of course, but only after falling for the illusion
    himself.

    <http://www.denisdutton.com/cold_reading.htm>

    There are many examples of this easily found once you start doing the
    research. The mechanism is simple enough and already baked into people's
    preconceptions of how readings work, so many psychics accidentally
    develop the knack for it, meaning that they're not just conning the
    person being read, they are also conning themselves.

    This point will become important later.

    The LLMentalist Effect
    ======================

    _ _ _
    . O . O . . . . O .
    - - _ _ - _
    . . . . . O O . . O
    - - -

    1. The Audience Selects Itself

    People sceptical about "AI" chatbots are less likely to use them.
    Those who actively disbelieve the possibility of chatbot
    "intelligence" won't get pulled in by the bot. The most active
    audience will be early adopters, tech enthusiasts, and genuine
    believers in AGI who will all generally be less critical and more
    open-minded. The characters now have different letters to indicate
    that they are not of a single demographic, all overlaid by the word
    'HYPE' and arrows indicating a prevailing atmosphere of hype.

    ^ ^ ^ ^ ^ ^
    R|B|B|G|G|B|
    | | | | | |
    Y| HYPE Y|


    2. The Scene is Set

    Users are primed by the hype surrounding the technology. The chat
    environment sets the mood and expectations. Warnings about it being
    "early days" and "hallucinations" both anthropomorphise the bot and
    provide ready-made excuses for when one of its constant failures is
    noticed. All the letters representing demographics not chosen are
    lower case.

    _ _ _
    r B b g G B
    - _ _ - -
    y y B G r y
    - -

    3. The Prompt Establishes the Context

    Each user gives the chatbot a prompt and it answers. Many will either
    accept the answer as given or repeat variations on the initial prompt
    to get the desired result. They move on without falling for the
    effect. But some users engage in conversation and get drawn in.
    Various letters representing marks are connected via loop arrows with
    at symbols representing the chatbot. The rest are lower case.


    B| b G| g B|
    ^ v ^ v ^ v
    |@ @ |@ @ |@


    4. The Marks Test Themselves

    The chatbot's answers sound extremely specific to the current context
    but are in fact statistically generic. The mathematical model behind
    the chatbot delivers a statistically plausible response to the
    question. The marks that find this convincing get pulled in. The
    mark's letter and the chatbot's symbol have arrows pointing to each
    other representing a loop.

    ->
    @ G
    <-

    5. The Subjective Validation Loop

    The mark asks a series of questions and all of the replies sound like reasoned answers specific to the context but are in reality just statistically probable guesses. The more the mark engages, the more
    convinced they are of the chatbot's intelligence. The mark's letter
    has exclamation marks.

    !!!
    G

    6. "Wow! This chatbot thinks! It has sparks of general intelligence!"

    The mark is left with the sense that the chatbot is uncannily close to
    being self-aware and that it is definitely capable of reasoning. But
    it's nothing more than a statistical and psychological effect.

    1. The audience selects itself
    ==============================
    If you aren't interested in "AI", you aren't going to use an "AI"
    chatbot, and if you try one, you're less likely to return.

    This means that many of the avid users of these chatbots are
    self-selected to be enthusiastic and open-minded about the field of AI
    and the notion of Artificial General Intelligence (AGI)--that these technologies might lead to self-aware and self-improving reasoning
    systems.

    Those who are genuine enthusiasts about AGI--that this field is about
    to invent a new kind of mind--are likely to be substantially more enthusiastic about using these chatbots than the rest.

    This parallels the audience selection for the psychic's con. Those who believe in an afterlife and that it can be contacted by the living are substantially more likely to attend a psychic's reading than others.

    2. Setting the stage
    ====================
    Our current environment of relentless hype sets the stage and builds
    up an expectation for at least glimmers of genuine intelligence. For
    all the warnings vendors make about these systems not being general intelligences, those statements are always followed by either an implied
    or an actual "yet". The hype strongly implies that these are "almost" intelligences and that you should be able to perceive "sparks" of intelligence in them.

    Those who believe are primed for subjective validation.

    The warnings also play a role in setting the stage. "It's early days"
    means that when the statistically generic nature of the response is
    spotted, it's easily dismissed as an "error". Anthropomorphising
    concepts such as using "hallucination" as a term help dismiss the fact
    that statistical responses are completely disconnected from meaning
    and facts. The hype and mythology of AI primes the audience to think
    of these systems as persons to be understood and engaged with, all
    but guaranteeing subjective validation.

    3. The prompt establishes the context
    =====================================
    The initial prompt interaction is the first filter. Most will just take
    the first answer and leave, or at most will repeat variations of their
    prompt until they get the result they wanted. These interactions are
    purely mechanical. The end-user is treating the chatbot merely as a generative widget, so they never get pulled into the LLMentalist effect.

    Some of the end-users, usually those who are more enthusiastic
    about the prospect of "AI", begin to engage and get pulled into "conversation" with a mathematical language model.

    4. The mark tests themselves--subjective validation kicks in
    ============================================================
    That conversation is the primary filter. Those who want to believe will
    see the responses to their prompt as being both specifically about them
    and intelligent. They are primed to see the chatbot as a person that
    is reading their texts and thoughtfully responding to them. But that
    isn't how language models work. LLMs model the distribution of words and
    phrases in a language as tokens. Their responses are nothing more than a
    statistically likely continuation of the prompt.

    You give it text. It gives you a response that matches responses that
    texts like yours commonly get in its training data set.
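
    As a rough illustration of "statistically likely continuation", here
    is a toy sketch that assumes a bigram word-count model over a three-line
    corpus. Real LLMs are neural networks over subword tokens; the corpus
    and the names `train_bigrams` and `continue_text` are made up for this
    example.

```python
from collections import Counter, defaultdict

def train_bigrams(corpus):
    """Count, for each token, which tokens follow it in the corpus."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        tokens = sentence.lower().split()
        for current, nxt in zip(tokens, tokens[1:]):
            follows[current][nxt] += 1
    return follows

def continue_text(follows, prompt, max_tokens=5):
    """Extend the prompt by repeatedly appending the most frequent follower."""
    tokens = prompt.lower().split()
    if not tokens:
        return ""
    for _ in range(max_tokens):
        candidates = follows.get(tokens[-1])
        if not candidates:
            break  # nothing in the corpus ever followed this token
        tokens.append(candidates.most_common(1)[0][0])
    return " ".join(tokens)

# Hypothetical training data: three generic statements.
corpus = [
    "you are a kind person",
    "you are a thoughtful person",
    "she is a kind person",
]
model = train_bigrams(corpus)
print(continue_text(model, "I think you"))
```

    The continuation reads as if it were about the person typing, yet it
    is only the most common path through the counts.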

    Already, this is working along the same fundamental principle as
    the psychic's con: the LLM isn't "reading" your text any more than
    the psychic is reading your mind. They are giving you statistically
    plausible responses based on what you say. You're the one finding ways
    to validate those responses as being specific to you as the subject of
    the conversation.

    Because of how large the training data set is, the responses from
    the chatbot will look extremely convincing and specific, even though
    they are statistically generic. Once you've trained on most of the
    past twenty years of the web, large collections of stolen ebooks, all
    of Reddit, most of social media, and a substantial amount of custom interactions by low-wage workers, the model will have a response for
    almost everything you can think of, or can use a variation of something
    it's already seen.

    These initial interactions can be quite compelling, especially if you're
    a believer in "AI", but it is in the longer and repeated conversations
    that the effect really begins to kick in.

    5. The subjective validation loop--RLHF enters the picture
    ==========================================================
    It's important to remember at this stage how Reinforcement Learning
    from Human Feedback works.

    <https://huggingface.co/blog/rlhf>

    This is the method that vendors use to turn a raw language model into a chatbot that can hold a conversation.

    RLHF doesn't let the vendor make specific corrections to an LLM's
    output. The method involves using human feedback to rank a variety of
    texts generated by the model, usually following some other form of fine-tuning. The ranked texts are in turn used to train a separate
    reward model. It's this model that is responsible for the actual Reinforcement Learning of the LLM. The reward model, coupled with
    fine-tuning the LLM on collections of chats, is what turns the
    borderline unhinged conversations of a regular model into the fluent experience you see in systems such as ChatGPT.

    Because the feedback is based on rankings, it can't easily be based on specific issues. If a model makes a false statement in a conversation,
    that conversation gets a lower rank.
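
    A minimal sketch of the ranking step, assuming a linear reward model
    and two made-up tone features. Real reward models are neural networks;
    `train`, the features, and the ranked pairs here are hypothetical. It
    uses the pairwise logistic loss commonly used for learning from
    rankings, and there is deliberately no feature for "is true", because
    a ranking alone does not carry one.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def reward(weights, features):
    """Linear reward: a weighted sum of surface features."""
    return sum(w * f for w, f in zip(weights, features))

def pairwise_loss(weights, preferred, rejected):
    """-log(sigmoid(r_preferred - r_rejected)): low when the ranking is respected."""
    margin = reward(weights, preferred) - reward(weights, rejected)
    return -math.log(sigmoid(margin))

def train(pairs, n_features, lr=0.5, epochs=200):
    """Plain gradient descent on the pairwise loss."""
    weights = [0.0] * n_features
    for _ in range(epochs):
        for preferred, rejected in pairs:
            margin = reward(weights, preferred) - reward(weights, rejected)
            grad_scale = sigmoid(margin) - 1.0  # d(loss)/d(margin)
            for i in range(n_features):
                weights[i] -= lr * grad_scale * (preferred[i] - rejected[i])
    return weights

# Hypothetical features per response: [sounds_confident, hedges]
# Each pair is (response ranked higher, response ranked lower).
pairs = [
    ([1.0, 0.0], [0.0, 1.0]),
    ([1.0, 0.0], [0.0, 0.0]),
    ([0.0, 0.0], [0.0, 1.0]),
]
weights = train(pairs, n_features=2)
print(weights)  # positive weight on confidence, negative weight on hedging
```

    Whatever the rankers happened to prefer becomes the reward signal; in
    this toy case that is tone and nothing else.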

    This lack of concrete specificity means that RLHF models in
    general are likely to reward responses that sound accurate. As the
    reward model is likely just another language model, it can't reward
    based on facts or anything specific, so it can only reward output
    that has a tone, style, and structure that's commonly associated with statements that have been rated as accurate.

    Even the ratings themselves are suspect. Most, if not all, of the
    workers who provide this feedback to AI vendors are low-paid workers
    who are unlikely to have specialised knowledge relevant to the topic
    they're rating, and even if they do, they are unlikely to have the time
    to fact-check everything.

    That means they are going to be ranking the conversations almost
    entirely based on tone and sentence structure.

    This is why I think that RLHF has effectively become a reward system
    that specifically optimises language models for generating validation statements: Forer statements, shotgunning, vanishing negatives, and statistical guesses.

    In trying to make the LLM sound more human, more confident, and more engaging, but without being able to edit specific details in its output,
    AI researchers seem to have created a mechanical mentalist.

    Instead of pretending to read minds through statistically plausible
    validation statements, it pretends to read and understand your text
    through statistically plausible validation statements.

    The validation loop can continue for a while, with the mark constantly
    doing the work of convincing themselves of the language model's
    intelligence. Done long enough, it becomes a form of reinforcement
    learning for the mark.

    6. The marks become cheerleaders
    ================================
    The most enthusiastic believers in an imminent AI revolution are
    starting to sound very similar to long-time believers in psychics and mind-reading.

    They come up with increasingly convoluted ideas and models to explain
    why the impossible is possible. They become more and more dismissive of fields of science and research that challenge their world view. Their
    own statements become tinged with awe and dread.

    And they keep evangelising. This is real!

    Often followed by: This is dangerous!

    Remember, the effect becomes more powerful when the mark is both
    intelligent and wants to believe. Subjective validation is based on how
    our minds work, in general, and is unaffected by your reported IQ.

    If anything, your intelligence will just improve your ability to
    rationalise your subjective validation and make the effect stronger.
    When it's coupled with a genuine desire to believe in the con--that we
    are on the verge of discovering Artificial General Intelligence--the
    effect should both be irresistible and powerful once it takes hold.

    This is why you can't rely on user reports to discover these issues.
    People who believe in psychics will generally have only positive things
    to say about a psychic, even as they're being bilked. People who believe we're on the verge of building an AGI will only have positive things to
    say about chatbots that support that belief.

    It's easy to fall for this
    ==========================
    Falling for this statistical illusion is easy. It has nothing to do with
    your intelligence or even your gullibility. It's your brain working
    against you. Most of the time conversations are collaborative and
    personal, so your mind is optimised for finding meaning in what is said
    under those circumstances. If you also want to believe, whether it's in psychics or in AGI, your mind will helpfully find reasons to believe in
    the conversation you're having.

    Once you're so deep into it that you've done a press tour and committed yourself as a public figure to this idea, dislodging the belief that we
    now have a proto-AGI becomes impossible. Much like a scientist publicly stating that they believe in a particular psychic, their self-image
    becomes intertwined with their belief in that psychic. Any dismissal of
    the phenomenon will feel to them like a personal attack.

    The psychic's con is a mechanism that has been extraordinarily
    successful at fooling people over the years. It works.

    The best defence is to respond the same way as you would to a convincing psychic's reading: "That's a neat trick, I wonder how they pulled it
    off?"

    Well, now you know.

    Once you're aware of the fallibility of how your mind works, you should
    have an easier time spotting when that fallibility is being exploited, intentionally or not.

    That brings us to an important question.

    Is this intentional?
    ====================
    Given that there are billions of dollars at stake in the tech industry,
    it would be tempting to assume that the statistical illusion of
    intelligence was intentionally created by people in the tech industry.

    I personally think that's extraordinarily unlikely.

    A popular response to various government conspiracy theories is that government institutions just aren't that good at keeping secrets.

    Well, the tech industry just isn't that good at software. This illusion
    is, honestly, too clever to have been created intentionally by those
    making it.

    The field of AI research has a reputation for disregarding the value of
    other fields, so I'm certain that this reimplementation of a psychic's
    con is entirely accidental. It's likely that, being unaware of much of
    the research in psychology on cognitive biases or how a psychic's con
    works, they stumbled into a mechanism and made chatbots that fooled many
    of the chatbot makers themselves.

    Remember what I wrote above about psychics frequently having conned themselves, that many of them aren't even aware of their own scam?

    The same applies here. I think this is an industry that didn't
    understand what it was doing and, now, doesn't understand what it did.

    That's why so many people in tech are completely and utterly convinced
    that they have created the first spark of true Artificial General Intelligence.

    This new era of tech seems to be built on superstition and pseudoscience
    ========================================================================
    Once I started to research the possibility that LLM interactions were a variation on the psychic's con, I began to see parallels everywhere in
    the field of "AI".

    * Hooking a language model up to an MRI and claiming that it can read
    minds.

    * Claiming to be able to discern criminality based on facial
    expressions and gait.

    * Proposing magical solutions to health problems.

    * Literal predictions of the future.

    * Claiming to be able to discern the honesty of potential employees.

    All of these are proposed applications of "AI" systems, but they are
    also all common psychic scams. Mind reading, police assistance, faith healing, prophecy, and even psychic employee vetting are all right out
    of the mentalist playbook.

    Even though I have no doubts that these efforts are sincere, it's
    becoming more and more obvious that the tech industry has given itself wholesale to superstition and pseudoscience. They keep ignoring the
    warnings coming from other fields and the concerns from critics in their
    own camp.

    Large Language Models don't have the functionality or features to make
    up for this wave of superstition.

    * "Hallucinations" are a pervasive flaw that's baked into how LLMs
    work.
    <https://needtoknow.fyi/card/hallucinations/>

    * Summarisations are error-prone and prone to generalising about the
    text being summarised.
    <https://www.baldurbjarnason.com/2023/ai-summaries-unreliable/>

    * Their "reasoning" is a statistical illusion.

    * Their performance at natural language processing tasks is only
    marginally better than that of smaller language models.
    <http://opensamizdat.com/posts/chatgpt_survey/>

    * They tend to memorise and copy text without attribution.
    <https://needtoknow.fyi/card/copyright/>

    Taken together, these flaws make LLMs look less like an information technology and more like a modern mechanisation of the psychic hotline.

    Delegating your decision-making, ranking, assessment, strategising,
    analysis, or any other form of reasoning to a chatbot becomes the
    functional equivalent to phoning a psychic for advice.

    Imagine Google or a major tech company trying to fix their search engine
    by adding a psychic hotline to their front page? That's what they're
    doing with Bard.

    * "Our university students can't make heads nor tails of our website.
    Let's add a psychic hotline!"

    * "We need to improve our customer service portal. Let's add a
    psychic hotline!"

    * "We've added a psychic hotline button to your web browser! No, you
    can't get rid of it. You're welcome!"

    * "Can't understand a thing in our technical docs? Refer to our fancy
    new psychic hotline!"

    The AI bubble is going to be a tough one to weather.

    More on "AI"
    ============
    I've spent some time writing about the many flaws of language models and generative "AI".

    * I've written about how language models are a backward-facing tool
    in a novelty-seeking industry and why I think using language models
    for programming is a bad idea.
    <https://softwarecrisis.dev/letters/ai-code-quality/>
    <https://softwarecrisis.dev/letters/ai-and-software-quality/>

    * "AI" summaries are inherently unreliable.
    <https://www.baldurbjarnason.com/2023/ai-summaries-unreliable/>

    * Their tendency towards shortcuts makes them dangerous in healthcare.
    <https://www.baldurbjarnason.com/2023/ai-in-healthcare/>

    * Most of the research indicating a productivity benefit to "AI" is,
    at best, flawed, and at worst completely detached from the
    reality of modern office work.
    <https://www.baldurbjarnason.com/2023/ignore-most-ai-research/>
    <https://www.baldurbjarnason.com/2023/ai-research-again/>

    * AI vendors have a history of pseudoscience and snake oil.
    <https://www.baldurbjarnason.com/2023/beware-of-ai-snake-oil/>

    * Even if you do think that a language model's unsolvable tendency
    towards "hallucinations" doesn't disqualify the technology from
    replacing search engines, the many security issues that language
    models suffer from should. The "write a prompt; get the output"
    model is inherently insecure. These systems are also vulnerable to
    a form of keyword manipulation exploit that's impossible to prevent.
    <https://softwarecrisis.dev/letters/prompts-are-not-fit-for-purpose/>
    <https://softwarecrisis.dev/letters/google-bard-seo/>

    I've come to the conclusion that a language model is almost always the
    wrong tool for the job.

    ***
    I strongly advise against integrating an LLM or chatbot into your
    product, website, or organisational processes.
    ***

    If you do have to use generative AI, either because it's a mandate from
    above your pay grade or some other requirement, I have written a book
    that's specifically about the issues with using generative "AI" for
    work:

    The Intelligence Illusion: a practical guide to the business risks of Generative AI.
    <https://illusion.baldurbjarnason.com/>

    It's only $35 USD for EPUB and PDF, which is only 15% of the $240 USD
    cost of twelve months of ChatGPT Plus.

    But, again, I'd much rather you just avoid using a language model in
    the first place and save both the cost of the ebook and the ChatGPT subscription.

    References on the Psychic's Con
    ===============================
    * Cold reading (Wikipedia)
    <https://en.wikipedia.org/wiki/Cold_reading>
    * How to Become Psychic and Cold Read People
    <http://positivelybrainwashed.com/
    how-to-become-psychic-and-cold-read-people/>
    * Derren Brown Cold Reading revealed
    <https://secrets-explained.com/derren-brown/cold-reading>
    * Cold reading (Rational Wiki)
    <https://secrets-explained.com/derren-brown/cold-reading>
    * 7 Tricks Psychics Bullshit People With That Everyone Should Know
    <https://www.thrillist.com/culture/7-tricks-psychics-and-mediums-use-
    how-psychics-use-cold-reading-the-forer-effect>
    * Should You Believe in Psychics? Psychology and logic join forces to
    debunk psychics (Psychology Today)
    <https://www.psychologytoday.com/us/blog/hot-thought/201904/
    should-you-believe-in-psychics>
    * Motivated reasoning (Wikipedia)
    <https://en.wikipedia.org/wiki/Motivated_reasoning>
    * Cold Reading: How I Made Others Believe I Had Psychic Powers
    <https://medium.com/@chris.kirsch/cold-reading-how-i-
    made-others-believe-i-had-psychic-powers-dc184879d264>
    * Cold reading (Sceptic's Dictionary)
    <https://www.skepdic.com/coldread.html>
    * Subjective validation (Sceptic's Dictionary)
    <https://www.skepdic.com/subjectivevalidation.html>
    * Subjective validation (Wikipedia)
    <https://en.wikipedia.org/wiki/Subjective_validation>
    * Coincidences: Remarkable or Random?
    <https://skepticalinquirer.org/1998/09/
    coincidences-remarkable-or-random/>
    * Psychic Experiences: Psychic Illusions
    <https://www.susanblackmore.uk/articles/
    psychic-experiences-psychic-illusions/>
    * Guide to Cold Reading
    <https://www.skeptics.com.au/resources/articles/
    guide-to-cold-reading-ray-hyman/>
    * The Cold Reading Technique
    <http://www.denisdutton.com/cold_reading.htm>
    * Forer effect (Sceptic's Dictionary)
    <https://www.skepdic.com/forer.html>
    * Tricks of the Psychic Trade (Psychology Today)
    <https://www.psychologytoday.com/us/blog/speaking-in-tongues/
    201201/tricks-the-psychic-trade>
    * Psychic Scams
    <https://www.aarp.org/money/scams-fraud/info-2022/psychic.html>
    * Ten Tricks of the Psychics I Bet You Didn't Know (You Won't Believe #6!)
    <https://skepticalinquirer.org/exclusive/
    ten-tricks-of-the-psychics-i-bet-you-didnrsquot-know/>

    From: <https://softwarecrisis.dev/letters/llmentalist/>

    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Anton Shepelev@anton.txt@g{oogle}mail.com to comp.misc on Fri Apr 12 13:49:52 2024
    From Newsgroup: comp.misc

    Ben Collver quoted:

    <https://softwarecrisis.dev/letters/llmentalist/>
    [...]
    LLMs are not brains and do not meaningfully share any of
    the mechanisms that animals or people use to reason or
    think.

    LLMs are a mathematical model of language tokens. You give
    a LLM text, and it will give you a mathematically
    plausible response to that text.

    There is no reason to believe that it thinks or
    reasons--indeed, every AI researcher and vendor to date
    has repeatedly emphasised that these models don't think.

    What say ye to:

    1. LLMs can play chess, that is, they understand the
    rules of the game, because all the training set is not
    nearly sufficient if it were used merely
    statistically:
    <https://parrotchess.com/>

    2. Emergent World Models and Latent Variable Estimation
    in Chess-Playing Language Models:
    <https://arxiv.org/abs/2403.15498>

    I fear they cannot be explained by the Forer effect.
    --
    () ascii ribbon campaign -- against html e-mail
    /\ www.asciiribbon.org -- against proprietary attachments
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Eric Pozharski@apple.universe@posteo.net to comp.misc on Sat Apr 13 16:03:54 2024
    From Newsgroup: comp.misc

    with <20240412134952.391fb054793f0d1946a29ce6@g{oogle}mail.com> Anton
    Shepelev wrote:
    Ben Collver quoted:

    <https://softwarecrisis.dev/letters/llmentalist/>
    *SKIP* [ 8 lines 2 levels deep]
    There is no reason to believe that it thinks or reasons--indeed,
    every AI researcher and vendor to date has repeatedly emphasised that
    these models don't think.
    What say ye to:

    Oh, look at that. We've got True Believer. Useful feature -- sceptical
    one.

    1. LLMs can play chess, that is, they understand the rules of the
    game, because all the training set is not nearly sufficient if it were
    used merely statistically: <https://parrotchess.com/>

    (disclaimer: I'm not immersed and I can't be pressed to research this,
    but) Simplest possible somewhat problematic play: KQ vs KR. By the
    chess theory: blacks always lose. To delay inevitable checkmate
    blacks must never separate K and R.

    This was approached with different perspective: (a) setup all possible combinations; (b) build graph connecting would be previous
    combinations; (c) blacks choose move that would delay checkmate.
    Simple, eh?
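
    Steps (a)-(c) resemble what chess programmers call retrograde
    analysis, the technique behind endgame tablebases. A toy sketch,
    assuming a hand-made six-position graph rather than real KQ vs KR
    positions; the position names and the helpers `solve` and
    `best_defence` are hypothetical.

```python
def solve(moves, attacker_to_move, mated):
    """Work backwards from mate: return {position: plies until forced mate}."""
    depth = {pos: 0 for pos in mated}
    layer = 0
    changed = True
    while changed:
        changed = False
        layer += 1
        known = dict(depth)  # results from earlier layers only
        for pos, successors in moves.items():
            if pos in known or not successors:
                continue
            if pos in attacker_to_move:
                # Attacker needs just one move into a position solved last layer.
                forced = any(known.get(s) == layer - 1 for s in successors)
            else:
                # Defender is lost only if every available move is already lost.
                forced = all(s in known for s in successors)
            if forced:
                depth[pos] = layer
                changed = True
    return depth

def best_defence(moves, depth, pos):
    """Defender picks the move that delays mate longest (unsolved = escape)."""
    return max(moves[pos], key=lambda s: depth.get(s, float("inf")))

# Toy graph: A* = attacker to move, D* = defender to move.
moves = {
    "MATE": [],
    "A1": ["MATE"],
    "A2": ["D1"],
    "A3": ["D2"],
    "D1": ["A1", "A3"],
    "D2": ["A1"],
}
depth = solve(moves, attacker_to_move={"A1", "A2", "A3"}, mated={"MATE"})
print(depth["A2"], best_defence(moves, depth, "D1"))
```

    The defender's choice at D1 is a pure table lookup: it takes the move
    leading to the position with the largest distance to mate.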

    When NI faced this it was a catastrophe. Whites expect opponent that
    would play by theory and intuition. They don't expect graph. Then
    blacks make unexpected moves (thus bringing havoc onto whites plans).
    During that dance some combination repeats three times. Then, by chess
    rules, it's a tie.

    2. Emergent World Models and Latent Variable Estimation in
    Chess-Playing Language Models: <https://arxiv.org/abs/2403.15498>
    . ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ -- I, for one, smell oxymoron.

    Is it possible to build such graph for all positions and combinations of pieces? Turns out, yes, it's possible just by playing.

    I fear they cannot be explained by the Forer effect.

    For sceptics, the Forer effect is on the receiving part. It has nothing to
    do with whatever is sold.
    --
    Torvalds' goal for Linux is very simple: World Domination
    Stallman's goal for GNU is even simpler: Freedom
    --- Synchronet 3.20a-Linux NewsLink 1.114