• Re: Representation of _Bool

    From learningcpp1@learningcpp1@gmail.com (m137) to comp.lang.c on Fri Jan 17 02:47:49 2025
    From Newsgroup: comp.lang.c

    Hi Keith,

    Thank you for posting this. I noticed that the newer drafts of C23
    (N2912 onwards, I think) have replaced the term "trap representation"
    with "non-value representation":
    - **Trap representation** was last defined in [N2731 3.19.4(1)](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2912.pdf#page=)
    as "an object representation that need not represent a value of the
    object type."
    - **Non-value representation** is most recently defined in [N3435 3.26(1)](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3435.pdf#page=23)
    as "an object representation that does not represent a value of the
    object type."

    The definition of non-value representation rules out object
    representations that represent a value of the object type from being
    non-value representations. So it seems to be stricter than the
    definition of trap representation, which does not seem to rule out such
    object representations from being trap representations. Is this
    interpretation correct?

    If so, what happens to the 254 trap representations that GCC and Clang
    reserve for `_Bool`? Assuming a width of 1, each of those 254 object
    representations represents a value in `_Bool`'s domain (the half whose
    value bit is 1 represents the value `true`, while the other half whose
    value bit is 0 represents the value `false`), so they cannot be thought
    of as non-value representations (since a non-value representation must
    be an object representation that **does not** represent a value of the
    object type).

    I've been stuck on this for quite some time, so would be grateful for
    any guidance you could provide.


    Thank you
    --- Synchronet 3.20c-Linux NewsLink 1.2
  • From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.c on Fri Jan 17 04:40:16 2025
    From Newsgroup: comp.lang.c

    On 2025-01-17, m137 <learningcpp1@gmail.com> wrote:
    Hi Keith,

    Thank you for posting this.

    When, where? No attribution; referenced article is expired from this
    Eternal September server, which has decently long retention times.

    I noticed that the newer drafts of C23
    (N2912 onwards, I think) have replaced the term "trap representation"
    with "non-value representation":

    That is correct. Probably because "trap representation" insinuates
    that such a representation *must* produce a trap, or else the
    implementation has no right to specify such a representation.

    Implementations are not obliged to produce traps in relation to
    non-value representations. Since the behaviors in question are
    undefined, they may do so.

    If so, what happens to the 254 trap representations that GCC and Clang reserve for `_Bool`? Assuming a width of 1, each of those 254 object

    GCC and Clang specify trap representations for _Bool? Where is this
    found in their documentation?

    representations represents a value in `_Bool`'s domain (the half whose
    value bit is 1 represents the value `true`, while the other half whose
    value bit is 0 represents the value `false`), so they cannot be thought
    of as non-value representations (since a non-value representation must
    be an object representation that **does not** represent a value of the
    object type).

    In an integer type, it is indeed possible for the padding bits to be
    nonzero, without changing the value given by the value bits.

    However, how that works is not specified; it's up to an implementation,
    and doesn't have to be documented.

    An implementation could say that the padding bits don't mean anything;
    they can have any value whatsoever and so the situation is as you
    say: the bool representations with a 0 in the value bit are all false,
    and those with a 1 are all true.

    However, an implementation can also say that certain patterns of
    bits are non-value representations.

    One example given is the possibility of parity bits. Suppose some
    integer type has one padding bit which behaves as a parity bit. Then
    suppose whenever that bit has incorrect parity, the representation is
    deemed a non-value representation.

    With regard to bool (say, one implemented in 8 bits), an implementation
    can assert that if there is a nonzero value in any padding bit, the
    result is a non-value representation. Then, only 0 and 1 are valid;
    all other byte codes are non-value representations.

    Implementations determine their own rules for how configurations of
    padding bits may, on their own, or in interaction with configurations
    of value bits, give rise to non-value representations.
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
  • From David Brown@david.brown@hesbynett.no to comp.lang.c on Fri Jan 17 10:18:25 2025
    From Newsgroup: comp.lang.c

    On 17/01/2025 05:40, Kaz Kylheku wrote:
    On 2025-01-17, m137 <learningcpp1@gmail.com> wrote:
    Hi Keith,

    Thank you for posting this.

    When, where? No attribution; referenced article is expired from this
    Eternal September server, which has decently long retention times.

    I noticed that the newer drafts of C23
    (N2912 onwards, I think) have replaced the term "trap representation"
    with "non-value representation":

    That is correct. Probably because "trap representation" insinuates
    that such a representation *must* produce a trap, or else the
    implementation has no right to specify such a representation.

    Yes, I believe that is the reason. Earlier standards make it clear in
    the definitions that accessing a "trap representation" does not imply
    "performing a trap" - but the term is easily misunderstood by those
    that read parts of the standard without referring back to the definitions.


    Implementations are not obliged to produce traps in relation to
    non-value representations. Since the behaviors in question are
    undefined, they may do so.

    Agreed. I can't see any differences in the semantics here - only the
    term used has changed.


    If so, what happens to the 254 trap representations that GCC and Clang
    reserve for `_Bool`? Assuming a width of 1, each of those 254 object

    GCC and Clang specify trap representations for _Bool? Where is this
    found in their documentation?


    I can't answer for clang - I don't know it in high enough detail. But
    gcc certainly considers the use of _Bool representations other than 0 or
    1 as undefined behaviour (except when accessed through a char type
    lvalue - such as by memcpy). I once had a bug where data memcpy'ed into
    a struct containing a bool resulted in a bool that had something other
    than 0 or 1 in the memory byte - and that led to both the "true" and
    "false" paths being followed in some code that used it.

    That was, of course, perfectly good code generation for undefined behaviour.

    But to be clear - it was UB, not a trap. Pre-C23 usage of "trap
    representations" is UB and may or may not perform a trap, depending on
    the implementation (including any flags it is given). With C23, it
    would now have the more appropriate name "non-value representation" and
    exactly the same effect.

    gcc now has a new "hardbool" feature, implemented as a type attribute:

    <https://gcc.gnu.org/onlinedocs/gcc/Common-Type-Attributes.html#index-hardbool-type-attribute>

    This lets you create new types that act much like booleans, except that
    you can specify the true and false representations directly, and that
    other representations are actually trapping - they are checked at
    runtime and lead to a call to __builtin_trap().


    representations represents a value in `_Bool`'s domain (the half whose
    value bit is 1 represents the value `true`, while the other half whose
    value bit is 0 represents the value `false`), so they cannot be thought
    of as non-value representations (since a non-value representation must
    be an object representation that **does not** represent a value of the
    object type).

    In an integer type, it is indeed possible for the padding bits to be
    nonzero, without changing the value given by the value bits.

    However, how that works is not specified; it's up to an implementation,
    and doesn't have to be documented.

    An implementation could say that the padding bits don't mean anything;
    they can have any value whatsoever and so the situation is as you
    say: the bool representations with a 0 in the value bit are all false,
    and those with a 1 are all true.

    However, an implementation can also say that certain patterns of
    bits are non-value representations.

    One example given is the possibility of parity bits. Suppose some
    integer type has one padding bit which behaves as a parity bit. Then
    suppose whenever that bit has incorrect parity, the representation is
    deemed a non-value representation.

    With regard to bool (say, one implemented in 8 bits), an implementation
    can assert that if there is a nonzero value in any padding bit, the
    result is a non-value representation. Then, only 0 and 1 are valid;
    all other byte codes are non-value representations.

    Implementations determine their own rules for how configurations of
    padding bits may, on their own, or in interaction with configurations
    of value bits, give rise to non-value representations.



    All good stuff, nicely written.

  • From Michael S@already5chosen@yahoo.com to comp.lang.c on Fri Jan 17 12:06:31 2025
    From Newsgroup: comp.lang.c

    On Fri, 17 Jan 2025 04:40:16 -0000 (UTC)
    Kaz Kylheku <643-408-1753@kylheku.com> wrote:

    On 2025-01-17, m137 <learningcpp1@gmail.com> wrote:
    Hi Keith,

    Thank you for posting this.

    When, where? No attribution; referenced article is expired from this
    Eternal September server, which has decently long retention times.


    The Eternal September server used to have decently long retention times,
    something like 8 or 9 years, although I personally don't see why the
    retention time should be set to anything but forever.
    Then, approximately half a year ago, the Eternal September server crashed,
    and it turned out that it had no functioning backup.
    At the beginning Ray Banana promised that he would bring everything
    back by downloading articles from other servers. But that turned out to
    be less trivial than expected: some problems with article IDs or
    something. For a few weeks he kept promising. After that I lost track.
    The thread about it is still active at e.support, but I am not reading it.

    As to this particular thread, you can see the beginning here:
    https://www.novabbs.com/devel/article-flat.php?id=16634&group=comp.lang.c#16634
    Or on Google Groups.

  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Fri Jan 17 10:39:38 2025
    From Newsgroup: comp.lang.c

    learningcpp1@gmail.com (m137) writes:

    Hi Keith,

    Thank you for posting this.

    Normally followup postings include a reference of some sort to the
    article being replied to.

    I noticed that the newer drafts of C23
    (N2912 onwards, I think) have replaced the term "trap representation"
    with "non-value representation":
    - **Trap representation** was last defined in [N2731 3.19.4(1)](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2912.pdf#page=) as "an object representation that need not represent a value of the
    object type."
    - **Non-value representation** is most recently defined in [N3435 3.26(1)](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3435.pdf#page=23) as "an object representation that does not represent a value of the
    object type."

    The definition of non-value representation rules out object
    representations that represent a value of the object type from being non-value representations. So it seems to be stricter than the
    definition of trap representation, which does not seem to rule out such object representations from being trap representations. Is this interpretation correct?

    No. Except for using a different name, there is no difference
    between "trap representation" and "non-value representation".

    If so, what happens to the 254 trap representations that GCC and Clang reserve for `_Bool`? Assuming a width of 1, each of those 254 object representations represents a value in `_Bool`'s domain (the half whose
    value bit is 1 represents the value `true`, while the other half whose
    value bit is 0 represents the value `false`), so they cannot be thought
    of as non-value representations (since a non-value representation must
    be an object representation that **does not** represent a value of the
    object type).

    I don't know that either gcc or clang have any trap representations
    for _Bool. Furthermore whether they do could depend on either which
    version or what compiler options are being used.

    Let's assume 8-bit chars, and also that the width of _Bool is 1
    (which is optional before C23 and required in C23). Here is what
    can be said about the 256 states of a _Bool object.

    1. All zero bits must be a legal value for 0.

    2. There must be at least one combination of bits that is a legal
    value for 1 (and since it must be distinct from the all-zero
    value for 0, must have at least one bit set to 1).

    3. The remaining 254 possible combinations of bit settings can be
    any mixture of legal values and trap representations, which are also
    known as non-value representations starting in C23.

    4. Considering the set of legal value bit settings, there must be at
    least one bit position that is 0 in all cases where the value is
    0, and is 1 in all cases where the value is 1.

    5. Accessing any representation corresponding to a legal value has
    well-defined behavior, and yields 0 or 1 depending on the setting of
    the bit (or bits) mentioned in #4.

    6. Accessing any trap/non-value representation is undefined behavior
    and might do anything at all. It might appear to work. It might
    work in some cases but not others. It might yield a value that is
    neither 0 nor 1. It might abort the program. It might cause the
    computer the program is running on to run a different operating
    system (of course this outcome isn't very likely, but as far as the
    C standard is concerned it cannot be ruled out).

    Does this answer all your questions?
  • From James Kuyper@jameskuyper@alumni.caltech.edu to comp.lang.c on Fri Jan 17 14:10:11 2025
    From Newsgroup: comp.lang.c

    On 1/16/25 23:40, Kaz Kylheku wrote:
    On 2025-01-17, m137 <learningcpp1@gmail.com> wrote:
    Hi Keith,

    Thank you for posting this.

    When, where? No attribution; referenced article is expired from this
    Eternal September server, which has decently long retention times.

    While Google Groups has stopped archiving new messages, it retains all
    of the messages it previously archived, including the ones for this
    thread. It was started on 2021-05-23 by Keith Thompson.

  • From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Fri Jan 17 13:34:53 2025
    From Newsgroup: comp.lang.c

    learningcpp1@gmail.com (m137) writes:
    Hi Keith,

    Thank you for posting this.

    The message being referred to is one I posted Sun 2021-05-23, with
    Message-ID <87tums515a.fsf@nosuchdomain.example.com>. It's visible on
    Google Groups at <https://groups.google.com/g/comp.lang.c/c/4FUlV_XkmXg/m/OG8WeUCfAwAJ>.

    As others have suggested, please include attribution information when
    posting a followup. You don't need to quote the entire message,
    but provide at least some context, particularly when the parent
    message is old.

    This is an update to that message.

    I noticed that the newer drafts of C23
    (N2912 onwards, I think) have replaced the term "trap representation"
    with "non-value representation":
    - **Trap representation** was last defined in [N2731 3.19.4(1)](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2912.pdf#page=) as "an object representation that need not represent a value of the
    object type."
    - **Non-value representation** is most recently defined in [N3435 3.26(1)](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3435.pdf#page=23) as "an object representation that does not represent a value of the
    object type."

    The definition of non-value representation rules out object
    representations that represent a value of the object type from being non-value representations. So it seems to be stricter than the
    definition of trap representation, which does not seem to rule out such object representations from being trap representations. Is this interpretation correct?

    I don't believe so. As far as I can tell, a "non-value
    representation" (C23 and later) is exactly the same thing as a
    "trap representation" (C17 and earlier). The older term was probably
    considered unclear, since it could imply that a trap is required.
    In fact, reading an object with a trap/non-value representation
    has undefined behavior, which can include yielding the value you
    might have expected.

    If so, what happens to the 254 trap representations that GCC and Clang reserve for `_Bool`?

    I see no evidence in gcc's documentation that gcc treats
    representations other than 0 or 1 as trap/non-value representations.
    I see only two references to "trap representation", one for signed
    integer types (saying that there are no trap representations)
    and one regarding type-punning via unions. There are no relevant
    references to "padding bits".

    I'm less familiar with clang's documentation, but I see no reference
    to "trap representation" or "non-value representation".

    We can get some information about this by running a test program.
    See below.

    Assuming a width of 1, each of those 254 object representations represents a value in `_Bool`'s domain (the half whose
    value bit is 1 represents the value `true`, while the other half whose
    value bit is 0 represents the value `false`), so they cannot be thought
    of as non-value representations (since a non-value representation must
    be an object representation that **does not** represent a value of the
    object type).

    Reading an object with a non-value representation has undefined
    behavior. If the observed value happens to be a valid value of the
    object's type, that's still consistent with undefined behavior.
    *Everything* is consistent with undefined behavior.

    I've been stuck on this for quite some time, so would be grateful for
    any guidance you could provide.

    Editions of the C standard earlier than C23 were not entirely
    clear about the representation of _Bool. (C90 does not have _Bool
    or bool. C99 through C17 have _Bool as a keyword, with bool as
    a macro defined in <stdbool.h>. C23 has bool as a keyword, with
    _Bool as an alternate spelling.)

    In C99 and later, _Bool/bool is required to be an unsigned integer
    type large enough to hold the values 0 and 1. Its size must be at
    least CHAR_BIT bits (which is at least 8). The *rank* of _Bool is
    less than the rank of all other standard integer types.

    The rank implies that the range of values is a subset of the
    range of values of any other unsigned integer type. The rank does
    *not* imply anything about relative sizes. unsigned char has a
    higher rank than bool, but bool could have additional padding bits
    making sizeof(bool)>1. (Probably no implementation does this.)
    unsigned char has no padding bits.

    C11 implies that _Bool can have more than one value bit, which
    means it could represent values greater than 1 (but no more than
    0..UCHAR_MAX).

    C23 (I'm using the N3096 draft) tightens the requirements, saying
    that bool has exactly one value bit and (sizeof(bool)*CHAR_BIT)-1
    padding bits -- again implying that sizeof(bool) might be greater
    than 1, but forbidding values greater than 1.

    Typically in C17 and earlier, and always in C23, _Bool/bool will
    have exactly 1 value bit and CHAR_BIT-1 padding bits. Padding bits
    do not contribute to the value of an object (so 0 and 1 are the
    only possible values), but non-zero padding bits *may or may not*
    create trap/non-value representations. (A gratuitously exotic
    implementation might use a representation other than 00000001 for
    true, but 00000000 is guaranteed to be a representation for 0/false.)

    As far as I can tell, the standard is silent on whether a bool object
    with non-zero padding bits is a trap/non-value representation or not.

    I wrote a test program to explore how bool is treated. It uses
    memcpy to set the representation of a bool object and then prints
    the value of that object. Source is at the bottom of this message.

    If bool has no non-value representations, then the values of the
    CHAR_BIT-1 padding bits must be ignored when reading a bool object,
    and the value of such an object is determined only by its single
    value bit, 0 or 1. If it does have non-value representations,
    then reading such an object has undefined behavior.

    With gcc 14.2.0, with "-std=c23", all-zeros is treated as false
    when used in a condition and all other representations are treated
    as true. Converting the value of a bool object to another integer
    type yields the value of its full 8-bit representation. If a bool
    object holds a representation other than 00000000 or 00000001,
    it compares equal to both `true` and `false`.

    This implies that bool has 1 value bit and 7 padding bits (as
    required by C23) and that it has 2 value representations and 254
    trap representations. The observed behavior for the non-value
    representations is the result of undefined behavior. (gcc -std=c23
    sets __STDC_VERSION__ to 202000L, not 202311L. The documentation
    acknowledges that support for C23 is experimental and incomplete.)

    With clang 19.1.4, with "-std=c23", the behavior is consistent
    with bool having no non-value representations. The 7 padding bits
    do not contribute to the value of a bool object. Any bool object
    with 0 as the low-order bit is treated as false in a condition and
    yields 0 when converted to another integer type. Any bool object
    with 1 as the low-order bit is treated as true, and yields 1 when
    converted to another integer type. I presume the intent is for bool
    to have 256 value representations and no non-value representations
    (with the padding bits ignored as required), but it's also consistent
    with bool having non-value representations and the observed behavior
    being undefined. It's not possible to determine with a test program
    whether the output is the result of undefined behavior or not.

    As far as I can tell, the question of whether bool has non-value
    representations is unspecified but not implementation-defined,
    meaning that an implementation is not required to document its
    choice.

    #include <stdio.h>
    #include <string.h>
    #include <limits.h>
    #if __STDC_VERSION__ < 202311L
    #include <stdbool.h>
    #endif
    int main() {
        printf("__STDC_VERSION__ = %ldL\n", __STDC_VERSION__);
    #if __STDC_VERSION__ < 202311L
        puts("Older than C23, using <stdbool.h>");
    #else
        puts("C23 or later, using bool directly");
    #endif
        printf("sizeof (unsigned char) = %zu, sizeof (bool) = %zu\n",
               sizeof (unsigned char), sizeof (bool));

        /* Show how false and true are stored (first byte only). */
        const bool no = false;
        const bool yes = true;
        unsigned char uc;
        memcpy(&uc, &no, 1);
        printf("false is represented as %d\n", (int)uc);
        memcpy(&uc, &yes, 1);
        printf("true is represented as %d\n", (int)uc);

        /* Store every byte value into a bool and read it back.  Reading b
           has undefined behavior if the byte is a non-value representation. */
        for (int i = 0; i <= UCHAR_MAX; i ++) {
            const unsigned char uc = i;
            bool b;
            memcpy(&b, &uc, 1);
            const unsigned char value = b;
            printf("uc = 0x%02x b = 0x%02x b is %s, b%sfalse, b%strue\n",
                   (unsigned)uc,
                   value,
                   b ? "truthy" : "falsy ",
                   b == false ? "==" : "!=",
                   b == true ? "==" : "!=");
        }
    }
    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Sat Jan 18 12:17:02 2025
    From Newsgroup: comp.lang.c

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    learningcpp1@gmail.com (m137) writes:

    Hi Keith,

    Thank you for posting this.

    The message being referred to is one I posted Sun 2021-05-23, with
    Message-ID <87tums515a.fsf@nosuchdomain.example.com>. It's visible on
    Google Groups at <https://groups.google.com/g/comp.lang.c/c/4FUlV_XkmXg/m/OG8WeUCfAwAJ>.

    As others have suggested, please include attribution information when
    posting a followup. You don't need to quote the entire message,
    but provide at least some context, particularly when the parent
    message is old.

    This is an update to that message.

    I noticed that the newer drafts of C23
    (N2912 onwards, I think) have replaced the term "trap representation"
    with "non-value representation":
    - **Trap representation** was last defined in [N2731 3.19.4(1)]
    (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2912.pdf#page=)
    as "an object representation that need not represent a value of the
    object type."
    - **Non-value representation** is most recently defined in
    [N3435 3.26(1)]
    (https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3435.pdf#page=23)
    as "an object representation that does not represent a value of the
    object type."

    The definition of non-value representation rules out object
    representations that represent a value of the object type from
    being non-value representations. So it seems to be stricter than
    the definition of trap representation, which does not seem to rule
    out such object representations from being trap representations.
    Is this interpretation correct?

    I don't believe so. As far as I can tell, a "non-value
    representation" (C23 and later) is exactly the same thing as a
    "trap representation" (C17 and earlier). The older term was
    probably considered unclear, since it could imply that a trap is
    required. In fact, reading an object with a trap/non-value
    representation has undefined behavior, which can include yielding
    the value you might have expected.

    If so, what happens to the 254 trap representations that GCC and
    Clang reserve for `_Bool`?

    I see no evidence in gcc's documentation that gcc treats
    representations other than 0 or 1 as trap/non-value representations.
    I see only two references to "trap representation", one for signed
    integer types (saying that there are no trap representations) and
    one regarding type-punning via unions. There are no relevant
    references to "padding bits".

    I'm less familiar with clang's documentation, but I see no reference
    to "trap representation" or "non-value representation".

    We can get some information about this by running a test program.
    See below.

    Assuming a width of 1, each of those 254
    object representations represents a value in `_Bool`'s domain (the
    half whose value bit is 1 represents the value `true`, while the
    other half whose value bit is 0 represents the value `false`), so
    they cannot be thought of as non-value representations (since a
    non-value representation must be an object representation that
    **does not** represent a value of the object type).

    Reading an object with a non-value representation has undefined
    behavior. If the observed value happens to be a valid value of
    the object's type, that's still consistent with undefined
    behavior. *Everything* is consistent with undefined behavior.

    I've been stuck on this for quite some time, so would be grateful
    for any guidance you could provide.

    Editions of the C standard earlier than C23 were not entirely
    clear about the representation of _Bool. (C90 does not have _Bool
    or bool. C99 through C17 have _Bool as a keyword, with bool as
    a macro defined in <stdbool.h>. C23 has bool as a keyword, with
    _Bool as an alternate spelling.)

    In C99 and later, _Bool/bool is required to be an unsigned integer
    type large enough to hold the values 0 and 1. Its size must be at
    least CHAR_BIT bits (which is at least 8). The *rank* of _Bool is
    less than the rank of all other standard integer types.

    The rank implies that the range of values is a subset of the
    range of values of any other unsigned integer type. The rank does
    *not* imply anything about relative sizes. unsigned char has a
    higher rank than bool, but bool could have additional padding bits
    making sizeof(bool)>1. (Probably no implementation does this.)
    unsigned char has no padding bits.

    C11 implies that _Bool can have more than one value bit, which
    means it could represent values greater than 1 (but no more than 0..UCHAR_MAX).

    C23 (I'm using the N3096 draft) tightens the requirements, saying
    that bool has exactly one value bit and (sizeof(bool)*CHAR_BIT)-1
    padding bits -- again implying that sizeof(bool) might be greater
    than 1, but forbidding values greater than 1.

    Typically in C17 and earlier, and always in C23, _Bool/bool will
    have exactly 1 value bit and CHAR_BIT-1 padding bits. Padding bits
    do not contribute to the value of an object (so 0 and 1 are the
    only possible values), but non-zero padding bits *may or may not*
    create trap/non-value representations. (A gratuitously exotic
    implementation might use a representation other than 00000001 for
    true, but 00000000 is guaranteed to be a representation for 0/false.)

    As far as I can tell, the standard is silent on whether a bool object
    with non-zero padding bits is a trap/non-value representation or not.

    There are no conditions other than the rules for how integer
    types are represented. As long as those conditions are met an
    implementation is free to make any set of object representations
    be a trap representation (and I assume that hasn't changed for
    C23, not counting the change that the width of _Bool must be
    one under C23).

    I wrote a test program to explore how bool is treated. It uses
    memcpy to set the representation of a bool object and then prints
    the value of that object. Source is at the bottom of this message.

    If bool has no non-value representations, then the values of the
    CHAR_BIT-1 padding bits must be ignored when reading a bool object,
    and the value of such an object is determined only by its single
    value bit, 0 or 1. If it does have non-value representations,
    then reading such an object has undefined behavior.

    With gcc 14.2.0, with "-std=c23", all-zeros is treated as false
    when used in a condition and all other representations are treated
    as true. Converting the value of a bool object to another integer
    type yields the value of its full 8-bit representation. If a bool
    object holds a representation other than 00000000 or 00000001,
    it compares equal to both `true` and `false`.

    This implies that bool has 1 value bit and 7 padding bits (as
    required by C23) and that it has 2 value representations and 254
    trap representations. The observed behavior for the non-value
    representations is the result of undefined behavior. (gcc -std=c23
    sets __STDC_VERSION__ to 202000L, not 202311L. The documentation
    acknowledges that support for C23 is experimental and incomplete.)

    With clang 19.1.4, with "-std=c23", the behavior is consistent
    with bool having no non-value representations. The 7 padding bits
    do not contribute to the value of a bool object. Any bool object
    with 0 as the low-order bit is treated as false in a condition and
    yields 0 when converted to another integer type. Any bool object
    with 1 as the low-order bit is treated as true, and yields 1 when
    converted to another integer type. I presume the intent is for bool
    to have 256 value representations and no non-value representations
    (with the padding bits ignored as required), but it's also consistent
    with bool having non-value representations and the observed behavior
    being undefined. It's not possible to determine with a test program
    whether the output is the result of undefined behavior or not.

    As far as I can tell, the question of whether bool has non-value
    representations is unspecified but not implementation-defined,
    meaning that an implementation is not required to document its
    choice.

    6.2.6.1 paragraph 2 says objects other than bitfields are composed
    of contiguous sequences of one or more bytes, the number, order,
    and encoding of which are either explicitly specified or
    implementation-defined. Which object representations are legal
    values and which are non-value/trap representations should be
    part of the encoding, and hence implementation defined.


    #include <stdio.h>
    #include <string.h>
    #include <limits.h>
    #if __STDC_VERSION__ < 202311L
    #include <stdbool.h>
    #endif
    int main(void) {
        printf("__STDC_VERSION__ = %ldL\n", __STDC_VERSION__);
    #if __STDC_VERSION__ < 202311L
        puts("Older than C23, using <stdbool.h>");
    #else
        puts("C23 or later, using bool directly");
    #endif
        printf("sizeof (unsigned char) = %zu, sizeof (bool) = %zu\n",
               sizeof (unsigned char), sizeof (bool));

        /* Show the representations used for false and true. */
        const bool no = false;
        const bool yes = true;
        unsigned char uc;
        memcpy(&uc, &no, 1);
        printf("false is represented as %d\n", (int)uc);
        memcpy(&uc, &yes, 1);
        printf("true is represented as %d\n", (int)uc);

        /* Store every byte value into a bool and read it back. Reading b
           is undefined behavior if the byte is a non-value representation,
           so the output only shows what this particular compiler does. */
        for (int i = 0; i <= UCHAR_MAX; i ++) {
            const unsigned char uc = i;
            bool b;
            memcpy(&b, &uc, 1);
            const unsigned char value = b;
            printf("uc = 0x%02x b = 0x%02x b is %s, b%sfalse, b%strue\n",
                   (unsigned)uc,
                   (unsigned)value,
                   b ? "truthy" : "falsy ",
                   b == false ? "==" : "!=",
                   b == true ? "==" : "!=");
        }
    }

    I was surprised to discover that running this program (as C11,
    under gcc 8.4.0) with the last 'false' changed to 'no' and the
    last 'true' changed to 'yes' gave a different result, namely,
    except for value==0 and value==1 there were no "==" for the
    b comparisons.
  • From learningcpp1@learningcpp1@gmail.com (m137) to comp.lang.c on Sun Jan 19 02:08:54 2025
    From Newsgroup: comp.lang.c

    On Fri, 17 Jan 2025 4:40:16 +0000, Kaz Kylheku wrote:
    On 2025-01-17, m137 <learningcpp1@gmail.com> wrote:
    Hi Keith,

    Thank you for posting this.

    When, where? No attribution; referenced article is expired from this
    Eternal September server, which has decently long retention times.


    Hi Kaz,

    Sorry for the confusion, I am new to the platform and had not realised
    that I needed to quote Keith's post in my text.

    I noticed that the newer drafts of C23
    (N2912 onwards, I think) have replaced the term "trap representation"
    with "non-value representation":

    That is correct. Probably because "trap representation" insinuates
    that such a representation *must* produce a trap, or else the
    implementation has no right to specify such a representation.

    Implementations are not obliged to produce traps in relation to
    non-value representations. Since the behaviors in question are
    undefined, they may do so.


    Thanks, I was wondering about this.

    If so, what happens to the 254 trap representations that GCC and Clang
    reserve for `_Bool`? Assuming a width of 1, each of those 254 object

    GCC and Clang specify trap representations for _Bool? Where is this
    found in their documentation?


    It is not documented (see this thread for GCC: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88662). But I think it can
    be inferred from the code snippets in Keith's OP and most recent post.
    GCC seems to treat all object representations of `_Bool` other than
    0b00000000 and 0b00000001 as trap/non-value representations.
    I am not sure about Clang, but compiling the last snippet in this
    article: https://www.trust-in-soft.com/resources/blogs/2016-06-16-trap-representations-and-padding-bits with
    Clang 19.1.0 and options "-std=c23 -O3 -pedantic" seems to show that
    Clang treats `_Bool` as having 254 non-value representations (see here:
    https://gcc.godbolt.org/z/4jK9d69P8).

    In an integer type, it is indeed possible for the padding bits to be
    nonzero, without changing the value given by the value bits.

    However, how that works is not specified; it's up to an implementation,
    and doesn't have to be documented.

    An implementation could say that the padding bits don't mean anything;
    they can have any value whatsoever and so the situation is as you
    say: the bool representations with a 0 in the value bit are all false,
    and those with a 1 are all true.

    However, an implementation can also say that certain patterns of
    bits are non-value representations.

    One example given is the possibility of parity bits. Suppose some
    integer type has one padding bit which behaves as a parity bit. Then
    suppose whenever that bit has incorrect parity, the representation is
    deemed a non-value representation.

    With regard to bool (say, one implemented in 8 bits), an implementation
    can assert that if there is a nonzero value in any padding bit, the
    result is a non-value representation. Then, only 0 and 1 are valid;
    all other byte codes are non-value representations.

    Implementations determine their own rules for how configurations of
    padding bits may, on their own, or in interaction with configurations
    of value bits, give rise to non-value representations.

    Thank you, I really appreciate you taking the time to reply.

  • From learningcpp1@learningcpp1@gmail.com (m137) to comp.lang.c on Sun Jan 19 02:11:37 2025
    From Newsgroup: comp.lang.c

    On Fri, 17 Jan 2025 18:39:38 +0000, Tim Rentsch wrote:

    learningcpp1@gmail.com (m137) writes:

    Hi Keith,

    Thank you for posting this.

    Normally followup postings include a reference of some sort to the
    article being replied to.


    Hi Tim,

    Sorry for the confusion, I am new to the platform and hadn't realised that I
    need to quote Keith's post in my reply.

    I noticed that the newer drafts of C23
    (N2912 onwards, I think) have replaced the term "trap representation"
    with "non-value representation":
    - **Trap representation** was last defined in [N2731
    3.19.4(1)](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n2912.pdf#page=)
    as "an object representation that need not represent a value of the
    object type."
    - **Non-value representation** is most recently defined in [N3435
    3.26(1)](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3435.pdf#page=23)
    as "an object representation that does not represent a value of the
    object type."

    The definition of non-value representation rules out object
    representations that represent a value of the object type from being
    non-value representations. So it seems to be stricter than the
    definition of trap representation, which does not seem to rule out such
    object representations from being trap representations. Is this
    interpretation correct?

    No. Except for using a different name, there is no difference
    between "trap representation" and "non-value representation".


    The reason I thought they were different was because the definition of
    trap representation uses the phrase "need not", which seemed more
    permissive than the "does not" in the definition of non-value
    representation.

    Let's assume 8-bit chars, and also that the width of _Bool is 1
    (which is optional before C23 and required in C23). Here is what
    can be said about the 256 states of a _Bool object.

    1. All zero bits must be a legal value for 0.

    2. There must be at least one combination of bits that is a legal
    value for 1 (and since it must be distinct from the all-zero
    value for 0, must have at least one bit set to 1).

    3. The remaining 254 possible combinations of bit settings can be
    any mixture of legal values and trap representations, which are also
    known as non-value representations starting in C23.

    4. Considering the set of legal value bit settings, there must be at
    least one bit position that is 0 in all cases where the value is
    0, and is 1 in all cases where the value is 1.

    5. Accessing any representation corresponding to a legal value has
    well-defined behavior, and yields 0 or 1 depending on the setting of
    the bit (or bits) mentioned in #4.

    6. Accessing any trap/non-value representation is undefined behavior
    and might do anything at all. It might appear to work. It might
    work in some cases but not others. It might yield a value that is
    neither 0 nor 1. It might abort the program. It might cause the
    computer the program is running on to run a different operating
    system (of course this outcome isn't very likely, but as far as the
    C standard is concerned it cannot be ruled out).

    Does this answer all your questions?

    Yes, thank you for taking the time to reply, I really appreciate it.
    Just to clarify, since padding bits do not count towards the value being
    represented, in point (2) above, it would have to be the value bit
    specifically that is set to 1; and similarly in point (4), the bit
    position that is being referred to is the value bit. Is this correct?

  • From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Sat Jan 18 18:28:26 2025
    From Newsgroup: comp.lang.c

    learningcpp1@gmail.com (m137) writes:
    [...]
    It is not documented (see this thread for GCC: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=88662). But I think it can
    be inferred from the code snippets in Keith's OP and most recent post.
    GCC seems to treat all object representations of `_Bool` other than
    0b00000000 and 0b00000001 as trap/non-value representations.
    I am not sure about Clang, but compiling the last snippet in this
    article: https://www.trust-in-soft.com/resources/blogs/2016-06-16-trap-representations-and-padding-bits with
    Clang 19.1.0 and options "-std=c23 -O3 -pedantic" seems to show that
    Clang treats `_Bool` as having 254 non-value representations (see here:
    https://gcc.godbolt.org/z/4jK9d69P8).

    Interesting.

    Here's a program based on that snippet:

    #include <stdio.h>
    #include <string.h>

    int f(bool *b) {
        if (*b)
            return 1;
        else
            return 0;
    }

    int main(void) {
        bool arg;
        unsigned char uc = 123;
        memcpy(&arg, &uc, 1);
        printf("%d\n", f(&arg));
    }

    With the latest gcc (14.2.0) and clang (19.1.4), it prints 1 when
    compiled with "-std=c23 -O0 -pedantic", and 123 when compiled with
    -O1, -O2, and -O3.

    For gcc, my previous results already indicated that bool has 254
    non-value representations. For clang, the results do seem to
    indicate the same thing (though I might argue that it could also
    be a bug, unless the clang developers actually intended bool to
    have 254 non-value representations).
    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */
  • From learningcpp1@learningcpp1@gmail.com (m137) to comp.lang.c on Sun Jan 19 02:30:02 2025
    From Newsgroup: comp.lang.c

    On Fri, 17 Jan 2025 21:34:53 +0000, Keith Thompson wrote:

    The message being referred to is one I posted Sun 2021-05-23, with
    Message-ID <87tums515a.fsf@nosuchdomain.example.com>. It's visible on
    Google Groups at <https://groups.google.com/g/comp.lang.c/c/4FUlV_XkmXg/m/OG8WeUCfAwAJ>.

    As others have suggested, please include attribution information when
    posting a followup. You don't need to quote the entire message,
    but provide at least some context, particularly when the parent
    message is old.


    Hi Keith,

    Sorry for the confusion, I am new to the platform and had not realised
    that I needed to quote your post in my reply.


    The definition of non-value representation rules out object
    representations that represent a value of the object type from being
    non-value representations. So it seems to be stricter than the
    definition of trap representation, which does not seem to rule out such
    object representations from being trap representations. Is this
    interpretation correct?

    I don't believe so. As far as I can tell, a "non-value
    representation" (C23 and later) is exactly the same thing as a "trap
    representation" (C17 and earlier). The older term was probably
    considered unclear, since it could imply that a trap is required.
    In fact, reading an object with a trap/non-value representation
    has undefined behavior, which can include yielding the value you
    might have expected.


    The reason I thought they were different was because the definition of
    trap representation uses the phrase "need not", i.e. a trap
    representation is an object representation that **need not** represent a
    value of the object type. I read this as saying that a trap
    representation could be an object representation that represents a value
    of the object type, **or** it could be one that does not. This seemed
    more permissive than the definition of non-value representation, which
    uses the phrase "does not", i.e. a non-value representation is an object
    representation that *does not* represent a value of the object type. I
    took that as meaning that object representations that do represent a
    value of the object type (such as those 254 representations of `_Bool`,
    assuming a width of 1) are excluded from being classed as non-value
    representations. But I understand now that that is not the case.

    Editions of the C standard earlier than C23 were not entirely
    clear about the representation of _Bool.

    Yes, confusingly, I could not find anything about the width of a `_Bool`
    in C99, and C11 and C17 only talk about it in a footnote all the way
    down in 6.7.2.1:

    - C11 final draft, footnote 122: https://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf#page=131
    - C17 final draft, footnote 124: https://web.archive.org/web/20181230041359/http://www.open-std.org/jtc1/sc22/wg14/www/abq/c17_updated_proposed_fdis.pdf#page=100

    Typically in C17 and earlier, and always in C23, _Bool/bool will
    have exactly 1 value bit and CHAR_BIT-1 padding bits. Padding bits
    do not contribute to the value of an object (so 0 and 1 are the
    only possible values), but non-zero padding bits *may or may not*
    create trap/non-value representations. (A gratuitously exotic
    implementation might use a representation other than 00000001 for
    true, but 00000000 is guaranteed to be a representation for 0/false.)

    As far as I can tell, the standard is silent on whether a bool object
    with non-zero padding bits is a trap/non-value representation or not.

    I wrote a test program to explore how bool is treated. It uses
    memcpy to set the representation of a bool object and then prints
    the value of that object. Source is at the bottom of this message.

    If bool has no non-value representations, then the values of the
    CHAR_BIT-1 padding bits must be ignored when reading a bool object,
    and the value of such an object is determined only by its single
    value bit, 0 or 1. If it does have non-value representations,
    then reading such an object has undefined behavior.

    With gcc 14.2.0, with "-std=c23", all-zeros is treated as false
    when used in a condition and all other representations are treated
    as true. Converting the value of a bool object to another integer
    type yields the value of its full 8-bit representation. If a bool
    object holds a representation other than 00000000 or 00000001,
    it compares equal to both `true` and `false`.

    This implies that bool has 1 value bit and 7 padding bits (as
    required by C23) and that it has 2 value representations and 254
    trap representations. The observed behavior for the non-value
    representations is the result of undefined behavior. (gcc -std=c23
    sets __STDC_VERSION__ to 202000L, not 202311L. The documentation
    acknowledges that support for C23 is experimental and incomplete.)

    With clang 19.1.4, with "-std=c23", the behavior is consistent
    with bool having no non-value representations. The 7 padding bits
    do not contribute to the value of a bool object. Any bool object
    with 0 as the low-order bit is treated as false in a condition and
    yields 0 when converted to another integer type. Any bool object
    with 1 as the low-order bit is treated as true, and yields 1 when
    converted to another integer type. I presume the intent is for bool
    to have 256 value representations and no non-value representations
    (with the padding bits ignored as required), but it's also consistent
    with bool having non-value representations and the observed behavior
    being undefined. It's not possible to determine with a test program
    whether the output is the result of undefined behavior or not.

    Compiling the last snippet in this article: https://www.trust-in-soft.com/resources/blogs/2016-06-16-trap-representations-and-padding-bits,
    with Clang 19.1.0 and options "-std=c23 -O3 -pedantic" seems to show
    that Clang does treat `_Bool` as having 2 value representations and 254
    non-value representations (see here: https://gcc.godbolt.org/z/4jK9d69P8).

    Thank you so much for taking the time to provide such a thorough
    analysis. It really clears things up for me.

  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Sat Jan 18 20:37:25 2025
    From Newsgroup: comp.lang.c

    learningcpp1@gmail.com (m137) writes:

    On Fri, 17 Jan 2025 18:39:38 +0000, Tim Rentsch wrote:
    [...]

    Hi Tim,

    Sorry for the confusion, I am new to the platform and hadn't realised
    that I need to quote Keith's post in my reply.

    No worries. Glad you are up to speed now.

    Let's assume 8-bit chars, and also that the width of _Bool is 1
    (which is optional before C23 and required in C23). Here is what
    can be said about the 256 states of a _Bool object.

    1. All zero bits must be a legal value for 0.

    2. There must be at least one combination of bits that is a legal
    value for 1 (and since it must be distinct from the all-zero
    value for 0, must have at least one bit set to 1).

    3. The remaining 254 possible combinations of bit settings can be
    any mixture of legal values and trap representations, which are also
    known as non-value representations starting in C23.

    4. Considering the set of legal value bit settings, there must be at
    least one bit position that is 0 in all cases where the value is
    0, and is 1 in all cases where the value is 1.

    5. Accessing any representation corresponding to a legal value has
    well-defined behavior, and yields 0 or 1 depending on the setting of
    the bit (or bits) mentioned in #4.

    6. Accessing any trap/non-value representation is undefined behavior
    and might do anything at all. It might appear to work. It might
    work in some cases but not others. It might yield a value that is
    neither 0 nor 1. It might abort the program. It might cause the
    computer the program is running on to run a different operating
    system (of course this outcome isn't very likely, but as far as the
    C standard is concerned it cannot be ruled out).

    Does this answer all your questions?

    Yes, thank you for taking the time to reply, I really appreciate it.
    Just to clarify, since padding bits do not count towards the value being
    represented, in point (2) above, it would have to be the value bit
    specifically that is set to 1; and similarly in point (4), the bit
    position that is being referred to is the value bit. Is this correct?

    Yes, I think that's right, but we can't always tell which bit is the
    value bit just by looking at the set of legal values. Consider an
    implementation where _Bool 0 is represented by all zeros and _Bool 1
    is represented by all ones, and every combination that includes both
    zeros and ones (which is everything else) is a trap representation.
    The width of _Bool must be 1, but which bit is the value bit? We
    can't tell. Fortunately the C standard says how different types are
    encoded is implementation defined (if not defined explicitly), so we
    can consult the documentation to see which bit of _Bool is the value
    bit.
  • From gazelle@gazelle@shell.xmission.com (Kenny McCormack) to comp.lang.c on Sun Jan 19 09:31:18 2025
    From Newsgroup: comp.lang.c

    In article <0f5ec645511128c21c90fb0688247e60@www.novabbs.com>,
    m137 <learningcpp1@gmail.com> wrote:
    On Fri, 17 Jan 2025 21:34:53 +0000, Keith Thompson wrote:

    The message being referred to is one I posted Sun 2021-05-23, with
    Message-ID <87tums515a.fsf@nosuchdomain.example.com>. It's visible on
    Google Groups at
    <https://groups.google.com/g/comp.lang.c/c/4FUlV_XkmXg/m/OG8WeUCfAwAJ>.

    As others have suggested, please include attribution information when
    posting a followup. You don't need to quote the entire message,
    but provide at least some context, particularly when the parent
    message is old.


    Hi Keith,

    Sorry for the confusion, I am new to the platform and had not realised
    that I needed to quote your post in my reply.

    You don't need to (in the sense that the world would end if you don't do
    it), but it makes things easier for your readers if you do.

    It is common in CLC to way overstate the case for various things. This
    seems to be an instance of that.
    --
    Res ipsa loquitur.
  • From learningcpp1@learningcpp1@gmail.com (m137) to comp.lang.c on Tue Jan 21 00:16:40 2025
    From Newsgroup: comp.lang.c

    On Sun, 19 Jan 2025 9:31:18 +0000, Kenny McCormack wrote:

    You don't need to (in the sense that the world would end if you don't do
    it), but it makes things easier for your readers if you do.

    Hi Kenny,

    Thanks for letting me know. I like the quotes as it makes it easier to
    see which parts of a post are being addressed in a follow-up post.
