• Re: Assignment between union object members of incompatible types

    From James Kuyper@jameskuyper@alumni.caltech.edu to comp.std.c on Mon Nov 30 14:45:35 2020
    From Newsgroup: comp.std.c

    On 11/30/20 1:44 PM, Ian Abbott wrote:
    A question (not posted by myself) from https://stackoverflow.com/questions/65077630 :

    Consider the following:

    union { int i; char c; } x = {0};
    x.c = x.i;

    Does the assignment x.c = x.i result in undefined behavior?

    C18 6.15.16.1/3 says:

    That's actually 6.5.16.1/3.

    | If the value being stored in an object is read from another object
    | that overlaps in any way the storage of the first object, then the
    | overlap shall be exact and the two objects shall have qualified or
    | unqualified versions of a compatible type; otherwise, the behavior is
    | undefined.

    The objects x.c and x.i overlap, but have incompatible types, so on first glance it appears to be UB.

    That is correct.
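
A minimal sketch of the layout facts the question relies on. The union tag `int_or_char` and both function names are made up for illustration; the thread's union is anonymous. It only inspects addresses and sizes and never performs the disputed assignment.

```c
#include <assert.h>
#include <stddef.h>

/* By definition sizeof(char) is 1, so the overlap can be exact only if
   sizeof(int) is also 1, which is unusual but permitted. */
static_assert(sizeof(char) == 1, "sizeof(char) is 1 by definition");

/* The union from the question, with a hypothetical tag so it can be reused. */
union int_or_char { int i; char c; };

/* Every member of a union starts at the address of the union itself,
   so the storage of x.i and x.c necessarily overlaps. */
int members_share_start(void)
{
    union int_or_char x = {0};
    return (void *)&x.i == (void *)&x && (void *)&x.c == (void *)&x;
}

/* The overlap is exact only when both members occupy the same number of bytes. */
int overlap_is_exact(void)
{
    return sizeof(int) == sizeof(char);
}
```

Even on an implementation where `overlap_is_exact()` returned 1, `int` and `char` would still not be compatible types, so the second condition of the quoted paragraph would still fail.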
    --- Synchronet 3.18a-Linux NewsLink 1.113
  • From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.std.c on Mon Nov 30 12:20:02 2020
    From Newsgroup: comp.std.c

    Ian Abbott <ijabbott63@gmail.com> writes:
    A question (not posted by myself) from https://stackoverflow.com/questions/65077630 :

    Consider the following:

    union { int i; char c; } x = {0};
    x.c = x.i;

    Does the assignment x.c = x.i result in undefined behavior?

    C18 6.15.16.1/3 says:

    | If the value being stored in an object is read from another object
    | that overlaps in any way the storage of the first object, then the
    | overlap shall be exact and the two objects shall have qualified or
    | unqualified versions of a compatible type; otherwise, the behavior is
    | undefined.

    The objects x.c and x.i overlap, but have incompatible types, so on
    first glance it appears to be UB.

    On first glance, yes, but I think this passage needs to be updated to
    reflect its intent.

    The RHS of an assignment is not an lvalue. If it starts out as an
    lvalue, then lvalue conversion is applied. Logically the value is
    retrieved *and then* copied into the destination object.

    A simple case like this is not likely to cause problems (in the absence
    of aggressive optimization), but we can construct more problematic cases
    where the object being copied is arbitrarily large and the overlap might
    not be detectable at compile time.

    I've written an answer:
    https://stackoverflow.com/a/65080498/827263
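
A sketch of the kind of situation described above, where two ranges may overlap partially and the overlap is only known at run time. It does not use assignment at all. It substitutes `memmove`, which is specified to behave as if the bytes were first copied to a temporary array, so partial overlap is permitted. The helper name `copy_within` and its parameters are assumptions for illustration and were not proposed in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy n bytes inside buf (of length len) from index
   `from` to index `to`. The two ranges may overlap in any way. */
void copy_within(unsigned char *buf, size_t len, size_t from, size_t to, size_t n)
{
    /* Both ranges must lie inside the buffer. */
    assert(from <= len && n <= len - from);
    assert(to <= len && n <= len - to);

    /* memmove is defined for overlapping source and destination. */
    memmove(buf + to, buf + from, n);
}
```

For example, with an 8-byte buffer holding 0..7, `copy_within(buf, 8, 0, 2, 4)` leaves 0, 1, 0, 1, 2, 3, 6, 7 in the buffer.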

    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    Working, but not speaking, for Philips Healthcare
    void Void(void) { Void(); } /* The recursive call of the void */
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.std.c on Tue Dec 1 02:49:10 2020
    From Newsgroup: comp.std.c

    Ian Abbott <ijabbott63@gmail.com> writes:

    A question (not posted by myself) from https://stackoverflow.com/questions/65077630 :

    Consider the following:

    union { int i; char c; } x = {0};
    x.c = x.i;

    Does the assignment x.c = x.i result in undefined behavior?

    C18 6.15.16.1/3 says:

    As elsewhere noted, the reference is 6.5.16.1 paragraph 3.

    | If the value being stored in an object is read from another object
    | that overlaps in any way the storage of the first object, then the
    | overlap shall be exact and the two objects shall have qualified or
    | unqualified versions of a compatible type; otherwise, the behavior is
    | undefined.

    The objects x.c and x.i overlap, but have incompatible types, so on
    first glance it appears to be UB.

    The value to be stored is read from object x.i. This value is
    being stored in object x.c.

    The object x.i overlaps with object x.c.

    The overlap is not exact (probably). The two types involved are
    not qualified or unqualified versions of a compatible type.

    The assignment has undefined behavior. No doubt about it.
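
The step-by-step reading above can be written down as a small decision function. This is only a model of the wording of 6.5.16.1 p3 under that reading, not anything a compiler does. The names `extent`, `verdict` and `classify` are hypothetical, and whether two types are compatible is passed in as a flag rather than computed.

```c
#include <assert.h>
#include <stddef.h>

enum verdict { NO_OVERLAP, ALLOWED_OVERLAP, UNDEFINED };

/* Hypothetical description of an object: byte offset within some
   enclosing storage, and size in bytes. */
struct extent { size_t start; size_t size; };

/* Model of 6.5.16.1 p3 for a value read from `src` and stored in `dst`. */
enum verdict classify(struct extent dst, struct extent src, int compatible_types)
{
    assert(dst.size > 0 && src.size > 0);

    /* Disjoint storage: the paragraph does not apply. */
    if (dst.start + dst.size <= src.start || src.start + src.size <= dst.start)
        return NO_OVERLAP;

    /* Overlapping storage: the overlap must be exact AND the types compatible. */
    int exact = dst.start == src.start && dst.size == src.size;
    return (exact && compatible_types) ? ALLOWED_OVERLAP : UNDEFINED;
}
```

For `x.c = x.i` the destination is `{0, 1}`, the source is `{0, sizeof(int)}` and the types are not compatible, so the model returns `UNDEFINED` whether or not the sizes happen to match.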
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.std.c on Tue Dec 1 03:39:38 2020
    From Newsgroup: comp.std.c

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Ian Abbott <ijabbott63@gmail.com> writes:

    A question (not posted by myself) from
    https://stackoverflow.com/questions/65077630 :

    Consider the following:

    union { int i; char c; } x = {0};
    x.c = x.i;

    Does the assignment x.c = x.i result in undefined behavior?

    C18 6.15.16.1/3 says:

    | If the value being stored in an object is read from another
    | object that overlaps in any way the storage of the first object,
    | then the overlap shall be exact and the two objects shall have
    | qualified or unqualified versions of a compatible type;
    | otherwise, the behavior is undefined.

    The objects x.c and x.i overlap, but have incompatible types, so on
    first glance it appears to be UB.

    On first glance, yes, but I think this passage needs to be updated
    to reflect its intent.

    The RHS of an assignment is not an lvalue. If it starts out as an
    lvalue, then lvalue conversion is applied. Logically the value is
    retrieved *and then* copied into the destination object.

    A simple case like this is not likely to cause problems (in the
    absence of aggressive optimization), but we can construct more
    problematic cases where the object being copied is arbitrarily large
    and the overlap might not be detectable at compile time.

    I've written an answer:
    https://stackoverflow.com/a/65080498/827263

    Let me start with where we agree. I agree that 6.5.16.1 p3 deserves
    some clarification. After that however my reading and yours reach
    different conclusions.

    First I think the passage as written clearly conveys the meaning
    intended in this case, that this assignment is undefined behavior.
    The value being stored was read out of x.i, and is being stored
    into x.c. The types of those two objects don't mesh, and so the
    assignment is undefined behavior. This assignment is about the
    clearest possible case that would violate 6.5.16.1 p3, and the
    passage as written conveys that, IMO with no room for argument.

    Second I think whether the RHS is an lvalue has no bearing on the
    question. The cited paragraph does not mention anything about
    lvalues; it speaks only of "the value being stored". Consider a
    slight variation:

    x.c = (printf( "hello world\n" ), x.i);

    The RHS of this assignment is not an lvalue. But the /value/
    being stored was read from x.i. As I read the Standard this
    assignment too is undefined behavior, and IMO the Standard does
    convey that intention.

    Here is another variation:

    x.c = 0 ? printf( "hello world\n" ) : x.i;

    Once again the value being stored was read from x.i, and again
    as I read the Standard this assignment too is undefined behavior,
    and IMO the Standard does convey that intention.

    Now let's look at an example you give in your stackoverflow answer.
    Quoting a whole paragraph:

    The passage says that the value "is read from another
    object". That's ambiguous. Must the name of the object be the
    entire RHS expression, or can it be just a subexpression?
    If the latter, then x.c = x.i + 1; would have undefined
    behavior, which in my opinion would be absurd.

    You left out an important part: "the value /being stored/".
    Here the value being stored is x.i + 1. That /value/ was not
    read from x.i; it was formed by an addition operation after
    reading x.i. Ostensibly this assignment would not be undefined
    behavior. I say "ostensibly" because actually I think this case
    is not clear, and may have been intended to be undefined behavior
    along with the others. And I don't see anything absurd about
    having it be undefined behavior, or even surprising if it were
    discovered that this case had been meant to be undefined behavior
    all along.

    Disclaimer: I haven't yet done any research into Defect Reports
    or committee meeting notes to see if this question has been
    addressed there. I suspect it has, but I just haven't looked
    yet.
  • From Francis Glassborow@francis.glassborow@btinternet.com to comp.std.c on Sat Dec 5 14:27:50 2020
    From Newsgroup: comp.std.c

    On 01/12/2020 11:39, Tim Rentsch wrote:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Ian Abbott <ijabbott63@gmail.com> writes:

    A question (not posted by myself) from
    https://stackoverflow.com/questions/65077630 :

    Consider the following:

    union { int i; char c; } x = {0};
    x.c = x.i;

    Does the assignment x.c = x.i result in undefined behavior?

    C18 6.15.16.1/3 says:

    | If the value being stored in an object is read from another
    | object that overlaps in any way the storage of the first object,
    | then the overlap shall be exact and the two objects shall have
    | qualified or unqualified versions of a compatible type;
    | otherwise, the behavior is undefined.

    The objects x.c and x.i overlap, but have incompatible types, so on
    first glance it appears to be UB.

    On first glance, yes, but I think this passage needs to be updated
    to reflect its intent.

    The RHS of an assignment is not an lvalue. If it starts out as an
    lvalue, then lvalue conversion is applied. Logically the value is
    retrieved *and then* copied into the destination object.

    A simple case like this is not likely to cause problems (in the
    absence of aggressive optimization), but we can construct more
    problematic cases where the object being copied is arbitrarily large
    and the overlap might not be detectable at compile time.

    I've written an answer:
    https://stackoverflow.com/a/65080498/827263

    Let me start with where we agree. I agree that 6.5.16.1 p3 deserves
    some clarification. After that however my reading and yours reach
    different conclusions.

    First I think the passage as written clearly conveys the meaning
    intended in this case, that this assignment is undefined behavior.
    The value being stored was read out of x.i, and is being stored
    into x.c. The types of those two objects don't mesh, and so the
    assignment is undefined behavior. This assignment is about the
    clearest possible case that would violate 6.5.16.1 p3, and the
    passage as written conveys that, IMO with no room for argument.

    Second I think whether the RHS is an lvalue has no bearing on the
    question. The cited paragraph does not mention anything about
    lvalues; it speaks only of "the value being stored". Consider a
    slight variation:

    x.c = (printf( "hello world\n" ), x.i);

    The RHS of this assignment is not an lvalue. But the /value/
    being stored was read from x.i. As I read the Standard this
    assignment too is undefined behavior, and IMO the Standard does
    convey that intention.

    Here is another variation:

    x.c = 0 ? printf( "hello world\n" ) : x.i;

    Once again the value being stored was read from x.i, and again
    as I read the Standard this assignment too is undefined behavior,
    and IMO the Standard does convey that intention.

    Now let's look at an example you give in your stackoverflow answer.
    Quoting a whole paragraph:

    The passage says that the value "is read from another
    object". That's ambiguous. Must the name of the object be the
    entire RHS expression, or can it be just a subexpression?
    If the latter, then x.c = x.i + 1; would have undefined
    behavior, which in my opinion would be absurd.

    if
    x.c = x.i + 1;

    has defined behaviour then

    x.c = x.i + 0;

    also has defined behaviour.

    So IMO, consistency requires that we tighten the requirement so that

    x.c = any expression that uses x.i;

    has undefined behaviour.

    However I still have reservations because:

    {int const i = x.i; x.c = i;}

    is surely OK. However any optimising compiler will reduce that to

    x.c = x.i;

    Note the braces limit the scope of i.

    Francis


    You left out an important part: "the value /being stored/".
    Here the value being stored is x.i + 1. That /value/ was not
    read from x.i; it was formed by an addition operation after
    reading x.i. Ostensibly this assignment would not be undefined
    behavior. I say "ostensibly" because actually I think this case
    is not clear, and may have been intended to be undefined behavior
    along with the others. And I don't see anything absurd about
    having it be undefined behavior, or even surprising if it were
    discovered that this case had been meant to be undefined behavior
    all along.

    Disclaimer: I haven't yet done any research into Defect Reports
    or committee meeting notes to see if this question has been
    addressed there. I suspect it has, but I just haven't looked
    yet.


  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.std.c on Sat Jul 10 08:42:09 2021
    From Newsgroup: comp.std.c

    Francis Glassborow <francis.glassborow@btinternet.com> writes:

    On 01/12/2020 11:39, Tim Rentsch wrote:

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:

    Ian Abbott <ijabbott63@gmail.com> writes:

    A question (not posted by myself) from
    https://stackoverflow.com/questions/65077630 :

    Consider the following:

    union { int i; char c; } x = {0};
    x.c = x.i;

    Does the assignment x.c = x.i result in undefined behavior?

    C18 6.15.16.1/3 says:

    | If the value being stored in an object is read from another
    | object that overlaps in any way the storage of the first object,
    | then the overlap shall be exact and the two objects shall have
    | qualified or unqualified versions of a compatible type;
    | otherwise, the behavior is undefined.

    The objects x.c and x.i overlap, but have incompatible types, so on
    first glance it appears to be UB.

    On first glance, yes, but I think this passage needs to be updated
    to reflect its intent.

    The RHS of an assignment is not an lvalue. If it starts out as an
    lvalue, then lvalue conversion is applied. Logically the value is
    retrieved *and then* copied into the destination object.

    A simple case like this is not likely to cause problems (in the
    absence of aggressive optimization), but we can construct more
    problematic cases where the object being copied is arbitrarily large
    and the overlap might not be detectable at compile time.

    I've written an answer:
    https://stackoverflow.com/a/65080498/827263

    Let me start with where we agree. I agree that 6.5.16.1 p3 deserves
    some clarification. After that however my reading and yours reach
    different conclusions.

    First I think the passage as written clearly conveys the meaning
    intended in this case, that this assignment is undefined behavior.
    The value being stored was read out of x.i, and is being stored
    into x.c. The types of those two objects don't mesh, and so the
    assignment is undefined behavior. This assignment is about the
    clearest possible case that would violate 6.5.16.1 p3, and the
    passage as written conveys that, IMO with no room for argument.

    Second I think whether the RHS is an lvalue has no bearing on the
    question. The cited paragraph does not mention anything about
    lvalues; it speaks only of "the value being stored". Consider a
    slight variation:

    x.c = (printf( "hello world\n" ), x.i);

    The RHS of this assignment is not an lvalue. But the /value/
    being stored was read from x.i. As I read the Standard this
    assignment too is undefined behavior, and IMO the Standard does
    convey that intention.

    Here is another variation:

    x.c = 0 ? printf( "hello world\n" ) : x.i;

    Once again the value being stored was read from x.i, and again
    as I read the Standard this assignment too is undefined behavior,
    and IMO the Standard does convey that intention.

    (see note given below)

    Now let's look at an example you give in your stackoverflow answer.
    Quoting a whole paragraph:

    The passage says that the value "is read from another
    object". That's ambiguous. Must the name of the object be the
    entire RHS expression, or can it be just a subexpression?
    If the latter, then x.c = x.i + 1; would have undefined
    behavior, which in my opinion would be absurd.

    My apologies for taking so long to respond to your comments.

    if
    x.c = x.i + 1;

    has defined behaviour then

    x.c = x.i + 0;

    also has defined behaviour.

    Yes. It does, and it should.

    So IMO, consistency requires that we tighten the requirement so that

    x.c = any expression that uses x.i;

    has undefined behaviour.

    Surely that is not the intent. It is only when the value being
    stored is _read directly_ from an overlapping object, and is
    not instead _formed as the result of a computation using_ an
    overlapping object, that undefined behavior occurs.

    However I still have reservations because:

    {int const i = x.i; x.c = i;}

    is surely OK. However any optimising compiler will reduce that to

    x.c = x.i;

    Note the braces limit the scope of i.

    What an optimizing compiler might do has no effect on the program's
    semantics, which are defined in terms of an abstract machine where
    no optimizations occur. Each compiler has a responsibility to
    ensure its optimizer faithfully preserves the program semantics
    that the Standard requires, which in this case has defined behavior
    because the value being stored is read out of 'i' and not out of
    'x.i'.
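
The braces-and-temporary form can be packaged as a function. This is a sketch under the reading given above, in which the value being stored is read from the temporary and not from the overlapping member. The tag `int_or_char` and the name `store_via_temp` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* The union from the question, with a hypothetical tag. */
union int_or_char { int i; char c; };

/* Copy x->i into x->c by way of a separate object, so that the object the
   value is read from does not overlap the object being stored into. */
void store_via_temp(union int_or_char *x)
{
    assert(x != NULL);

    const int tmp = x->i;  /* the read of x->i completes here */

    /* The value stored is read from tmp. If tmp does not fit in a signed
       char type, the conversion result is implementation-defined. */
    x->c = (char)tmp;
}
```

After the call only `x->c` should be relied on: storing to one member of a union leaves the bytes belonging solely to the other member with unspecified values.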


    (Note for the "see below" remark: I'm willing to be convinced that
    the example

    x.c = 0 ? printf( "hello world\n" ) : x.i;

    has defined behavior, because the value being stored is the result
    of a ?: operator, and thus not just a direct read out of x.i. This
    argument can be seen more clearly if we consider

    x.c = 0 ? 0ULL : x.i;

    because the value of the RHS clearly is not the same as what is
    read out of x.i, which is an int. Similarly

    x.c = (unsigned long long) x.i;

    has defined behavior, because the value being stored is the result
    of a conversion operation, not the value read from x.i.)
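
The type claims in the note can be checked with C11 `_Generic`, which selects on the type of an expression without evaluating it. This sketch only shows what type each right-hand side has. It does not settle how 6.5.16.1 p3 should be read. The macro `KIND_OF`, the enum and the function names are made up for illustration.

```c
#include <assert.h>

enum kind { KIND_CHAR, KIND_INT, KIND_ULL, KIND_OTHER };

/* Map the type of an expression to an enum value; the expression
   itself is not evaluated. */
#define KIND_OF(e) _Generic((e),            \
    char: KIND_CHAR,                        \
    int: KIND_INT,                          \
    unsigned long long: KIND_ULL,           \
    default: KIND_OTHER)

/* Compile-time check of the central claim: 0 ? 0ULL : (int) has type
   unsigned long long, because both operands are converted to a common type. */
static_assert(KIND_OF(0 ? 0ULL : 0) == KIND_ULL,
              "conditional operator applies the usual arithmetic conversions");

/* Here `i` stands in for x.i from the thread. */
enum kind kind_of_direct_read(int i) { (void)i; return KIND_OF(i); }
enum kind kind_of_comma(int i)       { (void)i; return KIND_OF(((void)0, i)); }
enum kind kind_of_conditional(int i) { (void)i; return KIND_OF(0 ? 0ULL : i); }
enum kind kind_of_cast(int i)        { (void)i; return KIND_OF((unsigned long long)i); }
```

The direct read and the comma expression both have type `int`, while the conditional with `0ULL` and the cast both have type `unsigned long long`, which matches the distinction drawn in the note.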
