• signed vs unsigned and gcc -Wsign-conversion

    From pozz@pozzugno@gmail.com to comp.lang.c on Mon Oct 20 17:03:58 2025
    From Newsgroup: comp.lang.c

    After many years programming in C language, I'm always unsure if it is
    safer to use signed int or unsigned int.

    Of course there are situations where signed or unsigned is clearly
    better. For example, if the values could assume negative values, signed
    int is the only solution. If you are manipulating single bits (&, |, ^,
    <<, >>), unsigned ints are your friends.

    What about other situations? For example, what do you use for the "i"
    loop variable?

    I recently activated gcc -Wsign-conversion option on a codebase and
    received a lot of warnings. I started to fix them, usually by
    adding explicit casts. Is that the way to go, or is it better to
    avoid the warnings from the beginning by choosing the right signed
    or unsigned type?
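
    For example, a typical warning comes from code like this (the names
    here are just illustrative):

        size_t len = strlen(buf);
        int i = get_index();    /* hypothetical helper */
        if (i < len)            /* conversion of 'i' to 'size_t' may
                                   change the sign [-Wsign-conversion] */
            process(buf[i]);
        if ((size_t)i < len)    /* the cast silences the warning, but
                                   is wrong if i can be negative */
            process(buf[i]);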


    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c on Mon Oct 20 17:38:44 2025
    From Newsgroup: comp.lang.c

    On 20.10.2025 17:03, pozz wrote:

    I recently activated gcc -Wsign-conversion option on a codebase and
    received a lot of warnings. I started to fix them, usually by
    adding explicit casts. ...

    As long as this doesn't fix malfunctions it's purely aesthetic.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Michael S@already5chosen@yahoo.com to comp.lang.c on Mon Oct 20 19:43:37 2025
    From Newsgroup: comp.lang.c

    On Mon, 20 Oct 2025 17:03:58 +0200
    pozz <pozzugno@gmail.com> wrote:

    After many years programming in C language, I'm always unsure if it
    is safer to use signed int or unsigned int.

    Of course there are situations where signed or unsigned is clearly
    better. For example, if the values could assume negative values,
    signed int is the only solution. If you are manipulating single bits
    (&, |, ^, <<, >>), unsigned ints are your friends.

    What about other situations? For example, what do you use for the "i"
    loop variable?


    I'd just point out that small negative numbers are FAR more common than
    numbers in range [2**31..2**32-1].
    Now, make your own conclusion.


    I recently activated gcc -Wsign-conversion option on a codebase and
    received a lot of warnings. I started to fix them, usually by
    adding explicit casts. Is that the way to go, or is it better to
    avoid the warnings from the beginning by choosing the right signed
    or unsigned type?



    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c on Mon Oct 20 19:07:07 2025
    From Newsgroup: comp.lang.c

    On 20.10.2025 18:43, Michael S wrote:
    On Mon, 20 Oct 2025 17:03:58 +0200
    pozz <pozzugno@gmail.com> wrote:

    After many years programming in C language, I'm always unsure if it
    is safer to use signed int or unsigned int.

    Of course there are situations where signed or unsigned is clearly
    better. For example, if the values could assume negative values,
    signed int is the only solution. If you are manipulating single bits
    (&, |, ^, <<, >>), unsigned ints are your friends.

    What about other situations? For example, what do you use for the "i"
    loop variable?


    I'd just point out that small negative numbers are FAR more common
    than numbers in range [2**31..2**32-1].

    So use a short instead of an int for a loop counter to make the code
    run faster on a 68000 CPU? ;-)

    Now, make your own conclusion.


    I recently activated gcc -Wsign-conversion option on a codebase and
    received a lot of warnings. I started to fix them, usually by
    adding explicit casts. Is that the way to go, or is it better to
    avoid the warnings from the beginning by choosing the right signed
    or unsigned type?




    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Mon Oct 20 18:01:34 2025
    From Newsgroup: comp.lang.c

    Michael S <already5chosen@yahoo.com> writes:
    On Mon, 20 Oct 2025 17:03:58 +0200
    pozz <pozzugno@gmail.com> wrote:

    After many years programming in C language, I'm always unsure if it
    is safer to use signed int or unsigned int.

    Of course there are situations where signed or unsigned is clearly
    better. For example, if the values could assume negative values,
    signed int is the only solution. If you are manipulating single bits
    (&, |, ^, <<, >>), unsigned ints are your friends.

    What about other situations? For example, what do you use for the "i"
    loop variable?


    I'd just point out that small negative numbers are FAR more common
    than numbers in range [2**31..2**32-1].
    Now, make your own conclusion.

    One might also point out that negative loop indices are rare, and
    thus one's conclusion may be that, generally speaking, loop indexes
    should be unsigned.

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.lang.c on Mon Oct 20 20:03:34 2025
    From Newsgroup: comp.lang.c

    On 20/10/2025 17:03, pozz wrote:
    After many years programming in C language, I'm always unsure if it is
    safer to use signed int or unsigned int.

    Of course there are situations where signed or unsigned is clearly
    better. For example, if the values could assume negative values, signed
    int is the only solution. If you are manipulating single bits (&, |, ^,
    <<, >>), unsigned ints are your friends.

    What about other situations? For example, what do you use for the "i"
    loop variable?

    I recently activated gcc -Wsign-conversion option on a codebase and
    received a lot of warnings. I started to fix them, usually by
    adding explicit casts. Is that the way to go, or is it better to
    avoid the warnings from the beginning by choosing the right signed
    or unsigned type?



    Signed and unsigned types are equally safe. If you are sure you are
    within the ranges you know will work for the types you use, your
    code is safe. If you are not sure, you are unsafe. It doesn't
    matter if an overflow is undefined behaviour leading to a bug, or
    defined but unexpected behaviour leading to a bug. (Of course, if
    you are using the defined wrapping behaviour of unsigned types in a
    way that you know is correct for your program, then that is safe.
    All overflows for signed types are unsafe, as are all unexpected
    overflows of unsigned types.)

    Signed types can be more efficient in some circumstances, as they
    obey a number of useful mathematical rules that can be used for
    optimisation. Unsigned types - of the size of "unsigned int" or
    bigger - obey different mathematical identities that can
    occasionally be useful but often hinder optimisations.

    Beware assumptions about wrapping of unsigned types smaller than
    "unsigned int" - these promote to "int", and arithmetic is then
    done as "int" with UB on overflow, before possibly being converted
    back to the smaller unsigned integer type.

    If your target has bigger registers than "int", sometimes code can
    be surprisingly inefficient for "unsigned int". If you have a
    64-bit target and have an expression like "array[i++];" in a loop,
    it can be significantly less efficient if "i" is "unsigned int",
    because the compiler must assume that "i++" might wrap. If "i" is
    "int", or a 64-bit type (like "size_t" on such a target), there is
    no such issue. (It is not uncommon that "int_fast32_t" or
    "uint_fast32_t" will be faster than plain "int" or "unsigned int",
    because these will be 64-bit on 64-bit targets.)
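
    To make that concrete, a sketch of the kind of loop I mean
    (hypothetical functions):

        #include <stddef.h>

        /* With "i" as "unsigned int" on a 64-bit target, the compiler
           must implement the wrap at 2^32 when forming &array[i],
           which can block strength-reduction of the indexing: */
        void fill_u(int *array, long n)
        {
            unsigned int i = 0;
            for (long k = 0; k < n; k++)
                array[i++] = 0;
        }

        /* With "size_t", the index cannot wrap before the pointer
           arithmetic itself would be out of bounds, so the loop can be
           turned into a simple pointer walk: */
        void fill_z(int *array, long n)
        {
            size_t i = 0;
            for (long k = 0; k < n; k++)
                array[i++] = 0;
        }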

    So very often, the efficient choice of type is "int" or
    "int_fastN_t" for code that might be used on 64-bit platforms.
    Size-explicit types are the best choice if your code has to run on
    smaller platforms, so that you can be sure your types are big
    enough. But if the compiler can determine the range of an unsigned
    variable (such as in a "for" loop where the start and end cases are
    known at compile-time), then unsigned types will be just as
    efficient.

    Comparisons between signed and unsigned types do turn up regularly,
    and sometimes casts are necessary if you want to enable warnings
    about these and can't reasonably pick the signedness for the two
    sides of the comparison. It's a question of style and preference
    whether you want to enable such warnings - most people do not, I
    think.



    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.c on Mon Oct 20 20:09:14 2025
    From Newsgroup: comp.lang.c

    On 2025-10-20, David Brown <david.brown@hesbynett.no> wrote:
    On 20/10/2025 17:03, pozz wrote:
    After many years programming in C language, I'm always unsure if it is
    safer to use signed int or unsigned int.

    Of course there are situations where signed or unsigned is clearly
    better. For example, if the values could assume negative values, signed
    int is the only solution. If you are manipulating single bits (&, |, ^,
    <<, >>), unsigned ints are your friends.

    What about other situations? For example, what do you use for the "i"
    loop variable?

    I recently activated gcc -Wsign-conversion option on a codebase and
    received a lot of warnings. I started to fix them, usually by
    adding explicit casts. Is that the way to go, or is it better to
    avoid the warnings from the beginning by choosing the right signed
    or unsigned type?



    Signed and unsigned types are equally safe. If you are sure you are
    within the ranges you know will work for the types you use, your
    code is safe. If you are not sure, you are unsafe.

    Safe generally means that the language somehow protects from harm, not
    that you protect yourself.

    Correct code operating on correct inputs, using unsafe constructs,
    is still called unsafe code.

    However using unsigned types due to them being safe is often poorly
    considered because if something goes wrong contrary to the programmer's
    intent, there likely will be undefined behavior somewhere.

    E.g. an array underflow using an unsigned index will not produce
    integer overflow undefined behavior, but the access will go out of
    bounds, which is undefined behavior.

    There are bugs which play out without any undefined behavior:
    the program calculates something contrary to its requirements,
    but stays within the confines of the defined language.

    The odds that by using unsigned numbers you will get only that type of
    bug are low, and even if so, it is not a big comfort.

    Signed numbers behave more like mathematical integers, in cases
    when there is no overflow.

    If a, b and c are small, non-negative quantities, you might be tempted
    to make them unsigned. But if you do so, then you can no longer make
    this derivation of inequalities:

        a + b > c

        a > c - b

    Under the unsigned types, we cannot add -b to both sides of the
    inequality, preserving its truth value, even if all the operands
    are tiny numbers that fit into a single decimal digit!

    If b happens to be greater than c, we get a huge value on the right
    side that is now larger than a, not smaller.

    Gratuitous use of unsigned types impairs our ability to use
    algebra to simplify code, due to the "cliff" at zero.
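
    A tiny sketch of that cliff, assuming 32-bit unsigned int:

        unsigned int a = 5, b = 4, c = 2;
        int t1 = (a + b > c);  /* 1, as expected */
        int t2 = (a > c - b);  /* 0: c - b wraps to 4294967294, so the
                                  algebraically equivalent test fails */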

    This is a nuanced topic where there isn't a one-type-fits-all answer,
    but I gravitate toward signed; use of unsigned has to be justified in
    some way.

    When sizes are being calculated and they come from functions or
    operators that produce size_t, then that tends to dictate unsigned.

    If the quantities are large and can possibly overflow, there are
    situations in which unsigned makes that simpler.

    For instance, if a and b are unsigned such that a + b can
    semantically overflow (i.e. the result of the natural addition of
    a + b doesn't fit into the type), it is simpler to detect: you can
    just do the addition, and then test:

        c = a + b;

    when there is no overflow, it must be that (c >= a && c >= b)
    so if either (c < a) or (c < b) is true, it overflowed.

    This is significantly less verbose than a correct overflow test
    for signed addition, which has to avoid doing the actual addition,
    and has to be split into three cases: a and b have opposite
    sign (always okay), a and b are both positive, and a and b are
    both negative.
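
    For comparison, a sketch of the two tests (with the signed check
    folded into two cases; ua/ub and sa/sb are assumed already set, and
    <limits.h> and <stdbool.h> are needed):

        /* unsigned: just add, then test - wraparound is defined: */
        unsigned int uc = ua + ub;
        bool u_overflowed = (uc < ua);

        /* signed: must test before adding, to avoid the UB addition: */
        bool s_would_overflow = (sa >= 0) ? (sb > INT_MAX - sa)
                                          : (sb < INT_MIN - sa);
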
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Mon Oct 20 14:48:40 2025
    From Newsgroup: comp.lang.c

    pozz <pozzugno@gmail.com> writes:
    After many years programming in C language, I'm always unsure if it is
    safer to use signed int or unsigned int.

    Of course there are situations where signed or unsigned is clearly
    better. For example, if the values could assume negative values,
    signed int is the only solution. If you are manipulating single bits
    (&, |, ^, <<, >>), unsigned ints are your friends.

    What about other situations? For example, what do you use for the "i"
    loop variable?

    I usually use int (certainly for iterating over argc/argv), but
    sometimes size_t. size_t is typically the most correct type for
    representing sizes or counts of objects in memory, but int is a
    bit easier to work with.

    Both signed and unsigned types are (usually) used to model subranges
    of the unbounded mathematical integers. If none of your operations
    yields results outside the range of the type you're using, you're
    safe -- but ensuring you don't stray outside that range can be easy
    or difficult. If you're counting no more than a few thousand items,
    int is fine. If you're counting bytes in a file or pennies in the
    national debt, you have to think about just what range of values
    you need to handle.

    The thing about unsigned types is that they have a discontinuity at
    0, which is much easier to run into than signed int's
    discontinuities at INT_MIN and INT_MAX. Subtraction in particular
    can easily yield mathematically incorrect results for unsigned
    types (unless your problem domain actually calls for modular
    arithmetic).
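
    For instance:

        size_t have = 3, want = 5;
        size_t diff = have - want;  /* not -2: wraps to SIZE_MAX - 1 */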

    If you start with a value of type size_t, say from sizeof or
    strlen(), it's probably best to stick with size_t for any derived
    values. My vague impression is that most things that should use
    unsigned types should use size_t (there are of course plenty of
    exceptions).

    I recently activated gcc -Wsign-conversion option on a codebase and
    received a lot of warnings. I started to fix them, usually by
    adding explicit casts. Is that the way to go, or is it better to
    avoid the warnings from the beginning by choosing the right signed
    or unsigned type?

    Here's the description of -Wsign-conversion:

    ‘-Wsign-conversion’
    Warn for implicit conversions that may change the sign of an
    integer value, like assigning a signed integer expression to an
    unsigned integer variable. An explicit cast silences the warning.
    In C, this option is enabled also by ‘-Wconversion’.

    If you're converting between different types, it's often (but by no
    means always) best to pick one type and use it consistently. I'm
    suspicious of most casts; if I need a conversion, I find that C's
    implicit conversions usually do the right thing.
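
    For instance, with "s" a string and "use" a stand-in:

        /* Mixing int and size_t forces a sign-changing conversion in
           the comparison, which -Wsign-conversion flags: */
        for (int i = 0; i < strlen(s); i++)
            use(s[i]);

        /* Using size_t throughout needs no conversion at all: */
        for (size_t i = 0; i < strlen(s); i++)
            use(s[i]);

    (Calling strlen() in the loop condition is wasteful, but it keeps
    the example short.)
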
    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From BGB@cr88192@gmail.com to comp.lang.c on Mon Oct 20 17:44:04 2025
    From Newsgroup: comp.lang.c

    On 10/20/2025 11:43 AM, Michael S wrote:
    On Mon, 20 Oct 2025 17:03:58 +0200
    pozz <pozzugno@gmail.com> wrote:

    After many years programming in C language, I'm always unsure if it
    is safer to use signed int or unsigned int.

    Of course there are situations where signed or unsigned is clearly
    better. For example, if the values could assume negative values,
    signed int is the only solution. If you are manipulating single bits
    (&, |, ^, <<, >>), unsigned ints are your friends.

    What about other situations? For example, what do you use for the "i"
    loop variable?


    I'd just point out that small negative numbers are FAR more common
    than numbers in range [2**31..2**32-1].
    Now, make your own conclusion.


    Yeah, the distribution is lopsided, but I have usually noted that
    for numeric values of n bits, by the time n reaches 9 or 10, one
    becomes more likely to encounter a negative value than a positive
    value that needs more than n bits.

    Whereas, below this point, one is more likely to encounter a
    positive value needing more than n bits than to encounter a
    negative value.

    So:
      Positive values between 0 and 511: very common;
      Negative values:
        less common than values under +512,
        more common than values over 1024.

    There is typically a large cluster of small positive numbers near 0,
    with a very steep falloff as numbers get larger.
    So, for example:
    1 is most common;
    2 is less common than 1;
    3 is less common than 2;
    ...
    Like, where the probability of seeing N is seemingly 1/(N+1).

    Outside of this main cluster, which largely falls to "very little" by
    512, there are a few big spikes up near a few locations:
    n = 2^15 and 2^16 (Best covered by a 17-bit sign-extended value)
    n = 2^31 and 2^32 (Best covered by a 33-bit sign-extended value)
    n = 2^63

    If expressing values as fixed-width binary fields, there is often sort
    of a "no man's land" for values between 34 and 61 bits where one is
    unlikely to find a whole lot of anything.

    Contrast, between 18 and 30 bits, there are still a handful of values
    spread across this range, usually in small counts (so, this space isn't
    really as empty as the gap starting at 34 bits).


    So, say, it is not all that useful to be able to represent a value
    larger than 33 bits without going all the way to 64.

    And, at this upper end, most of what one encounters tends to be things
    like double-precision values and EIGHTCC style values.


    And, statistically speaking, int32 is likely to hold the vast majority
    of integer values one is likely to encounter.


    A lot is likely to depend on what one is looking at (this is mostly for
    a distribution of literal values in my compiler stats).


    Ironically, because of the distribution, having things like some
    CPU instructions with only 5 or 6 bit fields for integer immediate
    values isn't totally useless.



    I recently activated gcc -Wsign-conversion option on a codebase and
    received a lot of warnings. I started to fix them, usually by
    adding explicit casts. Is that the way to go, or is it better to
    avoid the warnings from the beginning by choosing the right signed
    or unsigned type?




    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.lang.c on Mon Oct 20 23:35:35 2025
    From Newsgroup: comp.lang.c

    On Mon, 20 Oct 2025 17:03:58 +0200, pozz wrote:

    What about other situations? For example, what do you use for the "i"
    loop variable?

    I use unsigned integers if negative values are not involved, where the
    extra positive values might be useful. Here’s an example of the kind
    of loop I might write:

    unsigned int i;
    bool found;
    for (i = len(s);;)
      {
        if (i == 0)
          {
            found = false;
            break;
          } /*if*/
        if (matches(s[i]))
          {
            found = true;
            break;
          } /*if*/
        --i;
      } /*for*/
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.lang.c on Mon Oct 20 23:36:40 2025
    From Newsgroup: comp.lang.c

    On Mon, 20 Oct 2025 19:43:37 +0300, Michael S wrote:

    I'd just point out that small negative numbers are FAR more common
    than numbers in range [2**31..2**32-1].

    Perhaps you mean *large* negative numbers?

    -1 is a larger number than -1000000.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.lang.c on Mon Oct 20 23:38:08 2025
    From Newsgroup: comp.lang.c

    On Mon, 20 Oct 2025 23:35:35 -0000 (UTC), I wrote:

    for (i = len(s);;)
      {
        ...
      } /*for*/

    Make that

    for (i = len(s);;)
      {
        if (i == 0)
          {
            found = false;
            break;
          } /*if*/
        --i;
        if (matches(s[i]))
          {
            found = true;
            break;
          } /*if*/
      } /*for*/
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.c on Mon Oct 20 23:52:43 2025
    From Newsgroup: comp.lang.c

    On 2025-10-20, Lawrence D’Oliveiro <ldo@nz.invalid> wrote:
    On Mon, 20 Oct 2025 19:43:37 +0300, Michael S wrote:

    I'd just point out that small negative numbers are FAR more common than
    numbers in range [2**31..2**32-1].

    Perhaps you mean *large* negative numbers?

    Oh look, twit is out of his depth again.

    There is definitely usage of "large" and "small" which implies
    magnitude: the more bits are required to encode the number, the larger
    it is.

    This semantics is particularly emphasized/clarified when followed by the
    word "negative". A "small negative" number is one closer to zero than a
    "large negative".

    If your bank account is -100,000.00, you have a large debt
    compared to someone with -100.

    There is little dispute that -100 is "greater than" -100,000.
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Mon Oct 20 16:58:52 2025
    From Newsgroup: comp.lang.c

    Lawrence D’Oliveiro <ldo@nz.invalid> writes:
    On Mon, 20 Oct 2025 19:43:37 +0300, Michael S wrote:

    I'd just point out that small negative numbers are FAR more common than
    numbers in range [2**31..2**32-1].

    Perhaps you mean *large* negative numbers?

    -1 is a larger number than -1000000.

    He clearly meant larger in magnitude. -1 is *greater* than -1000000,
    but smaller in magnitude -- and "-1" is clearly smaller/shorter
    than "-1000000".
    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Mon Oct 20 17:13:03 2025
    From Newsgroup: comp.lang.c

    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    [...]
    The thing about unsigned types is that they have a discontinuity at
    0, which is much easier to run into than signed int's
    discontinuities at INT_MIN and INT_MAX. Subtraction in particular
    can easily yield mathematically incorrect results for unsigned
    types (unless your problem domain actually calls for modular
    arithmetic).

    One specific footgun enabled by unsigned types involves loops that
    count down to zero. This:

        for (int i = N; i >= 0; i--) {
            // ...
        }

    is well behaved, but this:

        for (size_t i = N; i >= 0; i--) {
            // ...
        }

    is an infinite loop. A compiler might warn that `i >= 0` is always
    true. You can work around this by checking the condition inside
    the body of the loop, before the decrement that causes a wraparound:

        for (size_t i = N; /* i >= 0 */; i--) {
            // ...
            if (i == 0) break;
        }

    But if your loop counts up, this isn't an issue.

    "You too may be a big hero
    Once you've learned to count backwards to zero."
    -- Tom Lehrer, "Wernher Von Braun"
    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From rbowman@bowman@montana.com to comp.lang.c on Tue Oct 21 01:43:16 2025
    From Newsgroup: comp.lang.c

    On Mon, 20 Oct 2025 20:09:14 -0000 (UTC), Kaz Kylheku wrote:

    This is a nuanced topic where there isn't a one-type-fits-all answer,
    but I gravitate toward signed; use of unsigned has to be justified in
    some way.

    It's more an illustration of legacy designs that didn't stand up
    well, but a short was originally used in our code for object
    numbers. Gotta save bytes. Who ever thought there would be more
    than 32k objects?

    Changing it to unsigned short bought us time. Going to an int would have
    had repercussions because of those bytes a diligent programmer saved back
    in the '90s.

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.c on Tue Oct 21 01:45:34 2025
    From Newsgroup: comp.lang.c

    On 2025-10-21, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    [...]
    The thing about unsigned types is that they have a discontinuity at
    0, which is much easier to run into than signed int's
    discontinuities at INT_MIN and INT_MAX. Subtraction in particular
    can easily yield mathematically incorrect results for unsigned
    types (unless your problem domain actually calls for modular
    arithmetic).

    One specific footgun enabled by unsigned types involves loops that
    count down to zero. This:

        for (int i = N; i >= 0; i--) {
            // ...
        }

    is well behaved, but this:

        for (size_t i = N; i >= 0; i--) {
            // ...
        }


    We just have to translate the signed "i >= 0" test into unsigned.

    One way is to directly translate what the two's complement
    semantics is doing, pretending that the high bit of the value is a
    sign bit:

        // if the two's-complement-like "sign bit" is zero ...

        (i & ~(SIZE_MAX >> 1)) == 0

    In a downward counting loop, we can just stop when we wrap around
    to the highest value, so we get to use most of the range:

    for (size_t i = N; i != SIZE_MAX; --i) // or (size_t) -1

    (Note: I like to write --i when it's downward, just as a style; it
    comes from stacks: stack[i++] = push; pop = stack[--i].)

    The troublesome case is when N needs to start at SIZE_MAX!

    But that troublesome case exists when counting upward also,
    signed or unsigned.

    Signed:

        // We must break the loop before the undefined i++:

        for (int i = 0; i <= INT_MAX; i++)

    Unsigned:

        // Need a bottom-of-loop break on SIZE_MAX or else an infinite
        // loop:

        for (size_t i = 0; i <= SIZE_MAX; i++)

    This is where BartC will chime in with how languages benefit from
    built-in idioms for ranged loops that solve these problems under the
    hood. It's a valid argument.
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Tue Oct 21 04:27:11 2025
    From Newsgroup: comp.lang.c

    On 20.10.2025 20:01, Scott Lurndal wrote:
    Michael S <already5chosen@yahoo.com> writes:
    [...]

    One might also point out that negative loop indices are rare, and
    thus one's conclusion may be that, generally speaking, loop indexes
    should be unsigned.

    Just note that loop indices typically run over arrays. And while
    okay in a (more common) ascending traversal, it may become
    error-prone in a descending array-traversal loop:

        uint i; for (i = N-1; i >= 0; i--) a[i];

    Janis

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From antispam@antispam@fricas.org (Waldek Hebisch) to comp.lang.c on Tue Oct 21 03:52:40 2025
    From Newsgroup: comp.lang.c

    Kaz Kylheku <643-408-1753@kylheku.com> wrote:
    On 2025-10-21, Keith Thompson <Keith.S.Thompson+u@gmail.com> wrote:
    Keith Thompson <Keith.S.Thompson+u@gmail.com> writes:
    [...]
    The thing about unsigned types is that they have a discontinuity at
    0, which is much easier to run into than signed int's
    discontinuities at INT_MIN and INT_MAX. Subtraction in particular
    can easily yield mathematically incorrect results for unsigned
    types (unless your problem domain actually calls for modular
    arithmetic).

    One specific footgun enabled by unsigned types involves loops that
    count down to zero. This:

        for (int i = N; i >= 0; i--) {
            // ...
        }

    is well behaved, but this:

        for (size_t i = N; i >= 0; i--) {
            // ...
        }


    We just have to translate the signed "i >= 0" test into unsigned.

    One way is to directly translate what the two's complement
    semantics is doing, pretending that the high bit of the value is a
    sign bit:

        // if the two's-complement-like "sign bit" is zero ...

        (i & ~(SIZE_MAX >> 1)) == 0

    In a downward counting loop, we can just stop when we wrap around
    to the highest value, so we get to use most of the range:

    for (size_t i = N; i != SIZE_MAX; --i) // or (size_t) -1

    (Note: I like to write --i when it's downward, just as a style; it
    comes from stacks: stack[i++] = push; pop = stack[--i].)

    The troublesome case is when N needs to start at SIZE_MAX!

    But that troublesome case exists when counting upward also,
    signed or unsigned.

    Signed:

        // We must break the loop before the undefined i++:

        for (int i = 0; i <= INT_MAX; i++)

    Unsigned:

        // Need a bottom-of-loop break on SIZE_MAX or else an infinite
        // loop:

        for (size_t i = 0; i <= SIZE_MAX; i++)

    This is where BartC will chime in with how languages benefit from
    built-in idioms for ranged loops that solve these problems under the
    hood. It's a valid argument.

    If you have variable upper and lower bounds which may cover the
    whole range of the type, then AFAIK on normal machine architectures
    there is a significant loss of efficiency. C gives you loops
    which are always efficient, but do not cover corner cases.
    Other languages may give you low efficiency in cases where the
    programmer thinks that the loop is optimal. Now, most of the time
    it is better to aim at nicer semantics, possibly making code less
    efficient. But C was born to allow "hand optimization", that
    is, writing efficient programs even if this means that the
    programmer must be more careful and spend more work writing a
    program. I think that this is still an important feature of C.
    --
    Waldek Hebisch
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From antispam@antispam@fricas.org (Waldek Hebisch) to comp.lang.c on Tue Oct 21 04:42:03 2025
    From Newsgroup: comp.lang.c

    pozz <pozzugno@gmail.com> wrote:
    After many years programming in C language, I'm always unsure if it is
    safer to use signed int or unsigned int.

    Of course there are situations where signed or unsigned is clearly
    better. For example, if the values could assume negative values, signed
    int is the only solution. If you are manipulating single bits (&, |, ^,
    <<, >>), unsigned ints are your friends.

    What about other situations? For example, what do you use for the "i"
    loop variable?

    I recently activated gcc -Wsign-conversion option on a codebase and
    received a lot of warnings. I started to fix them, usually by
    adding explicit casts. Is that the way to go, or is it better to
    avoid the warnings from the beginning by choosing the right signed
    or unsigned type?

    I oscillated between various uses, but for PC programming I now
    have a strong preference for signed, with unsigned used when there
    are special reasons. Basically, as long as you stay within range,
    signed agrees with the mathematical integers, which is normally the
    wanted semantics. Given the availability of relatively cheap 64-bit
    integers, cases where you need to worry about going out of range
    tend to be rather special. Similarly, cases where you want
    wraparound are also rather special.

    If you are going to "fix" warnings by adding casts agreeing with
    the default conversions, then IMO it makes little sense. Turning
    off the warning (possibly by using a pragma if the warning is
    imposed on you by build machinery) is equally effective. You may
    sometimes need casts which are different than the default
    conversions; those make sense. But IIUC that is basically when you
    want to convert an unsigned type to a bigger unsigned type. IME
    promoting to signed usually works fine, so casts of this sort
    should be rare.

    BTW: I tried this warning on a piece of code which intensively
    used unsigned types (for things like device registers, etc.).
    It produced a bunch of warnings about changed values, but all were
    false positives: the change was intended.

    I would expect that in well-written code the need for conversions
    different than the default promotions is relatively rare. It makes
    some sense to turn on the warnings, inspect all cases and fix the
    ones which are wrong. But having the warnings on and writing
    a lot of casts means that you lose the benefits of the warnings:
    if you make a mistake writing code with casts, the warnings will
    not help you find the mistake.
    --
    Waldek Hebisch
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.lang.c on Tue Oct 21 09:13:38 2025
    From Newsgroup: comp.lang.c

    On 20/10/2025 20:01, Scott Lurndal wrote:
    Michael S <already5chosen@yahoo.com> writes:
    On Mon, 20 Oct 2025 17:03:58 +0200
    pozz <pozzugno@gmail.com> wrote:

    After many years programming in C language, I'm always unsure if it
    is safer to use signed int or unsigned int.

    Of course there are situations where signed or unsigned is clearly
    better. For example, if the values could assume negative values,
    signed int is the only solution. If you are manipulating single bits
    (&, |, ^, <<, >>), unsigned ints are your friends.

    What about other situations? For example, what do you use for the "i"
    loop variable?


    I'd just point out that small negative numbers are FAR more common than
    numbers in range [2**31..2**32-1].
    Now, make your own conclusion.

    One might also point out that negative loop indicies are rare, and
    thus ones conclusion may be that generally speaking loop indexes should
    be unsigned.


    Loop indices greater than 2^31 are equally rare. (Loops of between
    2^15 and 2^16 - 1 iterations on 8-bit and 16-bit targets are less
    unrealistic.)

    Loops where you actually want the index counter to wrap are very rare,
    except perhaps when your loop is shifting the index each count (and then
    you are firmly in unsigned type territory).

    So in general, you are dealing with numbers that will fit comfortably
    within the ranges of both "int" and "unsigned int". If there is no
    other deciding factor, pick the one with the fewest unnecessary
    additional specifications, since that gives the compiler the most
    flexibility - "int".

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Bonita Montero@Bonita.Montero@gmail.com to comp.lang.c on Tue Oct 21 09:57:45 2025
    From Newsgroup: comp.lang.c

    On 21.10.2025 01:38, Lawrence D’Oliveiro wrote:
    On Mon, 20 Oct 2025 23:35:35 -0000 (UTC), I wrote:

    for (i = len(s);;)
      {
        ...
      } /*for*/

    Make that

    for (i = len(s);;)
      {
        if (i == 0)
          {
            found = false;
            break;
          } /*if*/
        --i;
        if (matches(s[i]))
          {
            found = true;
            break;
          } /*if*/
      } /*for*/

    Your coding style doesn't match any common style and is really sick.
    No wonder you chose C.

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.lang.c on Tue Oct 21 12:42:20 2025
    From Newsgroup: comp.lang.c

    On 20/10/2025 22:09, Kaz Kylheku wrote:
    On 2025-10-20, David Brown <david.brown@hesbynett.no> wrote:
    On 20/10/2025 17:03, pozz wrote:
    After many years programming in C language, I'm always unsure if it is
    safer to use signed int or unsigned int.

    Of course there are situations where signed or unsigned is clearly
    better. For example, if the values could assume negative values, signed
    int is the only solution. If you are manipulating single bits (&, |, ^,
    <<, >>), unsigned ints are your friends.

    What about other situations? For example, what do you use for the "i"
    loop variable?

    I recently activated gcc -Wsign-conversion option on a codebase and
    received a lot of warnings. I started to fix them, usually by
    adding explicit casts. Is that the way to go, or is it better to
    avoid the warnings from the beginning by choosing the right signed
    or unsigned type?



    Signed and unsigned types are equally safe. If you are sure you are
    within the ranges you know will work for the types you use, your code is
    safe. If you are not sure, you are unsafe.

    Safe generally means that the language somehow protects from harm, not
    that you protect yourself.

    No - "safe" means lower risk of harm, at least in /my/ book. It doesn't matter if it is something /you/ do, or something the language does, or something the tools do. (Ideally, of course, you want these all working together.)


    Correct code operating on correct inputs, using unsafe constructs,
    is still called unsafe code.

    It will be called "unsafe code" by Rust salesmen, but not by
    software developers who work on safe code. "Safe code" is code
    used safely; it is not an inherent property of code constructs,
    types, or languages. All code constructs are unsafe if used
    incorrectly, while clear and well-understood code constructs are
    safe if used correctly. (Of course some languages, tools, and
    programming practices make it easier to write safe code, or harder
    to write unsafe code, or easier to tell the difference.)


    However using unsigned types due to them being safe is often
    poorly considered because if something goes wrong contrary to the
    programmer's intent, there likely will be undefined behavior
    somewhere.

    Exactly. Unsigned types are not somehow "safer" than signed types,
    just because signed types have UB on overflow. Don't overflow your
    signed types, then you have no UB. And if you overflow your
    unsigned types without that being an intentional and understood
    part of your code, you will at the very least get unexpected
    behaviour - a bug - and just like UB, there are no limits to how
    bad that can get.


    E.g. an array underflow using an unsigned index will not produce
    integer overlow undefined behavior, but the access will go out of
    bounds, which is undefined behavior.


    Yes - bugs of all sorts often lead to UB sooner or later, even if
    the behaviour of the code is defined by the C language standards up
    to that point.

    There are bugs which play out without any undefined behavior:
    the program calculates something contrary to its requirements,
    but stays within the confines of the defined language.

    The odds that by using unsigned numbers you will get only that type of
    bug are low, and even if so, it is not a big comfort.

    Signed numbers behave more like mathematical integers, in cases
    when there is no overflow.

    If a, b and c are small, non-negative quantities, you might be tempted
    to make them unsigned. But if you do so, then you can no longer make
    this derivation of inequalities:

        a + b > c

        a > c - b

    Under the unsigned types, we cannot add -b to both sides of the
    inequality, preserving its truth value, even if all the operands
    are tiny numbers that fit into a single decimal digit!

    If b happens to be greater than c, we get a huge value on the right
    side that is now larger than a, not smaller.

    Gratuitous use of unsigned types impairs our ability to use
    algebra to simplify code, due to the "cliff" at zero.


    Yes.

    This is a nuanced topic where there isn't a one-type-fits-all answer,
    but I gravitate toward signed; use of unsigned has to be justified in
    some way.

    When sizes are being calculated and they come from functions or
    operators that produce size_t, then that tends to dictate unsigned.

    If the quantities are large and can possibly overflow, there are
    situations in which unsigned makes that simpler.


    But normally, use of a bigger integer type makes the code
    significantly simpler and easier to get correct - and often more
    efficient.

    For instance, if a and b are unsigned such that a + b can
    semantically overflow (i.e. the result of the natural addition of
    a + b doesn't fit into the type), it is simpler to detect: you can
    just do the addition, and then test:

    c = a + b;

    when there is no overflow, it must be that (c >= a && c >= b)
    so if either (c < a) or (c < b) is true, it overflowed.


    Or you use a bigger type and check simply and clearly for a result
    that is too big for your needs. Far too often, programmers go
    through reasoning like this and figure out what they see as
    "optimal" source code, then leave it in the source with no
    explanation as to what is going on. Aim to write code that does
    what it looks like it does - such as adding the two values
    correctly, giving the mathematically correct result, then checking
    the range. Otherwise, good luck to the maintainer that changes the
    expression to "c = a + b + 1;".
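
    A sketch of what I mean, for the common case of 32-bit "int"
    operands a and b with 64-bit "long long":

        long long sum = (long long)a + b;   /* mathematically exact */
        if (sum < INT_MIN || sum > INT_MAX) {
            /* handle the out-of-range result */
        } else {
            int c = (int)sum;
            /* use c */
        }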

    Or, with C23, use ckd_add(). (Many compilers have extensions with
    the same effect, like __builtin_add_overflow, if you are happy
    using them.)
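
    A minimal sketch of the C23 interface:

        #include <stdckdint.h>

        int c;
        if (ckd_add(&c, a, b)) {
            /* the mathematical sum did not fit in c */
        }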

    This is significantly less verbose than a correct overflow test
    for signed addition, which has to avoid doing the actual addition,
    and has to be split into three cases: a and b have opposite
    sign (always okay), a and b are both positive, and a and b are
    both negative.


    Sure. But it is still significantly worse than using "long long int"
    (or "int_least64_t" if you prefer), or using ckd_add().

    There are things in C23 that are somewhat controversial, but I
    think the checked integer operations are clearly a good
    standardisation of existing compiler-specific practice.

    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From James Kuyper@jameskuyper@alumni.caltech.edu to comp.lang.c on Tue Oct 21 14:44:46 2025
    From Newsgroup: comp.lang.c

    On 2025-10-21 06:42, David Brown wrote:
    On 20/10/2025 22:09, Kaz Kylheku wrote:
    On 2025-10-20, David Brown <david.brown@hesbynett.no> wrote:
    On 20/10/2025 17:03, pozz wrote:
    However using unsigned types due to them being safe is often poorly
    considered because if something goes wrong contrary to the programmer's
    intent, there likely will be undefined behavior somewhere.

    Exactly. Unsigned types are not somehow "safer" than signed types,
    just because signed types have UB on overflow. Don't overflow your
    signed types, then you have no UB. And if you overflow your
    unsigned types without that being an intentional and understood
    part of your code, you will at the very least get unexpected
    behaviour - a bug - and just like UB, there are no limits to how
    bad that can get.

    No, there are limits on unexpected behavior: being unexpected, you
    might not know what they are, but it is still the case that the
    behavior starts out as nothing more than an expression with an
    unexpected but valid value. That's pretty bad, and your code might
    make it worse, for example by promoting the unexpected value into
    undefined behavior. However, unless and until it actually does so,
    the behavior is somewhat more restricted than UB.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From Lawrence D’Oliveiro@ldo@nz.invalid to comp.lang.c on Tue Oct 21 19:45:22 2025
    From Newsgroup: comp.lang.c

    On Tue, 21 Oct 2025 09:57:45 +0200, Bonita Montero wrote:

    On 21.10.2025 01:38, Lawrence D’Oliveiro wrote:

    for (i = len(s);;)
      {
        if (i == 0)
          {
            found = false;
            break;
          } /*if*/
        --i;
        if (matches(s[i]))
          {
            found = true;
            break;
          } /*if*/
      } /*for*/

    Your coding style doesn't match any common style ...

    You might want to write

        for (i = len(s) - 1; i >= 0; --i)

    wouldn’t you?

    It’s all about the correctness of the code.
    --- Synchronet 3.21a-Linux NewsLink 1.2
  • From David Brown@david.brown@hesbynett.no to comp.lang.c on Tue Oct 21 22:56:58 2025
    From Newsgroup: comp.lang.c

    On 21/10/2025 20:44, James Kuyper wrote:
    On 2025-10-21 06:42, David Brown wrote:
    On 20/10/2025 22:09, Kaz Kylheku wrote:
    On 2025-10-20, David Brown <david.brown@hesbynett.no> wrote:
    On 20/10/2025 17:03, pozz wrote:
    However using unsigned types due to them being safe is often poorly
    considered because if something goes wrong contrary to the programmer's
    intent, there likely will be undefined behavior somewhere.

    Exactly. Unsigned types are not somehow "safer" than signed types, just
    because signed types have UB on overflow. Don't overflow your signed
    types, then you have no UB. And if you overflow your unsigned types
    without that being an intentional and understood part of your code, you
    will at the very least get unexpected behaviour - a bug - and just like
    UB, there are no limits to how bad that can get.

    No, there are limits on unexpected behavior: being unexpected, you
    might not know what they are, but it is still the case that the
    behavior starts out as nothing more than an expression with an
    unexpected but valid value. That's pretty bad, and your code might
    make it worse, for example by promoting the unexpected value into
    undefined behavior. However, unless and until it actually does so,
    the behavior is somewhat more restricted than UB.

    The effect of "unexpected behaviour" - something that has
    well-defined behaviour according to the C standard or the
    implementation, but was not what the programmer had intended or
    expected - is clear at the point it happens. Your unsigned
    arithmetic overflows in a defined and specified manner. But the
    knock-on effects are, in general, unpredictable - there are no
    specific limits for how bad things can get. It is not unlikely
    that you'll end up with "real" UB. In theory, real UB can lead to
    launching of nasal daemons, while unexpected behaviour, if it does
    not lead to real UB, cannot launch nasal daemons unless you have
    nasal daemon launch procedures in your program. In practice, real
    UB can more often lead to a quick crash and perhaps "nicer" bad
    behaviour (via OS memory protections and the like), while the
    unexpected behaviour can continue on, quietly causing future havoc
    and problems that are harder to find and debug. Either way, I
    think we can agree that bad things can happen!

    --- Synchronet 3.21a-Linux NewsLink 1.2