• Re: C Plagiarism

    From David Brown@david.brown@hesbynett.no to comp.lang.misc on Mon Nov 21 16:30:27 2022
    From Newsgroup: comp.lang.misc

    On 19/11/2022 17:01, Bart wrote:

    On 16/11/2022 16:50, David Brown wrote:
    Yes, but for you, a "must-have" list for a programming language would be mainly "must be roughly like ancient style C in functionality, but with enough change in syntax and appearance so that no one will think it is C".  If that's what you like, and what pays for your daily bread, then that's absolutely fine.

    On 18/11/2022 07:12, David Brown wrote:
    Yes, it is a lot like C.  It has a number of changes, some that I think are good, some that I think are bad, but basically it is mostly like C.

    The above remarks imply strongly that my systems language is a rip-off
    of C.


    No, it does not. You can infer what you want from what I write, but I
    don't see any such implications from my remark. If anyone were to write
    a (relatively) simple structured language for low level work, suitable
    for "direct" compilation to assembly on a reasonable selection of common general-purpose processors, and with the aim of giving a "portable
    alternative to writing in assembly", then the result will inevitably
    have a good deal in common with C. There can be plenty of differences
    in the syntax and details, but the "ethos" or "flavour" of the language
    will be similar.

    Note that I have referred to Pascal as C-like in this sense.

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Mon Nov 21 18:56:02 2022
    From Newsgroup: comp.lang.misc

    On 19/11/2022 22:49, Bart wrote:
    On 19/11/2022 21:02, James Harris wrote:
    On 19/11/2022 20:30, Bart wrote:
    On 19/11/2022 20:17, James Harris wrote:


    I try to keep my main influences to hardware and various assembly
    languages I've used over the years. But even though we try not to be
    influenced by C I don't think any of us can help it. Two reasons: C
    became the base for so many languages which came after it, and C so
    well fits the underlying machine.

    I even suspect that the CPUs we use today are also as they are in
    part due to C. It has been that influential.

    C is /massively/ influential to the general purpose CPUs we have today.
    The prime requirement for almost any CPU design is that you should be
    able to use it efficiently for C. After all, the great majority of
    software is written in languages that, at their core, are similar to C
    (in the sense that once the compiler front-end has finished with them,
    you have variables, imperative functions, pointers, objects in memory,
    etc., much like C). Those languages that are significantly different
    rely on run-times and libraries that are written in C.


    Well, there's a lot of C code around that needs to be kept working.

    Yes.


    However, what aspects of today's processors do you think owe anything
    to C?

    Things like the 8-bit byte, 2's complement, and the lack of segmentation.

    Really? C was pretty much the only language in the world that does not specify the size of a byte. (It doesn't even have a 'byte' type.)


    8-bit byte and two's complement were, I think, inevitable regardless of
    C. But while the C standard does not require them, their popularity has
    grown along with C.

    And it's a language that, even now (until C23) DOESN'T stipulate that integers use two's complement.

    As for segmentation, or lack of, that was very common across machines.


    There are plenty of architectures that did not have linear addressing,
    and there are many advantages of not allowing memory to be viewed and
    accessed as one continuous address space (primarily, it can make buffer overruns and out of bounds accesses almost impossible). C's model does
    not /require/ a simple linear memory space, but such a setup makes C far easier.

    It is really nothing at all to do with C. (How would it have influenced
    that anyway, given that C implementations were adept at dealing with any memory model?)


    C implementations are /not/ good at dealing with non-linear memory, and
    lots of C software assumes memory is linear (and also that bytes are
    8-bit, and integers are two's complement). Having the C standard
    /allow/ more varied systems does not imply that other systems are good
    for C.

    But of course C was not the only influence on processor evolution.



    The progression from 8 to 16 to 32 to 64 bits and beyond has long
    been on the cards, irrespective of languages.

    Actually C is lagging behind since most implementations are stuck
    with a 32-bit int type. Which means lots of software, for those
    lazily using 'int' everywhere, will perpetuate the limitations of
    that type.

    C famously also doesn't like to pin down its types. It doesn't even
    have a `byte` type, and its `char` type, apart from not having a
    specified signedness, could have any width of 8 bits or more.

    Pre C99 yes. But AIUI since C99 C has had very precise types such as

       int64_t

    I'm sure the byte type, its size and byte-addressability, was influenced more by IBM, such as with its 360 mainframes from the 1960s
    BC (Before C). The first byte-addressed machine I used was a 360-clone.

    In any case, I would dispute that C even now properly has fixed-width
    types. First, you need to do this to enable them:

    Dispute all you want - it does not change a thing.


        #include <stdint.h>

    Otherwise it knows nothing about them. Second, if you look inside a
    typical stdint.h file (this one is from gcc/TDM on Windows), you might
    well see:

        typedef signed char int8_t;
        typedef unsigned char uint8_t;

    Nothing here guarantees that int8_t will be an 8-bit type; these 'exact-width' types are defined on top of those loosely-defined types. They're an illusion.


    Sorry, you are completely wrong here. Feel free to look it up in the C standards if you don't believe me.
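
    (The relevant guarantees can even be checked mechanically - a minimal
    sketch assuming a C11 compiler for _Static_assert; every assertion
    below has to hold on any implementation that provides int8_t at all:)

        #include <limits.h>
        #include <stdint.h>

        /* The standard requires that if int8_t exists, it is exactly
           8 bits wide, has no padding bits, and uses two's complement. */
        _Static_assert(CHAR_BIT == 8, "int8_t implies 8-bit bytes");
        _Static_assert(sizeof(int8_t) == 1, "exactly one 8-bit byte");
        _Static_assert(INT8_MIN == -128 && INT8_MAX == 127,
                       "full two's complement range, no padding");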


    One of the biggest influences C had on processor design was the idea of
    a single stack for return addresses and data, with stack pointer +
    offset and frame pointer + offset addressing. C is not the only
    language that works well with that setup, but it can't really take any
    kind of advantage of more advanced setups with multiple stacks or linked
    stack frames. Languages that have local functions, such as Pascal or
    Ada, could benefit from more sophisticated stack models. Better stack
    models on processors would also greatly reduce the risk of stack
    overflows, corruption (intentionally or unintentionally) of return
    addresses on stacks, and other bugs in software.
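
    (A minimal illustration of that single-stack model, assuming a typical
    unoptimised x86-64 compilation - the offsets in the comments are
    plausible examples, not something any particular compiler must emit:)

        /* With e.g. gcc -O0 on x86-64, the parameter and both locals end
           up at small negative offsets from the frame pointer - say n at
           [rbp-20], total at [rbp-8], i at [rbp-4] - while the return
           address sits just above the saved rbp on the very same stack. */
        int sum_to(int n)
        {
            int total = 0;
            for (int i = 1; i <= n; i++)
                total += i;
            return total;
        }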

    However, any kind of guesses as to how processors would have looked
    without C, and therefore what influence C /really/ had, are always going
    to be speculative.

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Mon Nov 21 18:44:05 2022
    From Newsgroup: comp.lang.misc

    On 21/11/2022 17:56, David Brown wrote:
    On 19/11/2022 22:49, Bart wrote:

    I even suspect that the CPUs we use today are also as they are in
    part due to C. It has been that influential.

    C is /massively/ influential to the general purpose CPUs we have today.

    "Massively" influential? Why, how do you think CPUs would have ended up without C?

    Two of the first machines I used were PDP10 and PDP11, developed by DEC
    in the 1960s, both using linear memory spaces. While the former was word-based, the PDP11 was byte-addressable, just like the IBM 360 also
    from the 1960s.

    The early microprocessors I used (6800, Z80) also had a linear memory
    space, at a time when it was unlikely C implementations existed for
    them, or that people even thought that much about C outside of Unix.

     The prime requirement for almost any CPU design is that you should be able to use it efficiently for C.

    And not Assembly, or Fortran or any other language? Don't forget that at
    the point it all began to change, mid-70s to mid-80, C wasn't that
    dominant. Any C implementations for microprocessors were incredibly slow
    and produced indifferent code.

    The OSes I used (for PDP10, PDP11, ICL 4/72, Z80) had no C involvement.
    When x86 popularised segment memory, EVERYBODY hated it, and EVERY
    language had a problem with it.

    The REASON for segmented memory was because 16 bits and address spaces
    larger than 64K words didn't mix. When this was eventually fixed on
    80386 for x86, that was able to use 32-bit registers.

    According to you, without C, we would have been using 64KB segments even
    with 32 bit registers, or we maybe would never have got to 32 bits at
    all. What nonsense!

    (I was designing paper CPUs with linear addressing long before then,
    probably like lots of people.)


    After all, the great majority of
    software is written in languages that, at their core, are similar to C
    (in the sense that once the compiler front-end has finished with them,
    you have variables, imperative functions, pointers, objects in memory,
    etc., much like C).

    I wish people would just accept that C does not have and never has had a monopoly on lower level languages.

    It's a shame that people now associate 'close-to-the-metal' programming
    with a language where a function pointer type is written as
    `void(*)(void)`, and that's in the simplest case.
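
    (For anyone who hasn't met that syntax, a small sketch with made-up
    names, alongside the usual typedef workaround:)

        /* A variable holding a pointer to a function that takes and
           returns nothing, written out in full: */
        void (*handler)(void);

        /* The same thing via a typedef, which is how most C code tames
           the declarator syntax: */
        typedef void (*callback_fn)(void);
        callback_fn handler2;

        /* And it gets worse quickly: a function returning such a pointer. */
        void (*get_handler(int id))(void);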



    Really? C was pretty much the only language in the world that does not
    specify the size of a byte. (It doesn't even have a 'byte' type.)


    8-bit byte and two's complement were, I think, inevitable regardless of
    C.

    So were lots of things. It didn't take a clairvoyant to guess that the
    next progression of 8 -> 16 was going to be 32 and then 64.

    (The Z8000 came out in 1979. It was a 16-bit processor with a register
    set that could be accessed as 8, 16, 32 or 64-bit chunks. Actually you
    can also look at 68000 from that era, and the NatSemi 32032. I was an
    engineer at the time and very familiar with this stuff.

    C didn't figure in that world at all as far as I was concerned.)

    It is really nothing at all to do with C. (How would it have
    influenced that anyway, given that C implementations were adept at
    dealing with any memory model?)


    C implementations are /not/ good at dealing with non-linear memory,

    No longer likes it, as I said.

    But of course C was not the only influence on processor evolution.

    OK, you admit now it was not '/massive/'; good!


         #include <stdint.h>

    Otherwise it knows nothing about them. Second, if you look inside a
    typical stdint.h file (this one is from gcc/TDM on Windows), you might
    well see:

         typedef signed char int8_t;
         typedef unsigned char uint8_t;

    Nothing here guarantees that int8_t will be an 8-bit type; these
    'exact-width' types are defined on top of those loosely-defined types.
    They're an illusion.


    Sorry, you are completely wrong here.  Feel free to look it up in the C standards if you don't believe me.

    The above typedefs are from a C compiler you may have heard of: 'gcc'.
    Some may well use internal types such as `__int8`, but the above is the
    actual content of stdint.h, and makes `int8_t` a direct synonym for
    `signed char`.



    However, any kind of guesses as to how processors would have looked
    without C, and therefore what influence C /really/ had, are always going
    to be speculative.

    Without C, another lower-level systems language would have dominated,
    since such a language was necessary.

    More interesting however is what Unix would have looked like without C.
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Mon Nov 21 21:20:20 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-21 19:44, Bart wrote:

    Two of the first machines I used were PDP10 and PDP11, developed by DEC
    in the 1960s, both using linear memory spaces. While the former was word-based, the PDP11 was byte-addressable, just like the IBM 360 also
    from the 1960s.

    PDP-11 was not linear. The internal machine address was 24-bit. But the effective address in the program was 16-bit. The address space was 64K
    for data and 64K for code mapped by the virtual memory manager. Some
    machines had a third 64K space.

    And not Assembly, or Fortran or any other language?

    Assembler is not portable. FORTRAN had no pointers. Programmers
    implemented memory management on top of an array (e.g. LOGICAL*1, since
    it had no bytes or characters either (:-)). Since FORTRAN was totally
    untyped, you don't even need to cast anything! (:-))

    The REASON for segmented memory was because 16 bits and address spaces larger than 64K words didn't mix. When this was eventually fixed on
    80386 for x86, that was able to use 32-bit registers.

    Segmented memory requires fewer memory registers because the segment size
    may vary. A potential advantage, as was already mentioned, is that
    you could theoretically implement bounds checking on top of it. One
    example of such techniques was the VAX debugger, which ran programs at normal
    speed between breakpoints. The trick was to place active breakpoints on no-access pages. I don't advocate segmented memory, BTW.

    More interesting however is what Unix would have looked like without C.

    Though I hate both, I don't think C influenced UNIX much.
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Mon Nov 21 21:22:49 2022
    From Newsgroup: comp.lang.misc

    On 21/11/2022 19:44, Bart wrote:
    On 21/11/2022 17:56, David Brown wrote:
    On 19/11/2022 22:49, Bart wrote:

    I even suspect that the CPUs we use today are also as they are in part due to C. It has been that influential.

    C is /massively/ influential to the general purpose CPUs we have today.

    "Massively" influential? Why, how do you think CPUs would have ended up without C?

    As I said at the end of my previous post, it's very difficult to tell.
    Maybe they would be more varied. Maybe we'd have more stacks. Maybe
    we'd be freed from the idea that a "pointer" is nothing more than a
    linear address - it could have bounds, or access flags. Registers and
    memory could hold type information as well as values. Processors could
    have had support for multi-threading or parallel processing. They could
    have been designed around event models and signal passing, or have
    hardware acceleration for accessing code or data by name. They could
    have been better at handling coroutines. There are all kinds of
    different things hardware /could/ do, at least some of which would
    greatly suit some of the many different kinds of programming languages
    we have seen through the years.

    A few of these have turned up - there are processors with multiple
    stacks optimised for Forth, there were early massively parallel
    processors designed alongside the Occam language, and the company Linn Smart Computing made a radical new processor design for more efficient implementation of their own programming language. Some ARM cores had
    hardware acceleration for Java virtual machines.

    But I have no specific thoughts - predictions about possible parallel
    pasts are just as hard as predictions about the future!


    Two of the first machines I used were PDP10 and PDP11, developed by DEC
    in the 1960s, both using linear memory spaces. While the former was word-based, the PDP11 was byte-addressable, just like the IBM 360 also
    from the 1960s.


    C was developed originally for these processors, and was a major reason
    for their long-term success.

    C was designed with some existing processors in mind - I don't think
    anyone is suggesting that features such as linear memory came about
    solely because of C. But there was more variety of processor
    architectures in the old days, while almost all we have now are
    processors that are good for running C code.

    The early microprocessors I used (6800, Z80) also had a linear memory
    space, at a time when it was unlikely C implementations existed for
    them, or that people even thought that much about C outside of Unix.

      The prime requirement for almost any CPU design is that you should
    be able to use it efficiently for C.

    And not Assembly, or Fortran or any other language?

    Not assembly, no - /very/ little code is now written in assembly.
    FORTRAN efficiency used to be important for processor design, but not
    for a very long time. (FORTRAN is near enough the same programming
    model as C, however.)

    Don't forget that at
    the point it all began to change, mid-70s to mid-80, C wasn't that
    dominant. Any C implementations for microprocessors were incredibly slow
    and produced indifferent code.

    The OSes I used (for PDP10, PDP11, ICL 4/72, Z80) had no C involvement.
    When x86 popularised segment memory, EVERYBODY hated it, and EVERY
    language had a problem with it.


    Yes - the choice of the 8086 for PC's was a huge mistake. It was purely economics - the IBM designers wanted a 68000 processor. But IBM PHB's
    said that since the IBM PC was just a marketing exercise and they would
    never make more than a few thousand machines, technical benefits were irrelevant and the 8086 devices were cheaper. (By the same logic, they
    bought the cheapest OS they could get, despite everyone saying it was rubbish.)

    The REASON for segmented memory was because 16 bits and address spaces larger than 64K words didn't mix. When this was eventually fixed on
    80386 for x86, that was able to use 32-bit registers.

    According to you, without C, we would have been using 64KB segments even with 32 bit registers, or we maybe would never have got to 32 bits at
    all. What nonsense!


    Eh, no. I did not say anything /remotely/ like that.

    (I was designing paper CPUs with linear addressing long before then, probably like lots of people.)


    After all, the great majority of software is written in languages that, at
    their core, are similar to C (in the sense that once the compiler
    front-end has finished with them, you have variables, imperative
    functions, pointers, objects in memory, etc., much like C).

    I wish people would just accept that C does not have and never has had a monopoly on lower level languages.


    It does have, and has had for 40+ years, a /near/ monopoly on low-level languages. You can dislike C as much as you want, but you really cannot
    deny that!

    It's a shame that people now associate 'close-to-the-metal' programming
    with a language where a function pointer type is written as
    `void(*)(void)`, and that's in the simplest case.


    I don't disagree that it is a shame, or that better (for whatever value
    of "better" you like) low-level languages exist or can be made. That
    doesn't change the facts.



    Really? C was pretty much the only language in the world that does
    not specify the size of a byte. (It doesn't even have a 'byte' type.)


    8-bit byte and two's complement were, I think, inevitable regardless
    of C.

    So were lots of things. It didn't take a clairvoyant to guess that the
    next progression of 8 -> 16 was going to be 32 and then 64.


    Agreed.

    (The Z8000 came out in 1979. It was a 16-bit processor with a register
    set that could be accessed as 8, 16, 32 or 64-bit chunks. Actually you
    can also look at 68000 from that era, and the NatSemi 32032. I was an engineer at the time and very familiar with this stuff.

    C didn't figure in that world at all as far as I was concerned.)

    It is really nothing at all to do with C. (How would it have
    influenced that anyway, given that C implementations were adept at
    dealing with any memory model?)


    C implementations are /not/ good at dealing with non-linear memory,

    No longer likes it, as I said.

    But of course C was not the only influence on processor evolution.

    OK, you admit now it was not '/massive/'; good!


    Would you please stop making things up and pretending I said them?

    C was a /massive/ influence on processor evolution and the current standardisation of general-purpose processors as systems for running C
    code efficiently. But it was not the only influence, or the sole reason
    for current processor design.


         #include <stdint.h>

    Otherwise it knows nothing about them. Second, if you look inside a
    typical stdint.h file (this one is from gcc/TDM on Windows), you
    might well see:

         typedef signed char int8_t;
         typedef unsigned char uint8_t;

    Nothing here guarantees that int8_t will be an 8-bit type; these
    'exact-width' types are defined on top of those loosely-defined
    types. They're an illusion.


    Sorry, you are completely wrong here.  Feel free to look it up in the
    C standards if you don't believe me.

    The above typedefs are from a C compiler you may have heard of: 'gcc'.
    Some may well use internal types such as `__int8`, but the above is the actual content of stdint.h, and makes `int8_t` a direct synonym for
    `signed char`.


    They are part of C - specified precisely in the C standards. It does
    not matter how any particular implementation defines them. The C
    standards say they are part of C, and the type names are introduced into
    the current namespace using "#include <stdint.h>" (Or "#include <inttypes.h>".) The standards also say that "int8_t" is an 8-bit type,
    with no padding, and two's complement representation. This has been the
    case since C99 - there is no "looseness" or "illusions" in these types.



    However, any kind of guesses as to how processors would have looked
    without C, and therefore what influence C /really/ had, are always
    going to be speculative.

    Without C, another lower-level systems language would have dominated,
    since such a language was necessary.

    Perhaps - perhaps not. Domination of a particular market or niche does
    not always happen. Perhaps we would instead have Forth, Ada, and
    compiled BASIC in an equal balance.


    More interesting however is what Unix would have looked like without C.

    How do you think it would have looked?


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Mon Nov 21 21:38:19 2022
    From Newsgroup: comp.lang.misc

    On 21/11/2022 20:20, Dmitry A. Kazakov wrote:
    On 2022-11-21 19:44, Bart wrote:

    Two of the first machines I used were PDP10 and PDP11, developed by
    DEC in the 1960s, both using linear memory spaces. While the former
    was word-based, the PDP11 was byte-addressable, just like the IBM 360
    also from the 1960s.

    PDP-11 was not linear. The internal machine address was 24-bit. But the effective address in the program was 16-bit. The address space was 64K
    for data and 64K for code mapped by the virtual memory manager. Some machines had a third 64K space.


    My PDP11/34 probably didn't have that much memory. But if you couldn't
    access more than 64K per task (say for code or data, if treated
    separately), then I would still call that linear from the task's point
    of view.


    And not Assembly, or Fortran or any other language?

    Assembler is not portable.

    That is not relevant. The suggestion was that keeping C happy was a
    motivation for CPU designers, but a lot of ASM code was still being run too.


    FORTRAN had no pointers. Programmers
    implemented memory management on top of an array

    But those arrays work better in linear memory. There was a lot of
    Fortran code around too (probably a lot more than C at the time I got
    into it), and /that/ code needed to stay efficient too.

    So I was questioning whether C was that big a factor at that period when
    the architectures we have now were just beginning to be developed.


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Mon Nov 21 23:03:09 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-21 22:38, Bart wrote:
    On 21/11/2022 20:20, Dmitry A. Kazakov wrote:
    On 2022-11-21 19:44, Bart wrote:

    Two of the first machines I used were PDP10 and PDP11, developed by
    DEC in the 1960s, both using linear memory spaces. While the former
    was word-based, the PDP11 was byte-addressable, just like the IBM 360
    also from the 1960s.

    PDP-11 was not linear. The internal machine address was 24-bit. But
    the effective address in the program was 16-bit. The address space was
    64K for data and 64K for code mapped by the virtual memory manager.
    Some machines had a third 64K space.

    My PDP11/34 probably didn't have that much memory. But if you couldn't access more than 64K per task (say for code or data, if treated
    separately), then I would still call that linear from the task's point
    of view.

    So is segmented memory if you have a single segment. Once you needed more
    than 64K of data or code, your linearity would end.

    FORTRAN had no pointers. Programmers implemented memory management on
    top of an array

    But those arrays work better in linear memory.

    FORTRAN was not high-level enough to support memory mapping on indexing.
    The method of handling data structures and code larger than the address
    space was the loader's overlay trees, a kind of precursor of paging/swap. Segmented or paged made no difference.
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Tue Nov 22 12:38:18 2022
    From Newsgroup: comp.lang.misc

    On 21/11/2022 20:22, David Brown wrote:
    On 21/11/2022 19:44, Bart wrote:

    Two of the first machines I used were PDP10 and PDP11, developed by
    DEC in the 1960s, both using linear memory spaces. While the former
    was word-based, the PDP11 was byte-addressable, just like the IBM 360
    also from the 1960s.


    C was developed originally for these processors, and was a major reason
    for their long-term success.

    Of the PDP10 and IBM 360? Designed in the 1960s and discontinued in 1983
    and 1979 respectively. C only came out in a first version in 1972.

    The PDP11 was superseded around this time (either side of 1980) by the
    VAX-11, a 32-bit version, no doubt inspired by the C language, one that
    was well known for not specifying the sizes of its types - it adapted to
    the size of the hardware.

    Do you really believe this stuff?

    C was designed with some existing processors in mind - I don't think
    anyone is suggesting that features such as linear memory came about
    solely because of C.  But there was more variety of processor
    architectures in the old days, while almost all we have now are
    processors that are good for running C code.

    As I said, C is the language that adapts itself to the hardware, and in
    fact still is the primary language now that can and does run on every
    odd-ball architecture.

    Which is why it is an odd candidate for a language that was supposed to
    drive the evolution of hardware because of its requirements.

    The early microprocessors I used (6800, Z80) also had a linear memory
    space, at a time when it was unlikely C implementations existed for
    them, or that people even thought that much about C outside of Unix.

      The prime requirement for almost any CPU design is that you should
    be able to use it efficiently for C.

    And not Assembly, or Fortran or any other language?

    Not assembly, no - /very/ little code is now written in assembly.

    Now, yes. I'm talking about that formative period of mid-70s to mid-80s
    when everything changed. From being dominated by mainframes, to 32-bit microprocessors which are only one step behind the 64-bit ones we have now.


    FORTRAN efficiency used to be important for processor design, but not
    for a very long time.  (FORTRAN is near enough the same programming
    model as C, however.)

    Oh, right. In that case could it possibly have been the need to run
    Fortran efficiently that was a driving force in that period?

    (I spent a year in the late 70s writing Fortran code in two scientific establishments in the UK. No one used C.)

    Don't forget that at the point it all began to change, mid-70s to
    mid-80, C wasn't that dominant. Any C implementations for
    microprocessors were incredibly slow and produced indifferent code.

    The OSes I used (for PDP10, PDP11, ICL 4/72, Z80) had no C
    involvement. When x86 popularised segment memory, EVERYBODY hated it,
    and EVERY language had a problem with it.


    Yes - the choice of the 8086 for PC's was a huge mistake.  It was purely economics - the IBM designers wanted a 68000 processor.

    When you looked at the 68000 more closely, it had nearly as much non-orthogonality as the 8086. (I was trying at that time to get my
    company to switch to a processor like the 68k.)

    (The 8086 was bearable, but it had one poor design choice that had huge implications: forming an address by shifting a 16-bit segment address by
    4 bits instead of 8.

    That meant an addressing range of only 1MB instead of 16MB, leading to a situation later where you could cheaply install 4MB or 8MB of memory,
    but you couldn't easily make use of it.)
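
    (The arithmetic behind that, as a sketch - the function names are made
    up, but the first calculation is the documented real-mode one:)

        #include <stdint.h>

        /* 8086 real mode: a 16-bit segment shifted left by 4, plus a
           16-bit offset, gives 20-bit addresses, i.e. roughly a 1MB
           range (maximum 0xFFFF0 + 0xFFFF = 0x10FFEF).               */
        uint32_t phys_8086(uint16_t seg, uint16_t off)
        {
            return ((uint32_t)seg << 4) + off;
        }

        /* The alternative described above: shifting by 8 instead would
           have given 24-bit addresses, i.e. roughly a 16MB range.    */
        uint32_t phys_shift8(uint16_t seg, uint16_t off)
        {
            return ((uint32_t)seg << 8) + off;
        }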


    According to you, without C, we would have been using 64KB segments
    even with 32 bit registers, or we maybe would never have got to 32
    bits at all. What nonsense!


    Eh, no.  I did not say anything /remotely/ like that.

    It sounds like it! Just accept that C had no more nor less influence
    than any other language /at that time/.


    It does have, and has had for 40+ years, a /near/ monopoly on low-level languages.  You can dislike C as much as you want, but you really cannot deny that!

    It's also the fact that /I/ at least have also successfully avoided
    using C for 40+ years (and, probably fairly uniquely, have used private languages). I'm sure there are other stories like mine that you don't
    hear about.


    But of course C was not the only influence on processor evolution.

    OK, you admit now it was not '/massive/'; good!


    Would you please stop making things up and pretending I said them?

    You actually said this:

    C is /massively/ influential to the general purpose CPUs we have today.

    Which suggests that you don't think any other language comes close.

    I don't know which individual language, if any, was most influential,
    but I doubt C played a huge part because it came out too late, and was
    not that popular in those formative years, by which time the way
    processors were going to evolve was becoming clear anyway.

    (That is, still dominated by von Neumann architectures, as has been the
    case since long before C.)

    But C probably has influenced modern 64-bit ABIs, even though they are supposed to be language-independent.

    More interesting however is what Unix would have looked like without C.

    How do you think it would have looked?

    Case insensitive? Or maybe that's just wishful thinking.

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Tue Nov 22 16:29:42 2022
    From Newsgroup: comp.lang.misc

    On 22/11/2022 13:38, Bart wrote:
    On 21/11/2022 20:22, David Brown wrote:
    On 21/11/2022 19:44, Bart wrote:

    Two of the first machines I used were PDP10 and PDP11, developed by
    DEC in the 1960s, both using linear memory spaces. While the former
    was word-based, the PDP11 was byte-addressable, just like the IBM 360
    also from the 1960s.


    C was developed originally for these processors, and was a major
    reason for their long-term success.

    Of the PDP10 and IBM 360? Designed in the 1960s and discontinued in 1983
    and 1979 respectively. C only came out in a first version in 1972.


    I was thinking primarily of the PDP11, which was the first real target
    for C (assuming I have my history correct - this was around the time I
    was born). And by "long-term success" of these systems, I mean their successors that were built in the same style - such as the VAX.

    The PDP11 was superseded around this time (either side of 1980) by the VAX-11, a 32-bit version, no doubt inspired by the C language, one that
    was well known for not specifying the sizes of its types - it adapted to
    the size of the hardware.

    Do you really believe this stuff?

    C was designed with some existing processors in mind - I don't think
    anyone is suggesting that features such as linear memory came about
    solely because of C.  But there was more variety of processor
    architectures in the old days, while almost all we have now are
    processors that are good for running C code.

    As I said, C is the language that adapts itself to the hardware, and in
    fact still is the primary language now that can and does run on every odd-ball architecture.


    C does not "adapt itself to the hardware". It is specified with some
    details of features being decided by the implementer. (Some of these
    details are quite important.) Part of the reason for this is to allow efficient implementations on a wide range of hardware, but it also
    determines a balance between implementer freedom, and limits that a
    programmer can rely upon. There are plenty of cases where different implementations on the same hardware make different choices of the
    details. (Examples include the size of "long" on 64-bit x86 systems
    being different for Windows and the rest of the world, or some compilers
    for the original m68k having 16-bit int while others had 32-bit int.)

    Which is why it is an odd candidate for a language that was supposed to drive the evolution of hardware because of its requirements.

    There is a difference between a language being usable on a range of
    systems, and being very /efficient/ on a range of systems. You can use
    C on an 8-bit AVR processor - there is a gcc port. But it is not a good processor design for C - there are few pointer registers, 16-bit
    manipulation is inefficient, there are separate address spaces for flash
    and ram, there is no stack pointer + offset addressing mode. So while C
    is far and away the most popular language for programming AVR's, AVR's
    are not good processors for C. (Other 8-bit cores such as the 8051 are
    even worse, and that is a reason for them being dropped as soon as
    32-bit ARM cores became cheap enough.)
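
    (To make the 'separate address spaces' point concrete, assuming avr-gcc
    with avr-libc - a sketch with made-up names, not a complete program:)

        #include <avr/pgmspace.h>  /* avr-libc: PROGMEM, pgm_read_byte() */

        /* Kept in flash rather than copied into the small RAM at startup. */
        static const char banner[] PROGMEM = "hello";

        char banner_char(unsigned i)
        {
            /* A plain dereference (banner[i]) would read the RAM address
               space at that address; flash has to be read explicitly
               through pgm_read_byte() instead.                          */
            return (char)pgm_read_byte(&banner[i]);
        }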


    The early microprocessors I used (6800, Z80) also had a linear memory
    space, at a time when it was unlikely C implementations existed for
    them, or that people even thought that much about C outside of Unix.

      The prime requirement for almost any CPU design is that you should be able to use it efficiently for C.

    And not Assembly, or Fortran or any other language?

    Not assembly, no - /very/ little code is now written in assembly.

    Now, yes. I'm talking about that formative period of mid-70s to mid-80s
    when everything changed. From being dominated by mainframes, to 32-bit microprocessors which are only one step behind the 64-bit ones we have now.


    OK, for a time the ability to program efficiently in assembly was
    important. But that was already in decline by the early 1980's in big systems, as we began to see a move towards RISC processors optimised for compiler output rather than CISC processors optimised for human assembly coding. (The continued existence of CISC was almost entirely due to the
    IBM PC's choice of the 8088 processor.)


    FORTRAN efficiency used to be important for processor design, but not
    for a very long time.  (FORTRAN is near enough the same programming
    model as C, however.)

    Oh, right. In that case could it possibly have been the need to run Fortran efficiently that was a driving force in that period?

    That would have been important too, but C quickly overwhelmed FORTRAN in popularity. FORTRAN was used in scientific and engineering work, but C
    was the choice for systems programming and most application programming.


    (I spent a year in the late 70s writing Fortran code in two scientific establishments in the UK. No one used C.)

    Don't forget that at the point it all began to change, mid-70s to
    mid-80, C wasn't that dominant. Any C implementations for
    microprocessors were incredibly slow and produced indifferent code.

    The OSes I used (for PDP10, PDP11, ICL 4/72, Z80) had no C
    involvement. When x86 popularised segment memory, EVERYBODY hated it,
    and EVERY language had a problem with it.


    Yes - the choice of the 8086 for PC's was a huge mistake.  It was
    purely economics - the IBM designers wanted a 68000 processor.

    When you looked at the 68000 more closely, it had nearly as much non-orthogonality as the 8086. (I was trying at that time to get my
    company to switch to a processor like the 68k.)

    No, it does not. (Yes, I have looked at it closely, and used 68k
    processors extensively.)


    (The 8086 was bearable, but it had one poor design choice that had huge implications: forming an address by shifting a 16-bit segment address by
    4 bits instead of 8.

    That meant an addressing range of only 1MB instead of 16MB, leading to a situation later where you could cheaply install 4MB or 8MB of memory,
    but you couldn't easily make use of it.)

    The 8086 was horrible in all sorts of ways. Comparing a 68000 with an
    8086 is like comparing a Jaguar E-type with a bathtub with wheels. And
    for the actual chip used in the first PC, an 8088, half the wheels were removed.



    According to you, without C, we would have been using 64KB segments
    even with 32 bit registers, or we maybe would never have got to 32
    bits at all. What nonsense!


    Eh, no.  I did not say anything /remotely/ like that.

    It sounds like it! Just accept that C had no more nor less influence
    than any other language /at that time/.


    The most successful (by a huge margin - like it or not) programming
    language evolved, spread and conquered the programming world, at the
    same time as the basic processor architecture evolved and solidified
    into a style that is very good at executing C programs, and is missing countless features that would be useful for many other kinds of
    programming languages. Coincidence? I think not.

    Of course there were other languages that benefited from those same processors, but none were or are as popular as C and its clear descendants.


    It does have, and has had for 40+ years, a /near/ monopoly on low-level
    languages.  You can dislike C as much as you want, but you really
    cannot deny that!

    It's also the fact that /I/ at least have also successfully avoided
    using C for 40+ years (and, probably fairly uniquely, have used private languages). I'm sure there are other stories like mine that you don't
    hear about.

    Sure. But for every person like you that has made a successful career
    with your own language, there are perhaps 100,000 other programmers who
    have used other languages as the basis of their careers. 90% of them at
    least will have C or its immediate descendants (C++, Java, C#, etc.) as
    their main language.

    You can have your opinions about quality, but in terms of /quantity/
    there is no contest.



    But of course C was not the only influence on processor evolution.

    OK, you admit now it was not '/massive/'; good!


    Would you please stop making things up and pretending I said them?

    You actually said this:

    C is /massively/ influential to the general purpose CPUs we have today.

    Which suggests that you don't think any other language comes close.

    That is correct. But it also means I don't think it was the only reason processors are the way they are. And I most certainly did not "admit
    now that it was not massive".


    I don't know which individual language, if any, was most influential,
    but I doubt C played a huge part because it came out too late, and was
    not that popular in those formative years, by which time the way
    processors were going to evolve was becoming clear anyway.

    (That is, still dominated by von Neumann architectures, as has been the
    case since long before C.)

    But C probably has influenced modern 64-bit ABIs, even though they are supposed to be language-independent.


    What makes you think they are supposed to be language independent? What
    makes you think they are not? What makes you care?

    The types and terms from C are a very convenient way to describe an ABI,
    since it is a language familiar to any programmer who might be
    interested in the details of an ABI. Such ABI's only cover a
    (relatively) simple common subset of possible interfaces, but do so in a
    way that can be used from any language (with wrappers if needed) and can
    be extended as needed.

    People make ABI's for practical use. MS made the ABI for Win64 to suit
    their own needs and uses. AMD and a range of *nix developers (both OS
    and application developers) and compiler developers got together to
    develop the 64-bit x86 ABI used by everyone else, designed to suit
    /their/ needs and uses.

    If a language needs something more for their ABI - such as Python
    wanting support for Python objects - then that can be built on top of
    the standard ABI.


    More interesting however is what Unix would have looked like without C.

    How do you think it would have looked?

    Case insensitive? Or maybe that's just wishful thinking.


    Case insensitivity is a mistake, born from the days before computers
    were advanced enough to have small letters as well as capitals. It
    leads to ugly inconsistencies, wastes the opportunity to convey useful semantic information, and is an absolute nightmare as soon as you stray
    from the simple English-language alphabet.

    I believe Unix's predecessor, Multics, was case-sensitive. But I could
    be wrong.

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Andy Walker@anw@cuboid.co.uk to comp.lang.misc on Tue Nov 22 16:27:08 2022
    From Newsgroup: comp.lang.misc

    On 22/11/2022 15:29, David Brown wrote:
    Case insensitivity is a mistake, born from the days before computers
    were advanced enough to have small letters as well as capitals.

    I don't believe I have ever used a computer that did not "have
    small letters". There has been some discussion over in "comp.compilers" recently, but it's basically the difference between punched cards and
    paper tape. The Flexowriter can be traced back to the 1920s, and its
    most popular form was certainly being used by computers in the 1950s,
    so there really weren't many "days before" to be considered.
    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Hertel
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Tue Nov 22 17:13:32 2022
    From Newsgroup: comp.lang.misc

    On 22/11/2022 15:29, David Brown wrote:
    On 22/11/2022 13:38, Bart wrote:

    When you looked at the 68000 more closely, it had nearly as much
    non-orthogonality as the 8086. (I was trying at that time to get my
    company to switch to a processor like the 68k.)

    No, it does not.  (Yes, I have looked at it closely, and used 68k processors extensively.)

    As a compiler writer? The first thing you notice is that you have to
    decide whether to use D-registers or A-registers, as they had different characteristics, but the 3-bit register field of instructions could only
    use one or the other.

    That made the 8086 simpler because there was no choice! The registers
    were limited and only one was general purpose.

    But C probably has influenced modern 64-bit ABIs, even though they are
    supposed to be language-independent.


    What makes you think they are supposed to be language independent?  What makes you think they are not?  What makes you care?

    Language A can talk to language B via the machine's ABI. Where does C
    come into it?

    Language A can talk to a library or OS component that resides in a DLL,
    via the ABI. The library might have been implemented in C, or assembler,
    or in anything else, but in binary form, is pure machine code anyway.

    What makes /you/ think that such ABIs were invented purely for the use
    of C programs? Do you think the designers of the ABI simply assumed that
    only programs written in the C language could call into the OS?

    When you download a shared library DLL, do you think they have different versions depending on what language will be using the DLL?

    The types and terms from C are a very convenient way to describe an ABI,

    They're pretty terrible actually. The types involved in the SYS V ABI can be expressed as follows in a form that everyone understands and many
    languages use:

    i8 i16 i32 i64 i128
    u8 u16 u32 u64 u128
    f32 f64 f128

    This document (https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf)
    lists the C equivalents as follows (only signed integers shown):

    i8    char, signed char
    i16   short, signed short
    i32   int, signed int
    i64   long, signed long, long long, signed long long
    i128  __int128, signed __int128

    (No use of int8_t etc despite the document dated 2012.)

    This comes up in APIs too where it is 100 times more relevant (only
    compiler writers care about the ABI). The C denotations shown here are
    not fit for purpose for language-neutral interfaces.

    (Notice also that 'long' and 'long long' are both 64 bits, and that
    'char' is assumed to be signed. In practice the C denotations would vary across platforms, while those i8-i128 would stay constant, provided only
    that the machine uses conventional register sizes.)

    So it's more like, such interfaces were developed /despite/ C.
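
    (As a concrete illustration of that point - hypothetical prototypes,
    not taken from any real header:)

        #include <stdint.h>

        /* Under the SysV x86-64 ABI all three of these describe the same
           thing: one i64 argument (in RDI) and an i64 result (in RAX).
           But only the <stdint.h> spelling means 64 bits everywhere;
           plain 'long' is 32 bits on 64-bit Windows, so the C names do
           not travel across platforms the way i64 does.                */
        int64_t   f1(int64_t x);
        long      f2(long x);
        long long f3(long long x);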

    since it is a language familiar to any programmer who might be
    interested in the details of an ABI.  Such ABI's only cover a
    (relatively) simple common subset of possible interfaces, but do so in a
    way that can be used from any language (with wrappers if needed) and can
    be extended as needed.

    People make ABI's for practical use.  MS made the ABI for Win64 to suit their own needs and uses.  AMD and a range of *nix developers (both OS
    and application developers) and compiler developers got together to
    develop the 64-bit x86 ABI used by everyone else, designed to suit
    /their/ needs and uses.

    x86-32 used a number of different ABIs depending on language and
    compiler. x86-64 tends to use one ABI, which is a strong indication that
    that ABI was intended to work across languages and compilers.


    Case insensitive? Or maybe that's just wishful thinking.


    Case insensitivity is a mistake, born from the days before computers
    were advanced enough to have small letters as well as capitals.  It
    leads to ugly inconsistencies, wastes the opportunity to convey useful semantic information, and is an absolute nightmare as soon as you stray
    from the simple English-language alphabet.

    Yet Google searches are case-insensitive. How is that possible, given
    that search strings can use Unicode which you say does not define case equivalents across most alphabets?

    As are email addresses and domain names.

    As are most things in everyday life, even now that it is all tied up
    with computers and smartphones and tablets with everything being online.

    (Actually, most people's exposure to case-sensitivity is in online
    passwords, which is also the worst place to have it, as usually you
    can't see them!)

    Your objections make no sense at all. Besides which, plenty of case-insensitive languages, file-systems and shell programs and
    applications exist.

    I believe Unix's predecessor, Multics, was case-sensitive.  But I could
    be wrong.

    I'm surprised the Unix and C developers even had a terminal that could
    do upper and lower case. I was stuck with upper case for the first year
    or two. File-systems and global linker symbols were also restricted in
    length and case for a long time, to minimise space.

    Case-sensitivity was a luxury into the 80s.

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Tue Nov 22 18:47:08 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-22 18:13, Bart wrote:

    Language A can talk to language B via the machine's ABI. Where does C
    come into it?

    Data types of arguments including padding/gaps of structures, calling conventions. E.g. Windows' native calling convention is stdcall, while C deploys cdecl.
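
    (On 32-bit Windows the difference shows up directly in declarations -
    a sketch assuming MSVC or MinGW, with made-up function names; on x64
    the keywords are accepted but ignored, as there is a single calling
    convention:)

        /* stdcall: the callee pops its arguments; MSVC decorates the
           32-bit exported name as _AddStd@8 (8 = bytes of arguments). */
        int __stdcall AddStd(int a, int b);

        /* cdecl: the caller cleans up, which is what makes variadic
           functions like printf() workable; decorated simply as _AddC. */
        int __cdecl AddC(int a, int b);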

    Language A can talk to a library or OS component that resides in a DLL,
    via the ABI. The library might have been implemented in C, or assembler,
    or in anything else, but in binary form, is pure machine code anyway.

    Same as above. Data types, calling conventions.

    What makes /you/ think that such ABIs were invented purely for the use
    of C programs? Do you think the designers of the ABI simply assumed that only programs written in the C language could call into the OS?

    That depends on the OS.

    - VMS used MACRO-11 and unified calling conventions. That was DEC and
    that was the time people really cared, before the Dark Age of Computing.

    - Windows was stdcall, but then some of its parts gave way to C.

    - UNIXes used C's conventions, naturally.

    When you download a shared library DLL, do you think they have different versions depending on what language will be using the DLL?

    That is certainly a possibility. There are lots of libraries having language-specific adapters. If you use a higher level language you would
    like to take advantage of this. Usually there are quite complicated elaboration protocols upon library loading, ensuring initialization of
    complex objects, versioning consistency, and all the things the stupid loaders
    cannot do. The price is that you might not be able to use it with C or
    another language.

    I'm surprised the Unix and C developers even had a terminal that could
    do upper and lower case.

    No idea, but already the DEC VT52 had lower-case.

    (Of course, case-sensitivity was an incredibly stupid choice)
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Andy Walker@anw@cuboid.co.uk to comp.lang.misc on Tue Nov 22 20:14:18 2022
    From Newsgroup: comp.lang.misc

    On 22/11/2022 17:13, Bart wrote:
    I'm surprised the Unix and C developers even had a terminal that
    could do upper and lower case. I was stuck with upper case for the
    first year or two. [...]
    Case-sensitivity was a luxury into the 80s.

    As per my nearby article, lower case was available for paper
    tape long before [electronic] computers existed. It's difficult to
    do word processing without a decent character set; I was doing it
    [admittedly in a rather primitive way] in the mid-60s. There were
    some peripherals [esp many lineprinters, card punches and teletypes]
    that were restricted to upper case, but lower case was scarcely a
    "luxury" when many secretaries were using electric typewriters.
    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Hertel
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Tue Nov 22 20:17:53 2022
    From Newsgroup: comp.lang.misc

    On 22/11/2022 15:29, David Brown wrote:
    On 22/11/2022 13:38, Bart wrote:

    As I said, C is the language that adapts itself to the hardware, and
    in fact still is the primary language now that can and does run on
    every odd-ball architecture.


    C does not "adapt itself to the hardware".

    It will work on a greater range of hardware than one of my languages.

    For example, mine would have trouble on anything which is not byte-addressable, has anything other than 8-bit bytes, or supports
    primitive types that are not powers of two.

    This is important because you are claiming that it is the less fussy C language which is driving those characteristics, whereas it is more
    likely other languages that are more demanding.

    You may in fact be partly right in that the existence of controllers
    with odd word sizes may actually be due to there being a language which
    was conducive to writing suitable custom compilers. If C wasn't around,
    they may have needed to be more conforming.

    But then you can't have a language which is responsible both for 64-bit desktop processors, and 24-bit signal processors.

    Or perhaps we can, let's just rewrite history so that C is
    single-handedly responsible for the design of all hardware, even that
    devised a decade before C came out. No other languages matter.


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Tue Nov 22 20:42:18 2022
    From Newsgroup: comp.lang.misc

    On 22/11/2022 17:47, Dmitry A. Kazakov wrote:
    On 2022-11-22 18:13, Bart wrote:

    Language A can talk to language B via the machine's ABI. Where does C
    come into it?

    Data types of arguments including padding/gaps of structures, calling conventions.

    Actually the Win64 ABI doesn't go into types much at all. The main types
    are integers which are 1/2/4/8 bytes, each of which occupies one 64-bit
    GP register or one 64-bit stack slot; and floats of 4/8 bytes which are
    passed in the bottom end of a 128-bit XMM register, or via one 64-bit
    stack slot.
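
    (A sketch of how that looks for the first four arguments under the
    Microsoft x64 convention - the register assignment is the documented
    one, the function itself is made up:)

        #include <stdint.h>

        /* Win64: argument slots 1-4 map to RCX, RDX, R8, R9 for integers
           and pointers, or XMM0-XMM3 for floats/doubles, by position;
           further arguments go on the stack, the caller reserves 32
           bytes of "shadow space", and the result comes back in RAX or
           XMM0.                                                        */
        double mix(int32_t a,   /* slot 1: ECX (low half of RCX) */
                   double  b,   /* slot 2: XMM1                  */
                   void   *c,   /* slot 3: R8                    */
                   float   d);  /* slot 4: XMM3 (low 32 bits)    */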

    Surely you're not going to claim that this is all thanks to C? That the hardware uses 64/128-bit GP/XMM registers and requires a 64/128-bit
    aligned stack couldn't possibly be the reason?

    Or are you going to claim like David Brown that the hardware is like
    that solely due to the need to run C programs? (Because any other
    language would run perfectly fine with 37-bit integers implemented on a 29-bit-addressable memory.)

    It doesn't go into struct layouts much either, but those are mainly
    driven by alignment needs, which again are due to hardware, not C.
    (Several structs which occur in Win32 API actually aren't strictly aligned.)

    E.g. Windows' native calling convention is stdcall, while C
    deploys cdecl.

    That all disappears with 64 bits. With 32-bit DLLs, while there was
    still one DLL, you needed to know the call-convention in use; this would
    have been part of the API. But while there were 100s of languages, there
    were only a handful of call conventions.


    When you download a shared library DLL, do you think they have
    different versions depending on what language will be using the DLL?

    That is certainly a possibility.

    My Win64 machine has 3300 DLL files in \windows\system32. Which language should they be for? It would be crazy to cater to every possible language.

    Plus, DLLs tend to include other DLLs; when the OS loads a DLL A, which imports DLL B, it will not know which language version of B to look for
    (and they would all be called B.DLL). All it might do is look for 32-bit
    and 64-bit versions which are stored in different places.


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Tue Nov 22 20:54:43 2022
    From Newsgroup: comp.lang.misc

    On 22/11/2022 20:14, Andy Walker wrote:
    On 22/11/2022 17:13, Bart wrote:
    I'm surprised the Unix and C developers even had a terminal that
    could do upper and lower case. I was stuck with upper case for the
    first year or two. [...]
    Case-sensitivity was a luxury into the 80s.

        As per my nearby article, lower case was available for paper
    tape long before [electronic] computers existed.  It's difficult to
    do word processing without a decent character set;  I was doing it [admittedly in a rather primitive way] in the mid-60s.  There were
    some peripherals [esp many lineprinters, card punches and teletypes]
    that were restricted to upper case, but lower case was scarcely a
    "luxury" when many secretaries were using electric typewriters.


    Perhaps the computer department of my college, and the places I worked
    at, were poorly equipped then. We used ASR33s and video terminals that emulated those teletypes, so upper case only.

    All the Fortran I wrote was in upper case, as far as I can remember.

    The file systems of my PDP10 machine at least used 'sixbit' encoding, so
    could only do upper-case. The 'radix50' encoding of the PDP11 linker
    also restricted things to upper case.
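
    (Roughly what those encodings did - a sketch, not the exact DEC
    definitions - which shows why lower case simply had nowhere to go:)

        /* DEC SIXBIT: ASCII 32..95 stored as 0..63, six characters per
           36-bit word; the range ends before 'a' (ASCII 97), so lower
           case cannot be represented at all. */
        unsigned sixbit(char c)  { return (unsigned char)c - 32; }

        /* RADIX-50 on the PDP-11: three characters from a 40-symbol set
           (space, A-Z, $, '.', %, 0-9) packed into one 16-bit word. */
        unsigned rad50(unsigned c1, unsigned c2, unsigned c3)
        {
            return (c1 * 40 + c2) * 40 + c3;   /* each c is an index 0..39 */
        }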

    The bitmap fonts of early screens and dot-matrix printers may also have
    been limited to upper case (the first video display of my own was).

    I think the Tektronix 4010 vector display I used was upper case only.

    My point was, there were so many restrictions, how did people manage to
    write C code? It was only into the 1980s that I could reliably make use
    of mixed case.
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Tue Nov 22 23:24:15 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-22 21:42, Bart wrote:
    On 22/11/2022 17:47, Dmitry A. Kazakov wrote:
    On 2022-11-22 18:13, Bart wrote:

    Language A can talk to language B via the machine's ABI. Where does C
    come into it?

    Data types of arguments including padding/gaping of structures,
    calling conventions.
    Actually the Win64 ABI doesn't go into types much at all.

    It is all about types. The funny thing is, it even specifies endianness,
    thanks to the stupidity of C's unions; see how LARGE_INTEGER is defined.
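
    (Simplified, the winnt.h declaration is along these lines:)

        typedef union _LARGE_INTEGER {
            struct {
                DWORD LowPart;    /* low half named first...              */
                LONG  HighPart;   /* ...which only overlays QuadPart      */
            } u;                  /*    correctly on a little-endian CPU  */
            LONGLONG QuadPart;    /* the whole 64-bit value               */
        } LARGE_INTEGER;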

    Or are you going to claim like David Brown that the hardware is like
    that solely due to the need to run C programs?

    Nobody would ever use any hardware if there is no C compiler. So David
    is certainly right.

    Long ago, there existed Lisp machines, machines designed for tagging
    data with types etc. All of this sank when C took the reins. Today the
    situation is slowly changing, with FPGAs and the NN hype foaming over...

    That all disappears with 64 bits. With 32-bit DLLs, while there was
    still one DLL, you needed to know the call-convention in use; this would have been part of the API. But while there were 100s of languages, there were only a handful of call conventions.

    There are as many conventions as languages because complex types and
    closures require techniques unknown to plain C.

    Plus, DLLs tend to include other DLLs; when the OS loads a DLL A, which imports DLL B, it will not know which language version of B to look for
    (and they would all be called B.DLL).

    This is why there exists the callback on DLL load. Elaboration stuff is
    performed from there. In the case of Ada, lots of things happen there,
    because all library-level objects are initialized there, library-level
    tasks start there, etc.
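
    (In C terms the hook looks like this; the body is only a placeholder:)

        #include <windows.h>

        /* Called by the loader when the DLL is mapped in or out; a
           language runtime can hang its start-up ("elaboration") code
           on the DLL_PROCESS_ATTACH case. */
        BOOL WINAPI DllMain(HINSTANCE inst, DWORD reason, LPVOID reserved)
        {
            if (reason == DLL_PROCESS_ATTACH) {
                /* initialize library-level objects, start tasks, ... */
            }
            return TRUE;
        }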
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Wed Nov 23 00:03:16 2022
    From Newsgroup: comp.lang.misc

    On 22/11/2022 22:24, Dmitry A. Kazakov wrote:
    On 2022-11-22 21:42, Bart wrote:
    On 22/11/2022 17:47, Dmitry A. Kazakov wrote:
    On 2022-11-22 18:13, Bart wrote:

    Language A can talk to language B via the machine's ABI. Where does
    C come into it?

    Data types of arguments including padding/gaping of structures,
    calling conventions.
    Actually the Win64 ABI doesn't go into types much at all.

    It is all about types. The funny thing is, it even specifies endianness,
    thanks to the stupidity of C's unions; see how LARGE_INTEGER is defined.

    LARGE_INTEGER is not mentioned in the ABI and is not listed here: https://learn.microsoft.com/en-us/windows/win32/winprog/windows-data-types.

    The ABI really doesn't care about types other than it needs to know how
    many bytes values occupy, and whether they need to go into GP or FLOAT registers. It is quite low-level.

    Or are you going to claim like David Brown that the hardware is like
    that solely due to the need to run C programs?

    Nobody would ever use any hardware if there is no C compiler. So David
    is certainly right.

    You're both certainly wrong. People used hardware before C; they used
    hardware without C. And I spent a few years building bare computer
    boards that I programmed from scratch, with no C compiler in sight.


    Long ago, there existed Lisp machines, machines designed for tagging
    data with types etc. All of this sank when C took the reins. Today the
    situation is slowly changing, with FPGAs and the NN hype foaming over...

    It sank because nobody used Lisp.


    That all disappears with 64 bits. With 32-bit DLLs, while there was
    still one DLL, you needed to know the call-convention in use; this
    would have been part of the API. But while there were 100s of
    languages, there were only a handful of call conventions.

    There are as many conventions as languages because complex types and closures require techniques unknown to plain C.

    If complex language X wants to talk to complex language Y, then they
    have to agree on a common way to represent the concepts that they share.
    Then they can either build on top of the platform ABI, or devise a
    private ABI.

    The platform ABI is still needed if they want to make use of a third
    party library that exports functions in a lower level API.

    Some libraries I can't use even via a DLL and via the ABI because they
    use complex C++ types or things like COM. But this will be clear when
    you look at their APIs.

    I can only use lower-level APIs using more primitive types. But I get
    angry when people suggest that such interfaces that I used for many
    years within my own languages, even my own hardware, are now being
    claimed as an invention of C and that they wouldn't exist otherwise.
    What nonsense!

    Give people a /choice/ of lower level languages and there would have
    been more possibilities for writing libraries with such interfaces.

    If I export a function F taking an i64 type and returning an i64 type,
    it is thanks to C that that is possible? Nothing to do with the hardware implementing a 64-bit type and making use of that fact.
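
    Spelled out, the entire interface of such a function is this (the
    names are invented, and it is written in C only because something has
    to write it down):

        #include <windows.h>

        /* DLL side: one 64-bit value in, one 64-bit value out. */
        __declspec(dllexport) long long F(long long x) { return x + 1; }

        /* Caller side - any language that can do this much can use it;
           at the machine level the argument travels in RCX and the
           result comes back in RAX, nothing more. */
        typedef long long (*F_t)(long long);

        int main(void)
        {
            HMODULE m = LoadLibraryA("example.dll");   /* made-up DLL name */
            if (!m) return 1;
            F_t f = (F_t)GetProcAddress(m, "F");
            return f ? (int)f(41) : 1;
        }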

    This is 50% of why I hate C (the other 50% is because it does things in
    such a crappy manner - unsigned long long int indeed, which is still
    only guaranteed to be at least 64 bits rather than exactly 64), because
    people credit it with too much.

    To me it still looks like something a couple of students threw together
    over a drunken weekend for a laugh.


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Andy Walker@anw@cuboid.co.uk to comp.lang.misc on Wed Nov 23 00:27:04 2022
    From Newsgroup: comp.lang.misc

    On 22/11/2022 20:54, Bart wrote:
    Perhaps the computer department of my college, and the places I
    worked at, were poorly equipped then. We used ASR33s and video
    terminals that emulated those teletypes, so upper case only.
    All the Fortran I wrote was in upper case that I can remember.

    Yes, Fortran was very usually upper case, and equally usually on punched cards rather than paper tape. I can't usefully comment on what
    your college did. But in the UK, paper tape equipment was reasonably
    common, so were Flexowriters; and Algol [all dialects; amongst others]
    was always printed using lower/mixed case [usually with some stropping
    regime to allow for upper-case environments]. Lower case may not have
    been universal, but it was not your claimed "luxury". Not everyone was
    tied to Fortran and "scientific computing".

    The file systems of my PDP10 machine at least used 'sixbit' encoding,
    so could only do upper-case. The 'radix50' encoding of the PDP11
    linker also restricted things to upper case.

    File systems? Now /that/ was a luxury! I had been computing for
    more than a decade before we got our hands on one. Or an editor. We had
    to cut and splice our paper tapes by hand. As for "'sixbit' encoding",
    perhaps worth noting that the Flexowriter was six-bit plus parity, and
    [eg] Atlas stored eight six-bit characters per 48-bit word. There were
    "shift" characters to switch between cases, as on a typewriter.

    The bitmap fonts of early screens and dot-matrix printers may also
    have been limited to upper case (the first video display of my own
    was).

    Some were; and many early screens/printers had really ugly lower
    case fonts. Unlike, for example, daisy-wheel printers.

    I think the Tektronix 4010 vector display I used was upper case
    only.
    My point was, there were so many restrictions, how did people manage
    to write C code? It was only into the 1980s that I could reliably
    make use of mixed case.

    Well, obviously people with little or no access to peripherals
    that could use lower case were SOL. But Bell Labs had such peripherals
    and were not, in 1970-odd, planning for Unix/C to take over the world.
    So they just used what was available /to them/.
    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Hertel
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Wed Nov 23 10:04:30 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-23 01:03, Bart wrote:
    On 22/11/2022 22:24, Dmitry A. Kazakov wrote:
    On 2022-11-22 21:42, Bart wrote:
    On 22/11/2022 17:47, Dmitry A. Kazakov wrote:
    On 2022-11-22 18:13, Bart wrote:

    Language A can talk to language B via the machine's ABI. Where does C come into it?

    Data types of arguments including padding/gaping of structures,
    calling conventions.
    Actually the Win64 ABI doesn't go into types much at all.

    It is all about types. The funny thing is, it even specifies endianness,
    thanks to the stupidity of C's unions; see how LARGE_INTEGER is defined.

    LARGE_INTEGER is not mentioned in the ABI and is not listed here: https://learn.microsoft.com/en-us/windows/win32/winprog/windows-data-types.

    It is defined in winnt.h

    The ABI really doesn't care about types other than it needs to know how
    many bytes values occupy, and whether they need to go into GP or FLOAT registers. It is quite low-level.

    At this point, I must ask, did you ever use any OS API at all? Or maybe
    you just do not know what a datatype is?

    Or are you going to claim like David Brown that the hardware is like
    that solely due to the need to run C programs?

    Nobody would ever use any hardware if there is no C compiler. So David
    is certainly right.

    You're both certainly wrong. People used hardware before C; they used hardware without C. And I spent a few years building bare computer
    boards that I programmed from scratch, with no C compiler in sight.

    We do not talk about hobby projects.

    That all disappears with 64 bits. With 32-bit DLLs, while there was
    still one DLL, you needed to know the call-convention in use; this
    would have been part of the API. But while there were 100s of
    languages, there were only a handful of call conventions.

    There are as many conventions as languages because complex types and
    closures require techniques unknown to plain C.

    If complex language X wants to talk to complex language Y,

    They just don't. Most more or less professionally designed languages
    provide interfacing to and from C. That limits the things that could be interfaced to a bare minimum. Dynamic languages are slightly better
    because of their general primitivism and because they are actually
    written in C. But dealing with real languages like C++ is almost
    impossible, e.g. handling virtual tables etc. So nobody cares.

    If I export a function F taking an i64 type and returning an i64 type,
    it is thanks to C that that is possible?

    For the machine you are using, the answer is yes.

    Nothing to do with the hardware
    implementing a 64-bit type and making use of that fact.

    Not even with power supply unit and the screws holding the motherboard...
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Wed Nov 23 11:58:33 2022
    From Newsgroup: comp.lang.misc

    On 22/11/2022 17:27, Andy Walker wrote:
    On 22/11/2022 15:29, David Brown wrote:
    Case insensitivity is a mistake, born from the days before computers
    were advanced enough to have small letters as well as capitals.

        I don't believe I have ever used a computer that did not "have
    small letters".  There has been some discussion over in "comp.compilers" recently, but it's basically the difference between punched cards and
    paper tape.  The Flexowriter can be traced back to the 1920s, and its
    most popular form was certainly being used by computers in the 1950s,
    so there really weren't many "days before" to be considered.


    Computers were using 6-bit character encodings well into the 1970's,
    before ASCII and EBCDIC became dominant (at least in the Western World).
    There are lots of older programming languages where all keywords were
    in capitals. When modernising to support small letters, many of these
    choose to be case insensitive - allowing people to write code using
    small letters instead of ugly, shouty capitals, but keeping
    compatibility with existing code.

    Such history is not the only reason for a given programming language to
    be case insensitive, but it is certainly part of it for some languages.

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Wed Nov 23 11:55:58 2022
    From Newsgroup: comp.lang.misc

    On 23/11/2022 09:04, Dmitry A. Kazakov wrote:
    On 2022-11-23 01:03, Bart wrote:

    LARGE_INTEGER is not mentioned in the ABI and is not listed here:
    https://learn.microsoft.com/en-us/windows/win32/winprog/windows-data-types.

    It is defined in winnt.h

    In my 'windows.h' header for my C compilers, it is defined in windows.h.
    So are myriad other types.

    Did you look at my link? It says:

    "The following table contains the following types: character, integer, Boolean, pointer, and handle."

    LARGE_INTEGER is not in there; it's something that is used for a handful
    of functions out of 1000s. Maybe it was an early kind of type for
    dealing with 64 bits and kept for compatibility.


    The ABI really doesn't care about types other than it needs to know
    how many bytes values occupy, and whether they need to go into GP or
    FLOAT registers. It is quite low-level.

    At this point, I must ask, did you ever use any OS API at all? Or maybe
    you just do not know what a datatype is?

    Have you ever looked at the Win64 ABI? Have you ever written compilers
    that generate ABI-compliant code?

    You are basically manipulating dumb chunks of data that are usually 64
    bits wide; you just need to ensure correct alignment, and to know
    whether, if using registers, they need to go into float registers
    instead.

    Apart from that, ABIs really, really don't care what that chunk
    represents. They are mainly concerned with where things go.

    You're both certainly wrong. People used hardware before C; they used
    hardware without C. And I spent a few years building bare computer
    boards that I programmed from scratch, with no C compiler in sight.

    We do not talk about hobby projects.

    Who said they were hobby projects? I was an engineer developing business computers as well as working on dozens of speculative, experimental
    projects. I used some 4 kinds of processors in the designs, and
    investigated many more as possibilities.

    Those were not the days of downloading free compilers off the internet
    and having a super-computer on your own desk to run them on.

    The point is, C did not figure in any of this AT ALL. I doubt I was the
    only one either.

    Why are you two trying to rewrite history?

    That all disappears with 64 bits. With 32-bit DLLs, while there was
    still one DLL, you needed to know the call-convention in use; this
    would have been part of the API. But while there were 100s of
    languages, there were only a handful of call conventions.

    There are as many conventions as languages because complex types and
    closures require techniques unknown to plain C.

    If complex language X wants to talk to complex language Y,

    They just don't. Most more or less professionally designed languages
    provide interfacing to and from C.

    They need to provide an FFI to be able to deal with myriad libraries that
    use a C-style API. I call it C-style because unfortunately there is no
    other name that can describe a type system based around primitive
    machine types.

    Odd, because you find the same machine types used in a dozen other contemporary languages.

    For some reason, people think a type like int32 was popularised by C.
    I'm sure I used such a type in the 1980s without any help from C. So did a
    million other people. But C gets the credit, EVEN THOUGH IT DIDN'T HAVE
    FIXED WIDTH TYPES UNTIL 1999. Go figure.

    That limits the things that could be
    interfaced to a bare minimum. Dynamic languages are slightly better
    because of their general primitivism and because they are actually
    written in C. But dealing with real languages like C++ is almost
    impossible, e.g. handling virtual tables etc. So nobody cares.

    If I export a function F taking an i64 type and returning an i64 type,
    it is thanks to C that that is possible?

    For the machine you are using, the answer is yes.

    If you really believe that, then both you and David Brown are deluded. I
    long suspected that C worshipping was more akin to a religious cult; it
    now seems it's more widespread than I thought with people being
    brainwashed into believing any old rubbish.

    Nothing to do with the hardware implementing a 64-bit type and making
    use of that fact.

    Not even with power supply unit and the screws holding the motherboard...

    Now I know this is actually a wind-up. Fuck C.


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Wed Nov 23 12:21:43 2022
    From Newsgroup: comp.lang.misc

    On 21/11/2022 15:30, David Brown wrote:
    On 19/11/2022 17:01, Bart wrote:

    On 16/11/2022 16:50, David Brown wrote:
    Yes, but for you, a "must-have" list for a programming language would
    be mainly "must be roughly like ancient style C in functionality, but
    with enough change in syntax and appearance so that no one will think
    it is C". If that's what you like, and what pays for your daily bread,
    then that's absolutely fine.

    On 18/11/2022 07:12, David Brown wrote:
    Yes, it is a lot like C. It has a number of changes, some that I think
    are good, some that I think are bad, but basically it is mostly like C.

    The above remarks implies strongly that my systems language is a
    rip-off of C.


    No, it does not.  You can infer what you want from what I write, but I don't see any such implications from my remark.

    I haven't responded before because I thought people could draw their own conclusions from your remarks. But it seems it needs to be pointed out;
    you wrote:

    "must be roughly like ancient style C in functionality, but with enough
    change in syntax and appearance so that no one will think it is C"

    This is a /very/ thinly veiled suggestion that my language was a rip-off
    of C, by copying the language and changing the syntax so that it looked
    like a new language.

      If anyone were to write
    a (relatively) simple structured language for low level work, suitable
    for "direct" compilation to assembly on a reasonable selection of common general-purpose processors, and with the aim of giving a "portable alternative to writing in assembly", then the result will inevitably
    have a good deal in common with C.  There can be plenty of differences
    in the syntax and details, but the "ethos" or "flavour" of the language
    will be similar.

    Note that I have referred to Pascal as C-like in this sense.

    Now you're having a go at Pascal; maybe Pascal was a rip-off of C too
    (even though it predated it).

    I've got a better idea; why not call such languages 'Machine Oriented'?
    That was an actual thing in the 1970s; I even implemented one such language.

    (A summary of my various compilers is here: https://github.com/sal55/langs/blob/master/mycompilers.md)

    So, 'machine oriented' languages are a kind of language that I
    independently discovered, through the demands of my work, were needed
    and useful.

    I used that to my advantage to create in-house tools to give us an edge. Probably the same happened in lots of companies.

    But, unfortunately for everyone, it was C that popularised that kind
    of language, with a crude, laughable implementation that we are now
    stuck with. And that we all now have to kow-tow to. Fuck that.



    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Wed Nov 23 13:40:26 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-23 12:55, Bart wrote:
    On 23/11/2022 09:04, Dmitry A. Kazakov wrote:
    On 2022-11-23 01:03, Bart wrote:

    LARGE_INTEGER is not mentioned in the ABI and is not listed here:
    https://learn.microsoft.com/en-us/windows/win32/winprog/windows-data-types.
    It is defined in winnt.h

    In my 'windows.h' header for my C compilers, it is defined in windows.h.
    So are myriad other types.

    Now you have an opportunity to look at it.

    LARGE_INTEGER is not in there; it's something that is used for a handful
    of functions out of 1000s. Maybe it was an early kind of type for
    dealing with 64 bits and kept for compatibility.

    LARGE_INTEGER is massively used in Windows API.

    Apart from that, ABIs really, really don't care what that chunk
    represents. They are mainly concerned with where things go.

    Try to actually program using Windows API. You will know or at least
    read some MS documentation. Start with something simple:

    https://learn.microsoft.com/en-us/windows/win32/api/winnt/ns-winnt-token_access_information

    (:-))

    Why are you two trying to rewrite history?

    Maybe because we lived it through and saw it happen?...

    If you really believe that, then both you and David Brown are deluded. I long suspected that C worshipping was more akin to a religious cult; it
    now seems it's more widespread than I thought with people being
    brainwashed into believing any old rubbish.

    I do not know about David, but I hate C and consider it a very bad
    language. That does not alter the fact that C influenced and shaped both hardware and software as well as corrupted minds of several generations
    and is keeping doing so.
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Wed Nov 23 14:04:16 2022
    From Newsgroup: comp.lang.misc

    On 23/11/2022 12:40, Dmitry A. Kazakov wrote:
    On 2022-11-23 12:55, Bart wrote:
    On 23/11/2022 09:04, Dmitry A. Kazakov wrote:
    On 2022-11-23 01:03, Bart wrote:

    LARGE_INTEGER is not mentioned in the ABI and is not listed here:
    https://learn.microsoft.com/en-us/windows/win32/winprog/windows-data-types.

    It is defined in winnt.h

    In my 'windows.h' header for my C compilers, it is defined in
    windows.h. So are myriad other types.

    Now you have an opportunity to look at it.

    LARGE_INTEGER is not in there; it's something that is used for a
    handful of functions out of 1000s. Maybe it was an early kind of type
    for dealing with 64 bits and kept for compatibility.

    LARGE_INTEGER is massively used in Windows API.

    Apart from that, ABIs really, really don't care what that chunk
    represents. They are mainly concerned with where things go.

    Try to actually program using Windows API. You will know or at least
    read some MS documentation. Start with something simple:

    https://learn.microsoft.com/en-us/windows/win32/api/winnt/ns-winnt-token_access_information


    I've used the WinAPI since the early 90s, but also kept my interactions
    with it to a minimum: just enough to be able to write GUI applications
    used by 1000s of clients.

    The reason for keeping it minimal is simple: when you use such a thing
    from a private language, then you have to manually create bindings in
    your language for every Type, Struct, Enum, Define, Macro and Function,
    which would have been a huge undertaking as there are many thousands.
    So you translate only what is necessary.
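
    (For a flavour of what one of those hand-made bindings amounts to,
    here is a real function re-declared by hand, in C notation, without
    pulling in windows.h:)

        /* Hand-written import of MessageBoxA: plain integers, pointers
           and a calling convention are all the binding needs to capture
           (HWND becomes a bare pointer, UINT a plain unsigned). */
        __declspec(dllimport) int __stdcall
        MessageBoxA(void *hwnd, const char *text,
                    const char *caption, unsigned type);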

    Why are you two trying to rewrite history?

    Maybe because we lived it through and saw it happen?...

    Are you that much older than me, or did you start as a toddler? I lived
    through it too, and for the first 16 years, had virtually nothing to do
    with C at all. Until I had to use Other People's Software, which in the
    case of WinAPI, was defined as C headers.

    (Now renamed by MS as C++; for some reason, they want to break the
    association with C.)


    If you really believe that, then both you and David Brown are deluded.
    I long suspected that C worshipping was more akin to a religious cult;
    it now seems it's more widespread than I thought with people being
    brainwashed into believing any old rubbish.

    I do not know about David, but I hate C and consider it a very bad
    language.

    Which languages don't you hate?

    That does not alter the fact that C influenced and shaped both
    hardware and software as well as corrupted minds of several generations
    and is keeping doing so.

    How can it have shaped hardware which was devised before it existed? How
    could it have influenced processors like the 8080, Z80, 8086, 68000
    which were devised when the use of C was still limited?

    From somewhere on the internet:

    "What was the leading programming language from 1965 to 1979?"

    "From 1965 to 1980, Fortran kept the 1st place. In 1980, Pascal got
    first and kept on top for five years. The C programming language got its
    main popularity from 1985 until 2001."

    But also from the internet:

    "C (1972) was the very first high-level language"

    It looks like people can just make stuff up. But I can very well believe
    that C started to become dominant in the mid-80s, partly because I was
    there.

    And by the mid-80s, we already had 32-bit microprocessors.


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Wed Nov 23 15:53:03 2022
    From Newsgroup: comp.lang.misc

    On 22/11/2022 18:13, Bart wrote:
    On 22/11/2022 15:29, David Brown wrote:
    On 22/11/2022 13:38, Bart wrote:

    When you looked at the 68000 more closely, it had nearly as much
    non-orthogonality as the 8086. (I was trying at that time to get my
    company to switch to a processor like the 68k.)

    No, it does not.  (Yes, I have looked at it closely, and used 68k
    processors extensively.)

    As a compiler writer?

    As an assembly programmer and C programmer.

    The first thing you noticed is that you have to
    decide whether to use D-registers or A-registers, as they had different characteristics, but the 3-bit register field of instructions could only
    use one or the other.


    Yes, although they share quite a bit in common too. You have 8 data
    registers that are all orthogonal and can be used for any data
    instructions as source and destination, all 32 bit. You have 8 address
    a few kinds of calculations, and as temporary storage) - the only
    special one was A7 that was used for stack operations (as well as being available for all the other addressing modes).

    How does that even begin to compare to the 8086 with its 4 schizophrenic "main" registers that are sometimes 16-bit, sometimes two 8-bit
    registers, with a wide range of different dedicated usages for each
    register? Then you have 4 "index" registers, each with different
    dedicated uses. And 4 "segment" registers, each with different
    dedicated uses.

    Where the 68000 has wide-spread, planned and organised orthogonality and flexibility, the 8086 is a collection of random dedicated bits and pieces.

    That made the 8086 simpler because there was no choice! The registers
    were limited and only one was general purpose.


    A design like the 8086 might feel nicer for some assembly programmers.
    I've worked in assembly on a range of systems - including 8-bit CISC
    devices with only a few dedicated registers, 8-bit RISC processors,
    16-bit devices, and 32-bit devices. Without a doubt, the m68k
    architecture is the nicest I have used for assembly programming. The
    msp430 is also good, but as a 16-bit device it is a bit more limited.
    (It has 16 registers, of which 12 are fully orthogonal.) At the high
    end, PowerPC is extremely orthogonal but quite tedious to program - it's
    hard to track so many registers (it has 32 registers) manually. A small number of dedicated registers is okay if you are only doing very simple
    and limited assembly programming. For a bit more advanced stuff, you
    want more registers. (And for very advanced stuff you don't want to use assembly at all.)

    As you know, I personally have not written a compiler - but I know a
    good deal more about compilers than most programmers. There is not a
    shadow of a doubt that serious compiler writers prefer processors with a reasonable number of orthogonal general-purpose registers to those with
    a small number of specialised registers.

    You can understand this by looking at the compiler market - there are
    many compilers available for orthogonal processors, and multi-target
    compilers commonly support many orthogonal processors. (This is not
    just gcc and clang/llvm - it includes Metrowerks, Green Hills, Wind
    River, etc.) Compilers that target CISC devices with specialised
    registers are typically more dedicated and specialised tools, and often
    very expensive. The only exception is the x86, which is so common that
    lots of compilers support it.

    You can also understand it by looking at the processor market. Real
    CISC with dedicated and specialised registers is dead. In the progress
    of x86 through 32-bit and then 64-bit, the architecture became more and
    more orthogonal - the old registers A, B, C, D, SI, DI, etc., are now no
    more than legacy alternative names for r0, r1, etc., general purpose registers.



    But C probably has influenced modern 64-bit ABIs, even though they
    are supposed to be language-independent.


    What makes you think they are supposed to be language independent?
    What makes you think they are not?  What makes you care?

    Language A can talk to language B via the machine's ABI. Where does C
    come into it?

    Language A can talk to a library or OS component that resides in a DLL,
    via the ABI. The library might have been implemented in C, or assembler,
    or in anything else, but in binary form, is pure machine code anyway.

    What makes /you/ think that such ABIs were invented purely for the use
    of C programs? Do you think the designers of the ABI simply assumed that only programs written in the C language could call into the OS?

    As so often happens, you are making up stuff that you think I think. I
    think you find it easier than reading what I actually write.


    When you download a shared library DLL, do you think they have different versions depending on what language will be using the DLL?

    The types and terms from C are a very convenient way to describe an ABI,

    They're pretty terrible actually. The types involved in SYS V ABI can be expressed as follows in a form that everyone understands and many
    languages use:

        i8 i16 i32 i64 i128
        u8 u16 u32 u64 u128
        f32 f64 f128

    Or they can be expressed in a form that everyone understands, like
    "char", "int", etc., that are defined in the ABI, and that everybody and
    every language /does/ use when integrating between different languages.

    I don't disagree that size-specific types might have been a better
    choice to standardise on - but the world has standardised on C types for
    the purpose. They do a good enough job, and everyone but you is happy
    with them.


    This document (https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf)
    lists the C equivalents as follows (only signed integers shown):

       i8    char, signed char
       i16   short, signed short
       i32   int, signed int
       i64   long, signed long, long long, signed long long
       i128  __int128, signed __int128

    (No use of int8_t etc despite the document dated 2012.)

    That document has no mention anywhere of your personal short names for size-specific types. It has a table stating the type names and sizes.
    Think of it as just a definition of the technical terms used in the
    document, no different from when one processor reference might define
    "word" to mean 16 bits and another define "word" to mean 32 bits.

    Why does it not use <stdint.h> types like "int16_t" ? Even now, in
    2022, people still use C90 standard C - for good reasons or bad reasons.
    And C++ did not standardise the <stdint.h> types until C++11 (though
    every C++ implementation supported them long before that).


    This comes up in APIs too, where it is 100 times more relevant (only
    compiler writers care about the ABI). The C denotations shown here are
    not fit for purpose for language-neutral interfaces.

    (Notice also that 'long' and 'long long' are both 64 bits, and that
    'char' is assumed to be signed. In practice the C denotations would vary across platforms, while those i8-i128 would stay constant, provided only that the machine uses conventional register sizes.)

    So it's more like, such interfaces were developed /despite/ C.

    since it is a language familiar to any programmer who might be
    interested in the details of an ABI.  Such ABI's only cover a
    (relatively) simple common subset of possible interfaces, but do so in
    a way that can be used from any language (with wrappers if needed) and
    can be extended as needed.

    People make ABI's for practical use.  MS made the ABI for Win64 to
    suit their own needs and uses.  AMD and a range of *nix developers
    (both OS and application developers) and compiler developers got
    together to develop the 64-bit x86 ABI used by everyone else, designed
    to suit /their/ needs and uses.

    x86-32 used a number of different ABIs depending on language and
    compiler. x86-64 tends to use one ABI, which is a strong indication that that ABI was intended to work across languages and compilers.


    There are so many x86-32 ABI's that it doesn't have an ABI - Intel never bothered trying to make one for their processors, and MS never bothered
    making one for their OS. (The various *nix systems for x86-32 agreed on
    an ABI.)

    For x86-64, there are two ABI's - the one developed by AMD, *nix
    vendors, and compiler developers based on what would work efficiently
    for real code, and the one MS picked based on... well, I don't know if
    anyone really knows what they based it on. It has some pretty silly differences from the one everyone else had standardised on before they
    even started thinking about it.

    But yes, even in the MS world the ABI situation is vastly better for
    x86-64 than it was for x86-32, and it works across languages (limited by
    the lowest common denominator) and compilers.



    Case insensitive? Or maybe that's just wishful thinking.


    Case insensitivity is a mistake, born from the days before computers
    were advanced enough to have small letters as well as capitals.  It
    leads to ugly inconsistencies, wastes the opportunity to convey useful
    semantic information, and is an absolute nightmare as soon as you
    stray from the simple English-language alphabet.

    Yet Google searches are case-insensitive. How is that possible, given
    that search strings can use Unicode which you say does not define case equivalents across most alphabets?

    Human language is often case insensitive - certainly in speech. So
    natural human language interfaces have to take that into account.
    Programming is not a natural human language.

    How does Google manage case-insensitive searches with text in Unicode in
    many languages? By being /very/ smart. I didn't say it was impossible
    to be case-insensitive beyond plain English alphabet, I said it was an "absolute nightmare". It is. It is done where it has to be done -
    you'll find all major databases have support for doing sorting,
    searching, and case translation for large numbers of languages and
    alphabets. It is a /huge/ undertaking to handle it all. You don't do
    it if it is not important.

    Name just /one/ real programming language that supports case-insensitive identifiers but is not restricted to ASCII. (Let's define "real
    programming language" as a programming language that has its own
    Wikipedia entry.)

    There are countless languages that have case-sensitive Unicode
    identifiers, because that's easy to implement and useful for programmers.


    As are email addresses and domain names.

    The email standards say that email addresses are case-sensitive, but originally encouraged servers to be lenient in how they check them.

    In current email standards, this is referred to as "unwise in practice":

    <https://datatracker.ietf.org/doc/html/rfc6530#section-10.1>


    Domain names are case insensitive if they are in ASCII. For other
    characters, it gets complicated.


    As are most things in everyday life, even now that it is all tied up
    with computers and smartphones and tablets with everything being online.

    (Actually, most people's exposure to case-sensitivity is in online passwords, which is also the worst place to have it, as usually you
    can't see them!)

    Programmers are not "most people". Programs are not "most things in
    everyday life".

    Most people are quite tolerant of spelling mistakes in everyday life -
    do you think programming languages should be too?


    Your objections make no sense at all. Besides which, plenty of case-insensitive languages, file-systems and shell programs and
    applications exist.


    They do exist, yes. That does not make them a good idea.

    I believe Unix's predecessor, Multics, was case-sensitive.  But I
    could be wrong.

    I'm surprised the Unix and C developers even had a terminal that could
    do upper and lower case. I was stuck with upper case for the first year
    or two. File-systems and global linker symbols were also restricted in length and case for a long time, to minimise space.

    Case-sensitivity was a luxury into the 80s.


    Perhaps they were forward-thinking people.
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Wed Nov 23 16:03:19 2022
    From Newsgroup: comp.lang.misc

    On 23/11/2022 15:04, Bart wrote:
    On 23/11/2022 12:40, Dmitry A. Kazakov wrote:


    I do not know about David, but I hate C and consider it a very bad
    language.

    Which languages don't you hate?


    You asked the very question I was thinking!

    He hates functional programming languages, because he didn't like using
    a couple of ancient languages that were not functional programming
    languages but happened to be declarative rather than imperative.

    He hates any use of generics, templates, or other advanced language feature.

    He likes imperative programming, and only imperative programming.

    I think he likes object oriented programming, but it could just be that
    he thinks he knows about it.

    He hates C.


    I know /you/ have a similar set of features that you like and dislike,
    with the overriding rule of "if it is in C, it is bad". And I know the
    only languages you really like are the ones you made yourself (though
    you think ALGOL was not too bad).

    But I can't figure out what Dmitry might like - unless he too has his
    own personal language.
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Wed Nov 23 16:23:44 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-23 16:03, David Brown wrote:

    But I can't figure out what Dmitry might like - unless he too has his
    own personal language.

    No, I am not that megalomaniac. (:-))

    I want stuff useful for software engineering. To me it is a DIY shop. I
    choose techniques I find useful from a long-term perspective and reject
    others. I generally avoid academic exercises, hobby languages, and big-tech/corporate/vendor-lock bullshit. You can guess which of your pet languages falls into which category. (:-))
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Wed Nov 23 16:34:11 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-23 15:53, David Brown wrote:

    Name just /one/ real programming language that supports case-insensitive identifiers but is not restricted to ASCII.  (Let's define "real programming language" as a programming language that has its own
    Wikipedia entry.)

    1. https://en.wikipedia.org/wiki/Ada_(programming_language)

    2. Ada Reference Manual 2.3:

    Two identifiers are considered the same if they consist of the same sequence of characters after applying locale-independent simple case
    folding, as defined by documents referenced in the note in Clause 1 of
    ISO/IEC 10646:2011.
    After applying simple case folding, an identifier shall not be
    identical to a reserved word.
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From James Harris@james.harris.1@gmail.com to comp.lang.misc on Wed Nov 23 15:56:09 2022
    From Newsgroup: comp.lang.misc

    On 20/11/2022 00:35, Bart wrote:
    On 19/11/2022 22:23, James Harris wrote:

    ...

    I remember reading that when AMD wanted to design a 64-bit
    architecture they asked programmers (especially at Microsoft) what
    they wanted. One thing was 'no segmentation'. The C model had
    encouraged programmers to think in terms of flat address spaces, and
    the mainstream segmented approach for x86 was a nightmare that people
    didn't want to repeat.


    I think you're ascribing too much to C. In what way did any other
    languages (Algol, Pascal, Cobol, Fortran, even Ada by then) encourage
    the use of segmented memory?

    I wouldn't say they did. What I would say is that probably none of them
    had C's influence on what programming became. Yes, Cobol was widespread
    for a long time but its design didn't get incorporated into later
    languages. Conversely, much of Algol's approach was adopted by nearly
    all later languages but it itself never achieved the widespread use of
    C. Only C had widespread use as well as strong influence on others. Much
    of the programming community today still thinks in C terms even 50 years
    (!!!) after its release.


    Do you mean because C required the use of different kinds of pointers,
    and people were fed up with that? Whereas other languages hid that
    detail better.

    I am not sure what you mean, but while some languages restricted /where/
    a pointer could point, C allowed a single pointer to point anywhere. (I
    may be wrong but I think the only split that would work is between data
    and code because the language can tell which of the two any reference is.)

    On pointers to data consider the subexpression

    f(p)

    where p is a pointer. Even on a segmented machine that call has no
    concept of whether p is pointing to, say, the stack or one of many data segments. In general, all pointers have to be flat: any pointer can
    point anywhere; that's the C model.


    You might as well say then that Assembly was equally responsible since
    it was even more of a pain to deal with segments!

    There are lots of assembly languages, one per CPU!

    But if you are thinking of Intel then you are right that their
    half-hearted approach gave segmentation a bad name.

    Consider a segment of memory as a simple range from byte 'first' to byte 'last'. With such ranges:

    * all accesses can be range checked automatically
    * no access outside the range would be permitted
    * the range could be extended or shortened
    * the memory used could be moved around as needed

    all without impacting a program which accesses them.



    (Actually, aren't the segments still there on x86? Except they are 4GB
    in size instead of 64KB.)

    Intel could, instead, have said that a 32-bit address was split into a
    range and an offset, such that the CPU in hardware would use the range
    to find 'first' and 'last', then add the offset to the 'first' and use
    the 'last' for range checking.
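
    A sketch of that model (the names and the trap are invented here; this
    is the idea, not any real x86 mode):

        #include <stddef.h>
        #include <stdlib.h>

        struct range { char *first; char *last; };   /* one 'segment' */

        /* Turn (range, offset) into an address, trapping on overrun -
           the check the hardware would apply on every access. */
        char *range_access(struct range *r, size_t offset)
        {
            char *p = r->first + offset;
            if (p > r->last) abort();                /* out of range */
            return p;
        }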
    --
    James Harris


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Wed Nov 23 16:36:37 2022
    From Newsgroup: comp.lang.misc

    On 23/11/2022 15:56, James Harris wrote:
    On 20/11/2022 00:35, Bart wrote:
    On 19/11/2022 22:23, James Harris wrote:

    ...

    I remember reading that when AMD wanted to design a 64-bit
    architecture they asked programmers (especially at Microsoft) what
    they wanted. One thing was 'no segmentation'. The C model had
    encouraged programmers to think in terms of flat address spaces, and
    the mainstream segmented approach for x86 was a nightmare that people
    didn't want to repeat.


    I think you're ascribing too much to C. In what way did any other
    languages (Algol, Pascal, Cobol, Fortran, even Ada by then) encourage
    the use of segmented memory?

    I wouldn't say they did. What I would say is that probably none of them
    had C's influence on what programming became.

    Examples? Since the current crop of languages all have very different
    ideas from C.

    Yes, Cobol was widespread
    for a long time but its design didn't get incorporated into later
    languages. Conversely, much of Algol's approach was adopted by nearly
    all later languages but it itself never achieved the widespread use of
    C. Only C had widespread use as well as strong influence on others. Much
    of the programming community today still thinks in C terms even 50 years (!!!) after its release.

    Is it really C terms, or does that just happen to be the hardware model?

    Yes, C is a kind of lingua franca that lots of people know, but notice
    that people talk about a 'u64' type, something everyone understands, but
    not 'unsigned long long int' (which is not even defined by C to be
    exactly 64 bits), nor even `uint64_t` (which not even C programs
    recognise unless you use stdint.h or inttypes.h!).
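
    The variability is easy to show (the sizes below are the well-known
    LP64 and LLP64 models, nothing exotic):

        #include <stdint.h>

        /* 64-bit Linux (LP64):    long is 64 bits, long long is 64 bits.
           64-bit Windows (LLP64): long is 32 bits, long long is 64 bits.
           So 'long' in an interface description means different things on
           the two systems, while uint64_t (or a home-made u64) does not. */
        uint64_t      always_64;
        unsigned long sometimes_32_sometimes_64;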


    On pointers to data consider the subexpression

      f(p)

    where p is a pointer. Even on a segmented machine that call has no
    concept of whether p is pointing to, say, the stack or one of many data segments. In general, all pointers have to be flat: any pointer can
    point anywhere; that's the C model.

    Why the C model? Do you have any languages in mind with a different model?

    Pointers or references occur in many languages (from that time period,
    Pascal, Ada, Algol68); I don't recall them being restricted in their
    model of memory.

    C, on the other hand, had lots of restrictions:

    * Having FAR and NEAR pointer types

    * Having distinct object and function pointers (you aren't even allowed
    to directly cast between them)

    * Not being able to compare pointers to two different objects

    It is the only one that I recall which exposes the fact that these could
    all exist in different, non-compatible and /non-linear/ regions of memory.

    So your calling linear memory the 'C model' is backwards!

    Consider a segment of memory as a simple range from byte 'first' to byte 'last'. With such ranges:

    * all accesses can be range checked automatically
    * no access outside the range would be permitted
    * the range could be extended or shortened
    * the memory used could be moved around as needed

    all without impacting a program which accesses them.

    See my comments above.


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From James Harris@james.harris.1@gmail.com to comp.lang.misc on Wed Nov 23 16:51:15 2022
    From Newsgroup: comp.lang.misc

    On 23/11/2022 16:36, Bart wrote:
    On 23/11/2022 15:56, James Harris wrote:
    On 20/11/2022 00:35, Bart wrote:
    On 19/11/2022 22:23, James Harris wrote:

    ...

    I remember reading that when AMD wanted to design a 64-bit
    architecture they asked programmers (especially at Microsoft) what
    they wanted. One thing was 'no segmentation'. The C model had
    encouraged programmers to think in terms of flat address spaces, and
    the mainstream segmented approach for x86 was a nightmare that
    people didn't want to repeat.


    I think you're ascribing too much to C. In what way did any other
    languages (Algol, Pascal, Cobol, Fortran, even Ada by then) encourage
    the use of segmented memory?

    I wouldn't say they did. What I would say is that probably none of
    them had C's influence on what programming became.

    Examples? Since the current crop of languages all have very different
    ideas from C.

    Cobol and Algol:


    Yes, Cobol was widespread for a long time but its design didn't get
    incorporated into later languages. Conversely, much of Algol's
    approach was adopted by nearly all later languages but it itself never
    achieved the widespread use of C. Only C had widespread use as well as
    strong influence on others. Much of the programming community today
    still thinks in C terms even 50 years (!!!) after its release.

    Is it really C terms, or does that just happen to be the hardware model?

    Yes, C is a kind of lingua franca that lots of people know, but notice
    that people talk about a 'u64' type, something everyone understands, but
    not 'unsigned long long int' (which is not even defined by C to be
    exactly 64 bits), nor even `uint64_t` (which not even C programs
    recognise unless you use stdint.h or inttypes.h!).

    u64 is just a name.



    On pointers to data consider the subexpression

       f(p)

    where p is a pointer. Even on a segmented machine that call has no
    concept of whether p is pointing to, say, the stack or one of many
    data segments. In general, all pointers have to be flat: any pointer
    can point anywhere; that's the C model.

    Why the C model? Do you have any languages in mind with a different model?

    Yes, the C model is as stated: any pointer can point anywhere. A C
    pointer must be able to point to rodata, stack, and anywhere in the data section.


    Pointers or references occur in many languages (from that time period, Pascal, Ada, Algol68); I don't recall them being restricted in their
    model of memory.

    C, on the other hand, had lots of restrictions:

    * Having FAR and NEAR pointer types

    Are you sure that FAR and NEAR were part of C?
    --
    James Harris


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Wed Nov 23 18:12:09 2022
    From Newsgroup: comp.lang.misc

    On 23/11/2022 16:51, James Harris wrote:
    On 23/11/2022 16:36, Bart wrote:

    I wouldn't say they did. What I would say is that probably none of
    them had C's influence on what programming became.

    Examples? Since the current crop of languages all have very different
    ideas from C.

    Cobol and Algol:

    I was asking about C's influence, but those two languages predated C.

    well as strong influence on others. Much of the programming community
    today still thinks in C terms even 50 years (!!!) after its release.

    Is it really C terms, or does that just happen to be the hardware model?

    Yes, C is a kind of lingua franca that lots of people know, but notice
    that people talk about a 'u64' type, something everyone understands,
    but not 'unsigned long long int' (which is not even defined by C to be
    exactly 64 bits), nor even `uint64_t` (which not even C programs
    recognise unless you use stdint.h or inttypes.h!).

    u64 is just a name.

    So what are the 'C terms' you mentioned? If talking about primitive
    types, for example, u64 or uint64 or whatever are common ways of
    referring to a 64-bit unsigned integer type; unless the discussion is
    specifically about C, you wouldn't use C denotations for it.

    Why the C model? Do you have any languages in mind with a different
    model?

    Yes, the C model is as stated: any pointer can point anywhere. A C
    pointer must be able to point to rodata, stack, and anywhere in the data section.

    And that is different from any other language that had pointers, how?

    Because I'm having trouble in understanding how you can attribute linear memory models to C and only C, when it is the one language that exposes
    the limitations of non-linear memory.



    Pointers or references occur in many languages (from that time period,
    Pascal, Ada, Algol68); I don't recall them being restricted in their
    model of memory.

    C, on the other hand, had lots of restrictions:

    * Having FAR and NEAR pointer types

    Are you sure that FAR and NEAR were part of C?

    They were part of implementations of it for 8086. There were actually
    'near', 'far' and 'huge'. I think a 'far' pointer had a fixed segment part.

    (I can't remember my own arrangements for 8086, but I certainly don't
    remember having multiple pointer types. Probably I used only one type, a 32-bit segment+offset value, of which the offset part was usually
    normalised to be in the range 0 to 15.)
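
    (For reference, the real-mode arithmetic being described, sketched out;
    the function names are just labels:)

        /* 8086 real mode: physical address = segment * 16 + offset,
           giving a 20-bit address from two 16-bit halves. */
        unsigned long physical(unsigned seg, unsigned off)
        {
            return (unsigned long)seg * 16u + off;
        }

        /* 'Normalising' a far pointer moves as much as possible into the
           segment, leaving an offset of 0..15, so that pointers to the
           same byte compare equal. */
        void normalise(unsigned *seg, unsigned *off)
        {
            *seg += *off >> 4;
            *off &= 0xFu;
        }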


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Andy Walker@anw@cuboid.co.uk to comp.lang.misc on Wed Nov 23 18:31:48 2022
    From Newsgroup: comp.lang.misc

    On 23/11/2022 16:36, Bart wrote:
    C, on the other hand, had lots of restrictions:
    * Having FAR and NEAR pointer types

    Never part of C. [Non-standard extension in some implementations.]

    * Having distinct object and function pointers (you aren't even
    allowed to directly cast between them)

    Correctly so. In a proper HLL, type punning should, in general,
    be forbidden. There could be a case made out for casting between two structures that are identical apart from the names of the components,
    otherwise it is a recipe for hard-to-find bugs.

    * Not being able to compare pointers to two different objects

    Of course you can. Such pointers compare as unequal. You can
    also reliably subtract pointers in some cases. What more can you
    reasonably expect?

    It is the only one that I recall which exposes the fact that these
    could all exist in different, non-compatible and /non-linear/ regions
    of memory.

    "Exposes"? How? Where? Examples? [In either K&R C or standard
    C, of course, not in some dialect implementation.]
    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Marpurg
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Wed Nov 23 19:38:28 2022
    From Newsgroup: comp.lang.misc

    On 23/11/2022 16:23, Dmitry A. Kazakov wrote:
    On 2022-11-23 16:03, David Brown wrote:

    But I can't figure out what Dmitry might like - unless he too has his
    own personal language.

    No, I am not that megalomaniac. (:-))

    I want stuff useful for software engineering. To me it is a DIY shop. I choose techniques I find useful in long term perspective and reject
    other. I generally avoid academic exercises, hobby languages, big-tech/corporate/vendor-lock bullshit. You can guess which of your pet languages falls into which category. (:-))


    The languages I mostly use are C, C++ and Python, depending on the task
    and the target system. (And while I enjoy working with each of these,
    and see their advantages in particular situations, I also appreciate
    that they are not good in other cases and they all have features I
    dislike.) Your criteria would not rule out any of these - I too
    generally avoid languages with vendor lock-in, and small developer or
    user communities. Academic exercise languages are of course no use
    unless you are doing academic exercises.

    Your criteria would also not rule out several key functional programming languages, including Haskell, OCaml, and Scala.

    It would rule out C#, VB, Bart's languages, and possibly Java. Pascal
    is in theory open and standard, but in practice it is disjoint with vendor-specific variants. (There's FreePascal, which has no lock-in.)

    You would still have Ada, D, Erlang, Fortran, Forth, JavaScript, Lua,
    Rust, Modula-2, Perl, and PHP.

    I think that covers most of the big languages (I assume you also don't
    like ones that have very small user bases).



    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Wed Nov 23 19:38:29 2022
    From Newsgroup: comp.lang.misc

    On 23/11/2022 17:36, Bart wrote:
    On 23/11/2022 15:56, James Harris wrote:

    On pointers to data consider the subexpression

       f(p)

    where p is a pointer. Even on a segmented machine that call has no
    concept of whether p is pointing to, say, the stack or one of many
    data segments. In general, all pointers have to be flat: any pointer
    can point anywhere; that's the C model.

    Why the C model? Do you have any languages in mind with a different model?

    Languages that don't have pointers can have their data organised any way
    they want. (I don't have any particular languages in mind.)


    Pointers or references occur in many languages (from that time period, Pascal, Ada, Algol68); I don't recall them being restricted in their
    model of memory.

    C, on the other, had lots of restrictions:

    * Having FAR and NEAR pointer types

    The C language has never had any such thing. A few implementations of C
    (such as some DOS compilers, as well as compilers for some brain-dead
    8-bit CISC microcontrollers like the 8051, or microcontrollers whose
    memory has outgrown their 16-bit address space) have had such features
    as extensions used to let people write efficient and powerful code, at
    the expense of being non-portable. Other languages had various methods
    of dealing with the same kind of issues - some had pointers that were
    fixed as always "near" pointers (for efficient code but limited memory
    size), some were fixed as always "far" pointers, some used compiler
    flags to choose the "memory model", some used compiler directives, some supported both kinds in some manner. It is the same today for some
    embedded processor toolchains.


    * Having distinct object and function pointers (you aren't even allowed
    to directly cast between them)


    That seems both normal and sensible. Neither Pascal nor Ada will let
    you mix object and function pointers or convert between them, at least
    not without a great deal more effort than in C.

    * Not being able to compare pointers to two different objects

    I don't know other languages' standards well enough to be sure. I doubt
    if you do either. (I don't even know if their standards consider such details.)

    However, I did find this: <https://en.wikibooks.org/wiki/Pascal_Programming/Pointers> which says
    that ordering comparison operators like < and >= do not apply to
    pointers - only "=" and "<>" are allowed (as they are in C for any
    pointers, regardless of where they point). There are many variants of
    Pascal, however, with very significant differences - some may happily
    allow pointer comparison. (Just as some C implementations may happily
    allow any pointer comparisons.)



    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Wed Nov 23 18:53:33 2022
    From Newsgroup: comp.lang.misc

    On 23/11/2022 18:31, Andy Walker wrote:
    On 23/11/2022 16:36, Bart wrote:
    C, on the other, had lots of restrictions:
    * Having FAR and NEAR pointer types

        Never part of C.  [Non-standard extension in some implementations.]

    * Having distinct object and function pointers (you aren't even
    allowed to directly cast between them)

        Correctly so.  In a proper HLL, type punning should, in general,
    be forbidden.  There could be a case made out for casting between two structures that are identical apart from the names of the components, otherwise it is a recipe for hard-to-find bugs.

    * Not being able to compare pointers to two different objects

        Of course you can.  Such pointers compare as unequal.  You can also reliably subtract pointers in some cases.  What more can you
    reasonably expect?

    C doesn't allow relative compares, or subtracting operators. Or rather,
    it will make those operations implementation defined or UB, simply
    because pointers could in fact refer to incompatible memory regions.

    This goes against the suggestion that C is more conducive to linear
    memory than any other languages.


    It is the only one that I recall which exposes the fact that these
    could all exist in different, non-compatible and /non-linear/ regions
    of memory.

        "Exposes"?  How?  Where?  Examples?  [In either K&R C or standard
    C, of course, not in some dialect implementation.]


    What is being claimed is that it is largely C that has been responsible
    for linear memory layouts in hardware.

    What I've been trying to establish is how exactly it managed that; what
    did other languages with pointers do differently?

    So far no one has managed to answer that; it's just a C love-fest.

    All I know is that when there IS segmented memory, C will make you
    aware of it. On IBM PC x86 machines, if you were writing in C, you
    still had to grapple with those kinds of pointers.

    Actually I've lost track of what is being claimed, and now I'm highly sceptical. So far:

    * C was responsible for the success of hardware that predated C

    * C influenced the design of microprocessors in the mid to late 70s,
    when C was in its early days

    * C single-handedly was responsible for us having linear memory today (and nothing to do with machines having more address bits)

    * C was responsible for us having power-of-two word sizes now.

    This despite C not being mainstream until the mid-80s when there were
    already machines with power-of-two word sizes and linear memory.




    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Andy Walker@anw@cuboid.co.uk to comp.lang.misc on Wed Nov 23 20:24:56 2022
    From Newsgroup: comp.lang.misc

    On 23/11/2022 18:53, Bart wrote:
    * Not being able to compare pointers to two different objects
    Of course you can.  Such pointers compare as unequal.  You can also
    reliably subtract pointers in some cases.  What more can you
    reasonably expect?
    C doesn't allow relative compares, or subtracting operators. Or
    rather, it will make those operations implementation defined or UB,
    simply because pointers could in fact refer to incompatible memory
    regions.

    N2478 [other standards are available], section 6.5.6.10:

    " When two pointers are subtracted, both shall point to elements of
    " the same array object, or one past the last element of the array
    " object; the result is the difference of the subscripts of the two
    " array elements. The size of the result is implementation-defined,
    " and its type (a signed integer type) is ptrdiff_t defined in the
    " <stddef.h> header. If the result is not representable in an object
    " of that type, the behavior is undefined. "

    So the behaviour is undefined only if the subtraction overflows, and is implementation defined only to the extent of what size of signed integer
    the implementation prefers. It's difficult to see what other behaviour
    could reasonably be specified in the Standard.

    Section 6.5.8.6:

    " When two pointers are compared, the result depends on the relative
    " locations in the address space of the objects pointed to. If two
    " pointers to object types both point to the same object, or both
    " point one past the last element of the same array object, they
    " compare equal. If the objects pointed to are members of the same
    " aggregate object, pointers to structure members declared later
    " compare greater than pointers to members declared earlier in the
    " structure, and pointers to array elements with larger subscript
    " values compare greater than pointers to elements of the same
    " array with lower subscript values. All pointers to members of the
    " same union object compare equal. If the expression P points to an
    " element of an array object and the expression Q points to the last
    " element of the same array object, the pointer expression Q+1
    " compares greater than P. In all other cases, the behavior is
    " undefined. "

    Well, it's rather verbose, but it all seems common sense to me. No
    mention anywhere of "incompatible memory regions", so I suspect that
    you're making it up based on what you think C is like rather than how
    it is defined in reality.
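    To make the two quoted rules concrete, here is a minimal C sketch (the
    array and variable names are just illustrative); only the relational
    comparison of pointers into different objects falls outside what the
    Standard defines:

        #include <stddef.h>
        #include <stdio.h>

        int main(void) {
            int a[10], b;
            int *p = &a[2], *q = &a[7];

            ptrdiff_t d = q - p;   /* defined: same array object, d == 5 */
            printf("%td\n", d);

            if (p < q)             /* defined: ordering within one array */
                puts("p comes before q");

            if (p == &b)           /* equality with an unrelated object is defined (false here) */
                puts("never printed");

            /* if (p < &b) ...        relational compare of unrelated objects: undefined */
            return 0;
        }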

    May be worth noting that [eg] Algol defines only the relations
    "is" and "isn't" between pointers; C is at least more "helpful" than
    that. But that is largely driven by C's use of pointers in arrays.

    This goes against the suggestion that C is more conducive to linear
    memory than any other languages.

    Well, /I/ have made no such suggestion, and don't really even
    know what that is claimed to mean. Most HLLs specifically hide the
    layout and structure of memory from ordinary programmers; no doubt a
    Good Thing.

    It is the only one that I recall which exposes the fact that these
    could all exist in different, non-compatible and /non-linear/ regions
    of memory.
         "Exposes"?  How?  Where?  Examples?  [In either K&R C or standard
    C, of course, not in some dialect implementation.]
    What is being claimed is that it is largely C that has been
    responsible for linear memory layouts in hardware.

    Not a claim I have ever made, nor even seen before. There is remarkably little in [standard] C that relates to memory layouts. So
    I take it that you have no examples of this claimed exposure?

    FTAOD, I'm not a great fan of C. But that's another matter.
    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Marpurg
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Wed Nov 23 21:25:59 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-23 19:38, David Brown wrote:
    On 23/11/2022 16:23, Dmitry A. Kazakov wrote:
    On 2022-11-23 16:03, David Brown wrote:

    But I can't figure out what Dmitry might like - unless he too has his
    own personal language.

    No, I am not that megalomaniac. (:-))

    I want stuff useful for software engineering. To me it is a DIY shop.
    I choose techniques I find useful in long term perspective and reject
    other. I generally avoid academic exercises, hobby languages,
    big-tech/corporate/vendor-lock bullshit. You can guess which of your
    pet languages falls into which category. (:-))


    The languages I mostly use are C, C++ and Python, depending on the task
    and the target system.  (And while I enjoy working with each of these,
    and see their advantages in particular situations, I also appreciate
    that they are not good in other cases and they all have features I dislike.)  Your criteria would not rule out any of these - I too
    generally avoid languages with vendor lock-in, and small developer or
    user communities.  Academic exercise languages are of course no use
    unless you are doing academic exercises.

    Your criteria would also not rule out several key functional programming languages, including Haskell, OCaml, and Scala.

    It would rule out C#, VB, Bart's languages, and possibly Java.  Pascal
    is in theory open and standard, but in practice it is disjoint with vendor-specific variants.  (There's FreePascal, which has no lock-in.)

    You would still have Ada, D, Erlang, Fortran, Forth, JavaScript, Lua,
    Rust, Modula-2, Perl, and PHP.

    I think that covers most of the big languages (I assume you also don't
    like ones that have very small user bases).

    Narrow user base is no reason to reject a language. However there is a
    danger that the language might go extinct.

    To me most important is the language toolbox:

    - modules, separate compilation, late bindings
    - abstract data types
    - generic programming (i.e. in terms of sets of types)
    - formal verification, contracts, correctness proofs
    - some object representation control
    - interfacing to C and thus system and other libraries
    - high level concurrency support
    - program readability, reasonable syntax AKA don't be APL (:-))
    - standard library abstracting the underlying OS
    - some type introspection

    Things not important or ones I actively avoid are

    - lambdas
    - relational algebra
    - patterns
    - recursive types
    - closures
    - dynamic/duck/weak/no-typing
    - macros/preprocessor/templates/generics
    - standard container library (like std or boost)
    - standard GUI library
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Wed Nov 23 21:33:57 2022
    From Newsgroup: comp.lang.misc

    On 23/11/2022 16:34, Dmitry A. Kazakov wrote:
    On 2022-11-23 15:53, David Brown wrote:

    Name just /one/ real programming language that supports
    case-insensitive identifiers but is not restricted to ASCII.  (Let's
    define "real programming language" as a programming language that has
    its own Wikipedia entry.)

    1. https://en.wikipedia.org/wiki/Ada_(programming_language)

    2. Ada Reference Manual 2.3:

      Two identifiers are considered the same if they consist of the same sequence of characters after applying locale-independent simple case folding, as defined by documents referenced in the note in Clause 1 of ISO/IEC 10646:2011.
      After applying simple case folding, an identifier shall not be
    identical to a reserved word.


    OK, that's one!


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Wed Nov 23 21:46:57 2022
    From Newsgroup: comp.lang.misc

    On 23/11/2022 19:53, Bart wrote:
    On 23/11/2022 18:31, Andy Walker wrote:
    On 23/11/2022 16:36, Bart wrote:
    C, on the other, had lots of restrictions:
    * Having FAR and NEAR pointer types

         Never part of C.  [Non-standard extension in some implementations.]

    * Having distinct object and function pointers (you aren't even
    allowed to directly cast between them)

    Correctly so.  In a proper HLL, type punning should, in general, be forbidden.  There could be a case made out for casting between two
    structures that are identical apart from the names of the components,
    otherwise it is a recipe for hard-to-find bugs.

    * Not being able to compare pointers to two different objects

    Of course you can.  Such pointers compare as unequal.  You can also reliably subtract pointers in some cases.  What more can you
    reasonably expect?

    C doesn't allow relative compares, or subtracting operators. Or rather,
    it will make those operations implementation defined or UB, simply
    because pointers could in fact refer to incompatible memory regions.


    It doesn't allow them because they don't make sense. When would you
    want to subtract two unrelated pointers? What would it give you? At
    what time might you want to compare two unrelated pointers for anything
    other than equality? (Note that an implementation can support what it
    likes in implementation-specific code, such as the guts of its standard library.)

    This goes against the suggestion that C is more conducive to linear
    memory than any other languages.


    Linear memory makes it easier to implement C. You can also have C for a target that does not have linear memory. There is no contradiction there.


    It is the only one that I recall which exposes the fact that these
    could all exist in different, non-compatible and /non-linear/ regions
    of memory.

         "Exposes"?  How?  Where?  Examples?  [In either K&R C or standard
    C, of course, not in some dialect implementation.]


    What is being claimed is that it is largely C that has been responsible
    for linear memory layouts in hardware.


    Who claimed that? All that has been said is that C is influential in
    the way modern computer and processor architecture has developed - with different vague expressions of how influential it has been, and some
    points and cpu features that fit well for C but are not necessarily
    ideal for some other languages.



    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Wed Nov 23 21:59:27 2022
    From Newsgroup: comp.lang.misc

    On 23/11/2022 21:25, Dmitry A. Kazakov wrote:
    On 2022-11-23 19:38, David Brown wrote:
    On 23/11/2022 16:23, Dmitry A. Kazakov wrote:
    On 2022-11-23 16:03, David Brown wrote:

    But I can't figure out what Dmitry might like - unless he too has
    his own personal language.

    No, I am not that megalomaniac. (:-))

    I want stuff useful for software engineering. To me it is a DIY shop.
    I choose techniques I find useful in long term perspective and reject
    other. I generally avoid academic exercises, hobby languages,
    big-tech/corporate/vendor-lock bullshit. You can guess which of your
    pet languages falls into which category. (:-))


    The languages I mostly use are C, C++ and Python, depending on the
    task and the target system.  (And while I enjoy working with each of
    these, and see their advantages in particular situations, I also
    appreciate that they are not good in other cases and they all have
    features I dislike.)  Your criteria would not rule out any of these -
    I too generally avoid languages with vendor lock-in, and small
    developer or user communities.  Academic exercise languages are of
    course no use unless you are doing academic exercises.

    Your criteria would also not rule out several key functional
    programming languages, including Haskell, OCaml, and Scala.

    It would rule out C#, VB, Bart's languages, and possibly Java.  Pascal
    is in theory open and standard, but in practice it is disjoint with
    vendor-specific variants.  (There's FreePascal, which has no lock-in.)

    You would still have Ada, D, Erlang, Fortran, Forth, JavaScript, Lua,
    Rust, Modula-2, Perl, and PHP.

    I think that covers most of the big languages (I assume you also don't
    like ones that have very small user bases).

    Narrow user base is no reason to reject a language. However there is a danger that the language might go extinct.


    Yes. (I said "very small user bases".)

    To me most important is the language toolbox:

    - modules, separate compilation, late bindings
    - abstract data types
    - generic programming (i.e. in terms of sets of types)

    I thought you didn't like that?

    - formal verification, contracts, correctness proofs

    Yet you reject functional programming? You can do a bit of formal
    proofs with SPARK, but people doing serious formal correctness proofs
    tend to prefer pure functional programming languages.

    - some object representation control
    - interfacing to C and thus system and other libraries
    - high level concurrency support
    - program readability, reasonable syntax AKA don't be APL (:-))
    - standard library abstracting the underlying OS
    - some type introspection

    I think Haskell would fit for all of that. And C++ is as good as Ada.


    Things not important or ones I actively avoid are

    - lambdas
    - relational algebra
    - patterns
    - recursive types
    - closures
    - dynamic/duck/weak/no-typing
    - macros/preprocessor/templates/generics

    So generics are important to you, but you actively avoid them?

    - standard container library (like std or boost)
    - standard GUI library



    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Wed Nov 23 21:01:58 2022
    From Newsgroup: comp.lang.misc

    On 23/11/2022 20:24, Andy Walker wrote:
    On 23/11/2022 18:53, Bart wrote:
    * Not being able to compare pointers to two different objects
    Of course you can.  Such pointers compare as unequal.  You can also
    reliably subtract pointers in some cases.  What more can you
    reasonably expect?
    C doesn't allow relative compares, or subtracting operators. Or
    rather, it will make those operations implementation defined or UB,
    simply because pointers could in fact refer to incompatible memory
    regions.

    N2478 [other standards are available], section 6.5.6.10:

      " When two pointers are subtracted, both shall point to elements of
      " the same array object, or one past the last element of the array
      " object; the result is the difference of the subscripts of the two
      " array elements. The size of the result is implementation-defined,
      " and its type (a signed integer type) is ptrdiff_t defined in the
      " <stddef.h> header. If the result is not representable in an object
      " of that type, the behavior is undefined. "

    So the behaviour is undefined only if the subtraction overflows, and is implementation defined only to the extent of what size of signed integer
    the implementation prefers.  It's difficult to see what other behaviour could reasonably be specified in the Standard.

    Section 6.5.8.6:

      " When two pointers are compared, the result depends on the relative
      " locations in the address space of the objects pointed to. If two
      " pointers to object types both point to the same object, or both
      " point one past the last element of the same array object, they
      " compare equal. If the objects pointed to are members of the same
      " aggregate object, pointers to structure members declared later
      " compare greater than pointers to members declared earlier in the
      " structure, and pointers to array elements with larger subscript
      " values compare greater than pointers to elements of the same
      " array with lower subscript values. All pointers to members of the
      " same union object compare equal. If the expression P points to an
      " element of an array object and the expression Q points to the last
      " element of the same array object, the pointer expression Q+1
      " compares greater than P. In all other cases, the behavior is
      " undefined. "

    Well, it's rather verbose, but it all seems common sense to me.

    So, basically, everything is fully defined when both pointers refer to
    the same objects, which is what I said, more briefly.

      No
    mention anywhere of "incompatible memory regions", so I suspect that
    you're making it up based on what you think C is like rather than how
    it is defined in reality.

    This part of it is implied by those restrictions, when you think of the reasons why they might apply.

    Except C applies those restrictions whether or not pointers to those
    memory regions would be compatible.

    In fact, you /can/ have distinct kinds of memory, though more common on
    older hardware, or in microcontrollers.

    But this is all by the by; my quest was trying to figure out what it
    was about how C (and only C) does pointers that made architecture
    designers decide they needed linear memory rather than segmented.

    My opinion is that C had very little if anything to do with it; it's
    just natural evolution when you move from 16 address bits to 32 and then
    64, and we already had 32 address bits on the 80386 in the mid-80s, and
    on lesser-known machines before that.

    C wasn't mature enough for that much influence, and I don't believe
    languages, or any one in particular, were that influential.

    Computers had to be made to continue running the dozens of other
    languages also in use, and most would equally benefit from the same developments: speed, memory size, word sizes, more registers. Linear
    memory is a consequence of having a big enough word size to address all
    the code and data for a task.

    The stuff about C being solely responsible may just have been a wind-up.


        May be worth noting that [eg] Algol defines only the relations
    "is" and "isn't" between pointers;  C is at least more "helpful" than that.  But that is largely driven by C's use of pointers in arrays.

    This goes against the suggestion that C is more conducive to linear
    memory than any other languages.

        Well, /I/ have made no such suggestion, and don't really even
    know what that is claimed to mean.  Most HLLs specifically hide the
    layout and structure of memory from ordinary programmers;  no doubt a
    Good Thing.

    At least 3 people in the group were claiming all sorts of unlikely
    things of C.

    It is the only one that I recall which exposes the fact that these
    could all exist in different, non-compatible and /non-linear/ regions
    of memory.
         "Exposes"?  How?  Where?  Examples?  [In either K&R C or standard
    C, of course, not in some dialect implementation.]
    What is being claimed is that it is largely C that has been
    responsible for linear memory layouts in hardware.

        Not a claim I have ever made, nor even seen before.  There is remarkably little in [standard] C that relates to memory layouts.  So
    I take it that you have no examples of this claimed exposure?

    It's not me making the claims.


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Wed Nov 23 22:33:58 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-23 21:59, David Brown wrote:
    On 23/11/2022 21:25, Dmitry A. Kazakov wrote:
    On 2022-11-23 19:38, David Brown wrote:
    On 23/11/2022 16:23, Dmitry A. Kazakov wrote:
    On 2022-11-23 16:03, David Brown wrote:

    But I can't figure out what Dmitry might like - unless he too has
    his own personal language.

    No, I am not that megalomaniac. (:-))

    I want stuff useful for software engineering. To me it is a DIY
    shop. I choose techniques I find useful in long term perspective and
    reject other. I generally avoid academic exercises, hobby languages,
    big-tech/corporate/vendor-lock bullshit. You can guess which of your
    pet languages falls into which category. (:-))


    The languages I mostly use are C, C++ and Python, depending on the
    task and the target system.  (And while I enjoy working with each of
    these, and see their advantages in particular situations, I also
    appreciate that they are not good in other cases and they all have
    features I dislike.)  Your criteria would not rule out any of these -
    I too generally avoid languages with vendor lock-in, and small
    developer or user communities.  Academic exercise languages are of
    course no use unless you are doing academic exercises.

    Your criteria would also not rule out several key functional
    programming languages, including Haskell, OCaml, and Scala.

    It would rule out C#, VB, Bart's languages, and possibly Java.
    Pascal is in theory open and standard, but in practice it is disjoint
    with vendor-specific variants.  (There's FreePascal, which has no
    lock-in.)

    You would still have Ada, D, Erlang, Fortran, Forth, JavaScript, Lua,
    Rust, Modula-2, Perl, and PHP.

    I think that covers most of the big languages (I assume you also
    don't like ones that have very small user bases).

    Narrow user base is no reason to reject a language. However there is a
    danger that the language might go extinct.


    Yes.  (I said "very small user bases".)

    To me most important is the language toolbox:

    - modules, separate compilation, late bindings
    - abstract data types
    - generic programming (i.e. in terms of sets of types)

    I thought you didn't like that?

    Generic programming can be achieved without parametric/static
    polymorphism. It is only one, inferior, way of constructing sets of
    types. I prefer dynamic polymorphism.

    - formal verification, contracts, correctness proofs

    Yet you reject functional programming?

    Sure.

    You can do a bit of formal
    proofs with SPARK, but people doing serious formal correctness proofs
    tend to prefer pure functional programming languages.

    It is about priorities. I need to prove the correctness of parts of
    real-life programs. The most difficult problem with proofs is that you
    must bend the program to make it provable, potentially introducing
    bugs, e.g. in contracts. I'd like to see partial and conditional proofs
    rather than absolutist approaches.

    - some object representation control
    - interfacing to C and thus system and other libraries
    - high level concurrency support
    - program readability, reasonable syntax AKA don't be APL (:-))
    - standard library abstracting the underlying OS
    - some type introspection

    I think Haskell would fit for all of that.  And C++ is as good as Ada.

    C++ has problems with high-level concurrency and massive syntax issues.
    Looking at modern C++ code I am not sure whether it is plain text or
    Base64-encoded. Early C++ was in some respects an admirable language
    before Stepanov poured poison into the ear of poor Bjarne... (:-))

    Things not important or ones I actively avoid are

    - lambdas
    - relational algebra
    - patterns
    - recursive types
    - closures
    - dynamic/duck/weak/no-typing
    - macros/preprocessor/templates/generics

    So generics are important to you, but you actively avoid them?

    See above. Generic programming /= programming using generics.
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From James Harris@james.harris.1@gmail.com to comp.lang.misc on Wed Nov 23 22:20:36 2022
    From Newsgroup: comp.lang.misc

    On 22/11/2022 17:13, Bart wrote:

    ...

    That made the 8086 simpler because there was no choice! The registers
    were limited and only one was general purpose.

    One was /almost/ general purpose! :-)
    --
    James Harris


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From James Harris@james.harris.1@gmail.com to comp.lang.misc on Wed Nov 23 22:38:42 2022
    From Newsgroup: comp.lang.misc

    On 23/11/2022 18:12, Bart wrote:
    On 23/11/2022 16:51, James Harris wrote:
    On 23/11/2022 16:36, Bart wrote:

    I wouldn't say they did. What I would say is that probably none of
    them had C's influence on what programming became.

    Examples? Since the current crop of languages all have very different
    ideas from C.

    Cobol and Algol:

    I was asking about C's influence, but those two languages predated C.

    You mean the languages which C's design has influenced? Many such as
    Java, C#, C++, Objective-C, D, Go, etc.


    well as strong influence on others. Much of the programming
    community today still thinks in C terms even 50 years (!!!) after
    its release.

    Is it really C terms, or does that just happen to be the hardware model?
    Yes, C is a kind of lingua franca that lots of people know, but
    notice that people talk about a 'u64' type, something everyone
    understands, but not 'unsigned long long int' (which is not even
    defined by C to be exactly 64 bits), nor even `uint64_t` (which not
    even C programs recognise unless you use stdint.h or inttypes.h!).

    u64 is just a name.

    So what are the 'C terms' you mentioned?

    I mean things like pointers, memory as an array of bytes, etc.

    Since if talking about
    primitive types for example, u64 or uint64 or whatever are common ways
    of referring to a 64-bit unsigned integer type, then unless the
    discussion is specifically about C, you wouldn't use C denotations for it.

    Why the C model? Do you have any languages in mind with a different
    model?

    Yes, the C model is as stated: any pointer can point anywhere. A C
    pointer must be able to point to rodata, stack, and anywhere in the
    data section.

    And that is different from any other language that had pointers, how?

    Because I'm having trouble in understanding how you can attribute linear memory models to C and only C, when it is the one language that exposes
    the limitations of non-linear memory.

    Earlier in this discussion you seemed to understand that I was saying C
    had a primary influence. When did that change to C being the only influence?




    Pointers or references occur in many languages (from that time
    period, Pascal, Ada, Algol68); I don't recall them being restricted
    in their model of memory.

    C, on the other, had lots of restrictions:

    * Having FAR and NEAR pointer types

    Are you sure that FAR and NEAR were part of C?

    They were part of implementations of it for 8086. There were actually 'near', 'far' and 'huge'. I think a 'far' pointer had a fixed segment part.

    Then they weren't part of C. Perhaps their inclusion in certain /implementations/ backs up my assertion that programmers viewed C's
    pointers as unsegmented.
    --
    James Harris


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Wed Nov 23 22:42:51 2022
    From Newsgroup: comp.lang.misc

    On 23/11/2022 14:53, David Brown wrote:
    On 22/11/2022 18:13, Bart wrote:

    As a compiler writer?

    As an assembly programmer and C programmer.

    The first thing you noticed is that you have to decide whether to use
    D-registers or A-registers, as they had different characteristics, but
    the 3-bit register field of instructions could only use one or the other.


    Yes, although they share quite a bit in common too.  You have 8 data registers that are all orthogonal and can be used for any data
    instructions as source and designation, all 32 bit.  You have 8 address registers that could all be used for all kinds of addressing modes (and
    a few kinds of calculations, and as temporary storage) - the only
    special one was A7 that was used for stack operations (as well as being available for all the other addressing modes).

    How does that even begin to compare to the 8086 with its 4 schizophrenic "main" registers that are sometimes 16-bit, sometimes two 8-bit
    registers, with a wide range of different dedicated usages for each register?  Then you have 4 "index" registers, each with different
    dedicated uses.  And 4 "segment" registers, each with different
    dedicated uses.

    Where the 68000 has wide-spread, planned and organised orthogonality and flexibility, the 8086 is a collection of random dedicated bits and pieces.

    It's too big an effort to dig into now, many decades on, what gave me
    that impression about the 68K. But the big one /is/ those two kinds of registers.

    Current machines already have GP and float registers to make things more difficult, but here there are separate registers for integers - and
    integers that might be used as memory addresses.

    So you would have instructions that operated on one set but not the
    other. You'd need to decide whether functions returned values in D0 or A0.

    Glancing at the instruction set now, you have ADD which adds to
    everything except A regs; ADDA which /only/ adds to AREGS.

    ADDI which adds immed values to everything except AREGS, and ADDQ which
    adds small values (1..8) to everything /including/ AREGS.

    Similarly with ANDI, which works for every dest except AREGS; but unlike
    ADD there is no ANDA version for AREGS (so if you were playing with tagged pointers and needed to clear the bottom bits before using the value as an address, it gets awkward).
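    For what it's worth, here is a minimal C sketch of that tagged-pointer
    idea (the constants and helper names are invented for illustration): the
    low bits carry a tag and must be masked off before the value can be used
    as an address, which is exactly the AND-then-dereference pattern that is
    awkward when the AND cannot target an address register.

        #include <stdint.h>

        enum { TAG_BITS = 3, TAG_MASK = (1 << TAG_BITS) - 1 };

        /* Recover the real address by clearing the low tag bits. */
        static void *untag(uintptr_t tagged) {
            return (void *)(tagged & ~(uintptr_t)TAG_MASK);
        }

        /* Extract the tag carried in the low bits. */
        static unsigned tag_of(uintptr_t tagged) {
            return (unsigned)(tagged & TAG_MASK);
        }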

    With a compiler, you had to make decisions on whether it's best to start evaluating in DREGS or AREGS and then move across, if it involved mixed operations that were only available for one set.

    Note that the 80386 processor, which apparently first appeared in 1985, removed many of the restrictions of the 8086, also widening the
    registers but not adding any more. Further, these 32-bit additions and
    new address modes were available while running in 16-bit mode within a
    16-bit application.


    You can also understand it by looking at the processor market.  Real
    CISC with dedicated and specialised registers is dead.  In the progress
    of x86 through 32-bit and then 64-bit, the architecture became more and
    more orthogonal - the old registers A, B, C, D, SI, DI, etc., are now no more than legacy alternative names for r0, r1, etc., general purpose registers.

    What became completely unorthogonal on x86 is the register naming. It's
    a zoo of mismatched names of mixed lengths. The mapping is also bizarre,
    with the stack pointer somewhere below the middle.

    (However that is easy to fix as I can use my own register names and
    ordering as well as the official names. My 64-bit registers are called
    D0 to D15, with D15 (aka Dstack) being the stack pointer.)

         i8 i16 i32 i64 i128
         u8 u16 u32 u64 u128
         f32 f64 f128

    Or they can be expressed in a form that everyone understands, like
    "char", "int", etc., that are defined in the ABI, and that everybody and every language /does/ use when integrating between different languages.

    Sorry, but C typenames using C syntax are NOT good enough, not for cross-language use. You don't really want to see 'int long unsigned
    long'; you want 'uint64' or 'u64'.

    Even C decided that `int` and `char` were not good enough by adding
    types like `int32_t` and ... sorry, I can't even tell you what `char`
    corresponds to. That is how rubbish C type designations are.
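    For reference, this is all it takes in C itself, which is presumably why
    the short aliases are so widespread (the alias names below are just the
    common convention, not anything standard):

        #include <stdint.h>

        typedef uint64_t u64;   /* the short spelling seen on forums and in many codebases */
        typedef int32_t  i32;

        u64 total = 0;          /* no need to ever write 'unsigned long long int' */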



    That document has no mention anywhere of your personal short names for size-specific types.

    It uses names of its own like 'unsigned eightbyte' which unequivocally describes the type. However you will see `u64` all over forums; you will
    never see `unsigned eightbyte`, and never 'unsigned long long int'
    outside of C forums or actual C code.

      It has a table stating the type names and sizes.

    Yes, that helps too. What doesn't help is just using 'long'.

    Think of it as just a definition of the technical terms used in the document, no different from when one processor reference might define
    "word" to mean 16 bits and another define "word" to mean 32 bits.

    So defining a dozen variations on 'unsigned long long int` is better
    than just using `u64` or `uint64`?

    That must be the reason why a dozen different languages have all adopted
    those C designations because they work so well and are so succinct and unambiguous. Oh, hang on...



    How does Google manage case-insensitive searches with text in Unicode in many languages?  By being /very/ smart.  I didn't say it was impossible
    to be case-insensitive beyond plain English alphabet, I said it was an "absolute nightmare".  It is.


    No, it really isn't. Now you're making things up. You don't need to be
    very smart at all, it's actually very easy.



    It is done where it has to be done -
    you'll find all major databases have support for doing sorting,
    searching, and case translation for large numbers of languages and alphabets.  It is a /huge/ undertaking to handle it all.  You don't do
    it if it is not important.

    Think about the 'absolute nightmare' if /everything/ was case sensitive
    and a database has 1000 variations of people called 'David Brown'.
    (There could be 130,000 with my name.)

    Now imagine talking over the phone to someone, they create an account in
    the name you give them, but they use or omit capitalisation you weren't
    aware of. How would you log in?


    Name just /one/ real programming language that supports case-insensitive identifiers

    I'm not talking about Unicode identifiers. I wouldn't go there because
    there are too many issues. For a start, which of the 1.1 million
    characters should be allowed at the beginning, and which within an
    identifier?

    but is not restricted to ASCII.  (Let's define "real
    programming language" as a programming language that has its own
    Wikipedia entry.)

    There are countless languages that have case-sensitive Unicode
    identifiers, because that's easy to implement and useful for programmers.

    And also a nightmare, since there are probably 20 distinct characters
    that share the same glyph as 'A'.

    Adding Unicode to identifiers is too easy to do badly.



    Domain names are case insensitive if they are in ASCII.

    Because?

    For other
    characters, it gets complicated.

    So, the same situation with language keywords and commands in CLIs.

    But hey, think of the advantage of having Sort and sorT working in decreasing/increasing order; no need to specify that separately. Plus
    you have 14 more variations to apply meanings to. Isn't this the point
    of being case-sensitive?

    Because if it isn't, then I don't get it. On Windows, I can type 'sort'
    or `SORT`, it doesn't matter. I don't even need to look at the screen or
    have to double-check caps lock.

    BTW my languages (2 HLLs and one assembler) use case-insensitive
    identifiers and keywords, but allow case-sensitive names when they are
    occasionally needed, mainly for working with FFIs.

    It really isn't hard at all.
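    For the ASCII-only case the core of it really is just a few lines; a
    minimal sketch, with the function name invented for illustration:

        #include <ctype.h>

        /* Case-insensitive comparison of two ASCII identifiers, as a
           case-insensitive language front end might do it. Unicode
           identifiers are a different and much harder problem. */
        static int ident_equal(const char *a, const char *b) {
            while (*a && *b) {
                if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
                    return 0;
                ++a;
                ++b;
            }
            return *a == *b;    /* both names must end at the same point */
        }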

    Programmers are not "most people".  Programs are not "most things in everyday life".

    Most people are quite tolerant of spelling mistakes in everyday life -
    do you think programming languages should be too?

    Using case is not a spelling mistake; it's a style. In my languages,
    someone can write 'int', 'Int' or 'INT' according to preference.

    Or they can use CamelCase if they like that, but someone importing such
    a function can just write camelcase if they hate the style.

    I use upper case when writing debug code so that I can instantly
    identify it.


    They do exist, yes.  That does not make them a good idea.

    Yes, it does. How do you explain to somebody why using exact case is absolutely essential, when it clearly shouldn't matter?


    Look: I create my own languages, yes? And I could have chosen at any
    time to make them case sensitive, yes?

    So why do you think I would choose to make like an 'absolute nightmare'
    for myself?

    The reason is obvious: because case insensitivity just works better and
    is far more useful.

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From James Harris@james.harris.1@gmail.com to comp.lang.misc on Wed Nov 23 22:45:15 2022
    From Newsgroup: comp.lang.misc

    On 23/11/2022 21:01, Bart wrote:

    ...

    The stuff about C being solely responsible may just have been a wind-up.

    Maybe I've missed it but I've not noticed anyone claim C was solely responsible.

    ...

    It's not me making the claims.

    It might be you making up the claims. ;-)
    --
    James Harris


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Wed Nov 23 22:47:57 2022
    From Newsgroup: comp.lang.misc

    On 23/11/2022 21:33, Dmitry A. Kazakov wrote:
    On 2022-11-23 21:59, David Brown wrote:

    I think Haskell would fit for all of that.  And C++ is as good as Ada.

    C++ has problems with high-level concurrency and massive syntax issues. Looking at modern C++ code I am not sure whether it is plain text or Base64-encoded.

    I've long had that problem in C, which, partly thanks to
    case-sensitivity forcing people to write correctly cased names
    (like macros), often looks like a sea of MIME-encoded text.

    I can't hack it. C++, I just wouldn't bother with it; 90% seems to be pointless punctuation.


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Wed Nov 23 23:25:15 2022
    From Newsgroup: comp.lang.misc

    On 22/11/2022 15:29, David Brown wrote:
    The 8086 was horrible in all sorts of ways.  Comparing a 68000 with an
    8086 is like comparing a Jaguar E-type with a bathtub with wheels.  And
    for the actual chip used in the first PC, an 8088, half the wheels were removed.

    You've forgotten the 68008.
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Wed Nov 23 23:31:27 2022
    From Newsgroup: comp.lang.misc

    On 23/11/2022 22:45, James Harris wrote:
    On 23/11/2022 21:01, Bart wrote:

    ...

    The stuff about C being solely responsible may just have been a wind-up.

    Maybe I've missed it but I've not noticed anyone claim C was solely responsible.

    ...

    It's not me making the claims.

    It might be you making up the claims. ;-)

    Here's a selection of quotes from the thread (BC is me):

    JH:
    I even suspect that the CPUs we use today are also as they are in part
    due to C. It has been that influential.

    BC:

    However, what aspects of today's processors do you think owe anything
    to C?

    JH:

    Things like the 8-bit byte, 2's complement, and the lack of segmentation.


    JH:
    I remember reading that when AMD wanted to design a 64-bit architecture
    they asked programmers (especially at Microsoft) what they wanted. One
    thing was 'no segmentation'. The C model had encouraged programmers to
    think in terms of flat address spaces, and the mainstream segmented
    approach for x86 was a nightmare that people didn't want to repeat.


    DB:
    C is /massively/ influential to the general purpose CPUs we have today.
    The prime requirement for almost any CPU design is that you should be
    able to use it efficiently for C. After all, the great majority of
    software is written in languages that, at their core, are similar to C


    BC:
    Two of the first machines I used were PDP10 and PDP11, developed by
    DEC in the 1960s, both using linear memory spaces. While the former was word-based, the PDP11 was byte-addressable, just like the IBM 360 also
    from the 1960s.


    DB:
    C was developed originally for these processors, and was a major reason
    for their long-term success.

    BC:
    Of the PDP10 and IBM 360? Designed in the 1960s and discontinued in
    1983 and 1979 respectively. C only came out in a first version in 1972.


    DB:
    I was thinking primarily of the PDP11, which was the first real target
    for C (assuming I have my history correct - this was around the time I
    was born). And by "long-term success" of these systems, I mean their successors that were built in the same style - such as the VAX.



    DB:
    C was a /massive/ influence on processor evolution and the current standardisation of general-purpose processors as systems for running C
    code efficiently. But it was not the only influence, or the sole reason
    for current processor design.


    BC:
    Or are you going to claim like David Brown that the hardware is like
    that solely due to the need to run C programs?

    DAK:
    Nobody would ever use any hardware if there is no C compiler. So
    David is certainly right.


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Thu Nov 24 11:15:31 2022
    From Newsgroup: comp.lang.misc

    On 23/11/2022 22:01, Bart wrote:
    On 23/11/2022 20:24, Andy Walker wrote:
    On 23/11/2022 18:53, Bart wrote:
    * Not being able to compare pointers to two different objects
    Of course you can.  Such pointers compare as unequal.  You can also
    reliably subtract pointers in some cases.  What more can you
    reasonably expect?
    C doesn't allow relative compares, or subtracting operators. Or
    rather, it will make those operations implementation defined or UB,
    simply because pointers could in fact refer to incompatible memory
    regions.

    N2478 [other standards are available], section 6.5.6.10:

       " When two pointers are subtracted, both shall point to elements of
       " the same array object, or one past the last element of the array
       " object; the result is the difference of the subscripts of the two
       " array elements. The size of the result is implementation-defined,
       " and its type (a signed integer type) is ptrdiff_t defined in the
       " <stddef.h> header. If the result is not representable in an object
       " of that type, the behavior is undefined. "

    So the behaviour is undefined only if the subtraction overflows, and is
    implementation defined only to the extent of what size of signed integer
    the implementation prefers.  It's difficult to see what other behaviour
    could reasonably be specified in the Standard.

    Section 6.5.8.6:

       " When two pointers are compared, the result depends on the relative
       " locations in the address space of the objects pointed to. If two
       " pointers to object types both point to the same object, or both
       " point one past the last element of the same array object, they
       " compare equal. If the objects pointed to are members of the same
       " aggregate object, pointers to structure members declared later
       " compare greater than pointers to members declared earlier in the
       " structure, and pointers to array elements with larger subscript
       " values compare greater than pointers to elements of the same
       " array with lower subscript values. All pointers to members of the
       " same union object compare equal. If the expression P points to an
       " element of an array object and the expression Q points to the last
       " element of the same array object, the pointer expression Q+1
       " compares greater than P. In all other cases, the behavior is
       " undefined. "

    Well, it's rather verbose, but it all seems common sense to me.

    So, basically, everything is fully defined when both pointers refer to
    the same objects, which is what I said, more briefly.

    Roughly, yes. Some things - such as equality comparisons - are defined
    even if they are in different objects.


      No
    mention anywhere of "incompatible memory regions", so I suspect that
    you're making it up based on what you think C is like rather than how
    it is defined in reality.

    This part of it is implied by those restrictions, when you think of the reasons why they might apply.

    You can't read between the lines like that and guess about what the
    standard does /not/ say. Standards documents are a bit special - they
    are concerned solely about what is explicitly discussed in the document,
    and do not imply anything at all about things that are not covered.

    So the standards don't say C can be used on systems with disjoint memory regions, or systems with different address spaces. Nor do they say that
    it /can't/ be used on them. Nor do they imply that such systems exist,
    or don't exist.

    They simply say that the C language says what happens when you
    order-compare (or subtract) pointers that are within the same object,
    because that's the only case the C language cares about.


    Except C applies those restrictions whether or not pointers to those memory regions would be compatible.

    The C language does not care about that. And certainly the C language standards don't "apply" anything.

    A given C compiler can choose to do whatever it likes if you try to
    order compare two pointers that are not part of the same object -
    including assuming that it does not happen, and including doing a simple naïve comparison of the pointer values as though they were integers.
    (And that comparison could be done signed or unsigned, which may result
    in a different answer from what you might be expecting.)
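    As a minimal sketch of what such a naive treatment might effectively
    amount to (the function names are invented, and portable code cannot
    rely on either of them behaving meaningfully for unrelated pointers):

        #include <stdint.h>

        static int naive_less_unsigned(const void *p, const void *q) {
            return (uintptr_t)p < (uintptr_t)q;   /* raw addresses compared as unsigned */
        }

        static int naive_less_signed(const void *p, const void *q) {
            return (intptr_t)p < (intptr_t)q;     /* may disagree near the sign boundary */
        }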


    In fact, you /can/ have distinct kinds of memory, though more common on older hardware, or in microcontrollers.


    Or newer hardware with NUMA, remote memory on PCI buses (yes, that's a
    thing now), virtual memory, disjoint memory setups for access control or memory access debugging, or...

    Usually, this is still all viewed as one logical address space, despite
    being different physical spaces.

    But this is all by the by; my quest was trying to figure out what it
    was about how C (and only C) does pointers that made architecture
    designers decide they needed linear memory rather than segmented.


    Linear memory in one address space is much more convenient for
    toolchains to handle. It is especially efficient when you have a
    low-level compiled language, because it means you can use a simple naïve implementation in most cases. It means you can implement a function like:

        int get_data(const int * p) {
            return *p;
        }

    with just a simple read instruction.

    If you have a language that has no programmer-visible pointers then you
    cannot write such functions in the library - there is no need to have a
    way to implement it. Or if you have advanced pointers/references that
    handle access control and bounds checking, it's a minor matter to add
    checking for different address spaces too. Then your same system and
    same code will work for, say, remote objects accessed over a network.

    Or if you have a language built around communicating actors, then there
    is never a need to access memory outside your local data, so memory can
    be as disjoint as you like.

    There are many different computing models other than the von Neumann
    setup that has become ubiquitous as a result of the popularity of von
    Neumann programming languages. C was not the first such language, nor
    is it the only one, but it is far and away the biggest, most popular and
    most vital to the computing world we have now.

    This places great limits on the efficiency, cost, size, and power of
    computing - von Neumann architectures and programs for them do not scale
    well in directions other than single-thread speed. Buying a 32-core cpu
    does not make your C code run 32 times faster - but it /would/ make your
    Occam code run 32 times faster, or your well-written Haskell code, or
    your Erlang code, or code written in languages that were not targeting
    this simple linear model.


    The stuff about C being solely responsible may just have been a wind-up.


    You always seem to end up with such conclusions when people disagree
    with you.


         May be worth noting that [eg] Algol defines only the relations
    "is" and "isn't" between pointers;  C is at least more "helpful" than
    that.  But that is largely driven by C's use of pointers in arrays.

    This goes against the suggestion that C is more conducive to linear
    memory than any other languages.

         Well, /I/ have made no such suggestion, and don't really even
    know what that is claimed to mean.  Most HLLs specifically hide the
    layout and structure of memory from ordinary programmers;  no doubt a
    Good Thing.

    At least 3 people in the group were claiming all sorts of unlikely
    things of C.


    What you say all three of these people claimed is, I suspect, based strongly on your misinterpretation or your exaggerated interpretation of what people
    actually wrote. You have a long habit of assuming everything about C
    (or rather, your skewed idea of C) is evil in every way, as well as interpreting every comment other people make about C as some kind of
    fan-boy obsessive love for the language.

    It is the only one that I recall which exposes the fact that these
    could all exist in different, non-compatible and /non-linear/ regions of memory.
         "Exposes"?  How?  Where?  Examples?  [In either K&R C or standard
    C, of course, not in some dialect implementation.]
    What is being claimed is that it is largely C that has been
    responsible for linear memory layouts in hardware.

         Not a claim I have ever made, nor even seen before.  There is
    remarkably little in [standard] C that relates to memory layouts.  So
    I take it that you have no examples of this claimed exposure?

    It's not me making the claims.


    It is you making the claims about what others say.

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Thu Nov 24 16:03:11 2022
    From Newsgroup: comp.lang.misc

    On 23/11/2022 23:42, Bart wrote:
    On 23/11/2022 14:53, David Brown wrote:
    On 22/11/2022 18:13, Bart wrote:

    As a compiler writer?

    As an assembly programmer and C programmer.

    The first thing you noticed is that you have to decide whether to use
    D-registers or A-registers, as they had different characteristics,
    but the 3-bit register field of instructions could only use one or
    the other.


    Yes, although they share quite a bit in common too.  You have 8 data
    registers that are all orthogonal and can be used for any data
    instructions as source and destination, all 32-bit.  You have 8
    address registers that could all be used for all kinds of addressing
    modes (and a few kinds of calculations, and as temporary storage) -
    the only special one was A7 that was used for stack operations (as
    well as being available for all the other addressing modes).

    How does that even begin to compare to the 8086 with its 4
    schizophrenic "main" registers that are sometimes 16-bit, sometimes
    two 8-bit registers, with a wide range of different dedicated usages
    for each register?  Then you have 4 "index" registers, each with
    different dedicated uses.  And 4 "segment" registers, each with
    different dedicated uses.

    Where the 68000 has wide-spread, planned and organised orthogonality
    and flexibility, the 8086 is a collection of random dedicated bits and
    pieces.

    It's too big an effort to dig into now, many decades on, what gave me that impression about the 68K. But the big one /is/ those two kinds of registers.

    Certainly the distinction between A and D registers is a
    non-orthogonality. But it is just /one/ case, and it really isn't so
    big in practice since you have many identical registers in each class.
    It's akin to the difference between GP registers and FP registers you
    mention below.

    (I am not disagreeing with the remark that the 68000 is not entirely orthogonal - I am disagreeing with the claim that it is at a similar
    level to the 8086. And I am jogging happy memories of old processor architectures!)


    Current machines already have GP and float registers to make things more difficult, but here there are separate registers for integers - and
    integers that might be used as memory addresses.

    Note that there are very good reasons for separating integer and FP
    registers, in terms of hardware implementations. It might be nice to
    have them merged from the programmer's viewpoint, but it is not worth
    the hardware cost. (A similar logic is behind the separate A and D
    registers on the m68k architecture.)


    So you would have instructions that operated on one set but not the
    other. You'd need to decide whether functions returned values in D0 or A0.

    Glancing at the instruction set now, you have ADD which adds to
    everything except A regs; ADDA which /only/ adds to AREGS.

    ADDI which adds immed values to everything except AREGS, and ADDQ which
    adds small values (1..8) to everything /including/ AREGS.

    Similarly with ANDI, which works for every dest except AREGS, but there
    is no version for AREGS (so if you were playing with tagged pointers and needed to clear the bottom bits before using them as an address, it gets awkward).
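
    (For anyone unfamiliar with the idiom, here is a rough sketch in C rather
    than 68k assembly, with invented names: given at least 4-byte alignment,
    the two low bits of a pointer are free to hold a tag, and must be masked
    off again before the value can be used as an address.)

        #include <stdint.h>

        #define TAG_MASK ((uintptr_t)0x3)   /* assumes at least 4-byte alignment */

        /* Store a small tag in the low bits of an aligned pointer. */
        uintptr_t make_tagged(void *p, unsigned tag) {
            return (uintptr_t)p | ((uintptr_t)tag & TAG_MASK);
        }

        /* Clear the tag bits so the value can be used as an address again. */
        void *strip_tag(uintptr_t tagged) {
            return (void *)(tagged & ~TAG_MASK);
        }

        /* Read back the tag stored in the low bits. */
        unsigned get_tag(uintptr_t tagged) {
            return (unsigned)(tagged & TAG_MASK);
        }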

    With a compiler, you had to make decisions on whether it's best to start evaluating in DREGS or AREGS and then move across, if it involved mixed operations that were only available for one set.


    Yes, there is no doubt that it is a non-orthogonality. But it is a
    minor matter in practice. A simple compiler can decide "pointers go in
    A registers, everything else goes in D registers". That's it - done.
    (To get the maximum efficiency, you'll need more complex register allocations.)
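
    A toy sketch of that rule (invented names, nothing more than the policy
    just described expressed as C):

        typedef enum { KIND_POINTER, KIND_OTHER } value_kind;
        typedef enum { REGCLASS_A, REGCLASS_D } reg_class;

        /* The entire allocation policy of the simple compiler described:
           pointers get A registers, everything else gets D registers. */
        static reg_class pick_register_class(value_kind kind)
        {
            return (kind == KIND_POINTER) ? REGCLASS_A : REGCLASS_D;
        }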

    In comparison to the 8086, it is /nothing/.

    Note that the 80386 processor, which apparently first appeared in 1985, removed many of the restrictions of the 8086, also widening the
    registers but not adding any more. Further, these 32-bit additions and
    new address modes were available while running in 16-bit mode within a 16-bit application.


    Yes, the 80386 helped and removed some of the specialisations of the
    8086. There were still plenty left, and still plenty of cases where the
    use of particular registers was more efficient than others. The x86
    world improved gradually in this way, so that the current x86-64 ISA is
    vastly better than the 8086.


    You can also understand it by looking at the processor market.  Real
    CISC with dedicated and specialised registers is dead.  In the
    progress of x86 through 32-bit and then 64-bit, the architecture
    became more and more orthogonal - the old registers A, B, C, D, SI,
    DI, etc., are now no more than legacy alternative names for r0, r1,
    etc., general purpose registers.

    What became completely unorthogonal on x86 is the register naming. It's
    a zoo of mismatched names of mixed lengths. The mapping is also bizarre, with the stack pointer somewhere below the middle.

    Yes.


    (However that is easy to fix as I can use my own register names and
    ordering as well as the official names. My 64-bit registers are called
    D0 to D15, with D15 (aka Dstack) being the stack pointer.)


    I think it is not uncommon to refer to the registers in x86-64 as r0 to
    r15 - that is, the A, B, C, D, DI, SI, SP, and BP registers are renamed,
    with the extra 8 registers of x86-64 having never had any other name.

         i8 i16 i32 i64 i128
         u8 u16 u32 u64 u128
         f32 f64 f128

    Or they can be expressed in a form that everyone understands, like
    "char", "int", etc., that are defined in the ABI, and that everybody
    and every language /does/ use when integrating between different
    languages.

    Sorry, but C typenames using C syntax are NOT good enough, not for cross-language use. You don't really want to see 'int long unsigned
    long'; you want 'uint64' or 'u64'.

    Sorry, but they /are/ good enough for everyone else. The world can't be expected to change to suit /you/ - it is you who must adapt. (But you
    don't have to like it!)


    Even C decided that `int` and `char` were not good enough by adding types
    like `int32_t` and ... sorry I can't even tell you what `char`
    corresponds to. That is how rubbish C type designations are.

    These type names were /added/ to the language - they did not replace the existing types. People use different type names for different purposes.
    I write "int" when "int" is appropriate, and "int32_t" when "int32_t"
    is appropriate - it's not a case of one set of names being "better" than
    the other.
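
    A small illustration (the functions are invented for the example) of the
    two sets of names doing different jobs:

        #include <stddef.h>
        #include <stdint.h>

        /* Exact width matters here: decode a 32-bit little-endian value
           from a byte buffer, so uint32_t is the right choice. */
        uint32_t read_le32(const unsigned char *buf)
        {
            return  (uint32_t)buf[0]
                 | ((uint32_t)buf[1] << 8)
                 | ((uint32_t)buf[2] << 16)
                 | ((uint32_t)buf[3] << 24);
        }

        /* Plain "int" is fine for an ordinary count. */
        int count_nonzero(const uint32_t *values, size_t n)
        {
            int count = 0;
            for (size_t i = 0; i < n; i++)
                if (values[i] != 0)
                    count++;
            return count;
        }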


    That document has no mention anywhere of your personal short names for
    size-specific types.

    It uses names of its own like 'unsigned eightbyte' which unequivocally describes the type. However you will see `u64` all over forums; you will never see `unsigned eightbyte`, and never 'unsigned long long int'
    outside of C forums or actual C code.


    Standards documents are not everyday language. (I think I've mentioned
    that before.) In everyday use, people tend to use shorter and more
    convenient names - though they vary how they balance shortness with explicitness, and that varies by context. (Programs are not everyday
    language either.)

      It has a table stating the type names and sizes.

    Yes, that helps too. What doesn't help is just using 'long'.


    It works fine. You read the table of definitions, see that in this
    document the word "long" means "64-bit integer".

    Standards documents define all kinds of terms and expressions in a
    particular manner that applies only within the document (or other formal
    texts that refer to the document).


    Think of it as just a definition of the technical terms used in the
    document, no different from when one processor reference might define
    "word" to mean 16 bits and another define "word" to mean 32 bits.

    So defining a dozen variations on `unsigned long long int` is better
    than just using `u64` or `uint64`?


    Are you confusing the flexible syntax of C with the technical terms in
    the ABI document? It sounds a lot like it.

    That must be the reason why a dozen different languages have all adopted those C designations because they work so well and are so succinct and unambiguous. Oh, hang on...


    As I said - it might have been better to have names with explicit sizes.
    That does not mean that the C terms are not good enough for the job, regardless of what language you use. And since in the solid majority of
    cases where ABI's are used between two languages, at least one of the languages is C, it seems sensible to use C terms. Why should Rust users
    be forced to learn Go's type names in order to use a C library - when
    they need to know the C names anyway? Why should Go users need to learn
    the names used by Rust?

    Think of C like English - the spelling in English is horrible and inconsistent, and is different depending on which side of the pond you
    live. Yet it works extremely well for international communication, and
    lets Bulgarians talk to Koreans. Perhaps Esperanto would be a
    hypothetically better language, but it's not going to happen in practice.



    How does Google manage case-insensitive searches with text in Unicode
    in many languages?  By being /very/ smart.  I didn't say it was
    impossible to be case-insensitive beyond plain English alphabet, I
    said it was an "absolute nightmare".  It is.


    No, it really isn't. Now you're making things up. You don't need to be
    very smart at all, it's actually very easy.


    You can do Unicode case-folding based on a table from the Unicode
    people. But I think you'll find Google's search engine is a touch more advanced than that.



    It is done where it has to be done - you'll find all major databases
    have support for doing sorting, searching, and case translation for
    large numbers of languages and alphabets.  It is a /huge/ undertaking
    to handle it all.  You don't do it if it is not important.

    Think about the 'absolute nightmare' if /everything/ was case sensitive
    and a database has 1000 variations of people called 'David Brown'.
    (There could be 130,000 with my name.)

    Now imagine talking over the phone to someone: they create an account in
    the name you give them, but they use or omit capitalisation you weren't aware of. How would you log in?


    I have no idea what you are going on about.

    Some things in life need to be flexible and deal with variations such as spelling differences, capitalisation differences, etc.

    Other things can and should be precise and unambiguous.

    So when programming, you say /exactly/ what you mean. You don't write
    "call fooo a few times" and expect it to be obvious to the computer how
    many is "a few" and that you really meant "foo". You write "for i = 1
    to 5 do foo()", or whatever the language in question expects.

    I expect a compiler to demand precision from the code I write.
    Accepting muddled letter case is setting the standard too low IMHO - I
    want a complaint if I write "foo" one place and "Foo" another. Of
    course I can live with such weaknesses in a language, and set higher
    standards for my own code than the language allows - I do that for all
    coding, as I think most people do. But I see no advantage in having
    weak identifier matching in a programming language - it adds nothing to
    code readability, allows poor coders to make more of a mess, and
    generally allows a totally unnecessary inconsistency.

    I see /no/ advantages in being able to write "foo" when defining an
    identifier and "Foo" or "FOO" when using it. It is utterly pointless.
    (It is a different matter to say that if you have defined an identifier
    "foo" then you may not define a separate one written "Foo", disallowing identifiers that differ only in case. I could appreciate wanting that
    as a feature.)


    And I cannot see any contradiction between wanting case sensitivity when writing code and not caring about case when chatting to a human on the phone.


    Name just /one/ real programming language that supports
    case-insensitive identifiers

    I'm not talking about Unicode identifiers. I wouldn't go there because
    there are too many issues. For a start, which of the 1.1 million
    characters should be allowed at the beginning, and which within an identifier?

    but is not restricted to ASCII.  (Let's define "real programming
    language" as a programming language that has its own Wikipedia entry.)

    There are countless languages that have case-sensitive Unicode
    identifiers, because that's easy to implement and useful for programmers.

    And also a nightmare, since there are probably 20 distinct characters
    that share the same glyph as 'A'.

    Adding Unicode to identifiers is too easy to do badly.


    It is another case of a feature that can be used or abused. You pick
    the balance you want, accepting that either choice is a trade-off.



    Domain names are case insensitive if they are in ASCII.

    Because?

    Who cares? They are domain names, not program code.


    For other characters, it gets complicated.

    So, the same situation with language keywords and commands in CLIs.


    No, these are case sensitive - except for systems that haven't grown up
    since lower case letters were invented.

    But hey, think of the advantage of having Sort and sorT working in decreasing/increasing order; no need to specify that separately. Plus
    you have 14 more variations to apply meanings to. Isn't this the point
    of being case-sensitive?

    Because if it isn't, then I don't get it. On Windows, I can type 'sort'
    or `SORT`, it doesn't matter. I don't even need to look at the screen or have to double-check caps lock.

    BTW my languages (2 HLLs and one assembler) use case-insensitive
    identifiers and keywords, but allow case-sensitive names when they are sometimes needed, mainly for working with FFIs.

    It really isn't hard at all.

    It really isn't hard to write "sort".


    Programmers are not "most people".  Programs are not "most things in
    everyday life".

    Most people are quite tolerant of spelling mistakes in everyday life -
    do you think programming languages should be too?

    Using case is not a spelling mistake; it's a style. In my languages,
    someone can write 'int', 'Int' or 'INT' according to preference.


    No, it is a mess.

    But of course, it is not a problem in your language - personal
    preferences are entirely consistent there.

    And in serious languages that are case-insensitive, such as Ada, people
    stick strongly to the conventions and write their identifiers with
    consistent casing. Which leaves everyone wondering what the point is of
    being case-insensitive - it's just a historical mistake that can't be
    changed.

    Or they can use CamelCase if they like that, but someone importing such
    a function can just write camelcase if they hate the style.

    I use upper case when writing debug code so that I can instantly
    identify it.


    They do exist, yes.  That does not make them a good idea.

    Yes, it does. How do you explain to somebody why using exact case is absolutely essential, when it clearly shouldn't matter?


    Look: I create my own languages, yes? And I could have chosen at any
    time to make them case sensitive, yes?

    So why do you think I would choose to make life an 'absolute nightmare'
    for myself?


    You didn't use Unicode - which is where the implementation gets hard.
    There's no difficulty in implementing case insensitive keywords and identifiers in plain ASCII - there's just no advantage to it (unless you
    call being able to write an inconsistent mess an advantage).
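
    Indeed, for plain ASCII the whole job is a few lines of C (a throwaway
    sketch, nothing clever):

        #include <ctype.h>

        /* Case-insensitive comparison of two ASCII identifiers.
           Returns nonzero if they name the same identifier. */
        int ident_equal_nocase(const char *a, const char *b)
        {
            while (*a && *b) {
                if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
                    return 0;
                a++;
                b++;
            }
            return *a == '\0' && *b == '\0';
        }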

    The reason is obvious: because case insensitivity just works better and
    is far more useful.


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Thu Nov 24 16:07:35 2022
    From Newsgroup: comp.lang.misc

    On 24/11/2022 00:25, Bart wrote:
    On 22/11/2022 15:29, David Brown wrote:
    The 8086 was horrible in all sorts of ways.  Comparing a 68000 with an
    8086 is like comparing a Jaguar E-type with a bathtub with wheels.
    And for the actual chip used in the first PC, an 8088, half the wheels
    were removed.

    You've forgotten the 68008.

    That would be a Jaguar with half the engine cylinders removed. Still
    very comfortable and stylish, but a good deal less power :-)

    And current AMD and Intel chips are bathtubs with wheels and rocket engines!

    (The great thing about car analogies is how much they can be abused...)




    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Thu Nov 24 17:22:13 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-24 16:07, David Brown wrote:

    And current AMD and Intel chips are bathtubs with wheels and rocket
    engines!

    Judging by how they screech ... there is no wheels. (:-))

    (The great thing about car analogies is how much they can be abused...)

    OT. I remember the story of a guy who installed the rocket engine on a,
    I believe, VW Beetle and honorably died riding his invention. Death by
    Rock and Roll, as Pretty Reckless sung...
    /OT
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Thu Nov 24 16:55:05 2022
    From Newsgroup: comp.lang.misc

    On 24/11/2022 15:03, David Brown wrote:
    On 23/11/2022 23:42, Bart wrote:

    Using case is not a spelling mistake; it's a style. In my languages,
    someone can write 'int', 'Int' or 'INT' according to preference.


    No, it is a mess.

    Allowing int Int INT to be three distinct types, or to represent types, variables, functions etc, is perfectly fine?

    int Int = INT;

    You can make a case for case-sensitivity within the confines of a
    programming language syntax which will have lots of other rules too.

    But on the other side of the user interface where it applies to user
    commands, user inputs and file systems, it makes a lot less sense and
    becomes user-unfriendly. People unnecessarily need to remember the exact capitalisation of that file, otherwise they might never find it again.


    Here's what I don't like about case-sensitivity:

    * Somebody else makes capitalisation style choices I don't like, but I
    have to use exactly the same style

    * I have to remember the exact capitalisation used, instead of just remembering the sound of the identifier used, which very often I can't,
    so I have to keep referring back to see what it was

    * With poor choices of capitalisation, source code can look chaotic (I mentioned elsewhere that it can look like Mime64 encoded text)

    * Often the same letters are used for distinct identifiers which differ
    only in capitalisation, sometimes very subtly (I can give loads of
    examples).

    * Often identifiers are used that are the same as reserved words, but
    again differ only in case

    * I can't use my way of writing temporary and/or debug code in capitals.


    If you don't like the idea that case-insensitivity allows people to use inconsistent case to refer to the same identifier like abc, Abc, aBC
    (which rarely happens except for all-caps), then a compiler could
    enforce consistent case.

    With the important difference from case-sensitivity being that you can't write:

    int Int = INT;

    You have to be a bit more creative.
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From James Harris@james.harris.1@gmail.com to comp.lang.misc on Thu Nov 24 17:56:59 2022
    From Newsgroup: comp.lang.misc

    On 24/11/2022 16:55, Bart wrote:
    On 24/11/2022 15:03, David Brown wrote:
    On 23/11/2022 23:42, Bart wrote:

    Using case is not a spelling mistake; it's a style. In my languages,
    someone can write 'int', 'Int' or 'INT' according to preference.


    No, it is a mess.

    Allowing int Int INT to be three distinct types, or to represent types, variables, functions etc, is perfectly fine?

        int Int = INT;

    Contrast

    MyVal := a
    myVal := myval + b

    Are you happy for a language to allow so much inconsistency?
    --
    James Harris


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Thu Nov 24 19:02:23 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-24 18:56, James Harris wrote:
    On 24/11/2022 16:55, Bart wrote:
    On 24/11/2022 15:03, David Brown wrote:
    On 23/11/2022 23:42, Bart wrote:

    Using case is not a spelling mistake; it's a style. In my languages,
    someone can write 'int', 'Int' or 'INT' according to preference.


    No, it is a mess.

    Allowing int Int INT to be three distinct types, or to represent
    types, variables, functions etc, is perfectly fine?

         int Int = INT;

    Contrast

      MyVal := a
      myVal := myval + b

    Are you happy for a language to allow so much inconsistency?

    Make it

    MyVal := a
    myVal := MyVal + b

    better be case-sensitive?
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From James Harris@james.harris.1@gmail.com to comp.lang.misc on Thu Nov 24 18:07:38 2022
    From Newsgroup: comp.lang.misc

    On 24/11/2022 18:02, Dmitry A. Kazakov wrote:
    On 2022-11-24 18:56, James Harris wrote:
    On 24/11/2022 16:55, Bart wrote:
    On 24/11/2022 15:03, David Brown wrote:
    On 23/11/2022 23:42, Bart wrote:

    Using case is not a spelling mistake; it's a style. In my
    languages, someone can write 'int', 'Int' or 'INT' according to
    preference.


    No, it is a mess.

    Allowing int Int INT to be three distinct types, or to represent
    types, variables, functions etc, is perfectly fine?

         int Int = INT;

    Contrast

       MyVal := a
       myVal := myval + b

    Are you happy for a language to allow so much inconsistency?

    Make it

        MyVal := a
        myVal := MyVal + b

    better be case-sensitive?

    My point (to you and Bart) is that programmers can choose identifier
    names so the latter example need not arise unless it is written
    deliberately; but if the compiler folds case then programmers can
    /mistype/ names accidentally, leading to the messy inconsistency
    mentioned above.
    --
    James Harris


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Thu Nov 24 19:39:50 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-24 19:07, James Harris wrote:
    On 24/11/2022 18:02, Dmitry A. Kazakov wrote:
    On 2022-11-24 18:56, James Harris wrote:
    On 24/11/2022 16:55, Bart wrote:
    On 24/11/2022 15:03, David Brown wrote:
    On 23/11/2022 23:42, Bart wrote:

    Using case is not a spelling mistake; it's a style. In my
    languages, someone can write 'int', 'Int' or 'INT' according to
    preference.


    No, it is a mess.

    Allowing int Int INT to be three distinct types, or to represent
    types, variables, functions etc, is perfectly fine?

         int Int = INT;

    Contrast

       MyVal := a
       myVal := myval + b

    Are you happy for a language to allow so much inconsistency?

    Make it

         MyVal := a
         myVal := MyVal + b

    better be case-sensitive?

    My point (to you and Bart) is that programmers can choose identifier
    names so the latter example need not arise unless it is written deliberately;

    Why did you suggest an error? The point is, you could not know. Nobody
    could.

    but if the compiler folds case then programmers can
    /mistype/ names accidentally, leading to the messy inconsistency
    mentioned above.

    Same question. Why do you think that the example I gave was mistyped?

    In a case-insensitive language mistyping the case has no effect on the
    program legality. Any decent IDE enforces preferred case style.

    Moreover, tools for the case-sensitive languages like C++ do just the
    same. You cannot have reasonable names in C++ anymore. There would be
    lurking clang-format or SonarQube configured to force something a three
    year old suffering dyslexia would pen... (:-))
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Thu Nov 24 18:42:51 2022
    From Newsgroup: comp.lang.misc

    On 24/11/2022 17:56, James Harris wrote:
    On 24/11/2022 16:55, Bart wrote:
    On 24/11/2022 15:03, David Brown wrote:
    On 23/11/2022 23:42, Bart wrote:

    Using case is not a spelling mistake; it's a style. In my languages,
    someone can write 'int', 'Int' or 'INT' according to preference.


    No, it is a mess.

    Allowing int Int INT to be three distinct types, or to represent
    types, variables, functions etc, is perfectly fine?

         int Int = INT;

    Contrast

      MyVal := a
      myVal := myval + b

    Are you happy for a language to allow so much inconsistency?


    This is inconsistency of style, something which doesn't affect the
    meaning of the code. Languages already allow that:

    MyVal:=a
    myVal := myval +b

    Maybe there can be a tool to warn about this or tidy it up for you, but
    I don't believe it should be the job of the language, or compiler.

    However I did suggest that a case-sensitive language /could/ enforce consistency across identifiers intended to be identical.

    And as DAK said, you can have inconsistencies in case-sensitive code
    that are actually dangerous. It took me a few seconds to realise the
    second `MyVal` had a small `m` so would be a different identifier.

    In a language with declarations, perhaps that would be picked up (unless
    it was Go, where := serves to declare a new variable). In dynamic
    ones, `myVal` would be silently created as a fresh variable.

    That can happen with case-insensitivity too, but you have to actually
    misspell the name, not just use the wrong capitalisation.


    Here are some examples from sqlite3.c of names which are identical
    except for subtle differences of case:

    (walCkptInfo,WalCkptInfo)
    (walIndexHdr,WalIndexHdr)
    (wrflag,wrFlag)
    (writeFile,WriteFile)
    (xHotSpot,xHotspot)
    (yHotspot,yHotSpot)
    (yymajor,yyMajor)
    (yyminor,yyMinor)
    (zErrMsg,zErrmsg)
    (zSql,zSQL)

    Try to spot the differences. Remember that in a real program, it will be
    much busier, and these names haven't been pre-selected and helpfully
    placed side by side! Usually you will see them in isolation.
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From James Harris@james.harris.1@gmail.com to comp.lang.misc on Thu Nov 24 18:55:24 2022
    From Newsgroup: comp.lang.misc

    On 24/11/2022 18:39, Dmitry A. Kazakov wrote:
    On 2022-11-24 19:07, James Harris wrote:
    On 24/11/2022 18:02, Dmitry A. Kazakov wrote:
    On 2022-11-24 18:56, James Harris wrote:
    On 24/11/2022 16:55, Bart wrote:
    On 24/11/2022 15:03, David Brown wrote:
    On 23/11/2022 23:42, Bart wrote:

    Using case is not a spelling mistake; it's a style. In my
    languages, someone can write 'int', 'Int' or 'INT' according to preference.


    No, it is a mess.

    Allowing int Int INT to be three distinct types, or to represent
    types, variables, functions etc, is perfectly fine?

         int Int = INT;

    Contrast

       MyVal := a
       myVal := myval + b

    Are you happy for a language to allow so much inconsistency?

    Make it

         MyVal := a
         myVal := MyVal + b

    better be case-sensitive?

    My point (to you and Bart) is that programmers can choose identifier
    names so the latter example need not arise unless it is written
    deliberately;

    Why did you suggest an error? The point is, you could not know. Nobody could.

    but if the compiler folds case then programmers can /mistype/ names
    accidentally, leading to the messy inconsistency mentioned above.

    Same question. Why do you think that the example I gave was mistyped?

    All of the above are examples of poor code - from Int, INT, int to
    MyVal, myVal, myval. The difference is that in a case-sensitive language
    (such as C) a programmer would have deliberately to choose daft names to engineer the mess; whereas in a language which ignores case (such as
    Ada) the mess can come about accidentally, via typos.


    In a case-insensitive language mistyping the case has no effect on the program legality. Any decent IDE enforces preferred case style.

    Moreover, tools for the case-sensitive languages like C++ do just the
    same. You cannot have reasonable names in C++ anymore. There would be lurking clang-format or SonarQube configured to force something a three
    year old suffering dyslexia would pen... (:-))

    As you suggest, for languages which ignore case extra tools are needed
    to help tidy up the code.
    --
    James Harris


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Thu Nov 24 19:07:03 2022
    From Newsgroup: comp.lang.misc

    On 24/11/2022 18:55, James Harris wrote:
    On 24/11/2022 18:39, Dmitry A. Kazakov wrote:
    On 2022-11-24 19:07, James Harris wrote:

    My point (to you and Bart) is that programmers can choose identifier
    names so the latter example need not arise unless it is written
    deliberately;

    Why did you suggest an error? The point is, you could not know. Nobody
    could.

    but if the compiler folds case then programmers can /mistype/ names
    accidentally, leading to the messy inconsistency mentioned above.

    Same question. Why do you think that the example I gave was mistyped?

    All of the above are examples of poor code - from Int, INT, int to
    MyVal, myVal, myval. The difference is that in a case-sensitive language (such as C) a programmer would have deliberately to choose daft names to engineer the mess;

    They do. I gave examples in my other post. But this kind of idiom I find annoying:

    Image image;
    Colour colour; //(At least it's not colour color!)
    Matrix matrix;

    (Actual examples from the Raylib API. Which also cause grief when ported
    to my case-insensitive syntax, yet another problem.)

    whereas in a language which ignores case (such as
    Ada) the mess can come about accidentally, via typos.

    Using the wrong case isn't really a typo. A real typo would yield the
    wrong letters

    Using the wrong case is harmless. At some point, the discrepancy in
    style, if not intentional, will be discovered and fixed.



    In a case-insensitive language mistyping the case has no effect on the
    program legality. Any decent IDE enforces preferred case style.

    Moreover, tools for the case-sensitive languages like C++ do just the
    same. You cannot have reasonable names in C++ anymore. There would be
    lurking clang-format or SonarQube configured to force something a
    three year old suffering dyslexia would pen... (:-))

    As you suggest, for languages which ignore case extra tools are needed
    to help tidy up the code.

    Possibly; I've never actually needed to in 46 years of case-insensitive coding. But I also use upper case for emphasis.


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Thu Nov 24 20:23:37 2022
    From Newsgroup: comp.lang.misc

    On 24/11/2022 19:39, Dmitry A. Kazakov wrote:
    On 2022-11-24 19:07, James Harris wrote:
    On 24/11/2022 18:02, Dmitry A. Kazakov wrote:
    On 2022-11-24 18:56, James Harris wrote:
    On 24/11/2022 16:55, Bart wrote:
    On 24/11/2022 15:03, David Brown wrote:
    On 23/11/2022 23:42, Bart wrote:

    Using case is not a spelling mistake; it's a style. In my
    languages, someone can write 'int', 'Int' or 'INT' according to preference.


    No, it is a mess.

    Allowing int Int INT to be three distinct types, or to represent
    types, variables, functions etc, is perfectly fine?

         int Int = INT;

    Contrast

       MyVal := a
       myVal := myval + b

    Are you happy for a language to allow so much inconsistency?

    Make it

         MyVal := a
         myVal := MyVal + b

    better be case-sensitive?

    My point (to you and Bart) is that programmers can choose identifier
    names so the latter example need not arise unless it is written
    deliberately;

    Why did you suggest an error? The point is, you could not know. Nobody could.

    Of course you could know, if the language requires variables to be
    declared before usage. Using C syntax for consistency here:

    int Myval = 1;
    myval = 2;

    In a case-sensitive language, that is clearly a typo by the programmer,
    and it is a compile-time error. In a case-insensitive language, it's an inconsistent mess that is perfectly acceptable to the compiler and no
    one can tell if it is intentional or not because the language is quite
    happy with different choices of cases.

    int Myval = 1;
    int myval = 2;

    In a case-sensitive language, it is legal but written by an
    intentionally bad programmer - and no matter how hard you try, bad
    programmers will find a way to write bad code. In a case-insensitive language, it is an error written intentionally by a bad programmer.

    Give me the language that helps catch typos, not the language that is
    happy with an inconsistent jumble.


    but if the compiler folds case then programmers can /mistype/ names
    accidentally, leading to the messy inconsistency mentioned above.

    Same question. Why do you think that the example I gave was mistyped?

    In a case-insensitive language mistyping the case has no effect on the program legality.

    I prefer mistypes to be considered errors where possible.

    Any decent IDE enforces preferred case style.


    A good IDE is nice - a good language choice is better.

    Moreover, tools for the case-sensitive languages like C++ do just the
    same. You cannot have reasonable names in C++ anymore. There would be lurking clang-format or SonarQube configured to force something a three
    year old suffering dyslexia would pen... (:-))


    Some people know how to use tools properly.



    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Thu Nov 24 20:28:33 2022
    From Newsgroup: comp.lang.misc

    On 24/11/2022 20:07, Bart wrote:
    On 24/11/2022 18:55, James Harris wrote:
    On 24/11/2022 18:39, Dmitry A. Kazakov wrote:
    On 2022-11-24 19:07, James Harris wrote:

    My point (to you and Bart) is that programmers can choose identifier
    names so the latter example need not arise unless it is written
    deliberately;

    Why did you suggest an error? The point is, you could not know.
    Nobody could.

    but if the compiler folds case then programmers can /mistype/ names
    accidentally, leading to the messy inconsistency mentioned above.

    Same question. Why do you think that the example I gave was mistyped?

    All of the above are examples of poor code - from Int, INT, int to
    MyVal, myVal, myval. The difference is that in a case-sensitive
    language (such as C) a programmer would have deliberately to choose
    daft names to engineer the mess;

    They do. I gave examples in my other post. But this kind of idiom I find annoying:

        Image image;
        Colour colour;     //(At least it's not colour color!)
        Matrix matrix;


    I have never come across any programmer for any language that does not
    find some commonly-used idioms or coding styles annoying.

    I bet that even you, who are the only programmer for the languages you yourself designed, can look back at old code and think some of your own
    idioms are annoying.

    (Actual examples from the Raylib API. Which also cause grief when ported
    to my case-insensitive syntax, yet another problem.)


    Do not blame C for the deficiencies in your language or your use of it!

    whereas in a language which ignores case (such as Ada) the mess can
    come about accidentally, via typos.

    Using the wrong case isn't really a typo. A real typo would yield the
    wrong letters

    The wrong case is either a typo, or appalling lack of attention to
    detail and care of code quality.


    Using the wrong case is harmless. At some point, the discrepancy in
    style, if not intentional, will be discovered and fixed.


    With a decent language it will be discovered as soon as you compile (if
    not before, when you use a good editor).



    In a case-insensitive language mistyping the case has no effect on
    the program legality. Any decent IDE enforces preferred case style.

    Moreover, tools for the case-sensitive languages like C++ do just the
    same. You cannot have reasonable names in C++ anymore. There would be
    lurking clang-format or SonarQube configured to force something a
    three year old suffering dyslexia would pen... (:-))

    As you suggest, for languages which ignore case extra tools are needed
    to help tidy up the code.

    Possibly; I've never actually needed to in 46 years of case-insensitive coding. But I also use upper case for emphasis.




    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Thu Nov 24 20:33:27 2022
    From Newsgroup: comp.lang.misc

    On 24/11/2022 17:22, Dmitry A. Kazakov wrote:
    On 2022-11-24 16:07, David Brown wrote:

    And current AMD and Intel chips are bathtubs with wheels and rocket
    engines!

    Judging by how they screech ... there is no wheels. (:-))

    (The great thing about car analogies is how much they can be abused...)

    OT. I remember the story of a guy who installed the rocket engine on a,
    I believe, VW Beetle and honorably died riding his invention. Death by
    Rock and Roll, as Pretty Reckless sung...
    /OT


    Died honourably, or died horribly? It's easy to mix these up!

    <https://darwinawards.com/darwin/darwin1995-04.html>



    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Thu Nov 24 20:01:00 2022
    From Newsgroup: comp.lang.misc

    On 24/11/2022 19:28, David Brown wrote:
    On 24/11/2022 20:07, Bart wrote:

    They do. I gave examples in my other post. But this kind of idiom I
    find annoying:

         Image image;
         Colour colour;     //(At least it's not colour color!)
         Matrix matrix;


    I have never come across any programmer for any language that does not
    find some commonly-used idioms or coding styles annoying.

    I can port all my identifiers to a case-sensitive language with no
    clashes. I can't guarantee no clashes when porting from case-sensitive
    to case-insensitive.

    Which would be less hassle?

    I don't like the C idiom because if I read it in my head it sounds stupid.


    I bet that even you, who are the only programmer for the languages you yourself designed, can look back at old code and think some of your own idioms are annoying.

    Some of my code layout styles (like 1-space indents) look dreadful, yes.
    But I can fix that with two keypresses.


    (Actual examples from the Raylib API. Which also cause grief when
    ported to my case-insensitive syntax, yet another problem.)


    Do not blame C for the deficiencies in your language or your use of it!

    whereas in a language which ignores case (such as Ada) the mess can
    come about accidentally, via typos.

    Using the wrong case isn't really a typo. A real typo would yield the
    wrong letters

    The wrong case is either a typo, or appalling lack of attention to
    detail and care of code quality.


    Using the wrong case is harmless. At some point, the discrepancy in
    style, if not intentional, will be discovered and fixed.


    With a decent language it will be discovered as soon as you compile (if
    not before, when you use a good editor).

    You're assuming it's an error. But when I call the Windows MessageBoxA function from my language, I write it like this:

    messageboxa(message:"Hello")


    In C, it /must/ be written as:

    MessageBoxA(0, "Hello", "Caption etc", 0);

    So I use my choice of capitalisation (which is usually none; I just
    don't care), and I've added keyword parameters to the declaration.

    That means that, given a choice of what to do with lower and upper case
    letters, I've selected different priorities, since I place little value
    on writing code like this:

    struct Foo Foo[FOO] = {foo};

    Clearly, you have a different opinion.


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From James Harris@james.harris.1@gmail.com to comp.lang.misc on Thu Nov 24 20:13:03 2022
    From Newsgroup: comp.lang.misc

    On 24/11/2022 18:42, Bart wrote:
    On 24/11/2022 17:56, James Harris wrote:
    On 24/11/2022 16:55, Bart wrote:
    On 24/11/2022 15:03, David Brown wrote:
    On 23/11/2022 23:42, Bart wrote:

    Using case is not a spelling mistake; it's a style. In my
    languages, someone can write 'int', 'Int' or 'INT' according to
    preference.


    No, it is a mess.

    Allowing int Int INT to be three distinct types, or to represent
    types, variables, functions etc, is perfectly fine?

         int Int = INT;

    Contrast

       MyVal := a
       myVal := myval + b

    Are you happy for a language to allow so much inconsistency?


    This is inconsistency of style, something which doesn't affect the
    meaning of the code. Languages already allow that:

        MyVal:=a
          myVal :=    myval +b

    Maybe there can be a tool to warn about this or tidy it up for you, but
    I don't believe it should be the job of the language, or compiler.

    However I did suggest that a case-sensitive language /could/ enforce consistency across identifiers intended to be identical.

    And as DAK said, you can have inconsistencies in case-sensitive code
    that are actually dangerous. It took me a few seconds to realise the
    second `MyVal` had a small `m` so would be a different identifier.

    In a language with declarations, perhaps that would be picked up (unless
    it was Go, where := serves to declare a new variable). In dynamic
    ones, `myVal` would be silently created as a fresh variable.

    That can happen with case-insensitivity too, but you have to actually misspell the name, not just use the wrong capitalisation.


    Here are some examples from sqlite3.c of names which are identical
    except for subtle differences of case:

    (walCkptInfo,WalCkptInfo)
    (walIndexHdr,WalIndexHdr)
    (wrflag,wrFlag)
    (writeFile,WriteFile)
    (xHotSpot,xHotspot)
    (yHotspot,yHotSpot)
    (yymajor,yyMajor)
    (yyminor,yyMinor)
    (zErrMsg,zErrmsg)
    (zSql,zSQL)

    Try to spot the differences. Remember that in a real program, it will be much busier, and these names haven't been pre-selected and helpfully
    placed side by side! Usually you will see them in isolation.

    Well, some of those would happen regardless of the case sensitivity of
    the language. For example, in the version of sqlite3.c I found online I
    saw that some routines use wrFlag and others use wrflag. From a quick
    look I cannot see any routine which uses both. Such a discrepancy
    wouldn't be picked up whether the language was case sensitive or not.
    Also, it looks as though zSQL is used only in comments. I don't know any language which would check that names in comments match those in code.

    Nevertheless, I take your point. A programmer /could/ unwisely choose to
    use names which differed only by the case of one letter.

    Here's a suggestion: make the language case sensitive and have the
    compiler reject programs which give access to two names with no changes
    other than case, such that Myvar and myvar could not be simultaneously accessible.
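
    A rough sketch of how such a check might work (all names and limits here
    are invented for the example, not taken from any real compiler): fold each
    declared name to lower case, and reject a new declaration whose folded
    form is already taken by a differently-spelled name:

        #include <ctype.h>
        #include <stdio.h>
        #include <string.h>

        #define MAX_NAMES 1024
        #define MAX_LEN   64          /* assumed identifier length limit */

        static char names[MAX_NAMES][MAX_LEN];   /* names as declared */
        static char folded[MAX_NAMES][MAX_LEN];  /* lower-cased forms */
        static int  name_count;

        static void fold(const char *src, char *dst)
        {
            size_t i = 0;
            while (src[i] && i < MAX_LEN - 1) {
                dst[i] = (char)tolower((unsigned char)src[i]);
                i++;
            }
            dst[i] = '\0';
        }

        /* Returns 0 on success, nonzero if the name differs only in case
           from a previously declared name (or the table is full). */
        int declare_name(const char *name)
        {
            char f[MAX_LEN];
            if (name_count >= MAX_NAMES || strlen(name) >= MAX_LEN)
                return 1;
            fold(name, f);
            for (int i = 0; i < name_count; i++) {
                if (strcmp(folded[i], f) == 0 && strcmp(names[i], name) != 0) {
                    fprintf(stderr, "error: '%s' differs only in case from '%s'\n",
                            name, names[i]);
                    return 1;
                }
            }
            strcpy(names[name_count], name);
            strcpy(folded[name_count], f);
            name_count++;
            return 0;
        }

    So Myvar followed by myvar would be rejected, while redeclaring exactly
    the same spelling is left to the normal duplicate-definition check.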
    --
    James Harris


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Thu Nov 24 20:52:02 2022
    From Newsgroup: comp.lang.misc

    On 24/11/2022 20:13, James Harris wrote:
    On 24/11/2022 18:42, Bart wrote:

    Here are some examples from sqlite3.c of names which are identical
    except for subtle differences of case:

    (walCkptInfo,WalCkptInfo)
    (walIndexHdr,WalIndexHdr)
    (wrflag,wrFlag)
    (writeFile,WriteFile)
    (xHotSpot,xHotspot)
    (yHotspot,yHotSpot)
    (yymajor,yyMajor)
    (yyminor,yyMinor)
    (zErrMsg,zErrmsg)
    (zSql,zSQL)

    Try to spot the differences. Remember that in a real program, it will
    be much busier, and these names haven't been pre-selected and
    helpfully placed side by side! Usually you will see them in isolation.

    Well, some of those would happen regardless of the case sensitivity of
    the language. For example, in the version of sqlite3.c I found online  I saw that some routines use wrFlag and others use wrflag. From a quick
    look I cannot see any routine which uses both. Such a discrepancy
    wouldn't be picked up whether the language was case sensitive or not.
    Also, it looks as though zSQL is used only in comments.

    zSQL occurs here:

    SQLITE_API int sqlite3_declare_vtab(sqlite3*, const char *zSQL);

    which is between two block comments but is not itself inside a comment.

    I have done analysis in the past which tried to detect whether any of
    those pairs occurred within the same function; I think one or two did,
    but it is too much effort to repeat now.

    (The whole list is 200 entries; 3 of them have 3 variations on the same
    name:

    (hmenu,hMenu,HMENU)
    (next,nExt,Next)
    (short,Short,SHORT)
    )

    But this is just to show that such variances can occur, especially in
    the longer names where the difference is subtle.

    These are likely to create a lot of confusion, if you type the wrong capitalisation because you assume xHotSpot style rather than xHotspot.
    Or even if you're just browsing the code: was this the same name I saw a minute ago? No; one has a small s the other a big S, they just sound the
    same when you say them out loud (or in your head!).

    Such confusion /has/ to be less when xHotSpot, xHotspot, plus the other
    62 (I think) variations have to be the same identifier, 'xhotspot' when normalised.

    Nevertheless, I take your point. A programmer /could/ unwisely choose to
    use names which differed only by the case of one letter.

    In C this happens /all the time/. It's almost a requirement. When I
    translated OpenGL headers, many macro names shared the same
    identifiers with functions if you took away case.


    Here's a suggestion: make the language case sensitive and have the
    compiler reject programs which give access to two names with no changes other than case, such that Myvar and myvar could not be simultaneously accessible.

    Apart from not being able to do this:

    Colour colour;

    what would be the point of case sensitivity in this case? Or would the restriction not apply to types? What about a variable that clashes with
    a reserved word when case is ignored?

    (BTW my syntax can represent the above as:

    `Colour colour

    The backtick is case-preserving, and also allows names that clash with reserved words. But I don't want to have to write that in code; this is
    for automatic translation tools, or for a one-off.)
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Thu Nov 24 22:35:53 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-24 19:55, James Harris wrote:

    All of the above are examples of poor code - from Int, INT, int to
    MyVal, myVal, myval.

    Which you want to make legal, right? Again,

    If int and INT shall never mean two different entities, why do you let them?

    The difference is that in a case-sensitive language
    (such as C) a programmer would have deliberately to choose daft names to engineer the mess; whereas in a language which ignores case (such as
    Ada) the mess can come about accidentally, via typos.

    That is evidently wrong. Why exactly

    int INT;

    must be legal?

    Moreover, tools for the case-sensitive languages like C++ do just the
    same. You cannot have reasonable names in C++ anymore. There would be
    lurking clang-format or SonarQube configured to force something a
    three year old suffering dyslexia would pen... (:-))

    As you suggest, for languages which ignore case extra tools are needed
    to help tidy up the code.

    While 99% of all these tools were developed specifically for
    case-sensitive languages? Come on!
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From James Harris@james.harris.1@gmail.com to comp.lang.misc on Thu Nov 24 21:47:05 2022
    From Newsgroup: comp.lang.misc

    On 24/11/2022 20:52, Bart wrote:
    On 24/11/2022 20:13, James Harris wrote:
    On 24/11/2022 18:42, Bart wrote:

    Here are some examples from sqlite3.c of names which are identical
    except for subtle differences of case:

    (walCkptInfo,WalCkptInfo)
    (walIndexHdr,WalIndexHdr)
    (wrflag,wrFlag)
    (writeFile,WriteFile)
    (xHotSpot,xHotspot)
    (yHotspot,yHotSpot)
    (yymajor,yyMajor)
    (yyminor,yyMinor)
    (zErrMsg,zErrmsg)
    (zSql,zSQL)

    Try to spot the differences. Remember that in a real program, it will
    be much busier, and these names haven't been pre-selected and
    helpfully placed side by side! Usually you will see them in isolation.

    Well, some of those would happen regardless of the case sensitivity of
    the language. For example, in the version of sqlite3.c I found online
    I saw that some routines use wrFlag and others use wrflag. From a
    quick look I cannot see any routine which uses both. Such a
    discrepancy wouldn't be picked up whether the language was case
    sensitive or not. Also, it looks as though zSQL is used only in comments.

    zSQL occurs here:

    SQLITE_API int sqlite3_declare_vtab(sqlite3*, const char *zSQL);

    which is between two block comments but is not itself insid a comment.

    Yes, although it's a function declaration; the presumably incorrectly
    typed identifier zSQL is ignored. It's simply not used. Two comments. IMO:

    1. Forward declarations should not be needed.

    2. Parameter names should be part of the interface.


    I have done analysis in the past which tried to detect whether any of
    those pairs occurred within the same function; I think one or two did,
    but is too much effort to repeat now.

    (The whole list is 200 entries; 3 of them have 3 variations on the same name:

       (hmenu,hMenu,HMENU)
       (next,nExt,Next)
       (short,Short,SHORT)
    )

    But this is just to show that such variances can occur, especially in
    the longer names where the difference is subtle.

    Similar could be said for any names which differed only slightly.


    These are likely to create a lot of confusion, if you type the wrong capitalisation because you assume xHotSpot style rather than xHotspot.
    Or even if you're just browsing the code: was this the same name I saw a minute ago? No; one has a small s the other a big S, they just sound the same when you say them out loud (or in your head!).

    Such confusion /has/ to be less when xHotSpot, xHotspot, plus the other
    62 (I think) variations have to be the same identifier, 'xhotspot' when normalised.

    I think I'd prefer the compiler to reject the code. Then one would know
    that compilable code had no such problems - whether the language was
    case sensitive or case insensitive.


    Nevertheless, I take your point. A programmer /could/ unwisely choose
    to use names which differed only by the case of one letter.

    In C this happens /all the time/. It's almost a requirement. When I translated OpenGL headers, many macro names shared the same
    identifiers with functions if you took away case.

    One cannot stop programmers doing daft things. For example, a programmer
    could declare names such as

    CreateTableForwandReference

    and

    createtableforwardrefarence

    The differences are not obvious. Nor would it be easy to get a compiler
    to complain about the similarity.

    IOW we can help but we cannot stop programmers doing unwise things.



    Here's a suggestion: make the language case sensitive and have the
    compiler reject programs which give access to two names with no
    changes other than case, such that Myvar and myvar could not be
    simultaneously accessible.

    Apart from not being able to do this:

        Colour colour;

    what would be the point of case sensitivity in this case? Or would the restriction not apply to types? What about a variable that clashes with
    a reserved word when case is ignored?

    I'd better not comment on type names here. It is a big issue, and one on
    which I may be coming round to a different point of view. Prohibiting
    identifiers which clash with reserved words sounds like a good idea, on
    one condition: there is some way to add reserved words to a later
    version of the language without potentially breaking lots of existing code.

    Maybe the best a language designer can do for cases such as this is to
    help reduce the number of different names a programmer would have to
    define in any given location.
    --
    James Harris


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From James Harris@james.harris.1@gmail.com to comp.lang.misc on Thu Nov 24 21:50:33 2022
    From Newsgroup: comp.lang.misc

    On 24/11/2022 21:35, Dmitry A. Kazakov wrote:
    On 2022-11-24 19:55, James Harris wrote:

    All of the above are examples of poor code - from Int, INT, int to
    MyVal, myVal, myval.

    Which you want to make legal, right?

    No.

    ...

    That is evidently wrong. Why exactly

       int INT;

    must be legal?

    I didn't say it should be.

    Moreover, tools for the case-sensitive languages like C++ do just the
    same. You cannot have reasonable names in C++ anymore. There would be
    lurking clang-format or SonarQube configured to force something a
    three year old suffering dyslexia would pen... (:-))

    As you suggest, for languages which ignore case extra tools are needed
    to help tidy up the code.

    While 99% of all these tools were developed specifically for
    case-sensitive languages? Come on!

    It's a personal view but IMO a language should be independent of, and
    should not rely on, IDEs or special editors.
    --
    James Harris


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Thu Nov 24 22:58:12 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-24 20:23, David Brown wrote:
    On 24/11/2022 19:39, Dmitry A. Kazakov wrote:
    On 2022-11-24 19:07, James Harris wrote:
    On 24/11/2022 18:02, Dmitry A. Kazakov wrote:
    On 2022-11-24 18:56, James Harris wrote:
    On 24/11/2022 16:55, Bart wrote:
    On 24/11/2022 15:03, David Brown wrote:
    On 23/11/2022 23:42, Bart wrote:

    Using case is not a spelling mistake; it's a style. In my
    languages, someone can write 'int', 'Int' or 'INT' according to preference.


    No, it is a mess.

    Allowing int Int INT to be three distinct types, or to represent
    types, variables, functions etc, is perfectly fine?

         int Int = INT;

    Contrast

       MyVal := a
       myVal := myval + b

    Are you happy for a language to allow so much inconsistency?

    Make it

         MyVal := a
         myVal := MyVal + b

    better be case-sensitive?

    My point (to you and Bart) is that programmers can choose identifier
    names so the latter example need not arise unless it is written
    deliberately;

    Why did you suggest an error? The point is, you could not know. Nobody
    could.

    Of course you could know, if the language requires variables to be
    declared before usage.

    I meant some fancy language where no declarations needed. But OK, take this:

    int MyVal = a;
    int myVal = MyVal + b;

    How do you know?

        int Myval = 1;
        int myval = 2;

    In a case-sensitive language, it is legal but written by an intentionally bad programmer - and no matter how hard you try, bad
    programmers will find a way to write bad code.  In a case-insensitive language, it is an error written intentionally by a bad programmer.

    Give me the language that helps catch typos, not the language that is
    happy with an inconsistent jumble.

    declare
    Myval : Integer := 1;
    myval : Integer := 2;
    begin

    This is illegal in Ada.

    but if the compiler folds case then programmers can /mistype/ names
    accidentally, leading to the messy inconsistency mentioned above.

    A programmer cannot mistype names if the language is case-sensitive?

    Purely statistically your argument makes no sense. Since the set of
    unique identifiers in a case-insensitive language is an order of
    magnitude narrower, any probability of mess/error etc is also less under equivalent conditions.

    The only reason to have case-sensitive identifiers is for having
    homographs = for producing mess.

    I prefer mistypes to be considered errors where possible.

    And I gave more or less formal proof why case-insensitive languages are
    better here.

    Moreover, tools for the case-sensitive languages like C++ do just the
    same. You cannot have reasonable names in C++ anymore. There would be
    lurking clang-format or SonarQube configured to force something a
    three year old suffering dyslexia would pen... (:-))

    Some people know how to use tools properly.

    These people don't buy them and thus do not count... (:-))
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Thu Nov 24 23:00:56 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-24 22:50, James Harris wrote:
    On 24/11/2022 21:35, Dmitry A. Kazakov wrote:
    On 2022-11-24 19:55, James Harris wrote:

    All of the above are examples of poor code - from Int, INT, int to
    MyVal, myVal, myval.

    Which you want to make legal, right?

    No.

    ...

    That is evidently wrong. Why exactly

        int INT;

    must be legal?

    I didn't say it should be.

    But it is. q.e.d.

    Moreover, tools for the case-sensitive languages like C++ do just
    the same. You cannot have reasonable names in C++ anymore. There
    would be lurking clang-format or SonarQube configured to force
    something a three year old suffering dyslexia would pen... (:-))

    As you suggest, for languages which ignore case extra tools are
    needed to help tidy up the code.

    While 99% of all these tools were developed specifically for
    case-sensitive languages? Come on!

    It's a personal view but IMO a language should be independent of, and
    should not rely on, IDEs or special editors.

    Yet you must rely on them in order to prevent:

    int INT;
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Thu Nov 24 22:51:34 2022
    From Newsgroup: comp.lang.misc

    On 24/11/2022 21:47, James Harris wrote:
    On 24/11/2022 20:52, Bart wrote:

    Yes, although it's a function declaration; the presumably incorrectly
    typed identifier zSQL is ignored.

    We don't know the purpose of zSQL. But the point is it is there, a
    slightly differently-cased version of the same name, which I can't
    for the life of me recall right now. That is the problem.

    (If I look back, it is zSql. But if I now encounter these even in 10
    minutes' time, which one would be which? I would forget.)



    Similar could be said for any names which differed only slightly.

    Say short, Short and SHORT out loud; any difference?

    You're debugging some code and need to print out the value of hmenu. Or
    is it hMenu or Hmenu? Personally I am half-blind to case usage because it is
    so commonly irrelevant and ignored in English.

    I would be constantly double-checking and constantly getting it wrong
    too. And that's with just one of these three in place.

    Differences in spelling are another matter; I'm a good speller.

    You might notice when the cup you're handed in Starbucks has Janes
    rather than James and would want to check it is yours; but you probably wouldn't care if it's james or James or JAMES because that is just style.
    You know they are all the same name.

    But also, just because people can make typos by pressing the wrong
    letter or getting the length wrong doesn't make allowing 2**N more
    incorrect possibilities acceptable.

    In C this happens /all the time/. It's almost a requirement. When I
    translated OpenGL headers, many macro names shared the same
    identifiers as functions if you took away case.

    One cannot stop programmers doing daft things. For example, a programmer could declare names such as

      CreateTableForwandReference

    and

      createtableforwardrefarence

    The differences are not obvious.

    So to fix it, we allow

    CreateTableForwardRefarence

    createtableforwandreference

    as synonyms? Case sensitive, you have subtle differences in letters
    /plus/ subtle differences in case!

    Maybe the best a language designer can do for cases such as this is to
    help reduce the number of different names a programmer would have to
    define in any given location.

    Given the various restrictions you've mentioned that you'd want even
    with case sensitive names, is there any point to having case
    sensitivity? What would it be used for; what would it allow?

    I had a scripting language that shipped with my applications. While it was case insensitive, I would usually write keywords in lower case, as in if, then,
    while.

    But some users would write If, Then, While, and make more use in
    identifiers of mixed case. And they would capitalise global variables
    that I defined in lower case.

    It provided a choice.


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Fri Nov 25 08:52:14 2022
    From Newsgroup: comp.lang.misc

    On 24/11/2022 21:01, Bart wrote:
    On 24/11/2022 19:28, David Brown wrote:
    On 24/11/2022 20:07, Bart wrote:

    They do. I gave examples in my other post. But this kind of idiom I
    find annoying:

         Image image;
         Colour colour;     //(At least it's not colour color!)
         Matrix matrix;


    I have never come across any programmer for any language that does not
    find some commonly-used idioms or coding styles annoying.

    I can port all my identifiers to a case-sensitive language with no
    clashes. I can't guarantee no clashes when porting from case-sensitive
    to case-insensitive.

    Which would be less hassle?


    I realise this is not the answer you want, but here goes - nobody cares!

    It is not the fault of /C/ that /you/ have made a language that does not support direct literal translations from C.

    Honestly, of all your arguments against C (some of which are valid and reasonable), and all your arguments against case sensitivity, this is
    the most pathetic. Get over yourself - the world does not revolve
    around you or your language, and nobody gives a **** if you have to put slightly more effort into your porting tasks.

    I don't like the C idiom because if I read it in my head it sounds stupid.


    Even that ranks miles above whinging about porting.

    That means that, given a choice of what to do with lower and upper case letters, I've selected different priorities, since I place little value
    on writing code like this:

        struct Foo Foo[FOO] = {foo};

    Clearly, you have a different opinion.

    Clearly you prefer to form your own opinions on people rather than
    bothering to read anything.

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Fri Nov 25 09:18:25 2022
    From Newsgroup: comp.lang.misc

    On 24/11/2022 22:35, Dmitry A. Kazakov wrote:
    On 2022-11-24 19:55, James Harris wrote:

    All of the above are examples of poor code - from Int, INT, int to
    MyVal, myVal, myval.

    Which you want to make legal, right? Again,

    If int and INT shall never mean two different entities, why do you let
    them?

    If int and INT are poor style when referring to the same entities, why
    do you let them?


    Either choice of case sensitivity or case insensitivity allows abuses.
    But case sensitive makes accidental misuse far more likely to be
    caught by the compiler, and it allows more possibilities. In comparison
    to case insensitive languages, it is one step back and two steps forward
    - a clear win.


    Of course there are other options as well, which are arguably better
    than either of these. One is to say you have to get the case right for consistency, but disallow identifiers that differ only in case. (I
    think that's what you get with Ada along with appropriate tools or
    compiler warning flags. There are also C tools for spotting confusing identifiers.)

    Another is to say that the case is significant. This is often done in C
    by convention - all-caps for macros is a very common convention, and in
    C++ it is quite common to use initial caps for classes. Some languages enforce such rules, making "Int int" perfectly clear as "Int" is
    guaranteed to be a type, while "int" is guaranteed to be an object.



    The difference is that in a case-sensitive language (such as C) a
    programmer would have deliberately to choose daft names to engineer
    the mess; whereas in a language which ignores case (such as Ada) the
    mess can come about accidentally, via typos.

    That is evidently wrong.

    What exactly is wrong about my statement? "int INT;" is an example of deliberately daft names. Legislating against stupidity or malice is
    /very/ difficult. Legislating against accidents and inconsistency is
    easier, and a better choice.

    Why exactly

       int INT;

    must be legal?

    If a language can make such things illegal, great - but /not/ at the
    cost of making "int a; INT b; inT c = A + B;" legal.



    Moreover, tools for the case-sensitive languages like C++ do just the
    same. You cannot have reasonable names in C++ anymore. There would be
    lurking clang-format or SonarQube configured to force something a
    three year old suffering dyslexia would pen... (:-))

    As you suggest, for languages which ignore case extra tools are needed
    to help tidy up the code.

    While 99% of all these tools were developed specifically for
    case-sensitive languages? Come on!


    I see no problem with using extra tools, or extra compiler warnings, to improve code quality or catch errors. Indeed, I am a big fan of them.
    As a fallible programmer I like all the help I can get, and I like it as
    early in the process as possible (such as smart editors or IDEs).

    However, I am aware that not all programmers are equally concerned with writing good code, so the more a language enforces good quality, the better.

    (And before Bart chimes in with examples of the nonsense gcc accepts as
    "valid C" when no flags are given, I would prefer toolchains to have
    stringent extra checks by default and only allow technically legal but
    poor-style code if given explicit flags.)
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Fri Nov 25 09:21:03 2022
    From Newsgroup: comp.lang.misc

    On 24/11/2022 23:00, Dmitry A. Kazakov wrote:
    On 2022-11-24 22:50, James Harris wrote:
    On 24/11/2022 21:35, Dmitry A. Kazakov wrote:
    On 2022-11-24 19:55, James Harris wrote:

    All of the above are examples of poor code - from Int, INT, int to
    MyVal, myVal, myval.

    Which you want to make legal, right?

    No.

    ...

    That is evidently wrong. Why exactly

        int INT;

    must be legal?

    I didn't say it should be.

    But it is. q.e.d.

    It is legal in C - but if James doesn't want it to be legal in /his/
    language, then it won't be.


    Moreover, tools for the case-sensitive languages like C++ do just
    the same. You cannot have reasonable names in C++ anymore. There
    would be lurking clang-format or SonarQube configured to force
    something a three year old suffering dyslexia would pen... (:-))

    As you suggest, for languages which ignore case extra tools are
    needed to help tidy up the code.

    While 99% of all these tools were developed specifically for
    case-sensitive languages? Come on!

    It's a personal view but IMO a language should be independent of, and
    should not rely on, IDEs or special editors.

    Yet you must rely on them in order to prevent:

       int INT;


    No, you don't - not if a language or compiler is designed to prevent
    them. (And it can still be case-sensitive.)
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Fri Nov 25 09:43:41 2022
    From Newsgroup: comp.lang.misc

    On 24/11/2022 22:58, Dmitry A. Kazakov wrote:
    On 2022-11-24 20:23, David Brown wrote:
    On 24/11/2022 19:39, Dmitry A. Kazakov wrote:
    On 2022-11-24 19:07, James Harris wrote:
    On 24/11/2022 18:02, Dmitry A. Kazakov wrote:
    On 2022-11-24 18:56, James Harris wrote:
    On 24/11/2022 16:55, Bart wrote:
    On 24/11/2022 15:03, David Brown wrote:
    On 23/11/2022 23:42, Bart wrote:

    Using case is not a spelling mistake; it's a style. In my
    languages, someone can write 'int', 'Int' or 'INT' according to preference.


    No, it is a mess.

    Allowing int Int INT to be three distinct types, or to represent types, variables, functions etc, is perfectly fine?

         int Int = INT;

    Contrast

       MyVal := a
       myVal := myval + b

    Are you happy for a language to allow so much inconsistency?

    Make it

         MyVal := a
         myVal := MyVal + b

    better be case-sensitive?

    My point (to you and Bart) is that programmers can choose identifier
    names so the latter example need not arise unless it is written
    deliberately;

    Why did you suggest an error? The point is, you could not know.
    Nobody could.

    Of course you could know, if the language requires variables to be
    declared before usage.

    I meant some fancy language where no declarations needed. But OK, take
    this:

       int MyVal = a;
       int myVal = MyVal + b;

    How do you know?


    It is unavoidable in any language, with any rules, that people will be
    able to write confusing code, or that people will be able to make
    mistakes that compilers and tools can't catch. No matter how smart you
    make the language or the tools, that will /always/ be possible.

    Thus there is no benefit in any discussion in stretching examples to
    that point.

    Given the code above, it is clear that it is not the language that is
    flawed, or the tools, or the code - it is the programmer that is flawed.


         int Myval = 1;
         int myval = 2;

    In a case-sensitive language, it is legal but written by an
    intentionally bad programmer - and no matter how hard you try, bad
    programmers will find a way to write bad code.  In a case-insensitive
    language, it is an error written intentionally by a bad programmer.

    Give me the language that helps catch typos, not the language that is
    happy with an inconsistent jumble.

       declare
          Myval : Integer := 1;
          myval : Integer := 2;
       begin

    This is illegal in Ada.

    Great. Ada catches some mistakes. It lets others through. That's life
    in programming.


    but if the compiler folds case then programmers can /mistype/ names
    accidentally, leading to the messy inconsistency mentioned above.

    A programmer cannot mistype names if the language is case-sensitive?


    Sure - but at least some typos are more likely to be caught.

    Purely statistically your argument makes no sense. Since the set of
    unique identifiers in a case-insensitive language is an order of
    magnitude narrower, any probability of mess/error etc is also less under equivalent conditions.

    "Purely statistically" you are talking drivel and comparing one
    countably infinite set with a different countably infinite set.

    There are some programmers who appear to pick identifiers by letting
    their cat walk at random over the keyboard. Most don't. Programmers
    mostly pick the same identifiers regardless of case sensitivity, and
    mostly pick identifiers that differ in more than just case.  Barring
    abusive programmers, the key exception is idioms such as "Point point"
    where "Point" is a type and "point" is an object of that type.


    The only reason to have case-sensitive identifiers is for having
    homographs = for producing mess.


    No, it /avoids/ mess. And case insensitivity does not avoid homographs
    - HellO and He110 are homographs in simple fonts, despite being
    different identifiers regardless of case sensitivity. "Int" and "int"
    are not homographs in any font. "Ρο" and "Po" are homographs,
    regardless of case sensitivity, despite being completely different
    Unicode identifiers (the first uses Greek letters).

    ("Homograph" means they look much the same, but are actually different -
    not that the cases are different.)


    The key benefit of case sensitivity is disallowing inconsistent cases,
    rather than because it allows identifiers that differ in case.


    I prefer mistypes to be considered errors where possible.

    And I gave more or less formal proof why case-insensitive languages are better here.


    Really? I must have missed the "more or less formal" proof. I saw
    some arguments, and I don't disagree that case sensitivity has a
    disadvantage in allowing some new kinds of intentional abuse. But
    that's all.

    Moreover, tools for the case-sensitive languages like C++ do just the
    same. You cannot have reasonable names in C++ anymore. There would be
    lurking clang-format or SonarQube configured to force something a
    three year old suffering dyslexia would pen... (:-))

    Some people know how to use tools properly.

    These people don't buy them and thus do not count... (:-))


    I don't follow. I make a point of learning how to use my tools as best
    I can, whether they are commercial paid-for tools or zero-cost ones.

    But if you mean that the programmers who could most benefit from good
    tools to check style and code quality are precisely the ones that don't
    use them, I agree. Usually they don't even have to buy them or acquire
    them - they already have tools they could use, but don't use them properly.

    If I were making a compiler, all its warnings would be on by default,
    and you'd have to use the flag "-old-bad-code" to disable them.

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Fri Nov 25 09:55:46 2022
    From Newsgroup: comp.lang.misc

    On 24/11/2022 21:52, Bart wrote:
    On 24/11/2022 20:13, James Harris wrote:
    On 24/11/2022 18:42, Bart wrote:

    Nevertheless, I take your point. A programmer /could/ unwisely choose
    to use names which differed only by the case of one letter.

    In C this happens /all the time/. It's almost a requirement. When I translated OpenGL headers, many macro names shared the same
    identifiers as functions if you took away case.


    Please stop cherry-picking code you don't like and assuming all C code
    is like that.

    Oh, and examples like "Point point;" are idiomatic in C. That means
    anyone who understands C will have no problem following the code. It is
    not an issue or bad code. (This is unlike having both "myVar" and
    "MyVar" as identifiers in the same scope - I think everyone agrees that
    that /is/ bad code.)


    Here's a suggestion: make the language case sensitive and have the
    compiler reject programs which give access to two names with no
    changes other than case, such that Myvar and myvar could not be
    simultaneously accessible.

    Apart from not being able to do this:

        Colour colour;

    what would be the point of case sensitivity in this case? Or would the restriction not apply to types? What about a variable that clashes with
    a reserved word when case is ignored?

    (BTW my syntax can represent the above as:

        `Colour colour

    The backtick is case-preserving, and also allows names that clash with reserved words. But I don't want to have to write that in code; this is
    for automatic translation tools, or for a one-off.)

    (SQL has something similar. Table and column names inside quotation
    marks are case sensitive and can contain spaces or match keywords.)


    Case sensitivity is primarily about enforcing consistency, and only secondarily about allowing identifiers that differ only in case.

    As for enforcing rules that prevent identifiers that differ only in
    case, there are many sub-options you could have. (James - these are
    ideas and suggestions, not necessarily recommendations. You pick for
    your language.)

    You could have different namespaces for types, functions, and objects
    (and maybe other entities). So you could have "Point" as a type and
    "point" as an object, but not both "Point" and "point" as types or
    objects. (It's not really any different from allowing an identifier for
    a field in a structure to also be a function name - namespaces are vital.)

    You could enforce a convention on capitalisation, such as types must
    start with a capital and objects start with a lower case letter.
    Whether you also allow "pointa" and "pointA" as separate objects is
    another choice.

    You could say capitals are only allowed at the start of an identifier,
    or after an underscore - "pointA" is not allowed but "point_A" is.
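
    For illustration, a rough sketch in C of that last rule - a checker that
    accepts a capital letter only as the first character or straight after an
    underscore. The function name and behaviour are made up, not taken from
    any real compiler:

    #include <ctype.h>
    #include <stdio.h>

    /* Sketch only: accept a capital letter as the first character or
       immediately after an underscore, reject it anywhere else, so
       "point_A" and "Point" pass but "pointA" does not. */
    static int caps_ok(const char *name)
    {
        for (int i = 0; name[i]; i++) {
            if (isupper((unsigned char)name[i])
                && i != 0 && name[i - 1] != '_')
                return 0;
        }
        return 1;
    }

    int main(void)
    {
        printf("%d %d %d\n", caps_ok("point_A"), caps_ok("pointA"),
               caps_ok("Point"));   /* prints "1 0 1" */
        return 0;
    }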

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Fri Nov 25 10:13:16 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-25 09:18, David Brown wrote:
    On 24/11/2022 22:35, Dmitry A. Kazakov wrote:
    On 2022-11-24 19:55, James Harris wrote:

    All of the above are examples of poor code - from Int, INT, int to
    MyVal, myVal, myval.

    Which you want to make legal, right? Again,

    If int and INT shall never mean two different entities, why do you let
    them?

    If int and INT are poor style when referring to the same entities, why
    do you let them?

    And how exactly would case-sensitivity not let them? So far the outcome:

    case-insensitive: illegal
    case-sensitive: OK

     But case sensitive makes accidental misuse far more likely to be
    caught by the compiler,

    Example please. Typing 'i' instead of 'I' is not a misuse to me.

    Of course there are other options as well, which are arguably better
    than either of these.  One is to say you have to get the case right for consistency, but disallow identifiers that differ only in case.

    You could do that. You can even say that identifiers must be in italics
    and keywords in bold Arial and then apply all your arguments to font
    shapes, sizes, orientation etc. Why not?

    One of the reasons Ada did not do this, and many filesystems as well,
    is that one might wish to be able to convert names to some canonical
    form without changing the meaning. After all this is how the letter case appeared in European languages in the first place - to beautify written
    text.

    If you do not misuse the concept that a program is a text, you should
    have no problem with the idea that text appearance may vary. Never
    changed IDE fonts? (:-))

    The difference is that in a case-sensitive language (such as C) a
    programmer would have deliberately to choose daft names to engineer
    the mess; whereas in a language which ignores case (such as Ada) the
    mess can come about accidentally, via typos.

    That is evidently wrong.

    What exactly is wrong about my statement?  "int INT;" is an example of deliberately daft names.

    What's wrong with the name int? Let's take

    integer Integer;

    Legislating against stupidity or malice is
    /very/ difficult.

    There is no malice, it is quite common practice to do things like:

    void Boo::Foo (Object * object) {
    int This = this->idx;

    etc.

      Legislating against accidents and inconsistency is
    easier, and a better choice.

    Why exactly

        int INT;

    must be legal?

    If a language can make such things illegal, great - but /not/ at the
    cost of making "int a; INT b; inT c = A + B;" legal.

    I don't see any cost here, because int, INT, inT is the same word to me.
    It boils down to how you choose identifiers. If an identifier is a
    combination of dictionary words case/font/size-insensitivity is the most natural choice. If the idea is to obfuscate the meaning, then it quickly becomes pointless since there is no way you could defeat ill intents.

    Moreover, tools for the case-sensitive languages like C++ do just
    the same. You cannot have reasonable names in C++ anymore. There
    would be lurking clang-format or SonarQube configured to force
    something a three year old suffering dyslexia would pen... (:-))

    As you suggest, for languages which ignore case extra tools are
    needed to help tidy up the code.

    While 99% of all these tools were developed specifically for
    case-sensitive languages? Come on!

    I see no problem with using extra tools, or extra compiler warnings, to improve code quality or catch errors.  Indeed, I am a big fan of them.
    As a fallible programmer I like all the help I can get, and I like it as early in the process as possible (such as smart editors or IDEs).

    It is OK, James argued that these tools somewhat exist because of Ada's case-insensitivity! (:-))

    (To me a tool is an indicator of a problem, but that is another story)
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Fri Nov 25 10:24:15 2022
    From Newsgroup: comp.lang.misc

    On 24/11/2022 22:47, James Harris wrote:

    1. Forward declarations should not be needed.


    Usually not, for functions. But sometimes you will need them for
    mutually recursive functions, and I think it makes sense to have some
    kind of module interface definition with a list of declared functions
    (and other entities). In other words, a function should not be exported
    from a module just by writing "export" at the definition. You should
    have an interface section with the declarations (like Pascal), or a
    separate interface file (like Modula-2).
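
    For illustration, the classic case where a forward declaration really is
    needed - two mutually recursive functions (a made-up is_even/is_odd pair,
    in C syntax):

    #include <stdio.h>

    static int is_odd(unsigned n);      /* forward declaration: is_odd is
                                           used before it is defined */

    static int is_even(unsigned n)
    {
        return n == 0 ? 1 : is_odd(n - 1);
    }

    static int is_odd(unsigned n)
    {
        return n == 0 ? 0 : is_even(n - 1);
    }

    int main(void)
    {
        printf("%d %d\n", is_even(10), is_odd(7));   /* prints "1 1" */
        return 0;
    }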

    2. Parameter names should be part of the interface.

    I agree - though not everyone does, so there are choices here too. Some people like to write a declaration such as :

    void rectangle(int top, int left, int width, int height);

    and then the definition :

    void rectangle(t, l, w, h) { ... }


    (In C, there is a big complication. When you need to be most flexible,
    such as for standard libraries, you can't declare a function like "void
    * malloc(size_t size);", because some twat might have defined a macro
    called "size" before including the header. This means declarations
    often have "funny" parameter names with underscores, or no name at all.
    Obviously you will avoid this possibility in your own language!)
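
    A minimal sketch of that hazard, with made-up names rather than the real
    malloc declaration:

    #include <stddef.h>

    #define size 42              /* hypothetical user macro, defined
                                    before including a header */

    /* A header written naively as
           void *my_alloc(size_t size);
       would now expand to "void *my_alloc(size_t 42);" - a syntax error.
       Hence the "funny" or missing parameter names below (my_alloc_* are
       made-up declarations, not a real API): */
    void *my_alloc_a(size_t __sz);       /* reserved-looking name */
    void *my_alloc_b(size_t);            /* no name at all        */

    int main(void) { return 0; }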


    Sometimes a function will have a parameter whose value is not used - it
    can be good to leave it unnamed. That can happen for historical reasons
    as code changes over time. It can also be useful for "tag types" that
    carry no value but are useful for typing, particularly in connection
    with overloaded functions.

    So you might have (this is rough C++ rather than C, since C does not
    have function overloads) :

    struct Rect { ... }; // To save writing top, left, etc.

    struct Fill {}; // A type with no content
    constexpr Fill fill; // An object of that type

    struct NoFill {}; // A type with no content
    constexpr NoFill nofill; // An object of that type


    void draw_rectangle(Rect rect, Fill);
    void draw_rectangle(Rect rect, NoFill);


    Now the user picks the actual function by calling :

    draw_rectangle(r, fill);

    or

    draw_rectangle(r, nofill);

    This is clearer and less error-prone than using "bool fill" as a
    parameter as it conveys more information explicitly when the function is called.

    But as a tag type with no value, there is no benefit in naming the
    parameter - all the information is carried in the type at compile-time,
    not in a value at run-time.

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Fri Nov 25 10:48:26 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-25 09:43, David Brown wrote:
    On 24/11/2022 22:58, Dmitry A. Kazakov wrote:

    I meant some fancy language where no declarations needed. But OK, take
    this:

        int MyVal = a;
        int myVal = MyVal + b;

    How do you know?

    It is unavoidable in any language, with any rules, that people will be
    able to write confusing code, or that people will be able to make
    mistakes that compilers and tools can't catch.  No matter how smart you make the language or the tools, that will /always/ be possible.

    Still, the above is illegal in Ada and legal in C.

    Give me the language that helps catch typos, not the language that is
    happy with an inconsistent jumble.

        declare
           Myval : Integer := 1;
           myval : Integer := 2;
        begin

    This is illegal in Ada.

    Great.  Ada catches some mistakes.  It lets others through.  That's life in programming.

    No. James wanted to give an example of how case-insensitivity may introduce
    bugs, and failed.

    but if the compiler folds case then programmers can /mistype/ names accidentally, leading to the messy inconsistency mentioned above.

    A programmer cannot mistype names if the language is case-sensitive?


    Sure - but at least some typos are more likely to be caught.

    Purely statistically your argument makes no sense. Since the set of
    unique identifiers in a case-insensitive language is an order of
    magnitude narrower, any probability of mess/error etc is also less
    under equivalent conditions.

    "Purely statistically" you are talking drivel and comparing one
    countably infinite set with a different countably infinite set.

    Probability theory deals with infinite sets. Sets must be
    measurable, not countable.

    But the set of identifiers is of course countable, since no human and no
    FSM can deploy infinite identifiers.

    There are some programmers who appear to pick identifiers by letting
    their cat walk at random over the keyboard.  Most don't.  Programmers mostly pick the same identifiers regardless of case sensitivity, and
    mostly pick identifiers that differ in more than just case.  Barring
    abusive programmers, the key exception is idioms such as "Point point"
    where "Point" is a type and "point" is an object of that type.

    It is a bad idiom. Spoken languages use articles and other grammatical
    means to disambiguate classes and their instances. A programming language
    may also have different name spaces for different categories of entities (hello, first-class types, functions etc (:-)). Writing "Point point" specifically in C++ is laziness, stupidity and abuse.

    The only reason to have case-sensitive identifiers is for having
    homographs = for producing mess.

    No, it /avoids/ mess.  And case insensitivity does not avoid homographs
    - HellO and He110 are homographs in simple fonts, despite being
    different identifiers regardless of case sensitivity.  "Int" and "int"
    are not homographs in any font.  "Ρο" and "Po" are homographs,
    regardless of case sensitivity, despite being completely different
    Unicode identifiers (the first uses Greek letters).

    ("Homograph" means they look much the same, but are actually different -
    not that the cases are different.)

    Cannot avoid some homographs, let's introduce more?

    The key benefit of case sensitivity is disallowing inconsistent cases, rather than because it allows identifiers that differ in case.

    How is "point" disallowed by being different from "Point"?

    Some people know how to use tools properly.

    These people don't buy them and thus do not count... (:-))


    I don't follow.  I make a point of learning how to use my tools as best
    I can, whether they are commercial paid-for tools or zero-cost ones.

    But if you mean that the programmers who could most benefit from good
    tools to check style and code quality are precisely the ones that don't
    use them, I agree.  Usually they don't even have to buy them or acquire them - they already have tools they could use, but don't use them properly.

    If I were making a compiler, all its warnings would be on by default,
    and you'd have to use the flag "-old-bad-code" to disable them.

    Ideally, you should not need a tool if your primary instrument (the
    language) works well.
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Fri Nov 25 10:52:17 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 10:13, Dmitry A. Kazakov wrote:
    On 2022-11-25 09:18, David Brown wrote:
    On 24/11/2022 22:35, Dmitry A. Kazakov wrote:
    On 2022-11-24 19:55, James Harris wrote:

    All of the above are examples of poor code - from Int, INT, int to
    MyVal, myVal, myval.

    Which you want to make legal, right? Again,

    If int and INT shall never mean two different entities, why do you
    let them?

    If int and INT are poor style when referring to the same entities, why
    do you let them?

    And how exactly would case-sensitivity not let them? So far the outcome:

    case-insensitive: illegal
    case-sensitive:   OK

    You misunderstood my question.

    You dislike case sensitivity because it lets you have two different identifiers written "int" and "INT". That is a fair point, and a clear disadvantage of case sensitivity.

    But if you have a case insensitive language, it lets you write "int" and
    "INT" for the /same/ identifier, despite written differences. That is a
    clear disadvantage of case /insensitivity/.



     But case sensitive makes accidental misuse far more likely to be
    caught by the compiler,

    Example please. Typing 'i' instead of 'I' is not a misuse to me.

    If I accidentally type "I" instead of "i", a C compiler will catch the
    error. "for (int i = 0; I < 10; i++) ..." It's an error in C.


    Of course there are other options as well, which are arguably better
    than either of these.  One is to say you have to get the case right
    for consistency, but disallow identifiers that differ only in case.

    You could do that. You can even say that identifiers must be in italics
    and keywords in bold Arial and then apply all your arguments to font
    shapes, sizes, orientation etc. Why not?

    Sorry, I was only giving sensible suggestions.


    One of the reasons Ada did not do this and many filesystems as well,
    because one might wish to be able to convert names to some canonical
    form without changing the meaning. After all this is how the letter case appeared in European languages in the first place - to beautify written text.

    There is a very simple canonical form for ASCII text - leave it alone.
    For Unicode, there is a standard normalisation procedure (converting
    combining diacriticals into single combination codes where applicable).

    Ada has its roots in a time when many programming languages were
    all-caps, at least for their keywords, and significant computer systems
    were still using punched cards, 6-bit character sizes, and other
    limitations. If you wanted a language that could be used widely (and
    that was one of Ada's aim) without special requirements, you had to
    accept that some people would be using all-caps. At the same time, it
    was clear by then that all-caps was ugly and people preferred to use
    small letters when possible. The obvious solution was to make the
    language case-insensitive, like many other languages of that time (such
    as Pascal, which was a big influence for Ada). It was a /practical/
    decision, not made because someone thought being case-insensitive made
    the language inherently better.


    If you do not misuse the concept that a program is a text, you should
    have no problem with the idea that text appearance may vary. Never
    changed IDE fonts? (:-))

    The difference is that in a case-sensitive language (such as C) a
    programmer would have deliberately to choose daft names to engineer
    the mess; whereas in a language which ignores case (such as Ada) the
    mess can come about accidentally, via typos.

    That is evidently wrong.

    What exactly is wrong about my statement?  "int INT;" is an example of
    deliberately daft names.

    What's wrong with the name int? Let's take

       integer Integer;

    OK, so I assume you now agree there was nothing wrong with my statement,
    since you can't say what you thought was wrong.


    Legislating against stupidity or malice is /very/ difficult.

    There is no malice, it is quite common practice to do things like:

       void Boo::Foo (Object * object) {
          int This = this->idx;

    etc.


    As has been said, again and again, writing something like "Object
    object" is a common idiom and entirely clear to anyone experienced as a
    C or C++ programmer. It is less common to see a pointer involved
    (idiomatic C++ would likely have "object" as a reference or const
    reference here). I can't remember ever seeing a capitalised keyword
    used as an identifier - it is /far/ from common practice. It counts as stupidity, not malice.


      Legislating against accidents and inconsistency is
    easier, and a better choice.

    Why exactly

        int INT;

    must be legal?

    If a language can make such things illegal, great - but /not/ at the
    cost of making "int a; INT b; inT c = A + B;" legal.

    I don't see any cost here, because int, INT, inT is the same word to me.

    They are so visually distinct that there is a higher cognitive cost in
    reading them - that makes them bad, even when you know they mean the
    same thing. (The same applies to confusingly similar but distinct identifiers, regardless of case sensitivity - they require more brain
    effort to comprehend.) Higher cognitive cost translates to more
    effort, lower productivity, and higher error rates - you make more
    errors when you type, and you spot fewer errors when you read.

    It boils down to how you choose identifiers. If an identifier is a combination of dictionary words case/font/size-insensitivity is the most natural choice. If the idea is to obfuscate the meaning, then it quickly becomes pointless since there is no way you could defeat ill intents.

    Moreover, tools for the case-sensitive languages like C++ do just
    the same. You cannot have reasonable names in C++ anymore. There
    would be lurking clang-format or SonarQube configured to force
    something a three year old suffering dyslexia would pen... (:-))

    As you suggest, for languages which ignore case extra tools are
    needed to help tidy up the code.

    While 99% of all these tools were developed specifically for
    case-sensitive languages? Come on!

    I see no problem with using extra tools, or extra compiler warnings,
    to improve code quality or catch errors.  Indeed, I am a big fan of
    them. As a fallible programmer I like all the help I can get, and I
    like it as early in the process as possible (such as smart editors or
    IDEs).

    It is OK, James argued that these tools somewhat exist because of Ada's case-insensitivity! (:-))

    (To me a tool is an indicator of a problem, but that is another story)


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From James Harris@james.harris.1@gmail.com to comp.lang.misc on Fri Nov 25 10:12:53 2022
    From Newsgroup: comp.lang.misc

    On 24/11/2022 22:00, Dmitry A. Kazakov wrote:
    On 2022-11-24 22:50, James Harris wrote:
    On 24/11/2022 21:35, Dmitry A. Kazakov wrote:
    On 2022-11-24 19:55, James Harris wrote:

    All of the above are examples of poor code - from Int, INT, int to
    MyVal, myVal, myval.

    Which you want to make legal, right?

    No.

    ...

    That is evidently wrong. Why exactly

        int INT;

    must be legal?

    I didn't say it should be.

    But it is. q.e.d.

    Not necessarily. As I said before, within a scope names which vary only
    by case could be prohibited.


    Moreover, tools for the case-sensitive languages like C++ do just
    the same. You cannot have reasonable names in C++ anymore. There
    would be lurking clang-format or SonarQube configured to force
    something a three year old suffering dyslexia would pen... (:-))

    As you suggest, for languages which ignore case extra tools are
    needed to help tidy up the code.

    While 99% of all these tools were developed specifically for
    case-sensitive languages? Come on!

    It's a personal view but IMO a language should be independent of, and
    should not rely on, IDEs or special editors.

    Yet you must rely on them in order to prevent:

       int INT;

    No, the compiler could detect it. There would be no need for special tools.
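
    As a rough sketch of such a check (a hypothetical helper doing a simple
    linear scan over the names already declared in the scope - an
    illustration only, in C syntax, not any real compiler):

    #include <ctype.h>
    #include <stdio.h>
    #include <string.h>

    /* Do two names match when letter case is ignored? */
    static int equal_fold(const char *a, const char *b)
    {
        for (; *a && *b; a++, b++)
            if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
                return 0;
        return *a == *b;         /* both strings ended together */
    }

    /* Sketch: refuse a new declaration if the scope already holds the
       same name, or a name differing from it only in case. */
    static int scope_declare(const char **names, int count, const char *name)
    {
        for (int i = 0; i < count; i++) {
            if (strcmp(names[i], name) == 0)
                return -1;       /* exact redeclaration */
            if (equal_fold(names[i], name))
                return -2;       /* differs only in case: also rejected */
        }
        return 0;                /* fine to declare */
    }

    int main(void)
    {
        const char *scope[] = { "int", "Myvar" };
        printf("%d %d %d\n",
               scope_declare(scope, 2, "INT"),      /* -2 */
               scope_declare(scope, 2, "myvar"),    /* -2 */
               scope_declare(scope, 2, "count"));   /*  0 */
        return 0;
    }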
    --
    James Harris


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From James Harris@james.harris.1@gmail.com to comp.lang.misc on Fri Nov 25 10:31:39 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 08:18, David Brown wrote:
    On 24/11/2022 22:35, Dmitry A. Kazakov wrote:

    ...

    Why exactly

        int INT;

    must be legal?

    If a language can make such things illegal, great - but /not/ at the
    cost of making "int a; INT b; inT c = A + B;" legal.

    Well put! It's that kind of mess which makes me dislike the idea of a
    language ignoring case. I don't understand how anyone can think that a compiler actually allowing such a jumble and viewing it as legal is a
    good idea.

    Anyone reading or having to maintain code with such a mixture of cases
    would be justified in thinking that either there was some unofficial
    stropping scheme that he was supposed to adhere to or the programmer who
    wrote it was sloppy.
    --
    James Harris


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Fri Nov 25 11:37:18 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-25 10:52, David Brown wrote:
    On 25/11/2022 10:13, Dmitry A. Kazakov wrote:
    On 2022-11-25 09:18, David Brown wrote:
    On 24/11/2022 22:35, Dmitry A. Kazakov wrote:
    On 2022-11-24 19:55, James Harris wrote:

    All of the above are examples of poor code - from Int, INT, int to
    MyVal, myVal, myval.

    Which you want to make legal, right? Again,

    If int and INT shall never mean two different entities, why do you
    let them?

    If int and INT are poor style when referring to the same entities,
    why do you let them?

    And how exactly would case-sensitivity not let them? So far the
    outcome:

    case-insensitive: illegal
    case-sensitive:   OK

    You misunderstood my question.

    You dislike case sensitivity because it lets you have two different identifiers written "int" and "INT".  That is a fair point, and a clear disadvantage of case sensitivity.

    But if you have a case insensitive language, it lets you write "int" and "INT" for the /same/ identifier, despite written differences.  That is a clear disadvantage of case /insensitivity/.

    Only when identifiers are not supposed to mean anything, which is not
    how I want programs to be. So to me, in context of programming as an
    activity to communicate ideas written in programs, this disadvantage
    does not exist.

     But case sensitive makes accidental misuse far more likely to be
    caught by the compiler,

    Example please. Typing 'i' instead of 'I' is not a misuse to me.

    If I accidentally type "I" instead of "i", a C compiler will catch the error.  "for (int i = 0; I < 10; i++) ..."  It's an error in C.

    But this is no error to me, because there cannot be two different objects
    named i and I.

    Of course there are other options as well, which are arguably better
    than either of these.  One is to say you have to get the case right
    for consistency, but disallow identifiers that differ only in case.

    You could do that. You can even say that identifiers must be in
    italics and keywords in bold Arial and then apply all your arguments
    to font shapes, sizes, orientation etc. Why not?

    Sorry, I was only giving sensible suggestions.

    Why is distinction of case sensible and distinction of fonts not?

    One of the reasons Ada did not do this, and many filesystems as well,
    is that one might wish to be able to convert names to some canonical
    form without changing the meaning. After all this is how the letter
    case appeared in European languages in the first place - to beautify
    written text.

    There is a very simple canonical form for ASCII text - leave it alone.

    No, regarding identifiers the alphabet is not ASCII, never was. At best
    you can say let identifiers be Latin letters plus some digits, maybe
    some binding signs. ASCII provides means to encode, in particular, Latin letters. Letters can be encoded in a great number of ways.

    For Unicode, there is a standard normalisation procedure (converting combining diacriticals into single combination codes where applicable).

    Ada has its roots in a time when many programming languages were
    all-caps, at least for their keywords, and significant computer systems
    were still using punched cards, 6-bit character sizes, and other limitations.  If you wanted a language that could be used widely (and
    that was one of Ada's aim) without special requirements, you had to
    accept that some people would be using all-caps.  At the same time, it
    was clear by then that all-caps was ugly and people preferred to use
    small letters when possible.  The obvious solution was to make the
    language case-insensitive, like many other languages of that time (such
    as Pascal, which was a big influence for Ada).  It was a /practical/ decision, not made because someone thought being case-insensitive made
    the language inherently better.

    Ada 83 style used bold lower case letters for keywords and upper case
    letters for identifiers.

    Legislating against stupidity or malice is /very/ difficult.

    There is no malice, it is quite common practice to do things like:

        void Boo::Foo (Object * object) {
           int This = this->idx;

    etc.

    As has been said, again and again, writing something like "Object
    object" is a common idiom and entirely clear to anyone experienced as a
    C or C++ programmer.

    "it should be entirely clear for anyone..." is no argument.

    I can't remember ever seeing a capitalised keyword
    used as an identifier - it is /far/ from common practice.  It counts as stupidity, not malice.

    Why is using properly spelt words stupidity? (:-))

       Legislating against accidents and inconsistency is
    easier, and a better choice.

    Why exactly

        int INT;

    must be legal?

    If a language can make such things illegal, great - but /not/ at the
    cost of making "int a; INT b; inT c = A + B;" legal.

    I don't see any cost here, because int, INT, inT is the same word to me.

    They are so visually distinct that there is a higher cognitive cost in reading them - that makes them bad, even when you know they mean the
    same thing.

    Come on, they are not visually distinct, just open any book and observe capital letters at the beginning of every sentence!

    If there is any cost, it is keeping in mind artificially introduced
    differences.
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Fri Nov 25 11:41:02 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-25 11:12, James Harris wrote:
    On 24/11/2022 22:00, Dmitry A. Kazakov wrote:
    On 2022-11-24 22:50, James Harris wrote:
    On 24/11/2022 21:35, Dmitry A. Kazakov wrote:
    On 2022-11-24 19:55, James Harris wrote:

    All of the above are examples of poor code - from Int, INT, int to
    MyVal, myVal, myval.

    Which you want to make legal, right?

    No.

    ...

    That is evidently wrong. Why exactly

        int INT;

    must be legal?

    I didn't say it should be.

    But it is. q.e.d.

    Not necessarily. As I said before, within a scope names which vary only
    by case could be prohibited.

    You can introduce such rules, but so could a case-insensitive language
    as well. The rule as such is agnostic to the choice.

    Yet you must rely on them in order to prevent:

        int INT;

    No, the compiler could detect it.

    How? Without additional rules (see above) this is perfectly legal in a case-sensitive language.
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Fri Nov 25 11:50:58 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 09:52, David Brown wrote:
    On 25/11/2022 10:13, Dmitry A. Kazakov wrote:

    And how exactly would case-sensitivity not let them? So far the
    outcome:

    case-insensitive: illegal
    case-sensitive:   OK

    You misunderstood my question.

    You dislike case sensitivity because it lets you have two different identifiers written "int" and "INT".  That is a fair point, and a clear disadvantage of case sensitivity.

    But this happens in real code. For example `enum (INT, FLOAT, DOUBLE)`,
    plus of course `Image image`.


    But if you have a case insensitive language, it lets you write "int" and "INT" for the /same/ identifier, despite written differences.  That is a clear disadvantage of case /insensitivity/.

    This could happen in real code, but it very rarely does.

    So one is a real disadvantage, the other only a perceived one. Here's
    another issue:

    zfail
    zFar
    zNear
    zpass

    These don't clash. But there are two patterns here: small z followed by either a capitalised word or a non-capitalised one. How do you remember which
    is which? With case-sensitive, you /have/ to get it right.

    With case-insensitive, if these identifiers were foisted on you, you can choose to use more consistent capitalisation.

     But case sensitive makes accidental misuse far more likely to be
    caught by the compiler,

    Example please. Typing 'i' instead of 'I' is not a misuse to me.

    If I accidentally type "I" instead of "i", a C compiler will catch the error.  "for (int i = 0; I < 10; i++) ..."  It's an error in C.

    (This, however, isn't:)

    for (int i=0; i<M; ++i)
    for (int j=0; j<N; ++i)

    I can't reproduce your exact example, because my loop headers only
    mention the index once, but I can write this:

    for i to n do
    a[I] := 0
    end

    It's not an error, so no harm done. At some point it will be noticed
    that one of those has the wrong case, and it will be fixed.

    It is a complete non-issue.

    There is a very simple canonical form for ASCII text - leave it alone.
    For Unicode, there is a standard normalisation procedure (converting combining diacriticals into single combination codes where applicable).

    Ada has its roots in a time when many programming languages were
    all-caps, at least for their keywords, and significant computer systems
    were still using punched cards, 6-bit character sizes, and other limitations.  If you wanted a language that could be used widely (and
    that was one of Ada's aim) without special requirements, you had to
    accept that some people would be using all-caps.  At the same time, it
    was clear by then that all-caps was ugly


    Yes, it is, in the wrong font. Which I take advantage of by writing
    debugging code in all-caps. Even commented out, it stands out so it is
    clear which comments could be deleted, and which comments contain code
    that is temporarily out of use or not ready.

    I also tend to write such code unindented, which further highlights it
    and saves effort. But I couldn't do that in Python: all code must be in the proper case, and properly indented. There is no redundancy at all.


    and people preferred to use
    small letters when possible.  The obvious solution was to make the
    language case-insensitive, like many other languages of that time (such
    as Pascal, which was a big influence for Ada).  It was a /practical/ decision, not made because someone thought being case-insensitive made
    the language inherently better.

    Where did the fad for case-sensitivity really come from? Was it someone
    just being lazy, because in a lexer it is easier to process A-Z and a-z
    as distinct letters rather than convert to some canonical form (eg. I
    store as all-lower-case)?
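
    As a rough sketch of that kind of canonicalisation (a made-up helper for
    illustration, in C, not anyone's actual lexer):

    #include <ctype.h>
    #include <stdio.h>

    /* Fold an identifier to lower case as it is read, so that
       "MyVal", "myval" and "MYVAL" all produce the same key. */
    static void fold_identifier(const char *src, char *dst, int cap)
    {
        int i = 0;
        while (src[i] && i < cap - 1) {
            dst[i] = (char)tolower((unsigned char)src[i]);
            i++;
        }
        dst[i] = '\0';
    }

    int main(void)
    {
        char key[64];
        fold_identifier("xHotSpot", key, sizeof key);
        printf("%s\n", key);     /* prints "xhotspot" */
        return 0;
    }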

    Not just in source, but for text everywhere in a computer system, even user-facing code. But I guess in the early days, everyone involved would
    be some technical member of staff. The trouble is that eventually these
    systems would have non-technical users.

    I'd imagine that not many men-in-the-street directly used Unix in the
    70s and 80s, so I don't know what they would have made of
    case-sensitivity everywhere.

    But millions of ordinary people did use OSes like CP/M and DOS, which
    thank god were case-insensitive.

    I spent a year or two doing technical support on the phone for customers
    using our computers; I dread to think what things could have been like!

    Now ordinary users normally use GUIs, gestures etc, even voice, which
    largely insulates them from the underlying case-sensitivity of the
    machine (which don't tell me, is based on Linux and written in C).

    As I said elsewhere, normal people only really come across it with
    passwords, or the latter part of URLs since those are generally part of
    a Linux file path.

    But I think it is generally understood that case-sensitivity is bad for ordinary users.

    There is no malice, it is quite common practice to do things like:

        void Boo::Foo (Object * object) {
           int This = this->idx;

    etc.


    As has been said, again and again, writing something like "Object
    object" is a common idiom and entirely clear to anyone experienced as a
    C or C++ programmer.  It is less common to see a pointer involved (idiomatic C++ would likely have "object" as a reference or const
    reference here).  I can't remember ever seeing a capitalised keyword
    used as an identifier - it is /far/ from common practice.  It counts as stupidity, not malice.

    This is from a project called "c4", a C compiler in 500 lines and 4
    functions:

    enum {
    Num = 128, Fun, Sys, Glo, Loc, Id,
    Char, Else, Enum, If, Int, Return, Sizeof, While,
    Assign, Cond, Lor, Lan, Or, Xor, And, Eq, Ne, Lt, Gt, Le, Ge,
    Shl, Shr, Add, Sub, Mul, Div, Mod, Inc, Dec, Brak
    };

    enum { CHAR, INT, PTR };

    Both Int and INT are used.
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From James Harris@james.harris.1@gmail.com to comp.lang.misc on Fri Nov 25 11:55:49 2022
    From Newsgroup: comp.lang.misc

    On 24/11/2022 22:51, Bart wrote:
    On 24/11/2022 21:47, James Harris wrote:
    On 24/11/2022 20:52, Bart wrote:

    Yes, although it's a function declaration; the presumably incorrectly
    typed identifier zSQL is ignored.

    We don't know the purpose of zSQL. But the point is it is there, a
    slightly differently-cased version of the same name, which I can't
    for the life of me recall right now. That is the problem.

    You brought up sqlite3.c but if it is over 100,000 lines of code in one
    file (Andy would not care much for it...) I'm not sure that it's a valid example for anything!

    Nevertheless, you mentioned the purpose of the capitalised zSQL. It
    appears only at the end of the declaration of sqlite3_declare_vtab:

    SQLITE_API SQLITE_EXPERIMENTAL int sqlite3_declare_vtab(sqlite3*,
    const char *zSQL);

    SQLITE_EXPERIMENTAL is defined to be blank so the declaration matches
    with the definition

    SQLITE_API int sqlite3_declare_vtab(sqlite3 *db, const char
    *zCreateTable){

    The final parameter is still a const char *. AIUI the name in the
    prototype, the zSQL you mentioned as anomalous, is ignored which is
    presumably why the programmer left it with a case mismatch.

    So the problem here is not case but that in such circumstances C
    requires forward declarations and, because the parameter names are
    ignored, the compiler does not have to check for a mismatch.
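
    (To illustrate the point, a minimal sketch - a reduction, not the real
    sqlite3 source - of why the mismatch goes unnoticed: C checks a prototype
    against a definition on parameter types alone, so the names zSQL and
    zCreateTable never have to agree.)

    struct sqlite3;   /* an incomplete type is enough for pointer parameters */

    /* Declaration: the parameter name zSQL is documentation only. */
    int sqlite3_declare_vtab(struct sqlite3 *db, const char *zSQL);

    /* Definition: a differently-spelled name is accepted without complaint,
       because only the types are compared. */
    int sqlite3_declare_vtab(struct sqlite3 *db, const char *zCreateTable)
    {
        (void)db;
        (void)zCreateTable;
        return 0;
    }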



    (If I look back, it is zSql. But if I now encounter these even in 10
    minutes time, which one would be which? I would forget.)



    Similar could be said for any names which differed only slightly.

    Say short, Short and SHORT out loud; any difference?

    Yes, they get louder. ;-)


    You're debugging some code and need to print out the value of hmenu. Or
    is it hMenu or Hmenu? Personally I am half-blind to case usage because it
    is so commonly irrelevant and ignored in English.

    Do you not use a consistent scheme? Perhaps when a compiler ignores case
    it encourages programmers to be inconsistent. I don't think inconsistent capitalisation is a good thing but I accept that YMMV.


    I would be constantly double-checking and constantly getting it wrong
    too. And that's with just one of these three in place.

    Differences in spelling are another matter; I'm a good speller.

    You might notice when the cup you're handed in Starbucks has Janes
    rather than James and would want to check it is yours; but you probably wouldn't care if it's james or James or JAMES because that is just style.
    You know they are all the same name.

    I wouldn't write all three variants in one program. Programmers should
    be consistent, IMO, and the compiler should check.


    But also, just because people can make typos by pressing the wrong
    letter or getting the length wrong doesn't make allowing 2**N more
    incorrect possibilities acceptable.

    With case-sensitive languages programmers just need to follow sensible conventions and to keep to them consistently.


    In C this happens /all the time/. It's almost a requirement. When I
    translated OpenGL headers, many macro names shared the same
    identifiers with functions if you took away case.

    One cannot stop programmers doing daft things. For example, a
    programmer could declare names such as

       CreateTableForwandReference

    and

       createtableforwardrefarence

    The differences are not obvious.

    So to fix it, we allow

        CreateTableForwardRefarence

        createtableforwandreference

    as synonyms? Case sensitive, you have subtle differences in letters
    /plus/ subtle differences in case!

    You have not solved the problem. It's still impossible to prevent evil programmers from writing confusing code if they are determined to do so
    - or are administrators rather than programmers!


    Maybe the best a language designer can do for cases such as this is to
    help reduce the number of different names a programmer would have to
    define in any given location.

    Given the various restrictions you've mentioned that you'd want even
    with case sensitive names, is there any point to having case
    sensitivity? What would it be used for; what would it allow?

    Yes. The problem I have with case insensitivity is that it /allows/ inconsistent coding. I'd rather the compiler refused to compile code
    which uses myVar, MyVar, myvar and MYVAR for the same thing.

    Case can be indicative. For example, consider the identifier

    barbend

    Does it mean the bend in a bar or the end of a barb...? With
    capitalisation or underscores (or hyphens in languages which support
    them) the meaning can be made clear.
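
    (For instance, a tiny illustration of how marking the word boundary
    resolves the ambiguity:)

    int barb_end;   /* the end of a barb  */
    int bar_bend;   /* the bend in a bar  */
    int barBend;    /* camel-case reading: the bend in a bar */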


    I had a scripting language that shipped with my applications. While case insensitive, I would usually write keywords in lower case as if, then, while.

    But some users would write If, Then, While, and make more use in
    identifiers of mixed case. And they would capitalise global variables
    that I defined in lower case.
    As I said before, a compiler can prohibit names which fold to the same
    string. (Whether they should do or not is an open question, however.)
    --
    James Harris


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Fri Nov 25 12:08:30 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 09:24, David Brown wrote:
    On 24/11/2022 22:47, James Harris wrote:

    1. Forward declarations should not be needed.


    Usually not, for functions.  But sometimes you will need them for
    mutually recursive functions,

    Not even then. Modern languages seem to deal with out-of-order
    functions without needing special declarations.

    and I think it makes sense to have some
    kind of module interface definition with a list of declared functions
    (and other entities).  In other words, a function should not be exported from a module just by writing "export" at the definition.

    Why not?

      You should
    have an interface section with the declarations (like Pascal), or a
    separate interface file (like Modula-2).

    Then you have the same information repeated in two places.

    If you need a summary of the interface without exposing the
    implementation, this can be done automatically by a compiler, which can
    be done for those functions marked with 'export'.

    (In my languages, which use whole-program compilers, such an exports
    file is only needed to export names from the whole program, when it
    forms a complete library.

    Plus I need to create such a file to create bindings in my language to
    FFI libraries. But there I don't have the sources of those libraries)

    2. Parameter names should be part of the interface.

    I agree - though not everyone does, so there are choices here too.

    Since in my languages you only ever specify the function header in one
    place - where it's defined - parameter names are mandatory. And there is
    only ever one set.

      Some
    people like to write a declaration such as :

        void rectangle(int top, int left, int width, int height);

    and then the definition :

        void rectangle(t, l, w, h) { ... }

    That's how my systems language worked for 20 years. I find it
    astonishing now that I tolerated it for so long.

    Well, actually not quite: the declaration listed only types; the
    definition only names. Names in the declaration had no use.


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From James Harris@james.harris.1@gmail.com to comp.lang.misc on Fri Nov 25 12:13:54 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 10:41, Dmitry A. Kazakov wrote:
    On 2022-11-25 11:12, James Harris wrote:
    On 24/11/2022 22:00, Dmitry A. Kazakov wrote:
    On 2022-11-24 22:50, James Harris wrote:
    On 24/11/2022 21:35, Dmitry A. Kazakov wrote:
    On 2022-11-24 19:55, James Harris wrote:

    All of the above are examples of poor code - from Int, INT, int to MyVal, myVal, myval.

    Which you want to make legal, right?

    No.

    ...

    That is evidently wrong. Why exactly

        int INT;

    must be legal?

    I didn't say it should be.

    But it is. q.e.d.

    Not necessarily. As I said before, within a scope names which vary
    only by case could be prohibited.

    You can introduce such rules, but so could a case-insensitive language
    as well. The rule as such is agnostic to the choice.

    No, I'm arguing for consistency - and consistency that can be enforced
    by the compiler. The thing I dislike is the inconsistency allowed by
    case insensitivity.


    Yet you must rely on them in order to prevent:

        int INT;

    No, the compiler could detect it.

    How? Without additional rules (see above) this is perfectly legal in a case-sensitive language.

    As I say, names which fold to the same string can be detected and
    prohibited by the compiler.
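
    (A minimal sketch of such a check, with invented names: the compiler
    folds each new name and compares it with the names already declared in
    the scope, rejecting any pair that differ only in case.)

    #include <ctype.h>

    /* Hypothetical helper: do two identifiers differ at most in letter case? */
    static int fold_equal(const char *a, const char *b)
    {
        while (*a != '\0' && *b != '\0') {
            if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
                return 0;
            a++;
            b++;
        }
        return *a == *b;   /* true only if both strings ended together */
    }

    /* A compiler could call this when adding a name to a scope: reject INT
       if int (or Int, ...) is already declared there, while still requiring
       every later use to match the declared spelling exactly. */
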
    --
    James Harris


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Fri Nov 25 12:40:30 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 10:31, James Harris wrote:
    On 25/11/2022 08:18, David Brown wrote:
    On 24/11/2022 22:35, Dmitry A. Kazakov wrote:

    ...

    Why exactly

        int INT;

    must be legal?

    If a language can make such things illegal, great - but /not/ at the
    cost of making "int a; INT b; inT c = A + B;" legal.

    Well put! It's that kind of mess which makes me dislike the idea of a language ignoring case.

    But, that never happens! If it does, you can change it to how you like;
    the program still works; that is the advantage.

    C allows that line to be written like this:

    i\
    n\
    t\

    a\
    ;\

    i\
    n\
    t\

    b\
    ;\

    i\
    n\
    t\

    c
    =
    a
    +
    b;


    In C, any *token* can be split across multiple lines using line
    continuation, even ones like '//', and string literals (and in the
    middle of string escape sequences).

    But have you ever seen that? Should you disallow this feature?

    Plus, you can make that original line legal in C too:

    #define INT int
    #define inT int
    #define A a
    #define B a

    int a; INT b; inT c = A + B;

    So, what do we ban here? C also allows this:

    Point:; struct Point Point;

    Here you don't even need to change case!

    All sorts of nonsense can be written legally, some of it more dangerous
    than being lax about letter case.

    You might know that A[i] can be written as i[A]; did you know you can
    also write i[A][A][A][A][A][A][A][A]?
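
    (For anyone who has not met this quirk: it falls out of the definition
    a[i] == *(a + i), so the operands can be swapped, and because each
    indexing step yields an int it can be chained. A small sketch:)

    #include <stdio.h>

    int main(void)
    {
        int A[10] = {3, 1, 4, 1, 5, 9, 2, 6, 5, 3};
        int i = 2;

        printf("%d %d\n", A[i], i[A]);   /* both print 4 */
        printf("%d\n", i[A][A][A]);      /* each [A] re-indexes the array */
        return 0;
    }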

    Did you know that you can write a simple function pointer as:

    void(*)(void)

    Oh, hang on, that's how C works anyway!


    I don't understand how anyone can think that a
    compiler actually allowing such a jumble and viewing it as legal is a
    good idea.

    Because it is blind to case? In the same way it doesn't see extra or misleading white space which can lead to even worse jumbles.

    The solution is easy: just make your language case-sensitive if that is
    your preference.

    Others may make theirs case-insensitive. Because it doesn't look like
    anyone is going to change their mind about this stuff.

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Fri Nov 25 14:46:22 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 11:41, Dmitry A. Kazakov wrote:
    On 2022-11-25 11:12, James Harris wrote:
    On 24/11/2022 22:00, Dmitry A. Kazakov wrote:
    On 2022-11-24 22:50, James Harris wrote:
    On 24/11/2022 21:35, Dmitry A. Kazakov wrote:
    On 2022-11-24 19:55, James Harris wrote:



    Restoring the snippet context:

    While 99% of all these tools were developed specifically for
    case-sensitive languages? Come on!

    It's a personal view but IMO a language should be independent of,
    and should not rely on, IDEs or special editors.




    Yet you must rely on them in order to prevent:

        int INT;

    No, the compiler could detect it.

    How? Without additional rules (see above) this is perfectly legal in a case-sensitive language.


    I think it is quite clear (with the restored context) that James meant a compiler could detect the undesirable "int INT;" pattern, without the
    need of additional tools or smart editors. This is, of course, entirely
    true. Even if this is only considered a "bad style warning", rather
    than being ruled against by the language grammar or constraints, a
    compiler could check it.
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From James Harris@james.harris.1@gmail.com to comp.lang.misc on Fri Nov 25 14:20:54 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 12:40, Bart wrote:
    On 25/11/2022 10:31, James Harris wrote:
    On 25/11/2022 08:18, David Brown wrote:
    On 24/11/2022 22:35, Dmitry A. Kazakov wrote:

    ...

    Why exactly

        int INT;

    must be legal?

    If a language can make such things illegal, great - but /not/ at the
    cost of making "int a; INT b; inT c = A + B;" legal.

    Well put! It's that kind of mess which makes me dislike the idea of a
    language ignoring case.

    But, that never happens!

    Eh???? You say it never happens when very similar does happen ... and
    then you go on to give your own counterexamples that are so outre they
    really never happen! Que????

    If it does, you can change it to how you like;
    the program still works; that is the advantage.

    C allows that line to be written like this:

    i\
    n\
    t\

    a\
    ;\

    i\
    n\
    t\

    b\
    ;\

    i\
    n\
    t\

    c
    =
    a
    +
    b;

    ...

    All sorts of nonsense can written legally, some of it more dangerous
    than being lax about letter case.

    Of course but as said that's the same whether case is recognised or not.
    A determined programmer can /always/ write garbage and a language cannot prevent him from doing so.

    The difference is the case where the programmer doesn't mean to be
    inconsistent but simply makes a mistake and inadvertently writes the
    same identifier in different cases. A compiler can pick that up so that identifier names can be written the same each time they are used, making
    code more consistent and the intent of any capitalisation clearer.
    What's not to like...?!

    ...

    I don't understand how anyone can think that a compiler actually
    allowing such a jumble and viewing it as legal is a good idea.

    Because it is blind to case? In the same way it doesn't see extra or misleading white space which can lead to even worse jumbles.

    The solution is easy: just make your language case-sensitive if that is
    your preference.

    Indeed, and possibly prohibit names which differ only by case.


    Others may make theirs case-insensitive. Because it doesn't look like
    anyone is going to change their mind about this stuff.

    That's never stopped Usenet discussions before!

    I'll add one more thing. If case is to be ignored (your preference) then
    there is nothing to stop different programmers adopting different
    non-standard conventions for how they personally capitalise parts of
    names, making code even harder to maintain.

    I am not a particular advocate of case sensitivity. It's just that where
    case is ignored I don't personally like the unnecessary legitimising of inconsistently written names. A compiler can report such discrepancies.
    You know it makes sense! ;-)
    --
    James Harris


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Fri Nov 25 15:46:52 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-25 14:46, David Brown wrote:
    On 25/11/2022 11:41, Dmitry A. Kazakov wrote:
    On 2022-11-25 11:12, James Harris wrote:
    On 24/11/2022 22:00, Dmitry A. Kazakov wrote:
    Yet you must rely on them in order to prevent:

        int INT;

    No, the compiler could detect it.

    How? Without additional rules (see above) this is perfectly legal in a
    case-sensitive language.

    I think it is quite clear (with the restored context) that James meant a compiler could detect the undesirable "int INT;" pattern, without the
    need of additional tools or smart editors.  This is, of course, entirely true.

    No, it cannot without additional rules, which, BTW, case-sensitive
    languages do not have. [A Wikipedia listed one? (:-))]

    We do not argue about such rules. We do about case-sensitivity.
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Fri Nov 25 15:52:01 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 11:37, Dmitry A. Kazakov wrote:
    On 2022-11-25 10:52, David Brown wrote:
    On 25/11/2022 10:13, Dmitry A. Kazakov wrote:
    On 2022-11-25 09:18, David Brown wrote:
    On 24/11/2022 22:35, Dmitry A. Kazakov wrote:
    On 2022-11-24 19:55, James Harris wrote:

    All of the above are examples of poor code - from Int, INT, int to MyVal, myVal, myval.

    Which you want to make legal, right? Again,

    If int and INT shall never mean two different entities, why do you
    let them?

    If int and INT are poor style when referring to the same entities,
    why do you let them?

    And how exactly would case-sensitiveness not let them? So far the
    outcome:

    case-insensitive: illegal
    case-sensitive:   OK

    You misunderstood my question.

    You dislike case sensitivity because it lets you have two different
    identifiers written "int" and "INT".  That is a fair point, and a
    clear disadvantage of case sensitivity.

    But if you have a case insensitive language, it lets you write "int"
    and "INT" for the /same/ identifier, despite written differences.
    That is a clear disadvantage of case /insensitivity/.

    Only when identifiers are not supposed to mean anything, which is not
    how I want programs to be. So to me, in context of programming as an activity to communicate ideas written in programs, this disadvantage
    does not exist.

     But case sensitive makes accidental misuse far more likely to be
    caught by the compiler,

    Example please. Typing 'i' instead of 'I' is not a misuse to me.

    If I accidentally type "I" instead of "i", a C compiler will catch the
    error.  "for (int i = 0; I < 10; i++) ..."  It's an error in C.

    But this is no error to me, because there cannot be two different objects named i and I.

    Would you consider it good style to mix "i" and "I" in the same code, as
    the same identifier? I would not - even when the language allows it. I
    have done little Ada programming, but I used to do a lot of Pascal
    coding - I have never seen circumstances where I considered it to be an advantage to use different cases for the same identifier. On the
    contrary, I saw a lot of code that was harder to comprehend because of
    mixing cases.

    So to me, writing "i" one place and "I" a different place is an /error/
    in the code - even in Ada or Pascal. It is not an error that the
    compiler will spot (though other tools might do) or according to the
    language standards, but in my book, such bad style is an error.

    So let me ask you a direct question, and I hope you can give me a direct answer. If you were doing an Ada code review and the code had used an identifier in two places with two different capitalisations, would you
    let it pass or would you want it changed?

    Second question. Do you set up your extra tools (or IDE) to flag
    inconsistent case as a warning or error?


    If your answers here are "yes" - and frankly, I can't see how a serious professional would answer anything else - then you are agreeing that
    there are /no/ inherent advantages in a language being case-insensitive.
    Your only concern (and it's a valid concern) is avoiding the
    /disadvantage/ of case-sensitive languages in being open to allowing
    confusing names.



    Of course there are other options as well, which are arguably better
    than either of these.  One is to say you have to get the case right
    for consistency, but disallow identifiers that differ only in case.

    You could do that. You can even say that identifiers must be in
    italics and keywords in bold Arial and then apply all your arguments
    to font shapes, sizes, orientation etc. Why not?

    Sorry, I was only giving sensible suggestions.

    Why distinction of case is sensible and distinction of fonts is not?

    Barring a few niche (or outdated) languages that rely on specialised
    editors, languages should not be dependent on the appearance of the
    text. Syntax highlighting is useful for reading and editing, but not as
    a part of the syntax or grammar of the language.


    One of the reasons Ada did not do this (and many filesystems as well)
    is that one might wish to be able to convert names to some canonical
    form without changing the meaning. After all this is how the letter
    case appeared in European languages in the first place - to beautify
    written text.

    There is a very simple canonical form for ASCII text - leave it alone.

    No, regarding identifiers the alphabet is not ASCII, never was. At best
    you can say let identifiers be Latin letters plus some digits, maybe
    some binding signs. ASCII provides means to encode, in particular, Latin letters. Letters can be encoded in a great number of ways.

    Sure - identifiers use a subset of ASCII. (The subset varies a little
    from language to language.) And when languages allow characters beyond
    ASCII, they are a subset of Unicode. All other character encodings are obsolete.


    For Unicode, there is a standard normalisation procedure (converting
    combining diacriticals into single combination codes where applicable).

    Ada has its roots in a time when many programming languages were
    all-caps, at least for their keywords, and significant computer
    systems were still using punched cards, 6-bit character sizes, and
    other limitations.  If you wanted a language that could be used widely
    (and that was one of Ada's aim) without special requirements, you had
    to accept that some people would be using all-caps.  At the same time,
    it was clear by then that all-caps was ugly and people preferred to
    use small letters when possible.  The obvious solution was to make the
    language case-insensitive, like many other languages of that time
    (such as Pascal, which was a big influence for Ada).  It was a
    /practical/ decision, not made because someone thought being
    case-insensitive made the language inherently better.

    Ada 83 style used bold lower case letters for keywords and upper case letters for identifiers.

    Legislating against stupidity or malice is /very/ difficult.

    There is no malice, it is quite common practice to do things like:

        void Boo::Foo (Object * object) {
           int This = this->idx;

    etc.

    As has been said, again and again, writing something like "Object
    object" is a common idiom and entirely clear to anyone experienced as
    a C or C++ programmer.

    "it should be entirely clear for anyone..." is no argument.

    A programmer for a language should be expected to understand the
    fundamentals of the language and common idioms. I did not say "it
    should be entirely clear for [sic] anyone" - I said "it should be
    entirely clear to anyone experienced as a C or C++ programmer". If you
    have never touched C (or other languages with similar idioms), then I
    would not expect you to be comfortable with "Point point;" no matter how
    many decades experience you have with other languages. But if you have programmed C or C++ for a few years, I would expect you to be completely comfortable in reading and understanding it, even if you do not use that
    idiom yourself.


    I can't remember ever seeing a capitalised keyword used as an
    identifier - it is /far/ from common practice.  It counts as
    stupidity, not malice.

    Why is using properly spelt words stupidity? (:-))
       Legislating against accidents and inconsistency is
    easier, and a better choice.

    Why exactly

        int INT;

    must be legal?

    If a language can make such things illegal, great - but /not/ at the
    cost of making "int a; INT b; inT c = A + B;" legal.

    I don't see any cost here, because int, INT, inT is the same word to me.
    They are so visually distinct that there is a higher cognitive cost in
    reading them - that makes them bad, even when you know they mean the
    same thing.

    Come on, they are not visually distinct, just open any book and observe capital letters at the beginning of every sentence!


    As I have said before - programming is not "everyday language".

    If there is any cost, it is that of keeping in mind artificially introduced differences.


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Fri Nov 25 16:02:59 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 15:46, Dmitry A. Kazakov wrote:
    On 2022-11-25 14:46, David Brown wrote:
    On 25/11/2022 11:41, Dmitry A. Kazakov wrote:
    On 2022-11-25 11:12, James Harris wrote:
    On 24/11/2022 22:00, Dmitry A. Kazakov wrote:
    Yet you must rely on them in order to prevent:

        int INT;

    No, the compiler could detect it.

    How? Without additional rules (see above) this is perfectly legal in
    a case-sensitive language.

    I think it is quite clear (with the restored context) that James meant
    a compiler could detect the undesirable "int INT;" pattern, without
    the need of additional tools or smart editors.  This is, of course,
    entirely true.

    No, it cannot without additional rules, which, BTW, case-sensitive
    languages do not have. [A Wikipedia listed one? (:-))]

    You snipped the context again.

    You may also have missed the fact that James is working on his own
    language design. He makes the rules, he decides what goes in the
    compiler for /his/ language.


    We do not argue about such rules. We do about case-sensitivity.


    All you have done so far is argue /against/ one type of unclear code
    that can be written in existing case-sensitive languages. It would only
    be a problem when done maliciously or in smart-arse programming -
    accidents would generally be caught by the compiler. (Note that people
    can write code with clear intent and function by making use of case sensitivity - no matter how much Bart whinges and whines about it.)

    I've seen /nothing/ from you or anyone else that suggests any benefit in
    being case-insensitive in itself.

    James appears to be considering a suggestion that avoids the
    disadvantages of case-sensitivity, and also the disadvantages of case-insensitivity - though it also disallows advantageous use of cases.
    (You can't have everything - there are always trade-offs.) To me,
    that is certainly worth considering - this is not a strictly
    black-or-white issue.

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From James Harris@james.harris.1@gmail.com to comp.lang.misc on Fri Nov 25 15:05:36 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 14:46, Dmitry A. Kazakov wrote:
    On 2022-11-25 14:46, David Brown wrote:
    On 25/11/2022 11:41, Dmitry A. Kazakov wrote:
    On 2022-11-25 11:12, James Harris wrote:
    On 24/11/2022 22:00, Dmitry A. Kazakov wrote:
    Yet you must rely on them in order to prevent:

        int INT;

    No, the compiler could detect it.

    How? Without additional rules (see above) this is perfectly legal in
    a case-sensitive language.

    I think it is quite clear (with the restored context) that James meant
    a compiler could detect the undesirable "int INT;" pattern, without
    the need of additional tools or smart editors.  This is, of course,
    entirely true.

    No, it cannot without additional rules, which, BTW, case-sensitive
    languages do not have. [A Wikipedia listed one? (:-))]

    David is right: I have repeatedly said that a compiler can object to
    names which differ only in capitalisation. That may be an "additional
    rule" to what /you/ have in mind but it is not an additional rule to
    what we have been discussing. :-)
    --
    James Harris


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Fri Nov 25 16:07:34 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 13:40, Bart wrote:
    On 25/11/2022 10:31, James Harris wrote:
    On 25/11/2022 08:18, David Brown wrote:
    On 24/11/2022 22:35, Dmitry A. Kazakov wrote:

    ...

    Why exactly

        int INT;

    must be legal?

    If a language can make such things illegal, great - but /not/ at the
    cost of making "int a; INT b; inT c = A + B;" legal.

    Well put! It's that kind of mess which makes me dislike the idea of a
    language ignoring case.

    But, that never happens! If it does, you can change it to how you like;
    the program still works; that is the advantage.

    C allows that line to be written like this:


    Will you /please/ give it a rest? Whenever you can't think of something useful to write, you always go off on some rant about how it is possible
    to write something in C that you don't like. I've been trying to
    avoid replying to you so much, because it just makes me annoyed and
    write unpleasantly. But sometimes your obsession with hating C is
    borderline psychotic.

    (And if you had anything interesting to say, it is lost in the noise I snipped.)


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Fri Nov 25 16:28:22 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 10:48, Dmitry A. Kazakov wrote:
    On 2022-11-25 09:43, David Brown wrote:
    On 24/11/2022 22:58, Dmitry A. Kazakov wrote:

    I meant some fancy language where no declarations needed. But OK,
    take this:

        int MyVal = a;
        int myVal = MyVal + b;

    How do you know?

    It is unavoidable in any language, with any rules, that people will be
    able to write confusing code, or that people will be able to make
    mistakes that compilers and tools can't catch.  No matter how smart
    you make the language or the tools, that will /always/ be possible.

    Still, the above is illegal in Ada and legal in C.

    Yes. So what? It's good to try to stop accidents. It is foolish to
    try to stop intentionally confusing code.


    Purely statistically your argument makes no sense. Since the set of
    unique identifiers in a case-insensitive language is an order of
    magnitude narrower, any probability of mess/error etc is also less
    under equivalent conditions.

    "Purely statistically" you are talking drivel and comparing one
    countably infinite set with a different countably infinite set.

    The probability theory deals with infinite sets. Sets must be
    measurable, not countable.

    But the set of identifiers is of course countable, since no human and no
    FSM can deploy infinite identifiers.

    No, the set of identifiers in most languages is countably infinite - few languages impose specific limits on the length of identifiers (which
    would make the set finite) - and none allow infinite length identifiers
    (which would make the set uncountably infinite). Changing the size of
    the set of distinguishable letters does not change the size of the
    identifier space. And if you really wanted to go there, I'd like to
    point out that Ada identifiers can use Unicode letters and thus have a
    vastly bigger choice of letters than many case-sensitive programming languages. But this is all side tracking - there's no need to go
    through the mathematics of countable sets here just to prove that you
    made a silly pretend-mathematical claim.


    There are some programmers who appear to pick identifiers by letting
    their cat walk at random over the keyboard.  Most don't.  Programmers
    mostly pick the same identifiers regardless of case sensitivity, and
    mostly pick identifiers that differ in more than just case.  Barring
    abusive programmers, the key exception is idioms such as "Point point"
    where "Point" is a type and "point" is an object of that type.

    It is a bad idiom.

    That is a matter of familiarity and personal opinion, not fact.

    Spoken languages use articles and other grammatical
    means to disambiguate classes and instances. A programming language
    may also have different name spaces for different categories of entities (hello, first-class types, functions etc (:-)). Writing "Point point" specifically in C++ is laziness, stupidity and abuse.

    The only reason to have case-sensitive identifiers is for having
    homographs = for producing mess.

    No, it /avoids/ mess.  And case insensitivity does not avoid
    homographs - HellO and He110 are homographs in simple fonts, despite
    being different identifiers regardless of case sensitivity.  "Int" and
    "int" are not homographs in any font.  "Ρο" and "Po" are homographs,
    regardless of case sensitivity, despite being completely different
    Unicode identifiers (the first uses Greek letters).

    ("Homograph" means they look much the same, but are actually different
    - not that the cases are different.)

    Cannot avoid some homographs, let's introduce more?


    No - but you don't introduce more ways of writing confusing code unless
    there are significant benefits outweighing the costs. That's the
    decision Ada made when it added Unicode, despite having /vastly/ more opportunities to write confusingly similar but programmatically distinct identifiers. Certainly the possible accidental mixups due to case
    sensitivity are a drop in the ocean in comparison.

    The key benefit of case sensitivity is disallowing inconsistent cases,
    rather than because it allows identifiers that differ in case.

    How "point" is disallowed by being different from "Point"?


    Yes - if this is unintentional. The most important feature of a case sensitive language is that you don't get some people writing "Point" and others writing "point" - two letter sequences that look different and
    /are/ different - and referring to the same thing.

    A second feature - one that some people like, and some people do not -
    is that you can use such different letter sequences to refer to
    different things.

    The strange thing about case-insensitive languages is that it means
    different letter sequences sometimes refer to the same thing, which can
    be confusing.

    This is not rocket science.

    Some people know how to use tools properly.

    These people don't buy them and thus do not count... (:-))


    I don't follow.  I make a point of learning how to use my tools as
    best I can, whether they are commercial paid-for tools or zero cost
    price.

    But if you mean that the programmers who could most benefit from good
    tools to check style and code quality are precisely the ones that
    don't use them, I agree.  Usually they don't even have to buy them or
    acquire them - they already have tools they could use, but don't use
    them properly.

    If I were making a compiler, all its warnings would be on by default,
    and you'd have to use the flag "-old-bad-code" to disable them.

    Ideally, you should not need a tool if your primary instrument (the language) works well.


    Usually a language needs a compiler or interpreter to be useful...
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Fri Nov 25 16:30:03 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 12:55, James Harris wrote:
    On 24/11/2022 22:51, Bart wrote:

    Say short, Short and SHORT out loud; any difference?

    Yes, they get louder. ;-)

    They also get harder to read. I read the third one as "SHOUT" at first.
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From James Harris@james.harris.1@gmail.com to comp.lang.misc on Fri Nov 25 15:32:30 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 11:50, Bart wrote:
    On 25/11/2022 09:52, David Brown wrote:
    On 25/11/2022 10:13, Dmitry A. Kazakov wrote:

    And how exactly would case-sensitiveness not let them? So far the
    outcome:

    case-insensitive: illegal
    case-sensitive:   OK

    You misunderstood my question.

    You dislike case sensitivity because it lets you have two different
    identifiers written "int" and "INT".  That is a fair point, and a
    clear disadvantage of case sensitivity.

    But this happens in real code. For example `enum (INT, FLOAT, DOUBLE)`,
    plus of course `Image image`.

    Have to say that those names are so short they look alright!
    Importantly, such short names are easily distinguishable. *If* (to
    emphasise, *if*) there's a well-known convention and they follow that convention then adherent capitalisation can help with code clarity.
    Contrast how the code would look if there was no capitalisation standard
    and the names had to be distinguished some other way:

    int_const
    float_const
    double_const
    image_type
    image



    But if you have a case insensitive language, it lets you write "int"
    and "INT" for the /same/ identifier, despite written differences.
    That is a clear disadvantage of case /insensitivity/.

    This could happen in real code, but it very rarely does.

    So one is a real disadvantage, the other only a perceived one. Here's another issue:

     zfail
     zFar
     zNear
     zpass

    First thought: Why is the programmer using two different naming
    conventions in the same piece of code? Can his pay be docked this week? ;-)


    These don't clash. But there are two patterns here: a small z followed by either a capitalised word or a non-capitalised one. How do you remember which
    is which? With case-sensitive, you /have/ to get it right.

    Indeed. Having to get it right is an advantage, not the converse!!! If
    the compiler tells you that name uses don't match declarations then you
    can fix it. As you say, you /have/ to get it right. Good!

    We are not in the days of batch compiles which might take a few days to
    come back and report a trivial error. A compiler can tell us immediately
    if names have been written inconsistently. And we can fix them.


    With case-insensitive, if these identifiers were foisted on you, you can choose to use more consistent capitalisation.

    Case ignoring is telling the compiler: "Look, it doesn't matter how
    names are capitalised; anything goes; please accept ANY capitalisation
    of names as all the same." Why would one want to say that? (Rhetorical)
    I just don't get it.

    I can see why someone would object to myVar and MyVar being different variables (and that's a valid point) but not why someone would
    deliberately tell a compiler to accept any combination of case when the programmer is perfectly capable of writing names consistently.

    Why not, instead, just get the compiler to report any uses which differ
    from the definitions? I seriously don't see why a programmer would
    object to that.

    ...

    It's not an error, so no harm done. At some point it will be noticed
    that one of those has the wrong case, and it will be fixed.

    I note you say such things need to be 'fixed'. It's almost as though you
    see something wrong with them! ;-)

    ...

    But I think it is generally understood that case-sensitivity is bad for ordinary users.

    Different audience. Different goalposts!
    --
    James Harris


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Fri Nov 25 16:49:35 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 13:08, Bart wrote:
    On 25/11/2022 09:24, David Brown wrote:
    On 24/11/2022 22:47, James Harris wrote:

    1. Forward declarations should not be needed.


    Usually not, for functions.  But sometimes you will need them for
    mutually recursive functions,

    Not even then. Modern languages seem to deal with out-of-order
    functions without needing special declarations.

    That usually depends on whether you have early compile-time binding and
    name lookup, or late run-time binding. For an interpreted language that
    does not attempt to find the recursive function before it is actually
    needed at run-time, out-of-order definitions without declarations are
    fine. (So it works in Python.) For compiled languages with static
    binding, it's usually a different matter. (I don't suggest it is
    impossible to avoid the declarations, but declarations can make the
    language easier.)
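
    (A minimal C sketch of that case: with mutually recursive functions, one
    of them must call the other before its definition has been seen, so at
    least one forward declaration is unavoidable in a statically bound
    language like C.)

    #include <stdio.h>

    static int is_even(unsigned n);   /* forward declaration */

    static int is_odd(unsigned n)
    {
        return n == 0 ? 0 : is_even(n - 1);
    }

    static int is_even(unsigned n)
    {
        return n == 0 ? 1 : is_odd(n - 1);
    }

    int main(void)
    {
        printf("%d\n", is_odd(7));   /* prints 1 */
        return 0;
    }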


    and I think it makes sense to have some kind of module interface
    definition with a list of declared functions (and other entities).  In
    other words, a function should not be exported from a module just by
    writing "export" at the definition.

    Why not?

    Because then you need to look in the middle of the implementation code
    to see the interface for the module.


      You should have an interface section with the declarations (like
    Pascal), or a separate interface file (like Modula-2).

    Then you have the same information repeated in two places.

    Yes. This is rarely a big cost, and is often extremely useful - it
    means you can separate writing the interface from writing the
    implementation.


    If you need a summary of the interface without exposing the
    implementation, this can be done automatically by a compiler, which can
    be done for those functions marked with 'export'.


    Having the compiler generate human-readable interface summaries is a possibility. If it is combined with formalised documentation strings
    that are part of the language, it might be workable.

    (In my languages, which use whole-program compilers, such an exports
    file is only needed to export names from the whole program, when it
    forms a complete library.

    Modular programming means different modules (usually different files,
    but possibly parts of files, collections of files, directories, etc.)
    have specific entities that they export through an interface.
    Non-exported entities are local, and can share names with local entities
    in other modules without conflict. It is totally irrelevant whether the program is combined with a whole-program compiler, a linker, a run-time interpreter, or anything else.
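
    (As a concrete illustration, using C's header/source convention as a
    stand-in for a module interface - the file and function names here are
    invented:)

    /* counter.h - the module interface: only the exported entities. */
    #ifndef COUNTER_H
    #define COUNTER_H
    void counter_reset(void);
    int  counter_next(void);
    #endif

    /* counter.c - the implementation; 'count' is local to the module
       and cannot clash with a 'count' in any other module. */
    #include "counter.h"
    static int count;
    void counter_reset(void) { count = 0; }
    int  counter_next(void)  { return ++count; }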


    Plus I need to create such a file to create bindings in my language to
    FFI libraries. But there I don't have the sources of those libraries)

    2. Parameter names should be part of the interface.

    I agree - though not everyone does, so there are choices here too.

    Since in my languages you only ever specify the function header in one
    place - where it's defined - parameter names are mandatory. And there is only ever one set.


    Certainly you can be consistent if you only write them once!

    Being part of the interface can also mean more. Do you allow keyword parameters in your language?


      Some
    people like to write a declaration such as :

         void rectangle(int top, int left, int width, int height);

    and then the definition :

         void rectangle(t, l, w, h) { ... }

    That's how my systems language worked for 20 years. I find it
    astonishing now that I tolerated it for so long.

    Well, actually not quite: the declaration listed only types; the
    definition only names. Names in the declaration had no use.


    In C, names in the declaration are in effect just documentation - they
    are optional, and they don't have to match the names in the definition.
    But they are very useful documentation! (Of course I strongly prefer
    to have them consistent, and some tools like clang-tidy can check this.)

    Types in the definition were not needed in K&R C, I think (I never
    really used C before C90). I can't imagine any reason you would /not/
    want the types clearly visible at the function definition.
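
    (For illustration, a small sketch of that old split, reusing the
    rectangle example above: the prototype carries the types, while a
    K&R-style definition lists only the names, with the types supplied in a
    separate declaration list - omitting that list made each parameter
    default to int in C89. The style was accepted up to C17 and removed in
    C23.)

    /* Prototype: parameter names here are optional documentation. */
    void rectangle(int top, int left, int width, int height);

    /* Old-style (K&R) definition: names in the parameter list,
       types in the declaration list that follows it. */
    void rectangle(t, l, w, h)
        int t, l, w, h;
    {
        (void)t; (void)l; (void)w; (void)h;   /* body omitted */
    }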

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Fri Nov 25 16:50:57 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-25 13:13, James Harris wrote:
    On 25/11/2022 10:41, Dmitry A. Kazakov wrote:
    On 2022-11-25 11:12, James Harris wrote:
    On 24/11/2022 22:00, Dmitry A. Kazakov wrote:
    On 2022-11-24 22:50, James Harris wrote:
    On 24/11/2022 21:35, Dmitry A. Kazakov wrote:
    On 2022-11-24 19:55, James Harris wrote:

    All of the above are examples of poor code - from Int, INT, int to MyVal, myVal, myval.

    Which you want to make legal, right?

    No.

    ...

    That is evidently wrong. Why exactly

        int INT;

    must be legal?

    I didn't say it should be.

    But it is. q.e.d.

    Not necessarily. As I said before, within a scope names which vary
    only by case could be prohibited.

    You can introduce such rules, but so could a case-insensitive language
    as well. The rule as such is agnostic to the choice.

    No, I'm arguing for consistency - and consistency that can be enforced
    by the compiler. The thing I dislike is the inconsistency allowed by
    case insensitivity.

    I fail to see any inconsistency. int = INT. That is consistent. If you
    want to introduce the rule of same spelling, I do not object.

    Yet you must rely on them in order to prevent:

        int INT;

    No, the compiler could detect it.

    How? Without additional rules (see above) this is perfectly legal in a
    case-sensitive language.

    As I say, names which fold to the same string can be detected and
    prohibited by the compiler.

    Yes and that is fully orthogonal to the question of case-sensitivity.
    Note that the rule is not enforceable in the presence of late binding and separate compilation, so the issue remains.
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From James Harris@james.harris.1@gmail.com to comp.lang.misc on Fri Nov 25 16:03:31 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 15:02, David Brown wrote:

    ... (discussion of case sensitivity in a language)

    James appears to be considering a suggestion that avoids the
    disadvantages of case-sensitivity, and also the disadvantages of case-insensitivity - though it also disallows advantageous use of cases.
     (You can't have everything - there are always trade-offs.)  To me,
    that is certainly worth considering - this is not a strictly
    black-or-white issue.

    That's a good summary.

    Further, I've seen your comments on linked areas such as different
    namespaces and parameter names - which are also as yet unresolved but
    related issues. It may be a few days before I can reply but I do intend
    to get back to them.
    --
    James Harris

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Fri Nov 25 16:09:53 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 15:07, David Brown wrote:
    On 25/11/2022 13:40, Bart wrote:

    C allows that line to be written like this:


    Will you /please/ give it a rest?  Whenever you can't think of something useful to write, you always go off on some rant about how it is possible
    to write something in C that you don't like.  I've been trying to
    avoid replying to you so much, because it just makes me annoyed and
    write unpleasantly.  But sometimes your obsession with hating C is borderline psychotic.

    (And if you had anything interesting to say, it is lost in the noise I snipped.)


    Yes, you missed the bit where I say that this example that was
    complained of:

    int a; INT b; inT c = A + B;

    could be made legal in C.

    My other examples are simply those that certain language rules make it possible to write, but no one ever does.

    Exactly the same as a rule that makes letter case redundant unless it is marked as significant.

    The one about the function pointer though was a joke.
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Fri Nov 25 17:11:50 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-25 15:52, David Brown wrote:
    On 25/11/2022 11:37, Dmitry A. Kazakov wrote:
    On 2022-11-25 10:52, David Brown wrote:
    On 25/11/2022 10:13, Dmitry A. Kazakov wrote:

    Example please. Typing 'i' instead of 'I' is not a misuse to me.

    If I accidentally type "I" instead of "i", a C compiler will catch
    the error.  "for (int i = 0; I < 10; i++) ..."  It's an error in C.

    But this is no error to me, because there cannot be two different
    objects named i and I.

    Would you consider it good style to mix "i" and "I" in the same code, as
    the same identifier?

    I consider lack of indentation before "for" bad style. Yet I find Python
    ideas about making it syntax horrific.

    have done little Ada programming, but I used to do a lot of Pascal
    coding - I have never seen circumstances where I considered it to be an advantage to use different cases for the same identifier.

    How does case-sensitivity force different cases? It is some imaginary
    problem. Why does the mouse not have a foot detector in order to prevent
    me using my feet with it? Just do not! (:-))

    On the
    contrary, I saw a lot of code that was harder to comprehend because of mixing cases.

    Good, why let them influence the program semantics?

    So let me ask you a direct question, and I hope you can give me a direct answer.  If you were doing an Ada code review and the code had used an identifier in two places with two different capitalisations, would you
    let it pass or would you want it changed?

    Yes, I would want to change it according to the actual guidelines.

    Second question.  Do you set up your extra tools (or IDE) to flag inconsistent case as a warning or error?

    No. I always write identifiers correctly, except when I borrow them from
    an alien programming language or standard. Then I make some compromises; I
    could well name a type LPARAM in the Windows API.

     Your only concern (and it's a valid concern) is avoiding the
    /disadvantage/ of case-sensitive languages in being open to allowing confusing names.

    Yes and introducing problems later. E.g. it is a minefield working with
    Linux file system, which is case-sensitive. In Ada modules must be named
    after the files and it is a great help that I need not to care inside
    the module. But outside I must be very careful. I have all lowercase
    rule for the source files because a project perfectly built under
    Windows might miserably fail under Linux.

    Of course there are other options as well, which are arguably
    better than either of these.  One is to say you have to get the
    case right for consistency, but disallow identifiers that differ
    only in case.

    You could do that. You can even say that identifiers must be in
    italics and keywords in bold Arial and then apply all your arguments
    to font shapes, sizes, orientation etc. Why not?

    Sorry, I was only giving sensible suggestions.

    Why distinction of case is sensible and distinction of fonts is not?

    Barring a few niche (or outdated) languages that rely on specialised
    editors, languages should not be dependent on the appearance of the
    text.  Syntax highlighting is useful for reading and editing, but not as
    a part of the syntax or grammar of the language.

    So is the case!

    One of the reasons Ada did not do this and many filesystems as well,
    because one might wish to be able to convert names to some canonical
    form without changing the meaning. After all this is how the letter
    case appeared in European languages in the first place - to beautify
    written text.

    There is a very simple canonical form for ASCII text - leave it alone.

    No, regarding identifiers the alphabet is not ASCII, never was. At
    best you can say let identifiers be Latin letters plus some digits,
    maybe some binding signs. ASCII provides means to encode, in
    particular, Latin letters. Letters can be encoded in a great number of
    ways.

    Sure - identifiers use a subset of ASCII.  (The subset varies a little
    from language to language.)  And when languages allow characters beyond ASCII, they are a subset of Unicode.  All other character encodings are obsolete.

    The point is that "letter" is a borrowed term. It is not ASCII or
    Unicode. There is no good reason to use letters differently from their original meaning. Surely you can have some conventions about style, yet
    i and I are different spellings of the same letter.

    Legislating against stupidity or malice is /very/ difficult.

    There is no malice, it is quite common practice to do things like:

        void Boo::Foo (Object * object) {
           int This = this->idx;

    etc.

    As has been said, again and again, writing something like "Object
    object" is a common idiom and entirely clear to anyone experienced as
    a C or C++ programmer.

    "it should be entirely clear for anyone..." is no argument.

    A programmer for a language should be expected to understand the fundamentals of the language and common idioms.  I did not say "it
    should be entirely clear for [sic] anyone" - I said "it should be
    entirely clear to anyone experienced as a C or C++ programmer".

    It is a bad idiom, like picking the nose at the table... (:-))

    If you
    have never touched C (or other languages with similar idioms), then I
    would not expect you to be comfortable with "Point point;" no matter how many decades experience you have with other languages.  But if you have programmed C or C++ for a few years, I would expect you to be completely comfortable in reading and understanding it, even if you do not use that idiom yourself.

    Have mercy! I painfully restrain myself from elaborating the
    nose-picking allegory... (:-))
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Fri Nov 25 17:39:47 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 17:03, James Harris wrote:
    On 25/11/2022 15:02, David Brown wrote:

    ... (discussion of case sensitivity in a language)

    James appears to be considering a suggestion that avoids the
    disadvantages of case-sensitivity, and also the disadvantages of
    case-insensitivity - though it also disallows advantageous use of
    cases.   (You can't have everything - there are always trade-offs.)
    To me, that is certainly worth considering - this is not a strictly
    black-or-white issue.

    That's a good summary.

    Further, I've seen your comments on linked areas such as different namespaces and parameter names - which are also as yet unresolved but related issues. It may be a few days before I can reply but I do intend
    to get back to them.


    Take your time. And maybe start them on new threads (as you have been
    doing for some topics) - threads in this newsgroup have a tendency to
    wander a bit!



    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Fri Nov 25 17:52:11 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 17:11, Dmitry A. Kazakov wrote:
    On 2022-11-25 15:52, David Brown wrote:
    On 25/11/2022 11:37, Dmitry A. Kazakov wrote:
    On 2022-11-25 10:52, David Brown wrote:
    On 25/11/2022 10:13, Dmitry A. Kazakov wrote:

    Example please. Typing 'i' instead of 'I' is not a misuse to me.

    If I accidentally type "I" instead of "i", a C compiler will catch
    the error.  "for (int i = 0; I < 10; i++) ..."  It's an error in C.
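
    As a minimal sketch of that point (the surrounding function is
    illustrative, not from the post), a conforming C compiler rejects the
    mistyped loop because I has never been declared:

        /* case_typo.c - will NOT compile; that is the point */
        int sum_to_ten(void)
        {
            int total = 0;
            for (int i = 0; I < 10; i++)   /* error: 'I' undeclared */
                total += i;
            return total;
        }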

    But this is no error to me, because there cannot be two different
    objects named i and I.

    Would you consider it good style to mix "i" and "I" in the same code,
    as the same identifier?

    I consider lack of indentation before "for" bad style. Yet I find Python's idea of making it part of the syntax horrific.


    Have you considered a career in politics? You do a fine job of avoiding questions :-)

    I have done little Ada programming, but I used to do a lot of Pascal
    coding - I have never seen circumstances where I considered it to be
    an advantage to use different cases for the same identifier.

    How does case-sensitivity force different cases? It is an imaginary
    problem. Why does the mouse not have a foot detector in order to prevent
    me from using my feet with it? Just do not! (:-))

    On the contrary, I saw a lot of code that was harder to comprehend
    because of mixing cases.

    Good, why let them influence the program semantics?

    So let me ask you a direct question, and I hope you can give me a
    direct answer.  If you were doing an Ada code review and the code had
    used an identifier in two places with two different capitalisations,
    would you let it pass or would you want it changed?

    Yes, I would want it changed, according to the applicable guidelines.

    Second question.  Do you set up your extra tools (or IDE) to flag
    inconsistent case as a warning or error?

    No. I always write identifiers correctly, except when I borrow them from
    an alien programming language or standard. Then I make some compromises; for example, I could well keep the type name LPARAM from the Windows API.


    I can believe that you are always consistent. So am I, when using
    Windows file names, or programming in Pascal. The problem is other
    people, and I'd rather their inconsistencies were errors than glossed over.

    (Maybe Bart has the perfect solution with his language - if no one else
    uses the language, you never have to deal with other people's bad code!
    :-) )

     Your only concern (and it's a valid concern) is avoiding the
    /disadvantage/ of case-sensitive languages in being open to allowing
    confusing names.

    Yes, and introducing problems later. E.g. it is a minefield working with the Linux file system, which is case-sensitive.

    The only issues I have ever had with case sensitivity in Linux were
    caused by Windows users being inconsistent in their naming due to the
    lax case-insensitive OS - such as giving their header files one name and
    then using a different name (i.e., different capitalisation) for their #include directives.

    In Ada, modules must be named
    after their files, and it is a great help that I need not care about case
    inside the module. But outside I must be very careful. I have an all-lowercase
    rule for the source files because a project that builds perfectly under
    Windows might fail miserably under Linux.

    Keep the cases consistent and there is no problem.

    Of course there are other options as well, which are arguably
    better than either of these.  One is to say you have to get the
    case right for consistency, but disallow identifiers that differ
    only in case.

    You could do that. You can even say that identifiers must be in
    italics and keywords in bold Arial and then apply all your
    arguments to font shapes, sizes, orientation etc. Why not?

    Sorry, I was only giving sensible suggestions.

    Why is distinction of case sensible and distinction of fonts not?

    Barring a few niche (or outdated) languages that rely on specialised
    editors, languages should not be dependent on the appearance of the
    text.  Syntax highlighting is useful for reading and editing, but not
    as a part of the syntax or grammar of the language.

    So is the case!

    One of the reasons Ada did not do this (and many filesystems as
    well) is that one might wish to be able to convert names to some
    canonical form without changing the meaning. After all, this is how
    letter case appeared in European languages in the first place - to
    beautify written text.

    There is a very simple canonical form for ASCII text - leave it alone.

    No, regarding identifiers the alphabet is not ASCII, never was. At
    best you can say let identifiers be Latin letters plus some digits,
    maybe some binding signs. ASCII provides means to encode, in
    particular, Latin letters. Letters can be encoded in a great number
    of ways.

    Sure - identifiers use a subset of ASCII.  (The subset varies a little
    from language to language.)  And when languages allow characters
    beyond ASCII, they are a subset of Unicode.  All other character
    encodings are obsolete.

    The point is that "letter" is a borrowed term. It is not ASCII or
    Unicode. There is no good reason to use letters differently from their original meaning. Surely you can have some conventions about style, yet
    i and I are different spellings of the same letter.


    On a computer, they are different. In ASCII, they are different. In
    fonts, they are different.

    Legislating against stupidity or malice is /very/ difficult.

    There is no malice, it is quite common practice to do things like:

        void Boo::Foo (Object * object) {
           int This = this->idx;

    etc.

    As has been said, again and again, writing something like "Object
    object" is a common idiom and entirely clear to anyone experienced
    as a C or C++ programmer.

    "it should be entirely clear for anyone..." is no argument.

    A programmer for a language should be expected to understand the
    fundamentals of the language and common idioms.  I did not say "it
    should be entirely clear for [sic] anyone" - I said "it should be
    entirely clear to anyone experienced as a C or C++ programmer".

    It is a bad idiom, like picking the nose at the table... (:-))

    If you have never touched C (or other languages with similar idioms),
    then I would not expect you to be comfortable with "Point point;" no
    matter how many decades experience you have with other languages.  But
    if you have programmed C or C++ for a few years, I would expect you to
    be completely comfortable in reading and understanding it, even if you
    do not use that idiom yourself.

    Have mercy! I painfully restrain myself from elaborating the
    nose-picking allegory... (:-))


    Maybe we should just leave things here and agree to disagree on the
    whole topic? It might not be healthy to dig deeper into some analogies!



    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Fri Nov 25 17:58:26 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-25 16:28, David Brown wrote:
    On 25/11/2022 10:48, Dmitry A. Kazakov wrote:
    On 2022-11-25 09:43, David Brown wrote:
    On 24/11/2022 22:58, Dmitry A. Kazakov wrote:

    I meant some fancy language where no declarations needed. But OK,
    take this:

        int MyVal = a;
        int myVal = MyVal + b;

    How do you know?
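
    To make that example concrete, a minimal compilable sketch (the
    surrounding function is an assumption added for illustration): in C
    the two declarations are simply two distinct objects and compile
    without any diagnostic by default, whereas a case-insensitive
    language would treat the second as an illegal redeclaration of the
    first.

        int sum_pair(int a, int b)
        {
            int MyVal = a;
            int myVal = MyVal + b;   /* a distinct identifier in C */
            return myVal;
        }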

    It is unavoidable in any language, with any rules, that people will
    be able to write confusing code, or that people will be able to make
    mistakes that compilers and tools can't catch.  No matter how smart
    you make the language or the tools, that will /always/ be possible.

    Still, the above is illegal in Ada and legal in C.

    Yes.  So what?  It's good to try to stop accidents.

    The accident is stopped in a case-insensitive language.

    It is foolish to try to stop intentionally confusing code.

    Like code distinguishing i and I?

    Purely statistically your argument makes no sense. Since the set of
    unique identifiers in a case-insensitive language is narrower by an
    order of magnitude, any probability of mess/error etc. is also lower
    under equivalent conditions.

    "Purely statistically" you are talking drivel and comparing one
    countably infinite set with a different countably infinite set.

    Probability theory deals with infinite sets. Sets must be
    measurable, not countable.

    But the set of identifiers is of course countable, since no human and
    no FSM can deploy infinite identifiers.

    No, the set of identifiers in most languages is countably infinite - few languages impose specific limits on the length of identifiers (which
    would make the set finite) - and none allow infinite length identifiers (which would make the set uncountably infinite).

    Finite-length identifiers comprise a finite set.

    Changing the size of
    the set of distinguishable letters does not change the size of the identifier space.

    It does. The set is measurable even if infinite. That is because the probabilities of the letters depend on their position in the identifier. Shorter identifiers are far more likely to be typed. Infinitely long
    ones have vanishingly small probabilities.

    The measure of the set of case-insensitive identifiers is way less than
    the measure of the set of case-sensitive ones.
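
    (As a concrete count, setting the measure-theoretic argument aside:
    over an alphabet of k usable letters there are k^n identifiers of
    exactly n letters, so moving from 26 case-insensitive letters to 52
    case-sensitive ones multiplies the number of possible length-n
    identifiers by 2^n; with no bound on n, both totals remain countably
    infinite.)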

    And if you really wanted to go there, I'd like to
    point out that Ada identifiers can use Unicode letters and thus have a vastly bigger choice of letters than many case-sensitive programming languages.

    I don't argue that full Unicode is a bad idea. We must consider
    comparable cases. If James wants to limit it to ASCII, the point stands.

    We cannot avoid some homographs, so let's introduce more?

    No - but you don't introduce more ways of writing confusing code unless there are significant benefits outweighing the costs.

    Tell me about the advantages of having i and I as distinct names.

    That's the
    decision Ada made when it added Unicode, despite having /vastly/ more opportunities to write confusingly similar but programmatically distinct identifiers.  Certainly the possible accidental mixups due to case sensitivity are a drop in the ocean in comparison.

    See above, I am against Unicode.

    However, Unicode does not contribute much here. The idea behind it was to
    allow programmers to choose identifiers in their native languages. The probability of them mixing things up is the same as for an English
    speaker choosing English identifiers. Imaginary cases where somebody
    would use е (Cyrillic) for e (Latin) in Hello fall into the category of "accidents" you reject anyway.

    The key benefit of case sensitivity is disallowing inconsistent
    cases, rather than because it allows identifiers that differ in case.

    How "point" is disallowed by being different from "Point"?

    Yes - if this is unintentional.  The most important feature of a case sensitive language is that you don't get some people writing "Point" and others writing "point" - two letter sequences that look different and
    /are/ different - and referring to the same thing.

    A second feature - one that some people like, and some people do not -
    is that you can use such different letter sequences to refer to
    different things.

    The strange thing about case-insensitive languages is that different letter sequences sometimes refer to the same thing, which can
    be confusing.

    This is not rocket science.

    It is, point = Point.

    Ideally, you should not need a tool if your primary instrument (the
    language) works well.

    Usually a language needs a compiler or interpreter to be useful...

    Which is why a compiler is not a tool. lint is a tool; gcc is not.
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Fri Nov 25 18:05:45 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-25 17:52, David Brown wrote:

    Keep the cases consistent and there is no problem.

    Not possible in a world infested with case-sensitive stuff. File names leak.

    Maybe we should just leave things here and agree to disagree on the
    whole topic?  It might not be healthy to dig deeper into some analogies!

    Yes, the simplest issues are usually the most controversial ones!
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Fri Nov 25 18:06:38 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-25 16:02, David Brown wrote:
    On 25/11/2022 15:46, Dmitry A. Kazakov wrote:
    On 2022-11-25 14:46, David Brown wrote:
    On 25/11/2022 11:41, Dmitry A. Kazakov wrote:
    On 2022-11-25 11:12, James Harris wrote:
    On 24/11/2022 22:00, Dmitry A. Kazakov wrote:
    Yet you must rely on them in order to prevent:

        int INT;

    No, the compiler could detect it.

    How? Without additional rules (see above) this is perfectly legal in
    a case-sensitive language.

    I think it is quite clear (with the restored context) that James
    meant a compiler could detect the undesirable "int INT;" pattern,
    without the need of additional tools or smart editors.  This is, of
    course, entirely true.
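
    A sketch of one such check, in C for concreteness (the function name
    and the exact rule are assumptions, not anything James has specified):
    when a new identifier is declared, flag it if it matches a keyword or
    an existing name case-insensitively but not case-sensitively, so
    "INT" would be reported against the keyword "int".

        #include <ctype.h>
        #include <string.h>

        /* Return 1 if a and b are the same except for letter case. */
        static int differs_only_in_case(const char *a, const char *b)
        {
            if (strcmp(a, b) == 0)
                return 0;                          /* identical names */
            for (; *a && *b; a++, b++)
                if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
                    return 0;                      /* genuinely different */
            return *a == *b;                       /* 1 only if same length */
        }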

    No, it cannot without additional rules, which, BTW, case-sensitive
    languages do not have. [A Wikipedia listed one? (:-))]

    You snipped the context again.

    You may also have missed the fact that James is working on his own
    language design.  He makes the rules, he decides what goes in the
    compiler for /his/ language.


    We are not arguing about such rules. We are arguing about case-sensitivity.

    All you have done so far is argue /against/ one type of unclear code
    that can be written in existing case-sensitive languages.  It would only
    be a problem when done maliciously or in smart-arse programming -
    accidents would generally be caught by the compiler.  (Note that people
    can write code with clear intent and function by making use of case sensitivity - no matter how much Bart whinges and whines about it.)

    I've seen /nothing/ from you or anyone else that suggests any benefit in being case-insensitive in itself.

    OK, but you too provide nothing but accidents. When confronted with counterexamples, you say, ah, but I could add some additional rule! That
    is an admission of a problem.

    James appears to be considering a suggestion that avoids the
    disadvantages of case-sensitivity, and also the disadvantages of case-insensitivity - though it also disallows advantageous use of cases.

    I fail to see cases where such a rule might be a disadvantage in a case-insensitive language. For a case-sensitive language some cases
    could probably be constructed with namespaces:

    X.z
    Y.Z

    Is Z the same identifier here? etc.

     (You can't have everything - there are always trade-offs.)  To me,
    that is certainly worth considering - this is not a strictly
    black-or-white issue.

    I have nothing against the rule.
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Fri Nov 25 17:49:35 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 14:20, James Harris wrote:
    On 25/11/2022 12:40, Bart wrote:
    On 25/11/2022 10:31, James Harris wrote:

    But, that never happens!

    Eh???? You say it never happens when something very similar does happen ... and
    then you go on to give your own counterexamples that are so outré they really never happen! Que????

    Which examples are those?

    I wouldn't deliberately write code like this:

    int a; INT b; inT c = A + B;

    unless it was some temporary debug or test code and I hadn't paid
    attention to the Caps Lock status.

    I take advantage of case-insensitivity in these situations:

    * For informality for throwaway bits of code: I can inadvertently mix up
    case, but the code still works, so who cares because it will disappear a minute later

    * To specifically add debugging statements in capitals

    * To sometimes write function headers in capitals (I did this decades
    ago to make it easier to find function definitions with a text editor)

    * To allow imported mixed-case names to be written in my own choice of capitalisation (usually all-lower-case)

    * To sometimes highlight, for various reasons, specific identifiers or expressions; perhaps to indicate something that is temporary or needs attention or for a 'TO DO' bit of code (see the LXERROR calls at my link)

    * Simply being allowed not to care; it's just an extra bit of redundancy
    in a language, which is always good

    That the feature has a side-effect of allowing deranged people to write programs with random and inconsistent capitalisation within identifiers
    is not something I'm going to lose sleep over.

    Those same people can wreak havoc in any language (especially C, but
    David Brown says I can't use that language as an example).

    Here's a one-file version of my current systems compiler:

    https://raw.githubusercontent.com/sal55/langs/master/sources/mm.ma

    It's nearly all lower case; see if you can find any inconsistencies of
    case, or even much in upper case at all. (Note comments have been
    removed from this.)

    (The SHOW macro, in capitals, was for debugging purposes, when it would
    also be invoked as capitals, but there are no uncommented instances of
    it here.)

    All sorts of nonsense can be written legally, some of it more dangerous
    than being lax about letter case.

    Of course but as said that's the same whether case is recognised or not.
    A determined programmer can /always/ write garbage and a language cannot prevent him from doing so.

    Of course. But case insensitivity has benefits to me as listed above,
    and it doesn't have the disadvantages of case-/sensitivity/ that I
    listed elsewhere.

    The difference is the case where the programmer doesn't mean to be
    inconsistent but simply makes a mistake and inadvertently writes the
    same identifier in different cases.

    The language decides whether that matters or not. Or chooses to make an
    option of whether it matters. If you want it to matter in all cases,
    then just make the syntax 100% case-sensitive no matter what.

    (If DB will allow me, here are a few rare instances of even C being case-insensitive:

    0xABC or 0xabc or 0XAbC
    123ULL or 123ull
    123e4 or 123E4
    u"ABC" or U"ABC"
    #include <STDIO.H> # on Windows


    Does it matter that different programmers will use different styles here,
    or the same programmer mixes styles in one program?)
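
    For what it's worth, the numeric cases really are interchangeable; a
    small self-contained check (illustrative only, not from the original
    post) compiles and runs silently:

        #include <assert.h>

        int main(void)
        {
            assert(0xABC == 0xabc && 0xabc == 0XAbC);   /* hex digits and prefix */
            assert(123ULL == 123ull);                   /* integer suffix */
            assert(123e4 == 123E4);                     /* exponent marker */
            return 0;
        }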



    A compiler can pick that up so that
    identifier names can be written the same each time they are used, making code more consistent and the intent of any capitalisation clearer.
    What's not to like...?!

    I don't like CamelCase, especially halfCamelCase or
    half_heartedCamelcase. With case-sensitivity I'm forced to memorise and
    to write the same crappy choices.

    ...

    I'll add one more thing. If case is to be ignored (your preference) then there is nothing to stop different programmers adopting different non-standard conventions for how they personally capitalise parts of
    names, making code even harder to maintain.

    You mean, easier? If it's a problem then use a formatting tool that can
    take care of it.


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Andy Walker@anw@cuboid.co.uk to comp.lang.misc on Sat Nov 26 00:43:24 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 11:55, James Harris wrote:
    You brought up sqlite3.c but if it is over 100,000 lines of code in
    one file (Andy would not care much for it...) I'm not sure that it's
    a valid example for anything!

    Andy would point out that that's ~2000 pages of code; about 3x the size of the draft C23 standard. It is therefore a valid example of code
    that contains serious errors even if no-one knows what they are. I would strongly advise not using it anywhere near any potentially fatal activity,
    such as running a nuclear power station, flying a plane, delivering doses
    of radiotherapy, .... You Have Been Warned.

    [Slightly relevant, I note that Fred "Mythical Man-Month" Brooks
    died a few days ago, aged 91. He had much to say about large projects,
    having managed the development of the IBM System/360. His biography on
    Wiki notes that he considered his biggest decision "was to change the IBM
    360 series from a 6-bit byte to an 8-bit byte, thereby enabling the use
    of lowercase letters", relevant to other recent threads here. But the "thereby" isn't accurate; IBM could have followed Ferranti and other
    computer manufacturers in using 6-bit Flexowriter codes. At a stroke,
    all modern computers and discs would have been 33% bigger!]
    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Handel
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Sat Nov 26 13:59:27 2022
    From Newsgroup: comp.lang.misc

    On 26/11/2022 00:43, Andy Walker wrote:
    On 25/11/2022 11:55, James Harris wrote:
    You brought up sqlite3.c but if it is over 100,000 lines of code in
    one file (Andy would not care much for it...) I'm not sure that it's
    a valid example for anything!

        Andy would point out that that's ~2000 pages of code;  about 3x the size of the draft C23 standard.  It is therefore a valid example of code that contains serious errors even if no-one knows what they are.  I would strongly advise not using it anywhere near any potentially fatal activity, such as running a nuclear power station, flying a plane, delivering doses
    of radiotherapy, ....  You Have Been Warned.

    Why does it matter whether it is in one file or not?

    Anyway, sqlite3.c is actually an amalgamation of 100 or so separate
    source files, intended to be simpler to embed into applications.

    The total size is some 0.22Mloc (and that is bloated because it is 40% comments).

    However OSes can be tens of millions of lines of code. A quick google
    tells me that a Boeing 787 uses 6.5 million lines of avionics code.

    Given a rough rule-of-thumb that 10 bytes of (x64) binary equates to 1
    line of systems-level code, you can estimate line-counts for the various binaries on your machine.
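
    (On that rule of thumb, a 1 MB executable would be very roughly
    100,000 lines of such code, and a 5 MB one roughly 500,000 lines.)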


    about 3x the size of the draft C23 standard.

    Not a good benchmark; not many people would have a clue about the C23 standard!

    sqlite3.c is roughly 2-3 times the size of the Holy Bible.

    Meanwhile, and this is a figure I read about 20 years ago so doubtless
    has increased since, the source code for MS Visual Studio comprised 1.5 million /files/ (not lines!), and a full build took 60 hours.


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Sat Nov 26 20:37:17 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 15:32, James Harris wrote:
    On 25/11/2022 11:50, Bart wrote:

    These don't clash. But there are two patterns here: small z followed
    by either a capitalised word or a non-capitalised one. How do you remember
    which is which? With case-sensitivity, you /have/ to get it right.

    Indeed. Having to get it right is an advantage, not the converse!!!

    Not when you artificially add a difficulty which has to be 'got right'
    for no reason.

    Getting the spelling right is one thing, but the author's foibles in
    their mix of letter case?

    If
    the compiler tells you that name uses don't match declarations then you
    can fix it. As you say, you /have/ to get it right. Good!

    How much time will be wasted getting it wrong before you get it right?


    We are not in the days of batch compiles which might take a few days to
    come back and report a trivial error. A compiler can tell us immediately
    if names have been written inconsistently. And we can fix them.

    When writing Algol68 I spent 50% of my time fixing semicolons because of
    its poor rules. When I write C, it might be 10% fixing semicolons. With
    my syntax, it's 0%.

    I don't want to spend /any/ time in complying with some capitalisation
    style because in my view it is pointless.

    And code which /relies/ on exact case to distinguish one identifier from another, with the same scope, should be taken out and shot.



    With case-insensitive, if these identifiers were foisted on you, you
    can choose to use more consistent capitalisation.

    Case ignoring is telling the compiler: "Look, it doesn't matter how
    names are capitalised; anything goes; please accept ANY capitalisation
    of names as all the same." Why would one want to say that? (Rhetorical)
    I just don't get it.

    OK, I just don't like mixed case in source code. Case-sensitivity seems
    to encourage such code, but with case-insensitivity you can just ignore it.

    Below is a C program (not mine) which I had to quickly port today to my scripting language; the port is shown below the C version (you can't mistake them!).

    As you can see, the first is very much mixed case. But two things came
    up during porting that had not been obvious:

    * Two global identifiers `SDL_Quit` and `SDL_QUIT` clash when case is
    ignored

    * Two global identifiers `W` (a width) and `w` (a handle to a window
    object) clash when case is ignored.

    I won't remark on these choices.

    In the C, those upper-case macros, which are simple constants, get undue prominence (a lot of shouting is going on).

    In the other, this was my choice of capitalisation. Here I didn't think
    much of functions, types and constants all starting with `SDL_`.


    ---------------- C Version --------------------------

    #include <SDL.h>

    #define W 1024
    #define H 768

    static int uniform(uint64_t *rng, int limit)
    {
        *rng = *rng*0x3243f6a8885a308d + 1;
        return ((*rng>>32) * limit) >> 32;
    }

    int main(int argc, char **argv)
    {
        (void)argc;
        (void)argv;

        if (SDL_Init(SDL_INIT_VIDEO)) {
            SDL_Log("SDL_Init(): %s", SDL_GetError());
            return 1;
        }

        SDL_Window *w = SDL_CreateWindow(
            "Example",
            SDL_WINDOWPOS_CENTERED, SDL_WINDOWPOS_CENTERED,
            W, H, 0
        );

        if (!w) {
            SDL_Log("SDL_CreateWindow(): %s", SDL_GetError());
            return 1;
        }

        SDL_Renderer *r = SDL_CreateRenderer(w, -1, SDL_RENDERER_PRESENTVSYNC);
        if (!r) {
            SDL_Log("SDL_CreateRenderer(): %s", SDL_GetError());
            return 1;
        }

        for (uint64_t rng = 1;;) {
            SDL_Event e;
            while (SDL_PollEvent(&e)) {
                if (e.type == SDL_QUIT) {
                    SDL_Quit();
                    return 0;
                }
            }

            int color = uniform(&rng, 0x1000000);
            SDL_SetRenderDrawColor(r, color>>16, color>>8, color, 255);

            SDL_Rect rect; // {int x,y,w,h}
            rect.x = uniform(&rng, W);
            rect.y = uniform(&rng, H);
            rect.w = uniform(&rng, W-rect.x);
            rect.h = uniform(&rng, H-rect.y);
            SDL_RenderFillRect(r, &rect);

            SDL_RenderPresent(r);
        }
    }


    --------- Non-C Port ---------------------

    const sdl_init_video = 0x20
    const sdl_centered = 0x2FFF0000
    const sdl_renderer_presentvsync = 4
    const sdl_quitx = 0x100

    const w = 1024, h = 768

    importdll sdl2 =
        func "SDL_Init" (u32)i32
        func "SDL_CreateWindow" (stringz, i32, i32, i32, i32, u32)ref void
        func "SDL_CreateRenderer" (ref void, i32, u32)ref void
        func "SDL_PollEvent" (ref void)i32
        func "SDL_SetRenderDrawColor" (ref void, i32, i32, i32, i32)i32
        func "SDL_RenderFillRect" (ref void, ref void)i32
        proc "SDL_RenderPresent" (ref void)
        proc "SDL_Quit"
    end

    type sdl_event = struct
        u32 etype
        [64]byte dummy
    end

    type sdl_rect = struct
        i32 x, y, w, h
    end

    fun randz(n) = random(0..n-1)

    proc main=
        if sdl_init(sdl_init_video) then
            abort("SDL error")
        fi

        wnd := sdl_createwindow("Example", sdl_centered, sdl_centered,
                                w, h, 0)
        if not wnd then
            abort("SDL window error")
        fi

        r := sdl_createrenderer(wnd, -1, sdl_renderer_presentvsync)
        if not r then
            abort("SDL Render error")
        fi

        do
            e:=new(sdl_event)
            while sdl_pollevent(&e) do
                if e.etype=sdl_quitx then
                    sdl_quit()
                    stop
                fi
            od

            sdl_setrenderdrawcolor(r, randz(256), randz(256), randz(256), 255)

            rect := new(sdl_rect)
            rect.x := randz(w)
            rect.y := randz(h)
            rect.w := randz(w-rect.x)
            rect.h := randz(h-rect.y)

            sdl_renderfillrect(r, &rect)
            sdl_renderpresent(r)
        od
    end





    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Sun Nov 27 16:05:24 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 17:58, Dmitry A. Kazakov wrote:
    On 2022-11-25 16:28, David Brown wrote:
    On 25/11/2022 10:48, Dmitry A. Kazakov wrote:


    (I promised not to dig further into your mistakes about sets of
    identifiers. It was not easy, but I have resisted the temptation!)

    Ideally, you should not need a tool if your primary instrument (the
    language) works well.

    Usually a language needs a compiler or interpreter to be useful...

    Which is why a compiler is not a tool. lint is a tool; gcc is not.


    Sometimes I wonder if the terms you use are completely different from
    everyone else's terms.

    Your written English is excellent (though your grammar gets a bit "lazy" sometimes). But perhaps you have learned some terms in a different
    language originally, and they don't correspond quite to the common uses
    in English.

    Yes, a compiler is a tool. A "toolchain" for a language generally
    consists of the compiler, assembler, librarian and linker (though
    sometimes these are combined) - everything taking you from source code
    to executable. All the parts of it are "tools", as are optional extras
    such as linters, debuggers, build tools, etc. Every program that helps
    you do your development work is a "development tool". Your IDE is a
    tool, so is your git client. It is a very general term.


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Dmitry A. Kazakov@mailbox@dmitry-kazakov.de to comp.lang.misc on Sun Nov 27 17:22:10 2022
    From Newsgroup: comp.lang.misc

    On 2022-11-27 16:05, David Brown wrote:

    Yes, a compiler is a tool.  A "toolchain" for a language generally
    consists of the compiler, assembler, librarian and linker (though
    sometimes these are combined) - everything taking you from source code
    to executable.  All the parts of it are "tools", as are optional extras such as linters, debuggers, build tools, etc.  Every program that helps
    you do your development work is a "development tool".  Your IDE is a
    tool, so is your git client.  It is a very general term.

    Yes, essential vs auxiliary. E.g. a violin is an "instrument." Rosin and
    the music stand are "tools."
    --
    Regards,
    Dmitry A. Kazakov
    http://www.dmitry-kazakov.de

    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From David Brown@david.brown@hesbynett.no to comp.lang.misc on Sun Nov 27 18:04:27 2022
    From Newsgroup: comp.lang.misc

    On 27/11/2022 17:22, Dmitry A. Kazakov wrote:
    On 2022-11-27 16:05, David Brown wrote:

    Yes, a compiler is a tool.  A "toolchain" for a language generally
    consists of the compiler, assembler, librarian and linker (though
    sometimes these are combined) - everything taking you from source code
    to executable.  All the parts of it are "tools", as are optional
    extras such as linters, debuggers, build tools, etc.  Every program
    that helps you do your development work is a "development tool".  Your
    IDE is a tool, so is your git client.  It is a very general term.

    Yes, essential vs auxiliary. E.g. violin is an "instrument." Rosin and
    the music stand are "tools."


    A compiler may be your primary tool, but it is still a tool. Calling it something else, or insisting that it is not a tool, is silly and confusing.


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Andy Walker@anw@cuboid.co.uk to comp.lang.misc on Sun Nov 27 17:31:08 2022
    From Newsgroup: comp.lang.misc

    On 26/11/2022 13:59, Bart wrote:
    [James:]
    You brought up sqlite3.c but if it is over 100,000 lines of code in
    one file (Andy would not care much for it...) I'm not sure that it's
    a valid example for anything!
         Andy would point out that that's ~2000 pages of code;  about 3x the
    size of the draft C23 standard.  It is therefore a valid example of code
    that contains serious errors even if no-one knows what they are. [...]
    Why does it matter whether it is in one file or not?

    It doesn't. I merely claim that it is for all practical purposes impossible to write that much code without making mistakes. In the
    particular case of SQLite, I regularly have to install bug fixes, and
    would be amazed if there are no more to come.
    [...]
    However OSes can be tens of millions of lines of code.

    So, show me a bug-free OS. Or even one still in significant use
    that has not needed a bug/security update in the past year or so.

    A quick google
    tells me that a Boeing 787 uses 6.5 million lines of avionics code.

    So a Boeing 787 would have been an even better example of code
    that beyond reasonable doubt includes potentially-fatal bugs. Just one
    more reason not to fly and not to live under a flight path.

    about 3x the size of the draft C23 standard.
    Not a good benchmark; not many people would have a clue about the C23 standard!

    People /here/ ought to have such a clue.
    sqlite3.c is roughly 2-3 times the size of the Holy Bible.

    "I rest my case, m'lud."

    Meanwhile, and this is a figure I read about 20 years ago so
    doubtless has increased since, the source code for MS Visual Studio
    comprised 1.5 million /files/ (not lines!), and a full build took 60
    hours.

    If you regard MS Visual Studio as bug free, then I'd suggest
    that you are very much in the minority, certainly here.
    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Hummel
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Sun Nov 27 18:36:46 2022
    From Newsgroup: comp.lang.misc

    On 27/11/2022 17:31, Andy Walker wrote:
    On 26/11/2022 13:59, Bart wrote:
    [James:]
    You brought up sqlite3.c but if it is over 100,000 lines of code in
    one file (Andy would not care much for it...) I'm not sure that it's
    a valid example for anything!
         Andy would point out that that's ~2000 pages of code;  about 3x the
    size of the draft C23 standard.  It is therefore a valid example of code that contains serious errors even if no-one knows what they are. [...]
    Why does it matter whether it is in one file or not?

        It doesn't.  I merely claim that it is for all practical purposes impossible to write that much code without making mistakes.  In the particular case of SQLite, I regularly have to install bug fixes, and
    would be amazed if there are no more to come.

    Bugs are a fact of life. Applications should make allowance for that.
    Also users, by saving their work, creating backups etc.


    [...]
    However OSes can be tens of millions of lines of code.

        So, show me a bug-free OS.  Or even one still in significant use that has not needed a bug/security update in the past year or so.

    So how much program code should exist on a PC with 8GB RAM, and perhaps
    1000GB disk space? (With access to limitless code downloadable from the internet.)

    Perhaps 1MB which is 100,000 lines of code? That means that only
    1/80000th of the RAM and 1/1000000th of the disk will be utilised by
    programs; what on earth do you use the rest for?

    That is completely unrealistic these days.

                                 A quick google
    tells me that a Boeing 787 uses 6.5 million lines of avionics code.

        So a Boeing 787 would have been an even better example of code
    that beyond reasonable doubt includes potentially-fatal bugs.  Just one
    more reason not to fly and not to live under a flight path.

    I'm sure such products are well-tested, and have lots of redundancies.
    Just don't fly in a 737 MAX where the issue was a poor design wrongly compensated for in software.

    about 3x the size of the draft C23 standard.
    Not a good benchmark; not many people would have a clue about the C23
    standard!

        People /here/ ought to have such a clue.
    sqlite3.c is roughly 2-3 times the size of the Holy Bible.

        "I rest my case, m'lud."

    Meanwhile, and this is a figure I read about 20 years ago so
    doubtless has increased since, the source code for MS Visual Studio
    comprised 1.5 million /files/ (not lines!), and a full build took 60
    hours.

        If you regard MS Visual Studio as bug free, then I'd suggest
    that you are very much in the minority, certainly here.

    It won't be bug free. But it's an example of a program very much larger
    than even sqlite; lots of people like it and use it.

    But then, my own programs are 1/5th the size of sqlite, and they are not
    bug free. They just need to be useful.


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Andy Walker@anw@cuboid.co.uk to comp.lang.misc on Mon Nov 28 16:50:39 2022
    From Newsgroup: comp.lang.misc

    On 27/11/2022 18:36, Bart wrote:
    Bugs are a fact of life. Applications should make allowance for that.
    Also users, by saving their work, creating backups etc.

    Why am I not in the least surprised by your cavalier attitude
    towards bugs? The depressing thing is not that bugs happen, but that
    so many of them happen for reasons that /should/ have been detected
    much earlier. The great majority are things like index out of bounds, uninitialised variables, dereference of null pointer, pointer used
    after free, overflow, .... Programs, especially during development,
    should stop under control when these things happen, not crash [or
    execute malware] some time later when the initial cause has long been forgotten. But it seems to be more important to shave milliseconds
    off compilation/run times than to build in checks. Half a century
    ago, many of the checks were done by hardware; that technology seems
    to have been forgotten, presumably in the interests of saving expense.

    So how much program code should exist on a PC with 8GB RAM, and
    perhaps 1000GB disk space? (With access to limitless code
    downloadable from the internet.)
    Perhaps 1MB which is 100,000 lines of code? That means that only
    1/80000th of the RAM and 1/1000000th of the disk will be utilised by programs; what on earth do you use the rest for?

    Well, that's actually 1/8000 of the RAM and if you have the
    source available perhaps 1/300000 of the disc per program. My PC has
    some 30000 programs installed, the vast majority dragged in by things
    I neither need nor use [but it's too much like hard work to sort out
    exactly what /is/ needed]; go figure. My PC has a lot of data on it
    as well as programs, something like 30% of the disc space; most of
    the rest is taken up by daily and weekly dumps. But you're right that
    the amount of RAM, CPU power and disc is overkill, and I could manage
    with a much smaller machine; but I went for a certain amount of future-proofing, and a decade later it's still pretty near the top of
    the range.
    --
    Andy Walker, Nottingham.
    Andy's music pages: www.cuboid.me.uk/andy/Music
    Composer of the day: www.cuboid.me.uk/andy/Music/Composers/Mayer
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From James Harris@james.harris.1@gmail.com to comp.lang.misc on Mon Nov 28 23:04:14 2022
    From Newsgroup: comp.lang.misc

    On 25/11/2022 17:49, Bart wrote:
    On 25/11/2022 14:20, James Harris wrote:
    On 25/11/2022 12:40, Bart wrote:

    ...

    But, that never happens!

    Eh???? You say it never happens when something very similar does happen ... and
    then you go on to give your own counterexamples that are so outré they
    really never happen! Que????

    Which examples are those?

    I left at least one in but you snipped it!


    I wouldn't deliberately write code like this:

        int a; INT b; inT c = A + B;

    unless it was some temporary debug or test code and I hadn't paid
    attention to the Caps Lock status.

    Understood for temporary or test code but how do you know you've not
    done the same in important code?


    I take advantage of case-insensitivity in these situations:

    * For informality for throwaway bits of code: I can inadvertently mix up case, but the code still works, so who cares because it will disappear a minute later

    * To specifically add debugging statements in capitals

    * To sometimes write function headers in capitals (I did this decades
    ago to make it easier to find function definitions with a text editor)

    * To allow imported mixed-case names to be written in my own choice of capitalisation (usually all-lower-case)

    * To sometimes highlight, for various reasons, specific identifiers or expressions; perhaps to indicate something that is temporary or needs attention or for a 'TO DO' bit of code (see the LXERROR calls at my link)

    * Simply being allowed not to care; it's just an extra bit of redundancy
    in a language, which is always good

    That illustrates what I said below about stropping. Does your compiler
    check that you used capitalisation according to the above scheme? I
    guess not. It's your choice if you want to do that to your own code but
    isn't it possible that you may not even have kept to your own scheme?

    Worse, consider collaborating with other programmers and trying to
    coordinate such capitalisation rules!

    ...

    I'll add one more thing. If case is to be ignored (your preference)
    then there is nothing to stop different programmers adopting different
    non-standard conventions for how they personally capitalise parts of
    names, making code even harder to maintain.

    You mean, easier? If it's a problem then use a formatting tool that can
    take care of it.

    I mean code maintenance would be harder because you'd have to get all maintainers to follow the same capitalisation rules, especially if
    adherence wasn't checked by the compiler.
    --
    James Harris


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From James Harris@james.harris.1@gmail.com to comp.lang.misc on Mon Nov 28 23:14:02 2022
    From Newsgroup: comp.lang.misc

    On 26/11/2022 00:43, Andy Walker wrote:

    ...

        [Slightly relevant, I note that Fred "Mythical Man-Month" Brooks died a few days ago, aged 91.  He had much to say about large projects, having managed the development of the IBM System/360.  His biography on
    Wiki notes that he considered his biggest decision "was to change the IBM
    360 series from a 6-bit byte to an 8-bit byte, thereby enabling the use
    of lowercase letters", relevant to other recent threads here.  But the "thereby" isn't accurate;  IBM could have followed Ferranti and other computer manufacturers in using 6-bit Flexowriter codes.

    Thanks for letting us know about Fred Brooks. I hadn't heard.
    --
    James Harris


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Tue Nov 29 00:05:26 2022
    From Newsgroup: comp.lang.misc

    On 28/11/2022 23:04, James Harris wrote:
    On 25/11/2022 17:49, Bart wrote:
    unless it was some temporary debug or test code and I hadn't paid
    attention to the Caps Lock status.

    Understood for temporary or test code but how do you know you've not
    done the same in important code?

    I will notice; I need to look at the screen sometime.



    That illustrates what I said below about stropping. Does your compiler
    check that you used capitalisation according to the above scheme?

    No, it largely doesn't care. Case is only significant here in string
    literals (sometimes used for imported function names), or in `Abc backtick
    names.

    I
    guess not. It's your choice if you want to do that to your own code but isn't it possible that you may not even have kept to your own scheme?

    Sometimes; but I gave a link to some of my sources; did you see any infringements there? Except that of course they wouldn't matter; just something to be tidied up at some point.

    I mean code maintenance would be harder because you'd have to get all maintainers to follow the same capitalisation rules, especially if
    adherence wasn't checked by the compiler.

    I've mentioned a few times other examples where programmers can impose
    their own styles, including white space usage, and the 4-5 cases where C
    is case-insensitive, leaving people free to make things inconsistent. For example:

    0xabcdef
    0XABCdef

    In languages with numeric separators, you might write 123_456 followed
    by 12_34_5_6 followed by 123456. It is a non-issue.

    In my syntax, letter case is usually redundant, which allows it to
    be utilised in various ways that don't affect the validity or behaviour
    of the program, as I've listed.

    That is a benefit, not a disadvantage. While case-sensitivity to me
    allows worse sins, like `int x, X`, which actually happens (I gave a
    real example).
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From James Harris@james.harris.1@gmail.com to comp.lang.misc on Tue Nov 29 09:26:38 2022
    From Newsgroup: comp.lang.misc

    On 29/11/2022 00:05, Bart wrote:
    On 28/11/2022 23:04, James Harris wrote:
    On 25/11/2022 17:49, Bart wrote:

    unless it was some temporary debug or test code and I hadn't paid
    attention to the Caps Lock status.

    Understood for temporary or test code but how do you know you've not
    done the same in important code?

    I will notice; I need to look at the screen sometime.

    I thought you said earlier that you don't always remember what
    capitalisation you had used for a particular identifier so I am not sure
    how you would notice visually all capitalisation discrepancies.

    ...

    I
    guess not. It's your choice if you want to do that to your own code but isn't it possible that you may not even have kept to your own scheme?

    Sometimes; but I gave a link to some of my sources; did you see any infringements there? Except that of course they wouldn't matter; just something to be tidied up at some point.

    I didn't look through your code for infringements of your standards. I'm
    a human with better things to do! :-) As I've said, any relevant or
    important checking is a job for the compiler.


    I mean code maintenance would be harder because you'd have to get all maintainers to follow the same capitalisation rules, especially if adherence wasn't checked by the compiler.

    I've mentioned a few times other examples where programmers can impose their own styles, including white space usage, and the 4-5 cases where C
    is case-insensitive, leaving people free to make things inconsistent. For example:

       0xabcdef
       0XABCdef

    Of the examples you gave I don't generally have the dubiety in my language:

    * ull suffix - I don't need anything like that
    * e or E for exponent - I don't use either
    * U or u - if I ever support Unicode directly then I'd mandate one of
    * #include = I don't use
    * 0x or 0X - 0x
    * letter case for hex digits - not decided yet


    In languages with numeric separators, you might write 123_456 followed
    by 12_34_5_6 followed by 123456. It is a non-issue.

    Rather than use such numbers in code wouldn't you give them names as in

    const n_digits = 10
    const buf_size = 10

    Then (1) the body of the program code can be clearer as to what it means
    and (2) either number can be changed without affecting the other.


    In my syntax, letter case is usually redundant, which allows it to
    be utilised in various ways that don't affect the validity or behaviour
    of the program, as I've listed.

    That is a benefit, not a disadvantage. While case-sensitivity to me
    allows worse sins, like `int x, X`, which actually happens (I gave a
    real  example).

    I don't expect us to agree but this has been an interesting exploration
    of the issues.
    --
    James Harris


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Tue Nov 29 11:25:27 2022
    From Newsgroup: comp.lang.misc

    On 29/11/2022 09:26, James Harris wrote:
    On 29/11/2022 00:05, Bart wrote:

    Of the examples you gave I don't generally have the dubiety in my language:

    * ull suffix - I don't need anything like that
    * e or E for exponent - I don't use either
    * U or u - if I ever support Unicode directly then I'd mandate one of
    * #include = I don't use
    * 0x or 0X - 0x
    * letter case for hex digits - not decided yet

    This was supposed to highlight that that kind of case variability exists
    in C, a language famous for popularising strict case sensitivity.

    In languages with numeric separators, you might write 123_456 followed
    by 12_34_5_6 followed by 123456. It is a non-issue.

    Rather than use such numbers in code wouldn't you give them names as in

      const n_digits = 10
      const buf_size = 10

    Then (1) the body of the program code can be clearer as to what it means
    and (2) either number can be changed without affecting the other.

    You're ignoring my point, which is that there is scope for variability
    without changing the meaning, since underscores (etc) are a redundant
    feature.

    Separators exist because integer constants exist. Even named constants
    could be defined as:

    const a = 10_000
    const b = 10000
    const c = 0010000 # not in C
    const d = 1e4 # in some languages
    const e = 0x2710
    const f = 9999+1

    (I assume you're picking on the fact that my three numbers were
    identical; I should have written different values.)


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From James Harris@james.harris.1@gmail.com to comp.lang.misc on Tue Nov 29 12:11:39 2022
    From Newsgroup: comp.lang.misc

    On 29/11/2022 11:25, Bart wrote:
    On 29/11/2022 09:26, James Harris wrote:
    On 29/11/2022 00:05, Bart wrote:

    Of the examples you gave I don't generally have the dubiety in my
    language:

    * ull suffix - I don't need anything like that
    * e or E for exponent - I don't use either
    * U or u - if I ever support Unicode directly then I'd mandate one of
    * #include = I don't use
    * 0x or 0X - 0x
    * letter case for hex digits - not decided yet

    This was supposed to highlight that that kind of case variability exists
    in C, a language famous for popularising strict case sensitivity.

    Sure, but a new language wouldn't be C and wouldn't have to have those problems. (Possibly the worst example in C is the trailing lower-case l.)


    In languages with numeric separators, you might write 123_456
    followed by 12_34_5_6 followed by 123456. It is a non-issue.

    Rather than use such numbers in code wouldn't you give them names as in

       const n_digits = 10
       const buf_size = 10

    Then (1) the body of the program code can be clearer as to what it
    means and (2) either number can be changed without affecting the other.

    You're ignoring my point, which is that there is scope for variability without changing the meaning, since underscores (etc) are a redundant feature.

    Oh, I agree. But that's formatting. You could say similar about
    redundant parentheses or about the following.

    if x {
        return 0
    }

    vs

    if x
        return 0


    Separators exist because integer constants exist. Even named constants
    could be defined as:

        const a = 10_000
        const b = 10000
        const c = 0010000       # not in C
        const d = 1e4           # in some languages
        const e = 0x2710
        const f = 9999+1

    (I assume you're picking on the fact that my three numbers were
    identical; I should have written different values.)

    OK. I would still point out that such numbers are typically expressed
    once so have nothing to match against. I'd assume that each of your 10ks
    was written to be most relevant for each of the constants. The digit
    grouping can be whatever is clearest for that specific usage. (I'd
    typically use 3-digit grouping for decimals and 4-digit grouping for
    others but I could do something else for binary when writing numbers
    which relate to bitfields.) In essence, each of your 10ks means
    something different - the size of a buffer, a limit of years, something
    to group digits, etc. So it's natural that they are all expressed in
    ways that relate to how they would be used.

    By contrast, each identifier may be used many times always relating to
    the same thing. That can lead to questions about the reason why it is
    written differently in some places compared with others. And it allows
    the creation of all sorts of informal, unchecked and inconsistent
    stropping conventions in a language in which case is ignored.

    But those are just my views. I don't expect you to change your mind!
    --
    James Harris


    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Bart@bc@freeuk.com to comp.lang.misc on Tue Nov 29 14:53:16 2022
    From Newsgroup: comp.lang.misc

    On 29/11/2022 12:11, James Harris wrote:
    On 29/11/2022 11:25, Bart wrote:

    Separators exist because integer constants exist. Even named constants
    could be defined as:

         const a = 10_000
         const b = 10000
         const c = 0010000       # not in C
         const d = 1e4           # in some languages
         const e = 0x2710
         const f = 9999+1

    (I assume you're picking on the fact that my three numbers were
    identical; I should have written different values.)

    OK. I would still point out that such numbers are typically expressed
    once so have nothing to match against. I'd assume that each of your 10ks
    was written to be most relevant for each of the constants. The digit grouping can be whatever is clearest for that specific usage. (I'd
    typically use 3-digit grouping for decimals and 4-digit grouping for
    others but I could do something else for binary when writing numbers
    which relate to bitfields.) In essence, each of your 10ks means
    something different - the size of a buffer, a limit of years, something
    to group digits, etc. So it's natural that they are all expressed in
    ways that relate to how they would be used.

    By contrast, each identifier may be used many times always relating to
    the same thing. That can lead to questions about the reason why it is written differently in some places compared with others.

    That's a fair point. But you will know that in such a language, they are
    all equivalent. Then you might want to know if there's a reason for that
    case selection, or whether it's just untidiness or carelessness.

    What I'm saying is that just because it can be done, just as you can
    write both 0 and 0x0 within the same expression, generally it isn't.
    There's little percentage in a language, or even a compiler, laying
    down rules for it.

    Possibly some additional tool, or some special option of a compiler,
    might be made to detect such things. Personally I wouldn't bother, since
    it will not affect the validity or behaviour or performance of a program.

    --- Synchronet 3.19c-Linux NewsLink 1.113