On 16/11/2022 16:50, David Brown wrote:
Yes, but for you, a "must-have" list for a programming language would be mainly "must be roughly like ancient style C in functionality, but with enough change in syntax and appearance so that no one will think it is C". If that's what you like, and what pays for your daily bread, then that's absolutely fine.
On 18/11/2022 07:12, David Brown wrote:
Yes, it is a lot like C. It has a number of changes, some that I think are good, some that I think are bad, but basically it is mostly like C.
The above remarks imply strongly that my systems language is a rip-off
of C.
On 19/11/2022 21:02, James Harris wrote:
On 19/11/2022 20:30, Bart wrote:
On 19/11/2022 20:17, James Harris wrote:
I try to keep my main influences to hardware and various assembly
languages I've used over the years. But even though we try not to be
influenced by C I don't think any of us can help it. Two reasons: C
became the base for so many languages which came after it, and C so
well fits the underlying machine.
I even suspect that the CPUs we use today are also as they are in
part due to C. It has been that influential.
Well, there's a lot of C code around that needs to be kept working.
Yes.
However, what aspects of today's processors do you think owe anything
to C?
Things like the 8-bit byte, 2's complement, and the lack of segmentation.
Really? C was pretty much the only language in the world that does not specify the size of a byte. (It doesn't even have a 'byte' type.)
And it's a language that, even now (until C23) DOESN'T stipulate that integers use two's complement.
As for segmentation, or lack of, that was very common across machines.
It is really nothing at all to do with C. (How would it have influenced
that anyway, given that C implementations were adept at dealing with any memory model?)
The progression from 8 to 16 to 32 to 64 bits and beyond has long
been on the cards, irrespective of languages.
Actually C is lagging behind since most implementations are stuck
with a 32-bit int type. Which means lots of software, for those
lazily using 'int' everywhere, will perpetuate the limitations of
that type.
C famously also doesn't like to pin down its types. It doesn't even
have a `byte` type, and its `char` type, apart from not having a
specified signedness, could have any width of 8 bits or more.
Pre C99 yes. But AIUI since C99 C has had very precise types such as
int64_t
I'm sure the byte type, its size and byte-addressability, was influenced more by IBM, such as with its 360 mainframes from the 1960s
BC (Before C). The first byte-addressed machine I used was a 360-clone.
In any case, I would dispute that C even now properly has fixed-width
types. First, you need to do this to enable them:
#include <stdint.h>
Otherwise it knows nothing about them. Second, if you look inside a
typical stdint.h file (this one is from gcc/TDM on Windows), you might
well see:
typedef signed char int8_t;
typedef unsigned char uint8_t;
Nothing here guarantees that int8_t will be an 8-bit type; these 'exact-width' types are defined on top of those loosely-defined types. They're an illusion.
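As an aside, a C11 compiler can check the assumption behind such a typedef at compile time. A minimal sketch (mine, not from either poster; it assumes nothing beyond <limits.h> and <stdint.h>):

/* Verify that the platform's "typedef signed char int8_t" really does
 * give an exact 8-bit type.  Requires C11 for _Static_assert. */
#include <limits.h>
#include <stdint.h>

_Static_assert(CHAR_BIT == 8, "char is not 8 bits on this platform");
_Static_assert(sizeof(int8_t) == 1, "int8_t does not occupy exactly one byte");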
On 19/11/2022 22:49, Bart wrote:
I even suspect that the CPUs we use today are also as they are in
part due to C. It has been that influential.
C is /massively/ influential to the general purpose CPUs we have today.
The prime requirement for almost any CPU design is that you should be able to use it efficiently for C.
Well, the great majority of
software is written in languages that, at their core, are similar to C
(in the sense that once the compiler front-end has finished with them,
you have variables, imperative functions, pointers, objects in memory,
etc., much like C).
Really? C was pretty much the only language in the world that does not
specify the size of a byte. (It doesn't even have a 'byte' type.)
8-bit byte and two's complement were, I think, inevitable regardless of
C.
It is really nothing at all to do with C. (How would it have
influenced that anyway, given that C implementations were adept at
dealing with any memory model?)
C implementations are /not/ good at dealing with non-linear memory,
But of course C was not the only influence on processor evolution.
#include <stdint.h>
Otherwise it knows nothing about them. Second, if you look inside a
typical stdint.h file (this one is from gcc/TDM on Windows), you might
well see:
typedef signed char int8_t;
typedef unsigned char uint8_t;
Nothing here guarantees that int8_t will be an 8-bit type; these
'exact-width' types are defined on top of those loosely-defined types.
They're an illusion.
Sorry, you are completely wrong here. Feel free to look it up in the C standards if you don't believe me.
However, any kind of guesses as to how processors would have looked
without C, and therefore what influence C /really/ had, are always going
to be speculative.
Two of the first machines I used were PDP10 and PDP11, developed by DEC
in the 1960s, both using linear memory spaces. While the former was word-based, the PDP11 was byte-addressable, just like the IBM 360 also
from the 1960s.
And not Assembly, or Fortran or any other language?
The REASON for segmented memory was because 16 bits and address spaces larger than 64K words didn't mix. When this was eventually fixed on
80386 for x86, that was able to use 32-bit registers.
More interesting however is what Unix would have looked like without C.
On 21/11/2022 17:56, David Brown wrote:
On 19/11/2022 22:49, Bart wrote:
I even suspect that the CPUs we use today are also as they are in part due to C. It has been that influential.
C is /massively/ influential to the general purpose CPUs we have today.
"Massively" influential? Why, how do you think CPUs would have ended up without C?
Two of the first machines I used were PDP10 and PDP11, developed by DEC
in the 1960s, both using linear memory spaces. While the former was word-based, the PDP11 was byte-addressable, just like the IBM 360 also
from the 1960s.
The early microprocessors I used (6800, Z80) also had a linear memory
space, at a time when it was unlikely C implementations existed for
them, or that people even thought that much about C outside of Unix.
The prime requirement for almost any CPU design is that you should
be able to use it efficiently for C.
And not Assembly, or Fortran or any other language?
Don't forget that at
the point it all began to change, mid-70s to mid-80s, C wasn't that
dominant. Any C implementations for microprocessors were incredibly slow
and produced indifferent code.
The OSes I used (for PDP10, PDP11, ICL 4/72, Z80) had no C involvement.
When x86 popularised segmented memory, EVERYBODY hated it, and EVERY
language had a problem with it.
The REASON for segmented memory was because 16 bits and address spaces larger than 64K words didn't mix. When this was eventually fixed on
80386 for x86, that was able to use 32-bit registers.
According to you, without C, we would have been using 64KB segments even with 32 bit registers, or we maybe would never have got to 32 bits at
all. What nonsense!
(I was designing paper CPUs with linear addressing long before then, probably like lots of people.)
Well, the great majority of software is written in languages that, at
their core, are similar to C (in the sense that once the compiler
front-end has finished with them, you have variables, imperative
functions, pointers, objects in memory, etc., much like C).
I wish people would just accept that C does not have and never has had a monopoly on lower level languages.
It's a shame that people now associate 'close-to-the-metal' programming
with a language where a function pointer type is written as
`void(*)(void)`, and that's in the simplest case.
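For readers who haven't met that syntax, a small illustration (the identifiers are invented for the example):

/* The bare C spelling of "pointer to a function taking no arguments and
 * returning nothing", and the typedef usually used to hide it. */
void (*handler)(void);              /* a variable holding such a pointer */
typedef void (*callback_fn)(void);  /* hypothetical alias for readability */
callback_fn cb = 0;                 /* same type as handler above */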
Really? C was pretty much the only language in the world that does
not specify the size of a byte. (It doesn't even have a 'byte' type.)
8-bit byte and two's complement were, I think, inevitable regardless
of C.
So were lots of things. It didn't take a clairvoyant to guess that the
next progression of 8 -> 16 was going to be 32 and then 64.
(The Z8000 came out in 1979. It was a 16-bit processor with a register
set that could be accessed as 8, 16, 32 or 64-bit chunks. Actually you
can also look at 68000 from that era, and the NatSemi 32032. I was an engineer at the time and very familiar with this stuff.
C didn't figure in that world at all as far as I was concerned.)
It is really nothing at all to do with C. (How would it have
influenced that anyway, given that C implementations were adept at
dealing with any memory model?)
C implementations are /not/ good at dealing with non-linear memory,
No longer likes it, as I said.
But of course C was not the only influence on processor evolution.
OK, you admit now it was not '/massive/'; good!
#include <stdint.h>
Otherwise it knows nothing about them. Second, if you look inside a
typical stdint.h file (this one is from gcc/TDM on Windows), you
might well see:
typedef signed char int8_t;
typedef unsigned char uint8_t;
Nothing here guarantees that int8_t will be an 8-bit type; these
'exact-width' types are defined on top of those loosely-defined
types. They're an illusion.
Sorry, you are completely wrong here. Feel free to look it up in the
C standards if you don't believe me.
The above typedefs are from a C compiler you may have heard of: 'gcc'.
Some may well use internal types such as `__int8`, but the above is the actual content of stdint.h, and makes `int8_t` a direct synonym for
`signed char`.
However, any kind of guesses as to how processors would have looked
without C, and therefore what influence C /really/ had, are always
going to be speculative.
Without C, another lower-level systems language would have dominated,
since such a language was necessary.
More interesting however is what Unix would have looked like without C.
On 2022-11-21 19:44, Bart wrote:
Two of the first machines I used were PDP10 and PDP11, developed by
DEC in the 1960s, both using linear memory spaces. While the former
was word-based, the PDP11 was byte-addressable, just like the IBM 360
also from the 1960s.
PDP-11 was not linear. The internal machine address was 24-bit. But the effective address in the program was 16-bit. The address space was 64K
for data and 64K for code mapped by the virtual memory manager. Some machines had a third 64K space.
And not Assembly, or Fortran or any other language?
Assembler is not portable.
FORTRAN had no pointers. Programmers
implemented memory management on top of an array
On 21/11/2022 20:20, Dmitry A. Kazakov wrote:
On 2022-11-21 19:44, Bart wrote:
Two of the first machines I used were PDP10 and PDP11, developed by
DEC in the 1960s, both using linear memory spaces. While the former
was word-based, the PDP11 was byte-addressable, just like the IBM 360
also from the 1960s.
PDP-11 was not linear. The internal machine address was 24-bit. But
the effective address in the program was 16-bit. The address space was
64K for data and 64K for code mapped by the virtual memory manager.
Some machines had a third 64K space.
My PDP11/34 probably didn't have that much memory. But if you couldn't access more than 64K per task (say for code or data, if treated
separately), then I would still call that linear from the task's point
of view.
FORTRAN had no pointers. Programmers implemented memory management on
top of an array
But those arrays work better in linear memory.
On 21/11/2022 19:44, Bart wrote:
Two of the first machines I used were PDP10 and PDP11, developed by
DEC in the 1960s, both using linear memory spaces. While the former
was word-based, the PDP11 was byte-addressable, just like the IBM 360
also from the 1960s.
C was developed originally for these processors, and was a major reason
for their long-term success.
C was designed with some existing processors in mind - I don't think
anyone is suggesting that features such as linear memory came about
solely because of C. But there was more variety of processor
architectures in the old days, while almost all we have now are
processors that are good for running C code.
The early microprocessors I used (6800, Z80) also had a linear memory
space, at a time when it was unlikely C implementations existed for
them, or that people even thought that much about C outside of Unix.
The prime requirement for almost any CPU design is that you should
be able to use it efficiently for C.
And not Assembly, or Fortran or any other language?
Not assembly, no - /very/ little code is now written in assembly.
FORTRAN efficiency used to be important for processor design, but not
for a very long time. (FORTRAN is near enough the same programming
model as C, however.)
Don't forget that at the point it all began to change, mid-70s to
mid-80s, C wasn't that dominant. Any C implementations for
microprocessors were incredibly slow and produced indifferent code.
The OSes I used (for PDP10, PDP11, ICL 4/72, Z80) had no C
involvement. When x86 popularised segmented memory, EVERYBODY hated it,
and EVERY language had a problem with it.
Yes - the choice of the 8086 for PC's was a huge mistake. It was purely economics - the IBM designers wanted a 68000 processor.
According to you, without C, we would have been using 64KB segments
even with 32 bit registers, or we maybe would never have got to 32
bits at all. What nonsense!
Eh, no. I did not say anything /remotely/ like that.
It does have, and has had for 40+ years, a /near/ monopoly on low-level languages. You can dislike C as much as you want, but you really cannot deny that!
But of course C was not the only influence on processor evolution.
OK, you admit now it was not '/massive/'; good!
Would you please stop making things up and pretending I said them?
C is /massively/ influential to the general purpose CPUs we have today.
More interesting however is what Unix would have looked like without C.
How do you think it would have looked?
On 21/11/2022 20:22, David Brown wrote:
On 21/11/2022 19:44, Bart wrote:
Two of the first machines I used were PDP10 and PDP11, developed by
DEC in the 1960s, both using linear memory spaces. While the former
was word-based, the PDP11 was byte-addressable, just like the IBM 360
also from the 1960s.
C was developed originally for these processors, and was a major
reason for their long-term success.
Of the PDP10 and IBM 360? Designed in the 1960s and discontinued in 1983
and 1979 respectively. C only came out in a first version in 1972.
The PDP11 was superseded around this time (either side of 1980) by the VAX-11, a 32-bit version, no doubt inspired by the C language, one that
was well known for not specifying the sizes of its types - it adapted to
the size of the hardware.
Do you really believe this stuff?
C was designed with some existing processors in mind - I don't think
anyone is suggesting that features such as linear memory came about
solely because of C. But there was more variety of processor
architectures in the old days, while almost all we have now are
processors that are good for running C code.
As I said, C is the language that adapts itself to the hardware, and in
fact still is the primary language now that can and does run on every odd-ball architecture.
Which is why it is an odd candidate for a language that was supposed to drive the evolution of hardware because of its requirements.
The early microprocessors I used (6800, Z80) also had a linear memory
space, at a time when it was unlikely C implementations existed for
them, or that people even thought that much about C outside of Unix.
The prime requirement for almost any CPU design is that you should be able to use it efficiently for C.
And not Assembly, or Fortran or any other language?
Not assembly, no - /very/ little code is now written in assembly.
Now, yes. I'm talking about that formative period of mid-70s to mid-80s
when everything changed. From being dominated by mainframes, to 32-bit microprocessors which are only one step behind the 64-bit ones we have now.
FORTRAN efficiency used to be important for processor design, but not
for a very long time. (FORTRAN is near enough the same programming
model as C, however.)
Oh, right. In that case could it possibly have been the need to run Fortran efficiently that was a driving force in that period?
(I spent a year in the late 70s writing Fortran code in two scientific establishments in the UK. No one used C.)
Don't forget that at the point it all began to change, mid-70s to
mid-80s, C wasn't that dominant. Any C implementations for
microprocessors were incredibly slow and produced indifferent code.
The OSes I used (for PDP10, PDP11, ICL 4/72, Z80) had no C
involvement. When x86 popularised segmented memory, EVERYBODY hated it,
and EVERY language had a problem with it.
Yes - the choice of the 8086 for PC's was a huge mistake. It was
purely economics - the IBM designers wanted a 68000 processor.
When you looked at the 68000 more closely, it had nearly as much non-orthogonality as the 8086. (I was trying at that time to get my
company to switch to a processor like the 68k.)
(The 8086 was bearable, but it had one poor design choice that had huge implications: forming an address by shifting a 16-bit segment address by
4 bits instead of 8.
That meant an addressing range of only 1MB instead of 16MB, leading to a situation later where you could cheaply install 4MB or 8MB of memory,
but you couldn't easily make use of it.)
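For anyone who hasn't done real-mode programming, a hedged sketch of the address calculation being described (the function name is invented):

/* 8086 real mode: a 16-bit segment shifted left by 4 bits plus a 16-bit
 * offset gives a 20-bit (1 MB) physical address.  Shifting by 8 bits
 * would have given 24 bits (16 MB) instead. */
#include <stdint.h>
#include <stdio.h>

static uint32_t real_mode_addr(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;   /* at most 0x10FFEF */
}

int main(void)
{
    /* the 8086 reset vector, FFFF:0000, maps to physical 0xFFFF0 */
    printf("0x%05X\n", (unsigned)real_mode_addr(0xFFFF, 0x0000));
    return 0;
}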
According to you, without C, we would have been using 64KB segments
even with 32 bit registers, or we maybe would never have got to 32
bits at all. What nonsense!
Eh, no. I did not say anything /remotely/ like that.
It sounds like it! Just accept that C had no more nor less influence
than any other language /at that time/.
It does have, and has had for 40+ years, a /near/ monopoly on low-level
languages. You can dislike C as much as you want, but you really
cannot deny that!
It's also the fact that /I/ at least have also successfully avoided
using C for 40+ years (and, probably fairly uniquely, have used private languages). I'm sure there are other stories like mine that you don't
hear about.
But of course C was not the only influence on processor evolution.
OK, you admit now it was not '/massive/'; good!
Would you please stop making things up and pretending I said them?
You actually said this:
C is /massively/ influential to the general purpose CPUs we have today.
Which suggests that you don't think any other language comes close.
I don't know which individual language, if any, was most influential,
but I doubt C played a huge part because it came out too late, and was
not that popular in those formative years, by which time the way
processors were going to evolve was becoming clear anyway.
(That is, still dominated by von Neumann architectures, as has been the
case since long before C.)
But C probably has influenced modern 64-bit ABIs, even though they are supposed to be language-independent.
More interesting however is what Unix would have looked like without C.
How do you think it would have looked?
Case insensitive? Or maybe that's just wishful thinking.
Case insensitivity is a mistake, born from the days before computers
were advanced enough to have small letters as well as capitals.
On 22/11/2022 13:38, Bart wrote:
When you looked at the 68000 more closely, it had nearly as much
non-orthogonality as the 8086. (I was trying at that time to get my
company to switch to a processor like the 68k.)
No, it does not. (Yes, I have looked at it closely, and used 68k processors extensively.)
But C probably has influenced modern 64-bit ABIs, even though they are
supposed to be language-independent.
What makes you think they are supposed to be language independent? What makes you think they are not? What makes you care?
The types and terms from C are a very convenient way to describe an ABI,
since it is a language familiar to any programmer who might be
interested in the details of an ABI. Such ABI's only cover a
(relatively) simple common subset of possible interfaces, but do so in a
way that can be used from any language (with wrappers if needed) and can
be extended as needed.
People make ABI's for practical use. MS made the ABI for Win64 to suit their own needs and uses. AMD and a range of *nix developers (both OS
and application developers) and compiler developers got together to
develop the 64-bit x86 ABI used by everyone else, designed to suit
/their/ needs and uses.
Case insensitive? Or maybe that's just wishful thinking.
Case insensitivity is a mistake, born from the days before computers
were advanced enough to have small letters as well as capitals. It
leads to ugly inconsistencies, wastes the opportunity to convey useful semantic information, and is an absolute nightmare as soon as you stray
from the simple English-language alphabet.
I believe Unix's predecessor, Multics, was case-sensitive. But I could
be wrong.
Language A can talk to language B via the machine's ABI. Where does C
come into it?
Language A can talk to a library or OS component that resides in a DLL,
via the ABI. The library might have been implemented in C, or assembler,
or in anything else, but in binary form, is pure machine code anyway.
What makes /you/ think that such ABIs were invented purely for the use
of C programs? Do you think the designers of the ABI simply assumed that only programs written in the C language could call into the OS?
When you download a shared library DLL, do you think they have different versions depending on what language will be using the DLL?
I'm surprised the Unix and C developers even had a terminal that could
do upper and lower case.
I'm surprised the Unix and C developers even had a terminal that
could do upper and lower case. I was stuck with upper case for the
first year or two. [...]
Case-sensitivity was a luxury into the 80s.
On 22/11/2022 13:38, Bart wrote:
As I said, C is the language that adapts itself to the hardware, and
in fact still is the primary language now that can and does run on
every odd-ball architecture.
C does not "adapt itself to the hardware".
On 2022-11-22 18:13, Bart wrote:
Actually the Win64 ABI doesn't go into types much at all. The main types
Language A can talk to language B via the machine's ABI. Where does C
come into it?
Data types of arguments including padding/gaps in structures, calling conventions.
E.g. Windows' native calling convention is stdcall, while C
deploys cdecl.
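To illustrate the distinction (a sketch only: __stdcall and __cdecl are compiler extensions rather than standard C, this applies to 32-bit Windows, and under the Win64 ABI there is a single calling convention so both qualifiers are ignored; the function names are invented):

/* Win32 API entry points are declared stdcall: the callee pops the
 * arguments off the stack.  Ordinary C functions default to cdecl:
 * the caller pops them, which is what makes varargs possible. */
int __stdcall win32_style_function(int a, int b);
int __cdecl   plain_c_function(int a, int b);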
When you download a shared library DLL, do you think they have
different versions depending on what language will be using the DLL?
That is certainly a possibility.
On 22/11/2022 17:13, Bart wrote:
I'm surprised the Unix and C developers even had a terminal that
could do upper and lower case. I was stuck with upper case for the
first year or two. [...]
Case-sensitivity was a luxury into the 80s.
As per my nearby article, lower case was available for paper
tape long before [electronic] computers existed. It's difficult to
do word processing without a decent character set; I was doing it [admittedly in a rather primitive way] in the mid-60s. There were
some peripherals [esp many lineprinters, card punches and teletypes]
that were restricted to upper case, but lower case was scarcely a
"luxury" when many secretaries were using electric typewriters.
On 22/11/2022 17:47, Dmitry A. Kazakov wrote:
On 2022-11-22 18:13, Bart wrote:
Actually the Win64 ABI doesn't go into types much at all.
Language A can talk to language B via the machine's ABI. Where does C
come into it?
Data types of arguments including padding/gaps in structures,
calling conventions.
Or are you going to claim like David Brown that the hardware is like
that solely due to the need to run C programs?
That all disappears with 64 bits. With 32-bit DLLs, while there was
still one DLL, you needed to know the call-convention in use; this would have been part of the API. But while there were 100s of languages, there were only a handful of call conventions.
Plus, DLLs tend to include other DLLs; when the OS loads a DLL A, which imports DLL B, it will not know which language version of B to look for
(and they would all be called B.DLL).
On 2022-11-22 21:42, Bart wrote:
On 22/11/2022 17:47, Dmitry A. Kazakov wrote:
On 2022-11-22 18:13, Bart wrote:
Actually the Win64 ABI doesn't go into types much at all.
Language A can talk to language B via the machine's ABI. Where does
C come into it?
Data types of arguments including padding/gaps in structures,
calling conventions.
It is all about types. The funny thing is, it even specifies endianness
thanks to C's stupidity of unions - see how LARGE_INTEGER is defined.
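For reference, LARGE_INTEGER looks roughly like this (a simplified sketch; the real winnt.h spells the first member with DUMMYSTRUCTNAME macros and the DWORD/LONG/LONGLONG typedefs):

/* The union lets callers treat the value either as one 64-bit QuadPart
 * or as two 32-bit halves - and putting LowPart first bakes the
 * little-endian layout into the interface. */
typedef union _LARGE_INTEGER {
    struct {
        unsigned long LowPart;    /* DWORD    */
        long          HighPart;   /* LONG     */
    } u;
    long long QuadPart;           /* LONGLONG */
} LARGE_INTEGER;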
Or are you going to claim like David Brown that the hardware is like
that solely due to the need to run C programs?
Nobody would ever use any hardware if there is no C compiler. So David
is certainly right.
Long ago, there existed Lisp machines, machines designed for tagging
data with types etc. All this sank when C took the reins. Today the situation is slowly changing with FPGAs and the NN hype foaming over...
That all disappears with 64 bits. With 32-bit DLLs, while there was
still one DLL, you needed to know the call-convention in use; this
would have been part of the API. But while there were 100s of
languages, there were only a handful of call conventions.
There are as many conventions as languages because complex types and closures require techniques unknown to plain C.
Perhaps the computer department of my college, and the places I
worked at, were poorly equipped then. We used ASR33s and video
terminals that emulated those teletypes, so upper case only.
All the Fortran I wrote was in upper case that I can remember.
The file systems of my PDP10 machine at least used 'sixbit' encoding,
so could only do upper-case. The 'radix50' encoding of the PDP11
linker also restricted things to upper case.
The bitmaps fonts of early screens and dot-matrix printers may also
have been limited to upper case (the first video display of my own
was).
I think the Tektronix 4010 vector display I used was upper case
only.
My point was, there were so many restrictions, how did people manage
to write C code? It was only into the 1980s that I could reliably
make use of mixed case.
On 22/11/2022 22:24, Dmitry A. Kazakov wrote:
On 2022-11-22 21:42, Bart wrote:
On 22/11/2022 17:47, Dmitry A. Kazakov wrote:
On 2022-11-22 18:13, Bart wrote:
Actually the Win64 ABI doesn't go into types much at all.
Language A can talk to language B via the machine's ABI. Where does C come into it?
Data types of arguments including padding/gaps in structures,
calling conventions.
It is all about types. The funny thing is, it even specifies endianness
thanks to C's stupidity of unions - see how LARGE_INTEGER is defined.
LARGE_INTEGER is not mentioned in the ABI and is not listed here: https://learn.microsoft.com/en-us/windows/win32/winprog/windows-data-types.
The ABI really doesn't care about types other than it needs to know how
many bytes values occupy, and whether they need to go into GP or FLOAT registers. It is quite low-level.
Or are you going to claim like David Brown that the hardware is like
that solely due to the need to run C programs?
Nobody would ever use any hardware if there is no C compiler. So David
is certainly right.
You're both certainly wrong. People used hardware before C; they used hardware without C. And I spent a few years building bare computer
boards that I programmed from scratch, with no C compiler in sight.
That all disappears with 64 bits. With 32-bit DLLs, while there was
still one DLL, you needed to know the call-convention in use; this
would have been part of the API. But while there were 100s of
languages, there were only a handful of call conventions.
There are as many conventions as languages because complex types and
closures require techniques unknown to plain C.
If complex language X wants to talk to complex language Y,
If I export a function F taking an i64 type and returning an i64 type,
it is thanks to C that that is possible?
Nothing to do with the hardware
implementing a 64-bit type and making use of that fact.
On 22/11/2022 15:29, David Brown wrote:
Case insensitivity is a mistake, born from the days before computers
were advanced enough to have small letters as well as capitals.
I don't believe I have ever used a computer that did not "have
small letters". There has been some discussion over in "comp.compilers" recently, but it's basically the difference between punched cards and
paper tape. The Flexowriter can be traced back to the 1920s, and its
most popular form was certainly being used by computers in the 1950s,
so there really weren't many "days before" to be considered.
On 2022-11-23 01:03, Bart wrote:
LARGE_INTEGER is not mentioned in the ABI and is not listed here:
https://learn.microsoft.com/en-us/windows/win32/winprog/windows-data-types.
It is defined in winnt.h
The ABI really doesn't care about types other than it needs to know
how many bytes values occupy, and whether they need to go into GP or
FLOAT registers. It is quite low-level.
At this point, I must ask, did you ever use any OS API at all? Or maybe
you just do not know what a datatype is?
You're both certainly wrong. People used hardware before C; they used
hardware without C. And I spent a few years building bare computer
boards that I programmed from scratch, with no C compiler in sight.
We do not talk about hobby projects.
That all disappears with 64 bits. With 32-bit DLLs, while there was
still one DLL, you needed to know the call-convention in use; this
would have been part of the API. But while there were 100s of
languages, there were only a handful of call conventions.
There are as many conventions as languages because complex types and
closures require techniques unknown to plain C.
If complex language X wants to talk to complex language Y,
They just don't. Most more or less professionally designed languages
provide interfacing to and from C.
interfaced to a bare minimum. Dynamic languages are slightly better
because of their general primitivism and because they are actually
written in C. But dealing with real languages like C++ is almost
impossible, e.g. handling virtual tables etc. So nobody cares.
If I export a function F taking an i64 type and returning an i64 type,
it is thanks to C that that is possible?
For the machine you are using, the answer is yes.
Nothing to do with the hardware implementing a 64-bit type and making
use of that fact.
Not even with power supply unit and the screws holding the motherboard...
On 19/11/2022 17:01, Bart wrote:
On 16/11/2022 16:50, David Brown wrote:
Yes, but for you, a "must-have" list for a programming language would be mainly "must be roughly like ancient style C in functionality, but with enough change in syntax and appearance so that no one will think it is C". If that's what you like, and what pays for your daily bread, then that's absolutely fine.
On 18/11/2022 07:12, David Brown wrote:
Yes, it is a lot like C. It has a number of changes, some that I think are good, some that I think are bad, but basically it is mostly like C.
The above remarks imply strongly that my systems language is a
rip-off of C.
No, it does not. You can infer what you want from what I write, but I don't see any such implications from my remark.
If anyone were to write
a (relatively) simple structured language for low level work, suitable
for "direct" compilation to assembly on a reasonable selection of common general-purpose processors, and with the aim of giving a "portable alternative to writing in assembly", then the result will inevitably
have a good deal in common with C. There can be plenty of differences
in the syntax and details, but the "ethos" or "flavour" of the language
will be similar.
Note that I have referred to Pascal as C-like in this sense.
On 23/11/2022 09:04, Dmitry A. Kazakov wrote:
On 2022-11-23 01:03, Bart wrote:
LARGE_INTEGER is not mentioned in the ABI and is not listed here:
https://learn.microsoft.com/en-us/windows/win32/winprog/windows-data-types.
It is defined in winnt.h
In my 'windows.h' header for my C compilers, it is defined in windows.h.
So are myriad other types.
LARGE_INTEGER is not in there; it's something that is used for a handful
of functions out of 1000s. Maybe it was an early kind of type for
dealing with 64 bits and kept for compatibility.
Apart from that, ABIs really, really don't care what that chunk
represents. They are mainly concerned with where things go.
Why are you two trying to rewrite history?
If you really believe that, then both you and David Brown are deluded. I long suspected that C worshipping was more akin to a religious cult; it
now seems it's more widespread than I thought with people being
brainwashed into believing any old rubbish.
On 2022-11-23 12:55, Bart wrote:
On 23/11/2022 09:04, Dmitry A. Kazakov wrote:
On 2022-11-23 01:03, Bart wrote:
LARGE_INTEGER is not mentioned in the ABI and is not listed here:
https://learn.microsoft.com/en-us/windows/win32/winprog/windows-data-types.
It is defined in winnt.h
In my 'windows.h' header for my C compilers, it is defined in
windows.h. So are myriad other types.
Now you have an opportunity to look at it.
LARGE_INTEGER is not in there; it's something that is used for a
handful of functions out of 1000s. Maybe it was an early kind of type
for dealing with 64 bits and kept for compatibility.
LARGE_INTEGER is massively used in Windows API.
Apart from that, ABIs really, really don't care what that chunk
represents. They are mainly concerned with where things go.
Try to actually program using Windows API. You will know or at least
read some MS documentation. Start with something simple:
https://learn.microsoft.com/en-us/windows/win32/api/winnt/ns-winnt-token_access_information
Why are you two trying to rewrite history?
Maybe because we lived it through and saw it happen?...
If you really believe that, then both you and David Brown are deluded.
I long suspected that C worshipping was more akin to a religious cult;
it now seems it's more widespread than I thought with people being
brainwashed into believing any old rubbish.
I do not know about David, but I hate C and consider it a very bad
language.
hardware and software as well as corrupted minds of several generations
and keeps doing so.
On 22/11/2022 15:29, David Brown wrote:
On 22/11/2022 13:38, Bart wrote:
When you looked at the 68000 more closely, it had nearly as much
non-orthogonality as the 8086. (I was trying at that time to get my
company to switch to a processor like the 68k.)
No, it does not. (Yes, I have looked at it closely, and used 68k
processors extensively.)
As a compiler writer?
The first thing you notice is that you have to
decide whether to use D-registers or A-registers, as they had different characteristics, but the 3-bit register field of instructions could only
use one or the other.
That made the 8086 simpler because there was no choice! The registers
were limited and only one was general purpose.
But C probably has influenced modern 64-bit ABIs, even though they
are supposed to be language-independent.
What makes you think they are supposed to be language independent?
What makes you think they are not? What makes you care?
Language A can talk to language B via the machine's ABI. Where does C
come into it?
Language A can talk to a library or OS component that resides in a DLL,
via the ABI. The library might have been implemented in C, or assembler,
or in anything else, but in binary form, is pure machine code anyway.
What makes /you/ think that such ABIs were invented purely for the use
of C programs? Do you think the designers of the ABI simply assumed that only programs written in the C language could call into the OS?
When you download a shared library DLL, do you think they have different versions depending on what language will be using the DLL?
The types and terms from C are a very convenient way to describe an ABI,
They're pretty terrible actually. The types involved in the SYS V ABI can be expressed as follows in a form that everyone understands and many
languages use:
i8 i16 i32 i64 i128
u8 u16 u32 u64 u128
f32 f64 f128
This document (https://refspecs.linuxbase.org/elf/x86_64-abi-0.99.pdf)
lists the C equivalents as follows (only signed integers shown):
i8 char, signed char
i16 short, signed short
i32 int, signed int
i64 long, signed long, long long, signed long long
i128 __int128, signed __int128
(No use of int8_t etc despite the document being dated 2012.)
This comes up in APIs too where it is 100 times more relevant (only
compiler writers care about the ABI). The C denotations shown here are
not fit for purpose for language-neutral interfaces.
(Notice also that 'long' and 'long long' are both 64 bits, and that
'char' is assumed to be signed. In practice the C denotations would vary across platforms, while those i8-i128 would stay constant, provided only that the machine uses conventional register sizes.)
So it's more like, such interfaces were developed /despite/ C.
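To make the comparison concrete, here is how the integer part of that notation would map onto C on a SysV x86-64 (LP64) target - a sketch only, since the iN/uN names are not part of C itself:

/* Fixed-width aliases as they would look on LP64 Linux; on other
 * platforms the right-hand sides change while the names stay put. */
typedef signed char    i8;
typedef short          i16;
typedef int            i32;
typedef long           i64;
typedef unsigned char  u8;
typedef unsigned short u16;
typedef unsigned int   u32;
typedef unsigned long  u64;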
since it is a language familiar to any programmer who might be
interested in the details of an ABI. Such ABI's only cover a
(relatively) simple common subset of possible interfaces, but do so in
a way that can be used from any language (with wrappers if needed) and
can be extended as needed.
People make ABI's for practical use. MS made the ABI for Win64 to
suit their own needs and uses. AMD and a range of *nix developers
(both OS and application developers) and compiler developers got
together to develop the 64-bit x86 ABI used by everyone else, designed
to suit /their/ needs and uses.
x86-32 used a number of different ABIs depending on language and
compiler. x86-64 tends to use one ABI, which is a strong indication that that ABI was intended to work across languages and compilers.
Case insensitive? Or maybe that's just wishful thinking.
Case insensitivity is a mistake, born from the days before computers
were advanced enough to have small letters as well as capitals. It
leads to ugly inconsistencies, wastes the opportunity to convey useful
semantic information, and is an absolute nightmare as soon as you
stray from the simple English-language alphabet.
Yet Google searches are case-insensitive. How is that possible, given
that search strings can use Unicode which you say does not define case equivalents across most alphabets?
As are email addresses and domain names.
As are most things in everyday life, even now that it is all tied up
with computers and smartphones and tablets with everything being online.
(Actually, most people's exposure to case-sensitivity is in online passwords, which is also the worst place to have it, as usually you
can't see them!)
Your objections make no sense at all. Besides which, plenty of case-insensitive languages, file-systems and shell programs and
applications exist.
I believe Unix's predecessor, Multics, was case-sensitive. But I
could be wrong.
I'm surprised the Unix and C developers even had a terminal that could
do upper and lower case. I was stuck with upper case for the first year
or two. File-systems and global linker symbols were also restricted in length and case for a long time, to minimise space.
Case-sensitivity was a luxury into the 80s.
On 23/11/2022 12:40, Dmitry A. Kazakov wrote:
I do not know about David, but I hate C and consider it a very bad
language.
Which languages don't you hate?
But I can't figure out what Dmitry might like - unless he too has his
own personal language.
Name just /one/ real programming language that supports case-insensitive identifiers but is not restricted to ASCII. (Let's define "real programming language" as a programming language that has its own
Wikipedia entry.)
On 19/11/2022 22:23, James Harris wrote:
I remember reading that when AMD wanted to design a 64-bit
architecture they asked programmers (especially at Microsoft) what
they wanted. One thing was 'no segmentation'. The C model had
encouraged programmers to think in terms of flat address spaces, and
the mainstream segmented approach for x86 was a nightmare that people
didn't want to repeat.
I think you're ascribing too much to C. In what way did any other
languages (Algol, Pascal, Cobol, Fortran, even Ada by then) encourage
the use of segmented memory?
Do you mean because C required the use of different kinds of pointers,
and people were fed up with that? Whereas other languages hid that
detail better.
You might as well say then that Assembly was equally responsible since
it was even more of a pain to deal with segments!
(Actually, aren't the segments still there on x86? Except they are 4GB
in size instead of 64KB.)
On 20/11/2022 00:35, Bart wrote:
On 19/11/2022 22:23, James Harris wrote:
...
I remember reading that when AMD wanted to design a 64-bit
architecture they asked programmers (especially at Microsoft) what
they wanted. One thing was 'no segmentation'. The C model had
encouraged programmers to think in terms of flat address spaces, and
the mainstream segmented approach for x86 was a nightmare that people
didn't want to repeat.
I think you're ascribing too much to C. In what way did any other
languages (Algol, Pascal, Cobol, Fortran, even Ada by then) encourage
the use of segmented memory?
I wouldn't say they did. What I would say is that probably none of them
had C's influence on what programming became.
Yes, Cobol was widespread
for a long time but its design didn't get incorporated into later
languages. Conversely, much of Algol's approach was adopted by nearly
all later languages but it itself never achieved the widespread use of
C. Only C had widespread use as well as strong influence on others. Much
of the programming community today still thinks in C terms even 50 years (!!!) after its release.
On pointers to data consider the subexpression
f(p)
where p is a pointer. Even on a segmented machine that call has no
concept of whether p is pointing to, say, the stack or one of many data segments. In general, all pointers have to be flat: any pointer can
point anywhere; that's the C model.
Consider a segment of memory as a simple range from byte 'first' to byte 'last'. With such ranges:
* all accesses can be range checked automatically
* no access outside the range would be permitted
* the range could be extended or shortened
* the memory used could be moved around as needed
all without impacting a program which accesses them.
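A hedged sketch of that idea in plain C, using a checked "fat pointer" rather than hardware segmentation (all names invented, error handling omitted):

/* Every access goes through range_at(), which checks against
 * [first, last]; the backing storage can be resized or moved without
 * the calling code changing. */
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    unsigned char *first;   /* first valid byte */
    unsigned char *last;    /* last valid byte  */
} range_t;

static unsigned char *range_at(range_t r, size_t offset)
{
    assert(r.first + offset <= r.last);   /* automatic bounds check */
    return r.first + offset;
}

static range_t range_resize(range_t r, size_t new_len)
{
    unsigned char *p = realloc(r.first, new_len);   /* memory may move */
    return (range_t){ .first = p, .last = p + new_len - 1 };
}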
On 23/11/2022 15:56, James Harris wrote:
On 20/11/2022 00:35, Bart wrote:
On 19/11/2022 22:23, James Harris wrote:
...
I remember reading that when AMD wanted to design a 64-bit
architecture they asked programmers (especially at Microsoft) what
they wanted. One thing was 'no segmentation'. The C model had
encouraged programmers to think in terms of flat address spaces, and
the mainstream segmented approach for x86 was a nightmare that
people didn't want to repeat.
I think you're ascribing too much to C. In what way did any other
languages (Algol, Pascal, Cobol, Fortran, even Ada by then) encourage
the use of segmented memory?
I wouldn't say they did. What I would say is that probably none of
them had C's influence on what programming became.
Examples? Since the current crop of languages all have very different
ideas from C.
Yes, Cobol was widespread for a long time but its design didn't get
incorporated into later languages. Conversely, much of Algol's
approach was adopted by nearly all later languages but it itself never
achieved the widespread use of C. Only C had widespread use as well as
strong influence on others. Much of the programming community today
still thinks in C terms even 50 years (!!!) after its release.
Is it really C terms, or does that just happen to be the hardware model?
Yes, C is a kind of lingua franca that lots of people know, but notice
that people talk about a 'u64' type, something everyone understands, but
not 'unsigned long long int' (which is not even defined by C to be
exactly 64 bits), nor even `uint64_t` (which not even C programs
recognise unless you use stdint.h or inttypes.h!).
On pointers to data consider the subexpression
f(p)
where p is a pointer. Even on a segmented machine that call has no
concept of whether p is pointing to, say, the stack or one of many
data segments. In general, all pointers have to be flat: any pointer
can point anywhere; that's the C model.
Why the C model? Do you have any languages in mind with a different model?
Pointers or references occur in many languages (from that time period, Pascal, Ada, Algol68); I don't recall them being restricted in their
model of memory.
C, on the other, had lots of restrictions:
* Having FAR and NEAR pointer types
On 23/11/2022 16:36, Bart wrote:
I wouldn't say they did. What I would say is that probably none of
them had C's influence on what programming became.
Examples? Since the current crop of languages all have very different
ideas from C.
Cobol and Algol:
well as strong influence on others. Much of the programming community
today still thinks in C terms even 50 years (!!!) after its release.
Is it really C terms, or does that just happen to be the hardware model?
Yes, C is a kind of lingua franca that lots of people know, but notice
that people talk about a 'u64' type, something everyone understands,
but not 'unsigned long long int' (which is not even defined by C to be
exactly 64 bits), nor even `uint64_t` (which not even C programs
recognise unless you use stdint.h or inttypes.h!).
u64 is just a name.
Why the C model? Do you have any languages in mind with a different
model?
Yes, the C model is as stated: any pointer can point anywhere. A C
pointer must be able to point to rodata, stack, and anywhere in the data section.
Pointers or references occur in many languages (from that time period,
Pascal, Ada, Algol68); I don't recall them being restricted in their
model of memory.
C, on the other, had lots of restrictions:
* Having FAR and NEAR pointer types
Are you sure that FAR and NEAR were part of C?
C, on the other, had lots of restrictions:
* Having FAR and NEAR pointer types
* Having distinct object and function pointers (you aren't even
allowed to directly cast between them)
* Not being able to compare pointers to two different objects
It is the only one that I recall which exposes the fact that these
could all exist in different, non-compatible and /non-linear/ regions
of memory.
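A small illustration of the object-versus-function pointer point above (identifiers invented): ISO C defines no conversion between object pointers and function pointers, so the casts below are at best a common extension (POSIX requires them to work for dlsym, but the language standard does not):

void do_work(void);

void demo(void)
{
    void *obj = (void *)do_work;              /* not defined by ISO C  */
    void (*fn)(void) = (void (*)(void))obj;   /* likewise an extension */
    (void)fn;
}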
On 2022-11-23 16:03, David Brown wrote:
But I can't figure out what Dmitry might like - unless he too has his
own personal language.
No, I am not that megalomaniac. (:-))
I want stuff useful for software engineering. To me it is a DIY shop. I choose techniques I find useful in a long-term perspective and reject
others. I generally avoid academic exercises, hobby languages, big-tech/corporate/vendor-lock bullshit. You can guess which of your pet languages falls into which category. (:-))
On 23/11/2022 15:56, James Harris wrote:
On pointers to data consider the subexpression
f(p)
where p is a pointer. Even on a segmented machine that call has no
concept of whether p is pointing to, say, the stack or one of many
data segments. In general, all pointers have to be flat: any pointer
can point anywhere; that's the C model.
Why the C model? Do you have any languages in mind with a different model?
Pointers or references occur in many languages (from that time period, Pascal, Ada, Algol68); I don't recall them being restricted in their
model of memory.
C, on the other, had lots of restrictions:
* Having FAR and NEAR pointer types
* Having distinct object and function pointers (you aren't even allowed
to directly cast between them)
* Not being able to compare pointers to two different objects
On 23/11/2022 16:36, Bart wrote:
C, on the other, had lots of restrictions:
* Having FAR and NEAR pointer types
Never part of C. [Non-standard extension in some implementations.]
* Having distinct object and function pointers (you aren't even
allowed to directly cast between them)
Correctly so. In a proper HLL, type punning should, in general,
be forbidden. There could be a case made out for casting between two structures that are identical apart from the names of the components, otherwise it is a recipe for hard-to-find bugs.
* Not being able to compare pointers to two different objects
Of course you can. Such pointers compare as unequal. You can also reliably subtract pointers in some cases. What more can you
reasonably expect?
It is the only one that I recall which exposes the fact that these
could all exist in different, non-compatible and /non-linear/ regions
of memory.
"Exposes"? How? Where? Examples? [In either K&R C or standard
C, of course, not in some dialect implementation.]
* Not being able to compare pointers to two different objects
Of course you can. Such pointers compare as unequal. You can also reliably subtract pointers in some cases. What more can you reasonably expect?
C doesn't allow relative compares, or subtracting operators. Or rather, it will make those operations implementation defined or UB, simply because pointers could in fact refer to incompatible memory regions.
This goes against the suggestion that C is more conducive to linear
memory than any other languages.
It is the only one that I recall which exposes the fact that these could all exist in different, non-compatible and /non-linear/ regions of memory.
"Exposes"? How? Where? Examples? [In either K&R C or standard C, of course, not in some dialect implementation.]
What is being claimed is that it is largely C that has been responsible for linear memory layouts in hardware.
On 23/11/2022 16:23, Dmitry A. Kazakov wrote:
On 2022-11-23 16:03, David Brown wrote:
But I can't figure out what Dmitry might like - unless he too has his
own personal language.
No, I am not that megalomaniac. (:-))
I want stuff useful for software engineering. To me it is a DIY shop.
I choose techniques I find useful in a long-term perspective and reject
others. I generally avoid academic exercises, hobby languages,
big-tech/corporate/vendor-lock bullshit. You can guess which of your
pet languages falls into which category. (:-))
The languages I mostly use are C, C++ and Python, depending on the task
and the target system. (And while I enjoy working with each of these,
and see their advantages in particular situations, I also appreciate
that they are not good in other cases and they all have features I dislike.) Your criteria would not rule out any of these - I too
generally avoid languages with vendor lock-in, and small developer or
user communities. Academic exercise languages are of course no use
unless you are doing academic exercises.
Your criteria would also not rule out several key functional programming languages, including Haskell, OCaml, and Scala.
It would rule out C#, VB, Bart's languages, and possibly Java. Pascal
is in theory open and standard, but in practice it is disjoint with vendor-specific variants. (There's FreePascal, which has no lock-in.)
You would still have Ada, D, Erlang, Fortran, Forth, JavaScript, Lua,
Rust, Modula-2, Perl, and PHP.
I think that covers most of the big languages (I assume you also don't
like ones that have very small user bases).
On 2022-11-23 15:53, David Brown wrote:
Name just /one/ real programming language that supports
case-insensitive identifiers but is not restricted to ASCII. (Let's
define "real programming language" as a programming language that has
its own Wikipedia entry.)
1. https://en.wikipedia.org/wiki/Ada_(programming_language)
2. Ada Reference Manual 2.3:
Two identifiers are considered the same if they consist of the same sequence of characters after applying locale-independent simple case folding, as defined by documents referenced in the note in Clause 1 of ISO/IEC 10646:2011.
After applying simple case folding, an identifier shall not be
identical to a reserved word.
On 23/11/2022 18:31, Andy Walker wrote:
On 23/11/2022 16:36, Bart wrote:
C, on the other, had lots of restrictions:
* Having FAR and NEAR pointer types
Never part of C. [Non-standard extension in some implementations.]
* Having distinct object and function pointers (you aren't even
allowed to directly cast between them)
Correctly so. In a proper HLL, type punning should, in general, be forbidden. There could be a case made out for casting between two
structures that are identical apart from the names of the components,
otherwise it is a recipe for hard-to-find bugs.
* Not being able to compare pointers to two different objects
Of course you can. Such pointers compare as unequal. You can also reliably subtract pointers in some cases. What more can you
reasonably expect?
C doesn't allow relative compares, or subtracting operators. Or rather,
it will make those operations implementation defined or UB, simply
because pointers could in fact refer to incompatible memory regions.
This goes against the suggestion that C is more conducive to linear
memory than any other languages.
It is the only one that I recall which exposes the fact that these
could all exist in different, non-compatible and /non-linear/ regions
of memory.
"Exposes"? How? Where? Examples? [In either K&R C or standard
C, of course, not in some dialect implementation.]
What is being claimed is that it is largely C that has been responsible
for linear memory layouts in hardware.
On 2022-11-23 19:38, David Brown wrote:
On 23/11/2022 16:23, Dmitry A. Kazakov wrote:
On 2022-11-23 16:03, David Brown wrote:
But I can't figure out what Dmitry might like - unless he too has
his own personal language.
No, I am not that megalomaniac. (:-))
I want stuff useful for software engineering. To me it is a DIY shop.
I choose techniques I find useful in long term perspective and reject
other. I generally avoid academic exercises, hobby languages,
big-tech/corporate/vendor-lock bullshit. You can guess which of your
pet languages falls into which category. (:-))
The languages I mostly use are C, C++ and Python, depending on the
task and the target system. (And while I enjoy working with each of
these, and see their advantages in particular situations, I also
appreciate that they are not good in other cases and they all have
features I dislike.) Your criteria would not rule out any of these -
I too generally avoid languages with vendor lock-in, and small
developer or user communities. Academic exercise languages are of
course no use unless you are doing academic exercises.
Your criteria would also not rule out several key functional
programming languages, including Haskell, OCaml, and Scala.
It would rule out C#, VB, Bart's languages, and possibly Java. Pascal
is in theory open and standard, but in practice it is disjoint with
vendor-specific variants. (There's FreePascal, which has no lock-in.)
You would still have Ada, D, Erlang, Fortran, Forth, JavaScript, Lua,
Rust, Modula-2, Perl, and PHP.
I think that covers most of the big languages (I assume you also don't
like ones that have very small user bases).
Narrow user base is no reason to reject a language. However there is a danger that the language might go extinct.
To me most important is the language toolbox:
- modules, separate compilation, late bindings
- abstract data types
- generic programming (i.e. in terms of sets of types)
- formal verification, contracts, correctness proofs
- some object representation control
- interfacing to C and thus system and other libraries
- high level concurrency support
- program readability, reasonable syntax AKA don't be APL (:-))
- standard library abstracting the underlying OS
- some type introspection
Things not important or ones I actively avoid are
- lambdas
- relational algebra
- patterns
- recursive types
- closures
- dynamic/duck/weak/no-typing
- macros/preprocessor/templates/generics
- standard container library (like std or boost)
- standard GUI library
On 23/11/2022 18:53, Bart wrote:
* Not being able to compare pointers to two different objects
Of course you can. Such pointers compare as unequal. You can also reliably subtract pointers in some cases. What more can you reasonably expect?
C doesn't allow relative compares, or subtracting operators. Or rather, it will make those operations implementation defined or UB, simply because pointers could in fact refer to incompatible memory regions.
N2478 [other standards are available], section 6.5.6.10:
" When two pointers are subtracted, both shall point to elements of
" the same array object, or one past the last element of the array
" object; the result is the difference of the subscripts of the two
" array elements. The size of the result is implementation-defined,
" and its type (a signed integer type) is ptrdiff_t defined in the
" <stddef.h> header. If the result is not representable in an object
" of that type, the behavior is undefined. "
So the behaviour is undefined only if the subtraction overflows, and is implementation defined only to the extent of what size of signed integer
the implementation prefers. It's difficult to see what other behaviour could reasonably be specified in the Standard.
Section 6.5.8.6:
" When two pointers are compared, the result depends on the relative
" locations in the address space of the objects pointed to. If two
" pointers to object types both point to the same object, or both
" point one past the last element of the same array object, they
" compare equal. If the objects pointed to are members of the same
" aggregate object, pointers to structure members declared later
" compare greater than pointers to members declared earlier in the
" structure, and pointers to array elements with larger subscript
" values compare greater than pointers to elements of the same
" array with lower subscript values. All pointers to members of the
" same union object compare equal. If the expression P points to an
" element of an array object and the expression Q points to the last
" element of the same array object, the pointer expression Q+1
" compares greater than P. In all other cases, the behavior is
" undefined. "
Well, it's rather verbose, but it all seems common sense to me.
No mention anywhere of "incompatible memory regions", so I suspect that
you're making it up based on what you think C is like rather than how
it is defined in reality.
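And a similar sketch for the comparison rules quoted above (again my own illustration, not from the post or the standard):
    struct S { int x; int y; };

    void pointer_compare_demo(void)
    {
        int a[10];
        struct S s;
        int c, d;

        int in_array  = &a[2] < &a[7];  /* defined: same array, so 1             */
        int in_struct = &s.x < &s.y;    /* defined: 'y' is declared later, so 1  */
        int separate  = &c < &d;        /* distinct objects: undefined behaviour */

        (void)in_array; (void)in_struct; (void)separate;
    }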
May be worth noting that [eg] Algol defines only the relations
"is" and "isn't" between pointers; C is at least more "helpful" than that. But that is largely driven by C's use of pointers in arrays.
This goes against the suggestion that C is more conducive to linear
memory than any other languages.
Well, /I/ have made no such suggestion, and don't really even
know what that is claimed to mean. Most HLLs specifically hide the
layout and structure of memory from ordinary programmers; no doubt a
Good Thing.
What is being claimed is that it is largely C that has been
responsible for linear memory layouts in hardware. It is the only one
that I recall which exposes the fact that these could all exist in
different, non-compatible and /non-linear/ regions of memory.
"Exposes"? How? Where? Examples? [In either K&R C or standard
C, of course, not in some dialect implementation.]
Not a claim I have ever made, nor even seen before. There is remarkably little in [standard] C that relates to memory layouts. So
I take it that you have no examples of this claimed exposure?
On 23/11/2022 21:25, Dmitry A. Kazakov wrote:
On 2022-11-23 19:38, David Brown wrote:
On 23/11/2022 16:23, Dmitry A. Kazakov wrote:
On 2022-11-23 16:03, David Brown wrote:
But I can't figure out what Dmitry might like - unless he too has
his own personal language.
No, I am not that megalomaniac. (:-))
I want stuff useful for software engineering. To me it is a DIY
shop. I choose techniques I find useful in a long-term perspective and
reject others. I generally avoid academic exercises, hobby languages,
big-tech/corporate/vendor-lock bullshit. You can guess which of your
pet languages falls into which category. (:-))
The languages I mostly use are C, C++ and Python, depending on the
task and the target system. (And while I enjoy working with each of
these, and see their advantages in particular situations, I also
appreciate that they are not good in other cases and they all have
features I dislike.) Your criteria would not rule out any of these -
I too generally avoid languages with vendor lock-in, and small
developer or user communities. Academic exercise languages are of
course no use unless you are doing academic exercises.
Your criteria would also not rule out several key functional
programming languages, including Haskell, OCaml, and Scala.
It would rule out C#, VB, Bart's languages, and possibly Java.
Pascal is in theory open and standard, but in practice it is disjoint
with vendor-specific variants. (There's FreePascal, which has no
lock-in.)
You would still have Ada, D, Erlang, Fortran, Forth, JavaScript, Lua,
Rust, Modula-2, Perl, and PHP.
I think that covers most of the big languages (I assume you also
don't like ones that have very small user bases).
Narrow user base is no reason to reject a language. However there is a
danger that the language might go extinct.
Yes. (I said "very small user bases".)
To me most important is the language toolbox:
- modules, separate compilation, late bindings
- abstract data types
- generic programming (i.e. in terms of sets of types)
I thought you didn't like that?
- formal verification, contracts, correctness proofs
Yet you reject functional programming?
You can do a bit of formal
proofs with SPARK, but people doing serious formal correctness proofs
tend to prefer pure functional programming languages.
- some object representation control
- interfacing to C and thus system and other libraries
- high level concurrency support
- program readability, reasonable syntax AKA don't be APL (:-))
- standard library abstracting the underlying OS
- some type introspection
I think Haskell would fit for all of that. And C++ is as good as Ada.
Things not important or ones I actively avoid are
- lambdas
- relational algebra
- patterns
- recursive types
- closures
- dynamic/duck/weak/no-typing
- macros/preprocessor/templates/generics
So generics are important to you, but you actively avoid them?
That made the 8086 simpler because there was no choice! The registers
were limited and only one was general purpose.
On 23/11/2022 16:51, James Harris wrote:
On 23/11/2022 16:36, Bart wrote:
I wouldn't say they did. What I would say is that probably none of
them had C's influence on what programming became.
Examples? Since the current crop of languages all have very different
ideas from C.
Cobol and Algol:
I was asking about C's influence, but those two languages predated C.
well as strong influence on others. Much of the programming
community today still thinks in C terms even 50 years (!!!) after
its release.
Is it really C terms, or does that just happen to be the hardware model?
Yes, C is a kind of lingua franca that lots of people know, but
notice that people talk about a 'u64' type, something everyone
understands, but not 'unsigned long long int' (which is not even
defined by C to be exactly 64 bits), nor even `uint64_t` (which not
even C programs recognise unless you use stdint.h or inttypes.h!).
u64 is just a name.
So what are the 'C terms' you mentioned?
If talking about
primitive types, for example, u64 or uint64 or whatever are common ways
of referring to a 64-bit unsigned integer type, then unless the
discussion is specifically about C, you wouldn't use C denotations for it.
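For what it's worth, this shorthand is usually set up once per project with a handful of typedefs, something like this (a sketch of the common convention, not any particular project's header):
    #include <stdint.h>

    typedef uint8_t  u8;
    typedef uint16_t u16;
    typedef uint32_t u32;
    typedef uint64_t u64;

    u64 total_bytes = 0;   /* anyone reading this knows it is 64-bit and unsigned */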
Why the C model? Do you have any languages in mind with a different
model?
Yes, the C model is as stated: any pointer can point anywhere. A C
pointer must be able to point to rodata, stack, and anywhere in the
data section.
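A trivial sketch of what "point anywhere" means in practice, assuming a conventional hosted implementation (names are illustrative only):
    static int counter;                  /* data section             */
    static const char *greeting = "hi";  /* string literal in rodata */

    void point_anywhere_demo(void)
    {
        int local = 42;                  /* stack                    */
        const void *p;

        p = &counter;                    /* one pointer variable must be able to hold */
        p = greeting;                    /* addresses from any of these regions       */
        p = &local;
        (void)p;
    }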
And that is different from any other language that had pointers, how?
Because I'm having trouble understanding how you can attribute linear memory models to C and only C, when it is the one language that exposes
the limitations of non-linear memory.
Pointers or references occur in many languages (from that time
period, Pascal, Ada, Algol68); I don't recall them being restricted
in their model of memory.
C, on the other, had lots of restrictions:
* Having FAR and NEAR pointer types
Are you sure that FAR and NEAR were part of C?
They were part of implementations of it for 8086. There were actually 'near', 'far' and 'huge'. I think a 'far' pointer had a fixed segment part.
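For anyone who never met them: these were compiler extensions in the 16-bit x86 dialects (Microsoft, Borland and others), not standard C. From memory, and with spellings that varied between compilers, they looked roughly like this:
    char near *np;   /* 16-bit offset within the current data segment                  */
    char far  *fp;   /* 32-bit segment:offset; arithmetic only changes the offset part */
    char huge *hp;   /* 32-bit, normalised so arithmetic can cross 64K boundaries      */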
On 22/11/2022 18:13, Bart wrote:
As a compiler writer?
As an assembly programmer and C programmer.
The first thing you noticed is that you have to decide whether to use
D-registers or A-registers, as they had different characteristics, but
the 3-bit register field of instructions could only use one or the other.
Yes, although they share quite a bit in common too. You have 8 data registers that are all orthogonal and can be used for any data
instructions as source and destination, all 32 bit. You have 8 address registers that could all be used for all kinds of addressing modes (and
a few kinds of calculations, and as temporary storage) - the only
special one was A7 that was used for stack operations (as well as being available for all the other addressing modes).
How does that even begin to compare to the 8086 with its 4 schizophrenic "main" registers that are sometimes 16-bit, sometimes two 8-bit
registers, with a wide range of different dedicated usages for each register? Then you have 4 "index" registers, each with different
dedicated uses. And 4 "segment" registers, each with different
dedicated uses.
Where the 68000 has wide-spread, planned and organised orthogonality and flexibility, the 8086 is a collection of random dedicated bits and pieces.
You can also understand it by looking at the processor market. Real
CISC with dedicated and specialised registers is dead. In the progress
of x86 through 32-bit and then 64-bit, the architecture became more and
more orthogonal - the old registers A, B, C, D, SI, DI, etc., are now no more than legacy alternative names for r0, r1, etc., general purpose registers.
i8 i16 i32 i64 i128
u8 u16 u32 u64 u128
f32 f64 f128
Or they can be expressed in a form that everyone understands, like
"char", "int", etc., that are defined in the ABI, and that everybody and every language /does/ use when integrating between different languages.
That document has no mention anywhere of your personal short names for size-specific types.
It has a table stating the type names and sizes.
Think of it as just a definition of the technical terms used in the document, no different from when one processor reference might define
"word" to mean 16 bits and another define "word" to mean 32 bits.
How does Google manage case-insensitive searches with text in Unicode in many languages? By being /very/ smart. I didn't say it was impossible
to be case-insensitive beyond plain English alphabet, I said it was an "absolute nightmare". It is.
It is done where it has to be done -
you'll find all major databases have support for doing sorting,
searching, and case translation for large numbers of languages and alphabets. It is a /huge/ undertaking to handle it all. You don't do
it if it is not important.
Name just /one/ real programming language that supports case-insensitive identifiers
but is not restricted to ASCII. (Let's define "real
programming language" as a programming language that has its own
Wikipedia entry.)
There are countless languages that have case-sensitive Unicode
identifiers, because that's easy to implement and useful for programmers.
Domain names are case insensitive if they are in ASCII.
For other
characters, it gets complicated.
Programmers are not "most people". Programs are not "most things in everyday life".
Most people are quite tolerant of spelling mistakes in everyday life -
do you think programming languages should be too?
They do exist, yes. That does not make them a good idea.
The stuff about C being solely responsible may just have been a wind-up.
It's not me making the claims.
On 2022-11-23 21:59, David Brown wrote:
I think Haskell would fit for all of that. And C++ is as good as Ada.
C++ has problems with high-level concurrency and massive syntax issues. Looking at modern C++ code I am not sure whether it is plain text or Base64-encoded.
The 8086 was horrible in all sorts of ways. Comparing a 68000 with an
8086 is like comparing a Jaguar E-type with a bathtub with wheels. And
for the actual chip used in the first PC, an 8088, half the wheels were removed.
On 23/11/2022 21:01, Bart wrote:
...
The stuff about C being solely responsible may just have been a wind-up.
Maybe I've missed it but I've not noticed anyone claim C was solely responsible.
...
It's not me making the claims.
It might be you making up the claims. ;-)
However, what aspects of today's processors do you think owe anything
to C?
Two of the first machines I used were PDP10 and PDP11, developed by
DEC in the 1960s, both using linear memory spaces. While the former was
word-based, the PDP11 was byte-addressable, just like the IBM 360 also
was.
Of the PDP10 and IBM 360? Designed in the 1960s and discontinued in
1983 and 1979 respectively. C only came out in a first version in 1972.
Or are you going to claim like David Brown that the hardware is like
that solely due to the need to run C programs?
Nobody would ever use any hardware if there is no C compiler. So
David is certainly right.
On 23/11/2022 20:24, Andy Walker wrote:
On 23/11/2022 18:53, Bart wrote:
* Not being able to compare pointers to two different objects
Of course you can. Such pointers compare as unequal. You can
also reliably subtract pointers in some cases. What more can you
reasonably expect?
C doesn't allow relative compares, or subtracting operators. Or
rather, it will make those operations implementation defined or UB,
simply because pointers could in fact refer to incompatible memory
regions.
N2478 [other standards are available], section 6.5.6.10:
" When two pointers are subtracted, both shall point to elements of
" the same array object, or one past the last element of the array
" object; the result is the difference of the subscripts of the two
" array elements. The size of the result is implementation-defined,
" and its type (a signed integer type) is ptrdiff_t defined in the
" <stddef.h> header. If the result is not representable in an object
" of that type, the behavior is undefined. "
So the behaviour is undefined only if the subtraction overflows, and is
implementation defined only to the extent of what size of signed integer
the implementation prefers. It's difficult to see what other behaviour
could reasonably be specified in the Standard.
Section 6.5.8.6:
" When two pointers are compared, the result depends on the relative
" locations in the address space of the objects pointed to. If two
" pointers to object types both point to the same object, or both
" point one past the last element of the same array object, they
" compare equal. If the objects pointed to are members of the same
" aggregate object, pointers to structure members declared later
" compare greater than pointers to members declared earlier in the
" structure, and pointers to array elements with larger subscript
" values compare greater than pointers to elements of the same
" array with lower subscript values. All pointers to members of the
" same union object compare equal. If the expression P points to an
" element of an array object and the expression Q points to the last
" element of the same array object, the pointer expression Q+1
" compares greater than P. In all other cases, the behavior is
" undefined. "
Well, it's rather verbose, but it all seems common sense to me.
So, basically, everything is fully defined when both pointers refer to
the same objects, which is what I said, more briefly.
No mention anywhere of "incompatible memory regions", so I suspect that
you're making it up based on what you think C is like rather than how
it is defined in reality.
This part of it is implied by those restrictions, when you think of the reasons why they might apply.
Except C applies those restrictions whether or not pointers to those memory regions would be compatible.
In fact, you /can/ have distinct kinds of memory, though it is more common on older hardware, or in microcontrollers.
But this is all by the by; my quest was trying to figure out what it was
about how C (and only C) does pointers that made architecture
designers decide they needed linear memory rather than segmented.
The stuff about C being solely responsible may just have been a wind-up.
May be worth noting that [eg] Algol defines only the relations
"is" and "isn't" between pointers; C is at least more "helpful" than
that. But that is largely driven by C's use of pointers in arrays.
This goes against the suggestion that C is more conducive to linear
memory than any other languages.
Well, /I/ have made no such suggestion, and don't really even
know what that is claimed to mean. Most HLLs specifically hide the
layout and structure of memory from ordinary programmers; no doubt a
Good Thing.
At least 3 people in the group were claiming all sorts of unlikely
things of C.
What is being claimed is that it is largely C that has been
responsible for linear memory layouts in hardware. It is the only one
that I recall which exposes the fact that these could all exist in
different, non-compatible and /non-linear/ regions of memory.
"Exposes"? How? Where? Examples? [In either K&R C or standard
C, of course, not in some dialect implementation.]
Not a claim I have ever made, nor even seen before. There is
remarkably little in [standard] C that relates to memory layouts. So
I take it that you have no examples of this claimed exposure?
It's not me making the claims.
On 23/11/2022 14:53, David Brown wrote:
On 22/11/2022 18:13, Bart wrote:
As a compiler writer?
As an assembly programmer and C programmer.
The first thing you noticed is that you have to decide whether to use
D-registers or A-registers, as they had different characteristics,
but the 3-bit register field of instructions could only use one or
the other.
Yes, although they share quite a bit in common too. You have 8 data
registers that are all orthogonal and can be used for any data
instructions as source and destination, all 32 bit. You have 8
address registers that could all be used for all kinds of addressing
modes (and a few kinds of calculations, and as temporary storage) -
the only special one was A7 that was used for stack operations (as
well as being available for all the other addressing modes).
How does that even begin to compare to the 8086 with its 4
schizophrenic "main" registers that are sometimes 16-bit, sometimes
two 8-bit registers, with a wide range of different dedicated usages
for each register? Then you have 4 "index" registers, each with
different dedicated uses. And 4 "segment" registers, each with
different dedicated uses.
Where the 68000 has wide-spread, planned and organised orthogonality
and flexibility, the 8086 is a collection of random dedicated bits and
pieces.
It's too big an effort to dig into now, many decades on, what gave me that impression about the 68K. But the big one /is/ those two kinds of registers.
Current machines already have GP and float registers to make things more difficult, but here there are separate registers for integers - and
integers that might be used as memory addresses.
So you would have instructions that operated on one set but not the
other. You'd need to decide whether functions returned values in D0 or A0.
Glancing at the instruction set now, you have ADD which adds to
everything except A regs; ADDA which /only/ adds to AREGS.
ADDI which adds immed values to everything except AREGS, and ADDQ which
adds small values (1..8) to everything /including/ AREGS.
Similarly with ANDI, which works for every dest except AREGS, but there
is no version for AREGS (so if you were playing with tagged pointers and needed to clear the bottom bits then use them for an address, it gets awkward).
With a compiler, you had to make decisions on whether it's best to start evaluating in DREGS or AREGS and then move across, if it involved mixed operations that were only available for one set.
Note that the 80386 processor, which apparently first appeared in 1985, removed many of the restrictions of the 8086, also widening the
registers but not adding any more. Further, these 32-bit additions and
new address modes were available while running in 16-bit mode within a 16-bit application.
You can also understand it by looking at the processor market. Real
CISC with dedicated and specialised registers is dead. In the
progress of x86 through 32-bit and then 64-bit, the architecture
became more and more orthogonal - the old registers A, B, C, D, SI,
DI, etc., are now no more than legacy alternative names for r0, r1,
etc., general purpose registers.
What became completely unorthogonal on x86 is the register naming. It's
a zoo of mismatched names of mixed lengths. The mapping is also bizarre, with the stack pointer somewhere below the middle.
(However that is easy to fix as I can use my own register names and
ordering as well as the official names. My 64-bit registers are called
D0 to D15, with D15 (aka Dstack) being the stack pointer.)
i8 i16 i32 i64 i128
u8 u16 u32 u64 u128
f32 f64 f128
Or they can be expressed in a form that everyone understands, like
"char", "int", etc., that are defined in the ABI, and that everybody
and every language /does/ use when integrating between different
languages.
Sorry, but C typenames using C syntax are NOT good enough, not for cross-language use. You don't really want to see 'int long unsigned
long'; you want 'uint64' or 'u64'.
Even C decided that `int` and `char` were not good enough by adding types
like `int32_t` and ... sorry I can't even tell you what `char`
corresponds to. That is how rubbish C type designations are.
That document has no mention anywhere of your personal short names for
size-specific types.
It uses names of its own like 'unsigned eightbyte' which unequivocally describes the type. However you will see `u64` all over forums; you will never see `unsigned eightbyte`, and never 'unsigned long long int'
outside of C forums or actual C code.
It has a table stating the type names and sizes.
Yes, that helps too. What doesn't help is just using 'long'.
Think of it as just a definition of the technical terms used in the
document, no different from when one processor reference might define
"word" to mean 16 bits and another define "word" to mean 32 bits.
So defining a dozen variations on 'unsigned long long int` is better
than just using `u64` or `uint64`?
That must be the reason why a dozen different languages have all adopted those C designations because they work so well and are so succinct and unambiguous. Oh, hang on...
How does Google manage case-insensitive searches with text in Unicode
in many languages? By being /very/ smart. I didn't say it was
impossible to be case-insensitive beyond plain English alphabet, I
said it was an "absolute nightmare". It is.
No, it really isn't. Now you're making things up. You don't need to be
very smart at all, it's actually very easy.
It is done where it has to be done - you'll find all major databases
have support for doing sorting, searching, and case translation for
large numbers of languages and alphabets. It is a /huge/ undertaking
to handle it all. You don't do it if it is not important.
Think about the 'absolute nightmare' if /everything/ was case sensitive
and a database has 1000 variations of people called 'David Brown'.
(There could be 130,000 with my name.)
Now imagine talking over the phone to someone, they create an account in
the name you give them, but they use or omit capitalisation you weren't aware of. How would you log in?
Name just /one/ real programming language that supports
case-insensitive identifiers
I'm not talking about Unicode identifiers. I wouldn't go there because
there are too many issues. For a start, which of the 1.1 million
characters should be allowed at the beginning, and which within an identifier?
but is not restricted to ASCII. (Let's define "real programming
language" as a programming language that has its own Wikipedia entry.)
There are countless languages that have case-sensitive Unicode
identifiers, because that's easy to implement and useful for programmers.
And also a nightmare, since there are probably 20 distinct characters
that share the same glyph as 'A'.
Adding Unicode to identifiers is too easy to do badly.
Domain names are case insensitive if they are in ASCII.
Because?
For other characters, it gets complicated.
So, the same situation with language keywords and commands in CLIs.
But hey, think of the advantage of having Sort and sorT working in decreasing/increasing order; no need to specify that separately. Plus
you have 14 more variations to apply meanings to. Isn't this the point
of being case-sensitive?
Because if it isn't, then I don't get it. On Windows, I can type 'sort'
or `SORT`, it doesn't matter. I don't even need to look at the screen or have to double-check caps lock.
BTW my languages (2 HLLs and one assembler) use case-insensitive
identifiers and keywords, but allow case-sensitive names when they are sometimes needed, mainly for working with FFIs.
It really isn't hard at all.
Programmers are not "most people". Programs are not "most things in
everyday life".
Most people are quite tolerant of spelling mistakes in everyday life -
do you think programming languages should be too?
Using case is not a spelling mistake; it's a style. In my languages,
someone can write 'int', 'Int' or 'INT' according to preference.
Or they can use CamelCase if they like that, but someone importing such
a function can just write camelcase if they hate the style.
I use upper case when writing debug code so that I can instantly
identify it.
They do exist, yes. That does not make them a good idea.
Yes, it does. How do you explain to somebody why using exact case is absolutely essential, when it clearly shouldn't matter?
Look: I create my own languages, yes? And I could have chosen at any
time to make them case sensitive, yes?
So why do you think I would choose to make like an 'absolute nightmare'
for myself?
The reason is obvious: because case insensitivity just works better and
is far more useful.
On 22/11/2022 15:29, David Brown wrote:
The 8086 was horrible in all sorts of ways. Comparing a 68000 with an
8086 is like comparing a Jaguar E-type with a bathtub with wheels.
And for the actual chip used in the first PC, an 8088, half the wheels
were removed.
You've forgotten the 68008.
And current AMD and Intel chips are bathtubs with wheels and rocket
engines!
(The great thing about car analogies is how much they can be abused...)
On 23/11/2022 23:42, Bart wrote:
Using case is not a spelling mistake; it's a style. In my languages,
someone can write 'int', 'Int' or 'INT' according to preference.
No, it is a mess.
On 24/11/2022 15:03, David Brown wrote:
On 23/11/2022 23:42, Bart wrote:
Using case is not a spelling mistake; it's a style. In my languages,
someone can write 'int', 'Int' or 'INT' according to preference.
No, it is a mess.
Allowing int Int INT to be three distinct types, or to represent types, variables, functions etc, is perfectly fine?
int Int = INT;
On 24/11/2022 16:55, Bart wrote:
On 24/11/2022 15:03, David Brown wrote:
On 23/11/2022 23:42, Bart wrote:
Using case is not a spelling mistake; it's a style. In my languages,
someone can write 'int', 'Int' or 'INT' according to preference.
No, it is a mess.
Allowing int Int INT to be three distinct types, or to represent
types, variables, functions etc, is perfectly fine?
int Int = INT;
Contrast
MyVal := a
myVal := myval + b
Are you happy for a language to allow so much inconsistency?
On 2022-11-24 18:56, James Harris wrote:
On 24/11/2022 16:55, Bart wrote:
On 24/11/2022 15:03, David Brown wrote:
On 23/11/2022 23:42, Bart wrote:
Using case is not a spelling mistake; it's a style. In my
languages, someone can write 'int', 'Int' or 'INT' according to
preference.
No, it is a mess.
Allowing int Int INT to be three distinct types, or to represent
types, variables, functions etc, is perfectly fine?
int Int = INT;
Contrast
MyVal := a
myVal := myval + b
Are you happy for a language to allow so much inconsistency?
Make it
MyVal := a
myVal := MyVal + b
better be case-sensitive?
On 24/11/2022 18:02, Dmitry A. Kazakov wrote:
On 2022-11-24 18:56, James Harris wrote:
On 24/11/2022 16:55, Bart wrote:
On 24/11/2022 15:03, David Brown wrote:
On 23/11/2022 23:42, Bart wrote:
Using case is not a spelling mistake; it's a style. In my
languages, someone can write 'int', 'Int' or 'INT' according to
preference.
No, it is a mess.
Allowing int Int INT to be three distinct types, or to represent
types, variables, functions etc, is perfectly fine?
int Int = INT;
Contrast
MyVal := a
myVal := myval + b
Are you happy for a language to allow so much inconsistency?
Make it
MyVal := a
myVal := MyVal + b
better be case-sensitive?
My point (to you and Bart) is that programmers can choose identifier
names so the latter example need not arise unless it is written deliberately;
but if the compiler folds case then programmers can
/mistype/ names accidentally, leading to the messy inconsistency
mentioned above.
On 24/11/2022 16:55, Bart wrote:
On 24/11/2022 15:03, David Brown wrote:
On 23/11/2022 23:42, Bart wrote:
Using case is not a spelling mistake; it's a style. In my languages,
someone can write 'int', 'Int' or 'INT' according to preference.
No, it is a mess.
Allowing int Int INT to be three distinct types, or to represent
types, variables, functions etc, is perfectly fine?
int Int = INT;
Contrast
MyVal := a
myVal := myval + b
Are you happy for a language to allow so much inconsistency?
On 2022-11-24 19:07, James Harris wrote:
On 24/11/2022 18:02, Dmitry A. Kazakov wrote:
On 2022-11-24 18:56, James Harris wrote:
On 24/11/2022 16:55, Bart wrote:
On 24/11/2022 15:03, David Brown wrote:
On 23/11/2022 23:42, Bart wrote:
Using case is not a spelling mistake; it's a style. In my
languages, someone can write 'int', 'Int' or 'INT' according to preference.
No, it is a mess.
Allowing int Int INT to be three distinct types, or to represent
types, variables, functions etc, is perfectly fine?
int Int = INT;
Contrast
MyVal := a
myVal := myval + b
Are you happy for a language to allow so much inconsistency?
Make it
MyVal := a
myVal := MyVal + b
better be case-sensitive?
My point (to you and Bart) is that programmers can choose identifier
names so the latter example need not arise unless it is written
deliberately;
Why did you suggest an error? The point is, you could not know. Nobody could.
but if the compiler folds case then programmers can /mistype/ names
accidentally, leading to the messy inconsistency mentioned above.
Same question. Why do you think that the example I gave was mistyped?
In a case-insensitive language mistyping the case has no effect on the program legality. Any decent IDE enforces preferred case style.
Moreover, tools for the case-sensitive languages like C++ do just the
same. You cannot have reasonable names in C++ anymore. There would be lurking clang-format or SonarQube configured to force something a three
year old suffering dyslexia would pen... (:-))
On 24/11/2022 18:39, Dmitry A. Kazakov wrote:
On 2022-11-24 19:07, James Harris wrote:
My point (to you and Bart) is that programmers can choose identifier
names so the latter example need not arise unless it is written
deliberately;
Why did you suggest an error? The point is, you could not know. Nobody
could.
but if the compiler folds case then programmers can /mistype/ names
accidentally, leading to the messy inconsistency mentioned above.
Same question. Why do you think that the example I gave was mistyped?
All of the above are examples of poor code - from Int, INT, int to
MyVal, myVal, myval. The difference is that in a case-sensitive language (such as C) a programmer would have deliberately to choose daft names to engineer the mess;
whereas in a language which ignores case (such as
Ada) the mess can come about accidentally, via typos.
In a case-insensitive language mistyping the case has no effect on the
program legality. Any decent IDE enforces preferred case style.
Moreover, tools for the case-sensitive languages like C++ do just the
same. You cannot have reasonable names in C++ anymore. There would be
lurking clang-format or SonarQube configured to force something a
three year old suffering dyslexia would pen... (:-))
As you suggest, for languages which ignore case extra tools are needed
to help tidy up the code.
On 2022-11-24 19:07, James Harris wrote:
On 24/11/2022 18:02, Dmitry A. Kazakov wrote:
On 2022-11-24 18:56, James Harris wrote:
On 24/11/2022 16:55, Bart wrote:
On 24/11/2022 15:03, David Brown wrote:
On 23/11/2022 23:42, Bart wrote:
Using case is not a spelling mistake; it's a style. In my
languages, someone can write 'int', 'Int' or 'INT' according to preference.
No, it is a mess.
Allowing int Int INT to be three distinct types, or to represent
types, variables, functions etc, is perfectly fine?
int Int = INT;
Contrast
MyVal := a
myVal := myval + b
Are you happy for a language to allow so much inconsistency?
Make it
MyVal := a
myVal := MyVal + b
better be case-sensitive?
My point (to you and Bart) is that programmers can choose identifier
names so the latter example need not arise unless it is written
deliberately;
Why did you suggest an error? The point is, you could not know. Nobody could.
but if the compiler folds case then programmers can /mistype/ names
accidentally, leading to the messy inconsistency mentioned above.
Same question. Why do you think that the example I gave was mistyped?
In a case-insensitive language mistyping the case has no effect on the program legality.
Any decent IDE enforces preferred case style.
Moreover, tools for the case-sensitive languages like C++ do just the
same. You cannot have reasonable names in C++ anymore. There would be lurking clang-format or SonarQube configured to force something a three
year old suffering dyslexia would pen... (:-))
On 24/11/2022 18:55, James Harris wrote:
On 24/11/2022 18:39, Dmitry A. Kazakov wrote:
On 2022-11-24 19:07, James Harris wrote:
My point (to you and Bart) is that programmers can choose identifier
names so the latter example need not arise unless it is written
deliberately;
Why did you suggest an error? The point is, you could not know.
Nobody could.
but if the compiler folds case then programmers can /mistype/ names
accidentally, leading to the messy inconsistency mentioned above.
Same question. Why do you think that the example I gave was mistyped?
All of the above are examples of poor code - from Int, INT, int to
MyVal, myVal, myval. The difference is that in a case-sensitive
language (such as C) a programmer would have deliberately to choose
daft names to engineer the mess;
They do. I gave examples in my other post. But this kind of idiom I find annoying:
Image image;
Colour colour; //(At least it's not colour color!)
Matrix matrix;
(Actual examples from the Raylib API. Which also cause grief when ported
to my case-insensitive syntax, yet another problem.)
whereas in a language which ignores case (such as Ada) the mess can
come about accidentally, via typos.
Using the wrong case isn't really a typo. A real typo would yield the
wrong letters.
Using the wrong case is harmless. At some point, the discrepancy in
style, if not intentional, will be discovered and fixed.
In a case-insensitive language mistyping the case has no effect on
the program legality. Any decent IDE enforces preferred case style.
Moreover, tools for the case-sensitive languages like C++ do just the
same. You cannot have reasonable names in C++ anymore. There would be
lurking clang-format or SonarQube configured to force something a
three year old suffering dyslexia would pen... (:-))
As you suggest, for languages which ignore case extra tools are needed
to help tidy up the code.
Possibly; I've never actually needed to in 46 years of case-insensitive coding. But I also use upper case for emphasis.
On 2022-11-24 16:07, David Brown wrote:
And current AMD and Intel chips are bathtubs with wheels and rocket
engines!
Judging by how they screech ... there is no wheels. (:-))
(The great thing about car analogies is how much they can be abused...)
OT. I remember the story of a guy who installed the rocket engine on a,
I believe, VW Beetle and honorably died riding his invention. Death by
Rock and Roll, as Pretty Reckless sang...
/OT
On 24/11/2022 20:07, Bart wrote:
They do. I gave examples in my other post. But this kind of idiom I
find annoying:
Image image;
Colour colour; //(At least it's not colour color!)
Matrix matrix;
I have never come across any programmer for any language that does not
find some commonly-used idioms or coding styles annoying.
I bet that even you, who are the only programmer for the languages you yourself designed, can look back at old code and think some of your own idioms are annoying.
(Actual examples from the Raylib API. Which also cause grief when
ported to my case-insensitive syntax, yet another problem.)
Do not blame C for the deficiencies in your language or your use of it!
whereas in a language which ignores case (such as Ada) the mess can
come about accidentally, via typos.
Using the wrong case isn't really a typo. A real typo would yield the
wrong letters.
The wrong case is either a typo, or appalling lack of attention to
detail and care of code quality.
Using the wrong case is harmless. At some point, the discrepancy in
style, if not intentional, will be discovered and fixed.
With a decent language it will be discovered as soon as you compile (if
not before, when you use a good editor).
On 24/11/2022 17:56, James Harris wrote:
On 24/11/2022 16:55, Bart wrote:
On 24/11/2022 15:03, David Brown wrote:
On 23/11/2022 23:42, Bart wrote:
Using case is not a spelling mistake; it's a style. In my
languages, someone can write 'int', 'Int' or 'INT' according to
preference.
No, it is a mess.
Allowing int Int INT to be three distinct types, or to represent
types, variables, functions etc, is perfectly fine?
int Int = INT;
Contrast
MyVal := a
myVal := myval + b
Are you happy for a language to allow so much inconsistency?
This is inconsistency of style, something which doesn't affect the
meaning of the code. Languages already allow that:
MyVal:=a
myVal := myval +b
Maybe there can be a tool to warn about this or tidy it up for you, but
I don't believe it should be the job of the language, or compiler.
However I did suggest that a case-sensitive language /could/ enforce consistency across identifiers intended to be identical.
And as DAK said, you can have inconsistencies in case-sensitive code
that are actually dangerous. It took me a few seconds to realise the
second `MyVal` had a small `m` so would be a different identifier.
In a language with declarations, perhaps that would be picked up (unless
it was Go, where := serves to declare a new variable). In dynamic
ones, `myVal` would be silently created as a fresh variable.
That can happen with case-insensitivity too, but you have to actually misspell the name, not just use the wrong capitalisation.
Here are some examples from sqlite3.c of names which are identical
except for subtle differences of case:
(walCkptInfo,WalCkptInfo)
(walIndexHdr,WalIndexHdr)
(wrflag,wrFlag)
(writeFile,WriteFile)
(xHotSpot,xHotspot)
(yHotspot,yHotSpot)
(yymajor,yyMajor)
(yyminor,yyMinor)
(zErrMsg,zErrmsg)
(zSql,zSQL)
Try to spot the differences. Remember that in a real program, it will be much busier, and these names haven't been pre-selected and helpfully
placed side by side! Usually you will see them in isolation.
On 24/11/2022 18:42, Bart wrote:
Here are some examples from sqlite3.c of names which are identical
except for subtle differences of case:
(walCkptInfo,WalCkptInfo)
(walIndexHdr,WalIndexHdr)
(wrflag,wrFlag)
(writeFile,WriteFile)
(xHotSpot,xHotspot)
(yHotspot,yHotSpot)
(yymajor,yyMajor)
(yyminor,yyMinor)
(zErrMsg,zErrmsg)
(zSql,zSQL)
Try to spot the differences. Remember that in a real program, it will
be much busier, and these names haven't been pre-selected and
helpfully placed side by side! Usually you will see them in isolation.
Well, some of those would happen regardless of the case sensitivity of
the language. For example, in the version of sqlite3.c I found online I saw that some routines use wrFlag and others use wrflag. From a quick
look I cannot see any routine which uses both. Such a discrepancy
wouldn't be picked up whether the language was case sensitive or not.
Also, it looks as though zSQL is used only in comments.
Nevertheless, I take your point. A programmer /could/ unwisely choose to
use names which differed only by the case of one letter.
Here's a suggestion: make the language case sensitive and have the
compiler reject programs which give access to two names with no changes other than case, such that Myvar and myvar could not be simultaneously accessible.
All of the above are examples of poor code - from Int, INT, int to
MyVal, myVal, myval.
The difference is that in a case-sensitive language
(such as C) a programmer would have deliberately to choose daft names to engineer the mess; whereas in a language which ignores case (such as
Ada) the mess can come about accidentally, via typos.
Moreover, tools for the case-sensitive languages like C++ do just the
same. You cannot have reasonable names in C++ anymore. There would be
lurking clang-format or SonarQube configured to force something a
three year old suffering dyslexia would pen... (:-))
As you suggest, for languages which ignore case extra tools are needed
to help tidy up the code.
On 24/11/2022 20:13, James Harris wrote:
On 24/11/2022 18:42, Bart wrote:
Here are some examples from sqlite3.c of names which are identical
except for subtle differences of case:
(walCkptInfo,WalCkptInfo)
(walIndexHdr,WalIndexHdr)
(wrflag,wrFlag)
(writeFile,WriteFile)
(xHotSpot,xHotspot)
(yHotspot,yHotSpot)
(yymajor,yyMajor)
(yyminor,yyMinor)
(zErrMsg,zErrmsg)
(zSql,zSQL)
Try to spot the differences. Remember that in a real program, it will
be much busier, and these names haven't been pre-selected and
helpfully placed side by side! Usually you will see them in isolation.
Well, some of those would happen regardless of the case sensitivity of
the language. For example, in the version of sqlite3.c I found online
I saw that some routines use wrFlag and others use wrflag. From a
quick look I cannot see any routine which uses both. Such a
discrepancy wouldn't be picked up whether the language was case
sensitive or not. Also, it looks as though zSQL is used only in comments.
zSQL occurs here:
SQLITE_API int sqlite3_declare_vtab(sqlite3*, const char *zSQL);
which is between two block comments but is not itself inside a comment.
I have done analysis in the past which tried to detect whether any of
those pairs occurred within the same function; I think one or two did,
but it is too much effort to repeat now.
(The whole list is 200 entries; 3 of them have 3 variations on the same name:
(hmenu,hMenu,HMENU)
(next,nExt,Next)
(short,Short,SHORT)
)
But this is just to show that such variances can occur, especially in
the longer names where the difference is subtle.
These are likely to create a lot of confusion, if you type the wrong capitalisation because you assume xHotSpot style rather than xHotspot.
Or even if you're just browsing the code: was this the same name I saw a minute ago? No; one has a small s the other a big S, they just sound the same when you say them out loud (or in your head!).
Such confusion /has/ to be less when xHotSpot, xHotspot, plus the other
62 (I think) variations have to be the same identifier, 'xhotspot' when normalised.
Nevertheless, I take your point. A programmer /could/ unwisely choose
to use names which differed only by the case of one letter.
In C this happens /all the time/. It's almost a requirement. When I translated the OpenGL headers, many macro names shared the same
identifiers with functions if you took away case.
Here's a suggestion: make the language case sensitive and have the
compiler reject programs which give access to two names with no
changes other than case, such that Myvar and myvar could not be
simultaneously accessible.
Apart from not being able to do this:
Colour colour;
what would be the point of case sensitivity in this case? Or would the restriction not apply to types? What about a variable that clashes with
a reserved word when case is ignored?
On 2022-11-24 19:55, James Harris wrote:
All of the above are examples of poor code - from Int, INT, int to
MyVal, myVal, myval.
Which you want to make legal, right?
That is evidently wrong. Why exactly
int INT;
must be legal?
Moreover, tools for the case-sensitive languages like C++ do just the
same. You cannot have reasonable names in C++ anymore. There would be
lurking clang-format or SonarQube configured to force something a
three year old suffering dyslexia would pen... (:-))
As you suggest, for languages which ignore case extra tools are needed
to help tidy up the code.
While 99% of all these tools were developed specifically for
case-sensitive languages? Come on!
On 24/11/2022 19:39, Dmitry A. Kazakov wrote:
On 2022-11-24 19:07, James Harris wrote:
On 24/11/2022 18:02, Dmitry A. Kazakov wrote:
On 2022-11-24 18:56, James Harris wrote:
On 24/11/2022 16:55, Bart wrote:
On 24/11/2022 15:03, David Brown wrote:
On 23/11/2022 23:42, Bart wrote:
Using case is not a spelling mistake; it's a style. In my
languages, someone can write 'int', 'Int' or 'INT' according to preference.
No, it is a mess.
Allowing int Int INT to be three distinct types, or to represent
types, variables, functions etc, is perfectly fine?
int Int = INT;
Contrast
MyVal := a
myVal := myval + b
Are you happy for a language to allow so much inconsistency?
Make it
MyVal := a
myVal := MyVal + b
better be case-sensitive?
My point (to you and Bart) is that programmers can choose identifier
names so the latter example need not arise unless it is written
deliberately;
Why did you suggest an error? The point is, you could not know. Nobody
could.
Of course you could know, if the language requires variables to be
declared before usage.
int Myval = 1;
int myval = 2;
In a case-sensitive language, it is legal but written by an intentionally bad programmer - and no matter how hard you try, bad
programmers will find a way to write bad code. In a case-insensitive language, it is an error written intentionally by a bad programmer.
Give me the language that helps catch typos, not the language that is
happy with an inconsistent jumble.
but if the compiler folds case then programmers can /mistype/ names
accidentally, leading to the messy inconsistency mentioned above.
I prefer mistypes to be considered errors where possible.
Moreover, tools for the case-sensitive languages like C++ do just the
same. You cannot have reasonable names in C++ anymore. There would be
lurking clang-format or SonarQube configured to force something a
three year old suffering dyslexia would pen... (:-))
Some people know how to use tools properly.
On 24/11/2022 21:35, Dmitry A. Kazakov wrote:
On 2022-11-24 19:55, James Harris wrote:
All of the above are examples of poor code - from Int, INT, int to
MyVal, myVal, myval.
Which you want to make legal, right?
No.
...
That is evidently wrong. Why exactly
int INT;
must be legal?
I didn't say it should be.
Moreover, tools for the case-sensitive languages like C++ do just
the same. You cannot have reasonable names in C++ anymore. There
would be lurking clang-format or SonarQube configured to force
something a three year old suffering dyslexia would pen... (:-))
As you suggest, for languages which ignore case extra tools are
needed to help tidy up the code.
While 99% of all these tools were developed specifically for
case-sensitive languages? Come on!
It's a personal view but IMO a language should be independent of, and
should not rely on, IDEs or special editors.
On 24/11/2022 20:52, Bart wrote:
Yes, although it's a function declaration; the presumably incorrectly
typed identifier zSQL is ignored.
Similar could be said for any names which differed only slightly.
In C this happens /all the time/. It's almost a requirement. When I
translated OpenGL headers, then many macro names shared the same
identifers with functions if you took away case.
One cannot stop programmers doing daft things. For example, a programmer could declare names such as
CreateTableForwandReference
and
createtableforwardrefarence
The differences are not obvious.
Maybe the best a language designer can do for cases such as this is to
help reduce the number of different names a programmer would have to
define in any given location.
On 24/11/2022 19:28, David Brown wrote:
On 24/11/2022 20:07, Bart wrote:
They do. I gave examples in my other post. But this kind of idiom I
find annoying:
Image image;
Colour colour; //(At least it's not colour color!)
Matrix matrix;
I have never come across any programmer for any language that does not
find some commonly-used idioms or coding styles annoying.
I can port all my identifiers to a case-sensitive language with no
clashes. I can't guarantee no clashes when porting from case-sensitive
to case-insensitive.
Which would be less hassle?
I don't like the C idiom because if I read it in my head it sounds stupid.
That means that, given a choice of what to do with lower and upper case letters, I've selected different priorities, since I place little value
on writing code like this:
struct Foo Foo[FOO] = {foo};
Clearly, you have a different opinion.
On 2022-11-24 19:55, James Harris wrote:
All of the above are examples of poor code - from Int, INT, int to
MyVal, myVal, myval.
Which you want to make legal, right? Again,
If int and INT shall never mean two different entities, why do you let
them?
The difference is that in a case-sensitive language (such as C) a
programmer would have deliberately to choose daft names to engineer
the mess; whereas in a language which ignores case (such as Ada) the
mess can come about accidentally, via typos.
That is evidently wrong.
Why exactly
int INT;
must be legal?
Moreover, tools for the case-sensitive languages like C++ do just the
same. You cannot have reasonable names in C++ anymore. There would be
lurking clang-format or SonarQube configured to force something a
three year old suffering dyslexia would pen... (:-))
As you suggest, for languages which ignore case extra tools are needed
to help tidy up the code.
While 99% of all these tools were developed specifically for
case-sensitive languages? Come on!
On 2022-11-24 22:50, James Harris wrote:
On 24/11/2022 21:35, Dmitry A. Kazakov wrote:
On 2022-11-24 19:55, James Harris wrote:
All of the above are examples of poor code - from Int, INT, int to
MyVal, myVal, myval.
Which you want to make legal, right?
No.
...
That is evidently wrong. Why exactly
int INT;
must be legal?
I didn't say it should be.
But it is. q.e.d.
Moreover, tools for the case-sensitive languages like C++ do just
the same. You cannot have reasonable names in C++ anymore. There
would be lurking clang-format or SonarQube configured to force
something a three year old suffering dyslexia would pen... (:-))
As you suggest, for languages which ignore case extra tools are
needed to help tidy up the code.
While 99% of all these tools were developed specifically for
case-sensitive languages? Come on!
It's a personal view but IMO a language should be independent of, and
should not rely on, IDEs or special editors.
Yet you must rely on them in order to prevent:
int INT;
On 2022-11-24 20:23, David Brown wrote:
On 24/11/2022 19:39, Dmitry A. Kazakov wrote:
On 2022-11-24 19:07, James Harris wrote:
On 24/11/2022 18:02, Dmitry A. Kazakov wrote:
On 2022-11-24 18:56, James Harris wrote:
On 24/11/2022 16:55, Bart wrote:
On 24/11/2022 15:03, David Brown wrote:
On 23/11/2022 23:42, Bart wrote:
Using case is not a spelling mistake; it's a style. In my
languages, someone can write 'int', 'Int' or 'INT' according to preference.
No, it is a mess.
Allowing int Int INT to be three distinct types, or to represent types, variables, functions etc, is perfectly fine?
int Int = INT;
Contrast
MyVal := a
myVal := myval + b
Are you happy for a language to allow so much inconsistency?
Make it
MyVal := a
myVal := MyVal + b
better be case-sensitive?
My point (to you and Bart) is that programmers can choose identifier
names so the latter example need not arise unless it is written
deliberately;
Why did you suggest an error? The point is, you could not know.
Nobody could.
Of course you could know, if the language requires variables to be
declared before usage.
I meant some fancy language where no declarations are needed. But OK, take
this:
int MyVal = a;
int myVal = MyVal + b;
How do you know?
int Myval = 1;
int myval = 2;
In a case-sensitive language, it is legal but written by an
intentionally bad programmer - and no matter how hard you try, bad
programmers will find a way to write bad code. In a case-insensitive
language, it is an error written intentionally by a bad programmer.
Give me the language that helps catch typos, not the language that is
happy with an inconsistent jumble.
declare
Myval : Integer := 1;
myval : Integer := 2;
begin
This is illegal in Ada.
but if the compiler folds case then programmers can /mistype/ names
accidentally, leading to the messy inconsistency mentioned above.
A programmer cannot mistype names if the language is case-sensitive?
Purely statistically your argument makes no sense. Since the set of
unique identifiers in a case-insensitive language is by order of
magnitude narrower, any probability of mess/error etc is also less under equivalent conditions.
The only reason to have case-sensitive identifiers is for having
homographs = for producing mess.
I prefer mistypes to be considered errors where possible.
And I gave more or less formal proof why case-insensitive languages are better here.
Moreover, tools for the case-sensitive languages like C++ do just the
same. You cannot have reasonable names in C++ anymore. There would be
lurking clang-format or SonarQube configured to force something a
three year old suffering dyslexia would pen... (:-))
Some people know how to use tools properly.
These people don't buy them and thus do not count... (:-))
On 24/11/2022 20:13, James Harris wrote:
On 24/11/2022 18:42, Bart wrote:
Nevertheless, I take your point. A programmer /could/ unwisely choose
to use names which differed only by the case of one letter.
In C this happens /all the time/. It's almost a requirement. When I translated the OpenGL headers, many macro names shared the same
identifiers with functions if you took away case.
Here's a suggestion: make the language case sensitive and have the
compiler reject programs which give access to two names with no
changes other than case, such that Myvar and myvar could not be
simultaneously accessible.
Apart from not being able to do this:
Colour colour;
what would be the point of case sensitivity in this case? Or would the restriction not apply to types? What about a variable that clashes with
a reserved word when case is ignored?
(BTW my syntax can represent the above as:
`Colour colour
The backtick is case-preserving, and also allows names that clash with reserved words. But I don't want to have to write that in code; this is
for automatic translation tools, or for a one-off.)
On 24/11/2022 22:35, Dmitry A. Kazakov wrote:
On 2022-11-24 19:55, James Harris wrote:
All of the above are examples of poor code - from Int, INT, int to
MyVal, myVal, myval.
Which you want to make legal, right? Again,
If int and INT shall never mean two different entities, why do you let
them?
If int and INT are poor style when referring to the same entities, why
do you let them?
But case sensitive makes accidental misuse far more likely to be
caught by the compiler,
Of course there are other options as well, which are arguably better
than either of these. One is to say you have to get the case right for consistency, but disallow identifiers that differ only in case.
The difference is that in a case-sensitive language (such as C) a
programmer would have deliberately to choose daft names to engineer
the mess; whereas in a language which ignores case (such as Ada) the
mess can come about accidentally, via typos.
That is evidently wrong.
What exactly is wrong about my statement? "int INT;" is an example of deliberately daft names.
Legislating against stupidity or malice is
/very/ difficult.
easier, and a better choice.
Why exactly
int INT;
must be legal?
If a language can make such things illegal, great - but /not/ at the
cost of making "int a; INT b; inT c = A + B;" legal.
Moreover, tools for the case-sensitive languages like C++ do just
the same. You cannot have reasonable names in C++ anymore. There
would be lurking clang-format or SonarQube configured to force
something a three year old suffering dyslexia would pen... (:-))
As you suggest, for languages which ignore case extra tools are
needed to help tidy up the code.
While 99% of all these tools were developed specifically for
case-sensitive languages? Come on!
I see no problem with using extra tools, or extra compiler warnings, to improve code quality or catch errors. Indeed, I am a big fan of them.
As a fallible programmer I like all the help I can get, and I like it as early in the process as possible (such as smart editors or IDEs).
1. Forward declarations should not be needed.
2. Parameter names should be part of the interface.
On 24/11/2022 22:58, Dmitry A. Kazakov wrote:
I meant some fancy language where no declarations are needed. But OK, take
this:
int MyVal = a;
int myVal = MyVal + b;
How do you know?
It is unavoidable in any language, with any rules, that people will be
able to write confusing code, or that people will be able to make
mistakes that compilers and tools can't catch. No matter how smart you make the language or the tools, that will /always/ be possible.
Give me the language that helps catch typos, not the language that is
happy with an inconsistent jumble.
declare
Myval : Integer := 1;
myval : Integer := 2;
begin
This is illegal in Ada.
Great. Ada catches some mistakes. It lets others through. That's life in programming.
but if the compiler folds case then programmers can /mistype/ names accidentally, leading to the messy inconsistency mentioned above.
A programmer cannot mistype names if the language is case-sensitive?
Sure - but at least some typos are more likely to be caught.
Purely statistically your argument makes no sense. Since the set of
unique identifiers in a case-insensitive language is by order of
magnitude narrower, any probability of mess/error etc is also less
under equivalent conditions.
"Purely statistically" you are talking drivel and comparing one
countably infinite set with a different countably infinite set.
There are some programmers who appear to pick identifiers by letting
their cat walk at random over the keyboard. Most don't. Programmers mostly pick the same identifiers regardless of case sensitivity, and
mostly pick identifiers that differ in more than just case. Barring
abusive programmers, the key exception is idioms such as "Point point"
where "Point" is a type and "point" is an object of that type.
The only reason to have case-sensitive identifiers is for having
homographs = for producing mess.
No, it /avoids/ mess. And case insensitivity does not avoid homographs
- HellO and He110 are homographs in simple fonts, despite being
different identifiers regardless of case sensitivity. "Int" and "int"
are not homographs in any font. "Ρο" and "Po" are homographs,
regardless of case sensitivity, despite being completely different
Unicode identifiers (the first uses Greek letters).
("Homograph" means they look much the same, but are actually different -
not that the cases are different.)
The key benefit of case sensitivity is disallowing inconsistent cases, rather than because it allows identifiers that differ in case.
Some people know how to use tools properly.
These people don't buy them and thus do not count... (:-))
I don't follow. I make a point of learning how to use my tools as best
I can, whether they are commercial paid-for tools or zero cost price.
But if you mean that the programmers who could most benefit from good
tools to check style and code quality are precisely the ones that don't
use them, I agree. Usually they don't even have to buy them or acquire them - they already have tools they could use, but don't use them properly.
If I were making a compiler, all its warnings would be on by default,
and you'd have to use the flag "-old-bad-code" to disable them.
On 2022-11-25 09:18, David Brown wrote:
On 24/11/2022 22:35, Dmitry A. Kazakov wrote:
On 2022-11-24 19:55, James Harris wrote:
All of the above are examples of poor code - from Int, INT, int to
MyVal, myVal, myval.
Which you want to make legal, right? Again,
If int and INT shall never mean two different entities, why do you
let them?
If int and INT are poor style when referring to the same entities, why
do you let them?
And how exactly would case-sensitivity not let them? So far the outcome:
case-insensitive: illegal
case-sensitive: OK
But case sensitive makes accidental misuse far more likely to be
caught by the compiler,
Example please. Typing 'i' instead of 'I' is not a misuse to me.
Of course there are other options as well, which are arguably better
than either of these. One is to say you have to get the case right
for consistency, but disallow identifiers that differ only in case.
You could do that. You can even say that identifiers must be in italics
and keywords in bold Arial and then apply all your arguments to font
shapes, sizes, orientation etc. Why not?
One of the reasons Ada did not do this and many filesystems as well,
because one might wish to be able to convert names to some canonical
form without changing the meaning. After all this is how the letter case appeared in European languages in the first place - to beautify written text.
If you do not misuse the concept that a program is a text, you should
have no problem with the idea that text appearance may vary. Never
changed IDE fonts? (:-))
The difference is that in a case-sensitive language (such as C) a
programmer would have deliberately to choose daft names to engineer
the mess; whereas in a language which ignores case (such as Ada) the
mess can come about accidentally, via typos.
That is evidently wrong.
What exactly is wrong about my statement? "int INT;" is an example of
deliberately daft names.
What's wrong with the name int? Let's take
integer Integer;
Legislating against stupidity or malice is /very/ difficult.
There is no malice, it is quite common practice to do things like:
void Boo::Foo (Object * object) {
int This = this->idx;
etc.
Legislating against accidents and inconsistency is
easier, and a better choice.
Why exactly
int INT;
must be legal?
If a language can make such things illegal, great - but /not/ at the
cost of making "int a; INT b; inT c = A + B;" legal.
I don't see any cost here, because int, INT, inT is the same word to me.
It boils down to how you choose identifiers. If an identifier is a combination of dictionary words case/font/size-insensitivity is the most natural choice. If the idea is to obfuscate the meaning, then it quickly becomes pointless since there is no way you could defeat ill intents.
Moreover, tools for the case-sensitive languages like C++ do just
the same. You cannot have reasonable names in C++ anymore. There
would be lurking clang-format or SonarQube configured to force
something a three year old suffering dyslexia would pen... (:-))
As you suggest, for languages which ignore case extra tools are
needed to help tidy up the code.
While 99% of all these tools were developed specifically for
case-sensitive languages? Come on!
I see no problem with using extra tools, or extra compiler warnings,
to improve code quality or catch errors. Indeed, I am a big fan of
them. As a fallible programmer I like all the help I can get, and I
like it as early in the process as possible (such as smart editors or
IDEs).
It is OK, James argued that these tools somewhat exist because of Ada's case-insensitivity! (:-))
(To me a tool is an indicator of a problem, but that is another story)
On 2022-11-24 22:50, James Harris wrote:
On 24/11/2022 21:35, Dmitry A. Kazakov wrote:
On 2022-11-24 19:55, James Harris wrote:
All of the above are examples of poor code - from Int, INT, int to
MyVal, myVal, myval.
Which you want to make legal, right?
No.
...
That is evidently wrong. Why exactly
int INT;
must be legal?
I didn't say it should be.
But it is. q.e.d.
Moreover, tools for the case-sensitive languages like C++ do just
the same. You cannot have reasonable names in C++ anymore. There
would be lurking clang-format or SonarQube configured to force
something a three year old suffering dyslexia would pen... (:-))
As you suggest, for languages which ignore case extra tools are
needed to help tidy up the code.
While 99% of all these tools were developed specifically for
case-sensitive languages? Come on!
It's a personal view but IMO a language should be independent of, and
should not rely on, IDEs or special editors.
Yet you must rely on them in order to prevent:
int INT;
On 24/11/2022 22:35, Dmitry A. Kazakov wrote:
Why exactly
int INT;
must be legal?
If a language can make such things illegal, great - but /not/ at the
cost of making "int a; INT b; inT c = A + B;" legal.
On 25/11/2022 10:13, Dmitry A. Kazakov wrote:
On 2022-11-25 09:18, David Brown wrote:
On 24/11/2022 22:35, Dmitry A. Kazakov wrote:
On 2022-11-24 19:55, James Harris wrote:
All of the above are examples of poor code - from Int, INT, int to
MyVal, myVal, myval.
Which you want to make legal, right? Again,
If int and INT shall never mean two different entities, why do you
let them?
If int and INT are poor style when referring to the same entities,
why do you let them?
And how exactly would case-sensitivity not let them? So far the
outcome:
case-insensitive: illegal
case-sensitive: OK
You misunderstood my question.
You dislike case sensitivity because it lets you have two different identifiers written "int" and "INT". That is a fair point, and a clear disadvantage of case sensitivity.
But if you have a case insensitive language, it lets you write "int" and "INT" for the /same/ identifier, despite written differences. That is a clear disadvantage of case /insensitivity/.
But case sensitive makes accidental misuse far more likely to be
caught by the compiler,
Example please. Typing 'i' instead of 'I' is not a misuse to me.
If I accidentally type "I" instead of "i", a C compiler will catch the error. "for (int i = 0; I < 10; i++) ..." It's an error in C.
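For concreteness, a minimal sketch of that situation (the function and variable names here are invented for the example; only the mistyped I matters):

/* The mistyped 'I' is a separate, undeclared identifier in case-sensitive C,
   so the compiler rejects the loop instead of silently reusing 'i'. */
int sum_to_ten(void)
{
    int total = 0;
    for (int i = 0; I < 10; i++)   /* error: 'I' undeclared */
        total += i;
    return total;
}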
Of course there are other options as well, which are arguably better
than either of these. One is to say you have to get the case right
for consistency, but disallow identifiers that differ only in case.
You could do that. You can even say that identifiers must be in
italics and keywords in bold Arial and then apply all your arguments
to font shapes, sizes, orientation etc. Why not?
Sorry, I was only giving sensible suggestions.
One of the reasons Ada did not do this and many filesystems as well,
because one might wish to be able to convert names to some canonical
form without changing the meaning. After all this is how the letter
case appeared in European languages in the first place - to beautify
written text.
There is a very simple canonical form for ASCII text - leave it alone.
For Unicode, there is a standard normalisation procedure (converting combining diacriticals into single combination codes where applicable).
Ada has its roots in a time when many programming languages were
all-caps, at least for their keywords, and significant computer systems
were still using punched cards, 6-bit character sizes, and other limitations. If you wanted a language that could be used widely (and
that was one of Ada's aim) without special requirements, you had to
accept that some people would be using all-caps. At the same time, it
was clear by then that all-caps was ugly and people preferred to use
small letters when possible. The obvious solution was to make the
language case-insensitive, like many other languages of that time (such
as Pascal, which was a big influence for Ada). It was a /practical/ decision, not made because someone thought being case-insensitive made
the language inherently better.
Legislating against stupidity or malice is /very/ difficult.
There is no malice, it is quite common practice to do things like:
void Boo::Foo (Object * object) {
int This = this->idx;
etc.
As has been said, again and again, writing something like "Object
object" is a common idiom and entirely clear to anyone experienced as a
C or C++ programmer.
I can't remember ever seeing a capitalised keyword
used as an identifier - it is /far/ from common practice. It counts as stupidity, not malice.
Legislating against accidents and inconsistency is
easier, and a better choice.
Why exactly
int INT;
must be legal?
If a language can make such things illegal, great - but /not/ at the
cost of making "int a; INT b; inT c = A + B;" legal.
I don't see any cost here, because int, INT, inT is the same word to me.
They are so visually distinct that there is a higher cognitive cost in reading them - that makes them bad, even when you know they mean the
same thing.
On 24/11/2022 22:00, Dmitry A. Kazakov wrote:
On 2022-11-24 22:50, James Harris wrote:
On 24/11/2022 21:35, Dmitry A. Kazakov wrote:
On 2022-11-24 19:55, James Harris wrote:
All of the above are examples of poor code - from Int, INT, int to
MyVal, myVal, myval.
Which you want to make legal, right?
No.
...
That is evidently wrong. Why exactly
int INT;
must be legal?
I didn't say it should be.
But it is. q.e.d.
Not necessarily. As I said before, within a scope names which vary only
by case could be prohibited.
Yet you must rely on them in order to prevent:
int INT;
No, the compiler could detect it.
On 25/11/2022 10:13, Dmitry A. Kazakov wrote:
And how exactly would case-sensitivity not let them? So far the
outcome:
case-insensitive: illegal
case-sensitive: OK
You misunderstood my question.
You dislike case sensitivity because it lets you have two different identifiers written "int" and "INT". That is a fair point, and a clear disadvantage of case sensitivity.
But if you have a case insensitive language, it lets you write "int" and "INT" for the /same/ identifier, despite written differences. That is a clear disadvantage of case /insensitivity/.
But case sensitive makes accidental misuse far more likely to be
caught by the compiler,
Example please. Typing 'i' instead of 'I' is not a misuse to me.
If I accidentally type "I" instead of "i", a C compiler will catch the error. "for (int i = 0; I < 10; i++) ..." It's an error in C.
There is a very simple canonical form for ASCII text - leave it alone.
For Unicode, there is a standard normalisation procedure (converting combining diacriticals into single combination codes where applicable).
Ada has its roots in a time when many programming languages were
all-caps, at least for their keywords, and significant computer systems
were still using punched cards, 6-bit character sizes, and other limitations. If you wanted a language that could be used widely (and
that was one of Ada's aim) without special requirements, you had to
accept that some people would be using all-caps. At the same time, it
was clear by then that all-caps was ugly
and people preferred to use
small letters when possible. The obvious solution was to make the
language case-insensitive, like many other languages of that time (such
as Pascal, which was a big influence for Ada). It was a /practical/ decision, not made because someone thought being case-insensitive made
the language inherently better.
There is no malice, it is quite common practice to do things like:
void Boo::Foo (Object * object) {
int This = this->idx;
etc.
As has been said, again and again, writing something like "Object
object" is a common idiom and entirely clear to anyone experienced as a
C or C++ programmer. It is less common to see a pointer involved (idiomatic C++ would likely have "object" as a reference or const
reference here). I can't remember ever seeing a capitalised keyword
used as an identifier - it is /far/ from common practice. It counts as stupidity, not malice.
On 24/11/2022 21:47, James Harris wrote:
On 24/11/2022 20:52, Bart wrote:
Yes, although it's a function declaration; the presumably incorrectly
typed identifier zSQL is ignored.
We don't know the purpose of zSQL. But the point is it is there, a
slightly differently-case version of with the same name, which I can't
for the life of me recall right now. That is the problem.
(If I look back, it is zSql. But if now encounter these even in 10
minutes time, which one would be which? I would forget.)
Similar could be said for any names which differed only slightly.
Say short, Short and SHORT out loud; any difference?
You're debugging some code and need to print out the value of hmenu. Or
is hMenu or Hmenu? Personally I am half-blind to case usage because it
so commonly irrelevant and ignored in English.
I would be constantly double-checking and constantly getting it wrong
too. And that's with just one of these three in place.
Differences in spelling are another matter; I'm a good speller.
You might notice when the cup you're handed in Starbucks has Janes
rather than James and would want to check it is yours; but you probably wouldn't care if its james or James or JAMES because that is just style.
You know they are all the same name.
But also, just because people can make typos by pressing the wrong
letter or being the wrong length doesn't make allowing 2**N more
incorrect possibilities acceptable.
In C this happens /all the time/. It's almost a requirement. When I
translated OpenGL headers, many macro names shared the same
identifiers as functions once case was taken away.
One cannot stop programmers doing daft things. For example, a
programmer could declare names such as
CreateTableForwandReference
and
createtableforwardrefarence
The differences are not obvious.
So to fix it, we allow
CreateTableForwardRefarence
createtableforwandreference
as synonyms? Case sensitive, you have subtle differences in letters
/plus/ subtle differences in case!
Maybe the best a language designer can do for cases such as this is to
help reduce the number of different names a programmer would have to
define in any given location.
Given the various restrictions you've mentioned that you'd want even
with case sensitive names, is there any point to having case
sensitivity? What would it be used for; what would it allow?
I had a scripting language that shipped with my applications. While case insensitive, I would usually write keywords in lower case as if, then, while.
But some users would write If, Then, While, and make more use of mixed
case in identifiers. And they would capitalise global variables
that I defined in lower case.
On 24/11/2022 22:47, James Harris wrote:
1. Forward declarations should not be needed.
Usually not, for functions. But sometimes you will need them for
mutually recursive functions,
and I think it makes sense to have some
kind of module interface definition with a list of declared functions
(and other entities). In other words, a function should not be exported from a module just by writing "export" at the definition.
You should
have an interface section with the declarations (like Pascal), or a
separate interface file (like Modula-2).
2. Parameter names should be part of the interface.
I agree - though not everyone does, so there are choices here too. Some
people like to write a declaration such as :
void rectangle(int top, int left, int width, int height);
and then the definition :
void rectangle(t, l, w, h) { ... }
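As a hedged illustration of that split (the file names and the empty body are invented for the example; C requires only the types to agree between prototype and definition, not the parameter names):

/* rectangle.h -- prototype with descriptive parameter names */
void rectangle(int top, int left, int width, int height);

/* rectangle.c -- the definition may legally use different, terser names */
void rectangle(int t, int l, int w, int h)
{
    /* ... drawing code would use t, l, w and h ... */
    (void)t; (void)l; (void)w; (void)h;
}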
On 2022-11-25 11:12, James Harris wrote:
On 24/11/2022 22:00, Dmitry A. Kazakov wrote:
On 2022-11-24 22:50, James Harris wrote:
On 24/11/2022 21:35, Dmitry A. Kazakov wrote:
On 2022-11-24 19:55, James Harris wrote:
All of the above are examples of poor code - from Int, INT, int to MyVal, myVal, myval.
Which you want to make legal, right?
No.
...
That is evidently wrong. Why exactly
int INT;
must be legal?
I didn't say it should be.
But it is. q.e.d.
Not necessarily. As I said before, within a scope names which vary
only by case could be prohibited.
You can introduce such rules, but so could a case-insensitive language
as well. The rule as it is agnostic to the choice.
Yet you must rely on them in order to prevent:
int INT;
No, the compiler could detect it.
How? Without additional rules (see above) this is perfectly legal in a case-sensitive language.
On 25/11/2022 08:18, David Brown wrote:
On 24/11/2022 22:35, Dmitry A. Kazakov wrote:
...
Why exactly
int INT;
must be legal?
If a language can make such things illegal, great - but /not/ at the
cost of making "int a; INT b; inT c = A + B;" legal.
Well put! It's that kind of mess which makes me dislike the idea of a language ignoring case.
I don't understand how anyone can think that a
compiler actually allowing such a jumble and viewing it as legal is a
good idea.
On 2022-11-25 11:12, James Harris wrote:
On 24/11/2022 22:00, Dmitry A. Kazakov wrote:
On 2022-11-24 22:50, James Harris wrote:
On 24/11/2022 21:35, Dmitry A. Kazakov wrote:
On 2022-11-24 19:55, James Harris wrote:
While 99% of all these tools were developed specifically for
case-sensitive languages? Come on!
It's a personal view but IMO a language should be independent of,
and should not rely on, IDEs or special editors.
Yet you must rely on them in order to prevent:
int INT;
No, the compiler could detect it.
How? Without additional rules (see above) this is perfectly legal in a case-sensitive language.
On 25/11/2022 10:31, James Harris wrote:
On 25/11/2022 08:18, David Brown wrote:
On 24/11/2022 22:35, Dmitry A. Kazakov wrote:
...
Why exactly
int INT;
must be legal?
If a language can make such things illegal, great - but /not/ at the
cost of making "int a; INT b; inT c = A + B;" legal.
Well put! It's that kind of mess which makes me dislike the idea of a
language ignoring case.
But, that never happens!
If it does, you can change it to how you like;
the program still works; that is the advantage.
C allows that line to be written like this:
i\
n\
t \
a\
;\
i\
n\
t \
b\
;\
i\
n\
t \
c
=
a
+
b;
All sorts of nonsense can be written legally, some of it more dangerous
than being lax about letter case.
I don't understand how anyone can think that a compiler actually
allowing such a jumble and viewing it as legal is a good idea.
Because it is blind to case? In the same way it doesn't see extra or misleading white space which can lead to even worse jumbles.
The solution is easy: just make your language case-sensitive if that is
your preference.
Others may make theirs case-insensitive. Because it doesn't look like
anyone is going to change their mind about this stuff.
On 25/11/2022 11:41, Dmitry A. Kazakov wrote:
On 2022-11-25 11:12, James Harris wrote:
On 24/11/2022 22:00, Dmitry A. Kazakov wrote:
Yet you must rely on them in order to prevent:
int INT;
No, the compiler could detect it.
How? Without additional rules (see above) this is perfectly legal in a
case-sensitive language.
I think it is quite clear (with the restored context) that James meant a compiler could detect the undesirable "int INT;" pattern, without the
need of additional tools or smart editors. This is, of course, entirely true.
On 2022-11-25 10:52, David Brown wrote:
On 25/11/2022 10:13, Dmitry A. Kazakov wrote:
On 2022-11-25 09:18, David Brown wrote:
On 24/11/2022 22:35, Dmitry A. Kazakov wrote:
On 2022-11-24 19:55, James Harris wrote:
All of the above are examples of poor code - from Int, INT, int to MyVal, myVal, myval.
Which you want to make legal, right? Again,
If int and INT shall never mean two different entities, why do you
let them?
If int and INT are poor style when referring to the same entities,
why do you let them?
And how exactly would case-sensitivity not let them? So far the
outcome:
case-insensitive: illegal
case-sensitive: OK
You misunderstood my question.
You dislike case sensitivity because it lets you have two different
identifiers written "int" and "INT". That is a fair point, and a
clear disadvantage of case sensitivity.
But if you have a case insensitive language, it lets you write "int"
and "INT" for the /same/ identifier, despite written differences.
That is a clear disadvantage of case /insensitivity/.
Only when identifiers are not supposed to mean anything, which is not
how I want programs to be. So to me, in context of programming as an activity to communicate ideas written in programs, this disadvantage
does not exist.
But case sensitive makes accidental misuse far more likely to be
caught by the compiler,
Example please. Typing 'i' instead of 'I' is not a misuse to me.
If I accidentally type "I" instead of "i", a C compiler will catch the
error. "for (int i = 0; I < 10; i++) ..." It's an error in C.
But this is no error to me, because there cannot be two different objects named i and I.
Of course there are other options as well, which are arguably better
than either of these. One is to say you have to get the case right
for consistency, but disallow identifiers that differ only in case.
You could do that. You can even say that identifiers must be in
italics and keywords in bold Arial and then apply all your arguments
to font shapes, sizes, orientation etc. Why not?
Sorry, I was only giving sensible suggestions.
Why distinction of case is sensible and distinction of fonts is not?
One of the reasons Ada did not do this and many filesystems as well,
because one might wish to be able to convert names to some canonical
form without changing the meaning. After all this is how the letter
case appeared in European languages in the first place - to beautify
written text.
There is a very simple canonical form for ASCII text - leave it alone.
No, regarding identifiers the alphabet is not ASCII, never was. At best
you can say let identifiers be Latin letters plus some digits, maybe
some binding signs. ASCII provides means to encode, in particular, Latin letters. Letters can be encoded in a great number of ways.
For Unicode, there is a standard normalisation procedure (converting
combining diacriticals into single combination codes where applicable).
Ada has its roots in a time when many programming languages were
all-caps, at least for their keywords, and significant computer
systems were still using punched cards, 6-bit character sizes, and
other limitations. If you wanted a language that could be used widely
(and that was one of Ada's aim) without special requirements, you had
to accept that some people would be using all-caps. At the same time,
it was clear by then that all-caps was ugly and people preferred to
use small letters when possible. The obvious solution was to make the
language case-insensitive, like many other languages of that time
(such as Pascal, which was a big influence for Ada). It was a
/practical/ decision, not made because someone thought being
case-insensitive made the language inherently better.
Ada 83 style used bold lower case letters for keywords and upper case letters for identifiers.
Legislating against stupidity or malice is /very/ difficult.
There is no malice, it is quite common practice to do things like:
void Boo::Foo (Object * object) {
int This = this->idx;
etc.
As has been said, again and again, writing something like "Object
object" is a common idiom and entirely clear to anyone experienced as
a C or C++ programmer.
"it should be entirely clear for anyone..." is no argument.
I can't remember ever seeing a capitalised keyword used as an
identifier - it is /far/ from common practice. It counts as
stupidity, not malice.
Why using properly spelt words is stupidity? (:-))
Legislating against accidents and inconsistency is
easier, and a better choice.
Why exactly
int INT;
must be legal?
If a language can make such things illegal, great - but /not/ at the
cost of making "int a; INT b; inT c = A + B;" legal.
I don't see any cost here, because int, INT, inT is the same word to me.
They are so visually distinct that there is a higher cognitive cost in
reading them - that makes them bad, even when you know they mean the
same thing.
Come on, they are not visually distinct, just open any book and observe capital letters at the beginning of every sentence!
If there is any cost, it is keeping in mind artificially introduced differences.
On 2022-11-25 14:46, David Brown wrote:
On 25/11/2022 11:41, Dmitry A. Kazakov wrote:
On 2022-11-25 11:12, James Harris wrote:
On 24/11/2022 22:00, Dmitry A. Kazakov wrote:
Yet you must rely on them in order to prevent:
int INT;
No, the compiler could detect it.
How? Without additional rules (see above) this is perfectly legal in
a case-sensitive language.
I think it is quite clear (with the restored context) that James meant
a compiler could detect the undesirable "int INT;" pattern, without
the need of additional tools or smart editors. This is, of course,
entirely true.
No, it cannot without additional rules, which, BTW, case-sensitive
languages do not have. [A Wikipedia listed one? (:-))]
We do not argue about such rules. We do about case-sensitivity.
On 2022-11-25 14:46, David Brown wrote:
On 25/11/2022 11:41, Dmitry A. Kazakov wrote:
On 2022-11-25 11:12, James Harris wrote:
On 24/11/2022 22:00, Dmitry A. Kazakov wrote:
Yet you must rely on them in order to prevent:
int INT;
No, the compiler could detect it.
How? Without additional rules (see above) this is perfectly legal in
a case-sensitive language.
I think it is quite clear (with the restored context) that James meant
a compiler could detect the undesirable "int INT;" pattern, without
the need of additional tools or smart editors. This is, of course,
entirely true.
No, it cannot without additional rules, which, BTW, case-sensitive
languages do not have. [A Wikipedia listed one? (:-))]
On 25/11/2022 10:31, James Harris wrote:
On 25/11/2022 08:18, David Brown wrote:
On 24/11/2022 22:35, Dmitry A. Kazakov wrote:
...
Why exactly
int INT;
must be legal?
If a language can make such things illegal, great - but /not/ at the
cost of making "int a; INT b; inT c = A + B;" legal.
Well put! It's that kind of mess which makes me dislike the idea of a
language ignoring case.
But, that never happens! If it does, you can change it to how you like;
the program still works; that is the advantage.
C allows that line to be written like this:
On 2022-11-25 09:43, David Brown wrote:
On 24/11/2022 22:58, Dmitry A. Kazakov wrote:
I meant some fancy language where no declarations needed. But OK,
take this:
int MyVal = a;
int myVal = MyVal + b;
How do you know?
It is unavoidable in any language, with any rules, that people will be
able to write confusing code, or that people will be able to make
mistakes that compilers and tools can't catch. No matter how smart
you make the language or the tools, that will /always/ be possible.
Still, the above is illegal in Ada and legal in C.
Purely statistically your argument makes no sense. Since the set of
unique identifiers in a case-insensitive language is by order of
magnitude narrower, any probability of mess/error etc is also less
under equivalent conditions.
"Purely statistically" you are talking drivel and comparing one
countably infinite set with a different countably infinite set.
The probability theory deals with infinite sets. Sets must be
measurable, not countable.
But the set of identifiers is of course countable, since no human and no
FSM can deploy infinite identifiers.
There are some programmers who appear to pick identifiers by letting
their cat walk at random over the keyboard. Most don't. Programmers
mostly pick the same identifiers regardless of case sensitivity, and
mostly pick identifiers that differ in more than just case. Barring
abusive programmers, the key exception is idioms such as "Point point"
where "Point" is a type and "point" is an object of that type.
It is a bad idiom.
Spoken languages use articles and other grammatical
means to disambiguate classes and their instances. A programming language
may also have different name spaces for different categories of entities (hello, first-class types, functions etc (:-)). Writing "Point point" specifically in C++ is laziness, stupidity and abuse.
The only reason to have case-sensitive identifiers is for having
homographs = for producing mess.
No, it /avoids/ mess. And case insensitivity does not avoid
homographs - HellO and He110 are homographs in simple fonts, despite
being different identifiers regardless of case sensitivity. "Int" and
"int" are not homographs in any font. "Ρο" and "Po" are homographs,
regardless of case sensitivity, despite being completely different
Unicode identifiers (the first uses Greek letters).
("Homograph" means they look much the same, but are actually different
- not that the cases are different.)
Cannot avoid some homographs, let's introduce more?
The key benefit of case sensitivity is disallowing inconsistent cases,
rather than because it allows identifiers that differ in case.
How "point" is disallowed by being different from "Point"?
Some people know how to use tools properly.
These people don't buy them and thus do not count... (:-))
I don't follow. I make a point of learning how to use my tools as
best I can, whether they are commercial paid-for tools or zero cost
price.
But if you mean that the programmers who could most benefit from good
tools to check style and code quality are precisely the ones that
don't use them, I agree. Usually they don't even have to buy them or
acquire them - they already have tools they could use, but don't use
them properly.
If I were making a compiler, all its warnings would be on by default,
and you'd have to use the flag "-old-bad-code" to disable them.
Ideally, you should not need a tool if your primary instrument (the language) works well.
On 24/11/2022 22:51, Bart wrote:
Say short, Short and SHORT out loud; any difference?
Yes, they get louder. ;-)
On 25/11/2022 09:52, David Brown wrote:
On 25/11/2022 10:13, Dmitry A. Kazakov wrote:
And how exactly would case-sensitivity not let them? So far the
outcome:
case-insensitive: illegal
case-sensitive: OK
You misunderstood my question.
You dislike case sensitivity because it lets you have two different
identifiers written "int" and "INT". That is a fair point, and a
clear disadvantage of case sensitivity.
But this happens in real code. For example `enum (INT, FLOAT, DOUBLE)`,
plus of course `Image image`.
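A minimal sketch of that pattern in C (the Image struct and ValueKind enum are invented names; the point is only that case sensitivity keeps INT, FLOAT and DOUBLE distinct from the keywords int, float and double, and Image distinct from image):

typedef struct { int width, height; } Image;

enum ValueKind { INT, FLOAT, DOUBLE };   /* legal: distinct from the keywords */

void demo(void)
{
    Image image = { 640, 480 };          /* the "Image image" idiom */
    enum ValueKind kind = FLOAT;
    (void)image; (void)kind;
}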
But if you have a case insensitive language, it lets you write "int"
and "INT" for the /same/ identifier, despite written differences.
That is a clear disadvantage of case /insensitivity/.
This could happen in real code, but it very rarely does.
So one is a real disadvantage, the other only a perceived one. Here's another issue:
zfail
zFar
zNear
zpass
These don't clash. But there are two patterns here: a small z followed by either a capitalised or a non-capitalised word. How do you remember which
is which? With a case-sensitive language, you /have/ to get it right.
With case-insensitive, if these identifiers were foisted on you, you can choose to use more consistent capitalisation.
It's not an error, so no harm done. At some point it will be noticed
that one of those has the wrong case, and it will be fixed.
But I think it is generally understood that case-sensitivity is bad for ordinary users.
On 25/11/2022 09:24, David Brown wrote:
On 24/11/2022 22:47, James Harris wrote:
1. Forward declarations should not be needed.
Usually not, for functions. But sometimes you will need them for
mutually recursive functions,
Not even then. Modern languages seem to deal with out-of-order
functions without needing special declarations.
and I think it makes sense to have some kind of module interface
definition with a list of declared functions (and other entities). In
other words, a function should not be exported from a module just by
writing "export" at the definition.
Why not?
You should have an interface section with the declarations (like
Pascal), or a separate interface file (like Modula-2).
Then you have the same information repeated in two places.
If you need a summary of the interface without exposing the
implementation, this can be generated automatically by the compiler for
those functions marked with 'export'.
(In my languages, which use whole-program compilers, such an exports
file is only needed to export names from the whole program, when it
forms a complete library.
Plus I need to create such a file to create bindings in my language to
FFI libraries. But there I don't have the sources of those libraries)
2. Parameter names should be part of the interface.
I agree - though not everyone does, so there are choices here too.
Since in my languages you only ever specify the function header in one
place - where it's defined - parameter names are mandatory. And there is only ever one set.
Some
people like to write a declaration such as :
void rectangle(int top, int left, int width, int height);
and then the definition :
void rectangle(t, l, w, h) { ... }
That's how my systems language worked for 20 years. I find it
astonishing now that I tolerated it for so long.
Well, actually not quite: the declaration listed only types; the
definition only names. Names in the declaration had no use.
On 25/11/2022 10:41, Dmitry A. Kazakov wrote:
On 2022-11-25 11:12, James Harris wrote:
On 24/11/2022 22:00, Dmitry A. Kazakov wrote:
On 2022-11-24 22:50, James Harris wrote:
On 24/11/2022 21:35, Dmitry A. Kazakov wrote:
On 2022-11-24 19:55, James Harris wrote:
All of the above are examples of poor code - from Int, INT, int to MyVal, myVal, myval.
Which you want to make legal, right?
No.
...
That is evidently wrong. Why exactly
int INT;
must be legal?
I didn't say it should be.
But it is. q.e.d.
Not necessarily. As I said before, within a scope names which vary
only by case could be prohibited.
You can introduce such rules, but so could a case-insensitive language
as well. The rule as it is agnostic to the choice.
No, I'm arguing for consistency - and consistency that can be enforced
by the compiler. The thing I dislike is the inconsistency allowed by
case insensitivity.
Yet you must rely on them in order to prevent:
int INT;
No, the compiler could detect it.
How? Without additional rules (see above) this is perfectly legal in a
case-sensitive language.
As I say, names which fold to the same string can be detected and
prohibited by the compiler.
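A rough sketch of such a check, not taken from any existing compiler (ASCII-only case folding; the hard-coded identifier list stands in for one scope's symbol table):

#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Compare two identifiers with case folded away (ASCII only). */
static int equal_ignoring_case(const char *a, const char *b)
{
    for (; *a && *b; a++, b++)
        if (tolower((unsigned char)*a) != tolower((unsigned char)*b))
            return 0;
    return *a == *b;               /* equal only if both ended together */
}

/* Report every pair of names in one scope that differ only in case. */
static int report_case_clashes(const char *names[], int count)
{
    int clashes = 0;
    for (int i = 0; i < count; i++)
        for (int j = i + 1; j < count; j++)
            if (strcmp(names[i], names[j]) != 0 &&
                equal_ignoring_case(names[i], names[j])) {
                printf("error: '%s' and '%s' differ only in case\n",
                       names[i], names[j]);
                clashes++;
            }
    return clashes;
}

int main(void)
{
    const char *scope[] = { "int", "INT", "myVal", "MyVal", "width" };
    return report_case_clashes(scope, 5) ? 1 : 0;
}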
James appears to be considering a suggestion that avoids the
disadvantages of case-sensitivity, and also the disadvantages of case-insensitivity - though it also disallows advantageous use of cases.
(You can't have everything - there are always trade-offs.) To me,
that is certainly worth considering - this is not a strictly
black-or-white issue.
On 25/11/2022 13:40, Bart wrote:
C allows that line to be written like this:
Will you /please/ give it a rest? Whenever you can't think of something useful to write, you always go off on some rant about how it is possible
to write something in C that you don't like. I've been trying to
avoid replying to you so much, because it just makes me annoyed and
write unpleasantly. But sometimes your obsession with hating C is borderline psychotic.
(And if you had anything interesting to say, it is lost in the noise I snipped.)
On 25/11/2022 11:37, Dmitry A. Kazakov wrote:
On 2022-11-25 10:52, David Brown wrote:
On 25/11/2022 10:13, Dmitry A. Kazakov wrote:
Example please. Typing 'i' instead of 'I' is not a misuse to me.
If I accidentally type "I" instead of "i", a C compiler will catch
the error. "for (int i = 0; I < 10; i++) ..." It's an error in C.
But this is no error to me, because there cannot be two different
objects named i and I.
Would you consider it good style to mix "i" and "I" in the same code, as
the same identifier?
I have done little Ada programming, but I used to do a lot of Pascal
coding - I have never seen circumstances where I considered it to be an advantage to use different cases for the same identifier.
On the
contrary, I saw a lot of code that was harder to comprehend because of mixing cases.
So let me ask you a direct question, and I hope you can give me a direct answer. If you were doing an Ada code review and the code had used an identifier in two places with two different capitalisations, would you
let it pass or would you want it changed?
Second question. Do you set up your extra tools (or IDE) to flag inconsistent case as a warning or error?
Your only concern (and it's a valid concern) is avoiding the
/disadvantage/ of case-sensitive languages in being open to allowing confusing names.
Of course there are other options as well, which are arguably
better than either of these. One is to say you have to get the
case right for consistency, but disallow identifiers that differ
only in case.
You could do that. You can even say that identifiers must be in
italics and keywords in bold Arial and then apply all your arguments
to font shapes, sizes, orientation etc. Why not?
Sorry, I was only giving sensible suggestions.
Why distinction of case is sensible and distinction of fonts is not?
Barring a few niche (or outdated) languages that rely on specialised
editors, languages should not be dependent on the appearance of the
text. Syntax highlighting is useful for reading and editing, but not as
a part of the syntax or grammar of the language.
One of the reasons Ada did not do this and many filesystems as well,
because one might wish to be able to convert names to some canonical
form without changing the meaning. After all this is how the letter
case appeared in European languages in the first place - to beautify
written text.
There is a very simple canonical form for ASCII text - leave it alone.
No, regarding identifiers the alphabet is not ASCII, never was. At
best you can say let identifiers be Latin letters plus some digits,
maybe some binding signs. ASCII provides means to encode, in
particular, Latin letters. Letters can be encoded in a great number of
ways.
Sure - identifiers use a subset of ASCII. (The subset varies a little
from language to language.) And when languages allow characters beyond ASCII, they are a subset of Unicode. All other character encodings are obsolete.
Legislating against stupidity or malice is /very/ difficult.
There is no malice, it is quite common practice to do things like:
void Boo::Foo (Object * object) {
int This = this->idx;
etc.
As has been said, again and again, writing something like "Object
object" is a common idiom and entirely clear to anyone experienced as
a C or C++ programmer.
"it should be entirely clear for anyone..." is no argument.
A programmer for a language should be expected to understand the fundamentals of the language and common idioms. I did not say "it
should be entirely clear for [sic] anyone" - I said "it should be
entirely clear to anyone experienced as a C or C++ programmer".
If you
have never touched C (or other languages with similar idioms), then I
would not expect you to be comfortable with "Point point;" no matter how many decades experience you have with other languages. But if you have programmed C or C++ for a few years, I would expect you to be completely comfortable in reading and understanding it, even if you do not use that idiom yourself.
On 25/11/2022 15:02, David Brown wrote:
... (discussion of case sensitivity in a language)
James appears to be considering a suggestion that avoids the
disadvantages of case-sensitivity, and also the disadvantages of
case-insensitivity - though it also disallows advantageous use of
cases. (You can't have everything - there are always trade-offs.)
To me, that is certainly worth considering - this is not a strictly
black-or-white issue.
That's a good summary.
Further, I've seen your comments on linked areas such as different namespaces and parameter names - which are also as yet unresolved but related issues. It may be a few days before I can reply but I do intend
to get back to them.
On 2022-11-25 15:52, David Brown wrote:
On 25/11/2022 11:37, Dmitry A. Kazakov wrote:
On 2022-11-25 10:52, David Brown wrote:
On 25/11/2022 10:13, Dmitry A. Kazakov wrote:
Example please. Typing 'i' instead of 'I' is not a misuse to me.
If I accidentally type "I" instead of "i", a C compiler will catch
the error. "for (int i = 0; I < 10; i++) ..." It's an error in C.
But this is no error to me, because there cannot be two different
objects named i and I.
Would you consider it good style to mix "i" and "I" in the same code,
as the same identifier?
I consider lack of indentation before "for" bad style. Yet I find Python ideas about making it syntax horrific.
I have done little Ada programming, but I used to do a lot of Pascal
coding - I have never seen circumstances where I considered it to be
an advantage to use different cases for the same identifier.
How case-sensitivity forces different cases? It is some imaginary
problem. Why the mouse does not have a foot detector in order to prevent
me using my feet with it? Just do not! (:-))
On the contrary, I saw a lot of code that was harder to comprehend
because of mixing cases.
Good, why let them influence the program semantics?
So let me ask you a direct question, and I hope you can give me a
direct answer. If you were doing an Ada code review and the code had
used an identifier in two places with two different capitalisations,
would you let it pass or would you want it changed?
Yes, I would want to change it according to the actual guidelines.
Second question. Do you set up your extra tools (or IDE) to flag
inconsistent case as a warning or error?
No. I always write identifiers correctly, except when I borrow them from
an alien programming language or standard. Then I make some compromises; I could well name a type LPARAM, as in the Windows API.
Your only concern (and it's a valid concern) is avoiding the
/disadvantage/ of case-sensitive languages in being open to allowing
confusing names.
Yes, and introducing problems later. E.g. it is a minefield working with the Linux file system, which is case-sensitive.
In Ada modules must be named
after the files, and it is a great help that I need not care inside
the module. But outside I must be very careful. I have an all-lowercase
rule for the source files, because a project that builds perfectly under
Windows might fail miserably under Linux.
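A hedged illustration of that minefield ("Data.txt"/"data.txt" are invented names): if the file on disk is "data.txt", the open below typically succeeds on a case-insensitive Windows file system but fails on Linux.

#include <stdio.h>

int main(void)
{
    FILE *f = fopen("Data.txt", "r");   /* file on disk is "data.txt" */
    if (f == NULL) {
        perror("fopen(\"Data.txt\")");  /* what you would see on Linux */
        return 1;
    }
    fclose(f);
    return 0;
}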
Of course there are other options as well, which are arguably
better than either of these. One is to say you have to get the
case right for consistency, but disallow identifiers that differ
only in case.
You could do that. You can even say that identifiers must be in
italics and keywords in bold Arial and then apply all your
arguments to font shapes, sizes, orientation etc. Why not?
Sorry, I was only giving sensible suggestions.
Why distinction of case is sensible and distinction of fonts is not?
Barring a few niche (or outdated) languages that rely on specialised
editors, languages should not be dependent on the appearance of the
text. Syntax highlighting is useful for reading and editing, but not
as a part of the syntax or grammar of the language.
So is the case!
One of the reasons Ada did not do this and many filesystems as
well, because one might wish to be able to convert names to some
canonical form without changing the meaning. After all this is how
the letter case appeared in European languages in the first place - to beautify written text.
There is a very simple canonical form for ASCII text - leave it alone.
No, regarding identifiers the alphabet is not ASCII, never was. At
best you can say let identifiers be Latin letters plus some digits,
maybe some binding signs. ASCII provides means to encode, in
particular, Latin letters. Letters can be encoded in a great number
of ways.
Sure - identifiers use a subset of ASCII. (The subset varies a little
from language to language.) And when languages allow characters
beyond ASCII, they are a subset of Unicode. All other character
encodings are obsolete.
The point is that "letter" is a borrowed term. It is not ASCII or
Unicode. There is no good reason to use letters differently from their original meaning. Surely you can have some conventions about style, yet
i and I are different spellings of the same letter.
Legislating against stupidity or malice is /very/ difficult.
There is no malice, it is quite common practice to do things like:
void Boo::Foo (Object * object) {
int This = this->idx;
etc.
As has been said, again and again, writing something like "Object
object" is a common idiom and entirely clear to anyone experienced
as a C or C++ programmer.
"it should be entirely clear for anyone..." is no argument.
A programmer for a language should be expected to understand the
fundamentals of the language and common idioms. I did not say "it
should be entirely clear for [sic] anyone" - I said "it should be
entirely clear to anyone experienced as a C or C++ programmer".
It is a bad idiom, like picking the nose at the table... (:-))
If you have never touched C (or other languages with similar idioms),
then I would not expect you to be comfortable with "Point point;" no
matter how many decades experience you have with other languages. But
if you have programmed C or C++ for a few years, I would expect you to
be completely comfortable in reading and understanding it, even if you
do not use that idiom yourself.
Have mercy! I painfully restrain myself from elaborating the
nose-picking allegory... (:-))
On 25/11/2022 10:48, Dmitry A. Kazakov wrote:
On 2022-11-25 09:43, David Brown wrote:
On 24/11/2022 22:58, Dmitry A. Kazakov wrote:
I meant some fancy language where no declarations needed. But OK,
take this:
int MyVal = a;
int myVal = MyVal + b;
How do you know?
It is unavoidable in any language, with any rules, that people will
be able to write confusing code, or that people will be able to make
mistakes that compilers and tools can't catch. No matter how smart
you make the language or the tools, that will /always/ be possible.
Still, the above is illegal in Ada and legal in C.
Yes. So what? It's good to try to stop accidents.
It is foolish to try to stop intentionally confusing code.
Purely statistically your argument makes no sense. Since the set of
unique identifiers in a case-insensitive language is by order of
magnitude narrower, any probability of mess/error etc is also less
under equivalent conditions.
"Purely statistically" you are talking drivel and comparing one
countably infinite set with a different countably infinite set.
The probability theory deals with infinite sets. Sets must be
measurable, not countable.
But the set of identifiers is of course countable, since no human and
no FSM can deploy infinite identifiers.
No, the set of identifiers in most languages is countably infinite - few languages impose specific limits on the length of identifiers (which
would make the set finite) - and none allow infinite length identifiers (which would make the set uncountably infinite).
Changing the size of
the set of distinguishable letters does not change the size of the identifier space.
And if you really wanted to go there, I'd like to
point out that Ada identifiers can use Unicode letters and thus have a vastly bigger choice of letters than many case-sensitive programming languages.
Cannot avoid some homographs, let's introduce more?
No - but you don't introduce more ways of writing confusing code unless there are significant benefits outweighing the costs.
That's the
decision Ada made when it added Unicode, despite having /vastly/ more opportunities to write confusingly similar but programmatically distinct identifiers. Certainly the possible accidental mixups due to case sensitivity are a drop in the ocean in comparison.
The key benefit of case sensitivity is disallowing inconsistent
cases, rather than because it allows identifiers that differ in case.
How "point" is disallowed by being different from "Point"?
Yes - if this is unintentional. The most important feature of a case sensitive language is that you don't get some people writing "Point" and others writing "point" - two letter sequences that look different and
/are/ different - and referring to the same thing.
A second feature - one that some people like, and some people do not -
is that you can use such different letter sequences to refer to
different things.
The strange thing about case-insensitive languages is that it means different letter sequences sometimes refer to the same thing, which can
be confusing.
This is not rocket science.
Ideally, you should not need a tool if your primary instrument (the
language) works well.
Usually a language needs a compiler or interpreter to be useful...
Keep the cases consistent and there is no problem.
Maybe we should just leave things here and agree to disagree on the
whole topic? It might not be healthy to dig deeper into some analogies!
On 25/11/2022 15:46, Dmitry A. Kazakov wrote:
On 2022-11-25 14:46, David Brown wrote:
On 25/11/2022 11:41, Dmitry A. Kazakov wrote:
On 2022-11-25 11:12, James Harris wrote:
On 24/11/2022 22:00, Dmitry A. Kazakov wrote:
Yet you must rely on them in order to prevent:
int INT;
No, the compiler could detect it.
How? Without additional rules (see above) this is perfectly legal in
a case-sensitive language.
I think it is quite clear (with the restored context) that James
meant a compiler could detect the undesirable "int INT;" pattern,
without the need of additional tools or smart editors. This is, of
course, entirely true.
No, it cannot without additional rules, which, BTW, case-sensitive
languages do not have. [A Wikipedia listed one? (:-))]
You snipped the context again.
You may also have missed the fact that James is working on his own
language design. He makes the rules, he decides what goes in the
compiler for /his/ language.
We do not argue about such rules. We do about case-sensitivity.
All you have done so far is argue /against/ one type of unclear code
that can be written in existing case-sensitive languages. It would only
be a problem when done maliciously or in smart-arse programming -
accidents would generally be caught by the compiler. (Note that people
can write code with clear intent and function by making use of case sensitivity - no matter how much Bart whinges and whines about it.)
I've seen /nothing/ from you or anyone else that suggests any benefit in being case-insensitive in itself.
James appears to be considering a suggestion that avoids the
disadvantages of case-sensitivity, and also the disadvantages of case-insensitivity - though it also disallows advantageous use of cases.
(You can't have everything - there are always trade-offs.) To me,
that is certainly worth considering - this is not a strictly
black-or-white issue.
On 25/11/2022 12:40, Bart wrote:
On 25/11/2022 10:31, James Harris wrote:
But, that never happens!
Eh???? You say it never happens when something very similar does happen ... and
then you go on to give your own counterexamples that are so outré they really never happen! Que????
All sorts of nonsense can be written legally, some of it more dangerous
than being lax about letter case.
Of course but as said that's the same whether case is recognised or not.
A determined programmer can /always/ write garbage and a language cannot prevent him from doing so.
The difference comes where the programmer doesn't mean to be
inconsistent but simply makes a mistake and inadvertently writes the
same identifier in different cases.
A compiler can pick that up so that
identifier names are written the same each time they are used, making code more consistent and the intent of any capitalisation clearer.
What's not to like...?!
...
I'll add one more thing. If case is to be ignored (your preference) then there is nothing to stop different programmers adopting different non-standard conventions for how they personally capitalise parts of
names, making code even harder to maintain.
You brought up sqlite3.c but if it is over 100,000 lines of code in
one file (Andy would not care much for it...) I'm not sure that it's
a valid example for anything!
On 25/11/2022 11:55, James Harris wrote:
You brought up sqlite3.c but if it is over 100,000 lines of code in
one file (Andy would not care much for it...) I'm not sure that it's
a valid example for anything!
Andy would point out that that's ~2000 pages of code; about 3x the size of the draft C23 standard. It is therefore a valid example of code that contains serious errors even if no-one knows what they are. I would strongly advise not using it anywhere near any potentially fatal activity, such as running a nuclear power station, flying a plane, delivering doses
of radiotherapy, .... You Have Been Warned.
On 25/11/2022 11:50, Bart wrote:
These don't clash. But there are two patterns here: a small z followed
by either a capitalised or a non-capitalised word. How do you remember
which is which? With a case-sensitive language, you /have/ to get it right.
Indeed. Having to get it right is an advantage, not the converse!!!
If
the compiler tells you that name uses don't match declarations then you
can fix it. As you say, you /have/ to get it right. Good!
We are not in the days of batch compiles which might take a few days to
come back and report a trivial error. A compiler can tell us immediately
if names have been written inconsistently. And we can fix them.
With case-insensitive, if these identifiers were foisted on you, you
can choose to use more consistent capitalisation.
Case ignoring is telling the compiler: "Look, it doesn't matter how
names are capitalised; anything goes; please accept ANY capitalisation
of names as all the same." Why would one want to say that? (Rhetorical)
I just don't get it.
On 2022-11-25 16:28, David Brown wrote:
On 25/11/2022 10:48, Dmitry A. Kazakov wrote:
Ideally, you should not need a tool if your primary instrument (the
language) works well.
Usually a language needs a compiler or interpreter to be useful...
Which is why a compiler is not a tool. lint is a tool, gcc is not.
Yes, a compiler is a tool. A "toolchain" for a language generally
consists of the compiler, assembler, librarian and linker (though
sometimes these are combined) - everything taking you from source code
to executable. All the parts of it are "tools", as are optional extras such as linters, debuggers, build tools, etc. Every program that helps
you do your development work is a "development tool". Your IDE is a
tool, so is your git client. It is a very general term.
On 2022-11-27 16:05, David Brown wrote:
Yes, a compiler is a tool. A "toolchain" for a language generally
consists of the compiler, assembler, librarian and linker (though
sometimes these are combined) - everything taking you from source code
to executable. All the parts of it are "tools", as are optional
extras such as linters, debuggers, build tools, etc. Every program
that helps you do your development work is a "development tool". Your
IDE is a tool, so is your git client. It is a very general term.
Yes, essential vs auxiliary. E.g. a violin is an "instrument." Rosin and
the music stand are "tools."
You brought up sqlite3.c but if it is over 100,000 lines of code in
one file (Andy would not care much for it...) I'm not sure that it's
a valid example for anything!
Andy would point out that that's ~2000 pages of code; about 3x the
size of the draft C23 standard. It is therefore a valid example of code
that contains serious errors even if no-one knows what they are. [...]
Why does it matter whether it is in one file or not?
However OSes can be tens of millions of lines of code.
A quick google
tells me that a Boeing 787 uses 6.5 million lines of avionics code.
about 3x the size of the draft C23 standard.
Not a good benchmark; not many people would have a clue about the C23 standard!
sqlite3.c is roughly 2-3 times the size of the Holy Bible.
Meanwhile, and this is a figure I read about 20 years ago so
doubtless has increased since, the source code for MS Visual Studio
comprised 1.5 million /files/ (not lines!), and a full build took 60
hours.
On 26/11/2022 13:59, Bart wrote:
[James:]
You brought up sqlite3.c but if it is over 100,000 lines of code in
one file (Andy would not care much for it...) I'm not sure that it's
a valid example for anything!
Andy would point out that that's ~2000 pages of code; about 3x the
size of the draft C23 standard. It is therefore a valid example of code
that contains serious errors even if no-one knows what they are. [...]
Why does it matter whether it is in one file or not?
It doesn't. I merely claim that it is for all practical purposes impossible to write that much code without making mistakes. In the particular case of SQLite, I regularly have to install bug fixes, and
would be amazed if there are no more to come.
[...]
However OSes can be tens of millions of lines of code.
So, show me a bug-free OS. Or even one still in significant use that has not needed a bug/security update in the past year or so.
A quick google
tells me that a Boeing 787 uses 6.5 million lines of avionics code.
So a Boeing 787 would have been an even better example of code
that beyond reasonable doubt includes potentially-fatal bugs. Just one
more reason not to fly and not to live under a flight path.
about 3x the size of the draft C23 standard.
Not a good benchmark; not many people would have a clue about the C23
standard!
People /here/ ought to have such a clue.
sqlite3.c is roughly 2-3 times the size of the Holy Bible.
"I rest my case, m'lud."
Meanwhile, and this is a figure I read about 20 years ago so
doubtless has increased since, the source code for MS Visual Studio
comprised 1.5 million /files/ (not lines!), and a full build took 60
hours.
If you regard MS Visual Studio as bug free, then I'd suggest
that you are very much in the minority, certainly here.
Bugs are a fact of life. Applications should make allowance for that.
Also users, by saving their work, creating backups etc.
So how much program code should exist on a PC with 8GB RAM, and
perhaps 1000GB disk space? (With access to limitless code
downloadable from the internet.)
Perhaps 1MB which is 100,000 lines of code? That means that only
1/8000th of the RAM and 1/1000000th of the disk will be utilised by programs; what on earth do you use the rest for?
On 25/11/2022 14:20, James Harris wrote:
On 25/11/2022 12:40, Bart wrote:
But, that never happens!
Eh???? You say it never happens when very similar does happen ... and
then you go on to give your own counterexamples that are so outre they
really never happen! Que????
Which examples are those?
I wouldn't deliberately write code like this:
int a; INT b; inT c = A + B;
unless it was some temporary debug or test code and I hadn't paid
attention to the Caps Lock status.
I take advantage of case-insensitivity in these situations:
* For informality for throwaway bits of code: I can inadvertently mix up case, but the code still works, so who cares because it will disappear a minute later
* To specifically add debugging statements in capitals
* To sometimes write function headers in capitals (I did this decades
ago to make it easier to find function definitions with a text editor)
* To allow imported mixed-case names to be written in my own choice of capitalisation (usually all-lower-case)
* To sometimes highlight, for various reasons, specific identifiers or expressions; perhaps to indicate something that is temporary or needs attention or for a 'TO DO' bit of code (see the LXERROR calls at my link)
* Simply being allowed not to care; it's just an extra bit of redundancy
in a language, which is always good
I'll add one more thing. If case is to be ignored (your preference)
then there is nothing to stop different programmers adopting different
non-standard conventions for how they personally capitalise parts of
names, making code even harder to maintain.
You mean, easier? If it's a problem then use a formatting tool that can
take care of it.
[Slightly relevant, I note that Fred "Mythical Man-Month" Brooks died a few days ago, aged 91. He had much to say about large projects, having managed the development of the IBM System/360. His biography on
Wiki notes that he considered his biggest decision "was to change the IBM
360 series from a 6-bit byte to an 8-bit byte, thereby enabling the use
of lowercase letters", relevant to other recent threads here. But the "thereby" isn't accurate; IBM could have followed Ferranti and other computer manufacturers in using 6-bit Flexowriter codes.]
On 25/11/2022 17:49, Bart wrote:
unless it was some temporary debug or test code and I hadn't paid
attention to the Caps Lock status.
Understood for temporary or test code but how do you know you've not
done the same in important code?
That illustrates what I said below about stropping. Does your compiler
check that you used capitalisation according to the above scheme?
I
guess not. It's your choice if you want to do that to your own code but isn't it possible that you may not even have kept to your own scheme?
I mean code maintenance would be harder because you'd have to get all maintainers to follow the same capitalisation rules, especially if
adherence wasn't checked by the compiler.
On 28/11/2022 23:04, James Harris wrote:
On 25/11/2022 17:49, Bart wrote:
unless it was some temporary debug or test code and I hadn't paid
attention to the Caps Lock status.
Understood for temporary or test code but how do you know you've not
done the same in important code?
I will notice; I need to look at the screen sometime.
I
guess not. It's your choice if you want to do that to your own code but isn't it possible that you may not even have kept to your own scheme?
Sometimes; but I gave a link to some of my sources; did you see any infringements there? Except that of course they wouldn't matter; just something to be tidied up at some point.
I mean code maintenance would be harder because you'd have to get all maintainers to follow the same capitalisation rules, especially if adherence wasn't checked by the compiler.
I've mentioned a few times other examples where programmers can impose their own styles, including white space usage, and the 4-5 cases where C
is case-insensitive, leaving people free to make things inconsistent. For example:
0xabcdef
0XABCdef
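(For concreteness, here is a small standalone C file, not from the thread, that a C11-or-later compiler accepts; each pair below differs only in letter case yet means exactly the same thing.)

    /* Hex prefix and hex digits: case does not matter. */
    _Static_assert(0xabcdef == 0XABCDEF, "hex prefix and digits");
    _Static_assert(0xAbCdEf == 0xabcdef, "mixed-case hex digits");

    /* Integer suffixes: u/U and ll/LL are interchangeable. */
    _Static_assert(10u == 10U, "unsigned suffix");
    _Static_assert(10ll == 10LL, "long long suffix");

    /* Exponent letter: e and E are interchangeable. */
    _Static_assert((int)1e3 == (int)1E3, "exponent letter");

    int main(void) { return 0; }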
In languages with numeric separators, you might write 123_456 followed
by 12_34_5_6 followed by 123456. It is a non-issue.
In my syntax, letter case is usually redundant, which allows it to
be utilised in various ways that don't affect the validity or behaviour
of the program, as I've listed.
That is a benefit, not a disadvantage. Case-sensitivity, to me,
allows worse sins, like `int x, X`, which actually happens (I gave a
real example).
On 29/11/2022 00:05, Bart wrote:
Of the examples you gave I don't generally have the dubiety in my language:
* ull suffix - I don't need anything like that
* e or E for exponent - I don't use either
* U or u - if I ever support Unicode directly then I'd mandate one of them
* #include - I don't use
* 0x or 0X - 0x
* letter case for hex digits - not decided yet
In languages with numeric separators, you might write 123_456 followed
by 12_34_5_6 followed by 123456. It is a non-issue.
Rather than use such numbers in code wouldn't you give them names as in
const n_digits = 10
const buf_size = 10
Then (1) the body of the program code can be clearer as to what it means
and (2) either number can be changed without affecting the other.
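(In C, that advice might look like the sketch below; the two names are invented for the example, and each constant can now change independently of the other.)

    #include <stdio.h>

    enum { N_DIGITS = 10 };   /* how many digits we care about */
    enum { BUF_SIZE = 10 };   /* size of the output buffer     */

    int main(void)
    {
        char buf[BUF_SIZE];
        snprintf(buf, sizeof buf, "%d", N_DIGITS);  /* "10" easily fits */
        puts(buf);
        return 0;
    }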
On 29/11/2022 09:26, James Harris wrote:
On 29/11/2022 00:05, Bart wrote:
Of the examples you gave I don't generally have the dubiety in my
language:
* ull suffix - I don't need anything like that
* e or E for exponent - I don't use either
* U or u - if I ever support Unicode directly then I'd mandate one of them
* #include - I don't use
* 0x or 0X - 0x
* letter case for hex digits - not decided yet
This was supposed to highlight that that kind of case variability exists
in C, a language famous for popularising strict case sensitivity.
In languages with numeric separators, you might write 123_456
followed by 12_34_5_6 followed by 123456. It is a non-issue.
Rather than use such numbers in code wouldn't you give them names as in
const n_digits = 10
const buf_size = 10
Then (1) the body of the program code can be clearer as to what it
means and (2) either number can be changed without affecting the other.
You're ignoring my point, which is that there is scope for variability without changing the meaning, since underscores (etc) are a redundant feature.
Separators exist because integer constants exist. Even named constants
could be defined as:
const a = 10_000
const b = 10000
const c = 0010000 # not in C
const d = 1e4 # in some languages
const e = 0x2710
const f = 9999+1
(I assume you're picking on the fact that my three numbers were
identical; I should have written different values.)
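(A quick standalone check of those equivalences as they would look in C, using C11-or-later _Static_assert; the leading-zero form is left out because a leading zero means octal in C, which is presumably why it was marked "not in C" above.)

    _Static_assert(10000 == 0x2710,   "hex spelling");
    _Static_assert(10000 == 9999 + 1, "arithmetic spelling");
    _Static_assert(10000 == (int)1e4, "exponent spelling");
    _Static_assert(0010000 == 4096,   "a leading zero is octal in C");
    /* C23 additionally allows a digit separator, e.g. 10'000. */

    int main(void) { return 0; }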
On 29/11/2022 11:25, Bart wrote:
Separators exist because integer constants exist. Even named constants
could be defined as:
const a = 10_000
const b = 10000
const c = 0010000 # not in C
const d = 1e4 # in some languages
const e = 0x2710
const f = 9999+1
(I assume you're picking on the fact that my three numbers were
identical; I should have written different values.)
OK. I would still point out that such numbers are typically expressed
once so have nothing to match against. I'd assume that each of your 10ks
was written to be most relevant for each of the constants. The digit grouping can be whatever is clearest for that specific usage. (I'd
typically use 3-digit grouping for decimals and 4-digit grouping for
others but I could do something else for binary when writing numbers
which relate to bitfields.) In essence, each of your 10ks means
something different - the size of a buffer, a limit of years, something
to group digits, etc. So it's natural that they are all expressed in
ways that relate to how they would be used.
By contrast, each identifier may be used many times always relating to
the same thing. That can lead to questions about the reason why it is written differently in some places compared with others.