• Re: transpiling to low level C

    From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Tue Dec 17 18:29:37 2024
    From Newsgroup: comp.lang.c

    On 17.12.2024 00:26, bart wrote:
    On 16/12/2024 20:39, Janis Papanagnou wrote:

    I wasn't commenting on any "IL",

    The subthread was about ILs.

    You obviously missed that I wasn't quoting anything else from this
    subthread but the one isolated statement I replied to (which also
    wasn't a part of a larger paragraph but standing alone on its own).

    (That's why I think it's good style to strip a reply down to the
    essentials one intends to respond to, and not to assume (or refer to) any
    unspoken or unquoted parts of a thread; keep posts self-contained! YMMV.
    That will also keep potential confusion and misunderstandings small.)

    Janis

    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Thiago Adams@thiago.adams@gmail.com to comp.lang.c on Tue Dec 17 14:55:34 2024
    From Newsgroup: comp.lang.c

    On 12/17/2024 4:03 AM, BGB wrote:
    On 12/16/2024 5:21 AM, Thiago Adams wrote:
    On 15/12/2024 20:53, BGB wrote:
    On 12/15/2024 3:32 PM, bart wrote:
    On 15/12/2024 19:08, Bonita Montero wrote:
    C++ is more readable because it is magnitudes more expressive than C.
    You can easily write a C++ statement that would take hundreds of lines in
    C (imagine specializing an unordered_map by hand). Making a language
    less expressive makes it even less readable, and that's also true for
    your reduced C.


    That's not really the point of it. This reduced C is used as an
    intermediate language for a compiler target. It will not usually be
    read, or maintained.

    An intermediate language needs to be at a lower level than the source
    language.

    And for this project, it needs to be compilable by any C89 compiler.

    Generating C++ would be quite useless.


    As an IL, even C is a little overkill, unless turned into a
    restricted subset (say, along similar lines to GCC's GIMPLE).

    Say:
       Only function-scope variables allowed;
       No high-level control structures;
       ...

    Say:
       int foo(int x)
       {
         int i, v;
         for(i=x, v=0; i>0; i--)
           v=v*i;
         return(v);
       }

    Becoming, say:
       int foo(int x)
       {
         int i;
         int v;
         i=x;
         v=0;
         if(i<=0)goto L1;
         L0:
         v=v*i;
         i=i-1;
         if(i>0)goto L0;
         L1:
         return v;
       }
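
    A small sketch for checking that this kind of lowering preserves
    behaviour. The names fact_loop and fact_goto are made up for illustration,
    and v starts at 1 rather than 0 here so that the result is not trivially
    zero:

```c
/* Structured form: product of x, x-1, ..., 1 (yields 1 when x <= 0). */
int fact_loop(int x)
{
    int i, v;
    for (i = x, v = 1; i > 0; i--)
        v = v * i;
    return v;
}

/* Lowered form: function-scope variables, assignments, conditional
   goto and labels only. */
int fact_goto(int x)
{
    int i;
    int v;
    i = x;
    v = 1;
    if (i <= 0) goto L1;
L0:
    v = v * i;
    i = i - 1;
    if (i > 0) goto L0;
L1:
    return v;
}
```

    Both functions should agree for every input whose product fits in an int,
    including x <= 0, where the loop body is skipped entirely.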

    ...


    I have considered removing loops and keeping only goto.
    But I think this would not bring much simplification.


    It depends.

    If the compiler works like an actual C compiler, with a full parser and
    AST stage, yeah, it may not save much.


    If the parser is a thin wrapper over 3AC operations (only allowing
    statements that map 1:1 with a 3AC IR operation), it may save a bit
    more...



    As for whether or not it makes sense to use a C like syntax here, this
    is more up for debate (for practical use within a compiler, I would
    assume a binary serialization rather than an ASCII syntax, though ASCII
    may be better in terms of inter-operation or human readability).


    But, as can be noted, I would assume a binary serialization that is
    oriented around operators; and *not* about serializing the structures
    used to implement those operators. Also I would assume that the IR need
    not be in SSA form (conversion to full SSA could be done when reading in
    the IR operations).


    My argument is that not using SSA form means fewer issues for both the
    serialization format and the compiler front-end to deal with (and SSA
    is comparably easy to regenerate in the backend, with the backend
    operating on its internal IR in SSA form).

    Well, contrast to LLVM assuming everything is always in SSA form.

    ...



    I have also considered splitting expressions.

    For instance

    if (a*b+c) {}

    into

    register int r1 = a * b;
    register int r2 = r1 + c;
    if (r2) {}

    This would make it easier to add overflow checks at runtime (if desired)
    and to implement things like _Complex.

    Is this what you mean by 3AC or SSA?

    This would definitely simplify the expression grammar.
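
    A minimal sketch of how overflow checks could be attached once each
    operation has its own temporary. The helper names (checked_mul,
    checked_add, cond_split) are hypothetical, and the division-based tests
    assume C99 truncating division:

```c
#include <limits.h>

/* Returns 1 and stores a*b in *out if the product fits in int, else 0. */
int checked_mul(int a, int b, int *out)
{
    if (a > 0) {
        if (b > 0) { if (a > INT_MAX / b) return 0; }
        else       { if (b < INT_MIN / a) return 0; }
    } else {
        if (b > 0) { if (a < INT_MIN / b) return 0; }
        else       { if (a != 0 && b < INT_MAX / a) return 0; }
    }
    *out = a * b;
    return 1;
}

/* Returns 1 and stores a+b in *out if the sum fits in int, else 0. */
int checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return 0;
    *out = a + b;
    return 1;
}

/* if (a*b+c) {} with one operation per temporary. *ok is set to 0 when
   an intermediate result would overflow; otherwise the return value is
   the truth value of the condition. */
int cond_split(int a, int b, int c, int *ok)
{
    int r1 = 0;
    int r2 = 0;
    *ok = checked_mul(a, b, &r1) && checked_add(r1, c, &r2);
    if (!*ok)
        return 0;
    return r2 != 0;
}
```

    With the expression kept whole, the generator would need to thread the
    same checks through a nested expression tree; after splitting, each check
    sits next to exactly one operation.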


  • From Thiago Adams@thiago.adams@gmail.com to comp.lang.c on Tue Dec 17 14:59:49 2024
    From Newsgroup: comp.lang.c

    I have also considered removing local scopes. But I think local scopes may
    be useful for making better use of the stack, reusing the same addresses
    when variables go out of scope.
    For instance

    {
     int i = 1;
     {
      int a = 2;
     }
     {
      int b = 3;
     }
    }
    I think scopes make it easier to reuse the same stack position for a and b,
    because it is easier to see that a no longer exists.
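
    One way the lifetime information might be kept if scopes were flattened
    anyway is to make the sharing explicit with a union. This is only a
    sketch; the function names and the union trick are assumptions for
    illustration, and an optimizing C compiler will typically work out the
    same overlap by itself from liveness analysis:

```c
/* Original, with nested scopes: a and b are never live at the same time. */
int scoped(void)
{
    int i = 1;
    {
        int a = 2;
        i = i + a;
    }
    {
        int b = 3;
        i = i + b;
    }
    return i;
}

/* Flattened: function-scope declarations only. The union states
   explicitly that a and b may occupy the same stack slot. */
int flattened(void)
{
    int i;
    union { int a; int b; } slot;
    i = 1;
    slot.a = 2;
    i = i + slot.a;
    slot.b = 3;
    i = i + slot.b;
    return i;
}
```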

  • From Thiago Adams@thiago.adams@gmail.com to comp.lang.c on Tue Dec 17 15:16:48 2024
    From Newsgroup: comp.lang.c

    I could also remove structs, replacing them with unsigned char[] and
    casting parts of it to access members.

    I think this is the lowest level possible in C.
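
    A rough sketch of what that lowering could look like. The struct, the
    offset macros and the helper names are invented for illustration, and
    memcpy is used in place of pointer casts so the example does not depend on
    alignment or aliasing behaviour:

```c
#include <stddef.h>
#include <string.h>

/* The struct as the source program declares it. */
struct point { int x; int y; };

/* Layout facts the generator would emit as plain numbers; offsetof and
   sizeof are used here only so the sketch stays portable. */
#define POINT_SIZE  sizeof(struct point)
#define POINT_OFF_X offsetof(struct point, x)
#define POINT_OFF_Y offsetof(struct point, y)

static void store_int(unsigned char *base, size_t off, int v)
{
    memcpy(base + off, &v, sizeof v);
}

static int load_int(const unsigned char *base, size_t off)
{
    int v;
    memcpy(&v, base + off, sizeof v);
    return v;
}

/* Lowered form of: struct point p; p.x = x; p.y = y; return p.x + p.y; */
int lowered_sum(int x, int y)
{
    unsigned char p[POINT_SIZE];
    store_int(p, POINT_OFF_X, x);
    store_int(p, POINT_OFF_Y, y);
    return load_int(p, POINT_OFF_X) + load_int(p, POINT_OFF_Y);
}
```

    A cast such as *(int *)(p + off) would be shorter, but it is only safe if
    the byte array happens to be suitably aligned for int.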





  • From bart@bc@freeuk.com to comp.lang.c on Tue Dec 17 18:37:47 2024
    From Newsgroup: comp.lang.c

    On 17/12/2024 18:16, Thiago Adams wrote:


    I could also remove structs, replacing them with unsigned char[] and
    casting parts of it to access members.

    I think this is the lowest level possible in C.

    This is what I do in my IL, where structs are just fixed blocks of so
    many bytes.

    But there are some things to consider:

    * A struct may still need alignment corresponding to the strictest
    alignment among the members. (Any padding between members and at the
    end should already be taken care of.)

    I use an alignment based on overall size, so a 40-byte struct is assumed
    to have a 64-bit max alignment, though it may only need 16-bit alignment.
    That is harmless, but it can be fixed with some extra metadata.

    With a C char[], you can choose to use a short[] array for example
    (obviously of half the length) to signal that it needs 16-bit alignment.
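
    The element-type trick can be checked with a pre-C11 alignment probe. The
    type and function names below are made up for the example, and it assumes
    the usual behaviour that a member following a single char is placed at the
    first offset satisfying its alignment:

```c
#include <stddef.h>

typedef unsigned char blob_bytes[40];   /* 40 bytes, char alignment   */
typedef short         blob_shorts[20];  /* 20 shorts, short alignment */

/* Probe structs: the offset of v is taken as the alignment of its type. */
struct probe_bytes  { char c; blob_bytes  v; };
struct probe_shorts { char c; blob_shorts v; };
struct probe_short  { char c; short       v; };

size_t align_of_blob_bytes(void)  { return offsetof(struct probe_bytes, v); }
size_t align_of_blob_shorts(void) { return offsetof(struct probe_shorts, v); }
size_t align_of_short(void)       { return offsetof(struct probe_short, v); }
```

    On a target where short is 2 bytes, both blobs are 40 bytes long, but only
    the short[] version inherits the 16-bit alignment.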


    * Some machine ABIs, like SYS V for 64 bits, may need to know the
    internal layout of structs when they are passed 'by value'.

    If reduced down to char[], this info will be missing.

    I ignore this because I only target the Win64 ABI. It only comes up in SYS
    V, when calling functions across an FFI, and when the API uses value
    structs, which is uncommon. Also, I can't make head or tail of
    the rules.


  • From antispam@antispam@fricas.org (Waldek Hebisch) to comp.lang.c on Tue Dec 17 18:46:00 2024
    From Newsgroup: comp.lang.c

    bart <bc@freeuk.com> wrote:

    If you try to extract any meaning, it is that any control flow can be
    expressed either with 'goto' or with 'recursive functions'.

    This is what I picked up on. Who on earth would eschew 'goto' and use
    such a disproportionately more complex and inefficient method like
    recursive functions?

    Due to a silly coding standard? Or in a language that does not have
    'goto'.

    How would you even express an arbitrary goto from random point X in a
    function to random point Y, which may be inside differently nested
    blocks, via a recursive function?

    AFAICS in C the main limitation is that you either pass all variables
    as parameters (ugly and verbose) or use only global variables
    (much worse than 'goto'). The following silly example shows
    that 'if' can be simulated using an array of function pointers and
    indirect calls:

    static int bar(int a) {
        return a + 1;
    }

    static int baz(int a) {
        return 2*a;
    }

    int
    silly(int a) {
        int (*t[2])(int) = {bar, baz};
        return (*t[!!(a > 3)])(a);
    }

    If you compile it with 'gcc -S -O2' you can see that there are actually
    no function calls in the generated code (but the generated code is clearly
    crappy). However, the needed optimization is really simple, so
    in principle any compiler could do better. OTOH code like
    this is rare in practice, so compiler writers probably did not
    bother.

    In a similar way one can simulate a dense C 'switch'.
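
    A sketch of that idea, with made-up handler names. The selector is clamped
    to a default entry with arithmetic only, so no 'if' or 'switch' appears in
    the dispatcher:

```c
static int op_add(int a, int b) { return a + b; }
static int op_sub(int a, int b) { return a - b; }
static int op_mul(int a, int b) { return a * b; }
static int op_default(int a, int b) { (void)a; (void)b; return 0; }

/* Equivalent of:
     switch (op) { case 0: add; case 1: sub; case 2: mul; default: 0; } */
int dispatch(int op, int a, int b)
{
    static int (*const table[4])(int, int) = {
        op_add, op_sub, op_mul, op_default
    };
    unsigned idx = (unsigned)op;        /* negative op becomes a large value */
    unsigned in_range = idx < 3u;       /* 1 when op is 0, 1 or 2 */
    idx = idx * in_range + 3u * !in_range;
    return (*table[idx])(a, b);
}
```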

    The main point is that a function call at the end of, say, 'F' to a
    function 'G' which returns in the same way as 'F' can be compiled to some
    stack shuffling + goto (this is called 'tail call optimization').
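
    The simplest case is a function that tail-calls itself. The sketch below
    (function names chosen for the example) shows the recursive form and the
    hand-applied result of replacing the call with argument shuffling plus a
    goto:

```c
/* Tail-recursive form: the recursive call is the last action. */
unsigned gcd_rec(unsigned a, unsigned b)
{
    if (b == 0)
        return a;
    return gcd_rec(b, a % b);
}

/* Same function with the tail call replaced by reassigning the
   parameters and jumping back to the entry point. */
unsigned gcd_goto(unsigned a, unsigned b)
{
    unsigned t;
entry:
    if (b == 0)
        return a;
    t = a % b;
    a = b;
    b = t;
    goto entry;
}
```

    The temporary t stands in for the "stack shuffling": the new arguments
    must be computed before the old ones are overwritten.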

    IIUC at least some Scheme and ML compilers keep calls in the
    intermediate-level representation (both have no 'goto') and convert them
    to jumps only when emitting machine code.

    A similar thing was used by one disassembly system: it generated "high
    level code" by converting all jumps in machine code to function
    calls. Later the result was cleaned up by transformations, in
    particular recursion elimination.

    Of course, for the original purpose of this thread, replacing 'if' by
    indirect calls is useless.
    --
    Waldek Hebisch
  • From BGB@cr88192@gmail.com to comp.lang.c on Tue Dec 17 12:51:04 2024
    From Newsgroup: comp.lang.c

    On 12/17/2024 6:04 AM, bart wrote:
    On 16/12/2024 21:23, Lawrence D'Oliveiro wrote:
    On Sun, 15 Dec 2024 17:53:30 -0600, BGB wrote:

    As an IL, even C is a little overkill, unless turned into a restricted
    subset ...

    Why not use WASM as your IL?

    Have you tried it? I mean, directly generating WASM from a compiler
    front-end, not just using somebody else's tool to do so.

    WASM is a stack-based language, but one that supposedly doesn't even
    have branching, although there is a 'br' statement, with some
    restrictions.

    Information about it is quite elusive; it took me 5 minutes to even get
    examples of what it looks like (and I've seen it before).

    C can apparently compile to WASM via Clang, so I tried this program:

     void F(void) {
        int i=0;
        while (i<10000) ++i;
     }

    which compiled to 128 lines of WASM (technically, some form of 'WAT', as
    WASM is a binary format). The 60 lines corresponding to F are shown
    below, and below that is my own stack IL code.

    So, what do you do with your WASM/WAT program once generated? I've no
    idea, except that WASM is inextricably tied up with browsers and with
    JavaScript, in which I have no interest.

    With C, you run a compiler; with ASM, an assembler; these formats are
    well understood.

    You can appreciate that it can be easier to devise your own format and
    your own tools that you understand 100%.



    Hmm... It looks like the WASM example is already trying to follow SSA
    rules, then mapped to a stack IL... Not necessarily the best way to do
    it IMO.


    But, yeah, in BGBCC I am also using a stack-based IL (RIL), which
    follows rules more in a similar category to .NET CIL (in that, stack
    items carry type, and the stack is generally fully emptied on branch).


    In my IL, labels are identified with a LABEL opcode (with an immediate),
    and things like branches work by having the branch target and label
    having the same immediate (label ID).

    I ended up considering this preferable to byte offsets, as:
    Easier to generate from the front-end;
    LABEL also marks the start/end of basic blocks;
    ...
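
    A toy illustration of label IDs versus byte offsets. This is not RIL; the
    opcode set, struct layout and function names are invented for the sketch.
    A first pass records where each LABEL sits, and branches then look up
    their target by ID:

```c
#define MAX_LABELS 16

enum opcode { OP_LABEL, OP_ADD, OP_DEC, OP_JNZ, OP_RET };
struct insn { enum opcode op; int imm; };

/* Runs prog with an accumulator (starting at 0) and a counter.
   Returns the accumulator, or -1 if a branch names an unknown label. */
int run(const struct insn *prog, int n, int count)
{
    int label_pc[MAX_LABELS];
    int pc, acc = 0;

    /* Pass 1: map each label ID to the index of its LABEL op. */
    for (pc = 0; pc < MAX_LABELS; pc++)
        label_pc[pc] = -1;
    for (pc = 0; pc < n; pc++)
        if (prog[pc].op == OP_LABEL &&
            prog[pc].imm >= 0 && prog[pc].imm < MAX_LABELS)
            label_pc[prog[pc].imm] = pc;

    /* Pass 2: execute; branches resolve their target through the table. */
    pc = 0;
    while (pc < n) {
        const struct insn *i = &prog[pc++];
        switch (i->op) {
        case OP_LABEL: break;                  /* no effect at run time */
        case OP_ADD:   acc += i->imm; break;
        case OP_DEC:   count--; break;
        case OP_JNZ:
            if (count != 0) {
                if (i->imm < 0 || i->imm >= MAX_LABELS || label_pc[i->imm] < 0)
                    return -1;
                pc = label_pc[i->imm];
            }
            break;
        case OP_RET:   return acc;
        }
    }
    return acc;
}

/* Adds 5 to the accumulator count times; count must be at least 1. */
int demo(int count)
{
    static const struct insn prog[] = {
        { OP_LABEL, 0 }, { OP_ADD, 5 }, { OP_DEC, 0 },
        { OP_JNZ, 0 },   { OP_RET, 0 }
    };
    return run(prog, (int)(sizeof prog / sizeof prog[0]), count);
}
```

    The front-end never has to know instruction sizes or patch offsets; it
    only hands out label numbers, and each LABEL op doubles as a basic-block
    boundary.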

    There are also opcodes to convey the source filename and line number,
    these don't generate any output but merely serve to transport filename
    and line number information (useful for debugging).


    RIL was a little weird in that functions and variables are themselves
    defined via bytecode operations. This is unlike both JVM and .NET CIL,
    which had used external metadata/structures for defining functions and
    variables (never mind the significant differences between JVM and .NET in
    this area).

    This has pros and cons. The main downside of the current format is that it
    requires the bytecode modules to be loaded sequentially and fully. This
    works OK for a compiler on a modern PC, but does impose on RAM somewhat
    for a compiler on a more memory-constrained target. One idea would be to
    individually wrap functions and have a mechanism so that they can be
    loaded dynamically. But this hasn't really been done for my existing
    IL. The most likely option is that metadata continues to be defined via
    bytecode operations, just that each function is separately wrapped, and
    there may be an index to map function names to the corresponding "lump"
    (say, if using a WAD variant as the top-level container).

    Say:
    Lump name is "FNC01234" (IWAD) or "func_1234" (WAD2).
    And there is a table mapping "FOO_SomeFunction" to "FNC01234" or
    "func_1234".
    But this sort of thing, along with past ideas to try moving this over
    to a format along similar lines to RIFF/AVI, has generally fizzled
    (along with possible debate over the merits of a WAD-like or
    RIFF-like format).

    Though, an arguably simpler option might be to just individually wrap
    the bytecode for each translation unit, and have a manifest of what
    symbols are present. In this way, it would function more like a
    traditional static library (as opposed to the current strategy of
    globbing all of the translation units in the library into a single large
    blob of bytecode); and probably dumping the bytecode for each
    translation unit into a WAD (again, possibly either IWAD or WAD2, though
    probably WAD2 in this case, as the comparably larger lumps would
    eliminate most concern over the larger directory entries).



    When converting to the 3AC IR, there is the quirk that function calls
    are split into multiple parts:
    The CALL operation, which ends the current basic-block;
    A CSRV operation, which is at the start of a new basic block.
    CSRV = Caller Save Return Value.

    In cases where the 3AC was being interpreted, this was better, as the
    CSRV operation serves to save the return value from the called function
    to the correct place in the caller's frame (where the interpreter does
    not use recursion for its own operation).
    Internal conversion to 3AC was faster than trying to directly interpret
    a stack bytecode (as well as 3AC being a better format for code generation).



    --------------------------------------
    F:                                      # @F
        .functype    F () -> ()
        .local      i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32, i32
    # %bb.0:
        global.get    __stack_pointer
        local.set    0
        i32.const    16
        local.set    1
        local.get    0
        local.get    1
        i32.sub
        local.set    2
        i32.const    0
        local.set    3
        local.get    2
        local.get    3
        i32.store    12
    .LBB0_1:                                # =>This Inner Loop Header: Depth=1
        block
        loop                                        # label1:
        local.get    2
        i32.load    12
        local.set    4
        i32.const    10000
        local.set    5
        local.get    4
        local.set    6
        local.get    5
        local.set    7
        local.get    6
        local.get    7
        i32.lt_s
        local.set    8
        i32.const    1
        local.set    9
        local.get    8
        local.get    9
        i32.and
        local.set    10
        local.get    10
        i32.eqz
        br_if       1                               # 1: down to label0
    # %bb.2:                                #   in Loop: Header=BB0_1 Depth=1
        local.get    2
        i32.load    12
        local.set    11
        i32.const    1
        local.set    12
        local.get    11
        local.get    12
        i32.add
        local.set    13
        local.get    2
        local.get    13
        i32.store    12
        br          0                               # 0: up to label1
    .LBB0_3:
        end_loop
        end_block                               # label0:
        return
        end_function

    -----------------------------

    proc F::
        local    i32       i.1
        load     i32       0
        store    i32       i.1
        jump               #2
    #4:
        load     u64       &i.1
        incrto   i32 /1
    #2:
        load     i32       i.1
        load     i32       10000
        jumplt   i32       #4
    #3:
    #1:
        retproc
    endproc



  • From Thiago Adams@thiago.adams@gmail.com to comp.lang.c on Tue Dec 17 16:07:45 2024
    From Newsgroup: comp.lang.c

    On 12/17/2024 3:37 PM, bart wrote:
    On 17/12/2024 18:16, Thiago Adams wrote:


    I could also remove structs, replacing them with unsigned char[] and
    casting parts of it to access members.

    I think this is the lowest level possible in C.

    This is what I do in my IL, where structs are just fixed blocks of so
    many bytes.


    How do you handle struct parameters?


  • From BGB@cr88192@gmail.com to comp.lang.c on Tue Dec 17 13:07:44 2024
    From Newsgroup: comp.lang.c

    On 12/17/2024 11:55 AM, Thiago Adams wrote:

    I have also considered splitting expressions.

    For instance

    if (a*b+c) {}

    into

    register int r1 = a * b;
    register int r2 = r1 + c;
    if (r2) {}

    This would make it easier to add overflow checks at runtime (if desired)
    and to implement things like _Complex.

    Is this what you mean by 3AC or SSA?


    3AC (three-address code) means that the IR expresses 3 (or sometimes
    more) operands per IR op.

    So:
    MUL r1, a, b
    Rather than, say, stack:
    LOAD a
    LOAD b
    MUL
    STORE r1
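
    To make the relationship concrete, here is a hedged sketch of turning the
    stack form into the 3AC form with a symbolic operand stack. The opcode
    names, the t0/t1 temporaries and the MOV mnemonic are assumptions of this
    example, not part of any IL discussed here:

```c
#include <stdio.h>
#include <string.h>

enum sop { S_LOAD, S_MUL, S_ADD, S_STORE };
struct sinsn { enum sop op; const char *name; };  /* name unused for MUL/ADD */

#define MAXDEPTH 8
#define NAMELEN  16

/* Writes 3AC text for the stack code into out.
   Returns the number of 3AC ops emitted, or -1 on error. */
int stack_to_3ac(const struct sinsn *code, int n, char *out, size_t outsz)
{
    char stack[MAXDEPTH][NAMELEN];   /* operand names, not values */
    char dst[NAMELEN];
    int sp = 0, ntemp = 0, nops = 0, i, len, folded;
    size_t used = 0;
    const char *mn;

    if (outsz == 0) return -1;
    out[0] = '\0';
    for (i = 0; i < n; i++) {
        switch (code[i].op) {
        case S_LOAD:
            if (sp == MAXDEPTH) return -1;
            snprintf(stack[sp++], NAMELEN, "%s", code[i].name);
            break;
        case S_MUL:
        case S_ADD:
            if (sp < 2) return -1;
            mn = (code[i].op == S_MUL) ? "MUL" : "ADD";
            /* A STORE right after the operator becomes the destination. */
            folded = (i + 1 < n && code[i + 1].op == S_STORE);
            if (folded) snprintf(dst, NAMELEN, "%s", code[++i].name);
            else        snprintf(dst, NAMELEN, "t%d", ntemp++);
            len = snprintf(out + used, outsz - used, "%s %s, %s, %s\n",
                           mn, dst, stack[sp - 2], stack[sp - 1]);
            if (len < 0 || (size_t)len >= outsz - used) return -1;
            used += (size_t)len;
            sp -= 2;
            if (!folded) strcpy(stack[sp++], dst);  /* result stays on stack */
            nops++;
            break;
        case S_STORE:
            if (sp < 1) return -1;
            len = snprintf(out + used, outsz - used, "MOV %s, %s\n",
                           code[i].name, stack[sp - 1]);
            if (len < 0 || (size_t)len >= outsz - used) return -1;
            used += (size_t)len;
            sp--;
            nops++;
            break;
        }
    }
    return nops;
}
```

    Fed the four stack ops above, this yields the single line
    "MUL r1, a, b"; the stack itself disappears and only named operands
    remain.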


    SSA:
    Static Single Assignment

    Generally:
    Every variable may only be assigned once (more like in a functional
    programming language);
    Generally, variables are "merged" in the control-flow via PHI operators
    (which variable merges in depends on which path control came from).

    IMHO, while SSA is preferable for backend analysis, optimization, and
    code generation; it is undesirable pretty much everywhere else as it
    adds too much complexity.

    Better IMO for the frontend compiler and main IL stage to assume that
    local variables are freely mutable.

    Typically, global variables are excluded in most variants, and remain
    fully mutable; but may be handled as designated LOAD/STORE operations.


    In BGBCC though, full SSA only applies to temporaries. Normal local
    variables are merely flagged by "version", and all versions of the same
    local variable implicitly merge back together at each branch/label.

    This allows some similar advantages (for analysis and optimization)
    while limiting some of the complexities. Though, this differs from
    temporaries which are assumed to essentially fully disappear once they
    go outside of the span in which they exist (albeit with an awkward case
    to deal with temporaries that cross basic-block boundaries, which need
    to actually "exist" in some semi-concrete form, more like local variables).

    Note that unless the address is taken of a local variable, it need not
    have any backing in memory. Temporaries can never have their address
    taken, so generally exist exclusively in CPU registers.


    This would definitely simplify the expression grammar.



  • From Thiago Adams@thiago.adams@gmail.com to comp.lang.c on Tue Dec 17 16:33:09 2024
    From Newsgroup: comp.lang.c

    On 12/17/2024 4:07 PM, BGB wrote:
    On 12/17/2024 11:55 AM, Thiago Adams wrote:
    Em 12/17/2024 4:03 AM, BGB escreveu:
    On 12/16/2024 5:21 AM, Thiago Adams wrote:
    On 15/12/2024 20:53, BGB wrote:
    On 12/15/2024 3:32 PM, bart wrote:
    On 15/12/2024 19:08, Bonita Montero wrote:
    C++ is more readable because is is magnitudes more expressive
    than C.
    You can easily write a C++-statement that would hunddres of lines in >>>>>>> C (imagines specializing a unordered_map by hand). Making a language >>>>>>> less expressive makes it even less readable, and that's also true >>>>>>> for
    your reduced C.


    That's not really the point of it. This reduced C is used as an
    intermediate language for a compiler target. It will not usually
    be read, or maintained.

    An intermediate language needs to at a lower level than the source >>>>>> language.

    And for this project, it needs to be compilable by any C89 compiler. >>>>>>
    Generating C++ would be quite useless.


    As an IL, even C is a little overkill, unless turned into a
    restricted subset (say, along similar lines to GCC's GIMPLE).

    Say:
       Only function-scope variables allowed;
       No high-level control structures;
       ...

    Say:
       int foo(int x)
       {
         int i, v;
         for(i=x, v=0; i>0; i--)
           v=v*i;
         return(v);
       }

    Becoming, say:
       int foo(int x)
       {
         int i;
         int v;
         i=x;
         v=0;
         if(i<=0)goto L1;
         L0:
         v=v*i;
         i=i-1;
         if(i>0)goto L0;
         L1:
         return v;
       }

    ...


    I have considered to remove loops and keep only goto.
    But I think this is not bring too much simplification.


    It depends.

    If the compiler works like an actual C compiler, with a full parser
    and AST stage, yeah, it may not save much.


    If the parser is a thin wrapper over 3AC operations (only allowing
    statements that map 1:1 with a 3AC IR operation), it may save a bit
    more...



    As for whether or not it makes sense to use a C like syntax here,
    this is more up for debate (for practical use within a compiler, I
    would assume a binary serialization rather than an ASCII syntax,
    though ASCII may be better in terms of inter-operation or human
    readability).


    But, as can be noted, I would assume a binary serialization that is
    oriented around operators; and *not* about serializing the structures
    used to implement those operators. Also I would assume that the IR
    need not be in SSA form (conversion to full SSA could be done when
    reading in the IR operations).


    My argument is that not using SSA form means fewer issues for both
    the serialization format and compiler front-end to need to deal with
    (and is comparably easy to regenerate for the backend, with the
    backend operating with its internal IR in SSA form).

    Well, contrast to LLVM assuming everything is always in SSA form.

    ...



    I also have considered split expressions.

    For instance

    if (a*b+c) {}

    into

    register int r1 = a * b;
    register int r2 = r1 + c;
    if (r2) {}

    This would make it easier to add overflow checks at runtime (if desired)
    and to implement things like _Complex.
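A minimal sketch of what the split form could look like with run-time overflow checks added, one check per temporary. The helper names (checked_mul, checked_add, cond_checked) and the convention of returning -1 on overflow are assumptions made for illustration; they are not taken from any compiler discussed in this thread.

```c
#include <limits.h>

/* Hypothetical helper: stores a*b in *out and returns 1 if the product
   fits in an int, otherwise returns 0 without multiplying. */
static int checked_mul(int a, int b, int *out)
{
    if (a > 0) {
        if (b > 0) { if (a > INT_MAX / b) return 0; }
        else       { if (b < INT_MIN / a) return 0; }
    } else {
        if (b > 0) { if (a < INT_MIN / b) return 0; }
        else       { if (a != 0 && b < INT_MAX / a) return 0; }
    }
    *out = a * b;
    return 1;
}

/* Hypothetical helper: stores a+b in *out and returns 1 if the sum
   fits in an int, otherwise returns 0 without adding. */
static int checked_add(int a, int b, int *out)
{
    if (b > 0 && a > INT_MAX - b) return 0;
    if (b < 0 && a < INT_MIN - b) return 0;
    *out = a + b;
    return 1;
}

/* Split form of the condition in `if (a*b+c) {}`.
   Returns 1 or 0 for the truth value, or -1 if a step overflowed. */
int cond_checked(int a, int b, int c)
{
    int r1;
    int r2;
    if (!checked_mul(a, b, &r1)) return -1;   /* r1 = a * b  */
    if (!checked_add(r1, c, &r2)) return -1;  /* r2 = r1 + c */
    return r2 != 0;
}
```

Because every intermediate value has its own name, each check attaches to exactly one operation, which is the convenience the split form is meant to provide.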

    Is this what you mean by 3AC or SSA?


    3AC means that the IR expresses 3 (or sometimes more) operands per IR op.

    So:
      MUL r1, a, b
    Rather than, say, stack:
      LOAD a
      LOAD b
      MUL
      STORE r1
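To make the contrast concrete, here is a small sketch of two toy interpreters, one per encoding, both computing r1 = a * b. The opcode names, the slot numbering and the struct layouts are hypothetical and chosen only for this illustration; real ILs such as those mentioned in this thread will differ.

```c
enum { OP_LOAD, OP_STORE, OP_MUL, OP_ADD };  /* hypothetical opcodes */
enum { VAR_A, VAR_B, VAR_R1, NVARS };        /* hypothetical variable slots */

/* Three-address form: every op names its destination and both sources. */
struct tac { int op, dst, src1, src2; };

static void run_tac(const struct tac *code, int n, int *vars)
{
    int k;
    for (k = 0; k < n; k++) {
        int x = vars[code[k].src1];
        int y = vars[code[k].src2];
        if (code[k].op == OP_MUL)      vars[code[k].dst] = x * y;
        else if (code[k].op == OP_ADD) vars[code[k].dst] = x + y;
    }
}

/* Stack form: operands are implicit, an op names at most one slot. */
struct sop { int op, slot; };

static void run_stack(const struct sop *code, int n, int *vars)
{
    int stack[16];  /* no depth check: acceptable only for this tiny demo */
    int sp = 0;
    int k;
    for (k = 0; k < n; k++) {
        switch (code[k].op) {
        case OP_LOAD:  stack[sp++] = vars[code[k].slot]; break;
        case OP_STORE: vars[code[k].slot] = stack[--sp]; break;
        case OP_MUL:   sp--; stack[sp - 1] *= stack[sp]; break;
        case OP_ADD:   sp--; stack[sp - 1] += stack[sp]; break;
        }
    }
}

/* MUL r1, a, b */
int mul_tac(int a, int b)
{
    static const struct tac code[] = { { OP_MUL, VAR_R1, VAR_A, VAR_B } };
    int vars[NVARS];
    vars[VAR_A] = a; vars[VAR_B] = b; vars[VAR_R1] = 0;
    run_tac(code, 1, vars);
    return vars[VAR_R1];
}

/* LOAD a; LOAD b; MUL; STORE r1 */
int mul_stack(int a, int b)
{
    static const struct sop code[] = {
        { OP_LOAD, VAR_A }, { OP_LOAD, VAR_B }, { OP_MUL, 0 }, { OP_STORE, VAR_R1 }
    };
    int vars[NVARS];
    vars[VAR_A] = a; vars[VAR_B] = b; vars[VAR_R1] = 0;
    run_stack(code, 4, vars);
    return vars[VAR_R1];
}
```

The three-address program is one op against four in stack form, at the cost of wider ops that each carry three operand fields.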


    SSA:
      Static Single Assignment


    Oh sorry .. I knew what SSA is.

    Generally:
    Every variable may only be assigned once (more like in a functional programming language);
    Generally, variables are "merged" in the control-flow via PHI operators (which variable merges in depending on which path control came from).
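A rough illustration of the PHI idea, using a hypothetical function abs_val. C has no PHI operator, so the SSA form appears only in the comment; the C body shows one common way of lowering a PHI, namely a copy into the merged variable on each incoming edge. The names v0, v1 and v2 are assumptions for this sketch.

```c
/*
 * Source:              SSA form (each name assigned exactly once):
 *   v = x;               v0 = x
 *   if (x < 0)           if (x < 0) goto neg; else goto join;
 *     v = -x;          neg:
 *   return v;            v1 = -v0
 *                      join:
 *                        v2 = PHI(v0, v1)
 *                        return v2
 */
int abs_val(int x)
{
    int v0;
    int v1;
    int v2;

    v0 = x;
    if (x < 0) goto neg;
    v2 = v0;            /* PHI input on the fall-through edge */
    goto join;
neg:
    v1 = -v0;
    v2 = v1;            /* PHI input on the edge from 'neg' */
join:
    return v2;
}
```

Note that v2 is written on both edges, so the lowered C is no longer strictly single-assignment; the single assignment exists only in the PHI form shown in the comment.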


    I do similar merge in my flow analysis but without the concept of SSA.

    IMHO, while SSA is preferable for backend analysis, optimization, and
    code generation; it is undesirable pretty much everywhere else as it
    adds too much complexity.

    Better IMO for the frontend compiler and main IL stage to assume that
    local variables are freely mutable.

    Typically, global variables are excluded in most variants, and remain
    fully mutable; but may be handled as designated LOAD/STORE operations.


    In BGBCC though, full SSA only applies to temporaries. Normal local variables are merely flagged by "version", and all versions of the same local variable implicitly merge back together at each branch/label.


    Sorry what is BGBCC ? (C compiler?)

    This allows some similar advantages (for analysis and optimization)
    while limiting some of the complexities. Though, this differs from temporaries which are assumed to essentially fully disappear once they
    go outside of the span in which they exist (albeit with an awkward case
    to deal with temporaries that cross basic-block boundaries, which need
    to actually "exist" in some semi-concrete form, more like local variables).

    Note that unless the address is taken of a local variable, it need not
    have any backing in memory. Temporaries can never have their address
    taken, so generally exist exclusively in CPU registers.


    This would definitely simplify expressions grammar.





    It can be added in the future.

  • From Lawrence D'Oliveiro@ldo@nz.invalid to comp.lang.c on Tue Dec 17 19:40:53 2024
    From Newsgroup: comp.lang.c

    On Tue, 17 Dec 2024 12:04:29 +0000, bart wrote:

    Information about it is quite elusive ...

    Did you try the usual place for Web-related stuff?

    <https://developer.mozilla.org/en-US/docs/WebAssembly>
  • From bart@bc@freeuk.com to comp.lang.c on Tue Dec 17 19:42:51 2024
    From Newsgroup: comp.lang.c

    On 17/12/2024 19:07, Thiago Adams wrote:
    Em 12/17/2024 3:37 PM, bart escreveu:
    On 17/12/2024 18:16, Thiago Adams wrote:


    also remove structs, replacing them with unsigned char [] and casting
    parts of it to access members.

    I think this is the lowest level possible in C.

    This is what I do in my IL, where structs are just fixed blocks of so
    many bytes.


    How do you do with struct parameters?



    In the IL they are always passed notionally by value. This side of the
    IL (that is, the frontend compiler that generates IL), knows nothing
    about the target, such as ABI details.

    (In practice, some things are known, like the word size of the target,
    since that can change characteristics of the source language, like the
    size of 'int' or of 'void*'. It also needs to assume, or request from
    the backend, argument evaluation order, although my IL can reverse order
    if necessary.)

    It is the backend, on the other side of the IL, that needs to deal with
    those details.

    That can include making copies of structs that the ABI says are passed
    by value. But when targeting SYS V ABI (which I haven't attempted yet),
    it may need to know the internal layout of a struct.

    You can however do experiments with using SYS V on Linux (must be 64 bits):

    * Create test structs with, say, int32 or int64 elements

    * Write a test function where such a struct is passed by value, and
    then return a modified copy

    * Rerun the test using a version of the function where a char[] version
    of the struct is passed and returned, and which contains the member
    access casts you suggested

    * See if it gives the same results.

    You might need a union of the two structs, or use memcpy to transfer
    contents, before and after calling the test function.
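The experiment outlined above might be set up along these lines. This is a sketch under several assumptions: the struct layout (struct pair), the function names, and the use of memcpy with offsetof instead of pointer casts (chosen here to sidestep alignment and aliasing questions) are all illustrative choices. Since C cannot pass a bare array by value, the byte block is wrapped in a struct.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct pair { int32_t a; int64_t b; };

/* Byte-block stand-in for struct pair, wrapped so it can be passed
   and returned by value. */
struct pair_bytes { unsigned char raw[sizeof(struct pair)]; };

/* Ordinary version: struct passed and returned by value. */
struct pair bump_struct(struct pair p)
{
    p.a += 1;
    p.b *= 2;
    return p;
}

/* Byte-block version: members are reached by their offsets. */
struct pair_bytes bump_bytes(struct pair_bytes p)
{
    int32_t a;
    int64_t b;
    memcpy(&a, p.raw + offsetof(struct pair, a), sizeof a);
    memcpy(&b, p.raw + offsetof(struct pair, b), sizeof b);
    a += 1;
    b *= 2;
    memcpy(p.raw + offsetof(struct pair, a), &a, sizeof a);
    memcpy(p.raw + offsetof(struct pair, b), &b, sizeof b);
    return p;
}

/* Returns 1 if both versions compute the same member values. */
int same_result(int32_t a, int64_t b)
{
    struct pair in, out1, out2;
    struct pair_bytes bin, bout;

    memset(&in, 0, sizeof in);   /* give padding bytes a defined value */
    in.a = a;
    in.b = b;
    out1 = bump_struct(in);

    memcpy(bin.raw, &in, sizeof in);
    bout = bump_bytes(bin);
    memcpy(&out2, bout.raw, sizeof out2);

    return out1.a == out2.a && out1.b == out2.b;
}
```

This only checks that the two versions agree on values. It does not show that a caller using one form can call a callee compiled for the other; that depends on how the ABI classifies each type and would need a test that mixes the two across a call boundary.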
  • From bart@bc@freeuk.com to comp.lang.c on Tue Dec 17 19:45:49 2024
    From Newsgroup: comp.lang.c

    On 17/12/2024 19:40, Lawrence D'Oliveiro wrote:
    On Tue, 17 Dec 2024 12:04:29 +0000, bart wrote:

    Information about it is quite elusive ...

    Did you try the usual place for Web-related stuff?

    <https://developer.mozilla.org/en-US/docs/WebAssembly>

    That's all at the wrong level, eg:

    "When you've written code in C/C++, you can then compile it into Wasm
    using a tool like Emscripten"

    It's not aimed at people /implementing/ such a tool.

  • From Keith Thompson@Keith.S.Thompson+u@gmail.com to comp.lang.c on Tue Dec 17 12:13:15 2024
    From Newsgroup: comp.lang.c

    bart <bc@freeuk.com> writes:
    On 17/12/2024 01:19, Keith Thompson wrote:
    bart <bc@freeuk.com> writes:
    [SNIP]
    In that case I've no idea what you were trying to say.

    When somebody says that 'goto' can emulate any control structure, then
    clearly some of them need to be conditional; that is implied.

    Your reply suggested that you can do away with 'goto', and use
    recursive functions, in a scenario where no other control structures
    need exist.

    OK, if this is not for an IL, then it's not a language I would care
    for either. Why tie one hand behind your back for no good reason?
    I read Janis's post. I saw a suggestion that certain constructs are
    *theoretically* unnecessary. I saw no suggestion of any advocacy for
    such an approach.
    """
    A 'goto' may be used but it isn't strictly *necessary*. What *is*
    necessary, though, that is an 'if' (some conditional branch), and
    either 'goto' or recursive functions.
    """

    This doesn't actually make much sense. So 'goto' is necessary, but
    'goto' *is*?

    I presume you didn't write what you intended to write. Responding to
    what I *think* you meant :

    Either
    "if" and "goto"
    or
    "if" and recursive functions
    are theoretically sufficient to express certain kinds of algorithms
    (I'm handwaving a bit). Which implies that "goto" is not strictly
    necessary. It also implies that recursive functions are not strictly
    necessary if you have "goto".
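As a small illustration of the two minimal combinations, here is Euclid's algorithm written once with only "if" and "goto" and once with only "if" and a recursive call. The function names are hypothetical and the example is a sketch of the equivalence, not a recommendation of either style.

```c
/* "if" + "goto": the loop is spelled out with labels and jumps. */
unsigned gcd_goto(unsigned a, unsigned b)
{
    unsigned t;
loop:
    if (b == 0) goto done;
    t = a % b;
    a = b;
    b = t;
    goto loop;
done:
    return a;
}

/* "if" + recursion: the jump back to 'loop' becomes a call whose
   arguments carry the updated values of a and b. */
unsigned gcd_rec(unsigned a, unsigned b)
{
    if (b == 0) return a;
    return gcd_rec(b, a % b);
}
```

Both functions perform the same sequence of remainder steps; only the mechanism that repeats the step differs.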

    Since this is comp.lang.c, not comp.theory (or what comp.theory was
    intended to be), I'm not going to go into the details, nor am I going to
    take the time to express the concept in mathematically rigorous terms.

    If you try to extract any meaning, it is that any control flow can be expressed either with 'goto' or with 'recursive functions'.

    Yes, either of those plus "if". It appears you understand the point.

    This is what I picked up on. Who on earth would eschew 'goto' and use
    such a disproportionately more complex and inefficient method like
    recursive functions?

    Perhaps it wasn't clear initially, but it should be by now,
    that Janis was talking about what's theoretically sufficient to
    express general algorithms. You seized on the silly idea that
    Janis was *advocating* the use of one of the two minimal methods in
    an intermediate language for a compiler. The idea Janis brought
    up (briefly, in passing) is about theoretical computer science,
    not practical software engineering. (Janis, please correct me if
    I'm mistaken.)

    Repeatedly asking why anyone would do such a thing misses the point.

    [...]
    --
    Keith Thompson (The_Other_Keith) Keith.S.Thompson+u@gmail.com
    void Void(void) { Void(); } /* The recursive call of the void */
  • From Lawrence D'Oliveiro@ldo@nz.invalid to comp.lang.c on Tue Dec 17 22:25:44 2024
    From Newsgroup: comp.lang.c

    On Tue, 17 Dec 2024 19:45:49 +0000, bart wrote:

    On 17/12/2024 19:40, Lawrence D'Oliveiro wrote:

    On Tue, 17 Dec 2024 12:04:29 +0000, bart wrote:

    Information about it is quite elusive ...

    Did you try the usual place for Web-related stuff?

    <https://developer.mozilla.org/en-US/docs/WebAssembly>

    It's not aimed at people /implementing/ such a tool.

    It is aimed at those capable of following the links to relevant specs.
  • From bart@bc@freeuk.com to comp.lang.c on Tue Dec 17 22:45:14 2024
    From Newsgroup: comp.lang.c

    On 17/12/2024 18:46, Waldek Hebisch wrote:
    bart <bc@freeuk.com> wrote:

    If you try to extract any meaning, it is that any control flow can be
    expressed either with 'goto' or with 'recursive functions'.

    This is what I picked up on. Who on earth would eschew 'goto' and use
    such a disproportionately more complex and inefficient method like
    recursive functions?

    Due to a silly coding standard? Or in a language that does not have
    'goto'.

    It was suggested that 'theoretically', 'goto' could be replaced by
    recursive function calls.

    Whether still within the context of a language with no other control
    flow instructions, is not known. The suggester also chose not to share examples of how it would work.


  • From bart@bc@freeuk.com to comp.lang.c on Tue Dec 17 22:55:53 2024
    From Newsgroup: comp.lang.c

    On 17/12/2024 22:25, Lawrence D'Oliveiro wrote:
    On Tue, 17 Dec 2024 19:45:49 +0000, bart wrote:

    On 17/12/2024 19:40, Lawrence D'Oliveiro wrote:

    On Tue, 17 Dec 2024 12:04:29 +0000, bart wrote:

    Information about it is quite elusive ...

    Did you try the usual place for Web-related stuff?

    <https://developer.mozilla.org/en-US/docs/WebAssembly>

    It's not aimed at people /implementing/ such a tool.

    It is aimed at those capable of following the links to relevant specs.

    It's also a pretty terrible link. Trying to extract useful info a snippet
    at a time is like pulling teeth. Here I was merely after an example of
    WASM textual format.

    WASM is somewhat like LLVM in that the docs are so extensive that
    they become impossible.

    Show me (I assume you know all about it) how to write Hello, World in
    WAT format, and what tool I need to download and use to run it. On Windows.

    I can do it with my IL in half a page.
  • From antispam@antispam@fricas.org (Waldek Hebisch) to comp.lang.c on Wed Dec 18 00:23:12 2024
    From Newsgroup: comp.lang.c

    bart <bc@freeuk.com> wrote:
    On 17/12/2024 18:46, Waldek Hebisch wrote:
    bart <bc@freeuk.com> wrote:

    If you try to extract any meaning, it is that any control flow can be
    expressed either with 'goto' or with 'recursive functions'.

    This is what I picked up on. Who on earth would eschew 'goto' and use
    such a disproportionately more complex and inefficient method like
    recursive functions?

    Due to a silly coding standard? Or in a language that does not have
    'goto'.

    It was suggested that 'theoretically', 'goto' could be replaced by
    recursive function calls.

    Whether still within the context of a language with no other control
    flow instructions, is not known. The suggester also chose not to share examples of how it would work.

    The example I gave (and you snipped) was supposed to explain how
    the technique works, but it seems that it is not enough. So
    let us look at another example. Start from ordinary C code that
    only uses global variables (this is not strictly necessary, but
    let us make such an assumption for simplicity):

    int n;
    int * a;
    int b;
    int i;

    ...
    /* Simple search loop */
    for(i = 0; i < n; i++) {
    if (a[i] == b) {
    break;
    }
    }

    First, express flow control using only conditional and unconditional
    jump:

    l0:
    i = 0;
    goto l3;
    l1:
    int c1 = a[i] == b;
    if (c1) {
    goto l4;
    } else {
    goto l2;
    }
    l2:
    i++;
    l3:
    int c2 = i < n;
    if (c2) {
    goto l1;
    } else {
    goto l4;
    }
    l4:
    ;

    Note, I introduced more jumps than strictly necessary, so that
    hunks between labels end either in conditional or unconditional
    jump.

    Next, replace each hunk starting at a label, up to (but not
    including) next label, by a new function. Replace final jumps
    by function calls, for conditional jumps using the same trick
    as in previous 'silly' example:

    int n;
    int * a;
    int b;
    int i;

    void l2(void);
    void l3(void);
    void l4(void);

    void l0(void) {
    i = 0;
    l3();
    }

    void l1(void) {
    void (*(t[2]))(void) = {l4, l2};
    int c1 = a[i] == b;
    (*(t[c1]))();
    }

    void l2(void) {
    i++;
    l3();
    }

    void l3(void) {
    void (*(t[]))(void) = {l1, l4};
    int c2 = i < n;
    (*(t[c2]))();
    }

    void l4(void) {
    }

    Note: 'l4' is different from the other functions: instead of calling
    something it returns, ensuring that the sequence of calls
    eventually terminates.

    I hope that principles are clear now. If you compile this
    with gcc at -O2 you will see that there are no calls
    in generated code, only jumps. Slightly better code is
    generated by clang. Note that generated code uses stack
    only for final return.

    BTW: you can see that currently tcc does not support this
    coding style, that is, code generated by tcc duly performs
    all calls, possibly leading to stack overflow and to
    lower performance. Code generated by tcc from the "jumpy"
    version looks slightly worse than code generated by
    clang from the version using calls.
    --
    Waldek Hebisch
  • From bart@bc@freeuk.com to comp.lang.c on Wed Dec 18 01:24:42 2024
    From Newsgroup: comp.lang.c

    On 18/12/2024 00:23, Waldek Hebisch wrote:
    bart <bc@freeuk.com> wrote:
    On 17/12/2024 18:46, Waldek Hebisch wrote:
    bart <bc@freeuk.com> wrote:

    If you try to extract any meaning, it is that any control flow can be
    expressed either with 'goto' or with 'recursive functions'.

    This is what I picked up on. Who on earth would eschew 'goto' and use
    such a disproportionately more complex and inefficient method like
    recursive functions?

    Due to a silly coding standard? Or in a language that does not have
    'goto'.

    It was suggested that 'theoretically', 'goto' could be replaced by
    recursive function calls.

    Whether still within the context of a language with no other control
    flow instructions, is not known. The suggester also chose not to share
    examples of how it would work.

    The example I gave (and you snipped) was supposed to explain how
    the technique works, but it seems that it is not enough.

    It showed how to do conditional code without explicit branching. It
    didn't seem to me to cover arbitrary gotos, or where recursion comes
    into it.

    (Actually I implemented it in my two languages to compare performance to 'straight' versions, however my test called silly() lots of times so it
    wasn't a good test.)

    So
    let us look at another example. Start from ordinary C code that
    only uses global variables (this is not strictly necessary, but
    let us make such an assumption for simplicity):

    int n;
    int * a;
    int b;
    int i;

    ...
    /* Simple search loop */
    for(i = 0; i < n; i++) {
    if (a[i] == b) {
    break;
    }
    }

    First, express flow control using only conditional and unconditional
    jump:

    l0:
    i = 0;
    goto l3;
    l1:
    int c1 = a[i] == b;
    if (c1) {
    goto l4;
    } else {
    goto l2;
    }
    l2:
    i++;
    l3:
    int c2 = i < n;
    if (c2) {
    goto l1;
    } else {
    goto l4;
    }
    l4:
    ;

    Note, I introduced more jumps than strictly necessary, so that
    hunks between labels end either in conditional or unconditional
    jump.

    Next, replace each hunk starting at a label, up to (but not
    including) next label, by a new function. Replace final jumps
    by function calls, for conditional jumps using the same trick
    as in previous 'silly' example:

    int n;
    int * a;
    int b;
    int i;

    void l2(void);
    void l3(void);
    void l4(void);

    void l0(void) {
    i = 0;
    l3();
    }

    void l1(void) {
    void (*(t[2]))(void) = {l4, l2};
    int c1 = a[i] == b;
    (*(t[c1]))();
    }

    void l2(void) {
    i++;
    l3();
    }

    void l3(void) {
    void (*(t[]))(void) = {l1, l4};
    int c2 = i < n;
    (*(t[c2]))();
    }

    void l4(void) {
    }

    Note: 'l4' is different from the other functions: instead of calling
    something it returns, ensuring that the sequence of calls
    eventually terminates.

    OK thanks for this. I tried to duplicate it based on this starting point:

    #include <stdio.h>

    int n=6;
    int a[]={10,20,30,40,50,60};
    int b=30;
    int i;

    int main(void) {
    for(i = 0; i < n; i++) {
    printf("%d\n",a[i]);
    if (a[i] == b) {
    break;
    }
    }
    }

    This prints 10 20 30 as it is. But the version with the function calls
    showed only '10'. If I swapped '{l1, l4}' in l3(), then I got '10 10 20'.

    I didn't spend too long to debug it further. I will take your word that
    this works. (I tried 3 compilers all with the same results, including TCC.)

    I don't fully understand it; what I got was that you first produce
    linear code with labels. Each span between labels is turned into a
    function. To 'step into' label L, or jump to L, I have to do L().

    There would still be lots of questions (even ignoring the problems of accessing locals), like what the return path is, or how an early return
    would work (also returning a value). Or what kind of pressure the stack
    would be under.

    It looks like a crude form of threaded code (which, when I use that,
    never returns, and it doesn't use a stack either).

    I've seen enough to know that it would be the last kind of IL I would choose (unless it was the last IL left in the world - then maybe).

    There is also the oddity that eliminating a simple kind of branching
    relies on more elaborate branching: call and return mechanisms.

    More interesting and more practical would be replacing call/return by
    'goto'! (It would need to support label pointers or indirect jumps,
    unless runtime code modification was allowed.)


    (my test)
    --------------------------
    #include <stdio.h>

    int n=6;
    int a[]={10,20,30,40,50,60};
    int b=30;
    int i;

    void k2(void);
    void k3(void);
    void k4(void);

    void k0(void) {
    i = 0;
    k3();
    }

    void k1(void) {
    void (*(t[2]))(void) = {k4, k2};
    printf("%d\n",a[i]);
    int c1 = a[i] == b;
    (*(t[c1]))();
    }

    void k2(void) {
    i++;
    // k3();
    }

    void k3(void) {
    void (*(t[]))(void) = {k4, k1};
    int c2 = i < n;
    (*(t[c2]))();
    }

    void k4(void) {
    }

    int main(void) {
    k0();
    k1();
    k2();
    k3();
    k4();
    }



  • From antispam@antispam@fricas.org (Waldek Hebisch) to comp.lang.c on Wed Dec 18 03:51:17 2024
    From Newsgroup: comp.lang.c

    bart <bc@freeuk.com> wrote:
    On 18/12/2024 00:23, Waldek Hebisch wrote:
    bart <bc@freeuk.com> wrote:
    On 17/12/2024 18:46, Waldek Hebisch wrote:
    bart <bc@freeuk.com> wrote:

    If you try to extract any meaning, it is that any control flow can be
    expressed either with 'goto' or with 'recursive functions'.

    This is what I picked up on. Who on earth would eschew 'goto' and use
    such a disproportionately more complex and inefficient method like
    recursive functions?

    Due to a silly coding standard? Or in a language that does not have
    'goto'.

    It was suggested that 'theoretically', 'goto' could be replaced by
    recursive function calls.

    Whether still within the context of a language with no other control
    flow instructions, is not known. The suggester also chose not to share
    examples of how it would work.

    The example I gave (and you snipped) was supposed to explain how
    the technique works, but it seems that it is not enough.

    It showed how to do conditional code without explicit branching. It
    didn't seem to me to cover arbitrary gotos, or where recursion comes
    into it.

    (Actually I implemented it in my two languages to compare performance to 'straight' versions, however my test called silly() lots of times so it wasn't a good test.)

    So
    let us look at another example. Start from ordinary C code that
    only uses global variables (this is not strictly necessary, but
    let us make such an assumption for simplicity):

    int n;
    int * a;
    int b;
    int i;

    ...
    /* Simple search loop */
    for(i = 0; i < n; i++) {
    if (a[i] == b) {
    break;
    }
    }

    First, express flow control using only conditional and unconditional
    jump:

    l0:
    i = 0;
    goto l3;
    l1:
    int c1 = a[i] == b;
    if (c1) {
    goto l4;
    } else {
    goto l2;
    }
    l2:
    i++;
    l3:
    int c2 = i < n;
    if (c2) {
    goto l1;
    } else {
    goto l4;
    }
    l4:
    ;

    Note, I introduced more jumps than strictly necessary, so that
    hunks between labels end either in conditional or unconditional
    jump.

    Next, replace each hunk starting at a label, up to (but not
    including) next label, by a new function. Replace final jumps
    by function calls, for conditional jumps using the same trick
    as in previous 'silly' example:

    int n;
    int * a;
    int b;
    int i;

    void l2(void);
    void l3(void);
    void l4(void);

    void l0(void) {
    i = 0;
    l3();
    }

    void l1(void) {
    void (*(t[2]))(void) = {l4, l2};
    ^^^^^^^
    Should be
    l2, l4
    int c1 = a[i] == b;
    (*(t[c1]))();
    }

    void l2(void) {
    i++;
    l3();
    }

    void l3(void) {
    void (*(t[]))(void) = {l1, l4};
    ^^^^^^
    Should be
    l4, l1
    int c2 = i < n;
    (*(t[c2]))();
    }

    void l4(void) {
    }

    Note: 'l4' is different from the other functions: instead of calling
    something it returns, ensuring that the sequence of calls
    eventually terminates.

    OK thanks for this. I tried to duplicate it based on this starting point:

    #include <stdio.h>

    int n=6;
    int a[]={10,20,30,40,50,60};
    int b=30;
    int i;

    int main(void) {
    for(i = 0; i < n; i++) {
    printf("%d\n",a[i]);
    if (a[i] == b) {
    break;
    }
    }
    }

    This prints 10 20 30 as it is. But the version with the function calls showed only '10'. If I swapped '{l1, l4}' in l3(), then I got '10 10 20'.

    Sorry, there was a thinko: 1 is true and this is the second element
    of the array, while I was thinking that the first one is true branch
    and second is false branch.
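With the table order fixed so that index 0 is the false target and index 1 the true target, the search example can be written as below. This is a sketch based on the test data posted earlier in the thread; the visited counter (standing in for the printf call) and the search_index wrapper are additions assumed for this illustration.

```c
static int n = 6;
static int a[] = {10, 20, 30, 40, 50, 60};
static int b = 30;
static int i;
static int visited;   /* counts elements examined, in place of printf */

static void l1(void);
static void l2(void);
static void l3(void);
static void l4(void);

static void l0(void)
{
    i = 0;
    l3();
}

static void l1(void)
{
    void (*t[2])(void) = {l2, l4};   /* [0]: no match, [1]: match */
    int c1 = a[i] == b;
    visited++;
    t[c1]();
}

static void l2(void)
{
    i++;
    l3();
}

static void l3(void)
{
    void (*t[2])(void) = {l4, l1};   /* [0]: loop done, [1]: loop body */
    int c2 = i < n;
    t[c2]();
}

static void l4(void)
{
    /* Exit point: returns instead of calling anything. */
}

/* Runs the search and returns the final value of i. */
int search_index(void)
{
    visited = 0;
    l0();
    return i;
}
```

With this data the chain of calls examines 10, 20 and 30 and stops with i equal to 2, matching the output of the plain for loop.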

    I didn't spend too long to debug it further. I will take your word that
    this works. (I tried 3 compilers all with the same results, including TCC.)

    I don't fully understand it; what I got was that you first produce
    linear code with labels. Each span between labels is turned into a
    function. To 'step into' label L, or jump to L, I have to do L().

    Yes.

    There would still be lots of questions (even ignoring the problems of accessing locals), like what the return path is, or how an early return would work (also returning a value). Or what kind of pressure the stack would be under.

    OK, you take a function F, it has some arguments and local variables.
    And some return type. You create an "entry function" that takes the
    same arguments as F and has the same return type as F. You transform
    the body as above, but now each new function has the same return type
    as F, and its arguments are the arguments of the original function plus
    extra arguments, one for each local variable of F. In the "entry function"
    you call the function corresponding to the first label, passing it the
    arguments and the initial values of the local variables of F. In each
    subsequent call you pass the arguments and values around so that they
    are available in each new function. And the call is the argument of a
    return statement. When you want to return, you simply return the value,
    without performing a call.
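A sketch of that recipe applied to a small function with one argument and two locals. The function sum_to and the names f_test and f_body are hypothetical, and 'if' is kept for brevity rather than replaced by a table of function pointers.

```c
/*
 * Original function F:
 *
 *   int sum_to(int x)
 *   {
 *       int i, v;
 *       v = 0;
 *       for (i = x; i > 0; i--)
 *           v = v + i;
 *       return v;
 *   }
 */

/* Each piece takes F's argument x plus one parameter per local (i, v)
   and has F's return type. */
static int f_body(int x, int i, int v);

/* Loop test: either continue into the body or return the result. */
static int f_test(int x, int i, int v)
{
    if (i > 0)
        return f_body(x, i, v);   /* call is the operand of 'return' */
    return v;                     /* plain return, no further call   */
}

/* Loop body plus increment, then back to the test. */
static int f_body(int x, int i, int v)
{
    v = v + i;
    i = i - 1;
    return f_test(x, i, v);
}

/* Entry function: same signature as F, supplies the initial values
   of the locals (i = x, v = 0). */
int sum_to(int x)
{
    return f_test(x, x, 0);
}
```

Every call sits directly in a return statement with matching types, so a compiler that performs tail call optimization can turn each one into a jump; without that optimization the stack depth grows with x.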

    Stack use depends on optimizations in your compiler. With a small
    effort the compiler can recognize that it will return the value (possibly
    void) from a call and replace such a call by stack+register
    shuffling + jump. Actually when there is a return value, you
    have something like

    return lx(a0, a1, ..., ak);

    which is easy to recognize due to the 'return' keyword. One also
    needs to check that the types agree (C automatically applies integer
    conversions, but such conversions may produce real code, so in
    such a case one needs a normal call). In the void case one needs to
    check that the call is textually the last thing or that
    it is followed by a return statement. Stack+register
    shuffling may require some code before the control transfer, but
    the call can be replaced by a jump.

    So, if compiler has tail call optimization, then there is no
    more stack use than maximum needed by any of the functions.

    Note: I described the general transformation, partially to show
    that 'if' is _not_ needed. But a similar style is used to
    write code by hand. In hand-written code people do not
    bother with transforming 'if', which makes tail call
    optimization a bit more complicated. OTOH, unlike the rather
    ugly code produced by mechanical transformation, hand-written
    code depending on tail call optimization may be quite
    nice and readable. There is potential trouble: sometimes
    the author thinks that a call is a tail call, but the compiler
    disagrees, leading to lower efficiency.

    Of course, when the compiler does not have tail call optimization,
    then stack use may be quite high.

    It looks like a crude form of threaded code (which, when I use that,
    never returns, and it doesn't use a stack either).

    IMO it is quite different than what I know as threaded code.

    I've seen enough to know that it would be the last kind of IL I would choose (unless it was the last IL left in the world - then maybe).

    There is also the oddity that eliminating a simple kind of branching
    relies on more elaborate branching: call and return mechanisms.

    One motivation for eliminating 'goto' is that it is not easy to
    say what effect 'goto' has on variables. I mean, variables keep
    their values, but when you may arrive at a given point from several
    places then the values of variables depend on the place that control
    came from, and this may be hard to analyze. In a sense functions have
    the same problem, but there is a well-developed technique to reason
    about function calls. So both jumps and function calls are
    hard to analyze, but eliminating jumps allows re-use of work
    done for functions.

    More interesting and more practical would be replacing call/return by 'goto'! (It would need to support label pointers or indirect jumps,
    unless runtime code modification was allowed.)

    The point is that calls are strictly more powerful than jumps
    (you get parameter passing and local variables).
    --
    Waldek Hebisch
  • From Lawrence D'Oliveiro@ldo@nz.invalid to comp.lang.c on Wed Dec 18 05:55:31 2024
    From Newsgroup: comp.lang.c

    On Tue, 17 Dec 2024 22:55:53 +0000, bart wrote:

    On 17/12/2024 22:25, Lawrence D'Oliveiro wrote:

    On Tue, 17 Dec 2024 19:45:49 +0000, bart wrote:

    It's not aimed at people /implementing/ such a tool.

    It is aimed at those capable of following the links to relevant specs.

    It's also a pretty terrible link.

    Did you see this link <https://developer.mozilla.org/en-US/docs/WebAssembly/Reference>? Lots of examples from there.
  • From bart@bc@freeuk.com to comp.lang.c on Wed Dec 18 12:08:24 2024
    From Newsgroup: comp.lang.c

    On 17/12/2024 18:51, BGB wrote:
    On 12/17/2024 6:04 AM, bart wrote:

    C can apparently compile to WASM via Clang, so I tried this program:

      void F(void) {
         int i=0;
         while (i<10000) ++i;
      }

    which compiled to 128 lines of WASM (technically, some form of 'WAT',
    as WASM is a binary format). The 60 lines corresponding to F are
    shown below, and below that, is my own stack IL code.

    I'm not even sure what format that code is in, as WAT is supposed to use S-expressions. The generated code is flat. It differs in other ways from examples of WAT.

    Hmm... It looks like the WASM example is already trying to follow SSA
    rules, then mapped to a stack IL... Not necessarily the best way to do
    it IMO.

    I hadn't considered that SSA could be represented in stack form.

    But couldn't each push be converted to an assignment to a fresh
    variable, and the same with pop?

    As for Phi functions, the only similar thing I encounter (but could be mistaken), is when there is a choice of paths to yield a value (such as
    (c ? a : b) in C; my language has several such constructs).

    With stack code, the result conveniently ends up on top of the stack
    whichever path is taken, which is a big advantage. Unless you then have
    to convert that to register code, and need to ensure the values end up
    in the same register when the control paths join up again.


    But, yeah, in BGBCC I am also using a stack-based IL (RIL), which
    follows rules more in a similar category to .NET CIL (in that, stack
    items carry type, and the stack is generally fully emptied on branch).


    In my IL, labels are identified with a LABEL opcode (with an immediate),
    and things like branches work by having the branch target and label
    having the same immediate (label ID).

    So, you jump to label L123, and the label looks like:

    L123:

    I think that is pretty standard! But it sounds like you use a very tight encoding for bytecode, while mine uses a 32-byte descriptor for each IL instruction.

    (One quibble with labels is whether a label definition occupies an
    actual IL instruction. With my IL used as a backend for static
    languages, it does. And there can be clusters of labels at the same spot.

    With dynamic bytecode designed for interpretation, it doesn't. It uses a different structure. This means labels don't need to be 'executed' when encountered.)


  • From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Wed Dec 18 17:19:01 2024
    From Newsgroup: comp.lang.c

    On 17.12.2024 21:13, Keith Thompson wrote:
    bart <bc@freeuk.com> writes:
    [...]

    [...]
    (Janis, please correct me if I'm mistaken.)

    I think it couldn't have been explained clearer. - Thanks.

    Janis

  • From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Wed Dec 18 17:26:49 2024
    From Newsgroup: comp.lang.c

    On 17.12.2024 19:46, Waldek Hebisch wrote:
    bart <bc@freeuk.com> wrote:
    [...]

    [ ponderings about where recursive functions might be used ]

    Due to a silly coding standard? Or in a language that does not have
    'goto'.

    (I'd rule out the "coding standards" hypothesis.)

    Languages without 'goto', I suppose, would either have other control
    constructs ('while', etc.) to formulate in an imperative style, or
    be of the Functional Programming Languages type.

    Janis

  • From BGB@cr88192@gmail.com to comp.lang.c on Wed Dec 18 12:50:20 2024
    From Newsgroup: comp.lang.c

    On 12/18/2024 6:08 AM, bart wrote:
    On 17/12/2024 18:51, BGB wrote:
    On 12/17/2024 6:04 AM, bart wrote:

    C can apparently compile to WASM via Clang, so I tried this program:

      void F(void) {
         int i=0;
         while (i<10000) ++i;
      }

    which compiled to 128 lines of WASM (technically, some form of 'WAT',
    as WASM is a binary format). The 60 lines corresponding to F are
    shown below, and below that, is my own stack IL code.

    I'm not even sure what format that code is in, as WAT is supposed to use S-expressions. The generated code is flat. It differs in other ways from examples of WAT.


    Dunno there...

    It looks like WASM has changed slightly from what I remember when I
    originally looked at it, so it could be "possible" if it could be made
    to support separate compilation and similar.


    Hmm... It looks like the WASM example is already trying to follow SSA
    rules, then mapped to a stack IL... Not necessarily the best way to do
    it IMO.

    I hadn't considered that SSA could be represented in stack form.

    But couldn't each push be converted to an assignment to a fresh
    variable, and the same with pop?

    As for Phi functions, the only similar thing I encounter (but could be mistaken), is when there is a choice of paths to yield a value (such as
    (c ? a : b) in C; my language has several such constructs).


    I was mostly noting that it appeared that every operation was creating a
    new variable and only assigning to it once.

    I didn't look too much more closely than this, only to note that it was different.


    With stack code, the result conveniently ends up on top of the stack whichever path is taken, which is a big advantage. Unless you then have
    to convert that to register code, and need to ensure the values end up
    in the same register when the control paths join up again.


    With JVM, the rule was that all paths landing at the same label need to
    have the same stack depth and same types.

    With .NET, the rule was that the stack was always empty, any merging
    would need to be done using variables.


    BGBCC is sorta mixed:
    In most cases, it follows the .NET rule;
    A special-case exception exists mostly for implementing the ?: operation (which in turn has special stack operations to signal its use).

    BEGINU // start a ?: operator
    L0:
    ... //one case
    SETU
    JMP L2
    L1:
    ... //other case
    SETU
    JMP L2
    ENDU
    L2:


    This is a bit of wonk; if I were designing it now, I would likely do it
    the same as .NET, and use temporary variables.
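
    For comparison, the temporary-variable approach could be sketched in C-as-IL form roughly as below. This is only an illustration, not actual BGBCC or .NET output; the names select_lowered and t0 are made up for the example. Both arms store to the same temporary, so nothing needs to be live on the operand stack at the join label.

```c
#include <assert.h>

/* Lowering of "r = c ? a : b" using gotos and one shared temporary.
   Hypothetical names; a sketch, not the output of any real compiler. */
int select_lowered(int c, int a, int b)
{
    int t0;                 /* temporary written by both arms */

    if (!c) goto L1;
    t0 = a;                 /* one case */
    goto L2;
L1:
    t0 = b;                 /* other case */
L2:
    /* The lowered form must agree with the source-level expression. */
    assert(t0 == (c ? a : b));
    return t0;
}
```

    Calling select_lowered(1, 10, 20) takes the first arm, select_lowered(0, 10, 20) the second.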


    Actually, I might be tempted to use a 3AC IR as well (though, probably non-SSA). And, probably design things a bit differently.


    In this case, if I did a 3AC IR, might design a textual syntax along
    similar lines to BASIC or FORTRAN 77 (albeit probably without the
    fixed-column formatting or line numbers).

    Though, the nominal format for use in the compiler would remain binary.



    But, yeah, in BGBCC I am also using a stack-based IL (RIL), which
    follows rules more in a similar category to .NET CIL (in that, stack
    items carry type, and the stack is generally fully emptied on branch).


    In my IL, labels are identified with a LABEL opcode (with an
    immediate), and things like branches work by having the branch target
    and label having the same immediate (label ID).

    So, you jump to label L123, and the label looks like:

      L123:


    Yeah, in textual form.
    Though, the label is internally represented as, say:
    LABEL 123

    IIRC, usually numbering starts over from 0 for each function, though in
    the backend IR all labels get a unique number within a 24-bit numbering
    space.

    The labels are then split into several categories:
    Global labels, used to identify functions/variables, with an associated
    name;
    IL labels, which were mapped over from the RIL bytecode;
    Temporary labels, which exist solely in the backend;
    Line numbers, not true labels, mostly exist to convey line-number info (associated with a file-name and line number);
    Special/Architectural, used as placeholders for things like CPU
    registers (for variable load/store).


    I think that is pretty standard! But it sounds like you use a very tight encoding for bytecode, while mine uses a 32-byte descriptor for each IL instruction.

    (One quibble with labels is whether a label definition occupies an
    actual IL instruction. With my IL used as a backend for static
    languages, it does. And there can be clusters of labels at the same spot.

    With dynamic bytecode designed for interpretation, it doesn't. It uses a different structure. This means labels don't need to be 'executed' when encountered.)


    In my interpreters, it always uses a bytecode operation.
    However, apart from my very early interpreters, typically the stack IL
    is not used directly.

    So, a personal timeline was like:
    2003/2004: BGBScript came into existence
    First version used DOM and directly walked the DOM tree.
    Used a GC, generated lots of garbage objects;
    Syntax was based on JavaScript with some wonk;
    Was horridly slow.
    2006:
    BGBScript VM (BS-VM) was rewritten to S-Expressions internally;
    Dropped some of the original wonk, moving to a cleaner JS syntax;
    Went to a bytecode interpreter.
    2007:
    BGBCC was written using the frontend from the 2003 VM as a base;
    The IL design was based on 2006 BS-VM;
    Replaced the original DOM with a custom stand-in;
    Used parts of the 2006 VM as well.
    2009:
    The BS-VM was modified to turn the stack IL into 3AC and run this;
    Also had a JIT and similar by this point;
    Using 3AC and JIT made things significantly faster;
    Also tended to leak a lot less garbage,
    operating mostly at "steady state".
    Syntactically, it had become more like ActionScript3 or HaXE.
    2013: Created BGBScript2 (BS2)
    This mostly resembled a Java/C#/AS3 hybrid;
    Eliminated the GC in favor of primarily static + manual MM.
    2015/2016: Created the BGBTech2 3D engine
    Partly written in a mix of C and BGBScript2
    Was my biggest project to use BS2

    Then:
    2017: Started on my BJX1 project
    Revived BGBCC, used it as the compiler.
    2019: Rebooted the project to BJX2.
    BJX1 quickly turned into a huge mess
    which was non-viable to implement in an FPGA.
    Until now, BJX2 project has continued.

    Some stuff following the design of the BS2 VM was back-ported onto
    BGBCC, but in many ways, BGBCC has a lot more cruft.



    In the BS2 VM, the image format is a TLV container.
    There is a string table, data area for functions/etc;
    Index tables;
    ...
    Generally, functions could be loaded and converted to 3AC on demand.


    The IL in the BS2 VM was not a pure stack machine, but more like:
    OP with 2 stack args, stack dest (common with BGBCC)
    OP with 2 stack args, local dest (common with BGBCC)
    OP with 2 local args, stack dest
    OP with 2 local args, local dest (like in 3AC)
    OP with local and immediate, stack dest
    OP with local and immediate, local dest
    OP with local and stack, stack dest
    OP with local and stack, local dest

    This was more complicated, but reduced the number of IL operations. Internally, it all converted to 3AC for the backend interpreter.


    The incentive to do this for BGBCC was less, as folding the
    local-variable or constant-loads into the operator is less immediately beneficial to a compiler; but does make the bytecode loader more
    complicated. Folding the destination register into the bytecode ops in
    many cases is still relevant, as it is comparably harder to fold the destination-store into the 3AC op than to fold a source load.


    Generally, bytecode ops and operands were encoded with VLNs (variable
    length numbers).

    Generally (numeric VLN):
    00..7F: 0..127
    80..BF XX: 128..16383
    C0..DF XX XX: 16384..2M
    ...

    These values were encoded in MSB first order, and could directly
    represent values up to 64 bits (in both the BS2VM and BGBCC, 128-bit
    values tend to be represented as pairs of 64-bit values).

    For signed integer values, the sign was folded into the LSB.
    Floating point values were represented as a base/exponent VLN pair.
    Basically, an integer value scaled by a power-of-2 exponent.
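
    As a rough illustration of the integer part of this scheme, here is a minimal C sketch. It assumes the two-byte form uses the prefix range 0x80..0xBF and the three-byte form 0xC0..0xDF, and it covers only values below 2^21; the longer forms are elided in the description, so they are left out here too. The sign folding shown is the common "interleave negatives and positives" mapping, which is an assumption about what "folded into the LSB" means. All function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode v (< 2^21) MSB-first. Returns the number of bytes written (1..3). */
size_t vln_encode(uint32_t v, uint8_t *out)
{
    assert(v < (1u << 21));
    if (v < 0x80) {                          /* 0xxxxxxx */
        out[0] = (uint8_t)v;
        return 1;
    }
    if (v < 0x4000) {                        /* 10xxxxxx XX */
        out[0] = (uint8_t)(0x80 | (v >> 8));
        out[1] = (uint8_t)(v & 0xFF);
        return 2;
    }
    out[0] = (uint8_t)(0xC0 | (v >> 16));    /* 110xxxxx XX XX */
    out[1] = (uint8_t)((v >> 8) & 0xFF);
    out[2] = (uint8_t)(v & 0xFF);
    return 3;
}

/* Decode one value. Returns bytes consumed, or 0 for an unsupported prefix. */
size_t vln_decode(const uint8_t *in, uint32_t *v)
{
    uint8_t b = in[0];
    if (b < 0x80) { *v = b; return 1; }
    if (b < 0xC0) { *v = ((uint32_t)(b & 0x3F) << 8) | in[1]; return 2; }
    if (b < 0xE0) {
        *v = ((uint32_t)(b & 0x1F) << 16) | ((uint32_t)in[1] << 8) | in[2];
        return 3;
    }
    return 0;
}

/* Sign folded into the LSB: 0, -1, 1, -2, ... map to 0, 1, 2, 3, ... */
uint64_t sign_fold(int64_t n)
{
    return n < 0 ? ((((uint64_t)(-(n + 1))) << 1) | 1) : ((uint64_t)n << 1);
}

int64_t sign_unfold(uint64_t u)
{
    return (u & 1) ? -(int64_t)(u >> 1) - 1 : (int64_t)(u >> 1);
}
```

    With this layout, 300 (0x12C) encodes as the two bytes 81 2C. A signed operand would be passed through sign_fold before encoding and sign_unfold after decoding.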


    Opcodes were different, IIRC:
    00..DF: Single Byte
    E0..EF: Two Byte (224..4095)
    F0..F7: Three Byte
    ...

    But, generally, only 1 and 2 byte cases were used.
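
    A decoder for the opcode prefixes listed above might look like the following sketch. The exact bit layout is an assumption (low bits of the prefix byte are the high bits of the opcode number, MSB first), and opcode_decode is a made-up name. Prefixes from 0xF8 upward fall in the elided part of the table and are rejected.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Decode one opcode number. Returns bytes consumed, or 0 if the prefix
   is in the range not described here (0xF8 and above). */
size_t opcode_decode(const uint8_t *in, uint32_t *op)
{
    uint8_t b;

    assert(in != NULL && op != NULL);
    b = in[0];
    if (b < 0xE0) {                      /* 00..DF: single byte */
        *op = b;
        return 1;
    }
    if (b < 0xF0) {                      /* E0..EF: two bytes, 12 bits */
        *op = ((uint32_t)(b & 0x0F) << 8) | in[1];
        return 2;
    }
    if (b < 0xF8) {                      /* F0..F7: three bytes, 19 bits */
        *op = ((uint32_t)(b & 0x07) << 16) | ((uint32_t)in[1] << 8) | in[2];
        return 3;
    }
    return 0;
}
```

    Under these assumptions the byte pair E1 00 decodes to opcode 256, and EF FF to 4095.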

    IIRC, did not define a textual notation for the BS2VM's ASM.


    Local variables, labels, etc, were all identified as numeric indices.
    Typically a single byte.

    Like JVM, and unlike BGBCC, in the BS2VM, all the variables (including arguments) were held in an array of local variables (BGBCC has locals, arguments, and temporaries, as 3 separate spaces).

    IIRC, BS2VM had still used variable type-tagging (like BGBCC and .NET),
    rather than the untyped variables with typed operators scheme (what JVM
    had used).

    But, typed operators more make sense if you intend to interpret the
    stack bytecode directly, which was generally not done in my VMs (except
    in very early versions). Otherwise, implicitly typed operators probably
    make more sense.


    ...

  • From BGB@cr88192@gmail.com to comp.lang.c on Wed Dec 18 12:51:04 2024
    From Newsgroup: comp.lang.c

    On 12/17/2024 1:42 PM, bart wrote:
    On 17/12/2024 19:07, Thiago Adams wrote:
    Em 12/17/2024 3:37 PM, bart escreveu:
    On 17/12/2024 18:16, Thiago Adams wrote:


    Also remove structs, replacing them with unsigned char[] and casting
    parts of it to access members.

    I think this is the lowest level possible in C.

    This is what I do in my IL, where structs are just fixed blocks of so
    many bytes.


    How do you deal with struct parameters?



    In the IL they are always passed notionally by value. This side of the
    IL (that is, the frontend compile that generates IL), knows nothing
    about the target, such as ABI details.

    (In practice, some things are known, like the word size of the target,
    since that can change characteristics of the source language, like the
    size of 'int' or of 'void*'. It also needs to assume, or request from
    the backend, argument evaluation order, although my IL can reverse order
    if necessary.)

    It is the backend, on the other side of the IL, that needs to deal with those details.

    That can include making copies of structs that the ABI says are passed
    by value. But when targeting SYS V ABI (which I haven't attempted yet),
    it may need to know the internal layout of a struct.

    You can however do experiments with using SYS V on Linux (must be 64 bits):

    * Create test structs with, say, int32 or int64 elements

    * Write a test function where such a struct is passed by value, and
      then return a modified copy

    * Rerun the test using a version of the function where a char[] version
    of the struct is passed and returned, and which contains the member
    access casts you suggested

    * See if it gives the same results.

    You might need a union of the two structs, or use memcpy to transfer contents, before and after calling the test function.
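
    A minimal version of that experiment could look like the sketch below. The type and function names are invented for illustration. Since C cannot pass a bare char[] by value, the byte version is wrapped in a struct, and members are accessed with memcpy at offsetof positions rather than pointer casts, to stay clear of alignment and aliasing problems. Note that this only checks that both versions compute the same values; it does not show whether an ABI would pass the two structs in the same registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct Pair  { int32_t a; int64_t b; };                     /* typed version */
struct Bytes { unsigned char raw[sizeof(struct Pair)]; };   /* byte blob */

/* Typed struct passed by value, modified copy returned. */
struct Pair bump_typed(struct Pair p)
{
    p.a += 1;
    p.b *= 2;
    return p;
}

/* Same operation on the blob, reading and writing members by offset. */
struct Bytes bump_bytes(struct Bytes s)
{
    int32_t a;
    int64_t b;
    memcpy(&a, s.raw + offsetof(struct Pair, a), sizeof a);
    memcpy(&b, s.raw + offsetof(struct Pair, b), sizeof b);
    a += 1;
    b *= 2;
    memcpy(s.raw + offsetof(struct Pair, a), &a, sizeof a);
    memcpy(s.raw + offsetof(struct Pair, b), &b, sizeof b);
    return s;
}

/* Returns 1 when both versions produce the same member values. */
int same_result(int32_t a, int64_t b)
{
    struct Pair in, out_typed, out_bytes;
    struct Bytes blob;

    assert(sizeof blob == sizeof in);
    memset(&in, 0, sizeof in);          /* also zeroes any padding */
    in.a = a;
    in.b = b;
    memcpy(blob.raw, &in, sizeof in);   /* transfer contents before the call */

    out_typed = bump_typed(in);
    blob = bump_bytes(blob);
    memcpy(&out_bytes, blob.raw, sizeof out_bytes);

    return out_typed.a == out_bytes.a && out_typed.b == out_bytes.b;
}
```

    Swapping the member types for float or double would be the interesting follow-up, since that is where register classification could differ between the typed struct and the blob.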


    I took a different approach:
    In the backend IR stage, structs are essentially treated as references
    to the structure.

    A local structure may be "initialized" via an IR operation, at which
    point it will be assigned storage in the stack frame, and the reference
    will be initialized to the storage area for the structure.

    Most operations will pass them by reference.

    Assigning a struct will essentially be turned into a struct-copy
    operation (using the same mechanism as inline memcpy).
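
    In C terms, the effect is roughly as in the sketch below: the source-level function uses plain struct assignment, while the "lowered" one works through pointers to frame storage and turns the assignment into a memcpy. This is an illustration with made-up names, not BGBCC's actual IR.

```c
#include <assert.h>
#include <string.h>

struct Vec3 { int x, y, z; };

/* Source-level view: plain struct assignment. */
int sum_source(void)
{
    struct Vec3 a = {1, 2, 3};
    struct Vec3 b;
    b = a;
    return b.x + b.y + b.z;
}

/* Lowered view: each struct variable is a reference to storage reserved
   in the frame, and assignment becomes a struct-copy. */
int sum_lowered(void)
{
    struct Vec3 frame_a, frame_b;       /* storage assigned in the frame */
    struct Vec3 *a = &frame_a;          /* "initialize" the references */
    struct Vec3 *b = &frame_b;

    a->x = 1;
    a->y = 2;
    a->z = 3;
    memcpy(b, a, sizeof *b);            /* b = a */
    assert(b->x == a->x && b->y == a->y && b->z == a->z);
    return b->x + b->y + b->z;
}
```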


    Type model could be seen as multiple levels:
    I: integer types of 'int' and smaller;
    L: integer types of 64 bits or less that are not I.
    D: 'double' and smaller floating-point types.
    A: Address (pointers, arrays, structs, ...)
    X: 128-bit types.
    int128, 'long double', SIMD vectors, ...

    I:
    char, signed char, unsigned char
    short, unsigned short
    int, unsigned int
    _Bool, wchar_t, ...
    L:
    long, long long, unsigned long, unsigned long long
    64-bit SIMD vectors
    variant (sorta)
    D: double, float, short float
    A:
    pointers
    arrays
    structs
    class instances
    ...
    X:
    grab bag of pretty much everything that is 128 bits.

    The toplevel types all basically have similar storage and behavior, so
    in many cases one can rely on this rather than the actual type.
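
    As a sketch of how a backend might use such a grouping, the following C fragment maps a handful of C type kinds to the five classes. The enum names and the set of kinds are invented for the example, and 'long' is placed in L as in the list above (which presumes it is not treated as 'int'-sized).

```c
#include <assert.h>

/* A few C type kinds (hypothetical, far from complete). */
enum CType {
    CT_CHAR, CT_SHORT, CT_INT, CT_UINT, CT_BOOL,
    CT_LONG, CT_LLONG,
    CT_FLOAT, CT_DOUBLE,
    CT_POINTER, CT_ARRAY, CT_STRUCT,
    CT_INT128, CT_LDOUBLE,
    CT_COUNT
};

/* The five storage classes described above. */
enum TypeClass { TC_I, TC_L, TC_D, TC_A, TC_X };

enum TypeClass type_class(enum CType t)
{
    assert((int)t >= 0 && (int)t < CT_COUNT);
    switch (t) {
    case CT_CHAR: case CT_SHORT: case CT_INT: case CT_UINT: case CT_BOOL:
        return TC_I;                    /* 'int' and smaller */
    case CT_LONG: case CT_LLONG:
        return TC_L;                    /* up to 64 bits, not I */
    case CT_FLOAT: case CT_DOUBLE:
        return TC_D;                    /* 'double' and smaller */
    case CT_POINTER: case CT_ARRAY: case CT_STRUCT:
        return TC_A;                    /* address-like */
    default:
        return TC_X;                    /* 128-bit grab bag */
    }
}
```

    Code that only cares about storage and register usage can then switch on the class rather than on the full type.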



    ...

  • From BGB@cr88192@gmail.com to comp.lang.c on Wed Dec 18 12:51:23 2024
    From Newsgroup: comp.lang.c

    On 12/17/2024 1:33 PM, Thiago Adams wrote:
    Em 12/17/2024 4:07 PM, BGB escreveu:
    On 12/17/2024 11:55 AM, Thiago Adams wrote:
    Em 12/17/2024 4:03 AM, BGB escreveu:
    On 12/16/2024 5:21 AM, Thiago Adams wrote:
    On 15/12/2024 20:53, BGB wrote:
    On 12/15/2024 3:32 PM, bart wrote:
    On 15/12/2024 19:08, Bonita Montero wrote:
    C++ is more readable because it is magnitudes more expressive than C.
    You can easily write a C++ statement that would take hundreds of
    lines in C (imagine specializing an unordered_map by hand). Making a
    language less expressive makes it even less readable, and that's also
    true for your reduced C.


    That's not really the point of it. This reduced C is used as an
    intermediate language for a compiler target. It will not usually
    be read, or maintained.

    An intermediate language needs to be at a lower level than the
    source language.

    And for this project, it needs to be compilable by any C89 compiler.
    Generating C++ would be quite useless.


    As an IL, even C is a little overkill, unless turned into a
    restricted subset (say, along similar lines to GCC's GIMPLE).

    Say:
       Only function-scope variables allowed;
       No high-level control structures;
       ...

    Say:
       int foo(int x)
       {
         int i, v;
         for(i=x, v=0; i>0; i--)
           v=v*i;
         return(v);
       }

    Becoming, say:
       int foo(int x)
       {
         int i;
         int v;
         i=x;
         v=0;
         if(i<=0)goto L1;
         L0:
         v=v*i;
         i=i-1;
         if(i>0)goto L0;
         L1:
         return v;
       }
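
    One way to sanity-check a lowering like this is to compile both forms and compare them over a range of inputs. The sketch below does that; note that it starts v at 1 rather than 0 (a deliberate change from the code above, so the product is not always zero), and the function names are made up.

```c
#include <assert.h>

/* Structured form. */
int foo_structured(int x)
{
    int i, v;
    for (i = x, v = 1; i > 0; i--)
        v = v * i;
    return v;
}

/* Goto-only form: one simple statement per line, no loop constructs. */
int foo_lowered(int x)
{
    int i;
    int v;
    i = x;
    v = 1;
    if (i <= 0) goto L1;
L0:
    v = v * i;
    i = i - 1;
    if (i > 0) goto L0;
L1:
    return v;
}

/* Returns 1 if both forms agree for every x in [-2, max_x]. */
int forms_agree(int max_x)
{
    int x;
    assert(max_x <= 12);    /* 13! no longer fits in a 32-bit int */
    for (x = -2; x <= max_x; x++)
        if (foo_structured(x) != foo_lowered(x))
            return 0;
    return 1;
}
```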

    ...


    I have considered removing loops and keeping only goto.
    But I think this does not bring much simplification.


    It depends.

    If the compiler works like an actual C compiler, with a full parser
    and AST stage, yeah, it may not save much.


    If the parser is a thin wrapper over 3AC operations (only allowing
    statements that map 1:1 with a 3AC IR operation), it may save a bit
    more...



    As for whether or not it makes sense to use a C like syntax here,
    this is more up for debate (for practical use within a compiler, I
    would assume a binary serialization rather than an ASCII syntax,
    though ASCII may be better in terms of inter-operation or human
    readability).


    But, as can be noted, I would assume a binary serialization that is
    oriented around operators; and *not* about serializing the
    structures used to implement those operators. Also I would assume
    that the IR need not be in SSA form (conversion to full SSA could be
    done when reading in the IR operations).


    My argument is that not using SSA form means fewer issues for both
    the serialization format and compiler front-end to need to deal with
    (and is comparably easy to regenerate for the backend, with the
    backend operating with its internal IR in SSA form).

    Well, contrast to LLVM assuming everything is always in SSA form.

    ...



    I also have considered splitting expressions.

    For instance

    if (a*b+c) {}

    into

    register int r1 = a * b;
    register int r2 = r1 + c;
    if (r2) {}

    This would make it easier to add overflow checks at runtime (if desired)
    and to implement things like _Complex

    Is this what you mean by 3AC or SSA?


    3AC means that the IR expresses 3 (or sometimes more) operands per IR op.

    So:
       MUL r1, a, b
    Rather than, say, stack:
       LOAD a
       LOAD b
       MUL
       STORE r1
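
    The relation between the two forms can be shown with a toy converter: walk the stack code while keeping a stack of operand names, and emit one 3AC line per operator. This is a minimal sketch with invented names and only four opcodes; it introduces a temporary for each result and emits a MOV for STORE, whereas a real converter would likely fold the store into the destination of the preceding op (giving MUL r1, a, b directly).

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

enum StackOp { S_LOAD, S_MUL, S_ADD, S_STORE };

struct StackInsn {
    enum StackOp op;
    const char *name;     /* variable for LOAD/STORE, unused otherwise */
};

/* Translate n stack instructions into 3AC text in out (capacity cap).
   Returns the number of 3AC lines emitted, or -1 on malformed input. */
int stack_to_3ac(const struct StackInsn *code, int n, char *out, size_t cap)
{
    char stack[16][16];   /* names of the operands currently on the stack */
    int sp = 0, ntemp = 0, lines = 0, i;
    size_t used = 0;

    assert(out != NULL && cap > 0);
    out[0] = '\0';
    for (i = 0; i < n; i++) {
        char line[64];
        int len = 0;

        switch (code[i].op) {
        case S_LOAD:                       /* push the variable's name */
            if (sp == 16) return -1;
            snprintf(stack[sp++], sizeof stack[0], "%s", code[i].name);
            continue;
        case S_MUL:
        case S_ADD: {
            char lhs[16], rhs[16];
            if (sp < 2) return -1;
            strcpy(rhs, stack[--sp]);
            strcpy(lhs, stack[--sp]);
            snprintf(stack[sp], sizeof stack[0], "t%d", ntemp++);
            len = snprintf(line, sizeof line, "%s %s, %s, %s\n",
                           code[i].op == S_MUL ? "MUL" : "ADD",
                           stack[sp], lhs, rhs);
            sp++;                          /* result temp is now on top */
            break;
        }
        case S_STORE:
            if (sp < 1) return -1;
            sp--;
            len = snprintf(line, sizeof line, "MOV %s, %s\n",
                           code[i].name, stack[sp]);
            break;
        default:
            return -1;
        }
        if (used + (size_t)len >= cap) return -1;
        memcpy(out + used, line, (size_t)len + 1);
        used += (size_t)len;
        lines++;
    }
    return lines;
}
```

    Feeding it LOAD a, LOAD b, MUL, STORE r1 yields the two lines "MUL t0, a, b" and "MOV r1, t0".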


    SSA:
       Static Single Assignment


    Oh sorry .. I knew what SSA is.

    Generally:
    Every variable may only be assigned once (more like in a functional
    programming language);
    Generally, variables are "merged" in the control-flow via PHI
    operators (which variable merges in depending on which path control
    came from).


    I do similar merge in my flow analysis but without the concept of SSA.

    IMHO, while SSA is preferable for backend analysis, optimization, and
    code generation; it is undesirable pretty much everywhere else as it
    adds too much complexity.

    Better IMO for the frontend compiler and main IL stage to assume that
    local variables are freely mutable.

    Typically, global variables are excluded in most variants, and remain
    fully mutable; but may be handled as designated LOAD/STORE operations.


    In BGBCC though, full SSA only applies to temporaries. Normal local
    variables are merely flagged by "version", and all versions of the
    same local variable implicitly merge back together at each branch/label.


    Sorry what is BGBCC ? (C compiler?)


    It is my C compiler.

    Can be found within my current main project: https://github.com/cr88192/bgbtech_btsr1arch/tree/master/bgbcc22


    It started out, long ago, as a fork off my scripting language, which was originally a JavaScript clone.

    First stage:
    Originally written as a C interpreter of sorts.

    Original idea was to use dynamically compiled C as an application
    scripting language, but C wasn't a great language for this task (vs a JS clone), and the compiler was a lot harder to debug.

    Then, for a while, it was turned over to mining metadata from headers to generate an FFI for the script language.


    Its use as a C compiler was revived when I started my CPU ISA project,
    as I needed a compiler for it, and other options (Clang, GCC, and LCC)
    were unattractive in various ways.

    Though, in all, a lot more effort in the project has gone into the C
    compiler than into much of anything else, and it is still a bit of a
    pain finding and fixing bugs (and avoiding causing new bugs).


    It targets both BJX2 (my own ISA) and RISC-V, albeit using PE/COFF for
    the latter (rather than ELF).


    This allows some similar advantages (for analysis and optimization)
    while limiting some of the complexities. Though, this differs from
    temporaries which are assumed to essentially fully disappear once they
    go outside of the span in which they exist (albeit with an awkward
    case to deal with temporaries that cross basic-block boundaries, which
    need to actually "exist" in some semi-concrete form, more like local
    variables).

    Note that unless the address is taken of a local variable, it need not
    have any backing in memory. Temporaries can never have their address
    taken, so generally exist exclusively in CPU registers.


    This would definitely simplify expressions grammar.





    It can be added in the future.


  • From Thiago Adams@thiago.adams@gmail.com to comp.lang.c on Wed Dec 18 16:43:59 2024
    From Newsgroup: comp.lang.c

    Em 12/18/2024 3:51 PM, BGB escreveu:

    I took a different approach:
    In the backend IR stage, structs are essentially treated as references
    to the structure.

    A local structure may be "initialized" via an IR operation, at which
    point it will be assigned storage in the stack frame, and the reference
    will be initialized to the storage area for the structure.

    Most operations will pass them by reference.

    Assigning a struct will essentially be turned into a struct-copy
    operation (using the same mechanism as inline memcpy).

    But what happens with calling a external C function that has a struct X
    as parameter? (not pointer to struct)
  • From bart@bc@freeuk.com to comp.lang.c on Wed Dec 18 23:37:11 2024
    From Newsgroup: comp.lang.c

    On 18/12/2024 18:50, BGB wrote:
    On 12/18/2024 6:08 AM, bart wrote:

    With stack code, the result conveniently ends up on top of the stack
    whichever path is taken, which is a big advantage. Unless you then
    have to convert that to register code, and need to ensure the values
    end up in the same register when the control paths join up again.


    With JVM, the rule was that all paths landing at the same label need to
    have the same stack depth and same types.

    With .NET, the rule was that the stack was always empty, any merging
    would need to be done using variables.


    BGBCC is sorta mixed:
    In most cases, it follows the .NET rule;
    A special-case exception exists mostly for implementing the ?: operation (which in turn has special stack operations to signal its use).

    BEGINU  // start a ?: operator
    L0:
    ...  //one case
    SETU
    JMP L2
    L1:
    ... //other case
    SETU
    JMP L2
    ENDU
    L2:


    This is a bit of wonk,

    Well, this is pretty much what I do in stack code. I consider it impure,
    as in needing artificial hints, but also the simplest solution.

    I use opcodes STARTMX, RESETMX, ENDMX. They are no-ops when the IL is interpreted. But during the linear scan needed during code generation,
    where it has to keep track of the IL's operand stack, RESETMX will reset
    the stack too.
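
    The bookkeeping this implies during the linear scan might be sketched as follows. The semantics here are guessed from the description, not taken from the actual compiler: STARTMX records the current modelled stack depth, RESETMX (at the end of a path) drops the one value that path produced so the next path reuses the same slot, and ENDMX closes the region with exactly one result on top. Jumps and labels are left out, and all names are hypothetical.

```c
#include <assert.h>

enum Op { OP_PUSH, OP_POP, OP_STARTMX, OP_RESETMX, OP_ENDMX };

/* Linear scan over n opcodes, tracking the modelled operand-stack depth.
   Returns the final depth, or -1 if the markers are inconsistent. */
int scan_depth(const enum Op *code, int n)
{
    int marks[8];         /* depth recorded at each open STARTMX */
    int nmarks = 0, depth = 0, i;

    assert(code != 0 || n == 0);
    for (i = 0; i < n; i++) {
        switch (code[i]) {
        case OP_PUSH:
            depth++;
            break;
        case OP_POP:
            if (depth == 0) return -1;
            depth--;
            break;
        case OP_STARTMX:
            if (nmarks == 8) return -1;
            marks[nmarks++] = depth;
            break;
        case OP_RESETMX:
            /* This path must have left one value; forget it so the
               next path's value lands in the same slot. */
            if (nmarks == 0 || depth != marks[nmarks - 1] + 1) return -1;
            depth = marks[nmarks - 1];
            break;
        case OP_ENDMX:
            /* The last path's value stays as the region's result. */
            if (nmarks == 0 || depth != marks[nmarks - 1] + 1) return -1;
            nmarks--;
            break;
        }
    }
    return nmarks == 0 ? depth : -1;
}
```

    An interpreter would skip the three marker opcodes entirely, since at run time only one path executes and its value is simply on top of the stack.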

    (As mentioned, I have a lot more constructs that can yield N values not
    just two. Apart from N-way select, if-else, switch-when and case-when statements can also return values.)

    if I were designing it now, would likely do it
    the same as .NET, and use temporary variables.

    In 3AC then it's easy, all paths write to the same temporary.


  • From BGB@cr88192@gmail.com to comp.lang.c on Wed Dec 18 18:27:26 2024
    From Newsgroup: comp.lang.c

    On 12/18/2024 1:43 PM, Thiago Adams wrote:
    Em 12/18/2024 3:51 PM, BGB escreveu:

    I took a different approach:
    In the backend IR stage, structs are essentially treated as references
    to the structure.

    A local structure may be "initialized" via an IR operation, at which
    point it will be assigned storage in the stack frame, and the
    reference will be initialized to the storage area for the structure.

    Most operations will pass them by reference.

    Assigning a struct will essentially be turned into a struct-copy
    operation (using the same mechanism as inline memcpy).

    But what happens with calling a external C function that has a struct X
    as parameter? (not pointer to struct)


    In my ABI, if larger than 16 bytes, it is passed by reference (as a
    pointer in a register or on the stack), callee is responsible for
    copying it somewhere else if needed.

    For struct return, a pointer to return the struct into is provided by
    the caller, and the callee copies the returned struct into this address.

    If the caller ignores the return value, the caller provides a dummy
    buffer for the return value.

    If no prototype is provided... well, most likely the program crashes or similar.

    So, in effect, the by-value semantics are mostly faked by the compiler.
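
    Expressed back in C, the "faked" by-value call looks roughly like the sketch below. This is an illustration of the convention as described, with invented names, not what BGBCC literally emits; here the callee always copies its argument before modifying it, which is one way to honor "copy it somewhere else if needed".

```c
#include <assert.h>
#include <string.h>

struct Big { long long v[4]; };     /* 32 bytes: above the 16-byte limit */

/* What the programmer writes: pass and return by value. */
struct Big scale(struct Big x, int k)
{
    int i;
    for (i = 0; i < 4; i++)
        x.v[i] *= k;
    return x;
}

/* Roughly what the convention turns it into: the argument arrives as a
   pointer, and the result goes through a caller-provided pointer. */
void scale_lowered(struct Big *ret, const struct Big *x, int k)
{
    struct Big local;
    int i;

    assert(ret != NULL && x != NULL);
    memcpy(&local, x, sizeof local);    /* callee-side copy */
    for (i = 0; i < 4; i++)
        local.v[i] *= k;
    memcpy(ret, &local, sizeof *ret);   /* copy result into caller's buffer */
}

/* Caller that ignores the result still supplies a dummy buffer. */
void call_and_ignore(const struct Big *x)
{
    struct Big dummy;
    scale_lowered(&dummy, x, 2);
}
```

    The caller's original struct is untouched in both versions, which is all that by-value semantics require.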


    It is roughly similar to the handling of C array types, which in this
    case are also seen as a combination of a hidden pointer to the data, and
    the backing data (the array's contents). The code-generator mostly
    operates in terms of this hidden pointer.


    By-Value Structs smaller than 16 bytes are passed as-if they were a 64
    or 128 bit integer type (as a single register or as a register pair,
    with a layout matching their in-memory representation).
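
    The small-struct case can be mimicked in C by copying the struct's bytes into an integer and back. The struct and function names below are hypothetical, and only the single-register (64-bit) case is shown.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct Small { int16_t x; int16_t y; int32_t z; };   /* 8 bytes */

/* Pack the struct into a 64-bit "register", keeping its memory layout. */
uint64_t small_to_reg(struct Small s)
{
    uint64_t r = 0;
    assert(sizeof s <= sizeof r);
    memcpy(&r, &s, sizeof s);
    return r;
}

/* Unpack on the callee side. */
struct Small small_from_reg(uint64_t r)
{
    struct Small s;
    memcpy(&s, &r, sizeof s);
    return s;
}
```

    Because the register image matches the in-memory representation, no per-member shuffling is needed on either side.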

    ...


    But, yeah, at the IL level, one could potentially eliminate structs and
    arrays as a separate construct, and instead have bare pointers and a
    generic "reserve a blob of bytes in the frame and initialize this
    pointer to point to it" operator (with the business end of this operator happening in the function prolog).

    ...

  • From bart@bc@freeuk.com to comp.lang.c on Thu Dec 19 00:32:47 2024
    From Newsgroup: comp.lang.c

    On 18/12/2024 05:55, Lawrence D'Oliveiro wrote:
    On Tue, 17 Dec 2024 22:55:53 +0000, bart wrote:

    On 17/12/2024 22:25, Lawrence D'Oliveiro wrote:

    On Tue, 17 Dec 2024 19:45:49 +0000, bart wrote:

    It's not aimed at people /implementing/ such a tool.

    It is aimed at those capable of following the links to relevant specs.

    It's also a pretty terrible link.

    Did you see this link <https://developer.mozilla.org/en-US/docs/WebAssembly/Reference>? Lots of examples from there.

    I promised an example of Hello World using my IL, and how to process and
    run it, in half a page. This is it for Windows :

    ------------------------------------
    Paste the indented code into a file hello.pcl:

    addlib "msvcrt"
    extproc puts

    proc main:::
    setcall i32 /1
    load u64 "Hello World!"
    setarg u64 /1
    callf i32 /1 &puts
    unload i32
    load i32 0
    stop
    retproc
    endproc

    Download the pc.exe file here: https://github.com/sal55/langs/blob/master/pc.exe, which is a 65KB file (UPX-compressed from 180KB). (Advice to navigate AV not included here.)

    At a command prompt with both files present, type:

    pc -r hello

    This will convert it to x64 code and run it. Use 'pc' by itself to see
    the 6 other processing options.
    ------------------------------------

    So 20 non-blank lines. It would be nice if an equally simple example
    existed for WASM/WAT, or if people who suggested that choice could post
    a link to such an example /that/ works on Windows.
  • From bart@bc@freeuk.com to comp.lang.c on Thu Dec 19 00:35:41 2024
    From Newsgroup: comp.lang.c

    On 19/12/2024 00:27, BGB wrote:
    On 12/18/2024 1:43 PM, Thiago Adams wrote:
    Em 12/18/2024 3:51 PM, BGB escreveu:

    I took a different approach:
    In the backend IR stage, structs are essentially treated as
    references to the structure.

    A local structure may be "initialized" via an IR operation, at which
    point it will be assigned storage in the stack frame, and the
    reference will be initialized to the storage area for the structure.

    Most operations will pass them by reference.

    Assigning a struct will essentially be turned into a struct-copy
    operation (using the same mechanism as inline memcpy).

    But what happens with calling a external C function that has a struct
    X as parameter? (not pointer to struct)


    In my ABI, if larger than 16 bytes, it is passed by reference (as a
    pointer in a register or on the stack), callee is responsible for
    copying it somewhere else if needed.

    For struct return, a pointer to return the struct into is provided by
    the caller, and the callee copies the returned struct into this address.

    If the caller ignores the return value, the caller provides a dummy
    buffer for the return value.

    If no prototype is provided... well, most likely the program crashes or similar.

    So, in effect, the by-value semantics are mostly faked by the compiler.


    It is roughly similar to the handling of C array types, which in this
    case are also seen as a combination of a hidden pointer to the data, and
    the backing data (the array's contents). The code-generator mostly
    operates in terms of this hidden pointer.


    By-Value Structs smaller than 16 bytes are passed as-if they were a 64
    or 128 bit integer type (as a single register or as a register pair,
    with a layout matching their in-memory representation).

    ...


    But, yeah, at the IL level, one could potentially eliminate structs and arrays as a separate construct, and instead have bare pointers and a
    generic "reserve a blob of bytes in the frame and initialize this
    pointer to point to it" operator (with the business end of this operator happening in the function prolog).

    The problem with this, that I mentioned elsewhere, is how well it would
    work with SYS V ABI, since the rules for structs are complex, and
    apparently recursive.

    Having just a block of bytes might not be enough.
  • From BGB@cr88192@gmail.com to comp.lang.c on Wed Dec 18 23:46:21 2024
    From Newsgroup: comp.lang.c

    On 12/18/2024 6:35 PM, bart wrote:
    On 19/12/2024 00:27, BGB wrote:
    On 12/18/2024 1:43 PM, Thiago Adams wrote:
    Em 12/18/2024 3:51 PM, BGB escreveu:

    I took a different approach:
    In the backend IR stage, structs are essentially treated as
    references to the structure.

    A local structure may be "initialized" via an IR operation, at which
    point it will be assigned storage in the stack frame, and the
    reference will be initialized to the storage area for the structure.

    Most operations will pass them by reference.

    Assigning a struct will essentially be turned into a struct-copy
    operation (using the same mechanism as inline memcpy).

    But what happens with calling a external C function that has a struct
    X as parameter? (not pointer to struct)


    In my ABI, if larger than 16 bytes, it is passed by reference (as a
    pointer in a register or on the stack), callee is responsible for
    copying it somewhere else if needed.

    For struct return, a pointer to return the struct into is provided by
    the caller, and the callee copies the returned struct into this address.

    If the caller ignores the return value, the caller provides a dummy
    buffer for the return value.

    If no prototype is provided... well, most likely the program crashes
    or similar.

    So, in effect, the by-value semantics are mostly faked by the compiler.


    It is roughly similar to the handling of C array types, which in this
    case are also seen as a combination of a hidden pointer to the data,
    and the backing data (the array's contents). The code-generator mostly
    operates in terms of this hidden pointer.


    By-Value Structs of 16 bytes or less are passed as-if they were a 64
    or 128 bit integer type (as a single register or as a register pair,
    with a layout matching their in-memory representation).

    ...


    But, yeah, at the IL level, one could potentially eliminate structs
    and arrays as a separate construct, and instead have bare pointers and
    a generic "reserve a blob of bytes in the frame and initialize this
    pointer to point to it" operator (with the business end of this
    operator happening in the function prolog).

    The problem with this, that I mentioned elsewhere, is how well it would
    work with SYS V ABI, since the rules for structs are complex, and
    apparently recursive.

    Having just a block of bytes might not be enough.

    In my case, I am not bothering with the SysV style ABI's (well, along
    with there not being any x86 or x86-64 target...).


    For my ISA, it is a custom ABI, but follows mostly similar rules to some
    of the other "Microsoft style" ABIs (where, I have noted that across
    multiple targets, MS tools have tended to use similar ABI designs).

    For my compiler targeting RISC-V, it uses a variation of RV's ABI rules.
    Argument passing is basically similar, but struct pass/return is
    different; and it passes floating-point values in GPRs (and, in my own
    ISA, all floating-point values use GPRs, as there are no FPU registers;
    though FPU registers do exist for RISC-V).

    Not likely a huge issue as one is unlikely to use ELF and PE/COFF in the
    same program.


    For the "OS" that runs on my CPU core, it is natively using PE/COFF, but
    ELF is supported for RISC-V (currently PIE only). It generally needs to
    use my own C library as I still haven't gotten glibc or musl libc to
    work on it (and they work in a different way from my own C library).

    Seemingly, something is going terribly wrong in the "dynamic linking"
    process, but it is too hard to figure out in the absence of any real
    debugging interface (what debug mechanisms I have effectively lack any
    symbols for things inside "ld-linux.so"'s domain).

    Theoretically, could make porting usermode software easier, as then I
    could compile stuff as-if it were running on an RV64 port of Linux.

    But, easier said than done.

    ...

  • From bart@bc@freeuk.com to comp.lang.c on Thu Dec 19 11:27:03 2024
    From Newsgroup: comp.lang.c

    On 19/12/2024 05:46, BGB wrote:
    On 12/18/2024 6:35 PM, bart wrote:
    On 19/12/2024 00:27, BGB wrote:

    By-Value Structs smaller than 16 bytes are passed as-if they were a
    64 or 128 bit integer type (as a single register or as a register
    pair, with a layout matching their in-memory representation).

    ...


    But, yeah, at the IL level, one could potentially eliminate structs
    and arrays as a separate construct, and instead have bare pointers
    and a generic "reserve a blob of bytes in the frame and initialize
    this pointer to point to it" operator (with the business end of this
    operator happening in the function prolog).

    The problem with this, that I mentioned elsewhere, is how well it
    would work with SYS V ABI, since the rules for structs are complex,
    and apparently recursive.

    Having just a block of bytes might not be enough.

    In my case, I am not bothering with the SysV style ABI's (well, along
    with there not being any x86 or x86-64 target...).

    I'd imagine it's worse with ARM targets as there are so many more
    registers to try and deconstruct structs into.


    For my ISA, it is a custom ABI, but follows mostly similar rules to some
    of the other "Microsoft style" ABIs (where, I have noted that across multiple targets, MS tools have tended to use similar ABI designs).

    When you do your own thing, it's easy.

    In the 1980s, I didn't need to worry about call conventions used for
    other software, since there /was/ no other software! I had to write everything, save for the odd calls to DOS which used some form of SYSCALL.

    Then, arrays and structs were actually passed and returned by value (not
    via hidden references), by copying the data to and from the stack.

    However, I don't recall ever using the feature, as I considered it
    inefficient. I always used explicit references in my code.

    For my compiler targeting RISC-V, it uses a variation of RV's ABI rules. Argument passing is basically similar, but struct pass/return is
    different; and it passes floating-point values in GPRs (and, in my own
    ISA, all floating-point values use GPRs, as there are no FPU registers; though FPU registers do exist for RISC-V).

    Supporting C's variadic functions, which is needed for many languages
    when calling C across an FFI, usually requires different rules. On Win64
    ABI for example, by passing low variadic arguments in both GPRs and FPU registers.

    /Implementing/ variadic functions (which only occurs if implementing C)
    is another headache if it has to work with the ABI (which can be assumed
    for a non-static function).

    I barely have a working solution for Win64 ABI, which needs to be done
    via stdarg.h, but wouldn't have a clue how to do it for SYS V.

    (Even Win64 has problems, as it assumes a downward-growing stack; in my
    IL interpreter, the stack grows upwards!)

    Not likely a huge issue as one is unlikely to use ELF and PE/COFF in the same program.


    For the "OS" that runs on my CPU core, it is natively using PE/COFF, but

    That's interesting: you deliberately used one of the most complex file
    formats around, when you could have devised your own?

    I did exactly that at a period when my generated DLLs were buggy for
    some reason (it turned out to be two reasons). I created a simple
    dynamic library format of my own. Then I found the same format worked
    also for executables.

    But I needed a loader program to run them, as Windows obviously didn't
    understand the format. Such a program can be written in 800 lines of C,
    and can dynamically load libraries in both my format, and proper DLLs (not
    the buggy ones I generated!).

    A hello-world program is under 300 bytes compared with 2 or
    2.5KB of EXE. And the format is portable to Linux, so no need to
    generate ELF (but I haven't tried). Plus the format might be transparent
    to AV software (haven't tried that either).

  • From BGB@cr88192@gmail.com to comp.lang.c on Thu Dec 19 14:36:37 2024
    From Newsgroup: comp.lang.c

    On 12/19/2024 5:27 AM, bart wrote:
    On 19/12/2024 05:46, BGB wrote:
    On 12/18/2024 6:35 PM, bart wrote:
    On 19/12/2024 00:27, BGB wrote:

    By-Value Structs smaller than 16 bytes are passed as-if they were a
    64 or 128 bit integer type (as a single register or as a register
    pair, with a layout matching their in-memory representation).

    ...


    But, yeah, at the IL level, one could potentially eliminate structs
    and arrays as a separate construct, and instead have bare pointers
    and a generic "reserve a blob of bytes in the frame and initialize
    this pointer to point to it" operator (with the business end of this
    operator happening in the function prolog).

    The problem with this, that I mentioned elsewhere, is how well it
    would work with SYS V ABI, since the rules for structs are complex,
    and apparently recursive.

    Having just a block of bytes might not be enough.

    In my case, I am not bothering with the SysV style ABI's (well, along
    with there not being any x86 or x86-64 target...).

    I'd imagine it's worse with ARM targets as there are so many more
    registers to try and deconstruct structs into.


    Not messed much with the ARM64 ABI or similar, but I will draw the line
    in the sand somewhere.

    Struct passing/return is enough of an edge case that one can just sort
    of declare it "no go" between compilers with "mostly but not strictly compatible" ABIs.



    For my ISA, it is a custom ABI, but follows mostly similar rules to
    some of the other "Microsoft style" ABIs (where, I have noted that
    across multiple targets, MS tools have tended to use similar ABI
    designs).

    When you do your own thing, it's easy.

    In the 1980s, I didn't need to worry about call conventions used for
    other software, since there /was/ no other software! I had to write everything, save for the odd calls to DOS which used some form of SYSCALL.

    Then, arrays and structs were actually passed and returned by value (not
    via hidden references), by copying the data to and from the stack.

    However, I don't recall ever using the feature, as I considered it
    inefficient. I always used explicit references in my code.


    Most of the time, one is passing/returning structures as pointers, and
    not by value.

    By value structures are usually small.


    When a structure is not small, it is both simpler to implement, and
    usually faster, to internally pass it by reference.

    If you pass a large structure to a function by value, via an on-stack
    copy, and the function assigns it to another location (say, a global variable):
    Pass by reference: Only a single copy operation is needed;
    Pass by value on-stack: At least two copy operations are needed.

    One also needs to reserve enough space in the function arguments list to
    hold any structures passed, which could be bad if they are potentially
    large.



    But, on my ISA, ABI is sort of like:
    R4 ..R7 : Arg0 ..Arg3
    R20..R23: Arg4 ..Arg7
    R36..R39: Arg8 ..Arg11 (optional)
    R52..R55: Arg12..Arg15 (optional)
    Return Value:
    R2, R3:R2 (128 bit)
    R2 is also used to pass in the return value pointer.
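
    The argument-register table above is regular enough to fit in a single
    expression. A small sketch (the function name arg_register and the -1
    "goes on the stack" convention are my own, purely for illustration):

```c
#include <assert.h>

/* Map an argument slot (0..15) to its register number, following the
   table: R4..R7, R20..R23, R36..R39, R52..R55.
   nregs is 8 or 16, depending on whether the optional registers are used.
   Returns -1 when the argument would be passed on the stack instead. */
int arg_register(int slot, int nregs)
{
    assert(nregs == 8 || nregs == 16);
    if (slot < 0 || slot >= nregs)
        return -1;
    /* Each group of 4 arguments starts 16 registers after the previous. */
    return 4 + (slot & 3) + 16 * (slot >> 2);
}
```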

    'this':
    Generally passed in either R3 or R18, depending on ABI variant.

    Where, callee-save:
    R8 ..R14, R24..R31,
    R40..R47, R56..R63
    R15=SP

    Non-saved scratch:
    R2 ..R7 , R16..R23,
    R32..R39, R48..R55


    Arguments beyond the first 8/16 register arguments are passed on stack.
    In this case, a spill space for the first 8/16 arguments (64 or 128
    bytes) is provided on stack before the first non-register argument.

    If the function accepts a fixed number of arguments and the number of
    argument registers is 8 or less, spill space need only be provided for
    the first 8 arguments (calling vararg functions will always reserve
    space for 16 registers in the 16-register ABI). This spill space
    effectively belongs to the callee rather than the caller.


    Structures (by value):
    1.. 8 bytes: Passed in a single register
    9..16 bytes: Passed in a pair, padded to the next even pair
    17+: Pass as a reference.

    Things like 128-bit types are also passed/returned in register pairs.
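
    As a rough sketch, the by-value struct rules above amount to a size
    check. The enum and function names are invented for this example, and
    reading "padded to the next even pair" as "the pair starts at the next
    even-numbered argument slot" is an assumption on my part:

```c
#include <assert.h>
#include <stddef.h>

enum pass_kind { PASS_REG, PASS_REG_PAIR, PASS_BY_REF };

/* Classify a by-value struct argument purely by its size in bytes. */
enum pass_kind classify_struct(size_t size)
{
    assert(size > 0);
    if (size <= 8)
        return PASS_REG;        /* 1..8 bytes: single register */
    if (size <= 16)
        return PASS_REG_PAIR;   /* 9..16 bytes: register pair  */
    return PASS_BY_REF;         /* 17+ bytes: pass a reference */
}

/* Assumed meaning of the pair padding: round the next free argument
   slot up to an even slot number before allocating the pair. */
int pair_first_slot(int next_free_slot)
{
    return (next_free_slot + 1) & ~1;
}
```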



    Contrast, RV ABI:
    X10..X17 are used for arguments;
    No spill space is provided;
    ...

    My variant uses similar rules to my own ABI for passing/returning
    structures, with:
    X28, structure return pointer
    X29, 'this'
    Normal return values go into X10 or X11:X10.



    Note that in both ABI's, passing 'this' in a register would mean that
    class instances and COM objects are not equivalent (COM object methods
    always pass 'this' as the first argument).

    The 'this' register is implicitly also used by lambdas to pass in the
    pointer to the captured bindings area (which mostly resembles a
    structure containing each variable captured by the lambda).

    Can note though that in this case, capturing a binding by reference
    means the lambda is limited to automatic lifetime (non-automatic lambdas
    may only capture by value). In this case, capture by value is the default.


    For my compiler targeting RISC-V, it uses a variation of RV's ABI rules.
    Argument passing is basically similar, but struct pass/return is
    different; and it passes floating-point values in GPRs (and, in my own
    ISA, all floating-point values use GPRs, as there are no FPU
    registers; though FPU registers do exist for RISC-V).

    Supporting C's variadic functions, which is needed for many languages
    when calling C across an FFI, usually requires different rules. On Win64
    ABI for example, by passing low variadic arguments in both GPRs and FPU registers.


    I simplified things by assuming only GPRs are used.


    /Implementing/ variadic functions (which only occurs if implementing C)
    is another headache if it has to work with the ABI (which can be assumed
    for a non-static function).

    I barely have a working solution for Win64 ABI, which needs to be done
    via stdarg.h, but wouldn't have a clue how to do it for SYS V.

    (Even Win64 has problems, as it assumes a downward-growing stack; in my
    IL interpreter, the stack grows upwards!)


    Most targets use a downward growing stack.
    Mine is no exception here...


    Not likely a huge issue as one is unlikely to use ELF and PE/COFF in
    the same program.


    For the "OS" that runs on my CPU core, it is natively using PE/COFF, but

    That's interesting: you deliberately used one of the most complex file formats around, when you could have devised your own?


    For what I wanted, I would have mostly needed to recreate most of the
    same functionality as PE/COFF anyways.


    When one considers the entire loading process (including DLLs/SOs), then PE/COFF loading is actually simpler than ELF loading (ELF subjects the
    loader to needing to deal with symbol and relocation tables), similar to
    PIE loading.


    Things like the MZ stub are optional in my case, and mostly ignored if
    present (in my LZ compressed PE variants, the MZ stub is omitted entirely).


    I had at one point considered doing a custom format resembling LZ
    compressed MachO, but ended up not bothering, as it wouldn't have really
    saved anything over LZ compressed PE/COFF.


    Some "unneeded cruft" like the Resource Section was discarded, mostly
    replaced by an embedded WAD2 image. The header was modified some to
    allow for backwards compatibility with the Windows format (mostly
    creating a dummy header in the original format that points to the WAD2 directory).


    Idea is that icons, bitmaps, and other things, would mostly be held in
    WAD lumps. Though, resources which may be accessed via symbols in the
    EXE/DLL need to be stored uncompressed (where "__rsrc_lumpname" may be
    used to access the contents of resource-section lumps as an extern symbol).

    Say, for example:
    extern byte __rsrc_mybitmap[]; //resolves to a DIB/BMP or similar

    For now, resource formats:
    Images:
    BMP (various settings)
    4, 8, and 16 bpp typical
    Supports a non-standard 16-bpp alpha-blended mode (*1).
    Supports non-standard 16 color and 256 color with transparent.
    Supports CRAM BMP as well (2 bpp)
    QOI (assumes RGBA32, nominally lossless)
    QOI is a semi-simplistic non-entropy-coded format.
    Can give PNG-like compression in some cases.
    Reasonably fast/cheap to decode.
    LCIF, custom lossy format, color-cell compression.
    OK Q/bpp but mostly only on the low-end.
    Resembles a QOI+CRAM hybrid.
    UPIC, lossy or lossless, JPEG-like (*2)

    *1:
    0rrrrrgggggbbbbb Normal/Opaque
    1rrrraggggabbbba With 3 bit alpha (4b/ch RGB).
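
    A sketch of unpacking the two 16-bpp layouts above into 8-bit RGBA.
    The struct and function names are made up for the example, and two
    details are assumptions rather than anything stated above: the alpha
    bits are taken most-significant-first from left to right, and channels
    are widened by bit replication.

```c
#include <assert.h>
#include <stdint.h>

struct rgba8 { uint8_t r, g, b, a; };

/* Widen a 5-, 4- or 3-bit channel to 8 bits by bit replication. */
static uint8_t expand5(unsigned v) { assert(v < 32); return (uint8_t)((v << 3) | (v >> 2)); }
static uint8_t expand4(unsigned v) { assert(v < 16); return (uint8_t)((v << 4) | v); }
static uint8_t expand3(unsigned v) { assert(v < 8);  return (uint8_t)((v << 5) | (v << 2) | (v >> 1)); }

struct rgba8 decode_px16(uint16_t px)
{
    struct rgba8 out;
    if (!(px & 0x8000)) {
        /* 0rrrrrgggggbbbbb: plain RGB555, fully opaque */
        out.r = expand5((px >> 10) & 31);
        out.g = expand5((px >> 5) & 31);
        out.b = expand5(px & 31);
        out.a = 255;
    } else {
        /* 1rrrraggggabbbba: 4 bits per channel, alpha bits at 10, 5, 0 */
        unsigned a = (((px >> 10) & 1u) << 2) |
                     (((px >> 5) & 1u) << 1) |
                     (px & 1u);
        out.r = expand4((px >> 11) & 15);
        out.g = expand4((px >> 6) & 15);
        out.b = expand4((px >> 1) & 15);
        out.a = expand3(a);
    }
    return out;
}
```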

    For 16 and 256 color, a variant is supported with a transparent color. Generally the high intensity magenta is reused as the transparent color.
    This is encoded in the color palette (if all colors apart from one have
    the alpha bits set to FF, and one color has 00, then that color is
    assumed to be a transparent color).

    CRAM bpp: Uses a limited form of the 8-bit CRAM format:
    16 bits, 4x4 pixels, 1 bit per pixel
    2x 8 bits: Color Endpoints
    The rest of the format being unsupported, so it can simply assume a
    fixed 32-bits per 4x4 pixel cell.
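
    A sketch of decoding one such fixed 32-bit cell into a 4x4 block of
    palette indices. The byte order (16-bit mask first, little-endian, then
    the two endpoints), the row-major bit-to-pixel mapping, and which
    endpoint a set bit selects are all assumptions made for illustration;
    the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* cell[0..1]: 16-bit selector mask (assumed little-endian)
   cell[2], cell[3]: the two 8-bit color endpoints
   out[16]: palette indices for the 4x4 block, assumed row-major */
void decode_cram_cell(const uint8_t cell[4], uint8_t out[16])
{
    unsigned mask;
    assert(cell != 0 && out != 0);
    mask = cell[0] | ((unsigned)cell[1] << 8);
    for (int i = 0; i < 16; i++)
        out[i] = ((mask >> i) & 1u) ? cell[3] : cell[2];
}
```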



    *2: The UPIC format is structurally similar to JPEG, but:
    Uses TLV packaging (vs FF-escape tagging);
    Uses Rice coding (vs Huffman)
    Uses Z3.V5 VLC, vs Z4.V4
    Uses Block-Haar and RCT
    Vs DCT and YCbCr.
    Supports an alpha channel.
    Y 1 (*2A)
    YA 1:1 (*2A)
    YUV 4:2:0
    YUV 4:4:4 (*2A)
    YUVA 4:2:0:4
    YUVA 4:4:4:4 (*2A)
    *2A: May be used in the lossless modes, depending on image.


    VLC coding resembles Deflate's natch distance encoding, with sign-folded values. Runs of zero coefficients have a shorter limit, but similar.
    Like with JPEG, an 0x00 symbol encodes an early EOB.

    In tests, on my main PC:
    Vs JPEG: It is a little faster
    Q/bpp is similar, better/worse depends on image.
    Slightly worse on photos, but "similar".
    Generally somewhat better on artificial images.
    Vs PNG:
    Faster to decode (with less memory overhead);
    Better compression on many images (particularly photo-like).

    Note that UPIC was designed to not require any large intermediate
    buffers, so will decode directly to an RGB555 or RGBA32 output buffer (decoding happens in terms of individual 16x16 pixel macroblocks).

    It was designed to be moderately fast and to try to minimize memory
    overhead for decoding (vs either PNG or JPEG, which need a more
    significant chunk of working memory to decode).


    Block-Haar is a Haar transform made to fit the same 8x8 pixel blocks as
    DCT, where Haar maps (A,B)->(C,D):
    C=(A+B)/2 (*: X/2 here being defined as (X>>1))
    D=A-B
    But, can be reversed exactly, IIRC:
    B=C-(D/2)
    A=B+D
    By doing multiple stages of Haar transform, one can build an 8-pixel
    version, and then use horizontal and vertical transforms for an 8x8
    block. It is computationally fairly cheap, and lossless.
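
    A minimal sketch of the pair transform and its 8-point extension, to
    show that it round-trips exactly. Function names are made up here. The
    X/2 above is (X>>1), i.e. rounding toward negative infinity; since
    right-shifting a negative value is implementation-defined in C, the
    sketch uses an explicit floor_half() helper instead. The coefficient
    ordering in the 8-point version (averages first, then differences) is
    an arbitrary choice for the example.

```c
#include <assert.h>

/* Floor division by 2, same result as an arithmetic (x >> 1). */
static int floor_half(int x)
{
    return (x >= 0) ? (x / 2) : -((-x + 1) / 2);
}

/* (A,B) -> (C,D): C = (A+B)/2, D = A-B */
void haar_fwd(int a, int b, int *c, int *d)
{
    assert(c && d);
    *c = floor_half(a + b);
    *d = a - b;
}

/* (C,D) -> (A,B): B = C-(D/2), A = B+D */
void haar_inv(int c, int d, int *a, int *b)
{
    assert(a && b);
    *b = c - floor_half(d);
    *a = *b + d;
}

/* 8-point forward transform: three stages, each one transforming the
   averages left over from the previous stage. */
void haar8_fwd(int v[8])
{
    int t[8];
    for (int n = 8; n > 1; n /= 2) {
        for (int i = 0; i < n / 2; i++)
            haar_fwd(v[2 * i], v[2 * i + 1], &t[i], &t[n / 2 + i]);
        for (int i = 0; i < n; i++)
            v[i] = t[i];
    }
}

/* 8-point inverse: undo the stages in the opposite order. */
void haar8_inv(int v[8])
{
    int t[8];
    for (int n = 2; n <= 8; n *= 2) {
        for (int i = 0; i < n / 2; i++)
            haar_inv(v[i], v[n / 2 + i], &t[2 * i], &t[2 * i + 1]);
        for (int i = 0; i < n; i++)
            v[i] = t[i];
    }
}
```

    Applying haar8_fwd to each row and then each column of an 8x8 block
    (and the inverses in the opposite order) gives the block version.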

    The Walsh-Hadamard transform can give similar properties, but generally involves a few extra steps that make it more computationally expensive.

    It is possible to use a lifting transform to make a Reversible DCT, but
    it is slow...


    BGBCC accepts JPEG and PNG for input and can convert them to
    BMP/QOI/UPIC as needed.


    For audio storage, generally using the RIFF WAV format. For bulk audio,
    both A-Law and IMA ADPCM work OK. Granted, IMA ADPCM is not space
    efficient for stereo, but mostly OK for mono (most common use-case for
    sound effects).


    I did exactly that at a period when my generated DLLs were buggy for
    some reason (it turned out to be two reasons). I created a simple
    dynamic library format of my own. Then I found the same format worked
    also for executables.

    But I needed a loader program to run them, as Windows obviously didn't
    understand the format. Such a program can be written in 800 lines of C,
    and can dynamically load libraries in both my format, and proper DLLs (not
    the buggy ones I generated!).

    A hello-world program is under 300 bytes compared with 2 or
    2.5KB of EXE. And the format is portable to Linux, so no need to
    generate ELF (but I haven't tried). Plus the format might be transparent
    to AV software (haven't tried that either).


    OK.

    By design, my PEL format (PE+LZ) isn't going to get under 2K (1K for
    headers, 1K for LZ'ed sections).

    But, usually this is not a problem.


  • From BGB@cr88192@gmail.com to comp.lang.c on Fri Dec 20 05:10:44 2024
    From Newsgroup: comp.lang.c

    On 12/19/2024 2:36 PM, BGB wrote:
    On 12/19/2024 5:27 AM, bart wrote:
    On 19/12/2024 05:46, BGB wrote:
    On 12/18/2024 6:35 PM, bart wrote:
    On 19/12/2024 00:27, BGB wrote:

    By-Value Structs smaller than 16 bytes are passed as-if they were a
    64 or 128 bit integer type (as a single register or as a register
    pair, with a layout matching their in-memory representation).

    ...


    But, yeah, at the IL level, one could potentially eliminate structs
    and arrays as a separate construct, and instead have bare pointers
    and a generic "reserve a blob of bytes in the frame and initialize
    this pointer to point to it" operator (with the business end of
    this operator happening in the function prolog).

    The problem with this, that I mentioned elsewhere, is how well it
    would work with SYS V ABI, since the rules for structs are complex,
    and apparently recursive.

    Having just a block of bytes might not be enough.

    In my case, I am not bothering with the SysV style ABI's (well, along
    with there not being any x86 or x86-64 target...).

    I'd imagine it's worse with ARM targets as there are so many more
    registers to try and deconstruct structs into.


    Not messed much with the ARM64 ABI or similar, but I will draw the line
    in the sand somewhere.

    Struct passing/return is enough of an edge case that one can just sort
    of declare it "no go" between compilers with "mostly but not strictly compatible" ABIs.



    For my ISA, it is a custom ABI, but follows mostly similar rules to
    some of the other "Microsoft style" ABIs (where, I have noted that
    across multiple targets, MS tools have tended to use similar ABI
    designs).

    When you do your own thing, it's easy.

    In the 1980s, I didn't need to worry about call conventions used for
    other software, since there /was/ no other software! I had to write
    everything, save for the odd calls to DOS which used some form of
    SYSCALL.

    Then, arrays and structs were actually passed and returned by value
    (not via hidden references), by copying the data to and from the stack.

    However, I don't recall ever using the feature, as I considered it
    inefficient. I always used explicit references in my code.


    Most of the time, one is passing/returning structures as pointers, and
    not by value.

    By value structures are usually small.


    When a structure is not small, it is both simpler to implement, and
    usually faster, to internally pass it by reference.

    If you pass a large structure to a function by value, via an on-stack
    copy, and the function assigns it to another location (say, a global variable):
      Pass by reference: Only a single copy operation is needed;
      Pass by value on-stack: At least two copy operations are needed.

    One also needs to reserve enough space in the function arguments list to hold any structures passed, which could be bad if they are potentially large.



    But, on my ISA, ABI is sort of like:
      R4 ..R7 : Arg0 ..Arg3
      R20..R23: Arg4 ..Arg7
      R36..R39: Arg8 ..Arg11 (optional)
      R52..R55: Arg12..Arg15 (optional)
    Return Value:
      R2, R3:R2 (128 bit)
      R2 is also used to pass in the return value pointer.

    'this':
      Generally passed in either R3 or R18, depending on ABI variant.

    Where, callee-save:
      R8 ..R14,  R24..R31,
      R40..R47,  R56..R63
      R15=SP

    Non-saved scratch:
      R2 ..R7 ,  R16..R23,
      R32..R39,  R48..R55


    Arguments beyond the first 8/16 register arguments are passed on stack.
    In this case, a spill space for the first 8/16 arguments (64 or 128
    bytes) is provided on stack before the first non-register argument.

    If the function accepts a fixed number of arguments and the number of argument registers is 8 or less, spill space need only be provided for
    the first 8 arguments (calling vararg functions will always reserve
    space for 16 registers in the 16-register ABI). This spill space
    effectively belongs to the callee rather than the caller.


    Structures (by value):
      1.. 8 bytes: Passed in a single register
      9..16 bytes: Passed in a pair, padded to the next even pair
      17+: Pass as a reference.

    Things like 128-bit types are also passed/returned in register pairs.



    Contrast, RV ABI:
      X10..X17 are used for arguments;
      No spill space is provided;
      ...

    My variant uses similar rules to my own ABI for passing/returning structures, with:
      X28, structure return pointer
      X29, 'this'
    Normal return values go into X10 or X11:X10.



    Note that in both ABI's, passing 'this' in a register would mean that
    class instances and COM objects are not equivalent (COM object methods always pass 'this' as the first argument).

    The 'this' register is implicitly also used by lambdas to pass in the pointer to the captured bindings area (which mostly resembles a
    structure containing each variable captured by the lambda).

    Can note though that in this case, capturing a binding by reference
    means the lambda is limited to automatic lifetime (non-automatic lambdas
    may only capture by value). In this case, capture by value is the default.


    For my compiler targeting RISC-V, it uses a variation of RV's ABI rules.
    Argument passing is basically similar, but struct pass/return is
    different; and it passes floating-point values in GPRs (and, in my
    own ISA, all floating-point values use GPRs, as there are no FPU
    registers; though FPU registers do exist for RISC-V).

    Supporting C's variadic functions, which is needed for many languages
    when calling C across an FFI, usually requires different rules. On
    Win64 ABI for example, by passing low variadic arguments in both GPRs
    and FPU registers.


    I simplified things by assuming only GPRs are used.


    /Implementing/ variadic functions (which only occurs if implementing
    C) is another headache if it has to work with the ABI (which can be
    assumed for a non-static function).

    I barely have a working solution for Win64 ABI, which needs to be done
    via stdarg.h, but wouldn't have a clue how to do it for SYS V.

    (Even Win64 has problems, as it assumes a downward-growing stack; in
    my IL interpreter, the stack grows upwards!)


    Most targets use a downward growing stack.
    Mine is no exception here...


    Not likely a huge issue as one is unlikely to use ELF and PE/COFF in
    the same program.


    For the "OS" that runs on my CPU core, it is natively using PE/COFF, but
    That's interesting: you deliberately used one of the most complex file
    formats around, when you could have devised your own?


    For what I wanted, I would have mostly needed to recreate most of the
    same functionality as PE/COFF anyways.


    When one considers the entire loading process (including DLLs/SOs), then PE/COFF loading is actually simpler than ELF loading (ELF subjects the loader to needing to deal with symbol and relocation tables), similar to
    PIE loading.


    My wording there sucked...

    PIE loading is the same as the case for ELF shared object loading, so is fairly complex.

    For normal loading, they try to make it simpler for the kernel loader by having a special "interpreter" program deal with it. The process it then
    uses to bootstrap itself is rather convoluted.


    Things like the MZ stub are optional in my case, and mostly ignored if present (in my LZ compressed PE variants, the MZ stub is omitted entirely).


    My loader will accept multiple sub-variants:
    With MZ stub (original format);
    Without MZ stub (but uncompressed);
    With LZ4 compression (no MZ stub allowed).


    The format for the no-stub case is basically the same as the with-stub
    case, except that the stub is absent; the 'PE' sig is still present.

    Note that in my variants, omitting the MZ stub does cause it to change
    to a different checksum algorithm (the original PE/COFF checksum being unacceptably weak).



    I had at one point considered doing a custom format resembling LZ
    compressed MachO, but ended up not bothering, as it wouldn't have really saved anything over LZ compressed PE/COFF.


    The core process is still:
    Read stuff into memory;
    Apply post-load fixups.

    This part of the process was essentially unavoidable.


    Some "unneeded cruft" like the Resource Section was discarded, mostly replaced by an embedded WAD2 image. The header was modified some to
    allow for backwards compatibility with the Windows format (mostly
    creating a dummy header in the original format that points to the WAD2 directory).


    Note that the change of resource section format was more because the
    original approach to the resource section made little sense to me.

    Identifying things with short names made a lot more sense than magic
    numbers.

    The WAD approach worked for Doom and similar, and is probably sufficient
    for things like inline bitmap images and icons.


    Idea is that icons, bitmaps, and other things, would mostly be held in
    WAD lumps. Though, resources which may be accessed via symbols in the EXE/DLL need to be stored uncompressed (where "__rsrc_lumpname" may be
    used to access the contents of resource-section lumps as an extern symbol).


    Note that it can also load blobs of text or binary data.
    Though, BGBCC provides less in terms of format converters for arbitrary
    data.

    A special text format is used both to define files to pull into the
    resource section (and what lump name to use), as well as format
    conversions to apply.


    Say, for example:
      extern byte __rsrc_mybitmap[];  //resolves to a DIB/BMP or similar

    For now, resource formats:
      Images:
        BMP (various settings)
          4, 8, and 16 bpp typical
          Supports a non-standard 16-bpp alpha-blended mode (*1).
          Supports non-standard 16 color and 256 color with transparent.
          Supports CRAM BMP as well (2 bpp)
        QOI (assumes RGBA32, nominally lossless)
          QOI is a semi-simplistic non-entropy-coded format.
          Can give PNG-like compression in some cases.
          Reasonably fast/cheap to decode.
        LCIF, custom lossy format, color-cell compression.
          OK Q/bpp but mostly only on the low-end.
          Resembles a QOI+CRAM hybrid.
        UPIC, lossy or lossless, JPEG-like (*2)

    *1:
      0rrrrrgggggbbbbb  Normal/Opaque
      1rrrraggggabbbba  With 3 bit alpha (4b/ch RGB).

    For 16 and 256 color, a variant is supported with a transparent color.
    Generally the high intensity magenta is reused as the transparent color.
    This is encoded in the color palette (if all colors apart from one have
    the alpha bits set to FF, and one color has 00, then that color is
    assumed to be a transparent color).

    CRAM bpp: Uses a limited form of the 8-bit CRAM format:
      16 bits, 4x4 pixels, 1 bit per pixel
      2x 8 bits: Color Endpoints
    The rest of the format being unsupported, so it can simply assume a
    fixed 32-bits per 4x4 pixel cell.


    There being cases where one may want this...
    If an image doesn't have more than 2 colors per 4x4 cell, it may give an acceptable image (and is often less space than 16-color).

    Though, for small images, 16 color may use less space due to a smaller
    color palette (but, in theory, could add a special case to allow
    omitting the color palette when it is the default palette).

    Say:
    biBitCount=8, biClrUsed=0, biClrImportant=256
    Encoding a special "palette is absent, use fixed OS palette" case.
    As the BMP format burns 1K just to encode a 256-color palette.




    *2: The UPIC format is structurally similar to JPEG, but:
      Uses TLV packaging (vs FF-escape tagging);
      Uses Rice coding (vs Huffman)
      Uses Z3.V5 VLC, vs Z4.V4
      Uses Block-Haar and RCT
        Vs DCT and YCbCr.
      Supports an alpha channel.
        Y    1       (*2A)
        YA   1:1     (*2A)
        YUV  4:2:0
        YUV  4:4:4   (*2A)
        YUVA 4:2:0:4
        YUVA 4:4:4:4 (*2A)
      *2A: May be used in the lossless modes, depending on image.


    VLC coding resembles Deflate's match distance encoding, with sign-folded values. Runs of zero coefficients have a shorter limit, but similar.
    Like with JPEG, an 0x00 symbol encodes an early EOB.


    Also, UPIC is a custom format.

    Add context:
    Actually, it is using an entropy coding scheme I call STF+AdRice:
    Swap towards front, with Adaptive Rice Coding.

    The Rice coding parameter (k) is adapted based on Q:
    0: k--;
    1: no change;
    2..7: k++
    8: k++; Symbol index encoded as a raw 8 bits.
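
    Read as code, the adaptation rule might look like the following sketch. It takes Q to be the length of the unary prefix (0..8), and the clamping of k to 0..7 is an assumption added here; the post does not state any bounds. `adapt_rice_k` is a made-up name.

```c
#include <assert.h>

/* Update the Rice parameter k from the unary prefix length q (0..8).
 * q == 8 is the escape case (the index follows as a raw 8 bits); it
 * adapts k the same way as 2..7. */
static int adapt_rice_k(int k, int q)
{
    assert(q >= 0 && q <= 8);
    if (q == 0)
        k--;            /* value was small: shrink k */
    else if (q >= 2)
        k++;            /* value was large: grow k   */
    /* q == 1 leaves k unchanged */
    if (k < 0) k = 0;   /* clamps are assumptions, not from the post */
    if (k > 7) k = 7;
    return k;
}
```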

    Symbols are encoded as indices into a table. Whenever an index is
    encoded, the symbol swaps places with the symbol at (I*15)/16, causing
    more commonly used symbols to migrate towards 0.
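
    The swap-towards-front step can be sketched as below. The function names are hypothetical, and starting from an identity table (index i holds symbol i) is an assumption; the post does not say how the table is initialized.

```c
#include <assert.h>
#include <stdint.h>

/* Initialise the table so that index i holds symbol i (assumed). */
static void stf_init(uint8_t table[256])
{
    for (int i = 0; i < 256; i++)
        table[i] = (uint8_t)i;
}

/* Look up the symbol stored at index idx, then swap it with the entry
 * at (idx*15)/16 so that frequently used symbols drift towards 0. */
static uint8_t stf_decode_index(uint8_t table[256], int idx)
{
    assert(idx >= 0 && idx < 256);
    int j = (idx * 15) / 16;     /* target slot, slightly nearer 0 */
    uint8_t sym = table[idx];
    table[idx] = table[j];
    table[j] = sym;
    return sym;
}
```

    An encoder would need to apply the same swap after emitting each index so that both sides keep identical tables.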

    Theoretically, the decoding process is more complex than a table-driven
    static Huffman decoder (as well as worse compression), but:
    Less memory is needed;
    Faster to initialize;
    On average, it is speed competitive.
    Lookup table initialization for static Huffman is expensive;
    Decode speed hindered by high L1 miss rates.

    With a 15-bit symbol-length limit, Huffman has a very high L1 miss rate. Generally, to be fast, one needs to impose a 12 or 13 bit symbol length
    limit, reducing compression, but greatly reducing the number of L1
    misses. Though, 12 bits is a lower limit in practice (going much less
    than this, and Huffman coding becomes ineffective).



    In tests, on my main PC:
      Vs JPEG: It is a little faster
        Q/bpp is similar, better/worse depends on image.
          Slightly worse on photos, but "similar".
          Generally somewhat better on artificial images.
      Vs PNG:
        Faster to decode (with less memory overhead);
        Better compression on many images (particularly photo-like).

    Note that UPIC was designed to not require any large intermediate
    buffers, so will decode directly to an RGB555 or RGBA32 output buffer (decoding happens in terms of individual 16x16 pixel macroblocks).

    It was designed to be moderately fast and to try to minimize memory
    overhead for decoding (vs either PNG or JPEG, which need a more
    significant chunk of working memory to decode).


    Block-Haar is a Haar transform made to fit the same 8x8 pixel blocks as
    DCT, where Haar maps (A,B)->(C,D):
      C=(A+B)/2  (*: X/2 here being defined as (X>>1))
      D=A-B
    But, can be reversed exactly, IIRC:
      B=C-(D/2)
      A=B+D
    By doing multiple stages of Haar transform, one can build an 8-pixel version, and then use horizontal and vertical transforms for an 8x8
    block. It is computationally fairly cheap, and lossless.
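
    The pair transform and its inverse can be checked with a small sketch. The inverse works because A+B = 2B+D, so C = B + floor(D/2). The helper names are made up here, and `half_floor` stands in for the (X>>1) of the text, since right-shifting a negative value is implementation-defined in C.

```c
#include <assert.h>

/* floor(x / 2) for any small int, avoiding >> on negative values. */
static int half_floor(int x)
{
    return (x >= 0) ? (x / 2) : -((-x + 1) / 2);
}

/* Forward step: (A,B) -> (C,D) with C = floor((A+B)/2), D = A-B. */
static void haar_fwd(int a, int b, int *c, int *d)
{
    assert(c != 0 && d != 0);
    *c = half_floor(a + b);
    *d = a - b;
}

/* Inverse step: B = C - floor(D/2), A = B + D. */
static void haar_inv(int c, int d, int *a, int *b)
{
    assert(a != 0 && b != 0);
    *b = c - half_floor(d);
    *a = *b + d;
}
```

    An 8-point version would apply `haar_fwd` to adjacent pairs and then recurse on the C outputs; that part is left out of the sketch.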

    The Walsh-Hadamard transform can give similar properties, but generally involves a few extra steps that make it more computationally expensive.

    It is possible to use a lifting transform to make a Reversible DCT, but
    it is slow...


    Also, the code-size footprint for UPIC is smaller than a JPEG decoder.



    BGBCC accepts JPEG and PNG for input and can convert them to BMP/QOI/
    UPIC as needed.


    For audio storage, generally using the RIFF WAV format. For bulk audio,
    both A-Law and IMA ADPCM work OK. Granted, IMA ADPCM is not space
    efficient for stereo, but mostly OK for mono (most common use-case for
    sound effects).


    This isn't used much yet in this project.

    In general, for other cases where I use audio, 16kHz is a typical default.

    Where:
    8 and 11 kHz sound poor.
    Also 8-bit linear PCM sounds poor.

    I am less a fan of MP3:
    Very complex decoder;
    Much under 96 or 128 kbps, has very obvious audio distortions...
    At lower bitrates, the audio quality is decidedly unpleasant.
    IMHO: 16 kHz ADPCM sounds better than 64 kbps MP3.

    Not sure why it is so popular, when, as noted, at lower bitrates it
    sounds pretty broken (but, then again, it mostly sounds fine at 128
    kbps or beyond, so dunno).

    ADPCM's property of sounding tinny is still preferable to sounding like
    one is rattling a steel can full of broken glass, IMHO.


    Did experimentally create an MP3-like audio codec (but much simpler),
    also using Block-Haar (rather than MDCT), and reused some amount of code
    from UPIC, which seems to avoid some of MP3's more obvious artifacts.
    But, the design did have a few of its own issues (might need to revisit later).

    Mostly, it uses a half-cubic spline to approximate the low-frequency components (and try to reduce blocking artifacts; the spline is
    subtracted out so only higher frequency components use the Block-Haar),
    but seemingly the spline was too coarse (one sample per block), and I
    would likely need a higher effective sampling rate for the spline to
    avoid blocking artifacts in some cases (mostly, with sounds at roughly
    the same frequency as the block size effectively resulting in square
    waves, which sound bad).


    I did exactly that at a period when my generated DLLs were buggy for
    some reason (it turned out to be two reasons). I created a simple
    dynamic library format of my own. Then I found the same format worked
    also for executables.

    But I needed a loader program to run them, as Windows obviously didn't
    understand the format. Such a program can be written in 800 lines of
    C, and can dynamically load libraries in both my format, and proper DLLs
    (not the buggy ones I generated!).

    A hello-world program is under 300 bytes compared with 2 or
    2.5KB of EXE. And the format is portable to Linux, so no need to
    generate ELF (but I haven't tried). Plus the format might be
    transparent to AV software (haven't tried that either).


    OK.

    By design, my PEL format (PE+LZ) isn't going to get under 2K (1K for headers, 1K for LZ'ed sections).

    But, usually this is not a problem.



  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Fri Dec 20 17:28:29 2024
    From Newsgroup: comp.lang.c

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 16.12.2024 00:53, BGB wrote:

    [...]

    Pretty much all higher level control flow can be expressed via goto.

    A 'goto' may be used but it isn't strictly *necessary*. What *is*
    necessary, though, that is an 'if' (some conditional branch), and
    either 'goto' or recursive functions.

    Conditional branches, including 'if', '?:', etc., are not strictly
    necessary either.
  • From Lawrence D'Oliveiro@ldo@nz.invalid to comp.lang.c on Sat Dec 21 05:34:07 2024
    From Newsgroup: comp.lang.c

    On Tue, 17 Dec 2024 13:07:44 -0600, BGB wrote:

    Every variable may only be assigned once ...

    Note this only applies to registers.
  • From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Sat Dec 21 21:31:24 2024
    From Newsgroup: comp.lang.c

    On 21.12.2024 02:28, Tim Rentsch wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 16.12.2024 00:53, BGB wrote:

    [...]

    Pretty much all higher level control flow can be expressed via goto.

    A 'goto' may be used but it isn't strictly *necessary*. What *is*
    necessary, though, that is an 'if' (some conditional branch), and
    either 'goto' or recursive functions.

    Conditional branches, including 'if', '?:', etc., are not strictly
    necessary either.

    No? - Can you give an example of your statement?

    (Unless you just wanted to say that in some HLL abstraction like
    'printf("Hello world!\n")' there's no [visible] conditional branch.
    Likewise in a 'ClearAccumulator' machine instruction, or the like.)

    The comparisons and predicates are one key function (not any specific
    branch construct, whether on HLL level, assembler level, or with the (elementary but most powerful) Turing Machine). Comparisons inherently
    result in predicates, which is what controls program execution.

    So your statement asks for some explanation at least.

    Janis

  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Sat Dec 21 13:51:27 2024
    From Newsgroup: comp.lang.c

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 21.12.2024 02:28, Tim Rentsch wrote:

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 16.12.2024 00:53, BGB wrote:

    [...]

    Pretty much all higher level control flow can be expressed via goto.

    A 'goto' may be used but it isn't strictly *necessary*. What *is*
    necessary, though, that is an 'if' (some conditional branch), and
    either 'goto' or recursive functions.

    Conditional branches, including 'if', '?:', etc., are not strictly
    necessary either.

    No? - Can you give an example of your statement?

    (Unless you just wanted to say that in some HLL abstraction like 'printf("Hello world!\n")' there's no [visible] conditional branch.
    Likewise in a 'ClearAccumulator' machine instruction, or the like.)

    The comparisons and predicates are one key function (not any specific
    branch construct, whether on HLL level, assembler level, or with the (elementary but most powerful) Turing Machine). Comparisons inherently result in predicates, which is what controls program execution.

    So your statement asks for some explanation at least.

    Start with C - any of C90, C99, C11.

    Take away the short-circuiting operators - &&, ||, ?:.

    Take away all statement types that involve intra-function transfer
    of control: goto, break, continue, if, for, while, switch, do/while.
    Might as well take away statement labels too.

    Take away setjmp and longjmp.

    Rule out programs with undefined behavior.

    The language that is left is still Turing complete.

    Proof: exercise for the reader.

  • From Michael S@already5chosen@yahoo.com to comp.lang.c on Sun Dec 22 00:20:32 2024
    From Newsgroup: comp.lang.c

    On Sat, 21 Dec 2024 21:31:24 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    So your statement asks for some explanation at least.

    Janis


    I would guess that Tim worked as a CS professor for several dozen years.
    And it shows.

  • From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Sun Dec 22 01:13:07 2024
    From Newsgroup: comp.lang.c

    On 21.12.2024 23:20, Michael S wrote:
    On Sat, 21 Dec 2024 21:31:24 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    So your statement asks for some explanation at least.

    I would guess that Tim worked as a CS professor for several dozen years.
    And it shows.

    Ranks and titles are, per se, no guarantee. I'm not impressed; I've
    seen all sorts/qualities of professors. YMMV.

    If that is true (that he was one) I'm wondering why we observe so
    often that he posts statements here and doesn't care to explain it.
    At least the many _good_ professors I met in my life typically were
    keen to explain their theses, statements, or knowledge (instead of
    dragging that out of him).

    Janis

  • From Michael S@already5chosen@yahoo.com to comp.lang.c on Sun Dec 22 02:18:51 2024
    From Newsgroup: comp.lang.c

    On Sun, 22 Dec 2024 01:13:07 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 21.12.2024 23:20, Michael S wrote:
    On Sat, 21 Dec 2024 21:31:24 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    So your statement asks for some explanation at least.

    I would guess that Tim worked as a CS professor for several dozen
    years. And it shows.

    Ranks and titles are, per se, no guarantee. I'm not impressed; I've
    seen all sorts/qualities of professors. YMMV.

    If that is true (that he was one) I'm wondering why we observe so
    often that he posts statements here and doesn't care to explain it.
    At least the many _good_ professors I met in my life typically were
    keen to explain their theses, statements, or knowledge (instead of
    dragging that out of him).

    Janis


    It seems, you didn't understand me. (Ogh, it is contagious ;-)

  • From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Sun Dec 22 01:22:01 2024
    From Newsgroup: comp.lang.c

    On 21.12.2024 22:51, Tim Rentsch wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 21.12.2024 02:28, Tim Rentsch wrote:

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 16.12.2024 00:53, BGB wrote:

    [...]

    Pretty much all higher level control flow can be expressed via goto.

    A 'goto' may be used but it isn't strictly *necessary*. What *is*
    necessary, though, that is an 'if' (some conditional branch), and
    either 'goto' or recursive functions.

    Conditional branches, including 'if', '?:', etc., are not strictly
    necessary either.

    No? - Can you give an example of your statement?

    (Unless you just wanted to say that in some HLL abstraction like
    'printf("Hello world!\n")' there's no [visible] conditional branch.
    Likewise in a 'ClearAccumulator' machine instruction, or the like.)

    The comparisons and predicates are one key function (not any specific
    branch construct, whether on HLL level, assembler level, or with the
    (elementary but most powerful) Turing Machine). Comparisons inherently
    result in predicates, which is what controls program execution.

    So your statement asks for some explanation at least.

    Start with C - any of C90, C99, C11.

    Take away the short-circuiting operators - &&, ||, ?:.

    Take away all statement types that involve intra-function transfer
    of control: goto, break, continue, if, for, while, switch, do/while.
    Might as well take away statement labels too.

    Take away setjmp and longjmp.

    And also things like the above mentioned 'printf()' that most certainly
    implies an iteration over the format string checking for its '\0'-end.
    And so on, and so on. - What will be left as "language"?

    Would you be able to formulate functionality of the class of Recursive Functions (language class of a Turing Machine with Chomsky-0 grammar)?


    Rule out programs with undefined behavior.

    The language that is left is still Turing complete.

    Is it? - But wouldn't that be just the argument I mentioned above; that
    a, say, 'ClearAccumulator' machine statement wouldn't contain any jump?

    Proof: exercise for the reader.

    (Typical sort of your reply.)

    Janis

  • From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Sun Dec 22 01:39:49 2024
    From Newsgroup: comp.lang.c

    On 22.12.2024 01:18, Michael S wrote:
    On Sun, 22 Dec 2024 01:13:07 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 21.12.2024 23:20, Michael S wrote:
    On Sat, 21 Dec 2024 21:31:24 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    So your statement asks for some explanation at least.

    I would guess that Tim worked as a CS professor for several dozen
    years. And it shows.

    Ranks and titles are, per se, no guarantee. I'm not impressed; I've
    seen all sorts/qualities of professors. YMMV.

    If that is true (that he was one) I'm wondering why we observe so
    often that he posts statements here and doesn't care to explain it.
    At least the many _good_ professors I met in my life typically were
    keen to explain their theses, statements, or knowledge (instead of
    dragging that out of him).

    It seems, you didn't understand me. (Ogh, it is contagious ;-)

    I'm sorry, no. - I certainly took it literally - as I do (at first)
    with most people and their statements (until I get to know better).

    If it was meant sarcastically or anything, I'd appreciate a smiley
    or something like that. (It certainly wasn't obvious to me.)

    If it was meant serious and I completely missed the point - which
    may also happen occasionally - I'd appreciate a pointer.

    Janis

  • From Michael S@already5chosen@yahoo.com to comp.lang.c on Sun Dec 22 03:04:51 2024
    From Newsgroup: comp.lang.c

    On Sun, 22 Dec 2024 01:39:49 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 22.12.2024 01:18, Michael S wrote:
    On Sun, 22 Dec 2024 01:13:07 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 21.12.2024 23:20, Michael S wrote:
    On Sat, 21 Dec 2024 21:31:24 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    So your statement asks for some explanation at least.

    I would guess that Tim worked as a CS professor for several dozen
    years. And it shows.

    Ranks and titles are, per se, no guarantee. I'm not impressed; I've
    seen all sorts/qualities of professors. YMMV.

    If that is true (that he was one) I'm wondering why we observe so
    often that he posts statements here and doesn't care to explain it.
    At least the many _good_ professors I met in my life typically were
    keen to explain their theses, statements, or knowledge (instead of
    dragging that out of him).

    It seems, you didn't understand me. (Ogh, it is contagious ;-)

    I'm sorry, no. - I certainly took it literally - as I do (at first)
    with most people and their statements (until I get to know better).

    If it was meant sarcastically or anything, I'd appreciate a smiley
    or something like that. (It certainly wasn't obvious to me.)

    If it was meant serious and I completely missed the point - which
    may also happen occasionally - I'd appreciate a pointer.

    Janis



    Part of the answer is in your previous response.
    You wrote: "many _good_ professors I met in my life typically were
    keen to explain their theses, statements, or knowledge (instead of
    dragging that out of him)". You essentially admitted that not all good professors behave like that.

    There is more than one school of teaching. One school believes that
    students learn from explanations and exercises. Another school believes
    that students learn best when provided with bare basics and then asked
    to figure out the rest by themselves. There is also a third school
    that believes that students don't really learn anything before they try
    to explain it to somebody else.

    You give the impression of someone who received the basics of CS. Probably 40
    or so years ago, but still you have to know basic facts. Unlike me, for example.
    So, Tim expects that you will be able to utilize his hints. And that
    it would lead to much better understanding on your part than if he
    feeds you by teaspoon.
    That is one part. Another part is that he is annoyed by your tone.



  • From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Sun Dec 22 03:06:54 2024
    From Newsgroup: comp.lang.c

    On 22.12.2024 02:04, Michael S wrote:
    [...]

    Part of the answer is in your previous response.
    You wrote: "many _good_ professors I met in my life typically were
    keen to explain their theses, statements, or knowledge (instead of
    dragging that out of him)". You essentially admitted that not all good professors behave like that.

    Oh, what I meant to express was different; that good professors
    *would* explain it (only bad ones wouldn't).

    (At least that was my experience; and not only covering the CS
    domain, BTW.)

    [ "schools of teaching" stuff snipped ]

    You give the impression of someone who received the basics of CS. Probably 40
    or so years ago, but still you have to know basic facts. Unlike me, for example.
    So, Tim expects that you will be able to utilize his hints.

    The point [repeatedly] stated (also by others here) was that
    he more often than not just provides no information but simple
    arbitrary statements of opinion.

    *Especially* if folks here that are discussing CS stuff have 40
    or 50 years experience, as you say, with academical and practical
    background one would think that a non-substantial "kindergarten"
    statement is then effectively just an offense (or likely part of
    an arrogant [professorial?] behavior).

    And that
    it would lead to much better understanding on your part than if he
    feeds you by teaspoon.

    Which he doesn't do.

    Moreover, given that many of the folks here obviously *do* have
    a solid background (or at least years long IT or CS experiences)
    should, IMO, be a motivation to try to explain any arguable point
    if one really cares about the topic. (Unless some habit, maybe of
    being an inerrant authority, prevents one from such.)

    I myself at least try to explain what I know and back it up with
    experience, not just throw short phrases into the pool.

    That is one part. Another part is that he is annoyed by your tone.

    (And I'm annoyed by his. But, anyway, his posting tone is the
    same in most of his responses to folks here, not just to me.)

    Janis

  • From James Kuyper@jameskuyper@alumni.caltech.edu to comp.lang.c on Sat Dec 21 22:17:20 2024
    From Newsgroup: comp.lang.c

    On 12/21/24 20:04, Michael S wrote:
    ...
    There is more than one school of teaching. One school believes that
    students learn from explanations and exercises. Another school believes
    that students learn best when provided with bare basics and then asked
    to figure out the rest by themselves.

    I personally believe that Tim generally thinks there's a justification
    for what he says, and that we'd be better off figuring it out ourselves.
    I also know, from the rare occasions when he's been convinced to provide
    his justification, that I often don't consider his justification valid. However, he says things that seem to be unjustified so often, I can't
    help wondering if he doesn't occasionally say things he realizes are unjustified (either at the time, or as the result of subsequent
    discussion), and withholds his justifications in order to hide the fact
    that he knows he was wrong. Probably not, but I keep wondering.
  • From antispam@antispam@fricas.org (Waldek Hebisch) to comp.lang.c on Sun Dec 22 06:01:52 2024
    From Newsgroup: comp.lang.c

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    On 21.12.2024 02:28, Tim Rentsch wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 16.12.2024 00:53, BGB wrote:

    [...]

    Pretty much all higher level control flow can be expressed via goto.

    A 'goto' may be used but it isn't strictly *necessary*. What *is*
    necessary, though, that is an 'if' (some conditional branch), and
    either 'goto' or recursive functions.

    Conditional branches, including 'if', '?:', etc., are not strictly
    necessary either.

    No? - Can you give an example of your statement?

    Look at example that I posted (apparently neither you nor Tim
    looked at my posts where I explained in detail how to translate
    goto program (with conditional jumps) into program that contains
    no goto and no conditional jumps).

    Or try to figure out how to do this knowing that C has function
    pointers.
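
    One way to read the function-pointer hint, as a minimal sketch (this is not the example from the earlier posts, which is not reproduced here): a comparison yields 0 or 1, and that value can index a table of function pointers, so the choice between "stop" and "continue" needs no if, ?:, && or ||, and recursion stands in for the loop. All names below are made up for the illustration, and it only shows conditional selection plus repetition, not a full proof of Turing completeness.

```c
typedef unsigned long (*step_fn)(unsigned long n, unsigned long acc);

static unsigned long fact_done(unsigned long n, unsigned long acc);
static unsigned long fact_more(unsigned long n, unsigned long acc);

/* Index 0: keep going; index 1: stop. */
static const step_fn fact_table[2] = { fact_more, fact_done };

/* Terminal case: hand back the accumulated product. */
static unsigned long fact_done(unsigned long n, unsigned long acc)
{
    (void)n;
    return acc;
}

/* Recursive case, only ever called with n >= 1. */
static unsigned long fact_more(unsigned long n, unsigned long acc)
{
    /* (n - 1 == 0) is 0 or 1; it selects a function, it does not branch */
    return fact_table[n - 1 == 0](n - 1, acc * n);
}

/* factorial(n) with no if, ?:, &&, ||, goto, or loop statement. */
static unsigned long factorial(unsigned long n)
{
    return fact_table[n == 0](n, 1);
}
```

    The comparison operators are still there, of course; what has gone is any statement or operator that transfers control within a function.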
    --
    Waldek Hebisch
  • From Michael S@already5chosen@yahoo.com to comp.lang.c on Sun Dec 22 11:22:51 2024
    From Newsgroup: comp.lang.c

    On Sun, 22 Dec 2024 06:01:52 -0000 (UTC)
    antispam@fricas.org (Waldek Hebisch) wrote:

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    On 21.12.2024 02:28, Tim Rentsch wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 16.12.2024 00:53, BGB wrote:

    [...]

    Pretty much all higher level control flow can be expressed via
    goto.

    A 'goto' may be used but it isn't strictly *necessary*. What *is*
    necessary, though, that is an 'if' (some conditional branch), and
    either 'goto' or recursive functions.

    Conditional branches, including 'if', '?:', etc., are not strictly
    necessary either.

    No? - Can you give an example of your statement?

    Look at example that I posted (apparently neither you nor Tim
    looked at my posts where I explained in detail how to translate
    goto program (with conditional jumps) into program that contains
    no goto and no conditional jumps).

    Considering that Janis replied to your post I find a possibility that
    he did not look at it unlikely. Although not completely impossible.


    Or try to figure out how to do this knowing that C has function
    pointers.




  • From bart@bc@freeuk.com to comp.lang.c on Sun Dec 22 11:35:53 2024
    From Newsgroup: comp.lang.c

    On 22/12/2024 09:22, Michael S wrote:
    On Sun, 22 Dec 2024 06:01:52 -0000 (UTC)
    antispam@fricas.org (Waldek Hebisch) wrote:

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    On 21.12.2024 02:28, Tim Rentsch wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 16.12.2024 00:53, BGB wrote:

    [...]

    Pretty much all higher level control flow can be expressed via
    goto.

    A 'goto' may be used but it isn't strictly *necessary*. What *is*
    necessary, though, that is an 'if' (some conditional branch), and
    either 'goto' or recursive functions.

    Conditional branches, including 'if', '?:', etc., are not strictly
    necessary either.

    No? - Can you give an example of your statement?

    Look at example that I posted (apparently neither you nor Tim
    looked at my posts where I explained in detail how to translate
    goto program (with conditional jumps) into program that contains
    no goto and no conditional jumps).

    Considering that Janis replied to your post I find a possibility that
    he did not look at it unlikely. Although not completely impossible.

    He only replied to the first remark. And summarised the rest with:

    "[ ponderings about where recursive functions might be used ]"

    (18-Dec, 16:26 GMT)

    I don't think JP does details, and I've struggled to find posts where he writes actual code. His replies to mine have mostly been about trying to
    beat me over the head.


  • From Ben Bacarisse@ben@bsb.me.uk to comp.lang.c on Sun Dec 22 14:19:13 2024
    From Newsgroup: comp.lang.c

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 21.12.2024 02:28, Tim Rentsch wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 16.12.2024 00:53, BGB wrote:

    [...]

    Pretty much all higher level control flow can be expressed via goto.

    A 'goto' may be used but it isn't strictly *necessary*. What *is*
    necessary, though, that is an 'if' (some conditional branch), and
    either 'goto' or recursive functions.

    Conditional branches, including 'if', '?:', etc., are not strictly
    necessary either.

    No? - Can you give an example of your statement?

    I don't want to speak for Tim, but as far as I am concerned, it all
    boils down to what you take to be a model of (effective) computation.
    In some purely theoretical sense, models like the pure lambda calculus
    and combinator calculus are "complete" and they have no specific
    conditional "branches".

    Going into detail (such as examples of making a "choice" in pure lambda calculus) are way off topic here.

    This is exactly what comp.theory should be used for, so I will cross
    post there and set the followup-to header. comp.theory has been trashed
    by cranks but maybe a topical post will help it a bit.
    --
    Ben.
  • From Ben Bacarisse@ben@bsb.me.uk to comp.lang.c on Sun Dec 22 15:30:30 2024
    From Newsgroup: comp.lang.c

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 21.12.2024 02:28, Tim Rentsch wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 16.12.2024 00:53, BGB wrote:

    [...]

    Pretty much all higher level control flow can be expressed via goto.

    A 'goto' may be used but it isn't strictly *necessary*. What *is*
    necessary, though, that is an 'if' (some conditional branch), and
    either 'goto' or recursive functions.

    Conditional branches, including 'if', '?:', etc., are not strictly
    necessary either.

    No? - Can you give an example of your statement?

    I don't want to speak for Tim, but as far as I am concerned, it all
    boils down to what you take to be a model of (effective) computation.
    In some purely theoretical sense, models like the pure lambda calculus
    and combinator calculus are "complete" and they have no specific
    conditional "branches".

    Going into detail (such as examples of making a "choice" in pure lambda calculus) are way off topic here.

    This is exactly what comp.theory should be used for, so I will cross
    post there and set the followup-to header. comp.theory has been trashed
    by cranks but maybe a topical post will help it a bit.

    I see from a post I had not read before replying that Tim's point was
    very much focused on C. Given that theory is off topic here (and
    comp.theory is a mess) there is probably no point in trying to
    discuss the more general point.
    --
    Ben.
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Sun Dec 22 10:38:11 2024
    From Newsgroup: comp.lang.c

    antispam@fricas.org (Waldek Hebisch) writes:

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 21.12.2024 02:28, Tim Rentsch wrote:

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 16.12.2024 00:53, BGB wrote:

    [...]

    Pretty much all higher level control flow can be expressed via goto.

    A 'goto' may be used but it isn't strictly *necessary*. What *is*
    necessary, though, that is an 'if' (some conditional branch), and
    either 'goto' or recursive functions.

    Conditional branches, including 'if', '?:', etc., are not strictly
    necessary either.

    No? - Can you give an example of your statement?

    Look at example that I posted (apparently neither you nor Tim
    looked at my posts [...]

    What makes you think I didn't?
  • From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Sun Dec 22 19:51:13 2024
    From Newsgroup: comp.lang.c

    On 22.12.2024 04:17, James Kuyper wrote:
    On 12/21/24 20:04, Michael S wrote:
    ...
    There is more than one school of teaching. One school believes that
    students learn from explanations and exercises. Another school believes
    that students learn best when provided with bare basics and then asked
    to figure out the rest by themselves.

    In the context of this newsgroup, where my impression is that a lot
    of quite old people with many years of IT/CS experience discuss topics,
    the explanatory "model" of "schools of teaching" is completely inappropriate anyway; there's not "one _teacher_ [who knows almost all]" and
    "all the rest are [ignorant] _pupils_" that need to be "guided" (in
    one way or the other). Not saying anything substantial on a topic can
    certainly be perceived as some rhetorical move but it's surely not any
    sort of teaching-didactics [of whatever "school of teaching"].


    I personally believe that Tim generally thinks there's a justification
    for what he says, and that we'd be better off figuring it out ourselves.

    (My impression is that he often says something on a topic where he has
    no deeper knowledge, but is pretending to know by not saying anything
    substantial.)

    I also know, from the rare occasions when he's been convinced to provide
    his justification, that I often don't consider his justification valid.
    However, he says things that seem to be unjustified so often, I can't
    help wondering if he doesn't occasionally say things he realizes are
    unjustified (either at the time, or as the result of subsequent
    discussion), and withholds his justifications in order to hide the fact
    that he knows he was wrong. Probably not, but I keep wondering.

    (This matches with my observations and I drew a similar conclusion.)

    Janis

  • From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.c on Sun Dec 22 20:41:44 2024
    From Newsgroup: comp.lang.c

    On 22.12.2024 07:01, Waldek Hebisch wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    On 21.12.2024 02:28, Tim Rentsch wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 16.12.2024 00:53, BGB wrote:

    [...]

    Pretty much all higher level control flow can be expressed via goto.

    A 'goto' may be used but it isn't strictly *necessary*. What *is*
    necessary, though, that is an 'if' (some conditional branch), and
    either 'goto' or recursive functions.

    Conditional branches, including 'if', '?:', etc., are not strictly
    necessary either.

    No? - Can you give an example of your statement?

    Look at example that I posted (apparently neither you nor Tim
    looked at my posts where I explained in detail how to translate
    goto program (with conditional jumps) into program that contains
    no goto and no conditional jumps).

    I'm not sure but may have just skimmed over your "C" example if it
    wasn't of interest to the point I tried to make (at that stage).

    Or try to figure out how to do this knowing that C has function
    pointers.

    I will retry to explain what I tried to say... - very simply put...

    There's "Recursive Functions" and the Turing Machines "equivalent".
    The "Recursive Functions" are the most powerful class of algorithms.
    Formal Recursive Functions are formally defined in terms of abstract,
    mathematically formulated properties; one of these [three properties]
    is the "Test Sets". (Here I can already stop.)

    But since we're not in a theoretical CS newsgroup I'd just wanted
    to see an example of some common, say, mathematical function and
    see it implemented without 'if' and 'goto' or recursion. - Take a
    simple one, say, fac(n) = n! , the factorial function. I know how
    I can implement that with 'if' and recursion, and I know how I can
    implement that with 'while' (or 'goto').

    If I re-inspect your example upthread - I hope it was the one you
    wanted to refer to - I see that you have removed the 'if' symbol
    but not the conditional, the test function; there's still the
    predicate (the "Test Set") present in form of 'int c2 = i < n',
    and it's there in the original code, in the goto transformed code,
    and in the function-pointer code. And you cannot get rid of that.

    Whether you have the test in an 'if', or in a ternary '?:', or
    use it through a bool-int coercion as integer index to an indexed
    function[-pointer] table; it's a conditional branch based on the
    ("Test Set") predicate i<n. You showed in your example how to get
    rid of the 'if' symbol, but you could - as expected - not get rid
    of the actual test that is the substance of a conditional branch.

    I think that is what is to be expected from the theory, and it is the
    essence of the point I tried to make.

    Janis

  • From antispam@antispam@fricas.org (Waldek Hebisch) to comp.lang.c on Sun Dec 22 19:44:49 2024
    From Newsgroup: comp.lang.c

    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:
    antispam@fricas.org (Waldek Hebisch) writes:

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 21.12.2024 02:28, Tim Rentsch wrote:

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 16.12.2024 00:53, BGB wrote:

    [...]

    Pretty much all higher level control flow can be expressed via goto.

    A 'goto' may be used but it isn't strictly *necessary*. What *is*
    necessary, though, that is an 'if' (some conditional branch), and
    either 'goto' or recursive functions.

    Conditional branches, including 'if', '?:', etc., are not strictly
    necessary either.

    No? - Can you give an example of your statement?

    Look at example that I posted (apparently neither you nor Tim
    looked at my posts [...]

    What makes you think I didn't?

    I made the same claim as you earlier and gave examples. You
    did not acknowledge my posts. Why? For me most natural
    explanation is that you did not read them.
    --
    Waldek Hebisch
  • From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.c on Sun Dec 22 21:45:14 2024
    From Newsgroup: comp.lang.c

    On 2024-12-21, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    On 21.12.2024 02:28, Tim Rentsch wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 16.12.2024 00:53, BGB wrote:

    [...]

    Pretty much all higher level control flow can be expressed via goto.

    A 'goto' may be used but it isn't strictly *necessary*. What *is*
    necessary, though, that is an 'if' (some conditional branch), and
    either 'goto' or recursive functions.

    Conditional branches, including 'if', '?:', etc., are not strictly
    necessary either.

    No? - Can you give an example of your statement?

    In a functional language, we can make a decision by, for instance,
    putting two lambdas into an array A, and then calling A[0] or A[1],
    where the index 0 or 1 comes from some Boolean result.
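
    A minimal sketch of the same idea in C, with function pointers standing
    in for the lambdas. The names on_false, on_true and choose are made up
    for illustration; !!cond only normalizes the test value to 0 or 1.

```c
/* Hypothetical illustration: function pointers stand in for the lambdas. */
static int on_false(int x) { return -x; }     /* called when the test is false */
static int on_true(int x)  { return x * 2; }  /* called when the test is true  */

/* Calls exactly one of the two functions; no if, ?:, switch or goto. */
int choose(int cond, int x)
{
    int (*const A[2])(int) = { on_false, on_true };
    return A[!!cond](x);   /* !!cond is 0 or 1 and serves as the array index */
}
```

    Only the selected function runs, so side effects in the other one do
    not happen.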

    The only reason we have a control construct like if(A, X, Y) where X
    is only evaluated if A is true, otherwise Y, is that X and Y
    have side effects.

    If X and Y don't have side effects, then if(A, X, Y) can be an ordinary
    function whose arguments are strictly evaluated.

    Moreover, if we give the functional language lazy evaluation semantics,
    then anyway we get the behavior that Y is not evaluated if A is true,
    and that lazy evaluation model can be used as the basis for sneaking
    effects into the functional language and controlling them.

    Anyway, Turing calculation by primitive recursion does not require
    conditional branching. Just perhaps an if function which returns
    either its second or third argument based on the truth value of
    its first argument.
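
    Such an "if function" might look like the following in C. This is only
    a sketch: the name if_fn is invented here, and the non-constant array
    initializer assumes C99 or later. Both value arguments are evaluated by
    the caller before the function is entered.

```c
/* Hypothetical strict "if" function: b and c have already been evaluated
   by the caller; the truth value of a only picks which one is returned. */
int if_fn(int a, int b, int c)
{
    const int pick[2] = { c, b };   /* index 0: false case, index 1: true case */
    return pick[!!a];
}
```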

    For instance, in certain C preprocessor tricks, conditional expansion
    is achieved by such macros.

    When we run the following through the GNU C preprocessor (e.g. by pasting
    into gcc -E -x c -p -):

    #define TRUE_SELECT_TRUE(X) X
    #define TRUE_SELECT_FALSE(X)

    #define FALSE_SELECT_TRUE(X)
    #define FALSE_SELECT_FALSE(X) X

    #define SELECT_TRUE(X) X
    #define SELECT_FALSE(X)

    #define PASTE(X, Y) X ## Y

    #define IF(A, B, C) PASTE(TRUE_SELECT_, A)(B) PASTE(FALSE_SELECT_, A)(C)

    #define FOO TRUE
    #define BAR FALSE

    IF(FOO, foo is true, foo is false)
    IF(BAR, bar is true, bar is false)

    We get these tokens:

    foo is true
    bar is false

    Yet, macro expansion has no conditionals. The preprocessing language has
    #if and #ifdef, but we didn't use those. Just expansion of computed names.

    This is an example of not strictly needing conditionals to achieve
    conditional evaluation or expansion: an IF(A, B, C) operator that
    yields B or C depending on the truth of A, and so forth.

    John McCarthy (Lisp inventor) wrote himself such an IF function
    in Fortran, in a program for calculating chess moves. It evaluated
    both the B and C expressions, and so it wasn't a proper imperative
    conditional, but it didn't matter.
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
  • From Michael S@already5chosen@yahoo.com to comp.lang.c on Mon Dec 23 00:20:48 2024
    From Newsgroup: comp.lang.c

    On Sun, 22 Dec 2024 20:41:44 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 22.12.2024 07:01, Waldek Hebisch wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    On 21.12.2024 02:28, Tim Rentsch wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 16.12.2024 00:53, BGB wrote:

    [...]

    Pretty much all higher level control flow can be expressed via
    goto.

    A 'goto' may be used but it isn't strictly *necessary*. What *is*
    necessary, though, that is an 'if' (some conditional branch), and
    either 'goto' or recursive functions.

    Conditional branches, including 'if', '?:', etc., are not strictly
    necessary either.

    No? - Can you give an example of your statement?

    Look at example that I posted (apparently neither you nor Tim
    looked at my posts where I explained in detail how to translate
    goto program (with conditional jumps) into program that contains
    no goto and no conditional jumps).

    I'm not sure but may have just skimmed over your "C" example if it
    wasn't of interest to the point I tried to make (at that stage).

    Or try to figure out how to do this knowing that C has function
    pointers.

    I will retry to explain what I tried to say... - very simply put...

    There's "Recursive Functions" and the Turing Machines "equivalent".
    The "Recursive Functions" are the most powerful class of algorithms.
    Formal Recursive Functions are formally defined in terms of abstract,
    mathematically formulated properties; one of these [three properties]
    is the "Test Sets". (Here I can already stop.)

    But since we're not in a theoretical CS newsgroup I'd just wanted
    to see an example of some common, say, mathematical function and
    see it implemented without 'if' and 'goto' or recursion. - Take a
    simple one, say, fac(n) = n! , the factorial function. I know how
    I can implement that with 'if' and recursion, and I know how I can
    implement that with 'while' (or 'goto').

    If I re-inspect your example upthread - I hope it was the one you
    wanted to refer to - I see that you have removed the 'if' symbol
    but not the conditional, the test function; there's still the
    predicate (the "Test Set") present in form of 'int c2 = i < n',
    and it's there in the original code, in the goto transformed code,
    and in the function-pointer code. And you cannot get rid of that.

    Whether you have the test in an 'if', or in a ternary '?:', or
    use it through a bool-int coercion as integer index to an indexed
    function[-pointer] table; it's a conditional branch based on the
    ("Test Set") predicate i<n. You showed in your example how to get
    rid of the 'if' symbol, but you could - as expected - not get rid
    of the actual test that is the substance of a conditional branch.

    I think that is what is to be expected from the theory, and it is the
    essence of the point I tried to make.

    Janis



    You make no sense. I am starting to suspect that the reason for it
    is ignorance rather than mere stubbornness.

    https://godbolt.org/z/EKo5rrYce
    Show me the conditional branch in the right pane.

  • From bart@bc@freeuk.com to comp.lang.c on Sun Dec 22 23:22:36 2024
    From Newsgroup: comp.lang.c

    On 22/12/2024 21:45, Kaz Kylheku wrote:
    On 2024-12-21, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    On 21.12.2024 02:28, Tim Rentsch wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 16.12.2024 00:53, BGB wrote:

    [...]

    Pretty much all higher level control flow can be expressed via goto.

    A 'goto' may be used but it isn't strictly *necessary*. What *is*
    necessary, though, that is an 'if' (some conditional branch), and
    either 'goto' or recursive functions.

    Conditional branches, including 'if', '?:', etc., are not strictly
    necessary either.

    No? - Can you give an example of your statement?

    In a functional language, we can make a decision by, for instance,
    putting two lambdas into an array A, and then calling A[0] or A[1],
    where the index 0 or 1 comes from some Boolean result.

    The only reason we have a control construct like if(A, X, Y) where X
    is only evaluated if A is true, otherwise Y, is that X and Y
    have side effects.

    If X and Y don't have side effects, then if(A, X, Y) can be an ordinary
    function whose arguments are strictly evaluated.

    Moreover, if we give the functional language lazy evaluation semantics,
    then anyway we get the behavior that Y is not evaluated if A is true,
    and that lazy evaluation model can be used as the basis for sneaking
    effects into the functional language and controlling them.

    Anyway, Turing calculation by primitive recursion does not require
    conditional branching. Just perhaps an if function which returns
    either its second or third argument based on the truth value of
    its first argument.

    For instance, in certain C preprocessor tricks, conditional expansion
    is achieved by such macros.

    When we run the following through the GNU C preprocessor (e.g. by pasting
    into gcc -E -x c -p -):

    #define TRUE_SELECT_TRUE(X) X
    #define TRUE_SELECT_FALSE(X)

    #define FALSE_SELECT_TRUE(X)
    #define FALSE_SELECT_FALSE(X) X

    #define SELECT_TRUE(X) X
    #define SELECT_FALSE(X)

    #define PASTE(X, Y) X ## Y

    #define IF(A, B, C) PASTE(TRUE_SELECT_, A)(B) PASTE(FALSE_SELECT_, A)(C)

    #define FOO TRUE
    #define BAR FALSE

    IF(FOO, foo is true, foo is false)
    IF(BAR, bar is true, bar is false)

    We get these tokens:

    foo is true
    bar is false


    So, how long did it take to debug? (I've no idea how it works. If I
    change all TRUE/FALSE to BART/LISA respectively, it still gives the same
    output. I'm not sure how germane such an example is.)


    Yet, macro expansion has no conditionals. The preprocessing language has
    #if and #ifdef, but we didn't use those. Just expansion of computed names.

    This is an example of not strictly needing conditionals to achieve
    conditional evaluation or expansion: an IF(A, B, C) operator that
    yields B or C depending on the truth of A, and so forth.

    John McCarthy (Lisp inventor) wrote himself such an IF function
    in Fortran, in a program for calculating chess moves. It evaluated
    both the B and C expressions, and so it wasn't a proper imperative
    conditional, but it didn't matter.

    You mean like this one:

    int IF(int c, int a, int b) {
        return a*!!c + b*!c;
    }

    (I think most languages can manage that. I guess there was a reason
    McCarthy needed it rather than use Fortran's existing IF statement,
    other than, being the Lisp guy, that was how his mind worked.)

  • From antispam@antispam@fricas.org (Waldek Hebisch) to comp.lang.c on Sun Dec 22 23:29:50 2024
    From Newsgroup: comp.lang.c

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    On 22.12.2024 07:01, Waldek Hebisch wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    On 21.12.2024 02:28, Tim Rentsch wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 16.12.2024 00:53, BGB wrote:

    [...]

    Pretty much all higher level control flow can be expressed via goto.

    A 'goto' may be used but it isn't strictly *necessary*. What *is*
    necessary, though, that is an 'if' (some conditional branch), and
    either 'goto' or recursive functions.

    Conditional branches, including 'if', '?:', etc., are not strictly
    necessary either.

    No? - Can you give an example of your statement?

    Look at example that I posted (apparently neither you nor Tim
    looked at my posts where I explained in detail how to translate
    goto program (with conditional jumps) into program that contains
    no goto and no conditional jumps).

    I'm not sure but may have just skimmed over your "C" example if it
    wasn't of interest to the point I tried to make (at that stage).

    Or try to figure out how to do this knowing that C has function
    pointers.

    I will retry to explain what I tried to say... - very simply put...

    There's "Recursive Functions" and the Turing Machines "equivalent".
    The "Recursive Functions" are the most powerful class of algorithms.
    Formal Recursive Functions are formally defined in terms of abstract,
    mathematically formulated properties; one of these [three properties]
    is the "Test Sets". (Here I can already stop.)

    The classic definition uses some number of base functions, some
    number of base conditions, conditional definitions and the
    "minimum operator". Given a (possibly partially defined) function f
    and an argument l, the "minimum operator" computes the smallest n such
    that f(k, l) is defined for k=0,1,...,n and f(n, l) = 0, and is
    undefined otherwise. Some texts require the minimum to be effective,
    that is, f should be total and for each l there should be n >= 0 such
    that f(n, l) = 0. Clearly the "minimum operator" is equivalent to a
    'while' loop. IIRC, if instead of the "minimum operator" you only
    have recursion, then the resulting class of functions is strictly
    smaller. So, assuming that I remember correctly, in the framework of
    recursive functions the claim that conditionals and recursion give
    Turing completeness is false; one needs some "programming" constructs.
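
    To make the correspondence with a 'while' loop concrete, here is a rough
    sketch of the minimum operator in C. The names mu and below_square are
    invented for this illustration, and unsigned integers stand in for the
    natural numbers.

```c
/* Unbounded search: returns the least n with f(n, l) == 0.
   Like the minimum operator, it does not terminate if no such n exists. */
unsigned mu(unsigned (*f)(unsigned, unsigned), unsigned l)
{
    unsigned n = 0;
    while (f(n, l) != 0)
        n++;
    return n;
}

/* Hypothetical example function: nonzero while n*n is still below l. */
unsigned below_square(unsigned n, unsigned l)
{
    return n * n < l;
}
```

    With these definitions, mu(below_square, 10) yields 4, the smallest n
    with n*n >= 10.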

    Anyway, using recursion you clearly need some way to stop it. If you
    restrict yourself to eagerly evaluated total integer valued functions
    only, then clearly there is no way to stop recursion. But if
    you have different system like lambda calculus or C, then there
    are ways to stop recursion that are quite different than 'if'
    or the ternary operator.

    But since we're not in a theoretical CS newsgroup I'd just wanted
    to see an example of some common, say, mathematical function and
    see it implemented without 'if' and 'goto' or recursion.

    To be clear: I need recursion in general. I do not need 'if'
    to stop recursion.
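
    As a sketch of what this can look like for the factorial, assuming a
    two-entry function pointer table is acceptable (the names fac_stop,
    fac_step and next are made up here). Whether the comparison n > 0 counts
    as a conditional is exactly what is disputed in this thread; the sketch
    only shows that no branching statement or operator is needed.

```c
/* Hypothetical sketch: factorial with recursion but without if, ?:,
   loops or goto. Overflows for large n, like any fixed-width factorial. */
unsigned long fac(unsigned long n);

static unsigned long fac_stop(unsigned long n) { (void)n; return 1UL; }
static unsigned long fac_step(unsigned long n) { return n * fac(n - 1); }

unsigned long fac(unsigned long n)
{
    /* n > 0 is an ordinary value (0 or 1) used as an array index. */
    static unsigned long (*const next[2])(unsigned long) = { fac_stop, fac_step };
    return next[n > 0](n);
}
```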

    - Take a
    simple one, say, fac(n) = n! , the factorial function. I know how
    I can implement that with 'if' and recursion, and I know how I can
    implement that with 'while' (or 'goto').

    If I re-inspect your example upthread - I hope it was the one you
    wanted to refer to - I see that you have removed the 'if' symbol
    but not the conditional, the test function; there's still the
    predicate (the "Test Set") present in form of 'int c2 = i < n',

    You failed to see that this is an ordinary total function: it
    evaluates both arguments and produces a value. If I take the
    following C function:

    int lt(int a, int b) {
        return (a < b);
    }

    and compile it using 'gcc -O -S' I get:

    lt:
    .LFB0:
            .cfi_startproc
            cmpl    %esi, %edi
            setl    %al
            movzbl  %al, %eax
            ret

    As you can see the only control transfer there is 'ret' at the
    end of the function. 'if' and C ternary operators are quite
    different, you can not implement them as ordinary functions
    (some special cases can be optimized to jumpless code, but not
    in general).

    and it's there in the original code, in the goto transformed code,
    and in the function-pointer code. And you cannot get rid of that.

    I can, but for something like factorial the code would be quite
    ugly. One can implement a reasonable Turing machine emulator
    using just integer and function pointer arrays, array accesses
    and assignments, direct and indirect function calls.

    By reasonable I mean that as long as the Turing machine stays
    in the part of the tape modeled as a C array, the emulator and the
    Turing machine would move in step. A stop in the Turing machine would
    exit the emulator. Only when the Turing machine exceeds the memory of
    the C program would the C program exhibit undefined behaviour. If you
    also allow yourself C arithmetic operators (crucially '/' and '%'),
    then you can stop execution.

    If you assume a C implementation with infinite memory such that
    'malloc' never fails, then instead of an array you can use a
    doubly linked list which gets extended when the Turing machine
    tries to get outside the allocated space.

    IIUC such infinite C implementation would exhibit undefined
    behaviour as C standard requires finite bound on integers and
    injective cast from pointers to some integer type.

    Whether you have the test in an 'if', or in a ternary '?:', or
    use it through a bool-int coercion as integer index to an indexed
    function[-pointer] table; it's a conditional branch based on the
    ("Test Set") predicate i<n. You showed in your example how to get
    rid of the 'if' symbol, but you could - as expected - not get rid
    of the actual test that is the substance of a conditional branch.

    Wrong, one can use properties of C23 division (actually,
    what is needed is division and remainder by a fixed positive
    number, say 3).

    I think that is what is to be expected from the theory, and it is the
    essence of the point I tried to make.

    One point that I wanted to make is that programming languages are
    different than the theory of integer functions, in particular
    programming constructs may be surprisingly powerful. For example, there
    was a theorem about some special concurrency problem saying that
    the desired mutual exclusion can not be done by binary semaphores.
    David Parnas showed that it in fact can be solved using arrays
    of binary semaphores. The theorem had an unstated assumption that
    only scalar semaphore variables are in use. Of course, once you
    eliminate all useful constructs from a language, then one can not do
    anything in such a language (as a joke David Parnas defined such a
    language).

    Second point was that function calls in tail position are quite similar
    to goto, and in case of indirect calls they can do job of 'if' or 'switch'.
    So if you consider elimination of 'if' (or 'goto') as a cheat, the
    cheat is in using function calls, and not in predicates.
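
    A small sketch of that second point, under the assumption that the
    loop "while (i < n) { sum += i; i++; }" is the program to translate.
    All names (loop_state, loop_test, loop_body, loop_exit, sum_below) are
    invented for the illustration. Note that C does not guarantee tail-call
    elimination, so a large n may exhaust the stack unless the compiler
    happens to optimize the calls into jumps.

```c
/* Loop variables live in a struct so every "basic block" can reach them. */
struct loop_state { int i; int n; long sum; };
typedef long (*block_fn)(struct loop_state *);

static long loop_body(struct loop_state *s);
static long loop_exit(struct loop_state *s);

/* Plays the role of "if (i < n) goto body; else goto exit;". */
static long loop_test(struct loop_state *s)
{
    static const block_fn next[2] = { loop_exit, loop_body };
    return next[s->i < s->n](s);       /* indirect call in tail position */
}

static long loop_body(struct loop_state *s)
{
    s->sum += s->i;
    s->i += 1;
    return loop_test(s);               /* plays the role of "goto test" */
}

static long loop_exit(struct loop_state *s)
{
    return s->sum;
}

/* Sum of 0, 1, ..., n-1 without if, ?:, loops or goto. */
long sum_below(int n)
{
    struct loop_state s = { 0, n, 0 };
    return loop_test(&s);
}
```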
    --
    Waldek Hebisch
  • From Kaz Kylheku@643-408-1753@kylheku.com to comp.lang.c on Sun Dec 22 23:47:25 2024
    From Newsgroup: comp.lang.c

    On 2024-12-22, bart <bc@freeuk.com> wrote:
    On 22/12/2024 21:45, Kaz Kylheku wrote:
    On 2024-12-21, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    On 21.12.2024 02:28, Tim Rentsch wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 16.12.2024 00:53, BGB wrote:

    [...]

    Pretty much all higher level control flow can be expressed via goto.

    A 'goto' may be used but it isn't strictly *necessary*. What *is*
    necessary, though, that is an 'if' (some conditional branch), and
    either 'goto' or recursive functions.

    Conditional branches, including 'if', '?:', etc., are not strictly
    necessary either.

    No? - Can you give an example of your statement?

    In a functional language, we can make a decision by, for instance,
    putting two lambdas into an array A, and then calling A[0] or A[1],
    where the index 0 or 1 comes from some Boolean result.

    The only reason we have a control construct like if(A, X, Y) where X
    is only evaluated if A is true, otherwise Y, is that X and Y
    have side effects.

    If X and Y don't have side effects, then if(A, X, Y) can be an ordinary
    function whose arguments are strictly evaluated.

    Moreover, if we give the functional language lazy evaluation semantics,
    then anyway we get the behavior that Y is not evaluated if A is true,
    and that lazy evaluation model can be used as the basis for sneaking
    effects into the functional language and controlling them.

    Anyway, Turing calculation by primitive recursion does not require
    conditional branching. Just perhaps an if function which returns
    either its second or third argument based on the truth value of
    its first argument.

    For instance, in certain C preprocessor tricks, conditional expansion
    is achieved by such macros.

    When we run the following through the GNU C preprocessor (e.g. by pasting
    into gcc -E -x c -p -):

    #define TRUE_SELECT_TRUE(X) X
    #define TRUE_SELECT_FALSE(X)

    #define FALSE_SELECT_TRUE(X)
    #define FALSE_SELECT_FALSE(X) X

    #define SELECT_TRUE(X) X
    #define SELECT_FALSE(X)

    #define PASTE(X, Y) X ## Y

    #define IF(A, B, C) PASTE(TRUE_SELECT_, A)(B) PASTE(FALSE_SELECT_, A)(C)

    #define FOO TRUE
    #define BAR FALSE

    IF(FOO, foo is true, foo is false)
    IF(BAR, bar is true, bar is false)

    We get these tokens:

    foo is true
    bar is false


    So, how long did it take to debug? (I've no idea how it works. If I

    I typed it out right in the middle of my article and piped it out to
    gcc, iterating a few times. I made a few silly mistakes in IF, mostly
    due to referencing the wrong A, B, C.

    Also, the SELECT_TRUE and SELECT_FALSE macros are dead code; not used.

    change all TRUE/FALSE to BART/LISA respectively, it still gives the same
    output. I'm not sure how germane such an example is.)

    If you rename consistently, it will work. But it's not hygienic in that
    since the solution relies on calculated identifiers, you have to change
    TRUE_SELECT_TRUE to TRUE_SELECT_BART.

    How it works is very simple in that PASTE(TRUE_SELECT_, A) calculates
    TRUE_SELECT_TRUE or TRUE_SELECT_FALSE depending on whether A contains
    TRUE or FALSE. Then the argument list (B) is combined with this
    calculated name, resulting in a macro call to TRUE_SELECT_TRUE(B) or
    TRUE_SELECT_FALSE(B) with the value of B as an argument.
    If the former is used, then it expands to B; if the latter, then to
    nothing.

    One of the two PASTE calls in the expansion of IF() produces tokens, and
    the other nothing. The two results are catenated together into one token
    sequence, so we get the result of whichever one is nonempty.
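
    The same macros can also be used inside ordinary C expressions. The
    following is only a sketch; the functions pick_foo and pick_bar are
    invented for the illustration, the unused SELECT_TRUE and SELECT_FALSE
    macros are left out, and it assumes no header defines TRUE or FALSE as
    macros. The comments trace the expansion steps described above.

```c
#define TRUE_SELECT_TRUE(X) X
#define TRUE_SELECT_FALSE(X)
#define FALSE_SELECT_TRUE(X)
#define FALSE_SELECT_FALSE(X) X

#define PASTE(X, Y) X ## Y
#define IF(A, B, C) PASTE(TRUE_SELECT_, A)(B) PASTE(FALSE_SELECT_, A)(C)

#define FOO TRUE
#define BAR FALSE

/* IF(FOO, 10, 20) -> TRUE_SELECT_TRUE(10) FALSE_SELECT_TRUE(20) -> 10 */
int pick_foo(void) { return IF(FOO, 10, 20); }

/* IF(BAR, 10, 20) -> TRUE_SELECT_FALSE(10) FALSE_SELECT_FALSE(20) -> 20 */
int pick_bar(void) { return IF(BAR, 10, 20); }
```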
    --
    TXR Programming Language: http://nongnu.org/txr
    Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
    Mastodon: @Kazinator@mstdn.ca
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Sun Dec 22 17:22:01 2024
    From Newsgroup: comp.lang.c

    Kaz Kylheku <643-408-1753@kylheku.com> writes:

    On 2024-12-21, Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 21.12.2024 02:28, Tim Rentsch wrote:

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 16.12.2024 00:53, BGB wrote:

    [...]

    Pretty much all higher level control flow can be expressed via
    goto.

    A 'goto' may be used but it isn't strictly *necessary*. What *is*
    necessary, though, that is an 'if' (some conditional branch), and
    either 'goto' or recursive functions.

    Conditional branches, including 'if', '?:', etc., are not strictly
    necessary either.

    No? - Can you give an example of your statement?

    In a functional language, we can make a decision by, for instance,
    putting two lambdas into an array A, and then calling A[0] or A[1],
    where the index 0 or 1 comes from some Boolean result.

    The only reason we have a control construct like if(A, X, Y) where X
    is only evaluated if A is true, otherwise Y, is that X and Y
    have side effects.

    If X and Y don't have side effects, then if(A, X, Y) can be an ordinary
    function whose arguments are strictly evaluated.

    Moreover, if we give the functional language lazy evaluation
    semantics, then anyway we get the behavior that Y is not evaluated
    if A is true, and that lazy evaluation model can be used as the
    basis for sneaking effects into the functional language and
    controlling them.

    Anyway, Turing calculation by primitive recursion does not require
    conditional branching. Just perhaps an if function which returns
    either its second or third argument based on the truth value of its
    first argument.

    For instance, in certain C preprocessor tricks, conditional
    expansion is achieved by such macros.

    When we run the following through the GNU C preprocessor (e.g. by
    pasting into gcc -E -x c -p -):

    #define TRUE_SELECT_TRUE(X) X
    #define TRUE_SELECT_FALSE(X)

    #define FALSE_SELECT_TRUE(X)
    #define FALSE_SELECT_FALSE(X) X

    #define SELECT_TRUE(X) X
    #define SELECT_FALSE(X)

    #define PASTE(X, Y) X ## Y

    #define IF(A, B, C) PASTE(TRUE_SELECT_, A)(B) PASTE(FALSE_SELECT_, A)(C)

    #define FOO TRUE
    #define BAR FALSE

    IF(FOO, foo is true, foo is false)
    IF(BAR, bar is true, bar is false)

    We get these tokens:

    foo is true
    bar is false

    Yet, macro expansion has no conditionals. The preprocessing
    language has #if and #ifdef, but we didn't use those. Just
    expansion of computed names.

    Nice example.
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Sun Dec 22 17:39:52 2024
    From Newsgroup: comp.lang.c

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 22.12.2024 02:04, Michael S wrote:

    [...]

    Part of the answer is in your previous response.
    You wrote: "many _good_ professors I met in my life typically were
    keen to explain their theses, statements, or knowledge (instead of
    dragging that out of him)". You essentially admitted that not all good
    professors behave like that.

    Oh, what I meant to express was different; that good professors
    *would* explain it (only bad ones wouldn't).

    (At least that was my experience; and not only covering the CS
    domain, BTW.)

    [ "schools of teaching" stuff snipped ]

    You make an impression of one that received basics of CS. Probably, 40
    or so years ago, but still you have to know basic facts. Unlike me, for
    example.
    So, Tim expects that you will be able to utilize his hints.

    The point [repeatedly] stated (also by others here) was that
    he more often than not just provides no information but simple
    arbitrary statements of opinion.

    The comments I made here, in two responses to postings of yours,
    were not statements of opinion but statements of fact. They are
    no more statements of opinion than a statement about whether the
    Riemann Hypothesis is true is a statement of opinion. Someone
    might wonder whether an assertion "The Riemann Hypothesis is
    true" is true or false, but it is still a matter of fact, not a
    matter of opinion.
  • From Lawrence D'Oliveiro@ldo@nz.invalid to comp.lang.c on Mon Dec 23 02:08:46 2024
    From Newsgroup: comp.lang.c

    On Wed, 18 Dec 2024 23:46:21 -0600, BGB wrote:

    ... (what debug mechanisms I have, effectively lack any symbols
    for things inside "ld-linux.so"'s domain).

    nm -D /lib/ld-linux.so.2
  • From antispam@antispam@fricas.org (Waldek Hebisch) to comp.lang.c on Mon Dec 23 02:41:10 2024
    From Newsgroup: comp.lang.c

    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    The comments I made here, in two responses to postings of yours,
    were not statements of opinion but statements of fact.

    They are opinions _about facts_, or if you prefer, opinion
    about truth value of some statements.

    They are
    no more statements of opinion than a statement about whether the
    Riemann Hypothesis is true is a statement of opinion. Someone
    might wonder whether an assertion "The Riemann Hypothesis is
    true" is true or false, but it is still a matter of fact, not a
    matter of opinion.

    It is reasonable to assume that you do not know if Riemann Hypothesis
    is true or false. So if you say "Riemann Hypothesis is true",
    this is just your opinion. I am not a native English speaker
    but I believed that "statements of opinion" means just that:
    person does not know the truth, but makes a statement.
    --
    Waldek Hebisch
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From David Brown@david.brown@hesbynett.no to comp.lang.c on Mon Dec 23 08:43:07 2024
    From Newsgroup: comp.lang.c

    On 23/12/2024 03:41, Waldek Hebisch wrote:
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    The comments I made here, in two responses to postings of yours,
    were not statements of opinion but statements of fact.

    They are opinions _about facts_, or if you prefer, opinion
    about truth value of some statements.

    You can program in C without the "normal" conditional statements or
    expressions. You can make an array of two (or more) function pointers
    and select between them using your controlling expression, and that
    should be sufficient for conditionals. (There may be other methods too.)
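
    A minimal sketch of the function-pointer selection described above,
    assuming C99. The names negate, keep and abs_no_if are made up for
    illustration and do not come from any post in this thread:

```c
/* Two hypothetical "arms", standing in for the branches of an if/else. */
static int keep(int x)   { return x; }
static int negate(int x) { return -x; }   /* overflows for INT_MIN, as abs() does */

/* Emulates: if (x < 0) return negate(x); else return keep(x);
   The relational operator yields exactly 0 or 1, which indexes the table. */
static int abs_no_if(int x)
{
    int (*const arm[2])(int) = { keep, negate };
    return arm[x < 0](x);
}
```

    With this, abs_no_if(-5) and abs_no_if(5) both return 5. The only
    conditional ingredient left is the relational operator, whose result
    is used as an ordinary integer index.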

    So as far as I can see, Tim gave statements of fact, not opinion.

    You can say that Tim's posts were patronising, arrogant, and irritating.
    /That/ would be an opinion - a /justified/ opinion because it is
    backed up by the evidence of these posts and corroborating evidence
    from previous posts and discussions from Tim. But without some kind of
    precise definition of the terms involved and a robust and repeatable
    method of classification, it could not be called "fact".

    You could say that Tim's posts were intended to be annoying, or you
    could say that he has refused to give an answer to how C can be used
    without the "normal" conditionals because he realises he was wrong in
    his posts and won't admit it. That would be /unjustified/ opinion - or
    "speculation" - because we have no way of knowing his motives or
    anything more than what he wrote in his posts.


    You could, quite fairly, characterise Tim's posts as unjustified
    statements of fact - because he has stated his claim as fact, but has
    given no justification or reasoning, and it is not something that is
    obvious or well-known to people.


    They are
    no more statements of opinion than a statement about whether the
    Riemann Hypothesis is true is a statement of opinion. Someone
    might wonder whether an assertion "The Riemann Hypothesis is
    true" is true or false, but it is still a matter of fact, not a
    matter of opinion.

    It is reasonable to assume that you do not know if Riemann Hypothesis
    is true or false.

    I think if anyone knew the truth or falsity of the Riemann Hypothesis -
    i.e., they had a proof one way or the other - we'd have heard about it!

    So if you say "Riemann Hypothesis is true",
    this is just your opinion.

    No, that would not be an opinion. It would be an unjustified claim.
    "I /believe/ the Riemann Hypothesis is true" is an opinion.

    I am not a native English speaker
    but I believed that "statements of opinion" means just that:
    person does not know the truth, but makes a statement.


    No, an opinion is a personal preference or judgement. That's very
    different from not knowing about something factual. If I say "the
    number 17 will turn up in next week's lottery numbers", that's not an
    opinion, it's a claim about facts. It's an unjustified claim, since I
    don't know if it is true or not, but it's not an opinion.

    It is not always clear when something is a fact or not, and whether a
    statement is a justified statement of fact, an unjustified statement of
    fact (i.e., it might happen to be true, but you have not presented
    evidence of it), a justified opinion, or an unjustified opinion. I'm
    sure there's a philosophy group on Usenet somewhere, but I doubt if
    cross-posting there would lead to any clarification!


    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From David Brown@david.brown@hesbynett.no to comp.lang.c on Mon Dec 23 09:46:46 2024
    From Newsgroup: comp.lang.c

    On 22/12/2024 20:41, Janis Papanagnou wrote:
    On 22.12.2024 07:01, Waldek Hebisch wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    On 21.12.2024 02:28, Tim Rentsch wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 16.12.2024 00:53, BGB wrote:

    [...]

    Pretty much all higher level control flow can be expressed via goto.
    A 'goto' may be used but it isn't strictly *necessary*. What *is*
    necessary, though, that is an 'if' (some conditional branch), and
    either 'goto' or recursive functions.

    Conditional branches, including 'if', '?:', etc., are not strictly
    necessary either.

    No? - Can you give an example of your statement?

    Look at example that I posted (apparently neither you nor Tim
    looked at my posts where I explained in detail how to translate
    goto program (with conditional jumps) into program that contains
    no goto and no conditional jumps).

    I'm not sure but may have just skimmed over your "C" example if it
    wasn't of interest to the point I tried to make (at that stage).

    Or try to figure out how to do this knowing that C has function
    pointers.

    I will retry to explain what I tried to say... - very simply put...

    There's "Recursive Functions" and the Turing Machines "equivalent".
    The "Recursive Functions" is the most powerful class of algorithms.
    Formal Recursive Functions are formally defined in terms of abstract
    mathematical formulated properties; one of these [three properties]
    are the "Test Sets". (Here I can already stop.)

    But since we're not in a theoretical CS newsgroup I'd just wanted
    to see an example of some common, say, mathematical function and
    see it implemented without 'if' and 'goto' or recursion. - Take a
    simple one, say, fac(n) = n! , the factorial function. I know how
    I can implement that with 'if' and recursion, and I know how I can
    implement that with 'while' (or 'goto').

    If I re-inspect your example upthread - I hope it was the one you
    wanted to refer to - I see that you have removed the 'if' symbol
    but not the conditional, the test function; there's still the
    predicate (the "Test Set") present in form of 'int c2 = i < n',
    and it's there in the original code, in the goto transformed code,
    and in the function-pointer code. And you cannot get rid of that.

    Whether you have the test in an 'if', or in a ternary '?:', or
    use it through a bool-int coercion as integer index to an indexed
    function[-pointer] table; it's a conditional branch based on the
    ("Test Set") predicate i<n. You showed in your example how to get
    rid of the 'if' symbol, but you could - as expected - not get rid
    of the actual test that is the substance of a conditional branch.

    I think that is what is to expect by the theory and the essence of
    the point I tried to make.


    You are adding more restrictions than Tim had given.

    We all know that for most non-trivial algorithms you need some kind of
    repetition (loops, recursion, etc.) and some way to end that repetition.
    No one is claiming otherwise.

    Tim ruled out &&, ||, ?:, goto, break, continue, if, for, while, switch,
    do, labels, setjmp and longjmp.

    He didn't rule out recursion, or the relational operators, or any other
    part of C.


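    #include <stdbool.h>   /* bool, for compilers before C23 */

    int fact(int n);

    /* Base case: 0! == 1 */
    int fact_zero(int n) {
        return 1;
    }

    /* Recursive case: n! == n * (n-1)! */
    int n_fact_n1(int n) {
        return n * fact(n - 1);
    }

    /* (bool) n is 0 or 1, so it picks the base or the recursive case */
    int fact(int n) {
        return (int (*[])(int)){ fact_zero, n_fact_n1 }[(bool) n](n);
    }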


    There are additional fun things that can be done using different
    operators. For an unsigned integer "n" that is not big enough to wrap,
    "(n + 2) / (n + 1) - 1" evaluates "(n == 0)".

    And Tim did not rule out using the standard library, which would surely
    open up new possibilities.
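
    A small sketch of the arithmetic test "(n + 2) / (n + 1) - 1" quoted
    above, written as a C function. The names is_zero_arith and
    IS_ZERO_ARITH_MAX are invented for this illustration; the upper bound
    spells out the "not big enough to wrap" condition:

```c
#include <limits.h>

/* Largest argument for which n + 2 does not wrap around. */
#define IS_ZERO_ARITH_MAX (UINT_MAX - 2u)

/* Returns 1 when n == 0 and 0 otherwise, using only +, / and -.
   Only meaningful for n <= IS_ZERO_ARITH_MAX; for n == UINT_MAX the
   divisor n + 1 would wrap to zero. */
static unsigned is_zero_arith(unsigned n)
{
    return (n + 2u) / (n + 1u) - 1u;
}
```

    For n == 0 this computes 2 / 1 - 1 == 1; for any larger n in range the
    quotient is 1, so the result is 0.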

    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From BGB@cr88192@gmail.com to comp.lang.c on Mon Dec 23 05:15:29 2024
    From Newsgroup: comp.lang.c

    On 12/22/2024 8:08 PM, Lawrence D'Oliveiro wrote:
    On Wed, 18 Dec 2024 23:46:21 -0600, BGB wrote:

    ... (what debug mechanisms I have, effectively lack any symbols
    for things inside "ld-linux.so"'s domain).

    nm -D /lib/ld-linux.so.2

    It is not actually on Linux, but rather trying to make my kernel mimic Linux...

    The issue isn't getting the symbol map, but rather that in this case,
    there are multiple levels of abstraction and so, at the level of the CPU
    emulator (where I can get instruction traces when something crashes), it
    can no longer figure out what addresses map to where.

    With the normal PE loader, it can send messages to the virtual debug
    UART which signal where it has loaded things in memory (for every EXE
    and DLL). But, things partly break down for ELF PIE binaries with glibc
    or musl.
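
    As a rough sketch of the bookkeeping involved: if a loader reports
    each image's load address, an emulator can keep a table and look up
    which image a crashing address belongs to. Everything here (struct
    image_record, find_image, the field names) is hypothetical and is not
    taken from the poster's actual code:

```c
#include <stddef.h>
#include <stdint.h>

/* One record per loaded image, as a loader might report it. */
struct image_record {
    const char *name;   /* file name of the EXE, DLL or shared object */
    uint64_t    base;   /* address the image was loaded at */
    uint64_t    size;   /* size of the mapped image in bytes */
};

/* Returns the record whose range contains addr, or NULL if none does. */
static const struct image_record *
find_image(const struct image_record *map, size_t count, uint64_t addr)
{
    for (size_t i = 0; i < count; i++) {
        /* Comparing addr - base against size avoids overflow in base + size. */
        if (addr >= map[i].base && addr - map[i].base < map[i].size)
            return &map[i];
    }
    return NULL;
}
```

    The difficulty described above is that, for images loaded by the ELF
    interpreter itself, nothing fills in such a table on the emulator side.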

    Granted, the ELF loader does at least know in theory where the main
    binary and interpreter were loaded.



    But, seemingly, process is sort of like:
    Read in main ELF binary;
    Read in interpreter;
    Set up argument list, environment, and other stuff (*1), on the stack;
    Branch to entry point on interpreter;
    Magic happens.
    (Currently, it just crashes).

    *1:
    (SP+ 0): argc
    (SP+ 8): argv[0]
    (SP+16): argv[1]
    ...
    (SP+(argc+1)*8): NULL
    (SP+xx): Env var pointers...
    (SP+xx): NULL
    (SP+xx): Auxiliary Vectors
             Key/value pairs
             Terminated by Key==0
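
    The layout above can be walked mechanically. The following is only a
    sketch over a simulated stack of 64-bit slots, matching the 8-byte
    offsets listed; walk_initial_stack and struct stack_info are names
    made up here, and the key value 6 for the page size is taken from the
    Linux AT_PAGESZ constant:

```c
#include <stddef.h>
#include <stdint.h>

#define AUX_NULL   0u   /* terminator key, as in the layout above */
#define AUX_PAGESZ 6u   /* AT_PAGESZ in the Linux ABI headers */

/* Summary of what was found on the (simulated) initial stack. */
struct stack_info {
    uint64_t argc;
    size_t   envc;       /* number of environment pointers */
    size_t   auxc;       /* auxiliary entries, not counting the terminator */
    uint64_t page_size;  /* value of the AUX_PAGESZ entry, 0 if absent */
};

static struct stack_info walk_initial_stack(const uint64_t *sp)
{
    struct stack_info info = { 0, 0, 0, 0 };
    const uint64_t *p = sp;

    info.argc = *p++;                 /* (SP+0): argc */
    p += info.argc;                   /* skip argv[0] .. argv[argc-1] */
    p++;                              /* skip the NULL after argv */

    while (*p != 0) {                 /* environment pointers */
        info.envc++;
        p++;
    }
    p++;                              /* skip the NULL after the environment */

    for (; p[0] != AUX_NULL; p += 2) {   /* key/value pairs */
        info.auxc++;
        if (p[0] == AUX_PAGESZ)
            info.page_size = p[1];
    }
    return info;
}
```

    Which auxiliary keys a given libc actually requires is exactly the
    open question here; the sketch only shows the traversal order.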

    Information on how exactly to set up the auxiliary vectors in a way that
    glibc and musl are happy, is harder to figure out. At this stage, things
    become rather poorly documented.

    Theoretically the interpreter program is responsible for loading the
    other SO's; or if the main ELF loader is supposed to do it, it is not
    obvious how it is supposed to tell ld-so where it had loaded them.

    ...

    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From bart@bc@freeuk.com to comp.lang.c on Mon Dec 23 11:35:56 2024
    From Newsgroup: comp.lang.c

    On 23/12/2024 08:46, David Brown wrote:

    Tim ruled out &&, ||, ?:, goto, break, continue, if, for, while, switch,
    do, labels, setjmp and longjmp.

    He didn't rule out recursion, or the relational operators, or any other
    part of C.


    int fact(int n);

    int fact_zero(int n) {
            return 1;
    }

    int n_fact_n1(int n) {
            return n * fact(n - 1);
    }

    int fact(int n) {
            return (int (*[])(int)){ fact_zero, n_fact_n1 }[(bool) n](n);
    }


    There are additional fun things that can be done using different
    operators.  For an unsigned integer "n" that is not big enough to wrap,
    "(n + 2) / (n + 1) - 1"  evaluates "(n == 0)".

    Isn't this just !n ? I don't think "!" was ruled out. This would also
    work for negative n.

    And Tim did not rule out using the standard library, which would surely
    open up new possibilities.

    printf (not sprintf) would be reasonable here to show results. Anything
    else could be considered cheating.

    The original context was a small subset of C that can be used to
    represent a larger subset.
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Michael S@already5chosen@yahoo.com to comp.lang.c on Mon Dec 23 13:40:08 2024
    From Newsgroup: comp.lang.c

    On Mon, 23 Dec 2024 09:46:46 +0100
    David Brown <david.brown@hesbynett.no> wrote:


    And Tim did not rule out using the standard library,


    Are you sure?

    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From David Brown@david.brown@hesbynett.no to comp.lang.c on Mon Dec 23 13:18:46 2024
    From Newsgroup: comp.lang.c

    On 23/12/2024 12:35, bart wrote:
    On 23/12/2024 08:46, David Brown wrote:

    Tim ruled out &&, ||, ?:, goto, break, continue, if, for, while,
    switch, do, labels, setjmp and longjmp.

    He didn't rule out recursion, or the relational operators, or any
    other part of C.


    int fact(int n);

    int fact_zero(int n) {
             return 1;
    }

    int n_fact_n1(int n) {
             return n * fact(n - 1);
    }

    int fact(int n) {
             return (int (*[])(int)){ fact_zero, n_fact_n1 }[(bool) n](n);
    }


    There are additional fun things that can be done using different
    operators.  For an unsigned integer "n" that is not big enough to
    wrap, "(n + 2) / (n + 1) - 1"  evaluates "(n == 0)".

    Isn't this just !n ? I don't think "!" was ruled out. This would also
    work for negative n.

    Sure. It was merely another example of something you could use, if you
    had ruled out simpler things (like the conversion to bool that I used,
    or the ! operator that you suggest).



    And Tim did not rule out using the standard library, which would
    surely open up new possibilities.

    printf (not sprintf) would be reasonable here to show results. Anything
    else could be considered cheating.


    No, I would not say so - as long as the standard library is not ruled
    out, it is part of C. But I think you could reasonably argue that
    allowing the standard library makes this whole pointless exercise even
    more pointless!

    The original context was a small subset of C that can be used to
    represent a larger subset.

    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From David Brown@david.brown@hesbynett.no to comp.lang.c on Mon Dec 23 13:24:14 2024
    From Newsgroup: comp.lang.c

    On 23/12/2024 12:40, Michael S wrote:
    On Mon, 23 Dec 2024 09:46:46 +0100
    David Brown <david.brown@hesbynett.no> wrote:


    And Tim did not rule out using the standard library,


    Are you sure?


    Fairly sure, yes.

    But if you think I missed something, please say.

    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From scott@scott@slp53.sl.home (Scott Lurndal) to comp.lang.c on Mon Dec 23 15:41:40 2024
    From Newsgroup: comp.lang.c

    Michael S <already5chosen@yahoo.com> writes:
    On Sun, 22 Dec 2024 20:41:44 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:



    Whether you have the test in an 'if', or in a ternary '?:', or
    use it through a bool-int coercion as integer index to an indexed
    function[-pointer] table; it's a conditional branch based on the
    ("Test Set") predicate i<n. You showed in your example how to get
    rid of the 'if' symbol, but you could - as expected - not get rid
    of the actual test that is the substance of a conditional branch.

    I think that is what is to expect by the theory and the essence of
    the point I tried to make.

    Janis



    You make no sense. I am starting to suspect that the reason for it
    is ignorance rather than mere stubbornness.

    https://godbolt.org/z/EKo5rrYce
    Show me conditional branch in the right pane.


    The 'C' in 'CSET' is short for conditional. Because
    the branch is folded into the compare doesn't mean it
    isn't there.
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From bart@bc@freeuk.com to comp.lang.c on Mon Dec 23 15:51:24 2024
    From Newsgroup: comp.lang.c

    On 23/12/2024 15:41, Scott Lurndal wrote:
    Michael S <already5chosen@yahoo.com> writes:
    On Sun, 22 Dec 2024 20:41:44 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:



    Whether you have the test in an 'if', or in a ternary '?:', or
    use it through a bool-int coercion as integer index to an indexed
    function[-pointer] table; it's a conditional branch based on the
    ("Test Set") predicate i<n. You showed in your example how to get
    rid of the 'if' symbol, but you could - as expected - not get rid
    of the actual test that is the substance of a conditional branch.

    I think that is what is to expect by the theory and the essence of
    the point I tried to make.

    Janis



    You make no sense. I am starting to suspect that the reason for it
    is ignorance rather than mere stubbornness.

    https://godbolt.org/z/EKo5rrYce
    Show me conditional branch in the right pane.


    The 'C' in 'CSET' is short for conditional. Because
    the branch is folded into the compare doesn't mean it
    isn't there.

    That's just a mnemonic, which doesn't exist in the x86 version.

    Anyway, 'w0' seems to be set either way, and the program counter will
    point to the same instruction in each case too.

    So there's no branching at this level of code, unless you consider
    stepping PC to the next instruction to be a jump.

    How is it 'folded into' the compare anyway? Are they not two independent
    instructions?
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Michael S@already5chosen@yahoo.com to comp.lang.c on Mon Dec 23 18:05:48 2024
    From Newsgroup: comp.lang.c

    On Mon, 23 Dec 2024 15:41:40 GMT
    scott@slp53.sl.home (Scott Lurndal) wrote:

    Michael S <already5chosen@yahoo.com> writes:
    On Sun, 22 Dec 2024 20:41:44 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:



    Whether you have the test in an 'if', or in a ternary '?:', or
    use it through a bool-int coercion as integer index to an indexed
    function[-pointer] table; it's a conditional branch based on the
    ("Test Set") predicate i<n. You showed in your example how to get
    rid of the 'if' symbol, but you could - as expected - not get rid
    of the actual test that is the substance of a conditional branch.

    I think that is what is to expect by the theory and the essence of
    the point I tried to make.

    Janis



    You make no sense. I am starting to suspect that the reason for it
    is ignorance rather than mere stubbornness.

    https://godbolt.org/z/EKo5rrYce
    Show me conditional branch in the right pane.


    The 'C' in 'CSET' is short for conditional. Because
    the branch is folded into the compare doesn't mean it
    isn't there.

    No, branch is not "folded". It is absent. CSET is an ALU operation.
    The logical-arithmetic nature of comparison operator is even more
    pronounced in code that gcc generates for POWER
    https://godbolt.org/z/8Gs9s6nEo
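
    To illustrate the source-level side of this point: in C the result of
    a comparison is an ordinary integer, 0 or 1, and can feed further
    arithmetic. The function below is a hypothetical example (min_u32 is
    not from the linked code), and whether a particular compiler emits a
    branch for it is up to that compiler:

```c
#include <stdint.h>

/* Minimum of two values with no if, ?:, && or ||.
   (a < b) is 0 or 1; subtracting it from 0 gives a mask of all zero
   bits or all one bits. */
static uint32_t min_u32(uint32_t a, uint32_t b)
{
    uint32_t mask = (uint32_t)0 - (uint32_t)(a < b);
    return b ^ ((a ^ b) & mask);   /* a when mask is all ones, else b */
}
```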





    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Mon Dec 23 13:02:02 2024
    From Newsgroup: comp.lang.c

    Michael S <already5chosen@yahoo.com> writes:

    On Sat, 21 Dec 2024 21:31:24 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    So your statement asks for some explanation at least.

    I would guess that Tim worked as CS professor for several dozen years.
    And it shows.

    I'm not sure whether to feel flattered or insulted. ;)
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Mon Dec 23 13:18:24 2024
    From Newsgroup: comp.lang.c

    Michael S <already5chosen@yahoo.com> writes:

    On Mon, 23 Dec 2024 09:46:46 +0100
    David Brown <david.brown@hesbynett.no> wrote:

    And Tim did not rule out using the standard library,

    Are you sure?

    I explicitly called out setjmp and longjmp as being excluded.
    Based on that, it's reasonable to infer the rest of the
    standard library is allowed.

    Furthermore I don't think it matters. Except for a very small
    set of functions -- eg, fopen, fgetc, fputc, malloc, free --
    everything else in the standard library either isn't important
    for Turing Completeness or can be synthesized from the base
    set. The functionality of fprintf(), for example, can be
    implemented on top of fputc and non-library language features.
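
    As a small illustration of that last claim, here is a sketch of the
    "%lu" part of fprintf built on fputc alone. The name put_unsigned is
    invented for this example, and ordinary control flow is used for
    readability; it says nothing about how any real library does it:

```c
#include <stdio.h>

/* Writes n in decimal to out using only fputc.
   Returns the number of characters written, or -1 on a write error. */
static int put_unsigned(FILE *out, unsigned long n)
{
    int written = 0;

    if (n >= 10) {
        /* Emit the higher-order digits first. */
        written = put_unsigned(out, n / 10);
        if (written < 0)
            return -1;
    }
    if (fputc((int)('0' + n % 10), out) == EOF)
        return -1;
    return written + 1;
}
```

    Calling put_unsigned(stdout, 12345UL) writes the five digits and
    returns 5.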
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Mon Dec 23 13:25:48 2024
    From Newsgroup: comp.lang.c

    On 12/23/2024 1:02 PM, Tim Rentsch wrote:
    Michael S <already5chosen@yahoo.com> writes:

    On Sat, 21 Dec 2024 21:31:24 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    So your statement asks for some explanation at least.

    I would guess that Tim worked as CS professor for several dozen years.
    And it shows.

    I'm not sure whether to feel flattered or insulted. ;)

    AHAHA! lol. You forced me to laugh here. wow. :^D
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Mon Dec 23 13:28:38 2024
    From Newsgroup: comp.lang.c

    On 12/23/2024 12:46 AM, David Brown wrote:
    On 22/12/2024 20:41, Janis Papanagnou wrote:
    On 22.12.2024 07:01, Waldek Hebisch wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
    On 21.12.2024 02:28, Tim Rentsch wrote:
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 16.12.2024 00:53, BGB wrote:

    [...]

    Pretty much all higher level control flow can be expressed via goto.
    A 'goto' may be used but it isn't strictly *necessary*. What *is*
    necessary, though, that is an 'if' (some conditional branch), and
    either 'goto' or recursive functions.

    Conditional branches, including 'if', '?:', etc., are not strictly
    necessary either.

    No? - Can you give an example of your statement?

    Look at example that I posted (apparently neither you nor Tim
    looked at my posts where I explained in detail how to translate
    goto program (with conditional jumps) into program that contains
    no goto and no conditional jumps).

    I'm not sure but may have just skimmed over your "C" example if it
    wasn't of interest to the point I tried to make (at that stage).

    Or try to figure out how to do this knowing that C has function
    pointers.

    I will retry to explain what I tried to say... - very simply put...

    There's "Recursive Functions" and the Turing Machines "equivalent".
    The "Recursive Functions" is the most powerful class of algorithms.
    Formal Recursive Functions are formally defined in terms of abstract
    mathematical formulated properties; one of these [three properties]
    are the "Test Sets". (Here I can already stop.)

    But since we're not in a theoretical CS newsgroup I'd just wanted
    to see an example of some common, say, mathematical function and
    see it implemented without 'if' and 'goto' or recursion. - Take a
    simple one, say, fac(n) = n! , the factorial function. I know how
    I can implement that with 'if' and recursion, and I know how I can
    implement that with 'while' (or 'goto').

    If I re-inspect your example upthread - I hope it was the one you
    wanted to refer to - I see that you have removed the 'if' symbol
    but not the conditional, the test function; there's still the
    predicate (the "Test Set") present in form of 'int c2 = i < n',
    and it's there in the original code, in the goto transformed code,
    and in the function-pointer code. And you cannot get rid of that.

    Whether you have the test in an 'if', or in a ternary '?:', or
    use it through a bool-int coercion as integer index to an indexed
    function[-pointer] table; it's a conditional branch based on the
    ("Test Set") predicate i<n. You showed in your example how to get
    rid of the 'if' symbol, but you could - as expected - not get rid
    of the actual test that is the substance of a conditional branch.

    I think that is what is to expect by the theory and the essence of
    the point I tried to make.


    You are adding more restrictions than Tim had given.

    We all know that for most non-trivial algorithms you need some kind of
    repetition (loops, recursion, etc.) and some way to end that repetition.
    No one is claiming otherwise.

    Tim ruled out &&, ||, ?:, goto, break, continue, if, for, while, switch,
    do, labels, setjmp and longjmp.

    He didn't rule out recursion, or the relational operators, or any other
    part of C.
    [...]

    pseudo code

    func_ptr icall = funcs[i % 3];

    icall->you()

    ;^)

    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Mon Dec 23 14:00:38 2024
    From Newsgroup: comp.lang.c

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 22.12.2024 07:01, Waldek Hebisch wrote:

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 21.12.2024 02:28, Tim Rentsch wrote:

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 16.12.2024 00:53, BGB wrote:

    [...]

    Pretty much all higher level control flow can be expressed via goto.
    A 'goto' may be used but it isn't strictly *necessary*. What *is*
    necessary, though, that is an 'if' (some conditional branch), and
    either 'goto' or recursive functions.

    Conditional branches, including 'if', '?:', etc., are not strictly
    necessary either.

    No? - Can you give an example of your statement?

    Look at example that I posted (apparently neither you nor Tim
    looked at my posts where I explained in detail how to translate
    goto program (with conditional jumps) into program that contains
    no goto and no conditional jumps).

    I'm not sure but may have just skimmed over your "C" example if it
    wasn't of interest to the point I tried to make (at that stage).

    Or try to figure out how to do this knowing that C has function
    pointers.

    I will retry to explain what I tried to say... - very simply put...

    There's "Recursive Functions" and the Turing Machines "equivalent".
    The "Recursive Functions" is the most powerful class of algorithms.
    Formal Recursive Functions are formally defined in terms of abstract
    mathematical formulated properties; one of these [three properties]
    are the "Test Sets". (Here I can already stop.)

    But since we're not in a theoretical CS newsgroup I'd just wanted
    to see an example of some common, say, mathematical function and
    see it implemented without 'if' and 'goto' or recursion. - Take a
    simple one, say, fac(n) = n! , the factorial function. I know how
    I can implement that with 'if' and recursion, and I know how I can
    implement that with 'while' (or 'goto').

    If I re-inspect your example upthread - I hope it was the one you
    wanted to refer to - I see that you have removed the 'if' symbol
    but not the conditional, the test function; there's still the
    predicate (the "Test Set") present in form of 'int c2 = i < n',
    and it's there in the original code, in the goto transformed code,
    and in the function-pointer code. And you cannot get rid of that.

    Are you sure about that?
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Mon Dec 23 14:05:55 2024
    From Newsgroup: comp.lang.c

    scott@slp53.sl.home (Scott Lurndal) writes:

    Michael S <already5chosen@yahoo.com> writes:

    On Sun, 22 Dec 2024 20:41:44 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:



    Whether you have the test in an 'if', or in a ternary '?:', or
    use it through a bool-int coercion as integer index to an indexed
    function[-pointer] table; it's a conditional branch based on the
    ("Test Set") predicate i<n. You showed in your example how to get
    rid of the 'if' symbol, but you could - as expected - not get rid
    of the actual test that is the substance of a conditional branch.

    I think that is what is to expect by the theory and the essence of
    the point I tried to make.

    You make no sense. I am starting to suspect that the reason for it
    is ignorance rather than mere stubbornness.

    https://godbolt.org/z/EKo5rrYce
    Show me conditional branch in the right pane.

    The 'C' in 'CSET' is short for conditional. Because
    the branch is folded into the compare doesn't mean it
    isn't there.

    It's a moot point because relational operators and equality
    operators can be synthesized out of bitwise and arithmetic
    operators.
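
    One possible way to do that for 32-bit unsigned values is sketched
    below. The function names are invented here, and this is just one
    construction among several, not necessarily the one the poster has
    in mind:

```c
#include <stdint.h>

/* 1 if x != 0, else 0. For any nonzero x, either x or its
   two's-complement negation has bit 31 set. */
static uint32_t nonzero_u32(uint32_t x)
{
    uint32_t neg = (uint32_t)(0u - x);
    return (uint32_t)((x | neg) >> 31);
}

/* x == y without using == or !=. */
static uint32_t eq_u32(uint32_t x, uint32_t y)
{
    return nonzero_u32(x ^ y) ^ 1u;
}

/* x < y without using <. The 64-bit difference wraps around and sets
   bit 63 exactly when x is smaller than y. */
static uint32_t lt_u32(uint32_t x, uint32_t y)
{
    return (uint32_t)(((uint64_t)x - (uint64_t)y) >> 63);
}
```

    The results are 0 or 1, so they can index a function-pointer table in
    the same way as the built-in relational operators.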
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Mon Dec 23 15:50:44 2024
    From Newsgroup: comp.lang.c

    On 12/23/2024 1:25 PM, Chris M. Thomasson wrote:
    On 12/23/2024 1:02 PM, Tim Rentsch wrote:
    Michael S <already5chosen@yahoo.com> writes:

    On Sat, 21 Dec 2024 21:31:24 +0100
    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    So your statement asks for some explanation at least.

    I would guess that Tim worked as CS professor for several dozen years.
    And it shows.

    I'm not sure whether to feel flattered or insulted.  ;)

    AHAHA! lol. You forced me to laugh here. wow. :^D

    merry christmas!
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Ben Bacarisse@ben@bsb.me.uk to comp.lang.c on Tue Dec 24 00:41:23 2024
    From Newsgroup: comp.lang.c

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Michael S <already5chosen@yahoo.com> writes:

    On Mon, 23 Dec 2024 09:46:46 +0100
    David Brown <david.brown@hesbynett.no> wrote:

    And Tim did not rule out using the standard library,

    Are you sure?

    I explicitly called out setjmp and longjmp as being excluded.
    Based on that, it's reasonable to infer the rest of the
    standard library is allowed.

    Furthermore I don't think it matters.

    Hmm... I'm puzzled. Where does the unbounded store come from without
    I/O? Do you take "C is Turing complete" to mean that there is a
    theoretically possible implementation of C sufficient for any given
    problem instance (rather than for any given problem)? That's not how
    different models are usually compared, and I think it would run into
    some rather odd theoretical problems.

    There is a somewhat informal version of "C (with the restrictions you
    have stated) is Turing complete" which just means "you can do anything
    you want provided you don't hit an implementation limit".
    --
    Ben.
    --- Synchronet 3.20a-Linux NewsLink 1.114
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Mon Dec 23 20:55:04 2024
    From Newsgroup: comp.lang.c

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    Michael S <already5chosen@yahoo.com> writes:

    On Mon, 23 Dec 2024 09:46:46 +0100
    David Brown <david.brown@hesbynett.no> wrote:

    And Tim did not rule out using the standard library,

    Are you sure?

    I explicitly called out setjmp and longjmp as being excluded.
    Based on that, it's reasonable to infer the rest of the
    standard library is allowed.

    Furthermore I don't think it matters.

    Hmm... I'm puzzled. Where does the unbounded store come from without
    I/O? Do you take "C is Turing complete" to mean that there is a
    theoretically possible implementation of C sufficient for any given
    problem instance (rather than for any given problem)? That's not how
    different models are usually compared, and I think it would run into
    some rather odd theoretical problems.

    Sorry, it seems my comment was misleading. I thought it was
    apparent from the rest of my paragraph (not shown in your excerpt)
    that my statement was meant as "Furthermore I don't think it matters
    if _most_ of the standard library is excluded." There had been a
    mention of printf as being infringing (which in my view is silly,
    but never mind that), so I wanted to point out that most of the
    standard library is irrelevant, including in particular [f]printf.

    There is a somewhat informal version of "C (with the restrictions you
    have stated) is Turing complete" which just means "you can do anything
    you want provided you don't hit an implementation limit".

    Yes, I'm familiar with that, and I knowingly glossed over the
    distinction, because I think it's customary, when talking about
    Turing Completeness relative to conventional programming languages,
    to ignore the finiteness of conventional language models. I should
    have known better with you in the audience. You got me! :)
  • From BGB@cr88192@gmail.com to comp.lang.c on Wed Dec 25 00:51:37 2024
    From Newsgroup: comp.lang.c

    On 12/23/2024 1:43 AM, David Brown wrote:
    On 23/12/2024 03:41, Waldek Hebisch wrote:
    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    The comments I made here, in two responses to postings of yours,
    were not statements of opinion but statements of fact.

    They are opinions _about facts_, or if you prefer, opinion
    about truth value of some statements.

    You can program in C without the "normal" conditional statements or
    expressions.  You can make an array of two (or more) function pointers
    and select between them using your controlling expression, and that
    should be sufficient for conditionals.  (There may be other methods too.)
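
    As a rough illustration of the function-pointer technique described
    above (not code from the thread; the names when_false, when_true and
    abs_no_branch are made up for this sketch), a two-entry table indexed by
    a comparison result can stand in for an if/else:

```c
/* Hypothetical names; nothing here comes from the thread itself. */
typedef int (*branch_fn)(int);

static int when_false(int x) { return -x; }  /* selected when the test yields 0 */
static int when_true(int x)  { return x; }   /* selected when the test yields 1 */

/* Returns |x| without if, ?:, switch, && or ||.
   (Like abs(), the result is not representable for INT_MIN.) */
int abs_no_branch(int x)
{
    branch_fn table[2] = { when_false, when_true };
    /* A relational operator yields exactly 0 or 1, so it can index the table. */
    return table[x >= 0](x);
}
```

    For an arbitrary controlling expression e, indexing with !!e would
    normalise it to 0 or 1 first.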

    So as far as I can see, Tim gave statements of fact, not opinion.

    Jumping back in:
    That one can do this seems obvious enough;
    The downside, as I see it, is that there is no current or likely
    processor hardware where this would be performance-competitive with the
    more traditional if-goto mechanism (and if the backend is expected to
    optimize it away, it is not obvious what would be gained).


    Sort of like with "continuation passing style":
    Yes, you can do this, but the performance overhead relative to
    conventional call-frames is severe.

    But, CPS does at least have use-cases which can justify this overhead.



    Though, FWIW, doing control flow via a combination of CPS and plugging
    things together with function pointers is fairly useful in implementing
    things like fast interpreters (where calling through function pointers
    can be faster than going through big if/else trees or "switch()" blocks).

    Where, early on in writing interpreters, I had often run into a limit
    where the interpreter would become bottlenecked by how quickly it could
    spin in a loop and feed instructions through a big "switch()" block.
    Using function pointers can theoretically sidestep this limit (then one
    is more limited by how quickly they can walk the trace graph and call
    the relevant function pointers).

    But, can get within 10x of native code in some cases, which is pretty
    fast by interpreter standards (to get much faster usually requires a JIT).
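
    A minimal sketch of the function-pointer dispatch idea, assuming
    pre-decoded instructions that each carry a handler pointer and a link
    to the next instruction (the VM, Op and handler names are hypothetical,
    and the code uses C99 initializers; this is not the interpreter
    described above):

```c
#include <stddef.h>

typedef struct VM VM;
typedef struct Op Op;

/* Each handler executes one pre-decoded instruction and returns the next one. */
typedef const Op *(*OpFn)(VM *vm, const Op *op);

struct VM { long acc; };
struct Op { OpFn run; long imm; const Op *next; };

static const Op *op_add(VM *vm, const Op *op)  { vm->acc += op->imm; return op->next; }
static const Op *op_mul(VM *vm, const Op *op)  { vm->acc *= op->imm; return op->next; }
static const Op *op_halt(VM *vm, const Op *op) { (void)vm; (void)op; return NULL; }

/* The dispatch loop contains no switch: it only follows function pointers. */
long run_trace(const Op *op, long start)
{
    VM vm = { start };
    while (op != NULL)
        op = op->run(&vm, op);
    return vm.acc;
}

/* Example trace computing (start + 3) * 4. */
long demo_trace(long start)
{
    Op halt = { op_halt, 0, NULL };
    Op mul  = { op_mul, 4, &halt };
    Op add  = { op_add, 3, &mul };
    return run_trace(&add, start);
}
```

    The decode cost is paid once when the trace is built; the loop itself
    is then limited mainly by the indirect calls.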


    Well, except in my current emulator, where in trying to be
    cycle-accurate, the much bigger overhead is in trying to mimic behavior
    and cycle costs of the cache hierarchy and similar.


    You can say that Tim's posts were patronising, arrogant, and irritating.
    /That/ would be an opinion - a /justified/ opinion because it is
    backed up in the evidence of these posts and corroborating evidence from
    previous posts and discussions from Tim.  But without some kind of
    precise definition of the terms involved and a robust and repeatable
    method of classification, it could not be called "fact".

    You could say that Tim's posts were intended to be annoying, or you
    could say that he has refused to give an answer to how C can be used
    without the "normal" conditionals because he realises he was wrong in
    his posts and won't admit it.  That would be /unjustified/ opinion - or
    "speculation" - because we have no way of knowing his motives or
    anything more than what he wrote in his posts.


    You could, quite fairly, characterise Tim's posts as unjustified
    statements of fact - because he has stated his claim as fact, but has
    given no justification or reasoning, and it is not something that is
    obvious or well-known to people.


      They are
    no more statements of opinion than a statement about whether the
    Riemann Hypothesis is true is a statement of opinion.  Someone
    might wonder whether an assertion "The Riemann Hypothesis is
    true" is true or false, but it is still a matter of fact, not a
    matter of opinion.

    It is reasonable to assume that you do not know if Riemann Hypothesis
    is true or false.

    I think if anyone knew the truth or falsity of the Riemann Hypothesis -
    i.e., they had a proof one way or the other - we'd have heard about it!

    So if you say "Riemann Hypothesis is true",
    this is just your opinion.

    No, that would not be an opinion.  It would be an unjustified claim.
    "I /believe/ the Riemann Hypothesis is true" is an opinion.

     I am not a native English speaker
    but I believed that "statements of opinion" means just that:
    person does not know the truth, but makes a statement.


    No, an opinion is a personal preference or judgement.  That's very
    different from not knowing about something factual.  If I say "the
    number 17 will turn up in next week's lottery numbers", that's not an
    opinion, it's a claim about facts.  It's an unjustified claim, since I
    don't know if it is true or not, but it's not an opinion.

    It is not always clear when something is a fact or not, and whether a
    statement is a justified statement of fact, an unjustified statement of
    fact (i.e., it might happen to be true, but you have not presented
    evidence of it), a justified opinion, or an unjustified opinion.  I'm
    sure there's a philosophy group on Usenet somewhere, but I doubt if
    cross-posting there would lead to any clarification!



  • From BGB@cr88192@gmail.com to comp.lang.c on Wed Dec 25 03:41:41 2024
    From Newsgroup: comp.lang.c

    On 12/23/2024 3:18 PM, Tim Rentsch wrote:
    Michael S <already5chosen@yahoo.com> writes:

    On Mon, 23 Dec 2024 09:46:46 +0100
    David Brown <david.brown@hesbynett.no> wrote:

    And Tim did not rule out using the standard library,

    Are you sure?

    I explicitly called out setjmp and longjmp as being excluded.
    Based on that, it's reasonable to infer the rest of the
    standard library is allowed.

    Furthermore I don't think it matters. Except for a very small
    set of functions -- eg, fopen, fgetc, fputc, malloc, free --
    everything else in the standard library either isn't important
    for Turing Completeness or can be synthesized from the base
    set. The functionality of fprintf(), for example, can be
    implemented on top of fputc and non-library language features.


    If I were to choose a set of primitive functions, probably:
    malloc/free and/or realloc
    could define, say:
    malloc(sz) => realloc(NULL, sz)
    free(ptr) => realloc(ptr, 0)
    Maybe _msize and _mtag/..., but this is non-standard.
    With _msize, can implement realloc on top of malloc/free.

    For basic IO:
    fopen, fclose, fseek, fread, fwrite

    printf could be implemented on top of vsnprintf and fputs
    fputs can be implemented on top of fwrite (via strlen).
    With a temporary buffer being used for the printed string.
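
    The printf layering listed above could look roughly like this. It is
    only a sketch: my_fputs and my_fprintf are hypothetical names, the
    256-byte stack buffer is an arbitrary choice, and a heap buffer is
    used as a fallback for longer output.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* fputs-like: write a NUL-terminated string using only strlen and fwrite. */
int my_fputs(const char *s, FILE *fp)
{
    size_t n = strlen(s);
    return fwrite(s, 1, n, fp) == n ? 0 : EOF;
}

/* fprintf-like: format into a temporary buffer, then hand it to my_fputs. */
int my_fprintf(FILE *fp, const char *fmt, ...)
{
    char small[256];
    char *buf = small;
    va_list ap;
    int len;

    va_start(ap, fmt);
    len = vsnprintf(small, sizeof small, fmt, ap);
    va_end(ap);
    if (len < 0)
        return len;

    if ((size_t)len >= sizeof small) {
        /* Output was truncated: retry with a heap buffer of the right size. */
        buf = malloc((size_t)len + 1);
        if (buf == NULL)
            return -1;
        va_start(ap, fmt);
        vsnprintf(buf, (size_t)len + 1, fmt, ap);
        va_end(ap);
    }
    if (my_fputs(buf, fp) != 0)
        len = -1;
    if (buf != small)
        free(buf);
    return len;
}
```

    Because the string goes through strlen, output containing an embedded
    NUL (for example from "%c" with a zero argument) would be cut short;
    passing the length from vsnprintf straight to fwrite would avoid that.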

    ...


    Though, one may still end up with various other stuff over the interface
    as well. Though, the interface can be made open-ended if one has a
    GetInterface call or similar, which can request other interfaces given
    an ID, such as a FOURCC/EIGHTCC pair, a SIXTEENCC, or a GUID (*1). IMHO,
    this is generally preferable over a "GetProcAddress" mechanism due to
    lower overheads; though, with an annoyance that interface vtables
    generally have a fixed layout (generally can't really add or change
    anything without creating binary compatibility issues; so a lot of
    tables/structures need to be kept semi-frozen).

    Though, APIs like DirectX had dealt with the issue of having version
    numbers for vtables and then one requests a specific version of the
    vtable (within the range of versions supported by the major version of
    DirectX). But, this is crufty.

    *1: Say: QWORD qwMajor, QWORD qwMinor.
    qwMajor:
    Major ID (FOURCC, EIGHTCC)
    Or: First 8 bytes of SIXTEENCC or GUID
    qwMinor:
    SubID/Version (FOURCC or EIGHTCC)
    Second 8 bytes of SIXTEENCC or GUID.
    Where:
    High 32 bits are 0, assume FOURCC.
    Else, look at bits to determine EIGHTCC vs GUID.
    Assume if both are EIGHTCC, value represents a SIXTEENCC.
    Bit patterns for valid SIXTEENCCs vs GUIDs are mutually exclusive.
    Names make more sense for public interfaces.
    Leaving GUIDs mostly for private/internal interfaces.
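
    A minimal sketch of the two-QWORD ID scheme in (*1). The post does not
    give a byte order or the exact bit test for EIGHTCC versus GUID, so the
    packing order below (first character in the low byte) is an assumption
    and only the FOURCC check is shown; id_pack, id_is_fourcc and iface_id
    are hypothetical names.

```c
#include <stdint.h>

/* Pack up to 8 characters into a 64-bit value, first character in the
   low byte. The byte order is an assumption made for this sketch. */
uint64_t id_pack(const char *s)
{
    uint64_t v = 0;
    int i;
    for (i = 0; i < 8 && s[i] != '\0'; i++)
        v |= (uint64_t)(unsigned char)s[i] << (8 * i);
    return v;
}

/* "High 32 bits are 0, assume FOURCC." */
int id_is_fourcc(uint64_t v)
{
    return (v >> 32) == 0;
}

typedef struct {
    uint64_t major;  /* qwMajor: interface ID */
    uint64_t minor;  /* qwMinor: sub-ID / version */
} iface_id;

/* Two IDs name the same interface when both halves match. */
int iface_id_equal(iface_id a, iface_id b)
{
    return a.major == b.major && a.minor == b.minor;
}
```

    Comparing two 64-bit integers is cheap, which fits the stated goal of
    lower overhead than a name-based lookup.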

    Well, unlike Windows, where they use GUIDs for pretty much everything
    here (and also, I didn't bother with an IDL compiler; generally doing
    all this directly in C).


    Well, and some wonk, like the exact contents of structures like
    BITMAPINFOHEADER being interpreted based on using biSize as a magic
    number (well, sometimes with other stuff glued onto the end, as
    understood based on the use of the biCompression field), ...

    But, it has held up well, this structure being almost as old as I am...



    In a few cases, one might also take the option of using a "DriverProc()"
    style interface, where one provides a pair of context-dependent pointers
    and uses magic numbers to identify the desired operation, or, intermediate:
    (*ifvt)->QueryProc(ifvt, iHdl, lParm, pParm1, pParm2);
    (*ifvt)->ModifyProc(ifvt, iHdl, lParm, pParm1, pParm2);

    Where, QueryProc is intended for non-destructive operations, and
    ModifyProc for destructive operations.
    iHdl: Context-dependent integer handle;
    lParm: Magic command number.
    pParm1/pParm2: Magic pointers, often:
    pParm1: Input data address;
    pParm2: Output data address.

    Where, vtable is usually provided in "VT **" form, hence the need to
    deref the table before a method can be invoked.
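
    A small sketch of the "VT **" calling pattern with a QueryProc and
    ModifyProc pair. The IfVt layout, the Counter object and the CMD_*
    command numbers are invented for illustration; the sketch assumes the
    object begins with its vtable pointer, so a pointer to that member also
    identifies the object.

```c
#include <stddef.h>

typedef struct IfVt IfVt;
struct IfVt {
    long (*QueryProc)(IfVt **self, int iHdl, long lParm, void *pParm1, void *pParm2);
    long (*ModifyProc)(IfVt **self, int iHdl, long lParm, void *pParm1, void *pParm2);
};

/* The vtable pointer is the first member, so &obj.vt can be converted
   back to a pointer to the whole object. */
typedef struct { IfVt *vt; long value; } Counter;

enum { CMD_GET = 1, CMD_ADD = 2 };  /* hypothetical magic command numbers */

static long counter_query(IfVt **self, int iHdl, long lParm, void *pParm1, void *pParm2)
{
    Counter *c = (Counter *)self;
    (void)iHdl; (void)pParm1;
    if (lParm != CMD_GET || pParm2 == NULL)
        return -1;
    *(long *)pParm2 = c->value;          /* pParm2: output data address */
    return 0;
}

static long counter_modify(IfVt **self, int iHdl, long lParm, void *pParm1, void *pParm2)
{
    Counter *c = (Counter *)self;
    (void)iHdl; (void)pParm2;
    if (lParm != CMD_ADD || pParm1 == NULL)
        return -1;
    c->value += *(const long *)pParm1;   /* pParm1: input data address */
    return 0;
}

static IfVt counter_vt = { counter_query, counter_modify };

long counter_demo(void)
{
    Counter obj = { &counter_vt, 10 };
    IfVt **ifvt = &obj.vt;
    long delta = 5, out = 0;

    (*ifvt)->ModifyProc(ifvt, 0, CMD_ADD, &delta, NULL);  /* destructive */
    (*ifvt)->QueryProc(ifvt, 0, CMD_GET, NULL, &out);     /* non-destructive */
    return out;
}
```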



    Actually, some of this overlaps with how I had implemented the C library
    for DLLs in my project:
    Only the main binary has the full C library;
    DLL's generally use a C library which calls back to the main C library
    via a COM style interface (things like malloc/free and stdio calls are
    routed over this interface).

    Note that this is partly because in my case:
    1, DLLs only allow an acyclic dependency graph;
    2, The mechanism does not currently allow sharing global variables;
    3, There was a desire to allow dlopen/dlsym to dynamically load libraries.

    1 & 3 mean that if a statically-linked C library is used for the main
    binary:
    One needs to also statically link a C library to each DLL;
    The C library needs to operate over a COM interface for shared interfaces.

    Or, alternatively, that only a DLL may be used for the C library, and
    all DLLs would need to use the same C library DLL.


    Note that neither 1 nor 2 traditionally apply with ELF Shared Objects
    (which usually both shared everything and allow for cyclic dependency
    graphs). But, traditionally ELF has other drawbacks, like needing to
    access variables and call functions via a GOT (which has higher overhead
    than direct calls, or accessing global variables as a fixed offset
    relative to a known base register, ...).

    Note that having the kernel inject DLLs into a running process wouldn't
    really mix well with the way glibc approaches shared objects (where, it
    manages this stuff in userland, rather than having this left up to the
    kernel's program loader).

    May not matter as much though, as if providing a COM-like interface, one
    doesn't necessarily actually need dlopen/dlsym to be able to see the
    symbols in the library that the interface came from.


    Where, in this case, COM-like interfaces may be used in ways that
    deviate from usual dependency ordering, and are more flexible. They are
    awkward to use directly, so it may make sense to provide C API wrappers
    (thus far, usually statically linked, but they can fetch the interfaces
    they need from the main C library or the OS).

    Where, in my case, the OS interface is a mix of conventional syscalls
    and object-method-calls routed over the syscall interface (the target
    being either in the kernel or in another process; or the OS might load a
    DLL into the client process and return a process-local vtable).

    If non-local, generally the method pointers are generic, and serve to
    forward the call over the syscall mechanism (the syscall interface being
    used in a somewhat different way from how it would be used in something
    like Linux; where Linux generally just does not do things this way...).


    ...


  • From BGB@cr88192@gmail.com to comp.lang.c on Wed Dec 25 15:43:29 2024
    From Newsgroup: comp.lang.c

    On 12/25/2024 3:41 AM, BGB wrote:
    On 12/23/2024 3:18 PM, Tim Rentsch wrote:
    Michael S <already5chosen@yahoo.com> writes:

    On Mon, 23 Dec 2024 09:46:46 +0100
    David Brown <david.brown@hesbynett.no> wrote:

    And Tim did not rule out using the standard library,

    Are you sure?

    I explicitly called out setjmp and longjmp as being excluded.
    Based on that, it's reasonable to infer the rest of the
    standard library is allowed.

    Furthermore I don't think it matters.  Except for a very small
    set of functions -- eg, fopen, fgetc, fputc, malloc, free --
    everything else in the standard library either isn't important
    for Turing Completeness or can be synthesized from the base
    set.  The functionality of fprintf(), for example, can be
    implemented on top of fputc and non-library language features.


    If I were to choose a set of primitive functions, probably:
      malloc/free and/or realloc
        could define, say:
          malloc(sz) => realloc(NULL, sz)
          free(ptr) => realloc(ptr, 0)
        Maybe _msize and _mtag/..., but this is non-standard.
          With _msize, can implement realloc on top of malloc/free.

    For basic IO:
      fopen, fclose, fseek, fread, fwrite

    printf could be implemented on top of vsnprintf and fputs
      fputs can be implemented on top of fwrite (via strlen).
      With a temporary buffer being used for the printed string.

    ...


    Though, one may still end up with various other stuff over the interface
    as well. Though, the interface can be made open-ended if one has a
    GetInterface call or similar, which can request other interfaces given
    an ID, such as a FOURCC/EIGHTCC pair, a SIXTEENCC, or a GUID (*1). IMHO,
    this is generally preferable over a "GetProcAddress" mechanism due to
    lower overheads; though, with an annoyance that interface vtables
    generally have a fixed layout (generally can't really add or change
    anything without creating binary compatibility issues; so a lot of
    tables/structures need to be kept semi-frozen).

    Though, APIs like DirectX had dealt with the issue of having version
    numbers for vtables and then one requests a specific version of the
    vtable (within the range of versions supported by the major version of
    DirectX). But, this is crufty.

    *1: Say: QWORD qwMajor, QWORD qwMinor.
      qwMajor:
        Major ID (FOURCC, EIGHTCC)
        Or: First 8 bytes of SIXTEENCC or GUID
      qwMinor:
        SubID/Version (FOURCC or EIGHTCC)
        Second 8 bytes of SIXTEENCC or GUID.
      Where:
        High 32 bits are 0, assume FOURCC.
        Else, look at bits to determine EIGHTCC vs GUID.
        Assume if both are EIGHTCC, value represents a SIXTEENCC.
        Bit patterns for valid SIXTEENCCs vs GUIDs are mutually exclusive.
        Names make more sense for public interfaces.
          Leaving GUIDs mostly for private/internal interfaces.

    Well, unlike Windows, where they use GUIDs for pretty much everything
    here (and also, I didn't bother with an IDL compiler; generally doing
    all this directly in C).


    Clarification:
    Though, despite taking influence from COM, it is not COM.

    I am not using the COM API, and practices regarding vtable structure,
    etc., are generally a bit looser.



    There is also not currently any plan to actually implement the OLE or
    COM APIs. Only that some similar ideas are in use.

    Pretty much everything else is different...

    COM uses a 16-byte struct to convey a GUID;
    I was using pairs of 64-bit integer values.

    ...

    Some ideas from OLE, such as storing object instances from one library
    in a "document" held by an unrelated program instance, and then
    saving/reloading them later, are not a thing in my case.

    It is possible I could consider doing something similar to OLE, but I
    don't have an immediate use-case (and, more often, I was using the
    object interfaces internally for things like OS level APIs).


    Note that many core OS APIs are still a bit more mundane, like:
    memory is still managed using pointers;
    file I/O is still managed using integer handles;
    ...


    Though, within the kernel, open VFS files are implemented via objects
    with vtable pointers. This detail is not exposed to program instances,
    where the system calls identify them via integer handles.


    Well, and also I am using a Unix style directory tree structure, rather
    than drive letters.

    But, does differ some in things like locating DLLs for a program:
    ELF:
    Either "/lib/", "/usr/lib/",
    or a hard-coded path in the binary.
    Win:
    Check current directory;
    Then search PATH;
    TK:
    Check first in the directory the EXE is found;
    Then search LIBPATH;
    Then search PATH.

    Hard coding paths in the binary does mean though that the installation
    path for any binaries that depend on custom SOs is fixed, which is not
    ideal. Checking relative to the binary allows more flexible installation
    paths.
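
    The "TK" lookup order above (EXE directory, then LIBPATH, then PATH)
    could be sketched as follows. This is not the actual loader: the ':'
    list separator, the exists_fn callback and the names search_list,
    find_dll and demo_exists are all assumptions made so the example stays
    self-contained.

```c
#include <stdio.h>
#include <string.h>

/* Caller-supplied predicate: does a file exist at this path? */
typedef int (*exists_fn)(const char *path);

/* Try "dir/name" for each directory in a ':'-separated list. */
static int search_list(const char *list, const char *name,
                       exists_fn exists, char *out, size_t outsz)
{
    while (list != NULL && *list != '\0') {
        size_t n = strcspn(list, ":");
        if (n > 0) {
            int r = snprintf(out, outsz, "%.*s/%s", (int)n, list, name);
            if (r >= 0 && (size_t)r < outsz && exists(out))
                return 1;
        }
        list += n;
        if (*list == ':')
            list++;
    }
    return 0;
}

/* Search order: directory of the EXE, then LIBPATH, then PATH.
   On success the full path is left in out and 1 is returned. */
int find_dll(const char *exe_dir, const char *libpath, const char *path,
             const char *name, exists_fn exists, char *out, size_t outsz)
{
    return search_list(exe_dir, name, exists, out, outsz)
        || search_list(libpath, name, exists, out, outsz)
        || search_list(path, name, exists, out, outsz);
}

/* Stand-in predicate for trying the sketch without touching the disk. */
int demo_exists(const char *path)
{
    return strcmp(path, "/usr/lib/foo.dll") == 0;
}
```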



    Well, and some wonk, like the exact contents of structures like
    BITMAPINFOHEADER being interpreted based on using biSize as a magic
    number (well, sometimes with other stuff glued onto the end, as
    understood based on the use of the biCompression field), ...

    But, it has held up well, this structure being almost as old as I am...


    Clarification:
    I am towards the older end of the Millennial / Gen Y age range...

    Started existence in the IBM Clones and MS-DOS era, but by the time I
    was using computers, was mostly in the era of Windows, CD-ROM based FMV
    games, and early/slow internet.




    In a few cases, one might also take the option of using a "DriverProc()"
    style interface, where one provides a pair of context-dependent pointers
    and uses magic numbers to identify the desired operation, or, intermediate:
      (*ifvt)->QueryProc(ifvt, iHdl, lParm, pParm1, pParm2);
      (*ifvt)->ModifyProc(ifvt, iHdl, lParm, pParm1, pParm2);

    Where, QueryProc is intended for non-destructive operations, and
    ModifyProc for destructive operations.
      iHdl: Context-dependent integer handle;
      lParm: Magic command number.
      pParm1/pParm2: Magic pointers, often:
        pParm1: Input data address;
        pParm2: Output data address.

    Where, vtable is usually provided in "VT **" form, hence the need to
    deref the table before a method can be invoked.


    Well, theoretically, say:
    First 4 pointers are reserved;
    Used internally for various stuff;
    Methods take the object instance as the first argument;
    ...

    The pointer itself points to an object instance, which may often be a dummy.
    Then, the object starts with a pointer to the vtable.




    Actually, some of this overlaps with how I had implemented the C library
    for DLLs in my project:
    Only the main binary has the full C library;
    DLL's generally use a C library which calls back to the main C library
    via a COM style interface (things like malloc/free and stdio calls are
    routed over this interface).


    Looking back at it, this may not count as "COM like", as, many of the
    vtable pointers deviate from the traditional form:
    Don't take the object as a first argument;
    Many are just plain C function pointers.
    Eg: ptr=(*vt)->malloc_fp(sz);

    Then again, for marshaling the C library across DLL boundaries, it
    likely does not matter (and would have complicated the interface;
    requiring method pointers to take a vtable pointer only to ignore it; as
    conceptually the C library is global across the whole process instance).


    Note that this is partly because in my case:
    1, DLLs only allow an acyclic dependency graph;
    2, The mechanism does not currently allow sharing global variables;
    3, There was a desire to allow dlopen/dlsym to dynamically load libraries.

    1 & 3 mean that if a statically-linked C library is used for the main binary:
    One needs to also statically link a C library to each DLL;
    The C library needs to operate over a COM interface for shared interfaces.

    Or, alternatively, that only a DLL may be used for the C library, and
    all DLLs would need to use the same C library DLL.



    Groan, I had described it as a COM interface, but as noted above, this
    is not correct in this case...

    The vtable in question isn't even close to following COM patterns.

    How closely patterns are followed is kinda variable.



    But, as noted, if no shared interface were used, then each DLL (and the
    main program binary) would effectively have their own heap and could not
    share "FILE *" pointers.

    While Windows generally has this limitation (at least with MSVC), I
    personally didn't want this (better if one can "malloc()" something in
    one library and "free()" it in another, and not horribly break the C
    runtime).

    Cygwin and MinGW had addressed this issue in different ways (say, in the
    case of Cygwin, by consolidating all of the core stuff into "cygwin1.dll").


    Can note that in my case, each binary image still gets its own native
    copy of things like memcpy/memset/strlen/...

    These don't depend on any external state, and generally one wants a
    low-overhead interface for these (along with potential of special
    handling by the compiler).



    I had at one point considered writing a new C library where this stuff
    would have been engineered better, but this fizzled due to inertia. In
    this case, the library would have been fully split into "client" and
    "server" parts:
      "client": Has all the recognizable parts of the C library;
      "server": Backend where all the magic happens (the core parts of malloc
        and stdio and similar reside here).


    In such a library, a lot of the client-side calls would be wrappers, say:
      void *malloc(size_t size)
      {
        __clib_autoinit(); //bring up C library if needed
        return((*__clib_vta)->Malloc(__clib_vta, size));
      }
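
    A self-contained sketch of the same client/server wrapper idea, with
    hypothetical names (ClibVt, clib_vta, client_malloc, ...) in place of
    the reserved double-underscore identifiers. Here the "server" simply
    forwards to the host C library, and the lazy init step stands in for
    fetching the vtable from the main binary.

```c
#include <stdlib.h>

typedef struct ClibVt ClibVt;
struct ClibVt {
    void *(*Malloc)(ClibVt **self, size_t size);
    void  (*Free)(ClibVt **self, void *ptr);
};

/* "Server" side: in this sketch it just forwards to the host allocator. */
static void *srv_malloc(ClibVt **self, size_t size) { (void)self; return malloc(size); }
static void  srv_free(ClibVt **self, void *ptr)     { (void)self; free(ptr); }

static ClibVt  srv_vt  = { srv_malloc, srv_free };
static ClibVt *srv_obj = &srv_vt;       /* object whose first word is the vtable */

/* "Client" side: thin wrappers that route every call over the vtable. */
static ClibVt **clib_vta = NULL;

static void clib_autoinit(void)
{
    /* Stand-in for asking the main binary (or OS) for the interface. */
    if (clib_vta == NULL)
        clib_vta = &srv_obj;
}

void *client_malloc(size_t size)
{
    clib_autoinit();                    /* bring up the C library if needed */
    return (*clib_vta)->Malloc(clib_vta, size);
}

void client_free(void *ptr)
{
    clib_autoinit();
    (*clib_vta)->Free(clib_vta, ptr);
}
```

    Since every module reaches the same server-side allocator, a block
    obtained in one module can be released in another.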


    It is unclear here if the server would still be static linked to the
    main EXE, or if it would be a component that is dynamically loaded by
    the kernel as needed during process creation (this could make the main
    EXE smaller);

    or, instead, going the route of having the C library server part inside
    of a common DLL (like in Cygwin).


    At present, there is a pointer in the task context structure that is set
    by the main binary to allow for the DLL's C libraries to bootstrap
    themselves (initially a "GetProcAddress" function, that at present only
    serves to fetch the main instance vtable pointer).


    Note that some libraries like TKGDI (used for graphics/sound/user-input)
    are themselves mostly a thin wrapper over a vtable on the client side
    (though, with some of their own logic, as data needs to be passed over
    the interface in "GlobalAlloc" memory buffers; as the server is generally
    running in a different process).


    TKRA-GL (my OpenGL implementation) also internally uses a vtable
    structure, but slightly different:
    Most of the normal OpenGL calls are handled on the client side;
    The vtable essentially mostly handles things like texture uploads and
    the backend logic for glDrawArrays / glDrawElements calls.

    Note that the "glBegin()"/"glEnd()" interface exists primarily as a
    wrapper over the "glDrawArrays()" mechanism.


    Currently the backend parts run in the kernel, but it is tempting to
    consider folding it off to a dynamically loadable module as it adds
    significant bulk (and is not always needed).

    Say, for example, only loading the OpenGL DLL kernel-side if a user
    program tries to create an instance of OpenGL.


    Likewise, maybe work towards further separating the client and server
    parts of the GL implementation, as there was not a split initially. Not
    really sure how it usually works in other systems, this stuff doesn't
    seem well documented (though, generally seems like, at least on Windows:
    "opengl32.dll" wraps a GPU vendor provided DLL, which then does whatever,
    to communicate with the backend driver).



    Note that neither 1 nor 2 traditionally apply with ELF Shared Objects
    (which usually both shared everything and allow for cyclic dependency
    graphs). But, traditionally ELF has other drawbacks, like needing to
    access variables and call functions via a GOT (which has higher overhead
    than direct calls, or accessing global variables as a fixed offset
    relative to a known base register, ...).

    Note that having the kernel inject DLLs into a running process wouldn't
    really mix well with the way glibc approaches shared objects (where, it
    manages this stuff in userland, rather than having this left up to the
    kernel's program loader).

    May not matter as much though, as if providing a COM-like interface, one
    doesn't necessarily actually need dlopen/dlsym to be able to see the
    symbols in the library that the interface came from.


    Where, in this case, COM-like interfaces may be used in ways that
    deviate from usual dependency ordering, and are more flexible. They are
    awkward to use directly, so it may make sense to provide C API wrappers
    (thus far, usually statically linked, but they can fetch the interfaces
    they need from the main C library or the OS).

    Where, in my case, the OS interface is a mix of conventional syscalls
    and object-method-calls routed over the syscall interface (the target
    being either in the kernel or in another process; or the OS might load a
    DLL into the client process and return a process-local vtable).

    If non-local, generally the method pointers are generic, and serve to
    forward the call over the syscall mechanism (the syscall interface being
    used in a somewhat different way from how it would be used in something
    like Linux; where Linux generally just does not do things this way...).


    Can note that in trying to get glibc ELF binaries to work on my stuff,
    effectively there is a separate syscall interface that mimics the Linux
    syscall interface.

    But, likely, these would represent a different "ecosystem" in terms of
    the binaries (besides just the ELF / PE differences).



  • From Rosario19@Ros@invalid.invalid to comp.lang.c on Thu Dec 26 13:16:57 2024
    From Newsgroup: comp.lang.c

    On Mon, 16 Dec 2024 21:22:31 -0000 (UTC), Lawrence D'Oliveiro <>
    wrote:

    On Sun, 15 Dec 2024 20:08:53 +0100, Bonita Montero wrote:

    C++ is more readable because it is magnitudes more expressive than C.

    My position is that everything should be built from the instructions
    that are easiest for both the CPU and the human: if-goto.

    Some say that if-call is better, but to me it is not very readable;
    it is worse than if-goto.

    And it is certainly more surprising than C. Often unpleasantly so.

    You can easily write a C++-statement that would take hundreds of lines in C

    Yes, but *which* hundreds of lines, exactly, would be the correct C
    equivalent?

  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Sat Dec 28 09:20:23 2024
    From Newsgroup: comp.lang.c

    BGB <cr88192@gmail.com> writes:

    On 12/23/2024 1:43 AM, David Brown wrote:

    On 23/12/2024 03:41, Waldek Hebisch wrote:

    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    The comments I made here, in two responses to postings of yours,
    were not statements of opinion but statements of fact.

    They are opinions _about facts_, or if you prefer, opinion
    about truth value of some statements.

    You can program in C without the "normal" conditional statements or
    expressions. You can make an array of two (or more) function
    pointers and select between them using your controlling expression,
    and that should be sufficient for conditionals. (There may be other
    methods too.)

    So as far as I can see, Tim gave statements of fact, not opinion.

    Jumping back in:
    That one can do this seems obvious enough;
    Downside, as I see it, is that there is no current or likely
    processor hardware where this is likely to be performance
    competitive with the more traditional if-goto mechanism [...]

    Irrelevant to the issue being discussed.
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Sat Dec 28 09:24:16 2024
    From Newsgroup: comp.lang.c

    BGB <cr88192@gmail.com> writes:

    On 12/23/2024 3:18 PM, Tim Rentsch wrote:

    Michael S <already5chosen@yahoo.com> writes:

    On Mon, 23 Dec 2024 09:46:46 +0100
    David Brown <david.brown@hesbynett.no> wrote:

    And Tim did not rule out using the standard library,

    Are you sure?

    I explicitly called out setjmp and longjmp as being excluded.
    Based on that, it's reasonable to infer the rest of the
    standard library is allowed.

    Furthermore I don't think it matters. Except for a very small
    set of functions -- eg, fopen, fgetc, fputc, malloc, free --
    everything else in the standard library either isn't important
    for Turing Completeness or can be synthesized from the base
    set. The functionality of fprintf(), for example, can be
    implemented on top of fputc and non-library language features.

    If I were to choose a set of primitive functions, probably:
    malloc/free and/or realloc
    could define, say:
    malloc(sz) => realloc(NULL, sz)
    free(ptr) => realloc(ptr, 0)
    Maybe _msize and _mtag/..., but this is non-standard.
    With _msize, can implement realloc on top of malloc/free.

    For basic IO:
    fopen, fclose, fseek, fread, fwrite

    printf could be implemented on top of vsnprintf and fputs
    fputs can be implemented on top of fwrite (via strlen).
    With a temporary buffer being used for the printed string.

    Most of these aren't needed. I think everything can be
    done using only fopen, fclose, fgetc, fputc, and feof.
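
    To make the "fputc is enough" point concrete, here is a small sketch
    of string and decimal-integer output built only on fputc, roughly the
    core of what "%s" and "%d" need. The names put_str and put_int are
    made up for this example.

```c
#include <stdio.h>

/* Write a string one character at a time; returns the count or -1. */
int put_str(FILE *fp, const char *s)
{
    int n = 0;
    for (; *s != '\0'; s++, n++)
        if (fputc((unsigned char)*s, fp) == EOF)
            return -1;
    return n;
}

/* Write a signed decimal integer; returns the count or -1. */
int put_int(FILE *fp, long v)
{
    char digits[32];
    int i = 0, n = 0;
    /* Negate in unsigned arithmetic so that LONG_MIN is handled too. */
    unsigned long u = v < 0 ? 0UL - (unsigned long)v : (unsigned long)v;

    if (v < 0) {
        if (fputc('-', fp) == EOF)
            return -1;
        n++;
    }
    do {                                /* collect digits, least significant first */
        digits[i++] = (char)('0' + u % 10);
        u /= 10;
    } while (u != 0);
    while (i > 0) {                     /* emit them in reverse order */
        if (fputc(digits[--i], fp) == EOF)
            return -1;
        n++;
    }
    return n;
}
```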
  • From BGB@cr88192@gmail.com to comp.lang.c on Sat Dec 28 13:59:24 2024
    From Newsgroup: comp.lang.c

    On 12/28/2024 11:24 AM, Tim Rentsch wrote:
    BGB <cr88192@gmail.com> writes:

    On 12/23/2024 3:18 PM, Tim Rentsch wrote:

    Michael S <already5chosen@yahoo.com> writes:

    On Mon, 23 Dec 2024 09:46:46 +0100
    David Brown <david.brown@hesbynett.no> wrote:

    And Tim did not rule out using the standard library,

    Are you sure?

    I explicitly called out setjmp and longjmp as being excluded.
    Based on that, it's reasonable to infer the rest of the
    standard library is allowed.

    Furthermore I don't think it matters. Except for a very small
    set of functions -- eg, fopen, fgetc, fputc, malloc, free --
    everything else in the standard library either isn't important
    for Turing Completeness or can be synthesized from the base
    set. The functionality of fprintf(), for example, can be
    implemented on top of fputc and non-library language features.

    If I were to choose a set of primitive functions, probably:
    malloc/free and/or realloc
    could define, say:
    malloc(sz) => realloc(NULL, sz)
    free(ptr) => realloc(ptr, 0)
    Maybe _msize and _mtag/..., but this is non-standard.
    With _msize, can implement realloc on top of malloc/free.

    For basic IO:
    fopen, fclose, fseek, fread, fwrite

    printf could be implemented on top of vsnprintf and fputs
    fputs can be implemented on top of fwrite (via strlen).
    With a temporary buffer being used for the printed string.

    Most of these aren't needed. I think everything can be
    done using only fopen, fclose, fgetc, fputc, and feof.


    If you only have fgetc and fputc, IO speeds are going to be unacceptably
    slow for non-trivial file sizes.

    If you try to fake fseek by closing, re-opening, and an fgetc loop,
    well, also going to be very slow.
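
    To make the cost concrete, here is a rough sketch of such a faked
    seek for a read-only file. The helper name slow_seek is made up, and
    the caller is assumed to still know the path.

```c
#include <stdio.h>

/* Reposition a read-only stream using nothing but fclose, fopen and
   fgetc.  Returns the reopened stream positioned at pos, or NULL if
   the file cannot be reopened or is shorter than pos.  Every call
   costs O(pos) single-character reads. */
FILE *slow_seek(FILE *fp, const char *path, long pos)
{
    long i;

    if (fp != NULL)
        fclose(fp);
    fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;
    for (i = 0; i < pos; i++) {
        if (fgetc(fp) == EOF) {      /* ran off the end of the file */
            fclose(fp);
            return NULL;
        }
    }
    return fp;
}
```

    Seeking backwards by one byte near the end of a large file re-reads
    nearly the whole file, so a loop of such seeks is quadratic in the
    file size.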


    Then again, fgetc/fputc as the primary operations could make sense for
    text files if the implementation is doing some form of format conversion
    (such as converting between LF only and CR+LF), though admittedly IMO
    one is better off treating text files as equivalent to binary files (and
    letting the application deal with any conversions here).


    OTOH:
    fgetc and fputc can be implemented via fread and fwrite;
    feof (for normal files) can be implemented via fseek (*1);
    Similar, ftell could be treated as a special case of fseek.

    *1: Say, if the internal fseek call were made to return the current file
    position (similar to lseek).
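
    A rough sketch of those three reductions. Standard fseek only returns
    a status, so the position-returning call is modelled here by a
    hypothetical wrapper, seek_pos, which itself leans on ftell purely for
    the demonstration. All names are assumptions, and at_eof only
    approximates feof (it compares positions rather than reporting that a
    previous read hit the end).

```c
#include <stdio.h>

/* fgetc in terms of fread: read one byte, report EOF on a short read. */
int my_fgetc(FILE *fp)
{
    unsigned char c;
    return fread(&c, 1, 1, fp) == 1 ? (int)c : EOF;
}

/* fputc in terms of fwrite. */
int my_fputc(int ch, FILE *fp)
{
    unsigned char c = (unsigned char)ch;
    return fwrite(&c, 1, 1, fp) == 1 ? (int)c : EOF;
}

/* Hypothetical lseek-like call: seek, then return the new position
   (or -1L on failure). */
long seek_pos(FILE *fp, long off, int whence)
{
    if (fseek(fp, off, whence) != 0)
        return -1L;
    return ftell(fp);
}

/* ftell as a special case of the seek call. */
long my_ftell(FILE *fp)
{
    return seek_pos(fp, 0L, SEEK_CUR);
}

/* End-of-file test for regular binary files: compare the current
   position with the end position, then restore the position. */
int at_eof(FILE *fp)
{
    long cur = seek_pos(fp, 0L, SEEK_CUR);
    long end = seek_pos(fp, 0L, SEEK_END);

    if (cur < 0 || end < 0)
        return 0;
    seek_pos(fp, cur, SEEK_SET);
    return cur >= end;
}
```

    This assumes the stream supports SEEK_END, which the C standard does
    not require of binary streams, though common hosted implementations
    provide it for regular files.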

    ...





    Well, in another area, I was also recently left facing off with the wonk
    of UTF-8 normalization for the VFS layer in my project (for
    paths/filenames).
    Options:
      Do nothing: assume valid UTF-8 and that it is sensibly normalized;
        May risk malformed encodings at deeper levels of the VFS though.
      Encoding-only normalization:
        Normalize to an M-UTF-8 variant and call it done.
      Do a subset of normalizing combining characters:
        The full set of Unicode rules would likely be too bulky;
        Filesystem should have no concept of locale;
        The rules should ideally be "semi frozen" once defined.

    At present, this is applied at the level of VFS syscalls (like "open()"
    or "opendir()").


    Current thinking is that it will normalize to a variant of M-UTF-8 NFC
    (characters are stored in composed forms), but will only apply the rules
    covering the Latin-1 and Latin Extended A spaces, and a subset of Latin
    Extended B.

    Though, a case could be made for limiting the scope solely to the
    Latin-1/1252 range (and passing everything beyond this along as-is).

    Less sure: I had also added cases for the Roman numeral characters,
    mostly for decomposing them into ASCII; various ligatures would also be
    decomposed to ASCII (excluding those which appear as their own glyph, so
    AE and OE are left as-is, but IJ/DZ/... would be decomposed). A case
    could also be made for leaving these alone (passing them along
    unmodified). This depends mostly on the open question of whether or not
    these convey relevant semantic information (or are merely
    historical/aesthetic).

    At present, the rules are stored as a table, with roughly 8 bytes needed
    per combiner rule (this increases to 12 once initialized, mostly because
    it allocates a pair of 16-bit hash chains).
    Namely: SrcCodepoint1, SrcCodepoint2, DstCodepoint, Flags
      Flags specify when and how the rule is applied.
      SrcCodepoint2 is currently 0x0000 for simple conversion rules.
      DstCodepoint is used for lookup for decompose.
      ...
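
    As an illustration only, an 8-byte rule with those four fields could
    look like the sketch below. The flag bits, the sample entries and the
    function names are invented for the example, and a linear scan stands
    in for the hash chains.

```c
#include <stddef.h>
#include <stdint.h>

/* One combiner rule, 8 bytes: four 16-bit fields as listed above. */
typedef struct {
    uint16_t src1;   /* base codepoint */
    uint16_t src2;   /* combining codepoint, 0x0000 for a simple conversion */
    uint16_t dst;    /* composed (or converted) codepoint */
    uint16_t flags;  /* when and how the rule applies */
} norm_rule;

/* Hypothetical flag bits, invented for this sketch. */
#define RULE_COMPOSE   0x0001u  /* src1 + src2 -> dst */
#define RULE_DECOMPOSE 0x0002u  /* dst -> src1 (+ src2) */

/* A few sample entries using real Unicode codepoints. */
static const norm_rule sample_rules[] = {
    { 0x0041, 0x0300, 0x00C0, RULE_COMPOSE },   /* A + grave -> U+00C0 */
    { 0x0041, 0x0301, 0x00C1, RULE_COMPOSE },   /* A + acute -> U+00C1 */
    { 0x0065, 0x0301, 0x00E9, RULE_COMPOSE },   /* e + acute -> U+00E9 */
    { 0x0049, 0x004A, 0x0132, RULE_DECOMPOSE }  /* U+0132 -> "IJ"      */
};

#define RULE_COUNT (sizeof sample_rules / sizeof sample_rules[0])

/* Compose lookup: return dst for the pair, or 0 if no rule matches. */
uint16_t rule_compose(uint16_t a, uint16_t b)
{
    size_t i;

    for (i = 0; i < RULE_COUNT; i++) {
        const norm_rule *r = &sample_rules[i];
        if ((r->flags & RULE_COMPOSE) && r->src1 == a && r->src2 == b)
            return r->dst;
    }
    return 0;
}

/* Decompose lookup keyed on dst: writes up to two codepoints to out
   and returns how many were written (0 if no rule matches). */
int rule_decompose(uint16_t c, uint16_t out[2])
{
    size_t i;

    for (i = 0; i < RULE_COUNT; i++) {
        const norm_rule *r = &sample_rules[i];
        if ((r->flags & RULE_DECOMPOSE) && r->dst == c) {
            out[0] = r->src1;
            out[1] = r->src2;
            return r->src2 != 0 ? 2 : 1;
        }
    }
    return 0;
}
```

    With 16-bit fields the table cannot describe codepoints above U+FFFF,
    which fits the stated scope of Latin-1 through Latin Extended B.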

    Limiting the scope also likely makes things more repeatable (inconsistent
    normalization could result in file lookup issues in cases where rules
    differ, if stepping on the offending code-points). The goal is mostly to
    find an acceptable set of rules that can be "mostly frozen". Though, in
    most cases this is likely N/A, as the majority of filenames tend to be
    plain ASCII.

    The responsibility for any more advanced normalization (or
    locale-dependent stuff) would be left to the application level.


    Can't seem to find much information about "best practices" in these areas.

    It is not certain that normalizing for combining characters is actually
    a good idea, vs only normalizing the codepoint encoding. The latter is
    mostly to deal with cases where malformed data is submitted to the VFS,
    or possibly 1252 (if the VFS calls and similar are given something that
    is invalid UTF-8, then it may be assumed to be 1252). Theoretically, the
    locale code in the C library is expected to normalize for 1252 vs UTF-8
    though (but, ideally, the integrity of the VFS should be kept protected
    from this sort of thing).
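
    A rough sketch of the "invalid UTF-8 is assumed to be 1252" fallback.
    The function names are made up, and the check is for standard UTF-8,
    so it rejects the C0 80 form that an M-UTF-8 variant would accept; a
    real VFS layer would adjust that.

```c
#include <stddef.h>

/* Return 1 if s[0..n) is well-formed standard UTF-8 (no overlong forms,
   no surrogates, nothing above U+10FFFF), else 0. */
int utf8_valid(const unsigned char *s, size_t n)
{
    size_t i = 0;

    while (i < n) {
        unsigned c = s[i];
        unsigned long cp, lo;
        size_t k, need;

        if (c < 0x80) { i++; continue; }
        else if ((c & 0xE0) == 0xC0) { need = 1; cp = c & 0x1F; lo = 0x80; }
        else if ((c & 0xF0) == 0xE0) { need = 2; cp = c & 0x0F; lo = 0x800; }
        else if ((c & 0xF8) == 0xF0) { need = 3; cp = c & 0x07; lo = 0x10000UL; }
        else return 0;                    /* stray continuation or 0xF8+ */

        if (n - i <= need) return 0;      /* truncated sequence */
        for (k = 1; k <= need; k++) {
            if ((s[i + k] & 0xC0) != 0x80) return 0;
            cp = (cp << 6) | (s[i + k] & 0x3Fu);
        }
        if (cp < lo || cp > 0x10FFFFUL) return 0;
        if (cp >= 0xD800 && cp <= 0xDFFF) return 0;
        i += need + 1;
    }
    return 1;
}

/* Codepoints for CP1252 bytes 0x80..0x9F; unassigned bytes map to the
   C1 control with the same value. */
static const unsigned short cp1252_hi[32] = {
    0x20AC, 0x0081, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021,
    0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x008D, 0x017D, 0x008F,
    0x0090, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014,
    0x02DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x009D, 0x017E, 0x0178
};

/* Convert n CP1252 bytes to UTF-8.  out must hold 3*n + 1 bytes.
   Returns the number of bytes written, excluding the terminator. */
size_t cp1252_to_utf8(const unsigned char *s, size_t n, unsigned char *out)
{
    size_t i, o = 0;

    for (i = 0; i < n; i++) {
        unsigned cp = s[i];

        if (cp >= 0x80 && cp <= 0x9F)
            cp = cp1252_hi[cp - 0x80];
        if (cp < 0x80) {
            out[o++] = (unsigned char)cp;
        } else if (cp < 0x800) {
            out[o++] = (unsigned char)(0xC0 | (cp >> 6));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (unsigned char)(0xE0 | (cp >> 12));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }
    out[o] = '\0';
    return o;
}
```

    A path handler would call utf8_valid first and only fall back to
    cp1252_to_utf8 when it fails. The guess can be wrong, since some 1252
    byte pairs happen to form valid UTF-8.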

    This also applies to console printing, which is also expected to be
    handed UTF-8, but may also normalize the strings. Though, there is some
    wonk with the console here in my case.


    Seemingly (from what I can gather):
      Linux:
        It is per FS driver;
        Some are "do nothing", others normalize.
      MacOS:
        Also depends on filesystem:
          HFS/HFS+, normalizing (as NFD for some reason);
          APFS, does nothing (apparently leads to a lot of hassles).
      Windows:
        FAT32: Depends solely on OS locale;
        NTFS: Locale rules are baked-in when the drive is formatted.
          The relevant tables are held in filesystem metadata.

    ...


  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Tue Dec 31 04:57:58 2024
    From Newsgroup: comp.lang.c

    BGB <cr88192@gmail.com> writes:

    On 12/28/2024 11:24 AM, Tim Rentsch wrote:

    BGB <cr88192@gmail.com> writes:

    On 12/23/2024 3:18 PM, Tim Rentsch wrote:

    Michael S <already5chosen@yahoo.com> writes:

    On Mon, 23 Dec 2024 09:46:46 +0100
    David Brown <david.brown@hesbynett.no> wrote:

    And Tim did not rule out using the standard library,

    Are you sure?

    I explicitly called out setjmp and longjmp as being excluded.
    Based on that, it's reasonable to infer the rest of the
    standard library is allowed.

    Furthermore I don't think it matters. Except for a very small
    set of functions -- eg, fopen, fgetc, fputc, malloc, free --
    everything else in the standard library either isn't important
    for Turing Completeness or can be synthesized from the base
    set. The functionality of fprintf(), for example, can be
    implemented on top of fputc and non-library language features.

    If I were to choose a set of primitive functions, probably:
    malloc/free and/or realloc
    could define, say:
    malloc(sz) => realloc(NULL, sz)
    free(ptr) => realloc(ptr, 0)
    Maybe _msize and _mtag/..., but this is non-standard.
    With _msize, can implement realloc on top of malloc/free.

    For basic IO:
    fopen, fclose, fseek, fread, fwrite

    printf could be implemented on top of vsnprintf and fputs
    fputs can be implemented on top of fwrite (via strlen).
    With a temporary buffer being used for the printed string.

    Most of these aren't needed. I think everything can be
    done using only fopen, fclose, fgetc, fputc, and feof.

    If you only have fgetc and fputc, IO speeds are going to be
    unacceptably slow for non-trivial file sizes.

    Once again, any performance concerns are not relevant to the
    matter under discussion.
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Sat Jan 4 11:18:07 2025
    From Newsgroup: comp.lang.c

    antispam@fricas.org (Waldek Hebisch) writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    antispam@fricas.org (Waldek Hebisch) writes:

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:

    On 21.12.2024 02:28, Tim Rentsch wrote:

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 16.12.2024 00:53, BGB wrote:

    [...]

    Pretty much all higher level control flow can be expressed via goto.
    A 'goto' may be used but it isn't strictly *necessary*. What *is*
    necessary, though, that is an 'if' (some conditional branch), and
    either 'goto' or recursive functions.

    Conditional branches, including 'if', '?:', etc., are not strictly
    necessary either.

    No? - Can you give an example of your statement?

    Look at example that I posted (apparently neither you nor Tim
    looked at my posts [...]

    What makes you think I didn't?

    I made the same claim as you earlier and gave examples. You
    did not acknowledge my posts. Why? For me most natural
    explanation is that you did not read them.

    You should revise your inference heuristics. There are any
    number of reasons why I might not have referred to your
    comments. Furthermore your conclusion is incorrect.
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Sat Jan 4 12:12:15 2025
    From Newsgroup: comp.lang.c

    antispam@fricas.org (Waldek Hebisch) writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    The comments I made here, in two responses to postings of yours,
    were not statements of opinion but statements of fact.

    They are opinions _about facts_, or if you prefer, opinion
    about truth value of some statements.

    They are
    no more statements of opinion than a statement about whether the
    Riemann Hypothesis is true is a statement of opinion. Someone
    might wonder whether an assertion "The Riemann Hypothesis is
    true" is true or false, but it is still a matter of fact, not a
    matter of opinion.

    It is reasonable to assume that you do not know if Riemann Hypothesis
    is true or false. So if you say "Riemann Hypothesis is true",
    this is just your opinion. I am not a native English speaker
    but I believed that "statements of opinion" means just that:
    person does not know the truth, but makes a statement.

    A statement of opinion is a statement concerning a subjective
    question, such as "Do cats make better pets than dogs?" A
    statement of opinion isn't ever right or wrong or true or false,
    it merely expresses an individual point of view. Most statements
    that have a word like "should" or "good" or "bad" or "better",
    etc., are statements of opinion. That can change if the
    qualifying words are given precise and objective definitions, but
    in most cases they have not been.

    A statement of fact is a statement concerning an objective question,
    such as "Is every even number greater than 4 the sum of two prime
    numbers?". A statement of fact can be right or wrong or true or
    false, even if it isn't known at the present time which of those is
    the case. The statement "Four colors suffice to color any planar
    map such that adjacent regions do not have the same color" is a
    statement of fact, both now and 60 years ago before the statement
    had been proven. Both P==NP and P!=NP are statements of fact, even
    though one of them must certainly be false; the key property is
    that they are objective statements, subject to falsification. If I
    say "The Earth is flat", that is a statement of fact, even though
    the statement is false.

    In any case, my statements about a particular subset of C being
    Turing Complete were statements of fact, and also true statements.
  • From Chris M. Thomasson@chris.m.thomasson.1@gmail.com to comp.lang.c on Sat Jan 4 12:53:01 2025
    From Newsgroup: comp.lang.c

    On 1/4/2025 12:12 PM, Tim Rentsch wrote:
    antispam@fricas.org (Waldek Hebisch) writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> wrote:

    The comments I made here, in two responses to postings of yours,
    were not statements of opinion but statements of fact.

    They are opinions _about facts_, or if you prefer, opinion
    about truth value of some statements.

    They are
    no more statements of opinion than a statement about whether the
    Riemann Hypothesis is true is a statement of opinion. Someone
    might wonder whether an assertion "The Riemann Hypothesis is
    true" is true or false, but it is still a matter of fact, not a
    matter of opinion.

    It is reasonable to assume that you do not know if Riemann Hypothesis
    is true or false. So if you say "Riemann Hypothesis is true",
    this is just your opinion. I am not a native English speaker
    but I believed that "statements of opinion" means just that:
    person does not know the truth, but makes a statement.

    A statement of opinion is a statement concerning a subjective
    question, such as "Do cats make better pets than dogs?"

    sometimes, why do cats seem to own their owners?

    ;^)

    [...]
  • From Ben Bacarisse@ben@bsb.me.uk to comp.lang.c on Sun Jan 5 11:18:03 2025
    From Newsgroup: comp.lang.c

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    A statement of fact is a statement concerning an objective question,
    such as "Is every even number greater than 4 the sum of two prime
    numbers?". A statement of fact can be right or wrong or true or
    false, even if it isn't known at the present time which of those is
    the case. The statement "Four colors suffice to color any planar
    map such that adjacent regions do not have the same color" is a
    statement of fact, both now and 60 years ago before the statement
    had been proven. Both P==NP and P!=NP are statements of fact, even
    though one of them must certainly be false; the key property is
    that they are objective statements, subject to falsification. If I
    say "The Earth is flat", that is a statement of fact, even though
    the statement is false.

    I think you go too far. The word "fact" is not neutral as far as its
    truth is concerned, and writing "a statement of fact" does not
    significantly change that. Most dictionaries define a fact as something
    that is true (or at least supported by currently available evidence).
    One online essay[1] concludes that

    "A statement of fact is one that has objective content and is
    well-supported by the available evidence."

    [1] https://philosophersmag.com/the-fact-opinion-distinction/
    --
    Ben.
  • From James Kuyper@jameskuyper@alumni.caltech.edu to comp.lang.c on Sun Jan 5 12:04:41 2025
    From Newsgroup: comp.lang.c

    On 1/5/25 06:18, Ben Bacarisse wrote:
    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    A statement of fact is a statement concerning an objective question,
    such as "Is every even number greater than 4 the sum of two prime
    numbers?". A statement of fact can be right or wrong or true or
    false, even if it isn't known at the present time which of those is
    the case. The statement "Four colors suffice to color any planar
    map such that adjacent regions do not have the same color" is a
    statement of fact, both now and 60 years ago before the statement
    had been proven. Both P==NP and P!=NP are statements of fact, even
    though one of them must certainly be false; the key property is
    that they are objective statements, subject to falsification. If I
    say "The Earth is flat", that is a statement of fact, even though
    the statement is false.

    I think you go too far. The word "fact" is not neutral as far as its
    truth is concerned, and writing "a statement of fact" does not
    significantly change that. Most dictionaries define a fact as something
    that is true (or at least supported by currently available evidence).
    One online essay[1] concludes that

    "A statement of fact is one that has objective content and is
    well-supported by the available evidence."

    [1] https://philosophersmag.com/the-fact-opinion-distinction/

    In US constitutional law, there is the concept of "False statements of
    fact". The distinction is important in that context because they have
    less protection under the First Amendment than true statements of fact.
    They still have some protection, but not if they are defamatory, false
    advertising, or commercial speech.

  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Tue Jan 7 21:38:38 2025
    From Newsgroup: comp.lang.c

    Ben Bacarisse <ben@bsb.me.uk> writes:

    Tim Rentsch <tr.17687@z991.linuxsc.com> writes:

    A statement of fact is a statement concerning an objective question,
    such as "Is every even number greater than 4 the sum of two prime
    numbers?". A statement of fact can be right or wrong or true or
    false, even if it isn't known at the present time which of those is
    the case. The statement "Four colors suffice to color any planar
    map such that adjacent regions do not have the same color" is a
    statement of fact, both now and 60 years ago before the statement
    had been proven. Both P==NP and P!=NP are statements of fact, even
    though one of them must certainly be false; the key property is
    that they are objective statements, subject to falsification. If I
    say "The Earth is flat", that is a statement of fact, even though
    the statement is false.

    I think you go too far. The word "fact" is not neutral as far as its
    truth is concerned, and writing "a statement of fact" does not
    significantly change that. Most dictionaries define a fact as something
    that is true (or at least supported by currently available evidence).
    One online essay[1] concludes that

    "A statement of fact is one that has objective content and is
    well-supported by the available evidence."

    [1] https://philosophersmag.com/the-fact-opinion-distinction/

    I will concede that the phrase "statement of fact" can be used in
    the sense you describe.

    I believe it is also true that "statement of fact" is used in the
    sense I describe, and that sense appears among the alternatives in
    various well-regarded dictionaries.

    In any case, my point was not to have a debate about the meaning of
    a phrase, but to clarify the intended meaning of my earlier remarks.
    I was making a statement about an objective question, one subject to
    independent verification or falsification. I was not offering a
    comment that was merely expressing a personal point of view.
  • From Tim Rentsch@tr.17687@z991.linuxsc.com to comp.lang.c on Mon Jan 13 08:10:31 2025
    From Newsgroup: comp.lang.c

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 21.12.2024 22:51, Tim Rentsch wrote:

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 21.12.2024 02:28, Tim Rentsch wrote:

    Janis Papanagnou <janis_papanagnou+ng@hotmail.com> writes:

    On 16.12.2024 00:53, BGB wrote:

    [...]

    Pretty much all higher level control flow can be expressed via
    goto.

    A 'goto' may be used but it isn't strictly *necessary*. What
    *is* necessary, though, that is an 'if' (some conditional
    branch), and either 'goto' or recursive functions.

    Conditional branches, including 'if', '?:', etc., are not
    strictly necessary either.

    No? - Can you give an example of your statement?

    (Unless you just wanted to say that in some HLL abstraction like
    'printf("Hello world!\n")' there's no [visible] conditional
    branch. Likewise in a 'ClearAccumulator' machine instruction, or
    the like.)

    The comparisons and predicates are one key function (not any
    specific branch construct, whether on HLL level, assembler
    level, or with the (elementary but most powerful) Turing
    Machine). Comparisons inherently result in predicates, which are
    what control program execution.

    So your statement asks for some explanation at least.

    Start with C - any of C90, C99, C11.

    Take away the short-circuiting operators - &&, ||, ?:.

    Take away all statement types that involve intra-function
    transfer of control: goto, break, continue, if, for, while,
    switch, do/while. Might as well take away statement labels too.

    Take away setjmp and longjmp.

    And also things like the above mentioned 'printf()' that most
    certainly implies an iteration over the format string checking for
    its '\0'-end.

    The *printf() functions can be implemented in standard C, under the
    above stated limitations, without needing iteration.
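
    One possible way to do that kind of thing (a sketch only, not
    necessarily the construction intended here) is to let a comparison
    result index a table of function pointers and to recurse instead of
    looping. The names put_str, put_more, put_done and put_table are
    made up. The code below walks a string to its '\0' without if, ?:,
    &&, ||, loops, goto or switch.

```c
#include <stdio.h>

typedef int (*put_fn)(const char *s, FILE *fp);

static int put_done(const char *s, FILE *fp);
static int put_more(const char *s, FILE *fp);

/* Index 0: at the terminating '\0'; index 1: a character remains. */
static const put_fn put_table[2] = { put_done, put_more };

/* Nothing left to write. */
static int put_done(const char *s, FILE *fp)
{
    (void)s;
    (void)fp;
    return 0;
}

/* Write one character, then dispatch on whether another follows. */
static int put_more(const char *s, FILE *fp)
{
    fputc(*s, fp);
    /* The comparison yields 0 or 1, which selects the next function. */
    return 1 + put_table[s[1] != '\0'](s + 1, fp);
}

/* Write s to fp and return the number of characters written. */
int put_str(const char *s, FILE *fp)
{
    return put_table[*s != '\0'](s, fp);
}
```

    Recursion depth grows with the string length, which is fine for an
    argument about computability but not something to use in practice.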

    And so on, and so on. - What will be left as "language".

    I think most C developers would be able to answer that question
    given the above stated description. Is there some part that isn't
    clear to you?

    Would you be able to formulate functionality of the class of
    Recursive Functions (languages class of a Turing Machine with
    Chomsky-0 grammar).

    General rewrite grammars, which is another name IIRC for Chomsky
    Type 0 languages, are computationally equivalent to Turing Machines
    (which incidentally takes me back almost five decades to my formal
    computability education). The answer is yes.


    Rule out programs with undefined behavior.

    The language that is left is still Turing complete.

    Is it?

    Yes, it is.

    But wouldn't that be just the argument I mentioned above; that a,
    say, 'ClearAccumulator' machine statement wouldn't contain any
    jump?

    No, afaict the two questions have nothing to do with each other.


    Proof: exercise for the reader.

    (Typical sort of your reply.)

    I expect you will see better results if you put more effort into
    listening and thinking, and less effort into making ad hominem
    remarks.