> impressive library indeed. i took a look at your GIT tree.
> so you implemented something resembling the functionality of SQL
> SELECT statement GROUP BY ?
No such thing appears in the example you replied to, so funny you should
mention it; but in fact I have a group_by function in the <array.h>
header, which is still undocumented.
https://www.kylheku.com/cgit/cppawk/tree/cppawk-include/array.h
I don't know SQL, but this is like the group-by function you
find in some dynamic programming languages.
Here is a quick demo. First, a background warmup. Let's write
an unconditional action which builds a list of cons cell pairs
made from fields $1 and $2, pushing them onto the lst variable:
./cppawk '
#include <cons.h>
#include <array.h>
{ push(cons($1, $2), lst) }
END { print sexp(lst) }'
a 1
a 2
a 3
b 1
a 4
c 2
c 3
[Ctrl-D][Enter]
(("c" . 3) ("c" . 2) ("a" . 4) ("b" . 1) ("a" . 3) ("a" . 2) ("a" . 1))
OK, now let's introduce group-by:
./cppawk '
#include <cons.h>
#include <array.h>
#include <fun.h>
{ push(cons($1, $2), lst) }
END { group_by(fun(car), lst, arr);
for (i in arr) print i, sexp(arr[i]) }'
a 1
a 2
a 3
b 1
a 4
c 2
c 3
[Ctrl-D][Enter]
a (("a" . 4) ("a" . 3) ("a" . 2) ("a" . 1))
b (("b" . 1))
c (("c" . 3) ("c" . 2))
group_by has populated the array arr with keys a, b, c,
each one tied to a list of those cons pair items which
have that key.
group_by(fun(car), lst, arr) means: for each item x in lst,
apply the car function to extract the key k as if by k = car(x).
Then collect the item x into a list that is specific to k.
Each such collected list then appears as arr[k] in the array.
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
I guess i come from a completely different angle in terms of adding features to awk - i made mine to be
- all still at scripting level,
- making close to zero external calls (other than a benchmarking utility - only mawk2 gives me sub-second timestamps, for the rest i need
to go to gnu-date),
- the same code base able to run on at least 4 variants of awk
that i have (so it can't leverage any of the extra goodies from gawk,
and i have to devise equivalent ones),
> - same code base being able to all run from at least 4 variants of awk
> that i have (so it can't leverage any of the extra goodies from gawk,
> and i have to devise equivalent ones),
I not only have that, but in cppawk you can test, using #if, which Awk
you're running on at preprocessing time:
#if __gawk__
...
#else
...
#endif
There are command line options to tell cppawk which Awk to generate
code for and execute.
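For contrast with the preprocessing-time #if __gawk__ test, here is a rough sketch of what a run-time check inside a plain awk script might look like. It assumes that gawk in its default mode sets PROCINFO["version"] and that other awks treat PROCINFO as an ordinary, empty array. The function name detect_awk_at_runtime is made up for illustration, and this is not something cppawk generates.

```shell
# Hypothetical helper: report the awk flavor from inside the running script.
# The optional first argument names the awk binary to run (default: awk).
detect_awk_at_runtime() {
  "${1:-awk}" 'BEGIN {
    # Assumption: only gawk populates PROCINFO["version"].
    if ("version" in PROCINFO) print "gawk"
    else print "not gawk"
  }'
}

detect_awk_at_runtime
```

A check like this runs every time the script starts, whereas the #if form is decided once, when the code is preprocessed.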
i actually meant it as being able to tell whether gawk was invoked
with -c flag or -n flag or -P flag or -M flag , multiply all that by
unicode-ness - without relying on looking at the invocation call, or
peek at "ps" output.
On 2022-05-29, Kpop 2GM <jason....@gmail.com> wrote:
>> > - same code base being able to all run from at least 4 variants of awk
>> > that i have (so it can't leverage any of the extra goodies from gawk,
>> > and i have to devise equivalent ones),
>> I not only have that, but in cppawk you can test, using #if, which Awk
>> you're running on at preprocessing time:
>> #if __gawk__
>> ...
>> #else
>> ...
>> #endif
>> There are command line options to tell cppawk which Awk to generate
>> code for and execute.
> i actually meant it as being able to tell whether gawk was invoked
> with -c flag or -n flag or -P flag or -M flag , multiply all that by
I could add support for that in cppawk. It parses all the options in
order to support a few cpp options, and a couple of its own. The
rest are passed to awk. I could have it recognize -c being passed
to gawk, to set some preprocessor symbol.
> unicode-ness - without relying on looking at the invocation call, or
> peek at "ps" output.
The advantage of having a preprocessing layer is that it may be
acceptable in many situations that the output of preprocessing just
*assumes* it is running on a certain brand of awk, of a certain
version, invoked in a certain way. Then you don't have any run-time
detection and switching overheads in the code.
I suspect that in many cases, the user of a portable Awk library
is actually just using one specific awk, and doesn't care whether
the preprocessed output works with other awks.
Or else if they do care about their code running on other awks also,
many users may be accepting of the limitations of doing it statically:
being able to generate efficient code that is tuned to a particular awk,
or else inefficient code that works with more awks, rather than one
body of code which switches at run-time.
--
TXR Programming Language: http://nongnu.org/txr
Cygnal: Cygwin Native Application Library: http://kylheku.com/cygnal
there's only one single user of that "portable library" - me