• Re: obfuscated AWK code challenge

From Kpop 2GM@jason.cy.kwan@gmail.com to comp.lang.awk on Sat Jan 15 04:16:18 2022
From Newsgroup: comp.lang.awk

the openssl part is only there to prove, in real time, that whatever was generated is indeed a 3312-bit prime number. The "bc" part is just to print that prime out in hex, that's all. if u want to u can ignore that part. i could add a bigint2hex function here but it would bloat it up.
when i posted this one, 3312-bit was the largest i found, with this output :
log-base-2 :: 3312.114313696145
# digits in decimal :: 998
11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111110111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111 is prime
that's not a bit-string - that's a base-10 decimal number that happens to only have 1's and 0's. This code actually runs pretty fast - no recursion needed, no loops in the main body (other than setting up a lookup table at beginning)
it's Mcb cuz, knowing the very specific structure of only 1s and one single zero, all i have to encode is the total length and the position of that zero
779 is the position, 998 is the length
779998
by ASCII ordinal numbers, 77, 99, 98 map to M, c, and b, which is an even more efficient encoding than trying to encode that number in hex, which is BE6DE. Print that hex out as bytes x0B xE6 xDE and it still needs 3 bytes, but won't be particularly human-readable.
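A minimal sketch of the decoding described above, written in Python rather than awk so the arithmetic is easy to follow. The function names are made up for illustration, and the split into "position" and "length" assumes the length field is exactly 3 digits, as it is for 779998. It does not test primality; that is what the openssl call is for.

```python
import math

def decode_prime_recipe(letters):
    """Turn letters into (zero_position, total_length) via their ASCII codes."""
    digits = "".join(str(ord(ch)) for ch in letters)   # "Mcb" -> "779998"
    value = int(digits)
    return value // 1000, value % 1000                 # assumes a 3-digit length field

def build_number(zero_position, total_length):
    """All ones, except one zero at the given 1-based position from the left."""
    ones_before = "1" * (zero_position - 1)
    ones_after = "1" * (total_length - zero_position)
    return int(ones_before + "0" + ones_after)

position, length = decode_prime_recipe("Mcb")
n = build_number(position, length)
print(position, length)              # where the zero sits, and how many digits in total
print(len(str(n)), math.log2(n))     # decimal digit count and log-base-2
print(format(779998, "X"))           # the hex form mentioned above
```

The point of the sketch is only that three letters carry two small integers, and those two integers are enough to rebuild the whole 998-digit number.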
that's before. since the original posting, using the same technique, i could now encode a 13,789-bit prime using just 4 bytes (see below). What's the point ? i'm curious if any algorithm, anywhere, can achieve a better compression ratio than that. RLE is still much larger, as is LZW or LZMA.
u can say mine is a bit cheating, since it's not a generic algorithm - i see it as : decompression algorithms are just a form of lossless reconstruction, so why limit the possibilities, as long as the output is what's intended ? there are algorithms optimized for lossless compression of audio, like FLAC, so perhaps there should also be compression algorithms optimized for the key exchanges in cryptography .
Make the overhead low enough, and we could even move to a paradigm where every other message, or even every message, is a different one-time key.
function __________(__,_,___) {
(___="bc <<< \"obase="(_*=_^=\
_+=++_)"; "(__)" ;\"")|getline _;
close(___); return _ };
function _________(_,__,___,____,
_____,______,_______,________) {
_____="%c";
for(_-=(___-=(_=(_*=_*=_*=\
________=_+=++_)))^!_;\
-""<=_;_--) {__[\
sprintf(_____,___-+-_)]=\
(((______=sprintf(substr(_____,\
_^!___,!-"")(+"")(_______)"d",_)\
)!~/.../)?______\
: substr(______,(!-"")+(!-"")))
};for(_ in __) {
__[_] }
____=+(_______="")
_______=_=-(\
_^=_^=_+=_^=_="")
for(_=_____=length(____=sprintf(\
"%c%c%c%c",\
_______+(4-6)^6-4*6+4^!6,\
_______+46+6,\
_______+(4+6)^(-4+6),_______+46));_;_--) {
___=__[substr(____,_,!!_)]___;
}
if ((______=___%(_____=\
((!-"")(-""))^(_____*=\
(_^=_=-"")^-!-"")))<(_______=\
int(___/_____))) {___=______;
______=_______;_______=___}
___="%*.f";________=sprintf(\
(___"_")___,--_______,!_,\
--______+-+_______,!___)
gsub(/[^_]/,_^!_,________)
sub(/[_]/,+"",________)
sub(/..$/,"",_____)
_____=(_+=_^=_="")^_*_+_;
printf(" log-base-"(_)" :: %."(\
_^_^_)"g\n\n %c digits in "\
"decimal :: %d\n\n 0x %s\n\n",\
(log((".")(substr(________,_/_,\
_____*_*_____)))+log(_____)*(___=\
length(________)))/log(_),_____/-_+\
_^_*_____,___, __________(________))
system("openssl prime "\
" "(________)); return ________ };
log-base-2 :: 13789.47552497089
# digits in decimal :: 4152
11111111111111111111111111111111111111111111101111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111
11111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111111 is prime
On Monday, December 27, 2021 at 5:01:32 AM UTC-5, Janis Papanagnou wrote:
Subject: Re: obfuscated AWK code challenge

It's arguable whether it's appropriate to call code an "AWK code
challenge" that obviously relies on external commands like 'od',
'openssl', and various shell commands _within_ the awk code.

Janis
On 27.12.2021 03:53, RARE Kpop Manifesto wrote:
without actually running this code, can you figure out which prime number has been encoded within the input of ASCII letters

[ Mcb ]

plus the trick that allowed me to achieve such a compression ratio.

* The code works equally well in gawk, mawk-1, mawk2, and nawk.
It is ENTIRELY self-encapsulated.

Enjoy !

= === === === === === === === === === === === === ==

echo; cmd=' echo Mcb | mawk2 '\''function _________(__,_,___) { (___="bc <<< \"obase="(_*=_^=_+=++_)"; "(__)" ;\"")|getline _; close(___); return _ } BEGIN {_____="%c";for(_-=(___-=(_=(_*=_*=_*=_+=++_)))^!_;-""<=_;_--) {__[sprintf(_____,___-+-_)]=_}; for(_ in __) { __[_] } } {____=+""; for(_=_____=length($(____));_;_--) { ___=__[substr($(____),_,!!_)]___ }; ______=___%(_____=((!-"")(-""))^(_____*=(_^=_=-"")^-!-""));_______=int(___/_____);___="%*.f"; ________=sprintf((___"_")___,--_______,!_,--______+-+_______,!___); gsub(/[^_]/,_^!_,________); sub(/[_]/,+"",________); sub(/..$/,"",_____);_____+=_="";_+=_^=_; printf(" log-base-"(_)" :: %."(_^_^_)"g%c%c %c digits in decimal :: %d%c%c 0x %s%c%c",log(_)^(_/-_)*(log((".")(________))+log(_____)*(___=length(________))), _____, _____,_^_*_____-_^-!!_*_____,___,_____,_____, _________(________), _____,_____); system("openssl prime -checks "(_+=_*=(_^=_)*_)" "(________)) }'\'' '; gprintf '\n%s\n\n' "${cmd}" ; echo; eval "${cmd}"

• From Kpop 2GM@jason.cy.kwan@gmail.com to comp.lang.awk on Mon Jan 17 02:28:27 2022
From Newsgroup: comp.lang.awk

@Janis Papanagnou : if u want a pure awk code challenge, tell me what output comes out of this :

mawk _^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^-_^_^_^_^_^_^_^_^_^_^_^_^_^--_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_
• From Janis Papanagnou@janis_papanagnou@hotmail.com to comp.lang.awk on Mon Jan 17 11:54:25 2022
From Newsgroup: comp.lang.awk

On 17.01.2022 11:28, Kpop 2GM wrote:

@Janis Papanagnou : if u want a pure awk code challenge, tell me what output comes out of this :

mawk _^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^-_^_^_^_^_^_^_^_^_^_^_^_^_^--_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_

From a quick glance it seems only half of what is produced by either of

awk 'Doh! Uh - oh, oh no; no! (not again)'

awk 'Doh! Uh - oh, oh no; no! (not again) - Unclear?! then say: "bye"'

Janis

• From pk@pk@pk.invalid to comp.lang.awk on Mon Jan 17 12:03:45 2022
From Newsgroup: comp.lang.awk

On Mon, 17 Jan 2022 02:28:27 -0800 (PST), Kpop 2GM
<jason.cy.kwan@gmail.com> wrote:

@Janis Papanagnou : if u want a pure awk code challenge, tell me what
output comes out of this :

mawk _^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^-_^_^_^_^_^_^_^_^_^_^_^_^_^--_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_

Absolutely nothing if you don't supply an input, and even then I doubt it produces any output at all.

• From Janis Papanagnou@janis_papanagnou@hotmail.com to comp.lang.awk on Mon Jan 17 12:14:46 2022
From Newsgroup: comp.lang.awk

On 17.01.2022 12:03, pk wrote:
On Mon, 17 Jan 2022 02:28:27 -0800 (PST), Kpop 2GM
<jason.cy.kwan@gmail.com> wrote:

@Janis Papanagnou : if u want a pure awk code challenge, tell me what
output comes out of this :

mawk
_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^-_^_^_^_^_^_^_^_^_^_^_^_^_^--_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_

Absolutely nothing if you don't supply an input, and even then I doubt it produces any output at all.

Careful! - You have a variable named _ and it has a value in numeric
context of 0, then you have these cascades of power 0^0 which results
in 1, and some arithmetic in between (- and - -), that effectively
seems to result in 1, meaning an awk condition 'true', and that the
input is therefore just reproduced in the output. (Just an educated
guess, no analysis done.)

Janis

• From Kpop 2GM@jason.cy.kwan@gmail.com to comp.lang.awk on Thu Jan 20 13:12:35 2022
From Newsgroup: comp.lang.awk

On Monday, January 17, 2022 at 5:54:28 AM UTC-5, Janis Papanagnou wrote:
On 17.01.2022 11:28, Kpop 2GM wrote:

@Janis Papanagnou : if u want a pure awk code challenge, tell me what output comes out of this :

mawk _^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^-_^_^_^_^_^_^_^_^_^_^_^_^_^--_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_

From a quick glance it seems only half of what is produced by either of

awk 'Doh! Uh - oh, oh no; no! (not again)'

awk 'Doh! Uh - oh, oh no; no! (not again) - Unclear?! then say: "bye"'

Janis

it's actually really straightforward - that code simply prints everything except the first line.

here's another pure awk one for u : what does this perform :

mawk -F= 'BEGIN {_+=(_^=__=_+=++_+_)-_/_} $!!_=$!_=$(__+(_<=+$!_)*NF)'
• From Kpop 2GM@jason.cy.kwan@gmail.com to comp.lang.awk on Thu Jan 20 13:16:33 2022
From Newsgroup: comp.lang.awk

On Monday, January 17, 2022 at 6:14:48 AM UTC-5, Janis Papanagnou wrote:
On 17.01.2022 12:03, pk wrote:
On Mon, 17 Jan 2022 02:28:27 -0800 (PST), Kpop 2GM

@Janis Papanagnou : if u want a pure awk code challenge, tell me what
output comes out of this :

mawk
_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^-_^_^_^_^_^_^_^_^_^_^_^_^_^--_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_

Absolutely nothing if you don't supply an input, and even then I doubt it produces any output at all.
Careful! - You have a variable named _ and it has a value in numeric
context of 0, then you have these cascades of power 0^0 which results
in 1, and some arithmetic in between (- and - -), that effectively
seems to result in 1, meaning an awk condition 'true', and that the
input is therefore just reproduced in the output. (Just an educated
guess, no analysis done.)

Janis
those cascading 0^0, and one single negation and one single pre-decrement were absolutely intentional. that wasn't a typo.
but i also found the absolutely shortest syntax possible to utilize gawk's hex decoder and print out decimals, assuming the input is just rows of 0x….. :
gawk -n '($!_=+$_)~_'
or
gawk -nM '($!_=+$_)~_' (if you need higher than double precision)
• From Kpop 2GM@jason.cy.kwan@gmail.com to comp.lang.awk on Thu Jan 20 13:21:10 2022
From Newsgroup: comp.lang.awk

but i also found the absolutely shortest syntax possible to utilize gawk's hex decoder and print out decimals, assuming the input is just rows of 0x….. :

gawk -n '($!_=+$_)~_'
or
gawk -nM '($!_=+$_)~_' (if you need higher than double precision)
need to amend my statement - the same code also works for octals-to-decimals.
echo 025333523235356512534543125646531264523653261 | gawk -nM '($!_=+$_)~_'
1822980154315230596830091282486207141553
basically anything in the same standardized format accepted by strtonum( ) would work, without having to call that function, or even use the print statement, and without even having to type in any numbers at all in the code.
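For readers without gawk at hand, here is a rough Python stand-in for the prefix rules being relied on: a leading 0x means hex, a leading 0 means octal, anything else is decimal. The function name is hypothetical, and falling back to decimal when a leading-zero string contains 8 or 9 is an assumption of this sketch, not something taken from the gawk source.

```python
def parse_like_strtonum(text):
    """Parse an integer literal using hex / octal / decimal prefix rules."""
    s = text.strip()
    if s[:2].lower() == "0x":
        return int(s[2:], 16)                     # 0x... -> hexadecimal
    if len(s) > 1 and s[0] == "0" and all(c in "01234567" for c in s):
        return int(s, 8)                          # leading 0, octal digits only
    return int(s)                                 # plain decimal (assumed fallback)

for sample in ("0xCAFEBEEFFEED", "017", "42"):
    print(sample, parse_like_strtonum(sample))
```

Python integers are arbitrary precision, so this behaves like the -M (bignum) variant for long inputs.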
• From Kpop 2GM@jason.cy.kwan@gmail.com to comp.lang.awk on Thu Jan 20 13:33:21 2022
From Newsgroup: comp.lang.awk

the posix flag -P is similar to the -n (nondecimal) flag in the sense that it can interpret hex and octals, but realistically only up to 2^53, since the -P flag doesn't pair well with the bignum flag -M - it wouldn't print out an error message, but the -P flag gets nullified by the -M flag
the easiest way i found to detect whether an invocation of any variant of awk is in gawk -P mode is
("<"<"\x3c")
\x3C is the hex code for byte "<", so this boolean criteria fails for everyone else since one cannot be less than one-self, except gawk -P, which ignores the hex notation, and compares "<" (\074) against "x" (\170)
• From Janis Papanagnou@janis_papanagnou@hotmail.com to comp.lang.awk on Fri Jan 21 00:09:04 2022
From Newsgroup: comp.lang.awk

On 20.01.2022 22:12, Kpop 2GM wrote:
On Monday, January 17, 2022 at 5:54:28 AM UTC-5, Janis Papanagnou wrote:
On 17.01.2022 11:28, Kpop 2GM wrote:

@Janis Papanagnou : if u want a pure awk code challenge, tell me what output comes out of this :

mawk _^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^-_^_^_^_^_^_^_^_^_^_^_^_^_^--_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_

From a quick glance it seems only half of what is produced by either of

awk 'Doh! Uh - oh, oh no; no! (not again)'

awk 'Doh! Uh - oh, oh no; no! (not again) - Unclear?! then say: "bye"'

Janis

it's actually really straightforward - that code simply prints everything except the first line.

Not in my book, not in my system environment. And certainly not as
designed and intended.

Does it really do that in your environment?
Have you tried other awks than mawk?

And it's much less straightforward than your previously posted code.

variable _ used that gets a default value in an expression that has
just been copied many many times. And maybe the decrement operator can
be considered tricky because it requires an lvalue not a value, but
that's standard in C based programming languages.

In my code you find various concepts; lots of implicit forth and back conversions between strings and integers, grouping, arithmetic and
negations, string constants, a range operator, and last but not least
even a ternary operator. All grouped like a sentence.

Spoiler; it should effectively evaluate to awk '1;1' thus, as noted, duplicating every input line. (I wrote "[your code produces] half of
what is produced by [my code]", and I meant that literally since your
code produces the same as awk '1'.)

Janis

here's another pure awk one for u : what does this perform :

mawk -F= 'BEGIN {_+=(_^=__=_+=++_+_)-_/_} $!!_=$!_=$(__+(_<=+$!_)*NF)'

• From Janis Papanagnou@janis_papanagnou@hotmail.com to comp.lang.awk on Fri Jan 21 00:23:18 2022
From Newsgroup: comp.lang.awk

On 20.01.2022 22:16, Kpop 2GM wrote:
On Monday, January 17, 2022 at 6:14:48 AM UTC-5, Janis Papanagnou wrote:
On 17.01.2022 12:03, pk wrote:
On Mon, 17 Jan 2022 02:28:27 -0800 (PST), Kpop 2GM

@Janis Papanagnou : if u want a pure awk code challenge, tell me what output comes out of this :

mawk
_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^-_^_^_^_^_^_^_^_^_^_^_^_^_^--_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_

Absolutely nothing if you don't supply an input, and even then I doubt it produces any output at all.
Careful! - You have a variable named _ and it has a value in numeric
context of 0, then you have these cascades of power 0^0 which results
in 1, and some arithmetic in between (- and - -), that effectively
seems to result in 1, meaning an awk condition 'true', and that the
input is therefore just reproduced in the output. (Just an educated
guess, no analysis done.)

Janis

those cascading 0^0, and one single negation and one single pre-decrement were absolutely intentional. that wasn't a typo.

I didn't say or mean it was a typo. That's just "some arithmetic",
as also said in my other recent post. You used arithmetic, ^, -, --,
and a variable _ , that's all WRT complexity, the duplication doesn't
really contribute.

BTW, in my other post I forgot to mention one more trick in your code;
one should be aware that the exponentiation has right-associativity
(right grouping expression) and the evaluation of the subexpression
toggles between 0 and 1, so if you reduce the expression you have to
do that pair-wise to decompose that correctly.

_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_

is equivalent to either of

_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_
_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_
_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_^_
...
_^_^_^_
_^_
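The pair-wise reduction can be checked with a small sketch. It is in Python rather than awk, and it assumes what was stated earlier in the thread: that 0^0 evaluates to 1, and that ^ groups from the right. The function name is made up for illustration.

```python
def zero_power_tower(count):
    """Value of 0^0^...^0 with `count` zeros, grouped from the right."""
    value = 0                       # the right-most operand
    for _ in range(count - 1):      # fold leftwards: 0 ^ (everything to its right)
        value = 0 ** value          # 0**0 is 1 and 0**1 is 0 in Python
    return value

for count in (1, 2, 3, 4, 5, 6):
    print(count, zero_power_tower(count))
```

The result alternates with each added operand, so dropping two operands at a time never changes the value, while dropping just one flips it.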

but i also found the absolutely shortest syntax possible to utilize
gawk's hex decoder and print out decimals, assuming the input is just rows of 0x….. :

gawk -n '($!_=+$_)~_'
or
gawk -nM '($!_=+$_)~_' (if you need higher than double precision)

With my version of GNU awk this code just reproduces the input.

Want to provide test samples?

Janis

• From Kpop 2GM@jason.cy.kwan@gmail.com to comp.lang.awk on Thu Jan 20 22:03:11 2022
From Newsgroup: comp.lang.awk

try this one :

echo '0xCAFEBEEFFEED' | gawk -n '($!_=+$_)~_'
223195473903341
• From Kpop 2GM@jason.cy.kwan@gmail.com to comp.lang.awk on Thu Jan 20 22:09:00 2022
From Newsgroup: comp.lang.awk

nawk is slightly more verbose :

echo '0xCAFEBEEFFEED' | nawk '($!+_=+$+_)~_'
223195473903341

echo '0xCAFEBEEFFEED' | mawk '($!_=+$_)^_' OFMT=%.f
223195473903341

echo $'0x0\n0xFEEDCAFEBEF' | mawk '($!_=+$_)<"_"' OFMT=%.f
0
17518579149807

echo '0xCAFEBEEFFEEDCAFEBEEFFEEDCAFEBEEFFEEDCAFEBEEFFEEDCAFEBEEFFEEDCAFEBEEFFEEDCAFEBEEFFEEDCAFEBEEFFEEDCAFEBEEFFEEDCAFEBEEFFEEDCAFEBEEFFEED' | gawk -nM '($!+_=+$+_)~_'

696760147094848118127845202403676761678558245541567696744233608423914272675373998167144529865231202819614166286871620170316927958511530298151998159289949617901
• From Kpop 2GM@jason.cy.kwan@gmail.com to comp.lang.awk on Thu Jan 20 22:12:49 2022
From Newsgroup: comp.lang.awk

echo '0xCAFEBEEFFEED' |gawk -p- -P '($!+_=+$+_)~_'
223195473903341
# gawk profile, created Fri Jan 21 01:12:16 2022

# Rule(s)

1 ($! +_ = +$+_) ~ _ { # 1
1 print
}
• From Kpop 2GM@jason.cy.kwan@gmail.com to comp.lang.awk on Thu Jan 20 23:11:41 2022
From Newsgroup: comp.lang.awk

i've been building this huge library for my own use, and to ensure it's as portable as possible, i test the library against mawk 1.3.4, mawk1.9.9.6, gawk in unicode mode, gawk in byte mode, and nawk (or whatever proper name of that awk is that comes with macos at /usr/bin/awk)
from either pre-compiled binaries at homebrew, or just a straight up "make" using mawk-2's source code.
as a result, the library practically has nothing gawk specific at all, but the library also auto-detects which variant I invoked it with based on built-in behavior of the awk itself that cannot simply be tricked by hardcoding in a constant, or setting a variable somewhere, shell or inside awk :
e.g. this criteria (+"0x1" * 0x1)
every other awk and every other invocation flag of gawk would produce a zero, EXCEPT gawk -n / gawk -n -M
or this one : (+"")^atan2(+"",-log(+!+""))
every other awk and every invocation flag of gawk would produce a zero, EXCEPT nawk
i made this next criteria to quickly identify a few different variants/gawk flags, although not all of them are unique :
'BEGIN {
print \
-log((log(-"")*log(-""))^-log(-""))\
/(-"0xABCD")^-!-"" }'
mawk inf
mawk2 nan
nawk inf
gawk -P -e +inf
gawk -c -e +nan
gawk -e +nan
gawk -M -e -nan
gawk -n -e +inf
just gawk alone, you can get it to print out either positive Infinity, negative NaN, or positive NaN, depending on which flags.
• From Kpop 2GM@jason.cy.kwan@gmail.com to comp.lang.awk on Thu Jan 20 23:25:11 2022
From Newsgroup: comp.lang.awk

to check for mawk-2, it's

("\333\222"~"[^\333\222]")

it's a bug that only shows up in mawk-2. run it in gawk unicode mode or gawk byte mode, it's still a false.

or simply checking using hex decoding, one can split out 4 different invocation flags of gawk :

% gawk -n -e 'BEGIN { print +0xDEAD,+"0xCAFE" }'
57005 51966

% gawk -t -e 'BEGIN { print +0xDEAD,+"0xCAFE" }' (same for just -e)
57005 0

% gawk -P -e 'BEGIN { print +0xDEAD,+"0xCAFE" }'
0 51966

% gawk -c -e 'BEGIN { print +0xDEAD,+"0xCAFE" }'
0 0

• From Kpop 2GM@jason.cy.kwan@gmail.com to comp.lang.awk on Fri Jan 21 03:04:31 2022
From Newsgroup: comp.lang.awk

i found you a great test case : it's a pair of hex that's horizontal mirrors of each other, and both are prime. As you can see here, the same syntax works across mawk-1, nawk, and gawk -n :

mawk '($!+_=$+_=($+_)(":")(+$+_))~_' CONVFMT=%.f <<< $'0x11111BBBBFFF\n0xFFFBBBB11111' | ecp

0x11111BBBBFFF:18765177405439
0xFFFBBBB11111:281456650817809

nawk '($!+_=$+_=($+_)(":")(+$+_))~_' CONVFMT=%.f <<< $'0x11111BBBBFFF\n0xFFFBBBB11111' | ecp

0x11111BBBBFFF:18765177405439
0xFFFBBBB11111:281456650817809

gawk -n '($!+_=$+_=($+_)(":")(+$+_))~_' CONVFMT=%.f <<< $'0x11111BBBBFFF\n0xFFFBBBB11111' | ecp

0x11111BBBBFFF:18765177405439
0xFFFBBBB11111:281456650817809

• From Kpop 2GM@jason.cy.kwan@gmail.com to comp.lang.awk on Fri Jan 21 03:23:30 2022
From Newsgroup: comp.lang.awk

apparently even aligning decimals on the right hand side doesn't require a printf( ) statement :

mawk '_<($(_^_+!_)=+$+_)' CONVFMT=%20.f <<< $'0x0\n0x11111BBBBFFF\n0xFFFBBBB11111'

0x11111BBBBFFF       18765177405439
0xFFFBBBB11111      281456650817809
• From Kpop 2GM@jason.cy.kwan@gmail.com to comp.lang.awk on Fri Jan 21 03:28:03 2022
From Newsgroup: comp.lang.awk

apparently even right-aligning of decimals doesn't require a printf( ) statement :
mawk '($(_^_+!_)=+$+_)^_' CONVFMT=%20.f <<< $'0x11111BBBBFFF\n0xFFFBBBB11111' | gtr ' ' '.'
0x11111BBBBFFF.......18765177405439
0xFFFBBBB11111......281456650817809
(it looks screwy here since default google font isn't fixed width)
or with built-in vertical separation :
mawk '($(_^_+!_)=+$+_)<"~"' CONVFMT=%.f OFS="\f" <<< $'0x11111BBBBFFF\n0xFFFBBBB11111'
0x11111BBBBFFF
18765177405439
0xFFFBBBB11111
281456650817809
• From Janis Papanagnou@janis_papanagnou@hotmail.com to comp.lang.awk on Sat Jan 22 10:57:34 2022
From Newsgroup: comp.lang.awk

On 21.01.2022 07:03, Kpop 2GM wrote:
try this one :

echo '0xCAFEBEEFFEED' | gawk -n '($!_=+$_)~_'
223195473903341

Not for me...

$ echo '0xCAFEBEEFFEED' | gawk -n '($!_=+$_)~_'
0xCAFEBEEFFEED

BTW, I skip/skipped your many posts from the last two days; I find it inconvenient to get fragmentary thoughts spread over many postings.
(Maybe I read them later or maybe not.)

But, for a change, I like your idea of an awk code challenge and will
open a new thread with another one.

Janis

• From Kpop 2GM@jason.cy.kwan@gmail.com to comp.lang.awk on Sat Jan 22 10:25:08 2022
From Newsgroup: comp.lang.awk

this is what my gawk looks like with minor variations each time :
% echo '0xCAFEBEEFFEED' | gawk -n '$1=$0=$0'
0xCAFEBEEFFEED
(as expected)
% echo '0xCAFEBEEFFEED' | gawk -n '$1=$0'
0xCAFEBEEFFEED
% echo '0xCAFEBEEFFEED' | gawk -n '$1=$0=$0'
0xCAFEBEEFFEED
% echo '0xCAFEBEEFFEED' | gawk -n '$1=$0=+$0'
223195473903341
% echo '0xCAFEBEEFFEED' | gawk -P '$1=$0=+$0'
223195473903341
% echo '0xCAFEBEEFFEED' | gawk -P '$1=+$0'
223195473903341
% echo '0xCAFEBEEFFEED' | gawk -P '$1=+$1'
223195473903341
it's quite baffling to me why your gawk acts like that, seeing that gnu.org lists gawk 3.1 as the first version able to interpret hex, which is quite some time ago. I got the exact same syntax working in gawk 5.1.1, mawk 1.3.4, and macos nawk to go from hex to decimal, so if you're still stuck I don't know what else I could suggest to work around it other than doing it the old-fashioned verbose way with strtonum( ), e.g.
echo '0xCAFEBEEFFEED' | gawk -e '$!_=$_=strtonum($_)'
223195473903341
<<< '0xCAFEBEEFFEED' gawk -e '($!_=strtonum($_))^_'
223195473903341
• From Kpop 2GM@jason.cy.kwan@gmail.com to comp.lang.awk on Sat Jan 22 10:32:54 2022
From Newsgroup: comp.lang.awk

BTW, I skip/skipped your many posts from the last two days; I find it inconvenient to get fragmentary thoughts spread over many postings.
(Maybe I read them later or maybe not.)
it's not that i enjoy fragmentation (maybe it's just a manifestation of my ADHD). this, being a good ole' newsgroup, means I couldn't go edit existing posts. the only other option being i copy-over full text of existing to a new post, plus the amendment(s), then deleting the old post (and repeating that cycle numerous times). I'll do it if that's your preference.
• From Kpop 2GM@jason.cy.kwan@gmail.com to comp.lang.awk on Sat Jan 22 12:38:35 2022
From Newsgroup: comp.lang.awk

@Janis : i wasn't even intentionally obfuscating code for others. I write code directly in that style. like this function here, which performs arbitrary-length big-integer multiplication

function _x_(_,__,___,____,_____,______,_______,
________,_________,__________,___________) {
if ((_=="")||(__=="")) {
if (__=="") {
return _ };_=__}
_____="^[-]";________=substr("-",!-"",\
sub(_____,"",_)!=sub(_____,"",__))
sub(/[-]/,"+",_____)
sub(_____,"",_)-sub(_____,"",__)
_______="^["(+"")"]+";
sub(_______,"",_)-sub(_______,"",__)
if (_~(_______="^"(!-"")"?$")) {
return (________)(_?__:_)
} else if (__~_______) {
return (________)(__?_:__) }
_______=""; gsub(/./,+"",_____)
_=(_____)_; gsub(/./,".",_____)
sub("("(_____)")+$","_&",_)
sub("[^_]*[_]","",_)
_________=___*=___=length(_____)
___-=match(___,"$")
if(((_____=(__________=length(_))+ \
(___________=length(__)))<_________)\
|| (_________==_____\
&& (_*__)<(_________^___))) {
return ________?-_*__:_*__;
};_________-=--___;___=\
__________;____=___________;
split(genZeros(_____),______,//);
_____-=!!_+!!_;_____-=_________;
___________-=_________;
for(___^=!___;___<__________;___+=_________) {
_______=+substr(_,___,_________++);
for(____=___________;-_________<____;\
____-=_________) {______[\
_____-___-____]+=_______*(((\
!___<____)||FLG_AWK_MAWK_2)\
? substr(__,____,_________)\
: substr(__,___^!___,\
____+_________-___^!___))
};--_________};_______=\
___^=_____=+(___=____=_____="")
_______=length(______)+(_^=_="")
_^=_="";_+=_+=_-+-++_;
while(___<_______) {____=(\
(_____+=______[___++])%_)____;
_____=int(_____/_) }
sub("^"(!_)"*",________,____); return ____ }
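For comparison, here is a readable sketch of the general technique, schoolbook multiplication over fixed-width decimal chunks. It is written in Python and is not a transcription of the function above; the name, the 7-digit chunk width, and the little-endian limb layout are all assumptions chosen for illustration. The width is picked so that a product of two chunks plus carries stays well below 2^53, which is the constraint an awk version running on doubles would face.

```python
def multiply_decimal_strings(a, b, chunk_digits=7):
    """Multiply two signed decimal strings using base-10**chunk_digits limbs."""
    negative = a.startswith("-") != b.startswith("-")
    a, b = a.lstrip("+-"), b.lstrip("+-")
    base = 10 ** chunk_digits

    def to_limbs(s):
        # Split from the right into chunks; least significant limb first.
        s = s.lstrip("0") or "0"
        return [int(s[max(0, end - chunk_digits):end])
                for end in range(len(s), 0, -chunk_digits)]

    x, y = to_limbs(a), to_limbs(b)
    acc = [0] * (len(x) + len(y))
    for i, xi in enumerate(x):
        carry = 0
        for j, yj in enumerate(y):
            total = acc[i + j] + xi * yj + carry   # stays below 2**53 for 7-digit limbs
            acc[i + j] = total % base
            carry = total // base
        acc[i + len(y)] += carry
    while len(acc) > 1 and acc[-1] == 0:           # drop leading zero limbs
        acc.pop()
    digits = str(acc[-1]) + "".join(
        f"{limb:0{chunk_digits}d}" for limb in reversed(acc[:-1]))
    if digits == "0":
        return "0"
    return ("-" if negative else "") + digits

print(multiply_decimal_strings("12345678901234567890", "-98765432109876543210"))
```

Python already has arbitrary-precision integers, so this is only useful as a reference to compare an awk implementation against.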

then i use this next one to convert arbitrary sized integers to hex :

function int2hex(_______,______________, _____________,____________,___________,
__________,_________,________,______,
_____,____,___,__,_) {
___________=((_____=((__+=++__)\
)^__)^((__^(__*__)-++__)))*_____;
______________=(__^--__+--__)^(++__)^++__;
___________/=(______=(++__)^(__*=__))
__="";
sub(/^[+-]?[0]*/,"",_______)
sub(/[.][[:digit:]]*$/,"",_______)
if (_______=="") {
return "0x0" }
if (length(_______)<((__+=++__)^__^__)) {
if (_______~/^[0-9]$/) {
return ("0x")(+_______) }
#\$if (_______<((___=_____*_____)+___))
__=sprintf("%X%.8X",int(\
_______/______),_______%______)
sub(/^0*/,"0x",__); return __;
}
split("",_);_____=__^=__^=__/__+__;
__=(__=".")__;
gsub("",__,__)
sub("("(__)")+\$","_&",_______)
gsub(".","[^_]",__)
___=__=gsub(__,"&_",_______)
____=+"";
while(_______) {
________=(____=____*\
______________+_______)%_____;
_[__--]=int(____/_____)
____=________;
_______=substr(_______,\
index(_______,"_")+!+"") }
_______=sprintf("%.6X",____%_____)
_____+=_____+=_____;
__=____="";__________=-(__^__)
while(___) {
if(_[___]==(____=+"")) {
delete _[___--] }
if(!___) {
break }
for(__=___;-""<=__;__--) {
________=(____=____*\
______________+_[__])%_____;
_[__]=int(____/_____)
____=________ }
if (__________<-"") {
__________=+____
} else {_______=( !FLG_AWK_MAWK_1 \
? sprintf("%.13X",____*_____+__________)\
: sprintf("%.5X%.8X",int((__________+=\
____*_____)/______),__________%______))_______;
__________=-!!______;
} }
_______=sprintf("%X%08X",int((__________=\
__________<-""?____+_[___]:__________+\
_____*(____+_[___]))/______),\
__________%______)_______;
sub(/^0*/,"0x",_______); return _______ };
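A readable sketch of one common way to do this kind of conversion: repeated long division of a chunked decimal string by a power of 16, collecting the remainders as hex groups. It is in Python and is not a line-by-line rendering of int2hex above. The function name, the 7-digit decimal chunks, and the 6-hex-digit groups are assumptions picked so every intermediate value stays below 2^53. Negative input is not handled.

```python
def decimal_string_to_hex(dec, chunk_digits=7, hex_digits=6):
    """Convert a non-negative decimal string to hex by repeated long division."""
    dec = dec.lstrip("+0") or "0"
    if dec == "0":
        return "0x0"
    base = 10 ** chunk_digits
    divisor = 16 ** hex_digits

    # Big-endian limbs; the first limb takes the leftover leading digits.
    first = len(dec) % chunk_digits or chunk_digits
    limbs = [int(dec[:first])] + [int(dec[i:i + chunk_digits])
                                  for i in range(first, len(dec), chunk_digits)]
    groups = []                                   # least significant hex group first
    while limbs:
        remainder, quotient = 0, []
        for limb in limbs:                        # one pass of long division
            current = remainder * base + limb     # < 16**6 * 10**7, below 2**53
            quotient.append(current // divisor)
            remainder = current % divisor
        groups.append(remainder)
        while quotient and quotient[0] == 0:      # drop leading zero limbs
            quotient.pop(0)
        limbs = quotient
    head = f"{groups[-1]:X}"
    tail = "".join(f"{g:0{hex_digits}X}" for g in reversed(groups[:-1]))
    return "0x" + head + tail

print(decimal_string_to_hex("223195473903341"))
```

Each pass strips 24 bits off the low end of the number, so the loop runs roughly once per six hex digits of output.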