I am interested in organizing a long string of comma-separated
numbers ("CSV") in different ways. For instance, I'd like to get
every other pair of numbers (see below for my work so far). This might
be useful and extendable for basic mathematical analysis, or for
practical reformatting of program output. E.g. svg files have paths
with such features (see the "q" or "c" commands). It could also help
with plotting different subsets of the data, e.g. every other pair, or
other combinations. (However, I note that the gnuplot "every" command
is also useful for this.)
For example this sequence:
-10.000000,-9.000000,-8.000000,-7.000000,-[...trim...]7.000000,8.000000,9.000000,10.000000
can be put into different groups, for example these "x,y" data points :
-10.000, -9.000
-8.000, -7.000
-6.000, -5.000
-4.000, -3.000
-2.000, -1.000
0.000, 1.000
2.000, 3.000
4.000, 5.000
6.000, 7.000
8.000, 9.000
10.000,
(note that the last number has no partner). This script will do
that (with extra details shown to help follow the process):
awk_dev_test_seq=$(seq -s',' -f '%f' -10 10)
gawk -F, '
{
    for (i = 1; i <= NF; i++) {
        if (i % 2 == 0)
            printf("i=%s Y:%3.3f%s ", i, $i, "\n")
        else
            printf("i=%s X:%3.3f%s ", i, $i, ",")
    }
}' <<EOF
${awk_dev_test_seq}
EOF
The number in (i % 2 == 0) can be adjusted: changing "i % 2" to
"i % 3", for example, puts three consecutive numbers on each line.
Results:
i=1 X:-10.000, i=2 X:-9.000, i=3 Y:-8.000
... and so on. I have been looking at how to do other groupings of
the data - for example, getting every other *pair* of numbers would be
interesting, illustrated in this pseudo-output:
keep this line : -10.000, -9.000
Skip this line-> -8.000, -7.000
keep this line : -6.000, -5.000
Skip this line-> -4.000, -3.000
keep this line : -2.000, -1.000
I am asking what approaches might be best to do that in awk -
if/else, while, for, or other control statements (I think that is the
term for those).
Tried to keep this short, but I'll note some interesting postings on this topic :
"Parsing standard CVS data by gawk" https://lists.gnu.org/archive/html/bug-gawk/2015-07/msg00002.html
"CSV parsing with awk" https://backreference.org/2010/04/17/csv-parsing-with-awk/index.html
-Bryan
This is interesting, thanks.
Personally I'd take a (slightly) different approach here, like doing
a handling of irregular (odd) cases
awk -F, '
NF % 2 == 1 { ...in case of odd number of fields - what to do?... }
NF % 2 == 0 { ...(regular?) case of even number of fields... }
'
(The second condition may be irrelevant if you use the first action
to fix your data, and you can fall through in the regular case.)
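As a minimal sketch of that two-pattern skeleton: the names `line`, `n` and `pairs` are hypothetical, and dropping the unpaired last field is only one assumed answer to the "what to do?" question (padding it or reporting an error would be equally valid).

```shell
# Hypothetical sample input with an odd number of fields (5).
line='1,2,3,4,5'

pairs=$(printf '%s\n' "$line" | awk -F, '
NF % 2 == 1 { n = NF - 1 }   # odd count: ignore the unpaired last field (assumed policy)
NF % 2 == 0 { n = NF }       # even count: use every field
{
    # Walk the usable fields two at a time and print each pair on its own line.
    for (i = 1; i < n; i += 2)
        printf "%.3f, %.3f\n", $i, $(i+1)
}')

printf '%s\n' "$pairs"
```

Keeping the usable field count in `n` rather than modifying NF leaves the record itself untouched.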
For the iteration I'd do
for (i=1; i<=NF; i+=2) # i.e. increment by 2
and print a pair of numbers in one single print statement
printf "X:%3.3f,Y:%3.3f\n", $i, $(i+1)
(adjust the formatting string and arguments as desired).
In case you want to skip a data pair adjust the increment
appropriately, say, by i+=4 (for your example below), or by
i+=3 if you want to skip a data value (say a Z-coordinate).

That idea - in the following script - appears to be exactly what I mean:
awk_dev_test_seq=$(seq -s',' -f '%f' -10 10)
[...]
printf ( "i=%s %3.3f %3.3f \n", i, $i, $(i+1) )
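For reference, the stride suggestion above can be written out as one complete command. This is only a sketch: the short input `seq_csv` and the `step` variable are hypothetical names introduced here, and plain `awk` is assumed to behave like gawk for these features.

```shell
# Hypothetical short input: ten values, i.e. five pairs.
seq_csv='-10,-9,-8,-7,-6,-5,-4,-3,-2,-1'

every_other_pair=$(printf '%s\n' "$seq_csv" | awk -F, -v step=4 '
{
    # Each pass prints fields i and i+1 as one pair.
    # step=2 visits every pair; step=4 keeps one pair and skips the next.
    for (i = 1; i < NF; i += step)
        printf "%.3f, %.3f\n", $i, $(i+1)
}')

printf '%s\n' "$every_other_pair"
```

With `step=4` this keeps the pairs starting at fields 1, 5 and 9, which are the "keep this line" rows of the pseudo-output in the original question.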
On 17.03.2023 17:32, Bryan wrote:
printf ( "i=%s %3.3f %3.3f \n", i, $i, $(i+1) )
I see you added parentheses. But note that 'printf' - as 'print',
but as opposed to 'sprintf()' - is a statement, not a function.
In article <tv3aao$2aucf$1@dont-email.me>,
Janis Papanagnou <janis_papanagnou+ng@hotmail.com> wrote:
On 17.03.2023 17:32, Bryan wrote:
printf ( "i=%s %3.3f %3.3f \n", i, $i, $(i+1) )
I see you added parentheses. But note that 'printf' - as 'print',
but as opposed to 'sprintf()' - is a statement, not a function.
Although you don't say so explicitly, the implication is that using parentheses with printf is wrong. This implication is incorrect.
Although the parens are optional in most cases, they are necessary in
certain cases. I always use them (when I use printf in awk), because:
1) It looks better (IMHO, of course). It conforms more to what we
would expect to see in C.
2) It is necessary in certain cases, so might as well use them always.
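One such case, as a small sketch, is an argument that uses the ">" relational operator: without parentheses around the argument list, awk reads ">" as an output redirection. The variable names `a`, `b` and `with_parens` are hypothetical.

```shell
with_parens=$(awk 'BEGIN {
    a = 2; b = 1
    # Parenthesized argument list: ">" is parsed as a comparison, so this
    # prints the truth value of (a > b).
    printf("%d\n", a > b)
    # Written as:  printf "%d\n", a > b
    # the same line would instead write "2" to a file named "1".
}')

printf '%s\n' "$with_parens"
```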