• gawk regexp question

    From gazelle@gazelle@shell.xmission.com (Kenny McCormack) to comp.lang.awk on Thu Dec 1 09:04:18 2022
    From Newsgroup: comp.lang.awk

    I have a regexp like:

    /^.*[?/]word[=/]/

    and it seems to work as expected. Notice that neither of the weird/special characters (? or /) is escaped (i.e., preceded by \) inside of [].

    Am I correct in assuming this is OK? Is there a list anywhere of what is
    and isn't "special" (i.e., needing to be escaped) inside of []?
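A quick way to check the pattern from the question is to run it over a few inputs. This is only a sketch: the four sample lines are made up, and the pattern is written as a string (dynamic regexp) rather than a /.../ literal, so the result does not depend on how a particular awk's lexer treats an unescaped / inside brackets in a literal.

```shell
# Hypothetical sample inputs; only the regexp itself comes from the post.
out=$(printf '%s\n' 'a?word=1' '/x/word/y' 'swordfish=1' 'a?words=1' |
    awk '{
        # "?" and "/" are unescaped inside the brackets, as in the question
        verdict = ($0 ~ "^.*[?/]word[=/]") ? "match" : "no match"
        print verdict ": " $0
    }')
printf '%s\n' "$out"
```

The first two lines should be reported as matches and the last two as non-matches, since "swordfish" has no ? or / before "word" and "words" is not followed by = or /.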
    --- Synchronet 3.19c-Linux NewsLink 1.113
  • From Manuel Collado@m-collado@users.sourceforge.net to comp.lang.awk on Thu Dec 1 11:30:53 2022
    From Newsgroup: comp.lang.awk

    On 01/12/2022 at 10:04, Kenny McCormack wrote:
    > I have a regexp like:
    >
    > /^.*[?/]word[=/]/
    >
    > and it seems to work as expected. Notice that neither of the weird/special characters (? or /) is escaped (i.e., preceded by \) inside of [].
    >
    > Am I correct in assuming this is OK? Is there a list anywhere of what is
    > and isn't "special" (i.e., needing to be escaped) inside of []?


    The gawk manual says:

    "To include one of the characters ‘\’, ‘]’, ‘-’, or ‘^’ in a bracket
    expression, put a ‘\’ in front of it. For example:
    [d\]]
    matches either ‘d’ or ‘]’. Additionally, if you place ‘]’ right after
    the opening ‘[’, the closing bracket is treated as one of the characters
    to be matched."
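The two forms described in the manual excerpt can be compared side by side. This is a minimal sketch, assuming gawk is installed (it falls back to whatever `awk` is on the PATH, where the backslash form may behave differently); the inputs d, ] and x are illustrative only.

```shell
# Prefer gawk, since the quoted rule is from the gawk manual (assumption: one of the two exists).
AWK=$(command -v gawk || command -v awk)
out=$(printf '%s\n' 'd' ']' 'x' | "$AWK" '
{
    escaped = ($0 ~ /^[d\]]$/) ? "yes" : "no"   # "]" escaped with a backslash
    leading = ($0 ~ /^[]d]$/)  ? "yes" : "no"   # "]" placed right after "["
    print $0, escaped, leading
}')
printf '%s\n' "$out"
```

With gawk, both columns should read "yes" for d and ], and "no" for x.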

    Don't know if this also applies to other awk variants.
    --
    Manuel Collado - http://mcollado.z15.es
  • From gazelle@gazelle@shell.xmission.com (Kenny McCormack) to comp.lang.awk on Thu Dec 1 12:40:42 2022
    From Newsgroup: comp.lang.awk

    In article <651e12f0-d7c7-fc22-ffc1-080cb836e4f6@users.sourceforge.net>,
    Manuel Collado <m-collado@users.sourceforge.net> wrote:
    ...
    > The gawk manual says:
    >
    > "To include one of the characters \, ], -, or ^ in a bracket
    > expression, put a \ in front of it. For example:
    > [d\]]
    > matches either d or ]. Additionally, if you place ] right after
    > the opening [, the closing bracket is treated as one of the characters
    > to be matched."

    OK, so it is just those 4 (\]-^).

    I think "-" is also OK (i.e., doesn't need to be escaped) if it is the first character inside of [].
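The claim about a leading "-" is easy to probe. A small sketch with made-up inputs: the first bracket expression puts "-" first, the second puts it between two characters, where it forms a range.

```shell
out=$(printf '%s\n' '-' 'a' 'b' 'c' | awk '
{
    first = ($0 ~ /^[-ac]$/) ? "yes" : "no"   # "-" first: taken literally
    range = ($0 ~ /^[a-c]$/) ? "yes" : "no"   # "-" between two characters: a range
    print $0, first, range
}')
printf '%s\n' "$out"
```

Here "-" should match only the first expression and "b" only the second, which is consistent with an unescaped leading "-" being an ordinary character.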

    > Don't know if this also applies to other awk variants.

    Nobody cares anymore about "other awk variants".
    (This is a Good Thing...)
  • From Janis Papanagnou@janis_papanagnou+ng@hotmail.com to comp.lang.awk on Sun Dec 4 01:46:08 2022
    From Newsgroup: comp.lang.awk

    On 01.12.2022 11:30, Manuel Collado wrote:

    > The gawk manual says:
    >
    > "To include one of the characters ‘\’, ‘]’, ‘-’, or ‘^’ in a bracket
    > expression, put a ‘\’ in front of it. For example:
    > [d\]]
    > matches either ‘d’ or ‘]’. Additionally, if you place ‘]’ right after
    > the opening ‘[’, the closing bracket is treated as one of the characters
    > to be matched."
    >
    > Don't know if this also applies to other awk variants.

    The old Awk "Bible" says:
    "Inside a character class, all characters have their literal meaning,
    except for the quoting character \ , ^ at the beginning, and - between
    two characters."

    And for characters generally it says that a single non-meta-character
    matches itself, while a meta-character needs to be \-escaped to preserve
    its literal meaning.

    I suppose that's what we could expect from other awks, including older ones.
    (Test cases might be []], [[], vs. [\]], [\[].)
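The suggested test cases can be run in one pass. This is a sketch, assuming gawk is available (with a fallback to plain `awk`, where results may differ, which is the point of the test); the anchors ^ and $ and the extra input x are additions for illustration.

```shell
# Prefer gawk; on another awk the output shows which forms that implementation accepts.
AWK=$(command -v gawk || command -v awk)
out=$(printf '%s\n' ']' '[' 'x' | "$AWK" '
{
    r = ""
    if ($0 ~ /^[]]$/)  r = r " []]"     # "]" right after the opening "["
    if ($0 ~ /^[[]$/)  r = r " [[]"     # unescaped "["
    if ($0 ~ /^[\]]$/) r = r " [\\]]"   # backslash-escaped "]"
    if ($0 ~ /^[\[]$/) r = r " [\\[]"   # backslash-escaped "["
    print $0 ":" r
}')
printf '%s\n' "$out"
```

Each output line lists the bracket expressions that matched that input. With gawk, going by the manual excerpt quoted in this thread, ] should be matched by []] and [\]], [ by [[] and [\[], and x by none of them.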

    For more recent tools, POSIX defines bracket expressions in its BRE
    section (the ERE rules that POSIX awk uses refer back to it), also
    mentioning the brackets. (WRT the bracket symbols it gets a bit more
    complicated, though, with the collating syntaxes.)

    Janis