
gawk: Leftmost Longest

 3.5 How Much Text Matches?
 Consider the following:
      echo aaaabcd | awk '{ sub(/a+/, "<A>"); print }'
    This example uses the 'sub()' function to make a change to the input
 record.  ('sub()' replaces the first instance of any text matched by the
 first argument with the string provided as the second argument; see
 String Functions.)  Here, the regexp '/a+/' indicates "one or more 'a'
 characters," and the replacement text is '<A>'.
    The input contains four 'a' characters.  'awk' (and POSIX) regular
 expressions always match the leftmost, _longest_ sequence of input
 characters that can match.  Thus, all four 'a' characters are replaced
 with '<A>' in this example:
      $ echo aaaabcd | awk '{ sub(/a+/, "<A>"); print }'
      -| <A>bcd
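
    The following is a small sketch, not part of the original example,
 that illustrates the two halves of the rule separately.  The input
 strings and the shell variable names 'out1' and 'out2' are chosen here
 purely for illustration.  The first command shows that "leftmost" takes
 priority over "longest"; the second uses 'match()' to report where the
 match begins ('RSTART') and how long it is ('RLENGTH'):

```shell
# Leftmost wins over longest: the single leading 'a' is matched,
# not the longer run of 'a' characters later in the record.
out1=$(echo abaaaa | awk '{ sub(/a+/, "<A>"); print }')
echo "$out1"

# Among matches starting at the leftmost position, the longest is taken.
# match() sets RSTART (start index) and RLENGTH (length of the match).
out2=$(echo aaaabcd | awk '{ match($0, /a+/); print RSTART, RLENGTH }')
echo "$out2"
```

    The first command should print '<A>baaaa', and the second should
 print '1 4', because the match starts at the first character and covers
 all four 'a' characters.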
    For simple match/no-match tests, this is not so important.  But when
 doing text matching and substitutions with the 'match()', 'sub()',
 'gsub()', and 'gensub()' functions, it is very important.  See String
 Functions for more information on these functions.  Understanding
 this principle is also important for regexp-based record and field
 splitting (see Records, and also Field Separators).
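
    As a rough illustration of why this matters for substitution and
 field splitting, consider the sketch below.  The sample inputs and the
 shell variable names ('subs', 'nf_plus', 'nf_single') are assumptions
 made for this example only:

```shell
# gsub() replaces each leftmost-longest match in turn, so each run of
# 'a' characters becomes a single "<A>"; gsub() returns the count.
subs=$(echo aaaabcdaa | awk '{ n = gsub(/a+/, "<A>"); print n, $0 }')
echo "$subs"

# With the regexp ',+' as the field separator, a run of commas is
# matched as one separator, so the record has two fields.
nf_plus=$(echo 'a,,,b' | awk -F',+' '{ print NF }')
echo "$nf_plus"

# With a single ',' as the separator, every comma separates fields,
# so the same record has four fields (two of them empty).
nf_single=$(echo 'a,,,b' | awk -F',' '{ print NF }')
echo "$nf_single"
```

    These commands should print '2 <A>bcd<A>', '2', and '4',
 respectively.  The difference between the last two results comes
 entirely from how much text each separator regexp is able to match.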