gawk: Feature History

 A.6 History of 'gawk' Features
 This minor node describes the features in 'gawk' over and above those in
 POSIX 'awk', in the order they were added to 'gawk'.
    Version 2.10 of 'gawk' introduced the following features:
    * The 'AWKPATH' environment variable for specifying a path search for
      the '-f' command-line option (⇒Options).
    * The 'IGNORECASE' variable and its effects (⇒Case-sensitivity).
    * The '/dev/stdin', '/dev/stdout', '/dev/stderr' and '/dev/fd/N'
      special file names (⇒Special Files).
    Version 2.13 of 'gawk' introduced the following features:
    * The 'FIELDWIDTHS' variable and its effects (⇒Constant Size).
    * The 'systime()' and 'strftime()' built-in functions for obtaining
      and printing timestamps (⇒Time Functions).
    * Additional command-line options (⇒Options):
         - The '-W lint' option to provide error and portability checking
           for both the source code and at runtime.
         - The '-W compat' option to turn off the GNU extensions.
         - The '-W posix' option for full POSIX compliance.
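    As a rough illustration of the 'FIELDWIDTHS' variable listed above,
    the sketch below splits a fixed-width record into three fields.  It
    assumes a 'gawk' binary is on the search path; the function name
    'fieldwidths_demo' and the sample record are made up for the example.

```shell
# Hypothetical demo; assumes gawk (2.13 or later) is installed as 'gawk'.
fieldwidths_demo() {
    # Widths 3, 4 and 3 carve "abcdefghij" into abc / defg / hij.
    printf '%s\n' 'abcdefghij' |
        gawk 'BEGIN { FIELDWIDTHS = "3 4 3" } { print $1 "|" $2 "|" $3 }'
}

fieldwidths_demo    # prints abc|defg|hij
```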
    Version 2.14 of 'gawk' introduced the following feature:
    * The 'next file' statement for skipping to the next data file (⇒
      Nextfile Statement).
    Version 2.15 of 'gawk' introduced the following features:
    * New variables (⇒Built-in Variables):
         - 'ARGIND', which tracks the movement of 'FILENAME' through
           'ARGV'.
         - 'ERRNO', which contains the system error message when
           'getline' returns -1 or 'close()' fails.
    * The '/dev/pid', '/dev/ppid', '/dev/pgrpid', and '/dev/user' special
      file names.  These have since been removed.
    * The ability to delete all of an array at once with 'delete ARRAY'
    * Command-line option changes (⇒Options):
         - The ability to use GNU-style long-named options that start
           with '--'.
         - The '--source' option for mixing command-line and library-file
           source code.
    Version 3.0 of 'gawk' introduced the following features:
    * New or changed variables:
         - 'IGNORECASE' changed, now applying to string comparison as
           well as regexp operations (⇒Case-sensitivity).
         - 'RT', which contains the input text that matched 'RS' (⇒
           Records).
    * Full support for both POSIX and GNU regexps (⇒Regexp).
    * The 'gensub()' function for more powerful text manipulation (⇒
      String Functions).
    * The 'strftime()' function acquired a default time format, allowing
      it to be called with no arguments (⇒Time Functions).
    * The ability for 'FS' and for the third argument to 'split()' to be
      null strings (⇒Single Character Fields).
    * The ability for 'RS' to be a regexp (⇒Records).
    * The 'next file' statement became 'nextfile' (⇒Nextfile
      Statement).
    * The 'fflush()' function from BWK 'awk' (then at Bell Laboratories;
      ⇒I/O Functions).
    * New command-line options:
         - The '--lint-old' option to warn about constructs that are not
           available in the original Version 7 Unix version of 'awk'
         - The '-m' option from BWK 'awk'.  (Brian was still at Bell
           Laboratories at the time.)  This was later removed from both
           his 'awk' and from 'gawk'.
         - The '--re-interval' option to provide interval expressions in
           regexps (⇒Regexp Operators).
         - The '--traditional' option was added as a better name for
           '--compat' (⇒Options).
    * The use of GNU Autoconf to control the configuration process (⇒
      Quick Installation).
    * Amiga support.  This has since been removed.
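    Two of the Version 3.0 additions above can be sketched as follows:
    'gensub()' with backreferences, and a regexp 'RS' together with 'RT'.
    This is only an illustration that assumes a 'gawk' binary on the
    search path; the function names and sample inputs are invented here.

```shell
# Hypothetical demos; assume gawk 3.0 or later is installed as 'gawk'.

# gensub() returns the modified string and supports \1, \2 backreferences.
gensub_demo() {
    gawk 'BEGIN { print gensub(/(a+)(b+)/, "<\\2\\1>", "g", "aabb ab") }'
}

# RS as a regexp: each record ends at a run of digits, and RT holds the
# text that matched RS (empty for the final record at end of input).
rs_regexp_demo() {
    printf 'a1b22c' | gawk 'BEGIN { RS = "[0-9]+" } { print $0 ":" RT }'
}

gensub_demo       # prints <bbaa> <ba>
rs_regexp_demo    # prints a:1, b:22 and c: on separate lines
```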
    Version 3.1 of 'gawk' introduced the following features:
    * New variables (⇒Built-in Variables):
         - 'BINMODE', for non-POSIX systems, which allows binary I/O for
           input and/or output files (⇒PC Using).
         - 'LINT', which dynamically controls lint warnings.
         - 'PROCINFO', an array for providing process-related
           information.
         - 'TEXTDOMAIN', for setting an application's
           internationalization text domain (⇒
    * The ability to use octal and hexadecimal constants in 'awk' program
      source code (⇒Nondecimal-numbers).
    * The '|&' operator for two-way I/O to a coprocess (⇒Two-way
      I/O).
    * The '/inet' special files for TCP/IP networking using '|&' (⇒
      TCP/IP Networking).
    * The optional second argument to 'close()' that allows closing one
      end of a two-way pipe to a coprocess (⇒Two-way I/O).
    * The optional third argument to the 'match()' function for capturing
      text-matching subexpressions within a regexp (⇒String
      Functions).
    * Positional specifiers in 'printf' formats for making translations
      easier (⇒Printf Ordering).
    * A number of new built-in functions:
         - The 'asort()' and 'asorti()' functions for sorting arrays
           (⇒Array Sorting).
         - The 'bindtextdomain()', 'dcgettext()' and 'dcngettext()'
           functions for internationalization (⇒Programmer i18n).
         - The 'extension()' function and the ability to add new built-in
           functions dynamically (⇒Dynamic Extensions).
         - The 'mktime()' function for creating timestamps (⇒Time
           Functions).
         - The 'and()', 'or()', 'xor()', 'compl()', 'lshift()',
           'rshift()', and 'strtonum()' functions (⇒Bitwise
           Functions).
    * The support for 'next file' as two words was removed completely
      (⇒Nextfile Statement).
    * Additional command-line options (⇒Options):
         - The '--dump-variables' option to print a list of all global
           variables.
         - The '--exec' option, for use in CGI scripts.
         - The '--gen-po' command-line option and the use of a leading
           underscore to mark strings that should be translated (⇒
           String Extraction).
         - The '--non-decimal-data' option to allow non-decimal input
           data (⇒Nondecimal Data).
         - The '--profile' option and 'pgawk', the profiling version of
           'gawk', for producing execution profiles of 'awk' programs
         - The '--use-lc-numeric' option to force 'gawk' to use the
           locale's decimal point for parsing input data (⇒
    * The use of GNU Automake to help in standardizing the configuration
      process (⇒Quick Installation).
    * The use of GNU 'gettext' for 'gawk''s own message output (⇒
      Gawk I18N).
    * BeOS support.  This was later removed.
    * Tandem support.  This was later removed.
    * The Atari port became officially unsupported and was later removed.
    * The source code changed to use ISO C standard-style function
      definitions.
    * POSIX compliance for 'sub()' and 'gsub()' (⇒Gory Details).
    * The 'length()' function was extended to accept an array argument
      and return the number of elements in the array (⇒String
      Functions).
    * The 'strftime()' function acquired a third argument to enable
      printing times as UTC (⇒Time Functions).
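    A few of the Version 3.1 functions above, shown as a minimal sketch.
    It assumes a 'gawk' binary on the search path; the function names
    'match_demo', 'asort_demo', and 'bits_demo' and the sample strings
    are illustrative only.

```shell
# Hypothetical demos; assume gawk 3.1 or later is installed as 'gawk'.

# Third argument to match(): m[1] and m[2] receive the parenthesized groups.
match_demo() {
    gawk 'BEGIN { if (match("key=value", /([a-z]+)=([a-z]+)/, m)) print m[1], m[2] }'
}

# asort() sorts the array values in place and renumbers the indices from 1.
asort_demo() {
    gawk 'BEGIN { n = split("pear apple fig", a); asort(a); print a[1], a[2], a[3] }'
}

# strtonum() understands a leading 0x; and() is a bitwise AND.
bits_demo() {
    gawk 'BEGIN { print strtonum("0x1F"), and(12, 10) }'
}

match_demo    # prints key value
asort_demo    # prints apple fig pear
bits_demo     # prints 31 8
```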
    Version 4.0 of 'gawk' introduced the following features:
    * Variable additions:
         - 'FPAT', which allows you to specify a regexp that matches the
           fields, instead of matching the field separator (⇒
           Splitting By Content).
         - If 'PROCINFO["sorted_in"]' exists, 'for(iggy in foo)' loops
           sort the indices before looping over them.  The value of this
           element provides control over how the indices are sorted
           before the loop traversal starts (⇒Controlling
         - 'PROCINFO["strftime"]', which holds the default format for
           'strftime()' (⇒Time Functions).
    * The special files '/dev/pid', '/dev/ppid', '/dev/pgrpid' and
      '/dev/user' were removed.
    * Support for IPv6 was added via the '/inet6' special file.  '/inet4'
      forces IPv4 and '/inet' chooses the system default, which is
      probably IPv4 (⇒TCP/IP Networking).
    * The use of '\s' and '\S' escape sequences in regular expressions
      (⇒GNU Regexp Operators).
    * Interval expressions became part of default regular expressions
      (⇒Regexp Operators).
    * POSIX character classes work even with '--traditional' (⇒
      Regexp Operators).
    * 'break' and 'continue' became invalid outside a loop, even with
      '--traditional' (⇒Break Statement, and also see ⇒
      Continue Statement).
    * 'fflush()', 'nextfile', and 'delete ARRAY' are allowed with '--posix'
      or '--traditional', since they are all now part of POSIX.
    * An optional third argument to 'asort()' and 'asorti()', specifying
      how to sort (⇒String Functions).
    * The behavior of 'fflush()' changed to match BWK 'awk' and POSIX;
      now both 'fflush()' and 'fflush("")' flush all open output
      redirections (⇒I/O Functions).
    * The 'isarray()' function, which determines whether an item is an
      array, to make it possible to traverse arrays of arrays (⇒
      Type Functions).
    * The 'patsplit()' function which gives the same capability as
      'FPAT', for splitting (⇒String Functions).
    * An optional fourth argument to the 'split()' function, which is an
      array to hold the values of the separators (⇒String
      Functions).
    * Arrays of arrays (⇒Arrays of Arrays).
    * The 'BEGINFILE' and 'ENDFILE' special patterns (⇒
    * Indirect function calls (⇒Indirect Calls).
    * 'switch' / 'case' are enabled by default (⇒Switch
      Statement).
    * Command-line option changes (⇒Options):
         - The '-b' and '--characters-as-bytes' options which prevent
           'gawk' from treating input as a multibyte string.
         - The redundant '--compat', '--copyleft', and '--usage' long
           options were removed.
         - The '--gen-po' option was finally renamed to the correct
         - The '--sandbox' option which disables certain features.
         - All long options acquired corresponding short options, for use
           in '#!' scripts.
    * Directories named on the command line now produce a warning, not a
      fatal error, unless '--posix' or '--traditional' is used (⇒
      Command-line directories).
    * The 'gawk' internals were rewritten, bringing the 'dgawk' debugger
      and possibly improved performance (⇒Debugger).
    * Per the GNU Coding Standards, dynamic extensions must now define a
      global symbol indicating that they are GPL-compatible (⇒Plugin
    * In POSIX mode, string comparisons use 'strcoll()' / 'wcscoll()'
      (⇒POSIX String Comparison).
    * The option for raw sockets was removed, since it was never
      implemented (⇒TCP/IP Networking).
    * Ranges of the form '[d-h]' are treated as if they were in the C
      locale, no matter what kind of regexp is being used, and even with
      '--posix' (⇒Ranges and Locales).
    * Support was removed for the following systems:
         - Atari
         - Amiga
         - BeOS
         - Cray
         - MIPS RiscOS
         - MS-DOS with the Microsoft Compiler
         - MS-Windows with the Microsoft Compiler
         - NeXT
         - SunOS 3.x, Sun 386 (Road Runner)
         - Tandem (non-POSIX)
         - Prestandard VAX C compiler for VAX/VMS
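    Three of the Version 4.0 features above, 'FPAT',
    'PROCINFO["sorted_in"]', and indirect function calls, can be sketched
    as follows.  The sketch assumes a 'gawk' binary on the search path;
    the function names and the sample record are chosen for illustration.

```shell
# Hypothetical demos; assume gawk 4.0 or later is installed as 'gawk'.

# FPAT describes what a field looks like: either a run of non-commas
# or a double-quoted string (which may itself contain commas).
fpat_demo() {
    printf '%s\n' 'Robbins,Arnold,"1234 A Pretty Street, NE",MyTown' |
        gawk 'BEGIN { FPAT = "([^,]+)|(\"[^\"]+\")" } { print NF ": " $3 }'
}

# PROCINFO["sorted_in"] fixes the traversal order of "for (k in a)".
sorted_in_demo() {
    gawk 'BEGIN {
        a["b"]; a["c"]; a["a"]
        PROCINFO["sorted_in"] = "@ind_str_asc"
        for (k in a) out = out k
        print out
    }'
}

# Indirect call: @f() calls the function whose name is stored in f.
indirect_demo() {
    gawk 'function hi() { return "hi" } BEGIN { f = "hi"; print @f() }'
}

fpat_demo         # prints 4: "1234 A Pretty Street, NE"
sorted_in_demo    # prints abc
indirect_demo     # prints hi
```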
    Version 4.1 of 'gawk' introduced the following features:
    * Three new arrays: 'SYMTAB', 'FUNCTAB', and
      'PROCINFO["identifiers"]' (⇒Auto-set).
    * The three executables 'gawk', 'pgawk', and 'dgawk' were merged
      into one, named just 'gawk'.  As a result, the command-line options
      changed.
    * Command-line option changes (⇒Options):
         - The '-D' option invokes the debugger.
         - The '-i' and '--include' options load 'awk' library files.
         - The '-l' and '--load' options load compiled dynamic
           extensions.
         - The '-M' and '--bignum' options enable MPFR.
         - The '-o' option only does pretty-printing.
         - The '-p' option is used for profiling.
         - The '-R' option was removed.
    * Support for high precision arithmetic with MPFR (⇒Arbitrary
      Precision Arithmetic).
    * The 'and()', 'or()' and 'xor()' functions changed to allow any
      number of arguments, with a minimum of two (⇒Bitwise
      Functions).
    * The dynamic extension interface was completely redone (⇒
      Dynamic Extensions).
    * Redirected 'getline' became allowed inside 'BEGINFILE' and
      'ENDFILE'.
    * The 'where' command was added to the debugger (⇒Execution
    * Support for Ultrix was removed.
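    As a small sketch of two Version 4.1 items above, the following
    assigns to a variable through 'SYMTAB' and calls 'and()' with three
    arguments.  It assumes a 'gawk' binary on the search path; the
    function names and values are illustrative.

```shell
# Hypothetical demos; assume gawk 4.1 or later is installed as 'gawk'.

# SYMTAB gives indirect access to global variables by name.
symtab_demo() {
    gawk 'BEGIN { x = 5; SYMTAB["x"] = 7; print x }'
}

# and() accepts more than two arguments: 7 & 6 & 12 is 4.
and_demo() {
    gawk 'BEGIN { print and(7, 6, 12) }'
}

symtab_demo    # prints 7
and_demo       # prints 4
```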
    Version 4.2 of 'gawk' introduced the following changes:
    * Changes to 'ENVIRON' are reflected into 'gawk''s environment and
      that of programs that it runs.  ⇒Auto-set.
    * 'FIELDWIDTHS' was enhanced to allow skipping characters before
      assigning a value to a field (⇒Splitting By Content).
    * The 'PROCINFO["argv"]' array.  ⇒Auto-set.
    * The maximum number of hexadecimal digits in '\x' escapes is now
      two.  ⇒Escape Sequences.
    * Strongly typed regexp constants of the form '@/.../' (⇒Strong
      Regexp Constants).
    * The bitwise functions changed, making negative arguments into a
      fatal error (⇒Bitwise Functions).
    * The 'mktime()' function now accepts an optional second argument
      (⇒Time Functions).
    * The 'typeof()' function (⇒Type Functions).
    * Optimizations are enabled by default.  Use '-s' / '--no-optimize'
      to disable optimizations.
    * For many years, POSIX specified that default field splitting only
      allowed spaces and tabs to separate fields, and this was how 'gawk'
      behaved with '--posix'.  As of 2013, the standard restored
      historical behavior, and now default field splitting with '--posix'
      also allows newlines to separate fields.
    * Nonfatal output with 'print' and 'printf'.  ⇒Nonfatal.
    * Retryable I/O via 'PROCINFO[INPUT-FILE, "RETRY"]'; (⇒Retrying
    * Changes to the pretty-printer (⇒Profiling):
         - The '--pretty-print' option no longer runs the 'awk' program
         - Comments in the source program are preserved and placed into
           the output file.
         - Explicit parentheses for expressions in the input are
           preserved in the generated output.
    * Improvements to the extension API (⇒Dynamic Extensions):
         - The 'get_file()' function to access open redirections.
         - The 'nonfatal()' function for generating nonfatal error
           messages.
         - Support for GMP and MPFR values.
         - Input parsers can now override the default field parsing
           mechanism by specifying explicit locations.
    * Shell startup files are supplied with the distribution and
      installed by 'make install' (⇒Shell Startup Files).
    * The 'igawk' program and its manual page are no longer installed
      when 'gawk' is built.  ⇒Igawk Program.
    * Support for MirBSD was removed.
    * Support for GNU/Linux on Alpha was removed.
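    Two of the Version 4.2 changes above, sketched under the assumption
    that a 'gawk' binary (4.2 or later) is on the search path: the
    'typeof()' function, including a strongly typed regexp constant, and
    the 'FIELDWIDTHS' skip syntax.  The function names and the sample
    record are invented for the example.

```shell
# Hypothetical demos; assume gawk 4.2 or later is installed as 'gawk'.

# typeof() reports the type of its argument; @/x/ is a strongly typed regexp.
typeof_demo() {
    gawk 'BEGIN { a[1]; print typeof(1), typeof("x"), typeof(@/x/), typeof(a) }'
}

# FIELDWIDTHS "3 2:4": take 3 characters, skip 2, then take 4.
skip_demo() {
    printf '%s\n' 'abcXXdefg' |
        gawk 'BEGIN { FIELDWIDTHS = "3 2:4" } { print $1, $2 }'
}

typeof_demo    # prints number string regexp array
skip_demo      # prints abc defg
```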