manpagez: man pages & more
man glark(1)
Home | html | info | man
glark(1)                          glark 1.8.0                         glark(1)




NAME

       glark - Search text files for complex regular expressions


SYNOPSIS

       glark [options] expression file ...


DESCRIPTION

       Similar to "grep", "glark" offers: Perl-compatible regular expressions,
       color highlighting of matches, context around matches, complex expres-
       sions ("and" and "or"), grep output emulation, and automatic exclusion
       of non-text files. Its regular expressions should be familiar to per-
       sons experienced in Perl, Python, or Ruby. File may also be a list of
       files in the form of a path.


OPTIONS

       Input


           -0[nnn]
               Use \nnn (octal) as the input record separator. If nnn is omit-
               ted, use '\n\n' as the record separator, which treats para-
               graphs as lines.

           -d ACTION, --directories=ACTION
               Directories are processed according to the given ACTION, which
               by default is "read". If ACTION is "recurse", each file in the
               directory is read and each subdirectory is recursed into
               (equivalent to the "-r" option). If ACTION is "skip", directo-
               ries are not read, and no message is produced.

           --binary-files=TYPE
               Specify how to handle binary files, thus overriding the default
               behavior, which is to denote the binary files that match the
               expression, without displaying the match. TYPE may be one of:
               "binary", the default; "without-match", which results in binary
               files being skipped; and "text", which results in the binary
               file being treated as text, the display of which may have bad
               side effects with the terminal. Note that the default behavior
               has changed; this previously was to skip binary files. The same
               effect may be achieved by setting binary-files to "with-
               out-match" in the ~/.glarkrc file.

           --[with-]basename EXPR, --[with-]name EXPR
               Search only files whose names match the given regular expres-
               sion. As in find(1), this works on the basename of the file.
               This expression can be negated and modified with "!" and "i",
               such as '!/io\.[hc]$/i'.

           --[with-]fullname EXPR, --[with-]path EXPR
               Search only files whose names, including path, match the given
               regular expression. As in find(1), this works on the path of
               the file. This expression can be negated and modified with "!"
               and "i", such as '!/Dialog.*\.java$/i'.

           --without-basename EXPR, --without-name EXPR
               Do not search files with base names matching the given regular
               expression.

           --without-fullname EXPR, --without-path EXPR
               Do not search files with full names matching the given regular
               expression.

           -M, --exclude-matching
               Do not search files whose names match the given expression.
               This can be useful for finding external references to a file,
               or to a class (assuming that class names match file names).

           -r, --recurse
               Recurse through directories. Equivalent to --directories=read.

           --split-as-path(=VALUE), --no-split-as-path
               Sets whether, if a command line argument includes the path sep-
               arator (such as ":"), the argument should be split by the path
               separator. This functionality is useful for using environment
               variables as input, such as $PATH and $CLASSPATH, which are
               automatically split and processed as a list of files and direc-
               tories.  The default value of this option is "true".
               "--no-split-as-path" is equivalent to "--split-as-path=false".

           --size-limit=SIZE
               If provided, files no larger than SIZE bytes will be searched.
               This is useful when running the "--recurse" option on directo-
               ries that may contain large files.

       Matching


           -a NUM expr1 expr2
           --and NUM expr1 expr2
           --and=NUM expr1 expr2
           ( expr1 --and=NUM expr2 )
               Match both of the two expressions, within NUM lines of each
               other. See the EXPRESSIONS section for more information.

           -b NUM[%], --before NUM[%]
               Restrict the search to before the given location, which repre-
               sents either the number of the last line within the valid
               range, or the percentage of lines to be searched.

           --after NUM[%]
               Restrict the search to after the given section, which repre-
               sents either the number of the first line within the valid
               range, or the percentage of lines to be skipped.

           -f FILE, --file=FILE
               Use the lines in the given file as expressions. Each line con-
               sists of a regular expression.

           -i, --ignore-case
               Match regular expressions without regard to case. The default
               is case sensitive.

           -m NUM, --match-limit NUM
               Find only the first NUM matches in each file.

           -o expr1 expr2
           --or expr1 expr2
           ( expr1 --or expr2 )
               Match either of the two expressions. See the EXPRESSIONS sec-
               tion for more information.

           -R, --range NUM[%],NUM[%]
               Restrict the search to the given range of lines, as either line
               numbers or a percentage of the length of the file.

           -v, --invert-match
               Show lines that do not match the expression.

           -w, --word, --word-regexp
               Put word boundaries around each pattern, thus matching only
               where the full word(s) occur in the text. Thus, "glark -w Foo"
               is the same as "glark '/\bFoo\b/'".

           -x, --line-regexp
               Select only where the entire line matches the pattern(s).

           --xor expr1 expr2
           ( expr1 --xor expr2 )
               Match either of the two expressions, but not both. See the
               EXPRESSIONS section for more information.

       Output


           -A NUM, --after-context=NUM
               Print NUM lines after a matched expression.

           -B NUM, --before-context=NUM
               Print NUM lines before a matched expression.

           -C [NUM], -NUM, --context[=NUM]
               Output NUM lines of context around a matched expression. The
               default is no context. If no NUM is given for this option, the
               number of lines of context is 2.

           -c, --count
               Instead of normal output, display only the number of matches in
               each file.

           -F, --file-color COLOR
               Specify the highlight color for file names. See the HIGHLIGHT-
               ING section for the values that can be used.

           --no-filter
               Display the entire file(s), presumably with matches high-
               lighted.

           -g, --grep
               Produce output like the grep default: file names, no line num-
               bers, and a single line of the match, which will be the first
               line for matches that span multiple lines. If the EMACS envi-
               ronment variable is set, this value is set to true.  Thus, run-
               ning glark under Emacs results in the output format expected by
               Emacs.

           -h, --no-filename
               Do not display the names of the files that matched.

           -H, --with-filename
               Display the names of the files that matched. This is the
               default behavior.

           -l, --files-with-matches
               Print only the names of the file that matched the expression.

           -L, --files-without-match
               Print only the names of the file that did not match the expres-
               sion.

           --label=NAME
               Use NAME as output file name. This is useful when reading from
               standard input.

           -n, --line-number
               Display the line numbers. This is the default behavior.

           -N, --no-line-number
               Do not display the line numbers.

           --line-number-color
               Specify the highlight color for line numbers. This defaults to
               none (no highlighting). See the HIGHLIGHTING section for more
               information.

           -T, --text-color COLOR
               Specify the highlight color for text. See the HIGHLIGHTING sec-
               tion for more information.

           --text-color-NUM COLOR
               Specify the highlight color for the regular expression capture
               NUM. Colors are used by regular expressions in the order they
               are created (that is, with the "--and" and "--or" option), or
               with captures within a regular expression (such as
               '/(this)|(that)/'). is See the HIGHLIGHTING section for more
               information.

           -u, --highlight=[FORMAT]
               Enable highlighting. This is the default behavior. Format is
               "single" (one color) or "multi" (different color per regular
               expression). See the HIGHLIGHTING section for more information.

           -U, --no-highlight
               Disable highlighting.

           -y, --extract-matches
               Display only the region that matched, not the entire line. If
               the expression contains "backreferences" (i.e., expressions
               bounded by "( ... )"), then only the portion captured will be
               displayed, not the entire line. This option is useful with
               "-g", which eliminates the default highlighting and display of
               file names.

           -Z, --null
               When in -l mode, write file names followed by the ASCII NUL
               character ('\0') instead of '\n'.

       Debugging/Errors


           -?, --help
               Display the help message.

           --config
               Display the settings glark is using, and exit. Since this is
               run after configuration files are read, this may be useful for
               determining values of configuration parameters.

           --explain
               Write the expression in a more legible format, useful for
               debugging.

           -q, -s, --quiet, --no-messages
               Suppress warnings.

           -Q, --no-quiet
               Enable warnings. This is the default.

           -V, --version
               Display version information.

           --verbose
               Display normally suppressed output, for debugging purposes.


EXPRESSIONS

       Regular Expressions

       Regular expressions are expected to be in the Perl/Ruby format. "perl-
       doc perlre" has more general information. The expression may be of
       either form:

           something
           /something/

       There is no difference between the two forms, except that with the lat-
       ter, one can provide the "ignore case" modifier, thus matching "some-
       Thing" and "SoMeThInG":

           % glark /something/i

       Note that this is redundant with the "-i" ("--ignore-case") option.

       All regular expression characters and options are available, such as
       "\w", ".*?" and "[^9]". For example:

           % glark '\b[a-z][^\d]\d{1,3}.*\s*>>\s*\d+\s*.*& +\d{3}'

       If the and and or options are not used, the last non-option is consid-
       ered to be the expression to be matched. In the following, "printf" is
       used as the expression.

           % glark -w printf *.c

       POSIX character classes (e.g., [[:alpha:]]) are also supported.

       Complex Expressions

       Complex expressions combine regular expressions (and complex expres-
       sions themselves) with logical AND, OR, and XOR operators. Both prefix
       and infix notation is supported.

           -o expr1 expr2
           --or expr1 expr2 --end-of-or
           ( expr1 --or expr2 )
               Match either of the two expressions. The results of the two
               forms are equivalent. In the latter syntax, the --end-of-or is
               optional.

           -a number expr1 expr2
           --and=number expr1 expr2 --end-of-and
           ( expr1 --and number expr2 )
               Match both of the two expressions, within <number> lines of
               each other. As with the "or" option, the results of the two
               forms are equivalent, and the "--end-of-and" is optional. The
               forms "-aNUM" and "--and=NUM" are also supported.

               If the number provided is -1 (negative one), the distance is
               considered to be "infinite", and thus, the condition is satis-
               fied if both expressions match within the same file.

               If the number provided is 0 (zero), the condition is satisfied
               if both expressions match on the same line.

               If the --and option is used, and the follow argument is not
               numeric, then the value defaults to zero.

               A warning will be issued if the value given in the number posi-
               tion does not appear to be numeric.

           --xor expr1 expr2 --end-of-xor
           ( expr1 --xor expr2 )
               Match either of the two expressions, but not both.
               "--end-of-xor" is optional.

       Negated Regular Expressions

       Regular expressions can be negated, by being prefixed with '!', and
       using the '/' quote characters around the expression, such as:

           !/expr/

       This has the effect of "match anything other than this". For a single
       expression, this is no different than the -v/--invert-match option, but
       it can be useful in complex expressions, such as:

           --and 0 this '!/that/'

       which means "match and line that has "this", but not "that".


HIGHLIGHTING

       Matching patterns and file names can be highlighted using ANSI escape
       sequences.  Both the foreground and the background colors may be speci-
       fied, from the following:

           black
           blue
           cyan
           green
           magenta
           red
           white
           yellow

       The foreground may have any number of the following modifiers applied:

           blink
           bold
           concealed
           reverse
           underline
           underscore

       The format is "MODIFIERS FOREGROUND on BACKGROUND". For example:

           red
           black on yellow                    (the default for patterns)
           reverse bold                       (the default for file names)
           green on white
           bold underline red on cyan

       By default text is highlighted as black on yellow. File names are writ-
       ten in reversed bold text.


EXAMPLES

       Basic Usage


           % glark format *.h
               Searches for "format" in the local .h files.

           % glark --ignore-case format *.h
               Searches for "format" without regard to case. Short form:
                   % glark -i format *.h

           % glark --context=6 format *.h
               Produces 6 lines of context around any match for "format".
               Short forms:
                   % glark -C 6 format *.h
                   % glark -6 format *.h

           % glark --exclude-matching Object *.java
               Find references to "Object", excluding the files whose names
               match "Object".  Thus, SessionBean.java would be searched;
               EJBObject.java would not. Short form:
                   % glark -M Object *.java

           % glark --grep --extract-matches '\w+\.printStackTrace\(.*\)'
           *.java
               Show where exceptions are dumped. Note that the "--grep" option
               is used, thus turning off highlighting and display of file
               names. If the "--no-filename" option is used, the output will
               consist of only the matching portions. The short form of this
               command is:
                   % glark -gy '\w+\.printStackTrace\(.*\)' *.java

           % glark --grep --extract-matches '(\w+)\.printStackTrace\(.*\)'
           *.java
               Show only the variable name of exceptions that are dumped.
               Short form:
                   % glark -gy '(\w+)\.printStackTrace\(.*\)' *.java

           % who | glark -gy '^(\S+)\s+\S+\s*May 15'
               Display only the names of users who logged in today.

           % glark -l '\b\w{25,}\b' *.txt
               Display (only) the names of the text files that contain "words"
               at least 25 characters long.

           % glark --files-without-match '"\w+"'
               Display (only) the names of the files that do not contain
               strings consisting of a single word. Short form:
                   % glark -L '"\w+"'

           % for i in *.jar; do jar tvf $i | glark --LABEL=$i Exception; done
               Search the files for 'Exception', displaying the jar file name
               instead of the standard input marker ('-').

       Highlighting


           % glark --text-color "red on white" '\b[[:digit:]]{5}\b' *.c
               Display (in red text on a white background) occurrences of
               exactly 5 digits.  Short form:
                   % glark -T "red on white" '\b\d{5}\b' *.c

           See the HIGHLIGHTING section for valid colors and modifiers.

       Complex Expressions


           % glark --or format print *.h
               Searches for either "printf" or "format". Short form:
                   % glark -o format print *.h

           % glark --and 4 printf format *.c *.h
               Searches for both "printf" or "format" within 4 lines of each
               other. Short form:
                   % glark -a 4 printf format *.c *.h

           % glark --context=3 --and 0 printf format *.c
               Searches for both "printf" or "format" on the same line
               ("within 0 lines of each other"). Three lines of context are
               displayed around any matches. Short form:
                   % glark -3 -a 0 printf format *.c

           % glark -8 -i -a 15 -a 2 pthx '\.\.\.' -o 'va_\w+t' die *.c
               (In order of the options:) Produces 8 lines of context around
               case insensitive matches of ("phtx" within 2 lines of '...'
               (literal)) within 15 lines of (either "va_\w+t" or "die").

           % glark --and -1 '#define\s+YIELD' '#define\s+dTHR' *.h
               Looks for "#define\s+YIELD" within the same file (-1 == "infi-
               nite distance") of "#define\s+dTHR". Short form:
                   % glark -a -1 '#define\s+YIELD' '#define\s+dTHR' *.h

       Range Limiting


           % glark --before 50% cout *.cpp
               Find references to "cout", within the first half of the file.
               Short form:
                   % glark -b 50% cout *.cpp

           % glark --after 20 cout *.cpp
               Find references to "cout", starting at the 20th line in the
               file. Short form:
                   % glark -b 50% cout *.cpp

           % glark --range 20 50% cout *.cpp
               Find references to "cout", in the first half of the file, after
               the 20th line.  Short form:
                   % glark -R 20 50% cout *.cpp

       File Processing


           % glark -r print .
               Search for "print" in all files at and below the current direc-
               tory.

           % glark --fullname='/\.java$/' -r println org
               Search for "println" in all Java files at and below the "org"
               directory.

           % glark --basename='!/CVS/' -r '\b\d\d:\d\d:\d\d\b' .
               Search for a time pattern in all files without "CVS" in their
               basenames.

           % glark --size-limit=1024 -r main -r .
               Search for "main" in files no larger than 1024 bytes.


ENVIRONMENT

       GLARKOPTS
           A string of whitespace-delimited options. Due to parsing con-
           straints, should probably not contain complex regular expressions.

       $HOME/.glarkrc
           A resource file, containing name/value pairs, separated by either
           ':' or '='.  The valid fields of a .glarkrc file are as follows,
           with example values:

               after-context:     1
               before-context:    6
               context:           5
               file-color:        blue on yellow
               highlight:         off
               ignore-case:       false
               quiet:             yes
               text-color:        bold reverse
               line-number-color: bold
               verbose:           false
               grep:              true

           "yes" and "on" are synonymnous with "true". "no" and "off" signify
           "false".

           My ~/.glarkrc file is the following:

               file-color:   bold reverse
               text-color:   bold black on yellow
               context:      2
               highlight:    on
               verbose:      false
               ignore-case:  false
               quiet:        yes
               word:         false
               binary-files: without-match

       local .glarkrc
           See the local-config-files field below:

       Fields


       after-context
           See the "--after-context" option. For example, for 3 lines of con-
           text after the match:

               after-context: 3

       basename
           See the "--basename" option. For example, to omit Subversion direc-
           tories:

               basename: !/\.svn/

       before-context
           See the "--before-context" option. For example, for 7 lines of con-
           text before the match:

               before-context: 7

       binary-files
           See the "--binary-files" option. For example, to skip binary files:

               binary-files: without-match

       context
           See the "--context" option. For example, for 2 lines before and
           after matches:

               context: 2

       expression
           See the EXPRESSION section. Example:

               expression: --or '^\s*public\s+class\s+\w+' '^\s*\w+\(

       file-color
           See the "--file-color" option. For example, for white on black:

               file-color: white on black

       filter
           See the "--filter" option. For example, to show the entire file:

               filter: false

       fullname
           See the "--fullname" and "--basename" options. For example, to omit
           CVS files:

               fullname: !/\bCVS\b/

       grep
           See the "--grep" option. For example, to always run in grep mode:

               grep: true

       highlight
           See the "--highlight" option. To turn off highlighting:

               highlight: false

       ignore-case
           See the "--ignore-case" option. To make matching case-insensitive:

               ignore-case: true

       known-nontext-files
           The extensions of files that should be considered to always be non-
           text (binary).  If a file extension is not known, the file contents
           are examined for nontext characters. Thus, setting this field can
           result in faster searches. Example:

               known-nontext-files: class exe dll com

           See the Exclusion of Non-Text Files section in NOTES for the
           default settings.

       known-text-files
           The extensions of files that should be considered to always be
           text. See above for more. Example:

               known-text-files: ini bat xsl xml

           See the Exclusion of Non-Text Files section in NOTES for the
           default settings.

       local-config-files
           By default, glark uses only the configuration file ~/.glarkrc.
           Enabling this makes glark search upward from the current directory
           for the first .glarkrc file.

           This can be used, for example, in a Java project, where .class
           files are binary, versus a PHP project, where .class files are
           text:

               /home/me/.glarkrc

                   local-config-files: true

               /home/me/projects/java/.glarkrc

                   known-nontext-files: class

               /home/me/projects/php/.glarkrc

                   known-text-files: class

           With this configuration, .class files will automatically be treated
           as binary file in Java projects, and .class files will be treated
           as text. This can speed up searches.

           Note that the configuration file ~/.glarkrc is read first, so the
           local configuration file can override those settings.

       quiet
           See the "--quiet" option.

       show-break
           Whether to display breaks between sections, when displaying con-
           text. Example:

               show-break: true

           By default, this is false.

       text-color
           See the "--text-color" option. Example:

               text-color: bold blue on white

       verbose
           See the "--verbose" option. Example:

               verbose: true

       verbosity
           See the "--verbosity" option. Example:

               verbosity: 4


NOTES

       Exclusion of Non-Text Files

       Non-text files are automatically skipped, by taking a sample of the
       file and checking for an excessive number of non-ASCII characters. For
       speed purposes, this test is skipped for files whose suffixes are asso-
       ciated with text files:

           c
           cpp
           css
           h
           f
           for
           fpp
           hpp
           html
           java
           mk
           php
           pl
           pm
           rb
           rbw
           txt

       Similarly, this test is also skipped for files whose suffixes are asso-
       ciated with non-text (binary) files:

           Z
           a
           bz2
           elc
           gif
           gz
           jar
           jpeg
           jpg
           o
           obj
           pdf
           png
           ps
           tar
           zip

       See the "known-text-files" and "known-nontext-files" fields for denot-
       ing file name suffixes to associate as text or nontext.

       Exit Status

       The exit status is 0 if matches were found; 1 if no matches were found,
       and 2 if there was an error. An inverted match (the -v/--invert-match
       option) will result in 1 for matches found, 0 for none found.


SEE ALSO

       For regular expressions, the "perlre" man page.

       Mastering Regular Expressions, by Jeffrey Friedl, published by
       O'Reilly.


CAVEATS

       "Unbalanced" leading and trailing slashes will result in those slashes
       being included as characters in the regular expression. Thus, the fol-
       lowing pairs are equivalent:

           /foo        "/foo"
           /foo\/      "/foo/"
           /foo\/i     "/foo/i"
           foo/        "foo/"
           foo/        "foo/"

       The code to detect nontext files assumes ASCII, not Unicode.


AUTHOR

       Jeff Pace <jpace at incava dot org>


COPYRIGHT

       Copyright (c) 2006, Jeff Pace.

       All Rights Reserved. This module is free software. It may be used,
       redistributed and/or modified under the terms of the Lesser GNU Public
       License. See http://www.gnu.org/licenses/lgpl.html for more informa-
       tion.



glark 1.8.0                       2007-02-02                          glark(1)

glark 1.8.0 - Generated Sat Jun 28 11:23:41 CDT 2008
© manpagez.com 2000-2024
Individual documents may contain additional copyright information.