djvutxt(1) DjVuLibre-3.5 djvutxt(1)
NAME
djvutxt - Extract the hidden text from DjVu documents.
SYNOPSIS
djvutxt [options] inputdjvufile [outputtxtfile]
DESCRIPTION
Program djvutxt decodes the hidden text layer of a DjVu document
inputdjvufile and prints it into file outputtxtfile or, when no output
file is given, on the standard output. The hidden text layer is usually
generated with optical character recognition software.
Without options --detail and --escape, this program simply outputs the
UTF-8 text. Option --detail causes the output of S-expressions
describing the text and its location. Option --escape uses C-style
escape sequences to represent non-printable or non-ASCII characters.
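The invocation described above can be driven from a script. The following is a minimal sketch, assuming djvutxt is installed and on the PATH. The helper names build_djvutxt_command and extract_text, and the file name book.djvu, are hypothetical and not part of DjVuLibre. Only the options documented in this page are used.

```python
import subprocess


def build_djvutxt_command(input_file, output_file=None, pages=None,
                          detail=None, escape=False):
    """Return the argument list for a djvutxt call (hypothetical helper)."""
    cmd = ["djvutxt"]
    if pages is not None:
        cmd.append(f"--page={pages}")      # e.g. "1-10" or "1,3"
    if detail is not None:
        cmd.append(f"--detail={detail}")   # e.g. "word" or "line"
    if escape:
        cmd.append("--escape")
    cmd.append(input_file)
    if output_file is not None:
        cmd.append(output_file)            # omitted: text goes to stdout
    return cmd


def extract_text(input_file, **options):
    """Run djvutxt and return its standard output decoded as UTF-8."""
    cmd = build_djvutxt_command(input_file, **options)
    result = subprocess.run(cmd, capture_output=True, check=True)
    return result.stdout.decode("utf-8")


if __name__ == "__main__":
    # Shows the command line without running it.
    print(build_djvutxt_command("book.djvu", pages="1-10", detail="word"))
```

Building the argument list separately from running it keeps the command easy to inspect before anything is executed.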
OPTIONS
--page=pagespec
Specify which pages should be processed. When this option is
not specified, the text of all pages of the document is
concatenated into the output file. The page specification pagespec
contains one or more comma-separated page ranges. A page range
is either a page number, or two page numbers separated by a
dash. For instance, specification 1-10 outputs pages 1 to 10,
and specification 1,3,99999-4 outputs pages 1 and 3, followed by
all the document pages in reverse order, from the last page down
to page 4.
--detail=keyword
This option causes djvutxt to output S-expressions specifying
the position of the text in the page. See the manual page
djvused(1) for a description of the output format. Argument
keyword specifies the maximum level of detail for which text
location is reported. The recognized values are: page, column,
region, para, line, word, and char. All other values are
interpreted as char.
--escape
Output escape sequences of the form "\ooo" for all non-ASCII or
non-printable UTF-8 characters and for the backslash character.
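The page specification and escaping rules above can be modeled in a few lines. This is an illustrative sketch only, not the parser or encoder used by djvutxt. The function names expand_pagespec and escape_text are hypothetical. It assumes that page numbers beyond the end of the document are clamped to the last page (inferred from the 99999-4 example), that escapes are a backslash followed by three octal digits for each UTF-8 byte, and that every control character is escaped. The real program may treat line breaks and out-of-range single pages differently.

```python
def expand_pagespec(pagespec, page_count):
    """Expand a spec such as "1,3,99999-4" into a list of page numbers.

    Assumption: numbers larger than page_count are clamped to page_count.
    """
    pages = []
    for part in pagespec.split(","):
        first, dash, last = part.strip().partition("-")
        start = min(int(first), page_count)
        end = min(int(last), page_count) if dash else start
        step = 1 if start <= end else -1   # a reversed range counts down
        pages.extend(range(start, end + step, step))
    return pages


def escape_text(text):
    """Escape non-printable bytes, non-ASCII bytes, and the backslash.

    Assumption: each such UTF-8 byte becomes a three-digit octal escape.
    """
    out = []
    for byte in text.encode("utf-8"):
        printable_ascii = 0x20 <= byte < 0x7F
        if printable_ascii and byte != 0x5C:   # 0x5C is the backslash
            out.append(chr(byte))
        else:
            out.append("\\%03o" % byte)
    return "".join(out)


if __name__ == "__main__":
    print(expand_pagespec("1,3,99999-4", 6))
    print(escape_text("caf\u00e9"))
```

For a six-page document, the specification 1,3,99999-4 expands under these assumptions to pages 1 and 3, then 6, 5, and 4.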
REMARKS
Use program djvused(1) for more control over the text layer.
CREDITS
This program was initially written by Andrei Erofeev
<andrew_erofeev@yahoo.com> and was then improved by Bill Riemers
<docbill@sourceforge.net> and many others. It was then rewritten to
use the ddjvuapi by Leon Bottou <leonb@sourceforge.net>.
SEE ALSO
djvu(1), djvused(1)
DjVuLibre-3.5 10/11/2001 djvutxt(1)
