djvutxt(1) DjVuLibre-3.5 djvutxt(1)
NAME
djvutxt - Extract the hidden text from DjVu documents.
SYNOPSIS
djvutxt [options] inputdjvufile [outputtxtfile]
DESCRIPTION
Program djvutxt decodes the hidden text layer of the DjVu document inputdjvufile and writes it to the file outputtxtfile, or to the standard output when no output file is given. The hidden text layer is usually generated with optical character recognition software.

Without options -detail and -escape, this program simply outputs the UTF-8 text. Option -detail causes the program to output S-expressions describing the text and its location. Option -escape uses C-style escape sequences to represent non-printable or non-ASCII characters.
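The invocation described above can be scripted. The following Python sketch assembles a djvutxt command line from the options documented under OPTIONS. The helper names build_djvutxt_command and extract_text are hypothetical and are not part of DjVuLibre. Rejecting unknown detail keywords is a choice made by this wrapper to catch typos; djvutxt itself interprets them as char.

```python
import subprocess
from typing import List, Optional

# Detail keywords listed in the OPTIONS section.
DETAIL_LEVELS = ("page", "column", "region", "para", "line", "word", "char")


def build_djvutxt_command(
    input_file: str,
    output_file: Optional[str] = None,
    pages: Optional[str] = None,
    detail: Optional[str] = None,
    escape: bool = False,
) -> List[str]:
    """Return an argument list following: djvutxt [options] input [output]."""
    cmd = ["djvutxt"]
    if pages:
        cmd.append(f"--page={pages}")
    if detail:
        # Wrapper-side check (an assumption of this sketch, not djvutxt behavior).
        if detail not in DETAIL_LEVELS:
            raise ValueError(f"unknown detail level: {detail!r}")
        cmd.append(f"--detail={detail}")
    if escape:
        cmd.append("--escape")
    cmd.append(input_file)
    if output_file:
        cmd.append(output_file)
    return cmd


def extract_text(input_file: str, pages: Optional[str] = None) -> str:
    """Run djvutxt (must be installed and on PATH) and return the UTF-8 text."""
    result = subprocess.run(
        build_djvutxt_command(input_file, pages=pages),
        check=True,
        capture_output=True,
    )
    return result.stdout.decode("utf-8")
```

For instance, extract_text("book.djvu", pages="1-10") would return the text of the first ten pages of a hypothetical file book.djvu, provided that djvutxt is installed.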
OPTIONS
--page=pagespec
       Specify which pages should be processed. When this option is not specified, the text of all pages of the document is concatenated into the output file. The page specification pagespec contains one or more comma-separated page ranges. A page range is either a page number, or two page numbers separated by a dash. For instance, specification 1-10 outputs pages 1 to 10, and specification 1,3,99999-4 outputs pages 1 and 3, followed by all the document pages in reverse order down to page 4.

--detail=keyword
       This option causes djvutxt to output S-expressions specifying the position of the text in the page. See the manual page djvused(1) for a description of the output format. Argument keyword specifies the maximum level of detail for which text location is reported. The recognized values are: page, column, region, para, line, word, and char. All other values are interpreted as char.

--escape
       Output escape sequences of the form "\ooo" for all non-ASCII or non-printable UTF-8 characters and for the backslash character.
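The page specification rules can be illustrated with a short Python sketch that expands a pagespec into the list of pages it selects. This is not the DjVuLibre implementation. The function name expand_pagespec is hypothetical, and clamping every out-of-range number to the document bounds is an assumption inferred from the 99999-4 example above.

```python
from __future__ import annotations


def expand_pagespec(spec: str, page_count: int) -> list[int]:
    """Expand a page specification such as "1,3,99999-4" into page numbers."""
    if page_count < 1:
        raise ValueError("page_count must be at least 1")
    pages: list[int] = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            raise ValueError(f"empty page range in {spec!r}")
        # A range is a single number, or two numbers separated by a dash.
        first, dash, last = part.partition("-")
        if not dash:
            last = first
        # Assumption: numbers outside the document are clamped to its bounds.
        start = min(max(int(first), 1), page_count)
        end = min(max(int(last), 1), page_count)
        # A range whose first number is larger runs in reverse order.
        step = 1 if start <= end else -1
        pages.extend(range(start, end + step, step))
    return pages


if __name__ == "__main__":
    # For a six-page document: pages 1 and 3, then 6 down to 4.
    print(expand_pagespec("1,3,99999-4", 6))
```

Under these assumptions, the specification 1,3,99999-4 applied to a six-page document selects pages 1, 3, 6, 5, 4, in that order.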
REMARKS
Use program djvused(1) for more control over the text layer.
CREDITS
This program was initially written by Andrei Erofeev <andrew_erofeev@yahoo.com> and was then improved by Bill Riemers <docbill@sourceforge.net> and many others. It was then rewritten to use the ddjvuapi by Leon Bottou <leonb@sourceforge.net>.
SEE ALSO
djvu(1), djvused(1)

DjVuLibre-3.5 10/11/2001 djvutxt(1)
djvulibre 3.5.27 - Generated Sat Mar 14 18:32:32 CDT 2015