Leptonica 1.85.0
Image processing and image analysis suite
#include <string.h>
#include "allheaders.h"
Functions
| l_ok | compressFilesToPdf (SARRAY *sa, l_int32 onebit, l_int32 savecolor, l_float32 scalefactor, l_int32 quality, const char *title, const char *fileout) |
| l_ok | cropFilesToPdf (SARRAY *sa, l_int32 lr_clear, l_int32 tb_clear, l_int32 edgeclean, l_int32 lr_border, l_int32 tb_border, l_float32 maxwiden, l_int32 printwiden, const char *title, const char *fileout) |
| l_ok | cleanTo1bppFilesToPdf (SARRAY *sa, l_int32 res, l_int32 contrast, l_int32 rotation, l_int32 opensize, const char *title, const char *fileout) |
Image processing operations on multiple images followed by wrapping
them into a pdf.
There are two possible ways to specify the set of images:
(1) an array of pathnames
(2) a directory, typically with an additional pattern for selection.
We use (1) because it is both simpler and more general.
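To illustrate why (1) is the more general choice, here is a minimal sketch in plain C showing that a directory listing plus a selection pattern can always be reduced to a sorted array of pathnames. The helper name select_sorted_paths() and the sample paths are hypothetical and are not part of Leptonica; a real caller would hand the resulting pathnames to the functions below in an SARRAY.

```c
#include <stdlib.h>
#include <string.h>

/* Comparator for qsort() over an array of C strings. */
static int cmp_str(const void *a, const void *b)
{
    const char *sa = *(const char * const *)a;
    const char *sb = *(const char * const *)b;
    return strcmp(sa, sb);
}

/*
 * Hypothetical helper (not part of Leptonica): copy into out[] every
 * pointer from names[] whose string contains substr, then sort out[]
 * lexically.  A NULL or empty substr selects everything.
 * out[] must have room for n pointers.  Returns the number selected.
 */
static int select_sorted_paths(const char **names, int n,
                               const char *substr, const char **out)
{
    int i, count = 0;

    for (i = 0; i < n; i++) {
        if (!substr || !*substr || strstr(names[i], substr))
            out[count++] = names[i];
    }
    qsort(out, (size_t)count, sizeof(out[0]), cmp_str);
    return count;
}
```

Once the selection has been reduced to an array, the caller is free to build that array in any other way (an explicit list, a merge of several directories), which is the generality that (1) provides.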
Corresponding to each function here is:
(1) the image processing function that is carried out on each image
(2) a program in prog that extracts images from a pdf and calls this
function with an array of their pathnames.
|=============================================================|
| Important notes |
|=============================================================|
| Some of these functions require I/O libraries such as |
| libtiff, libjpeg, libpng and libz. If you do not have |
| these libraries, some calls will fail. For example, |
| if you do not have libtiff, you cannot write a pdf that |
| uses libtiff to encode bilevel images in tiffg4. |
| |
| You can manually deactivate all pdf writing by setting |
| this in environ.h: |
| |
| #define USE_PDFIO 0 |
| |
| This will link the stub file pdfappstub.c. |
|=============================================================|
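The effect of such a switch can be sketched with a compile-time flag that selects between a working function and a stub with the same signature. This is only an illustration of the pattern: in the library the choice is made between whole source files (pdfapp.c or pdfappstub.c), whereas here both variants sit in one file, and the name demo_files_to_pdf() is hypothetical.

```c
#include <stdio.h>

/* Assumption: treat the flag as enabled when the build does not set it. */
#ifndef USE_PDFIO
#define USE_PDFIO 1
#endif

#if USE_PDFIO
/* Working variant: stands in for code that would write a pdf.
 * Returns 0 on success and 1 on error. */
static int demo_files_to_pdf(const char *fileout)
{
    if (!fileout)
        return 1;
    return 0;
}
#else
/* Stub variant: same signature, always reports failure. */
static int demo_files_to_pdf(const char *fileout)
{
    (void)fileout;
    fprintf(stderr, "pdf writing is disabled in this build\n");
    return 1;
}
#endif
```

Because the stub keeps the same signature, callers still compile and link when pdf writing is disabled; they simply receive an error return at run time.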
The images in the pdf file can be rendered using a pdf viewer,
such as evince, gv, xpdf or acroread.
Compression of images for prog/compresspdf
l_ok compressFilesToPdf()
Crop images for prog/croppdf
l_ok cropFilesToPdf()
Cleanup and binarization of images for prog/cleanpdf
l_ok cleanTo1bppFilesToPdf()
Definition in file pdfapp.c.
l_ok cleanTo1bppFilesToPdf(SARRAY      *sa,
                           l_int32      res,
                           l_int32      contrast,
                           l_int32      rotation,
                           l_int32      opensize,
                           const char  *title,
                           const char  *fileout)
| [in] | sa | sorted full pathnames of images |
| [in] | res | either 300 or 600 ppi for output |
| [in] | contrast | vary contrast: 1 = lightest; 10 = darkest; suggest 1 unless light features are being lost |
| [in] | rotation | cw by 90 degrees: {0,1,2,3} represent 0, 90, 180 and 270 degree cw rotations |
| [in] | opensize | opening size of structuring element for noise removal: {0 or 1 to skip; 2, 3 for opening} |
| [in] | title | [optional] pdf title; can be null |
| [in] | fileout | pdf file of all images |
Notes:
(1) This deskews, optionally rotates and darkens, cleans background
to white, binarizes and optionally removes small noise, and
puts the images into the pdf in the order given in sa.
(2) All images in the pdf are tiffg4 encoded.
(3) For color and grayscale input, local background normalization is
done to 200, and a threshold of 180 sets the maximum foreground
value in the normalized image.
(4) The res parameter can be either 300 or 600 ppi. If the input
is gray or color and res = 600, this does an interpolated 2x
expansion before binarizing.
(5) The contrast parameter adjusts the binarization to avoid losing
lighter input pixels. Contrast is increased as contrast increases
from 1 to 10.
(6) The opensize parameter is the size of a square SEL used with
opening to remove small speckle noise. Allowed open sizes are 2 and 3.
If this is to be used, try 2 before 3.
(7) If there are more than 200 images, store the images after processing
as an array of compressed images (a Pixac); otherwise, use a Pixa.
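The documented parameter ranges can be checked before calling the function. The following is a minimal sketch in plain C, not Leptonica code: all four helper names are hypothetical, and treating out-of-range values as invalid is an assumption (the library may instead substitute defaults). Equating "gray or color" with a depth above 1 bpp is also a simplification.

```c
/*
 * Hypothetical validator for the documented ranges of
 * cleanTo1bppFilesToPdf().  Returns 1 if every value is in range.
 * Assumption: out-of-range values are simply rejected here.
 */
static int clean_params_in_range(int res, int contrast,
                                 int rotation, int opensize)
{
    if (res != 300 && res != 600) return 0;       /* ppi for output */
    if (contrast < 1 || contrast > 10) return 0;  /* 1 lightest, 10 darkest */
    if (rotation < 0 || rotation > 3) return 0;   /* multiples of 90 deg cw */
    if (opensize < 0 || opensize > 3) return 0;   /* 0, 1 skip; 2, 3 open */
    return 1;
}

/* Clockwise rotation in degrees for rotation in {0,1,2,3}. */
static int rotation_degrees(int rotation)
{
    return 90 * rotation;
}

/* Opening is applied only for sizes 2 and 3; 0 and 1 skip it. */
static int opening_enabled(int opensize)
{
    return opensize == 2 || opensize == 3;
}

/* Note (4): gray or color input at res = 600 is expanded 2x before
 * binarizing.  Assumption: "gray or color" is modeled as depth > 1 bpp. */
static int needs_2x_expansion(int res, int depth_bpp)
{
    return res == 600 && depth_bpp > 1;
}
```

For example, res = 300, contrast = 1, rotation = 0 and opensize = 0 passes the check and requests no rotation, no opening and no expansion.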
Definition at line 384 of file pdfapp.c.
References L_CLONE, L_G4_ENCODE, L_INSERT, and L_NOCOPY.
l_ok compressFilesToPdf(SARRAY      *sa,
                        l_int32      onebit,
                        l_int32      savecolor,
                        l_float32    scalefactor,
                        l_int32      quality,
                        const char  *title,
                        const char  *fileout)
| [in] | sa | sorted full pathnames of images |
| [in] | onebit | set to 1 to enforce 1 bpp tiffg4 encoding |
| [in] | savecolor | if onebit == 1, set to 1 to save color |
| [in] | scalefactor | scaling factor applied to each image; > 0.0 |
| [in] | quality | for jpeg: 0 for default (50); otherwise 25 - 95 |
| [in] | title | [optional] pdf title; can be null |
| [in] | fileout | pdf file of all images |
Notes:
(1) This function is designed to optionally scale and compress a set of
images, wrapping them in a pdf in the order given in the input sa.
(2) It does the image processing for prog/compresspdf.c.
(3) Images in the output pdf are encoded with either tiffg4 or jpeg (DCT),
or a mixture of them depending on parameters onebit and savecolor.
(4) Parameters onebit and savecolor work as follows:
onebit = 0: no depth conversion, default encoding depends on depth
onebit = 1, savecolor = 0: all images converted to 1 bpp
onebit = 1, savecolor = 1: images without color are converted
to 1 bpp; images with color have the color preserved.
(5) In use, if most of the pages are 1 bpp but some have color that needs
to be preserved, onebit and savecolor should both be 1. This
causes DCT compression of color images and tiffg4 compression
of monochrome images.
(6) The images will be concatenated in the order given in sa.
(7) Typically, scalefactor <= 1.0. It is applied to each image
before encoding. If you enter a value <= 0.0, it will be set to 1.0.
The maximum allowed value is 2.0.
(8) Default jpeg quality is 50; otherwise, quality factors between
25 and 95 are enforced.
(9) Page images at 300 ppi are about 8 Mpixels. RGB(A) rasters are
then about 32 MB (1 bpp images are about 1 MB). If there are
more than 25 images, store the images after processing as an
array of compressed images (a Pixac); otherwise, use a Pixa.
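Notes (4), (7) and (8) amount to a small set of rules that can be sketched in plain C. This is an illustration of the documented behavior, not the library's code: the names OutputKind, normalize_scalefactor(), normalize_quality() and choose_output() are hypothetical, and clamping values that fall outside the allowed range (rather than rejecting them) is an assumption.

```c
/* What happens to one image, following note (4). */
typedef enum {
    OUT_KEEP_DEPTH,   /* onebit = 0: no depth conversion */
    OUT_1BPP_G4,      /* converted to 1 bpp, tiffg4 encoded */
    OUT_COLOR_DCT     /* color preserved, jpeg (DCT) encoded */
} OutputKind;

/* Note (7): values <= 0.0 become 1.0; the maximum allowed is 2.0.
 * Assumption: larger values are clamped to 2.0. */
static float normalize_scalefactor(float s)
{
    if (s <= 0.0f) return 1.0f;
    if (s > 2.0f) return 2.0f;
    return s;
}

/* Note (8): 0 selects the default of 50; otherwise 25..95 is enforced.
 * Assumption: "enforced" means clamping to the nearest limit. */
static int normalize_quality(int q)
{
    if (q == 0) return 50;
    if (q < 25) return 25;
    if (q > 95) return 95;
    return q;
}

/* Notes (4) and (5): decide the output form for one image. */
static OutputKind choose_output(int onebit, int savecolor, int has_color)
{
    if (!onebit) return OUT_KEEP_DEPTH;
    if (savecolor && has_color) return OUT_COLOR_DCT;
    return OUT_1BPP_G4;
}
```

With onebit = 1 and savecolor = 1, a scanned text page without color maps to OUT_1BPP_G4 and a page with a color figure maps to OUT_COLOR_DCT, which is the mixed case described in note (5).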
Definition at line 131 of file pdfapp.c.
References L_CLONE, L_DEFAULT_ENCODE, L_INSERT, and L_NOCOPY.
l_ok cropFilesToPdf(SARRAY      *sa,
                    l_int32      lr_clear,
                    l_int32      tb_clear,
                    l_int32      edgeclean,
                    l_int32      lr_border,
                    l_int32      tb_border,
                    l_float32    maxwiden,
                    l_int32      printwiden,
                    const char  *title,
                    const char  *fileout)
| [in] | sa | sorted full pathnames of images |
| [in] | lr_clear | full res pixels cleared at left and right sides |
| [in] | tb_clear | full res pixels cleared at top and bottom sides |
| [in] | edgeclean | parameter for removing edge noise (-2 to 15): default = 0 (no removal); 15 is maximally aggressive for random noise; -1 for aggressively removing side noise; -2 to extract page embedded in black background |
| [in] | lr_border | full res final "added" pixels on left and right |
| [in] | tb_border | full res final "added" pixels on top and bottom |
| [in] | maxwiden | max fractional horizontal stretch allowed |
| [in] | printwiden | 0 to skip, 1 for 8.5x11, 2 for A4 |
| [in] | title | [optional] pdf title; can be null |
| [in] | fileout | pdf file of all images |
Notes:
(1) This function is designed to optionally remove white space from
around the page images, and generate a pdf that prints with
foreground occupying much of the full page.
(2) It does the image processing for prog/croppdf.c.
(3) Images in the output pdf are 1 bpp and encoded with tiffg4.
(4) See documentation in pixCropImage() for details on the processing.
(5) The images will be concatenated in the order given in sa.
(6) Output page images are at 300 ppi and are stored in memory.
They are about 1 MB when uncompressed. For up to 200 pages,
the images are stored uncompressed; otherwise, the stored
images are compressed with tiffg4.
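The memory figures behind this storage choice can be reproduced with a little arithmetic. The sketch below is plain C and makes several assumptions: a US Letter page (8.5 x 11 inches), no row padding in the raster, and hypothetical helper names that are not part of Leptonica.

```c
#include <stddef.h>

/* Pixels in a page of the given size (inches) at the given resolution. */
static size_t page_pixels(double width_in, double height_in, int ppi)
{
    size_t w = (size_t)(width_in * ppi + 0.5);
    size_t h = (size_t)(height_in * ppi + 0.5);
    return w * h;
}

/* Uncompressed raster size in bytes.
 * Assumption: rows are not padded to a word boundary. */
static size_t raster_bytes(size_t npixels, int bits_per_pixel)
{
    return (npixels * (size_t)bits_per_pixel + 7) / 8;
}

/* Returns 1 if processed pages should be held compressed in memory,
 * i.e. when the page count exceeds the threshold. */
static int store_compressed(int npages, int threshold)
{
    return npages > threshold;
}
```

A Letter page at 300 ppi is 2550 x 3300 = 8,415,000 pixels, so a 1 bpp raster takes 1,051,875 bytes (about 1 MB) and a 32 bpp raster takes 33,660,000 bytes. Holding 200 uncompressed 1 bpp pages therefore costs roughly 200 MB, which is why larger sets are kept compressed.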
Definition at line 270 of file pdfapp.c.
References L_CLONE, L_G4_ENCODE, L_INSERT, and L_NOCOPY.