Glossary of Terms

Electronic Copy Center Data Management is a leading specialist in document scanning and data conversion.
An optically read storage medium developed by Philips in the 1970s.
Computer Output to Laser Disk. A computer programming process that outputs electronic records and printed reports to laser disk instead of a printer. Can be used to replace COM (Computer Output to Microfilm) or printed reports.
The re-encoding of data to make the file size smaller. Most image file formats use compression because image files tend to be large and consume large amounts of disk space and transmission time over networks.
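As a rough illustration of re-encoding data to make it smaller, the sketch below uses Python's standard zlib module. The sample bytes are a made-up stand-in for a scanned page, not real image data, and zlib is only one of many possible compressors.

```python
import zlib

# Hypothetical stand-in for a scanned page: long runs of identical bytes,
# much like the white background of a black and white document image.
raw = bytes([255]) * 10_000 + bytes([0]) * 200 + bytes([255]) * 10_000

compressed = zlib.compress(raw)         # re-encode the data in a smaller form
restored = zlib.decompress(compressed)  # reverse the re-encoding

print(len(raw), "bytes before,", len(compressed), "bytes after")
print("identical after round trip:", restored == raw)
```

Because the data is highly repetitive, the compressed form is far smaller than the original, which is why mostly white document scans compress so well.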
Transformation of data from one format to another, e.g. paper to an electronic file or microfilm to an electronic file.
A generic term for collated information.
The process of straightening skewed (tilted) images. De-skewing is one of the image enhancements that can improve OCR accuracy. Documents often become skewed when they are scanned or faxed.
Recorded information, or an object, that can be treated as a unit.
Software used to store, manage, retrieve and distribute documents quickly and easily on the computer or via a network or intranet.
To transmit a file from one computer to another. Usually implies retrieving a file from a remote computer to a local one, or from a large computer to a smaller one.
Refers to scanners that scan both sides of a page on a single pass through the scanner.
Also known as printer resolution. It is usually proportional to image resolution: the more dots per inch, the finer the resolution.
An optical storage medium that can store up to 4.7 Gigabytes (single layer), 8.5 GB (double layer), 9.4 GB (double sided, single layer), or 17 GB (double sided, double layer). Transfer rates and seek times are similar to those of CD-ROM for currently available drives. The DVD spec includes higher level specs for audio and video capabilities.
A document that has been scanned, or was originally created on a computer. Documents become more useful when stored electronically because they can be widely distributed instantly, and allow searching. HTML and PDF are well known electronic document formats.
Imaging software that helps manage electronic documents.
Computer data saved to be accessed by specific programs with specific information.
Enables the retrieval of documents by the words or phrases they contain. Every word in the document is indexed into a master word list with pointers to the documents and pages where each occurrence of the word appears.
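The master word list described here is often called an inverted index. A minimal sketch follows, assuming each document is already available as a list of page texts; the file names and the `build_index` and `search` helpers are hypothetical, chosen only for illustration.

```python
import re
from collections import defaultdict

def build_index(documents):
    """Map each word to the set of (document, page) pairs where it occurs."""
    index = defaultdict(set)
    for doc_name, pages in documents.items():
        for page_number, text in enumerate(pages, start=1):
            for word in re.findall(r"[a-z0-9]+", text.lower()):
                index[word].add((doc_name, page_number))
    return index

def search(index, word):
    """Return the sorted (document, page) pointers for a single word."""
    return sorted(index.get(word.lower(), set()))

# Hypothetical documents, each given as a list of page texts.
documents = {
    "invoice.pdf": ["Invoice for scanning services", "Payment due in 30 days"],
    "contract.pdf": ["Scanning and data conversion contract"],
}

index = build_index(documents)
print(search(index, "scanning"))  # [('contract.pdf', 1), ('invoice.pdf', 1)]
```

A phrase search could be layered on top by also recording word positions, which this sketch leaves out for brevity.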
A compression technique used in CCITT Fax Group 4. It produces very good results for black and white, and is frequently used as an option in TIFF files for black and white images. It is also used in Adobe Acrobat (PDF) files.
An image file format that is commonly used on the web. It uses LZW compression, which makes it good for colour and greyscale images, but it does not compress as well as G4 for black and white. LZW is "lossless" which means it will not compress as well as JPEG, but will retain all of the image's quality.
An image type that uses black, white, and a range of shades of grey. The number of shades of grey depends on the number of bits per pixel. The larger the number of shades of grey, the better the image will look, and the larger the file will be.
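The link between bits per pixel, shades of grey and file size can be worked through with a little arithmetic. The sketch below assumes an uncompressed image of 1000 x 1000 pixels, a size picked purely for illustration; the function names are hypothetical.

```python
def grey_shades(bits_per_pixel):
    """Number of distinct grey levels that fit in the given bit depth."""
    return 2 ** bits_per_pixel

def uncompressed_bytes(width_px, height_px, bits_per_pixel):
    """Raw storage needed before any compression is applied."""
    return width_px * height_px * bits_per_pixel // 8

for bits in (1, 4, 8):
    print(bits, "bits per pixel:",
          grey_shades(bits), "shades,",
          uncompressed_bytes(1000, 1000, bits), "bytes")
# 1 bits per pixel: 2 shades, 125000 bytes
# 4 bits per pixel: 16 shades, 500000 bytes
# 8 bits per pixel: 256 shades, 1000000 bytes
```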
The digitised representation of a picture, graphic or document.
When a page is scanned, the page can be stored in a number of file types. The type should be chosen based on the desired use of the image, and the software that will be utilised. Different file formats commonly use different methods of compression as well, and some types of images compress better using some formats rather than others.
Free Document retrieval and viewing software provided by Electronic Copy Center.
A form of data entry that creates a linked database from alphanumeric input. A search of the indexed data will retrieve the relevant scanned document.
Database fields used to categorize and organize documents. Often user-defined, these fields can be used for searches.
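A small sketch of how user-defined index fields might be used to retrieve scanned documents. The field names, values and file names are invented for illustration, and a real system would normally hold them in a database rather than a Python list.

```python
# Hypothetical index records: each links user-defined fields to a scanned file.
records = [
    {"customer": "Smith", "year": 2019, "type": "invoice", "file": "scan_0001.tif"},
    {"customer": "Smith", "year": 2020, "type": "contract", "file": "scan_0002.tif"},
    {"customer": "Jones", "year": 2020, "type": "invoice", "file": "scan_0003.tif"},
]

def find(records, **criteria):
    """Return the files whose index fields match every given criterion."""
    return [
        record["file"]
        for record in records
        if all(record.get(field) == value for field, value in criteria.items())
    ]

print(find(records, customer="Smith"))             # ['scan_0001.tif', 'scan_0002.tif']
print(find(records, type="invoice", year=2020))    # ['scan_0003.tif']
```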
Specialized applications used for communication between scanners and computers.
A "lossless" image compression format for binary (black and white) images. Compresses better than G4 by up to 25 percent. Also supports progressive encoding. Licensing issues have slowed its adoption for use.
A "lossy" image compression format for binary (black and white) images. A JBIG2 compressor identifies common objects (usually characters) in the image and creates a dictionary with references to those objects. Lossiness is induced by allowing similar objects to be represented by a single dictionary entry. This format is supported in PDF 1.4 and greater.
An image file format that is best suited for photographs. It is "lossy", which means that it will throw away some detail in order to achieve better compression. It does not work well for text.
Page sizes are manufactured to pre-determined specifications and given ISO numbers. A3, A2, A1 and A0 are classified as large format paper sizes.
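The ISO A-series sizes can be derived from one another: each size is the previous one cut in half across its longer side, with dimensions rounded down to the whole millimetre. A short sketch, starting from the A0 dimensions of 841 x 1189 mm (the function name is hypothetical):

```python
def a_series_mm(n):
    """Return (width, height) in millimetres for ISO size A<n>."""
    width, height = 841, 1189  # A0
    for _ in range(n):
        # Halve the longer side; the old width becomes the new height.
        width, height = height // 2, width
    return width, height

for n in range(5):
    print(f"A{n}:", a_series_mm(n))
# A0: (841, 1189)
# A1: (594, 841)
# A2: (420, 594)
# A3: (297, 420)
# A4: (210, 297)
```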
Refers to a data compression algorithm that actually reduces the amount of information in the data, rather than just the number of bits used to represent that information. The lost information is usually removed because it is subjectively less important to the quality of the data (usually an image or sound) or because it can be recovered reasonably by interpolation from the remaining data.
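One simple way to see information actually being discarded is quantisation, where nearby values are collapsed into a single representative value. The sketch below is only a toy illustration of lossiness, not how any particular image format works, and the pixel values are made up.

```python
def quantise(pixels, levels):
    """Collapse 8-bit values onto a smaller number of levels (lossy)."""
    step = 256 // levels
    # Each value is replaced by the midpoint of the band it falls in.
    return [(p // step) * step + step // 2 for p in pixels]

original = [12, 13, 14, 200, 201, 203]  # hypothetical greyscale pixel values
lossy = quantise(original, 4)

print(lossy)                 # [32, 32, 32, 224, 224, 224]
print(len(set(original)), "distinct values before,", len(set(lossy)), "after")
```

The fine differences between neighbouring pixels cannot be recovered from the quantised values, but the result now contains long runs that a later compression step can store very compactly.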
Describes an image-compression method that retains all image detail. (See "LZW compression").
Abbreviation for Lempel-Ziv-Welch, a standard algorithm widely used for compression of data.
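For readers who want to see the idea in action, here is a minimal sketch of textbook LZW working on bytes. It emits plain integer codes and omits the variable-width bit packing, clear codes and dictionary size limits that real file formats such as GIF and TIFF add on top.

```python
def lzw_compress(data):
    """Encode bytes as a list of integer codes using a growing dictionary."""
    dictionary = {bytes([i]): i for i in range(256)}
    next_code = 256
    current = b""
    codes = []
    for byte in data:
        candidate = current + bytes([byte])
        if candidate in dictionary:
            current = candidate              # keep extending the match
        else:
            codes.append(dictionary[current])
            dictionary[candidate] = next_code  # learn the new sequence
            next_code += 1
            current = bytes([byte])
    if current:
        codes.append(dictionary[current])
    return codes

def lzw_decompress(codes):
    """Rebuild the original bytes, reconstructing the same dictionary."""
    if not codes:
        return b""
    dictionary = {i: bytes([i]) for i in range(256)}
    next_code = 256
    previous = dictionary[codes[0]]
    output = [previous]
    for code in codes[1:]:
        if code in dictionary:
            entry = dictionary[code]
        elif code == next_code:
            # Code refers to the sequence that is being defined right now.
            entry = previous + previous[:1]
        else:
            raise ValueError("invalid LZW code")
        output.append(entry)
        dictionary[next_code] = previous + entry[:1]
        next_code += 1
        previous = entry
    return b"".join(output)

message = b"TOBEORNOTTOBEORTOBEORNOT"
codes = lzw_compress(message)
print(len(message), "bytes in,", len(codes), "codes out")
print(lzw_decompress(codes) == message)  # True: nothing was lost
```

The decoder never needs the dictionary to be transmitted, because it can rebuild it from the codes alone; that property is what makes the scheme practical.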
When a page is scanned, the page is initially stored as an image only, and the computer does not identify the image as text. Optical Character Recognition is a process that produces a page of text from an image file. It is usually only accurate in the mid 90% range, and must be corrected by a proofer for most applications except for text searching.
Adobe's Portable Document Format. The term Adobe uses to describe Acrobat files. (See Acrobat)
Picture Element. A single dot in an image. It can be black and white, greyscale or colour.
The number of dots per inch (dpi) that were stored during scanning. The greater the number, the greater the amount of detail that is visible. It is recommended that you use between 72 and 100 dpi for images that will be displayed on the screen, and 300 dpi for images that will print on common inexpensive printers. Higher resolution images take up more space as well.
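To get a feel for how resolution affects image size, the sketch below works out the pixel dimensions of a scanned page at two of the resolutions mentioned. The 8.5 x 11 inch page size is an assumption chosen for the example, and the function name is hypothetical.

```python
def scanned_pixels(width_in, height_in, dpi):
    """Pixel dimensions of a page scanned at the given dots per inch."""
    return round(width_in * dpi), round(height_in * dpi)

for dpi in (72, 300):
    width_px, height_px = scanned_pixels(8.5, 11, dpi)
    print(dpi, "dpi:", width_px, "x", height_px, "=", width_px * height_px, "pixels")
# 72 dpi: 612 x 792 = 484704 pixels
# 300 dpi: 2550 x 3300 = 8415000 pixels
```

Raising the resolution from 72 to 300 dpi multiplies the pixel count by roughly 17, which is why higher resolution scans take up so much more space.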
A device that can read text or illustrations printed on paper and translate the information into a form the computer can use. A scanner works by "digitising" an image and placing it on the computer as a file.
Small versions of an image used for quick overviews or to get a general idea of what an image looks like.
An industry standard image file format. It is unique in that it incorporates multiple compression techniques, allowing the user to specify the best format for a type of image, and that one file can contain multiple images.