
Book Scanning QuickStart Guide

Best Practices for Scan and Deliver and Reserve Projects

Yale University Library

Bass Library 6/20/2012

Quick Start
This section explains how to produce a black and white scan of a standard book with PDF output. Most of your scans will follow these steps. This section assumes that Acrobat and the scanner have already been set up and configured for regular use. Otherwise, skip ahead to the setup section.

Scanner: Minolta PS7000C MKII
Software: Adobe Acrobat X Professional

1. Launch Adobe Acrobat and select custom scan. Selecting a different option may crash the software.

2. Align the spine of the original with the center mark of the scanner


3. Press scan.
4. If the book is in good condition, leave the default settings and press scan (or press the foot pedal). If not, refer to the section in this guide on adjusting scan driver settings.
5. Select scan more pages, then click okay, and repeat the process until the scan is complete.
6. Select scan is complete, then click okay.
7. To delete any pages, click on the page thumbnail and press the delete key.
8. Next, press crop from the Pages panel in the Tools dropdown. (This can be added to the top panel by right-clicking the icon and checking add to quick tools.)

9. If you are scanning a cover sheet, drag a rectangle over the sheet and double click it. Then click okay.
10. For the rest of the pages, select the book and double click the rectangle. In the page range box in the bottom right of the crop window, set the To box to the number of pages in the document. Then click okay.
11. Review the thumbnails. If any of the sheets have been cropped incorrectly, repeat the crop for those pages only.
12. Press the save icon to save the document to the appropriate folder.


Table of Contents
Quick Start
Table of Contents
Imaging, Rapide, Acrobat, and other Document Imaging Software: An Overview
Initial Setup for the Minolta Scanner and Adobe Software in an Acrobat Scanning Environment
Understanding the Scan Driver Settings
Post-Scan Processing Options in Acrobat
Key facts, tips, and tricks
Alternative Workflows
Troubleshooting


Imaging, Rapide, Acrobat, and other Document Imaging Software: An Overview


Looking at the Options
There are three components to document scanning: the scanner, which produces raw images; the scan driver, which interprets them for the computer; and the document imaging software, which receives images from the driver. Document imaging software converts the images into a form accessible to humans, and then allows the end user to make changes and edits to the document before saving.

Yale Library uses two main formats for storing digitized books and documents for reserve and lending: PDF and multi-page TIFF. PDF is the dominant format, although some applications (Ariel, for example, and legacy Odyssey clients) still require TIFF-formatted images.

Each document imaging package has advantages and disadvantages; the most important are summarized below. Adobe Acrobat is the suite preferred for most tasks, although the needs of the situation will dictate the software of choice.

[Table: feature comparison of Adobe Acrobat, Minolta Rapide, Kodak Imaging, and Odyssey Native. Features compared: Batch Scanning, No-Click Batch Scans, Native Save to PDF, Native Save to TIFF, Batch Crop, Batch Deskew, Auto-save while scan is in progress, OCR Option, Image Effects.]

The batch crop feature of Acrobat can save a significant amount of time in the scan process since it eliminates the need to select a crop area for each individual page. There is, however, a feature in the Minolta scan driver which allows images to be cropped as they are acquired. If the features of the scan driver are used effectively, the choice between Acrobat and Rapide becomes one of user preference.

Adobe Acrobat
The primary advantage of Adobe Acrobat is its powerful suite of post-scan batch editing tools. Acrobat can crop, deskew, despeckle, and perform OCR on all the pages of a scan in a single step. Although Acrobat does have a batch scan feature, it requires the user to click an OK button for every new page, which is an extra step. A macro can be used to suppress this box, but its use depends on the installation of MacroExpress on the system. File management in Acrobat is preferred to the workspace model of Rapide.

Minolta Locator Rapide


Minolta's own document management software uses a workspace model to manage scans. Essentially, scans are deposited into a subfolder (named by the user) of a default location on the computer (which cannot be modified by the user). Each image is saved into that folder as it is scanned. The pages must then be exported into a single PDF or TIFF in a different place on the hard disk or network. There is a batch scan mode which allows the user to scan consecutive pages using only the scanner foot pedal.

Kodak Imaging
Imaging, the first program to be bundled with the Minolta scanner, has not changed significantly since its release with Windows 95. It offers relatively few features and is prone to crashing the system during memory-intensive scans. It should only be used when performing diagnostics on the scanner.


Initial Setup for the Minolta Scanner and Adobe Software in an Acrobat Scanning Environment
Although the factory default settings will produce satisfactory scans in most situations, failing to configure the scan driver may result in unwanted behavior, and failure to configure Acrobat may lead to frequent hangs and memory errors. Configuration of both scan driver and Acrobat must be done for every user of the computer. Settings begin at default for every new user of the computer.

Configuring Adobe Acrobat


Acrobat allows you either to use the scan driver's native interface or to suppress it by using presets, which still use the scan driver but keep it hidden. Note that if you use a preset, the scan window will never appear; if your unit does not have a foot pedal to initiate scanning, the program will hang.

Configuring Acrobat for Use with the Native Scan Driver Interface

Using the native scan driver interface with Acrobat adds one mouse click per document when compared to using a preset. The advantage of this option is that it gives the user access to all the powerful configuration options available in the driver.

1. Start Acrobat.
2. Start a custom scan from: File > Create > PDF From Scanner > Custom Scan
3. Set Scanner to Book Scanner.
4. Set Sides to Front Sides.
5. Click on Options (in the top right corner) and, from the User Interface dropdown, select Show Scanner's Native Interface, then press OK.
6. Make sure that all boxes in Document Settings are unchecked.
7. Check the box for Prompt for scanning more pages.
8. At the end, your box should look like the one on the right.
9. Press Scan (you must press the button; the foot pedal will not work) and then Exit.

These settings will be preserved until you change them or press the Defaults button. You need not set them for every scan. In other words, in the future, you will only need to press the scan button when this window appears. All settings will be changed from within the driver window that appears when scan is pressed.

Configuring Acrobat for Use with Presets

(Optional: skip straight to driver configuration unless your supervisor indicates otherwise.)

Acrobat can be configured to allow users to bypass the scan driver interface. In this mode, presets are used to tell the scanner to perform a scan using a narrow set of options controlled by Acrobat. The advantage of using a preset scan is that the user does not have to click the scan button in Acrobat before beginning to scan; scanning begins when the foot pedal is pushed. Since the scan dialogue is suppressed, however, the scan must be initiated by the foot pedal or the program will hang. Presets must be configured in advance, but once set these steps need not be repeated.

Important Difference: A preset is a preconfigured scan in Acrobat that allows the user to bypass the scan driver interface. A profile is a collection of settings within the scan driver between which the user can switch quickly.

Configuring the Scan Driver


The scan driver allows the user to create different profiles suitable for scanning different materials. The scanner will not change profiles by itself; it will continue to use the last profile in use until the operator intervenes. Note that any changes to a profile will not be preserved unless the user presses the save button beneath the profile drop-down.

The scan driver has many configurable features, and this section will help you set up the most basic ones needed to start scanning right away. For more complete coverage of the scan driver, and how tweaking the settings can improve the quality of scans, see the section on understanding the scan driver settings.

1. Start a scan. This can be done in Acrobat by selecting File > Create > PDF From Scanner > Custom Scan, or in Rapide by opening a work area and pressing the scan button.
2. The scan driver will appear with the default profile active.
3. Set the page size dropdown to 12x18L and Image Mode to Black and White.
4. Set Resolution to 300dpi.
5. Uncheck Frame Masking.


To configure a preset in Acrobat:

1. Start Acrobat.
2. Enter the preset configuration window from File > Create > PDF From Scanner > Configure Presets
3. Select the preset you want to modify from the dropdown menu.
4. Set the resolution to an appropriate value. 300dpi is adequate for most scans of black and white text. 400dpi will yield slightly smoother font edges, but can nearly double file size.
5. Set paper size to an appropriate value (12x18 will work for most books), and uncheck all boxes in document settings.
6. Remember to hit save after modifying each preset. You will need to repeat this process for each preset option.

After following these basic configuration settings for Acrobat and the scan driver, you are ready to follow the instructions in the quick-start guide at the beginning of this document. The settings here and the workflow in the quick-start guide represent one of the most efficient and most easily learned workflows for the scan-on-demand or reserves processes. It is not the only workflow, however. A user who is very comfortable with the scanner and driver may find it more efficient to adjust driver settings to tweak scans rather than making changes in software afterward.

Mapping a Network Drive


When accessing the ILL folder repeatedly, the user may opt to map a network drive: a designated icon in the My Computer menu which links permanently to the network share where scans are being saved. This process is straightforward in Windows 7.

1. From My Computer, press the Map Network Drive button on the toolbar.
2. Assign any drive letter you choose that will be easy to remember. I: is a good candidate.
3. Enter the folder address of the shared folder. This is the same address you would type into Run to access it.
4. Check the box for Reconnect at Logon to ensure the drive is permanent.
5. If you personally do not have access to the network share, check the Connect Using Different Credentials option and have a supervisor provide you with appropriate credentials.
6. Press Finish to complete the process.

Now whenever you save, you need only go to the drive letter and you will be able to place files directly into the folder.
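The same mapping can also be made from the command line with the Windows `net use` command. The sketch below only builds the argument list and does not run anything. The function name, the server name, and the share name are hypothetical placeholders rather than anything taken from this guide; substitute the real folder address of the share.

```python
from typing import List, Optional


def build_map_drive_command(
    drive_letter: str,
    share_path: str,
    reconnect_at_logon: bool = True,
    username: Optional[str] = None,
) -> List[str]:
    """Build a Windows `net use` argument list. Nothing is executed here."""
    letter = drive_letter.rstrip(":").upper()
    if len(letter) != 1 or not letter.isalpha():
        raise ValueError("drive letter must be a single letter such as 'I'")
    command = ["net", "use", f"{letter}:", share_path]
    if username is not None:
        # Counterpart of the "Connect Using Different Credentials" option.
        command.append(f"/user:{username}")
    # Counterpart of the "Reconnect at Logon" check box.
    command.append("/persistent:yes" if reconnect_at_logon else "/persistent:no")
    return command


# Hypothetical share address; replace with the real ILL folder address.
EXAMPLE_COMMAND = build_map_drive_command("I", r"\\example-server\ill-scans")
print(" ".join(EXAMPLE_COMMAND))
```

On a Windows workstation, the resulting list could be passed to `subprocess.run(EXAMPLE_COMMAND, check=True)` to perform the mapping.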


Configuring the Scan Driver (continued)

6. Uncheck the Diffusion Dither box.
7. Set Black and White Threshold to 75.
8. Type a descriptive name into the dropdown box for Profile and press Save. Using an easily understandable name, like BW Spread Medium, will help other scan users quickly understand the use of the profile.
9. When complete, the box should look like the example to the right. You can exit the driver safely by pressing close.
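The basic profile assembled in the scan driver configuration steps can be written down as a small settings record, which is handy for documenting what each saved profile contains. This is only an illustrative sketch: the driver stores profiles internally, and the class name, field names, and the `differences` helper below are hypothetical.

```python
from dataclasses import asdict, dataclass, replace
from typing import Any, Dict, Tuple


@dataclass(frozen=True)
class ScanProfile:
    """Hypothetical record of the settings saved under one driver profile."""
    name: str
    page_size: str
    image_mode: str
    resolution_dpi: int
    frame_masking: bool
    diffusion_dither: bool
    bw_threshold: int


# Values taken from the basic configuration steps in this guide.
BW_SPREAD_MEDIUM = ScanProfile(
    name="BW Spread Medium",
    page_size="12x18L",
    image_mode="Black and White",
    resolution_dpi=300,
    frame_masking=False,
    diffusion_dither=False,
    bw_threshold=75,
)


def differences(a: ScanProfile, b: ScanProfile) -> Dict[str, Tuple[Any, Any]]:
    """Return the fields whose values differ between two profiles."""
    left, right = asdict(a), asdict(b)
    return {key: (left[key], right[key]) for key in left if left[key] != right[key]}


# A hypothetical second profile for thin text on white pages.
BW_SPREAD_THIN = replace(BW_SPREAD_MEDIUM, name="BW Spread Thin", bw_threshold=65)
print(differences(BW_SPREAD_MEDIUM, BW_SPREAD_THIN))
```

Keeping such a record next to the scanner makes it easier to recreate profiles for a new user, since settings begin at default for every new user of the computer.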

Understanding the Scan Driver Settings


The scan driver is a powerful tool that allows the user to avoid the need for later edits in Acrobat. For example, the user can crop the image directly in the driver before it is sent to the document imaging software. The first part of this section describes the functions of the driver and the effect they have on images. The second part discusses how to apply those settings to modify the workflow for a particularly difficult scan.

The Scan Driver Dialog Box


The scan driver dialog is represented to the right. Each check box, radio button, or drop-down menu represents a configurable setting. As mentioned earlier, the profile drop-down allows the user to save any combination of settings for future use under a unique name.

Document Settings
This box allows you to configure the driver to understand the type of document you are scanning.

Type
Book: If selected, the scanner will look for a book, which means that it will try to erase the center page split and will allow the user to select larger page sizes.
Sheet: This will cause the scanner to ignore a center split, and will allow the user to select standard single page sizes. This setting is used for cover sheets and for maps.
3D: The scanner can capture three-dimensional images up to two inches high. This feature is currently not used, but could in the future allow the capture of awards and woodcuts.

Page
Spread: This will capture the entire book in a single image. It is similar to photocopying both pages of the book at once.
Single: This will capture only one side of the book. You must press left scan to capture the left, and right scan to capture the right. Each side will appear in its own image.

[Figure: Spread yields one image (Image 1); Single yields one image (Image 1); Split yields two images (Image 1 and Image 2)]

Split: This will capture both pages of the book at once, but will treat each page as its own image. You will get two images as an output, one of the left page and one of the right page.

Color Mode
Color: Captures the image in full color. This requires the most disk space.

Grayscale: This captures the image in grey. The preservation of detail will be greater, and text is easier to read at a high zoom. At a low zoom, however, greyscale text is actually harder to read since the background is preserved. Greyscale should be reserved for images and diagrams.

[Figure: Black and White compared with Grayscale]

Black and White: The driver will look at each pixel; if it is darker than a certain threshold it will be black, and if lighter it will be white. This option occupies the least disk space.

Output Setting
Resolution: This is the number of dots per inch the scanner will capture. Increasing the resolution by 100dpi can as much as double the size of the scanned file, depending on the other scan settings. For text, 300dpi is best for most materials. For items in bad condition, 400dpi may be used. 600dpi should be reserved for photographs. Resolutions lower than 300dpi may be hard to read, and preclude the possibility of future OCR on the images.

[Figure: resolution comparison at 200dpi, 300dpi, 400dpi, and 600dpi]
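The effect of resolution and color mode on size can be estimated with simple arithmetic, since the number of pixels grows with the square of the resolution. The sketch below computes the uncompressed bitmap size of one 12x18 inch scan. It assumes 1 bit per pixel for black and white, 8 for grayscale, and 24 for color; these bit depths and the function name are assumptions for illustration, and real PDF or TIFF files will be smaller because they are compressed.

```python
def uncompressed_bytes(width_in: float, height_in: float, dpi: int, bits_per_pixel: int) -> int:
    """Estimate the raw (uncompressed) size in bytes of one scanned image."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    # Round up to a whole number of bytes.
    return (pixels * bits_per_pixel + 7) // 8


# Assumed bit depths for each color mode.
BITS = {"Black and White": 1, "Grayscale": 8, "Color": 24}

for dpi in (300, 400, 600):
    size = uncompressed_bytes(12, 18, dpi, BITS["Black and White"])
    print(f"{dpi}dpi black and white: {size:,} bytes")

# Going from 300dpi to 400dpi multiplies the pixel count by (400/300)^2.
ratio = uncompressed_bytes(12, 18, 400, 1) / uncompressed_bytes(12, 18, 300, 1)
print(f"400dpi is about {ratio:.2f} times the size of 300dpi")
```

Under these assumptions the step from 300dpi to 400dpi costs roughly 1.78 times the raw data, which is in line with the guide's observation that 400dpi can nearly double file size.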

Edit
Frame Masking: This is used to hide the black border and replace it with a white border. Note that this may make the image harder to crop since the margin will blend into the erased frame. This check box must be enabled in order to enable centering and finger detection. Note, however, that when this check box is selected, the user will no longer be able to crop the image from the scan driver (although cropping from document imaging software is unaffected).

Centering: This will cause the software to automatically re-center the image with each scan. This is useful if the operator is not careful with positioning. Always ensuring that the book is centered on the scanner diminishes the usefulness of this setting.

Finger masking: This setting causes the scanner to look for fingers holding the book open and replace them with a white frame. The success rate is acceptable; however, sometimes only a few fingers will be masked, and in other cases the white, finger-shaped masked areas are just as conspicuous as a scanned finger would be.

Center erase: This causes the scanner to look for the binding and to mask it with a white stripe. Although this is usually desirable, the user may wish to disable the feature for books that are tightly bound, or in cases where the scanner is consistently failing to identify the center correctly. Note that the slider on the bottom determines the width of the erasure stripe. The narrowest setting is adequate for most scans.

Preview and in-driver cropping: While the obvious function of this window is to reveal a preview of the page, the user can also define a scan area in this view. Note that this function is disabled when frame masking is checked. Dragging the mouse over the area of a book causes a red square to appear. The scanner will only send this part of the image to the document imaging software. If you save the profile after defining the area and closing the preview window (by pressing ok) but before starting the scan, that same crop will be applied to all future scans in that profile until you go back to the preview window and change it. When doing split scans, the user can press either left or right preview. When doing single scans, the user should press the button for the side they wish to preview. Cancel will undo the last modification made, but will not reset the previously saved crop area. To do this, you must select the entire frame, press ok, and then save the profile.

Profile
This setting has been covered in the previous section. A user may define as many custom profiles as desired. To delete a profile, the user should select it and then press the delete button. The save button must be pressed every time settings are changed for a profile if the user wishes to preserve those changes; otherwise they will be discarded after the single scan.

Image Quality
White Balance, Color Correction, and Exposure Bias: These settings can usually be left on auto. In scans of originals in good condition, these settings will not cause significant changes.
White balance uses software to correct colors that the scanner perceives as slightly off their true value (e.g., a white rendered as a very light gray). Color correction performs a similar function. Exposure bias can be used to darken or lighten scanned images. It is similar to adjusting the contrast on a photocopy machine. Although this is useful in grayscale or color scanning, for black and white scanning the setting should be left on auto.

Noise Reduction: This setting will use the scanner to reduce the noise on the image. Noise refers to speckles, unusual marks, and other imperfections. Enabling this setting will reduce these imperfections, but it will also cause some image detail to be lost, particularly when there is a high black and white threshold in use. Therefore, noise reduction should only be enabled when working with a scan where noise is a significant problem.

[Figure: noise reduction off compared with noise reduction on]


Sharpness: This setting determines the sharpness of images and text in the scan. Again, the setting is more effective in grayscale and color scans than in black and white scanning. When sharpness levels are high, lines are emphasized, which can lead to images that look strange or, in text, letters that are shaped somewhat unusually. Low sharpness can lead to letters that are poorly defined. In most cases, it is best to leave sharpness at the default (middle) setting.

[Figure: Low Sharp, Normal, and High Sharp compared]

Diffusion Dither: This setting causes the driver to attempt to recreate shading on the page using black dots, similar to an old newspaper print. This leads to noisy images, and the setting should not be used. Grayscale scanning should be used instead when reproducing an image. Note, however, that when disabled, images (besides sketches and line art) will not be recognizable.

Black and White Threshold: This is possibly the single most important setting for producing clear black and white scans. When scanning in black and white, the driver evaluates every pixel for whether it is dark enough to be considered black, so that boundary must be defined. This setting allows the user to define it. As a rule, the clearer and bolder the text is, the higher the threshold can be. Using a high threshold prevents stray marks and page discoloration from being identified as black areas. When text is thin, or of poor quality, the threshold needs to be dropped. This results in more legible text at the expense of a noisier scan.

[Figure: White Pages, Good Text at thresholds 65, 80, and 90; Yellowed Pages, Good Text at thresholds 65, 80, and 90]

Since every book has slightly different font characteristics and page coloration, there will be a slightly different optimal threshold for each book. There are some general guidelines, however. For a book with yellowed pages and dark text, the threshold can be about 80. For yellowed pages and light text, it will need to be between 75 and 80. For white pages with excellent quality, but thin, text, a value as low as 65 works well. For white pages with bold text, between 80 and 90 produces good results. Values higher than 100 will have the least noise, but also lose the most text detail. In general, a value of about 75 will cover most scenarios.
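The trade-off described for the black and white threshold can be illustrated with a toy example. The sketch below is not the driver's algorithm, and the driver's real numeric scale is not documented here. It assumes a hypothetical "darkness" value per pixel where larger means darker, and marks a pixel black when its darkness meets the threshold. The sample row, its values, and the function name are invented for illustration.

```python
from typing import List


def binarize(darkness_row: List[int], threshold: int) -> str:
    """Render one row of pixels: '#' for black, '.' for white.

    Assumption: each value is a darkness level (larger = darker), and a pixel
    becomes black only when it is at least as dark as the threshold.
    """
    return "".join("#" if darkness >= threshold else "." for darkness in darkness_row)


# Invented sample: clean paper (10), page discoloration (30),
# a thin stroke (70), and a bold stroke (95).
ROW = [10, 30, 70, 95, 70, 30, 10]

print(binarize(ROW, 25))  # low threshold: discoloration turns black (noise)
print(binarize(ROW, 65))  # moderate threshold: thin and bold strokes survive
print(binarize(ROW, 80))  # high threshold: only the bold stroke survives
```

The three outputs mirror the guidance above: a lower threshold keeps thin text at the cost of noise, while a higher threshold removes discoloration but starts to lose thin strokes.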


Post-Scan Processing Options in Acrobat


One of the advantages offered by Adobe Acrobat is the abundance of post-scan processing options available within the software. Although most of these are not implemented in any of the current workflows, a few of them bear mention.

The Optimize Scanned PDF button, part of the tools sidepane, contains most of the useful functions for scanned material. The adaptive compression box allows the user to compress the images that have been scanned to reduce storage space. The default settings retain quality reasonably well and should be changed only if there are problems. The Filters box allows for background removal, descreening (a reduction of moire patterns), deskew, and text sharpening. With the exception of deskew, all of these modifications can produce poor results on low quality, and even some medium quality, scans. Therefore the user may choose not to apply these filters. Finally, Acrobat allows the user to perform OCR on the scan which, although time consuming, makes the scan searchable and open to editing.

Acrobat allows the user to define batch workflows so that these actions can be performed on a large number of files without the intervention of the operator. For example, a user who wishes to perform OCR on all of the reserves materials scanned on a given day need only leave the system processing them overnight before filing them in the appropriate place on the network drive. As another example, Acrobat can batch insert a page, such as a copyright notice, in front of all of the PDF files in a given folder. The many functions and workflow options in Acrobat are beyond the scope of this guide, but the software manual shows that Acrobat is flexible enough to meet most users' needs.


Key facts, tips, and tricks:


When doing split scans, ALWAYS start with the left scan by pressing the left scan button or by pressing the left scan foot pedal. Otherwise, the order of pages will invert.

You can crop either in the driver or in the document imaging software. Since the image will re-center a little differently across long scans, be sure to leave enough of a margin in the crop to avoid having to rescan pages.

Never use the black and white dither check box. Although this could be used to capture a dot-matrix style image (like an old newspaper image), you are better off using grayscale. Using this feature leads to a very noisy image.

The narrowest setting for center masking is adequate for most scans. Only widen this setting if you are scanning a large volume with very wide center margins.

Disable center masking when scanning a tightly bound volume. If the scanner guesses the center incorrectly by even a small amount, some text will be lost.

Balance resolution and file size carefully: more DPI is not necessarily better.

Although the book tray shifts to accommodate books, avoid moving it any more than necessary. The camera re-centers the image based on the movement, and if you cause a big indentation with a small book the image will jump to the side of the scanned area.

Books with very tight binding are one of the only scenarios where you will want to use single scanning. You can open the binding, hold the opposite pages at a 45 degree angle (exposing the maximum amount of text in the adjacent page) and then perform a single scan.

OCR actually shrinks file size: Acrobat reduces all OCR'd pages to 300dpi. Certain forms of OCR, which eliminate the image completely and replace it with recognized text, can be done using special software and can reduce file sizes to well under a megabyte.

While the scanning glass can make it more convenient to hold the book open, it can sometimes apply excessive pressure to the book, damaging the binding, or force the scan table to shift excessively, causing the scanner to misjudge the location of the binding and produce an unusually centered scan.

The most dramatic improvement in scan quality is made when operators become comfortable with the settings in the scan driver. Although it is not desirable to waste too much time finding the perfect setting, over time the operator should not be afraid to tinker with and save profiles to get the best quality scans.

Use the preview window to make adjustments and save them before starting your actual scan. This saves the time of saving an image every time you wish to make a correction.


Alternative Workflows
Although the workflow presented in the quick start is the easiest to learn and one of the most flexible, it is far from the only one. There are many alternatives which take about the same amount of time, and it is the user's prerogative to determine which works best for their situation. These two sample workflows take advantage of the ability to crop in the driver, bypassing Acrobat's bulk crop (at the expense of needing to rescan any pages that are accidentally mis-cropped by the driver).

Scan Driver Cropping in Adobe Acrobat


Scanner: Minolta PS7000C MKII
Software: Adobe Acrobat X Professional

1. Launch Adobe Acrobat and select custom scan. Selecting a different option may crash the software.
2. Align the spine of the original with the center mark of the scanner.
3. Press scan.
4. Adjust settings appropriately, then select preview and drag your mouse over the book area, creating a red rectangle that should be no smaller than the book's border.
5. Click ok and then press save.
6. Now you can press scan (or hit the foot pedal).
7. Select scan more pages, then ok, and repeat the process until the scan is complete.
8. Select scan is complete, then click okay.
9. To delete any pages, click on the page thumbnail and press the delete key.
10. Review the thumbnails. If any of the sheets have been cropped incorrectly, repeat the scans for those pages only, changing the preview rectangle as needed.
11. Press the save icon to save the document to the appropriate folder.

Scan Driver Cropping with Locator Rapide


Scanner: Minolta PS7000C MKII
Software: Minolta Locator Rapide (by Covergold)

1. Launch Rapide, and press the open button.
2. Press New in the work areas box, and assign it the appropriate transaction number.
3. Align the spine of the original with the center mark of the scanner.
4. Ensure that the scanner settings box is checked.
5. Ensure that the batch scan box is checked. This will prevent needing to press the scan button repeatedly.
6. Press scan.
7. Adjust settings appropriately, then select preview and drag your mouse over the book area, creating a red rectangle that should be no smaller than the book's border.
8. Click ok and then press save.


9. Now you can press scan (or hit the foot pedal).
10. Press scan (or the foot pedal) repeatedly for each page you wish to scan.
11. When finished, press the exit button.
12. To delete any pages, click on the page number and press the delete key.
13. Review the thumbnails. If any of the sheets have been cropped incorrectly, repeat the scans for those pages only, changing the preview rectangle as needed.
14. To move the pages you have rescanned, select the page, press the move button, and then select the same named workspace and the page after which you wish to save the page.
15. Select all pages by selecting the first page, pressing shift, and then clicking on the last page.
16. Press the arrow next to the copy pages to location icon (a floppy disk in front of a drive), followed by copy pages to multi page file, to save the document to the appropriate folder.


Troubleshooting
There are a number of frequent errors that will appear when performing a large number of scans. Some of these have to do with flaws in the software design, others with flaws in the hardware. While software problems can be largely avoided, hardware issues are relatively rare, but inevitable.

Memory Error: If an error appears indicating that the computer is out of memory, this is usually caused by an attempt to invoke two instances of the scan driver at once. It is most easily reproduced by trying to run Acrobat and Rapide at the same time, but it can also happen when the scan driver is running in the background and hangs. This will stop the driver from being loaded again, and the computer must be rebooted to fix the problem. Always make sure to close the scan driver and document imaging software properly to avoid these errors, and read the guide on memory errors.

Scan Interrupted: This is usually caused by the page moving slightly or by unusual lighting. Simply press scan again and keep the page still with the glass. If this doesn't work, try scanning without the glass.

Wavy Scans: This is a hardware problem caused by certain print formats. Try turning off center masking and moving the book to different points on the scanner bed, or changing its orientation. If this doesn't work, you may have to scan each page in single page mode, or use a different scanner.

Vertically Cut Scans: This is caused when the scanner misreads the center of the page, causing one side to jump. It usually goes away when you rescan. If not, try scanning each page in single mode and use the preview functions.

Excessive Center Removal: If the binding is very narrow, try turning off the center erase function to capture as much of the image as possible.

Moire Patterns: Turn on the descreen option to reduce this behavior. If severe, use the descreen function in Acrobat.

Inverted Page Order: Ensure that you are using the left scan pedal when doing split scans.

Page Split Incorrectly: Again, this is caused by the scanner misjudging the top of the book. Try scanning at a 90 degree offset, or scanning on top of the glass instead of the moveable scanner bed.
