Bookscanner

From Noisebridge

The DIY Book Scanner is a project of the Noisebridge Digital Archivists group. It's based on the open-source DIY Book Scanner Kit designed by Daniel Reetz.

Here's the scanner first being assembled:

Diybookscanner.jpg

The completed scanner ready to scan:

Diybookscanner1.jpg

The scanner on its new rolling stand:

Book scanner stand.JPG


Lighting

The original LED light was replaced with a larger LED array. To avoid glare, the array is mounted perpendicularly, with its long side parallel to the spine of the book. All interior surfaces of the book scanner are painted black to avoid casting reflections on the glass. The glass was taken from two flatbed scanners and cut down to size by hand.

Cameras

We use two Canon Digital Rebel T3 cameras. Each camera body is attached to the scanner frame with a screw like the one found in a tripod head. Each camera is connected via USB to a computer and remote-controlled by a Python script that uses the gphoto2 library. Neither camera has a battery or persistent storage; pictures are transferred to the controller computer after the shutter is triggered. If you would like to use different cameras, there is a list of cameras supported by gphoto2.

The settings as of this revision are the following:

mode: M
aperture: F14
shutter: 1/20
ISO: auto
zoom: 24mm
stabilizer: on
focus: auto

All settings are set automatically except the following three, which are physical controls on the camera and must be set by hand:

  1. The zoom is adjusted by turning the ring around the lens and looking at the numbers, which indicate millimeters.
  2. The mode is selected by turning the circular knob on the upper left until the letter M faces the arrow.
  3. The stabilizer is a physical switch on the lens. It is marked "stabilizer".
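
As a rough illustration of how the remaining settings could be applied over USB, here is a minimal sketch that builds a gphoto2 command line for one camera. It is not the script used on this scanner. The config names (aperture, shutterspeed, iso), their value strings and the port string are assumptions; running `gphoto2 --list-config` against your own camera shows the names it actually exposes.

```python
import subprocess

# Values taken from the settings list above. The config names and the exact
# value strings are assumptions; check `gphoto2 --list-config` on your camera.
SETTINGS = {
    "aperture": "14",
    "shutterspeed": "1/20",
    "iso": "Auto",
}


def build_settings_command(port, settings=SETTINGS):
    """Return the gphoto2 argument list that applies `settings` on `port`."""
    command = ["gphoto2", "--port", port]
    for name, value in settings.items():
        command += ["--set-config", f"{name}={value}"]
    return command


def apply_settings(port, settings=SETTINGS):
    """Run gphoto2 for one camera; raises CalledProcessError if it fails."""
    subprocess.run(build_settings_command(port, settings), check=True)
```

Calling `apply_settings("usb:001,004")` would configure one camera, where the port string is a placeholder for whatever `gphoto2 --auto-detect` reports. Zoom, mode and stabilizer still have to be set by hand as described above.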

Trigger

The cameras are triggered by pressing the enter key on the computer keyboard.
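
A trigger loop of this kind might look roughly like the sketch below, which waits for the Enter key and then fires both cameras at the same time through the gphoto2 command-line tool. This is a sketch under assumptions, not the actual Noisebridge script: the port strings, the left/right naming and the file name pattern are all hypothetical.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Hypothetical USB ports; `gphoto2 --auto-detect` lists the real ones.
CAMERA_PORTS = {"left": "usb:001,004", "right": "usb:001,005"}


def capture_command(port, filename):
    """Build the gphoto2 call that takes one picture and downloads it."""
    return ["gphoto2", "--port", port,
            "--capture-image-and-download", "--filename", filename]


def capture_pair(shot, run=subprocess.run, ports=CAMERA_PORTS):
    """Fire all cameras concurrently and return the file names for this shot."""
    names = {side: f"{shot:04d}_{side}.jpg" for side in ports}
    with ThreadPoolExecutor(max_workers=len(ports)) as pool:
        jobs = [pool.submit(run, capture_command(port, names[side]), check=True)
                for side, port in ports.items()]
        for job in jobs:
            job.result()  # re-raise any error from a failed capture
    return [names[side] for side in ports]


def scan_loop(read_line=input, capture=capture_pair):
    """Capture one pair of pages per Enter press; 'q' then Enter quits."""
    shot = 0
    while read_line("Enter to scan, q to quit: ").strip().lower() != "q":
        print("saved", capture(shot))
        shot += 1
    return shot
```

Calling `scan_loop()` starts the session. Because the loop only reads a line of input, anything that sends an Enter keystroke could stand in for the keyboard.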

A cool upgrade would be a foot switch to trigger the shutter. This would free up both of the operator's hands for handling the book. It could be a cheap MIDI keyboard sustain pedal or a guitar-pedal-type switch.

Software

The scanning is handled by a Python script that uses gphoto2 to connect with the cameras and displays images of the scanned pages in an HTML view. Three USB ports are required for the monitoring computer: one each for the two cameras, and one for the keyboard or triggering mechanism. Post-processing is handled with ScanTailor, an OCR script using Tesseract, and various other scripts.
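
To give an idea of what an HTML view of scanned pages involves, here is a minimal sketch that writes a simple page showing every JPEG in a folder. It is not the page produced by the scanner's own script. The function names, the `index.html` output name, the `.jpg` extension and the 300-pixel preview width are all assumptions made for illustration.

```python
from html import escape
from pathlib import Path


def render_contact_sheet(image_names, title="Scanned pages"):
    """Return an HTML document with one captioned preview per image."""
    figures = "\n".join(
        '<figure><img src="' + escape(name, quote=True) + '" width="300" alt="">'
        + "<figcaption>" + escape(name) + "</figcaption></figure>"
        for name in image_names
    )
    return (
        "<!DOCTYPE html>\n"
        '<html><head><meta charset="utf-8">'
        "<title>" + escape(title) + "</title></head>\n"
        "<body>\n" + figures + "\n</body></html>\n"
    )


def write_contact_sheet(folder, output_name="index.html"):
    """Write the preview page next to the images and return their names."""
    folder = Path(folder)
    names = sorted(p.name for p in folder.glob("*.jpg"))
    (folder / output_name).write_text(render_contact_sheet(names),
                                      encoding="utf-8")
    return names
```

Running `write_contact_sheet("scans")` after each capture and reloading the page in a browser would show the pages scanned so far, in file name order.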

Live View is obtained through a utility called Darktable, which supports a "tethering mode". It may even be possible to use Darktable as a complete book assembly application, but this has not been confirmed yet.

An alternative for live view is a Python library: http://magiclantern.wikia.com/wiki/Remote_control_with_PTP_and_Python

There is a toolkit called Canon Hack Development Kit (CHDK) which might be useful for cameras not supported by gphoto2.

Information on building a transcoding cluster: http://media.bemyapp.com/develop-video-render-farm/

Instructions

Step-by-step Bookscanner Instructions are available for anyone who would like to use the Bookscanner.