Large Scale Digitization Working Group
December 20, 2006
10:00-11:00am, Room 230B
AGENDA
* Review of RFP for Digital Services
MINUTES
Present: Betsy Kruger, David Vess, Sarah Shreeves, Bill Mischo, Tim Cole, Beth Sandore, Mary Stuart, Candice Bulkley, Michael Norman, Kelvin Touchette.
Absent: Chris Prom, Nuala Koetter, Tom Teper.
The meeting centered on the third draft of the RFP for digitization projects for Illinois Harvest.
Requesting all work be done in color.
Section 1.2
Chris's concerns:
pick tiff or jpeg2000
pdf or djVu
Discussion of ABBYY FineReader - pros and cons
group decided to require jpeg2000
group decided to require pdf format
Section 1.2 - UIUC will supply (section)
There is concern about all Bib records not being complete - for example, Bill's engineering reports are not all analyzed.
Sarah suggested creating a DC record in IDEALS for items going out.
Section 2.2.1
Will we be getting samples from vendors of our own material or will we rely on samples made from materials we do not supply? ASK Tom for his experiences around this. Decided to ask for complete sample of what we want, with OCR, etc. but not entire book. Also ask for bitonal and gray scale.
Appendix 1.1
Locking in on standards.
Bill checking on most recent ANSI standard.
Cropping issues - ASK Nuala.
enhanced mode? - ASK Nuala
Appendix 1.3.6
Using only jpeg2000 and searchable pdf. (dropping tiff and DjVu)
Discussion of the merits of PDF versus page-turner software regarding access: ease of use and file sizes in readers. PDFs are very large. Files are not large for slow download. Suggested leaving them both in, so keeping: (For access images, JPEG at 72dpi.)
Chris's suggestions - (Appendix 1.3.6.)
Chris wanted both ASCII and XML UTF-8 files. Group agreed.
Will be keeping the list below. (Note that the description of each file is further down in the document):
Dataset.toc files (see…)
Scandata.txt files (see…)
Checkmd5.fil (see…)
Decided we need a manifest of files at the volume level, something that assures the files match the physical pieces.
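A minimal sketch of what a volume-level manifest could look like, assuming one directory per volume and MD5 checksums (in the spirit of Checkmd5.fil). The function names and the manifest layout here are hypothetical and are not taken from the RFP. This only checks that delivered files are present and unaltered; comparing the file count against the physical page count would be a separate step.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(volume_dir):
    """List (relative path, size in bytes, MD5) for every file in a volume."""
    root = Path(volume_dir)
    entries = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        entries.append((path.relative_to(root).as_posix(),
                        path.stat().st_size,
                        md5_of_file(path)))
    return entries


def verify_manifest(volume_dir, entries):
    """Return the relative paths that are missing or whose MD5 differs."""
    root = Path(volume_dir)
    problems = []
    for rel_path, _size, expected in entries:
        target = root / rel_path
        if not target.is_file() or md5_of_file(target) != expected:
            problems.append(rel_path)
    return problems
```

The vendor could run build_manifest before shipping a volume and we could run verify_manifest on receipt; an empty list would mean every listed file arrived intact.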
Appendix 1.4.1
File naming conventions
accepting Tim's additions - such as lower case characters.
enough padding (with 0's) for 9 million files. ;-)
striking references to UNIX filing conventions
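To illustrate the padding point: numbering up to 9 million files takes 7 digits. A rough sketch follows; the name pattern (volume identifier, underscore, sequence number, extension) is a hypothetical example and not the convention in the RFP.

```python
def padding_width(max_files):
    """Number of decimal digits needed to write max_files."""
    return len(str(max_files))


def page_filename(volume_id, sequence, extension, max_files=9_000_000):
    """Build a lowercase, zero-padded file name (pattern is illustrative only)."""
    width = padding_width(max_files)
    return f"{volume_id}_{sequence:0{width}d}.{extension}".lower()
```

Under these assumptions, page_filename("UIUC01", 42, "JP2") gives "uiuc01_0000042.jp2", so names sort correctly as plain text and contain only lower case characters.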
Appendix 1.5.1
Indexing Section
Have to put in a copyright statement, for GSLIS material for example.
Sarah and Michael will discuss and report back to Betsy about bib info to include.
Appendix 1.5.3 - dumping info about TIFF headers.
having Tom Habing look over the Xerox jpg2 document - to be certain it is good enough to send/reference. Sarah suggested mapping our header from the Cornell header example.
Appendix 1.6
Chris's concerns:
The required OCR accuracy level is too high - group agreed and is changing the number to 98%
Wants TEI Lite to be used - group agreed
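One way the 98% figure could be checked against a hand-corrected sample page is sketched below. It assumes accuracy means character-level accuracy, computed as 1 minus the edit distance divided by the length of the reference text. These minutes do not record how the RFP defines accuracy, so that definition and the function names are assumptions.

```python
def edit_distance(reference, candidate):
    """Levenshtein distance between two strings (row-by-row dynamic programming)."""
    previous = list(range(len(candidate) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, cand_char in enumerate(candidate, start=1):
            cost = 0 if ref_char == cand_char else 1
            current.append(min(previous[j] + 1,          # character missing from OCR
                               current[j - 1] + 1,       # extra character in OCR
                               previous[j - 1] + cost))  # match or substitution
        previous = current
    return previous[-1]


def character_accuracy(reference, candidate):
    """Fraction of reference characters reproduced correctly, floored at 0."""
    if not reference:
        return 1.0 if not candidate else 0.0
    return max(0.0, 1.0 - edit_distance(reference, candidate) / len(reference))


def meets_threshold(reference, candidate, threshold=0.98):
    """True if the OCR text reaches the required accuracy (assumed 98%)."""
    return character_accuracy(reference, candidate) >= threshold
```

Here reference would be the corrected transcription of a sample page and candidate the vendor's OCR output for the same page.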
Betsy is still working on Collection Descriptions
Beth suggested specifying different treatment for different types of images. Photos, engravings/etchings and line drawings, etc. For example, line drawings are better as bitonal scans.