
Queens College City University of New York Graduate School of Library and Information Studies

Digital Preservation (GSLIS 752) Spring Session 2012 Syllabus (v. 1.5.1)
Instructor: Fred Grevin
Email: grevinf@earthlink.net
Phone: 917-902-2462
Meetings: by appointment

Course Description
We will examine the nature and characteristics of digital resources and explore the need to preserve them within a variety of contexts, including corporations, research institutions, government, libraries and archives. We will consider how to assess and mitigate the risks to digital resources. We will evaluate the current state of digital preservation and identify those problems that have solutions, those that have solutions under development, and those that have not yet been addressed.

This syllabus will be revised as the semester progresses. Each revised syllabus will be emailed to the students directly. It is each student's responsibility to ensure he/she has the revised version (see version number on title page and in file name).

Course Objectives
Upon successfully completing this course, students will:
Understand the need for preserving digital resources in various contexts.
Demonstrate a general knowledge of the technical requirements for digital preservation.
Understand the risks faced by digital resources, how to assess and mitigate them, and how to apply that understanding to specific situations requiring short-term or long-term retention.
Understand the functional requirements of a digital preservation system.
Know how to communicate an understanding and knowledge of digital preservation needs to executives and funders.

Course Schedule
Classes will be held on Thursdays, from 6:40 p.m. through 9:15 p.m. Each class session will have a 15-minute break. The class will begin at 6:40 p.m. sharp.

Grading
Class participation: 20%

Assignments: 40%

Digital Preservation Syllabus, Page 2

Final course project: 40%
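
The three weights combine into a single course score as a weighted average. The following is a minimal Python sketch, assuming each component is scored on a 0-100 scale (the syllabus does not state the scale, so that is an assumption). The function name `course_score` and the sample scores are hypothetical.

```python
# Weights taken from the Grading section of the syllabus (percent of final grade).
WEIGHTS = {"participation": 20, "assignments": 40, "final_project": 40}


def course_score(participation: float, assignments: float, final_project: float) -> float:
    """Return the weighted course score, assuming each input is on a 0-100 scale."""
    scores = {
        "participation": participation,
        "assignments": assignments,
        "final_project": final_project,
    }
    # Multiply each component by its percentage weight, then divide by 100.
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    print(course_score(participation=80, assignments=85, final_project=90))
```

With those hypothetical scores the result is 86.0, which falls inside the 83 to 86.9 band that the Basis of Grading section maps to a B.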

Basis of Grading
Grading policy: work defined as good (in which the stated objective of the assignment or project is met, no more and no less) will typically be graded as 83 to 86.9 points. This translates to a letter grade of B. Students wishing to achieve a higher score will have to produce work that is qualitatively superior to that defined as good. Results count most; effort counts less. Unless otherwise stated, quantity will not result in a better grade.

Students are expected to participate actively in class discussions. Failure to participate actively in class discussions, as well as excessive lateness and unexcused absences, will be penalized by a reduction in grade.

Assignments and readings are listed in this syllabus. There is no textbook; most of the readings are available online at no cost (see List of Readings at end of syllabus). Assignments may be individual or group assignments (see Description of Assignments & Final Course Project at end of syllabus).

The final course project is a group exercise (typically, groups of five to six students each). Each group shall have a Team Leader who shall function as a project manager. The instructor will assign students to groups, and appoint Team Leaders. For details, see the Description of Assignments & Final Course Project.

97412238.doc


Course Description
Definition of digital preservation: ensuring future generations have access to digital objects.

Class 1: Thursday 2 February (first day of class)


Introduction: Fundamental issues of digital preservation. Discussion of course scope and assigned reading.
Reading:
A Canticle for Leibowitz. Read all.
Preserving Digital Information: Final Report and Recommendations. Read the Executive Summary and pages 1-10 (refers to document pagination).
Assignment 1: Email appraisal and categorisation. This is an individual assignment (due before class on Thursday 16 February).

Class 2: Thursday 9 February


The extent and nature of the problem. Why digital preservation is important. The growth of digital information in quantity and importance.
Reading:
Preserving Digital Information: Final Report and Recommendations. Read the balance (pp. 11-64).
The Digital Divide: Assessing Organisations' Preparations for Digital Preservation. Read all.
Data Storage: From the Floppy Disk to the Cloud, Paul Thurrott, Windows IT Pro. Read all.
Data Preservation at LEP. Read all.

Class 3: Thursday 16 February


Assignment 1 is due before class begins.
Archival theory and diplomatics; Authenticity, Integrity and Trust. Digital information as authentic, trustworthy records.
Reading:
The digital signature dilemma. Read all.
Authenticity in a Digital Environment. Read two articles: Archival Authenticity in a Digital Age, Peter B. Hirtle, and Authenticity in Perspective, Abby Smith.
Enduring Paradigm, New Opportunities: The Value of the Archival Perspective in the Digital Environment. Read the section titled Utility of the Archival Paradigm in the Digital Environment (pp. 21-29).
Uniform Electronic Legal Material Act. Read all.
ABA should pause before backing digital-only laws, Tonda Rush, WisLawJournal.com. Read all.
Authentication of Primary Legal Materials and Pricing Options, State of California, Office of Legislative Counsel. Read pp. 9-30 (pagination as displayed on the document, not the file).

Class 4: Thursday 23 February


Archival theory and diplomatics; Authenticity, Integrity and Trust, continued.


Assignment 2: Referring back to Assignment 1, define your email records as authentic and trustworthy, or not. Justify your choice.

Class 5: Thursday 1 March


Assignment 2 is due before class begins.
A functional framework. Thinking about where to start and what to do.
Reading (read in the order listed below, top to bottom):
Technology Watch Report 04-01: The Open Archival Information System Reference Model: Introductory Guide, Brian F. Lavoie.
ISO Reference Model For an Open Archival Information System (OAIS), Tutorial Presentation, Sawyer et al (2003).
Digital Preservation with Special Reference to the Open Archival Information System (OAIS) Reference Model: An Overview, Sibsankar Jana et al (2009).
ERPANET OAIS Training Seminar Report (2003).
Reference Model for an Open Archival Information System. Read sections 1.1-1.4, and all of sections 2 and 3.
Towards an Open Source Repository and Preservation System. Recommendations on the Implementation of an Open Source Digital Archival and Preservation System and on Related Software Development, Bradley et al, UNESCO (2007).
The DOI system: Introductory Overview, The DOI Foundation. Read all.
[presentation: case study] The planning and implementation of a DSpace instance to manage e-records and digital collections. Guest lecturer: Nicholas Webb, Assistant Archivist at the Mount Sinai Medical Center. Additional reading: http://www.dspace.org/introducing

Class 6: Thursday 8 March


A functional framework, continued.
Assignment 3: Preserving your email. See the Assignments page for details. This is an individual assignment, which begins immediately (Sunday 4 March) and is due before class on Thursday 19 April, so you have 6 weeks in which to complete it.

Class 7: Thursday 15 March


Preservation strategies. Migration, emulation, archeology, etc.: developing methodologies. What and how much to preserve.
Reading:
Parsimonious preservation: preventing pointless processes! Tim Gollins, The National Archives (UK). Read all pages.
The Digital Dilemma: Strategic Issues in Archiving and Accessing Digital Motion Picture Materials, Academy of Motion Picture Arts and Sciences, 2007. Read chapters 1-6.
Overview of Technological Approaches to Digital Preservation and Challenges in Coming Years (2002-07), Kenneth Thibodeau, in CLIR Conference Proceedings The State of Digital Preservation: An International Perspective, pp. 4-31.
Thirteen Ways of Looking at...Digital Preservation (2004), Brian Lavoie. Read all.
A Memory of Webs Past, Ariel Bleicher, IEEE Spectrum, March 2011. Read all.
AIMS Born-Digital Collections: An Inter-Institutional Model for Stewardship (2012), AIMS Work Group. Read Foreword (pp. i-viii) and Introduction (pp. 1-2).
Digital Preservation Tutorials: File Naming (videos). Digital Preservation Education for North Carolina Employees. View all.
Visualizing Digital Preservation Workflows, by Bill LeFurgy (March 8th, 2012). Read all.
Life Cycle Models for Digital Stewardship, by Bill LeFurgy (February 21st, 2012). Read all.

Class 8: Thursday 22 March


Preservation strategies, continued.

Class 9: Thursday 29 March


Guest presentation (Skype session): A Personal History of the Electronic Records Archives program. Fynnette Eaton, formerly of the Electronic Records Archives (ERA) Program Office, U.S. National Archives and Records Administration. Reading: Progress and Risks in Implementing its Electronic Records Archive Initiative. Read all.

Class 10: Thursday 5 April


Data formats. The forms digital resources take and their impact on how they can be preserved. Discussion of the final course project.
Reading:
Assessing the Durability of Formats in a Digital Preservation Environment. Read all.
Defining File Format Obsolescence: A Risky Journey. Read all.
Content Categories, a sub-section of Sustainability of Digital Formats: Planning for Library of Congress Collections. Look at information for all content categories, including sub-pages on Quality and Functionality Factors.
PDF File Migration to PDF/A: Technical Considerations, Frank L. Walker, et al. Read all.

Thursday 12 April: Spring Recess, no class

Class 11: Thursday 19 April


Data formats, continued.
Assignment 3 is due before class begins.
Begin discussion of the final course project.
Guest presentation (Skype): functional requirements of preservation in the oil and gas exploration industry. Wayne Hoff (source of the final course project scenario), Gibson Energy (http://www.gibsons.com/Section/About/About_Corp_Profile.aspx), Calgary, Alberta (Canada).
Assignment 4: Prepare an outline, with some illustrative details, of the final course assignment for review and discussion in class on Thursday 26 April. Students will be assigned to groups and team leaders appointed by the instructor.


Class 12: Thursday 26 April


Metadata. How we know what we've got and what we're doing.
Reading:
Understanding Metadata, National Information Standards Organization (2004). Read all.
Metacrap: Putting the torch to seven straw-men of the meta-utopia, Cory Doctorow (2001). Read all.
Metadata Encoding and Transmission Standard: Primer and Reference Manual. Read Foreword, Chapter 1, and Chapter 2.
PREMIS Data Dictionary for Preservation Metadata. Read the Introduction (pp. 1-21) and the remarks on pp. 22-24. Look carefully at Semantic Unit 1.1 and its components (pp. 28-29). There's a Glossary on pp. 218-224.
Assignment 4 is due before class.

Class 13: Thursday 3 May


Metadata, continued.
Guest presentation: Rebecca Guenther, who retired in 2011 from the Library of Congress as Senior Networking and Standards Specialist, will join us either today or next week (Thursday 3 May) to discuss metadata principles, standards, and progress (see http://blogs.loc.gov/digitalpreservation/2011/07/digital-pioneer-rebecca-guenther/ and http://www.linkedin.com/pub/rebecca-guenther/18/691/64a).

Class 14: Thursday 10 May


Data curation. Going beyond the simple act of preserving.
Reading:
What is Digital Curation? Read all.
Historical context and the information age: the Diaspora of Holocaust archives, Raymund Schütz (2011). Read all.
Archivists, Curators, and Museum Technicians, Bureau of Labor Statistics (BLS) Occupational Outlook Handbook (2010-11 Ed.). Read all.
Data Curation in Climate and Weather: Transforming Our Ability to Improve Predictions through Global Knowledge Sharing. Read all.
Data Curation Program Development in U.S. Universities: The Georgia Institute of Technology Example. Read all.

Class 15: Thursday 17 May


Digital preservation and the law. Getting burned: precautions to consider. Discussion of the final course project.
Reading:
United States Code, Title 17, Chapter 1. Skim through.
Case: Lowry's Reports, Inc. v. Legg Mason, Inc. Case details: read all.
The Orphan Wars, James Grimmelmann, EDUCAUSE Review. Read all.

Class 16: Thursday 24 May


Final course project is due on this date, at the beginning of the class. Presentation of final course project to C-level panel.

Closing citation:

Information lasts only so long as someone cares about it. The conclusion I've come to..., after several decades of careful consideration, is that there is no set of hardware and software standards existing today, nor any likely to come along, that will provide any reasonable level of confidence that the stored information will still be accessible (without unreasonable levels of effort) decades from now. (Ray Kurzweil)


List of Readings
The readings are listed in class reading order.

Classes 1 and 2
A Canticle for Leibowitz, by Walter M. Miller Jr. Any edition is fine (the book has been in print since 1959). Public libraries have multiple copies.
Preserving Digital Information: Final Report and Recommendations (1996), by the Task Force on Archiving of Digital Information. Source: http://www.oclc.org/research/activities/past/rlg/digpresstudy/default.htm (URL verified 2012-01-28).

Class 2
The Digital Divide: Assessing Organisations' Preparations for Digital Preservation (2010), Pauline Sinclair, Planets. http://www.planets-project.eu/publications/?search[0]=9, under the heading Market Survey White Paper and Survey Analysis. Posted on 11th May 2010 (URL verified 2012-01-28).
Data Storage: From the Floppy Disk to the Cloud, Paul Thurrott, Windows IT Pro (2012-01-24). http://www.windowsitpro.com/article/storage/data-storage-floppy-disk-cloud-142021 (URL verified 2012-01-28).
Data Preservation at LEP, Holzner et al, arXiv (2009). http://arxiv.org/abs/0912.1803v1 (download on upper right of page). (URL verified 2012-01-28).

Classes 3 and 4
The digital signature dilemma (2006), Jean-François Blanchette. http://polaris.gseis.ucla.edu/blanchette/papers/annals.pdf (URL verified 2012-01-28).
Authenticity in a Digital Environment (2000-05), Council on Library and Information Resources. Read the two articles: Archival Authenticity in a Digital Age, Peter B. Hirtle, and Authenticity in Perspective, Abby Smith. www.clir.org/pubs/reports/pub92/pub92.pdf (URL verified 2012-01-28).
Enduring Paradigm, New Opportunities: The Value of the Archival Perspective in the Digital Environment, by Anne J. Gilliland-Swetland (February 2000). From http://www.clir.org/pubs/reports/pub89/pub89.pdf (URL verified 2012-01-28).
Uniform Electronic Legal Material Act, National Conference of Commissioners on Uniform State Laws. http://www.law.upenn.edu/bll/archives/ulc/apselm/UELMA_Final_2011.htm (URL verified 2012-01-28).
ABA should pause before backing digital-only laws, Tonda Rush, WisLawJournal.com (2012-01-26). http://wislawjournal.com/2012/01/26/aba-should-pause-before-backing-digital-only-laws/ (URL verified 2012-01-28).
Authentication of Primary Legal Materials and Pricing Options, State of California, Office of Legislative Counsel (2011-12). http://www.mnhs.org/preserve/records/legislativerecords/docs_pdfs/CA_Authentication_WhitePaper_Dec2011.pdf (URL verified 2012-02-19).

Classes 5 and 6
Technology Watch Report 04-01: The Open Archival Information System Reference Model: Introductory Guide, Brian F. Lavoie, 2004. http://www.dpconline.org/publications/technology-watch-reports (URL verified 2012-01-28).
ISO Reference Model For an Open Archival Information System (OAIS), Tutorial Presentation, Sawyer et al (2003). nssdc.gsfc.nasa.gov/nost/isoas/presentations/oais_tutorial_200210.ppt (URL verified 2012-01-28).
Digital Preservation with Special Reference to the Open Archival Information System (OAIS) Reference Model: An Overview, Sibsankar Jana et al (2009). http://academic.research.microsoft.com/Paper/2064447 (URL verified 2012-01-28).
ERPANET OAIS Training Seminar Report (2003). http://www.erpanet.org/events/2002/copenhagen/ERPANET%20OAIS%20Training%20Seminar%20Report_final.pdf (URL verified 2012-01-28).
Reference Model for an Open Archival Information System, CCSDS 650.0-B-1 Blue Book, Issue 1 (January 2002), Consultative Committee for Space Data Systems. http://public.ccsds.org/publications/AllPubs.aspx (URL verified 2012-01-28; the publications are listed by number, so look for CCSDS 650.0-B-1, a little more than halfway down the page). NOTE: this document has been published as an International Standard: ISO 14721:2003 Space data and information transfer systems -- Open archival information system -- Reference model.
Towards an Open Source Repository and Preservation System. Recommendations on the Implementation of an Open Source Digital Archival and Preservation System and on Related Software Development, Bradley et al, UNESCO (2007). http://portal.unesco.org/ci/en/files/24700/11824297751towards_open_source_repository.doc/towards_open_source_repository.doc (URL verified 2012-01-28).
The DOI system: Introductory Overview, The DOI Foundation (2011-10-03). http://www.doi.org/overview/sys_overview_021601.html (URL verified 2012-01-28).

Classes 7 and 8
Parsimonious preservation: preventing pointless processes! Tim Gollins, The National Archives (UK), 2009. Read all 4 pages. http://www.nationalarchives.gov.uk/documents/parsimonious-preservation.pdf (URL verified 2012-01-28).
The Digital Dilemma: Strategic Issues in Archiving and Accessing Digital Motion Picture Materials, Academy of Motion Picture Arts and Sciences, 2007. http://www.oscars.org/science-technology/council/projects/digitaldilemma/ (URL verified 2012-01-28).
Overview of Technological Approaches to Digital Preservation and Challenges in Coming Years (2002-07), Kenneth Thibodeau, in CLIR Conference Proceedings The State of Digital Preservation: An International Perspective, pp. 4-31. http://www.clir.org/pubs/abstract/pub107abst.html (choose PDF). (URL verified 2012-01-28).
Thirteen Ways of Looking at...Digital Preservation (2004), Brian Lavoie. http://www.dlib.org/dlib/july04/lavoie/07lavoie.html (URL verified 2012-01-28).
A Memory of Webs Past, Ariel Bleicher, IEEE Spectrum, March 2011. http://spectrum.ieee.org/telecom/internet/a-memory-of-webs-past/0 (URL verified 2012-01-28).
AIMS Born-Digital Collections: An Inter-Institutional Model for Stewardship (2012), AIMS Work Group. http://www2.lib.virginia.edu/aims/whitepaper/AIMS_final.pdf (URL verified 2012-01-28).
Digital Preservation Tutorials: File Naming (videos). Digital Preservation Education for North Carolina Employees. http://digitalpreservation.ncdcr.gov/tutorials.html. View all. (URL verified 2012-01-28).
Visualizing Digital Preservation Workflows, by Bill LeFurgy (March 8th, 2012). http://blogs.loc.gov/digitalpreservation/2012/03/visualizing-digital-preservation-workflows/ (URL verified 2012-03-12).
Life Cycle Models for Digital Stewardship, by Bill LeFurgy (February 21st, 2012). http://blogs.loc.gov/digitalpreservation/2012/02/life-cycle-models-for-digital-stewardship/ (URL verified 2012-03-12).

Class 9
Statement of David A. Powner, Director Information Technology Management Issues, United States Government Accountability Office (GAO Report GAO-10-222T). Testimony Before the Subcommittee on Information Policy, Census, and National Archives, Committee on Oversight and Government Reform, House of Representatives, November 5, 2009: Progress and Risks in Implementing its Electronic Records Archive Initiative. http://www.gao.gov/products/GAO-10-222T (URL verified 2012-03-04).

Classes 10 and 11
Assessing the Durability of Formats in a Digital Preservation Environment (2004-11), Andreas Stanescu. http://www.dlib.org/dlib/november04/stanescu/11stanescu.html (URL verified 2012-01-28).
Defining File Format Obsolescence: A Risky Journey, David Pearson, Colin Webb, International Journal of Digital Curation, Vol 3, No 1 (2008). http://www.ijdc.net/index.php/ijdc/article/view/76 (URL verified 2012-01-28).
Content Categories, a sub-section of Sustainability of Digital Formats: Planning for Library of Congress Collections. http://www.digitalpreservation.gov/formats/content/content_categories.shtml (URL verified 2012-01-28).
PDF File Migration to PDF/A: Technical Considerations, Frank L. Walker, et al (2006). archive.nlm.nih.gov/pubs/ceb2007/2007020.pdf (URL verified 2012-01-28).

Classes 12 and 13
Understanding Metadata, National Information Standards Organization (2004). http://www.niso.org/publications/press/UnderstandingMetadata.pdf (URL verified 2012-02-03).
Metacrap: Putting the torch to seven straw-men of the meta-utopia, Cory Doctorow (2001). http://www.well.com/~doctorow/metacrap.htm (URL verified 2012-02-03).
Metadata Encoding and Transmission Standard: Primer and Reference Manual, v. 1.6 revised. 2010 Digital Library Federation. http://www.loc.gov/standards/mets/mets-schemadocs.html (URL verified 2012-01-28).
PREMIS Data Dictionary for Preservation Metadata, version 2.1 (2011-01). PREMIS Editorial Committee. http://www.loc.gov/standards/premis/ (URL verified 2012-01-28).

Class 14
What is Digital Curation? (a subsection of The Value of Digital Curation, 2010). Digital Curation Centre Web site. http://www.dcc.ac.uk/digital-curation/what-digital-curation (URL verified 2012-01-28).
Historical context and the information age: the Diaspora of Holocaust archives, Raymund Schütz (2011). Provided by instructor.
Archivists, Curators, and Museum Technicians, Bureau of Labor Statistics (BLS) Occupational Outlook Handbook (2010-11 Ed.). http://www.bls.gov/oco/ocos065.htm (URL verified 2012-02-06).
Data Curation in Climate and Weather: Transforming Our Ability to Improve Predictions through Global Knowledge Sharing, Clifford A. Jacobs, National Science Foundation, Steven J. Worley, National Center for Atmospheric Research, The International Journal of Digital Curation, Issue 2, Volume 4 (2009). www.ijdc.net/index.php/ijdc/article/viewFile/119/122 (URL verified 2012-01-28).
Data Curation Program Development in U.S. Universities: The Georgia Institute of Technology Example, Tyler O. Walters, Associate Director, Technology and Resource Services, Library and Information Center, Georgia Institute of Technology, The International Journal of Digital Curation, Issue 3, Volume 4 (2009). www.ijdc.net/index.php/ijdc/article/viewFile/136/153 (URL verified 2012-01-28).

Class 15
United States Code, Title 17, Chapter 1. http://www.copyright.gov/title17/circ92.pdf (URL verified 2012-01-28).
Case: Lowry's Reports, Inc. v. Legg Mason, Inc. http://www.internetlibrary.com/cases/lib_case520.cfm and http://www.wlf.org/upload/062705LUPK.pdf (URLs verified 2012-01-28).
The Orphan Wars, James Grimmelmann, EDUCAUSE Review, Volume 47 No. 1, January/February 2012. http://www.educause.edu/EDUCAUSE+Review/EDUCAUSEReviewMagazineVolume47/TheOrphanWars/244410 (URL verified 2012-01-28).

Additional English-language Resources


[all URLs verified 2012-01-28]
D-Lib Magazine, http://www.dlib.org
Journal of Digital Information, http://journals.tdl.org/jodi/index
Journal of Digital Information Management, http://www.dirf.org/jdim/
Ariadne, http://www.ariadne.ac.uk/
Council on Library and Information Resources Publications, http://www.clir.org/pubs/pubs.html
DIGLIB Mailing List Information, http://www.ifla.org/II/lists/diglib.htm
Digital Library Federation, http://www.diglib.org/
OCLC Research, http://www.oclc.org/programs/default.htm
Email Preservation: Selected Bibliography, compiled by Christopher J. Prom (nd), http://e-records.chrisprom.com/?page_id=2180


Description of Assignments & Final Course Project


Assignments and the final course project shall also be delivered to the instructor by email in a word-processor format (*.doc, *.docx, or *.rtf), or in Portable Document Format (*.pdf). On the date an assignment is due, students shall bring a printed copy to the class for discussion.

The file name for individual assignments must bear the last name of the individual and the number of the assignment, e.g., Doe_A1. The deliverable file name for group assignments and the final course project shall have the group name and the term FCP. For example: TeamB_FCP.doc. In both types of assignment, every page of the assignment must show the file name and the page number.

Assignment 1: Email appraisal and categorisation. This is an individual assignment (due before class on Thursday 16 February).
Select all email messages from the last two (2) months, preferably in your work environment.
Sort these messages into a minimum of 12 categories, based on function. At least one category must be Permanent (arbitrary, if necessary).
Summarise your findings (3-5 pages). Include a description of each category, the basis for your appraisal of the content, the number of messages in each category, and a description of any problems you encountered, together with your solution.

Assignment 2: Referring back to Assignment 1, define your email records as authentic and trustworthy, or not. Justify your choice (2-3 pages). This is an individual assignment (due before class on Thursday 1 March).

Assignment 3: Preserving your email. This is an individual assignment, which begins immediately (Sunday 4 March) and is due before class on Thursday 19 April, so you have 6 weeks in which to complete it.
Download or print the User Manual from http://www.weirdkid.com/products/emailchemy/doc/Emailchemy_User_Manual.pdf and the FAQs from http://www.weirdkid.com/blog/category/emailchemy-faq/. Carefully read the User Manual and the FAQs.
Download and install on your personal computer a Personal copy of the Emailchemy archiving program from http://www.weirdkid.com/products/emailchemy/#purchase (U.S. $15 with student discount; use code WKEDUDIS, and don't forget to use your CUNY email address when registering the product, because that's how Weird Kid verifies your bona fides as a student). There are separate versions for Mac (OS X 10.4-10.7), Windows (32-bit Java and 64-bit Java), Linux, Solaris, and UNIX. The program itself is under 10 MB, but it does require that Java be installed. Java is closer to 110 MB, depending on the OS. Java must be installed on the primary system volume (the C: drive, or equivalent).
Follow the link for the Emailchemy Memory Boost (it's on the download page) and read the page (you probably won't need this supplemental program, but it's free, so download it if you think you might need it).
Download the Mozilla Thunderbird email client from http://www.mozilla.org/en-US/thunderbird/.
Make any changes you deem necessary to the default Emailchemy configuration settings, using the Administration module.
Use the Emailchemy program to import your email messages.
Convert the messages, and their attachments, to the EML format.
Export the messages to the Mozilla Thunderbird email client and open a sample of at least 20 messages in this client. At least 5 of the messages must have attachments.
View the 20 messages and their attachments, and note the differences, if any, between the messages and attachments in their original formats and as displayed by the Mozilla Thunderbird email client.
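
For the comparison step in Assignment 3, it can help to list a message's full headers and attachments directly from an EML file. The following is a minimal sketch using only Python's standard `email` package. It is not part of Emailchemy or Thunderbird, it assumes the EML file is a standard MIME message, and the helper names `summarize_eml` and `build_sample_eml`, along with the sample message itself, are hypothetical illustrations.

```python
from email import message_from_bytes, policy
from email.message import EmailMessage


def summarize_eml(raw: bytes) -> dict:
    """Parse raw EML bytes and return the headers and a list of attachments."""
    msg = message_from_bytes(raw, policy=policy.default)
    # Keep every header, in order, so that two copies can be compared line by line.
    headers = [(name, str(value)) for name, value in msg.items()]
    attachments = []
    for part in msg.iter_attachments():
        # decode=True undoes the transfer encoding (for example base64).
        payload = part.get_payload(decode=True) or b""
        attachments.append((part.get_filename(), part.get_content_type(), len(payload)))
    return {"headers": headers, "attachments": attachments}


def build_sample_eml() -> bytes:
    """Build a small hypothetical message with one attachment, for demonstration."""
    msg = EmailMessage()
    msg["From"] = "sender@example.org"
    msg["To"] = "recipient@example.org"
    msg["Subject"] = "Well file index"
    msg.set_content("See attached.")
    msg.add_attachment(b"abc", maintype="application", subtype="octet-stream",
                       filename="index.bin")
    return msg.as_bytes()


if __name__ == "__main__":
    summary = summarize_eml(build_sample_eml())
    for name, value in summary["headers"]:
        print(f"{name}: {value}")
    for filename, content_type, size in summary["attachments"]:
        print(f"attachment: {filename} ({content_type}, {size} bytes)")
```

To try it on a converted message, read the EML file in binary mode and pass the bytes to `summarize_eml`. The header list and attachment sizes give a concrete record to set beside what the email client displays.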

Notes from Matthew Hovey at Weird Kid: when the students are comparing before and after for differences, they should make sure to look at the full email headers. Often, this meta information is as important as the content. Importing into Thunderbird is a little tricky in that you can't use Thunderbird's Import feature; it's buggy. The Emailchemy user manual outlines a different process. If people are looking to preserve data, I usually recommend keeping the email in EML format for reasons of portability and robustness. Emailchemy writes the files in folders sorted as they were in the email client and names the files so that they are easily sorted by timestamp. Most desktop OSs can index and search these files easily, and the files are easily opened and rendered by most email clients. The one downside is attachments -- they don't get indexed because they are stored in base64 MIME parts.

In your paper (5-10 pages), discuss the following:
1. The ease or difficulty of installing and configuring the Emailchemy program.
2. The ease or difficulty of importing your email into Emailchemy.
3. The ease or difficulty of converting your email to the EML format.
4. The ease or difficulty of importing your email into the Mozilla Thunderbird email client.
5. The detectable differences, if any, between the messages and attachments in their original formats and as displayed by the Mozilla Thunderbird email client.
6. Your conclusions on the effectiveness and the efficiency of preserving email through this type of software.

For the purposes of this assignment, effectiveness is defined as how completely the Emailchemy software preserved the email messages with minimal detectable differences between the messages and attachments in their original formats and as displayed by the Mozilla Thunderbird email client. Again for the purposes of this assignment, efficiency is defined as the amount of time and effort you had to invest in the project, in proportion to the volume of email preserved AND its effectiveness.

Assignment 4: Prepare an outline, with some illustrative details, of the final course assignment for review and discussion in class on Thursday 26 April. This is a group assignment (groups of five to six students each). Because each group will be required to work together on a single document, it is recommended that students have access to some form of collaborative Web-based tool, such as Google Docs. Students will be assigned to groups and team leaders appointed by the instructor.

Final Course Project


The final course project is a group exercise (groups of five to six students each). Each group shall have a Team Leader who shall function as a project manager. Beginning on 19 April, the groups will be given time before the end of class to work face-to-face on the final course project. Because each group will be required to work together on a single document, it is recommended that students have access to some form of collaborative Web-based tool, such as Google Docs.

On the last day of class (Thursday 24 May), each group will present its project in executive summary format to a panel consisting of five persons, acting as Chief Executive Officer (CEO), Chief Operating Officer (COO), Chief Financial Officer (CFO), Chief Information Officer (CIO), and General Counsel (GC).

The length of the response should be 15-20 pages, and it will be the work product of the group as a whole: the grade given to the final course project for each group will constitute each group member's grade. Late delivery of the final course project is not acceptable.


Project Description (draft) It is for a company we'll call Awesome Oil & Gas (AOG). AOG is a small company with about 300 employees in its head office in Calgary, Alberta. It is owned by a large US oil company that also has interests in Houston, TX. AOG is profitable, drilling exploratory and producing wells in Alberta and Saskatchewan. AOG has an active document management (DM) group, but there is no formal Records Management in place. DM consists primarily of a two functions. First, they operate a central file room where employees request records, in particular well files. A well file holds ALL information about a particular land location (even if there have been multiple wells at that location). A small, recently-created well file is a couple of inches thick. An old, active well file can be a few feet thick with information going back prior to the mid-1950's. This information is not sent to offsite storage because engineers and geophysicists look at historical information regularly in order to make decisions about what to do next. The second function of DM is scanning. They scan company documents (95% of them are financial records, such as Accounts Payable). They're a happy bunch, feeling like they're doing something to push the company into the 21st century. One Spring day with a hint of promise in the air, AOG purchases another oil and gas company called Pretty Good Oil & Gas (PGOG). It is common for a purchased company to send its well files to the purchaser as soon as possible, and it is no different in this case. However, DM learns very quickly that PGOG was terrible at managing their well files - documents in the file are loose, unordered, mixed up, and generally a big fat mess. High up the chain of command it was decided that it would be good if they were scanned, since technology solves all problems. Of course, if you scan a big mess, you get a big scanned mess. 
The files were scanned by a third-party vendor, who pleaded with AOG to organize the files somewhat first. The pressure from high up to scan quickly prevailed, however, and so the vendor scanned each well file as one large PDF (some as large as 40 MB in size). A descriptive filename was given to each PDF, and AOG stored them on a network drive for employees to look at when needed. Because AOG wanted to utilize the PGOG assets as quickly as possible, the files got accessed right away, whether in their electronic version or in their paper version in the file room. Cries of anger arose quickly from the business units, where employees had great difficulty using the files. It impacted the bottom line, and so another project quickly developed to clean up the well files.

The project took on two dimensions. The first dimension was to organize the paper files into the industry-accepted patterns that the company was used to. That meant sorting the documents into 12 specific groups, and then into reverse chronological order within each group. Some in the DM group knew well files very well and were able to accomplish this task. The second dimension was to organize the electronic counterparts. One would think that it would be easier just to abandon the electronic files, but a combination of political will (in order to save face) and younger engineers who saw the promise in an electronic version was enough for the digital re-organization to proceed. Two people were hired to re-organize the electronic versions, add new material, and make them comparable to the paper versions. Instead of one big PDF, each well file now consisted of several PDF documents indexed by land location, document type, and date. The files were imported into an electronic document management system (EDMS), and employees were given access and search tools to find the documents.


Thus peace returned to AOG. While older employees still demanded the paper well file, younger engineers commented that in some cases they could do in half an hour what would have taken most of the day if the electronic files had not existed.

Issues for analysis in the final course project: TBD


Course Calendar
All classes meet on Thursdays.

February 2012
Thursday 2     Class 1: Introduction. Assignment 1 begins.
Thursday 9     Class 2: The extent and nature of the problem.
Thursday 16    Class 3: Archival theory and diplomatics; Authenticity, Integrity and Trust. Assignment 1 due before class.
Thursday 23    Class 4: Archival theory and diplomatics; Authenticity, Integrity and Trust, continued. Assignment 2 begins.

March 2012
Thursday 1     Class 5: A functional framework. Assignment 2 due before class. Guest lecturer: Nicholas Webb.
Thursday 8     Class 6: A functional framework, continued. Assignment 3 begins.
Thursday 15    Class 7: Preservation strategies.
Thursday 22    Class 8: Preservation strategies, continued.
Thursday 29    Class 9: Guest presentation (Skype): Fynnette Eaton.

April 2012
Thursday 5     Class 10: Data formats.
Thursday 12    Spring Recess: no class.
Thursday 19    Class 11: Data formats, continued. Assignment 3 due before class. Assignment 4 begins. Guest presentation (Skype): Wayne Hoff.
Thursday 26    Class 12: Metadata. Assignment 4 due before class.

May 2012
Thursday 3     Class 13: Metadata, continued. Guest lecturer: Rebecca Guenther.
Thursday 10    Class 14: Data curation.
Thursday 17    Class 15: Digital preservation and the law.
Thursday 24    Class 16: Presentation of final course project (due before class).
