Table of Contents
1. Preface
  1.1. Preface
2. Introduction
  2.1. DSpace System Documentation: Introduction
3. Functional Overview
  3.1. Data Model
  3.2. Plugin Manager
  3.3. Metadata
  3.4. Packager Plugins
  3.5. Crosswalk Plugins
  3.6. E-People and Groups
    3.6.1. E-Person
    3.6.2. Groups
  3.7. Authentication
  3.8. Authorization
  3.9. Ingest Process and Workflow
    3.9.1. Workflow Steps
  3.10. Supervision and Collaboration
  3.11. Handles
  3.12. Bitstream 'Persistent' Identifiers
  3.13. Storage Resource Broker (SRB) Support
  3.14. Search and Browse
  3.15. HTML Support
  3.16. OAI Support
  3.17. OpenURL Support
  3.18. Creative Commons Support
  3.19. Subscriptions
  3.20. Import and Export
  3.21. Registration
  3.22. Statistics
    3.22.1. System Statistics
    3.22.2. Item, Collection and Community Usage Statistics
  3.23. Checksum Checker
  3.24. Usage Instrumentation
  3.25. Choice Management and Authority Control
    3.25.1. Introduction and Motivation
4. Installation
  4.1. For the Impatient
  4.2. Prerequisite Software
    4.2.1. UNIX-like OS or Microsoft Windows
    4.2.2. Oracle Java JDK 6 or later (standard SDK is fine, you don't need J2EE)
    4.2.3. Apache Maven 2.2.x or later (Java build tool)
    4.2.4. Apache Ant 1.7 or later (Java build tool)
    4.2.5. Relational Database: (PostgreSQL or Oracle).
    4.2.6. Servlet Engine: (Apache Tomcat 5.5 or 6, Jetty, Caucho Resin or equivalent).
    4.2.7. Perl (only required for [dspace]/bin/dspace-info.pl)
  4.3. Installation Instructions
    4.3.1. Overview of Install Options
    4.3.2. Overview of DSpace Directories
    4.3.3. Installation
  4.4. Advanced Installation
    4.4.1. 'cron' Jobs
    4.4.2. Multilingual Installation
    4.4.3. DSpace over HTTPS
    4.4.4. The Handle Server
    4.4.5. Google and HTML sitemaps
    4.4.6. DSpace Statistics
  4.5. Windows Installation
    4.5.1. Pre-requisite Software
    4.5.2. Installation Steps
  4.6. Checking Your Installation
  4.7. Known Bugs
  4.8. Common Problems
5. Upgrading a DSpace Installation
  5.1. Upgrading from 1.6.x to 1.7
  5.2. Upgrading from 1.6 to 1.6.x
  5.3. Upgrading from 1.5.x to 1.6.x
  5.4. Upgrading From 1.5 or 1.5.1 to 1.5.2
  5.5. Upgrading From 1.4.2 to 1.5
  5.6. Upgrading From 1.4.1 to 1.4.2
  5.7. Upgrading From 1.4 to 1.4.x
  5.8. Upgrading From 1.3.2 to 1.4.x
  5.9. Upgrading From 1.3.1 to 1.3.2
  5.10. Upgrading From 1.2.x to 1.3.x
  5.11. Upgrading From 1.2.1 to 1.2.2
  5.12. Upgrading From 1.2 to 1.2.1
  5.13. Upgrading From 1.1 (or 1.1.1) to 1.2
  5.14. Upgrading From 1.1 to 1.1.1
  5.15. Upgrading From 1.0.1 to 1.1
6. Configuration
  6.1. General Configuration
    6.1.1. Input Conventions
    6.1.2. Update Reminder
  6.2. The dspace.cfg Configuration Properties File
    6.2.1. The dspace.cfg file
    6.2.2. Main DSpace Configurations
    6.2.3. DSpace Database Configuration
    6.2.4. DSpace Email Settings
    6.2.5. File Storage
    6.2.6. SRB (Storage Resource Broker) File Storage
    6.2.7. Logging Configuration
    6.2.8. Configuring Lucene Search Indexes
    6.2.9. Handle Server Configuration
    6.2.10. Delegation Administration: Authorization System Configuration
    6.2.11. Stackable Authentication Method(s)
    6.2.12. Restricted Item Visibility Settings
    6.2.13. Proxy Settings
    6.2.14. Configuring Media Filters
    6.2.15. Crosswalk and Packager Plugin Settings
    6.2.16. Event System Configuration
    6.2.17. Embargo
    6.2.18. Checksum Checker Settings
    6.2.19. Item Export and Download Settings
    6.2.20. Subscription Emails
    6.2.21. Batch Metadata Editing
    6.2.22. Hiding Metadata
    6.2.23. Settings for the Submission Process
    6.2.24. Configuring Creative Commons License
    6.2.25. WEB User Interface Configurations
    6.2.26. Browse Index Configuration
    6.2.27. Author (Multiple metadata value) Display
    6.2.28. Links to Other Browse Contexts
    6.2.29. Recent Submissions
    6.2.30. Submission License Substitution Variables
    6.2.31. Syndication Feed (RSS) Settings
    6.2.32. OpenSearch Support
    6.2.33. Content Inline Disposition Threshold
    6.2.34. Multi-file HTML Document/Site Settings
    6.2.35. Sitemap Settings
    6.2.36. Authority Control Settings
    6.2.37. JSPUI Upload File Settings
    6.2.38. JSP Web Interface (JSPUI) Settings
    6.2.39. JSPUI Configuring Multilingual Support
    6.2.40. JSPUI Item Mapper
    6.2.41. Display of Group Membership
    6.2.42. JSPUI / XMLUI SFX Server
    6.2.43. JSPUI Item Recommendation Setting
    6.2.44. XMLUI Specific Configuration
    6.2.45. OAI-PMH Configuration and Activation
    6.2.46. OAI-ORE Harvester Configuration
    6.2.47. DSpace SOLR Statistics Configuration
  6.3. Optional or Advanced Configuration Settings
    6.3.1. The Metadata Format and Bitstream Format Registries
    6.3.2. XPDF Filter
    6.3.3. Creating a new Media/Format Filter
    6.3.4. Configuring Usage Instrumentation Plugins
    6.3.5. SWORD Configuration
  6.4. Discovery
    6.4.1. Introduction Video
    6.4.2. Usage Guidelines
    6.4.3. Instructions for enabling Discovery in DSpace 1.7.0
    6.4.4. Instructions for Configuring Discovery
  6.5. DSpace Statistics
    6.5.1. Usage Event Logging and Usage Statistics Gathering
    6.5.2. Configuration settings for Statistics
    6.5.3. Older settings that are not currently utilized in the reports
  6.6. Embargo
    6.6.1. What is an embargo?
  6.7. Google Scholar Metadata Mappings
7. JSPUI Configuration and Customization
  7.1. Configuration
  7.2. Customizing the JSP pages
8. XMLUI Configuration and Customization
  8.1. Manakin Configuration Property Keys
  8.2. Configuring Themes and Aspects
    8.2.1. Aspects
    8.2.2. Themes
  8.3. Multilingual Support
  8.4. Creating a New Theme
  8.5. Customizing the News Document
  8.6. Adding Static Content
  8.7. Enabling OAI-ORE Harvester using XMLUI
    8.7.1. Automatic Harvesting (Scheduler)
  8.8. Additional XMLUI Learning Resources
  8.9. Mirage Configuration and Customization
    8.9.1.
    8.9.2. Mirage Theme Configuration and Customization
  8.10. XMLUI Base Theme Templates (dri2xhtml)
    8.10.1. dri2xhtml
    8.10.2. dri2xhtml-alt
9. System Administration
  9.1. Community and Collection Structure Importer
    9.1.1. Limitation
  9.2. Package Importer and Exporter
    9.2.1. Ingesting
    9.2.2. Disseminating
    9.2.3. Archival Information Packages (AIPs)
    9.2.4. METS packages
  9.3. Item Importer and Exporter
    9.3.1. DSpace Simple Archive Format
    9.3.2. Configuring metadata-[prefix].xml for Different Schema
    9.3.3. Importing Items
    9.3.4. Exporting Items
  9.4. Transferring Items Between DSpace Instances
  9.5. Item Update
    9.5.1. DSpace Simple Archive Format
    9.5.2. ItemUpdate Commands
  9.6. Registering (Not Importing) Bitstreams
    9.6.1. Accessible Storage
    9.6.2. Registering Items Using the Item Importer
    9.6.3. Internal Identification and Retrieval of Registered Items
    9.6.4. Exporting Registered Items
    9.6.5. METS Export of Registered Items
    9.6.6. Deleting Registered Items
  9.7. METS Tools
    9.7.1. The Export Tool
    9.7.2. Limitations
  9.8. MediaFilters: Transforming DSpace Content
  9.9. Sub-Community Management
  9.10. Batch Metadata Editing
    9.10.1. Exporting Process
    9.10.2. Import Function
    9.10.3. The CSV Files
  9.11. Checksum Checker
    9.11.1. Checker Execution Mode
    9.11.2. Checker Results Pruning
    9.11.3. Checker Reporting
    9.11.4. Cron or Automatic Execution of Checksum Checker
    9.11.5. Automated Checksum Checkers' Results
  9.12. Embargo
  9.13. Browse Index Creation
    9.13.1. Running the Indexing Programs
    9.13.2. Indexing Customization
  9.14. DSpace Log Converter
  9.15. Client Statistics
  9.16. Test Database
  9.17. Moving items
  9.18. AIP Backup and Restore
    9.18.1. Background & Overview
    9.18.2. Makeup and Definition of AIPs
    9.18.3. Running the Code
    9.18.4. Additional Packager Options
    9.18.5. Configuration in 'dspace.cfg'
    9.18.6. Common Issues or Error Messages
    9.18.7. DSpace AIP Format
  9.19. Curation System
    9.19.1. Tasks
    9.19.2. Activation
    9.19.3. Writing your own tasks
    9.19.4. Task Invocation
    9.19.5. Asynchronous (Deferred) Operation
    9.19.6. Task Output and Reporting
    9.19.7. Task Annotations
    9.19.8. Starter Tasks
10. Directories
  10.1. Overview
  10.2. Source Directory Layout
  10.3. Installed Directory Layout
  10.4. Contents of JSPUI Web Application
  10.5. Contents of XMLUI Web Application (aka Manakin)
  10.6. Log Files
    10.6.1. log4j.properties File.
11. Architecture
  11.1. Overview
    11.1.1. DSpace System Architecture
  11.2. Application Layer
    11.2.1. Web User Interface
    11.2.2. OAI-PMH Data Provider
    11.2.3. DSpace Command Launcher
  11.3. Business Logic Layer
    11.3.1. Core Classes
    11.3.2. Content Management API
    11.3.3. Plugin Manager
    11.3.4. Workflow System
    11.3.5. Administration Toolkit
    11.3.6. E-person/Group Manager
    11.3.7. Authorization
    11.3.8. Handle Manager/Handle Plugin
    11.3.9. Search
    11.3.10. Browse API
    11.3.11. Checksum checker
    11.3.12. OpenSearch Support
    11.3.13. Embargo Support
  11.4. DSpace Services Framework
    11.4.1. Configuring Event Listeners
    11.4.2. Architectural Overview
    11.4.3. Providers and Plugins
    11.4.4. Core Services
  11.5. Storage Layer
    11.5.1. RDBMS / Database Structure
    11.5.2. Bitstream Store
12. Submission User Interface
  12.1. Understanding the Submission Configuration File
    12.1.1. The Structure of item-submission.xml
    12.1.2. Defining Steps (<step>) within the item-submission.xml
  12.2. Reordering/Removing Submission Steps
  12.3. Assigning a custom Submission Process to a Collection
    12.3.1. Getting A Collection's Handle
  12.4. Custom Metadata-entry Pages for Submission
    12.4.1. Introduction
    12.4.2. Describing Custom Metadata Forms
    12.4.3. The Structure of input-forms.xml
    12.4.4. Deploying Your Custom Forms
  12.5. Configuring the File Upload step
  12.6. Creating new Submission Steps
    12.6.1. Creating a Non-Interactive Step
13. DRI Schema Reference
  13.1. Introduction
    13.1.1. The Purpose of DRI
    13.1.2. The Development of DRI
  13.2. DRI in Manakin
    13.2.1. Themes
    13.2.2. Aspect Chains
  13.3. Common Design Patterns
    13.3.1. Localization and Internationalization
    13.3.2. Standard attribute triplet
    13.3.3. Structure-oriented markup
  13.4. Schema Overview
  13.5. Merging of DRI Documents
  13.6. Version Changes
    13.6.1. Changes from 1.0 to 1.1
  13.7. Element Reference
    13.7.1. BODY
    13.7.2. cell
    13.7.3. div
    13.7.4. DOCUMENT
    13.7.5. field
    13.7.6. figure
    13.7.7. head
    13.7.8. help
    13.7.9. hi
    13.7.10. instance
    13.7.11. item
    13.7.12.
label .......................................................................................................... 13.7.13. list ............................................................................................................ 13.7.14. META .......................................................................................................
331 331 332 334 334 335 335 338 342 342 342 342 345 345 346 346 346 347 347 351 351 351 352 354 354 354 354 355 355 355 356 356 356 357 357 358 359 359 359 362 363 364 366 366 368 369 370 370 371 372 373 374 375
13.7.15. metadata .................................................................................................... 13.7.16. OPTIONS .................................................................................................. 13.7.17. p .............................................................................................................. 13.7.18. pageMeta ................................................................................................... 13.7.19. params ....................................................................................................... 13.7.20. reference .................................................................................................... 13.7.21. referenceSet ................................................................................................ 13.7.22. repository ................................................................................................... 13.7.23. repositoryMeta ............................................................................................ 13.7.24. row ........................................................................................................... 13.7.25. table .......................................................................................................... 13.7.26. trail ........................................................................................................... 13.7.27. userMeta .................................................................................................... 13.7.28. value ......................................................................................................... 13.7.29. xref ........................................................................................................... 14. Appendices ....................................................................................................................... 14.1. 
Appendix A ............................................................................................................ 14.1.1. Default Dublin Core Metadata Registry ............................................................. 14.1.2. Default Bitstream Format Registry ................................................................... 15. History ............................................................................................................................. 15.1. Changes in DSpace 1.7.0 .......................................................................................... 15.1.1. New Features ............................................................................................... 15.1.2. General Improvements ................................................................................... 15.1.3. Bug Fixes .................................................................................................... 15.2. Changes in DSpace 1.6.2 .......................................................................................... 15.2.1. General Improvements ................................................................................... 15.2.2. Bug Fixes .................................................................................................... 15.3. Changes in DSpace 1.6.1 .......................................................................................... 15.3.1. General Improvements ................................................................................... 15.3.2. Bug Fixes .................................................................................................... 15.4. Changes in DSpace 1.6.0 .......................................................................................... 15.4.1. New Features ............................................................................................... 15.4.2. 
General Improvements ................................................................................... 15.4.3. Bug Fixes .................................................................................................... 15.5. Changes in DSpace 1.5.2 .......................................................................................... 15.5.1. New Features ............................................................................................... 15.5.2. General Improvements ................................................................................... 15.5.3. Bug Fixes .................................................................................................... 15.6. Changes in DSpace 1.5.1 .......................................................................................... 15.6.1. General Improvements and Bug Fixes ............................................................... 15.7. Changes in DSpace 1.5 ............................................................................................ 15.7.1. General Improvements ................................................................................... 15.7.2. Bug fixes and smaller patches ......................................................................... 15.8. Changes in DSpace 1.4.1 .......................................................................................... 15.8.1. General Improvements ................................................................................... 15.8.2. Bug fixes ..................................................................................................... 15.9. Changes in DSpace 1.4 ............................................................................................ 15.9.1. General Improvements ................................................................................... 15.9.2. Bug fixes ..................................................................................................... 15.10. 
Changes in DSpace 1.3.2 ........................................................................................ 15.10.1. General Improvements ................................................................................. 15.10.2. Bug fixes ................................................................................................... 15.11. Changes in DSpace 1.3.1 ........................................................................................
376 377 378 378 380 381 381 382 383 383 384 385 386 387 388 390 390 390 393 395 395 395 395 399 405 405 405 406 406 406 409 409 410 413 420 420 420 423 428 428 430 430 431 432 432 433 434 434 435 436 436 436 436
15.11.1. Bug fixes ................................................................................................... 15.12. Changes in DSpace 1.3 ........................................................................................... 15.12.1. General Improvements ................................................................................. 15.12.2. Bug fixes ................................................................................................... 15.13. Changes in DSpace 1.2.2 ........................................................................................ 15.13.1. General Improvements ................................................................................. 15.13.2. Bug fixes ................................................................................................... 15.13.3. Changes in JSPs .......................................................................................... 15.14. Changes in DSpace 1.2.1 ........................................................................................ 15.14.1. General Improvements ................................................................................. 15.14.2. Bug fixes ................................................................................................... 15.14.3. Changed JSPs ............................................................................................. 15.15. Changes in DSpace 1.2 ........................................................................................... 15.15.1. General Improvments ................................................................................... 15.15.2. Administration ............................................................................................ 15.15.3. Import/Export/OAI ...................................................................................... 15.15.4. Miscellaneous ............................................................................................. 15.15.5. 
JSP file changes between 1.1 and 1.2 .............................................................. 15.16. Changes in DSpace 1.1.1 ........................................................................................ 15.16.1. Bug fixes ................................................................................................... 15.16.2. Improvements ............................................................................................. 15.17. Changes in DSpace 1.1 ...........................................................................................
436 436 436 437 437 437 438 438 439 439 439 439 440 440 440 441 441 441 444 444 445 445
1. Preface
Online Version of Documentation also available
This documentation was produced with Confluence software. HTML and PDF versions were generated directly from Confluence. An online, updated version of this documentation is also available at: https://wiki.duraspace.org/display/DSDOC
1.1. Preface
Welcome to Release 1.7.0! The committers have volunteered many hours to fix, rewrite, and contribute new software code for this release. The documentation has also been updated.

The following new features are included in release 1.7.0 (not an exhaustive list):

- Mirage: a new clean, professional-looking theme for XMLUI that eases theme development
- DSpace Discovery: a faceted browse/search interface for XMLUI that gives a deeper and more intuitive look at repository content
- Archival Information Package (AIP) Backup & Restore process: allows for a backup of DSpace into a generic METS-based structure that can be used to migrate DSpace content to another system that supports AIPs (DSpace or non-DSpace)
- Curation System User Interface: allows for a series of curation tasks (profile bitstream formats, virus scan, check for required metadata) to be performed on objects in DSpace
- PowerPoint text extraction, for searching within PowerPoint slides
- Improved Google Scholar indexing on metadata and PDF content
- Improved performance and scalability of DSpace: the code has been thoroughly analyzed to provide major performance gains in item ingestion and indexing speed, to support larger repositories
- Automated Unit Testing of core code: helps the developers ensure the core code of DSpace is as bug-free and stable as possible

A full list of all changes and bug fixes in 1.7.0 is available in Section 15, History.

The following people have contributed directly to this release of DSpace: @mire, Andrea Bollini, Andrea Schweer, Andreas Schwander, Andrew Hankinson, Andrew Taylor, Antero Neto, Ben Bosman, Bill Hays, BioMed Central, Bram Luyten, Caryn Neiswender, Christophe Dupriez, Claudia Jürgen, Enovation Solutions, Erick Rocha Fonseca, Flávio Botelho, Gabriela Mircea, Gareth Waller, Graham Triggs, Hardy Pottinger, Ivan Masár, Jason Stirnaman, Jeffrey Trimble, Keiji Suzuki, Keith Gilbertson, Kevin Van de Velde, Kim Shepherd, Mark Diggory, Mark H. Wood, Marvin Pollard, Michael B. Klein, Nicholas Riley, Nick Nicholas, OhioLINK, Oleksandr Sytnyk, Pere Villega, Peter Dietz, Reinhard Engels, Richard Rodgers, Robin Taylor, Sands Fish, Sarah Shreeves, Scott Phillips, Simon Brown, Stuart Lewis, Tim Donohue, Vladislav Zhivkov, Yin Yin Latt. Many of them could not do this work without the support (release time and financial) of their associated institutions. We thank those institutions for supporting their staff in taking time to contribute to the DSpace project.

We apologize to any contributor accidentally left off this list. DSpace has such a large, active development community that we sometimes lose track of all our contributors. Our ongoing list of all known people and institutions that have contributed to DSpace software can be found on our DSpace Contributors page. Acknowledgements to those left off will be made in future releases.

Want to make sure you make it onto the short list of contributors? All you have to do is report an issue, fix a bug, or help us determine the necessary requirements for a new feature! Visit our Issue Tracker to take part and get your name on the list of DSpace Contributors!

The Documentation Gardener for this release was Jeffrey Trimble, with input from everyone. All typos are his fault.

Peter Dietz is the Release Coordinator of this release. Tim Donohue helped out with coordinating the final days of the release.

Additional thanks to Tim Donohue from DuraSpace for keeping all of us focused on the work at hand, for calming us when we got excited, and for his general support of the DSpace project.
2. Introduction
2.1. DSpace System Documentation: Introduction
The DSpace Development List: join discussions among DSpace developers. The DSpace-Devel listserv is for DSpace developers working on the DSpace platform to share ideas and discuss code changes to the open source platform. Join other developers to shape the evolution of the DSpace software. The DSpace community depends on its members to frame functional requirements and high-level architecture, and to contribute programming, testing, and documentation to the project.
3. Functional Overview
The following sections describe the various functional aspects of the DSpace system.
3.1. Data Model
[Figure: Data Model Diagram]
The way data is organized in DSpace is intended to reflect the structure of the organization using the DSpace system. Each DSpace site is divided into communities, which can be further divided into sub-communities reflecting the typical university structure of college, department, research center, or laboratory.
Communities contain collections, which are groupings of related content. A collection may appear in more than one community.

Each collection is composed of items, which are the basic archival elements of the archive. Each item is owned by one collection. An item may also appear in additional collections; however, every item has one and only one owning collection.

Items are further subdivided into named bundles of bitstreams. Bitstreams are, as the name suggests, streams of bits, usually ordinary computer files. Bitstreams that are closely related, for example the HTML files and images that compose a single HTML document, are organised into bundles.

In practice, most items tend to have these named bundles:

- ORIGINAL: the bundle with the original, deposited bitstreams
- THUMBNAILS: thumbnails of any image bitstreams
- TEXT: extracted full text from bitstreams in ORIGINAL, for indexing
- LICENSE: contains the deposit license that the submitter granted the host organization; in other words, it specifies the rights that the hosting organization has
- CC_LICENSE: contains the distribution license, if any (a Creative Commons license), associated with the item. This license specifies what end users downloading the content can do with it

Each bitstream is associated with one Bitstream Format. Because preservation services may be an important aspect of the DSpace service, it is important to capture the specific formats of files that users submit. In DSpace, a bitstream format is a unique and consistent way to refer to a particular file format. An integral part of a bitstream format is an either implicit or explicit notion of how material in that format can be interpreted. For example, the interpretation for bitstreams encoded in the JPEG standard for still image compression is defined explicitly in the Standard ISO/IEC 10918-1.
The interpretation of bitstreams in Microsoft Word 2000 format is defined implicitly, through reference to the Microsoft Word 2000 application. Bitstream formats can be more specific than MIME types or file suffixes. For example, application/ms-word and .doc span multiple versions of the Microsoft Word application, each of which produces bitstreams with presumably different characteristics.

Each bitstream format additionally has a support level, indicating how well the hosting institution is likely to be able to preserve content in the format in the future. There are three possible support levels that bitstream formats may be assigned by the hosting institution. The host institution should determine the exact meaning of each support level, after careful consideration of costs and requirements. MIT Libraries' interpretation is shown below:

Supported: The format is recognized, and the hosting institution is confident it can make bitstreams of this format useable in the future, using whatever combination of techniques (such as migration, emulation, etc.) is appropriate given the context of need.

Known: The format is recognized, and the hosting institution will promise to preserve the bitstream as-is, and allow it to be retrieved. The hosting institution will attempt to obtain enough information to enable the format to be upgraded to the 'supported' level.

Unsupported: The format is unrecognized, but the hosting institution will undertake to preserve the bitstream as-is and allow it to be retrieved.
Each item has one qualified Dublin Core metadata record. Other metadata might be stored in an item as a serialized bitstream, but we store Dublin Core for every item for interoperability and ease of discovery. The Dublin Core may be entered by end-users as they submit content, or it might be derived from other metadata as part of an ingest process.

Items can be removed from DSpace in one of two ways:

- They may be 'withdrawn', which means they remain in the archive but are completely hidden from view. In this case, if an end-user attempts to access the withdrawn item, they are presented with a 'tombstone' that indicates the item has been removed.
- An item may also be 'expunged' if necessary, in which case all traces of it are removed from the archive.

Object            Example
Community         Laboratory of Computer Science; Oceanographic Research Center
Collection        LCS Technical Reports; ORC Statistical Data Sets
Item              A technical report; a data set with accompanying description; a video recording of a lecture
Bundle            A group of HTML and image bitstreams making up an HTML document
Bitstream         A single HTML file; a single image file; a source code file
Bitstream Format  Microsoft Word version 6.0; JPEG encoded image format
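The containment rules described in this section can be sketched in a few lines of Python. This is only an illustrative model, not the DSpace API: every class name, attribute, and the `map_into` helper below is hypothetical, and DSpace itself is implemented differently.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Bitstream:
    name: str
    format_name: str  # each bitstream is associated with one bitstream format


@dataclass
class Bundle:
    name: str  # e.g. "ORIGINAL", "THUMBNAILS", "TEXT", "LICENSE"
    bitstreams: List[Bitstream] = field(default_factory=list)


# eq=False keeps identity-based comparison, which avoids recursive
# equality checks between items and the collections that hold them.
@dataclass(eq=False)
class Collection:
    name: str
    items: List["Item"] = field(default_factory=list)


@dataclass(eq=False)
class Community:
    name: str
    sub_communities: List["Community"] = field(default_factory=list)
    collections: List[Collection] = field(default_factory=list)


@dataclass(eq=False)
class Item:
    title: str
    owning_collection: Collection  # exactly one owning collection
    bundles: List[Bundle] = field(default_factory=list)

    def __post_init__(self) -> None:
        # The owning collection always contains the item.
        self.owning_collection.items.append(self)

    def map_into(self, collection: Collection) -> None:
        # An item may appear in additional collections; its owner is unchanged.
        if self not in collection.items:
            collection.items.append(self)


lcs = Community("Laboratory of Computer Science")
reports = Collection("LCS Technical Reports")
datasets = Collection("ORC Statistical Data Sets")
lcs.collections.append(reports)

report = Item("A technical report", owning_collection=reports)
report.bundles.append(Bundle("ORIGINAL", [Bitstream("report.html", "HTML")]))
report.map_into(datasets)
```

After this runs, the report is listed in both collections but still has `reports` as its single owning collection.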
3.3. Metadata
Broadly speaking, DSpace holds three sorts of metadata about archived content:

Descriptive Metadata: DSpace can support multiple flat metadata schemas for describing an item. A qualified Dublin Core metadata schema loosely based on the Library Application Profile set of elements and qualifiers is provided by default. The set of elements and qualifiers used by MIT Libraries comes pre-configured with the DSpace source code. However, you can configure multiple schemas and select metadata fields from a mix of configured schemas to describe your items. Other descriptive metadata about items (e.g. metadata described in a hierarchical schema) may be held in serialized bitstreams. Communities and collections have some simple descriptive metadata (a name, and some descriptive prose), held in the DBMS.

Administrative Metadata: This includes preservation metadata, provenance and authorization policy data. Most of this is held within DSpace's relational DBMS schema. Provenance metadata (prose) is stored in Dublin Core records. Additionally, some other administrative metadata (for example, bitstream byte sizes and MIME types) is replicated in Dublin Core records so that it is easily accessible outside of DSpace.

Structural Metadata: This includes information about how to present an item, or bitstreams within an item, to an end-user, and the relationships between constituent parts of the item. As an example, consider a thesis consisting of a number of TIFF images, each depicting a single page of the thesis. Structural metadata would include the fact that each image is a single page, and the ordering of the TIFF images/pages. Structural metadata in DSpace is currently fairly basic; within an item, bitstreams can be arranged into separate bundles as described above. A bundle may also optionally have a primary bitstream. This is currently used by the HTML support to indicate which bitstream in the bundle is the first HTML file to send to a browser.

In addition to some basic technical metadata, a bitstream also has a 'sequence ID' that uniquely identifies it within an item. This is used to produce a 'persistent' bitstream identifier for each bitstream.

Additional structural metadata can be stored in serialized bitstreams, but DSpace does not currently understand this natively.
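To make the idea of flat, qualified metadata drawn from several schemas concrete, here is a minimal Python sketch. It is not DSpace code: the `MetadataField` class, the `values` helper, the dotted `schema.element.qualifier` key, and the `local` schema are all assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class MetadataField:
    schema: str                      # e.g. "dc" for Dublin Core
    element: str                     # e.g. "contributor"
    qualifier: Optional[str] = None  # e.g. "author"; None when unqualified

    def key(self) -> str:
        # Dotted key used in this sketch only: schema.element[.qualifier]
        parts = [self.schema, self.element]
        if self.qualifier is not None:
            parts.append(self.qualifier)
        return ".".join(parts)


# A flat record: an ordered list of (field, value) pairs with no nesting.
Record = List[Tuple[MetadataField, str]]


def values(record: Record, key: str) -> List[str]:
    """Return every value stored under the given dotted key, in order."""
    return [value for fld, value in record if fld.key() == key]


record: Record = [
    (MetadataField("dc", "title"), "A technical report"),
    (MetadataField("dc", "contributor", "author"), "Doe, Jane"),
    (MetadataField("dc", "contributor", "author"), "Roe, Richard"),
    # A field from a second, site-specific schema mixed into the same record.
    (MetadataField("local", "department"), "Laboratory of Computer Science"),
]

authors = values(record, "dc.contributor.author")
```

Because the record is flat, a repeated field is simply a repeated pair; anything hierarchical would have to live in a serialized bitstream, as the text above notes.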
3.6.1. E-Person
DSpace holds the following information about each e-person:

- E-mail address
- First and last names
- Whether the user is able to log in to the system via the Web UI, and whether they must use an X.509 certificate to do so
- A password (encrypted), if appropriate
- A list of collections for which the e-person wishes to be notified of new items
- Whether the e-person 'self-registered' with the system; that is, whether the system created the e-person record automatically as a result of the end-user independently registering with the system, as opposed to the e-person record being generated from the institution's personnel database, for example
- The network ID for the corresponding LDAP record, if LDAP authentication is used for this E-Person
3.6.2. Groups
Groups are another kind of entity that can be granted permissions in the authorization system. A group is usually an explicit list of E-People; anyone identified as one of those E-People also gains the privileges granted to the group. However, an application session can be assigned membership in a group without being identified as an E-Person. For example, some sites use this feature to identify users of a local network so they can read restricted materials not open to the whole world. Sessions originating from the local network are given membership in the "LocalUsers" group and gain the corresponding privileges. Administrators can also use groups as "roles" to manage the granting of privileges more efficiently.
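The two ways of belonging to a group, by explicit membership or by session, can be sketched as follows. This is a hypothetical Python model rather than DSpace code; the `Group` and `Session` classes, the `groups_for` function, and the "Reviewers" group are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Iterable, Optional, Set


@dataclass
class Group:
    name: str
    members: Set[str] = field(default_factory=set)  # e-person e-mail addresses


@dataclass
class Session:
    eperson_email: Optional[str] = None  # None: not identified as an e-person
    extra_groups: Set[str] = field(default_factory=set)  # assigned to the session


def groups_for(session: Session, all_groups: Iterable[Group]) -> Set[str]:
    """Union of session-assigned groups and explicit e-person memberships."""
    names = set(session.extra_groups)
    for group in all_groups:
        if session.eperson_email is not None and session.eperson_email in group.members:
            names.add(group.name)
    return names


site_groups = [Group("Reviewers", {"jane@example.org"}), Group("LocalUsers")]

# An unidentified session coming from the local network.
on_campus = Session(extra_groups={"LocalUsers"})
# An identified e-person connecting from elsewhere.
reviewer = Session(eperson_email="jane@example.org")
```

Here `groups_for(on_campus, site_groups)` contains only "LocalUsers" even though no e-person was identified, while the reviewer gains "Reviewers" through the explicit member list.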
3.7. Authentication
Authentication is when an application session positively identifies itself as belonging to an E-Person and/or Group. In DSpace 1.4 and later, it is implemented by a mechanism called Stackable Authentication: the DSpace configuration declares a "stack" of authentication methods. An application (like the Web UI) calls on the Authentication Manager, which tries each of these methods in turn to identify the E-Person to which the session belongs, as well as any extra Groups. The E-Person authentication methods are tried in turn until one succeeds. Every authenticator in the stack is given a chance to assign extra Groups.

This mechanism offers the following advantages:

- Separates authentication from the Web user interface, so the same authentication methods are used for other applications such as non-interactive Web Services
- Improved modularity: the authentication methods are all independent of each other. Custom authentication methods can be "stacked" on top of the default DSpace username/password method.
- Cleaner support for "implicit" authentication where the username is found in the environment of a Web request, e.g. in an X.509 client certificate
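The stacking behaviour described above (identify with the first method that succeeds, but let every method contribute groups) can be sketched like this. The sketch is an assumption-laden illustration in Python, not the DSpace Authentication Manager: the class names, the request dictionary keys, and the `10.` address check are all hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Request = Dict[str, str]


class AuthMethod:
    """Base class: by default identifies nobody and assigns no groups."""

    def identify(self, request: Request) -> Optional[str]:
        return None

    def extra_groups(self, request: Request) -> Set[str]:
        return set()


class CertificateAuth(AuthMethod):
    # "Implicit" authentication: the identity is already in the request environment.
    def identify(self, request: Request) -> Optional[str]:
        return request.get("client_cert_email")


class PasswordAuth(AuthMethod):
    def __init__(self, passwords: Dict[str, str]) -> None:
        # Plain-text passwords only keep the sketch short; never do this for real.
        self.passwords = passwords

    def identify(self, request: Request) -> Optional[str]:
        email = request.get("email")
        if email in self.passwords and self.passwords[email] == request.get("password"):
            return email
        return None


class LocalNetworkGroups(AuthMethod):
    # Identifies nobody, but grants a group to sessions from a local address range.
    def extra_groups(self, request: Request) -> Set[str]:
        if request.get("remote_addr", "").startswith("10."):
            return {"LocalUsers"}
        return set()


def authenticate(stack: List[AuthMethod], request: Request) -> Tuple[Optional[str], Set[str]]:
    eperson: Optional[str] = None
    groups: Set[str] = set()
    for method in stack:
        if eperson is None:                      # stop identifying after first success
            eperson = method.identify(request)
        groups |= method.extra_groups(request)   # every method may add groups
    return eperson, groups


stack = [CertificateAuth(), PasswordAuth({"jane@example.org": "secret"}), LocalNetworkGroups()]
result = authenticate(
    stack, {"email": "jane@example.org", "password": "secret", "remote_addr": "10.0.0.5"}
)
```

Because the methods share only the small `AuthMethod` interface, a site-specific method can be added to the list without touching the others, which is the modularity advantage listed above.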
3.8. Authorization
DSpace's authorization system is based on associating actions with objects and the lists of EPeople who can perform them. The associations are called Resource Policies, and the lists of EPeople are called Groups. There are two built-in groups: 'Administrators', who can do anything in a site, and 'Anonymous', which is a list that contains all users. Assigning a policy for an action on an object to anonymous means giving everyone permission to do that action. (For example, most objects in DSpace sites have a policy of 'anonymous' READ.) Permissions must be explicit: lack of an explicit permission results in the default policy of 'deny'. Permissions also do not 'commute'; for example, if an e-person has READ permission on an item, they might not necessarily have READ permission on the bundles and bitstreams in that item. Currently Collections, Communities and Items are discoverable in the browse and search systems regardless of READ authorization.

The following actions are possible:

Collection
  ADD/REMOVE               add or remove items (ADD = permission to submit items)
  DEFAULT_ITEM_READ        inherited as READ by all submitted items
  DEFAULT_BITSTREAM_READ   inherited as READ by Bitstreams of all submitted items. Note: only affects Bitstreams of an item at the time it is initially submitted. If a Bitstream is added later, it does not get the same default read policy.
  COLLECTION_ADMIN         collection admins can edit items in a collection, withdraw items, map other items into this collection

Item
  ADD/REMOVE               add or remove bundles
  READ                     can view item (item metadata is always viewable)
  WRITE                    can modify item

Bundle
  ADD/REMOVE

Bitstream
  READ
  WRITE
Note that there is no 'DELETE' action. In order to 'delete' an object (e.g. an item) from the archive, one must have REMOVE permission on all objects (in this case, collection) that contain it. The 'orphaned' item is automatically deleted. Policies can apply to individual e-people or groups of e-people.
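The rules in this section (explicit permissions, default deny, no inheritance between an item and its contents, and deletion through REMOVE on every container) can be sketched as below. This is a simplified, hypothetical Python model, not the DSpace authorization API: it handles group policies only, and the object identifiers and function names are invented.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class ResourcePolicy:
    object_id: str  # the object the policy applies to
    action: str     # e.g. "READ", "WRITE", "ADD", "REMOVE"
    group: str      # the group allowed to perform the action


def is_authorized(policies: Iterable[ResourcePolicy], user_groups: Set[str],
                  object_id: str, action: str) -> bool:
    groups = set(user_groups) | {"Anonymous"}  # Anonymous contains all users
    if "Administrators" in groups:             # administrators can do anything
        return True
    # Default deny: only an explicit matching policy grants permission.
    return any(
        p.object_id == object_id and p.action == action and p.group in groups
        for p in policies
    )


def can_delete(policies: List[ResourcePolicy], user_groups: Set[str],
               containers: List[str]) -> bool:
    # There is no DELETE action: REMOVE is needed on every containing object.
    return bool(containers) and all(
        is_authorized(policies, user_groups, container, "REMOVE")
        for container in containers
    )


policies = [
    ResourcePolicy("item-1", "READ", "Anonymous"),
    ResourcePolicy("collection-1", "REMOVE", "Editors"),
]

anyone_reads_item = is_authorized(policies, set(), "item-1", "READ")
anyone_reads_bitstream = is_authorized(policies, set(), "bitstream-1", "READ")
```

Here `anyone_reads_item` is true, but `anyone_reads_bitstream` is false: READ on the item says nothing about the bitstreams inside it, so each object needs its own policy.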
[Figure: DSpace Ingest Process]

The batch item importer is an application that turns an external SIP (an XML metadata document with some content files) into an "in progress submission" object. The Web submission UI is similarly used by an end-user to assemble an "in progress submission" object.

Depending on the policy of the collection to which the submission is targeted, a workflow process may be started. This typically allows one or more human reviewers or 'gatekeepers' to check over the submission and ensure it is suitable for inclusion in the collection.

When the Batch Ingester or Web Submit UI completes the InProgressSubmission object and invokes the next stage of ingest (be that workflow or item installation), a provenance message is added to the Dublin Core which includes the filenames and checksums of the content of the submission. Likewise, each time a workflow changes state (e.g. a reviewer accepts the submission), a similar provenance statement is added. This allows us to track how the item has changed since a user submitted it.

Once any workflow process is successfully and positively completed, the InProgressSubmission object is consumed by an "item installer", which converts the InProgressSubmission into a fully blown archived item in DSpace. The item installer:

- Assigns an accession date
- Adds a "date.available" value to the Dublin Core metadata record of the item
- Adds an issue date if none is already present
- Adds a provenance message (including bitstream checksums)
- Assigns a Handle persistent identifier
- Adds the item to the target collection, and adds appropriate authorization policies
- Adds the new item to the search and browse indices
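The item installer's steps can be sketched as a single function. This is a hedged Python illustration, not the DSpace installer: the `install_item` function, the dotted field names such as `dc.date.accessioned`, the SHA-256 checksum, and the wording of the provenance message are assumptions, and authorization policies and index updates are left out.

```python
import hashlib
from datetime import datetime, timezone
from typing import Dict, List


def install_item(metadata: Dict[str, List[str]], files: Dict[str, bytes],
                 collection: List[dict], handle: str, now: datetime) -> dict:
    stamp = now.isoformat()
    metadata["dc.date.accessioned"] = [stamp]   # accession date
    metadata["dc.date.available"] = [stamp]     # "date.available" value
    # Issue date is added only when the submission did not supply one.
    metadata.setdefault("dc.date.issued", [now.date().isoformat()])

    # Provenance message listing each file with its checksum.
    checksums = ", ".join(
        f"{name} (sha256 {hashlib.sha256(data).hexdigest()})"
        for name, data in sorted(files.items())
    )
    metadata.setdefault("dc.description.provenance", []).append(
        f"Installed on {stamp}: {checksums}"
    )

    # Attach the persistent identifier and place the item in its collection.
    item = {"handle": handle, "metadata": metadata, "files": files}
    collection.append(item)
    return item


target_collection: List[dict] = []
installed = install_item(
    metadata={"dc.title": ["A technical report"], "dc.date.issued": ["2010-06-01"]},
    files={"report.pdf": b"%PDF-"},
    collection=target_collection,
    handle="123456789/1",  # hypothetical Handle supplied by the caller
    now=datetime(2011, 1, 15, 12, 0, tzinfo=timezone.utc),
)
```

The issue date supplied by the submitter is kept, while the accession and availability dates always reflect the moment of installation.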
Submission Workflow in DSpace
If a submission is rejected, the reason (entered by the workflow participant) is e-mailed to the submitter, and the submission is returned to the submitter's 'My DSpace' page. The submitter can then make any necessary modifications and re-submit, whereupon the process starts again.
If a submission is 'accepted', it is passed to the next step in the workflow. If there are no more workflow steps with associated groups, the submission is installed in the main archive.
One last possibility is that a workflow can be 'aborted' by a DSpace site administrator. This is accomplished using the administration UI.
The reason for this apparently arbitrary design is that it was the simplest case that covered the needs of the early adopter communities at MIT. The functionality of the workflow system will no doubt be extended in the future.
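The accept/reject behaviour described above can be sketched as a small state function. This is only an illustration under stated assumptions: the function name next_state, the state labels and the representation of a workflow as a list of reviewer groups (None for a step with no associated group) are invented for the example.

```python
def next_state(step_groups, current_step, decision):
    """Return (state, step) after a reviewer's decision at current_step.

    step_groups holds one entry per workflow step: the reviewer group
    for that step, or None if no group is associated with it.
    """
    if decision == "reject":
        # The submission goes back to the submitter's 'My DSpace' page.
        return ("returned_to_submitter", None)
    if decision != "accept":
        raise ValueError(f"unknown decision: {decision!r}")
    # Skip any later steps that have no associated group.
    for step in range(current_step + 1, len(step_groups)):
        if step_groups[step]:
            return ("in_workflow", step)
    # No more steps with groups: install the item in the main archive.
    return ("installed", None)


steps = ["reviewers", None, "editors"]  # hypothetical three-step workflow
print(next_state(steps, 0, "accept"))
print(next_state(steps, 2, "accept"))
print(next_state(steps, 0, "reject"))
```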
3.11. Handles
Researchers require a stable point of reference for their works. The simple evolution from sharing of citations to emailing of URLs broke when Web users learned that sites can disappear or be reconfigured without notice, and that their bookmark files containing critical links to research results couldn't be trusted in the long term. To help solve this problem, a core DSpace feature is the creation of a persistent identifier for every item, collection and community stored in DSpace. To persist identifiers, DSpace requires a storage- and location-independent mechanism for creating and maintaining identifiers. DSpace uses the CNRI Handle System for creating these identifiers. The rest of this section assumes a basic familiarity with the Handle system.
DSpace uses Handles primarily as a means of assigning globally unique identifiers to objects. Each site running DSpace needs to obtain a unique Handle 'prefix' from CNRI, so we know that if we create identifiers with that prefix, they won't clash with identifiers created elsewhere.
Presently, Handles are assigned to communities, collections, and items. Bundles and bitstreams are not assigned Handles, since over time, the way in which an item is encoded as bits may change, in order to allow access with future technologies and devices. Older versions may be moved to off-line storage as a new standard becomes de facto. Since it's usually the item that is being preserved, rather than the particular bit encoding, it only makes sense to persistently identify and allow access to the item, and allow users to access the appropriate bit encoding from there.
Of course, it may be that a particular bit encoding of a file is explicitly being preserved; in this case, the bitstream could be the only one in the item, and the item's Handle would then essentially refer just to that bitstream. The same bitstream can also be included in other items, and thus would be citable as part of a greater item, or individually.
The Handle system also features a global resolution infrastructure; that is, an end-user can enter a Handle into any service (e.g. Web page) that can resolve Handles, and the end-user will be directed to the object (in the case of DSpace, community, collection or item) identified by that Handle. In order to take advantage of this feature of the Handle system, a DSpace site must also run a 'Handle server' that can accept and resolve incoming resolution requests. All the code for this is included in the DSpace source code bundle. Handles can be written in two forms:
hdl:1721.123/4567
http://hdl.handle.net/1721.123/4567
The above represent the same Handle. The first is possibly more convenient to use only as an identifier; however, by using the second form, any Web browser becomes capable of resolving Handles. An end-user need only access this form of the Handle as they would any other URL. It is possible to enable some browsers to resolve the first form of Handle as if it were a standard URL using CNRI's Handle Resolver plug-in, but since the first form can always be simply derived from the second, DSpace displays Handles in the second form, so that it is more useful for end-users.
It is important to note that DSpace uses the CNRI Handle infrastructure only at the 'site' level. In the above example, the DSpace site has been assigned the prefix '1721.123'. It is still the responsibility of the DSpace site to maintain the association between a full Handle (including the '4567' local part) and the community, collection or item in question.
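The relationship between the two forms of a Handle can be sketched as follows. The sketch assumes that the URL form is the Handle appended to the http://hdl.handle.net/ proxy address; the function names handle_to_url, url_to_handle and split_handle are hypothetical helpers, not part of DSpace.

```python
HANDLE_PROXY = "http://hdl.handle.net/"  # assumed resolver address


def handle_to_url(handle):
    """Turn 'hdl:prefix/local' (or 'prefix/local') into the URL form."""
    if handle.startswith("hdl:"):
        handle = handle[len("hdl:"):]
    return HANDLE_PROXY + handle


def url_to_handle(url):
    """Derive the 'hdl:' form from the URL form."""
    if not url.startswith(HANDLE_PROXY):
        raise ValueError(f"not a Handle proxy URL: {url!r}")
    return "hdl:" + url[len(HANDLE_PROXY):]


def split_handle(handle):
    """Split a Handle into the site prefix and the locally managed part."""
    if handle.startswith("hdl:"):
        handle = handle[len("hdl:"):]
    prefix, _, local = handle.partition("/")
    return prefix, local


print(handle_to_url("hdl:1721.123/4567"))
print(split_handle("hdl:1721.123/4567"))
```

The split reflects the division of responsibility described above: the prefix is assigned by CNRI, while the local part is maintained by the DSpace site.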
The above refers to the bitstream with sequence ID 24 in the item with the Handle hdl:123.456/789. The foo.html is really just there as a hint to browsers: Although DSpace will provide the appropriate MIME type, some browsers only function correctly if the file has an expected extension.
HTML Support
Browsing through title, author, date or subject indices, with optional image thumbnails
Search is an essential component of discovery in DSpace. Users' expectations from a search engine are quite high, so a goal for DSpace is to supply as many search features as possible. DSpace's indexing and search module has a very simple API which allows for indexing new content, regenerating the index, and performing searches on the entire corpus, a community, or collection. Behind the API is Lucene, a free Java search engine. Lucene gives us fielded searching, stop word removal, stemming, and the ability to incrementally add new indexed content without regenerating the entire index. The specific Lucene search indexes are configurable, enabling institutions to customize which DSpace metadata fields are indexed.
Another important mechanism for discovery in DSpace is the browse. This is the process whereby the user views a particular index, such as the title index, and navigates around it in search of interesting items. The browse subsystem provides a simple API for achieving this by allowing a caller to specify an index, and a subsection of that index. The browse subsystem then discloses the portion of the index of interest. Indices that may be browsed are item title, item issue date, item author, and subject terms. Additionally, the browse can be limited to items within a particular collection or community.
OAI Support
/stylesheet.css is not OK (the link will break)
http://somedomain.com/content.html is not OK (the link will continue to link to the external site which may change or disappear)
Any 'absolute links' (e.g. http://somedomain.com/content.html) are stored 'as is', and will continue to link to the external content (as opposed to relative links, which will link to the copy of the content stored in DSpace). Thus, over time, the content referred to by the absolute link may change or disappear.
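The distinction drawn above between relative links, server-absolute paths and absolute URLs can be checked mechanically before HTML content is submitted. The following is a minimal sketch using Python's standard urllib.parse; the function name classify_link and the three labels are invented for the example, and the relative path in the usage lines is hypothetical.

```python
from urllib.parse import urlparse


def classify_link(href):
    """Classify a link found in an HTML document prepared for submission."""
    parsed = urlparse(href)
    if parsed.scheme or parsed.netloc:
        # Stored 'as is'; keeps pointing at the external site.
        return "external"
    if parsed.path.startswith("/"):
        # Resolved against the server root, so the link will break.
        return "server-absolute"
    # Resolved against the document itself, so it stays inside the item.
    return "relative"


for href in ("images/figure1.gif", "/stylesheet.css",
             "http://somedomain.com/content.html"):
    print(href, "->", classify_link(href))
```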
3.19. Subscriptions
As noted above, end-users (e-people) may 'subscribe' to collections in order to be alerted when new items appear in those collections. Each day, end-users who are subscribed to one or more collections will receive an e-mail giving brief details of all new items that appeared in any of those collections the previous day. If no new items appeared in any of the subscribed collections, no e-mail is sent. Users can unsubscribe themselves at any time. RSS feeds of new items are also available for collections and communities.
Import and Export
3.21. Registration
Registration is an alternate means of incorporating items, their metadata, and their bitstreams into DSpace by taking advantage of the bitstreams already being in accessible computer storage. An example might be that there is a repository for existing digital assets. Rather than using the normal interactive ingest process or the batch import to furnish DSpace the metadata and to upload bitstreams, registration provides DSpace the metadata and the location of the bitstreams. DSpace uses a variation of the import tool to accomplish registration.
3.22. Statistics
DSpace offers system statistics for administrator usage, as well as usage statistics on the level of items, communities and collections.
Checksum Checker
- Broken-down list of item viewings
- A full break-down of all performed actions
- User logins
- Most popular searches
- Log Level Information
- Processing information
The results of statistical analysis can be presented in a by-month and an in-total report, and are available via the user interface. The reports can be made either public or restricted to administrator access only.
Usage Instrumentation
Choice Management and Authority Control
4. Authority keys are normally invisible in the public web UIs. They are only seen by administrators editing metadata. The value of an authority key is not expected to be meaningful to an end-user or site visitor.
Authority control is different from the controlled vocabulary of keywords already implemented in the submission UI:
1. Authorities are external to DSpace. The source of authority control is typically an external database or network resource. Plug-in architecture makes it easy to integrate new authorities without modifying any core code.
2. This authority proposal impacts all phases of metadata management. The keyword vocabularies are only for the submission UI. Authority control is asserted everywhere metadata values are changed, including unattended/batch submission, LNI and SWORD package submission, and the administrative UI.
Authority Key
4. Installation
4.1. For the Impatient
Since some users might want to get their test version up and running as fast as possible, offered below is an unsupported outline of getting DSpace to run quickly in a Unix-based environment. Only experienced Unix admins should even attempt the following without going to the detailed Installation Instructions.
useradd -m dspace
gunzip -c dspace-1.x-src-release.tar.gz | tar -xf -
createuser -U postgres -d -A -P dspace
createdb -U dspace -E UNICODE dspace
cd [dspace-source]/dspace/config
vi dspace.cfg
mkdir [dspace]
chown dspace [dspace]
su - dspace
cd [dspace-source]/dspace
mvn package
cd [dspace-source]/dspace/target/dspace-<version>-build.dir
ant fresh_install
cp -r [dspace]/webapps/* [tomcat]/webapps
/etc/init.d/tomcat start
[dspace]/bin/dspace create-administrator
4.2.2. Oracle Java JDK 6 or later (standard SDK is fine, you don't need J2EE)
DSpace now requires Oracle Java 6 or greater because of usage of new language capabilities introduced in 5 and 6 that make coding easier and cleaner.
Java can be downloaded from the following location: http://java.sun.com/javase/downloads/index.jsp
Only Oracle's Java has been tested with each release and is known to work correctly. Other flavors of Java may pose problems.
Prerequisite Software
4.2.6. Servlet Engine: (Apache Tomcat 5.5 or 6, Jetty, Caucho Resin or equivalent).
Apache Tomcat 5.5 or later. Tomcat can be downloaded from the following location: http://tomcat.apache.org.
Note that DSpace will need to run as the same user as Tomcat, so you might want to install and run Tomcat as a user called 'dspace'. Set the environment variable TOMCAT_USER appropriately.
You need to ensure that Tomcat a) has enough memory to run DSpace and b) uses UTF-8 as its default file encoding for international character support. So ensure in your startup scripts (etc.) that the following environment variable is set:
JAVA_OPTS="-Xmx512M -Xms64M -Dfile.encoding=UTF-8"
Modifications in [tomcat]/conf/server.xml: You also need to alter Tomcat's default configuration to support searching and browsing of multi-byte UTF-8 correctly. You need to add a configuration option to the <Connector> element in [tomcat]/conf/server.xml:
URIEncoding="UTF-8"
e.g. if you're using the default Tomcat config, it should read:
<!-- Define a non-SSL HTTP/1.1 Connector on port 8080 -->
<Connector port="8080"
           maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
           enableLookups="false" redirectPort="8443" acceptCount="100"
           connectionTimeout="20000" disableUploadTimeout="true"
           URIEncoding="UTF-8"/>
You may change the port from 8080 by editing it in the file above, and by setting the variable CONNECTOR_PORT in server.xml.
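One way to confirm that the URIEncoding option described above is in place is to parse server.xml and list any <Connector> elements that lack it. The following is a small sketch using Python's standard XML parser; the function name connectors_missing_utf8 and the sample XML are made up for the example, and a real server.xml would be read from [tomcat]/conf/server.xml.

```python
import xml.etree.ElementTree as ET


def connectors_missing_utf8(server_xml_text):
    """Return the ports of all <Connector> elements without URIEncoding="UTF-8"."""
    root = ET.fromstring(server_xml_text)
    return [
        connector.get("port")
        for connector in root.iter("Connector")
        if connector.get("URIEncoding", "").upper() != "UTF-8"
    ]


# Hypothetical, heavily shortened server.xml.
sample = """
<Server>
  <Service name="Catalina">
    <Connector port="8080" redirectPort="8443" URIEncoding="UTF-8"/>
    <Connector port="8009" protocol="AJP/1.3" redirectPort="8443"/>
  </Service>
</Server>
"""
print(connectors_missing_utf8(sample))
```

In this sample only the second connector is reported, since it has no URIEncoding attribute.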
Installation Instructions
Jetty or Caucho Resin
DSpace will also run on an equivalent servlet engine, such as Jetty (http://www.mortbay.org/jetty/index.html) or Caucho Resin (http://www.caucho.com/). Jetty and Resin are configured for correct handling of UTF-8 by default.
dspace/ - DSpace 'build' and configuration module
dspace-api/ - Java API source module
dspace-jspui/ - JSP-UI source module
dspace-oai - OAI-PMH source module
dspace-xmlui - XML-UI (Manakin) source module
dspace-lni - Lightweight Network Interface source module
dspace-sword - SWORD (Simple Web-service Offering Repository Deposit) deposit service source module
dspace-test - DSpace Tests (Unit and Integration Tests)
pom.xml - DSpace Parent Project definition
4.3.3. Installation
This method gets you up and running with DSpace quickly and easily. It is identical in both the Default Release and Source Release distributions.
1. Create the DSpace user. This needs to be the same user that Tomcat (or Jetty etc.) will run as. e.g. as root run:
useradd -m dspace
2. Download the latest DSpace release. There are two versions available with each release of DSpace (dspace-1.x-release. and dspace-1.x-src-release.xxx); you only need to choose one. If you want a copy of all underlying Java source code, you should download the dspace-1.x-src-release.xxx version. Within each version, you have a choice of compressed file format. Choose the one that best fits your environment.
3. Unpack the DSpace software. After downloading the software, based on the compression file format, choose one of the following methods to unpack your software:
a. Zip file. If you downloaded dspace-1.7-release.zip do the following:
unzip dspace-1.7-release.zip
For ease of reference, we will refer to the location of this unzipped version of the DSpace release as [dspace-source] in the remainder of these instructions. After unpacking the file, the user may wish to change the ownership of the dspace-1.7-release directory to the 'dspace' user. (And you may need to change the group.)
4. Database Setup
PostgreSQL: A PostgreSQL JDBC driver is configured as part of the default DSpace build. You no longer need to copy any PostgreSQL jars to get PostgreSQL installed. Create a dspace database, owned by the dspace PostgreSQL user (you are still logged in as 'root'):
createuser -U postgres -d -A -P dspace
createdb -U dspace -E UNICODE dspace
You will be prompted for a password for the DSpace database. (This isn't the same as the dspace user's UNIX password.)
Oracle: Setting up Oracle is a bit different now. You will still need to get a copy of the Oracle JDBC driver, but instead of copying it into a lib directory you will need to install it into your local Maven repository. (You'll need to download it first from this location: http://www.oracle.com/technetwork/database/enterprise-edition/jdbc-112010-090769.html) Run the following command (all on one line):
Create a database for DSpace. Make sure that the character set is one of the Unicode character sets. DSpace uses UTF-8 natively, and it is required that the Oracle database use the same character set. Create a user account for DSpace (e.g. dspace) and ensure that it has permissions to add and remove tables in the database.
5. Initial Configuration: Edit [dspace-source]/dspace/config/dspace.cfg; in particular you'll need to set these properties:
dspace.dir - must be set to the [dspace] (installation) directory.
dspace.url - complete URL of this server's DSpace home page.
dspace.hostname - fully-qualified domain name of web server.
dspace.name - "Proper" name of your server, e.g. "My Digital Library".
db.password - the database password you entered in the previous step.
mail.server - fully-qualified domain name of your outgoing mail server.
mail.from.address - the "From:" address to put on email sent by DSpace.
feedback.recipient - mailbox for feedback mail.
mail.admin - mailbox for DSpace site administrator.
alert.recipient - mailbox for server errors/alerts (not essential but very useful!)
registration.notify - mailbox for emails when new users register (optional)
You can interpolate the value of one configuration variable in the value of another one. For example, to set feedback.recipient to the same value as mail.admin, the line would look like:
feedback.recipient = ${mail.admin}
Refer to the General Configuration section for details and examples of the above.
6. DSpace Directory: Create the directory for the DSpace installation (i.e. [dspace]). As root (or a user with appropriate permissions), run:
mkdir [dspace]
chown dspace [dspace]
(Assuming the dspace UNIX username.)
7. Installation Package: As the dspace UNIX user, generate the DSpace installation package in the [dspace-source]/dspace directory:
cd [dspace-source]/dspace/
mvn package
8. Build DSpace and Initialize Database: As the dspace UNIX user, initialize the DSpace database and install DSpace to [dspace]:
cd [dspace-source]/dspace/target/dspace-<version>-build.dir
ant fresh_install
To see a complete list of build targets, run: ant help
The most likely thing to go wrong here is the database connection. See the Common Problems section.
9. Deploy Web Applications. You have two choices or techniques for having Tomcat/Jetty/Resin serve up your web applications:
Technique A. Simple and complete. You copy only (or all) of the DSpace Web application(s) you wish to use from the [dspace]/webapps directory to the appropriate directory in your Tomcat/Jetty/Resin installation. For example:
cp -R [dspace]/webapps/* [tomcat]/webapps (This will copy all the web applications to Tomcat.)
cp -R [dspace]/webapps/jspui [tomcat]/webapps (This will copy only the jspui web application to Tomcat.)
Technique B. Tell your Tomcat/Jetty/Resin installation where to find your DSpace web application(s). As an example, in the <Host> section of your [tomcat]/conf/server.xml you could add lines similar to the following (but replace [dspace] with your installation location):
<!-- Define the default virtual host
     Note: XML Schema validation will not work with Xerces 2.2.
 -->
<Host name="localhost" appBase="[dspace]/webapps"
      ....
[dspace]/bin/dspace create-administrator
11. Initial Startup! Now the moment of truth! Start up (or restart) Tomcat/Jetty/Resin. Visit the base URL(s) of your server, depending on which DSpace web applications you want to use. You should see the DSpace home page. Congratulations!
Base URLs of DSpace Web Applications:
JSP User Interface - (e.g.) http://dspace.myu.edu:8080/jspui
XML User Interface (aka. Manakin) - (e.g.) http://dspace.myu.edu:8080/xmlui
OAI-PMH Interface - (e.g.) http://dspace.myu.edu:8080/oai/request?verb=Identify (Should return an XML-based response)
In order to set up some communities and collections, you'll need to login as your DSpace Administrator (which you created with create-administrator above) and access the administration UI in either the JSP or XML user interface.
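The variable interpolation mentioned in the Initial Configuration step (for example feedback.recipient = ${mail.admin}) can be illustrated with a short sketch. This is not DSpace's configuration loader; it only shows the substitution idea. The function name interpolate, the cycle check and the sample mailbox address are assumptions made for the example.

```python
import re

_REFERENCE = re.compile(r"\$\{([^}]+)\}")


def interpolate(props):
    """Return a copy of props with every ${name} reference expanded."""

    def resolve(key, seen):
        if key in seen:
            raise ValueError(f"circular reference involving {key!r}")
        # Expand each ${name} by resolving the referenced property first.
        return _REFERENCE.sub(
            lambda match: resolve(match.group(1), seen | {key}), props[key]
        )

    return {key: resolve(key, frozenset()) for key in props}


# Hypothetical values for two of the properties listed in step 5.
config = interpolate({
    "mail.admin": "dspace-help@myu.edu",
    "feedback.recipient": "${mail.admin}",
})
print(config["feedback.recipient"])
```

In this sketch a reference to a property that is not defined raises a KeyError rather than being left unexpanded, which makes typos in property names visible early.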
Advanced Installation
Naturally you should change the frequencies to suit your environment. PostgreSQL also benefits from regular 'vacuuming', which optimizes the indexes and clears out any deleted data. Become the postgres UNIX user, run crontab -e and add (for example):
# Clean up the database nightly at 4.20am
20 4 * * * vacuumdb --analyze dspace > /dev/null 2>&1
In order that statistical reports are generated regularly and thus kept up to date you should set up the following cron jobs:
# Run stat analysis
0 1 * * * [dspace]/bin/dspace
0 1 * * * [dspace]/bin/dspace
0 2 * * * [dspace]/bin/dspace
0 2 * * * [dspace]/bin/dspace
You should choose the execution times that are most useful to you, and ensure that the report scripts run a short while after the analysis scripts to give the latter time to complete (a run over around 8 months' worth of logs can take around 25 seconds to complete). The resulting reports will let you know how long analysis took, and you can adjust your cron times accordingly.
The Locales might have the form language, language_country, language_country_variant. According to the languages you wish to support, you have to make sure that all the i18n related files are available; see the Multilingual User Interface Configuring MultiLingual Support section for the JSPUI or the Multilingual Support for XMLUI in the configuration documentation.
b. Install the CA (Certifying Authority) certificate for the CA that granted your server cert, if necessary. This assumes the server CA certificate is in ca.pem:
$JAVA_HOME/bin/keytool -import -noprompt -storepass changeit -trustcacerts -keystore $CATALINA_BASE/conf/keystore -alias ServerCA -file ca.pem
c. Optional ONLY if you need to accept client certificates for the X.509 certificate stackable authentication module. See the configuration section for instructions on enabling the X.509 authentication method. Load the keystore with the CA (certifying authority) certificates for the authorities of any clients whose certificates you wish to accept. For example, assuming the client CA certificate is in client1.pem:
$JAVA_HOME/bin/keytool -import -noprompt -storepass changeit -trustcacerts -keystore $CATALINA_BASE/conf/keystore -alias client1 -file client1.pem
d. Now add another Connector tag to your server.xml Tomcat configuration file, like the example below. The parts affecting or specific to SSL are shown in bold. (You may wish to change some details such as the port, pathnames, and keystore password)
<!-- Set clientAuth="true" ONLY if using client X.509 certs for authentication! -->
<Connector port="8443"
           maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
           enableLookups="false" disableUploadTimeout="true"
           acceptCount="100" debug="0"
           scheme="https" secure="true" sslProtocol="TLS"
           keystoreFile="conf/keystore" keystorePass="changeit"
           clientAuth="true"
           truststoreFile="conf/keystore" truststorePass="changeit" />
Also, check that the default Connector is set up to redirect "secure" requests to the same port as your SSL connector, e.g.:
<Connector port="8080"
           maxThreads="150" minSpareThreads="25" maxSpareThreads="75"
           enableLookups="false" redirectPort="8443"
           acceptCount="100" debug="0" />
2. Quick-and-dirty Procedure for Testing: If you are just setting up a DSpace server for testing, or to experiment with HTTPS, then you don't need to get a real server certificate. You can create a "self-signed" certificate for testing; web browsers will issue warnings before accepting it but they will function exactly the same after that as with a "real" certificate. In the examples below, $CATALINA_BASE is the directory under which your Tomcat is installed.
a. Optional ONLY if you don't already have a server certificate. Follow this sub-procedure to request a new, signed server certificate from your Certifying Authority (CA): Create a new key pair under the alias name "tomcat". When generating your key, give the Distinguished Name fields the appropriate values for your server and institution. CN should be the fully-qualified domain name of your server host. Here is an example:
$JAVA_HOME/bin/keytool -genkey -alias tomcat -keyalg RSA -keysize 1024 \
  -keystore $CATALINA_BASE/conf/keystore -storepass changeit -validity 365 \
  -dname 'CN=dspace.myuni.edu, OU=MIT Libraries, O=Massachusetts Institute of Technology, L=Cambridge, S=MA, C=US'
Then, create a CSR (Certificate Signing Request) and send it to your Certifying Authority. They will send you back a signed Server Certificate. This example command creates a CSR in the file tomcat.csr
$JAVA_HOME/bin/keytool -keystore $CATALINA_BASE/conf/keystore -storepass changeit \
  -certreq -alias tomcat -v -file tomcat.csr
Before importing the signed certificate, you must have the CA's certificate in your keystore as a trusted certificate. Get their certificate, and import it with a command like this (for the example mitCA.pem):
$JAVA_HOME/bin/keytool -keystore $CATALINA_BASE/conf/keystore -storepass changeit \
  -import -alias mitCA -trustcacerts -file mitCA.pem
Finally, when you get the signed certificate from your CA, import it into the keystore with a command like the following example: (cert is in the file signed-cert.pem)
$JAVA_HOME/bin/keytool -keystore $CATALINA_BASE/conf/keystore -storepass changeit \
  -import -alias tomcat -trustcacerts -file signed-cert.pem
Since you now have a signed server certificate in your keystore, you can, obviously, skip the next steps of installing a signed server certificate and the server CA's certificate. b. Create a Java keystore for your server with the password changeit, and install your server certificate under the alias "tomcat". This assumes the certificate was put in the file server.pem:
$JAVA_HOME/bin/keytool -genkey -alias tomcat -keyalg RSA -keystore $CATALINA_BASE/conf/keystore -storepass changeit
When answering the questions to identify the certificate, be sure to respond to "First and last name" with the fully-qualified domain name of your server (e.g. test-dspace.myuni.edu). The other questions are not important.
c. Optional ONLY if you need to accept client certificates for the X.509 certificate stackable authentication module. See the configuration section for instructions on enabling the X.509 authentication method. Load the keystore with the CA (certifying authority) certificates for the authorities of any clients whose certificates you wish to accept. For example, assuming the client CA certificate is in client1.pem:
$JAVA_HOME/bin/keytool -import -noprompt -storepass changeit -trustcacerts -keystore $CATALINA_BASE/conf/keystore -alias client1 -file client1.pem
d. Follow the procedure in the section above to add another Connector tag, for the HTTPS port, to your server.xml file.
Now consult the Apache Jakarta Tomcat Connector documentation to configure the mod_jk (note: NOT mod_jk2) module. Select the AJP 1.3 connector protocol. Also follow the instructions there to configure your Tomcat server to respond to AJP.
To use SSL on Apache HTTPD with mod_webapp, consult the DSpace 1.3.2 documentation. Apache have deprecated the mod_webapp connector and recommend using mod_jk.
To use Jetty's HTTPS support, consult the documentation for the relevant tool.
You don't have to use CNRI's Handle system. At the moment, you need to change the code a little to use something else (e.g. PURLs) but that should change soon.
You'll notice that while you've been playing around with a test server, DSpace has apparently been creating handles for you looking like hdl:123456789/24 and so forth. These aren't really Handles, since the global Handle system doesn't actually know about them, and lots of other DSpace test installs will have created the same IDs.
They're only really Handles once you've registered a prefix with CNRI (see below) and have correctly set up the Handle server included in the DSpace distribution. This Handle server communicates with the rest of the global Handle infrastructure so that anyone that understands Handles can find the Handles your DSpace has created.
If you want to use the Handle system, you'll need to set up a Handle server. This is included with DSpace. Note that this is not required in order to evaluate DSpace; you only need one if you are running a production service. You'll need to obtain a Handle prefix from the central CNRI Handle site.
A Handle server runs as a separate process that receives TCP requests from other Handle servers, and issues resolution requests to a global server or servers if a Handle entered locally does not correspond to some local content. The Handle protocol is based on TCP, so it will need to be installed on a server that can send and receive TCP on port 2641.
1. To configure your DSpace installation to run the handle server, run the following command:
[dspace]/bin/dspace make-handle-config
Ensure that [dspace]/handle-server matches whatever you have in dspace.cfg for the handle.dir property.
2. Edit the resulting [dspace]/handle-server/config.dct file to include the following lines in the "server_config" clause:
"storage_type" = "CUSTOM"
"storage_class" = "org.dspace.handle.HandlePlugin"
This tells the Handle server to get information about individual Handles from the DSpace code.
3. Once the configuration file has been generated, you will need to go to http://hdl.handle.net/4263537/5014 to upload the generated sitebndl.zip file. The upload page will ask you for your contact information. An administrator will then create the naming authority/prefix on the root service (known as the Global Handle Registry), and notify you when this has been completed. You will not be able to continue the handle server installation until you receive further information concerning your naming authority.
4. When CNRI has sent you your naming authority prefix, you will need to edit the config.dct file. The file will be found in [dspace]/handle-server. Look for "300:0.NA/YOUR_NAMING_AUTHORITY". Replace YOUR_NAMING_AUTHORITY with the assigned naming authority prefix sent to you.
5. Now start your handle server (as the dspace user):
[dspace]/bin/start-handle-server
Note that since the DSpace code manages individual Handles, administrative operations such as Handle creation and modification aren't supported by DSpace's Handle server.
[dspace]/bin/dspace update-handle-prefix 123456789 1303
This script will change any handles currently assigned prefix 123456789 to prefix 1303, so for example handle 123456789/23 will be updated to 1303/23 in the database.
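The effect of the prefix update described above can be sketched on a plain list of handles. This only illustrates the rewrite rule (replace the prefix, keep the local part); it is not the update-handle-prefix script itself, which operates on the database. The function name update_handle_prefix and the second sample handle are hypothetical.

```python
def update_handle_prefix(handles, old_prefix, new_prefix):
    """Rewrite handles whose prefix equals old_prefix; leave others untouched."""
    updated = []
    for handle in handles:
        prefix, separator, local = handle.partition("/")
        if separator and prefix == old_prefix:
            updated.append(f"{new_prefix}/{local}")
        else:
            updated.append(handle)
    return updated


print(update_handle_prefix(["123456789/23", "1721.1/5"], "123456789", "1303"))
```

Comparing the whole prefix, rather than testing whether the handle merely starts with the old prefix string, avoids rewriting a handle such as 1234567890/1 by accident.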
0 6 * * * [dspace]/bin/generate-sitemaps
solr.log.server = ${dspace.baseUrl}/solr/statistics
solr.dbfile = ${dspace.dir}/config/GeoLiteCity.dat
solr.spiderips.urls = http://iplists.com/google.txt, \
                      http://iplists.com/inktomi.txt, \
                      http://iplists.com/lycos.txt, \
                      http://iplists.com/infoseek.txt, \
                      http://iplists.com/altavista.txt, \
                      http://iplists.com/excite.txt, \
                      http://iplists.com/misc.txt, \
                      http://iplists.com/non_engines.txt
2. DSpace logging configuration for Solr. If your DSpace instance is protected by a proxy server, in order for Solr to log the correct IP address of the user rather than of the proxy, it must be configured to look for the X-Forwarded-For header. This feature can be enabled by ensuring the following setting is uncommented in the logging section of dspace.cfg:
useProxies = true
3. DSpace configuration for fields indexed into Solr Event records for search. In the dspace.cfg file, review the following property keys to make sure they are uncommented:
statistics.items.dc.1=dc.identifier
statistics.items.dc.2=dc.date.accessioned
statistics.items.type.1=dcinput
statistics.items.type.2=date
statistics.default.start.datepick = 01/01/1977
4. Configuration Control. In the dspace.cfg set the following property key:
statistics.item.authorization.admin=true
This will require the user to sign on to see the statistics. Setting the property to "false" will make them publicly available.
5. Final steps. Perform the following steps:
cd [dspace-source]/dspace
mvn package
cd [dspace-source]/dspace/target/dspace-<version>-build.dir
ant -Dconfig=[dspace]/config/dspace.cfg update
cp -R [dspace]/webapps/* [TOMCAT]/webapps
If you only need to build the statistics, and don't make any changes to other web applications, you can replace the copy step above with:
cp -R [dspace]/webapps/solr [TOMCAT]/webapps
Restart your webapps (Tomcat/Jetty/Resin).
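The proxy handling described in step 2 (useProxies = true) can be pictured as follows: when the setting is on and an X-Forwarded-For header is present, the logged address is taken from that header instead of from the connecting peer. This is a sketch under stated assumptions, not DSpace's logging code. It assumes the common convention that the first entry in the header is the original client, and the function name client_ip and the sample addresses are invented for the example.

```python
def client_ip(remote_addr, headers, use_proxies):
    """Pick the address to log for a request.

    remote_addr is the address of the connecting peer (the proxy, if any);
    headers is a dict of request headers.
    """
    if use_proxies:
        forwarded = headers.get("X-Forwarded-For", "")
        # By convention the left-most entry is the original client.
        first = forwarded.split(",")[0].strip()
        if first:
            return first
    return remote_addr


request_headers = {"X-Forwarded-For": "203.0.113.7, 10.0.0.1"}
print(client_ip("10.0.0.1", request_headers, use_proxies=True))
print(client_ip("10.0.0.1", request_headers, use_proxies=False))
```

Because clients can set this header themselves, it is only worth trusting when every request really does arrive through the proxy.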
Windows Installation
4. Create the directory for the DSpace installation (e.g. C:\DSpace)
5. Generate the DSpace installation package by running the following from command line (cmd) from your [dspace-source]/dspace/ directory:
mvn package
Note #1: This will generate the DSpace installation package in your [dspace-source]/dspace/target/dspace-[version]-build.dir/ directory.
Note #2: Without any extra arguments, the DSpace installation package is initialized for PostgreSQL. If you want to use Oracle instead, you should build the DSpace installation package as follows:
mvn -Ddb.name=oracle package
6. Initialize the DSpace database and install DSpace to [dspace] (e.g. C:\DSpace) by running the following from command line from your [dspace-source]/dspace/target/dspace-[version]-build.dir/ directory:
ant fresh_install
Note: to see a complete list of build targets, run: ant help
7. Create an administrator account by running the following from your [dspace] (e.g. C:\DSpace) directory:
[dspace]\bin\dspace create-administrator
8. Copy the Web application directories from [dspace]\webapps to Tomcat's webapps directory, which should be somewhere like C:\Program Files\Apache Software Foundation\Tomcat\webapps
Alternatively, tell your Tomcat installation where to find your DSpace web application(s). As an example, in the <Host> section of your [tomcat]/conf/server.xml you could add lines similar to the following (but replace [dspace] with your installation location):
9. Start the Tomcat service
10. Browse to either http://localhost:8080/jspui or http://localhost:8080/xmlui. You should see the DSpace home page for either the JSPUI or XMLUI, respectively.
Common Problems
[java] 2004-03-25 15:17:07,730 INFO org.dspace.storage.rdbms.InitializeDatabase @ Initializing Database
[java] 2004-03-25 15:17:08,816 FATAL org.dspace.storage.rdbms.InitializeDatabase @ Caught exception:
[java] org.postgresql.util.PSQLException: Connection refused. Check that the hostname and port are correct and that the postmaster is accepting TCP/IP connections.
[java] at org.postgresql.jdbc1.AbstractJdbc1Connection.openConnection(AbstractJdbc1Connection.java:204)
[java] at org.postgresql.Driver.connect(Driver.java:139)
it usually means you haven't yet added the relevant configuration parameter to your PostgreSQL configuration (see above), or perhaps you haven't restarted PostgreSQL after making the change. Also, make sure that the db.username and db.password properties are correctly set in [dspace-source]/config/dspace.cfg. An easy way to check that your DB is working OK over TCP/IP is to try this on the command line:
psql -U dspace -W -h localhost
Enter the dspace database password, and you should be dropped into the psql tool with a dspace=> prompt. Another common error looks like this:
[java] 2004-03-25 16:37:16,757 INFO org.dspace.storage.rdbms.InitializeDatabase @ Initializing Database
[java] 2004-03-25 16:37:17,139 WARN org.dspace.storage.rdbms.DatabaseManager @ Exception initializing DB pool
[java] java.lang.ClassNotFoundException: org.postgresql.Driver
[java] at java.net.URLClassLoader$1.run(URLClassLoader.java:198)
[java] at java.security.AccessController.doPrivileged(Native Method)
[java] at java.net.URLClassLoader.findClass(URLClassLoader.java:186)
This means that the PostgreSQL JDBC driver is not present in [dspace-source]/lib. See above.
Tomcat doesn't shut down: If you're trying to tweak Tomcat's configuration but nothing seems to make a difference to the error you're seeing, you might find that Tomcat hasn't been shutting down properly, perhaps because it's waiting for a stale connection to close gracefully, which won't happen. To see if this is the case, try running:
ps -ef | grep java
and look for Tomcat's Java processes. If they stay around after running Tomcat's shutdown.sh script, try running kill on them (or kill -9 if necessary), then start Tomcat again.
Database connections don't work, or accessing DSpace takes forever: If your browser sits there connecting when you try to access a DSpace Web page, or if the database connections fail, you might find that a 'zombie' database connection is hanging around preventing normal operation. To see if this is the case, try running:
ps -ef | grep postgres
You might see some processes like this:
dspace 16325 1997 0 Feb 14 ? 0:00 postgres: dspace dspace 127.0.0.1 idle in transaction
This is normal. DSpace maintains a 'pool' of open database connections, which are re-used to avoid the overhead of constantly opening and closing connections. If they're 'idle' it's OK; they're waiting to be used.
However, sometimes, if something went wrong, they might be stuck in the middle of a query, which seems to prevent other connections from operating, e.g.:
dspace 16325 1997 0 Feb 14 ? 0:00 postgres: dspace dspace 127.0.0.1 SELECT
This means the connection is in the middle of a SELECT operation, and if you're not using DSpace right that instant, it's probably a 'zombie' connection. If this is the case, try running kill on the process, and stopping and restarting Tomcat.
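The check described above (idle connections are fine, a connection sitting in a query while nobody is using DSpace is suspect) can be sketched as a small helper that reads one line of ps output. This is an illustrative sketch only: the function names are hypothetical, and it assumes the process title has the shape "postgres: user database client-address state", as in the sample lines.

```python
def connection_state(ps_line):
    """Return the state text after 'postgres: user db host', or None
    if the line is not a PostgreSQL backend line of that shape."""
    marker = "postgres:"
    if marker not in ps_line:
        return None
    fields = ps_line.split(marker, 1)[1].split()
    # fields: [user, database, client_address, state words...]
    return " ".join(fields[3:])


def looks_stuck(ps_line):
    """True when a backend reports a state other than idle."""
    state = connection_state(ps_line)
    return bool(state) and not state.startswith("idle")


idle_line = "dspace 16325 1997 0 Feb 14 ? 0:00 postgres: dspace dspace 127.0.0.1 idle"
busy_line = "dspace 16325 1997 0 Feb 14 ? 0:00 postgres: dspace dspace 127.0.0.1 SELECT"
```

A line flagged by looks_stuck is only a candidate: a connection legitimately serving a request shows the same state, so check that nobody is using DSpace before killing the process.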
*CORRECTION* This was moved from the end of the solr configuration section to just under Logging Configurations:
# If enabled, the logging and the solr statistics system will look for
# an X-Forward header. If it finds it, it will use this for the user IP Address
# useProxies = true
*CHANGE* The MediaFilter can now extract text from PowerPoint files through the PowerPoint Text Extractor:
#Names of the enabled MediaFilter or FormatFilter plugins
filter.plugins = PDF Text Extractor, HTML Text Extractor, \
                 PowerPoint Text Extractor, \
                 Word Text Extractor, JPEG Thumbnail
# [To enable Branded Preview]: remove last line above, and uncomment 2 lines below
#                Word Text Extractor, JPEG Thumbnail, \
#                Branded Preview JPEG

#Assign 'human-understandable' names to each filter
plugin.named.org.dspace.app.mediafilter.FormatFilter = \
  org.dspace.app.mediafilter.PDFFilter = PDF Text Extractor, \
  org.dspace.app.mediafilter.HTMLFilter = HTML Text Extractor, \
  org.dspace.app.mediafilter.WordFilter = Word Text Extractor, \
  org.dspace.app.mediafilter.PowerPointFilter = PowerPoint Text Extractor, \
  org.dspace.app.mediafilter.JPEGFilter = JPEG Thumbnail, \
  org.dspace.app.mediafilter.BrandedPreviewJPEGFilter = Branded Preview JPEG

#Configure each filter's input format(s)
filter.org.dspace.app.mediafilter.PDFFilter.inputFormats = Adobe PDF
filter.org.dspace.app.mediafilter.HTMLFilter.inputFormats = HTML, Text
filter.org.dspace.app.mediafilter.WordFilter.inputFormats = Microsoft Word
filter.org.dspace.app.mediafilter.PowerPointFilter.inputFormats = Microsoft Powerpoint, Microsoft Powerpoint XML
filter.org.dspace.app.mediafilter.JPEGFilter.inputFormats = BMP, GIF, JPEG, image/png
filter.org.dspace.app.mediafilter.BrandedPreviewJPEGFilter.inputFormats = BMP, GIF, JPEG, image/png
*CHANGE* The Crosswalk Plugin Configuration has changed with additional lines. Edit your file accordingly:
# Crosswalk Plugin Configuration:
# The purpose of Crosswalks is to translate an external metadata format to/from
# the DSpace Internal Metadata format (DIM) or the DSpace Database.
# Crosswalks are often used by one or more Packager plugins (see below).
plugin.named.org.dspace.content.crosswalk.IngestionCrosswalk = \
  org.dspace.content.crosswalk.AIPDIMCrosswalk = DIM, \
  org.dspace.content.crosswalk.AIPTechMDCrosswalk = AIP-TECHMD, \
  org.dspace.content.crosswalk.PREMISCrosswalk = PREMIS, \
  org.dspace.content.crosswalk.OREIngestionCrosswalk = ore, \
  org.dspace.content.crosswalk.NullIngestionCrosswalk = NIL, \
  org.dspace.content.crosswalk.OAIDCIngestionCrosswalk = dc, \
  org.dspace.content.crosswalk.DIMIngestionCrosswalk = dim, \
  org.dspace.content.crosswalk.METSRightsCrosswalk = METSRIGHTS, \
  org.dspace.content.crosswalk.RoleCrosswalk = DSPACE-ROLES

plugin.selfnamed.org.dspace.content.crosswalk.IngestionCrosswalk = \
  org.dspace.content.crosswalk.XSLTIngestionCrosswalk, \
  org.dspace.content.crosswalk.QDCCrosswalk

plugin.named.org.dspace.content.crosswalk.StreamIngestionCrosswalk = \
  org.dspace.content.crosswalk.NullStreamIngestionCrosswalk = NULLSTREAM, \
*NEW*
*CHANGE* The Packager Plugin Configuration has changed considerably. Carefully revise your configuration file:
# Packager Plugin Configuration:
# Configures the ingest and dissemination packages that DSpace supports.
# These Ingester and Disseminator classes support a specific package file format
# (e.g. METS) which DSpace understands how to import/export. Each Packager
# plugin often will use one (or more) Crosswalk plugins to translate metadata (see above).
plugin.named.org.dspace.content.packager.PackageDisseminator = \
  org.dspace.content.packager.DSpaceAIPDisseminator = AIP, \
  org.dspace.content.packager.DSpaceMETSDisseminator = METS, \
  org.dspace.content.packager.RoleDisseminator = DSPACE-ROLES

plugin.named.org.dspace.content.packager.PackageIngester = \
  org.dspace.content.packager.DSpaceAIPIngester = AIP, \
  org.dspace.content.packager.PDFPackager = Adobe PDF, PDF, \
  org.dspace.content.packager.DSpaceMETSIngester = METS, \
  org.dspace.content.packager.RoleIngester = DSPACE-ROLES
*CHANGE* The METS ingester configuration has changed. Edit your file carefully:
#### METS ingester configuration:
# These settings configure how DSpace will ingest a METS-based package
# Configures the METS-specific package ingesters (defined above)
# 'default' settings are specified by 'default' key
# Default Option to save METS manifest in the item: (default is false)
mets.default.ingest.preserveManifest = false
# Default Option to make use of collection templates when using the METS ingester (default is false)
mets.default.ingest.useCollectionTemplate = false
# Default crosswalk mappings
# Maps a METS 'mdtype' value to a DSpace crosswalk for processing.
# When the 'mdtype' value is same as the name of a crosswalk, that crosswalk
*NEW* A new property has been added to control the discovery index for the Event System Configuration:
# consumer to maintain the discovery index
event.consumer.discovery.class = org.dspace.discovery.IndexEventConsumer
event.consumer.discovery.filters = Community|Collection|Item|Bundle+Add|Create|Modify|Modify_Metadata|Delete|Remove
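The filters value above reads as a list of object types, a plus sign, and a list of event types. A minimal sketch of that reading follows. It is an assumption about the syntax drawn from the value itself, not DSpace's parser, and the function names are hypothetical.

```python
def parse_filter(value):
    """Split 'ObjA|ObjB+EventA|EventB' into (object types, event types)."""
    object_part, event_part = value.split("+", 1)
    objects = [name.strip() for name in object_part.split("|")]
    events = [name.strip() for name in event_part.split("|")]
    return objects, events


def matches(value, object_type, event_type):
    """True when the filter names both the object type and the event type."""
    objects, events = parse_filter(value)
    return object_type in objects and event_type in events


DISCOVERY_FILTER = (
    "Community|Collection|Item|Bundle+"
    "Add|Create|Modify|Modify_Metadata|Delete|Remove"
)
```

Under this reading, a Modify_Metadata event on an Item reaches the discovery consumer, while any event on an object type that is not listed (a Bitstream, say) does not.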
*NEW* License bundle display is now configurable. You can either display or suppress the contents of the license bundle.
# whether to display the contents of the licence bundle (often just the deposit
# licence in standard DSpace installation)
webui.licence_bundle.show = false
*CORRECTION* Thumbnail generation. The width and height of generated thumbnails had a missing equal sign.
*CORRECTION and ADDITION* Authority Control Settings have changed. Formerly called ChoiceAuthority, it is now referred to as DCInputAuthority.
## The DCInputAuthority plugin is automatically configured with every
## value-pairs element in input-forms.xml, namely:
## common_identifiers, common_types, common_iso_languages
##
## The DSpaceControlledVocabulary plugin is automatically configured
## with every *.xml file in [dspace]/config/controlled-vocabularies,
## and creates a plugin instance for each, using base filename as the name.
## eg: nsi, srsc.
## Each DSpaceControlledVocabulary plugin comes with three configuration options:
# vocabulary.plugin._plugin_.hierarchy.store = <true|false> # default: true
# vocabulary.plugin._plugin_.hierarchy.suggest = <true|false> # default: true
# vocabulary.plugin._plugin_.delimiter = "<string>" # default: "::"
##
## An example using "srsc" can be found later in this section
#plugin.selfnamed.org.dspace.content.authority.ChoiceAuthority = \
# org.dspace.content.authority.DCInputAuthority, \
# org.dspace.content.authority.DSpaceControlledVocabulary
## demo: subject code autocomplete, using srsc as authority
## (DSpaceControlledVocabulary plugin must be enabled)
#choices.plugin.dc.subject = srsc
#choices.presentation.dc.subject = select
#vocabulary.plugin.srsc.hierarchy.store = true
#vocabulary.plugin.srsc.hierarchy.suggest = true
#vocabulary.plugin.srsc.delimiter = "::"
*NEW* You are now able to order your bitstreams by sequence id or file name.
*NEW* DSpace now includes a metadata mapping feature that makes repository content discoverable by Google Scholar:
##### Google Scholar Metadata Configuration #####
google-metadata.config = ${dspace.dir}/config/crosswalks/google-metadata.properties
google-metadata.enable = true
# Enabling this property will concatenate CSS, JS and JSON files where possible.
# CSS files can be concatenated if multiple CSS files with the same media attribute
# are used in the same page. Links to the CSS files are automatically referring to the
# concatenated resulting CSS file.
# The theme sitemap should be updated to use the ConcatenationReader for all js, css and json
# files before enabling this property.
#xmlui.theme.enableConcatenation = false
# Enabling this property will minify CSS, JS and JSON files where possible.
# The theme sitemap should be updated to use the ConcatenationReader for all js, css and json
# files before enabling this property.
#xmlui.theme.enableMinification = false
*NEW* XMLUI Mirage Theme. This is a new theme with its own configuration:
### Settings for Item lists in Mirage theme ###
# What should the emphasis be in the display of item lists?
# Possible values : 'file', 'metadata'. If your repository is
# used mainly for scientific papers 'metadata' is probably the
# best way. If you have a lot of images and other files 'file'
# will be the best starting point
# (metadata is the default value if this option is not specified)
#xmlui.theme.mirage.item-list.emphasis = file
# DSpace by default uses 100 records as the limit for the oai responses.
# This can be altered by enabling the oai.response.max-records parameter
# and setting the desired amount of results.
oai.response.max-records = 100
# Define the metadata type EPDCX (EPrints DC XML)
# to be handled by the SWORD crosswalk configuration
#
mets.default.ingest.crosswalk.EPDCX = SWORD
You will find the result in [dspace-source]/dspace/target/dspace-[version]-build.dir. Inside this directory is the compiled binary distribution of DSpace. Before rebuilding DSpace, the above command will clean out any previously compiled code ('clean') and ensure that your local DSpace JAR files are updated from the remote Maven code repository.
7. Update DSpace. Update the DSpace installed directory with the new code and libraries. Issue the following commands:
cd [dspace-source]/dspace/target/dspace-[version]-build.dir
ant -Dconfig=[dspace]/config/dspace.cfg update
8. Update the Database. You will need to run the 1.6.x to 1.7.x database upgrade script. For PostgreSQL:
psql -U [dspace-user] -f [dspace-source]/dspace/etc/postgres/database_schema_16_17.sql [database name]
For Oracle: Execute the upgrade script, e.g. with sqlplus, recording the output:
a. Start SQL*Plus with sqlplus [connect args]
b. Record the output: SQL> spool 'upgrade.lst'
c. Run the upgrade script: SQL> @[dspace-source]/dspace/etc/oracle/database_schema_16_17.sql
d. Turn off recording of output: SQL> spool off
9. Generate Browse and Search Indexes. It's always good policy to rebuild your search and browse indexes when upgrading to a new release. To do this, run the following command from your DSpace install directory (as the 'dspace' user):
10. Deploy Web Applications. If your servlet container (e.g. Tomcat) is not configured to look for new web applications in your [dspace]/webapps directory, then you will need to copy the web application files into the appropriate subdirectory of your servlet container. For example:
cp -R [dspace]/webapps/* [tomcat]/webapps/
11. Restart servlet. Now restart your Tomcat/Jetty/Resin server program and test out the upgrade.
You will find the result in [dspace-source]/dspace/target/dspace-[version]-build.dir. Inside this directory is the compiled binary distribution of DSpace. Before rebuilding DSpace, the above command will clean out any previously compiled code ('clean') and ensure that your local DSpace JAR files are updated from the remote Maven repository.
7. Update DSpace. Update the DSpace installed directory with the new code and libraries. Issue the following commands:
cd [dspace-source]/dspace/target/dspace-[version]-build.dir
8. Run Registry Format Update for CC License. Creative Commons licenses have been assigned the wrong mime-type in past versions of DSpace. Even if you are not currently using CC Licenses, you should update your Bitstream Format Registry to include a new entry with the proper mime-type. To update your registry, run the following command:
[dspace]/bin/dspace registry-loader -bitstream [dspace]/etc/upgrades/15-16/new-bitstream-formats.xml
9. Update the Database. If you are using Creative Commons Licenses in your DSpace submission process, you will need to run the 1.5.x to 1.6.x database upgrade script again. In 1.6.0 the improper mime-type was being assigned to all CC Licenses. This has now been resolved, and rerunning the upgrade script will assign the proper mime-type to all existing CC Licenses in your DSpace installation. NOTE: You will receive messages that most of the script additions already exist. This is normal, and nothing to be worried about.
For PostgreSQL:
psql -U [dspace-user] -f [dspace-source]/dspace/etc/postgres/database_schema_15-16.sql [database name]
(Your database name is by default 'dspace'.) Example:
psql -U dspace -f /dspace-1.6-1-src-release/dspace/etc/postgres/database_schema_15-16.sql dspace
(The line break above is cosmetic. Please place your command on one line.)
For Oracle: Execute the upgrade script, e.g. with sqlplus, recording the output:
a. Start SQL*Plus with "sqlplus [connect args]"
b. Record the output: SQL> spool 'upgrade.lst'
c. Run the upgrade script, then turn off recording of output:
SQL> @[dspace-source]/dspace/etc/oracle/database_schema_15-16.sql
SQL> spool off
d. Please note: The final few statements WILL FAIL. That is because you have to run some queries and use the results to construct, manually, the statements that remove the constraints. Oracle doesn't have any easy way to automate this (unless you know PL/SQL). So, look for the comment line beginning:
"--You need to remove the already in place constraints" and follow the instructions in the actual SQL file. Refer to the contents of the spool file "upgrade.lst" for the output of the queries you'll need.
10. Generate Browse and Search Indexes. Though there are no database changes in the 1.6 to 1.6.1 release, it is good policy to rebuild your search and browse indexes when upgrading to a new release. To do this, run the following command from your DSpace install directory (as the dspace user):
[dspace]/bin/dspace index-init
11. Deploy Web Applications. Copy the web application files from your [dspace]/webapps directory to the subdirectory of your servlet container (e.g. Tomcat):
cp -R [dspace]/webapps/* [tomcat]/webapps/
12. Restart servlet. Now restart your Tomcat/Jetty/Resin server program and test out the upgrade.
Upgrading from 1.5.x to 1.6.x
1. Backup Your DSpace. First and foremost, make a complete backup of your system, including:
A snapshot of the database. To have a "snapshot" of the PostgreSQL database, you need to shut it down during the backup. You should also have your regular PostgreSQL backup output (using pg_dump commands).
The asset store ([dspace]/assetstore by default).
Your configuration files and customizations to DSpace (including any customized scripts).
2. Download DSpace 1.6.x. Retrieve the new DSpace 1.6.x source code either as a download from DSpace.org or check it out directly from the SVN code repository. If you downloaded DSpace, do not unpack it on top of your existing installation. Refer to Chapter 3.3.3 Installation, Step 3 for unpacking directives.
3. Stop Tomcat. Take down your servlet container. For Tomcat, use the $CATALINA/shutdown.sh script. (Many installations will have a startup/shutdown script in the /etc/init.d or /etc/rc.d directories.)
4. Apply any customizations. If you have made any local customizations to your DSpace installation, they will need to be migrated over to the new DSpace. These are housed in one of the following places:
JSPUI modifications: [dspace-source]/dspace/modules/jspui/src/main/webapp/
XMLUI modifications: [dspace-source]/dspace/modules/xmlui/src/main/webapp
5. Update Configuration Files. Some of the parameters have changed and some are new. Changes are noted below:
**CHANGE** The base URL and OAI URL property keys are set differently:
# DSpace host name - should match base URL. Do not include port number
dspace.hostname = localhost
# DSpace base host URL. Include port number etc.
dspace.baseUrl = http://localhost:8080
# DSpace base URL. Include port number etc., but NOT trailing slash
# Change to xmlui if you wish to use the xmlui as the default, or remove
# "/jspui" and set webapp of your choice as the "ROOT" webapp in
# the servlet engine.
dspace.url = ${dspace.baseUrl}/xmlui
# The base URL of the OAI webapp (do not include /request).
dspace.oai.url = ${dspace.baseUrl}/oai
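The ${dspace.baseUrl} references above mean that dspace.url and dspace.oai.url follow whatever base URL is configured. A minimal sketch of that substitution follows. It is an illustration of the idea, not DSpace's configuration loader: the function name is hypothetical and the properties are assumed to be already loaded into a dict.

```python
import re

# Matches ${some.key} references inside a property value.
_REFERENCE = re.compile(r"\$\{([^}]+)\}")


def resolve(key, props, _seen=()):
    """Return props[key] with every ${other.key} reference expanded."""
    if key in _seen:
        raise ValueError("circular reference: " + key)

    def substitute(match):
        # Expand the referenced key recursively, tracking the chain so far.
        return resolve(match.group(1), props, _seen + (key,))

    return _REFERENCE.sub(substitute, props[key])


props = {
    "dspace.baseUrl": "http://localhost:8080",
    "dspace.url": "${dspace.baseUrl}/xmlui",
    "dspace.oai.url": "${dspace.baseUrl}/oai",
}
```

Changing only dspace.baseUrl in this dict changes both derived URLs, which is the point of setting the keys this way.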
**NEW** New email options (Add these at the end of the "Email Settings" sub-section):
# A comma separated list of hostnames that are allowed to refer browsers to
# email forms. Default behavior is to accept referrals only from
# dspace.hostname
#mail.allowed.referrers = localhost
# Pass extra settings to the Java mail library. Comma separated, equals sign
# between the key and the value.
#mail.extraproperties = mail.smtp.socketFactory.port=465, \
# mail.smtp.socketFactory.class=javax.net.ssl.SSLSocketFactory, \
# mail.smtp.socketFactory.fallback=false
# An option is added to disable the mailserver. By default, this property is
# set to false. By setting mail.server.disabled = true, DSpace will not send
# out emails. It will instead log the subject of the email which should have
# been sent. This is especially useful for development and test environments
# where production data is used when testing functionality.
#mail.server.disabled = false
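The mail.extraproperties format (comma separated, with an equals sign between key and value) can be read as in the sketch below. This is an illustration only, not DSpace's code; the function name is hypothetical, and the value is assumed to have had its backslash line continuations already joined into one string.

```python
def parse_extra_properties(value):
    """Turn 'k1=v1, k2=v2' into a dict, ignoring empty entries."""
    result = {}
    for pair in value.split(","):
        pair = pair.strip()
        if not pair:
            continue
        # Split on the first '=' only, so values may themselves contain '='.
        key, _, val = pair.partition("=")
        result[key.strip()] = val.strip()
    return result


sample = (
    "mail.smtp.socketFactory.port=465, "
    "mail.smtp.socketFactory.class=javax.net.ssl.SSLSocketFactory, "
    "mail.smtp.socketFactory.fallback=false"
)
```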
**NEW** New Authorization levels and parameters. See the Section 6, Configuration documentation, "Delegation Administration" section for further information.
##### Authorization system configuration - Delegate ADMIN #####
# COMMUNITY ADMIN configuration
# subcommunities and collections
#core.authorization.community-admin.create-subelement = true
#core.authorization.community-admin.delete-subelement = true
# his community
#core.authorization.community-admin.policies = true
#core.authorization.community-admin.admin-group = true
# collections in his community
#core.authorization.community-admin.collection.policies = true
#core.authorization.community-admin.collection.template-item = true
#core.authorization.community-admin.collection.submitters = true
#core.authorization.community-admin.collection.workflows = true
#core.authorization.community-admin.collection.admin-group = true
# item owned by collections in his community
#core.authorization.community-admin.item.delete = true
#core.authorization.community-admin.item.withdraw = true
#core.authorization.community-admin.item.reinstatiate = true
#core.authorization.community-admin.item.policies = true
# also bundle...
#core.authorization.community-admin.item.create-bitstream = true
#core.authorization.community-admin.item.delete-bitstream = true
#core.authorization.community-admin.item-admin.cc-license = true
# COLLECTION ADMIN
#core.authorization.collection-admin.policies = true
#core.authorization.collection-admin.template-item = true
#core.authorization.collection-admin.submitters = true
#core.authorization.collection-admin.workflows = true
#core.authorization.collection-admin.admin-group = true
# item owned by his collection
#core.authorization.collection-admin.item.delete = true
#core.authorization.collection-admin.item.withdraw = true
#core.authorization.collection-admin.item.reinstatiate = true
#core.authorization.collection-admin.item.policies = true
# also bundle...
#core.authorization.collection-admin.item.create-bitstream = true
#core.authorization.collection-admin.item.delete-bitstream = true
#core.authorization.collection-admin.item-admin.cc-license = true
# ITEM ADMIN
#core.authorization.item-admin.policies = true
# also bundle...
#core.authorization.item-admin.create-bitstream = true
#core.authorization.item-admin.delete-bitstream = true
#core.authorization.item-admin.cc-license = true
**CHANGE** The METS ingester has been revised. (Modify in "Crosswalk and Packager Plugin Settings")
# Option to make use of collection templates when using the METS ingester (default is false)
mets.submission.useCollectionTemplate = false
# Crosswalk Plugins:
plugin.named.org.dspace.content.crosswalk.IngestionCrosswalk = \
  org.dspace.content.crosswalk.PREMISCrosswalk = PREMIS \
  org.dspace.content.crosswalk.OREIngestionCrosswalk = ore \
  org.dspace.content.crosswalk.NullIngestionCrosswalk = NIL \
  org.dspace.content.crosswalk.QDCCrosswalk = qdc \
  org.dspace.content.crosswalk.OAIDCIngestionCrosswalk = dc \
**CHANGE** Event Settings have had the following revision with the addition of 'harvester' (modify in "Event System Configuration"):
#### Event System Configuration ####
# default synchronous dispatcher (same behavior as traditional DSpace)
event.dispatcher.default.class = org.dspace.event.BasicDispatcher
event.dispatcher.default.consumers = search, browse, eperson, harvester
also:
#### Embargo Settings ####
# DC metadata field to hold the user-supplied embargo terms
embargo.field.terms = SCHEMA.ELEMENT.QUALIFIER
# DC metadata field to hold computed "lift date" of embargo
embargo.field.lift = SCHEMA.ELEMENT.QUALIFIER
# string in terms field to indicate indefinite embargo
embargo.terms.open = forever
# implementation of embargo setter plugin--replace with local implementation if
# applicable
plugin.single.org.dspace.embargo.EmbargoSetter = \
  org.dspace.embargo.DefaultEmbargoSetter
# implementation of embargo lifter plugin--replace with local implementation if
# applicable
plugin.single.org.dspace.embargo.EmbargoLifter = \
  org.dspace.embargo.DefaultEmbargoLifter
**NEW** New option for using the Batch Editing capabilities. See Batch Metadata Editing Configuration and also System Administration : Batch Metadata Editing
### Bulk metadata editor settings ###
# The delimiter used to separate values within a single field (defaults to a double pipe ||)
# bulkedit.valueseparator = ||
# The delimiter used to separate fields (defaults to a comma for CSV)
# bulkedit.fieldseparator = ,
# A hard limit of the number of items allowed to be edited in one go in the UI
# (does not apply to the command line version)
# bulkedit.gui-item-limit = 20
# Metadata elements to exclude when exporting via the user interfaces, or when
# using the command line version and not using the -a (all) option.
# bulkedit.ignore-on-export = dc.date.accessioned, dc.date.available, \
# dc.date.updated, dc.description.provenance
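The two bulk-edit separators work at different levels: the field separator splits a row into columns, and the value separator splits one column into multiple values. The sketch below shows that with Python's csv module. The function name and the sample column layout are hypothetical; the defaults are the ones named in the comments above.

```python
import csv
import io


def read_rows(text, field_separator=",", value_separator="||"):
    """Parse delimited text into rows of {column: [values]}."""
    reader = csv.DictReader(io.StringIO(text), delimiter=field_separator)
    rows = []
    for row in reader:
        rows.append({
            # An empty cell becomes an empty list of values.
            column: cell.split(value_separator) if cell else []
            for column, cell in row.items()
        })
    return rows


# Hypothetical sample; real exports will have their own columns.
sample_csv = "id,dc.title,dc.subject\n1,A title,maths||physics\n"
```

Here the dc.subject cell yields two values, while dc.title yields one. If a value itself contains the field separator, the csv module's usual quoting rules apply.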
**NEW** Ability to hide metadata fields is now available. (Look for "JSPUI & XMLUI Configurations" Section)
##### Hide Item Metadata Fields #####
# Fields named here are hidden in the following places UNLESS the
# logged-in user is an Administrator:
# 1. XMLUI metadata XML view, and Item splash pages (long and short views).
# 2. JSPUI Item splash pages
# 3. OAI-PMH server, "oai_dc" format.
# (NOTE: Other formats are _not_ affected.)
# To designate a field as hidden, add a property here in the form:
# metadata.hide.SCHEMA.ELEMENT.QUALIFIER = true
#
# This default configuration hides the dc.description.provenance field,
# since that usually contains email addresses which ought to be kept
# private and is mainly of interest to administrators:
metadata.hide.dc.description.provenance = true
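The rule in the comments above (a field is hidden when its metadata.hide property is true, unless the logged-in user is an Administrator) can be written out as a short sketch. The function name and the dict of properties are hypothetical; this is not DSpace's implementation.

```python
def is_hidden(field, props, is_admin=False):
    """True when 'field' (e.g. 'dc.description.provenance') should be
    hidden from the current user under the metadata.hide.* rule."""
    if is_admin:
        # Administrators always see every field.
        return False
    value = props.get("metadata.hide." + field, "false")
    return value.strip().lower() == "true"


props = {"metadata.hide.dc.description.provenance": "true"}
```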
**NEW** Choice Control and Authority Control options are available (Look for "JSPUI & XMLUI Configurations" Section):
And also:
#plugin.named.org.dspace.content.authority.ChoiceAuthority = \
# org.dspace.content.authority.SampleAuthority = Sample, \
# org.dspace.content.authority.LCNameAuthority = LCNameAuthority, \
# org.dspace.content.authority.SHERPARoMEOPublisher = SRPublisher, \
# org.dspace.content.authority.SHERPARoMEOJournalTitle = SRJournalTitle
## This ChoiceAuthority plugin is automatically configured with every
## value-pairs element in input-forms.xml, namely:
## common_identifiers, common_types, common_iso_languages
#plugin.selfnamed.org.dspace.content.authority.ChoiceAuthority = \
# org.dspace.content.authority.DCInputAuthority
## configure LC Names plugin
#lcname.url = http://alcme.oclc.org/srw/search/lcnaf
## configure SHERPA/RoMEO authority plugin
#sherpa.romeo.url = http://www.sherpa.ac.uk/romeo/api24.php
##
## This sets the default lowest confidence level at which a metadata value is included
## in an authority-controlled browse (and search) index. It is a symbolic
## keyword, one of the following values (listed in descending order):
## accepted
## uncertain
## ambiguous
## Demo: publisher name lookup through SHERPA/RoMEO:
#choices.plugin.dc.publisher = SRPublisher
#choices.presentation.dc.publisher = suggest
## demo: journal title lookup, with ISSN as authority
#choices.plugin.dc.title.alternative = SRJournalTitle
#choices.presentation.dc.title.alternative = suggest
#authority.controlled.dc.title.alternative = true
## demo: use choice authority (without authority-control) to restrict dc.type on EditItemMetadata page
# choices.plugin.dc.type = common_types
# choices.presentation.dc.type = select
## demo: same idea for dc.language.iso
# choices.plugin.dc.language.iso = common_iso_languages
# choices.presentation.dc.language.iso = select
# Change number of choices shown in the select in Choices lookup popup
#xmlui.lookup.select.size = 12
**REPLACE** RSS Feeds now support Atom 1.0. Replace its previous configuration with the one below:
#### Syndication Feed (RSS) Settings ######
# enable syndication feeds - links display on community and collection home pages
# (This setting is not used by XMLUI, as you enable feeds in your theme)
webui.feed.enable = false
# number of DSpace items per feed (the most recent submissions)
webui.feed.items = 4
# maximum number of feeds in memory cache
# value of 0 will disable caching
webui.feed.cache.size = 100
# number of hours to keep cached feeds before checking currency
# value of 0 will force a check with each request
webui.feed.cache.age = 48
# which syndication formats to offer
# use one or more (comma-separated) values from list:
# rss_0.90, rss_0.91, rss_0.92, rss_0.93, rss_0.94, rss_1.0, rss_2.0
webui.feed.formats = rss_1.0,rss_2.0,atom_1.0
# URLs returned by the feed will point at the global handle server (e.g. http://hdl.handle.net/123456789/1)
# Set to true to use local server URLs (i.e. http://myserver.myorg/handle/123456789/1)
webui.feed.localresolve = false
# Customize each single-value field displayed in the
# feed information for each item. Each of
#### OpenSearch Settings ####
# NB: for result data formatting, OpenSearch uses Syndication Feed Settings
# so even if Syndication Feeds are not enabled, they must be configured
# enable open search
websvc.opensearch.enable = false
# context for html request URLs - change only for non-standard servlet mapping
websvc.opensearch.uicontext = simple-search
# context for RSS/Atom request URLs - change only for non-standard servlet mapping
websvc.opensearch.svccontext = open-search/
# present autodiscovery link in every page head
websvc.opensearch.autolink = true
# number of hours to retain results before recalculating
websvc.opensearch.validity = 48
# short name used in browsers for search service
# should be 16 or fewer characters
websvc.opensearch.shortname = DSpace
# longer (up to 48 characters) name
websvc.opensearch.longname = ${dspace.name}
# brief service description
websvc.opensearch.description = ${dspace.name} DSpace repository
# location of favicon for service, if any must be 16X16 pixels
websvc.opensearch.faviconurl = http://www.dspace.org/images/favicon.ico
# sample query - should return results
websvc.opensearch.samplequery = photosynthesis
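The comments above put length limits on the two OpenSearch names (16 characters or fewer for the short name, up to 48 for the long name). A small check along those lines is sketched below. The function name is hypothetical, and the values passed in are assumed to be already resolved (that is, with ${dspace.name} replaced by the actual repository name).

```python
def check_opensearch_names(props):
    """Return a list of problems with the OpenSearch name settings."""
    problems = []
    shortname = props.get("websvc.opensearch.shortname", "")
    longname = props.get("websvc.opensearch.longname", "")
    if len(shortname) > 16:
        problems.append("shortname exceeds 16 characters")
    if len(longname) > 48:
        problems.append("longname exceeds 48 characters")
    return problems
```

A long repository name is the likely way to trip the second limit, since longname defaults to ${dspace.name}.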
**NEW** Provenance information exposed through METS metadata can now be hidden. (See "OAI-PMH SPECIFIC CONFIGURATIONS" in the dspace.cfg file)
# When exposing METS/MODS via OAI-PMH all metadata that can be mapped to MODS
# is exported. This includes description.provenance which can contain personal
# email addresses and other information not intended for public consumption. To
# hide this information set the following property to true
oai.mets.hide-provenance = true
**NEW** SWORD has a new setting listing the MIME types it accepts. (See "SWORD Specific Configurations" Section)
# A comma separated list of MIME types that SWORD will accept
sword.accepts = application/zip
**NEW** New OAI Harvesting Configuration settings are now available. (See "OAI Harvesting Configurations")
#---------------------------------------------------------------#
#--------------OAI HARVESTING CONFIGURATIONS--------------------#
#---------------------------------------------------------------#
# These configs are only used by the OAI-ORE related functions  #
#---------------------------------------------------------------#
### Harvester settings
# Crosswalk settings; the {name} value must correspond to a declared ingestion crosswalk
# harvester.oai.metadataformats.{name} = {namespace},{optional display name}
harvester.oai.metadataformats.dc = http://www.openarchives.org/OAI/2.0/oai_dc/, Simple Dublin Core
harvester.oai.metadataformats.qdc = http://purl.org/dc/terms/, Qualified Dublin Core
harvester.oai.metadataformats.dim = http://www.dspace.org/xmlns/dspace/dim, DSpace Intermediate Metadata
# This field works in much the same way as harvester.oai.metadataformats.PluginName
# The {name} must correspond to a declared ingestion crosswalk, while the
# {namespace} must be supported by the target OAI-PMH provider when harvesting content.
# harvester.oai.oreSerializationFormat.{name} = {namespace}
# Determines whether the harvester scheduling process should be started
# automatically when the DSpace webapp is deployed.
# default: false
harvester.autoStart=false
# Amount of time subtracted from the from argument of the PMH request to account
# for the time taken to negotiate a connection. Measured in seconds. Default value is 120.
#harvester.timePadding = 120
# How frequently the harvest scheduler checks the remote provider for updates,
# measured in minutes. The default value is 12 hours (or 720 minutes)
#harvester.harvestFrequency = 720
# The heartbeat is the frequency at which the harvest scheduler queries the local
# database to determine if any collections are due for a harvest cycle (based on
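To make the effect of harvester.timePadding concrete, here is a small sketch of the arithmetic the comment describes: the padding is subtracted from the time of the last harvest before it is used as the OAI-PMH "from" argument. This is an illustration only, not DSpace's harvester code; the function name is hypothetical, and the UTC timestamp format shown is the second-granularity form defined by OAI-PMH.

```python
from datetime import datetime, timedelta, timezone


def padded_from(last_harvest, time_padding_seconds=120):
    """Return the 'from' argument for the next request as a UTC string.

    Hypothetical helper: subtracts the padding (harvester.timePadding,
    in seconds) from the time of the previous harvest.
    """
    padded = last_harvest - timedelta(seconds=time_padding_seconds)
    return padded.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    last = datetime(2010, 3, 1, 12, 0, 0, tzinfo=timezone.utc)
    print(padded_from(last))  # 2010-03-01T11:58:00Z
```

With the default of 120 seconds, records changed in the two minutes before the previous harvest are requested again, which trades a little duplicate work for not missing updates.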
**NEW** SOLR Statistics Configurations. For more detailed information about the configuration, refer to DSpace SOLR Statistics Configuration; for installation procedures, refer to Advanced Installation: DSpace Statistics.
#---------------------------------------------------------------#
#--------------SOLR STATISTICS CONFIGURATIONS-------------------#
#---------------------------------------------------------------#
# These configs are only used by the SOLR interface/webapp to   #
# track usage statistics.                                       #
#---------------------------------------------------------------#
##### Usage Logging #####
solr.log.server = ${dspace.baseUrl}/solr/statistics
solr.spidersfile = ${dspace.dir}/config/spiders.txt
solr.dbfile = ${dspace.dir}/config/GeoLiteCity.dat
useProxies = true
statistics.items.dc.1=dc.identifier
statistics.items.dc.2=dc.date.accessioned
statistics.items.type.1=dcinput
You will find the result in [dspace-source]/dspace/target/dspace-[version]-build.dir. Inside this directory is the compiled binary distribution of DSpace. Before rebuilding DSpace, the above command cleans out any previously compiled code ('clean') and ensures that your local DSpace JAR files are updated from the remote Maven repository.
7. Update the database. The database schema needs to be updated. SQL files containing the relevant updates are provided. If you have made any local customizations to the database schema, review these updates and make sure they will work for you.
For PostgreSQL:
psql -U [dspace-user] -f [dspace-source]/dspace/etc/postgres/database_schema_15-16.sql [database name]
(Your database name is 'dspace' by default.) Example:
psql -U dspace -f /dspace-1.6-1-src-release/dspace/etc/postgres/database_schema_15-16.sql dspace
For Oracle: Execute the upgrade script, e.g. with sqlplus, recording the output:
a. Start SQL*Plus with "sqlplus [connect args]"
b. Record the output: SQL> spool 'upgrade.lst'
c. Run the upgrade script:
SQL> @[dspace-source]/dspace/etc/oracle/database_schema_15_16.sql
SQL> spool off
d. Please note: The final few statements WILL FAIL. That is because you have to run some queries and use the results to construct the statements that remove the constraints manually; Oracle doesn't have any easy way to automate this (unless you know PL/SQL). So, look for the comment line beginning "--You need to remove the already in place constraints" and follow the instructions in the actual SQL file. Refer to the contents of the spool file "upgrade.lst" for the output of the queries you'll need.
8. Update DSpace. Update the DSpace installed directory with the new code and libraries. Issue the following commands:
cd [dspace-source]/dspace/target/dspace-[version]-build.dir
ant -Dconfig=[dspace]/config/dspace.cfg update
9. Update Registry for the CC License. If you use the CC License, an incorrect MIME type is being assigned. You will need to run the following step:
[dspace]/bin/dspace registry-loader -bitstream [dspace]/etc/upgrades/15-16/new-bitstream-formats.xml
10. Generate Browse and Search Indexes. It is good policy to rebuild your search and browse indexes when upgrading to a new release. Almost every release has database changes, and indexes can be affected by this. The DSpace 1.6 release adds Authority Control features, which need the indexes to be regenerated. To do this, run the following command from your DSpace install directory (as the dspace user):
[dspace]/bin/dspace index-init
11. Deploy Web Applications. Copy the web application files from your [dspace]/webapps directory to the webapps subdirectory of your servlet container (e.g. Tomcat):
cp -R [dspace]/webapps/* [tomcat]/webapps/
12. Restart servlet container. Now restart your Tomcat/Jetty/Resin server and test the upgrade.
13. Rolling Log Appender Upgrade. You will want to convert your logs to the new format in order to use the SOLR Statistics now included with DSpace. The commands are described in Chapter 8; these are the steps to perform:
[dspace]/bin/dspace stats-log-converter -i input file name -o output file name -m (if you have more than one dspace.log file)
[dspace]/bin/dspace stats-log-importer -i input file name (probably the output name from above) -m
You are strongly encouraged to read the System Administration: DSpace Log Converter documentation.
Upgrading From 1.5 or 1.5.1 to 1.5.2
You will find the result in [dspace-source]/dspace/target/dspace-1.5.2-build.dir/; inside this directory is the compiled binary distribution of DSpace.
4. Stop Tomcat. Take down your servlet container; for Tomcat, use the bin/shutdown.sh script.
5. Apply any customizations. If you have made any local customizations to your DSpace installation, they will need to be migrated over to the new DSpace. Commonly these modifications are made to "JSP" pages located inside the [dspace 1.4.2]/jsp/local directory. These should be moved to [dspace-source]/dspace/modules/jspui/src/main/webapp/ in the new build structure. See Customizing the JSP Pages for more information.
6. Update DSpace. Update the DSpace installed directory with new code and libraries. Inside the [dspace-source]/dspace/target/dspace-1.5-build.dir/ directory run:
7. Update configuration files. This ant target preserves existing files in [dspace]/config and copies any new configuration files into place. If an existing file prevents the new file from being copied into place, the new file is given the suffix .new, for example [dspace]/local/dspace.cfg.new. Note: there is also a configuration option, -Doverwrite=true, which does essentially the opposite: it copies the conflicting target files to *.old and then overwrites the target file with the new file. This is beneficial for developers and for those who use [dspace-source]/dspace/config to maintain their changes.
cd [dspace-source]/dspace/target/dspace-1.5-build.dir/
ant -Dconfig=[dspace]/config/dspace.cfg update_configs
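After update_configs has run, any file it could not copy into place is left beside the existing one with a .new suffix. As a convenience, the sketch below lists those pairs so each can be compared and merged by hand. It is a hypothetical helper written for this guide, not a DSpace tool, and it assumes the suffix behaviour is exactly as described in step 7.

```python
from pathlib import Path


def find_unmerged_configs(config_dir):
    """Return (new_file, existing_file) pairs found under config_dir.

    Hypothetical helper: looks for '*.new' files recursively and pairs
    each with the file it would have replaced (same path, no suffix).
    """
    pairs = []
    for new_file in sorted(Path(config_dir).rglob("*.new")):
        existing = new_file.with_suffix("")  # drops the trailing ".new"
        pairs.append((new_file, existing))
    return pairs


if __name__ == "__main__":
    # "/dspace/config" is only an example path; use your own [dspace]/config.
    for new_file, existing in find_unmerged_configs("/dspace/config"):
        print("compare %s with %s" % (existing, new_file))
```

Each reported pair can then be compared with the diff command and merged manually.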
You must then compare the .new files under [dspace]/config with your existing files and merge the differences into your configuration. Some of the new parameters you should look out for in dspace.cfg include: A new option to restrict the exposure of private items. The following needs to be added to dspace.cfg:
#### Restricted item visibility settings ###
# By default RSS feeds, OAI-PMH and subscription emails will include ALL items
# regardless of permissions set on them.
#
# If you wish to only expose items through these channels where the ANONYMOUS
# user is granted READ permission, then set the following options to false
#harvest.includerestricted.rss = true
#harvest.includerestricted.oai = true
#harvest.includerestricted.subscription = true
# Uncommenting the option below will make the metadata items case-insensitive. This will
# result in a single entry in the example above. However the value displayed may be either 'Olive oil'
### Usage event settings ###
# The usage event handler to call. The default is the "passive" handler, which ignores events.
# plugin.single.org.dspace.app.statistics.AbstractUsageEvent = \
# org.dspace.app.statistics.PassiveUsageEvent
#### Sitemap settings #####
# the directory where the generated sitemaps are stored
sitemap.dir = ${dspace.dir}/sitemaps
MARC 21 ordering should now be used as default. Unless you have it set already, or you have it set to a different value, the following should be set:
plugin.named.org.dspace.sort.OrderFormatDelegate = org.dspace.sort.OrderFormatTitleMarc21=title
##### Hierarchical LDAP Settings #####
# If your users are spread out across a hierarchical tree on your
# LDAP server, you will need to use the following stackable authentication
# class:
# plugin.sequence.org.dspace.authenticate.AuthenticationMethod = \
# org.dspace.authenticate.LDAPHierarchicalAuthentication
#
# You can optionally specify the search scope. If anonymous access is not
# enabled on your LDAP server, you will need to specify the full DN and
# password of a user that is allowed to bind in order to search for the
# users.
# This is the search scope value for the LDAP search during
# autoregistering. This will depend on your LDAP server setup.
# This value must be one of the following integers corresponding
# to the following values:
# object scope : 0
# one level scope : 1
# subtree scope : 2
#ldap.search_scope = 2
# The full DN and password of a user allowed to connect to the LDAP server
# and search for the DN of the user trying to log in. If these are not specified,
# the initial bind will be performed anonymously.
#ldap.search.user = cn=admin,ou=people,o=myu.edu
#ldap.search.password = password
# If your LDAP server does not hold an email address for a user, you can use
# the following field to specify your email domain. This value is appended
# to the netid in order to make an email address. E.g. a netid of 'user' and
# ldap.netid_email_domain as '@example.com' would set the email of the user
# to be 'user@example.com'
#### Shibboleth Authentication Configuration Settings ####
# Check https://mams.melcoe.mq.edu.au/zope/mams/pubs/Installation/dspace15/view
# for installation detail.
#
# org.dspace.authenticate.ShibAuthentication
#
# DSpace requires email as user's credential. There are 2 ways of providing
# email to DSpace:
# 1) by explicitly specifying to the user which attribute (header)
# carries the email address.
# 2) by turning on the user-email-using-tomcat=true which means
# the software will try to acquire the user's email from Tomcat
# The first option takes PRECEDENCE when specified. Both options can
# be enabled to allow fallback.
# this option below specifies that the email comes from the mentioned header.
# The value is CASE-Sensitive.
authentication.shib.email-header = MAIL
# optional. Specify the header that carries user's first name
# this is going to be used for creation of new-user
authentication.shib.firstname-header = SHIB-EP-GIVENNAME
# optional. Specify the header that carries user's last name
# this is used for creation of new user
authentication.shib.lastname-header = SHIB-EP-SURNAME
# this option below forces the software to acquire the email from Tomcat.
authentication.shib.email-use-tomcat-remote-user = true
# should we allow new users to be registered automatically
# if the IdP provides sufficient info (and user not exists in DSpace)
authentication.shib.autoregister = true
# this header here specifies which attribute that is responsible for
# providing user's roles to DSpace. When not specified, it is
# defaulted to 'Shib-EP-UnscopedAffiliation'. The value is specified
# in AAP.xml (Shib 1.3.x) or attribute-filter.xml (Shib 2.x). The value
# is CASE-Sensitive. The values provided in this header are separated
# by semi-colon or comma.
# authentication.shib.role-header = Shib-EP-UnscopedAffiliation
# when user is fully authN on IdP but would not like to release
# his/her roles to DSpace (for privacy reason?), what should be
# the default roles be given to such users?
# The values are separated by semi-colon or comma
# authentication.shib.default-roles = Staff, Walk-ins
# The following mappings specify role mapping between IdP and Dspace.
# the left side of the entry is IdP's role (prefixed with
# "authentication.shib.role.") which will be mapped to
# the right entry from DSpace. DSpace's group as indicated on the
# right entry has to EXIST in DSpace, otherwise user will be identified
# as 'anonymous'. Multiple values on the right entry should be separated
# by comma. The values are CASE-Sensitive. Heuristic one-to-one mapping
# will be done when the IdP groups entry are not listed below (i.e.
# if "X" group in IdP is not specified here, then it will be mapped
# to "X" group in DSpace if it exists, otherwise it will be mapped
# to simply 'anonymous')
# Given sufficient demand, future release could support regex for the mapping
# special characters need to be escaped by \
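The role-mapping rules in the comments above can be hard to follow in prose. The sketch below restates them as code: roles arrive separated by semicolons or commas, an explicit mapping wins, an unlisted role falls back to a DSpace group of the same name, and a user with no matching group is treated as 'anonymous'. This is an illustration only and is not taken from ShibAuthentication; the function and parameter names are hypothetical, and running the default roles through the same mapping is an assumption.

```python
import re


def map_roles(header_value, role_map, existing_groups, default_roles=()):
    """Map IdP roles to DSpace group names (illustrative sketch only).

    role_map: IdP role -> list of DSpace group names, standing in for
    the authentication.shib.role.* entries.
    existing_groups: names of groups that exist in DSpace.
    """
    # Roles in the header are separated by semicolons or commas.
    roles = [r.strip() for r in re.split(r"[;,]", header_value or "") if r.strip()]
    if not roles:
        # Assumption: default roles are mapped like released roles.
        roles = list(default_roles)
    groups = []
    for role in roles:
        # Explicit mapping first, else heuristic one-to-one by name.
        for group in role_map.get(role, [role]):
            if group in existing_groups and group not in groups:
                groups.append(group)
    return groups or ["anonymous"]


if __name__ == "__main__":
    mapping = {"Staff": ["Librarians", "Staff"]}
    print(map_roles("Staff;Students", mapping, {"Librarians", "Students"}))
```

Note that the comparison is case-sensitive, as the configuration comments require.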
# When using "resolver" in webui.itemdisplay to render identifiers as resolvable
# links, the base URL is taken from <code>webui.resolver.<n>.baseurl</code>
# where <code>webui.resolver.<n>.urn</code> matches the urn specified in the metadata value.
# The value is appended to the "baseurl" as is, so the baseurl needs to end with a slash in almost every case.
# If no urn is specified in the value it will be displayed as simple text.
#
#webui.resolver.1.urn = doi
#webui.resolver.1.baseurl = http://dx.doi.org/
#webui.resolver.2.urn = hdl
#webui.resolver.2.baseurl = http://hdl.handle.net/
#
# For the doi and hdl urn defaults values are provided, respectively http://dx.doi.org and
# http://hdl.handle.net are used.
#
# If a metadata value with style: "doi", "handle" or "resolver" matches a URL
# already, it is simply rendered as a link with no other manipulation.
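As an illustration of the lookup described in these comments, the sketch below builds a link from a metadata value by matching its urn against a table of base URLs. It is not the JSPUI implementation. The function name is hypothetical, and the "urn:value" form of the metadata value (for example "doi:10.1000/182") is an assumption made for the example.

```python
# Mirrors the commented-out webui.resolver.<n>.urn / .baseurl pairs.
RESOLVERS = {
    "doi": "http://dx.doi.org/",
    "hdl": "http://hdl.handle.net/",
}


def resolve_identifier(value, resolvers=RESOLVERS):
    """Return a link for value, or None if it should stay plain text."""
    if value.startswith(("http://", "https://")):
        return value  # already a URL: rendered as a link unchanged
    urn, sep, rest = value.partition(":")
    if sep and urn in resolvers:
        # The remainder is appended as is, which is why the base URL
        # normally has to end with a slash.
        return resolvers[urn] + rest
    return None  # no recognised urn: displayed as simple text


if __name__ == "__main__":
    print(resolve_identifier("doi:10.1000/182"))
```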
In configuration sections such as webui.itemdisplay.default, values can be changed from (e.g.) metadata.dc.identifier.doi to metadata.doi.dc.identifier.doi.
The whole of the SWORD configuration has changed. The SWORD section must be removed and replaced with:
#---------------------------------------------------------------#
#--------------SWORD SPECIFIC CONFIGURATIONS--------------------#
#---------------------------------------------------------------#
# These configs are only used by the SWORD interface            #
#---------------------------------------------------------------#
# tell the SWORD METS implementation which package ingester to use
# to install deposited content. This should refer to one of the
# classes configured for:
#
# plugin.named.org.dspace.content.packager.PackageIngester
#
# The value of sword.mets-ingester.package-ingester tells the
# system which named plugin for this interface should be used
# to ingest SWORD METS packages
#
# The default is METS
#
sword.mets-ingester.package-ingester = METS
# Define the metadata type EPDCX (EPrints DC XML)
# to be handled by the SWORD crosswalk configuration
#
mets.submission.crosswalk.EPDCX = SWORD
# define the stylesheet which will be used by the self-named
# XSLTIngestionCrosswalk class when asked to load the SWORD
# configuration (as specified above). This will use the
# specified stylesheet to crosswalk the incoming SWAP metadata
# to the DIM format for ingestion
#
crosswalk.submission.SWORD.stylesheet = crosswalks/sword-swap-ingest.xsl
# The metadata field in which to store the updated date for
# items deposited via SWORD.
#
sword.updated.field = dc.date.updated
# The metadata field in which to store the value of the slug
# header if it is supplied
#
sword.slug.field = dc.identifier.slug
# the accept packaging properties, along with their associated
# quality values where appropriate.
#
# Should the server identify the sword version in deposit response?
#
# It is recommended to leave this enabled.
#
8. Restart Tomcat. Restart your servlet container; for Tomcat, use the bin/startup.sh script.
You will find the result in [dspace-source]/dspace/target/dspace-1.5-build.dir/; inside this directory is the compiled binary distribution of DSpace.
4. Stop Tomcat. Take down your servlet container; for Tomcat, use the bin/shutdown.sh script.
5. Update dspace.cfg. Several new parameters need to be added to your [dspace]/config/dspace.cfg. While it is advisable to start with a fresh DSpace 1.5 dspace.cfg configuration file, here is the minimum set of parameters that need to be added to an old DSpace 1.4.2 configuration.
#### Stackable Authentication Methods #####
#
# Stack of authentication methods
# (See org.dspace.authenticate.AuthenticationManager)
# Note when upgrading you should remove the parameter:
# plugin.sequence.org.dspace.eperson.AuthenticationMethod
plugin.sequence.org.dspace.authenticate.AuthenticationMethod = \
 org.dspace.authenticate.PasswordAuthentication
###### JSPUI item style plugin #####
#
# Specify which strategy use for select the style for an item
plugin.single.org.dspace.app.webui.util.StyleSelection = \
 org.dspace.app.webui.util.CollectionStyleSelection
###### Browse Configuration ######
#
# The following configuration will mimic the previous
# behavior exhibited by DSpace 1.4.2. For alternative
# configurations see the manual.
# Browse indexes
webui.browse.index.1 =
webui.browse.index.2 =
webui.browse.index.3 =
webui.browse.index.4 =
# Sorting options
webui.itemlist.sort-option.1 = title:dc.title:title
webui.itemlist.sort-option.2 = dateissued:dc.date.issued:date
webui.itemlist.sort-option.3 = dateaccessioned:dc.date.accessioned:date
# Recent submissions
recent.submissions.count = 5
# Itemmapper browse index
itemmap.author.index = author
# Recent submission processor plugins
plugin.sequence.org.dspace.plugin.CommunityHomeProcessor = \
 org.dspace.app.webui.components.RecentCommunitySubmissions
plugin.sequence.org.dspace.plugin.CollectionHomeProcessor = \
 org.dspace.app.webui.components.RecentCollectionSubmissions
#### Content Inline Disposition Threshold ####
#
# Set the max size of a bitstream that can be served inline
# Use -1 to force all bitstream to be served inline
6. Add xmlui.xconf Manakin configuration. The new Manakin user interface available with DSpace 1.5 requires an extra configuration file, which you need to copy manually to your configuration directory.
cp [dspace-source]/dspace/config/xmlui.xconf [dspace]/config/xmlui.xconf
7. Add item-submission.xml and item-submission.dtd configurable submission configuration. The new configurable submission system, which enables an administrator to re-arrange, add or remove item submission steps, requires these configuration files. You need to copy them manually to your configuration directory.
cp [dspace-source]/dspace/config/item-submission.xml [dspace]/config/item-submission.xml
cp [dspace-source]/dspace/config/item-submission.dtd [dspace]/config/item-submission.dtd
8. Add new input-forms.xml and input-forms.dtd configurable submission configuration. The input-forms.xml now includes a DTD reference to support validation. You'll need to merge your changes into both files and/or copy them into place.
cp [dspace-source]/dspace/config/input-forms.xml [dspace]/config/input-forms.xml
cp [dspace-source]/dspace/config/input-forms.dtd [dspace]/config/input-forms.dtd
9. Add sword-swap-ingest.xsl and xhtml-head-item.properties crosswalk files. New crosswalk files are required to support SWORD and the inclusion of metadata in the head of item pages.
cp [dspace-source]/dspace/config/crosswalks/sword-swap-ingest.xsl [dspace]/config/crosswalks/sword-swap-ingest.xsl
cp [dspace-source]/dspace/config/crosswalks/xhtml-head-item.properties [dspace]/config/crosswalks/xhtml-head-item.properties
Upgrading From 1.4.2 to 1.5
10. Add registration_notify email files. A new configuration option (registration.notify = you@your-email.com) can be set to send a notification email whenever a new user registers to use your DSpace. The email template for this email needs to be copied.
cp [dspace-source]/dspace/config/emails/registration_notify [dspace]/config/emails/registration_notify
11. Update the database. The database schema needs updating. SQL files containing the relevant updates are provided. If you have made any local customizations to the database schema, review these updates and make sure they will work for you.
For PostgreSQL:
psql -U [dspace-user] -f [dspace-source]/dspace/etc/database_schema_14-15.sql [database-name]
For Oracle: [dspace-source]/dspace/etc/oracle/database_schema_142-15.sql contains the commands necessary to upgrade your database schema on Oracle.
12. Apply any customizations. If you have made any local customizations to your DSpace installation, they will need to be migrated over to the new DSpace. Commonly these modifications are made to "JSP" pages located inside the [dspace 1.4.2]/jsp/local directory. These should be moved to [dspace-source]/dspace/modules/jspui/src/main/webapp/ in the new build structure. See Customizing the JSP Pages for more information.
13. Update DSpace. Update the DSpace installed directory with new code and libraries. Inside the [dspace-source]/dspace/target/dspace-1.5-build.dir/ directory run:
cd [dspace-source]/dspace/target/dspace-1.5-build.dir/; ant -Dconfig=[dspace]/config/dspace.cfg update
14. Update the Metadata Registry. New Metadata Registry updates are required to support SWORD.
cp [dspace-source]/dspace/config/registries/sword-metadata.xml [dspace]/config/registries/sword-metadata.xml; [dspace]/bin/dsrun org.dspace.administer.MetadataImporter -f [dspace]/config/registries/sword-metadata.xml
15. Rebuild browse and search indexes. One of the major new features of DSpace 1.5 is the browse system, which requires the indexes to be recreated. To do this, run the following command from your DSpace installed directory:
[dspace]/bin/index-init
16. Update statistics scripts. The statistics scripts have been rewritten for DSpace 1.5. Prior to 1.5 they were written in Perl, but they have been rewritten in Java to avoid having to install Perl. First, make a note of the dates you have specified in your statistics scripts for the statistics to run from. You will find these in [dspace]/bin/stat-initial, as $start_year and $start_month. Note down these values. Copy the new stats scripts:
cp [dspace-source]/dspace/bin/stat* [dspace]/bin/
Then edit your statistics configuration file with the start details. Add the following to [dspace]/config/dstat.cfg:
# the year and month to start creating reports from
# - year as four digits (e.g. 2005)
# - month as a number (e.g. January is 1, December is 12)
start.year = 2005
start.month = 1
Replace '2005' and '1' with the values you noted down. dstat.cfg also used to contain the hostname and service name displayed at the top of the statistics. These values are now taken from dspace.cfg, so you can remove host.name and host.url from dstat.cfg if you wish. The values now used are dspace.hostname and dspace.name from dspace.cfg.
17. Deploy web applications. Copy the web application files from your [dspace]/webapps directory to the webapps subdirectory of your servlet container (e.g. Tomcat):
cp -R [dspace]/webapps/* [tomcat]/webapps/
18. Restart Tomcat. Restart your servlet container; for Tomcat, use the bin/startup.sh script.
Upgrading From 1.4.1 to 1.4.2
3. Note: Licensing conditions for the handle.jar file have changed. As a result, the latest version of the handle.jar file is not included in this distribution. It is recommended you read the new license conditions and decide whether you wish to update your installation's handle.jar. If you decide to update, you should replace the existing handle.jar in [dspace-1.4.x-source]/lib with the new version.
4. Take down Tomcat (or whichever servlet container you're using).
5. A new configuration item webui.html.max-depth-guess has been added to avoid infinite URL spaces. Add the following to the dspace.cfg file:
#### Multi-file HTML document/site settings #####
#
# When serving up composite HTML items, how deep can the request be for us to
# serve up a file with the same name?
#
# e.g. if we receive a request for "foo/bar/index.html"
# and we have a bitstream called just "index.html"
# we will serve up that bitstream for the request if webui.html.max-depth-guess
# is 2 or greater. If webui.html.max-depth-guess is 1 or less, we would not
# serve that bitstream, as the depth of the file is greater.
#
# If webui.html.max-depth-guess is zero, the request filename and path must
# always exactly match the bitstream name. Default value is 3.
#
webui.html.max-depth-guess = 3
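The depth rule in these comments can be restated as a short function. The sketch below is only a reading of the comment text, not the DSpace servlet code; the function name is hypothetical, and it covers only the case the comments describe, where the bitstream name has no directory part of its own.

```python
def may_serve(request_path, bitstream_name, max_depth_guess=3):
    """Decide whether a bitstream may be served for a request path.

    Illustrative sketch of webui.html.max-depth-guess as documented:
    an exact match is always served; otherwise the bare filename must
    match and the request must not be nested deeper than the limit.
    """
    if request_path == bitstream_name:
        return True
    parts = request_path.split("/")
    filename = parts[-1]
    depth = len(parts) - 1  # number of directories in the request
    return filename == bitstream_name and depth <= max_depth_guess


if __name__ == "__main__":
    print(may_serve("foo/bar/index.html", "index.html", 2))  # True
    print(may_serve("foo/bar/index.html", "index.html", 1))  # False
```

With a limit of zero, only the exact-match branch can succeed, which matches the last paragraph of the comment.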
If webui.html.max-depth-guess is not present in dspace.cfg, the default value is used. If you archive entire web sites or deeply nested HTML documents, it is advisable to change the default to a higher value more suitable for these types of materials.
6. Your 'localized' JSPs (those in jsp/local) now need to be maintained in the source directory. If you have locally modified JSPs in your [dspace]/jsp/local directory, you will need to merge the changes in the new 1.4.x versions into your locally modified ones. You can use the diff command to compare your JSPs against the 1.4.x versions to do this. You can also check against the DSpace CVS.
7. In [dspace-1.4.x-source] run:
ant -Dconfig=[dspace]/config/dspace.cfg update
8. Copy the .war Web application files in [dspace-1.4.x-source]/build to the webapps sub-directory of your servlet container (e.g. Tomcat). e.g.:
cp [dspace-1.4.x-source]/build/*.war [tomcat]/webapps
If you're using Tomcat, you need to delete the directories corresponding to the old .war files. For example, if dspace.war is installed in [tomcat]/webapps/dspace.war, you should delete the [tomcat]/webapps/dspace directory. Otherwise, Tomcat will continue to use the old code in that directory.
9. Restart Tomcat.
Upgrading From 1.3.2 to 1.4.x
4. Note: Licensing conditions for the handle.jar file have changed. As a result, the latest version of the handle.jar file is not included in this distribution. It is recommended you read the new license conditions and decide whether you wish to update your installation's handle.jar. If you decide to update, you should replace the existing handle.jar in [dspace-1.4.x-source]/lib with the new version.
5. Take down Tomcat (or whichever servlet container you're using).
6. Your DSpace configuration will need some updating: In dspace.cfg, paste in the following lines for the new stackable authentication feature, the new method for managing Media Filters, and the Checksum Checker.
#### Stackable Authentication Methods #####
#### Media Filter plugins (through PluginManager) ####
plugin.sequence.org.dspace.app.mediafilter.MediaFilter = \
 org.dspace.app.mediafilter.PDFFilter, org.dspace.app.mediafilter.HTMLFilter, \
 org.dspace.app.mediafilter.WordFilter, org.dspace.app.mediafilter.JPEGFilter
# to enable branded preview: remove last line above, and uncomment 2 lines below
# org.dspace.app.mediafilter.WordFilter, org.dspace.app.mediafilter.JPEGFilter, \
# org.dspace.app.mediafilter.BrandedPreviewJPEGFilter
filter.org.dspace.app.mediafilter.PDFFilter.inputFormats = Adobe PDF
filter.org.dspace.app.mediafilter.HTMLFilter.inputFormats = HTML, Text
filter.org.dspace.app.mediafilter.WordFilter.inputFormats = Microsoft Word
filter.org.dspace.app.mediafilter.JPEGFilter.inputFormats = GIF, JPEG, image/png
filter.org.dspace.app.mediafilter.BrandedPreviewJPEGFilter.inputFormats = GIF, JPEG, image/png
#### Settings for Item Preview ####
webui.preview.enabled = false
# max dimensions of the preview image
webui.preview.maxwidth = 600
webui.preview.maxheight = 600
# the brand text
webui.preview.brand = My Institution Name
# an abbreviated form of the above text, this will be used
# when the preview image cannot fit the normal text
webui.preview.brand.abbrev = MyOrg
# the height of the brand
webui.preview.brand.height = 20
# font settings for the brand text
webui.preview.brand.font = SansSerif
webui.preview.brand.fontpoint = 12
#webui.preview.dc = rights
#### Checksum Checker Settings ####
# Default dispatcher in case none specified
plugin.single.org.dspace.checker.BitstreamDispatcher=org.dspace.checker.SimpleDispatcher
# Standard interface implementations. You shouldn't need to tinker with these.
If you have customized advanced search fields (search.index.n fields), note that you now need to include the schema in the values. Dublin Core is specified as dc. So for example, if in 1.3.2 you had:
search.index.1 = title:title.alternative
then with the schema included this becomes:
search.index.1 = title:dc.title.alternative
If you use LDAP or X509 authentication, you'll need to add org.dspace.eperson.LDAPAuthentication or org.dspace.eperson.X509Authentication respectively. See also configuring custom authentication code.
If you have custom Media Filters, note that these are now configured through dspace.cfg (instead of mediafilter.cfg, which is obsolete).
Also, take a look through the default dspace.cfg file supplied with DSpace 1.4.x, as this contains configuration options for various new features you might like to use. In general, these new features default to 'off' and you'll need to add configuration properties as described in the default 1.4.x dspace.cfg to activate them.
7. Your 'localized' JSPs (those in jsp/local) now need to be maintained in the source directory. If you have locally modified JSPs in your [dspace]/jsp/local directory, you will need to merge the changes in the new 1.4.x versions into your locally modified ones. You can use the diff command to compare your JSPs against the 1.4.x versions to do this. You can also check against the DSpace CVS.
8. In [dspace-1.4.x-source] run:
ant -Dconfig=[dspace]/config/dspace.cfg update
9. The database schema needs updating. An SQL file containing the relevant updates is provided. If you've modified the schema locally, you may need to check over this and make alterations.
For PostgreSQL: [dspace-1.4.x-source]/etc/database_schema_13-14.sql contains the SQL commands to achieve this for PostgreSQL. To apply the changes, go to the source directory, and run:
psql -f etc/database_schema_13-14.sql [DSpace database name] -h localhost
For Oracle: [dspace-1.4.x-source]/etc/oracle/database_schema_13-14.sql should be run on the DSpace database to update the schema.
10. Rebuild the search indexes:
[dspace]/bin/index-all
11. Copy the .war Web application files in [dspace-1.4-source]/build to the webapps sub-directory of your servlet container (e.g. Tomcat). e.g.:
cp [dspace-1.4-source]/build/*.war [tomcat]/webapps
If you're using Tomcat, you need to delete the directories corresponding to the old .war files. For example, if dspace.war is installed in [tomcat]/webapps/dspace.war, you should delete the [tomcat]/webapps/dspace directory. Otherwise, Tomcat will continue to use the old code in that directory.
[dspace-1.3.2-source]/lib
3. Take down Tomcat (or whichever servlet container you're using).
4. Your 'localized' JSPs (those in jsp/local) now need to be maintained in the source directory. If you have locally modified JSPs in your [dspace]/jsp/local directory, you will need to merge the changes in the new 1.3.2 versions into your locally modified ones. You can use the diff command to compare the 1.3.1 and 1.3.2 versions to do this.
5. In [dspace-1.3.2-source] run:
ant -Dconfig=[dspace]/config/dspace.cfg update
6. Copy the .war Web application files in [dspace-1.3.2-source]/build to the webapps sub-directory of your servlet container (e.g. Tomcat). e.g.:
cp [dspace-1.3.2-source]/build/*.war [tomcat]/webapps
If you're using Tomcat, you need to delete the directories corresponding to the old .war files. For example, if dspace.war is installed in [tomcat]/webapps/dspace.war, you should delete the [tomcat]/webapps/dspace directory. Otherwise, Tomcat will continue to use the old code in that directory.
7. Restart Tomcat.
Upgrading From 1.2.1 to 1.2.2
3. Copy the PostgreSQL driver JAR to the source tree. For example:
cd [dspace]/lib
cp postgresql.jar [dspace-1.2.2-source]/lib
4. Take down Tomcat (or whichever servlet container you're using).
5. Remove the old version of xerces.jar from your installation, so it is not inadvertently used later:
rm [dspace]/lib/xerces.jar
6. Install the new config files by moving dstat.cfg and dstat.map from [dspace-1.3.x-source]/config/ to [dspace]/config
7. You need to add new parameters to your [dspace]/dspace.cfg:
###### Statistical Report Configuration Settings ######
# should the stats be publicly available? should be set to false if you only
# want administrators to access the stats, or you do not intend to generate
# any
report.public = false
# directory where live reports are stored
report.dir = /dspace/reports/
8. Build and install the updated DSpace 1.3.x code. Go to the [dspace-1.3.x-source] directory, and run:
ant -Dconfig=[dspace]/config/dspace.cfg update
9. You'll need to make some changes to the database schema in your PostgreSQL database. [dspace-1.3.x-source]/etc/database_schema_12-13.sql contains the SQL commands to achieve this. If you've modified the schema locally, you may need to check over this and make alterations. To apply the changes, go to the source directory, and run:
psql -f etc/database_schema_12-13.sql [DSpace database name] -h localhost
10. Customize the statistics generation as per the instructions in System Statistical Reports.
11. Initialize the statistics using:
[dspace]/bin/stat-initial
[dspace]/bin/stat-general
[dspace]/bin/stat-report-initial
[dspace]/bin/stat-report-general
12. Rebuild the search indexes:
[dspace]/bin/index-all
13. Copy the .war Web application files in [dspace-1.3.x-source]/build to the webapps sub-directory of your servlet container (e.g. Tomcat). e.g.:
cp [dspace-1.3.x-source]/build/*.war [tomcat]/webapps
14. Restart Tomcat.
[dspace-1.2.2-source]/lib
3. Take down Tomcat (or whichever servlet container you're using).
4. Your 'localized' JSPs (those in jsp/local) now need to be maintained in the source directory. If you have locally modified JSPs in your [dspace]/jsp/local directory, you might like to merge the changes in the new 1.2.2 versions into your locally modified ones. You can use the diff command to compare the 1.2.1 and 1.2.2 versions to do this. Also see the version history for a list of modified JSPs.
5. You need to add a new parameter to your [dspace]/dspace.cfg for configurable fulltext indexing:
##### Fulltext Indexing settings #####
# Maximum number of terms indexed for a single field in Lucene.
# Default is 10,000 words - often not enough for full-text indexing.
# If you change this, you'll need to re-index for the change
# to take effect on previously added items.
# -1 = unlimited (Integer.MAX_VALUE)
search.maxfieldlength = 10000
6. In [dspace-1.2.2-source] run:
ant -Dconfig=[dspace]/config/dspace.cfg update
7. Copy the .war Web application files in [dspace-1.2.2-source]/build to the webapps sub-directory of your servlet container (e.g. Tomcat). e.g.:
cp [dspace-1.2.2-source]/build/*.war [tomcat]/webapps
If you're using Tomcat, you need to delete the directories corresponding to the old .war files. For example, if dspace.war is installed in [tomcat]/webapps/dspace.war, you should delete the [tomcat]/webapps/dspace directory. Otherwise, Tomcat will continue to use the old code in that directory.
8. To finalize the install of the new configurable submission forms you need to copy the file [dspace-1.2.2-source]/config/input-forms.xml into [dspace]/config.
9. Restart Tomcat.
[dspace-1.2.1-source]/lib
Upgrading From 1.2 to 1.2.1
3. Take down Tomcat (or whichever servlet container you're using).
4. Your 'localized' JSPs (those in jsp/local) now need to be maintained in the source directory. If you have locally modified JSPs in your [dspace]/jsp/local directory, you might like to merge the changes in the new 1.2.1 versions into your locally modified ones. You can use the diff command to compare the 1.2 and 1.2.1 versions to do this. Also see the version history for a list of modified JSPs.
5. You need to add a few new parameters to your [dspace]/dspace.cfg for browse/search and item thumbnails display, and for configurable DC metadata fields to be indexed.
# whether to display thumbnails on browse and search results pages (1.2+)
webui.browse.thumbnail.show = false
# max dimensions of the browse/search thumbs. Must be <= thumbnail.maxwidth
# and thumbnail.maxheight. Only need to be set if required to be smaller than
# dimension of thumbnails generated by mediafilter (1.2+)
#webui.browse.thumbnail.maxheight = 80
#webui.browse.thumbnail.maxwidth = 80
# whether to display the thumb against each bitstream (1.2+)
webui.item.thumbnail.show = true
# where should clicking on a thumbnail from browse/search take the user
# Only values currently supported are "item" and "bitstream"
#webui.browse.thumbnail.linkbehaviour = item
##### Fields to Index for Search #####
# DC metadata elements.qualifiers to be indexed for search
# format: - search.index.[number] = [search field]:element.qualifier
# - * used as wildcard
### changing these will change your search results, ###
### but will NOT automatically change your search displays ###
search.index.1 = author:contributor.*
search.index.2 = author:creator.*
search.index.3 = title:title.*
search.index.4 = keyword:subject.*
search.index.5 = abstract:description.abstract
search.index.6 = author:description.statementofresponsibility
search.index.7 = series:relation.ispartofseries
search.index.8 = abstract:description.tableofcontents
search.index.9 = mime:format.mimetype
search.index.10 = sponsor:description.sponsorship
search.index.11 = id:identifier.*
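To make the search.index.[number] format concrete, here is a minimal Python sketch of how a value such as author:contributor.* could be read. It is not DSpace's indexing code. The helper names are hypothetical, and treating * as a shell-style wildcard over the qualifier is an assumption of this sketch; how DSpace itself handles unqualified elements is not covered here.

```python
from fnmatch import fnmatchcase


def parse_search_index(value):
    """Split '[search field]:element.qualifier' into (field, pattern)."""
    field, _, pattern = value.partition(":")
    return field.strip(), pattern.strip()


def fields_for(metadata_key, index_values):
    """Return the search fields whose pattern matches e.g. 'contributor.author'.

    Assumption: '*' behaves like a shell-style wildcard.
    """
    matches = []
    for value in index_values:
        field, pattern = parse_search_index(value)
        if fnmatchcase(metadata_key, pattern) and field not in matches:
            matches.append(field)
    return matches


# A few of the values listed above.
INDEX = [
    "author:contributor.*",
    "author:creator.*",
    "title:title.*",
    "abstract:description.abstract",
]

print(fields_for("contributor.author", INDEX))    # ['author']
print(fields_for("description.abstract", INDEX))  # ['abstract']
```

Several patterns can feed the same search field, which is why author appears only once in the result even though two entries map to it.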
6. In [dspace-1.2.1-source] run:
ant -Dconfig=[dspace]/config/dspace.cfg update
7. Copy the .war Web application files in [dspace-1.2.1-source]/build to the webapps sub-directory of your servlet container (e.g. Tomcat). e.g.:
cp [dspace-1.2.1-source]/build/*.war [tomcat]/webapps
If you're using Tomcat, you need to delete the directories corresponding to the old .war files. For example, if dspace.war is installed in [tomcat]/webapps/dspace.war, you should delete the [tomcat]/webapps/dspace directory. Otherwise, Tomcat will continue to use the old code in that directory.
8. Restart Tomcat.
Upgrading From 1.1 (or 1.1.1) to 1.2
4. Stop Tomcat (or other servlet container.)
5. It's a good idea to upgrade all of the various third-party tools that DSpace uses to their latest versions:
Java (note that now version 1.4.0 or later is required)
Tomcat (Any version after 4.0 will work; symbolic links are no longer an issue)
PostgreSQL (don't forget to build/download an updated JDBC driver .jar file! Also, back up the database first.)
Ant
6. You need to add the following new parameters to your [dspace]/dspace.cfg:
##### Media Filter settings #####
# maximum width and height of generated thumbnails
thumbnail.maxwidth = 80
thumbnail.maxheight = 80
There are one or two other, optional extra parameters (for controlling the pool of database connections). See the version history for details. If you leave them out, defaults will be used.
Also, to avoid future confusion, you might like to remove the following property, which is no longer required:
config.template.oai-web.xml = [dspace]/oai/WEB-INF/web.xml
7. The layout of the installation directory (i.e. the structure of the contents of [dspace]) has changed somewhat since 1.1.1. First up, your 'localized' JSPs (those in jsp/local) now need to be maintained in the source directory. So make a copy of them now!
Once you've done that, you can remove [dspace]/jsp and [dspace]/oai; these are no longer used. (.war Web application archive files are used instead.)
Also, if you're using the same version of Tomcat as before, you need to remove the lines from Tomcat's conf/server.xml file that enable symbolic links for DSpace. These are the <Context> elements you added to get DSpace 1.1.1 working, looking something like this:
<Context path="/dspace" docBase="dspace" debug="0" reloadable="true" crossContext="true">
  <Resources className="org.apache.naming.resources.FileDirContext" allowLinking="true" />
</Context>
Be sure to remove the <Context> elements for both the Web UI and the OAI Web applications.
8. Build and install the updated DSpace 1.2 code. Go to the DSpace 1.2 source directory, and run:
ant -Dconfig=[dspace]/config/dspace.cfg update
10. You'll need to make some changes to the database schema in your PostgreSQL database. [dspace-1.2-source]/etc/database_schema_11-12.sql contains the SQL commands to achieve this. If you've modified the schema locally, you may need to check over this and make alterations. To apply the changes, go to the source directory, and run:
psql -f etc/database_schema_11-12.sql [DSpace database name] -h localhost
11. A tool supplied with the DSpace 1.2 codebase will then update the actual data in the relational database. Run it using:
[dspace]/bin/dsrun org.dspace.administer.Upgrade11To12
13. Delete the existing symlinks from your servlet container's (e.g. Tomcat's) webapp sub-directory. Copy the .war Web application files in [dspace-1.2-source]/build to the webapps sub-directory of your servlet container (e.g. Tomcat). e.g.:
cp [dspace-1.2-source]/build/*.war [tomcat]/webapps
14. Restart Tomcat.
15. To get image thumbnails generated and full-text extracted for indexing automatically, you need to set up a 'cron' job, for example one like this:
You might also wish to run it now to generate thumbnails and index full text for the content already in your system.
16. Note 1: This update process has effectively 'touched' all of your items. Although the dates in the Dublin Core metadata won't have changed (accession date and so forth), the 'last modified' date in the database for each will have been changed.
This means the e-mail subscription tool may be confused, thinking that all items in the archive have been deposited that day, and could thus send a rather long email to lots of subscribers. So, it is recommended that you turn off the e-mail subscription feature for the next day, by commenting out the relevant line in DSpace's cron job, and then re-activating it the next day.
Say you performed the update on 08-June-2004 (UTC), and your e-mail subscription cron job runs at 4am (UTC). When the subscription tool runs at 4am on 09-June-2004, it will find that everything in the system has a modification date of 08-June-2004, and accordingly send out huge emails. So, immediately after the update, you would edit DSpace's 'crontab' and comment out the /dspace/bin/subs-daily line. Then, after 4am on 09-June-2004, you'd un-comment it so that things proceed normally.
Of course this means that any real new deposits on 08-June-2004 won't get e-mailed; however, if you're updating the system it's likely to be down for some time, so this shouldn't be a big problem.
17. Note 2: After consultation with the OAI community, various OAI-PMH changes have occurred:
The OAI-PMH identifiers have changed (they're now of the form oai:hostname:handle as opposed to just Handles).
The set structure has changed, due to the new sub-communities feature.
The default base URL has changed.
As noted in note 1, every item has been 'touched' and will need re-harvesting.
The above means that, if already registered and harvested, you will need to re-register your repository, effectively as a 'new' OAI-PMH data provider. You should also consider posting an announcement to the OAI implementers e-mail list so that harvesters know to update their systems.
Also note that your site may, over the next few days, take quite a big hit from OAI-PMH harvesters. The resumption token support should alleviate this a little, but you might want to temporarily increase the database connection pool parameters in [dspace]/config/dspace.cfg. See the dspace.cfg distributed with the source code to see what these parameters are and how to use them. (You need to stop and restart Tomcat after changing them.)
I realize this is not ideal; for discussion as to the reasons behind this please see relevant posts to the OAI community: post one, post two, as well as this post to the dspace-tech mailing list.
If you really can't live with updating the base URL like this, you can fairly easily have things proceed more-or-less as they are, by doing the following:
Change the value of OAI_ID_PREFIX at the top of the org.dspace.app.oai.DSpaceOAICatalog class to hdl:
Change the servlet mapping for the OAIHandler servlet back to / (from /request)
Rebuild and deploy oai.war
However, note that in this case, all the records will be re-harvested by harvesters anyway, so you still need to brace for the associated DB activity; also note that the set spec changes may not be picked up by some harvesters. It's recommended you read the above-linked mailing list posts to understand why the change was made.
Now, you should be finished!
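As an illustration of the new identifier form mentioned in Note 2 (oai:hostname:handle), the following Python sketch builds and splits such identifiers. It is not taken from the DSpace code. The function names are hypothetical, and the host name and Handle are placeholder values chosen for the example.

```python
def oai_identifier(hostname, handle):
    """Build an identifier of the form oai:hostname:handle."""
    return f"oai:{hostname}:{handle}"


def split_oai_identifier(identifier):
    """Split oai:hostname:handle back into (hostname, handle)."""
    scheme, hostname, handle = identifier.split(":", 2)
    if scheme != "oai":
        raise ValueError(f"not an oai: identifier: {identifier!r}")
    return hostname, handle


# Placeholder host name and Handle, for illustration only.
example = oai_identifier("dspace.myu.edu", "123456789/42")
print(example)  # oai:dspace.myu.edu:123456789/42
```

Because the identifier string itself changes, a harvester sees every record as new, which is consistent with the re-registration advice above.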
Upgrading From 1.0.1 to 1.1
In the notes below [dspace] refers to the install directory for your existing DSpace installation, and [dspace-1.1.1-source] to the source directory for DSpace 1.1.1. Whenever you see these path references, be sure to replace them with the actual path names on your local system.
1. Take down Tomcat.
2. It would be a good idea to update any of the third-party tools used by DSpace at this point (e.g. PostgreSQL), following the instructions provided with the relevant tools.
3. In [dspace-1.1.1-source] run:
ant -Dconfig=[dspace]/config/dspace.cfg update
4. If you have locally modified versions of the following JSPs in your [dspace]/jsp/local directory, you might like to merge the changes in the new 1.1.1 versions into your locally modified ones. You can use the diff command to compare the 1.1 and 1.1.1 versions to do this. The changes are quite minor.
collection-home.jsp
admin/authorize-collection-edit.jsp
admin/authorize-community-edit.jsp
admin/authorize-item-edit.jsp
admin/eperson-edit.jsp
5. Restart Tomcat.
4. Fix your JSPs for Unicode. If you've modified the site 'skin' (jsp/local/layout/header-default.jsp) you'll need to add the Unicode header, i.e.:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
to the <HEAD> element. If you have any locally-edited JSPs, you need to add this page directive to the top of all of them:
<%@ page contentType="text/html;charset=UTF-8" %>
(If you haven't modified any JSPs, you don't have to do anything.) 5. Copy the required Java libraries that we couldn't include in the bundle to the source tree. For example:
cd [dspace]/lib
cp *.policy activation.jar servlet.jar mail.jar [dspace-1.1-source]/lib
6. Compile up the new DSpace code, replacing [dspace]/config/dspace.cfg with the path to your current, LIVE configuration. (The second line, touch `find .`, is a precaution, which ensures that the new code has a current datestamp and will overwrite the old code. Note that those are back quotes.)
cd [dspace-1.1-source]
touch `find .`
ant
ant -Dconfig=[dspace]/config/dspace.cfg update
7. Update the database tables using the upgrader tool, which sets up the new last_modified date in the item table:
Run [dspace]/bin/dsrun org.dspace.administer.Upgrade101To11
9. Fix the OAICat properties file. Edit [dspace]/config/templates/oaicat.properties. Change the line that says
Identify.deletedRecord=yes
To:
Identify.deletedRecord=persistent
This is needed to fix the OAI-PMH 'Identify' verb response. Then run [dspace]/bin/install-configs.
10. Re-run the indexing to index abstracts and fill out the renamed database views:
[dspace]/bin/index-all
11. Restart Tomcat. Tomcat should be run with the following environment variable set, to ensure that Unicode is handled properly. Also, the default JVM memory heap sizes are rather small. Adjust -Xmx512M (512 MB maximum heap size) and -Xms64M (64 MB initial heap size) to suit your hardware.
JAVA_OPTS="-Xmx512M -Xms64M -Dfile.encoding=UTF-8"
6. Configuration
There are a number of ways in which DSpace may be configured and/or customized. This chapter of the documentation will discuss the configuration of the software and will also reference customizations that may be performed in the chapter following.
For ease of use, the Configuration documentation is broken into several parts:
General Configuration - addresses general conventions used with configuring not only the dspace.cfg file, but other configuration files which use similar conventions.
The dspace.cfg Configuration Properties File - specifies the basic dspace.cfg file settings.
Optional or Advanced Configuration Settings - contains other, more advanced settings that are optional in the dspace.cfg configuration file.
The full table of contents follows:
Property values can include other, previously defined values, by enclosing the property name in ${...}. For example, if your dspace.cfg contains:
dspace.dir = /dspace dspace.history = ${dspace.dir}/history
Then the value of dspace.history property is expanded to be /dspace/history. This method is especially useful for handling commonly used file paths.
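The substitution rule can be illustrated with a short Python sketch. This is not DSpace's own configuration loader. The function name expand_properties is hypothetical, and leaving unknown ${...} references untouched is an assumption of the sketch rather than documented behaviour.

```python
import re

# Matches ${property.name} references inside a value.
_REFERENCE = re.compile(r"\$\{([^}]+)\}")


def expand_properties(lines):
    """Parse 'key = value' lines, expanding ${name} from previously defined keys."""
    props = {}
    for raw in lines:
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a 'key = value' line
        # Assumption: references to undefined properties are left as written.
        expanded = _REFERENCE.sub(
            lambda match: props.get(match.group(1), match.group(0)),
            value.strip(),
        )
        props[key.strip()] = expanded
    return props


config = expand_properties([
    "dspace.dir = /dspace",
    "dspace.history = ${dspace.dir}/history",
])
print(config["dspace.history"])  # /dspace/history
```

Only previously defined values are available for expansion, so the order of lines in the file matters in this sketch, matching the wording "previously defined values" above.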
This will copy the source dspace.cfg (along with other configuration files) into the runtime ([dspace]/config) directory.
Remember that once you have finished editing your configuration file(s) and wish to apply the changes, you will need to:
Run ant -Dconfig=[dspace]/config/dspace.cfg update if you are updating your dspace.cfg file and wish to see the changes appear. Follow the usual sequence with copying your webapps.
If you edit dspace.cfg in [dspace-source]/dspace/config/, you should then run 'ant init_configs' in the directory [dspace-source]/dspace/target/dspace-1.5.2-build.dir so that any changes you may have made are reflected in the configuration files of other applications, for example Apache. You may then need to restart those applications, depending on what you changed.
Database Settings
db.name db.url db.driver db.username db.password
4.2.3 or 6.3.3
6.3.3
Email Settings
mail.server mail.server.username mail.server.password mail.server.port mail.from.address feedback.recipient mail.admin alert.recipient registration.notify mail.charset mail.allowed.referrers mail.extraproperties mail.server.disabled
6.3.4
File Storage
assetstore.dir [assetstore.dir.1 assetstore.dir.2 assetstore.incoming]
6.3.5
6.3.6
Search Settings
search.dir search.max-clauses search.analyzer search.operator search.maxfieldlength search.index.n search.index.1
6.3.8
Handle Settings
handle.prefix handle.dir
6.3.9
6.3.10
Ref. Sect.
6.3.11.1
6.3.11.2
6.3.11.3
6.3.11.5 6.3.11.6
6.3.12
Proxy Settings
http.proxy.host http.proxy.port
6.3.13
Custom settings for PDFFilter 6.3.14 pdffilter.largepdfs pdffilter.skiponmemoryexception Crosswalk and Packager Plugin Settings (MODS, QDC, XSLT, etc.)
crosswalk.mods.properties.MODS crosswalk.mods.properties.mods crosswalk.submission.MODS.stylesheet crosswalk.qdc.namespace.QDC.dc crosswalk.qdc.namespace.QDC.dcterms crosswalk.qdc.schemaLocation.QDC crosswalk.qdc.properties.QDC mets.submission.crosswalk.DC mets.submission.preserveManifest mets.submission.useCollectionTemplate
6.3.15.1
6.3.15
6.3.15.4 plugin.named.org.dspace.content.crosswalk.IngestionCrosswalk plugin.selfnamed.org.dspace.content.crosswalk.IngestionCrosswalk plugin.named.org.dspace.content.crosswalk.DisseminationCrosswalk plugin.selfnamed.org.dspace.content.crosswalk.DisseminationCrosswalk 6.3.15.5 plugin.named.org.dspace.content.packager.PackageDisseminator plugin.named.org.dspace.content.packager.PackageIngester Event System Configuration
event.dispatcher.default.class
6.3.16
Embargo Settings 6.3.17 embargo.field.terms embargo.field.lift embargo.field.open plugin.single.org.dspace.embargo.EmbargoSetter plugin.single.org.dspace.embargo.EmbargoLifter Checksum Checker 6.3.18 plugin.single.org.dspace.checker.BitstreamDispatcher checker.retention.default checker.retention.CHECKSUM-MATCH Item Export and Download Settings
org.dspace.app.itemexport.work.dir org.dspace.app.itemexport.download.dir org.dspace.app.itemexport.life.span.hours org.dspace.app.itemexport.max.size
6.3.19
6.3.20
6.3.21
6.3.22
Submission Process
webui.submit.blocktheses webui.submit.upload.required webui.submit.enable-cc webui.submit.cc-jurisdiction
6.3.23 6.3.24
6.3.25
6.3.25
6.3.25
webui.browse.metadata.case-insensitive 6.3.26.3 6.3.26.4 webui.browse.value_columns.max webui.browse.sort_columns.max webui.browse.value_columns.omission_mark plugin.named.org.dspace.sort.OrderFormatDelegate Multiple Metadata Value Display
webui.browse.author-field webui.browse.author-limit
6.3.27
Other Browse Contexts webui.browse.link.n Recent Submission 6.3.29 recent.submission.sort-option recent.submissions.count plugin.sequence.org.dspace.plugin.CommunityHomeProcessor plugin.sequence.org.dspace.plugin.CollectionHomeProcessor Submission License Substitution Variables plugin.named.org.dspace.content.license.LicenseArgumentFormatter 6.3.30 Syndication Feed (RSS) Settings
webui.feed.enable webui.feed.items webui.feed.cache.size webui.cache.age webui.feed.formats webui.feed.localresolve webui.feed.item.title webui.feed.item.date webui.feed.item.description webui.feed.item.author webui.feed.item.dc.creator webui.feed.item.dc.date webui.feed.item.dc.description
6.3.28
6.3.31
OpenSearch Settings
websvc.opensearch.enable websvc.opensearch.uicontext websvc.opensearch.svccontext websvc.opensearch.autolink websvc.opensearch.validity websvc.opensearch.shortname websvc.opensearch.longname websvc.opensearch.description websvc.opensearch.faviconurl websvc.opensearch.samplequery websvc.opensearch.tags websvc.opensearch.formats
6.3.32
6.3.33
6.3.34
Sitemap Settings
sitemap.dir sitemap.engineurls
6.3.35
Authority Control Settings 6.3.36 plugin.named.org.dspace.content.authority.ChoiceAuthority plugin.selfnamed.org.dspace.content.authority.ChoiceAuthority lcname.url sherpa.romeo.url authority.minconfidence xmlui.lookup.select.size JSPUI Upload File Settings
upload.temp.dir upload.max
6.3.37
JSP Web Interface Settings 6.3.38 webui.licence_bundle.show webui.itemdisplay.default webui.resolver.1.urn webui.resolver.1.baseurl webui.resolver.2.urn webui.resolver.2.baseurl plugin.single.org.dspace.app.webui.util.StyleSelection webui.itemdisplay.thesis.collections webui.itemdisplay.metadata-style webui.itemlist.columns webui.itemlist.widths webui.itemlist.browse.<index name>.sort.<sort name>.columns webui.itemlist.sort.<sort name>.columns webui.itemlist.browse.<browse name>.columns webui.itemlist.<sort or index name>.columns webui.itemlist.dateaccessioned.columns webui.itemlist.dateaccessioned.widths
JSPUI i18n Locales / Languages default.locale JSPUI Additional Configuration for Item Mapper itemmap.author.index JSPUI MyDSpace Display of Group Membership webui.mydspace.showgroupmembership JSPUI SFX Server Setting sfx.server.url JSPUI Item Recommendation Settings
webui.suggest.enable webui.suggest.loggedinusers.only
JSPUI Controlled Vocabulary Settings webui.controlledvocabulary.enable JSPUI Session Invalidation webui.session.invalidate XMLUI Settings (Manakin)
xmlui.supported.locales xmlui.force.ssl xmlui.user.registration xmlui.user.editmetadata xmlui.user.assumelogin xmlui.user.logindirect xmlui.theme.allowoverrides xmlui.bundle.upload xmlui.community-list.render.full xmlui.community-list.cache xmlui.bitstream.mods xmlui.bitstream.mets xmlui.google.analytics.key xmlui.controlpanel.activity.max xmlui.controlpanel.activity.ipheader
5.2.47
6.4.6
6.3.48
SOLR Statistics Configurations 6.3.49 solr.log.server solr.dbfile solr.resolver.timeout statistics.item.authorization.admin solr.statistics.logBots solr.statistics.query.filter.spiderIP solr.statistics.query.filter.isBot solr.spiderips.urls
The dspace.cfg Configuration Properties File
port number etc., but NOT trailing slash. Change to /xmlui if you wish to use the xmlui (Manakin) as the default, or remove "/jspui" and set webapp of your choice as the "ROOT" webapp in the servlet engine.
Property: dspace.oai.url
Example Value: dspace.oai.url = ${dspace.baseUrl}/oai
Informational Note: The base URL of the OAI webapp (do not include /request).
Property: dspace.name
Example Value: dspace.name = DSpace at My University
Informational Note: Short and sweet site name, used throughout Web UI, e-mails and elsewhere (such as OAI protocol)
that is used for DSpace by uncommenting the entry. This property is optional.
Property: db.maxconnections
Example Value: db.maxconnections = 30
Informational Note: Maximum number of Database connections in the connection pool
Property: db.maxwait
Example Value: db.maxwait = 5000
Informational Note: Maximum time to wait before giving up if all connections in pool are busy (in milliseconds).
Property: db.maxidle
Example Value: db.maxidle = -1
Informational Note: Maximum number of idle connections in pool. (-1 = unlimited)
Property: db.statementpool
Example Value: db.statementpool = true
Informational Note: Determines if prepared statements should be cached. (Default is set to true)
Property: db.poolname
Example Value: db.poolname = dspacepool
Informational Note: Specify a name for the connection pool. This is useful if you have multiple applications sharing Tomcat's database connection pool. If nothing is specified, it will default to 'dspacepool'
Informational Note: The port on which your SMTP mail server can be reached. By default, port 25 is used. Change this setting if your SMTP mailserver is running on another port. This property is optional.
Property: mail.from.address
Example Value: mail.from.address = dspace-noreply@myu.edu
Informational Note: The "From" address for email. Change the 'myu.edu' to the site's host name.
Property: feedback.recipient
Example Value: feedback.recipient = dspace-help@myu.edu
Informational Note: When a user clicks on the feedback link/feature, the information will be sent to the email address of choice. This configuration is currently limited to only one recipient.
Property: mail.admin
Example Value: mail.admin = dspace-help@myu.edu
Informational Note: Email address of the general site administrator (Webmaster)
Property: alert.recipient
Example Value: alert.recipient = john.doe@myu.edu
Informational Note: Enter the recipient for server errors and alerts. This property is optional.
Property: registration.notify
Example Value: registration.notify = mike.smith@myu.edu
Informational Note: Enter the recipient that will be notified when a new user registers on DSpace. This property is optional.
Property: mail.charset
Example Value: mail.charset = UTF-8
Informational Note: Set the default mail character set. This may be overridden by providing a line inside the email template 'charset: <encoding>', otherwise this default is used.
Property: mail.allowed.referrers
Example Value: mail.allowed.referrers = localhost
Informational Note: A comma separated list of hostnames that are allowed to refer browsers to email forms. Default behavior is to accept referrals only from dspace.hostname. This property is optional.
Property: mail.extraproperties
Example Value: mail.extraproperties = mail.smtp.socketFactory.port=465, \
mail.smtp.socketFactory.class=javax.net.ssl.SSLSocketFactory, \
mail.smtp.socketFactory.fallback=false
Informational Note: Use this if you need to pass extra settings to the Java mail library. Comma separated, equals sign between the key and the value. This property is optional.
Property: mail.server.disabled
Example Value: mail.server.disabled = false
Informational Note: An option is added to disable the mailserver. By default, this property is set to 'false'. By setting the value to 'true', DSpace will not send out emails. It will instead log the subject of the email which should have been sent. This is especially useful for development and test environments where production data is used when testing functionality. This property is optional.
Property: default.language
Example Value: default.language = en_US
Informational Note: If no other language is explicitly stated in the input-forms.xml, the default language will be attributed to the metadata values.
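The mail.extraproperties format (comma separated, with an equals sign between key and value) can be sketched in Python as follows. This is an illustration only, not the code DSpace uses to read the property. The function name is hypothetical, and the example assumes the backslash line continuations have already been joined into a single value.

```python
def parse_extra_properties(value):
    """Split 'key=value, key=value' into a dict of extra mail settings."""
    props = {}
    for pair in value.split(","):
        pair = pair.strip()
        if not pair:
            continue  # tolerate a trailing comma
        key, sep, val = pair.partition("=")
        if not sep:
            raise ValueError(f"missing '=' in {pair!r}")
        props[key.strip()] = val.strip()
    return props


# The example value from the entry above, with continuation lines joined.
EXAMPLE = (
    "mail.smtp.socketFactory.port=465, "
    "mail.smtp.socketFactory.class=javax.net.ssl.SSLSocketFactory, "
    "mail.smtp.socketFactory.fallback=false"
)

extra = parse_extra_properties(EXAMPLE)
print(extra["mail.smtp.socketFactory.port"])  # 465
```

A missing comma between two pairs would merge them into one value in a parser like this, so it is worth checking each continuation line ends with a comma before the backslash.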
cal volume on the server that DSpace resides. So, you might have something like this: assetstore.dir = /storevgm/assetstore.
Property: assetstore.dir.1, assetstore.dir.2
Example Value: assetstore.dir.1 = /second/assetstore
assetstore.dir.2 = /third/assetstore
Informational Note: This property specifies extra asset stores like the one above, counting from one (1) upwards. This property is commented out (#) until it is needed.
Property: assetstore.incoming
Example Value: assetstore.incoming = 1
Informational Note: Specify the number of the store to use for new bitstreams with this property. The default is 0 [zero] which corresponds to the 'assetstore.dir' above. As the asset store number is stored in the item metadata (in the database), always keep the assetstore numbering consistent and don't change the asset store number in the item metadata.
Be Careful
In the examples above, you can see that your storage does not have to be under the /dspace directory. For the default installation it needs to reside on the same server (unless you plan to configure SRB (cf. below)). So, if you added storage space to your server, and it has a different logical volume/name/directory, you could have the following as an example:
assetstore.dir = /storevgm/assetstore
assetstore.dir.1 = /storevgm2/assetstore
assetstore.incoming = 1
Please Note: When adding additional storage configuration, you will then need to uncomment and declare assetstore.incoming = 1
Property: srb.port.1
Example Value: srb.port.1 = 5544
Property: srb.mcatzone.1
Example Value: srb.mcatzone.1 = mysrbzone
Informational Note: Your SRB Metadata Catalog Zone. An SRB Zone (or zone for short) is a set of SRB servers 'brokered' or administered through a single MCAT. Hence a zone consists of one or more SRB servers along with one MCAT-enabled server. Any existing SRB system (version 2.x.x and below) can be viewed as an SRB zone. For more information on zones, please check http://www.sdsc.edu/srb/index.php/Zones.
Property: srb.mdasdomainname.1
Example Value: srb.mdasdomainname.1 = mysrbdomain
Informational Note: Your SRB domain. This domain should be created under the same zone, specified in srb.mcatzone. Information on domains is included here: http://www.sdsc.edu/srb/index.php/Zones.
Property: srb.defaultstorageresource.1
Example Value: srb.defaultstorageresource.1 = mydefaultsrbresource
Informational Note: Your default SRB Storage resource.
Property: srb.username.1
Example Value: srb.username.1 = mysrbuser
Informational Note: Your SRB Username.
Property: srb.password.1
Example Value: srb.password.1 = mysrbpassword
Informational Note: Your SRB Password.
Property: srb.homedirectory.1
Example Value: srb.homedirectory.1 = /mysrbzone/home/mysrbuser.mysrbdomain
Informational Note: Your SRB Homedirectory
Property: srb.parentdir.1
Example Value: srb.parentdir.1 = mysrbdspaceassetstore
Informational Note: Several of the terms, such as mcatzone, have meaning only in the SRB context and will be familiar to SRB users. The last, srb.parentdir.n, can be used for additional (SRB) upper directory structure within an SRB account. This property value could be blank as well.
The 'assetstore.incoming' property is an integer that references where new bitstreams will be stored. The default (say the starting reference) is zero. The value will be used to identify the storage where all new bitstreams will be stored until this number is changed. This number is stored in the Bitstream table (store_number column) in the DSpace database, so older bitstreams that may have been stored when 'assetstore.incoming' had a different value can be found.

In the simple case in which DSpace uses local (or mounted) storage, the number can refer to different directories (or partitions). This gives DSpace some level of scalability. The number links to another set of properties 'assetstore.dir', 'assetstore.dir.1' (remember zero is default), 'assetstore.dir.2', etc., where the values are directories.

To support the use of SRB, DSpace uses the same scheme but broadened to support:

- using SRB instead of the local file system
- using the local file system (native DSpace)
- using a mix of SRB and local file system

In this broadened use, the 'assetstore.incoming' integer will refer to one of the following storage locations:

- a local file system directory (native DSpace)
- a set of SRB account parameters (host, port, zone, domain, username, password, home directory, and resource)

Should there be any conflict, like '2' referring to a local directory and to a set of SRB parameters, the program will select the local directory.

If SRB is chosen from the first install of DSpace, it is suggested that 'assetstore.dir' (no integer appended) be retained to reference a local directory (as above under File Storage) because build.xml uses this value to do a mkdir. In this case, 'assetstore.incoming' can be set to 1 (i.e. uncomment the line in File Storage above) and the 'assetstore.dir' will not be used.
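The lookup rule described above (a store number points either at a local directory or at a set of SRB parameters, and the local directory wins on conflict) can be sketched as follows. This is a minimal illustration, not DSpace's own code: the function name resolve_store and the two dictionaries are hypothetical stand-ins for values parsed from dspace.cfg.

```python
def resolve_store(store_number, local_dirs, srb_accounts):
    """Return ("local", directory) or ("srb", parameters) for a store number.

    local_dirs and srb_accounts are hypothetical dictionaries keyed by the
    integer suffix of the assetstore.dir.n / srb.*.n properties (0 stands
    for the unsuffixed 'assetstore.dir').
    """
    if store_number in local_dirs:
        # On conflict the local directory is selected, as the text states.
        return ("local", local_dirs[store_number])
    if store_number in srb_accounts:
        return ("srb", srb_accounts[store_number])
    raise KeyError(f"no storage configured for store number {store_number}")


# Example data; the values are taken from the examples in this section.
LOCAL_DIRS = {0: "/storevgm/assetstore", 1: "/storevgm2/assetstore"}
SRB_ACCOUNTS = {
    1: {"port": 5544, "mcatzone": "mysrbzone"},
    2: {"port": 5544, "mcatzone": "mysrbzone"},
}
```

With this data, store number 1 resolves to the local directory even though SRB parameters with suffix 1 also exist, while store number 2 resolves to the SRB account.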
Property: log.dir
Example Value: log.dir = ${dspace.dir}/log
Informational Note: This is where to put the logs. (This is used for initial configuration only.)

Property: useProxies
Example Value: useProxies = true
Informational Note: If your DSpace instance is protected by a proxy server, then for log4j to log the correct IP address of the user rather than of the proxy, it must be configured to look for the X-Forwarded-For header. This feature can be enabled by setting this property to true. This also affects IPAuthentication, and should be enabled for that to work properly if your installation uses a proxy server.

Previous releases of DSpace provided an example ${dspace.dir}/config/log4j.xml as an alternative to log4j.properties. This caused some confusion and has been removed. log4j continues to support both Properties and XML forms of configuration, and you may continue (or begin) to use any form that log4j supports.
If this configuration item is missing or commented out, OR is used. AND requires all the search terms to be present. OR requires one or more search terms to be present.

Property: search.maxfieldlength
Example Value: search.maxfieldlength = 10000
Informational Note: This is the maximum number of terms indexed for a single field in Lucene. The default is 10,000 words, which is often not enough for full-text indexing. If you change this, you will need to re-index for the change to take effect on previously added items. -1 = unlimited (Integer.MAX_VALUE)

Property: search.index.n
Example Value: search.index.1 = author:dc.contributor.*
Informational Note: This property determines which of the metadata fields are being indexed for search. As an example, if you do not include the title field here, searching for a word in the title will not be matched with the titles of your items.
For example, the following entries appear in the default DSpace installation:

search.index.1 = author:dc.contributor.*
search.index.2 = author:dc.creator.*
search.index.3 = title:dc.title.*
search.index.4 = keyword:dc.subject.*
search.index.5 = abstract:dc.description.abstract
search.index.6 = author:dc.description.statementofresponsibility
search.index.7 = series:dc.relation.ispartofseries
search.index.8 = abstract:dc.description.tableofcontents
search.index.9 = mime:dc.format.mimetype
search.index.10 = sponsor:dc.description.sponsorship
search.index.11 = id:dc.identifier.*
search.index.12 = language:dc.language.iso

The format of each entry is search.index.<id> = <search label> : <schema> . <metadata field> where:

<id> is an incremental number to distinguish each search index entry
<search label> is the identifier for the search field this index will correspond to
<schema> is the schema used. Dublin Core (DC) is the default. Others are possible.
<metadata field> is the DSpace metadata field to be indexed.
In the example above, search.index.1, search.index.2 and search.index.6 are configured as the author search field. The author index is created by Lucene indexing all dc.contributor.*, dc.creator.* and dc.description.statementofresponsibility metadata fields. After changing the configuration, run /[dspace]/bin/index-init to regenerate the indexes.
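To make the entry format concrete, here is a small sketch that splits a search.index value into its parts and decides whether a given metadata field falls under it, treating the asterisk as matching any qualifier (or none). The function names are hypothetical and this is not how DSpace itself parses the property; it only mirrors the format described above.

```python
def parse_search_index(value):
    """Split '<search label>:<schema>.<element>[.<qualifier>]' into its parts."""
    label, field = value.split(":", 1)
    parts = field.strip().split(".")
    schema, element = parts[0], parts[1]
    qualifier = parts[2] if len(parts) > 2 else None
    return label.strip(), schema, element, qualifier


def field_matches(configured, schema, element, qualifier=None):
    """True if the metadata field is covered by the configured index entry."""
    _, c_schema, c_element, c_qualifier = parse_search_index(configured)
    if (c_schema, c_element) != (schema, element):
        return False
    # "*" is the wildcard: it covers qualified and unqualified fields alike.
    return c_qualifier == "*" or c_qualifier == qualifier
```

For instance, keyword:dc.subject.* covers both dc.subject and dc.subject.lcsh, whereas keyword:dc.subject.lcsh covers only the LCSH-qualified field.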
While the indexes are created, this only affects the search results and has no effect on the search components of the user interface. One will need to customize the user interface to reflect the changes, for example, to add a new search category to the Advanced Search.

In the above examples, notice the asterisk (*). The metadata field (at least for Dublin Core) is made up of the "element" and the "qualifier". The asterisk is used as the "wildcard". So, for example, keyword:dc.subject.* will index all subjects regardless of whether the term resides in a qualified field (subject versus subject.lcsh). One could customize the search and only index LCSH (Library of Congress Subject Headings) with the entry keyword:dc.subject.lcsh instead of keyword:dc.subject.*

Authority Control Note: Although DSIndexer automatically builds a separate index for the authority keys of any index that contains authority-controlled metadata fields, the "Advanced Search" UI does not allow direct access to it. Perhaps it will be added in the future. Fortunately, the OpenSearch API lets you submit a query directly to the Lucene search engine, and this may include the authority-controlled indexes.
For complete information regarding the Handle server, the user should consult 3.3.4. The Handle Server section of Installing DSpace.
Authorization to execute the functions that are allowed to a user with WRITE permission on an object will be attributed to the ADMIN of the object (e.g. a community or collection admin will always be allowed to edit the metadata of the object). The default is "true" for all of the following configuration properties.

Community Administration: Subcommunities and Collections

Property: core.authorization.community-admin.create-subelement
Example Value: core.authorization.community-admin.create-subelement = true
Informational Note: Authorization for a delegated community administrator to create subcommunities or collections.

Property: core.authorization.community-admin.delete-subelement
Example Value: core.authorization.community-admin.delete-subelement = true
Informational Note: Authorization for a delegated community administrator to delete subcommunities or collections.

Community Administration: Policies and the Group of Administrators

Property: core.authorization.community-admin.policies
Example Value: core.authorization.community-admin.policies = true
Informational Note: Authorization for a delegated community administrator to administrate the community policies.

Property: core.authorization.community-admin.admin-group
Example Value: core.authorization.community-admin.admin-group = true
Informational Note: Authorization for a delegated community administrator to edit the group of community admins.

Community Administration: Collections in the above Community

Property: core.authorization.community-admin.collection.policies
Example Value: core.authorization.community-admin.collection.policies = true
Informational Note: Authorization for a delegated community administrator to administrate the policies for underlying collections.

Property: core.authorization.community-admin.collection.template-item
Example Value: core.authorization.community-admin.collection.template-item = true
Informational Note: Authorization for a delegated community administrator to administrate the item template for underlying collections.

Property: core.authorization.community-admin.collection.submitters
Example Value: core.authorization.community-admin.collection.submitters = true
Informational Note: Authorization for a delegated community administrator to administrate the group of submitters for underlying collections.

Property: core.authorization.community-admin.collection.workflows
Example Value: core.authorization.community-admin.collection.workflows = true
Informational Note: Authorization for a delegated community administrator to administrate the workflows for underlying collections.

Property: core.authorization.community-admin.collection.admin-group
Example Value: core.authorization.community-admin.collection.admin-group = true
Informational Note: Authorization for a delegated community administrator to administrate the group of administrators for underlying collections.
Community Administration: Items Owned by Collections in the Above Community

Property: core.authorization.community-admin.item.delete
Example Value: core.authorization.community-admin.item.delete = true
Informational Note: Authorization for a delegated community administrator to delete items in underlying collections.

Property: core.authorization.community-admin.item.withdraw
Example Value: core.authorization.community-admin.item.withdraw = true
Informational Note: Authorization for a delegated community administrator to withdraw items in underlying collections.

Property: core.authorization.community-admin.item.reinstate
Example Value: core.authorization.community-admin.item.reinstate = true
Informational Note: Authorization for a delegated community administrator to reinstate items in underlying collections.

Property: core.authorization.community-admin.item.policies
Example Value: core.authorization.community-admin.item.policies = true
Informational Note: Authorization for a delegated community administrator to administrate item policies in underlying collections.
Community Administration: Bundles of Bitstreams, related to items owned by collections in the above Community

Property: core.authorization.community-admin.item.create-bitstream
Example Value: core.authorization.community-admin.item.create-bitstream = true
Informational Note: Authorization for a delegated community administrator to create additional bitstreams in items in underlying collections.

Property: core.authorization.community-admin.item.delete-bitstream
Example Value: core.authorization.community-admin.item.delete-bitstream = true
Informational Note: Authorization for a delegated community administrator to delete bitstreams from items in underlying collections.

Property: core.authorization.community-admin.item.cc-license
Example Value: core.authorization.community-admin.item.cc-license = true
Informational Note: Authorization for a delegated community administrator to administer licenses for items in underlying collections.
Collection Administration. The properties for collection administrators work similarly to those of community administrators, with respect to collection administration:

core.authorization.collection-admin.policies
core.authorization.collection-admin.template-item
core.authorization.collection-admin.submitters
core.authorization.collection-admin.workflows
core.authorization.collection-admin.admin-group

Collection Administration: Items owned by the above Collection. The properties for collection administrators work similarly to those of community administrators, with respect to administration of items in underlying collections:

core.authorization.collection-admin.item.delete
core.authorization.collection-admin.item.withdraw
core.authorization.collection-admin.item.reinstatiate
core.authorization.collection-admin.item.policies

Collection Administration: Bundles of bitstreams, related to items owned by collections in the above Community. The properties for collection administrators work similarly to those of community administrators, with respect to administration of bitstreams related to items in underlying collections:

core.authorization.collection-admin.item.create-bitstream
core.authorization.collection-admin.item.delete-bitstream
core.authorization.collection-admin.item-admin.cc-license

Item Administration. The properties for item administrators work similarly to those of community and collection administrators, with respect to administration of items in underlying collections:

core.authorization.item-admin.policies

Item Administration: Bundles of bitstreams, related to items owned by collections in the above Community. The properties for item administrators work similarly to those of community and collection administrators, with respect to administration of bitstreams related to items in underlying collections:

core.authorization.item-admin.create-bitstream
core.authorization.item-admin.delete-bitstream
core.authorization.item-admin.cc-license
Oracle users should consult Chapter 4 Updating a DSpace Installation regarding the necessary database changes that need to take place.
The configuration property plugin.sequence.org.dspace.authenticate.AuthenticationMethod defines the authentication stack. It is a comma-separated list of class names. Each of these classes implements a different authentication method, or way of determining the identity of the user. They are invoked in the order specified until one succeeds.

An authentication method is a class that implements the interface org.dspace.authenticate.AuthenticationMethod. It authenticates a user by evaluating the credentials (e.g. username and password) he or she presents and checking that they are valid.

The basic authentication procedure in the DSpace Web UI is this:

1. A request is received from an end-user's browser that, if fulfilled, would lead to an action requiring authorization taking place.
2. If the end-user is already authenticated:
   - If the end-user is allowed to perform the action, the action proceeds.
   - If the end-user is NOT allowed to perform the action, an authorization error is displayed.
   If the end-user is NOT authenticated, i.e. is accessing DSpace anonymously:
3. The parameters etc. of the request are stored.
4. The Web UI's startAuthentication method is invoked.
5. First it tries all the authentication methods which do implicit authentication (i.e. they work with just the information already in the Web request, such as an X.509 client certificate). If one of these succeeds, it proceeds from Step 2 above.
6. If none of the implicit methods succeed, the UI responds by putting up a "login" page to collect credentials for one of the explicit authentication methods in the stack. The servlet processing that page then gives the proffered credentials to each authentication method in turn until one succeeds, at which point it retries the original operation from Step 2 above.

Please see the source files AuthenticationManager.java and AuthenticationMethod.java for more details about this mechanism.
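The two-pass behaviour in steps 5 and 6 (implicit methods first, then each method in turn once credentials have been collected) can be illustrated with a short sketch. The AuthMethod class and the authenticate function below are hypothetical stand-ins written for this example; they do not reproduce the interface in AuthenticationMethod.java.

```python
class AuthMethod:
    """Hypothetical stand-in for one entry in the authentication stack."""

    def __init__(self, name, implicit, accepts):
        self.name = name
        self.implicit = implicit  # True if it works from the request alone
        self.accepts = accepts    # callable(request, credentials) -> bool


def authenticate(stack, request, credentials=None):
    """Return the name of the first method that succeeds, or None."""
    # Step 5: try implicit methods using only what is in the request.
    for method in stack:
        if method.implicit and method.accepts(request, None):
            return method.name
    if credentials is None:
        return None  # Step 6: the caller would now show the login page.
    # Step 6 (continued): offer the credentials to each method in stack order.
    for method in stack:
        if method.accepts(request, credentials):
            return method.name
    return None


# Illustrative stack: certificate check first, then a password check.
STACK = [
    AuthMethod("x509", True, lambda req, cred: bool(req.get("client_cert"))),
    AuthMethod("password", False, lambda req, cred: cred == ("user", "secret")),
]
```

A request carrying a client certificate is accepted by the implicit method without any login page; a request without one returns None first and succeeds only after credentials are supplied.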
Property: authentication.shib.email-use-tomcat-remote-user
Example Value: authentication.shib.email-use-tomcat-remote-user = true
Informational Note: This option forces the software to acquire the email from Tomcat.

Property: authentication.shib.autoregister
Example Value: authentication.shib.autoregister = true
Informational Note: This option allows new users to be registered automatically if the IdP provides sufficient information (and the user does not exist in DSpace).

Property: authentication.shib.role-header, authentication.shib.role-header.ignore-scope
Example Value:

authentication.shib.role-header = Shib-EP-ScopedAffiliation
authentication.shib.role-header.ignore-scope = true

or

authentication.shib.role-header = Shib-EP-UnscopedAffiliation
authentication.shib.role-header.ignore-scope = false

Informational Note: These two options specify which attribute is responsible for providing the user's roles to DSpace, and unscope the attributes if needed. When not specified, the role header defaults to 'Shib-EP-UnscopedAffiliation', and ignore-scope defaults to 'false'. The value is specified in AAP.xml (Shib 1.3.x) or attribute-filter.xml (Shib 2.x). The value is CASE-Sensitive. The values provided in this header are separated by semi-colon or comma. If your SP only provides a scoped role header, you need to set authentication.shib.role-header.ignore-scope to 'true'. For example, if you only get Shib-EP-ScopedAffiliation instead of Shib-EP-UnscopedAffiliation, you need to make your settings as in the first example value above.

Property: authentication.shib.default-roles
Example Value: authentication.shib.default-roles = Staff, Walk-ins
Informational Note: When a user is fully authenticated on the IdP but would not like to release his/her roles to DSpace (for privacy reasons, say), these are the default roles given to such a user. The values are separated by semi-colon or comma.
Informational Note: The following mappings specify the role mapping between the IdP and DSpace. The left side of the entry is the IdP's role (prefixed with 'authentication.shib.role.'), which will be mapped to the right entry from DSpace. The DSpace group indicated on the right entry has to EXIST in DSpace, otherwise the user will be identified as 'anonymous'. Multiple values on the right entry should be separated by comma. The values are CASE-Sensitive. Heuristic one-to-one mapping will be done when the IdP group's entry is not listed below (i.e. if the 'X' group in the IdP is not specified here, then it will be mapped to the 'X' group in DSpace if it exists, otherwise it will be mapped to simply 'anonymous'). Given sufficient demand, a future release could support regex for the mapping. Special characters need to be escaped by '\'.
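The mapping rules above can be sketched as a small function: split the role header on semi-colons or commas, optionally drop the scope, apply any explicit mapping, fall back to a same-named DSpace group, and end up as 'anonymous' if nothing exists. This is an illustration under stated assumptions only: map_roles and its arguments are hypothetical, and the assumption that a scoped value looks like role@scope is mine rather than something this section states.

```python
def map_roles(header_value, mappings, existing_groups, ignore_scope=False):
    """Map IdP roles from a role header to existing DSpace group names."""
    raw = header_value.replace(";", ",").split(",")
    roles = [role.strip() for role in raw if role.strip()]
    groups = []
    for role in roles:
        if ignore_scope:
            # Assumed scoped form 'role@scope'; keep only the role part.
            role = role.split("@", 1)[0]
        # Explicit mapping if listed, otherwise heuristic one-to-one mapping.
        for group in mappings.get(role, [role]):
            if group in existing_groups and group not in groups:
                groups.append(group)
    return groups or ["anonymous"]
```

The role and group names used when calling this function are case-sensitive, as the note above requires, because plain string comparison is used throughout.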
2. Add the org.dspace.authenticate.X509Authentication plugin first to the list of stackable authentication methods in the value of the configuration key plugin.sequence.org.dspace.authenticate.AuthenticationMethod, e.g.:

plugin.sequence.org.dspace.authenticate.AuthenticationMethod = \
    org.dspace.authenticate.X509Authentication, \
    org.dspace.authenticate.PasswordAuthentication
3. You must also configure DSpace with the same CA certificates as the web server, so it can accept and interpret the clients' certificates. It can share the same keystore file as the web server, or a separate one, or a CA certificate in a file by itself. Configure it by one of these methods, either the Java keystore:

authentication.x509.keystore.path = path to Java keystore file
authentication.x509.keystore.password = password to access the keystore

4. Choose whether to enable auto-registration: If you want users who authenticate successfully to be automatically registered as new E-Persons if they are not already, set the authentication.x509.autoregister configuration property to true. This lets you automatically accept all users with valid personal certificates. The default is false.
Negative matches can be set by prepending the entry with a '-'. For example, if you want to include all of a class B network except for users of a contained class C network, you could use: 111.222,-111.222.333.

Notes:

- If the Groupname contains blanks you must escape them, e.g. Department\ of\ Statistics
- If your DSpace installation is hidden behind a web proxy, remember to set the 'useProxies' configuration option within the 'Logging' section of dspace.cfg to use the IP address of the user rather than the IP address of the proxy server.
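The effect of a negative entry can be shown with a short sketch. It assumes that each entry is a dotted prefix compared component by component, and that any matching negative entry overrides the positive ones; both assumptions and the function name ip_in_group are mine, and the real matching code may differ (for example in how it treats other address notations).

```python
def ip_in_group(ip, entries):
    """True if ip matches a positive entry and no negative ('-') entry."""
    ip_parts = ip.split(".")

    def prefix_match(prefix):
        prefix_parts = prefix.split(".")
        return ip_parts[:len(prefix_parts)] == prefix_parts

    matched = False
    for entry in entries:
        entry = entry.strip()
        if entry.startswith("-"):
            if prefix_match(entry[1:]):
                return False  # an excluded sub-network always wins
        elif prefix_match(entry):
            matched = True
    return matched


# The illustrative value from the text, split on commas.
ENTRIES = "111.222,-111.222.333".split(",")
```

Under these assumptions an address beginning 111.222.10 is in the group, while one beginning 111.222.333 is excluded.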
If LDAP is enabled in the dspace.cfg file, then new users will be able to register by entering their username and password without being sent the registration token. If users do not have a username and password, then they can still register and login with just their email address the same way they do now.

If you want to give any special privileges to LDAP users, create a stackable authentication method to automatically put people who have a netid into a special group. You might also want to give certain email addresses special privileges. Refer to the Custom Authentication Code section above for more information about how to do this.

Here is an explanation of what each of the different configuration parameters is for:

Standard LDAP Configuration

Property: ldap.enable
Example Value: ldap.enable = false
Informational Note: This setting will enable or disable LDAP authentication in DSpace. With the setting off, users will be required to register and login with their email address. With this setting on, users will be able to login and register with their LDAP user ids and passwords.

Property: ldap.provider_url
Example Value: ldap.provider_url = ldap://ldap.myu.edu/o=myu.edu
Informational Note: This is the url to your institution's LDAP server. You may or may not need the /o=myu.edu part at the end. Your server may also require the ldaps:// protocol.

Property: ldap.id_field
Example Value: ldap.id_field = uid
Informational Note: This is the unique identifier field in the LDAP directory where the username is stored.

Property: ldap.object_context
Example Value: ldap.object_context = ou=people,o=myu.edu
Informational Note: This is the object context used when authenticating the user. It is appended to the ldap.id_field and username. For example uid=username,ou=people,o=myu.edu. You will need to modify this to match your LDAP configuration.
Property: ldap.search_context
Example Value: ldap.search_context = ou=people
Informational Note: This is the search context used when looking up a user's LDAP object to retrieve their data for autoregistering. With webui.ldap.autoregister turned on, when a user authenticates without an EPerson object we search the LDAP directory to get their name and email address so that we can create one for them. So after we have authenticated against uid=username,ou=people,o=myu.edu we now search in ou=people, filtering on [uid=username]. Often the ldap.search_context is the same as the ldap.object_context parameter. But again this depends on your LDAP server configuration.

Property: ldap.email_field
Example Value: ldap.email_field = mail
Informational Note: This is the LDAP object field where the user's email address is stored. "mail" is the default and the most common for LDAP servers. If the mail field is not found the username will be used as the email address when creating the eperson object.

Property: ldap.surname_field
Example Value: ldap.surname_field = sn
Informational Note: This is the LDAP object field where the user's last name is stored. "sn" is the default and is the most common for LDAP servers. If the field is not found the field will be left blank in the new eperson object.

Property: ldap.givenname_field
Example Value: ldap.givenname_field = givenName
Informational Note: This is the LDAP object field where the user's given names are stored. The givenName field may not be present in every LDAP instance. If the field is not found the field will be left blank in the new eperson object.

Property: ldap.phone_field
Example Value: ldap.phone_field = telephoneNumber
Informational Note: This is the field where the user's phone number is stored in the LDAP directory. If the field is not found the field will be left blank in the new eperson object.

Property: webui.ldap.autoregister
Example Value: webui.ldap.autoregister = true
Informational Note: This will turn LDAP autoregistration on or off. With this on, a new EPerson object will be created for any user who successfully authenticates against the LDAP server when they first login. With this setting off, the user must first register to get an EPerson object by entering their LDAP username and password and filling out the forms.

LDAP Users Group

Property: ldap.login.specialgroup
Example Value: ldap.login.specialgroup = group-name
Informational Note: If required, a group name can be given here, and all users who log into LDAP will automatically become members of this group. This is useful if you want a group made up of all internal authenticated users. (Remember to log on as the administrator and add this to the "Groups" with read rights.)
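As described for ldap.object_context, the DN used for authentication is formed from ldap.id_field, the username and the object context. A minimal sketch of that concatenation follows; build_user_dn is a hypothetical helper, and it deliberately omits the escaping of special characters that a real DN would need.

```python
def build_user_dn(username, id_field, object_context):
    """Combine id field, username and object context into a DN string.

    Simplified: special characters in the username are not escaped here.
    """
    return f"{id_field}={username},{object_context}"


# Values from the example configuration above.
EXAMPLE_DN = build_user_dn("username", "uid", "ou=people,o=myu.edu")
```

With the example values this yields the DN quoted in the note for ldap.object_context.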
Hierarchical LDAP Settings. If your users are spread out across a hierarchical tree on your LDAP server, you will need to use the following stackable authentication class:
plugin.sequence.org.dspace.authenticate.AuthenticationMethod = \
    org.dspace.authenticate.LDAPHierarchicalAuthentication
You can optionally specify the search scope. If anonymous access is not enabled on your LDAP server, you will need to specify the full DN and password of a user that is allowed to bind in order to search for the users.

Property: ldap.search_scope
Example Value: ldap.search_scope = 2
Informational Note: This is the search scope value for the LDAP search during autoregistering. This will depend on your LDAP server setup. This value must be one of the following integers corresponding to the following values:

object scope : 0
one level scope : 1
subtree scope : 2

Property: ldap.search.user, ldap.search.password
Example Value:

ldap.search.user = cn=admin,ou=people,o=myu.edu
ldap.search.password = password

Informational Note: The full DN and password of a user allowed to connect to the LDAP server and search for the DN of the user trying to log in. If these are not specified, the initial bind will be performed anonymously.

Property: ldap.netid_email_domain
Example Value: ldap.netid_email_domain = @example.com
Informational Note: If your LDAP server does not hold an email address for a user, you can use the following field to specify your email domain. This value is appended to the netid in order to make an email address. E.g. a netid of 'user' and ldap.netid_email_domain as @example.com would set the email of the user to be user@example.com
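The email fallback described for ldap.email_field and ldap.netid_email_domain, together with the three search scope integers, can be summarized in a few lines. The helper name email_for and the dictionary SEARCH_SCOPES are hypothetical; the order of the fallbacks is my reading of the notes above, not a statement about the actual implementation.

```python
# Integer values accepted by ldap.search_scope, as listed above.
SEARCH_SCOPES = {0: "object", 1: "one level", 2: "subtree"}


def email_for(netid, ldap_mail=None, netid_email_domain=None):
    """Pick the email address for a newly registered user."""
    if ldap_mail:
        return ldap_mail                    # value of the LDAP mail field
    if netid_email_domain:
        return netid + netid_email_domain   # e.g. 'user' + '@example.com'
    return netid                            # username used as the address
```

For a netid of 'user' with no mail attribute and a domain of @example.com, this produces user@example.com, matching the example in the note.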
Media Filters are configured as Named Plugins, with each filter also having a separate configuration setting (in dspace.cfg) indicating which formats it can process. The default configuration is shown below.

Property: filter.plugins
Example Value:

filter.plugins = PDF Text Extractor, HTML Text Extractor, \
    Word Text Extractor, JPEG Thumbnail

Informational Note: Place the names of the enabled MediaFilter or FormatFilter plugins here. To enable Branded Preview, comment out the previous one line and then uncomment the two lines found in dspace.cfg:

Word Text Extractor, JPEG Thumbnail, \
Branded Preview JPEG
Property: plugin.named.org.dspace.app.mediafilter.FormatFilter
Example Value:

plugin.named.org.dspace.app.mediafilter.FormatFilter = \
    org.dspace.app.mediafilter.PDFFilter = PDF Text Extractor, \
    org.dspace.app.mediafilter.HTMLFilter = HTML Text Extractor, \
    org.dspace.app.mediafilter.WordFilter = Word Text Extractor, \
    org.dspace.app.mediafilter.JPEGFilter = JPEG Thumbnail, \
    org.dspace.app.mediafilter.BrandedPreviewJPEGFilter = Branded Preview JPEG

Property:

filter.org.dspace.app.mediafilter.PDFFilter.inputFormats
filter.org.dspace.app.mediafilter.HTMLFilter.inputFormats
filter.org.dspace.app.mediafilter.WordFilter.inputFormats
filter.org.dspace.app.mediafilter.JPEGFilter.inputFormats
filter.org.dspace.app.mediafilter.BrandedPreviewJPEGFilter.in

Example Value:

filter.org.dspace.app.mediafilter.PDFFilter.inputFormats = Adobe PDF
filter.org.dspace.app.mediafilter.HTMLFilter.inputFormats = HTML, Text
filter.org.dspace.app.mediafilter.WordFilter.inputFormats = Microsoft Word
filter.org.dspace.app.mediafilter.JPEGFilter.inputFormats = BMP, GIF, JPEG, \
Informational Note: If this value is set to "true", all PDF extractions are written to temp files as they are indexed. This is slower, but helps to ensure that the PDFBox software DSpace uses does not eat up all your memory.

Property: pdffilter.skiponmemoryexception
Example Value: pdffilter.skiponmemoryexception = true
Informational Note: If this value is set to "true", PDFs which still result in an "Out of Memory" error from PDFBox are skipped over. These problematic PDFs will never be indexed until memory usage can be decreased in the PDFBox software.
Names are assigned to each filter using the plugin.named.org.dspace.app.mediafilter.FormatFilter field (e.g. by default the PDFFilter is named "PDF Text Extractor"). Finally, the appropriate filter.<class path>.inputFormats property defines the valid input formats to which each filter can be applied. These format names must match the short description field of the Bitstream Format Registry. You can also implement more dynamic or configurable Media/Format Filters which extend SelfNamedPlugin.
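How the three settings fit together (enabled names, the class-to-name assignment, and the per-class input formats) can be seen in a small sketch. The functions and constants below are hypothetical illustrations built from the default values quoted in this section; they are not the plugin manager's actual logic.

```python
def parse_named_plugins(value):
    """Parse 'class = name, class = name' into a {name: class} mapping."""
    names = {}
    for entry in value.split(","):
        class_name, plugin_name = entry.split("=", 1)
        names[plugin_name.strip()] = class_name.strip()
    return names


def filters_for_format(short_description, enabled, named, input_formats):
    """Return the enabled filter names whose class accepts the given format."""
    selected = []
    for name in enabled:
        class_name = named[name]
        if short_description in input_formats.get(class_name, []):
            selected.append(name)
    return selected


NAMED = parse_named_plugins(
    "org.dspace.app.mediafilter.PDFFilter = PDF Text Extractor, "
    "org.dspace.app.mediafilter.HTMLFilter = HTML Text Extractor, "
    "org.dspace.app.mediafilter.WordFilter = Word Text Extractor"
)
ENABLED = ["PDF Text Extractor", "HTML Text Extractor", "Word Text Extractor"]
INPUT_FORMATS = {
    "org.dspace.app.mediafilter.PDFFilter": ["Adobe PDF"],
    "org.dspace.app.mediafilter.HTMLFilter": ["HTML", "Text"],
    "org.dspace.app.mediafilter.WordFilter": ["Microsoft Word"],
}
```

The lookup is by exact string, which is why the names in filter.plugins must match the names assigned in the named-plugin property, and why the format names must match the Bitstream Format Registry short descriptions.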
The MODS crosswalk properties file is a list of properties describing how DSpace metadata elements are to be turned into elements of the MODS XML output document. The property name is a concatenation of the metadata schema, element name, and optionally the qualifier. For example, the contributor.author element in the native Dublin Core schema would be: dc.contributor.author.

The value of the property is a line containing two segments separated by the vertical bar ("|"): The first part is an XML fragment which is copied into the output document. The second is an XPath expression describing where in that fragment to put the value of the metadata element. For example, in this property:
dc.contributor.author = <mods:name> <mods:role> <mods:roleTerm type="text">author</mods:roleTerm> </mods:role> <mods:namePart>%s</mods:namePart> </mods:name>
Some of the examples include the string "%s" in the prototype XML where the text value is to be inserted. Do not pay any attention to it; it is an artifact that the crosswalk ignores. For example, given an author named Jack Florey, the crosswalk will insert

<mods:name>
  <mods:role>
    <mods:roleTerm type="text">author</mods:roleTerm>
  </mods:role>
  <mods:namePart>Jack Florey</mods:namePart>
</mods:name>
into the output document. Read the example configuration file for more details.
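The "fragment | path" mechanism can be illustrated with Python's standard XML library. This sketch makes several assumptions that are not taken from the configuration file: the function name apply_crosswalk is hypothetical, the path after the bar is written in the simplified form mods:namePart (ElementTree supports only a subset of XPath), and the mods prefix is bound to the MODS v3 namespace so the fragment can be parsed on its own.

```python
import xml.etree.ElementTree as ET

MODS_NS = "http://www.loc.gov/mods/v3"


def apply_crosswalk(property_value, metadata_value):
    """Copy the XML fragment and put the metadata value at the given path."""
    fragment, path = (part.strip() for part in property_value.split("|", 1))
    # Wrap the fragment so the 'mods' prefix is declared while parsing.
    wrapped = f'<wrapper xmlns:mods="{MODS_NS}">{fragment}</wrapper>'
    element = ET.fromstring(wrapped)[0]
    target = element.find(path, {"mods": MODS_NS})
    if target is None:
        raise ValueError(f"path {path!r} not found in fragment")
    # Any placeholder text such as "%s" is simply overwritten.
    target.text = metadata_value
    return element


PROPERTY_VALUE = (
    '<mods:name><mods:role><mods:roleTerm type="text">author</mods:roleTerm>'
    "</mods:role><mods:namePart>%s</mods:namePart></mods:name> | mods:namePart"
)
RESULT = apply_crosswalk(PROPERTY_VALUE, "Jack Florey")
```

Here the "%s" in the prototype plays no role: the text of the element selected by the path is replaced outright, which is consistent with the remark that the crosswalk ignores it.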
As shown above, there are three (3) parts that make up the properties "key":

crosswalk.submission.PluginName.stylesheet =
1 2 3 4

crosswalk is the first part of the property key.
submission is the second part of the property key.
PluginName is the name of the plugin.
The path value is the path to the file containing the crosswalk stylesheet (relative to /[dspace]/config).

Here is an example that configures a crosswalk named "LOM" using a stylesheet in [dspace]/config/crosswalks/d-lom.xsl:

crosswalk.submission.LOM.stylesheet = crosswalks/d-lom.xsl

A dissemination crosswalk can be configured by starting with the property key crosswalk.dissemination. Example:

crosswalk.dissemination.PluginName.stylesheet = path

The PluginName is the name of the plugin. The path value is the path to the file containing the crosswalk stylesheet (relative to /[dspace]/config).

You can make two different plugin names point to the same crosswalk, by adding two configuration entries with the same path:
crosswalk.submission.MyFormat.stylesheet = crosswalks/myformat.xslt
crosswalk.submission.almost_DC.stylesheet = crosswalks/myformat.xslt
The dissemination crosswalk must also be configured with an XML Namespace (including prefix and URI) and an XML schema for its output format. This is configured on additional properties in the DSpace configuration:
crosswalk.dissemination.PluginName.namespace.Prefix = namespace-URI
crosswalk.dissemination.PluginName.schemaLocation = schemaLocation value
For example:
crosswalk.dissemination.qdc.namespace.dc = http://purl.org/dc/elements/1.1/
crosswalk.dissemination.qdc.namespace.dcterms = http://purl.org/dc/terms/
crosswalk.dissemination.qdc.schemalocation = http://purl.org/dc/elements/1.1/ \
    http://dublincore.org/schemas/xmls/qdc/2003/04/02/qualifieddc.xsd
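The key layout used by these XSLT crosswalk properties (crosswalk, then submission or dissemination, then the plugin name, then the setting and an optional namespace prefix) can be made explicit with a small parser. This is only a reading aid; parse_crosswalk_key is a hypothetical helper and not part of DSpace.

```python
def parse_crosswalk_key(key):
    """Split an XSLT crosswalk property key into its named parts."""
    parts = key.split(".")
    if len(parts) < 4 or parts[0] != "crosswalk":
        raise ValueError(f"not a crosswalk key: {key}")
    direction, plugin_name, setting = parts[1], parts[2], parts[3]
    if direction not in ("submission", "dissemination"):
        raise ValueError(f"unknown crosswalk direction: {direction}")
    # Namespace keys carry one more part: the namespace prefix.
    prefix = parts[4] if len(parts) > 4 else None
    return direction, plugin_name, setting, prefix
```

Applied to crosswalk.submission.LOM.stylesheet it identifies a submission crosswalk for the plugin LOM; applied to crosswalk.dissemination.qdc.namespace.dcterms it additionally returns the prefix dcterms.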
6.2.15.2.1. Testing XSLT Crosswalks

The XSLT crosswalks will automatically reload an XSL stylesheet that has been modified, so you can edit and test stylesheets without restarting DSpace. You can test a dissemination crosswalk by hooking it up to an OAI-PMH crosswalk and using an OAI request to get the metadata for a known item.

Testing the submission crosswalk is more difficult, so we have supplied a command-line utility to help. It calls the crosswalk plugin to translate an XML document you submit, and displays the resulting intermediate XML (DIM). Invoke it with:
[dspace]/bin/dsrun org.dspace.content.crosswalk.XSLTIngestionCrosswalk [-l] plugin input-file
where plugin is the name of the crosswalk plugin to test (e.g. "LOM"), and input-file is a file containing an XML document of metadata in the appropriate format. Add the -l option to pass the ingestion crosswalk a list of elements instead of a whole document, as if the List form of the ingest() method had been called. This is needed to test ingesters for formats like DC that get called with lists of elements instead of a root element.
Property: crosswalk.qdc.namespace.qdc.dc
Example Value: crosswalk.qdc.namespace.qdc.dc = http://purl.org/dc/elements/1.1/
Property: crosswalk.qdc.namespace.qdc.dcterms
Example Value: crosswalk.qdc.namespace.qdc.dcterms = http://purl.org/dc/terms/
Property: crosswalk.qdc.schemaLocation.QDC
Example Value:
crosswalk.qdc.schemaLocation.QDC = http://www.purl.org/dc/terms \
    http://dublincore.org/schemas/xmls/qdc/2006/01/06/dcterms.xsd \
    http://purl.org/dc/elements/1.1 \
    http://dublincore.org/schemas/xmls/qdc/2006/01/06/dc.xsd
Property: crosswalk.qdc.properties.QDC
Example Value: crosswalk.qdc.properties.QDC = crosswalks/QDC.properties
Informational Note: Configuration of the QDC Crosswalk dissemination plugin for Qualified DC. (Add lower-case name for OAI-PMH. That is, change QDC to qdc.)
In the property key "crosswalk.qdc.properties.QDC", the value is a path to a separate properties file containing the configuration for this crosswalk. The pathname is relative to the DSpace configuration directory /[dspace]/config. Referring back to the "Example Value" for this property key, one has crosswalks/qdc.properties, which defines a crosswalk named QDC whose configuration comes from the file [dspace]/config/crosswalks/qdc.properties.
You will also need to configure the namespaces and schema location strings for the XML output generated by this crosswalk. The namespace property names are formatted:
crosswalk.qdc.namespace.prefix = uri
where prefix is the namespace prefix and uri is the namespace URI. See the Property and Example Value entries above for how the default dspace.cfg has been configured.
The QDC crosswalk properties file is a list of properties describing how DSpace metadata elements are to be turned into elements of the Qualified DC XML output document. The property name is a concatenation of the metadata schema, element name, and optionally the qualifier. For example, the contributor.author element in the native Dublin Core schema would be: dc.contributor.author. The value of the property is an XML fragment, the element whose value will be set to the value of the metadata field in the property key. For example, in this property:
dc.coverage.temporal = <dcterms:temporal />
the generated XML in the output document would look like, e.g.:
<dcterms:temporal>Fall, 2005</dcterms:temporal>
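The mapping from a metadata field to an output element can be illustrated with a short Python sketch. It is an illustration of the idea only, not the crosswalk's implementation: render_field is a hypothetical helper, and it accepts only empty-element fragments without attributes, such as the one in the example above.

```python
import re
from xml.sax.saxutils import escape

# Matches an empty-element fragment such as "<dcterms:temporal />".
FRAGMENT = re.compile(r"^<\s*([A-Za-z_][\w.-]*(?::[A-Za-z_][\w.-]*)?)\s*/>$")


def render_field(mapping, field, value):
    """Return the output element for one metadata value, or None if unmapped."""
    fragment = mapping.get(field)
    if fragment is None:
        return None
    match = FRAGMENT.match(fragment.strip())
    if match is None:
        raise ValueError(f"unsupported fragment for {field}: {fragment!r}")
    tag = match.group(1)
    # Escape the value so characters like & and < stay well-formed XML.
    return f"<{tag}>{escape(value)}</{tag}>"


mapping = {"dc.coverage.temporal": "<dcterms:temporal />"}
print(render_field(mapping, "dc.coverage.temporal", "Fall, 2005"))
```

A field with no entry in the properties file yields None here, which corresponds to that field simply being left out of the output document.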
You can add names for existing crosswalks, add new plugin classes, and add new configurations for the configurable crosswalks as noted below.
Informational Note: Consumer to maintain the search index.
Property: event.consumer.browse.class
Example Value: event.consumer.browse.class = org.dspace.browse.BrowseConsumer
Informational Note: Consumer to maintain the browse index.
Property: event.consumer.browse.filters
Example Value: event.consumer.browse.filters = Community | Collection | Item | Bundle+Add | Create | Modify | Modify_Metadata | Delete | Remove
Informational Note: Consumer to maintain the browse index.
Property: event.consumer.eperson.class
Example Value: event.consumer.eperson.class = org.dspace.eperson.EPersonConsumer
Informational Note: Consumer related to EPerson changes.
Property: event.consumer.eperson.filters
Example Value: event.consumer.eperson.filters = EPerson+Create
Informational Note: Consumer related to EPerson changes.
Property: event.consumer.test.class
Example Value: event.consumer.test.class = org.dspace.event.TestConsumer
Informational Note: Test consumer for debugging and monitoring. Commented out by default.
Property: event.consumer.test.filters
Example Value: event.consumer.test.filters = All+All
Informational Note: Test consumer for debugging and monitoring. Commented out by default.
Property: testConsumer.verbose
Example Value: testConsumer.verbose = true
Informational Note: Set this to true to enable testConsumer messages to standard output. Commented out by default.
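The filter values above appear to follow the pattern "object types + event types", with the alternatives on each side separated by "|". The Python sketch below reads a filter that way. This reading is inferred from the example values only, including the assumption that "All" acts as a wildcard; parse_filter and matches are hypothetical helper names.

```python
def _names(part):
    """Split a '|'-separated list and drop surrounding whitespace."""
    return [name.strip() for name in part.split("|") if name.strip()]


def parse_filter(value):
    """Split '<object types>+<event types>' into two lists of names."""
    objects, _, events = value.partition("+")
    return _names(objects), _names(events)


def matches(value, object_type, event_type):
    """True if the filter covers the pair; 'All' is assumed to match anything."""
    objects, events = parse_filter(value)
    return (("All" in objects or object_type in objects)
            and ("All" in events or event_type in events))


BROWSE = ("Community | Collection | Item | Bundle+Add | Create | Modify | "
          "Modify_Metadata | Delete | Remove")
print(parse_filter(BROWSE))
print(matches(BROWSE, "Item", "Modify"))
```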
6.2.17. Embargo
DSpace embargoes utilize standard metadata fields to hold both the 'terms' and the 'lift date'. Which fields you use is configurable, and no specific metadata element is dedicated or predefined for use in embargo. Rather, you specify exactly what field you want the embargo system to examine when it needs to find the terms or assign the lift date.
Property: embargo.field.terms
Example Value: embargo.field.terms = SCHEMA.ELEMENT.QUALIFIER
Informational Note: Embargo terms will be stored in the item metadata. This property determines in which metadata field these terms will be stored. An example could be dc.embargo.terms
Property: embargo.field.lift
Example Value: embargo.field.lift = SCHEMA.ELEMENT.QUALIFIER
Informational Note: The embargo lift date will be stored in the item metadata. This property determines in which metadata field the computed embargo lift date will be stored. You may need to create a DC metadata field in your Metadata Format Registry if it does not already exist. An example could be dc.embargo.liftdate
Property: embargo.terms.open
Example Value: embargo.terms.open = forever
Informational Note: You can determine your own values for the embargo.field.terms property (see above). This property determines the string value in the terms field that indicates an indefinite embargo.
Property: plugin.single.org.dspace.embargo.EmbargoSetter
Example Value: plugin.single.org.dspace.embargo.EmbargoSetter = org.dspace.embargo.DefaultEmbargoSetter
Informational Note: To implement the business logic to set your embargoes, you need to override the EmbargoSetter class. If you use the value DefaultEmbargoSetter, the default implementation will be used.
Property: plugin.single.org.dspace.embargo.EmbargoLifter
Example Value: plugin.single.org.dspace.embargo.EmbargoLifter = org.dspace.embargo.DefaultEmbargoLifter
Informational Note: To implement the business logic to lift your embargoes, you need to override the EmbargoLifter class. If you use the value DefaultEmbargoLifter, the default implementation will be used.
Key Recommendations:
1. If using existing metadata fields, avoid any that are automatically managed by DSpace. For example, fields like 'date.issued' or 'date.accessioned' are normally automatically assigned, and thus must not be recruited for embargo use.
2. Do not place the field for 'lift date' in submission screens. This can potentially confuse submitters because they may feel that they can directly assign values to it. As noted in the life-cycle above, this is erroneous: the lift date gets assigned by the embargo system based on the terms. Any pre-existing value will be overwritten. But see the next recommendation for an exception.
3. As the life-cycle discussion above makes clear, after the terms are applied, that field is no longer actionable in the embargo system. Conversely, the 'lift date' field is not actionable until the application. Thus you may want to consider configuring both the 'terms' and 'lift date' to use the same metadata field. In this way, during workflow you would see only the terms, and after item installation, only the lift date. If you wish the metadata to retain the terms for any reason, use two distinct fields instead.
Detailed Operation
After the fields defined for terms and lift date have been assigned in dspace.cfg, and created and configured wherever they will be used, you can begin to embargo items simply by entering data (dates, if using the default setter) in the terms field. They will automatically be embargoed as they exit workflow. For the embargo to be lifted on any item, however, a new administrative procedure must be added: the 'embargo lifter' must be invoked on a regular basis. This task examines all embargoed items, and if their 'lift date' has passed, it removes the access restrictions on the item. Good practice dictates automating this procedure using cron jobs or the like, rather than running it manually. The lifter is available as a target of the 1.6 DSpace launcher: see Section 8.
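The decision the lifter makes for each item can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the DSpace lifter: the item identifiers are made up, a partial date ('2020' or '2020-12') is assumed to mean the first day of that period, and a date counts as "passed" only when it is strictly before today.

```python
from datetime import date


def parse_lift_date(text):
    """Parse 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD'; missing parts default to 1 (assumption)."""
    parts = [int(p) for p in text.strip().split("-")]
    if not 1 <= len(parts) <= 3:
        raise ValueError(f"unrecognized date: {text!r}")
    year, month, day = (parts + [1, 1])[:3]
    return date(year, month, day)


def items_to_lift(lift_dates, today):
    """Return the ids of items whose lift date lies strictly before `today`."""
    return sorted(item for item, text in lift_dates.items()
                  if parse_lift_date(text) < today)


# Hypothetical item ids mapped to the value of their lift-date field.
lift_dates = {"item-1": "2020", "item-2": "2020-12", "item-3": "2020-12-15"}
print(items_to_lift(lift_dates, date(2020, 12, 10)))
```

Running the check from a scheduler once a day is enough for this logic, since nothing changes between runs except the current date.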
controls which setter to use.
2. Lifter. The default lifter behavior as described above, essentially applying the collection policy rules to the item, might also not be sufficient for all purposes. It also can be replaced with another class:
# implementation of embargo lifter plugin--replace with local implementation if applicable
plugin.single.org.dspace.embargo.EmbargoLifter = org.dspace.embargo.DefaultEmbargoLifter
Note: if you want to require embargo terms for every item, put a phrase in the <required> element. Example: <required>You must enter an embargo date</required>
c. Configure Embargo. Edit [dspace]/config/dspace.cfg. Find the Embargo properties and set these two:
# DC metadata field to hold the user-supplied embargo terms
embargo.field.terms = dc.description.embargo
# DC metadata field to hold computed "lift date" of embargo
embargo.field.lift = dc.description.embargo
d. Restart DSpace application. This will pick up these changes. Now just enter future dates (if applicable) in web submission and the items will be placed under embargo. You can enter years ('2020'), years and months ('2020-12'), or also days ('2020-12-15').
e. Periodically run the lifter. Run the task: [dspace]/bin/dspace embargo-lifter
You will want to run this task in a cron-scheduled or other repeating way. Item embargoes will be lifted as their dates pass.
2. Period Sets. If you wish to use a fixed set of time periods (e.g. 90 days, 6 months and 1 year) as embargo terms, follow these steps, which involve using a custom 'setter'.
a. Select two metadata fields. Let's use 'dc.embargo.terms' and 'dc.embargo.lift'. These fields do not exist in the default DSpace metadata registry. Log in as an administrator, go to the metadata registry page, select the 'dc' schema, then add the metadata fields.
b. Expose the 'term' metadata field. The lift field will be assigned by the embargo system, so it should not be exposed directly. Edit [dspace]/config/input-forms.xml. If you have only one form (usually 'traditional') add it there. If you have multiple forms, add it only to the form(s) linked to collection(s) for which embargo applies. First, add the new field to the 'form definition':
<form name="traditional">
  <page number="1">
    ...
    <field>
      <dc-schema>dc</dc-schema>
      <dc-element>embargo</dc-element>
      <dc-qualifier>terms</dc-qualifier>
      <repeatable>false</repeatable>
      <label>Embargo Terms</label>
      <input-type value-pairs-name="embargo_terms">dropdown</input-type>
      <hint>If required, select embargo terms.</hint>
      <required></required>
    </field>
Note: If you want to require embargo terms for every item, put a phrase in the <required> element, e.g. <required>You must select embargo terms</required>
Observe that we have referenced a new value-pair list: "embargo_terms". We must now define that as well (only once, even if referenced by multiple forms):
Note: if desired, you could localize the language of the displayed value.
c. Configure Embargo. Edit [dspace]/config/dspace.cfg. Find the Embargo properties and set the following properties:
# DC metadata field to hold the user-supplied embargo terms
embargo.field.terms = dc.embargo.terms
# DC metadata field to hold computed "lift date" of embargo
embargo.field.lift = dc.embargo.lift
# implementation of embargo setter plugin - replace with local implementation if applicable
plugin.single.org.dspace.embargo.EmbargoSetter = org.dspace.embargo.DayTableEmbargoSetter
d. This step is the same as Step A.4 above, except that instead of entering a date, the submitter will select a value from a drop-down list.
e. Periodically run the lifter. Run the task: [dspace]/bin/dspace embargo-lifter. You will want to run this task in a cron-scheduled or other repeating way. Item embargoes will be lifted as their dates pass.
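The idea behind a period-based setter is that each selectable term corresponds to a number of days, and the lift date is the start date plus that number. The Python sketch below illustrates only this arithmetic. The PERIODS table, its labels and day counts, and the function name lift_date_for are all hypothetical and are not taken from dspace.cfg or from DayTableEmbargoSetter.

```python
from datetime import date, timedelta

# Hypothetical table of term labels to day counts; not taken from dspace.cfg.
PERIODS = {"90 days": 90, "6 months": 180, "1 year": 365}


def lift_date_for(term, start):
    """Return `start` plus the number of days configured for the selected term."""
    try:
        days = PERIODS[term]
    except KeyError:
        raise ValueError(f"unknown embargo term: {term!r}") from None
    return start + timedelta(days=days)


print(lift_date_for("90 days", date(2020, 1, 1)))
```

Because the periods are plain day counts, "6 months" in this sketch always means 180 days regardless of the calendar months involved.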
plugin.single.org.dspace.checker.BitstreamDispatc
plugin.single.org.dspace.checker.BitstreamDispatc = org.dspace.checker.SimpleDispatcher
Informational Note: The default dispatcher, used in case none is specified.
Property: checker.retention.default
Example Value: checker.retention.default = 10y
Informational Note: This option specifies the default time frame after which all checksum checks are removed from the database (defaults to 10 years). This means that after 10 years, all successful or unsuccessful matches are removed from the database.
Property: checker.retention.CHECKSUM_MATCH
Example Value: checker.retention.CHECKSUM_MATCH = 8w
Informational Note: This option specifies the time frame after which a successful match will be removed from your DSpace database (defaults to 8 weeks). This means that after 8 weeks, all successful matches are automatically deleted from your database (in order to keep that database table from growing too large).
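To make the retention values concrete, here is a small Python sketch that turns a value such as 8w or 10y into a duration and tests whether a check result is old enough to be pruned. It is an illustration only: it supports just the two suffixes that appear in the examples above, treats a year as 365 days, and the function names are hypothetical.

```python
import re
from datetime import date, timedelta

# Only the suffixes seen in the examples; a year is taken as 365 days (assumption).
UNIT_DAYS = {"w": 7, "y": 365}


def parse_retention(value):
    """Convert a value such as '8w' or '10y' into a timedelta."""
    match = re.fullmatch(r"(\d+)([wy])", value.strip())
    if match is None:
        raise ValueError(f"unsupported retention value: {value!r}")
    count, unit = int(match.group(1)), match.group(2)
    return timedelta(days=count * UNIT_DAYS[unit])


def is_expired(checked_on, now, retention):
    """True once a check result is older than the retention period."""
    return now - checked_on > parse_retention(retention)


print(parse_retention("8w"))
print(is_expired(date(2020, 1, 1), date(2020, 3, 1), "8w"))
```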
Property: org.dspace.app.itemexport.work.dir
Example Value: org.dspace.app.itemexport.work.dir = ${dspace.dir}/exports
Informational Note: The directory where the exports will be done and compressed.
Property: org.dspace.app.itemexport.download.dir
Example Value: org.dspace.app.itemexport.download.dir = ${dspace.dir}/exports/download
Informational Note: The directory where the compressed files will reside and be read by the downloader.
Property: org.dspace.app.itemexport.life.span.hours
Example Value: org.dspace.app.itemexport.life.span.hours = 48
Informational Note: The length of time in hours each archive should live for. When new archives are created this entry is used to delete old ones.
Property: org.dspace.app.itemexport.max.size
Example Value: org.dspace.app.itemexport.max.size = 200
Informational Note: The maximum size in megabytes (MB) that the export should be. This is enforced before the compression. The sizes of all bitstreams in all items being exported are added up; if the cumulative size is more than this entry, the export is not started.
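The size check described for org.dspace.app.itemexport.max.size amounts to summing the uncompressed bitstream sizes and comparing the total with the limit. A minimal Python sketch of that check follows. It assumes sizes are given in bytes and that a megabyte means 1024 x 1024 bytes; export_allowed is a hypothetical name.

```python
MEGABYTE = 1024 * 1024  # assumption: the limit counts binary megabytes


def export_allowed(bitstream_sizes, max_size_mb):
    """Return True if the summed, uncompressed bitstream sizes fit the limit.

    `bitstream_sizes` holds one list of byte counts per item being exported.
    """
    total_bytes = sum(size for item in bitstream_sizes for size in item)
    return total_bytes <= max_size_mb * MEGABYTE


# Two items: one with two bitstreams (50 MB, 100 MB), one with a single 40 MB file.
items = [[50 * MEGABYTE, 100 * MEGABYTE], [40 * MEGABYTE]]
print(export_allowed(items, 200))
```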
Property: bulkedit.gui-item-limit
Example Value: bulkedit.gui-item-limit = 20
Informational Note: When using the WEBUI, this sets the limit on the number of items allowed to be edited in one processing. There is no limit when using the CLI.
Informational Note: Metadata elements to exclude when exporting via the user interfaces, or when using the command line version and not using the -a (all) option.
default is true. If set to "false", then the submitter (human being) has the option to skip the uploading of a file.
Example Value: webui.browse.thumbnail.maxwidth = 80
Informational Note: This determines the maximum width of the browse/search thumbnails in pixels (px). This only needs to be set if the thumbnails are required to be smaller than the dimensions of thumbnails generated by MediaFilter.
Property: webui.item.thumbnail.show
Example Value: webui.item.thumbnail.show = true
Informational Note: This determines whether or not to display the thumbnail against each bitstream. (This configuration property key is not used by XMLUI. To show thumbnails using XMLUI, you need to create a theme which displays them).
Property: webui.browse.thumbnail.linkbehavior
Example Value: webui.browse.thumbnail.linkbehavior = item
Informational Note: This determines where clicks on the thumbnail in browse and search screens should lead. The only values currently supported are "item" or "bitstream", which will either take the user to the item page, or directly download the bitstream.
Property: thumbnail.maxwidth
Example Value: thumbnail.maxwidth = 80
Informational Note: This property sets the maximum width of generated thumbnails that are being displayed on item pages.
Property: thumbnail.maxheight
Example Value: thumbnail.maxheight = 80
Informational Note: This property sets the maximum height of generated thumbnails that are being displayed on item pages.
Property: webui.preview.enabled
Example Value: webui.preview.enabled = false
Informational Note: Whether or not the user can "preview" the image.
Property: webui.preview.maxwidth
Example Value: webui.preview.maxwidth = 600
Informational Note: This property sets the maximum width for the preview image.
Property: webui.preview.maxheight
Example Value: webui.preview.maxheight = 600
Informational Note: This property sets the maximum height for the preview image.
Property: webui.preview.brand
Example Value: webui.preview.brand = My Institution Name
Informational Note: This is the brand text that will appear with the image.
Property: webui.preview.brand.abbrev
Example Value: webui.preview.brand.abbrev = MyOrg
Informational Note: An abbreviated form of the full Branded Name. This will be used when the preview image cannot fit the normal text.
Property: webui.preview.brand.height
Example Value: webui.preview.brand.height = 20
Informational Note: The height (in px) of the brand.
Property: webui.preview.brand.font
Example Value: webui.preview.brand.font = SansSerif
Informational Note: This property sets the font for your Brand text that appears with the image.
Property: webui.preview.brand.fontpoint
Example Value: webui.preview.brand.fontpoint = 12
Informational Note: This property sets the font point (size) for your Brand text that appears with the image.
Property: webui.preview.dc
Example Value: webui.preview.dc = rights
Informational Note: The Dublin Core field that will display along with the preview. This field is optional.
Property: webui.strengths.show
Example Value: webui.strengths.show = false
Informational Note: Determines if communities and collections should display item counts when listed. The default behavior, if omitted, is true. (This configuration property key is not used by XMLUI. To show thumbnails using XMLUI, you need to create a theme which displays them).
Property: webui.strengths.cache
Example Value: webui.strengths.cache = false
Informational Note: When showing the strengths, should they be counted in real time, or fetched from the cache? Counts fetched in real time will perform an actual count of the database contents every time a page with this feature is requested, which will not scale. If the property key is set to cache ("true"), you must run the following command periodically to update the count: /[dspace]/bin/dspace itemcounter. The default is to count in real time (set to "false").
broken into several parts: defining the indexes, defining the fields upon which users can sort results, defining truncation for potentially long fields (e.g. authors), setting cross-links between different browse contexts (e.g. from an author's name to a complete list of their items), how many recent submissions to display, and configuration for item mapping browse.
Property: webui.browse.index.<n>
Example Value: webui.browse.index.1 = dateissued:metadata:dc.date.issued:date:full
Informational Note: This is an example of how one "Defines the Indexes". See Defining the Indexes in the next sub-section.
Property: webui.itemlist.sort-option.<n>
Example Value: webui.itemlist.sort-option.1 = title:dc.title:title
Informational Note: This is an example of how one "Defines the Sort Options". See Defining Sort Options in the following sub-section.
The format of each entry is webui.browse.index.<n> = <index name>:<metadata>:<schema prefix>.<element>.<qualifier>:<data-type field>:<sort option>. Please note that the punctuation is essential when typing this property key in the dspace.cfg file. The following table explains each element:
Element: webui.browse.index.<n>
Definition and Options (if available): n is the index number. The index numbers must start from 1 and increment continuously by 1 thereafter. Deviation from this will cause an error during install or a configuration update. So anytime you add a new browse index, remember to increase the number. (Commented out index numbers may be used over again).
Element: <index name>
Definition and Options (if available): The name by which the index will be identified. You will need to update your Messages.properties file to match this field. (The form used in the Messages.properties file is: browse.type.metadata.<index name>.)
Element: <metadata>
Definition and Options (if available): Only two options are available: "metadata" or "item".
Element: <schema prefix>
Definition and Options (if available): The schema used for the field to be indexed. The default is dc (for Dublin Core).
Element: <element>
Definition and Options (if available): The schema element. In Dublin Core, for example, the author element is referred to as "Contributor". The user should consult the default Dublin Core Metadata Registry table in Appendix A.
Element: <qualifier>
Definition and Options (if available): This is the qualifier to the <element> component. The user has two choices: an asterisk "*" or a proper qualifier of the element. The asterisk is a wildcard and causes DSpace to index all types of the schema element. For example, if you have the element "contributor" and the qualifier "*" then you would index all contributor data regardless of the qualifier. As another example, the element "subject" with the qualifier "lcsh" would cause the indexing of only those fields that have the qualifier "lcsh". (This means you would only index Library of Congress Subject Headings and not all data elements that are subjects.)
Element: <datatype field>
Definition and Options (if available): This refers to the datatype of the field:
date: the index type will be treated as a date object
title: the index type will be treated like a title, which will include a link to the item page
text: the index type will be treated as plain text. If single mode is specified then this will link to the full mode list
Element: <index display>
Definition and Options (if available): Choose full or single. This refers to the way that the index will be displayed in the browse listing. "Full" will be the full item list as specified by webui.itemlist.columns; "single" will be a single list of only the indexed term.
If you are customizing this list beyond the default, you will need to insert the text you wish to appear in the navigation and on links and buttons. You need to edit the Messages.properties file. The form of the parameter(s) in the file: browse.type.<index name>
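Because both the colon-separated layout and the numbering rule are easy to get wrong, a quick pre-flight check can be useful. The Python sketch below parses webui.browse.index.<n> entries and verifies that the numbers run 1, 2, 3 with no gaps. It is not part of DSpace: the function and field names are hypothetical, and it covers only the five-part form given in the format line above.

```python
import re

INDEX_KEY = re.compile(r"^webui\.browse\.index\.(\d+)$")
# Hypothetical labels for the five colon-separated parts of the value.
FIELDS = ("name", "source", "field", "datatype", "display")


def parse_browse_indexes(props):
    """Parse webui.browse.index.<n> entries and check the numbering is 1, 2, 3, ..."""
    numbered = {}
    for key, value in props.items():
        match = INDEX_KEY.match(key)
        if match:
            parts = [part.strip() for part in value.split(":")]
            if len(parts) != len(FIELDS):
                raise ValueError(f"{key}: expected {len(FIELDS)} colon-separated parts")
            numbered[int(match.group(1))] = dict(zip(FIELDS, parts))
    if sorted(numbered) != list(range(1, len(numbered) + 1)):
        raise ValueError("index numbers must start at 1 and increase by 1")
    return [numbered[n] for n in sorted(numbered)]


props = {"webui.browse.index.1": "dateissued:metadata:dc.date.issued:date:full"}
print(parse_browse_indexes(props))
```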
The format of each entry is webui.itemlist.sort-option.<n> = <option name>:<schema prefix>.<element>.<qualifier>:<datatype>. Please notice the punctuation used between the different elements. The following table explains each element:
Element: webui.itemlist.sort-option.<n>
Definition and Options (if available): n is an arbitrary number you choose.
Element: <option name>
Definition and Options (if available): The name by which the sort option will be identified. This may be used in later configuration or to locate the message key (found in the Messages.properties file) for this index.
Element: <schema prefix>
Definition and Options (if available): The schema used for the field to be indexed. The default is dc (for Dublin Core).
Element: <element>
Definition and Options (if available): The schema element. In Dublin Core, for example, the author element is referred to as "Contributor". The user should consult the default Dublin Core Metadata Registry table in Appendix A.
Element: <qualifier>
Definition and Options (if available): This is the qualifier to the <element> component. The user has two choices: an asterisk "*" or a proper qualifier of the element.
Element: <datatype field>
Definition and Options (if available): This refers to the datatype of the field:
date: the sort type will be treated as a date object
text: the sort type will be treated as plain text.
At the present time, you would need to edit your metadata to clean up the index presentation.
data. Some database implementations (e.g. Oracle) will enforce their own limit on this field size. Reducing the field size will decrease the potential size of your database and increase the speed of the browse, but it will also increase the chance of mis-ordering of similar fields. The values are commented out, but they are the proposed values for reasonable performance versus result quality. This affects the size of field for the browse value (this will affect display, and value sorting).
Property: webui.browse.sort_columns.max
Example Value: webui.browse.sort_columns.max = 200
Informational Note: Size of field for hidden sort columns (this will affect only sorting, not display). Commented out as default.
Property: webui.browse.value_columns.omission_mark
Example Value: webui.browse.value_columns.omission_mark = ...
Informational Note: Omission mark to be placed after truncated strings in display. The default is "...".
Property: plugin.named.org.dspace.sort.OrderFormatDelegate
Example Value:
plugin.named.org.dspace.sort.OrderFormatDelegate = \
    org.dspace.sort.OrderFormatTitleMarc21=title
Informational Note: This sets the option for how the indexes are sorted. All sort normalizations are carried out by the OrderFormatDelegate. The plugin manager can be used to specify your own delegates for each datatype. The default datatypes (and delegates) are:
author = org.dspace.sort.OrderFormatAuthor
title = org.dspace.sort.OrderFormatTitle
text = org.dspace.sort.OrderFormatText
If you redefine a default datatype here, the configuration will be used in preference to the default. However, if you do not explicitly redefine a datatype, then the default will still be used in addition to the datatypes you do specify. As of DSpace release 1.5.2, the multi-lingual MARC21 title ordering is configured as default, as shown in the example above. To use the previous title ordering (before release 1.5.2), comment out the configuration in your dspace.cfg file.
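The override rule can be pictured as laying the configured delegates on top of the defaults. The short Python sketch below shows that merge for a "class=datatype" value like the example above. It is an illustration of the rule as described, not the plugin manager's code, and effective_delegates is a hypothetical name.

```python
DEFAULT_DELEGATES = {
    "author": "org.dspace.sort.OrderFormatAuthor",
    "title": "org.dspace.sort.OrderFormatTitle",
    "text": "org.dspace.sort.OrderFormatText",
}


def effective_delegates(configured):
    """Parse comma-separated 'class=datatype' entries and overlay them on the defaults."""
    delegates = dict(DEFAULT_DELEGATES)
    for entry in configured.split(","):
        if entry.strip():
            class_name, _, datatype = entry.partition("=")
            delegates[datatype.strip()] = class_name.strip()
    return delegates


print(effective_delegates("org.dspace.sort.OrderFormatTitleMarc21=title"))
```

With the example value, only the title datatype changes; author and text keep their default delegates, and an empty value leaves all three defaults in place.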
webui.browse.index.5 = lcAuthor:metadataAuthority:dc.contributor.author:
Replace dc.contributor.* with another field if appropriate. The field should be listed in the configuration for webui.itemlist.columns, otherwise you will not see its effect. It must also be defined in webui.itemlist.columns as being of the datatype text, otherwise the functionality will be overridden by the specific data type feature. (This setting is not used by the XMLUI as it is controlled by your theme).
Now that we know which field is our author or other multiple metadata value field, we can provide the option to truncate the number of values displayed by default. We replace the remaining list of values with "et al" or the language pack specific alternative. Note that this is just for the default, and users will have the option of changing the number displayed when they browse the results. See the following table:
Property: webui.browse.author-limit
Example Value: webui.browse.author-limit = <n>
Informational Note: Where <n> is an integer number of values to be displayed. Use -1 for unlimited (the default value).
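The truncation behaviour can be sketched in Python as follows. This is a minimal illustration of the rule described above, with a hypothetical function name; the real marker text comes from the language pack.

```python
def truncate_values(values, limit, marker="et al"):
    """Show at most `limit` values; -1 means no limit. The marker replaces the rest."""
    if limit < 0 or len(values) <= limit:
        return list(values)
    return list(values[:limit]) + [marker]


authors = ["Smith, A.", "Jones, B.", "Lee, C.", "Patel, D."]
print(truncate_values(authors, 2))
print(truncate_values(authors, -1))
```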
The format of the property key is webui.browse.link.<n> = <index name>:<display column metadata>. Please notice the punctuation used between the elements.
Element: webui.browse.link.<n>
Definition and Options (if available): <n> is an arbitrary number you choose.
Element: <index name>
Definition and Options (if available): This needs to match your entry for the index name from the webui.browse.index property key.
Element: <display column metadata>
Definition and Options (if available): Use the DC element (and qualifier).
Examples of some browse links used in a real DSpace installation instance:
webui.browse.link.1 = author:dc.contributor.*
Creates a link for all types of contributors (authors, editors, illustrators, others, etc.)
webui.browse.link.2 = subject:dc.subject.lcsh
Creates a link to subjects that are Library of Congress only. In this case, you have a browse index that contains only LC Subject Headings.
webui.browse.link.3 = series:dc.relation.ispartofseries
Creates a link for the browse index "Series". Please note this is, again, a customized browse index and not part of the DSpace distributed release.
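To see which browse index a displayed field would link to, the link entries can be matched against a field name. The Python sketch below does this for the three examples above. It is a simplified model: the helper names are hypothetical, and it assumes that a "*" qualifier also matches the unqualified element.

```python
def parse_link(value):
    """Split '<index name>:<display column metadata>' into its two parts."""
    index_name, _, field = value.partition(":")
    return index_name.strip(), field.strip()


def field_matches(pattern, field):
    """Match schema.element[.qualifier] against a pattern whose qualifier may be '*'."""
    p, f = pattern.split("."), field.split(".")
    if p[-1] == "*":
        return f[:len(p) - 1] == p[:-1]
    return p == f


def index_for_field(links, field):
    """Return the browse index name linked to a displayed field, or None."""
    for value in links:
        index_name, pattern = parse_link(value)
        if field_matches(pattern, field):
            return index_name
    return None


LINKS = ["author:dc.contributor.*", "subject:dc.subject.lcsh",
         "series:dc.relation.ispartofseries"]
print(index_for_field(LINKS, "dc.contributor.editor"))
```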
The processors that the PluginManager loads to perform the recent submissions query on the relevant pages must also be set up. This is already configured in the default dspace.cfg, so there should be no need for the administrator/programmer to worry about this.
plugin.sequence.org.dspace.plugin.CommunityHomeProcessor = \
    org.dspace.app.webui.components.RecentCommunitySubmissions
plugin.sequence.org.dspace.plugin.CollectionHomeProcessor = \
    org.dspace.app.webui.components.RecentCollectionSubmissions
Informational Note: It is possible to include contextual information in the submission license using substitution variables. The text substitution is driven by a plugin implementation.
server URLs are used (e.g. http://myserver.myorg/handle/123456789/1).
Property: webui.feed.item.title
Example Value: webui.feed.item.title = dc.title
Informational Note: This property customizes each single-value field displayed in the feed information for each item. Each of the fields takes a single metadata field. The form of the key is <scheme prefix>.<element>.<qualifier>. In place of the qualifier, one may leave it blank to exclude any qualifiers or use the wildcard "*" to include all qualifiers for a particular element.
Property: webui.feed.item.date
Example Value: webui.feed.item.date = dc.date.issued
Informational Note: This property customizes each single-value field displayed in the feed information for each item. Each of the fields takes a single metadata field. The form of the key is <scheme prefix>.<element>.<qualifier>. In place of the qualifier, one may leave it blank to exclude any qualifiers or use the wildcard "*" to include all qualifiers for a particular element.
Property: webui.feed.item.description
Example Value:
webui.feed.item.description = dc.title, dc.contributor.author, \
    dc.contributor.editor, dc.description.abstract, \
    dc.description
Informational Note: One can customize the metadata fields to show in the feed for each item's description. Elements are displayed in the order they are specified in dspace.cfg. Like other property keys, the format of this property key is: webui.feed.item.description = <scheme prefix>.<element>.<qualifier>. In place of the qualifier, one may leave it blank to exclude any qualifiers or use the wildcard "*" to include all qualifiers for a particular element.
Property: webui.feed.item.author
Example Value: webui.feed.item.author = dc.contributor.author
Informational Note: The name of the field to use for authors (Atom only); repeatable.
Property: webui.feed.logo.url
Example Value: webui.feed.logo.url = ${dspace.url}/themes/mysite/images/mysite-logo.png
Informational Note: Customize the image icon included with the site-wide feeds. This must be an absolute URL.
Property: webui.feed.item.dc.creator
Example Value: webui.feed.item.dc.creator = dc.contributor.author
Informational Note: This optional property adds structured DC elements as XML elements to the feed description. They are not the same thing as, for example, webui.feed.item.description. Useful when a program or stylesheet will be transforming a feed and wants separate author, description, date, etc.
Property: webui.feed.item.dc.date
Example Value: webui.feed.item.dc.date = dc.date.issued
Informational Note: This optional property adds structured DC elements as XML elements to the feed description. They are not the same thing as, for example, webui.feed.item.description. Useful when a program or stylesheet will be transforming a feed and wants separate author, description, date, etc.
Property: webui.feed.item.dc.description
Example Value: webui.feed.item.dc.description = dc.description.abstract
Informational Note: This optional property adds structured DC elements as XML elements to the feed description. They are not the same thing as, for example, webui.feed.item.description. Useful when a program or stylesheet will be transforming a feed and wants separate author, description, date, etc.
Property: websvc.opensearch.svccontext
Example Value: websvc.opensearch.svccontext = opensearch/
Informational Note: Context for RSS/Atom request URLs. Change only for non-standard servlet mapping.
Property: websvc.opensearch.autolink
Example Value: websvc.opensearch.autolink = true
Informational Note: Present autodiscovery link in every page head.
Property: websvc.opensearch.validity
Example Value: websvc.opensearch.validity = 48
Informational Note: Number of hours to retain results before recalculating. This applies to the Manakin interface only.
Property: websvc.opensearch.shortname
Example Value: websvc.opensearch.shortname = DSpace
Informational Note: A short name used in browsers for the search service. It should be sixteen (16) or fewer characters.
Property: websvc.opensearch.longname
Example Value: websvc.opensearch.longname = ${dspace.name}
Informational Note: A longer name up to 48 characters.
Property: websvc.opensearch.description
Example Value: websvc.opensearch.description = ${dspace.name} DSpace repository
Informational Note: Brief service description.
Property: websvc.opensearch.faviconurl
Example Value: websvc.opensearch.faviconurl = http://www.dspace.org/images/favicon.ico
Informational Note: Location of favicon for service, if any. It must be 16 x 16 pixels. You can provide your own local favicon instead of the default.
Property: websvc.opensearch.samplequery
Example Value: websvc.opensearch.samplequery = photosynthesis
Informational Note: Sample query. This should return results. You can replace the sample query with search terms that should actually yield results in your repository.
Property: websvc.opensearch.tags
Example Value: websvc.opensearch.tags = IR DSpace
Informational Note: Tags used to describe the search service.
Property: websvc.opensearch.formats
Example Value: websvc.opensearch.formats = html,atom,rss
Informational Note: Result formats offered. Use one or more, comma-separated, from the list: html, atom, rss. Please note that html is required for autodiscovery in browsers to function, and must be the first in the list if present.
file with the same name? For example, if one receives a request for "foo/bar/index.html" and one has a bitstream called just "index.html", DSpace will serve up the former bitstream (foo/bar/index.html) for the request if webui.html.max-depth-guess is 2 or greater. If webui.html.max-depth-guess is 1 or less, then DSpace would not serve that bitstream, as the depth of the file is greater. If webui.html.max-depth-guess is zero, the request filename and path must always exactly match the bitstream name. The default is set to 3.
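The depth rule can be illustrated with a short sketch. This is not DSpace's servlet code: the function name `guess_bitstream` and the strategy of dropping one leading directory per step are assumptions chosen to reproduce the behaviour described above.

```python
def guess_bitstream(request_path, bitstream_names, max_depth_guess=3):
    """Return the bitstream name that would be served, or None.

    Hypothetical helper: tries the exact path first, then drops up to
    max_depth_guess leading directories from the request path.
    """
    parts = request_path.strip("/").split("/")
    # Never strip the file name itself; a negative setting behaves like 0 here.
    max_strip = max(0, min(max_depth_guess, len(parts) - 1))
    for strip in range(max_strip + 1):
        candidate = "/".join(parts[strip:])
        if candidate in bitstream_names:
            return candidate
    return None


names = {"index.html"}
print(guess_bitstream("foo/bar/index.html", names, max_depth_guess=2))
print(guess_bitstream("foo/bar/index.html", names, max_depth_guess=1))
```

With a depth of 2 the request resolves to the bitstream named index.html; with a depth of 1 nothing matches, and with a depth of 0 only an exact name match is accepted.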
Property: plugin.named.org.dspace.content.authority.ChoiceAuthority
Example Value:
plugin.named.org.dspace.content.authority.ChoiceAuthority = \
    org.dspace.content.authority.SampleAuthority = Sample, \
    org.dspace.content.authority.LCNameAuthority = LCNameAuthority, \
Property: plugin.selfnamed.org.dspace.content.authority.ChoiceAuthority
Example Value:
plugin.selfnamed.org.dspace.content.authority.ChoiceAuthority = \
    org.dspace.content.authority.DCInputAuthority
Property: lcname.url
Example Value: lcname.url = http://alcme.oclc.org/srw/search/lcnaf
Informational Note: Location (URL) of the Library of Congress Name Service.

Property: sherpa.romeo.url
Example Value: sherpa.romeo.url = http://www.sherpa.ac.uk/romeo/api24.php
Informational Note: Location (URL) of the SHERPA/RoMEO authority plugin.

Property: authority.minconfidence
Example Value: authority.minconfidence = ambiguous
Informational Note: This sets the default lowest confidence level at which a metadata value is included in an authority-controlled browse (and search) index. It is a symbolic keyword, one of the following values (listed in descending order): accepted, uncertain, ambiguous, notfound, failed, rejected, novalue, unset. See the org.dspace.content.authority.Choices source for descriptions.

Property: xmlui.lookup.select.size
Example Value: xmlui.lookup.select.size = 12
Informational Note: This property sets the number of selectable choices in the Choices lookup popup.
Example Value: upload.max = 536870912
Informational Note: Maximum size of uploaded files in bytes. A negative setting will result in no limit being set. The default is set for 512 MB.
Informational Note: This is used to customize the DC metadata fields that appear in the item display (the brief display) when pulling up a record. The format is <schema>.<element>.<optional qualifier>. In place of the qualifier, one can use the wildcard "*" to include all fields of the same element, or leave it blank for unqualified elements. Two additional options are available for behavior/rendering: (date) and (link). See the following examples:

dc.title = Dublin Core element 'title' (unqualified)
dc.title.alternative = DC element 'title', qualifier 'alternative'
dc.title.* = All fields with Dublin Core element 'title' (any or no qualifier)
dc.identifier.uri(link) = DC identifier.uri, rendered as a link
dc.date.issued(date) = DC date.issued, rendered as a date
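To make the field syntax concrete, here is a minimal sketch of a parser for one entry. The regular expression and the function name `parse_display_field` are illustrative assumptions and are not taken from the DSpace source.

```python
import re

# schema.element[.qualifier|.*][(style)], e.g. dc.identifier.uri(link)
FIELD_RE = re.compile(
    r"^(?P<schema>\w+)\.(?P<element>\w+)"
    r"(?:\.(?P<qualifier>\w+|\*))?"
    r"(?:\((?P<style>\w+)\))?$"
)


def parse_display_field(spec):
    """Split one display-field entry into its parts (hypothetical helper)."""
    match = FIELD_RE.match(spec.strip())
    if match is None:
        raise ValueError("not a valid field specification: %r" % spec)
    return match.groupdict()


for spec in ["dc.title", "dc.title.*", "dc.identifier.uri(link)"]:
    print(spec, "->", parse_display_field(spec))
```

A missing qualifier or rendering option is reported as None, and the wildcard is kept as the literal "*".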
The Messages.properties file controls how the fields defined above will display to the user. If a field is missing from the Messages.properties file, it will not be displayed. Look in Messages.properties under metadata.dc.<field>. Example:

metadata.dc.contributor.other = Authors
metadata.dc.contributor.author = Authors
metadata.dc.title.* = Title

Please note: the order in which you place the values in the property key controls the order in which they will display to the user. (See the Example Value above.)

Property: webui.resolver.1.urn
webui.resolver.1.baseurl
webui.resolver.2.urn
webui.resolver.2.baseurl
Example Value:
webui.resolver.1.urn = doi
webui.resolver.1.baseurl = http://dx.doi.org/
webui.resolver.2.urn = hdl
webui.resolver.2.baseurl = http://hdl.handle.net/
Informational Note: When using "resolver" in webui.itemdisplay to render identifiers as resolvable links, the base URL is taken from webui.resolver.<n>.baseurl, where webui.resolver.<n>.urn matches the urn specified in the metadata value. The value is appended to the baseurl as is, so the baseurl needs to end with a forward slash in almost every case. If no urn is specified in the value, it will be displayed as simple text. For the doi and hdl urns, default values are provided: http://dx.doi.org and http://hdl.handle.net, respectively. If a metadata value with style "doi", "handle" or "resolver" already matches a URL, it is simply rendered as a link with no other manipulation.
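The lookup can be sketched as follows. The "urn:identifier" value syntax, the `RESOLVERS` table and the function name are assumptions made for illustration; the JSPUI tag code may parse values differently.

```python
# Mirrors the example webui.resolver.<n>.urn / .baseurl pairs above.
RESOLVERS = {
    "doi": "http://dx.doi.org/",
    "hdl": "http://hdl.handle.net/",
}


def resolve_identifier(value, resolvers=RESOLVERS):
    """Return a link for a metadata value, or None to show it as plain text.

    Assumes (hypothetically) that the urn is the part before the first colon.
    """
    if value.startswith(("http://", "https://")):
        return value  # already a URL: rendered as a link unchanged
    urn, sep, rest = value.partition(":")
    base = resolvers.get(urn) if sep else None
    if base is None:
        return None
    # The value is appended as is, so the base URL must end with "/".
    return base + rest


print(resolve_identifier("doi:10.1000/182"))
print(resolve_identifier("no urn here"))
```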
Property: plugin.single.org.dspace.app.webui.util.StyleSelection
Example Value:
plugin.single.org.dspace.app.webui.util.StyleSelection = \
    org.dspace.app.webui.util.CollectionStyleSelection
    #org.dspace.app.webui.util.MetadataStyleSelection
Informational Note: Specify which strategy to use to select the style for an item.

Property: webui.itemdisplay.thesis.collections
Example Value: webui.itemdisplay.thesis.collections = 123456789/24, 123456789/35
Informational Note: Specify which collections use which views by Handle.

Property: webui.itemdisplay.metadata-style
Example Value:
webui.itemdisplay.metadata-style = schema.element[.qualifier|.*]
webui.itemdisplay.metadata-style = dc.type
Informational Note: Customize the DC fields to use in the item listing page. Elements will be displayed left to right in the order they are specified here. The form is <schema prefix>.<element>[.<qualifier> | .*][(date)], ... Although not a requirement, it makes sense to include among the listed fields at least the date and title fields as specified by the webui.browse.index.* configuration options (see the next section). If you have enabled thumbnails (webui.browse.thumbnail.show), you must also include a 'thumbnail' entry in your columns; this is where the thumbnail will be displayed.

Property: webui.itemlist.widths
Example Value: webui.itemlist.widths = *, 130, 60%, 40%
Informational Note: You can customize the width of each column with this setting; values can be numbers (pixels) or percentages. For the 'thumbnail' column, a setting of '*' will use the max width specified for browse thumbnails (cf. webui.browse.thumbnail.maxwidth, thumbnail.maxwidth).
Property: webui.itemlist.browse.<index name>.sort.<sort name>.columns
webui.itemlist.sort.<sort name>.columns
webui.itemlist.browse.<browse name>.columns
webui.itemlist.<sort or index name>.columns
Informational Note: You can override the DC fields used on the listing page for a given browse index and/or sort option. As a sort option or index may be defined on a field that isn't normally included in the list, this allows you to display the fields that have been indexed/sorted on. There are a number of forms the configuration can take, and the order in which they are listed here is the priority in which they will be used (so a combination of an index name and sort name will take precedence over just the browse name). In the last case, a sort option name will always take precedence over a browse index name. Note also that for any additional columns you list, you will need to ensure there is an itemlist.<field name> entry in the messages file.

Property: webui.itemlist.dateaccessioned.columns
Example Value: webui.itemlist.dateaccessioned.columns = thumbnail, dc.date.accessioned(date), dc.title, dc.contributor.*
Informational Note: This would display the date of accession in place of the issue date whenever the dateaccessioned browse index or sort option is selected. Just like webui.itemlist.columns, you will need to include a 'thumbnail' entry to display the thumbnails in the item list.

Property: webui.itemlist.dateaccessioned.widths
Example Value: webui.itemlist.dateaccessioned.widths = *, 130, 60%, 40%
Informational Note: As with the aforementioned property key, you can customize the width of the columns for each configured column list by substituting '.widths' for '.columns' in the property name. See the setting for webui.itemlist.widths for more information.

Property: webui.itemlist.tablewidth
Example Value: webui.itemlist.tablewidth = 100%
Informational Note: You can also set the overall size of the item list table with this setting. It can lead to faster table rendering when used with the column widths above, but it is not generally recommended.

Property: webui.session.invalidate
Example Value: webui.session.invalidate = true
Informational Note: Enable or disable session invalidation upon login or logout. This feature is enabled by default to help prevent session hijacking but may cause problems for Shibboleth, etc. If omitted, the default value is 'true'. [Only used for JSPUI authentication].
Informational Note: The default language for the application is set with this property key. This is a locale according to i18n and might consist of country, country_language or country_language_variant. If no default locale is defined, then the server default locale will be used. The format of a locale specifier is described here: http://java.sun.com/j2se/1.4.2/docs/api/java/util/Locale.html
[dspace-source]/dspace/config/input-forms_LOCALE.xml
[dspace-source]/dspace/config/default_LOCALE.license (should be pure ASCII)
[dspace-source]/dspace/config/news-top_LOCALE.html
[dspace-source]/dspace/config/news-side_LOCALE.html
[dspace-source]/dspace/config/emails/change_password_LOCALE
[dspace-source]/dspace/config/emails/feedback_LOCALE
[dspace-source]/dspace/config/emails/internal_error_LOCALE
[dspace-source]/dspace/config/emails/register_LOCALE
[dspace-source]/dspace/config/emails/submit_archive_LOCALE
[dspace-source]/dspace/config/emails/submit_reject_LOCALE
[dspace-source]/dspace/config/emails/submit_task_LOCALE
[dspace-source]/dspace/config/emails/subscription_LOCALE
[dspace-source]/dspace/config/emails/suggest_LOCALE
[dspace]/webapps/jspui/help/collection-admin_LOCALE.html (in the HTML, keep the jump links as in the original; must be copied to [dspace-source]/dspace/modules/jspui/src/main/webapp/help)
[dspace]/webapps/jspui/help/index_LOCALE.html (must be copied to [dspace-source]/dspace/modules/jspui/src/main/webapp/help)
[dspace]/webapps/jspui/help/site-admin_LOCALE.html (must be copied to [dspace-source]/dspace/modules/jspui/src/main/webapp/help)
All the parameter mappings are defined in the [dspace]/config/sfx.xml file. The program will check the parameters in sfx.xml and retrieve the corresponding metadata of the item. It will then pass the string to your resolver. In the following example, the program will look at the first query-pair, which is the DOI of the item. If there is a DOI for that item, your retrieval results will be, for example: http://researchspace.auckland.ac.nz/handle/2292/5763

Example. For setting DOI in sfx.xml

<query-pairs>
  <field>
    <querystring>rft_id=info:doi/</querystring>
    <dc-schema>dc</dc-schema>
    <dc-element>identifier</dc-element>
    <dc-qualifier>doi</dc-qualifier>
  </field>
</query-pairs>

If there is no DOI for that item, it will try the next query-pair in [dspace]/config/sfx.xml, and so on. Example of using ISSN, volume, issue for an item without a DOI: [http://researchspace.auckland.ac.nz/handle/2292/4947]

For parameter passing to the <querystring>

<querystring>rft_id=info:doi/</querystring>

Please refer to these: [http://ocoins.info/cobgbook.html] [http://ocoins.info/cobg.html]

The program assumes it will not get an empty string for the item, as there will be at least an author and a title for the item to pass to the resolver.
For the contributor author, the program maintains the original DSpace SFX function of extracting the author's first and last name.

<field>
  <querystring>rft.aulast=</querystring>
  <dc-schema>dc</dc-schema>
  <dc-element>contributor</dc-element>
  <dc-qualifier>author</dc-qualifier>
</field>
<field>
  <querystring>rft.aufirst=</querystring>
  <dc-schema>dc</dc-schema>
  <dc-element>contributor</dc-element>
  <dc-qualifier>author</dc-qualifier>
</field>
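The fall-through between query-pairs described above can be sketched as follows. This is only an illustration: the `sfx-params` wrapper element, the ISSN query-pair, the metadata dictionary and the function name `build_query` are hypothetical, and the author name splitting done by DSpace is not reproduced.

```python
import xml.etree.ElementTree as ET
from urllib.parse import quote

# Hypothetical configuration; only the first <query-pairs> comes from the text.
SFX_XML = """
<sfx-params>
  <query-pairs>
    <field>
      <querystring>rft_id=info:doi/</querystring>
      <dc-schema>dc</dc-schema>
      <dc-element>identifier</dc-element>
      <dc-qualifier>doi</dc-qualifier>
    </field>
  </query-pairs>
  <query-pairs>
    <field>
      <querystring>rft.issn=</querystring>
      <dc-schema>dc</dc-schema>
      <dc-element>identifier</dc-element>
      <dc-qualifier>issn</dc-qualifier>
    </field>
  </query-pairs>
</sfx-params>
"""


def build_query(sfx_xml, metadata):
    """Return the query string of the first query-pair that yields a value."""
    root = ET.fromstring(sfx_xml)
    for pairs in root.iter("query-pairs"):
        params = []
        for field in pairs.findall("field"):
            key = (
                field.findtext("dc-schema"),
                field.findtext("dc-element"),
                field.findtext("dc-qualifier") or None,
            )
            values = metadata.get(key, [])
            if values:
                params.append(field.findtext("querystring") + quote(values[0]))
        if params:
            return "&".join(params)
    return ""


item = {("dc", "identifier", "issn"): ["1234-5678"]}
print(build_query(SFX_XML, item))
```

An item with a DOI is matched by the first query-pair; the item above has none, so the second query-pair is used instead.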
A limited set of keywords is important because it eliminates the ambiguity of a free description system, consequently simplifying the task of finding specific items of information. The controlled vocabulary add-on allows the user to choose from a defined set of keywords organized in a tree (taxonomy) and then use these keywords to describe items while they are being submitted. We have also developed a small search engine that displays the classification tree (or taxonomy), allowing the user to select the branches that best describe the information that he/she seeks. The taxonomies are described in XML following this (very simple) structure:

<node id="acmccs98" label="ACMCCS98">
  <isComposedBy>
    <node id="A." label="General Literature">
      <isComposedBy>
        <node id="A.0" label="GENERAL"/>
        <node id="A.1" label="INTRODUCTORY AND SURVEY"/>
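As an illustration of how such a taxonomy can be traversed, the sketch below lists every leaf with its full path. The closing tags in the sample document, the "::" path separator and the function name `leaf_paths` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# The fragment from the text, completed with closing tags so that it parses.
TAXONOMY = """
<node id="acmccs98" label="ACMCCS98">
  <isComposedBy>
    <node id="A." label="General Literature">
      <isComposedBy>
        <node id="A.0" label="GENERAL"/>
        <node id="A.1" label="INTRODUCTORY AND SURVEY"/>
      </isComposedBy>
    </node>
  </isComposedBy>
</node>
"""


def leaf_paths(node, prefix=()):
    """Return the label path of every leaf below node, in document order."""
    path = prefix + (node.get("label"),)
    children = node.findall("isComposedBy/node")
    if not children:
        return ["::".join(path)]
    paths = []
    for child in children:
        paths.extend(leaf_paths(child, path))
    return paths


for path in leaf_paths(ET.fromstring(TAXONOMY)):
    print(path)
```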
You are free to use any application you want to create your controlled vocabularies. A simple text editor should be enough for small projects. Bigger projects will require more complex tools. You may use Protégé to create your taxonomies, save them as OWL and then use an XML Stylesheet (XSLT) to transform your documents to the appropriate format. Future enhancements to this add-on should make it compatible with standard schemas such as OWL or RDF.

In order to make DSpace compatible with WAI 2.0, the add-on is turned off by default (the add-on relies strongly on JavaScript to function). It can be activated by setting the following property in dspace.cfg:

webui.controlledvocabulary.enable = true

New vocabularies should be placed in [dspace]/config/controlled-vocabularies/ and must follow the structure described. A validation XML Schema can be downloaded here.

Vocabularies need to be associated with the corresponding DC metadata fields. Edit the file [dspace]/config/input-forms.xml and place a "vocabulary" tag under the "field" element that you want to control. Set the value of the "vocabulary" element to the name of the file that contains the vocabulary, leaving out the extension (the add-on will only load files with extension "*.xml"). For example:

<field>
  <dc-schema>dc</dc-schema>
  <dc-element>subject</dc-element>
  <dc-qualifier></dc-qualifier>
  <!-- An input-type of twobox MUST be marked as repeatable -->
  <repeatable>true</repeatable>
  <label>Subject Keywords</label>
  <input-type>twobox</input-type>
  <hint> Enter appropriate subject keywords or phrases below. </hint>
  <required></required>
  <vocabulary [closed="false"]>nsi</vocabulary>
</field>

The vocabulary element has an optional boolean attribute, closed, that can be used to force input only through the JavaScript of the controlled-vocabulary add-on. The default behavior (i.e. without this attribute) is the same as closed="false", which also allows the user to enter values freely.

The following vocabularies are currently available by default:

nsi - nsi.xml - The Norwegian Science Index
srsc - srsc.xml - Swedish Research Subject Categories

3. JSPUI Session Invalidation

Property: webui.session.invalidate
Example Value: webui.session.invalidate = true
Informational Note: Enable or disable session invalidation upon login or logout. This feature is enabled by default to help prevent session hijacking but may cause problems for Shibboleth, etc. If omitted, the default value is 'true'.
the XMLUI interface based upon the Cocoon framework. (Prior to DSpace Release 1.5.1, XMLUI was referred to as Manakin. You may still see references to "Manakin".)

Property: xmlui.supported.locales
Example Value: xmlui.supported.locales = en, de
Informational Note: A list of supported locales for Manakin. Manakin will look at a user's browser configuration for the first language that appears in this list to make available in the interface. This parameter is a comma-separated list of locales. All types of locales are supported: country, country_language, country_language_variant. Note that if the appropriate files are not present (i.e. Messages_XX_XX.xml), then Manakin will fall back to a more general language.

Property: xmlui.force.ssl
Example Value: xmlui.force.ssl = true
Informational Note: Force all authenticated connections to use SSL; only non-authenticated connections are allowed over plain HTTP. If set to true, you need to ensure that the 'dspace.hostname' parameter is set correctly.

Property: xmlui.user.registration
Example Value: xmlui.user.registration = true
Informational Note: Determines if new users should be allowed to register. This parameter is useful in conjunction with Shibboleth where you want to disallow registration because Shibboleth will automatically register the user. Default value is true.

Property: xmlui.user.editmetadata
Example Value: xmlui.user.editmetadata = true
Informational Note: Determines if users should be able to edit their own metadata. This parameter is useful in conjunction with Shibboleth where you want to disable the user's ability to edit their metadata because it came from Shibboleth. Default value is true.

Property: xmlui.user.assumelogon
Example Value: xmlui.user.assumelogon = true
Informational Note: Determines if super administrators (those who are in the Administrators group) can log in as another user from the "edit eperson" page. This is useful for debugging problems in a running DSpace instance, especially in the workflow process. The default value is false, i.e., no one may assume the login of another user.

Property: xmlui.user.loginredirect
Example Value: xmlui.user.loginredirect = /profile
Informational Note: After a user has logged into the system, to which URL should they be directed? Leave this parameter blank or undefined to direct users to the homepage, or use /profile for the user's profile; another reasonable choice is /submissions, to see if the user has any tasks awaiting their attention. The default is the repository home page.

Property: xmlui.theme.allowoverrides
Example Value: xmlui.theme.allowoverrides = false
Informational Note: Allow the user to override which theme is used to display a particular page. When a request is submitted with the HTTP parameter "themepath", which corresponds to a particular theme, the specified theme will be used instead of any other configured theme. Note that this is a potential security hole allowing execution of unintended code on the server. This option is only for development and debugging; it should be turned off for any production repository. The default value unless otherwise specified is "false".

Property: xmlui.bundle.upload
Example Value: xmlui.bundle.upload = ORIGINAL, METADATA, THUMBNAIL, LICENSE, CC_LICENSE
Informational Note: Determines which bundles administrators and collection administrators may upload into an existing item through the administrative interface. If the user does not have the appropriate privileges (add and write) on the bundle, then that bundle will not be shown to the user as an option.

Property: xmlui.community-list.render.full
Example Value: xmlui.community-list.render.full = true
Informational Note: Determines whether, on the community-list page, all the metadata about a community/collection should be available to the theme. This parameter defaults to true, but if you are experiencing performance problems on the community-list page you should experiment with turning this option off.

Property: xmlui.community-list.cache
Example Value: xmlui.community-list.cache = 12 hours
Informational Note: Normally, Manakin will fully verify any cached pages before using a cached copy. This means that when the community-list page is viewed, the database is queried for each community/collection to see if its metadata has been modified. This can be expensive for repositories with a large community tree. To help solve this problem you can set the cache to be assumed valid for a specific period of time. The downside is that new or edited communities/collections may not show up on the website for a period of time.
Property: xmlui.bitstream.mods
Example Value: xmlui.bitstream.mods = true
Informational Note: Optionally, you may configure Manakin to take advantage of metadata stored as a bitstream. The MODS metadata file must be inside the "METADATA" bundle and named MODS.xml. If this option is set to 'true' and the bitstream is present, then it is made available to the theme for display.

Property: xmlui.bitstream.mets
Example Value: xmlui.bitstream.mets = true
Informational Note: Optionally, you may configure Manakin to take advantage of metadata stored as a bitstream. The METS metadata file must be inside the "METADATA" bundle and named METS.xml. If this option is set to "true" and the bitstream is present, then it is made available to the theme for display.

Property: xmlui.google.analytics.key
Example Value: xmlui.google.analytics.key = UAXXXXXX-X
Informational Note: If you would like to use Google Analytics to track general website statistics, use this parameter to provide your analytics key. First sign up for an account at http://analytics.google.com, then create an entry for your repository's website. Google Analytics will give you a snippet of JavaScript code to place on your site; inside that snippet is your Google Analytics key, usually found in the line: _uacct = "UA-XXXXXXX-X". Take this key (just the UAXXXXXX-X part) and place it in this parameter.

Property: xmlui.controlpanel.activity.max
Example Value: xmlui.controlpanel.activity.max = 250
Informational Note: Sets how many page views will be recorded and displayed in the control panel's activity viewer. The activity tab allows an administrator to debug problems in a running DSpace by understanding who is currently using it and how. The default value is 250.

Property: xmlui.controlpanel.activity.ipheader
Example Value: xmlui.controlpanel.activity.ipheader = X-Forward-For
Informational Note: Determines where the control panel's activity viewer receives an event's IP address from. If your DSpace is in a load-balanced environment or otherwise behind a context-switch, then you will need to set the parameter to the HTTP parameter that records the original IP address.
6.2.45.2.1. DIDL

By activating the DIDL provider, DSpace items are represented as MPEG-21 DIDL objects. These DIDL objects are XML documents that wrap both the Dublin Core metadata that describes the DSpace item and its actual bitstreams. A bitstream is provided inline in the DIDL object in a base64 encoded manner, and/or by means of a pointer to the bitstream. The data provider exposes DIDL objects via the metadataPrefix didl. The crosswalk does not deal with special characters and purposely skips dissemination of the license.txt file awaiting a better understanding on how to map DSpace rights information to MPEG21-DIDL.

The DIDL Crosswalk can be activated as follows:

1. Uncomment the oai.didl.maxresponse configuration in dspace.cfg
2. Uncomment the DIDL Crosswalk entry from the [dspace]/config/oaicat.properties file
3. Restart your servlet container, e.g. Tomcat, for the change to take effect.
4. Verify the Crosswalk is activated by accessing a URL such as http://mydspace/oai/request?verb=ListRecords&metadataPrefix=didl
Informational Note: The webapp responsible for minting the URIs for ORE Resource Maps. If using oai, the dspace.oai.uri config value must be set. The URIs generated for ORE ReMs follow this convention in both cases: baseURI/metadata/handle/theHandle/ore.xml

Property: harvester.autoStart
Example Value: harvester.autoStart = false
Informational Note: Determines whether the harvest scheduler process starts up automatically when the XMLUI webapp is redeployed.

Property: harvester.oai.metadataformats.PluginName
Example Value:
harvester.oai.metadataformats.PluginName = \
    http://www.openarchives.org/OAI/2.0/oai_dc/, Simple Dublin Core
Informational Note: This field can be repeated and serves as a link between the metadata formats supported by the local repository and those supported by the remote OAI-PMH provider. It follows the form harvester.oai.metadataformats.PluginName = NamespaceURI,Optional Display Name. The PluginName designates the metadata schemas that the harvester "knows" the local DSpace repository can support. Consequently, the PluginName must correspond to a previously declared ingestion crosswalk. The namespace value is used during negotiation with the remote OAI-PMH provider, matching it against a list returned by the ListMetadataFormats request, and resolving it to whatever metadataPrefix the remote provider has assigned to that namespace. Finally, the optional display name is the string that will be displayed to the user when setting up a collection for harvesting. If omitted, the PluginName:NamespaceURI combo will be displayed instead.

Property: harvester.oai.oreSerializationFormat.OREPrefix
Example Value:
harvester.oai.oreSerializationFormat.OREPrefix = \
    http://www.w3.org/2005/Atom
Informational Note: This field works in much the same way as harvester.oai.metadataformats.PluginName. The OREPrefix must correspond to a declared ingestion crosswalk, while the Namespace must be supported by the target OAI-PMH provider when harvesting content.

Property: harvester.timePadding
Example Value: harvester.timePadding = 120
Informational Note: Amount of time subtracted from the from argument of the PMH request to account for the time taken to negotiate a connection. Measured in seconds. Default value is 120.

Property: harvester.harvestFrequency
Example Value: harvester.harvestFrequency = 720
Informational Note: How frequently the harvest scheduler checks the remote provider for updates. Should always be longer than timePadding. Measured in minutes. Default value is 720.

Property: harvester.minHeartbeat
Example Value: harvester.minHeartbeat = 30
Informational Note: The heartbeat is the frequency at which the harvest scheduler queries the local database to determine if any collections are due for a harvest cycle (based on the harvestFrequency value). The scheduler is optimized to then sleep until the next collection is actually ready to be harvested. The minHeartbeat and maxHeartbeat are the lower and upper bounds on this timeframe. Measured in seconds. Default value is 30.

Property: harvester.maxHeartbeat
Example Value: harvester.maxHeartbeat = 3600
Informational Note: The heartbeat is the frequency at which the harvest scheduler queries the local database to determine if any collections are due for a harvest cycle (based on the harvestFrequency value). The scheduler is optimized to then sleep until the next collection is actually ready to be harvested. The minHeartbeat and maxHeartbeat are the lower and upper bounds on this timeframe. Measured in seconds. Default value is 3600 (1 hour).

Property: harvester.maxThreads
Example Value: harvester.maxThreads = 3
Informational Note: How many harvest process threads the scheduler can spool up at once. Default value is 3.

Property: harvester.threadTimeout
Example Value: harvester.threadTimeout = 24
Informational Note: How much time passes before a harvest thread is terminated. The termination process waits for the current item to complete ingest and saves progress made up to that point. Measured in hours. Default value is 24.

Property: harvester.unknownField
Example Value: harvester.unknownField = fail | add | ignore
Informational Note: You have three (3) choices. When a harvest process completes for a single item and it has been passed through ingestion crosswalks for ORE and its chosen descriptive metadata format, it might end up with DIM values that have not been defined in the local repository. This setting determines what should be done in the case where those DIM values belong to an already declared schema. Fail will terminate the harvesting task and generate an error. Ignore will quietly omit the unknown fields. Add will add the missing field to the local repository's metadata registry. Default value: fail.

Property: harvester.unknownSchema
Example Value: harvester.unknownSchema = fail | add | ignore
Informational Note: When a harvest process completes for a single item and it has been passed through ingestion crosswalks for ORE and its chosen descriptive metadata format, it might end up with DIM values that have not been defined in the local repository. This setting determines what should be done in the case where those DIM values belong to an unknown schema. Fail will terminate the harvesting task and generate an error. Ignore will quietly omit the unknown fields. Add will add the missing schema to the local repository's metadata registry, using the schema name as the prefix and "unknown" as the namespace. Default value: fail.

Property: harvester.acceptedHandleServer
Example Value:
harvester.acceptedHandleServer = \
    hdl.handle.net, handle.test.edu
Informational Note: A harvest process will attempt to scan the metadata of the incoming items (the identifier.uri field, to be exact) to see if it looks like a handle. If so, it matches the pattern against the values of this parameter. If there is a match, the new item is assigned the handle from the metadata value instead of minting a new one. Default value: hdl.handle.net.

Property: harvester.rejectedHandlePrefix
Example Value: harvester.rejectedHandlePrefix = 123456789, myeduHandle
Informational Note: Pattern to reject as an invalid handle prefix (a known test string, for example) when attempting to find the handle of harvested items. If there is a match with this config parameter, a new handle will be minted instead. Default value: 123456789.
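The interplay of harvester.acceptedHandleServer and harvester.rejectedHandlePrefix can be sketched like this. The function name `choose_handle` and the URL-based matching are assumptions made for illustration; the harvester's actual pattern matching may differ.

```python
from urllib.parse import urlparse


def choose_handle(identifier_uri,
                  accepted_servers=("hdl.handle.net",),
                  rejected_prefixes=("123456789",)):
    """Return the handle to reuse, or None if a new one should be minted."""
    parsed = urlparse(identifier_uri)
    if parsed.hostname not in accepted_servers:
        return None  # not served by an accepted handle server
    handle = parsed.path.lstrip("/")
    prefix, sep, suffix = handle.partition("/")
    if not sep or not suffix:
        return None  # does not look like prefix/suffix
    if prefix in rejected_prefixes:
        return None  # known test prefix: mint a new handle instead
    return handle


print(choose_handle("http://hdl.handle.net/2292/5763"))
print(choose_handle("http://hdl.handle.net/123456789/24"))
```

The first call returns 2292/5763, which the harvested item would keep; the second returns None because 123456789 is a rejected prefix.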
Informational Note: Is used by the SolrLogger Client class to connect to the SOLR server over HTTP and perform updates and queries.

Property: solr.spidersfile
Example Value: solr.spidersfile = ${dspace.dir}/config/spiders.txt
Informational Note: The spiders file is utilized by the SolrLogger. It is populated by running the following command: dsrun org.dspace.statistics.util.SpiderDetector -i <httpd log file>

Property: solr.dbfile
Example Value: solr.dbfile = ${dspace.dir}/config/GeoLiteCity.dat
Informational Note: This refers to the GeoLiteCity database file utilized by LocationUtils to calculate the location of client requests based on IP address. During the Ant build process (both fresh_install and update) this file will be downloaded from http://www.maxmind.com/app/geolitecity if a new version has been published or it is absent from your [dspace]/config directory.

Property: solr.resolver.timeout
Example Value: solr.resolver.timeout = 200
Informational Note: Timeout for the resolver in the DNS lookup, in milliseconds. Defaults to 200 for backward compatibility. Your system's default is usually set in /etc/resolv.conf and varies between 2 and 5 seconds; too high a value might result in Solr exhausting your connection pool.

Property: statistics.item.authorization.admin
Example Value: statistics.item.authorization.admin = true
Informational Note: Enables access control restriction on DSpace Statistics pages. Restrictions are based on access rights to Community, Collection and Item pages. This will require the user to sign on to see the statistics. Setting this to "false" will make the statistics publicly available.

Property: solr.statistics.logBots
Example Value: solr.statistics.logBots = true
Informational Note: Enable/disable logging of spiders in Solr statistics. If false, and the IP matches an address in solr.spiderips.urls, the event is not logged. If true, the event will be logged with the 'isBot' field set to true (see solr.statistics.query.filter.* for query filter options). Default value is true.

Property: solr.statistics.query.filter.spiderIp
Optional or Advanced Configuration Settings

Example Value: solr.statistics.query.filter.spiderIp = false
Informational Note: Controls Solr statistics querying to filter out spider IPs. False by default.

Property: solr.statistics.query.filter.isBot
Example Value: solr.statistics.query.filter.isBot = true
Informational Note: Controls Solr statistics querying to look at the "isBot" field to determine if a record is a bot. True by default.

Property: solr.spiderips.urls
Example Value:
solr.spiderips.urls = http://iplists.com/google.txt, \
    http://iplists.com/inktomi.txt, \
    http://iplists.com/lycos.txt, \
    http://iplists.com/infoseek.txt, \
    http://iplists.com/altavista.txt, \
    http://iplists.com/excite.txt, \
    http://iplists.com/misc.txt, \
    http://iplists.com/excite.txt, \
    http://iplists.com/misc.txt, \
    http://iplists.com/non_engines.txt
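The effect of solr.statistics.logBots on events coming from spider addresses can be sketched as below. The function `log_event`, the dictionary it returns and the set of spider IPs are hypothetical; the sketch assumes the IP lists have already been downloaded from the configured URLs.

```python
def log_event(ip, spider_ips, log_bots=True):
    """Return the record that would be logged, or None if the event is dropped."""
    is_bot = ip in spider_ips
    if is_bot and not log_bots:
        return None  # logBots = false: spider traffic is not logged at all
    # logBots = true: spider traffic is logged and flagged via isBot.
    return {"ip": ip, "isBot": is_bot}


spiders = {"192.0.2.10"}  # illustrative address only
print(log_event("192.0.2.10", spiders, log_bots=True))
print(log_event("192.0.2.10", spiders, log_bots=False))
print(log_event("198.51.100.7", spiders, log_bots=False))
```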
Optional or Advanced Configuration Settings 1. Install the xpdf tools for your platform, from the downloads at http://www.foolabs.com/xpdf 2. Acquire the Sun Java Advanced Imaging Tools and create a local Maven package. 3. Edit DSpace configuration properties to add location of xpdf executables, reconfigure MediaFilter plugins. 4. Build and install DSpace, adding -Pxpdf-mediafilter-support to Maven invocation.
The preceding example leaves the JAR in jai_imageio-1_1/lib/jai_imageio.jar . Now install it in your local Maven repository, e.g.: (changing the path after file= if necessary)
mvn install:install-file \
    -Dfile=jai_imageio-1_1/lib/jai_imageio.jar \
    -DgroupId=com.sun.media \
    -DartifactId=jai_imageio \
    -Dversion=1.0_01 \
    -Dpackaging=jar \
    -DgeneratePom=true
You may have to repeat this procedure for the jai_core.jar library, as well, if it is not available in any of the public Maven repositories. Once acquired, this command installs it locally:
mvn install:install-file \
    -Dfile=jai_core-1.1.2_01.jar \
    -DgroupId=javax.media \
    -DartifactId=jai_core \
    -Dversion=1.1.2_01 \
    -Dpackaging=jar \
    -DgeneratePom=true
Now, add the absolute paths to the XPDF tools you installed. In this example they are installed under /usr/local/bin (a logical place on Linux and MacOSX), but they may be anywhere.
xpdf.path.pdftotext = /usr/local/bin/pdftotext
xpdf.path.pdftoppm = /usr/local/bin/pdftoppm
xpdf.path.pdfinfo = /usr/local/bin/pdfinfo
Change the MediaFilter plugin configuration to remove the old org.dspace.app.mediafilter.PDFFilter and add the new filters, e.g. (the new entries are those for XPDF2Text and XPDF2Thumbnail):
filter.plugins = \
    PDF Text Extractor, \
    PDF Thumbnail, \
    HTML Text Extractor, \
    Word Text Extractor, \
    JPEG Thumbnail
plugin.named.org.dspace.app.mediafilter.FormatFilter = \
    org.dspace.app.mediafilter.XPDF2Text = PDF Text Extractor, \
    org.dspace.app.mediafilter.XPDF2Thumbnail = PDF Thumbnail, \
    org.dspace.app.mediafilter.HTMLFilter = HTML Text Extractor, \
    org.dspace.app.mediafilter.WordFilter = Word Text Extractor, \
    org.dspace.app.mediafilter.JPEGFilter = JPEG Thumbnail, \
    org.dspace.app.mediafilter.BrandedPreviewJPEGFilter = Branded Preview JPEG
Then add the input format configuration properties for each of the new filters, e.g.:
filter.org.dspace.app.mediafilter.XPDF2Thumbnail.inputFormats = Adobe PDF
filter.org.dspace.app.mediafilter.XPDF2Text.inputFormats = Adobe PDF
Finally, if you want PDF thumbnail images, don't forget to add that filter name to the filter.plugins property, e.g.:
filter.plugins = PDF Thumbnail, PDF Text Extractor, ...
Alternatively, you could extend the org.dspace.app.mediafilter.MediaFilter class, which just defaults to performing no pre/post-processing of bitstreams before or after filtering.

public class MySimpleMediaFilter extends MediaFilter

You must give your new filter a "name", by adding it and its name to the plugin.named.org.dspace.app.mediafilter.FormatFilter field in dspace.cfg. In addition to naming your filter, make sure to specify its input formats in the filter.<class path>.inputFormats config item. Note that the input formats must match the short description field in the Bitstream Format Registry (i.e. the bitstreamformatregistry table).
plugin.named.org.dspace.app.mediafilter.FormatFilter = \
    org.dspace.app.mediafilter.MySimpleMediaFilter = My Simple Text Filter, \
    ...
filter.org.dspace.app.mediafilter.MySimpleMediaFilter.inputFormats = Text
If you neglect to define the inputFormats for a particular filter, the MediaFilterManager will never call that filter, since it will never find a bitstream which has a format matching that filter's input format(s). If you have a complex Media Filter class, which actually performs different filtering for different formats (e.g. conversion from Word to PDF and conversion from Excel to CSV), you should define this as described in Chapter 13.3.2.2 .
As shown above, each Self-Named Filter class must be listed in the plugin.selfnamed.org.dspace.app.mediafilter.FormatFilter item in dspace.cfg. In addition, each Self-Named Filter must define the input formats for each named plugin defined by that filter. In the above example the MyComplexMediaFilter class is assumed to have defined two named plugins, Word2PDF and Excel2CSV. So, these two valid plugin names ("Word2PDF" and "Excel2CSV") must be returned by the getPluginNames() method of the MyComplexMediaFilter class. These named plugins take different input formats as defined above (see the corresponding inputFormats setting).
If you neglect to define the inputFormats for a particular named plugin, the MediaFilterManager will never call that plugin, since it will never find a bitstream which has a format matching that plugin's input format(s).

For a particular Self-Named Filter, you are also welcome to define additional configuration settings in dspace.cfg. To continue with our current example, each of our imaginary plugins actually results in a different output format (Word2PDF creates "Adobe PDF", while Excel2CSV creates "Comma Separated Values"). To allow this complex Media Filter to be even more configurable (especially across institutions, with potentially different "Bitstream Format Registries"), you may wish to allow the output format to be customized for each named plugin. For example:
#Define output formats for each named plugin
filter.org.dspace.app.mediafilter.MyComplexMediaFilter.Word2PDF.outputFormat = Adobe PDF
filter.org.dspace.app.mediafilter.MyComplexMediaFilter.Excel2CSV.outputFormat = Comma Separated Values
Any custom configuration fields in dspace.cfg defined by your filter are ignored by the MediaFilterManager, so it is up to your custom media filter class to read those configurations and apply them as necessary. For example, you could use the following sample Java code in your MyComplexMediaFilter class to read these custom outputFormat configurations from dspace.cfg:
// Get the "outputFormat" configuration from dspace.cfg
String outputFormat = ConfigurationManager.getProperty(
    MediaFilterManager.FILTER_PREFIX + "." + MyComplexMediaFilter.class.getName() + "."
    + this.getPluginInstanceName() + ".outputFormat");
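The key built in the snippet above can be illustrated without DSpace on the classpath. The following is only a sketch: it uses java.util.Properties in place of ConfigurationManager, assumes the filter prefix is the literal string "filter" (as the example keys above suggest), and the class OutputFormatKeySketch and its methods are hypothetical names introduced for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical helper, not part of DSpace: shows how the per-plugin
// "outputFormat" key is assembled and looked up.
final class OutputFormatKeySketch {
    // Assumption: mirrors the "filter" prefix seen in the example keys.
    static final String FILTER_PREFIX = "filter";

    /** Builds filter.<filter class>.<plugin name>.outputFormat */
    static String outputFormatKey(String filterClassName, String pluginName) {
        return FILTER_PREFIX + "." + filterClassName + "." + pluginName + ".outputFormat";
    }

    /** Looks the key up, returning the fallback when the key is not configured. */
    static String lookup(Properties cfg, String filterClassName, String pluginName, String fallback) {
        return cfg.getProperty(outputFormatKey(filterClassName, pluginName), fallback);
    }

    public static void main(String[] args) throws IOException {
        String filterClass = "org.dspace.app.mediafilter.MyComplexMediaFilter";
        Properties cfg = new Properties();
        // Stand-in for the relevant lines of dspace.cfg.
        cfg.load(new StringReader(
            "filter." + filterClass + ".Word2PDF.outputFormat = Adobe PDF\n"
          + "filter." + filterClass + ".Excel2CSV.outputFormat = Comma Separated Values\n"));
        System.out.println(lookup(cfg, filterClass, "Word2PDF", "Unknown"));
        System.out.println(lookup(cfg, filterClass, "Excel2CSV", "Unknown"));
    }
}
```

Supplying a fallback value keeps the filter usable at institutions that have not set the optional property.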
of SWORD currently supported by DSpace is 1.3. The specification and further information can be downloaded from http://swordapp.org. SWORD is based on the Atom Publish Protocol and allows service documents to be requested which describe the structure of the repository, and packages to be deposited.

Property: sword.metsingester.package-ingester
Example Value: sword.metsingester.package-ingester = METS
Informational Note: The property key tells the SWORD METS implementation which package ingester to use to install deposited content. This should refer to one of the classes configured for:
plugin.named.org.dspace.content.packager.PackageIngester
The value of sword.metsingester.package-ingester tells the system which named plugin for this interface should be used to ingest SWORD METS packages.

Property: mets.submission.crosswalk.EPDCX
Example Value: mets.submission.crosswalk.EPDCX = SWORD
Informational Note: Define the metadata type EPDCX (EPrints DC XML) to be handled by the SWORD crosswalk configuration.

Property: crosswalk.submission.SWORD.stylesheet
Example Value: crosswalk.submission.SWORD.stylesheet = crosswalks/swordswap-ingest.xsl
Informational Note: Define the stylesheet which will be used by the self-named XSLTIngestionCrosswalk class when asked to load the SWORD configuration (as specified above). This will use the specified stylesheet to crosswalk the incoming SWAP metadata to the DIM format for ingestion.

Property: sword.deposit.url
Example Value: sword.deposit.url = http://www.myu.ac.uk/sword/deposit
Informational Note:
The base URL of the SWORD deposit. This is the URL from which DSpace will construct the deposit location URLs for collections. The default is {dspace.url}/sword/deposit. If you are not deploying DSpace as the ROOT application in the servlet container, this will generate incorrect URLs, and you should override it by specifying the URL in full, as shown in the example value.

Property: sword.servicedocument.url
Example Value: sword.servicedocument.url = http://www.myu.ac.uk/sword/servicedocument
Informational Note: The base URL of the SWORD service document. This is the URL from which DSpace will construct the service document location URLs for the site, and for individual collections. The default is {dspace.url}/sword/servicedocument. If you are not deploying DSpace as the ROOT application in the servlet container, this will generate incorrect URLs, and you should override it by specifying the URL in full, as shown in the example value.

Property: sword.media-link.url
Example Value: sword.media-link.url = http://www.myu.ac.uk/sword/media-link
Informational Note: The base URL of the SWORD media links. This is the URL which DSpace will use to construct the media link URLs for items which are deposited via SWORD. The default is {dspace.url}/sword/media-link. If you are not deploying DSpace as the ROOT application in the servlet container, this will generate incorrect URLs, and you should override it by specifying the URL in full, as shown in the example value.

Property: sword.generator.url
Example Value: sword.generator.url = http://www.dspace.org/ns/sword/1.3.1
Informational Note: The URL which identifies the SWORD software which provides the SWORD interface. This is the URL which DSpace will use to fill out the atom:generator element of its Atom documents. The default is http://www.dspace.org/ns/sword/1.3.1. If you have modified your SWORD software, you should change this URI to identify your own version. If you are using the standard dspace-sword module you will not, in general, need to change this setting.

Property: sword.updated.field
Example Value: sword.updated.field = dc.date.updated
Informational Note: The metadata field in which to store the updated date for items deposited via SWORD.

Property: sword.slug.field
Example Value: sword.slug.field = dc.identifier.slug
Informational Note: The metadata field in which to store the value of the slug header if it is supplied.

Properties:
sword.accept-packaging.METSDSpaceSIP.identifier
sword.accept-packaging.METSDSpaceSIP.q
Example Value:
sword.accept-packaging.METSDSpaceSIP.identifier = http://purl.org/net/swordtypes/METSDSpaceSIP
sword.accept-packaging.METSDSpaceSIP.q = 1.0
Informational Note: The accept packaging properties, along with their associated quality values where appropriate. This is a global setting; these will be used on all DSpace collections.

Properties:
sword.accept-packaging.[handle].METSDSpaceSIP.identifier
sword.accept-packaging.[handle].METSDSpaceSIP.q
Example Value:
sword.accept-packaging.[handle].METSDSpaceSIP.identifier = http://purl.org/net/swordtypes/METSDSpaceSIP
sword.accept-packaging.[handle].METSDSpaceSIP.q = 1.0
Informational Note: Collection-specific settings: these will be used on the collections with the given handles.

Property: sword.expose-items
Example Value: sword.expose-items = false
Informational Note: Should the server offer up items in collections as SWORD deposit targets. This will be effected by placing a URI in the collection description which will list all the allowed items for the depositing user in that collection on request. NOTE: this will require an implementation of deposit onto items, which will not be forthcoming for a short while.

Property: sword.expose-communities
Example Value: sword.expose-communities = false
Informational Note: Should the server offer as the default the list of all Communities to a Service Document request. If false, the server will offer the list of all collections, which is the default and recommended behavior at this stage. NOTE: a service document for Communities will not offer any viable deposit targets, and the client will need to request the list of Collections in the target before deposit can continue.

Property: sword.max-upload-size
Example Value: sword.max-upload-size = 0
Informational Note: The maximum upload size of a package through the SWORD interface, in bytes. This will be the combined size of all the files, the metadata and any manifest data. It is NOT the same as the maximum size set for an individual file upload through the user interface. If not set, or set to 0, the SWORD service will default to no limit.

Property: sword.keep-original-package
Example Value: sword.keep-original-package = true
Informational Note: Whether or not DSpace should store a copy of the original SWORD deposit package. NOTE: this will cause the deposit process to run slightly slower, and will accelerate the rate at which the repository consumes disk space. BUT, it will also mean that the deposited packages are recoverable in their original form. It is strongly recommended, therefore, to leave this option turned on. When set to "true", this requires that the configuration option upload.temp.dir above is set to a valid location.

Property: sword.bundle.name
Example Value: sword.bundle.name = SWORD
Informational Note: The bundle name that SWORD should store incoming packages under if sword.keep-original-package is set to true. The default is "SWORD" if no value is set.

Property: sword.identify-version
Example Value: sword.identify-version = true
Informational Note: Should the server identify the SWORD version in a deposit response. It is recommended to leave this unchanged.

Property: sword.on-behalf-of.enable
Example Value: sword.on-behalf-of.enable = true
Informational Note: Should mediated deposit via SWORD be supported. If enabled, this will allow users to deposit content packages on behalf of other users.

Property: plugin.named.org.dspace.sword.SWORDIngester
Example Value:
plugin.named.org.dspace.sword.SWORDIngester = \
    org.dspace.sword.SWORDMETSIngester = http://purl.org/net/swordtypes/METSDSpaceSIP \
    org.dspace.sword.SimpleFileIngester = SimpleFileIngester
Informational Note:
This configuration is as per the Plugin Manager's Named Plugin documentation: plugin.named.[interface] = [implementation] = [package format identifier]. Package ingesters should implement the SWORDIngester interface, and will be loaded when a package of the format specified above in sword.accept-packaging.[package format].identifier = [package format identifier] is received. In the event that this is a simple file deposit, with no package format, then the class named by "SimpleFileIngester" will be loaded and executed where appropriate. This case will only occur when a single file is being deposited into an existing DSpace Item.

Property: sword.accepts
Example Value: sword.accepts = application/zip, foo/bar
Informational Note: A comma-separated list of MIME types that SWORD will accept.
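As an illustration of how a comma-separated value such as sword.accepts can be interpreted, the sketch below splits the list and checks an incoming content type against it. It is a standalone example, not the dspace-sword implementation: the class SwordAcceptsSketch and its methods are hypothetical, and the case-insensitive comparison and the stripping of parameters such as "; charset=..." are assumptions made here.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;
import java.util.stream.Collectors;

// Hypothetical helper, not part of DSpace.
final class SwordAcceptsSketch {
    /** Splits a comma-separated MIME type list, trimming blanks and lower-casing entries. */
    static Set<String> parseAccepts(String configuredValue) {
        return Arrays.stream(configuredValue.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .map(s -> s.toLowerCase(Locale.ROOT))
                .collect(Collectors.toCollection(LinkedHashSet::new));
    }

    /** True when the base type of the Content-Type value is in the accepted set. */
    static boolean isAccepted(Set<String> accepted, String contentType) {
        // Assumption: parameters after ';' are ignored for the comparison.
        String base = contentType.split(";", 2)[0].trim().toLowerCase(Locale.ROOT);
        return accepted.contains(base);
    }

    public static void main(String[] args) {
        Set<String> accepted = parseAccepts("application/zip, foo/bar");
        System.out.println(accepted);
        System.out.println(isAccepted(accepted, "application/zip"));
        System.out.println(isAccepted(accepted, "text/plain"));
    }
}
```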
6.4. Discovery
6.4.1. Introduction Video
6.4.2. Usage Guidelines
The Discovery Module enables faceted searching for your repository. In a faceted search, a user can filter what they are looking for by grouping entries into a facet, and drill down to find the content they are interested in. So instead of a user searching for [ wetland + "dc.author=Mitsch, William J" + dc.subject="water quality" ], they can do their initial search, [ wetland ], and then filter the results by attributes. Although these techniques are new in DSpace, they might feel familiar from other platforms like Aquabrowser or Amazon, where facets such as price and brand help you select the right product. DSpace Discovery offers very powerful browse and search configurations that were only possible with code customization in the past.
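The search-then-filter flow described above can be sketched in a few lines of plain Java. This is a conceptual illustration only: Discovery delegates faceting to Solr, whereas the class FacetSketch, its Item type, and the sample data below are hypothetical and operate on an in-memory list.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical in-memory model of faceted search, not DSpace code.
final class FacetSketch {
    /** Minimal item: a title plus one author and one subject value. */
    static final class Item {
        final String title;
        final String author;
        final String subject;

        Item(String title, String author, String subject) {
            this.title = title;
            this.author = author;
            this.subject = subject;
        }
    }

    /** Initial search: keep items whose title contains the query (case-insensitive). */
    static List<Item> search(List<Item> items, String query) {
        String q = query.toLowerCase();
        return items.stream()
                .filter(i -> i.title.toLowerCase().contains(q))
                .collect(Collectors.toList());
    }

    /** Facet: count how many results carry each author value. */
    static Map<String, Long> authorFacet(List<Item> results) {
        return results.stream()
                .collect(Collectors.groupingBy(i -> i.author, TreeMap::new, Collectors.counting()));
    }

    /** Drill-down: keep only the results with the chosen author value. */
    static List<Item> filterByAuthor(List<Item> results, String author) {
        return results.stream()
                .filter(i -> i.author.equals(author))
                .collect(Collectors.toList());
    }

    /** Made-up sample records for the demonstration. */
    static List<Item> sampleItems() {
        return Arrays.asList(
                new Item("Wetlands", "Mitsch, William J", "water quality"),
                new Item("Wetland Ecosystems", "Mitsch, William J", "ecology"),
                new Item("Wetland Birds", "Example, Author", "ornithology"),
                new Item("River Hydrology", "Example, Author", "water quality"));
    }

    public static void main(String[] args) {
        List<Item> hits = search(sampleItems(), "wetland");
        System.out.println(authorFacet(hits));
        System.out.println(filterByAuthor(hits, "Mitsch, William J").size());
    }
}
```

The facet counts are computed from the current result set, which is what lets a user see how many hits remain before choosing a filter.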
1. Enable the Discovery Aspects in the XMLUI by changing the following settings in config/xmlui.xconf
   a. Comment out: SearchArtifacts
   b. Uncomment: Discovery
<xmlui>
    <aspects>
        <aspect name="Artifact Browser" path="resource://aspects/ArtifactBrowser/" />
        <aspect name="Browsing Artifacts" path="resource://aspects/BrowseArtifacts/" />
        <!--<aspect name="Searching Artifacts" path="resource://aspects/SearchArtifacts/" />-->
        <aspect name="Administration" path="resource://aspects/Administrative/" />
        <aspect name="E-Person" path="resource://aspects/EPerson/" />
        <aspect name="Submission and Workflow" path="resource://aspects/Submission/" />
        <aspect name="Statistics" path="resource://aspects/Statistics/" />
        <!-- To enable Discovery, uncomment this Aspect that will enable it within your existing XMLUI.
             Also make sure to comment the SearchArtifacts aspect, as leaving it on together with
             Discovery will cause UI overlap issues. -->
        <aspect name="Discovery" path="resource://aspects/Discovery/" />
        <!-- This aspect tests the various possible DRI features; it helps a theme developer create themes -->
        <!-- <aspect name="XML Tests" path="resource://aspects/XMLTest/"/> -->
    </aspects>
</xmlui>
2. Enable the Discovery Indexing Consumer that will update Discovery Indexes on changes to content in XMLUI, JSPUI, SWORD, and LNI in config/dspace.cfg
   a. Add discovery to the list of event.dispatcher.default.consumers
   b. Change recent.submissions.count to zero
#### Event System Configuration ####
# default synchronous dispatcher (same behavior as traditional DSpace)
event.dispatcher.default.class = org.dspace.event.BasicDispatcher
#event.dispatcher.default.consumers = search, browse, eperson, harvester
event.dispatcher.default.consumers = search, browse, discovery, eperson, harvester

# Set the recent submissions count to 0 so that Discovery can use its own recent submissions;
# not doing this when Discovery is enabled will cause UI overlap issues
# How many recent submissions should be displayed at any one time
#recent.submissions.count = 5
recent.submissions.count = 0
3. Check that the port is correct for solr.search.server in config/dspace-solr-search.cfg
   a. If all of your traffic runs over port 80, then you need to remove the port from the URL

##### Search Indexing #####
solr.search.server = http://localhost/solr/search
4. From the command line, navigate to the dspace directory and run the command below to index the content of your DSpace instance into Discovery.
./bin/dspace update-discovery-index
NOTE: This step may take some time if you have a large number of items in your repository.
solr.facets.community=dc.contributor.author,dc.subject,dc.date.issued_dt
Informational Note: Aside from filters that are applied when users are searching, filters can also be applied by default. This property allows you to define default filters that are used for every search in Discovery. The syntax is metadatafieldname:value. location is a special example, used to restrict a search to certain communities and collections: l stands for collection, while m is used to restrict the search to a community. The number written after l or m is the internal database ID of the collection or community.

Property: solr.site.default.filterQuery
Example Value: solr.site.default.filterQuery=dc.contributor.author:Kevin*
Informational Note: This parameter applies additional filters on the Recently Added list shown on the DSpace homepage. As these filters are strict matches, the star in the example is used to filter on all dc.contributor.author values that start with Kevin.

Property: solr.community.default.filterQuery
Example Value: solr.community.default.filterQuery=dc.contributor.author:Kevin*
Informational Note: This parameter applies additional filters on the Recently Added list shown on Community homepages. As these filters are strict matches, the star in the example is used to filter on all dc.contributor.author values that start with Kevin.

Property: solr.collection.default.filterQuery
Example Value: solr.collection.default.filterQuery=dc.contributor.author:Kevin*
Informational Note: This parameter applies additional filters on the Recently Added list shown on Collection homepages. As these filters are strict matches, the star in the example is used to filter on all dc.contributor.author values that start with Kevin.

Property: solr.search.default.filterQuery
Example Value: solr.search.default.filterQuery=dc.embargo:lifted
Informational Note: This parameter applies additional filters on all Discovery searches. In this example, only items that have the value lifted in the embargo field are shown as search results.

Property: solr.search.filters
Example Value: dc.title, dc.contributor.author, dc.subject, dc.date.issued.year
Informational Note: Defines which fields are shown in the (advanced) search form.

Property: solr.search.sort
Example Value: solr.search.sort=dc.title, dc.date.issued_dt
Informational Note: Defines which indexed fields can be sorted on in the search results. With this parameter it is possible to make any field available for sorting.

Property: solr.index.type.date
Example Value: solr.index.type.date=dc.date,dc.date.*
Informational Note: Defines which fields are indexed as dates. Please be aware that each date field is suffixed with _dt, so that dc.date.issued becomes dc.date.issued_dt. For each indexed date the year is also stored separately in a (field.name).year field so it can be used for date faceting.

Property: solr.recent-submissions.size
Example Value: solr.recent-submissions.size=5
Informational Note: Defines the number of items that are shown in the Recently Added lists.

Property: recent.submissions.sort-option
Example Value: recent.submissions.sort-option=dc.date.accessioned_dt
Informational Note: The indexed metadata field on which Discovery sorts to determine which items were recently submitted.

Property: search.facet.max
Example Value: search.facet.max=10
Informational Note: Use this property to limit the number of facet filters shown at the side of the search page.
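The metadatafieldname:value syntax of the default filter queries, including the trailing star, can be illustrated with a small matcher. This sketch only mirrors the semantics described in the notes above (strict match unless the value ends in a star); it is not how Solr evaluates filter queries, and the class DefaultFilterQuerySketch is a hypothetical name.

```java
// Hypothetical illustration of the "field:value" filter syntax, not DSpace or Solr code.
final class DefaultFilterQuerySketch {
    final String field;
    final String pattern;

    private DefaultFilterQuerySketch(String field, String pattern) {
        this.field = field;
        this.pattern = pattern;
    }

    /** Splits "metadatafieldname:value" at the first colon. */
    static DefaultFilterQuerySketch parse(String filterQuery) {
        int colon = filterQuery.indexOf(':');
        if (colon <= 0 || colon == filterQuery.length() - 1) {
            throw new IllegalArgumentException("expected metadatafieldname:value, got " + filterQuery);
        }
        return new DefaultFilterQuerySketch(
                filterQuery.substring(0, colon).trim(),
                filterQuery.substring(colon + 1).trim());
    }

    /** Strict match, unless the pattern ends in '*', in which case it is a prefix match. */
    boolean matches(String value) {
        if (pattern.endsWith("*")) {
            return value.startsWith(pattern.substring(0, pattern.length() - 1));
        }
        return value.equals(pattern);
    }

    public static void main(String[] args) {
        DefaultFilterQuerySketch byAuthor = parse("dc.contributor.author:Kevin*");
        System.out.println(byAuthor.field + " matches 'Kevin Smith': " + byAuthor.matches("Kevin Smith"));
        DefaultFilterQuerySketch byEmbargo = parse("dc.embargo:lifted");
        System.out.println(byEmbargo.field + " matches 'lifted': " + byEmbargo.matches("lifted"));
    }
}
```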
DSpace Statistics
elevate.xml
protwords.txt
schema.xml
scripts.conf
solrconfig.xml
spellings.txt
stopwords.txt
synonyms.txt
xslt
    example.xsl
    example_atom.xsl
    example_rss.xsl
    luke.xsl
Assuming you get an HTTP 200 OK response, you should set solr.log.server to the '/statistics' URL of 'http://127.0.0.1/solr/statistics' (essentially removing the "/select?q=:" query from the end of the responding URL).

Property Name: solr.spiderips.urls
Default Value:
http://iplists.com/google.txt, \
http://iplists.com/inktomi.txt, \
http://iplists.com/lycos.txt, \
http://iplists.com/infoseek.txt, \
http://iplists.com/altavista.txt, \
http://iplists.com/excite.txt, \
http://iplists.com/misc.txt, \
http://iplists.com/non_engines.txt
Type: String
Description: List of URLs to download spiders files into [dspace]/config/spiders. These files contain lists of known spider IPs and are used by the SolrLogger to flag usage events with an "isBot" field, or to ignore them entirely. The "stats-util" command can be used to force an update of spider files, regenerate "isBot" fields on indexed events, and delete spiders from the index. For usage, run the following from your [dspace]/bin directory:
dspace stats-util -h

Property Name: solr.dbfile
Default Value: ${dspace.dir}/config/GeoLiteCity.dat
Type: String
Description: The GeoLiteCity database file used by LocationUtils to calculate the location of client requests based on IP address. During the Ant build process (both fresh_install and update) this file is downloaded from http://www.maxmind.com/app/geolitecity if a new version has been published or if it is absent from your [dspace]/config directory.

Property Name: solr.resolver.timeout
Default Value: 200
Type: Integer
Description: Timeout in milliseconds for DNS resolution of origin hosts/IPs. Setting this value too high may result in Solr exhausting your connection pool.

Property Name: useProxies
Default Value: true
Type: boolean
Description: Causes Statistics logging to look for the X-Forward URI to detect the IP of clients that have accessed DSpace through a proxy service. Allows detection of the client IP when accessing DSpace. [Note: This setting is found in the DSpace Logging section of dspace.cfg]

Property Name: statistics.item.authorization.admin
Default Value: true
Type: boolean
Description: Enables access control restrictions on DSpace Statistics pages. Restrictions are based on access rights to Community, Collection and Item pages. This requires the user to sign on to see the statistics. Setting this property to "false" makes the statistics publicly available.

Property Name: solr.statistics.logBots
Default Value: true
Type: boolean
Description: If false, and the IP is detected as a spider, the event is not logged. If true, the event is logged with the "isBot" field set to true (see solr.statistics.query.filter.* for query filter options).

Property Name: solr.statistics.query.filter.spiderIp
Default Value: false
Type: boolean
Description: If true, statistics queries will filter out spider IPs. Use with caution, as this often results in extremely long query strings.

Property Name: solr.statistics.query.filter.isBot
Default Value: true
Type: boolean
Description: If true, statistics queries will filter out events flagged with the "isBot" field. This is the recommended method of filtering spiders from statistics.
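The interaction between solr.statistics.logBots and solr.statistics.query.filter.isBot described in the table can be summarized as a small decision function. This is a sketch of the documented behavior only, not the SolrLogger code; the class BotLoggingSketch, its enum, and its method names are hypothetical.

```java
// Hypothetical summary of the documented settings, not the SolrLogger implementation.
final class BotLoggingSketch {
    enum Outcome { NOT_LOGGED, LOGGED, LOGGED_AS_BOT }

    /** What happens to a usage event, given logBots and whether the client IP is a known spider. */
    static Outcome onUsageEvent(boolean logBots, boolean fromSpiderIp) {
        if (!fromSpiderIp) {
            return Outcome.LOGGED;
        }
        // Spider traffic is either dropped or kept with isBot=true.
        return logBots ? Outcome.LOGGED_AS_BOT : Outcome.NOT_LOGGED;
    }

    /** Whether a statistics query counts the event, given the query.filter.isBot setting. */
    static boolean countedInStatistics(Outcome outcome, boolean filterIsBot) {
        switch (outcome) {
            case LOGGED:
                return true;
            case LOGGED_AS_BOT:
                return !filterIsBot;
            default:
                return false;
        }
    }

    public static void main(String[] args) {
        // Default settings from the table: logBots = true, query.filter.isBot = true.
        Outcome spiderHit = onUsageEvent(true, true);
        System.out.println(spiderHit + " counted: " + countedInStatistics(spiderHit, true));
    }
}
```

With the defaults, spider events stay in the index but are excluded at query time, so the isBot filter can later be switched off without losing data.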
Embargo
mvn package
cd [dspace-source]/dspace/target/dspace-<version>-build.dir
ant -Dconfig=[dspace]/config/dspace.cfg update
cp -R [dspace]/webapps/* [TOMCAT]/webapps
The last step is only needed if you are not mounting [dspace]/webapps directly into your Tomcat, Resin or Jetty host (the recommended practice). If you only need to build the statistics, and have not made any changes to other web applications, you can replace the copy step above with:
cp -R dspace/webapps/solr TOMCAT/webapps
Again, this is needed only if you are not mounting [dspace]/webapps directly into your Tomcat, Resin or Jetty host (the recommended practice).

Restart your webapps (Tomcat/Jetty/Resin).
These fields are not used by the new 1.6 Statistics, but are only related to the Statistics from previous DSpace releases
6.6. Embargo
6.6.1. What is an embargo?
An embargo is a temporary access restriction placed on content, commencing at time of accession. Its scope or duration may vary, but the fact that it eventually expires is what distinguishes it from other content restrictions. For example, it is not unusual for content destined for DSpace to come with permanent restrictions on use or access based on license-driven or other IP-based requirements that limit access to institutionally affiliated users. Restrictions such as these are imposed and managed using standard administrative tools in DSpace, typically by attaching specific policies to Items or Collections, Bitstreams, etc. The embargo functionality introduced in 1.6, however, includes tools to automate the imposition and removal of restrictions in managed timeframes.
"2020-09-12" - an absolute date (i.e. the date the embargo will be lifted)
"6 months" - a time relative to when the item is accessioned
"forever" - an indefinite, or open-ended embargo
"local only until 2015" - both a time and an exception (public has no access until 2015, local users OK immediately)
"Nature Publishing Group standard" - look-up to a policy somewhere (typically 6 months)

These terms are 'interpreted' by the embargo system to yield a specific date on which the embargo can be removed or 'lifted', and a specific set of access policies. Obviously, some terms are easier to interpret than others (the absolute date really requires none at all), and the 'default' embargo logic understands only the most basic terms (the first and third examples above). But as we will see below, the embargo system provides you with the ability to add in your own 'interpreters' to cope with any terms expressions you wish to have. The date that results from the interpretation is stored with the item, and the embargo system detects when that date has passed and removes the embargo ("lifts it"), so the item bitstreams become available. Here is a more detailed life-cycle for an embargoed item:

6.6.1.1.1. Terms assignment

The first step in placing an embargo on an item is to attach (assign) 'terms' to it. If these terms are missing, no embargo will be imposed. As we will see below, terms are carried in a configurable DSpace metadata field, so assigning terms just means assigning a value to a metadata field. This can be done in a web submission user interface form, in a SWORD deposit package, a batch import, etc. - anywhere metadata is passed to DSpace. The terms are not immediately acted upon, and may be revised, corrected, removed, etc., up until the next stage of the life-cycle. Thus a submitter could enter one value, and a collection editor replace it, and only the last value will be used.
Since metadata fields are multivalued, theoretically there can be multiple terms values, but in the default implementation only one is recognized.

6.6.1.1.2. Terms interpretation/imposition

In DSpace terminology, when an Item has exited the last of any workflow steps (or if none have been defined for it), it is said to be 'installed' into the repository. At this precise time, the 'interpretation' of the terms occurs, and a computed 'lift date' is assigned, which like the terms is recorded in a configurable metadata field. It is important to understand that this interpretation happens only once (just like the installation), and cannot be revisited later. Thus, although an administrator can assign a new value to the metadata field holding the terms after the item has been installed, this will have no effect on the embargo, whose 'force' now resides entirely in the 'lift date' value. For this reason, you cannot embargo content already in your repository (at least using standard tools).

The other action taken at installation time is the actual imposition of the embargo. The default behavior here is simply to remove the read policies on all the bundles and bitstreams except for the "LICENSE" or "METADATA" bundles. See section V. below for how to alter this behavior. Also note that since these policy changes occur before installation, there is no time during which embargoed content is 'exposed' (accessible by non-administrators). The terms interpretation and imposition together are called 'setting' the embargo, and the component that performs them both is called the embargo 'setter'.

6.6.1.1.3. Embargo period

After an embargoed item has been installed, the policy restrictions remain in effect until removed. This is not an automatic process, however: a 'lifter' must be run periodically to look for items whose 'lift date' is past.
Note that this means the effective removal of an embargo is not the lift date, but the earliest date after the lift date that the lifter is run. Typically, a nightly cron-scheduled invocation of the lifter is more than adequate, given the granularity of embargo terms. Also note that during the embargo period, all metadata of the item remains visible. This default behavior can be changed. One final point to note is that the 'lift date', although it was computed and assigned during the previous stage, is in the end a regular metadata field. That means, if there are extraordinary circumstances that require an administrator (or collection editor - anyone with edit permissions on metadata) to change the lift date, they can do so. Thus, they can 'revise' the lift date without reference to the original terms. This date will be checked the next time the 'lifter' is run. One could immediately lift the embargo by setting the lift date to the current day, or change it to 'forever' to indefinitely postpone lifting.

6.6.1.1.4. Embargo lift

When the lifter discovers an item whose lift date is in the past, it removes (lifts) the embargo. The default behavior of the lifter is to add the resource policies that would have been added had the embargo not been imposed. That is, it replicates the standard DSpace behavior, in which an item inherits its policies from its owning collection. As with all other parts of the embargo system, you may replace or extend the default behavior of the lifter (see section V. below). You may wish, e.g., to send an email to an administrator or other interested parties when an embargoed item becomes available.

6.6.1.1.5. Post embargo

After the embargo has been lifted, the item ceases to respond to any of the embargo life-cycle events. The values of the metadata fields reflect essentially historical or provenance values. With the exception of the additional metadata fields, such items are indistinguishable from items that were never subject to embargo.
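The two decisions in this life-cycle, interpreting the terms once at installation and later checking whether the lift date has passed, can be sketched as follows. This is an illustration under stated assumptions, not the DSpace embargo setter or lifter: the class EmbargoLifecycleSketch and its methods are hypothetical, only absolute ISO dates and the open-ended term are understood (as in the default logic described above), an open-ended embargo is represented by a far-future sentinel date, and a lift date equal to today is treated as due.

```java
import java.time.LocalDate;
import java.time.format.DateTimeParseException;
import java.util.Optional;

// Hypothetical sketch of the embargo life-cycle decisions, not DSpace code.
final class EmbargoLifecycleSketch {
    /** Assumption: sentinel standing in for an open-ended ("forever") embargo. */
    static final LocalDate NEVER = LocalDate.MAX;

    /**
     * Interprets embargo terms into a lift date.
     * Empty result: no terms were assigned, so no embargo is imposed.
     */
    static Optional<LocalDate> interpretTerms(String terms, String openTerm) {
        if (terms == null || terms.trim().isEmpty()) {
            return Optional.empty();
        }
        String t = terms.trim();
        if (t.equals(openTerm)) {
            return Optional.of(NEVER);
        }
        try {
            // Absolute date such as "2020-09-12".
            return Optional.of(LocalDate.parse(t));
        } catch (DateTimeParseException e) {
            // Terms like "6 months" would need a custom interpreter.
            throw new IllegalArgumentException("terms not understood by the default logic: " + t, e);
        }
    }

    /** Lifter check. Assumption: a lift date on or before today counts as due. */
    static boolean isLiftDue(LocalDate liftDate, LocalDate today) {
        return !liftDate.isAfter(today);
    }

    public static void main(String[] args) {
        LocalDate lift = interpretTerms("2020-09-12", "forever").get();
        System.out.println(isLiftDue(lift, LocalDate.of(2020, 9, 11)));
        System.out.println(isLiftDue(lift, LocalDate.of(2020, 9, 13)));
    }
}
```

Because the lifter only acts when it runs, the check takes the current date as a parameter, which also makes the nightly-run behavior easy to test.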
6.6.1.2. Configuration
DSpace embargoes utilize standard metadata fields to hold both the 'terms' and the 'lift date'. Which fields you use are configurable, and no specific metadata element is dedicated or pre-defined for use in embargo. Rather, you specify exactly what field you want the embargo system to examine when it needs to find the terms or assign the lift date. The properties that specify these assignments live in dspace.cfg:
# DC metadata field to hold the user-supplied embargo terms
embargo.field.terms = SCHEMA.ELEMENT.QUALIFIER
# DC metadata field to hold computed "lift date" of embargo
embargo.field.lift = SCHEMA.ELEMENT.QUALIFIER
You replace the placeholder values with real metadata field names. If you only need the 'default' embargo behavior - which essentially accepts only absolute dates as 'terms' - this is the only configuration required, except as noted below. There is also a property for the special date of 'forever':
# string in terms field to indicate indefinite embargo
embargo.terms.open = forever
which you may change to suit linguistic or other preference. You are free to use existing metadata fields, or create new fields. If you choose the latter, you must understand that the embargo system does not create or configure these fields: you must follow all the standard documented procedures for actually creating them (e.g. adding them to the metadata registry, or to display templates, etc.); this does not happen automatically. Likewise, if you want the field for 'terms' to appear in submission screens and workflows, you must follow the documented procedure for configurable submission (basically, this means adding the field to input-forms.xml). The flexibility of metadata configuration makes it easy for you to restrict embargoes to specific collections, since configurable submission can be defined per collection. Key recommendations:
1. If using existing metadata fields, avoid any that are automatically managed by DSpace. For example, fields like 'date.issued' or 'date.accessioned' are normally automatically assigned, and thus must not be recruited for embargo use.
2. Do not place the field for 'lift date' in submission screens. This can potentially confuse submitters because they may feel that they can directly assign values to it. As noted in the life-cycle above, this is erroneous: the lift date gets assigned by the embargo system based on the terms. Any pre-existing value will be overwritten. But see the next recommendation for an exception.
3. As the life-cycle discussion above makes clear, after the terms are applied, that field is no longer actionable in the embargo system. Conversely, the 'lift date' field is not actionable until the terms have been applied. Thus you may want to consider configuring both the 'terms' and 'lift date' to use the same metadata field. In this way, during workflow you would see only the terms, and after item installation, only the lift date. If you wish the metadata to retain the terms for any reason, use 2 distinct fields instead.
6.6.1.3. Operation
After the fields defined for terms and lift date have been assigned in dspace.cfg, and created and configured wherever they will be used, you can begin to embargo items simply by entering data (dates, if using the default setter) in the terms field. They will automatically be embargoed as they exit workflow. For the embargo to be lifted on any item, however, a new administrative procedure must be added: the 'embargo lifter' must be invoked on a regular basis. This task examines all embargoed items, and if their 'lift date' has passed, it removes the access restrictions on the item. Good practice dictates automating this procedure using cron jobs or the like, rather than manually running it. The lifter is available as a target of the 1.6 DSpace launcher - see launcher documentation for details.
controls which setter to use.
6.6.1.4.2. Lifter
The default lifter behavior as described above - essentially applying the collection policy rules to the item - might also not be sufficient for all purposes. It can also be replaced with another class:
# implementation of embargo lifter plugin - - replace with local implementation if applicable
Once the feature is enabled, the mapping is configured by a separate configuration file located here:
${dspace.dir}/config/google-metadata.properties
This file contains name/value pairs linking meta-tags with DSpace metadata fields. E.g.
google.citation_title = dc.title
google.citation_publisher = dc.publisher
google.citation_authors = dc.author | dc.contributor.author | dc.creator
There is further documentation in this configuration file explaining proper syntax in specifying which metadata fields to use. If a value is omitted for a meta-tag field, the meta-tag is simply not included in the HTML output. The values for each item are interpolated when the item is viewed, and the appropriate meta-tags are included in the HTML head tag, on both the Brief Item Display and the Full Item Display. This is implemented in the XMLUI and JSPUI.
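As a rough illustration of how such a mapping file could be read, here is a minimal sketch. It is not the DSpace implementation; the function names are hypothetical, and two points are assumptions rather than documented behavior: that "|" separates alternative fields of which the first one holding a value is used, and that the meta-tag name is the part of the key after the "google." prefix. The configuration file itself documents the authoritative syntax.

```python
from typing import Dict, List


def parse_mapping(text: str) -> Dict[str, List[str]]:
    """Read name/value pairs; each value becomes a list of candidate fields."""
    mapping: Dict[str, List[str]] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, value = line.partition("=")
        # Assumption: "|" separates alternative metadata fields.
        mapping[name.strip()] = [f.strip() for f in value.split("|") if f.strip()]
    return mapping


def resolve(mapping: Dict[str, List[str]], item: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Pick values for each meta-tag; tags without a usable value are left out."""
    tags: Dict[str, List[str]] = {}
    for key, fields in mapping.items():
        tag = key.partition(".")[2] or key  # assumed: drop the "google." prefix
        for field in fields:
            values = item.get(field)
            if values:
                tags[tag] = list(values)
                break  # assumed: first alternative with a value wins
    return tags


config = """
google.citation_title = dc.title
google.citation_publisher =
google.citation_authors = dc.author | dc.contributor.author | dc.creator
"""
item = {"dc.title": ["A Thesis"], "dc.contributor.author": ["Doe, J.", "Roe, R."]}
print(resolve(parse_mapping(config), item))
```

Note how the publisher entry, which has no value, produces no tag at all, matching the behavior described above for omitted values.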
7.1. Configuration
The user will need to refer to the extensive WebUI/JSPUI configurations that are contained in JSP Web Interface Settings.
Heavy use is made of a style sheet, styles.css. If you make edits, copy the local version to [jsp.custom-dir]/dspace/modules/jspui/src/main/webapp/styles.css, and it will be used automatically in preference to the default, as described above.
Customizing the JSP pages
Fonts and colors can be easily changed using the stylesheet. The stylesheet is a JSP so that the user's browser version can be detected and the stylesheet tweaked accordingly. The 'layout' of each page, that is, the top and bottom banners and the navigation bar, is determined by the JSPs /layout/header-*.jsp and /layout/footer-*.jsp. You can provide modified versions of these (in [jsp.custom-dir]/dspace/modules/jspui/src/main/webapp/layout), or define more styles and apply them to pages by using the "style" attribute of the dspace:layout tag.
1. Rebuild the DSpace installation package by running the following command from your [dspace-source]/dspace/ directory:
mvn package
2. Update all DSpace webapps to [dspace]/webapps by running the following command from your [dspace-source]/dspace/target/dspace-[version]-build.dir directory:
ant -Dconfig=[dspace]/config/dspace.cfg update
4. Restart Tomcat
When you restart the web server you should see your customized JSPs.
bugging problems in a running DSpace instance, especially in the workflow process. The default value is false, i.e., no one may assume the login of another user.

Property: xmlui.user.loginredirect
Example Value: xmlui.user.loginredirect = /profile
Informational Note: After a user has logged into the system, to which URL should they be directed? Leave this parameter blank or undefined to direct users to the homepage, or use /profile for the user's profile; another reasonable choice is /submissions, to see if the user has any tasks awaiting their attention. The default is the repository home page.

Property: xmlui.theme.allowoverrides
Example Value: xmlui.theme.allowoverrides = false
Informational Note: Allow the user to override which theme is used to display a particular page. When submitting a request, add the HTTP parameter "themepath", which corresponds to a particular theme; the specified theme will then be used instead of any other configured theme. Note that this is a potential security hole allowing execution of unintended code on the server. This option is only for development and debugging; it should be turned off for any production repository. The default value, unless otherwise specified, is "false".

Property: xmlui.bundle.upload
Example Value: xmlui.bundle.upload = ORIGINAL, METADATA, THUMBNAIL, LICENSE, CC_LICENSE
Informational Note: Determine which bundles administrators and collection administrators may upload into an existing item through the administrative interface. If the user does not have the appropriate privileges (add and write) on the bundle then that bundle will not be shown to the user as an option.

Property: xmlui.community-list.render.full
Example Value: xmlui.community-list.render.full = true
Informational Note: On the community-list page, should all the metadata about a community/collection be available to the theme? This parameter defaults to true, but if you are experiencing performance problems on the community-list page you should experiment with turning this option off.

Property: xmlui.community-list.cache
Example Value: xmlui.community-list.cache = 12 hours
Informational Note: Normally, Manakin will fully verify any cached pages before using a cached copy. This means that when the community-list page is viewed the database is queried for each community/collection to see if its metadata has been modified. This can be expensive for repositories with a large community tree. To help solve this problem you can set the cache to be assumed valid for a specific period of time. The downside of this is that new or edited communities/collections may not show up on the website for a period of time.

Property: xmlui.bitstream.mods
Example Value: xmlui.bitstream.mods = true
Informational Note: Optionally, you may configure Manakin to take advantage of metadata stored as a bitstream. The MODS metadata file must be inside the "METADATA" bundle and named MODS.xml. If this option is set to "true" and the bitstream is present then it is made available to the theme for display.

Property: xmlui.bitstream.mets
Example Value: xmlui.bitstream.mets = true
Informational Note: Optionally, you may configure Manakin to take advantage of metadata stored as a bitstream. The METS metadata file must be inside the "METADATA" bundle and named METS.xml. If this option is set to "true" and the bitstream is present then it is made available to the theme for display.

Property: xmlui.google.analytics.key
Example Value: xmlui.google.analytics.key = UA-XXXXXX-X
Informational Note: If you would like to use Google Analytics to track general website statistics then use this parameter to provide your analytics key. First sign up for an account at http://analytics.google.com, then create an entry for your repository's website. Google Analytics will give you a snippet of JavaScript code to place on your site; inside that snippet is your Google Analytics key, usually found in the line: _uacct = "UA-XXXXXXX-X". Take this key (just the UA-XXXXXX-X part) and place it here in this parameter.

Property: xmlui.controlpanel.activity.max
Example Value: xmlui.controlpanel.activity.max = 250
Informational Note: Assign how many page views will be recorded and displayed in the control panel's activity viewer. The activity tab allows an administrator to debug problems in a running DSpace by understanding who is currently using their DSpace and how. The default value is 250.

Property: xmlui.controlpanel.activity.ipheader
Example Value: xmlui.controlpanel.activity.ipheader = X-Forward-For
Informational Note: Determine where the control panel's activity viewer receives an event's IP address from. If your DSpace is in a load-balanced environment or otherwise behind a context-switch then you will need to set the parameter to the HTTP header that records the original IP address.
8.2.1. Aspects
The <aspects> section defines the "Aspect Chain", or the linear set of aspects that are installed in the repository. For each aspect that is installed in the repository, the aspect makes available new features to the interface. For example, if the "submission" aspect were to be commented out or removed from the xmlui.xconf, then users would not be able to submit new items into the repository (even the links and language prompting users to submit items are removed). Each <aspect> element has two attributes, name and path. The name is used to identify the Aspect, while the path determines the directory where the aspect's code is located. Here is the default aspect configuration:
<aspects>
    <aspect name="Artifact Browser" path="resource://aspects/ArtifactBrowser/" />
    <aspect name="Administration" path="resource://aspects/Administrative/" />
    <aspect name="E-Person" path="resource://aspects/EPerson/" />
    <aspect name="Submission and Workflow" path="resource://aspects/Submission/" />
</aspects>
A standard distribution of Manakin/DSpace includes four "core" aspects:
Artifact Browser
The Artifact Browser Aspect is responsible for browsing communities, collections, items and bitstreams, viewing an individual item and searching the repository.
E-Person
The E-Person Aspect is responsible for logging in, logging out, registering new users, dealing with forgotten passwords, editing profiles and changing passwords.
Submission
The Submission Aspect is responsible for submitting new items to DSpace, determining the workflow process and ingesting the new items into the DSpace repository.
Administrative
The Administrative Aspect is responsible for administrating DSpace, such as creating, modifying and removing all communities, collections, e-persons, groups, registries and authorizations.
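To check which aspects a given configuration installs, and in which order, the <aspects> fragment can be read with any XML parser. The sketch below is only an illustration: it parses the default fragment shown above from an inline string, and the function name is hypothetical. Reading the real xmlui.xconf would additionally require knowing its surrounding structure, which is not covered here.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# The default aspect configuration, copied from the listing above.
ASPECTS_XML = """
<aspects>
    <aspect name="Artifact Browser" path="resource://aspects/ArtifactBrowser/" />
    <aspect name="Administration" path="resource://aspects/Administrative/" />
    <aspect name="E-Person" path="resource://aspects/EPerson/" />
    <aspect name="Submission and Workflow" path="resource://aspects/Submission/" />
</aspects>
"""


def aspect_chain(xml_text: str) -> List[Tuple[str, str]]:
    """Return (name, path) pairs in the order the aspects are declared."""
    root = ET.fromstring(xml_text.strip())
    return [(a.get("name"), a.get("path")) for a in root.iter("aspect")]


for name, path in aspect_chain(ASPECTS_XML):
    print(name, "->", path)
```

Because XML comments are skipped by the parser, an aspect that has been commented out simply does not appear in the returned chain.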
8.2.2. Themes
The <themes> section defines a set of "rules" that determine where themes are installed in the repository. Each rule is processed in the order that it appears, and the first rule that matches determines the theme that is applied (so order is important). Each rule consists of a <theme> element with several possible attributes:
name (always required)
The name attribute is used to document the theme's name.
path (always required)
The path attribute determines where the theme is located relative to the themes/ directory and must either contain a trailing slash or point directly to the theme's sitemap.xmap file.
regex (either regex and/or handle is required)
The regex attribute determines which URLs the theme should apply to.
handle (either regex and/or handle is required)
The handle attribute determines which community, collection, or item the theme should apply to.
If you use the "handle" attribute, the effect is cascading, meaning if a rule is established for a community then all collections and items within that community will also have this theme applied to them. Here is an example configuration:
<themes>
    <theme name="Theme 1" handle="123456789/23" path="theme1/"/>
    <theme name="Theme 2" regex="community-list" path="theme2/"/>
    <theme name="Reference Theme" regex=".*" path="Reference/"/>
</themes>
In the example above three themes are configured: "Theme 1", "Theme 2", and the "Reference Theme". The first rule specifies that "Theme 1" will apply to all communities, collections, or items that are contained under the parent community "123456789/23". The next rule specifies that any URL containing the string "community-list" will get "Theme 2". The final rule, using the regular expression ".*", will match anything, so all pages which have not matched one of the preceding rules will be matched to the Reference Theme.
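The first-match behavior of these rules can be sketched in a few lines. This is an illustration only, not Manakin's code: the rule list mirrors the example configuration, the function name is hypothetical, the cascading effect is modeled by passing in the handles of the requested object and all of its ancestors, and a rule carrying both attributes is assumed to require both to match.

```python
import re
from typing import Dict, List, Optional, Sequence

# Rules in configuration order, mirroring the example above.
THEME_RULES: List[Dict[str, str]] = [
    {"name": "Theme 1", "handle": "123456789/23", "path": "theme1/"},
    {"name": "Theme 2", "regex": "community-list", "path": "theme2/"},
    {"name": "Reference Theme", "regex": ".*", "path": "Reference/"},
]


def resolve_theme(url: str, handles: Sequence[str] = (),
                  rules: Sequence[Dict[str, str]] = tuple(THEME_RULES)) -> Optional[str]:
    """Return the name of the first rule that matches, or None.

    handles: handle of the requested object plus those of its parent
    collections and communities (this is how cascading is modeled here).
    """
    for rule in rules:
        handle_ok = "handle" not in rule or rule["handle"] in handles
        # re.search matches anywhere in the URL, so "community-list"
        # behaves as a "contains" test.
        regex_ok = "regex" not in rule or re.search(rule["regex"], url) is not None
        if handle_ok and regex_ok:
            return rule["name"]
    return None


# An item inside community 123456789/23 inherits "Theme 1".
print(resolve_theme("handle/123456789/40", handles=["123456789/40", "123456789/23"]))
```

Moving the ".*" rule to the top of the list would make it win for every request, which is why the catch-all rule belongs last.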
in different parts of the repository. The central component of a theme is the sitemap.xmap, which defines what resources are available to the theme such as XSL stylesheets, CSS stylesheets, images, or multimedia files.
1) Create theme skeleton
Most theme developers do not create a new theme from scratch; instead they start from the standard theme template, which defines a skeleton structure for a theme. The template is located at: [dspace-source]/dspace-xmlui/dspace-xmlui-webapp/src/main/webapp/themes/template. To start your new theme simply copy the theme template into your locally defined modules directory, [dspace-source]/dspace/modules/xmlui/src/main/webapp/themes/[your theme's directory]/.
2) Modify theme variables
The next step is to modify the theme's parameters so that the theme knows where it is located. Open the [your theme's directory]/sitemap.xmap and look for <global-variables>:
<global-variables>
    <theme-path>[your theme's directory]</theme-path>
    <theme-name>[your theme's name]</theme-name>
</global-variables>
Update both values: set the theme's path to the directory name you created in step one; the theme's name is used only for documentation.
3) Add your CSS stylesheets
The base theme template will produce a repository interface without any style - just plain XHTML with no color or formatting. To make your theme useful you will need to supply a CSS stylesheet that creates your desired look-and-feel. Add your new CSS stylesheets:
[your theme's directory]/lib/style.css (The base style sheet used for all browsers)
[your theme's directory]/lib/style-ie.css (Specific stylesheet used for Internet Explorer)
4) Install theme and rebuild DSpace
Next rebuild and deploy DSpace (replace <version> with your current release):
1. Rebuild the DSpace installation package by running the following command from your [dspace-source]/dspace/ directory:
mvn package
2. Update all DSpace webapps to [dspace]/webapps by running the following command from your [dspace-source]/dspace/target/dspace-[version]-build.dir directory:
ant -Dconfig=[dspace]/config/dspace.cfg update
4. Restart Tomcat
This will ensure the theme has been installed as described in the previous section "Configuring Themes and Aspects".
The news document is located at [dspace]/dspace/config/news-xmlui.xml. There is only one version; it is localized by inserting "i18n" callouts into the text areas. It must be a complete and valid XML DRI document (see Chapter 15). The exact rendering of the News document in the XHTML UI depends, of course, on the theme. The default content is designed to operate with the reference themes, so when you modify it, be sure to preserve the tag structure, e.g. the exact attributes of the first DIV tag. Also note that the text is DRI, not HTML, so you must use only DRI tags, such as the XREF tag to construct a link. Example 1: a single language:
<document>
    <body>
        <div id="file.news.div.news" n="news" rend="primary">
            <head> TITLE OF YOUR REPOSITORY HERE </head>
            <p>
                INTRO MESSAGE HERE
                Welcome to my wonderful repository etc etc ...
                A service of <xref target="http://myuni.edu/">My University</xref>
            </p>
        </div>
    </body>
    <options/>
    <meta>
        <userMeta/>
        <pageMeta/>
        <repositoryMeta/>
    </meta>
</document>
to web spiders/crawlers. However, you may also add static HTML (*.html) content to this directory, as needed for your installation. Any static HTML content you add to this directory may also reference static content (e.g. CSS, Javascript, Images, etc.) from the same [dspace-source]/dspace/modules/xmlui/src/main/webapp/static/ directory. You may reference other static content from your static HTML files similar to the following:
<link href="./static/mystyle.css" rel="stylesheet" type="text/css"/>
<img src="./static/images/static-image.gif" alt="Static image in /static/images/ directory"/>
<img src="./static/static-image.jpg" alt="Static image in /static/ directory"/>
either in XMLUI or from the command line. If the scheduler is running, "Import Now" will handle the harvest task as a separate thread. "Reset and Reimport Collection" will perform the same function as "Import Now", but will clear the collection of all existing items before doing so.
Mirage Configuration and Customization
• Clean new look and feel.
• Increased browser compatibility. The whole theme renders perfectly in today's modern browsers (Internet Explorer 7 and higher, Firefox, Safari, Chrome, ...)
• Easier to customize.
• Enhanced Performance
The 'file' list style immediately shows you whether files are attached to the items, by displaying a large thumbnail icon for each of the items.
8.9.2.3.2. Structural enhancements for easier customization
• Based on the new restructured dri2xhtml base templates. Templates in the theme, overriding the new base templates, are located in the same folder hierarchy to ensure full transparency.
• Automated browser feature detection for improved browser compatibility. In other themes, user agent detection is used to identify which browser version your user is using. Based on the result of this detection, the theme would use a different cascading style sheet (CSS) to render a compatible page for the visitor. This approach has two major issues:
  • User agent detection isn't very reliable.
  • Maintaining these different CSS files is a maintenance nightmare for developers, especially when using features from newer browsers.
  Mirage applies two novel techniques to resolve these issues:
  • For compatibility with older Internet Explorer browsers, conditional comments give the body tag a class corresponding to the version of IE.
  • Modernizr is used to detect which CSS features are available in the user's browser. This way you can target all browsers that support a certain feature using CSS classes, and rules affecting the same element can be put together in the same place for all browsers.
• CSS files are now split up according to function instead of browser. style.css will now fit most needs for customization. The following additional CSS files are included, but will rarely need to be changed:
  • reset.css ensures that browser-specific initializations are being reset.
  • base.css contains a few base styles.
  • helper.css contains helper classes to deal with specific functionality.
  • handheld.css and print.css enable you to define styles for handheld devices and printing of pages.
• jQuery and jQueryUI are included by default.
To avoid conflicts, the authority control JavaScript has been rewritten to use jQuery instead of Prototype and Script.aculo.us.
8.9.2.3.3. Enhanced Performance
• Concatenation and minification techniques for CSS and JS files. The IncludePageMeta has been extended to generate URLs to the concatenated version of all CSS files using the same media tag. The ConcatenationReader has been created to return concatenated and minified versions of the CSS and JS files. Once JS and CSS files have been minified and concatenated, they are properly cached. As a result, the minification and concatenation operations only need to happen once, and do not add performance overhead.
Caution: when minification is enabled, all code comments will be removed. This could be a problem for comments containing copyright notices, so for files with those comments you should disable minification by adding '?nominify' after the URL, e.g.
<map:parameter name="javascript" value="lib/js/jquery-ui-1.8.5.custom.min.js?nominify"/>
Disabled by default, these features need to be enabled in the configuration using the properties 'xmlui.theme.enableConcatenation' and 'xmlui.theme.enableMinification'. These features can be enabled for other themes as well, but will require an alteration of the theme's sitemap.
• Javascript references are included at the bottom of the page instead of the top. This optimizes page load times in general.
8.9.2.4. Troubleshooting
8.9.2.4.1. Errors using HTTPS
DSpace 1.7.0 ships with a hardcoded http:// link for jQuery, causing problems for users running 1.7.0 Mirage on HTTPS. While awaiting the implementation of a fix in an upcoming release, you can work around the problem in the addJavascript template of the following file: lib/core/page-structure.xsl. In this file, you will need to replace
<script type="text/javascript" src="http://ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js"> </script>
with
<script type="text/javascript">
    <xsl:text disable-output-escaping="yes">var JsHost = (("https:" == document.location.protocol) ? "https://" : "http://");
    document.write(unescape("%3Cscript src='" + JsHost + "ajax.googleapis.com/ajax/libs/jquery/1.4.2/jquery.min.js' type='text/javascript'%3E%3C/script%3E"));</xsl:text>
</script>
8.10.1. dri2xhtml
The dri2xhtml base template is the original template for creating XMLUI themes. It attempts to provide generic XSLT templates which are then applied across the entire DSpace site, thus making it easier to make site-wide changes.
The dri2xhtml base template is used in the following Themes:
• Reference - the default XMLUI theme
• Classic - an XMLUI theme which looks similar to JSPUI
• Kubrick
8.10.2. dri2xhtml-alt
The dri2xhtml-alt base template is an alternative template for creating XMLUI themes. It contains the same XSLT templates as dri2xhtml, but they are divided into multiple files and folders. Each file attempts to group XSLT templates together based on their function, in order to make it easier to find the templates related to the feature you're trying to modify. The dri2xhtml-alt base template is used in the following Themes:
• Mirage
Because the contents of dri2xhtml-alt are identical to the current dri2xhtml.xsl and its derivatives, updating any of the existing themes to reference the new dri2xhtml-alt should not change the rendering of the pages.
8.10.2.2. Features
• No changes to existing templates found in legacy dri2xhtml
• Drops inclusion of Handlers other than DIM and Default
• Templates divided out into files so they can be more easily located, divided by Aspect, Page and Functionality
9. System Administration
DSpace operates on several levels: as a Tomcat servlet, cron jobs, and on-demand operations. This section explains many of the on-demand operations. Some of the command operations may also be set up as cron jobs. Many of these operations are performed at the Command Line Interface (CLI), also known as the Unix prompt ($:). Future references will use the term CLI when the user needs to be at the command line. Below is the "Command Help Table". This table explains what data is contained in the individual command/help tables in the sections that follow.
Command used: The directory and where the command is to be found.
Java class: The actual java program doing the work.
Arguments: The required/mandatory or optional arguments available to the user.
The administrator needs to build the source XML document in the following format:
<import_structure>
    <community>
        <name>Community Name</name>
        <description>Descriptive text</description>
        <intro>Introductory text</intro>
        <copyright>Special copyright notice</copyright>
        <sidebar>Sidebar text</sidebar>
        <community>
            <name>Sub Community Name</name>
            <community> ...[ad infinitum]... </community>
        </community>
This command-line tool gives you the ability to import a community and collection structure directly from a source XML file. It is executed as follows:
[dspace]/bin/dspace structure-builder -f /path/to/source.xml -o path/to/output.xml -e admin@user.com
This will examine the contents of source.xml, import the structure into DSpace while logged in as the supplied administrator, and then output the same structure to the output file, but including the handle for each imported community and collection as an attribute.
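For repositories with many communities, the source file can be generated rather than typed by hand. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, and only element names that appear in the format shown in this section (import_structure, community, name, description) are used. It builds nested community elements and serializes them; it does not call DSpace.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable, Optional


def add_community(parent: ET.Element, name: str,
                  description: Optional[str] = None,
                  subcommunities: Iterable[Dict] = ()) -> ET.Element:
    """Append a <community> element (and, recursively, its sub-communities)."""
    node = ET.SubElement(parent, "community")
    ET.SubElement(node, "name").text = name
    if description is not None:
        ET.SubElement(node, "description").text = description
    for sub in subcommunities:
        add_community(node, **sub)
    return node


def build_structure(communities: Iterable[Dict]) -> str:
    """Return the <import_structure> document as a string."""
    root = ET.Element("import_structure")
    for community in communities:
        add_community(root, **community)
    return ET.tostring(root, encoding="unicode")


source_xml = build_structure([
    {
        "name": "Community Name",
        "description": "Descriptive text",
        "subcommunities": [{"name": "Sub Community Name"}],
    }
])
print(source_xml)
```

The resulting string can be written to a file and passed to the structure-builder with the -f option shown above.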
9.1.1. Limitation
Currently this tool does not export community and collection structures, although it should require only a small modification to do so.
Package Importer and Exporter
To see all the options, invoke it as:
[dspace]/bin/dspace packager --help
This mode also displays a list of the names of package ingestion and dissemination plugins that are currently installed in your DSpace. Each Packager plugin also may allow for custom options, which may provide you more control over how a package is imported or exported. You can see a listing of all specific packager options by invoking --help (or -h) with the --type (or -t) option:
[dspace]/bin/dspace packager --help --type METS
The above example will display the normal help message, while also listing any additional options available to the "METS" packager plugin.
9.2.1. Ingesting
9.2.1.1. Ingestion Modes & Options
When ingesting packages DSpace supports several different "modes". (Please note that not all packager plugins may support all modes of ingestion.)
1. Submit/Ingest Mode (-s option, default) - submit package to DSpace in order to create a new object(s)
2. Restore Mode (-r option) - restore pre-existing object(s) in DSpace based on package(s). This also attempts to restore all handles and relationships (parent/child objects). This is a specialized type of "submit", where the object is created with a known Handle and known relationships.
3. Replace Mode (-r -f option) - replace existing object(s) in DSpace based on package(s). This also attempts to restore all handles and relationships (parent/child objects). This is a specialized type of "restore" where the contents of existing object(s) are replaced by the contents in the AIP(s). By default, if a normal "restore" finds the object already exists, it will back out (i.e. rollback all changes) and report which object already exists.
9.2.1.1.1. Ingesting a Single Package
To ingest a single package from a file, give the command:
[dspace]/bin/dspace packager -e [user-email] -p [parent-handle] -t [packager-name] /full/path/to/package
Where [user-email] is the e-mail address of the E-Person under whose authority this runs; [parent-handle] is the Handle of the Parent Object into which the package is ingested, [packager-name] is the plugin name of the package ingester to use, and /full/path/to/package is the path to the file to ingest (or "-" to read from the standard input). Here is an example that loads a PDF file with internal metadata as a package:
[dspace]/bin/dspace packager -e admin@myu.edu -p 4321/10 -t PDF thesis.pdf
This example takes the result of retrieving a URL and ingests it:
9.2.1.1.2. Ingesting Multiple Packages at Once
Some Packager plugins support bulk ingest functionality using the --all (or -a) flag. When --all is used, the packager will attempt to ingest all child packages referenced by the initial package (and continue on recursively). Some examples follow:
• For a Site-based package - this would ingest all Communities, Collections & Items based on the located package files
• For a Community-based package - this would ingest that Community and all SubCommunities, Collections and Items based on the located package files
• For a Collection - this would ingest that Collection and all contained Items based on the located package files
• For an Item - this just ingests the Item (including all Bitstreams & Bundles) based on the package file.
Here is a basic example of a bulk ingest 'packager' command template:
[dspace]/bin/dspace packager -s -a -t AIP -e <eperson> -p <parent-handle> <file-path>
for example:
[dspace]/bin/dspace packager -s -a -t AIP -e admin@myu.edu -p 4321/12 collection-aip.zip
The above command will ingest the package named "collection-aip.zip" as a child of the specified Parent Object (handle="4321/12"). The resulting object is assigned a new Handle (since -s is specified). In addition, any child packages directly referenced by "collection-aip.zip" are also recursively ingested (a new Handle is also assigned for each child AIP).
2. Restore, Keep Existing Mode (-r -k) = Attempt to restore object (and optionally children). If an object is found to already exist, skip over it (and all children objects), and continue to restore all other non-existing objects.
3. Force Replace Mode (-r -f) = Restore an object (and optionally children) and overwrite any existing objects in DSpace. Therefore, if an object is found to already exist in DSpace, its contents are replaced by the contents of the package. WARNING: This mode is potentially dangerous as it will permanently destroy any object contents that do not currently exist in the package. You may want to first perform a backup, unless you are sure you know what you are doing!
9.2.1.2.1. Default Restore Mode
By default, the restore mode (-r option) will rollback all changes if any object is found to already exist. The user will be informed which object already exists within their DSpace installation. Use this 'packager' command template:
[dspace]/bin/dspace packager -r -t AIP -e <eperson> <file-path>
For example:
[dspace]/bin/dspace packager -r -t AIP -e admin@myu.edu aip4567.zip
Notice that unlike the -s option (for submission/ingesting), the -r option does not require the Parent Object (-p option) to be specified if it can be determined from the package itself. In the above example, the package "aip4567.zip" is restored to the DSpace installation with the Handle provided within the package itself (and added as a child of the parent object specified within the package itself). If the object is found to already exist, all changes are rolled back (i.e. nothing is restored to DSpace).
9.2.1.2.2. Restore, Keep Existing Mode
When the "Keep Existing" flag (-k option) is specified, the restore will attempt to skip over any objects found to already exist. It will report to the user that the object was found to exist (and was not modified or changed). It will then continue to restore all objects which do not already exist. This flag is most useful when attempting a bulk restore (using the --all (or -a) option). One special case to note: If a Collection or Community is found to already exist, its child objects are also skipped over. So, this mode will not auto-restore items to an existing Collection. Here's an example of how to use this 'packager' command:
[dspace]/bin/dspace packager -r -a -k -t AIP -e <eperson> <file-path>
For example:
[dspace]/bin/dspace packager -r -a -k -t AIP -e admin@myu.edu aip4567.zip
In the above example, the package "aip4567.zip" is restored to the DSpace installation with the Handle provided within the package itself (and added as a child of the parent object specified within the package itself). In addition, any child packages referenced by "aip4567.zip" are also recursively restored (the -a option specifies to also restore all child packages). They are also restored with the Handles & Parent Objects provided with their package. If any object is found to already exist, it is skipped over (child objects are also skipped). All non-existing objects are restored.
9.2.1.2.3. Force Replace Mode
When the "Force Replace" flag (-f option) is specified, the restore will overwrite any objects found to already exist in DSpace. In other words, existing content is deleted and then replaced by the contents of the package(s).
For example:
[dspace]/bin/dspace packager -r -f -t AIP -e admin@myu.edu aip4567.zip
In the above example, the package "aip4567.zip" is restored to the DSpace installation with the Handle provided within the package itself (and added as a child of the parent object specified within the package itself). In addition, any child packages referenced by "aip4567.zip" are also recursively ingested. They are also restored with the Handles & Parent Objects provided with their package. If any object is found to already exist, its contents are replaced by the contents of the appropriate package. If any error occurs, the script attempts to roll back the entire replacement process.
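The three restore modes above differ only in one flag, which the following minimal Python sketch makes explicit. The function name, the mode labels and the idea of assembling the argument list in a script are assumptions made for illustration; only the flags (-r, -a, -k, -f, -t AIP, -e) come from the text above.

```python
def build_restore_command(eperson, package, mode="default",
                          restore_children=False,
                          dspace_bin="[dspace]/bin/dspace"):
    """Return the 'packager' argument list for one of the three restore modes.

    mode is a label invented for this sketch:
      "default"        -> -r       (roll back if any object already exists)
      "keep-existing"  -> -r -k    (skip objects that already exist)
      "force-replace"  -> -r -f    (overwrite objects that already exist)
    """
    mode_flags = {"default": [], "keep-existing": ["-k"], "force-replace": ["-f"]}
    if mode not in mode_flags:
        raise ValueError("unknown restore mode: %s" % mode)
    args = [dspace_bin, "packager", "-r"]
    if restore_children:
        args.append("-a")  # also restore all child packages
    args += mode_flags[mode]
    args += ["-t", "AIP", "-e", eperson, package]
    return args


# Reproduces the "Keep Existing" bulk restore example as an argument list.
print(" ".join(build_restore_command("admin@myu.edu", "aip4567.zip",
                                     mode="keep-existing",
                                     restore_children=True)))
```

Replace the [dspace] placeholder with the real installation directory before passing the list to a process launcher.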
9.2.2. Disseminating
9.2.2.1. Disseminating a Single Object
To disseminate a single object as a package, give the command:
[dspace]/bin/dspace packager -d -e [user-email] -i [handle] -t [packager-name] [file-path]
Where [user-email] is the e-mail address of the E-Person under whose authority this runs; [handle] is the Handle of the Object to disseminate; [packager-name] is the plugin name of the package disseminator to use; and [file-path] is the path to the file to create (or "-" to write to the standard output). For example:
[dspace]/bin/dspace packager -d -t METS -e admin@myu.edu -i 4321/4567 4567.zip
The above code will export the object of the given handle (4321/4567) into a METS file named "4567.zip".
for example:
[dspace]/bin/dspace packager -d -a -t METS -e admin@myu.edu -i 4321/4567 4567.zip
The above code will export the object of the given handle (4321/4567) into a METS file named "4567.zip". In addition it would export all children objects to the same directory as the "4567.zip" file.
The dublin_core.xml or metadata_[prefix].xml file has the following format, where each metadata element has its own entry within a <dcvalue> tagset. There are currently three tag attributes available in the <dcvalue> tagset:
<element> - the Dublin Core element
<qualifier> - the element's qualifier
<language> - (optional) ISO language code for element
<dublin_core>
  <dcvalue element="title" qualifier="none">A Tale of Two Cities</dcvalue>
  <dcvalue element="date" qualifier="issued">1990</dcvalue>
  <dcvalue element="title" qualifier="alternate" language="fr">J'aime les Printemps</dcvalue>
</dublin_core>
(Note the optional language tag attribute, which notifies the system that the optional title is in French.)
Every metadata field used must first be registered via the metadata registry of the DSpace instance.
The contents file simply enumerates, one file per line, the bitstream file names. See the following example:
file_1.doc
file_2.pdf
license
Please notice that the license is optional, and if you wish to have one included, you can place the file in the .../item_001/ directory, for example.
The bitstream name may optionally be followed by the sequence: \tbundle:bundlename where '\t' is the tab character and 'bundlename' is replaced by the name of the bundle to which the bitstream should be added. If no bundle is specified, the bitstream will be added to the 'ORIGINAL' bundle.
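The contents file layout described above can be checked before an import with a few lines of Python. This is only a sketch of the rule as stated here (one file name per line, an optional tab-separated bundle:bundlename field, ORIGINAL as the default); the function name is hypothetical and the real importer may apply further rules that this section does not describe.

```python
def parse_contents(text):
    """Parse the text of a contents file into (filename, bundle) pairs."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        name = fields[0].strip()
        bundle = "ORIGINAL"  # default bundle when none is given
        for field in fields[1:]:
            if field.startswith("bundle:"):
                bundle = field[len("bundle:"):].strip()
        entries.append((name, bundle))
    return entries


sample = "file_1.doc\nfile_2.pdf\tbundle:TEXT\n"
for name, bundle in parse_contents(sample):
    print(name, "->", bundle)
```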
1. Create a separate file for the other schema named "metadata_{prefix}.xml", where the {prefix} is replaced with the schema's prefix.
2. Inside the xml file use the same Dublin Core syntax, but on the <dublin_core> element include the attribute "schema={prefix}".
3. Here is an example for ETD metadata, which would be in the file "metadata_etd.xml":
<?xml version="1.0" encoding="UTF-8"?>
<dublin_core schema="etd">
  <dcvalue element="degree" qualifier="department">Computer Science</dcvalue>
  <dcvalue element="degree" qualifier="level">Masters</dcvalue>
  <dcvalue element="degree" qualifier="grantor">Texas A &amp; M</dcvalue>
</dublin_core>
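When such metadata files are generated by a script rather than typed by hand, an XML library takes care of escaping characters such as the ampersand in "Texas A & M". The sketch below uses Python's standard xml.etree.ElementTree module; the function name and the tuple layout of the input are assumptions made for this example and are not part of DSpace.

```python
import xml.etree.ElementTree as ET


def build_metadata_xml(fields, schema=None):
    """Build a <dublin_core> document from (element, qualifier, value, language) tuples.

    qualifier and language may be None; a missing qualifier is written as "none",
    matching the dublin_core.xml example in this section.
    """
    root = ET.Element("dublin_core")
    if schema:
        root.set("schema", schema)  # e.g. "etd" for metadata_etd.xml
    for element, qualifier, value, language in fields:
        attrs = {"element": element, "qualifier": qualifier or "none"}
        if language:
            attrs["language"] = language
        ET.SubElement(root, "dcvalue", attrs).text = value
    return ET.tostring(root, encoding="unicode")


xml_text = build_metadata_xml(
    [("degree", "department", "Computer Science", None),
     ("degree", "grantor", "Texas A & M", None)],
    schema="etd",
)
print(xml_text)
```

The returned string has no XML declaration; add one when writing the file if your workflow requires it.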
Collection ID (either Handle (e.g. 123456789/14) or Database ID (e.g. 2))
Source directory where the items reside
Mapfile. Since you don't have one, you need to determine where it will be (e.g. /Import/Col_14/mapfile)
At the command line:
[dspace]/bin/dspace import --add --eperson=joe@user.com --collection=CollectionID --source=items_dir --mapfile=mapfile
The above command would cycle through the archive directory's items, import them, and then generate a map file which stores the mapping of item directories to item handles. SAVE THIS MAP FILE. You will need the map file later to replace or delete (unimport) the imported items.
Testing. You can add --test (or -t) to the command to simulate the entire import process without actually doing the import. This is extremely useful for verifying your import files before doing the actual import.
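Because the map file is needed for later replace and delete runs, it can be useful to inspect it from a script. The sketch below assumes that each line holds an item directory name and a handle separated by whitespace; that layout is an assumption of this example, since this section only states that the file maps item directories to item handles.

```python
def read_mapfile(text):
    """Return a dict mapping item directory names to handles.

    Assumes one 'item_directory handle' pair per line (an assumption of
    this sketch); lines that do not have exactly two fields are skipped.
    """
    mapping = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 2:
            continue
        item_dir, handle = parts
        mapping[item_dir] = handle
    return mapping


sample = "item_001 123456789/15\nitem_002 123456789/16\n"
print(read_mapfile(sample))
```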
Long form:
[dspace]/bin/dspace import --replace --eperson=joe@user.com --collection=collectionID --source=items_dir --mapfile=mapfile
In long form:
[dspace]/bin/dspace import --delete --mapfile mapfile
Templates. If you have templates that have constant data and you wish to apply that data during batch importing, add the --template (-p) argument.
Resume. If an error occurs during importing and the import is aborted, you can use the --resume (-R) flag to try to resume the import where you left off after you fix the error.
-m or --migrate
-h or --help
Table 4.
Exporting a Collection
To export a collection's items you type at the CLI:
Short form:
[dspace]/bin/dspace export -t COLLECTION -i CollID or Handle -d /path/to/destination -n Some_number
The keyword COLLECTION means that you intend to export an entire collection. The ID can either be the database ID or the handle. The exporter will begin numbering the simple archives with the sequence number that you supply.
Exporting a Single Item
To export a single item use the keyword ITEM and give the item ID as an argument:
[dspace]/bin/dspace export --type=ITEM --id=itemID --dest=dest_dir --number=seq_num
Each exported item will have an additional file in its directory, named 'handle'. This will contain the handle that was assigned to the item, and this file will be read by the importer so that items exported and then imported to another machine will retain the item's original handle.
The -m Argument
Using the -m argument will export the item/collection and also perform the migration step. It will perform the same process that the next section, Transferring Items Between DSpace Instances, performs. We recommend reading the next section before using this flag.
prior to running the item importer. This will remove the above metadata items, except for date.issued - if the item has been published or publicly distributed before and identifier.uri - if it is not the handle, from the dublin_core.xml file and remove all handle files. It will then be safe to run the item exporter.
Item Update
changes in metadata and bitstream contents. Those familiar with generating the source trees for ItemImporter will find a similar environment in the use of this batch processing tool.
For metadata, ItemUpdate can perform 'add' and 'delete' actions on specified metadata elements. For bitstreams, 'add' and 'delete' are similarly available. All these actions can be combined in a single batch run.
ItemUpdate supports an undo feature for all actions except bitstream deletion. There is also a test mode, as with ItemImport. However, unlike ItemImport, there is no resume feature for incomplete processing. There is more extensive logging with a summary statement at the end with counts of successful and unsuccessful items processed.
One probable scenario for using this tool is where there is an external primary data source for which the DSpace instance is a secondary or down-stream system. Metadata and/or bitstream content changes in the primary system can be exported to the simple archive format to be used by ItemUpdate to synchronize the changes.
A note on terminology:
item refers to a DSpace item.
metadata element refers generally to a qualified or unqualified element in a schema in the form [schema].[element].[qualifier] or [schema].[element] and occasionally in a more specific way to the second part of that form.
metadata field refers to a specific instance pairing a metadata element to a value.
-a or --addmetadata [metadata element]: Repeatable for multiple elements. The metadata element should be in the form dc.x or dc.x.y. The mandatory argument indicates the metadata fields in the dublin_core.xml file to be added unless already present. However, duplicate fields will not be added to the item metadata without warning or error.
-d or --deletemetadata [metadata element]: Repeatable for multiple elements. All metadata fields matching the element will be deleted.
-A or --addbitstream: Adds bitstreams listed in the contents file with the bitstream metadata cited there.
-D or --deletebitstream [filter plug classname or alias]: Not repeatable. With no argument, this operation deletes bitstreams listed in the delete_contents file. Only bitstream IDs are recognized identifiers for this operation. The optional filter argument is the classname of an implementation of the org.dspace.app.itemupdate.BitstreamFilter class to identify files for deletion, or one of the aliases (ORIGINAL, ORIGINAL_AND_DERIVATIVES, TEXT, THUMBNAIL) which reference existing filters based on membership in a bundle of that name. In this case, the delete_contents file is not required for any item. The filter properties file will contain properties pertinent to the particular filter used. Multiple filters are not allowed.
Displays brief command line help.
Email address of the person or the user's database ID (Required)
Directory archive to process (Required)
Specifies an alternate metadata field (not a handle) used to hold an identifier used to match the DSpace item with that in the archive. If omitted, the item handle is expected to be located in the dc.identifier.uri field. (Optional)
-t or --test: Runs the process in test mode with logging but no changes applied to the DSpace instance. (Optional)
-P or --alterprovenance: Prevents any changes to the provenance field to represent changes in the bitstream content resulting from an Add or Delete. No provenance statements are written for thumbnails or text derivative bitstreams, in keeping with the practice of MediaFilterManager. (Optional)
-F or --filterproperties: The filter properties files to be used by the delete bitstreams action (Optional)
This will add from your archive the dc element description based on the handle from the URI (since the -i argument wasn't used).
Registering (Not Importing) Bitstreams
import to furnish DSpace the metadata and to upload bitstreams, registration provides DSpace the metadata and the location of the bitstreams. DSpace uses a variation of the import tool to accomplish registration.
where
-r indicates this is a file to be registered
-s n indicates the asset store number (n)
-f filepath indicates the path and name of the content file to be registered (filepath)
\t is a tab character
bundle:bundlename is an optional bundle name
permissions: -[r|w] 'group name' is an optional read or write permission that can be attached to the bitstream
description: some text is an optional description field to add to the file
The bundle, that is everything after the filepath, is optional and is normally not used.
The command line for registration is just like the one for regular import:
[dspace]/bin/dspace import -a -e joe@user.com -c collectionID -s items_dir -m mapfile
The --workflow and --test flags will function as described in Importing Items.
The --delete flag will function as described in Importing Items but the registered content files will not be removed from storage. See Deleting Registered Items.
The --replace flag will function as described in Importing Items but care should be taken to consider different cases and implications. With old items and new items being registered or ingested normally, there are four combinations or cases to consider. Foremost, an old registered item deleted from DSpace using --replace will not be removed from the storage where it resides. See Deleting Registered Items. A new item added to DSpace using --replace will be ingested normally or will be registered depending on whether or not it is marked in the contents files with the -r.
METS Tools
9.7.2. Limitations
No corresponding import tool yet
No structmap section
Some technical metadata not written, e.g. the primary bitstream in a bundle, original filenames or descriptions.
Only the MIME type is stored, not the (finer grained) bitstream format.
Dublin Core to MODS mapping is very simple, probably needs verification
With no options, this traverses the asset store, applying media filters to bitstreams, and skipping bitstreams that have already been filtered.
Available Command-Line Options:
Help : [dspace]/bin/dspace filter-media -h
Display help message describing all command-line options.
Force mode : [dspace]/bin/dspace filter-media -f
Apply filters to ALL bitstreams, even if they've already been filtered. If they've already been filtered, the previously filtered content is overwritten.
Identifier mode : [dspace]/bin/dspace filter-media -i 123456789/2
Restrict processing to the community, collection, or item named by the identifier - by default, all bitstreams of all items in the repository are processed. The identifier must be a Handle, not a DB key. This option may be combined with any other option.
Maximum mode : [dspace]/bin/dspace filter-media -m 1000
Suspend operation after the specified maximum number of items have been processed - by default, no limit exists. This option may be combined with any other option.
No-Index mode : [dspace]/bin/dspace filter-media -n
Suppress index creation - by default, a new search index is created for full-text searching. This option suppresses index creation if you intend to run index-update elsewhere.
Plugin mode : [dspace]/bin/dspace filter-media -p "PDF Text Extractor","Word Text Extractor"
Apply ONLY the filter plugin(s) listed (separated by commas). By default all named filters listed in the filter.plugins field of dspace.cfg are applied. This option may be combined with any other option. WARNING: multiple plugin names must be separated by a comma (i.e. ',') and NOT a comma followed by a space (i.e. ', ').
Skip mode : [dspace]/bin/dspace filter-media -s 123456789/9,123456789/100
SKIP the listed identifiers (separated by commas) during processing. The identifiers must be Handles (not DB Keys). They may refer to items, collections or communities which should be skipped. This option may be combined with any other option. WARNING: multiple identifiers must be separated by a comma (i.e. ',') and NOT a comma followed by a space (i.e. ', ').
NOTE: If you have a large number of identifiers to skip, you may maintain this comma-separated list within a separate file (e.g. filter-skiplist.txt). Use the following format to call the program. Please note the use of the "grave" or "tick" (`) symbol and do not use the single quotation.
[dspace]/bin/dspace filter-media -s `less filter-skiplist.txt`
Verbose mode : [dspace]/bin/dspace filter-media -v
Verbose mode - print all extracted text and other filter details to STDOUT.
Adding your own filters is done by creating a class which implements the org.dspace.app.mediafilter.FormatFilter interface. See the Creating a new Media Filter topic and comments in the source file FormatFilter.java for more information. In theory filters could be implemented in any programming language (C, Perl, etc.). However, they need to be invoked by the Java code in the Media Filter class that you create.
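The skip list must be comma-separated with no space after each comma, which is easy to get wrong in a hand-edited file. As a minimal sketch, the Python helper below normalizes a list of handles (one or several per line, with or without stray spaces) into the form the -s option expects. The helper name is hypothetical and the script is not part of DSpace.

```python
def build_skip_argument(lines):
    """Join handles from the given lines into 'h1,h2,h3' with no spaces."""
    handles = []
    for line in lines:
        for token in line.split(","):
            token = token.strip()
            if token:
                handles.append(token)
    return ",".join(handles)


# Handles written loosely, as they might appear in filter-skiplist.txt
raw = ["123456789/9, 123456789/100", "", "123456789/7"]
print(build_skip_argument(raw))
```

The printed value can be written back to the skip-list file or passed directly as the argument of -s.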
where 's' or '-set' means establish a relationship whereby the community identified by the '-p' parameter becomes the parent of the community identified by the '-c' parameter. Both the 'parentID' and 'childID' values may be handles or database IDs. The reverse operation looks like this:
[dspace]/bin/dspace community-filiator --remove --parent=parentID --child=childID
where 'r' or '-remove' means dis-establish the current relationship in which the community identified by 'parentID' is the parent of the community identified by 'childID'. The outcome will be that the 'childID' community will become an orphan, i.e. a top-level community. If the required constraints of operation are violated, an error message will appear explaining the problem, and no change will be made. An example in a removal operation, where the stated child community does not have the stated parent community as its parent: "Error, child community not a child of parent community". It is possible to effect arbitrary changes to the community hierarchy by chaining the basic operations together. For example, to move a child community from one parent to another, simply perform a 'remove' from its current parent (which will leave it an orphan), followed by a 'set' to its new parent. It is important to understand that when any operation is performed, all the sub-structure of the child community follows it. Thus, if a child has itself children (sub-communities), or collections, they will all move with it to its new 'location' in the community tree.
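Moving a community, as described above, is a 'remove' followed by a 'set'. The Python sketch below only assembles the two command lines in that order. The function name is hypothetical, and the --set long form is assumed here by analogy with the --remove command shown above, so verify it against the tool's help output before relying on it.

```python
def move_community_commands(child_id, old_parent_id, new_parent_id,
                            dspace_bin="[dspace]/bin/dspace"):
    """Return the two community-filiator invocations that move a community.

    Step 1 removes the child from its current parent (leaving it top-level);
    step 2 sets the new parent. '--set' is an assumed long form.
    """
    base = [dspace_bin, "community-filiator"]
    remove = base + ["--remove",
                     "--parent=%s" % old_parent_id,
                     "--child=%s" % child_id]
    attach = base + ["--set",
                     "--parent=%s" % new_parent_id,
                     "--child=%s" % child_id]
    return [remove, attach]


for command in move_community_commands("123456789/30", "123456789/10", "123456789/20"):
    print(" ".join(command))
```

If the first command fails, the second should not be run, since the child would still belong to its old parent.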
Batch Metadata Editing
-f or --file: Required. The filename of the resulting CSV.
-i or --id: The Item, Collection, or Community handle or Database ID to export. If not specified, all items will be exported.
-a or --all: Include all the metadata fields that are not normally changed (e.g. provenance) or those fields you configured in the dspace.cfg to be ignored on export.
-h or --help: Display the help page.
Example:
[dspace]/bin/dspace metadata-export -f /batch_export/col_14.csv -i 1989.1/24
In the above example we have requested that the collection assigned handle '1989.1/24' be exported in its entirety to the file 'col_14.csv' found in the '/batch_export' directory.
Silent Mode should be used carefully. It is possible (and probable) that you can overlay the wrong data and cause irreparable damage to the database.
Example
[dspace]/bin/dspace metadata-import -f /dImport/col_14.csv
If you wish to upload new metadata without bitstreams, at the command line:
[dspace]/bin/dspace metadata-import -f /dImport/new_file.csv -e joe@user.com -w -n -t
In the above example we threw in all the arguments. This would add the metadata and engage the workflow, notification, and templates to all be applied to the items that are being added.
Subsequent rows in the csv file relate to items. A typical row might look like:
350,2292,Item title,"Smith, John",2008
If you want to store multiple values for a given metadata element, they can be separated with the double-pipe '||' (or another character that you defined in your dspace.cfg file). For example:
Horses||Dogs||Cats
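To see how a row and its multi-valued cells break apart, the following Python sketch reads the sample row shown earlier with the standard csv module and splits one cell on the default '||' separator. The helper name and the second author value are made up for this example. If you configured a different separator in dspace.cfg, pass it in instead.

```python
import csv
import io


def split_values(cell, separator="||"):
    """Split one CSV cell into its individual metadata values."""
    return cell.split(separator) if cell else []


# A row like the sample above, with two authors in one cell
line = '350,2292,Item title,"Smith, John||Doe, Jane",2008\n'
row = next(csv.reader(io.StringIO(line)))
authors = split_values(row[3])
print(authors)
```

The csv module handles the quoted comma inside "Smith, John", so only the '||' separator divides the values.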
Elements are stored in the database in the order that they appear in the csv file. You can use this to order elements where order may matter, such as authors, or controlled vocabulary such as Library of Congress Subject Headings.
When importing a csv file, the importer will overlay the data onto what is already in the repository to determine the differences. It only acts on the contents of the csv file, rather than on the complete item metadata. This means that the CSV file that is exported can be manipulated quite substantially before being re-imported. Rows (items) or Columns (metadata elements) can be removed and will be ignored. For example, if you only want to edit item abstracts, you can remove all of the other columns and just leave the abstract column. (You do need to leave the ID column intact. This is mandatory).
Editing collection membership. Items can be moved between collections by editing the collection handles in the 'collection' column. Multiple collections can be included. The first collection is the 'owning collection'. The owning collection is the primary collection that the item appears in. Subsequent collections (separated by the field separator) are treated as mapped collections. These are the same as using the map item functionality in the DSpace user interface. To move items between collections, or to edit which other collections they are mapped to, change the data in the collection column.
Adding items. New metadata-only items can be added to DSpace using the batch metadata importer. To do this, enter a plus sign '+' in the first 'id' column. The importer will then treat this as a new item. If you are using the command line importer, you will need to use the -e flag to specify the user email address or id of the user that is registered as submitting the items.
Deleting Data. It is possible to perform deletes across the board of certain metadata fields from an exported file. For example, let's say you have used keywords (dc.subject) that need to be removed en masse. You would leave the column (dc.subject) intact, but remove the data in the corresponding rows.
Migrating Data or Exchanging data. It is possible that you have data in one Dublin Core (DC) element and you wish to really have it in another. An example would be that your staff have input Library of Congress Subject Headings in the Subject field (dc.subject) instead of the LCSH field (dc.subject.lcsh). Follow these steps and your data is migrated upon import:
1. Insert a new column. The first row should be the new metadata element. (We will refer to it as the TARGET)
2. Select the column/rows of the data you wish to change. (We will refer to it as the SOURCE)
3. Cut and paste this data into the new column (TARGET) you created in Step 1.
4. Leave the column (SOURCE) you just cut and pasted from empty. Do not delete it.
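The four migration steps above can also be carried out with a script instead of a spreadsheet. The following Python sketch uses the standard csv module. The function name is hypothetical, and it overwrites any values already present in the TARGET column, so review the result before importing it.

```python
import csv
import io


def migrate_column(csv_text, source, target):
    """Move every value from the SOURCE column to the TARGET column.

    The SOURCE column is kept but emptied (step 4); the TARGET column is
    appended if it does not exist yet (step 1).
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = list(reader.fieldnames)
    if target not in fieldnames:
        fieldnames.append(target)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[target] = row.get(source, "")  # steps 2 and 3: cut and paste
        row[source] = ""                   # step 4: leave SOURCE empty
        writer.writerow(row)
    return out.getvalue()


exported = "id,collection,dc.subject\n350,2292,Horses||Dogs\n"
print(migrate_column(exported, "dc.subject", "dc.subject.lcsh"))
```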
There are three aspects of the Checksum Checker's operation that can be configured:
the execution mode
the logging output
the policy for removing old checksum results from the database
The user should refer to Chapter 5. Configuration for specific configuration keys in the dspace.cfg file.
The checker will keep starting new bitstream checks for the specified duration, so actual execution duration will be slightly longer than the specified duration. Bear this in mind when scheduling checks.
Specific Bitstream mode: [dspace]/bin/dspace checker -b
Checker will only look at the internal bitstream IDs. Example: [dspace]/bin/dspace checker -b 112 113 4567 Checker will only check bitstream IDs 112, 113 and 4567.
Specific Handle mode: [dspace]/bin/dspace checker -a
Checker will only check bitstreams within the Community, Collection or the item itself. Example: [dspace]/bin/dspace checker -a 123456/999 Checker will only check this handle. If it is a Collection or Community, it will run through the entire Collection or Community.
Looping mode: [dspace]/bin/dspace checker -l or [dspace]/bin/dspace checker -L
There are two modes. The lowercase 'el' (-l) specifies to check every bitstream in the repository once. This is recommended for smaller repositories that are able to loop through all their content in just a few hours maximum. An uppercase 'L' (-L) specifies to continuously loop through the repository. This is not recommended for most repository systems. Cron Jobs. For large repositories that cannot be completely checked in a couple of hours, we recommend the -d option in cron.
Pruning mode: [dspace]/bin/dspace checker -p
The Checksum Checker will store the result of every check in the checksum_history table. By default, successful checksum matches that are eight weeks old or older will be deleted when the -p option is used. (Unsuccessful ones will be retained indefinitely.) Without this option, the retention settings are ignored and the database table may grow rather large!
You can use the table above for your time units. At the command line: [dspace]/bin/dspace checker -p retention_file_name <ENTER>
The above cron entry would schedule the checker to run every Sunday at 4:00 a.m. for 2 hours. It also specifies to 'prune' the database based on the retention settings in dspace.cfg.
Windows OS. You will be unable to use the checker shell script. Instead, you should use Windows Scheduled Tasks to schedule the following command to run at the appropriate times:
[dspace]/bin/dspace checker -d2h -p
You can also combine options (e.g. -m -c) for combined reports. Cron. Follow the same steps above as you would running checker in cron. Change the time but match the regularity. Remember to schedule this after Checksum Checker has run.
9.12. Embargo
If you have implemented the Embargo feature, you will need to run it periodically to check for Items with expired embargoes and lift them.
Command used: [dspace]/bin/dspace embargo-lifter
Java class: org.dspace.embargo.EmbargoManager
Arguments (short and long forms): Description
-c or --check: ONLY check the state of embargoed Items, do NOT lift any embargoes.
-i or --identifier: Process ONLY this handle identifier(s), which must be an Item. Can be repeated.
-l or --lift: Only lift embargoes, do NOT check the state of any embargoed items.
-n or --dryrun: Do not change anything in the data model, print message instead.
-v or --verbose: Print a line describing the action taken for each embargoed item found.
-q or --quiet: No output except upon error.
-h or --help: Display brief help screen.
You must run the Embargo Lifter task periodically to check for items with expired embargoes and lift them from being embargoed. For example, to check the status, at the CLI:
[dspace]/bin/dspace embargo-lifter -c
To lift the actual embargoes on those items that meet the time criteria, at the CLI:
[dspace]/bin/dspace embargo-lifter -l
Browse Index Creation
-s or -start
-f or -full: Make the tables, and do the indexing. This forces -x. Mutually exclusive with -t and -i.
-v or -verbose: Print extra information to the stdout. If used in conjunction with -p, you cannot use the stdout to generate your database structure.
-d or -delete: Delete all the indexes, but do not create new ones. For use with -f. This is mutually exclusive with -r.
-h or -help: Show this help documentation. Overrides all other arguments.
Updating the Indexes. By running [dspace]/bin/dspace index-update you will reindex your full browse without modifying the table structure. (This should be your default approach if indexing, for example, via a cron job periodically).
[dspace]/bin/dspace index-update
Destroy and rebuild. You can destroy and rebuild the database, but do not do the indexing. Output the SQL to do this to the screen and a file, as well as executing it against the database, while being verbose. At the CLI screen:
[dspace]/bin/dspace index -r -t -p -v -x -o myfile.sql
Add a Series Browse. You want to add a new browse using a previously unused metadata element.
webui.browse.index.6 = series:metadata:dc.relation.ispartofseries:text:single
Note: the index # needs to be adjusted to your browse stanza in the dspace.cfg file. Also, you will need to update your Messages.properties file.
Combine more than one metadata field into a browse. You may have other title fields used in your repository. You may only want one or two of them added, not all title fields. And/or you may want your series to file in there.
webui.browse.index.3 = title:metadata:dc.title,dc.title.uniform,dc.relation.ispartofseries:title:full
Separate subject browse. You may want to have a separate subject browse limited to only one type of subject.
webui.browse.index.7 = lcsubject:metadata:dc.subject.lcsh:text:single
As one can see, the choices are limited only by your metadata schema, the metadata, and your imagination.
Remember to run index-init after adding any new definitions in the dspace.cfg to have the indexes created and the data indexed.
DSpace Log Converter
The command loads the intermediate log files that have been created by the aforementioned script into SOLR.
Command used: [dspace]/bin/dspace stats-log-importer
Java class: org.dspace.statistics.util.StatisticsImporter
Arguments (short and long forms): Description
-i or --: input file
-m or --: Adds a wildcard at the end of the input, so it would mean dspace.log* would be imported
-s or --: To skip the reverse DNS lookups that work out where a user is from. (The DNS lookup finds the information about the host from its IP address, such as geographical location, etc. This can be slow, and wouldn't work on a server not connected to the internet.)
-v or --: Display verbose output (helpful for debugging)
-l or --: For developers: allows you to import a log file from another system. Because the handles won't exist, it looks up random items in your local system to add hits to instead.
-h or --: Help
Although the DSpace Log Converter applies basic spider filtering (googlebot, yahoo slurp, msnbot), it is far from complete. Please refer to Statistics Client (8.15) for spider removal operations, after converting your old logs.
-h or -help
Notes:
The usage of these options is open for the user to choose. If they want to keep spider entries in their repository, they can just mark them using "-m" and they will be excluded from statistics queries when "solr.statistics.query.filter.isBot = true" in the dspace.cfg. If they want to keep the spiders out of the solr repository, they can just use the "-i" option and they will be removed immediately.
There are guards in place to control what can be defined as an IP range for a bot. In [dspace]/config/spiders, spider IP address ranges have to be at least 3 subnet sections in length (123.123.123) and IP ranges can only be on the smallest subnet [123.123.123.0 - 123.123.123.255]. If not, loading that row will cause exceptions in the dspace logs and exclude that IP entry.
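The guard on spider IP entries can be approximated before a file is placed in [dspace]/config/spiders. The Python sketch below is one reading of the rule stated above: an entry is accepted if it has three sections (a range on the smallest subnet) or four sections (a single address), each between 0 and 255. The function name is hypothetical and DSpace's own loader may accept or reject entries differently.

```python
def is_acceptable_spider_entry(entry):
    """Return True for 'a.b.c' (smallest-subnet range) or 'a.b.c.d' entries."""
    parts = entry.strip().split(".")
    if len(parts) not in (3, 4):
        return False  # fewer than 3 sections would describe too wide a range
    return all(p.isdigit() and 0 <= int(p) <= 255 for p in parts)


for candidate in ["123.123.123", "123.123.123.45", "123.123", "123.123.999"]:
    print(candidate, is_acceptable_spider_entry(candidate))
```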
Test Database
AIP Backup and Restore
Provides a relatively standard format for people to migrate entire hierarchies (Communities/Collections) from one DSpace to another (or from another system into DSpace).
9.18.1.1. How does this differ from traditional DSpace Backups? Which Backup route is better?
Traditionally, it has always been recommended to back up and restore DSpace's database and files (also known as the "assetstore") separately. This is described in more detail in Section 11.5, Storage Layer, of the DSpace System Documentation. The traditional backup and restore route is still a recommended and supported option. However, the new AIP Backup & Restore option seeks to resolve many of the complexities of a traditional backup and restore. The table below details some of the differences between these two valid Backup and Restore options.

Supported Backup/Restore Types

Can Backup & Restore all DSpace Content easily
    Traditional Backup & Restore (Database and Files): Yes (Requires two backups/restores, one for Database and one for Files)
    AIP Backup & Restore: Yes (Though, will not backup/restore items which are not officially "in archive")

Can Backup & Restore a Single Community/Collection/Item easily
    Traditional Backup & Restore (Database and Files): No (It is possible, but requires a strong understanding of DSpace database structure & folder organization in order to only backup & restore metadata/files belonging to that single object)
    AIP Backup & Restore: Yes

Backups can be used to move one or more Community/Collection/Items to another DSpace system easily
    Traditional Backup & Restore (Database and Files): No (Again, it is possible, but requires a strong understanding of DSpace database structure & folder organization in order to only move metadata/files belonging to that object)
    AIP Backup & Restore: Yes

Supported Object Types During Backup & Restore

Supports backup/restore of all Communities/Collections/Items (including metadata, files, logos, etc.)
    Traditional Backup & Restore (Database and Files): Yes
    AIP Backup & Restore: Yes

Supports backup/restore of all People/Groups/Permissions
    Traditional Backup & Restore (Database and Files): Yes
    AIP Backup & Restore: Yes

Supports backup/restore of all Collection-specific Item Templates
    Traditional Backup & Restore (Database and Files): Yes
    AIP Backup & Restore: Yes

Supports backup/restore of all Collection Harvesting settings (only for Collections which pull in all Items via OAI-PMH or OAI-ORE)
    Traditional Backup & Restore (Database and Files): Yes
    AIP Backup & Restore: No (This is a known issue. All previously harvested Items will be restored, but the OAI-PMH/OAI-ORE harvesting settings will be lost during the restore process.)

Supports backup/restore of all Withdrawn (but not deleted) Items
    Traditional Backup & Restore (Database and Files): Yes
    AIP Backup & Restore: Yes

Supports backup/restore of Item Mappings between Collections
    Traditional Backup & Restore (Database and Files): Yes
    AIP Backup & Restore: Yes (During restore, the AIP Ingester may throw a false "Could not find a parent DSpaceObject" error (see Common Issues or Error Messages) if it tries to restore an Item Mapping to a Collection that it hasn't yet restored. This error can be safely bypassed using the 'skipIfParentMissing' flag; see Additional Packager Options for more details.)

Supports backup/restore of all in-process, uncompleted Submissions (or those currently in an approval workflow)
    Traditional Backup & Restore (Database and Files): Yes
    AIP Backup & Restore: No (AIPs are only generated for objects which are completed and considered "in archive")

Supports backup/restore of Items using custom Metadata Schemas & Fields
    Traditional Backup & Restore (Database and Files): Yes
    AIP Backup & Restore: Yes (Custom Metadata Fields will be automatically recreated. Custom Metadata Schemas must be manually created first, in order for DSpace to be able to recreate custom fields belonging to that schema. See Common Issues or Error Messages for more details.)

Supports backup/restore of all local DSpace Configurations and Customizations
    Traditional Backup & Restore (Database and Files): Yes (if you backup your entire DSpace directory as part of backing up your files)
    AIP Backup & Restore: Not by default (unless you also backup parts of your DSpace directory. Note that you wouldn't need to backup the '[dspace]/assetstore' folder again, as those files are already included in AIPs.)
Based on your local institution's needs, you will want to choose the backup & restore process which is most appropriate for you. You may also find it beneficial to use both types of backups on different time schedules, in order to minimize the likelihood of losing your DSpace installation settings or its contents. For example, you may choose to perform a Traditional Backup once per week (to back up your local system configurations and customizations) and an AIP Backup on a daily basis. Alternatively, you may choose to perform daily Traditional Backups and only use the AIP Backup as a "permanent archives" option (perhaps performed on a weekly or monthly basis).
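As an illustration of the daily AIP Backup mentioned above, the sketch below prints (but does not run) a site-wide export command that writes into a per-day folder. The helper name `daily_aip_export_cmd`, the `/backups/aip` folder and the `/dspace` install path are assumptions made for this example; the packager flags are the ones used for a full-site export elsewhere in this chapter.

```shell
# Hypothetical helper: prints a site-wide AIP export command for one day.
# All paths and the e-mail address are placeholders.
daily_aip_export_cmd() {
  dspace_dir=$1    # DSpace install directory, e.g. /dspace
  eperson=$2       # e-mail address of a DSpace administrator
  prefix=$3        # site Handle prefix, e.g. 4321
  backup_root=$4   # folder holding one sub-folder per day
  day=$5           # e.g. the output of: date +%Y-%m-%d
  printf '%s/bin/dspace packager -d -a -t AIP -e %s -i %s/0 %s/%s/sitewide-aip.zip\n' \
    "$dspace_dir" "$eperson" "$prefix" "$backup_root" "$day"
}

# Example: review today's command before handing it to a scheduler.
daily_aip_export_cmd /dspace admin@myu.edu 4321 /backups/aip "$(date +%Y-%m-%d)"
```

The per-day folder would need to exist before the printed command is run, and the weekly Traditional Backup would be scheduled separately.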
9.18.1.2. How does this work help DSpace interact with DuraCloud?
This work is entirely about exporting DSpace content objects to a location on a local filesystem. So, this work doesn't interact solely with DuraCloud, and could be used by any backup storage system to back up your DSpace contents. In the initial DuraCloud work, the DuraCloud team is working on a way to "synchronize" DuraCloud with a local file folder. So, DuraCloud can be configured to "watch" a given folder and automatically replicate its contents into the cloud. Therefore, moving content from DSpace to DuraCloud would currently be a two-step process:
1. First, export AIPs describing that content from DSpace to a filesystem folder.
2. Second, enable DuraCloud to watch that same filesystem folder and replicate it into the cloud.
Similarly, moving content from DuraCloud back into DSpace would also be a two-step process:
1. First, you'd tell DuraCloud to replicate the AIPs from the cloud to a folder on your file system.
2. Second, you'd ingest those AIPs back into DSpace.
(These backup/restore processes may change as we go forward and investigate more use cases. This is just the initial plan.)
An AIP can serve as a DIP (Dissemination Information Package) or SIP (Submission Information Package), especially when transferring custody of objects to another DSpace implementation. In contrast to SIP or DIP, the AIP should include all available DSpace structural and administrative metadata, and basic provenance information. AIPs also describe some basic system level information (e.g. Groups and People).
for example:
[dspace]/bin/dspace packager -d -t AIP -e admin@myu.edu -i 4321/4567 aip4567.zip
The above command will export the object of the given handle (4321/4567) into an AIP file named "aip4567.zip". This will not include any child objects for Communities or Collections.
9.18.3.1.3. Exporting AIP Hierarchy
To export an AIP hierarchy, use the -a (or --all) package parameter. For example, use this 'packager' command template:
[dspace]/bin/dspace packager -d -a -t AIP -e <eperson> -i <handle> <file-path>
for example:
[dspace]/bin/dspace packager -d -a -t AIP -e admin@myu.edu -i 4321/4567 aip4567.zip
The above command will export the object of the given handle (4321/4567) into an AIP file named "aip4567.zip". In addition, it will export all child objects to the same directory as the "aip4567.zip" file. The child AIP files are all named using the following format:
File Name Format: <Obj-Type>@<Handle-with-dashes>.zip
e.g. COMMUNITY@123456789-1.zip, COLLECTION@123456789-2.zip, ITEM@123456789-200.zip
This general file naming convention ensures that you can easily locate an object to restore by its name (assuming you know its Object Type and Handle).
Alternatively, if an object doesn't have a Handle, it uses this File Name Format: <Obj-Type>@internal-id-<DSpace-ID>.zip (e.g. ITEM@internal-id-234.zip)
AIPs are only generated for objects which are currently in the "in archive" state in DSpace. This means that in-progress, uncompleted submissions are not described in AIPs and cannot be restored after a disaster.
Exporting Entire Site
To export an entire DSpace Site, pass the packager the Handle <site-handle-prefix>/0. For example, if your site prefix is "4321", you'd run a command similar to the following:
[dspace]/bin/dspace packager -d -a -t AIP -e admin@myu.edu -i 4321/0 sitewide-aip.zip
Again, this would export the DSpace Site AIP into the file "sitewide-aip.zip", and export AIPs for all Communities, Collections and Items into the same directory as the Site AIP.
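When looking for one object in a folder of exported AIPs, the file naming convention described above can be applied by hand or with a small helper. The sketch below assumes the object has a Handle; `aip_file_name` is a hypothetical function written for this example and is not a DSpace command.

```shell
# Hypothetical helper: builds the child AIP file name for an object that
# has a Handle, following <Obj-Type>@<Handle-with-dashes>.zip.
aip_file_name() {
  obj_type=$1   # COMMUNITY, COLLECTION or ITEM
  handle=$2     # e.g. 123456789/200
  # Replace the slash in the Handle with a dash.
  printf '%s@%s.zip\n' "$obj_type" "$(printf '%s' "$handle" | tr '/' '-')"
}

# Example: name of the AIP for the Item with Handle 123456789/200.
aip_file_name ITEM 123456789/200   # prints ITEM@123456789-200.zip
```

Objects without a Handle use the internal-id form instead, which this sketch does not cover.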
1. Submit/Ingest Mode (-s option, default): submit AIP(s) to DSpace in order to create new object(s) (i.e. the AIP is treated like a SIP, Submission Information Package).
2. Restore Mode (-r option): restore pre-existing object(s) in DSpace based on AIP(s). This also attempts to restore all handles and relationships (parent/child objects). This is a specialized type of "submit", where the object is created with a known Handle and known relationships.
3. Replace Mode (-r -f option): replace existing object(s) in DSpace based on AIP(s). This also attempts to restore all handles and relationships (parent/child objects). This is a specialized type of "restore" where the contents of existing object(s) are replaced by the contents of the AIP(s). By default, if a normal "restore" finds the object already exists, it will back out (i.e. roll back all changes) and report which object already exists.
Again, like export, there are two types of AIP Ingestion you can perform (using any of the above modes):
- Single AIP (default): ingests just an AIP describing a single DSpace object. So, if you ran it in this default mode for a Collection AIP, you'd just create a DSpace Collection from the AIP (but not ingest any of its child objects).
- Hierarchy of AIPs (by including the --all or -a option after the mode): ingests the requested AIP describing an object, plus the AIPs for all child objects. Some examples follow:
  - For a Site: this would ingest all Communities, Collections & Items based on the located AIP files.
  - For a Community: this would ingest that Community and all SubCommunities, Collections and Items based on the located AIP files.
  - For a Collection: this would ingest that Collection and all contained Items based on the located AIP files.
  - For an Item: this just ingests the Item (including all Bitstreams & Bundles) based on the AIP file.
The difference between "Submit" and "Restore/Replace" modes
It's worth understanding the primary differences between a Submission (specified by the -s parameter) and a Restore (specified by the -r parameter).
Submission Mode (-s mode) - creates a new object (AIP is treated like a SIP)
- By default, a new Handle is always assigned.
  - However, you can force it to use the handle specified in the AIP by specifying -o ignoreHandle=false as one of your parameters.
- By default, a new Parent object must be specified (using the -p parameter). This is the location where the new object will be created.
  - However, you can force it to use the parent object specified in the AIP by specifying -o ignoreParent=false as one of your parameters.
- By default, will respect a Collection's Workflow process when you submit an Item to a Collection.
  - However, you can specifically skip any workflow approval processes by specifying the -w parameter.
- Always adds a new Deposit License to Items.
- Always adds new DSpace System metadata to Items (includes new 'dc.date.accessioned', 'dc.date.available', 'dc.date.issued' and 'dc.description.provenance' entries).
Restore / Replace Mode (-r mode) - restores a previously existing object (as if from a backup)
- By default, the Handle specified in the AIP is restored.
  - However, for restores, you can force a new handle to be generated by specifying -o ignoreHandle=true as one of your parameters. (NOTE: Doesn't work for replace mode, as the new object always retains the handle of the replaced object.)
  - Although a Restore/Replace does restore Handles, it will not necessarily restore the same internal IDs in your Database.
- By default, the object is restored under the Parent specified in the AIP.
  - However, for restores, you can force it to restore under a different parent object by using the -p parameter. (NOTE: Doesn't work for replace mode, as the new object always retains the parent of the replaced object.)
- Always skips any Collection workflow approval processes when restoring/replacing an Item in a Collection.
- Never adds a new Deposit License to Items (rather it restores the previous deposit license, as long as it is stored in the AIP).
- Never adds new DSpace System metadata to Items (rather it just restores the metadata as specified in the AIP).
If you leave out the -p parameter, the AIP package ingester will attempt to install the AIP under the same parent it had before. As you are also specifying the -s (submit) parameter, the packager will assume you want a new Handle to be assigned (as you are effectively specifying that you are submitting a new object). If you want the object to retain the Handle specified in the AIP, you can specify the -o ignoreHandle=false option to force the packager to not ignore the Handle specified in the AIP.
for example:
[dspace]/bin/dspace packager -s -a -t AIP -e admin@myu.edu -p 4321/12 aip4567.zip
The above command will ingest the package named "aip4567.zip" as a child of the specified Parent Object (handle="4321/12"). The resulting object is assigned a new Handle (since -s is specified). In addition, any child AIPs referenced by "aip4567.zip" are also recursively ingested (a new Handle is also assigned for each child AIP).
Another example: Ingesting a Top-Level Community (by using the Site Handle, <site-handle-prefix>/0):
[dspace]/bin/dspace packager -s -a -t AIP -e admin@myu.edu -p 4321/0 community-aip.zip
The above command will ingest the package named "community-aip.zip" as a top-level community (i.e. the specified parent is "4321/0" which is a Site Handle). Again, the resulting object is assigned a new Handle. In addition, any child AIPs referenced by "community-aip.zip" are also recursively ingested (a new Handle is also assigned for each child AIP).
9.18.3.2.3. Restoring/Replacing using AIP(s)
Restoring is slightly different than just submitting. When restoring, we make every attempt to restore the object as it used to be (including its handle, parent object, etc.). There are currently three restore modes:
1. Default Restore Mode (-r) = Attempt to restore object (and optionally children). Roll back all changes if any object is found to already exist.
2. Restore, Keep Existing Mode (-r -k) = Attempt to restore object (and optionally children). If an object is found to already exist, skip over it (and all children objects), and continue to restore all other non-existing objects.
3. Force Replace Mode (-r -f) = Restore an object (and optionally children) and overwrite any existing objects in DSpace. Therefore, if an object is found to already exist in DSpace, its contents are replaced by the contents of the AIP.
WARNING: Force Replace Mode is potentially dangerous as it will permanently destroy any object contents that do not currently exist in the AIP. You may want to perform a secondary backup, unless you are sure you know what you are doing!
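The three restore modes differ only in the flags passed to the packager. As a rough sketch, the hypothetical `restore_cmd` function below maps a mode name to the flag combination listed above and prints the resulting hierarchy restore command for review; it does not run anything. The mode names `default`, `keep` and `replace` are labels chosen for this example.

```shell
# Hypothetical helper: prints the hierarchy restore command for a mode.
restore_cmd() {
  mode=$1      # default, keep or replace (labels used only in this sketch)
  eperson=$2   # e-mail address of the DSpace user performing the restore
  aip=$3       # path to the AIP file
  case $mode in
    default) flags="-r -a" ;;      # roll back if any object already exists
    keep)    flags="-r -a -k" ;;   # skip objects that already exist
    replace) flags="-r -a -f" ;;   # overwrite objects that already exist
    *) echo "unknown restore mode: $mode" >&2; return 1 ;;
  esac
  printf '[dspace]/bin/dspace packager %s -t AIP -e %s %s\n' "$flags" "$eperson" "$aip"
}

# Example: show the command for a "keep existing" restore.
restore_cmd keep admin@myu.edu aip4567.zip
```

Printing the command first gives a chance to confirm the mode before running it, which matters most for the destructive replace mode.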
Default Restore Mode
By default, the restore mode (-r option) will throw an error and roll back all changes if any object is found to already exist. The user will be informed which object already exists within their DSpace installation.
Restore a Single AIP: Use this 'packager' command template to restore a single object from an AIP (not including any child objects):
[dspace]/bin/dspace packager -r -t AIP -e <eperson> <AIP-file-path>
Restore a Hierarchy of AIPs: Use this 'packager' command template to restore an object from an AIP along with all child objects (from their AIPs):
[dspace]/bin/dspace packager -r -a -t AIP -e <eperson> <AIP-file-path>
For example:
[dspace]/bin/dspace packager -r -a -t AIP -e admin@myu.edu aip4567.zip
Notice that unlike -s option (for submission/ingesting), the -r option does not require the Parent Object ( -p option) to be specified if it can be determined from the package itself. In the above example, the package "aip4567.zip" is restored to the DSpace installation with the Handle provided within the package itself (and added as a child of the parent object specified within the package itself). In addition, any child AIPs referenced by "aip4567.zip" are also recursively ingested (the -a option specifies to also restore all child AIPs). They are also restored with the Handles & Parent Objects provided with their package. If any object is found to already exist, all changes are rolled back (i.e. nothing is restored to DSpace)
DSpace, the Default Restore Mode will report an error that those object(s) could not be recreated. If you encounter this situation, you will need to perform the restore using either the Restore, Keep Existing Mode or the Force Replace Mode (depending on whether you want to keep or replace those existing child objects).
Restore, Keep Existing Mode
When the "Keep Existing" flag (-k option) is specified, the restore will attempt to skip over any objects found to already exist. It will report to the user that the object was found to exist (and was not modified or changed). It will then continue to restore all objects which do not already exist. One special case to note: If a Collection or Community is found to already exist, its child objects are also skipped over. So, this mode will not auto-restore items to an existing Collection.
Restore a Hierarchy of AIPs: Use this 'packager' command template to restore an object from an AIP along with all child objects (from their AIPs):
[dspace]/bin/dspace packager -r -a -k -t AIP -e <eperson> <AIP-file-path>
For example:
[dspace]/bin/dspace packager -r -a -k -t AIP -e admin@myu.edu aip4567.zip
In the above example, the package "aip4567.zip" is restored to the DSpace installation with the Handle provided within the package itself (and added as a child of the parent object specified within the package itself). In addition, any child AIPs referenced by "aip4567.zip" are also recursively restored (the -a option specifies to also restore all child AIPs). They are also restored with the Handles & Parent Objects provided with their package. If any object is found to already exist, it is skipped over (child objects are also skipped). All non-existing objects are restored.
Force Replace Mode
When the "Force Replace" flag (-f option) is specified, the restore will overwrite any objects found to already exist in DSpace. In other words, existing content is deleted and then replaced by the contents of the AIP(s).
Replace using a Hierarchy of AIPs: Use this 'packager' command template to replace an object from an AIP along with all child objects (from their AIPs):
[dspace]/bin/dspace packager -r -a -f -t AIP -e <eperson> <AIP-file-path>
For example:
[dspace]/bin/dspace packager -r -a -f -t AIP -e admin@myu.edu aip4567.zip
In the above example, the package "aip4567.zip" is restored to the DSpace installation with the Handle provided within the package itself (and added as a child of the parent object specified within the package itself). In addition, any child AIPs referenced by "aip4567.zip" are also recursively ingested. They are also restored with the Handles & Parent Objects provided with their package. If any object is found to already exist, its contents are replaced by the contents of the appropriate AIP. If any error occurs, the script attempts to roll back the entire replacement process.
Restoring Entire Site
In order to restore an entire Site from a set of AIPs, you must do the following:
1. Install a completely "fresh" version of DSpace by following the Installation instructions in the DSpace Manual. At this point, you should have a completely empty, but fully-functional DSpace installation. You will need to create an initial Administrator user in order to perform this restore (as a full restore can only be performed by a DSpace Administrator).
2. Once DSpace is installed, run the following command to restore all its contents from AIPs:
[dspace]/bin/dspace packager -r -a -f -t AIP -e <eperson> -i <site-handle-prefix>/0 /full/path/to/your/site-aip.zip
Please note the following about the above restore command:
- Notice that you are running this command in "Force Replace" mode (-r -f). This is necessary as your empty DSpace install will already include a few default groups (Administrators and Anonymous) and your initial administrative user. You need to replace these groups in order to restore your prior DSpace contents completely.
- <eperson> should be replaced with the Email Address of the initial Administrator (whom you created when you reinstalled DSpace).
- <site-handle-prefix> should be replaced with your DSpace site's assigned Handle Prefix. This is equivalent to the handle.prefix setting in your dspace.cfg.
- /full/path/to/your/site-aip.zip is the full path to the AIP file which represents your DSpace SITE. This file will be named whatever you named it when you actually exported your entire site. All other AIPs are assumed to be referenced from this SITE AIP (in most cases, they should be in the same directory as that SITE AIP).
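Because the site restore expects a full path to the SITE AIP, a small pre-flight check can catch a relative or unreadable path before the restore is started. The sketch below is one way to do that; `site_restore_cmd` is a hypothetical helper that only prints the command from step 2 with the placeholders filled in.

```shell
# Hypothetical helper: checks the SITE AIP path, then prints the full-site
# restore command in Force Replace mode for review.
site_restore_cmd() {
  eperson=$1    # e-mail address of the initial Administrator
  prefix=$2     # site Handle prefix (handle.prefix in dspace.cfg)
  site_aip=$3   # full path to the SITE AIP file
  case $site_aip in
    /*) ;;      # absolute path, as the instructions above require
    *) echo "site AIP path must be a full path: $site_aip" >&2; return 1 ;;
  esac
  [ -r "$site_aip" ] || { echo "cannot read $site_aip" >&2; return 1; }
  printf '[dspace]/bin/dspace packager -r -a -f -t AIP -e %s -i %s/0 %s\n' \
    "$eperson" "$prefix" "$site_aip"
}
```

A call such as `site_restore_cmd admin@myu.edu 4321 /backups/sitewide-aip.zip` (the path is a placeholder) prints the command only if the file is readable.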
Option: ignoreHandle
Ingest or Export: ingest-only

Option: ignoreParent
Ingest or Export: ingest-only
Description: (run dspace packager -h for more help). If 'false', the AIP ingester attempts to restore the object directly under its old Parent (this is the default when running in Restore/replace mode, using the -r flag).

Option: includeBundles
Ingest or Export: export-only
Default Value: defaults to "all"
Description: This option can be used to limit the Bundles which are exported to AIPs for each DSpace Item. By default, all file Bundles will be exported into Item AIPs. You could use this option to limit the size of AIPs by only exporting certain Bundles. WARNING: any bundles not included in AIPs will obviously be unable to be restored. This option expects a comma separated list of bundle names (e.g. "ORIGINAL,LICENSE,CC_LICENSE,MET or "all" if all bundles should be included. (NOTE: If you choose to no longer export LICENSE or CC_LICENSE bundles, you will also need to disable the License Dissemination Crosswalks in the aip.disseminate.rightsMD configuration for the changes to take effect.)

Option: manifestOnly
Ingest or Export: both
Default Value: false
Description: If 'true', the AIP Disseminator will export an AIP which only consists of the METS Manifest file (i.e. the result will be a single 'mets.xml' file). This METS Manifest contains URI references to all content files, but does not contain any content files. This option is experimental, and should never be set to 'true' if you want to be able to restore content files.

Option: passwords
Ingest or Export: export-only
Default Value: false
Description: If 'true' (and the 'DSPACE-ROLES' crosswalk is enabled, see AIP Metadata Dissemination Configurations), then the AIP Disseminator will export user password hashes (i.e. encrypted passwords) into the Site AIP's METS Manifest. This would allow you to restore users' passwords from the Site AIP. If 'false', then user password hashes are not stored in the Site AIP, and passwords cannot be restored at a later time.

Option: skipIfParentMissing
Default Value: false
Description: If 'true', ingestion will skip over any "Could not find a parent DSpaceObject" errors that are encountered during the ingestion process (Note: those errors will still be logged as "warning" messages in your DSpace log file). If you are performing a full site restore (or a restore of a larger Community/Collection hierarchy), you may encounter these errors if you have a larger number of Item mappings between Collections (i.e. Items which are mapped into several collections at once). When you are performing a recursive ingest, skipping these errors should not cause any problems. Once the missing parent object is ingested, it will automatically restore the Item mapping that caused the error. For more information on this "Could not find a parent DSpaceObject" error see Common Issues or Error Messages.

Option: unauthorized
Ingest or Export: export-only
Default Value: unspecified
Description: If 'skip', the AIP Disseminator will skip over any unauthorized Bundle or Bitstream encountered (i.e. it will not be added to the AIP). If 'zero', the AIP Disseminator will add a zero-length "placeholder" file to the AIP when it encounters an unauthorized Bitstream. If unspecified (the default value), the AIP Disseminator will throw an error if an unauthorized Bundle or Bitstream is encountered.

Option: updatedAfter
Ingest or Export: export-only
Default Value: unspecified
Description: This option works as a basic form of "incremental backup". This option requires that an ISO-8601 date is specified. When specified, the AIP Disseminator will only export Item AIPs which have a last-modified date after the specified ISO-8601 date. This option has no effect on the export of Site, Community or Collection AIPs, as DSpace does not record a last-modified date for Sites, Communities or Collections. For example, when this option is specified during a full-site export, the AIP Disseminator will export the Site AIP, all Community AIPs, all Collection AIPs, and only Item AIPs modified after that date and time.

Option: validate
Ingest or Export: both
Description: If 'true', every METS file in an AIP will be validated before ingesting or exporting. By default, DSpace will validate everything on export, but will skip validation during import. Validation on export will ensure that all exported AIPs properly conform to the METS profile (and will throw errors if any do not). Validation on import will ensure every METS file in every AIP is first validated before importing into DSpace (this will cause the ingestion processing to take longer, but tips on speeding it up can be found in the "AIP Configurations To Improve Ingestion Speed while Validating" section below). DSpace recommends minimally validating AIPs on export. Ideally, you should validate both on export and import, but import validation is disabled by default in order to increase the speed of AIP restores.
For example:
[dspace]/bin/dspace packager -r -a -t AIP -o ignoreParent=false -o createMetadataFields=false -e admin@myu.edu aip4567.zip
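The same -o name=value syntax applies to export options. As a sketch of the "incremental backup" use of updatedAfter, the hypothetical `incremental_export_cmd` helper below prints a hierarchy export command limited to Items modified after a given date. The date check here is only a loose shape test, and which ISO-8601 variants the packager accepts is an assumption to verify on your installation.

```shell
# Hypothetical helper: prints an export command that uses updatedAfter.
incremental_export_cmd() {
  since=$1     # ISO-8601 date, e.g. 2011-06-01T00:00:00Z (format is an assumption)
  eperson=$2   # e-mail address of the DSpace user performing the export
  handle=$3    # Handle of the object to export, e.g. 4321/0 for the Site
  target=$4    # path of the top-level AIP file to write
  case $since in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]*) ;;   # starts with YYYY-MM-DD
    *) echo "expected an ISO-8601 date, got: $since" >&2; return 1 ;;
  esac
  printf '[dspace]/bin/dspace packager -d -a -t AIP -o updatedAfter=%s -e %s -i %s %s\n' \
    "$since" "$eperson" "$handle" "$target"
}

# Example: Items changed since 1 June 2011, for the whole site.
incremental_export_cmd 2011-06-01T00:00:00Z admin@myu.edu 4321/0 sitewide-aip.zip
```

As described for updatedAfter, Site, Community and Collection AIPs would still be exported in full; only Item AIPs are filtered by date.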
Via the Java API call
If you are programmatically calling the org.dspace.content.packager.DSpaceAIPIngester from your own custom script, you can specify these options via the org.dspace.content.packager.PackageParameters class. As a basic example:
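import org.dspace.content.packager.PackageParameters;

// Collect the packager options to pass to the AIP ingester
PackageParameters params = new PackageParameters();
params.addProperty("createMetadataFields", "false");
params.addProperty("ignoreParent", "true");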
  - The DSPACE_CCTEXT crosswalk ensures any Creative Commons Textual Licenses are referenced/stored in the AIP.
  - The METSRights crosswalk ensures that Permissions/Rights on DSpace Objects (Communities, Collections, Items or Bitstreams) are referenced/stored in the AIP. Using this crosswalk means that AIPs can be used to restore permissions that a particular Group or Person had on a DSpace Object. (NOTE: The METSRights crosswalk should always be used in conjunction with the DSPACE-ROLES crosswalk (see above) or a similar crosswalk. The METSRights crosswalk can only restore permissions, and cannot re-create Groups or EPeople in the system. The DSPACE-ROLES crosswalk can actually re-create the Groups or EPeople as needed.)
- aip.disseminate.dmd - Lists the DSpace Crosswalks (by name) which should be called to populate the <dmdSec> section of the METS file within the AIP (Default: MODS, DIM)
  - The MODS crosswalk translates the DSpace descriptive metadata (for this object) into MODS. As MODS is a relatively "standard" metadata schema, it may be useful to include a copy of MODS metadata in your AIPs if you should ever want to import them into another (non-DSpace) system.
  - The DIM crosswalk just translates the DSpace internal descriptive metadata into an XML format. This XML format is proprietary to DSpace, but stores the metadata in a format similar to Qualified Dublin Core.
The above settings tell the ingester to ignore any metadata sections which reference DSpace Deposit Licenses or Creative Commons Licenses. These metadata sections can be safely ignored as long as the "LICENSE" and "CC_LICENSE" bundles are included in AIPs (which is the default setting). As the Licenses are included in those Bundles, they will already be restored when restoring the bundle contents.
Issue / Error Message: How to Fix this Problem

command in Force Replace Mode (-r -f). Please see the section on Restoring an Entire Site for more details on the flags you should be using.

If you receive this problem, one or more of your Items is using a custom metadata schema which DSpace is currently not aware of (in the example, the schema is named "mycustomschema"). Because DSpace AIPs do not contain enough details to recreate the missing Metadata Schema, you must create it manually via the DSpace Admin UI. Please note that you only need to create the Schema. You do not need to manually create all the fields belonging to that schema, as DSpace will do that for you as it restores each AIP. Once the schema is created in DSpace, re-run your restore command. DSpace will automatically re-create all fields belonging to that custom metadata schema as it restores each Item that uses that schema.

When you encounter this error message it means that an object could not be ingested/restored as it belongs to a parent object which doesn't currently exist in your DSpace instance. During a full restore process, this error can be skipped over and treated as a warning by specifying the 'skipIfParentMissing=true' option (see Additional Packager Options). If you have a larger number of Items which are mapped to multiple Collections, the AIP Ingester will sometimes attempt to restore an item mapping before the Collection itself has been restored (thus throwing this error). Luckily, this is not anything to be concerned about. As soon as the Collection is restored, the Item Mapping which caused the error will also be automatically restored. So, if you encounter this error during a full restore, it is safe to bypass this error message using the 'skipIfParentMissing=true' option. All your Item Mappings should still be restored correctly.
- Collection or Community AIPs do not include all child objects (e.g. Items in those Collections or Communities), as each AIP only describes one object. However, these container AIPs do contain references (links) to all child objects. These references can be used by DSpace to automatically restore all referenced AIPs when restoring a Collection or Community.
- AIPs are only generated for objects which are currently in the "in archive" state in DSpace. This means that in-progress, uncompleted submissions are not described in AIPs and cannot be restored after a disaster. Permanently removed objects will also no longer be exported as AIPs after their removal. However, withdrawn objects will continue to be exported as AIPs, since they are still considered under the "in archive" status.
- AIPs with identical contents will always have identical checksums. This provides a basic means of validating whether the contents within an AIP have changed. For example, if a Collection's AIP has the same checksum at two different points in time, it means that Collection has not changed during that time period.
- The AIP profile favors completeness and accuracy rather than presenting the semantics of an object in a standard format. It conforms to the quirks of DSpace's internal object model rather than attempting to produce a universally understandable representation of the object. When possible, an AIP tries to use common standards to express objects.
- An AIP can serve as a DIP (Dissemination Information Package) or SIP (Submission Information Package), especially when transferring custody of objects to another DSpace implementation. In contrast to SIP or DIP, the AIP should include all available DSpace structural and administrative metadata, and basic provenance information. AIPs also describe some basic system level information (e.g. Groups and People).

9.18.7.1.2. General AIP Structure / Examples
Generally speaking, an AIP is a Zip file containing a METS manifest and all related content bitstreams, license files and any other associated files. Some examples include:
- Site AIP (Sample: SITE-example.zip)
  - METS contains basic metadata about the DSpace Site and persistent IDs referencing all Top Level Communities.
  - METS also contains a list of all Groups and EPeople information defined in the DSpace system. (NOTE: By default, user passwords are not stored in AIPs, unless you specify the 'passwords' flag. See Additional Packager Options.)
- Community AIP (Sample: COMMUNITY@123456789-1.zip)
  - METS contains all metadata for the Community and persistent IDs referencing all members (SubCommunities or Collections).
  - Package may also include a Logo file, if one exists.
  - METS contains any Group information for Community-specific groups (e.g. COMMUNITY_<ID>_ADMIN group).
  - METS contains all Community permissions/policies (translated into METSRights schema).
- Collection AIP (Sample: COLLECTION@123456789-2.zip)
  - METS contains all metadata for the Collection and persistent IDs referencing all members (Items).
  - Package may also include a Logo file, if one exists.
  - METS contains any Group information for Collection-specific groups (e.g. COLLECTION_<ID>_ADMIN, COLLECTION_<ID>_SUBMIT, etc.).
  - METS contains all Collection permissions/policies (translated into METSRights schema).
  - If the Collection has an Item Template, the METS will also contain all the metadata for that Item Template.
- Item AIP (Sample: ITEM@123456789-8.zip)
  - METS contains all metadata for the Item and references to all Bundles and Bitstreams.
  - Package also includes all Bitstream files.
  - METS contains all Item/Bundle/Bitstream permissions/policies (translated into METSRights schema).
Notes:
- Bitstreams and Bundles are second-class archival objects; they are recorded in the context of an Item.
- BitstreamFormats are not even second-class; they are described implicitly within Item technical metadata, and reconstructed from that during restoration.
- EPeople are only defined in the Site AIP, but may be referenced from Community or Collection AIPs.
- Groups may be defined in the Site AIP, a Community AIP or a Collection AIP. Where they are defined depends on whether the Group relates specifically to a single Community or Collection, or is just a general site-wide group.
What is NOT in AIPs
- DSpace Site configurations ([dspace]/config/ directory) or customizations (themes, stylesheets, etc.) are not described in AIPs.
- The DSpace Database model (or customizations therein) is not described in AIPs.
- Any objects which are not currently in the "In Archive" state are not described in AIPs. This means that in-progress, unfinished submissions are never included in AIPs.
Customizing What Is Stored in Your AIPs
If you choose, you can customize exactly what information is stored in your AIPs. However, you should be aware that you can only restore information which is stored within your AIPs. If you choose to remove information from your AIPs, you will be unable to restore it later on (unless you are also backing up your entire DSpace database and assetstore folder).
AIP Recommendations
We recommend using at least the default settings when generating AIPs. DSpace can only restore information that is included within an AIP. Therefore, if you choose to no longer include some information in an AIP, DSpace will no longer be able to restore that information from an AIP backup.
There are two ways to go about customizing your AIP format:
1. You can customize your dspace.cfg settings pertaining to AIP generation. These configurations will allow you to specify exactly which DSpace Crosswalks will be called when generating the AIP METS manifest.
2. You can export your AIPs using one of the special options/flags.
dmdSec/mdWrap@MDTYPE="OTHER",@OTHERMDTYPE="DIM". See DIM (DSpace Intermediate Metadata) Schema section below for more information.
For Collection AIPs, additional dmdSec elements may exist which describe the Item Template for that Collection. Since an Item template is not an actual Item (i.e. it only includes metadata), it is stored within the Collection AIP. The Item Template's dmdSec elements will be referenced by a div @TYPE="DSpace ITEM Template" in the METS structMap.
When the mdWrap @MDTYPE value is OTHER, the element MUST include a value for the @OTHERMDTYPE attribute which names the crosswalk that produced (or interprets) that metadata, e.g. DIM.

mets/amdSec element(s)
One or more amdSec elements are included for all AIPs. The first amdSec element contains administrative metadata (technical, source, rights, and provenance) for the entire archival object. Additional amdSec elements may exist to describe parts of the archival object (e.g. Bitstreams or Bundles in an Item).

techMD elements. By default, two types of techMD elements may be included:
- PREMIS metadata about an object may be included here (currently only specified for Bitstreams (files)). Specified by mdWrap@MDTYPE="PREMIS". See PREMIS Schema section below for more information.
- DSPACE-ROLES metadata may appear here to describe the Groups or EPeople related to this object (currently only specified for Site, Community and Collection). Specified by mdWrap@MDTYPE="OTHER",@OTHERMDTYPE="DSPACE-ROLES". See DSPACE-ROLES Schema section below for more information.

rightsMD elements. By default, there are four possible types of rightsMD elements which may be included:
- METSRights metadata may appear here to describe the permissions on this object. Specified by mdWrap@MDTYPE="OTHER",@OTHERMDTYPE="METSRIGHTS". See METSRights Schema section below for more information.
- DSpaceDepositLicense: if the object is an Item and it has a deposit license, it is contained here. Specified by mdWrap@MDTYPE="OTHER",@OTHERMDTYPE="DSpaceDepositLicense".
- CreativeCommonsRDF: if the object is an Item with a Creative Commons license expressed in RDF, it is included here. Specified by mdWrap@MDTYPE="OTHER",@OTHERMDTYPE="CreativeCommonsRDF".
- CreativeCommonsText: if the object is an Item with a Creative Commons license in plain text, it is included here. Specified by mdWrap@MDTYPE="OTHER",@OTHERMDTYPE="CreativeCommonsText".

sourceMD element. By default, there is only one type of sourceMD element which may appear:
- AIP-TECHMD metadata may appear here. This stores basic technical/source metadata about an object in a DSpace native format. Specified by mdWrap@MDTYPE="OTHER",@OTHERMDTYPE="AIP-TECHMD". See AIP Technical Metadata Schema (AIP-TECHMD) section below for more information.

digiprovMD element. Not used at this time.
mets/fileSec element
For ITEM objects:
- Each distinct Bundle in an Item goes into a fileGrp. The fileGrp has a @USE attribute which corresponds to the Bundle name.
- Bitstreams in bundles become file elements under fileGrp.
- mets/fileSec/fileGrp/file elements:
  - Set @SIZE to the length of the bitstream. There is a redundant value in the <techMD>, but it is more accessible here.
  - Set @MIMETYPE, @CHECKSUM, @CHECKSUMTYPE to the corresponding bitstream values. There is redundant info in the <techMD>. (For DSpace, @CHECKSUMTYPE="MD5" at all times.)
  - Set @SEQ to the bitstream's SequenceID if it has one.
  - Set @ADMID to the list of <amdSec> element(s) which describe this bitstream.
For COLLECTION and COMMUNITY objects:
- Only if the object has a logo bitstream, there is a fileSec with one fileGrp child of @USE="LOGO".
- The fileGrp contains one file element, representing the logo Bitstream. It has the same @MIMETYPE, @CHECKSUM, @CHECKSUMTYPE attributes as the Item content bitstreams, but does NOT include metadata section references (e.g. @ADMID) or a @SEQ attribute.
- See the main structMap for the fptr reference to this logo file.

mets/structMap - Primary structure map, @LABEL="DSpace Object", @TYPE="LOGICAL"
For ITEM objects:
1. Top-Level div with @TYPE="DSpace Object Contents".
   - For every Bitstream in Item it contains a div with @TYPE="DSpace BITSTREAM". Each Bitstream div has a single fptr element which references the bitstream location.
   - If Item has primary bitstream, put it in structMap/div/fptr (i.e. directly under the div with @TYPE="DSpace Object Contents").
For COLLECTION objects:
1. Top-Level div with @TYPE="DSpace Object Contents".
   - For every Item in the Collection, it contains a div with @TYPE="DSpace ITEM". Each Item div has up to two child mptr elements:
     a. One linking to the Handle of that Item. Its @LOCTYPE="HANDLE", and @xlink:href value is the raw Handle.
     b. (Optional) one linking to the location of the local AIP for that Item (if known). Its @LOCTYPE="URL", and @xlink:href value is a relative link to the AIP file on the local filesystem.
   - If Collection has a Logo bitstream, there is an fptr reference to it in the very first div.
   - If the Collection includes an Item Template, there will be a div with @TYPE="DSpace ITEM Template" within the very first div. This div @TYPE="DSpace ITEM Template" must have a @DMDID specified, which links to the dmdSec element(s) that contain the metadata for the Item Template.
For COMMUNITY objects:
1. Top-Level div with @TYPE="DSpace Object Contents".
   - For every Sub-Community in the Community it contains a div with @TYPE="DSpace COMMUNITY". Each Community div has up to two mptr elements:
     a. One linking to the Handle of that Community. Its @LOCTYPE="HANDLE", and @xlink:href value is the raw Handle.
     b. (Optional) one linking to the location of the local AIP file for that Community (if known). Its @LOCTYPE="URL", and @xlink:href value is a relative link to the AIP file on the local filesystem.
   - For every Collection in the Community there is a div with @TYPE="DSpace COLLECTION". Each Collection div has up to two mptr elements:
     a. One linking to the Handle of that Collection. Its @LOCTYPE="HANDLE", and @xlink:href value is the raw Handle.
     b. (Optional) one linking to the location of the local AIP file for that Collection (if known). Its @LOCTYPE="URL", and @xlink:href value is a relative link to the AIP file on the local filesystem.
   - If Community has a Logo bitstream, there is an fptr reference to it in the very first div.
For SITE objects:
1. Top-Level div with @TYPE="DSpace Object Contents".
   - For every Top-level Community in Site, it contains a div with @TYPE="DSpace COMMUNITY". Each Community div has up to two child mptr elements:
     a. One linking to the Handle of that Community. Its @LOCTYPE="HANDLE", and @xlink:href value is the raw Handle.
     b. (Optional) one linking to the location of the local AIP for that Community (if known). Its @LOCTYPE="URL", and @xlink:href value is a relative link to the AIP file on the local filesystem.

mets/structMap - Structure Map to indicate object's Parent, @LABEL="Parent", @TYPE="LOGICAL"
Contains one div element which has the unique attribute value TYPE="AIP Parent Link" to identify it as the holder of the parent pointer. It contains a mptr element whose xlink:href attribute value is the raw Handle of the parent object, e.g. 1721.1/4321.
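As a rough illustration of the parent structure map described above, the following sketch reads a METS manifest with the standard Java DOM parser and returns the raw Handle stored in the mptr under the div with TYPE="AIP Parent Link". The class and method names (ParentLinkReader, parentHandle) are hypothetical and are not part of DSpace; the sketch also assumes the manifest uses the usual METS and XLink namespaces.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not a DSpace class.
final class ParentLinkReader {
    // Assumed namespaces: the standard METS and XLink namespace URIs.
    static final String METS_NS = "http://www.loc.gov/METS/";
    static final String XLINK_NS = "http://www.w3.org/1999/xlink";

    /** Returns the raw Handle of the parent object, or null if no parent link is found. */
    static String parentHandle(String metsXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(metsXml.getBytes(StandardCharsets.UTF_8)));

            NodeList divs = doc.getElementsByTagNameNS(METS_NS, "div");
            for (int i = 0; i < divs.getLength(); i++) {
                Element div = (Element) divs.item(i);
                // Only the div marked as the parent link holds the parent pointer.
                if (!"AIP Parent Link".equals(div.getAttribute("TYPE"))) {
                    continue;
                }
                NodeList mptrs = div.getElementsByTagNameNS(METS_NS, "mptr");
                if (mptrs.getLength() > 0) {
                    return ((Element) mptrs.item(0)).getAttributeNS(XLINK_NS, "href");
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse METS manifest", e);
        }
    }
}
```

Called on a manifest whose parent structMap points at 1721.1/4321, parentHandle would return that Handle string; a manifest without a parent link (for example a Site AIP, assuming it has no parent) would yield null.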
By default, DIM metadata is always included in AIPs. It is controlled by the following configuration in your dspace.cfg:
aip.disseminate.dmd = MODS, DIM
DIM Descriptive Elements for Item objects
As all DSpace Items already have user-assigned DIM (essentially Qualified Dublin Core) metadata fields, those fields are just exported into the DIM Schema within the METS file.

DIM Descriptive Elements for Collection objects
For Collections, the following fields are translated to the DIM schema:

DIM Metadata Field                Database field or value
dc.description                    'introductory_text' field
dc.description.abstract           'short_description' field
dc.description.tableofcontents    'side_bar_text' field
dc.identifier.uri                 Collection's handle
dc.provenance                     'provenance_description' field
dc.rights                         'copyright_text' field
dc.rights.license                 'license' field
dc.title                          'name' field

DIM Descriptive Elements for Community objects
For Communities, the following fields are translated to the DIM schema:

DIM Metadata Field                Database field or value
dc.description                    'introductory_text' field
dc.description.abstract           'short_description' field
dc.description.tableofcontents    'side_bar_text' field
dc.identifier.uri                 Handle of Community
dc.rights                         'copyright_text' field
dc.title                          'name' field

DIM Descriptive Elements for Site objects
For the Site Object, the following fields are translated to the DIM schema:

Metadata Field                    Value
dc.identifier.uri                 Handle of Site (format: [handle_prefix]/0)
dc.title                          Name of Site (from dspace.cfg 'dspace.name' config)

9.18.7.3.2. MODS Schema
By default, all DSpace descriptive metadata (DIM) is also translated into the MODS Schema by utilizing DSpace's MODSDisseminationCrosswalk. DSpace's DIM to MODS crosswalk is defined within your [dspace]/config/crosswalks/mods.properties configuration file. This file allows you to customize the MODS that is included within your AIPs. For more information on the MODS Schema, see http://www.loc.gov/standards/mods/mods-schemas.html
In the METS structure, MODS metadata always appears within a dmdSec inside an <mdWrap MDTYPE="MODS"> element. For example:
<dmdSec ID="dmdSec_2189">
  <mdWrap MDTYPE="MODS">
    ...
  </mdWrap>
</dmdSec>
By default, MODS metadata is always included in AIPs. It is controlled by the following configuration in your dspace.cfg:
aip.disseminate.dmd = MODS, DIM
The MODS metadata is included within your AIP to support interoperability. It provides a way for other systems to interact with or ingest the AIP without needing to understand the DIM Schema. You may choose to disable MODS if you wish; however, this may make it harder to ingest your AIPs into a non-DSpace system (unless that non-DSpace system is able to understand the DIM schema).
When restoring/ingesting AIPs, DSpace will always first attempt to restore DIM descriptive metadata. The MODS metadata is used during a restore only if no DIM metadata is found.

9.18.7.3.3. AIP Technical Metadata Schema (AIP-TECHMD)
The AIP Technical Metadata Schema is a way to translate technical metadata about a DSpace object into the DIM Schema. It is kept separate from DIM as it is considered technical metadata rather than descriptive metadata.
In the METS structure, AIP-TECHMD metadata always appears within a sourceMD inside an <mdWrap MDTYPE="OTHER" OTHERMDTYPE="AIP-TECHMD"> element. For example:
<amdSec ID="amd_2191">
  ...
  <sourceMD ID="sourceMD_2198">
    <mdWrap MDTYPE="OTHER" OTHERMDTYPE="AIP-TECHMD">
      ...
    </mdWrap>
  </sourceMD>
  ...
</amdSec>
By default, AIP-TECHMD metadata is always included in AIPs. It is controlled by the following configuration in your dspace.cfg:
aip.disseminate.sourceMD = AIP-TECHMD
AIP Technical Metadata for Item

Metadata Field                Value
dc.contributor                Submitter's email address
dc.identifier.uri             Handle of Item
dc.relation.isPartOf          Owning Collection's Handle (as a URN)
dc.relation.isReferencedBy    All other Collections this item is linked to (Handle URN of each non-owner)
dc.rights.accessRights        "WITHDRAWN" if item is withdrawn

AIP Technical Metadata for Bitstream

Metadata Field                Value
dc.title                      Bitstream's name/title
dc.title.alternative          Bitstream's source
dc.description                Bitstream's description
dc.format                     Bitstream Format Description
dc.format.medium              Short Name of Format
dc.format.mimetype            MIMEType of Format
dc.format.supportlevel        System Support Level for Format (necessary to recreate Format during restore, if the format isn't known to DSpace by default)
dc.format.internal            Whether Format is internal (necessary to recreate Format during restore, if the format isn't known to DSpace by default)
Outstanding Question: Why are we recording the file format support status? That's a DSpace property, rather than an Item property. Do DSpace instances rely on objects to tell them their support status?
Possible answer (from Larry Stone): Format support and other properties of the BitstreamFormat are recorded here in case the Item is restored in an empty DSpace that doesn't have that format yet, and the relevant bits of the format entry have to be reconstructed from the AIP. --lcs

AIP Technical Metadata for Collection

Metadata Field                Value
dc.identifier.uri             Handle of Collection
dc.relation.isPartOf          Owning Community's Handle (as a URN)
dc.relation.isReferencedBy    All other Communities this Collection is linked to (Handle URN of each non-owner)

AIP Technical Metadata for Community

Metadata Field
dc.identifier.uri
dc.relation.isPartOf

AIP Technical Metadata for Site

Metadata Field
dc.identifier.uri

9.18.7.3.4. PREMIS Schema
At this point in time, the PREMIS Schema is only used to represent technical metadata about DSpace Bitstreams (i.e. Files). The PREMIS metadata is generated by DSpace's PREMISCrosswalk. Only the PREMIS Object Entity Schema is used.
In the METS structure, PREMIS metadata always appears within a techMD inside an <mdWrap MDTYPE="PREMIS"> element. PREMIS metadata is always wrapped within a <premis:premis> element. For example:
<amdSec ID="amd_2209">
  ...
  <techMD ID="techMD_2210">
    <mdWrap MDTYPE="PREMIS">
      <premis:premis>
        ...
      </premis:premis>
    </mdWrap>
  </techMD>
  ...
</amdSec>
Each Bitstream (file) has its own amdSec within a METS manifest. So, there will be a separate PREMIS techMD for each Bitstream within a single Item.
By default, PREMIS metadata is always included in AIPs. It is controlled by the following configuration in your dspace.cfg:
aip.disseminate.techMD = PREMIS, DSPACE-ROLES

PREMIS Metadata for Bitstream
The following Bitstream information is translated into PREMIS for each DSpace Bitstream (file):

Metadata Field                Value
<premis:objectIdentifier>     Contains Bitstream direct URL
<premis:objectCategory>       Always set to "File"
<premis:fixity>               Contains MD5 Checksum of Bitstream
<premis:format>               Contains File Format information of Bitstream
<premis:originalName>         Contains original name of file

9.18.7.3.5. DSPACE-ROLES Schema
All DSpace Groups and EPeople objects are translated into a custom DSPACE-ROLES XML Schema. This XML Schema is a very simple representation of the underlying DSpace database model for Groups and EPeople. The DSPACE-ROLES Schema is generated by DSpace's RoleCrosswalk.
Only the following DSpace Objects utilize the DSPACE-ROLES Schema in their AIPs:
- Site AIP: all Groups and EPeople are represented in DSPACE-ROLES Schema
- Community AIP: only Community-based groups (e.g. COMMUNITY_1_ADMIN) are represented in DSPACE-ROLES Schema
- Collection AIP: only Collection-based groups (e.g. COLLECTION_2_ADMIN, COLLECTION_2_SUBMIT, etc.) are represented in DSPACE-ROLES Schema
In the METS structure, DSPACE-ROLES metadata always appears within a techMD inside an <mdWrap MDTYPE="OTHER" OTHERMDTYPE="DSPACE-ROLES"> element. For example:
<amdSec ID="amd_2068">
  ...
  <techMD ID="techMD_2070">
    <mdWrap MDTYPE="OTHER" OTHERMDTYPE="DSPACE-ROLES">
      ...
    </mdWrap>
  </techMD>
  ...
</amdSec>
By default, DSPACE-ROLES metadata is always included in AIPs. It is controlled by the following configuration in your dspace.cfg:
aip.disseminate.techMD = PREMIS, DSPACE-ROLES
Example of DSPACE-ROLES Schema for a SITE AIP Below is a general example of the structure of a DSPACE-ROLES XML file, as it would appear in a SITE AIP.
<DSpaceRoles>
standable. Therefore, before export, these Group names are all translated to include an externally understandable identifier, in the form of a Handle. If you use this AIP to restore your groups later, they will be translated back to the normal DSpace format (i.e. the handle will be translated back to the new Internal ID).
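The name translation described above can be sketched as a pair of string rewrites. This is only an illustration, not DSpace's own code: the class GroupNameTranslator, its methods and the lookup maps are hypothetical, and the exported form COLLECTION_hdl:<handle>_ADMIN is taken from the METSRights examples in this chapter. The sketch further assumes that a Handle never contains an underscore.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not a DSpace class.
final class GroupNameTranslator {
    // Internal form, e.g. COLLECTION_2_ADMIN (database ID in the middle).
    private static final Pattern INTERNAL =
            Pattern.compile("^(COLLECTION|COMMUNITY)_(\\d+)_(.+)$");
    // Exported form, e.g. COLLECTION_hdl:123456789/2_ADMIN (assumes no '_' in the Handle).
    private static final Pattern EXPORTED =
            Pattern.compile("^(COLLECTION|COMMUNITY)_hdl:([^_]+)_(.+)$");

    /** handleByTypedId maps keys such as "COLLECTION_2" to a Handle such as "123456789/2". */
    static String toExportName(String groupName, Map<String, String> handleByTypedId) {
        Matcher m = INTERNAL.matcher(groupName);
        if (!m.matches()) {
            return groupName; // site-wide groups are left unchanged
        }
        String handle = handleByTypedId.get(m.group(1) + "_" + m.group(2));
        if (handle == null) {
            return groupName; // unknown object: leave the name as it is
        }
        return m.group(1) + "_hdl:" + handle + "_" + m.group(3);
    }

    /** idByHandle maps a Handle to the (possibly new) internal ID in the target system. */
    static String toInternalName(String exportedName, Map<String, Integer> idByHandle) {
        Matcher m = EXPORTED.matcher(exportedName);
        if (!m.matches()) {
            return exportedName;
        }
        Integer id = idByHandle.get(m.group(2));
        if (id == null) {
            return exportedName;
        }
        return m.group(1) + "_" + id + "_" + m.group(3);
    }
}
```

Note that the internal ID after a restore need not match the one at export time, which is why the reverse lookup is keyed by Handle rather than by the old ID.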
9.18.7.3.6. METSRights Schema
All DSpace Policies (permissions on objects) are translated into the METSRights schema. This is different from the above DSPACE-ROLES schema, which only represents Groups and People objects. Instead, the METSRights schema is used to translate the permission statements (e.g. a group named "Library Admins" has Administrative permissions on a Community named "University Library"). But the METSRights schema doesn't represent who is a member of a particular group (that is defined in the DSPACE-ROLES schema, as described above).
By default, METSRights metadata is always included in AIPs. It is controlled by the following configuration in your dspace.cfg:
aip.disseminate.rightsMD = DSpaceDepositLicense:DSPACE_DEPLICENSE, \
    CreativeCommonsRDF:DSPACE_CCRDF, CreativeCommonsText:DSPACE_CCTEXT, METSRIGHTS
Example of METSRights Schema for an Item
An Item AIP will almost always contain several METSRights metadata sections within its METS Manifest. A separate METSRights metadata section is used to describe the permissions on:
- the Item itself
- each Bundle (group of files) in the Item
- each Bitstream (file) within an Item's bundle
Below is an example of a METSRights section for a publicly visible Bitstream, Bundle or Item. Notice it specifies that the "GENERAL PUBLIC" has the permission to DISCOVER or DISPLAY this object.
<rights:RightsDeclarationMD xmlns:rights="http://cosimo.stanford.edu/sdr/metsrights/" RIGHTSCATEGORY="LICENSED">
  <rights:Context CONTEXTCLASS="GENERAL PUBLIC">
    <rights:Permissions DISCOVER="true" DISPLAY="true" MODIFY="false" DELETE="false" />
  </rights:Context>
</rights:RightsDeclarationMD>
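To make the structure of such a section concrete, here is a minimal sketch that reads a METSRights declaration with the standard Java DOM parser and reports, for each rights:Context, whether DISPLAY is granted. The class MetsRightsReader and its method are hypothetical and not part of DSpace. A context is labelled by its rights:UserName child when one is present, otherwise by its CONTEXTCLASS attribute.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper, not a DSpace class.
final class MetsRightsReader {
    static final String RIGHTS_NS = "http://cosimo.stanford.edu/sdr/metsrights/";

    /** Maps each context label to true if its Permissions element has DISPLAY="true". */
    static Map<String, Boolean> displayPermissions(String rightsXml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            Document doc = factory.newDocumentBuilder().parse(
                    new ByteArrayInputStream(rightsXml.getBytes(StandardCharsets.UTF_8)));

            Map<String, Boolean> result = new LinkedHashMap<>();
            NodeList contexts = doc.getElementsByTagNameNS(RIGHTS_NS, "Context");
            for (int i = 0; i < contexts.getLength(); i++) {
                Element context = (Element) contexts.item(i);

                // Prefer the group or user name; fall back to the context class.
                NodeList users = context.getElementsByTagNameNS(RIGHTS_NS, "UserName");
                String label = users.getLength() > 0
                        ? users.item(0).getTextContent().trim()
                        : context.getAttribute("CONTEXTCLASS");

                NodeList perms = context.getElementsByTagNameNS(RIGHTS_NS, "Permissions");
                boolean display = perms.getLength() > 0
                        && "true".equals(((Element) perms.item(0)).getAttribute("DISPLAY"));
                result.put(label, display);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse METSRights XML", e);
        }
    }
}
```

Applied to the publicly visible example above, the returned map would contain a single entry keyed "GENERAL PUBLIC" with the value true.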
Example of METSRights Schema for a Collection
A Collection AIP contains one METSRights section, which describes the permissions different Groups or People have within the Collection.
Below is an example of a METSRights section for a publicly visible Collection, which also has an Administrator group, a Submitter group, and a group for each of the three DSpace workflow approval steps. You'll notice that each of the groups is provided with very specific permissions within the Collection. Submitters and Workflow approvers can "ADD CONTENTS" to a collection (but cannot delete the collection). Administrators have full rights.
<rights:RightsDeclarationMD xmlns:rights="http://cosimo.stanford.edu/sdr/metsrights/" RIGHTSCATEGORY="LICENSED">
  <rights:Context CONTEXTCLASS="MANAGED_GRP">
    <rights:UserName USERTYPE="GROUP">COLLECTION_hdl:123456789/2_SUBMIT</rights:UserName>
    <rights:Permissions DISCOVER="true" DISPLAY="true" MODIFY="true" DELETE="false" OTHER="true" OTHERPERMITTYPE="ADD CONTENTS" />
  </rights:Context>
  <rights:Context CONTEXTCLASS="MANAGED_GRP">
    <rights:UserName USERTYPE="GROUP">COLLECTION_hdl:123456789/2_WORKFLOW_STEP_3</rights:UserName>
    <rights:Permissions DISCOVER="true" DISPLAY="true" MODIFY="true" DELETE="false" OTHER="true" OTHERPERMITTYPE="ADD CONTENTS" />
  </rights:Context>
  <rights:Context CONTEXTCLASS="MANAGED_GRP">
    <rights:UserName USERTYPE="GROUP">COLLECTION_hdl:123456789/2_WORKFLOW_STEP_2</rights:UserName>
    <rights:Permissions DISCOVER="true" DISPLAY="true" MODIFY="true" DELETE="false" OTHER="true" OTHERPERMITTYPE="ADD CONTENTS" />
  </rights:Context>
  <rights:Context CONTEXTCLASS="MANAGED_GRP">
    <rights:UserName USERTYPE="GROUP">COLLECTION_hdl:123456789/2_WORKFLOW_STEP_1</rights:UserName>
    <rights:Permissions DISCOVER="true" DISPLAY="true" MODIFY="true" DELETE="false" OTHER="true" OTHERPERMITTYPE="ADD CONTENTS" />
  </rights:Context>
  <rights:Context CONTEXTCLASS="MANAGED_GRP">
    <rights:UserName USERTYPE="GROUP">COLLECTION_hdl:123456789/2_ADMIN</rights:UserName>
    <rights:Permissions DISCOVER="true" DISPLAY="true" COPY="true" DUPLICATE="true" MODIFY="true" DELETE="true" PRINT="true" OTHER="true" OTHERPERMITTYPE="ADMIN" />
  </rights:Context>
  <rights:Context CONTEXTCLASS="GENERAL PUBLIC">
    <rights:Permissions DISCOVER="true" DISPLAY="true" MODIFY="false" DELETE="false" />
  </rights:Context>
</rights:RightsDeclarationMD>
Example of METSRights Schema for a Community
A Community AIP contains one METSRights section, which describes the permissions different Groups or People have within that Community.
Below is an example of a METSRights section for a publicly visible Community, which also has an Administrator group. As you'll notice, this content looks very similar to the Collection METSRights section (as described above).
<rights:RightsDeclarationMD xmlns:rights="http://cosimo.stanford.edu/sdr/metsrights/" RIGHTSCATEGORY="LICENSED">
  <rights:Context CONTEXTCLASS="MANAGED_GRP">
    <rights:UserName USERTYPE="GROUP">COMMUNITY_hdl:123456789/10_ADMIN</rights:UserName>
    <rights:Permissions DISCOVER="true" DISPLAY="true" COPY="true" DUPLICATE="true" MODIFY="true" DELETE="true" PRINT="true" OTHER="true" OTHERPERMITTYPE="ADMIN" />
  </rights:Context>
  <rights:Context CONTEXTCLASS="GENERAL PUBLIC">
    <rights:Permissions DISCOVER="true" DISPLAY="true" MODIFY="false" DELETE="false" />
  </rights:Context>
</rights:RightsDeclarationMD>

Curation System
9.19.1. Tasks
The goal of the curation system ('CS') is to provide a simple, extensible way to manage routine content operations on a repository. These operations are known to CS as 'tasks', and they can operate on any DSpaceObject (i.e. subclasses of DSpaceObject), which means Communities, Collections, and Items: the core data model objects. Tasks may elect to work on only one type of DSpace object, typically an Item, and in this case they may simply ignore other data types (tasks have the ability to 'skip' objects for any reason). The DSpace core distribution will provide a number of useful tasks, but the system is designed to encourage local extension: tasks can be written for any purpose, and placed in any java package. This gives DSpace sites the ability to customize the behavior of their repository without having to alter, and therefore manage synchronization with, the DSpace source code.
What sorts of activities are appropriate for tasks? Some examples:
- apply a virus scan to item bitstreams (this will be our example below)
- profile a collection based on format types (good for identifying format migrations)
- ensure a given set of metadata fields are present in every item, or even that they have particular values
- call a network service to enhance/replace/normalize an item's metadata or content
- ensure all item bitstreams are readable and their checksums agree with the ingest values
Since tasks have access to, and can modify, DSpace content, performing tasks is considered an administrative function to be available only to knowledgeable collection editors, repository administrators, sysadmins, etc. No tasks are exposed in the public interfaces.
9.19.2. Activation
For CS to run a task, the code for the task must of course be included with other deployed code (to [dspace]/lib, WAR, etc) but it must also be declared and given a name. This is done via a configuration property in [dspace]/config/modules/curate.cfg as follows:
plugin.named.org.dspace.curate.CurationTask = \
    org.dspace.curate.ProfileFormats = profileformats, \
    org.dspace.curate.RequiredMetadata = requiredmetadata, \
    org.dspace.curate.ClamScan = vscan
For each activated task, a key-value pair is added. The key is the fully qualified class name and the value is the taskname used elsewhere to configure the use of the task, as will be seen below. Note that the curate.cfg configuration file, while in the config directory, is located under 'modules'. The intent is that tasks, as well as any configuration they require, will be optional 'add-ons' to the basic system configuration. Adding or removing tasks has no impact on dspace.cfg.
For many tasks, this activation configuration is all that will be required to use them. Other tasks need specific configuration of their own. A concrete example is described below, but note that these task-specific configuration property files also reside in [dspace]/config/modules.
The return value should be a code describing one of 4 conditions:
 0 : SUCCESS - the task completed successfully
 1 : FAIL - the task failed (it is up to the task to decide what 'counts' as failure; an example might be that the virus scan finds an infected file)
 2 : SKIPPED - the task could not be performed on the object, perhaps because it was not applicable
-1 : ERROR - the task could not be completed due to an error
If a task extends the AbstractCurationTask class, that is the only method it needs to define.
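The four return codes can be illustrated with a small, self-contained sketch. It deliberately does not use the DSpace CurationTask interface: the class RequiredFieldCheck, its perform method and the plain Map standing in for an item's metadata are all hypothetical, chosen only to show when each code might be returned.

```java
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a curation task; not the DSpace API.
final class RequiredFieldCheck {
    // Status codes as listed in the text above.
    static final int ERROR = -1;
    static final int SUCCESS = 0;
    static final int FAIL = 1;
    static final int SKIPPED = 2;

    /**
     * metadata: field name to value for one object, or null when the object
     * is of a type this check does not apply to.
     */
    static int perform(Map<String, String> metadata, List<String> requiredFields) {
        if (metadata == null) {
            return SKIPPED; // not applicable to this object
        }
        try {
            for (String field : requiredFields) {
                String value = metadata.get(field);
                if (value == null || value.trim().isEmpty()) {
                    return FAIL; // the task ran, and the object did not pass
                }
            }
            return SUCCESS;
        } catch (RuntimeException e) {
            return ERROR; // the task itself could not complete
        }
    }
}
```

The distinction worth noticing is between FAIL (the check ran and the object did not pass) and ERROR (the check could not be carried out at all, here for instance when no list of required fields is supplied).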
-r - emit reporting to standard out
As with other command-line tools, these invocations could be placed in a cron table and run on a fixed schedule, or run on demand by an administrator.
When a task is selected from the drop-down list and performed, the tab displays both a phrase interpreting the 'status code' of the task execution, and the 'result' message if any has been defined. When the task has been queued, an acknowledgement appears instead. You may configure the words used for status codes in curate.cfg (for clarity, language localization, etc):
ui.statusmessages = \
    -3 = Unknown Task, \
    -2 = No Status Set, \
    -1 = Error, \
    0 = Success, \
    1 = Fail, \
    2 = Skip, \
    other = Invalid Status
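As an illustration of how such a property value could be turned into display text, the sketch below parses the comma-separated "code = message" pairs and looks up a message for a status code, falling back to the "other" entry. The class StatusMessages is hypothetical and this is not DSpace's own parsing code. It assumes the value has already been joined into a single line (as java.util.Properties does for backslash continuations) and that no message contains a comma.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not a DSpace class.
final class StatusMessages {
    /** Parses a value such as "-1 = Error, 0 = Success, other = Invalid Status". */
    static Map<String, String> parse(String propertyValue) {
        Map<String, String> messages = new LinkedHashMap<>();
        for (String pair : propertyValue.split(",")) {
            int eq = pair.indexOf('=');
            if (eq < 0) {
                continue; // ignore fragments without a "key = value" shape
            }
            messages.put(pair.substring(0, eq).trim(), pair.substring(eq + 1).trim());
        }
        return messages;
    }

    /** Returns the configured phrase for a code, or the "other" phrase if none is configured. */
    static String messageFor(Map<String, String> messages, int statusCode) {
        String message = messages.get(Integer.toString(statusCode));
        return message != null ? message : messages.get("other");
    }
}
```

With the configuration shown above, messageFor would give "Success" for 0 and "Invalid Status" for a code such as 7 that has no entry of its own.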
9.19.4.3. In workflow
CS provides the ability to attach any number of tasks to standard DSpace workflows. Using a configuration file [dspace]/config/workflow-curation.xml, you can declaratively (without coding) wire tasks to any step in a workflow. An example:
<taskset-map>
  <mapping collection-handle="default" taskset="cautious" />
</taskset-map>
<tasksets>
  <taskset name="cautious">
    <flowstep name="step1">
      <task name="vscan">
        <workflow>reject</workflow>
        <notify on="fail">$flowgroup</notify>
        <notify on="fail">$colladmin</notify>
        <notify on="error">$siteadmin</notify>
      </task>
    </flowstep>
  </taskset>
</tasksets>
This markup would cause a virus scan to occur during step one of workflow for any collection, and automatically reject any submissions with infected files. It would further notify (via email) both the reviewers (step 1 group), and the collection administrators, if either of these are defined. If it could not perform the scan, the site administrator would be notified.
The notifications use the same procedures that other workflow notifications do - namely email. There is a new email template defined for curation task use: [dspace]/config/emails/flowtask_notify. This may be language-localized or otherwise modified like any other email template. Like configurable submission, you can assign these task rules per collection, as well as having a default for any collection.
would do approximately what the command line invocation did. The method 'curate' just performs all the tasks configured (you can add multiple tasks to a curator).
would place a request on a named queue "monthly" to virus scan the collection. To read (and process) the queue, we could for example:
[dspace]/bin/dspace curate -q monthly
use the command-line tool, but we could also read the queue programmatically. Any number of queues can be defined and used as needed. In the administrative UI curation 'widget', it is possible both to perform a task and to place it on a queue for later processing.
-1  ERROR - task could not be performed
 0  SUCCESS - task performed successfully
 1  FAIL - task performed, but failed
 2  SKIP - task not performed due to object not being eligible
In the administrative UI, this code is translated into the word or phrase configured by the ui.statusmessages property (discussed above) for display.
CS does not interpret or assign result strings; the task does. A task is not required to assign a result, but the 'best practice' for tasks is to assign one whenever possible.
A related issue concerns how non-distributive tasks report their status and results: the status will normally reflect only the last invocation of the task in the container, so important outcomes could be lost. If a task declares itself @Suspendable, however, the CS will cease processing when it encounters a FAIL status. When used in the UI, for example, this would mean that if our virus scan is running over a collection, it would stop and return status (and result) to the scene on the first infected item it encounters. You can even tune @Suspendable tasks more precisely by annotating what invocations you want to suspend on. For example:
@Suspendable(invoked=Curator.Invoked.INTERACTIVE)
public class MyTask implements CurationTask
would mean that the task would suspend if invoked in the UI, but would run to completion if run on the command-line. Only a few annotation types have been defined so far, but as the number of tasks grows, we can look for common behavior that can be signaled by annotation. For example, there is a @Mutative type, which tells CS that the task may alter (mutate) the object it is working on.
where the left column is the count of bitstreams of the named format and the letter in parentheses is an abbreviation of the repository-assigned support level for that format:
U  Unsupported
K  Known
S  Supported
The profiler will operate on any DSpace object. If the object is an item, then only that item's bitstreams are profiled; if a collection, all the bitstreams of all the items; if a community, all the items of all the collections of the community.
9.19.8.3.1. Setup the service from the ClamAV documentation.
This plugin requires a ClamAV daemon installed and configured for TCP sockets. Instructions for installing ClamAV: http://www.clamav.net/doc/latest/clamdoc.pdf
NOTICE: The following directions assume there is a properly installed and configured clamav daemon. Refer to the link above for more information about ClamAV. The Clam anti-virus database must be updated regularly to maintain the most current level of anti-virus protection. Please refer to the ClamAV documentation for instructions about maintaining the anti-virus database.

9.19.8.3.2. DSpace Configuration
In [dspace]/config/modules/curate.cfg, activate the task: Add the plugin to the comma separated list of curation tasks.
### Task Class implementations
plugin.named.org.dspace.curate.CurationTask = \
    org.dspace.curate.ProfileFormats = profileformats, \
    org.dspace.curate.RequiredMetadata = requiredmetadata, \
    org.dspace.curate.ClamScan = vscan
Optionally, add the vscan friendly name to the configuration to enable it in the administrative user interface.
ui.tasknames = \
    profileformats = Profile Bitstream Formats, \
    requiredmetadata = Check for Required Metadata, \
    vscan = Scan for Viruses
9.19.8.3.3. Task Operation from the GUI
Curation tasks can be run against container and item DSpace objects by e-persons with administrative privileges. A curation tab will appear in the administrative UI after logging into DSpace:
1. Click on the curation tab.
2. Select the option configured in ui.tasknames above.
3. Select Perform.

9.19.8.3.4. Task Operation from the curation command line client
To output the results to the console:
[dspace]/bin/dspace curate -t vscan -i <handle of container or item dso> -r -
Table 1 Virus Scan Results Table GUI (Interactive Mode) Container Container Item Item Command Line Container T Report on 1st infected bitstream within an item/Scan all contained Items Report on all infected bitstreams/Scan all contained Items Report on 1st infected bitstream Report on all infected bitstreams FailFast T F T F Expectation Stop on 1st Infected Bitstream Stop on 1st Infected Item Stop on 1st Infected Bitstream Scan all bitstreams
10. Directories
10.1. Overview
A complete DSpace installation consists of three separate directory trees:
The source directory: This is where (surprise!) the source code lives. Note that the config files here are used only during the initial install process. After the install, config files should be changed in the install directory. It is referred to in this document as [dspace-source].
The install directory: This directory is populated during the install process and also by DSpace as it runs. It contains config files, command-line tools (and the libraries necessary to run them), and usually, although not necessarily, the contents of the DSpace archive (depending on how DSpace is configured). After the initial build and install, changes to config files should be made in this directory. It is referred to in this document as [dspace].
The web deployment directory: This directory is generated by the web server the first time it finds a dspace.war file in its webapps directory. It contains the unpacked contents of dspace.war, i.e. the JSPs and Java classes and libraries necessary to run DSpace. Files in this directory should never be edited directly; if you wish to modify your DSpace installation, you should edit files in the source directory and then rebuild. The contents of this directory aren't listed here since its creation is completely automatic. It is usually referred to in this document as [tomcat]/webapps/dspace.
Source Directory Layout
news-side.html - Text of the front-page news in the sidebar, only used in the JSPUI.
news-top.html - Text of the front-page news in the top box, only used in the JSPUI.
emails/ - Text and layout templates for emails sent out by the system.
registries/ - Initial contents of the bitstream format registry and Dublin Core element/qualifier registry. These are only used on initial system setup, after which they are maintained in the database.
docs/ - DSpace system documentation. The technical documentation for functionality, installation, configuration, etc.
etc/ - This directory contains administrative files needed for the install process and by developers, mostly database initialization and upgrade scripts. Any .xml files in etc/ are common to all supported database systems.
    postgres/ - Versions of the database schema and updater SQL scripts for PostgreSQL.
    oracle/ - Versions of the database schema and updater SQL scripts for Oracle.
modules/ - The Web UI modules "overlay" directory. DSpace uses Maven to automatically look here for any customizations you wish to make to DSpace Web interfaces.
    jspui - Contains all customizations for the JSP User Interface.
        src/main/resources/ - The overlay for JSPUI Resources. This is the location to place any custom Messages.properties files. (Previously this file had been stored at: [dspace-source]/config/language-packs/Messages.properties)
        src/main/webapp/ - The overlay for JSPUI Web Application. This is the location to place any custom JSPs to be used by DSpace.
    lni - Contains all customizations for the Lightweight Network Interface.
    oai - Contains all customizations for the OAI-PMH Interface.
    sword - Contains all customizations for the SWORD (Simple Web-service Offering Repository Deposit) Interface.
    xmlui - Contains all customizations for the XML User Interface (aka Manakin).
        src/main/webapp/ - The overlay for XMLUI Web Application. This is the location to place custom Themes or Configurations.
            i18n/ - The location to place a custom version of the XMLUI's messages.xml (you have to manually create this folder).
            themes/ - The location to place custom Themes for the XMLUI (you have to manually create this folder).
src/ - Maven configurations for DSpace System. This directory contains the Maven and Ant build files for DSpace.
target/ - (Only exists after building DSpace) This is the location Maven uses to build your DSpace installation package.
    dspace-[version].dir - The location of the DSpace Installation Package (which can then be installed by running ant update).
Installed Directory Layout
Log Files

Log File / What's In It

[dspace]/log/dspace.log.yyyy-mm-dd
    Main DSpace log file. This is where the DSpace code writes a simple log of events and errors that occur within the DSpace code. You can control the verbosity of this by editing the [dspace-source]/config/templates/log4j.properties file and then running "ant init_configs".
[dspace]/log/cocoon.log.yyyy-mm-dd
    Apache Cocoon log file for the XMLUI. This is where the DSpace XMLUI logs all of its events and errors.
[tomcat]/logs/catalina.out
    This is where Tomcat's standard output is written. Many errors that occur within the Tomcat code are logged here. For example, if Tomcat can't find the DSpace code (dspace.jar), it would be logged in catalina.out.
[tomcat]/logs/hostname_log.yyyy-mm-dd.txt
    If you're running Tomcat stand-alone (without Apache), it logs some information and errors for specific Web applications to this log file. hostname will be your host name (e.g. dspace.myu.edu) and yyyy-mm-dd will be the date.
[tomcat]/logs/apache_log.yyyy-mm-dd.txt
    If you're using Apache, Tomcat logs information about Web applications running through Apache (mod_webapp) in this log file (yyyy-mm-dd being the date).
[apache]/error_log
    Apache logs to this file. If there is a problem with getting mod_webapp working, this is a good place to look for clues. Apache also writes to several other log files, though error_log tends to contain the most useful information for tracking down problems.
[dspace]/log/handle-plug.log
    The Handle server runs as a separate process from the DSpace Web UI (which runs under Tomcat's JVM). Due to a limitation of log4j's 'rolling file appenders', the DSpace code running in the Handle server's JVM must use a separate log file. The DSpace code that is run as part of a Handle resolution request writes log information to this file. You can control the verbosity of this by editing [dspace-source]/config/templates/log4j-handle-plugin.properties.
[dspace]/log/handle-server.log
    This is the log file for CNRI's Handle server code. If a problem occurs within the Handle server code, before DSpace's plug-in is invoked, this is where it may be logged. On the other hand, a problem with CNRI's Handle server code might be logged here.
PostgreSQL log
    PostgreSQL also writes a log file. This one doesn't seem to have a default location; you probably had to specify it yourself at some point during installation. In general, this log file rarely contains pertinent information. PostgreSQL is pretty stable; you're more likely to encounter problems with connecting via JDBC, and these problems will be logged in dspace.log.
These lines control what level of logging takes place. Normally they should be set to INFO, but if you need to see more information in the logs, set them to DEBUG and restart your web server.
log4j.appender.A1=org.dspace.app.util.DailyFileAppender
    This is the name of the log file creation method used. The DailyFileAppender creates a new date-stamped file every day or month.
log4j.appender.A1.File=${log.dir}/dspace.log
    This sets the filename and location of where the log file will be stored. It will have a date stamp appended to the file name.
log4j.appender.A1.DatePattern=yyyy-MM-dd
    This defines the format for the date stamp that is appended to the log file names. If you wish to have log files created monthly instead of daily, change this to yyyy-MM.
log4j.appender.A1.MaxLogs=0
    This defines how many log files will be created. You may wish to define a retention period for log files. If you set this to 365, logs older than a year will be deleted. By default this is set to 0 so that no logs are ever deleted. Ensure that you monitor the disk space used by the logs to make sure that you have enough space for them. It is often important to keep the log files for a long time in case you want to rebuild your statistics.
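The effect of the date pattern on log file names can be sketched as follows. This is only an illustration, assuming the pattern letters follow the usual Java date-format conventions; LogFileNameSketch and stampedName are hypothetical names and are not part of DSpace or of the DailyFileAppender.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical helper, not part of DSpace: shows how a date pattern
// turns a base log file name into a date-stamped one.
final class LogFileNameSketch {

    // Appends "." plus the formatted date to the base file name.
    static String stampedName(String baseName, String datePattern, LocalDate date) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(datePattern);
        return baseName + "." + formatter.format(date);
    }

    public static void main(String[] args) {
        LocalDate day = LocalDate.of(2002, 11, 11);
        // One file per day
        System.out.println(stampedName("dspace.log", "yyyy-MM-dd", day));
        // One file per month
        System.out.println(stampedName("dspace.log", "yyyy-MM", day));
    }
}
```

With a daily pattern every calendar day gets its own file, while the shorter monthly pattern makes all days of a month share one file name.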
11. Architecture
11.1. Overview
The DSpace system is organized into three layers, each of which consists of a number of components.
The reason for this design choice is that authentication methods will vary widely between different applications, so it makes sense to leave the logic and responsibility for that in these applications.
The source code is organized to adhere very strictly to this three-layer architecture. Also, only methods in a component's public API are given the public access level. This means that the Java compiler helps ensure that the source code conforms to the architecture.

Packages within        Correspond to components in
org.dspace.app         Application layer
org.dspace             Business logic layer (except storage and app)
org.dspace.storage     Storage layer
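The package-to-layer mapping above can be expressed as a small lookup. This is a minimal sketch; LayerSketch and layerOf are hypothetical names introduced for illustration and do not exist in the DSpace code base.

```java
// Hypothetical helper, not part of DSpace: maps a Java package name to the
// architectural layer it belongs to, following the mapping given above.
final class LayerSketch {

    static String layerOf(String packageName) {
        // The two specific prefixes are checked before the general org.dspace one.
        if (isInOrUnder(packageName, "org.dspace.app")) {
            return "Application layer";
        }
        if (isInOrUnder(packageName, "org.dspace.storage")) {
            return "Storage layer";
        }
        if (isInOrUnder(packageName, "org.dspace")) {
            return "Business logic layer";
        }
        return "Not a DSpace package";
    }

    // True when pkg is exactly the given package or one of its sub-packages.
    private static boolean isInOrUnder(String pkg, String parent) {
        return pkg.equals(parent) || pkg.startsWith(parent + ".");
    }

    public static void main(String[] args) {
        System.out.println(layerOf("org.dspace.app.webui.servlet"));
        System.out.println(layerOf("org.dspace.storage.rdbms"));
        System.out.println(layerOf("org.dspace.content"));
    }
}
```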
The storage and business logic layer APIs are extensively documented with Javadoc-style comments. Generate the HTML version of these by entering the [dspace-source]/dspace directory and running:
mvn javadoc:javadoc
The resulting documentation will be at [dspace-source]/dspace-api/target/site/apidocs/index.html. The package-level documentation of each package usually contains an overview of the package and some example usage. This information is not repeated in this architecture document; this and the Javadoc APIs are intended to be used in parallel.
Each layer is described in a separate section:
Section 11.5, Storage Layer
    RDBMS
    Bitstream Store
Section 11.3, Business Logic Layer
    Core Classes
    Content Management API
    Workflow System
    Administration Toolkit
    E-person/Group Manager
    Authorisation
    Handle Manager/Handle Plugin
    Search
    Browse API
    History Recorder
    Checksum Checker
Section 11.2, Application Layer
    Web User Interface
    OAI-PMH Data Provider
    Item Importer and Exporter
    Transferring Items Between DSpace Instances
    Registration
    METS Tools
    Media Filters
    Sub-Community Management
[dspace-source]/dspace-jspui/dspace-jspui-api/src/main/java/org/dspace/app/webui/util/
    Miscellaneous classes used by the servlets and filters
[dspace-source]/dspace-jspui
    The JSP files
[dspace-source]/dspace/modules/jspui/src/main/webapp
    This is where you place customized versions of JSPs; see 6. JSPUI Configuration and Customization
[dspace-source]/dspace/modules/xmlui/src/main/webapp
    This is where you place customizations for the Manakin interface; see 7. Manakin [XMLUI] Configuration and Customization
[dspace-source]/dspace/modules/jspui/src/main/resources
    This is where you can place your customized version of the Messages.properties file.
[dspace-source]/dspace-jspui/dspace-jspui-webapp/src/main/webapp/WEB-INF/dspace-tags.tld
    Custom DSpace JSP tag descriptor
Please see Section 4, Installation, for more details about the installation process.
The reasons for this approach are:
    All of the processing is done before the JSP is invoked, so any error or problem that occurs does not occur halfway through HTML rendering.
    The JSPs contain as little code as possible, so they can be customized without having to delve into Java code too much.
The org.dspace.app.webui.servlet.LoadDSpaceConfig servlet is always loaded first. This is a very simple servlet that checks the dspace-config context parameter from the DSpace deployment descriptor, and uses it to locate dspace.cfg. It also loads up the Log4j configuration. It's important that this servlet is loaded first, since if another servlet is loaded up, it will cause the system to try and load DSpace and Log4j configurations, neither of which would be found.
All DSpace servlets are subclasses of the DSpaceServlet class. The DSpaceServlet class handles some basic operations such as creating a DSpace Context object (opening a database connection etc.), authentication and error handling. Instead of overriding the doGet and doPost methods as one normally would for a servlet, DSpace servlets implement doDSGet or doDSPost, which have an extra context parameter and allow the servlet to throw various exceptions that can be handled in a standard way.
The DSpace servlet processes the contents of the HTTP request. This might involve retrieving the results of a search with a query term, accessing the current user's eperson record, or updating a submission in progress. According to the results of this processing, the servlet must decide which JSP should be displayed. The servlet then fills out the appropriate attributes in the object that represents the HTTP request being processed. This is done by invoking the setAttribute method of the javax.servlet.http.HttpServletRequest object that is passed into the servlet from Tomcat. The servlet then forwards control of the request to the appropriate JSP using the JSPManager.showJSP method. The JSPManager.showJSP method uses the standard Java servlet forwarding mechanism to forward the HTTP request to the JSP. The JSP is processed by Tomcat and the results are sent back to the user's browser.
There is an exception to this servlet/JSP style: index.jsp, the 'home page', receives the HTTP request directly from Tomcat without a servlet being invoked first. This is because in the servlet 2.3 specification, there is no way to map a servlet to handle only requests made to '/'; such a mapping results in every request being directed to that servlet. By default, Tomcat forwards requests to '/' to index.jsp. To try and make things as clean as possible, index.jsp contains some simple code that would normally go in a servlet, and then forwards to home.jsp using the JSPManager.showJSP method. This means localized versions of the 'home page' can be created by placing a customized home.jsp in [dspace-source]/jsp/local, in the same manner as other JSPs.
[dspace-source]/jsp/dspace-admin/index.jsp, the administration UI index page, is invoked directly by Tomcat and not through a servlet for similar reasons.
At the top of each JSP file, right after the license and copyright header, the attributes that a servlet must fill out prior to forwarding to that JSP are documented. No validation is performed; if the servlet does not fill out the necessary attributes, it is likely that an internal server error will occur.
Many JSPs containing forms will include hidden parameters that tell the servlets which form has been filled out. The submission UI servlet (SubmissionController) is a prime example of a servlet that deals with the input from many different JSPs. The step and page hidden parameters (written out by the SubmissionController.getSubmissionParameters() method) are used to inform the servlet which page of which step has just been filled out (i.e. which page of the submission the user has just completed).
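The hand-off from servlet to JSP described above can be modeled in a few lines. This is a deliberately simplified sketch: Request, handleSearch and render are hypothetical stand-ins, the JSP paths are invented, and nothing here uses the real servlet API or DSpace classes. It only illustrates the order of events (process, fill in attributes, forward) and why a missing attribute surfaces as an error on the JSP side.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the servlet/JSP hand-off. None of these types are the
// real servlet or DSpace classes; they only mirror the control flow.
final class ServletFlowSketch {

    // Stand-in for the HTTP request object that carries attributes to the JSP.
    static final class Request {
        final Map<String, Object> attributes = new HashMap<>();
        String forwardedTo; // JSP chosen by the "servlet" (hypothetical paths)

        void setAttribute(String name, Object value) {
            attributes.put(name, value);
        }
    }

    // "Servlet" side: do all processing first, then fill attributes and pick a JSP.
    static void handleSearch(Request request, String query) {
        if (query == null || query.isEmpty()) {
            request.forwardedTo = "/search/empty.jsp";
            return;
        }
        request.setAttribute("query", query);
        request.forwardedTo = "/search/results.jsp";
    }

    // "JSP" side: only reads attributes; a missing attribute is an error here.
    static String render(Request request) {
        Object query = request.attributes.get("query");
        if (query == null) {
            throw new IllegalStateException("attribute 'query' was not set by the servlet");
        }
        return "Results for " + query;
    }

    public static void main(String[] args) {
        Request request = new Request();
        handleSearch(request, "dspace");
        System.out.println(request.forwardedTo + " -> " + render(request));
    }
}
```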
Below is a detailed, scary diagram depicting the flow of control during the whole process of processing and responding to an HTTP request. More information about the authentication mechanism can be found in the configuration section.
This is because DSpace does not have a fully-fledged dissemination architectural piece yet.
Displaying an item record is done by a tag rather than a JSP for two reasons: firstly, it happens in several places (when verifying an item record during submission or workflow review, as well as during standard item accesses), and secondly, displaying the item turns out to be mostly code-work rather than HTML anyway. Of course, the disadvantage of doing it this way is that it is slightly harder to customize exactly what is displayed from an item record; it is necessary to edit the tag code (org.dspace.app.webui.jsptag.ItemTag). Hopefully a better solution can be found in the future.
itemlist, collectionlist, communitylist: These tags display ordered sequences of items, collections and communities, showing minimal information but including a link to the page containing full details. These need to be used in HTML tables.
popup: This tag is used to render a link to a pop-up page (typically a help page). If Javascript is available, the link will either open a new DSpace pop-up window or pop any existing one to the front. If Javascript is not available, a standard HTML link is displayed that renders the link destination in a window named 'dspace.popup'. In graphical browsers, this usually opens a new window or re-uses an existing window of that name, but if a window is re-used it is not 'raised', which might confuse the user. In text browsers, following this link will simply replace the current page with the destination of the link. This means that Javascript offers the best functionality, but other browsers are still supported.
selecteperson: A tag which produces a widget analogous to HTML <SELECT>, that allows a user to select one or multiple e-people from a pop-up list.
sfxlink: Using an item's Dublin Core metadata, DSpace can display an SFX link if an SFX server is available. This tag does so for a particular item if the sfx.server.url property is defined in dspace.cfg.
NEW:
<H1><fmt:message key="jsp.search.results.title"/></H1>
This message can now be changed using the config/language-packs/Messages.properties file. (This must be done at build-time: Messages.properties is placed in the dspace.war Web application file.)
jsp.search.results.title = Search Results
Phrases may have parameters to be passed in, to make the job of translating easier, reduce the number of 'keys' and to allow translators to make the translated text flow more appropriately for the target language.
OLD:
<P>Results <%= r.getFirst() %> to <%= r.getLast() %> of <%=r.getTotal() %></P>
NEW:
<fmt:message key="jsp.search.results.text">
    <fmt:param><%= r.getFirst() %></fmt:param>
    <fmt:param><%= r.getLast() %></fmt:param>
    <fmt:param><%= r.getTotal() %></fmt:param>
</fmt:message>
(Note: JSTL 1.0 does not seem to allow JSP <%= %> expressions to be passed in as the value attribute in <fmt:param value=""/>.)
The above would appear in the Messages_xx.properties file as:
jsp.search.results.text = Results {0}-{1} of {2}
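How the numbered placeholders get filled in can be tried outside a JSP. The sketch below assumes the {0}, {1}, {2} placeholders follow java.text.MessageFormat conventions; MessageParamsSketch and its format method are hypothetical names used only for this illustration.

```java
import java.text.MessageFormat;
import java.util.Locale;

// Hypothetical helper: fills the numbered placeholders of a message pattern,
// assuming java.text.MessageFormat conventions for {0}, {1}, ...
final class MessageParamsSketch {

    static String format(String pattern, Locale locale, Object... params) {
        // Parameters are substituted by position; numbers are formatted
        // according to the given locale.
        return new MessageFormat(pattern, locale).format(params);
    }

    public static void main(String[] args) {
        String pattern = "Results {0}-{1} of {2}";
        System.out.println(format(pattern, Locale.ENGLISH, 1, 10, 57));
    }
}
```

Because parameters are addressed by position, a translator can reorder them in the translated pattern without any change to the JSP.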
Introducing number parameters that should be formatted according to the locale used makes no difference in the message key compared to string parameters:
jsp.submit.show-uploaded-file.size-in-bytes = {0} bytes
In the JSP, this key can be used as follows:
<fmt:message key="jsp.submit.show-uploaded-file.size-in-bytes">
    <fmt:param><fmt:formatNumber><%= bitstream.getSize()%></fmt:formatNumber></fmt:param>
</fmt:message>
(Note: JSTL offers a way to include numbers in the message keys as jsp.foo.key = {0,number} bytes. Setting the parameter as <fmt:param value="${variable}" /> works when variable is a single variable name and doesn't work when trying to use a method's return value instead: bitstream.getSize(). Passing the number as a string (or using the <%= %> expression) also does not work.)
Multiple Messages.properties files can be created for different languages. See ResourceBundle.getBundle. For example, you can add German and Canadian French translations:
Messages_de.properties
Messages_fr_CA.properties
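The order in which such files are considered for a given language can be listed with the standard ResourceBundle.Control class. This is a sketch only: BundleLookupSketch and candidateFiles are hypothetical names, and the additional fallback to the server's default locale that ResourceBundle.getBundle performs is not modeled here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.ResourceBundle;

// Hypothetical helper: lists the properties file names that would be tried
// for one requested locale, most specific first. The default-locale fallback
// of ResourceBundle.getBundle is intentionally left out of this sketch.
final class BundleLookupSketch {

    static List<String> candidateFiles(String baseName, Locale locale) {
        ResourceBundle.Control control =
                ResourceBundle.Control.getControl(ResourceBundle.Control.FORMAT_PROPERTIES);
        List<String> names = new ArrayList<>();
        for (Locale candidate : control.getCandidateLocales(baseName, locale)) {
            // The root locale maps to the bare base name, e.g. "Messages".
            names.add(control.toBundleName(baseName, candidate) + ".properties");
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(candidateFiles("Messages", Locale.CANADA_FRENCH));
        System.out.println(candidateFiles("Messages", Locale.GERMAN));
    }
}
```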
The end user's browser settings determine which language is used. The English language file Messages.properties (or the default server locale) will be used as a default if there's no language bundle for the end user's preferred language. (Note that the English file is not called Messages_en.properties; this is so it is always available as a default, regardless of server configuration.)
The dspace:layout tag has been updated to allow dictionary keys to be passed in for the titles. It now has two new parameters: titlekey and parenttitlekey. So where before you'd do:
<dspace:layout title="Here" parentlink="/mydspace" parenttitle="My DSpace">
And so the layout tag itself gets the relevant stuff out of the dictionary. title and parenttitle still work as before for backwards compatibility, and the odd spot where that's preferable.
11.2.1.5.1. Message Key Convention
When translating further pages, please follow the convention for naming message keys to avoid clashes. For text in JSPs use the complete path + filename of the JSP, then a one-word name for the message. e.g. for the title of jsp/mydspace/main.jsp use:
jsp.mydspace.main.title
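The naming rule for JSP message keys can be written down as a small function. This is only an illustration of the convention; MessageKeySketch and keyFor are hypothetical names and nothing in DSpace generates keys this way.

```java
// Hypothetical helper illustrating the key-naming convention for JSP text:
// path and file name of the JSP, dots instead of slashes, no ".jsp"
// extension, followed by a one-word name for the message.
final class MessageKeySketch {

    static String keyFor(String jspPath, String name) {
        String withoutExtension = jspPath.endsWith(".jsp")
                ? jspPath.substring(0, jspPath.length() - ".jsp".length())
                : jspPath;
        return withoutExtension.replace('/', '.') + "." + name;
    }

    public static void main(String[] args) {
        System.out.println(keyFor("jsp/mydspace/main.jsp", "title"));
    }
}
```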
Some common words (e.g. "Help") can be brought out into keys starting jsp. for ease of translation, e.g.:
jsp.admin = Administer
Other common words/phrases are brought out into 'general' parameters if they relate to a set (directory) of JSPs, e.g.
jsp.tools.general.delete = Delete
Phrases that relate strongly to a topic (e.g. MyDSpace) but are used in many JSPs outside the particular directory are more conveniently cross-referenced. For example, one could use the key below in jsp/submit/saved.jsp to provide a link back to the user's MyDSpace. (Cross-referencing of keys in general is not a good idea as it may make maintenance more difficult. But in some cases it has more advantages, as the meaning is obvious.)
jsp.mydspace.general.goto-mydspace = Go to My DSpace
For text in servlet code, in custom JSP tags or wherever applicable, use the fully qualified classname + a one-word name for the message. e.g.
org.dspace.app.webui.jsptag.ItemListTag.title = Title
11.2.1.5.2. Which Languages are currently supported?
To view translations currently being developed, please refer to the i18n page of the DSpace Wiki.
The Bundle's primary bitstream field would point to the contents.html Bitstream, which we know is HTML (check the format MIME type) and so we know which to serve up first.
The HTML servlet employs a trick to serve up HTML documents without actually modifying the HTML or other files themselves. Say someone is looking at contents.html from the above example, the URL in their browser will look like this:
https://dspace.mit.edu/html/1721.1/12345/contents.html
If there's an image called figure1.gif in that HTML page, the browser will do HTTP GET on this URL:
https://dspace.mit.edu/html/1721.1/12345/figure1.gif
The HTML document servlet can work out which item the user is looking at, and then which Bitstream in it is called figure1.gif, and serve up that bitstream. The same applies when following links to other HTML pages. Of course all the links and image references have to be relative and not absolute. HTML documents must be "self-contained". Provided that full path information is known by DSpace, any depth or complexity of HTML document can be served subject to those constraints. This is usually possible with some kind of batch import. If, however, the document has been uploaded one file at a time using the Web UI, the path information has been stripped. The system can cope with relative links that refer to a deeper path, e.g.
<IMG SRC="images/figure1.gif">
If the item has been uploaded via the Web submit UI, in the Bitstream table in the database we have the 'name' field, which will contain the filename with no path (figure1.gif). We can still work out what images/figure1.gif is by making the HTML document servlet strip any path that comes in from the URL, e.g.
https://dspace.mit.edu/html/1721.1/12345/images/figure1.gif
                                         ^^^^^^^
                                         Strip this
BUT all the filenames (regardless of directory names) must be unique. For example, this wouldn't work:
contents.html
chapter1.html
chapter2.html
chapter1_images/figure.gif
chapter2_images/figure.gif
since the HTML document servlet wouldn't know which bitstream to serve up for:
https://dspace.mit.edu/html/1721.1/12345/chapter1_images/figure.gif
https://dspace.mit.edu/html/1721.1/12345/chapter2_images/figure.gif
since it would just have figure.gif.
To prevent "infinite URL spaces" appearing (e.g. if a file foo.html linked to bar/foo.html, which would link to bar/bar/foo.html...) this behavior can be configured by setting the configuration property webui.html.max-depth-guess. For example, if we receive a request for foo/bar/index.html, and we have a bitstream called just index.html, we will serve up that bitstream for the request if webui.html.max-depth-guess is 2 or greater. If webui.html.max-depth-guess is 1 or less, we would not serve that bitstream, as the depth of the file is greater. If webui.html.max-depth-guess is zero, the request filename and path must always exactly match the bitstream name. The default value (if that property is not present in dspace.cfg) is 3.
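The depth-guessing rule can be sketched as a loop that strips one leading directory at a time. This is an illustration of the behavior described above, not the servlet's actual code; DepthGuessSketch and resolve are hypothetical names, and the bitstream names are passed in as a plain set.

```java
import java.util.Collections;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch of the depth-guessing rule: try the request path as
// given, then strip leading directories one by one, but never more than
// maxDepthGuess of them.
final class DepthGuessSketch {

    static Optional<String> resolve(String requestPath, Set<String> bitstreamNames,
                                    int maxDepthGuess) {
        String candidate = requestPath;
        for (int stripped = 0; stripped <= maxDepthGuess; stripped++) {
            if (bitstreamNames.contains(candidate)) {
                return Optional.of(candidate);
            }
            int slash = candidate.indexOf('/');
            if (slash < 0) {
                break; // no directory left to strip
            }
            candidate = candidate.substring(slash + 1);
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Set<String> names = Collections.singleton("index.html");
        System.out.println(resolve("foo/bar/index.html", names, 2).isPresent());
        System.out.println(resolve("foo/bar/index.html", names, 1).isPresent());
    }
}
```

With a limit of zero the loop body runs once, so only an exact match between request path and bitstream name succeeds.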
It is this URL that should be registered with www.openarchives.org. Note that you can easily change the 'request' portion of the URL by editing [dspace-source]/etc/oai-web.xml and rebuilding and deploying oai.war.
DSpace provides implementations of the OAICat interfaces AbstractCatalog, RecordFactory and Crosswalk that interface with the DSpace content management API and harvesting API (in the search subsystem).
Only the basic oai_dc unqualified Dublin Core metadata set export is enabled by default; this is particularly easy since all items have qualified Dublin Core metadata. When this metadata is harvested, the qualifiers are simply stripped; for example, description.abstract is exposed as unqualified description. The description.provenance field is hidden, as this contains private information about the submitter and workflow reviewers of the item, including their e-mail addresses. Additionally, to keep in line with OAI community practices, values of contributor.author are exposed as creator values.
Other metadata formats are supported as well, using other Crosswalk implementations; consult the oaicat.properties file described below. To enable a format, simply uncomment the lines beginning with Crosswalks.*. Multiple formats are allowed, and the current list includes, in addition to unqualified DC: MPEG DIDL, METS, MODS. There is also an incomplete, experimental qualified DC.
Note that the current simple DC implementation (org.dspace.app.oai.OAIDCCrosswalk) does not strip out any invalid XML characters that may be lying around in the data. If your database contains a DC value with, for example, some ASCII control codes (form feed etc.) this may cause OAI harvesters problems. This should rarely occur, however. XML entities (such as >) are encoded (e.g. to &gt;).
In addition to the implementations of the OAICat interfaces, there is one main configuration file relevant to OAI-PMH support:
oaicat.properties: This file resides in [dspace]/config.
You probably won't need to edit this, as it is preconfigured to meet most needs. You might want to change the Identify.earliestDatestamp field to more accurately reflect the oldest datestamp in your local DSpace system. (Note that this is the value of the last_modified column in the Item database table.)
11.2.2.1. Sets
OAI-PMH allows repositories to expose a hierarchy of sets in which records may be placed. A record can be in zero or more sets. DSpace exposes collections as sets. The organization of communities is likely to change over time, and is therefore a less stable basis for selective harvesting. Each collection has a corresponding OAI set, discoverable by harvesters via the ListSets verb. The setSpec is the Handle of the collection, with the ':' and '/' converted to underscores so that the Handle is a legal setSpec, for example:
hdl_1721.1_1234
Naturally enough, the collection name is also the name of the corresponding set.
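The Handle-to-setSpec conversion is a plain character substitution. A minimal sketch, with SetSpecSketch and fromHandle as hypothetical names that are not taken from the DSpace source:

```java
// Hypothetical helper: turns a collection Handle such as "hdl:1721.1/1234"
// into a legal OAI setSpec by replacing ':' and '/' with underscores.
final class SetSpecSketch {

    static String fromHandle(String handle) {
        return handle.replace(':', '_').replace('/', '_');
    }

    public static void main(String[] args) {
        System.out.println(fromHandle("hdl:1721.1/1234"));
    }
}
```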
For example:
oai:dspace.myu.edu:123456789/345
If you wish to use a different scheme, this can easily be changed by editing the value of OAI_ID_PREFIX at the top of the org.dspace.app.oai.DSpaceOAICatalog class. (You do not need to change the code if the above scheme works for you; the code picks up the host name and Handles automatically from the DSpace configuration.)
11.2.2.6. Deletions
DSpace keeps track of deletions (withdrawals). These are exposed via OAI, which has a specific mechanism for dealing with this. Since DSpace keeps a permanent record of withdrawn items, in the OAI-PMH sense DSpace supports deletions 'persistently'. This is as opposed to 'transient' deletion support, which would mean that deleted records are forgotten after a time. Once an item has been withdrawn, OAI-PMH harvests of the date range in which the withdrawal occurred will find the 'deleted' record header. Harvests of a date range prior to the withdrawal will not find the record, despite the fact that the record did exist at that time. As an example of this, consider an item that was created on 2002-05-02 and withdrawn on 2002-10-06. A request to harvest the month 2002-10 will yield the 'record deleted' header. However, a harvest of the month 2002-05 will not yield the original record. Note that presently, the deletion of 'expunged' items is not exposed through OAI.
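The harvesting behavior in this example can be modeled with a single datestamp per record. The sketch below assumes that the datestamp of a withdrawn item is its withdrawal date and that of a live item is its creation date; DeletionHarvestSketch and harvest are hypothetical names, and the model ignores any other modifications to the item.

```java
import java.time.LocalDate;

// Hypothetical model of date-range harvesting with persistent deletions.
// Assumption of this sketch: each record has one datestamp, which is the
// withdrawal date once the item has been withdrawn.
final class DeletionHarvestSketch {

    enum Result { NOT_RETURNED, DELETED_HEADER, FULL_RECORD }

    static Result harvest(LocalDate created, LocalDate withdrawn,
                          LocalDate from, LocalDate until) {
        LocalDate datestamp = (withdrawn != null) ? withdrawn : created;
        boolean inRange = !datestamp.isBefore(from) && !datestamp.isAfter(until);
        if (!inRange) {
            return Result.NOT_RETURNED;
        }
        return (withdrawn != null) ? Result.DELETED_HEADER : Result.FULL_RECORD;
    }

    public static void main(String[] args) {
        LocalDate created = LocalDate.of(2002, 5, 2);
        LocalDate withdrawn = LocalDate.of(2002, 10, 6);
        // Harvest of 2002-10, then of 2002-05, after the withdrawal
        System.out.println(harvest(created, withdrawn,
                LocalDate.of(2002, 10, 1), LocalDate.of(2002, 10, 31)));
        System.out.println(harvest(created, withdrawn,
                LocalDate.of(2002, 5, 1), LocalDate.of(2002, 5, 31)));
    }
}
```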
from and until are the ISO 8601 dates passed in as part of the original request, and setSpec is also taken from the original request. offset is the number of records that have already been sent to the harvester. For example:
2003-01-01//hdl_1721_1_1234/300
This means the harvest is 'from' 2003-01-01, has no 'until' date, is for collection hdl:1721.1/1234, and 300 records have already been sent to the harvester. (Actually, if the original OAI-PMH request doesn't specify a 'from' or 'until', OAICat fills them out automatically to '0000-00-00T00:00:00Z' and '9999-12-31T23:59:59Z' respectively. This means DSpace resumption tokens will always have from and until dates in them.)
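Reading such a token back into its parts is a matter of splitting on the slashes. The sketch below assumes the four-field layout from/until/setSpec/offset seen in the example, with empty fields allowed; ResumptionTokenSketch is a hypothetical class and is not the parser used by DSpace or OAICat.

```java
// Hypothetical parser for the token layout shown in the example:
// from/until/setSpec/offset, where a field may be empty.
final class ResumptionTokenSketch {
    final String from;
    final String until;
    final String setSpec;
    final int offset;

    private ResumptionTokenSketch(String from, String until, String setSpec, int offset) {
        this.from = from;
        this.until = until;
        this.setSpec = setSpec;
        this.offset = offset;
    }

    static ResumptionTokenSketch parse(String token) {
        // Limit -1 keeps empty fields such as a missing 'until' date.
        String[] parts = token.split("/", -1);
        if (parts.length != 4) {
            throw new IllegalArgumentException("expected 4 fields, got " + parts.length);
        }
        return new ResumptionTokenSketch(parts[0], parts[1], parts[2],
                Integer.parseInt(parts[3]));
    }

    // Rebuilds the token text from the parsed fields.
    String format() {
        return from + "/" + until + "/" + setSpec + "/" + offset;
    }

    public static void main(String[] args) {
        ResumptionTokenSketch token = parse("2003-01-01//hdl_1721_1_1234/300");
        System.out.println(token.setSpec + " resumes at record " + token.offset);
    }
}
```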
In release 1.5 a script was written, and in release 1.6 the command [dspace]/bin/dspace index-init replaces the script. The stanza from launcher.xml shows how one can build more commands if needed:
<command>
    <name>index-update</name>
    <description>Update the search and browse indexes</description>
    <step passuserargs="false">
        <class>org.dspace.browse.IndexBrowse</class>
        <argument>-i</argument>
    </step>
    <step passuserargs="false">
        <class>org.dspace.browse.ItemCounter</class>
    </step>
</command>
11.3.1.2. Constants
This class contains constants that are used to represent types of object and actions in the database. For example, authorization policies can relate to objects of different types, so the resourcepolicy table has columns resource_id, which is the internal ID of the object, and resource_type_id, which indicates whether the object is an item, collection, bitstream etc. The value of resource_type_id is taken from the Constants class, for example Constants.ITEM.
11.3.1.3. Context
The Context class is central to the DSpace operation. Any code that wishes to use any API in the business logic layer must first create a Context object. This is akin to opening a connection to a database (which is in fact one of the things that happens).
A context object is involved in most method calls and object constructors, so that the method or object has access to information about the current operation. When the context object is constructed, the following information is automatically initialized:
    A connection to the database. This is a transaction-safe connection, i.e. the 'auto-commit' flag is set to false.
    A cache of content management API objects. Each time a content object is created (for example Item or Bitstream) it is stored in the Context object. If the object is then requested again, the cached copy is used. Apart from reducing database use, this addresses the problem of having two copies of the same object in memory in different states.
The following information is also held in a context object, though it is the responsibility of the application creating the context object to fill it out correctly:
    The current authenticated user, if any.
    Any 'special groups' the user is a member of. For example, a user might automatically be part of a particular group based on the IP address they are accessing DSpace from, even though they don't have an e-person record. Such a group is called a 'special group'.
    Any extra information from the application layer that should be added to log messages that are written within this context. For example, the Web UI adds a session ID, so that when the logs are analyzed the actions of a particular user in a particular session can be tracked.
    A flag indicating whether authorization should be circumvented. This should only be used in rare, specific circumstances. For example, when first installing the system, there are no authorized administrators who would be able to create an administrator account! As noted above, the public API is trusted, so it is up to applications in the application layer to use this flag responsibly.
Typical use of the context object will involve constructing one, and setting the current user if one is authenticated. Several operations may be performed using the context object. If all goes well, complete is called to commit the changes and free up any resources used by the context. If anything has gone wrong, abort is called to roll back any changes and free up the resources.
You should always abort a context if any error happens during its lifespan; otherwise the data in the system may be left in an inconsistent state. You can also commit a context, which means that any changes are written to the database, and the context is kept active for further use.
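The complete/abort discipline can be illustrated with a stand-in class. In the sketch below, FakeContext is a hypothetical in-memory substitute that only mimics the commit, complete and abort behavior described above; it is not org.dspace.core.Context and touches no database.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch of the "complete on success, abort on any error" pattern.
final class ContextLifecycleSketch {

    // Hypothetical in-memory stand-in; NOT org.dspace.core.Context.
    static final class FakeContext {
        private final List<String> pending = new ArrayList<>();
        final List<String> committed = new ArrayList<>();
        private boolean open = true;

        void write(String change) {
            if (!open) {
                throw new IllegalStateException("context is closed");
            }
            pending.add(change);
        }

        // Writes pending changes and keeps the context usable.
        void commit() {
            committed.addAll(pending);
            pending.clear();
        }

        // Commits and then releases the context.
        void complete() {
            commit();
            open = false;
        }

        // Discards pending changes and releases the context.
        void abort() {
            pending.clear();
            open = false;
        }
    }

    // Applies all changes or none of them.
    static List<String> applyAll(List<String> changes) {
        FakeContext context = new FakeContext();
        try {
            for (String change : changes) {
                if (change.isEmpty()) {
                    throw new IllegalArgumentException("empty change");
                }
                context.write(change);
            }
            context.complete();
        } catch (RuntimeException e) {
            context.abort(); // always abort when anything went wrong
        }
        return context.committed;
    }

    public static void main(String[] args) {
        System.out.println(applyAll(Arrays.asList("add item", "add bitstream")));
        System.out.println(applyAll(Arrays.asList("add item", "")));
    }
}
```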
11.3.1.4. Email
Sending e-mails is pretty easy. Just use the configuration manager's getEmail method, set the arguments and recipients, and send. The e-mail texts are stored in [dspace]/config/emails. They are processed by the standard java.text.MessageFormat. At the top of each e-mail are listed the appropriate arguments that should be filled out by the sender. Example usage is shown in the org.dspace.core.Email Javadoc API documentation.
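Since the templates are processed by java.text.MessageFormat, the argument substitution can be tried in isolation. The sketch below does not call the configuration manager or the Email class; the template strings are hypothetical examples, not the contents of any file under [dspace]/config/emails.

```java
import java.text.MessageFormat;

class EmailTemplateSketch {
    // Fills the numbered placeholders {0}, {1}, ... of a MessageFormat template.
    static String fill(String template, Object... args) {
        return MessageFormat.format(template, args);
    }

    public static void main(String[] argv) {
        // Hypothetical template text, for illustration only.
        System.out.println(fill("Item {0} was submitted by {1}.", "1721.1/1686", "jdoe"));
    }
}
```

One MessageFormat detail worth keeping in mind when editing templates: a single quote starts a quoted section, so a literal apostrophe has to be written as two single quotes (''). Numeric arguments are also formatted according to the locale, so identifiers are safer passed as strings.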
11.3.1.5. LogManager
The log manager consists of a method that creates a standard log header and returns it as a string suitable for logging. Note that this class does not actually write anything to the logs; the log header returned should be logged directly by the sender using an appropriate Log4J call, so that information about where the logging is taking place is also stored.
The level of logging can be configured on a per-package or per-class basis by editing [dspace]/config/log4j.properties. You will need to stop and restart Tomcat for the changes to take effect.
A typical log entry looks like this:
2002-11-11 08:11:32,903 INFO org.dspace.app.webui.servlet.DSpaceServlet @ anonymous:session_id=BD84E7C194C2CF4BD0EC3A6CAD0142BB:view_item:handle=1721.1/1686
This breaks down like this:
Date and time, milliseconds: 2002-11-11 08:11:32,903
Level (FATAL, WARN, INFO or DEBUG): INFO
Java class: org.dspace.app.webui.servlet.DSpaceServlet
(separator): @
User email or anonymous: anonymous
(separator): :
Extra log info from context: session_id=BD84E7C194C2CF4BD0EC3A6CAD0142BB
(separator): :
Action: view_item
(separator): :
Extra info: handle=1721.1/1686
The above format allows the logs to be easily parsed and analyzed. The [dspace]/bin/log-reporter script is a simple tool for analyzing logs. Try:
[dspace]/bin/log-reporter --help
It's a good idea to 'nice' this log reporter to avoid an impact on server performance.
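To illustrate why the log format is easy to parse, here is a minimal sketch that splits one entry into its fields. It is not the log-reporter script. It assumes the field order given in the breakdown above, an "@" between the Java class and the message, and that the user and extra-log-info fields contain no colon themselves.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LogLineSketch {
    // timestamp, level, class name, then "@" and the colon-separated message part
    private static final Pattern LINE = Pattern.compile(
            "^(\\d{4}-\\d{2}-\\d{2} \\d{2}:\\d{2}:\\d{2},\\d{3})\\s+(\\w+)\\s+(\\S+)\\s+@\\s+(.*)$");

    // Returns {dateTime, level, javaClass, user, contextInfo, action, extraInfo},
    // or null if the line does not follow the expected format.
    static String[] parse(String line) {
        Matcher m = LINE.matcher(line);
        if (!m.matches()) {
            return null;
        }
        // Limit of 4 keeps any further colons inside the final "extra info" field.
        String[] message = m.group(4).split(":", 4);
        if (message.length < 4) {
            return null;
        }
        return new String[] {
            m.group(1), m.group(2), m.group(3),
            message[0], message[1], message[2], message[3]
        };
    }
}
```

Returning null for lines that do not match lets a caller skip stack traces and other multi-line output without special handling.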
11.3.1.6. Utils
Utils contains miscellaneous utility methods that are required in a variety of places throughout the code, and thus have no particular 'home' in a subsystem.
to construct a brand new item in the system, rather than simply instantiating an in-memory instance of an object in the system. find methods may often be called with invalid IDs, and return null in such a case. A constructor would have to throw an exception in this case. A null return value from a static method can in general be dealt with more simply in code. If an instantiation representing the same underlying archival entity already exists, the find method can simply return that same instantiation to avoid multiple copies and any inconsistencies which might result.
Collection, Bundle and Bitstream do not have create methods; rather, one has to create an object using the relevant method on the container. For example, to create a collection, one must invoke createCollection on the community that the collection is to appear in:
Context context = new Context();
Community existingCommunity = Community.find(context, 123);
Collection myNewCollection = existingCommunity.createCollection();
The primary reason for this is for determining authorization. In order to know whether an e-person may create an object, the system must know which container the object is to be added to. It makes no sense to create a collection outside of a community, and the authorization system does not have a policy for that.
Items are first created in the form of an implementation of InProgressSubmission. An InProgressSubmission represents an item under construction; once it is complete, it is installed into the main archive and added to the relevant collection by the InstallItem class. The org.dspace.content package provides an implementation of InProgressSubmission called WorkspaceItem; this is a simple implementation that contains some fields used by the Web submission UI. The org.dspace.workflow package also contains an implementation called WorkflowItem which represents a submission undergoing a workflow process.
In the previous chapter there is an overview of the item ingest process which should clarify the previous paragraph. Also see the section on the workflow system.
Community and BitstreamFormat do have static create methods; one must be a site administrator to have authorization to invoke these.
11.3.2.2. Modifications
When creating, modifying or for whatever reason removing data with the content management API, it is important to know when changes happen in-memory, and when they occur in the physical DSpace storage. Primarily, one should note that no change made using a particular org.dspace.core.Context object will actually be made in the underlying storage unless complete or commit is invoked on that Context. If anything should go wrong during an operation, the context should always be aborted by invoking abort, to ensure that no inconsistent state is written to the storage. Additionally, some changes made to objects only happen in-memory. In these cases, invoking the update method lines up the in-memory changes to occur in storage when the Context is committed or completed. In general, methods that change any [meta]data field only make the change in-memory; methods that involve relationships with other objects in the system line up the changes to be committed with the context. See individual methods in the API Javadoc. Some examples to illustrate this are shown below:
// The new name will be stored: update was invoked and the context was completed
Context context = new Context();
Bitstream b = Bitstream.find(context, 1234);
b.setName("newfile.txt");
b.update();
context.complete();

// Nothing will be stored: the context was aborted
Context context = new Context();
Bitstream b = Bitstream.find(context, 1234);
b.setName("newfile.txt");
b.update();
context.abort();

// The new name will not be stored since update was not invoked
Context context = new Context();
Bitstream b = Bitstream.find(context, 1234);
b.setName("newfile.txt");
context.complete();

// The bitstream will be included in the bundle, since update doesn't need to be called
Context context = new Context();
Bitstream bs = Bitstream.find(context, 1234);
Bundle bnd = Bundle.find(context, 5678);
bnd.add(bs);
context.complete();
for Other Metadata Schemas). The other classes starting with DC are utility classes for handling types of data in Dublin Core, such as people's names and dates. As supplied, the DSpace registry of elements and qualifiers corresponds to the Library Application Profile for Dublin Core. It should be noted that these utility classes assume that the values will be in a certain syntax, which will be true for all data generated within the DSpace system, but since Dublin Core does not always define strict syntax, this may not be true for Dublin Core originating outside DSpace.
Below is the specific syntax that DSpace expects various fields to adhere to:

Element: date
Qualifier: Any or unqualified
Syntax: ISO 8601 in the UTC time zone, with either year, month, day, or second precision.
Examples:
  2000
  2002-10
  2002-08-14
  1999-01-01T14:35:23Z
Helper Class: DCDate

Element: contributor
Qualifier: Any or unqualified
Syntax: In general last name, then a comma, then first names, then any additional information like "Jr.". If the contributor is an organization, then simply the name.
Examples:
  Doe, John
  Smith, John Jr.
  van Dyke, Dick
  Massachusetts Institute of Technology
Helper Class: DCPersonName

Element: language
Qualifier: iso
Syntax: A two letter code taken from ISO 639, followed optionally by a two letter country code taken from ISO 3166.
Examples:
  en
  fr
  en_US
Helper Class: DCLanguage

Element: relation
Qualifier: ispartofseries
Syntax: The series name, followed by a semicolon, followed by the number in that series. Alternatively, just free text.
Examples:
  MIT-TR; 1234
  My Report Series; ABC-1234
  NS1234
Helper Class: DCSeriesNumber
11.3.3.1. Concepts
The following terms are important in understanding the rest of this section:
Plugin Interface: A Java interface, the defining characteristic of a plugin. The consumer of a plugin asks for its plugin by interface.
Plugin: Also known as a Component, this is an instance of a class that implements a certain interface. It is interchangeable with other implementations, so that any of them may be "plugged in", hence the name. A Plugin is an instance of any class that implements the plugin interface.
Implementation class: The actual class of a plugin. It may implement several plugin interfaces, but must implement at least one.
Name: Plugin implementations can be distinguished from each other by name, a short String meant to symbolically represent the implementation class. They are called "named plugins". Plugins only need to be named when the caller has to make an active choice between them.
SelfNamedPlugin class: Plugins that extend the SelfNamedPlugin class can take advantage of additional features of the Plugin Manager. Any class can be managed as a plugin, so it is not necessary, just possible.
Reusable: Reusable plugins are only instantiated once, and the Plugin Manager returns the same (cached) instance whenever that same plugin is requested again. This behavior can be turned off if desired.
so that it reads its configuration data, gets the list of names to which it can respond, and passes those on to the Plugin Manager. When the Plugin Manager creates an instance of the XSLT-crosswalk, it records the Plugin Name that was responsible for that instance. The plugin can look at that Name later in order to configure itself correctly for the Name that created it. This mechanism is all part of the SelfNamedPlugin class which is part of any self-named plugin.
11.3.3.2.3. Obtaining a Plugin Instance
The most common thing you will do with the Plugin Manager is obtain an instance of a plugin. To request a plugin, you must always specify the plugin interface you want. You will also supply a name when asking for a named plugin. A sequence plugin is returned as an array of Objects since it is actually an ordered list of plugins. See the getSinglePlugin(), getPluginSequence(), getNamedPlugin() methods.
11.3.3.2.4. Lifecycle Management
When PluginManager fulfills a request for a plugin, it checks whether the implementation class is reusable; if so, it creates one instance of that class and returns it for every subsequent request for that interface and name. If it is not reusable, a new instance is always created. For reasons that will become clear later, the manager actually caches a separate instance of an implementation class for each name under which it can be requested.
You can ask the PluginManager to forget about (decache) a plugin instance, by releasing it. See the PluginManager.releasePlugin() method. The manager will drop its reference to the plugin so the garbage collector can reclaim it. The next time that plugin/name combination is requested, it will create a new instance.
11.3.3.2.5. Getting Meta-Information
The PluginManager can list all the names of the Named Plugins which implement an interface. You may need this, for example, to implement a menu in a user interface that presents a choice among all possible plugins.
See the getAllPluginNames() method. Note that it only returns the plugin name, so if you need a more sophisticated or meaningful "label" (for example, a key into the I18N message catalog) then you should add a method to the plugin itself to return that.
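The lifecycle rules described under Lifecycle Management (one cached instance per interface and name for reusable plugins, a fresh instance otherwise, and decaching on release) can be sketched as below. This is an illustration of the caching idea only, not the PluginManager source: the class name, the method signatures and the use of a Supplier as the instance factory are all assumptions, and the sketch is not thread-safe.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Supplier;

class PluginCacheSketch {
    // One cached instance per interface/name pair (hypothetical key format).
    private final Map<String, Object> cache = new HashMap<String, Object>();

    Object get(String interfaceName, String pluginName, boolean reusable, Supplier<Object> factory) {
        if (!reusable) {
            return factory.get();            // non-reusable: always a new instance
        }
        String key = interfaceName + "/" + pluginName;
        Object instance = cache.get(key);
        if (instance == null) {
            instance = factory.get();        // first request for this name: create and remember
            cache.put(key, instance);
        }
        return instance;
    }

    // Drops every reference to the given instance so it can be garbage-collected;
    // the next request for the same interface/name creates a new instance.
    void release(Object plugin) {
        cache.values().removeIf(cached -> cached == plugin);
    }
}
```

Keying the cache by name rather than by implementation class is what allows one class to serve several names with a differently configured instance for each.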
11.3.3.3. Implementation
Note: The PluginManager refers to interfaces and classes internally only by their names whenever possible, to avoid loading classes until absolutely necessary (i.e. to create an instance). As you'll see below, self-named classes still have to be loaded to query them for names, but for the most part it can avoid loading classes. This saves a lot of time at start-up and keeps the JVM memory footprint down, too. As the Plugin Manager gets used for more classes, this will become a greater concern.
The only downside of "on-demand" loading is that errors in the configuration don't get discovered right away. The solution is to call the checkConfiguration() method after making any changes to the configuration.
11.3.3.3.1. PluginManager Class
The PluginManager class is your main interface to the Plugin Manager. It behaves like a factory class that never gets instantiated, so its public methods are static.
Here are the public methods, followed by explanations:
static Object getSinglePlugin(Class intface) throws PluginConfigurationError;
Returns an instance of the singleton (single) plugin implementing the given interface. There must be exactly one single plugin configured for this interface, otherwise the PluginConfigurationError is thrown.
Note that this is the only "get plugin" method which throws an exception. It is typically used at initialization time to set up a permanent part of the system so any failure is fatal.
See the plugin.single configuration key for configuration details.
static Object[] getPluginSequence(Class intface);
Returns instances of all plugins that implement the interface intface, in an Array. Returns an empty array if there are no matching plugins.
The order of the plugins in the array is the same as their class names in the configuration's value field.
See the plugin.sequence configuration key for configuration details.
static Object getNamedPlugin(Class intface, String name);
Returns an instance of a plugin that implements the interface intface and is bound to a name matching name. If there is no matching plugin, it returns null. The names are matched by String.equals().
See the plugin.named and plugin.selfnamed configuration keys for configuration details.
static void releasePlugin(Object plugin);
Tells the Plugin Manager to let go of any references to a reusable plugin, to prevent it from being given out again and to allow the object to be garbage-collected. Call this when a plugin instance must be taken out of circulation.
static String[] getAllPluginNames(Class intface);
Returns all of the names under which a named plugin implementing the interface intface can be requested (with getNamedPlugin()). The array is empty if there are no matches. Use this to populate a menu of plugins for interactive selection, or to document what the possible choices are.
The names are NOT returned in any predictable order, so you may wish to sort them first.
Note: Since a plugin may be bound to more than one name, the list of names this returns does not represent the list of plugins. To get the list of unique implementation classes corresponding to the names, you might have to eliminate duplicates (for example, by creating a Set of classes).
static void checkConfiguration();
Validates the keys in the DSpace ConfigurationManager pertaining to the Plugin Manager and reports any errors by logging them. This is intended to be used interactively by a DSpace administrator, to check the configuration file after modifying it. See the section about validating configuration for details.
11.3.3.3.2. SelfNamedPlugin Class
A named plugin implementation must extend this class if it wants to supply its own Plugin Name(s). See SelfNamed Plugins for why this is sometimes necessary.
abstract class SelfNamedPlugin {
    // Your class must override this:
    // Return all names by which this plugin should be known.
    public static String[] getPluginNames();
An error of this type means the caller asked for a single plugin, but either there was no single plugin configured matching that interface, or there was more than one. Either case causes a fatal configuration error.
public class PluginInstantiationException extends RuntimeException {
    public PluginInstantiationException(String msg, Throwable cause)
}
This exception indicates a fatal error when instantiating a plugin class. It should only be thrown when something unexpected happens in the course of instantiating a plugin, e.g. an access error, class not found, etc. Simply not finding a class in the configuration is not an exception. This is a RuntimeException so it doesn't have to be declared, and can be passed all the way up to a generalized fatal exception handler.
3. Names: (Named plugins only) There are two ways to bind names to plugins: listing them in the value of a plugin.named.interface key, or configuring a class in plugin.selfnamed.interface which extends the SelfNamedPlugin class.
4. Reusable option: (Optional) This is declared in a plugin.reusable configuration line. Plugins are reusable by default, so you only need to configure the non-reusable ones.
11.3.3.4.1. Configuring Singleton (Single) Plugins
This entry configures a Single Plugin for use with getSinglePlugin():
plugin.single.interface = classname
For example, this configures the class org.dspace.checker.SimpleDispatcher as the plugin for interface org.dspace.checker.BitstreamDispatcher:
plugin.single.org.dspace.checker.BitstreamDispatcher = org.dspace.checker.SimpleDispatcher
11.3.3.4.2. Configuring Sequence of Plugins
This kind of configuration entry defines a Sequence Plugin, which is bound to a sequence of implementation classes. The key identifies the interface, and the value is a comma-separated list of classnames:
plugin.sequence.interface = classname, ...
The plugins are returned by getPluginSequence() in the same order as their classes are listed in the configuration value.
For example, this entry configures Stackable Authentication with three implementation classes:
plugin.sequence.org.dspace.eperson.AuthenticationMethod = \
    org.dspace.eperson.X509Authentication, \
    org.dspace.eperson.PasswordAuthentication, \
    edu.mit.dspace.MITSpecialGroup
11.3.3.4.3. Configuring Named Plugins
There are two ways of configuring named plugins:
1. Plugins Named in the Configuration
A named plugin which gets its name(s) from the configuration is listed in this kind of entry:
plugin.named.interface = classname = name [ , name.. ] [ classname = name.. ]
The syntax of the configuration value is: classname, followed by an equal-sign and then at least one plugin name. Bind more names to the same implementation class by adding them here, separated by commas. Names may include any character other than comma (,) and equal-sign (=).
For example, this entry creates one plugin with the names GIF, JPEG, and image/png, and another with the name TeX:
plugin.named.org.dspace.app.mediafilter.MediaFilter = \
    org.dspace.app.mediafilter.JPEGFilter = GIF, JPEG, image/png \
    org.dspace.app.mediafilter.TeXFilter = TeX
This example shows a plugin name with an embedded whitespace character. Since comma (,) is the separator character between plugin names, spaces are legal (between words of a name; leading and trailing spaces are ignored).
This plugin is bound to the names "Adobe PDF", "PDF", and "Portable Document Format".
plugin.named.org.dspace.app.mediafilter.MediaFilter = \
    org.dspace.app.mediafilter.TeXFilter = TeX \
    org.dspace.app.mediafilter.PDFFilter = Adobe PDF, PDF, Portable Document Format
NOTE: Since there can only be one key with plugin.named. followed by the interface name in the configuration, all of the plugin implementations must be configured in that entry.
2. Self-Named Plugins
Since a self-named plugin supplies its own names through a static method call, the configuration only has to include its interface and classname:
plugin.selfnamed.interface = classname [ , classname.. ]
The following example first demonstrates how the plugin class, XsltDisseminationCrosswalk, is configured to implement its own names "MODS" and "DublinCore". These come from the keys starting with the prefix crosswalk.dissemination.stylesheet. (including the trailing dot). The value is a stylesheet file.
The class is then configured as a self-named plugin:
crosswalk.dissemination.stylesheet.DublinCore = xwalk/TESTDIM-2-DC_copy.xsl
crosswalk.dissemination.stylesheet.MODS = xwalk/mods.xsl
plugin.selfnamed.org.dspace.content.metadata.DisseminationCrosswalk = \
    org.dspace.content.metadata.MODSDisseminationCrosswalk, \
    org.dspace.content.metadata.XsltDisseminationCrosswalk
NOTE: Since there can only be one key with plugin.selfnamed. followed by the interface name in the configuration, all of the plugin implementations must be configured in that entry. The MODSDisseminationCrosswalk class is only shown to illustrate this point.
11.3.3.4.4. Configuring the Reusable Status of a Plugin
Plugins are assumed to be reusable by default, so you only need to configure the ones which you would prefer not to be reusable. The format is as follows:
plugin.reusable.classname = ( true | false )
For example, this marks the PDF plugin from the example above as non-reusable:
plugin.reusable.org.dspace.app.mediafilter.PDFFilter = false
11.3.3.6.2. A Singleton Plugin
This shows how to configure and access a single anonymous plugin, such as the BitstreamDispatcher plugin:
Configuration:
plugin.single.org.dspace.checker.BitstreamDispatcher=org.dspace.checker.SimpleDispatcher
The following code fragment shows how dispatcher, the service object, is initialized and used:
BitstreamDispatcher dispatcher =
    (BitstreamDispatcher) PluginManager.getSinglePlugin(BitstreamDispatcher.class);
int id = dispatcher.next();
while (id != BitstreamDispatcher.SENTINEL) {
    /* do some processing here */
    id = dispatcher.next();
}
11.3.3.6.3. Plugin that Names Itself
This crosswalk plugin acts like many different plugins since it is configured with different XSL translation stylesheets. Since it already gets each of its stylesheets out of the DSpace configuration, it makes sense to have the plugin give PluginManager the names to which it answers instead of forcing someone to configure those names in two places (and try to keep them synchronized).
NOTE: Remember how the Plugin Manager caches a separate instance of an implementation class for every name bound to it? This is why: the instance can look at the name under which it was invoked and configure itself specifically for that name. Since the instance for each name might be different, the Plugin Manager has to cache a separate instance for each name.
Here is the configuration file listing both the plugin's own configuration and the PluginManager config line:
crosswalk.dissemination.stylesheet.DublinCore = xwalk/TESTDIM-2-DC_copy.xsl
crosswalk.dissemination.stylesheet.MODS = xwalk/mods.xsl
plugin.selfnamed.org.dspace.content.metadata.DisseminationCrosswalk = \
    org.dspace.content.metadata.XsltDisseminationCrosswalk
This look into the implementation shows how it finds configuration entries to populate the array of plugin names returned by the getPluginNames() method. Also note, in the getStylesheet() method, how it uses the plugin name that created the current instance (returned by getPluginInstanceName()) to find the correct stylesheet.
public class XsltDisseminationCrosswalk extends SelfNamedPlugin {
    ....
    private final String prefix = "crosswalk.dissemination.stylesheet.";
    ....
    public static String[] getPluginNames() {
        List aliasList = new ArrayList();
        Enumeration pe = ConfigurationManager.propertyNames();
        while (pe.hasMoreElements())
11.3.3.6.4. Stackable Authentication
The Stackable Authentication mechanism needs to know all of the plugins configured for the interface, in the order of configuration, since order is significant. It gets a Sequence Plugin from the Plugin Manager. Refer to the Configuration Section on Stackable Authentication for further details.
The workflow system models the states of an Item in a state machine with 5 states (SUBMIT, STEP_1, STEP_2, STEP_3, ARCHIVE). STEP_1, STEP_2 and STEP_3 are the three optional steps where the item can be viewed and corrected by different groups of people. Counting the pooled states STEP_1_POOL, STEP_2_POOL, and STEP_3_POOL, there are really 8 states. An item sits in a pooled state while it is waiting to enter the corresponding primary state.
The WorkflowManager is invoked by events. While an Item is being submitted, it is held by a WorkspaceItem. Calling the start() method in the WorkflowManager converts a WorkspaceItem to a WorkflowItem, and begins processing the WorkflowItem's state. Since all three steps of the workflow are optional, if no steps are defined, then the Item is simply archived.
Workflows are set per Collection, and steps are defined by creating corresponding entries in the List named workflowGroup. If you wish the workflow to have a step 1, use the administration tools for Collections to create a workflow Group with members who you want to be able to view and approve the Item, and workflowGroup[0] is set to the ID of that Group.
If a step is defined in a Collection's workflow, then the WorkflowItem's state is set to that step_POOL. In this pooled state the WorkflowItem is waiting for an EPerson in that group to claim the step's task for that WorkflowItem. The WorkflowManager emails the members of that Group notifying them that there is a task to be performed (the text is defined in config/emails), and when an EPerson goes to their 'My DSpace' page to claim the task, the WorkflowManager is invoked with a claim event, and the WorkflowItem's state advances from STEP_x_POOL to STEP_x (where x is the corresponding step). The EPerson can also generate an 'unclaim' event, returning the WorkflowItem to the STEP_x_POOL.
Another event the WorkflowManager handles is advance(), which advances the WorkflowItem to the next state. If there are no further states, then the WorkflowItem is removed, and the Item is then archived. An EPerson performing one of the tasks can reject the Item, which stops the workflow, rebuilds the WorkspaceItem for it and sends a rejection note to the submitter. More drastically, an abort() event is generated by the admin tools to cancel a workflow outright.
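The state transitions described above can be sketched as a small state machine. This is not the WorkflowManager implementation: the class and method shapes are assumptions made for illustration, the boolean array stands in for "a workflow group is defined for step i+1", and the rule that only a claimed step (not a pooled one) can be advanced is likewise an assumption. Reject and abort are left out.

```java
class WorkflowSketch {
    enum State { SUBMIT, STEP_1_POOL, STEP_1, STEP_2_POOL, STEP_2, STEP_3_POOL, STEP_3, ARCHIVE }

    private static final State[] POOLS = { State.STEP_1_POOL, State.STEP_2_POOL, State.STEP_3_POOL };
    private static final State[] STEPS = { State.STEP_1, State.STEP_2, State.STEP_3 };

    // start: enter the pool of the first defined step, or archive if none is defined.
    static State start(boolean[] stepDefined) {
        return firstPoolFrom(0, stepDefined);
    }

    // claim: STEP_x_POOL -> STEP_x
    static State claim(State pool) {
        return STEPS[indexOf(POOLS, pool)];
    }

    // unclaim: STEP_x -> STEP_x_POOL
    static State unclaim(State step) {
        return POOLS[indexOf(STEPS, step)];
    }

    // advance: move on to the pool of the next defined step, or archive.
    static State advance(State step, boolean[] stepDefined) {
        return firstPoolFrom(indexOf(STEPS, step) + 1, stepDefined);
    }

    private static State firstPoolFrom(int from, boolean[] stepDefined) {
        for (int i = from; i < POOLS.length; i++) {
            if (stepDefined[i]) {
                return POOLS[i];
            }
        }
        return State.ARCHIVE;
    }

    private static int indexOf(State[] states, State wanted) {
        for (int i = 0; i < states.length; i++) {
            if (states[i] == wanted) {
                return i;
            }
        }
        throw new IllegalStateException("event not valid in state " + wanted);
    }
}
```

For a collection that defines only steps 2 and 3, start yields STEP_2_POOL, a claim moves the item to STEP_2, and two advances (with a claim in between) take it through STEP_3 to ARCHIVE.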
Another kind of Group is also implemented in DSpace: special Groups. The Context object for each session carries around a List of Group IDs that the user is also a member of. Currently the MITUser Group ID is added to the list of a user's special groups if certain IP address or certificate criteria are met.
11.3.7. Authorization
The primary classes are:
org.dspace.authorize.AuthorizeManager: does all authorization, checking policies against Groups
org.dspace.authorize.ResourcePolicy: defines all allowable actions for an object
org.dspace.eperson.Group: all policies are defined in terms of EPerson Groups
The authorization system is based on the classic 'police state' model of security; no action is allowed unless it is expressed in a policy. The policies are attached to resources (hence the name ResourcePolicy), and detail who can perform that action. The resource can be any of the DSpace object types, listed in org.dspace.core.Constants (BITSTREAM, ITEM, COLLECTION, etc.). The 'who' is made up of EPerson groups. The actions are also in Constants.java (READ, WRITE, ADD, etc.). The only non-obvious actions are ADD and REMOVE, which are authorizations for container objects. To be able to create an Item, you must have ADD permission in a Collection, which contains Items. (Communities, Collections, Items, and Bundles are all container objects.)
Currently most of the read policy checking is done with items; communities and collections are assumed to be openly readable, but items and their bitstreams are checked. Separate policy checks for items and their bitstreams enable policies that allow publicly readable items while restricting parts of their content to certain groups.
The AuthorizeManager class' authorizeAction(Context, object, action) is the primary source of all authorization in the system. It gets a list of all of the ResourcePolicies in the system that match the object and action. It then iterates through the policies, extracting the EPerson Group from each policy, and checks to see if the EPersonID from the Context is a member of any of those groups. If all of the policies are queried and no permission is found, then an AuthorizeException is thrown. An authorizeAction() method is also supplied that returns a boolean for applications that require higher performance.
ResourcePolicies are very simple, and there are quite a lot of them. Each can only list a single group, a single action, and a single object. So each object will likely have several policies, and if multiple groups share permissions for actions on an object, each group will get its own policy.
(It's a good thing they're small.)
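The default-deny check described above can be sketched as follows. This is an illustration of the model, not the AuthorizeManager code: Policy is a hypothetical stand-in for ResourcePolicy, resources and actions are plain strings rather than the integer constants in org.dspace.core.Constants, and the user's group memberships are passed in as a set of group IDs.

```java
import java.util.List;
import java.util.Set;

class AuthorizationSketch {
    // Hypothetical stand-in for a ResourcePolicy: one object, one action, one group.
    static final class Policy {
        final String resource;
        final String action;
        final int groupId;

        Policy(String resource, String action, int groupId) {
            this.resource = resource;
            this.action = action;
            this.groupId = groupId;
        }
    }

    // Returns true only if some policy for this resource and action names a group
    // the user belongs to. With no matching policy the answer is false (default deny).
    static boolean isAuthorized(List<Policy> policies, String resource, String action,
                                Set<Integer> userGroupIds) {
        for (Policy policy : policies) {
            if (policy.resource.equals(resource)
                    && policy.action.equals(action)
                    && userGroupIds.contains(policy.groupId)) {
                return true;
            }
        }
        return false;
    }
}
```

Because each policy carries exactly one group, granting the same action on one object to two groups takes two policies, which is why the text expects many small policies per object.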
It is the responsibility of the caller to extract the basic form from whichever displayed form is used.
The handle table maps these Handles to resource type/resource ID pairs, where resource type is a value from org.dspace.core.Constants and resource ID is the internal identifier (database primary key) of the object. This allows Handles to be assigned to any type of object in the system, though as explained in the functional overview, only communities, collections and items are presently assigned Handles.
HandleManager contains static methods for:
- Creating a Handle
- Finding the Handle for a DSpaceObject, though this is usually only invoked by the object itself, since DSpaceObject has a getHandle method
- Retrieving the DSpaceObject identified by a particular Handle
- Obtaining displayable forms of the Handle (URI or "proxy URL")
HandlePlugin is a simple implementation of the Handle Server's net.handle.hdllib.HandleStorage interface. It only implements the basic Handle retrieval methods, which get information from the handle database table. The CNRI Handle Server is configured to use this plug-in via its config.dct file.
Note that since the Handle server runs as a separate JVM to the DSpace Web applications, it uses a separate 'Log4J' configuration, since Log4J does not support multiple JVMs using the same daily rolling logs. This alternative configuration is located at [dspace]/config/log4j-handle-plugin.properties. The [dspace]/bin/start-handle-server script passes in the appropriate command line parameters so that the Handle server uses this configuration.
11.3.9. Search
DSpace's search code is a simple API which currently wraps the Lucene search engine. The first half of the search task is indexing. The indexing class is org.dspace.search.DSIndexer; its indexContent() method, when passed an Item, Community, or Collection, adds that content's fields to the index. The methods unIndexContent() and reIndexContent() remove and update content's index information. The DSIndexer class also has a main() method which will rebuild the index completely. This can be invoked by the dspace/bin/index-init (complete rebuild) or dspace/bin/index-update (update) script. The intent was for the main() method to be invoked on a regular basis to avoid index corruption, but we have had no problem with that so far.
Which fields are indexed by DSIndexer? These fields are defined in dspace.cfg in the section "Fields to index for search" as name-value pairs. The name must be unique and take the form search.index.i (where i is an arbitrary positive number). The value on the right side starts with an index name, which must also be unique and can be referenced in the search form (e.g. title, author). Then comes the metadata element which is indexed. '*' is a wildcard which includes all sub-elements. For example:
tells the indexer to create a keyword index containing all dc.subject element values. Since the wildcard ('*') character was used in place of a qualifier, all subject metadata fields will be indexed (e.g. dc.subject.other, dc.subject.lcsh, etc.).
By default, the fields shown in the Indexed Fields section below are indexed. These are hardcoded in the DSIndexer class. If any search.index.i items are specified in dspace.cfg these are used rather than these hardcoded fields.
The query class DSQuery contains three flavors of doQuery() method: one searches the DSpace site, and the other two restrict searches to Collections and Communities. The results from a query are returned as three lists of handles; each list represents a type of result. One list is a list of Items with matches, and the other two are Collections and Communities that match. This separation allows the UI to handle the types of results gracefully without resolving all of the handles first to see what kind of content the handle points to. The DSQuery class also has a main() method for debugging via command-line searches.
HarvestedItemInfo objects are returned. These objects are simple containers with basic information about the items falling within the given scope and date range. Depending on parameters passed to the harvest method, the containers and item fields may have been filled out with the IDs of communities and collections containing an item, and the corresponding Item object respectively. Electing not to have these fields filled out means the harvest operation executes considerably faster.
In case it is required, Harvest also offers a method for creating a single HarvestedItemInfo object, which might make things easier for the caller.
Currently, the above three names would all appear as separate entries in the author index even though they may refer to the same author. For an author of several papers to appear correctly only once in the index, each item must specify exactly the same form of their name, which doesn't always happen in practice. Another issue is that two authors may have the same name, even within a single institution. If this is the case, they may appear as one author in the index.

These issues are typically resolved in libraries with authority control records, which hold a 'preferred' form of the author's name along with extra information (such as date of birth/death) in order to distinguish between authors of the same name. Maintaining such records is a huge task with many issues, particularly when metadata is received from faculty directly rather than from trained library catalogers.

Date of Issue: Items are indexed by date of issue. This may be different from the date that an item appeared in DSpace; many items may have been originally published elsewhere beforehand. The Dublin Core field used is date.issued. The ordering of this index may be reversed, so 'earliest first' and 'most recent first' orderings are possible.

Note that the index is of items by date, as opposed to an index of dates. If 30 items have the same issue date (say 2002), then those 30 items all appear in the index adjacent to each other, as opposed to a single 2002 entry.

Since dates in DSpace Dublin Core are in ISO 8601, all in the UTC time zone, a simple alphanumeric sort is sufficient to sort by date, including dealing with varying granularities of date reasonably. For example:
2001-12-10
2002
2002-04
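The ordering above can be reproduced with an ordinary string sort. The following is a minimal sketch, not DSpace code: the class and method names are hypothetical, and it only demonstrates that plain lexicographic ordering handles mixed date granularities.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the DSpace API.
final class IssueDateSort {

    /** Returns a sorted copy; natural String order is enough for ISO 8601 values. */
    static List<String> sortIssueDates(List<String> dates) {
        List<String> sorted = new ArrayList<>(dates);
        Collections.sort(sorted); // lexicographic comparison, character by character
        return sorted;
    }

    public static void main(String[] args) {
        // A shorter value that is a prefix of a longer one sorts first,
        // so "2002" precedes "2002-04".
        System.out.println(sortIssueDates(List.of("2002-04", "2001-12-10", "2002")));
    }
}
```

Passing Collections.reverseOrder() to the sort would give the 'most recent first' ordering instead.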
Date Accessioned: In order to determine which items most recently appeared, rather than using the date of issue, an item's accession date is used. This is the Dublin Core field date.accessioned. In other aspects this index is identical to the date of issue index.

Items by a Particular Author: Another operation the browse API can perform is to extract items by a particular author. They do not have to be the primary author of an item for that item to be extracted. You can specify a scope, too; that is, you can ask for items by author X in collection Y, for example.

This particular flavor of browse is slightly simpler than the others. You cannot presently specify a particular subset of results to be returned. The API call will simply return all of the items by a particular author within a certain scope.

Note that the author of the item must exactly match the author passed in to the API; see the explanation of the caveats of author index browsing to see why this is the case.

Subject: Values of the Dublin Core element subject (both unqualified and with any qualifier) are indexed. These are sorted in a case-insensitive fashion.
Note that in the case of title and date browses, Item objects are returned as opposed to actual titles. In these cases, you can specify the 'focus' to be a specific item, or a partial or full literal value. In the case of a literal value, if no entry in the index matches exactly, the closest match is used as the focus. It's quite reasonable to specify a focus of a single letter, for example.

Being able to specify a specific item to start at is particularly important with dates, since many items may have the same issue date. Say 30 items in a collection have the issue date 2002. To be able to page through the index 20 items at a time, you need to be able to specify exactly which item's 2002 is the focus of the browse; otherwise, each time you invoked the browse code, the results would start at the first item with the issue date 2002.

Author browses return String objects with the actual author names. You can only specify the focus as a full or partial literal String.

Another important point to note is that presently, the browse indexes contain metadata for all items in the main archive, regardless of authorization policies. This means that all items in the archive will appear to all users when browsing. Of course, should the user attempt to access a non-public item, the usual authorization mechanism will apply. Whether this approach is ideal is under review; implementing the browse API such that the results retrieved reflect a user's level of authorization may be possible, but rather tricky.
11.3.10.3. Caveats
Presently, the browse API is not tremendously efficient. 'Indexing' takes the form of simply extracting the relevant Dublin Core value, normalizing it (lower-casing and removing any leading article in the case of titles), and inserting that normalized value with the corresponding item ID in the appropriate browse database table. Database views of this table include collection and community IDs for browse operations with a limited scope. When a browse operation is performed, a simple SELECT query is performed, along the lines of:
SELECT item_id FROM ItemsByTitle ORDER BY sort_title OFFSET 40 LIMIT 20
There are two main drawbacks to this: Firstly, LIMIT and OFFSET are PostgreSQL-specific keywords. Secondly, the database is still actually performing dynamic sorting of the titles, so the browse code as it stands will not scale particularly well. The code does cache BrowseInfo objects, so that common browse operations are performed quickly, but this is not an ideal solution.
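The title normalization described in this section (lower-casing and removing a leading article) can be illustrated roughly as follows. This is a sketch under stated assumptions rather than the DSpace implementation: the class name, the method name and the list of English articles are all hypothetical.

```java
import java.util.Locale;

// Hypothetical helper, not part of the DSpace API.
final class TitleSortKey {

    // Assumed article list for illustration only; the source does not say which articles are stripped.
    private static final String[] ARTICLES = {"the ", "an ", "a "};

    /** Lower-cases the title and drops one leading article, if present. */
    static String normalize(String title) {
        String key = title.trim().toLowerCase(Locale.ROOT);
        for (String article : ARTICLES) {
            if (key.startsWith(article)) {
                return key.substring(article.length()).trim();
            }
        }
        return key;
    }

    public static void main(String[] args) {
        System.out.println(normalize("The Art of Indexing")); // art of indexing
        System.out.println(normalize("Analysis of Browse"));  // analysis of browse
    }
}
```

Each article carries a trailing space, so a title such as "Analysis" is not mistaken for one that starts with "an".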
The default format returned is Atom 1.0, so you should see an Atom document containing your search results. You can extend the syntax with a few other parameters, as follows:

Parameter   Values
format      atom, rss, html
scope       handle of a collection or community to restrict the search to
rpp         number indicating the number of results per page (i.e. per request)
start       number of page to start with (if paginating results)
sort_by     number indicating sorting criteria (same as DSpace advanced search values)
Multiple parameters may be specified on the query string, using the "&" character as the delimiter, e.g.:
http://dspace.mysite.edu/open-search/?query=<your query>&format=rss&scope=123456789/1
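A query string like the one above can be assembled programmatically. The sketch below is illustrative only: the class and method names are hypothetical, and it assumes the server accepts standard form encoding of parameter values (a space becomes "+", and the "/" in a handle becomes "%2F").

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, not part of the DSpace API.
final class OpenSearchUrl {

    /** Joins the parameters with "&" and appends them to the base URL after "?". */
    static String build(String baseUrl, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> p : params.entrySet()) {
            query.add(URLEncoder.encode(p.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(p.getValue(), StandardCharsets.UTF_8));
        }
        return baseUrl + "?" + query;
    }

    public static void main(String[] args) {
        Map<String, String> params = new LinkedHashMap<>(); // keeps insertion order
        params.put("query", "solar energy");
        params.put("format", "rss");
        params.put("scope", "123456789/1");
        System.out.println(build("http://dspace.mysite.edu/open-search/", params));
    }
}
```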
Cheap metasearch: Search aggregators like A9 (Amazon) recognize OpenSearch-compliant providers, so your site can be added to metasearch sets using their UIs. Your site can then be used to aggregate search results with others. Configuration is through the dspace.cfg file. See OpenSearch Support for more details.
DSpace Services Framework

the embargo, whose 'force' now resides entirely in the 'lift date' value. For this reason, you cannot embargo content already in your repository (at least using standard tools). The other action taken at installation time is the actual imposition of the embargo. The default behavior here is simply to remove the read policies on all the bundles and bitstreams except for the "LICENSE" or "METADATA" bundles. See the section on Extending Embargo Functionality for how to alter this behavior. Also note that since these policy changes occur before installation, there is no time during which embargoed content is 'exposed' (accessible by non-administrators). The interpretation of the terms and the imposition of the embargo together are called 'setting' the embargo, and the component that performs them both is called the embargo 'setter'.

3. Embargo Period. After an embargoed item has been installed, the policy restrictions remain in effect until removed. This is not an automatic process, however: a 'lifter' must be run periodically to look for items whose 'lift date' is past. Note that this means the effective removal of an embargo does not happen on the lift date itself, but on the earliest date after the lift date that the lifter is run. Typically, a nightly cron-scheduled invocation of the lifter is more than adequate, given the granularity of embargo terms. Also note that during the embargo period, all metadata of the item remains visible. This default behavior can be changed. One final point to note is that the 'lift date', although it was computed and assigned during the previous stage, is in the end a regular metadata field. That means that if there are extraordinary circumstances that require an administrator (or a collection editor, or anyone else with edit permissions on metadata) to change the lift date, they can do so. Thus, they can 'revise' the lift date without reference to the original terms. This date will be checked the next time the 'lifter' is run. One could immediately lift the embargo by setting the lift date to the current day, or change it to 'forever' to indefinitely postpone lifting.

4. Embargo Lift. When the lifter discovers an item whose lift date is in the past, it removes (lifts) the embargo. The default behavior of the lifter is to add the resource policies that would have been added had the embargo not been imposed. That is, it replicates the standard DSpace behavior, in which an item inherits its policies from its owning collection. As with all other parts of the embargo system, you may replace or extend the default behavior of the lifter (see section V. below). You may wish, for example, to send an email to an administrator or other interested parties when an embargoed item becomes available.

5. Post Embargo. After the embargo has been lifted, the item ceases to respond to any of the embargo lifecycle events. The values of the metadata fields reflect essentially historical or provenance values. With the exception of the additional metadata fields, such items are indistinguishable from items that were never subject to embargo.
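The check the lifter makes against each item's lift date can be sketched as below. This is an illustration under assumptions, not the DSpace lifter: the names are hypothetical, the lift date is assumed to be a full ISO 8601 calendar date, and a lift date equal to the current day is treated as due (the text says setting it to the current day lifts the embargo immediately).

```java
import java.time.LocalDate;

// Hypothetical helper, not part of the DSpace embargo API.
final class EmbargoLiftCheck {

    static final String FOREVER = "forever";

    /** True when the lifter, run on the given day, should lift the embargo. */
    static boolean isLiftDue(String liftDate, LocalDate today) {
        String value = liftDate.trim();
        if (FOREVER.equalsIgnoreCase(value)) {
            return false; // lifting is postponed indefinitely
        }
        LocalDate lift = LocalDate.parse(value); // assumes yyyy-MM-dd
        return !lift.isAfter(today);
    }

    public static void main(String[] args) {
        LocalDate today = LocalDate.of(2011, 3, 15);
        System.out.println(isLiftDue("2011-03-01", today)); // true
        System.out.println(isLiftDue("2011-04-01", today)); // false
        System.out.println(isLiftDue("forever", today));    // false
    }
}
```

Because the check only happens when the lifter runs, an item whose lift date passed last week is lifted on the next run, not retroactively.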
11.4.2.1.2. Kernel Startup and Access
The kernel can be started and accessed through the use of Servlet Filter/ContextListeners which are provided as part of the DSpace 2 utilities. Developers don't need to understand what is going on behind the scenes and can simply write their applications and package them as webapps and take advantage of the services which are offered by DSpace 2. Access to the kernel is provided via the Kernel Manager and the DSpace object which will locate the kernel object and allow it to be used.
/* Get the Service Manager via the convenience method */
ServiceManager manager = dspace.getServiceManager();
The DS2 kernel is compact, so it can be completely started up in a unit test (technically integration test) environment; this is how we currently test the kernel and core services. This allows developers to execute code against a fully functional kernel while developing and then deploy their code with high confidence.
11.4.3.1. Activators
Developers can use an activator to allow the system to start up their service or provider. It is a simple interface with two methods, which are called to start up the provider(s) and later to shut them down. These simply allow a developer to run some arbitrary code in order to create and register services if desired. It is the method provided for adding plugins directly to the system via configuration: the activators are just listed in the configuration file, and the system starts them up in the order it finds them.
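The shape of such a two-method contract can be sketched as follows. Everything here is hypothetical and self-contained: the interface, the map standing in for a service registry, and the class names are illustrative assumptions, not the DS2 API.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; the real DS2 interface and registry types may differ.
final class ActivatorSketch {

    /** Assumed two-method contract: one call to start up, one to shut down. */
    interface Activator {
        void start(Map<String, Object> services);
        void stop(Map<String, Object> services);
    }

    /** Example activator that registers a single service on start. */
    static final class GreetingActivator implements Activator {
        @Override
        public void start(Map<String, Object> services) {
            services.put("greeting", "hello");
        }

        @Override
        public void stop(Map<String, Object> services) {
            services.remove("greeting");
        }
    }

    /** Starts the activators in the order they were listed in configuration. */
    static Map<String, Object> startAll(List<Activator> configured) {
        Map<String, Object> services = new LinkedHashMap<>();
        for (Activator activator : configured) {
            activator.start(services);
        }
        return services;
    }

    public static void main(String[] args) {
        System.out.println(startAll(List.of(new GreetingActivator())));
    }
}
```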
Storage Layer
11.4.4.3. EventService
Handles events and provides access to listeners for consumption of events.
11.4.4.4. RequestService
In DS2, a request is either an HTTP request or an atomic transaction in the system. It is likely to be an HTTP request in many cases, but it does not have to be. This service provides the core services with a way to manage atomic transactions, so that when a request comes in which requires multiple things to happen, they can either all succeed or all fail without each service attempting to manage this independently. In a nutshell, this simply allows identification of the current request and the ability to discover whether it succeeded or failed when it ends. Nothing in the system will enforce usage of the service, but we encourage developers who are interacting with the system to make use of it so they know whether the request they are participating in has succeeded or failed and can take appropriate actions.
11.4.4.5. SessionService
In DS2 a session is like an HttpSession (and generally is actually one) so this service is here to allow developers to find information about the current session and to access information in it. The session identifies the current user (if authenticated) so it also serves as a way to track user sessions. Since we use HttpSession directly it is easy to mirror sessions across multiple servers in order to allow for no-interruption failover for users when servers go offline.
Storage Layer
Most of the functionality that DSpace uses can be offered by any standard SQL database that supports transactions. Presently, the browse indices use some features specific to PostgreSQL and Oracle, so some modification to the code would be needed before DSpace would function fully with an alternative database back-end.

The org.dspace.storage.rdbms package provides access to an SQL database in a somewhat simpler form than using JDBC directly. The main class is DatabaseManager, which executes SQL queries and returns TableRow or TableRowIterator objects. The InitializeDatabase class is used to load SQL into the database via JDBC, for example to set up the schema.

All calls to the Database Manager require a DSpace Context object. Example use of the database manager API is given in the org.dspace.storage.rdbms package Javadoc.

The database schema used by DSpace is created by SQL statements stored in a directory specific to each supported RDBMS platform:

  PostgreSQL schemas are in [dspace-source]/dspace/etc/postgres/
  Oracle schemas are in [dspace-source]/dspace/etc/oracle/

The SQL (DDL) statements to create the tables for the current release, starting with an empty database, are in database_schema.sql. The schema SQL file also creates the two e-person groups (Anonymous and Administrator) that are required for the system to function properly.

Also in [dspace-source]/dspace/etc/[database] are various SQL files called database_schema_1x_1y. These contain the necessary SQL commands to update a live DSpace database from version 1.x to 1.y. Note that this might not be the only part of an upgrade process: see Updating a DSpace Installation for details.

The DSpace database code uses an SQL function getnextid to assign primary keys to newly created rows. This SQL function must be safe to use if several JVMs are accessing the database at once; for example, the Web UI might be creating new rows in the database at the same time as the batch item importer. The PostgreSQL-specific implementation of the function uses SEQUENCES for each table in order to create new IDs. If an alternative database back-end were to be used, the implementation of getnextid could be updated to operate with that specific DBMS.

The etc directory in the source distribution contains two further SQL files. clean-database.sql contains the SQL necessary to completely clean out the database, so use with caution! The Ant target clean_database can be used to execute this. update-sequences.sql contains SQL to reset the primary key generation sequences to appropriate values. You'd need to do this if, for example, you're restoring a backup database dump which creates rows with specific primary keys already defined. In such a case, the sequences would allocate primary keys that were already used.

Versions of the .sql files for Oracle are stored in [dspace-source]/dspace/etc/oracle. These need to be copied over their PostgreSQL counterparts in [dspace-source]/dspace/etc prior to installation.
The DSpace database can be backed up and restored using usual methods, for example with pg_dump and psql. However when restoring a database, you will need to perform these additional steps: The fresh_install target loads up the initial contents of the Dublin Core type and bitstream format registries, as well as two entries in the epersongroup table for the system anonymous and administrator groups. Before you restore a raw backup of your database you will need to remove these, since they will already exist in your backup, possibly having been modified. For example, use:
DELETE FROM dctyperegistry;
DELETE FROM bitstreamformatregistry;
DELETE FROM epersongroup;
After restoring a backup, you will need to reset the primary key generation sequences so that they do not produce already-used primary keys. Do this by executing the SQL in [dspace-source]/dspace/etc/update-sequences.sql, for example with:
psql -U dspace -f [dspace-source]/dspace/etc/update-sequences.sql
Future updates of DSpace may involve minor changes to the database schema. Specific instructions on how to update the schema whilst keeping live data will be included. The current schema also contains a few currently unused database columns, to be used for extra functionality in future releases. These unused columns have been added in advance to minimize the effort required to upgrade.
db.driver
db.username
db.password
to the relevant store directory) that the bitstream is stored in traditional or SRB storage. The first three pairs of digits are the directory path that the bitstream is stored under. The bitstream is stored in a file with the internal ID as the filename. For example, a bitstream with the internal ID 12345678901234567890123456789012345678 is stored in the directory:
(assetstore dir)/12/34/56/12345678901234567890123456789012345678
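The mapping from internal ID to storage path can be written out in a few lines. This is a minimal sketch with hypothetical names; it only reproduces the rule stated above (the first three pairs of digits form the directory path, and the full internal ID is the filename) relative to the asset store directory.

```java
// Hypothetical helper, not part of the DSpace API.
final class AssetPath {

    /** Builds the path of a bitstream relative to the asset store directory. */
    static String relativePath(String internalId) {
        if (internalId.length() < 6) {
            throw new IllegalArgumentException("internal ID too short: " + internalId);
        }
        return internalId.substring(0, 2) + "/"
                + internalId.substring(2, 4) + "/"
                + internalId.substring(4, 6) + "/"
                + internalId;
    }

    public static void main(String[] args) {
        // Prints 12/34/56/12345678901234567890123456789012345678
        System.out.println(relativePath("12345678901234567890123456789012345678"));
    }
}
```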
The reasons for storing files this way are:

  Using a randomly-generated 38-digit number means that the 'number space' is less cluttered than simply using the primary keys, which are allocated sequentially and are thus close together. This means that the bitstreams in the store are distributed around the directory structure, improving access efficiency.

  The internal ID is used as the filename partly to avoid requiring an extra lookup of the filename of the bitstream, and partly because bitstreams may be received from a variety of operating systems. The original name of a bitstream may be an illegal UNIX filename.

When storing a bitstream, the BitstreamStorageManager DOES set the following fields in the corresponding database table row:

  bitstream_id
  size
  checksum
  checksum_algorithm
  internal_id
  deleted
  store_number

The remaining fields are the responsibility of the Bitstream content management API class.

The bitstream storage manager is fully transaction-safe. In order to implement transaction-safety, the following algorithm is used to store bitstreams:

1. A database connection is created, separately from the currently active connection in the current DSpace context.
2. A unique internal identifier (separate from the database primary key) is generated.
3. The bitstream DB table row is created using this new connection, with the deleted column set to true.
4. The new connection is committed, so the 'deleted' bitstream row is written to the database.
5. The bitstream itself is stored in a file in the configured 'asset store directory', with a directory path and filename derived from the internal ID.
6. The deleted flag in the bitstream row is set to false. This will occur (or not) as part of the current DSpace Context.

This means that should anything go wrong before, during or after the bitstream storage, only one of the following can be true:

  No bitstream table row was created, and no file was stored
  A bitstream table row with deleted=true was created, no file was stored
  A bitstream table row with deleted=true was created, and a file was stored

None of these affect the integrity of the data in the database or bitstream store.

Similarly, when a bitstream is deleted for some reason, its deleted flag is set to true as part of the overall transaction, and the corresponding file in storage is not deleted.

The above techniques mean that the bitstream storage manager is transaction-safe. Over time, the bitstream database table and file store may contain a number of 'deleted' bitstreams. The cleanup method of BitstreamStorageManager goes through these deleted rows, and actually deletes them along with any corresponding files left in the storage. It only removes 'deleted' bitstreams that are more than one hour old, just in case cleanup is happening in the middle of a storage operation.

This cleanup can be invoked from the command line via the Cleanup class, which can in turn be easily executed from a shell on the server machine using /dspace/bin/cleanup. You might like to have this run regularly by cron, though since DSpace is read-lots, write-not-so-much it doesn't need to be run very often.
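The one-hour safety margin applied by cleanup can be expressed as a small predicate. This is a sketch under assumptions, not the BitstreamStorageManager code: the names are hypothetical, and the age of a row is assumed to be measured from a timestamp supplied by the caller.

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper, not part of the DSpace API.
final class CleanupRule {

    // Rows younger than this may belong to a storage operation still in progress.
    static final Duration MIN_AGE = Duration.ofHours(1);

    /** True only for rows flagged deleted that are more than one hour old. */
    static boolean shouldRemove(boolean deleted, Instant lastTouched, Instant now) {
        return deleted && Duration.between(lastTouched, now).compareTo(MIN_AGE) > 0;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2010-01-01T12:00:00Z");
        System.out.println(shouldRemove(true, now.minus(Duration.ofMinutes(90)), now)); // true
        System.out.println(shouldRemove(true, now.minus(Duration.ofMinutes(30)), now)); // false
        System.out.println(shouldRemove(false, now.minus(Duration.ofHours(5)), now));   // false
    }
}
```

The second case is the one the margin protects: a row written with deleted=true in step 3 of the storage algorithm, whose file is still being stored.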
11.5.2.1. Backup
The bitstreams (files) in traditional storage may be backed up very easily by simply 'tarring' or 'zipping' the assetstore directory (or whichever directory is configured in dspace.cfg). Restoring is as simple as extracting the backed-up compressed file in the appropriate location. Similar means could be used for SRB, but SRB offers many more options for managing backup.

It is important to note that since the bitstream storage manager holds the bitstreams in storage, and information about them in the database, a database backup and a backup of the files in the bitstream store must be made at the same time; the bitstream data in the database must correspond to the stored files.

Of course, it isn't really ideal to 'freeze' the system while backing up to ensure that the database and files match up. Since DSpace uses the bitstream data in the database as the authoritative record, it's best to back up the database before the files. This is because it's better to have a bitstream in storage but not in the database (effectively non-existent to DSpace) than a bitstream record in the database but not in storage, since people would be able to find the bitstream but not actually get the contents.

With DSpace 1.7 and above, there is also the option to back up both files and metadata via the AIP Backup and Restore feature (Section 9.18).
(Remember that [dspace] is a placeholder for the actual name of your DSpace install directory). The above example specifies a single asset store.
assetstore.dir = [dspace]/assetstore_0
assetstore.dir.1 = /mnt/other_filesystem/assetstore_1
The above example specifies two asset stores. assetstore.dir specifies asset store number 0 (zero); after that use assetstore.dir.1, assetstore.dir.2 and so on. The particular asset store a bitstream is stored in is held in the database, so don't move bitstreams between asset stores, and don't renumber them.

By default, newly created bitstreams are put in asset store 0 (i.e. the one specified by the assetstore.dir property). This allows backwards compatibility with pre-DSpace 1.1 configurations. To change this, for example when asset store 0 is getting full, add a line to dspace.cfg like:
assetstore.incoming = 1
Then restart DSpace (Tomcat). New bitstreams will be written to the asset store specified by assetstore.dir.1, which is /mnt/other_filesystem/assetstore_1 in the above example.

11.5.2.2.2. Configuring SRB Storage
The same framework is used to configure SRB storage. That is, the asset store number (0..n) can reference a file system directory as above, or it can reference a set of SRB account parameters. Any particular asset store number can reference one or the other, but not both. This way traditional and SRB storage can both be used, but with different asset store numbers. The same cautions mentioned above apply to SRB asset stores as well: the particular asset store a bitstream is stored in is held in the database, so don't move bitstreams between asset stores, and don't renumber them.

For example, let's say asset store number 1 will refer to SRB. Then there will be a set of SRB account parameters like this:
srb.host.1 = mysrbmcathost.myu.edu
srb.port.1 = 5544
srb.mcatzone.1 = mysrbzone
srb.mdasdomainname.1 = mysrbdomain
srb.defaultstorageresource.1 = mydefaultsrbresource
srb.username.1 = mysrbuser
srb.password.1 = mysrbpassword
srb.homedirectory.1 = /mysrbzone/home/mysrbuser.mysrbdomain
srb.parentdir.1 = mysrbdspaceassetstore
Several of the terms, such as mcatzone, have meaning only in the SRB context and will be familiar to SRB users. The last, srb.parentdir.n, can be used for additional (SRB) upper directory structure within an SRB account. This property value could be blank as well.

(If asset store 0 were to refer to SRB, the properties would be srb.host = ..., srb.port = ..., and so on, with the .0 omitted, to be consistent with the traditional storage configuration above.)

The similar use of assetstore.incoming to reference asset store 0 (default) or 1..n (explicit property) means that new bitstreams will be written to traditional or SRB storage, depending on whether that asset store number references a file system directory on the server or a set of SRB account parameters.

There are comments in dspace.cfg that further elaborate the configuration of traditional and SRB storage.
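The numbering convention for traditional asset stores (assetstore.dir for store 0, assetstore.dir.n for store n, and assetstore.incoming selecting where new bitstreams go) can be sketched as a property lookup. This is illustrative only: the class and method names are hypothetical, and SRB-backed stores are deliberately not handled.

```java
import java.util.Properties;

// Hypothetical helper, not part of the DSpace API; traditional (file system) stores only.
final class AssetStoreConfig {

    /** Property name holding the directory of the given asset store number. */
    static String keyFor(int storeNumber) {
        return storeNumber == 0 ? "assetstore.dir" : "assetstore.dir." + storeNumber;
    }

    /** Directory that newly created bitstreams are written to; store 0 if unset. */
    static String incomingDir(Properties cfg) {
        int incoming = Integer.parseInt(cfg.getProperty("assetstore.incoming", "0").trim());
        return cfg.getProperty(keyFor(incoming));
    }

    public static void main(String[] args) {
        Properties cfg = new Properties();
        cfg.setProperty("assetstore.dir", "[dspace]/assetstore_0");
        cfg.setProperty("assetstore.dir.1", "/mnt/other_filesystem/assetstore_1");
        System.out.println(incomingDir(cfg)); // [dspace]/assetstore_0
        cfg.setProperty("assetstore.incoming", "1");
        System.out.println(incomingDir(cfg)); // /mnt/other_filesystem/assetstore_1
    }
}
```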
Because this file is in XML format, you should be familiar with XML before editing this file. By default, this file contains the "traditional" Item Submission Process for DSpace, which consists of the following Steps (in this order):

Select Collection -> Initial Questions -> Describe -> Upload -> Verify -> License -> Complete

If you would like to customize the steps used or the ordering of the steps, you can do so within the <submission-definition> section of item-submission.xml. In addition, you may also specify different Submission Processes for different DSpace Collections. This can be done in the <submission-map> section. The item-submission.xml file itself documents the syntax required to perform these configuration changes.
The above step definition could then be referenced from within a <submission-process> as simply <step id="custom-step"/>

2. Within a specific <submission-process> definition
This is for steps which are specific to a single <submission-process> definition. For example:
<submission-process>
  <step>
    ...
  </step>
</submission-process>
Each step contains the following elements. The required elements are so marked:

heading: Partial I18N key (defined in Messages.properties for JSPUI or messages.xml for XMLUI) which corresponds to the text that should be displayed in the submission Progress Bar for this step. This partial I18N key is prefixed within either the Messages.properties or messages.xml file, depending on the interface you are using. Therefore, to find the actual key, you will need to search for the partial key with the following prefix:
  XMLUI: prefix is xmlui.Submission. (e.g. "xmlui.Submission.submit.progressbar.describe" for 'Describe' step)
  JSPUI: prefix is jsp. (e.g. "jsp.submit.progressbar.describe" for 'Describe' step)
The 'heading' need not be defined if the step should not appear in the progress bar (e.g. steps which perform automated processing, i.e. non-interactive, should not appear in the progress bar).

processing-class (Required): Full Java path to the Processing Class for this Step. This Processing Class must perform the primary processing of any information gathered in this step, for both the XMLUI and JSPUI. All valid step processing classes must extend the abstract org.dspace.submit.AbstractProcessingStep class (or alternatively, extend one of the pre-existing step processing classes in org.dspace.submit.step.*).

jspui-binding: Full Java path of the JSPUI "binding" class for this Step. This "binding" class should initialize and call the appropriate JSPs to display the step's user interface. A valid JSPUI "binding" class must extend the abstract org.dspace.app.webui.submit.JSPStep class. This property need not be defined if you are using the XMLUI interface, or for steps which only perform automated processing, i.e. non-interactive steps.

xmlui-binding: Full Java path of the XMLUI "binding" class for this Step. This "binding" class should generate the Manakin XML (DRI document) necessary to generate the step's user interface. A valid XMLUI "binding" class must extend the abstract org.dspace.app.xmlui.submission.AbstractSubmissionStep class. This property need not be defined if you are using the JSPUI interface, or for steps which only perform automated processing, i.e. non-interactive steps.
Reordering/Removing Submission Steps

workflow-editable: Defines whether or not this step can be edited during the Edit Metadata process with the DSpace approval/rejection workflow process. Possible values include true and false. If undefined, defaults to true (which means that workflow reviewers would be allowed to edit information gathered during that step).
It's a good idea to keep the definition of the default name-map from the example input-forms.xml so there is always a default for collections which do not have a custom form set.

12.4.3.1.1. Getting A Collection's Handle
You will need the handle of a collection in order to assign it a custom form set. To discover the handle, go to the "Communities & Collections" page under "Browse" in the left-hand menu on your DSpace home page. Then, find the link to your collection. It should look something like:
http://myhost.my.edu/dspace/handle/12345.6789/42
The part of the URL after /handle/ (here, 12345.6789/42) is the handle. It should look familiar to any DSpace administrator. That is what goes in the collection-handle attribute of your name-map element.
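Pulling the handle out of a collection link is a simple string operation. The sketch below uses hypothetical names and assumes the link has the form shown above, with the handle being everything after the "/handle/" segment.

```java
// Hypothetical helper, not part of the DSpace API.
final class HandleFromUrl {

    private static final String MARKER = "/handle/";

    /** Returns the text after "/handle/", e.g. the value for collection-handle. */
    static String extract(String url) {
        int start = url.indexOf(MARKER);
        if (start < 0) {
            throw new IllegalArgumentException("no /handle/ segment in " + url);
        }
        return url.substring(start + MARKER.length());
    }

    public static void main(String[] args) {
        // Prints 12345.6789/42
        System.out.println(extract("http://myhost.my.edu/dspace/handle/12345.6789/42"));
    }
}
```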
The page element, in turn, contains a sequence of field elements. Each field defines an interactive dialog where the submitter enters one of the Dublin Core metadata items.

12.4.3.2.2. Composition of a Field
Each field contains the following elements, in the order indicated. The required sub-elements are so marked:

dc-schema (Required): Name of metadata schema employed, e.g. dc for Dublin Core. This value must match the value of the schema element defined in dublin-core-types.xml.

dc-element (Required): Name of the Dublin Core element entered in this field, e.g. contributor.

dc-qualifier: Qualifier of the Dublin Core element entered in this field, e.g. when the field is contributor.advisor the value of this element would be advisor. Leaving this out means the input is for an unqualified DC element.

repeatable: Value is true when multiple values of this field are allowed, false otherwise. When you mark a field repeatable, the UI servlet will add a control to let the user ask for more fields to enter additional values. Intended to be used for arbitrarily-repeating fields such as subject keywords, when it is impossible to know in advance how many input boxes to provide.

label (Required): Text to display as the label of this field, describing what to enter, e.g. "Your Advisor's Name".
Custom Metadata-entry Pages for Submission

input-type (Required): Defines the kind of interactive widget to put in the form to collect the Dublin Core value. Content must be one of the following keywords:

  onebox: A single text-entry box.
  twobox: A pair of simple text-entry boxes, used for repeatable values such as the DC subject item. Note: The 'twobox' input type is rendered the same as a 'onebox' in the XML-UI, but both allow for ease of adding multiple values.
  textarea: Large block of text that can be entered on multiple lines, e.g. for an abstract.
  name: Personal name, with separate fields for family name and first name. When saved they are appended in the format 'LastName, FirstName'.
  date: Calendar date. When required, demands that at least the year be entered.
  series: Series/Report name and number. Separate fields are provided for series name and series number, but they are appended (with a semicolon between) when saved.
  dropdown: Choose value(s) from a "drop-down" menu list. Note: You must also include a value for the value-pairs-name attribute to specify a list of menu entries from which to choose. Use this to make a choice from a restricted set of options, such as for the language item.
  qualdrop_value: Enter a "qualified value", which includes both a qualifier from a drop-down menu and a free-text value. Used to enter items like alternate identifiers and codes for a submitted item, e.g. the DC identifier field. Note: As for the dropdown type, you must include the value-pairs-name attribute to specify a menu choice list.
  list: Choose value(s) from a checkbox or radio button list. If the repeatable attribute is set to true, a list of checkboxes is displayed. If the repeatable attribute is set to false, a list of radio buttons is displayed. Note: You must also include a value for the value-pairs-name attribute to specify a list of values from which to choose.

hint (Required): Content is the text that will appear as a "hint", or instructions, next to the input fields. Can be left empty, but it must be present.

required: When this element is included with any content, it marks the field as a required input. If the user tries to leave the page without entering a value for this field, that text is displayed as a warning message. For example, <required>You must enter a title.</required> Note that leaving the required element empty will not mark a field as required, e.g.: <required></required>

visibility: When this optional element is included with a value, it restricts the visibility of the field to the scope defined by that value. If the element is missing or empty, the field is visible in all scopes. Currently supported scopes are:

  workflow: the field will only be visible in the workflow stages of submission. This is good for hiding difficult fields from users, such as subject classifications, thereby easing the use of the submission system.
  submit: the field will only be visible in the initial submission, and not in the workflow stages.

In addition, you can choose which type of restriction applies outside that scope, read-only or fully hidden (the default behaviour), using the otherwise attribute of the visibility XML element. For example: <visibility otherwise="readonly">workflow</visibility>

Note that it is considered a configuration error to limit a field's scope while also requiring it; an exception will be generated when this combination is detected.

Look at the example input-forms.xml and experiment with a trial custom form to learn this specification language thoroughly. It is a very simple way to express the layout of data-entry forms, but the only way to learn all its subtleties is to use it.
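As a starting point for such an experiment, the sketch below shows two field definitions that exercise the elements described above. The element names other than input-type, hint, required, and visibility (dc-schema, dc-element, dc-qualifier, repeatable, label) are assumed to follow the sample input-forms.xml shipped with DSpace, and the list name common_identifiers, the labels, and the hint texts are illustrative only.

```xml
<!-- A qualified-value field: the drop-down qualifiers come from the
     value-pairs list named in the value-pairs-name attribute
     (list name assumed for illustration). -->
<field>
  <dc-schema>dc</dc-schema>
  <dc-element>identifier</dc-element>
  <dc-qualifier></dc-qualifier>
  <repeatable>true</repeatable>
  <label>Identifiers</label>
  <input-type value-pairs-name="common_identifiers">qualdrop_value</input-type>
  <hint>Enter any standard numbers or codes associated with this item.</hint>
  <!-- Empty: the field is NOT required. -->
  <required></required>
</field>

<!-- A subject field shown only during workflow and read-only elsewhere.
     It is deliberately not required, because combining a restricted
     scope with a required field is a configuration error. -->
<field>
  <dc-schema>dc</dc-schema>
  <dc-element>subject</dc-element>
  <dc-qualifier></dc-qualifier>
  <repeatable>true</repeatable>
  <label>Subject Keywords</label>
  <input-type>twobox</input-type>
  <hint>Enter appropriate subject keywords or phrases.</hint>
  <required></required>
  <visibility otherwise="readonly">workflow</visibility>
</field>
```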
For the use of controlled vocabularies see the Configuring Controlled Vocabularies section.

12.4.3.2.3. Automatically Elided Fields

You may notice that some fields are automatically skipped when a custom form page is displayed, depending on the kind of item being submitted. This is because the DSpace user-interface engine skips Dublin Core fields which are not needed, according to the initial description of the item. For example, if the user indicates there are no alternate titles on the first "Describe" page (the one with a few checkboxes), the input for the title.alternative DC element is automatically elided, even on custom submission pages.

When a user initiates a submission, DSpace first displays what we'll call the "initial-questions page". By default, it contains three questions with check-boxes:

1. The item has more than one title, e.g. a translated title. Controls the title.alternative field.
2. The item has been published or publicly distributed before. Controls the DC fields date.issued, publisher, and identifier.citation.
3. The item consists of more than one file. Does not affect any metadata input fields.

The answers to the first two questions control whether inputs for certain of the DC metadata fields will be displayed, even if they are defined as fields in a custom page. Conversely, if the metadata fields controlled by a checkbox are not mentioned in the custom form, the checkbox is elided from the initial page to avoid confusing or misleading the user.

The two relevant checkbox entries are "The item has more than one title, e.g. a translated title", and "The item has been published or publicly distributed before". The checkbox for multiple titles triggers the display of the field with dc-element equal to 'title' and dc-qualifier equal to 'alternative'. If the controlling collection's form set does not contain this field, then the multiple titles question will not appear on the initial questions page.
It generates the following HTML, which results in the menu widget below. (Note that there is no way to indicate a default choice in the custom input XML, so it cannot generate the HTML SELECTED attribute to mark one of the options as a pre-selected default.)
<select name="identifier_qualifier_0">
  <option VALUE="govdoc">Gov't Doc #</option>
  <option VALUE="uri">URI</option>
  <option VALUE="isbn">ISBN</option>
</select>
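For orientation, a value-pairs list along the following lines in input-forms.xml would be expected to yield a menu like the one above. This is a sketch only: the displayed and stored values are taken from the HTML shown, while the list name common_identifiers, the dc-term value, and the enclosing element names are assumptions based on the sample input-forms.xml.

```xml
<form-value-pairs>
  <!-- The list name is what a field's value-pairs-name attribute refers to
       (name assumed for illustration). -->
  <value-pairs value-pairs-name="common_identifiers" dc-term="identifier">
    <pair>
      <!-- Text shown to the user in the menu -->
      <displayed-value>Gov't Doc #</displayed-value>
      <!-- Value emitted in the option's VALUE attribute -->
      <stored-value>govdoc</stored-value>
    </pair>
    <pair>
      <displayed-value>URI</displayed-value>
      <stored-value>uri</stored-value>
    </pair>
    <pair>
      <displayed-value>ISBN</displayed-value>
      <stored-value>isbn</stored-value>
    </pair>
  </value-pairs>
</form-value-pairs>
```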
Creating new Submission Steps

That being said, at a higher level, creating a new Submission Step requires the following (in this relative order):

1. (Required) Create a new Step Processing class. This class must extend the abstract org.dspace.submit.AbstractProcessingStep class and implement all methods defined by that abstract class. This class should be built in such a way that it can process the input gathered from either the XMLUI or JSPUI interface.

2. (For steps using JSPUI) Create the JSPs to display the user interface. Create a new JSPUI "binding" class to initialize and call these JSPs. Your JSPUI "binding" class must extend the abstract class org.dspace.app.webui.submit.JSPStep and implement all methods defined there. It's recommended to use one of the classes in org.dspace.app.webui.submit.step.* as a reference. Any JSPs created should be loaded by calling the showJSP() method of the org.dspace.app.webui.submit.JSPStepManager class. If this step gathers information to be reviewed, you must also create a Review JSP which will display a read-only view of all data gathered during this step. The path to this JSP must be returned by your getReviewJSP() method. You will find examples of Review JSPs (named similar to review-[step].jsp) in the JSP submit/ directory.

3. (For steps using XMLUI) Create an XMLUI "binding" Step Transformer which will generate the DRI XML which Manakin requires. The Step Transformer must extend and implement all necessary methods within the abstract class org.dspace.app.xmlui.submission.AbstractSubmissionStep. It is useful to use the existing classes in org.dspace.app.xmlui.submission.submit.* as references.

4. (Required) Add a valid Step Definition to the item-submission.xml configuration file. This may also require that you add an I18N (Internationalization) key for this step's heading. See the sections on Configuring Multilingual Support for JSPUI or Configuring Multilingual Support for XMLUI for more details. For more information on <step> definitions within the item-submission.xml, see the section above on Defining Steps (<step>) within the item-submission.xml.
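To make step 4 concrete, the following is a minimal sketch of what an interactive step definition might look like once the classes from steps 1 to 3 exist. All class names and the heading key are hypothetical placeholders, and the heading, jspui-binding, and xmlui-binding element names are assumed from the stock item-submission.xml; check them against the Defining Steps section before use.

```xml
<step>
  <!-- I18N key for the step's heading in the progress bar (hypothetical key) -->
  <heading>submit.progressbar.mycustom</heading>
  <!-- Step 1: processing class extending org.dspace.submit.AbstractProcessingStep
       (hypothetical class name) -->
  <processing-class>org.dspace.submit.step.MyCustomStep</processing-class>
  <!-- Step 2: JSPUI binding extending org.dspace.app.webui.submit.JSPStep
       (hypothetical class name) -->
  <jspui-binding>org.dspace.app.webui.submit.step.JSPMyCustomStep</jspui-binding>
  <!-- Step 3: XMLUI Step Transformer extending AbstractSubmissionStep
       (hypothetical class name) -->
  <xmlui-binding>org.dspace.app.xmlui.submission.submit.MyCustomStep</xmlui-binding>
  <!-- Whether the step can be edited during workflow -->
  <workflow-editable>true</workflow-editable>
</step>
```

Only the binding for the interface actually in use needs to be supplied.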
File' step, then place its configuration immediately after the configuration for that 'Upload File' step. The configuration should look similar to the following:
<step>
  <processing-class>org.dspace.submit.step.MyNonInteractiveStep</processing-class>
  <workflow-editable>false</workflow-editable>
</step>
Note: Non-interactive steps will not appear in the Progress Bar! Therefore, your submitters will not even know they are there. However, because they are not visible to your users, you should make sure that your non-interactive step does not take a large amount of time to finish its processing and return control to the next step (otherwise there will be a visible time delay in the user interface).
13.1. Introduction
This manual describes the Digital Repository Interface (DRI) as it applies to the DSpace digital repository and XMLUI Manakin based interface. DSpace XML UI is a comprehensive user interface system. It is centralized and generic, allowing it to be applied to all DSpace pages, effectively replacing the JSP-based interface system. Its ability to apply specific styles to arbitrarily large sets of DSpace pages significantly eases the task of adapting the DSpace look and feel to that of the adopting institution. This also allows for several levels of branding, lending institutional credibility to the repository and collections. Manakin, the second version of DSpace XML UI, consists of several components, written using Java, XML, and XSL, and is implemented in Cocoon. Central to the interface is the XML Document, which is a semantic representation of a DSpace page. In Manakin, the XML Document adheres to a schema called the Digital Repository Interface (DRI) Schema, which was developed in conjunction with Manakin and is the subject of this guide. For the remainder of this guide, the terms XML Document, DRI Document, and Document will be used interchangeably. This reference document explains the purpose of DRI, provides a broad architectural overview, and explains common design patterns. The appendix includes a complete reference for elements used in the DRI Schema, a graphical representation of the element hierarchy, and a quick reference table of elements and attributes.
DRI in Manakin

as such but because it happens to precede it. Translating these structures into formats where these types of relationships are explicit becomes tedious and potentially problematic. More structured schemas, like TEI or Docbook, are domain specific (much like DRI itself) and therefore not suitable for our purposes.

We also decided that the schema should natively support a metadata standard for encoding artifacts. Rather than encoding artifact metadata in structural elements, like tables or lists, the schema would include artifacts as objects encoded in a particular standard. The inclusion of metadata in native format would enable the Theme to choose the best method to render the artifact for display without being tied to a particular structure.

Ultimately, we chose to develop our own schema. We have constructed the DRI schema by incorporating other standards when appropriate, such as Cocoon's i18n schema for internationalization, DCMI's Dublin Core, and the Library of Congress's METS schema. The design of structural elements was derived primarily from TEI, with some of the design patterns borrowed from other existing standards such as DocBook and XHTML. While the structural elements were designed to be easily translated into XHTML, they preserve the semantic relationships for use in more expressive languages.
13.2.1. Themes
A Theme is a collection of XSL stylesheets and supporting files like images, CSS styles, translations, and help documents. The XSL stylesheets are applied to the DRI Document to convert it into a readable format and give it structure and basic visual formatting in that format. The supporting files are used to provide the page with a specific look and feel, insert images and other media, translate the content, and perform other tasks. The currently used output format is XHTML and the supporting files are generally limited to CSS, images, and JavaScript. More output formats, like PDF or SVG, may be added in the future.

A DSpace installation running Manakin may have several Themes associated with it. When applied to a page, a Theme determines most of the page's look and feel. Different themes can be applied to different sets of DSpace pages, allowing for both variety of styles between sets of pages and consistency within those sets. The xmlui.xconf configuration file determines which Themes are applied to which DSpace pages (see Chapter 7. Manakin [XMLUI] Configuration and Customization for more information on installing and configuring themes). Themes may be configured to apply to all pages of a specific type, like browse-by-title, to all pages of one particular community or collection or sets of communities and collections, and to any mix of the two. They can also be configured to apply to a single arbitrary page or handle.
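As an illustration of these options, a themes section in xmlui.xconf might look roughly like the sketch below. The theme names, paths, handle, and URL pattern are hypothetical, and the attribute names (name, handle, regex, path) are assumed from the stock configuration file; the chapter referenced above remains the authority on the exact syntax and matching order.

```xml
<themes>
  <!-- Applied to one particular community or collection, identified by its
       handle (hypothetical handle and path) -->
  <theme name="Example Collection Theme" handle="123456789/1" path="ExampleCollection/"/>

  <!-- Applied to all pages of a specific type, matched by URL pattern
       (hypothetical pattern and path) -->
  <theme name="Example Browse Theme" regex="browse-title" path="ExampleBrowse/"/>

  <!-- Catch-all rule for every remaining page -->
  <theme name="Default Reference Theme" regex=".*" path="Reference/"/>
</themes>
```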
Schema Overview
Figure 1: The two content types across three major divisions of a DRI page.

The document element is the root for all DRI pages and contains all other elements. It bears only one attribute, version, that contains the version number of the DRI system and the schema used to validate the produced document. At the time of writing the working version number is "1.1".

The meta element is a top-level element under document and contains all metadata information about the page, the user that requested it, and the repository it is used with. It contains no structural elements, instead being the only container of metadata elements in a DRI Document. The metadata stored by the meta element is broken up into three major groups: userMeta, pageMeta, and objectMeta, each storing metadata information about their respective component. Please refer to the reference entries for more information about these elements.

The options element is another top-level element that contains all navigation and action options available to the user. The options are stored as items in list elements, broken up by the type of action they perform. The five types of actions are: browsing, search, language selection, actions that are always available, and actions that are context dependent. The two action types also contain sub-lists that contain actions available to users of varying degrees of access to the system. The options element contains no metadata elements and can only make use of a small set of structural elements, namely the list element and its children.

The last major top-level element is the body element. It contains all structural elements in a DRI Document, including the lists used by the options element. Structural elements are used to build a generic representation of a DSpace page. Any DSpace page can be represented with a combination of the structural elements, which will in turn be transformed by the XSL templates into another format. This is the core mechanism that allows DSpace XML UI to apply uniform templates and styling rules to all DSpace pages and is the fundamental difference from the JSP approach currently used by DSpace.

The body element directly contains only one type of element: div. The div element serves as a major division of content and any number of them can be contained by the body. Additionally, divisions are recursive, allowing divs to contain other divs. It is within these elements that all other structural elements are contained. Those elements include tables, paragraph elements p, and lists, as well as their various children elements. At the lower levels of this hierarchy lie the character container elements. These elements, namely paragraphs p, table cells, list items, and the emphasis element hi, contain the textual content of a DSpace page, optionally modified with links, figures, and emphasis. If the division within which the character class is contained is tagged as interactive (via the interactive attribute), those elements can also contain interactive form fields. Divisions tagged as interactive must also provide method and action attributes for their fields to use.

Merging of DRI Documents
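The sketch below illustrates the hierarchy just described: a body holding a static division and an interactive division, with a form field placed inside a character container. The identifiers, the action URL, and the text content are made up for illustration.

```xml
<body>
  <!-- A static division: character containers hold text only -->
  <div n="intro" id="XMLExample.div.intro">
    <head>Welcome</head>
    <p>This paragraph contains <hi rend="bold">emphasized</hi> text.</p>
  </div>

  <!-- An interactive division: action and method must be supplied so that
       the fields inside it know where and how to submit -->
  <div n="search-form" id="XMLExample.div.search-form"
       interactive="yes" action="/search" method="get">
    <p>
      Search for:
      <field id="XMLExample.field.query" n="query" type="text">
        <params size="30"/>
      </field>
    </p>
  </div>
</body>
```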
(and merged into one Document) by previously executed Aspects. For this reason rules exist that describe which elements can be merged together and what happens to their data and child elements in the process.

When merging two DRI Documents, one is considered to be the main document, and the other a feeder document that is added in. The three top level containers (meta, body and options) of both documents are then individually analyzed and merged. In the case of the options and meta elements, the children tags are taken individually as well and treated differently from their siblings.

The body elements are the easiest to merge: their respective div children are preserved along with their ordering and are grouped together under one element. Thus, the new body tag will contain all the divs of the main document followed by all the divs of the feeder. However, if two divs have the same n and rend attributes (and in case of an interactive div the same action and method attributes as well), those divs will be merged into one. The resulting div will bear the id, n, and rend attributes of the main document's div and contain all the divs of the main document followed by all the divs of the feeder. This process continues recursively until all the divs have been merged. It should be noted that two divisions with separate pagination rules cannot be merged together.

Merging the options elements is somewhat different. First, list elements under options of both documents are compared with each other. Those unique to either document are simply added under the new options element, just like divs under body. In case of duplicates, that is list elements that belong to both documents and have the same n attribute, the two lists will be merged into one. The new list element will consist of the main document's head element, followed by the label-item pairs from the main document, and then finally the label-item pairs of the feeder, provided they are different from those of the main.

Finally, the meta elements are merged much like the elements under body. The three children of meta (userMeta, pageMeta, and objectMeta) are individually merged, adding the contents of the feeder after the contents of the main.

Version Changes
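The merge rules for body and options can be traced through a small worked example. The three fragments below are a sketch under stated assumptions: all identifiers and text are invented, meta is left out, and the list items carry no labels on the assumption that unlabeled items are compared the same way as label-item pairs.

```xml
<!-- Main document -->
<document version="1.1">
  <body>
    <div n="front-page" rend="primary" id="Main.div.front-page">
      <div n="news" id="Main.div.news"> ... </div>
    </div>
  </body>
  <options>
    <list n="browse" id="Main.list.browse">
      <head>Browse</head>
      <item>Titles</item>
      <item>Authors</item>
    </list>
  </options>
</document>

<!-- Feeder document -->
<document version="1.1">
  <body>
    <div n="front-page" rend="primary" id="Feeder.div.front-page">
      <div n="search" id="Feeder.div.search"> ... </div>
    </div>
    <div n="notes" rend="secondary" id="Feeder.div.notes"> ... </div>
  </body>
  <options>
    <list n="browse" id="Feeder.list.browse">
      <head>Browse This Collection</head>
      <item>Authors</item>
      <item>Dates</item>
    </list>
  </options>
</document>

<!-- Merged result -->
<document version="1.1">
  <body>
    <!-- Same n and rend: merged, keeping the main document's id -->
    <div n="front-page" rend="primary" id="Main.div.front-page">
      <div n="news" id="Main.div.news"> ... </div>
      <div n="search" id="Feeder.div.search"> ... </div>
    </div>
    <!-- Unique to the feeder: appended after the main document's divs -->
    <div n="notes" rend="secondary" id="Feeder.div.notes"> ... </div>
  </body>
  <options>
    <!-- Same n: merged, keeping the main head and skipping the duplicate item -->
    <list n="browse" id="Main.list.browse">
      <head>Browse</head>
      <item>Titles</item>
      <item>Authors</item>
      <item>Dates</item>
    </list>
  </options>
</document>
```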
Element Reference

Element: Attributes (marked where required)

(continued from previous entry): n, rend, role, rows
div: action (required for interactive), behavior, behaviorSensitiveFields, currentPage, firstItemIndex, id (required), interactive, itemsTotal, lastItemIndex, method (required for interactive), n (required), nextPage, pagesTotal, pageURLMask, pagination, previousPage, rend
DOCUMENT: version (required)
field: disabled, id (required), n (required), rend, required, type (required)
figure: rend, source, target
head: id, n, rend
help: (none)
hi: rend (required)
instance: (none)
item: id, n, rend
label: id, n, rend
list: id (required), n (required), rend, type
META: (none)
metadata: element (required), language, qualifier
OPTIONS: (none)
p: id, n, rend
pageMeta: (none)
params: cols, maxlength, multiple, operations, rows, size
reference: repositoryID (required), type, url (required)
referenceSet: id (required), n (required), orderBy, rend, type (required)
repository: repositoryID (required), url (required)
repositoryMeta: (none)
row: id, n, rend, role (required)
table: cols (required), id (required), n (required), rend, rows (required)
trail: rend, target
userMeta: authenticated (required)
value: optionSelected, optionValue, type (required)
xref: target (required)
13.7.1. BODY
Top-Level Container

The body element is the main container for all content displayed to the user. It contains any number of div elements that group content into interactive and display blocks.

Parent: document
Children
13.7.2. cell
Rich Text Container
Structural Element

The cell element contained in a row of a table carries content for that table. It is a character container, just like p, item, and hi, and its primary purpose is to display textual data, possibly enhanced with hyperlinks, emphasized blocks of text, images and form fields. Every cell can be annotated with a role (the most common being "header" and "data") and can stretch across any number of rows and columns. Since cells cannot exist outside their container, row, their id attribute is optional.

Parent: row
Children: hi (any), xref (any), figure (any), field (any)
Attributes:
  cols: (optional) The number of columns the cell spans.
  id: (optional) A unique identifier of the element.
  n: (optional) A local identifier used to differentiate the element from its siblings.
  rend: (optional) A rendering hint used to override the default display of the element.
  role: (optional) An optional attribute to override the containing row's role settings.
  rows: (optional) The number of rows the cell spans.
<table n="table-example" id="XMLExample.table.table-example" rows="2" cols="3">
  <row role="head">
    <cell cols="2">Data Label One and Two</cell>
    <cell>Data Label Three</cell>
    ...
  </row>
  <row>
    <cell> Value One </cell>
    <cell> Value Two </cell>
    <cell> Value Three </cell>
    ...
  </row>
  ...
</table>
13.7.3. div
Structural Element

The div element represents a major section of content and can contain a wide variety of structural elements to present that content to the user. It can contain paragraphs, tables, and lists, as well as references to artifact information stored in artifactMeta, repositoryMeta, collections, and communities. The div element is also recursive, allowing it to be further divided into other divs. Divs can be of two types: interactive and static. The two types are set by the use of the interactive attribute and differ in their ability to contain interactive content. Children elements of divs tagged as interactive can contain form fields, with the action and method attributes of the div serving to resolve those fields.

Parent: body, div
Children: head (zero or one), pagination (zero or one), table (any), p (any), referenceSet (any), list (any), div (any)
Attributes:
  action: (required for interactive) The form action attribute determines where the form information should be sent for processing.
  behavior: (optional for interactive) The acceptable behavior options that may be used on this form. The only possible value defined at this time is "ajax", which means that the form may be submitted multiple times for each individual field in this form. Note that if the form is submitted multiple times it is best for the behaviorSensitiveFields to be updated as well.
  behaviorSensitiveFields: (optional for interactive) A space separated list of field names that are sensitive to behavior. These fields must be updated each time a form is submitted without a complete refresh of the page (i.e. ajax).
  currentPage: (optional) For paginated divs, the currentPage attribute indicates the index of the page currently displayed for this div.
  firstItemIndex: (optional) For paginated divs, the firstItemIndex attribute indicates the index of the first item included in this div.
  id: (required) A unique identifier of the element.
  interactive: (optional) Accepted values are "yes", "no". This attribute determines whether the div is interactive or static. Interactive divs must provide action and method and can contain field elements.
  itemsTotal: (optional) For paginated divs, the itemsTotal attribute indicates how many items exist across all paginated divs.
  lastItemIndex: (optional) For paginated divs, the lastItemIndex attribute indicates the index of the last item included in this div.
  method: (required for interactive) Accepted values are "get", "post", and "multipart". Determines the method used to pass gathered field values to the handler specified by the action attribute. The multipart method should be used for uploading files.
  n: (required) A local identifier used to differentiate the element from its siblings.
  nextPage: (optional) For paginated divs, the nextPage attribute points to the URL of the next page of the div, if it exists.
  pagesTotal: (optional) For paginated divs, the pagesTotal attribute indicates how many pages the paginated divs span.
  pageURLMask: (optional) For paginated divs, the pageURLMask attribute contains the mask of a URL to a particular page within the paginated set. The destination page's number should replace the {pageNum} string in the URL mask to generate a full URL to that page.
  pagination: (optional) Accepted values are "simple", "masked". This attribute determines whether the div is spread over several pages. Simple paginated divs must provide previousPage, nextPage, itemsTotal, firstItemIndex, lastItemIndex attributes. Masked paginated divs must provide currentPage, pagesTotal, pageURLMask, itemsTotal, firstItemIndex, lastItemIndex attributes.
  previousPage: (optional) For paginated divs, the previousPage attribute points to the URL of the previous page of the div, if it exists.
  rend: (optional) A rendering hint used to override the default display of the element. In the case of the div tag, it is also encouraged to label it as either "primary" or "secondary". Divs marked as primary contain content, while secondary divs contain auxiliary information or supporting fields.
<body>
  <div n="division-example" id="XMLExample.div.division-example">
    <head> Example Division </head>
    <p> This example shows the use of divisions. </p>
    <table ...> ... </table>
    <referenceSet ...> ... </referenceSet>
    <list ...> ... </list>
    <div n="sub-division-example" id="XMLExample.div.sub-division-example">
      <p> Divisions may be nested </p>
      ...
    </div>
    ...
  </div>
  ...
</body>
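The pagination attributes described for div are easier to follow with concrete values. The following sketch shows a masked paginated division; the identifiers, counts, and URL mask are invented for illustration. With this mask, the URL for page 3 would be obtained by replacing {pageNum}, giving browse-title?page=3.

```xml
<!-- Page 2 of 5 in a result set of 100 items, 20 items per page
     (all values are illustrative) -->
<div n="browse-results" id="XMLExample.div.browse-results"
     pagination="masked"
     currentPage="2" pagesTotal="5"
     pageURLMask="browse-title?page={pageNum}"
     itemsTotal="100" firstItemIndex="21" lastItemIndex="40">
  <head>Browse results</head>
  <p> Items 21 through 40 are listed here. </p>
</div>
```

A simple paginated division would instead carry previousPage and nextPage URLs in place of currentPage, pagesTotal, and pageURLMask.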
13.7.4. DOCUMENT
Document Root

The document element is the root container of an XML UI document. All other elements are contained within it either directly or indirectly. The only attribute it carries is the version of the Schema to which it conforms.

Parent: none
Children: meta (one), body (one), options (one)
Attributes:
  version: (required) Version number of the schema this document adheres to. At the time of writing the only valid version numbers are "1.0" or "1.1". Future iterations of this schema may increment the version number.
<document version="1.1">
  <meta> ... </meta>
  <body> ... </body>
  <options> ... </options>
</document>
13.7.5. field
Text Container
Structural Element

The field element is a container for all information necessary to create a form field. The required type attribute determines the type of the field, while the children tags carry the information on how to build it. Fields can only occur in divisions tagged as "interactive".

Parent: cell, p, hi, item
Children: params (one), help (zero or one), error (any), option (any - only with the select type), value (any - only available on fields of type: select, checkbox, or radio), field (one or more - only with the composite type), valueSet (any)
Attributes:
  disabled: (optional) Accepted values are "yes", "no". Determines whether the field allows user input. Rendering of disabled fields may vary with implementation and display media.
  id: (required) A unique identifier for a field element.
  n: (required) A non-unique local identifier used to differentiate the element from its siblings within an interactive division. This is the name of the field used when data is submitted back to the server.
  rend: (optional) A rendering hint used to override the default display of the element.
  required: (optional) Accepted values are "yes", "no". Determines whether the field is a required component of the form and thus cannot be left blank.
  type: (required) A required attribute to specify the type of value. Accepted types are:
    button: A button input control that when activated by the user will submit the form, including all the fields, back to the server for processing.
    checkbox: A boolean input control which may be toggled by the user. A checkbox may have several fields which share the same name and each of those fields may be toggled independently. This is distinct from a radio button where only one field may be toggled.
    file: An input control that allows the user to select files to be submitted with the form. Note that a form which uses a file field must use the multipart method.
    hidden: An input control that is not rendered on the screen and hidden from the user.
    password: A single-line text input control where the input text is rendered in such a way as to hide the characters from the user.
    radio: A boolean input control which may be toggled by the user. Multiple radio button fields may share the same name. When this occurs only one field may be selected to be true. This is distinct from a checkbox where multiple fields may be toggled.
    select: A menu input control which allows the user to select from a list of available options.
    text: A single-line text input control.
    textarea: A multi-line text input control.
    composite: A composite input control combines several input controls into a single field. The only fields that may be combined together are: checkbox, password, select, text, and textarea. When fields are combined together they can possess multiple combined values.
<p>
  <hi> ... </hi>
  <xref> ... </xref>
  <figure> ... </figure>
  ...
  <field id="XMLExample.field.name" n="name" type="text" required="yes">
    <params size="16" maxlength="32"/>
    <help>Some help text with <i18n>localized content</i18n>.</help>
    <value type="raw">Default value goes here</value>
  </field>
</p>
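The composite type is the least obvious of the field types, so a sketch may help. It combines two text fields into one logical "author" field inside an interactive division. The identifiers, field names, and action URL are invented, and the use of an empty params element on the composite itself is an assumption based on the "params (one)" rule above.

```xml
<div n="author-form" id="XMLExample.div.author-form"
     interactive="yes" action="/submit" method="post">
  <p>
    <!-- The outer field groups its child fields into a single control -->
    <field id="XMLExample.field.author" n="author" type="composite">
      <params/>
      <help>Enter the author's last and first name.</help>
      <!-- Only checkbox, password, select, text, and textarea fields
           may be combined -->
      <field id="XMLExample.field.author_last" n="author_last" type="text">
        <params size="20"/>
      </field>
      <field id="XMLExample.field.author_first" n="author_first" type="text">
        <params size="20"/>
      </field>
    </field>
  </p>
</div>
```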
13.7.6. figure
Text Container
Structural Element

The figure element is used to embed a reference to an image or a graphic element. It can be mixed freely with text, and any text within the tag itself will be used as an alternative descriptor or a caption.

Parent: cell, p, hi, item
Children: none
Attributes:
  rend: (optional) A rendering hint used to override the default display of the element.
  source: (optional) The source for the image, using either a URL or a pre-defined XML entity.
  target: (optional) A target for an image used as a link, using either a URL or an id of an existing element as a destination.
<p>
  <hi> ... </hi>
  ...
  <xref> ... </xref>
  ...
  <field> ... </field>
  ...
  <figure source="www.example.com/fig1"> This is a static image. </figure>
  <figure source="www.example.com/fig1" target="www.example.net"> This image is also a link. </figure>
  ...
</p>
13.7.7. head
Text Container
Structural Element

The head element is primarily used as a label associated with its parent element. The rendering is determined by its parent tag, but can be overridden by the rend attribute. Since there can only be one head element associated with a particular tag, the n attribute is not needed, and the id attribute is optional.

Parent: div, table, list, referenceSet
Children: none
Attributes:
  id: (optional) A unique identifier of the element.
  n: (optional) A local identifier used to differentiate the element from its siblings.
  rend: (optional) A rendering hint used to override the default display of the element.
<div ...>
  <head> This is a simple header associated with its div element. </head>
  <div ...>
    <head rend="green"> This header will be green. </head>
    <p> ... </p>
  </div>
  <table ...>
    <head> ... </head>
    ...
  </table>
  <list ...>
    <head> A header with <i18n>localized content</i18n>. </head>
    ...
  </list>
  ...
</div>
13.7.8. help
Text Container
Structural Element

The optional help element is used to supply help instructions in plain text and is normally contained by the field element. The method used to render the help text in the target markup is up to the theme.

Parent: field
Children: none
Attributes: none
<p>
  <hi> ... </hi>
  ...
  <xref> ... </xref>
  ...
  <figure> ... </figure>
  ...
  <field id="XMLExample.field.name" n="name" type="text" required="yes">
    <params size="16" maxlength="32" />
    <help>Some help text with <i18n>localized content</i18n>.</help>
  </field>
  ...
</p>
13.7.9. hi
Rich Text Container
Structural Element

The hi element is used for emphasis of text and occurs inside character containers like p and list item. It can be mixed freely with text, and any text within the tag itself will be emphasized in a manner specified by the required rend attribute. Additionally, the hi element is the only text container component that is a rich text container itself, meaning it can contain other tags in addition to plain text. This allows it to contain other text containers, including other hi tags.

Parent: cell, p, item, hi
Children: hi (any), xref (any), figure (any), field (any)
Attributes:
  rend: (required) A required attribute used to specify the exact type of emphasis to apply to the contained text. Common values include but are not limited to "bold", "italic", "underline", and "emph".
<p> This text is normal, while <hi rend="bold">this text is bold and this text is <hi rend="italic">bold and italic.</hi></hi> </p>
13.7.10. instance
Structural Element

The instance element contains the value associated with a form field's multiple instances. Fields encoded as an instance should also include the values of each instance as a hidden field. The hidden field should be appended with the index number for the instance. Thus if the field is "firstName", each instance would be named "firstName_1", "firstName_2", "firstName_3", etc.

Parent: field
Children: value
Attributes
13.7.11. item
Rich Text Container
Structural Element

The item element is a rich text container used to display textual data in a list. As a rich text container it can contain hyperlinks, emphasized blocks of text, images and form fields in addition to plain text. The item element can be associated with a label that directly precedes it. The Schema requires that if one item in a list has an associated label, then all other items must have one as well. This mitigates the problem of loose connections between elements that is commonly encountered in XHTML, since every item in a particular list has the same structure.

Parent: list
Children: hi (any), xref (any), figure (any), field (any), list (any)
Attributes:
  id: (optional) A unique identifier of the element.
  n: (optional) A non-unique local identifier used to differentiate the element from its siblings.
  rend: (optional) A rendering hint used to override the default display of the element.
<list n="list-example" id="XMLExample.list.list-example">
    <head> Example List </head>
    <item> This is the first item </item>
    <item> This is the second item with <hi ...>highlighted text</hi>, <xref ...> a link</xref> and an <figure ...>image</figure>.</item>
    ...
    <list n="list-example2" id="XMLExample.list.list-example2">
        <head> Example List </head>
        <label>ITEM ONE:</label>
        <item> This is the first item </item>
        <label>ITEM TWO:</label>
        <item> This is the second item with <hi ...>highlighted text</hi>, <xref ...> a link</xref> and an <figure ...>image</figure>.</item>
        <label>ITEM THREE:</label>
        <item> This is the third item with a <field ...> ... </field> </item>
        ...
    </list>
    <item> This is the third item in the list </item>
    ...
</list>
13.7.12. label
Text Container
Structural Element
The label element is associated with an item and annotates that item with a number, a textual description of some sort, or a simple bullet.
Parent: item
Children: none
Attributes:
    id: (optional) A unique identifier of the element.
    n: (optional) A local identifier used to differentiate the element from its siblings.
    rend: (optional) A hint on how the label should be rendered, independent of its type.
<list n="list-example" id="XMLExample.list.list-example">
    <head>Example List</head>
    <label>1</label>
    <item> This is the first item </item>
    <label>2</label>
    <item> This is the second item with <hi ...>highlighted text</hi>, <xref ...> a link</xref> and an <figure ...>image</figure>.</item>
    ...
    <list n="list-example2" id="XMLExample.list.list-example2">
        <head>Example Sublist</head>
        <label>ITEM ONE:</label>
        <item> This is the first item </item>
        <label>ITEM TWO:</label>
        <item> This is the second item with <hi ...>highlighted text</hi>, <xref ...> a link</xref> and an <figure ...>image</figure>.</item>
        <label>ITEM THREE:</label>
        <item> This is the third item with a <field ...> ... </field> </item>
        ...
    </list>
    <item> This is the third item in the list </item>
    ...
</list>
13.7.13. list
Structural Element
The list element is used to display sets of sequential data. It contains an optional head element, as well as any number of item and list elements. Items contain textual information, while sublists contain other item or list elements. An item can also be associated with a label element that annotates an item with a number, a textual description of some sort, or a simple bullet. The list type (ordered, bulleted, gloss, etc.) is then determined either by the content of labels on items or by an explicit value of the type attribute. Note that if labels are used in conjunction with any items in a list, all of the items in that list must have a label. It is also recommended to avoid mixing label styles unless an explicit type is specified.
Parent: div, list
Children: head (zero or one), label (any), item (any), list (any)
Attributes:
    id: (required) A unique identifier of the element.
    n: (required) A local identifier used to differentiate the element from its siblings.
    rend: (optional) A hint on how the list should be rendered, independent of its type. Common values include, but are not limited to:
        alphabet: The list should be rendered as an alphabetical index.
        columns: The list should be rendered in equal length columns as determined by the theme.
        columns2: The list should be rendered in two equal columns.
        columns3: The list should be rendered in three equal columns.
        horizontal: The list should be rendered horizontally.
        numeric: The list should be rendered as a numeric index.
        vertical: The list should be rendered vertically.
    type: (optional) Explicitly specifies the type of list. In the absence of this attribute, the type of a list will be inferred from the presence and content of labels on its items. Accepted values are:
        form: Used for form lists that consist of a series of fields.
        bulleted: Used for lists with bullet-marked items.
        gloss: Used for lists consisting of a set of technical terms, each marked with a label element and accompanied by the definition marked as an item element.
        ordered: Used for lists with numbered or lettered items.
        progress: Used for lists consisting of a set of steps currently being performed to accomplish a task. For this type to apply, each item in the list should represent a step and be accompanied by a label that contains the displayable name for the step. The item contains an xref that references the step. Also, the rend attribute on the item element should be "available" (the user may jump to the step using the provided xref), "unavailable" (the user has not met the requirements to jump to the step), or "current" (the user is currently on the step).
        simple: Used for lists with items not marked with numbers or bullets.
<div ...>
    ...
    <list n="list-example" id="XMLExample.list.list-example">
        <head>Example List</head>
        <item> ... </item>
        <item> ... </item>
        ...
        <list n="list-example2" id="XMLExample.list.list-example2">
            <head>Example Sublist</head>
            <label> ... </label>
            <item> ... </item>
            <label> ... </label>
            <item> ... </item>
            <label> ... </label>
            <item> ... </item>
            ...
        </list>
        <label> ... </label>
        <item> ... </item>
        ...
    </list>
</div>
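The rule that either every item in a list carries a label or none does can be checked mechanically. The following Python sketch uses only the standard library; the function name `labels_are_consistent` and the sample documents are illustrative assumptions, and the check treats a label as belonging to the item that directly follows it.

```python
import xml.etree.ElementTree as ET


def labels_are_consistent(list_element: ET.Element) -> bool:
    """Return True if all items of this list have a label, or none do.

    Only direct children are inspected; nested lists must be checked
    separately (for example by iterating over root.iter("list")).
    """
    items = 0
    labelled = 0
    previous_tag = None
    for child in list_element:
        if child.tag == "item":
            items += 1
            if previous_tag == "label":
                labelled += 1
        previous_tag = child.tag
    return labelled == 0 or labelled == items


LABELLED = ET.fromstring(
    '<list n="a" id="XMLExample.list.a">'
    "<label>1</label><item>One</item>"
    "<label>2</label><item>Two</item>"
    "</list>"
)
MIXED = ET.fromstring(
    '<list n="b" id="XMLExample.list.b">'
    "<label>1</label><item>One</item>"
    "<item>Two</item>"
    "</list>"
)

if __name__ == "__main__":
    print(labels_are_consistent(LABELLED))
    print(labels_are_consistent(MIXED))
```

Because each list is judged on its own, a labelled sublist inside an unlabelled parent list (as in the example above) is still considered consistent.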
13.7.14. META
Top-Level Container
The meta element is a top-level element and exists directly inside the document element. It serves as a container element for all metadata associated with a document, broken up into categories according to the type of metadata they carry.
Parent: document
Children: userMeta (one), pageMeta (one), repositoryMeta (one)
Attributes: None
<document version="1.0">
    <meta>
        <userMeta> ... </userMeta>
        <pageMeta> ... </pageMeta>
        <repositoryMeta> ... </repositoryMeta>
    </meta>
    <body> ... </body>
    <options> ... </options>
</document>
13.7.15. metadata
Text Container
Structural Element
The metadata element carries generic metadata information in the form of an attribute-value pair. The type of information it contains is determined by two attributes: element, which specifies the general type of metadata stored, and an optional qualifier attribute that narrows the type down. The standard representation for this pairing is element.qualifier. The actual metadata is contained in the text of the tag itself. Additionally, a language attribute can be used to specify the language used for the metadata entry.
Parent: userMeta, pageMeta
Children: none
Attributes:
    element: (required) The name of a metadata field.
    language: (optional) The language used in the metadata tag.
    qualifier: (optional) A postfix to the field name used to further differentiate the names.
<meta>
    <userMeta>
        <metadata element="identifier" qualifier="firstName"> Bob </metadata>
        <metadata element="identifier" qualifier="lastName"> Jones </metadata>
        <metadata ...> ... </metadata>
        ...
    </userMeta>
    <pageMeta>
        <metadata element="rights" qualifier="accessRights">user</metadata>
        <metadata ...> ... </metadata>
        ...
    </pageMeta>
</meta>
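The element.qualifier representation can be illustrated by collecting metadata elements into a dictionary keyed by that pairing. This is a minimal Python sketch using the standard library; the helper names `metadata_key` and `collect_metadata` and the sample fragment are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def metadata_key(metadata: ET.Element) -> str:
    """Build the element.qualifier name; the qualifier part is optional."""
    element = metadata.get("element")
    qualifier = metadata.get("qualifier")
    return f"{element}.{qualifier}" if qualifier else element


def collect_metadata(container: ET.Element) -> dict:
    """Map each element.qualifier key to the list of values found for it.

    `container` is expected to be a userMeta or pageMeta element.
    """
    values = {}
    for metadata in container.findall("metadata"):
        text = (metadata.text or "").strip()
        values.setdefault(metadata_key(metadata), []).append(text)
    return values


SAMPLE = ET.fromstring(
    "<userMeta>"
    '<metadata element="identifier" qualifier="firstName"> Bob </metadata>'
    '<metadata element="identifier" qualifier="lastName"> Jones </metadata>'
    '<metadata element="identifier">bob</metadata>'
    "</userMeta>"
)

if __name__ == "__main__":
    print(collect_metadata(SAMPLE))
```

Values are kept in lists because nothing in the description above prevents the same element.qualifier pair from appearing more than once.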
13.7.16. OPTIONS
Top-Level Container
The options element is the main container for all actions and navigation options available to the user. It consists of any number of list elements whose items contain navigation information and actions. While any list of navigational options may be contained in this element, it is suggested that at least the following 5 lists be included.
Parent: document
Children: list (any)
Attributes: None
<document version="1.0">
    <meta> ... </meta>
    <body> ... </body>
    <options>
        <list n="navigation-example1" id="XMLExample.list.navigation-example1">
            <head>Example Navigation List 1</head>
            <item><xref target="/link/to/option">Option One</xref></item>
            <item><xref target="/link/to/option">Option two</xref></item>
            ...
        </list>
        <list n="navigation-example2" id="XMLExample.list.navigation-example2">
            <head>Example Navigation List 2</head>
            <item><xref target="/link/to/option">Option One</xref></item>
            <item><xref target="/link/to/option">Option two</xref></item>
            ...
        </list>
        ...
    </options>
</document>
13.7.17. p
Rich Text Container
Structural Element
The p element is a rich text container used by divs to display textual data in a paragraph format. As a rich text container it can contain hyperlinks, emphasized blocks of text, images and form fields in addition to plain text.
Parent: div
Children: hi (any), xref (any), figure (any), field (any)
Attributes:
    id: (optional) A unique identifier of the element.
    n: (optional) A local identifier used to differentiate the element from its siblings.
    rend: (optional) A rendering hint used to override the default display of the element.
<div n="division-example" id="XMLExample.div.division-example">
    <p> This is a regular paragraph. </p>
    <p> This text is normal, while <hi rend="bold">this text is bold and this text is <hi rend="italic">bold and italic.</hi></hi> </p>
    <p> This paragraph contains a <xref target="/link/target">link</xref>, a static <figure source="/image.jpg">image</figure>, and a <figure target="/link/target" source="/image.jpg">image link.</figure> </p>
</div>
13.7.18. pageMeta
Metadata Element
The pageMeta element contains metadata associated with the document itself. It contains generic metadata elements to carry the content, and any number of trail elements to provide information on the user's current location in the system. Required and suggested values for metadata elements contained in pageMeta include, but are not limited to:
    browser (suggested): The user's browsing agent as reported to the server in the HTTP request.
    browser.type (suggested): The general browser family as derived from the browser metadata field. Possible values may include "MSIE" (for Microsoft Internet Explorer), "Opera" (for the Opera browser), "Apple" (for Apple WebKit based browsers), "Gecko" (for Netscape, Mozilla, and Firefox based browsers), or "Lynx" (for text based browsers).
    browser.version (suggested): The browser version as reported in the HTTP request.
    contextPath (required): The base URL of the Digital Repository system.
    redirect.time (suggested): The time that must elapse before the page is redirected to an address specified by the redirect.url metadata element.
    redirect.url (suggested): The URL destination of a redirect page.
    title (required): The title of the document/page that the user is currently browsing.
See the metadata and trail tag entries for more information on their structure.
Parent: meta
Children: metadata (any), trail (any)
Attributes: None
<meta>
    <userMeta> ... </userMeta>
    <pageMeta>
        <metadata element="title">Example DRI page</metadata>
        <metadata element="contextPath">/xmlui/</metadata>
        <metadata ...> ... </metadata>
        ...
        <trail source="123456789/6"> A bread crumb item </trail>
        <trail ...> ... </trail>
        ...
    </pageMeta>
</meta>
13.7.19. params
Structural Component
The params element identifies extra parameters used to build a form field. There are several attributes that may be available for this element depending on the field type.
Parent: field
Children: none
Attributes:
    cols: (optional) The default number of columns that the text area should span. This applies only to textarea field types.
    maxlength: (optional) The maximum length that the theme should accept for form input. This applies to text and password field types.
    multiple: (optional) yes/no value. Determines whether the field can accept multiple values. This applies only to select lists.
    operations: (optional) The possible operations that may be performed on this field. The possible values are "add" and/or "delete". If both operations are possible then they should be provided as a space-separated list.
        The "add" operation indicates that there may be multiple values for this field and the user may add to the set one at a time. The front-end should render a button that enables the user to add more fields to the set. The button must be named the field name appended with the string "_add"; thus if the field's name is "firstName" the button must be called "firstName_add".
        The "delete" operation indicates that there may be multiple values for this field, each of which may be removed from the set. The front-end should render a checkbox by each field value, except for the first. The checkbox must be named the field name appended with the string "_selected"; thus if the field's name is "firstName" the checkbox must be called "firstName_selected", and the value of each successive checkbox should be the field name. The front-end must also render a delete button. The delete button name must be the field's name appended with the string "_delete".
    rows: (optional) The default number of rows that the text area should span. This applies only to textarea field types.
    size: (optional) The default size for a field. This applies to text, password, and select field types.
<p>
    <field id="XMLExample.field.name" n="name" type="text" required="yes">
        <params size="16" maxlength="32"/>
        <help>Some help text with <i18n>localized content</i18n>.</help>
        <default>Default value goes here</default>
    </field>
</p>
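The control names required by the operations attribute follow directly from the field name. The Python sketch below derives them from the attribute value; the function name `operation_control_names` and the dictionary keys it returns are hypothetical and chosen only for this illustration.

```python
def operation_control_names(field_name: str, operations: str) -> dict:
    """Return the control names a front-end must render for a field.

    `operations` is the space-separated value of the operations
    attribute ("add", "delete", or "add delete").
    """
    requested = operations.split()
    unknown = [op for op in requested if op not in ("add", "delete")]
    if unknown:
        raise ValueError(f"unsupported operations: {unknown}")

    names = {}
    if "add" in requested:
        # Button that lets the user add another value to the set.
        names["add_button"] = f"{field_name}_add"
    if "delete" in requested:
        # Checkbox rendered next to each removable value, plus the
        # button that removes the selected values.
        names["select_checkbox"] = f"{field_name}_selected"
        names["delete_button"] = f"{field_name}_delete"
    return names


if __name__ == "__main__":
    print(operation_control_names("firstName", "add delete"))
```

Rejecting unknown operation names is a design choice of this sketch; the text above only defines "add" and "delete".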
13.7.20. reference
Metadata Reference Element
The reference element is used to access information stored in an external metadata file. The url attribute is used to locate the external metadata file. The type attribute provides a short, limited description of the referenced object's type. reference elements can both be contained by referenceSet elements and contain referenceSets themselves, making the structure recursive.
Parent: referenceSet
Children: referenceSet (zero or more)
Attributes:
    url: (required) A URL to the external metadata file.
    repositoryIdentifier: (required) A reference to the repositoryIdentifier of the repository.
    type: (optional) Description of the referenced object's type.
<referenceSet n="browse-list" id="XMLTest.referenceSet.browse-list">
    <reference url="/metadata/handle/123/4/mets.xml" repositoryID="123" type="DSpace Item"/>
    <reference url="/metadata/handle/123/5/mets.xml" repositoryID="123" />
    ...
</referenceSet>
13.7.21. referenceSet
Metadata Reference Element
The referenceSet element is a container of artifact or repository references.
Parent: div, reference
Children: head (zero or one), reference (any)
Attributes:
    id: (required) A unique identifier of the element.
    n: (required) A local identifier used to differentiate the element from its siblings.
    orderBy: (optional) A reference to the metadata field that determines the ordering of artifacts or repository objects within the set. When the Dublin Core metadata scheme is used, this attribute should be the element.qualifier value that the set is sorted by. As an example, for a browse by title list, the value should be orderBy=title, while for a browse by date list it should be orderBy=date.created.
    rend: (optional) A rendering hint used to override the default display of the element.
    type: (required) Determines the level of detail for the given metadata. Accepted values are:
        summaryList: Indicates that the metadata from referenced artifacts or repository objects should be used to build a list representation that is suitable for quick scanning.
        summaryView: Indicates that the metadata from referenced artifacts or repository objects should be used to build a partial view of the referenced object or objects.
        detailList: Indicates that the metadata from referenced artifacts or repository objects should be used to build a list representation that provides a complete, or near complete, view of the referenced objects. Whether such a view is possible or different from summaryView depends largely on the repository at hand and the implementing theme.
        detailView: Indicates that the metadata from referenced artifacts or repository objects should be used to display complete information about the referenced object. Rendering of several references included under this type is up to the theme.
<div ...>
    <head> Example Division </head>
    <p> ... </p>
    <table> ... </table>
    <list> ... </list>
    <referenceSet n="browse-list" id="XMLTest.referenceSet.browse-list" type="summaryView" informationModel="DSpace">
        <head>A header for the referenceSet</head>
        <reference url="/metadata/handle/123/34/mets.xml"/>
        <reference url="/metadata/handle/123/34/mets.xml"/>
    </referenceSet>
    ...
</div>
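Because references may contain referenceSets, which in turn contain references, a consumer has to walk the structure recursively. The following Python sketch shows one way to do so with the standard library; the function name `reference_urls` and the nested sample document are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def reference_urls(reference_set: ET.Element, depth: int = 0):
    """Yield (depth, url) for each reference, descending into nested sets."""
    for reference in reference_set.findall("reference"):
        yield depth, reference.get("url")
        # A reference may itself contain further referenceSet elements.
        for nested in reference.findall("referenceSet"):
            yield from reference_urls(nested, depth + 1)


SAMPLE = ET.fromstring(
    '<referenceSet n="browse-list" id="XMLTest.referenceSet.browse-list" type="summaryList">'
    '<reference url="/metadata/handle/123/4/mets.xml">'
    '<referenceSet n="nested" id="XMLTest.referenceSet.nested" type="summaryList">'
    '<reference url="/metadata/handle/123/5/mets.xml"/>'
    "</referenceSet>"
    "</reference>"
    '<reference url="/metadata/handle/123/34/mets.xml"/>'
    "</referenceSet>"
)

if __name__ == "__main__":
    for depth, url in reference_urls(SAMPLE):
        print("  " * depth + url)
```

The depth value is not part of the schema; it is tracked here only so that a caller can tell top-level references from nested ones.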
13.7.22. repository
Metadata Element
The repository element is used to describe the repository. Its principal component is a set of structural metadata that carries information on how the repository's objects under objectMeta are related to each other. The principal method of encoding these relationships at the time of this writing is a METS document, although other formats, like RDF, may be employed in the future.
Parent: repositoryMeta
Children: none
Attributes:
    repositoryID: (required) A unique identifier assigned to a repository. It is referenced by the object element to signify the repository that assigned its identifier.
    url: (required) A URL to the external METS metadata file for the repository.
<repositoryMeta> <repository repositoryID="123456789" url="/metadata/handle/1234/4/mets.xml" /> </repositoryMeta>
13.7.23. repositoryMeta
Metadata Element
The repositoryMeta element contains metadata references about the repositories used or referenced in the document. It can contain any number of repository elements. See the repository tag entry for more information on the structure of repository elements.
Parent: meta
Children: repository (any)
Attributes: None
<meta>
    <userMeta> ... </userMeta>
    <pageMeta> ... </pageMeta>
    <repositoryMeta>
        <repository repositoryID="..." url="..." />
    </repositoryMeta>
</meta>
13.7.24. row
Structural Element
The row element is contained inside a table and serves as a container of cell elements. A required role attribute determines how the row and its cells are rendered.
Parent: table
Children: cell (any)
Attributes:
    id: (optional) A unique identifier of the element.
    n: (optional) A local identifier used to differentiate the element from its siblings.
    rend: (optional) A rendering hint used to override the default display of the element.
    role: (required) Indicates what kind of information the row carries. Possible values include "header" and "data".
<table n="table-example" id="XMLExample.table.table-example" rows="2" cols="3">
    <row role="header">
        <cell cols="2">Data Label One and Two</cell>
        <cell>Data Label Three</cell>
        ...
    </row>
    <row role="data">
        <cell> Value One </cell>
        <cell> Value Two </cell>
        <cell> Value Three </cell>
        ...
    </row>
    ...
</table>
13.7.25. table
Structural Element
The table element is a container for information presented in tabular format. It consists of a set of row elements and an optional header.
Parent: div
Children: head (zero or one), row (any)
Attributes:
    cols: (required) The number of columns in the table.
    id: (required) A unique identifier of the element.
    n: (required) A local identifier used to differentiate the element from its siblings.
    rend: (optional) A rendering hint used to override the default display of the element.
    rows: (required) The number of rows in the table.
<div n="division-example" id="XMLExample.div.division-example">
    <table n="table1" id="XMLExample.table.table1" rows="2" cols="3">
        <row role="header">
            <cell cols="2">Data Label One and Two</cell>
            <cell>Data Label Three</cell>
            ...
        </row>
        <row role="data">
            <cell> Value One </cell>
            <cell> Value Two </cell>
            <cell> Value Three </cell>
            ...
        </row>
        ...
    </table>
    ...
</div>
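Since rows and cols are required attributes, a generator can sanity-check a table before emitting it. The Python sketch below compares the declared size with the actual rows and cells. It assumes, based only on the example above, that a cell's cols attribute is a column span; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def row_widths(table: ET.Element) -> list:
    """Columns occupied by each row, counting a cell's cols as a span."""
    return [
        sum(int(cell.get("cols", "1")) for cell in row.findall("cell"))
        for row in table.findall("row")
    ]


def table_matches_declared_size(table: ET.Element) -> bool:
    """Compare the rows/cols attributes with the table's actual content."""
    declared_rows = int(table.get("rows"))
    declared_cols = int(table.get("cols"))
    widths = row_widths(table)
    return len(widths) == declared_rows and all(w == declared_cols for w in widths)


SAMPLE = ET.fromstring(
    '<table n="table1" id="XMLExample.table.table1" rows="2" cols="3">'
    '<row role="header">'
    '<cell cols="2">Data Label One and Two</cell><cell>Data Label Three</cell>'
    "</row>"
    '<row role="data">'
    "<cell>Value One</cell><cell>Value Two</cell><cell>Value Three</cell>"
    "</row>"
    "</table>"
)

if __name__ == "__main__":
    print(row_widths(SAMPLE), table_matches_declared_size(SAMPLE))
```

The header row here occupies three columns because its first cell spans two, which matches the declared cols="3".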
13.7.26. trail
Text Container
Metadata Element
The trail element carries information about the user's current location in the system relative to the repository's root page. Each instance of the element serves as one link in the path from the root to the current page.
Parent: pageMeta
Children: none
Attributes:
    rend: (optional) A rendering hint used to override the default display of the element.
    target: (optional) A target URL for a trail element serving as a hyperlink. The text inside the element will be used as the text of the link.
<pageMeta>
    <metadata element="title">Example DRI page</metadata>
    <metadata element="contextPath">/xmlui/</metadata>
    <metadata ...> ... </metadata>
    ...
    <trail target="/myDSpace"> A bread crumb item pointing to a page. </trail>
    <trail ...> ... </trail>
    ...
</pageMeta>
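A theme typically turns the trail elements into a breadcrumb path. The Python sketch below extracts the link text and optional target of each trail in document order; the function name `breadcrumbs` and the sample values (other than "/myDSpace") are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def breadcrumbs(page_meta: ET.Element) -> list:
    """Return (text, target) pairs for each trail, in document order.

    target is None when the trail carries no target attribute, in which
    case it would be rendered as plain text rather than a hyperlink.
    """
    return [
        ((trail.text or "").strip(), trail.get("target"))
        for trail in page_meta.findall("trail")
    ]


SAMPLE = ET.fromstring(
    "<pageMeta>"
    '<metadata element="title">Example DRI page</metadata>'
    '<trail target="/">Home</trail>'
    '<trail target="/myDSpace"> A bread crumb item pointing to a page. </trail>'
    "<trail>Current page</trail>"
    "</pageMeta>"
)

if __name__ == "__main__":
    for text, target in breadcrumbs(SAMPLE):
        print(text, "->", target)
```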
13.7.27. userMeta
Metadata Element
The userMeta element contains metadata associated with the user that requested the document. It contains generic metadata elements, which in turn carry the information. Required and suggested values for metadata elements contained in userMeta include, but are not limited to:
    identifier (suggested): A unique identifier associated with the user.
    identifier.email (suggested): The requesting user's email address.
    identifier.firstName (suggested): The requesting user's first name.
    identifier.lastName (suggested): The requesting user's last name.
    identifier.logoutURL (suggested): The URL that a user will be taken to when logging out.
    identifier.url (suggested): A URL reference to the user's page within the repository.
    language.RFC3066 (suggested): The requesting user's preferred language selection code as described by RFC 3066.
    rights.accessRights (required): Determines the scope of actions that a user can perform in the system. Accepted values are:
        none: The user is either not authenticated or does not have a valid account on the system.
        user: The user is authenticated and has a valid account on the system.
        admin: The user is authenticated and belongs to the system's administrative group.
See the metadata tag entry for more information on the structure of metadata elements.
Parent: meta
Children: metadata (any)
Attributes:
    authenticated: (required) Accepted values are "yes" and "no". Determines whether the user has been authenticated by the system.
<meta>
    <userMeta authenticated="yes">
        <metadata element="identifier" qualifier="email">bobJones@tamu.edu</metadata>
        <metadata element="identifier" qualifier="firstName">Bob</metadata>
        <metadata element="identifier" qualifier="lastName">Jones</metadata>
        <metadata element="rights" qualifier="accessRights">user</metadata>
        <metadata ...> ... </metadata>
        ...
    </userMeta>
    <pageMeta> ... </pageMeta>
</meta>
13.7.28. value
Rich Text Container
Structural Element
The value element contains the value associated with a form field and can serve a different purpose for various field types. The value element is comprised of two subelements: the raw element, which stores the unprocessed value directly from the user or other source, and the interpreted element, which stores the value in a format appropriate for display to the user, possibly including rich text markup.
Parent: field
Children: hi (any), xref (any), figure (any)
Attributes:
    optionSelected: (optional) For select, checkbox, and radio fields, determines whether the value is to be selected or not.
    optionValue: (optional) For select, checkbox, and radio fields, determines the value that should be returned when this value is selected.
    type: (required) Specifies the type of value. Accepted types are:
        raw: Stores the unprocessed value directly from the user or other source.
        interpreted: Stores the value in a format appropriate for display to the user, possibly including rich text markup.
        default: Stores a value supplied by the system, used when no other values are provided.
<p>
    <hi> ... </hi>
    <xref> ... </xref>
    <figure> ... </figure>
    <field id="XMLExample.field.name" n="name" type="text" required="yes">
        <params size="16" maxlength="32"/>
        <help>Some help text with <i18n>localized content</i18n>.</help>
        <value type="default">Author, John</value>
    </field>
</p>
13.7.29. xref
Text Container
Structural Element
The xref element is a reference to an external document. It can be mixed freely with text, and any text within the tag itself will be used as part of the link's visual body.
Parent: cell, p, item, hi
Children: none
Attributes:
    target: (required) A target for the reference, using either a URL or an id of an existing element as a destination for the xref.
<p> <xref target="/url/link/target">This text is shown as a link.</xref> </p>
14. Appendices
14.1. Appendix A
14.1.1. Default Dublin Core Metadata Registry
Element | Qualifier | Scope Note
contributor | | A person, organization, or service responsible for the content of the resource. Catch-all for unspecified contributors.
contributor | advisor | Use primarily for thesis advisor.
contributor | author |
contributor | editor |
contributor | illustrator |
contributor | other |
coverage | spatial | Spatial characteristics of content.
coverage | temporal | Temporal characteristics of content.
creator | | Do not use; only for harvested metadata.
date | | Use qualified form if possible.
date | accessioned | Date DSpace takes possession of item.
date | available | Date or date range item became available to the public.
date | copyright | Date of copyright.
date | created | Date of creation or manufacture of intellectual content if different from date.issued.
date | issued | Date of publication or distribution.
date | submitted | Recommend for theses/dissertations.
identifier | | Catch-all for unambiguous identifiers not defined by qualified form; use identifier.other for a known identifier common to a local collection instead of unqualified form.
identifier | citation | Human-readable, standard bibliographic citation of non-DSpace format of this item.
identifier | govdoc | A government document number.
identifier | isbn | International Standard Book Number.
identifier | issn | International Standard Serial Number.
identifier | sici | Serial Item and Contribution Identifier.
identifier | ismn | International Standard Music Number.
identifier | other | A known identifier type common to a local collection.
identifier | uri | Uniform Resource Identifier.
description | | Catch-all for any description not defined by qualifiers.
description | abstract | Abstract or summary.
description | provenance | The history of custody of the item since its creation, including any changes successive custodians made to it.
description | sponsorship | Information about sponsoring agencies, individuals, or contractual arrangements for the item.
description | statementofresponsibility | To preserve statement of responsibility from MARC records.
description | tableofcontents | A table of contents for a given item.
description | uri | Uniform Resource Identifier pointing to description of this item.
format | | Catch-all for any format information not defined by qualifiers.
format | extent | Size or duration.
format | medium | Physical medium.
format | mimetype | Registered MIME type identifiers.
language | | Catch-all for non-ISO forms of the language of the item, accommodating harvested values.
language | iso | Current ISO standard for language of intellectual content, including country codes (e.g. "en_US").
publisher | | Entity responsible for publication, distribution, or imprint.
relation | | Catch-all for references to other related items.
relation | isformatof | References additional physical form.
relation | ispartof | References physically or logically containing item.
relation | ispartofseries | Series name and number within that series, if available.
relation | haspart | References physically or logically contained item.
relation | isversionof | References earlier version.
relation | hasversion | References later version.
relation | isbasedon | References source.
relation | isreferencedby | Pointed to by referenced resource.
relation | requires | Referenced resource is required to support function, delivery, or coherence of item.
relation | replaces | References preceding item.
relation | isreplacedby | References succeeding item.
relation | uri | References Uniform Resource Identifier for related item.
rights | | Terms governing use and reproduction.
rights | uri | References terms governing use and reproduction.
source | | Do not use; only for harvested metadata.
source | uri | Do not use; only for harvested metadata.
subject | | Uncontrolled index term.
subject | classification | Catch-all for value from local classification system. Global classification systems will receive specific qualifier.
subject | ddc | Dewey Decimal Classification Number.
subject | lcc | Library of Congress Classification Number.
subject | lcsh | Library of Congress Subject Headings.
subject | mesh | MEdical Subject Headings.
subject | other | Local controlled vocabulary; global vocabularies will receive specific qualifier.
title | | Title statement/title proper.
title | alternative | Varying (or substitute) form of title proper appearing in item, e.g. abbreviation or translation.
type | | Nature or genre of content.
Mimetype | Short Description | Description | Support Level | Internal | Extensions
- | License | Item-specific license agreed upon to submission | Known | true | -
application/marc | MARC | Machine-Readable Cataloging records | Known | false | -
- | - | Mathematica Notebook | Known | - | ma
- | Microsoft Word | Microsoft Word | Known | - | doc
- | Adobe PDF | Adobe Portable Document Format | Known | - | pdf
application/postscript | Postscript | Postscript Files | Known | false | -
- | - | - | - | - | ppt
- | - | - | - | - | vsd
- | - | WordPerfect 5.1 document | Known | - | wpd
- | - | TeX dvi format | Known | - | -
- | - | Filemaker Pro | Known | - | -
- | - | LaTeX document | Known | - | -
- | - | Photoshop | Known | - | -
application/x-tex | TeX | Tex/LateX document | Known | false | tex
audio/basic | audio/basic | Basic Audio | Known | false | au, snd
audio/x-aiff | AIFF | Audio Interchange File Format | Known | false | aif, aifc, aiff
- | - | MPEG Audio | Known | - | -
- | - | RealAudio file | Known | - | -
- | - | Joint Photographic Experts Group/JPEG File Interchange Format (JFIF) | Known | - | -
image/png | image/png | Portable Network Graphics | Known | - | -
image/tiff | TIFF | Tag Image File Format | Known | - | -
- | - | Microsoft Windows bitmap | Known | - | -
- | - | Kodak Photo CD image | Known | - | -
- | - | Cascading Style Sheets | Known | - | -
- | - | Hypertext Markup Language | - | - | -
- | - | Plain Text | - | - | -
- | - | Rich Text Format | - | - | -
- | - | Extensible Markup Language | - | - | -
video/mpeg | MPEG | Moving Picture Experts Group | Known | false | -
video/quicktime | Video Quicktime | Video Quicktime | Known | - | -
Used by system: do not remove
15. History
15.1. Changes in DSpace 1.7.0
15.1.1. New Features
Key | Summary | Assignee | Reporter
DS-396 | Provide metatags used by Google Scholar for enhanced indexing | Sands Fish | Sarah Shreeves
DS-466 | Add ability to export/import entire Community/Collection/Item structure (for easier backups, migrations, etc.) | Tim Donohue | Tim Donohue
DS-525 | Move item - inherit default policies of destination collection | Stuart Lewis | Stuart Lewis
DS-603 | Having a most used item list similar to the recent submissions | Ben Bosman | Claudia Jürgen
- | New testing framework (GSoC 2010) | - | -
- | New Base Theme For DSpace 1.7.0 | - | -
- | Discovery release for XMLUI | - | -
- | PowerPoint Text Extraction for DSpace Media Filter | - | -
- | Modular Configuration (Curation) | - | -
- | Curation System (Core Elements) | - | -
- | Administrative UI for Curation (XMLUI) | Richard Rodgers | -
- | Tools for (load) testing | Graham Triggs | -
Changes in DSpace 1.7.0 Key Summary Assignee their custom "options" via command line Consider making the Stuart Lewis JSPUI styles.css.jsp a static file Upgrade to latest Google Analytics tracking code LC Authority Names Lookup Feature - names w/o dates Stuart Lewis Kim Shepherd Reporter
DS-467
Stuart Lewis
DS-550 DS-557
DS-561
On login screen, keyboard Andrea Bollini input focus should be set to the first field (E-mail Address) so you don't have to use the mouse (JSPUI) Upgrade DSpace Services Mark Diggory to next release Use modified Cocoon Mark Diggory Servlet Service Impl in place of existing to support proper Cocoon Block addition. Patch for SFX (OpenURL Jeffrey Trimble resolver) New INSTALL event Mark H. Wood when an Item is approved Error Handling in the XMLUI interface after section expired Tim Donohue
Oleksandr Sytnyk
DS-571 DS-577
DS-618
Recommended versions Mark H. Wood of prerequisites becoming outdated Export cleanup Bulgarian for DSpace 1.6.0 Make the timeout for the extended resolver dnslookup configurable Robin Taylor Claudia Jürgen Jeffrey Trimble
Mark H. Wood
DS-646
Remove /bin scripts (replaced by 'dspace' command) | Jeffrey Trimble
Need Help Testing LNI refactoring changes in AIP Backup/Restore Work | Unassigned
Stuart Lewis
DS-647
Tim Donohue
Key | Summary | Assignee | Reporter
DS-648 | Modern Browsers are not identified in XMLUI main sitemap.xmap | Tim Donohue | Tim Donohue
DS-650 | Bulgarian Localization for DSpace 1.6.2 | Claudia Jürgen | -
DS-653 | Brazilian Portuguese (pt_BR) translation for XML-UI 1.6.2 | Claudia Jürgen | -
DS-660 DS-662
Tidy up DCDate and DCDateTest. Robin Taylor DSpace import does not use a predictable ordering when running an import. Scott Phillips The registry loader command does not load schemas. Scott Phillips
DS-664
Scott Phillips
DS-670
Relax performance tests in org.dspace.content.CommunityCollectionIntegrationTest. Robin Taylor Robin Taylor Add inner class for metadata to Item, delay metadata loading until required, reduce impact of changes Graham Triggs Separate DatabaseManager from registering / creating database pool, allow datasource to be retrieved from JNDI Graham Triggs Reduce browse prune cost Graham Triggs Extend supported input formats for thumbnail generation in MediaFilter Graham Triggs ChoiceAuthority plugin for old-style controlled vocabularies Jeffrey Trimble Graham Triggs
DS-694
DS-695
Graham Triggs
DS-696 DS-697
DS-699
Michael B. Klein
DS-703
Making the OAI max amount of records configurable Ben Bosman Update pdfbox library to improve performance and out-of-box support for pdf extraction Andrea Bollini Alternative to dri2xhtml, with the goal to be more developer-friendly Ben Bosman
Ben Bosman
DS-704
Andrea Bollini
DS-706
Ben Bosman
DS-707 Fix/improve DSpace through static code inspections Unassigned Graham Triggs
DS-713
dspace.log only logs remoteAddr, needs extending to include X-Forwarded-For Stuart Lewis JSPUI redirect users to requested page after authentication Kim Shepherd solr.statistics.logBots isn't documented in dspace.cfg Kim Shepherd Improve performance of browse Improve database efficiency Graham Triggs Graham Triggs
Stuart Lewis
DS-718
Kim Shepherd
Allow IPAuthentication to work with proxies (examine X-Forwarded-For header) Stuart Lewis ItemImport - nicer handling of no contents file, and more efficient handling of no handle file sword.cfg seems no longer used after 1.5 Improve performance of Lucene indexing Graham Triggs
DS-736
Graham Triggs
Ability to store an incoming package as a file in the event that the ingest fails Sands Fish Reduce pressure on memory by ensuring that classes with a finalize method make their fields available for garbage collection at earliest possible opportunity Graham Triggs Upgrade steps (1.5->1.6) need reordering Jeffrey Trimble
DS-773
Graham Triggs
DS-779 DS-783
DS-784 Jason Stirnaman
Incorrect DSpace Statistics section references throughout documentation Tim Donohue
xmlui hardcoded string in AuthenticationUtil.java ID: 2088360 Mark H. Wood xmlui browse in empty collection displays "Now showing items 1-0" of 0 incorrect numbering Scott Phillips Single-argument Item.getMetadata does not work with mixed-case metadata Robin Taylor Special groups shown for logged in user rather than for user being examined Stuart Lewis XMLUI Item Mapper cannot handle multiple words in search box Stuart Lewis
DS-123
Keith Gilbertson
DS-215
Nicholas Riley
DS-242
Stuart Lewis
DS-268
Tim Donohue
DS-426
Item's submission license accessible without being configured to be public Claudia Jürgen DCDate.displayDate(false,*) displays only year Mark H. Wood Accessing site-level 'mets.xml' in XMLUI doesn't work properly for handle prefixes with periods (e.g. 2010.1) Kim Shepherd Url in browser is incorrect after login Ben Bosman DatabaseManager.process() unnecessarily limits range of DECIMAL or NUMERIC Mark H. Wood Broken link in the documentation section 8.2.3. Date month and day get default values when user returns to describe form Jeffrey Trimble Robin Taylor
Claudia Jürgen
DS-469 DS-471
DS-493 DS-494
DS-495 DS-497
DS-501 DS-509 Kubrick Theme - NaN in Item Browse Robin Taylor Keith Gilbertson Peter Dietz
Retrieving country names in SOLR can return ArrayIndexOutOfBounds when country code is unchecked Peter Dietz Duplicate listing of dependencies in dspace-sword/pom.xml Withdrawn items not shown as deleted in OAI Stuart Lewis
DS-518
Caryn N.
DS-527 DS-537
Kim Shepherd
Malformed Japanese option values in the authority lookup window Kim Shepherd restricted items are being returned in OAI GetRecord method while using harvest.includerestricted.oai Ben Bosman Misspelled attribute in MODS/METS output Keith Gilbertson
DS-538
Ben Bosman
Andrew Hankinson Claudia Jürgen Claudia Jürgen Claudia Jürgen Robin Taylor
Harvest not internationalized Claudia Jürgen Removal of mapped items can lead to NPE Kim Shepherd Item Mapper List of Mapped Items Usability Mark H. Wood
XMLUI 'Notice's are always added by the Administrative aspect even if the content was generated by another aspect. Robin Taylor Value for Recent Submissions is not working in the XMLUI Claudia Jürgen Removing repeatable values in DescribeStep does not properly test for authority control Kim Shepherd Export directories dspace.cfg and build.xml out of sync Claudia Jürgen
DS-547
Marvin Pollard
DS-548
Kim Shepherd
DS-551
Claudia Jürgen
DS-553
Display date generation only works on dates without time granularity Robin Taylor
Claudia Jürgen
DS-556 DS-562 Add Xalan to Solr pom.xml as dependency Mark Diggory Mark Diggory
Community admin or user with WRITE, ADD and ADMIN policy on collection cannot delete that collection due to bug in AuthorizeUtil.authorizeManageTemplateItem(context,collection) Andrea Bollini Andrew Taylor Multiple spaces in between words in advanced search will make the search return nothing Claudia Jürgen Fixed for Empty description column in Itemview Page - General-Handler.xsl Kim Shepherd Flávio Botelho
DS-563
DS-565
DS-566
Fixed for side bar menu dropping when there is license text in collection DIM-Handler.xsl Kim Shepherd Trailing white spaces lead to wrong order of texts in browse Claudia Jürgen NPE resuming submission for item with an empty bundle original Andrea Bollini DSpaceMETSIngester creates empty original bundle Andrea Bollini
DS-569
Claudia Jürgen
DS-573
Andrea Bollini
DS-574
Andrea Bollini
DS-579
Required fields in submissions display wrong error message Ben Bosman DIDL format include HTML element if the item has no files Andrea Bollini DIDL doesn't respect the hidden fields and the oai_dc metadata section is different than the simple oai_dc implementation Andrea Bollini typo in messages.xml start-handle-server script broken - Error in launcher.xml: Invalid class name Unassigned Unassigned
Ben Bosman
DS-580
Andrea Bollini
DS-581
Andrea Bollini
DS-583 DS-584
DS-585 The Content Disposition configuration is ignored by unpublished items DCDate parsing thread synchronization issue Andrea Bollini Andrea Bollini
DS-594 DS-596
Robin Taylor
Cannot Delete Community which has two or more levels of SubCommunities Claudia Jürgen Errors in 1.5.x -> 1.6.x and 1.6.0 - 1.6.1 upgrade steps Blank screen on XMLUI Dspace 1.6.0 Invalid identifiers not escaped Batch metadata import missing item headers [XMLUI] Multi-page forms error [XMLUI] Breadcrumbs disappear in submission Item exports not deleted on deletion of eperson Documentation refers to install-configs script Jeffrey Trimble
DS-604
Kim Shepherd
Sands Fish Stuart Lewis Stuart Lewis Mark H. Wood Mark H. Wood Robin Taylor Jeffrey Trimble
Antero Neto Stuart Lewis Stuart Lewis Pere Villega Pere Villega Claudia Jürgen Nick Nicholas Keith Gilbertson
ItemImporter stops with "Too many open files" error Keith Gilbertson Batch Metadata Import needs to validate metadata fields specified in CSVs Stuart Lewis NPE when deleting object returned by EPerson.findAll() Error Check Needed for handle.prefix in Dspace.cfg Mark H. Wood
DS-632
Kim Shepherd
DS-633
Mark H. Wood
DS-636
Mark H. Wood
Tim Donohue
DS-637
Browse index bug/fix ONLY for authority index: first "too low" confidence value stop current item metadata to be indexed in the authority index Andrea Bollini
Reinhard Engels
DS-639 sword swap ingest crosswalk: attribute test for dc.identifier.uri incorrect Robin Taylor Bill Hays
DS-644
Broken link in Shibboleth settings in 1.6.2 Documentation Jeffrey Trimble SWORD deposits do not remove temporary package files in the upload directory Tim Donohue "Name collision" error in logs for QDCCrosswalk Tim Donohue Minor fix for CommunityTest. Tim Donohue
Tim Donohue
DS-649
Bill Hays
Some OAI-PMH providers will wrap their metadata inside a wrapper element. Scott Phillips Simple search loses its scope (community or collection) when navigating between search pagination pages. Scott Phillips NPE exception possible when adding a null i18n parameter as content xmlui BitstreamReader holds database connections open while large files download, exhausting connection pool Incorrect use of UTF8 in DSpaceCSV Scott Phillips
DS-665
Scott Phillips
DS-666
Scott Phillips
DS-677
Keith Gilbertson
Keith Gilbertson
DS-686 DS-688
Stuart Lewis
AuthorizeManager.isAdmin(Context, DSpaceObject) throws exception when EPerson is null Kim Shepherd DisplayStatisticsServlet throws NPE if requested handle is null, should throw 404 Kim Shepherd
DS-689
Kim Shepherd
DS-691
file size math in the mets:file template in General-Handler.xsl (XMLUI) is incorrect Kim Shepherd BitstreamStorageManager fails when deleting primary bitstream (constraint violation) Graham Triggs
Hardy Pottinger
DS-698
Graham Triggs
ArrayIndexOutOfBoundsException when editing group roles Claudia Jürgen DSpace Fails to Check if a Handle is already assigned before assigning a new handle Tim Donohue RecentSubmissions missing from Collection view (XMLUI trunk @ r5578) Unassigned JSPUI-style Community and Collection logo URLs do not work in XMLUI Tim Donohue update-sequences.sql script errors out, as it refers to an old, obsolete 'dctyperegistry' table Tim Donohue
DS-709
Andreas Schwander
DS-712
Tim Donohue
DS-721
Kim Shepherd
DS-727
Tim Donohue
DS-729
Tim Donohue
DS-743
ItemImport fails on large batches with OutOfMemory exception Unassigned COinS in XMLUI has invalid referrer Id and dc metadata; spans are not properly constructed Kim Shepherd Item marked NOT in_archive yields inconsistent METS packages Mark H. Wood
Graham Triggs
DS-748
Bill Hays
DS-751
Mark H. Wood
Invalid METS generated for Items with no Bundles Tim Donohue Fix CSV parsing that got broken by refactoring BitstreamReader file descriptor leak Broken authority-control icons in Mirage theme Stuart Lewis Robin Taylor Peter Dietz
Tim Donohue Stuart Lewis Gareth Waller Peter Dietz Keiji Suzuki
StringIndexOutOfBoundsException occurred in DCDate.java Robin Taylor Mirage theme: Repeatable oneboxes in submission do not show "Add more" buttons Ben Bosman
DS-769
Kim Shepherd
Changes in DSpace 1.6.2 Key DS-772 Summary Assignee Reporter Tim Donohue
AIP Restore process fails to consistently restore Item Mappings Tim Donohue NoClassDefFoundError when running filter media on an item that contains a PDF file Ben Bosman need to increase version of ojdbc in prerequisites Tim Donohue
DS-774
DS-786
Hardy Pottinger
DS-604
Jeffrey Trimble
Kim Shepherd
Attachment spelled as attachement in DailyReportEmailer Stuart Lewis Documentation for "schema" attribute in metadata xml files LC Authority Names Lookup Feature - names w/o dates Jeffrey Trimble
DS-534
Keith Gilbertson
DS-557
Kim Shepherd
Mark Diggory
DS-571 DS-577
Upgrade DSpace Services to next release Mark Diggory Use modified Cocoon Servlet Service Impl in place of existing to support proper Cocoon Block addition. Mark Diggory
java.net.MalformedURLException: unknown protocol: resource Mark Diggory Special groups shown for logged in user rather than for user being examined Stuart Lewis CC License being assigned incorrect Mime Type during submission. Jeffrey Trimble
DS-242
Stuart Lewis
DS-295
Steven Williams
DS-471
Accessing site-level 'mets.xml' in XMLUI doesn't work properly for handle prefixes with periods (e.g. 2010.1) Kim Shepherd Url in browser is incorrect after login Ben Bosman
Tim Donohue
DS-493
Ben Bosman
Changes in DSpace 1.6.1 Key DS-497 Summary Date month and day get default values when user returns to describe form Kubrick Theme - NaN in Item Browse embargo-lifter command missing from launcher.xml Log Converter difference between docs (logconverter) and launcher (stats-log-converter) Assignee Robin Taylor Reporter Gabriela Mircea
DS-501 DS-506
DS-507
Jeffrey Trimble
Peter Dietz
DS-509
Retrieving country names in SOLR can return ArrayIndexOutOfBounds when country code is unchecked Peter Dietz Connection leak in SWORD authentication process DSRUN does not start Service Manager Duplicate listing of dependencies in dspace-sword/pom.xml Reordering of 1.5 -> 1.6 upgrade steps in DSpace manual ItemUpdate - script and manual updates Withdrawn items not shown as deleted in OAI Andrea Bollini
Peter Dietz
DS-513
Andrea Bollini
DS-516 DS-518
DS-523
Jeffrey Trimble
Stuart Lewis
Malformed Japanese option values in the authority lookup window Kim Shepherd restricted items are being returned in OAI GetRecord method while using harvest.includerestricted.oai Ben Bosman Misspelled attribute in MODS/METS output Keith Gilbertson
DS-538
Ben Bosman
DS-539 DS-542
verbose output for stats-log-importer displays spurious city/country from previous committed entry Peter Dietz
DS-543 DS-544 DS-547 Claudia Jürgen Claudia Jürgen Marvin Pollard
Harvest not internationalized Claudia Jürgen Removal of mapped items can lead to NPE Kim Shepherd Value for Recent Submissions is not working in the XMLUI Claudia Jürgen Removing repeatable values in DescribeStep does not properly test for authority control Kim Shepherd Export directories dspace.cfg and build.xml out of sync Add Xalan to Solr pom.xml as dependency Error in update sequence script 1.5 to 1.6 Oracle Fixed for Empty description column in Itemview Page - General-Handler.xsl Claudia Jürgen
DS-548
Kim Shepherd
DS-551
Claudia Jürgen
DS-566
Fixed for side bar menu dropping when there is license text in collection DIM-Handler.xsl Kim Shepherd Batch metadata editor fails to notice change of item's owning collection Jeffrey Trimble
DS-572
Stuart Lewis
DS-573
NPE resuming submission for item with an empty bundle original Andrea Bollini DSpaceMETSIngester creates empty original bundle Andrea Bollini
Andrea Bollini
DS-574
Andrea Bollini
DS-579
Required fields in submissions display wrong error message Ben Bosman DIDL format include HTML element if the item has no files Andrea Bollini DIDL doesn't respect the hidden fields and the oai_dc metadata section is different than the simple oai_dc implementation Andrea Bollini
Ben Bosman
DS-580
Andrea Bollini
DS-581
Andrea Bollini
Changes in DSpace 1.6.0 Key DS-610 Summary Assignee Reporter Robin Taylor
Give METS ingester configuration option to make use of collection templates Stuart Lewis Allow the primary bitstream to be set in the item importer / exporter New -zip option for item exporter and importer Creative Commons - option to set legal jurisdiction Community Admin XMLUI: Delegated Admins Patch Stuart Lewis
DS-195
Stuart Lewis
DS-204 DS-205
Unassigned Unassigned
DS-228
Andrea Bollini
Tim Donohue
DS-236
Authority Control, and plug-in choice control for Metadata Fields Jeffrey Trimble Contribution of @MIRE Solr Based Statistics Engine to DSpace. Hide metadata from full item view OAI-PMH + OAI-ORE harvesting support Embargo feature Mark Diggory
Larry Stone
DS-247
Mark Diggory
Claudia Jürgen Scott Phillips Stuart Lewis Stuart Lewis Richard Rodgers
DSpace command launcher Jeffrey Trimble ItemUpdate - new feature to batch update metadata and bitstreams Jeffrey Trimble Add support for OpenSearch syndicated search conventions Jeffrey Trimble
DS-324
Richard Rodgers
DS-330 Stuart Lewis
Create new session on login / invalidate sessions on logout Stuart Lewis Add alternate file appender for log4j JSPUI tags/views for @mire Solr statistics module Graham Triggs Kim Shepherd
DS-359 DS-363
DS-377
Add META tags identifying DSpace source version to Web UIs Larry Stone Item importer - new option to enable workflow notification emails Email test script Jeffrey Trimble
Larry Stone
DS-388
Stuart Lewis
DS-447
Jeffrey Trimble
Stuart Lewis
DS-196
Stuart Lewis
Stuart Lewis
DS-201
handle.jar 6.2 needs adding to DSpace Maven repository Mark Diggory IPAuthentication extended to allow negative matching Stuart Lewis
Stuart Lewis
DS-213
Stuart Lewis
DS-219 DS-221
Internal Server error - include login details of user Stuart Lewis XMLUI 'current activity' recognises Google Chrome as Safari Configurable passing of Javamail parameter settings Move item function in xmlui DSpace Assembly Improvement Stuart Lewis
DS-234
Stuart Lewis
Stuart Lewis
DS-238 DS-241
DS-251 DS-252 Kim Shepherd Larry Stone
Bulk Metadata Editing: XMLUI aspect and forms Kim Shepherd Interpolate variables in the Subject: line of email templates as well Community Admin JSPUI: porting of the DS-228 patch Stuart Lewis
DS-261
Andrea Bollini
Andrea Bollini
Make delegate admin permissions configurable Jeffrey Trimble Make the OAI DC crosswalk configurable README update for top level of dspace 1.6.0 package directory Stuart Lewis Stuart Lewis
DS-297
Refactor SQL source and Ant script to avoid copying Oracle versions over PostgreSQL Larry Stone Allow long values to be specified for the max upload request (for uploading files greater than 2Gb) Graham Triggs Option to disable mailserver Offer access in AbstractSearch to QueryResults for subclasses Jeffrey Trimble Ben Bosman
Larry Stone
DS-299
Stuart Lewis
DS-306 DS-307
DS-308
documentation on an added optional configuration parameter Jeffrey Trimble Enhance readability of embedded metadata in html head Make SWORD app:accepts configurable Stuart Lewis
Ben Bosman
DS-315
Stuart Lewis
DS-316 DS-319
Jeffrey Trimble
Replace '/dspace/bin/dsrun org.dspace.browse.ItemCounter' with /dspace/bin/itemcounter Jeffrey Trimble Adjust SWORD ingest crosswalk to store bibliographic citation Stuart Lewis Minho Statistics Database Manager additions. Mark Diggory
DS-333
Stuart Lewis
DS-335
Mark Diggory
DS-336 DS-347 Mark Diggory Larry Stone
DSpace Services 2.0 Support Mark Diggory Add --quiet option to MediaFilterManager to disable debug/monitoring output Larry Stone Antispam for suggest item feature Unassigned Merge + Improve Generation of Syndication Feeds Larry Stone Script to convert legacy dspace.log stats into solr stats records Jeffrey Trimble
DS-369
Implement harvest.includerestricted.rss in xmlui Unassigned New verbose option for [dspace]/bin/dspace cleanup script Claudia Jürgen
Stuart Lewis
DS-372
Stuart Lewis
DS-382
Proposal: Add 'dc.creator' to Author browse index by default Tim Donohue Allow user to specify which <dmdSec> is used by the METS Ingester when importing METS from Packager script Tim Donohue
Tim Donohue
DS-386
Tim Donohue
DS-389 DS-407
Misleading label: "Submit to This Collection" Stuart Lewis Install or Upgrade on existing server throws errors for "mvn package" Jeffrey Trimble Updates to Upgrade Instructions necessary Jeffrey Trimble
DS-410 DS-412
Xpdf MediaFilter: generate UTF-8 text, and improve error reporting Larry Stone JSP UI cosmetics: horizontal scroll bar Enable RSS feeds by default New Bitstream.findAll() method New ant step test_database Kim Shepherd Jeffrey Trimble Stuart Lewis Jeffrey Trimble
DS-455 DS-460 Remove dspace/config/log4j.xml Change logging from RollingFileAppender to DailyFileAppender Add information about setting web proxies for maven to install docs Czech localization of 1.6.0 Jeffrey Trimble Jeffrey Trimble Stuart Lewis Stuart Lewis
DS-461
Jeffrey Trimble
Stuart Lewis
DS-567
Claudia Jürgen
Ivan Masár
DS-114
Claudia Jürgen
Claudia Jürgen
DS-118
Claudia Jürgen
Claudia Jürgen
DS-121
XMLUI Feedback form breaks with multiple hostnames Kim Shepherd xmlui browse in empty collection displays "Now showing items 1-0" of 0 incorrect numbering Scott Phillips Anchor in submission doesn't work Larry Stone
Keith Gilbertson
DS-123
Keith Gilbertson
File description not available in XMLUI Stuart Lewis metadataschemaregistry_seq is not initialized correctly under Oracle Stuart Lewis OAI RDF crosswalk fails when DC value is null Stuart Lewis Deleting a primary bitstream does not clear the primary_bitstream_id on the bundle table File descriptions can not be removed/cleared in XMLUI Claudia Jürgen
DS-193 DS-197
DS-198
Unassigned
Kim Shepherd
DS-199 SWORD module doesn't accept X-No-Op header (dry run) Unassigned Claudio Venturini
DS-200 DS-206
SWORD module requires the X-Packaging header Stuart Lewis Input form visibility restriction doesn't work properly Context.java turnOffAuthorisationSystem() can throw a NPE Andrea Bollini
DS-209
Stuart Lewis
Stuart Lewis
DS-212
NPE thrown during Harvest of non-items when visibility restriction is enabled Stuart Lewis Migrating items that use additional metadata schemas causes an NPE Hardcoded String in the license bitstream Unassigned
Stuart Lewis
DS-216
Stuart Lewis
DS-217 DS-218
Andrea Bollini
Cannot add/remove email subscriptions from Profile page in XMLUI Tim Donohue Email alerts due to internal errors are not sent, if context is missing Claudia Jürgen
DS-222
Claudia Jürgen
DS-223
Submission process show previous button in JSPUI also if the step is the first "visible" step Andrea Bollini dc.description.provenance - public display Stuart Lewis confirmation page of edit profile has an invalid link Ben Bosman Values with double apos doesn't work in dropdown and list input type Andrea Bollini Missing file DCPersonName parses name incorrectly (fix included) Claudia Jürgen Larry Stone
Andrea Bollini
DS-231 DS-232
DS-240
Stuart Lewis
DS-246 Kim Shepherd
Fix configurable browse parameter encoding (XMLUI) Unassigned Missing admin column in community table in database-schema.sql community admin patch sub-daily utility script does not pass arguments to Java (fix included) Invalid identifiers are not escaped NullPointerException in HttpServletResponseBufferingWrapper (Cocoon bug?) Bitstream (and item-export) download service does not correctly sense authenticated user Claudia Jürgen
DS-248
Claudia Jürgen
DS-249
Stuart Lewis
Larry Stone
DS-250 DS-253
DS-254
Larry Stone
Larry Stone
DS-255
CompleteStep in submission LOSES SUBMISSION if an exception is thrown Larry Stone Item Export ignores metadata language qualifier Stuart Lewis Item View Thumbnails not displaying in XMLUI Stuart Lewis Template item some times has owningCollection filled and some times not Andrea Bollini Bug in DS-118, new patch included Stuart Lewis XMLUI misses logging UsageEvent on requests fulfilled from the cache (with proposed fix) Mark Diggory
Larry Stone
DS-262 DS-264
DS-265
IndexBrowse dies fatally when confronting badly-formatted date Graham Triggs Oracle JDBC connection string wrong in dspace.cfg - ID: 2722093 Claudia Jürgen Typo in XSL breaks rendering of dri:xref with class Mark H. Wood
Samuel Ottenhoff
DS-269
Claudia Jürgen
DS-274
Larry Stone
DS-275 Larry Stone
License files not listed on Item Summary page; XSL bug with patch Stuart Lewis Patch to fix spelling error in Exception page Claudia Jürgen build.xml fails for ant versions below 1.7 (patch included) Jeffrey Trimble Invalid Link to "Go to DSpace Home" on Page Not Found Tim Donohue
DS-276 DS-280
DS-281
Tim Donohue
DS-282
"Starts with" navigation block should not display when browsing by specific value Unassigned Some rows if presented in the item summary will be wrongly considered odd or even. Larry Stone Item and Bitstream pages do not provide LastModified HTTP header, nor recognize If-Modified-Since Larry Stone [dspace]/exports is not created during fresh install Stuart Lewis
Samuel Ottenhoff
DS-284
Flávio Botelho
DS-285
Larry Stone
DS-290
Stuart Lewis
DS-303
Export migrate option incorrectly removes non-handle identifier.uris Stuart Lewis Shibboleth default roles are applied also to anonymous user and user logged-in with other methods Andrea Bollini UTF-8 encoding in community and collection text Ben Bosman JSPUI: Left over text in edit item about format Stuart Lewis
Stuart Lewis
DS-309
Andrea Bollini
java.util.NoSuchElementException: Timeout waiting for idle objec Unassigned SWORD temp upload directory missing trailing slash Stuart Lewis
DS-327
Stuart Lewis
DS-328 SWORD service documents do not include atom:generator element Stuart Lewis Stuart Lewis
DS-337 DS-338
A bug related with adding new -EPersons Claudia Jürgen Bitstream download allows caching of content that requires authorization to read Larry Stone DSpace services log to the command line Mark Diggory Apostrophe in email address prevents EPerson from being selected Graham Triggs
DS-340 DS-344
DS-349
Edit Item in admin UI does not allow setting Bitstream to an Internal BitstreamFormat Larry Stone Missing commits in XMLUI server-side javascript code. Unassigned Make-handle-server configuration fails. Using manual commands instead of script is successful. Jeffrey Trimble New DSpace OAI-PMH Harvester doesn't support OAI gateways that do not use "sets" Ben Bosman E Mail Sent On Item Export Error Message "Letter" links have broken URLs in 2nd-stage Browse XMLUI Submission Interface messes up in IE7 after an empty <hint> in input_forms.xml Unassigned Larry Stone
Larry Stone
DS-353
Flávio Botelho
DS-354
Jeffrey Trimble
DS-365
Mark Diggory
DS-370 DS-373
DS-378
Tim Donohue
Tim Donohue
open-search in jspui won't return description.xml Richard Rodgers community and collection homepage Ben Bosman Packager script is unable to import the same METS + DIM package that was exported Tim Donohue
DS-392 Ben Bosman
Error messages in the submission do not disappear if e.g. one of the two errors are solved Ben Bosman The issue date in the submission lowers each time the describe page is being displayed Larry Stone DSpace Objects (communities, collections, items, bitstreams) only accessible to logged in users Tim Donohue Submission license displayed on collection and item homepage Special characters in collection license lead to parse error Tim Donohue
DS-393
Ben Bosman
DS-395
Claudia Jürgen
DS-398
Claudia Jürgen
DS-399
Tim Donohue
Claudia Jürgen
DS-400
Webui item browse (date, title or similar) reduces displayed issue date by one day Larry Stone View Statistics button does not work in item page Kim Shepherd
Claudia Jürgen
DS-406
Dan Ishimitsu
DS-409
JSPUI Statistics Display ignores "statistics.item.authorization.admin" Kim Shepherd solr statistics file downloads listed in statistics display of communities and collections Create groups via admin UI authorization denied i18n broken in jspui Kim Shepherd
Kim Shepherd
DS-414
Claudia Jürgen
Setting embargo.field.terms to an unqualified field throws uncaught exception on item submission Larry Stone Setting solr.metadata.item.X property to an unqualified field generates exception in SolrLogger.post on item view and prevents Solr from logging the event Mark Diggory
DS-421
Scott Hanrath
Directory 'etc' missing from Ant target 'init_installation'. Ant target 'clean_database' doesn't drop all tables'. Export metadata button displayed in JSPUI Administration List of withdrawn items Item license per default displayed in item display of the xmlui
DS-422
Richard Rodgers
Robin Taylor
DS-423
Tim Donohue
Robin Taylor
DS-424
Stuart Lewis
Claudia Jürgen
DS-427
Richard Rodgers
Claudia Jürgen
DS-428 DS-432
Wrong link for bitstreams during submission Unassigned No mention of config/news-xmlui.xml in manual Jeffrey Trimble
DS-436
SWORD Authenticator doesn't support the special groups infrastructure Andrea Bollini Oracle DB Schema has artifacts from past releases Larry Stone JSPUI stats - filename incorrect on second and subsequent files spiders.txt empty OAI PMH is not delivering continuation tokens CLONE - Foreign characters broken in group names. Kim Shepherd
Andrea Bollini
DS-437
Larry Stone
DS-438
Stuart Lewis
'fresh_install' broken on a completely clean system Stuart Lewis handle.canonical.prefix undocumented Kim Shepherd
LDAPHierarchicalAuthentication fails when the LDAP returns mixed case email address Graham Triggs Exception is thrown when removing the last file after the item is rejected during review. Mark H. Wood
DS-480
Flávio Botelho
Changes in DSpace 1.5.2 Key Summary Assignee Reporter
Once added, the description of a bitstream can not be removed Unassigned SyndicationFeed expects dc.date.issued to be available as a java.util.Date Robin Taylor
DS-597
Eija Airio
DS-610
Robin Taylor
DS-11
Charles Kiplagat
DS-13
Stuart Lewis
Charles Kiplagat
DS-16 DS-19
DS-21
Fix for hardcoded metadata language qualifiers ID: 2433387 Claudia Jürgen Hardcoded String in jspui browse - ID: 2526153 Claudia Jürgen
Charles Kiplagat
DS-30
Charles Kiplagat
DS-31 Bug 2512868 Double quote problem in some fields of JSPUI - ID: 2525942 Claudia Jürgen Charles Kiplagat
DS-34
Add File Format Descriptions to XMLUI 1.5.x ID: 2433852 Unassigned Enable Google Sitemaps for XMLUI - ID: 2462293 Unassigned DSpace 1.5 XMLUI - Enable METS <amdSec> using crosswalks - ID: 2477820 Unassigned Fix for toDate method in DCDate - ID: 2385187 Stuart Lewis
Charles Kiplagat
DS-35 DS-36
DS-39 DS-45
Messages_th.properties for DSpace 1.5.1 JSPUI ID: 2540683 Unassigned Bug 1617889 Years < 1000 do not display in simple item view - ID: 2524083 Andrea Bollini
DS-46
Charles Kiplagat
DS-47
Add support for rendering DOI links in JSPUI (1.4, 1.5) - ID: 2521493 Andrea Bollini Italian translation xmlui - ID: 1984513 Andrea Bollini XMLUI Cocoon logs should not be stored under [xmlui-webapp]/WEB-INF/logs/ Unassigned XMLUI file download links break in Google search results if file 'sequence' number changes. Upgrade XMLUI to Cocoon 2.2 Tim Donohue
Charles Kiplagat
DS-78 DS-85
DS-87
Tim Donohue
DS-93 DS-94
Mark Diggory
Verify Configuration Options are still applicable with the Cocoon User community. Mark Diggory Visiting an item page with a trailing slash returns a 404 message Tim Donohue XMLUI does not come with a robots.txt out-of-the-box. It also does not provide suggestions for loading global static content. Tim Donohue
DS-95
Tim Donohue
DS-97
Tim Donohue
XMLUI OpenURL support Consistent treatment to users in special groups Mark Diggory Andrea Bollini
Item exporter does not export bitstream descriptions Stuart Lewis Patch for Feature Request 2609564 group delete confirm - ID: 2612341 Claudia Jürgen messages_el.xml for DSpace 1.5.1 XMLUI ID: 2615647 Claudia Jürgen
DS-145
Charles Kiplagat
DS-146
Feature Request 2560839 Make sitemap directory configurable - ID: 2560974 Claudia Jürgen Messages_th.properties for DSpace 1.5.1 JSPUI ID: 2540683 Claudia Jürgen Updated Messages_th.properties for DSpace 1.4.2 - ID: 2537799 Usage event (statistics) plugin hook for 1.5 - ID: 2025998 Make a backup of the config dir on Ant update Claudia Jürgen
Charles Kiplagat
DS-147
Charles Kiplagat
DS-148
Charles Kiplagat
DS-150
Mark H. Wood
Charles Kiplagat
Mark Diggory
LDAP_Support for hierarchical LDAP servers Unassigned Speed up SWORD deposits XPDF support for filtering PDFs for text extraction/search. [Dspace 1.5.2] Russian translation Czech localization of 1.5.2 Basque translation Stuart Lewis Mark Diggory
DS-2
"Not found" page returns 200 OK instead of 404 Not Found - ID: 2002866 Mark H. Wood DSpace1.5.1(XML) problem with Login to restricted bitstreams - ID: 2164955 Stuart Lewis XHTML Head Dissemination Crosswalk exposes provenance info - ID: 2343281 HTML tags not stripped in statistics display - ID: 1896225 DSpace Home link style in breadcrumb trail - ID: 1951859 Restricted Items metadata exposed via OAI - ID: 1730606 Implicit group for all registered users - ID: 1587270 Stuart Lewis
Charles Kiplagat
DS-5
Charles Kiplagat
DS-6
Charles Kiplagat
DS-7
Stuart Lewis
Charles Kiplagat
DS-8
Stuart Lewis
Charles Kiplagat
DS-9
Stuart Lewis
Charles Kiplagat
DS-10
Stuart Lewis
Charles Kiplagat
DS-12: Exception handling for deleting a metadata field - ID: 1606439 (Assignee: Stuart Lewis; Reporter: Charles Kiplagat)
DS-14: xmlui Administrative log in as another eperson - ID: 2086481 (Assignee: Stuart Lewis; Reporter: Charles Kiplagat)
DS-15: Submission verify page handles dc.identifier.* incorrectly - ID: 2155479 (Assignee: Unassigned; Reporter: Charles Kiplagat)
DS-17: DSpace 1.5 Controlled Vocab (edit-metadata.jsp) - ID: 1931796 (Assignee: Stuart Lewis; Reporter: Charles Kiplagat)
DS-18: DSpace 1.5.1(XMLUI) Wrong dir usage of StatisticsLoader - ID: 2137425 (Assignee: Stuart Lewis; Reporter: Charles Kiplagat)
DS-20: 2 Authentications with LoginPage cause connection exhaust - ID: 2352146 (Assignee: Claudia Jürgen; Reporter: Charles Kiplagat)
DS-22 DS-23
News stored not language dependent - ID: 2125833 (Assignee: Unassigned)
DSQuery invalid check for empty query string - ID: 2343849
Unassigned
DS-24
Error in authorization to submit when you add collection. - ID: 1725817 (Assignee: Unassigned)
SWORD Service Document fails if Collection is untitled - ID: 1968082 (Assignee: Stuart Lewis)
Hardcoded Strings in DSQuery - ID: 2493794 (Assignee: Claudia Jürgen)
NullPointerException possible in review.jsp - ID: 1571645
Itemmap-browse.jsp throws Exception on items without date - ID: 1745573
Improper display of Umlauts / Encoding of messages_de.xml - ID: 2413800
Claudia Jürgen
Charles Kiplagat
DS-25
Charles Kiplagat
DS-26 DS-27
DS-28
Claudia Jürgen
Charles Kiplagat
DS-29
Unassigned
Charles Kiplagat
DS-32
Export Item email hardcoded strings in sources - ID: 2411653 (Assignee: Claudia Jürgen)
checksum checker can not retrieve very large bitstream - ID: 2016130 (Assignee: Unassigned)
Submission Forms don't preserve order of values - ID: 2541285 (Assignee: Unassigned)
XMLUI Submission forms display errors during add/remove - ID: 2543413
SubmissionController is not thread safe
Small JSP cleanups
Unassigned
Charles Kiplagat
DS-33
Charles Kiplagat
DS-37
Charles Kiplagat
DS-38
Charles Kiplagat
DS-40 DS-41
DS-43
Charles Kiplagat
Manakin RSS feed generator cache timeout can't be adjusted - ID: 2593393 (Assignee: Andrea Bollini)
Double quote problem in some fields of JSPUI (continued) - ID: 2528065 (Assignee: Unassigned)
Item Export via UI leads to malformed request - ID: 2520279
netid.toLowerCase() in different classes and methods - ID: 2458187
Stuart Lewis
DS-58
Andrea Bollini
DS-59
Andrea Bollini
DS-60
Unassigned
Andrea Bollini
DS-61
IP Authentication only works on logged in users - ID: 2088431 (Assignee: Unassigned)
Not right encoding for license and extracted texts - ID: 2234532 (Assignee: Andrea Bollini)
xmlui selection of authentication method - ID: 2086673 (Assignee: Andrea Bollini)
bitstream format registry, setting bitstream internal - ID: 1896055 (Assignee: Andrea Bollini)
Impossible retry/resume after upload a file with internal format (Assignee: Andrea Bollini)
Subscription not sent correctly - ID: 2667590 (Assignee: Andrea Bollini)
Hardcoded behaviour of Initial question step in the submission (Assignee: Andrea Bollini)
Improve API for "IgnoreAuthorization" blocks - ID: 1687783 (Assignee: Unassigned)
XMLUI Feedback form does not include any protection from spamming (Assignee: Tim Donohue)
Anonymous group don't works as subgroup - ID: 1688445
Andrea Bollini
Andrea Bollini
DS-62
Andrea Bollini
DS-64
Andrea Bollini
DS-66
Andrea Bollini
DS-67
Andrea Bollini
DS-81 DS-83
DS-84
Andrea Bollini
DS-86
Tim Donohue
DS-88
Andrea Bollini
DS-90: Workflow doesn't skipped if the wf group contains empty subgroups (Assignee: Andrea Bollini; Reporter: Andrea Bollini)
DS-92: Bitstream access rights inheritance, editing in_archive items - ID: 1993036 (Assignee: Andrea Bollini; Reporter: Andrea Bollini)
DS-96
XMLUI Wing-Framework 'Radio' class doesn't support adding radio buttons which are unselected by default. (Assignee: Tim Donohue)
Key jsp.mydspace.request.export.community not set in Messages.properties (Assignee: Stuart Lewis)
Non-admin user and admin menu options (1.5.1 XMLUI only) - ID: 2353606
Impossible to complete Registration in JSPUI
Andrea Bollini
Tim Donohue
DS-98
Stuart Lewis
DS-99
Stuart Lewis
DS-101 DS-102
Andrea Bollini
Submission without file possible with the skip upload functionality disabled (Assignee: Andrea Bollini)
Use of the Progress bar button don't save the current page modification (Assignee: Andrea Bollini)
Cancel/save button during edit of a workflow item doesn't work (Assignee: Andrea Bollini)
Initial Questions button in the progress bar doesn't work in workflow (Assignee: Unassigned)
Save & exit button doesn't properly work in workflow (Assignee: Unassigned)
DIDL Crosswalk thrown NPE
Andrea Bollini
DS-103
Andrea Bollini
DS-104
Andrea Bollini
DS-105
Andrea Bollini
DS-106
Andrea Bollini
DS-107 DS-112
Scandinavian characters mess up Front page full text search (Assignee: Tim Donohue)
Export item buttons metadata not showed in verify step
File description not correct
Unassigned Andrea Bollini Andrea Bollini
DS-120 DS-122 DS-126 DS-129
Cocoon 2.2 "hijacks" the dspace.log file (Assignee: Tim Donohue)
Tim Donohue Nilde De Paoli Andrea Bollini Claudia Jürgen
Problems with repeatable metadata (Assignee: Andrea Bollini)
Request to [dspace-base-url]/bitstream throw NPE (Assignee: Andrea Bollini)
OpenURL support exposes dc.description.provenance information (Assignee: Mark Diggory)
XMLUI Browse by Author doesn't work for names with special characters (for example: , , , etc.)
Tim Donohue
DS-130
Tim Donohue
DS-131
Browse By Title displays all titles beginning with punctuation at the top of the list (Assignee: Unassigned)
XMLUI overall UTF-8 encoding is inconsistent and forms do not use UTF-8 (Assignee: Tim Donohue)
Authors re-ordered when item edited - ID: 2655052 (Assignee: Stuart Lewis)
'ant init_configs' copies wrong dspace.cfg - ID: 2620307
Search with "%" causes NullPointer in XMLUI - ID: 2557654
XMLUI Submission forms display errors during add/remove - ID: 2543413
Mark H. Wood
Tim Donohue
DS-132
Tim Donohue
DS-138 DS-143
DS-149
Tim Donohue
Charles Kiplagat
DS-151
Tim Donohue
Charles Kiplagat
DS-152: Submission verify page handles dc.identifier.* incorrectly - ID: 2155479 (Assignee: Tim Donohue; Reporter: Charles Kiplagat)
DS-153: Submission Forms don't preserve order of values - ID: 2541285 (Assignee: Tim Donohue; Reporter: Charles Kiplagat)
DS-155: Maven war plugin changes cause war to include libraries as well (Assignee: Mark Diggory; Reporter: Mark Diggory)
DS-185: bin files in DOS format (Assignee: Unassigned; Reporter: Andrea Bollini)
Changes in DSpace 1.5.1
DS-186: NPE during edit of eperson in XMLUI (Assignee: Andrea Bollini; Reporter: Andrea Bollini)
Add servlet-api to overlay wars to reduce compile time errors when adding classes
Correct issues in feed generation
XMLUI Adjust Advanced Search to use search properties from dspace.cfg.
Correct bug in Body.toSAX where startElement is called instead of end element.
Correct issue with libraries being excluded from wars
(Claudia Juergen)
Fix for SF bug #2090761 Statistics wrong use of dspace.dir for log location
Fix for SF bug #2081930 xmlui hardcoded strings in EditGroupForm.java
Fix for SF bug #2080319 jspui hardcoded strings in browse
Fix for SF bug #2078305 xmlui hardcoded strings used in UI in xmlui-api
Fix for SF bug #2078324 xmlui hardcoded strings used in UI in General-Handler.xsl
SF patch #2076066 Review in jspui submission non-dc metadata
SF Bug #1983859 added Foreign Lucene Analyzers to poms
SF Bug #1989916 - missing LDAP authentication key
(Stuart Lewis)
#1947036 Patch for SF Bug1896960 SWORD authentication and LDAP + 1989874 LDAPAuthentication pluggable method broken for current users
Added copying of registration email template to 1.4 to 1.5 upgrade instructions
Fix for SF bug #2055941 LDAP authentication fails for new users in SWORD and Manakin
(Zuki Ebetsu / Stuart Lewis)
#1990660 SWORD Service Document are malformed / Corrected Atom publishing MIME types
(Stuart Lewis / Claudia Juergen)
Updated installation and configuration documents for new statistics script, and removed references to Perl
(Tim Donohue)
Fix for SF bug #2095402 - Non-interactive Submission Steps don't work in JSPUI 1.5
Fix for SF bug #2013921 - Movement in Submission Workflow Causes Skipped Steps
Fix for SF bug #2015988 - Configurable Submission bug in SubmissionController
Fix for SF bug #2034372 - Resorting Search Results in JSPUI always gives no results
Updates to Community/Collection Item Counts (i.e. strengths) for XMLUI.
1.5 upgrade instructions were missing Metadata Registry updates necessary to support SWORD.
Changes in DSpace 1.5
(Graham Triggs)
Fix various problems with resources potentially not being freed, and other minor fixes suggested by FindBugs
Replace URLEncoder with StringEscapeUtils for better fix of escaping the hidden query field
Fix #2034372 - Resorting in JSPUI gives no results
Fix #1714851 - set eperson.subscription.onlynew in dspace.cfg to only include items that are new to the repository
Fix issue where the browse and search indexes will not be updated correctly if you move an Item
Fix problem with SWORD not accepting multiple concurrent submissions
Fix #1963060 Authors listed in reverse order
Fix #1970852 - XMLUI: Browse by Issue Date "Type in Year" doesn't work
Statistics viewer for XMLUI, based on existing DStat. Note that this generates the view from the analysis files (.dat), does not require HTML report generation.
Fixed incorrect downloading of bitstream on withdrawn item
Add JSPUI compatible log messages to XMLUI transformers
Clean up use of ThreadLocal
Improved cleanup of database resources when web application is unloaded
Fix bug #1931799 - duplicate "FROM metadatavalue"
Fixed Oracle bugs with ILIKE operators and LIMIT/OFFSET clauses
Changes in DSpace 1.4.1
1763535 Patch - Alert DSpace administrator of new user registration
1759438 Patch - Multilingualism Language Switch - DSpace Header
Various documentation additions and cleanups
XHTML compliance improvements
Move w3c valid xhtml boiler image into local repository
Remove unnecessary Log4j Configuration in CheckerCommand
Include Windows CLASSPATH in dsrun.bat
Changes in DSpace 1.4
1543966 - "Special" groups inside groups bug
1480496 - Cannot turn off "ignore authorization" flag!
1515148 - Community policies not deleting correctly
1556829 - Docs mention old SiteAuthenticator class
1606435 - Workflow text out of context
Fix for bitstream authorization timeout
Fix to make sure cleanup() doesn't fail with NullPointerException
Fix for removeBitstream() failing to update primary bitstream
Fix for Advanced Search ignoring conjunctions for arbitrary number of queries
Fix minor bug in Harvest.java for Oracle users
Fix missing title for news editor page
Small Messages.properties modification (change of DSpace copyright text)
Fix PDFBox tmp file issue
Fix HttpServletRequest encoding issues
Fix bug in TableRow toString() method where NPE is thrown if tablename not set
Update DIDL license and change coding style to DSpace standard
QueryArgs class can support any number of fields in advanced search.
Community names no longer have to be unique
Enhanced Windows support
Support for multiple (flat) metadata schemas
Suggest an item page
RSS Feeds
Performance enhancements
Stackable authentication methods
Plug-in manager
Pluggable SIP/DIP support and metadata crosswalks
Nested groups of e-people
Expose METS and MPEG-21 DIDL DIPs via OAI-PMH
Configurable Lucene search analyzer (e.g. for Chinese metadata)
Support for SMTP servers requiring authentication
Changes in DSpace 1.2.2
Search scope retention to improve browsing
Community and collection strengths displayed
Upgraded OAICat software
Changes in DSpace 1.2
dspace-admin/confirm-delete-community: moved to tools/ and changed
dspace-admin/edit-collection: moved to tools/ and changed
dspace-admin/edit-community: moved to tools/ and changed
dspace-admin/index: changed
dspace-admin/upload-logo: changed
dspace-admin/wizard-basicinfo: changed
dspace-admin/wizard-default-item: changed
dspace-admin/wizard-permissions: changed
dspace-admin/wizard-questions: changed
help/formats.html: removed
help/formats: changed
index: changed
layout/navbar-admin: changed
15.15.2. Administration
If you are logged in as administrator, you see admin buttons on item, collection, and community pages
New collection administration wizard
Can now administer collection's submitters from collection admin tool
Delegated administration - new 'collection editor' role - edits item metadata, manages submitters list, edits collection metadata, links to items from other collections, and can withdraw items
Admin UI moved from /admin to /dspace-admin to avoid conflict with Tomcat /admin JSPs
New EPerson selector popup makes Group editing much easier
'News' section is now editable using admin UI (no more mucking with JSPs)
15.15.3. Import/Export/OAI
New tool that exports DSpace content in AIPs that use METS XML for metadata (incomplete)
OAI - sets are now collections, identified by Handles ('safe' with /, : converted to _)
OAI - contributor.author now mapped to oai_dc:creator
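The OAI set naming rule noted above (Handles made 'safe' by converting / and : to _) can be sketched as follows. This is a minimal illustration in Python, not DSpace's own code: the function name handle_to_set_spec and the sample Handle are hypothetical and chosen only for this example.

```python
def handle_to_set_spec(handle: str) -> str:
    """Make a Handle 'safe' for use as an OAI set identifier.

    Hypothetical helper for illustration: every '/' and ':' in the
    Handle is replaced with '_', as described in the note above.
    """
    return handle.replace("/", "_").replace(":", "_")


if __name__ == "__main__":
    # Hypothetical sample Handle, not taken from a real repository.
    print(handle_to_set_spec("hdl:1721.1/1234"))  # prints hdl_1721.1_1234
```

Because both characters map to the same replacement, the conversion is one-way: the original Handle cannot always be recovered from the set identifier alone.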
15.15.4. Miscellaneous
Build process streamlined with use of WAR files, symbolic links no longer used, friendlier to later versions of Tomcat
MIT-specific aspects of UI removed to avoid confusion
Item metadata now rendered to avoid interpreting as HTML (displays as entered)
Forms now have no-cache directive to avoid trouble with browser 'back' button
Bundles now have 'names' for more structure in item's content
Moved to dspace-admin and changed: dspace/jsp/admin/authorize-policy-edit.jsp
Moved to dspace-admin: dspace/jsp/admin/collection-select.jsp
Moved to dspace-admin: dspace/jsp/admin/community-select.jsp
Moved to dspace-admin: dspace/jsp/admin/confirm-delete-collection.jsp
Moved to dspace-admin: dspace/jsp/admin/confirm-delete-community.jsp
Moved to dspace-admin: dspace/jsp/admin/confirm-delete-dctype.jsp
Moved to dspace-admin: dspace/jsp/admin/confirm-delete-eperson.jsp
Moved to dspace-admin: dspace/jsp/admin/confirm-delete-format.jsp
Moved to dspace/jsp/tools: dspace/jsp/admin/confirm-delete-item.jsp
Moved to dspace/jsp/tools: dspace/jsp/admin/confirm-withdraw-item.jsp
Moved to dspace-admin and changed: dspace/jsp/admin/edit-collection.jsp
Moved to dspace-admin and changed: dspace/jsp/admin/edit-community.jsp
Moved to dspace/jsp/tools and changed: dspace/jsp/admin/edit-item-form.jsp
Moved to dspace-admin and changed: dspace/jsp/admin/eperson-browse.jsp
Moved to dspace-admin: dspace/jsp/admin/eperson-confirm-delete.jsp
Moved to dspace-admin and changed: dspace/jsp/admin/eperson-edit.jsp
Moved to dspace-admin and changed: dspace/jsp/admin/eperson-main.jsp
Moved to dspace/jsp/tools and changed: dspace/jsp/admin/get-item-id.jsp
Moved to dspace/jsp/tools and changed: dspace/jsp/admin/group-edit.jsp
Moved to dspace-admin and changed: dspace/jsp/admin/group-eperson-select.jsp
Moved to dspace/jsp/tools and changed: dspace/jsp/admin/group-list.jsp
Moved to dspace-admin: dspace/jsp/admin/index.jsp
Moved to dspace-admin and changed: dspace/jsp/admin/item-select.jsp
Moved to dspace-admin and changed: dspace/jsp/admin/list-communities.jsp
Moved to dspace-admin and changed: dspace/jsp/admin/list-dc-types.jsp
Removed: dspace/jsp/admin/list-epeople.jsp
Moved to dspace-admin and changed: dspace/jsp/admin/list-formats.jsp
Moved to dspace/jsp/tools: dspace/jsp/admin/upload-bitstream.jsp
Moved to dspace-admin and changed: dspace/jsp/admin/upload-logo.jsp
Moved to dspace-admin: dspace/jsp/admin/workflow-abort-confirm.jsp
Moved to dspace-admin and changed: dspace/jsp/admin/workflow-list.jsp
Changed: dspace/jsp/browse/authors.jsp
Changed: dspace/jsp/browse/items-by-author.jsp
Changed: dspace/jsp/browse/items-by-date.jsp
Changed: dspace/jsp/browse/no-results.jsp
New: dspace-admin/eperson-deletion-error.jsp
New: dspace/jsp/dspace-admin/news-edit.jsp
New: dspace/jsp/dspace-admin/news-main.jsp
New: dspace/jsp/dspace-admin/wizard-basicinfo.jsp
New: dspace/jsp/dspace-admin/wizard-default-item.jsp
New: dspace/jsp/dspace-admin/wizard-permissions.jsp
New: dspace/jsp/dspace-admin/wizard-questions.jsp
Changed: dspace/jsp/components/contact-info.jsp
Changed: dspace/jsp/error/internal.jsp
New: dspace/jsp/help/formats.jsp
Changed: dspace/jsp/layout/footer-default.jsp
Changed: dspace/jsp/layout/header-default.jsp
Changed: dspace/jsp/layout/navbar-admin.jsp
Changed: dspace/jsp/layout/navbar-default.jsp
Changed: dspace/jsp/login/password.jsp
Changed: dspace/jsp/mydspace/main.jsp
Changed: dspace/jsp/mydspace/perform-task.jsp
Changed: dspace/jsp/mydspace/preview-task.jsp
Changed: dspace/jsp/mydspace/reject-reason.jsp
Changed: dspace/jsp/mydspace/remove-item.jsp
Changed: dspace/jsp/register/edit-profile.jsp
Changed: dspace/jsp/register/inactive-account.jsp
Changed: dspace/jsp/register/new-password.jsp
Changed: dspace/jsp/register/registration-form.jsp
Changed: dspace/jsp/search/advanced.jsp
Changes in DSpace 1.1.1
Changed: dspace/jsp/search/results.jsp
Changed: dspace/jsp/submit/cancel.jsp
New: dspace/jsp/submit/cc-license.jsp
Changed: dspace/jsp/submit/choose-file.jsp
New: dspace/jsp/submit/creative-commons.css
New: dspace/jsp/submit/creative-commons.jsp
Changed: dspace/jsp/submit/edit-metadata-1.jsp
Changed: dspace/jsp/submit/edit-metadata-2.jsp
Changed: dspace/jsp/submit/get-file-format.jsp
Changed: dspace/jsp/submit/initial-questions.jsp
Changed: dspace/jsp/submit/progressbar.jsp
Changed: dspace/jsp/submit/review.jsp
Changed: dspace/jsp/submit/select-collection.jsp
Changed: dspace/jsp/submit/show-license.jsp
Changed: dspace/jsp/submit/show-uploaded-file.jsp
Changed: dspace/jsp/submit/upload-error.jsp
Changed: dspace/jsp/submit/upload-file-list.jsp
Changed: dspace/jsp/submit/verify-prune.jsp
New: dspace/jsp/tools/edit-item-form.jsp
New: dspace/jsp/tools/eperson-list.jsp
New: dspace/jsp/tools/itemmap-browse.jsp
New: dspace/jsp/tools/itemmap-info.jsp
New: dspace/jsp/tools/itemmap-main.jsp
Changes in DSpace 1.1
registration page: Invalid token error page now displayed when an invalid token is received (as opposed to internal server error.) Fixes SF bug #739999
eperson admin 'recent submission' links fixed for DSpaces deployed somewhere other than at / (e.g. /dspace).
help pages: Link to help pages now includes servlet context (e.g. '/dspace'). Fixes SF bug #738399.
15.16.2. Improvements
bin/dspace-info.pl now checks jsp and asset store files for zero-length files
make-release-package now works with SourceForge CVS
eperson editor now doesn't display the spurious text 'null'
item exporter now uses Jakarta's cli command line arg parser (much cleaner)
item importer improvements:
  now uses Jakarta's cli command line arg parser (much cleaner)
  imported items can now be routed through a workflow
  more validation and error messages before import
  can now use email addresses and handles instead of just database IDs
  can import an item to a collection with the workflow suppressed
Many Unicode fixes to the database and Web user interface
Collections can now be deleted
Bitstream descriptions (if available) displayed on item display page
Modified a couple of servlets to handle invalid parameters better (i.e. to report a suitable error message instead of an internal server error)
Item templates now work
Fixed registration token expiration problem (they no longer expire.)