
Cooperative Research Centre

for Torres Strait

Torres Strait Research Program

Data and Information Management


CRC-TS Project Task Number: 5.2

T. J. Taranto
C. R. Pitcher

Final Report
National Library of Australia Cataloguing-in-Publication data:

Taranto, T. J. (Thomas John).


CRC Torres Strait task 5.2, data and information management :
final report for CRC Torres Strait.

Bibliography.
Includes index.
ISBN 9781921232619 (pbk.)
ISBN 9781921232626 (pdf)

1. Marine sciences - Research - Queensland - Torres Strait Islands. 2. Torres Strait Islands (Qld.).
I. Pitcher, C. R. (Clifford Roland). II. Cooperative Research Centre for Torres Strait. III. CSIRO.
Marine and Atmospheric Research. IV. Title.

551.4609943

Citation:
Taranto, T. J. and C. R. Pitcher (2007). CRC Torres Strait Task 5.2: Data and Information
Management. Final Report for CRC Torres Strait. CSIRO Marine and Atmospheric Research,
Cleveland. pp.42.

Published: March 2007 by CSIRO Marine and Atmospheric Research


© CSIRO Marine and Atmospheric Research and CRC Torres Strait, 2007
This work is copyright. Except as permitted under the Copyright Act 1968 (Cwth), no part of this
publication may be reproduced by any process, electronic or otherwise, without the specific written
permission of the copyright owners. Neither may information be stored electronically in any form
whatsoever without such permission.

DISCLAIMER
CSIRO has taken all reasonable steps to ensure that the information contents in this publication are
accurate at the time of publication. Readers should ensure that they make appropriate inquiries to
determine whether new information is available on the particular subject matter.
CRC Torres Data and Information Repository i

June 2007

Data and Information Management

CRC-TS Task Number: 5.2

Tom Taranto
Roland Pitcher
CSIRO Marine and Atmospheric Research
233 Middle St, Cleveland, Qld.

CRC Torres Strait Research Task 5.2 Final Report



ACKNOWLEDGEMENTS
The compilation of data and research information into the Torres Strait Marine Research repository
was achieved through the collaboration of many research agencies. The contributions by AFMA,
AIMS, JCU, QDPI and the TSRA along with the funding by the CRC Torres Strait and the CSIRO
Division of Marine and Atmospheric Research are acknowledged. The continued support by the Reef
and Rainforest Research Centre (RRRC) and the CSIRO Marine and Atmospheric Research (CMAR)
Data Centre will ensure that marine research in the Torres Strait will have an extensive internet based
library of searchable information and data to draw upon into the future.
TABLE OF CONTENTS
ACKNOWLEDGEMENTS
TABLE OF CONTENTS
FIGURES
TABLES
NON-TECHNICAL SUMMARY
PROJECT: Task 5.2 Data and Information Management
PRINCIPAL INVESTIGATOR: Tom Taranto
CO-INVESTIGATOR: Roland Pitcher
ADDRESS:
OBJECTIVES:
NON-TECHNICAL SUMMARY:
Achievements and Outcomes against the objectives (2006 - 2007)
Utilisation and Application of the Research (2006 - 2007)
Publications (2006 - 2007)
Outcomes Achieved
1. INTRODUCTION
1.1. BACKGROUND
1.2. NEED
1.3. OBJECTIVES
2. METHODS
2.1. Facilitate the collation of CRC-TS Intellectual Property (IP)
2.2. Develop a searchable repository website
2.3. Coordinate the moderation and listing of sensitive data and publications
3. RESULTS
3.1. Facilitate the collation of CRC-TS Intellectual Property (IP)
3.2. Develop a searchable repository website
3.3. Coordinate the moderation and listing of sensitive data and publications
4. DISCUSSION
4.1. Facilitate the collation of CRC-TS Intellectual Property (IP)
4.2. Develop a searchable repository website
4.3. Coordinate the moderation and listing of sensitive data and publications
5. BENEFITS
6. FURTHER DEVELOPMENT
7. ACHIEVEMENT OF OUTCOMES
8. CONCLUSIONS
9. RECOMMENDATIONS
10. REFERENCES
11. ABBREVIATIONS & GLOSSARY
12. APPENDIX 1: INTELLECTUAL PROPERTY
13. APPENDIX 2: TASK MANAGEMENT LISTING
14. APPENDIX 3: STAFF

FIGURES
Figure 3-1. Index page showing custom repository search tool and Marlin Metadata Search tool
Figure 3-2. Repository custom Google search result page
Figure 3-3. Direct access to CMAR Data Centre
Figure 3-4. Additional page to search other websites
Figure 3-5. Direct access to other Australian Data Centres
Figure 3-6. Search libraries
Figure 3-7. Custom Google search on contributing web domains

TABLES
Table 3-1. Status of CRC Torres IP works lodged (as at 23 Mar 2007) by project
Table 3-2. Status of CRC Torres IP works lodged (as at 23 Mar 2007) by Task level
NON-TECHNICAL SUMMARY
PROJECT: Task 5.2 Data and Information Management

PRINCIPAL INVESTIGATOR: Tom Taranto

CO-INVESTIGATOR: Roland Pitcher

ADDRESS:
CSIRO Marine and Atmospheric Research
233 Middle St, Cleveland, 4163
Ph: 07 3826 7259 Fax: 07 3826 7222
Email: tom.taranto@csiro.au

OBJECTIVES:
1. To facilitate the collation of CRC-TS Reports, metadata and available associated data from
Principal Investigators in a standard format where possible.
2. To develop a searchable website on a secure repository containing linked Reports, metadata
and available associated data (for limited access where appropriate).
3. To coordinate the moderation and listing of sensitive data and publications for the project.

NON-TECHNICAL SUMMARY:
The CRC Torres Strait Data and Information Management Task was commissioned in June 2006, with
the involvement of a CRC TS Steering Committee, to capture publications and data produced by the
CRC Torres Strait. Preliminary work on the project began in June 2006. As the CRC Torres Strait
wound up in July 2006, the Contract and IP collected by this project have since been transferred to the
Reef and Rainforest Research Centre (RRRC).
A product of the project was to provide a website where all available CRC Torres Strait marine related
research information can be accessed by all stakeholders. The website has been established and
populated with publications and data as received from past CRC Torres Strait Task Leaders. The
website and associated holdings are hosted by the CSIRO Marine and Atmospheric Research Data
Centre, to be maintained in perpetuity.
During the collection phase of the Task, progress reports were provided by way of fortnightly emails
(up to 18th Dec 2006) to all CRC Task Leaders detailing the status of the collection of works and
requesting that outstanding works be lodged to the repository. Due to the slow response by Task
Leaders in lodging their research works, this Project was granted an extension to provide additional
opportunity for Task Leaders to lodge their CRC Torres Strait works. At the time of drafting this
Final Report (23 March 2007), only 91 of the 187 identified works had been received, leaving 96 outstanding.
An important objective of this Task was to identify sensitive material. Effective procedures were
developed in association with Torres Strait Regional Authority (TSRA) staff to ensure appropriate
access constraints are enforced for all CRC works submitted to the repository.
All available literature and data can now be accessed online at http://www.cmar.csiro.au/DataCentre/torres,
and searched using customized search engines. Following final feedback, the website will be promoted
to other Torres Strait agencies and libraries for inclusion on their websites.

Achievements and Outcomes against the objectives (2006 - 2007)


The resultant Torres Strait Marine Research Repository has successfully published the CRC Torres
Strait IP works that have been lodged with the website administrator. In addition, the customised
search interfaces available to stakeholders significantly enhance the repository’s planned outcome of
providing a searchable repository of research from the CRC Torres Strait.
With a defined repository now permanently maintained by the CSIRO Marine and Atmospheric
Research Data Centre, the maintenance and availability of any lodged IP - not just that of the CRC
Torres Strait - is assured, thus providing an enduring service to the Torres Strait community.
A customised search interface has been developed to simplify searching of the Repository, and can be
incorporated into other agency websites and/or linked to other independent repositories. The result is a
search interface that can seamlessly link independent research efforts.

Utilisation and Application of the Research (2006 - 2007)


The design of the repository and its search interfaces gives both individual stakeholders and
organizations the opportunity to participate in providing knowledge to a common user interface.
Individual stakeholders are invited to lodge their works to the repository permanently served by the
CSIRO Marine and Atmospheric Research Data Centre, while those agencies with their own
inventories are invited to supply internet links to their publicly accessible directories.
The application of this research requires that the repository be promoted to stakeholders of the Torres
Strait. In addition there will always be a need to facilitate either the lodgment of IP works to the
repository or hyperlinks to new and valued inventories of information managed by other research
agencies. It is only through the continued maintenance, cooperation and participation of like-minded
stakeholders that this search interface can maximize research developments within the Torres Strait.

Publications (2006 - 2007)


Taranto, T. J. and C. R. Pitcher (2007). CRC Torres Strait Task 5.2: Data and Information
Management. Final Report for CRC Torres Strait. CSIRO Marine and Atmospheric Research,
Cleveland.
Taranto, T. J. (2007) CRC Torres Strait Marine Research Repository.
http://www.cmar.csiro.au/DataCentre/torres . CSIRO Marine and Atmospheric Research Data Centre.

Outcomes Achieved

The Torres Strait Marine Research Repository provides stakeholders of the Torres Strait both a
secure repository of past research efforts and a utility that searches both this repository and other
information repositories related to the Torres Strait marine environment.
The collation of CRC-TS reports, metadata (and available associated data) from Principal
Investigators (PIs) is the foundation of the Torres Strait Marine Research Repository. All lodged
literature is in standard pdf format with all submitted metadata adhering to the Australian ANZLIC
standard. All CRC-TS literature and data have been vetted by the TSRA for sensitivity, and
appropriate internet website security options have been implemented. Both metadata and associated data are
maintained by the CSIRO Marine and Atmospheric Research Data Centre.
In addition to providing a customised searchable website that links to the Repository of CRC-TS
outputs (literature and data), the website provides a searchable interface to other non-CRC research
works and data sources related to Torres Strait marine resources.
By promoting the Repository and its search capabilities this project benefits not only future research
in the region but also the communities that depend on its resources.

KEYWORDS: Torres Strait data metadata reports literature archive repository


1. INTRODUCTION

1.1. BACKGROUND
This CRC Torres Strait Task was initiated in response to a request from the CRC TS Board during
June 2005 regarding the issue of data archiving and management of information arising from CRC TS
research tasks. Following that request, CSIRO Marine and Atmospheric Research conducted a
preliminary project to scope the requirements to address the CRC TS Board's needs.

1.2. NEED
The scoping project identified that the CRC TS had contracted over 24 projects that were expected to
produce data and final reports/theses. There were also several AFMA contracted projects conducted
since the completion of the AFMA TS Reports and Data Archive (Taranto, 2004). A need was
identified to capture these reports, data and metadata before some CRC TS Task Leaders dispersed and/or
became difficult to contact.
The CRC Torres Strait managed a set of Research Tasks under a co-ordinated Research Plan due for
completion by mid-2006. It was identified that the principal tasks of a Data Management Task would be
to facilitate the entry of metadata and collection of each of the individual CRC TS Task datasets,
facilitate the production of reports in a standard PDF format, and work with each PI to get the actual
data and reports lodged into a central system.
It was also identified that a web site would need to be developed to maximise future distribution of
collected information and data — preferably linked under the Torres Strait Regional Authority web
site (www.tsra.gov.au) — containing searchable metadata, the PDF reports as a searchable document
library and access to the actual data for direct downloading where possible and appropriate. All
content was to be moderated for privacy and cultural sensitivities and published under the appropriate
restrictions as defined by the TSRA. A companion DVD was identified as an additional or alternative
product that could also be developed, but this was not progressed.

1.3. OBJECTIVES
The original objectives were:
1. To facilitate the collation of CRC-TS Reports, metadata and available associated data from
Principal Investigators (PIs) in a standard format where possible.
2. To develop a searchable website on a secure repository containing linked Reports, metadata and
available associated data (for limited access where appropriate).
3. To coordinate the moderation and listing of sensitive data and publications for the project.

2. METHODS
2.1. Facilitate the collation of CRC-TS Intellectual Property (IP)

To address the objective of collating CRC-TS Reports, metadata and available associated data from
Principal Investigators in a standard format where possible, a number of facilitation services were
provided to CRC TS Task Leaders.
Immediately following an email from the CRC TS to all Task Leaders advising of the initiation of this
Data and Information Management Task, and requesting their response, this task initially produced
and distributed a CRC Torres Strait Final Report template document (on 13 June 2006) to facilitate a
common reporting interface. The template was designed in consultation with CMAR graphic designers
and CRC TS staff. This Final Report adheres to that design template.
At the commencement of the Task, extensive research and interviews were conducted with CRC Task
Leaders and other CRC TS project staff to establish the extent of IP works attributed to the CRC TS.
This IP inventory became the basis of coordinating the lodgment of Task outputs to the Data and
Information Repository. See APPENDIX 1: Intellectual Property.
To promote ongoing lodgment of CRC TS IP works, all Task Leaders were emailed status reports
fortnightly from mid September 2006 to end December 2006. Each status report highlighted the need
for Task Leaders to lodge IP works to the repository and included instructions on the agreed lodgment
process. In addition, all Task Leaders were advised of the preference to lodge reports in PDF format
based on the CRC report template that was specifically developed for the exercise.
Due to the lower than expected lodgment of CRC IP works all Task Leaders were personally contacted
during December 2006 and an extension of this Task's Milestone Report (from end December 2006 to
February 2007) was sought to provide further time for Task Leaders to lodge their IP works. In
addition to the above status reports, and to further facilitate IP lodgment, each Task Leader was
personally informed that they could simply lodge their works by email directly to the Repository
Administrator. Though discussions with Task Leaders were positive, at the time of drafting this report
(23 March 2007) there were still a significant number of identified works yet to be submitted to the
repository.
Another addition to the publications and metadata being collected from the CRC Torres Strait Program
is the likely inclusion of publications and metadata as published on the AFMA Torres Strait Research
DVD (Taranto, 2004). During September 2006 the then AFMA Director (Richard McLoughlin) was
approached to enquire if AFMA was agreeable to releasing copyright to selected works of the AFMA
Torres Strait Research DVD. After receiving a positive response, CSIRO Legal services were
requested to draft the appropriate copyright permissions to present to AFMA. It was envisioned that
conditional on AFMA granting permission, the addition of the 96 publications dating back to 1980 and
approximately 30 metadata statements, from this DVD, to the repository website would provide
stakeholders ready access to an extensive library of information to facilitate planning, management,
and research within the Torres Strait region.

2.2. Develop a searchable repository website


Research was carried out to address the objective of developing a searchable website on a secure
repository containing linked reports, metadata and available associated data (for limited access where
appropriate). Discussions were also conducted with e-Repository stakeholders at various Australian
Universities, CSIRO Corporate and the CMAR Data Centre, regarding implementation of an
enterprise level searchable information and data website repository on a secure environment as part of
this Task’s output.
The open source Fedora and DSpace e-repository applications were evaluated (CatalystIT 2003), and
consultations were held with Fedora hosting agencies at the University of Qld and Monash University,
and with DSpace hosting agencies at Queensland University of Technology and the Australian National
University Digital Resource Services Centre. After consideration of available research and
discussions, recommendations were made to the host data centre (CMAR Data Centre) for
implementation of the DSpace digital repository system as a pilot project.
The DSpace open source application was recommended as it is designed specifically to capture, store,
index, search, preserve, and distribute digital research material. Research institutions worldwide use DSpace as an
institutional repository. The DSpace open source platform is freely available and can be customized
and extended as required. Its contents are exportable to other common formats if needed.
The intent of selecting an electronic repository such as DSpace was to automate lodgment, network
linking, storage and distribution of reports, metadata and available associated data, and to have the
holdings ranked highly by the ‘Google Scholar’ search engine. However, due to time constraints this
proposed enterprise system was shelved and a simpler hybrid system was developed that addressed only
the direct searchable web delivery needs of this Data and Report Management Task.
To manage the anticipated small number of CRC works requiring processing (initial expectations were
60 to 80), it was decided that lodged material would be manually loaded to a web file directory
structure hosted by the CMAR Data Centre and that any metadata would be manually submitted to
the existing CMAR Marlin Marine Research Metadata Directory by the Repository Administrator.
Without the implementation of an enterprise level repository, other cost effective search engines were
considered. The search engine as used in the AFMA Torres Strait Research DVD publications and
metadata (Taranto, 2004) was examined, but rejected as it did not cater for searching of Adobe pdf
files, which was the prescribed information format to be lodged by Task Leaders. In January 2007,
Google released a new custom search engine which appeared to address the needs of the Task. It was
easy to use, quick, and searched all necessary file formats (PDF, MS Word and ASCII). The
Google search interface was also considered superior as its search protocol is familiar to most
stakeholders who would use the Torres Strait repository and, being freeware, it was also cost
effective. The Custom Google Search engine permits search queries to be constrained to selected
websites, web directories and file types, permitting quick searching of specific material.
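The effect of constraining a search to selected websites and file types can be illustrated with Google's public `site:` and `filetype:` query operators. The following is a minimal sketch only, not the configuration used for the repository's Custom Google Search engine; the function names and the example search terms are hypothetical and do not come from the project.

```python
from urllib.parse import urlencode


def build_constrained_query(terms, sites, filetypes=()):
    """Build a Google query string limited to given sites and file types.

    Illustrative helper (hypothetical name): it relies only on the public
    `site:` and `filetype:` search operators, combined with OR.
    """
    parts = [terms.strip()]
    if sites:
        parts.append("(" + " OR ".join(f"site:{s}" for s in sites) + ")")
    if filetypes:
        parts.append("(" + " OR ".join(f"filetype:{t}" for t in filetypes) + ")")
    return " ".join(parts)


def search_url(query):
    """Return a standard Google search URL for the query."""
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # Repository address as given in this report; the search terms are made up.
    q = build_constrained_query(
        "seagrass dugong",
        sites=["www.cmar.csiro.au/DataCentre/torres"],
        filetypes=["pdf", "doc"],
    )
    print(q)
    print(search_url(q))
```

Passing several entries in `sites` mimics a search across the web domains of several contributing agencies, while `filetypes` narrows results to the document formats lodged by Task Leaders.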
Due to the need to utilize the current CMAR Data Centre Marlin Metadata Repository, a separate
search interface was developed to link directly to the system from the website. This interface provides
Torres Strait stakeholders direct access to searching all public holdings of the CSIRO database and
provides direct access to editing of additional metadata records (and associated data) if desired.
As an additional service to Torres Strait stakeholders, other search tools were investigated (and
developed) to provide direct access to: the Australian Spatial Data Directory (ASDD), Google Scholar
and various web domains belonging to Agencies conducting relevant Torres Strait activities.

2.3. Coordinate the moderation and listing of sensitive data and publications

Security for sensitive research publications and data has been paramount. All lodgments to the
repository were initially placed in a password protected website. During January 2007, discussions
with TSRA staff identified an effective protocol for TSRA to assess the sensitivity of each individual
report and dataset. Under this process, all material deemed non-sensitive by the TSRA was identified
and the Repository Administrator directed to relocate such works onto the public website area in time
for the scheduled repository launch in April 2007. Identified sensitive works were to remain within the
restricted area of the repository under three levels of security: password protection, ‘robot
exclusion’ (no caching) and an internet index search control file.
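The ‘robot exclusion’ layer can be illustrated with a standard robots.txt file. The sketch below is an example under stated assumptions rather than the repository's actual configuration: the `/restricted/` path and the `example.org` host are hypothetical, and Python's standard `urllib.robotparser` module is used only to check how a compliant crawler would interpret the rules.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical layout: sensitive works live under /restricted/.
ROBOTS_TXT = """\
User-agent: *
Disallow: /restricted/
"""

# Page-level directive asking search engines not to index or cache a page.
NO_INDEX_META = '<meta name="robots" content="noindex, noarchive">'


def crawler_may_fetch(url, robots_txt=ROBOTS_TXT, agent="*"):
    """Return True if a compliant crawler is allowed to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for path in ("/restricted/report.pdf", "/public/report.pdf"):
        url = "http://example.org" + path
        print(path, crawler_may_fetch(url))
```

Robot exclusion is advisory and only affects well-behaved crawlers, which is why it complements password protection instead of replacing it.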
3. RESULTS
Each of the Task’s three objectives was accomplished. The varying levels of completeness were
related to the degree of responses from CRC TS Task Leaders.

3.1. Facilitate the collation of CRC-TS Intellectual Property (IP)


The outcome of this process has been the lodgment of fewer than 50% of identified works to the
repository, and of those only a minimal number used the supplied template. See APPENDIX 1: Intellectual
Property.
During the follow-up discussions with each of the Task Leaders during December 2006, some
suggested that the lower than expected response by Task Leaders to lodge their works with the
repository could be attributed to the late commencement of the CRC Task 5.2 Data and Information
Management Task and the need by many Task Leaders to ‘prepare for the Special Issue Publication
incorporating much of the CRC research outputs’. In order to facilitate a timely start on this Task,
CSIRO authorized staff to commence work in May 2006 prior to contract signing, which occurred on
8 December 2006. Most CRC TS Tasks were completed at the end of June 2006.
Table 3-1 summarises the status of lodgments at Project level and Table 3-2 summarises the status of
lodgments at individual Task levels as at 23 March 2007. See APPENDIX 1: Intellectual Property for a
more detailed Task Level Table that lists identified IP works and their current status. This listing is
also available online at http://www.cmar.csiro.au/datacentre/torres/CRCTS2003_06/index.htm. This
comprehensive IP listing has been the basis of information collation and at all times during the project
has been available for Task leaders to provide feedback on.

Table 3-1. Status of CRC Torres IP works lodged (as at 23 Mar 2007) by project. Note that the number of
Reports also includes the Final (Web) Reports lodged at the CRC Torres website in July 2006.
Columns: Project, then a submitted count and an identified count for each of eight types of work
(Reports, Metadata, Data, Papers, Abstracts, Posters, Articles and Presentations), then TOTAL
SUBMITTED, TOTAL IDENTIFIED and OUTSTANDING.

Project 1 22 23 1 13 7 15 8 15 0 0 1 1 4 5 2 2 45 74 29

Project 2 8 9 0 8 0 7 0 7 5 5 0 0 0 0 0 2 13 38 25

Project 3 10 15 0 1 0 8 0 8 0 0 0 0 0 0 7 7 17 39 22
Project 4 7 10 0 5 1 3 2 3 0 0 0 0 0 0 0 0 10 21 11
Project 5 4 4 0 0 2 2 0 2 0 0 0 0 0 0 0 0 6 8 2

Unknown 0 7 0 7 7
TOTAL 51 61 1 34 10 35 10 35 5 5 1 1 4 5 9 11 91 187 96
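
The lodgment rates implied by the last three columns of Table 3-1 (total submitted, total identified and outstanding) can be checked with a short calculation. The figures below are copied from the table; the variable and function names are illustrative only.

```python
# (submitted, identified) totals per project, copied from Table 3-1.
TOTALS = {
    "Project 1": (45, 74),
    "Project 2": (13, 38),
    "Project 3": (17, 39),
    "Project 4": (10, 21),
    "Project 5": (6, 8),
    "Unknown": (0, 7),
}


def lodgment_rate(submitted, identified):
    """Percentage of identified works that were lodged."""
    return 100.0 * submitted / identified


def overall(totals):
    """Sum submitted and identified counts across all projects."""
    submitted = sum(s for s, _ in totals.values())
    identified = sum(i for _, i in totals.values())
    return submitted, identified


if __name__ == "__main__":
    for name, (s, i) in TOTALS.items():
        print(f"{name}: {s}/{i} lodged ({lodgment_rate(s, i):.1f}%), {i - s} outstanding")
    s, i = overall(TOTALS)
    print(f"TOTAL: {s}/{i} lodged ({lodgment_rate(s, i):.1f}%), {i - s} outstanding")
```

The project rows sum to 91 of 187 works lodged (about 48.7%), with 96 outstanding, which agrees with the TOTAL row of the table.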

The lower than expected use of the Report template may also be associated with the late commencement
of the project. Although the template was provided to all Task Leaders (on 13/06/2006) in advance
of the milestone date, this was closely coincident with the Final Reporting deadline for CRC Torres
Strait Tasks, and some Task Leaders had already completed their reports or had them well
underway, so there was insufficient time for many of them to adopt the template. The majority of
Reports lodged were either based on existing agency templates or the MSWord normal template.
Table 3-2. Status of CRC Torres IP works lodged (as at 23 Mar 2007) by Task. Note that the number of Reports
also includes the Final (Web) Reports lodged at the CRC Torres website in July 2006.

Columns: Task, then a submitted count and an identified count for each of eight types of work
(Reports, Metadata, Data, Papers, Abstracts, Posters, Articles and Presentations), then TOTAL
SUBMITTED, TOTAL IDENTIFIED and OUTSTANDING.
1.1 3 3 0 4 0 2 0 2 2 2 5 13 8
1.2 2 2 0 1 0 2 3 1
1.3 2 2 0 2 0 2 0 2 2 8 6
1.4 2 2 0 1 1 1 1 1 1 1 5 6 1
1.5 2 2 0 1 1 1 0 1 3 5 2
1.6a 2 2 1 3 5 5 4 5 1 1 2 2 15 18 3
1.7 1 1 1 1 0
1.8 2 2 0 1 1 1 0 1 3 5 2
1.11 2 2 0 2 2 2 4 6 2
1.13 2 2 1 1 3 3 0
1.14 1 1 0 1 0 1 0 1 1 4 3
1.15 0 1 0 1 1
1.16 1 1 1 1 0
Prj 1 22 23 1 13 7 15 8 15 0 0 1 1 4 5 2 2 45 74 29
2.1 1 2 0 1 0 0 1 3 2
2.2 5 5 0 5 0 5 0 5 5 5 0 2 10 27 17
2.3 2 2 0 2 0 2 0 2 2 8 6
Prj 2 8 9 0 8 0 7 0 7 5 5 0 0 0 0 0 2 13 38 25
3.1 2 2 0 2 0 2 2 6 4
3.2 2 2 2 2 0
3.3 1 2 0 1 0 2 0 2 1 7 6
3.4 3 5 0 3 0 3 3 11 8
3.5 1 1 1 1 0
3.6 1 2 7 7 8 9 1
3.7 0 1 0 1 0 1 0 3 3
Prj 3 10 15 0 1 0 8 0 8 0 0 0 0 0 0 7 7 17 39 22
4.1a 1 1 0 3 1 1 1 1 3 6 3
4.2 2 2 2 2 0
4.3 2 2 0 1 1 1 3 4 1
4.4 1 2 0 1 1 3 2
4.6 0 2 0 2 2
4.7 1 1 0 1 0 1 0 1 1 4 3
Prj 4 7 10 0 5 1 3 2 3 0 0 0 0 0 0 0 0 10 21 11
5.1 2 2 2 2 0 2 4 6 2
5.2 2 2 2 2 0
Prj 5 4 4 0 0 2 2 0 2 0 0 0 0 0 0 0 0 6 8 2
Unk 0 7 0 7 7
TOTAL 51 61 1 34 10 35 10 35 5 5 1 1 4 5 9 11 91 187 96

The inclusion of selected publications and metadata from the AFMA Torres Strait Research DVD
(Taranto and Pitcher, 2004) was still in progress at the time of drafting this report. CSIRO Legal
has sought permission from AFMA to load the entire contents of the AFMA DVD onto the website,
except for six publications whose copyright is held by agencies other than AFMA or CSIRO. Upon
receipt of AFMA's copyright permission, the selected works can be transferred to the readied
website.

3.2. Develop a searchable repository website


The repository website was successfully developed in conjunction with the CMAR Data Centre. With
the Torres Strait Marine Research Repository now residing on servers maintained by the CSIRO, its
contents are automatically backed up and will be maintained into the future. The repository, which
is linked from the CRC Torres Strait homepage, can be found at
http://www.cmar.csiro.au/datacentre/torres.
The simplified webpage has been specifically designed to minimize any issues should fellow research
agencies wish to incorporate the search tools developed for the repository into their own webpages.
The custom Google Search engine constrains search queries to the repository website, searches all
necessary file formats (PDF, MS Word, MS Excel, ASCII, etc.) and returns a quick search result of
specific material in the familiar Google format.
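
As a rough illustration of what constraining a query to the repository website means, the sketch below builds an ordinary Google search URL restricted to the repository path with the public `site:` operator. It is not the configuration of the custom search engine itself, which this report does not reproduce. The function name and the example search terms are hypothetical.

```python
from urllib.parse import urlencode

# Repository address as given in this report.
REPOSITORY = "www.cmar.csiro.au/datacentre/torres"

def repository_search_url(terms: str, site: str = REPOSITORY) -> str:
    """Return a Google search URL limited to pages under `site`."""
    # The `site:` operator restricts results to one domain or path.
    query = f"{terms} site:{site}"
    return "https://www.google.com/search?" + urlencode({"q": query})

# Hypothetical search terms, for illustration only.
url = repository_search_url("rock lobster")
print(url)
```

Passing a different `site` value would point the same query at another agency's domain, which is the idea behind the additional search tools described in this section.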
A separate search interface has been developed to link directly to the CMAR Data Centre. This
interface allows Torres Strait stakeholders to search all public holdings of the CSIRO database
and, if desired, to edit additional metadata records (and associated data) directly from the
website.
The website provides access to CRC TS (now RRRC) copyrighted IP. As the owner of the IP, the RRRC
bears the onus of ensuring that all necessary licensing requirements for IP stored within the
repository (and hosted by the CMAR Data Centre) are in place. To assist the RRRC, the Repository
Administrator investigated suitable licensing for the RRRC to assign to the repository holdings
when they are made available to the public at large. The diagram of the website below depicts a
link to an existing CSIRO Disclaimer and the use of a ‘Creative Commons’ license, a simplified
licensing regime specifically designed for the transfer of Australian internet based information
and recommended by the Queensland Government. These are provided as examples that the owner (the
RRRC) can put in place before the IP is made available to the public at large.
As an additional service to Torres Strait stakeholders, other search tools have been developed on an
associated webpage to provide direct access to:
1. the Australian Spatial Data Directory (ASDD), for which the CMAR Data Centre is a node.
This interface facilitates searching across data warehouses e.g. Geoscience Australia, ERIN.
2. Google Scholar. This search engine provides a simple way to broadly search for scholarly
literature. The search term incorporates ‘Torres Strait +marine’ to refine the search to
literature specific to this Task’s area of interest, namely Torres Strait marine research.
Results include publicly available peer-reviewed papers, theses, books, abstracts and articles
from academic publishers, professional societies, preprint repositories, universities and other
scholarly organizations.
3. Various web domains belonging to agencies with relevant Torres Strait activities, e.g. AFMA,
CRC Torres, RRRC, CSIRO. Visitors to the repository are invited to add other Torres Strait
related web domains by contacting the Repository Administrator identified on the search
engine’s webpage.

Figure 3-1. Index page showing custom repository search tool and Marlin Metadata Search tool

Figure 3-2. Repository custom Google search result page



Figure 3-3. Direct access to CMAR Data Centre

Figure 3-4. Additional page to search other websites



Figure 3-5. Direct access to other Australian Data Centres

Figure 3-6. Search libraries



Figure 3-7. Custom Google search on contributing web domains

3.3. Coordinate the moderation and listing of sensitive data and publications

During the acquisition phase it was noted that some agencies considered their works confidential
or restricted. IP works that have been identified as restricted by individual researchers and/or
TSRA staff are listed in the IP listing (APPENDIX 1: Intellectual Property). Researchers whose
confidential or restricted IP works were not already managed within a secure enterprise
environment were requested to submit their CRC TS works to the Repository. A small number of
sensitive works (2) currently reside within a restricted area of the repository under three levels
of security: password protection, ‘robot exclusion’ (no caching) and an internet index search
control file.
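
The ‘robot exclusion’ layer can be illustrated with Python's standard `urllib.robotparser` module. This is a minimal sketch only: the `/datacentre/torres/restricted/` path, the file name `report.pdf` and the rule text are hypothetical, since this report does not give the actual location or contents of the exclusion file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt rules; the real file is not reproduced in this report.
ROBOTS_TXT = """\
User-agent: *
Disallow: /datacentre/torres/restricted/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

BASE = "http://www.cmar.csiro.au"
public_page = BASE + "/datacentre/torres/CRCTS2003_06/index.htm"
restricted_page = BASE + "/datacentre/torres/restricted/report.pdf"  # hypothetical

# A compliant crawler such as Googlebot consults the rules before fetching.
public_allowed = parser.can_fetch("Googlebot", public_page)
restricted_allowed = parser.can_fetch("Googlebot", restricted_page)
print(public_allowed, restricted_allowed)
```

Robot exclusion only asks well-behaved crawlers to stay out of a directory and does not itself block access, so password protection remains the actual access control among the three levels.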
Some researchers, however, were not prepared to lodge IP that was seen as confidential or
restricted. In this situation, it was ascertained whether the current custodian provided a
secure backup environment. As the Repository Administrators were authorized only to collate lodged
IP, no action was taken other than to identify those works within both the Milestone Report and
this Final Report. See APPENDIX 1: Intellectual Property for a more detailed Task Level Table that
lists all identified IP works and their current status; those that are coloured black have been
identified as remaining with the existing custodian.
4. DISCUSSION
4.1. Facilitate the collation of CRC-TS Intellectual Property (IP)
The number of CRC TS IP works identified was higher than initially anticipated: 187 compared
with the anticipated 60 to 80. However, the lodgment rate was markedly lower, at approximately 50%.
This low rate of lodgment by Task Leaders was qualitatively attributed to two factors: the late
initiation of CRC Task 5.2 Data and Information Management, and the need by many Task Leaders to
‘prepare for the Special Issue Publication incorporating much of the CRC research outputs’.
CRC Task 5.2 Data and Information Management did not commence until June 2006. With most CRC TS
Tasks scheduled to complete at the end of June 2006, this left little time to develop the
protocols necessary to effectively lodge IP works and to familiarize Task Leaders with them. The
late commencement also limited the adoption of the report template that was specifically drafted
to provide a consistent appearance to CRC TS IP works.
The need by many Task Leaders to ‘prepare for the Special Issue Publication incorporating much of
the CRC research outputs’ also took attention away from the lodgment of IP works in a coordinated
manner. Many Task Leaders appeared to place higher priority on the publication than on securing
the IP in the repository. This prioritization was not contested.

4.2. Develop a searchable repository website


The resultant ‘hybrid’ repository - a web directory hosted by the CMAR Data Centre with a custom
Google search engine and linked directly to the CMAR Data Centre Metadata Directory - successfully
addresses the objective to develop a searchable website on a secure repository containing linked
reports, metadata and available associated data (with limited access where appropriate).
The tools developed for the ‘hybrid’ repository also offer the opportunity to provide additional
services for Torres Strait stakeholders. Additional search tools have been developed to provide
direct access to the Australian Spatial Data Directory (ASDD), Google Scholar and various web
domains belonging to agencies conducting relevant Torres Strait activities. These services provide
greater access to Torres Strait related research material than could be stored within the Data
Centre Repository alone.
Research into enterprise level repositories identified the DSpace system as superior for storing,
searching and disseminating institutional or enterprise level research material. However, the
logistics of implementing such a system within the short timeframe made it impractical.
The future success of the Repository requires its promotion as an effective tool for locating
relevant research information. By designing a search engine that is more participatory, allowing
stakeholders to also provide links to the search engine, and by developing an interface that can
be embedded within other agencies' websites, it is hoped that the Repository will be openly
adopted.
Suitable licensing for the RRRC to assign to the repository holdings has been considered. The existing
Disclaimer or the use of a ‘Creative Commons’ license - a simplified licensing regime specifically
designed for the transfer of Australian internet based information - have been provided as examples for
the RRRC to assign to the website before the IP is made available to the public at large.

4.3. Coordinate the moderation and listing of sensitive data and publications

The cooperation of TSRA staff aided the effective management of lodged research material. All
works lodged were vetted by TSRA staff to ascertain sensitivity, and appropriate website security
measures were adopted by the Repository Administrator. Some Task Leaders were not prepared to lodge
IP that was seen as confidential or restricted. As Repository Administrators were authorized only
to collate lodged IP, no action was taken other than to identify those works within both the
Milestone Report and this Final Report. APPENDIX 1: Intellectual Property identifies the IP works
(coloured black) that remain with the existing custodian.
5. BENEFITS
It was envisioned that the CRC Torres Strait online marine research repository would be a foundation
for other research within the region. Future research would be well served by ready access to extensive
past research material, to share research findings with minimal constraints yet with the expectation to
acknowledge sources. The design of the repository and its search interfaces provide both individual
researchers and organizations the opportunity to participate in providing materials to a common user
interface. Individual stakeholders are invited to lodge their works to the repository permanently served
by the CSIRO Marine and Atmospheric Research Data Centre, while those agencies with their own
inventories are invited to supply internet links to their publicly accessible directories.
Fellow research agencies serving publications and/or data are invited to provide links to the
custom search interface. The presumption, however, is that any information linked to the search
interface is publicly available. Restricted items are catered for only within the existing
repository, which is served by the CMAR Data Centre.
The Repository search interface has been designed to facilitate inclusion on other websites, and not to
be seen as owned by any one agency. The search page can be simply incorporated into the web frames
belonging to other web domains. It contains minimal parentage information as the purpose of the
repository is to simply provide access to research works irrespective of the custodian of the research.

6. FURTHER DEVELOPMENT
There will always remain a need to promote the repository to stakeholders of the Torres Strait. In
addition there will always be a need to facilitate either the lodgment of IP works to the repository or
hyperlinks to new and valued inventories of information managed by other research agencies. It is
only by the continued cooperation and participation of stakeholders that this search interface can
maximize research developments within the Torres Strait.
At present, the system is dependent on external applications and services delivered by Google,
relying on the Google web crawler applications (Googlebots) to search and index individual web
files. Though prescribed commands and an efficient web design have been followed to maximise the
instance of hits by Google’s web crawlers, the Googlebots are self managed and there is no
guarantee of a quick search of the complete repository. It has been observed that the current
indexing of new information within the repository takes between one and three weeks. This could be
improved by implementing an enterprise repository system such as DSpace.

7. ACHIEVEMENT OF OUTCOMES
The resultant Torres Strait Marine Research Repository has successfully published the CRC Torres
Strait IP works that have been lodged with Administrators. In addition, the customised search
interfaces available for future stakeholders significantly enhance the repository’s planned
outcome to provide a searchable repository of research IP works from the CRC Torres Strait.
With a defined repository now permanently maintained by the CSIRO Marine and Atmospheric
Research Data Centre, the maintenance and availability of any lodged IP - not just that of the CRC
Torres Strait - is assured, providing an enduring service to the Torres Strait research community.
The customised search interface has been developed so that it can be simply incorporated onto
and/or linked from other agencies' websites, seamlessly linking independent research efforts
between stakeholders and custodial agencies of the Torres Strait.
8. CONCLUSIONS
Overall, the outputs of the project have contributed to greater outcomes than initially
anticipated. The number of CRC IP works successfully lodged into the enduring and secure repository
was 10% higher than initially predicted, and the customised search interface provides not only a
search of the repository holdings but also of other defined external repositories.
The disappointing response from CRC Principal Investigators (50%) could have been improved had this
project commenced before the completion date of the CRC. Relating the lodgement of works to
contractual agreements and/or other pecuniary costs could also be considered.

9. RECOMMENDATIONS
A recommended priority is for the current owner of the Repository, the RRRC, to ensure all necessary
licensing requirements are in place before the repository is made available to the public at large.
Investigations have identified the ‘Creative Commons’ license, a simplified licensing regime
specifically designed for the transfer of Australian internet based information, as a possible
solution for the RRRC. The current draft website also contains a link to the CSIRO Disclaimer as an
example of current practices.
The late commencement and the coincidence with the Special Issue publication significantly limited
lodgement of intellectual property (IP) works into the repository; only 50% of identified IP works
were eventually lodged. It is recommended that any future research programs consider this
experience, provide timely services that help stakeholders adhere to protocols such as template
usage, and provide an effective protocol for researchers to lodge their works. It is also
suggested that any future contractual agreements include a clause that withholds final payment
until all works have been submitted.
The lodgement of outstanding IP to the repository remains the responsibility of current custodians.
The intellectual property (IP) listing (APPENDIX 1) identifies those works awaiting lodgement.
The now functional Torres Strait Marine Research Repository is a significant resource. It not only
incorporates and searches the IP works of the CRC Torres Strait, it also includes search interfaces
to many other Torres Strait related repositories. Though the repository’s security and ongoing
service are guaranteed by its inclusion within the CSIRO Marine and Atmospheric Research Data
Centre, it will require periodic updates and maintenance to continue to act as a valuable resource.
In addition, the customised search interface, though currently searching many already known
websites and external repositories, will require ongoing maintenance.
It is recommended that a repository administrator be assigned the responsibility to promote the
Torres Strait Marine Research Repository to stakeholders and other research agencies; to update
links to new inventories containing relevant marine research; and to ensure that sustainable
protocols are developed and followed by contributors to the Repository.
The implementation of an enterprise level repository also offers many benefits, such as an
increased rating by academic search engines and a guaranteed timely and comprehensive search. To
enhance the outcomes of the existing ‘hybrid’ search repository, it is recommended that
consideration be given to implementing DSpace, a widely recognised open source research
repository.
10. REFERENCES
CatalystIT (2003). Technical Evaluation of selected Open Source Repository Solutions.
https://eduforge.org/docman/view.php/131/1062/Repository%20Evaluation%20Document.pdf . Cited
January 2007.
Taranto, T.J. and C.R. Pitcher (2004). Torres Strait marine science: collected publications and
data, 1980-2003. DVD. CSIRO Marine Research, Cleveland, Qld.

11. ABBREVIATIONS & GLOSSARY


AFMA - Australian Fisheries Management Authority
ASDD - Australian Spatial Data Directory
CRC TS - Cooperative Research Centre, Torres Strait
CMAR - CSIRO Marine and Atmospheric Research
DSpace - an open source software package which provides the tools for management of digital assets
Googlebot - a ‘spider’ or robot that collects documents from the web to build a searchable index
IP - Intellectual Property
RRRC - Reef and Rainforest Research Centre
TSMRR - Torres Strait Marine Research Repository
TSRA - Torres Strait Regional Authority
12. APPENDIX 1: INTELLECTUAL PROPERTY
Listing of all identified IP works attributed to the CRC Torres Strait, with hyperlinks (blue) to
those works that have been lodged to the Torres Strait Marine Research Repository as of 23 March
2007. Hyperlinks coloured red are awaiting lodgment, while those coloured black have been
identified as secured with the existing custodian.
See current listing at http://www.cmar.csiro.au/datacentre/torres/CRCTS2003_06/index.htm

13. APPENDIX 2: TASK MANAGEMENT LISTING


Webpage with lodgment details as provided to Task Leaders and listing of each Task, Task Leader
names and Repository Administrator actions to acquire identified task IP. See non-cached webpage
http://www.cmar.csiro.au/datacentre/torres/CRCTS2003_06/Admin/CRCTasks.htm

14. APPENDIX 3: STAFF


Principal Investigator - Tom Taranto
Co-Investigator – Roland Pitcher

Contacts
CMAR Data Centre Manager – Tony Rees
TSRA Contact – Vic McGrath
CRC Torres Strait Contact – David Williams
RRRC Contact – Russell Reichelt