
REQUIREMENTS ANALYSIS DOCUMENT

Alaska Department of Fish & Game


Salmon Data Management System

Axiom Consulting & Design

July 2009

Revision Sheet
Release No. Date Revision Description

Requirements Analysis Document Page i


The intent of this requirements analysis is to assist in identifying ADF&G's internal and
external needs for managing key salmon data and making it easily accessible to all who have an
interest in using or understanding it. By working together, nonprofits and government agencies
can leverage more resources to undertake important yet unfunded work that benefits managers
and stakeholders alike.




FUNCTIONAL REQUIREMENTS DOCUMENT


TABLE OF CONTENTS
Page #

1.0 Background...................................................................................................................................1-1
1.0 GENERAL INFORMATION.....................................................................................................1-2
1.1 Purpose...................................................................................................................................1-2
1.3 Project Resources...................................................................................................................1-2
1.4 Requirements Gathering Methodology.................................................................................1-3
1.4.1 Isolation of User Groups......................................................................................................................1-3
1.4.2 User Interviews....................................................................................................................................1-3
1.4.3 Workshops...........................................................................................................................................1-4
1.5 Points of Contact....................................................................................................................1-5
1.5.1 Axiom Consulting & Design...............................................................................................................1-5
1.5.2 Alaska Department of Fish & Game Division of Commercial Fisheries Area E Biological Staff....1-5
1.5.3 Alaska Department of Fish & Game Division of Commercial Fisheries Data Management Staff....1-5
1.5.4 State of the Salmon..............................................................................................................................1-6
2.0 Statewide salmon information management..............................................................................2-1
2.1 Background............................................................................................................................2-1
2.2 Spatial Organization of Alaska Statewide Salmon Data.......................................................2-1
2.3 Statewide................................................................................................................................2-2
2.3.1 Existing Data Repositories..................................................................................................................2-2
2.3.2 Deficiencies.........................................................................................................................................2-2
2.4 Southeast (Region I)...............................................................................................................2-3
2.4.1 Existing Data Repositories..................................................................................................................2-3
2.4.2 Deficiencies.........................................................................................................................................2-3
2.5 Central (Region II)................................................................................................................2-4
2.5.1 Existing Data Repositories..................................................................................................................2-4
2.5.2 Deficiencies.........................................................................................................................................2-4
2.6 Arctic Yukon Kuskokwim (Region III).................................................................................2-4
2.6.1 Existing Data Repositories..................................................................................................................2-4
2.6.2 Deficiencies.........................................................................................................................................2-4
2.7 Westward (Region IV)...........................................................................................................2-5
2.7.1 Existing Data Repositories..................................................................................................................2-5
2.7.2 Deficiencies.........................................................................................................................................2-5
2.8 Current Efforts of the Computer Information Services Team (CIS)...................................2-5
3.0 PWS Salmon information management....................................................................................3-1
3.1 Background............................................................................................................................3-1
3.2 Spatial Organization of PWS Salmon Data..........................................................................3-1
3.3 Chum and Pink Salmon Aerial Surveys................................................................................3-3
3.3.1 Data Collection Protocols....................................................................................................................3-3
3.3.2 Data Processing and Report generation..............................................................................................3-3

3.3.3 Status of Historical Data......................................................................................................................3-4


3.3.4 Deficiencies.........................................................................................................................................3-5
3.4 Sockeye, Chinook, and Coho Aerial Surveys........................................................................3-5
3.4.1 Data Collection Protocols....................................................................................................................3-5
3.4.2 Status of Historical Data......................................................................................................................3-6
3.4.3 Deficiencies.........................................................................................................................................3-6
3.5 Weirs, Towers and Miles Lake Sonar Escapement...............................................................3-6
3.5.1 Data Sources........................................................................................................................................3-6
3.5.2 In Season Management and Report generation..................................................................................3-6
3.5.3 Status of Historical Data......................................................................................................................3-7
3.5.4 Deficiencies.........................................................................................................................................3-7
3.6 Age Sex Length Data.............................................................................................................3-7
3.7 Commercial Harvest Data.....................................................................................................3-8
4.0 INFORMATION SYSTEM VISION.........................................................................................4-1
4.1 Consolidation of User Needs into Measurable Goals...........................................................4-1
4.1.1 Enable All Data to Be Spatially/Temporally Explicit at Multiple Scales...........................................4-3
4.1.2 Provide Granular Queriable Access to Raw Data for User Groups....................................................4-1
4.1.3 Develop a Series of Modular Data Entry, Reporting and Data Access Tools.....................................4-1
4.1.4 Provide Standardized Metadata for Datasets......................................................................................4-2
4.1.5 Develop Standard Operating Procedures and Data Processing Routines...........................................4-2
4.1.6 Ensure Alaska Department of Fish & Game Data Management Staff can Support, Modify and
Improve Information System.............................................................................................................................4-2
4.2 Use Cases................................................................................................................................4-4
4.2.1 Use Case Actors...................................................................................................................................4-4
4.2.2 Use Case Diagrams and Actor Hierarchy...........................................................................................4-5
4.2.3 Use Case Index....................................................................................................................................4-6
4.2.4 Use Case Descriptions.........................................................................................................................4-6
5.0 OPERATIONAL PLAN..............................................................................................................5-1
5.1 Salmon Data Consolidation Overview..................................................................................5-1
5.1.1 Summary..............................................................................................................................................5-1
5.1.2 Centralizing Data Storage for Salmon Data (Tier 1)..........................................................................5-1
5.1.3 Statewide Reporting Tools...................................................................................................................5-2
5.1.4 User Interface......................................................................................................................................5-3
5.1.5 CIS Mariner System Overhaul............................................................................................................5-3
5.2 Process Flow for CIS Data Consolidation.............................................................................5-4
5.2.1 Requirements Analysis........................................................................................................................5-4
5.2.2 Educate and Enable ADF&G Staff......................................................................................................5-4
5.2.3 Generate Data Dictionary....................................................................................................................5-4
5.2.4 Assessment of All Data Repositories..................................................................................................5-4
5.2.5 Design ETL Process Logic/Add Value to Data...................................................................................5-5
5.2.6 Design Data Warehouse Prototype Schemas.......................................................................................5-5
5.2.7 Build Prototype ETL Processes...........................................................................................................5-5
5.2.8 Assess Prototypes and Plan for Next Development Phase..................................................................5-5
5.2.9 Repeat Steps 5.2.5-5.2.8...................................................................................5-5
6.0 Technical framework..................................................................................................................6-1
6.1 Data Flow Diagram 1.0 - Complete System.........................................................6-1
6.2 Data Flow Diagram 1.1 - Data Entry System.......................................................................6-4
6.3 Data Flow Diagram 1.2 - ETL Processes..............................................................................6-7

6.4 Data Flow Diagram 1.3 - Interoperability Systems............................................................6-10

1.0 Background
In 2008 funding was awarded to the State of the Salmon Program - a joint effort of the nonprofit
organizations Ecotrust and the Wild Salmon Center - and the Alaska Department of Fish &
Game's (ADF&G) Copper River and Prince William Sound Commercial Fisheries office in
Cordova to deepen ADF&G's capacity for managing its salmon population data. The Alaska
Department of Fish and Game Project (SoS-ADF&G Project) is one of four components of the
SoS-Agency Partnership Initiative (API), a working partnership among three different fisheries
agencies and the State of the Salmon Program with shared objectives for improving salmon data
access and interoperability.

The goal of the SoS-ADF&G Project is:


Create web and database systems for ADF&G staff in Cordova in order to:
Make it easier for ADF&G staff to enter, edit, retrieve, and analyze escapement, age,
sex, size and harvest data.
Provide public access to frequently requested data and information.

To meet the project goal, an explicit requirements analysis is needed to assist in project
optimization from the very beginning. A Requirements Analysis can provide a framework for
project participants to assess progress at multiple points during the production process and help
evaluate whether mid-course corrections are warranted. It also assists with the discovery and
evaluation of similar database systems, web tools, and information development activities that are
being deployed or are under development elsewhere and which could be adapted for, or should be
linked to, this project. For instance, database architecture, data dictionaries, and/or programming
developed for the Arctic-Yukon-Kuskokwim Salmon Database Management System, the Bristol
Bay Science and Research Institute's Age, Weight, Length in-season data reporting tool, or the
Integrated Status and Effectiveness Monitoring Program's data management applications may be
relevant. These kinds of knowledge and technical transfers will not only help save time and money
but also promote data compatibility, systems interoperability, and coordination across
jurisdictions.

The original scope of work for the Requirements Analysis included a User Needs Assessment; a
Data Flow Diagram; Measurable Goals; Use Case Descriptions; Software Requirements
Specification; and Prototype Development. However, modifications to the R.A.'s scope are
warranted in light of the efforts of the newly created ADF&G Computer Information Services
(CIS) team. The focus of the requirements analysis was to assess and plan a salmon data entry,
data access and reporting suite of web based software tools for user groups within the Prince
William Sound (PWS) area. Integration with existing ADF&G data repositories and software
systems was a guiding principle for the effort. In January 2009 the CIS team was formed within
ADF&G to perform analogous tasks to the SoS-ADF&G Project except at the statewide scale
and for more data types and species.

As a result, the scope of this Requirements Analysis was subsequently broadened to the entire state
of Alaska so as to put the SoS-ADF&G Project appropriately within the context of the CIS effort. It
excludes software specifications and the development of a prototype at this time. Its intent is to
provide information adequate to guide strategic project planning decisions and to synchronize the
SoS-ADF&G Project and the CIS effort.

1.0 GENERAL INFORMATION

1.1 Purpose
The intent of this document is to provide a broad assessment of salmon data management practices at the
Alaska Department of Fish & Game (ADF&G) and some strategic guidance for improving those systems to
benefit all information users. Documenting system requirements from the perspective of user groups
provides a solid base for the development of use cases, measurable goals, data flow diagrams and other
software design methodologies. These methodologies will enable software developers at ADF&G to more
effectively plan, design, code, implement, test and maintain the new salmon data management system.

1.3 Project Resources


The following list of informational resources was utilized for the development of this requirements analysis.

RA Source Diagrams - www.axiomalaska.com/CIS_SoS/RADiagrams.zip


The MS Visio source files for data flow diagrams and use cases.

PWS Historical Data Archive - www.axiomalaska.com/CIS_SoS/PWSHistoricDataArchive.zip


The file includes a compilation of PWS historical salmon metric data, Annual Management Reports and internal
data processing documentation. More specifically, this archive contains tabular files storing historical
weir data at various sites in addition to the Miles Lake sonar escapement data. This archive also
contains aerial survey data for coho, Chinook and sockeye salmon. Standard operating procedures for the
pink and chum aerial survey program and in-season reporting systems are described within a section of the
archive. The legacy SASPop escapement Access database is also included.

PWS Salmon Data Matrix - www.axiomalaska.com/CIS_SoS/PWSDataMatrix.xls


Matrix of PWS salmon data sources, metadata, reporting and visualization needs.

PWS Pink and Chum Aerial Survey Data Preparation and Reporting Procedures -
www.axiomalaska.com/CIS_SoS/PWSChumPinkAerialSurveyDataProcessing.pdf
Operating procedures for data preparation and processing for in-season reporting of pink and chum aerial
survey data. Example report output is also included in this resource.

PWS Salmon Data Summary - www.axiomalaska.com/CIS_SoS/MoffitPWSSummary.pdf


Presentation put together by Steve Moffitt summarizing salmon data management in PWS by ADF&G.

CIS Workshop Materials - www.axiomalaska.com/CIS_SoS/CISDataWorkshop.zip


Materials and presentations utilized during the ADF&G data management workshop at Axiom offices in
March. Also includes some products of the workshop that were further developed in the requirements
analysis.

CIS Workshop Audio - www.axiomalaska.com/CIS_SoS/CISDataAudio.zip


Partial audio recording of the discussions and presentation at the ADF&G data management workshop held
at Axiom offices in early March.


1.4 Requirements Gathering Methodology

1.4.1 Isolation of User Groups


Users of the Salmon Data Management System were sorted logically into two categories: Customers and
Stakeholders. Customers are defined as 'the primary beneficiaries of project outcomes,' while Stakeholders
are much more on the periphery. Customers are directly impacted in their everyday business operations by
the capabilities of the information system, and thus their success is tied intimately to the overall success of
the proposed system. Stakeholders are much less tied to the system but will benefit from the added
functionality that the new proposed information system will provide. The information system must
ultimately meet as many of the needs of its users as possible but be focused primarily on meeting the core
needs of the customer group. For the purpose of this analysis the following groups were interviewed:

Customers
Copper River/PWS ADF&G Commercial Fisheries Staff
Statewide ADF&G Commercial Fisheries Staff
State of the Salmon Program
Ecotrust Copper River program

Stakeholders
Processors/buyers
Bioinformatics research community
Policy makers
Non-Governmental Organizations (NGOs)
Commercial, recreational, subsistence and personal users
Funding Organizations

1.4.2 User Interviews

The initial step of the requirements analysis involved interviewing all potential user groups and
documenting specific user needs. A concrete set of standardized interview questions was drafted to ensure
consistent and meaningful responses from the pool of potential users. These interviews provided a
foundation for the subsequent steps of the requirements analysis.

Follow-up interviews were performed to reengage various user groups. These interviews were much less
structured and involved more granular discussions of issues. Additional interviews were conducted
primarily with the various components of the ADF&G customer group (Area E Biologists and Statewide
Programmers) and provided more specific information on an as-needed basis.

1.4.3 Workshops

A workshop was held with ADF&G programming staff to isolate system requirements according to their
core needs. Both regional data management staff and statewide CIS team members attended the workshop.
This workshop included discussions of existing salmon data management systems and data repositories,
regional user needs and data management efforts in progress. An overarching three-year implementation
plan was developed as a result of this workshop. The three-year implementation plan is further detailed in
Section 4.0 INFORMATION SYSTEM VISION.


1.5 Points of Contact

1.5.1 Axiom Consulting & Design

Rob Bochenek
Information Architect
Anchorage, Alaska
(907) 230-0304
rob@axiomalaska.com

Shane StClair
Software Engineer
Anchorage, Alaska
(360) 450-3574
shane@axiomalaska.com

1.5.2 Alaska Department of Fish & Game Division of Commercial Fisheries Area
E Biological Staff

Steve Moffitt
PWS/CR Area Research Biologist
Cordova, Alaska
(907) 424-3212
steve.moffitt@alaska.gov

Glenn Hollowell
Area Management Biologist
Cordova, Alaska
(907) 424-3212
glenn.hollowell@alaska.gov

1.5.3 Alaska Department of Fish & Game Division of Commercial Fisheries Data
Management Staff

Kathleen Jones
Data Processing Manager II
Headquarters
Juneau, Alaska
(907) 465-4753
kathleen.jones@alaska.gov

Tracy Olson
Analyst/Programmer IV
Headquarters
Juneau, Alaska
(907)465-6350
tracy.olson@alaska.gov

Holly Krenz
Analyst/Programmer IV
Region III
Anchorage, Alaska
(907)267-2418
holly.krenz@alaska.gov

Heath Kimball
Analyst/Programmer III
Region II
Anchorage, Alaska
(907)267-2894
heath.kimball@alaska.gov

Ivan Show
Analyst/Programmer V
Region I & II
Juneau, Alaska
(907)465-6110
ivan.show@alaska.gov

Scott Johnson
Analyst/Programmer IV
Region I
Douglas, Alaska
(907)465-4242
scott.johnson@alaska.gov

1.5.4 State of the Salmon

Cathy P. Kellon
Ecotrust
Portland, Oregon
(503)467-0791
cathy@ecotrust.org

Rich Lincoln, Director
Wild Salmon Center
Portland, Oregon
(971)255-5575
rlincoln@wildsalmoncenter.org

P.S. Rand, Ph.D.
Wild Salmon Center
Portland, Oregon
(971)255-5546
prand@wildsalmoncenter.org


2.0 STATEWIDE SALMON INFORMATION MANAGEMENT


This section provides a brief summary of current data management practices for salmon monitoring data
produced by ADF&G on a statewide scale. Historically, management of salmon data has been partitioned
across four management regions, with very little coordination between regions. The following content
describes how data are stored and what software systems exist in each region. Also included is
a section summarizing the strategies of the Commercial Fisheries Business Intelligence Group (BIG). The
BIG team was recently created (January 1, 2009) to develop standardized statewide data entry and
reporting tools as well as develop a data management framework for centralizing the storage of fisheries
data. Existing ADF&G data management systems and repositories have been documented to the best
ability of the Axiom consultants. This information was gathered through interviews and workshops with
ADF&G programmers and data managers.

2.1 Background
According to Alaska's constitution, state government must manage its natural resources to the maximum
benefit of its people and manage its renewable resources on a sustained yield basis. These directives
provide the impetus for ADF&G's management of statewide salmon resources. ADF&G has utilized an
escapement goal based fisheries management system developed by the University of Washington since
1959. Effective sustainable yield management of salmon runs requires collection of monitoring data on
salmon escapement and commercial and subsistence harvests. These data exist in various formats and
states of accessibility among the four commercial fisheries management regions.

2.2 Spatial Organization of Alaska Statewide Salmon Data

Figure 1. ADF&G Division of Commercial Fisheries management region boundaries. From
http://www.cf.adfg.state.ak.us/regnmap.php.


The ADF&G Division of Commercial Fisheries partitions the state of Alaska into four management
regions: Southeast (Region I), Central (Region II), Arctic-Yukon-Kuskokwim (AYK or Region III), and
Westward (Region IV).

2.3 Statewide

2.3.1 Existing Data Repositories


State commercial harvests for salmon dating back to 1969 are recorded in a fish ticket database named
Zephyr. The harvest data exist in two distinct database structures: one containing data from before 1974
and another covering 1974 to the present. Currently, paper copies of fish tickets must be entered by ADF&G technicians
post-season. A more advanced harvest tracking system called eLandings is currently in development by an
interagency team, which will allow for in-season collection of harvest data, including electronic reporting
by tenders and processors. The system for salmon is in beta this year with select processors and tender
operators.
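The split between the pre-1974 and 1974-present fish ticket structures implies consolidation work of the kind sketched below: records from two differently shaped legacy sources are mapped into one common form before analysis. All field names here are hypothetical assumptions for illustration, not the actual Zephyr schema.

```python
# Hypothetical sketch: normalize fish ticket records from two legacy
# structures (pre-1974 and 1974-present) into one common shape.
# Field names are illustrative assumptions, not the real Zephyr schema.

def normalize_pre_1974(rec):
    """Map an assumed pre-1974 record to the common shape."""
    return {
        "year": rec["yr"] + 1900,       # assumed two-digit year field
        "species": rec["spec_code"],
        "pounds": rec["lbs"],
        "stat_area": rec["area"],
    }

def normalize_modern(rec):
    """Map an assumed 1974-present record to the common shape."""
    return {
        "year": rec["year"],
        "species": rec["species_code"],
        "pounds": rec["net_weight_lbs"],
        "stat_area": rec["stat_area"],
    }

def merged_harvest(pre_1974_rows, modern_rows):
    """Yield every record in one consistent structure."""
    for rec in pre_1974_rows:
        yield normalize_pre_1974(rec)
    for rec in modern_rows:
        yield normalize_modern(rec)
```

A reporting tool written against the common shape can then ignore which legacy structure a given record came from.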

eLandings uses a unified Java enterprise codebase and Oracle database for its various client and server
components. These client components include a web interface for processors, a desktop application for
catcher/processors, and a desktop application for ADF&G staff and enforcement. Disconnected client
applications for at-sea processors (SeaLandings) submit data to the central server via email. Also in
development is a separate desktop application for tender operators called 'tLandings', commonly referred to
as the 'Tender Workstation'. Data are transferred from the Tender Workstation to eLandings via a
thumb drive delivered to the processor.

The ADF&G Mark, Tag, and Age (MTA) Laboratory also houses several statewide data repositories,
including otolith and coded wire tag (CWT) mark/recapture databases. In addition, the MTA Laboratory
hosts a statewide age, sex, and length (ASL) data repository. Management regions can currently upload
ASL inventories and datasets into the repository provided they conform to a published specification.
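Conformance checks of this kind are usually automated before upload. The sketch below shows one possible shape for such a pre-upload validator; the required fields and codes are illustrative assumptions, not the MTA Laboratory's actual published specification.

```python
# Hypothetical pre-upload check for ASL records. Field names and codes
# are assumptions for illustration, not the published ASL specification.

REQUIRED_FIELDS = {"sample_date", "species", "age", "sex", "length_mm"}
VALID_SEX_CODES = {"M", "F", "U"}

def validate_asl_record(rec):
    """Return a list of problems found in one ASL record (empty if clean)."""
    problems = []
    missing = REQUIRED_FIELDS - rec.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if rec.get("sex") not in VALID_SEX_CODES:
        problems.append(f"invalid sex code: {rec.get('sex')!r}")
    length = rec.get("length_mm")
    if not isinstance(length, (int, float)) or length <= 0:
        problems.append(f"invalid length: {length!r}")
    return problems
```

Rejecting malformed records at the regional office, before they reach the statewide repository, is one way to address the data quality concerns that have limited use of the repository.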

The ADF&G Gene Conservation Laboratory maintains a statewide genetics database called Loki. Loki is
written in Java with an Oracle database backend. The application was rewritten in 2005 in the Java Swing
framework using the Oracle Application Developer Framework (ADF) and Business Components. This
application is used to store molecular genetic markers used in stock identification. The system is currently
being migrated to a Flex-based user interface with Java middleware and an Oracle database repository to
improve functionality and meet current departmental technology standards.

2.3.2 Deficiencies

ADF&G staff continue to input data to the legacy commercial harvest database (Zephyr) during the
transition to eLandings. The Zephyr desktop application is the major input and reporting tool for historical
commercial harvest of salmon and is written in a product which has reached the end of its life cycle.

Currently, reporting capabilities in the eLandings system are underdeveloped. The unified codebase
underpinning the system requires that a cumbersome redeployment be made whenever new reports are
created, making reporting improvements labor intensive. The initial intent of the system was for the
interagency participants to report from their historical database of record, and not the eLandings data.

Management of geospatial data in ADF&G is currently weak. Locations are stored as hierarchical codes
rather than spatially explicit data. Many locations, especially fishery statistical areas, have changed over
time yet retained the same name and code. Documentation of these changes and adjustment of historical
data to fit new location definitions are inconsistent across regions.
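One common remedy, sketched below under stated assumptions, is to pair each location code with an explicit geometry and an effective date range, so that a statistical area can be redefined over time without losing its history. Table and column names are hypothetical; geometries are shown as Well-Known Text for illustration.

```python
# Hypothetical sketch: tie each hierarchical location code to an explicit
# geometry plus an effective date range. Schema names are assumptions.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE stat_area_version (
        code          TEXT NOT NULL,   -- existing hierarchical code
        valid_from    TEXT NOT NULL,   -- ISO date definition took effect
        valid_to      TEXT,            -- NULL = still current
        boundary_wkt  TEXT NOT NULL    -- explicit geometry as WKT
    )
""")
conn.executemany(
    "INSERT INTO stat_area_version VALUES (?, ?, ?, ?)",
    [
        ("212-10", "1969-01-01", "1984-12-31", "POLYGON((0 0,1 0,1 1,0 0))"),
        ("212-10", "1985-01-01", None,         "POLYGON((0 0,2 0,2 2,0 0))"),
    ],
)

def definition_for(code, on_date):
    """Return the boundary in force for a code on a given ISO date."""
    row = conn.execute(
        """SELECT boundary_wkt FROM stat_area_version
           WHERE code = ? AND valid_from <= ?
             AND (valid_to IS NULL OR valid_to >= ?)""",
        (code, on_date, on_date),
    ).fetchone()
    return row[0] if row else None
```

A query for a code plus a collection date then returns the boundary definition actually in force when the data were collected, even though the code itself never changed.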


Although a statewide ASL repository is maintained by the MTA Laboratory, use of this system outside of
Region I has been minimal due to data quality concerns or technological barriers. ADF&G data
management staff report that efforts will be made in the near future to increase usage of the repository.
Public interfaces to these data are somewhat difficult to use, and output is only possible as a tab-separated
text data dump.

2.4 Southeast (Region I)

2.4.1 Existing Data Repositories


Region I has utilized a data management application called Integrated Fisheries Database
(IFDB)/Alexander since 1994. Alexander is written in the Centura programming language, utilizes an
Oracle database, and includes both a server component and client desktop applications. The application
includes data management forms and approximately 125 preconfigured reports for salmon in addition to
similar tools for other fisheries. The application has approximately 130 users, consisting of all Region I
commercial fisheries staff.

Various salmon datasets are managed by Alexander. Commercial harvest data dating back to 1969 is
imported from the statewide Zephyr fish ticket database on a nightly basis. In-season commercial harvest
data are also entered to manage the seine fishery. Data on fishery openings are available back to 1969.
Personal use and subsistence data are available from 1985 to the present. Troll fishery performance data
are stored in the system for use in in-season catch estimation. 200,000 aerial survey data records dating
back to 1960 exist in the system. Weir data also exists for about 75 weirs dating back to the early 1900s.
Approximately 4 million ASL biometric records dating back to 1982 also exist in the system. Finally,
historical pink salmon sex ratios are available for in-season run timing comparisons.

The Alexander system is outdated and regional staff are in the process of developing a web-based
replacement system called Zander. Zander's user interface will be developed in Flex 3 and will access an Oracle
database via Java middleware. These technologies follow Fish and Game departmental technology
standards.

2.4.2 Deficiencies
Region I's data management system currently lacks fishwheel data from the Chilkat and Taku Rivers,
which are used for in-season management. Troll fishery log book data are also missing from the system.
Regional data managers noted the lack of a mobile ASL data collection application and suggested that this
might be a good candidate for a statewide development project. Current reporting capabilities
also need to be expanded. Region I's data management system cannot currently store or display geospatial
data. As with other regions, locations in collected data are recorded as codes and do not have
specific associated geospatial data.

2.5 Central (Region II)

2.5.1 Existing Data Repositories


Region II uses a catch and escapement tool called Mariner. Mariner is a collection of four distinct
applications and includes components for harvest data, escapement data, and news release management.
The application was created for PWS and has been modified for use in Cook Inlet and Bristol Bay. Its news
release system is also utilized by Region III. The application is written in PHP and uses an Oracle database.

Functional Requirements Document Page 3

Managers generate daily reports from Mariner to inform management decisions, and the availability and accuracy of these data are critical. The Mariner system also powers several harvest, escapement, and run timing reports on the Region II public website. A developer was hired by ADF&G in early 2009 to maintain and continue development of the Mariner system. Efforts are being made to integrate this system with the new statewide eLandings harvest reporting system.

Other specialized management systems include Sedna for the Homer groundfish and shellfish fisheries and
the Fisheries Data Management System (FDMS) for Bristol Bay salmon. The FDMS includes a mobile
data collection application and is used to streamline the collection of ASL, scale, and genetic samples.

See section 3 for descriptions of other PWS specific data management systems.

2.5.2 Deficiencies
ADF&G staff are currently assessing the status of the Mariner system and evaluating user needs. Fisheries managers have commented that the news release system is inflexible and in need of updates.

2.6 Arctic Yukon Kuskokwim (Region III)

2.6.1 Existing Data Repositories


Region III has aggregated salmon ASL; aerial survey; tower, weir, and sonar escapement counts; and
subsistence data into a data repository and management system. Datasets in this system generally extend
back to the early 1960s. Region III has completed a comprehensive data salvage effort, and all available
historical data has been digitized from paper archives. The resulting electronic data are stored in a
Microsoft SQL Server database. A public web interface and a desktop management client have been
developed in ASP.Net. The web interface allows for selection and download of raw data and a few summary
reports. The desktop client tool is intended for internal use by ADF&G and allows for entering and editing
new data.

Region III must contend with internet connectivity issues due to the remote nature of many of its field
camps, and for this reason several disconnected in-season management tools developed in Microsoft Access
and Excel are still in use. Data from these tools are imported into the central historical repository post-
season. Data from various radio telemetry, capture/recapture, and test fishery projects are stored in Access
databases and not yet included in the central repository.

2.6.2 Deficiencies
Reporting and data visualization capabilities of Region III's data management system are currently very limited. Raw data can be downloaded, but high-level summary data products are not available. Data management staff indicate that a more user-friendly data import process is also desirable. Development of a catch and escapement management tool capable of historical comparison that can function without internet connectivity would be ideal for Region III. Region III's salmon data management system also lacks a geospatial component, and all location data are currently stored as codes rather than explicit geospatial data; data management staff indicate that management of poorly defined and dynamic location codes is a major problem.

2.7 Westward (Region IV)

2.7.1 Existing Data Repositories

Region IV is currently in the process of migrating salmon data into a relational database system from various R:BASE, Access and Excel electronic files. This project is still in the planning stage and much work remains to be done. Region IV collects harvest, ASL, daily escapement, and aerial survey data.

2.7.2 Deficiencies
Region IV does not yet have a functional centralized database. Salmon data cannot be accessed on the web;
neither raw nor summarized outputs are available. The current lack of a centralized system leaves datasets
prone to organizational, versioning and quality control problems.

2.8 Initiatives of the Computer Information Services Team (CIS)

The CIS team's long-range data management goal is to have a single unified data warehouse for all ADF&G data with shared reporting tools. A project team named the Business Intelligence Group (BIG) has been formed within CIS and is investigating possible data warehousing architectures, surveying user needs, and evaluating several business intelligence reporting tools to interface with the proposed data warehouse.

Another project team within CIS is the eLandings team. This team is focused on continuing development of
the statewide commercial harvest tracking system: eLandings (see 2.3.1). Another effort called the Standard
Specimen ID (SSID) project aims to assign system-wide unique identifiers to sample efforts at their finest
level of granularity (e.g. fish, aerial survey sightings, crab pots) to allow linking of samples between
information systems. The existing Mariner data entry and reporting system is currently being overhauled.
More information concerning CIS team efforts and strategic plans is available in section 4.

3.0 PWS SALMON INFORMATION MANAGEMENT

This section details current data management practices for salmon monitoring data produced by the
Cordova ADF&G office for in-season management. The following content provides a description of the
ways in which data are acquired, processed and finally distilled into data products which then assist area
biologists in managing salmon fisheries, monitoring long-term trends, and developing forecasts. Current
interactions with existing ADF&G data management systems and repositories have been documented to the
best ability of the Axiom consultants. This information was gathered through interviews and workshops
with relevant ADF&G staff (Area E biologists and ADF&G statewide programmers). The current lack of enterprise-level data management methods largely prevents user groups outside the small group of biologists at the Cordova ADF&G office from accessing and utilizing salmon metric data.

3.1 Background
ADF&G has actively managed Alaskan salmon resources since the early 1960s. ADF&G's current escapement survey effort in PWS comprises two distinct aerial survey designs (one for chum and pink salmon and the other for sockeye, coho, and Chinook salmon) and a series of weir, tower and sonar projects. Commercial harvest data are also monitored closely. However, data entry, processing and reporting are cumbersome and require duplication of effort.

3.2 Spatial Organization of PWS Salmon Data


ADF&G biologists have broken the PWS area into 10 management areas:

Bering River
Coghill
Copper River
Eastern
Eshami
Montague
Northern
Northwestern
Southwestern
Unakwik

The following diagram (Figure 2) provides a spatial representation of the management areas and the locations of pink salmon hatcheries and in situ sampling locations.

Figure 2. Prince William Sound Management Area commercial fishing districts, salmon hatcheries,
weir locations, and Miles Lake sonar camp.

Additionally, the department has further delineated the PWS area into statistical reporting areas. The following diagram (Figure 3) displays the spatial bounds of the smaller statistical reporting areas contained within each management area.

Figure 3. Prince William Sound Area showing commercial fishing districts and statistical reporting areas.

These spatial areas are important for understanding reporting requirements, management decisions and historical archived data for ADF&G Prince William Sound. More information on how the Cordova ADF&G office documents its salmon metric information can be found in the annual management report archive contained in the historic PWS data archive.

3.3 Chum and Pink Salmon Aerial Surveys

3.3.1 Data Collection Protocols

Between the months of June and September, biologists fly aerial surveys once per week to count salmon
escapement in 215 index streams. Each weekly flight surveys all 215 streams. Each index stream is split
into three sections corresponding to the bay, mouth and stream extent, and fish counts are grouped into each
section for every index stream. Fish counts are recorded for each targeted species (chum, pink and others).
GPS-derived flight trackline data are also recorded and archived. Data are collected on paper forms and entered into an R:BASE data entry program at the Fish & Game office.

3.3.2 Data Processing and Report generation

Once data have been entered into the R:BASE interface, they are exported to a comma-separated value (CSV) output file and run through a Fortran application to compare in-season observations against the historical data archive. Five in-season reports are produced from the Fortran application for each species.
These reports include the following for chum and pink salmon:
Daily counts per subdistrict
Cumulative counts per subdistrict
Daily counts per stream
Cumulative counts per stream
Weekly counts per subdistrict

The procedure for preparing, loading and processing the data is well documented in the pink salmon reporting procedure section of the PWS historic data archive.

The reports compare actual observed counts in the various sections of stream (bay, mouth, stream and cumulative) against forecasted counts and generate a percent deviation metric that represents forecast accuracy. These outputs are produced for each individual stream system as well as for larger groups of stream systems. Data for retrospective analysis are limited to either even or odd years due to the two-year life cycle of pink salmon. An example of the above reporting structure can be found in the PWS pink and chum aerial survey data preparation and reporting procedures resource.
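The percent deviation metric described above can be sketched as follows; the exact formula used by the Fortran application is not documented here, so this definition is an assumption:

```python
def percent_deviation(observed, forecast):
    """Percent deviation of an observed count from its forecast
    (assumed formula: positive when the run is ahead of forecast)."""
    if forecast == 0:
        raise ValueError("forecast count must be nonzero")
    return 100.0 * (observed - forecast) / forecast

# e.g. 55,000 fish observed against a 50,000-fish forecast -> +10%
deviation = percent_deviation(55_000, 50_000)
```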

3.3.3 Status of Historical Data


Historical escapement data for various species for the years 1960-1999 were standardized and loaded into an ADF&G SASPop database (Microsoft Access 2000) in a format that the department created in the late 1990s to standardize all escapement data. Inspection of the SASPop database reveals that its structure is not efficiently normalized; the overuse of relationships between table entities is illustrated in the diagram of the SASPop database structure (Figure 4). Data from 2000 forward are instead stored in an R:BASE database. Historical data can be found in the PWS historic data archive.

Figure 4. Entity diagram of SASPop historical escapement data repository.

3.3.4 Deficiencies
Data preparation for in-season reporting of chum and pink aerial survey data is very cumbersome and inefficient. Report generation requires transforming data into various transfer formats to facilitate movement between the R:BASE data entry system and the Fortran analysis program that builds the reports for in-season management decisions. There is no functional historical data architecture that allows easy access to raw data, custom queries and data summary products.

3.4 Sockeye, Chinook, and Coho Aerial Surveys

3.4.1 Data Collection Protocols

Between the months of June and October, biologists fly approximately 20 aerial surveys over 84 index streams to count escapement for the sockeye, Chinook and coho species. Data are collected via a PDA application developed by PWS ADF&G biologist Glenn Hollowell. Aerial flight GPS tracks are recorded, and point-based spatial data can be associated with individual fish groups through comparison of the PDA application data with a shapefile generated from the aerial survey flight track data. Two shapefiles are produced for each survey effort: one for the survey track and the other for the fish groups and counts observed.

3.4.2 Status of Historical Data


Data exist in a spatially enabled electronic format (see 3.4.1) from 2007 forward, but all other historical aerial survey data (1950s–2006) are stored in aerial survey index area report forms. These report forms document observations of fish in the 84 index streams in a tabular format with species as columns and index streams as rows. Some more recent data exist in electronic formats, but older data are on paper forms. Historical data can be found in the PWS historic data archive.

3.4.3 Deficiencies
Aerial survey data for sockeye, Chinook and coho salmon exist in a large number of formats, which makes analyzing and utilizing the data over long time series difficult. In addition, the most current data (2007 forward) are not tied together well because each survey result exists in its own separate electronic file. A centralized spatial database could be developed to organize and enable access to this information. Data entry tools also do not exist for these data types.

3.5 Weirs, Towers and Miles Lake Sonar Escapement

3.5.1 Data Sources

Data are collected to measure escapement for sockeye and coho salmon at a series of weirs in the PWS
area. These sites include:

Coghill Weir – Sockeye (1974–current)
Eshami Weir – Sockeye (1954–current)
Shrode Weir – Sockeye (1957–1974, intermittent)
Long Lake Weir – Sockeye and Coho (1974–2006)
Tanada Weir – Sockeye (2001–2006)
Gulkana River Tower – Chinook (2002–current)

Data from weir projects are collected and recorded as daily summaries (i.e. total number of fish species
observed passing through the weir per day).

Salmon escapement is also measured using sonar devices at Miles Lake. Data exist from 1978 to the present. Escapement is measured at the entrance to Miles Lake, and salmon species are undifferentiated.

3.5.2 In Season Management and Report generation

Tower and weir data are utilized in an ancillary fashion during an active fishery. The Miles Lake sonar
project is heavily relied upon to ensure that daily escapement goals are being met for the Copper River
system. Data entry for these data types is primarily a manual process that involves phoning in daily
escapement summaries. Data are transcribed multiple times into various spreadsheets which are then used
for immediate management decisions.

3.5.3 Status of Historical Data

Historical data for weir and Miles Lake sonar escapement metrics are stored in standardized Excel
workbooks. Though each workbook contains a unique spreadsheet structure, the following components are
generally available for all data sources:
Actual Daily Counts – number of fish counted moving upstream for each day, for historical and current years
Actual Daily Cumulative – cumulative total number of fish that have been observed moving upstream that season, for each day, for historical and current years
Daily Percentage – percentage of the total run observed for each day, for historical and current years
Daily Cumulative Percentage – cumulative percentage of the total number of fish that have been observed moving upstream that season, for each day, for historical and current years
Graphs depicting all of the above relationships for historical and current years

Historical data can be found in the PWS historic data archive.
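The workbook quantities listed above are simple transformations of the daily count series; a sketch using made-up counts (the real workbooks would draw these from entered weir or sonar data):

```python
from itertools import accumulate

# Made-up daily weir counts for one short season
daily = [120, 340, 560, 410, 220]

total = sum(daily)                               # total run observed
cumulative = list(accumulate(daily))             # Actual Daily Cumulative
daily_pct = [100 * c / total for c in daily]     # Daily Percentage
cum_pct = [100 * c / total for c in cumulative]  # Daily Cumulative Percentage
```

The final cumulative percentage reaches 100 by construction, which is what the workbook graphs plot against historical run-timing curves.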

3.5.4 Deficiencies

Serious inefficiencies exist in the methods by which weir and sonar data are recorded, stored and utilized
for in-season management. Data are entered by hand into Excel spreadsheets and then manually
manipulated to create graphs for management use. Each new management season requires preparation of
the master workbook beforehand. Data sharing and distribution are problematic. Multiple data entry staff
manipulating the same data at the same time leads to bifurcations. Comparison of real-time data to
historical data requires manual processing, and other types of analysis are very limited. Submitting data to
statewide salmon metric repositories is cumbersome and requires duplication of data entry effort.

3.6 Age Sex Length Data

ASL data have been collected in the PWS area continuously since the early 1950s. There are approximately 575,000 ASL records for multiple species. A majority of the data are for the sockeye species (395,000 rows), with smaller numbers for the coho, Chinook and chum species. These data have been organized into a series of Microsoft Access databases and Excel files in formats dictated by ADF&G data managers. ASL data are not heavily relied upon in current in-season management practices. There is no automated data entry and access system for these data. Data are manually entered into Excel spreadsheets and stored locally.

3.7 Commercial Harvest Data

Commercial harvest data in PWS are handled in a manner consistent with other areas (see section 2.3).



4.0 INFORMATION SYSTEM VISION


This section was developed from information gathered during user interviews and the ADF&G workshop. Please note that it attempts to lay out a sample information systems vision for statewide salmon data consolidation for the sole purpose of identifying how the SoS-ADF&G project might best add value to CIS efforts and meet the needs of ADF&G staff in Cordova. The analysis is not intended to presume what the ultimate statewide vision will be. To assist the reader, sidebar notations identify components that are understood to be the purview of the CIS group or highlight areas with special potential for collaboration or added value.

As noted above, the following section outlines a conceptual vision of the ADF&G salmon data management system for the purpose of framing and integrating the project's requirements analysis. First, measurable goals are isolated from user needs identified during the interview process. Second, use cases are isolated to envision how users will interact with the system to meet their individual needs.

4.1 Consolidation of User Needs into Measurable Goals


Information gathered from the initial user interviews was combined into a series of coherent, measurable goals. This process involved merging analogous user needs and reconciling conflicts between user needs in opposition to one another. The combined list of reconciled and consolidated user needs was then distilled into a much smaller set of measurable goals, treating individual user needs as clues to larger overall goals that the system will need to meet. The development of succinct measurable goals will allow system prototyping and iterative development phases to commence.

4.1.1 Provide Granular Queryable Access to Raw Data for User Groups

Although summarized data products will be the desired output for most user groups (e.g. the general
public, fisheries managers), other groups including the research and academic communities need access to
raw data. While summarized data products provide an efficient high level view that allows easy ingestion of
datasets by consumers with varying degrees of expertise, expert users need raw data for analysis, modeling
and custom visualizations. As with summarized data products, raw data should ideally be accessed through
a universal query interface where users can select datasets, geospatial extent, temporal extent and data
output format.
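As a sketch of what such a universal query interface might accept, the framing parameters could be modeled as a single structure; the field names below are assumptions for illustration, not a specified API:

```python
from dataclasses import dataclass

@dataclass
class RawDataQuery:
    """Hypothetical parameter set for the universal raw-data query interface."""
    dataset: str        # e.g. "aerial_survey", "asl" (illustrative names)
    bbox: tuple         # geospatial extent: (min_lon, min_lat, max_lon, max_lat)
    start_year: int     # temporal extent, inclusive
    end_year: int
    output_format: str = "csv"  # or "shapefile", etc.

# A user requesting PWS-area aerial survey data for the full time series
q = RawDataQuery("aerial_survey", (-148.5, 60.0, -145.5, 61.3), 1960, 2008)
```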

4.1.2 Develop a Series of Modular Data Entry, Reporting and Data Access Tools

Despite the general similarity of salmon datasets collected statewide, efforts to develop data entry, reporting
and access components are currently duplicated across the four management regions. Regional data
management staff operate mostly in isolation from other regions. Sharing of developed technologies
between regions is minimal; when sharing does occur it often requires cumbersome modifications to
existing applications. The resulting functionality of modified applications is often impaired because existing
applications were not developed with modularity in mind. Currently public access points for commercial
fisheries data are spread out in various places on the ADF&G commercial fisheries website and utilize a
variety of dissimilar user interfaces.

The development of modular and adaptable data entry, reporting and access components would reduce
effort duplication between management regions and allow regional data management staff to focus on the
fulfillment of data management needs that are unique to their region. Modular shared data entry systems
would allow for standardized quality control and geospatial management. Modular reporting and data
access tools would provide internal and public data interfaces with consistent look, feel and functionality
across regions, which would in turn decrease the learning curve for users who are new to the system.

4.1.3 Provide Standardized Metadata for Datasets

While providing intuitive access to raw data and data summary products is important, these datasets are of limited value without accompanying descriptive metadata records. Federal Geographic Data Committee (FGDC) compliant metadata records are required for all federally funded projects. Even when not required, the production of FGDC metadata is considered a best practice for all projects, since it allows for a dataset's inclusion in data clearinghouses, where it can be discovered and used by others. Standardized metadata links agency-collected data to the rest of the world; datasets without proper metadata are subject to misinterpretation by users, isolation from other associated datasets, exclusion from analyses and visualizations where they would have been applicable, and possible extinction when they cannot be identified or verified. Besides allowing for data discovery, metadata can also communicate detailed data collection methods, notes on data quality and other important aspects that are not captured in the data itself. Ideally, a standards-based data dictionary or metadata ontology should be developed to explicitly document data syntax and quantitative meaning for salmon information.
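A heavily abbreviated sketch of what an FGDC CSDGM-style record might look like, built with the standard library; a compliant record requires many more elements than shown, and the content values here are illustrative only:

```python
import xml.etree.ElementTree as ET

# Skeleton of an FGDC CSDGM record: identification info with a citation
# title and an abstract. Element names follow the CSDGM standard.
meta = ET.Element("metadata")
idinfo = ET.SubElement(meta, "idinfo")
citeinfo = ET.SubElement(ET.SubElement(idinfo, "citation"), "citeinfo")
ET.SubElement(citeinfo, "title").text = "PWS pink and chum aerial survey counts"
descript = ET.SubElement(idinfo, "descript")
ET.SubElement(descript, "abstract").text = (
    "Weekly aerial escapement counts for index streams in Prince William Sound.")

record = ET.tostring(meta, encoding="unicode")
```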

4.1.4 Develop Standard Operating Procedures and Data Processing Routines

Standard operating procedures for data management must be developed and documented. The data
collection methods, data entry and validation protocols, and established procedures for migrating validated
data into the central repository must all be explicitly stated to prevent confusion and ambiguity both inside
and outside of ADF&G. Clearly defining data entry, validation and vetting procedures will ensure that all
points of data input are held to similar standards of quality.

Datasets will also need to be processed when they are imported into the central data warehouse. Examples
of possible processing routines include code conversions, data validation and structural conversions (for
example, OLTP to OLAP data structures). These procedures should be developed so that they can be run
regularly, independently, and reliably.
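As an illustration, a code-conversion routine of the kind described might look like the following; the code table is invented for the example and is not ADF&G's actual species coding:

```python
# Invented species-code table for illustration only
SPECIES_CODES = {"410": "chinook", "420": "sockeye", "430": "coho"}

def convert_species_codes(rows):
    """Replace raw species codes with names during warehouse import,
    failing loudly on unknown codes so bad data never loads silently."""
    for row in rows:
        code = row["species_code"]
        if code not in SPECIES_CODES:
            raise ValueError(f"unknown species code: {code!r}")
        yield {**row, "species": SPECIES_CODES[code]}

converted = list(convert_species_codes([{"species_code": "420", "count": 34}]))
```

Writing each routine as a small, independent function like this is what allows the import procedures to be run regularly and reliably, and validated in isolation.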

4.1.5 Ensure Alaska Department of Fish & Game Data Management Staff Can
Support, Modify and Improve the Information System

Data management staff must have a thorough understanding of all technologies, applications and procedures used in the information system to allow for maintenance and development. Thorough and evolving documentation of the information system is an important component of this goal. Explicit descriptions of database structures, data flow, data quality standards, application function and organization, and other aspects of the system must be documented and accessible to data management staff. Staff must also be allowed sufficient time to maintain this documentation; managers often do not factor documentation time into their project estimates. A wiki-style documentation vehicle would be well suited to this requirement.

Data management staff must also receive adequate training on the technologies used by the information system to allow for debugging and development. This training can include classroom-style instruction, web-based seminars, online tutorials or books. Data management staff must also stay aware of emerging technologies that can enhance the information system.

4.1.6 Enable All Data to Be Spatially/Temporally Explicit at Multiple Scales

Though virtually all salmon metric data stewarded by ADF&G describe measurements or values with an explicit spatial context, the raw data themselves are poorly spatially enabled. This is apparent in the prolific use of common names in various salmon metric datasets and databases. When sampling has occurred in a stream system, the biologist will typically denote the spatial location with a name or code which has traditionally been used to describe that location. An example of this is the "Beaver Creek" phenomenon, in which measurements are associated with a stream extent by its common name; the problem arises when a second biologist associates metrics with a second, distinct stream that is also called Beaver Creek. Additionally, locations and extents of common names, management areas and statistical areas have changed over time. As a hypothetical example, statistical area 334-51 might represent a completely different area in 2007 than it did in 1985. Finally, the limited number of available predefined location codes often causes biologists to choose location codes that do not precisely describe a project's sampling location; the code for an entire stream system may be used when sampling actually took place on a small subset of the stream. These inconsistencies lead to deep-seated problems with retrospective analyses and attempts to look at long-term change at various spatial scales.
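The hypothetical statistical area 334-51 example suggests one remedy: store each location code alongside an explicit geometry and a validity window, so a code resolves to the correct extent for any given year. A sketch using SQLite, with schema and geometries invented for illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE stat_area (
    code TEXT, valid_from INTEGER, valid_to INTEGER, geom_wkt TEXT)""")

# Two definitions of the same code, valid over different year ranges
# (placeholder WKT polygons, not real boundaries)
con.executemany("INSERT INTO stat_area VALUES (?, ?, ?, ?)", [
    ("334-51", 1969, 1994, "POLYGON((0 0, 2 0, 2 2, 0 0))"),
    ("334-51", 1995, 9999, "POLYGON((1 1, 3 1, 3 3, 1 1))"),
])

def geometry_for(code, year):
    """Resolve a location code to the geometry that was valid in `year`."""
    row = con.execute(
        "SELECT geom_wkt FROM stat_area "
        "WHERE code = ? AND ? BETWEEN valid_from AND valid_to",
        (code, year)).fetchone()
    return row[0] if row else None
```

With this layout, a 1985 record and a 2007 record carrying the same code resolve to different extents, preserving the meaning each had when collected.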

Developing a data management framework which explicitly defines geometric objects (points, lines and polygons) and couples these objects with data will greatly assist all user groups in accessing, understanding, analyzing and utilizing salmon data. Users will be able to intuitively drill down into hierarchically nested spatial regimes to look at system change at a watershed, basin, sub-basin or finer scale. In addition, spatially enabling data will provide transparent data access for user groups who are not familiar with the internal ADF&G spatial qualifiers (e.g. common name, management area, stat area); latitude and longitude are universally understood metrics for spatial description. Providing a geospatial context for information is a fundamental requirement for developing the ability to programmatically marshal data between computer systems via standardized interoperability protocols (Web Feature Service [WFS] and Web Map Service [WMS]).

4.2 Use Cases


Use cases provide specific examples of how users will interact with the system to meet a specific goal. A single use case can span multiple logical user groups and is oriented more toward a user's business role. In addition, multiple use cases can be directly associated with a single user group. Use cases detail scenarios in which users approach the system to meet specific needs and then describe the sequence of events which take place to meet those needs.

4.2.1 Use Case Actors


Use case actors are logical entities aggregated from the various ways users will interact with the functionality of the system. In many cases actors can be organized into hierarchically structured schemes in which the properties of one actor are inherited by other logical actors. Each actor may have one or more use cases associated with it. For organizational purposes, actors have been grouped into two distinct domains: data report/access and data entry/management.

Use Case Actors Data Report/Access Domain


Generic Data Browser/Access
Resource Manager (ADF&G)
Resource Ecologist (Agency or NGO)
Commercial Fisherman
Public/Recreational Fisherman
Biometrician, Bio-Informatics Researcher
Processors

Use Case Actors - Data Entry/Management Domain


Generic Data Entry
ASL Data Entry
Aerial Survey Data Entry
Weir/Tower/Escapement
Genetics Data Entry

4.2.2 Use Case Diagrams and Actor Hierarchy


Use case diagrams provide a graphical representation of how various use cases apply to system actors organized into hierarchically structured schemes. Use case actors are represented as stick figures, use cases are represented as yellow ellipses, and use case inheritance between actors is represented as directional lines noted with the <<extends>> operator. A use case actor that extends another actor inherits all the use cases associated with the actor to which the <<extends>> operator points.
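The <<extends>> relationship maps naturally onto class inheritance; a small sketch, with actor and use case names drawn from the tables in this section:

```python
class GenericDataBrowser:
    """Base actor: use cases every browsing actor inherits."""
    use_cases = ["Generate Ad Hoc Reports", "Access Pre-Batched Reports"]

class ResourceManager(GenericDataBrowser):
    """Extends Generic Data Browser with manager-only use cases."""
    use_cases = GenericDataBrowser.use_cases + ["Authorize Fishery Announcements"]

# A Resource Manager can do everything a Generic Data Browser can, and more
assert set(GenericDataBrowser.use_cases) <= set(ResourceManager.use_cases)
```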

The following use case diagram displays use cases and inheritance between various use case actors for the Data Report/Access Domain.

Figure 6. Use case diagram for data report/access domain.

4.2.3 Use Case Index


Use cases are listed in an index for organizational purposes.

ID  Use Case Name                              Primary Actor
1   Generate Ad Hoc Reports                    Generic Data Browser
2   Visualize/Access Data through Online Map   Generic Data Browser
3   Access Pre-Batched Reports                 Generic Data Browser
4   Access/Query Raw/Summarized Data           Generic Data Browser
5   Create/Access Stored Profile               Generic Data Browser
6   Create Localized Reporting Framework       Resource Manager (ADF&G)
7   Authorize Fishery Announcements            Resource Manager (ADF&G)
8   Publish Forecast Data                      Resource Manager (ADF&G)
9   Access Data Interoperability Formats       Biometrician/Bio-Informatics Researcher
10  Access Standardized Metadata               Biometrician/Bio-Informatics Researcher
11  Create Customized Reporting Framework      Resource Ecologist (Agency or NGO)
12  Access Outreach Products                   Public/Recreational Fisherman
13  Access Fishery Announcements               Commercial Fisherman
14  Access Forecast Data                       Processors
15  Access Near Real Time Catch Data           Processors

Table 1. List of all use cases and corresponding actors for the data report/access domain.

4.2.4 Use Case Descriptions

Use Case ID: 1
Use Case Name: Generate Ad Hoc Reports
Use Case Description: A user approaches the system to create a report with framing parameters. These parameters could include time period, location, salmon metric and applicable analysis/aggregation. The system enables the user to rapidly create the report and receive the information in the format required by the user.
Primary Actor: Generic Data Browser
Basic Flows:
1. User navigates to the statewide public ADF&G fishery data access page and chooses to build a report.
2. User chooses the report type and metric(s) from selection lists.
3. User narrows the information request by spatial proximity, time period and further modifications to analysis and aggregation.
4. User reviews the request and submits it.
5. System responds with a summary or the entire result set depending upon size.
6. System prompts user for download options (e.g. PDF, Excel, shapefile).
7. User selects output options and then downloads the report output if possible.
Alternate Flows:

Use Case ID: 2
Use Case Name: Visualize/Access Data through Web Based Map
Use Case Description: User navigates a web based map, manipulates the map layers and base layers and can access/download data through the online map interface.
Primary Actor: Generic Data Browser
Basic Flows:
1. User navigates to statewide public ADF&G fishery data access page and chooses to view map based data visualization.
2. User interacts with system. User browses through various data layers and turns them on or off from a tree control. Another control allows the user to filter data by time constraints.
3. User can request data from the map based interface through utilization of the WMS protocol by clicking on features.
4. User can request packaged data in various formats produced from WFS.
5. System prompts user for download options (e.g. pdf, excel, shapefile).
Alternate Flows: None.

Use Case ID: 3
Use Case Name: Access Pre-Batched Reports
Use Case Description: Users access a web based library of existing management reports and data analysis for various spatial, temporal and salmon metric domains. User can select reports, run them against the data warehouse and view/download results.
Primary Actor: Generic Data Browser
Basic Flows:
1. User navigates to statewide public ADF&G fishery data access page.
2. User chooses to search through existing management reports.
3. User drills down through a tree structure or searches for reports via search terms.
4. User selects which reports to run.
5. System responds with summary or entire result set depending upon size.
6. System prompts user for download options (e.g. pdf, excel, shapefile).
7. User selects options and then downloads report output if possible.
Alternate Flows: None.

Use Case ID: 4
Use Case Name: Access/Query Raw/Summarized Data
Use Case Description: User searches for and downloads raw datasets.
Primary Actor: Generic Data Browser
Basic Flows:
1. User navigates to statewide public ADF&G fishery data access page and chooses to access raw data.
2. User chooses to search through existing datasets and metadata.
3. User drills down through a tree structure or searches through metadata via search terms.
4. User clicks on download data set for a specific dataset.
5. User is prompted with options for filtering the dataset by time and space if appropriate.
6. System responds with summary or entire result set depending upon size.
7. System prompts user for download options (e.g. pdf, excel, shapefile).
8. User selects options and then downloads data output if possible.
Alternate Flows: None.

Use Case ID: 5
Use Case Name: Create/Access Stored Profile
Use Case Description: Users will create secure personal profiles to save their personal queries and to receive important updates regarding salmon fishery announcements, forecasts and other information.
Primary Actor: Generic Data Browser
Basic Flows:
1. User creates personal profile (email, password, contact info) on public statewide ADF&G fishery data access page.
2. Verification email is sent to user.
3. User verifies identity by clicking on the link in the verification email, then logs in with the created profile.
4. User selects to receive fishery announcements via email which meet the user's needs (specific to selected areas and fishery types). ADF&G management staff will have created localized reporting frameworks (Use Case 6) for specific areas and fisheries which can be selected as areas of interest by the user.
5. User selects reports which are to be saved in their profile for quick access. ADF&G management staff will have created localized reporting frameworks (Use Case 6) for specific areas which can be selected as areas of interest by the user.
6. User can request to be granted specific roles which could include: resource ecologist, commercial fisherman or processor. These roles may have heightened data access privileges and default profile options associated with them.
7. User can access any available resource specified in steps 4-6 from their profile home page.
8. User logs out of system.
Alternate Flows: None.

Use Case ID: 6
Use Case Name: Create Localized Reporting Framework
Use Case Description: ADF&G managers will develop a localized framework for in season management functions. The manager will develop and store a customized suite of reports for regions and fisheries. The manager will also establish a distribution framework for other management staff and those users who have voiced interest in receiving reports and announcements.
Primary Actor: Resource manager (ADF&G)
Basic Flows:
1. ADF&G manager logs into private ADF&G intranet through ADF&G Active Directory.
2. Manager configures custom report structures and data analysis for in season management functions into a local management profile.
3. Manager provides access to the local management profile to other ADF&G management staff through Active Directory.
4. Managers can associate outside parties with the localized reporting framework so that those individuals receive information and announcements in an automated fashion.
Alternate Flows: None.

Use Case ID: 7
Use Case Name: Authorize Fishery Announcements
Use Case Description: An ADF&G manager utilizes the localized reporting framework to efficiently distribute a fisheries announcement.
Primary Actor: Resource manager (ADF&G)
Basic Flows:
1. ADF&G manager logs into private ADF&G intranet through ADF&G Active Directory.
2. User selects management area and fishery.
3. User uploads fishery announcement in PDF or text format and fills out some basic fields.
4. System stores announcement and notifies those in the distribution framework about the new announcement.
5. System posts announcement on various data driven web pages.
Alternate Flows: None.

Use Case ID: 8
Use Case Name: Publish Forecast Data
Use Case Description: An ADF&G manager utilizes the localized reporting framework to efficiently publish forecast data.
Primary Actor: Resource manager (ADF&G)
Basic Flows:
1. ADF&G manager logs into private ADF&G intranet through ADF&G Active Directory.
2. User selects management area and fishery.
3. User uploads forecast announcement in PDF or text format and fills out some basic fields.
4. System stores forecast and notifies those in the distribution framework about the new forecast.
5. System posts forecast on various data driven web pages.
Alternate Flows: None.

Use Case ID: 9
Use Case Name: Access Data Interoperability Formats
Use Case Description: Users access data directly in GIS clients (e.g. ESRI, Manifold, QGIS) and statistical packages (e.g. R, SAS) through WFS and WMS services.
Primary Actor: Biometrician/Bio-Informatics Researcher
Basic Flows:
1. User navigates to statewide public ADF&G fishery data access page and chooses to access interoperability formats.
2. User drills down through a tree structure or searches through metadata via search terms to select WFS/WMS layers.
3. User connects client software to the WFS/WMS system by adding the selected data layer as a data source in the client application.
4. User downloads data into client application for analysis.
Alternate Flows: None.
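The machine-to-machine access pattern in this use case amounts to constructing standard OGC WFS/WMS requests. As a rough sketch (the base URL and layer name below are placeholders, not actual ADF&G services):

```python
# Sketch of OGC WFS/WMS request construction against a hypothetical
# endpoint. BASE and the layer name are illustrative placeholders.
from urllib.parse import urlencode

BASE = "https://example.adfg.alaska.gov/geoserver/ows"  # hypothetical

def wfs_getfeature_url(layer, fmt="GML2"):
    """URL a GIS or statistics client would use to pull raw features (WFS)."""
    params = {
        "service": "WFS", "version": "1.0.0", "request": "GetFeature",
        "typeName": layer, "outputFormat": fmt,
    }
    return BASE + "?" + urlencode(params)

def wms_getmap_url(layer, bbox, width=800, height=600):
    """URL for a rendered map image of the same layer (WMS)."""
    params = {
        "service": "WMS", "version": "1.1.1", "request": "GetMap",
        "layers": layer, "bbox": ",".join(map(str, bbox)),
        "width": width, "height": height, "srs": "EPSG:4326",
        "format": "image/png",
    }
    return BASE + "?" + urlencode(params)

print(wfs_getfeature_url("adfg:asl_samples"))
```

A desktop GIS client performs the same request construction internally when the user adds the selected layer as a data source.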

Use Case ID: 10
Use Case Name: Access Standardized Metadata
Use Case Description: User downloads data or reports and receives standardized metadata which assists in interpreting the data and discloses the use constraints of the data.
Primary Actor: Biometrician/Bio-Informatics Researcher
Basic Flows:
1. User navigates to statewide public ADF&G fishery data access page and chooses to access and download data (reports, raw data or interoperability formats).
2. Downloaded data are packaged with a series of metadata files which may include FGDC metadata records, use restriction statements, Ecological Metadata Language (EML) files or other ontologies.
3. User can also access metadata from within the application interface by clicking on the corresponding metadata link.
Alternate Flows: None.

Use Case ID: 11
Use Case Name: Create Customized Reporting Framework
Use Case Description: A non-ADF&G resource ecologist will develop a localized framework for reporting and data access on custom spatial scales. The ecologist will develop and store a customized suite of reports for fishery data based upon their own regional bounds.
Primary Actor: Resource Ecologist (Agency or NGO)
Basic Flows:
1. User logs in as a user with resource ecology role on public statewide ADF&G fishery data access page.
2. User defines spatial regions or uses default ecosystem regions (e.g. watersheds, basins, sub basins).
3. User is provided with a list of datasets that fit within those spatial domains and applicable preconfigured reports that can be run on the data.
4. User selects data sources and preconfigured reports or builds custom reports.
5. User saves report templates in their profile for quick future access.
Alternate Flows: None.

Use Case ID: 12
Use Case Name: Access Outreach Products
Use Case Description: User approaches the ADF&G web site and clicks on the public services link. User will select from a series of internet services or data products that are packaged for public use. These could include interactive maps for displaying the distribution of various species, fishery opening information for sport and commercial fishing and other data products packaged into visualizations and educational materials.
Primary Actor: Public/Recreational Fisherman
Basic Flows:
1. User navigates to statewide public ADF&G fishery data access page and clicks on the public services page.
2. User interacts with system by navigating through a hierarchical catalogue of topics via a tree control and/or performing searches.
3. User utilizes specific service.
Alternate Flows: None.

Use Case ID: 13
Use Case Name: Access Fishery Announcements
Use Case Description: User sets up a personal profile to receive notification of fishery announcements. When announcements are released, the user receives notification via multiple mechanisms (e.g. email, text, profile home page).
Primary Actor: Commercial Fisherman
Basic Flows:
1. User sets personal profile to receive announcements regarding a specific fishery and area.
2. ADF&G manager publishes fishery announcement that matches the user's selection.
3. User receives notification via multiple channels (e.g. email, text, profile home page).
Alternate Flows: None.

Use Case ID: 14
Use Case Name: Access Forecast Data
Use Case Description: User sets up a personal profile to receive notification of forecast data. When forecasts are released, the user receives notification via multiple mechanisms (e.g. email, text, profile home page).
Primary Actor: Processors
Basic Flows:
1. User sets personal profile to receive forecasts for a specific fishery and area.
2. ADF&G manager publishes forecast that matches the user's selection.
3. User receives notification via multiple channels (e.g. email, text, profile home page).
Alternate Flows: None.

Use Case ID: 15
Use Case Name: Access Near Real Time Catch Data
Use Case Description: User sets up a personal profile to receive notification of real time catch information. As real time catch information is received by ADF&G, data are sent along to the user via multiple mechanisms (e.g. email, text, profile home page).
Primary Actor: Processors
Basic Flows:
1. User sets personal profile to receive real time catch information for a specific fishery and area.
2. ADF&G receives real time catch information that matches the user's selection.
3. User receives notification via multiple channels (e.g. email, text, profile home page).
Alternate Flows: None.
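The subscription matching behind Use Cases 13-15 can be sketched in a few lines: stored profiles select a fishery and area, and each incoming event (announcement, forecast or catch update) fans out to the matching users. Field names and addresses below are illustrative assumptions, not the actual profile schema.

```python
# Illustrative sketch of profile-based notification matching for
# announcements, forecasts and near real time catch data. All field
# names and addresses are invented for the example.

profiles = [
    {"user": "fisher1@example.com", "fishery": "drift gillnet",
     "area": "Prince William Sound"},
    {"user": "processor@example.com", "fishery": "seine",
     "area": "Kodiak"},
]

def match_subscribers(event, profiles):
    """Return users whose stored profile matches the event's fishery/area."""
    return [
        p["user"]
        for p in profiles
        if p["fishery"] == event["fishery"] and p["area"] == event["area"]
    ]

event = {"kind": "fishery announcement",
         "fishery": "drift gillnet", "area": "Prince William Sound"}
recipients = match_subscribers(event, profiles)
# Recipients would then be notified via email, text and profile home page.
```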

5.0 OPERATIONAL PLAN


This section was developed from information gathered during user interviews and the ADF&G workshop.
Please note that it attempts to lay out a sample operational plan for statewide salmon data consolidation for
the sole purpose of identifying how the SoS-ADF&G project might best add value to CIS efforts and meet
the needs of ADF&G staff in Cordova. The analysis is not intended to presume what the ultimate statewide
vision will be. To assist the reader, sidebar notations are written to identify components that are understood
to be the purview of the CIS group or highlight areas with special potential for collaboration or added
value.

5.1 Salmon Data Consolidation Overview

5.1.1 Summary
The BIG team was formed within the ADF&G Division of Commercial Fisheries in January of 2009 to address the serious inefficiencies that exist in statewide salmon information management, in addition to problems with other fisheries information management. This effort represents the first exercise with department-wide scope and authority to unify and standardize salmon information systems. The BIG and eLandings teams are working to revamp data storage, reporting and data access interfaces for virtually all salmon and other commercial fisheries information systems over the next three years. The following section briefly details some of the CIS team's plans and current efforts.

5.1.2 Centralizing Data Storage for Salmon Data (Tier 1)

Salmon data and information are currently stored in a series of independent and isolated data structures. No overarching statewide data framework exists for this information, and current data management systems (storage, reporting and access) are independent of one another. As a result, each region is responsible for supporting its own systems in isolation, which results in duplication of effort and inefficiencies.

[Sidebar: This step relies upon development of a data dictionary and metadata ontology. Potential collaboration with SoS.]

The BIG team is working towards designing a discrete set of Online Analytical Processing (OLAP) data warehouse structures which will provide a single storage point for each specific salmon metric on a statewide scale. The data warehouses will be constructed from existing distributed Online Transactional Processing (OLTP) data structures through Extraction, Transformation and Loading (ETL) processes. Though each OLTP source database may have its own unique qualities (data types and lookup codes), the ETL process specific to that source will transform the data into a homogeneous OLAP format. The following diagram (Figure 6) provides an example of this process for ASL data.


Figure 6. Example of ETL processes migrating source OLTP ASL databases into a centralized OLAP database through unique transformation functions.

[Sidebar: ETL mapping is done in close consultation with regional staff to determine the syntax and meaning of data.]

Each unique ETL process provides a mapping from the OLTP data source to the common centralized OLAP database. Quality assurance and control are also handled during the transformation. The resulting OLAP database provides a statewide centralized vessel for all data of the type specific to the OLAP schema. This process is very effective because it allows satellite source offices to continue everyday operations uninterrupted while seamlessly contributing their data to a common statewide data warehouse in which data are standardized and integrated with data from other independent and heterogeneous sources. The resulting centralized OLAP data warehouse can be utilized for reporting, analysis and data access on a statewide scale.

The BIG team is planning on designing and deploying OLAP schemas for the
following salmon metrics: aerial survey escapement, weir and tower
escapement, ASL, fishery productivity tests and commercial harvest data.
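The source-specific homogenization these ETL processes perform can be sketched as a per-source lookup-code translation. The field names, codes and source identifiers below are invented purely to show the shape of the transformation; they are not ADF&G's actual schemas.

```python
# Hypothetical sketch of a per-source ETL transform: each regional OLTP
# source supplies its own lookup codes, and a source-specific mapping
# homogenizes records into a shared OLAP vocabulary.

# Source-specific lookup-code translations (one mapping per OLTP source).
SPECIES_CODE_MAPS = {
    "region2_asl": {"SOCK": "sockeye", "CHIN": "chinook", "COHO": "coho"},
    "region3_asl": {1: "sockeye", 2: "chinook", 3: "coho"},
}

def transform(source_id, record):
    """Map one OLTP record into the common OLAP row format."""
    code_map = SPECIES_CODE_MAPS[source_id]
    return {
        "source": source_id,
        "species": code_map[record["species_code"]],  # homogenized code
        "length_mm": float(record["length"]),         # normalized type
        "sample_date": record["date"],                # ISO-8601 assumed
    }

row = transform("region3_asl",
                {"species_code": 2, "length": "610", "date": "2009-07-14"})
print(row["species"], row["length_mm"])  # prints: chinook 610.0
```

Two sources with incompatible code systems thus land in identical rows, which is what allows the OLAP warehouse to integrate them.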

5.1.3 Statewide Reporting Tools

A primary objective of the BIG team is to develop a statewide reporting tool which can analyze data stored in the various centralized OLAP data structures. The BIG team has isolated the Oracle Business Intelligence (BI) Tool to facilitate this process. The tool plugs directly into the OLAP structure and provides a flexible interface for the development of preconfigured queries and reports, in addition to allowing users to easily develop their own reporting schemes and analytical queries. The Oracle BI tool also allows users to package data in a variety of formats for downloading. Sophisticated domain level security authentication will provide access control for various user groups and ensure that sensitive data are not accessible by unauthorized parties. It is envisioned that this tool will be used for querying and accessing data by both the public and internal ADF&G staff.


5.1.4 User Interface

The CIS team is charged with developing standardized user interface code modules which will streamline the development of data entry and access tools, reduce duplication of effort and ensure that these systems can be supported and developed by information technology (IT) staff throughout the entire enterprise. The ADF&G department-wide standard for the development of applications is the programming framework Adobe Flex.

5.1.5 CIS Mariner System Overhaul

One of the first projects undertaken by the CIS team to consolidate data management systems on a statewide scale involves the overhaul of the Mariner in season catch tracking tool. The current Mariner system is actually three separate applications which have been developed using different programming languages. The system provides data entry interfaces for catch estimates, escapement, fishery openings and processor information as well as some public facing reporting interfaces. Mariner is currently in maintenance mode, and updates have not been made since the primary developer passed away. Some of Mariner's code base (specifically the new release system) has been extended beyond its initial scope, and this fact, combined with the lack of system maintenance over the past few years, has resulted in flawed functionality.

The initial goal for Mariner redevelopment is to provide an accurate and efficient storage and retrieval system for fisheries data in Region II so that fishery managers can focus their time and effort on interpreting data and managing the region's fisheries. To accomplish this goal, Mariner will be the repository of all escapement and stock assessment data used to manage fisheries in the region. Data should be easily accessible, maintained as the primary version for all users and include in season estimates and historical data at a resolution necessary for in season management and post-season research investigations. Wherever possible, Mariner will perform routine data manipulation to aid management staff in the interpretation of data. Mariner may ultimately be used as decision support software, providing managers with syntheses of fisheries assessment data in an objective and repeatable manner.

Ultimately all Mariner applications will be combined into a unified application and utilize a single database. Some variation will be built into the system to address the specific needs of users in different areas, but the overall centralization of the application will decrease the effort needed to support the application and train users. Mariner will feature custom data entry interfaces, but reporting will utilize the Oracle BI tool. Use of the Oracle BI tool will allow for flexible and adaptable reporting by users of any skill level. Custom user interfaces will be developed using Adobe Flex and Java middleware in accordance with ADF&G technology standards. The system will continue to use an Oracle database for data storage.

5.2 Process Flow for CIS Data Consolidation

The following process flow scheme was developed as a product of the ADF&G Salmon Data Management Workshop held at the Axiom office. The scheme provides a brief description of the sequential steps to be undertaken by the department over the next three years to standardize and centralize salmon metric data storage. Specific details are intentionally omitted in favor of flexible and generalized development steps.

5.2.1 Requirements Analysis

The requirements analysis constitutes the first step in preparing to overhaul the existing information system. It is expected that the requirements analysis will be developed during the first six to nine months of the project and will document system requirements and implementation methods.

5.2.2 Educate and Enable ADF&G Staff

In order for detailed planning and development to commence, ADF&G staff need to be educated and enabled to effectively utilize the new technologies which will be incorporated into the emerging system. Topics will include but may not be limited to the following concepts and systems:

- Data Warehouse Design Processes
- Adobe Flex Framework
- Management of Oracle BI tool
- Management of Spatial Data within Oracle Database
- Data Management, Metadata and Interoperability Standards

Staff will be able to acquire knowledge and experience through training and research.

5.2.3 Generate Data Dictionary

[Sidebar: Data and systems interoperability goals are key to the SoS-API. Opportunity to develop ontology and international XML schemas.]

The data dictionary provides a basis for developers and users to communicate in the same language and provides an exercise to flesh out ambiguities in data, business logic, functions and output formats. The data dictionary will provide the basis for development of standards-based metadata which will greatly assist in future interpretations and distributions of the data.

5.2.4 Assessment of All Data Repositories

A thorough assessment of all data repositories will need to be performed in order to determine requirements for the development of ETL process logic and data warehouse prototype design. This step will also allow developers to isolate those repositories best suited for the development of the prototype ETL processes. The initial components of this step have already been completed during the requirements analysis and are available in sections 2 and 3.

5.2.5 Design ETL Process Logic/Add Value to Data

This step involves designing ETL transfer functions which map data from the source OLTP data repository to the OLAP data warehouse system. This process will be somewhat complex and may incorporate one or many of the following transformations and filters:

[Sidebar: SoS collaboration potential.]

- Quality Assurance and Quality Control Checks
- Normalization of Data to Types and Quantities Dictated by the Data Dictionary
- Addition of Spatial Components to Data
- Assignment of Standard Specimen IDs
- Loading into Fact Table of Data Warehouse
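The transformation/filter chain above can be sketched as a sequence of small functions applied to each row before it is loaded into the fact table. The range bounds, site lookup and ID scheme below are invented for the example and are not part of the actual ETL design.

```python
# Illustrative sketch of the ETL transformation/filter chain; bounds,
# field names and the SSID scheme are assumptions for the example.
import uuid

def qa_check(row):
    """Quality assurance/control: reject rows outside plausible bounds."""
    if not (0 < row["length_mm"] < 2000):
        raise ValueError("length out of plausible range: %r" % row["length_mm"])
    return row

def add_spatial(row, site_coords):
    """Spatial enablement: attach coordinates from a site lookup table."""
    row["lat"], row["lon"] = site_coords[row["site"]]
    return row

def assign_ssid(row):
    """Assign a Standard Specimen ID (SSID)."""
    row["ssid"] = uuid.uuid4().hex
    return row

def etl_row(row, site_coords):
    """Run one row through the full filter chain before fact-table load."""
    return assign_ssid(add_spatial(qa_check(row), site_coords))

coords = {"Copper River weir": (61.53, -144.43)}  # hypothetical site
fact_row = etl_row({"length_mm": 610.0, "site": "Copper River weir"}, coords)
```

Rows that fail the QA check would be diverted to a holding area for management staff review rather than loaded into the warehouse.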

5.2.6 Design Data Warehouse Prototype Schemas

Data warehouse prototype schema design will occur simultaneously with the development of ETL logic. The data warehouse is to serve as the centralized repository for all collected salmon metrics.
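To make the fact/dimension layout of such a warehouse schema concrete, here is a minimal star-schema illustration. Table and column names are invented purely to show the shape; they are not the BIG team's actual schemas.

```python
# Minimal illustration of a star-schema layout: a fact table of
# escapement counts keyed to small dimension tables.

DIM_SPECIES = {1: "sockeye", 2: "chinook"}
DIM_LOCATION = {10: "Copper River weir", 11: "Bear Creek tower"}
DIM_DATE = {20090714: "2009-07-14"}

# Each fact row stores only foreign keys plus the measured value.
FACT_ESCAPEMENT = [
    {"species_key": 1, "location_key": 10, "date_key": 20090714, "count": 4200},
    {"species_key": 2, "location_key": 10, "date_key": 20090714, "count": 310},
]

def total_by_species(facts):
    """Roll the fact table up along the species dimension."""
    totals = {}
    for f in facts:
        name = DIM_SPECIES[f["species_key"]]
        totals[name] = totals.get(name, 0) + f["count"]
    return totals

print(total_by_species(FACT_ESCAPEMENT))
```

Reporting tools such as Oracle BI perform exactly this kind of roll-up against the fact and dimension tables, which is why the OLAP layout favors fast aggregation.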

5.2.7 Build Prototype ETL Processes

Once the ETL logic and prototype data warehouse schemas have been finalized, the actual ETL procedures can be coded and tested. This will involve building data manipulation scripts and discrete processing steps for each transformation and filter applied to the data.

5.2.8 Assess Prototypes and Plan for Next Development Phase

This review task involves the assessment of ETL process and data warehouse schema prototypes for flaws and limitations. Performance bottlenecks should be isolated during this step and any lacking functionality documented. Hands-on experience from the development of the prototypes should inform the development of further information system components.

5.2.9 Reiterate Steps 5.2.5-5.2.8

ADF&G staff should continue iterations of steps 5.2.5-5.2.8 until satisfactory ETL processes and data warehouse schemas have been developed for all salmon data.

6.0 TECHNICAL FRAMEWORK


This section was developed from information gathered during user interviews and the ADF&G workshop.
Please note that it attempts to lay out a sample technical framework for the sole purpose of identifying how
the SoS-ADF&G project might best add value to CIS efforts and meet the needs of ADF&G staff in
Cordova. The analysis is not intended to presume what the ultimate statewide vision will be. To assist the
reader, sidebar notations are written to identify components that are understood to be the purview of the
CIS group or highlight areas with special potential for collaboration or added value.

The technical framework portion of the requirements analysis provides a map for application development.
This section relies heavily upon the use of data flow diagrams (DFDs). These diagrams detail the flow of
information in and out of data sources, data processing steps and interaction with external entities. These
relationships have been documented via the Yourdon/DeMarco data flow diagram notation.

6.1 Data Flow Diagram 1.0 - Complete System

The following data flow diagram displays the logical flow of information for the proposed ADF&G salmon data management system in its entirety. Data sources, data processes and external non-system entities are represented as double blue lines, red circles, and green boxes respectively. Data flow is represented by lines connecting these logical entities with arrows noting flow direction.


Figure 7. Context Level 1.0 data flow diagram.

Components of DFD 1.0

E1 (ADF&G Data Entry Staff) This external entity includes all ADF&G data
entry staff for all salmon data types across all locations. These staff will be
interfacing with data entry systems (P1) developed in the Adobe Flex platform.

E2 (ADF&G Management Staff) Management staff within ADF&G who will be accessing, creating and browsing data and reports through the Oracle BI interface (P4) and accessing data and map/geographic information system (GIS) based products through the interoperability interfaces (P3).

[Sidebar: SoS-API goal to provide functional data access to this user demographic.]

E3 (Non-ADF&G Users) All users accessing data and reports outside ADF&G have been grouped into this external entity. These users will be accessing the same interfaces as the ADF&G management staff but may have less granular access or limited authorization to access certain resources based on security interfaces. Additionally, data may be packaged in different ways for these user groups.

D1 (OLTP Databases) Many of the existing ADF&G data reporting and entry systems rely upon online transactional databases. Many OLTP databases will be fully absorbed into the OLAP data warehouse. All OLTP data structures will need to remain in use for temporary storage of in season and near real time data. Periodic and in most cases automatic ETL processing (P2) will load the data into the centralized data warehouse. The Oracle BI tool (P4) will connect to these OLTP databases for access to in season data and connect to the OLAP data warehouse to access historical data.

D2 (Legacy Storage Formats) A large quantity of historical salmon data are stored in poorly enabled legacy formats. These data sources will need to be upscaled into their respective OLTP database structures before being extracted, transformed and loaded into the OLAP warehouse.

D3 (Oracle Data Warehouse) The Oracle OLAP data warehouse serves as the
central storage point for all historical and some in season data deemed critical for
real time management decisions. OLAP database architecture provides an optimized
framework for rapid analysis and manipulation. The Oracle BI tool (P4) chosen for
reporting connects seamlessly to the fact and dimensional tables of OLAP data
structures. The data warehouse will also provide the data source for the
interoperability and mapping server (P3).

[Sidebar: CIS lead for data entry system.]

P1 (Data Entry System) The data entry system will be developed utilizing the Adobe Flex framework for the user interface and the Java programming language for middleware, which facilitates connectivity between user interfaces and database systems. Data will be submitted either to the central OLAP data warehouse or a localized OLTP database for temporary storage. The localized OLTP databases may hold in season data or manage information for offices with poor connectivity. All data stored in localized OLTP databases will ultimately be submitted to the centralized OLAP data warehouse, probably at regular intervals or at the end of the management season depending upon requirements.

[Sidebar: Potential collaboration with SoS regarding spatial enablement of data during ETL.]

P2 (ETL) The ETL process involves preparing data for insertion into the centralized data warehouse. Data sources which are dynamic (receiving in season inserts and updates) will require ETL processes that run on a predetermined schedule. ETL processes will check data for consistency, add additional value (SSID assignment and geospatial enablement), and provide mechanisms for quality assurance/quality control (QA/QC) from management staff. The ETL processes will be developed using Oracle Warehouse Builder.

[Sidebar: SoS and Ecotrust have personnel with experience using these technologies.]

P3 (Interoperability and Mapping) Interoperability and mapping functions will be facilitated through the use of Geoserver, an open source Java based server application. Geoserver has a native datasource connector to interface directly with an instance of Oracle which contains spatially enabled information, potentially using Oracle Locator. The Oracle BI tool (P4) will access spatial data and maps produced from the WFS and WMS interfaces exposed by Geoserver. This interaction between Geoserver and Oracle BI will enable the BI tool to display map based data for spatial queries. User groups will interface directly with the Geoserver instance to build thematic maps and access raw data in a variety of formats. GIS client applications will also be able to connect directly to the Geoserver instance and access spatial data in the machine-to-machine domain.

[Sidebar: CIS lead for statewide reporting tool. Potential for collaboration with SoS on spatial component.]

P4 (Oracle BI Tool) The Oracle BI tool will provide the framework for users to access existing preconfigured reports and build their own custom queries and reports. The tool can assimilate data from a variety of data sources. The primary repository of information feeding the reporting function of the BI tool will be the Oracle OLAP data warehouse. Some near real time data that has been staged in localized OLTP databases will also be accessed and assimilated with the OLAP data. Custom data driven maps produced from the instance of Geoserver (P3) will be ingested into the BI tool and made available in the BI tool's user interface for spatial querying and visualization.

P5 (Data Salvage) Data stored in legacy data archive systems (static databases,
spreadsheets, and text files) will be imported and manually salvaged through custom
loading processes and data entry.

Functional Requirements Document Page 3


6.0 Technical Framework

6.2 Data Flow Diagram 1.1 - Data Entry System

Figure 8. Level 1.1 data flow diagram for data entry.

Components of DFD 1.1

E1.1 (ADF&G Data Entry Staff) This external entity includes all ADF&G data entry
staff for all salmon data types across all locations. Typically these staff will be field
technicians who are familiar with data collection methods but have little knowledge of
the data management system beyond the data entry interface and procedure. Data entry
settings will vary from field locations with limited internet access to traditional office
settings with more reliable connectivity. Data will typically be recorded on paper forms
before entry into data capture applications, although electronic data capture devices will
be used in some cases. Examples of electronic data capture devices include handheld
data loggers and fish measuring boards.

E1.2 (ADF&G Supervisory Staff) This entity represents the management staff within
ADF&G who will be vetting the data during QA/QC processes. Typically this group
will consist of project biologists who have thorough knowledge of data collection
methods, project context, and realistic data ranges.

D1.1 (OLTP Databases) All ADF&G projects with centralized databases currently
utilize an OLTP model. Many projects continue to collect data in Microsoft Access,
Excel or other local files. OLTP databases will be utilized as intermediaries between
local data storage and the OLAP data warehouse. OLTP database intermediaries will
remain desirable even in a fully developed information system, as holding repositories
for unverified or problematic data that is not ready for warehousing. OLTP database
structures are well suited to dynamic data that is being created and edited, and are
therefore a good design for data capture applications.

D1.2 (Oracle Data Warehouse) Data from all projects will eventually be stored in an
Oracle data warehouse with an OLAP structure. The data warehouse will contain static,
definitive datasets that have been vetted by knowledgeable biologists. The OLAP
structure is optimized for quickly reading large datasets but not ideal for data in a high
state of flux; data committed to the data warehouse will be considered relatively
finalized and authoritative.

P1.1 (Data Capture) Manual data capture will be facilitated through the use of data
grids, tree structures, standardized select lists, and other form-based data entry controls
available through the Adobe Flex API. Some users will require the ability to upload
precompiled data sets in the form of Microsoft Excel, Access or other formats. Data
entry forms should be as intuitive as possible and minimize the number of keystrokes
needed to enter data. Data entry staff should be able to use the tab key to navigate
between fields and enter data without the use of a mouse. Text boxes should suggest
previously entered data (autocompletion) where applicable and possible. User interfaces
developed in Flex will typically communicate with server-side Java middleware web
services for the handling of business logic including data QA/QC, storage, processing
and routing tasks. In field situations where internet connectivity is limited or
unavailable, data will have to be stored in a local database and uploaded to the
central data capture system when connectivity is restored.
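
The offline requirement above amounts to store-and-forward capture: records queue locally and flush to the central system once a connection is available. A minimal sketch in the Java used by the middleware layer; class and method names are hypothetical.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Sketch of store-and-forward capture for field sites with intermittent connectivity. */
public class FieldDataBuffer {
    private final Deque<String> localQueue = new ArrayDeque<>();  // local database stand-in
    private final List<String> centralStore = new ArrayList<>();  // central capture stand-in

    /** Capture always succeeds locally, regardless of connectivity. */
    public void capture(String record) {
        localQueue.addLast(record);
    }

    /** Push everything queued locally to the central store; returns the count sent. */
    public int flush(boolean connected) {
        if (!connected) return 0;
        int sent = 0;
        while (!localQueue.isEmpty()) {
            centralStore.add(localQueue.removeFirst());
            sent++;
        }
        return sent;
    }

    public int pending() { return localQueue.size(); }
    public int uploaded() { return centralStore.size(); }
}
```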

[Margin note: Data validation could be driven by data dictionary or ontology in an
automated fashion. Potential collaboration with SoS in developing ontology.]

P1.2 (Data Validation) Data validation will be built into data entry interfaces to
provide dynamic feedback to data entry staff (E1.1). Data entry error alerts should be
specific about the type of error and provide a visual indication of which field contains the
error. Data validation rules include data type mismatches (e.g. alpha characters entered
in a number field), range violations (e.g. negative numeric values in an observed count
field), and format violations (e.g. a date field containing too many characters). Data type
mismatches and format violations should not be allowed to be saved, but data entry staff
should be able to override range violations in some cases after confirming the value (e.g.
an unusually large biometric measurement).
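
The three rule classes above can be sketched as follows. The field names, the yyyy-MM-dd date format, and the severity model (blocking errors vs. overridable warnings) are assumptions for illustration, not the project's actual rules.

```java
/** Sketch of validation rule classes: type mismatch, range violation, format violation. */
public class EntryValidator {

    public enum Severity { BLOCK, WARN, OK }

    /** Type mismatch (alpha characters in a numeric field) blocks the save;
     *  a range violation (negative count) warns but may be overridden. */
    public static Severity checkCount(String raw) {
        int value;
        try {
            value = Integer.parseInt(raw.trim());
        } catch (NumberFormatException e) {
            return Severity.BLOCK;            // data type mismatch
        }
        return value < 0 ? Severity.WARN : Severity.OK;  // range violation is overridable
    }

    /** Format violation: dates must match yyyy-MM-dd exactly, or the save is blocked. */
    public static Severity checkDate(String raw) {
        return raw.matches("\\d{4}-\\d{2}-\\d{2}") ? Severity.OK : Severity.BLOCK;
    }
}
```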

[Margin note: QA/QC flags could be driven by data dictionary or ontology in an
automated fashion. Potential collaboration with SoS in developing ontology.]

P1.3 (QA/QC) Once data entry staff (E1.1) have entered raw data, supervisory staff
(E1.2) will perform QA/QC procedures on the data to ensure the data has been collected
and entered correctly. This will typically involve filtering the dataset for suspect data and
outliers. Quality control Adobe Flex user interfaces should be developed to provide data
views and tools that are optimized for data QA/QC tasks. Supervisory staff should be
able to edit data directly in the case of obvious problems, or flag records, add comments,
and return data to data entry staff for correction. Ideally, both supervisory staff and data
entry staff should be notified of the arrival of data awaiting their action by the data
collection application rather than relying on external communication. The simplest
implementation of such a notification system would involve email notifications sent by
the Java middleware application layer. Staff should also be notified of pending data
QA/QC tasks in the Flex-based user interface upon logging into the data capture
application.
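
The flag/correct/approve cycle described above can be sketched as a small state machine. The status names and transition rules are illustrative assumptions, not taken from the document.

```java
/** Sketch of the QA/QC record life cycle: entered data is approved directly,
 *  or flagged back to entry staff, corrected, and then approved. */
public class QaqcRecord {

    public enum Status { ENTERED, FLAGGED, CORRECTED, APPROVED }

    private Status status = Status.ENTERED;
    private String reviewComment = "";

    /** Supervisory staff flag a suspect record with a comment for entry staff. */
    public void flag(String comment) {
        if (status == Status.APPROVED) throw new IllegalStateException("already approved");
        status = Status.FLAGGED;
        reviewComment = comment;
    }

    /** Data entry staff correct and resubmit a flagged record. */
    public void correct() {
        if (status != Status.FLAGGED) throw new IllegalStateException("nothing flagged");
        status = Status.CORRECTED;
    }

    /** Supervisory staff approve entered or corrected data for submission. */
    public void approve() {
        if (status == Status.FLAGGED) throw new IllegalStateException("correction pending");
        status = Status.APPROVED;
    }

    public Status status() { return status; }
    public String reviewComment() { return reviewComment; }
}
```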

P1.4 (Data Submission) After entered data has been vetted by supervisory staff (E1.2),
it will be submitted to the appropriate OLTP (D1.1) or OLAP (D1.2) database. Data
import and transformation procedures and standardized formats will be developed to
allow this process to occur without data management staff intervention. Submission to
the centralized data repository should be handled within the data capture application and
not involve the manual processing of exported files. Data transfer from the data capture
database to the data repository can occur within the Oracle platform using predefined
ETL processes. Data can also be transferred between web services in text formats such
as XML or JSON. In either case, reporting mechanisms will need to be developed to
alert data management staff to data submission activities and failures.


6.3 Data Flow Diagram 1.2 - ETL Processes

Figure 9. Level 1.2 data flow diagram for ETL processes.

Components of DFD 1.2

E2.1 (ADF&G CIS Team Supervisory Staff) Represents the ADF&G CIS team
supervisory staff who monitor and resolve errors logged during ETL processes.

D2.1 (OLTP Databases) Many of the existing ADF&G data reporting and entry
systems rely upon online transactional databases. Some existing OLTP
data structures will need to be utilized for temporary storage of in-season and
near-real-time data. These OLTP databases will need to be periodically loaded into
the centralized OLAP data warehouse.

D2.2 (Error Log) This data resource details errors and exceptions encountered
at various points in ETL processes. The error log will contain detailed
information regarding the error encountered, any steps taken to resolve the error,
and the current state of ETL processing for the dataset in question. Supervisory
staff (E2.1) will use this log to troubleshoot, debug, and redesign the ETL
processes.

D2.3 (Oracle Data Warehouse) The Oracle OLAP data warehouse serves as the
central storage point for all historical and some in-season data. The OLAP
architecture provides an optimized framework for rapid analysis and manipulation.
The warehouse is the final destination for all information traversing the ETL
process.

[Margin note: Potential collaboration with SoS in developing ontology.]

P2.1 (Parse Incoming Data) This process involves parsing data entering the
ETL process to verify correct format. ETL processes map data between disparate
source formats and the final data warehouse OLAP structure. If the source format
does not conform to the format expected by the ETL process, the error is logged
(D2.2) and processing may be interrupted.

[Margin note: Will require explicit data dictionary or ontology. Collaboration
potential with SoS.]

P2.2 (Transform) Before data can be inserted into the OLAP data warehouse, it
must be transformed and standardized to the specifications of the data warehouse.
Measurements may need to be converted into different units, in addition to data
type conversions between source types and native Oracle database types. These
mappings between source and warehouse schemas must be explicitly defined in the
ETL process logic.
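
An explicit source-to-warehouse unit mapping might be sketched as follows, assuming (for illustration only) that the warehouse standardizes lengths in millimeters; the registered units and factors are examples, not the project's actual specifications.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.DoubleUnaryOperator;

/** Sketch of explicit unit conversions registered per source unit. */
public class UnitTransformer {
    private final Map<String, DoubleUnaryOperator> conversions = new LinkedHashMap<>();

    public UnitTransformer() {
        conversions.put("mm", v -> v);           // assumed warehouse standard: millimeters
        conversions.put("cm", v -> v * 10.0);
        conversions.put("in", v -> v * 25.4);
    }

    /** Fails loudly on an unmapped unit, mirroring the ETL error-logging requirement. */
    public double toWarehouseUnits(double value, String sourceUnit) {
        DoubleUnaryOperator op = conversions.get(sourceUnit);
        if (op == null) throw new IllegalArgumentException("unmapped unit: " + sourceUnit);
        return op.applyAsDouble(value);
    }
}
```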

P2.3 (Log and Handle Errors) This process monitors the flow of data through
ETL processes, logging errors and exceptions and notifying supervisory staff when
they occur. Logs will be stored in either a text file or a database structure.
As this process matures, certain errors will be resolved automatically through
exception handling.

[Margin note: Potential collaboration with SoS regarding spatial enablement of
data during ETL.]

P2.4 (Add Value to Data) After data has been transformed, additional processing
can occur to add value to the data. This process will involve the assignment of
SSIDs, the association of spatial features with records, and the aggregation of
certain salmon metrics into higher-order quantities.
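
The aggregation of metrics into higher-order quantities could be sketched as a roll-up of daily counts into per-stream totals; the record fields and keys below are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch of the value-adding aggregation step: daily counts rolled up by stream. */
public class ValueAdder {

    /** Illustrative daily record; real warehouse dimensions would differ. */
    public record DailyCount(String streamId, String date, int count) {}

    /** Aggregate daily counts into per-stream season totals. */
    public static Map<String, Integer> seasonTotals(List<DailyCount> daily) {
        Map<String, Integer> totals = new LinkedHashMap<>();
        for (DailyCount d : daily) {
            totals.merge(d.streamId(), d.count(), Integer::sum);
        }
        return totals;
    }
}
```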

P2.5 (Load Data) The final step in the ETL process involves loading the data into
the final OLAP schema so that it is aggregated together with all other analogous
information and is made available for access and analysis by users.


6.4 Data Flow Diagram 1.3 - Interoperability Systems


Figure 10. Level 1.3 data flow diagram for interoperability systems.

Components of DFD 1.3

E3.1 (GIS Client Software) GIS client software systems will connect to the
WFS (P3.2) over the network or internet and pull data for use. Example GIS
software systems that can access the WFS include ESRI ArcView, Manifold,
QGIS, and many others that are Open Geospatial Consortium (OGC) compliant.

[Margin note: Potential collaboration with SoS regarding spatial enablement of
data during ETL.]

D3.1 (Oracle Data Warehouse, Spatially Enabled) A subset of the data loaded
into the Oracle OLAP data warehouse will be spatially enabled. Spatial
enablement for this subset of data will be achieved during the Add Value to Data
stage of the ETL process (P2.4). This data will have an associated record in the
spatial dimension table of the corresponding OLAP schema.

D3.2 (Data Output) The WFS (P3.2) architecture allows direct output of data in
a large number of standard spatial formats (ESRI Shapefile, GeoJSON,
Geography Markup Language, and Well Known Text). Data can also be exported
in any standard map projection, depending upon the needs of the user or service
requesting the data. Querying and filtering of data are facilitated through the
Common Query Language (CQL) or OGC filter system (P3.3).

D3.3 (Styled Maps) The WMS returns styled maps as image files in response to
web service requests. The maps can be produced in a large number of image
types (e.g. .jpg, .png, .gif). The following figure shows an image that was produced
from a WMS request.

Figure 11. Map image produced from WMS request. Image displays vector
shoreline, lakes, rivers, streams, anadromous streams and various pink salmon
distribution data layers for Eastern PWS and Copper River.

D3.4 (SLD Map Styling Files) Styled Layer Descriptors (SLDs) are XML-based
styling files for rendering feature data in digital maps. SLD files provide an
extremely flexible system to visualize GIS data via WMS requests. The style
produced from applying an SLD to a dataset can be driven by data contained in
each record. Information about the SLD specification can be found at
http://www.opengeospatial.org/standards/sld.
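
As an illustration, a minimal SLD 1.0 document that renders a hypothetical stream layer as a blue line; the layer name, style, and color are placeholders, not ADF&G datasets:

```xml
<StyledLayerDescriptor version="1.0.0"
    xmlns="http://www.opengis.net/sld"
    xmlns:ogc="http://www.opengis.net/ogc">
  <NamedLayer>
    <Name>anadromous_streams</Name>
    <UserStyle>
      <FeatureTypeStyle>
        <Rule>
          <!-- Render stream features as a 2-pixel blue line -->
          <LineSymbolizer>
            <Stroke>
              <CssParameter name="stroke">#0066CC</CssParameter>
              <CssParameter name="stroke-width">2</CssParameter>
            </Stroke>
          </LineSymbolizer>
        </Rule>
      </FeatureTypeStyle>
    </UserStyle>
  </NamedLayer>
</StyledLayerDescriptor>
```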

D3.5 (Map Tile) Map tiles are standardized 256 by 256 pixel square images that
are used by web-based mapping toolkits such as OpenLayers and Google Maps and
some desktop GIS software (ESRI and Manifold).
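
The tile addressing arithmetic behind these toolkits (the XYZ scheme used by OpenLayers and Google Maps, an assumption here since the document specifies only the 256x256 tile size) can be sketched as follows; the Cordova coordinates in the usage example are approximate.

```java
/** Sketch of standard XYZ tile addressing for 256x256 web map tiles. */
public class TileMath {

    /** Column index of the tile containing a longitude at a zoom level. */
    public static int lonToTileX(double lonDeg, int zoom) {
        return (int) Math.floor((lonDeg + 180.0) / 360.0 * (1 << zoom));
    }

    /** Row index of the tile containing a latitude (spherical Mercator). */
    public static int latToTileY(double latDeg, int zoom) {
        double latRad = Math.toRadians(latDeg);
        double y = (1.0 - Math.log(Math.tan(latRad) + 1.0 / Math.cos(latRad)) / Math.PI) / 2.0;
        return (int) Math.floor(y * (1 << zoom));
    }

    public static void main(String[] args) {
        // Approximate location of Cordova, Alaska (Eastern PWS), at zoom level 8.
        System.out.println(lonToTileX(-145.75, 8) + "/" + latToTileY(60.54, 8));
    }
}
```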

[Margin note: SoS staff has experience with this technology and can quickly get
ADF&G CIS staff up and running. Technology is also critical for SoS API goals.]

P3.1 (Geoserver Oracle Connector) Geoserver is an open source geospatial
data server developed in the Java programming language. It supports data
connections to many enterprise relational databases, including Oracle. Geoserver
can be configured to publish spatial information and data as a WMS and WFS.
More information on Geoserver can be found at http://www.geoserver.org.
[Margin note: SoS staff has experience with these protocols and can provide
ADF&G CIS guidance for their use.]

P3.2 (Web Feature Service) WFS is a web service protocol that allows
manipulation of GIS features stored in OGC formats. Data can be queried by
temporal extent, spatial extent, and dataset-specific parameters and then packaged
in various data formats. Geoserver is natively enabled as a WFS server. More
information regarding the WFS specification can be found at
http://www.opengeospatial.org/standards/wfs.

[Margin note: SoS staff has experience with these protocols and can provide
ADF&G CIS guidance for their use.]

P3.3 (Query/Filter System) Querying and filtering of data are facilitated through
the CQL or OGC filter system. These filter protocols provide users with query
capabilities similar to Structured Query Language (SQL) through the WFS
interface.
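
A sketch of composing a WFS GetFeature request with a CQL filter, the query path described above; note that cql_filter is a Geoserver vendor parameter, and the host, layer, and attribute names below are hypothetical.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

/** Sketch of building a WFS 1.0.0 GetFeature URL with a CQL filter. */
public class WfsQueryBuilder {

    public static String getFeatureUrl(String host, String layer, String cqlFilter) {
        String encoded = URLEncoder.encode(cqlFilter, StandardCharsets.UTF_8);
        return host + "/wfs"
            + "?service=WFS&version=1.0.0&request=GetFeature"
            + "&typeName=" + layer
            + "&outputFormat=GML2"
            + "&cql_filter=" + encoded;  // Geoserver vendor parameter
    }

    public static void main(String[] args) {
        // Filter escapement records to one stream and season (illustrative fields).
        System.out.println(getFeatureUrl(
            "http://example.adfg.local/geoserver",
            "sdms:escapement",
            "stream_id = 'S1' AND obs_date AFTER 2009-06-01T00:00:00Z"));
    }
}
```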

[Margin note: SoS staff has experience with these protocols and can provide
ADF&G CIS guidance for their use.]

P3.4 (Web Map Service) WMS is a web service protocol that provides an
interface for creating styled maps from spatial data stored in a
geodatabase or spatial file format. The WMS can provide the Oracle BI tool with
visualizations of spatial data to facilitate spatial querying and aggregation in
reporting. Geoserver is natively enabled as a WMS server. More information
regarding the WMS specification can be found at
http://www.opengeospatial.org/standards/wms.

[Margin note: SoS staff has experience with these protocols and can provide
ADF&G CIS guidance for their use.]

P3.5 (Tile Caching Service) This web service provides a mechanism to cache
map tiles that are accessed by users on a regular basis. Caching will improve
system response time considerably for complex spatial data layers that are
requested more than once.

[Margin note: Add value to BI tool functionality to query and access spatial data.
Requires spatial association during ETL.]

P3.6 (Oracle BI Tool) The Oracle BI tool will provide the framework for users to
access existing preconfigured reports and build their own custom queries and
reports. The tool can assimilate data from a variety of data sources, including OGC
compliant spatial data sources. The Oracle BI tool can access maps and spatial
features by connecting to the WMS (P3.4). An example of this type of Oracle BI
configuration can be found at
http://www.directionsmag.com/article.php?article_id=2571.
