G00270824

Is Cloud Fit for Government Archiving?


FOUNDATIONAL Refreshed: 11 April 2016 | Published: 14 November 2014

Analyst(s): Neville Cannon

In the rush to cloud, don't forget that whole-of-life record retention and
future compatibility must be maintained. Here are some best practice
considerations for government CIOs and record managers considering such
a move.

Gartner foundational research is reviewed periodically for accuracy. This document was last reviewed
on 11 April 2016.

Key Challenges
- Ensuring that whole-of-life records remain readable, accurate and trusted in the long term in a
  digital medium that has not been tested or designed to store indefinitely.
- Verifying that individual records are able to be found and retrieved, while some are truly deleted
  or destroyed when necessary.
- Monitoring storage and other associated costs to be certain the business case for cloud-based
  archiving remains viable.
- Developing policies to support cloud-based archiving and to address the legal implications of
  storing records in the cloud.

Recommendations
CIOs:

- Develop a business case for the preservation of records in the cloud, taking into account the
  whole-of-life costs, including search, retrieval and potential exit.
- Consider additional records management requirements that are integral with your cloud
  strategy, and plan for all challenges.
- Fully appreciate the consequence of the loss of data or records, and plan accordingly to
  maintain critical records.
- Develop the policies and organizational skills to support cloud-based archiving.

Table of Contents

Introduction
Analysis
    Develop a Business Case for the Preservation of Records in the Cloud That Takes Into Account the
    Whole-of-Life Costs, Including Search, Retrieval and Potential Exit
    Consider Additional Records Management Requirements to Be Integral With Your Cloud Strategy
    So That All Challenges Are Planned For
    Fully Appreciate the Consequence of the Loss of Data or Records, and Plan Accordingly to Maintain
    Critical Records
    Develop the Policies and Organizational Skills to Support Cloud-Based Archiving
Gartner Recommended Reading

List of Tables

Table 1. Archive Processing Components
Table 2. Consequences of Loss

Introduction
Governments around the world are pressing forward with "cloud first" policies in the drive to reduce
costs and increase flexibility and innovation. This pursuit often ignores, at worst, or minimizes,
at best, the requirement to digitally preserve the 1% to 5% of records that should be stored for
decades. Cloud computing is now being used by many government and nongovernmental
organizations, and CIOs and record managers must take care when developing their strategy and
business case and selecting suppliers.

Several large suppliers already offer enterprise information management archiving (see "Magic
Quadrant for Enterprise Information Archiving"), largely for Web pages, email and social media.
Hundreds of thousands of email boxes can be archived by the larger players. These solutions
generally are not considered suitable for the preservation of whole-of-life records, but instead offer
good operational solutions to meet compliance requirements. However, some national governments
with national archives [1, 2, 3] have produced guidance on the use of cloud-based archiving, while others
such as Archives New Zealand are still looking to develop their own guidance documents. All tiers
of government are beginning to use cloud computing to store whole-of-life records, but this can
remain difficult for smaller organizations that do not have the economies of scale, or that still have
much to learn about the required processes. This research covers the requirements for government

Page 2 of 12 Gartner, Inc. | G00270824


organizations to store whole-of-life records, often including sensitive personally identifiable
information (PII), which must be secured as well as preserved.

Increasingly, records and unstructured data are being born digitally, and as "digital by default"
policies become more common, this trend is only expected to grow. Governments and associated
agencies have important roles to play, not only for the duration of individual lives, as in care,
adoption and criminal records, but also in maintaining historical facts and records for generations to
come. This, therefore, places additional requirements on government CIOs, architects and archivists
to preserve true chronicles as the records' accuracy and future value will need to be maintained
through several technology changes. Every such change has the ability to introduce file corruption.

Electronic records (including audio and video) are generated in a variety of proprietary formats and
will all need to undergo future migrations, especially when stored in the cloud, as the standards and
technology for storage will move on, and customers will be expected to keep pace with the
commercial offerings to maintain the lowest price (see Note 1).

The risk posed by these inevitable migrations must be understood, and the business case will need
to take into account the cost of any potential exit plan or migration to avoid a move to
unsatisfactory technologies or platforms.

Vendors marketing in this space [4] are relatively new and have yet to prove themselves technically and
commercially viable in the long term. They offer a variety of approaches, and while some describe
their offering as cloud solutions, they do not all conform to the National Institute of Standards and
Technology (NIST) definition. Some are cloud-hosted on third-party platforms; some are tape-based
solutions, recorded and held off-site. Many offer solutions to part of the archiving process such as
ingest and processing. These are, generally, open source. Others such as Amazon Glacier cover
storage and maintenance. Some can be combined to offer a broader solution, while others such as
Preservica offer a complete packaged solution, and can use on-premises, cloud or hybrid storage
with a high degree of automation in their product. Most solutions are priced on an annual
subscription basis.

A community of archivist organizations [5] is working to develop and maintain the standards for digital
preservation and to maintain the ever-growing list of file formats that need to be considered as part
of the process for archival storage. Table 1, based on the Open Archival Information System
(OAIS), shows the various components that must be present for a true archive system.

Table 1. Archive Processing Components

Process | Activity | Components
Ingest | Receipt, verification and packaging of records | Copy; fixity check; virus scan; file dedupe; unique ID; accept submission information package (SIP); create archival information package (AIP)
Archival Storage | Secure long-term storage of records | Disperse long-term storage; accept AIPs; maintain AIPs, migration, error checks; retrieve AIPs for access function
Data Management | Secure management of records | Coordination of AIP and system information; database administration; query handling; report generation
Administration | Stakeholder management | Submission agreements from producers; audits of SIPs against above agreements; policy generation; customer service; relationship management between stakeholders
Preservation Planning | Determining the ongoing archive strategy, and monitoring future impacting developments | Maps out preservation strategy; monitors and recommends changes as necessary; maintains overall effectiveness of archival system
Access | Enabling appropriate access to agreed user categories | Access control; public/private interface; Dissemination Information Package (DIP) creation

Source: Adapted from OAIS
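
The fixity check listed under Ingest, and the error checks listed under Archival Storage, can be illustrated with a minimal Python sketch. It assumes SHA-256 as the checksum algorithm, and the function names (`compute_fixity`, `verify_fixity`) are hypothetical; OAIS does not prescribe either, and a production archive would also record when and by whom each check was run.

```python
import hashlib
from pathlib import Path


def compute_fixity(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read fixed-size chunks so large audio or video files fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_fixity(path: Path, recorded_digest: str) -> bool:
    """Return True if the file still matches the digest recorded at ingest."""
    return compute_fixity(path) == recorded_digest
```

The digest would be stored with the record's metadata at ingest and recomputed after every migration or on a regular schedule; a mismatch signals that the stored copy has changed.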

Analysis
Develop a Business Case for the Preservation of Records in the Cloud That Takes
Into Account the Whole-of-Life Costs, Including Search, Retrieval and Potential Exit

Geography Matters

Any chosen solution must maintain and protect the authenticity and accuracy of records. In
addition, CIOs may be obligated to, or wish to, consider the requirement to hold the data within a
given geographic boundary. This, ultimately, may influence whether or not they can proceed with
cloud-based archiving as there may be no suitable solutions within their country's boundary that
meet all the compliance criteria. The records may, for example, be stored offshore, and hence be
subject to the laws of that particular country and, potentially, may be accessed or seized without
their knowledge. This could clearly compromise the records and affect their intrinsic value. The
business case should reflect the overall importance and value of the records, as well as the
organization's objectives and business rationale for moving to the cloud. This may be because the
data center currently is in need of significant investment, or it may be to increase access to records,
whether public or private.

The archives of the state of Michigan have begun utilizing a SaaS solution, Preservica, at a cost of
around $14,000 a year; they are storing more than the 1TB to 10TB package, which has a
base price of $11,950 per year. This is a cloud-hosted application using Amazon Web Services
(AWS) infrastructure. This is a true cloud-based solution, whereas other solutions such as Arkivum
A-Stor offer tape-based storage with access times of around five minutes.

The solution that is ultimately chosen will need to comply with relevant local legislation about such
issues as where records can be stored, copyright requirements, the replication of data by third
parties, public records, audit and compliance.

Cost Implications

It is important to know the full costs of the storage service, which should: (1) include all future
upgrades and migrations to new formats (while continuing to hold the original file); (2) account for
the need to validate the metadata (all governments have differing standards and classifications); and
(3) update any indexes so that the information remains retrievable. This last is a vital activity if the
value of the archive is to be preserved or enhanced, because the data must be discoverable and
accessible to both current and new technologies. It ensures that public access, where it's allowed,
can most easily be facilitated. The newly converted records should also have their metadata updated
to reflect, for audit purposes, that they have been amended.

The Michigan data being stored in the cloud represents just 1% of the state's active needs. Storing
the remaining 99% would not be a simple linear progression, as other costs and discounts must be
considered (nor should anyone expect to store everything). For example,
while volume discounts may be applicable to the storage, additional bandwidth charges may be
levied if the data is accessed frequently and in volume. Additional costs that may be incurred
include additional auditing and monitoring of the archive placed with the provider.
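
The cost elements discussed in this section can be combined into a rough whole-of-life estimate. The Python sketch below is illustrative only: every field name and figure is a hypothetical placeholder rather than a vendor price, and it assumes constant volumes and rates, so volume discounts and price changes would need to be modelled separately.

```python
from dataclasses import dataclass


@dataclass
class ArchiveCostAssumptions:
    """All figures are hypothetical placeholders, not vendor prices."""
    stored_tb: float               # volume held in the archive
    storage_per_tb_year: float     # annual storage charge per TB
    retrieved_tb_per_year: float   # volume downloaded each year
    egress_per_tb: float           # bandwidth charge per TB retrieved
    audit_per_year: float          # auditing and monitoring of the provider
    migration_cost: float          # cost of one format or platform migration
    migration_interval_years: int  # how often a migration is expected
    exit_cost: float               # one-off cost of leaving the provider


def whole_of_life_cost(a: ArchiveCostAssumptions, years: int) -> float:
    """Sum recurring, periodic and one-off costs over the retention period."""
    recurring = years * (
        a.stored_tb * a.storage_per_tb_year
        + a.retrieved_tb_per_year * a.egress_per_tb
        + a.audit_per_year
    )
    # Whole migrations that fall inside the retention period.
    migrations = (years // a.migration_interval_years) * a.migration_cost
    return recurring + migrations + a.exit_cost
```

Running the function for several retention periods and retrieval volumes shows how quickly the nonstorage items can come to dominate the total.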

Many organizations will currently store data somewhere for a long term, but this is not an archive if it
is not easy to use. Providing access, while a cost, will potentially create value that should be
reflected in the business case.

Vendor Management

Make provision for an ongoing vendor management function in the organization's business case.
This need is likely to increase over time as more records are stored. Data-retention policies may
require the verification of the deletion and destruction of records, especially where needed to
comply with local policy requirements. This may be more time-consuming where records are stored
across different systems, geographies and data centers and will incur additional cost. Also, given
the ease of copying files, it may never be possible to guarantee the destruction of 100% of
records that are no longer required. This, along with all other risks, should feature on the strategic and
operational risk registers.

Access Demands

When considering what to place in the cloud, consider the type and frequency of demand that may
be placed on the service. This will impact costs, as well as choice of technology. Some vendors
offer off-site, tape-based storage that may not be suitable for operational purposes, as access
times can take up to five minutes from making the request. Ensure the business case reflects what
is needed by end users before it is too late or too costly to rectify.

Furthermore, the retrieval of files may be costly if the files are large or the volume is high; some
cloud providers offering inexpensive storage charge for the download bandwidth required to deliver
the service. Record managers should check to see how these costs are treated by any chosen
supplier before they find themselves with rising costs as more data is stored inexpensively, but
access becomes expensive. This may prove operationally prohibitive.
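
A simple break-even calculation illustrates the point at which inexpensive storage with costly retrieval stops being the cheaper option. The Python sketch below compares two hypothetical pricing tiers; the function names and all rates are assumptions for illustration and do not reflect any supplier's actual charges.

```python
from typing import Optional


def annual_cost(stored_tb: float, storage_per_tb: float,
                retrieved_tb: float, egress_per_tb: float) -> float:
    """Annual cost of one tier: storage charge plus retrieval bandwidth."""
    return stored_tb * storage_per_tb + retrieved_tb * egress_per_tb


def break_even_retrieval_tb(stored_tb: float,
                            cheap_storage: float, cheap_egress: float,
                            std_storage: float, std_egress: float) -> Optional[float]:
    """Retrieval volume (TB per year) above which the low-cost tier costs more.

    Returns None when the low-cost tier's retrieval charge is not higher,
    in which case it never becomes the more expensive option through retrieval.
    """
    egress_gap = cheap_egress - std_egress
    if egress_gap <= 0:
        return None
    storage_saving = stored_tb * (std_storage - cheap_storage)
    # A negative saving means the "cheap" tier is already dearer at zero retrieval.
    return max(storage_saving / egress_gap, 0.0)
```

If expected end-user demand is close to or above the break-even volume, the apparently inexpensive tier is unlikely to support the business case.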

Open Source

Open source preservation software such as Archivematica is currently operational in Canada and is
being tested in Wales, in conjunction with Microsoft Azure public cloud-based storage.
Open-source products typically have to be used in conjunction with multiple components, as they are
unable to deliver a full end-to-end solution by themselves. While using open-source software may
reduce some costs, it doesn't provide free archives. All the other costs still apply, and potentially
greater levels of customization and configuration may be required and lead to the need for
additional skilled staff. Furthermore, there may be an ongoing requirement for staff to maintain the
archive into the future, and as technology changes, CIOs may find that the burden of future
development is theirs.

Skills Required

The staff managing the solution will be required to undertake various functions beyond what is
currently needed, including audits of the records, which cover access, location, any amendments
made and any migrations to new formats. If audits are not completed correctly, records may
become corrupted. Monitoring of the service, as well as vendor management skills, will be required
for the duration, which is likely to extend indefinitely once records are in the cloud. Security of the
solution will need to be evaluated and regularly checked, and the provider should be able to inform
an organization of any unauthorized access attempts and prove that the records remain truthful
following any such attempt. Should the provider use any third-party subcontractors, they would
need to meet the same records management criteria as the chosen prime contractor. The ability and
capacity available to undertake these roles must be reflected in the business case for it to be
considered sufficiently comprehensive.

Start Small

The flexibility and speed of cloud adoption lends itself to the piloting of smaller-scale initiatives with
the emerging service providers, and this may be a consideration for those with smaller archives who
are looking to cope in these financially constrained times. This also provides an opportunity to
enhance the value of the records by charging for access to recover some costs. This is clearly a
policy and legal decision that warrants legal advice. While there is no doubt that cloud archiving has
its challenges, it undoubtedly has a significant role to play in the future. We encourage the use of
pilots; they offer a controlled way to learn, and learning is most definitely required.

Exit Plan

The intrinsic value of the archive needs to be maintained, and this will not be possible if the vendor
or the vendor solution is not performing as expected, if the archive is compromised in some way,
or if something catastrophic such as bankruptcy occurs. This eventuality needs to be anticipated,
not because vendors are likely to disappoint or become insolvent, but because, on occasion, it can
happen. If it does, organizations will need to be prepared to move quickly and decisively, so the
overall business case should include a developed and costed exit plan (see "Devising a Cloud Exit
Strategy: Proper Planning Prevents Poor Performance").

Consider Additional Records Management Requirements to Be Integral With Your
Cloud Strategy So That All Challenges Are Planned For
Cloud has not been designed for long-term storage; it has been designed to offer good operational
storage at ever-reducing prices. While it will, no doubt, be used to store data over long periods, it
may become corrupted via many circumstances. Additionally, cloud technology, along with an
organization's architecture, will undoubtedly change over time, so keeping everyone involved
cognizant of any changes and keeping those changes in sync with each other becomes necessary.
In doing so, a number of challenges are likely to be encountered.

Stakeholder Management

In addition to seeking the understanding and agreement of senior management, it is vital that CIOs
gain the support of all parties for their strategy. This type of project will extend beyond normal
business planning and election cycles and would not benefit from changing course midstream. All
stakeholders must be aware of the impacts on the organization of moving data to the cloud,
especially if the data is to be made public, for example, to aid transparency or to generate
revenue.

Vendor Challenges

Remember that digital preservation of records must extend beyond media failure or technological and
organizational change. Simply considering and planning for the event of a media failure will not
provide sufficient protection for the records. Records managers must also take into account their
current architecture, its expected life span and their chosen supplier before a refresh is required.

They should plan to work in partnership with their supplier to develop a road map that allows
sufficient time to consider and address any emergent technical issues and the inevitable changes.
An annual review process involving all interested parties may be the best way to achieve this.

In considering the various suppliers' options, beware of the use of proprietary interfaces and
programming; this could lead to a position of vendor lock-in. Given the likely length of a contract,
this would leave the organization with little negotiating power, should they need it, if prices were to
suddenly or substantially rise. Also consider the adoption of standards such as OAIS, whose text is
identical to ISO 14721:2003. This provides the basis (common set of concepts and definitions) for
facilitating discussion between different sectors and professional groups.

Without standard interfaces and common language between the organization and the vendor,
administration and auditing can become more complicated than necessary, and confusion can
easily surface. Using the standards defined above can help all parties determine the functional
requirements for ingest, access, archival standards, preservation planning, data management and
administration, as well as for auditing. Other standards such as ISO 16363 and ISO 16919 are being
developed that could prove useful for the accreditation of future digital archives. The progress of
these should be followed if accreditation is desirable to the organization.

All vendors offering IaaS offer cloud-based storage services, and while this is generally adequate for
file storage, it may not be adequate for the long-term storage of records. Record retention imposes
a greater degree of risk around data loss, security and the processing of PII, which may not be
present in the basic offering of generic cloud storage providers like Microsoft, Google and AWS.
These extra requirements can be layered on top of the generic capabilities by specialist providers
such as Preservica, Archivematica and DuraCloud.

Records Destruction

Much attention is focused on the safe retention of records in perpetuity. However, there is a need,
on occasion, to delete and destroy records in accordance with organizational data retention policies
and media sanitization standards such as NIST SP 800-88. Such standards cover the sanitization of the
disks on which the data is located. However, this can be difficult in the electronic era, where file backups are made and
retained. Records may be held on different servers, and even in different data centers, to ensure
resilience and accessibility. Therefore, it is vital to understand the regime in which data will be
managed so reasonable steps can be taken to certify that all copies have been dealt with once the
decision has been made to delete them. Seek warrants from suppliers that they have destroyed all
copies held by them, or at the minimum, that they have taken all reasonable steps to do so in line
with the appropriate standards.
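
Tracking which copies exist, and which have been confirmed as destroyed, is essentially a bookkeeping task. The following Python sketch shows one possible shape for it; the class names, fields and certificate references are hypothetical and are not drawn from any standard or product.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RecordCopy:
    location: str  # e.g. a data center or backup set (hypothetical label)
    holder: str    # organization or supplier holding the copy
    destruction_certificate: Optional[str] = None  # set once destruction is confirmed


@dataclass
class DisposalCase:
    record_id: str
    copies: List[RecordCopy] = field(default_factory=list)

    def outstanding_copies(self) -> List[RecordCopy]:
        """Copies for which no confirmation of destruction has been received."""
        return [c for c in self.copies if c.destruction_certificate is None]

    def fully_destroyed(self) -> bool:
        """True only if at least one copy is registered and all are confirmed destroyed."""
        return bool(self.copies) and not self.outstanding_copies()
```

A case with no registered copies is deliberately not treated as destroyed, because an empty register more likely means the copies were never catalogued than that none exist.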

Fully Appreciate the Consequence of the Loss of Data or Records, and Plan
Accordingly to Maintain Critical Records
In considering moving records to the cloud, organizations should include the consequences of total
loss in the business case [6]. Other than the professionals who know this could be a risk, many within
the organization may not wish to contemplate this possibility. While it may be difficult to engage
people in this conversation, it is necessary.

Simply moving records to the cloud to take advantage of the presumably inexpensive storage costs
would, at best, be a short-term gain if the records were subsequently unavailable or corrupted. This
could manifest itself in records that are hard to find, expensive to retrieve, difficult or impossible to
read or impossible to trust. Should any of these occur, the business case would be completely
undermined. Many will assume that the loss of physical control of the storage will lead to a higher
risk of losing data. CIOs should ensure that the organization is apprised of the consequences of
loss (see Table 2), whether they move to cloud or even fail to have an adequate in-house alternative.

Table 2. Consequences of Loss

Damage | Impact
Additional cost to re-create the records | Organizations spend approximately $120 to find a misfiled document or approximately 25 hours to recreate it.
Loss of valuable asset | If the record cannot be found, read or trusted, it is effectively lost and impacts the overall value of the asset base.
Risk to ongoing service continuity | Missing records can adversely impact the delivered service such as child care, adoption or criminal records, as well as freedom of information request processing.
Risk of litigation | Litigation is expensive, and the loss of vital records can lead to the organization being sued. Countersuing the vendor is no immediate or easy remedy.
Loss of historical records | The loss of historical artifacts can be incalculable and can be seen as a national loss.
Reputational damage | Any loss of records can easily cause reputational loss and political consequences, hence the need for all party support for any given initiative.
Delayed implementation of government policies | Delays may occur at the drafting stage as evidence is sought or at the implementation stage as records to inform processes are not available.
Additional legal activity being necessary | Donors and/or funders may place legal obligations on archives, and any loss could result in lengthy legal disputes. Also, third-party copyright problems might arise if access and use are inappropriate.
Damage to statutory services | Children's services, social care, health and the criminal justice system could face significant operational problems to long-running cases should records be lost or corrupted.
Hurt and damage afforded to individuals | For those children who have been placed into care, with the state serving as a corporate parent, loss of their history at any point during their life could cause considerable harm or distress.
Liability | It is unrealistic to assume that the full, unlimited liability of any loss can be passed across to the vendor, especially if smaller vendors are to be encouraged into this innovative market.

Source: Gartner (November 2014)

The importance of the records stored will impact the choice of technology, as well as the
architectural approach, such as whether to use mirrored data centers (two online copies stored in
different data centers) or a hybrid approach (one copy stored on-site and one stored in the cloud).
Should the choice result in the use of different technologies for each copy, the risk of a single
damaging event hitting both instances simultaneously is reduced, but the records manager will need to
account for the increased level of migration activity as technical refreshes occur. This will improve the
organization's risk profile, but it may adversely impact the business case.

These considerations are a call to proceed with caution after fully evaluating the risks and the benefits of the
chosen approach.

Develop the Policies and Organizational Skills to Support Cloud-Based Archiving


Digital preservation policies are being developed by governments across the world, and guidance is
being offered by many. The National Archives of Australia, Archives New Zealand and The National
Archive (TNA) of the U.K. have all issued guidance documents that are available online. As the
market matures and technology proves itself, we expect it to become easier to utilize the cloud and
its vast storage capabilities to house the increasing volume of digital information and records being
created.

Until that point, policies will need to be reviewed frequently and updated accordingly. Policies and
contracts will need to reflect the approach to copyright material and whether that material is
available to the provider for other uses, as this may impact the licensing required. IT should highlight
the organization's approach and values around the storage and processing of PII and the
requirement for Safe Harbor compliance if operating within the European Economic Area.

Organizational policies will be used to inform the drafting of the contract, vendor negotiations, and
probably even vendor selection. Sample contracting language that may be useful in drafting
organization-specific contracts is available on the website of the U.S. National Archives [7].

Piloting new technical development or use of technology is critical. Preserving digital archives is no
different, and staff will be expected to develop and learn new skills as the technology emerges and
matures. In addition to their core archiving skills, staff will require new skills that include:

- A full understanding of digital file formats
- Data storage methodologies, including digital media persistence practice
- Data integrity checking
- Security and auditing methods, including forensics to determine if files remain tamper-free
- Information about metadata, copyright and access rights
- Standards, such as OAIS and PREMIS
- Access arrangements, including Web technologies

If CIOs think their organizations have all these skills invested in one person or within the
organization, they are ready to evaluate or use cloud-based archiving; if they do not have these
skills, they will need to acquire them.

Gartner Recommended Reading


Some documents may not be available as part of your current Gartner subscription.

"Devising a Cloud Exit Strategy: Proper Planning Prevents Poor Performance"

"Magic Quadrant for Enterprise Information Archiving"

"Selecting the Best Archival Storage Architecture for Your Needs"

"Effective Security Assessment of Public Cloud Services"

"Clouds Are Secure: Are You Using Them Securely?"

Evidence
1 National Archives of Australia, Publications and Tools

2 Archives New Zealand, What Are the Recordkeeping Implications of Cloud Computing?

3 The National Archives (U.K.), Guidance on Cloud Storage and Digital Preservation

4 Examples of cloud providers: Archivematica; Curator's Workbench; Data Accessioner; Preservica; DSpace Direct; DuraCloud; Amazon Glacier; Amazon S3; Arkivum

5 COSA, ARA, FARMER, MetaArchive

6 How Much Does It Really Cost When You Lose a Document?

7 National Archives (U.S.), Records Management Language for Contracts

© 2014 Gartner, Inc. and/or its affiliates. All rights reserved. Gartner is a registered trademark of Gartner, Inc. or its affiliates. This
publication may not be reproduced or distributed in any form without Gartner's prior written permission. If you are authorized to access
this publication, your use of it is subject to the Usage Guidelines for Gartner Services posted on gartner.com. The information contained
in this publication has been obtained from sources believed to be reliable. Gartner disclaims all warranties as to the accuracy,
completeness or adequacy of such information and shall have no liability for errors, omissions or inadequacies in such information. This
publication consists of the opinions of Gartner's research organization and should not be construed as statements of fact. The opinions
expressed herein are subject to change without notice. Although Gartner research may include a discussion of related legal issues,
Gartner does not provide legal advice or services and its research should not be construed or used as such. Gartner is a public company,
and its shareholders may include firms and funds that have financial interests in entities covered in Gartner research. Gartner's Board of
Directors may include senior managers of these firms or funds. Gartner research is produced independently by its research organization
without input or influence from these firms, funds or their managers. For further information on the independence and integrity of Gartner
research, see Guiding Principles on Independence and Objectivity.
