Borbinha, José; Buddharaju, Raju; Khoo, Christopher; Sugimoto, Shigeo; Foo, Schubert; Jatowt, Adam
8th International Conference on Preservation of Digital Objects, November 1 – 4, 2011, Singapore
Shaon, Arif; Samuelsson, Göran; Bos, Marguérite; Kirstein, Michael; Woolf, Andrew; Mason, Paul; Naumann, Kai; Gerber, Urs; Rönsdorf, Carsten
With growing concerns about environmental problems, and an exponential increase in computing capabilities over the last decade, the geospatial community has been producing increasingly voluminous and diverse geographical datasets. Long-term preservation of these geographical data exposed through uniform and interoperable Spatial Data Infrastructures (SDIs) is not typically addressed, but highly important for meeting legislative requirements, the short and long term exploitation of archived data as well as efficiency savings in managing superseded datasets. In this paper, we attempt to set out the path and describe what needs to be done now to future-proof the investment government agencies around the world...
Kejser, Ulla Bøgvad; Thirifays, Alex; Nielsen, Anders Bo
The Danish National Archives, The Royal Library, and the State and University Library are in the process of developing a cost model for digital preservation: each of the functional entities of the OAIS Reference Model is broken down into measurable, cost-critical activities, and formulas are being tailored for each of these in order to create a generic tool for estimating the short- and long-term costs of digital preservation. This paper presents an introduction to the subject of the costs of digital preservation and describes the method used to develop the Danish Cost Model for Digital Preservation (CMDP). It then...
von Suchodoletz, Dirk; Cochrane, Euan
Digital objects are often more complex than their common perception as individual files or small sets of files. Standard digital preservation methods can lose important parts of digital objects, or the context of digital objects. To deal with the different types of complex digital objects, and to cope with their special requirements, we propose applying emulation from a different perspective in order to preserve the whole original environment of single digital objects or groups of digital objects. Many of today's preservation scenarios would benefit from a change in our understanding of digital objects. Our understanding should be shifted up from...
Draws, Daniel; Simon, Frank; Simon, Daniel; Euteneuer, Sven
In today’s literature, digital preservation and its concepts are usually associated with long-term views on the lifecycle of IT systems and software. In addition to that long-term view, we believe that the concepts available for digital preservation are also useful in short-term views, where the life span of systems and software is limited to a significantly shorter timeline. In this paper we discuss three different real-world use cases that benefit from DP concepts on a short-term basis.
Given that preservation is now a fairly well-described problem, it should, in theory, be possible to calculate with a reasonable degree of accuracy what costs are likely to accrue to an organisation that has responsibility for the long-term stewardship of digital assets. This paper will introduce and describe some of the work that has been carried out over the last 5 years to help institutions and research groups to understand both the cost and the economics of preservation, and to examine the difference between those concepts. It will also describe ongoing phases of work that are being funded in the...
Freitas, Ricardo André Pereira; Ramalho, José Carlos
The work addressed in this paper focuses on the preservation of the conceptual model within a specific class of digital objects: Relational Databases. Previously, a neutral format was adopted to pursue the goal of platform independence and to achieve a standard format in the digital preservation of relational databases, both data and structure (logical model). Currently, in this project, we address the preservation of relational databases by focusing on the conceptual model of the database, considering the database semantics as an important preservation "property". For the representation of this higher layer of abstraction present in databases we use an ontology...
Weihs, Christian; Rauber, Andreas
One of the most important challenges in planning and maintaining a digital repository is to predict the needed resources on a long-term basis, especially storage size and processing power. The main problem emerges from the need to migrate the data at certain times to newer file types, which takes time and alters the needed storage space, potentially branching into several migration paths for individual objects. Understanding the effect of different policy decisions, such as when to migrate or whether to stay within a format family or branch into several format families, turns into a complex task, specifically when considering...
Jackson, Andrew N.
To preserve access to digital content, we must preserve the representation information that captures the intended interpretation of the data. In particular, we must be able to capture performance dependency requirements, i.e. to identify the other resources that are required in order for the intended interpretation to be constructed successfully. Critically, we must identify the digital objects that are only referenced in the source data, but are embedded in the performance, such as fonts. This paper describes a new technique for analysing the dynamic dependencies of digital media, focussing on analysing the process that underlies the performance, rather than parsing...
Massol, Marion; Béchard, Lorène; Rouchon, Olivier
CINES has two main missions, one of which is the long-term preservation of French scientific data. To provide this service, CINES deployed in 2006 one of the first digital repositories in France, named PAC (Plateforme d’Archivage du CINES – the CINES preservation system).
In order to secure this mandate in the long-term, it is absolutely crucial for CINES to prove the quality of the services it provides to the French higher education and research community. For this purpose, the CINES strategy relies on the adoption of a quality assurance approach which includes the certification of its repository.
Over the past four years,...
Donaldson, Devan Ray
Scholars who study trust in digital archives have largely focused their attention on the power of certification by third-party audit as a way to communicate trustworthiness to end-users. In doing so, they assume that the establishment of a network of trusted digital archives will create a climate of trust. But certification at the repository level also assumes the trustworthiness of digital objects within a repository; specifically that digital repository objects are authentic and reliable. This paper proposes the use of document-level seals of approval as a means of communicating to end-users about the trustworthiness of digital objects that is commensurate...
Becker, Christoph; Vieira, Ricardo; Barateiro, José; Antunes, Gonçalo
The last decade has seen a number of reference models and compliance criteria for Digital Preservation (DP) emerging. However, there is a lack of coherence and integration with standards and frameworks in related fields such as Information Systems; Governance, Risk and Compliance (GRC); and Organizational Engineering. DP needs to take a holistic viewpoint to accommodate the concerns of information longevity in the increasingly diverse scenarios in which DP needs to be addressed. In addition to compliance criteria, maturity models are needed to support focused assessment and targeted process improvement efforts in organizations. To enable this holistic perspective, this article discusses...
Ciuffreda, Antonio; Joguin, Vincent; Lange, Andreas; Bergmeyer, Winfried; Pinchbeck, Dan; Konstantelos, Leo; Delve, Janet; Anderson, David
Efficient media transfer is a difficult challenge facing digital preservationists, without a centralized service for strategy and tools advice. Issues include creating a transfer and ingest system adaptable enough to deal with different hardware and software requirements, accessing external registries to help generate accurate and appropriate metadata, and dealing with DRM. Each of these is made more difficult when dealing with complex digital objects such as computer games or digital art. This paper presents the findings of several studies performed within the KEEP project, where numerous open-source and commercial media transfer tools have been evaluated for their effectiveness in generating...
Thirifays, Alex; Dokkedal, Barbara; Nielsen, Anders Bo
The Danish National Archives (DNA) has ingested structurally heterogeneous public digital records since 1973. The year 2004 saw the creation of a new preservation standard, into which it was decided to migrate the above-mentioned archival holdings. The main objectives of this operation were to save data from technological obsolescence and to reduce the cost of both access and future migrations by streamlining the collection. The project cost approximately 30 FSCs (one ‘FSC’, for Format and Structure Conversion, is the unit the project’s management used to measure 1 person-year, and equals 1,291 person-hours). The total sum of purchasing software, hardware and external services...
Dappert, Angela; Kimura, Akiko; Jackson, Andrew
Many memory institutions hold large collections of hand-held media, which can comprise hundreds of terabytes of data spread over many thousands of data-carriers. Many of these carriers are at risk of significant physical degradation over time, depending on their composition. Unfortunately, handling them manually is enormously time consuming and so a full and frequent evaluation of their condition is extremely expensive. It is, therefore, important to develop scalable processes for stabilizing them onto backed-up online storage where they can be subject to high-quality digital preservation management. This goes hand in hand with the need to establish efficient, standardized ways of recording metadata and to deal with...
Hamm, Markus; Becker, Christoph
Significant progress has been made in clarifying the decision factors to consider when choosing preservation actions and the directives governing their deployment. The Planets preservation planning approach and the tool Plato have received considerable take-up and produce a growing body of knowledge on preservation decisions. However, experience sharing is currently complicated by the inherent lack of semantics in criteria specification and a lack of tool support. Furthermore, the impact of decision criteria and criteria sets on the overall planning decision is often hard to judge, and it is unclear what effect a change in the objective evidence underlying an evaluation...
Canteiro, Sara; Barateiro, José
Risk is a constant in every area and at all levels of any organization, whether in a general context or in a specific activity, project or function. Risk Management comprises a set of coordinated activities to direct and control an organization with regard to risk. Risk Assessment, which consists of identifying, analyzing and evaluating risks, is considered the most important phase of Risk Management. Digital preservation’s main concern is to keep information accessible and understandable over a long period of time, by means of digital objects; therefore, it is an area that needs thorough Risk Management and, especially, a...
Antunes, Gonçalo; Pina, Helder
Digital preservation aims at guaranteeing that data or digital objects remain authentic and accessible to users over a long period of time, maintaining their value. Several communities, like biology, medicine, engineering or physics, manage large amounts of scientific information, including large datasets of structured data that are important to preserve so that they can be used in future research. To achieve long-term digital preservation, digital objects must be stored reliably, preventing data loss. A data redundancy strategy is required to preserve data successfully. Many of the characteristics required to implement, manage and evolve a preservation environment...
Esteva, Maria; Walling, David; Urban, Tomislav; Jordan, Christopher
The requirements to support large-scale and complex research collections are growing at an accelerated pace. Considering the continuous evolution of the collections, their increasing sizes, the technologies supporting them, and the importance of adequate data management to long-term preservation, a team at the Texas Advanced Computing Center (TACC) developed a cyberinfrastructure to aid researchers in the creation, management, and curation of collections throughout the research lifecycle processes and beyond for access and long term preservation. Collections are maintained on a petabyte-scale data applications facility, and consulting services are available to address data curation needs. In this environment, researchers have the...
Strodl, Stephan; Rauber, Andreas
Assessing the costs of preserving a digital data collection in the long term is a challenging task. The lifecycle costs consist of several cost factors, some of which are difficult to identify and to break down. In this paper we present a cost model designed especially for small-scale automated digital preservation software systems. The cost model allows institutions with limited expertise in data curation to assess the costs of preserving their digital data in the long run. It provides a simple-to-use methodology that considers the individual characteristics of different settings. The cost model provides detailed formulas to calculate...