Barabucci, Gioele (2019). The CMV+P Document Model, Linear Version. In: Versioning Cultural Objects : Digital Approaches, pp. 153-170. Norderstedt: BoD. ISBN 978-3-7504-2702-0

Made available under the CC license: Creative Commons Attribution Share Alike.



Digital documents are peculiar in that they are different things at the same time. For example, an HTML document is a series of Unicode codepoints, but also a tree-like structure, as well as a rendered image in a browser window and a series of bits stored on a physical medium. These multiple identities of digital documents not only make it difficult to discuss the evolution of documents (especially digital-born documents) in rigorous scholarly terms, but also create practical problems for computer-based comparison tools and algorithms. The CMV+P model addresses this problem by providing a sound formalization of what a document is and how its many identities can coexist at the same time. In its linear version, described in this paper, the CMV+P model sees each document as a stack of abstraction levels, each composed of a) an addressable Content, b) a Model according to which the content has been recorded, and c) a set of Variants used for equivalence matching. The bottom of this stack is the Physical level, symbolizing the concrete medium that embodies the digital document. Content is moved across levels using transformation functions, i.e. encoding functions used to serialize (save) the document and decoding functions used to deserialize (read) it. A practical application of the CMV+P model is its use in comparison tools, algorithms, and methods. With a clear understanding of the internal stratification of formats and models found in digital documents, comparison tools are able to focus on the most meaningful abstraction levels, providing the user with the ability to understand which comparisons are possible between two arbitrary documents.
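
The stack of levels described in the abstract can be illustrated with a minimal sketch. This is not the paper's formalization: the names `Level`, `Document`, `build_document` and `comparable_levels` are hypothetical, only two levels (codepoints and bytes) are modelled, the Physical level is reduced to a text label, and Variants are read here as a set of alternative contents treated as equivalent, which is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Any, List, Tuple


@dataclass(frozen=True)
class Level:
    """One abstraction level: a Model, a Content, and Variants (hypothetical class)."""
    model: str                          # how the content is recorded
    content: Any                        # the addressable content at this level
    variants: frozenset = frozenset()   # assumed: alternative contents counted as equivalent

    def equivalent(self, other: "Level") -> bool:
        # Levels recorded under different models are not comparable at all.
        if self.model != other.model:
            return False
        return (self.content == other.content
                or other.content in self.variants
                or self.content in other.variants)


@dataclass(frozen=True)
class Document:
    """A linear stack of levels, top to bottom, resting on a Physical level."""
    levels: Tuple[Level, ...]
    physical: str   # label only; the sketch does not model the medium itself


def build_document(text: str, encoding: str, physical: str = "file on disk") -> Document:
    """Serialize downward: the encoding function maps codepoints to bytes."""
    encoded = text.encode(encoding)
    # The decoding function moves content back up; it should round-trip here.
    assert encoded.decode(encoding) == text
    return Document(
        levels=(Level("Unicode codepoints", text), Level("Byte sequence", encoded)),
        physical=physical,
    )


def comparable_levels(a: Document, b: Document) -> List[Tuple[str, bool]]:
    """For every model found in both stacks, report whether the contents match."""
    return [(la.model, la.equivalent(lb))
            for la in a.levels for lb in b.levels if la.model == lb.model]


# Same text saved with two different encodings.
doc_utf8 = build_document("caf\u00e9", "utf-8")
doc_utf16 = build_document("caf\u00e9", "utf-16")
result = comparable_levels(doc_utf8, doc_utf16)
```

In this sketch the two documents match at the codepoint level and differ at the byte level, which is the kind of distinction a comparison tool can use to choose the most meaningful level to report on.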

Item Type: Book Section, Proceedings Item or annotation in a legal commentary
Corporate Creators: Institut für Dokumentologie und Editorik
URN: urn:nbn:de:hbz:38-106539
Title of Book: Versioning Cultural Objects : Digital Approaches
Series Name at the University of Cologne: Schriften des Instituts für Dokumentologie und Editorik
Volume: 13
Page Range: pp. 153-170
Date: 2019
Publisher: BoD
Place of Publication: Norderstedt
ISSN: 2197-6945
ISBN: 978-3-7504-2702-0
Language: English
Faculty: Faculty of Arts and Humanities
Divisions: Faculty of Arts and Humanities > Zentrale Forschungseinrichtungen > Cologne Center for eHumanities (CCeH)
Subjects: Generalities, Science; Data processing Computer science; Library and information sciences
Uncontrolled Keywords: Digital Humanities (English); Digitale Editionen (German)
Refereed: Yes

