Welten, Sascha, Neumann, Laurenz, Yediel, Yeliz Ucer, da Silva Santos, Luiz Olavo Bonino, Decker, Stefan and Beyan, Oya (2021). DAMS: A Distributed Analytics Metadata Schema. Data Intell., 3 (4). pp. 528 - 548. CAMBRIDGE: MIT PRESS. ISSN 2641-435X

Full text not available from this repository.

Abstract

In recent years, implementations enabling Distributed Analytics (DA) have gained considerable attention due to their ability to perform complex analysis tasks on decentralised data by bringing the analysis to the data. These concepts propose privacy-enhancing alternatives to data centralisation approaches, which have restricted applicability in case of sensitive data due to ethical, legal or social aspects. Nevertheless, the immanent problem of DA-enabling architectures is the black-box-alike behaviour of the highly distributed components originating from the lack of semantically enriched descriptions, particularly the absence of basic metadata for data sets or analysis tasks. To approach the mentioned problems, we propose a metadata schema for DA infrastructures, which provides a vocabulary to enrich the involved entities with descriptive semantics. We initially perform a requirement analysis with domain experts to reveal necessary metadata items, which represents the foundation of our schema. Afterwards, we transform the obtained domain expert knowledge into user stories and derive the most significant semantic content. In the final step, we enable machine-readability via RDF(S) and SHACL serialisations. We deploy our schema in a proof-of-concept monitoring dashboard to validate its contribution to the transparency of DA architectures. Additionally, we evaluate the schema's compliance with the FAIR principles. The evaluation shows that the schema succeeds in increasing transparency while being compliant with most of the FAIR principles. Because a common metadata model is critical for enhancing the compatibility between multiple DA infrastructures, our work lowers data access and analysis barriers. It represents an initial and infrastructure-independent foundation for the FAIRification of DA and the underlying scientific data management.

Item Type: Journal Article
Creators:
Creators | Email | ORCID | ORCID Put Code
Welten, Sascha | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Neumann, Laurenz | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Yediel, Yeliz Ucer | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
da Silva Santos, Luiz Olavo Bonino | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Decker, Stefan | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
Beyan, Oya | UNSPECIFIED | UNSPECIFIED | UNSPECIFIED
URN: urn:nbn:de:hbz:38-593742
DOI: 10.1162/dint_a_00100
Journal or Publication Title: Data Intell.
Volume: 3
Number: 4
Page Range: 528 - 548
Date: 2021
Publisher: MIT PRESS
Place of Publication: CAMBRIDGE
ISSN: 2641-435X
Language: English
Faculty: Unspecified
Divisions: Unspecified
Subjects: no entry
Uncontrolled Keywords:
Keywords | Language
BIG DATA; HEALTH; MODEL | Multiple languages
Computer Science, Information Systems | Multiple languages
URI: http://kups.ub.uni-koeln.de/id/eprint/59374
