Credit distribution in relational scientific databases

Authors:

Highlights:

• Definition of the Data Credit Distribution problem and application to curated scientific databases.

• Extension of the data citation process to fine-grained credit recognition for data curators.

• Use of data provenance methods, responsibility, and the Shapley value in conjunction with data citation advances.

• Real-world use case based on curated pharmacological data.

• Extensive experiments on real and synthetic data, comparing how data curators are rewarded via data citations versus data credit.

Keywords: Data citation, Data credit, Provenance, Causality and responsibility, Shapley value

Review history: Received 11 January 2021, Revised 26 April 2022, Accepted 27 April 2022, Available online 10 May 2022, Version of Record 16 May 2022.

Official paper URL: https://doi.org/10.1016/j.is.2022.102060