Next-generation heartbeat classification with a column-store DBMS and UDFs

作者:Oscar Castro-Lopez, Daniel E. Lopez-Barron, Ines F. Vega-Lopez

摘要

We live in a digital world where data is being generated at an always increasing rate. This creates the need to develop new technology not only for storing these vast amounts of data, but also for manipulating and analyzing it. It is through this data analysis that we can make decisions and generate knowledge. The medical field is no exception and healthcare and biomedical data must be stored and analyzed to gain insights that help in disease prevention and diagnostics. An example of this kind of data are electrocardiograms (ECG), whose careful analysis has proven to be of significant help to diagnose cardiovascular abnormalities. ECG recording devices can produce a very large amount of data in a short period of time. Usually abstracted as unstructured data, ECG digital signals have traditionally been stored and analyzed using file-based solutions for storage, and ad-hoc programs for data processing. We favor the idea that ECG signals can be abstracted as sets of tuples and stored in database relations. In this paper we present a proposal to store, manage, and analyze ECG data in a column-store database management system (DBMS). We provide extensive empirical evidence showing that incorporating complex analytical tasks such as ECG transformation and classification into a DBMS is not only feasible but also efficient and scalable. For this, we rely on the Structured Query Language provided by relational DBMSs, and the implementation of user defined functions.

论文关键词:User defined functions, Machine learning deployment, Data management, Signal database

论文评审过程:

论文官网地址:https://doi.org/10.1007/s10844-019-00557-w