Multivariate data quality assessment based on rotated factor scores and confidence ellipsoids

Authors:

Highlights:

• A novel multivariate data quality assessment method was proposed.

• Strategies based on rotated factor scores and confidence ellipsoids were proposed.

• An experimental application to verify the method in a real case was performed.

• The results showed that the method favors the quality evaluation of correlated data.

• The method proved to be a better option when compared with other approaches.

Abstract

This study explores the nature of the correlation in data to estimate the quality of data used in decision-making processes. The main contribution of this research is a new multivariate method for repeatability and reproducibility studies, based on factor scores rotated by the varimax strategy, that effectively identifies poor-quality data leading to measurement errors. In addition, a new decision support method based on confidence ellipsoids is developed. The efficiency of the proposed method was demonstrated using metallographic measurements of the geometric characteristics of the resistance spot welding process. The method was also compared with other consolidated techniques: analysis of variance, the weighted principal components method, and factor analysis without rotation. The proposed method provided a better interpretation of the latent information, reducing the dimensionality of the data and separating the analyzed quality attributes into clusters. One response group was classified as acceptable, and the other as marginal. These results were verified by the confidence ellipsoids: the proposed method obeyed the Bonferroni bilateral limits, and its factors showed superior discriminatory power, with non-overlapping ellipsoids that avoid confounding and favor better data quality analysis for multicriteria decision-making. Compared with the other approaches, the proposed method gave more reliable and robust results, without such deficiencies as inversion of the groupings, neglect of the variance-covariance structure, and the variability attributed to the data within the measurement system.
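As an illustration of the varimax rotation named in the abstract, the following is a minimal NumPy sketch of the commonly used SVD-based iteration. It is not the authors' implementation: the function names `varimax` and `varimax_criterion`, the tolerance, the iteration cap, and the omission of Kaiser normalization are all assumptions made for this example.

```python
import numpy as np


def varimax_criterion(loadings):
    """Sum over factors of the variance of the squared loadings."""
    return float(np.sum(np.var(loadings ** 2, axis=0)))


def varimax(loadings, max_iter=100, tol=1e-8):
    """Orthogonally rotate a (variables x factors) loading matrix.

    Illustrative sketch only: no Kaiser normalization is applied.
    Returns the rotated loadings and the orthogonal rotation matrix.
    """
    loadings = np.asarray(loadings, dtype=float)
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    previous = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        # Gradient of the varimax objective with respect to the rotation.
        column_ss = np.sum(rotated ** 2, axis=0)
        gradient = loadings.T @ (rotated ** 3 - rotated * column_ss / n_vars)
        # Project the gradient back onto the set of orthogonal matrices.
        u, singular_values, vt = np.linalg.svd(gradient)
        rotation = u @ vt
        current = singular_values.sum()
        if previous != 0.0 and current < previous * (1.0 + tol):
            break
        previous = current
    return loadings @ rotation, rotation
```

Because the rotation is orthogonal, the communality of each variable (the row sum of squared loadings) is unchanged; only the distribution of the loadings across factors changes, which is what makes the rotated factors easier to interpret.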

Keywords: Data quality assessment, Decision-making, Multivariate measurement system, Factor analysis, Varimax rotation, Confidence ellipsoid

Article history: Received 11 April 2019, Revised 30 September 2019, Accepted 26 October 2019, Available online 31 October 2019, Version of Record 3 January 2020.

Official paper URL: https://doi.org/10.1016/j.dss.2019.113173