Red Room - Session 8

1:00 to 2:00 p.m., Wednesday, April 22, 2015

Garbage In, Garbage Out: Tackling Bad Data Quality in Life Data Analysis

The quality of an algorithm's results depends heavily on the quality of its input data ("Garbage In / Garbage Out"). In life data analysis, high quality means that the calculated results (reliability function, probability of failure, failure rate, mean life, etc.) are credible and accurate. Despite its importance, poor input data quality remains a major challenge in performing life data analysis. A well-known example is the limited availability of life data (e.g., failure data), which is moreover usually restricted to the base warranty period. This presentation illustrates the key data quality issues in life data analysis and their effects on the calculated results. Several categories of data quality are introduced (e.g., sample size, sample selection, data cleanness, data completeness, level of detail, trustworthiness) and, for each category, a corresponding metric is presented. The purpose of the developed metrics is to provide qualitative and quantitative information about how poor data quality in a specific category affects the results of life data analysis. Finally, a prototype of a functional and informative dashboard for data quality in life data analysis is presented in the form of a software application.
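As a minimal illustration of the sample-size effect mentioned above, the sketch below fits a two-parameter Weibull model to simulated failure times using median-rank regression (a standard Weibull-plot technique; the parameter values and helper names here are illustrative and not taken from the presentation itself):

```python
import math
import random

def sample_weibull(beta, eta, n, rng):
    """Draw n failure times from Weibull(shape=beta, scale=eta) by inverse-CDF sampling."""
    return sorted(eta * (-math.log(1.0 - rng.random())) ** (1.0 / beta)
                  for _ in range(n))

def fit_weibull_mrr(times):
    """Estimate (beta, eta) by least-squares regression on the Weibull probability plot."""
    n = len(times)
    xs = [math.log(t) for t in times]
    # Bernard's approximation for the median rank of the i-th ordered failure
    ys = [math.log(-math.log(1.0 - (i - 0.3) / (n + 0.4)))
          for i in range(1, n + 1)]
    mx, my = sum(xs) / n, sum(ys) / n
    beta = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
            / sum((x - mx) ** 2 for x in xs))
    # On the Weibull plot, y = beta*ln(t) - beta*ln(eta), so recover eta from the means
    eta = math.exp(mx - my / beta)
    return beta, eta

rng = random.Random(42)
true_beta, true_eta = 2.0, 1000.0  # illustrative "true" wear-out population
for n in (10, 1000):
    b, e = fit_weibull_mrr(sample_weibull(true_beta, true_eta, n, rng))
    print(f"n={n:5d}  beta_hat={b:5.2f}  eta_hat={e:7.1f}")
```

Running the loop with a small and a large sample from the same population shows how much more the estimated shape and scale parameters can deviate from the true values when only a handful of failures is available, which is exactly the situation the sample-size quality metric is meant to flag.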

Key Words: Life Data Analysis (Weibull Analysis), Data Quality, Reliability Prediction

Simone Turrin and Ralf Gitzel

ABB AG Corporate Research