What is Data Quality?
Data Quality is an abstract concept until we are confronted with bad data or information that is not usable, not credible, not presented correctly, or not accurate. Then Data Quality becomes a very concrete experience. While there are many definitions of Data Quality, all of them converge on the same focal point: “fitness for use” by people, applications and business processes.
Sigmafine connects process industry data from the moment of creation to the moment of use – making it the foundation of a successful Data Quality strategy for any Industrial Plant Dataset.
Larry P. English defines the concept of Data Quality as follows:
- “Delivering Quality Data is consistently meeting the expectations of people, applications and business process for usable, reliable and timely data and information.”
Another definition comes from Thomas C. Redman:
- “There are two interesting moments in the lifetime of a piece of data: the moment it is created and the moment it is used. Quality, the degree to which the data is fit-for-use, is judged at the moment of use. If it meets the needs at that moment, it is judged of “high-quality.” And conversely. The whole point of data quality management is to connect those moments in time — to ensure that the moment of creation is designed and managed to create data correctly, so everything goes well at the moment of use.”
This latter definition is particularly appropriate for the industrial plant because most, if not all, data used outside of process control is generated independently and asynchronously relative to the time of use. This data becomes the Industrial Plant Dataset used by people, applications and business processes to create value and to understand how the business and operations perform.
A “zero-defect” Data Quality strategy in the Process industry does not work because it fails to deliver fitness for use, which is the crux of Data Quality. For this reason, the data management strategy of the industrial plant must encompass the following:
- Performing the necessary checks and balances on newly generated data daily, or at more frequent intervals if the business scenarios call for it;
- Organizing and transforming data into readable, usable and reliable inputs for people, applications and business processes;
- Adapting to the evolving needs of the business by allowing the rapid implementation of model-based analysis rules and calculations.
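As a rough illustration of the first point, performing routine checks and balances on newly generated data, the sketch below validates plant readings against simple range and staleness rules. The `Reading` record, the tag names, and the thresholds are illustrative assumptions for this example only, not part of any Sigmafine API:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Hypothetical record for one plant measurement; field names are
# illustrative and chosen for this sketch, not taken from any product.
@dataclass
class Reading:
    tag: str            # measurement point identifier, e.g. a flow meter tag
    value: float        # raw measured value
    timestamp: datetime # when the value was generated

def check_reading(reading, low, high, max_age=timedelta(days=1), now=None):
    """Return a list of rule violations for one reading; empty means it passes.

    Two example "checks and balances": a plausible-range check and a
    staleness check against the daily (or more frequent) review cadence.
    """
    now = now or datetime.now(timezone.utc)
    issues = []
    # Range rule: value must fall inside the expected physical bounds.
    if not (low <= reading.value <= high):
        issues.append(f"{reading.tag}: value {reading.value} outside [{low}, {high}]")
    # Staleness rule: data older than the review interval is flagged.
    if now - reading.timestamp > max_age:
        issues.append(f"{reading.tag}: stale (older than {max_age})")
    return issues
```

Returning a list of violations rather than a pass/fail boolean lets downstream consumers log, route, or prioritize each issue individually, which fits the "fitness for use" framing: the judgment happens at the moment of use.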