What is accuracy?
How close the data is to the true or correct value
What is an example of accuracy?
a customer’s birthdate is recorded as 02/15/1990 when it is actually 02/16/1990
What is completeness?
whether all required data is present
What is an example of completeness?
if 30% of rows in “phone number” column are blank, then the data is incomplete
What is consistency?
whether data is consistent across different systems or datasets
What is an example of consistency?
A customer’s address appears as “123 Main st.” in one table and “123 Main Street” in another
What is continuity?
the level of availability of both historical and current snapshot data points or records
What is an example of continuity?
the continued acquisition and availability of data should be sufficiently consistent with previous data and compatible with future data
What is coverage?
the availability and comprehensiveness of data compared to the total universe or population of interest
Why is coverage important?
ensures that a dataset or subproduct is fit-for-purpose for client consumption across entities
What is discoverability?
the degree to which data can be found by an end user
How can we make data more discoverable?
has the right metadata associated with the content of interest and is indexed/ easily searchable for a user
What is timeliness?
whether data is up to date and available when needed
What is an example of timeliness?
a stock price feed delayed by 10 minutes isn’t timely enough for real-time trading
What is uniqueness?
whether each record is unique and there are no duplicates
What is validity?
whether data follows the correct format, rules or constraints
What is an example of validity?
a field expecting a zip code has a “ABCDE” value
What is the metric for accuracy?
% error free
What is the metric for timeliness?
time to publish
What is the metric for completeness?
% missing data
What is the metric for coverage?
% in-scope of regulation
What is the best in class metric for accuracy?
99%
What is the acceptable metric for accuracy?
98%
What is the bad metric for accuracy?
95%