
Figure 1
Data quality framework.

Figure 2
A universal, two-layer big data quality standard for assessment.

Table 1
The hierarchical big data quality assessment framework (partial content).
| Dimensions | Elements | Indicators |
|---|---|---|
| 1) Availability | 1) Accessibility | Whether a data access interface is provided |
| | | Data can easily be made public or are easy to purchase |
| | 2) Timeliness | Within a given time, whether the data arrive on time |
| | | Whether data are regularly updated |
| | | Whether the time interval from data collection and processing to release meets requirements |
| 2) Usability | 1) Credibility | Data come from specialized organizations of a country, field, or industry |
| | | Experts or specialists regularly audit and check the correctness of the data content |
| | | Data exist in the range of known or acceptable values |
| 3) Reliability | 1) Accuracy | Data provided are accurate |
| | | Data representation (or value) well reflects the true state of the source information |
| | | Information (data) representation will not cause ambiguity |
| | 2) Consistency | After data have been processed, their concepts, value domains, and formats still match those before processing |
| | | During a certain time, data remain consistent and verifiable |
| | | Data and the data from other data sources are consistent or verifiable |
| | 3) Integrity | Data format is clear and meets the criteria |
| | | Data are consistent with structural integrity |
| | | Data are consistent with content integrity |
| | 4) Completeness | For data with multiple components, whether the deficiency of a component will impact use of the data |
| | | Whether the deficiency of a component will impact data accuracy and integrity |
| 4) Relevance | 1) Fitness | The data collected do not completely match the theme, but they expound one aspect |
| | | Most datasets retrieved are within the retrieval theme users need |
| | | The information theme provided matches users’ retrieval theme |
| 5) Presentation Quality | 1) Readability | Data (content, format, etc.) are clear and understandable |
| | | It is easy to judge whether the data provided meet needs |
| | | Data description, classification, and coding content satisfy specification and are easy to understand |
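
The dimension, element, and indicator hierarchy in Table 1 maps naturally onto a nested data structure. The following is a minimal sketch, not part of the original framework: it transcribes two elements from the table and, as an illustrative assumption, summarizes each element by the share of its indicators that a reviewer judged satisfied. The names `FRAMEWORK`, `summarize`, and `example_results` are hypothetical, and the table itself does not prescribe any scoring rule.

```python
from typing import Dict, List, Tuple

# Partial hierarchy transcribed from Table 1: dimension -> element -> indicators.
FRAMEWORK: Dict[str, Dict[str, List[str]]] = {
    "Availability": {
        "Timeliness": [
            "Within a given time, whether the data arrive on time",
            "Whether data are regularly updated",
            "Whether the time interval from data collection and processing "
            "to release meets requirements",
        ],
    },
    "Usability": {
        "Credibility": [
            "Data come from specialized organizations of a country, field, or industry",
            "Experts or specialists regularly audit and check the correctness "
            "of the data content",
            "Data exist in the range of known or acceptable values",
        ],
    },
}


def summarize(
    framework: Dict[str, Dict[str, List[str]]],
    results: Dict[str, bool],
) -> Dict[Tuple[str, str], float]:
    """Return, per (dimension, element), the share of indicators judged satisfied.

    `results` maps indicator text to a reviewer's True/False judgment.
    Indicators missing from `results` count as not satisfied.
    This equal-weight share is an assumption made for illustration only.
    """
    summary: Dict[Tuple[str, str], float] = {}
    for dimension, elements in framework.items():
        for element, indicators in elements.items():
            met = sum(1 for indicator in indicators if results.get(indicator, False))
            summary[(dimension, element)] = met / len(indicators)
    return summary


# Hypothetical reviewer judgments for one dataset.
example_results: Dict[str, bool] = {
    "Within a given time, whether the data arrive on time": True,
    "Whether data are regularly updated": True,
    "Data come from specialized organizations of a country, field, or industry": True,
    "Experts or specialists regularly audit and check the correctness "
    "of the data content": True,
    "Data exist in the range of known or acceptable values": True,
}

summary = summarize(FRAMEWORK, example_results)
for (dimension, element), share in summary.items():
    print(f"{dimension} / {element}: {share:.2f}")
```

Keeping the indicators as plain strings keyed under their element makes it straightforward to extend the structure with the remaining rows of the table or to swap the equal-weight share for whatever weighting a particular assessment calls for.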

Figure 3
Quality assessment process for big data.
