- high volume (large amounts of data, often described as exceeding tera- or petabytes),
- high velocity (fast data generation, such as streaming data in close to real time),
- high variety (many diverse data formats and structures from multiple sources),
- high veracity (conformity with facts, closely related to data quality),
- high value (the information derived benefits decision makers, which in healthcare is closely related to the triple aim).

Table 1
Data types for big data analytics in healthcare by data generation point.
| DATA GENERATION POINTS | DATA TYPES | EXAMPLES OF TYPICAL DATA CONTENT |
|---|---|---|
| Transactions/billing with different payer organizations | Administrative data | Patient demographics, plan types, type of provider, location, … |
| | Medical claims | In-/outpatient visits, diagnosis/procedure coding, referrals, … |
| | Pharmaceutical claims | Drug codes, dosages, prescription dates, manufacturer, … |
| | Ancillary claims | Medical equipment, physiotherapy, home health assistance, … |
| Clinical/diagnostic processes of different provider organizations (e.g., health, social, aged or disability care) | Institutional data | Educational background, work experience, working times, … |
| | EMR/EHR data | Vital signs, medical history, disease conditions, lab results, … |
| | Medical imaging | X-ray, magnetic resonance, computed tomography, ultrasonography, … |
| | Biomarker | “-omics”: genomics, proteomics, metabolomics, lipidomics, … |
| | Registries | Structured collection of disease/population specific measures |
| Patient- or people-generated | Smart sensor/device data | Biometric data, physical activity, gait/sleep patterns, location, … |
| | Web usage data | Social media posts, internet search logs, health forum activity, … |
| Health-related research | Clinical trial data | Study size, clinically defined parameters and outcomes, … |
| | Drug surveillance data | Adverse drug effects, population size, regional uptake/variation, … |
| | (Health) Survey data | Patient-reported outcome measures (PROMs), health literacy, … |
| Health-related systems | Socio-economic/community-based data | Income, deprivation, education, living situation, marital status, … |
| | Environmental/spatial data | Air/noise pollution, temperature, neighbourhood characteristics, … |

Figure 1
Role model of a people-centred health platform for big data analytics (EHR = electronic health record; PROMs = patient-reported outcome measures, with elements of [37]).

Figure 2
Data types most often applied in big data analyses in healthcare (April 2019), illustrated as a tree map.

Figure 3
Distribution of the most frequently used big data analytical models in healthcare (April 2019), illustrated as a tree map.
Table 2
The strategic interventions of the people-centred and integrated health services framework that might incorporate big data analytics (results of this scoping review and a content analysis; see also Table 8).
| STRATEGIC DIRECTION | POLICY OPTIONS AND STRATEGIC INTERVENTIONS POTENTIALLY SUPPORTED BY BDA | NUMBER OF PUBLICATIONS IN THE REVIEW (N = 72) | % |
|---|---|---|---|
| Empowering and engaging people | | 36 | 51% |
| | Personalized care plans | 31 | 43% |
| | Self-management activities | 5 | 7% |
| | Shared decision making | 4 | 6% |
| | Health education | 3 | 4% |
| | Access to personal health records | 2 | 3% |
| | Peer support | 1 | 1% |
| | Patient satisfaction surveys | 1 | 1% |
| Strengthening governance and accountability | | 23 | 32% |
| | Performance evaluation | 15 | 21% |
| | Performance-based contracting | 8 | 11% |
| | Decentralization | 8 | 11% |
| | Patient-reported outcomes | 1 | 1% |
| Reorienting the model of care | | 56 | 79% |
| | Clinical decision support | 23 | 32% |
| | Tailoring population-based services | 19 | 27% |
| | Surveillance and control systems | 13 | 18% |
| | Mobile health technologies | 10 | 14% |
| | Health promotion and disease prevention | 9 | 13% |
| | Home and nursing care | 5 | 7% |
| Coordinating services | | 20 | 28% |
| | Care pathways | 8 | 11% |
| | Sharing of medical records | 6 | 8% |
| | Intersectoral partnerships | 5 | 7% |
| | District-based healthcare delivery | 1 | 1% |
| Creating an enabling environment | | 17 | 24% |
| | Resource allocation | 11 | 15% |
| | System research | 6 | 8% |
| | Quality assurance | 3 | 4% |
| | Workforce training | 2 | 3% |
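
The percentage column of Table 2 appears to express each count as a share of the reviewed publications. The sketch below illustrates that arithmetic for the five strategic directions, assuming the denominator is the N = 72 given in the table header and that values are rounded to the nearest whole percent. Both are assumptions, and the names `N_PUBLICATIONS`, `direction_counts` and `share_percent` are introduced here for illustration only.

```python
# Counts of the five strategic directions, copied from Table 2.
# Assumption: the denominator is the N = 72 stated in the table header.
N_PUBLICATIONS = 72

direction_counts = {
    "Empowering and engaging people": 36,
    "Strengthening governance and accountability": 23,
    "Reorienting the model of care": 56,
    "Coordinating services": 20,
    "Creating an enabling environment": 17,
}


def share_percent(count, total=N_PUBLICATIONS):
    """Return count/total as a percentage rounded to the nearest whole number."""
    return round(100 * count / total)


shares = {name: share_percent(n) for name, n in direction_counts.items()}

# The counts add up to more than N, so a publication can evidently be
# counted under several strategic directions at once.
total_assignments = sum(direction_counts.values())

for name, pct in shares.items():
    print(f"{name}: {direction_counts[name]} of {N_PUBLICATIONS} ({pct}%)")
print(f"Total assignments: {total_assignments}")
```

Because the counts sum to more than 72, the percentages are not expected to add up to 100%. Under these assumptions a few recomputed shares differ by one percentage point from the printed values (for example, 36 of 72 gives 50%), which may reflect a different rounding rule or denominator in the original analysis; the printed values remain the reported ones.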
Table 3
Challenges in designing a people-centred and integrated health platform to enable big data analytics in healthcare.
| BIG DATA CHARACTERISTIC (ROWS) / CHALLENGE DOMAIN (COLUMNS) | REGULATORY | TECHNOLOGICAL | METHODOLOGICAL | CULTURAL |
|---|---|---|---|---|
| Volume | Investment & technology framework | Data infrastructure | High-dimensional analytics | Teamwork culture |
| Velocity | Communication framework | Data processing | Real-time analytics | Delivery process redesign |
| Variety | Intellectual property framework | Data linkage | Modelling standards & bias | Data sharing culture |
| Veracity | Evaluation framework | Data quality | Evidence base | Data governance |
| Value | Privacy & ethics framework | Data access & data security | Interpretation & usability | Culture of learning & change |
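
Table 3 is a two-way matrix: each big data characteristic (row) meets each challenge domain (column) in exactly one challenge. As a minimal sketch, the matrix could be held in a small lookup structure so that challenges can be read by row, by column, or by cell. The names `DOMAINS`, `CHALLENGES`, `challenge` and `by_domain` are hypothetical and used only for illustration; the cell contents are copied from the table.

```python
# Column order of Table 3 (challenge domains).
DOMAINS = ("Regulatory", "Technological", "Methodological", "Cultural")

# One row per big data characteristic; entries follow the order of DOMAINS.
CHALLENGES = {
    "Volume": ("Investment & technology framework", "Data infrastructure",
               "High-dimensional analytics", "Teamwork culture"),
    "Velocity": ("Communication framework", "Data processing",
                 "Real-time analytics", "Delivery process redesign"),
    "Variety": ("Intellectual property framework", "Data linkage",
                "Modelling standards & bias", "Data sharing culture"),
    "Veracity": ("Evaluation framework", "Data quality",
                 "Evidence base", "Data governance"),
    "Value": ("Privacy & ethics framework", "Data access & data security",
              "Interpretation & usability", "Culture of learning & change"),
}


def challenge(characteristic, domain):
    """Return the single challenge at the given row and column of the matrix."""
    return CHALLENGES[characteristic][DOMAINS.index(domain)]


def by_domain(domain):
    """Return all challenges of one domain, keyed by big data characteristic."""
    column = DOMAINS.index(domain)
    return {name: row[column] for name, row in CHALLENGES.items()}


print(challenge("Variety", "Technological"))
print(by_domain("Regulatory"))
```

Reading a whole column with `by_domain` gives, for instance, the set of regulatory frameworks a platform design would need to address across all five characteristics.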
