Fig. 1

Fig. 2

Fig. 3

Fig. 4

Fig. 5

Fig. 6

Summary statistics of the dataset before and after cleansing.
| | Original database | After data cleansing |
|---|---|---|
| Number of policies | 20,000 | 19,871 |
| Risk exposure | 11,349 | 11,209 |
| MTPL PD claim count | 571 | 570 |
| Observed MTPL PD frequency | 5.03% | 5.09% |
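
The observed frequencies in the table are consistent with dividing the claim count by the risk exposure. The following minimal sketch checks that arithmetic; the definition of frequency as claims per unit of exposure is an assumption inferred from the figures, and the names `stats` and `claim_frequency` are illustrative only.

```python
# Figures copied from the summary table above.
stats = {
    "original": {"exposure": 11_349, "claims": 571},
    "cleansed": {"exposure": 11_209, "claims": 570},
}


def claim_frequency(claims: int, exposure: float) -> float:
    """Claims per unit of risk exposure (assumed definition)."""
    return claims / exposure


# Frequency for each version of the dataset, as a fraction.
frequencies = {
    name: claim_frequency(s["claims"], s["exposure"])
    for name, s in stats.items()
}

for name, freq in frequencies.items():
    print(f"{name}: {freq:.2%}")
```

Formatted to two decimal places, the results agree with the 5.03% and 5.09% reported in the table.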
Statistics for the seven newly created variables: original granularity, inter-rater reliability of 4 selected annotators on the common set of 500 observations, and significance in our risk model after applying necessary simplifications.
| Variable | Original granularity | Inter-rater reliability: Fleiss’ kappa | Inter-rater reliability: interpretation | Risk model: granularity after simplification | Risk model: p-value |
|---|---|---|---|---|---|
| Neighbourhood type | Seven types, multi-choice | 0.52 | Moderate agreement | 2 | 0.01 |
| Building density | Scale 1–5 | 0.50 | Moderate agreement | Not significant | |
| Street View quality | Good/bad/missing | 0.79 | Substantial agreement | 2 | 0.02 |
| House type | Five types, single-choice | 0.69 | Substantial agreement | 2 | 0.01 |
| House age | Scale 1–3 | 0.51 | Moderate agreement | 2 | 0.03 |
| House condition | Scale 1–3 | 0.54 | Moderate agreement | 2 | 0.04 |
| Wealth of residents | Scale 1–10 | 0.32 | Fair agreement | Not significant | |
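
For readers unfamiliar with Fleiss’ kappa, the sketch below shows one way to compute it and to map a value to a verbal label. It rests on several assumptions: each annotator assigns exactly one label per observation (so the multi-choice neighbourhood variable would need to be encoded first), every observation has the same number of annotators, and the labels follow the commonly used Landis and Koch bands, with which the interpretations in the table are consistent. The function names and the toy ratings are hypothetical and are not taken from the study.

```python
from collections import Counter


def fleiss_kappa(ratings):
    """Fleiss' kappa for a list of items, each a list of labels (one per rater).

    Assumes every item was rated by the same number of raters (at least 2)
    and that more than one category occurs overall.
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])
    counts = [Counter(item) for item in ratings]
    categories = sorted({label for item in ratings for label in item})

    # Observed agreement: mean share of agreeing rater pairs per item.
    per_item = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_observed = sum(per_item) / n_items

    # Chance agreement from the overall share of each category.
    shares = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    p_chance = sum(s * s for s in shares)

    return (p_observed - p_chance) / (1 - p_chance)


def interpret(kappa):
    """Verbal label using the Landis and Koch bands (an assumed convention)."""
    if kappa < 0:
        return "Poor agreement"
    for upper, label in [
        (0.20, "Slight agreement"),
        (0.40, "Fair agreement"),
        (0.60, "Moderate agreement"),
        (0.80, "Substantial agreement"),
    ]:
        if kappa <= upper:
            return label
    return "Almost perfect agreement"


# Toy data: 4 hypothetical annotators who always agree with each other.
perfect = [["old", "old", "old", "old"], ["new", "new", "new", "new"]]
print(fleiss_kappa(perfect), interpret(0.52), interpret(0.32))
```

Under these bands, the kappa values in the table map to the same labels shown there, for example 0.79 to substantial and 0.32 to fair agreement.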
Data for calculating the χ² statistic used to test whether claims in our dataset follow a Poisson distribution. On average λ = 3.9%, and the corresponding χ² = 0.08 with 1 degree of freedom.
| Number of claims | Observed exposure (O) | Expected prob. P(X = k) | Expected exposure (E) | (E – O)²/E |
|---|---|---|---|---|
| 0 | 10,784 | 96% | 10,785 | 0.00 |
| 1 | 417 | 4% | 416 | 0.01 |
| 2 | 7 | 0% | 8 | 0.08 |
| All | 11,209 | | | |
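
The goodness-of-fit calculation behind this table can be sketched as follows. This is an illustration under stated assumptions, not the authors' exact procedure: λ is estimated here as the exposure-weighted mean number of claims, the last class is treated as exactly 2 claims, and the total is taken as the sum of the three rounded rows. Because the table's inputs are rounded, the sketch is not expected to reproduce the reported figures exactly. One degree of freedom follows from 3 classes minus 1, minus 1 estimated parameter.

```python
import math

# Observed exposure per number of claims, copied from the table above.
observed = {0: 10_784, 1: 417, 2: 7}
total = sum(observed.values())

# Assumed estimator: exposure-weighted mean number of claims.
lam = sum(k * o for k, o in observed.items()) / total


def poisson_pmf(k: int, rate: float) -> float:
    """Poisson probability P(X = k) for the given rate."""
    return math.exp(-rate) * rate**k / math.factorial(k)


# Expected exposure in each class under the fitted Poisson model.
expected = {k: total * poisson_pmf(k, lam) for k in observed}

# Pearson statistic: sum of (E - O)^2 / E over the classes.
chi2 = sum((expected[k] - observed[k]) ** 2 / expected[k] for k in observed)

# Classes minus 1, minus 1 estimated parameter (lambda).
dof = len(observed) - 1 - 1

# For exactly 1 degree of freedom the upper tail is erfc(sqrt(x / 2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"lambda = {lam:.4f}, chi2 = {chi2:.3f}, dof = {dof}, p = {p_value:.3f}")
```

A statistic below 3.84, the 5% critical value for 1 degree of freedom, means the Poisson hypothesis is not rejected at that level.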
