
Figure 1
Multi-tier screening and confidence assignment framework.
Table 1
Confidence Score Interpretation.
| CONFIDENCE SCORE | CLASSIFICATION | DESCRIPTION | IMPLICATION |
|---|---|---|---|
| 1.0 (100%) | Definitive Match | Rule-based classification; no ambiguity. | Fully automated decision. |
| 0.8–1.0 | Very High | AI strongly validates the decision using explicit evidence. | Safe to accept. |
| 0.6–0.79 | High | Criteria appear satisfied based on standard academic content. | Review is optional. |
| 0.4–0.59 | Moderate | Ambiguous context or loosely met criteria. | Manual verification recommended. |
| 0.1–0.39 | Low | Based mainly on keyword estimation. | High risk of error. |
| <0.1 | Unreliable | Derived from failed extraction methods. | Mandatory manual review. |
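
The interpretation in Table 1 can be expressed as a simple lookup. The following is a minimal sketch, not the software's actual implementation: the names `Band`, `BANDS`, and `interpret_confidence` are hypothetical, and treating each band as a half-open interval (so that a score such as 0.795 falls in "High") is an assumption, since the table lists bounds to two decimals only.

```python
from typing import NamedTuple


class Band(NamedTuple):
    lower: float          # inclusive lower bound of the band
    classification: str   # "Classification" column of Table 1
    implication: str      # "Implication" column of Table 1


# Bands from Table 1, highest first. Half-open intervals [lower, next lower)
# are an assumption made for this sketch.
BANDS = [
    Band(1.0, "Definitive Match", "Fully automated decision."),
    Band(0.8, "Very High", "Safe to accept."),
    Band(0.6, "High", "Review is optional."),
    Band(0.4, "Moderate", "Manual verification recommended."),
    Band(0.1, "Low", "High risk of error."),
    Band(0.0, "Unreliable", "Mandatory manual review."),
]


def interpret_confidence(score: float) -> Band:
    """Return the Table 1 band for a confidence score in [0, 1]."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"confidence must be in [0, 1], got {score}")
    for band in BANDS:
        if score >= band.lower:
            return band
    raise AssertionError("unreachable: the last band starts at 0.0")


print(interpret_confidence(0.95).classification)  # Very High
print(interpret_confidence(0.45).implication)     # Manual verification recommended.
```

Keeping the bands in one ordered list means the thresholds can be adjusted in a single place if the scoring scheme changes.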

Figure 2
Robust API response parsing and recovery workflow.
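
As an illustration of what a parsing and recovery workflow of this kind might look like, the sketch below tries progressively more lenient strategies to recover a JSON object from a model reply. It is an assumption about the general technique, not a reproduction of the workflow in Figure 2: the function name `parse_model_response`, the three strategies, and the JSON reply format are all hypothetical.

```python
import json
import re
from typing import Optional

FENCE = "`" * 3  # Markdown code-fence marker, built here to avoid a literal fence

# Hypothetical pattern: an optional "json" tag followed by the fenced payload.
FENCED_BLOCK = re.compile(FENCE + r"(?:json)?\s*(.*?)" + FENCE, flags=re.DOTALL)


def parse_model_response(raw: str) -> Optional[dict]:
    """Return the first JSON object recoverable from raw, or None."""
    # Strategy 1: the whole reply is already valid JSON.
    candidates = [raw.strip()]

    # Strategy 2: the JSON is wrapped in a Markdown code fence.
    fenced = FENCED_BLOCK.search(raw)
    if fenced:
        candidates.append(fenced.group(1).strip())

    # Strategy 3: take the outermost {...} span embedded in surrounding prose.
    start, end = raw.find("{"), raw.rfind("}")
    if 0 <= start < end:
        candidates.append(raw[start : end + 1])

    for text in candidates:
        try:
            parsed = json.loads(text)
        except json.JSONDecodeError:
            continue  # fall through to the next, more lenient candidate
        if isinstance(parsed, dict):
            return parsed
    return None  # unrecoverable; the caller decides how to score this case


reply = "Here is the result:\n" + FENCE + 'json\n{"confidence": 0.9}\n' + FENCE
print(parse_model_response(reply))  # {'confidence': 0.9}
```

When every strategy fails, a caller could fall back to a non-AI estimate and assign a confidence in the lower bands of Table 1, which reserves scores below 0.1 for results derived from failed extraction methods.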
Table 2
Software validation results – extraction field accuracy (N = 19).
| FIELD | ACCURACY |
|---|---|
| Make and Model | 89.47% |
| Wear Location | 84.21% |
| Participant Age | 84.21% |
| Participant Gender | 78.95% |
| Technology | 78.95% |
| Sensor(s) | 68.42% |
| Sample Size (Strict) | 63.16% |
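
The percentages in Table 2 are consistent with whole-number counts of correctly extracted papers out of N = 19 (for example, 17/19 ≈ 89.47%). A minimal sketch of that calculation follows; the per-field counts are inferred from the reported percentages rather than taken from the study's raw data, and the function name `field_accuracy` is hypothetical.

```python
def field_accuracy(correct: int, total: int) -> float:
    """Percentage of papers for which a field was extracted correctly."""
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= correct <= total:
        raise ValueError("correct must be between 0 and total")
    return 100.0 * correct / total


N = 19  # validation set size stated in the Table 2 caption

# Counts inferred from the reported percentages (an assumption, not raw data).
inferred_counts = [("Make and Model", 17), ("Wear Location", 16), ("Sensor(s)", 13)]

for field, correct in inferred_counts:
    print(f"{field}: {field_accuracy(correct, N):.2f}%")
# Make and Model: 89.47%
# Wear Location: 84.21%
# Sensor(s): 68.42%
```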
Table 3
Software validation results – confidence level predictive validity.
| CONFIDENCE BUCKET | # OF PAPERS | AVERAGE FIELD ACCURACY |
|---|---|---|
| High (0.8–1.0) | 13 | 93.4% |
| Medium (0.6–0.79) | 3 | 57.1% |
| Low (<0.6) | 4 | 71.4% |
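
An aggregation of the kind reported in Table 3 can be sketched as a group-by over per-paper results. The example below is illustrative only: the per-paper values are hypothetical and are not the study's data, the helper names `bucket_for` and `accuracy_by_bucket` are assumptions, and the bucket boundaries are treated as half-open intervals.

```python
from collections import defaultdict
from statistics import mean


def bucket_for(confidence: float) -> str:
    """Assign a Table 3 bucket; half-open boundaries are an assumption."""
    if confidence >= 0.8:
        return "High"
    if confidence >= 0.6:
        return "Medium"
    return "Low"


def accuracy_by_bucket(papers):
    """Map bucket name to (paper count, mean field accuracy in percent).

    papers is an iterable of (confidence, field_accuracy_percent) pairs.
    """
    grouped = defaultdict(list)
    for confidence, accuracy in papers:
        grouped[bucket_for(confidence)].append(accuracy)
    return {name: (len(vals), mean(vals)) for name, vals in grouped.items()}


# Hypothetical per-paper values for illustration only; not the study's data.
sample = [(0.95, 100.0), (0.90, 85.0), (0.70, 50.0), (0.30, 75.0)]

for name, (count, avg) in accuracy_by_bucket(sample).items():
    print(f"{name}: n={count}, mean accuracy={avg:.1f}%")
```

With small buckets, such as the three- and four-paper groups in Table 3, a single paper can shift the bucket mean considerably, so per-bucket counts are worth reporting alongside the averages.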
Table 4
Software validation results – extractor performance by AI provider.
| PROVIDER | MODEL | TIME (MM:SS) | ERROR RATE | TESTER NOTES |
|---|---|---|---|---|
| Default/GLM | GLM-4.6V-Flash [6] | 4:39 | 0% | Fast with no errors |
| OpenAI | gpt-4o [11] | 3:52 | 0% | Fast, no errors |
| Anthropic | Claude-Sonnet-4-20250514 [12] | 4:13 | 0% | Fast, good for extraction |
| Cohere | Command-A-03-2025 [14] | 1:53 | 0% | Very fast and accurate |
| DeepSeek | DeepSeek-Chat [13] | 4:57 | 0% | Accurate, slower speed |
Table 5
Software validation results – full-text AI data extraction result.
| FILENAME | CONFIDENCE | PAPER TITLE | CONCLUSION | TYPE OF STUDY | POPULATION | INTERVENTION | COMPARISON | OUTCOME | RESULTS |
|---|---|---|---|---|---|---|---|---|---|
| 1-s2.0-S0022510X21003166-main.pdf | 0.95 | Wearing-off symptoms during standard and extended natalizumab dosing intervals: Experiences from the COVID-19 pandemic [16] | Our observations support the need to study the effect of EID on wearing-off symptoms in randomized controlled trials. | Observational study | 30 relapsing-remitting multiple sclerosis (RRMS) patients over 18 years of age receiving natalizumab at the Department of Neurology, Haukeland University Hospital | Extended interval dosing (EID) of natalizumab every six weeks | Standard interval dosing (SID) of natalizumab every four weeks | Change in prevalence or intensity of wearing-off symptoms | 50% (15/30) reported new or increased wearing-off symptoms during EID. Symptom increase was more frequent among patients with pre-existing wearing-off symptoms during SID compared to patients without such pre-existing symptoms [p = 0.0005]. None had decreased symptoms or signs of clinical relapse. |
| 1-s2.0-S221103482100612X-main.pdf | 0.9 | Safety of Natalizumab infusion in multiple sclerosis patients during active SARS-CoV-2 infection [17] | Natalizumab redosing in people with multiple sclerosis during active SARS-CoV-2 infection is not associated with worsening of COVID-19 symptoms or recovery delay and is reasonably safe. The data supports the safety of NTZ redosing in these circumstances and suggests not to delay retreatment to minimize the risk of MS rebound. | Retrospective observational case series/cohort study | 18 relapsing-remitting people with Multiple Sclerosis (pwMS) under Natalizumab treatment, infected by SARS-CoV-2 between October 2020 and May 2021, from 6 Italian MS centers. All had mild COVID-19. | Natalizumab (NTZ) reinfusion (retreatment/redosing) during confirmed active SARS-CoV-2 infection (before achieving a negative swab). | Not Found (The study is a single-arm observational study with no explicit comparison group. Implicit comparison is to general population recovery times.) | Safety outcomes: worsening of SARS-CoV-2 infection symptoms, recovery delay, development of new neurological symptoms suggestive of CNS invasion, time to full recovery, and interval from positive to negative swab. | No patient reported worsening of SARS-CoV-2 symptoms or developed new neurological symptoms after redosing. Mean time to full recovery after NTZ for symptomatic patients was 10 ± 12 days. For the whole cohort, mean interval from first symptom to full recovery was 13 ± 9 days. Mean interval from first positive to first negative swab was 32 ± 15 days. No patient required oxygen support or hospitalization. |
Table 6
Software validation results – screener performance by AI provider.
| PROVIDER | MODEL | TIME (MM:SS) | ERROR RATE | TESTER NOTES |
|---|---|---|---|---|
| Default/GLM | GLM-4.6V-Flash [6] | 2:49 | 0% | Fast; every field found |
| OpenAI | gpt-4o [11] | 3:46 | 0% | Fast, no errors |
| Anthropic | Claude-Sonnet-4-20250514 [12] | 3:46 | 0% | Good; better for screening than OpenAI |
| Cohere | Command-A-03-2025 [14] | 1:33 | 0% | Screened everything; fastest |
| DeepSeek | DeepSeek-Chat [13] | 2:58 | 0% | Fast |
Table 7
Software validation results – full-text screening result.
| FILENAME | TITLE | AUTHOR | YEAR | CONFIDENCE | REASON FOR INCLUSION |
|---|---|---|---|---|---|
| 1-s2.0-S0022510X21003166-main.pdf | Wearing-off symptoms during standard and extended natalizumab dosing intervals: Experiences from the COVID-19 pandemic [16] | Gerd Haga Bringeland | 2021 | 0.95 | The paper studies adults (population) receiving natalizumab (intervention) with outcomes related to multiple sclerosis (MS) regarding wearing-off symptoms during standard (SID) and extended (EID) dosing intervals. It meets all inclusion criteria for population, intervention, and outcomes. |
| 1-s2.0-S221103482100612X-main.pdf | Safety of Natalizumab infusion in multiple sclerosis patients during active SARS-CoV-2 infection [17] | Landi D | 2021 | 0.95 | The paper focuses on natalizumab (intervention) in adults with multiple sclerosis (population) during active SARS-CoV-2 infection, which meets all inclusion criteria (population: adults; intervention: natalizumab, SID, EID; outcomes: MS). No exclusion criteria are violated. |
| 1-s2.0-S1878747924000370-main.pdf | Commentary extended interval dosing of natalizumab: More evidence in support [18] | Karlo Toljan | 2024 | 0.95 | The paper evaluates extended interval dosing (EID) of natalizumab versus standard interval dosing (SID) in adults with multiple sclerosis (MS), assessing outcomes such as disease activity, relapse rates, and safety (including PML risk). It meets all PICO criteria: population (adults), intervention (natalizumab, SID/EID), comparison (SID vs EID), and outcomes (MS). |
