Limited fitness of routinely captured student participation data for predicting graduation examination performance in undergraduate medical education
DOI:
https://doi.org/10.17161/sjm.v3i2.25575
Keywords:
learning analytics, data quality, medical education, clinical clerkship, educational measurement
Abstract
Introduction: Clinical-teaching platforms now routinely capture student-level participation records, and learning analytics promises to turn these into predictions of educational outcomes. Yet the completeness, linkability, and predictive validity of such data are seldom audited, especially in new programmes where capture is still maturing.
Methods: We conducted a single-centre secondary analysis at a new tertiary teaching hospital. Teaching-participation records were linked by name, within enrolment cohort, to graduation examination scores (theory, skill, total) for three cohorts of five-year clinical-medicine interns (2022–2024). We assessed completeness, linkability, and the predictive validity of overall, domain-matched (theory- vs. skill-oriented), and rotation-based exposure, using Spearman correlations with Fisher confidence intervals, Benjamini–Hochberg correction, and cohort-fixed-effects regressions on within-cohort standardised scores.
Results: The platform held 3,216 activities and 16,391 participation records, yet only 61 students could be linked individually (5.5% of the 1,101-student roster). The number of linked students rose steeply across the 2022, 2023, and 2024 cohorts (2, 14, and 45, respectively). No association survived multiplicity correction (0 of 15). Domain-matched associations were weak, with confidence intervals spanning zero (theory exposure vs. theory score ρ = 0.00; skill exposure vs. skill score ρ = 0.14), and were no stronger than cross-domain placebo pairs. Rotation duration was unrelated to performance; rotation breadth correlated with theory score (ρ = 0.39, p = 0.002) but was confounded with overall participation volume. Given the small, 2024-dominated sample, the null results exclude only moderate-to-large associations, not small effects.
Conclusions: Routinely captured teaching-participation data showed limited completeness, linkability, and predictive validity for graduation outcomes and are not yet fit for student-level prediction. Institutions considering such use should audit linkability and predictive validity before relying on participation dashboards for high-stakes inference.
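The two inferential steps named in the Methods, a Fisher-transform confidence interval for a correlation and Benjamini–Hochberg correction across a family of tests, can be illustrated with a minimal Python sketch. This is not the authors' analysis code. The function names `fisher_ci` and `benjamini_hochberg` are hypothetical, the standard error 1/sqrt(n - 3) is the plain Fisher approximation (the abstract does not say whether a Spearman-specific adjustment was applied), and the p-values in the usage lines are made up for illustration.

```python
import math
from statistics import NormalDist


def fisher_ci(rho, n, level=0.95):
    """Approximate CI for a correlation via the Fisher z-transform.

    Assumption: standard error 1/sqrt(n - 3); the study may have used
    a different variant for Spearman's rho.
    """
    if n <= 3:
        raise ValueError("need n > 3")
    if abs(rho) >= 1:
        raise ValueError("rho must lie strictly between -1 and 1")
    z = math.atanh(rho)                      # Fisher z-transform
    se = 1.0 / math.sqrt(n - 3)              # approximate standard error
    crit = NormalDist().inv_cdf(0.5 + level / 2)
    return math.tanh(z - crit * se), math.tanh(z + crit * se)


def benjamini_hochberg(pvalues, q=0.05):
    """Return a list of booleans: True where the test survives BH at level q."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    # Find the largest rank k with p_(k) <= k * q / m.
    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= rank * q / m:
            max_k = rank
    # Reject every hypothesis ranked at or below that k.
    reject = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            reject[idx] = True
    return reject


# Illustration only: n = 61 is assumed here to be the sample size behind
# rho = 0.14; the abstract does not state the n for each correlation.
low, high = fisher_ci(0.14, 61)
print(f"95% CI: ({low:.2f}, {high:.2f})")

# Hypothetical p-values, not taken from the study.
print(benjamini_hochberg([0.001, 0.04, 0.03, 0.5]))
```

Under the assumed n of 61, the interval for ρ = 0.14 includes zero, which matches the abstract's description of the domain-matched associations as having confidence intervals spanning zero.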
Data Availability Statement
The de-identified, aggregated data and the analysis code supporting the findings are available from the corresponding author on reasonable request, subject to institutional data-governance approval.
License
Copyright (c) 2026 Chujie Chen, Zhen Zhang, Peng Yun (Author)

This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.