References
- Frank JR Snell LS Cate OT Competency-based medical education: theory to practice Med Teach 2010 32 638 645 10.3109/0142159X.2010.501190
- Berendonk C Stalmeijer RE Schuwirth LWT Expertise in performance assessment: assessors’ perspectives Adv. Health. Sci. Educ. Theory. Pract. 2013 18 559 571 10.1007/s10459-012-9392-x
- Holmboe ES Sherbino J Long DM Swing SR Frank JR The role of assessment in competency-based medical education Med Teach 2010 32 676 682 10.3109/0142159X.2010.500704
- Govaerts MJB Schuwirth LWT van der Vleuten CPM Muijtjens AMM Workplace-based assessment: effects of rater expertise Adv. Health. Sci. Educ. Theory. Pract. 2011 16 151 165 10.1007/s10459-010-9250-7
- Govaerts MJB Van de Wiel MWJ Schuwirth LWT Van der Vleuten CPM Muijtjens AMM Workplace-based assessment: raters’ performance theories and constructs Adv. Health. Sci. Educ. Theory. Pract. 2013 18 375 396 10.1007/s10459-012-9376-x
- Gauthier G St-Onge C Tavares W Rater cognition: Review and integration of research findings Med Educ 2016 50 511 522 10.1111/medu.12973
- Gingerich A Regehr G Eva KW Rater-based assessments as social judgments: rethinking the etiology of rater errors Acad Med 2011 86 S1 S7 10.1097/ACM.0b013e31822a6cf8
- Govaerts MJB van der Vleuten CPM Schuwirth LWT Muijtjens AMM Broadening perspectives on clinical performance assessment: Rethinking the nature of in-training assessment Adv. Health. Sci. Educ. 2007 12 239 260 10.1007/s10459-006-9043-1
- St-Onge C Chamberland M Lévesque A Varpio L The role of the assessor: exploring the clinical supervisor’s skill set Clin Teach 2014 11 209 213 10.1111/tct.12126
- Gallagher P The role of the assessor in the assessment of practice: an alternative view Med Teach 2010 32 E413 E416 10.3109/0142159X.2010.496010
- Ginsburg S McIlroy J Oulanova O Eva K Regehr G Toward authentic clinical evaluation: pitfalls in the pursuit of competency Acad Med 2010 85 780 786 10.1097/ACM.0b013e3181d73fb6
- Smith EV Kulikowich JM An application of generalizability theory and many-faceted Rasch measurement using a complex problem-solving skills assessment Educ Psychol Meas 2004 64 617 639 10.1177/0013164404263876
- Hogan EA Effects of prior expectations on performance ratings: a longitudinal study Acad. Manage. J. 1987 30 354 368 10.2307/256279
- Nickerson RS Confirmation bias: a ubiquitous phenomenon in many guises Rev Gen Psychol 1998 2 175 220 10.1037/1089-2680.2.2.175
- Tversky A Kahneman D Judgment under uncertainty: heuristics and biases Science 1974 185 1124 1131 10.1126/science.185.4157.1124
- Yeates P O’Neill P Mann K Eva KW Effect of exposure to good vs poor medical trainee performance on attending physician rating of subsequent performances JAMA 2012 308 2226 2232 10.1001/jama.2012.36515
- Norcini J Burch V Workplace-based assessment as an educational tool: AMEE Guide No. 31 Med Teach 2007 29 855 871 10.1080/01421590701775453
- Downing SM Haladyna TM Assessment in health professions education 2009 New York Routledge 44 49
- Chambers DW Do repeat clinical competency ratings stereotype students? J Dent Educ 2004 68 1220 1227
- Judge TA Ferris GR Social context of performance evaluation decisions Acad. Manage. J. 1993 36 80 105 10.2307/256513
- Turban DB Jones AP Supervisor-subordinate similarity: types, effects, and mechanisms J Appl Psychol 1988 73 228 234 10.1037/0021-9010.73.2.228
- Waldman DA Avolio BJ Race effects in performance evaluation: controlling for ability, education and experience J Appl Psychol 1991 76 897 901 10.1037/0021-9010.76.6.897
- Downing SM Haladyna TM Validity threats: overcoming interference with proposed interpretations of assessment data Med Educ 2004 38 327 333 10.1046/j.1365-2923.2004.01777.x
- Roberts C Rothnie I Zoanetti N Crossley J Should candidate scores be adjusted for interviewer stringency or leniency in the multiple mini-interview? Med Educ 2010 44 690 698 10.1111/j.1365-2923.2010.03689.x
- Harasym PH Woloschuk W Cunning L Undesired variance due to examiner stringency/leniency effect in communication skill scores assessed in OSCEs Adv. Health. Sci. Educ. Theory. Pract. 2008 13 617 632 10.1007/s10459-007-9068-0
- Boulet JR McKinley DW Whelan GP Hambleton RK Quality assurance methods for performance-based assessments Adv. Health. Sci. Educ. Theory. Pract. 2003 8 27 47 10.1023/A:1022639521218
- Iramaneerat C Yudkowsky R Myford CM Downing SM Quality control of an OSCE using generalizability theory and many-faceted Rasch measurement Adv. Health. Sci. Educ. Theory. Pract. 2008 13 479 493 10.1007/s10459-007-9060-8
- McManus IC Thompson M Mollon J Assessment of examiner leniency and stringency (‘hawk-dove effect’) in the MRCP(UK) clinical examination (PACES) using multi-facet Rasch modelling BMC Med. Educ. 2006 6 42 10.1186/1472-6920-6-42
- Bartman I Smee S Roy M A method for identifying extreme OSCE examiners Clin Teach 2013 10 27 31 10.1111/j.1743-498X.2012.00607.x
- Prieto G Nieto E Analysis of rater severity on written expression exam using Many Faceted Rasch Measurement Psicologica 2014 35 385 397
- Raymond MR Viswesvaran C Least squares models to correct for rater effects in performance assessment J Educ Meas 1993 30 253 268 10.1111/j.1745-3984.1993.tb00426.x
- Meijer RR Sijtsma K Person-fit statistics: what is their purpose? Rasch Meas Trans 2001 15 823
- Karabatsos G Comparing the aberrant response detection performance of thirty-six person-fit statistics Appl Meas Educ 2003 16 277 298 10.1207/S15324818AME1604_2
- Meijer RR Person-fit research: an introduction Appl Meas Educ 1996 9 3 8 10.1207/s15324818ame0901_2
- Rupp AA A systematic review of the methodology for person fit research in item response theory: lessons about generalizability of inferences from the design of simulation studies Psychol Test Assess Model 2013 55 3 38
- Drasgow F Levine MV Williams EA Appropriateness measurement with polychotomous item response models and standardized indices Br J Math Stat Psychol 1985 38 67 86 10.1111/j.2044-8317.1985.tb00817.x
- St-Onge C Valois P Abdous B Germain S Person-fit statistics’ accuracy: a Monte Carlo study of the aberrance rate’s influence Appl. Psychol. Meas. 2011 35 419 432
- Nering ML Meijer RR A comparison of the person response function and the lz person-fit statistic Appl Psychol Meas 1998 22 53 69 10.1177/01466216980221004
- Kinase S Mohammadi A Takahashi M Application of Monte Carlo simulation and Voxel models to internal dosimetry Applications of Monte Carlo methods in biology, medicine and other fields of science 2011 Garching bei München InTech
- Alexander C Monte Carlo VaR Market risk analysis 2009 Hoboken John Wiley & Sons 201 246
- De Champlain AF A primer on classical test theory and item response theory for assessments in medical education Med Educ 2010 44 109 117 10.1111/j.1365-2923.2009.03425.x
- DeMars C Item response theory 2010 Oxford Oxford University Press 10.1093/acprof:oso/9780195377033.001.0001
- Bertrand R Blais JG Modèles de Mesure: L’Apport de la Théorie des Réponses aux Items 2004 Sainte-Foy Presses de l’Université du Québec
- Osterlind SJ Modern measurement: theory, principles, and applications of mental appraisal 2006 Columbus Pearson Merrill Prentice Hall
- Laurencelle L Germain S Les estimateurs de capacité dans la théorie des réponses aux items et leur biais Tutor Quant Methods Psychol 2011 7 42 53 10.20982/tqmp.07.2.p042
- Levine MV Rubin DB Measuring the appropriateness of multiple-choice test scores J Educ Behav Stat 1979 4 269 290 10.3102/10769986004004269
- Magis D Raîche G Béland S A didactic presentation of Snijders’s lz* index of person fit with emphasis on response model selection and ability estimation J Educ Behav Stat 2012 37 57 81 10.3102/1076998610396894
- Noonan BW Boss MW Gessaroli ME The effect of test length and IRT model on the distribution and stability of three appropriateness indexes Appl. Psychol. Meas. 1992 16 345 352 10.1177/014662169201600405
- Reise SP A comparison of item- and person-fit methods of assessing model-data fit in IRT Appl. Psychol. Meas. 1990 14 127 137 10.1177/014662169001400202
- Olejnik S Algina J Measures of effect size for comparative studies: applications, interpretations, and limitations Contemp Educ Psychol 2000 25 241 286 10.1006/ceps.2000.1040
- Cohen J Statistical power analysis for the behavioral sciences: a computer program 1988 Mahwah Lawrence Erlbaum Associates
- St-Onge C Valois P Abdous B Germain S A Monte Carlo study of the effect of item characteristic curve estimation on the accuracy of three person-fit statistics Appl Psychol Meas 2009 33 307 324 10.1177/0146621608329503
- R Core Team R: A language and environment for statistical computing 2013 Vienna R Foundation for Statistical Computing
- Germain S Valois P Abdous B The item response theory library 2016
- Govaerts MJB In-training assessment: learning from practice Clin Teach 2006 3 242 247 10.1111/j.1743-498X.2006.00119.x
- Williams RG Klamen DA McGaghie W Cognitive, social, and environmental sources of bias in clinical performance ratings Teach Learn Med 2003 15 270 292 10.1207/S15328015TLM1504_11
- Haladyna TM Downing SM Construct-irrelevant variance in high-stakes testing Educ Meas Issues Pract 2004 23 17 27 10.1111/j.1745-3992.2004.tb00149.x
- Drasgow F Levine MV McLaughlin ME Detecting inappropriate test scores with optimal and practical appropriateness indices Appl. Psychol. Meas. 1987 11 59 79 10.1177/014662168701100105
- Emons WHM Sijtsma K Meijer RR Testing hypotheses about the person-response function in person-fit analysis Multivariate Behav. Res. 2004 39 1 35 10.1207/s15327906mbr3901_1
- American Educational Research Association (AERA), American Psychological Association (APA), National Council on Measurement in Education (NCME) Joint Committee on Standards for Educational and Psychological Testing Standards for educational and psychological testing 1999 Washington, DC AERA
