Detecting rater bias using a person-fit statistic: a Monte Carlo simulation study

Open Access | Jan 2018

Language: English
Published on: Jan 2, 2018
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2018 André-Sébastien Aubin, Christina St-Onge, Jean-Sébastien Renaud, published by Bohn Stafleu van Loghum
This work is licensed under the Creative Commons Attribution 4.0 License.