Consistency of aberrant response behavior: Are misfit persons consistent across two different questionnaires administered at the same time?

Open Access | Aug 2025

References

  1. Adams, R.J., Wilson, M., & Wang, W. (1997). The multidimensional random coefficients multinomial logit model. Applied Psychological Measurement, 21(1), 1-23. https://doi.org/10.1177/0146621697211001
  2. Alnahdi, G. H., & Yada, A. (2020). Rasch analysis of the Japanese version of Teacher Efficacy for Inclusive Practices Scale: Scale unidimensionality. Frontiers in Psychology, 11: 1725. https://doi.org/10.3389/fpsyg.2020.01725
  3. American Educational Research Association (AERA), American Psychological Association (APA), National Council for Measurement in Education (NCME). (2014). Standards for educational and psychological testing. American Educational Research Association.
  4. André, Q. (2022). Outlier exclusion procedures must be blind to the researcher’s hypothesis. Journal of Experimental Psychology: General, 151(1), 213–223. https://doi.org/10.1037/xge0001069
  5. Andrich, D., & Marais, I. (2014). Person proficiency estimates in the dichotomous Rasch model when random guessing is removed from difficulty estimates of multiple choice items. Applied Psychological Measurement, 38(6), 432-449. https://doi.org/10.1177/0146621614529646
  6. Andrich, D., Marais, I., & Humphry, S. (2016). Controlling guessing bias in the dichotomous Rasch model applied to a large-scale, vertically scaled testing program. Educational and Psychological Measurement, 76(3), 412-435. https://doi.org/10.1177/0013164415594202
  7. Artner, R. (2016). A simulation study of person-fit in the Rasch model. Psychological Test and Assessment Modeling, 58(3), 531–563.
  8. Bond, T. G., & Fox, C. M. (2007). Applying the Rasch model: Fundamental measurement in the human sciences (2nd ed.). Lawrence Erlbaum Associates.
  9. Briz-Redon, A. (2021). Respondent burden effects on item non-response and careless response rates: An analysis of two types of surveys. Mathematics, 9(17), 2035. https://doi.org/10.3390/math9172035
  10. Burchell, B., & Marsh, C. (1992). The effect of questionnaire length on survey response. Quality and Quantity, 26(3), 233-244. https://doi.org/10.1007/BF00172427
  11. Chalmers, R. P. (2012). mirt: A Multidimensional Item Response Theory Package for the R Environment. Journal of Statistical Software, 48(6), 1-29. https://doi.org/10.18637/jss.v048.i06
  12. Conijn, J. M., Emons, W. H. M., & Sijtsma, K. (2014). Statistic lz-based person-fit methods for noncognitive multiscale measures. Applied Psychological Measurement, 38(2), 122-136. https://doi.org/10.1177/0146621613497568
  13. Crişan, D. R., Tendeiro, J. N., & Meijer, R. R. (2017). Investigating the practical consequences of model misfit in unidimensional IRT models. Applied Psychological Measurement, 41(6), 439–455. https://doi.org/10.1177/0146621617695522
  14. Curtis, D. D. (2001). Misfits: People and their problems. What might it all mean? International Education Journal, 2(4), 91-99.
  15. Curtis, D. D. (2004). Person misfit in attitude surveys: Influences, impacts and implications. International Education Journal, 5(2), 125-144.
  16. Drasgow, F., Levine, M. V., & Williams, E. A. (1985). Appropriateness measurement with polychotomous item response models and standardized indices. British Journal of Mathematical and Statistical Psychology, 38(1), 67-86. https://doi.org/10.1111/j.2044-8317.1985.tb00817.x
  17. Du, J., Wang, Y., Wu, A., Jiang, Y., Duan, Y., Geng, W., Wan, L., Li, J., Hu, J., Jiang, J., Shi, L., & Wei, J. (2024). The validity and IRT psychometric analysis of Chinese version of Difficult Doctor-Patient Relationship Questionnaire (DDPRQ-10). BMC Psychiatry, 23: 900. https://doi.org/10.1186/s12888-023-05385-5
  18. Egberink, I. J. L., Meijer, R. R., Veldkamp, B. P., Schakel, L., & Smid, N. G. (2010). Detection of aberrant item score patterns in computerized adaptive testing: An empirical example using the CUSUM. Personality and Individual Differences, 48(8), 921-925. https://doi.org/10.1016/j.paid.2010.02.023
  19. Emons, W. H. M., Sijtsma, K., & Meijer, R. R. (2005). Global, local, and graphical person fit analysis using person-response functions. Psychological Methods, 10(1), 101-119. https://doi.org/10.1037/1082-989X.10.1.101
  20. Felt, J. M., Castaneda, R., Tiemensma, J., & Depaoli, S. (2017). Using person fit statistics to detect outliers in survey research. Frontiers in Psychology, 8: 863. https://doi.org/10.3389/fpsyg.2017.00863
  21. Ferrando, P. J. (2015). Assessing person fit in typical-response measures. In S. P. Reise & D. A. Revicki (Eds.), Handbook of item response theory modeling: Applications to typical performance assessment (pp. 128–155). Routledge/Taylor & Francis Group.
  22. Ferrando, P. J., Vigil-Colet, A., & Lorenzo-Seva, U. (2016). Practical person-fit assessment with the linear FA model: New developments and a comparative study. Frontiers in Psychology, 7: 1973. https://doi.org/10.3389/fpsyg.2016.01973
  23. Haberman, S. J., Sinharay, S., & Chon, K. H. (2013). Assessing item fit for unidimensional item response theory models using residuals from estimated item response functions. Psychometrika, 78(3), 417–440. https://doi.org/10.1007/s11336-012-9305-1
  24. Hayat, B., Rahayu, W., Putra, M. D. K., Sarifah, I., Puri, V. G. S., & Isa, K. (2023). Metacognitive Skills Assessment in Research-Proposal Writing (MSARPW) in the Indonesian university context: Scale development and validation using multidimensional item response models. Jurnal Pengukuran Psikologi dan Pendidikan Indonesia, 12(1), 31-47. https://doi.org/10.15408/jp3i.v12i1.31679
  25. Hong, S. E., Monroe, S., & Falk, C. F. (2020). Performance of person-fit statistics under model misspecification. Journal of Educational Measurement, 57(3), 423-442. https://doi.org/10.1111/jedm.12207
  26. International Test Commission (ITC). (2014). ITC guidelines on quality control in scoring, test analysis, and reporting of test scores. International Journal of Testing, 14(3), 195-217. https://doi.org/10.1080/15305058.2014.918040
  27. Jones, E. A., Wind, S. A., Tsai, C-L., & Ge, Y. (2023). Comparing person-fit and traditional indices across careless response patterns in surveys. Applied Psychological Measurement, 47(5-6), 365-385. https://doi.org/10.1177/01466216231194358
  28. Karabatsos, G. (1998). Analyzing nonadditive conjoint structures: Compounding events by Rasch model probabilities. Journal of Outcome Measurement, 2(3), 191-221.
  29. Karabatsos, G. (2000). A critique of Rasch residual fit statistics. Journal of Applied Measurement, 1(2), 152-176.
  30. Karabatsos, G. (2003). Comparing the aberrant response detection performance of thirty-six person-fit statistics. Applied Measurement in Education, 16(4), 277-298. https://doi.org/10.1207/S15324818AME1604_2
  31. Levine, M. V., & Rubin, D. B. (1979). Measuring the appropriateness of multiple-choice test scores. Journal of Educational Statistics, 4(4), 269–290. https://doi.org/10.2307/1164595
  32. Linacre, J. M. (2005). When to stop removing items and persons in Rasch analysis? Rasch Measurement Transactions, 23(4), 1241.
  33. Li, M.-n. F., & Olejnik, S. (1997). The power of Rasch person-fit statistics in detecting unusual response patterns. Applied Psychological Measurement, 21(3), 215–231. https://doi.org/10.1177/01466216970213002
  34. Liu, Y., & Maydeu-Olivares, A. (2014). Identifying the source of misfit in item response theory models. Multivariate Behavioral Research, 49(4), 354-371. https://doi.org/10.1080/00273171.2014.910744
  35. Liu, Y., & Liu, H. (2021). Detecting noneffortful responses based on a residual method using an iterative purification process. Journal of Educational and Behavioral Statistics, 46(6), 717-752. https://doi.org/10.3102/1076998621994366
  36. Liu, T., Lan, T., & Xin, T. (2019a). Detecting random responses in a personality scale using IRT-based person-fit indices. European Journal of Psychological Assessment, 35(1), 126-136. https://doi.org/10.1027/1015-5759/a000369
  37. Liu, T., Sun, Y., Li, Z., & Xin, T. (2019b). The impact of aberrant response on reliability and validity. Measurement: Interdisciplinary Research and Perspectives, 17(3), 133-142. https://doi.org/10.1080/15366367.2019.1584848
  38. Lundgren, E., & Eklof, H. (2023). Questionnaire-taking motivation: Using response times to assess motivation to optimize on the PISA 2018 student questionnaire. International Journal of Testing, 23(4), 231-256. https://doi.org/10.1080/15305058.2023.2214647
  39. Maroqi, N. (2018). Uji validitas konstruk pada instrumen Rosenberg Self-Esteem Scale dengan metode confirmatory factor analysis (CFA). Jurnal Pengukuran Psikologi dan Pendidikan Indonesia, 7(2), 92-96. https://doi.org/10.15408/jp3i.v7i2.12101
  40. Masters, G. N. (1982). A Rasch model for partial credit scoring. Psychometrika, 47(2), 149–174. https://doi.org/10.1007/BF02296272
  41. Maydeu-Olivares, A. (2013). Why should we assess the goodness of fit of IRT models? Measurement: Interdisciplinary Research and Perspectives, 11(3), 127-137. https://doi.org/10.1080/15366367.2013.841511
  42. Meijer, R. R. (1996). Person fit research: An introduction. Applied Measurement in Education, 9(1), 3-8. https://doi.org/10.1207/s15324818ame0901_2
  43. Meijer, R. R. (2003). Diagnosing item score patterns on a test using item response theory-based person-fit statistics. Psychological Methods, 8(1), 72–87. https://doi.org/10.1037/1082-989X.8.1.72
  44. Meijer, R. R., & Sijtsma, K. (1995). Detection of aberrant item score patterns: A review of recent developments. Applied Measurement in Education, 8(3), 261–272. https://doi.org/10.1207/s15324818ame0803_5
  45. Meijer, R. R., & Sijtsma, K. (2001). Methodology review: Evaluating person fit. Applied Psychological Measurement, 25(2), 107–135. https://doi.org/10.1177/01466210122031957
  46. Meijer, R. R., & Tendeiro, J. N. (2012). The use of the lz and lz* person-fit statistics and problems derived from model misspecification. Journal of Educational and Behavioral Statistics, 37(6), 758-766. https://doi.org/10.3102/1076998612466144
  47. Meijer, R. R., Niessen, A. S. M., & Tendeiro, J. N. (2016). A practical guide to check the consistency of item response patterns in clinical research through person fit statistics: Examples and a computer program. Assessment, 23(1), 56-62. https://doi.org/10.1177/1073191115577800
  48. Moshagen, M., & Bader, M. (2024). semPower: General power analysis for structural equation models. Behavior Research Methods, 56(4), 2901-2922. https://doi.org/10.3758/s13428-023-02254-7
  49. Ogihara, Y., & Kusumi, T. (2020). The developmental trajectory of self-esteem across the life span in Japan: Age differences in scores on the Rosenberg Self-Esteem Scale from adolescence to old age. Frontiers in Public Health, 8: 132. https://doi.org/10.3389/fpubh.2020.00132
  50. Olson, J. F., & Fremer, J. (2013). TILSA Test security guidebook: Preventing, detecting, and investigating test security irregularities. Council of Chief State School Officers.
  51. Panayides, P., & Tymms, P. (2012). Is aberrant response behavior a stable characteristic of students in classroom math tests? Rasch Measurement Transactions, 26(3), 1382-1383.
  52. Panayides, P., & Tymms, P. (2013). Investigating whether aberrant response behaviour in classroom maths tests is a stable characteristic of students. Assessment in Education: Principles, Policy & Practice, 20(3), 349-368. https://doi.org/10.1080/0969594x.2012.723610
  53. Pina, J. A. L., & Montesinos, M. D. H. (2005). Fitting Rasch model using appropriateness measure statistics. The Spanish Journal of Psychology, 8(1), 100-110. https://doi.org/10.1017/S113874160000500X
  54. R Core Team. (2015). R: A language and environment for statistical computing. R Foundation for Statistical Computing. https://www.R-project.org/
  55. Rasch, G. (1960). Probabilistic models for some intelligence and attainment tests. Danish Institute for Educational Research.
  56. Reise, S. P., & Flannery, W. P. (1996). Assessing person-fit on measures of typical performance. Applied Measurement in Education, 9(1), 9–26. https://doi.org/10.1207/s15324818ame0901_3
  57. Rolstad, S., Adler, J., & Ryden, A. (2011). Response burden and questionnaire length: Is shorter better? A review and meta-analysis. Value in Health, 14(8), 1101-1108. https://doi.org/10.1016/j.jval.2011.06.003
  58. Rosenberg, M. (1965). Rosenberg Self-Esteem Scale (RSES) [Database record]. APA PsycTests. https://doi.org/10.1037/t01038-000
  59. Rosseel, Y. (2012). lavaan: An R Package for Structural Equation Modeling. Journal of Statistical Software, 48(2), 1–36. https://doi.org/10.18637/jss.v048.i02
  60. Sijtsma, K., & Meijer, R. R. (2001). The person response function as a tool in person-fit research. Psychometrika, 66(2), 191–207. https://doi.org/10.1007/BF02294835
  61. Smith, R. M. (1986). Person fit in the Rasch model. Educational and Psychological Measurement, 46(2), 359–372. https://doi.org/10.1177/001316448604600210
  62. Spoden, C., Fleischer, J., & Frey, A. (2020). Person misfit, test anxiety, and test-taking motivation in a large-scale mathematics proficiency test for self-evaluation. Studies in Educational Evaluation, 67: 100910. https://doi.org/10.1016/j.stueduc.2020.100910
  63. Tay, L., Meade, A. W., & Cao, M. (2015). An overview and practical guide to IRT measurement equivalence analysis. Organizational Research Methods, 18(1), 3-46. https://doi.org/10.1177/1094428114553062
  64. Tesio, L., Caronni, A., Kumbhare, D., & Scarano, S. (2024a). Interpreting results from Rasch analysis 1. The “most likely” measures coming from the model. Disability and Rehabilitation, 46(3), 591–603. https://doi.org/10.1080/09638288.2023.2169771
  65. Tesio, L., Caronni, A., Simone, A., Kumbhare, D., & Scarano, S. (2024b). Interpreting results from Rasch analysis 2. Advanced model applications and the data-model fit assessment. Disability and Rehabilitation, 46(3), 604–617. https://doi.org/10.1080/09638288.2023.2169772
  66. Turner, K. T., & Engelhard, G., Jr. (2024). Using functional clustering to diagnose person misfit. Journal of Experimental Education, 92(2), 377–397. https://doi.org/10.1080/00220973.2022.2161088
  67. van der Linden, W. J., & van Krimpen-Stoop, E. M. L. A. (2003). Using response times to detect aberrant responses in computerized adaptive testing. Psychometrika, 68(2), 251–265. https://doi.org/10.1007/BF02294800
  68. Wanders, R. B. K., Meijer, R. R., Ruhé, H. G., Sytema, S., Wardenaar, K. J., & de Jonge, P. (2018). Person-fit feedback on inconsistent symptom reports in clinical depression care. Psychological Medicine, 48(11), 1844-1852. https://doi.org/10.1017/S003329171700335X
  69. Wang, W.-C., Chen, P.-H., & Cheng, Y.-Y. (2004). Improving measurement precision of test batteries using multidimensional item response models. Psychological Methods, 9(1), 116–136. https://doi.org/10.1037/1082-989X.9.1.116
  70. Wind, S. A., & Schumacker, R. E. (2017). Detecting measurement disturbances in rater-mediated assessments. Educational Measurement: Issues and Practice, 36(4), 44-51. https://doi.org/10.1111/emip.12164
  71. Wright, B. D., & Stone, M. (1999). Measurement essentials (2nd ed.). Wide Range, Inc.
  72. Yekutieli, D., & Benjamini, Y. (1999). Resampling-based false discovery rate controlling multiple test procedures for correlated test statistics. Journal of Statistical Planning and Inference, 82(1-2), 171-196. https://doi.org/10.1016/S0378-3758(99)00041-5
  73. Zahra, N. S., & Wirawan, H. (2024). Empowering digital transformation: Developing and validating a Digital Leadership Scale through Rasch model analysis. Measurement: Interdisciplinary Research and Perspectives. Advance online publication. https://doi.org/10.1080/15366367.2024.2334591
  74. Zou, D., & Bolt, D. M. (2023). Person misfit and person reliability in Rating Scale Measures: The role of response styles. Measurement: Interdisciplinary Research and Perspectives, 21(3), 167-180. https://doi.org/10.1080/15366367.2022.2114243
Language: English
Page range: 49 - 60
Submitted on: Nov 21, 2024
Accepted on: Jul 28, 2025
Published on: Aug 16, 2025
Published by: Sciendo
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2025 Muhammad Dwirifqi Kharisma Putra, Faturochman, published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 License.