Quality of Performance Assessment Instruments for Educators in Higher Education: Implementation of Factor Analysis And Generalizability Theory





Performance assessment, Higher education, Exploratory factor analysis, Generalizability theory


Assessing the quality of learning in higher education is one way to ensure that its standards are met. Such assessment typically relies on observation by multiple raters. This study aims to provide construct validity evidence for, and estimate the reliability of, performance assessment instruments for educators in higher education. A total of 225 second- and third-year students from the Faculty of Education served as raters, using 40 assessment items to evaluate the teaching performance of 19 instructors. Exploratory Factor Analysis (EFA) and Generalizability Theory (G-Theory) were employed to examine the quality of the instruments. The EFA identified five factors that contribute to educators' teaching performance: (1) readiness and planning, (2) pedagogy and professionalism, (3) personality, (4) social relationships within the classroom, and (5) social relationships beyond the classroom, which together explained 67.671% of the variance. Of the 40 assessment items, 37 demonstrated construct validity, while three required revision. These findings indicate that the instrument's factors align with the formulated theory of teaching competence. Reliability was estimated using G-Theory in RStudio, yielding a relative G coefficient of 0.88 for three raters. The D-Study results indicated that the instrument could be used to assess performance with an estimated generalizability coefficient of 0.738, provided that at least five raters evaluate each person (educator). We recommend employing G-Study and D-Study to determine the number of raters needed in performance assessment, so that the evaluation process is efficient in both cost and time.
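To illustrate how a D-Study relates the number of raters to the generalizability coefficient, the following is a minimal Python sketch. It assumes a single-facet crossed design (persons by raters), for which the relative G coefficient is the person variance divided by the sum of the person variance and the residual variance averaged over raters. The study itself ran its analysis in RStudio, and its design may include more facets than this. The variance components and the function names `relative_g` and `min_raters` are hypothetical, chosen only for illustration, and are not the estimates reported in the study.

```python
def relative_g(var_person: float, var_residual: float, n_raters: int) -> float:
    """Relative G coefficient for a crossed persons-by-raters design.

    var_person   : variance attributable to the persons being rated
    var_residual : person-by-rater interaction variance confounded with error
    n_raters     : number of raters averaged over in the D-Study
    """
    if n_raters < 1:
        raise ValueError("n_raters must be at least 1")
    # Averaging over more raters shrinks the relative error variance.
    return var_person / (var_person + var_residual / n_raters)


def min_raters(var_person: float, var_residual: float,
               target: float, max_raters: int = 100):
    """Smallest number of raters whose coefficient reaches the target.

    Returns None if the target is not reached within max_raters.
    """
    for n in range(1, max_raters + 1):
        if relative_g(var_person, var_residual, n) >= target:
            return n
    return None


# Hypothetical variance components (NOT taken from the study).
VAR_PERSON = 0.30
VAR_RESIDUAL = 0.50

for n in range(1, 8):
    coefficient = relative_g(VAR_PERSON, VAR_RESIDUAL, n)
    print(f"raters = {n}: relative G = {coefficient:.3f}")

print("minimum raters for a target of 0.70:",
      min_raters(VAR_PERSON, VAR_RESIDUAL, target=0.70))
```

With real data, the two variance components would come from the G-Study, and the loop would show the point at which adding raters stops yielding a worthwhile gain in the coefficient.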








How to Cite

Djidu, H., Istiyono, E., & Widihastuti, W. (2023). Quality of Performance Assessment Instruments for Educators in Higher Education: Implementation of Factor Analysis And Generalizability Theory. Jurnal Penelitian Dan Pengkajian Ilmu Pendidikan: E-Saintika, 7(2), 144–159. https://doi.org/10.36312/esaintika.v7i2.716



Original Research Article