Is Google Gemini better than ChatGPT at evaluating research quality?

Open Access | May 2025

Abstract

Scores from Google Gemini 1.5 Flash were compared with those from ChatGPT 4o-mini for evaluations of (a) 51 of the author’s journal articles and (b) up to 200 articles in each of 34 field-based Units of Assessment (UoAs) from the UK Research Excellence Framework (REF) 2021. From (a), the results suggest that Gemini 1.5 Flash, unlike ChatGPT 4o-mini, may work better when given a PDF or the article full text rather than just the title and abstract. From (b), Gemini 1.5 Flash seems to be marginally less able than ChatGPT 4o-mini to predict an article’s research quality (using a departmental quality proxy indicator), although the differences are small and both models show similar disciplinary variations in this ability. Averaging the scores from multiple runs of Gemini 1.5 Flash improves the results.
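As a rough illustration of the averaging step mentioned in the abstract, the sketch below averages each article's score over several runs and then measures how well the scores track a quality proxy. It is a minimal sketch under stated assumptions, not the author's procedure: the use of Spearman rank correlation, the function names, and the toy numbers are all hypothetical and are not taken from the study.

```python
from statistics import mean


def average_runs(runs):
    """Average each article's score across repeated model runs.

    runs[r][i] is the score given to article i in run r.
    """
    return [mean(scores) for scores in zip(*runs)]


def ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the group while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            result[order[k]] = avg_rank
        pos = end + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical toy data, constructed for illustration only (not study results):
# three runs scoring the same five articles, plus an assumed quality proxy.
runs = [
    [3, 2, 4, 1, 3],
    [2, 3, 4, 2, 3],
    [3, 2, 3, 1, 4],
]
proxy = [2.9, 2.5, 3.6, 2.1, 3.3]

averaged = average_runs(runs)
print("single run:", spearman(runs[0], proxy))
print("averaged:  ", spearman(averaged, proxy))
```

Averaging turns coarse, tied integer scores into finer-grained values, which is one plausible reason an averaged score could rank articles more reliably than any single run.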

DOI: https://doi.org/10.2478/jdis-2025-0014 | Journal eISSN: 2543-683X | Journal ISSN: 2096-157X
Language: English
Page range: 1 - 5
Submitted on: Dec 9, 2024
Accepted on: Dec 25, 2024
Published on: May 6, 2025
Published by: Chinese Academy of Sciences, National Science Library
In partnership with: Paradigm Publishing Services
Publication frequency: 4 times per year

© 2025 Mike Thelwall, published by Chinese Academy of Sciences, National Science Library
This work is licensed under the Creative Commons Attribution 4.0 License.