Abstract
Humans develop biases during language learning. For example, we rely more heavily on consonants than on vowels to identify words. Advances in artificial intelligence have enabled the development of proficient large language models that sometimes mimic human language use. They do so by tracking regularities in the natural language datasets used to train them. Here we test the hypothesis that tracking such regularities is sufficient for the emergence of responses that resemble the consonant bias. We asked ChatGPT which of two nonsense words (one differing from a target word by a vowel, the other by a consonant) was more similar to that target. We found that the model relies more on consonants than on vowels when judging similarity between words in both languages tested (English and Spanish).
