Generative AI's Achilles' Heel: The Challenge of Letter Counting
by Eugene Mazarakis
Introduction
The Devil’s Dictionary is a satirical dictionary written by American journalist Ambrose Bierce, consisting of common words followed by humorous and satirical definitions.
Today, we will try to write our own dictionary: The AI’s Dictionary. It will consist of common words for which AI models failed to count the occurrences of a given letter correctly. We will examine three “LLMs”: Gemini, ChatGPT, and Copilot.
The question
Let’s define the prompt for the models:
Prompt: How many times does the letter “x” appear in the word “XXXX” ?
Here, XXXX stands for the word and x for the letter.
This is the prompt we provide to Gemini, Copilot, and ChatGPT. We aim to evaluate their ability to count the occurrences of the letter ‘x’ within the word XXXX.
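To make the template concrete, here is a minimal Python sketch that fills it in and computes a reference answer to compare a model's reply against. The function names `build_prompt` and `count_letter` are hypothetical helpers introduced for illustration, and the case-insensitive comparison is an assumption; the article does not say how the correct counts were obtained.

```python
def build_prompt(word: str, letter: str) -> str:
    """Fill the article's prompt template with a concrete word and letter."""
    return f"How many times does the letter “{letter}” appear in the word “{word}” ?"


def count_letter(word: str, letter: str) -> int:
    """Reference answer: case-insensitive count of `letter` in `word` (assumed convention)."""
    return word.lower().count(letter.lower())


if __name__ == "__main__":
    word, letter = "strawberry", "r"
    print(build_prompt(word, letter))
    print("Reference count:", count_letter(word, letter))
```

The reference count is plain string counting, so any disagreement between it and a model's reply points to an error on the model's side.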
Examples
English Words
- How many times does the letter “c” appear in the word “acceptance” ?
- How many times does the letter “r” appear in the word “strawberry” ?
Greek Words
- Πόσες φορές εμφανίζεται το γράμμα “σ” μέσα στη λέξη “λυσσασμένος”? (In English: How many times does the letter “σ” appear in the word “λυσσασμένος”?)
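The expected answers for these three examples can be checked with a short Python sketch. The Greek word has one subtlety: it ends in the final-sigma form “ς”, which is a different character from “σ”. The article does not say whether the final sigma should count, so the sketch below reports both conventions; the helper names and the choice to show both are assumptions made for illustration.

```python
EXAMPLES = [
    ("acceptance", "c"),
    ("strawberry", "r"),
    ("λυσσασμένος", "σ"),
]


def count_exact(word: str, letter: str) -> int:
    """Count only characters identical to `letter`."""
    return word.count(letter)


def count_with_final_sigma(word: str, letter: str) -> int:
    """Treat final sigma 'ς' as the same letter as 'σ' before counting."""
    normalized_word = word.replace("ς", "σ")
    normalized_letter = letter.replace("ς", "σ")
    return normalized_word.count(normalized_letter)


if __name__ == "__main__":
    for word, letter in EXAMPLES:
        exact = count_exact(word, letter)
        folded = count_with_final_sigma(word, letter)
        print(f"{word}, {letter}: exact={exact}, with final sigma={folded}")
```

For the two English words both conventions give 3. For “λυσσασμένος” the exact count of “σ” is 3, and it becomes 4 if the closing “ς” is counted as well, so it is worth fixing one convention before judging a model's answer.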
Dictionary per model
Below, for each model, are the words for which it failed to count the occurrences of the given letter correctly. Each word is followed by the letter in question.
ChatGPT
- acceptance, c
- strawberry, r
Gemini
- acceptance, c
- agreement, e
- keeper, e
- strawberry, r
- weekend, e
- λυσσασμένος, σ
Copilot
- acceptance, c