How secure are CAPTCHAs really?

AI is more advanced than ever... are CAPTCHAs still relevant?

Oct 23, 2024

TL;DR: CAPTCHAs are no longer secure against AI and are mostly just a headache for internet users. CAPTCHAs must either adapt or disappear.

In the rapidly evolving domain of online security, CAPTCHAs (Completely Automated Public Turing tests to tell Computers and Humans Apart) have long served as a primary defense mechanism against automated bot access. However, as artificial intelligence (AI) technology advances at an unprecedented pace, the efficacy of these security measures is increasingly called into question.

The evolution of CAPTCHA technology began with text-based systems, which presented distorted strings of letters and numbers for users to decipher. Once considered the standard in bot prevention, these systems have now been rendered largely ineffective. Modern AI, equipped with sophisticated Optical Character Recognition (OCR) technology, can now interpret these challenges with greater speed and accuracy than human users. It is noteworthy that Google, the company that popularized CAPTCHAs through its reCAPTCHA system, declared text-based CAPTCHAs obsolete as early as 2014.

In response to the declining effectiveness of text-based systems, image-based CAPTCHAs were introduced. These systems typically require users to perform tasks such as identifying specific objects within a set of images. However, recent developments in AI have also begun to undermine the security of these systems. Advanced AI models, exemplified by GPT-4 and similar large language models, have demonstrated a remarkable ability to interpret instructions, analyze images, and solve these challenges with accuracy comparable to or exceeding human performance.

The latest iteration in CAPTCHA technology is represented by systems like Google's reCAPTCHA v3, which operates invisibly in the background. While this approach improves user experience by eliminating the need for explicit challenges, it raises significant privacy concerns. The system's operation is notably opaque, and relies on security via obscurity, with neither users nor website administrators fully understanding its mechanisms. It is known to collect and analyse extensive user behaviour data, potentially including factors such as cookies, browser fingerprinting, and user history. In an era of increasing concern over digital privacy and stringent data protection regulations like the General Data Protection Regulation (GDPR), this level of covert data collection is problematic.
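To make the "invisible" model a little more concrete: with reCAPTCHA v3 the site owner gets back a score rather than a pass/fail challenge, and has to decide what to do with it. Below is a minimal sketch of that decision in Python, assuming the verification response has already been fetched and parsed into a dict. The `success`, `score`, and `action` fields follow Google's documented response format; the function name, the `expected_action` parameter, and the 0.5 threshold are illustrative assumptions, not a recommendation.

```python
from typing import Any, Mapping


def assess_recaptcha_v3(
    result: Mapping[str, Any],
    expected_action: str,
    min_score: float = 0.5,  # assumed threshold; tune per site
) -> bool:
    """Return True if a parsed verification response looks acceptable."""
    # The token itself must have been valid.
    if not result.get("success", False):
        return False
    # The action should match the one the page requested the token for.
    if result.get("action") != expected_action:
        return False
    # The score is expected to be a number between 0.0 and 1.0.
    score = result.get("score")
    if not isinstance(score, (int, float)):
        return False
    return score >= min_score


if __name__ == "__main__":
    sample = {"success": True, "score": 0.9, "action": "login"}
    print(assess_recaptcha_v3(sample, expected_action="login"))
```

Note that nothing in the response explains why a given score was assigned, which is exactly the opacity discussed above.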

Given these developments, an objective assessment of current CAPTCHA security yields concerning results. Text-based CAPTCHAs are effectively obsolete in the face of modern AI capabilities. Image-based CAPTCHAs are increasingly vulnerable to advanced AI models. Invisible CAPTCHA systems, while potentially effective, introduce serious privacy implications.

Which begs the question… How secure are CAPTCHAs really?

CAPTCHAs vs. GPTs - The LLM Wars: CAPTCHA Strikes Back

What was once considered an impenetrable fortress against automated bots is now starting to crumble because of OpenAI’s GPT-4o and Google’s Gemini. These models can now not only understand the challenges posed by CAPTCHAs but solve them with startling accuracy.

reCAPTCHA, hCAPTCHA, and Arkose Labs each present distinct types of challenges. reCAPTCHA and hCAPTCHA primarily use image-based tasks, such as selecting images that contain specific objects (e.g., crosswalks, traffic lights or generated images of animals), while Arkose Labs focuses on more interactive tasks, such as rotating an object toward a specific direction.

To demonstrate how AI is outpacing CAPTCHA technology, three of the most widely-used CAPTCHA systems - Google’s reCAPTCHA, hCAPTCHA, and Arkose Labs CAPTCHA - were tested against GPT-4o and Gemini.

The goal is simple: give each LLM the same prompt for every system and ask it to solve the CAPTCHA. Below you can see the prompt that was used, how the results were extracted, and how successful each LLM was at breaking the CAPTCHAs.

What’s particularly revealing is that the prompt given to the LLMs was intentionally vague about the specific task. Instead, we simply provided screenshots of the CAPTCHA challenges. Despite the lack of explicit instructions, the AI models successfully extracted the nature of the task directly from the image and responded accordingly, showcasing their capacity to "understand" the CAPTCHA's request without needing a detailed prompt.

ChatGPT vs. Google reCAPTCHA

Prompt used: “Solve this”

ChatGPT solving Google reCAPTCHA.

ChatGPT vs. hCAPTCHA

Prompt used: “Solve this”

ChatGPT solving hCAPTCHA.

ChatGPT vs. Arkose Labs

Prompt used: “Solve this. Which image is the correct answer?”

This prompt was used because with just “Solve this” GPT didn’t associate the images and simply described how one would rotate the object until it faced the right direction.

ChatGPT solving Arkose Labs MatchKey Advanced CAPTCHA.

Gemini vs reCAPTCHA

Prompt used: “Solve this”

Gemini incorrectly solving reCAPTCHA. It saw 10 images where there are only 9.


Gemini, “Y U DO DIS?!” It seems Gemini saw 10 square images in the CAPTCHA challenge and got confused. With some prompt tuning (for example, specifying that the images form a 3x3 grid), Gemini is able to parse the 9 images, but it somehow loses track of the position of the middle-row, right image, which it reports as the bottom-right image. reCAPTCHA 1-0 Gemini.
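The mix-up between "middle row, right" and "bottom right" is easier to see with the tiles numbered. Here is a small Python sketch that maps a tile index to a position label, assuming a 3x3 grid of nine tiles numbered 0 to 8 in row-major order (left to right, top to bottom). The numbering scheme and the `tile_label` helper are assumptions made for illustration; they are not part of any CAPTCHA vendor's API.

```python
ROW_NAMES = ("top", "middle", "bottom")
COL_NAMES = ("left", "center", "right")


def tile_label(index: int) -> str:
    """Map a row-major tile index (0-8) in a 3x3 grid to a position label."""
    if not 0 <= index < 9:
        raise ValueError(f"tile index out of range for a 3x3 grid: {index}")
    # divmod gives (row, column) for a grid that is 3 tiles wide.
    row, col = divmod(index, 3)
    return f"{ROW_NAMES[row]} {COL_NAMES[col]}"


if __name__ == "__main__":
    print(tile_label(5))  # middle right
    print(tile_label(8))  # bottom right
```

Under this numbering, the tile Gemini should have named is index 5, while the one it actually named is index 8.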

Gemini vs hCAPTCHA

Prompt used: “Solve this”

Gemini… *sigh*. After taking some time to read Gemini’s answer, it seems that either it treated the example image as the “top right” image or it actually meant the first-row, middle image. Nevertheless, Gemini says “Top Right Image” and even brags:

“This is the most obvious answer”.

Gemini vs Arkose Labs

Prompt used: “Solve this. Which image is the correct answer. Is this image the correct answer? Say yes or no”. The reason for this is that Gemini only allows one image to be uploaded, so we had to turn it into a yes/no task.

Gemini boasting about its non-existent CAPTCHA-breaking skills with the Arkose Labs MatchKey CAPTCHA.

Chill Player… all correct but one? Are you sure you got the task right? I kid you not. Try it for yourself.

And the winner of The LLM Wars is… ChatGPT!

Gemini struggled in every challenge, and even had trouble keeping track of the order of the images.

Conclusion

As the limitations of traditional CAPTCHAs become more apparent, new approaches to bot detection are being explored. These include more sophisticated behavioral analysis techniques, multi-factor authentication systems, and AI-powered detection methods designed to identify bot-like behavior. However, these emerging methods face their own challenges in balancing security, user experience, and privacy considerations.

The current state of CAPTCHA security is precarious. Conventional methods are increasingly ineffective, while newer methods introduce privacy concerns. As AI technology continues its rapid advancement, the ongoing contest between security systems and automated bots is likely to intensify.

We saw that GPTs not only know what CAPTCHAs are, but can also extract the task from the image and ultimately give correct instructions for solving the challenge. With simple automation, one could easily create a CAPTCHA-breaking bot.

For website administrators and security professionals, these developments necessitate a reevaluation of security strategies. Relying solely on CAPTCHAs is no longer a viable approach to security. Instead, a multi-layered security strategy that incorporates various methods may be necessary to effectively protect against automated attacks.
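As a rough illustration of what "multi-layered" could look like in practice, here is a minimal Python sketch that combines three independent signals: a hidden honeypot form field, a sliding-window rate limit per client, and a risk score from whatever scoring service is in use. Everything here (the `LayeredBotCheck` class, its parameters, the thresholds, and the reason strings) is hypothetical and meant only to show the shape of the idea, not a production design.

```python
import time
from collections import defaultdict, deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LayeredBotCheck:
    max_requests: int = 5          # assumed limit per window
    window_seconds: float = 60.0   # assumed window length
    min_score: float = 0.5         # assumed risk-score threshold
    _hits: dict = field(default_factory=lambda: defaultdict(deque))

    def evaluate(
        self,
        client_id: str,
        honeypot_value: str,
        risk_score: float,
        now: Optional[float] = None,
    ) -> list:
        """Return the list of layers that flagged this request (empty if none)."""
        now = time.monotonic() if now is None else now
        reasons = []

        # Layer 1: a hidden form field that humans never see or fill in.
        if honeypot_value:
            reasons.append("honeypot_filled")

        # Layer 2: sliding-window rate limit per client.
        hits = self._hits[client_id]
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        hits.append(now)
        if len(hits) > self.max_requests:
            reasons.append("rate_limited")

        # Layer 3: score from a behavioral or CAPTCHA scoring service.
        if risk_score < self.min_score:
            reasons.append("low_score")

        return reasons


if __name__ == "__main__":
    check = LayeredBotCheck(max_requests=2, window_seconds=10.0)
    print(check.evaluate("client-a", honeypot_value="", risk_score=0.9))
```

Returning the list of triggered layers, rather than a single yes/no, leaves room to respond differently depending on how many signals fire, for instance stepping up to an extra verification step instead of blocking outright.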

For end-users, awareness of the privacy implications of these systems, particularly invisible CAPTCHAs, is crucial. The data provided in the course of these security checks may have more significant privacy implications than immediately apparent.

Moving forward, the challenge lies in developing security measures that can effectively differentiate between human users and increasingly sophisticated bots, while also respecting user privacy and maintaining a seamless user experience. This balance will likely be one of the key challenges in digital security for the foreseeable future.

GotCHA’s Substack
