Captchas: Why We Need Them, How They’re Evolving, and How You Can Solve Them More Easily


As humans living in the modern world, we have to claim we’re not robots fairly often, and we’re not even living in some futuristic science-fiction dystopia. Regardless, whether you’re checking an “I am not a robot” box, teaching an AI what constitutes a “sign” (Hint: the “correct” answer is whatever the majority of other users have picked; try to think like the crowd), or solving math problems, the goal is always the same: stop bots from messing up websites, and maybe use the humans solving the captchas to digitize some books, train some image-recognition software, or generate some ad revenue. But there’s more to captchas than meets the eye, and they’re far from foolproof.

Why do we need captchas?

CAPTCHA stands for “Completely Automated Public Turing test to tell Computers and Humans Apart,” which, aside from being a truly elegant acronym, tells you most of what you need to know. The idea is, as Google’s reCAPTCHA motto goes, to create a task that is “Easy for people, hard for bots.”

captcha-turing-test

“Bot” generally refers to any program that is set to automatically complete some process, whether it’s posting news on Twitter or leaving spam in website comment sections. Used correctly, these programs are fairly useful, but they can also be used to generate useless/ad-ridden/malicious content, overwhelm a site with signups, rig online poll results, scrape email addresses, or do any number of other unpleasant things. It’s just best not to let them in.

What exactly is a captcha?

captcha-distorted-text

If you’ve been around the Internet for a while, you’ll remember that for most of the 2000s, the most common captcha type was a strip of distorted text with some string of alphanumeric characters in it. This is no longer a very secure form of captcha, but when Google acquired reCAPTCHA in 2009, it was still good enough to get most bots. Since then, Google has switched to the more secure “I am not a robot” boxes (which actually monitor behavior like mouse movement and browser information to check if you’re a bot) and image-recognition challenges. Audio-based captchas are still around, though, and they are surprisingly easy to break with speech-recognition software.

captcha-image-id

The image-recognition captchas have had their own set of issues, as they can be a little ambiguous for human respondents. As mentioned above, though, there is no objectively right answer: the computer doesn’t know which pictures are storefronts and which are schnauzers, so it just accepts the majority human opinion as correct. If 75 out of 100 humans decide to label a blurry picture of a mop as a schnauzer, the computer will assume that the mop is a schnauzer and will mark you wrong if you don’t label it as such.
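To make the majority-rule idea concrete, here is a minimal sketch in Python. It is not how reCAPTCHA is actually implemented; the function names (`consensus_label`, `grade`) and the simple "more than half" threshold are assumptions made purely for illustration.

```python
from collections import Counter

def consensus_label(votes, min_share=0.5):
    """Return the label picked by more than min_share of voters, or None."""
    if not votes:
        return None
    label, count = Counter(votes).most_common(1)[0]
    return label if count / len(votes) > min_share else None

def grade(answer, votes):
    """Mark an answer correct only if it matches the crowd's consensus."""
    consensus = consensus_label(votes)
    return consensus is not None and answer == consensus

# 75 of 100 earlier respondents called the blurry mop a schnauzer.
votes = ["schnauzer"] * 75 + ["mop"] * 25
print(consensus_label(votes))      # schnauzer
print(grade("mop", votes))         # False: the crowd outvotes you
print(grade("schnauzer", votes))   # True
```

Note that the "correct" label here is simply whatever the crowd said, which is why thinking like the crowd gets you through.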

But there are plenty of other captcha options as well, and they can get pretty creative. These are just a few of the ideas that have made it onto various websites.

The slide-lock captcha:

captcha-slidelock

The math problem captcha:

captcha-math

The drag-and-drop captcha:

captcha-puzzle

The image orientation captcha:

captcha-orientation

The logic/grammar captcha:

captcha-egglue
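Of the types above, the math problem captcha is the simplest to picture in code. The following is only a rough sketch of the idea, assuming single-digit addition; the function names `make_math_captcha` and `check_math_captcha` are hypothetical and not taken from any particular captcha library.

```python
import random

def make_math_captcha(rng=random):
    """Return a (question, expected_answer) pair for a simple sum."""
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    return f"What is {a} + {b}?", a + b

def check_math_captcha(expected, submitted):
    """Compare the visitor's typed answer (a string) to the expected sum."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        # Non-numeric input is treated as a failed attempt.
        return False

question, expected = make_math_captcha()
print(question)
print(check_math_captcha(expected, str(expected)))   # True
print(check_math_captcha(expected, "not a number"))  # False
```

In a real site the expected answer would presumably stay on the server (for example in the visitor's session) rather than being sent along with the question. Since a bot can parse and add two numbers trivially, this kind of captcha mostly stops only the least sophisticated scripts.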

There are also some captchas you never see, such as the honeypot captcha, which involves adding an invisible field to a webpage, waiting for a bot to fill it out (humans won’t, as they can’t see it), and subsequently kicking the bot off. Then there’s Google’s “invisible captcha,” often paired with the “I am not a robot” checkbox, which watches how you interact with a webpage (mouse movements, scrolling, clicking, general behavior) to decide whether it should give you an image-recognition captcha as a double-check.
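The honeypot check is easy to sketch on the server side. The example below assumes the form contains an extra field that is hidden from human visitors (for instance with CSS) and that submitted form data arrives as a dictionary; the field name `website` and the function `is_probable_bot` are made up for illustration.

```python
# Hypothetical name of the hidden form field; humans never see or fill it.
HONEYPOT_FIELD = "website"

def is_probable_bot(form_data):
    """Flag a submission as a bot if the hidden honeypot field has content."""
    return bool(form_data.get(HONEYPOT_FIELD, "").strip())

human = {"name": "Ada", "comment": "Nice post!", "website": ""}
bot = {"name": "x", "comment": "Buy now", "website": "http://spam.example"}

print(is_probable_bot(human))  # False
print(is_probable_bot(bot))    # True
```

This only catches bots that blindly fill in every field they find, so it works best as one quiet layer among several.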

Captchas: making the world a better place

You may not know it, but the cumulative hours you’ve spent proving you’re not a robot may have actually made a difference. reCAPTCHA, now Google’s captcha service, was originally designed by Luis von Ahn (now better known as the co-founder of Duolingo) as a way to use wasted brainpower to digitize books. By presenting users with a scanned word from a book or newspaper, this system could both confirm that the user was human and take a sort of opinion poll on what the word was. If enough people agreed on the word, the digitization system would accept the answer into the ebook version.

After this system was implemented, it reportedly took only two years to digitize the entire Google Books library and the entire New York Times archive. By 2012, Google had switched over to using humans to input house numbers pulled from Google Street View.

captcha-digitization

In 2014, things took an ironic turn towards the robotic: image-recognition captchas. These work on the principle that machines aren’t very good at figuring out what’s in a picture, but, as described above, the humans solving them have been pretty effective at training AIs to do just that. Since this type of captcha will eventually work itself out of a job, it is being phased out in favor of the less-visible behavioral/tracking-oriented ones.

In conclusion: am I a robot?

As artificial intelligence, deep learning, and a host of other advancements come down the pike in the next few decades, captchas are going to have to evolve as well. Most captchas in existence have already been cracked, and it’s only getting easier. Training a machine to read distorted-text captchas now takes about fifteen minutes. Perhaps the only thing left in the future will be biometric captchas (hope you like facial recognition scans!), or perhaps we’ll wake up and discover that the singularity has already been reached, and we were the bots the whole time.

Image credit: Chippee via bad google recaptcha house numbers


Andrew Braun
