3 Ways To Deal With Captchas Using Selenium

When you first start using Selenium, automating many things at once can feel great, and you can get a lot done without ever having to deal with captchas. Eventually, though, there comes a time when you have to deal with them, especially if you are scraping the web. As always, if there is a way to avoid dealing with a captcha altogether, take that route.

Here’s an example. Let’s say the site example.com asks you to fill in a captcha after you scrape about ten of its pages within some period of time. In this case, implementing a captcha-solving solution just increases your costs and is often counterproductive. A better approach is to deal with what’s causing the captcha to pop up in the first place.

This means either simulating more human-like behaviour or using proxies. There are various types of proxies, and that’s a whole other topic. In short, you want rotating residential proxies, which, unlike data-center proxies, are difficult to identify as proxies.
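As a rough illustration of both ideas, the sketch below picks a randomized pause between page loads and rotates through a list of proxies. The proxy addresses, the delay bounds, and the function names are placeholders, not values from a real provider. `--proxy-server` is a standard Chrome command-line switch that can be passed through Selenium's `ChromeOptions.add_argument`.

```python
import itertools
import random
import time

# Hypothetical proxy endpoints (203.0.113.0/24 is a documentation-only range);
# substitute the ones your provider gives you.
PROXIES = [
    "http://203.0.113.10:8000",
    "http://203.0.113.11:8000",
]
_proxy_cycle = itertools.cycle(PROXIES)


def next_proxy_argument():
    """Return a Chrome switch that points the browser at the next proxy."""
    return "--proxy-server={}".format(next(_proxy_cycle))


def human_pause(low=2.0, high=6.0, sleep=time.sleep):
    """Sleep for a random duration so requests are not evenly spaced."""
    delay = random.uniform(low, high)
    sleep(delay)
    return delay
```

With Selenium you would call `options.add_argument(next_proxy_argument())` before creating each driver and `human_pause()` between page loads. Some rotating proxy providers change the exit IP on their side, in which case a single endpoint may be enough.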

But if the site you are trying to automate requires you to fill in a captcha upfront, here are some solutions.

Method 1 – Using OCR to solve simple Captchas

Before we begin, keep in mind that you cannot really solve reCAPTCHA using this method. You could write your own AI program to do so, but with how rapidly reCAPTCHA v2 evolves, this could be a nightmare to maintain. There is a trick to bypass reCAPTCHA, though, which I will cover later in this post.

OCR is short for Optical Character Recognition. As the name implies, it recognises characters in images using code. For example, you could turn pictures of documents into editable text.

There are a few OCR libraries available. We will be using pytesseract, a Python wrapper for Google’s Tesseract-OCR Engine.

Here’s an example script. You might want to modify things a bit depending on the site you are trying to automate and the captchas they are generating.

# This is a Python script
import os

import pytesseract
from PIL import Image, ImageEnhance

imageLocation = "downloadedcaptcha.png"

# Load the captcha and normalise it to RGB.
image = Image.open(imageLocation).convert('RGB')

# Boost the brightness to wash out faint background noise.
image = ImageEnhance.Brightness(image).enhance(10)

# Save the processed image under a per-process file name.
filename = "{}.png".format(os.getpid())
image.save(filename)

# Run Tesseract on the processed image to get the captcha string.
text = pytesseract.image_to_string(Image.open(filename)).strip()

The script starts by importing the libraries it needs. Then we assign the location of the downloaded captcha image to the variable imageLocation. This captcha image is the one we download while using Selenium. We then convert the image to RGB, increase its brightness, save the result, and run pytesseract on it to get the captcha string.
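The download step itself is not shown above. One way to get the image, sketched here under assumptions, is to screenshot the captcha element directly with Selenium's `WebElement.screenshot`. The helper name and the CSS selector in the usage note are hypothetical, so adjust them to the page you are automating.

```python
def save_captcha(driver, css_selector, path="downloadedcaptcha.png"):
    """Screenshot the captcha element and return the path it was saved to.

    `driver` is an already-open Selenium WebDriver. The string
    "css selector" is the value behind selenium's By.CSS_SELECTOR.
    """
    element = driver.find_element("css selector", css_selector)
    # WebElement.screenshot() returns False if the file could not be written.
    if not element.screenshot(path):
        raise RuntimeError("could not save captcha image to {}".format(path))
    return path
```

For example, `save_captcha(driver, "img#captcha")` would write downloadedcaptcha.png, assuming the page marks its captcha image with that id.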

Will you get the correct string 100% of the time? Maybe, if you create your own filters to make the captcha OCR-friendly. But even if it is 4 in 5, it will still save you a lot of time and money compared with using services or doing it manually.
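A less-than-perfect success rate is easier to live with if you retry. The sketch below is one possible approach, not part of the script above: it discards OCR output that does not look like a captcha answer and tries again with a fresh image. The function names and the assumed answer format (4 to 8 letters or digits) are illustrative. The second helper gives the chance of at least one success, assuming attempts are independent.

```python
import re


def solve_with_retries(fetch_and_ocr, attempts=3, pattern=r"[A-Za-z0-9]{4,8}"):
    """Call fetch_and_ocr() until it returns a plausible answer, or give up.

    fetch_and_ocr is a caller-supplied function that downloads a fresh
    captcha and returns the OCR text. The pattern is an assumed format.
    """
    for _ in range(attempts):
        guess = fetch_and_ocr().strip()
        if re.fullmatch(pattern, guess):
            return guess
    return None


def chance_of_success(per_attempt, attempts):
    """Probability that at least one of several independent attempts works."""
    return 1 - (1 - per_attempt) ** attempts
```

At 4 in 5 per attempt, `chance_of_success(0.8, 3)` comes out at about 0.992. The format check only filters obvious OCR garbage; whether the site accepts the answer is still the real test.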

If you don’t want to spend a lot of time coding captcha filters for every site, you could use a product that has already done it for you, such as Captcha Sniper.

Captcha Sniper solves 3,669 built-in captcha types, with additional captcha types and updates being added regularly.

Method 2 – Using a captcha solving service

When you use these services, keep in mind that actual humans will solve these captchas. This means you will have to load your account on these websites with some cash. The rates aren’t all that bad.

Almost all of these services have APIs that you can integrate into your projects quite easily. Here’s an example using deathbycaptcha.

// This is a C# script
using System;
using DeathByCaptcha;

// Put your DBC credentials here.
// Use HttpClient class if you want to use HTTP API.
Client client = (Client) new SocketClient(USERNAME, PASSWORD);

// Put your CAPTCHA file name, stream, or vector of bytes,
// and desired timeout (in seconds) here:
Captcha captcha = client.Decode(CAPTCHA_FILE_NAME, TIMEOUT);
if (captcha.Solved && captcha.Correct) {
    // Print the CAPTCHA id and the text the service returned.
    Console.WriteLine("CAPTCHA {0}: {1}", captcha.Id, captcha.Text);
}

Solutions like this are widely considered the norm when dealing with captchas. There are many such services, and each can be better or worse depending on your requirements and the load on its workers at the moment. Here I’ll list some of them with a short review.

  • 2captcha – This is one of the best captcha solvers on the market right now. It works with various captcha types that many other services don’t support, such as FunCaptcha, Asirra and of course reCAPTCHA v2.
  • deathbycaptcha – Deathbycaptcha is the most popular choice on the market when it comes to captcha solving services. They use a hybrid of OCR and human-based captcha solving, which brings their rates down considerably. The Deathbycaptcha devs have also released support for reCAPTCHA v3, which evaluates user behaviour patterns to decide whether a visitor is a bot.
  • Anti-captcha – Anti-captcha is also one of the most popular captcha solving services. Their workers have an impressive amount of experience solving reCAPTCHAs.
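Whichever service you pick, the client side usually comes down to submitting the captcha and then waiting for a worker to answer, so a timeout is worth having. Below is a generic polling sketch. It does not use any real service's API: `check` stands in for whatever call your chosen client library offers to fetch the result, and it is assumed to return None until the answer is ready.

```python
import time


def wait_for_solution(check, timeout=60.0, interval=5.0, sleep=time.sleep):
    """Poll check() until it returns an answer or the timeout expires.

    check is a hypothetical caller-supplied function that returns the
    solved text, or None while the captcha is still being worked on.
    """
    deadline = time.monotonic() + timeout
    while True:
        answer = check()
        if answer is not None:
            return answer
        if time.monotonic() >= deadline:
            raise TimeoutError(
                "captcha was not solved within {} seconds".format(timeout)
            )
        sleep(interval)
```

Catching the `TimeoutError` lets your script skip the page or request a new captcha instead of hanging when the workers are busy.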

Method 3 – Use Buster for reCAPTCHA

This has been working well for me, since I don’t come across many captchas, but when I do they are almost always reCAPTCHAs, which can get annoying. Previously I used deathbycaptcha, but the delay can get frustrating, especially with the timeouts.

Buster takes the audio challenge from reCAPTCHA and converts it into a string. While you can’t really use it en masse unless you use proxies and sanitize your browser fingerprints, it does its job pretty well. With most websites now adopting reCAPTCHA, this tool saves a lot of time.

Simply download the extension, package it as a .crx file (or use something like crxextractor), and import it into your Selenium browser as an extension.
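The import step might look like the sketch below, which assumes Chrome and a packaged file named buster.crx (a placeholder name). `ChromeOptions.add_extension` is Selenium's call for loading a .crx file. The helper functions around it are illustrative.

```python
import os


def add_extension_checked(options, crx_path):
    """Attach a packaged extension to a Selenium ChromeOptions object."""
    if not crx_path.endswith(".crx"):
        raise ValueError("expected a .crx file, got {}".format(crx_path))
    if not os.path.isfile(crx_path):
        raise FileNotFoundError(crx_path)
    options.add_extension(crx_path)
    return options


def launch_chrome_with_buster(crx_path="buster.crx"):
    """Start Chrome with the extension loaded (requires selenium and Chrome)."""
    # Imported here so the helper above can be used without selenium installed.
    from selenium import webdriver

    options = add_extension_checked(webdriver.ChromeOptions(), crx_path)
    return webdriver.Chrome(options=options)
```

Calling `launch_chrome_with_buster()` returns a driver with the extension installed. Triggering the solver on a page is a separate step that depends on how the site embeds its reCAPTCHA.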
