How to Retry Failed Python HTTP Requests


Python’s Requests library is widely used for many purposes, the most popular being web scraping. It’s a simpler alternative to the built-in urllib HTTP library, making it easier to connect to servers and websites.

Web scraping projects often use Requests due to its simplicity and effectiveness. It’s also easier to troubleshoot, which is immensely useful since failed requests occur frequently during scraping.

Getting Started With the Requests Library

Our guide assumes you have some basic knowledge of Python and an IDE. Once you have these set up, you’ll need to install the Requests library:

pip install requests

Pip will download, unpack, and install Requests, which you can then use to send requests.

You'll first need to import it, as with every other Python library.

import requests

Sending a request is a simple call to the GET method (or any other method you need):

import requests

def send_get_request(URL):
    r = requests.get(URL)
    print(r.status_code)

send_get_request('https://iproyal.com')

Printing the status code will be important later, as these codes serve as your indicators for retrying failed requests. You can test the script by clicking the green arrow at the top right (in PyCharm).

Types of Failed Request Responses

Every attempt to connect to a server, successful or not, returns an HTTP status code. We'll skip the successful ones, as you don't need to retry them.

403 Forbidden

Your destination server understood the request but refused to fulfill it, as you are not allowed to access that document (or the entire server). These errors are usually hard to solve, as 403 is most often returned when you need credentials or have been banned.

If you have credentials, you can include them in your GET request:

import requests

def send_get_request(URL, credentials):
    r = requests.get(URL, auth=credentials)
    print(r.status_code)

login_details = ('username', 'password')
send_get_request('https://iproyal.com', login_details)

Replacing the login_details tuple's values with your username and password should allow you to access a protected document. Note that it'll only work on a select few websites; most now use more complex login flows.

429 Too Many Requests

One of the most frequent HTTP error responses in web scraping, 429 means you've been sending too many requests to the same endpoint.

Switching proxies or implementing a strategy to retry failed requests is your best option.

500 Internal Server Error

Something failed on the server's end. A simple retry will often work, either instantly or within a few minutes.

502 Bad Gateway

Nearly identical to the 500 Internal Server Error - something went wrong with the upstream server, causing failed requests. Retrying in a short while will likely fix the issue.

503 Service Unavailable

Indicates that the server is likely completely down or otherwise unavailable. You can retry, but the error will only resolve once the administrator fixes the issue.

504 Gateway Timeout

Indicates networking issues, which may be caused by either end of the connection. Retrying with increasing delays could fix the issue.


Implementing a Failed Requests Retry Strategy

Requests provides all the tools you need to effectively address most failed requests. Of the status codes listed above, only 403 and 429 call for unique approaches, although 429 can also be handled like the rest.

There are two ways to create a strategy for Python requests retries: one involves a simple loop at set intervals and the other uses increasing delays. The former has the benefit of resolving faster, but it's also more easily detectable.

Retry Failed Requests With a Loop

import requests
import time

def send_get_request(URL, retry):
    r = None  # ensures r exists even if every attempt fails
    for i in range(retry):
        try:
            r = requests.get(URL)
            if r.status_code not in [200, 404]:
                time.sleep(5)  # wait before the next attempt
            else:
                break
        except requests.exceptions.ConnectionError:
            pass  # ignore connection errors and retry
    if r is not None:
        print(r.status_code)

send_get_request('https://dashboard.iproyal.com/login', 5)

Since we'll be using sleep to create a delay, we must import the time library, which we do right after importing Requests (although the ordering doesn't matter).

In our function, we now include a retry argument, which specifies the number of times we'll retry failed requests.

Additionally, a for loop is included, which uses the retry number as its range. An if statement verifies whether a 200 or 404 response was received; if neither, the function sleeps for 5 seconds and repeats the process.

If a 200 or 404 is received, the function stops. Additionally, if a connection error occurs, it simply does nothing, bypassing regular Python requests error handling. We also initialize r to None so the final print doesn't fail if every attempt raises a connection error.

Finally, you can always set a custom timeout for requests.get by adding a timeout=N argument if hanging requests are causing issues.
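For example, here's a minimal sketch with an illustrative 10-second limit:

import requests

# Raises requests.exceptions.Timeout if no response arrives within 10 seconds
r = requests.get('https://iproyal.com', timeout=10)
print(r.status_code)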

Retry Failed Requests With HTTPAdapter

We'll have to import more than Python requests for the second strategy. Here is the code snippet:

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

HTTPAdapter allows us to mount our failed requests retry strategy onto a session. The strategy itself is defined by the urllib3 Retry utility.

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def send_get_request(URL):
    sess = requests.session()

    retries = Retry(total = 5,
                    backoff_factor = 1,
                    status_forcelist = [429, 500, 502, 503, 504])

    sess.mount('https://', HTTPAdapter(max_retries=retries))
    get_URL = sess.get(URL)
    print(get_URL.status_code)

send_get_request('https://iproyal.com')

Our function now starts by creating a session instead of directly sending a request, which is necessary for this failed requests strategy.

We then define a Retry object with a few arguments. First, we set the total number of retries to 5 and the backoff factor to 1, then list which status codes should be retried. The backoff factor drives a more sophisticated sleep schedule, defined as:

{backoff factor} * (2 ** ({retry number} - 1))

Our first retry will be instant, but the following ones will happen at increasingly long intervals.
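To get a feel for the schedule, here's a quick illustration of the delays the formula produces with a backoff factor of 1 (urllib3 skips the delay before the very first retry, which is why that one is instant):

backoff_factor = 1

# Delays for retries 2 through 5: 2s, 4s, 8s, 16s
for retry_number in range(2, 6):
    print(f"retry {retry_number}: wait {backoff_factor * (2 ** (retry_number - 1))}s")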

Then, we mount the HTTPAdapter, which performs all the necessary retries, onto our session. After that, everything is essentially identical to the other strategy.

Finally, any Python request will wait for a response before proceeding. If you want to send multiple requests in parallel, you'll need asynchronous programming or another form of concurrency, such as a thread pool.
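If you do need parallelism, here's a rough sketch using a thread pool (the URL list is hypothetical):

import requests
from concurrent.futures import ThreadPoolExecutor

urls = ['https://iproyal.com', 'https://iproyal.com/pricing']  # hypothetical list

def fetch_status(url):
    return requests.get(url, timeout=10).status_code

# Five worker threads send the requests concurrently
with ThreadPoolExecutor(max_workers=5) as pool:
    for url, status in zip(urls, pool.map(fetch_status, urls)):
        print(url, status)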

Escape 429 With Proxies

When integrating proxies into your web scraping project, there's a unique way to avoid 429 (Too Many Requests) instead of using Python requests retry strategies.

Since a 429 is assigned to an IP address, you can switch proxies to completely avoid the HTTP error code whenever it's received. As long as you have a pay-as-you-go residential proxy, you can keep switching IP addresses to avoid 429.

You can also have a failed requests retry strategy going as a fallback against other error codes.

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def send_get_request(URL):
    sess = requests.session()

    proxies = {"http" : "http://USER:PASS@HOST:PORT"}

    retries = Retry(total = 5,
                    backoff_factor = 1,
                    status_forcelist = [500, 502, 503, 504])

    sess.mount('https://', HTTPAdapter(max_retries=retries))
    get_url = sess.get(URL, proxies=proxies)
    if get_url.status_code == 429:
        get_url = sess.get(URL, proxies=proxies)  # the rotating proxy assigns a new IP
        
    print(get_url.status_code)

send_get_request('https://iproyal.com')

Since we're using rotating residential proxies, all we need to do is send a new request to the same endpoint if we receive a 429 error. Rotating proxies automatically assign a new IP address.

With sticky sessions, you should generate a larger list of proxy dictionaries, then use an if statement to switch to a new IP address once a 429 is received.
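Here's a rough sketch of that idea; the proxy endpoints below are placeholders you'd replace with your own:

import requests

# Hypothetical sticky-session proxies - swap in your real endpoints
proxy_list = [
    {"http": "http://USER:PASS@HOST1:PORT", "https": "http://USER:PASS@HOST1:PORT"},
    {"http": "http://USER:PASS@HOST2:PORT", "https": "http://USER:PASS@HOST2:PORT"},
]

def get_with_rotation(url):
    r = None
    for proxies in proxy_list:
        r = requests.get(url, proxies=proxies, timeout=10)
        if r.status_code != 429:
            break  # success or a non-429 error, so stop rotating
    return r

print(get_with_rotation('https://iproyal.com').status_code)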

Retry With the Tenacity Library

Another popular option is the third-party tenacity library, which wraps a function in retry logic through a decorator:

import requests
from tenacity import retry, stop_after_attempt, wait_exponential

@retry(stop=stop_after_attempt(5), wait=wait_exponential(multiplier=1, min=2, max=10))
def fetch(url):
    resp = requests.get(url, timeout=5)
    if resp.status_code != 200:
        raise Exception("Bad response: {}".format(resp.status_code))
    return resp

print(fetch('https://iproyal.com'))

Tenacity offers automatic retries on failure using a clean, decorator-based approach. With @retry, you can implement retry-on-exception logic for HTTP requests or other operations; the stop_after_attempt(5) condition added above caps the attempts so a permanently failing endpoint doesn't retry forever.

Retry With Backoff Pattern / Library

You can also use the backoff library or write your own backoff algorithm. Here is a Python backoff example written without the library; unless you need something extremely complicated, it will generally work:

import time
import requests

def get_with_backoff(url, max_retries=5, backoff_factor=1):
    for i in range(1, max_retries + 1):
        try:
            r = requests.get(url)
            if r.status_code == 200:
                return r
        except requests.exceptions.RequestException:
            pass  # treat network errors like bad status codes and retry
        sleep_time = backoff_factor * (2 ** (i - 1))  # exponential backoff
        time.sleep(sleep_time)
    raise Exception("max retries exceeded")

print(get_with_backoff('https://iproyal.com'))

This demonstrates a manual Python retry loop using a backoff strategy. You could also plug in the backoff library to do the same, as sketched below.
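Here's a rough equivalent using the backoff library (installed with pip install backoff); raise_for_status() converts 4xx/5xx responses into exceptions so they trigger retries:

import backoff
import requests

# Retry on any requests exception, up to 5 attempts, with exponential backoff
@backoff.on_exception(backoff.expo,
                      requests.exceptions.RequestException,
                      max_tries=5)
def fetch(url):
    r = requests.get(url, timeout=10)
    r.raise_for_status()  # 4xx/5xx raise HTTPError, which gets retried
    return r

print(fetch('https://iproyal.com').status_code)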

Conclusion

These basic strategies should let you automatically resolve the most common HTTP error codes. You can set up a basic loop to retry failed requests:

import requests
import time 

def send_get_request(URL, retry): #defines a function to send GET requests with two arguments
    r = None #ensures r exists even if every attempt fails
    for i in range(retry): #sets a range for the number of retries
        try:
            r = requests.get(URL)
            if r.status_code not in [200, 404]:
                time.sleep(5) #if 200 or 404 is not received, waits 5 seconds before trying again
            else:
                break #stops the function if 200 or 404 is received
        except requests.exceptions.ConnectionError:
            pass #ignores connection errors and retries
    if r is not None:
        print(r.status_code)

send_get_request('https://iproyal.com', 5)

Or you can use the HTTPAdapter retry strategy, which can be a little slower but is less detectable:

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def send_get_request(URL): #defines a get request function with one argument
    sess = requests.session() #sets a session object

    retries = Retry(total = 5,
                    backoff_factor = 1,
                    status_forcelist = [429, 500, 502, 503, 504]) #sets the retry amount to 5, backoff_factor to 1, and sets specific HTTP error codes to be retried on

    sess.mount('https://', HTTPAdapter(max_retries=retries)) #mounts HTTPAdapter to the session
    get_URL = sess.get(URL)
    print(get_URL.status_code)

send_get_request('https://iproyal.com')

Finally, for 429, you can always switch your IP address each time you receive the error code. An if statement and a new status_forcelist are all that's needed.

FAQ

How to import requests in Python?

You start with import requests. That's how you get access to Requests and use requests.get or requests.post.

What status codes should I retry?

You usually retry on HTTP codes like 429, 500, 502, 503, and 504. However, you should avoid retrying on 403 unless you're sure it's due to a temporary authorization issue. These codes tell your retry logic when to make another attempt. You can also build your own retry logic.

What is backoff_factor?

The backoff_factor is part of a backoff strategy: it controls how long your code waits between retries, following wait = backoff_factor * (2 ** (n - 1)), where n is the retry count. It implements an exponential backoff algorithm and is supported by the Retry class from urllib3.util.retry.

How to test retry in Python?

You can simulate failures by hitting a test endpoint that returns error codes like 500 or 429, then watch your code retry and print out the HTTP codes. Use loops or a session object with HTTPAdapter, or use a decorator library like tenacity. That's how you confirm the number of retry attempts.
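As a rough sketch, the public httpbin.org service returns whatever status code you request, which makes it easy to watch retries fire (assuming the service is reachable):

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

sess = requests.Session()
retries = Retry(total=3, backoff_factor=1, status_forcelist=[500])
sess.mount('https://', HTTPAdapter(max_retries=retries))

try:
    # httpbin.org/status/500 always responds with HTTP 500
    r = sess.get('https://httpbin.org/status/500')
    print(r.status_code)
except requests.exceptions.RetryError:
    print('retries exhausted')  # raised once the forced retries run out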

What is 429 Retry-After in Python?

When you get a 429 Too Many Requests response, check its Retry-After header. You can parse that header and wait the indicated time before retrying; this fits your retry logic and respects the server's limits.
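A minimal sketch of that idea (note that Retry-After can also be an HTTP date, which this simple version doesn't handle):

import time
import requests

def get_respecting_retry_after(url):
    r = requests.get(url, timeout=10)
    if r.status_code == 429:
        try:
            delay = int(r.headers.get('Retry-After', 5))  # seconds to wait
        except ValueError:
            delay = 5  # fall back if the header is a date or malformed
        time.sleep(delay)
        r = requests.get(url, timeout=10)  # one more try after waiting
    return r

print(get_respecting_retry_after('https://iproyal.com').status_code)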
