How to Download Images from URLs Using Python
Python

Vilius Dumcius
Key Takeaways
- The Requests library is the most versatile choice for downloading images, with built-in error handling and proxy support.
- Urllib3 might require a bit more setup and technical knowledge, but it's faster and provides more control over connections for performance and resource management.
- Wget is suitable for quick, single direct downloads, but may be limited for scraping multiple images.
The Urllib3, Wget, and Requests libraries are commonly used to simplify HTTP requests when downloading the HTML code of web pages for scraping. It's less well known that you can also use these tools to download images and other binary data from URLs.
Various APIs are commonly used to download images through code, but it's not necessary to rely on third-party APIs to download image data. We'll start with the Requests library to show how to download images with a Python script and integrate proxies to avoid detection.
Downloading Images from URLs (Python Requests Library)
The best tool to use for downloading images while scraping is the one you’re most likely already using - Requests. It’s a powerful and widely used Python library for making HTTP requests and handling responses.
While Requests excels at fetching data from web pages and APIs, it doesn't save image files to disk on its own. But don't worry - Python's built-in file handling covers that part.
All you need to do to download images with Requests is:
- Request the image file using Requests
- Write the response data into a file
Let's say you want to download an image of a book cover from Books to Scrape, a web scraping sandbox. Here's how you can do it.
First, import the Requests library:
import requests
Then, create a variable holding the image URL you want to download:
url = 'https://books.toscrape.com/media/cache/2c/da/2cdad67c44b002e7ead0cc35693c0e8b.jpg'
Request the image file using requests.get():
res = requests.get(url)
If the request was successful, we can write it into an image file using write-binary mode:
with open('img.jpg', 'wb') as f:  # "wb" stands for write-binary mode
    f.write(res.content)
You do need to be careful to use the correct file extension with this method, though. For example, saving PNG image data under a .jpg file name can leave you with a file that doesn't open as expected.
Here's how you can solve this: when you extract image URLs from a page, you can also take the file name and/or extension from the image URL and use that as the file name for the downloaded image:
def extract_name(url):
    file_name = url.split("/")[-1]
    return file_name

with open(extract_name(url), 'wb') as f:
    f.write(res.content)
This way, the saved file keeps the same name and extension as the original image on the server.
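If a URL doesn't end in a usable file name, another option is to derive the extension from the response's Content-Type header. Here's a minimal sketch of that approach, assuming the server reports an accurate Content-Type; the extension_from_response helper and the .jpg fallback are illustrative, not part of the original example:

import mimetypes
import requests

def extension_from_response(response, fallback='.jpg'):
    # Map the Content-Type header (e.g. "image/png") to a matching file extension
    content_type = response.headers.get('Content-Type', '').split(';')[0].strip()
    return mimetypes.guess_extension(content_type) or fallback

url = 'https://books.toscrape.com/media/cache/2c/da/2cdad67c44b002e7ead0cc35693c0e8b.jpg'
res = requests.get(url)

# The extension now reflects the image type reported by the server
with open('book-cover' + extension_from_response(res), 'wb') as f:
    f.write(res.content)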
In addition to downloading images, the Requests library offers a range of other functionalities. To learn more about Requests, read our full Python library guide. Even if you're only planning to work with image data, it's important to know how to handle errors.
Handling Errors When Downloading Images
Network and file errors are quite common when using the Requests library to download images, especially with concurrent HTTP requests. Such errors are best handled with try-except blocks. HTTP request errors, such as a 404 response, are best surfaced by calling raise_for_status() and catching the resulting HTTPError:
import requests
from requests.exceptions import HTTPError
url = 'https://httpbin.org/status/404' # Always returns 404
try:
    response = requests.get(url, timeout=5)
    response.raise_for_status()
except HTTPError as http_err:
    print(f"HTTP error occurred: {http_err} - URL: {url}")
If a server is slow to connect or respond and no timeout is set, your scraper can wait indefinitely for an answer. It's best to set a timeout in your scraper to terminate unresponsive requests:
import requests
from requests.exceptions import Timeout
url = 'https://httpbin.org/delay/10' # Will wait 10 seconds before responding
try:
    response = requests.get(url, timeout=3)  # Timeout after 3 seconds
    print(f"Response received: {response.status_code}")
except Timeout as timeout_err:
    print(f"Request timed out: {timeout_err} - URL: {url}")
A file or I/O error (IOError, an alias of OSError in modern Python) occurs when an operation related to file handling or network input/output fails. It can happen for various reasons, such as the file already being in use, disk problems, or network issues.
These errors often occur when making asynchronous downloads, and they are not unique to Requests, as Urllib3 and Wget can also experience network and similar issues. Here's what a try-except block handling such errors might look like:
try:
    with open("image.jpg", "wb") as f:
        f.write(response.content)
except IOError as io_err:
    print(f"File I/O error: {io_err}")
Using Proxies With The Requests Library
Nobody will bat an eye if you download an image or two with a script. However, once you decide to download the entire collection of book covers or make asynchronous downloads, you need to be cautious. Websites attempt to restrict such activity, as it puts a strain on their servers.
Proxies act as a gateway between you and the website you're accessing, changing your IP address and the associated geo-location. As a result, it's much more difficult for websites to detect your web scraper.
IPRoyal's rotating residential proxies are one of the best choices for a web scraping project, as you'll automatically receive a new IP address with every request you make. It's easy to set up and safe to use.
First, you need the URL of your proxy. If you're using IPRoyal, you can find it in the dashboard. Afterwards, create a dictionary in your script to hold the proxy URLs:
proxies = {
    'http': 'http://link-to-proxy.com',
    'https': 'http://link-to-proxy.com'
}
Then, you can provide the dictionary to the requests.get() function via the proxies parameter:
response = requests.get(
    url,
    proxies=proxies
)
Incorporate this code into your web scraper and you'll be able to download images while staying anonymous. If you got lost at any point, here's the full code for downloading an image from a website using a proxy:
import requests

def extract_name(url):
    file_name = url.split("/")[-1]
    return file_name

proxies = {
    'http': 'http://link-to-proxy',
    'https': 'http://link-to-proxy'
}

url = 'https://books.toscrape.com/media/cache/2c/da/2cdad67c44b002e7ead0cc35693c0e8b.jpg'

res = requests.get(
    url,
    proxies=proxies
)

with open(extract_name(url), 'wb') as f:
    f.write(res.content)
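If your proxy requires a username and password, Requests also accepts credentials embedded directly in the proxy URL. This is a small sketch with placeholder values - swap in the host, port, and credentials from your provider's dashboard:

proxies = {
    'http': 'http://username:password@proxy-host:port',
    'https': 'http://username:password@proxy-host:port'
}

response = requests.get(url, proxies=proxies)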
Downloading Images from URLs (Python Urllib3 Library)
Urllib3 is an HTTP client library for Python, much like Requests. In fact, Requests uses Urllib3 under the hood for its functionality. Both libraries are very similar, and the choice of which one to use for downloading images often falls on the one that you already use.
If you use Urllib3 instead of Requests, here's how you can download images with it.
First, import the library:
import urllib3
Then, create a variable with the image URL you want to scrape:
url = 'http://books.toscrape.com/media/cache/2c/da/2cdad67c44b002e7ead0cc35693c0e8b.jpg'
After that, download the image using the top-level urllib3.request() function (available in urllib3 v2.0 and newer):
res = urllib3.request('GET', url)
Now you can extract the file name from the URL and save it into a file:
def extract_name(url):
    file_name = url.split("/")[-1]
    return file_name

with open(extract_name(url), 'wb') as f:
    f.write(res.data)
Here's the full code for convenience:
import urllib3

url = 'http://books.toscrape.com/media/cache/2c/da/2cdad67c44b002e7ead0cc35693c0e8b.jpg'
res = urllib3.request('GET', url)

def extract_name(url):
    file_name = url.split("/")[-1]
    return file_name

with open(extract_name(url), 'wb') as f:
    f.write(res.data)
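Urllib3 also lets you pass timeout and retry settings straight into the request, which is handy when downloading many images in a row. A brief sketch, again assuming urllib3 v2.0 or newer; the specific timeout values and retry count are arbitrary:

import urllib3

url = 'http://books.toscrape.com/media/cache/2c/da/2cdad67c44b002e7ead0cc35693c0e8b.jpg'

res = urllib3.request(
    'GET',
    url,
    timeout=urllib3.Timeout(connect=3.0, read=10.0),    # give up on slow connections
    retries=urllib3.Retry(total=3, backoff_factor=0.5)  # retry transient failures
)

if res.status == 200:
    with open('img.jpg', 'wb') as f:
        f.write(res.data)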
Using Proxies With Urllib3
Like with Requests, intensive use of Urllib3 for downloading images requires proxies to blend in with the traffic.
Unfortunately, using proxies with Urllib3 is a bit more complicated than using them with the Requests library. This is especially true when you want to use proxies that require a username and password to access them. To use authenticated proxies with Urllib3, you'll need three things:
- The URL and port of your proxy provider (for IPRoyal, it's http://geo.iproyal.com:12321)
- Your username
- Your password
If you're using IPRoyal Residential Proxies, you can find this information in your dashboard.
First, make a request header for proxy authorization using the urllib3.make_headers() function. Replace the example username and password below with your actual credentials:
default_headers = urllib3.make_headers(proxy_basic_auth='username:password')
Then, create an instance of the ProxyManager object, which routes every request it makes through the proxy. Replace the proxy host and port with the details from your dashboard:
http = urllib3.ProxyManager('http://proxy-host:port', proxy_headers=default_headers)
Now you can call the .request() method on the ProxyManager instance to connect to websites through the proxy:
url = 'http://books.toscrape.com/media/cache/2c/da/2cdad67c44b002e7ead0cc35693c0e8b.jpg'
res = http.request('GET', url)
The rest of the code is as it was without the proxy:
def extract_name(url):
    file_name = url.split("/")[-1]
    return file_name

with open(extract_name(url), 'wb') as f:
    f.write(res.data)
Here's the full script code for convenience:
import urllib3

default_headers = urllib3.make_headers(proxy_basic_auth='username:password')
http = urllib3.ProxyManager('http://proxy-host:port', proxy_headers=default_headers)

url = 'http://books.toscrape.com/media/cache/2c/da/2cdad67c44b002e7ead0cc35693c0e8b.jpg'
res = http.request('GET', url)

def extract_name(url):
    file_name = url.split("/")[-1]
    return file_name

with open(extract_name(url), 'wb') as f:
    f.write(res.data)
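Before downloading anything, it's worth confirming that traffic really goes through the proxy. One quick, optional check is to request an IP-echo endpoint such as httpbin.org/ip through the same ProxyManager and compare the reported address with your own:

import json
import urllib3

default_headers = urllib3.make_headers(proxy_basic_auth='username:password')
http = urllib3.ProxyManager('http://proxy-host:port', proxy_headers=default_headers)

# httpbin.org/ip returns the IP address the request arrived from,
# which should be the proxy's exit IP rather than your own.
check = http.request('GET', 'http://httpbin.org/ip')
print(json.loads(check.data))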
Downloading Images from URLs (Python Wget Library)
Wget is a Python library modeled on the popular command-line utility of the same name. You can use it to download images and various other files through an interface that integrates seamlessly into Python code.
If you have an image URL, downloading it with Wget is extremely easy. First, import the library:
import wget
Then, create a variable to hold the image URL:
url = 'https://books.toscrape.com/media/cache/2c/da/2cdad67c44b002e7ead0cc35693c0e8b.jpg'
Finally, run the wget.download() function on the URL to download the image:
wget.download(url)
The file name will be the last part of the URL. That's it!
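If you'd rather pick the file name yourself, wget.download() also accepts an out parameter (the book-cover.jpg name below is just an example):

import wget

url = 'https://books.toscrape.com/media/cache/2c/da/2cdad67c44b002e7ead0cc35693c0e8b.jpg'

# Save the image under a custom name instead of the last URL segment
wget.download(url, out='book-cover.jpg')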
Using Proxies With Wget
Unfortunately, the Wget Python library doesn't offer proxy functionality as it's meant for simpler tasks. If you need more flexibility and features, try out the command-line tool.
The command-line tool has plenty of features and even lets you use proxies once you set them as environment variables, as in the sketch below. To see how to run Wget requests through proxies, you can check out our tutorial on the Wget proxy command line.
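If you want to keep everything inside a Python script, one workaround is to call the command-line Wget through subprocess with the proxy set as an environment variable. This is a rough sketch that assumes GNU Wget is installed and that your proxy accepts the standard http_proxy / https_proxy variables:

import os
import subprocess

url = 'https://books.toscrape.com/media/cache/2c/da/2cdad67c44b002e7ead0cc35693c0e8b.jpg'

# Copy the current environment and point the standard proxy variables at your proxy
env = os.environ.copy()
env['http_proxy'] = 'http://username:password@proxy-host:port'
env['https_proxy'] = 'http://username:password@proxy-host:port'

# Run the command-line wget (not the Python library) with the proxy applied
subprocess.run(['wget', url], env=env, check=True)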
Scraping and Downloading Multiple Images from a Website
Instead of starting from known image URLs, we'll need to work from the page HTML. We fetch the page and parse it with BeautifulSoup to extract the <img> tags and their src attributes. Image URLs might be relative (e.g., /images/photo.jpg), so we use urljoin to combine them with the base URL and form complete paths.
Once we have the list of image URLs, the next step is to request each one and save it as a file. Here’s what it may look like using Requests, Urllib3, and Wget.
Requests
import requests
from bs4 import BeautifulSoup
from urllib.parse import urljoin
# fetch the HTML page
page_url = "https://books.toscrape.com/"
res = requests.get(page_url)
soup = BeautifulSoup(res.text, "html.parser")
# extract all <img> tags
img_tags = soup.find_all("img")
# Loop through and download each image
for img in img_tags:
    img_url = img.get("src")               # may be relative
    full_url = urljoin(page_url, img_url)  # make absolute

    # download the image
    img_res = requests.get(full_url)

    # create a file name from the URL
    file_name = full_url.split("/")[-1]
    with open(file_name, "wb") as f:
        f.write(img_res.content)
Urllib3
import urllib3
from bs4 import BeautifulSoup
from urllib.parse import urljoin
http = urllib3.PoolManager()
# fetch the HTML page
page_url = "https://books.toscrape.com/"
res = http.request('GET', page_url)
soup = BeautifulSoup(res.data, "html.parser")
# Extract all <img> tags
img_tags = soup.find_all("img")
# Loop through and download images
for img in img_tags:
    img_url = img.get("src")
    full_url = urljoin(page_url, img_url)  # make absolute
    img_res = http.request('GET', full_url)
    file_name = full_url.split("/")[-1]

    # Save each image
    with open(file_name, "wb") as f:
        f.write(img_res.data)
Wget
The Wget Python library cannot fetch HTML or parse <img> tags, as it only downloads files. To download multiple images with Wget, additional libraries are required. In this example, we'll use Requests to fetch the page and BeautifulSoup to parse it:
import requests
import wget
from bs4 import BeautifulSoup
from urllib.parse import urljoin
# fetch the HTML page
page_url = "https://books.toscrape.com/"
res = requests.get(page_url)
soup = BeautifulSoup(res.text, "html.parser")
# extract all <img> tags
img_tags = soup.find_all("img")
# loop through and download image
for img in img_tags:
    img_url = img.get("src")
    full_url = urljoin(page_url, img_url)  # make absolute

    # download directly to file
    wget.download(full_url, out=full_url.split("/")[-1])
    print(f"Downloaded {full_url.split('/')[-1]}")
For simplicity's sake, these code snippets skip the already covered error handling and proxy integration. In a real-world scenario, your project may also require adding rotating User-agent headers, image metadata handling, file name management, and other functions.
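As an example of the first point, rotating the User-Agent header with Requests only takes a few lines. The header strings below are ordinary browser User-Agent values included purely for illustration:

import random
import requests

# A small pool of browser-like User-Agent strings (illustrative values)
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/17.4 Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:126.0) Gecko/20100101 Firefox/126.0",
]

url = "https://books.toscrape.com/media/cache/2c/da/2cdad67c44b002e7ead0cc35693c0e8b.jpg"

# Use a different User-Agent for each request
headers = {"User-Agent": random.choice(USER_AGENTS)}
res = requests.get(url, headers=headers, timeout=10)

with open("img.jpg", "wb") as f:
    f.write(res.content)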
Best Python Libraries for Image Download (2025 Edition)
| | Requests | Urllib3 | Wget |
|---|---|---|---|
| Ease of Use (Single Image) | Very easy, concise syntax | Moderate, more complex | Very easy, one-line downloads |
| Ease of Use (Multiple Images) | Easy, loops and session management | Moderate, requires more setup | Limited, needs additional libraries for HTML parsing |
| Error Handling | Excellent | Good | Basic |
| Proxy Support | Straightforward, built-in support | Powerful, advanced proxy options | Limited, separate command-line tool needed |
| Pros | Beginner-friendly, robust ecosystem, widely used | High performance, fine-grained control, thread-safe | Extremely simple, minimal code for downloads |
| Cons | Might require external modules, slower than Urllib3 | Complex, less beginner-friendly | No HTML parsing, limited customization |
| Ideal Use Cases | General scraping, web automation tasks | Performance-critical tasks, custom protocols | Quick one-off downloads, command-line style scraping |
| Overall Recommendation | Best choice for most cases | Great for advanced users | Useful for quick tasks |
Conclusion
It's relatively easy to download images from image URLs using the Urllib3, Wget, and Requests libraries. The choice of library doesn't matter much for image files, so it's best to pick the tool you're already familiar with. Other options include using Playwright to download images, but we recommend sticking with the Requests library for ease of use.
To learn more about Requests and how it can be used when sending HTTP requests for various web scraping tasks, read our extensive guide on the Python Requests Library.
FAQ
I can't open images that I download with Python. What should I do?
This problem can arise from several issues, but the most likely one is that you're saving the image with the wrong extension. This creates a file that appears to be in an image format but is actually unreadable, just as renaming a text file to img.jpg would.
TypeError: 'module' object is not callable
This error can occur if you have an outdated version of Urllib3 that doesn't support the simplified request syntax. To solve it, either update the library to a newer version or create a PoolManager as described in the user guide to handle the requests.
Tunnel connection failed: 407 Proxy Authentication Required
This error indicates that you were unable to authenticate with your proxy provider while scraping. This is a common issue with Urllib3, and the best solution for it is to follow our step-by-step guide on how to use proxies with Urllib3.