How to Download Files in Selenium Python (Headless + Auth)

Published on September 29, 2025

File download handling is an essential part of test automation when applications allow users to export data, generate reports, or retrieve documents. Automating these workflows in Selenium Python is challenging because browsers require explicit configuration to define the download folder, manage file types, and bypass dialog prompts.

You can manage this by setting browser preferences before launching WebDriver. Chrome and Firefox let you define the download path, disable pop-ups, and handle MIME types automatically. For authenticated downloads, you can use session cookies or login flows to ensure files are accessible during the test.

This article explains how to download files in Selenium with Python, including headless mode and authentication.

Overview of File Download Automation in Selenium with Python

Selenium WebDriver was designed to simulate browser actions such as clicking, typing, and navigating between pages. File downloads introduce extra steps because browsers do not treat them like regular page interactions.

Instead of returning a DOM element or visible output, a download is processed by the browser itself. This makes downloads harder to track within test scripts unless the browser is configured beforehand.

In Selenium with Python, file download automation typically involves three parts:

Browser configuration: Set preferences to control the folder where files are stored, disable dialog boxes, and specify how different file types are handled.
Test logic: Trigger the download event, wait until the browser saves the file, and verify that the expected file exists.
Cleanup: Remove any old files before a new test run so results are not mixed with previous downloads.

Why Automating File Downloads is Important in Test Automation

Many business applications rely on downloads to deliver key outputs. Examples include CSV reports for finance, PDF invoices for customers, and Excel exports for analytics. If these files are not tested, you miss validating one of the most critical user-facing features.

Automating file downloads in Selenium with Python ensures that:

Reports and exports are functional: You can confirm that files are generated, saved, and accessible without manual checks.
File formats are correct: Tests verify whether the download matches expected extensions such as .csv, .pdf, or .xlsx.
Data integrity is maintained: You can check file size or even open and validate content after the download completes.
Headless testing remains consistent: Automated downloads can run in CI/CD pipelines without requiring manual intervention.
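The format and integrity checks listed above can be sketched with the standard library alone. This is an illustrative helper, not a complete validator: the names `looks_like`, `csv_header`, and `SIGNATURES` are assumptions made for this example, and a matching signature only suggests the file type; it does not prove the file is well formed.

```python
import csv
from pathlib import Path

# Leading bytes ("magic numbers") for two binary formats.
# .xlsx files are ZIP archives, so they start with the ZIP signature.
SIGNATURES = {
    ".pdf": b"%PDF-",
    ".xlsx": b"PK\x03\x04",
}


def looks_like(path, expected_ext):
    """Return True if the file has the expected extension, is non-empty,
    and (for known binary formats) starts with the expected signature."""
    path = Path(path)
    ext = expected_ext.lower()
    if path.suffix.lower() != ext:
        return False
    if path.stat().st_size == 0:
        return False
    signature = SIGNATURES.get(ext)
    if signature is None:
        # No signature known (for example .csv): extension and size only.
        return True
    with path.open("rb") as fh:
        return fh.read(len(signature)) == signature


def csv_header(path):
    """Return the first row of a CSV file as a list of column names."""
    with open(path, newline="", encoding="utf-8") as fh:
        return next(csv.reader(fh), [])
```

A test could call `looks_like(path, ".pdf")` once the download has finished, and compare `csv_header(path)` against the columns the export is expected to contain.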

How to Set Chrome to Save Files in a Specific Folder

By default, Chrome saves files to the user's Downloads folder, or asks where to save each file if that setting is enabled. Neither behavior suits automated tests, so you need to configure ChromeOptions to set a download folder and disable prompts. Selenium allows you to pass these settings when launching the WebDriver.

Here is how you can configure Chrome to always save files in a chosen directory:

from selenium import webdriver

# Define download folder path
download_dir = "/path/to/download/folder"

# Set Chrome preferences
chrome_prefs = {
    "download.default_directory": download_dir,
    "download.prompt_for_download": False,
    "download.directory_upgrade": True,
    "safebrowsing.enabled": True,  # leaves Safe Browsing checks enabled
}

options = webdriver.ChromeOptions()
options.add_experimental_option("prefs", chrome_prefs)

# Launch Chrome with preferences
driver = webdriver.Chrome(options=options)

# Example: trigger file download
driver.get("https://example.com/download-report")

With this setup, Chrome automatically saves files in the specified folder without asking for confirmation. Tests can then check the folder to verify whether the expected file exists.
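Chrome is commonly reported to ignore a relative path in download.default_directory, so it can help to normalize the folder before building the preferences. The following is a minimal sketch under that assumption; `build_chrome_prefs` is a hypothetical helper name, and the preference keys are the ones used above.

```python
from pathlib import Path


def build_chrome_prefs(download_dir):
    """Create the folder if needed and return Chrome download preferences
    that point at its absolute path."""
    folder = Path(download_dir).expanduser().resolve()
    folder.mkdir(parents=True, exist_ok=True)
    return {
        "download.default_directory": str(folder),
        "download.prompt_for_download": False,
        "download.directory_upgrade": True,
    }
```

The returned dictionary can be passed to `options.add_experimental_option("prefs", ...)` in place of a hand-written one.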

How to Set Firefox to Save Files in a Specific Folder

Firefox manages downloads through its profile settings. Depending on the version and file type, it may show a dialog asking how to handle the file or open it in a built-in viewer, either of which blocks automation. To avoid this, you can configure the Firefox profile with preferences that define the download folder and map MIME types to automatic save actions.

Here is an example configuration in Selenium Python:

from selenium import webdriver
from selenium.webdriver.firefox.options import Options

# Define download folder path
download_dir = "/path/to/download/folder"

# Set Firefox preferences on the Options object
# (the firefox_profile= argument was deprecated in Selenium 4 and later removed)
options = Options()
options.set_preference("browser.download.folderList", 2)  # use custom path
options.set_preference("browser.download.dir", download_dir)
options.set_preference(
    "browser.helperApps.neverAsk.saveToDisk",
    "application/pdf,application/vnd.ms-excel,text/csv",
)
options.set_preference("pdfjs.disabled", True)  # disable built-in PDF viewer

# Launch Firefox with these preferences
driver = webdriver.Firefox(options=options)

# Example: trigger file download
driver.get("https://example.com/export-data")

With these settings, Firefox saves files directly to the given folder without showing confirmation dialogs. The MIME types you define ensure that specific formats such as CSV, PDF, or Excel files are handled automatically.
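Keeping the MIME types in a list makes it easier to add a format without editing a long comma-separated string. The sketch below assumes a hypothetical helper named `firefox_download_prefs`. Tying pdfjs.disabled to the presence of application/pdf is a design choice of this example, not a Firefox requirement.

```python
def firefox_download_prefs(download_dir, mime_types):
    """Return Firefox preferences that save the given MIME types to
    download_dir without prompting."""
    return {
        "browser.download.folderList": 2,  # 2 = use browser.download.dir
        "browser.download.dir": str(download_dir),
        "browser.helperApps.neverAsk.saveToDisk": ",".join(mime_types),
        # Only bypass the built-in PDF viewer when PDFs should be saved.
        "pdfjs.disabled": "application/pdf" in mime_types,
    }
```

Each name and value in the returned dictionary can then be applied with a `set_preference(name, value)` call in a loop. The MIME types should match the Content-Type values the application actually sends.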

How to Verify File Downloads in Selenium Python

After configuring Chrome or Firefox for downloads, you need to confirm that the file was actually saved. Selenium itself does not track files, so verification must be done at the filesystem level.

Key steps for verification:

Clear the download folder before starting a test.
Trigger the download action.
Wait until a file appears in the folder and temporary extensions like .crdownload or .part disappear.
Check filename, extension, and size to ensure the file is complete.
Optionally parse the file (CSV, Excel, PDF) to validate its content.

Example helper in Python:

import os
import time

def wait_for_download(folder, filename, timeout=30):
    path = os.path.join(folder, filename)
    end_time = time.time() + timeout
    while time.time() < end_time:
        # Chrome writes *.crdownload and Firefox *.part files while a transfer is in progress
        in_progress = [f for f in os.listdir(folder) if f.endswith((".crdownload", ".part"))]
        if os.path.exists(path) and not in_progress:
            return path
        time.sleep(1)
    raise TimeoutError(f"{filename} not found in {folder}")

This ensures the test continues only when the expected file is available.
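When the exact filename is not known in advance (for example, a report name that includes a timestamp), a wildcard-based variant can help. The sketch below is one possible approach, not part of Selenium: `wait_for_matching_download` and its `poll` parameter are names assumed for this example. It treats a file as complete once its size is non-zero and unchanged across two consecutive polls, which is a heuristic rather than a guarantee.

```python
import glob
import os
import time

TEMP_SUFFIXES = (".crdownload", ".part")


def wait_for_matching_download(folder, pattern, timeout=30, poll=0.5):
    """Wait until a file matching the glob pattern exists in folder, is not a
    temporary download file, and has the same non-zero size on two
    consecutive polls. Returns the path of that file."""
    deadline = time.time() + timeout
    last_sizes = {}
    while time.time() < deadline:
        for path in glob.glob(os.path.join(folder, pattern)):
            if path.endswith(TEMP_SUFFIXES):
                continue
            try:
                size = os.path.getsize(path)
            except OSError:
                continue  # file disappeared between listing and stat
            if size > 0 and last_sizes.get(path) == size:
                return path
            last_sizes[path] = size
        time.sleep(poll)
    raise TimeoutError(f"no file matching {pattern} in {folder}")
```

A call such as `wait_for_matching_download(download_dir, "report_*.pdf", timeout=60)` returns the full path of the first stable match. Because of the size check, a legitimately empty file never matches.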

Handling File Downloads That Require Authentication in Selenium

When the file sits behind login, you have three practical options. Each is explicit and testable.

1. Login with Selenium and Download in the Browser

Use Selenium to perform the login flow, then trigger the download and verify the file on disk. This works for UI-driven flows and files delivered by normal navigation or form submits.

Key points:

Keep browser cookies and session active
Use the same browser preferences for download folder as earlier
Wait for the download to complete using filesystem checks

Here’s an example:

# selenium_login_and_download.py
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

download_dir = "/tmp/selenium_downloads"

options = webdriver.ChromeOptions()
prefs = {
    "download.default_directory": download_dir,
    "download.prompt_for_download": False,
    "download.directory_upgrade": True,
}
options.add_experimental_option("prefs", prefs)
driver = webdriver.Chrome(options=options)

# login
login_url = "https://example.com/login"
driver.get(login_url)
driver.find_element(By.ID, "username").send_keys("user")
driver.find_element(By.ID, "password").send_keys("pass")
driver.find_element(By.ID, "submit").click()

# wait for the login to finish (assumes the site redirects after login)
WebDriverWait(driver, 10).until(EC.url_changes(login_url))

# navigate to download page and trigger download
driver.get("https://example.com/my-reports")
driver.find_element(By.CSS_SELECTOR, "a.download-link").click()

# wait and verify via filesystem
# use wait_for_download helper from earlier

2. Use an HTTP Client with Selenium Cookies

Use Selenium only to obtain an authenticated session cookie. Then hand those cookies to requests and download the file directly. This is often more reliable than a browser download for large files, and it lets you verify exact response headers such as Content-Disposition.

Why use this:

Avoid browser quirks and temporary file extensions
Stream large files efficiently
Validate HTTP response status and headers

Here’s an example:

# selenium_to_requests_download.py
import requests
from selenium import webdriver

download_url = "https://example.com/download/report/123"
download_path = "/tmp/report_123.pdf"

driver = webdriver.Chrome()

# perform login with Selenium
driver.get("https://example.com/login")
# ... login steps ...

# collect cookies from Selenium
s = requests.Session()
for c in driver.get_cookies():
    s.cookies.set(c["name"], c["value"], domain=c.get("domain"), path=c.get("path"))
driver.quit()

# include headers if site uses CSRF or custom headers
headers = {"User-Agent": "selenium-requests/1.0"}

with s.get(download_url, headers=headers, stream=True, timeout=60) as r:
    r.raise_for_status()
    with open(download_path, "wb") as fh:
        for chunk in r.iter_content(chunk_size=8192):
            if chunk:
                fh.write(chunk)

Tip:

If the site uses CSRF tokens, include them in the request headers or form data. You can read them from the page with Selenium before exporting cookies.
Cookies flagged Secure, HttpOnly, or SameSite are still returned by Selenium's get_cookies(), so they remain usable by requests once copied over.
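To check the Content-Disposition header mentioned earlier, the filename parameter can be pulled out with the standard library's email parser instead of a hand-written regular expression. This is a minimal sketch; `filename_from_content_disposition` is a name assumed for this example.

```python
from email.message import Message


def filename_from_content_disposition(header_value):
    """Extract the filename parameter from a Content-Disposition header value.
    Returns None if the header carries no filename."""
    msg = Message()
    msg["Content-Disposition"] = header_value
    return msg.get_filename()
```

With requests, the header value would come from `r.headers.get("Content-Disposition", "")`; passing an empty string when the header is absent simply yields None.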

3. Use Direct Auth Headers or Basic Auth

If the endpoint accepts an API token or HTTP basic auth, call the endpoint directly. This avoids browser state entirely.

Example with token

import requests

token = "Bearer eyJ..."
headers = {"Authorization": token}

r = requests.get("https://example.com/api/download/456", headers=headers, stream=True)
r.raise_for_status()
with open("/tmp/file.pdf", "wb") as fh:
    for chunk in r.iter_content(8192):
        fh.write(chunk)

Example with basic auth

# pass credentials with auth= rather than embedding them in the URL
r = requests.get("https://example.com/download/789", auth=("user", "pass"), stream=True)
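The examples above hard-code credentials for brevity. In a shared test suite it is usually safer to read them from the environment. The sketch below assumes a hypothetical variable name, DOWNLOAD_API_TOKEN, and a hypothetical helper, `bearer_headers`; neither comes from any library.

```python
import os


def bearer_headers(env_var="DOWNLOAD_API_TOKEN"):
    """Build an Authorization header from a token stored in the environment.
    Raises RuntimeError if the variable is missing or empty."""
    token = os.environ.get(env_var, "").strip()
    if not token:
        raise RuntimeError(f"environment variable {env_var} is not set")
    return {"Authorization": f"Bearer {token}"}
```

The result can be passed as the `headers=` argument of `requests.get`, and a missing token fails fast with a clear message instead of producing a 401 later.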

How to Download Files in Headless Chrome and Firefox with Selenium

Headless browsers do not show UI dialogs, so a save prompt can never be answered. You must configure the browser to save files automatically and, for older headless Chrome, explicitly allow downloads through the DevTools Protocol.

Below are concrete, minimal instructions and code for Chrome and Firefox.

1. Headless Chrome (Selenium 4)

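Headless Chrome needs the usual download preferences. Chrome's original headless mode also blocked downloads unless they were allowed through the Chrome DevTools Protocol. The newer --headless=new mode generally honors the preferences on its own, so the CDP call below mainly acts as a safeguard for older versions.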

from selenium import webdriver
from selenium.webdriver.common.by import By
import os, time

download_dir = "/tmp/selenium_downloads"
os.makedirs(download_dir, exist_ok=True)

options = webdriver.ChromeOptions()
options.add_argument("--headless=new")  # use new headless mode if available
options.add_argument("--no-sandbox")  # commonly needed when running as root in containers
options.add_argument("--disable-gpu")

prefs = {
    "download.default_directory": download_dir,
    "download.prompt_for_download": False,
    "download.directory_upgrade": True,
    "safebrowsing.enabled": True,
}
options.add_experimental_option("prefs", prefs)

driver = webdriver.Chrome(options=options)

# enable CDP download behavior so older headless Chrome writes files
driver.execute_cdp_cmd(
    "Page.setDownloadBehavior",
    {"behavior": "allow", "downloadPath": download_dir},
)

# trigger download
driver.get("https://example.com/download-report")
driver.find_element(By.CSS_SELECTOR, "a.download-link").click()

# use a filesystem check to confirm completion before quitting
# e.g. wait_for_download(download_dir, "report_*.pdf", timeout=60)
# (the helper shown earlier expects an exact filename, so a wildcard needs a glob-based variant)
time.sleep(1)  # placeholder only; a fixed sleep does not guarantee the download has finished
driver.quit()

Notes

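Use --headless=new when available. If the environment uses an older Chrome, use --headless.
execute_cdp_cmd("Page.setDownloadBehavior", ...) was required for the original headless mode to persist downloads. With --headless=new it is usually redundant but harmless. The DevTools Protocol documentation marks this command as deprecated in favor of Browser.setDownloadBehavior.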
Confirm ChromeDriver and Chrome versions are compatible.

2. Headless Firefox (GeckoDriver)

Firefox supports headless downloads via profile preferences. Set the download folder and MIME type handling. Recent GeckoDriver versions support downloads in headless mode without extra DevTools commands.

from selenium import webdriver
from selenium.webdriver.firefox.options import Options
from selenium.webdriver.common.by import By
import os, time

download_dir = "/tmp/selenium_downloads"
os.makedirs(download_dir, exist_ok=True)

options = Options()
options.add_argument("-headless")  # the options.headless attribute was removed in later Selenium 4 releases

# set preferences on Options; recent Selenium 4 releases no longer accept firefox_profile=
options.set_preference("browser.download.folderList", 2)  # use custom dir
options.set_preference("browser.download.dir", download_dir)
options.set_preference(
    "browser.helperApps.neverAsk.saveToDisk",
    "application/pdf,application/octet-stream,application/vnd.ms-excel,text/csv",
)
options.set_preference("pdfjs.disabled", True)
options.set_preference("browser.download.manager.showWhenStarting", False)
options.set_preference("browser.download.useDownloadDir", True)

driver = webdriver.Firefox(options=options)

driver.get("https://example.com/download-report")
driver.find_element(By.CSS_SELECTOR, "a.download-link").click()

# wait and verify file before quitting
# e.g. wait_for_download(download_dir, "report_*.pdf", timeout=60)
# (the helper shown earlier expects an exact filename, so a wildcard needs a glob-based variant)
time.sleep(1)  # placeholder only; a fixed sleep does not guarantee the download has finished
driver.quit()

Notes

Include all MIME types your application serves in browser.helperApps.neverAsk.saveToDisk.
Disable the built-in PDF viewer when you need PDFs saved as files.
If headless Firefox fails to download, confirm GeckoDriver version and Firefox version compatibility.

How to Remove Old Downloads Before Starting New Tests

Old files in the download folder can interfere with test results. If the directory already contains a file with the same name, your test may incorrectly pass without verifying the new download. To avoid this, clear the download folder at the beginning of each test run.

You can handle this with a short utility function in Python:

import shutil
from pathlib import Path

def clear_downloads(download_dir):
    """Delete all files and subfolders inside the given download directory."""
    folder = Path(download_dir)
    if not folder.exists():
        folder.mkdir(parents=True, exist_ok=True)
        return
    for item in folder.iterdir():
        if item.is_dir() and not item.is_symlink():
            shutil.rmtree(item)
        else:
            item.unlink()  # regular files and symlinks

Call this function during test setup, before triggering a download. This ensures that only files created during the current test are present in the folder, making validation reliable and consistent.

In CI pipelines or parallel tests, it is better to create a unique download directory per test (for example, using a timestamp or test ID) instead of sharing one folder across all tests.
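One way to get a unique folder per test is a small context manager built on tempfile. This is a sketch rather than a prescribed pattern: `isolated_download_dir` is a name assumed for this example, and the folder is deleted when the block exits, so any file you want to keep has to be copied out first.

```python
import shutil
import tempfile
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def isolated_download_dir(prefix="downloads_"):
    """Yield a fresh, empty directory and remove it afterwards."""
    folder = Path(tempfile.mkdtemp(prefix=prefix))
    try:
        yield folder
    finally:
        shutil.rmtree(folder, ignore_errors=True)
```

Inside a test, `with isolated_download_dir() as download_dir:` provides a path that can be converted with `str(download_dir)` and passed to the browser preferences. Parallel tests then never see each other's files.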

You can also use BrowserStack Automate, which lets you run tests on real devices that start in a clean state every time.

Conclusion

Browsers need explicit setup for automated file downloads in Selenium Python. You must define download paths, configure MIME types, clear old files, and confirm that the correct file is saved. Headless runs and authentication add more steps, but they can be handled with proper configuration.

With BrowserStack Automate, you can test downloads on thousands of real browsers and devices. It supports parallel execution, integrates with CI/CD pipelines, and provides debugging features like logs, screenshots, and video recordings.
