
Selenium 4 introduces major upgrades that make browser automation faster, more reliable, and fully compliant with the W3C WebDriver standard. It ensures consistent test execution across browsers and reduces compatibility issues that were common in earlier versions.
The update brings a redesigned IDE, improved Grid architecture, new relative locators, and deeper debugging support. These enhancements simplify script creation, speed up parallel testing, and improve overall test accuracy for QA teams.
This article highlights the top Selenium 4 features and how they enhance modern test automation.
Selenium WebDriver 4 has evolved into a more stable and developer-friendly automation tool, with several enhancements that streamline cross-browser testing. Built on the W3C WebDriver protocol, it ensures that commands sent from test scripts are interpreted consistently by all major browsers. This eliminates the frequent inconsistencies testers experienced with Selenium 3 when switching between Chrome, Firefox, or Edge.
In addition to protocol compliance, Selenium 4 introduces several usability and performance-focused updates. These include better debugging support, new window and tab management options, and expanded APIs for advanced testing scenarios. Together, these updates help testers build faster, more maintainable automation frameworks with higher accuracy and efficiency.
Selenium 4 introduces a set of new and improved features that modernize automation testing. Each enhancement focuses on simplifying setup, improving stability, and expanding testing capabilities across different browsers and platforms. The following sections explain the most impactful additions that make Selenium 4 a more complete and efficient testing framework.
Selenium 4 fully adopts the W3C WebDriver protocol, which standardizes how browsers interpret automation commands. This ensures that all major browsers like Chrome, Firefox, Safari, and Edge respond to automation instructions in a consistent way.
In Selenium 3, WebDriver often relied on JSON Wire Protocol, which sometimes caused inconsistencies between browser drivers and the test framework.
In Selenium 4, test scripts communicate directly with browsers through W3C endpoints, eliminating these discrepancies. The interaction flow between client, server, and browser is now unified, which means fewer connection issues and smoother test execution across different environments.
Example:
Here’s how a typical session looks under the W3C standard. The code is unchanged, but the protocol underneath is different:
from selenium import webdriver
driver = webdriver.Chrome()
driver.get("https://example.com")
print(driver.title)
driver.quit()
Even though the syntax looks identical to Selenium 3, browsers now interpret commands through a standardized W3C channel, making test runs faster and more predictable.
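To make the change under the hood more concrete, the sketch below compares the request body a client sends when it asks for a new session. It uses only the standard library and never contacts a browser. The two helper names are hypothetical and exist only for illustration; the payload shapes follow the legacy JSON Wire format and the W3C WebDriver "New Session" format.

```python
import json


def legacy_new_session_payload(caps):
    """JSON Wire Protocol style body, as used by older Selenium 3 clients."""
    return {"desiredCapabilities": dict(caps)}


def w3c_new_session_payload(caps):
    """W3C WebDriver style body: capabilities sit under alwaysMatch/firstMatch."""
    return {"capabilities": {"alwaysMatch": dict(caps), "firstMatch": [{}]}}


caps = {"browserName": "chrome"}
print(json.dumps(legacy_new_session_payload(caps)))
print(json.dumps(w3c_new_session_payload(caps)))
```

Because every W3C-compliant driver expects the same body, the client no longer needs browser-specific translation of the request.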
The Selenium Grid has been completely re-architected in version 4 for better scalability, modularity, and deployment flexibility. It now supports a fully distributed design where each component (Router, Distributor, Session Map, New Session Queue, Event Bus, and Node) can run independently, while Hub mode bundles everything except the Nodes into a single process. This allows parallel tests to be distributed efficiently across multiple machines or containers.
You can deploy the new Grid as a standalone server, a distributed setup, or a Docker-based cluster. It also features a modern web-based UI to monitor active sessions and debug failures quickly.
Example – Starting Selenium Grid (Standalone Mode):
java -jar selenium-server-4.0.0.jar standalone
Example – Distributed Mode Configuration:
java -jar selenium-server-4.0.0.jar hub
java -jar selenium-server-4.0.0.jar node --detect-drivers true
With Docker support, testers can scale environments dynamically:
docker run -d -p 4444:4444 --name selenium-grid selenium/standalone-chrome
This architecture simplifies CI/CD integration and makes large-scale browser testing more manageable and reliable.
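The idea of distributing sessions across Nodes can be pictured with a small, self-contained sketch. This is not the Grid's actual slot-selection logic; the `Node` class, the `assign` function, and the "least-loaded node wins" rule are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    browsers: set
    max_sessions: int
    active: int = 0

    def can_host(self, browser):
        # A node qualifies if it supports the browser and has a free slot
        return browser in self.browsers and self.active < self.max_sessions


def assign(nodes, browser):
    """Pick the least-loaded node that can host the browser, or None if all are busy."""
    candidates = [n for n in nodes if n.can_host(browser)]
    if not candidates:
        return None
    chosen = min(candidates, key=lambda n: n.active)
    chosen.active += 1
    return chosen.name


nodes = [Node("node-a", {"chrome", "firefox"}, 2), Node("node-b", {"chrome"}, 1)]
placements = [assign(nodes, "chrome") for _ in range(4)]
print(placements)
```

In this toy model a request that finds no free slot simply gets `None`; in a real Grid such requests wait in the New Session Queue until capacity frees up.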
Selenium IDE has been rebuilt from the ground up to provide a smoother, cross-browser test recording experience. It now works as a browser extension for Chrome and Firefox and introduces features like parallel execution, command-line playback (via the Selenium IDE Runner), and improved export options to programming languages such as Python, Java, and JavaScript.
Example – Running Selenium IDE Tests from the Command Line:
selenium-side-runner -c "browserName=chrome" test_suite.side
This allows teams to include recorded tests as part of their automated build workflows, bridging the gap between manual test creation and continuous integration testing.
Relative Locators (also known as Friendly Locators) are among the most practical additions in Selenium 4. They help identify elements based on their visual placement relative to other elements on the page, which makes locators more resilient to layout or attribute changes.
Example – Using Relative Locators:
from selenium.webdriver.common.by import By
from selenium.webdriver.support.relative_locator import locate_with
from selenium import webdriver
driver = webdriver.Chrome()
driver.get("https://www.selenium.dev/")
logo = driver.find_element(By.ID, "selenium_logo")
about_link = driver.find_element(locate_with(By.TAG_NAME, "a").below(logo))
print(about_link.text)
driver.quit()
Here, Selenium locates the “About” link based on its position relative to the logo element. This reduces the need for brittle XPath or CSS locators and improves test readability.
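Relative locators work from element positions on the rendered page. The sketch below shows the general idea with plain rectangles and no browser. It is a simplification under stated assumptions, not Selenium's algorithm: the `Rect` class, the `below` helper, and the nearest-first ordering by center distance are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    name: str
    x: float
    y: float
    width: float
    height: float

    @property
    def bottom(self):
        return self.y + self.height

    @property
    def center(self):
        return (self.x + self.width / 2, self.y + self.height / 2)


def below(anchor, candidates):
    """Return candidates that start at or under the anchor's bottom edge, nearest first."""
    ax, ay = anchor.center
    matches = [c for c in candidates if c.y >= anchor.bottom]
    return sorted(matches, key=lambda c: (c.center[0] - ax) ** 2 + (c.center[1] - ay) ** 2)


logo = Rect("logo", 0, 0, 100, 40)
links = [
    Rect("about", 0, 60, 80, 20),
    Rect("downloads", 0, 200, 80, 20),
    Rect("header-link", 120, 10, 80, 20),
]
print([r.name for r in below(logo, links)])
```

Because the match depends on layout rather than attributes, a renamed ID does not break it, but a redesign that moves the elements can.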

One of the most practical improvements in Selenium 4 is its completely rewritten documentation. The new documentation site provides clearer explanations, real-world examples, and consistent references for all major APIs and configuration options.
Each language binding (Python, Java, C#, JavaScript, and Ruby) now has its own dedicated section with uniform structure and updated code samples.
This change may not alter how tests run, but it significantly reduces the learning curve for teams. Developers and QA engineers can now reference official, accurate content instead of depending on outdated third-party tutorials. The documentation also includes migration guides to help testers upgrade from Selenium 3 to Selenium 4 without breaking existing frameworks.
Selenium 4 adds support for the Chrome DevTools Protocol (CDP) in Chromium-based browsers such as Chrome and Edge, allowing testers to interact directly with browser internals. This means you can capture console logs, intercept network requests, monitor performance metrics, and even emulate geolocation or device conditions from within the WebDriver API.
For many low-level debugging and monitoring tasks, this reduces the need to bring in a separate tool such as Puppeteer or Playwright.
Example: Capturing Browser Console Logs (Python, Chromium only):
from selenium import webdriver
driver = webdriver.Chrome()
driver.get("https://example.com")
# Read the entries the browser has logged so far
logs = driver.get_log("browser")
for log in logs:
    print(log)
driver.quit()
Example: Blocking Network Requests:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
# Options replaces DesiredCapabilities, which newer Selenium 4 releases no longer accept
options = Options()
options.set_capability("goog:loggingPrefs", {"performance": "ALL"})
driver = webdriver.Chrome(options=options)
# Enable the Network domain, then block image requests by URL pattern
driver.execute_cdp_cmd("Network.enable", {})
driver.execute_cdp_cmd("Network.setBlockedURLs", {"urls": ["*.png", "*.jpg"]})
driver.get("https://example.com")
driver.quit()
Through CDP integration, Selenium now supports advanced testing scenarios such as security validations, performance profiling, and offline testing.
Managing multiple windows or tabs has become more intuitive in Selenium 4. The new API introduces methods to create, switch, and manage windows or tabs seamlessly without relying on complex workarounds.
This is particularly useful for testing web applications that trigger pop-ups or open links in new tabs, as testers can now directly control those sessions in a cleaner, more readable manner.
Example – Creating and Managing New Tabs:
from selenium import webdriver
driver = webdriver.Chrome()
driver.get("https://example.com")
# Open a new tab
driver.switch_to.new_window('tab')
driver.get("https://www.selenium.dev/")
print(driver.title)
driver.quit()
You can also open new windows similarly:
driver.switch_to.new_window('window')
This gives testers greater flexibility in validating multi-tab or cross-domain workflows without writing extra logic for handle management.
The Actions Class has been enhanced in Selenium 4 to offer smoother, low-level control of complex user gestures such as drag-and-drop, click-and-hold, or multi-touch interactions. These improvements make UI testing closer to real user behavior, particularly for applications that rely on custom JavaScript events or gesture-based controls.
Example – Using Actions Class for Complex Interactions:
from selenium import webdriver
from selenium.webdriver import ActionChains
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.get("https://crossbrowsertesting.github.io/drag-and-drop")
source = driver.find_element(By.ID, "draggable")
target = driver.find_element(By.ID, "droppable")
actions = ActionChains(driver)
actions.click_and_hold(source).move_to_element(target).release().perform()
driver.quit()
The updated Actions API in Selenium 4 supports smoother mouse movement, precise offset control, and even touch input simulation. It allows multiple device actions (like keyboard and pointer) to run concurrently, providing more accurate simulation of real user behavior.
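Under the W3C protocol, a gesture is sent to the browser as a list of input sources, each carrying its own sequence of steps. The sketch below builds such a body for a mouse drag using only the standard library. The helper name `drag_and_drop_actions`, the source id "mouse", and the durations are assumptions for illustration; nothing here is sent to a browser.

```python
import json


def drag_and_drop_actions(start, end, hold_ms=100):
    """Build a W3C 'Perform Actions' style body for one mouse dragging from start to end."""
    (x1, y1), (x2, y2) = start, end
    pointer_steps = [
        {"type": "pointerMove", "duration": 0, "x": x1, "y": y1, "origin": "viewport"},
        {"type": "pointerDown", "button": 0},
        {"type": "pause", "duration": hold_ms},
        {"type": "pointerMove", "duration": 250, "x": x2, "y": y2, "origin": "viewport"},
        {"type": "pointerUp", "button": 0},
    ]
    return {
        "actions": [
            {
                "type": "pointer",
                "id": "mouse",  # arbitrary identifier for this input source
                "parameters": {"pointerType": "mouse"},
                "actions": pointer_steps,
            }
        ]
    }


payload = drag_and_drop_actions((10, 20), (300, 200))
print(json.dumps(payload, indent=2))
```

Adding a second entry of type "key" to the outer list is how keyboard and pointer steps can be lined up to run in the same tick.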
Fluent Wait in Selenium 4 provides better control over synchronization by allowing more refined polling intervals and exception handling. It helps manage dynamic elements that take unpredictable time to load by waiting until a specific condition is met rather than using hard-coded sleep durations.
Although Fluent Wait existed in Selenium 3, Selenium 4 tidies up its syntax. In Java, for example, timeouts and polling intervals are now passed as java.time.Duration values instead of a number plus a TimeUnit. Testers can define custom conditions or ignore specific exceptions while polling, reducing flaky test results caused by timing mismatches.
Example – Implementing Fluent Wait (Python):
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.common.exceptions import NoSuchElementException, StaleElementReferenceException
driver = webdriver.Chrome()
driver.get("https://example.com/login")
try:
    # Poll every 2 seconds for up to 15 seconds, ignoring transient lookup errors
    wait = WebDriverWait(
        driver, 15, poll_frequency=2,
        ignored_exceptions=[NoSuchElementException, StaleElementReferenceException],
    )
    element = wait.until(EC.presence_of_element_located((By.ID, "username")))
    element.send_keys("admin")
finally:
    driver.quit()
By defining the timeout, polling interval, and ignored exceptions, Fluent Wait ensures tests adapt dynamically to real-world page load conditions, especially in JavaScript-heavy applications.
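The polling behavior described above can be sketched without a browser. The following is a minimal stand-in for the idea, not Selenium's implementation: `wait_until`, `WaitTimeout`, and the simulated condition are hypothetical names, and `LookupError` stands in for a "not found yet" exception.

```python
import time


class WaitTimeout(Exception):
    """Raised when the condition never becomes truthy (hypothetical name)."""


def wait_until(condition, timeout=5.0, poll_frequency=0.5, ignored_exceptions=()):
    """Call condition() repeatedly until it returns a truthy value or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except tuple(ignored_exceptions):
            pass  # treat the listed exceptions as "not ready yet"
        if time.monotonic() >= deadline:
            raise WaitTimeout(f"condition not met within {timeout} seconds")
        time.sleep(poll_frequency)


attempts = {"count": 0}


def element_appears_on_third_poll():
    # Simulates an element that is missing for the first two polls
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise LookupError("element not in DOM yet")
    return "username-field"


found = wait_until(
    element_appears_on_third_poll,
    timeout=2,
    poll_frequency=0.01,
    ignored_exceptions=[LookupError],
)
print(found, attempts["count"])
```

The wait returns as soon as the condition succeeds, which is why a short polling interval usually beats a fixed sleep.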
Selenium 4 extends screenshot support so you can capture a specific element directly and, in Firefox, the full scrollable page. Previously, testers relied on third-party tools or browser extensions for full-page captures; the standard save_screenshot call still captures only the visible viewport.
This feature is valuable for visual regression testing, UI validation, and documentation, ensuring that captured views accurately represent browser-rendered states.
Example – Capturing a Full Page Screenshot (Firefox):
from selenium import webdriver
driver = webdriver.Firefox()
driver.get("https://www.selenium.dev/")
# Full-page capture is provided by the Firefox driver; save_screenshot() covers only the viewport
driver.save_full_page_screenshot("fullpage.png")
driver.quit()
Example – Capturing a Specific Element:
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.get("https://www.selenium.dev/")
logo = driver.find_element(By.CLASS_NAME, "navbar-logo")
# Save an image cropped to the element's bounding box
logo.screenshot("logo.png")
driver.quit()
The ability to capture elements directly without external utilities simplifies visual validation workflows and improves test reporting accuracy.
Selenium 4 introduces native OpenTelemetry integration that allows testers to monitor and trace test execution for performance and debugging purposes. This feature records each WebDriver command, its duration, and related metadata, providing insights into where bottlenecks or failures occur during test runs.
This is particularly beneficial in CI/CD environments where distributed testing is common and root-cause analysis can be complex. The collected telemetry can be sent to monitoring tools like Jaeger, Grafana, or Elastic APM for visualization and performance trend tracking.
Example – Enabling Tracing in Selenium 4:
java -jar selenium-server-4.0.0.jar standalone --tracing true
Once tracing is enabled, Selenium automatically collects execution data, helping QA teams analyze latency, identify bottlenecks, and improve overall testing efficiency.
Selenium 4 deprecates DesiredCapabilities in favor of more structured, browser-specific Options classes, and the older find_element_by_* helper methods in favor of find_element(By.<strategy>, value). These changes make the API cleaner, more consistent, and less error-prone.
Instead of defining capabilities as key-value pairs, testers can now use explicit classes like ChromeOptions, FirefoxOptions, or EdgeOptions, which provide better IDE support and validation.
Example – Using Options Instead of DesiredCapabilities:
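from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument("--start-maximized")
# browserVersion and platformName are mainly meant for a remote Grid or cloud provider;
# a local driver may reject values it cannot match, so drop these two lines when running locally.
options.set_capability("browserVersion", "latest")
options.set_capability("platformName", "Windows 10")
driver = webdriver.Chrome(options=options)
driver.get("https://example.com")
driver.quit()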
This approach improves maintainability and ensures future compatibility with W3C-compliant drivers. Deprecated locator strategies and capability handling have been replaced with clear, structured APIs that align better with modern automation frameworks.
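When migrating an old capabilities dictionary, it helps to see which keys carry over. The sketch below renames the two legacy keys that changed and flags keys that are neither standard nor vendor-prefixed. The helper `migrate_capabilities` is hypothetical, and the set of standard names is an assumption based on the W3C WebDriver capability list, so check it against the specification before relying on it.

```python
# Standard W3C capability names (assumed list; verify against the specification)
STANDARD_KEYS = {
    "browserName", "browserVersion", "platformName", "acceptInsecureCerts",
    "pageLoadStrategy", "proxy", "setWindowRect", "timeouts",
    "strictFileInteractability", "unhandledPromptBehavior",
}
# Legacy JSON Wire names and their W3C replacements
RENAMED = {"version": "browserVersion", "platform": "platformName"}


def migrate_capabilities(legacy):
    """Return (w3c_caps, rejected): accepted keys, and keys that still need a vendor prefix."""
    w3c_caps, rejected = {}, []
    for key, value in legacy.items():
        key = RENAMED.get(key, key)
        # Extension capabilities are recognizable by the colon in "vendor:name"
        if key in STANDARD_KEYS or ":" in key:
            w3c_caps[key] = value
        else:
            rejected.append(key)
    return w3c_caps, rejected


caps, rejected = migrate_capabilities({
    "browserName": "chrome",
    "version": "latest",
    "platform": "Windows 10",
    "goog:loggingPrefs": {"browser": "ALL"},
    "build": "nightly",
})
print(caps)
print(rejected)
```

Keys that end up in the rejected list typically belong inside a vendor-prefixed block, such as the "bstack:options" dictionary used by cloud providers.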
One of the most advanced updates in Selenium 4 is the introduction of Bidirectional (BiDi) APIs. These APIs create a two-way communication channel between the test script and the browser, allowing Selenium to send and receive real-time events from the browser while the test is running.
With BiDi APIs, testers can now perform powerful operations such as intercepting network requests, listening to console logs, monitoring DOM changes, or simulating network conditions without requiring any external tools. This bridges the gap between Selenium and browser-native debugging capabilities.
Example – Reading Console Logs (Python):
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
# Capture every console level, including console.log output
options.set_capability("goog:loggingPrefs", {"browser": "ALL"})
driver = webdriver.Chrome(options=options)
driver.execute_script("console.log('Hello from Selenium 4');")
# get_log polls the Chromium log endpoint; it is a simple stand-in for event-based BiDi listeners
logs = driver.get_log("browser")
for log in logs:
    print(log["message"])
driver.quit()
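Example – Simulating Network Conditions (via CDP):
from selenium import webdriver
driver = webdriver.Chrome()
driver.execute_cdp_cmd("Network.enable", {})
# Latency is in milliseconds; throughput values are in bytes per second
driver.execute_cdp_cmd("Network.emulateNetworkConditions", {
    "offline": False,
    "latency": 100,
    "downloadThroughput": 50000,
    "uploadThroughput": 50000
})
driver.get("https://example.com")
driver.quit()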
These bidirectional interactions make Selenium more powerful for debugging, network control, and real-time monitoring, capabilities that were previously limited to DevTools-driven automation tools.
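The throughput values passed to Network.emulateNetworkConditions are in bytes per second, which is easy to get wrong when a connection profile is quoted in kilobits per second. The helper below, a hypothetical `network_conditions` function written only for illustration, does the conversion and returns a parameter dictionary in the shape that CDP command expects.

```python
def network_conditions(latency_ms, download_kbps, upload_kbps, offline=False):
    """Build parameters for CDP Network.emulateNetworkConditions from kilobits per second."""

    def to_bytes_per_second(kbps):
        # 1 kilobit = 1000 bits, and 8 bits = 1 byte
        return kbps * 1000 / 8

    return {
        "offline": offline,
        "latency": latency_ms,
        "downloadThroughput": to_bytes_per_second(download_kbps),
        "uploadThroughput": to_bytes_per_second(upload_kbps),
    }


params = network_conditions(latency_ms=100, download_kbps=400, upload_kbps=400)
print(params)
```

A 400 kbps link works out to 50,000 bytes per second, which matches the values used in the network example above.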
Even with Selenium 4’s improved reliability, local or simulated browser tests still fail to represent real-world performance. Variations in hardware, operating systems, screen resolutions, and browser versions can significantly affect how applications behave. Testing on real devices ensures that scripts validate true user conditions such as rendering accuracy, responsiveness, and interactive performance.
Running Selenium tests on a real device cloud also eliminates the need to maintain an internal device lab, reducing setup overhead and ensuring scalability. Teams can execute parallel tests across multiple browsers and OS combinations to detect environment-specific issues early in the development cycle.
BrowserStack Automate provides a reliable, cloud-based platform for running Selenium tests on 3500+ real browsers and devices. It integrates directly with Selenium 4 and supports advanced capabilities like parallel execution, debugging with video logs, and seamless CI/CD integration.
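Example – Running Selenium Tests on BrowserStack Automate:
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
# Options replaces the desired_capabilities argument, which newer Selenium 4 releases no longer accept
options = Options()
options.set_capability("browserVersion", "latest")
options.set_capability("bstack:options", {
    "os": "Windows",
    "osVersion": "11",
    "buildName": "Selenium 4 Tests",
    "sessionName": "Login flow validation"
})
# Authentication is required: add your own BrowserStack credentials
# (for example "userName" and "accessKey" inside bstack:options) before running.
driver = webdriver.Remote(
    command_executor="https://hub-cloud.browserstack.com/wd/hub",
    options=options
)
driver.get("https://example.com")
print(driver.title)
driver.quit()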
By using BrowserStack Automate with Selenium 4, QA teams can validate performance, UI behavior, and functionality across real-world conditions at scale, ensuring accurate, production-grade test coverage without the burden of device maintenance.
Selenium 4 brings stronger APIs, W3C compliance, better debugging, and a more stable Grid architecture for scalable automation. Features like Relative Locators, Bidirectional APIs, and improved screenshot options simplify testing while offering deeper insights into web application behavior.
To get the most accurate results, tests should run on real browsers and devices. BrowserStack Automate enables Selenium 4 tests to run across thousands of real environments in the cloud, ensuring reliable, cross-browser testing that reflects real user conditions.