Selenium WebDriver Complete Guide

Parul Dhingra - Senior Quality Analyst

Updated: 1/23/2026

Selenium WebDriver is a long-standing standard for browser automation and remains one of the most widely used web testing frameworks. Whether you're automating regression tests, scraping data, or running browser tests in CI/CD pipelines, understanding Selenium fundamentals is essential for an automation engineer.

This comprehensive guide takes you from installation to writing robust test scripts, covering architecture concepts, best practices, and real-world patterns used by professional automation teams.

What is Selenium WebDriver?

Selenium WebDriver is an API and protocol that defines a language-neutral interface for controlling web browsers. It allows you to write code that:

  • Opens browsers programmatically
  • Navigates to URLs
  • Interacts with page elements (click, type, select)
  • Extracts information from pages
  • Validates page content and behavior

Key Features

| Feature | Description |
| --- | --- |
| Multi-browser support | Chrome, Firefox, Safari, Edge |
| Multi-language bindings | Java, Python, C#, JavaScript, Ruby |
| W3C standard | WebDriver protocol is an official W3C specification |
| Framework integration | Works with TestNG, JUnit, pytest, and others |
| Cross-platform | Windows, macOS, Linux |

Selenium Components

  • Selenium WebDriver - The core library for browser automation
  • Selenium IDE - Browser extension for recording and playback
  • Selenium Grid - Distributes tests across multiple machines

Career Value: Selenium is among the automation tools most often requested in QA job listings. Mastering it opens doors across industries and companies of all sizes.

Selenium Architecture

Understanding Selenium's architecture helps you write better tests and debug issues effectively.

How WebDriver Works

Test Script → WebDriver API → Browser Driver → Browser
  1. The test script calls a WebDriver method (e.g., click())
  2. The language binding translates the call into an HTTP request
  3. The browser driver (ChromeDriver, GeckoDriver) receives the request and forwards it to the browser
  4. The browser executes the action, and the driver returns the result as an HTTP response

Browser Drivers

Each browser requires its own driver executable:

| Browser | Driver | Download |
| --- | --- | --- |
| Chrome | ChromeDriver | chromedriver.chromium.org |
| Firefox | GeckoDriver | github.com/mozilla/geckodriver |
| Edge | EdgeDriver | developer.microsoft.com |
| Safari | SafariDriver | Built into macOS |

WebDriver Protocol

WebDriver uses a REST-like protocol:

  • Commands sent as HTTP requests
  • Responses returned as JSON
  • Sessions maintain browser state
  • Standardized across all implementations
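
As a rough illustration of the points above, the sketch below builds the HTTP method, path, and JSON body for a handful of W3C WebDriver commands without sending anything. The helper name build_request and the command labels are hypothetical and chosen for this example; the endpoint paths follow the W3C WebDriver specification.

```python
import json


def build_request(command, session_id=None, element_id=None, **params):
    """Return (HTTP method, path, JSON body) for a few W3C WebDriver commands.

    Nothing is sent over the network; this only shows what a language
    binding would transmit to the browser driver.
    """
    routes = {
        "new_session": ("POST", "/session"),
        "navigate": ("POST", "/session/{session_id}/url"),
        "find_element": ("POST", "/session/{session_id}/element"),
        "click": ("POST", "/session/{session_id}/element/{element_id}/click"),
        "delete_session": ("DELETE", "/session/{session_id}"),
    }
    method, template = routes[command]
    path = template.format(session_id=session_id, element_id=element_id)
    # POST commands carry a JSON body (possibly empty); DELETE has none.
    body = json.dumps(params) if method == "POST" else None
    return method, path, body


if __name__ == "__main__":
    # "abc123" is a made-up session id for demonstration only.
    print(build_request("navigate", session_id="abc123", url="https://example.com"))
    print(build_request("find_element", session_id="abc123",
                        using="css selector", value="#login"))
    print(build_request("delete_session", session_id="abc123"))
```

The session id in every path is what lets the driver keep browser state between otherwise independent HTTP requests.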

Setting Up Selenium

Java Setup

1. Add Maven dependency:

<dependency>
    <groupId>org.seleniumhq.selenium</groupId>
    <artifactId>selenium-java</artifactId>
    <version>4.18.1</version>
</dependency>

2. Add the WebDriverManager dependency (optional since Selenium 4.6, which bundles Selenium Manager):

<dependency>
    <groupId>io.github.bonigarcia</groupId>
    <artifactId>webdrivermanager</artifactId>
    <version>5.7.0</version>
</dependency>

3. Basic setup code:

import io.github.bonigarcia.wdm.WebDriverManager;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.chrome.ChromeDriver;
 
public class Setup {
    public static void main(String[] args) {
        // Automatically download and configure ChromeDriver
        WebDriverManager.chromedriver().setup();
 
        // Create browser instance
        WebDriver driver = new ChromeDriver();
 
        // Navigate to URL
        driver.get("https://example.com");
 
        // Clean up
        driver.quit();
    }
}

Python Setup

1. Install Selenium:

pip install selenium

2. Install WebDriver Manager:

pip install webdriver-manager

3. Basic setup code:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
 
# Automatically download and configure ChromeDriver
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
 
# Navigate to URL
driver.get("https://example.com")
 
# Clean up
driver.quit()
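
Driver Version Mismatch: The most common setup error is a mismatch between the driver and browser versions. WebDriver Manager avoids this by automatically downloading the driver version that matches your installed browser. Selenium 4.6 and later also bundle Selenium Manager, which resolves a driver automatically when none is configured.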
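
To make the mismatch concrete, here is a minimal sketch of the check a driver manager effectively performs for Chrome. It assumes the Chrome/ChromeDriver convention that the major version numbers should match; other drivers such as GeckoDriver use their own version numbering, so this check does not apply to them. The function names and version strings are hypothetical.

```python
def major_version(version: str) -> int:
    """Extract the major version from a dotted version string."""
    return int(version.strip().split(".")[0])


def versions_compatible(browser_version: str, driver_version: str) -> bool:
    """True when browser and driver share the same major version."""
    return major_version(browser_version) == major_version(driver_version)


if __name__ == "__main__":
    # Example version strings, chosen for illustration only.
    print(versions_compatible("122.0.6261.95", "122.0.6261.94"))
    print(versions_compatible("122.0.6261.95", "121.0.6167.85"))
```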

Your First Selenium Script

Let's write a complete test that searches on Google:

Java Example

import io.github.bonigarcia.wdm.WebDriverManager;
import org.openqa.selenium.By;
import org.openqa.selenium.Keys;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.chrome.ChromeDriver;
 
public class GoogleSearchTest {
    public static void main(String[] args) {
        // Setup
        WebDriverManager.chromedriver().setup();
        WebDriver driver = new ChromeDriver();
 
        try {
            // Navigate to Google
            driver.get("https://www.google.com");
 
            // Find search box and enter query
            WebElement searchBox = driver.findElement(By.name("q"));
            searchBox.sendKeys("Selenium WebDriver tutorial");
            searchBox.sendKeys(Keys.ENTER);
 
            // Verify results page loaded
            Thread.sleep(2000); // Simple wait - use explicit waits in real code
            String title = driver.getTitle();
            System.out.println("Page title: " + title);
 
            // Verify title contains search term
            if (title.contains("Selenium")) {
                System.out.println("Test PASSED");
            } else {
                System.out.println("Test FAILED");
            }
        } catch (Exception e) {
            e.printStackTrace();
        } finally {
            driver.quit();
        }
    }
}

Python Example

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.chrome.service import Service
from webdriver_manager.chrome import ChromeDriverManager
import time
 
# Setup
driver = webdriver.Chrome(service=Service(ChromeDriverManager().install()))
 
try:
    # Navigate to Google
    driver.get("https://www.google.com")
 
    # Find search box and enter query
    search_box = driver.find_element(By.NAME, "q")
    search_box.send_keys("Selenium WebDriver tutorial")
    search_box.send_keys(Keys.ENTER)
 
    # Wait for results
    time.sleep(2)  # Simple wait - use explicit waits in real code
 
    # Verify results page loaded
    title = driver.title
    print(f"Page title: {title}")
 
    # Verify title contains search term
    if "Selenium" in title:
        print("Test PASSED")
    else:
        print("Test FAILED")
 
except Exception as e:
    print(f"Error: {e}")
 
finally:
    driver.quit()

Browser Navigation

WebDriver provides methods to navigate within and between pages.

Navigation Methods

// Navigate to URL
driver.get("https://example.com");
driver.navigate().to("https://example.com");
 
// Browser history
driver.navigate().back();
driver.navigate().forward();
driver.navigate().refresh();
 
// Get current URL
String currentUrl = driver.getCurrentUrl();
 
// Get page title
String title = driver.getTitle();
 
// Get page source
String source = driver.getPageSource();

Python Navigation

# Navigate to URL
driver.get("https://example.com")
 
# Browser history
driver.back()
driver.forward()
driver.refresh()
 
# Get current URL
current_url = driver.current_url
 
# Get page title
title = driver.title
 
# Get page source
source = driver.page_source

Finding Elements

Locating elements is fundamental to Selenium automation. WebDriver provides multiple strategies.

Locator Strategies

StrategyJavaPythonBest For
IDBy.id("login")By.ID, "login"Unique elements
NameBy.name("email")By.NAME, "email"Form fields
Class NameBy.className("btn")By.CLASS_NAME, "btn"Styled elements
CSS SelectorBy.cssSelector("#login")By.CSS_SELECTOR, "#login"Complex queries
XPathBy.xpath("//button")By.XPATH, "//button"Any element
Link TextBy.linkText("Click here")By.LINK_TEXT, "Click here"Links
Tag NameBy.tagName("input")By.TAG_NAME, "input"Element type

Finding Single vs Multiple Elements

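import java.util.List;
import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;

// Find first matching element (throws NoSuchElementException if none matches)
WebElement element = driver.findElement(By.id("username"));

// Find all matching elements (returns an empty list if none matches)
List<WebElement> elements = driver.findElements(By.className("item"));

// Check if elements exist without triggering an exception
if (!driver.findElements(By.id("error")).isEmpty()) {
    // Handle error message
}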
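
A Python counterpart to the existence check might look like the sketch below. It assumes driver is a Selenium WebDriver instance; find_elements is the real binding method and returns an empty list when nothing matches. The helper name element_exists is hypothetical, and the locator strategy is passed as a plain string ("id" is the value of By.ID) so the snippet has no imports.

```python
def element_exists(driver, by, value):
    """Return True if at least one element matches the locator.

    find_elements returns an empty list instead of raising when nothing
    matches, which makes it suitable for presence checks.
    """
    return len(driver.find_elements(by, value)) > 0


# Usage with a real driver (not executed here):
#     if element_exists(driver, "id", "error"):
#         ...handle the error message...
```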

Locator Priority

Use this priority order for stability:

  1. ID - Most reliable when available
  2. Name - Good for form elements
  3. CSS Selector - Fast and flexible
  4. XPath - Most powerful, use when needed
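
The priority order can be expressed as a small helper that picks a locator from whatever attributes an element is known to have. This is only a sketch: the function name, the attrs dictionary shape, and the data-testid and tag fallbacks are assumptions for illustration. The returned strategy strings ("id", "name", "css selector", "xpath") are the values behind Selenium's By constants in Python.

```python
def choose_locator(attrs):
    """Pick a (strategy, value) pair following ID > Name > CSS > XPath.

    attrs is a hypothetical dict describing the target element, e.g.
    {"id": "login", "name": "login", "tag": "button"}.
    """
    if attrs.get("id"):
        return ("id", attrs["id"])
    if attrs.get("name"):
        return ("name", attrs["name"])
    if attrs.get("data-testid"):
        return ("css selector", f"[data-testid='{attrs['data-testid']}']")
    # Last resort: an XPath on the tag name, which may match many elements.
    return ("xpath", f"//{attrs.get('tag', '*')}")


if __name__ == "__main__":
    print(choose_locator({"id": "login", "name": "login-btn"}))
    print(choose_locator({"data-testid": "submit", "tag": "button"}))
    print(choose_locator({"tag": "button"}))
```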

Deep Dive: See our Selenium Locators Masterclass for comprehensive locator strategies and best practices.

Interacting with Elements

Once you find elements, you can interact with them.

Common Interactions

// Click
element.click();
 
// Type text
element.sendKeys("Hello World");
 
// Clear text field
element.clear();
 
// Submit form
element.submit();
 
// Get text content
String text = element.getText();
 
// Get attribute value
String value = element.getAttribute("value");
String href = element.getAttribute("href");
 
// Check element state
boolean displayed = element.isDisplayed();
boolean enabled = element.isEnabled();
boolean selected = element.isSelected();

Working with Dropdowns

import org.openqa.selenium.support.ui.Select;
 
WebElement dropdown = driver.findElement(By.id("country"));
Select select = new Select(dropdown);
 
// Select by visible text
select.selectByVisibleText("United States");
 
// Select by value attribute
select.selectByValue("US");
 
// Select by index (0-based)
select.selectByIndex(0);
 
// Get selected option
WebElement selected = select.getFirstSelectedOption();
String selectedText = selected.getText();
 
// Get all options
List<WebElement> options = select.getOptions();

Working with Checkboxes and Radio Buttons

WebElement checkbox = driver.findElement(By.id("agree"));
 
// Check if selected
if (!checkbox.isSelected()) {
    checkbox.click();
}
 
// Radio buttons work the same way
WebElement radio = driver.findElement(By.id("option1"));
radio.click();
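
In Python, the same check-then-click pattern could be wrapped in a helper so a checkbox ends up in a requested state regardless of where it started. The name set_checkbox is hypothetical; is_selected() and click() are the WebElement methods of the Python binding, and element is assumed to be a WebElement.

```python
def set_checkbox(element, checked=True):
    """Click the checkbox only if its current state differs from the target."""
    if element.is_selected() != checked:
        element.click()


# Usage with a real driver (not executed here):
#     set_checkbox(driver.find_element("id", "agree"), checked=True)
```

Clicking unconditionally would toggle an already-checked box off, which is why the state is read first.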

Browser Management

Window Handles

import java.util.Set;

// Get current window handle
String mainWindow = driver.getWindowHandle();
 
// Click link that opens new window
element.click();
 
// Get all window handles
Set<String> allWindows = driver.getWindowHandles();
 
// Switch to new window
for (String window : allWindows) {
    if (!window.equals(mainWindow)) {
        driver.switchTo().window(window);
        break;
    }
}
 
// Switch back to main window
driver.switchTo().window(mainWindow);
 
// Close current window
driver.close();
 
// Quit browser (closes all windows)
driver.quit();
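
The window-switching loop translates to Python roughly as follows. The helper name switch_to_new_window is hypothetical; window_handles and switch_to.window() are the Python binding's equivalents of getWindowHandles() and switchTo().window(), and driver is assumed to be a WebDriver instance.

```python
def switch_to_new_window(driver, original_handle):
    """Switch to the first window whose handle differs from the original.

    Returns the new handle, or raises if no other window is open.
    """
    for handle in driver.window_handles:
        if handle != original_handle:
            driver.switch_to.window(handle)
            return handle
    raise RuntimeError("No window other than the original is open")


# Usage with a real driver (not executed here):
#     main = driver.current_window_handle
#     link.click()                      # opens a new window or tab
#     switch_to_new_window(driver, main)
#     driver.close()                    # close the new window
#     driver.switch_to.window(main)     # return to the original
```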

Window Size and Position

// Maximize window
driver.manage().window().maximize();
 
// Set specific size
driver.manage().window().setSize(new Dimension(1920, 1080));
 
// Fullscreen
driver.manage().window().fullscreen();
 
// Get window size
Dimension size = driver.manage().window().getSize();

Handling Frames

// Switch to frame by index
driver.switchTo().frame(0);
 
// Switch to frame by name or ID
driver.switchTo().frame("frameName");
 
// Switch to frame by WebElement
WebElement frameElement = driver.findElement(By.id("myframe"));
driver.switchTo().frame(frameElement);
 
// Switch back to main content
driver.switchTo().defaultContent();
 
// Switch to parent frame
driver.switchTo().parentFrame();
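
Forgetting to switch back out of a frame is a common source of element-not-found errors. One way to guard against it in Python is a context manager, sketched below. The name in_frame is hypothetical; switch_to.frame() and switch_to.default_content() are the Python binding's methods, and driver is assumed to be a WebDriver instance.

```python
from contextlib import contextmanager


@contextmanager
def in_frame(driver, frame_reference):
    """Run a block inside a frame, then always return to the main document.

    frame_reference may be an index, a name or ID, or a frame WebElement.
    """
    driver.switch_to.frame(frame_reference)
    try:
        yield driver
    finally:
        # Runs even if the block raises, so later lookups start from the top.
        driver.switch_to.default_content()


# Usage with a real driver (not executed here):
#     with in_frame(driver, "frameName"):
#         driver.find_element("id", "inside-frame").click()
```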

Handling Common Scenarios

JavaScript Alerts

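import org.openqa.selenium.Alert;

// Switch to the open alert
Alert alert = driver.switchTo().alert();

// Get alert text
String alertText = alert.getText();

// Accept alert (OK); this closes the alert
alert.accept();

// Dismiss alert (Cancel); use instead of accept(), not after it
alert.dismiss();

// Enter text in a prompt (switch to the prompt first, as above), then confirm
alert.sendKeys("my input");
alert.accept();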

Taking Screenshots

import org.openqa.selenium.TakesScreenshot;
import org.openqa.selenium.OutputType;
import java.io.File;
// FileUtils is part of Apache Commons IO, which must be on the classpath
import org.apache.commons.io.FileUtils;
 
// Take screenshot
File screenshot = ((TakesScreenshot) driver).getScreenshotAs(OutputType.FILE);
 
// Save to file
FileUtils.copyFile(screenshot, new File("screenshot.png"));
 
// Element screenshot
WebElement element = driver.findElement(By.id("chart"));
File elementScreenshot = element.getScreenshotAs(OutputType.FILE);

Executing JavaScript

import org.openqa.selenium.JavascriptExecutor;
 
JavascriptExecutor js = (JavascriptExecutor) driver;
 
// Execute script
js.executeScript("alert('Hello!')");
 
// Scroll to element
js.executeScript("arguments[0].scrollIntoView(true);", element);
 
// Return value from script
Long height = (Long) js.executeScript("return document.body.scrollHeight");
 
// Click element using JavaScript (bypasses overlay issues)
js.executeScript("arguments[0].click();", element);

Best Practices

1. Always Use Explicit Waits

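import java.time.Duration;
import org.openqa.selenium.By;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.support.ui.WebDriverWait;
import org.openqa.selenium.support.ui.ExpectedConditions;

// Poll for up to 10 seconds before giving up with a TimeoutException
WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(10));

// Wait for element to be clickable
WebElement button = wait.until(
    ExpectedConditions.elementToBeClickable(By.id("submit"))
);
button.click();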
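
To show what an explicit wait does conceptually, here is a simplified polling loop in plain Python. It is not Selenium's implementation: the real WebDriverWait also ignores NoSuchElementException by default and raises Selenium's own TimeoutException, whereas this sketch uses the built-in TimeoutError. The function name wait_until and its parameters are hypothetical.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or raises TimeoutError once the deadline passes.
    """
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Usage with a real driver (not executed here):
#     wait_until(lambda: driver.find_elements("id", "results"), timeout=10)
```

Unlike a fixed sleep, the loop returns as soon as the condition holds, so tests wait only as long as they need to.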

2. Use Page Object Model

Separate page structure from test logic:

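import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.support.FindBy;
import org.openqa.selenium.support.PageFactory;

public class LoginPage {
    private final WebDriver driver;

    @FindBy(id = "username")
    private WebElement usernameField;

    @FindBy(id = "password")
    private WebElement passwordField;

    @FindBy(id = "login")
    private WebElement loginButton;

    public LoginPage(WebDriver driver) {
        this.driver = driver;
        // Wire the @FindBy fields to lazily located elements
        PageFactory.initElements(driver, this);
    }

    // Tests call this method instead of touching locators directly
    public void login(String username, String password) {
        usernameField.sendKeys(username);
        passwordField.sendKeys(password);
        loginButton.click();
    }
}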
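
A Python version of the same page object might look like the sketch below. It keeps locators as class-level tuples instead of using PageFactory, which is a common Python convention rather than a direct port. The strings "id" match the value of By.ID, so the class has no imports; driver is assumed to be a WebDriver instance, and the element IDs are the same illustrative ones used in the Java example.

```python
class LoginPage:
    """Page object for a login form; locators live in one place."""

    USERNAME = ("id", "username")
    PASSWORD = ("id", "password")
    LOGIN_BUTTON = ("id", "login")

    def __init__(self, driver):
        self.driver = driver

    def login(self, username, password):
        """Fill in the credentials and submit the form."""
        self.driver.find_element(*self.USERNAME).send_keys(username)
        self.driver.find_element(*self.PASSWORD).send_keys(password)
        self.driver.find_element(*self.LOGIN_BUTTON).click()


# Usage with a real driver (not executed here):
#     LoginPage(driver).login("alice", "secret")
```

If the login button's ID changes, only the LOGIN_BUTTON tuple needs updating; tests that call login() stay untouched.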

3. Clean Up Resources

Always close the browser, even on test failure:

// @AfterMethod is a TestNG annotation; the JUnit 5 equivalent is @AfterEach
@AfterMethod
public void tearDown() {
    if (driver != null) {
        driver.quit();
    }
}
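
Outside a test framework, the same guarantee can be had in Python with a context manager, sketched here. The name managed_driver and the create_driver factory parameter are hypothetical; quit() is the WebDriver method that ends the session and closes all windows.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(create_driver):
    """Create a driver, hand it to the caller, and always quit it afterwards.

    create_driver is any zero-argument callable returning a WebDriver,
    for example webdriver.Chrome.
    """
    driver = create_driver()
    try:
        yield driver
    finally:
        # Runs on success and on failure, so no browser process is left behind.
        driver.quit()


# Usage with Selenium installed (not executed here):
#     from selenium import webdriver
#     with managed_driver(webdriver.Chrome) as driver:
#         driver.get("https://example.com")
```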

4. Use Meaningful Locators

// Good - uses stable attributes
driver.findElement(By.id("login-button"));
driver.findElement(By.cssSelector("[data-testid='submit']"));
 
// Avoid - fragile locators
driver.findElement(By.xpath("/html/body/div[3]/form/button"));
driver.findElement(By.className("btn-primary")); // if many exist

5. Handle Dynamic Content

// Wait for content to load
wait.until(ExpectedConditions.visibilityOfElementLocated(By.id("results")));
 
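// Wait until at least one matching element exists
// (the lambda parameter is named d so it cannot clash with a local variable named driver)
wait.until(d -> !d.findElements(By.className("item")).isEmpty());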

Frequently Asked Questions

What programming language should I learn for Selenium?

What is the difference between Selenium WebDriver and Selenium IDE?

Why do I get NoSuchElementException even though the element exists on the page?

What is the difference between driver.close() and driver.quit()?

How do I handle dynamic elements that change IDs?

Should I use CSS selectors or XPath?

How do I run Selenium tests without opening a visible browser?

What is WebDriver Manager and why should I use it?