Data-Driven Testing (CSV, JSON, Excel)
Data-Driven Testing (DDT) separates test logic from test data: the test script runs the same logic repeatedly with different sets of input data. A single test function can then cover hundreds of data combinations, non-technical stakeholders can add test scenarios by editing a CSV file, and coverage improves for input-heavy features such as forms, pricing engines, and eligibility rules.
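The core idea can be shown without a browser or a test framework. The sketch below is a minimal illustration, assuming a hypothetical pricing rule `apply_discount` and a made-up case table (neither comes from a real system). One check runs against every row, so adding a scenario means adding a row rather than writing a new test.

```python
# Hypothetical function under test: 10% off orders of 100 or more.
def apply_discount(subtotal):
    if subtotal < 0:
        raise ValueError("subtotal must not be negative")
    return round(subtotal * 0.9, 2) if subtotal >= 100 else subtotal


# Test data as rows of (subtotal, expected_total). Illustrative values only.
CASES = [
    (0, 0),
    (99.99, 99.99),  # just below the threshold
    (100, 90.0),     # exactly at the threshold
    (250, 225.0),
]


def run_cases(cases):
    """Run the same check against every row and return the rows that failed."""
    failures = []
    for subtotal, expected in cases:
        actual = apply_discount(subtotal)
        if actual != expected:
            failures.append((subtotal, expected, actual))
    return failures


if __name__ == "__main__":
    print(run_cases(CASES))
```

The test logic lives in `run_cases` and never changes; only `CASES` grows. A test framework's parametrization plays the same role, with better reporting per row.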
Data-Driven Testing Approaches
# ══════════════════════════════════════════════════════════════
# APPROACH 1: pytest.mark.parametrize (simplest: inline data)
# ══════════════════════════════════════════════════════════════
import pytest

# `driver` and `base_url` are assumed to be fixtures defined elsewhere
# (for example in conftest.py).
@pytest.mark.parametrize("email,password,expected_url,expected_msg", [
    ("alice@test.com", "Test@1234", "/dashboard", None),  # Valid
    ("wrong@test.com", "Test@1234", "/login", "Invalid credentials"),
    ("alice@test.com", "wrongpass", "/login", "Invalid credentials"),
    ("", "Test@1234", "/login", "Email is required"),
    ("alice@test.com", "", "/login", "Password is required"),
    ("admin@test.com", "Admin@1234", "/admin", None),  # Admin user
])
def test_login_data_driven(driver, base_url, email, password, expected_url, expected_msg):
    driver.get(f"{base_url}/login")
    driver.find_element("id", "email").send_keys(email)
    driver.find_element("id", "password").send_keys(password)
    driver.find_element("id", "submit").click()

    assert expected_url in driver.current_url
    # Only failing logins are expected to show an error message
    if expected_msg:
        error = driver.find_element("css selector", ".error-message")
        assert expected_msg in error.text

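# ══════════════════════════════════════════════════════════════
# APPROACH 2: CSV file data source (for large datasets)
# ══════════════════════════════════════════════════════════════
# test_data/login_data.csv:
# email,password,expected_url,expected_msg
# alice@test.com,Test@1234,/dashboard,
# wrong@test.com,Test@1234,/login,Invalid credentials
import csv
import pytest

def read_csv_data(filepath):
    """Read test data from CSV and return a list of tuples."""
    data = []
    # newline="" is what the csv module expects when opening files
    with open(filepath, "r", newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        for row in reader:
            data.append((
                row["email"],
                row["password"],
                row["expected_url"],
                row["expected_msg"] or None,  # empty cell means "no error expected"
            ))
    return data

# The path is relative to the directory pytest is run from.
@pytest.mark.parametrize("email,password,expected_url,expected_msg",
                         read_csv_data("test_data/login_data.csv"))
def test_login_csv(driver, base_url, email, password, expected_url, expected_msg):
    # Same test logic as the inline example; only the data source differs
    driver.get(f"{base_url}/login")
    driver.find_element("id", "email").send_keys(email)
    driver.find_element("id", "password").send_keys(password)
    driver.find_element("id", "submit").click()

    assert expected_url in driver.current_url
    if expected_msg:
        error = driver.find_element("css selector", ".error-message")
        assert expected_msg in error.text
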
# ══════════════════════════════════════════════════════════════
# APPROACH 3: JSON file data source
# ══════════════════════════════════════════════════════════════
# test_data/products.json:
# [
#   {"sku": "P001", "name": "iPhone", "price": 999, "inStock": true},
#   {"sku": "P002", "name": "MacBook", "price": 1299, "inStock": true},
#   {"sku": "P003", "name": "AirPods", "price": 179, "inStock": false}
# ]
import json
import pytest

def load_json_data(filepath):
    """Load a JSON array of test records and return it as a list of dicts."""
    with open(filepath, encoding="utf-8") as f:
        return json.load(f)

products = load_json_data("test_data/products.json")

# ids= labels each generated test with the product's SKU in the report
@pytest.mark.parametrize("product", products, ids=lambda p: p["sku"])
def test_product_card_displays_correctly(driver, base_url, product):
    driver.get(f"{base_url}/products/{product['sku']}")

    name_el = driver.find_element("css selector", ".product-name")
    price_el = driver.find_element("css selector", ".product-price")
    assert name_el.text == product["name"]
    assert f"${product['price']}" in price_el.text

    stock_badge = driver.find_element("css selector", ".stock-badge")
    expected_text = "In Stock" if product["inStock"] else "Out of Stock"
    assert stock_badge.text == expected_text

Common Mistakes
- Hardcoding test data in test functions: data belongs in external files or parametrize decorators, not embedded in test logic
- Not testing invalid data: DDT is often used only for valid inputs; always include invalid and edge-case rows in your data files
- CSV files without headers: always include headers in CSV files and use DictReader for readable column access
- Mixing production data into test data files: test data files should use fake or anonymized data; never include real user credentials in committed files
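Several of these points can be combined in one loader. The sketch below is one possible approach, not a prescribed one: `REQUIRED_COLUMNS` and `load_login_rows` are hypothetical names, and the column names are taken from the login CSV shown earlier. It fails fast when a header column is missing, converts an empty `expected_msg` cell to `None`, and builds a readable ID for each row.

```python
import csv

# Assumed column set, matching the login CSV example shown earlier.
REQUIRED_COLUMNS = ("email", "password", "expected_url", "expected_msg")


def load_login_rows(filepath):
    """Return (rows, ids) read from a CSV file that has a header line."""
    rows, ids = [], []
    with open(filepath, newline="", encoding="utf-8") as f:
        reader = csv.DictReader(f)
        # fieldnames is None for an empty file, so fall back to an empty list
        header = reader.fieldnames or []
        missing = [col for col in REQUIRED_COLUMNS if col not in header]
        if missing:
            raise ValueError(f"{filepath}: missing CSV columns {missing}")
        for row in reader:
            rows.append((
                row["email"],
                row["password"],
                row["expected_url"],
                row["expected_msg"] or None,  # empty cell: no error expected
            ))
            # line_num points at the CSV line just read, which helps
            # whoever edits the file find a failing row
            ids.append(f"line{reader.line_num}-{row['email'] or 'empty-email'}")
    return rows, ids
```

The two return values can be passed to `pytest.mark.parametrize` as the argument values and the `ids=` keyword respectively. Building the path from the test file's own location (for example with `pathlib.Path(__file__).parent`) would keep the loader independent of the directory pytest is started from.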
Tip
Practice data-driven testing with CSV, JSON, and Excel data in small, isolated examples before integrating it into larger projects. Small experiments build understanding faster than reading alone. A good first experiment is to parameterize one existing test with external data.
Practice Task
(1) Write a working data-driven test backed by a CSV, JSON, or Excel file from scratch, without looking at notes. (2) Modify it to handle an edge case (empty input, null value, or error state). (3) Share your solution in the Priygop community for feedback.
Common Mistake
A common mistake in data-driven testing with CSV, JSON, or Excel data is skipping edge cases: empty inputs, null values, and unexpected data types. Always include boundary conditions in your data files so the tests catch failures that valid-only rows would miss.
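One way to make edge cases hard to forget is to keep them in a reusable list that is appended to every data set. The sketch below assumes a hypothetical `is_valid_email` validator and hypothetical `EDGE_CASES` rows; the validator is deliberately simple and is not a complete email check.

```python
import re

# Hypothetical validator under test: something@something.something
_EMAIL_RE = re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+")


def is_valid_email(value):
    """Return True only for strings that look like an email address."""
    return isinstance(value, str) and _EMAIL_RE.fullmatch(value) is not None


# Rows of (input, expected_result). Illustrative values only.
VALID_CASES = [
    ("alice@test.com", True),
]
EDGE_CASES = [
    ("", False),        # empty input
    (None, False),      # null value
    ("   ", False),     # whitespace only
    (12345, False),     # unexpected data type
    ("alice@", False),  # truncated address
]


def failing_rows(cases):
    """Return every row whose actual result differs from the expected one."""
    return [(value, expected) for value, expected in cases
            if is_valid_email(value) != expected]


if __name__ == "__main__":
    print(failing_rows(VALID_CASES + EDGE_CASES))
```

Keeping `EDGE_CASES` separate from the feature-specific rows means the same boundary inputs can be reused across several tests instead of being retyped in each data file.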
Key Takeaways
- Data-Driven Testing (DDT) separates test logic from test data: the same test logic runs repeatedly with different sets of input data.
- Keep test data in external files or parametrize decorators rather than embedding it in test logic.
- Include invalid and edge-case rows in your data files, not only valid inputs.
- Always include headers in CSV files and use DictReader for readable column access.