QA Engineering Interview Questions
Master these 30 carefully curated interview questions to ace your next QA Engineering interview.
Testing shows defects exist; exhaustive testing is impossible; early testing saves cost; defects cluster; pesticide paradox; testing is context-dependent; absence-of-errors fallacy.
1. Testing shows the presence of defects, not their absence — testing reduces the probability of undiscovered defects but cannot prove the software is defect-free. 2. Exhaustive testing is impossible — testing everything is impractical; risk-based and priority-based testing is used instead. 3. Early testing saves time and money — defects found in requirements cost 1× to fix; in coding 10×; in production 100×. 4. Defects cluster — a small number of modules typically contain most defects (Pareto principle: 80% of defects in 20% of modules). 5. Pesticide paradox — repeating the same tests stops finding new defects; test cases must be reviewed and updated regularly. 6. Testing is context-dependent — testing a medical device requires different rigor than an e-commerce site. 7. Absence of errors fallacy — finding and fixing defects is useless if the system is unusable or doesn't meet user needs.
Verification checks 'are we building the product right?' (process, documentation); validation checks 'are we building the right product?' (user needs, actual behavior).
Verification is the process of evaluating work products (requirements, design, code, test plans) to ensure they conform to specified requirements — it's done WITHOUT executing the software. Activities: reviews, inspections, walkthroughs, static analysis. Example: verifying that a test plan covers all requirements in the RTM. Validation is the process of evaluating the finished software by executing it to ensure it meets user needs and business requirements. Activities: functional testing, UAT, exploratory testing. Example: having actual end-users run the payment flow and confirm it meets their real-world needs. A product can pass verification (built to specification) but fail validation (specification was wrong). Both are necessary — a well-specified wrong product and a poorly-specified right product are both failures.
STLC is the structured testing process: Requirement Analysis → Test Planning → Test Case Design → Test Environment Setup → Test Execution → Test Closure.
1. Requirement Analysis: QA reviews requirements for testability, identifies ambiguities, creates RTM draft, and raises clarifications. Output: testability assessment, QA requirement questions. 2. Test Planning: Define scope, strategy, resources, schedule, entry/exit criteria, and risk register. Output: Test Plan document. 3. Test Case Design: Write test cases with steps, expected results, test data; design equivalence partitions, boundary values. Output: Test Case Specifications, RTM. 4. Test Environment Setup: Configure test environments, prepare test data, install builds, verify environment readiness. Output: Environment readiness report. 5. Test Execution: Execute test cases, log defects, retest fixes, run regression. Output: Test Logs, Defect Reports. 6. Test Closure: Evaluate exit criteria, analyze lessons learned, archive test artifacts, produce Test Summary Report. Output: Test Completion Report, metrics summary.
Test scenario: high-level 'what to test'. Test case: detailed 'how to test' with steps and expected results. Test script: automated code that executes a test case.
Test Scenario: a high-level description of a functional area to be tested — 'Test user login functionality.' One scenario generates multiple test cases. Used in early planning when full detail isn't available. Test Case: a detailed, step-by-step procedure including preconditions, test steps, test data, expected result, actual result, and pass/fail status. Example: Precondition: User is on login page. Step 1: Enter valid email. Step 2: Enter correct password. Step 3: Click Login. Expected Result: User is redirected to /dashboard within 2 seconds. Test cases are traceable to specific requirements. Test Script: the implementation of a test case as automated code (Selenium, Cypress, Playwright). The script programmatically executes the steps and asserts the expected result. A test case can exist without a script (manual testing); a script always has a corresponding test case as its specification.
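To make the distinction concrete, below is a minimal sketch of the login test case implemented as an automated test script, using Selenium with Python (one of the stacks named above). The staging URL and element IDs (email, password, login-button) are hypothetical placeholders; a real script would use the application's actual locators.

```python
# Sketch: the login test case above as an automated Selenium script.
# URL and element locators are hypothetical placeholders.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC


def test_valid_login_redirects_to_dashboard():
    driver = webdriver.Chrome()
    try:
        # Precondition: user is on the login page
        driver.get("https://staging.example.com/login")

        # Steps: enter valid credentials and submit
        driver.find_element(By.ID, "email").send_keys("qa.user@example.com")
        driver.find_element(By.ID, "password").send_keys("CorrectPassword123")
        driver.find_element(By.ID, "login-button").click()

        # Expected result: redirect to /dashboard within 2 seconds
        WebDriverWait(driver, timeout=2).until(EC.url_contains("/dashboard"))
    finally:
        driver.quit()
```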
Severity is the technical impact of the defect on system functionality; priority is the business urgency to fix it. High severity doesn't always mean high priority.
Severity (set by QA): measures how badly the defect affects the software. Levels: Critical (system crash, data loss, complete feature failure), High (major feature not working, no workaround), Medium (feature partially working, workaround exists), Low (cosmetic, minor inconvenience). Priority (set by Business/PM): measures how urgently the defect needs to be fixed relative to other work. Combinations: High Severity, High Priority — payment processing crash on checkout (fix immediately). High Severity, Low Priority — crash in rarely-used admin-only feature (fix next sprint, low user impact). Low Severity, High Priority — CEO's name spelled wrong on homepage (cosmetic but visible in tomorrow's demo). Low Severity, Low Priority — minor tooltip text error (backlog, fix whenever). QA sets severity objectively based on user impact; the business/PM sets priority based on release context, customer impact, and roadmap.
An RTM is a document mapping each requirement to its test cases, ensuring every requirement is tested and every test case has a requirement basis.
The RTM is a QA's accountability document — it proves that all requirements have been tested. Structure: rows are requirements (REQ-001, REQ-002...), columns include requirement description, test case IDs, execution status, and defect IDs. Forward traceability: from requirements to test cases (are all requirements covered?). Backward traceability: from test cases to requirements (does every test case have a requirement?). Bidirectional RTM: both directions — the gold standard. Uses: requirement coverage analysis (find untested requirements), impact analysis (requirement change → which tests are affected?), audit evidence (show auditors every requirement is covered), and test case maintenance (find orphaned test cases with no requirement basis). In Jira, RTM is implemented through requirement links on test issues. In TestRail, through requirement IDs on test cases. In Excel, a matrix with conditional formatting for status visibility.
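A minimal sketch of what forward and backward traceability checks look like in practice, whatever tool holds the RTM; the requirement and test case IDs below are hypothetical examples.

```python
# Illustrative forward/backward traceability checks over a tiny RTM.
requirements = {"REQ-001", "REQ-002", "REQ-003"}

test_cases = {
    "TC-101": {"requirement": "REQ-001", "status": "Passed"},
    "TC-102": {"requirement": "REQ-001", "status": "Failed"},
    "TC-103": {"requirement": "REQ-004", "status": "Passed"},  # REQ-004 is not in the baseline
}

# Forward traceability: requirements with no test cases at all
covered = {tc["requirement"] for tc in test_cases.values()}
uncovered_requirements = requirements - covered
print("Untested requirements:", uncovered_requirements)   # {'REQ-002', 'REQ-003'}

# Backward traceability: test cases whose requirement is not in the baseline (orphans)
orphaned_tests = [tc_id for tc_id, tc in test_cases.items()
                  if tc["requirement"] not in requirements]
print("Orphaned test cases:", orphaned_tests)              # ['TC-103']
```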
EP divides input data into valid/invalid partitions and tests one value per partition. BVA tests values at the boundaries of each partition where defects cluster.
Equivalence Partitioning: divides the input domain into classes of values where the software behaves identically. Only test one representative from each class. For an age field accepting 18-65: Valid partitions: {18-65}. Invalid partitions: {<18}, {>65}, {non-numeric}. Test cases: 30 (valid), 10 (too young), 80 (too old), 'abc' (non-numeric) — 4 tests instead of infinite possibilities. Boundary Value Analysis: tests values at and around partition boundaries — this is where most defects occur. For 18-65 age: test 17 (just below min), 18 (min), 19 (just above min), 64 (just below max), 65 (max), 66 (just above max). The combination of EP and BVA provides maximum defect detection efficiency — EP reduces test cases, BVA targets the most defect-prone values within each partition.
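As an illustration, the 18-65 age field above can be expressed as a parametrized pytest suite; validate_age here is a hypothetical stand-in for the system under test.

```python
# Sketch: EP representatives plus BVA boundary values for an 18-65 age field.
import pytest


def validate_age(value):
    """Hypothetical validator: accepts integers from 18 to 65 inclusive."""
    try:
        age = int(value)
    except (TypeError, ValueError):
        return False
    return 18 <= age <= 65


# Equivalence partitioning: one representative per partition
@pytest.mark.parametrize("value,expected", [
    (30, True),      # valid partition 18-65
    (10, False),     # invalid partition: below 18
    (80, False),     # invalid partition: above 65
    ("abc", False),  # invalid partition: non-numeric
])
def test_age_equivalence_partitions(value, expected):
    assert validate_age(value) is expected


# Boundary value analysis: values at and around each boundary
@pytest.mark.parametrize("value,expected", [
    (17, False), (18, True), (19, True),
    (64, True), (65, True), (66, False),
])
def test_age_boundaries(value, expected):
    assert validate_age(value) is expected
```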
A Test Plan is the master QA document that defines the approach, scope, resources, schedule, and strategy for testing a project. It typically follows the IEEE 829 standard.
Standard sections: 1. Test Plan Identifier — unique ID and version. 2. Introduction and Objectives — what this plan covers. 3. Test Scope — in-scope (what will be tested) and out-of-scope (what won't be tested). 4. Test Strategy — testing levels (unit, integration, system, UAT), types (functional, performance, security), and approaches (risk-based, exploratory). 5. Resource Plan — team roles, responsibilities, required skills. 6. Schedule — test phases, milestones, dependencies. 7. Entry and Exit Criteria — conditions to start and end each test phase. 8. Risk Register — identified risks with probability, impact, and mitigations. 9. Test Deliverables — test cases, defect reports, test summary report. 10. Requirements Traceability Matrix reference. A Test Plan is signed off by QA Lead, PM, and Engineering Lead before testing begins — making quality standards explicit and agreed before a line of code is tested.
A good bug report includes: summary, environment, preconditions, reproduction steps, actual vs expected result, severity, priority, and attachments. It must be reproducible and unambiguous.
Essential fields: (1) Summary: concise, specific — 'Login fails with error ERR-401 when using Facebook OAuth on iOS 17' not 'Login doesn't work.' (2) Environment: OS, browser/device, app version, test environment (staging/UAT). (3) Preconditions: what must be true before executing the steps. (4) Reproduction Steps: numbered, precise, atomic — one action per step. Anyone following these steps should reproduce the defect. (5) Expected Result: what SHOULD happen per the requirement/acceptance criteria. (6) Actual Result: what DID happen — the defect behavior. (7) Severity and Priority. (8) Attachments: screenshot, video recording (essential for UI defects), log files. The 3-step test for a good bug report: Can a developer reproduce it without asking you any questions? Can a new team member understand what the expected behavior should be? Is there enough information to assess risk without more context? If yes to all three, it's a good report.
Exploratory testing is simultaneous test design, execution, and learning — no pre-written test scripts. It's most effective for finding defects that scripted tests miss through creative investigation.
Exploratory testing (ET) is defined by James Bach as 'simultaneous learning, test design, and execution.' The tester uses domain knowledge, curiosity, and analytical thinking to investigate the application — deciding in real-time which areas to test and which test approaches to use. It is NOT unstructured: session-based ET uses time-boxed sessions (60-90 minutes) with a specific charter defining the target and approach. Charter format: 'Explore [target area] using [approach] to discover [risk/vulnerability/behavior].' Most effective when: testing a new feature for the first time (scripted tests only cover planned scenarios), investigating defect clusters (one bug found → explore nearby area), time is limited (ET finds more impactful defects per hour than scripted testing), and when the specification is informal or evolving. ET routinely uncovers defects that scripted testing misses (figures around 35% are often cited) — it is a professional QA technique, not an excuse to avoid writing test cases.
Regression testing verifies that code changes haven't broken existing functionality. Test selection uses risk-based, change-impact, and historical methods to choose the right test subset.
Regression testing is run after every code change to confirm existing functionality still works. Full regression (running all test cases) is only feasible with comprehensive automation — in manual testing, it's impractical for large or fast-moving systems. Test selection strategies: (1) Risk-based selection: run tests covering features most likely affected by the change and features where failure would have the highest business impact. (2) Change-impact analysis: examine what code was changed, identify which features use that code, run tests for those features and their dependencies. (3) Historical data: test cases that historically find defects most often are highest-priority regression candidates. (4) Core regression suite: a curated set of the most critical end-to-end scenarios that must always pass. In Agile, regression is typically: per-sprint targeted regression (tests for modules changed in this sprint) + automated pipeline regression (runs on every build) + manual risk-based regression (for high-risk release areas).
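A simplified sketch of how those selection strategies can be combined into a single ranking score; the fields and weights below are hypothetical and would be calibrated per team.

```python
# Illustrative risk-based regression selection: score each candidate test on
# change impact, business criticality, and historical defect finds, then run
# the highest-scoring tests first.
regression_candidates = [
    {"id": "TC-201", "touches_changed_module": True,  "historical_defects_found": 7, "business_critical": True},
    {"id": "TC-305", "touches_changed_module": False, "historical_defects_found": 1, "business_critical": True},
    {"id": "TC-412", "touches_changed_module": True,  "historical_defects_found": 0, "business_critical": False},
    {"id": "TC-520", "touches_changed_module": False, "historical_defects_found": 0, "business_critical": False},
]


def regression_score(tc):
    score = 0
    score += 5 if tc["touches_changed_module"] else 0   # change-impact analysis
    score += 3 if tc["business_critical"] else 0        # risk-based selection
    score += min(tc["historical_defects_found"], 5)     # historical defect data
    return score


for tc in sorted(regression_candidates, key=regression_score, reverse=True):
    print(tc["id"], regression_score(tc))
```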
Black-box: test without code knowledge (functionality). White-box: test with code knowledge (internals). Grey-box: partial code access, combines both approaches.
Black-box testing: tester has no knowledge of internal code, architecture, or implementation. Tests based on specifications and requirements only. Techniques: EP, BVA, decision tables, state transition, use case testing. Used at system testing and UAT levels. White-box testing: tester has full access to source code. Tests target internal logic, paths, and branches. Coverage metrics: statement coverage, branch coverage, path coverage, condition coverage. Techniques: basis path testing, control flow testing. Typically done by developers. Grey-box testing: tester has partial knowledge — knows the architecture, APIs, and data flow but not detailed implementation. Combines black-box test design with white-box understanding of risk. More effective at finding integration defects — knows which modules interact and can target those boundaries. Most experienced QA engineers practice grey-box naturally: they don't read code line by line but understand the system architecture well enough to target high-risk areas intelligently.
QA is involved in all Scrum ceremonies: reviewing stories in planning, surfacing blockers in standups, providing quality data in retrospectives, and testing features continuously rather than at sprint end.
Sprint Planning: QA reviews user stories for testability, identifies missing acceptance criteria, flags edge cases developers need to consider, and estimates testing effort. Definition of Done includes QA sign-off. Daily Standup: QA reports testing progress, defects found, and blockers immediately — not at end of sprint. Sprint Review: QA confirms acceptance criteria met, demonstrates quality status, presents defect summary. Sprint Retrospective: QA brings quality metrics (defect injection rate, first-time pass rate, sprint DDP) to enable data-driven improvement discussions. Backlog Refinement: QA reviews upcoming stories before they enter sprints — ensuring acceptance criteria are testable before planning. The key principle: QA participates BEFORE coding, not just after. Three Amigos sessions (PO + Dev + QA) for complex stories prevent 40-60% of defects by surfacing misunderstandings before code is written.
DoD is a shared team checklist of conditions every user story must meet before being considered 'done.' QA ensures quality criteria are in the DoD and enforces them before sprint sign-off.
A strong DoD includes: code written and reviewed, unit tests passing (>80% on new code), integration tests passing, QA functional testing complete with no Critical/High defects open, accessibility checked (WCAG AA for UI), performance verified within SLA, security reviewed for auth/data features, documentation updated (API docs, release notes), and Product Owner acceptance. QA's role in DoD: propose and own the quality conditions in the DoD, participate in DoD definition at sprint zero and retrospective updates, enforce DoD by not marking stories as done when quality conditions aren't met. When the team feels pressure to lower DoD standards ('let's ship it and fix later'), QA documents the decision and the risk accepted — transparent risk acceptance by the team is professional; silent quality debt accumulation is not. The DoD evolves: after retrospectives, add conditions that defend against defect patterns that emerged in recent sprints.
DDP = (Defects found by QA) / (Defects found by QA + Defects found in Production) × 100. It measures QA effectiveness. Improve via better coverage, exploratory testing, and shift-left practices.
DDP is the primary metric for QA process effectiveness — it measures how much of the defect population was caught before reaching customers. Formula: DDP = (QA_defects) / (QA_defects + Production_defects) × 100. Example: QA found 45 defects, 5 escaped to production. DDP = 45/50 × 100 = 90%. Target: >95% for commercial software, >99% for safety-critical. Improvement strategies: (1) Coverage gap analysis — use RTM to find requirements with no or insufficient test coverage. (2) Exploratory testing sessions after scripted testing — finds cases scripted tests miss. (3) Three Amigos / shift-left — catching requirement defects before coding reduces defects injected into code. (4) Module-level DDP tracking — identify which modules have consistently low DDP and invest more testing there. (5) Defect escape RCA — for every production defect, determine 'why wasn't this found in testing?' and fix the gap. Track DDP trend over time — a deteriorating DDP signals coverage gaps or velocity increasing faster than testing capacity.
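The formula and the worked example above, expressed as a small helper:

```python
# DDP = QA defects / (QA defects + production defects) * 100
def defect_detection_percentage(qa_defects, production_defects):
    total = qa_defects + production_defects
    return (qa_defects / total) * 100 if total else 0.0


print(defect_detection_percentage(45, 5))  # 90.0, below a >95% commercial target
```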
Identify all risks via brainstorming and historical analysis, score each with probability×impact, rank tests by risk score, allocate effort proportionally, and document deferred low-risk tests.
Process: 1. Risk Identification: conduct risk identification workshop with developers, architects, PM, and support (who knows production pain points). Use historical defect data — modules with high past defect density have high probability of future defects. Categorize risks: functional, non-functional, integration, data, regulatory. 2. Risk Analysis: score each risk on Probability (1-5) and Impact (1-5). Risk Score = P × I (range 1-25). 3. Risk Evaluation: High (16-25): full test coverage, earliest testing start, most experienced tester. Medium (8-15): standard coverage. Low (1-7): reduced coverage, defer to risk acceptance. 4. Test Design: design more test cases for high-risk areas — more edge cases, negative tests, boundary conditions. 5. Execution Priority: execute highest-risk tests first. If time is cut, you've already tested the most important areas. 6. Communication: share risk matrix with stakeholders — make risk acceptance explicit and documented. Update risk assessments as defects are found (finding 5 unexpected defects in Module X elevates its risk score).
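A minimal sketch of the probability times impact scoring and banding described in steps 2 and 3; the risk names and scores are hypothetical examples.

```python
# Illustrative P x I scoring with the High/Medium/Low bands described above.
risks = [
    {"name": "Payment gateway integration", "probability": 5, "impact": 5},
    {"name": "Profile photo upload",        "probability": 3, "impact": 2},
    {"name": "Footer copyright text",       "probability": 1, "impact": 1},
]


def risk_band(score):
    if score >= 16:
        return "High"    # full coverage, earliest start, most experienced tester
    if score >= 8:
        return "Medium"  # standard coverage
    return "Low"         # reduced coverage or documented risk acceptance


for risk in risks:
    score = risk["probability"] * risk["impact"]
    print(f'{risk["name"]}: score={score}, band={risk_band(score)}')
```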
Tailor metrics to audience: engineers get defect details, PMs get release readiness data, executives get DDP trends and ROI. All metrics should connect to business impact.
Engineering team (daily, sprint-level): defect count by module and severity, first-time pass rate, test execution progress %, specific defect trends, blockers, and daily quality status. Focus: 'what do we change today?' Product Manager (sprint review, release level): release readiness (pass rate, open Critical/High defects, deferred tests with documented risk), go/no-go recommendation with supporting data. Focus: 'is this safe to release?' Executive/Leadership (quarterly): DDP trend (improving?), escape rate trend, quality investment ROI (production incidents avoided × cost per incident), and quality maturity progression. Focus: 'what is the return on QA investment?' The universal principle: translate technical metrics to business impact. 'We have 3 High-severity defects in the auth module' becomes '3 defects that could block 40% of users from logging in — risk of $15K/hour in support costs based on past incidents.' Metrics with business context are acted on; metrics without context are filed.
TMMi (Test Maturity Model Integration) defines 5 levels of QA process maturity. Use it to assess current practices, identify gaps, and build an improvement roadmap toward higher levels.
TMMi levels: Level 1 (Initial/Chaotic) — ad hoc testing, no defined process, quality depends on heroes. Level 2 (Managed) — test planning, defect management, test monitoring defined at project level. Level 3 (Defined) — organization-wide standard testing process, test training program, non-functional testing practices, peer reviews. Level 4 (Measured) — statistical process control, quality measurement baselines, predictive quality forecasts. Level 5 (Optimized) — continuous improvement culture, defect prevention, quality data drives all investment decisions. Application: conduct a maturity assessment (score each process area against evidence, not self-assessment), identify gap between current state and target state, prioritize improvements by: highest impact gaps first, feasibility within team capacity, alignment with organizational goals. Build a 6-month improvement roadmap with measurable criteria: 'By Q3, test plans will be signed off before test execution begins on 100% of sprints' (measurable, specific). Most commercial teams are Level 2-3. Regulated industries require Level 3+.
QA in microservices uses contract testing, service virtualization, API-level testing, and shift-right monitoring. In DevOps, QA engineers define CI/CD quality gates and own pipeline test automation.
Microservices QA challenges: (1) Service isolation — each service can be tested independently, but end-to-end testing spans 5-10 services. (2) Contract testing (Pact, Spring Cloud Contract) — consumers define the contract (what API calls they make), providers verify they honor it. Decouples service testing. (3) Service virtualization — mock dependent services to test a service independently (WireMock, Hoverfly). (4) Consumer-driven contracts prevent integration defects. DevOps QA role: define CI/CD quality gates (which tests must pass before merge?), write and maintain automation for the pipeline, investigate flaky tests and fix them (unreliable pipeline = ignored pipeline), define observability requirements (what metrics, logs, traces should production emit?), participate in chaos engineering to validate resilience. Shift-right practices: production smoke tests post-deployment, canary releases (expose 1-5% of users to new code, monitor quality metrics, expand if healthy), synthetic monitoring (automated tests running against production every 5 minutes), and RUM (Real User Monitoring) for real-world quality visibility.
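As an illustration (not the Pact API itself), here is a minimal consumer-side contract check: the consumer records the response shape it depends on and verifies that a provider build, or a stub such as a WireMock instance, still honours it. The URL and field names are hypothetical; tools like Pact formalize and share these expectations between teams.

```python
# Sketch of a consumer-driven contract expectation verified against a provider.
import requests

CONSUMER_CONTRACT = {
    "endpoint": "http://orders-service.staging.local/api/orders/123",  # hypothetical
    "status_code": 200,
    "required_fields": {"order_id", "status", "total_amount", "currency"},
}


def test_orders_provider_honours_consumer_contract():
    response = requests.get(CONSUMER_CONTRACT["endpoint"], timeout=5)
    assert response.status_code == CONSUMER_CONTRACT["status_code"]

    body = response.json()
    missing = CONSUMER_CONTRACT["required_fields"] - body.keys()
    assert not missing, f"Provider broke the contract, missing fields: {missing}"
```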
Coverage gaps are identified via RTM analysis (requirements with no tests), defect escape RCA (production bugs not covered), and exploratory testing findings. Close gaps by adding test cases and automation.
Identification methods: (1) RTM analysis — run 'no coverage' filter to find requirements with zero test cases. These are absolute gaps — release without covering them is undocumented risk. (2) Production defect RCA — for every production defect, ask 'why wasn't this found in testing?' The answer reveals the coverage gap: missing test case, wrong test data, wrong environment, wrong device/browser. (3) Exploratory testing debrief — document what was explored and what was found. Areas not covered by any charter are potential gaps. (4) Code coverage analysis — zero-coverage paths in unit tests indicate untested code branches. (5) User complaint analysis — recurring user reports about a feature indicate QA didn't test the real-world usage patterns. Closing gaps: add new test cases to RTM, add to regression suite, automate high-frequency gaps, improve test data to cover missing scenarios, expand compatibility matrix if device/browser gaps exist. Document and formally close each gap with the corrective action taken — don't just add tests, record why the gap existed and what was done to prevent recurrence.
Immediately escalate with severity assessment, blast radius analysis, and three options: delay, scope-reduce, or release with monitoring. The business makes the decision; QA documents the risk.
Step 1 — Investigate immediately: understand the exact defect behavior, reproduction rate, affected user percentage, workaround availability. Don't escalate with incomplete information — 5 minutes of investigation prevents a panic conversation based on wrong assumptions. Step 2 — Assess blast radius: which users are affected (all, 5%, international only?)? What's the business impact (data loss, revenue impact, user-facing error)? What's the workaround? Step 3 — Escalate with three options clearly presented: Option A — delay release by [estimated fix + retest time, minimum 4 hours]. Option B — release to [sub-segment] only, exclude the affected flow. Option C — release with immediate monitoring, dedicated support resource on standby, pre-written rollback plan if impact exceeds X threshold. Step 4 — Get a decision, in writing. Step 5 — If Option C chosen, document the risk acceptance with name, date, and mitigation. Step 6 — Post-release, investigate how this defect reached release day without detection and implement the coverage fix.
Reference the requirements document for the expected behavior. If ambiguous, bring in the PM/PO to clarify and document the decision. Never argue interpretation — reference specification.
Response process: Step 1 — Check the requirement: pull up the specific requirement, user story, or acceptance criterion for the feature. Does the observed behavior match what's specified? If the spec says the user should see a success confirmation and they don't, it's a defect regardless of developer opinion. Step 2 — If the spec clearly supports your position: share the specific requirement reference in the defect comment. 'Per REQ-045 and the acceptance criterion: user sees confirmation within 60 seconds. Current behavior: no confirmation appears. This is a defect based on specification.' Step 3 — If the spec is genuinely ambiguous: don't argue. Bring in the Product Owner: 'There's a disagreement on expected behavior for this scenario. The spec doesn't address X explicitly. Can you clarify the expected behavior and update the acceptance criteria?' Step 4 — Document whatever decision is made: if the PM confirms it's 'working as designed,' update the defect to Rejected with the PM's name and date and the clarified acceptance criteria. If the spec is updated, update the test case. Professionalism: you argue with specification, not with people. Make the spec your ally, not your battlefield.
Use payment gateway sandbox environments, anonymized test data, test accounts with generated card numbers, and define specific test scenarios covering happy path, declined cards, timeouts, and 3DS authentication.
Environment: Use the payment provider's official sandbox mode (Stripe test mode, PayPal sandbox, Braintree sandbox). Never test with real production payment credentials. Most gateways provide test card numbers that simulate specific behaviors (success, decline, insufficient funds, 3DS required). Test data: use generated card numbers (Stripe: 4242 4242 4242 4242 for success, 4000 0000 0000 0002 for decline). Never use real card numbers in any environment. GDPR: test environments must not contain real customer financial data. Test scenarios (critical): (1) Successful payment (happy path), (2) Card declined — insufficient funds, (3) Card declined — fraud detection, (4) Card expired, (5) Incorrect CVV, (6) Network timeout mid-transaction, (7) Payment succeeds but order creation fails (partial failure), (8) 3D Secure authentication flow (EU/PSD2 compliance), (9) Currency conversion for international transactions, (10) Refund processing. Performance: load test payment API under expected peak — payment failures under load are catastrophic. Security: verify card data is never logged, encrypted in transit (TLS), and compliant with PCI DSS requirements.
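A sketch of the happy-path and decline scenarios as a parametrized test. The two card numbers are the publicly documented Stripe sandbox cards mentioned above; the checkout endpoint and helper are hypothetical and would only ever point at a sandbox environment.

```python
# Sketch: sandbox payment scenarios against a hypothetical staging checkout API.
import pytest
import requests

CHECKOUT_URL = "https://staging.example.com/api/checkout/pay"  # hypothetical, never production


def submit_checkout_payment(card_number, exp_month, exp_year, cvc, amount_cents):
    """Hypothetical helper: submits a sandbox payment and returns the JSON result."""
    response = requests.post(CHECKOUT_URL, json={
        "card_number": card_number,
        "exp_month": exp_month,
        "exp_year": exp_year,
        "cvc": cvc,
        "amount_cents": amount_cents,
        "currency": "eur",
    }, timeout=10)
    return response.json()


@pytest.mark.parametrize("card_number,expected_status", [
    ("4242424242424242", "succeeded"),  # Stripe sandbox card: successful charge
    ("4000000000000002", "declined"),   # Stripe sandbox card: generic decline
])
def test_checkout_payment_outcomes(card_number, expected_status):
    result = submit_checkout_payment(card_number, 12, 2030, "123", amount_cents=1999)
    assert result["status"] == expected_status
    if expected_status == "declined":
        # Partial-failure guard: a declined payment must never create an order
        assert result.get("order_id") is None
```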
Investigate DDP trend, test coverage per sprint, regression coverage, defect injection rate, and whether QA capacity is proportional to development velocity. Typically indicates a coverage or shift-left gap.
Diagnostic questions: (1) Is DDP declining? If defect detection percentage dropped from 93% to 82%, 18% of defects are now reaching production — something changed in coverage or testing rigor. (2) Is test coverage per sprint keeping up? If development velocity increased 30% but QA headcount/time remained constant, test coverage was compressed. Calculate test cases per story point over the last 5 sprints — if declining, coverage is being cut. (3) Is regression testing adequate? High velocity + compressed schedules often lead to regression shortcuts. Track what regression gets deferred each sprint — accumulating regression debt creates the production defect pattern. (4) Is the defect injection rate increasing? More code changes per sprint without proportional quality improvement = more defects injected. (5) Are new developers working on complex features? Onboarding without adequate QA support increases defect injection. Investigation output: data-backed explanation of why DDP is declining, plus 3 specific proposed remediation actions with expected impact. Present to PM and Engineering Lead with the cost-of-production-defect analysis — $X/incident × increasing incident rate = the business cost of the current trajectory.
Start with the highest-impact basics: defect tracking in Jira, a core regression checklist, and QA involvement in sprint ceremonies. Grow process complexity incrementally as the team matures.
Month 1 — Foundation: (1) Set up Jira with a QA workflow and defect severity/priority fields. (2) Define the basic bug report template — all defects require steps, expected, actual, severity. (3) Create a 15-20 case core regression checklist for the most critical user flows (login, core feature, payment). (4) Attend sprint planning and add QA tasks for each story. (5) Attend retrospectives and share defect counts. Month 2-3 — Structure: (6) Write a lightweight Test Plan for the next major feature release. (7) Create RTM for core features. (8) Implement entry/exit criteria for test phases. (9) Run Three Amigos for complex stories. (10) Begin tracking DDP. Month 4-6 — Improvement: (11) Introduce risk-based testing using a simple P×I matrix. (12) Add performance and security checkpoints. (13) Evaluate test management tool (TestRail). (14) Introduce automated smoke tests for core user journeys. The principle: don't implement full TMMi Level 3 on a 5-person team. Build the minimum viable QA process that prevents the most impactful defects from reaching customers, then grow it as the organization grows.
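As one example of the Month 4-6 automation step, a smoke suite can start as a handful of fast reachability probes of the most critical journeys; the base URL and paths below are hypothetical.

```python
# Sketch: minimal automated smoke checks for the core user journeys.
import pytest
import requests

BASE_URL = "https://staging.example.com"  # hypothetical staging environment

SMOKE_PATHS = [
    "/health",
    "/login",
    "/api/products?limit=1",
]


@pytest.mark.parametrize("path", SMOKE_PATHS)
def test_smoke_endpoint_is_reachable(path):
    response = requests.get(BASE_URL + path, timeout=5)
    assert response.status_code == 200
```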
Test functional accuracy, performance (response under 100ms), relevance ranking, edge cases (special chars, long queries, empty input), accessibility, internationalization, and privacy considerations.
Functional testing: basic autocomplete appears as user types, suggestions are relevant to query, selecting a suggestion performs the search, suggestions update in real-time, clearing input clears suggestions. Performance testing: suggestions appear within 100ms of typing (critical for UX) — test latency under various network conditions (fast 4G, slow 3G, 2G). Suggestions must not degrade typing performance on low-end devices. Edge cases: single character queries, very long queries (1000+ characters), queries with special characters (!@#$%^&), SQL-injection-like queries, HTML tags in query, empty input, queries in multiple languages simultaneously. Internationalization: test with non-Latin scripts (Arabic, Hindi, Chinese, Japanese), RTL text in suggestions, mixed language queries, character encoding. Privacy: verify autocomplete doesn't expose other users' private search history, verify personalized suggestions are based on user's own history only, verify incognito mode doesn't surface personalized suggestions. Accessibility: screen reader announces suggestions, keyboard navigation through suggestions (arrow keys), focus management. Volume testing: simulate peak search traffic and verify autocomplete latency remains <100ms under load.
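A sketch of how some of those edge-case and latency checks could be automated, assuming a hypothetical /api/suggest endpoint and the 100 ms budget discussed above.

```python
# Sketch: autocomplete edge-case and latency checks against a hypothetical endpoint.
import time

import pytest
import requests

SUGGEST_URL = "https://staging.example.com/api/suggest"  # hypothetical


@pytest.mark.parametrize("query", [
    "a",                          # single character
    "x" * 1000,                   # very long query
    "!@#$%^&",                    # special characters
    "<script>alert(1)</script>",  # HTML/script-like input
    "' OR 1=1 --",                # SQL-injection-like input
    "",                           # empty input
])
def test_suggest_handles_edge_case_queries(query):
    response = requests.get(SUGGEST_URL, params={"q": query}, timeout=5)
    # The service should respond cleanly, never with a server error
    assert response.status_code in (200, 204, 400)


def test_suggest_latency_budget():
    start = time.perf_counter()
    requests.get(SUGGEST_URL, params={"q": "lap"}, timeout=5)
    elapsed_ms = (time.perf_counter() - start) * 1000
    assert elapsed_ms < 100, f"Autocomplete took {elapsed_ms:.0f} ms, budget is 100 ms"
```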
Focus on performance under peak load (10-100× normal), inventory race conditions, payment processing under stress, failure recovery, and monitoring alerting to detect issues immediately post-launch.
Performance testing is the primary risk: (1) Load testing: simulate expected peak traffic (10×-100× normal based on past flash sales). Use JMeter/k6/Locust. Test: add-to-cart throughput, checkout initiation rate, payment processing rate. Identify bottlenecks before the event. (2) Stress testing: beyond expected peak — where does the system break? What degrades gracefully vs catastrophically? (3) Spike testing: simulate sudden traffic spikes (flash sale announcement goes viral — traffic jumps 50× in 30 seconds). Does the system survive? Does auto-scaling trigger fast enough? Race conditions: (1) Inventory overselling — at peak, two users simultaneously buy the last item. Does the system correctly allow only one? (2) Coupon code abuse — simultaneous application of limited-use codes. Payment testing at scale: verify payment gateway handles 10× normal transaction rate. What happens if payment gateway rate-limits? Failure scenarios: (1) Database failover under load, (2) Cache invalidation under high concurrency, (3) CDN fallback behavior. Monitoring: pre-define real-time dashboards for: error rate, p95/p99 latency, inventory accuracy, payment success rate. Set alerts with 2-minute thresholds — a flash sale issue undetected for 10 minutes can cause millions in lost revenue or chargebacks.
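A minimal Locust sketch of the load profile described above; the endpoints, payloads, and task weights are hypothetical and would be tuned to the real system and traffic model.

```python
# Sketch: flash-sale load profile as a Locust user class.
# Run, for example, with: locust -f flash_sale_load.py --host https://staging.example.com
from locust import HttpUser, task, between


class FlashSaleShopper(HttpUser):
    wait_time = between(1, 3)  # think time between user actions

    @task(3)
    def add_to_cart(self):
        # Hottest path during a flash sale: adding the promoted SKU to the cart
        self.client.post("/api/cart/items", json={"sku": "FLASH-001", "qty": 1})

    @task(1)
    def checkout(self):
        # Checkout initiation and payment submission under load
        self.client.post("/api/checkout", json={"payment_method": "card_on_file"})
```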
Test adaptive bitrate switching, buffer behavior under network changes, playback on device matrix, DRM protection, CDN failover, and use shift-right monitoring with real user metrics.
Adaptive streaming testing: (1) Verify bitrate switches smoothly when network conditions change (throttle network in test from high to low bandwidth and back). (2) Verify initial resolution selection is appropriate for current bandwidth. (3) Test startup time (time to first frame) across network conditions — Netflix targets <3 seconds. Device and browser matrix: test on the highest-reach device/OS combinations (Smart TVs, iOS, Android, web browsers — each has different streaming behavior). Network condition simulation: simulate various conditions — WiFi, 4G, 3G, 2G, packet loss, intermittent connectivity. Buffer and recovery: what happens when buffering starts? Does the player show correct buffering UI? Does it recover smoothly when connection improves? DRM testing: verify content can't be accessed without valid entitlement. Verify DRM license renewal happens seamlessly mid-playback. CDN failover: verify playback continues if primary CDN node fails (client should switch to backup CDN without user-visible impact). Geo-restriction: verify content availability respects regional licensing. Shift-right monitoring: instrument production with playback quality metrics (bitrate, rebuffer rate, startup failure rate) and monitor per-device, per-region. A 0.1% increase in rebuffer rate at Netflix's scale affects 200K concurrent streams — production monitoring IS quality testing.
Combine compatibility testing across hundreds of hardware configurations and enterprise software stacks, security compliance validation, accessibility testing, group policy enforcement, and long-term stability testing.
Compatibility testing: enterprise environments include thousands of hardware configurations, peripheral combinations, and legacy software stacks. Use a representative hardware matrix covering: OEM configurations (Dell, HP, Lenovo enterprise lines), age range (3-7 year old enterprise hardware), driver combinations. Verify compatibility with enterprise software: Active Directory group policies, antivirus solutions, ERP software, line-of-business applications. Security compliance: the feature must comply with enterprise security standards (NIST, CIS benchmarks), validate against GPO (Group Policy Object) enforcement, verify no new attack surface introduced. Accessibility: WCAG AA accessibility testing, screen reader compatibility (NVDA, JAWS), keyboard-only navigation. Stability and reliability: enterprise SLAs require 99.9%+ uptime. Run extended soak tests (72-hour+ continuous operation), memory leak detection (heap growth over time), performance regression (before vs after feature activation). Update scenario: verify feature behaves correctly after Windows Update is applied, feature state is preserved across reboots and updates. Rollback testing: if the feature causes issues, Group Policy disable/rollback must work cleanly.
Evaluate across 4 dimensions: technical QA knowledge (test design, defect management), process knowledge (STLC, Agile), communication and collaboration (advocate vs gatekeeper), and situational judgment (scenario questions).
Structure (90 minutes total): Part 1 — Technical QA Knowledge (25 min): ask them to write test cases for a login feature on the spot — evaluate coverage, EP/BVA usage, negative cases, accessibility, and performance considerations. Follow up: 'How would you prioritize these test cases?' and 'How would you change your approach if this was released to 10M users daily?' Part 2 — Process Knowledge (20 min): 'Walk me through how you would approach testing a new payment feature for a sprint.' Evaluate STLC knowledge, risk awareness, Agile integration, and when they involve stakeholders. Part 3 — Defect Management (15 min): present a vague bug description ('the login page is slow') and ask them to write a proper bug report. Evaluate structure, specificity, environment capture, and reproducibility. Part 4 — Scenario and Behavioral (30 min): 'A developer says your defect is not a bug — how do you respond?' 'Production defect count is rising despite good sprint velocity — what do you investigate?' These reveal communication style (advocate vs gatekeeper), analytical thinking, and professional judgment. Evaluation criteria: Do they think about users and business impact? Do they mention risk? Do they communicate quality in business terms? These distinguish senior QA thinking from junior task-execution thinking.
Ready to master QA Engineering interview questions?
Start learning with our comprehensive course and practice these questions.