Should I write these tests in Playwright or Selenium first if I'm new to automation?

If you are picking a framework today and you do not have a constraint pulling you elsewhere, write the first five in Playwright. The reason is narrow and practical: every one of these tests has a moment that races a UI repaint — the cart count incrementing, the search results coming back, the order confirmation rendering after redirect. Playwright's web-first assertions retry the locator until the assertion passes or a timeout hits, which is exactly the behavior you want for those moments. In Selenium and WebdriverIO you write the same wait yourself. It is doable, but it is the kind of code you forget to write the first time and pay for in flake.

How do I run these against my local dev server versus the live QA Shop?

Both should work. The five tests above target stable surfaces — testids, slugs, order numbers — that are identical in local dev and in production. The one variable is the base URL. Most teams parameterize this through an environment variable; we use `RECIPE_BASE_URL` in our own examples. Set it to `http://localhost:3002` for the local Next.js dev server (note the port — QA Shop runs on 3002, not 3000) or to the production origin for a smoke run against the live site.
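One way to wire this up is a small resolver that reads the variable once and falls back to the local port. This is a sketch under assumptions, not the project's own helper: the function name `resolveBaseUrl` and the fall-back-to-localhost behavior are ours; only `RECIPE_BASE_URL` and port 3002 come from the answer above.

```typescript
// Hypothetical helper: resolve the origin the suite should run against.
const DEFAULT_BASE_URL = "http://localhost:3002"; // local Next.js dev server port named above

function resolveBaseUrl(env: Record<string, string | undefined>): string {
  const raw = env.RECIPE_BASE_URL?.trim();
  if (!raw) {
    // Unset or blank: assume a local run.
    return DEFAULT_BASE_URL;
  }
  // new URL() throws on a malformed value, so a typo fails loudly at startup
  // instead of surfacing later as a confusing navigation error.
  return new URL(raw).origin;
}
```

In a Playwright project the result could feed the `use.baseURL` config option, for example `use: { baseURL: resolveBaseUrl(process.env) }`, so specs can navigate with relative paths such as `page.goto("/")`.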

Can I parallelize these five tests safely against the same shared database?

Yes, with one caveat. The tests share no state across sessions: the cart is per-session, the orders are scoped per account, and the assertions all fence on identifiers your own test produced (order numbers from your own checkout, cart counts from your own session). Test five does reuse the order number captured in test four, but that hand-off stays inside one worker's sequence. Two Playwright workers can each run the full sequence at the same time without collision. The caveat is the `/orders` assertion: if you reuse the same seeded user across workers, both workers will land orders against that user, and any assertion that counts total orders (rather than checking for one specific order number) will be racy.
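If you would rather remove the shared-user risk entirely, one option is to give each worker its own seeded account. The sketch below assumes a pool of seeded customers exists; the email addresses and the function name `seededUserForWorker` are hypothetical, not part of QA Shop's seed data.

```typescript
// Hypothetical pool of seeded customers, one per parallel worker slot.
const SEEDED_USERS = [
  "customer-0@example.test",
  "customer-1@example.test",
] as const;

function seededUserForWorker(parallelIndex: number, pool: readonly string[]): string {
  if (!Number.isInteger(parallelIndex) || parallelIndex < 0) {
    throw new Error(`invalid parallel index: ${parallelIndex}`);
  }
  if (parallelIndex >= pool.length) {
    // More workers than accounts would force two workers onto one user,
    // which is exactly the racy setup this helper exists to prevent.
    throw new Error(`no seeded user for worker slot ${parallelIndex}; pool has ${pool.length}`);
  }
  return pool[parallelIndex];
}
```

Inside a Playwright test the index can come from `testInfo.parallelIndex`, which stays between 0 and the worker count minus one: `seededUserForWorker(testInfo.parallelIndex, SEEDED_USERS)`.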

What about visual regression testing? Should that be in the first five?

No. Visual regression is a different category of test with a different cost profile. The first five tests above all answer a behavioral question — does the cart update, does checkout complete, does the order appear. Visual regression answers a perception question — does this pixel match the baseline. Both are useful, but visual regression is much more sensitive to noise (font rendering, GPU, viewport size) and is best added after the behavioral suite is green. Once your five tests pass on every push, layer Playwright's `toHaveScreenshot()` or a dedicated tool on top.

How do I keep these tests passing across reseeds and database resets?

Reference stable identifiers, never UUIDs. Product slugs like `adjustable-dumbbell-set` survive every reseed because we control them in the seed data. Order numbers are safe within a run because your own test generated them. The two anti-patterns to avoid are: (a) copying a UUID out of the DOM and hardcoding it in an assertion (a UUID is fine if you read it and use it within the same run, but it is useless across reseeds); and (b) asserting against the order count on the orders list, which drifts as fixtures move. The selector contract on `/help/features` covers this in detail.
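A small guard can enforce the slug-not-UUID rule at the point where a test builds its selector. This is an illustrative sketch: the helper name `productCardTestId` and the exact slug shape (lowercase words joined by hyphens) are assumptions; the `product-card-<slug>` testid format is the one described on this page.

```typescript
// Matches the canonical 8-4-4-4-12 hex layout of a UUID.
const UUID_PATTERN = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;
// Assumed slug shape: lowercase alphanumeric words separated by single hyphens.
const SLUG_PATTERN = /^[a-z0-9]+(?:-[a-z0-9]+)*$/;

function productCardTestId(slug: string): string {
  if (UUID_PATTERN.test(slug)) {
    // A UUID also satisfies the slug shape, so it has to be rejected first.
    throw new Error(`"${slug}" looks like a UUID; use a seeded slug instead`);
  }
  if (!SLUG_PATTERN.test(slug)) {
    throw new Error(`"${slug}" is not a valid product slug`);
  }
  return `product-card-${slug}`;
}
```

A spec would then write `page.getByTestId(productCardTestId("adjustable-dumbbell-set"))` and fail at authoring time if someone pastes in a database id.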

Your first automation suite — five tests to write, in order

There is a moment — usually about a week into a new automation project — where the team has the framework installed, CI running on every PR, and a single example test that someone wrote during the spike. The question that follows is the one this page answers: which test do we write next, and the one after that.

That ordering matters more than people give it credit for. A suite that starts with a smoke test and grows outward in priority order survives the day someone breaks the deploy, because the smoke test catches the break before the longer specs even start. A suite that starts with a five-minute checkout spec and only gets a smoke test added later goes red on every flaky run for the first month, the team starts ignoring red, and by the time the smoke test arrives nobody trusts the suite. The order is the test.

This page is opinionated. We will name five tests, explain why each one goes where it goes, list the testids you will target, suggest the assertions, and give a rough wall-clock runtime. Then five more for when the first five pass. Then a flake-pattern section, because the first five will flake on first authorship if you are not deliberate about it.

§ 01 · WHY ORDER

Why test order matters

The argument for incremental coverage runs like this. Every test you write is a tradeoff between coverage and time. A smoke test costs five seconds of CI per run and catches the class of bugs where the deploy itself is broken — a build failure that shipped, a missing environment variable, a route that returns 500 because a migration did not land. A checkout spec costs a minute of CI per run and catches the class of bugs where checkout is broken. The latter is more interesting, but the former runs first, fails first, and tells you exactly which class of break you have. If smoke is green and checkout is red, you know it is the checkout flow. If smoke is red, the checkout result is not even meaningful.

The argument against "I'll just script the whole flow" is the same argument inverted. A single end-to-end spec that walks the entire shopper journey is tempting because it gets you to "we have automated tests" in one PR. It is also the spec that will fail for the most reasons — any of seven sequential moves can break, and you have to read the failure backward to figure out which one. Smaller specs in priority order each tell you something specific when they go red.
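If you are on Playwright, the "smoke runs first, fails first" argument can be encoded in the runner configuration rather than left to convention. The sketch below uses Playwright's project `dependencies` option; the directory `./tests` and the file name `smoke.spec.ts` are assumptions about your layout, not something this page prescribes.

```typescript
import { defineConfig } from "@playwright/test";

export default defineConfig({
  testDir: "./tests", // assumed location of the spec files
  projects: [
    {
      // Runs first: only the smoke spec.
      name: "smoke",
      testMatch: /smoke\.spec\.ts$/,
    },
    {
      // Everything else waits for the smoke project to pass.
      // If smoke fails, Playwright does not run this project.
      name: "flows",
      testIgnore: /smoke\.spec\.ts$/,
      dependencies: ["smoke"],
    },
  ],
});
```

With this shape a red smoke run leaves the longer specs unexecuted, so the report shows one clear failure instead of a wall of downstream noise.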

§ 02 · FIRST FIVE

The first five tests, in order

Five tests. Each one is a self-contained spec, each one builds on the framework setup we covered in getting-started, each one references the selector contract from features. About 300 words on each.

1. Smoke: the homepage renders

What it does. Navigate to / and assert the page actually rendered without error.

Why it goes first. This is the cheapest test in the suite and it catches the highest-blast-radius failures. A broken deploy where the homepage 500s is the kind of failure that breaks every test downstream of it; if you only run one spec on every push, this is the one. It also runs in under five seconds, which means you can run it on every PR without anyone complaining about CI time.

The testids you'll target. data-testid="hero-section" for the top-of-page hero, data-testid="hero-title" for the headline copy, data-testid="featured-products-grid" for the featured-product rail below the fold, and data-testid="cart-count" in the header.

The assertions. Three. The hero section is visible. The featured-products grid contains at least one card (assert by counting children of featured-products-grid, expect ≥ 1). The cart count reads 0 for an unauthenticated fresh context. That is enough — three assertions that together prove the page rendered with data, not just an empty shell.

Expected wall-clock runtime. Three to five seconds locally, five to ten in CI on a cold cache. Keep it that way; if it grows past fifteen seconds you have crept into territory that belongs in a different spec.

What's NOT in this test. No interactions, no navigation, no assertions about specific copy. The smoke test is for "did the page render," not "is the marketing right." Copy assertions belong in a separate spec where they can be updated without touching the smoke layer.
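As a rough Playwright sketch of the three assertions above: the testids are the ones listed for this test, while the test title and the reliance on a configured `baseURL` (so that `"/"` resolves) are assumptions.

```typescript
import { test, expect } from "@playwright/test";

test("smoke: homepage renders with data", async ({ page }) => {
  // Assumes `baseURL` is set in the Playwright config so "/" resolves.
  await page.goto("/");

  // 1. The hero rendered.
  await expect(page.getByTestId("hero-section")).toBeVisible();

  // 2. The featured grid has at least one direct child (card).
  //    `not.toHaveCount(0)` retries until a child shows up or the timeout hits.
  const cards = page.getByTestId("featured-products-grid").locator(":scope > *");
  await expect(cards).not.toHaveCount(0);

  // 3. A fresh, unauthenticated context starts with an empty cart.
  await expect(page.getByTestId("cart-count")).toHaveText("0");
});
```

Note what is absent: no clicks, no navigation, no copy checks. That keeps the spec inside the runtime budget described above.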

2. PLP search returns results

What it does. Navigate to /products, drive the search input through a URL parameter, and assert that the result grid contains products.

Why it goes second. The product listing page is the second-most-trafficked surface on the site and the second most likely to break in a deploy. Search regressions are also the kind of bug that ships silently — the page renders fine, but the query returns nothing, and customers think the catalog is empty. A test that drives the search and counts results catches both broad and narrow regressions in one shot.

The testids you'll target. data-testid="page-products" for the outer wrapper, data-testid="search-input" for the header search field (or, more reliably, drive the URL param ?search=hub directly), data-testid="plp-grid" for the product grid, and data-testid="plp-count" for the result-count text. Each card inside the grid uses data-testid="product-card-<slug>" — a useful target if you want to assert against a specific known seeded product, e.g. data-testid="product-card-adjustable-dumbbell-set".

The assertions. Two. The result grid is visible. The card count is at least one (read it from plp-count text or count children of plp-grid). If you are feeling thorough, assert that a known seeded slug appears for a known query — this catches search-relevance regressions that a count-only assertion misses.

Expected wall-clock runtime. Eight to twelve seconds. The PLP renders server-side, so most of the wall time is the navigation itself.

What's NOT in this test. Sort, filter, and pagination. Those go in second-tier tests below. Keep the search test about search.
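If you read the count from the plp-count text rather than counting grid children, you need to turn display text into a number. The helper below is a sketch that assumes the first integer in the text is the result count (for example "12 products"); the real copy format of plp-count is not specified on this page, so adjust the parsing to match what the element actually renders.

```typescript
// Hypothetical parser for result-count text such as "12 products" or "1,204 results".
function parseResultCount(text: string): number {
  // Drop thousands separators, then take the first run of digits.
  const match = text.replace(/,/g, "").match(/\d+/);
  if (!match) {
    throw new Error(`no number found in result-count text: "${text}"`);
  }
  return Number.parseInt(match[0], 10);
}
```

In a spec this pairs naturally with a retrying assertion, for instance `await expect.poll(async () => parseResultCount(await page.getByTestId("plp-count").innerText())).toBeGreaterThan(0)`.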

3. PDP add-to-cart updates the header count

What it does. Land on a known product detail page, click the add-to-cart CTA, and assert the cart-count badge in the header increments.

Why it goes third. This is the first test that exercises a state mutation. Up to here, everything has been pure rendering. Add-to-cart introduces a client-side state update — the cart context — and a side-effect that has to propagate to the header. Either of those can break independently of the page render, and both have. This is also the test that proves the cart context provider is mounted and listening; without that, the next two tests in this list would have nowhere to write.

The testids you'll target. Pick a stable PDP. The seeded adjustable-dumbbell-set is a useful choice — in stock, non-trivial price, and not used as a primary anchor in getting-started or how-to-use. Land on /products/adjustable-dumbbell-set. Target data-testid="page-pdp" for the outer wrapper, data-testid="pdp-title" for the product title (handy for assertion), data-testid="pdp-add-to-cart" for the CTA, and data-testid="cart-count" for the header badge.

The assertions. Two. The cart-count text was 0 before the click. The cart-count text is 1 after the click. Use a web-first retry assertion here — Playwright's expect(locator).toHaveText("1") will retry until the badge updates rather than reading once and racing the repaint.¹

Expected wall-clock runtime. Twelve to eighteen seconds.

What's NOT in this test. Cart page navigation, totals, line removal. Those happen in the next test. The add-to-cart spec asserts the side-effect, not the cart contents. Keep the boundary clean — when this fails, you know it is the add-to-cart event, not the cart page render.
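The before-and-after assertion pair might look like this in Playwright. The route and testids are the ones named for this test; the test title and the configured `baseURL` are assumptions.

```typescript
import { test, expect } from "@playwright/test";

test("PDP add-to-cart increments the header cart count", async ({ page }) => {
  // Assumes `baseURL` is set in the Playwright config.
  await page.goto("/products/adjustable-dumbbell-set");

  const cartCount = page.getByTestId("cart-count");

  // Precondition: a fresh context starts with an empty cart.
  await expect(cartCount).toHaveText("0");

  await page.getByTestId("pdp-add-to-cart").click();

  // Web-first assertion: retries until the badge reads "1" or the timeout hits,
  // so it does not race the repaint that follows the cart state update.
  await expect(cartCount).toHaveText("1");
});
```

The spec stops at the badge on purpose. Opening the cart page here would blur the boundary with test four.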

4. Authenticated checkout with the success card

What it does. Sign in as a seeded customer, add a product to the cart, walk checkout, submit with the test card 4242 4242 4242 4242, and assert the order-confirmation page renders with an order number.

Why it goes fourth. This is the value flow. A site that sells anything earns its keep on this path, and the regressions that break checkout are the regressions that lose revenue before the on-call engineer reads the alert. Once smoke, search, and add-to-cart pass, you have evidence the surface is up; the checkout spec is what proves the value flow itself works end-to-end. We put it fourth, not first, because it depends on every test before it. If add-to-cart is broken, the checkout spec cannot run; running it anyway just gives you a noisier failure signal.

The testids you'll target. Sign-in via the /auth/login page — note that auth forms use accessibility selectors rather than testids by design, so use Playwright's getByLabel("Email") and getByRole("button", { name: /log in/i }) here, per the contract documented in features. Then on /checkout, target data-testid="checkout-step-content" for the wizard container, data-testid="checkout-summary" for the totals panel, the submit button on each step, and data-testid="checkout-error" as a negative assertion. After submission, the page redirects to /orders/[orderNumber] — assert against data-testid="order-detail-number" and capture the value.

The assertions. Three. The redirect lands on an /orders/<something> URL (use await page.waitForURL(/\/orders\//)). The order-detail-number is visible. The checkout-error testid never becomes visible during the flow. Playwright's expect(locator).not.toBeVisible() only checks the state at the moment it runs, so place it after each step's submit rather than once at the end; that way a decline path fails at the step that took it.

Expected wall-clock runtime. Forty-five to ninety seconds, depending on how chatty your shipping and payment forms are.

What's NOT in this test. Promo codes, multi-line carts, declined cards. Those are the second-tier tests in the next section.
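Besides reading order-detail-number from the DOM, you can cross-check it against the redirect URL, since the route is /orders/[orderNumber]. The helper below is a sketch; the function name and the sample order number format in the usage note are hypothetical.

```typescript
// Extract the order number from an order-detail URL of the form /orders/<orderNumber>.
function orderNumberFromUrl(url: string): string {
  const { pathname } = new URL(url);
  // Exactly one path segment after /orders, with an optional trailing slash.
  const match = pathname.match(/^\/orders\/([^/]+)\/?$/);
  if (!match) {
    throw new Error(`not an order detail URL: ${url}`);
  }
  return decodeURIComponent(match[1]);
}
```

After `page.waitForURL(/\/orders\//)` resolves, `orderNumberFromUrl(page.url())` should equal the text of order-detail-number. For a URL like `http://localhost:3002/orders/ORD-1001` it returns `ORD-1001`, and it throws on the bare `/orders` list URL.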

5. Order history visible after order placed

What it does. Continuing in the same authenticated session as test four, navigate to /orders and assert that the just-placed order appears in the list.

Why it goes fifth. The orders list is the read-after-write proof for checkout. Test four proves the order was created and the confirmation page rendered. Test five proves the order persists, is queryable, and is filterable by the same identifier you captured. Together they exercise the full lifecycle of an order from a customer's point of view. Placing them adjacent in your suite (test four sets up the order, test five fences on it) is the single biggest win for read-after-write coverage.

The testids you'll target. Reuse the order number you captured from test four. Navigate to /orders. Target data-testid="page-orders" for the outer wrapper. Each order card carries data-testid="order-card-number" (the visible order number), and you can fence on it directly with a locator like page.getByTestId("order-card-number").filter({ hasText: capturedNumber }).

The assertions. Two. The orders page wrapper is visible. Exactly one order card has order-card-number text matching the captured order number. Use .filter({ hasText: ... }) rather than reading all order numbers and matching in JavaScript — the filtered locator retries on the DOM, which is exactly what you want here in case the orders list takes a beat to populate after redirect.

Expected wall-clock runtime. Eight to fifteen seconds on top of test four. If you author them as one spec with shared setup, the combined wall-clock is a minute or two; if you split them, plan on each handling its own sign-in.

What's NOT in this test. Status transitions, admin-side actions, invoices. Status round-trip is a great test for later, but it is a multi-account spec and belongs after the basics are green.
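One detail worth guarding against in the "exactly one card" assertion: when `hasText` is given a string, Playwright matches it as a case-insensitive substring, so a captured number could also match a longer order number that contains it. Passing an anchored regular expression avoids that. The helper name `exactTextPattern` and the sample order numbers are hypothetical.

```typescript
// Build a regex that matches the given text exactly, allowing surrounding whitespace.
function exactTextPattern(text: string): RegExp {
  // Escape regex metacharacters so the order number is treated literally.
  const escaped = text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(`^\\s*${escaped}\\s*$`);
}
```

Usage in the spec would be `await expect(page.getByTestId("order-card-number").filter({ hasText: exactTextPattern(capturedNumber) })).toHaveCount(1)`, which keeps the retrying locator while making the match exact.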

§ 03 · NEXT FIVE

The next five tests, when the first five pass

Once the first five run green on every push, write these next. Each gets a name, a why, and the two or three testids you will care about — that is enough scaffolding to script from.

6. PLP filter narrows results

Drive the filter via URL params (/products?category=electronics) or click data-testid="filter-group-category" and assert that plp-count decreases. Two assertions: count is lower than the unfiltered baseline, and a known out-of-category slug is no longer in the grid.

7. PLP sort changes order

Click data-testid="product-sort" and pick a known option. Capture the first product slug before and after; assert they differ. Sort tests are sneaky-flaky if the catalog has products with identical prices — pick a sort field with no ties.
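Before committing to a sort field, it can help to check the seeded values for ties. The function below is a generic sketch; the name `findTies` and the sample prices in the usage note are hypothetical, not QA Shop seed data.

```typescript
// Return each value that appears more than once, in first-repeat order.
function findTies<T>(values: readonly T[]): T[] {
  const seen = new Set<T>();
  const ties = new Set<T>();
  for (const value of values) {
    if (seen.has(value)) {
      ties.add(value);
    } else {
      seen.add(value);
    }
  }
  return Array.from(ties);
}
```

If `findTies` over the prices on the page returns anything, the relative order of the tied products is not guaranteed, and a "first slug changed" assertion on that field can flip between runs. For example, `findTies([1999, 2499, 1999, 500])` reports `1999` as tied.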

8. Out-of-stock disables add-to-cart

Use the seeded mechanical-keyboard-rgb slug, which ships with stock=0. Land on the PDP and assert the add-to-cart CTA is either not present or disabled. Catches the regression where a stock-zero product becomes purchasable after a buy-box refactor.

9. Declined card branches checkout

Failure-mode mirror of test four. Use test card 4000 0000 0000 0002, walk the same checkout flow, and assert data-testid="checkout-error" becomes visible. The success path and the failure path both deserve coverage; the success path runs first because more of the suite depends on it.

10. Cart line quantity update

From a populated cart, increment a line quantity, assert the line total reflects the change, and assert data-testid="cart-summary-total" recomputes. Catches the off-by-one where the line updates but the summary does not.
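Comparing totals is easier in integer cents than in display strings or floats. The parser below is a sketch that assumes amounts render like "$1,299.99"; the currency format and the helper name `parseMoneyToCents` are assumptions, so adapt them to what the cart actually displays.

```typescript
// Parse a displayed amount such as "$1,299.99" into integer cents.
function parseMoneyToCents(text: string): number {
  // Remove thousands separators, then capture whole units and optional decimals.
  const match = text.replace(/,/g, "").match(/(\d+)(?:\.(\d{1,2}))?/);
  if (!match) {
    throw new Error(`no amount found in "${text}"`);
  }
  const units = Number.parseInt(match[1], 10);
  // "5.5" means 50 cents, so right-pad a single decimal digit.
  const cents = match[2] ? Number.parseInt(match[2].padEnd(2, "0"), 10) : 0;
  return units * 100 + cents;
}
```

With that in place the line check becomes integer arithmetic: after setting the quantity to 2, the parsed line total should equal twice the parsed unit price, and the parsed cart-summary-total should differ from the value read before the update.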

§ 04 · FLAKE PATTERNS

Flake patterns and how to dodge them

Three flake sources you will hit on QA Shop within the first five tests, and the targeted dodge for each.

Race between cart-state update and DOM repaint

In test three, the cart-count badge updates as a side effect of a React state change. Read it once and you might catch the pre-update value; read it after a fixed await page.waitForTimeout(500) and you have introduced a sleep that will flake on a slow CI runner. The right answer is a web-first retry assertion — await expect(page.getByTestId("cart-count")).toHaveText("1") retries the locator and the assertion together until either the text matches or a timeout hits.¹
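To make the difference between a fixed sleep and a retrying assertion concrete, here is a framework-free sketch of the polling loop that a web-first assertion saves you from writing. It is an illustration of the idea only, not Playwright's implementation; the name `retryUntil` and the default timings are assumptions.

```typescript
// Poll `read` until `matches` accepts the value or the deadline passes.
async function retryUntil<T>(
  read: () => T | Promise<T>,
  matches: (value: T) => boolean,
  { timeoutMs = 5000, intervalMs = 100 }: { timeoutMs?: number; intervalMs?: number } = {},
): Promise<T> {
  const deadline = Date.now() + timeoutMs;
  let last = await read();
  while (!matches(last)) {
    if (Date.now() >= deadline) {
      throw new Error(`condition not met within ${timeoutMs}ms; last value: ${String(last)}`);
    }
    // Short pause between reads; the loop ends as soon as the value is right,
    // unlike a fixed sleep that always waits the full duration.
    await new Promise<void>((resolve) => setTimeout(resolve, intervalMs));
    last = await read();
  }
  return last;
}
```

A fast runner exits the loop on the first or second read and a slow runner gets the full timeout, which is why this shape is stable where `waitForTimeout(500)` is not.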

Auth cookie set after redirect

Test four logs in by clicking the submit button, which posts the credentials and, on success, redirects to the next page. There is a window — usually under 100ms — where the form has been submitted but the redirect has not landed and the session cookie has not yet been attached to the test's browser context. The dodge is await page.waitForURL(/\/(account|orders|checkout)/) before the next assertion — a deterministic wait on the URL transition, no sleep, no retry loop.
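One caveat with a bare regex: `waitForURL` tests it against the whole URL, so a login URL that carries its destination in the query string could match before the redirect happens. `waitForURL` also accepts a predicate that receives a `URL` object, which lets you test the path alone. The sketch below reuses the three paths from the regex above; the helper name and the `?next=` parameter in the usage note are hypothetical.

```typescript
// Paths the app may land on after a successful login (taken from the regex above).
const POST_LOGIN_PATH = /^\/(account|orders|checkout)(\/|$)/;

// Predicate suitable for page.waitForURL: inspects only the pathname.
function isPostLoginUrl(url: URL): boolean {
  return POST_LOGIN_PATH.test(url.pathname);
}
```

In the spec this becomes `await page.waitForURL(isPostLoginUrl)`. A URL such as `/auth/login?next=/checkout` does not satisfy the predicate, while `/checkout` and `/orders/<orderNumber>` do.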

Search debounce timing

If your test types into the search input character-by-character (rather than driving the URL parameter directly), there is a debounce between the last keystroke and the request firing. Asserting on the result count immediately after the last keystroke races the debounce. The targeted dodge is to fence on a stable post-debounce signal — assert that plp-count text changed from the pre-search value to a new value, or drive the search via the URL parameter and skip the input entirely.

A lighter dodge applicable to all three: prefer the framework's web-first assertions over manual waits. Every flake pattern above has a "the test ran the assertion before the page was ready" shape, and a retry-until-true assertion solves it. Playwright and Cypress both build one into their assertion APIs; Cypress documents the strategy explicitly under retry-ability.² Where the framework does not retry for you, as with Selenium, wrap the assertion in an explicit wait.

§ 05 · THE CONTRACT

The testid contract and where the canonical list lives

A short anchor for the testid surface, because the five tests above target a slice of it. Every interactive element in QA Shop carries a data-testid, with one deliberate exception: auth surfaces use accessibility-first selectors instead. The taxonomy is feature-prefixed: page-* for outer page wrappers, pdp-* for product detail page elements, cart-* for the cart surface, checkout-* for the checkout wizard, order-* for the orders list and detail, admin-* for the admin panel, and qa-* for the developer dock that exposes session state at /test-tools. The full canonical list lives in features, where each surface is walked in detail and the slug-and-order-number identifier policy is documented.

The five tests above plus the five second-tier ones cover most of what a portfolio automation project ever needs to demonstrate. If you want to go deeper, the how-to-use page walks the same flow with a manual-tester eye and includes the admin-side status round-trip, the most under-tested seam in commerce automation. Framework-specific runnable spec files for the five tests above will live under /recipes once published; longer-form practitioner guidance lives under /learn.

Footnotes

¹ Playwright Test web-first assertions: https://playwright.dev/docs/test-assertions (verified 2026-04-28, Playwright 1.58)
² Cypress Retry-ability: https://docs.cypress.io/app/core-concepts/retry-ability (verified 2026-04-28, Cypress 13.6)

Frequently asked questions

Last verified: April 28, 2026
