Stop Writing API Tests Manually — Let Your OpenAPI Spec Do the Work

Schemathesis automates API testing by generating test cases directly from your OpenAPI spec — covering edge cases, schema violations, and server errors you'd never think of or have the time to write manually.

You've written the OpenAPI spec. You've documented every endpoint, every request body, every response schema. And then you open your test file and start writing the same information all over again — this time as test cases. Endpoint by endpoint. Parameter by parameter. It's tedious, it's repetitive, and the coverage is always incomplete because there are only so many edge cases a human will think to test manually.

At this point you might reach for Postman, REST Assured, or Karate — solid tools, but ones that don't change the fundamental problem. You're still the one writing every test case. Or maybe you think to hand it off to AI: ask ChatGPT or Copilot to generate the test cases for you. That helps with the tedium, but it introduces a different problem. AI generates tests based on what it thinks your API should do — not what your spec actually says it does. The output needs reviewing, it can hallucinate edge cases that don't apply to your schema, and when your spec changes, the AI-generated tests don't automatically follow.

In both approaches, the spec you already have sits unused as a test asset. There's a better way, and most QA engineers haven't heard of it yet.

Recently a team transitioning to a more AI-forward approach was working with a largely untested API that did have a Swagger/OpenAPI spec. Their plan was to have Claude generate the roughly 1,700 API test cases needed to cover it. I was asked to weigh in and didn't like where that was headed. It would have meant:

  • A major maintenance burden over time — AI-generated tests aren't tied to the spec, so every API change requires regenerating or manually updating them
  • Unnecessary overhead at that scale — Playwright is optimized for browser automation; running 1,700 pure API tests through it adds tooling overhead that purpose-built API testing tools don't have
  • Defaults to a happy-path bias — without explicit prompting to think adversarially, AI-generated tests tend to confirm what the spec says should work rather than probe where it breaks, creating false confidence in coverage
  • A wasteful use of AI for a problem that a purpose-built tool could solve more reliably

I suspected there had to be open-source tooling that could generate test cases directly from the OpenAPI spec. After researching the options — Dredd, Portman, CATS, and others — Schemathesis came out the clear winner. Here's why.

Your OpenAPI Spec Is Already a Test Asset

The OpenAPI specification describes everything a testing tool needs: the endpoints, the HTTP methods, the request parameters, the request body schemas, the expected response codes. It's a complete contract — and Schemathesis treats it as one.

Rather than running fixed, hand-authored test cases, Schemathesis reads your spec and generates a large variety of inputs automatically, sends them to your running API, and verifies that every response conforms to what the spec says it should be. Coverage that would take days to write manually runs in minutes, and it updates automatically as your spec evolves.
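To make that concrete, here's a toy sketch (not Schemathesis internals, just an illustration of the idea) of how a single schema constraint can be expanded into boundary and violation inputs:

```python
def boundary_cases(schema):
    """Derive boundary and just-out-of-range inputs from a simple
    integer JSON-schema fragment. A toy illustration of the concept,
    not how Schemathesis actually generates data."""
    lo, hi = schema["minimum"], schema["maximum"]
    valid = [lo, lo + 1, hi - 1, hi]   # edge values the API must accept
    invalid = [lo - 1, hi + 1]         # edge values the API must reject
    return valid, invalid

valid, invalid = boundary_cases({"type": "integer", "minimum": 1, "maximum": 100})
print(valid)    # [1, 2, 99, 100]
print(invalid)  # [0, 101]
```

Schemathesis applies this kind of expansion across every parameter and body schema in the spec, then goes further with randomized fuzzing on top.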

I feel it's the most capable open-source tool in this space, and genuinely underused relative to how good it is. Most QA engineers I polled hadn't heard of it — which is exactly why it's worth understanding before the rest of the industry catches up.

How Spec-Driven API Testing Tools Compare

Here's how the API tools I evaluated rank across the criteria that mattered the most to me for this use case.

| Tool | Approach | Generates edge cases | Open source | Actively maintained |
|---|---|---|---|---|
| Postman / Newman | Manual test authoring | ❌ No | ⚠️ Partially | ✅ Yes |
| REST Assured | Manual test authoring (Java) | ❌ No | ✅ Yes | ✅ Yes |
| Karate | Manual test authoring (DSL) | ❌ No | ✅ Yes | ✅ Yes |
| Dredd | Spec-driven contract testing | ⚠️ Limited | ✅ Yes | 💀 Largely abandoned |
| Portman | OpenAPI → Postman collection | ❌ No | ✅ Yes | ✅ Yes |
| CATS | Spec-driven fuzzing | ✅ Yes | ✅ Yes | ⚠️ Limited |
| 42Crunch | Spec-driven security testing | ✅ Yes | ❌ No | ✅ Yes |
| Schemathesis | Property-based testing from spec | ✅ Yes | ✅ Yes | ✅ Yes |

What Schemathesis Does Differently

Schemathesis is an open-source API testing tool that performs property-based testing against your OpenAPI (or GraphQL) spec. Rather than running a fixed set of hand-written test cases, it uses your spec as a contract to generate use and misuse cases using techniques like input fuzzing.

Types of testing Schemathesis performs:

  • Boundary validation
  • Property input fuzzing (use and misuse cases / error handling)
  • Resource state changes

Defect categories Schemathesis catches:

  • Schema-related
    • API responses not matching schema definition
    • Status codes not documented in the spec
    • Missing headers
    • Wrong content type returned
  • Implementation bugs
    • Unhandled exceptions/crashes (5xx level errors)
    • Rejection of valid inputs according to the spec
    • Header issues
    • Authentication bypasses
  • Stateful tests
    • Getting deleted resource e.g. POST → DELETE → GET (deleted item)
    • Getting created resource e.g. POST → GET (created item)

Some of these classes of bugs are easy to miss with manual test writing: unhandled parameter combinations, missing input validation, responses that don't match the documented schema, and server errors triggered by unexpected but technically valid inputs. It's hard to think of every input combination yourself, which is where a tool like this covers far more ground with less effort and time.
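The use-after-free pattern from the stateful list above is simple to express. Here's a toy in-memory sketch of the property being checked (the real tool chains actual HTTP calls against your API):

```python
# A toy in-memory API to illustrate the POST → DELETE → GET property.
# This simulates the check; Schemathesis performs it over real HTTP.
class FakeApi:
    def __init__(self):
        self._items, self._next = {}, 1

    def post(self, body):
        # Create a resource and return its id
        self._items[self._next] = body
        self._next += 1
        return self._next - 1

    def delete(self, item_id):
        self._items.pop(item_id, None)

    def get(self, item_id):
        # Return (status_code, body)
        if item_id in self._items:
            return 200, self._items[item_id]
        return 404, None

api = FakeApi()
item_id = api.post({"name": "widget"})   # POST — create
api.delete(item_id)                      # DELETE — remove
status, body = api.get(item_id)          # GET — must now 404
assert status == 404, "use after free: deleted item still retrievable"
print("use-after-free check passed")
```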

Getting It Running

Schemathesis can be run as a CLI tool or as a Python library. To evaluate it, we'll start with the Petstore API, a popular OpenAPI testing playground. Then we'll move to RESTful Booker for a more involved example. RESTful Booker has:

  • Realistic CRUD endpoints (create, update, delete)
  • Authentication
  • Intentionally buggy behavior

I list more practice API endpoints in The Best Websites for Practicing Test Automation if you want to try others afterward.

The recommended way to run Schemathesis is via uv, a fast Python package manager. If you don't have it installed, grab it first (on macOS/Linux: `curl -LsSf https://astral.sh/uv/install.sh | sh`; see the uv docs for Windows). It's a one-liner and handles everything, including Python itself if needed. Once you have uv, there's no separate install step: uvx runs Schemathesis in an isolated environment automatically.

Petstore API (unauthenticated endpoints)

The Petstore API has a publicly available OpenAPI spec and no authentication requirements, making it the simplest way to see Schemathesis in action:

uvx schemathesis run https://petstore.swagger.io/v2/swagger.json --checks all

That's it — Schemathesis reads the spec, generates test cases for every endpoint, and reports any violations or server errors it finds.

Here's what the run looks like as it executes:

Schemathesis Output
Schemathesis v4.12.0
━━━━━━━━━━━━━━━━━━━━

 ✅  Loaded specification from https://petstore.swagger.io/v2/swagger.json (in 1.03s)

     Base URL:         https://petstore.swagger.io/v2
     Specification:    Open API 2.0
     Operations:       20 selected / 20 total

 ✅  API capabilities:

     Supports NULL byte in headers:    ✘

 ❌  Examples (in 1.17s)

     ❌  2 failed  ⏭  18 skipped

 ❌  Coverage (in 10.33s)

     ❌ 20 failed

 ❌  Fuzzing (in 3.97s)

     ✅  1 passed  ❌ 19 failed

 🕓  Stateful

     0:00:22 119 scenarios  •  12 covered / 28 selected / 28 total (28 inferred)

     ✅ 99 passed  ❌ 20 failed

One of the failures Schemathesis surfaced:

Failure Detail
________________________________________________ POST /user/createWithList ________________________________________________
1. Test Case ID: z6UISM

- Server error

[500] Internal Server Error:

    `{"code":500,"type":"unknown","message":"something bad happened"}`

Reproduce with:

    curl -X POST -H 'Content-Type: application/json' -d false https://petstore.swagger.io/v2/user/createWithList

And the final summary:

Summary
========================================================= SUMMARY =========================================================

API Operations:
  Selected: 20/20
  Tested: 20

Test Phases:
  ❌ Examples
  ❌ Coverage
  ❌ Fuzzing
  ❌ Stateful

Failures:
  ❌ API accepts invalid authentication: 1
  ❌ API accepts requests without authentication: 1
  ❌ Server error: 7
  ❌ Use after free: 1
  ❌ Response header does not conform to the schema: 1
  ❌ Response violates schema: 8
  ❌ API accepted schema-violating request: 9
  ❌ API rejected schema-compliant request: 3
  ❌ Missing Content-Type header: 3
  ❌ Missing header not rejected: 2
  ❌ Undocumented Content-Type: 3
  ❌ Undocumented HTTP status code: 15
  ❌ Unsupported methods: 17

Test cases:
  2860 generated, 59 found 71 unique failures

RESTful Booker (authenticated endpoints)

RESTful Booker has protected endpoints that require authentication, which is why I chose to demo it: every API I test at work has some form of authentication, often several. Schemathesis works with basic auth, API keys, bearer tokens, and custom third-party schemes, but you have to configure authentication yourself. If you skip this step, Schemathesis will hit the protected endpoints, receive 403 responses, and clutter your results with false positives. So, get a token first:

curl -X POST https://restful-booker.herokuapp.com/auth \
  -H "Content-Type: application/json" \
  -d '{"username": "admin", "password": "password123"}'

That returns a token you pass as a cookie on the Schemathesis run. RESTful Booker doesn't publish an official OpenAPI spec — we're using a community-maintained spec hosted here for reliability. The --base-url flag tells Schemathesis where to send the actual API requests, independent of where the spec is hosted:

uvx schemathesis run https://www.davidmello.com/specs/restful-booker.swagger.json \
  --base-url https://restful-booker.herokuapp.com \
  --checks all \
  --header "Cookie: token=<your-token>"

That single command reads the spec, generates test cases for every endpoint — authenticated and unauthenticated — sends them to the live API, and reports any spec violations or server errors it finds.

You might be thinking at this point, like I was, "That seems annoyingly manual: run a separate command to get the token, then copy it into the header of the Schemathesis command." Or, "What if my token has a short shelf life and expires before my test run completes?" Fortunately, Schemathesis has a built-in answer to both.

Create a schemathesis_hooks.py file in the same directory you'll run Schemathesis from:

schemathesis_hooks.py
import requests
import schemathesis

TOKEN = None

@schemathesis.hook
def before_call(context, case, request):
    global TOKEN
    
    if TOKEN is None:
        # Restful Booker requires Content-Type: application/json
        response = requests.post(
            "https://restful-booker.herokuapp.com/auth",
            headers={"Content-Type": "application/json"},
            json={"username": "admin", "password": "password123"}
        )
        
        if response.status_code == 200:
            data = response.json()
            TOKEN = data.get("token")
            if TOKEN:
                print(f"\n✅ Auth Successful! Token: {TOKEN}")
            else:
                # Sometimes it returns {"reason": "Bad credentials"} with 200 OK
                print(f"\n❌ Auth Failed. Response: {data}")
        else:
            print(f"\n❌ Auth HTTP Error {response.status_code}: {response.text}")

    if TOKEN:
        # Inject into the case headers for Schemathesis to use
        case.headers["Cookie"] = f"token={TOKEN}"

Then create a schemathesis.toml in the same directory to wire the hook in automatically:

schemathesis.toml
hooks = "schemathesis_hooks"
base-url = "https://restful-booker.herokuapp.com"

Note that we're also setting base-url in the TOML file, so we no longer need to pass it on the command line.

With both files in place, drop the manual --header flag — the hook handles authentication for every request automatically:

uvx schemathesis run https://www.davidmello.com/specs/restful-booker.swagger.json --checks all

Before running a full test, it's worth a constrained pass (one generated example per test) to validate the spec and configuration before sending the full request set:

uvx schemathesis run https://www.davidmello.com/specs/restful-booker.swagger.json --max-examples=1
RESTful Booker constrained run
Schemathesis v4.12.0
━━━━━━━━━━━━━━━━━━━━

 ✅  Loaded specification from restful-booker.swagger.json (in 0.13s)

     Base URL:         https://restful-booker.herokuapp.com
     Specification:    Open API 2.0
     Operations:       8 selected / 8 total
     Configuration:    schemathesis.toml

 ✅  API capabilities:

     Supports NULL byte in headers:    ✘

✅ Auth Successful! Token: ec3389fa87e2c05

 ❌  Coverage (in 4.55s)

     ❌ 8 failed

 ❌  Fuzzing (in 0.84s)

     ✅ 4 passed  ❌ 4 failed

 ❌  Stateful (in 0.80s)

     Scenarios:    4
     API Links:    0 covered / 16 selected / 16 total (16 inferred)

     ✅ 2 passed  ❌ 2 failed

A couple of notable failures from the run:

Failure Detail — Server Error
__________________________________________________________________________________________________ GET /booking __________________________________________________________________________________________________
1. Test Case ID: myEjmx

- Server error
- Undocumented Content-Type

    Received: text/plain; charset=utf-8
    Documented: application/json

- Undocumented HTTP status code

    Received: 500
    Documented: 200, 400

[500] Internal Server Error:

    `Internal Server Error`

Reproduce with:

    curl -X GET -H 'Cookie: [Filtered]' 'https://restful-booker.herokuapp.com/booking?firstname=null&checkin=null&checkout=null'
Failure Detail — Schema Violation
_______________________________________________________________________________________________ PUT /booking/{id} ________________________________________________________________________________________________
1. Test Case ID: yJU461

- API rejected schema-compliant request

    Valid data should have been accepted
    Expected: 2xx, 401, 403, 404, 409, 5xx

[400] Bad Request:

    `Bad Request`

Reproduce with:

    curl -X PUT -H 'Authorization: [Filtered]' -H 'Cookie: [Filtered]' -H 'Content-Type: application/json' -d '{}' https://restful-booker.herokuapp.com/booking/1
Summary
API Operations:
  Selected: 8/8
  Tested: 8

Failures:
  ❌ Server error: 2
  ❌ API accepted schema-violating request: 1
  ❌ API rejected schema-compliant request: 2
  ❌ Missing header not rejected: 1
  ❌ Undocumented Content-Type: 5
  ❌ Undocumented HTTP status code: 4
  ❌ Unsupported methods: 6

Test cases:
  27 generated, 12 found 21 unique failures in 16.74s

What It Found That I Wouldn't Have Tested

Where Schemathesis shines is the coverage it delivers without you having to think about what to test. A few findings from the Petstore run stood out:

15 undocumented HTTP status codes — across 20 endpoints. Manually writing tests to catch undocumented response codes would require exhaustively hitting every endpoint with every possible input variation. Nobody does that. Schemathesis does it automatically.

"Use after free" — this is a good example of a scenario many people don't look for, but it's something I check and teach others to check. In API testing terms it means Schemathesis successfully retrieved or interacted with a resource after it had been deleted: a POST to create, a DELETE to remove, then a GET that still returned data. If it's deleted, you shouldn't be able to GET it. People often skip stateful testing like this, but that's where the really fun bugs live. Many competing spec-testing tools omit stateful testing altogether, and it's an area where spec-driven tools like Schemathesis have a clear edge over hand-written test suites.

9 instances of the API accepting schema-violating requests — the API happily accepted inputs its own spec says it should reject. This is the kind of silent validation gap that leads to corrupt data, unexpected production behavior, or confused users. Without property-based testing generating invalid inputs systematically, these would be invisible.

In a real-world engagement with limited time I might have applied a risk-based approach and never gotten to these. Schemathesis found all of them in under 20 seconds and the full petstore run in under 1 minute.

What I Liked

  • Against RESTful Booker I was able to cover the entire swagger file running 623 scenarios in under 2 minutes, finding 26 defects.
  • Very readable numbered test failures that include a title, description, expected vs. actual response — and a curl command to reproduce the exact failure, making it easy to hand off directly to a developer.
  • Ability to export results to JUnit, HAR, VCR, and NDJSON — making it easy to plug into CI pipelines or share with developers.
  • This paired nicely with Claude and an ADO MCP server to parse and deduplicate the failures by category and automatically file defect reports in Azure DevOps.
Specific test failure snippet
2. Test Case ID: WjVfwK

- API rejected schema-compliant request

    Valid data should have been accepted
    Expected: 2xx, 401, 403, 404, 409, 5xx

[400] Bad Request:

    `Bad Request`

Reproduce with:

    curl -X PUT -H 'Authorization: [Filtered]' -H 'Cookie: [Filtered]' -H 'Content-Type: application/json' -d '{}' https://restful-booker.herokuapp.com/booking/1

For example, in the failure above, the test indicates we received a 400 Bad Request instead of a 2xx response, even though, per the schema, the payload sent ({}) should have been an acceptable PUT body. The title API rejected schema-compliant request immediately conveys that.
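Those export formats make post-processing straightforward. As a sketch of the bucketing step I paired with Claude (the field names below are illustrative, not the exact export schema — inspect your own NDJSON or JUnit output first):

```python
import json
from collections import Counter

# Illustrative NDJSON-style failure records (field names are hypothetical;
# check the fields in your actual Schemathesis export)
lines = [
    '{"check": "server_error", "operation": "GET /booking"}',
    '{"check": "response_schema_conformance", "operation": "PUT /booking/{id}"}',
    '{"check": "server_error", "operation": "POST /booking"}',
]

# Bucket failures by check category before deduplicating and filing defects
counts = Counter(json.loads(line)["check"] for line in lines)
for check, n in counts.most_common():
    print(f"{check}: {n}")
```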

What Surprised Me or Needs Work

I noticed repeated back-to-back test runs would generate a different number of test cases, which was confusing at first. Schemathesis uses random seeding to generate different inputs from run to run (you can lock this down for reproducible runs via --hypothesis-seed), and it chains requests during stateful testing based on what the API returns — so if the API responds differently, it may explore a different set of scenarios.

The documentation is good overall, but finding some of what I was looking for was unintuitive without the site search, and some sections could go deeper.

Pairing Schemathesis with AI

There are at least two ways to leverage AI to turbocharge Schemathesis that I've thought of and used so far.

  1. Parse the output to bucketize errors, create summaries of findings, and automatically file defects — as covered above with Claude and ADO MCP.
  2. Have the AI extract all the curl commands from the output and add them to your regression test suite, or parse the VCR or HAR export to do the same — turning a single Schemathesis run into a reusable, deterministic test suite.
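For option 2, you don't even strictly need AI: the curl reproduction commands follow a consistent pattern in the text report, so a small script can pull them out. A sketch, run against a made-up report snippet:

```python
import re

# A fragment in the shape of a Schemathesis text report (made up for the demo)
report = """\
Reproduce with:

    curl -X POST -H 'Content-Type: application/json' -d false https://petstore.swagger.io/v2/user/createWithList

Reproduce with:

    curl -X GET 'https://restful-booker.herokuapp.com/booking?firstname=null'
"""

# Grab every indented line that begins with `curl`
curls = re.findall(r"^\s*(curl .+)$", report, flags=re.MULTILINE)
for cmd in curls:
    print(cmd)
```

Each extracted command is a deterministic reproduction of one failure, ready to drop into a regression suite or a shell script.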

Who Should Use This

  • Test engineers or developers who have OpenAPI/Swagger or GraphQL APIs with little or no test coverage — get high coverage in minutes without hand-writing a single test case.
  • Teams in rapid API prototyping phases where the spec is evolving — Schemathesis regenerates coverage automatically as the spec changes, with no test maintenance required.
  • Teams doing regression testing or pre-release certification — it provides fast coverage and surfaces things you may have missed.
  • Teams running a bug bash — those events rarely leave time to construct and wire up API tests by hand, and Schemathesis handles that work for you.
  • Fits naturally into a CI/CD pipeline.

Where it isn't a good fit

  • APIs without OpenAPI specs
  • Teams that need non-random, deterministic tests (though you can control the seeding or build tests from a run)

Conclusion

I've been genuinely impressed by the speed at which Schemathesis covers an API and the quality of defects it surfaces. The stateful testing in particular is a differentiator — it's where the most critical bugs tend to hide, and most competing tools skip it entirely.

There's more to explore in the official docs — advanced filtering, custom checks, and deeper CI integration among them. My next steps include trialing it on larger production-scale APIs and finding more creative ways to pair the output with AI tooling to automate the defect reporting pipeline further.

It only takes a few minutes to set up a run. Point it at one of your APIs and see what it finds — you might be surprised.