Test Types — Functional and Non-Functional Testing
What's the difference between regression, smoke, sanity, performance, and security testing? Learn all major test types, when to use them, and how they fit in your CI/CD pipeline.
This post is part of the ISTQB Foundation Level series. If you missed the previous post, read Part 4 — Test Levels Explained first.
Types vs. Levels — The Key Distinction
Before we dive in, let’s establish one critical distinction that trips up many ISTQB candidates.
Test levels describe where testing happens — component, integration, system, or acceptance. They define the scope and the portion of the system under test.
Test types describe what characteristic is being tested. Functional? Performance? Security? Usability?
The crucial insight: any test type can be applied at any test level. You can run a performance test on a single component (component level), on the interaction between two services (integration level), or on the full system (system level). The type and the level are independent dimensions.
This is ISTQB section 2.3, and understanding the distinction clearly is worth at least one exam question.
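One way to picture the two independent dimensions is as a grid. The sketch below is only an illustration; the two lists are shortened samples taken from this post, not a complete taxonomy.

```python
from itertools import product

# Shortened, illustrative lists: the levels and a few of the types named above.
TEST_LEVELS = ["component", "integration", "system", "acceptance"]
TEST_TYPES = ["functional", "performance", "security", "usability"]

# Every (type, level) pair is a valid cell in the grid: 4 x 4 = 16 combinations.
combinations = list(product(TEST_TYPES, TEST_LEVELS))

print(len(combinations))
print(("performance", "component") in combinations)
```

When you read an exam scenario, placing it in one cell of this grid (for example, performance at component level) answers both the "type" and the "level" question at once.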
Functional Testing — ISTQB 2.3.1
Functional testing verifies what the system does — that the software functions as specified in the requirements.
A functional test asks: “Does this feature work?”
- Does the login form accept valid credentials and reject invalid ones?
- Does the shopping cart calculate the correct total when a discount code is applied?
- Does the API endpoint return the expected data structure for a given input?
The test basis for functional testing is the functional requirements: use cases, user stories, functional specifications, business rules.
Functional testing can be performed at all test levels:
- Component level: does this function return the correct value?
- Integration level: do these two services exchange the correct data?
- System level: does this end-to-end workflow produce the correct outcome?
- Acceptance level: does this workflow satisfy the business requirement?
Black-box test techniques — equivalence partitioning, boundary value analysis, decision tables, state transition testing — are the primary tools of functional testing. We’ll cover these in detail in a later post.
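To make this concrete, here is a minimal sketch of a component-level functional test that uses boundary value analysis. The `cart_total` function, the `SAVE10` code, and the "10% off carts of 50.00 or more" rule are all hypothetical, invented purely for illustration.

```python
from decimal import Decimal

# Hypothetical business rule, assumed for illustration:
# the code "SAVE10" takes 10% off carts of 50.00 or more.
DISCOUNT_CODE = "SAVE10"
DISCOUNT_THRESHOLD = Decimal("50.00")
DISCOUNT_RATE = Decimal("0.10")


def cart_total(prices, code=None):
    """Return the cart total, applying the discount when the rule allows it."""
    subtotal = sum(prices, Decimal("0"))
    if code == DISCOUNT_CODE and subtotal >= DISCOUNT_THRESHOLD:
        subtotal -= subtotal * DISCOUNT_RATE
    return subtotal.quantize(Decimal("0.01"))


def test_cart_total_boundaries():
    # Boundary value analysis: just below, on, and just above the threshold.
    assert cart_total([Decimal("49.99")], "SAVE10") == Decimal("49.99")
    assert cart_total([Decimal("50.00")], "SAVE10") == Decimal("45.00")
    assert cart_total([Decimal("50.01")], "SAVE10") == Decimal("45.01")
    # Equivalence partition: an unknown code never changes the total.
    assert cart_total([Decimal("80.00")], "BOGUS") == Decimal("80.00")


test_cart_total_boundaries()
```

The test only looks at inputs and outputs, never at how `cart_total` is implemented, which is what makes it a black-box functional test.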
Non-Functional Testing — ISTQB 2.3.2
Non-functional testing verifies how well the system performs — the quality characteristics beyond raw functionality. Even a system that does exactly what the requirements say can fail users if it does it slowly, insecurely, or in a way that’s impossible to use.
The ISO/IEC 25010 standard defines the quality characteristics that non-functional testing addresses. The sections below cover the ones you’ll encounter most often in practice: performance, security, usability, and reliability.
:::warning[A Word on Non-Functional Testing in the Wild]
Many teams test only functionality, yet non-functional failures are behind a large share of production incidents: the slow database query that works fine with 100 test records but collapses with 10 million production records; the security header that was accidentally removed in a refactor; the page that is technically functional but requires 14 clicks to complete a task a user does 50 times a day. Non-functional testing is not optional. It is what stands between a working demo and a reliable product.
:::
Performance Testing
Performance testing verifies how fast, how scalable, and how stable a system is under varying loads.
There are several distinct sub-types, each with a different purpose:
| Sub-type | What it tests | Typical question |
|---|---|---|
| Load testing | Behaviour under expected load | Does the system handle 1,000 concurrent users? |
| Stress testing | Behaviour beyond expected load | At what point does the system fail? How does it fail? |
| Spike testing | Behaviour under sudden load increase | What happens when traffic spikes 10× in 10 seconds? |
| Endurance / soak testing | Behaviour under sustained load over time | Does the system degrade over 12 hours of continuous use? (memory leaks, connection pool exhaustion) |
Tools: k6, Apache JMeter, Gatling, Locust
A good performance test suite defines explicit acceptance criteria (e.g., “95th percentile response time < 500ms under 500 concurrent users”) rather than just measuring and hoping the numbers look good.
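A criterion like that can be checked mechanically once you have the measured response times. The sketch below assumes the latencies have already been exported from a load test run as a plain list of milliseconds; the sample numbers are made up, and the nearest-rank percentile is just one of several common definitions, so a given tool may report a slightly different value.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: smallest sample with at least p% of values at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # p * n is computed first so whole-number ranks are not disturbed by float error.
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


def meets_criterion(latencies_ms, p=95, limit_ms=500):
    """True when the p-th percentile latency is strictly below the limit."""
    return percentile(latencies_ms, p) < limit_ms


# Made-up measurements in milliseconds.
latencies_ms = [120, 135, 150, 160, 180, 210, 240, 300, 420, 900]

print(percentile(latencies_ms, 95))
print(meets_criterion(latencies_ms))
```

With these ten samples the median looks healthy, but the single 900 ms outlier pushes the 95th percentile over the limit, which is why criteria are usually written against a high percentile rather than the average.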
Security Testing
Security testing verifies that the system protects data and resources from unauthorized access and manipulation.
The OWASP Top 10 is the de facto reference list for web application security vulnerabilities — SQL injection, broken authentication, cross-site scripting (XSS), insecure deserialization, and others. If you’re testing a web application and haven’t verified it against the OWASP Top 10, your security testing is incomplete.
Security testing approaches:
- SAST (Static Application Security Testing): analyzes source code without running the application. Fast, catches issues early, integrates into IDE and CI. Tools: Snyk Code, SonarQube, Semgrep.
- DAST (Dynamic Application Security Testing): attacks the running application from the outside, like a real attacker would. Tools: OWASP ZAP, Burp Suite.
- Penetration testing: a skilled security professional (or team) attempts to compromise the system using the same techniques real attackers use. More thorough than automated scanning, but slower and more expensive.
- Dependency scanning: checks your third-party libraries for known vulnerabilities. Tools: Snyk, Dependabot, OWASP Dependency-Check.
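Alongside these tool-based approaches, individual vulnerabilities can also be covered by ordinary automated tests. The sketch below is a small negative test for SQL injection using Python's built-in `sqlite3` module; the `users` table, the `find_user` helper, and the payload are assumptions chosen for illustration, not a complete security test.

```python
import sqlite3


def find_user(conn, username):
    """Look up a user with a parameterized query (the '?' placeholder)."""
    cur = conn.execute(
        "SELECT id, username FROM users WHERE username = ?", (username,)
    )
    return cur.fetchall()


def make_db():
    """Create an in-memory database with two sample users."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
    conn.executemany(
        "INSERT INTO users (username) VALUES (?)", [("alice",), ("bob",)]
    )
    return conn


def test_injection_payload_returns_nothing():
    conn = make_db()
    # A classic payload: with string concatenation this would match every row.
    payload = "' OR '1'='1"
    assert find_user(conn, payload) == []
    # The legitimate lookup still works.
    assert find_user(conn, "alice") == [(1, "alice")]


test_injection_payload_returns_nothing()
```

A test like this catches a regression where someone replaces the parameterized query with string concatenation, but it does not replace SAST, DAST, or a penetration test.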
Usability Testing
Usability testing verifies that the system is easy to learn, efficient to use, and satisfying for the intended users.
Methods include:
- Think-aloud protocol: users narrate their thoughts as they attempt tasks, revealing confusion and friction points.
- Task completion testing: users attempt specific tasks; success rate and time-on-task are measured.
- A/B testing: two variants of a UI are shown to different user groups; metrics determine which performs better.
Tools: Maze (unmoderated remote usability testing), Hotjar (heatmaps, session recordings), UserTesting (moderated sessions).
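The two task completion metrics mentioned above are simple to compute from session data. This is a minimal sketch with made-up results; the tuple layout and the choice to average time only over successful attempts are assumptions, and a real study may define them differently.

```python
from statistics import mean

# Made-up session results: (participant, completed_task, seconds_on_task).
sessions = [
    ("p1", True, 42.0),
    ("p2", True, 55.5),
    ("p3", False, 120.0),
    ("p4", True, 38.5),
]


def task_metrics(results):
    """Return (success_rate, mean_time_of_successful_attempts)."""
    successes = [seconds for _, done, seconds in results if done]
    success_rate = len(successes) / len(results)
    mean_time = mean(successes) if successes else None
    return success_rate, mean_time


print(task_metrics(sessions))
```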
Reliability Testing
Reliability testing verifies that the system operates without failure for a specified period under specified conditions.
Key metrics:
- MTBF (Mean Time Between Failures): average time the system operates between failures.
- MTTR (Mean Time To Recovery): average time to restore service after a failure.
- Availability: the proportion of time the system is operational, typically expressed as a percentage (e.g., 99.9% allows roughly 8.76 hours of downtime per year).
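The arithmetic behind these metrics is short enough to sketch. The code below uses the common steady-state formula availability = MTBF / (MTBF + MTTR) and a 365-day year; both are simplifying assumptions, and the function names are invented for this example.

```python
HOURS_PER_YEAR = 365 * 24  # 8,760 hours; ignores leap years


def availability(mtbf_hours, mttr_hours):
    """Steady-state availability as uptime / (uptime + repair time)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_hours_per_year(availability_fraction):
    """Expected downtime per year for a given availability (0.999 means 99.9%)."""
    return (1 - availability_fraction) * HOURS_PER_YEAR


print(round(downtime_hours_per_year(0.999), 2))   # "three nines"
print(round(downtime_hours_per_year(0.9999), 2))  # "four nines"
print(round(availability(mtbf_hours=999, mttr_hours=1), 4))
```

Each extra "nine" cuts the downtime budget by a factor of ten, which is why availability targets should be agreed before the reliability tests are designed.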
Chaos engineering takes reliability testing to its logical extreme: deliberately inject failures into a running system (kill a pod, saturate a network link, corrupt a disk) to verify that the system degrades gracefully and recovers automatically. Tools: Chaos Monkey (Netflix), LitmusChaos, AWS Fault Injection Simulator.
Regression Testing — ISTQB 2.3.4
Every time you change software — fix a bug, add a feature, refactor code, update a dependency — you risk introducing new defects or re-introducing defects that were previously fixed. This is called regression.
Regression testing is the practice of re-running existing tests after changes to confirm that previously working functionality still works.
Risk-based vs. full regression
Running the entire test suite after every change is often impractical. Two approaches manage this trade-off:
Full regression: run everything. Maximum confidence. Often too slow for frequent CI/CD cycles.
Risk-based regression: select tests based on the risk and impact of the change. If you changed the payment module, run payment tests plus anything the payment module touches. Requires good test traceability (knowing which tests cover which functionality).
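Risk-based selection depends on that traceability data. The sketch below assumes it exists as a simple mapping from test name to the modules each test exercises; the test names, module names, and the `select_tests` helper are hypothetical.

```python
# Hypothetical traceability data: which modules each test exercises.
TEST_COVERAGE = {
    "test_checkout_total": {"cart", "payment"},
    "test_refund": {"payment"},
    "test_invoice_email": {"payment", "notifications"},
    "test_login": {"auth"},
    "test_search": {"catalog"},
}


def select_tests(changed_modules, coverage=TEST_COVERAGE):
    """Return the tests that touch at least one changed module, sorted by name."""
    changed = set(changed_modules)
    return sorted(name for name, modules in coverage.items() if modules & changed)


print(select_tests(["payment"]))
```

A change to `payment` selects the three tests that touch it and skips the login and search tests. The selection is only as good as the mapping, so stale traceability data quietly turns risk-based regression into under-testing.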
Regression in CI/CD
In modern pipelines, regression testing happens continuously:
- Unit and fast integration tests on every commit (seconds)
- Broader integration suite on every PR (minutes)
- Full regression suite on merge to main (minutes–hours)
Maintenance of regression suites
Regression suites grow over time and require active maintenance. A test that always passes is not necessarily valueless, because it still acts as a safety net. Tests that are flaky, slow, or tied to functionality that no longer exists, however, should be fixed or removed. A bloated regression suite becomes a bottleneck.
Smoke Testing
Smoke testing is a quick, broad check to verify that the most critical functionality of a build or deployment works at all.
The name has industrial origins: if you turn on a new electrical circuit and it starts smoking, you know something is fundamentally wrong before you try to use it. In software, a smoke test answers: “Is this build even worth testing further?”
Characteristics:
- Covers only a small fraction of the full test suite (roughly 5–15% as a rule of thumb)
- Runs in minutes, not hours
- Focuses on critical paths: can users log in? Does the home page load? Does the core API respond?
- Run immediately after deployment to any environment
If smoke tests fail, the build is rejected immediately, without spending time running the full suite against a broken deployment.
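The fail-fast behaviour can be sketched as a small gate function. The checks here are stand-in lambdas; in a real pipeline each one would call the deployed environment, and the `run_smoke_checks` name is an assumption made for this example.

```python
def run_smoke_checks(checks):
    """Run named checks in order and stop at the first failure.

    Returns (passed, failed_check_name).
    """
    for name, check in checks:
        try:
            ok = check()
        except Exception:
            # A crashing check counts as a failed check.
            ok = False
        if not ok:
            return False, name
    return True, None


# Stand-in checks; real ones would call the deployed environment.
checks = [
    ("home page loads", lambda: True),
    ("login works", lambda: True),
    ("core API responds", lambda: False),
    ("checkout works", lambda: True),  # never reached: the gate fails fast
]

passed, failed = run_smoke_checks(checks)
print(passed, failed)
```

Returning the name of the first failing check gives the pipeline a clear rejection reason without spending time on the remaining checks.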
Sanity Testing
Sanity testing is a narrowly focused check performed after a minor change or bug fix to verify that the specific change works as intended and hasn’t introduced obvious breakage in related functionality.
The differences from related concepts:
| | Smoke Testing | Sanity Testing | Regression Testing |
|---|---|---|---|
| Scope | Broad, critical paths | Narrow, specific area | Broad, all existing functionality |
| Trigger | New build / deployment | Minor fix or change | Any change |
| Goal | Is the build viable? | Did this fix work? | Did anything break? |
| Formality | Often scripted | Often manual / exploratory | Scripted |
| Time | Minutes | Minutes to an hour | Hours |
Sanity testing is often performed by testers without a detailed test script — they use their knowledge of the system to quickly assess whether the change achieved its goal.
Mapping Test Types to Your CI/CD Pipeline
Different test types belong at different points in the delivery pipeline. Here’s a practical mapping:
| Stage | Test types |
|---|---|
| Pre-commit | Unit tests, SAST (linting + static analysis) |
| Pull Request | Unit, Integration, Contract, SAST |
| Merge to main | Smoke, Regression, DAST (basic) |
| Pre-release | Full regression, Performance, Security pentest |
| Post-deployment | Smoke, Synthetic monitoring, Chaos (staging) |
| Production | Synthetic monitoring, Observability |
The guiding principle: faster tests earlier, slower tests later. You want to fail as early in the pipeline as possible, because failures found earlier are cheaper to fix and faster to resolve.
Conclusion
Test types and test levels form a two-dimensional space for organizing your testing effort:
- Functional testing verifies correctness — does it do what the requirements say?
- Non-functional testing verifies quality — does it do it fast enough, securely enough, reliably enough?
- Regression testing preserves correctness over time — did our changes break something that worked?
- Smoke testing gates the pipeline — is this build even worth testing?
- Sanity testing verifies focused fixes — did this specific change achieve its goal?
Before your next sprint retrospective, ask: “Which non-functional test types are we systematically NOT doing?” Performance, security, and reliability testing are often the first casualties when project schedules tighten — and the first causes of production incidents after release.
:::tip[ISTQB Exam Tip]
Remember: test types and test levels are orthogonal. The exam frequently presents scenarios and asks you to identify the test type, the test level, or both. A “performance test on a single function” is non-functional testing at the component level. A “UAT walkthrough of the checkout flow” is functional testing at the acceptance level. Practice applying both dimensions simultaneously.
:::
Next up: Part 6 — Static Testing and Reviews