AI-Assisted WPCS + PHPStan + WCAG: A Pre-Commit Stack for 2026
At Wbcom Designs, we run a WordPress agency with over 100 active plugins and custom products in production. The single change that most reduced our bug rate was not a better code review process. It was a pre-commit hook that catches problems before a developer can push. When you layer AI assistance on top of WPCS, PHPStan, and WCAG checks, the result is a quality gate that runs in under 30 seconds and blocks entire categories of regressions.
This post walks through exactly how that stack is configured, what each tool catches that the others miss, and where AI acceleration matters versus where it adds noise.
Why Pre-Commit and Not CI
CI is where you confirm quality. Pre-commit is where you enforce it. The cost difference between catching a PHPCS violation in a pre-commit hook and having it fail a CI pipeline is real: by the time the pipeline fails, the developer has context-switched away and has to come back, fix the issue, push again, and wait for the pipeline to spin up again. On a busy day that loop costs 15-20 minutes per violation.
Pre-commit hooks run locally, in under 30 seconds on most plugin codebases, with zero context switch. The developer fixes the problem before it ever leaves their machine. The CI pipeline then serves its proper role: confirming that what passed locally also passes in a clean environment with the full test suite.
There is a second reason: developer trust. When CI is the only gate, developers learn to push first and fix later. When pre-commit catches issues immediately, the feedback loop is tight enough that the correction becomes part of the original thought process. It changes how developers write code, not just how they review it.
The Three-Tool Stack
Each tool in the stack has a distinct responsibility. They do not overlap much, but they reinforce each other.
WPCS: WordPress Coding Standards
WPCS is PHP_CodeSniffer running the WordPress ruleset. It catches code style violations, deprecated function usage, direct database queries that should use $wpdb, missing sanitization and escaping, nonce verification gaps, and a long list of WordPress-specific anti-patterns that general PHP linters miss entirely.
The most valuable WPCS checks are not style rules. They are the security-adjacent rules: WordPress.Security.EscapeOutput, WordPress.Security.ValidatedSanitizedInput, and WordPress.DB.PreparedSQL. These catch classes of vulnerabilities before a human reviewer sees the code.
For most plugin codebases, start with the WordPress-Extra ruleset and exclude rules that conflict with your team’s established conventions. The configuration lives in a phpcs.xml.dist file at the repo root.
PHPStan: Static Analysis
PHPStan catches what WPCS cannot: type errors, method calls on values that may be null, undefined variable references, incorrect method signatures, and logic paths that can never be reached. It operates at a different layer. WPCS checks code style and security patterns; PHPStan checks correctness.
For WordPress plugins, the essential extension is szepeviktor/phpstan-wordpress, which teaches PHPStan about WordPress core functions, hooks, and data structures. Without it, PHPStan raises false positives on nearly every WP API call.
Start at level 5. Level 6 starts reporting missing type declarations (including generics), and level 8 adds strict null handling; both can be painful to retrofit on existing codebases. Level 5 catches the bugs that actually ship to production without excessive noise.
WCAG Automated Checks
WCAG is the one that teams consistently skip in pre-commit, and that is exactly why accessibility debt compounds. Automated WCAG checks cover roughly 30-40% of WCAG 2.1 AA criteria. That is not everything, but it is the 30-40% that fails most often: missing alt text, low contrast ratios, form inputs without labels, missing ARIA roles, and keyboard navigation breaks.
For the pre-commit context, we run accessibility checks on any PHP file that outputs HTML. This is done by building a lightweight snapshot of the output and running it through axe-core via Node. The check is fast enough to run pre-commit if you scope it to changed files only.
Where AI Fits In the Stack
AI does not replace any of the three tools. It accelerates two specific parts of the workflow where the tools produce output that requires human interpretation: reading violation reports and generating fixes.
Interpreting Violations
WPCS produces long reports that mix critical security issues with minor whitespace violations. A developer looking at 40 lines of PHPCS output has to triage it manually. With a Claude prompt attached to the pre-commit output, the violations get classified by severity and the two or three critical ones get surfaced first. This is not magic; it is structured filtering. But it changes how developers respond to the output.
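The structured filtering described here does not need a model for its first pass. The following is a minimal sketch, not the Wbcom prompt: it assumes the report was produced with phpcs --report=json, and the list of "critical" sniff prefixes is our own assumption built from the security-adjacent sniffs named earlier. The function name and sample report are hypothetical.

```python
import json

# Sniff prefixes treated as critical. The first three are the security-adjacent
# sniffs named earlier in this post; NonceVerification is an added assumption.
CRITICAL_PREFIXES = (
    "WordPress.Security.EscapeOutput",
    "WordPress.Security.ValidatedSanitizedInput",
    "WordPress.DB.PreparedSQL",
    "WordPress.Security.NonceVerification",
)


def triage_phpcs(report_json):
    """Flatten a phpcs JSON report and sort critical sniffs to the top."""
    report = json.loads(report_json)
    rows = []
    for path, data in report.get("files", {}).items():
        for msg in data.get("messages", []):
            source = msg.get("source", "")
            rows.append({
                "file": path,
                "line": msg.get("line", 0),
                "type": msg.get("type", ""),
                "source": source,
                "critical": source.startswith(CRITICAL_PREFIXES),
            })
    # Critical first, then errors before warnings, then by file and line.
    rows.sort(key=lambda r: (not r["critical"], r["type"] != "ERROR",
                             r["file"], r["line"]))
    return rows


# Illustrative report, not real output from any plugin.
SAMPLE_REPORT = json.dumps({
    "totals": {"errors": 3, "warnings": 0, "fixable": 0},
    "files": {
        "includes/form.php": {
            "errors": 3,
            "warnings": 0,
            "messages": [
                {"line": 10, "type": "ERROR",
                 "source": "Squiz.Commenting.FunctionComment.Missing",
                 "message": "Missing doc comment for function"},
                {"line": 57, "type": "ERROR",
                 "source": "WordPress.DB.PreparedSQL.NotPrepared",
                 "message": "Use placeholders and $wpdb->prepare()"},
                {"line": 42, "type": "ERROR",
                 "source": "WordPress.Security.EscapeOutput.OutputNotEscaped",
                 "message": "All output should be run through an escaping function"},
            ],
        }
    },
})

for row in triage_phpcs(SAMPLE_REPORT):
    label = "CRITICAL" if row["critical"] else "routine"
    print(f"{label:8} {row['file']}:{row['line']} {row['source']}")
```

Feeding the AI step a pre-sorted list like this keeps the model's job narrow: explain and fix the top entries rather than decide what counts as critical.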
The same applies to PHPStan. Level 5 output on a medium-sized plugin can run to 20-30 distinct issues. An AI pass over the output can group related issues (all in the same class, or all triggered by the same root type error) and suggest a single fix that addresses several issues at once.
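The grouping step can be approximated mechanically before any model sees the output. This is a sketch under stated assumptions: the report comes from phpstan analyse --error-format=json, "same class" is approximated by "same file", and "same root error" by identical message text. The function name and sample data are hypothetical.

```python
import json
from collections import defaultdict


def group_phpstan(report_json):
    """Group PHPStan issues by file and by identical message, largest first."""
    report = json.loads(report_json)
    by_file = defaultdict(list)
    by_message = defaultdict(list)
    for path, data in report.get("files", {}).items():
        for msg in data.get("messages", []):
            location = f"{path}:{msg.get('line', 0)}"
            by_file[path].append(location)
            by_message[msg.get("message", "")].append(location)

    def largest_first(groups):
        # Biggest group first; ties broken alphabetically for stable output.
        return sorted(groups.items(), key=lambda kv: (-len(kv[1]), kv[0]))

    return {"by_file": largest_first(by_file),
            "by_message": largest_first(by_message)}


# Illustrative report, not real output from any plugin.
PARAM_MSG = "Parameter #1 $id of method Store::find() expects int, string given."
SAMPLE_REPORT = json.dumps({
    "totals": {"errors": 0, "file_errors": 4},
    "files": {
        "src/Order.php": {"errors": 2, "messages": [
            {"message": PARAM_MSG, "line": 12, "ignorable": True},
            {"message": PARAM_MSG, "line": 30, "ignorable": True},
        ]},
        "src/Cart.php": {"errors": 2, "messages": [
            {"message": PARAM_MSG, "line": 8, "ignorable": True},
            {"message": "Method Cart::total() should return float but returns int.",
             "line": 20, "ignorable": True},
        ]},
    },
    "errors": [],
})

groups = group_phpstan(SAMPLE_REPORT)
for message, locations in groups["by_message"]:
    print(f"{len(locations)}x {message} -> {', '.join(locations)}")
```

In the sample, three of the four issues share one message, which points at a single signature to fix rather than three separate call sites.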
Generating Fixes
For WPCS, many violations have mechanical fixes: adding esc_html(), replacing a deprecated function with its successor, wrapping a database query in $wpdb->prepare(). These are pattern fixes with known correct forms. AI is well-suited to generating them.
For PHPStan, the fixes are more context-dependent. Adding a type hint to a function signature is mechanical; deciding whether a null check should throw an exception or return early requires judgment. AI handles the mechanical cases well and flags the judgment calls for human review.
For WCAG, AI is most useful at the explanation layer. The axe-core rule that fires is often opaque to developers who have not memorized the WCAG spec. A one-line explanation of what the rule means and why it matters converts a confusing report into an actionable fix.
The Pre-Commit Configuration
The stack is configured via a .pre-commit-config.yaml file at the repo root. Here is what the full configuration looks like for a WordPress plugin, with all three tools wired in. The code lives in a Gist because inline code blocks get mangled by the WAF on this site.
A few implementation notes on that configuration:
- The WPCS hook uses pass_filenames: true so it only checks files staged for commit, not the entire codebase. On large plugins this keeps the hook under 10 seconds.
- PHPStan runs against the full codebase because it needs to resolve cross-file type references. This is the slow part. For very large codebases, restrict the paths argument to the src/ or plugin root directory.
- The WCAG hook is scoped to PHP files that contain HTML output patterns. Files in includes/ that are pure logic with no output are skipped.
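The scoping rule for the WCAG hook can be sketched as a small filter. This is an illustration, not the configuration from the Gist: the regular expression is our own heuristic for "contains HTML output patterns", and the function name and sample sources are hypothetical. The file reader is injected so the logic can be exercised without touching the disk.

```python
import re

# Heuristic for "this PHP file emits HTML": markup directly after a closing
# PHP tag, or an echo/print/printf statement containing an opening tag.
# The pattern is an assumption; tune it to your own templates.
HTML_OUTPUT = re.compile(
    r"\?>\s*<[a-zA-Z!]"
    r"|\b(?:echo|print|printf)\b[^;]*<[a-zA-Z]"
)


def files_for_wcag(staged, read_text):
    """Return the staged PHP files whose source appears to output HTML."""
    selected = []
    for path in staged:
        if not path.endswith(".php"):
            continue  # only PHP files are candidates
        if HTML_OUTPUT.search(read_text(path)):
            selected.append(path)
    return selected


# Illustrative sources standing in for staged files.
SOURCES = {
    "templates/form.php": '<?php get_header(); ?>\n<form><input type="text" name="q"></form>\n',
    "includes/totals.php": "<?php\nfunction wb_total( $a, $b ) {\n\treturn $a + $b;\n}\n",
    "admin/notice.php": "<?php\necho '<div class=\"notice\">' . esc_html( $msg ) . '</div>';\n",
    "assets/app.js": "document.write('<p>hi</p>');",
}

print(files_for_wcag(list(SOURCES), SOURCES.get))
```

In a real hook, the staged list would come from the filenames the pre-commit framework passes on the command line, and read_text would read each file from disk.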
The Claude Code Integration
The AI layer sits between the tool output and the developer. It does not run the tools; the pre-commit framework does that. It reads the output and produces two things: a triage summary and a set of suggested fixes.
The integration is a Claude Code skill stored in .claude/skills/pre-commit-triage.md. When a developer runs the pre-commit hook and sees violations, they can invoke the skill and it reads the hook output from the terminal, classifies the violations, and generates fixes for the mechanical ones. If you are new to how MCP servers fit into a WordPress development workflow, see how AI tools are changing plugin development via MCP for the architecture context.
The prompt for the skill is deliberately constrained. It has a fixed context: the violation report, the specific file being checked, and a lookup table of known correct patterns for common WordPress violations. Keeping the context tight keeps the suggestions accurate and prevents hallucinated API calls.
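To make the "fixed context" idea concrete, here is a hedged sketch of how such a prompt could be assembled from exactly three inputs. It is not the contents of the actual skill file; the function name, section headings, and sample values are all hypothetical.

```python
def build_triage_prompt(report, file_path, file_source, patterns):
    """Assemble a fixed-context prompt: the report, one file, known patterns."""
    pattern_lines = [f"- {code}: {fix}" for code, fix in sorted(patterns.items())]
    sections = [
        "You are triaging pre-commit output for a WordPress plugin.",
        "Use only the fix patterns listed below. If none applies, say so "
        "and flag the violation for human review.",
        "## Violation report\n" + report.strip(),
        f"## File under review: {file_path}\n" + file_source.strip(),
        "## Known correct patterns\n" + "\n".join(pattern_lines),
    ]
    return "\n\n".join(sections)


# Illustrative inputs only.
prompt = build_triage_prompt(
    report="includes/form.php:42 WordPress.Security.EscapeOutput.OutputNotEscaped",
    file_path="includes/form.php",
    file_source="<?php echo $title; ?>",
    patterns={
        "WordPress.Security.EscapeOutput": "wrap text output in esc_html()",
    },
)
print(prompt)
```

Nothing outside those three inputs reaches the model, which is the point: the narrower the context, the less room there is for an invented API call.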

Baseline Results from the Wbcom Codebase
We rolled this stack out across our internal plugin suite in January 2026. The first full scan showed 847 outstanding PHPCS violations in the most active plugin. Over the first four weeks of enforcement, violations in new commits dropped to near zero because the pre-commit hook caught them before they were committed.
The existing violation debt required a separate cleanup sprint, which took about three days with AI-assisted fixes for the mechanical cases. PHPStan at level 5 surfaced 23 genuine bugs in existing code, three of which were in code paths that handled user-submitted data. Those three would have been security disclosures if found externally.
WCAG checks found 14 accessibility issues across three plugins. Twelve were missing form labels or low-contrast text in admin views. Two were missing ARIA roles on dynamic components. All 14 were fixed in a half-day session.
The ongoing cost is low. Once the baseline is clean, the pre-commit hook adds about 15-20 seconds to the commit flow for most changesets. Developers adapt to it within a week. The complaints stop when the hook catches a bug before it reaches the staging server.
Common Objections
It Slows Down Development
It adds 15-20 seconds to the commit. It saves the 15-20 minutes of the fix-push-wait-CI cycle plus whatever debugging time the bug would have cost downstream. The time math is not close.
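The arithmetic behind that claim can be made explicit. A minimal sketch, using only the figures quoted in this post (15-20 seconds per commit, 15-20 minutes per failed CI loop) and ignoring downstream debugging time, computes how often the hook has to prevent a failed CI run to break even:

```python
def break_even_rate(hook_seconds, ci_loop_minutes):
    """Share of commits that must avoid a failed CI loop for the hook to pay off."""
    return hook_seconds / (ci_loop_minutes * 60)


# Least favourable case: slowest hook (20 s) against the cheapest CI loop (15 min).
worst = break_even_rate(20, 15)
# Most favourable case: fastest hook (15 s) against the costliest CI loop (20 min).
best = break_even_rate(15, 20)

print(f"break-even: one prevented CI failure per {1 / worst:.0f} to {1 / best:.0f} commits")
```

Put differently, the hook pays for itself if it prevents a failed CI run about once every 45 to 80 commits, before counting any debugging time saved downstream.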
PHPStan Has Too Many False Positives
It does at higher levels, or without the WordPress stubs. At level 5 with szepeviktor/phpstan-wordpress, false positives are rare enough that they are not a workflow problem. When they appear, they are added to the baseline referenced from phpstan.neon and reviewed quarterly.
Developers Will Bypass the Hook
They can, using --no-verify. The correct response is to also run WPCS and PHPStan in CI so the bypass only buys them a short-term workaround. The pre-commit hook is about changing the default behavior, not enforcing an unbreakable policy.
WCAG Automation Catches So Little That It Is Not Worth It
It catches the 30-40% of issues that occur most frequently and can be verified mechanically. That is exactly the category worth automating. The remaining 60-70% requires manual audits, which happen on a release cycle, not a commit cycle.
Setting Up the Stack From Scratch
This is the sequence for a plugin that has none of this in place. Budget about four hours for initial setup and a half-day to a full day for the baseline cleanup sprint depending on codebase size.
| Step | Tool | Time Estimate |
|---|---|---|
| Install pre-commit framework | pre-commit (Python) | 5 minutes |
| Configure WPCS with phpcs.xml.dist | WPCS + PHP_CodeSniffer | 30 minutes |
| Configure PHPStan with phpstan.neon | PHPStan + WP stubs | 45 minutes |
| Wire WCAG checks to changed PHP files | axe-core + Node | 45 minutes |
| Run full baseline scan | All three tools | 30 minutes |
| AI-assisted baseline cleanup | Claude Code | 4-8 hours |
| Team onboarding | README update | 1 hour |
What the Stack Does Not Catch
It is worth being clear about the limits. This stack does not replace:
- Manual code review. PHPStan and WPCS catch structural problems; code review catches design problems, business logic errors, and cases where technically valid code does the wrong thing.
- Integration testing. The stack checks individual files and functions. It does not verify that two plugins interact correctly, that a REST API endpoint returns the right shape for all inputs, or that a block renders correctly in the Gutenberg editor.
- Full accessibility audits. As noted, automated tools cover 30-40% of WCAG criteria. A complete audit requires a human tester with assistive technology.
- Performance analysis. A function that passes PHPStan and WPCS can still run 50 database queries per page load.
The stack is a floor, not a ceiling. It eliminates a category of problems so that code review and testing time can focus on the problems that require human judgment.
Extending the Stack
If you are also thinking about local development environment signalling, the WP_DEVELOPMENT_MODE constant integrates well with this stack to suppress certain WPCS rules in dev-only contexts. Beyond that, once the baseline is clean, two extensions are worth considering.
Psalm as a second static analysis pass. Psalm and PHPStan catch overlapping but not identical sets of issues. Psalm is stricter about generics and immutability. For plugin codebases, the combined signal is worth the added runtime if you can run Psalm in CI rather than pre-commit.
Security-specific PHPCS rules from Automattic’s VIP Go ruleset. These cover additional patterns related to direct file system access, unsafe redirects, and header injection that the standard WPCS ruleset does not address. They are production-grade rules built for one of the highest-traffic WordPress hosting platforms.
The Wbcom Designs Offer
If you are running a WordPress plugin or theme business and want this stack configured for your codebase, Wbcom Designs can set it up as part of our WordPress code audit service. The audit runs WPCS, PHPStan, and WCAG against your full codebase, classifies all violations, and produces a prioritized remediation plan. We can also configure the pre-commit stack so the issues do not recur.
The full service catalog includes ongoing maintenance and code review if you want the quality gate staffed rather than just configured.
The Developer Experience After Six Months
Six months after rolling out this stack at Wbcom Designs, the developer experience feedback has settled into a clear pattern. The initial resistance lasted about two weeks. Developers complained about the hook adding time to their commit flow and about PHPStan surfacing issues in legacy code they had not touched. Both complaints were legitimate.
The two-week mark is where it gets interesting. By then, developers had seen the hook catch real bugs before they reached staging. A missing nonce check in a form handler. A type error that would have thrown a PHP warning in production. A form input that had been missing its label for months. The hook caught all of them before anyone manually tested. The complaints about time cost stopped because the ROI became visible.
The AI layer changed something subtler: how developers respond to errors. Before the AI triage, a 40-line PHPCS report prompted a common response: find the fastest fix for the most visible issue and ignore the rest. The AI summary changed the default from “find the shortest path through this” to “address the root cause.” That shift in default behavior is hard to measure but very real in practice.
One specific change: the AI started flagging groups of violations that shared a root cause. Six PHPCS violations in different functions might all trace to a single missing sanitization helper in a base class. Fixing the helper fixed all six. Without the AI grouping, those six violations would have been addressed individually, probably inconsistently. With the grouping, they become one coherent fix.
Integrating With Claude Code Skills
The Claude Code integration mentioned earlier is worth expanding on. The skill is not a general coding assistant wired to the output. It is a constrained tool with a specific job: read a pre-commit violation report and produce a triage summary sorted by severity plus a set of suggested fixes for the mechanical violations.
The constraint matters. General-purpose AI coding assistants applied to security violations will occasionally hallucinate valid-looking but incorrect fixes. A constrained skill with a lookup table of known correct patterns for WordPress-specific violations produces more reliable output. The lookup table includes correct forms for the 20-30 most common WPCS violations: how to wrap a database query in $wpdb->prepare(), which escaping function to use for which output context, how to verify a nonce correctly.
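A lookup table of this kind can be as simple as a dictionary keyed by sniff prefix. The sketch below is a hypothetical four-entry sample, not the 20-30 entry table described above; the WordPress functions it names are real, but the wording of each entry and the helper name are our own.

```python
# Hypothetical sample of a fix-pattern table keyed by WPCS sniff prefix.
FIX_PATTERNS = {
    "WordPress.Security.EscapeOutput": (
        "Escape at output for the context: esc_html() for text, esc_attr() "
        "for attribute values, esc_url() for URLs, wp_kses_post() for post HTML."
    ),
    "WordPress.Security.ValidatedSanitizedInput": (
        "Check isset(), then wp_unslash() and sanitize the value, for example "
        "with sanitize_text_field(), before use."
    ),
    "WordPress.Security.NonceVerification": (
        "Verify with wp_verify_nonce() or check_admin_referer() before "
        "reading request data."
    ),
    "WordPress.DB.PreparedSQL": (
        "Pass the query through $wpdb->prepare() with %d, %s, or %f placeholders."
    ),
}


def lookup_fix(source):
    """Return the pattern for the longest matching sniff prefix, or None."""
    matches = [prefix for prefix in FIX_PATTERNS
               if source == prefix or source.startswith(prefix + ".")]
    if not matches:
        return None  # unknown sniff: flag for human review, do not guess
    return FIX_PATTERNS[max(matches, key=len)]


print(lookup_fix("WordPress.DB.PreparedSQL.NotPrepared"))
print(lookup_fix("Squiz.Commenting.FunctionComment.Missing"))
```

Returning None for anything outside the table is the design choice that matters: an unknown violation is routed to a human instead of receiving an improvised fix. Matching on the prefix plus a dot also keeps a sniff such as WordPress.DB.PreparedSQLPlaceholders from silently inheriting the PreparedSQL entry.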
PHPStan suggestions from the skill are more cautious. For type annotation additions and simple null checks, the skill generates the fix directly. For anything involving interface implementation changes or inheritance hierarchy modifications, it flags the issue and explains the root cause without generating a fix. Those cases require human judgment about the right architectural response.
The WCAG explanations are the simplest part. axe-core rule identifiers like color-contrast, label, and aria-required-attr are opaque to most developers. The skill converts each rule to a one-sentence explanation: what it means, why it matters, and what the fix looks like. This is the part that most directly reduces the time from “error fired” to “developer understands what to fix.”
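The conversion from rule identifier to explanation can be sketched as a plain mapping applied to axe-core's JSON results. The explanation text below is our own wording rather than the skill's, and the function name and sample results are hypothetical; the violations, id, help, and nodes fields follow the axe-core results format.

```python
import json

# One-sentence explanations keyed by axe-core rule id. Wording is illustrative.
EXPLANATIONS = {
    "color-contrast": (
        "Text does not stand out enough from its background for low-vision "
        "users; raise the contrast to at least 4.5:1 for normal-size text."
    ),
    "label": (
        "A form field has no accessible name, so screen readers announce it "
        "as an unnamed input; add a label element or an aria-label."
    ),
    "aria-required-attr": (
        "An element declares an ARIA role but lacks an attribute that role "
        "requires; add the missing aria-* attribute."
    ),
}


def explain(axe_results_json):
    """Turn axe-core violations into one line per rule with an explanation."""
    results = json.loads(axe_results_json)
    lines = []
    for violation in results.get("violations", []):
        rule = violation.get("id", "")
        # Fall back to axe-core's own help text for rules not in the table.
        text = EXPLANATIONS.get(rule) or violation.get("help", "No explanation available.")
        count = len(violation.get("nodes", []))
        lines.append(f"{rule} ({count} element(s)): {text}")
    return lines


# Illustrative results, not output from a real scan.
SAMPLE_RESULTS = json.dumps({"violations": [
    {"id": "label", "impact": "critical",
     "help": "Form elements must have labels",
     "nodes": [{"target": ["#email"]}, {"target": ["#phone"]}]},
    {"id": "image-alt", "impact": "critical",
     "help": "Images must have alternative text",
     "nodes": [{"target": ["img.logo"]}]},
]})

for line in explain(SAMPLE_RESULTS):
    print(line)
```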
Measuring the Business Impact
There are three metrics worth tracking once the stack is in place. These are the ones that translate into business language for an agency or plugin business.
Bug escape rate to production. Track how many production bugs per sprint could have been caught by WPCS or PHPStan. In the first 90 days after deployment at Wbcom Designs, this dropped to near zero for the WPCS-catchable category. PHPStan-catchable bugs took about 60 days to reach near zero because the baseline cleanup sprint was still running.
Code review throughput. Track how long code reviews take on average. When WPCS and PHPStan violations are not in the PR diff, reviewers spend that time on design decisions instead. Our average code review time dropped from 45 minutes to 28 minutes over the same 90-day period. That is 17 minutes per PR, and we process about 8-12 PRs per week across the plugin suite, so the saving is roughly 2.3 to 3.4 hours of reviewer time per week. The time savings compound.
Accessibility issue density per release. WCAG checks in pre-commit do not catch everything, but they catch the same issues repeatedly: missing labels, contrast violations in the same templates, missing ARIA roles on the same components. Tracking the density per release confirms whether the pre-commit checks are actually preventing recurrence. At Wbcom Designs, after the initial cleanup sprint, density went to near zero for the auto-checkable category and has stayed there.
The business case reduces to this: the quality stack prevents a category of bugs from shipping. When bugs in that category do ship, they cost support time and reputation, and for security issues, a potential public disclosure. The stack configuration costs four hours to set up and 15-20 seconds per commit to run. The comparison is not close.
The Practical Case for 2026
AI coding tools generate code faster than any previous workflow. That changes the economics of quality enforcement. When developers write code faster, the volume of code to review grows faster. A quality gate that runs in 30 seconds at commit time scales with that volume in a way that human code review alone does not.
The pre-commit stack is not an alternative to code review. It is the prerequisite that makes code review viable at higher code volume. Without it, the review process either slows down (reviewers are checking style violations instead of design decisions) or degrades (reviewers approve PRs faster than they can actually review).
WPCS plus PHPStan plus WCAG, running in under 30 seconds, with AI triaging the output: this is a configuration any WordPress development team can have in production by end of week. The tooling is mature, the configuration is documented, and the results are measurable. There is no good reason to defer it.