Shipping and Maintaining Gutenberg Blocks at Portfolio Scale
BLOCK QUALITY FOUNDATION · 3-PART SERIES
Part 1: Anatomy & First Block · Part 2: Design Contract · → Part 3: Shipping & Audit
A block-quality foundation is only as strong as the gate that catches drift. The first two posts in this series covered the anatomy and the design contract. This post covers what happens after a block is built: how we ship it, how we keep it shippable when the underlying APIs change, and how we maintain dozens of blocks across two dozen products without the codebase rotting.
Part 1 walked the five-file anatomy and the seven-command setup. Part 2 walked the design contract: the four architectural boundaries, the 3-layer token model, the six component primitives, the twelve block non-negotiables, and the same-family rule. This is the third and final post. If you have not read the first two, the links are in the Cross-References section further down.
Deprecations: The Trap and the Pattern
The most expensive bug a custom block can ship is silent invalidation. A user inserts your block, customizes it, publishes, returns three months later, and the editor says “This block contains unexpected or invalid content.” The block disappears from the editor. The user loses trust.
This happens because the block’s saved markup changed between when the user saved and when they reopened. The editor compares the stored HTML to what save() would generate now, finds a mismatch, and refuses to render. WordPress is being defensive (you do not want the editor to silently overwrite custom content), but the user experience is the same as a crash.
The fix is straightforward in principle. Whenever you change save() markup, you add a deprecated entry that knows how to read the old format. Whenever you change attribute schema, you add a migrate function that converts old attributes to new ones. The mechanics:
deprecated: [
    {
        attributes: oldAttributes,
        save({ attributes }) {
            // the OLD save() markup, byte-for-byte
            // (the <div> wrapper here is an assumed placeholder)
            return <div>{attributes.content}</div>;
        },
        migrate(attributes) {
            // map old attributes to new shape if needed
            return { ...attributes, uniqueId: 'wbe-card-' + Date.now() };
        },
    },
]
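To make the lookup order concrete, here is a minimal sketch of how a deprecated list gets consulted. It is a simplified model, not the editor's code: strict string equality stands in for the editor's real validation, save functions return strings instead of elements, and `resolveBlock`, the `wbe-card` markup, and the sample attributes are all hypothetical.

```javascript
// Simplified model of deprecation resolution (names are hypothetical).
// Strict string equality stands in for the editor's own validation step.
function resolveBlock(storedHtml, storedAttributes, currentSave, deprecated) {
  // 1. If the current save() still reproduces the stored markup, nothing to do.
  if (currentSave(storedAttributes) === storedHtml) {
    return { valid: true, attributes: storedAttributes, migrated: false };
  }
  // 2. Otherwise try each deprecated entry in order.
  for (const entry of deprecated) {
    if (entry.save(storedAttributes) === storedHtml) {
      const attributes = entry.migrate
        ? entry.migrate(storedAttributes)
        : storedAttributes;
      return { valid: true, attributes, migrated: true };
    }
  }
  // 3. No entry matches: this is the "unexpected or invalid content" case.
  return { valid: false, attributes: storedAttributes, migrated: false };
}

// The current markup gained a modifier class; the deprecated entry keeps the old shape.
const currentSave = (a) => `<div class="wbe-card wbe-card--v2">${a.content}</div>`;
const deprecatedEntries = [
  { save: (a) => `<div class="wbe-card">${a.content}</div>` },
];

const storedHtml = '<div class="wbe-card">Hello</div>';
const resolved = resolveBlock(storedHtml, { content: 'Hello' }, currentSave, deprecatedEntries);
const unresolved = resolveBlock(storedHtml, { content: 'Hello' }, currentSave, []);

console.log(resolved.valid, resolved.migrated); // true true
console.log(unresolved.valid); // false
```

With the deprecated entry present the old post opens cleanly; with an empty list the same post lands in the invalid-content state.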
Three rules we enforce:
Every breaking save() change adds a deprecated entry. No exceptions. Even if you are confident no production sites use the block yet. Even if it is a single instance. The rule only works if it is applied every time.
The deprecated save() function must be byte-for-byte the old version. Common mistake: a developer paraphrases the old markup instead of copying it. The editor compares the actual saved HTML to the deprecated output, and even a small difference (a renamed class, a changed wrapper element, in some cases whitespace) breaks the match.
The migrate function handles attribute changes. If you renamed bgColor to backgroundColor, the migrate function reads bgColor from the old attributes and writes backgroundColor to the new ones. The block heals itself when the user reopens.
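The rename in the third rule can be sketched as a plain function. This is an illustration under assumptions: the attribute names come from the example above, and the `'brand'` value and the idempotence guard are additions for the sketch.

```javascript
// Hypothetical migrate() for the bgColor -> backgroundColor rename.
function migrate(attributes) {
  const { bgColor, ...rest } = attributes;
  // Prefer an existing backgroundColor so running the migration twice is harmless.
  return { ...rest, backgroundColor: rest.backgroundColor ?? bgColor };
}

const migrated = migrate({ bgColor: 'brand', content: 'Hi' });
console.log(migrated); // { content: 'Hi', backgroundColor: 'brand' }
```

Dropping the old key rather than carrying both keeps the new schema the single source of truth.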
Dynamic blocks (render.php) dodge this entire class of problem because the markup lives in PHP, not post_content. That is why Part 1 recommended dynamic as the default. The deprecation discipline is for the static blocks you still ship.
The Five Drift Patterns We Catch Most
Across the portfolio, the same five violations surface in 80% of plugins under audit. These are the rules teams forget first, the ones that creep back in during end-of-quarter pushes when the discipline slips.
Raw hex and px in CSS. A developer hardcodes one color in a hurry. The token system silently breaks for that block. Dark mode forgets it. The audit catches the value in source: every CSS file is grepped for #[0-9a-fA-F]{3,6} and [0-9]+px outside the allowlist (token definitions).
Inline <style> in PHP output. A quick fix during a bug becomes permanent. Content Security Policy fails because the inline style cannot be hashed. The style cannot be themed or overridden. The audit catches <style> literals in .php files outside specific sanitization-test fixtures.
Missing :focus-visible rings. A designer wanted cleaner buttons. Keyboard users lost the entire interface. The audit looks for any outline: none declaration without a replacement focus indicator within the same selector cascade.
Tap targets under 40px. A dense admin row tightens to 28px to fit more rows on screen. Thumb users on mobile cannot hit it. The audit measures button heights in CSS and flags anything below 40px without an opt-in density class.
Native alert() and confirm() calls. A JS developer reaches for the browser primitive. The shared modal toolkit exists for a reason: dark mode support, custom focus management, custom keyboard handling, screen-reader announcements. The audit catches alert(, confirm(, and prompt( in any JS file.
These five account for the majority of audit-time fixes. Knowing the list speeds up review. Most teams self-catch these once they have seen the audit report twice.
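Two of these checks are simple enough to sketch in a few lines. The version below is an illustration, not the audit script: it reuses the two patterns quoted above, treats any line that defines a custom property as the allowlist, and uses hypothetical function names.

```javascript
// Patterns quoted in the drift list above.
const HEX = /#[0-9a-fA-F]{3,6}/g;
const PX = /[0-9]+px/g;
// Assumption: a line that defines a custom property counts as a token definition.
const TOKEN_DEFINITION = /^\s*--[\w-]+\s*:/;

// Returns one entry per raw hex or px value found outside token definitions.
function findRawCssValues(cssText) {
  const violations = [];
  cssText.split('\n').forEach((text, index) => {
    if (TOKEN_DEFINITION.test(text)) return; // allowlisted line
    for (const pattern of [HEX, PX]) {
      for (const match of text.matchAll(pattern)) {
        violations.push({ line: index + 1, value: match[0] });
      }
    }
  });
  return violations;
}

// Returns the name of every native dialog call found in a JS source string.
function findNativeDialogs(jsText) {
  return [...jsText.matchAll(/\b(alert|confirm|prompt)\s*\(/g)].map((m) => m[1]);
}

const css = [
  '.wbe-card { color: #fff; }',
  ':root {',
  '  --wbe-space-2: 8px;',
  '}',
  '.wbe-btn { padding: 12px; }',
].join('\n');

console.log(findRawCssValues(css));
// [ { line: 1, value: '#fff' }, { line: 5, value: '12px' } ]
console.log(findNativeDialogs("if (confirm('Delete?')) { alert('Done'); }"));
// [ 'confirm', 'alert' ]
```

A pattern this blunt will also flag an ID selector such as `#facade`, which is one reason an allowlist matters.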
ux-audit.sh: The Local Gate
The audit script ships with every Wbcom plugin under bin/ux-audit.sh. It runs in roughly three seconds against a 500-file plugin. Exit code 0 means clean. Exit code 1 means at least one block-severity violation. Suitable for CI.
A typical run output:
$ bin/ux-audit.sh
## UX Audit Report - wb-listora
## Generated: 2026-05-23 - Scanned 487 files in 3.2s
[ PASS ] F1 No raw hex / px values
[ PASS ] F2 No inline <style>
The detailed report writes to audit/ux-audit-{date}.md with the exact file path and line number for every violation. A developer opens the report, opens the file at the line, fixes the value, re-runs. Three seconds later they know whether the fix worked.
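The relationship between violations, report lines, and the exit code can be sketched as follows. The violation shape, the `block` and `warn` severity names, and the file paths are assumptions for illustration; the real script's internals are not shown in this post.

```javascript
// Hypothetical violation shape: { rule, severity, file, line, message }.
function buildReport(violations) {
  const lines = violations.map((v) => {
    const label = v.severity === 'block' ? 'FAIL' : 'WARN';
    return `[ ${label} ] ${v.rule} ${v.file}:${v.line} ${v.message}`;
  });
  // Exit 1 only when at least one violation has block severity.
  const exitCode = violations.some((v) => v.severity === 'block') ? 1 : 0;
  return { lines, exitCode };
}

const report = buildReport([
  { rule: 'F1', severity: 'block', file: 'src/card/style.css', line: 14, message: 'raw hex value' },
  { rule: 'F1', severity: 'warn', file: 'src/card/editor.css', line: 3, message: 'raw px value' },
]);

console.log(report.lines.join('\n'));
console.log(report.exitCode); // 1
```

Because the output is a pure function of the violations, the same input always yields the same report and exit code.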
Two important properties:
The audit is deterministic. Same input, same output. No “the audit was flaky on Tuesday.” It is a grep pipeline plus some structured assertions. Either the rule violates or it does not.
The audit is portable. Every plugin ships its own copy of bin/ux-audit.sh. The script reads PREFIX from the plugin directory name (or accepts an override env var). A contributor cloning a plugin for the first time has the audit working immediately. No central server, no shared service, no version skew between plugins.
The script is open-source within the Wbcom plugin repos. The reference implementation lives at ~/.claude/skills/ux-audit/templates/ux-audit.sh for our team. The same logic can be ported to any agency that wants similar guardrails.
wp-plugin-qa MCP: The CI Gate
ux-audit.sh runs locally on a developer’s machine. The wp-plugin-qa MCP runs in CI. Both run on every PR. Together they form a two-layer net: the developer catches drift before pushing, the CI catches what the developer missed.
The MCP exposes about a dozen actions. The ones every team should know:
wppqa_check_plugin_dev_rules catches violations of our 16 admin rules: nonce without capability, $_POST iteration, browser alert/confirm, inline onclick, tap targets less than 40px, raw -1 inputs. Runs in about 121ms.
wppqa_check_rest_js_contract catches the silent class of bug where JS reads data.foo but PHP returns {bar}. The endpoint succeeds, the frontend renders empty state, the user reports “nothing happens.” We learned this from Learnomy v1.3.0. The MCP now catches it before merge.
wppqa_check_wiring_completeness catches settings that are saved to the database but that no template ever reads. “Setting saves but does nothing” is the most common gaslighting bug in WordPress plugins. The MCP traces every saved setting to at least one read site or fails the build.
wppqa_audit_plugin runs every check in one command. Pre-release gate. Must return failed=0.
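The REST/JS contract idea can be illustrated with a rough sketch. This is not how the MCP is implemented: it assumes the response lands in a variable literally named `data`, and that the keys PHP returns are already known as a list.

```javascript
// Collect every property read off a variable named `data` (e.g. data.foo).
function findDataReads(jsSource) {
  const reads = new Set();
  for (const match of jsSource.matchAll(/\bdata\.([A-Za-z_$][\w$]*)/g)) {
    reads.add(match[1]);
  }
  return [...reads];
}

// Keys the JS reads that the endpoint never returns.
function findContractMismatches(jsSource, responseKeys) {
  const returned = new Set(responseKeys);
  return findDataReads(jsSource).filter((key) => !returned.has(key));
}

const js = 'fetch(url).then((r) => r.json()).then((data) => render(data.foo, data.total));';
console.log(findContractMismatches(js, ['bar', 'total'])); // [ 'foo' ]
```

Here the endpoint returns `bar` while the frontend reads `foo`, which is the silent empty-state bug described above.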
Plugging the MCP into CI is one GitHub Actions step:
- name: Plugin QA audit
  run: |
    curl -sL https://wbcomdesigns.com/qa/install | bash
    wppqa audit --plugin-path .
Local invocation works the same way. The MCP catches mistakes faster than a code review can.
The 22-Item Implementation Checklist
The full checklist we use across every block PR. Print and pin next to your monitor. The audit catches whatever you miss, but catching it yourself is faster than waiting for the gate to fail your build.
Before you code (five items)
- Plugin slug and CSS prefix decided
- block.json on apiVersion 3
- Lucide icon chosen, no Dashicons
- Token CSS file scaffolded
- Variant picked: static or dynamic
While coding (six items)
- All CSS uses `--{prefix}-*` tokens
- Unique ID per block instance
- Per-side spacing object on every spacing attribute
- Hover state declared as attribute, not hardcoded
- InspectorControls grouped logically (Layout, Style, Advanced)
- All block output through `useBlockProps()`
Before merging (six items)
- Tap targets at least 40px on every interactive element
- `:focus-visible` ring on every interactive element
- Keyboard navigation tested with Tab key
- `prefers-reduced-motion` respected on all animations
- No native `alert`, `confirm`, or `prompt`
- No inline `<style>` in PHP output
Before shipping (five items)
- `npm run build` runs clean
- Verified at 390, 640, 1024, 1440 viewport widths
- Deprecation entry added if save() markup changed
- `ux-audit.sh` exits 0
- README updated with the new attribute or behaviour
That is the entire list. Twenty-two items, four phases. A senior block author runs through this in roughly five minutes. A junior developer runs through it in fifteen the first time, eight by the third block. The phases are the order of the work, so the checklist becomes a workflow rather than a separate exercise.
Going Deeper: Six Topics For Senior Developers
This series covered the foundation. Six topics did not fit in three posts. Each is a full meetup of its own. The wp-plugin-development skill in our internal repo covers every one with working code.
Interactivity API stores in practice. data-wp-* directives, state, actions, derived values. Killing jQuery is the easy part. Getting the store model right takes a week of practice. The store is what replaces shared global state across block instances.
apiVersion 2 to 3 migration playbook. The iframe breaks more than CSS. theme.css unavailable, global window gone, scrollparent quirks, focus management changes. We have a 14-failure-mode playbook.
Block context. providesContext and usesContext for nested layouts like Query to Post. Pattern for parent-child blocks that pass typed data without prop drilling.
Block transforms. transforms: { from, to } lets users convert classic shortcodes, raw HTML, or other blocks into yours without losing content. Skipping this breaks users’ posts during plugin transitions.
Plugging wp-plugin-qa MCP into your CI. Local pre-commit hook, GitHub Actions, GitLab. Same exit codes everywhere. The MCP is the difference between “we have rules” and “we enforce rules.”
Block patterns, variations, and styles. Three different concepts that get confused. Patterns ship pre-composed blocks. Variations ship pre-configured attributes. Styles ship CSS class options. Each has its place. A plugin that ships ten patterns when it should have shipped three variations is a plugin that taught its customers to mistrust the editor.
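As one small taste of the transforms topic, here is a sketch of a shortcode-to-block transform definition. The `wbe_card` shortcode and its `title` attribute are hypothetical; the object follows the shape of the `transforms.from` shortcode transform as we understand it from the block editor documentation, so check it against the current handbook before relying on it.

```javascript
// Hypothetical shortcode [wbe_card title="..."] converted into a block attribute.
const transforms = {
  from: [
    {
      type: 'shortcode',
      tag: 'wbe_card',
      attributes: {
        title: {
          type: 'string',
          // Receives the parsed shortcode attributes; `named` holds key="value" pairs.
          shortcode: ({ named: { title = '' } }) => title,
        },
      },
    },
  ],
};

// The mapper is a plain function, so it can be exercised outside the editor.
const mapTitle = transforms.from[0].attributes.title.shortcode;
console.log(mapTitle({ named: { title: 'Pricing' } })); // Pricing
```

The default value in the destructuring keeps a shortcode without a title from producing `undefined` in the block.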
What This Series Is For
The three posts in this series document a real working contract used across a real working portfolio. Nothing in here is hypothetical. The audit script, the MCP, the 22-item checklist, the six primitives, the 3-layer token model: every line of every section is in production at Wbcom Designs and has been for at least 18 months.
The intent is not to publish our methodology so you adopt it verbatim. The intent is to make the case that block development at scale needs a shared contract. The specific tokens, primitives, and rules are not the point. The discipline of having a contract is.
If you ship one custom block per quarter, you can keep it in your head. If you ship a hundred blocks across two dozen plugins, you need a contract. Without one, every block drifts toward its author’s preferences. With one, the portfolio stays coherent year after year.
Adopt ours. Adapt ours. Write your own. The mechanics are documented across these three posts. The detailed reference for each section lives in our wp-plugin-development skill, which is open within our team and available to clients who engage us.
Cross-References
Part 1: The Block-Quality Foundation: How We Ship Gutenberg Blocks That Last
Part 2: The Block Design Contract: Tokens, Primitives, and the Same-Family Rule
If you are an agency, plugin team, or in-house WordPress group looking to adopt a similar foundation across your own work, Wbcom Designs takes on these engagements directly. Block-quality audits, custom integrations, full-stack plugin engineering. Contact details on wbcomdesigns.com.
What We Learned in Eighteen Months of Running This Pipeline
The audit pipeline shipped to the full Wbcom portfolio in mid-2024. Eighteen months of running it in production surfaced lessons we did not predict at the planning stage. Five of them changed how we think about quality enforcement.
Lesson one: the audit is only as honest as its escape hatches. Every quality gate that allowed a bypass became a permanent bypass. The first version of ux-audit.sh had a BYPASS_AUDIT=1 environment variable for “emergency releases.” Within a month, every release was an emergency. We removed the variable. The audit either passes or the merge does not happen. The only acceptable bypass is at the human level, where a lead developer reviews the specific failure, decides it is acceptable for the release, and signs off in the PR description. The bypass is human judgment, never a flag.
Lesson two: false positives kill adoption faster than missed bugs. The first audit script had a few overly-aggressive regex patterns. One flagged any width: 100% as a potential layout violation, which fired on every block that legitimately wanted full-width behavior. Developers learned to ignore the audit output because it cried wolf. We tightened the patterns immediately. The current audit fires false positives roughly 1 in 200 runs. Developers trust the output, fix every real flag, and the foundation stays honest.
Lesson three: the MCP saves more time on cross-plugin work than within a single plugin. The expected value of wp-plugin-qa MCP was “catch mistakes inside one plugin’s PRs.” The actual value was much higher when a developer moved between plugins. Switching from Jetonomy to BuddyPress Polls used to mean re-learning which capability checks were enforced, which sanitization layer was canonical, which option-storage pattern was the local convention. The MCP normalized all of that. A PR that passes the MCP on Jetonomy passes the MCP on Polls. Developers move between codebases at twice the velocity they used to.
Lesson four: the audit pipeline is the documentation. We have thousands of words of internal documentation. Developers read maybe 10% of it. The audit script’s error messages, on the other hand, get read every single time. We invested a quarter in rewriting every audit error message to be self-explanatory: what failed, why it matters, the canonical fix, the location in the foundation reference. The error messages became the documentation new developers actually consume. The prose documentation is now mostly for onboarding and incident retrospectives. The error messages are for the daily work.
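The four-part message structure from lesson four can be sketched as a formatter. The field names, the sample rule text, and the reference path are illustrative assumptions, not the wording of the real audit messages.

```javascript
// Hypothetical formatter: what failed, why it matters, canonical fix, reference.
function formatAuditError({ rule, file, line, why, fix, reference }) {
  return [
    `${rule} failed at ${file}:${line}`,
    `Why it matters: ${why}`,
    `Canonical fix: ${fix}`,
    `Reference: ${reference}`,
  ].join('\n');
}

const message = formatAuditError({
  rule: 'F1 No raw hex / px values',
  file: 'src/card/style.css',
  line: 14,
  why: 'hardcoded colors bypass the token system, so dark mode misses them',
  fix: 'replace the literal with a design token',
  reference: 'foundation reference, tokens section',
});

console.log(message);
```

Keeping every message to the same four lines means a developer learns where to look once and never has to again.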
Lesson five: AI tooling exposes audit weaknesses, then helps fix them. When Claude Code and Cursor started generating WordPress blocks for us in late 2024, the audit caught a new class of failure: AI-generated code often had perfect-looking patterns that violated the rules in subtle ways. An InspectorControls panel with the wrong capability check. A render.php with _e() instead of esc_html_e(). The AI did not invent these violations; it pattern-matched to outdated WordPress tutorials in its training data. The audit catches them all. And once we saw the pattern, we extended the audit to flag the specific AI-frequent mistakes more aggressively. The result: AI-assisted development became safer than human-only development for routine block work, because the AI got faster but the gate stayed honest.
Common Questions From Teams Adopting This Pipeline
The same questions surface when an agency or in-house team adopts the audit pipeline. Capturing the most common ones here.
Will this slow down our team? The first month, yes. By a noticeable amount. The audit catches mistakes that used to ship to production and cost more time later. The trade is real but the math is heavily in favor of the audit. Six weeks in, the team is faster than before because the same mistakes do not get made twice.
Can we adopt this gradually? Yes. Start with the audit in warning mode (exit code 0 even on violations, but output the report). Let developers see what would fail without blocking merges. After two weeks, switch to error mode on a subset of rules (the easiest fifty percent). Two weeks later, switch the rest to error mode. Total adoption window: four weeks. We have walked three external teams through this exact rollout. It works.
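The staged rollout reduces to one question: which rules are allowed to fail the build this week. A minimal sketch, assuming violations carry a rule ID and that rule IDs follow the F1, F2 style seen in the sample output (the IDs beyond F2 are invented for illustration):

```javascript
// Exit 1 only when a violation belongs to a rule that is currently enforced.
function exitCodeFor(violations, enforcedRules) {
  const enforced = new Set(enforcedRules);
  return violations.some((v) => enforced.has(v.rule)) ? 1 : 0;
}

const violations = [{ rule: 'F4' }];

const warningMode = exitCodeFor(violations, []); // weeks 1-2: report only
const partialMode = exitCodeFor(violations, ['F1', 'F2']); // weeks 3-4: easiest rules
const fullMode = exitCodeFor(violations, ['F1', 'F2', 'F3', 'F4', 'F5']); // after week 4

console.log(warningMode, partialMode, fullMode); // 0 0 1
```

The report is produced in every phase; only the enforced list changes, so developers see the same findings before and after a rule starts blocking.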
What if we use Tailwind instead of token-based CSS? Tailwind is a different approach to the same problem (constrained vocabulary, predictable composition). The audit would need different rules but the philosophy holds. We have shipped Tailwind-based blocks for clients who require it. The audit catches arbitrary values ([24px], [#3B82F6]) the same way it catches raw hex and px in token-based CSS. The form differs; the discipline does not.
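A sketch of the Tailwind variant of the check, assuming arbitrary values appear in class strings as bracketed hex or px literals. The pattern and function name are illustrative and would need widening for rem, percentages, and other units.

```javascript
// Matches Tailwind arbitrary values such as [24px] or [#3B82F6].
const ARBITRARY_VALUE = /\[(#[0-9a-fA-F]{3,8}|[0-9.]+px)\]/g;

function findArbitraryValues(markup) {
  return [...markup.matchAll(ARBITRARY_VALUE)].map((match) => match[0]);
}

const markup = '<div class="p-[24px] bg-[#3B82F6] mt-4">Card</div>';
console.log(findArbitraryValues(markup)); // [ '[24px]', '[#3B82F6]' ]
```

Scale utilities such as `mt-4` pass untouched, which mirrors how token references pass in the token-based audit.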
Do we need the MCP to make this work? No, but you need automation of some kind. We started with ux-audit.sh alone and ran it as a pre-commit hook. That worked. The MCP is a faster version of the same enforcement loop. Either way, the rule is: no human-only review. A human reviewer is a sampler at best.
How long until our team is shipping at full velocity again? Six to eight weeks from rollout, in our experience. The audit reveals existing technical debt during the first month, which slows things down because teams have to clean up before new work. Once the debt is paid, the velocity rises above the pre-audit baseline because the team stops re-fixing the same class of bug.
A Note on Calibration
The numbers throughout this series are real numbers from our portfolio. The 121ms audit time is the median over the last 90 days of CI runs. The 80% drift pattern frequency is from the last quarterly portfolio audit (we audit our own work the same way we audit clients). The 100K+ active install number is the Wbcom portfolio installation footprint as of May 2026.
We share these numbers because the discipline is real and verifiable. If you adopt this foundation and find your numbers are dramatically different, that is interesting data. We learn from agencies and in-house teams that report back. The contact details are at wbcomdesigns.com if you want to compare notes.
The Last Word
A block-quality foundation is not a one-time deliverable. It is a living contract between the developers who build it, the auditors who enforce it, and the future maintainers who inherit it. Every plugin you ship is a vote for whether this contract gets honored. The audit makes the vote visible.
Three posts, eighteen months of accumulated lessons, two dozen plugins, hundreds of blocks. Adopt this verbatim or build your own. Either way, the future of WordPress block development at scale lives behind some kind of contract. The teams that ship with a contract are the teams whose blocks still work in 2028. The teams that ship without one are the teams whose blocks customers replaced in 2027.
Pick your side. Then start the work.