
The production Claude Code workflow — skills, hooks, memory.

The Claude Code setup that actually ships software in 2026. Skills bind the model to your conventions. Hooks gate dangerous operations. Memory persists project facts. Eight stages from project setup to session handoff — the workflow that turns AI-assisted into AI-augmented engineering.

8-stage production workflow · This site built with the same workflow · 91,000+ pages on one property using it · Updated 2026-05-13

KEY FACTS · 2026

  • A production Claude Code workflow in 2026 depends on three artifacts: a skills library (.claude/skills/), a hooks configuration (.claude/settings.json), and a memory system (.claude/memory/). Without all three, output drifts and production safety degrades.
  • Skills are reusable instruction sets that bind the model to project conventions — testing framework, deploy commands, file structure, naming patterns. Stored as Markdown files; loaded automatically based on context relevance.
  • Hooks are shell commands that the harness executes around tool calls: pre-commit checks, post-write type-checking, refusal of unsigned pushes. They are the mechanism for production safety; memory and skills alone do not enforce anything.
  • Memory files persist project-specific facts between sessions. The right entries answer "what would surprise a new engineer about this codebase?" — not basic conventions (those go in skills).
  • Sub-agents (Task tool) let the main session delegate research, parallel builds, and code review to specialised agents. Keeps the primary context lean and the work parallelisable.
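
The three artifacts listed above can be scaffolded in a few lines. The sketch below is a minimal illustration, not the author's actual setup: the hook script path `.claude/hooks/gate_commit.py` is hypothetical, and the nested `hooks` / `PreToolUse` / `matcher` layout is an assumption about the settings schema that should be checked against the current Claude Code documentation before use.

```python
import json
from pathlib import Path

# Hypothetical hook script path; the source does not name one.
HOOK_COMMAND = "python3 .claude/hooks/gate_commit.py"


def build_settings() -> dict:
    """Return a settings.json payload that registers one hook on shell calls.

    The "hooks" -> event -> matcher -> command shape is an assumption
    about the settings schema; verify it against the current docs.
    """
    return {
        "hooks": {
            "PreToolUse": [
                {
                    "matcher": "Bash",
                    "hooks": [{"type": "command", "command": HOOK_COMMAND}],
                }
            ]
        }
    }


def scaffold(repo_root: Path) -> list[Path]:
    """Create .claude/skills, .claude/memory and settings.json if missing.

    Returns only the paths that were newly created, so a second run
    on the same repository returns an empty list.
    """
    claude_dir = repo_root / ".claude"
    created = []
    for name in ("skills", "memory"):
        directory = claude_dir / name
        if not directory.exists():
            directory.mkdir(parents=True)
            created.append(directory)
    settings_path = claude_dir / "settings.json"
    if not settings_path.exists():
        settings_path.write_text(json.dumps(build_settings(), indent=2) + "\n")
        created.append(settings_path)
    return created
```

Calling `scaffold(Path.cwd())` from the repository root creates whatever is missing and leaves existing files untouched, which keeps it safe to re-run before the directory is committed.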

THE EIGHT-STAGE WORKFLOW

  1. Project setup before commit one. Initialise the .claude/ directory before writing code. Create skills/ for project conventions (testing, deploying, schema patterns), settings.json with hooks (typecheck on save, refuse direct prod writes, gate pushes through preview deploys), and memory/ with seeded facts about the codebase. Treat this setup as you would tsconfig or eslintrc: checked into the repo, reviewed in PRs.
  2. Session start: Plan mode for non-trivial work. For any task beyond a one-line fix, open Plan mode (write a multi-step plan, get user approval, then execute). Plan mode forces explicit thinking before tool calls; the failure mode of vibe-coding is skipping this step.
  3. TDD enforced via skill. Add a tdd skill that loads on any "implement X" prompt. Red-green-refactor loop: write a failing test, see it fail, implement, see it pass, refactor. The skill enforces it; the engineer verifies each turn.
  4. Sub-agent delegation for parallelisable work. When work splits cleanly (multiple file changes, parallel research, independent reviews), dispatch sub-agents in a single message with multiple Task tool calls. The main context stays focused on synthesis; sub-agents return condensed results.
  5. Hook-gated commits. A pre-commit hook runs typecheck + lint + relevant tests and refuses the commit on failure. The model cannot bypass it: the hook executes regardless of session state. Bypass requires explicit human override via the CLI.
  6. Preview deploys gate production. Every PR creates a Vercel or Netlify preview deploy. Production merge happens only after manual QA on the preview. A settings.json hook refuses pushes to main from a branch that lacks a preview URL.
  7. Post-merge memory update. After significant work merges, update .claude/memory/ with anything that surprised the agent or required correction. Surprising findings become part of the project knowledge graph; future sessions inherit them.
  8. Session handoff via written summary. End every session with a one-paragraph summary in HANDOFF-YYYY-MM-DD.md. It captures what shipped, what is in flight, and what the next session needs to verify. The next session (yours or a teammate's) starts informed.
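
Stage 5 can be sketched as a small hook script. This is a hedged illustration under stated assumptions, not the site's actual hook: it assumes the harness passes the pending tool call as JSON on stdin with `tool_name` and `tool_input.command` fields, and that exit code 2 blocks the call while 0 allows it. Both should be confirmed against the current hooks documentation. The `npm run typecheck` and `npm run lint` commands are hypothetical placeholders for the project's own checks.

```python
import json
import subprocess
import sys
from typing import Callable, Sequence

# Hypothetical check commands; substitute the project's own.
CHECKS: Sequence[Sequence[str]] = (
    ("npm", "run", "typecheck"),
    ("npm", "run", "lint"),
)


def checks_pass() -> bool:
    """Run each check in order; stop and return False at the first failure."""
    for cmd in CHECKS:
        try:
            if subprocess.run(cmd).returncode != 0:
                return False
        except FileNotFoundError:
            return False  # a missing tool counts as a failed check
    return True


def is_commit(payload: dict) -> bool:
    """True when the pending tool call is a shell command containing 'git commit'."""
    if payload.get("tool_name") != "Bash":
        return False
    command = (payload.get("tool_input") or {}).get("command", "")
    return "git commit" in command


def decide(payload: dict, run_checks: Callable[[], bool] = checks_pass) -> int:
    """Return the hook's exit code: 0 allows the call, 2 blocks it (assumed)."""
    if is_commit(payload) and not run_checks():
        print("Commit blocked: typecheck or lint failed.", file=sys.stderr)
        return 2
    return 0


def main() -> int:
    """Entry point: read the tool-call payload as JSON from stdin."""
    return decide(json.load(sys.stdin))
```

A deployed script would end with `sys.exit(main())`. Keeping the decision in `decide` with an injectable `run_checks` means the gate logic can be unit-tested without running the real toolchain.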

WHAT BELONGS IN A SKILLS LIBRARY

A mature production project carries 12-20 skills. Sample entries (live from this site's .claude/skills/ directory):

tdd

Enforces red-green-refactor on any feature work. Loaded automatically on "implement X" prompts.

verification-before-completion

Refuses "task complete" claims without running verification commands and showing output.

systematic-debugging

Forces hypothesis-driven debugging — write the hypothesis, design the minimal test, execute, narrow further. Stops trial-and-error spiraling.

writing-plans

Used in Plan mode. Structures multi-step plans with named files, line numbers, and verification commands.

requesting-code-review

Standardises the code review process — what to check, when to escalate, when to merge.

using-git-worktrees

For parallel branches without polluting the main checkout. Critical for stack-based PR workflows.

dispatching-parallel-agents

Recipe for sending multiple sub-agents in a single message — the parallelism primitive.

finishing-a-development-branch

Branch closure checklist — merge, archive, clean up, update memory.
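
To make the shape of a skill file concrete, here is a minimal sketch that renders the tdd entry as Markdown. The front-matter keys `name` and `description` and the `<skill-name>/SKILL.md` layout are assumptions about what the loader expects, so check them against the current skills documentation. The step wording is taken from the red-green-refactor loop described in the workflow; the helper names are hypothetical.

```python
from pathlib import Path


def render_skill(name: str, description: str, steps: list[str]) -> str:
    """Render a skill as Markdown with a small front-matter header.

    The `name` and `description` keys are an assumption about what the
    loader reads; the numbered steps form the instruction body.
    Descriptions containing a colon would need YAML quoting.
    """
    header = f"---\nname: {name}\ndescription: {description}\n---\n"
    body = "\n".join(f"{i}. {step}" for i, step in enumerate(steps, start=1))
    return f"{header}\n# {name}\n\n{body}\n"


def write_skill(skills_dir: Path, name: str, description: str, steps: list[str]) -> Path:
    """Write the skill to <skills_dir>/<name>/SKILL.md and return the path."""
    path = skills_dir / name / "SKILL.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(render_skill(name, description, steps))
    return path


# The red-green-refactor loop, one instruction per step.
TDD_STEPS = [
    "Write a failing test.",
    "Run it and confirm it fails.",
    "Implement the change.",
    "Run the test and confirm it passes.",
    "Refactor with the tests still green.",
]
```

Calling `write_skill(Path(".claude/skills"), "tdd", "Enforces red-green-refactor on any feature work.", TDD_STEPS)` produces one reviewable file per skill, which fits the advice to check the library into the repo.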

WHAT BELONGS IN MEMORY

Five categories worth persisting. Basic conventions go in skills, not memory. Session-specific in-progress work goes in tasks. Memory is for surprising, project-specific facts that future sessions must know.

Stack-specific quirks

Example: "On Astro, locale regex must list longer locales first — zh-Hant before zh, or routing silently breaks."

Deploy + CI commands

Example: "Production deploy via main merge → Netlify auto-builds. Never push to main with failing typecheck — pre-commit hook should catch it but if bypassed it lands as a failed deploy."

Stakeholder preferences

Example: "Gautam prefers terse responses, no trailing summaries, em-dashes ok on service pages but stripped on blog posts."

Recurring bug patterns

Example: "CSP must list every third-party origin per resource type. Adding new media source without media-src silently blocks video playback."

Reference URLs

Example: "Admin panel at /admin/seo/. LLM tracker at /admin/llm-mentions/. Seed prompts in sql/seo-llm-mentions.sql."
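
One way to keep the five categories separate is a file per category with dated entries. The sketch below is an illustration only: the file names and the `add_memory` helper are hypothetical, and the source does not say how its memory directory is organised internally.

```python
from datetime import date
from pathlib import Path
from typing import Optional

# File names are hypothetical; the five categories come from the section above.
CATEGORY_FILES = {
    "stack-quirks": "stack-quirks.md",
    "deploy-ci": "deploy-ci.md",
    "stakeholder-preferences": "stakeholder-preferences.md",
    "bug-patterns": "bug-patterns.md",
    "reference-urls": "reference-urls.md",
}


def add_memory(
    memory_dir: Path, category: str, fact: str, today: Optional[date] = None
) -> Path:
    """Append one dated fact to its category file.

    Facts whose text already appears in the file are skipped, and
    categories outside the five above are rejected, which keeps basic
    conventions (skills) and in-progress work (tasks) out of memory.
    """
    if category not in CATEGORY_FILES:
        raise ValueError(f"unknown memory category: {category}")
    memory_dir.mkdir(parents=True, exist_ok=True)
    path = memory_dir / CATEGORY_FILES[category]
    existing = path.read_text() if path.exists() else ""
    if fact in existing:
        return path  # already recorded
    stamp = (today or date.today()).isoformat()
    with path.open("a") as handle:
        handle.write(f"- {stamp}: {fact}\n")
    return path
```

For example, `add_memory(Path(".claude/memory"), "stack-quirks", "Locale regex must list zh-Hant before zh.")` appends a single dated line, so the history of surprises stays readable in code review.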

THE WORKFLOW IN ONE SENTENCE

Plan before tool calls, skills enforce conventions, hooks gate destructive operations, sub-agents handle parallelisable work, tests run on real CI, preview deploys gate production, memory captures surprises, written handoffs let the next session start informed.
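
The final clause, written handoffs, refers to the HANDOFF-YYYY-MM-DD.md file from stage 8. A minimal sketch of generating it follows. Note one deliberate deviation: the workflow asks for a one-paragraph summary, while this sketch uses three short lists (shipped, in flight, to verify) because they are easier to produce programmatically. The function names are hypothetical.

```python
from datetime import date
from pathlib import Path
from typing import Optional


def render_handoff(shipped: list[str], in_flight: list[str], to_verify: list[str]) -> str:
    """Render the three handoff sections as plain Markdown lists."""
    sections = (
        ("Shipped", shipped),
        ("In flight", in_flight),
        ("Verify next session", to_verify),
    )
    parts = []
    for title, items in sections:
        # An empty section is stated explicitly rather than omitted.
        lines = [f"- {item}" for item in items] or ["- (none)"]
        parts.append(f"## {title}\n" + "\n".join(lines))
    return "\n\n".join(parts) + "\n"


def write_handoff(
    directory: Path,
    shipped: list[str],
    in_flight: list[str],
    to_verify: list[str],
    today: Optional[date] = None,
) -> Path:
    """Write HANDOFF-YYYY-MM-DD.md into `directory` and return its path."""
    day = (today or date.today()).isoformat()
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"HANDOFF-{day}.md"
    path.write_text(render_handoff(shipped, in_flight, to_verify))
    return path
```

Printing "(none)" for an empty section lets the next session distinguish "nothing in flight" from "the author forgot to fill this in".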

FREQUENTLY ASKED QUESTIONS

What is the right Claude Code workflow for a production team in 2026?

Three layers. Layer one: .claude/skills/ with reusable instruction sets bound to your conventions (testing framework, deploy commands, naming, file structure). Layer two: .claude/settings.json with hooks gating dangerous operations (no direct prod writes, no pushes without preview deploys, typecheck on save). Layer three: .claude/memory/ with project-specific facts that persist across sessions. Plus Plan mode on non-trivial work, TDD enforced via skill, sub-agent delegation for parallel work, hook-gated commits, preview deploys before merge, post-merge memory updates, and written session handoffs.

What are Claude Code skills and why do they matter?

Skills are reusable instruction sets in .claude/skills/ that bind the model to your project conventions. They load automatically when the context is relevant, so a TDD skill kicks in on "implement X" prompts and a deployment skill kicks in on "ship" prompts. Without skills, the model relies on its general training and drifts toward generic patterns. With skills, the output matches your stack, your testing framework, your deploy story. Skills are the single biggest lever for output consistency in production.

What are hooks and what do they prevent?

Hooks are shell commands defined in .claude/settings.json that the harness executes around tool calls. Pre-commit hooks run typecheck + lint + tests; post-write hooks run formatters; user-prompt-submit hooks can inject project context into every prompt. Critically, hooks execute regardless of session state, so the model cannot bypass them. This is the mechanism for production safety. Skills and memory shape behaviour; hooks enforce it.

What goes in Claude Code memory?

Project-specific facts that would surprise a new engineer. Stack quirks ("on Astro locale regex order matters"), deploy commands ("production deploys via main merge"), stakeholder preferences ("user prefers terse responses, no trailing summaries"), recurring bug patterns ("CSP must list every third-party origin"), and reference URLs to internal admin pages and important files. What does NOT go in memory: basic conventions (those go in skills), session-specific in-progress work (that goes in tasks or commit messages), and anything documented elsewhere in the codebase.

When should I use sub-agents (Task tool) vs the main session?

Use sub-agents for parallelisable work that benefits from condensed output: research, multi-file investigations, independent code reviews. The main context stays lean; sub-agents return synthesised findings. Stay in the main session for sequential work where context carries forward: feature builds, debugging, anything where each step depends on the previous one. The sub-agent worth using is the one whose results you can act on without re-running the same analysis yourself.

Is vibe-coding production-safe?

No. "Vibe-coding", meaning AI-generated code shipped without human review or a harness, fails in production at significantly higher rates than traditional engineering or harness-protected Claude Code work. The failure modes: type drift across files, silent test omission, security regressions from generated boilerplate, plugin-injection vulnerabilities. The fix is not avoiding AI tooling; it is using the harness (skills, hooks, memory, senior review) that turns AI-assisted into AI-augmented.

What is the difference between Claude Code workflow and Cursor workflow?

Claude Code is a CLI-first agentic tool: long sessions, multi-file work, sub-agent delegation, harness-based safety. Cursor is an IDE-first completion + chat tool: inline suggestions, single-file edits, lightweight refactors. The workflow shape is different: Claude Code suits agency-scale delivery and multi-file refactors; Cursor suits IDE-native focused sessions. Most senior engineers run both. Detailed comparison: Claude Code vs Cursor for production teams.

How long does it take to set up a production Claude Code workflow?

Initial setup: 4-8 hours for a single repository, covering a skills library seeded with project conventions, hooks configured in settings.json, and memory primed with the first batch of project facts. Mature workflow: 2-4 weeks of regular use, during which the skills library grows from 3-5 entries to 12-20, the hooks tighten as failure modes surface, and the memory accumulates the surprising facts that should have been there from day one. Past the 4-week point, the workflow is stable and the marginal setup time per new project drops to 1-2 hours.

APPLY THIS WORKFLOW TO YOUR TEAM

Book a 30-minute workflow call. Bring your stack, your current AI tooling, and your team size. By the end of the call you will have a written .claude/ scaffold for your specific repo, a starter skills library, and the hooks that prevent the failure modes most likely to bite you. If you want hands-on help setting it up, that becomes a 1-2 week engagement at the standard rate.
