Process Factory Architecture v1.0
Config-driven, block-based execution engine for transparent, auditable pipeline runs
Overview
The Process Factory Engine replaces ad-hoc skill invocation with tracked, stateful DAG execution. Every skill, ingest job, or analysis task becomes a Block — a universal execution unit with typed inputs, outputs, dependencies, gates, and thresholds.
Why This Exists
After building the full 18-skill clone pipeline (Run #1: GREEN 91.9%), we identified that the same transparency/auditability problem exists when building anything, not just running extractions. The solution: a config-driven engine that makes every pipeline run observable, replayable, and auditable.
Core Principles
- Config over code — Pipeline topology defined in JSON, not hardcoded
- Stateless scheduler — Given current block states, computes readiness. No memory needed.
- One schema, three patterns — Block v2 handles DAG batch, stream/event, and analytical lens
- Human-in-the-loop — Gates before or after any block. Approve/reject with feedback.
- Threshold enforcement — Auto-retry below minimum, fail after max attempts
- Modular architecture — Engine tracks state. Claude Code/FORGE executes skills. Dashboard observes.
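The threshold-enforcement principle can be read as a small pure decision function. The sketch below is illustrative only and is not taken from the engine: `decideAfterScore` and `Decision` are hypothetical names, the metric name "score" is a placeholder, and it assumes `attempts` counts runs completed so far and that a block may run at most `1 + max_retries` times.

```typescript
// Hypothetical sketch of threshold enforcement; names are illustrative only.
type Threshold = { metric: string; min: number };
type Decision = "done" | "retry" | "failed";

function decideAfterScore(
  score: number,
  threshold: Threshold | null,
  attempts: number,   // runs completed so far, including this one (assumption)
  maxRetries: number, // e.g. 2, as in the Block v2 schema comment
  retryable: boolean = true,
): Decision {
  // No threshold configured, or the score meets the minimum: accept the result.
  if (threshold === null || score >= threshold.min) return "done";
  // Below the minimum: retry while attempts remain (1 initial run + maxRetries).
  if (retryable && attempts < 1 + maxRetries) return "retry";
  // Out of attempts, or the block is not retryable.
  return "failed";
}

// The 85% minimum is the clone-tester threshold; "score" is an assumed metric name.
const cloneTesterThreshold: Threshold = { metric: "score", min: 85 };
const afterFirstRun = decideAfterScore(80, cloneTesterThreshold, 1, 2);
const afterThirdRun = decideAfterScore(80, cloneTesterThreshold, 3, 2);
const passingRun = decideAfterScore(91.9, cloneTesterThreshold, 1, 2);
```

Keeping the decision free of side effects fits the stateless-scheduler principle: the same inputs always yield the same outcome, so a run can be replayed from recorded scores.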
Block v2 Schema
The universal execution unit. Every skill, ingest job, or analysis task = one block. 11 field groups covering all three execution patterns.
{
  // ── Identity ──
  id: string // "soul-extractor"
  name: string // "Soul Extractor"
  layer: string // "L2"
  type: enum // extract|validate|build|orchestrate|research|ingest

  // ── Execution ──
  skill: string // maps to SKILL.md
  model: string|null // "opus"|"sonnet"|null (inherit)
  timeout: number // 300000 (ms)
  runtime: enum // forge|claude-cli|mastery-api|nowpage

  // ── Trigger ──
  trigger: {
    type: enum // dependency|cron|webhook|event|manual
    cron: string|null // "0 6 * * *" (Pattern 2)
    event: string|null // "entity.created(gate=2)" (Pattern 2)
    webhook: string|null // "/api/ingest/youtube" (Pattern 2)
  }

  // ── Inputs ──
  inputs: {
    depends: [{block, key}] // wiring from other blocks
    files: [glob] // workspace file patterns
    env: [string] // required env vars
    config: {k:v} // runtime config overrides
  }

  // ── Outputs ──
  outputs: {
    artifacts: [{key, path, format}]
    metrics: [{key, type}]
    publish: [{domain, slug}]|null
  }

  // ── Execution Control ──
  execution: {
    parallel_group: string|null // "L2-extractors"
    human_gate: enum // before|after|none
    threshold: {metric, min}|null
    retryable: boolean
    max_retries: number // 2
  }

  // ── Lifecycle (Pattern 1: one-shot, 2: daemon, 3: on-demand) ──
  lifecycle: { mode: enum } // one-shot|daemon|on-demand

  // ── Scope ──
  scope: {
    gate: 1|2|3|4|null // Data Gate tier
    run_params: {k:v} // per-run overrides
  }

  // ── Lens (Pattern 3 only) ──
  lens: {
    perspective: string|null // "positioning"|"gap_analysis"
    aggregates: [string]|null // ["entity_id"]
    feedback_loop: boolean
  }

  // ── Record State (Pattern 2 only) ──
  record_state: {
    track_per: string|null // "content_id"|"entity_id"
    states: [string] // ["raw","annotated","embedded","scored"]
  }

  // ── Runtime State (engine-managed, not in config) ──
  state: {
    status: enum // pending|ready|running|done|failed|blocked|waiting
    started_at: timestamp|null
    ended_at: timestamp|null
    score: number|null
    error: string|null
    attempts: number
  }
}
Design Decisions
- 11 field groups: one definition covers all three patterns
- Null fields for inactive patterns: Pattern 1 blocks have lens: null and record_state: null
- Runtime state is engine-managed: config defines the block, engine manages the state
- Typed wiring: depends: [{block: "soul-extractor", key: "soul-json"}] creates explicit data contracts
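To make the typed-wiring idea concrete, here is a minimal sketch of a check that every `depends` entry points at a block that exists and an artifact key that block declares. `WiredBlock` and `findUnresolvedDeps` are hypothetical names, the type keeps only the fields needed for the check, and the artifact paths are placeholders; the real template validation may work differently.

```typescript
// Hypothetical, trimmed-down view of a block: only the fields needed for wiring checks.
type WiredBlock = {
  id: string;
  inputs: { depends: { block: string; key: string }[] };
  outputs: { artifacts: { key: string; path: string; format: string }[] };
};

// Return one message per dependency that points at a missing block or artifact key.
function findUnresolvedDeps(blocks: WiredBlock[]): string[] {
  // Index the artifact keys each block declares.
  const artifactKeys = new Map<string, Set<string>>();
  for (const b of blocks) {
    artifactKeys.set(b.id, new Set(b.outputs.artifacts.map((a) => a.key)));
  }
  const problems: string[] = [];
  for (const b of blocks) {
    for (const dep of b.inputs.depends) {
      const keys = artifactKeys.get(dep.block);
      if (keys === undefined) {
        problems.push(`${b.id}: unknown block "${dep.block}"`);
      } else if (!keys.has(dep.key)) {
        problems.push(`${b.id}: block "${dep.block}" has no artifact "${dep.key}"`);
      }
    }
  }
  return problems;
}

// Two blocks wired as in the depends example above; paths are placeholders.
const wiringExample: WiredBlock[] = [
  {
    id: "soul-extractor",
    inputs: { depends: [] },
    outputs: { artifacts: [{ key: "soul-json", path: "soul.json", format: "json" }] },
  },
  {
    id: "gap-analyzer",
    inputs: { depends: [{ block: "soul-extractor", key: "soul-json" }] },
    outputs: { artifacts: [{ key: "gaps-json", path: "gaps.json", format: "json" }] },
  },
];
const wiringProblems = findUnresolvedDeps(wiringExample); // empty when all wiring resolves
```

An empty result means every consumer names a producer and key that actually exist, which is one way to read "explicit data contracts".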
Three Execution Patterns
All use the same Block v2 schema. Only the scheduler behavior differs.
Pattern 1: DAG Batch
PRD: PRD 4 (Expert Extraction)
Trigger: dependency (block completion)
Lifecycle: one-shot
Scope: per expert
Example: Clone Factory — 18 blocks, run once per expert, produce complete clone
Scheduler: Evaluate readiness when any block completes. Ready = all deps done + gate approved.
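The readiness rule above can be sketched as a pure function over the current block states. This is an illustration under stated assumptions, not the code in the factory-run route: `SchedulerBlock`, `gateApproved`, and `computeReady` are hypothetical names, dependencies are flattened to block ids, and an "after" gate is assumed to hold downstream blocks until it is approved. The snapshot is trimmed to four blocks.

```typescript
type BlockStatus = "pending" | "ready" | "running" | "done" | "failed" | "blocked" | "waiting";

// Hypothetical flattened view of a block for scheduling purposes.
type SchedulerBlock = {
  id: string;
  dependsOn: string[];                   // upstream block ids (flattened from inputs.depends)
  humanGate: "before" | "after" | "none";
  gateApproved: boolean;                 // assumed flag; the schema does not say where approvals live
  status: BlockStatus;
};

// Stateless: the answer depends only on the snapshot passed in.
function computeReady(blocks: SchedulerBlock[]): string[] {
  const byId = new Map<string, SchedulerBlock>();
  for (const b of blocks) byId.set(b.id, b);

  const isReady = (b: SchedulerBlock): boolean => {
    if (b.status !== "pending") return false;
    // A "before" gate on this block must be approved first.
    if (b.humanGate === "before" && !b.gateApproved) return false;
    return b.dependsOn.every((depId) => {
      const dep = byId.get(depId);
      if (dep === undefined || dep.status !== "done") return false;
      // An "after" gate on an upstream block holds its dependents until approved.
      return dep.humanGate !== "after" || dep.gateApproved;
    });
  };
  return blocks.filter(isReady).map((b) => b.id);
}

const snapshot: SchedulerBlock[] = [
  { id: "deep-research", dependsOn: [], humanGate: "none", gateApproved: false, status: "done" },
  { id: "expert-framework-creator", dependsOn: ["deep-research"], humanGate: "after", gateApproved: false, status: "done" },
  { id: "rubric-builder", dependsOn: ["expert-framework-creator"], humanGate: "none", gateApproved: false, status: "pending" },
  { id: "demo-compiler", dependsOn: ["deep-research"], humanGate: "none", gateApproved: false, status: "pending" },
];
// rubric-builder is held back until the gate after expert-framework-creator is approved.
const readyBeforeApproval = computeReady(snapshot);
```

Calling `computeReady` again after flipping `gateApproved` on expert-framework-creator would also release rubric-builder, with no scheduler memory involved.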
Pattern 2: Stream/Event
PRD: PRD 3 (Universal Ingest)
Trigger: cron / webhook / event
Lifecycle: daemon
Scope: per record
Example: YouTube ingest — watch channel, transcribe new videos, annotate, embed, score
Needs: daemon process manager, event queue, webhook receivers, record_state field
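Since Pattern 2 is not built yet, the following is only a sketch of what per-record tracking with `record_state` might look like. The state list is copied from the schema comment and the tracking key assumes `track_per: "content_id"`; `advanceRecord` and the in-memory map are hypothetical stand-ins for whatever store the daemon scheduler ends up using.

```typescript
// States copied from the Block v2 schema comment for record_state.states.
const recordStates = ["raw", "annotated", "embedded", "scored"] as const;
type RecordState = (typeof recordStates)[number];

// In-memory stand-in for a per-record state store keyed by content_id (assumption).
const stateByContentId = new Map<string, RecordState>();

// Move one record a single step forward; new records start at the first state.
function advanceRecord(contentId: string): RecordState {
  const current = stateByContentId.get(contentId);
  if (current === undefined) {
    stateByContentId.set(contentId, recordStates[0]);
    return recordStates[0];
  }
  const idx = recordStates.indexOf(current);
  // Stay on the final state once it has been reached.
  const next = recordStates[Math.min(idx + 1, recordStates.length - 1)];
  stateByContentId.set(contentId, next);
  return next;
}
```

Each call moves exactly one record one step, which matches the "per record" scope: two videos from the same channel can sit at different states at the same time.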
Pattern 3: Analytical Lens
PRD: PRD 5 (Competitor Intel)
Trigger: on-demand / event
Lifecycle: on-demand
Scope: fan-out + aggregate
Example: Pre-meeting brief — fan out to N perspectives, aggregate into strategic brief
Needs: fan-out executor, aggregation engine, feedback collector, lens field
Extension Rule
All extensions are ADDITIVE. Pattern 2 and 3 add new scheduler behaviors and field consumers. They do NOT change Block v2 or the existing DAG scheduler. Each pattern gets its own scheduler implementation that reads the same block schema.
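One way to picture the additive rule is a registry keyed by `lifecycle.mode`, where a new pattern is a new registration rather than a change to existing code. This is a sketch of the idea only: `PatternScheduler`, `registerScheduler`, and `dispatch` are hypothetical names, and the schedulers here just return a description instead of running anything.

```typescript
type LifecycleMode = "one-shot" | "daemon" | "on-demand";
type BlockRef = { id: string; lifecycle: { mode: LifecycleMode } };

// Hypothetical scheduler contract; a real scheduler would act rather than describe.
interface PatternScheduler {
  mode: LifecycleMode;
  schedule(block: BlockRef): string;
}

const schedulerRegistry = new Map<LifecycleMode, PatternScheduler>();

function registerScheduler(scheduler: PatternScheduler): void {
  schedulerRegistry.set(scheduler.mode, scheduler);
}

// Route a block to the scheduler for its lifecycle mode; the block schema is shared.
function dispatch(block: BlockRef): string {
  const scheduler = schedulerRegistry.get(block.lifecycle.mode);
  if (scheduler === undefined) {
    throw new Error(`no scheduler registered for mode "${block.lifecycle.mode}"`);
  }
  return scheduler.schedule(block);
}

// Pattern 1 exists today.
registerScheduler({ mode: "one-shot", schedule: (b) => `DAG scheduler handles ${b.id}` });
// Adding Pattern 2 later is one more registration; the one-shot entry is untouched.
registerScheduler({ mode: "daemon", schedule: (b) => `stream scheduler handles ${b.id}` });
```

Until a Pattern 3 scheduler is registered, dispatching an on-demand block fails loudly, which keeps an unsupported pattern from silently falling through to the DAG scheduler.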
Clone Factory DAG — 18 Blocks
The complete pipeline for cloning an expert's IP into an AI coaching OS.
Key Properties
- 3 Parallel Groups: L0-recon (3 blocks), L2-extractors (5 blocks), L3-builders (4 blocks)
- 3 Human Gates: After expert-framework-creator, gap-analyzer, clone-tester
- 1 Threshold: clone-tester must score ≥ 85% or fail
- Max retries: 2 per block (configurable)
- Total blocks: 18 (no cycles, valid DAG)
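The "no cycles, valid DAG" property can be checked with a standard topological sort. The sketch below uses Kahn's algorithm over a plain map of block id to upstream ids; it is not the validator used for the template, `topologicalOrder` is a hypothetical name, and the sample graph is a five-block subset of the pipeline.

```typescript
// Kahn's algorithm: returns a valid execution order, or null if the graph
// has a cycle or references a block that is not defined.
function topologicalOrder(deps: Record<string, string[]>): string[] | null {
  const ids = Object.keys(deps);
  const remaining = new Map<string, number>();    // unmet dependency count per block
  const dependents = new Map<string, string[]>(); // block id -> blocks waiting on it
  for (const id of ids) {
    remaining.set(id, deps[id].length);
    dependents.set(id, []);
  }
  for (const id of ids) {
    for (const dep of deps[id]) {
      const list = dependents.get(dep);
      if (list === undefined) return null; // dependency on an unknown block
      list.push(id);
    }
  }
  // Start with blocks that have no dependencies.
  const queue = ids.filter((id) => remaining.get(id) === 0);
  const order: string[] = [];
  while (queue.length > 0) {
    const id = queue.shift() as string;
    order.push(id);
    for (const next of dependents.get(id) ?? []) {
      const left = (remaining.get(next) ?? 0) - 1;
      remaining.set(next, left);
      if (left === 0) queue.push(next);
    }
  }
  // Any block never emitted is part of, or downstream of, a cycle.
  return order.length === ids.length ? order : null;
}

const subsetDeps: Record<string, string[]> = {
  "deep-research": [],
  "expert-recon": [],
  "expert-framework-creator": ["deep-research", "expert-recon"],
  "rubric-builder": ["expert-framework-creator"],
  "soul-extractor": ["expert-framework-creator", "rubric-builder"],
};
const subsetOrder = topologicalOrder(subsetDeps);
```

A non-null result is both the proof that the graph is acyclic and one valid execution order for it.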
Data Flow Map
Every block's typed inputs and outputs, showing how data flows through the pipeline.
| Block | Inputs From | Key Outputs | Format |
|---|---|---|---|
| deep-research | (config: expert_name) | intel-report, raw-sources | MD, directory |
| expert-recon | (config: expert_name) | recon-report | MD |
| masterybook-sync | (workspace files) | notebook-url | URL |
| expert-framework-creator | deep-research.intel-report, expert-recon.recon-report | expert-framework | JSON |
| demo-compiler | deep-research.intel-report | demo-prompt, demo-config | MD, JSON |
| rubric-builder | expert-framework-creator.expert-framework | rubric | JSON |
| soul-extractor | expert-framework-creator.expert-framework, rubric-builder.rubric | soul-json | JSON |
| voice-extractor | expert-framework-creator.expert-framework, rubric-builder.rubric | voice-json | JSON |
| framework-extractor | expert-framework-creator.expert-framework, rubric-builder.rubric | frameworks-json | JSON |
| resource-extractor | expert-framework-creator.expert-framework | resources-json | JSON |
| offer-extractor | expert-framework-creator.expert-framework | offers-json | JSON |
| gap-analyzer | soul.json, voice.json, frameworks.json, resources.json, offers.json | gaps-json | JSON |
| clone-compiler | All extraction JSONs + gaps.json | system-prompt, knowledge-files, tool-configs | MD, MD, JSON |
| lead-magnet-builder | offers.json, frameworks.json, soul.json | lead-magnet-html | HTML |
| onboarding-builder | frameworks.json, offers.json | onboarding-html | HTML |
| design-system-extractor | (config: brand_url) | design-system-html | HTML |
| clone-tester | system-prompt, knowledge-files, tool-configs, rubric, expert-framework | test-results, audit-sheet, simulation-log | JSON, MD, JSON |
| boarding-orchestrator | All artifacts | boarding-pack | HTML |
Module ↔ Skill Mapping (PRD 4)
How PRD 4's 9 extraction modules map to our 18 skills. Coverage: 9/9 mapped.
| PRD 4 Module | Our Skill(s) | Coverage | Notes |
|---|---|---|---|
| M1: Thinking Structures | expert-framework-creator + soul-extractor | Full | Split across two skills intentionally |
| M2: Voice & Style | voice-extractor | Full | |
| M3: CTA Psychology | offer-extractor | Partial | CTA triggers not explicitly separated |
| M4: Embedded IP | framework-extractor | Full | |
| M5: Modularization | framework-extractor | Folded | Prerequisites/scaffolding included |
| M6: Meta-Structures | framework-extractor | Folded | Program architectures included |
| M7: Pattern Recognition | soul-extractor | Folded | Diagnostic patterns included |
| M8: Prompt Templates | clone-compiler | Reframed | Ground truth as test scenarios |
| M9: Retrieval Patterns | clone-compiler | Reframed | Routing rules in tool-configs |
Summary: 3 full (M1, M2, M4), 1 partial (M3 CTA), 3 folded (M5-M7 absorbed into broader extractors), 2 reframed (M8, M9). No modules unmapped. The "folded" modules are architectural decisions: their content is extracted, just by a different skill than the PRD originally envisioned.
Execution Timeline
Parallel groups and human gates shown on a layer-by-layer timeline.
| Layer | Blocks | Gate | Parallel? |
|---|---|---|---|
| L0 | deep-research, expert-recon, masterybook-sync | | YES (3, L0-recon) |
| L0.25 | expert-framework-creator | AFTER ⛔ | Sequential |
| L0.5 | demo-compiler | | Sequential |
| L1 | rubric-builder | | Sequential |
| L2 | soul-extractor, voice-extractor, framework-extractor, resource-extractor, offer-extractor | | YES (5, L2-extractors) |
| L2.5 | gap-analyzer | AFTER ⛔ | Sequential |
| L3 | clone-compiler, lead-magnet-builder, onboarding-builder, design-system-extractor | | YES (4, L3-builders) |
| L3.5 | clone-tester (85% min) | AFTER ⛔ | Sequential |
| L4 | boarding-orchestrator | | Sequential |
Timing Estimates (based on Run #1)
- L0 parallel group: ~10 min wall clock (deep-research is the bottleneck)
- L0.25 framework: ~15 min (3-path convergence, worth the cost)
- Human gate 1: Variable (expert reviews framework)
- L2 parallel group: ~20 min wall clock (framework-extractor is heaviest)
- Human gate 2: Variable (product lead reviews gaps)
- L3 parallel group: ~25 min wall clock (clone-compiler is heaviest)
- L3.5 testing: ~30 min (27 scenarios)
- Total machine time: ~2 hours (excluding human gates)
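As a quick arithmetic check, the five timed stages above can be summed directly. The snippet only adds the listed estimates; the reading that the remainder of the ~2 hour total covers the blocks without a listed estimate (demo-compiler, rubric-builder, gap-analyzer, boarding-orchestrator) is an assumption, not something the Run #1 figures state.

```typescript
// Wall-clock estimates in minutes, copied from the list above.
const stageMinutes: [string, number][] = [
  ["L0 parallel group", 10],
  ["L0.25 framework", 15],
  ["L2 parallel group", 20],
  ["L3 parallel group", 25],
  ["L3.5 testing", 30],
];

// Sum of the listed stages only; untimed blocks are not included.
const listedTotalMinutes = stageMinutes.reduce((sum, [, minutes]) => sum + minutes, 0);

// Assumed remainder against the ~2 hour (120 min) total.
const unlistedMinutes = 120 - listedTotalMinutes;
```

The listed stages account for 100 of the roughly 120 minutes, so the three parallel groups plus testing dominate machine time.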
Intelligence Gathering Pattern (Pattern 3)
How pre-meeting analytical lenses work — future capability for PRD 5.
Flow
- Trigger: "Prepare brief for meeting with [entity]" (on-demand or event-driven)
- Fan-out: Spawn N perspective blocks, each with a different lens.perspective:
  - Positioning analysis
  - Gap analysis (vs our capabilities)
  - Competitive landscape
  - Partnership risk assessment
- Execute: Each perspective block runs independently (parallel)
- Aggregate: Merge all perspective outputs into a single strategic brief
- Feedback loop: After meeting, user rates brief usefulness → improves future lenses
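The fan-out and aggregate steps in this flow can be sketched with `Promise.all`. Because Pattern 3 is a future capability, everything here is hypothetical: `runPerspective` is a stub standing in for a real perspective block, `prepareBrief` and the result types are illustrative names, and the entity name is a placeholder.

```typescript
type PerspectiveResult = { perspective: string; summary: string };
type StrategicBrief = { entity: string; sections: PerspectiveResult[] };

// Stub for one perspective block; a real runner would invoke a skill.
async function runPerspective(entity: string, perspective: string): Promise<PerspectiveResult> {
  return { perspective, summary: `${perspective} notes for ${entity}` };
}

async function prepareBrief(entity: string, perspectives: string[]): Promise<StrategicBrief> {
  // Fan-out: start every perspective block at once and wait for all of them.
  const sections = await Promise.all(perspectives.map((p) => runPerspective(entity, p)));
  // Aggregate: Promise.all preserves input order, so sections line up with perspectives.
  return { entity, sections };
}

// Perspective names reuse the lens.perspective examples from the schema.
const briefPromise = prepareBrief("Example Corp", ["positioning", "gap_analysis"]);
```

`Promise.all` rejects as soon as any perspective fails; if a partial brief is preferable to no brief, `Promise.allSettled` would be the variant to consider.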
Block v2 Fields Used
- lens.perspective: which analytical angle this block takes
- lens.aggregates: what dimension to roll up on (entity_id)
- lens.feedback_loop: whether to collect post-execution quality signals
- lifecycle.mode ("on-demand"): runs when triggered, not continuously
Connection to deep-research
Pattern 3 blocks can delegate heavy research to the existing deep-research skill (via Perplexity Sonar), then apply their specific analytical lens to the raw results. The research infrastructure is shared; only the interpretation differs.
PRD Reconciliation
What matches, what's missing, what's deferred.
| PRD | Status | Coverage | What's Needed |
|---|---|---|---|
| PRD 4 (Expert Extraction) | FULLY COVERED | 9/9 modules mapped to 18 skills | Nothing — this IS the active build |
| PRD 3 (Universal Ingest) | ARCHITECTURALLY COMPATIBLE | Block v2 has record_state + event triggers | Daemon scheduler, event queue, webhook receivers |
| PRD 5 (Competitor Intel) | ARCHITECTURALLY COMPATIBLE | Block v2 has lens + aggregation fields | Fan-out executor, aggregation engine, feedback collector |
| Master Registry | STALE | Lists 24 active skills + 8 planned (old inventory) | Needs update to reflect 18 boarding pipeline skills |
| Command Center | ALIGNED | Product dashboard rubric matches Block outputs | Nothing critical |
Extensions Needed for Future Patterns
Pattern 2 (Stream/Event) — PRD 3
- record_state field: already in schema
- Daemon process manager: NEW code (keeps blocks alive)
- Event queue: NEW infrastructure (Redis or Supabase Realtime)
- Webhook receivers: NEW API routes (/api/ingest/youtube, etc.)
Pattern 3 (Analytical Lens) — PRD 5
- lens field: already in schema
- Fan-out executor: NEW code (spawn N blocks dynamically)
- Aggregation engine: NEW code (merge perspectives)
- Feedback collector: NEW API route
Critical rule: All extensions are ADDITIVE. They add new scheduler behaviors and field consumers. They do NOT change Block v2 or the existing DAG scheduler.
Build Order
| # | What | Status | Notes |
|---|---|---|---|
| 1 | Process Factory Architecture page | DONE | This page (plan.jasondmacdonald.com/process-factory-architecture) |
| 2 | Block Schema Reference page | DONE | plan.jasondmacdonald.com/block-schema |
| 3 | Clone Factory Template JSON | DONE | 18 blocks, validated (no cycles, all deps resolve) |
| 4 | DAG Scheduler API route | DONE | /api/a360/factory-run (create, start, complete, approve, reject) |
| 5 | Factory Dashboard (live) | DONE | align360.asapai.net/factory-dashboard |
| 6 | Memory + context updates | DONE | MEMORY.md, reconciliation.md, process-factory.md, CLAUDE.md |
| 7 | Pattern 2 scheduler | FUTURE | When PRD 3 ingest activates |
| 8 | Pattern 3 scheduler | FUTURE | When PRD 5 intel activates |
Session Continuity Protocol
If a session dies, the next session can reconstruct full context from these durable artifacts:
- Read plan.jasondmacdonald.com/process-factory-architecture (this page) for full design
- Read plan.jasondmacdonald.com/block-schema for technical schema reference
- Read MEMORY.md for project state
- Read memory/process-factory.md for detailed architecture notes
- Read clone-factory-template.json for executable pipeline config
- Read folio-saas/app/api/a360/factory-run/route.ts for engine implementation
- Read align360.asapai.net/factory-dashboard for live pipeline state
Key Files
| File | Role |
|---|---|
| _workspaces/samuel-ngu/clone-factory-template.json | Executable pipeline config (18 blocks) |
| folio-saas/app/api/a360/factory-run/route.ts | DAG scheduler API engine |
| _workspaces/samuel-ngu/artifacts/factory-dashboard.html | Live pipeline dashboard |
| _workspaces/samuel-ngu/artifacts/block-schema-reference.html | Technical schema reference |
| _workspaces/samuel-ngu/artifacts/process-factory-architecture.html | Architecture design (this page) |
| memory/process-factory.md | Architecture notes for session continuity |
| memory/reconciliation.md | Compact reconciliation state |