Process Factory Architecture v1.0

Config-driven, block-based execution engine for transparent, auditable pipeline runs

Version: 1.0.0 | Date: 2026-03-28 | Pattern: DAG Batch (active) | Blocks: 18 | Run #1: GREEN 91.9%

Overview

The Process Factory Engine replaces ad-hoc skill invocation with tracked, stateful DAG execution. Every skill, ingest job, or analysis task becomes a Block — a universal execution unit with typed inputs, outputs, dependencies, gates, and thresholds.

Why This Exists

After building the full 18-skill clone pipeline (Run #1: GREEN 91.9%), we found that the same transparency and auditability problem applies to any pipeline we build, not just extraction runs. The solution is a config-driven engine that makes every pipeline run observable, replayable, and auditable.

Core Principles

Config over code — Pipeline topology defined in JSON, not hardcoded
Stateless scheduler — Given current block states, computes readiness. No memory needed.
One schema, three patterns — Block v2 handles DAG batch, stream/event, and analytical lens
Human-in-the-loop — Gates before or after any block. Approve/reject with feedback.
Threshold enforcement — Auto-retry below minimum, fail after max attempts
Modular architecture — Engine tracks state. Claude Code/FORGE executes skills. Dashboard observes.
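
The threshold enforcement principle can be sketched as a small pure function. This is a minimal sketch under stated assumptions, not the engine's implementation: `evaluateThreshold` and `ThresholdPolicy` are hypothetical names, and it assumes `max_retries` counts re-runs after the first attempt.

```typescript
// Hypothetical helper: field meanings follow Block v2, the decision logic is an assumption.
type ThresholdOutcome = "pass" | "retry" | "fail";

interface ThresholdPolicy {
  min: number;         // execution.threshold.min, e.g. 85
  retryable: boolean;  // execution.retryable
  maxRetries: number;  // execution.max_retries, e.g. 2
}

// `attempts` is the number of runs completed so far, including the one that produced `score`.
function evaluateThreshold(
  score: number,
  attempts: number,
  policy: ThresholdPolicy
): ThresholdOutcome {
  if (score >= policy.min) return "pass";
  const retriesUsed = attempts - 1;
  if (policy.retryable && retriesUsed < policy.maxRetries) return "retry";
  return "fail";
}
```

With `min: 85` and `maxRetries: 2`, a block scoring below 85 is re-run up to twice and fails after its third low-scoring attempt.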

Block v2 Schema

The universal execution unit. Every skill, ingest job, or analysis task is one block. The schema has 11 field groups that together cover all three execution patterns.

{
  // ── Identity ──
  id:              string       // "soul-extractor"
  name:            string       // "Soul Extractor"
  layer:           string       // "L2"
  type:            enum         // extract|validate|build|orchestrate|research|ingest

  // ── Execution ──
  skill:           string       // maps to SKILL.md
  model:           string|null  // "opus"|"sonnet"|null (inherit)
  timeout:         number       // 300000 (ms)
  runtime:         enum         // forge|claude-cli|mastery-api|nowpage

  // ── Trigger ──
  trigger: {
    type:          enum         // dependency|cron|webhook|event|manual
    cron:          string|null  // "0 6 * * *" (Pattern 2)
    event:         string|null  // "entity.created(gate=2)" (Pattern 2)
    webhook:       string|null  // "/api/ingest/youtube" (Pattern 2)
  }

  // ── Inputs ──
  inputs: {
    depends:       [{block, key}]   // wiring from other blocks
    files:         [glob]           // workspace file patterns
    env:           [string]         // required env vars
    config:        {k:v}            // runtime config overrides
  }

  // ── Outputs ──
  outputs: {
    artifacts:     [{key, path, format}]
    metrics:       [{key, type}]
    publish:       [{domain, slug}]|null
  }

  // ── Execution Control ──
  execution: {
    parallel_group: string|null     // "L2-extractors"
    human_gate:     enum            // before|after|none
    threshold:      {metric, min}|null
    retryable:      boolean
    max_retries:    number          // 2
  }

  // ── Lifecycle (Pattern 1: one-shot, 2: daemon, 3: on-demand) ──
  lifecycle: { mode: enum }         // one-shot|daemon|on-demand

  // ── Scope ──
  scope: {
    gate:          1|2|3|4|null     // Data Gate tier
    run_params:    {k:v}            // per-run overrides
  }

  // ── Lens (Pattern 3 only) ──
  lens: {
    perspective:   string|null      // "positioning"|"gap_analysis"
    aggregates:    [string]|null    // ["entity_id"]
    feedback_loop: boolean
  }

  // ── Record State (Pattern 2 only) ──
  record_state: {
    track_per:     string|null      // "content_id"|"entity_id"
    states:        [string]         // ["raw","annotated","embedded","scored"]
  }

  // ── Runtime State (engine-managed, not in config) ──
  state: {
    status:        enum             // pending|ready|running|done|failed|blocked|waiting
    started_at:    timestamp|null
    ended_at:      timestamp|null
    score:         number|null
    error:         string|null
    attempts:      number
  }
}
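
To make the schema concrete, here is a hedged example of how one Pattern 1 block might be written as a TypeScript object. The identity values, timeout, parallel group, and dependency wiring come from this page. The `skill` value, `runtime` choice, artifact `path`, `metrics` entry, `human_gate`, and `threshold` values are illustrative assumptions. The engine-managed `state` group is omitted because it is not part of the config.

```typescript
// Illustrative config for one block; values marked "assumed" are not confirmed by the source.
const soulExtractor = {
  id: "soul-extractor",
  name: "Soul Extractor",
  layer: "L2",
  type: "extract",

  skill: "soul-extractor",   // assumed: skill name matches block id
  model: null,               // inherit
  timeout: 300000,           // ms
  runtime: "forge",          // assumed: one of forge|claude-cli|mastery-api|nowpage

  trigger: { type: "dependency", cron: null, event: null, webhook: null },

  inputs: {
    depends: [
      { block: "expert-framework-creator", key: "expert-framework" },
      { block: "rubric-builder", key: "rubric" },
    ],
    files: [],
    env: [],
    config: {},
  },

  outputs: {
    artifacts: [{ key: "soul-json", path: "artifacts/soul.json", format: "json" }], // path assumed
    metrics: [{ key: "score", type: "number" }],                                     // assumed
    publish: null,
  },

  execution: {
    parallel_group: "L2-extractors",
    human_gate: "none",      // assumed: gates sit on other blocks in this pipeline
    threshold: null,         // assumed
    retryable: true,
    max_retries: 2,
  },

  lifecycle: { mode: "one-shot" },
  scope: { gate: null, run_params: {} },

  // Inactive patterns are null for a Pattern 1 block.
  lens: null,
  record_state: null,
};
```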

Design Decisions

11 field groups — covers all three patterns with one definition
Null fields for inactive patterns — Pattern 1 blocks have lens: null and record_state: null
Runtime state is engine-managed — config defines the block, engine manages the state
Typed wiring — depends: [{block: "soul-extractor", key: "soul-json"}] creates explicit data contracts
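
Because each wire names both a block and an output key, the data contracts can be checked before a run starts. The following is a minimal sketch of such a check, not the validator used for the template. `WiredBlock` and `findUnresolvedWires` are hypothetical names, and `artifactKeys` stands in for the keys listed under `outputs.artifacts`.

```typescript
interface Wire {
  block: string;
  key: string;
}

interface WiredBlock {
  id: string;
  depends: Wire[];        // inputs.depends
  artifactKeys: string[]; // keys from outputs.artifacts
}

// Returns one message per wire that does not resolve; an empty list means all wiring is valid.
function findUnresolvedWires(blocks: WiredBlock[]): string[] {
  const problems: string[] = [];
  for (const consumer of blocks) {
    for (const wire of consumer.depends) {
      const producer = blocks.filter((b) => b.id === wire.block)[0];
      if (producer === undefined) {
        problems.push(`${consumer.id}: unknown block "${wire.block}"`);
      } else if (producer.artifactKeys.indexOf(wire.key) === -1) {
        problems.push(`${consumer.id}: block "${wire.block}" has no output "${wire.key}"`);
      }
    }
  }
  return problems;
}
```

A wire such as `{block: "soul-extractor", key: "soul.json"}` would be reported here, since the producing block exposes `soul-json`.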

Three Execution Patterns

All use the same Block v2 schema. Only the scheduler behavior differs.

Pattern 1: DAG Batch (ACTIVE)

PRD: PRD 4 (Expert Extraction)

Trigger: dependency (block completion)

Lifecycle: one-shot

Scope: per expert

Example: Clone Factory — 18 blocks, run once per expert, produce complete clone

Scheduler: Evaluate readiness when any block completes. Ready = all deps done + gate approved.
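
The readiness rule can be sketched as a pure function over the current block states, which is what keeps the scheduler stateless. This is a minimal sketch, not the engine's code. `SchedBlock`, `gateApproved`, and `computeReady` are hypothetical names, and the sketch assumes that an "after" gate holds back downstream blocks until it is approved, while a "before" gate holds back the block itself.

```typescript
type BlockStatus = "pending" | "ready" | "running" | "done" | "failed" | "blocked" | "waiting";

interface SchedBlock {
  id: string;
  depends: string[];                      // ids of upstream blocks
  humanGate: "before" | "after" | "none"; // execution.human_gate
  status: BlockStatus;
  gateApproved: boolean;                  // assumed: the engine records approvals per block
}

// Returns the ids of pending blocks that may start now, given only the current states.
function computeReady(blocks: SchedBlock[]): string[] {
  // An upstream block clears its dependents once it is done and any "after" gate is approved.
  const isCleared = (id: string): boolean =>
    blocks.some(
      (b) => b.id === id && b.status === "done" && (b.humanGate !== "after" || b.gateApproved)
    );

  return blocks
    .filter((b) => b.status === "pending")
    .filter((b) => b.depends.every(isCleared))
    .filter((b) => b.humanGate !== "before" || b.gateApproved)
    .map((b) => b.id);
}
```

Calling `computeReady` again after any block completes or any gate is approved yields the next set of runnable blocks, with no scheduler memory between calls.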

Pattern 2: Stream/Event (FUTURE)

PRD: PRD 3 (Universal Ingest)

Trigger: cron / webhook / event

Lifecycle: daemon

Scope: per record

Example: YouTube ingest — watch channel, transcribe new videos, annotate, embed, score

Needs: daemon process manager, event queue, webhook receivers, record_state field
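
Since Pattern 2 is not built yet, the following is only a sketch of how `record_state.states` might be used to move one record through its lifecycle. `nextRecordState` is a hypothetical helper, and it assumes the states are strictly ordered as listed in the schema comment.

```typescript
// Ordered states taken from the record_state example in the Block v2 schema.
const recordStates = ["raw", "annotated", "embedded", "scored"];

// Returns the state that follows `current`, or null when the record is in its final state.
function nextRecordState(states: string[], current: string): string | null {
  const index = states.indexOf(current);
  if (index === -1) {
    throw new Error(`unknown record state: ${current}`);
  }
  const next = states[index + 1];
  return next === undefined ? null : next;
}
```

For the YouTube example, a newly transcribed video tracked by `content_id` would start at `raw` and stop advancing once it reaches `scored`.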

Pattern 3: Analytical Lens (FUTURE)

PRD: PRD 5 (Competitor Intel)

Trigger: on-demand / event

Lifecycle: on-demand

Scope: fan-out + aggregate

Example: Pre-meeting brief — fan out to N perspectives, aggregate into strategic brief

Needs: fan-out executor, aggregation engine, feedback collector, lens field

Extension Rule

All extensions are ADDITIVE. Pattern 2 and 3 add new scheduler behaviors and field consumers. They do NOT change Block v2 or the existing DAG scheduler. Each pattern gets its own scheduler implementation that reads the same block schema.
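
One way to picture the additive rule is a registry of schedulers keyed by lifecycle mode, where a new pattern appends an entry and leaves existing entries untouched. This is an illustrative sketch only. `PatternScheduler`, `schedulers`, and `schedulerFor` are hypothetical names, and dispatching on `lifecycle.mode` is an assumption.

```typescript
type LifecycleMode = "one-shot" | "daemon" | "on-demand";

interface PatternScheduler {
  mode: LifecycleMode;
  describe(blockId: string): string;
}

// Pattern 1 (active): the existing DAG scheduler.
const dagScheduler: PatternScheduler = {
  mode: "one-shot",
  describe: (blockId) => `${blockId}: evaluate readiness when a dependency completes`,
};

// Adding Pattern 2 or 3 later means pushing another entry; dagScheduler is not modified.
const schedulers: PatternScheduler[] = [dagScheduler];

// Returns the scheduler registered for a lifecycle mode, or null if that pattern is not built yet.
function schedulerFor(mode: LifecycleMode): PatternScheduler | null {
  const match = schedulers.filter((s) => s.mode === mode)[0];
  return match === undefined ? null : match;
}
```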

Clone Factory DAG — 18 Blocks

The complete pipeline for cloning an expert's IP into an AI coaching OS.

L0 Recon (parallel): deep-research, expert-recon, masterybook-sync
↓
L0.25: expert-framework-creator, then HUMAN GATE (after)
↓
L0.5: demo-compiler
↓
L1 Quality: rubric-builder
↓
L2 Extract (parallel): soul-extractor, voice-extractor, framework-extractor, resource-extractor, offer-extractor
↓
L2.5 Audit: gap-analyzer, then HUMAN GATE (after)
↓
L3 Build (parallel): clone-compiler, lead-magnet-builder, onboarding-builder, design-system-extractor
↓
L3.5 Test: clone-tester (85% min), then HUMAN GATE (after)
↓
L4 Ship: boarding-orchestrator


Key Properties

3 Parallel Groups: L0-recon (3 blocks), L2-extractors (5 blocks), L3-builders (4 blocks)
3 Human Gates: After expert-framework-creator, gap-analyzer, clone-tester
1 Threshold: clone-tester must score ≥ 85% or fail
Max retries: 2 per block (configurable)
Total blocks: 18 (no cycles, valid DAG)
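
The "no cycles" property can be checked with a topological sort. The sketch below works in the style of Kahn's algorithm: it repeatedly emits every block whose dependencies have already been emitted, and reports a cycle if it gets stuck. It is an illustration rather than the validator actually used. `DagNode` and `topologicalOrder` are hypothetical names.

```typescript
interface DagNode {
  id: string;
  depends: string[]; // ids of upstream blocks
}

// Returns block ids in a valid execution order, or null if no such order exists
// (a cycle, or a dependency on a block that is not in the list).
function topologicalOrder(nodes: DagNode[]): string[] | null {
  const order: string[] = [];
  let pending = nodes.slice();

  while (pending.length > 0) {
    // A node is free once every one of its dependencies has been emitted.
    const free = pending.filter((n) => n.depends.every((d) => order.indexOf(d) !== -1));
    if (free.length === 0) return null;

    for (const n of free) order.push(n.id);
    pending = pending.filter((n) => free.indexOf(n) === -1);
  }
  return order;
}
```

Each pass of the loop emits one layer of blocks that could run in parallel, so the passes of a valid order roughly mirror the pipeline's layers.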

Data Flow Map

Every block's typed inputs and outputs, showing how data flows through the pipeline.

Block	Inputs From	Key Outputs	Format
deep-research	(config: expert_name)	intel-report, raw-sources	MD, directory
expert-recon	(config: expert_name)	recon-report	MD
masterybook-sync	(workspace files)	notebook-url	URL
expert-framework-creator	deep-research.intel-report, expert-recon.recon-report	expert-framework	JSON
demo-compiler	deep-research.intel-report	demo-prompt, demo-config	MD, JSON
rubric-builder	expert-framework-creator.expert-framework	rubric	JSON
soul-extractor	expert-framework-creator.expert-framework, rubric-builder.rubric	soul-json	JSON
voice-extractor	expert-framework-creator.expert-framework, rubric-builder.rubric	voice-json	JSON
framework-extractor	expert-framework-creator.expert-framework, rubric-builder.rubric	frameworks-json	JSON
resource-extractor	expert-framework-creator.expert-framework	resources-json	JSON
offer-extractor	expert-framework-creator.expert-framework	offers-json	JSON
gap-analyzer	soul-extractor.soul-json, voice-extractor.voice-json, framework-extractor.frameworks-json, resource-extractor.resources-json, offer-extractor.offers-json	gaps-json	JSON
clone-compiler	All extraction JSONs + gap-analyzer.gaps-json	system-prompt, knowledge-files, tool-configs	MD, MD, JSON
lead-magnet-builder	offer-extractor.offers-json, framework-extractor.frameworks-json, soul-extractor.soul-json	lead-magnet-html	HTML
onboarding-builder	framework-extractor.frameworks-json, offer-extractor.offers-json	onboarding-html	HTML
design-system-extractor	(config: brand_url)	design-system-html	HTML
clone-tester	clone-compiler.system-prompt, clone-compiler.knowledge-files, clone-compiler.tool-configs, rubric-builder.rubric, expert-framework-creator.expert-framework	test-results, audit-sheet, simulation-log	JSON, MD, JSON
boarding-orchestrator	All artifacts	boarding-pack	HTML

Module ↔ Skill Mapping (PRD 4)

How PRD 4's 9 extraction modules map to our 18 skills. Coverage: 9/9 mapped.

PRD 4 Module	Our Skill(s)	Coverage	Notes
M1: Thinking Structures	expert-framework-creator + soul-extractor	Full	Split across two skills intentionally
M2: Voice & Style	voice-extractor	Full
M3: CTA Psychology	offer-extractor	Partial	CTA triggers not explicitly separated
M4: Embedded IP	framework-extractor	Full
M5: Modularization	framework-extractor	Folded	Prerequisites/scaffolding included
M6: Meta-Structures	framework-extractor	Folded	Program architectures included
M7: Pattern Recognition	soul-extractor	Folded	Diagnostic patterns included
M8: Prompt Templates	clone-compiler	Reframed	Ground truth as test scenarios
M9: Retrieval Patterns	clone-compiler	Reframed	Routing rules in tool-configs

Summary: 3 full (M1, M2, M4), 1 partial (M3 CTA), 3 folded (M5-M7, absorbed into broader extractors), 2 reframed (M8, M9). No modules are unmapped. The "folded" modules are architectural decisions: their content is extracted, just by a different skill than the PRD originally envisioned.

Execution Timeline

Parallel groups and human gates shown on a layer-by-layer timeline.

LAYER     BLOCKS                           GATE          PARALLEL?
────────  ─────────────────────────────    ──────────    ─────────
L0        deep-research                                  YES (3)
          expert-recon                                   (L0-recon)
          masterybook-sync

L0.25     expert-framework-creator         AFTER ⛔      Sequential

L0.5      demo-compiler                                  Sequential

L1        rubric-builder                                 Sequential

L2        soul-extractor                                 YES (5)
          voice-extractor                                (L2-extractors)
          framework-extractor
          resource-extractor
          offer-extractor

L2.5      gap-analyzer                     AFTER ⛔      Sequential

L3        clone-compiler                                 YES (4)
          lead-magnet-builder                            (L3-builders)
          onboarding-builder
          design-system-extractor

L3.5      clone-tester (85% min)           AFTER ⛔      Sequential

L4        boarding-orchestrator                          Sequential

Timing Estimates (based on Run #1)

L0 parallel group: ~10 min wall clock (deep-research is the bottleneck)
L0.25 framework: ~15 min (3-path convergence, worth the cost)
Human gate 1: Variable (expert reviews framework)
L2 parallel group: ~20 min wall clock (framework-extractor is heaviest)
Human gate 2: Variable (product lead reviews gaps)
L3 parallel group: ~25 min wall clock (clone-compiler is heaviest)
L3.5 testing: ~30 min (27 scenarios)
Total machine time: ~2 hours (excluding human gates)
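
As a quick arithmetic check, the itemized stages can be summed and compared with the total. The stage figures are copied from the list above. Treating "~2 hours" as 120 minutes is an assumption, so the remainder is only a rough allowance for blocks without a listed estimate (demo-compiler, rubric-builder, gap-analyzer, boarding-orchestrator).

```typescript
// Wall-clock estimates from Run #1, in minutes, for the stages that are itemized.
const stageMinutes = [
  { stage: "L0 parallel group", minutes: 10 },
  { stage: "L0.25 framework", minutes: 15 },
  { stage: "L2 parallel group", minutes: 20 },
  { stage: "L3 parallel group", minutes: 25 },
  { stage: "L3.5 testing", minutes: 30 },
];

// Stages run one after another, so their wall-clock times add up.
const listedTotal = stageMinutes.reduce((sum, s) => sum + s.minutes, 0); // 100

// Assumed: "~2 hours" is read as 120 minutes.
const unlistedMinutes = 120 - listedTotal; // 20
```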

Intelligence Gathering Pattern (Pattern 3)

How pre-meeting analytical lenses work — future capability for PRD 5.

Flow

Trigger: "Prepare brief for meeting with [entity]" (on-demand or event-driven)
Fan-out: Spawn N perspective blocks, each with a different lens.perspective:
- Positioning analysis
- Gap analysis (vs our capabilities)
- Competitive landscape
- Partnership risk assessment
Execute: Each perspective block runs independently (parallel)
Aggregate: Merge all perspective outputs into a single strategic brief
Feedback loop: After meeting, user rates brief usefulness → improves future lenses
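
The fan-out and aggregate steps can be sketched as two pure functions. Pattern 3 is not built yet, so this is an illustration under assumptions rather than a design: `fanOut`, `aggregateBrief`, `LensBlock`, and `PerspectiveOutput` are hypothetical names, the block id format is invented, and the brief is rendered as plain text for simplicity.

```typescript
interface LensBlock {
  id: string;
  lifecycle: { mode: "on-demand" };
  lens: { perspective: string; aggregates: string[]; feedback_loop: boolean };
  scope: { run_params: { entity_id: string } };
}

interface PerspectiveOutput {
  entity_id: string;
  perspective: string;
  summary: string;
}

// Fan-out: one block per perspective, all targeting the same entity.
function fanOut(entityId: string, perspectives: string[]): LensBlock[] {
  return perspectives.map((perspective): LensBlock => ({
    id: `lens-${perspective}-${entityId}`, // id format is an assumption
    lifecycle: { mode: "on-demand" },
    lens: { perspective, aggregates: ["entity_id"], feedback_loop: true },
    scope: { run_params: { entity_id: entityId } },
  }));
}

// Aggregate: roll up every output for one entity into a single brief.
function aggregateBrief(entityId: string, outputs: PerspectiveOutput[]): string {
  const sections = outputs
    .filter((o) => o.entity_id === entityId)
    .map((o) => `${o.perspective}\n${o.summary}`);
  return [`Brief: ${entityId}`].concat(sections).join("\n\n");
}
```

Because the generated blocks do not depend on one another, they could run in parallel before `aggregateBrief` merges their outputs on `entity_id`.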

Block v2 Fields Used

lens.perspective — which analytical angle this block takes
lens.aggregates — what dimension to roll up on (entity_id)
lens.feedback_loop — whether to collect post-execution quality signals
lifecycle.mode: "on-demand" — runs when triggered, not continuously

Connection to deep-research

Pattern 3 blocks can delegate heavy research to the existing deep-research skill (via Perplexity Sonar), then apply their specific analytical lens to the raw results. The research infrastructure is shared; only the interpretation differs.

PRD Reconciliation

What matches, what's missing, what's deferred.

PRD	Status	Coverage	What's Needed
PRD 4 (Expert Extraction)	FULLY COVERED	9/9 modules mapped to 18 skills	Nothing — this IS the active build
PRD 3 (Universal Ingest)	ARCHITECTURALLY COMPATIBLE	Block v2 has record_state + event triggers	Daemon scheduler, event queue, webhook receivers
PRD 5 (Competitor Intel)	ARCHITECTURALLY COMPATIBLE	Block v2 has lens + aggregation fields	Fan-out executor, aggregation engine, feedback collector
Master Registry	STALE	Lists 24 active skills + 8 planned (old inventory)	Needs update to reflect 18 boarding pipeline skills
Command Center	ALIGNED	Product dashboard rubric matches Block outputs	Nothing critical

Extensions Needed for Future Patterns

Pattern 2 (Stream/Event) — PRD 3

record_state field — already in schema
Daemon process manager — NEW code (keeps blocks alive)
Event queue — NEW infrastructure (Redis or Supabase Realtime)
Webhook receivers — NEW API routes (/api/ingest/youtube, etc.)

Pattern 3 (Analytical Lens) — PRD 5

lens field — already in schema
Fan-out executor — NEW code (spawn N blocks dynamically)
Aggregation engine — NEW code (merge perspectives)
Feedback collector — NEW API route

Critical rule: All extensions are ADDITIVE. They add new scheduler behaviors and field consumers. They do NOT change Block v2 or the existing DAG scheduler.

Build Order

#	What	Status	Notes
1	Process Factory Architecture page	DONE	This page (plan.jasondmacdonald.com/process-factory-architecture)
2	Block Schema Reference page	DONE	plan.jasondmacdonald.com/block-schema
3	Clone Factory Template JSON	DONE	18 blocks, validated (no cycles, all deps resolve)
4	DAG Scheduler API route	DONE	/api/a360/factory-run (create, start, complete, approve, reject)
5	Factory Dashboard (live)	DONE	align360.asapai.net/factory-dashboard
6	Memory + context updates	DONE	MEMORY.md, reconciliation.md, process-factory.md, CLAUDE.md
7	Pattern 2 scheduler	FUTURE	When PRD 3 ingest activates
8	Pattern 3 scheduler	FUTURE	When PRD 5 intel activates

Session Continuity Protocol

If a session dies, the next session can reconstruct full context from these durable artifacts:

Read plan.jasondmacdonald.com/process-factory-architecture (this page) for full design
Read plan.jasondmacdonald.com/block-schema for technical schema reference
Read MEMORY.md for project state
Read memory/process-factory.md for detailed architecture notes
Read clone-factory-template.json for executable pipeline config
Read folio-saas/app/api/a360/factory-run/route.ts for engine implementation
Read align360.asapai.net/factory-dashboard for live pipeline state

Key Files

File	Role
`_workspaces/samuel-ngu/clone-factory-template.json`	Executable pipeline config (18 blocks)
`folio-saas/app/api/a360/factory-run/route.ts`	DAG scheduler API engine
`_workspaces/samuel-ngu/artifacts/factory-dashboard.html`	Live pipeline dashboard
`_workspaces/samuel-ngu/artifacts/block-schema-reference.html`	Technical schema reference
`_workspaces/samuel-ngu/artifacts/process-factory-architecture.html`	Architecture design (this page)
`memory/process-factory.md`	Architecture notes for session continuity
`memory/reconciliation.md`	Compact reconciliation state