MasteryMade · Foundation PRD
Transform the expert extraction methodology — which exists as Jason's mental model and scattered session transcripts — into an automated, repeatable skill that agent swarms can execute. Module 1 rubric is always first and gates everything downstream. Three-pass validation catches errors before they compound.
TRIGGER: Entity reaches pipeline_stage='researched' AND has ingested content
┌──────────────┐
│ Module 1: │ ◄── ALWAYS FIRST. Creates rubric.
│ Thinking │ Rubric fails validation → STOP.
│ Structures │ No downstream modules run.
└──────┬───────┘
│ rubric validated ✓
┌──────▼───────┐
│ Modules 2-8 │ ◄── Sequential. Each validates
│ (sequential) │ against Module 1 rubric.
└──────┬───────┘
│ all complete
┌──────▼───────┐
│ Module 9: │ ◄── Retrieval patterns.
│ Retrieval │ When to surface what.
└──────┬───────┘
│
┌──────▼───────┐
│ Three-pass │ ◄── Forward, backward, ground truth.
│ Validation │
└──────┬───────┘
│ all passes ✓
┌──────▼───────┐
│ Store to │ ◄── expert_chunks in Supabase
│ Supabase │ with embeddings, by module
└──────────────┘
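The gating above can be sketched as a small orchestrator. This is a minimal sketch, not the production implementation; `run_module` and `validate_against_rubric` are hypothetical callables standing in for the agent-swarm executors.

```python
# Hypothetical orchestration sketch of the module pipeline.
# run_module(entity, module) and validate_against_rubric(output, rubric)
# are placeholders for the real agent executors.

def run_pipeline(entity, run_module, validate_against_rubric):
    """Run Modules 1-9 with Module 1 as a hard gate."""
    rubric = run_module(entity, module=1)
    if not validate_against_rubric(rubric, rubric):
        # Rubric failed validation: STOP, no downstream modules run.
        return {"status": "failed", "reason": "Module 1 rubric failed validation"}

    outputs = {1: rubric}
    # Modules 2-8 run sequentially; each validates against the Module 1 rubric.
    for module in range(2, 9):
        out = run_module(entity, module=module)
        if not validate_against_rubric(out, rubric):
            return {"status": "failed", "reason": f"Module {module} failed rubric check"}
        outputs[module] = out

    # Module 9 (retrieval patterns) runs only after Modules 2-8 complete.
    outputs[9] = run_module(entity, module=9)
    return {"status": "extracted", "outputs": outputs}
```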
CREATE TABLE expert_extractions (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  entity_id UUID NOT NULL REFERENCES entities(id),
  gate INT NOT NULL CHECK (gate IN (2, 3)),
  module INT NOT NULL CHECK (module BETWEEN 1 AND 9),
  module_name TEXT NOT NULL,
  version INT NOT NULL DEFAULT 1,
  extracted_content JSONB NOT NULL,  -- structured per module
  raw_source_ids UUID[],             -- which content records were used
  confidence FLOAT,                  -- self-assessed (0-1)
  validation_status TEXT NOT NULL DEFAULT 'pending' CHECK (
    validation_status IN ('pending','forward_pass','backward_pass',
                          'ground_truth_pass','validated','failed')
  ),
  validation_notes TEXT,
  gate2_extraction_id UUID,          -- Gate 3: link to Gate 2 version
  gate2_accuracy JSONB,              -- {matched:[],corrected:[],missed:[]}
  extracted_at TIMESTAMPTZ DEFAULT now(),
  validated_at TIMESTAMPTZ,
  updated_at TIMESTAMPTZ DEFAULT now()
);
CREATE INDEX idx_ext_entity ON expert_extractions(entity_id);
CREATE INDEX idx_ext_module ON expert_extractions(module);
CREATE INDEX idx_ext_gate ON expert_extractions(gate);
CREATE INDEX idx_ext_status ON expert_extractions(validation_status);
CREATE TABLE expert_chunks (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  extraction_id UUID NOT NULL REFERENCES expert_extractions(id),
  entity_id UUID NOT NULL REFERENCES entities(id),
  module INT NOT NULL,
  chunk_text TEXT NOT NULL,
  chunk_type TEXT NOT NULL,  -- 'framework','voice_pattern','cta_template','case_study'
  embedding vector(1536),
  metadata JSONB DEFAULT '{}',
  created_at TIMESTAMPTZ DEFAULT now()
);
CREATE INDEX idx_echunks_entity ON expert_chunks(entity_id);
CREATE INDEX idx_echunks_module ON expert_chunks(module);
CREATE INDEX idx_echunks_type ON expert_chunks(chunk_type);
CREATE INDEX idx_echunks_embedding ON expert_chunks
USING ivfflat (embedding vector_cosine_ops);
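The ivfflat index gives approximate nearest-neighbor search under cosine distance. The exact ranking it approximates can be sketched in plain Python; the chunk records here are illustrative, not the real 1536-dimension embeddings.

```python
import math

def cosine_distance(a, b):
    """pgvector's vector_cosine_ops orders rows by 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm

def nearest_chunks(query_embedding, chunks, k=3):
    """Exact version of the ranking the ivfflat index approximates."""
    return sorted(chunks, key=lambda c: cosine_distance(query_embedding, c["embedding"]))[:k]
```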
Input: All ingested content for entity. Prioritize long-form (webinars, interviews, coaching) over social posts.
{
"rubric_name": "EXPERT_RUBRIC_NAME",
"mental_models": ["model1", "model2"],
"named_frameworks": [
{ "name": "", "components": [], "purpose": "" }
],
"decision_logic": [
{ "if": "condition", "then": "action", "because": "reasoning" }
],
"priority_hierarchy": ["first", "second", "third"],
"inviolable_principles": ["principle1", "principle2"],
"unique_terminology": { "term": "definition" }
}
Validation gate: before Module 2 proceeds, the rubric must contain at least 3 named frameworks, at least 5 if/then decision patterns, and a clear priority ordering, and Jason must review the rubric output. Module 1 is too important to auto-approve; this is the only human gate in the pipeline.
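The automated half of this gate can be sketched as a structural check over the Module 1 output; field names follow the JSON template above, and the human review step still follows a passing check.

```python
def rubric_passes_gate(rubric):
    """Automated checks that run before Jason's review.

    Requires >= 3 named frameworks, >= 5 complete if/then decision
    patterns, and a non-empty priority ordering.
    """
    complete_patterns = [
        d for d in rubric.get("decision_logic", [])
        if d.get("if") and d.get("then")
    ]
    return (
        len(rubric.get("named_frameworks", [])) >= 3
        and len(complete_patterns) >= 5
        and len(rubric.get("priority_hierarchy", [])) > 0
    )
```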
{
"sentence_patterns": [],
"signature_phrases": [],
"tone_descriptors": ["direct","warm","challenging"],
"emphasis_techniques": ["repetition","contrast","questions"],
"language_avoids": [],
"formality_level": "conversational|professional|academic",
"humor_style": "dry|storytelling|none",
"teaching_voice_vs_selling_voice": { "teaching":{}, "selling":{} }
}
Rubric check: "Does this voice pattern match the thinking rubric? Would someone who thinks like Module 1 describes naturally speak this way?"
{
"primary_motivation_triggers": [],
"invitation_patterns": [],
"urgency_creation": [],
"objection_handling": [],
"soft_vs_hard_cta_ratio": 0.7,
"example_ctas": [{ "context":"", "cta_text":"", "motivation_lever":"" }]
}
{
"proprietary_frameworks": [
{ "name":"", "components":[], "how_it_works":"",
"when_to_use":"", "source_material":"" }
],
"original_models": [],
"unique_terminology": { "term": { "definition":"", "usage_context":"" } },
"ip_that_must_not_be_altered": []
}
{
"teaching_progressions": [
{ "name":"", "steps":[], "prerequisites":[], "builds_toward":"" }
],
"prerequisite_chains": { "concept_a": ["requires_b","requires_c"] },
"scaffold_order": ["first","second","third"],
"beginner_vs_advanced": { "beginner":[], "advanced":[] }
}
{
"program_architectures": [
{ "name":"", "structure":"linear|modular|spiral",
"phases":[], "engagement_arc":"", "milestones":[] }
],
"coaching_flow": "",
"content_delivery_preferences": ""
}
{
"diagnostic_patterns": [
{ "signal":"", "diagnosis":"", "prescription":"", "source_example":"" }
],
"triage_framework": "",
"red_flags": [],
"green_flags": []
}
{
"scenario_applications": [
{ "scenario":"", "expert_response":"",
"frameworks_applied":[], "voice_markers":[], "source":"" }
]
}
These become test cases for clone validation and feed expert-clone-scorer.
{
"routing_rules": [
{ "trigger":"user asks about X",
"retrieve":"Module Y, framework Z",
"context_required":"", "priority":"primary|secondary" }
],
"context_windows": {
"new_user": ["what to surface first"],
"returning_user": ["surface based on history"],
"specific_problem": ["identify and route to framework"]
},
"never_combine": ["A should not appear with B because..."],
"always_combine": ["C is always better with D"]
}
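A minimal interpreter for these routing rules might look like the sketch below. Trigger matching here is naive substring matching for illustration; a real router would match on embeddings.

```python
def route(user_message, routing_rules):
    """Return matching retrieval pointers, primary rules first.

    Assumes each rule's trigger is a keyword or phrase; real matching
    would be semantic, not substring-based.
    """
    matches = [
        r for r in routing_rules
        if r["trigger"].lower() in user_message.lower()
    ]
    return sorted(matches, key=lambda r: 0 if r.get("priority") == "primary" else 1)
```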
"Does each module logically lead to the next?" Module 1 rubric → Module 2 voice should reflect thinking patterns. Module 2 voice → Module 3 CTAs should use voice patterns. Module 4 IP → Module 5 scaffolding should cover all framework components. Module 7 diagnostics → should reference Module 1 thinking. Module 8 examples → should demonstrate Modules 1-7 in action.
Prompt: "Review modules [N] and [N+1]. Does the output of [N] naturally lead to and support [N+1]? Identify gaps, contradictions, or missing connections."
"Does the deployment artifact trace back to source material?" For each claim in each module: can we point to specific ingested content that supports it? For each framework in Module 4: is there transcript evidence? For each voice pattern in Module 2: can we find 3+ examples?
Prompt: "For each item in this module's extraction, find the specific source content that supports it. If you cannot find evidence, flag as 'unsupported'."
"Does the clone's output match how the expert would actually respond?" Use Module 8 scenario_applications as test cases. Feed scenario to clone using extractions as context. Compare clone response to expert's actual response. Score: voice match (Module 2), framework usage (Module 4), diagnostic accuracy (Module 7).
Integration: Calls expert-clone-scorer with test cases from expert-test-extractor.
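The three passes map directly onto the validation_status values in the expert_extractions schema. A sketch of the state progression, with the pass functions as placeholders:

```python
# Status names mirror the validation_status CHECK constraint.
PASS_ORDER = [
    ("forward_pass", "forward"),
    ("backward_pass", "backward"),
    ("ground_truth_pass", "ground_truth"),
]

def run_validation(extraction, passes):
    """Advance an extraction through the forward, backward, and
    ground-truth passes; any failure stops the sequence at 'failed'."""
    for status_name, pass_name in PASS_ORDER:
        if not passes[pass_name](extraction):
            return "failed"
    return "validated"
```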
When an expert transitions from prospect (Gate 2) to signed (Gate 3), the pipeline re-runs against private content, links each new extraction to its Gate 2 version via gate2_extraction_id, and records what the public-only extraction got right in gate2_accuracy.
Learning loop: over time, this builds a model of what we can reliably extract from public content alone versus what requires private access. Each subsequent Gate 2 demo gets more accurate.
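The gate2_accuracy payload can be computed by diffing Gate 2 (public-only) claims against the Gate 3 (private-content) re-extraction. The set-based comparison below is a simplification; real claims would need fuzzy matching.

```python
def gate2_accuracy(gate2_claims, gate3_claims):
    """Classify Gate 2 claims against the Gate 3 re-extraction.

    matched:   claimed publicly and confirmed privately
    corrected: claimed publicly but revised at Gate 3
    missed:    only surfaced once private content was available
    """
    g2, g3 = set(gate2_claims), set(gate3_claims)
    return {
        "matched": sorted(g2 & g3),
        "corrected": sorted(g2 - g3),
        "missed": sorted(g3 - g2),
    }
```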
| Skill | Integration |
|---|---|
| expert-research | Runs BEFORE pipeline. Populates entity, discovers sources, triggers ingest. Output feeds Module 1 as context. |
| expert-doc-processor | PII scrubbing absorbed into ingest pipeline. Single-doc processing retained as a utility called by the Module 1-8 extractors. |
| expert-test-extractor | Runs AFTER Module 8. Takes scenario_applications, generates structured test cases (Q&A pairs with scoring criteria). |
| expert-clone-scorer | Runs DURING ground truth pass. Compares clone output to Module 8 ground truth. Returns per-dimension score. |
| expert-os-deployment | Runs AFTER validation complete. Takes validated extraction package → value ladder, validation page, onboarding, betaap.io deploy. |
| Expert | Issue | Action |
|---|---|---|
| Matt (SOUL/FLOW) | Complete from Jun 2024. Location unclear — Supabase, GDrive, or scattered across sessions. | Search GDrive. If found, import to expert_extractions table. If not, flag for re-extraction. |
| Bridger (SCALE/POWER) | In Supabase + GDrive. On hold (JV didn't finalize). | Locate existing chunks. Import to new schema. Tag as Gate 2 (no private docs received). |
| Brad (TIGER QUEST) | Rubric complete. Full extraction Nov 2025 — likely in Claude conversation text only. | Search Claude chat history. Export to GDrive. Import to schema. Tag as Gate 2. |
| Samuel (Align360) | In progress. Gate 2 public extraction underway. betaap.io v0 finalizing. | Active test case. Run automated pipeline on public content. Compare against manual extraction so far. |
Problem: Extraction outputs can be 400KB+ per expert. Loading all of it into the context window is wasteful.
Solution: Meta-index that points to sections rather than loading them:
{
"expert": "Samuel / Align360",
"modules_available": [1,2,3,4,5],
"quick_reference": {
"rubric_name": "ALIGN360_METHOD",
"core_frameworks": ["Framework A","Framework B","Framework C"],
"voice_summary": "Warm, direct, faith-integrated, story-driven",
"primary_audience": "Christian professionals seeking alignment"
},
"retrieval_pointers": {
"alignment_questions": "Load Module 4 → Framework B",
"diagnostic_needed": "Load Module 7 → diagnostic_patterns",
"content_generation": "Load Module 2 (voice) + Module 3 (CTA)"
}
}
Clone loads meta-index first (~500 tokens). Full module content only when a retrieval pointer activates. Lazy-load pattern from :2hat applied to expert knowledge.
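The lazy-load flow can be sketched as: load the meta-index, match a retrieval pointer, then fetch full module content only on demand. `load_module` is a placeholder for the Supabase fetch, and the digit-parsing of pointer text is an illustrative shortcut.

```python
def answer_context(meta_index, pointer_key, load_module):
    """Resolve a retrieval pointer to full module content on demand."""
    pointer = meta_index["retrieval_pointers"].get(pointer_key)
    if pointer is None:
        # No pointer fired: the ~500-token quick_reference is all we load.
        return {"loaded": [], "context": meta_index["quick_reference"]}
    # Pointer text names the module(s) to load, e.g. "Load Module 7 ...".
    modules = [int(tok) for tok in pointer.replace("→", " ").split() if tok.isdigit()]
    return {"loaded": modules, "context": [load_module(m) for m in modules]}
```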
MASTERYMADE — PRD 4 of 12 — plan.jasondmacdonald.com
Dominia Facta. Build what compounds.