Research Library

Evidence library pattern for source cataloging, grading, and claim synthesis — with the DOC'S Research Protocol Builder as the worked example.

Overview

The Research Library pattern gives tenants a structured evidence pipeline: external sources are ingested, graded against a quality rubric, distilled into verifiable claims, and linked to existing domain records (protocols, conditions, guidelines). The pipeline is fully agent-driven with one human-in-the-loop approval gate.

This pattern was first implemented for the DOC'S tenant (a recovery and wellness center in Wall Township, NJ) to support Nolan's evidence-base work across six treatment modalities. It is built entirely on generic platform affordances — no DOC'S-specific slugs appear in platform code.

What the pattern provides:

  • A three-tier entity model: source feed → source-item article → evidence-claim assertion
  • Two criteria sets for grading articles and claims
  • Server-side aggregate recompute (source count + quality average) that keeps claim grades deterministic
  • Four agent roles with explicit capability scoping
  • A task pipeline from URL ingestion through HITL claim approval
  • Five views covering the full review and gap-analysis workflow

Entity Model

source (feed)
  └── source-item (article / study)
        ├── supports → evidence-claim (synthesized assertion)
        └── about    → modality (the treatment area)

evidence-claim
  ├── about      → modality
  ├── relevant-to → condition   (optional)
  └── justifies  → protocol    (optional — the Protocol Builder bridge)

Entity types

Type            Owner     Key fields
source          platform  source_type, url, scrape_strategy, schedule
source-item     platform  title, body, author, published_at, source_url, ingest_status
modality        tenant    description, device, claims_scope, contraindication_profile
evidence-claim  tenant    statement, status, source_count*, source_quality_avg*

*Server-computed via Inngest on every response.promoted event for a linked source-item.
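
As a rough sketch, the two core shapes translate to something like the following — illustrative TypeScript only; the field types and the ingest_status values are assumptions, not the platform's actual schema:

// Illustrative shapes — types and ingest_status values are assumptions.
interface SourceItem {
  title: string
  body: string
  author: string | null
  published_at: string | null // ISO date
  source_url: string
  ingest_status: "pending" | "extracted" | "failed" // assumed values
}

interface EvidenceClaim {
  statement: string
  status: "draft" | "under-review" | "approved" | "rejected" // Claims Queue states
  source_count: number       // server-computed via Inngest
  source_quality_avg: number // server-computed, 0–10
}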

Relation types

From            To              Type         Notes
source-item     modality        about        Which modality the article covers
source-item     evidence-claim  supports     Source-item cites / supports a claim
evidence-claim  modality        about        Which modality the claim concerns
evidence-claim  condition       relevant-to  Optional — clinical condition linkage
evidence-claim  protocol        justifies    Optional — Protocol Builder bridge

The source-item → evidence-claim direction (supports) is intentional. It matches how reviewers cite evidence ("this study supports this claim") and aligns with the platform's from → to idiom.
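
Concretely, the direction shows up in the createRelation call (IDs are placeholders; this mirrors the Claim Synthesizer step under For Agents below):

// The graded study is the `from` side; the claim it supports is the `to` side.
await tools.createRelation({
  fromEntityId: sourceItemId, // source-item (the study)
  toEntityId: claimId,        // evidence-claim
  type: "supports",
})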

Criteria Set Pattern

evidence-quality — grading a source-item

Six dimensions, each scored 0–10 by the Methodology Reviewer agent:

Dimension          What it measures
study_design       RCT / systematic review = 10; cohort = 7; case-series = 3; opinion = 0
sample_size        n > 500 = 10; n > 100 = 7; n > 30 = 5; n < 30 = 3
peer_review        Peer-reviewed journal = 10; conference = 6; preprint = 3; blog = 0
effect_size        Large + CI excludes null = 10; moderate = 7; small = 4; none = 0
methodology_rigor  Blinded + randomized + controls = 10; observational = 3
recency            < 2 years = 10; < 5 = 7; < 10 = 5; < 20 = 3; older = 1
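
The Methodology Reviewer applies these thresholds from its prompt, but the two purely numeric dimensions are mechanical enough to restate as code — a sketch, with the cut-offs taken directly from the table above:

// Rubric thresholds restated as code — same cut-offs as the table above.
function scoreSampleSize(n: number): number {
  if (n > 500) return 10
  if (n > 100) return 7
  if (n > 30) return 5
  return 3
}

function scoreRecency(ageYears: number): number {
  if (ageYears < 2) return 10
  if (ageYears < 5) return 7
  if (ageYears < 10) return 5
  if (ageYears < 20) return 3
  return 1
}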

body-of-evidence-strength — grading an evidence-claim

Five dimensions. Two are server-pre-computed; three are scored qualitatively by the Evidence Strength Assessor agent:

Dimension           Scored by  What it measures
source_count        Server     # of linked source-items (≥ 5 = 10; 3–4 = 7; 2 = 5; 1 = 3)
source_quality_avg  Server     Mean evidence-quality promoted score of supporting source-items (0–10)
consistency         Agent      Do sources agree on effect direction? Strong = 10; mixed = 5
clinical_relevance  Agent      Does the effect translate to this tenant's clinical practice?
safety_alignment    Agent      Does the evidence respect safety posture / contraindications?

Why server-computed aggregates? LLMs are unreliable when asked to average scores across a variable-length set of entities. The Inngest function recompute-claim-aggregates-on-response maintains these fields automatically whenever a source-item's evidence-quality response is promoted, so the agent only grades the three qualitative dimensions.

Platform Affordances

These four platform additions ship as part of this pattern and are available to any tenant:

1. extractUrlToArticle composer

features/source-sync/server/extract-url.ts — single-URL ingestion chain (Firecrawl → browser-agent → readable-HTML fallback). See Source Sync for the full API.
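
The chain tries each extractor in order and falls through on failure — roughly as below. This is a sketch only; the helper names are assumptions (see Source Sync for the real API):

// Sketch of the fallback order — helper names are assumptions.
interface Article { title: string; body: string }

declare function tryFirecrawl(url: string): Promise<Article | null>
declare function tryBrowserAgent(url: string): Promise<Article | null>
declare function readableHtmlFallback(url: string): Promise<Article | null>

export async function extractUrlToArticle(url: string): Promise<Article | null> {
  return (
    (await tryFirecrawl(url)) ??      // structured extraction first
    (await tryBrowserAgent(url)) ??   // JS-heavy pages
    (await readableHtmlFallback(url)) // last resort: readable-HTML parse
  )
}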

2. ingestUrl agent tool

Registered slug "ingestUrl" in features/tools/source-tools.ts. Takes url, optional sourceSlug, optional linkToEntities[]. Creates a source-item entity and attaches relations by slug. Required permission: entities.team.create.

// Agent call example
await tools.ingestUrl({
  url: "https://pubmed.ncbi.nlm.nih.gov/12345678/",
  linkToEntities: [{ slug: "hbot", relationType: "about" }],
})

3. getEntity with includeResponses

getEntity accepts an optional includeResponses: boolean flag (default false). When true, the returned entity includes its promoted responses keyed by criteria-set slug. Downstream agents use this to read a source-item's evidence-quality grade without a separate tool call.

// Agent call example
await tools.getEntity({
  entityId: sourceItemId,
  includeResponses: true,
})
// response.responses["evidence-quality"] contains dimension scores

Backed by features/responses/server/get-responses-for-entities.ts.

4. recompute-claim-aggregates-on-response Inngest function

features/inngest/functions/recompute-claim-aggregates-on-response.ts — listens for response.promoted. If the entity is a source-item, calls recomputeClaimAggregates for all linked claims.

recomputeClaimAggregates in features/responses/server/recompute-claim-aggregates.ts is generic — callers override supportsRelationType (default "supports") and evidenceQualityCriteriaSlug (default "evidence-quality") to match their domain.
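
A sketch of the recompute under those defaults — the three helpers (listSupportingSourceItems, getPromotedScore, updateEntityFields) are illustrative stand-ins, not the module's real internals:

// Sketch only — the declared helpers stand in for real platform internals.
declare function listSupportingSourceItems(
  claimId: string,
  relationType: string,
): Promise<{ id: string }[]>
declare function getPromotedScore(
  entityId: string,
  criteriaSlug: string,
): Promise<number | null>
declare function updateEntityFields(
  entityId: string,
  fields: Record<string, number>,
): Promise<void>

export async function recomputeClaimAggregates(
  claimId: string,
  opts: { supportsRelationType?: string; evidenceQualityCriteriaSlug?: string } = {},
) {
  const relationType = opts.supportsRelationType ?? "supports"
  const criteriaSlug = opts.evidenceQualityCriteriaSlug ?? "evidence-quality"

  // Every source-item linked to the claim via the supports relation
  const items = await listSupportingSourceItems(claimId, relationType)

  // Only promoted evidence-quality responses count toward the average
  const scores = (
    await Promise.all(items.map((i) => getPromotedScore(i.id, criteriaSlug)))
  ).filter((s): s is number => s !== null)

  await updateEntityFields(claimId, {
    source_count: items.length,
    source_quality_avg: scores.length
      ? scores.reduce((a, b) => a + b, 0) / scores.length
      : 0,
  })
}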

DOC'S Worked Example

Tenant configuration

  • 6 modality records seeded: full-body-cryo, localized-cryo, hbot, normatec, pbm-bed, infrared-sauna
  • Criteria sets: evidence-quality + body-of-evidence-strength (as above)
  • Navigation: "Research" group (Modalities, Research Library, Claims Queue, Evidence Gaps) inserted before "Clinical"

Agents

All four agents use explicit customTools lists — no toolGroups. This prevents capability bleed (the Methodology Reviewer must not be able to create entities).

Agent                       Tools                                                            Heartbeat
Research Scout              webSearch, fetchUrl, ingestUrl, searchEntities, createRelation   Optional
Methodology Reviewer        getEntity (with includeResponses), submitResponse                No
Claim Synthesizer           getEntity, searchEntities, createEntity, createRelation         No
Evidence Strength Assessor  getEntity (with includeResponses), submitResponse                No

Each agent's system prompt embeds the full scoring rubric verbatim. The Claim Synthesizer prompt includes an explicit deduplication step: before creating a new claim, call searchEntities scoped to the target modality and skip any semantically similar existing claim.

Tasks

Task slug                    Agent                       Trigger         Output
research-source-sweep        Research Scout              manual / cron   entities
grade-source-item            Methodology Reviewer        entity_created  response
ingest-url-to-source-item    Research Scout              manual          entity
extract-claims-for-modality  Claim Synthesizer           manual          entities
assess-claim-strength        Evidence Strength Assessor  entity_created  response
promote-claim-to-approved    Human (HITL)                manual          status

No double-grading: ingest-url-to-source-item does not enqueue grading. The entity_created trigger on grade-source-item fires automatically when any source-item is created — whether by the manual task or by the capture hook.
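
As a sketch of why that holds — defineTask and this config shape are assumptions, not the platform's real task API:

// Illustrative only — defineTask and this config shape are assumptions.
declare function defineTask(config: object): void

defineTask({
  slug: "grade-source-item",
  agent: "methodology-reviewer",
  // Fires once per new source-item, whatever created it (manual task,
  // research sweep, or capture hook) — so nothing else enqueues grading.
  trigger: { type: "entity_created", entityType: "source-item" },
  output: "response",
})

defineTask({
  slug: "ingest-url-to-source-item",
  agent: "research-scout",
  trigger: { type: "manual" }, // grading follows via the trigger above
  output: "entity",
})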

Capture → ingestion path: when Nolan pastes a URL into Quick Capture, the platform's capture URL routing hook fires, invoking Research Scout with ingestUrl. A draft source-item appears in the inbox within seconds; the evidence-quality grade follows asynchronously via the grade-source-item task (may take a few minutes for long PDFs).

Views

View              Surface type   Entity          Purpose
Modality Library  grid           modality        6 modality cards with source count, claim count, gap flag
Research Library  grid           source-item     All articles; columns: title, modality, grade, date, status
Claims Queue      kanban         evidence-claim  draft → under-review → approved → rejected workflow
Evidence Gaps     grid           modality        Modalities with < 5 approved claims or avg grade < 5
Modality Dossier  page (detail)  modality        Per-modality: sources, claims, gaps, linked protocols

agent_context workspace setting

The DOC'S tenant agent_context is merged idempotently using sentinel markers (<!-- docs-research-context-start --> ... <!-- docs-research-context-end -->). Re-running any seed script is safe — it replaces only the content between the markers.
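
A sketch of that merge — the marker strings come from above; storing the context as a single string is an assumption:

// Idempotent merge sketch — re-running replaces only the managed region.
const START = "<!-- docs-research-context-start -->"
const END = "<!-- docs-research-context-end -->"

function mergeAgentContext(existing: string, block: string): string {
  const managed = `${START}\n${block}\n${END}`
  const start = existing.indexOf(START)
  const end = existing.indexOf(END)
  if (start !== -1 && end !== -1) {
    // Markers present: swap out only the content between them
    return existing.slice(0, start) + managed + existing.slice(end + END.length)
  }
  // First run: append the managed block
  return existing ? `${existing}\n\n${managed}` : managed
}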

The context block instructs all agents to: prefer peer-reviewed primary sources, be conservative about claim strength, respect DOC'S absolute contraindications, and tie claims to modality parameters DOC'S actually uses (session duration, frequency, device settings).

For Agents

Research Scout — finding and ingesting sources

1. webSearch({ query: "HBOT traumatic brain injury RCT 2020-2025", numResults: 20 })
2. For each promising result URL:
   ingestUrl({ url, linkToEntities: [{ slug: "hbot", relationType: "about" }] })
3. Report: list of created source-item entityIds + titles

Methodology Reviewer — grading a source-item

1. getEntity({ entityId: sourceItemId, includeResponses: false })
   — read title, body, author, published_at, source_url
2. Score 6 dimensions against the evidence-quality rubric embedded in your system prompt
3. submitResponse({
     entityId: sourceItemId,
     criteriaSetSlug: "evidence-quality",
     values: { study_design: 8, sample_size: 7, ... },
     rationale: "..."
   })

Claim Synthesizer — extracting claims from graded sources

1. searchEntities({ typeSlug: "source-item", filters: { modality: "hbot" } })
   — find graded source-items for the target modality
2. For each source-item: getEntity({ entityId, includeResponses: true })
   — read body + evidence-quality scores
3. searchEntities({ typeSlug: "evidence-claim", query: "<proposed claim text>" })
   — dedup check BEFORE creating
4. If no near-duplicate: createEntity({ typeSlug: "evidence-claim", ... })
5. createRelation({ fromEntityId: sourceItemId, toEntityId: claimId, type: "supports" })

Evidence Strength Assessor — grading a claim

1. getEntity({ entityId: claimId, includeResponses: false })
   — read statement, source_count, source_quality_avg (server-computed fields)
2. For each supporting source-item: getEntity({ entityId, includeResponses: true })
   — read evidence-quality grades for consistency assessment
3. submitResponse({
     entityId: claimId,
     criteriaSetSlug: "body-of-evidence-strength",
     values: {
       source_count: <read from entity field, do not recalculate>,
       source_quality_avg: <read from entity field, do not recalculate>,
       consistency: 8,
       clinical_relevance: 7,
       safety_alignment: 9,
     },
     rationale: "..."
   })

Note: source_count and source_quality_avg are read from the entity's content fields — do not recompute them. The server maintains them via Inngest.

Design Decisions

customTools over toolGroups for research agents — toolGroups bundle broad tool sets (e.g., all entity tools). The Methodology Reviewer must not have createEntity; the Claim Synthesizer must not have submitResponse before creating a claim. Explicit per-agent tool lists are the only way to correctly scope capability. Caught and enforced during codex review.

Server-computed aggregates — source_count and source_quality_avg are maintained by Inngest, not computed by the Evidence Strength Assessor. LLMs are unreliable when asked to average a variable-length set of numbers; embedding arithmetic in the agent prompt drifts over model updates. Denormalized fields with a server trigger are deterministic and auditable.

source-item → evidence-claim relation direction — the relation goes from source-item to claim (supports), not from claim to source-item (supported-by). This matches reviewer citation behavior ("this study supports this claim") and the platform's from → to idiom where from is the describing entity and to is the target. The initial spec had it inverted; caught in codex review.

Rubric text duplicated in agent prompts — criteria dimension description fields store the threshold text, but agent prompts embed the full rubric verbatim. This prevents prompt drift if criteria dimensions are edited in the admin UI. The grading agents are the rubric's primary consumers; they should not need a separate getCriteriaSets lookup at inference time.

Claim deduplication is a prompt instruction, not a DB constraint — a unique constraint on claim statement text would prevent legitimate rewording. Instead, the Claim Synthesizer is instructed to searchEntities before creating, and acceptance criterion 5 verifies re-running the task does not create duplicates.

Capture hook fires event, not a direct session — the capture server action must complete quickly. URL ingestion (especially Firecrawl) can take 10–30 seconds. The hook fires an Inngest event (capture.url.routed) so ingestion is decoupled from the capture response time.
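
A sketch of the decoupling with Inngest's send/consume API — the client id and event payload shape are assumptions; the event name is from above:

import { Inngest } from "inngest"

// Client id and payload shape are assumptions; the event name is from above.
const inngest = new Inngest({ id: "sprinter" })

// In the capture server action: fire-and-forget keeps capture fast.
export async function onCaptureUrlRouted(url: string, tenantId: string) {
  await inngest.send({ name: "capture.url.routed", data: { url, tenantId } })
}

// The slow ingestion (Firecrawl can take 10–30 s) runs here instead.
export const ingestCapturedUrl = inngest.createFunction(
  { id: "ingest-captured-url" },
  { event: "capture.url.routed" },
  async ({ event }) => {
    // invoke Research Scout with ingestUrl({ url: event.data.url, ... })
  },
)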

Related

  • Source Sync — extraction primitives, extractUrlToArticle, ingestUrl tool
  • Entity System — entity types, relations, field schemas
  • Sessions — task execution, session events
  • Capture — URL routing hook
  • Response System — criteria sets, scoring, promotion
  • Inngest — recompute-claim-aggregates-on-response function
