Interactive demo · Causal AI for Pharma & Clinical Research

Causal estimates with their honest fragility.

Move the Rosenbaum sensitivity slider below and watch the W12 anticoagulation effect estimate change. This is what most observational AI vendors don't show you. Marketing site at rosenbound.com; this is the working product.

Product walkthrough

Three moments that define the platform.

A live walkthrough showing the Cognitive Validation Report refusing incoherent data, the Γ-bound sensitivity visualization, and the reproducibility certificate generated on every study. The full platform stays gated for Founding Partners.

Where this fits

Three doors. One product.

rosenbound.com
Marketing & positioning

Causal-inference clinical AI positioned for regulators, drug-safety directors, and RWE methodology leads. The Founding Partner Program lives here.

Visit →
rosenbound.ai
Interactive methodology demo · you are here

Hands-on with the W12 open-confounder benchmark, the Γ-slider sensitivity chart, the MIMIC-IV / FAERS scorecard, the audit-trail SHA-chain verifier, and the five-method causal methodology.

This page
Production platform
Production platform · gated

The actual product. Cohort upload, multi-method causal estimation, reviewer queues, 21 CFR Part 11 audit dashboards, MedDRA / RxNorm mapping, multi-tenant RBAC, WorkOS SSO. Founding Partner seats only.

Apply →
Official Python SDK

pip install rosenbound

Programmatic access to the audited platform — cohort upload, sensitivity-bounded study runs, reproducibility certificate retrieval. Apache 2.0; Pydantic v2 typed; py.typed for IDE autocomplete + mypy. Platform access gated server-side by Bearer token + RBAC + tenant scoping — the SDK is open, the audit substrate is not.

View on PyPI →
The W12 Open-Confounder Benchmark · Pre-computed locked benchmark

Five methods on heparin vs LMWH. None recovers RCT direction on bleeding.

MIMIC-IV ICU, n=153,708 admissions. Five causal estimators ran on the same cohort. Move the Γ slider to explore Rosenbaum sensitivity bounds: at Γ = 1.06, the bleed estimate is no longer statistically distinguishable from zero. Toggle the bound visibility to see what most vendors actually ship.

Γ = 1.00

At Γ = 1.06, an unobserved confounder of 5.6% odds shift would flip the bleed estimate.

Uncheck to see what observational AI vendors normally show — point estimates only, no fragility.

What this demonstrates → Across AIPW at three covariate-enrichment stages, IV-LATE, and the DCIE neural counterfactual learner, all five estimates agree on a positive ATT for LMWH on bleeding. The direction published from RCTs is the opposite. The Rosenbaum bound at Γ = 1.06 quantifies how little unobserved confounding it would take to account for that gap.
Try a prediction

Patient-level mortality risk with explicit fragility disclosure.

Enter synthetic patient inputs to see the kind of output a Rosenbound prediction returns: point estimate, calibrated confidence interval, and a Rosenbaum-bound fragility classification. The form below illustrates the shape of the output; the production model is connected separately to credentialed environments.

Demo only — illustrative output, not from a connected production model. The values below are computed from a documented heuristic that approximates W1 mortality model behavior under MIMIC-IV training distributions. They are NOT real predictions and must not be used for any clinical decision. The production W1 model (test AUROC 0.9476, Beta-Bayes ECE 0.0024) deploys in pharma and CRO environments under separate engagement.

Synthetic patient inputs

Configure inputs on the left and click Compute to see the illustrative output structure.
MIMIC-IV v3.1 Benchmark Portfolio · Pre-computed from production training runs

Seven W-lanes locked on the full corpus, plus FAERS Pipeline B for pharmacovigilance.

Click any W-lane row to expand and see train / val / test breakdown, calibration, and the published reference baseline. Log-file references available under NDA. The FAERS Pipeline B callout below covers the pharmacovigilance lane on its own corpus.

Lane Outcome Cohort Test AUROC Calibration
FAERS Pipeline B · B4 booster
Pharmacovigilance severity triage on FDA Adverse Event Reporting System full corpus
Test AUROC: 0.8872
Held-out reports: 158,732
Raw features: 826
Temporal split (train pre-2020, test 2020–2025). LightGBM raw booster, SHA-pinned to VBSM ledger. Cut over to production-pointer 2026-05-29 (replacing prior B3 baseline 0.8728).
ACIC22 Track-2 Causal Inference Challenge · V3 lock, full 3,400-cohort canonical

11× coverage lift over baseline.

Full 3,400-cohort canonical. Three Phase-1 ensemble fixes (full module-checkpoint completeness, stratified bootstrap fallback, IPM cost-matrix robustness) closed the coverage gap on the public leaderboard. Specific fix details under NDA.

21 CFR Part 11 ALCOA+ Audit Trail · Illustrative entry, production ledger structure

Every model decision, cryptographically chained.

A working example of one VBSM ledger entry. Every prediction the platform makes captures full provenance: SHA-256-chained hash, monotonic_ns timestamp, frozen TGLEntry, model version, user identifier. Click Verify chain to run the integrity check in your browser.

# VBSM Ledger Entry, committed by predict() route
{
  "ledger_id": 14237,
  "prev_hash": "a8f3c1...7b2d",
  "event": "prediction_committed",
  "committed_at_ns": 1782451234567890123,
  "committed_by": "user_id:pharma-dsm-04",
  "model_version": "w1_mortality_v3_isotonic_lock_2026_05_03",
  "input_digest": "sha256:c4e8...91f6",
  "prediction": {
    "point_estimate": 0.124,
    "ci_95": [0.101, 0.147],
    "calibration": "beta_bayes",
    "ece": 0.0024
  },
  "sensitivity": {
    "gamma_zero": 1.085,
    "fragility_class": "sensitive"
  },
  "feature_schema_hash": "sha256:8a2b...e4d1",
  "this_hash": "a9d4f2e1...c83a"
}
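The chaining idea behind an entry like this can be sketched in a few lines. This is a minimal illustration, not the VBSM implementation: the hashing scheme (SHA-256 over the previous hash plus a canonical JSON payload), the all-zero genesis hash, and the function names `entry_hash`, `append_entry`, and `verify_chain` are all assumptions made for the example.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # hypothetical sentinel used as prev_hash of the first entry


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash the previous hash together with a canonical JSON form of the payload."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + canonical).encode("utf-8")).hexdigest()


def append_entry(ledger: list, payload: dict) -> dict:
    """Append a payload to the ledger, linking it to the previous entry."""
    prev_hash = ledger[-1]["this_hash"] if ledger else GENESIS_HASH
    entry = {
        "ledger_id": len(ledger) + 1,
        "prev_hash": prev_hash,
        "payload": payload,
        "this_hash": entry_hash(prev_hash, payload),
    }
    ledger.append(entry)
    return entry


def verify_chain(ledger: list) -> bool:
    """Recompute every hash and check that each entry links to its predecessor."""
    expected_prev = GENESIS_HASH
    for entry in ledger:
        if entry["prev_hash"] != expected_prev:
            return False
        if entry["this_hash"] != entry_hash(entry["prev_hash"], entry["payload"]):
            return False
        expected_prev = entry["this_hash"]
    return True
```

Because each hash covers the previous one, editing any committed payload invalidates that entry and every entry after it, which is what lets a verifier check integrity without trusting the writer.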
Methodology

The five estimators, in plain English.

Every method below is a published, peer-reviewed causal-inference technique. Rosenbound's contribution is not novel statistics — it's running all five on the same cohort and reporting their disagreement honestly.

1. AIPW — Augmented Inverse Propensity Weighting

Robins-Rotnitzky-Zhao (1994) and Bang-Robins (2005). The doubly-robust workhorse: combines a propensity-score weighted contrast with an outcome-model-based augmentation. If either the propensity model OR the outcome model is correctly specified, the estimator is consistent.

Rosenbound runs three covariate enrichment stages: v3 (baseline 217 features), v4 (severity-augmented with 24h labs window), and v5 (multi-day trajectory with 72h × 3-bin features). On W12, all three stages produce the same wrong-direction Y_bleed estimate — adding observable severity does NOT recover RCT direction.
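For readers who want the estimator in concrete form, here is a minimal sketch of the textbook AIPW formula. It is written for the average treatment effect (the ATT variant reweights differently) and assumes the propensities `e` and the outcome predictions `mu1` and `mu0` were fitted elsewhere. The function name `aipw_ate` is hypothetical and this is not Rosenbound's production code.

```python
def aipw_ate(y, t, e, mu1, mu0):
    """AIPW estimate of the average treatment effect.

    y: observed outcomes; t: 0/1 treatment indicators;
    e: estimated propensities P(T=1 | X);
    mu1, mu0: outcome-model predictions under treatment and control.
    """
    total = 0.0
    for yi, ti, ei, m1, m0 in zip(y, t, e, mu1, mu0):
        # Outcome-model contrast, corrected by inverse-propensity-weighted residuals.
        total += (m1 - m0) + ti * (yi - m1) / ei - (1 - ti) * (yi - m0) / (1 - ei)
    return total / len(y)
```

The double robustness is visible in the formula: when the outcome model is exact the residual terms vanish and the propensities stop mattering, and when the propensities are right the weighted residuals correct a misspecified outcome model on average.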

2. DR-ATT with Crump-2009 overlap trim

Hahn (1998) doubly-robust ATT formulation, with Lunceford-Davidian (2004) variance estimators, applied on the Crump-Hotz-Imbens-Mitnik (2009) overlap-trimmed cohort. The trim discards units with extreme propensities (e < 0.1 or e > 0.9) where causal contrasts are unidentified.

This is the method that does recover RCT direction on W13 (DOAC vs warfarin in AFib+CKD): Y_stroke +0.0084, Y_bleed -0.0396, both consistent with RE-LY / ROCKET-AF / ARISTOTLE / ENGAGE-AF subgroup analyses. The estimate is robust to the trim threshold: across α ∈ {0.05, 0.10, 0.15}, the ATT moves by less than 0.004.
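The trim and the doubly-robust ATT contrast can be sketched as follows. This is an illustration under stated assumptions: each unit is a dict with hypothetical keys `y`, `t`, `e` (propensity), and `mu0` (predicted control outcome), both models are fitted elsewhere, and the unnormalized form of the estimator is used. The Lunceford-Davidian variance step is omitted.

```python
def trim_overlap(rows, alpha=0.10):
    """Keep units whose propensity lies in [alpha, 1 - alpha]."""
    return [r for r in rows if alpha <= r["e"] <= 1 - alpha]


def dr_att(rows):
    """Doubly-robust ATT from control-outcome predictions and odds weights."""
    n_treated = sum(r["t"] for r in rows)
    if n_treated == 0:
        raise ValueError("no treated units left after trimming")
    total = 0.0
    for r in rows:
        resid = r["y"] - r["mu0"]
        if r["t"] == 1:
            total += resid  # treated: observed minus predicted control outcome
        else:
            # Controls: odds weight e / (1 - e) reweights them toward the treated.
            total -= r["e"] / (1 - r["e"]) * resid
    return total / n_treated
```

A trim-sensitivity check in the spirit of the paragraph above would call `dr_att(trim_overlap(rows, a))` for each `a` in `(0.05, 0.10, 0.15)` and compare the results.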

3. IV-LATE — Two-Stage Least Squares with prescriber preference

Angrist-Krueger (1991) framework. Uses leave-one-out per-prescriber LMWH preference rate as instrument (n=4,383 unique providers in MIMIC-IV). Stage-1 F-statistic of 67,250 confirms strong-instrument regime; m-of-n bootstrap for inference (Bickel-Sakov 2008).

On W12 Y_vte: ATT = +0.0102 with 95% CI [-0.002, +0.021]; the CI includes zero. This recovers RCT-consistent non-inferiority on the venous thromboembolism outcome. On Y_bleed the IV estimate amplifies the wrong direction, which suggests the prescriber-preference instrument violates the exclusion restriction through specialty patient mix.
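A minimal sketch of the two ingredients described above: the leave-one-out preference instrument and a single-instrument 2SLS estimate with its first-stage F-statistic. It has no covariates and no m-of-n bootstrap, and the function names are hypothetical, so treat it as an illustration of the mechanics rather than the platform's estimator.

```python
from collections import defaultdict


def leave_one_out_preference(prescriber, treated):
    """Instrument per patient: the prescriber's treatment rate among their other patients."""
    counts = defaultdict(int)
    totals = defaultdict(int)
    for p, t in zip(prescriber, treated):
        counts[p] += 1
        totals[p] += t
    # Undefined (None) when the prescriber has only one patient.
    return [
        (totals[p] - t) / (counts[p] - 1) if counts[p] > 1 else None
        for p, t in zip(prescriber, treated)
    ]


def _cov(a, b):
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def iv_2sls(z, t, y):
    """Return (effect, first_stage_f) for one instrument and no covariates."""
    rows = [(zi, ti, yi) for zi, ti, yi in zip(z, t, y) if zi is not None]
    z, t, y = (list(col) for col in zip(*rows))
    n = len(z)
    var_z = _cov(z, z)
    first_stage = _cov(z, t) / var_z  # slope of treatment on instrument
    effect = _cov(z, y) / _cov(z, t)  # reduced form divided by first stage
    mean_z, mean_t = sum(z) / n, sum(t) / n
    rss = sum((ti - mean_t - first_stage * (zi - mean_z)) ** 2 for zi, ti in zip(z, t))
    if rss == 0:
        return effect, float("inf")
    # With one instrument, F is the squared t-statistic of the first-stage slope.
    se_sq = (rss / (n - 2)) / ((n - 1) * var_z)
    return effect, first_stage ** 2 / se_sq
```

Leaving the patient's own treatment out of the preference rate avoids a mechanical correlation between the instrument and that patient's treatment.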

4. DCIE — Patent-pending neural counterfactual learner

USPTO provisional filed March 22, 2026. Differentiable Causal Inference Engine (DCIE): patent-pending individual-level treatment-effect estimation with representation-balanced counterfactual learning. Architecture details and component-level design are held under NDA pending non-provisional conversion.

Production-validated on ACIC22 V3 lock (bias +19.26, RMSE 28.80, coverage 77.53%) and W12 v5-trajectory cohort (Y_bleed ATT +0.0255, AIPW-equivalent). Confirms — surprisingly — that the additional capacity of a neural counterfactual learner doesn't extract signal beyond classical AIPW when the residual confounder is observation-invisible. Further evidence the W12 fragility is informational, not algorithmic.
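For orientation, the three summary metrics quoted above can be computed from per-cohort results as in the sketch below. The definitions used here (mean error, root mean squared error, and the share of intervals containing the true effect) are the generic ones and are an assumption. The official ACIC22 scoring rules may differ in detail, and `score_estimates` is a hypothetical helper.

```python
from math import sqrt


def score_estimates(truth, estimate, lower, upper):
    """Return (bias, rmse, coverage) across cohorts."""
    n = len(truth)
    errors = [est - tru for est, tru in zip(estimate, truth)]
    bias = sum(errors) / n
    rmse = sqrt(sum(err * err for err in errors) / n)
    # Coverage: fraction of cohorts whose interval contains the true effect.
    covered = sum(1 for tru, lo, hi in zip(truth, lower, upper) if lo <= tru <= hi)
    return bias, rmse, covered / n
```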

5. Rosenbaum Γ-sensitivity bounds

Rosenbaum (2002) Observational Studies, Chapter 4. Asks the question that no point estimate can answer: "How strong would an unobserved confounder need to be — measured as an odds-ratio multiplier on treatment assignment — to nullify or flip the estimated effect?"

On W12, Γ_zero = 1.06 for Y_bleed and 1.17 for Y_vte. Both are in the "very sensitive" tier (Γ < 1.2). Plain-English translation: a confounder that shifts the odds of receiving LMWH by 5.6% in patients who would later bleed is sufficient to reverse the headline estimate. Physician judgment artifacts (prior HIT, family allergies, off-record consults) plausibly produce this magnitude of shift, and they don't appear in chartevents.
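To make the Γ question concrete, here is a sketch of the simplest textbook case: Rosenbaum's bound for McNemar's test on matched pairs with a binary outcome. Among discordant pairs, the chance that the treated unit is the one with the event is at most Γ / (1 + Γ) under no effect, which gives a worst-case p-value. Whether the platform uses this matched-pairs form is not stated in the text above, so this is an analogue only, and the function names and grid search are hypothetical.

```python
from math import comb


def upper_p_value(discordant: int, treated_events: int, gamma: float) -> float:
    """Worst-case one-sided p-value for McNemar's test under hidden bias gamma."""
    p_plus = gamma / (1.0 + gamma)
    return sum(
        comb(discordant, k) * p_plus ** k * (1.0 - p_plus) ** (discordant - k)
        for k in range(treated_events, discordant + 1)
    )


def critical_gamma(discordant, treated_events, alpha=0.05, step=0.01, max_gamma=10.0):
    """Smallest gamma on a grid at which the worst-case p-value exceeds alpha."""
    steps = 0
    while 1.0 + steps * step <= max_gamma:
        gamma = 1.0 + steps * step
        if upper_p_value(discordant, treated_events, gamma) > alpha:
            return round(gamma, 10)
        steps += 1
    return None  # still significant at max_gamma
```

A critical Γ close to 1 means a very small unobserved bias in treatment assignment is enough to remove statistical significance, which is the sense in which an estimate is called sensitive.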

Patent Architecture · USPTO provisional · filed 2026-03-22

Five inventive concepts. All five production-validated. All five protected.

The platform sits on five independently patentable AGI primitives. Implementation details are held under NDA pending non-provisional or PCT conversion (12-month window through 2027-03-22). Clinical AI is the first commercial vertical; each module below is wired into a verified clinical pathway today.

PSIM

Persistent Self-Improving Memory

An episodic-semantic-causal memory layer that records every signal, decision, and outcome the system has seen and consolidates it into a structured knowledge base the model uses to improve on future cases. The platform doesn’t forget what it learned from yesterday’s reviewer queue.

Production-validated: 2.83M-row FAERS pharmacovigilance memory backfill.
DCIE

Differentiable Causal Inference Engine

A neural counterfactual learner with representation-balanced individual-level treatment-effect estimation. The fifth method in the W12 sensitivity pentagon. Used for the patient-level CATE estimates that no observational AI vendor reports honestly.

Production-validated: matches classical causal-inference estimators on the public ACIC22 Track-2 challenge (3,400 cohorts), MIMIC-IV ICU vasopressor treatment effects, and the W12 heparin-vs-LMWH bleeding study.
VBSM

Verifiable Bounded Self-Modification

Every model retrain, every prediction, every reviewer action commits to a SHA-256-chained, append-only ledger with formal pre/post-condition checks. The 21 CFR Part 11 ALCOA+ system of record that an FDA inspector or pharma QA team can run an external verifier against without trusting our code.

Production-validated: every production model — clinical mortality, hospital readmission, pharmacovigilance triage, and causal estimation — writes to a SHA-chained ledger that an independent external verifier can audit.
MCS

Modular Cognitive Substrate

A five-layer cognitive architecture with a typed inter-module control protocol, a meta-learning policy layer, and an embedded self-capability estimator. The self-capability estimator is what enables the platform to abstain on cases it knows it cannot reliably handle, instead of confidently giving you the wrong answer.

Production-validated: on the FAERS pharmacovigilance benchmark, the platform correctly identifies and abstains on cases it cannot reliably handle, rather than producing low-confidence answers.
HNSI

Hybrid Neuro-Symbolic Integration

A clinical-NER pipeline that turns unstructured discharge summaries and radiology reports into structured, negation-aware, section-aware, temporally-anchored facts that flow into the same causal model the structured data flows into. Maps free-text to RxNorm / MedDRA / SNOMED before it touches the prediction layer.

Production-validated: 1.64M MIMIC-IV-Note clinical notes processed; 80M+ entities extracted with 13–18% negation rate.
Restricted materials — corporate access

Start with the one-pager. Request additional materials in the form.

The Rosenbound One-Page Pitch is auto-served to Founding Partners, advisors, and investors using a corporate, university, or institutional email. The one-pager itself notes deeper materials are available on request — use the checkboxes in the form to flag interest in specific items, and we will follow up directly. Personal email providers are not accepted.

Rosenbound One-Page Pitch (PDF)

Founding insight, four-vertical product table, MIMIC-IV benchmark portfolio, IP defensibility, business model, 5-year ARR potential. Lead with this — it's the fastest read-and-share artifact for design-partner and investor diligence.

Founding Partner Program

Five seats. $100K Year 1, locked Founding rate forever.

Five Founding Partner seats for pharma drug-safety teams and CRO methodology leads. List price is $250K/yr (Aetion-tier methodology-platform comp). Founders pay $100K Year 1, $175K Years 2–3, $250K/yr Year 4+ — locked at the Founding rate for the lifetime of the contract. You get direct founder access, quarterly roadmap co-input, named reference for the case study, and waived integration setup.

Year 1: $100K (vs $250K list, 60% off)
Year 2: $175K (locked Founding rate, 30% off)
Year 3: $175K (locked Founding rate, 30% off)
Year 4+: $250K (locked Founding rate forever)