Rosenbound — Interactive Demo

Where this fits

Three doors. One product.

rosenbound.com

Marketing & positioning

Causal-inference clinical AI positioned for regulators, drug-safety directors, and RWE methodology leads. The Founding Partner Program lives here.

rosenbound.ai

Interactive methodology demo · you are here

Hands-on with the W12 open-confounder benchmark, the Γ-slider sensitivity chart, the MIMIC-IV / FAERS scorecard, the audit-trail SHA-chain verifier, and the five-method causal methodology.

Production platform

Production platform · gated

The actual product. Cohort upload, multi-method causal estimation, reviewer queues, 21 CFR Part 11 audit dashboards, MedDRA / RxNorm mapping, multi-tenant RBAC, WorkOS SSO. Founding Partner seats only.

The W12 Open-Confounder Benchmark · Pre-computed locked benchmark

Five methods on heparin vs LMWH. None recovers RCT direction on bleeding.

MIMIC-IV ICU, n=153,708 admissions. Five causal estimators ran on the same cohort. Move the Γ slider to explore Rosenbaum sensitivity bounds: at Γ = 1.06, the bleed estimate can no longer be statistically distinguished from zero. Toggle the bound visibility to see what most vendors actually ship.

Rosenbaum Γ (unobserved-confounder strength)

Γ = 1.00

At Γ = 1.06, an unobserved confounder of 5.6% odds shift would flip the bleed estimate.

Sensitivity bound visibility

Show fragility bounds (Rosenbound default)

Uncheck to see what observational AI vendors normally show — point estimates only, no fragility.

What this demonstrates: across AIPW with progressive covariate enrichment, IV-LATE, and the DCIE neural counterfactual learner, all five methods agree on a positive ATT for LMWH on bleeding. The RCT-published direction is the opposite. The Rosenbaum bound at Γ = 1.06 shows how little unobserved confounding it would take to account for that gap.

Try a prediction

Patient-level mortality risk with explicit fragility disclosure.

Enter synthetic patient inputs to see the kind of output a Rosenbound prediction returns: point estimate, calibrated confidence interval, and a Rosenbaum-bound fragility classification. The form below illustrates the shape of the output; the production model is connected separately to credentialed environments.

Demo only — illustrative output, not from a connected production model. The values below are computed from a documented heuristic that approximates W1 mortality model behavior under MIMIC-IV training distributions. They are NOT real predictions and must not be used for any clinical decision. The production W1 model (test AUROC 0.9476, Beta-Bayes ECE 0.0024) deploys in pharma and CRO environments under separate engagement.

Synthetic patient inputs

Age

Admission type

ICU contact during stay?

Peak INR (first 24h)

Peak lactate (mmol/L)

On vasopressors?

Configure inputs on the left and click Compute to see the illustrative output structure.

21 CFR Part 11 ALCOA+ Audit Trail · Illustrative entry, production ledger structure

Every model decision, cryptographically chained.

A working example of one VBSM ledger entry. Every prediction the platform makes captures full provenance: SHA-256-chained hash, monotonic_ns timestamp, frozen TGLEntry, model version, user identifier. Click Verify chain to run the integrity check in your browser.

# VBSM Ledger Entry (committed by predict() route)
{
  "ledger_id": 14237,
  "prev_hash": "a8f3c1...7b2d",
  "event": "prediction_committed",
  "committed_at_ns": 1782451234567890123,
  "committed_by": "user_id:pharma-dsm-04",
  "model_version": "w1_mortality_v3_isotonic_lock_2026_05_03",
  "input_digest": "sha256:c4e8...91f6",
  "prediction": {
    "point_estimate": 0.124,
    "ci_95": [0.101, 0.147],
    "calibration": "beta_bayes",
    "ece": 0.0024
  },
  "sensitivity": {
    "gamma_zero": 1.085,
    "fragility_class": "sensitive"
  },
  "feature_schema_hash": "sha256:8a2b...e4d1",
  "this_hash": "a9d4f2e1...c83a"
}
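The chaining idea behind an entry like the one above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the page does not say how this_hash is derived, so the sketch assumes it is the SHA-256 of the entry's canonical JSON (sorted keys, this_hash excluded), and the helper names entry_hash, append_entry, and verify_chain are hypothetical rather than part of the Rosenbound API.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder prev_hash for the first entry


def entry_hash(entry: dict) -> str:
    """SHA-256 over the entry's canonical JSON, excluding its own this_hash."""
    payload = {k: v for k, v in entry.items() if k != "this_hash"}
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_entry(chain: list, body: dict) -> dict:
    """Link a new entry to the previous entry's hash and append it."""
    prev = chain[-1]["this_hash"] if chain else GENESIS
    entry = dict(body, ledger_id=len(chain), prev_hash=prev)
    entry["this_hash"] = entry_hash(entry)
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    """Recompute every hash and check each prev_hash link in order."""
    prev = GENESIS
    for entry in chain:
        if entry["prev_hash"] != prev or entry["this_hash"] != entry_hash(entry):
            return False
        prev = entry["this_hash"]
    return True


ledger = []
append_entry(ledger, {"event": "prediction_committed", "point_estimate": 0.124})
append_entry(ledger, {"event": "prediction_committed", "point_estimate": 0.131})
ok_before = verify_chain(ledger)   # untouched chain verifies
ledger[0]["point_estimate"] = 0.5  # simulate tampering with a committed entry
ok_after = verify_chain(ledger)    # the stored hash no longer matches
print(ok_before, ok_after)
```

Because each entry embeds the hash of its predecessor, editing any committed field invalidates that entry's hash and every link after it, which is what lets a verifier run without trusting the code that wrote the ledger.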

Methodology

The five estimators, in plain English.

Every method below is a published, peer-reviewed causal-inference technique. Rosenbound's contribution is not novel statistics — it's running all five on the same cohort and reporting their disagreement honestly.

1. AIPW — Augmented Inverse Propensity Weighting

Robins-Rotnitzky-Zhao (1994) and Bang-Robins (2005). The doubly-robust workhorse: combines a propensity-score weighted contrast with an outcome-model-based augmentation. If either the propensity model OR the outcome model is correctly specified, the estimator is consistent.
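A minimal sketch of the doubly-robust idea, for illustration only. It assumes the propensities and outcome-model predictions have already been fitted elsewhere, it targets the textbook ATE form rather than the ATT reported on this page, and the function name aipw_ate is hypothetical.

```python
def aipw_ate(y, t, e, mu1, mu0):
    """AIPW estimate of the average treatment effect (ATE).

    y: observed outcomes; t: 0/1 treatment flags; e: estimated propensities
    P(T=1 | X); mu1, mu0: outcome-model predictions under treatment / control.
    """
    total = 0.0
    for yi, ti, ei, m1, m0 in zip(y, t, e, mu1, mu0):
        outcome_contrast = m1 - m0
        # Inverse-propensity-weighted residuals correct the outcome model.
        augmentation = ti * (yi - m1) / ei - (1 - ti) * (yi - m0) / (1 - ei)
        total += outcome_contrast + augmentation
    return total / len(y)


# Toy data: the outcome model is exactly right, the propensities are arbitrary.
t = [1, 0, 1, 0]
mu1 = [1.0, 0.5, 1.0, 0.5]
mu0 = [0.5, 0.0, 0.5, 0.0]
y = [m1 if ti else m0 for ti, m1, m0 in zip(t, mu1, mu0)]
bad_e = [0.9, 0.2, 0.3, 0.6]
estimate = aipw_ate(y, t, bad_e, mu1, mu0)
print(estimate)  # 0.5
```

In this toy case the residuals are all zero, so the misspecified propensities have no effect and the estimate equals the mean of mu1 - mu0. That is one half of the "either model correct" guarantee; the other half (correct propensities, wrong outcome model) holds in expectation rather than exactly on a four-row sample.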

Rosenbound runs three covariate enrichment stages: v3 (baseline 217 features), v4 (severity-augmented with 24h labs window), and v5 (multi-day trajectory with 72h × 3-bin features). On W12, all three stages produce the same wrong-direction Y_bleed estimate — adding observable severity does NOT recover RCT direction.

2. DR-ATT with Crump-2009 overlap trim

Hahn (1998) doubly-robust ATT formulation, with Lunceford-Davidian (2004) variance estimators, applied on the Crump-Hotz-Imbens-Mitnik (2009) overlap-trimmed cohort. The trim discards units with extreme propensities (e < 0.1 or e > 0.9), where limited overlap leaves causal contrasts poorly supported by the data.
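The trimming rule itself is simple enough to sketch. The snippet below is an illustration, not the platform's code: overlap_trim is a hypothetical helper, the propensities are made-up toy values, and the loop over α only counts how many units survive each threshold.

```python
def overlap_trim(propensity, alpha=0.10):
    """Return indices of units whose propensity lies in [alpha, 1 - alpha]."""
    return [i for i, e in enumerate(propensity) if alpha <= e <= 1.0 - alpha]


# Toy propensities (illustrative values only).
propensity = [0.03, 0.12, 0.50, 0.97, 0.88, 0.07]
kept = overlap_trim(propensity)
print(kept)  # [1, 2, 4]

# Re-run at alternative thresholds to see how much of the cohort survives.
kept_counts = {a: len(overlap_trim(propensity, a)) for a in (0.05, 0.10, 0.15)}
print(kept_counts)  # {0.05: 4, 0.1: 3, 0.15: 1}
```

A trim-sensitivity check then re-estimates the effect on each trimmed cohort and compares the results across thresholds.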

This is the method that does recover RCT direction on W13 (DOAC vs warfarin in AFib+CKD): Y_stroke +0.0084, Y_bleed -0.0396, both consistent with RE-LY / ROCKET-AF / ARISTOTLE / ENGAGE-AF subgroup analyses. Trim sensitivity α ∈ {0.05, 0.10, 0.15} robust (Δ ATT < 0.004).

3. IV-LATE — Two-Stage Least Squares with prescriber preference

Angrist-Krueger (1991) framework. Uses leave-one-out per-prescriber LMWH preference rate as instrument (n=4,383 unique providers in MIMIC-IV). Stage-1 F-statistic of 67,250 confirms strong-instrument regime; m-of-n bootstrap for inference (Bickel-Sakov 2008).

On W12 Y_vte: ATT = +0.0102 with 95% CI [-0.002, +0.021] — CI includes zero. Recovers RCT-consistent non-inferiority on the venous thromboembolism outcome. On Y_bleed the IV estimate amplifies the wrong direction, suggesting prescriber-preference exclusion violation via specialty patient-mix.
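The leave-one-out instrument described above can be sketched as follows. This is a simplified illustration under assumptions: there are no covariates, so two-stage least squares with a single instrument reduces to the ratio cov(z, y) / cov(z, t); patients whose prescriber has no other patients are dropped; and the function names and toy data are hypothetical.

```python
from collections import defaultdict


def leave_one_out_preference(prescriber, treated):
    """Per-patient instrument: the prescriber's treatment rate over all
    of that prescriber's OTHER patients (None if there are none)."""
    totals, counts = defaultdict(int), defaultdict(int)
    for p, t in zip(prescriber, treated):
        totals[p] += t
        counts[p] += 1
    z = []
    for p, t in zip(prescriber, treated):
        if counts[p] < 2:
            z.append(None)  # no other patients to estimate preference from
        else:
            z.append((totals[p] - t) / (counts[p] - 1))
    return z


def cov(a, b):
    """Population covariance of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


def iv_estimate(z, t, y):
    """Single-instrument IV estimate: cov(z, y) / cov(z, t)."""
    rows = [(zi, ti, yi) for zi, ti, yi in zip(z, t, y) if zi is not None]
    zs, ts, ys = zip(*rows)
    return cov(zs, ys) / cov(zs, ts)


prescriber = ["a", "a", "a", "b", "b", "c"]
treated = [1, 1, 0, 0, 1, 1]
z = leave_one_out_preference(prescriber, treated)
print(z)  # [0.5, 0.5, 1.0, 1.0, 0.0, None]
```

Excluding the patient's own treatment from the prescriber's rate keeps the instrument from being mechanically correlated with that patient's treatment. It does not protect against the exclusion violation noted above, where prescriber preference tracks patient mix.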

4. DCIE — Patent-pending neural counterfactual learner

USPTO provisional filed March 22, 2026. Differentiable Causal Inference Engine (DCIE): patent-pending individual-level treatment-effect estimation with representation-balanced counterfactual learning. Architecture details and component-level design are held under NDA pending non-provisional conversion.

Production-validated on ACIC22 V3 lock (bias +19.26, RMSE 28.80, coverage 77.53%) and W12 v5-trajectory cohort (Y_bleed ATT +0.0255, AIPW-equivalent). Confirms — surprisingly — that the additional capacity of a neural counterfactual learner doesn't extract signal beyond classical AIPW when the residual confounder is observation-invisible. Further evidence the W12 fragility is informational, not algorithmic.

5. Rosenbaum Γ-sensitivity bounds

Rosenbaum (2002) Observational Studies, Chapter 4. Asks the question that no point estimate can answer: "How strong would an unobserved confounder need to be — measured as an odds-ratio multiplier on treatment assignment — to nullify or flip the estimated effect?"

On W12, Γ_zero = 1.06 for Y_bleed and 1.17 for Y_vte. Both are in the "very sensitive" tier (Γ < 1.2). Plain-English translation: a confounder that shifts the odds of receiving LMWH by 5.6% in patients who would later bleed is sufficient to reverse the headline estimate. Physician judgment artifacts (prior HIT, family allergies, off-record consults) plausibly produce this magnitude of shift, and they don't appear in chartevents.
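To make the Γ question concrete, here is a sketch of the classic Rosenbaum bound for a sign test on matched pairs with a binary outcome. It is an assumption that this statistic resembles what the platform uses; the page does not specify the test. Under hidden bias of at most Γ, the chance that the treated unit is the one with the event in a discordant pair is at most Γ / (1 + Γ), which gives a worst-case one-sided p-value. The function names and the pair counts are hypothetical.

```python
from math import comb


def upper_p_value(discordant, treated_events, gamma):
    """Worst-case one-sided p-value under hidden bias of at most gamma.

    discordant: number of matched pairs where exactly one unit had the event.
    treated_events: how many of those events fell on the treated unit.
    """
    p_plus = gamma / (1.0 + gamma)
    return sum(
        comb(discordant, k) * p_plus**k * (1.0 - p_plus) ** (discordant - k)
        for k in range(treated_events, discordant + 1)
    )


def critical_gamma(discordant, treated_events, alpha=0.05, step=0.01, max_steps=900):
    """Smallest gamma on the grid at which the worst-case p-value exceeds alpha."""
    for k in range(max_steps + 1):
        gamma = 1.0 + k * step
        if upper_p_value(discordant, treated_events, gamma) > alpha:
            return gamma
    return None  # still significant at the largest gamma examined


# Toy counts (illustrative only): 100 discordant pairs, 60 treated-unit events.
g = critical_gamma(100, 60)
print(g, upper_p_value(100, 60, 1.0))
```

At Γ = 1 the bound is the ordinary sign-test p-value. A critical Γ close to 1 means a small departure from random assignment is enough to erase significance, which is the sense in which an estimate is called fragile.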

Patent Architecture · USPTO provisional · filed 2026-03-22

Five inventive concepts. All five production-validated. All five protected.

The platform sits on five independently patentable AGI primitives. Implementation details are held under NDA pending non-provisional or PCT conversion (12-month window through 2027-03-22). Clinical AI is the first commercial vertical; each module below is wired into a verified clinical pathway today.

PSIM

Persistent Self-Improving Memory

An episodic-semantic-causal memory layer that records every signal, decision, and outcome the system has seen and consolidates it into a structured knowledge base the model uses to improve on future cases. The platform doesn’t forget what it learned from yesterday’s reviewer queue.

Production-validated: 2.83M-row FAERS pharmacovigilance memory backfill.

DCIE

Differentiable Causal Inference Engine

A neural counterfactual learner with representation-balanced individual-level treatment-effect estimation. The fifth method in the W12 sensitivity pentagon. Used for the patient-level CATE estimates that no observational AI vendor reports honestly.

Production-validated: matches classical causal-inference estimators on the public ACIC22 Track-2 challenge (3,400 cohorts), MIMIC-IV ICU vasopressor treatment effects, and the W12 heparin-vs-LMWH bleeding study.

VBSM

Verifiable Bounded Self-Modification

Every model retrain, every prediction, every reviewer action commits to a SHA-256-chained, append-only ledger with formal pre/post-condition checks. The 21 CFR Part 11 ALCOA+ system of record that an FDA inspector or pharma QA team can run an external verifier against without trusting our code.

Production-validated: every production model — clinical mortality, hospital readmission, pharmacovigilance triage, and causal estimation — writes to a SHA-chained ledger that an independent external verifier can audit.

MCS

Modular Cognitive Substrate

A five-layer cognitive architecture with a typed inter-module control protocol, a meta-learning policy layer, and an embedded self-capability estimator. The latter is what enables the platform to abstain on cases it knows it cannot reliably handle — instead of confidently giving you the wrong answer.

Production-validated: on the FAERS pharmacovigilance benchmark, the platform correctly identifies and abstains on cases it cannot reliably handle, rather than producing low-confidence answers.

HNSI

Hybrid Neuro-Symbolic Integration

A clinical-NER pipeline that turns unstructured discharge summaries and radiology reports into structured, negation-aware, section-aware, temporally-anchored facts that flow into the same causal model the structured data flows into. Maps free-text to RxNorm / MedDRA / SNOMED before it touches the prediction layer.

Production-validated: 1.64M MIMIC-IV-Note clinical notes processed; 80M+ entities extracted with 13–18% negation rate.

Causal estimates with their honest fragility.

Three moments that define the platform.

Three doors. One product.

pip install rosenbound

Five methods on heparin vs LMWH. None recovers RCT direction on bleeding.

Patient-level mortality risk with explicit fragility disclosure.

Synthetic patient inputs

Seven W-lanes locked on the full corpus, plus FAERS Pipeline B for pharmacovigilance.

11× coverage lift over baseline.

Every model decision, cryptographically chained.

The five estimators, in plain English.

Five inventive concepts. All five production-validated. All five protected.

Persistent Self-Improving Memory

Differentiable Causal Inference Engine

Verifiable Bounded Self-Modification

Modular Cognitive Substrate

Hybrid Neuro-Symbolic Integration

Start with the one-pager. Request additional materials in the form.

Rosenbound One-Page Pitch (PDF)

Five seats. $100K Year 1, locked Founding rate forever.

Request materials access