Why Are Treatment Assignment and Potential Outcomes Not Independent in Observational Studies?

 

 Step-by-Step Explanation:

1. What Are Potential Outcomes?

  • In causal inference, for each individual we define two potential outcomes:

    • Y(0): the outcome the person would have if they did not receive the treatment.

    • Y(1): the outcome the person would have if they did receive the treatment.

📌 These are hypothetical — for each person, we only get to observe one of them in real life (depending on their treatment assignment).
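As a minimal sketch of this idea, the snippet below stores both potential outcomes for two made-up individuals and returns only the one that the treatment assignment reveals. The `Person` class, the `observed_outcome` helper, and the numbers are hypothetical and exist only for illustration; in real data the unobserved potential outcome is never available.

```python
from dataclasses import dataclass


@dataclass
class Person:
    y0: float  # potential outcome without treatment, Y(0)
    y1: float  # potential outcome with treatment, Y(1)
    t: int     # treatment actually received: 1 = treated, 0 = control


def observed_outcome(p: Person) -> float:
    """Return the single potential outcome revealed by the assignment."""
    return p.y1 if p.t == 1 else p.y0


# Two made-up individuals; both would gain 2.0 from treatment.
people = [Person(y0=5.0, y1=7.0, t=1), Person(y0=4.0, y1=6.0, t=0)]
observed = [observed_outcome(p) for p in people]
```

The first person is treated, so only Y(1) is seen; the second is a control, so only Y(0) is seen. The individual effect Y(1) - Y(0) cannot be computed from `observed` alone.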


2. What Is T?

  • T is the treatment assignment:

    • T = 1: treated

    • T = 0: control (not treated)


3. What Does "Not Independent" Mean?

  • Saying that the potential outcomes (Y(0), Y(1)) and T are not independent means:

    The likelihood of receiving treatment depends on something that also affects the outcome.

In math:

(Y(0), Y(1)) \not\perp\!\!\!\perp T

That is: the treatment assignment is related to the outcome you’d have had with or without treatment.


4. 🔄 Why Is This a Problem?

In randomized experiments, treatment is assigned randomly, so:

(Y(0), Y(1)) \perp\!\!\!\perp T

That’s good — it means treatment is not confounded with outcome.
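A small simulation can make the randomized case concrete. The sketch below is built on assumptions that are not from the post: a hypothetical `severity` covariate, a made-up outcome model, and a constant treatment effect of 2. Because treatment is a coin flip that ignores severity, the mean of Y(0) should be about the same in both arms, and the simple difference in observed means should land close to the true effect.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

TRUE_EFFECT = 2.0  # hypothetical constant treatment effect
N = 20_000

rows = []
for _ in range(N):
    severity = random.random()                       # hypothetical covariate in [0, 1)
    y0 = 10.0 - 6.0 * severity + random.gauss(0, 1)  # Y(0): sicker people do worse
    y1 = y0 + TRUE_EFFECT                            # Y(1)
    t = 1 if random.random() < 0.5 else 0            # coin flip, ignores severity
    rows.append((t, y0, y1))

# Gap in mean Y(0) between arms; only computable because this is a simulation.
y0_gap = (mean(y0 for t, y0, _ in rows if t == 1)
          - mean(y0 for t, y0, _ in rows if t == 0))

# Difference in observed means: Y(1) for the treated, Y(0) for the controls.
naive_diff = (mean(y1 for t, _, y1 in rows if t == 1)
              - mean(y0 for t, y0, _ in rows if t == 0))

print(f"gap in mean Y(0) between arms: {y0_gap:.3f}")
print(f"difference in observed means:  {naive_diff:.3f} (true effect {TRUE_EFFECT})")
```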

But in an observational study, people choose treatment (or are selected for it) based on characteristics like:

  • Age

  • Disease severity

  • Socioeconomic status

  • etc.

Those same characteristics may also affect outcomes — so there's confounding.

🧠 Example:

  • Sicker patients are more likely to get an aggressive treatment.

  • But sicker patients also tend to have worse outcomes, even if the treatment is effective.

  • So if we just compare outcomes in treated vs. untreated, we confuse the effect of treatment with the effect of baseline health.
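The example above can be sketched as a simulation. Everything here is an illustrative assumption: a hypothetical `severity` score, a made-up outcome model, a constant true effect of 2, and a rule under which the probability of treatment equals severity. With these particular numbers, the treated group starts out about 2 units worse in Y(0) on average, which roughly cancels the benefit, so an effective treatment looks useless in the raw comparison.

```python
import random
from statistics import mean

random.seed(0)  # fixed seed so the sketch is reproducible

TRUE_EFFECT = 2.0  # hypothetical constant treatment effect
N = 20_000

rows = []
for _ in range(N):
    severity = random.random()                       # hypothetical covariate in [0, 1)
    y0 = 10.0 - 6.0 * severity + random.gauss(0, 1)  # Y(0): sicker people do worse
    y1 = y0 + TRUE_EFFECT                            # Y(1)
    t = 1 if random.random() < severity else 0       # sicker people are treated more often
    rows.append((t, y0, y1))

# Gap in mean Y(0) between groups; only computable because this is a simulation.
y0_gap = (mean(y0 for t, y0, _ in rows if t == 1)
          - mean(y0 for t, y0, _ in rows if t == 0))

# Naive comparison of observed outcomes, treated vs. untreated.
naive_diff = (mean(y1 for t, _, y1 in rows if t == 1)
              - mean(y0 for t, y0, _ in rows if t == 0))

print(f"gap in mean Y(0), treated minus untreated: {y0_gap:.3f}")
print(f"naive difference in observed means:        {naive_diff:.3f} (true effect {TRUE_EFFECT})")
```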


🔑 Why It Matters

If potential outcomes and treatment assignment are not independent, then naïvely comparing treated vs. untreated will give a biased estimate of the treatment effect.

This is the fundamental challenge of causal inference in observational data.
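The source of the bias can be written out with a standard decomposition. This is a sketch that assumes the observed outcome is Y = T·Y(1) + (1 − T)·Y(0), so that treated people reveal Y(1) and untreated people reveal Y(0); the labels under the braces are added here for illustration.

```latex
\begin{aligned}
E[Y \mid T=1] - E[Y \mid T=0]
  &= E[Y(1) \mid T=1] - E[Y(0) \mid T=0] \\
  &= \underbrace{E[Y(1) - Y(0) \mid T=1]}_{\text{effect among the treated}}
   + \underbrace{E[Y(0) \mid T=1] - E[Y(0) \mid T=0]}_{\text{selection bias}}
\end{aligned}
```

When (Y(0), Y(1)) is independent of T, the selection-bias term is zero and the first term equals the average treatment effect E[Y(1) − Y(0)]. When they are not independent, the second term is generally nonzero and the naive comparison is off by that amount.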


🧪 What Can We Do?

To deal with this lack of independence, methods like:

  • Propensity score matching / weighting / stratification

  • Regression adjustment

  • Inverse probability of treatment weighting (IPTW)

  • Targeted maximum likelihood (TMLE)

  • Instrumental variables

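are used to approximate the conditions of a randomized experiment, in most cases by adjusting for measured confounders (instrumental variables instead rely on an external source of variation in treatment).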
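As one illustration, here is a minimal sketch of inverse probability of treatment weighting on simulated data. It rests on strong assumptions that are not from the post: a single hypothetical confounder `severity`, a made-up outcome model with a true effect of 2, and a propensity score that is known exactly through the hypothetical `propensity` function. In practice the propensity score has to be estimated (for example with logistic regression), and the method only corrects for confounders that were measured.

```python
import random

random.seed(0)  # fixed seed so the sketch is reproducible

TRUE_EFFECT = 2.0  # hypothetical constant treatment effect
N = 50_000


def propensity(severity: float) -> float:
    """Assumed-known P(T = 1 | severity), kept away from 0 and 1."""
    return 0.1 + 0.8 * severity


data = []
for _ in range(N):
    severity = random.random()                       # hypothetical confounder in [0, 1)
    y0 = 10.0 - 6.0 * severity + random.gauss(0, 1)  # Y(0): sicker people do worse
    y1 = y0 + TRUE_EFFECT                            # Y(1)
    e = propensity(severity)
    t = 1 if random.random() < e else 0              # sicker people are treated more often
    y = y1 if t == 1 else y0                         # only one outcome is observed
    data.append((t, y, e))

treated = [(y, e) for t, y, e in data if t == 1]
control = [(y, e) for t, y, e in data if t == 0]


def weighted_mean(pairs):
    """Weighted average of (value, weight) pairs, with normalized weights."""
    total_weight = sum(w for _, w in pairs)
    return sum(v * w for v, w in pairs) / total_weight


# Unadjusted comparison of observed means.
naive = (sum(y for y, _ in treated) / len(treated)
         - sum(y for y, _ in control) / len(control))

# IPTW: weight treated by 1/e and controls by 1/(1 - e).
iptw = (weighted_mean([(y, 1.0 / e) for y, e in treated])
        - weighted_mean([(y, 1.0 / (1.0 - e)) for y, e in control]))

print(f"naive estimate: {naive:.3f}")
print(f"IPTW estimate:  {iptw:.3f} (true effect {TRUE_EFFECT})")
```

The weights rebalance each group so that its severity distribution resembles that of the whole sample, which is why the weighted comparison should land much closer to the true effect than the naive one. Normalized weights are used here as a design choice because they tend to be more stable when some weights are large.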


✅ Summary in Plain Language

In a randomized trial, treatment is random — it has nothing to do with what the outcome would have been.

But in an observational study, people who get treated are systematically different from those who don't.

As a result, treatment assignment and potential outcomes are not independent, and this creates bias in estimating the true causal effect.
