Causal Mediation Analysis: An Overview

What is Causal Mediation Analysis?

Causal mediation analysis is a statistical method used to determine how an independent variable (X) influences an outcome variable (Y) through an intermediate variable (M), known as the mediator. This method helps researchers understand the mechanism behind causal effects.

For example, if we want to study how exercise (X) affects heart health (Y), mediation analysis can determine whether the effect is partially or fully explained by an intermediate factor like weight loss (M).


Key Components of Causal Mediation Analysis

  • Exposure (X): The independent variable (e.g., Exercise)
  • Mediator (M): The variable that transmits part of the effect (e.g., Weight Loss)
  • Outcome (Y): The dependent variable (e.g., Heart Health)

The goal is to decompose the total effect of X on Y into:

  1. Direct Effect (DE): The effect of X on Y that does not pass through M.
  2. Indirect Effect (IE) / Mediated Effect: The portion of the effect that passes through M.

Mathematically, the total effect can be written as:

Total Effect = Direct Effect + Indirect Effect
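
In the common linear setting with no exposure-mediator interaction, this decomposition takes a concrete form. The sketch below uses the conventional coefficient names a, b, c, and c' (notation assumed here, not defined elsewhere in this post), with i_M and i_Y as intercepts:

```latex
\begin{aligned}
M &= i_M + aX + \varepsilon_M \\
Y &= i_Y + c'X + bM + \varepsilon_Y \\
\text{Direct Effect} &= c', \qquad
\text{Indirect Effect} = ab, \qquad
\text{Total Effect} = c = c' + ab
\end{aligned}
```

Here c is the slope from regressing Y on X alone. The identity c = c' + ab holds for linear models fit by ordinary least squares; it does not generally carry over to nonlinear models such as logistic regression.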

Types of Causal Effects in Mediation Analysis

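  1. Natural Direct Effect (NDE): The effect of X on Y when M is held at the value it would naturally take under the reference (unexposed) level of X.
  2. Natural Indirect Effect (NIE): The effect on Y of shifting M from the value it would take without exposure to the value it would take with exposure, while X itself is held fixed.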
  3. Total Effect (TE): The combined impact of both direct and indirect effects.
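
These three effects are usually written in counterfactual notation. As a sketch for a binary exposure (0 = unexposed, 1 = exposed), assume M(x) denotes the mediator value under exposure x and Y(x, m) the outcome under exposure x and mediator value m; this notation is introduced here for illustration:

```latex
\begin{aligned}
\mathrm{NDE} &= E[Y(1, M(0))] - E[Y(0, M(0))] \\
\mathrm{NIE} &= E[Y(1, M(1))] - E[Y(1, M(0))] \\
\mathrm{TE}  &= E[Y(1, M(1))] - E[Y(0, M(0))] = \mathrm{NDE} + \mathrm{NIE}
\end{aligned}
```

The middle term E[Y(1, M(0))] cancels when the NDE and NIE are added, which is why the two natural effects sum to the total effect.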

Assumptions for Causal Mediation Analysis

  1. No Unmeasured Confounders: There should be no unmeasured common causes of X and Y, of X and M, or of M and Y.
  2. Temporal Order: X must precede M, and M must precede Y.
  3. No Hidden Feedback Loops: Y should not influence M (causal direction should be clear).

Statistical Approaches for Mediation Analysis

1. Baron & Kenny's Approach (Traditional Regression Method)

  • Step 1: Show that X affects Y
  • Step 2: Show that X affects M
  • Step 3: Show that M affects Y when controlling for X
  • Step 4: If the effect of X on Y is significantly reduced (or disappears) after adjusting for M, mediation is present.
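
The four steps above can be sketched with ordinary least squares on simulated data. This is a minimal illustration, not a full implementation: it reports only the slopes (no significance tests), the helper names `cov` and `baron_kenny` are hypothetical, and the simulated coefficients (a = 0.5, b = 0.8, c' = 0.3) are made up for the example.

```python
import random


def mean(v):
    return sum(v) / len(v)


def cov(u, v):
    """Sample covariance; cov(u, u) is the sample variance."""
    mu, mv = mean(u), mean(v)
    return sum((a - mu) * (b - mv) for a, b in zip(u, v)) / (len(u) - 1)


def baron_kenny(x, m, y):
    """Return the OLS slopes used in the four Baron & Kenny steps."""
    vx, vm = cov(x, x), cov(m, m)
    cxm, cxy, cmy = cov(x, m), cov(x, y), cov(m, y)
    det = vx * vm - cxm ** 2
    return {
        "c": cxy / vx,                            # Step 1: slope of X in Y ~ X
        "a": cxm / vx,                            # Step 2: slope of X in M ~ X
        "b": (cmy * vx - cxm * cxy) / det,        # Step 3: slope of M in Y ~ X + M
        "c_prime": (cxy * vm - cxm * cmy) / det,  # Step 4: slope of X in Y ~ X + M
    }


# Simulated data with made-up coefficients: a = 0.5, b = 0.8, c' = 0.3
rng = random.Random(0)
n = 5000
x = [rng.gauss(0, 1) for _ in range(n)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.3 * xi + 0.8 * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

fit = baron_kenny(x, m, y)
indirect = fit["a"] * fit["b"]
print(fit)
print("indirect effect (a*b):", indirect)
```

Comparing `fit["c"]` with `fit["c_prime"]` shows how much of the X-Y association remains after adjusting for M. In a real analysis each step would also need a significance test or confidence interval.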

2. Sobel Test

  • Tests whether the indirect effect (X → M → Y) is statistically significant.
  • Assumes the estimated indirect effect is normally distributed, which may not hold in small samples.
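
As a rough sketch, the Sobel statistic divides the estimated indirect effect a*b by its first-order standard error, sqrt(b² · SE_a² + a² · SE_b²), and compares the result to a standard normal distribution. The function name `sobel_test` and the simulated data below are assumptions made for illustration.

```python
import math
import random


def center(v):
    mu = sum(v) / len(v)
    return [vi - mu for vi in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def sobel_test(x, m, y):
    """Return (a*b, standard error, z, two-sided p) for the indirect effect."""
    n = len(x)
    x, m, y = center(x), center(m), center(y)  # centering absorbs the intercepts
    sxx, smm, sxm = dot(x, x), dot(m, m), dot(x, m)
    sxy, smy = dot(x, y), dot(m, y)

    # Path a: slope of X in M ~ X, with its standard error
    a = sxm / sxx
    rss_m = sum((mi - a * xi) ** 2 for xi, mi in zip(x, m))
    se_a = math.sqrt(rss_m / (n - 2) / sxx)

    # Path b: slope of M in Y ~ X + M, with its standard error
    det = sxx * smm - sxm ** 2
    b = (smy * sxx - sxm * sxy) / det
    c_prime = (sxy * smm - sxm * smy) / det
    rss_y = sum((yi - c_prime * xi - b * mi) ** 2 for xi, mi, yi in zip(x, m, y))
    sigma2 = rss_y / (n - 3)
    se_b = math.sqrt(sigma2 * sxx / det)

    # First-order Sobel standard error and normal-approximation p-value
    se_ab = math.sqrt(b ** 2 * se_a ** 2 + a ** 2 * se_b ** 2)
    z = a * b / se_ab
    p = math.erfc(abs(z) / math.sqrt(2))
    return a * b, se_ab, z, p


# Simulated data with made-up coefficients: a = 0.5, b = 0.8, c' = 0.3
rng = random.Random(0)
x = [rng.gauss(0, 1) for _ in range(2000)]
m = [0.5 * xi + rng.gauss(0, 1) for xi in x]
y = [0.3 * xi + 0.8 * mi + rng.gauss(0, 1) for xi, mi in zip(x, m)]

ab, se_ab, z, p = sobel_test(x, m, y)
print(f"indirect={ab:.3f}  se={se_ab:.3f}  z={z:.2f}  p={p:.3g}")
```

Because the product a*b is often skewed in small samples, bootstrap confidence intervals are a common alternative to this normal approximation.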

3. Causal Inference-Based Approaches (Counterfactual Mediation Analysis)

  • Robins & Greenland (1992) and Pearl (2001) developed methods that use counterfactuals to formally define direct and indirect effects.
  • These effects are typically estimated with Structural Equation Modeling (SEM) or Generalized Linear Models (GLMs).
  • Implemented in software like R (mediation package) and Python (causalml, dowhy).
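
To make the counterfactual definitions concrete, the sketch below simulates a structural model in which both potential mediator values and all potential outcomes can be generated for every unit. The structural equations and their coefficients are hypothetical, chosen to include an exposure-mediator interaction. With real data these unit-level counterfactuals are never observed together, so the effects must be estimated under the assumptions listed earlier.

```python
import random


def mediator(x, e_m):
    # Hypothetical structural equation for M
    return 1.0 + 0.5 * x + e_m


def outcome(x, m, e_y):
    # Hypothetical structural equation for Y with an X*M interaction
    return 0.3 * x + 0.8 * m + 0.4 * x * m + e_y


rng = random.Random(1)
n = 10000
nde = nie = te = 0.0
for _ in range(n):
    e_m, e_y = rng.gauss(0, 1), rng.gauss(0, 1)
    m0, m1 = mediator(0, e_m), mediator(1, e_m)  # M(0), M(1)
    y00 = outcome(0, m0, e_y)  # Y(0, M(0))
    y10 = outcome(1, m0, e_y)  # Y(1, M(0))
    y11 = outcome(1, m1, e_y)  # Y(1, M(1))
    nde += (y10 - y00) / n     # natural direct effect
    nie += (y11 - y10) / n     # natural indirect effect
    te += (y11 - y00) / n      # total effect

print(f"NDE={nde:.3f}  NIE={nie:.3f}  TE={te:.3f}")
```

Under these made-up equations the true values are NDE = 0.3 + 0.4 · E[M(0)] = 0.7 and NIE = (0.8 + 0.4) · 0.5 = 0.6, so the simulated averages should land close to them and sum to the total effect.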

Example: Causal Mediation Analysis in Python (Using DoWhy)

Python Example with DoWhy

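```python
from dowhy import CausalModel
from dowhy.causal_estimators.linear_regression_estimator import LinearRegressionEstimator

# Causal graph in DOT format (parsing DOT needs pydot or pygraphviz):
# Exercise -> Weight_Loss -> Heart_Health, a direct path, and confounders Age and Diet
graph = """digraph {
    Exercise -> Weight_Loss; Weight_Loss -> Heart_Health; Exercise -> Heart_Health;
    Age -> Exercise; Age -> Weight_Loss; Age -> Heart_Health;
    Diet -> Exercise; Diet -> Weight_Loss; Diet -> Heart_Health;
}""".replace("\n", " ")

# Define causal model (df is assumed to be a pandas DataFrame with these columns)
model = CausalModel(data=df, treatment="Exercise", outcome="Heart_Health", graph=graph)

# Identify the natural indirect effect; use "nonparametric-nde" for the direct effect
estimand = model.identify_effect(estimand_type="nonparametric-nie",
                                 proceed_when_unidentifiable=True)

# Estimate it with DoWhy's two-stage regression mediation estimator
est = model.estimate_effect(
    estimand,
    method_name="mediation.two_stage_regression",
    method_params={"first_stage_model": LinearRegressionEstimator,
                   "second_stage_model": LinearRegressionEstimator},
)
print(est)
```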

Applications of Causal Mediation Analysis

  1. Epidemiology & Public Health:
    • How smoking (X) affects lung disease (Y) through inflammation (M).
  2. Economics & Social Sciences:
    • How education (X) affects income (Y) via job skills (M).
  3. Psychology:
    • How stress (X) influences health (Y) through sleep quality (M).
