Posts

Decomposing Variance in General Linear Mixed Models for Repeated Measurements: Understanding Between-Subject, Within-Subject, and Measurement Error Components

In the linear mixed model:

$$\operatorname{Var}(Y_i) = \underbrace{Z_i G Z_i'}_{\text{Between-subject variance}} + \underbrace{R_i}_{\text{Within-subject variance}}$$

Between-subject variance (Z_i G Z_i'): captures variability due to random effects, like subject-specific intercepts or slopes.

Within-subject variance (R_i): captures variability within a subject, which includes:
- Measurement error
- Other time-specific fluctuations

📌 So where is measurement error? Measurement error is part of the within-subject variance. If we assume R_i = σ²I, then all within-subject variability is attributed to independent measurement error with constant variance σ². However, in more complex models, R_i can include:
- Autocorrelation (e.g., an AR(1) structure)
- Heteroscedasticity (changing variance over time)
- M...
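As an illustration, here is a minimal PROC MIXED sketch of this decomposition, assuming a hypothetical long-format dataset `bp` with variables `subject`, `trt`, `time`, and response `y`; the RANDOM statement contributes the Z_i G Z_i' piece, and the REPEATED statement specifies R_i:

```
/* Between-subject variance: a random intercept per subject -> Z_i G Z_i'. */
/* Within-subject variance: REPEATED with TYPE=AR(1) replaces the default  */
/* R_i = sigma^2 * I with an autocorrelated structure.                     */
proc mixed data=bp;
  class subject trt time;
  model y = trt time trt*time / solution;
  random intercept / subject=subject;           /* G: between-subject  */
  repeated time / subject=subject type=ar(1);   /* R_i: within-subject */
run;
```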

Analysis of Repeated Measures Data using SAS (1)

1. Basic Concepts of Repeated Measures

In this basic setup of a completely randomized design with repeated measures, there are two factors: treatment and time. Treatment is called the between-subjects factor because levels of treatment can change only between subjects; all measurements on the same subject represent the same treatment. Time is called a within-subjects factor because different measurements on the same subject are taken at different times. In repeated measures experiments, interest centers on (1) how treatment means differ, (2) how treatment means change over time, and (3) how differences between treatment means change over time.

2. Four-step procedure for mixed model analysis (illustrated in the sketch below)

Step 1: Model the mean structure, usually by specification of the fixed effects.
Step 2: Specify the covariance structure, between subjects as well as within subjects.
Step 3: Fit the mean model accounting for the covariance structure.
Step 4: Make statistical inference bas...
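To make the four steps concrete, here is a minimal PROC MIXED sketch under assumed names (hypothetical dataset `growth` with treatment `trt`, repeated factor `time`, subject identifier `subject`, and response `y`): the MODEL statement handles Step 1, the REPEATED statement Step 2, the procedure call itself Step 3, and the F-tests plus LSMEANS output support Step 4.

```
/* Step 1: mean structure = treatment, time, and their interaction.  */
/* Step 2: within-subject covariance = compound symmetry (TYPE=CS),  */
/*         with subjects nested within treatment.                    */
/* Step 3: PROC MIXED fits the mean model given that covariance.     */
/* Step 4: F-tests and sliced LSMEANS for inference over time.       */
proc mixed data=growth;
  class trt time subject;
  model y = trt time trt*time;
  repeated time / subject=subject(trt) type=cs;
  lsmeans trt*time / slice=time;
run;
```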

Exploring the Model Landscape in AI, ML, and DL

(a) Classification models: logistic regression, decision trees, random forest, naive Bayes
(b) Dimensionality reduction models: PCA (an unsupervised technique used primarily for dimensionality reduction), robust rolling PCA (R2-PCA), kernel PCA, ICA, autoencoders
(c) Clustering methods used in unsupervised learning: K-means, robust rolling K-means (R2K-means), density-based spatial clustering of applications with noise (DBSCAN), Gaussian mixture models
(d) Solving equations (explicit/implicit replication): mapping input data to labels via FNNs (supervised); using a neural network as a solution (unsupervised)
(e) Image classification: CNNs
(f) Sequence analysis and NLP (sentiment analysis & more): RNNs, LSTMs, GRUs
(g) LLMs (sentiment analysis, mathematical reasoning, & more): transformers
(h) Sampling models (simulating/generating data preserving stylized facts): MCMC (parametric), GANs (non-parametric), & more
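Two of the unsupervised entries above map directly onto standard SAS procedures; a minimal sketch, assuming a hypothetical dataset `features` with numeric columns `x1`-`x10`:

```
/* (b) Dimensionality reduction: PCA via PROC PRINCOMP, keeping */
/*     the first two principal components as output scores.     */
proc princomp data=features out=pca_scores n=2;
  var x1-x10;
run;

/* (c) Clustering: K-means via PROC FASTCLUS, partitioning the  */
/*     observations into three clusters.                        */
proc fastclus data=features maxclusters=3 out=clusters;
  var x1-x10;
run;
```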

Why Use REML Instead of ML?

In standard Maximum Likelihood (ML), we estimate both β and Σ from the full data. In REML, we remove the influence of β by transforming the data into residuals — the part of the data left after accounting for the fixed effects. REML improves estimation by removing the influence of fixed effects from the likelihood. It does this by:
- Transforming the data into residuals,
- Building a likelihood function that depends only on the variance structure.

This leads to more accurate and reliable estimates of variance components, especially in small or unbalanced datasets.

What it estimates: ML estimates both fixed effects β and variance components Σ together; REML focuses on estimating variance components Σ only.
Bias in variance estimates: ML can be biased, especially in small samples, because it doesn't account for the...
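In PROC MIXED, for example, the estimation method is a single option; a minimal sketch, reusing the hypothetical `growth` dataset from the repeated measures post above, fits the same model both ways so the variance component estimates can be compared:

```
/* REML (the PROC MIXED default): variance components are estimated */
/* from residuals, removing the influence of the fixed effects.     */
proc mixed data=growth method=reml;
  class trt time subject;
  model y = trt time trt*time;
  repeated time / subject=subject(trt) type=cs;
run;

/* ML: fixed effects and variance components are estimated jointly; */
/* variance estimates can be biased downward in small samples.      */
proc mixed data=growth method=ml;
  class trt time subject;
  model y = trt time trt*time;
  repeated time / subject=subject(trt) type=cs;
run;
```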