2025-05-07
\[ P(\theta|y) = \frac{P(y|\theta)P(\theta)}{P(y)} \]
\[ \text{posterior} = \frac{\text{likelihood} \times \text{prior}}{P(y)} \]
\[ \text{posterior} \propto \text{likelihood} \times \text{prior} \]
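This proportionality can be checked numerically with a grid approximation. A minimal sketch in base R (the 23-heads-in-100-flips data and the Beta(2, 2) prior are made up for illustration):

```r
# Grid approximation of a posterior: discretise the parameter,
# multiply likelihood by prior, normalise over the grid.
theta <- seq(0, 1, length.out = 1001)               # grid over P(heads)
prior <- dbeta(theta, 2, 2)                         # Beta(2, 2) prior density
likelihood <- dbinom(23, size = 100, prob = theta)  # 23 heads in 100 flips
posterior <- likelihood * prior                     # posterior ∝ likelihood × prior
posterior <- posterior / sum(posterior)             # normalising plays the role of P(y)
theta[which.max(posterior)]                         # posterior mode, around 0.24
```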
\[\begin{align} y &= \beta_0 + \beta_1 x\\ y &= \beta_0 + \beta_1 x_1 + \beta_2 x_2\\ y &= \exp(\beta_0 + \beta_1 x_1 + \beta_2 x_2)\\ y &= \beta_0 + \beta_1 x^2\\ y &= \beta_0 + \beta_1^2 x \end{align}\]
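The first of these is the ordinary linear model that lm fits; a quick sketch with simulated data (the true intercept 3, slope 10 and residual SD 5 are made-up values):

```r
# Simulate data from y = b0 + b1 * x + noise and recover the
# coefficients with ordinary least squares.
set.seed(1)
x <- runif(100, 0, 10)
y <- 3 + 10 * x + rnorm(100, sd = 5)
fit <- lm(y ~ x)
coef(fit)  # estimates close to the true intercept 3 and slope 10
```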
formula syntax

| formula | maths |
|---|---|
| y ~ x | \(y = b_0 + b_1 x\) |
| y ~ 1 + x | \(y = b_0 + b_1 x\) |
| y ~ 0 + x | \(y = b_1 x\) |
| y ~ x + z | \(y = b_0 + b_1 x + b_2 z\) |
| y ~ x:z | \(y = b_0 + b_1 x z\) |
| y ~ x*z | \(y = b_0 + b_1 x + b_2 z + b_3 x z\) |
| y ~ I(x^2) | \(y = b_0 + b_1 x^2\) |
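model.matrix() shows the design matrix a formula expands to, which is a handy way to check the table above (x and z here are toy columns):

```r
# Check how formula syntax translates into design-matrix columns
d <- data.frame(x = 1:3, z = c(2, 5, 7))
colnames(model.matrix(~ x, d))       # "(Intercept)" "x"
colnames(model.matrix(~ 0 + x, d))   # "x" (no intercept)
colnames(model.matrix(~ x:z, d))     # "(Intercept)" "x:z"
colnames(model.matrix(~ x * z, d))   # "(Intercept)" "x" "z" "x:z"
colnames(model.matrix(~ I(x^2), d))  # "(Intercept)" "I(x^2)"
```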
Markov Chain Monte Carlo
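A brm() call along these lines produces the compile-and-sample log and the summary that follow (a sketch; fake_data is assumed to be a data frame with columns x and y):

```r
library(brms)

# Fit a Gaussian linear model by MCMC; brms translates the formula
# to Stan, compiles it, and runs 4 chains of 2000 iterations by default.
bmod <- brm(y ~ x, data = fake_data, backend = "cmdstanr")
summary(bmod)
```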
Compiling Stan program...
Start sampling
Running MCMC with 4 sequential chains...
Chain 1 Iteration: 1 / 2000 [ 0%] (Warmup)
Chain 1 Iteration: 100 / 2000 [ 5%] (Warmup)
Chain 1 Iteration: 200 / 2000 [ 10%] (Warmup)
Chain 1 Iteration: 300 / 2000 [ 15%] (Warmup)
Chain 1 Iteration: 400 / 2000 [ 20%] (Warmup)
Chain 1 Iteration: 500 / 2000 [ 25%] (Warmup)
Chain 1 Iteration: 600 / 2000 [ 30%] (Warmup)
Chain 1 Iteration: 700 / 2000 [ 35%] (Warmup)
Chain 1 Iteration: 800 / 2000 [ 40%] (Warmup)
Chain 1 Iteration: 900 / 2000 [ 45%] (Warmup)
Chain 1 Iteration: 1000 / 2000 [ 50%] (Warmup)
Chain 1 Iteration: 1001 / 2000 [ 50%] (Sampling)
Chain 1 Iteration: 1100 / 2000 [ 55%] (Sampling)
Chain 1 Iteration: 1200 / 2000 [ 60%] (Sampling)
Chain 1 Iteration: 1300 / 2000 [ 65%] (Sampling)
Chain 1 Iteration: 1400 / 2000 [ 70%] (Sampling)
Chain 1 Iteration: 1500 / 2000 [ 75%] (Sampling)
Chain 1 Iteration: 1600 / 2000 [ 80%] (Sampling)
Family: gaussian
Links: mu = identity; sigma = identity
Formula: y ~ x
Data: fake_data (Number of observations: 100)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept 3.71 1.14 1.51 5.95 1.00 3540 2793
x 9.97 0.20 9.58 10.36 1.00 3567 2746
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 5.13 0.38 4.47 5.92 1.00 3517 2585
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
# A draws_df: 1000 iterations, 4 chains, and 6 variables
b_Intercept b_x sigma Intercept lprior lp__
1 2.2 10.4 5.4 57 -8.4 -315
2 2.0 10.5 5.6 57 -8.4 -315
3 2.3 10.3 5.6 56 -8.4 -312
4 2.3 10.3 4.6 56 -8.4 -312
5 3.1 10.1 5.2 56 -8.4 -311
6 4.0 9.8 4.8 55 -8.4 -311
7 3.6 10.1 5.2 56 -8.4 -310
8 2.3 10.3 5.0 56 -8.4 -311
9 3.8 10.0 5.5 56 -8.4 -311
10 5.0 9.7 4.8 56 -8.4 -311
# ... with 3990 more draws
# ... hidden reserved variables {'.chain', '.iteration', '.draw'}
Family: binomial
Links: mu = logit
Formula: heads | trials(flips) ~ 1
Data: coin_data (Number of observations: 1)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept -1.39 0.35 -2.10 -0.75 1.00 1425 1578
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
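With the default logit link, the Intercept is on the log-odds scale; plogis() maps it back to a probability:

```r
# Convert the posterior mean of the Intercept (log-odds) to
# the estimated probability of heads.
plogis(-1.39)  # about 0.20
```

The credible-interval endpoints can be transformed the same way.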
bmod_coin_weak <- brm(
  heads | trials(flips) ~ 1,
  data = coin_data,
  prior = prior(beta(2, 2), class = "Intercept"), # <--
  family = binomial(link = "identity"),
  backend = "cmdstanr",
  file = "coins_weak"
)

bmod_coin_strong <- brm(
  heads | trials(flips) ~ 1,
  data = coin_data,
  prior = prior(beta(100, 100), class = "Intercept"), # <--
  family = binomial(link = "identity"),
  backend = "cmdstanr",
  file = "coins_strong"
)
Family: binomial
Links: mu = identity
Formula: heads | trials(flips) ~ 1
Data: coin_data (Number of observations: 1)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept 0.22 0.05 0.12 0.34 1.00 1299 1416
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Family: binomial
Links: mu = identity
Formula: heads | trials(flips) ~ 1
Data: coin_data (Number of observations: 1)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept 0.44 0.03 0.38 0.50 1.00 1464 1870
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
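Because the Beta prior is conjugate for a binomial likelihood, the exact posterior is available for comparison: with h heads in n flips and a Beta(a, b) prior, the posterior is Beta(a + h, b + n - h). A sketch with hypothetical counts of 10 heads in 50 flips (the actual coin_data counts are not shown here):

```r
# Exact posterior mean under a Beta(a, b) prior with h heads in n flips
post_mean <- function(a, b, h, n) (a + h) / (a + b + n)
post_mean(2, 2, 10, 50)      # weak prior: barely pulls the estimate
post_mean(100, 100, 10, 50)  # strong prior: drags it towards 0.5
```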
Extension to the formula:

y ~ p + (r | g)

- p: population-level effects
- r: 'random' effects, i.e. group-level effects
- g: grouping variable
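The fit below comes from a call along these lines (a sketch; formula and data are read off the summary):

```r
library(brms)

# Varying intercept and varying slope of Sepal.Length by Species
bmod_iris <- brm(
  Petal.Length ~ 1 + Sepal.Length + (1 + Sepal.Length | Species),
  data = iris,
  backend = "cmdstanr",
  file = "iris_multilevel"  # hypothetical cache file name
)
```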
Family: gaussian
Links: mu = identity; sigma = identity
Formula: Petal.Length ~ 1 + Sepal.Length + (1 + Sepal.Length | Species)
Data: iris (Number of observations: 150)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Multilevel Hyperparameters:
~Species (Number of levels: 3)
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sd(Intercept) 0.64 0.69 0.02 2.39 1.01 495 771
sd(Sepal.Length) 0.53 0.28 0.18 1.24 1.01 578 785
cor(Intercept,Sepal.Length) -0.05 0.60 -0.97 0.95 1.00 634 1171
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept 0.53 0.54 -0.46 2.02 1.01 376 132
Sepal.Length 0.59 0.24 0.12 1.12 1.01 660 589
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 0.26 0.02 0.23 0.30 1.01 1138 833
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
Estimate Est.Error Q2.5 Q97.5
Intercept 0.5318709 0.5396609 -0.4638470 2.023480
Sepal.Length 0.5873578 0.2442129 0.1190371 1.115754
$Species
, , Intercept
Estimate Est.Error Q2.5 Q97.5
setosa 0.06899166 0.5537010 -1.256369 1.2012115
versicolor -0.15903353 0.5497306 -1.704974 0.8129511
virginica 0.02431809 0.4968133 -1.320186 1.0449123
, , Sepal.Length
Estimate Est.Error Q2.5 Q97.5
setosa -0.41549242 0.2556132 -0.9461325 0.05927435
versicolor 0.06777262 0.2430173 -0.4618805 0.52713793
virginica 0.17090787 0.2443261 -0.3543854 0.64513015
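The group-level estimates above are offsets from the population-level ones, so the per-species coefficient is their sum (brms's coef() returns exactly this). For instance, the implied Sepal.Length slope for setosa:

```r
# fixed-effect slope + setosa's random offset = setosa's own slope
0.5873578 + (-0.41549242)  # about 0.17
```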
new_data <- tibble(Species = "setosa", Sepal.Length = 7)
(predictions <- tidybayes::add_epred_draws(bmod_iris,
newdata = new_data))
# A tibble: 4,000 × 7
# Groups: Species, Sepal.Length, .row [1]
Species Sepal.Length .row .chain .iteration .draw .epred
<chr> <dbl> <int> <int> <int> <int> <dbl>
1 setosa 7 1 NA NA 1 1.77
2 setosa 7 1 NA NA 2 1.68
3 setosa 7 1 NA NA 3 1.62
4 setosa 7 1 NA NA 4 1.59
5 setosa 7 1 NA NA 5 1.91
6 setosa 7 1 NA NA 6 1.59
7 setosa 7 1 NA NA 7 1.61
8 setosa 7 1 NA NA 8 1.70
9 setosa 7 1 NA NA 9 1.81
10 setosa 7 1 NA NA 10 1.75
# ℹ 3,990 more rows
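The 4000 expected-value draws can then be condensed into a point estimate and interval; a sketch using the predictions tibble above:

```r
# Posterior mean and 95% interval of the expected Petal.Length
# for a setosa with Sepal.Length = 7
mean(predictions$.epred)
quantile(predictions$.epred, c(0.025, 0.975))
```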
Fake data: each observation has a 20% chance of coming from a log-normal distribution and is zero otherwise:
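One way to simulate such data (a sketch; the log-normal location 4 and scale 3 are made-up values):

```r
# 20% of observations are log-normal draws, the rest are zeros
set.seed(1)
n <- 100
nonzero <- rbinom(n, 1, prob = 0.2)
fake_hurdle_data <- data.frame(
  x = nonzero * rlnorm(n, meanlog = 4, sdlog = 3)
)
mean(fake_hurdle_data$x == 0)  # share of zeros, roughly 0.8
```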
Family: hurdle_lognormal
Links: mu = identity; sigma = identity; hu = identity
Formula: x ~ 1
Data: fake_hurdle_data (Number of observations: 100)
Draws: 4 chains, each with iter = 2000; warmup = 1000; thin = 1;
total post-warmup draws = 4000
Regression Coefficients:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
Intercept 4.32 0.70 3.00 5.69 1.00 3163 2520
Further Distributional Parameters:
Estimate Est.Error l-95% CI u-95% CI Rhat Bulk_ESS Tail_ESS
sigma 3.35 0.51 2.54 4.49 1.00 2954 2224
hu 0.75 0.04 0.67 0.83 1.00 3606 3052
Draws were sampled using sample(hmc). For each parameter, Bulk_ESS
and Tail_ESS are effective sample size measures, and Rhat is the potential
scale reduction factor on split chains (at convergence, Rhat = 1).
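The Stan program that brms generates can be inspected with stancode(); the listing that follows is the generated code for the simple Gaussian model (assumed here to be stored as bmod):

```r
# Print the Stan program brms generated for a fitted model object
stancode(bmod)
```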
// generated with brms 2.22.0
functions {
}
data {
  int<lower=1> N;  // total number of observations
  vector[N] Y;  // response variable
  int<lower=1> K;  // number of population-level effects
  matrix[N, K] X;  // population-level design matrix
  int<lower=1> Kc;  // number of population-level effects after centering
  int prior_only;  // should the likelihood be ignored?
}
transformed data {
  matrix[N, Kc] Xc;  // centered version of X without an intercept
  vector[Kc] means_X;  // column means of X before centering
  for (i in 2:K) {
    means_X[i - 1] = mean(X[, i]);
    Xc[, i - 1] = X[, i] - means_X[i - 1];
  }
}
parameters {
  vector[Kc] b;  // regression coefficients
  real Intercept;  // temporary intercept for centered predictors
  real<lower=0> sigma;  // dispersion parameter
}
transformed parameters {
  real lprior = 0;  // prior contributions to the log posterior
  lprior += student_t_lpdf(Intercept | 3, 57, 33.6);
  lprior += student_t_lpdf(sigma | 3, 0, 33.6)
            - 1 * student_t_lccdf(0 | 3, 0, 33.6);
}
model {
  // likelihood including constants
  if (!prior_only) {
    target += normal_id_glm_lpdf(Y | Xc, Intercept, b, sigma);
  }
  // priors including constants
  target += lprior;
}
generated quantities {
  // actual population-level intercept
  real b_Intercept = Intercept - dot_product(means_X, b);
}