Chapter 04 of 18 · Interactive Dashboard

Statistical Inference for the Mean

Explore confidence intervals, hypothesis testing, and the t-distribution through interactive controls — using the same datasets and examples as the book.

Standard Error & Sample Size

How precise is our $41k estimate of average earnings? And how much more data would you need to halve that uncertainty?

The standard error se(X̄) = s/√n measures the precision of the sample mean as an estimate of μ. It quantifies sampling uncertainty — smaller SEs mean more precise estimates. Because SE shrinks with √n, not with n, quadrupling the sample size is required to halve the standard error.
What you can do here
  • Slide the sample size n from 10 to 500 and watch SE and the 95% CI width update.
  • Compare to the book's n = 171 — the default position of the slider.
  • Hover the curve to read SE at any point along the 1/√n shape.
[Stat cards: n = 171 · SD (s) · SE (s/√n) · approx 95% CI width]
Try This
  1. Set n = 171 (the book's sample), then set n = 684 (4× larger). SE halves from ~$1,952 to ~$976 — the √n rule in action: halving the SE costs four times more data.
  2. Set n = 50. SE jumps to ~$3,611 and the approximate 95% CI widens to ~$14,400 — under-sampling doesn't just widen the interval slightly, it nearly doubles it.
  3. Set n = 400. SE drops to ~$1,276 — collecting 229 more observations shaves only ~$676 off the SE, the diminishing-returns shoulder of the 1/√n curve.

Take-away: SE = s/√n quantifies precision; doubling precision always costs four times more data. Read §4.1 in the chapter →
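The √n rule is easy to verify in a few lines of Python. This is a sketch: the sample SD below is back-solved from the dashboard's SE of ~$1,952 at n = 171, so treat it as an approximation rather than the book's exact value.

```python
import numpy as np

# Assumed sample SD, back-solved from the dashboard's SE (~$1,952 at n = 171)
s = 25_527.0

for n in [50, 171, 400, 684]:
    se = s / np.sqrt(n)                    # SE shrinks with sqrt(n), not n
    print(f"n = {n:4d}: SE = ${se:,.0f}")

# Quadrupling n (171 -> 684) halves the SE: the ratio is sqrt(4) = 2
print((s / np.sqrt(171)) / (s / np.sqrt(684)))
```

Note that going from n = 171 to n = 400 more than doubles the sample but cuts the SE by only about a third, which is the diminishing-returns shoulder the dashboard curve shows.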

t-Distribution Explorer

Why do we need a whole family of bell curves — one for every sample size? Because estimating σ from the sample adds uncertainty the normal curve can't see.

The t-distribution is used when σ is unknown and estimated from the sample. It resembles the standard normal N(0, 1) but with fatter tails that reflect the extra uncertainty from using s in place of σ. As n grows, the t-distribution approaches the normal — which is why the normal is a good approximation for large samples (the familiar "n ≥ 30" rule of thumb).
What you can do here
  • Toggle degrees of freedom between df = 4, 10, 30, and 100.
  • Toggle the Normal N(0,1) overlay to compare the two curves directly.
  • Watch the t-critical value in the stat cards โ€” it shrinks toward 1.96 as df grows.
[Stat cards: df = 30 · t-crit (5%, two-sided) · z-crit (Normal) = 1.960 · tail area beyond ±1.96]
Try This
  1. Set df = 4 with the Normal overlay on. The t-tails are visibly fatter and t-crit = 2.78 (vs 1.96 for the normal) — with tiny samples, you need a much larger t-statistic to reject at 5%.
  2. Set df = 30. t-crit drops to ~2.04, just barely above 1.96 — the classic "n ≥ 30 is large enough" rule is exactly this small gap becoming negligible.
  3. Set df = 100. The two curves are visually indistinguishable and tail area beyond ±1.96 is ~5% — for reasonable samples, the normal is a perfectly usable stand-in for the t.

Take-away: The t-distribution pays for estimating σ with fatter tails and refunds the cost as n grows. Read §4.2 in the chapter →
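The same convergence can be read off numerically rather than visually. This short sketch prints the two-sided 5% critical value and the t-tail area beyond ±1.96 for the dashboard's four df settings:

```python
from scipy import stats

z_crit = stats.norm.ppf(0.975)                 # normal critical value, 1.960
for df in [4, 10, 30, 100]:
    t_crit = stats.t.ppf(0.975, df)            # two-sided 5% critical value
    tail = 2 * stats.t.sf(1.96, df)            # t probability beyond +/-1.96
    print(f"df = {df:3d}: t-crit = {t_crit:.3f}, tail beyond ±1.96 = {tail:.3f}")
```

The critical value falls from 2.776 at df = 4 toward the normal's 1.960, and the tail area decays toward 5%, which is exactly what the overlay toggle shows graphically.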

Confidence Interval Visualizer

A point estimate of $41k is a single number. What range of μ values is the data actually compatible with — and how does that range change with our confidence requirement?

A confidence interval provides a range of plausible values for μ. A 95% CI means: if we repeated the sampling procedure many times, approximately 95% of the resulting intervals would contain the true μ. The interval is constructed as X̄ ± t_(α/2) × se(X̄) — wider intervals indicate less precision.
What you can do here
  • Slide the confidence level from 80% to 99% and watch the CI expand or contract.
  • Toggle between Female Earnings and Gas Prices to see how SE differences reshape the CI.
  • Compare the margin of error and CI width in the stat cards at each setting.
[Stat cards: confidence level = 95% · t-critical · margin of error · CI lower · CI upper · width]
Try This
  1. At 95%, note the CI for Female Earnings: [$37,559, $45,266] (width $7,707). Push to 99%. The CI widens to about $10,171 — a 4-point jump in confidence buys you a 30% wider interval.
  2. Drop the confidence level to 90%. The CI narrows to ~$6,457 — you gain precision but accept that roughly 1 in 10 such intervals would miss the true μ.
  3. Switch to Gas Prices at 95%. The CI is only about $0.11 wide because SE is tiny, and $3.81 (the CA average) sits far outside — the data rejects the state average before any formal test is run.

Take-away: Confidence intervals trade precision for confidence — the interval widens every time you demand a stronger claim. Read §4.3 in the chapter →
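The confidence-vs-width trade can be reproduced from summary statistics alone. The mean and SD below are back-solved (approximately) from the dashboard's Female Earnings cards, so treat them as assumptions rather than the book's exact figures:

```python
import numpy as np
from scipy import stats

# Assumed summary stats from the dashboard cards: n = 171,
# mean ~= $41,413, sample SD ~= $25,527 (so SE ~= $1,952)
n, xbar, s = 171, 41_413.0, 25_527.0
se = s / np.sqrt(n)

for conf in [0.90, 0.95, 0.99]:
    t_crit = stats.t.ppf(1 - (1 - conf) / 2, n - 1)   # critical value from t(n-1)
    margin = t_crit * se
    print(f"{conf:.0%} CI: [${xbar - margin:,.0f}, ${xbar + margin:,.0f}]"
          f"  width ${2 * margin:,.0f}")
```

Stepping from 95% to 99% widens the interval by roughly a third, matching Try This #1 above.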

Two-Sided Hypothesis Test

Is the population earnings mean really $40,000 — or is our $41k sample mean too far off for that story to hold?

A hypothesis test evaluates whether data provide sufficient evidence to reject a specific claim (H0) about a parameter. The t-statistic measures how many standard errors the estimate is from the hypothesized value. The p-value is the probability of observing a result at least as extreme as ours if H0 were true. Small p-values (< α, typically 0.05) provide evidence against H0 and lead us to reject it.
What you can do here
  • Slide the hypothesized value μ0 from $30k to $55k.
  • Watch the t-statistic, p-value, and REJECT / FAIL-TO-REJECT badge update live.
  • See the shaded rejection regions on the t-distribution shift as μ0 moves.
[Stat cards: μ0 = $40,000 · SE · t-statistic · p-value · ±t-crit (5%) · decision badge]
Try This
  1. Set μ0 = $40,000 (the book's example). t ≈ 0.72, p ≈ 0.470 — the t-marker sits well inside the white region, so we fail to reject: $40,000 remains a plausible μ.
  2. Drag μ0 down toward $35,000 and watch p. The μ0 where p first crosses 0.05 is exactly the 95% CI endpoint — tests and CIs are the same object seen from opposite sides.
  3. Set μ0 = $50,000. t is sharply negative, p is tiny, and the decision flips to REJECT — $50k is too far from the sample mean to survive as a population mean.

Take-away: A hypothesis test is just a confidence interval read in reverse — the values you'd reject are the ones outside the interval. Read §4.4 in the chapter →
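Both Try This outcomes can be checked from summary statistics. The mean and SE below are approximate values read off the dashboard, assumed for illustration:

```python
from scipy import stats

# Approximate dashboard values: n = 171, mean ~= $41,413, SE ~= $1,952
n, xbar, se = 171, 41_413.0, 1_952.0

for mu0 in [40_000, 50_000]:
    t_stat = (xbar - mu0) / se                    # SEs from the hypothesized value
    p_val = 2 * stats.t.sf(abs(t_stat), n - 1)    # two-sided p-value
    verdict = "REJECT" if p_val < 0.05 else "fail to reject"
    print(f"H0: mu = ${mu0:,}: t = {t_stat:+.2f}, p = {p_val:.4f} -> {verdict}")
```

$40,000 survives (t ≈ 0.72, p ≈ 0.47) while $50,000 is rejected decisively, the same flip the slider produces.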

One-Sided Hypothesis Test

If theory says earnings should be higher than $40k — not just different — can we extract more statistical power from the same data?

One-sided tests concentrate the rejection region in one tail of the distribution, making them more powerful for detecting effects in the specified direction. Use them when theory predicts a specific direction. The one-sided p-value is exactly half the two-sided p-value (when the effect is in the predicted direction), and critical values are smaller — e.g. 1.65 vs ±1.96 at α = 0.05.
What you can do here
  • Slide μ0 to set the hypothesized value.
  • Toggle the alternative direction between Ha: μ > μ0 and Ha: μ < μ0.
  • Compare the one-sided and two-sided p-values side by side in the stat cards.
[Stat cards: μ0 = $40,000 · t-statistic · p-value (one-sided) · p-value (two-sided) · t-crit (one-sided, 5%) · decision badge]
Try This
  1. Set μ0 = $40,000, Ha: μ > $40,000. One-sided p ≈ 0.235, exactly half of the two-sided 0.470 — neither test is significant, but the directional version uses its power efficiently by only looking right.
  2. Set μ0 = $38,000 with Ha: μ > $38,000. The one-sided test rejects at 5% while the two-sided barely hangs on — the directional version's "extra power" is real and visible here.
  3. Toggle to Ha: μ < μ0 while the t-stat is positive. p jumps toward 1.0 — testing the wrong direction leaves you defenseless: the data offer essentially no support for your alternative.

Take-away: One-sided tests halve the p-value in the predicted direction but leave you unable to detect effects that go the other way. Read §4.6 in the chapter →
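The halving relationship, and the penalty for testing the wrong direction, both follow directly from the tail areas. A sketch using the dashboard's default test (t ≈ 0.72 with n = 171, so df = 170):

```python
from scipy import stats

df = 170
t_stat = 0.72                              # t from the dashboard's mu0 = $40,000 test

p_two = 2 * stats.t.sf(abs(t_stat), df)    # both tails
p_upper = stats.t.sf(t_stat, df)           # Ha: mu > mu0 (predicted direction)
p_lower = stats.t.cdf(t_stat, df)          # Ha: mu < mu0 (wrong direction here)

print(f"two-sided p           = {p_two:.3f}")
print(f"one-sided p (mu > mu0) = {p_upper:.3f}  (exactly two-sided / 2)")
print(f"one-sided p (mu < mu0) = {p_lower:.3f}  (large: wrong direction)")
```

Because the upper and lower tail probabilities sum to one, a positive t-statistic makes the lower-tailed p-value large, which is why toggling the direction in Try This #3 pushes p toward 1.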

Gasoline Prices: Statistical vs Practical Significance

Yolo County gas is $0.14/gallon cheaper than the California average — statistically significant at p < 0.0001. But is that $2 per fill-up worth crossing town for?

Even small practical differences can be statistically significant when the sample gives precise estimates. The gasoline-price gap of $0.14 looks trivial, but the standard error is tiny ($0.027), so the t-statistic is large (−5.26) — the difference sits many standard errors from zero. Statistical significance answers "Is there a difference?"; practical significance asks "Does the difference matter?" Both questions matter in econometrics.
What you can do here
  • Slide the hypothesized CA state average from $3.40 to $4.10.
  • Watch the t-statistic, p-value, and REJECT / FAIL badge update live.
  • Compare H0 to the sample mean ($3.67) — the closer they are, the larger p becomes.
[Stat cards: H0 = $3.81 · sample mean · SE · t-statistic · p-value · decision badge]
Try This
  1. Set H0 = $3.81 (the book's CA average). t = −5.26, p < 0.0001, REJECT — the tiny SE ($0.027) turns a $0.14 gap into overwhelming statistical evidence.
  2. Slide H0 toward $3.67 (the sample mean). p grows steeply; it crosses 0.05 near the 95% CI endpoints around $3.62 and $3.72 — inside that narrow window, the data fails to reject.
  3. Set H0 = $3.67 exactly. t ≈ 0, p ≈ 1.0 — we cannot reject that μ equals our own sample mean, a reassuring sanity check on the machinery.

Take-away: With a small SE, tiny effects become statistically significant — always ask whether the difference also matters practically. Read §4.5 in the chapter →
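The statistical-vs-practical contrast fits in a few lines. The summary values below are approximations read off the dashboard cards, and since the gas sample size isn't displayed here, the normal approximation is used for the p-value:

```python
from scipy import stats

# Approximate dashboard values: sample mean $3.67, SE $0.027, CA average $3.81
mean_gas, se_gas, mu0 = 3.67, 0.027, 3.81

z = (mean_gas - mu0) / se_gas        # the gap is about -5.2 standard errors
p = 2 * stats.norm.sf(abs(z))        # far below 0.0001: statistically significant
savings = (mu0 - mean_gas) * 15      # practically: about $2.10 on a 15-gallon fill

print(f"z = {z:.2f}, p = {p:.1e}, savings per tank = ${savings:.2f}")
```

The test screams "real difference" while the dollar figure asks whether anyone should care, which is exactly the distinction this section draws.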

Inference for Proportions

A poll says 52.1% plan to vote Democrat. Is that a real lead — or just sampling noise dressed up as a majority?

Proportions are simply means of 0/1 binary data, so the whole inference toolkit carries over. Confidence intervals, hypothesis tests, and standard errors for proportions use the same logic as for means — only the SE formula changes: se(p̂) = √(p̂(1−p̂)/n), which follows from the variance of a binary variable. The sample proportion p̂ is literally the sample mean of the 0/1 codings.
What you can do here
  • Slide the sample proportion p̂ from 0.30 to 0.70.
  • Slide the sample size n from 50 to 5,000 and watch SE shrink.
  • Slide the hypothesized proportion p0 to test any null value.
  • Watch the z-statistic, p-value, 95% CI, and decision badge update together.
[Stat cards: p̂ = 0.52 · n = 921 · p₀ = 0.50 · SE · 95% CI lower · 95% CI upper · z-statistic · p-value · decision badge]
Try This
  1. Keep defaults (p̂ = 0.52, n = 921, p0 = 0.50). This matches the book's voter example: SE ≈ 0.0165, 95% CI ≈ [48.9%, 55.3%] — since 50% sits inside, we fail to reject and the "majority" is within sampling noise.
  2. Keep p̂ = 0.52 and push n to 2,500. The CI tightens enough that 50% now falls outside — the same observed proportion becomes statistically significant purely because SE shrank with √n.
  3. Set p̂ = 0.50 exactly. z = 0, p = 1.0, and the CI centers on p0 — when the estimate equals the hypothesis, there is nothing to reject.

Take-away: Proportions are means of 0/1 data — the whole inference toolkit (SE, CI, z-tests) carries over, with only the SE formula changing. Read §4.7 in the chapter →
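Try This #2's significance flip can be reproduced directly. The sketch uses the voter counts from this chapter's example (480 of 921) and assumes, as the slider does, that the larger sample keeps the same p̂:

```python
import numpy as np

# Voter example: 480 of 921 respondents, p_hat ~= 0.521; the n = 2,500
# case assumes the same observed proportion at the larger sample size
p_hat, p0 = 480 / 921, 0.50

for n in [921, 2500]:
    se = np.sqrt(p_hat * (1 - p_hat) / n)        # SE formula for a proportion
    lo, hi = p_hat - 1.96 * se, p_hat + 1.96 * se
    verdict = "excludes" if (p0 < lo or p0 > hi) else "includes"
    print(f"n = {n}: SE = {se:.4f}, 95% CI = [{lo:.3f}, {hi:.3f}] -> {verdict} 50%")
```

Nothing about the observed proportion changes between the two rows; only the SE shrinks, and that alone moves 50% from inside the interval to outside it.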

Python Libraries and Code

You've explored the key concepts interactively — now reproduce them in Python. This self-contained code block covers everything you practiced above. Copy it into an empty notebook and run it.

# =============================================================================
# CHAPTER 4 CHEAT SHEET: Statistical Inference for the Mean
# =============================================================================

# --- Libraries ---
import numpy as np                       # numerical operations (sqrt, linspace)
import pandas as pd                      # data loading and manipulation
import matplotlib.pyplot as plt          # creating plots and visualizations
from scipy import stats                  # t-distribution and normal distribution

# =============================================================================
# STEP 1: Load data directly from a URL
# =============================================================================
# pd.read_stata() reads Stata .dta files — this sample has 171 female workers
url = "https://raw.githubusercontent.com/quarcs-lab/data-open/master/AED/AED_EARNINGS.DTA"
data_earnings = pd.read_stata(url)
earnings = data_earnings['earnings']

print(f"Dataset: {len(earnings)} observations")

# =============================================================================
# STEP 2: Sample statistics and standard error
# =============================================================================
# The standard error measures the precision of the sample mean as an estimate
# of the population mean. It shrinks with sqrt(n), not n — so quadrupling the
# sample size is needed to halve the SE.
n             = len(earnings)
mean_earnings = earnings.mean()
std_earnings  = earnings.std(ddof=1)           # ddof=1 for sample std dev
se_earnings   = std_earnings / np.sqrt(n)      # standard error = s / sqrt(n)

print(f"Sample mean:        ${mean_earnings:,.2f}")
print(f"Standard deviation: ${std_earnings:,.2f}")
print(f"Standard error:     ${se_earnings:,.2f}")

# =============================================================================
# STEP 3: t-distribution vs standard normal
# =============================================================================
# When sigma is unknown (estimated by s), we use the t-distribution instead of
# the normal. It has fatter tails — more probability in the extremes — but
# approaches the normal as n grows (the "n >= 30" rule of thumb).
x = np.linspace(-4, 4, 200)

fig, ax = plt.subplots(figsize=(10, 6))
ax.plot(x, stats.norm.pdf(x), '--', linewidth=2, label='Standard Normal')
ax.plot(x, stats.t.pdf(x, df=4), linewidth=2, label='t(4) — fatter tails')
ax.plot(x, stats.t.pdf(x, df=30), linewidth=2, label='t(30) — nearly normal')
ax.set_xlabel('t value')
ax.set_ylabel('Density')
ax.set_title('t-Distribution Approaches Normal as Degrees of Freedom Increase')
ax.legend()
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

# =============================================================================
# STEP 4: Confidence interval โ€” a range of plausible values for mu
# =============================================================================
# A 95% CI means: if we repeatedly sampled and built CIs, about 95% would
# contain the true mu. Formula: x-bar +/- t_crit * SE
alpha  = 0.05
t_crit = stats.t.ppf(1 - alpha / 2, n - 1)    # critical value from t(n-1)
margin = t_crit * se_earnings

ci_lower = mean_earnings - margin
ci_upper = mean_earnings + margin

print(f"Critical value (t_170, 0.025): {t_crit:.4f}")
print(f"Margin of error:               ${margin:,.2f}")
print(f"95% CI: [${ci_lower:,.2f}, ${ci_upper:,.2f}]")

# Compare 90%, 95%, and 99% — higher confidence requires wider intervals
print(f"\n{'Level':<8} {'Lower':>12} {'Upper':>12} {'Width':>10}")
print("-" * 44)
for conf in [0.90, 0.95, 0.99]:
    tc = stats.t.ppf(1 - (1 - conf) / 2, n - 1)
    lo = mean_earnings - tc * se_earnings
    hi = mean_earnings + tc * se_earnings
    print(f"{conf*100:.0f}%{lo:>14,.2f}{hi:>14,.2f}{hi - lo:>12,.2f}")

# =============================================================================
# STEP 5: Two-sided hypothesis test โ€” H0: mu = $40,000
# =============================================================================
# The t-statistic measures how many standard errors the estimate is from the
# hypothesized value. The p-value is the probability of a result at least as
# extreme as ours if H0 were true.
mu0    = 40000
t_stat = (mean_earnings - mu0) / se_earnings
p_val  = 2 * (1 - stats.t.cdf(abs(t_stat), n - 1))  # two-sided p-value

print(f"H0: mu = ${mu0:,}  vs  Ha: mu != ${mu0:,}")
print(f"t-statistic: {t_stat:.4f}")
print(f"p-value:     {p_val:.4f}")
print(f"Decision:    {'Reject H0' if p_val < 0.05 else 'Do not reject H0'} at alpha = 0.05")

# One-sided (upper-tailed): H0: mu <= 40000 vs Ha: mu > 40000
p_val_one = 1 - stats.t.cdf(t_stat, n - 1)            # upper tail only
t_crit_one = stats.t.ppf(0.95, n - 1)                  # one-sided critical value

print(f"\nOne-sided (upper): p-value = {p_val_one:.4f} (= two-sided / 2)")
print(f"One-sided critical value:    {t_crit_one:.4f} (vs two-sided {t_crit:.4f})")

# =============================================================================
# STEP 6: Statistical vs practical significance โ€” gasoline prices
# =============================================================================
# A $0.14 price gap looks small, but the SE is tiny — so the t-statistic is
# huge and p < 0.0001. Statistical significance says "the difference is real";
# practical significance asks "does it matter?"
url_gas   = "https://raw.githubusercontent.com/quarcs-lab/data-open/master/AED/AED_GASPRICE.DTA"
data_gas  = pd.read_stata(url_gas)
price     = data_gas['price']

n_gas     = len(price)
mean_gas  = price.mean()
se_gas    = price.std(ddof=1) / np.sqrt(n_gas)
mu0_gas   = 3.81                                        # CA state average
t_gas     = (mean_gas - mu0_gas) / se_gas
p_gas     = 2 * (1 - stats.t.cdf(abs(t_gas), n_gas - 1))

print(f"Gas price test: H0: mu = ${mu0_gas}")
print(f"Sample mean: ${mean_gas:.4f}   SE: ${se_gas:.4f}")
print(f"t = {t_gas:.4f},  p = {p_gas:.6f}")
print(f"Decision: {'Reject H0' if p_gas < 0.05 else 'Do not reject H0'}")
print(f"Practical: ${mu0_gas - mean_gas:.2f}/gallon * 15 gal = ${(mu0_gas - mean_gas)*15:.2f} per tank")

# =============================================================================
# STEP 7: Inference for proportions โ€” binary (0/1) data
# =============================================================================
# Proportions are just means of 0/1 variables. The same CI and hypothesis-test
# logic applies — only the SE formula changes: sqrt(p_hat*(1-p_hat)/n).
n_voters  = 921
n_dem     = 480
p_hat     = n_dem / n_voters
se_prop   = np.sqrt(p_hat * (1 - p_hat) / n_voters)

# 95% confidence interval (use z for large-sample proportions)
ci_lo = p_hat - 1.96 * se_prop
ci_hi = p_hat + 1.96 * se_prop

# Hypothesis test: H0: p = 0.50
p0        = 0.50
se_h0     = np.sqrt(p0 * (1 - p0) / n_voters)
z_stat    = (p_hat - p0) / se_h0
p_val_z   = 2 * (1 - stats.norm.cdf(abs(z_stat)))

print(f"Sample proportion: {p_hat:.4f} ({p_hat*100:.1f}%)")
print(f"SE: {se_prop:.4f}")
print(f"95% CI: [{ci_lo:.4f}, {ci_hi:.4f}]")
print(f"\nH0: p = {p0}  z = {z_stat:.4f}  p-value = {p_val_z:.4f}")
print(f"Decision: {'Reject H0' if abs(z_stat) > 1.96 else 'Do not reject H0'}")
Open empty Colab notebook →