Definition 1.0.1 (Sample mean, sample variance)
Suppose we draw $n$ samples $X_1, X_2, \dots, X_n$ independently. Because each act of sampling, such as observing data, yields a different value, the samples can be regarded as random variables. Their mean and variance are defined as follows.
Sample mean:
$$
\bar{X} = \frac{1}{n} \sum_{i=1}^n X_i.
$$
Sample variance:
$$
S^2 = \frac{1}{n} \sum_{i=1}^n (X_i - \bar{X})^2.
$$
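The two definitions can be checked numerically with a minimal sketch. The data vector `x` and the names `x_mean` and `s2` are illustrative assumptions, not part of the original text. The sketch computes both quantities directly from the formulas and compares them with `mean` and `var` from Julia's `Statistics` standard library, where `corrected = false` selects the $1/n$ normalization used here.

```julia
using Statistics  # standard library: mean, var

# Hypothetical data (an assumption for illustration): five observed values.
x = [2.0, 4.0, 4.0, 5.0, 10.0]
n = length(x)

# Sample mean: (1/n) * sum of the X_i.
x_mean = sum(x) / n

# Sample variance with the 1/n normalization of Definition 1.0.1.
s2 = sum((x .- x_mean) .^ 2) / n

# The same quantities via Statistics; corrected = false selects the 1/n form.
@assert x_mean ≈ mean(x)
@assert s2 ≈ var(x, corrected = false)

println("sample mean = ", x_mean, ", sample variance = ", s2)
```

Note that `var(x)` without the keyword divides by $n - 1$ instead, so it would not match $S^2$ as defined above.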
Proposition 1.0.2
Suppose each $X_i$ has mean $\mu$ and variance $\sigma^2$. Then
$$
E[\bar{X}] = \mu, \qquad V[\bar{X}] = \frac{\sigma^2}{n}.
$$
By linearity of expectation,
$$
\begin{aligned}
E[\bar{X}] &= \frac{1}{n}\sum_{i=1}^n E[X_i] \\
&= \frac{1}{n}\sum_{i=1}^n \mu \\
&= \frac{n\mu}{n} = \mu.
\end{aligned}
$$
Next, observe that
$$
\bar{X} - \mu = \frac{1}{n}\left(\sum_{i=1}^n X_i \right) - \mu = \frac{1}{n}\sum_{i=1}^n\left(X_i - \mu\right). \tag{5}
$$
Then
$$
\begin{aligned}
V[\bar{X}] &= E[(\bar{X} - \mu)^2] \\
&= \frac{1}{n^2} E\left[
\left(\sum_{i=1}^n (X_i - \mu)\right)
\left(\sum_{j=1}^n (X_j - \mu)\right)
\right] \\
&= \frac{1}{n^2} \sum_{i=1}^n
E\left[(X_i - \mu)^2 \right]
+ \frac{2}{n^2} \sum_{i < j}
E\left[
(X_i - \mu)(X_j - \mu)
\right] \\
&\underset{\star}{=} \frac{1}{n^2} \sum_{i=1}^n \sigma^2 +
\frac{2}{n^2} \sum_{i < j} \underbrace{E[(X_i-\mu)]}_{=0}\,\underbrace{E[(X_j-\mu)]}_{=0} \\
&= \frac{n\sigma^2}{n^2} \\
&= \frac{\sigma^2}{n}.
\end{aligned}
$$
Here the step marked $\underset{\star}{=}$ uses the fact that the random variables are mutually independent, so the expectation of each product $(X_i - \mu)(X_j - \mu)$ with $i \neq j$ factors into the product of the expectations.
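A short simulation can make Proposition 1.0.2 concrete. The following is a sketch under assumptions that are not in the text above: the population is a fair die, the sample size is `n = 25`, and the seed and number of trials are arbitrary. It draws many independent samples, records each sample mean, and compares the empirical mean and variance of those sample means with $\mu$ and $\sigma^2/n$.

```julia
using Random, Statistics  # standard libraries

# Assumed population (for illustration only): the faces of a fair die.
dice = [1, 2, 3, 4, 5, 6]
μ = mean(dice)                      # population mean
σ² = var(dice, corrected = false)   # population variance (1/n form)

rng = MersenneTwister(2024)  # fixed seed so that reruns give the same numbers
n = 25                       # sample size
trials = 100_000             # number of repeated samples

# Draw `trials` independent samples of size n and record each sample mean.
means = [mean(rand(rng, dice, n)) for _ in 1:trials]

println("E[X̄] ≈ ", mean(means), "  (theory: ", μ, ")")
println("V[X̄] ≈ ", var(means, corrected = false), "  (theory: ", σ² / n, ")")
```

The empirical values should be close to the theoretical ones but not equal to them, since the simulation itself is random.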
Proposition 1.0.3
If the $X_i$ are independent with mean $\mu$ and variance $\sigma^2$, then
$$
E[S^2] = \frac{n-1}{n}\sigma^2.
$$
To see this, first rewrite the definition of $S^2$:
$$
\begin{aligned}
S^2 &= \frac{1}{n}\sum_{i=1}^n (X_i - \bar{X})^2 \\
&= \frac{1}{n}\sum_{i=1}^n ((X_i-\mu) - (\bar{X}- \mu))^2 \\
&\underset{\star}{=} \frac{1}{n}\sum_{i=1}^n (X_i - \mu)^2 +
\frac{1}{n}\sum_{i=1}^n (\bar{X}-\mu)^2 -
\frac{2}{n}\sum_{i=1}^n (X_i-\mu)(\bar{X} - \mu) \\
&= \frac{1}{n}\sum_{i=1}^n (X_i-\mu)^2 - (\bar{X} - \mu)^2.
\end{aligned}
$$
Here the simplification after the step marked $\underset{\star}{=}$ uses equation (5), $\bar{X} - \mu = \frac{1}{n}\sum_{i=1}^n (X_i - \mu)$: it gives $\sum_{i=1}^n (X_i - \mu) = n(\bar{X} - \mu)$, so the cross term equals $-2(\bar{X} - \mu)^2$, while the middle term equals $(\bar{X} - \mu)^2$. Combining this rewriting with the result of Proposition 1.0.2, we obtain
$$
\begin{aligned}
E[S^2] &= E\left[\frac{1}{n}\sum_{i=1}^n(X_i-\mu)^2\right] - E\left[(\bar{X}-\mu)^2\right] \\
&= \frac{n\sigma^2}{n} - \frac{\sigma^2}{n} \\
&= \frac{n-1}{n}\sigma^2.
\end{aligned}
$$
This completes the proof.
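The bias factor $(n-1)/n$ is easiest to see for small $n$. The sketch below again assumes a fair die as the population, with `n = 5` and an arbitrary seed and trial count; none of these choices come from the text. It averages $S^2$ (computed with the $1/n$ normalization) over many samples and compares the result with both $\frac{n-1}{n}\sigma^2$ and $\sigma^2$.

```julia
using Random, Statistics  # standard libraries

# Assumed population (for illustration only): the faces of a fair die.
dice = [1, 2, 3, 4, 5, 6]
σ² = var(dice, corrected = false)   # population variance (1/n form)

rng = MersenneTwister(7)  # fixed seed for reproducibility
n = 5                     # small sample size, so the bias is clearly visible
trials = 200_000          # number of repeated samples

# S² of each sample, using the 1/n normalization of Definition 1.0.1.
s2_values = [var(rand(rng, dice, n), corrected = false) for _ in 1:trials]

println("average of S² ≈ ", mean(s2_values))
println("(n-1)/n * σ²  = ", (n - 1) / n * σ²)
println("σ²            = ", σ²)
```

The average of $S^2$ should land near $\frac{n-1}{n}\sigma^2$ and visibly below $\sigma^2$, in line with Proposition 1.0.3.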
Theorem 2.0.1 (CLT) If $X_1, X_2, \dots, X_n$ is a random sample of size $n$ taken from a population with mean $\mu$ and finite variance $\sigma^2$, and if $\bar{X}$ is the sample mean, then the limiting form, as $n \to \infty$, of the distribution of
$$
Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{n}}
$$
is the standard normal distribution.
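Example 2.0.2 (Rolling a Die)
```julia
using Statistics
using Plots
pyplot()  # select the PyPlot backend (requires PyPlot.jl to be installed)
using LaTeXStrings
using Distributions

# Population: the faces of a fair die.
dice = [1, 2, 3, 4, 5, 6]
μ_dice = mean(dice)
σ_dice = std(dice, corrected = false)  # population standard deviation (1/n form)

n_sample = 500   # rolls per trial
n_trial = 3000   # number of trials

# Collect the sample mean of each trial.
X̄ = Float64[]
for t = 1:n_trial
    sample = rand(dice, n_sample)
    push!(X̄, mean(sample))
end

# Standardize the sample means as in Theorem 2.0.1.
Z = (X̄ .- μ_dice) ./ (σ_dice / √(n_sample))

# Compare the histogram of Z with the standard normal density.
p = histogram(Z, normalize = :pdf, label = "standardized sample mean")
d = Normal(0.0, 1.0)
plot!(p, x -> pdf(d, x), label = L"\mathcal{N}(0, 1)")
p = plot!(p, xlim = [-3, 3])
```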
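The same experiment can also be checked numerically, without a plot. This is a sketch that reuses the setup of the example but relies only on the standard library; the fixed seed and the threshold 1.96 (the approximate two-sided 95% point of the standard normal distribution) are choices made here for illustration. If the standardized means are approximately standard normal, their mean should be near 0, their standard deviation near 1, and roughly 95% of them should satisfy $|Z| < 1.96$.

```julia
using Random, Statistics  # standard libraries; no plotting packages needed

dice = [1, 2, 3, 4, 5, 6]
μ_dice = mean(dice)
σ_dice = std(dice, corrected = false)  # population standard deviation

rng = MersenneTwister(42)  # fixed seed for reproducibility (an arbitrary choice)
n_sample = 500
n_trial = 3000

# Standardized sample mean of each trial, as in Theorem 2.0.1.
Z = [(mean(rand(rng, dice, n_sample)) - μ_dice) / (σ_dice / sqrt(n_sample))
     for _ in 1:n_trial]

# Fraction of trials whose standardized mean lies within ±1.96.
coverage = count(z -> abs(z) < 1.96, Z) / n_trial

println("mean(Z) ≈ ", mean(Z), ", std(Z) ≈ ", std(Z))
println("fraction with |Z| < 1.96 ≈ ", coverage, "  (standard normal: about 0.95)")
```

Agreement here is only approximate: the theorem describes a limit, and both `n_sample` and `n_trial` are finite.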