| Author | fcamel (飛啊!warp的小駱駝) | Board | P_fcamel |
| Title | [statistics] Central Limit Theorem |
| Time | Tue Mar 28 02:24:21 2006 |
This theorem is really cool.
It was only after learning the CLT that I got the feeling "I'm actually learning statistics"
background
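population: the whole distribution, the true one; usually we don't know how it is distributed
sample : a draw; pick a few data points out of the population and look at them
mean of population and mean of samples are not the same thing,
but mean of samples can be used to estimate mean of population,
so in practice what we usually work with is mean of samples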
definition
A statistic is a function of the random variables in a random sample.
statistic: a quantity computed from the sample
random sample -> independent and identically distributed (iid)
In plain words: given some samples, plug them into a formula and get one statistic
ex: Xavg = sigma(Xi)/n, V = sigma((Xi - Xavg)^2)/(n-1),
Xavg, V are statistics
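To make the two example statistics concrete, here is a minimal sketch in Python. The function names and the data values are made up for illustration; the formulas are the Xavg and V given above, and the standard library's statistics.variance is used only as a cross-check.

```python
import statistics

def sample_mean(xs):
    # Xavg = sigma(Xi) / n
    return sum(xs) / len(xs)

def sample_variance(xs):
    # V = sigma((Xi - Xavg)^2) / (n - 1)
    xavg = sample_mean(xs)
    return sum((x - xavg) ** 2 for x in xs) / (len(xs) - 1)

# Hypothetical sample of n = 5 observations (illustrative values only).
data = [2.0, 4.0, 4.0, 5.0, 7.0]

xavg = sample_mean(data)
v = sample_variance(data)

print("Xavg =", xavg)
print("V    =", v)
# Cross-check against the standard library's (n - 1) sample variance.
print("statistics.variance =", statistics.variance(data))
```

Both functions take only the sample as input, which is exactly what makes Xavg and V statistics.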
central limit theorem
let Xavg = (X1 + X2 + ... + Xn)/n, where the Xi are iid random variables,
let u = E[Xi], v = V[Xi] (v finite, v > 0)
E[Xavg] = (u + u + ... + u)/n = u
V[Xavg] = (v + v + ... + v)/(n^2) = v/n (the Xi are independent, so the variances add)
standardize Xavg: (CLT)
Z = (Xavg - u) / (v/n)^0.5,
as n -> Inf., the distribution of Z converges to the standard normal distribution N(0, 1)
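The standardization can be tried out with a small simulation. This is only a sketch under assumptions that are not in the notes above: Xi is taken to be exponential with rate 1 (so u = 1 and v = 1), n = 50, and the experiment is repeated k = 5000 times with a fixed seed. If Z is close to standard normal, its mean should be near 0, its standard deviation near 1, and roughly 95% of the values should fall inside (-1.96, 1.96).

```python
import random
import statistics

def standardized_mean(n, rng, u=1.0, v=1.0):
    # Draw n iid exponential(rate=1) values; that distribution has u = 1, v = 1.
    xs = [rng.expovariate(1.0) for _ in range(n)]
    xavg = sum(xs) / n
    # Z = (Xavg - u) / (v/n)^0.5
    return (xavg - u) / (v / n) ** 0.5

rng = random.Random(2006)  # fixed seed so the run is repeatable
n, k = 50, 5000            # n values per average, k repeated averages

zs = [standardized_mean(n, rng) for _ in range(k)]

z_mean = statistics.mean(zs)
z_stdev = statistics.stdev(zs)
inside = sum(1 for z in zs if abs(z) < 1.96) / k

print("mean of Z :", z_mean)
print("stdev of Z:", z_stdev)
print("fraction with |Z| < 1.96:", inside)
```

The exponential distribution is quite skewed, so it is a reasonable test that the shape of Xi does not matter much once n is moderately large.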
meaning of CLT
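Whatever the probability distribution of Xi is (as long as its variance is finite), take n values at a time, average them, and treat this average as a new random variable.
As n gets large, the distribution of this average approaches a normal distribution
( repeating the draw k times to see that distribution uses kn samples in total )
Since what we are observing is the statistic "mean" (mean of samples),
its distribution is naturally centered at mean of population and falls off on both sides
Looking at Z: Xavg does approach a normal distribution as n grows,
but because of the averaging, its variance shrinks to v/n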
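The shrinking variance can also be checked numerically. A rough sketch, assuming Xi is uniform on [0, 1) (so v = 1/12), with k = 4000 repeated averages for each n; the seed, k, and the list of n values are arbitrary choices for illustration. The observed variance of the averages should land close to v/n.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the run is repeatable
v = 1.0 / 12.0           # variance of a uniform [0, 1) random variable
k = 4000                 # number of averages drawn for each n

def variance_of_means(n, k, rng):
    # Build k averages, each from n iid uniform draws (k * n samples in total),
    # then measure how spread out those averages are.
    means = [sum(rng.random() for _ in range(n)) / n for _ in range(k)]
    return statistics.variance(means)

observed = {n: variance_of_means(n, k, rng) for n in (1, 4, 16, 64)}

for n, var in observed.items():
    print("n =", n, " observed V[Xavg] =", var, " v/n =", v / n)
```

Each fourfold increase in n should cut the observed variance to roughly a quarter, which is the v/n behavior from the derivation.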