Central Limit Theorem

Author	fcamel (飛啊!warp的小駱駝)	Board	P_fcamel
Title	[statistics] Central Limit Theorem
Time	Tue Mar 28 02:24:21 2006

This theorem is really cool.
It was only after learning the CLT that I got the feeling of "I am actually studying statistics".

background

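    population: the whole distribution, i.e. the true distribution; we usually do not know what it looks like
    sample    : a few data points drawn from the population to look at

    The mean of the population and the mean of the samples are not the same thing.
    The mean of the samples can be used to estimate the mean of the population,
    and in practice it is usually the mean of the samples that gets used.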

definition

    A statistic is a function of the random variables in a random sample.

    statistic    : a quantity computed from the samples
    random sample -> independent and identically distributed (iid)

    In plain words: given some samples, plug them into a formula and the result is a statistic

    ex: Xavg = sigma(Xi)/n, V = sigma((Xi - Xavg)^2)/(n-1),
        Xavg, V are statistics
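
The two statistics in the example can be written out as a small Python sketch. The function names and the data values below are made up for illustration; they are not from any particular textbook.

```python
import statistics

def sample_mean(xs):
    # Xavg = sigma(Xi) / n
    return sum(xs) / len(xs)

def sample_variance(xs):
    # V = sigma((Xi - Xavg)^2) / (n - 1)
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (len(xs) - 1)

# Hypothetical sample, chosen only to have easy arithmetic.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

xavg = sample_mean(data)
v = sample_variance(data)

print("Xavg =", xavg)
print("V    =", v)
# The standard library uses the same n-1 definition.
print("statistics.variance =", statistics.variance(data))
```

Both functions take only the samples as input, which is exactly what makes Xavg and V statistics.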

central limit theorem

let Xavg = (X1 + X2 + ... + Xn)/n, where X1, ..., Xn are iid random variables

let u = E[Xi], v = V[Xi] (v finite)
E[Xavg] = (u + u + ... + u)/n = u
V[Xavg] = (v + v + ... + v)/(n^2) = v/n    (the variances add because the Xi are independent)
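
A quick way to sanity-check E[Xavg] = u and V[Xavg] = v/n is a simulation. The sketch below assumes Xi is a fair six-sided die (so u = 3.5, v = 35/12); the die, n = 10, the number of trials, and the seed are all arbitrary choices for illustration.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

n = 10          # number of Xi averaged into one Xavg
trials = 20000  # how many times Xavg is recomputed

u = 3.5         # E[Xi] for a fair six-sided die
v = 35 / 12     # V[Xi] for a fair six-sided die

# Each entry is one realization of Xavg = (X1 + ... + Xn)/n.
xavgs = [
    sum(rng.randint(1, 6) for _ in range(n)) / n
    for _ in range(trials)
]

mean_of_xavg = statistics.mean(xavgs)
var_of_xavg = statistics.pvariance(xavgs)

print("E[Xavg] simulated:", mean_of_xavg, " theory:", u)
print("V[Xavg] simulated:", var_of_xavg, " theory:", v / n)
```

The simulated values should land close to u and v/n, though not exactly on them, since this is only a finite number of trials.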

standardize Xavg: (CLT)

    Z = (Xavg - u) / (v/n)^0.5,
    as n -> infinity, the distribution of Z converges to the standard normal distribution N(0, 1)
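
The standardization can also be checked numerically. This is only a rough sketch: it assumes Xi follows an exponential distribution with rate 1 (so u = 1, v = 1), which is clearly not normal, and it looks at whether Z behaves like N(0, 1). The choice of distribution, n = 50, the number of trials, and the seed are assumptions for illustration.

```python
import math
import random
import statistics

rng = random.Random(1)  # fixed seed so the run is repeatable

n = 50          # number of Xi averaged into one Xavg
trials = 10000  # number of Z values generated

u = 1.0  # E[Xi] for an exponential distribution with rate 1
v = 1.0  # V[Xi] for an exponential distribution with rate 1

def draw_z():
    # Z = (Xavg - u) / (v/n)^0.5
    xavg = sum(rng.expovariate(1.0) for _ in range(n)) / n
    return (xavg - u) / math.sqrt(v / n)

zs = [draw_z() for _ in range(trials)]

z_mean = statistics.mean(zs)
z_var = statistics.pvariance(zs)
# For a standard normal, about 95% of the mass lies in (-1.96, 1.96).
inside = sum(1 for z in zs if abs(z) < 1.96) / trials

print("mean of Z:", z_mean)
print("variance of Z:", z_var)
print("fraction with |Z| < 1.96:", inside)
```

With a finite n the match is only approximate; the theorem is a statement about the limit.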

meaning of CLT

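No matter what probability distribution Xi follows (as long as its variance is finite),
draw n values at a time, average them, and treat that average as a new random variable.
When n is large enough, this average is approximately normally distributed.
( Repeating the draw k times uses kn samples in total, and gives k values of the average to look at )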
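
To see the "any distribution" part in action, here is a sketch that starts from a very lopsided Xi: it is 1 with probability 0.1 and 0 otherwise. For each n it draws k averages (so kn samples in total) and measures their skewness, which is 0 for a perfectly symmetric distribution such as the normal. The values of p, k, the list of n, and the seed are arbitrary, and skewness is just one convenient way to measure "how far from normal"; it is not the only one.

```python
import random

rng = random.Random(2)  # fixed seed so the run is repeatable

p = 0.1   # P(Xi = 1); Xi = 0 otherwise, a heavily skewed distribution
k = 5000  # number of averages drawn for each n (uses k*n samples)

def skewness(xs):
    # Third central moment divided by the 1.5 power of the second one.
    m = sum(xs) / len(xs)
    m2 = sum((x - m) ** 2 for x in xs) / len(xs)
    m3 = sum((x - m) ** 3 for x in xs) / len(xs)
    return m3 / m2 ** 1.5

skew_by_n = {}
for n in (1, 10, 100):
    xavgs = [
        sum(1 if rng.random() < p else 0 for _ in range(n)) / n
        for _ in range(k)
    ]
    skew_by_n[n] = skewness(xavgs)

for n, s in skew_by_n.items():
    print("n =", n, " skewness of the averages =", s)
```

The skewness should shrink toward 0 as n grows, so it is the size of n that makes the averages look normal; k only controls how many averages there are to examine.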

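Since the statistic being observed is the "average" (mean of samples),
its distribution is naturally centered on the mean of the population and falls off toward both sides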

Looking at Z: Xavg does approach a normal distribution as n grows,
but because of the averaging its variance shrinks to v/n, which is why Z divides by (v/n)^0.5