Statistical Inference
CONCLUSION OF STUDY | TRUE STATE OF AFFAIRS: EFFECTIVE | TRUE STATE OF AFFAIRS: NOT EFFECTIVE |
EFFECTIVE | True positive (correct) | False positive (type I / alpha error) |
NOT EFFECTIVE | False negative (type II / beta error) | True negative (correct) |
Standard Error Tests
These tests use the 95% level of confidence as the cut-off: any value falling outside the 95% confidence interval (roughly two standard errors either side of the estimate) is declared statistically significant.
These include
- Means /proportions of one sample
- Difference between means / proportions of two samples
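As a minimal sketch of the standard error approach for the mean of one sample, the Python below builds the interval as mean ± 1.96 standard errors and checks whether a hypothesized value falls outside it. The readings and the hypothesized values are made-up illustration data, and the function names are hypothetical. For a sample this small a t multiplier would be more appropriate than 1.96.

```python
import math
import statistics

def confidence_interval_95(sample):
    """Return (lower, upper) of the approximate 95% CI for the mean."""
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)        # sample standard deviation
    se = sd / math.sqrt(len(sample))     # standard error of the mean
    return mean - 1.96 * se, mean + 1.96 * se

def outside_interval(value, interval):
    """True if value lies outside the interval, i.e. is declared significant."""
    lower, upper = interval
    return value < lower or value > upper

# Hypothetical systolic BP readings (illustration only)
sample = [128, 131, 125, 135, 129, 132, 127, 130, 134, 126]
ci = confidence_interval_95(sample)
print(ci)
print(outside_interval(120, ci))   # is a hypothesized mean of 120 outside the CI?
print(outside_interval(130, ci))   # is a hypothesized mean of 130 outside the CI?
```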
The goal of SIGNIFICANCE TESTING in statistical inference is to see whether the observed test result or hypothesized difference (the alternate hypothesis, H_{A}) is likely to be due to chance. It is based on the principle of relating the observed findings to a hypothetical true state of affairs, the null hypothesis (H_{0}).
H_{0} is rejected when the reported p value falls below the significance level, which is conventionally (and arbitrarily) set at 0.05. This corresponds to the remaining 5% of the distribution that is left outside the 95% confidence interval in the standard error approach.
Steps in significance testing
1. Generating hypothesis
NULL HYPOTHESIS VS ALTERNATE HYPOTHESIS
2. Determine the acceptable probability of error
SIGNIFICANCE LEVEL OF SAY 0.05, OR 0.01 (if a stricter standard is wanted)
3. Choose appropriate statistical test; calculate test statistic
4. Obtain the p value for the calculated test statistic
5. Describe inference and conclusion
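The five steps can be sketched for a difference between two proportions using a two-sided z test. This is only an illustration under stated assumptions: the event counts (30 of 100 versus 15 of 100) are invented, the function name is hypothetical, and the pooled standard error is one common choice for this test.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-sided z test for the difference between two proportions.

    x1, x2 are event counts; n1, n2 are group sizes.
    Returns (z, p_value).
    """
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)   # common proportion assumed under H0
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Step 1: H0: the two proportions are equal; HA: they differ.
# Step 2: fix the significance level before looking at the data.
alpha = 0.05
# Steps 3 and 4: calculate the test statistic and obtain its p value.
z, p_value = two_proportion_z_test(30, 100, 15, 100)
# Step 5: describe the conclusion.
reject_h0 = p_value < alpha
print(round(z, 2), round(p_value, 4), reject_h0)
```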
Choosing appropriate statistical tests
It turns out that if you were to go out and sample many, many times, most sample statistics that you could calculate would follow a normal distribution.
Remember that a normal curve is characterized by two parameters: a mean and a measure of variability (the standard deviation, SD).
Remember that the standard deviation describes the natural variability of the population.
The standard error, by contrast, describes the variability of a sample statistic. It can be the standard error of the mean, of the odds ratio, of the difference of 2 means, or of any other sample statistic.
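A minimal sketch of three common standard errors, using the usual textbook formulas. The function names are hypothetical and the numbers in the usage lines are invented for illustration.

```python
import math

def se_mean(sd, n):
    """Standard error of a sample mean: SD / sqrt(n)."""
    return sd / math.sqrt(n)

def se_diff_means(sd1, n1, sd2, n2):
    """Standard error of the difference between two independent means."""
    return math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)

def se_proportion(p, n):
    """Standard error of a sample proportion."""
    return math.sqrt(p * (1 - p) / n)

print(se_mean(10, 100))                  # SD of 10 in a sample of 100
print(se_diff_means(10, 100, 10, 100))   # two groups with the same SD and size
print(se_proportion(0.5, 100))           # proportion of 0.5 in a sample of 100
```

Note that the standard error shrinks as the sample size grows, while the standard deviation of the population does not.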
Examples of Sample Statistics:
- Single population mean
- Single population proportion
- Difference in means (t-test)
- Difference in proportions (z-test, χ^{2} test)
- Odds ratio/risk ratio
- Correlation coefficient
- Regression coefficient
Common statistical tests
DESIGN | NATURE OF VARIABLES | STATISTICAL TEST | STATISTIC DERIVED |
2 independent groups | Qualitative (nominal) | Chi-square test | χ^{2} |
2 independent groups | Quantitative (continuous) | Student t-test | t |
2 related groups | Qualitative (nominal) | Chi-square test | χ^{2} |
2 related groups | Quantitative (continuous) | Paired t-test | t |
More than 2 independent groups | Qualitative (nominal) | Chi-square test | χ^{2} |
More than 2 independent groups | Quantitative (continuous) | ANOVA | F |
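The F statistic named in the last row can be illustrated with a one-way ANOVA, where F is the ratio of the between-group mean square to the within-group mean square. This is a sketch only: the function name is hypothetical and the three small groups are invented data.

```python
import statistics

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    # Variability of the group means around the grand mean
    ss_between = sum(len(g) * (statistics.mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of the values around their own group mean
    ss_within = sum(sum((x - statistics.mean(g)) ** 2 for x in g) for g in groups)
    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

groups = [[1, 2, 3], [2, 3, 4], [3, 4, 5]]   # illustration data
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f_stat, df_b, df_w)
```

The calculated F is then compared with the F table value for the two degrees of freedom at the chosen level of significance.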
Many statistics follow normal (or t) distributions:
- Means/difference in means
– t-distribution for small samples
- Proportions/difference in proportions
- Regression coefficients
– t-distribution for small samples
- Natural log of the odds ratio
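Because the natural log of the odds ratio is approximately normal, a confidence interval is usually built on the log scale and then converted back. The sketch below assumes a 2×2 table with cells a, b, c, d and uses the common large-sample formula for the standard error of the log odds ratio, √(1/a + 1/b + 1/c + 1/d). The function name and the counts are hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and approximate 95% CI from a 2x2 table.

    a, b = exposed with / without the outcome
    c, d = unexposed with / without the outcome
    """
    odds_ratio = (a * d) / (b * c)
    log_or = math.log(odds_ratio)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(log_or - z * se_log_or)   # back-transform to the OR scale
    upper = math.exp(log_or + z * se_log_or)
    return odds_ratio, lower, upper

odds_ratio, lower, upper = odds_ratio_ci(20, 80, 10, 90)   # illustration data
print(odds_ratio, lower, upper)
print(lower <= 1 <= upper)   # an interval containing 1 is not significant at 0.05
```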
Parametric tests (t, z and F tests)
Parametric tests test hypotheses about means or variances. Their hypotheses refer to a population parameter: the population mean (z or t tests) or the population variance (F tests). They concern interval/ratio scale data (weight, BP, IQ) and assume the population data to be normally distributed.
Non-parametric tests (chi-square test)
These do not test hypotheses concerning parameters and do not assume that the data are normally distributed. They are used to test nominal/ordinal data and are less powerful than parametric tests.
Chi-square test: essentials
It is used to find out whether an observed difference between the proportions of events in 2 or more groups can be considered statistically significant. It is a non-parametric test and uses qualitative, discrete data entered as frequencies (counts), not as percentages. The statistic calculated is chi-square (χ^{2}).
Steps of applying the chi-square test (χ^{2})
An assumption of no difference (the null hypothesis) is made, which is then rejected or not rejected with the help of the χ^{2} test.
– Fix a level of significance (0.05) against which the p value will be compared
– Enter the study data in the table: observed frequency (O)
– Calculate the expected frequency (E) for each cell: E = (RT × CT)/GT, where RT is the row total, CT the column total and GT the grand total
– Calculate the χ^{2} value for each cell = (O − E)^{2}/E
– Add up the results of all cells: calculated χ^{2} = ∑(O − E)^{2}/E
– Degrees of freedom (Df) = (C − 1) × (R − 1); this is 1 in a 2×2 table
– Read the χ^{2} table for the corresponding p value
– Compare this p value with the level of significance (0.05) and decide the fate of the null hypothesis: if the p value is smaller, H_{0} is rejected in favour of the alternate (research) hypothesis
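The steps above can be sketched in Python for a table of observed frequencies. The function names and the 2×2 counts are hypothetical. The p value helper is valid only for 1 degree of freedom, where the χ^{2} upper tail can be written with the complementary error function; for larger tables a χ^{2} table or a statistics library would be needed.

```python
import math

def chi_square_statistic(observed):
    """Chi-square statistic for an R x C table of observed frequencies.

    Returns (chi2, degrees_of_freedom).
    """
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / grand_total   # E = RT x CT / GT
            chi2 += (o - e) ** 2 / e                          # (O - E)^2 / E
    df = (len(observed) - 1) * (len(observed[0]) - 1)         # (R - 1) x (C - 1)
    return chi2, df

def p_value_df1(chi2):
    """Upper-tail p value of chi-square, valid for 1 degree of freedom only."""
    return math.erfc(math.sqrt(chi2 / 2))

# Hypothetical 2x2 table: rows = groups, columns = event / no event
table = [[30, 70], [15, 85]]
chi2, df = chi_square_statistic(table)
p = p_value_df1(chi2)
print(round(chi2, 3), df, round(p, 4), p < 0.05)
```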
Student’s t-test
Student’s t-test is applied to numerical data (mean values). It assumes that the variables are normally distributed and that the samples were drawn at random.
Steps
– Calculate the t value (from the data)
– Choose the level of significance (LOS), e.g. 0.05
– Determine the degrees of freedom (sum of the 2 sample sizes minus 2)
– Find the t value for the chosen LOS and degrees of freedom from the table; if the absolute calculated t value is greater than the table value, H_{0} is rejected.
How to calculate the t value
- Calculate the means of the 2 groups/samples (X1, X2)
- Calculate their difference (X1 − X2)
- Calculate the standard deviation (SD) of each group
- Calculate the standard error (SE) of each group (SD/√n)
The t value = (X1 − X2) / √(SD1^{2}/n1 + SD2^{2}/n2), i.e. the difference in means divided by the standard error of the difference.
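A minimal sketch of the t value calculation for two independent groups, taking the standard error of the difference as √(SD1²/n1 + SD2²/n2) and the degrees of freedom as n1 + n2 − 2. The function name and the two groups are invented for illustration. The table value 2.306 is the two-sided 0.05 critical value of t for 8 degrees of freedom.

```python
import math
import statistics

def t_value(group1, group2):
    """Return (t, degrees_of_freedom) for two independent samples."""
    x1, x2 = statistics.mean(group1), statistics.mean(group2)
    sd1, sd2 = statistics.stdev(group1), statistics.stdev(group2)
    n1, n2 = len(group1), len(group2)
    se_diff = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)   # SE of the difference
    t = (x1 - x2) / se_diff
    df = n1 + n2 - 2
    return t, df

group1 = [5, 6, 7, 8, 9]   # illustration data
group2 = [1, 2, 3, 4, 5]
t, df = t_value(group1, group2)

T_TABLE_DF8_005 = 2.306    # two-sided critical t for df = 8 at LOS 0.05
reject_h0 = abs(t) > T_TABLE_DF8_005
print(t, df, reject_h0)
```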