Measures of Variability

Central tendency does not tell us everything. Dispersion (also called deviation or spread) tells us a lot about how a variable is distributed. We are most interested in the standard deviation (σ) and the variance (σ²).

Dispersion

Once you determine that the variable of interest is normally distributed, ideally by producing a histogram of the scores, the next question to ask about the normal distribution curve concerns its dispersion: how spread out are the scores around the mean? Dispersion is a key concept in statistical thinking.

The basic question is: how much do the scores deviate from the mean? The more “bunched up” the scores are around the mean, the more accurately you can make predictions.

Mean Deviation

The key concept for describing normal distributions and making predictions from them is called deviation from the mean. The simplest approach is to calculate the average distance between each observation and the mean. We must take the absolute value of each distance; otherwise the positive and negative deviations would cancel out to exactly zero.

[Figure: mean deviation example]
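The calculation described above can be sketched in a few lines of Python. The scores below are hypothetical values chosen so that the mean is a whole number, and `mean_deviation` is an illustrative name, not a standard library function.

```python
def mean_deviation(values):
    """Average absolute distance of each observation from the mean."""
    mean = sum(values) / len(values)
    return sum(abs(x - mean) for x in values) / len(values)


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores; their mean is 5
mean = sum(scores) / len(scores)

# Without absolute values the signed deviations cancel out.
signed_sum = sum(x - mean for x in scores)
print(signed_sum)              # 0.0

# With absolute values we get a usable measure of spread.
print(mean_deviation(scores))  # 1.5
```

The first printed value shows why the absolute value is needed: the deviations −3, −1, −1, −1, 0, 0, 2 and 4 sum to zero.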

Is it Really that Easy?

  • No!
  • Absolute values are difficult to manipulate algebraically
  • Absolute values cause problems for calculus (the absolute value function is not differentiable at zero)
  • We need something else…

Variance and Standard Deviation

Instead of taking the absolute value, we square each deviation from the mean, which also yields a non-negative value. The resulting measures are called the variance (the average squared deviation) and the standard deviation (the square root of the variance).

For a sample:

s: standard deviation

s²: variance

For a population:

σ: standard deviation

σ²: variance

[Figure: calculating variance and standard deviation]
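As a minimal sketch of the calculation, the function below squares the deviations and divides by n for a population (σ²) or by n − 1 for a sample (s²), which is the usual convention. The scores are hypothetical, and the function name and `sample` flag are assumptions made for illustration.

```python
from math import sqrt


def variance(values, sample=True):
    """Sum of squared deviations divided by n - 1 (sample) or n (population)."""
    n = len(values)
    mean = sum(values) / n
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return squared_deviations / (n - 1) if sample else squared_deviations / n


scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores; their mean is 5

pop_var = variance(scores, sample=False)  # sigma squared
pop_sd = sqrt(pop_var)                    # sigma
samp_var = variance(scores)               # s squared
samp_sd = sqrt(samp_var)                  # s

print(pop_var, pop_sd)                        # 4.0 2.0
print(round(samp_var, 3), round(samp_sd, 3))  # 4.571 2.138
```

The sample figures are slightly larger than the population figures because the same sum of squared deviations (32) is divided by 7 rather than 8.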

In a normal curve, the area within

o   1 SD of the mean comprises about 68% of the total area

o   2 SD of the mean comprises about 95% of the total area

o   3 SD of the mean comprises about 99.7% of the total area

(the 68-95-99.7 rule)
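These percentages can be checked numerically. The sketch below uses `NormalDist` from Python's standard `statistics` module (available from Python 3.8); `area_within` is an illustrative helper name.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)


def area_within(k):
    """Proportion of a normal curve lying within k SD either side of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)


for k in (1, 2, 3):
    print(k, round(100 * area_within(k), 2))
# 1 68.27
# 2 95.45
# 3 99.73
```

The rule's figures are therefore rounded values. Exactly 95% of the area lies within about 1.96 SD of the mean.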

Coefficient of Variation

The coefficient of variation (CV) measures the spread of a data set as a proportion of its mean. It is expressed as a percentage. For a sample, it is the ratio of the sample standard deviation to the sample mean. The CV of a population is based on the expected value and the SD of a random variable.

CV = (standard deviation / mean) × 100
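A short sketch of the formula, using the sample standard deviation from Python's `statistics` module. Both data sets are hypothetical; the second is the first multiplied by 10, as if the same measurements had been recorded in a smaller unit.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the sample mean."""
    return stdev(values) / mean(values) * 100


scores = [2, 4, 4, 4, 5, 5, 7, 9]    # hypothetical measurements
rescaled = [x * 10 for x in scores]  # same measurements in a smaller unit

cv_scores = coefficient_of_variation(scores)
cv_rescaled = coefficient_of_variation(rescaled)

print(round(cv_scores, 2))    # 42.76
print(round(cv_rescaled, 2))  # 42.76
```

Because the standard deviation and the mean scale together, the CV does not depend on the unit of measurement. This makes it useful for comparing the variability of characteristics measured on different scales.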

Percentiles

Percentiles describe the variability of a distribution. The pth percentile of a distribution is the value such that p% of observations fall at or below it. The median is the 50th percentile. Percentiles are used to construct growth charts for nutritional surveillance and monitoring.
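One simple way to apply this definition is the nearest-rank method: take the smallest observed value that has at least p% of the observations at or below it. The sketch below assumes that method (statistical packages often interpolate between observations instead) and uses a small illustrative data set.

```python
from math import ceil


def percentile_nearest_rank(values, p):
    """Smallest observation with at least p% of observations at or below it."""
    if not 0 < p <= 100:
        raise ValueError("p must be greater than 0 and at most 100")
    data = sorted(values)
    rank = ceil(p * len(data) / 100)  # 1-based position in the sorted data
    return data[rank - 1]


data = [2, 3, 4, 5, 6, 6, 6, 7, 7, 8, 9]  # illustrative data set

print(percentile_nearest_rank(data, 50))   # 6, the median
print(percentile_nearest_rank(data, 25))   # 4
print(percentile_nearest_rank(data, 100))  # 9, the largest value
```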

Quartiles

Quartiles are the values that divide the data into four groups containing equal numbers of observations. They are the 25th, 50th and 75th percentiles; the 50th percentile (second quartile) is the median.

The first quartile is the median of the observations below the median of the complete data set; the third quartile is the median of the observations above it.
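The median-of-halves rule described above can be sketched as follows. The function name is illustrative, and it assumes that when the number of observations is odd, the middle observation is left out of both halves. Other quartile conventions exist and can give slightly different values.

```python
from statistics import median


def quartiles_by_halves(values):
    """Return (Q1, median, Q3) using the median-of-halves rule."""
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower = data[:half]                               # observations below the median
    upper = data[half + 1:] if n % 2 else data[half:]  # observations above the median
    return median(lower), median(data), median(upper)


data = [2, 3, 4, 5, 6, 6, 6, 7, 7, 8, 9]  # data set from the Range section
print(quartiles_by_halves(data))          # (4, 6, 7)
```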

Range

The range of a sample or data set is the difference between the largest and smallest observed values of some quantifiable characteristic. It is a simple but crude summary measure. Like the mean, it is affected by extreme values.

Data: 2,3,4,5,6,6,6,7,7,8,9

Range = 9 − 2 = 7

Interquartile Range (IQR)

It is calculated as the difference between the upper and lower quartiles. The IQR is the width of the interval that contains the middle 50% of the sample. It is no larger than the range and is less affected by outliers.

Data: 2,3,4,5,6,6,6,7,7,8,9

Upper quartile = 7, lower quartile = 4, IQR = 7 − 4 = 3
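To illustrate why the IQR is less affected by outliers than the range, the sketch below compares both measures on the eleven values from the Range example and on a copy in which the largest value is replaced by a hypothetical outlier of 90. It relies on `statistics.quantiles` (Python 3.8 or later) with its default method, which gives the same quartiles as the worked example for this data set. Other methods may differ.

```python
from statistics import quantiles


def value_range(values):
    """Difference between the largest and smallest observations."""
    return max(values) - min(values)


def interquartile_range(values):
    """Difference between the upper and lower quartiles."""
    q1, _median, q3 = quantiles(values, n=4)
    return q3 - q1


data = [2, 3, 4, 5, 6, 6, 6, 7, 7, 8, 9]
with_outlier = data[:-1] + [90]  # hypothetical outlier replaces the 9

print(value_range(data), interquartile_range(data))                  # 7 3.0
print(value_range(with_outlier), interquartile_range(with_outlier))  # 88 3.0
```

The single extreme value inflates the range from 7 to 88, while the IQR stays at 3.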
