In ecological studies, we consider aggregates of individuals. Aggregates are often defined by units such as geographic region, school, or health care facility.
We try to find out whether the overall occurrence of disease in a population correlates with the occurrence of the exposure. No individual-level data are available. The aggregate data are used primarily for hypothesis generation rather than hypothesis testing.
Examples of aggregate data:
• Disease rates (incidence, mortality, etc.)
• Birth rates
• “Exposure” data: smoking rates
• Geographic residence
• Air pollution data
• Mean per capita income
• Consumption of saturated fats
• Proximity to nuclear power plants
Ecological studies are useful for generating hypotheses, supporting existing hypotheses, or intervening at the population level.
Examples
• Rates of stomach cancer declined dramatically after the advent of refrigeration in the 1930s, which supports studies showing that the risk of stomach cancer increases with consumption of nitrates in preserved foods (sausage, lunch meat, etc.)
• Smoking and lung cancer
• Oral cancer and snuff use in the KPK
Ecologic Fallacy
Associations seen in grouped data do not necessarily hold at the individual level.
Example:
Fat intake and breast cancer rates, with countries as the unit of measurement, have consistently been found to be highly correlated. But studies of individuals (cohort and case-control studies) have not found any association between fat intake and breast cancer.
Why?
One possible reason is that countries with high fat intake are more likely to have other risk factors associated with breast cancer (e.g. late age at first pregnancy).
Another is that within-population variability in fat intake is low, while between-population variability is high.
As an extreme example, if everyone in a country had high fat intake, we could not detect any excess risk within that country, because there would be no low-intake group to compare them with.
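The fallacy can be illustrated with a small toy calculation. The sketch below uses entirely made-up numbers (three hypothetical countries A, B and C, five individuals each; none of the values come from real studies). Baseline risk differs between countries for reasons unrelated to an individual's own fat intake, and within each country fat intake varies only slightly and is unrelated to risk. The names countries, deviations, noise and pearson are assumptions introduced only for this illustration.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(vx * vy)

# Hypothetical countries: (mean fat intake, baseline risk).
# Baseline risk stands in for other country-level risk factors.
countries = {"A": (20.0, 1.0), "B": (40.0, 2.0), "C": (60.0, 3.0)}

# Small individual deviations in fat intake (low within-country variability)
# and individual risk noise chosen to be uncorrelated with those deviations.
deviations = [-2.0, -1.0, 0.0, 1.0, 2.0]
noise = [1.0, -2.0, 2.0, -2.0, 1.0]

individuals = {}
for name, (mean_fat, baseline) in countries.items():
    fat = [mean_fat + d for d in deviations]
    risk = [baseline + 0.1 * e for e in noise]
    individuals[name] = (fat, risk)

# Ecological analysis: one data point per country (the country means).
country_fat = [sum(f) / len(f) for f, _ in individuals.values()]
country_risk = [sum(r) / len(r) for _, r in individuals.values()]
ecological_r = pearson(country_fat, country_risk)

# Individual-level analysis: correlation within each country.
within_r = {name: pearson(f, r) for name, (f, r) in individuals.items()}

print("between-country correlation:", round(ecological_r, 3))
for name, r in within_r.items():
    print("within country", name, "correlation:", round(r, 3))
```

In this constructed example the country-level correlation is essentially perfect, while the correlation inside every country is essentially zero, so concluding from the grouped data that an individual's fat intake raises that individual's risk would be an ecologic fallacy.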