Friday, December 16, 2011

Hierarchical Linear Models, Overview


Hierarchical Linear Models (HLMs) offer specialized statistical approaches to data that are organized in a hierarchy. A typical example is pupils nested within classes, nested within schools, nested within school districts. If your research explores the relationship between individuals and society, HLMs will be of interest.

A conventional analysis of a hierarchical data set might (1) ignore the hierarchy and just sample students or (2) include indicator or dummy variables for the classes, schools, and districts within which the lower levels are nested. The issues raised by HLMs involve understanding the effects of including or failing to include hierarchical effects, and of precisely how those effects are included.

HLMs have been around under different names (mixed models, random effects models, variance components models, nested models, multi-level models, etc.) since the beginning of modern statistics, but they only became popular in the 1990s. This is a little strange, since it is hard to think of a unit of analysis one might study that is not nested within some hierarchy (think about patients, countries, firms, workers, etc.). It should also be pointed out that individuals are not always the lowest level of analysis; roles and repeated observations on individuals have also been used to define the lowest level.

Why have HLMs only recently become popular? I can offer three reasons: (1) The slow (very slow) diffusion of ideas from General Systems Theory (GST). Entire academic disciplines have developed around atomistic ideas describing their units of analysis (think of homo economicus in Economics), so the ideas of GST have, understandably, met with a lot of resistance. And, let's face it, analyzing systems is more difficult than analyzing individuals (not that analyzing individuals was ever easy). (2) The world is becoming more interconnected. In a globalized world, systems become more important and more determinative. And (3), probably most importantly, software is now available in the major statistical packages (HLM, SAS, SPSS, Stata, and R).

This does not mean, however, that studies conducted using atomistic (pooled) analysis are wrong. If the hierarchy doesn't affect the unit of analysis you are studying, parsimony suggests that you ignore it. In other words, a case always has to be made that HLMs are appropriate to your sample. And, there is some evidence that misapplication of HLMs can obscure otherwise significant results.

Consider a simple two-level linear model.
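With i indexing the first-level units (pupils, say) and j indexing the second-level groups, the model can be written along these lines:

$$
\begin{aligned}
\text{Level 1:} \quad & Y_{ij} = a_j + b_j Z_{ij} + \epsilon_{ij} \\
\text{Level 2:} \quad & a_j = \gamma_{00} + \gamma_{01} W_j + \mu_{0j} \\
& b_j = \gamma_{10} + \gamma_{11} W_j + \mu_{1j}
\end{aligned}
$$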



The lower-case letters, a and b, are the first-level ordinary least squares (OLS) parameters; epsilon is the first-level error term; Y is the dependent variable; and Z is the independent variable. At the second level, the gammas are the second-level parameters, W is the second-level dummy variable (or variables), and the mus are the second-level error terms. Notice that the first-level parameters, a and b, are modeled as functions of the higher-level system effects.

Just to keep things simple, I'll assume that the higher-level systems only affect the intercept terms in the model.
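In that case the slope keeps no second-level terms ($b_j = b$); only the intercept equation does, and substituting it into the first level gives roughly:

$$
\begin{aligned}
a_j &= \gamma_{00} + \gamma_{01} W_j + \mu_{0j} \\
Y_{ij} &= \gamma_{00} + \gamma_{01} W_j + b\,Z_{ij} + \mu_{0j} + \epsilon_{ij}
\end{aligned}
$$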


After substitution, it's easy to see that we have added a separate error term, mu. If that added error is large, it can obscure otherwise significant effects.

Another way to see this is to look at the distributional assumptions.
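Roughly, the OLS and weighted least squares assumptions are, respectively:

$$
\epsilon \sim N\left(0,\ \sigma^2 I_n\right) \qquad \text{versus} \qquad \epsilon \sim N\left(0,\ \sigma^2 \Sigma\right)
$$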


For OLS, the assumption is that there is a single variance for all the normally distributed observations (the first expression should be read: the error terms are normally distributed with mean zero and the same variance for every observation, where n is the number of observations and I is an identity matrix). For the weighted least squares model, there is an added matrix of weights, upper-case sigma, which accounts for the different variances and covariances among the individual error terms. Variability in the error variances is called heteroscedasticity and can be tested with Levene's test.

Basically, HLMs handle heteroscedasticity by estimating the error components and then dividing out the error with some form of Generalized Least Squares (GLS). However, if the variances are large compared to the estimated parameters (think t-statistics), significant effects can be obscured.
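As a concrete sketch (the data frame and variable names here are hypothetical, and the estimation details depend on the package), the random-intercept model above might be fit in R with Doug Bates's lme4 package, with Levene's test used to check for heteroscedasticity across schools:

```r
# Sketch only: 'pupils' is a hypothetical data frame with columns
# score (Y), ses (Z), and school (a grouping factor).
library(lme4)  # mixed-model / HLM estimation
library(car)   # provides leveneTest()

# Pooled OLS that ignores the hierarchy
ols <- lm(score ~ ses, data = pupils)

# Levene's test: do the OLS residual variances differ across schools?
leveneTest(residuals(ols) ~ pupils$school)

# Random-intercept HLM: each school gets its own intercept (the mu term)
hlm <- lmer(score ~ ses + (1 | school), data = pupils)
summary(hlm)   # fixed effects (the gammas) and the variance components
```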

In future posts, I'll go through some of the issues raised by HLMs in much more detail. Here's a partial list of the issues as I currently understand them:
  • Are HLMs enhancing theory or are they just methodologies for partitioning variance?
  • What is the sampling model underlying HLMs and what does it imply for statistical conclusions? Are we sampling at each level in the hierarchy or just the lowest level? If only the latter, what are the consequences of sampling only at the lowest level?
  • What are the implications of analyses derived from multi-level research? For example, does hierarchical analysis imply, in the example above, that students could be transferred to different classes, different schools, and different districts and, as a result, have improved academic performance? What if the new school, for example, is not in the sample? How is it to be coded as a new dummy variable in the model?
  • What is the meaning of aggregated variables? For example, can we aggregate student intelligence scores to the school level and move the intelligence variable to a higher level? How do we interpret such aggregation?
  • Does statistical power analysis at the individual level cover HLMs or is some other approach needed (power analysis determines how large a sample is necessary to observe a significant result)?
  • There are many methods by which hierarchical analysis can be conducted, from simple to very complex. How well do these various approaches compare to each other and to a naive pooled analysis (see the simulation sketch after this list)?
  • How powerful are the tests for heteroscedasticity, that is, can these tests be relied on to tell us when HLMs are appropriate?
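To make the comparison with a naive pooled analysis concrete, here is a toy simulation sketch in R (the parameter values are arbitrary and the design is deliberately simple):

```r
# Toy simulation: nested data with a school-level error component.
set.seed(1)
library(lme4)

J <- 30; n <- 25                          # 30 schools, 25 pupils per school
school <- factor(rep(1:J, each = n))
u <- rnorm(J, sd = 2)                     # school-level errors (the mu's)
z <- rnorm(J * n)                         # pupil-level predictor
y <- 10 + 0.5 * z + u[as.integer(school)] + rnorm(J * n)

pooled <- lm(y ~ z)                       # naive pooled (atomistic) analysis
nested <- lmer(y ~ z + (1 | school))      # random-intercept HLM

summary(pooled)  # the school-level variance is folded into the residual
summary(nested)  # the variance is split into school and pupil components
```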

In future posts I'll go through a presentation by Doug Bates, an expert on HLMs at UW-Madison, from a workshop he gave at the University of Lausanne in 2009 (here). It's an excellent presentation, and it gives me the opportunity to explore the questions raised above.

Next summer, I might have an opportunity to teach HLMs at the University of Tennessee in Knoxville. A tentative description of the course is listed below. Blog postings will basically allow me to serialize my lecture notes.

STATISTICAL ANALYSIS OF HIERARCHICAL LINEAR MODELS

Hierarchical linear models (HLMs) allow researchers to locate units of analysis within hierarchical systems, for example, students within school systems, patients within treatment facilities, firms within industries, states within federal governments, countries within regions, regions within the world system, etc. In fact, it would be rare to find a unit of analysis that was not situated within some higher-order system. This does not mean that HLMs can be applied indiscriminately. If higher-order systems are not contributing significant variation to your unit of analysis, HLMs can obscure otherwise significant effects. This course will take a critical look at HLMs using computer intensive techniques for evaluating alternative estimators. The course will cover both parametric and nonparametric estimators, including the classes of OLS, WLS, and GLS estimators, tests for homogeneity of variance and normality, statistical power analysis, the EM algorithm, classes of ML estimators, the bootstrap, and rank transformation (RT) models. As a course project, students will either do a comparative methodological study or analyze an existing hierarchical data set. The R statistical package will be used for in-class demonstrations and method studies. For data analysis, students can work either in R or in other available packages with HLM capabilities (SAS, SPSS, HLM, Stata, etc.). A course in regression analysis and basic computer literacy are prerequisites. Bring your laptop computer to class and, if possible, have R installed on your machine. R is free and can be obtained from http://www.r-project.org/.
