Wednesday, March 19, 2008

Zillow Home Value Index Compared to OFHEO and Case-Shiller Indexes

(Zillow Blog) There was an interesting article in the Wall Street Journal a few weeks ago by David Wessel discussing differences between the housing index produced by the Office of Federal Housing Enterprise Oversight (OFHEO) and Standard & Poor’s Case-Shiller index. Since Zillow recently released our Q4 2007 Home Value Reports, I thought I’d extend Mr. Wessel’s analysis with a comparison of the Zillow Home Value Index (Zindex) to both the OFHEO and Case-Shiller numbers.

First, some background for the uninitiated. Both OFHEO and Case-Shiller use a weighted repeat sales methodology originally conceived in the 1960s and subsequently elaborated upon by Professors Karl Case (currently at Wellesley) and Robert Shiller (currently at Yale). This methodology was a significant improvement over the more conventional median sale price because it looks at the price change between repeat sales of the same home rather than just the median sale price of homes sold in a given period of time. As noted by Wessel, median sale price is heavily influenced by the type of homes that are selling at a given time, making it a less than ideal measure of home price levels (we’ll address this fact in more depth in a subsequent post).
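To make the mix problem concrete, here is a minimal Python sketch. All home values are made up for illustration, and the repeat-sales figure is a simple unweighted average of price ratios, not the weighted regression that OFHEO and Case-Shiller actually use. No home changes in value between the two periods; only the mix of homes that sell changes.

```python
from statistics import median

# Hypothetical sale prices per period. Every home is worth the same
# in both periods, so true appreciation is zero.
value_p1 = {"a": 200_000, "b": 210_000, "c": 220_000, "d": 500_000, "e": 520_000}
value_p2 = dict(value_p1)  # unchanged values in period 2

sold_p1 = ["a", "b", "c", "d"]  # mostly lower-priced homes sell
sold_p2 = ["b", "c", "d", "e"]  # mix shifts toward higher-priced homes

# Median sale price approach: compares different sets of homes.
median_p1 = median(value_p1[h] for h in sold_p1)
median_p2 = median(value_p2[h] for h in sold_p2)
median_change = median_p2 / median_p1 - 1

# Simplified repeat-sales approach: compares each home only with itself.
repeat_homes = sorted(set(sold_p1) & set(sold_p2))
ratios = [value_p2[h] / value_p1[h] for h in repeat_homes]
repeat_change = sum(ratios) / len(ratios) - 1

print(f"median sale price change: {median_change:.1%}")
print(f"repeat-sales change:      {repeat_change:.1%}")
```

The median sale price jumps sharply even though no individual home gained value, while the repeat-sales comparison correctly shows no change.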


Of course, a repeat sales methodology has its own important caveats. Most importantly, the index is based only on the sample of homes that have sold at least twice, which excludes all new construction (which can account for more than 10% of real estate transactions). The chief difference between OFHEO and Case-Shiller is that OFHEO numbers are based only on home sales with conforming mortgages (loans less than $417,000), which eliminates a fair percentage of real estate transactions. Case-Shiller looks at all home sales, regardless of the mortgage amount.

The Zillow Home Value Index takes a different approach to constructing its market index. We generate valuations several times a week on more than 67 million homes, or roughly three out of four homes in the U.S., and calculate historical values dating back to 1997 (thus creating over 13 billion Zestimates). This complicated process allows us to aggregate these house-level valuations into indexes (what we call the Zindex) at the neighborhood, ZIP code, city, county, metro area, and national levels. This Zindex eliminates the bias present in median sale prices by looking at the value of all homes in a region, not just those homes that sold. The statistical models underlying the Zestimates control for the mix of housing for sale by finding patterns in the types of homes that are selling (no matter how unrepresentative of the overall set of homes) and then applying these patterns to all homes. For example, if only a few homes of a certain type sell in a given period, the models can extract the information from those sales and apply it to all homes of that type.
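As a rough illustration of the aggregation step, the Python sketch below rolls house-level valuations up to two geographic levels. The records are invented, and the choice of the median as the aggregate is an assumption made for this example rather than a description of the production Zindex calculation.

```python
from collections import defaultdict
from statistics import median

# Hypothetical house-level valuations: (zip_code, city, estimated_value).
valuations = [
    ("98101", "Seattle", 450_000),
    ("98101", "Seattle", 520_000),
    ("98101", "Seattle", 610_000),
    ("98103", "Seattle", 480_000),
    ("98103", "Seattle", 500_000),
]

def index_by(level):
    """Aggregate every valued home (sold or not) at the given level.

    `level` is "zip" or "city"; the median is an assumed aggregate.
    """
    groups = defaultdict(list)
    for zip_code, city, value in valuations:
        key = zip_code if level == "zip" else city
        groups[key].append(value)
    return {key: median(values) for key, values in groups.items()}

zip_index = index_by("zip")
city_index = index_by("city")
print(zip_index)
print(city_index)
```

Because every home contributes a valuation, the aggregate does not depend on which homes happened to sell in the period.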

An important property of Zestimates that allows us to aggregate them into a very accurate and reliable Zindex is that they have relatively little systematic error. While each Zestimate has some margin of error, it is just as likely to be above the actual sale price of a home as below. This means that individual estimates, each with some error, can be aggregated to form a quite accurate measure of all homes. What little systematic error does creep into the Zestimates is removed from all historical data series when we calculate the Zindex for the quarterly reports.
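The effect of aggregating unbiased estimates can be seen in a small simulation. This is a sketch under assumed conditions: the true values, the 10% error spread, and the use of a simple mean are all illustrative choices, not Zillow's actual error distribution or aggregation method.

```python
import random
from statistics import mean

rng = random.Random(0)  # fixed seed so the run is repeatable

# Hypothetical true home values.
true_values = [rng.uniform(100_000, 900_000) for _ in range(10_000)]

# Each estimate has a zero-mean error (no systematic bias), with an
# assumed standard deviation of 10% of the home's value.
estimates = [v * (1 + rng.gauss(0, 0.10)) for v in true_values]

# Typical error of a single estimate, as a fraction of true value.
typical_error = mean(abs(e - v) / v for e, v in zip(estimates, true_values))

# Error of the aggregate, as a fraction of the true aggregate.
aggregate_error = abs(mean(estimates) - mean(true_values)) / mean(true_values)

print(f"typical individual error: {typical_error:.2%}")
print(f"aggregate error:          {aggregate_error:.2%}")
```

Individual errors of several percent largely cancel in the aggregate because overestimates and underestimates are equally likely. A systematic bias would not cancel this way, which is why it has to be removed separately.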

So, how does the Zindex compare to the two most common flavors of weighted repeat sales indexes? The table above compares year-over-year changes in market values for OFHEO, Case-Shiller and Zillow for selected markets between the third quarter of 2006 and the third quarter of 2007 (the same markets and periods compared by Wessel in his article). Also added to this table are the Pearson correlation coefficients between the three measures (an indicator of how similar the various measures are to one another). Zillow and Case-Shiller are fairly similar to each other, with a correlation of 95% and a median absolute error of 1.5%. OFHEO, on the other hand, is about equally dissimilar to both Zillow and Case-Shiller, with a correlation of 50% with each (median absolute error of 5.3% when compared with Zillow). The large difference between OFHEO and both Zillow and Case-Shiller is attributable to both data and methodological differences (e.g., OFHEO using only conforming mortgages, different filtering of data, different weights used in the weighted repeat sales methodology, etc.). In short, the Zindex and Case-Shiller are pretty similar indexes overall, and both are somewhat different from OFHEO.
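For readers who want to run the same kind of comparison, here is a minimal Python sketch of the two similarity measures. The year-over-year figures are placeholders invented for the example, not the values from the table, and "median absolute error" is interpreted here as the median of the absolute differences between two indexes across markets.

```python
import math
from statistics import mean, median

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def median_abs_diff(xs, ys):
    """Median of the absolute market-by-market differences."""
    return median(abs(x - y) for x, y in zip(xs, ys))

# Placeholder year-over-year changes (percent) for five markets.
index_a = [-9.0, -6.5, -2.0, 1.5, 4.0]
index_b = [-8.0, -7.0, -3.5, 2.0, 5.5]

r = pearson(index_a, index_b)
mad = median_abs_diff(index_a, index_b)
print(f"correlation: {r:.2f}, median absolute difference: {mad:.1f} points")
```

The two measures answer different questions: correlation shows whether the indexes rank and move markets the same way, while the median absolute difference shows how far apart the reported numbers typically are.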

One important difference between the Zindex and the Case-Shiller index is their respective coverage areas. The Zindex is reported for 125 metropolitan areas, whereas Case-Shiller reports on 20 higher-level metropolitan areas (although the national Case-Shiller index uses data from about 100 metropolitan areas).

Another advantage of the Zindex relative to both OFHEO and Case-Shiller is reporting latency. Case-Shiller releases reports on a monthly basis with a two-month data lag (e.g., data through January 2008 comes out in March 2008), and OFHEO releases data on a quarterly basis, usually about two months after the end of the quarter. Both Case-Shiller and OFHEO released their Q4 2007 reports on February 26th. Zestimates and Zindexes are published multiple times per week. Our full Q4 Zillow Home Value reports were released on February 12th, and our Q1 2008 reports are due to be released in May. Be sure to check back then. In the meantime, you can always check out the Zindex for your ZIP code, neighborhood, city, state or the US by clicking on the “Zestimate and Charts” link on any home details page.

