Title: A Critical Examination of Hedonic Analysis of a Regression Model (HARM) and META-ANALYSIS
2 A Critical Examination of Hedonic Analysis of a Regression Model (HARM) and META-ANALYSIS
Albert R. Wilson, BSSE, MBA, CRE (Ret)
3 Regression Model
- A model
- intended to allow an exploration
- of the hypothetical relationship
- between possible explanatory variables
- and the sales price
4 Regression Model
- Reflection of reality
- The touchstone of that reality? Actual market
participants
5 Estimated versus Predicted
- Estimated: Sale IN database
- Predicted: Sale NOT IN database
6 Predicted Sales Prices
- At the mean
- predicted sales price variance
- is larger than estimated variance
- by s^2 (variance in the data)
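As a rough sketch of the point above, assume an ordinary least squares model with an intercept, evaluated at the mean of the explanatory variables, where the variance of the estimated price reduces to s^2 / n. The function name and the sample inputs are illustrative assumptions, not figures from the presentation.

```python
def variances_at_mean(s2: float, n: int) -> tuple[float, float]:
    """Return (estimated, predicted) variance at the mean of the data.

    Assumes OLS with an intercept, evaluated at the regressor means,
    where the variance of the estimated price is s2 / n.
    """
    estimated = s2 / n           # sale IN the database (fitted value)
    predicted = estimated + s2   # sale NOT in the database adds s2
    return estimated, predicted


# Hypothetical inputs: data variance of 400 and 25 sales in the database.
est, pred = variances_at_mean(s2=400.0, n=25)
print(est, pred)  # 16.0 416.0
```

Under these assumptions the predicted variance exceeds the estimated variance by exactly s^2, however many sales the database holds.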
7 Mean Confidence Intervals (MCI): Estimated and Predicted
MCI FOR PREDICTED: 4.38 TIMES MCI FOR ESTIMATED
8
- DATABASE EDITING
- GARBAGE IN > GARBAGE OUT (GIGO)
9 Case Example: Influence of the Removal of Flipping Transactions on the Predicted Prices for 33 Properties
                  PREDICTED SALES PRICES
PROPERTY NO.      AS PRESENTED    FLIPS REMOVED   CHANGE
SUM               5,069,239       4,018,112       (1,051,127)
n                 391             379             -12
Adj. R-squared    0.7684          0.7593          -0.0091
10 Editing and Confirmation of Data
- STEP 1
- Edit to identify obvious issues (the desk edit)
- Case Example
- Assessor's Data: 4,325; Removed: 747 (17.3%)
- R-Squared 0.79 0.83
- MLS Data: 1,888; Removed: 779 (44.3%)
11 Editing and Confirmation of Data
- STEP 2
- Identify sales that are not appropriate to the
analysis
12 Editing and Confirmation of Data
- STEP 3
- Sales confirmation
- A values-neutral interview of sale participants
- OBJECT: to elicit the primary factors motivating the conclusion of the sale price
- MUST NOT INTRODUCE ANALYST OPINION
- THIS IS THE ONLY MEANS OF IDENTIFYING/CONFIRMING THE REASONS FOR A CONCLUDED PRICE
13 Regression Model Considerations
- Faithfully represent
- Identified concerns of actual market participants
- Restrictions imposed by the data
- Estimates of prices
- the ONLY VERIFIABLE OUTPUT
14 Coefficient Calculation
- Result of iterative calculations
- designed to provide the
- most accurate estimates of sales prices
- in database
15 Coefficient Calculation
- Goodness of Fit
- Measures of the Goodness of Fit apply only to the relationship between the estimated and actual sales prices in the database
- They do not apply to the coefficients
16 Most commonly-cited Goodness-of-Fit Measure
- R-Squared
- (Coefficient of Determination)
17 R-Squared
- Generally-applied interpretation
- R-Squared is the amount of variance explained
by the model
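A minimal sketch of how this measure is usually computed, using the standard definition of one minus the ratio of the residual sum of squares to the total sum of squares. The prices below are hypothetical, in thousands, and the function name is an assumption for illustration only.

```python
def r_squared(actual: list[float], estimated: list[float]) -> float:
    """Coefficient of determination for estimated vs. actual sale prices."""
    mean_actual = sum(actual) / len(actual)
    # Total variation of the actual prices around their mean.
    ss_total = sum((a - mean_actual) ** 2 for a in actual)
    # Variation left unexplained by the model's estimates.
    ss_resid = sum((a - e) ** 2 for a, e in zip(actual, estimated))
    return 1.0 - ss_resid / ss_total


# Hypothetical actual and estimated sale prices for four sales in a database.
actual = [200.0, 210.0, 190.0, 220.0]
estimated = [202.0, 208.0, 193.0, 217.0]
r2 = r_squared(actual, estimated)
print(round(r2, 3))  # 0.948
```

The calculation uses only the actual and estimated prices in the database; no coefficient enters into it.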
18 Low R-Squared Models
- Mathematically, as the R-Squared approaches 0.30, it becomes
- more likely
- that the model is only measuring random effects
19 The Omitted and Additional Variable Problem
- Omitting a variable generally increases the magnitude and statistical significance of the remaining coefficients
- Adding a variable generally decreases the magnitude and statistical significance of the remaining variable coefficients
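The following is one constructed case, not the presenter's data, sketching how omitting a variable can inflate a remaining coefficient. It assumes a small hypothetical database in which pools are more common on larger properties, and it fits the models with a hand-written least squares solver (the helper `ols` is an assumption for illustration). It shows only the effect on magnitude, not on statistical significance.

```python
def ols(columns, y):
    """Fit y on an intercept plus the given columns via the normal equations."""
    n = len(y)
    rows = [[1.0] + [col[i] for col in columns] for i in range(n)]
    k = len(rows[0])
    # Augmented normal equations [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for i in range(k):
        pivot = max(range(i, k), key=lambda r: abs(a[r][i]))
        a[i], a[pivot] = a[pivot], a[i]
        for r in range(k):
            if r != i:
                factor = a[r][i] / a[i][i]
                a[r] = [x - factor * p for x, p in zip(a[r], a[i])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Hypothetical data: size in thousands of square feet, pool as a 0/1 flag,
# price in thousands. Pools appear mostly on the larger properties.
size = [1.5, 1.8, 2.0, 2.2, 2.5, 2.8]
pool = [0, 0, 1, 0, 1, 1]
price = [50 + 40 * s + 9 * p for s, p in zip(size, pool)]

full = ols([size, pool], price)   # intercept, size, pool
short = ols([size], price)        # pool omitted

print("size coefficient with pool:   ", round(full[1], 2))
print("size coefficient without pool:", round(short[1], 2))
```

With the pool variable omitted, the size coefficient absorbs part of the pool effect because the two variables are correlated in this constructed data.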
20 Illustration of Omitting or Adding a Variable
              Base Model          Added Variable: APN              Omitted Variable: Pool
Variable      Coeff.     t-stat   Coeff.     t-stat   Change (%)   Coeff.     t-stat   Change (%)
Intercept     67,370     17.52    -663,632   -8.14    -1085.06     66,293     17.14    -1.60
APN                               .023       8.98
Fixtures      2,653      5.39     2,511      5.15     -5.35        2,886      5.84     8.74
NoPatio       (12,801)   -7.77    (5,036)    -2.73    -60.66       (13,451)   -8.13    5.08
SqFt          40.79      29.23    42.80      30.61    4.93         41.59      29.72    1.96
Pool          8,366      6.77     8,908      7.28     6.48
Garage        19,382     12.90    20,153     13.54    3.98         19,980     13.24    3.09
Middle Ring   (16,141)   -11.24   (11,230)   -7.38    -30.43       (15,276)   -10.61   -5.36
Inner Ring    (8,875)    -4.52    (7,114)    -3.64    -19.84       (8,012)    -4.06    -9.72
2000          207        0.08     1,787      -0.67    763.29       271        0.10     30.92
2001          (2,017)    -0.76    665        0.258    -132.97      (2,028)    -0.76    0.55
2002          (719)      -0.25    3,976      1.36     -652.99      (615)      -0.21    -14.46
2003          7,213      2.67     7,647      2.86     6.02         7,258      2.71     0.62
2004          41,149     15.50    40,380     15.37    -1.87        40,901     15.31    -0.60
2005          132,077    51.04    130,662    50.93    -1.07        131,129    50.43    -0.72
2006          160,367    45.29    159,842    45.63    -0.33        159,897    44.89    -0.29
R-Squared     0.83                0.83                             0.83
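The Change columns in the table are consistent with a simple percent change of each coefficient relative to the base model. The sketch below reproduces three of the entries under that assumption; the function name is illustrative.

```python
def pct_change(base: float, new: float) -> float:
    """Percent change of a coefficient relative to its base-model value."""
    return (new - base) / base * 100.0


# (base model coefficient, coefficient after adding APN), from the table.
added_apn = {
    "Intercept": (67_370, -663_632),
    "Fixtures": (2_653, 2_511),
    "SqFt": (40.79, 42.80),
}

changes = {name: round(pct_change(b, n), 2) for name, (b, n) in added_apn.items()}
print(changes)
```

Note that adding a variable with no plausible market meaning (the parcel number) moves the intercept by more than 1,000 percent while R-Squared stays at 0.83.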
21 Consequences of Variable Selection
- Including the Assessor's Parcel Number (APN)
- APN Coefficient Value 0.023
- t-statistic 8.98
- Mean Value 30,834,360
- R-Squared 0.83
- Mean Sale Price 211,000
- Results in an incremental increase in the sales price of
- 0.023 x 30,834,360 = 709,190
- (APN Coef.) x (Mean Value) = (Incremental Increase)
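The arithmetic on this slide can be checked directly. The sketch below uses only the figures stated above; the variable names are illustrative.

```python
apn_coefficient = 0.023
apn_mean_value = 30_834_360
mean_sale_price = 211_000

# Contribution of the APN term at the mean parcel number.
increment = apn_coefficient * apn_mean_value
# How that contribution compares with the mean sale price.
ratio = increment / mean_sale_price

print(round(increment))  # 709190
print(round(ratio, 2))
```

Taken literally as a marginal contribution to value, the parcel number alone would account for more than three times the mean sale price, even though its coefficient is statistically significant.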
22 Consequences of Variable Selection
- Omission of a Variable
- Removal of Pool (present in 38% of properties)
- SQFT Coefficient changed from 40.79 to 41.59
- Approximately the same t-statistic
- Removal of Fixtures (present in 100% of properties)
- SQFT Coefficient changed from 40.79 to 46.50
- t-statistic 50.94
23 Coefficients
- Coefficients are simply
- multipliers for the explanatory variable
24 Causation in Real Estate
- From the Real Estate Appraiser's perspective
- Causation is demonstrated through sales confirmation interviews.
- Causation is NEVER proven through a regression.
25 Strengths and Weaknesses
- Can never be better than the data
- Requires significant amount of data: five to 15 or more sales
- Upper limit to the amount of data: too much may be worse than too little
- Guide: Are the sales competitive to the subject?
- Estimate of sales prices most accurate at the mean value of the data
- Variance of a predicted sales price larger than variance of estimated
- Thousands of possible regression models
26 Further Considerations
- Absent standards, the Rubber Ruler may apply
- When recognized and published standards are not
used, author must demonstrate the accuracy and
reliability of his/her work
27 Hedonic Analysis
28 The Hedonic Assumption
- The coefficient accurately and only represents the contribution of the declared meaning of the explanatory variable to the sale price
29 Hedonic Analysis
- The validity of the hedonic assumption must be
demonstrated
30 Revealed Preference
- Idea cannot be supported for real estate
31 Supporting Literature
- Not a single paper demonstrated the validity
- of the hedonic assumption
- PLUS
- NO indication of confirmation of raw data
- NO indication of adherence to any recognized / published standards
- NO indication of confirmation of results with the normal or typical market participant
- THE RUBBER RULER EFFECT IS MUCH IN EVIDENCE.
32 Regression Model Accuracy
- If the regression model is inaccurate,
- then there is no reason
- to expect the coefficients to be
- accurate or meaningful.
- Therefore the HARM cannot be accurate.
33 CASE EXAMPLE: TO POOL OR NOT TO POOL
- Using the data from the previous case.
- Does a pool influence value?
- By how much?
- Under the hedonic approach, the coefficient is the marginal contribution to value.
34
                COMBINED POOL AND NO POOLS                 COMBINED POOL AND NO POOLS, POOL COEFFICIENT SET TO ZERO
Variable        COEFFICIENT  MEAN VALUES  EXPECTED VALUES  COEFFICIENT  MEAN VALUES  EXPECTED VALUES
Intercept       54,089.83    1            54,090           54,089.83    1            54,090
ORIG_FIXTURES   2,805.33     8.73         24,491           2,805.33     8.73         24,491
ORIG_NOPATIO    -14,116.47   0.34         -4,800           -14,116.47   0.34         -4,800
ORIG_POOL       9,161.98     0.38         3,482            9,161.98     0            0
ORIG_SQF        41.52        2283.62      94,815           41.52        2283.62      94,815
ORIG_X_3GARAGE  16,212.83    0.4          6,485            16,212.83    0.4          6,485
SY2000          5,980.33     1            5,980            5,980.33     1            5,980
EXPECTED MEAN SALE PRICE                  184,543                                    181,061
Adj R2          0.8816                                     0.8816
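Each expected value in the table is the coefficient multiplied by the mean value of its variable, and the expected mean sale price is the sum of those products. The sketch below repeats that calculation from the tabulated coefficients and means; the function name is an assumption, and the totals differ from the slide by about a dollar because the slide sums rounded expected values.

```python
coefficients = {
    "Intercept": 54_089.83,
    "ORIG_FIXTURES": 2_805.33,
    "ORIG_NOPATIO": -14_116.47,
    "ORIG_POOL": 9_161.98,
    "ORIG_SQF": 41.52,
    "ORIG_X_3GARAGE": 16_212.83,
    "SY2000": 5_980.33,
}
mean_values = {
    "Intercept": 1,
    "ORIG_FIXTURES": 8.73,
    "ORIG_NOPATIO": 0.34,
    "ORIG_POOL": 0.38,
    "ORIG_SQF": 2283.62,
    "ORIG_X_3GARAGE": 0.4,
    "SY2000": 1,
}


def expected_price(coefs: dict, means: dict) -> float:
    """Sum of coefficient x mean value over all variables."""
    return sum(coefs[name] * means[name] for name in coefs)


combined = expected_price(coefficients, mean_values)
# Second panel of the table: the pool term contributes nothing.
no_pool_term = expected_price(coefficients, {**mean_values, "ORIG_POOL": 0})

print(round(combined), round(no_pool_term), round(combined - no_pool_term))
```

Under the hedonic reading, the gap between the two totals (about 3,482 at a mean pool value of 0.38) is attributed to pools.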
35 TO POOL OR NOT TO POOL (CONT.)
- What are the coefficients if there is no pool?
36
                COMBINED WITH NO POOL VARIABLE
Variable        COEFFICIENT   MEAN VALUES  EXPECTED VALUES
Intercept       52,788.1063   1            52,788
ORIG_FIXTURES   3,087.8801    8.73         26,957
ORIG_NOPATIO    -14,724.7843  0.34         -5,006
ORIG_SQF        42.3986       2283.62      96,822
ORIG_X_3GARAGE  16,924.691    0.4          6,770
SY2000          5,727.7462    1            5,728
EXPECTED MEAN SALE PRICE                   184,059
Adj R2          0.8790
37 Comparison
Variable        With pool variable   No pool variable
Orig-fixt       2,805                3,088
Orig-nopatio    -14,116              -14,725
Orig-pool       9,162                NA
Orig-sqf        41.52                42.40
Orig-garage     16,213               16,925
SY2000          5,980                5,728
ESP             184,543              184,059
R-sq            0.88                 0.88
38 TO POOL OR NOT TO POOL (CONT.)
- WHAT HAPPENS IF WE CONSIDER A DATABASE WITH
POOLS, AND SEPARATELY A DATABASE WITHOUT POOLS?
39
                WITH POOL ON PROPERTY                      WITHOUT POOL ON PROPERTY
Variable        COEFFICIENT  MEAN VALUES  EXPECTED VALUES  COEFFICIENT  MEAN VALUES  EXPECTED VALUES
Intercept       65,957.89    1.00         65,958           54,993.78    1.00         54,994
ORIG_FIXTURES   2,505.59     9.65         24,179           2,784.14     8.16         22,719
ORIG_NOPATIO    -15,415.46   0.22         -3,391           -14,838.47   0.41         -6,084
ORIG_POOL
ORIG_SQF        41.63        2,586.79     107,690          41.46        2,097.20     86,956
ORIG_X_3GARAGE  15,768.93    0.40         6,308            16,308.32    0.31         5,056
SY2000          4,211.37     1.00         4,211            7,209.87     1.00         7,210
EXPECTED MEAN SALE PRICE                  204,954                                    170,850
Adj R2          0.8711                                     0.8895
40 POOLS AND NO POOLS SEPARATELY
- ESTIMATED SALE PRICE WITH POOL 204,954
- R-SQUARED 0.87
- ESTIMATED SALE PRICE W/O POOL 170,850
- R-SQUARED 0.89
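The presentation does not break down the gap between the two separately estimated prices. As a rough sketch, the expected values tabulated for the with-pool and without-pool databases can be differenced term by term. This is simple arithmetic on the tabulated figures; the variable names are illustrative.

```python
# Expected values (coefficient x mean value) from the two separate databases.
with_pool = {
    "Intercept": 65_958, "ORIG_FIXTURES": 24_179, "ORIG_NOPATIO": -3_391,
    "ORIG_SQF": 107_690, "ORIG_X_3GARAGE": 6_308, "SY2000": 4_211,
}
without_pool = {
    "Intercept": 54_994, "ORIG_FIXTURES": 22_719, "ORIG_NOPATIO": -6_084,
    "ORIG_SQF": 86_956, "ORIG_X_3GARAGE": 5_056, "SY2000": 7_210,
}

# Term-by-term difference between the with-pool and without-pool models.
differences = {name: with_pool[name] - without_pool[name] for name in with_pool}
total_gap = sum(differences.values())

for name, diff in differences.items():
    print(f"{name:<16}{diff:>8,}")
print(f"{'TOTAL':<16}{total_gap:>8,}")
```

On these figures the total gap is 34,104, against a pool coefficient of 9,162 in the combined model, and the largest single term is square footage: the properties with pools are larger on average (2,586.79 versus 2,097.20 square feet).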
41 The Coefficient: What Counts?
- ALL THAT STATISTICAL SIGNIFICANCE CAN TELL US IS THAT
- FOR THIS MODEL AND DATABASE
- THE COEFFICIENT IS A SIGNIFICANT
- (OR INSIGNIFICANT)
- MULTIPLIER FOR THE EXPLANATORY VARIABLE. NOTHING MORE.
42 The Appropriate Standard: Economic Significance
- For us, economic significance
- is determined by
- what the normal or typical participant
- considers important to the
- conclusion of the transaction.
43 A Criticality
- NOT ONE hedonic analysis encountered to date has actually asked this question:
- What was important to you in concluding your transaction?
44 Hedonic Analysis of a Regression Model (HARM) is
- A highly inaccurate and unreliable method
- Not appropriate for appraisal work
- Observations apply to hedonic analysis
- NOT
- regression models!