
The Akaike information criterion (AIC) is a measure of the relative quality of a statistical model for a given set of data. As such, AIC provides a means for model selection. AIC is founded on information entropy: it offers a relative estimate of the information lost when a given model is used to represent the process that actually generates the data. AIC deals with the trade-off between the complexity of the model and the goodness of fit of the model.

AIC does not provide a test of a model in the sense of testing a null hypothesis; i.e. AIC can tell nothing about how well a model fits the data in an absolute sense. If all the candidate models fit poorly, AIC will not give any warning of that.

Definition

In the general case, the AIC is

    AIC = 2k - 2 ln(L),

where k is the number of parameters in the statistical model, and L is the maximized value of the likelihood function for the estimated model.

Given a set of candidate models for the data, the preferred model is the one with the minimum AIC value. Hence AIC not only rewards goodness of fit, but also includes a penalty that is an increasing function of the number of estimated parameters. This penalty discourages overfitting (increasing the number of free parameters in the model improves the goodness of the fit, regardless of the number of free parameters in the data-generating process).

AIC is founded in information theory. Suppose that the data is generated by some unknown process f. We consider two candidate models to represent f: g1 and g2. If we knew f, then we could find the information lost from using g1 to represent f by calculating the Kullback–Leibler divergence, D_KL(f ‖ g1); similarly, the information lost from using g2 to represent f would be found by calculating D_KL(f ‖ g2). We would then choose the candidate model that minimized the information loss. We cannot choose with certainty, because we do not know f. Akaike (1974) showed, however, that we can estimate, via AIC, how much more (or less) information is lost by g1 than by g2. It is remarkable that such a simple formula for AIC results. The estimate, though, is only valid asymptotically; if the number of data points is small, then some correction is often necessary (see AICc, below).
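To make the definition concrete, here is a minimal sketch (not from the article) that fits two candidate polynomial models by maximum likelihood under a Gaussian error assumption and compares their AIC values; the data, model degrees, and helper function are illustrative assumptions, not a prescribed recipe.

```python
import numpy as np

def gaussian_aic(y, y_hat, n_coef):
    """AIC = 2k - 2 ln(L) for a least-squares fit with Gaussian errors.

    k counts the regression coefficients plus the error variance, which is
    also estimated by maximum likelihood (sigma2_hat = RSS / n).
    """
    n = len(y)
    rss = np.sum((y - y_hat) ** 2)
    sigma2 = rss / n                          # MLE of the error variance
    log_lik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    k = n_coef + 1                            # coefficients + variance
    return 2 * k - 2 * log_lik

rng = np.random.default_rng(0)
x = np.linspace(0, 1, 50)
y = 1.0 + 2.0 * x + rng.normal(scale=0.3, size=x.size)   # illustrative data

candidates = {}
for degree in (1, 3):
    coefs = np.polyfit(x, y, degree)
    candidates[f"degree {degree}"] = gaussian_aic(y, np.polyval(coefs, x), degree + 1)

best = min(candidates, key=candidates.get)    # the minimum-AIC model is preferred
print(candidates, "-> preferred model:", best)
```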

How to apply AIC in practice

To apply AIC in practice, we start with a set of candidate models, and then find the models' corresponding AIC values. There will almost always be information lost due to using one of the candidate models to represent the "true" model (i.e. the process that generates the data). We wish to select, from among R candidate models, the model that minimizes the information loss. We cannot choose with certainty, but we can minimize the estimated information loss.

Denote the AIC values of the candidate models by AIC_1, AIC_2, AIC_3, ..., AIC_R. Let AIC_min be the minimum of those values. Then exp((AIC_min - AIC_i)/2) can be interpreted as the relative probability that the ith model minimizes the (estimated) information loss.[1]

As an example, suppose that there were three models in the candidate set, with AIC values 100, 102, and 110. Then the second model is exp((100 - 102)/2) = 0.368 times as probable as the first model to minimize the information loss, and the third model is exp((100 - 110)/2) = 0.007 times as probable as the first model to minimize the information loss. In this case, we would omit the third model from further consideration. We could take a weighted average of the first two models, with weights 1 and 0.368, respectively, and then do statistical inference based on the weighted multimodel;[2] alternatively, we could gather more data to distinguish between the first two models.

The quantity exp((AIC_min - AIC_i)/2) is the relative likelihood of model i.

If all the models in the candidate set have the same number of parameters, then using AIC might at first appear to be very similar to using the likelihood-ratio test. There are, however, important distinctions. In particular, the likelihood-ratio test is valid only for nested models, whereas AIC (and AICc) has no such restriction.[3]

AICc

AICc is AIC with a correction for finite sample sizes:

    AICc = AIC + 2k(k + 1)/(n - k - 1),

where n denotes the sample size. Thus, AICc is AIC with a greater penalty for extra parameters.

Burnham & Anderson (2002) strongly recommend using AICc, rather than AIC, if n is small or k is large. Since AICc converges to AIC as n gets large, AICc generally should be employed regardless.[4] Using AIC, instead of AICc, when n is not many times larger than k², increases the probability of selecting models that have too many parameters, i.e. of overfitting. The probability of AIC overfitting can be substantial, in some cases.[5]

Brockwell & Davis (1991, p. 273) advise using AICc as the primary criterion in selecting the orders of an ARMA model for time series. McQuarrie & Tsai (1998) ground their high opinion of AICc on extensive simulation work with regression and time series.

AICc was first proposed by Hurvich & Tsai (1989). Different derivations of it are given by Brockwell & Davis (1991), Burnham & Anderson, and Cavanaugh (1997). All the derivations assume a univariate linear model with normally distributed errors (conditional upon regressors); if that assumption does not hold, then the formula for AICc will usually change. Further discussion of this, with examples of other assumptions, is given by Burnham & Anderson (2002, ch. 7). In particular, bootstrap estimation is usually feasible.

Note that when all the models in the candidate set have the same k, then AICc and AIC will give identical (relative) valuations. In that situation, then, AIC can always be used.
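As an illustration of the recipe above (a sketch under the article's formulas, not part of the original text), the following computes the relative likelihood exp((AIC_min - AIC_i)/2) for the three example AIC values, and applies the finite-sample correction AICc; the sample size and parameter count passed to aicc are made-up illustrative numbers.

```python
import math

def relative_likelihoods(aic_values):
    """exp((AIC_min - AIC_i)/2): relative probability that model i
    minimizes the estimated information loss."""
    aic_min = min(aic_values)
    return [math.exp((aic_min - aic) / 2) for aic in aic_values]

def aicc(aic, n, k):
    """AICc = AIC + 2k(k+1)/(n - k - 1); requires n > k + 1."""
    return aic + 2 * k * (k + 1) / (n - k - 1)

print(relative_likelihoods([100, 102, 110]))   # ~[1.0, 0.368, 0.0067]
print(aicc(100, n=30, k=5))                    # correction grows as k approaches n
```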

Relevance to chi-squared fitting

Often, one wishes to select amongst competing models where the likelihood functions assume that the underlying errors are normally distributed (with mean zero) and independent. This assumption leads to χ² model fitting. For χ² fitting, the maximized log-likelihood satisfies

    -2 ln(L) = χ² + C,   with χ² = Σ_i (y_i - f(x_i))² / σ_i²,

where C is a constant independent of the model used, and dependent only on the use of particular data points; i.e. it does not change if the data do not change. The AIC is therefore given by

    AIC = 2k - 2 ln(L) = χ² + 2k + C.

As only differences in AIC are meaningful, the constant C can be ignored, allowing us to take AIC = χ² + 2k for model comparisons.

Another convenient form arises if the σ_i are assumed to be identical and the residual sum of squares (RSS) is available. Then we get AIC = n ln(RSS/n) + 2k + C, where again C can be ignored in model comparisons.[6]

History

The Akaike information criterion was developed by Hirotugu Akaike, under the name of "an information criterion". It was first published by Akaike in 1974.[7]

The original derivation of AIC relied upon some strong assumptions. Takeuchi (1976) showed that the assumptions could be made much weaker. This work, however, was in Japanese, and was not widely known outside Japan for many years.

AICc was originally proposed for linear regression (only) by Sugiura (1978). That instigated the work of Hurvich & Tsai (1989), and several further papers by the same authors, which extended the situations in which AICc could be applied. The work of Hurvich & Tsai contributed to the decision to publish a second edition of the volume by Brockwell & Davis (1991), which is the standard reference for linear time series; the new edition states, "our prime criterion for model selection [among ARMA(p,q) models] will be the AICc".[8]

The volume by Burnham & Anderson (2002) was the first attempt to set out the information-theoretic approach in a general context. It includes an English exposition of the results of Takeuchi. The volume led to far greater use of the information-theoretic approach, and now has over 17,000 citations on Google Scholar.
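For the common least-squares case just described, the RSS form can be applied directly. The sketch below is illustrative only (the data and design matrices are invented for the example): it compares two nested linear regressions using AIC = n ln(RSS/n) + 2k, dropping the model-independent constant C; the shared error-variance parameter adds the same amount to every model's AIC, so it is omitted from k here.

```python
import numpy as np

def aic_from_rss(rss, n, k):
    """AIC up to the model-independent constant C: n ln(RSS/n) + 2k."""
    return n * np.log(rss / n) + 2 * k

rng = np.random.default_rng(1)
n = 80
x1, x2 = rng.normal(size=(2, n))
y = 0.5 + 1.5 * x1 + rng.normal(scale=1.0, size=n)        # x2 is irrelevant

designs = {
    "intercept + x1":      np.column_stack([np.ones(n), x1]),
    "intercept + x1 + x2": np.column_stack([np.ones(n), x1, x2]),
}
for name, X in designs.items():
    beta, rss, *_ = np.linalg.lstsq(X, y, rcond=None)     # ordinary least squares
    print(name, aic_from_rss(rss.item(), n, k=X.shape[1]))
```

On typical draws, the extra parameter barely reduces the RSS, so the 2k penalty makes the smaller model the minimum-AIC choice.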

Akaike originally called his approach an "entropy maximization principle". Burnham & Anderson (2002, ch. 2) discuss and expand on this, and trace the approach back to the work of Ludwig Boltzmann on thermodynamics. Briefly, minimizing AIC in a statistical model is essentially equivalent to maximizing entropy in a thermodynamic system; in other words, the information-theoretic approach in statistics is essentially applying the Second Law of Thermodynamics.

Bayesian information criterion

The AIC penalizes the number of parameters less strongly than does the Bayesian information criterion (BIC), which was independently developed by Akaike and by Schwarz in 1978, using Bayesian formalism.[9] Akaike's version of BIC was originally denoted ABIC (for "a Bayesian Information Criterion") or referred to as Akaike's Bayesian Information Criterion.[10]

A comparison of AIC/AICc and BIC is given by Burnham & Anderson (2002, §6.4). The authors argue that AIC/AICc has theoretical advantages over BIC: firstly, because AIC/AICc is derived from principles of information; secondly, because the (Bayesian) derivation of BIC has a prior of 1/R (where R is the number of candidate models), which is "not sensible", since the prior should be a decreasing function of k. The authors also show that AIC and AICc can be derived in the same Bayesian framework as BIC, just by using a different prior. Additionally, they present a few simulation studies that suggest AICc tends to have practical/performance advantages over BIC. See also Burnham & Anderson (2004).

Further comparison of AIC and BIC, in the context of regression, is given by Yang (2005). In particular, AIC is asymptotically optimal in selecting the model with the least mean squared error, under the assumption that the exact "true" model is not in the candidate set (as is virtually always the case in practice); BIC is not asymptotically optimal under that assumption. Yang further shows that the rate at which AIC converges to the optimum is, in a certain sense, the best possible.
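To see the penalty difference numerically, here is an illustrative sketch; it assumes the standard definition BIC = k ln(n) - 2 ln(L), which is not quoted in this article, and the sample sizes are arbitrary.

```python
import math

def aic_penalty(k):
    # AIC adds 2 per estimated parameter
    return 2 * k

def bic_penalty(k, n):
    # BIC = k ln(n) - 2 ln(L): the per-parameter penalty is ln(n) instead of 2
    return k * math.log(n)

n = 200
for k in (2, 5, 10):
    print(k, aic_penalty(k), round(bic_penalty(k, n), 1))
# Whenever n > e^2 (about 7.4), ln(n) > 2, so BIC penalizes extra parameters
# more strongly than AIC, as stated above.
```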

See also

Deviance information criterion
Focused information criterion
Hannan–Quinn information criterion
Occam's razor

Notes

1. Burnham & Anderson 2002, §6.4.5
2. Burnham & Anderson 2002
3. Anderson 2008, §3.6.2
4. Burnham & Anderson 2004
5. Claeskens & Hjort 2008, §8.3
6. Burnham & Anderson 2002, pp. 12, 63
7. Akaike 1974
8. Brockwell & Davis 1991, p. 273
9. McQuarrie & Tsai 1998, p. 22
10. Akaike 1980

References

Akaike, Hirotugu (1974), "A new look at the statistical model identification", IEEE Transactions on Automatic Control 19 (6): 716–723, doi:10.1109/TAC.1974.1100705, MR 0423716.
Akaike, Hirotugu (1980), "Likelihood and the Bayes procedure", in Bernardo, J. M.; et al., Bayesian Statistics, Valencia: University Press, pp. 143–166.
Anderson, D. R. (2008), Model Based Inference in the Life Sciences, Springer.
Brockwell, Peter J.; Davis, Richard A. (1987), Time Series: Theory and Methods, Springer, ISBN 0387964061.
Brockwell, Peter J.; Davis, Richard A. (1991), Time Series: Theory and Methods (2nd ed.), Springer, ISBN 0387974296. Republished in 2009: ISBN 1441903194.
Burnham, K. P.; Anderson, D. R. (2002), Model Selection and Multimodel Inference: A Practical Information-Theoretic Approach (2nd ed.), Springer-Verlag, ISBN 0-387-95364-7.
Burnham, K. P.; Anderson, D. R. (2004), "Multimodel inference: understanding AIC and BIC in model selection", Sociological Methods and Research 33: 261–304. http://www.sortie-nd.org/lme/statistical%20papers/Burnham_and_Anderson_2004_Multimodel_Inference.pdf
Cavanaugh, J. E. (1997), "Unifying the derivations of the Akaike and corrected Akaike information criteria", Statistics and Probability Letters 31: 201–208.
Claeskens, G.; Hjort, N. L. (2008), Model Selection and Model Averaging, Cambridge.
Fang, Yixin (2011), "Asymptotic equivalence between cross-validations and Akaike information criteria in mixed-effects models", Journal of Data Science 9: 15–21. http://www.jds-online.com/file_download/278/jds-652a.pdf
Hurvich, C. M.; Tsai, C.-L. (1989), "Regression and time series model selection in small samples", Biometrika 76: 297–307.
Lukacs, P. M.; et al. (2007), "Concerns regarding a call for pluralism of information theory and hypothesis testing", Journal of Applied Ecology 44: 456–460, doi:10.1111/j.1365-2664.2006.01267.x.
McQuarrie, A. D. R.; Tsai, C.-L. (1998), Regression and Time Series Model Selection, World Scientific, ISBN 981-02-3242-X.
Sugiura, N. (1978), "Further analysis of the data by Akaike's information criterion and the finite corrections", Communications in Statistics - Theory and Methods A7: 13–26.
Takeuchi, K. (1976), "???", Suri-Kagaku (Mathematical Sciences) (in Japanese) 153: 12–18.
Yang, Y. (2005), "Can the strengths of AIC and BIC be shared?", Biometrika 92: 937–950.

External links

Hirotogu Akaike comments on how he arrived at the AIC, in This Week's Citation Classic (21 December 1981): http://www.garfield.library.upenn.edu/classics1981/a1981ms54100001.pdf
AIC (Aalto University): https://noppa.tkk.fi/noppa/kurssi/s-114.1310/luennot/extramaterial.pdf
Akaike Information Criterion (North Carolina State University): http://www4.ncsu.edu/~shu3/presentation/aic.pdf
Example Calculation (University of Georgia): http://coopunit.forestry.uga.edu/coop_wkshop/inference_effects/aic_reg.pdf
Model Selection (University of Iowa): http://myweb.uiowa.edu/cavaaugh/ms_seminar.html