Entry and Inefficiency in the Real Estate Brokerage Industry

Lu Han
University of Toronto
lu.han@rotman.utoronto.ca

Seung-Hyun Hong
University of Illinois
hyunhong@uiuc.edu

October 1, 2007

Preliminary Draft, Comments Welcome

Abstract

This paper studies entry decisions in local real estate brokerage markets to investigate sources of potential cost inefficiency due to entry of realtors. We build upon recent empirical work on games of incomplete information, and construct our equilibrium model in terms of fixed points in entry probability. We estimate our model using recently developed nested pseudo likelihood algorithms (Aguirregabiria and Mira, 2007). Using data from 5% PUMS, we find direct evidence for three sources of cost inefficiency: wasteful non-price competition, loss of economies of scale, and high fixed costs. In particular, we find that a one-standard-deviation increase in entry rate would increase the average variable costs by 28.96%, resulting from wasteful non-price competition.

Contact information: Lu Han: Rotman School of Management, University of Toronto, 105 St. George St., Toronto, Ontario, Canada M5S 3E6. Seung-Hyun Hong: Department of Economics, University of Illinois at Urbana-Champaign, 470F Wohlers Hall, 1206 S. Sixth St., Champaign, IL 61820. The first author is grateful for the financial support from the Bell University Labs.

1 Introduction

It is well known that entry can lead to social inefficiency under certain conditions. On the one hand, an increase in the number of entrants tends to reduce prices or increase product variety. On the other hand, when an industry's average costs decrease in output, excessive entry may result if the loss of economies of scale cannot be offset by reduced prices or increased variety. While a large and important theoretical literature on entry and social efficiency exists (e.g., Dixit and Stiglitz (1977); Mankiw and Whinston (1986)), there are relatively few empirical studies attempting to test its direct implications. Such evidence has been difficult to establish empirically, not only because obtaining data on the relevant cost and benefit measures is difficult, but also because one must account for the fact that entry decisions are endogenous. A notable exception is Berry and Waldfogel (1999), who estimate an entry model in the radio broadcasting industry and recover the parameters of the fixed cost function and demand function. While their estimates provide the first solid evidence for social inefficiency under free entry, actual entry and exit are rare in the radio industry because of regulation by the Federal Communications Commission. [1]

This paper complements the earlier literature by estimating an entry model for the U.S. residential real estate brokerage industry between 1990 and 2000. Our goal is to estimate the effects of entry on the costs of the real estate brokerage service and provide evidence for cost inefficiency due to entry in the real estate brokerage market. Investigating social inefficiency under entry for the real estate brokerage market is important for at least two reasons. First, the real estate brokerage industry does not have significant barriers to entry, if entry is perceived as gaining a license in order to practice. [2] In 2004, the industry had a large number of brokerage firms and individual licensed brokers and agents: approximately 98,000 active firms and 1.9 million active real estate brokers and agents. The estimated revenue of the brokerage industry exceeded $60 billion, which accounted for about 1% of GDP. [3] Given the size of the real estate brokerage industry, it is not surprising that entry and exit behavior in the industry has been recurrently featured in news reports and policy debates. [4]

Second, unlike most other industries, the real estate brokerage industry is characterized by relatively inflexible commission rates. Empirical evidence shows that real estate brokers often charge a commission rate of between 5% and 6% with little variance among firms, across housing markets or over time. In addition, the demand for traditional real estate broker service was relatively price inelastic before 2000. [5] Thus, it is less likely that consumers would benefit from entry through price reductions.

Given these two unique features of the real estate brokerage market, many observers believe that entry among traditional real estate brokers is likely to be excessive, since the absence of price competition among traditional brokers may have led to an inefficiently large commitment of resources to fighting for potential clients (Yinger (1981); Crockett (1982)). Empirical evidence of inefficiency in the real estate brokerage market is provided by Hsieh and Moretti (2003), who study the real estate brokerage industry in 1980 and 1990 and show that free entry leads to lower productivity. In particular, they find that, in cities with higher house prices, (1) there are more real estate agents relative to the city's workforce, (2) these agents are less productive (measured by sales per agent or sales per hour worked), and (3) wages for agents are not higher than wages in cities with lower housing prices.

While Hsieh and Moretti (2003) convincingly provide descriptive evidence for potentially excessive entry in the real estate brokerage market, real estate agent productivity (measured by sales per agent or per hour worked) alone does not explain the sources of market inefficiency. For example, if the marginal cost of providing a service is constant and the same across different agents, and fixed costs are very low, then lower sales per agent do not necessarily imply cost inefficiency. [6] As noted by Berry and Waldfogel (1999), in order to evaluate efficiency in an industry, one needs information on revenues and costs. While earnings data are easier to obtain, there is no publicly available data on agent costs.

In this paper, we provide a simple structural model that allows us to estimate the cost function in the real estate brokerage market. The empirical methodology relies on a basic insight of the traditional discrete-choice approach: the observed entry decisions are an indicator of the underlying profitability. Structural estimation of entry decisions therefore allows us to recover economic primitives such as marginal costs and fixed costs, and further enables us to assess the potential loss or gain in efficiency due to free entry. By efficiency, we mean cost efficiency in terms of minimizing the total cost of real estate broker service. [7]

For our model of the real estate brokerage industry, we build upon the entry literature and, in particular, recent empirical work on games with incomplete information (e.g. Augereau, et al. (2006); Bajari, et al. (2006); Seim (2006); Sweeting (2004)). A key feature of our setting is that most markets contain a large number of potential entrants, suggesting that incomplete information is plausible because it is practically impossible for each agent to observe the profitability (e.g. fixed costs) of all other agents. Each agent then forms her conjecture on other agents' entry strategies. Given their conjectures and private information, potential agents simultaneously decide whether to enter. Further exploiting our key feature, we assume that potential agents care only about the average entry probability. The resulting Bayesian Nash equilibrium thus reduces to fixed points in entry probability, in that agents' conjectures on the average entry probability should coincide with the entry probability predicted from the model. [8] Because the goal of our estimation is to estimate the cost function, we further need to distinguish between revenues and costs.

[1] See Sweeting (2006).
[2] See the 1983 and 2007 reports by the Federal Trade Commission and U.S. Department of Justice, http://www.usdoj.gov/atr/public/reports/223094.pdf.
[3] The numbers come from BEA (2005).
[4] Examples include a recent New York Times discussion of the real estate brokerage industry at http://www.nytimes.com/2007/09/07/business/07agents.html, and the antitrust case against the Kentucky Real Estate Commission filed by the DOJ in 2005 for prohibiting agents from giving consumers a rebate on some of the commissions they pay.
[5] See the 2006 Real Estate Brokerage Report by the Government Accountability Office (GAO (2006)). Hsieh and Moretti (2003) provide the evidence for inflexible commission rates for the period 1980-1998. Section 5.1 in this paper adds further evidence for the period 1988-2002.
[6] Consider a simple model in which an identical agent's profit is given by $0.06 P \frac{Q}{N} - c \frac{Q}{N}$, where 0.06 is the fixed commission rate, $P$ is the average housing price, $Q$ is total transactions, $N$ is the number of realtors, and $c$ is the constant marginal cost. A potential agent enters the market as long as the profit is greater than $w$, her reservation wage. In a free entry equilibrium, we have $(0.06 P - c) \frac{Q}{N} = w$. Given that $Q$, $w$, and $c$ are fixed, if $P$ increases, $N$ has to increase as well. In this case, we observe all three patterns as in Hsieh and Moretti (2003), and yet there is no cost inefficiency.
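The simple free-entry model of footnote 6 can be checked numerically. The following is a minimal sketch, not part of the paper's estimation: the function name and all numeric values (prices, transactions, marginal cost, reservation wage) are hypothetical and chosen only to illustrate the free-entry condition $(0.06P - c)Q/N = w$.

```python
# Illustrative sketch of the free-entry condition (0.06*P - c) * Q / N = w.
# All numeric values below are hypothetical, chosen only for illustration.

def free_entry_agents(price, transactions, marginal_cost, reservation_wage,
                      commission=0.06):
    """Number of agents N at which per-agent profit equals the reservation wage."""
    margin_per_sale = commission * price - marginal_cost
    return margin_per_sale * transactions / reservation_wage


Q, c, w = 10_000, 3_000.0, 40_000.0  # hypothetical market-level values

results = {}
for P in (150_000.0, 300_000.0):
    N = free_entry_agents(P, Q, c, w)
    results[P] = {
        "agents": N,
        "sales_per_agent": Q / N,
        "profit_per_agent": (0.06 * P - c) * Q / N,
        "total_variable_cost": c * Q,  # does not depend on N
    }
    print(P, results[P])
```

In this sketch a higher house price raises the number of agents and lowers sales per agent, while per-agent profit stays at the reservation wage and total variable cost $cQ$ is unchanged. Lower productivity alone therefore need not indicate cost inefficiency.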
To distinguish revenues from costs, we exploit the information on realtors' earnings contained in our main data from the Census of Housing and Population (5% PUMS) for 1990 and 2000. Because realtors' earnings are mostly composed of commission fees, we use this information to predict revenues for potential agents. In doing so, we also allow predicted revenues to depend on agent conjectures. As a result, equilibrium conjectures enter the model nonlinearly, which complicates using the standard nested fixed point algorithm (e.g. Seim (2006)). For this reason, we employ the nested pseudo likelihood (NPL) algorithm proposed by Aguirregabiria and Mira (2002, 2007), in which the order of the nested fixed point algorithm is swapped: the outer algorithm iterates on the choice probability to solve the fixed point problem, while the inner algorithm maximizes a pseudo likelihood function given the fixed choice probability. This approach is attractive in our setting, because an initial nonparametric estimate for the choice probability can be the fraction of entrants in each market, and we can simply iterate standard probit estimations.

Another important feature of our model is that we explicitly specify the cost function to test three potential sources of inefficiency under free entry. First, we consider whether entry results in wasteful duplication of fixed costs. Second, by choosing a flexible functional form for the average variable cost function, we allow for the possibility that there may be economies or diseconomies of scale in the real estate brokerage market. The optimal number of real estate agents derived from the average cost function could explain potential welfare loss under free entry, even without considering the presence of fixed costs. Finally, given the lack of price competition in the real estate brokerage industry, our average variable cost function includes not only the costs of executing the transaction but also the cost of resources absorbed in obtaining potential clients. If entry results in an inefficiently large commitment of resources to promoting brokerage service, then we expect that average variable cost would increase in the number of entrants, which provides an additional piece of evidence for inefficiency. Our results strongly indicate that all three sources of cost inefficiency are present.

[7] We do not aim to evaluate the full efficiency of the real estate brokerage market in this paper. Entry is optimal up to the point where marginal benefits, measured in terms of the value of enhanced functioning in the real estate market, equal the cost of the resources absorbed. Given the lack of data on the market share of each individual agent and the quality of services provided by each agent, we cannot fully specify the relevant benefits of entry. However, our welfare results based on the cost side analysis should be a suggestive indicator of total welfare, since our findings from the CEX (1988-2002) and the NAR (2005) data do not show any evidence that competition among real estate agents helps to reduce prices or improve the quality of matching.
[8] Our model is therefore related to Brock and Durlauf (2001).
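The NPL iteration described above (an outer loop on the choice probability, an inner probit step with that probability held fixed) can be sketched as follows. This is a stylised illustration under strong assumptions, not the authors' specification: the one-parameter entry index `x - theta * sigma`, the simulated data, and every function name are hypothetical, and the inner probit is solved by a hand-written Fisher scoring step so that the example needs only the Python standard library.

```python
import random
from statistics import NormalDist

PHI = NormalDist()  # standard normal distribution


def equilibrium_prob(x, theta, iters=200):
    """Solve sigma = Phi(x - theta * sigma) by successive approximation."""
    sigma = 0.5
    for _ in range(iters):
        sigma = PHI.cdf(x - theta * sigma)
    return sigma


def probit_given_beliefs(x, shares, sizes, sigma, theta=0.0, iters=100):
    """Inner step: maximise the pseudo likelihood over theta, beliefs held fixed."""
    for _ in range(iters):
        score = info = 0.0
        for xm, sm, Sm, pm in zip(x, shares, sizes, sigma):
            z = xm - theta * pm
            F, f = PHI.cdf(z), PHI.pdf(z)
            w = f / (F * (1.0 - F))
            score += Sm * (-pm) * w * (sm - F)   # derivative of log-likelihood
            info += Sm * pm * pm * w * f         # expected information
        step = score / info
        theta += step
        if abs(step) < 1e-12:
            break
    return theta


def npl(x, shares, sizes, max_outer=100, tol=1e-10):
    """Outer step: update the choice probabilities until they stop changing."""
    sigma = list(shares)  # initial estimate: observed fraction of entrants
    theta = 0.0
    for _ in range(max_outer):
        theta = probit_given_beliefs(x, shares, sizes, sigma, theta)
        new_sigma = [PHI.cdf(xm - theta * pm) for xm, pm in zip(x, sigma)]
        gap = max(abs(a - b) for a, b in zip(new_sigma, sigma))
        sigma = new_sigma
        if gap < tol:
            break
    return theta, sigma


# Hypothetical simulated data: 9 markets, 2000 potential entrants each.
rng = random.Random(0)
TRUE_THETA = 1.0
x = [-1.0 + 0.25 * m for m in range(9)]
sizes = [2000] * 9
shares = []
for xm, Sm in zip(x, sizes):
    p = equilibrium_prob(xm, TRUE_THETA)
    entrants = sum(rng.random() < p for _ in range(Sm))
    shares.append(entrants / Sm)

theta_hat, sigma_hat = npl(x, shares, sizes)
print("estimated competition effect:", theta_hat)
```

The design choice mirrors the text: no fixed point is solved inside the likelihood, and each outer pass costs one probit fit plus one evaluation of the predicted entry probabilities.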
Our estimates show, for example, that a one-standard-deviation increase in the entry rate increases average variable costs by $1,224, which corresponds to a 28.96% increase from the mean of average variable costs. To our knowledge, this is the first direct empirical evidence on wasteful non-price competition in the real estate brokerage industry. In addition, entry could result in a substantial loss of economies of scale, both because of the presence of fixed costs and because of the shape of the variable cost function. These findings provide direct evidence that, at least from the cost side, entry leads to an inefficiently large amount of resources being spent on marketing and producing real estate brokerage service.

There are at least two policy lessons that we can draw from this exercise. First, restricting entry by traditional brokers would yield nontrivial cost savings by realizing economies of scale. Second, with inflexible commission rates, realtors inefficiently engage in wasteful non-price competition rather than price competition. Given the central role played by the multiple listing mechanisms, one step in this direction is to ensure participation by discount brokers and to encourage price competition.

We believe that this paper contributes to the literature in two important ways. First, it is the first to estimate how real estate brokerage costs change with entry, and it therefore provides a first direct test of cost inefficiency in the real estate brokerage industry. In the real estate brokerage literature, researchers typically infer cost inefficiency by testing whether average productivity decreases with the number of agents in the market. An affirmative finding is interpreted as evidence for an increase in costs and hence for market inefficiency under free entry. While this approach is very useful, it provides only an indirect test of the inefficiency under free entry hypothesized in the theory. An alternative approach is to look for evidence on whether costs in this industry directly increase with entry. The difficulty often stems from the lack of cost data and the endogeneity of entry decisions. This paper overcomes these difficulties. In addition to providing direct evidence of excessive entry, this approach also allows one to reveal different sources of market inefficiency.

Second, this paper also emphasizes several methodological issues that are of particular importance in studying real estate brokerage markets. First, the real estate industry lacks detailed data on commission rates, individual market shares and costs. Our approach exploits the observed agents' revenue and observed entry decisions to make inferences about the underlying transaction numbers and costs.
In this respect, our paper bears a resemblance to Berry and Waldfogel (1999), who recover the distribution of fixed costs in the radio industry. Second, given the large number of real estate agents, it is hard for a potential entrant to observe her rivals' fixed costs and hence learn about their entry decisions ex ante. We thus adopt an incomplete information framework similar to Seim (2006) and extend it to the case with many potential agents.

Throughout our analysis, we employ two simplifying assumptions. First, while we allow for variations in commission rates, we assume that agents do not directly compete on commission rates. One may argue instead that incentives to compete on commission are high, and thus more vigorous competition could lead to more dramatic benefits by lowering commission fees for home sellers. Using data from the 1988-2002 Consumer Expenditure Survey, however, we find that this hypothesis is not supported. One possible explanation is that, while there is a large number of agents in each city, there are only relatively few large brokerage companies in most areas. Since brokers are allowed to control fee policies among their agents, sustaining collusive behavior is easier than if each agent sets her own commission rate. [9] Second, while we allow for heterogeneity among agents, we do not model quality competition. One may argue that in the absence of price competition, agents may engage in competition in service quality, which makes consumers better off. Using data from the 2005 National Association of Realtors survey of home buyers and sellers, we find that competition among agents does not help home buyers reduce searching time or help home sellers reduce the time on the market, after controlling for the price concessions for home sellers and price discounts for home buyers. Thus, our welfare result based on the cost side analysis, while limited, is suggestive of total social welfare inefficiency in the real estate brokerage industry.

The paper is organized as follows. Section 2 constructs our equilibrium model and discusses our estimation method. Section 3 describes our data and variables. Section 4 presents the main estimation results and discusses their implications. Section 5 examines two simplifying assumptions in our model. Section 6 concludes the paper.

2 Model

2.1 Set-up

We begin with a simple version of the model in which the house price and the number of transactions in a market are exogenously given. There are $M$ different markets, and for each market $m$, $S_m$ potential entrants simultaneously decide whether to enter the market or not. [10] To examine a potential realtor's entry decision, consider a two-stage model: in the first stage, potential entrants simultaneously decide whether to enter the market; in the second stage, realtors in the market compete with each other, and their profits are realized. If potential agent $i$ enters market $m$, then her post-entry profit is given by

$$\pi_{im} = R_{im} - C_m(q_{im}) - F_{im}, \tag{1}$$

where $R_{im}$ denotes the revenue for agent $i$ in market $m$, and $q_{im}$ is the number of transactions carried out by agent $i$. The cost function consists of variable cost $C_m(q_{im})$ and fixed costs $F_{im}$. Potential agent $i$ will enter the market as long as her expected post-entry profit is positive.

Note that we do not explicitly distinguish between incumbent realtors and new entrants. We posit that incumbent realtors may decide to exit the market if their profits become lower than their reservation wages. In other words, even incumbent realtors need to decide whether to enter (i.e. remain in the market) in the first stage. Nevertheless, we allow for agent heterogeneity. Potential differences between incumbents and new entrants, for example, can be captured by different values for $R_{im}$, $q_{im}$, and $F_{im}$.

Because most transactions are likely to follow a similar standard procedure, which may include market-specific components, we assume that the variable cost function, $C_m(\cdot)$, is the same within market $m$, and thus is known to all potential agents. In contrast, fixed costs are assumed to differ across agents. Though costs associated with obtaining licenses might be similar, agents incur additional startup costs to learn about neighborhoods, the local real estate market, related tax laws, and financing information. These costs are likely to vary across different agents.

We posit that each market contains many realtors and a large number of potential entrants, since anyone in the local labor force can potentially become a realtor. A large number of agents, however, suggests that our setting is more similar to social interactions models as in Brock and Durlauf (2001) than to common settings in industrial organization where the number of firms is limited.

[9] The constrained variations in commission rates can also be attributed to the reluctance of home sellers to list their home at a commission rate below the prevailing rate. Homeowners must compete among themselves for the sales effort of agents and they may be unwilling to risk offering a lower commission rate. See Goolsby and Childs (1988) for discussion on this issue.
[10] For ease of presentation, this section specifies a cross-sectional entry model. However, our estimation takes advantage of the panel data structure by including MSA fixed effects and year dummies.
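Equation (1) and the entry rule can be made concrete with a short sketch. The quadratic form of the variable cost function and all numbers below are hypothetical assumptions made only for illustration; the paper specifies its own functional form in later sections.

```python
# Sketch of equation (1) and the entry rule.
# The cost function and every number here are hypothetical.

def variable_cost(q, a=3_000.0, b=100.0):
    """Hypothetical market-level variable cost C_m(q) = q * (a + b * q)."""
    return q * (a + b * q)


def post_entry_profit(revenue, q, fixed_cost):
    """pi_im = R_im - C_m(q_im) - F_im, as in equation (1)."""
    return revenue - variable_cost(q) - fixed_cost


def enters(expected_profit):
    """A potential agent enters when expected post-entry profit is positive."""
    return expected_profit > 0


# Two agents in the same market: same revenue, transactions and variable
# cost function, but different agent-specific fixed costs.
profit_low_F = post_entry_profit(50_000.0, 8, 15_000.0)
profit_high_F = post_entry_profit(50_000.0, 8, 25_000.0)
print(profit_low_F, enters(profit_low_F))
print(profit_high_F, enters(profit_high_F))
```

The two agents share $C_m(\cdot)$ but differ in $F_{im}$, which is the only source of the different entry decisions in this example.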
Because it is practically impossible for each potential agent to know the fixed costs of all other individuals in the market, we assume that there is a private information component in fixed costs. In other words, each agent observes her own fixed costs in the first stage and knows the aggregate distribution of fixed costs, but does not observe the fixed costs of other agents.

One more implication of a large number of agents is that we may not use the common approach in the entry literature (e.g. Berry (1992); Bresnahan and Reiss (1991); Berry and Waldfogel (1999)), which exploits the free entry equilibrium condition $\pi_{im}(N_m) \geq 0$ and $\pi_{im}(N_m + 1) < 0$, where $N_m$ is the equilibrium number of firms. There are two problems. First, $N_m$ tends to be large in our data, suggesting that the value of the profit under $N_m$ is likely to be very similar to that under $N_m + 1$. Second, we do not observe the exact number of entrants in our data. We use the 5 percent sample of the Census of Population and Housing, from which we can only estimate the total number of realtors in each market. Though the estimate for $N_m$ should be unbiased, we may not be able to tell the difference between $N_m$ and $N_m + 1$.

2.2 Unobservables and Conjectures

In our model, we consider two sources of unobservables. The first is private information on fixed costs. Though some components of fixed costs can be captured by observed variables, the other component of fixed costs, which we denote by $\omega_{im}$, is unobserved by both econometricians and other agents in the market. We assume that $\omega_{im}$ is an independently and identically distributed draw from a distribution known to all agents.

The second source of unobservables, denoted by $\eta_{im}$, is agent uncertainty about the second stage competition. In the first stage, each agent can guess her variable profit realized in the second stage. Though this guess may be close to the realized variable profit, it is unlikely that each agent knows the exact value of her future variable profit. Note that uncertainty about other agents' fixed costs (and thus uncertainty about their actions) also implies that the exact values of their variable profits are unknown to agents in the first stage. However, there may be other types of uncertainty, such as an unexpected housing boom or slump. Therefore, $\eta_{im}$ reflects these other types of uncertainty unknown to agents in the first stage. [11]

Both sources of unobservables imply that potential agents need to form their expectation of variable profits in order to decide whether to enter in the first stage. In other words, potential agents enter as long as $E(V_i) \geq F_i$, where $V_i = R_i - C(q_i)$, suppressing the subscript $m$, and the expectation is taken over all other players' choices and agent uncertainty.
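The two problems raised at the end of Section 2.1 with the condition $\pi_{im}(N_m) \geq 0$, $\pi_{im}(N_m + 1) < 0$ can be illustrated with simple arithmetic. The sketch below rests on two idealisations that are assumptions of this example and not statements from the paper: each realtor is included in the 5 percent sample independently with probability 0.05, and variable profit is proportional to $1/N$. The function names and the market size of 2,000 realtors are hypothetical.

```python
from math import sqrt


def sd_of_scaled_count(true_count, sampling_rate=0.05):
    """Sampling SD of N_hat = (realtors found in the sample) / sampling_rate."""
    sampled_sd = sqrt(true_count * sampling_rate * (1.0 - sampling_rate))
    return sampled_sd / sampling_rate


def relative_profit_change(n):
    """Relative fall in a profit proportional to 1/N when N rises to N + 1."""
    return 1.0 / (n + 1)


N = 2_000  # hypothetical number of realtors in a market
sd = sd_of_scaled_count(N)
change = relative_profit_change(N)
print("sampling SD of estimated N:", sd)
print("relative profit change from one more entrant:", change)
```

Under these assumptions the sampling error in the estimated number of realtors is far larger than one, and one additional entrant changes profit by a small fraction of a percent, so an inequality condition defined at $N_m$ and $N_m + 1$ has little empirical bite.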
The first type of unobservable, however, has a further implication for the equilibrium condition. Because other agents' fixed costs are unknown, agent $i$ forms her subjective belief on other agents' actions and chooses her entry strategy based on her private information about $F_i$, which implies the choice probability of entry. Namely, $\Pr(d_i = 1) = \Pr(E(V_i) \geq F_i)$, where $d_i$ is an indicator for agent $i$'s entry, and the expectation is taken with respect to agent $i$'s subjective belief on other agents' entry. To the extent that agents' conjectures are rational, the equilibrium requires that the choice probability of entry of all agents should coincide with agents' subjective beliefs on other agents' entry. [12]

Note that $V_i$ is a function of other agents' choices, denoted by $d_{-i}$, and the expectation of $V_i(d_{-i})$ is taken over all other players' choices, so that agents' conjectures, denoted by $\sigma(d_{-i})$, are contained in $E(V_i(d_{-i}))$. However, $V_i(d_{-i})$ may be a nonlinear function, in which case the computation of $E(V_i(d_{-i}))$ is likely to be demanding. Most literature on games with incomplete information (e.g. Aradillas-Lopez (2005); Augereau, et al. (2006); Bajari, et al. (2006); Seim (2006); Sweeting (2004)) therefore considers a payoff function in reduced form and assumes that the number of entrants enters the payoff function linearly, thereby directly obtaining $E(V_i(d_{-i})) = V_i(E(d_{-i})) = V_i(\sigma(d_{-i}))$.

We also need to exploit a similar kind of simplification, especially because the number of potential entrants in our setting is too large to allow for any tractable computation of the expectation. The assumption of a linear profit function is very restrictive in our setting, however, since it prevents us from examining various sources of cost inefficiency. Consequently, we consider nonlinear functional forms for variable profits, discussed in the subsequent sections, but impose the following assumptions on agent conjectures. Specifically, potential agent $i$ in market $m$ first anticipates the average probability of agent entry, denoted by $\sigma_m$. Based on $\sigma_m$ and available information in the first stage, agent $i$ then predicts her expected revenue and quantity in the second stage. The assumption is that instead of taking a complicated expectation of variable profit, agent $i$ simply conjectures the average number of realtors in the market [13] and then treats $\sigma_m$ as known information.

[11] Another difference between these two unobservables is that $\eta_{im}$ is likely to be known to agents in the second stage, whereas $\omega_{im}$ is unknown to other agents even in the second stage.
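The equilibrium requirement that the conjectured average entry probability coincide with the entry probability implied by the model can be sketched as a fixed-point problem. The example below is a hedged illustration, not the paper's specification: the variable profit function, the standard normal distribution for the private fixed-cost component, the mean fixed cost, and the number of potential entrants are all hypothetical assumptions.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # assumed distribution of the private cost shock


def variable_profit(sigma):
    """Hypothetical expected variable profit, decreasing and nonlinear in sigma."""
    return 1.0 - 2.0 * sigma - 1.5 * sigma ** 2


def entry_probability(sigma, mean_fixed_cost):
    """Pr(enter) = Pr(V(sigma) >= F) when F = mean_fixed_cost + omega, omega ~ N(0, 1)."""
    return STD_NORMAL.cdf(variable_profit(sigma) - mean_fixed_cost)


def solve_equilibrium(mean_fixed_cost, tol=1e-12):
    """Find sigma with entry_probability(sigma) = sigma by bisection on [0, 1]."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if entry_probability(mid, mean_fixed_cost) > mid:
            lo = mid   # predicted entry exceeds the conjecture: raise it
        else:
            hi = mid   # predicted entry is below the conjecture: lower it
    return 0.5 * (lo + hi)


MEAN_FIXED_COST = 0.5        # hypothetical, in units of the shock's SD
POTENTIAL_ENTRANTS = 10_000  # hypothetical S_m

sigma_star = solve_equilibrium(MEAN_FIXED_COST)
expected_entrants = sigma_star * POTENTIAL_ENTRANTS  # N_m = sigma_m * S_m
print("equilibrium entry probability:", sigma_star)
print("expected number of entrants:", expected_entrants)
```

Bisection is used here because the assumed variable profit is decreasing in $\sigma_m$, so predicted entry minus the conjecture crosses zero exactly once on $[0, 1]$; with other functional forms a different solver, or a check for multiple fixed points, may be needed.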
Note that under this assumption, if we do not allow for η im, then agent i should immediately know her future revenue given σ m, since there is no other uncertainty. For this reason, we introduce η im to reflect other type of uncertainty that allows for the difference between predicted revenue and actual revenue in the second stage. 14 12 See Brock and Durlauf (2001) for similar definition of equilibrium. Most literature on games with incomplete information (e.g. Aradillas-Lopez (2005); Bajari, et al. (2006); Seim (2006); Sweeting (2004)) uses similar Bayesian Nash equilibrium defined in probability space. 13 Note that given σ m and total number of potential agents S m, agent i can simply expect the number of entrants N m to be σ m S m. 14 In some sense, η im accounts for the difference between E(R i(d i)) and R i(e(d i)). 9
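The gap between $E(V_i(d_{-i}))$ and $V_i(E(d_{-i}))$ under a nonlinear payoff can be illustrated with a minimal sketch. The payoff $V(N) = 1/(1+N)$, the binomial entry model, and the numbers below are illustrative assumptions only; none of them is the paper's specification.

```python
from math import comb


def expected_payoff(num_rivals, sigma):
    """E[1 / (1 + N)] when N ~ Binomial(num_rivals, sigma).

    The payoff 1 / (1 + N) is a hypothetical nonlinear function of the
    number of rival entrants N, chosen only for illustration.
    """
    return sum(
        comb(num_rivals, n) * sigma**n * (1 - sigma) ** (num_rivals - n) / (1 + n)
        for n in range(num_rivals + 1)
    )


def plug_in_payoff(num_rivals, sigma):
    """The same payoff evaluated at the expected number of entrants sigma * S."""
    return 1.0 / (1.0 + sigma * num_rivals)


S, sigma = 50, 0.2  # assumed number of potential rivals and entry probability
gap = expected_payoff(S, sigma) - plug_in_payoff(S, sigma)
print(f"E[V(N)] - V(E[N]) = {gap:.5f}")
```

Because the assumed payoff is convex in $N$, the exact expectation exceeds the plug-in value. A discrepancy of this kind is what the simplifying conjecture sets aside and what $\eta_{im}$ is meant to absorb.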

Given this assumption on agent conjectures, we can model the choice probability of agent entry. The following subsections describe the profit function in our model and construct our equilibrium model in terms of fixed points in entry probability.

2.3 Revenue and Costs

2.3.1 Revenue Function

Agent $i$ earns her revenue from commission fees, so that

$$R_{im} = \sum_{k=1}^{q_{im}} c_{ikm} P_{ikm},$$

where $P_{ikm}$ is the price of house $k$ sold by agent $i$ in market $m$, and $c_{ikm}$ is the commission rate for each transaction. Through a process we do not explicitly model, the equilibrium levels of $c_{ikm}$, $P_{ikm}$, and $q_{im}$ are determined in the second stage. There are three reasons we do not model this mechanism in the second stage. First, we do not observe $c_{ikm}$, $P_{ikm}$, and $q_{im}$ for each individual transaction by each individual realtor. Second, we instead observe realtors' earnings, which are mostly commission fees. Third, by directly using $R_{im}$ in the profit function, we can separate costs from revenue and avoid potential model misspecification of the second-stage competition.

The revenue function for agent $i$ in market $m$ is specified in the following reduced form:

$$\log(R_{im}) = \gamma_0 + \gamma_1 \frac{Q_m}{S_m} + \gamma_2 \frac{N_m}{S_m} + f(P_m) + Z^r_m \delta^r + X^r_{im} \beta^r + \eta_{im}, \tag{2}$$

where $Q_m$ denotes the total number of transactions in market $m$, $N_m$ is the total number of realtors in market $m$, $Z^r_m$ is a vector of market characteristics, $X^r_{im}$ is a vector of agent $i$'s characteristics, and $\gamma$, $\delta^r$, and $\beta^r$ are parameters. The superscript $r$ indicates that the variable or the parameter determines revenue. $P_m$ denotes a vector of house prices in market $m$, and $f(P_m)$ is a function of the distribution of housing prices. A simple example of this function is $f(P_{1m}, P_{2m}, \ldots, P_{Jm}) = \gamma_3 \bar{P}_m$, where $\bar{P}_m = \frac{1}{J} \sum_{j=1}^{J} P_{jm}$. The error term $\eta_{im}$ reflects components of revenue unknown to agent $i$ in the first stage.
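As a minimal sketch of how the reduced form (2) could be evaluated, the function below uses the simple example $f(P_m) = \gamma_3 \bar{P}_m$. The function name, argument names, and all numbers are hypothetical; the terms $Z^r_m \delta^r$ and $X^r_{im} \beta^r$ are passed in as precomputed scalars.

```python
def log_revenue(gamma, transactions, potential_agents, entrants,
                house_prices, z_term=0.0, x_term=0.0, eta=0.0):
    """Evaluate the right-hand side of equation (2).

    gamma        -- (gamma_0, gamma_1, gamma_2, gamma_3), assumed values
    z_term       -- the market index Z^r_m * delta^r, precomputed
    x_term       -- the individual index X^r_im * beta^r, precomputed
    eta          -- the revenue shock eta_im (zero for a prediction)
    """
    g0, g1, g2, g3 = gamma
    mean_price = sum(house_prices) / len(house_prices)  # \bar{P}_m
    return (g0
            + g1 * transactions / potential_agents   # Q_m / S_m
            + g2 * entrants / potential_agents       # N_m / S_m
            + g3 * mean_price                        # f(P_m) = gamma_3 * mean price
            + z_term + x_term + eta)


# Made-up inputs, purely to show the call.
value = log_revenue((1.0, 2.0, 3.0, 0.5), transactions=10,
                    potential_agents=100, entrants=20, house_prices=[2.0, 4.0])
print(value)
```

Replacing `entrants / potential_agents` with a conjectured entry share corresponds to the first-stage prediction, in which the agent treats that share as known.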
2.3.2 Fixed Cost Function

On the cost side, we assume real estate agents' costs are characterized by a fixed cost, $F_{im}$, and a variable cost, $C(q_{im})$. Fixed costs measure the costs of becoming a real estate agent. They are assumed to vary both across markets and across realtors within the same market. Using the superscript $f$ to indicate that the variable or the parameter determines fixed cost, we write the fixed costs as

$$F_{im} = Z^f_m \delta^f + X^f_{im} \beta^f + \omega_{im}, \tag{3}$$

where $Z^f_m$ is a vector of market characteristics, which includes information on licensing requirements and office space in different markets. $X^f_{im}$ is a vector of individual characteristics for agent $i$ in market $m$, which measures the opportunity wage that a potential agent could earn if she worked outside the real estate brokerage sector within the same market. Different people may have different opportunity costs, depending on their age, education, experience and skill set. $\beta^f$ and $\delta^f$ are parameters to be estimated. The unobserved component of an individual's fixed costs, $\omega_{im}$, follows an i.i.d. standard normal distribution. It is observed by the individual herself, but not by other potential entrants or by econometricians.

2.3.3 Average Variable Cost Function

Variable costs measure the costs involved in each house transaction. They are assumed to vary across markets only. In market $m$, given the number of transactions $q_{im}$, the number of entrants $N_m$, and the total number of potential entrants $S_m$, the variable costs are defined as

$$C\left(q_{im}, \frac{N_m}{S_m}\right) = \left(Z^v_m \delta^v + \alpha \frac{N_m}{S_m} + \theta_1 + \theta_2 q_{im} + \theta_3 q_{im}^2\right) q_{im}. \tag{4}$$

In a traditional variable profit function, where considerations of wasteful competition and economies of scale are absent, the average variable cost function is parameterized as $Z^v_m \delta^v + \theta_1$ only. $Z^v_m$ is a vector of market-level characteristics, such as gas prices and building density, which presumably affect a realtor's marginal cost in each transaction. $\delta^v$ and $\theta$ are parameters to be estimated.

Our average variable cost function differs from the usual average cost function in two ways. First, we include the term $\alpha \frac{N_m}{S_m}$ to capture an important possible externality from the presence of other agents due to wasteful competition. Unlike most other markets, where price competition is allowed, the real estate brokerage market is characterized by relatively inflexible commission rates.
Since agents do not directly compete on prices, an increased number of entrants could intensify competition among brokers along other dimensions, such as prospecting for potential clients. That is, to compete for each sale, real estate agents have to spend additional effort on a wide range of activities, including marketing their own services to potential clients. Realtors' marketing can include paid advertisements in television, radio, print, or online media; informal networking to meet potential buyers and sellers; and giving away pumpkins at Halloween. Unlike the marginal costs involved in selling or buying a house, most of these additional marginal costs do not necessarily generate enough benefit to offset the resources committed to marketing.

Second, we include linear and quadratic terms in the number of transactions in the marginal cost function. By choosing a flexible functional form, we allow for the possibility that there may be economies of scale, diseconomies of scale, or both in the real estate brokerage market. In particular, our cost function specification allows us to test whether the average cost curve is U-shaped:

$$AC_m = \left(\theta_1 + Z^v_m \delta^v + \alpha \frac{N_m}{S_m}\right) + \theta_2 q_m + \theta_3 q_m^2 + \frac{F_m}{q_m} \tag{5}$$

and

$$\frac{\partial AC_m}{\partial q_m} = \theta_2 + 2 \theta_3 q_m - \frac{F_m}{q_m^2}. \tag{6}$$

The source of economies of scale is fixed costs. In the extreme case where $\theta_2 = \theta_3 = 0$, a sufficiently large $F_m$ would yield a natural monopoly as the optimal market structure. The source of diseconomies of scale is $\theta_2$ and $\theta_3$. The exact ranges of economies or diseconomies of scale thus depend on the values of $\theta_2$ and $\theta_3$ relative to $F_m$ and $q_m$.

2.4 Equilibrium

Following the discussion in Section 2.2, we posit that in the first stage, a potential agent predicts her expected variable profit as follows. First, she conjectures the fraction of potential agents who will be realtors in the second stage. We denote this conjecture by $\sigma_m$. We can think of $\sigma_m$ as the common conjecture about the equilibrium fraction of agents in market $m$. Given that all agents know the aggregate distribution of private information, they are likely to agree on $\sigma_m$, which is why it does not depend on $i$. Next, agent $i$ replaces $\frac{N_m}{S_m}$ with $\sigma_m$ in her revenue and cost functions, and then treats $\sigma_m$, together with $P_m$, $Q_m$, and other market and individual characteristics, as given.15 Agent $i$ then predicts $\tilde{R}_{im}$, her expected revenue based on (2).

15 Our baseline model treats $P_m$ and $Q_m$ as given exogenously, assuming that housing prices and total transactions are determined primarily by overall housing market conditions, but not by the number of realtors. Later in this paper, we will extend our model to let $P_m$ and $Q_m$ be endogenously determined partly by the number of realtors, $N_m$.
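The U-shape of the average cost curve can be made concrete with a small numerical sketch that finds the quantity at which the slope of average cost is zero, that is, where $\theta_2 + 2\theta_3 q - F/q^2 = 0$. The parameter values are made up for illustration, and the sketch assumes $F > 0$, $\theta_3 > 0$ and $\theta_2 \geq 0$, under which the slope is increasing in $q$ and has a single root.

```python
def average_cost(q, base, theta2, theta3, fixed_cost):
    """Average cost, with theta_1 + Z^v delta^v + alpha * N/S collapsed into `base`."""
    return base + theta2 * q + theta3 * q**2 + fixed_cost / q


def ac_slope(q, theta2, theta3, fixed_cost):
    """Derivative of average cost with respect to q."""
    return theta2 + 2.0 * theta3 * q - fixed_cost / q**2


def min_ac_quantity(theta2, theta3, fixed_cost, tol=1e-10):
    """Bisection for the root of the slope (the bottom of the U)."""
    if fixed_cost <= 0 or theta3 <= 0 or theta2 < 0:
        raise ValueError("sketch assumes F > 0, theta3 > 0, theta2 >= 0")
    lo, hi = 1e-9, 1.0
    while ac_slope(hi, theta2, theta3, fixed_cost) < 0:  # expand until slope >= 0
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if ac_slope(mid, theta2, theta3, fixed_cost) < 0:
            lo = mid   # still on the falling part of the curve
        else:
            hi = mid   # on the rising part of the curve
    return 0.5 * (lo + hi)


# Assumed values: with theta2 = 0 the root is (F / (2 * theta3)) ** (1 / 3).
q_star = min_ac_quantity(theta2=0.0, theta3=1.0, fixed_cost=16.0)
print(round(q_star, 6))
```

Below `q_star` the fixed-cost term dominates and average cost falls with quantity; above it the $\theta_2$ and $\theta_3$ terms dominate and average cost rises.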

Since revenues depend on quantities, agent $i$ is likely to predict her revenue and quantity at the same time. Because we do not observe the actual quantity for each realtor in our data and do not have enough information to model the second stage, we consider a simplifying assumption for agent $i$'s expectation of $q_{im}$ as follows:

$$\tilde{q}_{im} \equiv \tilde{q}_{im}\left(X^r_{im}, Z^r_m, P_m, \frac{Q_m}{S_m}, \sigma_m\right) = \frac{\tilde{R}_{im}}{\tilde{R}_m} Q_m, \tag{7}$$

where $\tilde{R}_m$ is total revenue for all realtors in market $m$. Equation (7) implies that agent $i$'s market share equals her share of total market revenue. To derive (7), we assume that agent $i$ considers the average commission rate, $\bar{c}_m$, and the average housing price, $\bar{P}_m$, and computes her predicted revenue as

$$\tilde{R}_{im} \equiv \tilde{R}_{im}\left(X^r_{im}, Z^r_m, P_m, \frac{Q_m}{S_m}, \sigma_m\right) = \bar{c}_m \bar{P}_m \, \tilde{q}_{im}\left(X^r_{im}, Z^r_m, P_m, \frac{Q_m}{S_m}, \sigma_m\right).$$

Because $\bar{c}_m \bar{P}_m Q_m = \tilde{R}_m$, we obtain (7). Given $\tilde{q}_{im}$, agent $i$ is assumed to predict her variable cost by plugging $\tilde{q}_{im}$ into the known cost function.

Once potential agent $i$ has predicted her expected variable profit as above, she decides to enter the market as long as her predicted variable profit is at least as large as her fixed cost, which suggests the following choice probability of entry:

$$\begin{aligned}
\Pr\{d_{im} = 1 \mid X_{im}, Z_m, P_m, Q_m, S_m, \sigma_m\} &= \Pr\{\tilde{R}_{im} - C_m(\tilde{q}_{im}) \geq F_{im}\} \\
&= \Phi\left(\frac{\tilde{R}_{im}\left(X^r_{im}, Z^r_m, P_m, \frac{Q_m}{S_m}, \sigma_m\right) - C_m\left(\tilde{q}_{im}\left(X^v_{im}, Z^v_m, P_m, \frac{Q_m}{S_m}, \sigma_m\right)\right) - F_{im}(X^f_{im}, Z^f_m)}{\lambda_m}\right),
\end{aligned} \tag{8}$$

where $d_{im}$ is an indicator for whether agent $i$ enters market $m$, $\Phi(\cdot)$ is the cumulative distribution function of the standard normal, and we assume that $\omega_{im}$ follows the normal distribution $N(0, \lambda_m^2)$.

The main dependent variable in our model is $N_m$, and the probit model in (8) generates $N_m$ based on each potential agent's profit, which depends on $\sigma_m$, the agent's conjecture about the fraction of potential agents who become realtors. For this belief to be rational, it should coincide with the probability of entry predicted from (8).
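Specifically, the equilibrium $\sigma^*_m$ should satisfy the following condition:

$$\sigma^*_m = \int \Pr\left\{\tilde{R}_{im}(\sigma^*_m) - C\left(\frac{\tilde{R}_{im}(\sigma^*_m)}{\tilde{R}_m} Q_m, \sigma^*_m\right) \geq F_{im}\right\} dG(X_{im}), \tag{9}$$

where $\tilde{R}_{im} = \hat{\gamma}_0 + \hat{\gamma}_1 h_m + \hat{\gamma}_2 \sigma_m + \hat{\gamma}_3 \bar{P}_m + Z_m \hat{\delta} + X_{im} \hat{\beta}$, and $G(X_{im})$ is the distribution of individual characteristics $X_{im}$ within market $m$.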
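The logic of (7), (8) and (9) can be sketched numerically for fixed parameters: predicted quantities follow from revenue shares, each potential agent's entry probability is a probit in net profit, and the equilibrium conjecture is the share that reproduces itself. Everything below is a stylized illustration: the revenue rule, the parameter values, the treatment of total market revenue as a given number, and the damped iteration are assumptions, not the paper's estimated model or algorithm.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def variable_cost(q, sigma, base, alpha, theta2, theta3):
    """Equation (4), with Z^v delta^v + theta_1 collapsed into `base`."""
    return (base + alpha * sigma + theta2 * q + theta3 * q**2) * q


def entry_probabilities(sigma, agents, market, cost):
    """Equation (8) for each potential agent, given the conjecture sigma."""
    probs = []
    for agent in agents:
        # Hypothetical revenue rule: revenue falls as the entrant share rises.
        revenue = agent["revenue_base"] * (1.0 - market["gamma"] * sigma)
        # Equation (7): quantity share equals revenue share.
        q = revenue / market["total_revenue"] * market["transactions"]
        profit = revenue - variable_cost(q, sigma, **cost) - agent["fixed_cost"]
        probs.append(PHI(profit / market["lam"]))
    return probs


def solve_sigma(agents, market, cost, sigma0=0.5, damping=0.5,
                tol=1e-10, max_iter=1000):
    """Damped fixed-point iteration on sigma = mean entry probability."""
    sigma = sigma0
    for _ in range(max_iter):
        implied = sum(entry_probabilities(sigma, agents, market, cost)) / len(agents)
        if abs(implied - sigma) < tol:
            return sigma
        sigma = damping * implied + (1.0 - damping) * sigma
    raise RuntimeError("fixed-point iteration did not converge")


# Made-up inputs for three potential agents in one market.
agents = [{"revenue_base": 40.0, "fixed_cost": 30.0},
          {"revenue_base": 60.0, "fixed_cost": 30.0},
          {"revenue_base": 80.0, "fixed_cost": 30.0}]
market = {"gamma": 0.5, "transactions": 100.0, "total_revenue": 1000.0, "lam": 20.0}
cost = {"base": 1.0, "alpha": 2.0, "theta2": 0.0, "theta3": 0.01}

sigma_star = solve_sigma(agents, market, cost)
print(round(sigma_star, 4))
```

The unweighted average over agents stands in for the integral over $G(X_{im})$. Damping is a convenience for this toy example; convergence of such an iteration is not guaranteed in general.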

2.5 Estimation

If the probit model (8) did not depend on $\sigma_m$, we could estimate the parameters by maximum likelihood for a standard probit, except that $\tilde{R}_{im}$ would need to be estimated before applying the probit. Because the model depends on $\sigma_m$, however, we need to impose the equilibrium condition in (9). Several empirical studies on games with incomplete information (e.g. Augereau, et al. (2006); Seim (2006)) consider similar equilibrium conditions in probability space and use the nested fixed point algorithm, in which the outer algorithm maximizes a likelihood function, while the inner algorithm solves for the fixed point given the fixed parameters. Applying the nested fixed point algorithm to our context is difficult, however, since $\sigma_m$ enters equation (9) highly nonlinearly. In contrast, the approach proposed by Aguirregabiria and Mira (2002, 2007), which they call the nested pseudo likelihood (NPL) algorithm, is more straightforward to apply to our context.

Because we can easily compute nonparametric estimates of $\sigma_m$ by computing the fraction of realtors in each market,16 we can use $\frac{N_m}{S_m}$ as an initial guess for $\sigma_m$. We then estimate $\tilde{R}_{im}$ and finally estimate the probit model. This completes the first iteration. Using the estimates from the first iteration, we can predict $\hat{\sigma}^1_m$. More specifically, we use the following equation to predict $\hat{\sigma}^{k+1}_m$:

$$\hat{\sigma}^{k+1}_m = \sum_{i=1}^{S_m} \Pr\left\{\tilde{R}_{im}(\hat{\sigma}^k_m) - C\left(\frac{\tilde{R}_{im}(\hat{\sigma}^k_m)}{\tilde{R}_m} Q_m, \hat{\sigma}^k_m\right) \geq F_{im}\right\} \frac{weight_{im}}{\sum_{l=1}^{S_m} weight_{lm}}, \tag{10}$$

where we use $\frac{N_m}{S_m}$ for $\hat{\sigma}^0_m$, and the weights are provided by the Census data. We then replace $\hat{\sigma}^0_m$ with $\hat{\sigma}^1_m$ and repeat the same probit estimation. This completes the second iteration. We iterate this procedure until $\hat{\sigma}^k_m$ converges.
This approach is a simple application of the NPL algorithm, in which the order of the standard nested fixed point algorithm is swapped: the outer algorithm iterates on the choice probability to solve the fixed point problem, while the inner algorithm maximizes a pseudo likelihood function given the fixed choice probability. In addition to being straightforward to apply in our context, the NPL algorithm has another advantage: it can be extended to more complicated likelihood functions that allow for common unobservables determining both the entry decision and other potentially endogenous variables such as housing prices. We discuss this extension in the following section.

16 A consistent estimator for $\sigma_m$ is $\frac{N_m}{S_m}$, in that $\sigma_m = \operatorname{plim}_{S_m \to \infty} \frac{N_m}{S_m}$.
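The swap between the inner and outer steps can be sketched on a toy problem. The model below is deliberately much simpler than the paper's: a single market, one covariate `x`, an entry index `beta * x - sigma` with the coefficient on the conjecture fixed at minus one, equal weights, and a golden-section search in place of a full probit routine. The data are invented. It is meant only to show the structure of the iteration.

```python
import math
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def pseudo_loglik(beta, sigma, x, d):
    """Probit pseudo log-likelihood with the conjecture sigma held fixed."""
    total = 0.0
    for xi, di in zip(x, d):
        p = min(max(PHI(beta * xi - sigma), 1e-12), 1.0 - 1e-12)  # guard log(0)
        total += math.log(p) if di else math.log(1.0 - p)
    return total


def maximize_1d(f, lo, hi, tol=1e-9):
    """Golden-section search for the maximizer of a unimodal function."""
    ratio = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        left, right = b - ratio * (b - a), a + ratio * (b - a)
        if f(left) > f(right):
            b = right
        else:
            a = left
    return 0.5 * (a + b)


def npl(x, d, tol=1e-6, max_iter=200):
    """Outer loop updates sigma; inner step maximizes the pseudo-likelihood."""
    sigma = sum(d) / len(d)          # initial guess: observed share N / S
    beta = 0.0
    for _ in range(max_iter):
        beta = maximize_1d(lambda b: pseudo_loglik(b, sigma, x, d), -10.0, 10.0)
        new_sigma = sum(PHI(beta * xi - sigma) for xi in x) / len(x)
        if abs(new_sigma - sigma) < tol:
            return beta, new_sigma
        sigma = new_sigma
    return beta, sigma               # returned without a convergence guarantee


# Invented data: covariate and entry indicator for eight potential agents.
x = [0.2, 0.5, 0.8, 1.0, 1.3, 1.6, 2.0, 2.4]
d = [0, 0, 1, 0, 1, 0, 1, 1]
beta_hat, sigma_hat = npl(x, d)
print(round(beta_hat, 3), round(sigma_hat, 3))
```

In an actual application the inner step would be a multi-parameter probit with the Census weights entering the update of the conjecture, as in (10).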

2.6 Identification

The goal of our estimation is to uncover the cost function, which requires us to distinguish between revenues and costs. For this reason, we first exploit the information on realtors' earnings in the Census data, since realtors' earnings consist mostly of commission fees.17 Using earnings information, we thus predict revenues and quantities for potential agents. To identify the cost functions, however, we further need exogenous variation that shifts revenues but not costs. In this regard, recall that the variable cost function in (4) depends on $q_{im}$, $\frac{N_m}{S_m}$ (or $\sigma_m$), and $Z^v_m$. The quantity $q_{im}$ depends on revenues, because revenues and quantities are predicted at the same time in our model. To the extent that exogenous shifters of revenues do not affect $\sigma_m$ and $Z^v_m$, revenue shifters change only $q_{im}$ in the cost function, which enables us to trace out the variable cost function. However, unobserved market-specific factors may affect both revenues and the entry decision, in which case revenue shifters might be correlated with unobserved factors that are also correlated with the entry decision, or $\sigma_m$. For this reason, we exploit the panel structure of our data and include MSA fixed effects in both revenues and the choice probability of entry. Note also that we additionally rely on functional forms to separate revenues, variable costs, and fixed costs.

2.7 Extension to Endogenous Housing Price

So far, our model has taken the number of transactions, $Q_m$, and the house price, $P_m$, as exogenously given. In this section, we relax this assumption by allowing them to be endogenously related to the number of entrants in a given market, $N_m$. Through a process we do not explicitly model, the equilibrium levels of quantity and price are specified as:

$$\log(Q_m) = \delta_n \log(N_m) + Z^q_m \delta_q + \epsilon_m \tag{11}$$

$$\log(P_m) = \alpha_n \log(N_m) + Z^p_m \alpha_p + \lambda_p \epsilon_m \tag{12}$$

Both house prices and transactions are functions of demand shifters, $Z^q_m$ and $Z^p_m$, and the number of agents, $N_m$. In particular, $\delta_n$ and $\alpha_n$ measure the effects of $\log N_m$ on $\log Q_m$ and $\log P_m$, respectively. If competition among agents pushes up the number of transactions as agents approach potential clients more aggressively, then we would expect a positive sign for $\delta_n$.

17 See the occupation description for real estate brokers and sales agents in the Dictionary of Occupational Titles by the U.S. Department of Labor, which has since been replaced by the O*NET at http://online.onetcenter.org.

Similarly, if competition among agents drives up the listing price of a given property, we would expect a positive sign for $\alpha_n$. The instrumental variables for $Q_m$ include the MSA-level inflows and outflows. The instrumental variables for $P_m$ include both inflows and outflows and the housing supply constraint index developed by Saks (2005).18 Presumably, these variables affect the total number of transactions and the average house price in a given market but not an individual agent's decision to become a realtor.

Correlation between the two equations enters through $\epsilon_m$, which we assume is independent of the other errors in the revenue function and in the fixed cost function. $\epsilon_m$ may represent unobserved demand shifters or unobserved heterogeneity in market competition. The inclusion of $\epsilon$ creates a selection effect. Suppose that the unobserved factors in $\epsilon$ are demand shifters. Markets with higher than average $\epsilon$ will have both a higher than average number of transactions and a higher than average house price.

For simplicity, we rewrite equations (11) and (12) as $Q_m(\epsilon)$ and $P_m(\epsilon)$. Conditional on $\epsilon$, the likelihood function of observed entries is given by $\Pr(d_{im} \mid X_m, Z_m, P_m(\epsilon), Q_m(\epsilon))$. To obtain the unconditional likelihood function, we must integrate over $\epsilon$, which requires choosing a distribution for $\epsilon$. Rather than using a particular distributional form for $\epsilon$, we follow Heckman and Singer (1986) and approximate the distribution of $\epsilon$ using a discrete factor approximation. Specifically, let the distribution of $\epsilon$ with $L$ points of support be:

$$P\{\epsilon = \mu_l\} = p_l, \quad l = 1, \ldots, L,$$

$$p_l \geq 0 \ \ \forall l, \qquad \sum_{l} p_l = 1, \qquad \sum_{l} p_l \mu_l = 0.$$

Then the unconditional likelihood function is:

$$\sum_{l=1}^{L} p_l \Pr(d_{im} \mid X_{im}, Z_m, P_m(\mu_l), Q_m(\mu_l)).$$

Note that given this likelihood function, we can apply the NPL algorithm. To do so, we replace $N_m$ with $S_m \sigma_m$ and follow the same procedure as in Section 2.5.
18 Saks (2005) creates an index of housing supply regulation using information from six separate sources: Wharton Urban Decentralization Project, Regional Council of Governments, International City Management Association, Fiscal Austerity and Urban Innovation, National Register of Historic Places, and American Institute of Planners. See the appendix of that paper for a ranking of metropolitan areas by housing supply regulation.
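The discrete factor approximation in Section 2.7 amounts to a probability-weighted sum of conditional likelihoods over the support points. A minimal sketch follows, with a hypothetical conditional entry probability in which the support point simply shifts a probit index; the support points, probabilities, and index are made-up values.

```python
from statistics import NormalDist

PHI = NormalDist().cdf  # standard normal CDF


def check_discrete_factor(support, probs, tol=1e-9):
    """Check the restrictions p_l >= 0, sum p_l = 1 and sum p_l * mu_l = 0."""
    if len(support) != len(probs):
        raise ValueError("support and probs must have the same length")
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be nonnegative")
    if abs(sum(probs) - 1.0) > tol:
        raise ValueError("probabilities must sum to one")
    if abs(sum(p * mu for p, mu in zip(probs, support))) > tol:
        raise ValueError("the discrete factor must have mean zero")


def unconditional_likelihood(cond_likelihood, support, probs):
    """Sum over l of p_l * Pr(d | ..., mu_l), with cond_likelihood(mu) user-supplied."""
    check_discrete_factor(support, probs)
    return sum(p * cond_likelihood(mu) for p, mu in zip(probs, support))


def entry_likelihood(d, index):
    """Hypothetical conditional likelihood: the factor shifts a probit index."""
    return lambda mu: PHI(index + mu) if d == 1 else 1.0 - PHI(index + mu)


support, probs = [-1.0, 1.0], [0.5, 0.5]   # assumed two-point distribution
lik = unconditional_likelihood(entry_likelihood(d=1, index=0.0), support, probs)
print(round(lik, 6))
```

In estimation, the support points and their probabilities would be parameters estimated together with the others, subject to the restrictions checked above.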

3 Data

3.1 Market Structure

Our main data source is the 5 percent sample of the Census of Population and Housing in 1990 and 2000, commonly referred to as the PUMS 1990 and PUMS 2000. Markets for real estate services are local, owing to the nature of the service.19 There is no single, agreed-upon method for defining markets empirically, although it is clear that markets should be self-contained in the sense that there is no relevant competition from outside the market. We thus follow Bresnahan and Reiss (1991) by focusing on geographically isolated markets as a way of minimizing the possibility of competition from outside the defined market. In particular, we define a market as a metropolitan statistical area (MSA). Table 1 presents the sample statistics from the PUMS 2000. In 2000 we have a sample of 289 MSAs. Among them, 218 MSAs are free-standing metropolitan statistical areas, which are generally surrounded by non-metropolitan territory and therefore are not integrated with other metropolitan areas.

We choose to model the entry decision at the level of the real estate agent and broker rather than the brokerage firm. This is because, in the real estate market, the brokerage firm is relatively unimportant, and the important capital and goodwill belong to the salesperson. We identify real estate agents as individuals who reported their occupation as a real estate sales occupation in the Census.20 Table 2 presents the differences between real estate agents and brokers and other occupations in 2000. On average, real estate agents and brokers tend to be older, more educated and more likely to be married. Note that real estate agents and brokers tend to earn higher incomes than non-realtors. However, their income is also more volatile, suggesting that their job is exposed to more market risk than those of non-realtors. Table 3 presents evidence on heterogeneity among real estate agents and brokers in 2000.
The large variation in agent demographics and earnings suggests that it is important to allow for heterogeneity not only at the market level but also at the individual level.

As one way to assess the reliability of the Census measures of the number of real estate agents and brokers and of their annual earnings, we use data from the Occupation Employment Statistics collected by the Bureau of Labor Statistics. The numbers are fairly consistent. This suggests that self-reported occupation and self-reported income in the Census reflect fairly accurately the actual number of realtors and their actual earnings.

Table 4 presents summary statistics across different markets in both 1990 and 2000. To measure the number of house transactions in a city, we use information on the date on which the household moved to its current house, along with information on whether the household owns the house in which it lives. In an average MSA in 2000, we have a sample of 33,539 people, 842 house transactions, and 86 realtors, while in an average MSA in 1990 we have a sample of 29,527 people, 662 house transactions, and 110 realtors. The Census also asks homeowners about the value of their house, which is virtually identical to the average price of houses sold in the last year. The average value of all houses is $176,659.80, while the average value of houses sold is $188,025.70. In this paper, we take the second measure as the measure of the price of housing.

Note that one cannot assume that all the transactions reported in the Census are conducted through real estate agents and brokers. Clearly, some home sellers choose to avoid paying brokerage fees by selling their homes themselves. Hendel, Nevo and Ortalo-Magne (2007) provide evidence on the importance of for-sale-by-owner (FSBO) transactions. However, using the National Home Buyer and Seller Survey data (2005), we find that two out of five of these FSBO transactions are between closely related parties, such as friends and relatives. Because these 40% of FSBOs are not placed on the open market, they are not in direct competition with realtors and therefore have no effect on our estimates.

19 Competition among realtors is local because real estate is fixed in a geographic location, and buyers and sellers often want in-person interactions with agents who have experience and expertise in that particular location.

20 The occupation code for real estate occupations in the 2000 Census is 419020. This classification includes both real estate agents and brokers, but we disregard this distinction for most of the analysis.
To the extent that the rest of the FSBO transactions are negatively correlated with the market share of realtors, our evidence of wasteful non-price competition, if any, will provide a lower bound for the true effect.21

Table 5 shows the structure of the real estate brokerage market in 2000 and 1990, respectively. In both years, as the number of realtors increases, the average number of households, average house value, and average realtor earnings increase substantially. Following Hsieh and Moretti (2003), we compute two measures of the average productivity of real estate agents and brokers: sales per agent and sales per hour. Both measures of average productivity decrease with the number and share of realtors in the local market. If everything else were the same, one could interpret this pattern as an indicator of excessive entry: average cost per transaction increases with the number of realtors.

However, one may also worry that the negative correlation between average productivity and the number of agents simply reflects unobserved heterogeneity across markets. For example, more educated and skilled people tend to live together. In cities with better economic conditions and more job opportunities, we may find fewer but more capable brokers. To address this concern, Table 6 compares the percentage changes in average productivity between 1990 and 2000 across 6 groups of MSAs with different percentage changes in the number of realtors. All time-invariant unobserved market heterogeneity is differenced out by controlling for MSA fixed effects. Using the 142 MSAs whose geographical boundaries do not change between 1990 and 2000, we find that the higher the percentage increase in the number of realtors, the lower the percentage increase in average productivity. At one extreme, for cities experiencing the highest increase in the number of agents, the average sale per agent decreases by 10% and the average hour per agent decreases by 21%. While the evidence here is supportive of our main hypothesis, it is only suggestive. The next section provides a formal test of this hypothesis.

21 The resulting bias depends on the extent to which the number of houses sold directly by the owner is correlated with the number of transactions with agents' assistance. For example, when interpreting the estimated effect of wasteful non-price competition on marginal cost, the negative correlation between the FSBO share and the realtor-facilitated transaction share implies that the evidence of wasteful promotional competition, if any, would be underestimated. Our estimate would therefore provide a lower bound for the true wasteful competition in marginal cost.

3.2 Fixed Costs Shifters

Both fixed costs and marginal costs are affected by exogenous cost shifters. In this subsection, we consider three measures of fixed costs. First, real estate agents and brokers need to pass a number of exams to obtain the license. Our data on the licensing requirements come from the Department of Financial and Professional Regulation's website.
We use the following variables as measures of agents' fixed costs: the number of hours required for the real estate transaction course, the requirement for license renewal, and exam fees. Table 6 provides the number of hours, the license fees and the exam fees in 2000. There is significant variation across state requirements. For example, Texas requires a minimum of 270 study hours before an individual can take the exam. On the other hand, Alaska requires only 20 study hours.22

22 Brokerage entry appears to be more difficult than agent entry. At a minimum, an entrant that wants to establish a brokerage must hire or become a licensed broker. Additionally, an entering broker may require an