Jamal Munshi, Sonoma State University, 1996
Many methodological flaws have become commonplace in financial research to the point that familiarity has bred de facto acceptance. Some of these only require refinements in the findings, but in most cases the methods used call the findings themselves into question. In this paper we survey these flaws and investigate their potential impact on conclusions normally drawn from the research. We use simulation to demonstrate how an incorrect conclusion about a known population may be drawn using these methods. We then offer alternative methods that may be used to avoid the pitfalls described. Finally, we examine the validity of the "large sample" statistical model in understanding phenomena and developing theory in financial economics.
Finance, like the other social sciences, inherited a version of the scientific method from the natural sciences, and to this day classical Neyman-Pearson large sample hypothesis testing is generally accepted as the only credible research tool in finance. In this research model, we imagine that we live in the world of Ho, the null hypothesis, where entropy is maximized and no patterns, and therefore no information, exist. To find information we seek patterns. We do this by randomly selecting a finite sample from an infinity of possibilities. This is our observation; and given the distributional properties of Ho we may compute the odds that we would observe such a sample by chance. If this probability turns out to be lower than a threshold value of unlikelihood, we proclaim that it is unlikely that we would make such an observation in Ho; therefore we must not be in Ho. The distributional properties of Ho are usually assumed to be described by the Gaussian function.
In practice, financial researchers often stray from this pattern seeking model partly because of convenience and partly because of unfamiliarity with the actual statistical underpinnings of the research methods; but more importantly, I believe, because the methodology, though a stunning success in the natural sciences, is not a good fit in finance. The Krueger and Kennedy (1990) finding that stock market performance is highly correlated with football games is a harsh reminder that spurious statistics do exist, but such a faux pas is harder to identify when the relationships found are the ones we are looking for, or ones that are easily rationalized in terms of a dominant theoretical framework.
A commonly found departure from the pattern seeking model is to look for a number of different patterns. The more patterns you look for, the greater the odds that a chance pattern will be observed in a sample taken from the Ho distribution. The procedure compromises the statistical method because even in the world of Ho, if you look long enough you will find the pattern you are looking for. In Finance papers multiple comparisons are frequently used in conjunction with multiple values of the alpha probability and ex post selection of the direction of the effect. Table 1 shows a typical result (Aggarwal 1990). A common practice is to set up a table of many t-tests and then to identify the 'improbable' ones with asterisks. The usual code is * = 'significant at alpha=10%', ** = 'significant at alpha=5%', and *** = 'significant at alpha=1%'.
In Table 2 we perform a similar test on 10 samples of n=100 drawn from the Ho population and find three spurious 'rejections' which might have been reported as findings in financial research. The example underlines the need for careful analysis of multiple comparison cases.
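The experiment behind Table 2 can be imitated in a few lines. What follows is a minimal sketch and not the simulation actually used for the table: it assumes a standard normal Ho population, repeats the 10-sample experiment many times, and hardcodes 1.984 as the approximate two-sided 5% critical value of t with 99 degrees of freedom. The function names and the seed are illustrative only.

```python
import math
import random

N = 100                    # observations per sample, as in Table 2
T_CRIT_5PCT_DF99 = 1.984   # assumed: approx. two-sided 5% critical t, 99 d.f.

def count_rejections(rng, n_samples=10):
    """Count how many of n_samples one-sample t-tests of Ho: mu = 0
    reject at the 5% level when every sample really comes from Ho."""
    rejections = 0
    for _ in range(n_samples):
        sample = [rng.gauss(0.0, 1.0) for _ in range(N)]
        mean = sum(sample) / N
        variance = sum((x - mean) ** 2 for x in sample) / (N - 1)
        t_stat = mean / math.sqrt(variance / N)
        if abs(t_stat) > T_CRIT_5PCT_DF99:
            rejections += 1
    return rejections

def share_with_spurious_finding(n_experiments=1000, seed=1):
    """Fraction of simulated 10-test experiments with at least one rejection."""
    rng = random.Random(seed)
    hits = sum(count_rejections(rng) > 0 for _ in range(n_experiments))
    return hits / n_experiments

if __name__ == "__main__":
    print(share_with_spurious_finding())
```

Under these assumptions the printed share should land near 1 - 0.95^10, that is, roughly four experiments in ten report at least one 'finding' although none exists.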
At an alpha level of 5%, the probability that all 10 samples would fail to reject Ho is 0.95^10 or about 60%. This means that there is a 40% probability that at least one chance rejection will occur. In other words, the experimentwide error rate is 40% instead of 5%. To hold the experimentwide error rate at 5%, each comparison must be made at approximately alpha = 0.005. In general, when making m independent comparisons, the single comparison alpha level (ac) and the experimentwide alpha level (ae) are related by the equation
(1 - ac)^m = (1 - ae)
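The relation can be evaluated in either direction. The short sketch below is illustrative only; the function names are not from the paper, and the comparisons are assumed to be independent.

```python
def experimentwide_alpha(alpha_c, m):
    """ae implied by making m independent comparisons, each at level ac."""
    return 1.0 - (1.0 - alpha_c) ** m

def per_comparison_alpha(alpha_e, m):
    """ac needed so that m independent comparisons give experimentwide level ae."""
    return 1.0 - (1.0 - alpha_e) ** (1.0 / m)

print(round(experimentwide_alpha(0.05, 10), 3))   # 0.401
print(round(per_comparison_alpha(0.05, 10), 4))   # 0.0051
```

For m = 10 this reproduces the figures in the text: an experimentwide rate of about 40% when each test is run at 5%, and a per-comparison level of about 0.005 to hold the experimentwide rate at 5%.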
In multiple linear regression models, that is, models with many explanatory x-variables, the beta coefficient of each x-variable may be interpreted only if the x-variables act independently on y, the response variable. When the x-variables are correlated, or even when some linear combinations of the x-variables are correlated, the regression coefficients become unstable so that neither the direction nor the magnitude of the observed 'effect' may have a valid interpretation.
Most papers in Finance that use linear regression do not check for this condition and the few that do refer only to a correlation matrix of the x-variables. A correlation matrix is unable to detect correlations between linear combinations.
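A small simulation shows why the correlation matrix can miss the problem. This is a sketch on invented data, not data from any study cited here: x4 is built as the sum of three independent variables plus a little noise. The variance inflation factor, 1/(1 - R-squared), is used as one common diagnostic; it is a swapped-in illustration rather than a method proposed in this paper, and all helper names are hypothetical.

```python
import random

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def solve(a, b):
    """Solve the small linear system a z = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * z[c] for c in range(r + 1, n))
        z[r] = (m[r][n] - tail) / m[r][r]
    return z

def r_squared(y, xs):
    """R-squared from regressing y on the columns in xs, with an intercept."""
    n = len(y)
    my = sum(y) / n
    yc = [v - my for v in y]
    xc = []
    for x in xs:
        mx = sum(x) / n
        xc.append([v - mx for v in x])
    xtx = [[sum(p * q for p, q in zip(xi, xj)) for xj in xc] for xi in xc]
    xty = [sum(p * q for p, q in zip(xi, yc)) for xi in xc]
    coef = solve(xtx, xty)
    explained = sum(c * v for c, v in zip(coef, xty))
    return explained / sum(v * v for v in yc)

rng = random.Random(0)   # arbitrary seed
n = 500
x1 = [rng.gauss(0, 1) for _ in range(n)]
x2 = [rng.gauss(0, 1) for _ in range(n)]
x3 = [rng.gauss(0, 1) for _ in range(n)]
x4 = [a + b + c + rng.gauss(0, 0.1) for a, b, c in zip(x1, x2, x3)]

max_pairwise = max(abs(pearson(x4, x)) for x in (x1, x2, x3))
r2 = r_squared(x4, [x1, x2, x3])
vif = 1.0 / (1.0 - r2)
print(max_pairwise, r2, vif)
```

In this construction no single pairwise correlation with x4 looks alarming, yet x4 is almost perfectly explained by the other three variables taken together, which is exactly the dependency a correlation matrix does not reveal.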
Some Examples of Multi-collinearity in Financial Research
(i) The Intervention dummy variable in time
An example model is presented in Table 3 (Blennerhassett and Bowman 1994). The authors wish to measure the impact of the screen trading intervention in the time series of off-market trading. But here they are analyzing the data in real time and not in event time, and the authors suspect that the model is subject to historical effects. An intervention dummy variable identifies the implementation of screen trading. The addition of the time variable to the model is an attempt to remove historical effects. But the dummy variable is also a time indicator. Therefore the two variables are highly correlated. Simulated data in Table 3 show a time variable with two different intervention dummy variables. Note that both dummies are highly correlated with time and, unless one dummy is symmetrical around the other, the dummy variables are also likely to be correlated with each other. Models such as these are common in financial research that attempts to model the impact of a historical event such as a change in regulation, market structure, interest rates, trade agreements, and so on.
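The correlation between a time index and a step dummy is easy to verify. The sketch below uses 100 periods and two hypothetical intervention dates (after period 50 and after period 70); these values are assumptions for illustration and are not the data of Table 3.

```python
def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

n_periods = 100
time = list(range(1, n_periods + 1))
dummy_a = [1 if t > 50 else 0 for t in time]   # hypothetical intervention A
dummy_b = [1 if t > 70 else 0 for t in time]   # hypothetical intervention B

corr_time_a = pearson(time, dummy_a)
corr_time_b = pearson(time, dummy_b)
corr_a_b = pearson(dummy_a, dummy_b)
print(corr_time_a, corr_time_b, corr_a_b)
```

With these break points the three correlations come out at roughly 0.87, 0.79, and 0.65, so a model containing both t and D cannot cleanly separate the intervention from the passage of time.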
(ii) Explanatory variables that are algebraically related
Capital structure, that is, the amount of debt a firm carries relative to equity, is a contentious and unresolved topic in Finance (DeAngelo and Masulis 1980; Bowen, Daley, and Huber 1982; Bradley, Jarrell, and Kim 1984; Kane, Marcus, and McDonald 1984; Kim and Sorensen 1986). A popular model of capital structure maintained that the value of debt is that it is a tax shield and, therefore, that firms that enjoy other forms of tax shields such as depreciation would use less debt. The empirical results are conflicting and inconclusive. The regression coefficients of the models differ in sign from one study to another. Some found the non-debt tax shield (ndts) coefficient to be negative and argued that the ndts model is correct since it showed that firms with non-debt tax shields used less debt. Others found the coefficient to be positive and argued that firms with more tax shields enjoyed higher cash flows and therefore had larger debt capacities. Still others found no relationship between ndts and capital structure.
A possible explanation for the failure of these empirical studies is the natural algebraic relationship that exists between total assets and depreciation. Since both of these variables were used in the model, the regression coefficients were unstable, and their spurious signs and magnitudes were interpreted as financial theory. The empirical results summarized in Table 4 reveal their random and contradictory nature.
(iii) Linear combinations of explanatory variables are correlated
Frequently the starting point of empirical investigations in finance is not theory but an available database such as COMPUSTAT or CRSP. All the variables in the database become candidates for the model and models built in this manner are frequently unstable even when a correlation matrix of the explanatory variables does not reveal the extent of the dependencies between them.
One of the many models proposed for measuring the determinants of capital structure includes D = total debt, E = book value of equity, and A = total assets as explanatory variables (Kim and Sorensen 1986). The three variables have an exact accounting relationship: A = D + E.
(iv) The small firm effect
Banz (1981) first reported that stocks of small firms tend to have higher returns and this relationship was quickly confirmed by studies that followed. Investigators of asset pricing and market efficiency saw Banz's size effect as an anomaly that must be removed from the data before other effects can be detected. Fama and French (1992) first removed the size effect from the data before they tested the relationship between beta and returns. If the Sharpe (1964) model is correct, they reasoned, then there ought to be a significant and positive relationship between beta risk and returns. They found that the relationship if any between these variables was negative.
But small firms not only have higher returns; they also have higher betas. Chan and Chen (1988) report a very strong correlation between beta and size. This relationship implies that tests of asset pricing such as those of Fama and French (1992) and Fama and MacBeth (1973) are likely to produce spurious values of the coefficients for beta risk; and this is indeed the case. Some studies have found a strong positive effect of beta risk that is consistent with the CAPM (capital asset pricing model) while others have found no significant relationship and still others report a relationship in the opposite direction. These empirical results have caused a great deal of controversy and have forced financial economists to re-evaluate some fundamental assumptions of asset pricing. Yet, these findings may turn out to be a fluke of multi-collinearity.
For example, when Fama and French (1992) removed the size effect from the data they may have also removed that which they intended to measure - the beta effect. A failure to find a beta effect in the residuals of the size effect does not imply an absence of a beta effect. There are better ways to find the unique contributions of the two correlated variables. For example, one might first regress beta against size and use these residuals as the unique contribution of beta and then regress size against beta and use those residuals as the unique contribution of size. Alternately, one might look for orthogonal principal components of size and beta.
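The first of these suggestions, using regression residuals as the unique contribution of each variable, can be sketched as follows. The data are simulated, with an assumed negative relationship between size and beta; the coefficients, seed, and function names are illustrative and do not come from any of the studies cited.

```python
import random

def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def ols_residuals(y, x):
    """Residuals from the simple regression y = a + b*x."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sxy / sxx
    intercept = my - slope * mx
    return [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]

rng = random.Random(42)   # arbitrary seed
n_firms = 1000
size = [rng.gauss(0.0, 1.0) for _ in range(n_firms)]           # e.g. standardized log size
beta = [1.0 - 0.5 * s + rng.gauss(0.0, 0.3) for s in size]     # assumed: small firms, high beta

beta_unique = ols_residuals(beta, size)   # part of beta not explained by size
size_unique = ols_residuals(size, beta)   # part of size not explained by beta

print(pearson(beta, size))          # strongly negative by construction
print(pearson(beta_unique, size))   # zero up to rounding error
print(pearson(size_unique, beta))   # zero up to rounding error
```

Each residual series is uncorrelated with the variable it was regressed on, so it can enter a returns regression without the instability caused by the original collinearity.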
An added complication in asset pricing research is that some of the empirical models include the PE ratio (PE = stock price over accounting earnings) as an explanatory variable in addition to the risk measure beta. But PE too contains a risk measure. Current financial theory interprets PE as a combination of two effects. Ceteris paribus, higher perceived risk would lower the PE ratio and higher perceived growth would raise the PE ratio. Empirical studies are complicated by a high collinearity between PE and beta. There are other problems with asset pricing studies that have to do with the time series nature of the data and the methods by which the concept of risk is rendered, and we review these concerns in another paper.
STATISTICALLY SIGNIFICANT FINDINGS FROM VERY LARGE SAMPLES
Sample sizes in financial research are typically very large. For example, the Fama and French (1992) paper on asset pricing models is a study of monthly stock returns from over 2000 firms in the period 1962 to 1989. This represents over 672,000 observations. When we treat these observations as a sample we get a sampling distribution with a tiny standard deviation. For instance, the degrees of freedom are approximately 670,000, whose square root is about 818. In this case the standard deviation of returns was 52%. This means that the standard deviation of the sampling distribution is about 0.0636% (52%/818), or about 6 basis points, and a difference as small as 12 basis points might be considered significant although no fund manager would pursue a policy to take advantage of such a small effect. Many of the empirical investigations of the determinants of capital structure cited in this paper have proposed regression models that were found to be statistically significant at r-squared values of 1 to 5%. That is to say, 95 to 99% of the sum of squares in the sample remains unexplained and is considered by the model to be random background noise.
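The arithmetic behind the sampling distribution can be checked directly. This is only a restatement of the figures quoted above, using the approximate degrees of freedom from the text; the variable names are illustrative.

```python
import math

n_obs = 670_000      # approximate degrees of freedom quoted in the text
sd_returns = 52.0    # standard deviation of returns, in percent

std_error = sd_returns / math.sqrt(n_obs)   # standard error, in percent
std_error_bp = std_error * 100              # same quantity, in basis points
two_std_errors_bp = 2 * std_error_bp        # rough threshold for 'significance'

print(std_error, std_error_bp, two_std_errors_bp)
```

The standard error comes to roughly 0.064%, or a little over 6 basis points, so a difference of about two standard errors, on the order of 12 to 13 basis points, would already pass a conventional 5% test.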
FAILURE OF CLASSICAL LARGE SAMPLE STATISTICS IN FINANCE
Black (1993) observes that although there are many researchers in finance, perhaps thousands, there is only one history, and therefore, pretty much just one set of data. Researchers use past studies to formulate new hypotheses and models. They then test these models on the same data that were used to test the past hypotheses. Recall that classical hypothesis testing is formulated on the basis of a priori hypotheses and the probability of the data given the null. But in finance we find ourselves repeatedly testing the probability of the data given the hypothesis given the data; i.e. testing for patterns ex post.
In this sense classical statistics is a failure in financial research and I would like to propose a re-evaluation of our research agenda and methodology and an emphasis on well designed case studies on small samples and even anecdotes.
TABLE 1: A MULTIPLE COMPARISON EXAMPLE
Variable sample statistic Rejection status
Rejection status: - = fail to reject, * = reject at alpha=10%, **=reject at alpha=5%
TABLE 2: MULTIPLE COMPARISON OF RANDOM NUMBERS
t-Test: Ho: µ = 0 Ha: µ ≠ 0
y1 :Sample Mean = -0.11183675 t-Statistic = -1.342 p=0.1828 -
TABLE 3: REGRESSION MODEL WITH DUMMY VARIABLE IN TIME
MODEL: Vt = Bo + B1D + B2t
(coefficients and t-statistics)
Pearson Product-Moment Correlation
TABLE 4: EMPIRICAL INVESTIGATIONS OF THE NDTS HYPOTHESIS
note: a negative effect supports the DeAngelo Masulis theory
TABLE 5: EXTERNAL EFFECTS IN A TIME SERIES
F-value of the test: 44.392
AAA rate regression weight: -0.004382
Aggarwal, Raj, Distribution of spot and forward exchange rates: empirical evidence and investor valuation of skewness and kurtosis, Decision Sciences, v21, p588-595, 1990
Banz, Rolf, The relationship between return and market value of common stocks, Journal of Financial Economics, v9 p3, 1981
Black, Fisher, Beta and return, Journal of Portfolio Management, Fall 1993, p8
Blennerhassett, Michael, and Robert Bowman, A change in market microstructure - the switch to screen trading in the New Zealand Stock Exchange, Asia Pacific Finance Association, Sydney, Australia, 1994
Boquist, John and William Moore, Inter-industry leverage differences and the DeAngelo-Masulis tax shield hypothesis, Financial Management, Spring 1984, p5-9
Bowen, Robert, Daley, Lane, and Charles Huber, Evidence on the existence and determinants of inter-industry differences in leverage, Financial Management, Winter 1982, p10-20
Brown, Keith, W.B. Harlow, and Seha Tinic, How rational investors deal with uncertainty: reports of the death of the efficient market theory are greatly exaggerated, Financial Management Collection, Fall 1990
Brown, S.J., and J.B. Warner, Using daily stock returns, the case of event studies, Journal of Financial Economics, March 1985, p3
Chan, K. C., and Naifu Chen, An unconditional asset pricing test and the role of firm size as an instrumental variable for risk, Journal of Finance, v43, p309, 1988
Chan, K. C., and Josef Lakonishok, Are reports of beta's death premature?, Journal of Portfolio Management, Summer 1993
Chen, N.F, and Ingersoll, E., Exact pricing in linear factor models with finitely many assets: A note, Journal of Finance June 1983 page 985
Chen, Naifu, Richard Roll, and Stephen Ross, Economic forces and the stock market: testing the APT and alternate asset pricing theories, Working paper, December 1983
Chen, Naifu, Some empirical tests of the theory of arbitrage pricing, Journal of Finance, Dec 1983, p1393-1414
DeAngelo, H. and R.W. Masulis, Optimal capital structure under corporate and personal taxation, Journal of Financial Economics, v8, March 1980, p3-29
Dimson, Elroy, Risk measurement when shares are subject to infrequent trading, Journal of Financial Economics, v7, p197, 1979 (the non-synchronicity problem)
Dybvig, Phillip, and Ross, Stephen, Yes, the APT is Testable, Journal of Finance, Sep, 1985
Fama, Eugene, and Kenneth French, The cross section of expected stock returns, Journal of Finance, v47:2, 1992, p427
Fama, Eugene, and James MacBeth, Risk, return, and equilibrium, Journal of Political Economy, 1973, 81, p607
Garman, Mark, and Michael Klass, On the estimation of security price volatilities from historical data, Journal of Business, v53, p67, 1980
Gatward, Paul and Ian Sharp, Capital structure dynamics with interrelated adjustments: Australian evidence, Third International Conference on Asia Pacific Financial Markets, Singapore, 1993
Haugen, Robert, and Nardin Baker, Interpreting risk and expected return: comment, Journal of Portfolio Management, Spring 1993, p36 (confirms F-F and rationalizes higher returns for lower risk: the market prices growth stocks too high)
Hsieh, David, Nonlinear Dynamics in Financial Markets, Financial Analysts Journal, July-August 1995, p55
Kane, Alex, Marcus, Alan, and Robert McDonald, How big is the tax advantage of debt, Journal of Finance, July 1984, p841-855
Kim, Wi, and Eric Sorensen, Evidence of the impact of the agency cost of debt on the corporate debt policy, Journal of Financial and Quantitative Analysis, v21:2, July 1986, p131-143
Kolb, Robert, and Ricardo Rodriguez, The regression tendencies of betas, The Financial Review, v24:2 May, 1989, p319 (beta is not stationary)
Krueger, Thomas, and William Kennedy, An examination of the superbowl stock market predictor, Journal of Finance, June, 1990, p691
Kryzanowski, Lawrence, Simon Lalancette, and Minh Chau To, Some tests of APT mispricing using mimicking portfolios, Financial Review, v29: 2, p153, May 1994
Mandelbrot, Benoit, The variation of certain speculative prices, Journal of Business, October 1963
Markowitz, Harry, Portfolio selection, Journal of Finance, v7, March 1952, p77
Parkinson, Michael, The extreme value method for estimating the variance of the rate of return, Journal of Business, v53, p61, 1980
Peters, Edgar E., Fractal structure in the capital markets, Financial Analysts Journal, July/August 1989, pp. 32-37
Reinganum, Marc, A new empirical perspective on the CAPM, Journal of Financial and Quantitative Analysis, v16, p439, 1981
Roll, Richard, A critique of the asset pricing theory's tests, Journal of Financial Economics, March 1977, p129
Roll, Richard and Stephen Ross, An empirical investigation of the arbitrage pricing theory, Journal of Finance, Dec 1980, p1073
Ross, Stephen, The arbitrage theory of capital pricing, Journal of Economic Theory, v13, p341, 1976
Scheinkman, J.A., and Blake LeBaron, Non-linear dynamics and stock returns, Working paper number 181, Dept of Economics, University of Chicago, 1990
Schwert, G. W., Why does stock market volatility change over time?, Journal of Finance, Dec 1989, p1115
Sharpe, William, A simplified model for portfolio returns, Management Science, 1962, p277
Sharpe, William, Capital asset prices: a theory of market equilibrium under conditions of risk, Journal of Finance, v19, p425, 1964
Shukla, Ravi, and Charles Trzcinka, Research on risk and return: Can measures of risk explain anything?, Journal of Portfolio Management, Spring 1991 (weekly returns, capm just as good as multifactor apt)
Velleman, Paul, Definition and comparison of robust nonlinear data smoothing algorithms, Journal of the American Statistical Association, v75, September 1980, p609-615