Causal Inference

STATA Programming

Course description

The course covers empirical strategies for applied (mainly micro-economic) research questions. Its main goal is to provide an overview of the statistical tools for counterfactual analysis. The course illustrates the identification strategies, estimation methods and other related issues (e.g. internal and external validity) that are relevant for the assessment of causal effects (or, equivalently, treatment effects) using observational data. Specifically, it covers an introduction to a set of basic tools: matching and difference-in-differences methods, and quasi-experimental/natural experiment settings. It will also briefly touch on approaches that allow for heterogeneous treatment effects and do not assume additive heterogeneity.

Note that we will not cover other interesting topics such as the identification of dynamic treatment effects, partial identification (bounds) and synthetic control methods. Fixed effects and random effects models are covered in other courses offered within the PhD program and will not be discussed.

We will discuss Randomized Trials because they represent the benchmark against which non-experimental methods are assessed. Experimental methods are illustrated and discussed in detail in other courses.

Features of each econometric tool will be illustrated from both a theoretical and a practical point of view, often through the discussion of empirical applications.

The emphasis will be on the practical implementation of each approach.

Topics

  1. Fundamentals of Impact Evaluation (6 hours: 3 hours lecture; 3 hours in the computer lab)
    • The Fundamental Problem of Causal Inference
    • Potential Outcomes Framework
    • Basic Approaches to Identification: Randomized Trials
    • Basic Approaches to Identification: Selection on Observables
    • Basic Approaches to Identification: A Panel Data/Repeated Cross Section Approach: Difference-in-Differences
  2. Quasi-Experiments (6 hours: 3 hours lecture; 3 hours in the computer lab)
    • Instrumental Variables: Identification, Estimation, Falsification Checks (Placebo), Interpretation (LATE), External Validity; Weak Instruments
    • Regression Discontinuity Design: Sharp and Fuzzy Designs, Identification, Estimation, Falsification Checks
    • Regression Kink Designs: Identification, Estimation, Falsification Checks
  3. Quantile Regression for Impact Evaluation: Introduction (3 hours lecture)
    • The LATE-QTE Model (Abadie et al. 2002)
    • The Causal Chain Model (Chesher, 2003)
    • The IV-QTE Model (Chernozhukov et al. 2005)

     

    Prerequisites

    This course is open to graduate students enrolled in either EDLE or PhD in Economics or other PhD programs on related topics, as well as to other researchers at the Department.

Participation in the course requires a basic background in statistics and econometrics (namely probability theory, hypothesis testing, linear regression, and models for binary dependent variables: logit and probit). If you have doubts about your background, please get in touch with the instructor.

    Learning outcomes

The main emphasis of the course is on encouraging students to think critically and clearly. At the end of the course, participants should be able to understand the critical points of scientific articles and to start designing and performing their own analysis using the tools illustrated. Mastering the tools introduced in the course will require further personal investment and reading, given that the course is currently a 15-hour module. Students, in particular those who take the course for credit, are expected to be interested in making this further investment.

    Teaching methods

    Each topic will be covered in class and in a computer laboratory practice session (with the exception of quantile regression for impact evaluation).

All teaching materials (slides, computer programs) will be distributed via mailing list to the enrolled students. Students should subscribe to the mailing list by writing to margherita.fort@unibo.it.

    Research articles listed among the references can be downloaded from the web. You may use the search engine: http://acnp.unibo.it/cgi-ser/start/it/cnr/fp.html

    Most books in the reference list are available at the University libraries. You may check availability through the search engine http://sol.unibo.it/SebinaOpac/Opac?sysb=

    Assessment method

Students who take the course for credit will be graded based on their performance in two main tasks (a manuscript review and a research proposal) as well as on in-class participation. Completing each of these tasks appropriately requires knowledge of the tools illustrated in the course.

    In-class participation (20% of grade, i.e. a maximum of 6 points out of 30)

This means: (1) you show up for classes and you review the theory (or read the relevant empirical article) before we go to the computer lab; (2) you read all the research proposals and participate actively in the discussion during the audits: you will have a minimum of about one week to read all the research proposals carefully before the audits.

    Manuscript Review (40% of grade, i.e. a maximum of 12 points out of 30)

The purpose of this task is two-fold: (1) to offer you the opportunity to apply your knowledge to the assessment of an original piece of research; (2) to give you the chance to see what might be involved in reviewing an article. I will try to assign manuscripts related to your research interests, but this may not always be possible. You may also receive a paper that is not quite ready for submission, but you should treat it as if it were. You should submit a blind report (with no identifiers), though you may receive a draft with the authors' names. You should not contact the authors to discuss the paper with them; if you do, you will receive a grade of 0 for this task.

    Some guidelines for drafting your report are available at the link below

    https://dl.dropboxusercontent.com/u/16441444/causality/causal_inference_2014_report_GUIDELINES.pdf

You will have at least two weeks from the assignment to deliver your report.

    Research Proposal (40% of grade, i.e. a maximum of 12 points out of 30)

This task is designed to encourage students to do original research on a topic of their choice. The research proposal should have the following key features: (1) it should translate into at least one publishable paper, i.e. it must address an interesting causal question and it must be feasible; (2) it should incorporate a detailed description of the evaluation design and the relevant statistical methods, together with an explicit discussion of the project's feasibility; (3) it must be clearly and concisely written.

The proposal may be a replication of an original analysis (for which you may be able to get the data) that you extend in some small but useful way (e.g. updating the data, or applying a different approach to the same research question): preliminary results may be included and discussed. The research proposal may be the result of joint work with at most one other student who is taking the course for credit.

    A tentative template form for the proposal is available at this link https://dl.dropboxusercontent.com/u/16441444/causality/causal_inference_2014_proposal_TEMPLATE.pdf

The tentative weights are illustrated below:

• 20% (i.e. max 2.5 points out of 12): Explanation of the causal relationship of interest, the ideal experiment, and the identification strategy.

• 45% (i.e. max 6 points out of 12): Details on the empirical implementation and/or analysis and interpretation of the results; discussion of caveats; discussion of the implications of the results. This section is the most closely related to the tools illustrated in the course and should highlight your knowledge of the identification approach you decide to pursue as well as the details of its implementation.

• 5% (i.e. max 0.5 point out of 12): Suggestions for future research.

• 25% (i.e. max 3 points out of 12): 15-minute presentation of your proposal (audit) to the class, with discussion (we may eliminate this presentation if the number of students enrolled is large).

You will have about one month to draft your proposal. The audits will all take place (ideally on the same day, between 10 am and 6 pm) at the beginning of February; the exact day will be set by the end of the lectures. All draft proposals will be circulated through the mailing list at least two weeks before the audits.

    Syllabus

The syllabus can be adapted based on students' background. Each of the topics listed will be discussed with reference to specific examples. Due to time constraints, this syllabus lists many more papers and books than we will actually cover (in detail) during the lectures. In addition, I may add recent papers that are relevant to a specific topic. The references are included mainly to provide resources for those interested in exploring a particular topic in greater depth. These further readings may be useful when students work on their research proposal (see the Assessment section) or on their research projects later.

    TOPIC 1: FUNDAMENTALS OF IMPACT EVALUATION

    CLASS 1

    • Introduction: structural & reduced form models; ex-ante vs ex-post analysis; the fundamental problem of causal inference; program evaluation vs. program design
    • Randomized experiments & regression
    • Measurement error and sources of bias in the standard Ordinary Least Squares regression
    • Selection on observables and matching (matching based on the propensity score and estimation of the propensity score)
• A Panel Data/Repeated Cross Section method: Difference-in-Differences (see the STATA sketch below)
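Below is a minimal, illustrative sketch of how the difference-in-differences comparison maps into a single STATA regression. The variable names (y, treat, post, state) are hypothetical placeholders used only to fix ideas; the lab session will work with the course datasets.

    * Difference-in-differences as an interaction regression
    * y = outcome, treat = treated-group dummy, post = post-period dummy,
    * state = group identifier used to cluster the standard errors
    regress y i.treat##i.post, vce(cluster state)
    * The coefficient on 1.treat#1.post is the DiD estimate of the average
    * effect of the treatment on the treated.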

    TUTORIAL 1

    • Replicating results from field experiments
• Matching using STATA (see the sketch below)
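As a pointer for the lab, a minimal sketch of propensity-score matching in STATA follows; the variable names (y outcome, d treatment dummy, x1-x3 covariates) are hypothetical.

    * Propensity-score matching with the built-in teffects command (Stata 13+).
    * The propensity score is estimated from the covariates in the second set
    * of parentheses; the atet option requests the effect on the treated.
    teffects psmatch (y) (d x1 x2 x3), atet
    * Alternative: the user-written psmatch2 command (Leuven and Sianesi)
    * ssc install psmatch2
    * psmatch2 d x1 x2 x3, outcome(y) logit common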

    References

    • Holland, Paul W (1986) Statistics and Causal Inference, Journal of the American Statistical Association 81 (396): pp. 945-970, with discussion
    • Lalonde, Robert (1986) Evaluating the Econometric Evaluations of Training Programs with Experimental Data, American Economic Review 76(4), pp.604-620
• Altonji, Elder and Taber (2005) Selection on Observed and Unobserved Variables: Assessing the Effectiveness of Catholic Schools, Journal of Political Economy, 113, pp. 151-184

    Difference-in-Differences

    • Athey and Imbens (2006) Identification and Estimation in Nonlinear Difference-in-Differences Models, Econometrica 74(2) pp.431-497
• Bertrand et al. (2004) How Much Should We Trust Difference-in-Differences Estimates?, Quarterly Journal of Economics 119, pp. 245-279
    • Card, D and Krueger, A. (1994) Minimum Wages and Employment: A Case Study of the Fast-Food Industry in New Jersey and Pennsylvania, American Economic Review 84(4), pp. 772-793

    Computing Standard Errors

    • Barrios, T. and Diamond, R. and Imbens, G. and Kolesar, M. (2012) Clustering, Spatial Correlation and Randomization Inference, The Journal of the American Statistical Association 107 (498), pp. 578-591.
    • Cameron, Gelbach, Miller (2008) Bootstrap-Based Improvements for Inference with Clustered-Errors, The Review of Economics and Statistics 90(3), pp.414-427
• Moulton (1990) An Illustration of a Pitfall in Estimating the Effects of Aggregate Variables on Micro Units, Review of Economics and Statistics, pp. 334-338

    Propensity Score Matching

• Dehejia, R.H. and Wahba, S. (1998) Propensity Score Matching Methods for Non-Experimental Causal Studies, NBER WP 6829
    • Dehejia, R.H. and Wahba, S. (1999) Causal Effects in Non-Experimental Studies: Re-evaluating the Evaluation of Training Programs, Journal of the American Statistical Association 94 (448) pp. 1053-1062
    • Dehejia, R.H. and Wahba, S. (2002) Propensity Score-Matching Methods for Nonexperimental Causal Studies, Review of Economics and Statistics, 84(1), pp. 151-161
    • Dehejia, R.H. and Wahba, S. (2005) Practical Propensity Score Matching. A Reply to Smith and Todd, Journal of Econometrics, 125, pp. 355-364.
    • Rosenbaum, P.R. and Rubin, D.B. (1983) The Central Role of the Propensity Score in Observational Studies for Causal Effects, Biometrika 70(1), pp.41-55
    • Rosenbaum, P.R. and Rubin, D.B. (1984) Reducing Bias in Observational Studies using Subclassification on the Propensity Score, Journal of the American Statistical Association 79 (387), pp. 147-156
    • Smith, J. and Todd, P. (2005a) Does Matching Overcome Lalonde’s critique of Non-Experimental Estimators? Journal of Econometrics 125, pp. 305-353
• Smith, J. and Todd, P. (2005b) Rejoinder, Journal of Econometrics 125, pp. 365-375

    References (additional empirical applications)

    • Della Vigna, S. and Durante, R. and La Ferrara, E. and Knight, B. (2013) Market-Based Lobbying: Evidence from Advertising Spending in Italy. NBER Working Paper 19766 (September 2014 version available from Durante’s web page)
• Durante, R. and Knight, B. (2012) Partisan Control, Media Bias and Viewer Responses: Evidence From Berlusconi’s Italy, Journal of the European Economic Association, vol. 10(3), pp. 451-481
• Gerber et al. (2009) Does the Media Matter? A Field Experiment Measuring the Effect of Newspapers on Voting Behaviour and Political Opinions, American Economic Journal: Applied Economics, 1(2), pp. 35-52
    • Ichino, A. and Mealli, F. and Nannicini, T. (2008) From Temporary Help Jobs to Permanent Employment: What Can We Learn From Matching Estimators and Their Sensitivity? Journal of Applied Econometrics 23, pp.305-327
    • La Ferrara, E. and Chong, A. and Duryea, S. (2009) Television and Divorce: Evidence from Brazilian Novelas, Journal of the European Economic Association 7, pp. 458-468
• La Ferrara, E. and Chong, A. and Duryea, S. (2012) Soap Operas and Fertility: Evidence from Brazil. American Economic Journal: Applied Economics 4(4)
• Ladd, Jonathan McDonald and Lenz, G.S. (2009) Exploiting a Rare Communication Shift to Document the Persuasive Power of the News Media. American Journal of Political Science 53(2), pp. 394-410

     

     TOPIC 2: QUASI-EXPERIMENTS

     CLASS 2

    • Instrumental variable methods and the Heckman selection model
    • The Local Average Treatment Effect (LATE)
    • Weak Instruments
    • Regression Discontinuity Design (RDD): sharp and fuzzy RDD
• Regression Kink Design (a STATA sketch covering both RDD and RKD follows below)
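As a forward pointer to the lab, here is a minimal RDD/RKD sketch in STATA using the user-written rdrobust package (Calonico, Cattaneo and Titiunik); the variable names (y, score, d) and the cutoff at 0 are hypothetical.

    * ssc install rdrobust
    * Sharp RDD: y = outcome, score = running variable, cutoff at 0
    rdrobust y score, c(0)
    * Fuzzy RDD: treatment take-up d jumps at the cutoff, but not from 0 to 1
    rdrobust y score, c(0) fuzzy(d)
    * Regression kink designs target a change in slope rather than in level;
    * the same package handles this case via the deriv(1) option
    rdrobust y score, c(0) deriv(1)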

    TUTORIAL 2

• Instrumental variable methods using STATA (see the sketch below)
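A minimal two-stage least squares sketch in STATA, with a standard weak-instrument diagnostic; the variable names (y, d, z, x1, x2) are hypothetical placeholders.

    * 2SLS with the built-in ivregress command
    * y = outcome, d = endogenous treatment, z = instrument, x1 x2 = controls
    ivregress 2sls y x1 x2 (d = z), vce(robust)
    * First-stage statistics, including the F test used to gauge weak instruments
    estat firststage
    * Alternative with richer diagnostics: the user-written ivreg2 command
    * (Baum, Schaffer and Stillman, listed in the references)
    * ssc install ivreg2
    * ivreg2 y x1 x2 (d = z), robust first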

     

    References

    Instrumental Variables

• Imbens, G. and Angrist, J. (1994) Identification and Estimation of Local Average Treatment Effects, Econometrica 62 (2), pp. 467-475
• Angrist, J. and Imbens, G. (1995) Two-Stage Least Squares Estimation of Average Causal Effects in Models with Variable Treatment Intensity, Journal of the American Statistical Association, 90 (430), pp. 431-442
    •  Angrist, J., Imbens, G. and Rubin, D. (1996) Identification of Causal Effects Using Instrumental Variables, Journal of the American Statistical Association, 91 (434) pp. 444-455, with discussion
• Angrist, J. and Graddy, K. and Imbens, G. (2000) The Interpretation of Instrumental Variables Estimators in Simultaneous Equations Models with an Application to the Demand for Fish, Review of Economic Studies, 67, pp. 499-527
    • Angrist, J. (2004) Treatment Effect Heterogeneity in Theory and Practice, The Economic Journal, 114 (494) p.C52-C83
    • Angrist and Lavy and Schlosser (2010) Multiple Experiments for the Causal Link Between the Quantity and Quality of Children, Journal of Labor Economics
• Angrist and Fernandez-Val (2013) ExtrapoLATE-ing: External Validity and Overidentification in the LATE Framework, in Advances in Econometric Theory and Applications, 10th World Congress, Volume III
• Baum, C. F. and Schaffer, M. and Stillman, S. (2007) Enhanced Routines for Instrumental Variables/GMM Estimation and Testing, Stata Journal 7(4), pp. 465-506
• Baum, C. F. and Schaffer, M. and Stillman, S. (2010) ivreg2: Stata module for extended instrumental variables/2SLS, GMM and AC/HAC, LIML and k-class regression, Boston College, Economics Working Paper, n. 667
    • Black, S.E. and Devereux, P.J and Salvanes, K.G. (2008) Staying in the Classroom and Out of The Maternity Ward? The Effect of Compulsory Schooling Laws on Teenage Births, Economic Journal 118, 1025-1054
    • Imbens, G. and Rubin, D. (1997) Estimating the Outcome Distribution for Compliers in Instrumental Variables Models, Review of Economic Studies, 64, pp. 555-574
• Moulton (1990) An Illustration of a Pitfall in Estimating the Effects of Aggregate Variables on Micro Units, Review of Economics and Statistics, pp. 334-338

    Instrumental Variables: Weak Instruments

    • Bound, J. and Jaeger, D. and Baker, R. (1995) Problems with Instrumental Variables Estimation when the Correlation Between the Instruments and the Endogenous Variables is Weak, Journal of the American Statistical Association 90, 443-450
• Staiger, D. and Stock, J. (1997) Instrumental Variables Regression with Weak Instruments, Econometrica 65 (3), pp. 557-586
• Stock, J. and Wright, J. (2000) GMM with Weak Identification, Econometrica 68(5), pp. 1055-1096
• Stock, J., Wright, J. and Yogo, M. (2002) A Survey of Weak Instruments and Weak Identification in Generalized Method of Moments, Journal of Business and Economic Statistics 20, pp. 518-529

    Regression Discontinuity Design

• Angrist and Lavy (1999) Using Maimonides' Rule to Estimate the Effect of Class Size on Scholastic Achievement, Quarterly Journal of Economics
    • Barrios, T. and Diamond, R. and Imbens, G. and Kolesar, M. (2012) Clustering, Spatial Correlation and Randomization Inference, The Journal of the American Statistical Association 107 (498), pp. 578-591.
    • Battistin, E. and Rettore, E. (2008) Ineligibles and Eligible Non-Participants as a Double Comparison Group in Regression-Discontinuity Designs, Journal of Econometrics 142, pp.715-730
    • Cook (2008) Waiting for Life to Arrive: A History of the Regression-Discontinuity Design in Psychology, Statistics and Economics, Journal of Econometrics 142, pp.636-654
    • Hahn, J. and Todd, P. and Van der Klaauw, W. (2001) Identification and Estimation of Treatment Effects with a Regression Discontinuity Design, Econometrica 69 (1)
    • Imbens, G. and Lemieux, T. (2008) Regression Discontinuity Designs: A Guide to Practice, Journal of Econometrics 142 (2), pp. 615-635 and papers on the same journal in the special issue The Regression Discontinuity Design: Theory and Applications, Journal of Econometrics 142(2), pp. 611-850
    • Lee and Card (2008) Regression Discontinuity Inference with Specification Error, Journal of Econometrics 142(2)
    • Lee, D.S. (2008) Randomized Experiments from Non-random Selection in U.S. House Elections, Journal of Econometrics 142(2), pp. 675-697
    • Thistlethwaite and Campbell (1960) Regression Discontinuity Analysis: An Alternative to Ex-Post Facto Experiment, Journal of Educational Psychology 51(6), pp. 309-317
    • Trochim, W. (1984) Research Designs For Program Evaluation: The Regression-Discontinuity Approach, Beverly Hills: Sage Publications.

    Regression Kink Design

• Card, D. and Lee, D.S. and Pei, Z. and Weber, A. (2012) Nonlinear Policy Rules and the Identification and Estimation of Causal Effects in a Generalized Regression Kink Design, NBER WP No. 18564, November

    References  (additional empirical applications)

• Abdulkadiroglu, A. and Angrist, J. and Pathak, P. (2014) The Elite Illusion: Achievement Effects at the Boston and New York Exam Schools, Econometrica Vol. 82 (1), pp. 137-196
• Battistin, E. and Brugiavini, A. and Rettore, E. and Weber, G. (2009) The Retirement Consumption Puzzle: Evidence from a Regression Discontinuity Approach, The American Economic Review 99, pp. 2209-2226
    • Fort, M. and Ichino, A. and Tessari, A. and Zanella, G. (2015 mimeo)  Early Daycare and IQ: Regression Discontinuity Evidence from the Asilo Nido of Bologna
    • Fort, M. and Schneeweis, N. and Winter-Ebmer R. (2015 mimeo)  Is Education Always Reducing Fertility? Evidence from Compulsory Schooling Reforms
    • Ichino, A.  and Garibaldi, P. and Giavazzi, F. and Rettore, E.  (2013) College Cost and Time to Complete a Degree: Evidence from Tuition Discontinuities, The Review of Economics and Statistics 94(3), pp. 699-711
• Levine, R. and Loayza, N. and Beck, T. (2000) Financial Intermediation and Growth: Causality and Causes, Journal of Monetary Economics 46, pp. 31-77

TOPIC 3: QUANTILE REGRESSION FOR IMPACT EVALUATION: INTRODUCTION

    CLASS 3

• Quantile regression with exogenous regressors and endogenous regressors (introduction; see the STATA sketch below)
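A minimal quantile regression sketch in STATA; the variable names are hypothetical, and the commented lines point to the user-written ivqte command (Froelich and Melly, Stata Journal) as one implementation of the quantile treatment effect estimators covered in the references, including the Abadie et al. (2002) estimator.

    * Conditional quantile regression with exogenous regressors (built-in qreg)
    qreg y x1 x2, quantile(0.5)
    * With an endogenous treatment d and an instrument z, quantile treatment
    * effects can be estimated with the user-written ivqte command
    * ssc install ivqte
    * ivqte y x1 x2 (d = z), quantiles(0.25 0.5 0.75)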

    References

    • Abadie, A. et al. (2002) Instrumental Variable Estimates of the Effect of Subsidized Training on the Quantiles of Trainee Earnings, Econometrica, Vol. 70 (1), pp. 91-117.
• Bitler, M. et al. (2006) What Mean Impacts Miss: Distributional Effects of Welfare Reform Experiments, American Economic Review, 96 (4), pp. 988-1012
• Chernozhukov, V. et al. (2004) The Effects of 401(K) Participation on the Wealth Distribution: An Instrumental Quantile Regression Analysis, The Review of Economics and Statistics, Vol. 86 (3), pp. 735-751
• Chernozhukov, V. et al. (2005) An IV Model of Quantile Treatment Effects, Econometrica, Vol. 73 (1), pp. 245-261
• Chernozhukov, V. et al. (2006) Instrumental Quantile Regression Inference for Structural and Treatment Effect Models, Journal of Econometrics
    • Chesher, A. (2003) Identification in Nonseparable Models, Econometrica, Vol. 71, pp. 1405-1441
    • Heckman, J.J. and Smith, J. and Clements, N. (1997) Making the Most Out of Programme Evaluations and Social Experiments: Accounting for Heterogeneity in Programme Impacts, Review of Economic Studies, 64 (4), pp. 487-535
    • Imbens, G. and Rubin, D. (1997) Estimating the Outcome Distribution for Compliers in Instrumental Variables Models, Review of Economic Studies, 64, pp. 555-574
• Koenker, R. and Hallock, K.F. (2001) Quantile Regression, Journal of Economic Perspectives 15(4), pp. 143-156
    • Koenker, R. (2005) Quantile Regression, Cambridge University Press, Econometric Society Monograph 38
    • Ma, L. et al (2006) Quantile Regression Methods for Recursive Structural Equation Models, Journal of Econometrics, 134(2), pp. 471-506

    References (additional empirical applications)

    • Brunello, G. and Fort, M. and Weber, G. (2009) Changes in Compulsory Schooling, Education and the Distribution of Wages in Europe, Economic Journal 119