Applied Economic Methods
BUSM112: Instructions for Data Assignment
This data assignment, whether written individually or in a team of up to three people, represents 40% of the total mark.
The assignment is due on 22 March 2019 at 11:00am and must be submitted to the dropbox available on the module’s QMPLUS website.
For those wishing to write the assignment in teams (up to three people): You are responsible for selecting your teammates. In the unlikely case that you and your teammates do not get along (e.g. due to free-riding issues, clashing work schedules, etc.), you MUST let the module organiser know (firstname.lastname@example.org) by 15 March at the latest so that you are allowed to submit the assignment individually rather than as a team. In this case, you might be sent new personalised files.
Each team needs to submit only ONE data assignment. The team members will need to agree on who is responsible for submitting the assignment to the dropbox by the stated deadline of 22 March 2019.
Submitted team assignments are understood to have been the product of the team members and the assignment overall mark will be given equally to each teammate with no exception.
All data assignments, whether written individually or in a team of up to three people, must meet the following requirements:
1. The first page of the assignment must be the completed cover sheet in Word format (available on the second page of this document). This sheet must include the QMUL ID of every student in the team.
2. The entire data assignment must be submitted in Word format only.
3. Below each question you must include any relevant tables (e.g. tests, regression tables). These can be copied as a picture into word to preserve format.
4. The appendix of the assignment must include the syntax needed to replicate your results. This syntax can be copied from your do files and pasted directly into the Word document. In the do file you can include comments and notes that make it easier for markers to understand and mark your syntax. Failing to include your syntax will result in a heavy penalty, and you might be referred to the school's Assessment Offence Officer for the assignment to be investigated for plagiarism.
5. DO NOT include your log file. (If your do files are well documented anyone should be able to replicate your results).
Failure to follow points 1-5 above will result in marks being deducted from the assignment.
Assignments handed in after the deadline will be penalised according to the students’ handbook, unless you have explicit permission to submit late (granted before the deadline) due to extenuating circumstances. Be aware that each team, and each student doing the assignment individually, will receive personalised files to prevent collusion. All submitted assignments will be screened using Turnitin, and those suspected of an academic offence (e.g. colluding with another team to produce the same assignment, plagiarism or ghost writing) will be referred to the school’s Assessment Offence Officer, which might result in hefty penalties.
Data Assessment Feedback Form
|MODULE TITLE||Applied Economic Methods|
|Assignment Type||40% Data assignment|
|Student QMUL ID||Fill in here your QMUL id|
|Student QMUL ID||Fill in here the QMUL ID of the second team member if done in a group|
|Student QMUL ID||Fill in here the QMUL ID of the third team member if done in a group|
|Each team needs to submit only ONE data assignment per team.|
|Marker(s) Initials||Provisional Mark(s)||Late (no. of days)||Penalty – Marks to be deducted||Overall mark|
|Checklist: 1. The first page of the assignment must be the completed cover sheet in Word format. 2. The entire data assignment must be submitted in Word format only. 3. Below each question you must include any relevant tables (e.g. tests, regression tables). These can be copied as a picture into Word to preserve format. 4. The appendix of the assignment must include the syntax needed to replicate your results. This syntax can be copied from your do files and pasted directly into the Word document. You may include any comments and notes that make it easier to read and mark. 5. Please do NOT include your log file. (If your do files are well documented, anyone should be able to replicate your results.) Note that submitted team assignments are understood to be the product of all team members, and the overall assignment mark will be given equally to each teammate with no exception.|
Applied Economic Methods
The data assignment consists of four parts: A, B, C and D. Answers to all parts must be typed. The strict word limit is 1500 words, excluding the provided instructions/questions, tables, and do files. Any text exceeding this word limit will not be read and will not be marked.
Part A (30 marks)
File parta.dta contains information on a randomized intervention. In this randomized intervention 1,000 children were treated with a daily dosage of fish oils for three months.
The intervention then compared the test scores of the treated students with those of a group of students that randomly received a placebo. None of the participants knew whether they were given the real fish oils or the placebo.
a) Provide a summary table of the main characteristics of people in the treatment and control group before the intervention took place. [5 marks]
b) Using t-tests, explain whether the treated and control groups have, on average, the same characteristics.
c) Estimate the impact of the intervention, by comparing the outcome (the student’s test scores) after the intervention between the treatment and control groups. For this purpose, use a t-test clearly explaining the impact of the intervention (if any) and whether this impact is statistically significant. [5 marks]
d) Using an OLS regression, estimate the impact of the intervention by comparing the test scores between the treatment and control groups whilst also controlling in the same regression for other covariates that might have affected the outcome. Explain whether your results differ between c) and d). If so, explain which results give a more reliable estimate of the true impact.
e) Test whether the OLS regression estimated in d) violates any of the assumptions required for OLS to be reliable and BLUE. If there are any violations, try to correct for them, clearly explaining the rationale for your corrections. [10 marks]
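As a rough illustration of the logic behind questions b) to d), the sketch below uses simulated data in Python rather than the Stata syntax the assignment requires. The variable names (treated, baseline, score) and all numbers are hypothetical and are not taken from parta.dta. It computes a two-sample t statistic for a pre-intervention characteristic, the raw difference in mean outcomes, and an OLS estimate that also controls for a covariate.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 2000

# Simulated data (hypothetical, not the assignment's file)
treated = rng.integers(0, 2, size=n)      # 1 = fish oil, 0 = placebo, assigned at random
baseline = rng.normal(50, 10, size=n)     # a pre-intervention characteristic
score = 5 + 0.8 * baseline + 2.0 * treated + rng.normal(0, 5, size=n)


def welch_t(x, y):
    """Two-sample t statistic that allows unequal variances."""
    se = np.sqrt(x.var(ddof=1) / len(x) + y.var(ddof=1) / len(y))
    return (x.mean() - y.mean()) / se


def ols(y, X):
    """OLS coefficients and conventional (homoskedastic) standard errors."""
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ y
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    return beta, np.sqrt(np.diag(sigma2 * xtx_inv))


# b) balance check on the pre-intervention characteristic
t_balance = welch_t(baseline[treated == 1], baseline[treated == 0])

# c) raw difference in mean outcomes and its t statistic
diff_means = score[treated == 1].mean() - score[treated == 0].mean()
t_outcome = welch_t(score[treated == 1], score[treated == 0])

# d) OLS without and with the covariate
X = np.column_stack([np.ones(n), treated, baseline])
beta_simple, se_simple = ols(score, X[:, :2])
beta_full, se_full = ols(score, X)

print("balance t statistic:", t_balance)
print("difference in means:", diff_means, "t:", t_outcome)
print("OLS with covariate :", beta_full[1], "se:", se_full[1])
```

With random assignment both estimators target the same effect. In this simulation the covariate explains much of the outcome variance, so the regression in d) estimates the effect with a smaller standard error than the simple comparison in c).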
Part B (20 marks)
File partb.dta contains information on a non-randomized intervention. The intervention consisted of providing job training to people working in the fast food industry in New Jersey, USA. The training provided courses on IT, numeracy and customer service. The people used as a “control group” were also working in the fast food industry, but in the state of Pennsylvania. Independent researchers hope to investigate whether the intervention had any impact by comparing the change in earnings (measured in natural logarithm, learnings) of programme participants in New Jersey before and after the programme was implemented with that of the control group in Pennsylvania.
a) Estimate and interpret the impact of the programme using the difference-in-difference estimator using panel fixed effects.
b) Estimate and interpret the impact of the programme using the difference-in-difference estimator combined with kernel matching. To match people use the following variables: bk kfc mc wendy.
c) With the data provided, test whether the treatment and control groups are statistically similar before the intervention took place and discuss whether this might affect the reliability of the difference-in-difference estimators obtained above.
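The difference-in-difference idea in this part can be sketched on simulated two-period panel data. The example is in Python, not the Stata syntax the assignment requires, and every name and number in it (treat, post, ln_earn, the effect of 0.10) is a hypothetical assumption rather than something taken from partb.dta. It does not cover kernel matching. It only shows that, in a balanced two-period panel, the comparison of group means, the interaction coefficient in a saturated regression, and the comparison of within-person changes (which removes person fixed effects) give the same number.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500                                   # number of people, each observed twice

# Simulated balanced panel: rows are ordered person by person, pre then post
group = rng.integers(0, 2, size=n)        # 1 = treated state, 0 = control state
treat = np.repeat(group, 2)
post = np.tile([0, 1], n)
person_effect = np.repeat(rng.normal(0, 0.3, size=n), 2)
true_effect = 0.10
ln_earn = (2.0 + person_effect + 0.2 * treat + 0.05 * post
           + true_effect * treat * post + rng.normal(0, 0.1, size=2 * n))


def cell_mean(g, p):
    """Mean outcome for group g in period p."""
    return ln_earn[(treat == g) & (post == p)].mean()


# 1) Difference of before/after changes in group means
did_means = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))

# 2) Interaction coefficient in a saturated regression
X = np.column_stack([np.ones(2 * n), treat, post, treat * post])
did_reg = np.linalg.lstsq(X, ln_earn, rcond=None)[0][3]

# 3) Within-person changes, which difference out the person fixed effects
change = ln_earn[1::2] - ln_earn[0::2]
did_fd = change[group == 1].mean() - change[group == 0].mean()

print(did_means, did_reg, did_fd)
```

Because the outcome is in logs, an estimate of this kind is read as an approximate proportional change in earnings. The estimate is only credible if the two groups would have followed parallel trends without the programme, which is what question c) asks you to think about.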
Part C (30 marks)
File partc.dta contains information from a real policy programme implemented in Colombia in the 1990s that aimed at increasing educational attainment among poor people. To this end, the World Bank gave a secondary school voucher to poor children who wished to continue their education at secondary level.
These vouchers covered about half of students’ schooling expenses and were renewable depending upon students’ performance.
Given that the programme did not have enough funds to give vouchers to all poor children, these vouchers were randomized through a lottery among eligible households.
The variable won_lottry denotes whether the student won (=1) or lost (=0) the lottery.
The variable use_fin_aid denotes whether the student used the voucher or any other sort of scholarship (=1) or not (=0).
To estimate the impact of this school voucher programme, all students were tested after the intervention. The file provides information on the students’ test scores (lscores), both for those who won the voucher and for those who did not. Note that this test score variable is already measured in natural logarithm.
Questions for part C:
1) Using a simple OLS regression estimate the following regression:
lscores = a + b1 won_lottry + b2 male + b3 base_age + error
Interpret the coefficient of having won the lottery (variable won_lottry). In your interpretation be clear on whether this variable has a significant impact on the dependent variable, the scores obtained (lscores), and the magnitude of this coefficient.
The regression estimated in question 1 is likely to be biased. As you can see in the dataset, some students that won the lottery ended up not using the voucher. Also, some students that did NOT win the school voucher still managed to go to secondary school because they obtained other scholarships or funding (use_fin_aid). Thus, a simple comparison of test scores between winners and losers of the lottery is likely to give a biased estimate of the impact of the intervention.
For this reason, researchers from MIT and Stanford have suggested identifying the effect of this intervention on test scores using instrumental variables.
These researchers suggest that we should instead investigate the impact of use_fin_aid on test scores. Since use_fin_aid is likely to be endogenous, the researchers suggest using the lottery variable (won_lottry) as its instrument.
The researchers argue that having won the lottery (won_lottry) is a good instrument as it is random and very closely correlated with having obtained a school voucher.
2) Run an instrumental variable regression using the test score variable, lscores, as the dependent variable.
The main covariate of this regression is use_fin_aid. Since the use_fin_aid variable is likely to be endogenous, use as its instrument whether or not the student won the lottery (variable won_lottry). In your IV regression also control for male and base_age as additional covariates.
Interpret the results (the regression coefficients) of both the first and second stage of the IV regression. The results of both stages must also be presented as tables.
3) Explain what characteristics a good instrument should have to deal with endogeneity and whether the instrument used in question 2 satisfies these characteristics. Show exactly all the tests you used to formulate your answer.
4) Using endogeneity tests, explain whether the OLS or the IV results offer a more reliable estimate of the impact of the intervention.
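The mechanics of an instrumental variable regression can be sketched as a manual two-stage least squares on simulated data. The example is in Python rather than the Stata syntax the assignment requires, and all names and parameter values (z, d, ability, age, the true effect of 0.3) are hypothetical assumptions, not features of partc.dta. An unobserved factor drives both take-up and the outcome, so OLS is biased, while the randomly assigned instrument recovers the effect.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 5000

# Simulated data (hypothetical)
z = rng.integers(0, 2, size=n).astype(float)      # instrument: random lottery result
ability = rng.normal(size=n)                      # unobserved, drives take-up and outcome
age = rng.normal(12, 1, size=n)                   # exogenous control
d = ((0.2 + 1.5 * z + 0.8 * ability + rng.normal(size=n)) > 1.0).astype(float)
y = 1.0 + 0.3 * d + 0.5 * ability + 0.02 * age + rng.normal(0, 0.5, size=n)


def fit(y, X):
    """Least-squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


const = np.ones(n)

# Naive OLS of the outcome on the endogenous regressor
beta_ols = fit(y, np.column_stack([const, d, age]))

# First stage: endogenous regressor on the instrument and the control
X_first = np.column_stack([const, z, age])
pi = fit(d, X_first)
d_hat = X_first @ pi

# Second stage: outcome on the first-stage fitted values and the control
beta_iv = fit(y, np.column_stack([const, d_hat, age]))

print("first-stage coefficient on instrument:", pi[1])
print("OLS estimate:", beta_ols[1])
print("IV estimate :", beta_iv[1])
```

Running the second stage by hand gives the correct 2SLS coefficients, but standard errors computed from that regression would be wrong because they ignore that d_hat is itself estimated. A dedicated IV routine corrects for this and should be used for inference.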
Part D (20 marks)
File partd.dta includes information on the exchange rate between the US dollar and the Euro.
- Explain what non-stationarity means in the time series literature. [5 marks]
- Test and explain whether the exchange rate provided is stationary using a time series plot and the autocorrelation function (ACF).
- Explain what a unit-root process is and, using the Dickey-Fuller test, test whether the exchange rate follows a unit-root process. [5 marks]
- Test whether removing a trend from the exchange rate series changes your conclusion on whether the series is stationary. [5 marks]
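The regression behind the Dickey-Fuller test can be sketched on simulated series. The example is in Python rather than the Stata syntax the assignment requires, and both series are artificial, not the exchange rate in partd.dta. The change in the series is regressed on a constant and the lagged level, and the t statistic on the lagged level is compared with Dickey-Fuller critical values (about -2.86 at the 5% level for the constant-only case in large samples), not with the usual t distribution.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 500

# Two simulated series: a random walk (unit root) and a stationary AR(1)
random_walk = np.cumsum(rng.normal(size=n))
eps = rng.normal(size=n)
ar1 = np.zeros(n)
for t in range(1, n):
    ar1[t] = 0.5 * ar1[t - 1] + eps[t]


def df_tstat(y):
    """t statistic on the lagged level in: diff(y)_t = a + g * y_(t-1) + e_t."""
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    xtx_inv = np.linalg.inv(X.T @ X)
    beta = xtx_inv @ X.T @ dy
    resid = dy - X @ beta
    sigma2 = resid @ resid / (len(dy) - X.shape[1])
    return beta[1] / np.sqrt(sigma2 * xtx_inv[1, 1])


DF_CRITICAL_5PCT = -2.86   # asymptotic value, regression with constant and no trend

t_rw = df_tstat(random_walk)
t_ar = df_tstat(ar1)

print("random walk  :", t_rw, "reject unit root:", t_rw < DF_CRITICAL_5PCT)
print("stationary AR:", t_ar, "reject unit root:", t_ar < DF_CRITICAL_5PCT)
```

A statistic below the critical value rejects the null hypothesis of a unit root. For the stationary series the statistic is strongly negative, while for a random walk it usually stays above the critical value. Adding a time trend to the regression changes the relevant critical values, which matters for the detrending question.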