R Assignment Help: PPHA311 Problem Set


Arithmetic-related practice problems, worked in R.

Requirements

Group work: You may work in groups, but each person must submit individual
answers. These answers must reflect the individual’s own work and may not be
copied from others.
Scratch work and code: Please show your work (where relevant) and append all
code written for this assignment to the end of your submission. Please use
brief but clear comments in the code to reference the applicable assignment
section.

Mathematical Background

Given random variables X and Y and constants a and b, state whether the
following expressions are correct. If the expression is incorrect, please give
the appropriate formula. If the expression is correct only under certain
conditions, state those conditions.

  1. E(aX + b) = aE(X)
  2. Var(aX) = a^2 Var(X)
  3. E(XY) = E(X)E(Y)
  4. Var(aX + bY) = a^2 Var(X) + b^2 Var(Y)
  5. E[E(Y|X)] = E(Y)
  6. Cov(X, Y) = E(XY)

Now consider a set of random variables X1, X2, …, Xn. Is each of the
expressions below correct? If not, state what must be assumed for it to be
correct.
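
Before answering the numbered items in the first list, it can help to check the candidate identities numerically. The following is a minimal sketch in R on simulated data; the constants, seed, and variable names are illustrative assumptions, not part of the assignment. It compares each left-hand side with the standard textbook right-hand side, using sample moments, for which these identities hold exactly.

```r
# Numerical check of moment identities on simulated data.
# All names and values here (a, b, x, y, the seed) are illustrative assumptions.
set.seed(311)
n <- 1000
a <- 2
b <- -3
x <- rnorm(n)
y <- 0.5 * x + rnorm(n)  # y is deliberately correlated with x

# Linearity of expectation: the constant b does not drop out
lhs_mean <- mean(a * x + b)
rhs_mean <- a * mean(x) + b

# Scaling a variable by a scales its variance by a^2
lhs_scale <- var(a * x)
rhs_scale <- a^2 * var(x)

# Variance of a linear combination carries a covariance term
lhs_var <- var(a * x + b * y)
rhs_var <- a^2 * var(x) + b^2 * var(y) + 2 * a * b * cov(x, y)

# Covariance from raw moments; n / (n - 1) matches R's sample divisor
cov_from_moments <- n / (n - 1) * (mean(x * y) - mean(x) * mean(y))

print(c(lhs_mean = lhs_mean, rhs_mean = rhs_mean))
print(c(lhs_scale = lhs_scale, rhs_scale = rhs_scale))
print(c(lhs_var = lhs_var, rhs_var = rhs_var))
print(c(cov_builtin = cov(x, y), cov_from_moments = cov_from_moments))
```

Because `y` is built to depend on `x`, the covariance term is nonzero here, so dropping it from the variance formula, or dropping `mean(x) * mean(y)` from the covariance, gives a visibly different number.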

Exam-Style Questions

  1. True or false: The OLS estimator is biased when the assumption of homoskedasticity is violated.
  2. A dataset based on the U.S. National Longitudinal Survey is used to investigate the returns to education. A linear regression of hourly wage on highest (educational) grade completed in this dataset (n = 2244) yields the following: wage = -1.97 + 0.74 × grade
    * (a) What is the predicted wage for those who have finished up to 9th grade? What is the predicted wage for those who have finished high school (12th grade)?
    * (b) Is this regression likely to capture a causal relationship between education and wages? Why or why not? What are some potential confounding factors?
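
The arithmetic in part (a) amounts to plugging a grade into the fitted line. A small sketch in R, where the function name `predict_wage` is a hypothetical helper introduced only for illustration:

```r
# Fitted line reported in the question: wage = -1.97 + 0.74 * grade
# predict_wage is a hypothetical helper name, not part of the assignment.
predict_wage <- function(grade) {
  -1.97 + 0.74 * grade
}

grades <- c(9, 12)
predicted <- predict_wage(grades)

# Predicted hourly wage at 9th and 12th grade: 4.69 and 6.91
print(data.frame(grade = grades, predicted_wage = predicted))

# The gap equals three extra grades times the slope: 3 * 0.74 = 2.22
print(diff(predicted))
```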

Hypothetical Experiment

The table below describes a hypothetical experiment with 2,400 participants.

Category  #participants  D  T  Yc  Yt  Y
1         300            0  0  4   6
2         300            1  0  4   6
3         500            0  1  4   6
4         500            1  1  4   6
5         200            0  0  10  12
6         200            1  0  10  12
7         200            0  1  10  12
8         200            1  1  10  12
Here D is a predetermined characteristic, T is the treatment status, and Yc
and Yt are the potential outcomes without and with treatment, respectively.
  1. Complete the last column in the table (Y).
  2. What is the average treatment effect (ATE)?
  3. Is it plausible that these data come from an RCT?
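
The three questions can be explored by entering the table in R. This is a sketch under stated assumptions: the data frame and column names (`experiment`, `treated`, and so on) are made up for illustration, and the observed outcome is taken to be Yt for treated rows and Yc otherwise, which is the usual potential-outcomes convention.

```r
# The eight categories from the table; column names are illustrative assumptions.
experiment <- data.frame(
  category = 1:8,
  n        = c(300, 300, 500, 500, 200, 200, 200, 200),
  D        = c(0, 1, 0, 1, 0, 1, 0, 1),
  treated  = c(0, 0, 1, 1, 0, 0, 1, 1),
  Yc       = c(4, 4, 4, 4, 10, 10, 10, 10),
  Yt       = c(6, 6, 6, 6, 12, 12, 12, 12)
)

# Observed outcome: Yt if treated, Yc otherwise
experiment$Y <- ifelse(experiment$treated == 1, experiment$Yt, experiment$Yc)

# Average treatment effect: participant-weighted mean of Yt - Yc
ate <- with(experiment, weighted.mean(Yt - Yc, n))

# Participant-weighted mean of a column within one treatment arm
arm_mean <- function(column, arm) {
  rows <- experiment$treated == arm
  weighted.mean(experiment[[column]][rows], experiment$n[rows])
}

# Naive comparison of observed outcomes across arms
naive_diff <- arm_mean("Y", 1) - arm_mean("Y", 0)

# Balance checks: share with D = 1, and mean untreated outcome, by arm
balance <- data.frame(
  arm     = c("control", "treated"),
  share_D = c(arm_mean("D", 0), arm_mean("D", 1)),
  mean_Yc = c(arm_mean("Yc", 0), arm_mean("Yc", 1))
)

print(experiment)
print(c(ate = ate, naive_diff = naive_diff))
print(balance)
```

Comparing `ate` with `naive_diff`, and the two rows of `balance`, gives the raw material for judging whether treatment looks as if it was randomly assigned.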

Data-Driven Question

For this problem, you will analyze data on voting behavior in Colombia's 2016
peace referendum. The data includes five variables: department (equivalent to
U.S. states), total of NO votes, total of YES votes, number of registered
voters, and number of rebel attacks during the height of the insurgency. The
raw data is on our Canvas site.

  1. In Stata/R, import the raw data and generate two new variables. First, calculate the vote share of the NO vote. This is the NO vote share of all ballots cast. Call this variable NO_VS. Second, calculate departmental turnout. This is the sum of all ballots cast divided by the number of registered voters in the municipality. Name this variable DEPT_TO. Report the mean for each variable. Produce a clearly labeled histogram of each variable.
  2. Report the bivariate correlation between NO_VS and the variable measuring exposure to rebel violence during the height of the insurgency (RV_EXPOS). Then report the bivariate correlation between DEPT_TO and RV_EXPOS. What, if anything, do you learn?
  3. Use the Stata collapse command or comparable command in R to sum all ballots cast by type (YES, NO) as well as the total number of registered voters for the entire country. Although the collapse command usually includes a by() argument (e.g., collapse (sum) X Y Z, by(unit, time)), we do not need one for this exercise. Recalculate the mean of the NO vote share and departmental turnout. Report these values. Why do these values differ from the reported means of NO_VS and DEPT_TO above?
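
The three steps above can be sketched in R. Since the raw file is not reproduced here, the sketch uses a small made-up data frame; the department labels, the column names `no_votes`, `yes_votes`, and `registered`, and every number in it are assumptions for illustration only. With the real data, the data frame would instead come from something like `read.csv()` on the downloaded file, and the column names would need to match it.

```r
# Toy stand-in for the raw data. Every value and the names no_votes,
# yes_votes, registered are illustrative assumptions, not the real data.
votes <- data.frame(
  department = c("A", "B", "C", "D"),
  no_votes   = c(100, 400, 50, 900),
  yes_votes  = c(300, 400, 150, 600),
  registered = c(1000, 1600, 250, 5000),
  RV_EXPOS   = c(12, 3, 20, 1)
)

# Step 1: NO vote share and turnout per row
ballots <- votes$no_votes + votes$yes_votes
votes$NO_VS   <- votes$no_votes / ballots
votes$DEPT_TO <- ballots / votes$registered
mean_no_vs   <- mean(votes$NO_VS)
mean_dept_to <- mean(votes$DEPT_TO)

# Step 2: bivariate correlations with exposure to rebel violence
cor_no_vs   <- cor(votes$NO_VS, votes$RV_EXPOS)
cor_dept_to <- cor(votes$DEPT_TO, votes$RV_EXPOS)

# Step 3: national totals, comparable to Stata's collapse (sum) with no by()
totals <- colSums(votes[, c("no_votes", "yes_votes", "registered")])
national_no_vs <- unname(
  totals["no_votes"] / (totals["no_votes"] + totals["yes_votes"])
)
national_to <- unname(
  (totals["no_votes"] + totals["yes_votes"]) / totals["registered"]
)

print(c(mean_no_vs = mean_no_vs, national_no_vs = national_no_vs))
print(c(mean_dept_to = mean_dept_to, national_to = national_to))
print(c(cor_no_vs = cor_no_vs, cor_dept_to = cor_dept_to))
```

In this toy data the simple mean of NO_VS is 0.4 while the national NO share is 0.5: the simple mean counts every row equally, whereas the national figure implicitly weights each row by its ballots cast. Labeled histograms can be drawn with base R's `hist()`, using its `main` and `xlab` arguments for the title and axis label.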

Author: SafePoker
Copyright notice: Unless otherwise stated, all posts on this blog are licensed under CC BY 4.0. Please credit SafePoker when reposting.