Introduction
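This is an R assignment from UT. Compared with the previous one, A1, this assignment requires 56 plots, which is frankly insane.
The data has to be read into data frames, processed, and then processed again through hand-written functions. Indexing is not allowed, so everything has to go through slow loops; that is inefficient, and with this much data, lag is inevitable.
Each plot takes about 5 seconds to produce, and drawing the full set takes several minutes.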
Requirement
In this assignment you should write R scripts for predicting future values of a
time series, and apply them to observations of daily maximum temperatures and
numbers of deaths.
The data is derived from that distributed by NMMAPS, the U.S. National
Morbidity, Mortality, and Air Pollution Study, with some temperature values
missing. I have provided the data from 1994-01-01 to 2000-12-31 as a CSV file on
the course web page.
Completing this assignment should provide more practice with basic R scripts,
with data frames, and with subscripts that are logical or numeric vectors.
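To make the data frame and subscript part concrete, here is a small sketch. The file contents, the column names (`date`, `tmax`, `death`) and the numbers are all made up for illustration; the real course file may be laid out differently.

```r
# Write a tiny made-up CSV to a temporary file so the example is self-contained.
path <- tempfile(fileext = ".csv")
writeLines(c("date,tmax,death",
             "1994-01-01,3,120",
             "1994-01-02,NA,115",
             "1994-01-03,5,118"), path)

d <- read.csv(path)            # "NA" fields are read as missing values
d$date <- as.Date(d$date)

# Logical subscript: keep rows where tmax is known and above 4.
warm <- d[!is.na(d$tmax) & d$tmax > 4, ]

# Numeric subscript: the first two rows.
first_two <- d[1:2, ]

n_missing <- sum(is.na(d$tmax))
```

The `!is.na(...)` guard matters because the data has missing temperatures; without it, a comparison against `NA` yields `NA` and the subscript produces a row of missing values.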
We wish to predict the maximum temperature and the number of deaths for every
day, based only on data from before that day, except that when predicting the
number of deaths on a day, we may use the maximum temperature for that day as
well as for previous days.
You should also write a function called predictions, which makes predictions
for all days from some start point to the last day for which data is provided.
This function should take as its arguments a function to use for predicting, a
data frame with values that may be used for predicting, the series of values
for which predictions are to be made, and the start point for making
predictions of this series. It should return a vector of predictions for values
in the series from the specified start point to the end.
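One way the predictions function could look is sketched below. This is not the official solution: `predict_last` is a hypothetical stand-in predictor, the toy values are made up, and the argument order simply follows the description above.

```r
# Hypothetical single-day predictor: repeat the last observed value.
# `df` is unused here but kept so every predictor shares one signature.
predict_last <- function(df, past) {
  past[length(past)]
}

# Predict series[start], ..., series[n]. For day i the predictor sees only
# the values before day i, plus the data frame rows up to and including day i.
predictions <- function(predictor, df, series, start) {
  n <- length(series)
  stopifnot(start >= 2, start <= n, nrow(df) >= n)
  preds <- numeric(n - start + 1)
  for (i in start:n) {
    past <- series[seq_len(i - 1)]
    visible <- df[seq_len(i), , drop = FALSE]
    preds[i - start + 1] <- predictor(visible, past)
  }
  preds
}

# Toy data, made up for illustration only.
toy <- data.frame(tmax = c(10, 12, 11, 15, 14))
p <- predictions(predict_last, toy, toy$tmax, 3)
```

Cutting the data frame down to the first `i` rows before each call is a design choice: it guarantees that a predictor cannot peek at later days, at the cost of some copying in every iteration.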
Once you have produced predictions for every day, you should produce plots of
the predictions, the actual values, and the errors in the predictions. You
should also evaluate how good these predictions were in terms of the average
absolute value of the error.
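The evaluation and plotting step might be sketched as follows. The function names `mean_abs_error` and `plot_predictions` are hypothetical, the numbers are made up, and dropping days with a missing actual value is my assumption about how the gaps in the temperature data should be handled.

```r
# Average absolute error, skipping days where the actual value is missing.
mean_abs_error <- function(actual, predicted) {
  mean(abs(actual - predicted), na.rm = TRUE)
}

# Top panel: actual values (black) and predictions (red).
# Bottom panel: the error on each day.
plot_predictions <- function(actual, predicted) {
  errors <- actual - predicted
  old <- par(mfrow = c(2, 1))
  on.exit(par(old))
  plot(actual, type = "l", xlab = "day", ylab = "value",
       ylim = range(c(actual, predicted), na.rm = TRUE))
  lines(predicted, col = "red")
  plot(errors, type = "h", xlab = "day", ylab = "error")
  abline(h = 0, lty = 2)
}

# Made-up values for illustration only.
actual    <- c(11, 15, 14, NA, 13)
predicted <- c(12, 11, 15, 14, 14)
mae <- mean_abs_error(actual, predicted)
```

Calling `plot_predictions(actual, predicted)` then draws the two panels for one predictor; repeating that for every predictor and both variables is how the plot count climbs so quickly.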
You should write several functions for predicting the value of some variable
on a single day. All these functions should take as their first argument a
data frame, containing variables that may be used in making the prediction,
and as their second argument a series of past values for the variable being
predicted (which should have at least one past value, and for some functions
will need to have more than one past value). Each predicting function should
return a prediction for the next value in this series. The data frame should
have at least as many rows as the length of the series plus one (so that there
are values for the day for which the prediction is being made). The data frame
may have additional rows, but they should not be looked at when making the
prediction for this day.
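A few single-day predictors with that signature are sketched below. None of them comes from the assignment itself: the function names, the `tmax` column name, and the choice of a linear fit are assumptions made for illustration.

```r
# Most recent non-missing past value.
predict_last_known <- function(df, past) {
  known <- past[!is.na(past)]      # logical subscript drops missing values
  if (length(known) == 0) return(NA_real_)
  known[length(known)]
}

# Mean of the last k past values (fewer if the series is shorter),
# ignoring missing values.
predict_recent_mean <- function(df, past, k = 3) {
  mean(tail(past, k), na.rm = TRUE)
}

# Regress the past values on the `tmax` column (an assumed name), then
# predict from today's tmax, which sits in row length(past) + 1 of df.
# Rows beyond that are never touched.
predict_from_tmax <- function(df, past) {
  n <- length(past)
  train <- data.frame(y = past, tmax = df$tmax[seq_len(n)])
  fit <- lm(y ~ tmax, data = train)   # rows with NA are dropped by default
  unname(predict(fit, newdata = df[n + 1, , drop = FALSE]))
}

# Toy data, made up so that death = 2 * tmax + 1 exactly.
toy <- data.frame(tmax  = c(10, 12, 11, 15, 14),
                  death = c(21, 25, 23, 31, 29))
guess <- predict_from_tmax(toy, toy$death[1:4])
```

The third predictor is the only one that reads the row for the day being predicted, which matches the exception allowed for deaths: today's maximum temperature may be used, today's death count may not.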
Finally, you should try a more powerful prediction method, in which you first
make predictions with some function, and then use another function to predict
the error of the first method. The idea is that if you can manage to predict
the error well, you can get a better prediction by just adding the predicted
error to the original prediction.
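The two-stage idea can be sketched as a wrapper that turns a base predictor and an error predictor into a new single-day predictor. `make_corrected` and `predict_last` are hypothetical names, and recomputing all past errors on every call is a simplification, not necessarily how the assignment expects it to be done.

```r
# Hypothetical predictor used for both stages: repeat the last value.
predict_last <- function(df, past) past[length(past)]

# Build a predictor that adds a predicted error to the base prediction.
make_corrected <- function(base, error_predictor) {
  function(df, past) {
    n <- length(past)
    today <- base(df, past)
    if (n < 2) return(today)          # no past errors to learn from yet
    # Error the base method made on each of days 2..n.
    past_errors <- numeric(n - 1)
    for (j in 2:n) {
      guess <- base(df[seq_len(j), , drop = FALSE], past[seq_len(j - 1)])
      past_errors[j - 1] <- past[j] - guess
    }
    # The error series starts on day 2, so drop the first row to keep the
    # data frame aligned with it.
    today + error_predictor(df[-1, , drop = FALSE], past_errors)
  }
}

# Toy upward trend, made up for illustration only.
toy <- data.frame(day = 1:5)
corrected <- make_corrected(predict_last, predict_last)
next_value <- corrected(toy, c(1, 2, 3, 4))
```

On a steady upward trend the base method is always one step behind, so its error is constant and easy to predict; that is the kind of systematic error this approach can remove. The inner loop makes each call linear in the length of the history, which goes some way toward explaining why a full run is slow.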
Summary
This assignment ended up at nearly 400 lines of code and 56 plots. It was pure grunt work.