Introduction
This Python assignment contains five computational problems to solve.
Problem 1 - Decrypting Government Data
Your job is to summarize this government data about oil consumption.
- The format of the file is rather bizarre - note that each line has data for two months, in two different years! (Plus I had to hand-edit the file to make it parseable)
- Fortunately, Python is great for untangling and manipulating data.
- Write a generator that reads from the given url over the network, and produces a summary line for a year’s data on each ‘next’ call
- remember that urllib.request returns ‘bytes arrays’, not strings
- The generator should read the lines of the oil2.txt file in a lazy fashion - it should only read 13 lines for every two years of output. Note a loop can have any number of ‘yield’ calls in it.
- Ignore the monthly data, just extract the yearly info
- Drop the month column
- In addition to the ‘oil’ generator function, my solution had a separate helper function, ‘def makeCSVLine(year, data):’
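The lazy-reading requirement above can be sketched without knowing the exact column layout of oil2.txt. The following is only a minimal sketch: the body of ‘makeCSVLine’ and the helper name ‘blocks_of_13’ are assumptions for illustration, not the instructor's solution, and the parsing of the individual fields is deliberately left out.

```python
import itertools

def makeCSVLine(year, data):
    """Join a year and its data fields into one comma-separated line.
    (Signature from the assignment; this body is an assumption.)"""
    return ','.join([str(year)] + [str(d) for d in data])

def blocks_of_13(raw_lines):
    """Lazily yield lists of 13 decoded lines from an iterable of bytes lines.
    (Hypothetical helper: it pulls only 13 lines per block, never the whole file.)"""
    it = iter(raw_lines)
    while True:
        # islice pulls at most 13 items from the underlying iterator
        block = [b.decode('utf-8').rstrip('\r\n') for b in itertools.islice(it, 13)]
        if len(block) < 13:
            return  # end of input (a partial trailing block is dropped)
        yield block
```

The object returned by ‘urllib.request.urlopen(url)’ can be iterated line by line and yields bytes, so it could be passed as ‘raw_lines’; an ‘oil’ generator would then extract the two yearly rows from each block and yield one ‘makeCSVLine’ result per year.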
Here is the beginning of the output, starting with the first two years, 2014 and 2013
Year,Quantity,QuantityChange,Unknown,Unknown2,Price,PriceChange
2014,2700903,-112867,246409332,-26397845,91.23,-5.72
2013,2813770,-283638,272807177,-40367786,96.95,-4.15
2012,3097408,-224509,313174963,-18407090,101.11,1.29
2011,3321917,-55160,331582053,79421544,99.82,25.15
2010,3377077,62290,252160509,63448733,74.67,17.74
2009,3314787,-275841,188711776,-153200712,56.93,-38.29
2008,3590628,-99940,341912488,104700835,95.22,30.95
2007,3690568,-43658,237211653,20584322,64.28,6.26
2006,3734226,-20445,216627331,40871990,58.01,11.20
2005,3754671,-66308,175755341,44012676,46.81,12.33
2004,3820979,144974,131742665,32575492,34.48,7.50
2003,3676005,257983,99167173,21883842,26.98,4.37
2002,3418022,-53045,77283331,2990437,22.61,1.21
2001,3471067,71827,74292894,-15583539,21.40,-5.04
2000,3399240,171148,89876433,38986812,26.44,10.68
1999,3228092,-14620,50889621,13637399,15.76,4.28
1998,3242712,173281,37252222,-16973685,11.49,-6.18
1997,3069431,175785,54225907,-704950,17.67,-1.32
1996,2893646,126333,54930857,11181204,18.98,3.17
Now that we have something that looks like a CSV file, we can do all kinds of things - we could save it to a file, then
- Excel or OpenOffice could read it
- Python has a CSV reader
- with a little juggling, we can easily pump the data into a pandas DataFrame
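As a small illustration of the CSV reader option, here is a sketch that parses the header and the first two rows shown above with the standard ‘csv’ module. The variable names are chosen for illustration only.

```python
import csv
import io

# Header and first two rows copied from the expected output above
sample = (
    "Year,Quantity,QuantityChange,Unknown,Unknown2,Price,PriceChange\n"
    "2014,2700903,-112867,246409332,-26397845,91.23,-5.72\n"
    "2013,2813770,-283638,272807177,-40367786,96.95,-4.15\n"
)

# DictReader uses the first line as field names; all values arrive as strings
rows = list(csv.DictReader(io.StringIO(sample)))
prices = [float(r['Price']) for r in rows]
```

Each row is a dict keyed by column name, so numeric columns need an explicit conversion such as ‘float(r['Price'])’.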
Input:
with open('/tmp/oil.csv', 'w') as f:
    for l in oil(url):
        f.write(l + '\n')
o = oil(url)
ls = list(o)
s = '\n'.join(ls)
import pandas as pd
import io  # we will cover StringIO next week - kind of an 'in-memory' file
df = pd.read_csv(io.StringIO(s))
df
—|—
Output:
Year Quantity QuantityChange Unknown Unknown2 Price PriceChange
0 2014 2700903 -112867 246409332 -26397845 91.23 -5.72
1 2013 2813770 -283638 272807177 -40367786 96.95 -4.15
2 2012 3097408 -224509 313174963 -18407090 101.11 1.29
3 2011 3321917 -55160 331582053 79421544 99.82 25.15
4 2010 3377077 62290 252160509 63448733 74.67 17.74
5 2009 3314787 -275841 188711776 -153200712 56.93 -38.29
6 2008 3590628 -99940 341912488 104700835 95.22 30.95
7 2007 3690568 -43658 237211653 20584322 64.28 6.26
8 2006 3734226 -20445 216627331 40871990 58.01 11.20
9 2005 3754671 -66308 175755341 44012676 46.81 12.33
10 2004 3820979 144974 131742665 32575492 34.48 7.50
11 2003 3676005 257983 99167173 21883842 26.98 4.37
12 2002 3418022 -53045 77283331 2990437 22.61 1.21
13 2001 3471067 71827 74292894 -15583539 21.40 -5.04
14 2000 3399240 171148 89876433 38986812 26.44 10.68
15 1999 3228092 -14620 50889621 13637399 15.76 4.28
16 1998 3242712 173281 37252222 -16973685 11.49 -6.18
17 1997 3069431 175785 54225907 -704950 17.67 -1.32
18 1996 2893646 126333 54930857 11181204 18.98 3.17
19 1995 2767313 63116 43749653 5270236 15.81 1.58
20 1994 2704197 160822 38479417 10041 14.23 -0.90
21 1993 2543375 248805 38469376 -83679 15.13 -1.68
—|—
Input:
[df['Price'].mean(), df['Price'].min(), df['Price'].max()]
—|—
Output:
[46.63681818181818, 11.49, 101.11]
—|—
Problem 2
- suppose we want to convert between C (Celsius) and F (Fahrenheit), using the equation 9 * C = 5 * (F - 32)
- could write functions ‘c2f’ and ‘f2c’
- do all computation in floating point for this problem
Input:
def c2f(c):
    return (9. * c + 5. * 32.) / 5.
def f2c(f):
    return 5. * (f - 32) / 9.
[c2f(0), c2f(100), f2c(32), f2c(212)]
—|—
Output:
[32.0, 212.0, 0.0, 100.0]
—|—
- to write f2c, we solved the equation for C, and made a function out of the other side of the equation
- to write c2f, we solved for F, . . .
- there is another way to think about this
- rearrange the equation into a symmetric form 9 * C - 5 * F = -32 * 5
- you can think of the equation above as a “constraint” between F and C: if you specify one variable, the other’s value is determined by the equation. In general, if we have c0 * x0 + c1 * x1 + … + cN * xN = total, then
- the cI are fixed coefficients
- specifying any N of the (N + 1) x’s will determine the remaining x variable
- define a class, ‘Constraint’, that will do ‘constraint satisfaction’
- you may find ‘dotnone’ to be helpful
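The solving step implied by the general constraint can be written out explicitly. Assuming the coefficient of the one unset variable x_k is nonzero, its value follows from moving all known terms to the right-hand side; the second line checks this against the temperature example with C = 100.

```latex
x_k = \frac{\mathrm{total} - \sum_{i \ne k} c_i x_i}{c_k}
\qquad
F = \frac{-160 - 9 \cdot 100}{-5} = 212
```

The sum over the known terms is exactly what a dot product that skips ‘None’ entries computes when the unset variable is stored as ‘None’.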
Input:
# regular dot product, except that if one or both values in a pair is None,
# that term is defined to contribute 0 to the sum
def dotnone(l1, l2):
    '''another dot product variant'''
    total = 0
    for e1, e2 in zip(l1, l2):
        if not (e1 is None or e2 is None):
            total += e1 * e2
    return total
[dotnone([1,2,3], [4,5,6]), dotnone([1,None,3], [4,5,6]), dotnone([None,1], [2,None])]
—|—
Output:
[32, 22, 0]
—|—
Input:
# setup constraint btw C and F
# 1st arg is var names,
# 2nd arg is coefficients
# 3rd arg is total
c = Constraint('C F', [9, -5], -5 * 32)
# 1st arg - variable index or name
# 2nd arg - variable value
# setvar will fire when there is only one unset variable remaining
# it will print the variable values, return them in a list, and
# clear all variable values
c.setvar(0, 100)
C = 100.0
F = 212.0
—|—
Output:
[100.0, 212.0]
—|—
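One possible shape for such a class is sketched below. The constructor arguments and the behavior of ‘setvar’ follow the example above; the attribute names and the internal bookkeeping are assumptions made for illustration, and this is not presented as the reference solution.

```python
def dotnone(l1, l2):
    """Dot product in which any pair containing None contributes 0."""
    return sum(a * b for a, b in zip(l1, l2) if a is not None and b is not None)

class Constraint:
    def __init__(self, names, coeffs, total):
        self.names = names.split()               # e.g. 'C F' -> ['C', 'F']
        self.coeffs = list(coeffs)
        self.total = total
        self.values = [None] * len(self.names)   # None marks an unset variable

    def setvar(self, var, value):
        # Accept either a variable name or a variable index
        idx = self.names.index(var) if isinstance(var, str) else var
        self.values[idx] = float(value)
        unset = [i for i, v in enumerate(self.values) if v is None]
        if len(unset) != 1:
            return None                          # not enough information yet
        k = unset[0]
        # Solve for the one remaining variable; assumes coeffs[k] is nonzero
        self.values[k] = (self.total - dotnone(self.coeffs, self.values)) / self.coeffs[k]
        for name, v in zip(self.names, self.values):
            print(name, '=', v)
        result = self.values
        self.values = [None] * len(self.names)   # clear for the next use
        return result
```

Because the values are cleared after each firing, the same object can be reused in the other direction, for example ‘c.setvar('F', 32)’ after ‘c.setvar(0, 100)’.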
Problem 3 - Hamlet
- Python is very popular in ‘digital humanities’
- MIT has the complete works of Shakespeare in a simple html format
- You will do a simple analysis of Hamlet by reading the html file one line at a time (the usual iteration scheme) and doing pattern matching
- The goal is to return a list containing the line count, the total number of ‘speeches’ (look at the file format), and a dict showing the number of ‘speeches’ each character gives
- Your program should read directly from the url given, but you may want to download a copy to examine the structure of the file.
- remember that urllib.request returns ‘bytes arrays’, not strings
- here’s a short sample of the file
HORATIO
Tush, tush, 'twill not appear.
BERNARDO
Sit down awhile;
And let us once again assail your ears,
That are so fortified against our story
What we have two nights seen.
HORATIO
Well, sit we down,
And let us hear Bernardo speak of this.
BERNARDO
Last night of all,
When yond same star that's westward from the pole
Had made his course to illume that part of heaven
Where now it burns, Marcellus and myself,
The bell then beating one,--
Enter Ghost
MARCELLUS
Peace, break thee off; look, where it comes again!
BERNARDO
In the same figure, like the king that's dead.
---|---
Input:
hamlet(url)
—|—
Output:
[8881,
 1150,
 defaultdict(int,
             {'All': 4,
              'BERNARDO': 23,
              'CORNELIUS': 1,
              'Captain': 7,
              'Danes': 3,
              'FRANCISCO': 8,
              'First Ambassador': 1,
              'First Clown': 33,
              'First Player': 8,
              'First Priest': 2,
              'First Sailor': 2,
              'GUILDENSTERN': 33,
              'Gentleman': 3,
              'Ghost': 14,
              'HAMLET': 359,
              'HORATIO': 112,
              'KING CLAUDIUS': 102,
              'LAERTES': 62,
              'LORD POLONIUS': 86,
              'LUCIANUS': 1,
              'Lord': 3,
              'MARCELLUS': 36,
              'Messenger': 2,
              'OPHELIA': 58,
              'OSRIC': 25,
              'PRINCE FORTINBRAS': 6,
              'Player King': 4,
              'Player Queen': 5,
              'Prologue': 1,
              'QUEEN GERTRUDE': 69,
              'REYNALDO': 13,
              'ROSENCRANTZ': 49,
              'Second Clown': 12,
              'Servant': 1,
              'VOLTIMAND': 2})]
—|—
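The line-by-line pattern matching can be sketched as follows. This is a hedged sketch: it assumes that each speech in the html file starts with a header line of the form `<A NAME=speech1><b>BERNARDO</b></a>`, which should be checked against a downloaded copy of the file, and the function name ‘count_speeches’ is chosen only for illustration. It takes any iterable of bytes lines so that it can be tried without a network connection.

```python
import re
from collections import defaultdict

# ASSUMED header format: <A NAME=speech1><b>BERNARDO</b></a>
SPEECH = re.compile(r'<a name="?speech\d+"?><b>(.+?)</b>', re.IGNORECASE)

def count_speeches(raw_lines):
    """Return [line count, speech count, speeches per speaker]."""
    linecnt = 0
    total = 0
    speakers = defaultdict(int)
    for raw in raw_lines:
        linecnt += 1
        # Lines arrive as bytes, so decode before matching
        m = SPEECH.search(raw.decode('utf-8', errors='replace'))
        if m:
            total += 1
            speakers[m.group(1)] += 1
    return [linecnt, total, speakers]
```

A ‘hamlet(url)’ function could then pass the response from ‘urllib.request.urlopen(url)’ straight to this helper, since that response can be iterated line by line.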
Problem 4
- in class, we discussed two different ways to represent a polynomial
- polylist, a ‘dense’ representation, that holds the coefficients in a list
- polydict, a ‘sparse’ representation, that holds (exponent, coefficient) pairs in a dict
- add a method, ‘topolydict()’ to class ‘polylist’, that converts the polylist into a polydict
- add a method, ‘topolylist()’ to class ‘polydict’, that converts the polydict into a polylist
- note that polylist->polydict will always work, but polydict->polylist can fail, because a polylist cannot represent negative exponents. in this case, raise a ValueError
- just to tell them apart, polylist prints with a leading ‘+’
Input:
pl1 = polylist([1, 2, 3])
pl2 = polylist([0, 10, 5])
pd1 = polydict({2:3, 1:2, 0:1})
pd2 = polydict({1:10, 2:5})
pd3 = polydict({-1:10, 2:5})
[pl1, pl2, pd1, pd2, pd3]
—|—
Output:
[+ 3 * X ** 2 + 2 * X + 1,
+ 5 * X ** 2 + 10 * X,
3 * X ** 2 + 2 * X + 1,
5 * X ** 2 + 10 * X,
5 * X ** 2 + 10 * X ** -1]
—|—
Input:
[pl1.topolydict(), pl2.topolydict(), pd1.topolylist(), pd2.topolylist()]
—|—
Output:
[3 * X ** 2 + 2 * X + 1, 5 * X ** 2 + 10 * X, + 3 * X ** 2 + 2 * X + 1, + 5 * X ** 2 + 10 * X]
—|—
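The two conversions can be sketched with stripped-down stand-ins for the classes from lecture. The attribute names ‘coeffs’ and ‘terms’ are assumptions, printing is omitted, and the list index is taken to be the exponent, as the example ‘polylist([1, 2, 3])’ printing as 3 * X ** 2 + 2 * X + 1 suggests.

```python
class polylist:
    def __init__(self, coeffs):
        self.coeffs = list(coeffs)        # coeffs[i] is the coefficient of X ** i

    def topolydict(self):
        # Keep only the nonzero terms; this direction always succeeds
        return polydict({e: c for e, c in enumerate(self.coeffs) if c != 0})

class polydict:
    def __init__(self, terms):
        self.terms = dict(terms)          # maps exponent -> coefficient

    def topolylist(self):
        if any(e < 0 for e in self.terms):
            raise ValueError('polylist cannot represent negative exponents')
        if not self.terms:
            return polylist([])
        # Fill the gaps between exponents with zero coefficients
        return polylist([self.terms.get(e, 0) for e in range(max(self.terms) + 1)])
```

With these stand-ins, converting ‘polydict({-1:10, 2:5})’ raises ValueError, while the other two polydicts from the example round-trip cleanly.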
Problem 5
- define the ‘__mul__’ method for polydict
Input:
[pl1, pl2, pd1, pd2, pd3, pd1 * pd2, pd1 * pd3, pd2 * pd3]
—|—
Output:
[+ 3 * X ** 2 + 2 * X + 1,
+ 5 * X ** 2 + 10 * X,
3 * X ** 2 + 2 * X + 1,
5 * X ** 2 + 10 * X,
5 * X ** 2 + 10 * X ** -1,
15 * X ** 4 + 40 * X ** 3 + 25 * X ** 2 + 10 * X,
15 * X ** 4 + 10 * X ** 3 + 5 * X ** 2 + 30 * X + 20 + 10 * X ** -1,
25 * X ** 4 + 50 * X ** 3 + 50 * X + 100]
—|—
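Multiplying two sparse polynomials amounts to multiplying every pair of terms, adding the exponents, and accumulating the coefficients. The sketch below uses a stripped-down stand-in for polydict; the attribute name ‘terms’ is an assumption and printing is omitted.

```python
from collections import defaultdict

class polydict:
    def __init__(self, terms):
        # Maps exponent -> coefficient; zero coefficients are dropped
        self.terms = {e: c for e, c in terms.items() if c != 0}

    def __mul__(self, other):
        product = defaultdict(int)
        for e1, c1 in self.terms.items():
            for e2, c2 in other.terms.items():
                # X ** e1 * X ** e2 == X ** (e1 + e2); like terms accumulate
                product[e1 + e2] += c1 * c2
        return polydict(product)

pd1 = polydict({2: 3, 1: 2, 0: 1})
pd2 = polydict({1: 10, 2: 5})
pd3 = polydict({-1: 10, 2: 5})
```

Negative exponents need no special handling here, because the dict keys are just integers that get added.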