完成Python基础算法和数据结构练习。
![Python](https://upload.wikimedia.org/wikipedia/commons/thumb/f/f8/Python_logo_and_wordmark.svg/250px-
Python_logo_and_wordmark.svg.png)
Brangelina
def brangelina(first, second):
—|—
The task of combining the first names of celebrity couples into a catchy
shorthand for media consumption turns out to be surprisingly simple to
automate. Start by counting how many groups of consecutive vowels (aeiou,
since to keep this problem simple, the letter y is always a consonant) there
are inside the first name. For example, ‘brad’ and ‘ben’ have one group,
‘sheldon’ and ‘britain’ have two, and ‘angelina’ and ‘alexander’ have four.
Note that a vowel group can contain more than one vowel, as in ‘juan’ or
‘queueing’.
If the first name has only one vowel group, keep only the consonants before
that group and throw away everything else. For example, ‘ben’ becomes ‘b’, and
‘brad’ becomes ‘br’. Otherwise, if the first word has n > 1
vowel groups,
keep everything before the second last vowel group n - 1. For example,
‘angelina’ becomes ‘angel’ and ‘alexander’ becomes ‘alex’. Concatenate that
with the string you get by removing all consonants at the beginning of the
second name.
All names given to this function are guaranteed to consist of the 26 lowercase
English letters only, and each name will have at least one vowel and one
consonant somewhere in it.
first | second | Expected result |
---|---|---|
‘brad’ | ‘angelina’ | ‘brangelina’ |
‘angelina’ | ‘brad’ | ‘angelad’ |
‘sheldon’ | ‘amy’ | ‘shamy’ |
‘amy’ | ‘sheldon’ | ‘eldon’ |
‘frank’ | ‘ava’ | ‘frava’ |
‘britain’ | ‘exit’ | ‘brexit’ |
(These simple rules do not always produce the best possible result. For | ||
example, ‘ross’ and ‘rachel’ meld into ‘rachel’ instead of ‘rochel’, and | ||
‘joey’ and ‘phoebe’ meld into ‘joebe’ instead of ‘joeybe’ . The reader is | ||
invited to think up more advanced rules that would cover a wider variety of | ||
name combinations and special cases. Some truly advanced set of rules that | ||
recognizes the semantic content of words would even combine ‘donald’ and | ||
‘hillary’ into ‘dollary’ instead of ‘dillary’ …) |
Line with most points
def line_with_most_points(points):
—|—
A point on the two-dimensional grid of integers is given as a two-element
tuple of x - and y - coordinates, for example (2, 5) or (10, 1). As originally
postulated by Euclid, for any two distinct points on the plane, there exists
exactly one straight line that goes through both points. (In differently
shaped spaces, different rules and their consequences apply.) Of course, this
same line, being infinite in both directions, will also go through an infinity
of other points on the plane.
Given a list of points on the integer grid, find the line that contains the
largest number of points from this list. To facilitate automated testing and
guarantee that the answer for each test case is unambiguous, this function
should not return the line itself, but merely the count of how many of these
points lie on that line. The list of points is guaranteed to contain at least
two points and all points in the list are distinct, but the points are
otherwise not given in any sorted order.
points | Expected result |
---|---|
[(42, 1), (7, 5)] | 2 |
[(1, 4), (2, 6), (3, 2), (4, 10)] | 3 |
[(x, y) for x in range(10) for y in range(10)] | 10 |
[(3, 5), (1, 4), (2, 6), (7, 7), (3, 8)] | 3 |
[(5, 6), (7, 3), (7, 1), (2, 1), (7, 4), (2, 6), (7, 7)] | 4 |
This problem can be brute forced with three nested loops, but the point (heh) | |
of this problem is not to do too much more work than you really need to. To | |
simplify your logic, consult the example program geometry.py for the cross | |
product function that can be used to quickly determine whether three points | |
are collinear. |
Wythoff array
def wythoff_array(n):
—|—
Wythoff array (see the Wikipedia article for illustration) is an infinite two-
dimensional grid of integers that is seeded with one and two to start the
first row. In each row, each element equals the sum of the previous two
elements, so the first row contains precisely the Fibonacci numbers.
The first element of each later row is the smallest integer c that does not
appear anywhere in the previous rows. (Since each row is strictly ascending,
you can find this out without having to compute an infinite number of
elements.) To compute the second element of that row, let (a, b) be the first
two elements of the previous row. If the difference c-a equals two, the second
element of that row equals b+3, and otherwise that element equals b+5. This
construction all by itself guarantees the Wythoff array to be an interspersion
of positive integers; every positive integer will appear exactly once in this
infinite grid, with no gaps or duplicates! (This result nicely highlights the
deeper combinatorial importance of the deceptively simple Fibonacci numbers as
building blocks of other integers and integer sequences.)
The difficulty in this problem is determining the first two elements in each
row after the first row, since once you know those first two values, the rest
of the row is utterly trivial to generate as far as you need. This function
should return the position of n inside the Wythoff array as a tuple of form
(row, column), both the row and column numbers starting from zero.
n | Expected result |
---|---|
21 | (0, 6) |
47 | (1, 5) |
1042 | (8, 8) |
424242 | (9030, 6) |
39088169 | (0, 36) |
39088170 | (14930352, 0) |
Sevens rule, zeros drool
def seven_zero(n):
—|—
Seven is considered a lucky number in Western cultures, whereas zero is what
nobody wants to be. To bring these two opposites briefly together, let us look
at positive integers that start with some solid sequence of sevens, followed
by some (possibly empty) solid sequence of zeros. For example, 7, 77777,
7700000, 77777700, or 70000000000000000. A surprising theorem proves that for
any positive integer n, there exist infinitely many integers of such seven-
zero form that are divisible by n. This function should find and return the
smallest such seven-zero integer.
This exercise is about efficiently generating and iterating through the
numbers of the constrained form of sevens and zeros, done in strictly
ascending order to guarantee finding the smallest possible working number.
This is perhaps best done with two nested loops. The outer loop should iterate
through the number of digits d in the current number. The inner loop should
iterate through all possible ways for k from one to d to create a number that
begins with a solid block of k sevens, followed by a solid block of d-k
zeroes.
n | Expected result |
---|---|
17 | 7777777777777777 |
42 | 7770 |
101 | 7777 |
103 | 7777777777777777777777777777777777 |
2**50 | 700000000000000000000000000000000000000000000000000 |
This problem is adapted from the excellent MIT Open Courseware online textbook | |
“Mathematics for Computer Science” (PDF link to the 2018 version for anybody | |
interested) that, like so many other non-constructive combinatorial proofs, | |
uses the pigeonhole principle to prove that some solution must exist for any | |
integer n, but provides no clue about the exact value of such solution. Also | |
as proven in that same textbook, whenever n is not divisible by either 2 or 5, | |
the smallest such number will always consist of some solid sequence of sevens | |
with no zero digits after them. Knowing this will speed up your search by an | |
order of magnitude for such friendly values of n. |
Om nom nom
def cookie(piles):
—|—
The beloved Sesame Street character Cookie Monster has stumbled upon a table
with piles of cookies, each pile a positive integer. However, the monomaniacal
obsessiveness of The Count who has set up this feast has recently escalated to
a whole new level of severity. To allow the Cookie Monster to enjoy this
crumbly fiesta, The Count insists that these cookies must be eaten in the
smallest possible number of moves. Each move consists of the Cookie Monster
choosing one of the remaining pile sizes p . This move removes p cookies from
every pile that contains at least p cookies (thereby eradicating all piles
whose size is exactly p ), and leaves all smaller piles untouched as they
were. This function should compute and return the smallest number of moves
that allows Cookie Monster to scarf down these cookies.
piles | Expected result | (optimal moves) |
---|---|---|
[1, 2, 3, 4, 5, 6] | 3 | [4, 2, 1] |
[2, 3, 5, 8, 13, 21, 34, 55, 89] | 5 | [55, 21, 8, 3, 2] |
[1, 10, 17, 34, 43, 46] | 5 | [34, 9, 8, 3, 1] |
[11, 26, 37, 44, 49, 52, 68, 75, 87, 102] | 6 | [37, 31, 15, 12, 7, 4] |
This problem has a curious property that the moves in the optimal solution can | ||
be performed in any order, which is rare for combinatorial problems of this | ||
spirit. Without loss of generality, it suffices to examine only strictly | ||
descending move sequences. Even then, this problem is far deeper than it would | ||
initially seem . Various greedy strategies that would seem natural for the | ||
addiction-riddled brain of the Cookie Monster such as “choose p to maximize | ||
the number of cookies eaten at the current move” or “choose p to eliminate the | ||
largest possible number of piles” (yep, think about that one) do not always | ||
lead to the shortest possible sequence of moves. |
Autocorrect for stubby fingers
def autocorrect_word(word, words, df):
—|—
In this day and age all you whippersnappers are surely familiar with
autocorrect that replaces a non-word with the “closest” real word in the
dictionary. Many techniques exist to guess more accurately what the user
intended to write. Since many people, such as your instructor, have stubby
fingers, many typos are caused by the user pressing a virtual key next to the
intended key. For example, when the non-existent word is “cst”, the intended
word is far more likely to be “cat” than “cut”, assuming that the text was
entered with an ordinary QWERTY keyboard.
Given a word, a list of words, and a distance function df that tells how “far”
the first argument is from the second (for example, df(‘a’,’s’) == 1 and
df(‘a’, ‘a’) == 0 ), find and return the word in the list of words that have
the same number of letters and whose distance, measured as the sum of the
distances of the characters in the same position, is the smallest. If there
exist several equidistant words, return the first word in the dictionary
order.
word | Expected result |
---|---|
‘qrc’ | ‘arc’ |
‘jqmbo’ | ‘jambo’ |
‘hello’ | ‘hello’ |
‘interokay’ | ‘interplay’ |
Advanced autocorrect algorithms use statistical techniques based on the | |
surrounding context to suggest the best replacement, since not all words are | |
equally likely to be the intended word, given the rest of the sentence. For | |
example, the misspelling ‘urc’ should probably be corrected very differently | |
in the sentence “The gateway consists of an urc of curved blocks” than in “The | |
brave little hobbit swung his sword at the smelly urc.) |
Uambcsrlne the wrod
def unscramble(words, word):
—|—
Smdboeoy nteoicd a few yreas ago taht the lretets isndie Eisgnlh wdors can be
ronmaldy slmecbrad wouhtit antifecfg tiehr rlaibdiatey too mcuh, piovredd that
you keep the frsit and the last lteters as tehy were. Gevin a lsit of words
gtuaraened to be soterd, and one serlcmbad wrod, tihs fctounin shulod rterun
the list of wrdos taht cloud hvae been the orgiianl word taht got seambclrd,
and of csorue retrun taht lsit wtih its wdros sterod in apcabihaetll oerdr to
enrsue the uaigntbmuiy of atematoud testing. In the vast maitjory of ceass,
this list wlil catnoin only one wrod.
word | Expected result (using the wordlist words_sorted.txt) |
---|---|
‘tartnaas’ | [‘tantaras’, ‘tarantas’, ‘tartanas’] |
‘aierotsd’ | [‘asteroid’] |
‘ksmrseah’ | [‘kersmash’] |
‘bttilele’ | [‘belittle’, ‘billette’] |
(Writing the filter to transform the given plaintext to the above scrambled | |
form is also a good little programming exercise in Python. This leads us to a | |
useful estimation exercise: how long would the plaintext have to be for you to | |
write such a filter yourself instead of scrambling the plaintext by hand as | |
you would if you needed to scramble just one word? In general, how many times | |
do you think you need to solve some particular problem until it becomes more | |
efficient to design, write and debug a Python script to do it? Would your | |
answer change if it turned out that millions of people around the world also | |
have that same problem?) |
Substitution words
def substitution_words(pattern, words):
—|—
Given a list of words (once again, guaranteed to be in sorted order and each
consist of the 26 lowercase English letters only) and a pattern that consists
of uppercase English characters, this function should find and return a list
of precisely those words that match the pattern in the sense that there exists
a substitution from uppercase letters to lowercase letters that turns the
pattern into the word. Furthermore, this substitution must be injective,
meaning that no two different uppercase letters in the pattern are mapped to
the same lowercase letter in that word.
For example, the word ‘akin’ matches the pattern ‘ABCD’ with the substitutions
A → a, B → k, C → i and D → n . However, the word ‘area’ would not match that
same pattern, since the pattern characters A and D would both have to be
mapped to the same letter a, against the rules.
pattern | Expected result (using the wordlist words_sorted.txt ) |
---|---|
‘ABBA’ | [‘abba’, ‘acca’, ‘adda’, ‘affa’, ‘akka’, ‘amma’, ‘anna’, ‘atta’, |
‘boob’, ‘deed’, ‘ecce’, ‘elle’, ‘esse’, ‘goog’, ‘immi’, ‘keek’, ‘kook’, | |
‘maam’, ‘noon’, ‘otto’, ‘peep’, ‘poop’, ‘sees’, ‘teet’, ‘toot’] | |
‘ABABA’ | [‘ajaja’, ‘alala’, ‘anana’, ‘arara’, ‘ululu’] |
‘CEECEE’ | [‘beebee’, ‘booboo’, ‘muumuu’, ‘seesee’, ‘soosoo’, ‘teetee’, |
‘weewee’, ‘zoozoo’] | |
‘ABCDCBA’ | [‘adinida’, ‘deified’, ‘hagigah’, ‘murdrum’, ‘reifier’, |
‘repaper’, ‘reviver’, ‘rotator’, ‘senones’] | |
‘DEFDEF’ | 76 words, first three of which are [‘akeake’, ‘atlatl’, ‘barbar’] |
‘ABCDEFGIJKLM’ | 243 words, first three of which are [‘absorptively’, |
‘adjunctively’, ‘adsorptively’] | |
(Extra math problem for the readers who are into combinatorics and discrete | |
math: can you construct an encoding function from strings to natural numbers | |
so that two strings map into the same number if and only if these two strings | |
both match the same pattern ?) |
Manhattan skyline
def manhattan_skyline(towers):
—|—
This classic problem in computational geometry (essentially, geometry that can
be done using only integer arithmetic; yes, that is an actual thing) is best
illustrated by pictures and animations such as those on the page “The Skyline
problem”, so you can first check that it out to get an idea of what is going
on. Given a list of rectangular towers as tuples (s, e, h) where s and e are
the start and end x - coordinates ( e > s
) and h is the height of that
tower, compute and return the total visible area of the towers, being careful
not to double count two or more towers that are partially overlapping. All
towers share the same flat ground baseline at height zero.
The classic solution illustrates the important sweep line technique that
starts by creating a list of precisely those x -coordinate values where
something relevant to the problem takes place. In this problem, the relevant x
-coordinates are those where some tower either starts or ends. Next, loop
through this list in ascending order, updating your computation for the
interval between the current relevant x -coordinate and the previous one. In
this particular problem, you need to maintain a list of active towers so that
tower (s, e, h) becomes active when x == s, and becomes inactive again when x
== e. During each interval, only the tallest active tower has any effect on
the computation.
towers | Expected result |
---|---|
[(2, 3, 39)] | 39 |
[(6, 8, 56), (5, 14, 81), (3, 13, 71)] | 871 |
[(6, 18, 95), (3, 20, 95), (14, 31, 22), (5, 12, 93)] | 1857 |
(The more complex versions of this classic chestnut ask for the silhouette | |
outline as a list of polygons, as on the linked page, and also toss the | |
restriction that all towers must lie on the same ground level line, or even be | |
axis-aligned.) |
Count overlapping disks
def count_overlapping_disks(disks):
—|—
Right on the heels of the previous Manhattan skyline problem, here is another
classic of similar spirit for us to solve efficiently with a sweep line
algorithm . Given a list of disks on the two-dimensional plane as tuples (x,
y, r) so that (x, y) is the center point and r is the radius of that disk,
count how many pairs of disks intersect each other in that their areas have at
least one point in common. To test whether disks (x1, y1, r1) and (x2, y2, r2)
intersect, use the Pythagorean formula (x2-x1)**2 + (y2-y1)**2 <= (r1+r2)**2
. (Note again how this precise formula uses only integer arithmetic whenever
all individual components are integers. And no square roots or some other
nasty irrational numbers.)
For this problem, crudely looping through all possible pairs of disks would
work but be horrendously inefficient as the list grows larger. However, a
sweep line algorithm can solve this problem by looking at a far fewer pairs of
disks. Again, sweep through the space from left to right for all relevant x -
coordinate values and maintain the set of active disks at the moment. Each
individual disk (x, y, r) enters the active set when the sweep line reaches
the x -coordinate x-r, and leaves the active set when the sweep line reaches
x+r . When a disk enters the active set, check for an intersection between
that disk and only the disks presently in the active set.
disks | Expected result |
---|---|
[(0, 0, 3), (6, 0, 3), (6, 6, 3), (0, 6, 3)] | 4 |
[(4, -1, 3), (-3, 3, 2), (-3, 4, 2), (3, 1, 4)] | 2 |
[(-10, 6, 2), (6, -4, 5), (6, 3, 5), (-9, -8, 1), (1, -5, 3)] | 2 |
[(x, x, x // 2) for x in range(2, 101)] | 2563 |