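The assignment to be written this time is a fairly unusual one: it uses Python to produce front-end Web pages, and the GUI part must be implemented with the Tkinter library.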
Motivation
Hardcopy periodicals such as newspapers, magazines, newsletters, etc. are all
in decline as people increasingly turn to online media. Nonetheless, there is
still a need for people to access regularly-updated information in an easy-to-
read format. Here you will develop a program that produces a customised
periodical in HTML format, using data downloaded from the World-Wide Web. The
program will have a Graphical User Interface that allows the user to control
production of the periodical, which can then be viewed in a standard web
browser. Most importantly, your publication will comprise up-to-date data
sourced from online “feeds” that are updated on a regular basis. To complete
this assignment you will need to: (a) download web pages in Python and use
regular expressions to extract particular elements from them, (b) create an
HTML file containing the extracted elements, and (c) use Tkinter to provide a
simple Graphical User Interface.
Illustrative Example
For the purposes of this task you have a totally free choice of what kind of
periodical to produce. It could be:
- a newspaper
- a current affairs magazine
- a fashion/lifestyle magazine
- a newsletter for online gamers
- a sports journal
- a science and technology review
- etc.
However, whatever theme you choose, you must be able to find at least four
different online web pages that contain regularly-updated stories or articles
in different categories under the overall theme. Each such story must contain
a heading, a photograph, some text and a publication date. A good source for
such data is Rich Site Summary (RSS) web-feed documents. The appendix below
lists some such sites, but you are encouraged to find your own sources that
match your personal interests.
To demonstrate the idea, we will publish our own newspaper, using data
extracted from News Limited’s web site. Our demonstration program allows users
to select from several categories: National News, Sports, World News, Business
News, Entertainment and Technology. The program then downloads relevant data
from the Web and uses it to produce an HTML document which can be read in a
standard web browser.
The screenshot below shows our example solution’s GUI when it first starts.
The user is invited to select which categories of information they want
included in their newspaper. In this case this is done by selecting check
buttons, but other solutions are possible. Below the user has selected four
news categories of interest.
When ready the user then presses the button to start “printing” the newspaper
(i.e., to create an HTML file containing its contents). The system downloads
current data from the web site and generates the file. The user can follow the
“printing” process’s progress in the small text window.
As well as printing the latest top news items in each of the four categories,
the system also generates a “masthead” which identifies the periodical.
Once the file has been created the user can open it in their preferred web
browser. Alternatively, pressing the “Read” button in the GUI above will open
the file in the host operating system’s default browser.
The generated document contains the masthead and the current top story in each
of the selected categories. It is shown overleaf as viewed in the Firefox
browser.
Above you can see the masthead with the name of the periodical, The Daily
Planet in this case, and an image indicating the nature of its contents. (Our
fictional newspaper’s slogan and editor are also shown, but these features are
optional.)
Scrolling down in the HTML document shows the current top news item in each of
the four selected categories when the program was run. Three of these are
shown below, as they were when this demonstration was run. (Some of the images
downloaded at this time were small “thumbnails”, hence their blurry appearance
when enlarged.)
Notice that each top news item displayed above contains:
- the category of story;
- the URL where the original data was found;
- the story’s title (headline);
- a photo illustrating the story;
- a short summary of the story; and
- the date and time the story appeared online.
Most importantly, the last four of these items are all extracted “live” from the online web
document indicated. This was done by downloading the HTML source and using
regular expressions to find the elements needed to construct our own
version of the story. The first part of the HTML code generated by our Python
program is shown below (as displayed in the Firefox browser).
Although not intended for human consumption, the generated HTML code is
nonetheless laid out neatly, and with comments indicating the purpose of each
part.
To compose our HTML document, Rich Site Summary (RSS) web-feed files are
downloaded from the web site. RSS documents are XML files specifically
intended to be machine-readable. They have a simple structure that makes it
reasonably easy to extract their elements. An example of such a web document
as it appears when examined in a web browser is shown below.
This was the source of the data used to produce our National News story
shown above. To compose the corresponding page for our newspaper we extracted
the latest story’s headline, story text, date and the address of the
associated JPEG image. This data was then integrated into our HTML code.
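The extraction step described above can be sketched as follows. This is only a minimal illustration, assuming Python 3 and a simplified feed: the SAMPLE_RSS text, the tag names it uses, and the helper names first_match and extract_latest_story are all invented here, and real feeds may lay out their items differently. The search is restricted to the first item so that the channel's own title is skipped and all four elements come from the same story.

```python
import re

# Hypothetical RSS fragment, invented for illustration; real feeds differ.
SAMPLE_RSS = """<rss><channel>
<title>Example Feed</title>
<item>
<title>First headline</title>
<description>Summary of the first story.</description>
<pubDate>Mon, 01 Jan 2018 09:30:00 GMT</pubDate>
<enclosure url="http://example.com/first.jpg" type="image/jpeg" />
</item>
<item>
<title>Second headline</title>
<description>Summary of the second story.</description>
<pubDate>Sun, 31 Dec 2017 18:00:00 GMT</pubDate>
<enclosure url="http://example.com/second.jpg" type="image/jpeg" />
</item>
</channel></rss>"""


def first_match(pattern, text):
    """Return the first captured group for pattern in text, or None."""
    match = re.search(pattern, text, re.DOTALL)
    return match.group(1).strip() if match else None


def extract_latest_story(rss_text):
    """Extract each element of the first <item> separately."""
    # Isolate the first item so every element belongs to the same story.
    item = first_match(r'<item>(.*?)</item>', rss_text)
    if item is None:
        return None
    return {
        'headline': first_match(r'<title>(.*?)</title>', item),
        'summary': first_match(r'<description>(.*?)</description>', item),
        'date': first_match(r'<pubDate>(.*?)</pubDate>', item),
        'image': first_match(r'<enclosure[^>]*url="([^"]+)"', item),
    }


story = extract_latest_story(SAMPLE_RSS)
print(story['headline'])
```

Because each pattern describes the surrounding tags and not the story text itself, the same code should keep working when the feed is updated with new stories.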
We also discovered that sometimes the downloaded text contained unusual
characters that are not handled properly in Python strings, most notably
“smart” quotes, so we replaced these with plain characters before “printing”
our newspaper.
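One possible way to do this replacement is sketched below. It assumes Python 3 and that the downloaded text has already been decoded into a string; the REPLACEMENTS table and the function name plain_text are illustrative choices, and the table is not exhaustive.

```python
# "Smart" punctuation mapped to plain equivalents (illustrative, not exhaustive).
REPLACEMENTS = {
    '\u2018': "'",   # left single quotation mark
    '\u2019': "'",   # right single quotation mark
    '\u201c': '"',   # left double quotation mark
    '\u201d': '"',   # right double quotation mark
    '\u2013': '-',   # en dash
    '\u2014': '-',   # em dash
}


def plain_text(text):
    """Replace each listed character in text with its plain counterpart."""
    for fancy, plain in REPLACEMENTS.items():
        text = text.replace(fancy, plain)
    return text


print(plain_text('\u201cIt\u2019s here,\u201d she said.'))
```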
Requirements and marking guide
To complete this task you are required to develop an application in Python
similar to that above, using the provided publisher.py template file as your
starting point. Your solution must support at least the following features.
Generating a masthead
Your program must be able to generate an HTML file, publication.html, which
begins with a ‘masthead’ identifying the nature of your periodical. When
viewed in a web browser, the masthead part of the document must contain at
least the following elements:
- The name of the periodical.
- An image evocative of the periodical’s theme.
The image must be sourced from online (you cannot attach image files to your
solution). Since it will never change, the URL for this particular image can
be “hardwired” in your Python code. The HTML source generated by your Python
program must be laid out neatly.
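A masthead generator might look something like the sketch below. It assumes Python 3; the function names, the FOOTER constant, and the image address are made up for illustration (the address is a placeholder, not a real image), and the layout is just one of many acceptable designs.

```python
def make_masthead(name, image_url):
    """Return neatly indented, commented HTML for the top of the periodical."""
    return f"""<!DOCTYPE html>
<html>
  <head>
    <meta charset="utf-8">
    <title>{name}</title>
  </head>
  <body>
    <!-- Masthead: name of the periodical and a themed image -->
    <h1>{name}</h1>
    <img src="{image_url}" alt="{name} masthead image">
"""


# Closing tags, to be written after the stories have been added.
FOOTER = """  </body>
</html>
"""


def write_publication(html_text, path='publication.html'):
    """Save the finished HTML text to the given file."""
    with open(path, 'w', encoding='utf-8') as output:
        output.write(html_text)


# Placeholder URL; a real solution would hardwire a genuine online image.
page = make_masthead('The Daily Planet', 'http://example.com/masthead.jpg') + FOOTER
print(page)
```

Calling write_publication(page) would then save the document as publication.html in the current folder.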
Generating four stories
Your Python program must be capable of generating at least four distinct
“stories” as part of your periodical. Each such story must be derived from a
different online web page, and must represent the latest story in a particular
category at the time when the program runs. When viewed in a web browser, each
story must contain at least the following elements:
- the category of story,
- the URL where the original data was found,
- the story’s title (headline),
- an image illustrating the story,
- a short summary of the story, and
- the date and time the story appeared online.
The last four of these items must all be extracted from the online document
and must all belong together (i.e., you can’t have an image from one story and
the headline from another). Each of the elements must be extracted from the
original document separately. It is not acceptable to simply copy large chunks
of the original document’s source code. The HTML source code generated by your
Python program must be laid out neatly.
The precise visual layout, colour and style of the story elements is up to you
and is determined by the design of your generated HTML code. The periodical
must be easy to read. No HTML markup tags or other odd characters should
appear in any of the text displayed to the user.
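To keep stray markup out of the displayed text, the extracted strings can be cleaned before they are written to the HTML file. The sketch below is one way to do so, assuming Python 3 and that the standard html module is available to you (the provided template may not import it, in which case a few calls to str.replace would be needed instead). The function name clean_text is hypothetical.

```python
import html
import re


def clean_text(raw):
    """Remove markup tags and decode character entities in extracted text."""
    # Unwrap CDATA sections, which some feeds place around descriptions.
    no_cdata = re.sub(r'<!\[CDATA\[(.*?)\]\]>', r'\1', raw, flags=re.DOTALL)
    # Decode entities such as &amp; and &#39; into ordinary characters.
    decoded = html.unescape(no_cdata)
    # Drop anything that looks like an HTML tag.
    no_tags = re.sub(r'<[^>]+>', '', decoded)
    # Collapse runs of whitespace left behind by the removed tags.
    return ' '.join(no_tags.split())


summary = clean_text("<![CDATA[<p>Tom &amp; Jerry&#39;s <b>big</b> day</p>]]>")
print(summary)

# Re-escape the cleaned text before inserting it into generated HTML.
print(html.escape(summary, quote=False))
```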
Data on the web changes frequently, so your solution must continue to work
even after the web documents you use have been updated. For this reason it is
unacceptable to “hardwire” your solution to the particular text and images
appearing on the web on a particular day. Instead you will need to use text
searching functions and regular expressions to actively find the text and
images in the document, regardless of any updates that may have occurred since
you wrote your program.
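In practice this means fetching a fresh copy of each document every time the program runs and only then searching it. A minimal download helper is sketched below, assuming Python 3's urllib.request (the provided template may import a different module) and a feed encoded as UTF-8. The function name download and the ten-second timeout are arbitrary choices for illustration.

```python
from urllib.request import urlopen


def download(url, timeout=10):
    """Return the document at url as text, or None if it cannot be fetched."""
    try:
        with urlopen(url, timeout=timeout) as response:
            raw = response.read()
    except OSError:
        # URLError, timeouts and connection failures are all OSError subclasses.
        return None
    # Undecodable bytes are replaced rather than raising an error.
    return raw.decode('utf-8', errors='replace')
```

Returning None on failure lets the calling code report the problem in the GUI instead of crashing partway through "printing".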
Providing an intuitive Graphical User Interface
Your program must provide an easy-to-use GUI. This interface must offer at
least the following capabilities to its user:
- It must allow the user to select up to four (or more) categories of story to print in the periodical. Any mechanism can be provided for selecting stories as long as it is intuitive and easy to use, e.g., push buttons, check buttons, menus, “spinboxes”, etc.
- It must allow the user to choose to “print” their periodical, i.e., to generate the publication.html file.
- It must visually indicate to the user the progress being made on downloading data and generating their selected stories. Any clear mechanism can be used for doing so, e.g., a textual description, a progress bar, highlighting of GUI elements, etc.
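The three capabilities above could be arranged roughly as in the following sketch. It assumes Python 3 (where the module is named tkinter); the feed addresses are placeholders, the names chosen_categories and run_gui are hypothetical, and the actual downloading and HTML generation are left as a comment. Tkinter is imported inside run_gui only so that the helper function can be tried out on a machine without a display.

```python
# Hypothetical categories and feed addresses, for illustration only.
FEEDS = {
    'National News': 'http://example.com/national.rss',
    'Sports': 'http://example.com/sports.rss',
    'World News': 'http://example.com/world.rss',
    'Technology': 'http://example.com/technology.rss',
}


def chosen_categories(flags):
    """Return the category names whose flag is set, in display order."""
    return [name for name in FEEDS if flags.get(name)]


def run_gui():
    """Build and run a small window with check buttons and a Print button."""
    import tkinter as tk

    window = tk.Tk()
    window.title('The Daily Planet')

    # One check button per category, each backed by its own variable.
    variables = {}
    for name in FEEDS:
        variables[name] = tk.BooleanVar(window, value=False)
        tk.Checkbutton(window, text=name,
                       variable=variables[name]).pack(anchor='w')

    progress = tk.Label(window, text='Choose categories, then press Print')

    def print_periodical():
        """Report progress for each selected category in turn."""
        flags = {name: var.get() for name, var in variables.items()}
        for name in chosen_categories(flags):
            progress['text'] = 'Downloading ' + name + '...'
            window.update_idletasks()  # redraw the label immediately
            # Downloading and story generation would go here.
        progress['text'] = 'Done'

    tk.Button(window, text='Print', command=print_periodical).pack()
    progress.pack()
    window.mainloop()
```

Calling run_gui() starts the interface. A label is used for progress here; a text window or progress bar would serve equally well.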
Code quality and presentation
Your Python program code must be presented in a professional manner. See the
coding guidelines in the IFB104 Code Presentation Guide (on Blackboard under
Assessment) for suggestions on how to achieve this. In particular, each
significant code segment must be clearly commented to say what it does, e.g.,
“Create the masthead”, “Extract the first headline from the web page’s source
code”, etc.
Extra feature
Part B of this assignment will require you to make a ‘last-minute extension’
to your solution. The instructions for Part B will not be released until just
before the final deadline for Assignment 2.
You can add other features if you wish, as long as you meet these basic
requirements. For instance, in our example above we included a button in the
GUI which opened the generated HTML document in the default web browser. We
also supported more than four story categories.
You must complete the task using only basic Python features and the modules
already imported into the provided template. In particular, you may not import
any local image files. All displayed images and story text must be downloaded
from online sources each time your program is run.
However, your solution is not required to follow precisely our example shown
above. Instead you are strongly encouraged to be creative in your choices
of stories to display, the design of your Graphical User Interface, and the
design of your periodical.