Parse a text file with Python and extract the content of its non-comment lines.
Requirement
The following content is from a file named “instance10_001.txt” (which is
posted in eClass under Week 1):
#instance10_001.txt
#area [0, MAX_X] x [0, MAX_Y]
100 100
#number of points NUM_PT
10
#coordinates
0 0
0 90
70 100
100 50
30 30
30 70
70 70
70 30
50 50
45 0
#end of instance
This file describes 10 points in the two-dimensional plane, within the
rectangular area [0, MAX_X] x [0, MAX_Y]. Every line starting with the symbol #
is a comment; the first non-comment line contains the integer values for MAX_X
and MAX_Y, which are 100 and 100 in this file (the range for each of them is
[1, 1000]); the second non-comment line contains the number of points NUM_PT,
which is 10 in this file (the range is [1, 1000]); the remaining non-comment
lines give the integer x- and y-coordinates of the points, one point per line,
in no particular order.
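As a rough illustration of the format described above, the following Python sketch drops the comment lines and parses what remains. The names parse_instance and SAMPLE are made up for this example, and skipping blank lines and tolerating whitespace before the # are assumptions the description does not address.

```python
def parse_instance(text):
    """Parse instance text into (max_x, max_y, points).

    Raises ValueError when the non-comment lines do not follow the format.
    """
    # Keep only non-blank lines that are not comments, split into tokens.
    # Assumption: blank lines are skipped and '#' may follow leading spaces.
    rows = [line.split() for line in text.splitlines()
            if line.strip() and not line.strip().startswith("#")]

    if len(rows) < 2 or len(rows[0]) != 2 or len(rows[1]) != 1:
        raise ValueError("missing or malformed MAX_X/MAX_Y or NUM_PT line")

    # int() raises ValueError for tokens that are not integers.
    max_x, max_y = int(rows[0][0]), int(rows[0][1])
    num_pt = int(rows[1][0])
    if not all(1 <= value <= 1000 for value in (max_x, max_y, num_pt)):
        raise ValueError("MAX_X, MAX_Y and NUM_PT must be in [1, 1000]")

    coord_rows = rows[2:]
    if len(coord_rows) != num_pt or any(len(r) != 2 for r in coord_rows):
        raise ValueError("expected NUM_PT lines with two integers each")

    points = [(int(x), int(y)) for x, y in coord_rows]
    return max_x, max_y, points


# A shortened, hypothetical instance with three points.
SAMPLE = """#area [0, MAX_X] x [0, MAX_Y]
100 100
#number of points NUM_PT
3
#coordinates
0 0
0 90
70 100
#end of instance
"""

max_x, max_y, points = parse_instance(SAMPLE)
print(max_x, max_y, points)
```

The sketch only covers the structure and the value ranges stated so far; it does not check whether points lie inside the area or whether any point is repeated.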
In all three assignments, the input files follow this format, except that the
values of the variables can differ and the comment lines can be missing. Each
file is called an instance. The file name convention is to start with
“instance”, followed by the number of points in the instance, then an
underscore, the index of the instance among those having the same number of
points, and lastly the file suffix “.txt”. That is, “instanceXXX_YYY.txt” is
the YYY-th instance having XXX points (YYY may or may not be left-padded with
0’s); for example, the above “instance10_001.txt” is the first instance
having 10 points.
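The naming convention can be expressed as a regular expression. The sketch below is only an illustration: the names INSTANCE_NAME and parse_instance_name are hypothetical, and it assumes that both XXX and YYY consist of decimal digits only.

```python
import re

# "instance", the point count XXX, an underscore, the index YYY, then ".txt".
INSTANCE_NAME = re.compile(r"instance(\d+)_(\d+)\.txt")


def parse_instance_name(filename):
    """Return (num_points, index) for an instance filename, or None."""
    match = INSTANCE_NAME.fullmatch(filename)
    if match is None:
        return None
    # int() drops any left-padding zeros, so "001" becomes 1.
    return int(match.group(1)), int(match.group(2))


print(parse_instance_name("instance10_001.txt"))
```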
The following list contains the specifications for Assignment #1 (10 marks in
total):
- Write a single program with multiple functionalities (i.e. objectives), using the command-line options. Suppose your program name is “myprogram”. If a command for running your program is incorrect (such as invalid options), your program prints out the following and then quits (functionality #1):
>myprogram [-i inputfile [-o outputfile]]
- One functionality (functionality #2) of your program is to read in the content of an instance file. To read in the file “instance10_001.txt” you will execute the command:
>myprogram -i instance10_001.txt
Here “-i” is the command-line option that indicates the succeeding argument is
the input filename.
- Your program will check the correctness of the file content (functionality #3) during reading. That is, it checks that the first non-comment line contains two integer values for MAX_X and MAX_Y, that the second non-comment line contains an integer value for the number of points NUM_PT (this number also appears in the filename; your program does not need to validate this, but must always use the number read in from the file), and that there are exactly NUM_PT more lines, each containing two integer values for the x- and y-coordinates of a point. Your program also makes sure that no coordinate is outside the specified rectangular area and that there are no duplicate points in the instance.
If your program encounters an error, it reports “Error in reading the
instance file!” and quits; otherwise, it continues to the next item.
- With the correctness checked, your program will print out (functionality #4) the non-comment lines of the input file to the screen, when using the command:
>myprogram -i instance10_001.txt
If an output filename is specified, using either of the following commands:
>myprogram -i instance10_001.txt -o output.txt
>myprogram -o output.txt -i instance10_001.txt
where “-o” is the command-line option that indicates the succeeding argument
is the output filename, then, instead of being printed to the screen, all the
non-comment lines of the input file are written into the file “output.txt”.
- If your program is not given an input file, that is, when it is run with the following command:
>myprogram
then your program will generate several instances (functionality #5) through a user interface as follows:
Your program will generate 7 instances in total, written into 7 separate files
named “instance10_j.txt”, for j = 1, 2, …, 7, respectively.
Each instance has the rectangular area [0, 100] x [0, 200], and has 10 points.
The coordinates of each point are generated uniformly at random within the
rectangular area, and your program makes sure there are no duplicate points
within each instance. If it is impossible for your program to generate these
files, it prints out “Error in generating instances!” and quits. All these
files are saved in the current directory in which the command is executed, and
your program prints the following to the screen:
instance10_1.txt generated
instance10_2.txt generated
instance10_3.txt generated
instance10_4.txt generated
instance10_5.txt generated
instance10_6.txt generated
instance10_7.txt … done!
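The instance generation in the last item could be sketched in Python roughly as follows. This is one possible approach, not a reference solution. All function names and parameters here are hypothetical, and the sketch assumes that coordinates are integers (as in the input format) and that generated files contain no comment lines. It prints “generated” for every file and does not attempt to reproduce the elided final line of the listing above.

```python
import os
import random


def generate_points(max_x, max_y, num_pt, rng=random):
    """Pick num_pt distinct integer points from [0, max_x] x [0, max_y]."""
    width = max_x + 1              # number of integer x values
    total = width * (max_y + 1)    # number of integer points in the area
    if num_pt > total:
        raise ValueError("more points requested than the area contains")
    # Sampling cell indices without replacement rules out duplicates.
    cells = rng.sample(range(total), num_pt)
    return [(cell % width, cell // width) for cell in cells]


def format_instance(max_x, max_y, points):
    """Return the non-comment lines of an instance as one string."""
    lines = [f"{max_x} {max_y}", str(len(points))]
    lines += [f"{x} {y}" for x, y in points]
    return "\n".join(lines) + "\n"


def generate_instances(count=7, max_x=100, max_y=200, num_pt=10,
                       directory="."):
    """Write instance files into directory and return their names."""
    names = []
    for j in range(1, count + 1):
        name = f"instance{num_pt}_{j}.txt"
        points = generate_points(max_x, max_y, num_pt)
        with open(os.path.join(directory, name), "w") as handle:
            handle.write(format_instance(max_x, max_y, points))
        print(f"{name} generated")
        names.append(name)
    return names


def main():
    try:
        generate_instances()
    except (OSError, ValueError):
        print("Error in generating instances!")
        raise SystemExit(1)  # the exit status is an assumption
```

Calling main() writes the seven files into the current directory. Sampling indices over the whole grid avoids a retry loop for duplicates; a loop that redraws a point whenever it is already in a set would be an equally reasonable choice.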