用C语言代写一个类似 Photoshop 的图像处理软件,能够对 PPM 格式对图像文件进行处理。
![Image](https://upload.wikimedia.org/wikipedia/commons/thumb/4/44/Checkerboard_identity.svg/240px-
Checkerboard_identity.svg.png)
Requirements
This is a programming assignment, meaning you will write and submit the source
code for a program. All programs must compile (and run) on the ugrad machines
using our standard compiler options.
From this point forward in the semester, your programs will need to not only
compile and run cleanly (i.e. with no warnings, errors, or crashes), but also
not have any memory leaks (as reported by valgrind).
NOTE: For this assignment, you will be working with images. This will mean
that you will want to view images, which will that if you want to work
remotely on the ugrad machines, you will need to set up X-tunnelling using
Xming or XQuartz (see the XWindows section of the tools reference page on
Piazza).
Also, PPM images can be quite large; be aware of file sizes, and try not to
fill up your hard drive (or your disk quota on ugrad) with thousands of cat
pictures…
Program Description
This program will be an image-manipulation program, in the vein of Photoshop.
Yours will have a command-line-based user interface (UI), so there will be no
graphical interface, and the range of operations will be limited, but the
algorithms you will use are the same ones used in programs like Photoshop or
GIMP.
At a basic level, the program will be able to read image files from disk,
perform one of a variety of image manipulation tasks, and then write the
result back to disk as a new image file. Since your program will not have a
GUI, we will use external programs to view the images. If you are on ugrad
(either locally, or remotely with X-tunnelling), you can use the program feh,
which is a very simple command-line image viewer; simply run the program with
the name of an image file as a command-line argument, and it will display the
image on your screen. If you are using a different platform, you are welcome
to use an image viewer of your choice; feh is easy to install using most linux
package managers, but there are other open source image viewing programs, as
well as alternatives for Windows and OSX.
While there are many formats for storing image files, your program will only
need to read and write one, the PPM format. This is essentially the simplest
and easiest format to read and write, which is why it was chosen; its main
drawback is that it does not use any kind of compression, so images stored in
this format tend to be on the large side when compared to formats like JPEG or
PNG or GIF.
Your program will be a command line tool, always run with the name of the
executable file project followed by the name of an input PPM file, the name of
a desired output PPM file, and the specific lower-case name of an operation,
as listed below. Some operations require additional arguments, which will also
be supplied at the command line by the user, at the end of the line. There is
no input entered by the user interactively.
Note that regardless of the desired operation, the first two arguments after
the executable name project are always interpreted as the input file name
followed by the output file name. The next argument is interpreted as the
operation name, and the operation’s arguments (if any) come after that.
The operations your program will be able to recognize and perform are all of
the following (the bolded words are the operation names to be entered at the
command line by the user):
- swap - swap color channels (i.e. what was red becomes blue, what was blue becomes green, etc.)
- bright - change brightness (up or down) by the given amount
- invert - invert the colors (i.e. black becomes white, white becomes black, etc.)
- gray - convert to grayscale (i.e. a full-color image becomes shades of gray)
- crop - crop the image given corner pixel locations
- blur - blur the image (smooth and soften its edges) by a specified amount
- edges - detect edges with intensity gradient above given threshold
For example, at the command prompt, a user of your program might type
./project nika.ppm nika-swap_once.ppm swap
to take as input a PPM file named nika.ppm and create an output file named
nika-swap_once.ppm which contains a swapped version of the nika image. Or, if
the desire is to crop the portion of the nika image with top left corner
(col=60, row=15) and bottom right corner (col=600, row=565), then the user
could type
./project nika.ppm nika-crop.ppm crop 60 15 600 565
to generate the output file named nika-crop.ppm.
When you first begin this assignment, we suggest that you aim to implement PPM
reading and writing first. As an initial test to be sure you’re on the right
track, read in a PPM file and write it out unchanged, and use the Unix diff
tool and visual inspection using the feh tool to confirm that your code works.
You’ll note when using diff that your output file doesn’t contain a comment
line that the original nika.ppm file does; but when viewing the image, you
should see no differences at all. Once this works well, begin successively
working through the other commands as listed.
Development Plan
For this assignment, the development plan is largely in your hands. You are
required to create and submit a Makefile for this project, and the executable
it generates should be named project. You should also rigorously test your
code, but you are not required to hand in specific tests with your submission.
Beyond that, it is largely up to you how to structure your program, and what
order to implement features in, but you must take a modular approach.
We recommend that you break your code into several files, and have as little
code as possible in main(). Here is a suggested (but not required) breakdown
of features into files:
- Makefile - you must include a Makefile that can build your program; it’s how the graders will compile your submission. You are required to build a target whose name is project.
- project.c - main program. The main() function should be extremely simple; it might literally call a single function, then return 0.
- ppm_io.c - contains implementations of functions for reading, writing, creating, destroying, copying, etc. images (using the PPM format) - see scaffolding folder in public repo.
- ppm_io.h - header for ppm i/o stuff (struct and function declarations) - see scaffolding folder in public repo.
- imageManip.c - contains implementations of all image manipulation algorithms
- imageManip.h - header for imageManip stuff (struct and function declarations, possibly some #defines)
Note that this is not a list of ALL the files that should be included in your
submission; it is only a listing of possible source code files.
We recommend that before you start coding, you make a development plan. This
means that you sit down and plan out (on paper) how you will break the program
down into modules (e.g. functions or groups of functions), what each module
will do, and how they will interact. Aim to write small, clean helper
functions for better readability and easier testing, as well as greater
reusability. Then, make a plan for what order you will implement the modules
in. You will also want to test your modules; it’s a good idea to use test-
driven design, which means that you will design tests for your functions
before you actually start trying to write the functions themselves.
For each module, it’s important to think about precisely what it should do,
and also how you can test it to be sure it’s doing what you want. There are
lots of ways of testing your code; for this assignment, a lot of your tests
will likely involve the visual inspection of output images to see if they look
the way they’re supposed to. Still, having an idea of how you’ll test each
piece before you start writing it (and then testing/fixing it before you move
on to the next one) will make your life a lot easier.
Scaffolding Folder
The scaffolding (i.e. starter code) folder for this project (available in the
public class repository) provides you with ppm_io.h, a header file containing
typedefs for structs to hold a pixel and an image, respectively. It also
supplies ppm_io.c, which contains code for writing an image to a PPM file, and
a Makefile for the project. Finally, it contains nika.ppm, an example PPM
image to work with (but you can use convert on ugrad to create more from your
own jpg files). In a subfolder named results, the starter folder includes the
PPM files which resulted in calling the operations displayed on this page
(this page is actually displaying the png versions). We encourage you to store
the provided nika.ppm image and all created images in a subfolder of your own
repository named data, to keep your images separate from your source code
files. You don’t need to submit any PPM files to us; keeping them in a
separate folder will help you avoid accidentally including them. TIP: if
you’re using the data subfolder, we suggest you execute your code from within
the data folder by typing ../project, so you can refer to input filenames
while the program is executing directly as nika.ppm, rather than data/
nika.ppm, saving yourself a lot of typing while testing.
Error Reporting
The approach your program will take for error reporting is to have your main
method return a 0 value indicating success or a positive value indicating
failure. Which positive value your program should return is indicated in the
table below. If more than one error condition occurs, your program should
return the error code listed earliest in the table below.
In addition to returning the specified value, your program should also output
an informative error message to stderr. (The text of the error messages will
not be specified; we’ll leave the exact wording up to you.)
Return value | Error condition it signifies |
---|---|
0 | No errors detected |
1 | Failed to supply input filename or output filename, or both |
2 | Specified input file could not be opened |
3 | Specified input file is not a properly-formatted PPM file, or reading |
input somehow fails | |
4 | No operation name was specified, or operation name specified was invalid |
5 | Incorrect number of arguments or kind of arguments specified for the |
specified operation | |
6 | Arguments for the specified operation were out of range for the given |
input image, or otherwise senseless | |
7 | Specified output file could not be opened for writing, or writing output |
somehow fails | |
8 | Any other error condition not specified above |
Implementation Details
This section contains detailed descriptions of formats, algorithms, and the
like that will be necessary for this assignment.
PPM image format
For this assignment, we will use a very simple image-file format called PPM.
It’s an uncompressed format, meaning that the images will take up a lot of
disk space (compared to JPG or PNG files), but it’s very easy to read and
write from C code (which is why we’re using it).
NOTE: you can use a unix program called convert
to convert between image
formats; e.g. to convert an existing file called “selfie.jpg” into a PPM, you
would type:
$ convert selfie.jpg selfie.ppm
This works for most image format file extensions; it converts to/from most
known image formats, including .jpg, .gif, .png, .ppm, .tiff, and .pdf, and is
installed on the ugrad machines. If it’s not installed on your local machine
(or virtual machine), most linux package managers can install it (or can
install ImageMagick, which is the suite of tools that convert
is part of).
Additionally, most image manipulation programs (e.g. Photoshop, GIMP, etc.)
can export to PPM format.
If you haven’t used GIMP, it’s a free, open-source image manipulation program
with good cross- platform support; try it out! It may be helpful to have
access to either GIMP or Photoshop in order to try things out to see what kind
of output you should expect from your program for various kinds of input
images, but keep in mind that these programs may not use exactly the same
algorithms we will here.
The PPM format itself is pretty simple (compared to most other image formats).
Basically, at the top of the file will be a special “tag” that marks the file
as a PPM; this should be P6. Then, there are three numbers, separated by
whitespace; these numbers represent the size of the image as columns, rows,
and colors. Columns and rows specify the width and height of the image (in
pixels) respectively. (BEWARE: columns come before rows in this format!)
Colors encodes how many different shades of each color a pixel can take on;
for this assignment, this number must always be 255 (you must reject any image
that uses a different value, but you’re unlikely to encounter one).
Immediately after the 255, the binary data encoding pixels will begin.
Optionally, there may be lines starting with a #, which are comments and
should be ignored; these may be intermixed with the above information. You
don’t need to store these; if you read a file and then re-write it, it’s fine
if the comments get lost. The files we test your code with, however, will have
either 0 or 1 comment lines just after the P6 tag, but no comment lines
between the other header values (see nika.ppm in the course public repo for an
example).
All of this will be ANSI text, so you can use the normal text functions (e.g.
fgetc(), fscanf(), fprintf() etc.) to read/write them.
After the color size specification, there will be a single whitespace
character (usually a newline, but that’s not guaranteed), after which the
remainder of the file will be the actual pixel values. Basically, each “pixel”
consists of three values; the first value is the “red” channel, the second
value is the “green” channel, and the third value is the “blue” value. Taken
together, these three values specify a single color, which is the color of
that pixel. Since the max color value is 255, each of these values will be in
the range 0-255, which fits exactly in one byte of memory.
The easiest way to read the pixel values is to create a struct that contains
three unsigned char variables, one per color channel. Then, create an array of
your pixel structs with rows * cols elements. At that point, you can just use
fread() to read the entire array of pixels from the file in one go. Similarly,
you can use fwrite() to write the whole pixel array with a single function
call. We’ve started this off for you in the provided ppm_io.h and ppm_io.c
files.
For the formal “official” description of the PPM format, see the netpbm site.
The starter folder includes one primary example image, called nika.ppm. We’ve
included a copy on this page (below), so you know what it should look like;
you should also be able to open it using feh or any other image program.
Again, you are welcome to work with this image, but you are encouraged to
convert images of your own to the PPM format so you have more than one test
image to work with.
Also note that the images embedded in this page are not actually PPM
formatted, since most web browsers don’t know how to read PPMs (and also PPMs
are kind of large for web use).
Swapping color channels
Swapping color channels is very simple; for each pixel, you just need to copy
each component’s value to the next component (note that this will require a
“temporary” value to store one of them). So, the old value for the green
component becomes the new value for the red component, the old value for the
blue component becomes the new value for the green component, and the old
value for the red component becomes the new value for the blue component.
If you apply swap to the nika.ppm image, the result should look like:
Adjusting brightness
Adjusting brightness is another per-pixel operation; in this case, you will
simply add some value to each channel of each pixel (if the number is
positive, this will brighten the image, if it’s negative, it will darken the
image). By adding the same amount to each channel, you won’t change the color,
just the brightness (i.e. dark green may become light green, but it won’t
become purple).
The only trick here is that you need to make sure you do “saturating” math,
meaning you don’t accidentally overflow (or underflow) your unsigned chars.
So, when you do your addition, you need to check the result and clamp the
result to the range [0, 255].
You will likely want to write a helper function to do the saturation; it would
likely take a double, clamp it to the appropriate range, and return the value
as an unsigned char. You can use this helper function for several of the later
operations as well.
If you apply the brightness transform to the nika.ppm image, the result should
look like:
Inverting colors
Inverting color values is very straightforward; simply take the value of each
component of each pixel, and calculate its “inverse” by subtracting its value
from 255. If you apply the invert transform to the nika.ppm image, the result
should be as shown below. If you invert that resulting photo, you should get
the original photo back.
Grayscale conversion
This is another fairly simple operation, but it does require a little math.
Basically, for each pixel, you will calculate a single grayscale value based
on the three color values, and then assign that same value to all three color
channels (if all three color channels have the same value, you know the pixel
will show up as some shade of gray).
For this program, we will use the NTSC standard conversion formula:
intensity = 0.30red + 0.59green + 0.11*blue;
If you look around online, you will discover that there are actually several
different formulas that can be used, which result in slightly different
results; please use the NTSC version for this assignment (see Wikipedia on
grayscale conversion). Note that once the value is calculated, you will need
to cast it back to an unsigned char before you assign it to each channel.
If you apply the grayscale transform to the nika.ppm image, the result should
look like:
Crop
Cropping an image is pretty simple; you just need to specify the two corners
of the section you want to crop. That will mean 4 integer values: the column
and then row of the upper-left corner, and the column and then row of the
lower-right corner. By looking at the differences between those values, you
can calculate the size of the new image; this will let you allocate the
correct amount of space for the pixel array. Once you’ve done that, you can
just use a loop to go through the pixels of the specified region in the
original image, and copy each component of each pixel to the new image.
If you crop the nika.ppm image from (top col=60, top row=15) to (bottom
col=600, bottom row=565), the result should look like:
Blurring an image
The blur operation is more complex than the previous operations, because you
need to consider more than one pixel at a time. At the simplest level, a blur
works by taking each pixel, and setting its value to some kind of average of
all the pixels in a small neighborhood around it. The simplest blur possible
would just set each pixel to the average of itself and the pixels adjacent to
itself (computed for each color channel separately). However, this kind of
blur isn’t as pretty as we might like. What we will do for this assignment is
similar, but more clever.
For the kind of blurring effect that a program like Photoshop would give you,
you will need to weight the importance of neighboring pixels according to a
Gaussian distribution. This is referred to as a “Gaussian blur” (Wikipedia),
and this is what you will implement for this assignment.
First, we need to create an NxN matrix that holds the values of a 2D
(symmetric) Gaussian distribution with a given variance (we assume 0-mean). N
should be big enough to be at least 10sigma positions wide (to span
approximately 5sigma positions in each direction), and N should always be an
odd number (so there’s an equal number of rows/columns on either side of the
center. If dx and dy store the two coordinates as offsets from the center
(i.e. delta-from-mean), then the Gaussian value can be calculated as:
In code, this might look like:
double g = (1.0 / (2.0 * PI * sq(sigma))) * exp( -(sq(dx) + sq(dy)) / (2 * sq(sigma)));
—|—
Note that you will need to include math.h and link with -lm to get the exp()
function. You will have to define sq() yourself (to square the argument,
either as a function or as a #defined macro), and PI (likely as a #defined
constant).
As an example, if sigma is 0.5, then we would get a 5x5 matrix like this:
0.000000 0.000029 0.000214 0.000029 0.000000
0.000029 0.011660 0.086157 0.011660 0.000029
0.000214 0.086157 0.636620 0.086157 0.000214
0.000029 0.011660 0.086157 0.011660 0.000029
0.000000 0.000029 0.000214 0.000029 0.000000
If sigma was 1.0, we would get an 11x11 matrix:
0.000000 0.000000 0.000000 0.000000 0.000000 0.000001 0.000000 0.000000 0.000000 0.000000 0.000000
0.000000 0.000000 0.000001 0.000007 0.000032 0.000053 0.000032 0.000007 0.000001 0.000000 0.000000
0.000000 0.000001 0.000020 0.000239 0.001072 0.001768 0.001072 0.000239 0.000020 0.000001 0.000000
0.000000 0.000007 0.000239 0.002915 0.013064 0.021539 0.013064 0.002915 0.000239 0.000007 0.000000
0.000000 0.000032 0.001072 0.013064 0.058550 0.096532 0.058550 0.013064 0.001072 0.000032 0.000000
0.000001 0.000053 0.001768 0.021539 0.096532 0.159155 0.096532 0.021539 0.001768 0.000053 0.000001
0.000000 0.000032 0.001072 0.013064 0.058550 0.096532 0.058550 0.013064 0.001072 0.000032 0.000000
0.000000 0.000007 0.000239 0.002915 0.013064 0.021539 0.013064 0.002915 0.000239 0.000007 0.000000
0.000000 0.000001 0.000020 0.000239 0.001072 0.001768 0.001072 0.000239 0.000020 0.000001 0.000000
0.000000 0.000000 0.000001 0.000007 0.000032 0.000053 0.000032 0.000007 0.000001 0.000000 0.000000
0.000000 0.000000 0.000000 0.000000 0.000000 0.000001 0.000000 0.000000 0.000000 0.000000 0.000000
Basically, the sigma parameter lets you control how strong the blur effect is;
the larger your sigma, the more blurry your image will become.
Once you have this Gaussian “filter” matrix, you’ll need to “convolve” it with
your image. That’s fancy math terminology that basically means you loop over
the pixels of your input image, and for each pixel, you place your filter on
your image centered at that pixel, and then for each element of the filter,
multiply it by the pixel value underneath that element. Then, you set the
value for the pixel of the output image to be the normalized sum of those
values. Basically, this is a fancy way of describing a weighted average; the
values in the matrix are the weights that get applied to the corresponding
pixels of the input image. To normalize, you just sum up the values in the
filter matrix, and divide the sum of the weighted pixel values by it; that
ensures that you’re not brightening or darkening the image when you blur it.
You will also need to be careful near the edges of the image, since the parts
of filter may extend past the border of the image. In your calculations, just
skip the filter positions that hang off the edge. (This is why the
normalization is important even if the values in your matrix sum to 1).
You will blur all three color channels in this fashion.
You will likely want to implement this in several functions. For instance, you
may want a function that generates a Gaussian matrix of a given size and
variance, a function to find the filter response for one pixel of the input
image, and finally a function that calls the first function, then loops over
the pixels of the input image and calls the second function for each pixel in
order to generate the output image.
If you apply the blur transform with different sigma values to the nika.ppm
image, the result should look like the images shown below. (Note that blur may
be a bit slow, and the larger a radius you use, the slower it will be.)
Edge Detection
While edge detection algorithms can get quite sophisticated, it turns out that
one of the best edge detectors today, the “Canny edge detector”, (Wikipedia)
follows a relatively simple algorithm. Here, you will implement a few stages
of the original Canny edge detector.
Since for edge detection we only really care about the intensity change of an
image as it shifts from the boundary of one object to another; we need to
first convert our image to grayscale. Next, in order to reduce the number of
false positives our edge detector picks up, we need to denoise the image by
applying a Gaussian filter. This, in fact, is exactly the same as the blur
function you are asked to implement for this assignment! Once we’ve denoised
the image we can go ahead and compute the intensity gradient for each interior
point (i.e. points not on the boundary) of the image in both the horizontal
(x) and vertical (y) directions.
Once you’ve calculated the magnitude of the intensity gradient for each pixel;
you can perform the final step of thresholding each pixel and classifying it
as an edge or not an edge. Specifically, you should set the value of all
channels to 0 (black) if the magnitude exceeds the threshold, else set the
values to 255 (white). You can ignore the boundary points and leave them as
they are.
If you apply the edge detection transform to the nika.ppm image with the sigma
and threshold arguments shown below, the results should look as shown below:
Submission
Remember that programs which do not compile (with standard compiler flags on
the ugrad machines) will not receive credit. Additionally, points will be
deducted for any compiler warnings. Points will also be deducted for any
warnings, errors, or memory leaks reported by valgrind. All executables should
be buildable using a Makefile, and should build and run cleanly.
Submit your project via Gradescope. Your submission should contain all source
code and files necessary to compile your program (including a Makefile) as
well as a README file (which includes both partner names and JHEDs) and a git
log file from your shared midterm project repo. The log should indicate that
all team members were contributing code and pushing their contributions to the
repository. Your submission should not contain any compiled binaries
(executables or object files), or any testing-related files (in particular,
please do not submit any image files).
The requirements for your git log are the same as in previous assignments,
except note that we expect both members of your team to be contributing
commits to your shared midterm project repo.
Only one team member should submit the project on Gradescope. The same team
member should submit all versions of the project in his/her account. Be sure
that source file headers include the names and JHEDs of all team members, so
that each student gets credit for this work.
Formative and Summative Assessment Goals
This project builds on all the previous assignments, as well as in-class work,
and it will make use of several new features, including pointers and dynamic
memory allocation. It is significantly larger in size and complexity than
earlier programming homework assignments, and will be completed with a partner
over a period of nearly three weeks.
The summative assessment goal is to evaluate your growing skill as a C
programmer, and your ability to program in a principled way as program
complexity grows. As always, this involves both C-specific skills like code-
writing and debugging, and broader programming skills like problem solving,
planning, design, algorithmic thinking, testing, teamwork, etc.
As program size and complexity continues to grow, this will also have the
effect of assessing your time management and planning skills; while these
skills are not directly related to your understanding of CS concepts, they are
important skills in their own right. Most assignments for most classes build
and assess these skills to some extent, but the longer duration assignments
for this course will do so more rigorously than most.
The formative assessment goal is to build your programming skills,
particularly relating to pointers and dynamic allocation. This assignment is
also the first assignment designed to build skill with maintaining and using a
codebase over a longer period of time; this skill will become increasingly
important over the second half of the course.
This assignment also gives you a bit more freedom than the earlier ones, in
that you are allowed (and expected) to come up with your own tests. This means
you have greater control, but it also means that it is up to you to ensure
that your functions work correctly and robustly; you cannot merely program to
the tests that have been provided.
As with all programming assignments in this course, this assignment will both
build and test your knowledge of the C programming language, your ability to
work in a team, your knowledge of sound coding practices and stylistic
conventions, your ability to plan, design, and carry out an incremental
development plan, your ability to test and validate a program, and your
ability to debug a program.