Operating systems assignment: use the PAPI library to optimize code
according to the memory hierarchy.
![Memory hierarchy](https://upload.wikimedia.org/wikipedia/commons/thumb/0/0c/ComputerMemoryHierarchy.svg/300px-ComputerMemoryHierarchy.svg.png)
Problem 1: Node Profile, Edge Profile, and Path Profile
Instrument the function “func” of program prof.c to gather the node profile,
the edge profile and the path profile of the function. Treat the “for” loops
in the function “func” as nodes, i.e., you don’t need to go in the for loops
to generate paths. The “func” function has a number of arguments. The
corresponding main function provides two possible input sets for those
arguments. You should do the profiling for both input sets. You need to submit
three versions of the program, each version measuring one type of profile.
Draw the CFG (control-flow graph) of the “func” function and include it in the
report.
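
As a rough illustration of what the three kinds of instrumentation record, here is a minimal sketch on a made-up function. Everything in it is an assumption for illustration: `toy_func`, the node labels `A` to `G`, and the counter arrays are hypothetical and are not taken from prof.c. The sketch also keeps all three profiles in one function for brevity, whereas the assignment asks for three separate versions. The path ID is built by adding a fixed increment on selected edges so that each acyclic path gets a unique number, in the spirit of Ball-Larus path numbering.

```c
/* Hypothetical CFG with two consecutive if/else branches:
 *   A -> B or C -> D -> E or F -> G
 * Four acyclic paths exist: 0 = A,C,D,F,G   1 = A,C,D,E,G
 *                           2 = A,B,D,F,G   3 = A,B,D,E,G   */
enum { A, B, C, D, E, F, G, NUM_NODES };
enum { NUM_PATHS = 4 };

long node_count[NUM_NODES];             /* node profile: visits per node   */
long edge_count[NUM_NODES][NUM_NODES];  /* edge profile: traversals per edge */
long path_count[NUM_PATHS];             /* path profile: runs per full path */

/* Record the edge src -> dst and the arrival at dst; returns dst. */
int step(int src, int dst)
{
    edge_count[src][dst]++;
    node_count[dst]++;
    return dst;
}

/* Stand-in for the function being profiled (not the real "func"). */
int toy_func(int x, int y)
{
    int result = 0;
    int path = 0;       /* running path ID, sum of edge increments */
    int cur = A;

    node_count[A]++;
    if (x > 0) { cur = step(cur, B); result += x; path += 2; }
    else       { cur = step(cur, C); result -= x; }

    cur = step(cur, D);
    if (y > 0) { cur = step(cur, E); result += y; path += 1; }
    else       { cur = step(cur, F); result -= y; }

    step(cur, G);
    path_count[path]++; /* one complete path finished */
    return result;
}
```

After running the function on an input set, printing the three arrays gives the node, edge, and path profiles. Note that the path profile cannot in general be reconstructed from the edge profile alone, which is why it needs its own counter.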
Problem 2: Practice Using Hardware Counters and Optimize for Memory Hierarchy
This problem should be completed on a machine with PAPI installed. PAPI is
freely available and can be installed on Linux computers. If you don’t have
access to such a machine, you may use cpeg655.ece.udel.edu.
The PAPI Hardware Counter Library: The PAPI library has been installed under
the directory /usr. The library binaries are in /usr/lib, and the library
header files are in /usr/include.
When you need to compile your program with PAPI, you can use the command line:
“gcc -I/usr/include your_program_file -L/usr/lib -lpapi”.
In this problem you are required to use the PAPI hardware counter library to
measure the memory hierarchy performance of all the paths in the function
“func”. In other words, you should report the L1 cache, L2 cache, and TLB
misses for each path in the function “func”. You don’t need to measure or
report the counters for other parts of the program. Furthermore, you should
optimize/de-optimize the two programs for the memory hierarchy and again use
PAPI to verify their memory hierarchy performance.
(1) Use the PAPI library to measure the L1 cache misses, the L2 cache misses,
and the TLB misses of the function “func” of the two programs. The “main” function
of the two programs has already set up the initialization of the PAPI library.
You only need to provide the event names in the line that is labeled with
“Please add your event here.” Submit your code and measurements in your
report. You should measure for both input sets.
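
For orientation, the following is a minimal sketch of how the three counters might be set up and attributed to paths. It rests on several assumptions: the preset events `PAPI_L1_DCM`, `PAPI_L2_DCM`, and `PAPI_TLB_DM` are one plausible choice of event names, and their availability depends on the machine (the `papi_avail` utility lists what is supported). The names `NUM_PATHS`, `miss_total`, `path_runs`, `record_path_sample`, `setup_miss_events`, and the `USE_PAPI` macro are hypothetical. Since the provided “main” already initializes PAPI, the setup function here overlaps with it and is shown only to make the sketch complete.

```c
#ifdef USE_PAPI   /* hypothetical switch: gcc -DUSE_PAPI file.c -lpapi */
#include <papi.h>
#endif

enum { NUM_EVENTS = 3, NUM_PATHS = 4 };  /* NUM_PATHS is an assumption */

/* miss_total[p][e]: event e summed over every execution of path p. */
long long miss_total[NUM_PATHS][NUM_EVENTS];
long path_runs[NUM_PATHS];

/* Attribute one before/after counter sample to the path that was taken. */
void record_path_sample(int path,
                        const long long before[NUM_EVENTS],
                        const long long after[NUM_EVENTS])
{
    for (int e = 0; e < NUM_EVENTS; e++)
        miss_total[path][e] += after[e] - before[e];
    path_runs[path]++;
}

#ifdef USE_PAPI
/* Create and start an event set with the three miss counters.
 * Returns PAPI_OK on success, another value on failure. */
int setup_miss_events(int *event_set)
{
    int events[NUM_EVENTS] = { PAPI_L1_DCM, PAPI_L2_DCM, PAPI_TLB_DM };
    int rc;

    *event_set = PAPI_NULL;
    if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
        return -1;
    rc = PAPI_create_eventset(event_set);
    if (rc != PAPI_OK)
        return rc;
    for (int i = 0; i < NUM_EVENTS; i++) {
        rc = PAPI_add_event(*event_set, events[i]);
        if (rc != PAPI_OK)
            return rc;   /* event may be unsupported on this CPU */
    }
    return PAPI_start(*event_set);
}
#endif
```

A possible use, assuming the instrumented function reports which path it took: call `PAPI_read(event_set, before)` just before the call to “func”, `PAPI_read(event_set, after)` just after it, and then `record_path_sample(path, before, after)`. Dividing `miss_total[p][e]` by `path_runs[p]` gives the average misses per execution of path `p`.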
(2) Transform the “func” function so that, for one input set of your choice,
the most frequently executed path in “func” can achieve the minimum and the
maximum number of L2 cache misses. The “func” function is basically a sequence
of memory accesses. You can change the order of the memory accesses, but you
cannot add memory accesses to the sequence or remove them from it. You may
also change the data structure declaration for the transformation. Note that
you are required to work only on the most frequently executed path, which
means you may sacrifice other paths to achieve the goal. The most important
grading criterion is WHY you do what you do. Submit your code and
measurements, together with your explanation of the transformations, i.e., why
they work. (Hint: Optimize/de-optimize by avoiding/creating cache conflicts.)
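
To make the hint concrete, here is a small sketch of a toy set-associative cache model with LRU replacement that counts misses for a sequence of addresses. It is only a reasoning aid under assumed parameters: the geometry (64-byte lines, 64 sets, 2 ways), the LRU policy, and the names `toy_cache` and `count_misses` are all hypothetical, and a real L2 cache is larger, may be indexed by physical address, and may use a different replacement policy. The point it illustrates is that the same set of accesses can produce different miss counts depending on their order and on where the data is placed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed toy geometry: 64 sets x 2 ways x 64-byte lines = 8 KiB.
 * Addresses that differ by a multiple of 64 * 64 = 4096 bytes
 * map to the same set and therefore compete for its 2 ways. */
enum { LINE_SIZE = 64, NUM_SETS = 64, WAYS = 2 };

typedef struct {
    uint64_t tag[NUM_SETS][WAYS];
    int valid[NUM_SETS][WAYS];
    unsigned long last_use[NUM_SETS][WAYS]; /* 0 means never used */
} toy_cache;

/* Replay n byte addresses through an initially empty cache
 * and return the number of misses. */
long count_misses(const uint64_t *addrs, size_t n)
{
    toy_cache c;
    unsigned long clock = 0;
    long misses = 0;

    memset(&c, 0, sizeof c);
    for (size_t i = 0; i < n; i++) {
        uint64_t line = addrs[i] / LINE_SIZE;
        size_t set = (size_t)(line % NUM_SETS);
        uint64_t tag = line / NUM_SETS;
        int hit = 0;
        int victim = 0;  /* least recently used way seen so far */

        clock++;
        for (int w = 0; w < WAYS; w++) {
            if (c.valid[set][w] && c.tag[set][w] == tag) {
                c.last_use[set][w] = clock;
                hit = 1;
                break;
            }
            if (c.last_use[set][w] < c.last_use[set][victim])
                victim = w;
        }
        if (!hit) {
            misses++;
            c.tag[set][victim] = tag;
            c.valid[set][victim] = 1;
            c.last_use[set][victim] = clock;
        }
    }
    return misses;
}
```

In this model, three addresses 4096 bytes apart (0, 4096, 8192) all fall in set 0. Accessing them in the interleaved order 0, 4096, 8192, 0, 4096, 8192 misses every time, because the third line always evicts the one needed next. Grouping the same six accesses as 0, 0, 4096, 4096, 8192, 8192 leaves only the three unavoidable first-touch misses. Shifting the data instead (for example to 0, 4160, 8320, which is what padding a structure would do) moves the lines into different sets and also leaves three misses even in the interleaved order.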