PDF: http://cs162.eecs.berkeley.edu/static/hw/hw0.pdf
You have probably been using gcc to compile your programs, but this grows tedious and complicated as the number of files you need to compile increases. You will need to write a Makefile that compiles main.c, wc.c, and map.c. You will also need to write a check target that runs some test to verify the output of wc and main. (You may do this any way you wish, but you MUST verify that the output exists and is free of errors.) It may also help to write a clean target (for make clean) to remove your binaries. If all of this is new to you, please read 2.2.
We are going to use wc.c to get you thinking once again in C, with an eye to how applications utilize the operating system - passing command line arguments from the shell, reading and writing files, and standard file descriptors. All of these things you encountered in CS 61C, but they will take on new meaning in CS 162.
Your first task is to write a clone of the Unix tool wc, which counts the number of words inside a particular text file. You can run Unix’s wc to see what your output should look like, and try to mimic its basic functionality in wc.c (don’t worry about the flags or spacing in the output). While you are working on this, take the time to get some experience with gdb. Use it to step through your code and examine variables.
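The core of word counting can be sketched as a small state machine over the input stream. This is only an illustration under assumptions, not the expected solution: `count_words` is a hypothetical helper name, and "word" here means any run of non-whitespace characters, which may differ in corner cases from what Unix’s wc reports.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Count whitespace-separated words on an already-open stream.
 * count_words is a hypothetical name, not something the handout defines. */
long count_words(FILE *in)
{
    assert(in != NULL);
    long words = 0;
    int in_word = 0;   /* 1 while we are inside a run of non-space chars */
    int c;

    while ((c = fgetc(in)) != EOF) {
        if (isspace(c)) {
            in_word = 0;          /* whitespace ends the current word */
        } else if (!in_word) {
            in_word = 1;          /* first character of a new word */
            words++;
        }
    }
    return words;
}
```

A caller would typically open the file named by the command line argument with fopen, pass the stream to `count_words`, and print the result. Setting a gdb breakpoint inside the loop is a convenient way to watch `in_word` and `words` change.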
Now that you have dusted off your C skills and gained some familiarity with the CS 162 tools, we want you to understand what is really inside of a running program and what the operating system needs to deal with. Load up your wc executable in gdb with a single input file command line argument, set a breakpoint at wc, and run to there. Take a look at the stack using where or backtrace (bt). While you are looking through gdb, think about the following questions and put your answers in the file gdb.txt.
- What is the value of infile? (hint: print infile)
- What is the object referenced by infile? (hint: *infile)
- What is the value of ofile? How is it different from that of infile? Why?
- What is the address of the function wc?
- Try info stack. Explain what you see.
- Try info frame. Explain what you see.
- Try info registers. Which registers are holding aspects of the program that you recognize?
We have just peeled away the abstraction layers of an executing program, like the layers of an onion: the source code, compiled into an object file, linked into an executable, which is loaded and executed on a computer. The operating system meets the application as an executable file when you run it. There is more to the executable than meets the eye. Let’s look down inside.
`objdump -x wc`
You will see that it has several segments, and that the names of functions and variables in your program correspond to labels with addresses or values. The guts of everything are chunks of data within segments. While you are looking through the objdump output, try to think about the following questions and put the answers in the file objdump.txt.
- What file format is used for this binary? And what architecture is it compiled for?
- What are the names of segments you find?
- What segment contains wc (the function) and what is its address? (hint: `objdump -w wc | grep wc`)
- What about main?
- How do these correspond to what you observed in gdb when you were looking at the loaded, executing program?
- Do you see the stack segment anywhere? What about the heap? Explain.
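For the file format question, it can help to see what a tool like objdump inspects first. The sketch below assumes the binary is an ELF file, which is typical on Linux but is an assumption here; `elf_class` is a hypothetical helper that only checks the four magic bytes and the class byte at the start of the file.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Returns 64 or 32 for an ELF file of that class, 0 if the stream does not
 * start with the ELF magic, and -1 on a short read or an unknown class.
 * elf_class is a hypothetical name used only for this sketch. */
int elf_class(FILE *f)
{
    assert(f != NULL);
    unsigned char ident[5];

    if (fread(ident, 1, sizeof ident, f) != sizeof ident)
        return -1;
    /* The magic is 0x7f followed by the letters E, L, F. */
    if (memcmp(ident, "\x7f" "ELF", 4) != 0)
        return 0;
    if (ident[4] == 1) return 32;   /* ELFCLASS32 */
    if (ident[4] == 2) return 64;   /* ELFCLASS64 */
    return -1;
}
```

Opening your wc executable with fopen in "rb" mode and passing it to `elf_class` should agree with the file format line that objdump prints.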
OK, now you are ready to write a program that reveals its own executing structure. The second file in hw0, map.c, provides a rather complete skeleton. You will need to modify it to get the addresses that you are looking for and get the type casts right so that it compiles without warnings. The output of the solution looks like the following (the addresses will be different).
precise64 hw0 ./map
Main @ 40058c
recur @ 400544
Main stack: 7fffda11f73c
static data: 601028
Heap: malloc 1: 671010
Heap: malloc 2: 671080
recur call 3: stack@ 7fffda11f6fc
recur call 2: stack@ 7fffda11f6cc
recur call 1: stack@ 7fffda11f69c
recur call 0: stack@ 7fffda11f66c
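Output of this kind comes from converting pointers to integers and printing them in hex. The following is a minimal sketch of that idea, not the map.c skeleton itself: `print_layout`, `addr`, and `static_var` are hypothetical names, and the cast through `uintptr_t` is one way (among others) to avoid pointer-to-integer warnings.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

static int static_var = 42;   /* initialized static: lives in the data segment */

/* Convert an object pointer to an integer suitable for %lx. */
static unsigned long addr(const void *p)
{
    return (unsigned long)(uintptr_t)p;
}

/* Print where a function, a static, a local, and a heap block live.
 * Returns 0 on success, -1 if malloc fails. Hypothetical helper. */
int print_layout(FILE *out)
{
    assert(out != NULL);
    int local = 0;                 /* lives in this call's stack frame */
    char *block = malloc(100);     /* lives on the heap */
    if (block == NULL)
        return -1;

    fprintf(out, "code   @ %lx\n", (unsigned long)(uintptr_t)print_layout);
    fprintf(out, "static @ %lx\n", addr(&static_var));
    fprintf(out, "stack  @ %lx\n", addr(&local));
    fprintf(out, "heap   @ %lx\n", addr(block));

    free(block);
    return 0;
}
```

Calling `print_layout(stdout)` from main prints four addresses that can then be compared against the symbol and section addresses objdump reports.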
Now think about the following questions and put the answers in map.txt.
- Use objdump on the map executable. Which of the addresses from the previous section are defined in the executable, and which segment is each defined in?
- Make a list of the important segments, and what they are used for.
- What direction is the stack growing in?
- How large is the stack frame for each recursive call?
- Where is the heap? What direction is it growing in?
- Are the two malloc()ed memory areas contiguous?
- Make a high level map of the address space for the program containing each of the important segments, where they start and end, where the holes are, and what direction things grow in.
The size of the dynamically allocated segments, stack and heap, is something the operating system has to deal with. How large should these be? Poke around a bit to find out how to get and set user limits on Linux. Modify main.c so that it prints out the maximum stack size, the maximum number of processes, and the maximum number of file descriptors. Currently, when you compile and run main.c, you will see it print out a bunch of system resource limits (stack size, heap size, etc.). Unfortunately, all the values will be 0! Your job is to get this to print the ACTUAL statistics. (Hint: man getrlimit)
You should expect output similar to this:
precise64 hw0 make
precise64 hw0 ./main
stack size: 8192
process limit: 2782
max file descriptors: 1024
precise64 hw0 make check
pass!
precise64 hw0 echo $?
0
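The limits that main.c has to report can be queried with getrlimit from `<sys/resource.h>`. The sketch below is one possible shape, not the main.c skeleton: `soft_limit` and `print_limit` are hypothetical helpers, the choice to report the soft limit (`rlim_cur`) rather than the hard limit is an assumption, and values are printed in the units getrlimit uses (bytes for the stack), which may not match the units in the sample output above.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/resource.h>

/* Look up the soft limit for one resource.
 * Returns 0 on success, -1 if getrlimit fails. Hypothetical helper. */
int soft_limit(int resource, rlim_t *out)
{
    assert(out != NULL);
    struct rlimit lim;
    if (getrlimit(resource, &lim) != 0)
        return -1;
    *out = lim.rlim_cur;
    return 0;
}

/* Print one "label: value" line for a resource limit. */
void print_limit(FILE *f, const char *label, int resource)
{
    assert(f != NULL && label != NULL);
    rlim_t value;
    if (soft_limit(resource, &value) != 0) {
        fprintf(f, "%s: (error)\n", label);
        return;
    }
    if (value == RLIM_INFINITY)
        fprintf(f, "%s: unlimited\n", label);
    else
        fprintf(f, "%s: %llu\n", label, (unsigned long long)value);
}
```

On Linux, the three values the task asks for correspond to `RLIMIT_STACK`, `RLIMIT_NPROC`, and `RLIMIT_NOFILE`, for example `print_limit(stdout, "stack size", RLIMIT_STACK)`.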
To push to the autograder, run:
make clean
git add .
git commit -m "Trigger autograder."
git checkout -b ag/hw0
git push personal ag/hw0
This saves your work and it gives the instructors a chance to see the progress you are making.
Congratulations for not waiting till the last minute.
Within a few minutes you should receive an email from the autograder. (If not, please notify the instructors via Piazza.)
Now, to finally submit your code, you need to push to the branch release/hw0:
make clean
git add .
git commit -m "Submitting hw0."
git checkout -b release/hw0
git push personal release/hw0
We gave you two types of branches for the autograder because the ag/* branches are for testing only: nothing on them will be graded. You must push to a release/* branch in order to get graded, so please only push to release/* when you intend to submit.
Hopefully after this you are slightly more comfortable with your tools. You will need them for the long road ahead!