This turned out a bit longer than I expected, so here's the conclusion.
Although there was a bit of ramp up time, I found this portion of the project to be a great primer on creating, connecting to and transmitting with sockets.
As far as resources go, I may very well be in the minority in that I didn't find things like Beej's Guide to Network Programming or The Linux Programming Interface to be particularly helpful as a first resource for this exercise. As thorough as they are, they were almost too in-depth, and I wound up inadvertently complicating things by attempting to apply the patterns in them.
The best resources for me were:
- This YouTube video about basic sockets
- This video was tremendously helpful in getting the very basics of creating and connecting to sockets.
- Using Telnet to connect to the server that I created and repeatedly sending text and observing the response
- I tried to think of the most basic way in which I could send requests to a socket and still receive a response, without having to change a file, compile it and run it every time. Telnet was a truly indispensable tool in that regard.
- The Linux Manpages, of course
- Remember these acronyms:
- Server, SBLA: Socket, Bind, Listen, Accept
- Client, SCR: Socket, Connect, Receive
- Coming from a mostly interpreted language (Ruby & JavaScript) background in my career, the paradigm of reading an incoming string into an appropriately sized buffer (while being aware of buffer overflow and the vulnerabilities associated with it) has become much more solid for me. Although I did have some experience with this writing Go, working with it in C has really cemented the concepts.
- I didn't think the provided resources were particularly helpful. Typically, I do prefer hunting down my own examples and instruction until things "click", but the time pressure and Bonnie's submission limit discouraged this sort of experimentation and added a lot of unnecessary stress to learning the material.
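To make the SBLA and SCR acronyms concrete, here is a minimal sketch of a one-shot echo exchange over loopback. It is not the course's starter code: the helper names (`make_listener`, `bound_port`, `echo_once`, `connect_to`) and the 16-byte `BUFSIZE` are made up for illustration. The fixed-size buffer is the point of the exercise: `recv` is told the buffer's size, so a long message cannot overflow it.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define BUFSIZE 16 /* hypothetical size; recv never reads more than this */

/* Fills in a loopback address for the given port. */
static struct sockaddr_in loopback_addr(unsigned short port) {
    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    return addr;
}

/* Server, S-B-L: socket, bind, listen. Port 0 lets the kernel pick a port.
 * Returns the listening descriptor, or -1 on failure. */
int make_listener(unsigned short port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    struct sockaddr_in addr = loopback_addr(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Reports the port a bound socket ended up on, or 0 on failure. */
unsigned short bound_port(int fd) {
    struct sockaddr_in addr;
    socklen_t len = sizeof addr;
    if (getsockname(fd, (struct sockaddr *)&addr, &len) < 0) return 0;
    return ntohs(addr.sin_port);
}

/* Server, A: accept one client, echo one short message, hang up.
 * Returns the number of bytes echoed, or -1 on failure. */
ssize_t echo_once(int listen_fd) {
    assert(listen_fd >= 0);
    char buf[BUFSIZE];
    int client = accept(listen_fd, NULL, NULL);
    if (client < 0) return -1;
    ssize_t n = recv(client, buf, sizeof buf, 0); /* at most BUFSIZE bytes */
    ssize_t sent = -1;
    /* One send is enough for a sketch; real code should loop on short sends. */
    if (n > 0) sent = send(client, buf, (size_t)n, 0);
    close(client);
    return sent;
}

/* Client, S-C: socket, connect. The R (receive) is left to the caller. */
int connect_to(unsigned short port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;
    struct sockaddr_in addr = loopback_addr(port);
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

With the listener running, `telnet 127.0.0.1 <port>` plays the client role by hand, which is the quick feedback loop described above.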
Although on paper, this part sounds more difficult, I apparently found it easier than the Echo Server/Client, judging by my number of submissions.
I really enjoyed how this transfer client was the natural progression from the echo server/client. The skills that I learned in the previous part could be directly applied to this part, and added the extra twist of opening a file, reading it and sending it over the wire.
All projects, in my opinion, should be given in the manner that these 2 warm up exercises were. It starts with something easy, and becomes incrementally more difficult while still retaining the previous skills that we started with as a foundation.
- Make sure to handle error codes from calls to `recv` and `send` appropriately.
- Use a test file and make sure the bytes of the original and the bytes of the copy match one to one. My test file was a simple text file filled with dummy text from Bacon Ipsum.
- At least for this part, the Linux man pages were all I really needed.
- I thought that this portion of the problem set was relatively straightforward, but if I had to point to something, a portion of a video that covered sending files would have been helpful.
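As a rough illustration of the first point above, here is one way to check the return value of `send` on every call while pushing a file over a socket. The names `send_all` and `send_file` and the 4096-byte chunk size are hypothetical, not part of the assignment's API.

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Calls send() until all len bytes are out or a real error occurs.
 * Returns 0 on success, -1 on error (errno is left as send() set it). */
int send_all(int sock, const void *data, size_t len) {
    assert(data != NULL || len == 0);
    const char *p = data;
    size_t left = len;
    while (left > 0) {
        ssize_t n = send(sock, p, left, 0);
        if (n < 0) {
            if (errno == EINTR) continue; /* interrupted, just retry */
            return -1;
        }
        p += n;              /* skip past what was actually sent */
        left -= (size_t)n;   /* which may be less than what was asked */
    }
    return 0;
}

/* Reads file_fd to end-of-file and forwards every chunk with send_all().
 * Returns 0 on success, -1 on a read or send error. */
int send_file(int sock, int file_fd) {
    char buf[4096]; /* hypothetical chunk size */
    for (;;) {
        ssize_t n = read(file_fd, buf, sizeof buf);
        if (n == 0) return 0; /* end of file */
        if (n < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        if (send_all(sock, buf, (size_t)n) < 0) return -1;
    }
}
```

The same shape applies on the receiving side: treat a negative return as an error, zero as the peer closing, and anything else as "this many bytes, possibly fewer than requested".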
This part of the project was where I really hit a brick wall.
As you can see, I spent 30+ hours in the `gflib` files trying to get them to work, submitting them, getting frustrated, etc.
There were quite a few painful and hard-won lessons during this part of the project and I'd like to go through them so that they may be documented for other students to benefit from.
Coming from a software engineering background, I consider an automated test suite one of the single most important things in an application's codebase. Rather than using a pre-rolled solution, I attempted to create my own testing harness that modelled `GETFILE` requests and responses, even going so far as to color code response types (red for a failure, green for a success).
What I failed to recognize was that my test cases in the above code were, at best, a loose approximation of how my code would be tested on Bonnie. This became even more painfully evident when I had all green specs locally, but nearly all failing tests on Bonnie (my code successfully compiled, at least).
Although I'm proud of the code that I wrote building the test harness, I still feel that I put myself through quite a bit of unnecessary stress by not closely paying attention to the things that the testing server was looking for.
Looking through the test harness code linked above, one might notice that there are two structs named `GetRequest` and `GetResponse`. Those were my attempts to write object-oriented code in C. But I forgot something very important: C is not object oriented.
Using these two abstractions was a constant source of headache and debugging pain until I finally deleted them, along with the majority of my existing code, and started over from scratch.
This is something that I'm really kicking myself for not doing earlier. Every single problem that I had was a simple search away. A lot of great hints were tucked away in various Piazza posts and I think I could have greatly reduced the time spent struggling with this part.
However, I think something could be done to boost the signal-to-noise ratio of the constant flood of posts to Piazza, especially in the first few weeks of class. The constant flow of posts and the feeling of having to catch up and read every single one was very overwhelming, and had the inadvertent side effect of my eventually ignoring Piazza.
Developing only a single side of the transaction made it way too easy to start developing it in an unrealistic way that would completely fall apart when put to the test. Also it made it way too easy to lose sight of what an outgoing/incoming request/response should look like.
Seems like a no-brainer, but there were several times that I thought my work was done and I was getting `GF_OK` but the files were not transmitting completely. An easy check of whether or not the images show up would have solved that.
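Opening the image works as a spot check; a byte-for-byte comparison catches truncation that an image viewer might tolerate. The sketch below is one way to do that in C, with a hypothetical helper name (`files_identical`). The `cmp` command-line tool gives the same answer without writing any code.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Compares two files byte for byte.
 * Returns 1 if identical, 0 if they differ, -1 if either cannot be read. */
int files_identical(const char *path_a, const char *path_b) {
    assert(path_a != NULL && path_b != NULL);
    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    int result = -1;
    if (a && b) {
        unsigned char buf_a[4096], buf_b[4096];
        size_t na, nb;
        result = 1;
        do {
            na = fread(buf_a, 1, sizeof buf_a, a);
            nb = fread(buf_b, 1, sizeof buf_b, b);
            /* A shorter read on one side means the sizes differ. */
            if (na != nb || memcmp(buf_a, buf_b, na) != 0) {
                result = 0;
                break;
            }
        } while (na > 0);
        if (ferror(a) || ferror(b)) result = -1;
    }
    if (a) fclose(a);
    if (b) fclose(b);
    return result;
}
```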
6. Client taking too long? Probably hung? About a 99% chance that you called `recv` one too many times.
This was probably one of the most aggravating errors to see in Bonnie. But it led to some of the most important discoveries during this phase of the assignment:
- There's a difference between the number of bytes you tried to send and the number of bytes that actually got sent. It shows up all over the README as well as the docs for `recv` and `send`, but any student taking this class in the future would do well to repeat it 1000 times in prep for this assignment.
- 1 call to `recv` on the client side does NOT necessarily correspond to 1 call to `send` on the server side. Assuming this is the case is simply asking for trouble.
It was this part of the problem set where I really started to feel the pain of Bonnie and what a black box it was. It was incredibly frustrating and stressful to get very opaque error messages and have to guess at what was going wrong, knowing that you'd have to use one of your submissions in order to see if your hunch was correct.
Perhaps just a C file with failing test cases provided by the teaching staff could be a good intermediate step for students to test their code against. I found the Interop thread where other students posted their compiled binaries to be relatively helpful, but it was still just an imitation of what could possibly go wrong. There were still errors that I was unable to reproduce in my local development environment, regardless of the binary I ended up testing against.
Bonnie's method of feedback just seemed so vague that it almost felt like luck when tests started passing, rather than the assuredness of passing specs that I usually feel when I'm writing tests and code in other parts of my life.
This part went considerably smoother than the single-threaded version, which was most likely due to the fact that we were given pre-compiled versions of `gfclient` and `gfserver`.
I enjoyed this exercise very much because I was finally able to use both condition variables and mutexes in practical applications, rather than in example code.
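For readers who have not seen the pattern outside of example code, here is a minimal sketch of a work queue protected by one mutex and one condition variable. Everything in it is illustrative: the `work_queue` type, the `wq_*` functions and the fixed capacity are invented, and the `int` items stand in for whatever a real download queue would hold.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAP 64 /* hypothetical fixed capacity */

typedef struct {
    int items[QUEUE_CAP];      /* ring buffer of pending work */
    size_t head, count;
    pthread_mutex_t lock;      /* guards every field above */
    pthread_cond_t not_empty;  /* signalled whenever an item is added */
} work_queue;

void wq_init(work_queue *q) {
    assert(q != NULL);
    q->head = 0;
    q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

/* Adds an item and wakes one waiting worker. Returns -1 if the queue is full. */
int wq_push(work_queue *q, int item) {
    pthread_mutex_lock(&q->lock);
    if (q->count == QUEUE_CAP) {
        pthread_mutex_unlock(&q->lock);
        return -1;
    }
    q->items[(q->head + q->count) % QUEUE_CAP] = item;
    q->count++;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
    return 0;
}

/* Blocks until an item is available, then removes and returns it. */
int wq_pop(work_queue *q) {
    pthread_mutex_lock(&q->lock);
    /* A loop, not an if: waits can wake spuriously, and another worker
     * may have taken the item first. */
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    int item = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    pthread_mutex_unlock(&q->lock);
    return item;
}
```

`pthread_cond_wait` releases the mutex while it sleeps and reacquires it before returning, which is why the same lock can protect both the queue and the wait.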
- At first, I attempted to have 2 different mutexes, one for the download queue (on the client side) and another for the condition variable. I realized, after some cautious debugging, that this didn't quite make sense, as both of these resources vary together, and that a single mutex manages shared access just as well with even less overhead.
- Having to reset the file marker after reading the size with `lseek` was super confusing but incredibly gratifying to figure out.
- Using the functions in `content.c` versus using `open` was something that I went back and forth on several times because of `content.c` always returning the same file descriptor, while `open` would run into the file missing error on Bonnie. Finding out about the `pread` function (again from a thread on Piazza) was a revelation and finally made it clear to me that I should be using the content functions.
I thought the seams of where each step such as sending the header and closing sockets could have been better communicated. It took trial and error to realize where certain calls were supposed to go. For example, not knowing that the memory for a context was being de-allocated led to very confusing memory errors that were only solved after reading a thread on Piazza that I happened upon.
- I spent 70+ hours completing this assignment. It was a stressful but fun experience, and I'm almost positive that I won't repeat these same mistakes again.
- It would be helpful if there was a bit less obfuscation around Bonnie's test cases.
- The limit of 10 submissions per 24 hours is incredibly stressful and, in my opinion, discourages experimentation and learning for fear of running out of submissions. Perhaps a compiled binary version of Bonnie that students can run locally in order to test their code would be an improvement.
- A bit more direction during the multi-threaded portion as far as what functions are being triggered in your code and what functions you can depend on in the binary would've been helpful.
- Maybe a unifying thread on Piazza for all basic questions, such as setting up dev environments and the like. The first week or so had so many distinct threads dedicated to the best IDEs, debugging with GDB, etc., that it became too easy, mentally, to diminish the importance of looking to Piazza when I got stuck.