nycdavid/giospr1-notes.md

## giospr1-notes.md

      
This turned out a bit longer than I expected; the Conclusion section at the end has the short version.
Echo Server/Client

Although there was a bit of ramp-up time, I found this portion of the project to be a great primer on creating, connecting to, and transmitting over sockets.
As far as resources go, I may well be in the minority in that I didn't find Beej's Guide to Network Programming or The Linux Programming Interface particularly helpful as a first resource for this exercise. As thorough as they are, they were almost too in-depth, and I wound up inadvertently complicating things by attempting to apply the patterns in them.
The best resources for me were:

This YouTube video about basic sockets

This video was tremendously helpful in getting the very basics of creating and connecting to sockets.


Using Telnet to connect to the server that I created and repeatedly sending text and observing the response

I tried to think of the most basic way in which I could send requests to a socket and still receive a response, without having to edit a file, recompile, and rerun every time. Telnet was a truly indispensable tool in that regard.


The Linux Manpages, of course

Key Takeaways


Remember these acronyms:

Server, SBLA: Socket, Bind, Listen, Accept
Client, SCR: Socket, Connect, Receive


Coming from a mostly interpreted-language background (Ruby and JavaScript) in my career, the paradigm of reading an incoming string into an appropriately sized buffer (while being aware of buffer overflow and the vulnerabilities associated with it) has become much more solid for me. Although I did have some experience with this writing Go, working with it in C has really cemented the concepts.
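
As a rough illustration of SBLA and the bounded-buffer habit described above, here is a minimal sketch in C. The names `make_listener`, `echo_once`, and `ECHO_BUFSIZE` are made up for this example and do not come from the project's starter code; error handling is kept to the bare minimum.

```c
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

#define ECHO_BUFSIZE 16 /* hypothetical size, small on purpose */

/* S, B, L: create a TCP socket, bind it to `port`, and start listening.
 * Returns the listening descriptor, or -1 on failure. */
int make_listener(unsigned short port) {
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0) return -1;

    int yes = 1; /* allow quick restarts while experimenting */
    setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &yes, sizeof yes);

    struct sockaddr_in addr;
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Read at most ECHO_BUFSIZE bytes from an accepted connection and send
 * back exactly what arrived. Returns the number of bytes echoed,
 * 0 if the peer closed the connection, or -1 on error. */
ssize_t echo_once(int conn_fd) {
    char buf[ECHO_BUFSIZE];
    ssize_t n = recv(conn_fd, buf, sizeof buf, 0); /* never more than the buffer holds */
    if (n <= 0) return n;

    ssize_t sent = 0;
    while (sent < n) { /* send may accept fewer bytes than requested */
        ssize_t s = send(conn_fd, buf + sent, (size_t)(n - sent), 0);
        if (s < 0) return -1;
        sent += s;
    }
    return n;
}
```

A server would then do the A step itself: `int conn = accept(listener, NULL, NULL);`, call `echo_once(conn)`, and `close(conn)`. Because `recv` is given `sizeof buf` as its limit, a longer message is read in pieces instead of overflowing the buffer.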

Ways this part could have been better:


I didn't think the provided resources were particularly helpful. Typically, I do prefer hunting down my own examples and instruction until things "click", but the time pressure and Bonnie's submission limit disincentivized this sort of experimentation and added a lot of unnecessary stress to learning the material.

Transfer Server/Client

Although this part sounds more difficult on paper, I apparently found it easier than the Echo Server/Client, judging by my number of submissions.
I really enjoyed how this transfer client was the natural progression from the echo server/client. The skills that I learned in the previous part could be directly applied to this part, and added the extra twist of opening a file, reading it and sending it over the wire.
All projects, in my opinion, should be given in the manner that these two warm-up exercises were. They start with something easy and become incrementally more difficult while still retaining the previous skills as a foundation.
Key Takeaways


Make sure to handle error codes from calls to recv and send appropriately.
Use a test file and make sure the bytes of the original and the bytes of the copy match one to one. My test file was a simple text file filled with dummy text from Bacon Ipsum.
At least for this part, the Linux man pages were all I really needed.
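
The byte-for-byte check can be automated. Below is a minimal sketch of one way to do it; `files_identical` is a hypothetical helper written for this note, not something from the assignment.

```c
#include <stdio.h>
#include <string.h>

/* Compare two files byte for byte.
 * Returns 1 if identical, 0 if they differ, -1 if either file
 * cannot be opened or read. */
int files_identical(const char *path_a, const char *path_b) {
    FILE *a = fopen(path_a, "rb");
    FILE *b = fopen(path_b, "rb");
    int result = -1;

    if (a && b) {
        char buf_a[4096], buf_b[4096];
        result = 1;
        for (;;) {
            size_t na = fread(buf_a, 1, sizeof buf_a, a);
            size_t nb = fread(buf_b, 1, sizeof buf_b, b);
            if (na != nb || memcmp(buf_a, buf_b, na) != 0) {
                result = 0; /* different length or different content */
                break;
            }
            if (na == 0) { /* both files ended at the same point */
                if (ferror(a) || ferror(b)) result = -1;
                break;
            }
        }
    }
    if (a) fclose(a);
    if (b) fclose(b);
    return result;
}
```

From a shell, `cmp original.txt copy.txt` gives the same answer without writing any code.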

Ways this part could have been better:


I thought that this portion of the problem set was relatively straightforward, but if I had to suggest something, it would be a portion of a video that covered sending files.

GETFILE Client & Server

This part of the project was where I really hit a brick wall.


As you can see, I spent 30+ hours in the gflib files trying to get them to work, submitting them, getting frustrated, etc.
There were quite a few painful and hard-won lessons during this part of the project and I'd like to go through them so that they may be documented for other students to benefit from.
1. Make sure that your tests closely emulate how your code will be used in the wild.

Coming from a software engineering background, I consider an automated test suite one of the single most important things in an application's codebase. Rather than using a pre-rolled solution, I attempted to create my own testing harness that modelled GETFILE requests and responses, even going so far as to color code response types (red for a failure, green for a success).
What I failed to recognize was that my test cases in the above code were, at best, a loose approximation of how my code would be tested on Bonnie. This became even more painfully evident when I had all green specs locally, but nearly all failing tests on Bonnie (my code successfully compiled, at least).
Although I'm proud of the code that I wrote building the test harness, I still feel that I put myself through quite a bit of unnecessary stress by not closely paying attention to the things that the testing server was looking for.
2. Don't over-engineer. Keep it simple.

Looking through the test harness code linked above, one might notice that there are two structs named GetRequest and GetResponse. Those were my attempts to write object-oriented code in C. But I forgot something very important: C is not object oriented.
Using these two abstractions was a constant source of headache and debugging pain until I finally deleted them, along with the majority of my existing code, and started over from scratch.
3. Don't forget to check Piazza.

This is something that I'm really kicking myself for not doing earlier. Every single problem that I had was a simple search away. A lot of great hints were tucked away in various Piazza posts and I think I could have greatly reduced the time spent struggling with this part.
However, I think something could be done to boost the signal-to-noise ratio of the constant flood of posts to Piazza, especially in the first few weeks of class. The constant flow of posts and the feeling of having to catch up and read every single one was very overwhelming, and it had the inadvertent side effect that I eventually ignored Piazza altogether.
4. Develop your client and server in tandem.

Developing only a single side of the transaction made it way too easy to start developing it in an unrealistic way that would completely fall apart when put to the test. Also it made it way too easy to lose sight of what an outgoing/incoming request/response should look like.
5. MAKE SURE THE FILE ISN'T CORRUPT

Seems like a no-brainer but there were several times that I thought my work was done and I was getting GF_OK but the files were not transmitting completely. An easy check of whether or not the images show up would have solved that.
6. Client taking too long? Probably hung? About a 99% chance that you called recv one too many times.

This one was probably one of the most aggravating errors to see in Bonnie. But it led to some of the most important discoveries during this phase of the assignment:

There's a difference between the number of bytes you tried to send and the number of bytes that actually got sent. It shows up all over the README as well as the docs for recv and send, but any student taking this class in the future would do well to repeat it 1000 times in prep for this assignment.
1 call to recv on the client-side does NOT necessarily correspond to 1 call to send on the server-side. Assuming this is the case is simply asking for trouble.
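
Both points come down to looping on the return value. Here is a minimal sketch, assuming the receiver already knows how many bytes to expect (for example, from a length parsed out of a header). `send_all` and `recv_exact` are hypothetical names, not part of gflib.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Keep calling send until all `len` bytes have been handed to the kernel.
 * Returns 0 on success, -1 on error. */
int send_all(int fd, const void *buf, size_t len) {
    const char *p = buf;
    while (len > 0) {
        ssize_t n = send(fd, p, len, 0); /* n may be smaller than len */
        if (n < 0) {
            if (errno == EINTR) continue; /* interrupted by a signal, retry */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Keep calling recv until `len` bytes have arrived or the peer closes.
 * Returns the number of bytes received (short only if the peer closed
 * early), or -1 on error. */
ssize_t recv_exact(int fd, void *buf, size_t len) {
    char *p = buf;
    size_t got = 0;
    while (got < len) { /* stop once everything expected is in */
        ssize_t n = recv(fd, p + got, len - got, 0);
        if (n == 0) break; /* peer closed the connection */
        if (n < 0) {
            if (errno == EINTR) continue;
            return -1;
        }
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

Because the loop in `recv_exact` exits as soon as the expected count is reached, it never makes the extra `recv` call that blocks forever waiting for data the server has no intention of sending.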

Ways this portion could have been better

It was this part of the problem set where I really started to feel the pain of Bonnie and what a black box it was. It was incredibly frustrating and stressful to get very opaque error messages and have to guess at what was going wrong, knowing that you'd have to use one of your submissions in order to see if your hunch was correct.
Perhaps just a C file with failing test cases provided by the teaching staff could be a good intermediate step for students to test their code against. I found the Interop thread where other students posted their compiled binaries to be relatively helpful, but it was still just an imitation of what could possibly go wrong. There were still errors that I was unable to reproduce in my local development environment, regardless of the binary I ended up testing against.
Bonnie's method of feedback just seemed so vague that it almost felt like luck when tests started passing, rather than the assuredness of passing specs that I usually feel when I'm writing tests and code in other parts of my life.
Multithreaded GETFILE Client/Server

This part went considerably smoother than the single threaded version, which was most likely due to the fact that we were given pre-compiled versions of gfclient and gfserver.
I enjoyed this exercise very much because I was finally able to use both condition variables and mutexes in practical applications, rather than in example code.
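
For readers who have not yet used the two together, here is a minimal sketch of a work queue protected by one mutex and one condition variable. The type `work_queue`, its functions, and `QUEUE_CAP` are hypothetical and are not the structures from the assignment; a fixed-size ring buffer stands in for whatever queue implementation you actually use.

```c
#include <pthread.h>
#include <stddef.h>

#define QUEUE_CAP 64 /* hypothetical capacity */

typedef struct {
    const char *items[QUEUE_CAP]; /* e.g. paths waiting to be downloaded */
    size_t head, count;
    pthread_mutex_t lock;     /* guards items, head, and count */
    pthread_cond_t not_empty; /* signalled when an item is added */
} work_queue;

void queue_init(work_queue *q) {
    q->head = 0;
    q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

/* Add an item. Returns 0 on success, -1 if the queue is full. */
int queue_push(work_queue *q, const char *path) {
    int rc = -1;
    pthread_mutex_lock(&q->lock);
    if (q->count < QUEUE_CAP) {
        q->items[(q->head + q->count) % QUEUE_CAP] = path;
        q->count++;
        rc = 0;
        pthread_cond_signal(&q->not_empty); /* wake one waiting worker */
    }
    pthread_mutex_unlock(&q->lock);
    return rc;
}

/* Remove the oldest item, blocking while the queue is empty. */
const char *queue_pop(work_queue *q) {
    pthread_mutex_lock(&q->lock);
    while (q->count == 0) /* loop guards against spurious wakeups */
        pthread_cond_wait(&q->not_empty, &q->lock);
    const char *path = q->items[q->head];
    q->head = (q->head + 1) % QUEUE_CAP;
    q->count--;
    pthread_mutex_unlock(&q->lock);
    return path;
}
```

`pthread_cond_wait` releases the mutex while it sleeps and reacquires it before returning, which is why the same lock can cover both the queue and the wait.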
Key takeaways


At first, I attempted to have 2 different mutexes, one for the download queue (on the client side) and another for the condition variable. I realized, after some careful debugging, that this didn't quite make sense: both of these resources change together, so a single mutex manages shared access just as well with even less overhead.
Having to reset the file marker after reading the size with lseek was super confusing but incredibly gratifying to figure out.
Using the functions in content.c versus using open was something that I went back and forth with several times because of content.c always returning the same file descriptor, while open would run into the file missing error on Bonnie. Finding out about the pread function (again from a thread on Piazza) was a revelation and finally made it clear to me that I should be using the content functions.
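
The two file-offset points above can be sketched in a few lines. `file_size_and_rewind` and `read_chunk_at` are hypothetical helper names; only `lseek` and `pread` themselves are real POSIX calls.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Find the file size by seeking to the end, then put the offset back at
 * the start so later read calls begin at byte 0.
 * Returns the size, or -1 on error. */
off_t file_size_and_rewind(int fd) {
    off_t size = lseek(fd, 0, SEEK_END); /* offset is now at end of file */
    if (size < 0) return -1;
    if (lseek(fd, 0, SEEK_SET) < 0) return -1; /* reset the file marker */
    return size;
}

/* Read up to `len` bytes starting at `offset`. pread takes the position
 * as an argument and leaves the descriptor's own offset untouched, so
 * threads sharing one descriptor do not disturb each other. */
ssize_t read_chunk_at(int fd, void *buf, size_t len, off_t offset) {
    return pread(fd, buf, len, offset);
}
```

With `pread`, each worker can track its own position in a local variable, which matters when every lookup of the same file hands back the same descriptor.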

Ways that this part could have been better

I thought the seams between steps, such as where sending the header and closing sockets belong, could have been better communicated. It took trial and error to realize where certain calls were supposed to go. For example, not knowing that the memory for a context was being de-allocated led to very confusing memory errors that were only solved after reading a thread on Piazza that I happened upon.
Conclusion


I spent 70+ hours completing this assignment, but I'm almost positive that I won't repeat these same mistakes again, and it was still a fun (if stressful) experience.
It would be helpful if there was a bit less obfuscation around Bonnie's test cases.
The limit of 10 submissions per 24 hours is incredibly stressful and, in my opinion, disincentivizes experimentation and learning for fear of running out of submissions. Perhaps a compiled binary version of Bonnie that students can run locally in order to test their code would be an improvement.
A bit more direction during the multi-threaded portion as far as what functions are being triggered in your code and what functions you can depend on in the binary would've been helpful.
Maybe a single unifying Piazza thread for all basic questions, such as setting up dev environments and the like, would help. The first week or so had so many distinct threads dedicated to best IDEs, debugging with GDB, etc., that it became too easy, mentally, to diminish the importance of looking to Piazza when I got stuck.