@jnguyen1098
Last active February 14, 2024 14:35
CIS*3190 Is One of the Most Useful Courses in BComp

Ask anyone at the University of Guelph what butt end courses exist, and you will usually find folks laughing at CIS*2910, the "useless theory" course, or CIS*4650, the course where you "learn to make compilers, a skill you'll never work on in the real world" (completely untrue, btw). But no course has received the short end of the stick more than CIS*3190 "Software for Legacy Systems". Yuck, a course where you program in Fortran, Ada, and COBOL, some of the most dinosaur-esque languages out there still in use? Blech. Or so you may think. On the contrary, I would argue that the introspection and creative format of the course make it one of the most useful, rewarding, and overall underrated courses at U of Guelph.

CIS*3190 is a Distance Education or "DE" course at Guelph, meaning it is offered entirely online (a format that is different from the "SYNC" courses introduced during the COVID era of schooling). This convenience means there aren't any lectures or labs, nor do you interact with other students. So in terms of the group engineering process, it isn't there. But stay with me here.

In all of the times I've seen it running, Prof. Wirth runs CIS*3190 with four equally-weighted, 25% assignments. When I took it, they were as follows:

  • Fortran re-engineering assignment
  • Ada word scrambler project
  • COBOL re-engineering and modularization assignment
  • Language comparison and introspection report

But where it shines should be more than enough to win over prospective elective seekers.

Self-Pacing

Personally, I found the DE nature of the course to be a benefit, and many of my colleagues would agree with me: you can go as fast or as slow as you want (within reason). There isn't a sense of micromanagement like in some courses where you have to juggle weekly assignments, marked labs, and peer feedback, among other checkpoints; midterms, exams, term tests: what alarm fatigue! I've always appreciated courses that are flexible in this regard because it allows folks to speed in the fast lane as much as possible while not alienating those who want to stop, smell the flowers, and take their sweet time appreciating the intricacies of Fortran's reshape function. And yes, I have known some folks who finished the course and all of the assignments within the first two weeks. Prof. Wirth even goes as far as uploading code blog posts on topics relevant to what he is teaching at the time. For example, when we were doing the COBOL re-engineering assignment, he would coincidentally post COBOL re-engineering examples on his blog. If you have taken a Kremer course, you'll be familiar with the concept. You can see his blog here.

In an era where we deal with hit-or-miss SYNC/ASYNC COVID courses, it is apparent that Prof. Wirth is one of the pioneers of the CIS DE course style, and I can only hope other profs. follow in his footsteps.

Everyone Refactors Re-Engineers

It's pretty obvious when a prof. gives you busywork or gives you work for the sake of fulfilling a course. All too common are "rote" profs who give you 30 examples of a topic and make you drill it in your head until you memorize it. Other profs., like Prof. Andrew Hamilton-Wright, make you think hard about the underlying processes and assumptions before you go to work (take, for example, AHW's CIS*4150 Testing course, where we weren't merely told to make state machines and mocks, but to introspect holistically about the assignment as a whole in order to make the right engineering judgement calls!). Prof. Wirth is no different and draws a sharp contrast between "refactoring" and "re-engineering", as well as between software dev. and programming. And he makes it his mission that you do too.

When I took CIS*3190, we were given two re-engineering exercises. These two exercises were what propelled my opinion of the course in the first place. Gone were the days where you'd memorize a pattern and then apply it 30 times in a row for marks.

The first exercise involved re-engineering a Fortran 77 implementation of Horst Feistel's Lucifer cipher, one of the earliest block ciphers and the direct predecessor of the more popular DES algorithm. It was absolute hell. To start, this is what some of the old code would look like:

      call compress(message,mb,32)
      call compress(key,kb,32)
      write(*,1003)
      write(*,1007) (kb(i),i=0,31)
      write(*,1005)
      write(*,1007) (mb(i),i=0,31)

1000  format(' key '/16(1x,i1))
1001  format(' plain '/16(1x,i1))
1002  format(' cipher '/16(1x,i1))
1003  format(' key ')
1004  format(32z1.1)
1005  format(' plain ')
1006  format(32z1.1)
1007  format(1x,32z1.1)
      end

along with crowd favourites like:

      v=(s0(l)+16*s1(h))*(1-k(jj,ks))+(s0(h)+16*s1(l))*k(jj,ks)

      do 500 kk=0,7,1
      tr(kk)=mod(v,2)
      v=v/2
500   continue

      do 300 kk=0,7,1
      m(kk,mod(o(kk)+jj,8),h0)=mod(k(pr(kk),kc)+tr(pr(kk))+m(kk,mod(o(kk)+jj,8),h0),2)
300   continue
      if (jj .lt. 7 .or. d .eq. 1) kc=mod(kc+1,16)
200   continue

Some of the more annoying parts of Fortran 77 code were:

  • implicit variables: similar to non-strict JavaScript, where you don't have to declare anything. By default, undeclared names starting with I through N are integers and everything else is a real; you could also declare all implicit variables as integers, and then every lvalue that didn't exist would be "automatically" declared as an integer. This meant that if you were working with variable foo but misspelled it as fooo, the compiler would quietly hand you a brand-new variable and you'd be in trouble. Just like in JavaScript.

  • lots and lots of labels and jumps. This demonic playground was written in a style that predates structured programming, so GO TO statements dominated the code of the time.
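
To make the misspelling hazard concrete, here is a minimal sketch in Python, used purely for illustration because assignment there also creates names without any declaration. The function and variable names are made up for this example and have nothing to do with the assignment's code.

```python
def running_total_buggy(values):
    """Sum the values, but with a misspelled accumulator on the update line."""
    total = 0
    for v in values:
        # Typo: "totl" is silently created as a new variable, much like an
        # undeclared name in implicitly typed Fortran 77.
        totl = total + v
    return total  # the real accumulator was never updated


def running_total_fixed(values):
    """Same loop with the accumulator spelled consistently."""
    total = 0
    for v in values:
        total = total + v
    return total
```

In Fortran, adding `implicit none` turns this class of mistake into a compile-time error, which is why disabling implicits is usually one of the first re-engineering steps.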

Re-engineering the Fortran 77 code involved tracing through the dataflow of each implicit variable, disabling implicits as a whole, tracing through label jumps, as well as many other hurdles of this 40+ year old language! You can see some of the carnage here:

!           The document referred to this verbatim as "controlled
!           interchange and s-box permutation", whatever that means.
            v = (s0(l) + 16*s1(h)) * (1 - k(jj, ks)) + (s0(h) + 16 * &
                    s1(l)) * k(jj, ks)

!           Here we convert v back into bit array format. We could have
!           used expand(), but this is faster.
            do kk = 0, 7
               tr(kk) = mod(v, 2)
               v = v / 2
            end do

!           Here we do key-interruption and diffusion, combined. The
!           "k + tr" term is the permuted key interruption.
!
!           mod(o(kk) + jj, 8) is the diffusion row for column kk.
!           row = byte and column = bit within byte
            do kk = 0, 7
               m(kk, mod(o(kk) + jj, 8), h0) = mod(k(pr(kk), kc) + &
                       tr(pr(kk)) + m(kk, mod(o(kk) + jj, 8), h0), 2)
            end do

            if (jj < 7 .or. d == 1) then
               kc = mod(kc + 1, 16)
            end if

         end do
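
For readers who don't speak Fortran, the bit-expansion loop above (repeated `mod(v, 2)` followed by `v = v / 2`) can be sketched in Python as follows. This is only an illustration under the assumption that `v` is a non-negative integer; the helper names `to_bits_lsb_first` and `from_bits_lsb_first` are hypothetical and are not the assignment's `expand()` or `compress()` routines.

```python
def to_bits_lsb_first(value, width=8):
    """Return the low `width` bits of `value`, least significant bit first."""
    bits = []
    for _ in range(width):
        bits.append(value % 2)  # mirrors tr(kk) = mod(v, 2)
        value //= 2             # mirrors v = v / 2 (integer division)
    return bits


def from_bits_lsb_first(bits):
    """Rebuild the integer from a list of bits, least significant bit first."""
    return sum(bit << position for position, bit in enumerate(bits))
```

Round-tripping a byte through both helpers gives back the original value, which is a handy sanity check when tracing this kind of code by hand.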

Refactoring is to re-engineering what programming is to software development. What this assignment taught, and what rote courses didn't, was that sometimes you have to take a bird's eye view of the assignment to really understand what is going on, and that blindly applying textbook patterns in a "pinhole"-style way doesn't always work. You were basically making the jump from poker level 2 to poker level 3. I have yet to find a course that deals with this process as well as CIS*3190. Though, this example might be a little too academic. See the following example in COBOL.

Re-engineering the Past

I'll posit that the hardest assignment in the course was this COBOL re-engineering assignment. I won't show all of it as it is supremely arcane, but this is the main logic:

PROCEDURE DIVISION.
    OPEN INPUT INPUT-FILE, OUTPUT STANDARD-OUTPUT.
    WRITE OUT-LINE FROM TITLE-LINE AFTER ADVANCING 0 LINES.
    WRITE OUT-LINE FROM UNDER-LINE AFTER ADVANCING 1 LINE.
    WRITE OUT-LINE FROM COL-HEADS AFTER ADVANCING 1 LINE.
    WRITE OUT-LINE FROM UNDERLINE-2 AFTER ADVANCING 1 LINE.
S1.  
    READ INPUT-FILE INTO IN-CARD AT END GO TO FINISH.
    IF IN-Z IS GREATER THAN ZERO GO TO B1.
    MOVE IN-Z TO OT-Z.
    WRITE OUT-LINE FROM ERROR-MESS AFTER ADVANCING 1 LINE.
    GO TO S1.
B1. 
    MOVE IN-DIFF TO DIFF.
    MOVE IN-Z TO Z.
    DIVIDE 2 INTO Z GIVING X ROUNDED.
    PERFORM S2 THRU E2 VARYING K FROM 1 BY 1
        UNTIL K IS GREATER THAN 1000.
    MOVE IN-Z TO OUTP-Z.
    WRITE OUT-LINE FROM ABORT-MESS AFTER ADVANCING 1 LINE.
    GO TO S1.
S2. 
    COMPUTE Y ROUNDED = 0.5 * (X + Z / X).
    SUBTRACT X FROM Y GIVING TEMP.
    IF TEMP IS LESS THAN ZERO COMPUTE TEMP = - TEMP.
    IF TEMP / (Y + X) IS GREATER THAN DIFF GO TO E2.
    MOVE IN-Z TO OUT-Z. 
    MOVE Y TO OUT-Y.
    WRITE OUT-LINE FROM PRINT-LINE AFTER ADVANCING 1 LINE.
    GO TO S1.
E2. 
    MOVE Y TO X.
FINISH.
    CLOSE INPUT-FILE, STANDARD-OUTPUT. 
STOP RUN.

The issue with this program is that it relies on highly verbose, confusing fall-through logic to perform error checking. Moreover, program flow becomes very muddied because it relies on labels and jumps to those labels. Even more confusing, it mixes the function-like behaviour of the PERFORM keyword with the unstructured GO TO keyword that Edsger Dijkstra wanted to leave behind in his famous "Go To Statement Considered Harmful" letter: the S2 paragraph escapes from the middle of a PERFORM ... THRU loop by jumping straight back to S1.

To properly re-engineer this program, a prospective student of the course would have to correctly model the scopes initially defined by the GO TO and PERFORM THRU statements to match modern-day function calls, in addition to properly managing the resources and error checks initially put in by the original developer.
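
As a rough map of what those scopes look like once untangled, here is a sketch of the original control flow in Python. It is my reading of the listing above, not the assignment's solution: it ignores COBOL's fixed-point ROUNDED arithmetic and file handling, the output strings are placeholders (the contents of ERROR-MESS and ABORT-MESS aren't shown), and the function names are hypothetical.

```python
def babylonian_sqrt(z, diff, max_iterations=1000):
    """Approximate sqrt(z); return None if the tolerance is never met.

    Mirrors PERFORM S2 THRU E2 VARYING K FROM 1 BY 1 UNTIL K > 1000.
    """
    x = z / 2                              # DIVIDE 2 INTO Z GIVING X
    for _ in range(max_iterations):
        y = 0.5 * (x + z / x)              # S2: next guess
        if abs(y - x) / (y + x) <= diff:   # relative change is small enough
            return y                       # replaces the GO TO S1 escape
        x = y                              # E2: MOVE Y TO X
    return None                            # falls through to ABORT-MESS


def process_records(records):
    """Handle (z, diff) pairs the way the S1/B1 paragraphs do."""
    lines = []
    for z, diff in records:                # S1: read until end of input
        if z <= 0:
            lines.append(f"{z}: invalid input")
            continue                       # GO TO S1
        root = babylonian_sqrt(z, diff)
        if root is None:
            lines.append(f"{z}: no convergence")
        else:
            lines.append(f"{z}: {root:.6f}")
    return lines
```

The point of the sketch is that every GO TO in the original collapses into a `return` or a `continue` once the loop body and the per-record handling get scopes of their own.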

One such re-engineering may look like this:

    *> Sanitize user input and parse string as number
    move function trim(userInput trailing) to radicand.
       
    *> Proceed only if parsed number is valid AND positive
    if radicand is <= 0 or function test-numval-f(userInput) is > 0 then
        display "  Invalid input: positive numbers only!" x"0A"
        exit paragraph
    else
        *> Proceed on
        perform babylon
        *> Clean up and print answer line
        display "" function trim(userInput)
                " = " function trim(answer leading) x"0A"
    end-if.

babylon.
    *> Our initial guess will be half the input
    compute guess rounded = radicand / 2.
    *> Iterate sqrt() until desired precision
    perform with test after until
    function abs(guess - prevGuess) is < 0.000001
        *> Store last guess
        move guess to prevGuess
        *> Calculate next guess using last guess
        compute guess rounded = (prevGuess + radicand / prevGuess) / 2
    end-perform.
    *> Return output
    move guess to answer.

Later revisions of COBOL would rely less on sentence periods (.) and more on explicit scope delimiters like end-if, end-perform, etc. Also do notice that people STOPPED SHOUTING and started using more expressive keywords. I will never forget the number of hours I spent experimenting with scope delimiting, nor the lessons this assignment taught me.

... and this only encompasses a third of the original code. There's an entire section that I didn't paste (due to not wanting to bore the reader even more) where you have to wrangle the original title headers, as well as the storage/number formatting constraints. Back then, you had to actually specify the "type" of each digit, not just the number of digits you'd allocate.

This kind of stuff you don't get out of a typical "read X, create X" assignment that one would find in most CS courses here. Very few courses, if any, ask for the introspection that CIS*3190 asks for.

Levelling the Playing Field

The course teaches you Fortran, Ada, and COBOL. While some view this as a curse, I view it as a blessing. There is only a meagre chance that people pursuing a Bachelor of Computing at the University of Guelph already know these languages, so this guarantees that everyone starts from square one in doing the assignments. Re-engineering is done on a level playing field, as everyone has to juggle learning a new language "on-the-job" AND doing a task such as re-engineering. This essentially guarantees that everyone has something to take from CIS*3190. Why? Next point.

Agnosticism

Remember how I said earlier that Prof. Wirth paints a stark contrast between software development and programming? You shouldn't be put off by the fact that the course teaches you some of the oldest languages known to man. One who actually understands the spirit of Wirth's assignments would know that refactoring, development, and engineering have overlapping problems and ideologies that are agnostic to the tech stack being used. Regardless of whether you work in the MERN stack with React and every modern framework known to man or you work with VBA at an old legacy shop making old B2B solutions, the underlying engineering problems still exist, and this course serves as a harsh testament to that. To view the course as useless because of reasoning as shallow as "irrelevant stack" is extremely closed-minded and contrary to the engineering aspect of computing, especially since your stack could change any day.

You're a software developer—not a programmer!

Acknowledgements

Thank you Prof. Wirth for one of the best courses I've ever taken.

I wholly recommend anyone interested in programming to take this course. We have very few CIS electives in the winter semester and few spaces in the fall. The fact that it's DE and self-paced should mean almost anyone can take it as long as they have at least CIS*2500 (even co-op students can take the course). Part of learning comes from iterative improvement, and looking back at the many different mechanisms that programming USED TO BE and then looking back at my spoiled ass using defaultdict(lambda: defaultdict(list)) is nothing short of breathtaking.
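
For anyone who hasn't met that particular convenience, here is a minimal sketch of what `defaultdict(lambda: defaultdict(list))` buys you in Python. The keys and values are just this course's assignment list, used as illustrative data.

```python
from collections import defaultdict

# Two-level mapping: course -> language -> list of assignment names.
# Missing keys at either level are created on first access.
assignments = defaultdict(lambda: defaultdict(list))

assignments["CIS*3190"]["Fortran"].append("re-engineering")
assignments["CIS*3190"]["Ada"].append("word scrambler")
assignments["CIS*3190"]["COBOL"].append("re-engineering and modularization")
```

There is no storage layout to declare, no size to allocate, and no existence check before appending, which is exactly the kind of bookkeeping the legacy languages make you do by hand.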

I expect to see you guys acobplish great things. ;-)
