An Applied Programming Instructor Commentates an Algorithms Class, Pt. 2

Reading Time: 10 minutes

The title basically captures it: I’m walking through a videotaped algorithms class and recording my observations as a teacher of applied programming classes. Here’s the whole series so far. Check out part 1 for a more complete explanation of what’s happening here.

We left off comparing merge sort and quick sort. We go next to this lecture about heaps:

A heap, as you’ll see in the video, can be illustrated as a tree, with each node pointing to zero, one, or two other nodes. This lecture introduces these two ideas:

Min heap: a tree in which the smallest number is the root node, with all nodes ascending from there
Max heap: a tree in which the largest number is the root node, with all nodes descending from there

They both differ from the binary search tree, which is the variety of tree that I have encountered the most in professional discussions. The teacher reaches the binary search tree in the next lecture:

Here’s the thing about trees: a picture of a tree makes it a lot easier to see why searching a tree takes log n time. But my experience has been that high school math teachers don’t teach logs (or exponents, for that matter) with trees. They teach them with an algebraic term and a function drawn on an XY plane. This is great! This is one way to teach it! But then the high school students get to the collegiate or graduate CS course, and the instructor in that course assumes that at some point, the students have seen the idea of a logarithm represented with a tree—or else they’re just able to intuitively make the leap from the continuous representation we get in math class to the discrete one that computer scientists use. That annoys the crap out of me.

In this case I complained about it on Twitter, but then I decided “Well Chelsea, if it annoys you so much that CS instructors don’t explain this, hell, you’re a CS instructor, why don’t you explain it?” So I did:

So the thing about a tree structure is that SEARCHING a tree, for a program, takes logarithmic time.

That is, as the program grows, the work the program has to do increases LESS than linearly with the size of the data input. Because of this tree.

2/
— Chelsea Troy 🏳️‍🌈 (@HeyChelseaTroy) March 16, 2021

Please disregard the math teacher who showed up on the thread to demand an explanation from me and then insisted that the real problem is that most people just aren’t smart enough for math. (Gold star to me for not vomiting when I saw that, by the way.)

AHA! My favorite is back. With AVL trees!

The “AVL” in AVL tree comes from two names: Adelson-Velski and Landis, two Soviet mathematicians credited with inventing this data structure in the 1960s. In practice, it’s a height-balancing binary search tree—meaning that, it’s a BST in which, each time an item is added, a check is run to ensure that one side of the tree isn’t larger than the other by some threshold, and if it is, an operation is run to rotate the tree such that the sides are more even.

That matters because, if a tree gets uneven around its root node, then the “bigger” side takes longer to traverse than the “smaller” side, and searching on the “bigger” side no longer takes log n steps; it takes more steps than that due to the fact that a larger proportion of “n” lives on this side. Balancing the tree fixes it.

I got the general idea from this lecture. I also know that I learn best from illustrations where I can relate concepts to one another in some sort of spatial way. For stepwise operations the inclusion of animation makes an illustration especially effective. Youtube teems with free playlists from intrepid souls who draw and present exactly those sorts of animations. Here’s one for AVL Trees:

Here’s another one for radix sort, which Erik mentions in the above lecture.

These sorts of videos usually have a bunch of comments below them like “Thank GOODNESS for this video, this person is a WAY better instructor than the one in my algorithms class that I’m taking right now!!”

Let’s dissect that sentiment for a second, because I think the truth is more complicated.

The State of Traditional Algorithms Education

Algorithms is one of a few “standard core” classes across CS programs. Everyone with a CS degree takes one, and every program offers one. Also, because it’s “standard core,” the syllabus stays pretty static. The vast majority of material in Intro to Algorithms comes from the sixties, seventies, and eighties. The class isn’t changing drastically. Also because it’s standard core, every program needs at least one person (and usually several) teaching algorithms.

Algorithms comes relatively early in the students’ education (MIT teaches it to freshmen, I believe). The classes are usually exceptionally large—often one of the largest that students will take at the program. Because of this, algorithms tends to get categorized as a “weeder course.” That moniker suggests that part of the course’s purpose is to “weed out” people who “should not” move on in the specialty.

There’s one more relevant piece here to what we’re about to discuss, which is that most academic positions require teaching alongside research. Academics sometimes shoulder the teaching as a burden of the job for the chance to do the research. They’re not all desperate to teach, nor are they particularly skilled at it. Also remember that there is a heavy survivorship bias in academia towards people to whom the material came easily in the first place, meaning that academics trend low on experience with pedagogical methods to help struggling students.

All of this contributes to algorithms classes trending…not that good. Some percentage of teachers are only teaching because they have to. At all times the department needs someone teaching algorithms. So there’s always demand for someone to teach it. Further, of the classes that need to be taught, algorithms has one of the lowest ongoing overheads because it doesn’t have to get updated much. The result: algorithms is attractive to instructors who don’t want to spend time designing or updating courses.

There’s then relatively little accountability around the quality of the teaching because a high dropout rate in a class like this is expected due to its “weeder course” status. CS instruction in general has a giant problem with “well if someone fails to complete the class it’s because CS is hard and the student was too stupid to hack it.” I myself got subjected to this in university, as you read in the above Twitter thread where I yelled at that math teacher. The thing about the algorithms course is, it has to run. But that’s about the extent of the KPIs for it in most places.

However, it’s not all the teachers’ faults.

First of all, most schools jam the algorithms education into way too little time. We’re gonna talk about that later, but for now I’ll leave it at “most teachers would be better at teaching algorithms if they had more lecture time.”

Also, remember, these classes are very large, which severely limits the time that instructors have to provide individualized help. But that’s not all. A class of almost any size has a wide range of skill levels in it. It’s impossible to pace a live lecture in a way that’s accessible to the students with the least incoming familiarity and interesting to the students with the most incoming familiarity. Finally, large class sizes produce a Kitty Genovese effect. If the instructor makes a mistake or says something incomprehensible, no one says anything because everyone assumes that there are so many people in here that someone else would say something if there were actually a problem. Believe it or not, instructors are human. We screw up. When I screw up in a class of about 26, I’d say someone says something a third of the time. A third of the time. Imagine a class six times that size.

Moreover, universities are pretty intent on failing to imagine education outside of the format of a person at a blackboard in front of the students. Even a lot of MOOCs are “we pointed a camera at a teacher at a blackboard in front of the students.” This severely limits the resources available to instructors to convey the material. One example: blackboards can’t display animations. Animations are especially helpful in the instruction of stepwise operations such as algorithms. Another example: an in-person classroom does not comfortably facilitate group work. Instructor-supervised group work, I have learned, is so damn effective that it’s practically a secret weapon. But you can’t have that in an algos lecture. I’d even argue that recitation doesn’t really facilitate it.

Furthermore, a class serves a different purpose than an instructional video.

It’s one thing to make a video explaining an individual algorithm with animations. It’s a different thing to motivate that algorithm with use cases or situate it within a broader syllabus with an order and a direction. The latter of these, puh-lenty of these video creators, excellent as each individual video is, are not doing well. You cannot watch a playlist of their videos and come out with any idea about how these algorithms compare to each other or when to use which.

It’s telling that so many of the comments are like “This video is SAVING ME in my class!” as opposed to “These videos are helping me learn this from scratch!” My hypothesis is that this is what’s happening: the animated videos serve as fantastic supplemental material to understand the algorithms as individual things, but the class is still doing a lot of heavy lifting on motivating, situating, and comparing them. Either that, or neither the class nor the class’s test pays any attention to motivating, situating, and comparing them. So these students are gonna pass algorithms and not realize that they have no idea how to actually use any of these things they have so painstakingly memorized.

Finally, what is a student to do about this?

Well, what I’ve been doing is watching the lectures and stopping them in the middle to find animated videos. What I’m gonna try going forward is to look at the title of the video, find animated videos on the thing in the title, watch those, then watch the lecture. That way, the super-fast exposure in the lecture isn’t my first brush with how the algorithm works and I can focus on how the instructor motivates, situates, and compares it.

That’s quite enough for one blog post. Tune in next time for some more algorithms videos and pedagogical soapboxing.

If you liked this piece, you might also like:

The compilers series—yes, this is really the kind of person I am

This post about engineering for edge cases, so you get some practical stuff with your theoretical stuff

This piece on documentation, which people who spend their spare time learning about algorithms should probably also read 🙂

Chelsea Troy

An Applied Programming Instructor Commentates an Algorithms Class, Pt. 2

The State of Traditional Algorithms Education

However, it’s not all the teachers’ faults.

Furthermore, a class serves a different purpose than an instructional video.

Finally, what is a student to do about this?

If you liked this piece, you might also like:

Like this:

Leave a ReplyCancel reply

An Applied Programming Instructor Commentates an Algorithms Class, Pt. 2

The State of Traditional Algorithms Education

However, it’s not all the teachers’ faults.

Furthermore, a class serves a different purpose than an instructional video.

Finally, what is a student to do about this?

If you liked this piece, you might also like:

Share this:

Like this:

Leave a ReplyCancel reply

Discover more from Chelsea Troy