### The Math of Khan

30Nov12

I have been following some of the material produced by Khan Academy, in particular their recent and awesome software environment. They started producing Python tutorial videos back in 2011, but earlier this year they made a dramatic switch to a JavaScript dialect and an in-browser graphical environment that interprets your code as you type. That is to say, if you write some code that draws a circle you’ll see the circle immediately, and if you change the “100” in your code to “20” you’ll immediately see the circle shrink.

This is indeed dramatic! Pedagogically, one of the most important habits that beginners need to acquire, in order to create complex programs, is to test their code early and test their code often. This prevents complex untraceable bugs and instead lets you focus on fixing one problem at a time. You literally can’t test more often than testing every instant, so the Khan Academy environment really helps users get into the right groove.

This style of IDE was recently (re)popularized in the talk “Inventing on Principle” by Bret Victor:

Below, John Resig discusses the challenges he faced when trying to actually implement this for Khan Academy:

Only recently did I really get my hands dirty as a user of the Khan Academy programming interface. Specifically, a challenge was issued to use the platform to visualize why, when taking samples $X_1, X_2, \ldots, X_n$ from a population, the variance computed from the sample (using the sample mean $\bar{X}$)

$\displaystyle \sum_{i=1}^n (X_i-\bar{X})^2/n$

is not a good estimate for the variance of the original population, but rather that one should use

$\displaystyle \sum_{i=1}^n (X_i-\bar{X})^2/(n-1)$

Why is there an $n-1$ instead of an $n$? In high school I remember wondering the exact same thing, so this seemed like a worthwhile endeavor to me, and I produced a short slideshow/visualization to explain and demonstrate the accuracy of this formula. Go check it out and see computational ichthyology in all its glory.
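For readers who want a quick numerical sanity check before (or after) the slideshow, the standard fact at work is that $E\left[\sum_{i=1}^n (X_i-\bar{X})^2\right] = (n-1)\sigma^2$, because the sample mean $\bar{X}$ sits closer to the data than the true mean does. Here is a small Monte Carlo sketch of my own (not the visualization from the post) that exhibits the bias, written in plain JavaScript with a seeded PRNG for reproducibility:

```javascript
// Monte Carlo sketch: dividing by n systematically underestimates the
// population variance, while dividing by (n - 1) does not.
// (My own illustration, not Khan Academy's or the post's visualization.)

// Small deterministic PRNG (mulberry32) so the experiment is reproducible.
function mulberry32(a) {
  return function () {
    var t = (a += 0x6d2b79f5);
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Both estimators for one sample: sum of squared deviations from the
// sample mean, divided by n (biased) and by n - 1 (unbiased).
function variances(sample) {
  var n = sample.length;
  var mean = sample.reduce(function (a, b) { return a + b; }, 0) / n;
  var ss = sample.reduce(function (a, x) {
    return a + (x - mean) * (x - mean);
  }, 0);
  return { biased: ss / n, unbiased: ss / (n - 1) };
}

var rng = mulberry32(42);
var n = 5, trials = 20000;
var biasedSum = 0, unbiasedSum = 0;
for (var t = 0; t < trials; t++) {
  var sample = [];
  for (var i = 0; i < n; i++) sample.push(rng()); // uniform on [0, 1)
  var v = variances(sample);
  biasedSum += v.biased;
  unbiasedSum += v.unbiased;
}
// The true variance of Uniform(0, 1) is 1/12 ≈ 0.0833; the /n estimator
// should average out near (n-1)/n of that, i.e. about 0.0667 for n = 5.
console.log("biased average:  ", biasedSum / trials);
console.log("unbiased average:", unbiasedSum / trials);
```

Averaged over many samples, the $1/n$ version lands noticeably below the true variance while the $1/(n-1)$ version hugs it, which is the whole point of Bessel's correction.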

I enjoyed using the IDE. It was pretty good at correctly recomputing everything on the fly, but it did not work all the time and sometimes required refreshing the page before it would respond to mouse input; I suspect the technical hurdles mentioned in John’s talk are relevant here. The “semicolon monster” that pops up to give you human-readable versions of error messages when you have a syntax error was usually helpful but sometimes inconsistent. For example, this code

```javascript
var z = function() {
    for (var i = 0; i < 5; i++) { var x = i; }
    for (var i = 0; i < 5; i++) { var x = i; }
};
```

brings up the error message “Did you forget to add a comma between two parameters?” when the real problem is some scope confusion: changing the second `var i` to just `i` changes the error to “x is already defined!”, while deleting the first and last lines fixes all the errors. I can imagine, though, that as they work the bugs out this will be an extremely fun system for students to use.
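For what it's worth, standard JavaScript accepts this code without complaint, because `var` declarations are function-scoped and redeclaring the same name inside one function is legal; the objection presumably comes from the environment's own static checker rather than from the language. A quick sketch run outside the Khan Academy environment (my test, not theirs):

```javascript
// In plain JavaScript, `var` is function-scoped, so both loops share the
// same i and x, and redeclaring them is legal (if stylistically dubious).
var z = function () {
  for (var i = 0; i < 5; i++) { var x = i; }
  for (var i = 0; i < 5; i++) { var x = i; } // redeclares i and x: allowed
  return x; // x is still visible here, outside both loop bodies
};
console.log(z()); // 4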

Do you think that students can get enough deep learning when everything is output visually? I have a feeling that the answer is yes, at least with some care on the part of the authors. One of the most common parts of my job as an introductory computer science educator is drawing a billion diagrams (recursion trees, object references, program flow and scope). They could certainly incorporate automatic generation of such diagrams into their lessons, much like Philip Guo’s awesome Python visualizer.

There is value in educators and researchers presenting more of their work as part slideshow, part computer program. Why have an infographic when you can have an info-mation?