Mean and Median Surprises, uPlot, Starting Tabular Tic Tac Toe, Needing SRS, Unfinished Business
It’s been a while! Was enjoying the daily blogging. Fell off.
A funny phenomenon is that when I’m not writing a daily blog, I think: I have nothing to write about, I’ve made no progress. Then, as soon as I am writing daily, I realize I always have a ton to write about. But then when I stop, I go back to assuming I’m not making any progress and have nothing to write about.
Beautiful and Unexpected Properties of the Mean and Median
There are two beautiful properties of the mean and median. Both surprised me, and nobody I’ve talked to knew of them either. They are:
- The mean is the value which minimizes the sum of squared differences to all points.
- The median(s) are the range of values which minimize the sum of absolute differences to all points.
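Both properties can be checked numerically with a brute-force scan. The following is a minimal sketch, not the code behind the graphs: the data points, the candidate grid, and the helper names `sum_squared` and `sum_absolute` are all made up for illustration. An even number of points is used on purpose, so that the minimizers of the absolute-difference sum form a range rather than a single value.

```python
import statistics

def sum_squared(c, xs):
    """Sum of squared differences between candidate c and every point."""
    return sum((x - c) ** 2 for x in xs)

def sum_absolute(c, xs):
    """Sum of absolute differences between candidate c and every point."""
    return sum(abs(x - c) for x in xs)

# Hypothetical data: four points, so the "median" is the whole range [2, 4].
xs = [1.0, 2.0, 4.0, 7.0]

# Candidate values 0.00, 0.01, ..., 10.00 (an arbitrary grid for the scan).
candidates = [i / 100 for i in range(0, 1001)]

# The candidate with the smallest sum of squared differences.
best_sq = min(candidates, key=lambda c: sum_squared(c, xs))

# All candidates that tie (up to float noise) for the smallest absolute sum.
abs_costs = [sum_absolute(c, xs) for c in candidates]
min_abs = min(abs_costs)
abs_minimizers = [
    c for c, cost in zip(candidates, abs_costs) if abs(cost - min_abs) < 1e-9
]

print("squared-difference minimizer:", best_sq)
print("mean:", statistics.mean(xs))
print("absolute-difference minimizers span:", abs_minimizers[0], "to", abs_minimizers[-1])
print("middle two points:", sorted(xs)[1], "and", sorted(xs)[2])
```

With this data the squared-difference minimizer lands on the mean, and every candidate between the two middle points ties for the smallest absolute-difference sum.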
I spent way too long (last week) building graphs to show these properties. I wanted to try out the uPlot library. More on that below.
Drag the red data points to see the computed values change. The data mean is shown as the dashed vertical orange line. The sum of squared differences to all the data points is plotted in yellow. Takeaway: the mean is the value which minimizes the sum of squared differences to all points.
Drag the red data points to see the computed values change. The data median is shown as the orange area. The sum of absolute differences to all the data points is plotted in yellow. Takeaway: the medians are the range of values which minimize the sum of absolute differences to all points.
These graphs are both cross-listed here from my Machine Learning notes page, where they live.
uPlot
(The plotting library used above.) I loved how fast and minimal it is. Its demos are gloriously quick to update and interact with.
Of course, I immediately started doing lots of stuff the library isn’t intended for, like adding my own draggable points, vertical lines and ranges, and hand-manipulating the legend. But it has custom hooks which supported all that without fuss. Limited API surface area is a natural consequence of a focused library, I think.
The other issue with uPlot is its lack of documentation. But AI really helps here. While I spent too long getting the above charts to work well (resizing, touch, light/dark color themes, legends for hand-added points), it would have been a wild amount of time without letting AI rip on rough drafts.
RL Tabular Tic Tac Toe
I’ve finally begun working on a bare-bones tic tac toe agent that (I think) will build tables to compute its distributions (e.g., the value function). I wanted to test my understanding of RL after Chapter 3 (MDPs, the “full RL” definition, the Bellman equation), because I felt like I had all the pieces to build a rudimentary agent if the state space was small enough, but I wasn’t sure. I figured it’d be a good comprehension test, even more than the book’s exercises.01
It has been a good test. (It doesn’t help that it’s been a few weeks since reading.) I’m not there yet. I’m optimistic this will help with the intuitive grasp of things before learning and throwing heavyweight solutions like PPO around.
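To make "small enough" concrete, here is a rough sketch of enumerating every reachable board and seeding a value table from it. This is not the agent described above. The board encoding, the function names, and the choice to score terminal boards from X's perspective (+1 win, -1 loss, 0 otherwise) are all assumptions for illustration.

```python
# Board: a tuple of 9 cells, each "X", "O", or " ". X is assumed to move first.
LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]

def winner(board):
    """Return "X" or "O" if that player has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != " " and board[a] == board[b] == board[c]:
            return board[a]
    return None

def is_terminal(board):
    """A game ends on a win or when the board is full."""
    return winner(board) is not None or " " not in board

def next_player(board):
    """X moves whenever both players have placed the same number of marks."""
    return "X" if board.count("X") == board.count("O") else "O"

def successors(board):
    """Yield every board reachable in one move by the player to act."""
    player = next_player(board)
    for i, cell in enumerate(board):
        if cell == " ":
            yield board[:i] + (player,) + board[i + 1:]

def reachable_states():
    """Depth-first search from the empty board, stopping at finished games."""
    start = (" ",) * 9
    seen = {start}
    stack = [start]
    while stack:
        board = stack.pop()
        if is_terminal(board):
            continue
        for nxt in successors(board):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

states = reachable_states()

# Seed a value table: terminal outcomes from X's perspective, 0.0 elsewhere.
values = {}
for s in states:
    w = winner(s)
    values[s] = 1.0 if w == "X" else -1.0 if w == "O" else 0.0

print(len(states), "reachable boards, out of", 3 ** 9, "raw cell combinations")
```

The reachable set is only a few thousand boards, so a plain dictionary keyed by board is enough to hold a value per state. How those values get updated is the actual RL part, and is left out here.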
Already Needing Spaced Repetition
I’ve now done interview prep long and broadly enough that things in my DS/A notes and repository that I did just one month ago are already becoming fuzzy. For example, I worked through quicksort in excruciating detail — more comments than code, that kind of thing, to make sure I got all the boundary conditions right. Now, it’s already a haze.
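As a reminder of what those boundary conditions look like, here is a short comment-heavy sketch. It is not the version from my notes. It assumes an in-place sort with the Lomuto partition scheme and the last element as pivot, which is only one of several common variants.

```python
def partition(a, lo, hi):
    """Partition a[lo..hi] (inclusive) around a[hi]; return the pivot's final index."""
    pivot = a[hi]
    i = lo  # invariant: everything in a[lo:i] is strictly less than pivot
    for j in range(lo, hi):  # hi is excluded: that slot holds the pivot itself
        if a[j] < pivot:
            a[i], a[j] = a[j], a[i]
            i += 1
    # i now points at the first element >= pivot (or at hi), so swap the pivot in.
    a[i], a[hi] = a[hi], a[i]
    return i

def quicksort(a, lo=0, hi=None):
    """Sort list a in place between indices lo and hi, both inclusive."""
    if hi is None:
        hi = len(a) - 1  # for an empty list this is -1, caught by the check below
    if lo >= hi:  # zero or one element: already sorted
        return
    p = partition(a, lo, hi)
    quicksort(a, lo, p - 1)  # the pivot at index p is already in its final place,
    quicksort(a, p + 1, hi)  # so both recursive calls exclude it

data = [5, 2, 9, 1, 5, 6]
quicksort(data)
print(data)
```

The easy things to forget are that `hi` is inclusive, that the partition loop stops before `hi`, and that the recursive calls skip index `p`.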
I know spaced repetition is an answer here. I want to read Andy Matuschak’s guides on:
I’m not sure whether it’s worth automating with LLMs. I originally thought “definitely,” but it’s just complex enough02 that I wonder whether simply writing a few by hand would be sufficient.
Of course, do you really need to know quicksort by heart? It’s an instance of an absolutely enormous pool of knowledge where:
- practically speaking: no, never
- you will still look stupid if someone asks and you don’t
Unfinished Business
Started many things today, but didn’t finish them! In addition to the RL tabular tic tac toe and beginning to read Andy Matuschak’s article above, I also started:
- rendering and animating graph algorithms — want to start with Dijkstra’s, and only got to picking a framework and working on some AI boilerplate
- reviving daily paper-reading — got so rabbit-holed down related work that I spent 1.5 hours and didn’t make it past section 2
Footnotes
We did do the exercises; they mostly involve substituting equations around to derive other related ones. ↩︎
Generate from a webpage’s sections; cache but update if it changes; do you build in reviews too?; if so, do you lose all progress when a section and its questions change?; easy to store locally, but then can’t review on both phone and computer?; import/export into wholly separate SRS app? ↩︎