March 17th

I was asking Claude to help me build an Artifact to explain transformers from end to end. At first it spit out an explanation, so I asked it to include examples. It included them, but it was still just text. Then I asked it to visualize it. Then animate it. Then use better examples.

At some point, I asked it to run a training loop on a (tiny) transformer. It grabbed TensorFlow.js from a CDN, ran a loop with a couple of layers, each with a couple of heads, and used the output in the visualizations. It did it right there, right inside of the browser, right inside of Claude.

Today, it occurred to me that Claude will quite happily run some arbitrary Python. Sure, it will do it on modest hardware and it won’t use a bunch of libraries, but it’ll run Python. I grabbed the source for Andrej Karpathy’s microgpt and asked Claude to run it, right there in chat. A few minutes later it spit out a list of names and told me what our loss was.

Why bother? Wouldn’t it be much easier and better to just use Claude Code to do the same thing, on better hardware and with whatever libraries I wanted? Sure, of course, yes. But it’s just sort of... fun, to poke at things and see what happens. It’s fun to play with tools in unexpected ways. Fun to do things in some wrong, goofy way just for the heck of it.

Anyway, didn’t expect to be writing this, because I didn’t expect to be doing the thing I’m writing about, but it took a few minutes and was kind of fun. Pretty good deal. The bigger thing is that it reminded me (though I’m fortunate enough to not need reminding often) that doing things that look dumb to someone else can still be fun and worthwhile. I promise that some of the things you love started out looking dumb.

So, if you needed a push to just start doing something for the heck of it, whether that thing would take five minutes or five years, maybe this is it? Or maybe it’s just a short note about a pointless thing, with no point of its own. Doesn’t really matter, because writing about it was fun too.