<?xml version="1.0" encoding="utf-8"?><feed xmlns="http://www.w3.org/2005/Atom" ><generator uri="https://jekyllrb.com/" version="3.10.0">Jekyll</generator><link href="https://blog.williammanley.net/feed.xml" rel="self" type="application/atom+xml" /><link href="https://blog.williammanley.net/" rel="alternate" type="text/html" /><updated>2025-10-14T09:31:23+00:00</updated><id>https://blog.williammanley.net/feed.xml</id><title type="html">Will’s Blog</title><subtitle>The technical blog of William Manley</subtitle><author><name>William Manley</name></author><entry><title type="html">On Test-Driving HTML Templates</title><link href="https://blog.williammanley.net/2025/03/24/on-test-driving-html-templates.html" rel="alternate" type="text/html" title="On Test-Driving HTML Templates" /><published>2025-03-24T00:00:00+00:00</published><updated>2025-03-24T00:00:00+00:00</updated><id>https://blog.williammanley.net/2025/03/24/on-test-driving-html-templates</id><content type="html" xml:base="https://blog.williammanley.net/2025/03/24/on-test-driving-html-templates.html"><![CDATA[<p>I recently came across <a href="https://martinfowler.com/articles/tdd-html-templates.html">this article by Matteo Vaccari of Thoughtworks</a> via <a href="https://htmx.org/">HTMX</a> discord.  As part of developing Stb-tester’s web interface we’ve increasingly been moving to a traditional hypermedia approach<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>.  This has meant a different (and I think superior) approach to unit testing than with SPAs.</p>

<p>Some thoughts on the article, based on my experience unit testing HTML-based systems over the last few years:</p>

<ol>
  <li>
    <p>I definitely agree that testing the HTML output of your system is valuable. I would go further than this too - it’s better to test HTML output than to test the JSON APIs you’d typically have in an SPA architecture. Why? Because your assertions are more likely to be testing the functionality of your system that is relevant to your users.</p>

    <p>With JSON APIs you will often be sending more data than is needed to render the view, and there may be complex logic browser-side to transform the data before it becomes HTML.  With HTML over the wire the HTML that is sent is probably going to be rendered and shown to the user without much additional transformation.</p>
  </li>
  <li>
    <p>Doing the setup by injecting a specifically set-up data structure into a template seems too narrow a test - you’re only testing the template in this case, but maybe in production you’ve messed up populating this data structure from the database. Ultimately the properties you’re testing in this case aren’t that useful from a system/end-user perspective.</p>

    <p>Instead we generally will set up the database in some specific state (typically using the code that will do this in production), and then generate the HTML from that.  This works well in terms of churn - you have to be judicious when changing the database schema, so the test is likely to remain valid while you refactor the innards of your implementation, and it’s likely to be testing more of the real code-paths.</p>

    <p>This approach is particularly valuable in unit tests because it’s trivial to apply coverage tools (or even coverage-based fuzzing) to improve confidence in the tests.</p>

    <p>I still consider this unit testing, not integration testing, because it’s still deterministic, single-threaded and not time-dependent (no sleeps, no waiting).</p>
  </li>
  <li>
    <p>Using CSS selectors to check specific properties of the HTML is good because it documents exactly which properties the test is actually trying to assert. It is tedious to write though, requires maintenance in the presence of changes to the markup that wouldn’t actually impact the user, and maybe you’ll miss asserting something that’s actually important.</p>

    <p>Instead we use characterisation testing. We save the generated HTML* to disk when running the test with <code class="language-plaintext highlighter-rouge">REGENERATE_TEST_DATA=1</code>, and when running the test later we check that it hasn’t changed. This makes the tests easy to update when making changes. It also means that when reviewing a change to the code we can see the <em>result</em> of that change in the git diff.</p>
  </li>
  <li>
    <p>The article also touches on characterisation testing under the heading “Bonus level: Stringly asserted”. In our case we are also transforming the HTML before saving - by converting it to markdown using pandoc. This preserves more of the details of the HTML, without having to modify it specifically for the tests (<code class="language-plaintext highlighter-rouge">data-test-icon=</code>, etc).</p>

    <p>There’s a lot more that could be said on the subject of characterisation tests, but I’ll stop now.</p>
  </li>
  <li>
    <p>The article describes characterisation testing (“stringly asserted”) as an alternative to using CSS selectors - but I believe they are complementary.  You can select an element with a CSS selector, but still render the whole element to text for testing.</p>
  </li>
  <li>
    <p>I’m somewhat sceptical of using a real browser in a unit test.  I’ve found keeping browser integration tests working difficult - particularly making sure the browser in CI works exactly the same as the one on the dev’s PC.</p>

    <p>Secondly - and maybe this is just a matter of semantics - in my mind a unit test should be deterministic, and probably single-threaded and synchronous - such that exceptions bubble up from where they were raised, and assertions like “called_once” can be applied.  Using a browser messes with this.  Maybe it would still be possible to keep these properties by mocking out all network calls, and calling back into the test instead.  Could be complicated, but potentially powerful.</p>

    <p>With a hypermedia approach you can get pretty far without involving a browser.  To simulate a click on an <code class="language-plaintext highlighter-rouge">&lt;a href&gt;...</code> tag you can select it by CSS selector, then look up its href and request that from your backend.  Similarly for the HTMX attributes - as long as they’re kept simple enough.</p>
  </li>
</ol>
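<p>As a concrete sketch of the characterisation-testing workflow described in points 2-3 above - this is a simplified illustration, and the helper name and file layout are hypothetical, not Stb-tester’s actual code:</p>

```python
import os

def assert_matches_golden(actual_html, golden_path):
    """Characterisation-test helper (illustrative sketch): compare
    rendered output against a "golden" file saved on disk.

    Run the test suite with REGENERATE_TEST_DATA=1 to (re)write the
    golden files; commit them to git so a code review shows the
    *rendered result* of a change in the diff.
    """
    if os.environ.get("REGENERATE_TEST_DATA") == "1":
        with open(golden_path, "w") as f:
            f.write(actual_html)
    with open(golden_path) as f:
        expected = f.read()
    assert actual_html == expected, (
        "Output changed - rerun with REGENERATE_TEST_DATA=1 to update")
```

<p>In practice the <code class="language-plaintext highlighter-rouge">actual_html</code> would be generated by setting up the database and rendering through the production code paths, and (per point 4) could be converted to markdown with pandoc before saving.</p>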

<h2 id="footnotes">Footnotes</h2>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>Traditional hypermedia, to me, means:</p>
      <ol>
        <li>Use built-in HTML functionality where possible (forms, links, etc.).</li>
        <li>HTML over the wire, not JSON.</li>
        <li>JavaScript for enhancement - typically with effects scoped to HTML elements explicitly declared in markup, and preferably without dependencies.</li>
      </ol>
      <p><a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>William Manley</name></author><summary type="html"><![CDATA[I recently came across this article by Matteo Vaccari of Thoughtworks via HTMX discord. As part of developing Stb-tester’s web interface we’ve increasingly been moving to a traditional hypermedia approach1. This has meant a different (and I think superior) approach to unit testing than with SPAs. Traditional Hypermedia to me this means: &#8617;]]></summary></entry><entry><title type="html">Primer: facebook’s HTMX from 2010</title><link href="https://blog.williammanley.net/2024/02/20/primer-facebooks-htmx-from-2010.html" rel="alternate" type="text/html" title="Primer: facebook’s HTMX from 2010" /><published>2024-02-20T00:00:00+00:00</published><updated>2024-02-20T00:00:00+00:00</updated><id>https://blog.williammanley.net/2024/02/20/primer-facebooks-htmx-from-2010</id><content type="html" xml:base="https://blog.williammanley.net/2024/02/20/primer-facebooks-htmx-from-2010.html"><![CDATA[<p>I recently discovered this 2010 JSConf presentation via <a href="https://htmx.org/">HTMX</a> discord:</p>

<iframe style="width: 100%; aspect-ratio: 16 / 9" src="https://www.youtube-nocookie.com/embed/wHlyLEPtL9o?si=TCFtMn55aK0vtAlH" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen=""></iframe>

<h1 id="summary">Summary</h1>

<ol>
  <li>At Facebook they had loads of JavaScript and that was causing slow page loads.</li>
  <li>Loading the JavaScript async didn’t work well, as the page would render but be non-interactive - which sucks from a UX perspective.</li>
  <li>They realised that the basic operation the JavaScript was performing was making an HTTP request and swapping the new content somewhere in the DOM - so they wrote a 40-line JavaScript function to do this, which they included inline in every page’s <code class="language-plaintext highlighter-rouge">&lt;head&gt;</code><sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup>. This gave them instant interactivity.</li>
  <li>They then went through the codebase, incrementally replacing as much of the existing JavaScript as possible with just this library - increasing performance (~5s to ~2.5s page load) and reducing complexity and lines of code.</li>
</ol>

<h1 id="discussion">Discussion</h1>

<p>The motivation was performance. The effect was better performance - but also much less code to write and a simpler overall system.</p>

<p>I’d recommend watching the whole thing; I found it a pleasure.  The author, <a href="https://makinde.adeagbo.com/">Makinde Adeagbo</a>, is a charismatic presenter.  He covers other areas like downloading JS on-demand, native HTML controls and the tooling they used to find interactions that they could apply Primer to.</p>

<p>The actual code can be found here: <a href="https://gist.github.com/makinde/376039">https://gist.github.com/makinde/376039</a>.</p>

<p>The context now is of course very different from then.  They were already returning HTML from the server - so it’s less of a jump to having a generic JavaScript library do it for you.  The point of comparison now would be a react-style SPA with endpoints returning JSON and HTML being generated client-side.</p>

<h2 id="why-is-this-interesting">Why is this interesting?</h2>

<ul>
  <li>The core idea is the same as HTMX, which is currently riding high in the hype cycle - but 14 years ago, and applied within a megacorp. I view the current enthusiasm about HTMX as very much a reaction to SPA misery - both as a user (slow brittle websites) and as a developer (increased complexity and lines of code).  In this case the technology was developed and used at the same time, and within the same organisation as react!</li>
  <li>
    <p>As with many YouTube videos, I’m left with the question “What happened next?”.  How widely did they end up applying this?  What happened to the 300 people within facebook who knew about it at the time? It can’t have made that much of an impression on them - they didn’t disperse and evangelise.  Why did react win inside facebook?  Why, why, why?</p>

    <p>Speculation:</p>

    <ul>
      <li>React allowed them to scale their features quickly by having many teams work more independently of each other?</li>
      <li>Something related to the mobile web/apps? At this time it was clear that mobile was going to be the most important client, and maybe they had some ambition for offline only use cases?</li>
      <li>Pure chance - the right people heard about react within facebook at the right time?</li>
      <li>The hypermedia approach was too limited to create the UX that facebook wanted?</li>
    </ul>
  </li>
</ul>

<h1 id="minutiae">Minutiae</h1>

<p>Less interesting asides peripheral to the main point of this blog post:</p>

<ul>
  <li>HTMX is 14kB gzipped.  IMO there’s no point in it being any smaller - unless it were so small that it could be included directly in every <code class="language-plaintext highlighter-rouge">&lt;head&gt;</code> - removing the need for round-trips.  I don’t know what a reasonable threshold would be here.  The default initial TCP receive buffer size is 128kB on my machine, so if it’s possible to fit the TLS negotiation, HTTP headers, the <code class="language-plaintext highlighter-rouge">&lt;head&gt;</code> and at least some content in that size you could display your website within 3 round-trips.</li>
  <li>
    <p>There’s a difference in emphasis WRT motivation between this and HTMX: The problem Makinde set out to solve is clear - facebook is too slow. The pleasant side effect: deleted a bunch of JS. Compare with the motivations on the HTMX website.  They’re more like capabilities phrased as questions than motivations, and they’re certainly more philosophical:</p>

    <ul>
      <li>Why should only <code class="language-plaintext highlighter-rouge">&lt;a&gt;</code> and <code class="language-plaintext highlighter-rouge">&lt;form&gt;</code> be able to make HTTP requests?</li>
      <li>Why should only click &amp; submit events trigger them?</li>
      <li>Why should only GET &amp; POST methods be available?</li>
      <li>Why should you only be able to replace the entire screen?</li>
    </ul>

    <p>Of these the 4th one is critical - the rest are absent from Primer, and arguably relatively unimportant.  However there are critical features in HTMX that Primer doesn’t implement<sup id="fnref:2" role="doc-noteref"><a href="#fn:2" class="footnote" rel="footnote">2</a></sup> including:</p>

    <ul>
      <li>Replacing the whole page with working back button behaviour</li>
      <li>URL bar wrangling</li>
      <li>Out-of-band swaps</li>
      <li>Loading indicators</li>
      <li>etc.</li>
    </ul>

    <p>For “Load more comments” buttons most of the above are unnecessary; for avoiding whole-page loads when changing pages they are very much necessary.</p>
  </li>
  <li>Makinde also describes downloading JavaScript to be executed in response to user actions - rather than loading all possible JavaScript at page load time.  I haven’t touched on that here as it’s not a problem I’m interested in - but it may be relevant to you.</li>
</ul>

<h1 id="references">References</h1>

<ul>
  <li><a href="https://twitter.com/htmx_org/status/1753183384493297751">Twitter: Carson Gross on primer</a>.  There’s a bunch of interest in this tweet and the replies.</li>
  <li><a href="https://twitter.com/dan_abramov2/status/1758121064360497192">Twitter: Dan Abramov on facebook history</a></li>
  <li><a href="https://news.ycombinator.com/item?id=39444432">Discuss this on Hacker News</a></li>
  <li><a href="https://twitter.com/jordwalke/status/1753954026620940310">Twitter: @jordwalke on primer limitations</a></li>
  <li><a href="https://www.youtube.com/watch?v=BZmfCjtv6cM">Facebook Front End Tech Talk</a> - a longer presentation on Primer.  <a href="https://archive.org/details/facebook-front-end-tech-talk-8-5-2010-primerjs">Higher quality video at archive.org</a>.</li>
</ul>

<h1 id="footnotes">Footnotes</h1>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>See also <a href="https://leanrada.com/htmz/">htmz</a> <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
    <li id="fn:2" role="doc-endnote">
      <p><a href="https://news.ycombinator.com/item?id=39431985">Hacker news comment: Carson Gross on htmx vs htmz</a> <a href="#fnref:2" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>William Manley</name></author><summary type="html"><![CDATA[I recently discovered this 2010 JSConf presentation via HTMX discord:]]></summary></entry><entry><title type="html">Simple declarative schema migration for SQLite</title><link href="https://blog.williammanley.net/2022/04/30/simple-declarative-schema-migration-for-sqlite.html" rel="alternate" type="text/html" title="Simple declarative schema migration for SQLite" /><published>2022-04-30T00:00:00+00:00</published><updated>2022-04-30T00:00:00+00:00</updated><id>https://blog.williammanley.net/2022/04/30/simple-declarative-schema-migration-for-sqlite</id><content type="html" xml:base="https://blog.williammanley.net/2022/04/30/simple-declarative-schema-migration-for-sqlite.html"><![CDATA[<p>See my colleague <a href="https://david.rothlis.net/">David Röthlisberger</a>’s website for an article we authored together:
<a href="https://david.rothlis.net/declarative-schema-migration-for-sqlite/">Simple declarative schema migration for SQLite</a>.</p>

<p>The TL;DR is:</p>

<ul>
  <li>We store our SQL schema in git and on application startup we modify the existing database so the schema matches what the application was expecting.</li>
  <li>In doing so we don’t need to write any migration code specific to a particular schema change.</li>
  <li>The types of migration we can perform automatically are limited, but in practice these limits are not particularly onerous.</li>
  <li>If we do need to explicitly step outside these limits (e.g. renaming a column) we can, and the auto-migration system will still help with the parts of a migration that it understands.</li>
  <li>This approach works really well with a branching/CI workflow where multiple versions of the database schema may exist in parallel.</li>
</ul>]]></content><author><name>William Manley</name></author><summary type="html"><![CDATA[See my colleague David Röthlisberger’s website for an article we authored together: Simple declarative schema migration for SQLite.]]></summary></entry><entry><title type="html">pip and cargo are not the same</title><link href="https://blog.williammanley.net/2022/02/23/pip-and-cargo-are-not-the-same.html" rel="alternate" type="text/html" title="pip and cargo are not the same" /><published>2022-02-23T00:00:00+00:00</published><updated>2022-02-23T00:00:00+00:00</updated><id>https://blog.williammanley.net/2022/02/23/pip-and-cargo-are-not-the-same</id><content type="html" xml:base="https://blog.williammanley.net/2022/02/23/pip-and-cargo-are-not-the-same.html"><![CDATA[<p>I often see Rust’s cargo package manager dismissed in online discussions by
analogy to pip and npm.  Cargo/Rust doesn’t suffer from many of the problems
that pip does.  Some of these reasons are not just about Cargo: they’re also
about how Rust is different to Python and about how Cargo and Rust integrate:</p>

<ol>
  <li>With Cargo there are only the equivalent of venvs, so you can’t install
crates into a location where they will interfere with other unrelated Rust programs.</li>
  <li>Rust programs are “almost statically linked”.  Typically they only dynamically
link against libc, and possibly a few more common libraries like openssl.</li>
  <li>You don’t need to worry about reproducing the build environment on the host
that will be running the executable.  Usually just copying the executable
over is sufficient (see (2)).</li>
  <li>There is a 1-to-1 relationship between Cargo crate names and what gets <code class="language-plaintext highlighter-rouge">use</code>d
in the Rust file. With pip your PyPI package <code class="language-plaintext highlighter-rouge">moo</code> can include Python package
<code class="language-plaintext highlighter-rouge">foo</code>, or whatever else it likes.</li>
  <li>Similarly you can’t <code class="language-plaintext highlighter-rouge">use</code> something in your Rust code that you haven’t asked
for in your <code class="language-plaintext highlighter-rouge">Cargo.toml</code> - transitive dependencies of your dependencies
(mostly) don’t affect you.</li>
  <li>Rust provides many tools for managing privacy, so the public interface of a
package is explicit. This makes it much harder to accidentally depend on
something the crate author considers an implementation detail, making a cargo
update much less likely to break your application than the Python equivalent.</li>
  <li>There is a culture of taking compatibility seriously in the Rust ecosystem.
Crates are expected to maintain API compatibility for major versions and
<a href="https://doc.rust-lang.org/cargo/reference/semver.html">Cargo requires the same</a>.</li>
  <li>Cargo requires lockfiles, and can generate a lockfile just based on the
<code class="language-plaintext highlighter-rouge">Cargo.toml</code> and the crates.io index.  Pip has <code class="language-plaintext highlighter-rouge">pip freeze</code>, but that just
captures the packages you’ve got installed.  So it requires that the packages
be installed in the first place, and won’t include packages that are required
on other OSs for example.  <a href="https://pipenv.pypa.io/en/latest/">Pipenv</a> helps here.</li>
  <li>Rust packages tend to be more self-contained than Python ones.  Often Python
packages will be bindings to existing libraries, while Rust ones will be pure
rust.  This means that you run into issues of missing system dependencies far
less often with Cargo than Pip.</li>
  <li>Rust maintains much better backwards compatibility than Python.  Upgrading
to a newer version of Rust for a dependency is very unlikely to break your
build - and if it will break anything it will break at build time, rather
than run time.  Upgrading Python often causes your code or dependencies to break,
and requires you to upgrade Python on your deployment target as well.  Rust
doesn’t even need to be installed on your deployment target.</li>
  <li>A Rust executable can include different versions of the same crate - so you
don’t need your transitive dependencies to all agree on the same version if
there are compatibility issues.</li>
</ol>
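<p>Points 4, 5 and 7 can be illustrated with a hypothetical <code class="language-plaintext highlighter-rouge">Cargo.toml</code> fragment (illustrative only - the package and dependency shown are placeholders, not from a real project):</p>

```toml
[package]
name = "example-app"   # hypothetical package
version = "0.1.0"
edition = "2021"

[dependencies]
# "1.0" means ">=1.0.0, <2.0.0": Cargo assumes semver compatibility within
# a major version, so `cargo update` stays inside that range.
serde = "1.0"
# Only crates listed here can be `use`d in this package's code. serde's own
# transitive dependencies are resolved and pinned in Cargo.lock, but are
# not importable from here.
```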

<p>I don’t have any first hand experience with npm, but I believe at least some of
the above will apply there too.</p>]]></content><author><name>William Manley</name></author><summary type="html"><![CDATA[I often see Rust’s cargo package manager dismissed in online discussions by analogy to pip and npm. Cargo/rust doesn’t suffer from many of the problems that pip does. Some of these reasons are not just about cargo, they’re also about how Rust is different to Python and about how Cargo and Rust integrate:]]></summary></entry><entry><title type="html">Neil Brown on the UNIX philosophy</title><link href="https://blog.williammanley.net/2021/12/16/the-unix-philosophy.html" rel="alternate" type="text/html" title="Neil Brown on the UNIX philosophy" /><published>2021-12-16T00:00:00+00:00</published><updated>2021-12-16T00:00:00+00:00</updated><id>https://blog.williammanley.net/2021/12/16/the-unix-philosophy</id><content type="html" xml:base="https://blog.williammanley.net/2021/12/16/the-unix-philosophy.html"><![CDATA[<p>From <a href="https://lwn.net/Articles/576078/">an LWN comment</a> by <a href="https://blog.neil.brown.name">Neil Brown</a>, kernel hacker and author of the <a href="https://lwn.net/Articles/411845/">Ghosts</a> <a href="https://lwn.net/Articles/412131/">of</a> <a href="https://lwn.net/Articles/414618/">Unix</a> <a href="https://lwn.net/Articles/416494/">Past</a> LWN article series (among others):</p>

<blockquote>
  <p>One of the big weaknesses of the “do one job and do it well” approach is that those individual tools didn’t really combine very well. sort, join, cut, paste, cat, grep, comm etc make a nice set of tools for simple text-database work, but they all have slightly different ways of identifying and selecting fields and sort orders etc. You can sort-of stick them together with pipes and shell scripts, but it is rather messy and always error prone.</p>

  <p>I remember being severly disillusioned by this in my early days. I read some article that explained how a “spell” program can be written to report the spelling errors in a file. It uses ‘tr’ to split into words, then “sort” and “uniq” to get a word list, then “comm” to find the differences. “cool” I thought. Then I looked at the actual “spell” program on my university’s Unix installation. It used a special ‘dcomm’ (or something like that) which knew about “dictionary ordering” (Which ignores case - sometimes). Suddenly the whole illusion came shattering down. Lots of separate tools only do 90% of the work. To do really complete work, you need real purpose-built tools. “do one thing and do it well” is good for prototypes, not for final products.</p>

  <p>One thing Unix never gave us was a clear big picture. It was always lots of bits that could mostly be stuck together to mostly work. I spent a good many years as a Unix sysadmin at a University and I got to see a lot of the rough edges and paper over some of them.</p>
</blockquote>

<p>I read this when it was first published and it had quite an effect on my thinking.</p>]]></content><author><name>William Manley</name></author><summary type="html"><![CDATA[From an LWN comment by Neil Brown, kernal hacker and author of the Ghosts of Unix Past LWN article series (among others):]]></summary></entry><entry><title type="html">Seen on HN: Invert shell and terminal</title><link href="https://blog.williammanley.net/2021/03/31/seen-on-hn-invert-shell-and-terminal.html" rel="alternate" type="text/html" title="Seen on HN: Invert shell and terminal" /><published>2021-03-31T00:00:00+00:00</published><updated>2021-03-31T00:00:00+00:00</updated><id>https://blog.williammanley.net/2021/03/31/seen-on-hn-invert-shell-and-terminal</id><content type="html" xml:base="https://blog.williammanley.net/2021/03/31/seen-on-hn-invert-shell-and-terminal.html"><![CDATA[<p>Here’s a <a href="https://news.ycombinator.com/item?id=26617656">comment from Ericson2314</a>
that I agree with wholeheartedly:</p>

<blockquote>
  <p>Invert the shell and terminal: every shell command (with unredirected streams)
gets it’s own pty.</p>
</blockquote>

<p>Read the whole comment for more.</p>

<p>I think this could work well - but unlike many projects intended to improve the
terminal/shell experience it could be implemented in a backward compatible way -
and as such have a chance of being widely adopted.</p>

<p>You’d need to implement a new protocol between terminal and shell.  Maybe the
shell would be in charge - allocating PTYs and multiplexing the output back to
the terminal.  Or maybe the terminal would be in charge where it would allocate
PTYs for children and send them to the shell using FD passing.  There are
advantages both ways.</p>
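<p>To give a flavour of how minimal such a protocol can be, here’s the receiving side of systemd’s <code class="language-plaintext highlighter-rouge">LISTEN_FDS</code> convention sketched in Python - a simplified illustration of the protocol, not production code (the authoritative rules are in systemd’s <code class="language-plaintext highlighter-rouge">sd_listen_fds</code> documentation):</p>

```python
import os

SD_LISTEN_FDS_START = 3  # by convention fds 0-2 are stdin/stdout/stderr

def listen_fds(environ=None, pid=None):
    """Sketch of the receiving side of systemd's LISTEN_FDS protocol:
    LISTEN_PID names the intended recipient process, LISTEN_FDS says how
    many file descriptors were passed, and they start at fd 3."""
    environ = os.environ if environ is None else environ
    pid = os.getpid() if pid is None else pid
    # The pid check stops a stale inherited environment confusing children:
    if environ.get("LISTEN_PID") != str(pid):
        return []
    try:
        n = int(environ.get("LISTEN_FDS", "0"))
    except ValueError:
        return []
    return list(range(SD_LISTEN_FDS_START, SD_LISTEN_FDS_START + n))
```

<p>A terminal-to-shell protocol in the same spirit - a couple of environment variables plus FD passing - would be similarly easy to implement in C.</p>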

<p>Integral to its success though would be getting the support for this new
protocol <strong>upstream in bash</strong>.  Bash is the most widely deployed shell, so
upstream support is the only way to get to the point where you can log in to a
new machine (or over SSH) and expect this to “just work”.  Without that it
would remain a niche tool and could join the graveyard of other improved
shells/terminals that approximately no-one uses.</p>

<p>As such the protocol would have to be minimal and easily implementable in C.
Systemd’s protocols could be an inspiration.  I’ve implemented both sides of
<code class="language-plaintext highlighter-rouge">sd_notify</code> and <code class="language-plaintext highlighter-rouge">LISTEN_FDS</code> before, in more than one programming language and
it was very straightforward.</p>]]></content><author><name>William Manley</name></author><summary type="html"><![CDATA[Here’s a comment from Ericson2314 that I agree with wholeheartedly:]]></summary></entry><entry><title type="html">Merkle trees and build systems</title><link href="https://blog.williammanley.net/2020/05/28/merkle-trees-and-build-systems.html" rel="alternate" type="text/html" title="Merkle trees and build systems" /><published>2020-05-28T00:00:00+00:00</published><updated>2020-05-28T00:00:00+00:00</updated><id>https://blog.williammanley.net/2020/05/28/merkle-trees-and-build-systems</id><content type="html" xml:base="https://blog.williammanley.net/2020/05/28/merkle-trees-and-build-systems.html"><![CDATA[<p>An article written by my colleague <a href="https://david.rothlis.net/">David Röthlisberger</a> in part describing how the build system that we built at stb-tester works and the philosophy behind it:
<a href="https://lwn.net/Articles/821367/">Merkle trees and build systems</a>.</p>

<blockquote>
  <p>In traditional build tools like Make, targets and dependencies are always <em>files</em>. Imagine if you could specify an entire <em>tree</em> (directory) as a dependency: You could exhaustively specify a “build root” filesystem containing the toolchain used for building some target as a dependency of that target. Similarly, a rule that creates that build root would have the tree as its <em>target</em>. Using <a href="https://en.wikipedia.org/wiki/Merkle_tree">Merkle trees</a> as first-class citizens in a build system gives great flexibility and many optimization opportunities. In this article I’ll explore this idea using <a href="https://ostree.readthedocs.io/">OSTree</a>, <a href="https://ninja-build.org/">Ninja</a>, and <a href="https://www.python.org/">Python</a>.</p>
</blockquote>]]></content><author><name>William Manley</name></author><summary type="html"><![CDATA[An article written by my colleague David Röthlisberger in part describing how the build system that we built at stb-tester works and the philosophy behind it: Merkle trees and build systems.]]></summary></entry><entry><title type="html">Unlock software freedom one by using better tools</title><link href="https://blog.williammanley.net/2020/05/25/unlock-software-freedom-one-by-using-better-tools.html" rel="alternate" type="text/html" title="Unlock software freedom one by using better tools" /><published>2020-05-25T00:00:00+00:00</published><updated>2020-05-25T00:00:00+00:00</updated><id>https://blog.williammanley.net/2020/05/25/unlock-software-freedom-one-by-using-better-tools</id><content type="html" xml:base="https://blog.williammanley.net/2020/05/25/unlock-software-freedom-one-by-using-better-tools.html"><![CDATA[<p>An idea that I’ve had burrowing into my mind for quite a few years now is that free software distros are held back by the crap build tools that are used to build software - but the lack of availability of better tools isn’t the cause of the problem.</p>

<p>The <a href="https://www.gnu.org/philosophy/free-sw.html">FSF define</a> software freedom one as:</p>

<blockquote>
  <p>The freedom to study how the program works, and change it so it does your computing as you wish (freedom 1). Access to the source code is a precondition for this.</p>
</blockquote>

<p>In practice if you want to make a modification to your system it’s a massive ball-ache.  In my opinion just getting to the point where you can find the relevant package and build it is harder than making the change in most cases.  I think it’s a real shame that we have these distros full of software with source available, but you’d never think of making small changes because there’s such a hurdle to get over, and that hurdle isn’t knowing programming, or understanding the code - it’s just the inconvenience of building and running your modified version.</p>

<p>Back in 2006 (I think) there was a project called <a href="http://one.laptop.org/">One Laptop Per Child</a>.  One of the ideas behind it was that not only was this a tool to enable children in the developing world to do things like Wikipedia access, word processing, and spreadsheets, but that it was a tool that they could mould themselves.  They could change any part of the software that made up the device, safely, in such a way that they couldn’t brick the device and any changes were easily revertible.  The thing that really captured my imagination was a button on the keyboard labelled “View Source”!  Imagine that!  You’re using a piece of software and you want to know how it works, or you want it to work differently, and you press the button, and there it is in the editor.  You could make a change and build and run it in one click. Perhaps with another click you could share your changes with the world.</p>

<p>Blam! Suddenly the four freedoms aren’t some abstract idea the advantages of which are reserved for professional software engineers, but they’re available in practice to a much larger audience of interested amateurs.  In particular, freedom 1, the freedom that is least convenient to exercise now:  “The freedom to study how the program works, and change it so it does your computing as you wish” was previously mostly academic to all but the tiniest fraction of users of the software, but is now much broader.</p>

<p>Included in this group of interested amateurs are children.  I can imagine myself as a child pressing that button just to see: What happens? What does it look like? What happens if I…?  Suddenly it becomes play rather than study and discipline and work.</p>

<p>I don’t know how (or if) the view source button ever worked, but I find the idea behind it exciting.</p>

<p>We can see what children (and adults) are capable of when given the right environment in their amazing Minecraft creations.  It doesn’t matter that the environment is constrained; it just matters that the initial barrier to entry is low enough.  See also the BBC Micro, which booted straight into BASIC: you turn the machine on and there you are, ready to start.  Along with similar machines it spawned an entire industry of software engineers.</p>

<hr />

<p>So if the main blocks to this are getting the source and then building the software, what is the solution?  I think Google has built (but more importantly - uses) tools that solve both these problems.</p>

<p>Internally Google<sup id="fnref:1" role="doc-noteref"><a href="#fn:1" class="footnote" rel="footnote">1</a></sup> has a build system called Blaze (see also its open-source cousin Bazel). It’s fast at building software at a staggering scale - larger than any Linux distro.  How is it capable of this?  Firstly, all build steps are reproducible, which in turn makes them cacheable.  Secondly, the tool has a complete view of the entire build-dependency graph, right back to each source file checked into the repo. What does this mean? It means that when you change a source file, be it a C file, a header or some code-generation input, all its dependents are rebuilt, but nothing more.  It means they can all be built in parallel across a fleet of machines, so you get your results promptly.  It means that if a generated file would not be affected by the change, it is not rebuilt.</p>
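<p>To make that concrete, here is what target-level dependency declarations look like in a hypothetical Bazel <code class="language-plaintext highlighter-rouge">BUILD</code> file (the target and file names are made up for illustration):</p>

```python
# BUILD - dependencies are declared per target, not per package.
cc_library(
    name = "parser",
    srcs = ["parser.c"],
    hdrs = ["parser.h"],
)

cc_binary(
    name = "tool",
    srcs = ["main.c"],
    deps = [":parser"],  # a change to parser.c rebuilds only this chain
)
```

<p>Because the graph bottoms out at individual source files, editing an unrelated file in the same directory triggers no rebuild of <code class="language-plaintext highlighter-rouge">tool</code> at all.</p>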

<p>Contrast this with a typical Linux distro, binary or source based.  The concept of packages cuts through the build graph at a level of fixed scale.  We define dependencies between packages, and within any particular package there is a build system that defines the finer-grained dependencies: this <code class="language-plaintext highlighter-rouge">.so</code> depends on this object file, which depends on this source file.  Change the source of a man page in OpenOffice?  You rebuild the whole package, compiling all the source files and running big fat linking steps.  Add a comment to a source file in a library?  Do the dependent packages need to be rebuilt?  That’s a decision that has to be taken manually every time.</p>

<p>With traditional binary distros this is made kind-of tractable by dynamic linking, which allows you to delay the final steps of constructing your system to run-time.  I think dynamic linking makes sense where runtime selection of code is genuinely needed and IPC has too much overhead.  Examples include using the right GL library for your hardware, or adding new <a href="https://gstreamer.freedesktop.org/">GStreamer</a> elements for decoding some new video format.  It’s also not so onerous where the ABI is small and clear and the library has few dependencies of its own.  It can be necessary for proprietary software, where rebuilding to fix an issue in a dependency isn’t possible.  But in a lot of cases it’s there just to work around the way that software is packaged, delaying linking to run-time because rebuilding is just too onerous.</p>

<p>Google uses open-source software internally.  How do they deal with the fact that OSS comes as packages with their own build systems?  How do they cope with not having global visibility of the build graph? The answer is that they don’t. If you want to use an open-source dependency in your Google project you need to convert its build system to Blaze.  This only has to happen once for any project, and popular open-source libraries will already have been converted, so chances are you can just reuse that effort.  By doing so the huge proprietary megacorp gains more of the practical benefits of the software being free (as in freedom) than the free-software distros do!</p>

<hr />

<p>That covers building the software; what about getting the source?  Within Google every developer effectively has the entire Google codebase checked out on their machine, as a FUSE filesystem.  It’s called <a href="https://cacm.acm.org/magazines/2016/7/204032-why-google-stores-billions-of-lines-of-code-in-a-single-repository/fulltext">“Clients in the Cloud”</a>.  So you want the source locally?  It’s already there. Imagine this applied to Debian.  You want to make a change to OpenOffice, but it’s huge and you don’t have space for the whole thing?  It’s OK: it’s already there. Microsoft is developing a system for Git that can handle repos the size of the whole of Debian. It’s open-source, but currently Windows-only.  See <a href="https://vfsforgit.org/">“VFS for Git”</a>.</p>

<hr />

<p>What would such a distro look like?  I’ve seen one example of this from a few years ago: <a href="https://github.com/gittup/gittup">gittup</a>.  It consists of a git repo containing a submodule for every package, and each package has been modified to build with the tup build system.  You can now make a change to any file, run a build, and get a new system in seconds. Check out the <a href="http://gittup.org/gittup/">website</a>, particularly the section “What gittup.org does that nobody else can”; it gives you a taste of what might be possible. The tools used in gittup wouldn’t scale to the size of Debian; on the other hand, we know of tools that will, or could.  But it’s not about the tools…</p>

<p>The challenge is that the sheer amount of software to be packaged is tremendous.  Having a complete view of the build graph requires converting every package to the same build system.  That requires many people, each with some level of expertise in their own package, all pulling in the same direction and all choosing the same tooling. It’s a social problem, rather than a technical one.</p>

<p>As Antoine de Saint-Exupéry said:</p>

<blockquote>
  <p>“If you want to build a ship, don’t drum up the men to gather wood, divide the work, and give orders. Instead, teach them to yearn for the vast and endless sea.”</p>
</blockquote>

<p>So this is my attempt to generate that yearning.</p>

<div class="footnotes" role="doc-endnotes">
  <ol>
    <li id="fn:1" role="doc-endnote">
      <p>Note: I’ve never worked for Google; this is just my understanding from information Google has published and from talking to Googlers and ex-Googlers. Ultimately, whether what I say about how things work at Google is true or not is irrelevant to the substance of this article, which is about how the free software ecosystem, and distros in particular, could be different, and better.  Google is a strawman I’ve used here to make the ideas more concrete. <a href="#fnref:1" class="reversefootnote" role="doc-backlink">&#8617;</a></p>
    </li>
  </ol>
</div>]]></content><author><name>William Manley</name></author><summary type="html"><![CDATA[An idea that I’ve had burrowing into my mind for quite a few years now is that free software distros are held back by the crap build tools that are used to build software - but the lack of availability of better tools isn’t the cause of the problem.]]></summary></entry><entry><title type="html">Improve integration test sandboxing with systemd socket passing</title><link href="https://blog.williammanley.net/2014/01/26/Improve-integration-test-sandboxing-with-systemd-socket-passing.html" rel="alternate" type="text/html" title="Improve integration test sandboxing with systemd socket passing" /><published>2014-01-26T12:04:21+00:00</published><updated>2014-01-26T12:04:21+00:00</updated><id>https://blog.williammanley.net/2014/01/26/Improve-integration-test-sandboxing-with-systemd-socket-passing</id><content type="html" xml:base="https://blog.williammanley.net/2014/01/26/Improve-integration-test-sandboxing-with-systemd-socket-passing.html"><![CDATA[<p>TL;DR version: Improve integration test sandboxing with <a href="http://0pointer.de/public/systemd-man/sd_listen_fds.html">systemd socket passing</a>.  You can allocate a random port for your daemon and don’t need to wait for the daemon to start up to run your test.  It’s fast, robust, race-free and <strong>doesn’t depend on systemd</strong>.</p>

<p><a href="http://julien.danjou.info">Julien Danjou</a> recently wrote an excellent blog post on <a href="http://julien.danjou.info/blog/2014/db-integration-testing-strategies-python">Database integration testing strategies with Python</a>.    In it he discusses integration tests which require databases, but the advice is applicable to integration tests which require any external process.  For one of these tests he:</p>

<ol>
  <li>Chooses a port to listen on then starts the database server with the port number passed in as configuration.</li>
  <li>Waits for it to start up by grepping stdout.</li>
  <li>Runs the test.</li>
  <li>Tears the database server down.</li>
</ol>

<p>This is great advice, but it can be improved upon.  One weakness of this approach is that if the port you’ve selected is already in use, your test will fail.  That can happen if the port you’ve chosen just happens to be in use, or if you’re running multiple tests in parallel.</p>

<p>The solution is to use a random unused port each time you run the test.  UNIX allows you to do this by asking <a href="http://linux.die.net/man/2/bind"><code class="language-plaintext highlighter-rouge">bind</code></a> (Python: <a href="http://docs.python.org/2/library/socket.html#socket.socket.bind"><code class="language-plaintext highlighter-rouge">socket.bind</code></a>) to bind to port 0.  You can then ask <a href="http://linux.die.net/man/2/getsockname"><code class="language-plaintext highlighter-rouge">getsockname()</code></a> (Python: <a href="http://docs.python.org/2/library/socket.html#socket.socket.getsockname"><code class="language-plaintext highlighter-rouge">socket.getsockname()</code></a>) to find out which port was actually assigned.  But <strong>you can’t know which port you’re going to bind to before you’ve bound to it</strong>: if you pick an unused port at random and only bind to it later, another process may have beaten you to it in the meantime.</p>
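<p>The port-0 trick looks like this in Python (a minimal sketch):</p>

```python
import socket

# Bind to port 0: the kernel picks a free port for us, race-free.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.bind(("", 0))

# getsockname() tells us which port the kernel actually chose.
_, port = s.getsockname()
print(port)
```

<p>The port remains reserved for as long as this socket stays open, which is exactly why you want to hand this very socket to the daemon rather than close it and ask the daemon to bind the same port number again.</p>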

<p>So one way to do this would be to tell the server you’re starting to choose a random port, wait for it to be ready and then find out what port it has chosen.  Maybe this involves grepping through logs or making IPC calls.</p>

<p>I like to use a different technique: open the sockets myself and then pass them to the daemon.  This way I don’t need to wait for the daemon to start up, and I don’t need to inspect its logs or query it.  I use the <a href="http://0pointer.de/public/systemd-man/sd_listen_fds.html">systemd socket passing protocol</a>, which some daemons support anyway.  This means I open a listening socket, then start my daemon with the environment variable <code class="language-plaintext highlighter-rouge">LISTEN_FDS=1</code> to tell it that I am passing it a socket and that the socket is fd #3†.</p>
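<p>On the daemon side, consuming the protocol is only a few lines.  Here’s a hedged sketch (the function name <code class="language-plaintext highlighter-rouge">listen_fds</code> is mine; in C the equivalent is <code class="language-plaintext highlighter-rouge">sd_listen_fds()</code>):</p>

```python
import os
import socket

# Passed fds start just after stdin (0), stdout (1) and stderr (2).
SD_LISTEN_FDS_START = 3

def listen_fds():
    """Return sockets passed via the LISTEN_FDS protocol, or [].

    Checking LISTEN_PID against our own pid means an environment
    variable accidentally inherited from an unrelated parent is ignored.
    """
    if os.environ.get("LISTEN_PID") != str(os.getpid()):
        return []
    count = int(os.environ.get("LISTEN_FDS", "0"))
    return [socket.fromfd(SD_LISTEN_FDS_START + i,
                          socket.AF_INET, socket.SOCK_STREAM)
            for i in range(count)]
```

<p>A daemon calls something like this at start-up and falls back to opening its configured port when the list comes back empty.</p>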

<p>I’ve written a small utility, sd-popen.py, to do this for me.  An example:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ # Show that the LISTEN_FDS environment variable is set in the child:
$ ./sd-popen.py --outfile=env.log env
LAUNCHED_PORT=32814
LAUNCHED_PID=18150
$ grep LISTEN env.log
LISTEN_PID=18150
LISTEN_FDS=1

$ # Show that the socket is open in the spawned process:
$ ./sd-popen.py sleep 50
LAUNCHED_PORT=48729
LAUNCHED_PID=18898
$ netstat -lp | grep 48729
tcp        0      0 *:48729                 *:*                     LISTEN      18898/sleep     
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">sd-popen</code> opens a socket, binds it to port 0, spawns the command passed to it with <code class="language-plaintext highlighter-rouge">LISTEN_FDS</code> and <code class="language-plaintext highlighter-rouge">LISTEN_PID</code> set and then prints the pid and port to stdout before exiting.</p>

<p>The output from <code class="language-plaintext highlighter-rouge">sd-popen</code> is compatible with shell so we can write shell scripts like:</p>

<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>export $(./sd-popen.py my-webserver)
wget http://localhost:$LAUNCHED_PORT/foobar.html
kill ${LAUNCHED_PID}
</code></pre></div></div>

<p>All without waiting or worries about port clashes.</p>

<p>This is what sd-popen.py looks like:</p>

<div class="language-python highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">#!/usr/bin/python
</span><span class="kn">import</span> <span class="nn">socket</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">import</span> <span class="nn">argparse</span>
<span class="kn">import</span> <span class="nn">subprocess</span>

<span class="k">def</span> <span class="nf">main</span><span class="p">(</span><span class="n">argv</span><span class="p">):</span>
    <span class="n">parser</span> <span class="o">=</span> <span class="n">argparse</span><span class="p">.</span><span class="n">ArgumentParser</span><span class="p">()</span>
    <span class="n">parser</span><span class="p">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s">'cmd'</span><span class="p">,</span> <span class="n">help</span><span class="o">=</span><span class="s">'Command to run'</span><span class="p">)</span>
    <span class="n">parser</span><span class="p">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s">'args'</span><span class="p">,</span> <span class="n">nargs</span><span class="o">=</span><span class="n">argparse</span><span class="p">.</span><span class="n">REMAINDER</span><span class="p">,</span>
                        <span class="n">help</span><span class="o">=</span><span class="s">'Command arguments'</span><span class="p">)</span>
    <span class="n">parser</span><span class="p">.</span><span class="n">add_argument</span><span class="p">(</span><span class="s">'--outfile'</span><span class="p">,</span> <span class="n">default</span><span class="o">=</span><span class="s">'/dev/null'</span><span class="p">,</span>
                        <span class="nb">type</span><span class="o">=</span><span class="n">argparse</span><span class="p">.</span><span class="n">FileType</span><span class="p">(</span><span class="s">'w'</span><span class="p">),</span>
                        <span class="n">help</span><span class="o">=</span><span class="s">'File to redirect stdout and stderr to'</span><span class="p">)</span>
    <span class="n">args</span> <span class="o">=</span> <span class="n">parser</span><span class="p">.</span><span class="n">parse_args</span><span class="p">(</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">:])</span>

    <span class="n">s</span> <span class="o">=</span> <span class="n">socket</span><span class="p">.</span><span class="n">socket</span><span class="p">(</span><span class="n">socket</span><span class="p">.</span><span class="n">AF_INET</span><span class="p">,</span> <span class="n">socket</span><span class="p">.</span><span class="n">SOCK_STREAM</span><span class="p">)</span>
    <span class="n">s</span><span class="p">.</span><span class="n">bind</span><span class="p">((</span><span class="s">''</span><span class="p">,</span> <span class="mi">0</span><span class="p">))</span>
    <span class="n">_</span><span class="p">,</span> <span class="n">port</span> <span class="o">=</span> <span class="n">s</span><span class="p">.</span><span class="n">getsockname</span><span class="p">()</span>
    <span class="n">s</span><span class="p">.</span><span class="n">listen</span><span class="p">(</span><span class="mi">5</span><span class="p">)</span>

    <span class="n">p</span> <span class="o">=</span> <span class="n">sd_popen</span><span class="p">([</span><span class="n">args</span><span class="p">.</span><span class="n">cmd</span><span class="p">]</span> <span class="o">+</span> <span class="n">args</span><span class="p">.</span><span class="n">args</span><span class="p">,</span> <span class="p">[</span><span class="n">s</span><span class="p">.</span><span class="n">fileno</span><span class="p">()],</span> <span class="n">stdout</span><span class="o">=</span><span class="n">args</span><span class="p">.</span><span class="n">outfile</span><span class="p">,</span>
                 <span class="n">stdin</span><span class="o">=</span><span class="nb">open</span><span class="p">(</span><span class="s">'/dev/null'</span><span class="p">,</span> <span class="s">'r'</span><span class="p">),</span> <span class="n">stderr</span><span class="o">=</span><span class="n">subprocess</span><span class="p">.</span><span class="n">STDOUT</span><span class="p">)</span>
    <span class="n">sys</span><span class="p">.</span><span class="n">stdout</span><span class="p">.</span><span class="n">write</span><span class="p">(</span><span class="s">'LAUNCHED_PORT=%i</span><span class="se">\n</span><span class="s">LAUNCHED_PID=%i</span><span class="se">\n</span><span class="s">'</span> <span class="o">%</span> <span class="p">(</span><span class="n">port</span><span class="p">,</span> <span class="n">p</span><span class="p">.</span><span class="n">pid</span><span class="p">))</span>
    <span class="k">return</span> <span class="mi">0</span>


<span class="k">def</span> <span class="nf">sd_popen</span><span class="p">(</span><span class="n">args</span><span class="p">,</span> <span class="n">sockets</span><span class="p">,</span> <span class="n">preexec_fn</span><span class="o">=</span><span class="bp">None</span><span class="p">,</span> <span class="o">*</span><span class="n">aargs</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">):</span>
    <span class="kn">import</span> <span class="nn">os</span><span class="p">,</span> <span class="n">subprocess</span>
    <span class="k">def</span> <span class="nf">remap_ports</span><span class="p">():</span>
        <span class="n">sockets_</span> <span class="o">=</span> <span class="n">sockets</span>
        <span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">[</span><span class="s">"LISTEN_PID"</span><span class="p">]</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="n">os</span><span class="p">.</span><span class="n">getpid</span><span class="p">())</span>
        <span class="n">os</span><span class="p">.</span><span class="n">environ</span><span class="p">[</span><span class="s">"LISTEN_FDS"</span><span class="p">]</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="nb">len</span><span class="p">(</span><span class="n">sockets_</span><span class="p">))</span>
        <span class="k">for</span> <span class="n">new_fd</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="mi">3</span><span class="p">,</span> <span class="mi">1024</span><span class="p">):</span>
            <span class="k">if</span> <span class="nb">len</span><span class="p">(</span><span class="n">sockets_</span><span class="p">)</span> <span class="o">==</span> <span class="mi">0</span> <span class="p">:</span>
                <span class="k">try</span><span class="p">:</span>
                    <span class="n">os</span><span class="p">.</span><span class="n">close</span><span class="p">(</span><span class="n">new_fd</span><span class="p">)</span>
                <span class="k">except</span><span class="p">:</span>
                    <span class="k">pass</span>
            <span class="k">else</span><span class="p">:</span>
                <span class="n">oldfd</span> <span class="o">=</span> <span class="n">sockets_</span><span class="p">.</span><span class="n">pop</span><span class="p">(</span><span class="mi">0</span><span class="p">)</span>
                <span class="k">if</span> <span class="n">new_fd</span> <span class="o">!=</span> <span class="n">oldfd</span><span class="p">:</span>
                    <span class="k">if</span> <span class="n">new_fd</span> <span class="ow">in</span> <span class="n">sockets_</span><span class="p">:</span>
                        <span class="n">replacement_fd</span> <span class="o">=</span> <span class="n">os</span><span class="p">.</span><span class="n">dup</span><span class="p">(</span><span class="n">new_fd</span><span class="p">)</span>
                        <span class="n">sockets_</span> <span class="o">=</span> <span class="p">[</span><span class="n">replacement_fd</span> <span class="k">if</span> <span class="n">fd</span> <span class="o">==</span> <span class="n">new_fd</span> <span class="k">else</span> <span class="n">fd</span>
                                    <span class="k">for</span> <span class="n">fd</span> <span class="ow">in</span> <span class="n">sockets_</span><span class="p">]</span>
                    <span class="n">os</span><span class="p">.</span><span class="n">dup2</span><span class="p">(</span><span class="n">oldfd</span><span class="p">,</span> <span class="n">new_fd</span><span class="p">)</span>
        <span class="k">if</span> <span class="n">preexec_fn</span> <span class="ow">is</span> <span class="ow">not</span> <span class="bp">None</span><span class="p">:</span>
            <span class="n">preexec_fn</span><span class="p">()</span>
    <span class="k">return</span> <span class="n">subprocess</span><span class="p">.</span><span class="n">Popen</span><span class="p">(</span><span class="n">args</span><span class="p">,</span> <span class="n">preexec_fn</span><span class="o">=</span><span class="n">remap_ports</span><span class="p">,</span>
                            <span class="n">close_fds</span><span class="o">=</span><span class="bp">False</span><span class="p">,</span> <span class="o">*</span><span class="n">aargs</span><span class="p">,</span> <span class="o">**</span><span class="n">kwargs</span><span class="p">)</span>

<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">'__main__'</span><span class="p">:</span>
    <span class="n">sys</span><span class="p">.</span><span class="nb">exit</span><span class="p">(</span><span class="n">main</span><span class="p">(</span><span class="n">sys</span><span class="p">.</span><span class="n">argv</span><span class="p">))</span>
</code></pre></div></div>

<p><code class="language-plaintext highlighter-rouge">sd_popen</code> is essentially an extension of Python’s <code class="language-plaintext highlighter-rouge">subprocess.Popen</code> that additionally takes a list of fds to pass to the child process.  In this example we open a socket in <code class="language-plaintext highlighter-rouge">main()</code> and pass it to the subprocess before printing the socket and subprocess details.  <code class="language-plaintext highlighter-rouge">sd_popen</code> is complicated by the <code class="language-plaintext highlighter-rouge">dup2</code> dance needed to rearrange the file-descriptor numbers, but it is itself a fairly generic function for launching programs with the socket passing protocol.</p>

<p>One thing to note: there’s nothing non-portable in the above code.  It doesn’t depend on systemd and should run fine on any Unix system.  It does depend on the daemon having socket passing support, but that’s <a href="http://0pointer.de/blog/projects/socket-activation.html">easy to add</a> and there’s nothing non-portable about that either.</p>

<p>I’ve used this technique for writing a <a href="https://github.com/jech/polipo/pull/8">simple test suite</a> for the <a href="http://www.pps.univ-paris-diderot.fr/~jch/software/polipo/">polipo</a> caching HTTP proxy.  The tests are written in a combination of <a href="https://github.com/wmanley/polipo/blob/b2db672cc3ceead0d32b38bd1ed536163cc26bd8/test/run-test.sh">shell</a> and <a href="https://github.com/wmanley/polipo/blob/b2db672cc3ceead0d32b38bd1ed536163cc26bd8/test/sd-launch.c">C</a>.  I also use it in my prototype <a href="https://github.com/wmanley/http-dbus-bridge/blob/master/test.sh">http-dbus-bridge</a> which is a combination of shell and Python.</p>

<p>At the end of Julien Danjou’s blog post he writes:</p>

<blockquote>
  <p>To speed up tests run, you could also run the test in parallel. It can be interesting as you’ll be able to spread the workload among a lot of different CPUs. However, note that it can require a different database for each test or a locking mechanism to be in place. It’s likely that your tests won’t be able to work altogether at the same time on only one database.</p>
</blockquote>

<p>I say - start a different database for each test using the <a href="http://0pointer.de/public/systemd-man/sd_listen_fds.html"><code class="language-plaintext highlighter-rouge">LISTEN_FDS</code>/<code class="language-plaintext highlighter-rouge">LISTEN_PID</code> socket passing protocol</a>.</p>
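<p>On Python 3 a per-test launcher along these lines is only a dozen lines.  A sketch, under the assumption that your daemon speaks the protocol (here I use <code class="language-plaintext highlighter-rouge">sleep</code> as a stand-in daemon, and <code class="language-plaintext highlighter-rouge">start_daemon</code> is my own name, not a library function):</p>

```python
import os
import socket
import subprocess

def start_daemon(cmd):
    """Start cmd with a freshly-allocated listening socket as fd 3,
    advertised via LISTEN_FDS/LISTEN_PID.  Returns (process, port)."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind(("", 0))       # kernel picks a free port
    s.listen(5)
    _, port = s.getsockname()

    def fix_fds():
        # Runs in the child between fork and exec:
        os.dup2(s.fileno(), 3)
        os.environ["LISTEN_FDS"] = "1"
        os.environ["LISTEN_PID"] = str(os.getpid())

    p = subprocess.Popen(cmd, preexec_fn=fix_fds, close_fds=False)
    s.close()  # the child keeps its own copy of the socket
    return p, port

# Each test gets its own daemon on its own port:
p, port = start_daemon(["sleep", "60"])  # stand-in for a real daemon
print(port, p.pid)
p.kill()
p.wait()
```

<p>Wrapped in a test fixture, this gives every test its own database server, started race-free and torn down with a simple <code class="language-plaintext highlighter-rouge">kill</code>.</p>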

<p>†: fd 3 is the next one after stdin (0), stdout (1) and stderr (2).</p>]]></content><author><name>William Manley</name></author><summary type="html"><![CDATA[TL;DR version: Improve integration test sandboxing with systemd socket passing. You can allocate a random port for your daemon and don’t need to wait for the daemon to start up to run your test. It’s fast, robust, race-free and doesn’t depend on systemd.]]></summary></entry></feed>