Design and Engineering, As One · Matthias Ott
A thoughtful piece by Matthias that’s a must-read for both designers and developers.
The entire intellectual and creative output of a team that reinvented personal computing fits in a space that, today, we wouldn’t think twice about wasting on a single font file.
Somewhere in the years that followed we’ve lost the creative solutions, the art of optimization, that being constrained in that way produces.
The best engineers I’ve worked with carry this instinct even when others might think it crazy. They impose their own constraints. They ask what this would look like if it had to be half the size, or run twice as fast, or use a tenth of the memory. Not because anyone demanded it, but because just by thinking there could be a better, more efficient solution, one often emerges.
In an age of abundance, restraint becomes the only scarce thing left, which means saying “no” is more valuable than ever.
I’m as proud of the things I haven’t generated as the things I have.
Update: Never mind! It turns out that Google’s issue is with unreachable robots.txt files, not absent robots.txt files. They really need to improve their messaging. Stand down everyone.
A bit has been flipped on Google Search.
Previously, the Googlebot would index any web page it came across, unless a robots.txt file said otherwise.
Now, a robots.txt file is required in order for the Googlebot to index a website.
This puzzles me. Until now, Google was all about “organising the world’s information and making it accessible.” This switch-up will limit “the world’s information” to “the information on websites that have a robots.txt file.”
They’re free to do this. Despite what some people think, Google isn’t a utility. It’s a business. Other search engines are available, with different business models. Kagi. Duck Duck Go. Google != the World Wide Web.
I am curious about this latest move with Google Search though. I’d love to know if it only applies to Google’s search bot. Google has other bots out crawling the web: Adsbot-Google, Google-Extended, Googlebot-Image, GoogleOther, Mediapartners-Google. I’m probably missing a few.
If the new default only applies to the searchbot and doesn’t include, say, the crawler that’s fracking the web in order to train Google’s large language model, then this is how things work now:
It would be good to get some clarity on this. Alas, the Google Search team are notoriously tight-lipped so I’m not holding my breath.
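To make the per-bot question concrete, here is a minimal sketch of how one robots.txt file can treat different crawlers differently. The robots.txt content and the example.com URL are made up for illustration, and the result only shows how Python's standard-library parser reads the rules. It says nothing about what Google's crawlers actually do.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block the AI-training agent, allow everything else.
ROBOTS_TXT = """\
User-agent: Google-Extended
Disallow: /

User-agent: Googlebot
Allow: /

User-agent: *
Allow: /
"""


def allowed_agents(robots_txt, agents, url="https://example.com/"):
    """Return {agent: True/False} according to the given robots.txt text."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return {agent: parser.can_fetch(agent, url) for agent in agents}


results = allowed_agents(ROBOTS_TXT, ["Googlebot", "Google-Extended", "GoogleOther"])
for agent, allowed in results.items():
    print(f"{agent}: {'allowed' if allowed else 'blocked'}")
```

Note that Python's parser matches user-agent names by substring, so real crawlers may interpret the same file slightly differently.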
Find freedom not in infinite choice, but in working a single seam until you strike gold: conducting dozens, even hundreds, of iterations within a tight parameter space—not in search of more, but in search of better.
I don’t normally link to articles on Medium—I respect you too much—and I do wish this were written on Mike Hall’s own site, but this is just too good not to share.
And don’t dismiss this as a nostalgic case study from the past:
At no point did the constraints make the product feel compromised. Users on modern devices got a smooth experience and instant feedback, while those on older devices got fast, reliable functionality. Users on feature phones got the same core experience without the bells and whistles.
The constraints forced us to solve problems in ways we wouldn’t have considered otherwise. Without those constraints, we could have just thrown bytes at the problem, but with them every feature had to justify itself. Core functionality had to work everywhere, and without JavaScript crutches proper markup became essential.
This experience changed how I approach design problems. Constraints aren’t a straitjacket, keeping us from doing our best work; they are the foundation that makes innovation possible. When you have to work within severe limitations, you find elegant solutions that scale beyond those limitations.
I have little understanding for people using large language models to generate slop; words and images that nobody asked for.
I have more understanding for people using large language models to generate code. Code isn’t the thing in the same way that words or images are; code is the thing that gets you to the thing.
And if a large language model hallucinates some code, you’ll find out soon enough:
With code you get a powerful form of fact checking for free. Run the code, see if it works.
But I want to push back on one justification I see repeatedly about using large language models to write code. Here’s Craig:
There are many moral and ethical issues with using LLMs, but building software feels like one of the few truly ethically “clean”(er) uses (trained on open source code, etc.)
That’s not how this works. Yes, the large language models are trained on lots of code (most of it open source), but they’re not only trained on that. That’s on top of everything else; all the stolen books, all the unpaid creative work of others.
Even Robin Sloan, who first says:
I think the case of code is especially clear, and, for me, basically settled.
…goes on to acknowledge:
But, again, it’s important to say: the code only works because of Everything. Take that data away, train a model using GitHub alone, and you’ll get a far less useful tool.
When large language models are trained on domain-specific data, it’s always in addition to the mahoosive amount of content they’ve already stolen. It’s that mahoosive amount of content, not the domain-specific data, that enables them to parse your instructions.
(Note that I’m being very deliberate in saying “parse”, not “understand.” Though make no mistake, I’m astonished at how good these tools are at parsing instructions. I say that as someone who tried to write natural language parsers for text-only adventure games back in the 1980s.)
So, sure, go ahead and use large language models to write code. But don’t fool yourself into thinking that it’s somehow ethical.
What I said here applies to code too:
If you’re going to use generative tools powered by large language models, don’t pretend you don’t know how your sausage is made.
LLMs are good at transforming text into less text
Laurie is really onto something with this:
This is the biggest and most fundamental thing about LLMs, and a great rule of thumb for what’s going to be an effective LLM application. Is what you’re doing taking a large amount of text and asking the LLM to convert it into a smaller amount of text? Then it’s probably going to be great at it. If you’re asking it to convert into a roughly equal amount of text it will be so-so. If you’re asking it to create more text than you gave it, forget about it.
Depending how much of the hype around AI you’ve taken on board, the idea that they “take text and turn it into less text” might seem like a gigantic back-pedal away from previous claims of what AI can do. But taking text and turning it into less text is still an enormous field of endeavour, and a huge market. It’s still very exciting, all the more exciting because it’s got clear boundaries and isn’t hype-driven over-reaching, or dependent on LLMs overnight becoming way better than they currently are.
This tracks (ahem) with my experience of coding on trains.
Hidde lists the potentially flaky connectivity as a downside, but for many kinds of deep work I’d say it’s very much a feature, not a bug.
When I went up to London for the State of the Browser conference last month, I shared the train journey with Remy.
I always like getting together with Remy. We usually end up discussing sci-fi books we’re reading, commiserating with one another about conference-organising, discussing the minutiae of browser APIs, or talking about the big-picture vision of the World Wide Web.
On this train ride we ended up talking about the march of time and how death comes for us all …and our websites.
Take The Session, for example. It’s been running for two and a half decades in one form or another. I plan to keep it running for many more decades to come. But I’m the weak link in that plan.
If I get hit by a bus tomorrow, The Session will keep running. The hosting is paid up for a while. The domain name is registered for as long as possible. But inevitably things will need to be updated. Even if no new features get added to the site, someone’s got to install updates to keep the underlying software safe and secure.
Remy and I discussed the long-term prospects for widening out the admin work to more people. But we also discussed smaller steps I could take in the meantime.
Like, there’s the actual content of the website. Now, I currently share exports from the database every week in JSON, CSV, and SQLite. That’s good. But you need to be a tech nerd to do anything useful with that data.
The more I talked about it with Remy, the more I realised that HTML would be the most useful format for the most people.
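As a rough illustration of the idea, here is a sketch that turns records from a JSON export into standalone HTML pages. This is not the code behind the archive. The field names ("name" and "text") and the sample record are assumptions made up for the example, and the real exports may be shaped quite differently.

```python
import json
from html import escape


def render_page(record: dict) -> str:
    """Build one self-contained HTML page from a record (hypothetical fields)."""
    title = escape(record["name"])
    body = escape(record["text"])
    return (
        "<!DOCTYPE html>\n"
        '<html lang="en">\n'
        f'<head><meta charset="utf-8"><title>{title}</title></head>\n'
        f"<body>\n<h1>{title}</h1>\n<pre>{body}</pre>\n</body>\n"
        "</html>\n"
    )


# Sample data standing in for a real export file.
export = json.loads('[{"name": "Example tune", "text": "Notes with <angle brackets>"}]')
pages = [render_page(record) for record in export]
print(pages[0])
```

Escaping every value with html.escape keeps stray angle brackets in the data from breaking the page.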
There’s a cute acronym in the world of digital preservation: LOCKSS. Lots Of Copies Keep Stuff Safe. If there were multiple copies of The Session’s content out there in the world, then I’d have a nice little insurance policy against some future catastrophe befalling the live site.
With the seed of the idea planted in my head, I waited until I had some time to dive in and see if this was doable.
Fortunately I had plenty of opportunity to do just that on some other train rides. When I was in Spain and France recently, I spent hours and hours on trains. For some reason, I find train journeys very conducive to coding, especially if you don’t need an internet connection.
By the time I was back home, the code was done. Here’s the result:
The Session archive: a static copy of the content on thesession.org.
If you want to grab a copy for yourself, go ahead and download this .zip file. Be warned that it’s quite large! The .zip file is over two gigabytes in size and the unzipped collection of web pages is almost ten gigabytes. I plan to update the content every week or so.
I’ve put a copy up on Netlify and I’m serving it from the subdomain archive.thesession.org if you want to check out the results without downloading the whole thing.
Because this is a collection of static files, there’s no search. But you can use your browser’s “Find in Page” feature to search within the (very long) index pages of each section of the site.
You don’t need a web server to click around between the pages: they should all work straight from your file system. Double-clicking any HTML file should give a starting point.
I wanted to reduce the dependencies on each page to as close to zero as I could. All the CSS is embedded in the page. Likewise with most of the JavaScript (you’ll still need an internet connection to get audio playback and dynamic maps). This keeps the individual pages nice and self-contained. That means they can be shared around (as an email attachment, for example).
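One way to get that kind of self-contained page is to swap the stylesheet link for an inline style block when the page is generated. The sketch below is an assumption about how that could be done, not a description of the archive's build. The function name, the sample markup, and the /css/site.css path are all hypothetical, and it assumes each page has a single stylesheet link.

```python
import re

# Matches a <link ... rel="stylesheet" ...> tag (a simplification of real HTML).
LINK_TAG = re.compile(r'<link\b[^>]*\brel=["\']stylesheet["\'][^>]*>', re.IGNORECASE)


def inline_css(html: str, css: str) -> str:
    """Replace the first stylesheet <link> with a <style> block holding css."""
    style_block = f"<style>\n{css}\n</style>"
    # A function replacement stops backslashes in the CSS being treated as escapes.
    return LINK_TAG.sub(lambda _match: style_block, html, count=1)


page = (
    '<html><head><link rel="stylesheet" href="/css/site.css"></head>'
    "<body><h1>Tune</h1></body></html>"
)
standalone = inline_css(page, "h1 { color: green; }")
print(standalone)
```

A regular expression is enough for markup you generate yourself. For arbitrary HTML, a proper parser would be the safer choice.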
I’ve shared this project with the community on The Session and people are into it. If nothing else, it could be handy to have an offline copy of the site’s content on your hard drive for those situations when you can’t access the site itself.
Obviously I’m biased, but I very much agree with Sophie.
This online course from Sara looks superb!
I know how overwhelming and even frustrating accessibility may feel at first. But I promise you, accessibility isn’t always as hard as it seems (especially if you know where and when to start!). And my goal with this course is to make it friendlier and more approachable.
Best of all, there’s $100 off if you sign up now—that’s a 25% saving.
I endorse this statement.
I want to live in a future where Artificial Intelligences can relieve humans of the drudgery of labour. But I don’t want to live in a future which is built by ripping-off people against their will.
- Be skeptical of PR hype
- Question the training data
- Evaluate the model
- Consider downstream harms
There’s a general consensus that large language models are going to get better and better. But what if this is as good as it gets …before the snake eats its own tail?
The tails of the original content distribution disappear. Within a few generations, text becomes garbage, as Gaussian distributions converge and may even become delta functions. We call this effect model collapse.
Just as we’ve strewn the oceans with plastic trash and filled the atmosphere with carbon dioxide, so we’re about to fill the Internet with blah. This will make it harder to train newer models by scraping the web, giving an advantage to firms which already did that, or which control access to human interfaces at scale.
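The point about Gaussian distributions narrowing towards delta functions can be seen in a toy simulation. This is a simplified sketch, not the experiment from the quoted research: each "generation" fits a normal distribution to a small sample drawn from the previous generation's fit. The sample size, number of generations, and seed are arbitrary assumptions.

```python
import random
import statistics


def simulate_collapse(generations=200, sample_size=10, seed=0):
    """Repeatedly refit a Gaussian to its own samples; return the sigma history."""
    rng = random.Random(seed)
    mu, sigma = 0.0, 1.0
    history = [sigma]
    for _ in range(generations):
        # Each generation only ever sees output from the previous one.
        samples = [rng.gauss(mu, sigma) for _ in range(sample_size)]
        mu = statistics.fmean(samples)
        sigma = statistics.pstdev(samples)  # maximum-likelihood estimate
        history.append(sigma)
    return history


history = simulate_collapse()
print(f"first sigma: {history[0]:.3f}, last sigma: {history[-1]:.3g}")
```

With small samples the estimated spread tends to shrink from one generation to the next, so the tails of the original distribution are the first thing to go.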
Design systems as codified constraints.
Everything old is new again:
In our current “information age,” or so the story goes, we suffer in new and unique ways.
But the idea that modern life, and particularly modern technology, harms as well as helps, is deeply embedded in Western culture: In fact, the Victorians diagnosed very similar problems in their own society.
Season three of the Clearleft podcast is here!
The first episode is a nice gentle one to ease into things. It’s about coaching …and training …and mentorship. Basically I wanted to find out what the differences are between those three things.
But I must confess, there’s a commercial reason why this episode is coming out now. There’s a somewhat salesy promotion of an upcoming coaching programme with Julia Whitney. This is definitely the most overt marketing I’ve done on the Clearleft podcast, but if you listen to the episode, I think you’ll agree that it fits well with the theme.
Fear not, future episodes will not feature this level of cross-promotion. Far from it. You can expect some very revealing podcast episodes that pull no punches in getting under the skin of design at Clearleft.
The stars of this episode are my colleagues Rebecca and Chris, who were an absolute joy to interview.
Have a listen and hear for yourself.
Like Bastian, I’m making a concerted effort now to fly less—offsetting the flights I do take—and to take the train instead. Here’s a description of a train journey to Nottingham for New Adventures, all the way from Germany.