Text is the universal interface

How we've forgotten this to be true

Aug 25, 2023

Back when I first started writing code in Notepad, I didn’t think I was really writing software. I reasoned at the time that real software required special tools made by Wizards, who actually understood how all this really worked, and the output was something-something binary.

Text went in, and binary went out. You couldn’t explain it.[1] It’s just how it worked.

And for moving data between systems, you had to use APIs, and integrations, and databases. Oh yes, databases were very special pieces of software. That’s how real data was stored.

It was all very serious and Enterprise and Oracle and stuff.

Text is the universal interface

Text is in all of this software we write. The code itself. How we interact with CLIs on whatever TTY we use.

But even more than that, the data format we exchange is text. Sure, it’s structured. We’ll put it into JSON or XML or YAML or something else. But largely, those are relatively light data structures to organize text.

It’s all generally speaking just text.
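To make that concrete, here is a small sketch in Python. The record and its field names are made up for illustration. The only point is that the thing you actually send between systems is an ordinary string.

```python
import json

# A made-up record, standing in for whatever two systems might exchange.
record = {"channel": "general", "user": "ana", "message": "It's just text after all."}

# Serializing gives back an ordinary Python str.
payload = json.dumps(record)
print(type(payload).__name__)

# Because it is text, plain string tools work on it before any parsing.
print("just text" in payload)

# Parsing that text gives the original structure back.
assert json.loads(payload) == record
```

The structure is a convenience layered on top. What travels is text.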

It goes broader in our experience than that, though, so bear with me while I go on what might seem like a tangent.

Reading text is the cornerstone of modern education. Pictures or diagrams are great aids. Even when you draw a picture of something, there's almost always text somewhere to explain what the picture means.

The primary means of conveying most information is just text: the density of information that can be expressed with just text is enormous.

If I ask you to describe how to boil an egg with words, you can do it, though you'd probably like to add some pictures at key points to make it easier. On the flip side, if you attempt to describe boiling an egg with pictures alone, without any kind of marking to communicate temperature or time, you'll find yourself very frustrated.

I can describe roughly what an elephant looks like in a thousand words, and you might get there with those alone. A picture is of course better.

Then there are times where images completely fail. A single word like silence or honesty is difficult to paint in even a thousand pictures. They are inherently abstract.

Sometimes a picture is worth a thousand words, and sometimes a word is worth a thousand pictures.

We default, though, to text. We all intuitively know this to be true.

Generative AI has exploded in the past year, and there could have been several ways for us to interact with it, but the primary approach being taken is a chat bot wherein you ask for things primarily in text. You may get something other than text back, but most of the time, you get text back.

The vast majority of the apps we engage with have text as the primary thing for you to consume. I realize that TikTok and Instagram and YouTube are dominant forms of media now, but I would encourage you to pay attention to how much text surrounds even those. Not just the navigation either, but in the actual visual content there will be a great deal of text to be consumed.

You rarely get a purely symbolic or image or video context for software.

And yet even with all this text everywhere, moving it from one tool to another relies on copy and paste and not text streams.

It should be easy. It’s just text after all.

"I wear the chain I forged in life," replied the Ghost. "I made it link by link, and yard by yard; I girded it on of my own free-will, and of my own free-will I wore it. Is its pattern strange to you?" — A Christmas Carol by Charles Dickens

Streams are Not Everywhere

UNIX made a fantastic set of decisions with text streams that GUIs have largely decided to ignore.[2] Specifically, the principle: “Write programs to handle text streams, because that is a universal interface.”

The most basic example is literally the “Hello world” application that everyone writes. Unless you learn this in a web browser or desktop or mobile app, you’ll experience it on a command line.

You can hand that output to another program in the shell and in fact chain several programs together relatively easily.
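Here is a rough sketch of why that chaining works, written in Python rather than shell. The `grep` and `upper` functions are toy stand-ins I made up for this example, not the real tools. Each stage knows only that lines of text come in and lines of text go out.

```python
import io
import sys
from typing import Iterable, Iterator


def grep(pattern: str, lines: Iterable[str]) -> Iterator[str]:
    """Yield only the lines that contain pattern, like a tiny grep."""
    return (line for line in lines if pattern in line)


def upper(lines: Iterable[str]) -> Iterator[str]:
    """Yield each line upper-cased."""
    return (line.upper() for line in lines)


# Stand-in for another program's output. In a real pipeline this
# would be sys.stdin rather than an in-memory stream.
hello_output = io.StringIO("Hello world\nGoodbye world\n")

# Chain the stages: the output of one becomes the input of the next.
for line in upper(grep("Hello", hello_output)):
    sys.stdout.write(line)
```

Neither stage needed to know the other existed. That is the whole trick.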

As a user experience expectation, for whatever reason, we have largely accepted that this is not how most software works any more. If you write it for yourself, you can of course make it that way. But it’s not often even an option for most applications.

Thus, we’ve wound up in this strange world where text-heavy applications have no means to send text streams out of them aside from copy and paste.

For instance, if I have Slack open and I want to send that text from one channel to a CLI, that’s like a whole integration into the ecosystem. Fortunately, you can just copy and paste it into a text file and then do work on it.

But it shouldn’t be that hard. It’s just text after all.

Expect the output of every program to become the input to another, as yet unknown, program. Don't clutter output with extraneous information. Avoid stringently columnar or binary input formats. Don't insist on interactive input. — UNIX Time Sharing System

Make Streams Everywhere

If you make it easy for the applications you work on to also have text streams come out of them, you’ll find that they’re:

easier to test
easier to debug
easier to refactor
easier to integrate with other systems
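As a minimal sketch of the testing point, assume an application with a hypothetical `export_messages` function that writes one line per message to whatever text stream it is handed. The function name and the tab-separated format are my own inventions for this example.

```python
import io
import sys
from typing import Iterable, Optional, TextIO, Tuple


def export_messages(
    messages: Iterable[Tuple[str, str]], out: Optional[TextIO] = None
) -> None:
    """Write one 'user<TAB>text' line per message to a text stream."""
    stream = sys.stdout if out is None else out
    for user, text in messages:
        stream.write(f"{user}\t{text}\n")


# In a test, swap the terminal for an in-memory stream and compare strings.
buffer = io.StringIO()
export_messages([("ana", "hello"), ("bo", "hi there")], out=buffer)
assert buffer.getvalue() == "ana\thello\nbo\thi there\n"
```

The test is a string comparison, and the same output could be piped into any other tool without the application knowing about it.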

I will grant you for the moment they’re not easier to build,[3] but I think it’s because we’ve been thinking of them in the wrong way for too long.

So for the next application you build, change this default.

You will likely thank yourself later. Hopefully, it won’t be too hard.

It’s just text after all.

[1] “Never a miscommunication. You can’t explain that.” Sorry. I just couldn’t help myself.

[2] I blame Windows the most for this, because its MS-DOS shell was a pain the entire time I was growing up. It wasn’t until I got to use Linux that I actually liked using a shell.

[3] I’ll also grant you that plain text streams may not be the best choice in many cases, but I’ll take plain text over nothing.

Robert Roskam's Newsletter
