Instructions for Humans was a 2017 art exhibition with performances at BOM (Birmingham Open Media), capping off a three-year period of research and development. This is the long-delayed final document tying off the project. It is currently a work in progress.

Table of Contents

Bold = complete
Normal = half-written
Italic = unwritten

The Ideas

The Artwork

  • Exhibition
  • Performances
  • Reflections

Conclusion


Data is Data

First image of the far side of the Moon taken by the Luna 3 spacecraft in 1959.

Early unmanned space exploration vehicles took images of the Moon with film cameras. If re-entry to Earth was not possible, these photographs were automatically developed using chemicals, dried, scanned and transmitted back to Earth as radio waves. A photograph was not data, but it could become data by measuring how light reflected off small areas of its surface, from the top left to the bottom right. If light reflects, send a blip; if it doesn't, send silence. On or off, 1 or 0. Then, at another time and place, those blips, or lack of them, are used to fill in some, not all, of the areas of a grid and the image appears. Not identical, missing some of the analogue nuance, but good enough.
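Here's a minimal sketch of that facsimile idea in Python (my choice of language, and not Luna 3's actual encoding scheme): sample the brightness of each small area, threshold it to on or off, transmit a flat stream of bits, and rebuild the grid at the other end.

```python
# Not Luna 3's real encoding -- just the principle: measure light area by area,
# send a blip (1) or silence (0), then fill in a grid from the received bits.

brightness = [                       # measured light, top left to bottom right
    [0.9, 0.8, 0.2, 0.1],
    [0.7, 0.3, 0.2, 0.1],
    [0.6, 0.2, 0.1, 0.0],
]

# Scan: enough light reflects -> send a blip (1), otherwise send silence (0)
bits = [1 if sample > 0.5 else 0 for row in brightness for sample in row]

# Receive: fill in the areas of a grid again, one cell per transmitted bit
width = len(brightness[0])
received = [bits[i:i + width] for i in range(0, len(bits), width)]

for row in received:
    print("".join("#" if bit else "." for bit in row))
```

The thresholding throws away the greyscale detail, which is where the missing analogue nuance goes.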

Before too long it was possible to bypass the chemicals and simply measure the light inside the camera using a sensor. A self-contained, theoretically portable digital camera was produced by a Kodak engineer in 1975. It weighed 3.6kg, recorded 10,000 black-and-white pixels and stored them on magnetic tape, taking 23 seconds per image. All consumer and professional digital cameras pretty much follow its process, albeit at somewhat higher resolutions and speeds.

The first self-contained digital camera.

At first, digital photographs were functionally the same as analogue ones, they just didn't look as good. They had their uses, but photographers and publishers tended to prefer the high quality of film, which could be scanned into a computer system at a later date if needed. Around the mid-2000s the quality increased to the point that, when combined with convenience, analogue film became a luxury, discarded by both professionals and consumers. In 2012, 37 years after starting this revolution, Kodak filed for bankruptcy.

We now live in a world ruled by digital images. The vast majority of people carry a digital camera with them at all times which is capable of publishing its images to the internet, where the vast majority of people are able to see them. These photographs exist only as microscopic magnetic switches on hard drives and storage cards. You cannot look at them, in the same way you cannot hear music by looking at the grooves on a vinyl record. In order to see a digital image it has to be processed, translated from encoded data into areas of a screen lit up in different colours. Each screen or device translates the data slightly differently depending on its size or operating system. And different apps might use different interpretations depending on their requirements. When I look at a photograph on my computer and then send it to you to look at on your phone, we are looking at different interpretations of the same data. We are not looking at the same photo.

On the surface this doesn't really matter. Photography has always been technologically mediated. The process of making prints from negatives will differ depending on the machines, chemicals and papers employed, and the settings given by the operator. And the act of looking itself introduces enough psychological and emotional variables that any subtle differences in rendering are pretty much moot. Finally, of course, the medium is the message, meaning that the fact that you are looking at the photo in Facebook on your phone on the bus is way more important to your understanding of it, and to any actions you might take because of it, than the contents of the image itself. Data is just the raw material. And as long as everyone's getting the same data then we can try to control the media that translate it, say by regulating monopolies and encouraging a plurality of platforms.

But once we dig a bit deeper, this fundamental nature of digital images – that they do not exist as images until they are interpreted as such by software – starts to matter quite a bit.

We think of a JPEG file as an image, but it isn't. It's a data format, a way of storing information, which lends itself to being interpreted as an image by software. It can also be interpreted as sound. When you play a JPEG file as sound it mostly sounds like static noise, but it's still a distinct sound unique to that file. Some people like to manipulate image files by importing them into audio editing tools, applying effects and saving them back as images. If you've ever wondered what a photograph would look like when put through an echo filter, wonder no more.
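For the curious, here's a rough sketch of the first half of that trick in Python: take the file's raw bytes and wrap them, untouched, in a WAV header as 8-bit samples. The filename is just a placeholder, and audio editors like Audacity offer a similar route via their "import raw data" option.

```python
# Interpret a JPEG's bytes as audio samples -- a sketch, not a fixed recipe.
import wave

with open("photo.jpg", "rb") as f:        # any JPEG file will do
    raw = f.read()

with wave.open("photo_as_sound.wav", "wb") as out:
    out.setnchannels(1)                   # mono
    out.setsampwidth(1)                   # 1 byte per sample (8-bit audio)
    out.setframerate(44100)               # playback rate, chosen arbitrarily
    out.writeframes(raw)                  # the image data, byte for byte
```

Play the result and you get the static described above: the compressed image data, heard rather than seen.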

A photo of a mantis which has been loaded in the audio editing application Audacity and had the echo filter applied to the middle section.

What's kinda fascinating is it looks just like an image that's been put through an echo filter.


Digital data is a sequence of switches, some of them on, some of them off. When we take a photograph we translate the light that comes through the lens into millions of switches, on and off. The same thing happens when we record sound digitally – millions of switches. Or when we save a word processing file, or a CAD drawing, or a web page. Everything that we call "digital" is a sequence of switches. On and off. 0s and 1s.

In order to experience these recordings, these creations, these pieces of media, we have to translate them from their stored state into something we can perceive. Most of the time this is pretty linear. Photo software turns JPEGs into images. Music software turns MP3s into sound. Word processing software turns DOCs into text. Just as record players turn vinyl into music or printing presses turn metal type into newspapers. But it doesn't have to be.

Years ago, when home computers were new and most households had record players, computer data was, very, very occasionally, distributed on vinyl records. You would play the record and send the audio not to the speakers but to the computer, which would interpret the different tones as 0s and 1s. Digital data to analogue sound to digital data.
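To make the idea concrete, here's a toy version in Python. It isn't the format any real cassette or vinyl loader used, just the principle: one tone means 0, a higher tone means 1, and the receiving computer tells them apart (here, by counting zero crossings).

```python
# Encode bits as two audible tones and decode them again -- a toy scheme,
# not a historical loader format.
import math

RATE, BIT_LEN = 8000, 800      # samples per second, samples per bit
F0, F1 = 500, 1500             # tone (Hz) for a 0 and for a 1

def encode(bits):
    audio = []
    for bit in bits:
        freq = F1 if bit else F0
        audio += [math.sin(2 * math.pi * freq * n / RATE) for n in range(BIT_LEN)]
    return audio                # "analogue" sound, ready for the speakers

def decode(audio):
    bits = []
    threshold = (F0 + F1) * BIT_LEN / RATE     # midway between the two tones
    for i in range(0, len(audio), BIT_LEN):
        chunk = audio[i:i + BIT_LEN]
        crossings = sum(1 for a, b in zip(chunk, chunk[1:]) if a * b < 0)
        bits.append(1 if crossings > threshold else 0)
    return bits

message = [0, 1, 1, 0, 1, 0, 0, 1]
assert decode(encode(message)) == message      # digital -> analogue -> digital
```

In practice the sound would pass through a stylus, amplifier and cable before reaching the computer, but as long as the tones survive, the bits survive.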

This isn't news to anyone who had a ZX Spectrum or some other home computer which loaded software from cassette tapes. Games came on exactly the same kind of tapes as albums did. Part of the nostalgia for that era is the sound of the software playing in a tape deck as much as the game itself.

A similar aural nostalgia can be had for the dial-up "boing boing" sounds produced by a computer modem connecting to the internet over a phone line. For those of us online at home in the 1990s and early 2000s, this was the anthem of the internet, a digital conversation between two computers rendered as a song for us to sing along to. There was no good reason for it to be audible to humans, but making it audible neatly illustrated the neutrality of the digital signal. By design this was code to be interpreted by the modem's circuitry. But it was also music, as demonstrated by the band Looper in 2000.

And using sonography, the same technology that allows parents to see their unborn child, it can be turned into a visual graphic. Same data, different results.


The transmutability of digital information reveals a fundamental truth. Digital media is not constrained by our analogue distinctions of media forms, be they media used in the production of art and craft, or media tools for communication. We use these categories as a way of organising the information (images in JPEG files, audio in MP3s) but we don't need to. All the computer needs is the binary data, the 0s and 1s, and instructions on how to process it.
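A tiny illustration of that point, with made-up data: the same eight bytes handed to three different instructions, none of which is more correct than the others.

```python
# The same bytes read as text, as a number, and as greyscale pixel values.
data = b"INSTRUCT"                                  # eight arbitrary bytes

as_text   = data.decode("ascii")                    # a short string
as_number = int.from_bytes(data, byteorder="big")   # one very large integer
as_pixels = [b / 255 for b in data]                 # eight grey levels, 0..1

print(as_text)
print(as_number)
print(as_pixels)
```

The bytes don't change; only the instruction does.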

That instruction can be to make it accessible to human senses, or it can be something else. More often than you imagine, it is something else.


Artificial Intelligence and Machine Learning

Draft – work in progress

The saga of Artificial Intelligence in the 20th century is messy and contradictory, in part because the goal of "intelligence" or "consciousness" is a philosophical question which cannot easily be reduced to mathematics. I've found it's safe to assume that any reference to AI has nothing to do with intelligence. Computers do not think, and if they give an impression of thinking, that's you projecting onto them. That we might perceive intelligence is interesting, but it's a fallacy.

One approach to developing AI, attempted in the 70s and again in the 90s, tried to model the neural pathways of the brain, the electric signals that ping around our skulls as we ponder stuff. It failed, partly because that's not how intelligence works, but mainly due to limited processing power.

A few years ago, when computing power had increased to what would once have been considered an absurd degree, someone dug out this artificial neural network software and gave it another spin. It still didn't produce intelligence, but it proved very good at analysing massive amounts of data.

This gave rise to the Machine Learning revolution of the mid-to-late 2010s. Machine Learning is analogous to teaching a toddler to recognise pictures. You show the toddler a picture of a dog and say "dog". Then you do the same for a picture of a cat. Then more dogs and more cats, and eventually when the toddler sees a cat they say "cat", even if they've never seen that particular cat before. But if you show the toddler a picture of a fox for the first time, they'll call it either a dog or a cat, because no training has taken place for foxes.

Machine Learning algorithms work in the same way. They are fed a boat-load of labelled data and attempt to find patterns within the categories. What makes everything labelled "house" unique and distinct from everything not labelled "house"? After a while "house" becomes associated with a number which images of houses will usually produce when processed through the algorithm. Now, when a new image is analysed, if it spits out the "house" number the computer will say, with a degree of certainty, that the image contains a house.

The clever bit happens when it gets it wrong. If you then tell the system that it's not a house, it will take that on board and adjust the algorithm accordingly, just as the toddler now knows that things that look like foxes are not cats or dogs.
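Here's a toy sketch of that label, guess and correct loop in Python. The features and numbers are invented (pretend they measure ear pointiness and snout length); real systems juggle millions of values, but the adjustment step is the same idea: when the guess is wrong, nudge the numbers that produced it.

```python
# A toy classifier learning "cat" vs "dog" from labelled examples.
examples = [                 # (features, label): 1 = "cat", 0 = "dog"
    ([0.9, 0.2], 1), ([0.8, 0.3], 1), ([0.7, 0.1], 1),
    ([0.2, 0.9], 0), ([0.3, 0.8], 0), ([0.1, 0.7], 0),
]

weights, bias, rate = [0.0, 0.0], 0.0, 0.1

for _ in range(20):                              # a few passes over the data
    for features, label in examples:
        score = sum(w * x for w, x in zip(weights, features)) + bias
        guess = 1 if score > 0 else 0
        error = label - guess                    # 0 if right, +/-1 if wrong
        weights = [w + rate * error * x for w, x in zip(weights, features)]
        bias += rate * error

# A new, unseen animal. Note that a fox would also come out as "cat" or "dog":
# there is no third answer, because there was no third label in the training data.
new = [0.85, 0.15]
print("cat" if sum(w * x for w, x in zip(weights, new)) + bias > 0 else "dog")
```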

The mantra of Machine Learning is "more data" and it's relevant that it emerged in an era where corporations had accumulated masses of information, most of it in digital form, stored in "the cloud". Google's index of the web, Flickr and Instagram's photos, Facebook and Twitter's postings. Billions upon trillions of things just sitting there in data centres waiting to be served to the web but otherwise doing nothing. It didn't take long before these companies started churning your stuff through their ML systems, refining their recognition algorithms to better understand the world.

The self-driving car is often cited as the holy grail here. You may have noticed a few years back that those "are you a robot" tests switched from hard-to-read text to squares from Google Street View. At the time of writing they seem to need me to identify palm trees or traffic signals, and there's a reason for that. Palm trees and traffic signals are very similar when rendered as data. You're helping Google's algorithm refine its understanding of them to the point where it's able to tell them apart. And then the self-driving car will stop when it's supposed to.

The irony here is the "are you a robot" test is being employed to train a robot to pass a future "are you a robot" test. The tragedy is we're doing unpaid labour training corporate robots so we can log in and get to our stuff.

I don't see much intelligence in this stuff. It's just pattern matching on a massive scale.


Deep Dream and AI Art

Draft – work in progress

One of the interesting features of Machine Learning is that in order to learn it has to create stuff to test its new models. For example, a Generative Adversarial Network, or GAN, sets two neural networks against each other in a way that improves both their success rates. Using a mathematical interpretation of a training set of, say, photographs, one attempts to generate a candidate while the other attempts to judge whether it could belong to the training set. Their degree of success or failure is fed back into the model and it is tested again. After doing this a few thousand times the GAN can not only recognise things in photographs, it can also generate images that look like photographs, but aren't.
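As a sketch of that loop, here's a heavily simplified GAN written with PyTorch (my choice of library, an assumption rather than anything named in the text). The "photographs" are just single numbers drawn from a bell curve, so it runs in seconds, but the generator/discriminator feedback has the same shape as in the image-generating systems described above.

```python
# A minimal GAN: a generator learns to produce numbers that a discriminator
# cannot tell apart from the "real" training data (a simple bell curve here).
import torch
import torch.nn as nn

G = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 1))                 # generator
D = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1), nn.Sigmoid())   # discriminator

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()
real_label, fake_label = torch.ones(64, 1), torch.zeros(64, 1)

for step in range(2000):
    real = torch.randn(64, 1) * 0.5 + 2.0       # stand-in "training set"
    fake = G(torch.randn(64, 8))                # the generator's candidates

    # The discriminator learns to tell real samples from generated ones...
    d_loss = bce(D(real), real_label) + bce(D(fake.detach()), fake_label)
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # ...and the generator adjusts itself to fool the discriminator.
    g_loss = bce(D(fake), real_label)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

print(G(torch.randn(5, 8)).detach().squeeze())  # five samples that were never "taken"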

These reproductions are unique, in that they are not copies, but they are not unique, in that they are derived from, and constrained to the variations of, the training set of photographs. Say you took a thousand photos of yourself and cut them up into tiny squares which you mixed up in a bag. Then you go through the bag and select enough squares to create a photo of yourself. This new image will be unique, as those elements won't have been combined in that way before, but it will only ever be made up of those first thousand photos. In other words, a set of pictures of you can only produce pictures of you. This might seem obvious but it often gets forgotten, and it is the source of many problems with bias in machine learning.

The technology behind this is amazing, though. What we're really talking about is how the maths running in our machines sees and understands the world. Since human perception is a rich area for artists, it seems logical that machine perception would be too. Indeed, one of my favourite pieces of art from this era makes that connection explicitly.

Like Magritte's original, that image lends itself to a lot of unpacking. What even does it mean for an image classification algorithm to identify a painting of a pipe with a confidence of 0.145? One thing it implies is subjectivity within computer vision, that a slightly different algorithm trained on a slightly different set of images would have a slightly different confidence. Which means we, on the other side of the camera lens, cannot predict how a computer system will see us. Are we seen or are we invisible? And if we are seen, how are we interpreted? What data-set are we associated with?
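For a sense of where a number like 0.145 comes from, here's a sketch in Python with invented class names and scores: a classifier's raw scores get squashed into proportions, and "confidence" is just one class's share. Train a slightly different model and the same image gets a slightly different number.

```python
# Two hypothetical models scoring the same image -- the labels and raw scores
# are made up purely to illustrate how a confidence value is produced.
import math

def confidences(raw_scores):
    """Softmax: turn raw class scores into proportions that sum to 1."""
    exps = {label: math.exp(score) for label, score in raw_scores.items()}
    total = sum(exps.values())
    return {label: value / total for label, value in exps.items()}

model_a = {"pipe": 1.0, "wind instrument": 2.2, "hook": 0.9, "ladle": 1.5}
model_b = {"pipe": 1.2, "wind instrument": 2.0, "hook": 1.0, "ladle": 1.4}

print(round(confidences(model_a)["pipe"], 3))   # one model's "pipe" confidence
print(round(confidences(model_b)["pipe"], 3))   # another model, another number
```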

A few years ago I was working on a performance that involved a camera obscura installed on a pedestrian thoroughfare on a traffic island in Birmingham. We had a courtesy visit from the local beat police officer who let us know his details in case we had problems and cautioned us to watch out for the "Polish street drinkers" who could get aggressive. But while we were setting up I'd had a few conversations with scruffy looking blokes with east European accents about what we were doing. They weren't aggressive in the slightest. A bit odd, sure, but friendly and curious. I don't blame the officer – his experience of them was always going to be confrontational and that informs his view of them, which was not relevant to my situation. Similarly my experience would not be a useful metric to inform policing after dark.

  • Confirmation bias, "the tendency to search for, interpret, favor, and recall information that confirms or supports one's prior personal beliefs or values" (Wikipedia), is a fascinating thing.
  • From recognising stuff to making stuff.
  • Pictures that were never taken. Deep fakes.
  • A dead end of tech demos.

Tech Solutionism


Mass Surveillance

Making sense of a post-Snowden world.


Mediated World

How do we construct our realities when everything is media? Is everything media?


Who controls the reality construction

The corporatisation of life online and off.


The Black Box

The systems of the internet are hidden.


Cargo Cults