Nothing maps
This Cartography of Generative AI, produced by the Estampa collective, presents, in a series of numbered paragraphs, a clarifying exposition of both the material infrastructure (mining, energy and water use, exploitative labor practices) and the ideological superstructure (AI startups, quasi-philosophical think tanks, venture capitalists, marketing channels) necessary to sustain generative models.
I didn’t find the giant chart that accompanies the text all that useful, but perhaps that is the point of it: “The set of relationships presented here forms a mosaic that is difficult to grasp because it involves the linking of objects and knowledge of different kinds and scales.” The massive diagram illustrates how the organization of this and most other industries is necessarily overwhelming and confusing, which is precisely what allows them to deflect responsibility and sustain the plausibility of their self-promoting narratives.
One of the paragraphs from the cartography is about datasets:
The industrial compilation of training datasets is achieved by extracting content from the largest digital archive in existence: the Internet. The data is obtained through automated 'scraping' of online content published and shared by millions of internet users. The original motivation for this extraction did not foresee today's commercial exploitation by start-ups and platforms, but was driven by the desire for scholarly and non-commercial research. Now that these huge digital archives have been used to generate texts and images on demand, we are faced with a series of paradoxes and controversies within the cultural industries. If, on the one hand, the ideology of big data understands Internet content as a vast repository that can be extracted, processed and automated, on the other hand, this extractivist drive is seen by other cultural actors as a process of massive privatisation of the creativity of millions of Internet users.
I’m not sure that this kind of exploitation of data wasn’t foreseen by the platforms and data brokers that worked to amass it. It seems more accurate to say that “scraping” — treating all data as found data that can be readily recontextualized without reference to the reason it was originally collected — had some compelling counter-hegemonic uses, but these may ultimately have served as an alibi for massive data collection and centralization, and for the re-use of data for purposes its providers — when data is intentionally provided rather than captured through surveillance — did not anticipate or consent to.
Part of the ideological infrastructure of AI is rendering that sort of consent irrelevant. Scraping accomplishes that by treating the subjective origin of data as inconsequential. It treats data as useful especially when it is used for unanticipated purposes — when its subjectivity is thereby subtracted. When “user-generated content” is blended with data collected through surveillance — merging “subject of” and “subject to” — it further reinforces the idea that the subjective intention behind some data doesn’t matter. This opens a void where the subject was, a void that can then be filled by “AI.” Any subjective capability AI seems to have is the inverted reflection of the subjectivity invalidated in the process of assembling datasets. (That is, AI can be subject only when human subjects are treated as objects.)
The model-training process also helps further invalidate the idea of consent by making its assimilation of data irrevocable. The paragraph highlights the “paradox” of appropriation at scale as opposed to how such appropriation is experienced by individuals. The dataset is a quantity that becomes a quality, a totality, a discrete piece of property that can’t be reduced to the trillions of pieces that compose it. The process of training a model makes that more than a rhetorical framing: It cooks datasets into something distinct and operational from which the data can no longer be extracted or deleted, even as the model no longer needs it to function.
But these sorts of antagonisms undermine generative AI’s future development. Not only does generative AI’s prominence threaten to intensify various forms of resistance — withholding data, mounting legal and political challenges to the industry’s practices, etc. — but its output also calls into question the status of data itself. Though themselves dependent on large-scale datasets, generative models are almost uniquely equipped to pollute such datasets at scale. So while models have been built through a kind of deranged positivism (all the data, no matter what it is or where it came from, can be seen as significant to any and every question), they ultimately invalidate that premise by feeding synthetic data back into the field of social practices that “data” is supposed to unproblematically reflect. Data can’t just be “found” and still held to be true when it is also so conspicuously being made by models, and at a nearly unlimited scale.
Consider, for example, this 404 Media item by Emanuel Maiberg about how Google Books now indexes AI-generated texts as though they were publications that reflected something meaningful about human practices. This could distort the output of tools like Google’s Ngram viewer, which tracks the frequency with which words and phrases have been used historically. “Either they will be unreliable for teaching us about human-made culture, or they say something perhaps more bleak: that human-made culture is being replaced by AI-generated content,” Maiberg writes.
One can foresee this kind of dataset supplementation affecting any kind of analysis that relies on scraped data, which can no longer be taken to reflect empirical human practice. Instead, it must be assumed that machines will be used to generate data that simulates and obscures human practices, making it harder to determine what different populations are doing or have done. But this isn’t so much a matter of human-made culture being replaced by machine-generated content. Rather, machine-made content (which reifies existing statistical biases as “truths”) will be used to reproduce and entrench the usual distortions already introduced into data by power imbalances, social inequities, and forms of categorical discrimination. Empirical data will be fine-tuned through the augmentation of simulated data to keep producing the expected and familiar outcomes, which have been set up as statistical norms rather than recognized as historical injustices. Protecting “pure” pools of data from generative models’ intrusion will require a different kind of “distortion.” Generative models are premised on an “anything goes” model of data collection that undermines the idea that particular data can retain meaning. Instead it’s all and nothing.
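To make that dilution mechanism concrete, here is a minimal sketch (in Python, with entirely invented numbers) of how synthetic text swept into a scraped corpus can shift a frequency estimate of the kind the Ngram Viewer reports. The corpus sizes and usage rates below are hypothetical, not drawn from Maiberg’s reporting.

```python
# Hypothetical illustration: synthetic text diluting a corpus statistic.
# Every number here is invented for the sake of the example.

human_docs = 1_000_000      # scraped documents actually written by people
human_rate = 0.02           # fraction of human documents using some phrase

synthetic_docs = 4_000_000  # model-generated documents swept up by the same scraper
synthetic_rate = 0.10       # models overproduce the phrase (a reified statistical bias)

# The scraper can't tell the two populations apart, so it averages them.
observed_rate = (
    (human_docs * human_rate + synthetic_docs * synthetic_rate)
    / (human_docs + synthetic_docs)
)

print(f"true human usage: {human_rate:.1%}")     # 2.0%
print(f"measured 'usage': {observed_rate:.1%}")  # 8.4%
```

The measured frequency says more about the models than about human practice, which is the sense in which “found” data stops being a reliable reflection of anything.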
“Made with AI”
The image above is a screenshot I took of an image in this Facebook policy post. I manually drew the window around part of the original image, and then my computer executed the process of turning that idea into an actual png file that I could drop into this post. I used technology to manipulate media. Would it make sense then to say that my cropping had been “made with AI”? After all, it is not the “real” image Facebook used anymore. I have distorted that reality by using the devious truth-obliterating cmd-shift-4 capabilities built into this computer’s operating system.
What purpose does it serve, in other words, to conflate manipulated media in general with “AI,” as Facebook seems to want to do in its plan to label some content as “Made with AI”? Wherever the line is drawn will seem somewhat arbitrary when framed in those terms. Anything posted to a platform is “manipulated media,” anything that is recirculated is “manipulated,” anything circulated by an algorithm is certainly “manipulated.” If mediation is seen as inherently and normatively representational, then any mediation is manipulation. The whole of Facebook’s platforms should be understood by their users as “manipulated media.” Labeling some content as “Made with AI” seems to suggest that there is something intrinsically more real about the rest of it, no matter how it was made or how deliberately misleading it might be under certain conditions.
Facebook is using “AI” to camouflage the fact that it is performing content moderation according to a particular standard that is inevitably not neutral but negotiable. The company used to invoke an empty idea of “community standards,” as though its products were used by a homogeneous group with a shared set of values — as though its platforms weren’t also rhetorical battlefields. Now it will use “AI” to obfuscate its own responsibility in promoting some content over other kinds.
What should be done about “manipulated media” on platforms then? I tend to agree with a point Meredith Whittaker made in this piece about the inadequacy and misguidedness of banning TikTok:
The world would be better if these platforms were dismantled and their revenues shared with the people, professions, and communities whose livelihoods and public spaces they’ve worked to foreclose, and if a more localized variation on digital spaces for deliberation, discussion, and discovery could be constructed in their wake.
The problems with platforms don’t have much to do with whether their content was “made with AI” or not. They have everything to do with the overall business model that determines how platforms treat any kind of content whatsoever.
Toss me a carrot
I’ve been reading a transcript of a class on Adorno’s Aesthetic Theory that Fredric Jameson taught at Duke University sometime in the early 2000s, published as Mimesis, Expression, Construction. It’s almost shockingly incoherent, and I would have assumed the student who assembled it was carrying a grudge against Jameson if Jameson himself hadn’t apparently authorized it. Maybe he thought it would help weed some students out of future seminars, like when professors act like tyrants during the first class of the semester or load their syllabi with rules they have no intention of enforcing. I used to have students read some Adorno on day one of the Freshman Composition courses I had to teach: that seemed to help minimize my paper-grading load.
As dull and meandering as most of the transcripts are, and as annoying as the editor’s choice to include every mm-hmm and um and more [pause]s than a Pinter play can be, they have their rewarding moments. These are largely not in how Jameson clarifies Adorno’s ideas (though I found the material on Entkunstung — “de-artification” in Jameson’s translation — useful as a way of thinking about generative AI) but in how the transcripts capture the habitus of graduate school “theory” and how it is built from asides about “my friend Peter Bürger” or “chatting with Tim Clark,” or “Slavoj and Dolar” coming with the rest of “our Slovenian friends” to give some “important” papers, or the innumerable passing references to a range of books and writers (Levinas, Lyotard, Derrida, Heidegger, Sloterdijk, Vico, Lessing, Auerbach, Hegel, Kleist, Musil, etc. etc.) that no student could possibly be fully conversant with. It makes me feel like I didn’t miss so much in being an autodidact (just the credentials, the connections, and the sense of entitlement that comes from elite schooling).
My favorite bit so far, though, comes when, in discussing the problem of nature’s relation to beauty and art, Jameson quotes Sartre complaining about the “righteous” bourgeois people who pat themselves on the back for touching grass:
Several times a year the urban communities of honest folk decide in common — taking into account the necessities of economic life — to change their attitude towards big farming areas. On a given date, town society expands, plays at disintegration; its members betake themselves to the country where, under the ironic eyes of the workers, they are metamorphosed for a while into pure consumers … It is then that Nature appears. What is she? Nothing other than the external world when we cease to have technical relations with things. The producer and the distributor of merchandise have changed into consumers, and the consumers changed into mediators.
Correlatively, reality changes into a setting; the righteous man on vacation is there, simply there in the fields, amidst the cattle; reciprocally, the fields and the cattle reveal to him only their simple being-there. That is what he calls perceiving life and matter in their absolute reality. A country road between two potato patches is enough to manifest Nature to the city dweller. The road was laid out by engineers, the patches are tilled by peasants. But the city dweller does not see the tilling, which is a kind of work that remains foreign to him. He thinks that he is seeing vegetables in the wild state and mineral matter in a state of freedom … If a farmer goes across the field, he will make a vegetable of him too.
I sometimes have the sense that thinking too much about AI and all that leads me to have a similar kind of sentimental feeling for the “human,” as something that is just naturally given on the other side of “technical relations.” Rather than seeing the category of “human” as dependent on all sorts of social and technical mediations, I can imagine that algorithms and AI and social media are bad impositions that threaten a “humanity” and “life” that otherwise goes unquestioned and is attributed equally and equitably to all. The blatant absurdity of AI hype goads us not only into overrating what machines can achieve but also into misconstruing “the state of freedom” as simply the absence of machines, as if that were even a coherent wish. Creativity is not a vegetable, no matter how healthy and wholesome it might seem.
Favorite vegetable currently is the artichoke.