I can’t say I entirely understood the stakes of this essay by Peli Grietzer about “why poetry is a variety of mathematical experience,” but it seems to be making the case that LLMs should not be understood as (ersatz) poets but as the idea of poetry itself. “Art, the Romantics said, is our interface with the real patterns and relations that weave up the world of rational thought and perception,” Grietzer writes, and we are left to infer that LLMs, which also are understood as encoding those patterns and relations, are therefore contemporary works of art. And just as poetry is a form of “ineffable” thought that can’t be paraphrased, neural nets, with their billions of parameters, are also ineffable even as they hold together a coherent model of the “lifeworld” that we can probe and prompt but can never grasp in its totality.
Poetry (i.e. LLMs) can be likened to Kant’s a priori schemata — “Poetry reaches into the ineffable that binds the ‘mere words’ of a concept, be it ‘God’ or ‘envy’ or ‘dog’, to its life in thought and feeling,” Grietzer writes. As Kant puts it in Critique of Pure Reason, “This schematism of our understanding with regard to appearances and their mere form is a hidden art in the depths of the human soul, whose true operations we can divine from nature and lay unveiled before our eyes only with difficulty” (A141). It is effectively a black-box algorithm running in the background of our consciousness, but poetry is apparently one way of accessing it, albeit indirectly. It can capture the essence of the schematism process without dragging it out of the “hidden depths.”
Presumably, LLMs can be interpreted similarly: They can hold the rules for why words and concepts are associated in certain patterns without being able to make them explicit, much as poems resonate with meaning for us without being fully decodable or reducible to a finite, specific set of meanings. The output of LLMs shows the magic of schematism at work in our collective language use without oversimplifying it. Instead, it is extrapolated to a level of complexity (billions of interrelations of parameters in a multidimensional space) that is beyond human comprehension. LLMs can do “schematism” in the hidden depths the same way our reason can, without reducing the “art” to a knowable system.
Kant claims that “the power of judgment” — knowing when and how to group certain concepts together — “is a special talent that cannot be taught but only practiced.” In other words, there are no rules for applying schemata, only experience. In a lecture on Critique of Pure Reason, J.M. Bernstein explains:
the understanding contains no rules for the application of rules—no rules for judgment. Why is this the case? Kant’s first answer is that to ask for a rule for the application of a rule — when do I apply the rule ‘dog’ — this is going to be a problem because it opens up an infinite regress. Because if I need a rule for when I apply the rule of dog — then I would need a rule for when I apply the rule of applying the rule of a dog, and we get an infinite regress. So there cannot be a rule for the application of rules.
Thus judgment requires — and this is what he is saying at A 133 — “a peculiar talent which can be practiced only, and cannot be taught. It is the specific quality of so-called mother-wit; and its lack no school can make good.” ... You cannot have an algorithm for everything, you cannot have a rule for everything. There is a notion of ‘mother-wit’ involved. Practice, training, discrimination.
An LLM, however, is purported to be “an algorithm for everything” that doesn’t fall into infinite regress, just an endless process of training that continues to add more parameters. It can compile the rules for applying the rules for applying the rules ad infinitum, until the world runs out of energy.
But why would we want that, or even want to believe that? It seems to hinge on whether you agree with this sentence of Grietzer’s: “In recent years, the field of machine learning has produced exciting mathematical and empirical clues about the patterns that make up human lifeworlds, the mechanics of imaginative grasping, and the resonance between the two.” Exciting! I personally don’t find rendering the world as data to be “exciting,” and tend to see the extraction of patterns from surveillance of human behavior as a mode of domination, a way of turning people’s “imaginative grasping” against them, or a way of imposing normativity.
As I read Grietzer, he seems to be arguing that machine learning, like poetry, establishes that there is “real grounding” for the rules holding together our concepts; both “model the causal-material structure of a lifeworld” and make it accessible as “a vibe.” This access to the schematism that is otherwise black-boxed allows us to believe in the integrity of our lifeworlds:
Perception, Kant says, is imaginative reproduction: to perceive is to rebuild a sequence of sensory inputs through a generative recipe, and so to grasp their unity and structure. A vibe is something like the language of these recipes. Vibe happens when your recipes for a world’s myriad objects or phenomena share computational resources and techniques.
Vibes, from this perspective, are not really structures of feeling but statistical constructs, which again is apparently supposed to be “exciting” for reasons I don’t understand. Grietzer concedes that this is “a tricky juncture” in his argument that “poetic thought must be a mathematical or computational relation between mind and world.” Why must that be so? Because we have the technology to prove it? Grietzer wants to claim that poetry is “empirical” in the sense of exceeding the arguments humans have about what it is, as though this would make poetry more real or something.
“To do right by poetic thought, we need to weave a language for sacred mystery from manifest and scientific threads,” Grietzer writes. “Can we do this through something like a minimal poetic gloss on basically technical ideas?” Behind this plea (which seems like a cryptic way of arguing for accepting LLMs as capturing the essence of poetry) is the claim that “the gaps and guesswork in our relationship to poetry threatens to cut us off from poetry.” I just don’t understand why the opposite isn’t more the case: that efforts to rationalize language use with machines that we can’t understand will further alienate us from poetic practice and make the strictly instrumental use of language the horizon of our thought.
Thanks for reading. The piece isn't about AI technology and its implications at all; it draws on ML theory strictly as a resource for cognitive theory. I'm personally pretty negative about LLMs and their implications, but it's not a topic my work deals with.
I can't make sense of Hinton, so I had GPT4 summarize it, but as with a summary it did of one of your other posts here, GPT4 made no mention of AI, LLMs, etc. It guesses at what an essay might mean, with poor results. It's also a bad poet. Or a good and incredibly fast writer of banal pastiche. It seems to need formal constraints, especially rhyme, but it only does a few traditional verse forms. It's very hard to get it to do blank verse. Prompting it for a frog-shaped poem about frogs, I got this gem:
_ _
/ / \ \
/ / \ \
/ /_ribbit_\ \
(o)- -(o)
\ /
\ / /
\/