
Chris H. Wiggins did a PhD in Theoretical Physics at Princeton and is a professor of applied mathematics at Columbia; for more than a decade he has also served as Chief Data Scientist at the New York Times, where he has helped that publication move successfully into the brave new world of online, data-intensive media. He’s the co-author of How Data Happened (Norton, 2023). He talked with SoRA co-founder D. Graham Burnett about “persuasion at scale” and the mathematics of the “frackosphere.”
DGB: Hi Chris! Thanks for taking the time with us. You are one of the true gurus of the sophisticated mathematics that lies behind the “Attention Economy” — meaning the recommendation algorithms that work to feed people material aligned with their appetites. While a lot of this work can be thought of as the dark arts that maximize “time on device” (at all costs), you have been engaged in trying to help a legacy media company survive the rise of social media. In this sense, we sorta think of you as “on the side of the angels” in a pretty messy fight. Still, the tools you teach in your Columbia course “Persuasion at Scale” can be used for good or ill. Tell us a bit about that class...
CHW: In general, I think we’re at a time when it would be good for people to remember the value of collective, good-faith sensemaking. We’ve tried to reflect that in the class title “persuasion at scale” because the topic — whether or not large-scale persuasion architectures are capable of moving society and impacting democracy — has been a fraught subject of analysis, op-eds, and hand-wringing for a number of years.
I wanted students to know that there is a mathematical field here — that the question “Did something work to persuade people?” is something that can be posed as a math question. Not because that makes it objectively true or inarguable, but because I wanted students to see that the act of making something mathematical necessarily involves subjective design choices and carries inherent limitations.
This is especially true in the field of causal inference, which forms Part I of the course: When we are on the receiving end of online persuasion, how can we go about trying to infer whether or not that persuasion worked?
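Here is a minimal sketch of what posing that question mathematically can look like (a toy illustration, not material from the course; the rates, sample size, and effect below are invented): users are randomly assigned to see or not see a persuasive message, and the difference in outcome rates between the two groups estimates whether the persuasion worked.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical randomized experiment: half the users are shown a persuasive
# message ("treatment"), half are not ("control"). The outcome is whether a
# user later takes the action the message was pushing (1) or not (0).
n = 10_000
treated = rng.integers(0, 2, size=n).astype(bool)    # random assignment
baseline_rate = 0.10                                  # invented base rate
true_lift = 0.02                                      # invented effect of the message
outcome = rng.random(n) < (baseline_rate + true_lift * treated)

# Because assignment was randomized, the difference in means is an unbiased
# estimate of the average effect of the persuasive message.
ate = outcome[treated].mean() - outcome[~treated].mean()

# Rough standard error for a difference of two proportions.
se = np.sqrt(outcome[treated].var() / treated.sum()
             + outcome[~treated].var() / (~treated).sum())

print(f"estimated lift: {ate:.3f} (95% CI ± {1.96 * se:.3f})")
```

The math does not make the answer objective: someone chose the outcome, the time window, and the population, and it is the randomized assignment that licenses reading a difference in means as a causal effect rather than a mere correlation.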
Part II of the class focuses on the other side of the table: What does it look like, methodologically, when companies are actively trying to persuade people? So we examined the long history of online optimization algorithms — systems that interact with the world while also trying to drive toward particular goals. In the parlance of technologists, these algorithms must simultaneously explore and exploit, where “exploit” means optimizing for a specific metric.
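To make the explore/exploit tension concrete, here is a toy epsilon-greedy loop (an illustration with invented item names and click rates, not a description of any real company’s system) in which the metric being exploited is click-through rate.

```python
import random

# Toy explore/exploit loop (epsilon-greedy). The metric being "exploited"
# here is click-through rate; the item names and true rates are invented.
true_click_rates = {"item_a": 0.05, "item_b": 0.11, "item_c": 0.08}
epsilon = 0.1                                   # fraction of traffic spent exploring
shows = {item: 0 for item in true_click_rates}
clicks = {item: 0 for item in true_click_rates}

def observed_ctr(item):
    return clicks[item] / shows[item] if shows[item] else 0.0

def choose_item():
    # Explore: with small probability, pick an item at random.
    if random.random() < epsilon:
        return random.choice(list(true_click_rates))
    # Exploit: otherwise, pick the item with the best observed click rate.
    return max(true_click_rates, key=observed_ctr)

for _ in range(50_000):                         # simulated impressions
    item = choose_item()
    shows[item] += 1
    clicks[item] += random.random() < true_click_rates[item]

for item in true_click_rates:
    print(f"{item}: shown {shows[item]:6d} times, observed CTR {observed_ctr(item):.3f}")
```

Raising epsilon spends more traffic on discovering what might work; lowering it leans harder on the current best guess. That tension, made far more sophisticated and run at enormous scale, is what these online optimization systems manage.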
You mentioned “time on device,” but the success metric could instead be the number of social connections, dollars spent, clicks, or any number of other metrics. These metrics are coded in both senses of the word: coded as computer programs, and coded in the sense understood by social scientists — who must translate complex normative concepts about society or individuals into hard numbers that can be studied statistically.
At the end of the class — though headlines intervened — we taught a bit about large language models and their role in generating persuasive text. We followed a recently published paper on the subject, and then, fortuitously, in the final week of the semester, a controversy broke out over persuasive algorithms deployed on Reddit whose comments were found to be six to seven times more persuasive than human-generated text.
So we ended the class with an arc: from causal inference to machine learning, optimization, and recommendation engines — ultimately landing in a view of the future that is being written in real time, headline by headline, day by day.
DGB: We hear a lot about “algorithms” — they are the heart of what makes our new media ecosystems, from AI to TikTok, so different from the worlds of print and broadcast that came before. What is an algorithm, and what makes algorithms powerful?
CHW: Aha, a “what is” question! I got this quite a bit when I wrote a book about the history and ethics of data with your colleague at Princeton, Matt Jones. People would often start podcast interviews by asking me to define “data.” I can give you a definition — but of course, you can get one easily from the opposite of artificial general intelligence, namely the very un-general and very un-artificial intelligence that is our friend, the dictionary.
In a dictionary, an algorithm is a precise description of a sequence of operations. What I like about that definition is how closely it mirrors the original definition of artificial intelligence, which, as you may know, came from a grant proposal.
(The great American computer scientist John McCarthy is on record as saying that he made up the term in order to get money. So if it feels like people are using the term to try to get money, you should know — it’s always been that way.)
In that 1955 grant proposal, he and his colleagues wrote that the 1956 workshop on artificial intelligence would be about the conjecture that every “feature” of intelligence “can, in principle, be so precisely described that a machine can be made to simulate it.” I think that’s a great lens on what an algorithm is. And it also includes an excellent use of the word simulate, meaning: we don’t really know how we think — we know how we think we think — but ultimately, we simulate it and find behavior that reminds us of ourselves.
Now to the second half of your question: what makes algorithms powerful?
The usual answer is scale, but scale is really a proxy for cheapness. By precisely describing what it is we are doing, we can encode it into a computer — which drives the cost of executing that sequence of steps down to nearly zero relative to when a human does it.
What’s powerful about deriving or crafting a precise description of the steps of a process is that messy and subjective decisions — like which product to recommend to which person, or which word to say next — can be made by an automatic computing machine. Once we make those decisions cheap to execute, we can do them many more times in many more contexts, driving efficiency, as they say in industry (and now, in government).
I’m hesitant to say that algorithms themselves are powerful, in the same spirit in which Kranzberg opined that “technology is neither good nor bad; nor is it neutral.” What’s powerful is the sociotechnical system — in which an algorithm is the technical nugget at the heart — that we are willing to integrate into our lives and processes. The scale is a form of power. So is its statistical performance.

One lesson of the last fifty years is just how much of human behavior can be successfully simulated — as McCarthy hinted in his original, provocative definition of artificial intelligence. And the way we simulate behavior, of course, didn’t unfold the way the early AI founders expected. They thought they would come to understand “how experts think.” Instead, the road to victory was the low road — the road of data: extremely large datasets gathered from human behavior became the raw material for computational optimization algorithms that now predict how we will respond to various stimuli delivered through software: a green button, a red button, a sale price, a new product, or the most clickable meme that human-computer teaming can produce.
DGB: At the Strother School of Radical Attention we work to push back against the intensive, industrial-scale “commodification” of human attention, which we believe is fundamentally at odds with human flourishing. As an elite technician in service to the “attention merchants,” what do you think of our project? Are we after something real? Where are the blind spots in our work, as you see them?
CHW: Certainly, it is real that companies are aware of the ways in which optimizing our digital engagement can be intimately tied to profit — including, as is often discussed, the entire surveillance capitalism economy. And certainly, it is true that the ways people behave when they are optimally producing revenue for an information platform may be at odds with human flourishing, as you say.
So the word fracking is apt, in that it captures both the corporate interest and the way that interest extracts value from something we may not perceive as inherently valuable — for example, the ground itself and the rocks underneath.
I don’t know that I’d go so far as to call it a blind spot, but I will say that one thing the metaphor of fracking doesn’t capture is that we are active participants in this. The ground is infinitely passive and has no agency. Yet we, as users, do. I think there’s something to be learned by considering — with empathy — what motivates our self-fracking. What problem are we solving for ourselves? What user need is the frackee addressing when they self-frack?
Sometimes this is described as the supply and demand dynamic. When we look at the companies that produce information platforms, we are effectively looking at the supply of engaging software on the internet. And it’s easy to speculate about the interests of the supplier.
But on the other side of the phone that we hold in the palm of our hand is our hand itself: our demand, with which we are actively, and with agency, choosing clicky material. And controlling that hand is, physically, our mind and, normatively, our needs, wants, and hopes.
I’m not sure if we are elements of the frackosphere, as you’ve considered it, but I do think that’s an extremely important part of the problem. All of us who zoom out and consider the frackosphere benefit from approaching — with empathy — those who frack themselves and asking: What human need are they searching for? If we think of eudaimonia simply as happiness, is this their bliss? Which of our values are thwarted by this auto-fractive ecosystem? And which of these values, if any, are in fact flourishing?
I say this not to praise or to bury the frackers or the frackees, but to center that there is a clear element of human agency here. And we benefit by investigating: What problem are people solving? And is this a problem that used to be met by other solutions—now deemed inaccessible, or expensive, or inefficient, or simply forgotten?
DGB: You have thought a lot about “data,” which is a big part of the story of “human fracking” over the last twenty years. What should we know about the rise of “big data”?
CHW: My last book was on the history and ethics of data, and part two of that book was really about the rise of big data.
I first heard the phrase big data — if I remember correctly — in 2010, at an event thrown by a local venture capitalist. And certainly, I think the phrase became ascendant among people who saw it as a shorthand for a transformation in business, as a number of companies began to realize they could turn data into value.
In this sense, they were following in the near footsteps of companies like Google, Facebook, or Netflix — companies that were data-capturing and data-optimizing from the very beginning. Almost immediately, these were companies thinking not just about how to produce a product, but how to track the ways people interacted with it.
This was absolutely not the case for other companies that nonetheless had websites but may not have thought of themselves as companies whose central concern was logging every event in order to learn from it how to optimize the product.
One thing to understand about this rise is how the short-term history we can see — because it played out in consumer-facing companies like the ones I mentioned — was itself standing on the shoulders of companies that were not consumer-facing. A key example is AT&T, whose research arm, Bell Labs, had really invented the future. They were already doing work that involved making sense of streams of messy data on computers in a way we would now recognize as data science.
This dates back decades. As early as 1962, the statistician John Tukey wrote a paper on the future of data analysis, which opens by saying that although he had long considered himself a mathematical statistician, the progress of the field had given him “cause to wonder and to doubt” — in that the work they were doing at Bell Labs was so different from the pencil-and-paper methods that had defined data work since the nineteenth century that it required an entirely new way of thinking.
But the real story I think people should know about the rise of big data is how even the style of Bell Labs was born of, and in fact developed hand-in-glove with, work by the intelligence community, rooted in the concerns of the Cold War.
In some sense, digital computation was born to solve a data science problem: namely, making sense of streams of messy, real-world data. This takes us back to Bletchley Park, where the Colossus machine was built and put to work — but then largely forgotten. Not by accident, but intentionally, as part of state secrecy policy by the British government.
The rise of digital computation and data science grew much more aggressively in the United States, as part of what President Eisenhower would later call the military-industrial complex. Companies and the intelligence community, hand in hand, developed digital computation in order to record, gather, and analyze vast datasets — largely about telecommunications, but also about tracking planes and other flying metallic objects that came to be of interest during the Cold War.
Understanding this history helps situate the history of big data and makes the present feel strange: World War II is far enough away that we can see it clearly as a different time, and yet close enough that it still invites us — like all good history does — to rethink our current human condition.
In this case: Who benefited from these new capabilities? What did they replace? Who set the governance of these capabilities — the terms of service, the policies, and regulations?
And of course, as with all novel technologies and their introduction: how did we come to integrate these systems into our norms? How were those norms, in turn, shaped by societal values, by regulation, and by state and commercial investment?
These latter dynamics are still very much with us today. And taking a historical view is one way to invite a fresh look at what is, what could be, and what we want.
DGB: Chris Wiggins! Thank you so much for taking the time with us!
There is something quietly terrifying in how casually we now speak of “persuasion at scale.” As if the phrase were describing a public works project or a new subway line. But it is not concrete that is being poured. It is suggestion, engineered, optimized, coded, and deployed. It is the soul, not the sidewalk, being reshaped.
This interview circles around a central, unspoken question: when does persuasion become preemption? When does the attempt to understand and influence human behavior become the replacement of human freedom with something else… something colder, cleaner, more efficient?
It is tempting, of course, to lay all blame at the feet of the platforms. They are easy villains. But what this conversation refuses to let us forget is that there is no puppeteer without a willing hand. That the fracking metaphor cuts both ways: we are not only being drilled, we are drilling ourselves. In pursuit of connection, or validation, or some small oasis of meaning.
And so the algorithm becomes the mirror. Not merely reflecting who we are, but what we have come to need. The tragedy is not that it simulates us, but that it does so with increasing fidelity, until we begin to mistake the reflection for the real.
Somewhere beneath the metrics and models, I wonder whether we have forgotten that the heart cannot be optimized. That meaning does not scale. That truth resists simulation.
And yet, there are still voices… quiet, mathematical, unillusioned… that remind us that none of this is inevitable. That design is still a choice. That even within systems built for persuasion, it is still possible to teach. To ask better questions. To remember what it means to be human.