If you don’t trust social media, you should know you’re not alone. Most people surveyed around the world feel the same—in fact, they’ve been saying so for a decade. There is clearly a problem with misinformation and hazardous speech on platforms such as Facebook and X. And before the end of its term this year, the Supreme Court may redefine how that problem is treated.
Over the past few weeks, the Court has heard arguments in three cases that deal with controlling political speech and misinformation online. In the first two, heard last month, lawmakers in Texas and Florida claim that platforms such as Facebook are selectively removing political content that their moderators deem harmful or otherwise against their terms of service; tech companies have argued that they have the right to curate what their users see. Meanwhile, some policy makers believe that content moderation hasn't gone far enough, and that misinformation still flows too easily through social networks; whether (and how) government officials can directly communicate with tech platforms about removing such content is at issue in the third case, which was put before the Court this week.
We’re Harvard economists who study social media and platform design. (One of us, Scott Duke Kominers, is also a research partner at the crypto arm of a16z, a venture-capital firm with investments in social platforms, and an adviser to Quora.) Our research offers a perhaps counterintuitive solution to disagreements about moderation: Platforms should give up on trying to prevent the spread of information that is simply false, and focus instead on preventing the spread of information that can be used to cause harm. These are related issues, but they’re not the same.
As the presidential election approaches, tech platforms are gearing up for a deluge of misinformation. Civil-society organizations say that platforms need a better plan to combat election misinformation, which some academics expect to reach new heights this year. Platforms say they have plans for keeping sites secure, yet despite the resources devoted to content moderation, fact-checking, and the like, it’s hard to escape the feeling that the tech titans are losing the fight.
Here is the issue: Platforms have the power to block, flag, or mute content that they judge to be false. But blocking or flagging something as false doesn’t necessarily stop users from believing it. Indeed, because many of the most pernicious lies are believed by those inclined to distrust the “establishment,” blocking or flagging false claims can even make things worse.
On December 19, 2020, then-President Donald Trump posted a now-infamous message about election fraud, telling readers to “be there,” in Washington, D.C., on January 6. If you visit that post on Facebook today, you’ll see a sober annotation from the platform itself that “the US has laws, procedures, and established institutions to ensure the integrity of our elections.” That disclaimer is sourced from the Bipartisan Policy Center. But does anyone seriously believe that the people storming the Capitol on January 6, and the many others who cheered them on, would be convinced that Joe Biden won just because the Bipartisan Policy Center told Facebook that everything was okay?
Our research shows that this problem is intrinsic: Unless a platform’s users trust the platform’s motivations and its process, any action by the platform can look like evidence of something it is not. To reach this conclusion, we built a mathematical model. In the model, one user (a “sender”) tries to make a claim to another user (a “receiver”). The claim might be true or false, harmful or not. Between the two users is a platform—or maybe an algorithm acting on its behalf—that can block the sender’s content if it wants to.
We wanted to find out when blocking content can improve outcomes, without a risk of making them worse. Our model, like all models, is an abstraction—and thus imperfectly captures the complexity of actual interactions. But because we wanted to consider all possible policies, not just those that have been tried in practice, our question couldn’t be answered by data alone. So we instead approached it using mathematical logic, treating the model as a kind of wind tunnel to test the effectiveness of different policies.
Our analysis shows that if users trust the platform to both know what’s right and do what’s right (and the platform truly does know what’s true and what isn’t), then the platform can successfully eliminate misinformation. The logic is simple: If users believe the platform is benevolent and all-knowing, then if something is blocked or flagged, it must be false, and if it is let through, it must be true.
You can see the problem, though: Many users don't trust Big Tech platforms, as the surveys mentioned earlier demonstrate. When users don't trust a platform, even well-meaning attempts to make things better can make things worse. And when the platforms seem to be taking sides, that can add fuel to the very fire they are trying to put out.
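To make that intuition concrete, here is a simplified numerical sketch of how a receiver might update her beliefs after seeing a post blocked. It is not our full model; the 50-50 prior, the trust weights, and the "biased platform" blocking rates are illustrative assumptions, chosen only to show how the same blocking decision reads differently depending on how much the receiver trusts the platform.

```python
# A toy Bayesian sketch of the trust problem described above (illustrative
# assumptions only, not our full model). The receiver thinks the platform is
# either "benevolent and informed" (it blocks a claim only if the claim is
# false) or "biased" (it mostly blocks claims it dislikes, true or not), and
# weighs those possibilities by how much she trusts the platform.

def belief_after_blocking(prior_true, trust,
                          p_block_true_if_biased=0.8,
                          p_block_false_if_biased=0.2):
    """Receiver's posterior probability that a blocked claim was actually true."""
    # Chance the claim gets blocked, depending on whether it is true or false.
    # A benevolent, informed platform never blocks true claims and always blocks false ones.
    p_block_if_true = trust * 0.0 + (1 - trust) * p_block_true_if_biased
    p_block_if_false = trust * 1.0 + (1 - trust) * p_block_false_if_biased

    # Bayes' rule: P(claim is true | claim was blocked).
    numerator = prior_true * p_block_if_true
    denominator = numerator + (1 - prior_true) * p_block_if_false
    return numerator / denominator

if __name__ == "__main__":
    for trust in (1.0, 0.5, 0.0):
        posterior = belief_after_blocking(prior_true=0.5, trust=trust)
        print(f"trust in platform = {trust:.1f} -> "
              f"belief that the blocked claim was true = {posterior:.2f}")
    # With full trust, blocking is conclusive evidence the claim was false (0.00).
    # With no trust, the very same action *raises* the receiver's belief that the
    # claim was true (0.80 here), which is how moderation can backfire.
```

In this sketch, a fully trusting receiver reads blocking as proof that the claim was false, while a fully distrustful one reads the same action as a sign the claim was true and suppressed.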
Does this mean that content moderation is always counterproductive? Far from it. Our analysis also shows that moderation can be very effective when it blocks information that can be used to do something harmful.
Going back to Trump’s December 2020 post about election fraud, imagine that, instead of alerting users to the sober conclusions of the Bipartisan Policy Center, the platform had simply made it much harder for Trump to communicate the date (January 6) and place (Washington, D.C.) for supporters to gather. Blocking that information wouldn’t have prevented users from believing that the election was stolen—to the contrary, it might have fed claims that tech-sector elites were trying to influence the outcome. Nevertheless, making it harder to coordinate where and when to go might have helped slow the momentum of the eventual insurrection, thus limiting the post’s real-world harms.
Unlike removing misinformation per se, removing information that enables harm can work even if users don’t trust the platform’s motives at all. When it is the information itself that enables the harm, blocking that information blocks the harm as well. A similar logic extends to other kinds of harmful content, such as doxxing and hate speech. There, the content itself—not the beliefs it encourages—is the root of the harm, and platforms do indeed successfully moderate these types of content.
Do we want tech companies to decide what is and is not harmful? Maybe not; the challenges and downsides are clear. But platforms already routinely make judgments about harm: Is a post calling for a gathering at a particular place and time that includes the word "violent" an incitement to violence, or an announcement of an outdoor concert? Clearly the latter if you're planning to see the Violent Femmes. Often context and language make these judgments apparent enough that an algorithm can determine them. When that doesn't happen, platforms can rely on internal experts or even independent bodies, such as Meta's Oversight Board, which handles tricky cases related to the company's content policies.
And if platforms accept our reasoning, they can divert resources from the misguided task of deciding what is true toward the still hard, but more pragmatic, task of determining what enables harm. Even though misinformation is a huge problem, it’s not one that platforms can solve. Platforms can help keep us safer by focusing on what content moderation can do, and giving up on what it can’t.