Embracing Platform Transparency in a Digital World to Strengthen Democracy

We live in a world that is increasingly governed by online interactions. We encounter voluminous amounts of news online; for many of us, that is where we get the majority (or even all) of our news. We interact with the political world online, from signing petitions to uploading videos to coordinating offline action. We interact with other human beings online, including friends, family, and coworkers – as well as many total strangers. More recently, we have begun interacting with AI agents online in ways that are coming closer and closer to the way we interact with humans.

The growth of the digital world presents all sorts of challenges to and, yes, opportunities for supporting democracy. Lest we forget, social media was originally hailed as “Liberation Technology” that would prevent autocrats from controlling the flow of information and allow would be democratic reformers to find each other and organize. And examples abound of democratically inclined citizens doing just that, perhaps never quite so clearly as during the Revolution of Dignity in Ukraine. Within democracies, it would facilitate the organization of grass roots political activity and political coordination that could cut across geographic boundaries.

Of course, though, the rise of the digital world has also presented a myriad of challenges to democracy. Fundamental to normative justifications for democracy is the idea of an informed citizenry making choices about the political policies, parties, and leaders for which they will cast their votes. However, the rise of the digital world has fundamentally undermined the traditional gatekeeping role played by trained journalists and editors in the distribution of politically relevant news and information. The digital revolution has made it even easier for foreign malign actors to insert content into another country’s information ecosystem. In what has attracted perhaps the most public attention, the digital world has enabled the spread of misinformation and disinformation. And the rise of AI only threatens to turbo charge all of these challenges.

But beyond the challenges that the rise of the digital information environment presents to democratic citizens, we also need to acknowledge the challenges presented to policy makers in democratic polities, who need to legislate in this era. Regulation of the digital environment is always going to be on the table to protect the quality of democracy, but legislators’ responsibilities go far beyond just this kind of higher-level goal, as they are increasingly being called upon to weigh in on wide-ranging set of digital-adjacent policy issues, such as data centers and the energy that supplies them, age verification laws for accessing online content and apps, data portability, and phones in schools.

Policy makers need to be informed about what is happening on these now crucially important digital platforms if they are to legislate effectively. Unfortunately, the digital world in general, and social media in particular, is incredibly susceptible to what I would call “inference by anecdote”. Precisely because there is so much digital content in circulation and so much of it is optimized for search, it is possible to find examples of just about anything. Moreover, as we spend more and more of our lives online, we are likely to encounter an ever-wider range of “anecdotal experiences” – that one time we were scrolling somewhere and saw something, and we remember that. But good policy is not made on the basis of anecdotes – it requires systematic evidence of trends over time, of relative prevalence of information, and of causal relationships. In short, it requires rigorous scientific research and evidence, which in the digital era means access to data and platforms for independent and external researchers that do not work for those same digital platforms. Otherwise, we are limited to evidence provided by the platforms themselves – not an optimal solution for many obvious reasons – or to inferring from anecdotal evidence.

Another unfortunate characteristic of the digital era is that despite all the accomplishments of myriad tech startups, we continue to see absolute behemoths arising in industry after industry. Netflix. Amazon. Meta. Google. And now OpenAI and Anthropic. Understanding what is happening online, therefore, often boils down to understanding what is happening on a handful of enormous platforms.

Six years ago, Nate Persily and I wrote in the concluding chapter of an edited volume on Social Media and Democracy that there were three options for conducting rigorous research about what was happening on these kinds of platforms: (1) collect data without the cooperation of the platforms (essentially scraping or installing browser extensions with users’ consent that track their browser visits to platforms); (2) collect data through various forms of cooperation with platforms (using platform supplied APIs, soliciting data donations from platform users that rely on platform provided mechanisms for data-takeout, working with data released by the platforms, or even actual working collaborations with researchers at platforms); or (3) lobby governments to regulate data access. We argued that researchers should do all three simultaneously, as the stakes were simply too high to abandon our quest to understand what was happening on these platforms at scale and their impact on society.

Over the course of the past decade, I have been part of research teams that have tried all of these different approaches, and many of them have been quite fruitful. But at the end of the day, both options 1 and 2 mean that your ability to conduct research is always at the whims of the platforms. The scrapers you’ve built can stop working when the way platforms share data changes. API access can be raised to prohibitively expensive levels. Data sets can be provided to outside researchers, but they can also not be. And research collaborations are, of course, at the discretion of the platforms as to whether or not they choose to enter into such arrangements.

Therefore, my contribution to the 100 Ideas series is that for the sake of democracy, we, the people, need to reclaim this decision making about whether or not platforms should make their data available for public-facing research that can inform the public, the press, and policy makers. This should not be a decision left to a handful of companies; it should be a component of doing business in democratic polities.

There are, of course, lots of details to be worked out here, and lots of potential legislation to consider. But what I’m proposing is a fundamental change of mindset here: democracy in the digital era requires digital transparency.

When Persily and I published that volume on Social Media and Democracy, it was before the emergence of chatbots like ChatGPT and Claude. As more and more people migrate to interacting with these types of AI chatbots online, the need for transparency only grows. Scraping publicly available data is no longer going to be able to get us to an understanding of how people are actually using these chatbots – and what the chatbots are telling them. The AI companies have this information, and in the future it is going to be even more crucial that we establish a principle of digital transparency that is rooted in external data access.

So for the importance of democracy and good governance, we need a renewed commitment to transparency and accountability on digital platforms – in other words, data access.