Why It’s So Hard to Count Twitter Bots

by techlowdown · 18 May 2022

Automated accounts have become more sophisticated and complex in recent years. Many fake accounts are operated partly by humans and partly by machines, or simply amplify messages written by real people (what Menczer calls “cyborg accounts”). Other accounts use tricks designed to evade human and algorithmic detection, such as rapidly liking and unliking tweets or posting and deleting tweets. And of course there are plenty of automated or semi-automated accounts, such as those run by many companies, that aren’t actually harmful.

The Botometer algorithm uses machine learning to assess a wide range of public data tied to an account—not just the content of tweets, but when messages are sent, who follows an account, and so on—to determine the likelihood of it being a bot. Although the algorithm is state of the art, Menczer says, “a lot of accounts now fall to the range where the algorithm is basically not very sure.”
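To make the idea of a likelihood score with an uncertain middle range concrete, here is a minimal sketch in Python. It is not Botometer's actual model: the three features, the weights, the bias, and the 0.3 and 0.7 cutoffs are all hypothetical values chosen for illustration, and a real system would learn its parameters from a large set of labeled accounts and use far more signals.

```python
import math
from dataclasses import dataclass


@dataclass
class AccountFeatures:
    """Hypothetical public signals for one account (illustrative only)."""
    tweets_per_day: float
    fraction_deleted: float         # share of posts deleted soon after posting, 0..1
    followers_per_following: float  # follower count divided by following count


# Hypothetical hand-picked parameters; a real classifier would learn these.
BIAS = -3.0
W_TWEETS = 0.8
W_DELETED = 3.0
W_RATIO = -0.5


def bot_likelihood(f: AccountFeatures) -> float:
    """Return a score between 0 and 1 from a simple logistic model."""
    z = (
        BIAS
        + W_TWEETS * math.log1p(f.tweets_per_day)
        + W_DELETED * f.fraction_deleted
        + W_RATIO * math.log1p(f.followers_per_following)
    )
    return 1.0 / (1.0 + math.exp(-z))


def label(score: float, low: float = 0.3, high: float = 0.7) -> str:
    """Map a score to a label, keeping an explicit 'uncertain' band."""
    if score < low:
        return "likely human"
    if score > high:
        return "likely bot"
    return "uncertain"


if __name__ == "__main__":
    examples = {
        "casual user": AccountFeatures(3, 0.0, 1.0),
        "high-volume poster": AccountFeatures(40, 0.1, 0.5),
        "post-and-delete spammer": AccountFeatures(400, 0.8, 0.01),
    }
    for name, features in examples.items():
        score = bot_likelihood(features)
        print(f"{name}: {score:.2f} ({label(score)})")
```

The middle band is the point of the sketch: an account that posts often but otherwise looks ordinary lands near 0.5, which is the kind of case Menczer describes as one where the algorithm is not very sure.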

Menczer and others say that spotting bots is a game of cat and mouse. But they add that it may become significantly more challenging in the future as spammers use algorithms that are better able to generate convincing text and hold coherent conversations.

Twitter itself is better equipped to spot bots using machine learning because it has access to a lot more data about each account. This includes a user’s full history of activity, as well as the different IP addresses and devices they use. But Delip Rao, a machine learning expert who worked on spam detection at Twitter from 2011 to 2013, says the company may not be able to reveal how this works because doing so could disclose personal data or information that could be used to manipulate the platform’s recommendation system.

This week, Musk also got into a spat with Parag Agrawal, Twitter’s CEO, over how easily the company could disclose its methodology for finding bots. On Monday, Agrawal posted a thread explaining how complex the challenge still is. He noted that the private data Twitter holds may change calculations around the number of bots on the service. “FirstnameBunchOfNumbers with no profile pic and odd tweets might seem like a bot or spam to you, but behind the scenes we often see multiple indicators that it’s a real person,” he wrote in the thread. Agrawal also said that Twitter could not disclose details of these assessments.

If Twitter is unable, or unwilling, to reveal its methodology and Musk says he won’t proceed without details, the deal may remain in limbo. Of course, Musk could be using the issue as leverage to negotiate the price down.

For now, Musk seems dissatisfied with Twitter’s efforts to explain why finding bots is not as easy as he thinks. He responded to Agrawal’s long thread on Monday with a simple message that seemed far more fitting for a bot than a prospective buyer of Twitter: a single, smiling poop emoji.