Some observations from tracking Twitter

In early 2022 I realized that being a good researcher isn’t just about doing good research—it’s also about communicating ideas well and persuading others to engage with them. In other words, research is a highly social activity. In AI research, the social component largely revolves around Twitter, which distributes ideas in many different ways—people discuss research papers, learn about job opportunities, and meet new collaborators. I’ve even heard of some people meeting their romantic partners through Twitter. 

So I decided to start using Twitter to engage more actively with the AI community. At that time I had just read Peak: Secrets from the New Science of Expertise and was really into its ideas, so I made a somewhat scientific attempt at tracking my “performance” on Twitter in a spreadsheet.

As primary metrics to track, I decided on (1) the number of new followers gained per tweet (tracked manually by logging my follower count before and 24 hours after a given tweet), and (2) the number of likes that each tweet got. I didn’t think there was much additional signal in tracking the number of retweets or impressions. Also, I only tracked “major” tweets that I put some thought into, and didn’t bother tracking replies or quote tweets that I wrote spontaneously.
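To make the first metric concrete, here is a minimal sketch of what one logged row might look like. The class and field names are hypothetical (the spreadsheet's actual columns aren't shown here), and the example row is made up.

```python
from dataclasses import dataclass


@dataclass
class TweetLog:
    """One row of a hypothetical tracking spreadsheet."""

    description: str
    followers_before: int      # follower count just before tweeting
    followers_after_24h: int   # follower count 24 hours after tweeting
    likes: int

    @property
    def new_followers(self) -> int:
        # Metric (1): followers gained in the 24 hours after the tweet.
        return self.followers_after_24h - self.followers_before


# Made-up example row, not real data.
row = TweetLog("example tweet", followers_before=1000, followers_after_24h=1040, likes=120)
print(row.new_followers)
```

Note that this difference attributes every follower gained in the 24-hour window to the tweet, which is the same simplification the manual logging makes.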

Below is the analysis of the data and some lessons I learned from this exercise, which I hope could be useful to people who are just starting on Twitter or thinking of being more active. Note that my Twitter journey is just one example (there are many others). And at the end of all this, as I’ll describe in the last section, I think I ended up falling for some form of Goodhart’s law, and that this type of optimization is the wrong way of approaching Twitter.

Followers over time

As an overview, I started with 378 followers on Feb 20, 2022, and ended with 23.6k followers on May 7, 2023. I logged 83 tweets during that period, each represented by a dot in the graph below.

What fraction of followers comes from a few popular tweets?

An initial question I had as a new Twitter user was: what fraction of someone’s Twitter followers comes from a few popular tweets versus tweeting steadily over time? For me, it was this:

  • Over time, 12.5k new followers (out of 23k followers) were gained within 24 hours of me making a tweet.

  • Out of those 12.5k new followers, 8.1k were gained from my top-ten tweets.

  • My two highest-follower-yield tweets, about joining OpenAI and converting to Research Scientist at Google, led to 3.5k and 1.2k new followers, respectively.

So about one-third of my follower count (8.1k of 23.6k) came from my ten most popular tweets, and the rest came from other tweets and from people just deciding to follow me.
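As a rough check on the arithmetic above, the shares can be recomputed from the rounded figures quoted in this section. This is only a sketch: the variable names are my own, and the inputs are the rounded numbers from the text rather than the raw spreadsheet values.

```python
# Rounded figures quoted in the text, in thousands of followers.
total_followers = 23.6     # follower count on May 7, 2023
gained_within_24h = 12.5   # gained within 24 hours of a logged tweet
gained_top_ten = 8.1       # gained from the top-ten tweets

# Top-ten tweets as a share of the final follower count.
share_top_ten_of_total = gained_top_ten / total_followers
# Top-ten tweets as a share of all tweet-attributed gains.
share_top_ten_of_tweet_gains = gained_top_ten / gained_within_24h
# All tweet-attributed gains as a share of the final follower count.
share_within_24h_of_total = gained_within_24h / total_followers

print(f"Top-ten tweets / all followers:        {share_top_ten_of_total:.0%}")
print(f"Top-ten tweets / tweet-driven gains:   {share_top_ten_of_tweet_gains:.0%}")
print(f"Tweet-driven gains / all followers:    {share_within_24h_of_total:.0%}")
```

The first ratio is the "about one-third" figure; the second shows that the top-ten tweets account for roughly two-thirds of the followers gained within 24 hours of a tweet.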

Is it possible to forecast how well tweets will do?

Another natural question I had was whether it is possible to tell how well a tweet will do before posting it, since the tweets that are the most thoughtful often don’t get the most attention [1, 2]. So before each tweet, I tried to forecast how many likes and new followers I would get. Although I wasn’t very accurate, there was a slight correlation between my forecasts and the actual numbers. There was also probably some psychological effect that made my forecasting suboptimal (e.g., I didn’t want to forecast too highly and set myself up for disappointment).
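One simple way to quantify a relationship like this is the Pearson correlation between forecast and actual values. The sketch below is an illustration under assumptions, not the analysis behind the numbers in this post: the function name is my own, and the example values are made-up placeholders rather than data from the spreadsheet.

```python
import math


def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical placeholder values, NOT the numbers from the spreadsheet.
forecast_likes = [50, 100, 200, 80, 150]
actual_likes = [30, 400, 120, 90, 60]

r = pearson_correlation(forecast_likes, actual_likes)
print(f"Forecast vs. actual likes: r = {r:.2f}")
```

Since a few tweets get far more likes than the rest, a single outlier can dominate this coefficient, so it is worth reading alongside a scatter plot rather than on its own.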

Are there compounding effects from building a follower base?

Another effect I was curious about was whether it would get easier to get likes or new followers after having more followers. My data seems to indicate that there is some compounding effect, but it is small and definitely doesn’t scale linearly (e.g., I didn’t get 4x as many likes per tweet when I had 4x as many followers). (Note that I also excluded the few tweets with >1k likes or new followers to make these plots more readable.)

One thing that I did notice, however, is that after I had some base of followers, I would get more new followers on days when I wasn’t tweeting. The table below shows that from 16.1k to 23.5k followers, I gained 61% of my new followers on days when I wasn’t tweeting, compared to 41% for my first 6.9k followers.

Follower range  | Total follower gain | Total followers gained from tweets | Followers gained without a recent tweet
378 to 6.9k     | 6.6k                | 3.9k (59%)                         | 2.7k (41%)
16.1k to 23.5k  | 7.4k                | 2.9k (39%)                         | 4.5k (61%)
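The percentages in the table follow directly from the two gain columns. A minimal sketch, using the rounded values from the table (the helper name `split_shares` is my own):

```python
# (follower range, gained from tweets, gained without a recent tweet),
# in thousands, copied from the rounded values in the table above.
rows = [
    ("378 to 6.9k", 3.9, 2.7),
    ("16.1k to 23.5k", 2.9, 4.5),
]


def split_shares(from_tweets, without_tweet):
    """Return each column's share of the total gain for one follower range."""
    total = from_tweets + without_tweet
    return from_tweets / total, without_tweet / total


for label, from_tweets, without_tweet in rows:
    tweet_share, passive_share = split_shares(from_tweets, without_tweet)
    print(f"{label}: {tweet_share:.0%} from tweets, "
          f"{passive_share:.0%} without a recent tweet")
```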

Likes versus new followers as the optimization metric

A final point that I noticed is that even though the number of likes per tweet is correlated with new followers per tweet, they aren’t the same. Personal posts had a high ratio of new followers per like. On the other hand, posts that were non-personal, such as memes or jokes, could get a lot of likes but didn’t yield a lot of new followers for me.

Tweet                     | New followers | Likes | New followers per like
Joining OpenAI            | 3.5k          | 3.3k  | 1.06
Research scientist        | 1.3k          | 1.7k  | 0.76
2023 new years resolution | 412           | 631   | 0.65
Chinese proverbs in AI    | 289           | 2.2k  | 0.13
Mountain view joke        | 64            | 904   | 0.07
Prompting meme            | 85            | 1.2k  | 0.07
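The last column is just new followers divided by likes. The sketch below recomputes it from the rounded values in the table and ranks the tweets by that ratio; the function name is my own, and small differences from the raw spreadsheet are possible because of rounding.

```python
# (tweet, new followers, likes), using the rounded values from the table above.
tweets = [
    ("Joining OpenAI", 3500, 3300),
    ("Research scientist", 1300, 1700),
    ("2023 new years resolution", 412, 631),
    ("Chinese proverbs in AI", 289, 2200),
    ("Mountain view joke", 64, 904),
    ("Prompting meme", 85, 1200),
]


def followers_per_like(new_followers, likes):
    """New followers gained per like received."""
    return new_followers / likes


# Rank tweets from highest to lowest follower yield per like.
ranked = sorted(tweets, key=lambda t: followers_per_like(t[1], t[2]), reverse=True)
for name, new_followers, likes in ranked:
    print(f"{name}: {followers_per_like(new_followers, likes):.2f}")
```

Sorting by this ratio puts the personal announcements at the top and the jokes and memes at the bottom, which is the pattern described above.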

Reflection

Overall, this analysis was an interesting exercise for me to learn the ropes of Twitter, and maybe this data is somewhat helpful to new joiners out there. One big caveat, however, is that there are a lot of variables that aren’t accounted for, and these patterns will probably vary substantially from person to person.

For me though, the takeaway at the end of all this is that optimizing for Twitter followers and likes is probably the wrong approach. The main reason is that doing it this way added another “optimization term” to my brain, which my brain started to backprop on and adjust its weights to do well on. For example, I noticed that I was a little “too in it”: there were times when my “free moments” were filled with thoughts on how to write funny jokes or clever tweets that would get a lot of likes. Maybe this would be OK in a perfectly aligned world where the most creative, thoughtful, information-dense tweets get the most likes, but I don’t think we have that perfect alignment yet. So moving forward, I’ve decided to stop tracking tweets, look less at likes and new followers, and focus on making tweets that are insightful and a positive contribution to the world :)
