What do we do with the Twitter-shaped hole in the internet?

Reading Time: 37 minutes

By the way, you can listen to me read this post aloud on my Patreon! It’s long, so there are three parts to the reading, but I made the first part free to listen.

I joined Twitter in college, over a decade ago. The microblogging site promised a lightweight alternative to Facebook with the added benefit of asymmetric connections: you could follow people who did not follow you back. At the time, Twitter had not yet become an internet behemoth. It just gave me a way to kibitz asynchronously with friends and colleagues before I crystallized anything into a blog post. Outside of tech conferences—and especially during pandemic lockdown, when we had no conferences at all—Twitter became the hallway track.

A silhouetted bird with beautiful, colorful wing feathers. Image by Victor Tyakht.

At the time, credibility in my field came from one of three places. One: being a well-off guy who got super lucky and started a unicorn. Two: knowing someone already and getting boosted into a sweet position by a friend. Or three: credibility-by-pedigree through Stanford or MIT or Microsoft or somesuch. Names like those still often determine whether people peg themselves and others as “probably worth listening to,” despite their demonstrably high false positive rate.

Social media made another way, though. And that’s how tech ended up letting me—a babyface, tatted-up, any-pronoun femme-dyke with no fancy tech school and no FAANG pedigree—work at Pivotal Labs or Mozilla or teach at U Chicago or O’Reilly or write code for NASA projects. The internet gave me a microphone; I said things into that microphone that convinced people not to throw out my unimpressive1 resume.

To date, over half of the people who read my blog found it on Twitter. I’d guess another 35 percent found it through someone who found it on Twitter. The bird site, somewhere up the chain, delivered my writing to many of the social media hermits who deign to read it, for the same reason that Miranda Priestly’s colleagues chose Andy Sachs’ lumpy cerulean sweater in The Devil Wears Prada.

Twitter probably helped this piece find you, dear reader, whether you realize it or not. And I couldn’t be more thrilled that you’re here, so I owe Twitter a sea of gratitude.

Nevertheless, I have never pretended to stan the site. I’ve openly accused it of causing me gray hairs since I started writing about my machine learning work on there in 2015. I’ve also taken issue several times with specific bar fights that I witnessed there. Remember CalendlyGate? And MortgageGuyGate? If you don’t feel like reading me cuss for 28 minutes, allow me to offer you a couple excerpts from those posts to give you a general idea. This first one came out in April of last year:

This blog post isn’t about Calendly.

It’s about an argument that broke out on Twitter in January of 2022 about Calendly. Specifically, it’s about the way people treated each other in that argument.

Feel free to go have a look yourself; I’m not going screenshot hunting because I never want to see a single word of it ever again. The conversation somehow very quickly devolved into a lot of adults who have Extremely Online Brands about empathy being absolute shitheads to each other. 

Not lemmings. Not children. Adults who are semi-famous on the internet for being empathetic people. I watched people with executive positions and thousands of followers post tiny brain memes about the people they didn’t agree with. To wit:

So clearly I already had a bee in my bonnet. Then, about a year later:

When I wrote the words “We desperately need marginalized voices to build for themselves if we want to inject vision into our languishing internet of personalized ads decorating stalkery stadiums hosting free-for-all rhetorical cage matches,” I was thinking of Twitter. And I wrote that post two years ago, long before Melon Husk took the helm and crashed the birdplane2 for good.

But let’s talk about that plane crash now.

I cannot articulate how utterly emasculating3 it felt to see my twelve at-times-obsessive years of work—my carefully crafted reputation, my public stock—tanked in two weeks by one billionaire blowhard. Twelve years. Two weeks. And I am, let’s face it, pretty high up the privilege scale. How much of most people’s lives could this man just obliterate without realizing it or caring at all, like a child crushing an ant while sprinting down the soccer field? No elite education prepares people for that kind of power disparity. But I digress.

As my Twitter feed devolved into a game of Pong Except Scalding AI Takes…

as my DMs melted into a cesspool of unsolicited feet pic requests…

as ░M░Y░P░ U░S░S░ Y░I░ N░B░I░O ░ skyrocketed to the ranks of de-facto footer for any tweet on any topic…

the broader question emerged:

“Where do we go now?”

That the question emerged on Twitter itself felt fitting, even as Melon (it’s rumored) attempted to sandbag tweets that mentioned the URLs of the popular alternatives.

Let’s talk about those alternatives now.4

Answer One: Private communities on Slack or Discord

I retreated to many of these. How it works: I receive an invitation to a curated group with specific members, with channels in it to organize the specific conversations. It’s like the internet forums of yore, but in walled gardens: even the ‘public’ ones need a bypass UI that people use to get an invitation.

These work great for specific interest groups or social communities. They facilitate conversations on neatly labeled topics in their own channels.

On Slack, you can make connections with people by starting direct messages with those people inside an instance, but outside of channels. The “inside an instance” thing has bitten Slack in the ass twice: first, they took the wildly criticized executive brown-noser step of making the text of direct messages obtainable by instance owners5 (they at least did make obtaining it extremely annoying, but I think that was a technical limitation rather than an ethical boundary). Second, they introduced cross-instance messaging, which somehow got to production without, one must assume, even a single Extremely Online Femme or Trans Person clapping eyes on it, because the feature (especially in its first iteration) could not have catered more perfectly to online harassment campaigns.

Discord—either by having different values or by watching Slack and vowing not to step on the same rakes—uses a different data model, and it seems to work better. Users who meet inside a server (equivalent, for this conversation, to a Slack instance) or otherwise trade handles can, with mutual affirmative consent, establish direct messages outside of any server, such that no server proprietors can get ’em.

The thing is, these sorts of communities serve a different purpose than Twitter did in two important ways.

First, they organize around topics, not people. You can’t go to someone’s profile and see what they said across multiple topics. You can’t even see a collation of what you said. It’s easy to lose conversations among the channels, and easy to accidentally braid together two conversations in one channel. Threading does help with this, but you can’t thread within a thread, so if the braiding happens there, you’re stuck with it. Conversation topic maps often look more like trees than neat buckets. You can’t represent that in these communities. A catch-all #general-chat or #water-cooler channel doesn’t fix it.

Second, these communities organize in-groups. They’re by invitation. This is one of their most useful and most critical functions: sometimes, you wanna be in a room with your people. The Midjourney Discord server caters to people who are excited about AI image generation; it’s not a place, generally, that skeptics go to be actively skeptical. My friends’ Discord server is for my friends, and that’s where I go when I have a hot take that I need to load test with people who love and care about me before I get it horribly wrong in front of The Internet. There are Discord servers where black and indigenous people of color (BIPOC) get to chat without getting interrupted constantly by white fragility, and servers for disabled people to converse devoid of the responsibility of playing Customer Service Hotline for ableds.

These spaces are essential. They also, by design, don’t serve a critical inclusion need that Twitter filled.

In 2022, my mom got cancer. A few months later, five weeks into her chemo treatment, she was admitted to the ICU with sepsis. The miracle that she left the ICU alive is another story entirely. But when she did, she was exceptionally weak. She needed months of rehab to get ready for surgery. At the time, it was not clear whether my mother would ever regain the ability to travel, or work, or even walk. As such, she could barely muster the motivation to eat, or read, or watch movies—things she still could do that she liked. The idea of living a disabled life depressed her too much. As I watched her struggle with this, I felt compelled to root out the idea I had inherited that a disabled life might not be worth living. After all, disability is the one marginalized group that every single one of us, barring catastrophe, will belong to one day. But in order to address my internalized ableism, I needed to listen to disability activists while shutting the fuck up.

Twitter let me follow people whose ideas I needed to hear. No waiting for the New York Times or The Atlantic or some other self-styled publisher kingmaker to decide those ideas were worth hearing. When Twitter broke the tripoly of rich dude magic, in-group bias, and legitimacy-by-proxy, I got access to the words and lessons of the people I should have learned from all along. The internet let people talk, but those voices still needed an index. Google’s search engine, for reasons beyond the scope of this post, didn’t provide it. Twitter did.

A screenshot of one of my tweets from 2021 says "I joke about curating my follow list on this bird site. 

But FOR REAL, this app has allowed me to compose a news feed of takes that are pithy, skeptical, relatively researched, and inclusive of marginalized perspectives.

No agglomeration of paid media is giving me that."

Slack and Discord ain’t built for that, either.

Answer Two: Mastodon

Mastodon replicates a lot of the Twitter functionality; in fact, its open source nature and dedication to decentralization make it read like the answer to Twitter that Discord was to Slack.

I have a Mastodon account. It lives on an instance called clawhammer that belongs to my friend, himself a huge fan of decentralization. I’m on there because when he asked if I had switched to Mastodon yet, I expressed my despair and exhaustion at the whole Twitter situation. I felt completely unmotivated to spin up my own instance, and I didn’t trust any of the other rapidly ballooning Mastodon fiefdoms to beat Twitter on matters of governance. If Twitter had proven so vulnerable to crumbling ruin, why should I trust something drastically more volatile?

First of all there’s the excellent point made by folks from Black Twitter about Mastodon: random instance owners don’t necessarily have community guidelines in place, and there’s little recourse for them when racists get freespeechy.6

Mekka Okereke explains on Mastodon 

"I'm not interested in having Fedi 'splained to me. I'm just saying why more Black folk are on BlueSky:

1. It should be almost impossible for a new, non-technical user to onboard onto the Fediverse and accidentally join a server where they will receive racist death threats.

2. It should not be possible for racists to reply to a Black user's post with hateful gore images, without other users on the server knowing it.

3. It should be easy for a new admin to default into a safe, general denylist."

Marco Rogers (@polotek everywhere) surfaced an instructive conversation about this on Mastodon, but I can’t seem to find it now to show you (this might be a me issue or it might be a Mastodon issue). The thrust: since black people are used to being forced to choose between either participation in communities that welcome racism or the “safety” of isolation, they’re choosing options (like staying on Twitter, and we’ll discuss Bluesky later) where at least they have backup available to pile onto the xenophobes and Nazis that get to run rampant everywhere.

This feature of Twitter, which Mastodon has so far failed to match, comes from constituents rather than the platform itself. Though Twitter’s community moderation mechanisms left much to be desired even pre-Melon (we’ll get there), at least making enough of an ass of yourself put you up for candidacy as Twitter’s main character. This unenviable designation fell on folks who caught enough ire from funny or popular tweeters that, for 24-48 hours, they became the central subject of an exuberant public dragging. One famous example: “Bean Dad.” I’ll save you a click on this one:

Bean Dad John Roderick deleted his Twitter account Sunday following an onslaught of brutal criticism about forcing his 9-year-old daughter to struggle for six hours to open a can of beans and the resurfacing of old anti-Semitic and homophobic tweets.
Screenshot examples of tweets in which John Roderick whips out the words "gay," "retard," and "Jew" as slurs.

Mastodon doesn’t have the governance in place to stop bigots. Neither does Twitter, but Twitter at least has the critical mass to expose them to ridicule.

Then there’s the question of succession planning, which the average Mastodon server also can’t answer. Just last month Kris Nova, the founder of the Hachyderm Mastodon server, perished tragically in a climbing accident. First of all, the entire open source community is left poorer in her absence. Respect on her name. But also, her side project became a social media life raft for 38,000 people by accident. If anyone can spin up a Mastodon server, do we think server founders regularly make values-based succession plans? Doubtful. I’m more thoughtful about those kinds of things than most random individuals under the age of 70, and my entire succession plan for this decades-in-the-making blog is “Avdi Grimm has credentials, just in case.”

Despite my extreme skepticism about the volatility of the Mastodon model and FOSS in general, so far, by follower count, Mastodon has been my most successful migration off Twitter. A little more than 10% of my Twitter following managed (bless them) to reassemble on Mastodon, and honestly that’s probably a big chunk of who regularly read what I said on Twitter anyway.

Mastodon has some nice features: they even let you edit your “toots,” which Twitter famously never let people do to tweets. But it ain’t the index Twitter was, either—and like Discord and Slack, that’s by design. It’s decentralized, and instances hang onto their own data. At the limits of decentralization, everyone’s on their own instance running off a little computer that hums away in their basement. For data privacy and data agency, this is ideal. People can completely own and control their individual data! It does, however, hamstring data automation efforts.

For example, without the collective data, you can’t do cross-server follow recommendations. In fact, you can’t even search for people across servers. I’m on a small Mastodon instance. Though personal instances ensure data agency, they also mean the Mastodon search box can’t help find people. I have to go to a search engine and run various searches to find the person I want until I manage to spelunk up the full URL of their account. But I can’t follow it from there because the search engine has linked me to a different Mastodon instance—an instance where, of course, I’m not signed in. So I have to go back to clawhammer, paste the person’s full URL in the search bar there, and then I can follow them. This is an untenable organic growth flow. Now imagine, if finding another person who I am explicitly looking for is that hard, how absurd it would be to expect feature parity with Twitter advanced search on Mastodon.

Let alone complicated features like cross-server categorization, conversation distillation, or automated harassment flagging. Every single one of these is a really hard problem: the kind of problem that pre-Melon Twitter spent billions of dollars to try to solve. To the admittedly limited extent that Twitter succeeded at that, those efforts collectively made Twitter a viable index for finding more voices on specific topics—credible, questionable, but voices nonetheless. Without that data, Twitter couldn’t have become that index. And Mastodon—again, by its deliberate architecture and purpose—can’t. Not across instances, anyway. Maybe it would be possible within a single, large instance, provided the governance were there to support it.

That brings us to…

Answer Three: Bluesky, and other Twitter-like things

Besides slapping myself onto clawhammer, my piecemeal post-Melon attempt at digital healing included supplicating the patron saint of Bluesky invites, Aveta, to bless me.

Aveta played a pivotal role in porting thousands of members of Black Twitter over to the invite-only site. Bluesky owes reparations for sure. As for me? I found myself party to a conversation with two friends who loved Bluesky, both asking me if I was on it. When I responded “no and I’m sad about it,” they went back to excitedly discussing which of the social-media-averse astronauts they might bestow their invite codes on, as tokens of friendship. I whined about the experience on Twitter, and two people who saw it pinged Aveta. Within an hour, I was in.

Anna Gallo posts on Bluesky "Reading Morgan Sung's TechCrunch article on Bluesky and feeling like one of the chosen ones because I'm a boring basic white person who got invited by Aveta." Underneath the text, an image portrays a painting of an elven princess knighting a soldier with a long sword. The caption says "Aveta inviting 500+ black users....and also you"

My friends got it right; Bluesky feels the way Twitter did when Twitter still felt good. The reason Bluesky feels so great is that it organizes conversations the way Twitter did, except thanks to Aveta its current power crowd hails heavily from the sorts of voices that inspired the best of “This website is free.”

It’s no coincidence that Bluesky mirrors Twitter. The project began at Twitter in 2019, with the objective of developing a decentralized7 alternative that Twitter would eventually adopt. Bluesky has since separated from Twitter and has raised its own funding with the hope (we’re told) of making money eventually…somehow. Maybe by letting people buy the opportunity to set their own domains as handle suffixes? Candidly, based on what happened to the original Twitter, I ain’t got a lotta trust for “we promise we’ll eventually sort revenue…somehow.” It’s just not that easy to do in tech these days.

Bluesky remains behind Twitter on features: it lacks DMs, for example, and scored its name in the annals of KnowYourMeme last spring by spawning “Hellthread”: a thread that got too many responses and broke notifications such that any further reply @channel-ed everyone who had replied prior. They also didn’t bother to name their microposts. Leave it to Bluesky to demonstrate the wisdom of Mastodon for calling them “toots”—not because they called them that objectively ridiculous name, specifically, but because they called them literally anything. I call it the Boaty McBoatface Rule of Names: “If you let The Internet name things, they will do it in the most unflattering way possible.” So that rule teamed up with Rule 34 and christened Bluesky posts “skeets.” The CEO hates it, but it’s too late.

A screenshot from the film "Mean Girls" in which the three mean girls look toward the camera. Instead of the original line ("Get in loser, we're going shopping,") the caption says "Get in loser, we're skeeting in hellthread."

I got this image from KnowYourMeme, but let's face it, KnowYourMeme probably stole it from somewhere on Bluesky or Twitter.

Nevertheless, if someone’s gonna beat new Twitter at old Twitter’s game, Bluesky has the lead right now. Here’s my question: what’s the vision? Because candidly, Twitter had problems before Melon, and I’m not sure Bluesky plans to solve them. What if the vision is 1) suck up enough users to land a monster valuation, 2) get bought for one billionty dollars, 3) hop out on a golden parachute before the fuse lit by launching a skeet machine inevitably gets to the dynamite? What’s in it for me to put my writing on a site destined for that? And if I’m not putting it on Bluesky, I’m definitely not putting it on any of these off-brand Twitter clones with fewer features and just as much hunger to swell-sell-skedaddle.

We’re looking at a vacuum that a better social media option could fill.

Are we gonna squander it?

I don’t think we have to.

Maybe the right people, building something good enough, can get everyone to pile onto it like the Twitter of ten years ago (unlikely, in my opinion). Or maybe the moment for centralized social media has passed (fair) and the solution involves a separate service that integrates with a common protocol that all the other social media options share.

There’s work on such a shared protocol: most notably ActivityPub. Mastodon implements ActivityPub, which makes it possible (modulo the user interface issues) for accounts on different instances to interact with one another. A number of other social media projects, including Threads (Facebook’s Twitter thing—I know, everybody‘s doing it), have promised to implement ActivityPub in the future. So, you know, I guess we’ll see if that happens.

Notably aberrant, Bluesky instead lives on the AT Protocol, which its website FAQ insists is better than ActivityPub for various reasons. The one with the most external validity has to do with making it easier to shift accounts to other instances: ActivityPub ties your identity to your home server, so the best Mastodon has come up with for moving servers is automated redirect generation. Bluesky has a great point here: it’s just moot in their case since they’re controlling a centralized instance anyway. Saying “yeah but people could switch between services if everybody just used our thing” is also kinda moot because other services aren’t using their thing: they’re using ActivityPub. A detail of great interest for those who skipped the FAQ: the AT Protocol, like Bluesky itself, came out of Twitter. Bluesky originally existed, effectively, to proof-of-concept the protocol. Eventually Twitter might’ve been on it, too, but Melon canned the project.

Anyway, assuming the various social media services eventually sort out their pissing contest and pick a way to talk to each other, external services can draw from that protocol the way TweetDeck and other popular alternative Twitter UIs drew from the Twitter API itself. A service could request a customizable set of data permissions, collate the data they’re permitted to use from the instances that have granted it, and use that to build products that help accounts navigate the fediverse regardless of which particular services the accounts live on. Once we have a collated dataset—no small feat, but let’s pretend we’re there—data automation projects appear on the possibility horizon even without a monolithic media megachurch.
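
To make that concrete, here’s a back-of-the-napkin sketch, in Ruby (see footnote 9), of what such a collating service might look like. The instance hostnames, permission scopes, and endpoint path are all invented for illustration; a real version would speak whatever endpoints the shared protocol actually exposes.

require "net/http"
require "json"

# Hypothetical roster of instances that have explicitly granted this service
# permission to read a slice of their data. Hosts, scopes, and the endpoint
# path below are placeholders, not a real protocol.
GRANTING_INSTANCES = [
  { host: "clawhammer.example", scopes: ["public_posts"] },
  { host: "hachyderm.example",  scopes: ["public_posts", "profiles"] },
]

def collate_public_posts(instances)
  instances.flat_map do |instance|
    next [] unless instance[:scopes].include?("public_posts")

    uri = URI("https://#{instance[:host]}/api/granted/public_posts")
    response = Net::HTTP.get_response(uri)
    next [] unless response.is_a?(Net::HTTPSuccess)

    # Tag each post with where it came from, so downstream automation
    # (topic categorization, follow recommendations, harassment flagging)
    # can link back to the source instance.
    JSON.parse(response.body).map { |post| post.merge("source" => instance[:host]) }
  end
end

posts = collate_public_posts(GRANTING_INSTANCES)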

Of course, then there’s merely the project of building successful data automation products for social media. We’re about to talk through a bunch of challenges with that, but I don’t think we can succeed by copying the data models we have so far.

We can’t save social media with flat data models.

I’m already on the record for distrusting ‘User’ models. The short version of why: ‘User’ obfuscates not only the actual reason people log into your thing to begin with, but also the underlying structure within a person or group’s account. There’s a general principle here, which is that if the data model doesn’t match the domain, it doesn’t just hobble the system’s functionality: it hobbles the engineers’ ability to even conceptualize the functionality correctly.

Let’s look at how that plays out in social media systems. Social media operates by introducing people to content and connections (other people), so it scales by finding ways to increase the introduction uptake. How is that accomplished? By attempting to recommend content and connections that people want to engage with.

Tech people think “How hard could that be? I’m a genius; this seems fairly straightforward to me,” and that’s exactly how they fuck themselves. Except, it’s not just themselves they’re fucking. Having decision-making power over what goes to prod on a giant system with millions of users is an amount of leverage most techies haven’t considered and are neither cognitively nor emotionally prepared to possess. Don’t believe me? The failure mode for bad social media recommendations at scale is fueling an insurrection. That sounds absurd, right? You’d never catch me saying it, except that it happened.

Before we move on, let’s address the obvious objection: it happened, but it happened to Facebook (Meta, whatever). This company has been the cartoon villain of the data ethics constituency since, conservatively, 2018 (Cambridge Analytica). Even as we all gaped at Melon sucking the thighbones of our beloved blue bird, nobody so much as joked about switching back to Facebook. Just because Evil, Inc. managed to aid and abet domestic terrorism doesn’t mean we’re at risk ourselves, right?

Unfortunately, I must report that it’s not just that Facebook’s kinda evil. It’s also that getting recommendations right is really hard. To wit, an excerpt from this piece I wrote about the value of Big Dyke Energy at my former employer, Pocket8:

We want to be thoughtful about how we recommend stuff.

Here’s the tricky thing about content recommendations: they can go sideways to catastrophic effect. Let’s look at an example: YouTube built a content recommendation algorithm that optimized for the percentage of the video that people watched and how much they commented. That sounds like it would work, right? Recommend stuff people want to watch and talk about! Except that the videos that get watched all the way through trend shorter, and the videos that generate the most commentary trend edgier. Before you know it—bam. You’ve got a political radicalization algorithm on your hands. The company has deployed a few technical workarounds to try to mitigate this effect, but the nature of the optimizing metric limits what can be done. Facebook ran into similar issues with divisive content recommendation strategies, but they chose not to fix it because, after all, engagement is their profit model.

To lots of geniuses, it would seem positively elementary to model a video’s watch-worthiness on whether people finish it and whether people comment. After all, that’s the pattern they observe personally on the MIT lecture videos they speedwatch on YouTube. Problem is, that pattern doesn’t generalize, and it ends up being dangerously wrong in situations no one thought about until the model hit the wild.

This is why drug trials involve 10-15 years of phased testing: to make sure that unknown, dangerous side effects won’t rear their ugly heads once the product reaches the masses. The “trials” for machine learning models reaching the same audience in production are a couple of automated validation steps and a few colleagues reviewing pull requests with “LGTM.” I’ve seen product timelines for flinging yet-unbuilt models at the entire customer base within a year. Their makers are like “yes, we know we’re uncaging a Kraken of unknown horrors on our constituents, but we’re gonna iterate!”

First of all, every programmer who has ever seen a file called *-tmp in production knows that every promise like this thinly shrouds Schrödinger’s Cat levels of uncertainty. Second of all, in machine learning specifically, plenty of practitioners’ idea of “iterating” is “fling every tangentially relevant dataset into Keras at once and see which attributes pop out as correlates.” Tech product release cycles do not allow for the statistical rigor that the leverage of the products deserves. Of course shit goes left.

So how do you build a data model with thoughtful recommendations in mind?

You’re actually asking me? I mean, I’m no Oracle, but these are things I’d consider based on a decade and a half writing code and almost as long on Twitter specifically:

Step 1: Ditch the ‘User’ model and find the underlying structure.

This is the thing I’m most sure about. A better abstraction is usually to make an Account and then figure out what that means. In fact, many people on social media make separate accounts for separate purposes as an ersatz attempt to better model what the idea of a User isn’t modeling for them. Find the underlying structure beneath an account and figure out how to represent that in a pleasant UI.

Most people’s social media feeds, even on one account, quickly become a mixed-up mishmash of different topics and interests. People posting on social media are conditioned to do extra work on the posting end to get their content noticed by folks who care about the topic (hashtags are an example of this). So already we have9:

Account {has_many :topics}

Step 2: Recommend in steps within that structure.

People are welcome to have all their follows on one feed, of course. But when reading their feed, folks often want, in that exact moment, to engage with a subset of the topics they’ve followed. Not one of them; not all of them; but some of them.

So, as an example, maybe instead of a feed recommender, we give people the option to click and drag one or more of their topics into their desired feed before reading it, and give them the additional option to save combinations of topics as specific feeds to click into when they log on.

So now we have:

Account {has_many :feeds}
Feed {has_many :topics}

You don’t even need machine learning for that. You can string-match on hashtags as a starting point. Eventually some data automation could help categorize microposts into topics.
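
As a sketch of that string-matching starting point, with the topic names and hashtags invented for illustration:

# Map each topic to the hashtags that signal it. In a real system this
# mapping would come from the topics an Account follows, not a constant.
TOPIC_HASHTAGS = {
  "chicago-cycling"  => ["#bikechi", "#chicagocycling"],
  "machine-learning" => ["#machinelearning", "#mlops"],
}

def topics_for(post_text)
  downcased = post_text.downcase
  TOPIC_HASHTAGS.select { |_topic, tags| tags.any? { |tag| downcased.include?(tag) } }.keys
end

topics_for("New protected bike lane on Milwaukee Ave! #bikechi")
# => ["chicago-cycling"]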

Once you have the topics an account wants, it’s possible to collect data on when the account holder looks at which feeds. With enough data here, we can start to assemble features that recommend certain accounts for certain feeds and certain feeds at certain times of day—explicitly identified so that readers can switch what feed they’re looking at.
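
Here’s a rough sketch of the time-of-day piece, assuming we log a (feed, hour) pair whenever an account opens a feed; all the names are illustrative:

FeedView = Struct.new(:feed_name, :hour)

def suggested_feed(views, current_hour)
  # Tally which feeds this account historically opens at this hour and
  # surface the most common one as the default suggestion. The reader can
  # still switch to any other feed; this only picks the opening screen.
  at_this_hour = views.select { |view| view.hour == current_hour }
  return nil if at_this_hour.empty?

  at_this_hour.group_by(&:feed_name).max_by { |_name, opens| opens.size }.first
end

views = [FeedView.new("morning-news", 8),
         FeedView.new("morning-news", 9),
         FeedView.new("chicago-cycling", 18)]
suggested_feed(views, 8) # => "morning-news"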

This structure also permits—should someone care to monetize such a site—ad targeting based not on people’s profiles or personal data, but on the topics they’re looking at. Oh, I follow a lot of accounts by Chicago cyclists and folks who talk local politics? I’m probably disproportionately likely to buy a subscription to Block Club Chicago. You don’t have to know anything about me; you don’t even have to scrape my zip code (I have a separate soapbox about how ad tech standards scrape data far below the level of granularity they actually know how to use, and you could get the same results while being way less creepy, but this post is already long).
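
For the curious, a tiny sketch of what topic-based ad matching could look like, with the ad inventory and topic names made up:

# Ads declare the topics they're relevant to; matching runs against the feed
# currently being viewed, not against a person's profile or location.
ADS = [
  { name: "Block Club Chicago subscription", topics: ["chicago-politics", "chicago-cycling"] },
]

def ads_for(feed_topics)
  ADS.select { |ad| (ad[:topics] & feed_topics).any? }
end

ads_for(["chicago-cycling", "bike-repair"]).map { |ad| ad[:name] }
# => ["Block Club Chicago subscription"]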

Spotify’s product, for example, can make customized recommendations based on the current playlist. They also advertise concert tickets to people whose playlists contain that artist’s songs. A microblog feed can work like this, too. Spotify can even recommend specific playlists at specific times based on listening history. My “for you” remains a weird melange of workout tunes, morning wake-up tunes, and bedtime tunes, though. It’s not a feed to be listened to on its own. It might be a feed for listening to with the dedicated goal of finding new stuff for my other lists. Uncategorized feed recommendations top out at about that level of utility, also.

Step 3: Proactively consider the quality assurance process.

Spotify has it easier than microblog feeds on a key metric: quality detection. It is much easier to train a model to identify music similarity than it is to train a model to identify true text vs false text within a topic.

Twitter has (or had, pre-Melon) disinformation detectors, and they sorta worked on controversial topics. The amount of money that Twitter threw at that problem, by the way, is inaccessible for a lot of startups.

The quality detection mechanism used by many Discord servers, some non-corporate Slack instances, and pre-Melon Twitter at larger scale, is moderators (mods). This is where actual human people boot accounts that spread misinformation, harass other accounts, or otherwise make the space a crummier one to be in (usually according to some predetermined community guidelines). They prove to be a necessary component of any online community that gets big. Modding is a valuable skill that makes these communities better for all of us.

The first hurdle with mods, of course, is scaling: it’s easy for a platform’s use to overwhelm moderator capacity. Additionally, there’s fallibility on the decisions themselves: stories abounded of Twitter screwing this up. The canonical case involved a perpetrator invoking a stereotype that Twitter’s mods couldn’t find in slurs.txt or whatever incomplete reference they were using, and the target responding with a swearword that the mods did recognize, and the mods sending the target to Twitter Jail instead of the perpetrator. Toward the end there, I also had to start bleeping out my tweets like I was on Paramount Network, or else the prudebot would suppress their visibility on my followers’ feeds. It felt gross and invasive.

What if, in addition to moderation efforts, we approached the problem from the design direction?

That is, what if we identified reputable voices and prioritized those in recommendations? That’s not a job for a moderation team. It’s effectively a job for a curatorial team. Identify reputable voices on topics, and boost those people’s accounts.

Twitter, it must be said, had a version of this. They never fully published how “blue checks” worked. I think this was because they imagined the feature originally as an authoritative indication of identity. They offered it, at first, only to people who they decided were “sufficiently at risk for social media impersonation,” but the public interpreted it as a designation of “follow-worthiness” and it got messy from there.

Usually celebrities seemed to have them. Sometimes accounts with more than a certain number of followers had them. We were told that accounts that applied for the blue check underwent some secret manual approval process. Programmers with star power would just wake up one day and discover that they had one. Disability activists seemed to have an inordinate amount of trouble getting one. Blue checks also weren’t topic-specific, so if you pulled up tweets that had used the hashtag “#SARSCoV2,” you were looking at celebrity actors with blue checks making wild speculations alongside actual epidemiologists who happened to not have X.XM followers.

I get it: verifying identity scales poorly. Verifying the presence of any kind of credibility scales even worse. Verifying specific topic credibility breaks the scope of a manual process without subject matter experts on the topic at hand.

Instead of identifying who people should follow based on engagement or popularity, there is a slower but more transparent option:

Start with ‘topics.’ When a topic gets popular enough (enough people are sticking it in their feeds), find people manually who know about that topic and give them the ability to mark accounts as ‘reputable’ on that specific topic. This could eventually grow by some kind of process in which any reputable account can mark other accounts as ‘reputable.’
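
In data-model terms, that could look something like this sketch, with a manually seeded list of subject matter experts per topic (the class name and account handles are hypothetical):

class TopicReputation
  def initialize(seed_experts)
    # topic => { account => list of accounts who vouched for it }
    @marks = Hash.new { |marks, topic| marks[topic] = Hash.new { |accts, acct| accts[acct] = [] } }
    @seed_experts = seed_experts # topic => manually chosen experts allowed to mark
  end

  def mark_reputable(topic:, marker:, account:)
    allowed = @seed_experts.fetch(topic, []).include?(marker) ||
              reputable?(topic: topic, account: marker)
    return false unless allowed

    @marks[topic][account] << marker
    true
  end

  def reputable?(topic:, account:)
    @marks[topic][account].any?
  end
end

rep = TopicReputation.new({ "epidemiology" => ["@fieldepidemiologist"] })
rep.mark_reputable(topic: "epidemiology", marker: "@fieldepidemiologist", account: "@labvirologist") # => true
rep.mark_reputable(topic: "epidemiology", marker: "@celebrityactor", account: "@wellnessgrifter")    # => false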

Systematic reputability metrics counteract the effect where people engage more with more sensationalist content, more engagement means the thing gets boosted, and now the internet’s attention is systematically pointed at the most sensationalist content creators. This is how we get Jordan Peterson or The Joe Rogan Experience or Trisha Paytas.

Systematic reputability metrics also avoid the circular dependency, and subsequent positive feedback loop, of popularity models. This is a common pitfall with automated recommenders in general: when a model recommends music, for example, based on what other people like, its shortest path is to just recommend Beyoncé. Why? Because everyone has something of hers in their playlist. That produces pretty accuracy numbers, but it doesn’t make the model valuable. It’s not generating interest in someone who should have it and otherwise won’t. It’s reflecting existing interest generated by Beyoncé’s marketing team before the model ever existed. Machine learning models do not create knowledge. They parrot back what’s already happening. One of my biggest beefs with the way that the zeitgeist has framed machine learning is the association with “innovation.” Machine learning models do not innovate. They plagiarize at scale. Is a custom-built pattern recognizer and replicator often valuable as part of a larger decision-making system? Absolutely. But if you fling that into the wild as the whole system, you might be dismayed by the result.

Are systematic reputability metrics perfect? No—eventually you’ll pick a dud kingmaker. Let’s talk about mitigations for that.

Mitigations for dud kingmakers

Because individuals recommend other individuals, you have a chain of provenance. That makes it very easy to turn off malicious actors if they infiltrate the system. Someone marks an army of cussheads as reputable? Remove that recommender’s reputable status, automatically rendering all their recommendations no longer reputable, and boosting on that whole branch of the tree stops. This is called node poisoning: its use in social media networks was first introduced to me by Coraline Ada Ehmke in 2014.
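
As a sketch of that single-provenance version (the class and method names are mine, for illustration, not anyone’s actual implementation):

class ProvenanceTree
  def initialize(root_account)
    @granted_by = { root_account => nil } # account => the account that marked it
  end

  def mark(marker, account)
    return false unless reputable?(marker) # only currently reputable accounts can vouch
    @granted_by[account] = marker
    true
  end

  def reputable?(account)
    @granted_by.key?(account)
  end

  def poison(account)
    # Drop the account, then recursively drop everyone it vouched for,
    # which stops boosting across that entire branch of the tree.
    vouched_for = @granted_by.select { |_acct, marker| marker == account }.keys
    @granted_by.delete(account)
    vouched_for.each { |child| poison(child) }
  end
end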

The system I’ve described is single provenance. One could instead implement multiple provenance by allowing reputable accounts to mark people as reputable even if they’re already marked as reputable. Then, much like automatic reference counting, the system would check upon the addition or removal of a reputation counter how many marks the account has for the topic at hand. If it now has zero reputable votes, stop boosting it for this topic.
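
The multiple-provenance variant reads a lot like reference counting in code, too. Again, a sketch, not a spec:

class ReputationCounter
  def initialize
    @markers = Hash.new { |marks, account| marks[account] = [] }
  end

  def add_mark(account, marker)
    @markers[account] << marker unless @markers[account].include?(marker)
  end

  def revoke_marker(marker)
    # A revoked marker's vouches disappear everywhere at once, like
    # decrementing every reference they held.
    @markers.each_value { |vouchers| vouchers.delete(marker) }
  end

  def boosted?(account)
    @markers[account].any? # zero remaining marks means stop boosting
  end
end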

A more populist way to do this would be to give every account the ability to recommend accounts and just give the ones we already consider to be reputable a higher weight. On Twitter, this happened manually through an account-led tradition called #FollowFriday. On Fridays, folks would tweet with that hashtag and a list of accounts that they recommended. Sometimes they provided justification, and sometimes they didn’t. Nevertheless, when someone included me in their Follow Friday list, the follow bump it provided to my account more or less tracked the amount of trust that that person had built among their followers.

That brings me to an important piece of this: for the love of Gaea, don’t fully automate any kind of reputation designation. I’ll call it the Boaty McBoatface rule of kings: ask The Internet who to listen to, unchecked, and it’s gonna give you, with high confidence, some absolute crackpots. The hijackability of peer recommendation systems drops precipitously with a manual component.

Perhaps there is value instead in a recommendation sum based on marks from accounts of all different weights that puts an account up for manual consideration by a team of subject matter experts. I think that, in order to ultimately assign reputability, folks should have subject matter expertise and also have agreed to a documented system for establishing reputability. That system should explicitly include alternatives to legitimacy-by-proxy, should address matters of diversity and inclusion, and should express a position on both-sides-ism so committees don’t end up forced, like journalists, to boost both the factually correct take and the yahoo one that too many people are talking about. If I find out onea’y’all felt the need to mark Fox News reputable, for example, this was done wrong.
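
One way to express that weighted sum, where crossing a threshold only queues an account for human review rather than auto-boosting it (weights and threshold pulled out of thin air):

REPUTABLE_WEIGHT = 5.0
DEFAULT_WEIGHT   = 1.0
REVIEW_THRESHOLD = 25.0

# recommendations: { candidate_account => [accounts that recommended it] }
# Returns the candidates a subject-matter-expert committee should look at.
def review_queue(recommendations, reputable_accounts)
  recommendations.select do |_candidate, recommenders|
    score = recommenders.sum do |recommender|
      reputable_accounts.include?(recommender) ? REPUTABLE_WEIGHT : DEFAULT_WEIGHT
    end
    score >= REVIEW_THRESHOLD
  end.keys
end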

Note: I do not believe that reputability scores or marks should be publicly visible. Twitter made the checks visible, which made them an influence on number of followers, and they used the number of followers as a variable in the blue check system. Like straight popularity models, this introduces a circular dependency that creates a positive feedback loop, gumming up the math and influencing the incentives.

Step 4: Let constituents participate in adjustments, always.

Specifically, when offering recommendations within the data model structure, allow individual people to toggle or reconfigure them. They can opt out of adding a recommended account to their topic. They can opt out of adding a topic to their feed. They can add accounts to feeds that “don’t make sense” according to their topic list. You want to build the future of social media? You’re embarking on a process, not a transaction. So the data model better behave accordingly.
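
That could be as simple as keeping people’s overrides as first-class data, something like this sketch (field names are placeholders):

FeedEntry = Struct.new(:account, :topic, :source) # source: :recommended or :manual

class CustomFeed
  def initialize
    @entries = []
    @opted_out = []
  end

  def add_recommendation(account, topic)
    return if @opted_out.include?(account) # never re-add something the person rejected
    @entries << FeedEntry.new(account, topic, :recommended)
  end

  def add_manually(account, topic)
    # People can add accounts that "don't make sense" for their topic list;
    # the :manual source tells later migrations not to overwrite this choice.
    @entries << FeedEntry.new(account, topic, :manual)
  end

  def opt_out(account)
    @opted_out << account
    @entries.reject! { |entry| entry.account == account && entry.source == :recommended }
  end
end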

Then, manage changes to the system with a deprecation plan that considers the ways constituents might have customized their configurations, in an effort not to overwrite them. In fact, these cases can offer a starting point for the product team to understand how folks are wanting to use the system in ways it doesn’t support yet. We’ve discussed several examples in this post of ways that Twitter’s constituents, rather than its software engineers, built features on the site to address viable use cases: hashtags. Follow Fridays. Vigilante moderation by Main Character Status.

Here’s the hard truth that silver spoon product managers and programmers who think their six figure salary makes them a savant need to digest: visionary ideas don’t come from random garage nerds anymore, and they don’t come from spreadsheets either. I’ve said it before and I’ll say it again: visionary ideas derive directly from centering the marginal case. Social media stops mattering to the people who make it worthwhile when it stops doing that.

Why is any of this worth it?

Maybe it’s not, on an individual level. I’m sure techbro adventure parties will sell Twitter clones for beaucoup dollars several times in the next several years. Most people building in the Twitter-shaped hole aren’t trying to improve on Twitter: they’re trying to get paid.10 The easy way to get paid is “find a company to hire you and then do whatever Twitter did, to the extent of the available resources.” And you know what? It’s absolutely going to work, because companies will buy it. I’m already aware of at least one design firm that agreed to build the face of social media’s future and then turned in the Twitter UI exactly, for money.11

But the Twitter plagiarism hustle ain’t gonna work on a business-to-consumer level; it won’t result in the breakout future of social media. Why? Because the constituency that made Twitter capable of becoming Twitter wouldn’t join Twitter again. Twitter became Twitter, in large part, because of its timing. The next thing that does exactly what Twitter did won’t land a Twitter community. To expect that to happen is like retelling a joke somebody else already told earlier in the dinner party and expecting people to laugh just as hard.

The right entrant might just find massive latent potential among the thoroughly disillusioned, who don’t trust any platform that’s popped up so far. They want a system that values inclusion, respects reputability (and specifically, an alternative to legitimacy-by-proxy) on topics of expertise, protects targets from harassment, and doesn’t require people to know them from outside the platform before finding them inside the platform.

Trust has enormous value…including enormous monetary value. But you have to capture it before you can monetize it, and you have to monetize it without losing it. That is, from my perspective, the two-step that social media has failed to nail so far.

Footnotes:

  1. I went to Penn. It’s an Ivy League school, so arguably I’m lying. But Penn is not regarded as prestigious by tech people. None of the ivies are, in fact. Techies regard Stanford, MIT, and CMU as the holy trifecta, not Harvard-Yale-Princeton. I have never had a hiring manager (besides one CEO who was actually a lawyer) mention my alma mater and not immediately confuse it for Penn State. They’d then make conversation by asking me about the only thing they associate with that school: a football coach sex abuse scandal. Not exactly a leading indicator of my tech chops, in either direction.
  2. This is wildly unrelated, but there’s a song called “Birdplane” by Axis of Awesome that, despite not being about Twitter at all, hilariously and presciently captures the absurd mood of the whole Twitter story arc in a way that makes it the perfect complement to writing a post like this one.
  3. A side benefit of using any pronouns is that you get to use any word—you don’t have to put it back on the shelf ’cause it’s “for boys” or whatever.
  4. I am deliberately excluding lifestyle-specific, public-invite social media options like Fetlife for kinksters, Hacker News for tech folks, or Myspace for nostalgic scenesters. I don’t believe that the success of new or existing sites like these will be made or broken by the Twitter implosion.
  5. Yeah, I’m aware that a Slack spokesperson would probably argue me here. Instance owners can’t access your DMs without your permission unless they convince Slack of “legal right or reason” (for the skilled faerie, this is trivial) or they’re on one of the accounts that pays Slack really a lot of money. Forgive me, dear reader, for not absolving Slack based on these comically shaky caveats.
  6. I know that pile-ons do plenty of harm. In fact, I’ve gone on record talking about it. In particular, they demonstrate the general microblog malady that I’ve referred to as Zinger Fever; people get an irresistible dopamine hit from firing without thinking, even when their aim blooows. I’m not calling pile-ons an unequivocal platform asset. I’m pointing out their utility in a very specific context.
  7. I don’t know that Bluesky counts as “decentralized” while everybody’s on one instance.
  8. I can no longer confirm that this is Pocket’s approach. Please do not hold Pocket employees responsible for adhering to the values I expressed in a post two years ago.
  9. Examples in Ruby because it’s a concise but precise, Englishy way to show this concept for a mixed audience. I don’t consent to bikeshedding my language choice for these examples. It’s three lines; you’ll survive.
  10. Friendly reminder that even Twitter did not get paid, on a to-consumer level. In its seventeen-year tenure to date it achieved a profit twice (2018 and 2019), but otherwise recorded losses. Individual contributors and vendors got paid, but that was made possible by VCs.
  11. University of Chicago policy requires me to include an academic honesty (anti-plagiarization) speech in session 1 of all my classes. Meanwhile, in industry…

If you liked this piece:

I think you should just read ‘The Oxymoron of Data Driven Innovation’ already

You might like my assessment of the general, sort of, social media conversation trainwreck

I actually was pretty proud of my MortgageGuyGate piece too
