Instagram has posted an article describing the behind-the-scenes machinery that fills the Explore tab in Instagram with new, interesting stuff every time you open it. It’s a bit technical, so here are five takeaways.” data-reactid=”11″>Instagram has posted an article describing the behind-the-scenes machinery that fills the Explore tab in Instagram with new, interesting stuff every time you open it. It’s a bit technical, so here are five takeaways.
Even Instagram and Facebook have limited resources
feed, which some still would prefer was simply chronological, the Explore tab needs to be algorithmically driven. But understanding what’s happening on an image-based social network and recommending new content to people is a problem that’s exactly as hard as you make it.” data-reactid=”13″>Unlike the feed, which some still would prefer was simply chronological, the Explore tab needs to be algorithmically driven. But understanding what’s happening on an image-based social network and recommending new content to people is a problem that’s exactly as hard as you make it.
It’s also easier to experiment and iterate when you can change stuff and see results quickly, they point out.
It’s all about the account, not the post
So much is posted to Instagram that it would be pretty much impossible to keep track of every photo individually, for recommendation purposes anyway. It’s simpler and more efficient to track accounts, since accounts tend to have themes or topics, from a broader one like “travel” to something highly specific, like especially round seals.
While liking one post from an account doesn’t necessarily mean you’ll like everything else from that account, it is a good indicator that you’re at least interested in the theme of that account. Even if it was this particular post of this particular cat that you wanted to heart because it reminds you of old Mittens, if you’re liking pictures from an account that mostly posts cats, that’s valuable information.
Complex habits inform the algorithm
Notably it isn’t just image features that Instagram uses to figure out what accounts are topically linked, though of course that kind of thing can be detected too. They also use your behavior.
For instance, when you like several posts in a row, they’re more likely to be linked in some way even if Instagram’s algorithms can’t quite see it:
If an individual interacts with a sequence of accounts in the same session, it’s more likely to be topically coherent compared with a random sequence of accounts from the diverse range of Instagram accounts. This helps us identify topically similar accounts.
People just tend to look into stuff that way, going from one travel-focused account to the next, or focusing on animals because they need a pick-me up. All that information gets sucked up by the algorithm and inspected for relevance. Of course deliberate actions like “see fewer posts like this” and blocking accounts has a lot of weight as well.
From “seed accounts” to a top 25
The process of getting from a couple billion posts to just two dozen can be pretty difficult, but you can cut the problem down to manageable size by limiting the Explore tab to accounts linked in some way to accounts the user has already liked or saved posts from. These are called “seed accounts” because everything else in the process really grows out of them.
Because of how the machine learning system represents accounts and their topics inside itself, it’s super easy for it to find a couple hundred similar accounts.
Imagine if you know someone likes a particular reddish-orange marble and you need to find some more like it. If you just dip your hand into a sack of marbles you’re unlikely to find one quickly. Even if you pour them out on the floor you’ll still have to hunt around for a bit. But if you’ve already organized them by color, all you have to do is reach into the general vicinity of the marble they like and you’re almost guaranteed to pick a winner.
The machine learning model does that by giving all these accounts a sort of location in a virtual space, and the closer two are in that space, the closer they are topically.
So the really hard part of paring down a set of billions to a set of hundreds is basically already accomplished by the way the accounts are classified.
From there Instagram does three passes with neural networks of increasing complexity.
I guess that was kind of long for a “takeaway.” Don’t worry, the next one is quick.
And of course, no 🍑
“We want to make sure the content we recommend is both safe and appropriate for a global community of many ages on Explore,” they write. “Using a variety of signals, we filter out content we can identify as not being eligible to be recommended.”