
Automated Content Moderation: How Does It Work?

2025/12/08 01:00

The Markup, now a part of CalMatters, uses investigative reporting, data analysis, and software engineering to challenge technology to serve the public good.


When social media platforms get as big as Instagram—more than 2 billion monthly active users—a huge majority of content moderation is automated. Earlier this week, The Markup published an investigation into how Instagram’s moderation system demoted images of the Israel–Hamas war and denied users the option to appeal, as well as a piece on what someone can do if they think they’ve been shadowbanned.

But how do these automated systems work in the first place, and what do we know about them?

How does any platform moderate billions of posts quickly?

To moderate billions of posts, many social media platforms first compress posts into bite-sized pieces of text that algorithms can process quickly. These compact blurbs, called “hashes,” look like a short combination of letters and numbers, but one hash can represent a user’s entire post.

For example, a hash for the paragraph you just read is:

d0ff9f3f7adc0e012100018c23effc91afd792561738073e42e9bd1f4c590505

And a hash for our entire investigation into shadowbanning on Instagram—a little over 3,200 words—is:

3a6f7903dae55c7c6f509fb6d57a0f08ebe41b27eb6ef5e25c9fe7940cf9dee8

You can play around with generating hashes yourself by using one of many free tools online. (We selected the “SHA-256” algorithm to generate ours.)
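Hashes like the ones above can also be reproduced with a few lines of code. This sketch uses Python's standard `hashlib` library and the SHA-256 algorithm mentioned above (the sample post text is made up):

```python
import hashlib

def sha256_hash(text: str) -> str:
    """Return the SHA-256 hash of a piece of text as a 64-character hex string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# A hypothetical post: any text, however long, maps to one short hash.
post = "An example social media post."
fingerprint = sha256_hash(post)

# Changing even one character produces a completely different hash,
# which is why an exact hash match identifies an exact copy of a post.
altered_fingerprint = sha256_hash(post + "!")
```

The same input always yields the same hash, while any edit, no matter how small, yields an unrelated-looking one.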

In many ways, hashes operate like fingerprints, and content moderation algorithms search through their database of existing fingerprints to flag any matches to posts they already know they want to remove. Dani Lever, a spokesperson at Meta, Instagram’s parent company, confirmed that Instagram uses hashes to “catch known violating content.”
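In code, that fingerprint lookup can be as simple as checking membership in a set of known-bad hashes. A minimal sketch (the database contents here are hypothetical, and a real system would hold many millions of entries):

```python
import hashlib

def fingerprint(post_text: str) -> str:
    """Hash a post so it can be compared against known violating content."""
    return hashlib.sha256(post_text.encode("utf-8")).hexdigest()

# Hypothetical database of hashes of posts already judged to violate policy.
# Platforms share databases like this for terrorist content and CSAM.
known_violating_hashes = {
    fingerprint("a previously removed violating post"),
}

def is_known_violation(post_text: str) -> bool:
    """Flag a new upload if its fingerprint matches a known-bad hash."""
    return fingerprint(post_text) in known_violating_hashes
```

Because only hashes are stored and compared, the check is fast and organizations can share the database without sharing the violating content itself.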

Even if someone edits images or videos after posting, platforms can still identify similar posts by using “perceptual hashing,” which creates something like a partial fingerprint based on parts of the content that can survive alteration, a process described in a 2020 paper about algorithmic content moderation. Perceptual hashing is likely how YouTube can identify snippets of copyrighted music or video in an upload and proactively block or monetize it.
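The idea behind perceptual hashing can be illustrated with a toy “average hash”: each pixel contributes one bit depending on whether it is brighter than the image’s overall average. This is a deliberately simplified sketch, not the algorithm any platform actually uses:

```python
def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, set if the pixel is
    brighter than the image's average brightness."""
    avg = sum(pixels) / len(pixels)
    return tuple(1 if p > avg else 0 for p in pixels)

def hamming_distance(h1, h2):
    """Count how many bits differ between two perceptual hashes."""
    return sum(a != b for a, b in zip(h1, h2))

# A tiny hypothetical grayscale "image" and a brightened, edited copy.
original = [10, 200, 30, 220, 15, 210, 25, 205]
brightened = [p + 20 for p in original]

# A cryptographic hash of the edited copy would be completely different,
# but the perceptual hashes stay close (here, identical), so the edit
# can still be matched to the original.
distance = hamming_distance(average_hash(original), average_hash(brightened))
```

Real systems hash a normalized grid of the image and treat a small Hamming distance, rather than an exact match, as a hit.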

Robert Gorwa, a postdoctoral researcher at the Berlin Social Science Center and lead author of the paper, told The Markup that regulators have been pressuring the biggest social media companies to be more proactive about sharing knowledge and resources on content moderation since the 2010s. In 2017, Facebook, Microsoft, Twitter, and YouTube came together to form the Global Internet Forum to Counter Terrorism, an initiative that, among other things, administers a database of hashes of content from or supporting terrorist and violent extremist groups. Organizations like the National Center for Missing and Exploited Children operate similar hash-sharing platforms for online child sexual abuse material.

In his paper, Gorwa said that Facebook used this technique to automatically block 80% of the 1.5 million re-uploads of the 2019 live-streamed video of the Christchurch mosque mass shooting. But checking uploads against hashes doesn’t help platforms evaluate whether brand new posts are violating standards or have the potential to get them in trouble with their users, advertisers, and regulators.

How are new posts moderated in real time?

That’s where machine learning algorithms come in. To proactively flag problematic content, Meta trains machine learning models on enormous amounts of data: a continuously refreshed pool of text, images, audio, and video that users and human content moderators have reported as inappropriate. As the model processes more impermissible content, it gets more efficient at flagging new uploads.

Long Fei, who until 2021 worked at YouTube as a technical lead managing its Trust and Safety Infrastructure Team, told The Markup that, behind the scenes, specialized models scan what users post to the site. According to Fei, these specialized models have different jobs. Some look for patterns and signals within the posts, while others weigh those signals and decide what to do with the content.

For example, Fei said, “there may be models looking for signals of guns and blood, while there may be other models using the signals and determining whether the video contains gun violence.” While the example itself is oversimplified, Fei said, it’s a good way to think about how models work together.

Instagram says it builds “machine learning models that can do things like recognize what’s in a photo or analyze the text of a post. … models may be built to learn whether a piece of content contains nudity or graphic content. Those models may then determine whether to take action on the content, such as removing it from the platform or reducing its distribution.”
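Fei’s two-stage description can be sketched as a pipeline: signal models score the content, and a decision model turns those scores into an action. Every name, threshold, and heuristic below is an illustrative stand-in, not Meta’s or YouTube’s actual system; real signal models are trained classifiers, not keyword checks:

```python
# Stage 1: signal models score a post for individual patterns.
def gun_signal(post: dict) -> float:
    # Stand-in for a trained classifier; here, a crude keyword heuristic.
    return 1.0 if "gun" in post["text"].lower() else 0.0

def blood_signal(post: dict) -> float:
    return 1.0 if "blood" in post["text"].lower() else 0.0

# Stage 2: a decision model weighs the signals and picks an action.
def decide(post: dict) -> str:
    """Combine signal scores into one of three moderation actions."""
    score = 0.6 * gun_signal(post) + 0.4 * blood_signal(post)
    if score >= 0.9:
        return "remove"
    if score >= 0.5:
        return "demote"  # reduce distribution without removing the post
    return "allow"
```

The “demote” branch mirrors the distinction Instagram describes between removing content and merely reducing its distribution.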

What can human moderators see?

Platforms also have people who manually moderate content, of course—their work is used to train and check the machine learning algorithms. The human point of view is also necessary to resolve questions requiring sensitivity or diplomacy, and to review user appeals of moderation decisions.

Although platforms have their own internal teams, much of the actual review work is outsourced to contractors that employ people in the U.S. and around the world, often at low wages. Recently, content moderators have started to organize for better pay and mental health services to help with the trauma of flagging the internet’s darkest content.

There isn’t a lot of transparency around what information human moderators have access to. A new California law that requires large social media companies to disclose how they moderate content is currently being challenged in court by X, formerly known as Twitter.

Joshua Sklar, a former Facebook content moderator, told The Markup that human moderators often don’t have the frame of reference needed to make informed decisions. The team he worked on specifically looked at moderating individual Facebook posts, but Sklar said he could barely see any information other than the post itself.

“You’re pretty much viewing these things out of context,” Sklar said. “Say someone did a series of posts of images that spelled out a racial slur. You wouldn’t be able to tell [as a moderator].”

Gorwa has heard similar accounts. “Human moderators typically have limited context about the person posting and the context in which the content they’re reviewing was posted. The full thread in which the post appeared is usually not available,” he said.

Meta’s transparency center describes how review teams work, but does not describe what review teams actually see about users and what they posted.

When cases need context, Meta says it will “send the post to review teams that have the right subject matter and language expertise for further review. … When necessary, we also provide reviewers additional information from the reported content. For example, words that are historically used as racial slurs might be used as hate speech by one person but can also be a form of self-empowerment when shared by another person, in a different context. In some cases, we may provide additional context about such words to reviewers to help them apply our policies and decide whether the post should be left up or taken down.”

Meta declined to comment on what Instagram’s human moderators can see when reviewing content.

How is artificial intelligence being used?

Robyn Caplan, an assistant professor at Duke University’s Sanford School of Public Policy who researches platform governance and content moderation, said that previously, “it was thought that there were certain types of content that were not going to be done through automation, things like hate speech that require a lot of context. That is increasingly not the case. Platforms have been moving towards increased automation in those areas.”

In 2020, Facebook wrote about how it “uses super-efficient AI models to detect hate speech.” Now, in 2024, Meta said it has “started testing Large Language Models (LLMs) by training them on our Community Standards to help determine whether a piece of content violates our policies. These initial tests suggest the LLMs can perform better than existing machine learning models.” The company has also created AI tools to help improve performance of its existing AI models.
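A hedged sketch of how such an LLM-based policy check might be wired up: the post and an excerpt of the written policy go into a prompt, and the model’s answer drives the decision. The policy text is invented for illustration, and `call_llm` is a stub standing in for a real model endpoint; nothing here reflects Meta’s actual implementation:

```python
# Hypothetical policy excerpt; a real system would use the actual
# Community Standards text the model was trained or prompted with.
POLICY_EXCERPT = "Do not post content that praises or incites violence."

def build_moderation_prompt(post_text: str) -> str:
    """Assemble a prompt asking a model to judge a post against policy."""
    return (
        "You are a content moderation assistant.\n"
        f"Policy: {POLICY_EXCERPT}\n"
        f"Post: {post_text}\n"
        "Answer exactly VIOLATES or OK."
    )

def call_llm(prompt: str) -> str:
    # Stub: a production system would send the prompt to an LLM
    # endpoint here and return its text response.
    return "OK"

def violates_policy(post_text: str) -> bool:
    return call_llm(build_moderation_prompt(post_text)).strip() == "VIOLATES"
```

The appeal of this design is that updating the written policy can change the classifier’s behavior without retraining a bespoke model, which may be part of why Meta reports promising early tests.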

“There’s hype about using LLMs in content moderation. There’s some early indications that it could yield good results, but I’m skeptical,” Gorwa said.

Despite investing in AI tools, Meta has had some high-profile stumbles along the way, such as when Instagram associated posts about the Al-Aqsa Mosque with terrorist organizations in 2021.

More transparency, please

Our reporting has shown that platforms often say they are constantly building new moderation tools and tweaking the rules for what is permissible on their site. But that isn’t the whole story.

Platforms are overwhelmed with the flood of content they encourage—and depend on—from their users. They make efforts to root out bad actors, but often find themselves sanctioning accounts or posts made by people who are expressing themselves in good faith. As we’ve seen over and over again, the results are often disproportionate, with the views of a single group suffering far more than others.

It’s a difficult problem—there is truth to the statement that Lever, Meta’s spokesperson, made in our story earlier this week: her company is indeed operating massive platforms in “a fast-moving, highly polarized” environment.

Shifting societal norms, technological advances, and the chaos of world events mean we may never reach an equilibrium where content moderation is solved.

“What we’ve really learned out of the debate over content moderation over the last several years is that [it] has been implied that there is a solution to a lot of these problems that is going to satisfy everyone, and there might not be,” Caplan said.

But as long as tech companies refuse to be more forthcoming about how they police content, everyone on the outside, including users, will be left trying to piece together what they can about these systems—while having no choice but to live with the results.


Credits

  • Tomas Apodaca, Journalism Engineer
  • Natasha Uzcátegui-Liggett, Statistical Journalist

Design and Graphics

  • Sisi Wei
  • Joel Eastwood

Engagement

  • Maria Puertas

Editing

  • Soo Oh
  • Sisi Wei
  • Michael Reilly


Photo by Igor Omilaev on Unsplash
