Reddit user reverse-engineers Apple's CSAM tool and discovers algorithm already exists


Earlier this month, Apple announced that it would introduce new child safety features across its entire ecosystem. As part of this effort, the Cupertino-based company will use on-device machine learning to scan content in iCloud and the Messages app for possible child sexual abuse material (CSAM), according to media reports.

Although Apple clarified that the feature would not be used to invade privacy or be exploited to obtain other people's information and photos, the announcement still caused considerable controversy in the technology industry and among the public.

Following the criticism, Apple released a six-page document outlining its approach to combating CSAM using on-device machine learning and an algorithm called NeuralHash.

Apple further stated that its CSAM detection module is under development and will only scan images that have been flagged as problematic.

In the latest development, however, a curious Reddit user dug into Apple's hidden APIs and reverse-engineered the NeuralHash algorithm. Surprisingly, they found that the algorithm has been present in Apple's ecosystem since at least iOS 14.3. This may raise some eyebrows, since the CSAM detection plan was announced only recently, but the user pointed out that there are good reasons to believe the discovery is legitimate.

First, the model files all carry the NeuralHashv3b prefix, which matches the naming used in Apple's six-page document. Second, the undisclosed code uses the same process for synthesizing hashes as outlined in Apple's documentation. Third, Apple claims that its hashing scheme produces hashes that are almost independent of the size and compression of the image, which is also what the Reddit user observed in the code, further strengthening their belief that NeuralHash was indeed found hidden deep in the system.
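To make the third point concrete, here is a minimal sketch of how a perceptual hash can stay the same when an image is resized. It uses a toy "average hash", not NeuralHash itself, which relies on a neural network. The function names `average_hash` and `upscale` and the sample image are made up for illustration.

```python
def average_hash(pixels, grid=4):
    """Toy perceptual hash (NOT NeuralHash): shrink the image to a
    grid x grid summary by block averaging, then threshold at the mean."""
    height, width = len(pixels), len(pixels[0])
    cells = []
    for gy in range(grid):
        for gx in range(grid):
            y0, y1 = gy * height // grid, (gy + 1) * height // grid
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    # One bit per cell: 1 if the cell is brighter than the overall mean.
    return "".join("1" if c > mean else "0" for c in cells)


def upscale(pixels, factor):
    """Nearest-neighbour upscaling: repeat every pixel `factor` times
    horizontally and every row `factor` times vertically."""
    return [[v for v in row for _ in range(factor)]
            for row in pixels for _ in range(factor)]


# Hypothetical 8x8 grayscale image: a diagonal brightness gradient.
small = [[(x + y) * 16 for x in range(8)] for y in range(8)]
large = upscale(small, 3)  # the same picture at 24x24

print(average_hash(small))
print(average_hash(small) == average_hash(large))
```

Because both versions are reduced to the same coarse summary before thresholding, the resolution of the input does not change the resulting bits. A real scheme has to cope with far messier changes, such as lossy compression, which is why a learned model is used instead of simple averaging.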

The Reddit user published the findings on GitHub. While they did not publish the exported model files, they outlined the process of extracting the model and converting it to the ONNX format for deployment. After exporting the model, they test-ran inference and provided a sample image.

According to the Reddit user, the hash is the same on all devices except for a few bits. This is expected behavior, because NeuralHash relies on floating-point calculations whose precision depends heavily on the hardware. They added that Apple is likely to account for these differences in the subsequent database matching algorithm.
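The following sketch illustrates why tiny floating-point differences can flip a few bits, and how a matcher could tolerate them. It assumes, purely for illustration, that the last step of the hash turns real numbers into bits by their sign. The sample values, the helper names, and the `MAX_DISTANCE` tolerance are all hypothetical, and this is not a description of Apple's actual matching algorithm.

```python
def threshold_bits(values):
    """Assumed final step: one bit per number, 1 if positive, else 0."""
    return [1 if v > 0 else 0 for v in values]


def hamming_distance(a, b):
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical pre-threshold outputs for the same image on two devices.
# They agree except for one value that sits very close to zero, where a
# rounding difference is enough to change the sign.
device_a = [0.82, -0.40, 0.00003, -0.91, 0.15, -0.27, 0.66, -0.05]
device_b = [0.82, -0.40, -0.00002, -0.91, 0.15, -0.27, 0.66, -0.05]

bits_a = threshold_bits(device_a)
bits_b = threshold_bits(device_b)
distance = hamming_distance(bits_a, bits_b)

MAX_DISTANCE = 1  # hypothetical tolerance for treating two hashes as a match
print(bits_a)
print(bits_b)
print(distance, distance <= MAX_DISTANCE)
```

Only values near the threshold are at risk, which fits the observation that hashes differ in just a few bits. Comparing hashes by Hamming distance rather than exact equality is one common way to absorb such differences.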

The Reddit user believes that now is a good time to take a deeper look at how NeuralHash works and its impact on user privacy.
