Reddit user reverse-engineers Apple's CSAM tool and discovers algorithm already exists

Earlier this month, Apple announced that it would introduce new child safety features across its entire ecosystem. As part of this effort, media reported, the Cupertino-based company will use on-device machine learning to scan content in iCloud and the Messages app to detect possible child sexual abuse material (CSAM).

Although Apple clarified that the feature would not be used to invade privacy or be exploited to obtain users' information and photos, the announcement still caused considerable controversy in the technology industry and among the public.

Following the criticism, Apple released a six-page document outlining its approach to combating CSAM using on-device machine learning and an algorithm called NeuralHash.

Apple further stated that its CSAM detection module is under development and will only scan images that have been flagged as problematic.

However, in the latest development, a curious Reddit user dug into Apple's hidden APIs and reverse-engineered the NeuralHash algorithm. Surprisingly, the user found that the algorithm has existed in Apple's ecosystem since as early as iOS 14.3. This may raise some eyebrows, since the CSAM detection plan was announced only recently, but the user pointed out that there are good reasons to believe the discovery is legitimate.

First, the model's files all carry the NeuralHashv3b prefix, which matches the naming used in Apple's six-page paper. Second, the hidden code synthesizes hashes using the same process outlined in Apple's documentation. Third, Apple claims that its hashing scheme produces hashes that are almost independent of an image's size and compression, and the Reddit user observed the same behavior in the code. Together, these points strengthened the user's belief that what was found buried in the system is indeed NeuralHash.
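Apple's public technical summary describes the last step of this process as a locality-sensitive hash that turns the network's floating-point descriptor into a short string of bits. The following is a minimal Python sketch of that idea only, not the extracted model: the hyperplanes are random, the 8-value descriptor is made up, and the function names (`make_hyperplanes`, `sign_hash`, `bits_to_hex`) are hypothetical.

```python
import random


def make_hyperplanes(num_bits, dim, seed=0):
    """Build num_bits random hyperplanes of length dim (assumed, not Apple's)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def sign_hash(descriptor, hyperplanes):
    """Emit one bit per hyperplane: 1 if the dot product is non-negative."""
    bits = []
    for plane in hyperplanes:
        dot = sum(p * x for p, x in zip(plane, descriptor))
        bits.append(1 if dot >= 0 else 0)
    return bits


def bits_to_hex(bits):
    """Pack a list of 0/1 bits into a zero-padded hexadecimal string."""
    value = int("".join(str(b) for b in bits), 2)
    width = (len(bits) + 3) // 4
    return f"{value:0{width}x}"


planes = make_hyperplanes(num_bits=16, dim=8)

# Made-up descriptor standing in for the neural network's output.
descriptor = [0.9, -0.4, 0.1, 0.7, -0.8, 0.3, -0.2, 0.5]
# The same descriptor with twice the magnitude.
scaled = [2.0 * x for x in descriptor]

hash_a = bits_to_hex(sign_hash(descriptor, planes))
hash_b = bits_to_hex(sign_hash(scaled, planes))
print(hash_a, hash_b, hash_a == hash_b)
```

Because only the sign of each projection is kept, doubling the descriptor leaves every bit unchanged. Robustness to resizing and compression would come from the neural network that produces the descriptor, which this sketch does not model.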

The Reddit user published the findings on GitHub. The exported model files were not published, but the user outlined how to extract the model and convert it to the ONNX format for deployment. After exporting the model, the user ran a test inference and provided a sample image.

According to the Reddit user, the hash is the same on all devices except for a few bits. This is expected behavior, because NeuralHash relies on floating-point calculations whose precision depends heavily on the hardware. The user added that Apple will likely account for these differences in the subsequent database matching algorithm.
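One way a matcher could tolerate a few differing bits is to compare hashes by Hamming distance instead of strict equality. The Python sketch below illustrates that idea under stated assumptions: the hash values are made up, the 4-bit threshold is arbitrary, and nothing here reflects how Apple's matching actually works.

```python
def hamming_distance(hex_a, hex_b):
    """Count the bit positions at which two equal-length hex hashes differ."""
    if len(hex_a) != len(hex_b):
        raise ValueError("hashes must have the same length")
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")


def is_match(hex_a, hex_b, max_bits=4):
    """Treat two hashes as the same image if at most max_bits bits differ.

    max_bits is an assumed tolerance chosen only for illustration.
    """
    return hamming_distance(hex_a, hex_b) <= max_bits


# Made-up hashes standing in for the same image hashed on two devices.
hash_device_a = "a1b2c3d4e5f60718293a4b5c"
hash_device_b = "a1b2c3d4e5f60718293a4b5f"

print(hamming_distance(hash_device_a, hash_device_b))
print(is_match(hash_device_a, hash_device_b))
```

A strict string comparison would treat these two values as different images, while a distance threshold treats them as the same one, at the cost of also accepting unrelated hashes that happen to fall within the threshold.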

The Reddit user believes that now is a good time to take a deeper look at how NeuralHash works and its impact on user privacy.
