Earlier this month, Apple announced that it would introduce new child safety features across its entire ecosystem. As part of this effort, the Cupertino-based company will use on-device machine learning to scan content in iCloud and the Messages app to detect possible child sexual abuse material (CSAM), media reported. Although Apple clarified that the feature would not be used to invade privacy or be exploited to obtain other people's information and photos, the announcement still caused considerable controversy in the technology industry and among the public.

Following the criticism, Apple released a six-page document outlining its approach to combating CSAM using on-device machine learning and an algorithm called NeuralHash. Apple further stated that its CSAM detection module is under development and will only scan images that have been flagged as problematic.

However, in the latest development, a curious Reddit user dug into Apple's hidden APIs and reverse engineered the NeuralHash algorithm. Surprisingly, they found that the algorithm has existed in Apple's ecosystem since as early as iOS 14.3. This may raise some eyebrows, since the CSAM detection feature was only recently announced, but the user pointed out that there are good reasons to believe the discovery is legitimate.

First, the model's file names all carry the NeuralHashv3b prefix, which follows the naming used in Apple's six-page document. Second, the undisclosed code synthesizes hashes using the same process outlined in Apple's documentation. Third, Apple claims that its hashing scheme creates hashes that are almost independent of the size and compression of the image, and the Reddit user observed the same behavior in the code, further solidifying their belief that what they found hidden deep in the system was indeed NeuralHash.

The Reddit user published the findings on GitHub.
While they did not publish the exported model files, they outlined the process of extracting the model and converting it to the deployable ONNX format. After exporting the model, they ran a test inference on a sample image. According to the Reddit user, the resulting hash is the same on all devices except for a few bits, which is expected behavior because NeuralHash relies on floating-point calculations, whose precision depends heavily on the hardware. They added that Apple likely accounts for these differences in its subsequent database-matching algorithm. The Reddit user believes that now is a good time to take a deeper look at how NeuralHash works and at its impact on user privacy.
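
To illustrate why a difference of a few bits between devices need not prevent a match, here is a minimal Python sketch that compares two hashes by counting the bits in which they differ (the Hamming distance) and accepts them as a match when the count is small. This is not Apple's matching algorithm, which the article does not describe. The function names, the threshold of 4 bits, and the sample hash values are all hypothetical and chosen purely for illustration.

```python
def hamming_distance(hash_a: str, hash_b: str) -> int:
    """Count the differing bits between two equal-length hex hash strings."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    # XOR leaves a 1 in every bit position where the two hashes disagree.
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")


def is_probable_match(hash_a: str, hash_b: str, max_differing_bits: int = 4) -> bool:
    """Treat two hashes as matching if they differ in only a few bits.

    The default threshold is an assumption made for this sketch,
    not a value taken from Apple's documentation.
    """
    return hamming_distance(hash_a, hash_b) <= max_differing_bits


# Made-up hash values (not real NeuralHash output).
hash_device_a = "a3f0c2d4"   # hypothetical hash computed on one device
hash_device_b = "a3f0c2d5"   # same image on another device, last bit flipped
hash_other = "5c0f3d2b"      # unrelated image: every bit differs

print(hamming_distance(hash_device_a, hash_device_b))   # 1
print(is_probable_match(hash_device_a, hash_device_b))  # True
print(hamming_distance(hash_device_a, hash_other))      # 32
print(is_probable_match(hash_device_a, hash_other))     # False
```

A tolerance of this kind trades strictness for robustness: a larger threshold absorbs more hardware-dependent variation but also raises the chance that unrelated images are treated as matches.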