"Pink Killer" wanted poster, AI's ability to read breast X-rays is comparable to that of doctors

According to World Health Organization statistics, there were 2.3 million new cases of breast cancer worldwide in 2020, surpassing lung cancer to make it the most commonly diagnosed cancer.

However, if breast cancer is detected early and treated promptly, before the tumor metastasizes, its mortality rate can be greatly reduced. The common method of initial screening is a breast X-ray (mammogram), which a doctor then analyzes and reviews to assess the health of the breast. This review process is time-consuming and can delay other patients' visits.
To this end, researchers at the University of Nottingham in the UK compared the ability of a commercial AI and doctors to read breast X-rays, providing new ideas for the application of AI in clinical medicine.

Author | Xuecai

Editor | Three Sheep, Iron Tower

This article was first published on HyperAI WeChat public platform~

According to American Cancer Society estimates, approximately 930,000 new cancer cases were expected among American women in 2022, of which approximately 290,000, or 31%, were breast cancer. Breast cancer also accounts for 15% of cancer deaths among women, second only to lung cancer.

Figure 1: Number of new cancer cases (top) and cancer deaths (bottom) in the United States in 2022

In China, breast cancer has been the most common cancer among women in the 21st century, and the number of new patients is increasing every year.

Figure 2: Number of new cancer cases in Chinese women from 2000 to 2016, with the gray color representing breast cancer cases

Breast cancer is a disease in which abnormal breast cells grow out of control and form tumors. Without timely intervention, the tumor metastasizes and spreads, eventually endangering life. However, if the tumor is found while still localized and treatment begins early, the five-year survival rate can reach 99%.

At present, hospitals generally use mammography to screen for breast cancer. However, the initial screening can produce false positives, causing patients without cancer to undergo unnecessary tests. It can also miss cancers (false negatives), delaying treatment past the optimal window.

Many European countries therefore have mammograms reviewed a second time to eliminate as many false positives as possible. This approach is effective: it reduces false positives while increasing cancer detection rates by 6% to 15%.

However, it takes a considerable amount of time to read and evaluate X-rays. In areas where the doctor-patient ratio is low, reviewing X-rays not only takes up doctors' time, but also affects the early screening of other patients.

AI can relieve part of this workload, but entrusting it with decisions about life and health raises safety concerns. In this regard, Professor Yan Chen of the University of Nottingham in the UK said, "The application of AI in clinical medicine faces great pressure, but we need to do this well to protect women's health."

To this end, Yan Chen's team compared the accuracy of Lunit, a commercial AI, with that of doctors in reading breast X-rays. **The results showed that Lunit's ability to analyze breast X-rays is comparable to that of human doctors.** The results have been published in "Radiology".

Paper link:

https://pubs.rsna.org/doi/10.1148/radiol.223299#_i13

Experimental procedures

Dataset: PERFORMS dataset

This study used two PERFORMS test sets to evaluate the model. Each set consists of 60 challenging X-rays, including malignant tumors (about 35%), benign tumors, and normal results. Over the past 30 years, PERFORMS has been used for entry tests and routine assessments of doctors in the UK National Health Service Breast Screening Program (NHSBSP).

Evaluation criteria: marking + scoring

When analyzing X-rays, doctors mark suspicious locations and then give an overall rating from 1 to 5, corresponding to normal, benign, uncertain, suspicious, and malignant.

The AI rates the suspiciousness of each feature in the X-ray from 1 to 100, and the highest feature score is taken as the score for the entire X-ray. If no suspicious features are found, the X-ray receives a score of 0.
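The AI scoring rule described above can be sketched in a few lines of Python. This is only an illustration of the "highest feature score wins" rule, not Lunit's actual implementation; the function name `exam_score` and the sample scores are assumptions made for the example.

```python
def exam_score(feature_scores):
    """Return the exam-level score: the highest per-feature score,
    or 0 when no suspicious features were flagged."""
    return max(feature_scores, default=0)


# Hypothetical per-feature suspiciousness scores for two X-rays.
print(exam_score([12.5, 87.0, 40.2]))  # 87.0
print(exam_score([]))                  # 0
```

Taking the maximum means a single highly suspicious feature is enough to flag the whole X-ray, regardless of how many low-scoring features accompany it.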

Figure 3: Doctors and AI’s analysis of breast X-rays

A: The blue arrow indicates an unknown mass with a diameter of 8 mm, which was later identified as histological grade 2 ductal carcinoma;

B: The red cross is the abnormal feature discovered by AI, and the blue dot is the suspicious area marked by the doctor during analysis.

Comparison results: Specificity + Sensitivity

A total of 552 doctors took part in the comparison, accounting for 68% of all NHSBSP readers, including 315 radiologists, 206 radiographers and 31 clinicians.

Across the two PERFORMS datasets, 161 mammograms were classified as normal, 70 as malignant, and 9 as benign. Common features of malignancy included masses (64.3%), calcifications (12.9%), asymmetry (11.4%), and architectural distortion (11.4%), with an average lesion size of 15.5 ± 9.2 mm.

Table 1: Results on the PERFORMS dataset

The mean AUC of the human group was 0.88. The AI's AUC was 0.93, which corresponds to the 96.8th percentile of the human group, but the difference between the two was not statistically significant.
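To make the percentile statement concrete, the sketch below ranks one AUC value against a group of reader AUCs. The reader values are made up for illustration and are not the study's data, and the "share of readers strictly below" definition is an assumption; the paper may define the percentile differently.

```python
def percentile_rank(value, population):
    """Percentage of values in `population` that are strictly below `value`."""
    below = sum(1 for x in population if x < value)
    return 100.0 * below / len(population)


# Hypothetical reader AUCs, for illustration only (not the study's data).
reader_aucs = [0.80, 0.84, 0.86, 0.88, 0.88, 0.90, 0.91, 0.92, 0.94, 0.95]
ai_auc = 0.93

print(percentile_rank(ai_auc, reader_aucs))  # 80.0
```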

Figure 4: AUC histogram of the doctor group and the AUC of AI (yellow line)

The average sensitivity and specificity of the human group were 90% and 76%, respectively. At the threshold recommended by the developers, the AI's sensitivity and specificity were 84% and 89%, respectively.

Table 2: Judgment results of the doctor group and AI with different thresholds

TP: true positive;

FP: false positive;

TN: true negative;

FN: false negative;

Sensitivity = TP / total number of positives;

Specificity = TN / total number of negatives.
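The two definitions above translate directly into code. In this minimal sketch, the sensitivity example uses the 63 detected and 7 missed malignancies that the article reports for the AI at a threshold of 3.06; the counts in the specificity example are hypothetical and chosen only to show the calculation.

```python
def sensitivity(tp, fn):
    """Share of actual positives that were correctly flagged: TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """Share of actual negatives that were correctly cleared: TN / (TN + FP)."""
    return tn / (tn + fp)


# From the article: 63 malignant tumors detected, 7 missed.
print(round(sensitivity(tp=63, fn=7), 2))   # 0.9

# Hypothetical counts, for illustration only.
print(round(specificity(tn=76, fp=24), 2))  # 0.76
```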

Measured against the AI's ROC curve, 52% of doctors performed above the curve, 36% performed below it, and 12% fell on the curve.
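One way to decide whether a doctor sits above or below the AI's ROC curve is to compare the doctor's sensitivity with the curve's sensitivity at the same false positive rate. The sketch below does this with linear interpolation; the curve points, the reader values, and the tolerance are all hypothetical, and the study may have used a different method to classify readers.

```python
def roc_tpr_at(fpr, roc_points):
    """Linearly interpolate the curve's true positive rate at `fpr`.
    `roc_points` is a list of (fpr, tpr) pairs sorted by fpr."""
    for (x0, y0), (x1, y1) in zip(roc_points, roc_points[1:]):
        if x0 <= fpr <= x1:
            if x1 == x0:  # vertical segment: take the upper end
                return max(y0, y1)
            return y0 + (y1 - y0) * (fpr - x0) / (x1 - x0)
    raise ValueError("fpr is outside the range covered by the curve")


def compare_to_curve(reader_fpr, reader_tpr, roc_points, tol=1e-9):
    """Classify one reader's operating point as 'above', 'below', or 'on' the curve."""
    curve_tpr = roc_tpr_at(reader_fpr, roc_points)
    if reader_tpr > curve_tpr + tol:
        return "above"
    if reader_tpr < curve_tpr - tol:
        return "below"
    return "on"


# Hypothetical AI ROC curve as (false positive rate, sensitivity) pairs.
ai_roc = [(0.0, 0.0), (0.1, 0.6), (0.3, 0.9), (1.0, 1.0)]

print(compare_to_curve(0.2, 0.80, ai_roc))  # above
print(compare_to_curve(0.2, 0.70, ai_roc))  # below
print(compare_to_curve(0.1, 0.60, ai_roc))  # on
```

Here the false positive rate is 1 minus specificity, so a doctor "above" the curve achieves higher sensitivity than the AI at the same specificity.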

Figure 5: ROC curve of AI, where the blue dots are the performance of different doctors

When the AI threshold was set to 3.06, the AI's sensitivity matched that of the doctors: it detected 63 malignant tumors and missed only 7. At this threshold, the AI's specificity was not significantly different from that of the doctors.

When the threshold was set to 2.91, the AI had the same specificity as the doctors, with a sensitivity of 91%. These results show that the sensitivity and specificity of Lunit's AI in analyzing breast X-rays are comparable to those of human doctors.
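The trade-off between the two thresholds can be illustrated with a small sketch. The scores and labels below are hypothetical and do not come from the study, and treating a score at or above the threshold as "flagged" is an assumption. The point is only that lowering the threshold flags more X-rays, which raises sensitivity and lowers specificity.

```python
def confusion_at_threshold(scores, labels, threshold):
    """Count TP, FP, TN, FN when X-rays scoring >= threshold are flagged."""
    tp = fp = tn = fn = 0
    for score, is_malignant in zip(scores, labels):
        flagged = score >= threshold
        if flagged and is_malignant:
            tp += 1
        elif flagged:
            fp += 1
        elif is_malignant:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def rates(scores, labels, threshold):
    """Return (sensitivity, specificity) at the given threshold."""
    tp, fp, tn, fn = confusion_at_threshold(scores, labels, threshold)
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical AI scores: five non-malignant X-rays, then five malignant ones.
scores = [0.0, 0.5, 1.2, 2.95, 3.5, 2.0, 3.0, 12.0, 45.0, 90.0]
labels = [False] * 5 + [True] * 5

print(rates(scores, labels, 3.06))  # (0.6, 0.8)
print(rates(scores, labels, 2.91))  # (0.8, 0.6)
```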

Figure 6: The impact of different thresholds on AI judgment results

A: The blue arrow indicates an asymmetric area, which was later identified as histological grade 2 ductal carcinoma;

B: Detection result at an AI threshold of 2.91; the region marked with the red cross was ultimately confirmed as a true positive;

C: Detection result at an AI threshold of 3.06; no obvious abnormal features were found.

Professor Yan Chen said, "The results of this study provide strong evidence for AI screening, showing that AI's level of analysis of mammograms is comparable to that of human doctors."

Breast cancer: the hidden pink killer

On World Cancer Day, February 4, 2021, the International Agency for Research on Cancer, part of the World Health Organization (WHO), stated that there were 2.3 million new cases of breast cancer in 2020, accounting for 11.7% of all new cancer cases and exceeding the number of new lung cancer cases for the first time, making breast cancer a "hidden pink killer."

Incidence is highest among women in high-income countries and significantly lower among women in middle- and low-income countries. In addition, about 0.5-1% of breast cancers occur in men.

However, the mortality rate of breast cancer itself is not high. About 8 million women who were diagnosed with breast cancer between 2016 and 2020 were still alive, more than for any other cancer.

WHO is currently promoting its Global Breast Cancer Initiative worldwide, aiming to reduce the number of deaths from breast cancer through early detection, timely diagnosis and comprehensive breast cancer management.

Figure 7: AI-assisted breast cancer screening

As a powerful tool for initial breast cancer screening, AI can detect early features of breast cancer in a timely manner and is expected to stop the "pink killer" in its early stages. However, it may be too early to deploy AI in clinical practice on a large scale, because changes in the clinical environment and in the algorithm itself can cause the AI's sensitivity and specificity to decline over time.

Professor Yan Chen also believes that "once AI enters clinical application, we must have a mechanism to continuously evaluate and monitor it." Research teams around the world are now evaluating AI's screening results and have reported satisfactory outcomes. In the future, with the help of efficient AI and a sound regulatory mechanism, various diseases will have "nowhere to hide" and our health will be better protected.

