How accurate is artificial intelligence for diagnosing keratoconus?

Key messages

• The included studies suggest that artificial intelligence (AI) can identify keratoconus. This may lead to early detection and prevention of vision loss.
• Accuracy estimates were similar across different types of AI algorithms.
• We have little confidence in the evidence; there is a need for more research on this topic.

What is keratoconus and why is (early) diagnosis so important?

Keratoconus is a disease of the cornea (the clear window at the front of the eye) that affects people between the ages of 10 and 40 years. In those affected, the cornea weakens and thins over the years, gradually bulging into the typical cone-like shape, which leads to reduced vision. Glasses can resolve this problem in the early stages of keratoconus, but no longer offer a satisfactory solution as the disease becomes more severe. Early diagnosis is imperative to ensure follow-up and treatment and thus prevent loss of vision.

The diagnosis of keratoconus is based on an eye exam (measuring the eye and evaluating the cornea with a vertical beam of light and a microscope) and imaging (computer-assisted techniques that create three-dimensional pictures or maps of the cornea). Interpreting the images can be challenging, especially in primary eye care settings and in the early stages of the disease. Not recognizing keratoconus could lead to worsening of the disease and worsening of vision. For example, people at risk of developing keratoconus who undergo refractive surgery (surgery to correct their vision) could end up with worse vision.

What is artificial intelligence and how can it help detect keratoconus?

Detecting keratoconus based on images is challenging, especially for untrained clinicians. AI gives machines the ability to adapt, reason, and find solutions. Algorithms can be developed and trained to analyse images of the cornea and recognize keratoconus. These tests could help ophthalmologists, optometrists, and other eye care professionals to make a diagnosis and refer people with keratoconus to cornea specialists in time to preserve their vision. There are many different types of algorithms, but they all distinguish between healthy eyes and keratoconus based on images of the cornea.
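To make this concrete, below is a minimal, hypothetical sketch in Python of the general approach: a support vector machine (one of the algorithm families covered in this review) trained to separate keratoconus from healthy corneas. The feature names and values are synthetic stand-ins, not data or methods from any included study.

```python
# Hypothetical sketch: a support vector machine trained on synthetic
# topography-style parameters. The three feature columns are illustrative
# (e.g. steepest keratometry, thinnest corneal thickness, asymmetry index).
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n = 500
healthy = rng.normal([44.0, 540.0, 0.5], [1.5, 30.0, 0.3], size=(n, 3))
keratoconus = rng.normal([52.0, 460.0, 2.5], [3.0, 40.0, 0.8], size=(n, 3))
X = np.vstack([healthy, keratoconus])
y = np.array([0] * n + [1] * n)  # 0 = healthy, 1 = keratoconus

X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, random_state=0
)
model = make_pipeline(StandardScaler(), SVC())
model.fit(X_train, y_train)
print(f"held-out accuracy: {model.score(X_test, y_test):.3f}")
```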

What did we want to find out?

The aim of the review was to find out whether AI can correctly diagnose keratoconus in people seeking refractive surgery and people whose vision can no longer be corrected fully with glasses.

What did we do?

We searched for studies that investigated the accuracy of AI for diagnosing keratoconus, preferably in people seeking refractive surgery or people whose vision can no longer be corrected fully with glasses. We compared and summarized the results of the studies to calculate two measures of accuracy: sensitivity (the ability of AI to correctly identify keratoconus) and specificity (the ability of AI to correctly rule out keratoconus). The closer sensitivity and specificity were to 100%, the better the algorithm.
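As a small worked example of these two measures, the sketch below computes sensitivity and specificity from an invented 2x2 confusion matrix; the counts are purely illustrative.

```python
# Sensitivity and specificity from a 2x2 confusion matrix (counts invented).
tp, fn = 95, 5    # eyes with keratoconus: correctly / incorrectly classified
tn, fp = 180, 20  # eyes without keratoconus: correctly / incorrectly classified

sensitivity = tp / (tp + fn)  # how well the test identifies keratoconus
specificity = tn / (tn + fp)  # how well the test rules out keratoconus
print(f"sensitivity = {sensitivity:.1%}, specificity = {specificity:.1%}")
# -> sensitivity = 95.0%, specificity = 90.0%
```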

What did we find?

We found 63 studies that used three different units of analysis (eyes, participants, and images) to assess the accuracy of AI for detecting keratoconus: 44 studies analysed 23,771 eyes, four studies analysed 3843 participants, and 15 studies analysed 38,832 images.

The accuracy of AI for detecting manifest keratoconus (keratoconus that can be detected through a clinical examination) was high. If 1000 people were tested, of whom 30 had keratoconus, all 30 would be correctly referred to a cornea specialist and none would be missed. Of the remaining 970 people (without keratoconus), only 17 would be wrongly referred. These people would receive additional non-invasive tests to verify whether they had keratoconus.

The accuracy of AI for detecting early keratoconus was lower. If 1000 people were tested, of whom 10 had keratoconus, nine would be correctly referred to a cornea specialist and one would be missed. If this person received refractive surgery, it would aggravate the disease and worsen their vision. Of the remaining 990 people (without keratoconus), 941 would be reassured that they did not have the disease and would receive refractive surgery or glasses; 49 people would be wrongly referred.
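The arithmetic behind both worked examples can be sketched from the review's summary estimates, assuming 30 and 10 true cases per 1000 as the examples imply. The counts quoted above are rounded by the review; the early-keratoconus figures correspond to a specificity of roughly 95%, within the reported confidence interval.

```python
# Expected counts when a test with a given sensitivity and specificity is
# applied to 1000 people; prevalences are inferred from the examples above.
def expected_counts(n_tested, n_diseased, sensitivity, specificity):
    n_healthy = n_tested - n_diseased
    referred_correctly = n_diseased * sensitivity
    missed = n_diseased - referred_correctly
    reassured = n_healthy * specificity
    referred_wrongly = n_healthy - reassured
    return referred_correctly, missed, reassured, referred_wrongly

# Manifest keratoconus: 30 cases per 1000, sensitivity 98.6%, specificity 98.3%.
print(expected_counts(1000, 30, 0.986, 0.983))  # ~ (29.6, 0.4, 953.5, 16.5)
# Early keratoconus: 10 cases per 1000, sensitivity 90.0%, specificity 95.5%.
print(expected_counts(1000, 10, 0.900, 0.955))  # ~ (9.0, 1.0, 945.5, 44.6)
```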

The evidence suggests that AI may be good at detecting manifest keratoconus but may not be ideal for screening early keratoconus.

What are the limitations of the evidence?

We have little confidence in the evidence on the accuracy of AI for detecting manifest keratoconus, and we have little to no confidence in the evidence related to early keratoconus. There were problems with how the studies were conducted, which may result in AI appearing more accurate than it really is.

How up-to-date is this evidence?

The evidence is up to date to 29 November 2022.

Authors' conclusions: 

AI appears to be a promising triage tool in ophthalmologic practice for diagnosing keratoconus. Test accuracy was very high for manifest keratoconus and slightly lower for subclinical keratoconus, indicating a higher chance of missing a diagnosis in people without clinical signs. This could lead to progression of keratoconus or an erroneous indication for refractive surgery, which would worsen the disease.

We are unable to draw clear and reliable conclusions due to the high risk of bias, the unexplained heterogeneity of the results, and high applicability concerns, all of which reduced our confidence in the evidence.

Greater standardization in future research would increase the quality of studies and improve comparability between studies.

Abstract
Background: 

Keratoconus remains difficult to diagnose, especially in the early stages. It is a progressive disorder of the cornea that starts at a young age. Diagnosis is based on clinical examination and corneal imaging; though in the early stages, when there are no clinical signs, diagnosis depends on the interpretation of corneal imaging (e.g. topography and tomography) by trained cornea specialists. Using artificial intelligence (AI) to analyse the corneal images and detect cases of keratoconus could help prevent visual acuity loss and even corneal transplantation. However, a missed diagnosis in people seeking refractive surgery could lead to weakening of the cornea and keratoconus-like ectasia. There is a need for a reliable overview of the accuracy of AI for detecting keratoconus and the applicability of this automated method to the clinical setting.

Objectives: 

To assess the diagnostic accuracy of artificial intelligence (AI) algorithms for detecting keratoconus in people presenting with refractive errors, especially those whose vision can no longer be fully corrected with glasses, those seeking corneal refractive surgery, and those suspected of having keratoconus. AI could help ophthalmologists, optometrists, and other eye care professionals to make decisions on referral to cornea specialists.

Secondary objectives

To assess the following potential causes of heterogeneity in diagnostic performance across studies:

• Different AI algorithms (e.g. neural networks, decision trees, support vector machines)
• Index test methodology (preprocessing techniques, core AI method, and postprocessing techniques)
• Sources of input to train algorithms (topography and tomography images from Placido disc system, Scheimpflug system, slit-scanning system, or optical coherence tomography (OCT); number of training and testing cases/images; label/endpoint variable used for training)
• Study setting
• Study design
• Ethnicity, or geographic area as its proxy
• Different index test positivity criteria provided by the topography or tomography device
• Reference standard (topography or tomography, interpreted by one or two cornea specialists)
• Definition of keratoconus
• Mean age of participants
• Recruitment of participants
• Severity of keratoconus (clinically manifest or subclinical)

Search strategy: 

We searched CENTRAL (which contains the Cochrane Eyes and Vision Trials Register), Ovid MEDLINE, Ovid Embase, OpenGrey, the ISRCTN registry, ClinicalTrials.gov, and the World Health Organization International Clinical Trials Registry Platform (WHO ICTRP). There were no date or language restrictions in the electronic searches for trials. We last searched the electronic databases on 29 November 2022.

Selection criteria: 

We included cross-sectional and diagnostic case-control studies that investigated AI for the diagnosis of keratoconus using topography, tomography, or both. We included studies that diagnosed manifest keratoconus, subclinical keratoconus, or both. The reference standard was the interpretation of topography or tomography images by at least two cornea specialists.

Data collection and analysis: 

Two review authors independently extracted the study data and assessed the quality of studies using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2) tool. When an article contained multiple AI algorithms, we selected the algorithm with the highest Youden's index. We assessed the certainty of evidence using the GRADE approach.
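As an illustration of that selection rule: Youden's index is J = sensitivity + specificity - 1, and the algorithm with the highest J is kept. The algorithm names and accuracy values below are hypothetical.

```python
# Youden's index J = sensitivity + specificity - 1 balances the two measures;
# when an article reported several algorithms, the one with the highest J
# was selected. Names and values below are hypothetical.
algorithms = {
    "neural network":         (0.95, 0.94),  # (sensitivity, specificity)
    "decision tree":          (0.92, 0.90),
    "support vector machine": (0.96, 0.95),
}

def youden(sensitivity, specificity):
    return sensitivity + specificity - 1.0

best = max(algorithms, key=lambda name: youden(*algorithms[name]))
print(f"selected: {best}, J = {youden(*algorithms[best]):.2f}")
# -> selected: support vector machine, J = 0.91
```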

Main results: 

We included 63 studies, published between 1994 and 2022, that developed and investigated the accuracy of AI for the diagnosis of keratoconus. There were three different units of analysis in the studies: eyes, participants, and images. Forty-four studies analysed 23,771 eyes, four studies analysed 3843 participants, and 15 studies analysed 38,832 images.

Fifty-four articles evaluated the detection of manifest keratoconus, defined as a cornea that showed any clinical sign of keratoconus. The accuracy of AI seemed almost perfect, with a summary sensitivity of 98.6% (95% confidence interval (CI) 97.6% to 99.1%) and a summary specificity of 98.3% (95% CI 97.4% to 98.9%). However, accuracy varied across studies, and the certainty of the evidence was low.

Twenty-eight articles evaluated the detection of subclinical keratoconus, although the definition of subclinical varied. We grouped subclinical keratoconus, forme fruste, and very asymmetrical eyes together. The tests showed good accuracy, with a summary sensitivity of 90.0% (95% CI 84.5% to 93.8%) and a summary specificity of 95.5% (95% CI 91.9% to 97.5%). However, the certainty of the evidence was very low for sensitivity and low for specificity.

In both groups, we graded most studies at high risk of bias, with high applicability concerns, in the domain of patient selection, since most were case-control studies. Moreover, we graded the certainty of evidence as low to very low due to selection bias, inconsistency, and imprecision.

We could not explain the heterogeneity between the studies. The sensitivity analyses based on study design, AI algorithm, imaging technique (topography versus tomography), and data source (parameters versus images) showed no differences in the results.
