Key messages
• Both the FIB-4 score and Forns index can be used in the initial phase of investigating whether someone has liver scarring.
• It is best to use the FIB-4 score to rule out stage 3 (severe fibrosis) or stage 4 scarring (cirrhosis).
• It is best to use the Forns index to diagnose people with stage 2 scarring (significant fibrosis).
Why is improving the diagnosis of liver scarring important?
Hepatitis C infection is a common cause of liver scarring (fibrosis). Untreated, liver scarring can progress to a severe form called liver cirrhosis, which is mostly irreversible and can cause the liver to shut down or develop cancer. Currently, the best test to diagnose liver fibrosis is liver biopsy, where liver tissue is taken with a needle and looked at under a microscope. However, liver biopsy is invasive, costly, painful, and carries some serious risks such as bleeding. Accurately diagnosing liver fibrosis through non-invasive tests such as the FIB-4 score and Forns index would benefit people and healthcare systems overall. However, their diagnostic accuracy (that is, how good they are at telling us which people have what stage of disease) in people with hepatitis C infection remains unclear.
What are the FIB-4 score and Forns index tests?
The FIB-4 score and Forns index are tests for diagnosing stages of liver fibrosis. They combine standard laboratory results with factors such as age to calculate a score that estimates the amount of scarring in the liver. Compared to liver biopsy, these are simple, inexpensive, widely available, relatively painless, and risk-free tests.
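To illustrate how such a score is calculated, here is a minimal Python sketch of the commonly published FIB-4 formula. The formula is not given in this summary, so it is an assumption brought in from outside this review. The function and parameter names are hypothetical, and the input values in the test are made up.

```python
import math


def fib4_score(age_years, ast_u_per_l, alt_u_per_l, platelets_10e9_per_l):
    """FIB-4 as commonly published: (age x AST) / (platelets x sqrt(ALT)).

    Assumed units: age in years, AST and ALT in U/L, platelets in 10^9/L.
    """
    values = (age_years, ast_u_per_l, alt_u_per_l, platelets_10e9_per_l)
    if min(values) <= 0:
        raise ValueError("all inputs must be positive")
    return (age_years * ast_u_per_l) / (
        platelets_10e9_per_l * math.sqrt(alt_u_per_l)
    )
```

The Forns index is built in a similar way from age and routine blood results, but its formula is not sketched here.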
Each test has two cut-offs: high/rule in and low/rule out. If a person's result is below the low cut-off, they are considered unlikely to have that stage of fibrosis. If a person's result is above the high cut-off, they are considered likely to have that stage of fibrosis. If someone's score is between the two cut-offs, the test is unhelpful because it can neither rule in nor rule out fibrosis. This is called the 'grey area'. Someone with a score in the 'grey area' should have further tests, such as a liver biopsy.
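The two-cut-off rule can be sketched in a few lines of Python. This is an illustration only: the function name and labels are hypothetical, and treating a score exactly equal to a cut-off as 'grey area' is an assumption, not something this summary specifies.

```python
def classify_score(score, low_cutoff, high_cutoff):
    """Sort a test result into 'rule out', 'grey area' or 'rule in'."""
    if low_cutoff >= high_cutoff:
        raise ValueError("low cut-off must be below high cut-off")
    if score < low_cutoff:
        return "rule out"
    if score > high_cutoff:
        return "rule in"
    # Assumption: scores on or between the two cut-offs are indeterminate.
    return "grey area"
```

For example, with the FIB-4 cut-offs reported in this review (1.45 and 3.25), `classify_score(2.0, 1.45, 3.25)` returns `"grey area"`.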
What did we want to find out?
We wanted to determine how well the FIB-4 score and Forns index can diagnose different liver fibrosis stages in people with chronic hepatitis C, compared to the results from liver biopsy.
What did we do?
We searched for studies that evaluated the diagnostic accuracy of the FIB-4 score or Forns index (or both) in people with hepatitis C. We combined the results from these studies.
What did we find?
We included 84 studies with a total of 107,583 participants. The studies were conducted in 28 countries, and were published between 2002 and 2021. We analysed results from 62 studies with 100,605 participants. We selected this portion of studies because they applied the two tests using comparable low and high cut-off values. This approach means we can be more confident about the results of our analysis.
By combining the studies' results for the FIB-4 score for diagnosing severe (stage 3) fibrosis, we can say the following for a hypothetical group of 1000 people:
• using the high or 'rule-in' cut-off, 144 people would correctly be diagnosed with severe fibrosis, whilst 48 people would wrongly be diagnosed with this stage of disease;
• using the low or 'rule-out' cut-off, 430 people would have severe fibrosis correctly ruled out, whilst 58 people with severe fibrosis would be missed;
• by using both cut-offs together, about one-third of people will need further tests ('grey area').
By combining the studies' results for the Forns index to diagnose significant (stage 2) fibrosis, we can say the following for a hypothetical group of 1000 people:
• using the high or 'rule-in' cut-off, 179 people would correctly be diagnosed with significant fibrosis, whilst 13 people would wrongly be diagnosed with this stage of disease;
• using the low or 'rule-out' cut-off, 218 people would have significant fibrosis correctly ruled out, whilst 83 people with significant fibrosis would be missed;
• by using both cut-offs together, about half of people will need further tests ('grey area').
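Counts per 1000 people like those in the two lists above come from combining a test's sensitivity and specificity with how common the condition is in the group. The Python sketch below shows the arithmetic. The sensitivity (41.4%) and specificity (92.6%) are the summary values this review reports for the FIB-4 high cut-off. The prevalence of 35% is a hypothetical value chosen for illustration because this summary does not state one, and the function name is also hypothetical.

```python
def counts_per_group(sensitivity, specificity, prevalence, group_size=1000):
    """Expected true/false positives and negatives in a hypothetical group."""
    with_condition = group_size * prevalence
    without_condition = group_size - with_condition
    true_positives = sensitivity * with_condition
    true_negatives = specificity * without_condition
    return {
        "true_positives": true_positives,
        "false_negatives": with_condition - true_positives,
        "true_negatives": true_negatives,
        "false_positives": without_condition - true_negatives,
    }


# FIB-4 high cut-off for severe fibrosis; prevalence is an assumed value.
high_cutoff = counts_per_group(0.414, 0.926, prevalence=0.35)
```

With these inputs, the sketch gives about 145 people correctly diagnosed and about 48 wrongly diagnosed, close to the figures in the first list. A different assumed prevalence would change every count.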
What are the limitations of the evidence?
Our confidence in the evidence was reduced because many of the studies may have overestimated the diagnostic accuracy of the tests. Also, the numbers described above are a summary based on pooling results from many studies. Because estimates of accuracy varied considerably across individual studies, we cannot be sure that applying the FIB-4 score or Forns index will always produce these results.
How up to date is this evidence?
The evidence is current to 13 April 2022.
Both the FIB-4 score and the Forns index may be considered for the initial assessment of people with CHC. The FIB-4 score's low cut-off (1.45) can be used to rule out people with at least severe fibrosis (≥ F3) and cirrhosis (F4). The Forns index's high cut-off (6.9) can be used to diagnose people with at least significant fibrosis (≥ F2). We judged most of the included studies to be at unclear or high risk of bias. The overall quality of the body of evidence was low or very low, and more high-quality studies are needed. Our review only captured data from referral centres. Therefore, when generalising our results to a primary care population, the probability of false positives will likely be higher and false negatives will likely be lower. More research is needed in sub-Saharan Africa, since these tests may be of value in such resource-poor settings.
The presence and severity of liver fibrosis are important prognostic variables when evaluating people with chronic hepatitis C (CHC). Although liver biopsy remains the reference standard, non-invasive serological markers, such as the four factors (FIB-4) score and the Forns index, can also be used to stage liver fibrosis.
To determine the diagnostic accuracy of the FIB-4 score and Forns index in staging liver fibrosis in people with CHC, using liver biopsy as the reference standard (primary objective). To compare the diagnostic accuracy of these tests for staging liver fibrosis in people with CHC and explore potential sources of heterogeneity (secondary objectives).
We used standard Cochrane search methods for diagnostic accuracy studies (search date: 13 April 2022).
We included diagnostic cross-sectional or case-control studies that evaluated the performance of the FIB-4 score, the Forns index, or both, against liver biopsy, in the assessment of liver fibrosis in participants with CHC. We imposed no language restrictions. We excluded studies in which: participants had causes of liver disease besides CHC; participants had successfully been treated for CHC; or the interval between the index test and liver biopsy exceeded six months.
Two review authors independently extracted data. We performed meta-analyses using the bivariate model and calculated summary estimates. We evaluated the performance of both tests for three target conditions: significant fibrosis or worse (METAVIR stage ≥ F2); severe fibrosis or worse (METAVIR stage ≥ F3); and cirrhosis (METAVIR stage F4). We restricted the meta-analysis to studies reporting cut-offs in a specified range (+/-0.15 for FIB-4; +/-0.3 for Forns index) around the original validated cut-offs (1.45 and 3.25 for FIB-4; 4.2 and 6.9 for Forns index). We calculated the percentage of people who would receive an indeterminate result (i.e. above the rule-out threshold but below the rule-in threshold) for each index test/cut-off/target condition combination.
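The cut-off restriction described above can be expressed as a simple check. The following Python sketch is illustrative only: the names are hypothetical, and treating the limits of each range as inclusive is an assumption. Decimal arithmetic is used so that a value such as 1.60 is compared exactly against 1.45 + 0.15.

```python
from decimal import Decimal

# Original validated cut-offs (low, high) and allowed ranges from the text.
VALIDATED_CUTOFFS = {
    "FIB-4": (Decimal("1.45"), Decimal("3.25")),
    "Forns": (Decimal("4.2"), Decimal("6.9")),
}
ALLOWED_RANGE = {"FIB-4": Decimal("0.15"), "Forns": Decimal("0.3")}


def matching_cutoff(test_name, reported_cutoff):
    """Return 'low' or 'high' if the reported cut-off is in range, else None."""
    reported = Decimal(str(reported_cutoff))
    labels = ("low", "high")
    for label, reference in zip(labels, VALIDATED_CUTOFFS[test_name]):
        # Assumption: the range limits themselves count as in range.
        if abs(reported - reference) <= ALLOWED_RANGE[test_name]:
            return label
    return None
```

Under these assumptions, a study reporting a FIB-4 cut-off of 1.6 would count towards the low cut-off analysis, whereas one reporting 2.0 would be left out of the meta-analysis.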
We included 84 studies (with a total of 107,583 participants) from 28 countries, published between 2002 and 2021, in the qualitative synthesis. Of the 84 studies, 82 (98%) were cross-sectional diagnostic accuracy studies with cohort-based sampling, and the remaining two (2%) were case-control studies. All studies were conducted in referral centres. Our main meta-analysis included 62 studies (100,605 participants).
Overall, two studies (2%) had low risk of bias, 23 studies (27%) had unclear risk of bias, and 59 studies (70%) had high risk of bias. We judged 13 studies (15%) to have applicability concerns regarding participant selection.
FIB-4 score
The FIB-4 score's low cut-off (1.45) is designed to rule out people with at least severe fibrosis (≥ F3). Thirty-nine study cohorts (86,907 participants) yielded a summary sensitivity of 81.1% (95% confidence interval (CI) 75.6% to 85.6%), specificity of 62.3% (95% CI 57.4% to 66.9%), and negative likelihood ratio (LR-) of 0.30 (95% CI 0.24 to 0.38).
The FIB-4 score's high cut-off (3.25) is designed to rule in people with at least severe fibrosis (≥ F3). Twenty-four study cohorts (81,350 participants) yielded a summary sensitivity of 41.4% (95% CI 33.0% to 50.4%), specificity of 92.6% (95% CI 89.5% to 94.9%), and positive likelihood ratio (LR+) of 5.6 (95% CI 4.4 to 7.1).
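The likelihood ratios follow from sensitivity and specificity by the standard definitions, LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. A minimal Python sketch, with hypothetical function names, applied to the summary point estimates above:

```python
def positive_likelihood_ratio(sensitivity, specificity):
    """LR+ = sensitivity / (1 - specificity)."""
    return sensitivity / (1.0 - specificity)


def negative_likelihood_ratio(sensitivity, specificity):
    """LR- = (1 - sensitivity) / specificity."""
    return (1.0 - sensitivity) / specificity


# FIB-4 summary estimates for severe fibrosis or worse (>= F3).
lr_pos_high_cutoff = positive_likelihood_ratio(0.414, 0.926)
lr_neg_low_cutoff = negative_likelihood_ratio(0.811, 0.623)
```

Because the summary likelihood ratios in a meta-analysis are derived from the fitted model, they may differ slightly from a ratio of the rounded point estimates.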
Using the FIB-4 score to assess severe fibrosis and applying both cut-offs together, 30.9% of people would obtain an indeterminate result, requiring further investigations. We report the summary accuracy estimates for the FIB-4 score when used for assessing significant fibrosis (≥ F2) and cirrhosis (F4) in the main review text.
Forns index
The Forns index's low cut-off (4.2) is designed to rule out people with at least significant fibrosis (≥ F2). Seventeen study cohorts (4354 participants) yielded a summary sensitivity of 84.7% (95% CI 77.9% to 89.7%), specificity of 47.9% (95% CI 38.6% to 57.3%), and LR- of 0.32 (95% CI 0.25 to 0.41).
The Forns index's high cut-off (6.9) is designed to rule in people with at least significant fibrosis (≥ F2). Twelve study cohorts (3245 participants) yielded a summary sensitivity of 34.1% (95% CI 26.4% to 42.8%), specificity of 97.3% (95% CI 92.9% to 99.0%), and LR+ of 12.5 (95% CI 5.7 to 27.2).
Using the Forns index to assess significant fibrosis and applying both cut-offs together, 44.8% of people would obtain an indeterminate result, requiring further investigations. We report the summary accuracy estimates for the Forns index when used for assessing severe fibrosis (≥ F3) and cirrhosis (F4) in the main text.
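A likelihood ratio can be turned into a post-test probability by converting the pre-test probability to odds, multiplying by the likelihood ratio, and converting back. The Python sketch below shows this standard calculation. The pre-test probability of 50% used in the example is hypothetical, since the appropriate value depends on the setting, and the function name is an assumption.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a pre-test probability with a likelihood ratio via odds."""
    if not 0.0 < pre_test_probability < 1.0:
        raise ValueError("pre-test probability must be between 0 and 1")
    pre_test_odds = pre_test_probability / (1.0 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1.0 + post_test_odds)


# Forns index summary likelihood ratios; 0.5 is an assumed pre-test value.
after_rule_in = post_test_probability(0.5, 12.5)
after_rule_out = post_test_probability(0.5, 0.32)
```

Under this assumed starting point, a result above the high cut-off raises the probability of significant fibrosis to roughly 93%, and a result below the low cut-off lowers it to roughly 24%.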
Comparing FIB-4 to Forns index
There were insufficient studies to meta-analyse the performance of the Forns index for diagnosing severe fibrosis and cirrhosis. Therefore, comparisons of the two tests' performance were not possible for these target conditions. For diagnosing significant fibrosis or worse, there were no statistically significant differences in their performance when using the high cut-off. When using the low/rule-out cut-off, the Forns index had slightly higher sensitivity than FIB-4 (relative sensitivity 1.12, 95% CI 1.00 to 1.25; P = 0.0573) but lower specificity (relative specificity 0.69, 95% CI 0.57 to 0.84; P = 0.002).