Positive predictive value—the role of prevalence
In today’s lesson, I’ll explain what positive and negative predictive values are, why they are useful, and how the positive predictive value is influenced by disease prevalence.
This is part 7 in our epidemiology review series. If you haven’t done so already, you should also watch the previous videos (see links below).
Here are the links to our previous epidemiology videos:
Video 1: Incidence and prevalence–what you really need to know
Video 2: Mortality rates–the nuts and bolts
Video 3: The importance of age confounding in the medical literature
Video 4: The role of adjustment in age confounding
Video 5: Test sensitivity and specificity made easy
Video 6: Sensitivity and specificity taken to the next level
Video Transcript
[00:00:00] In the previous lessons, we said that a clinician needs additional indicators of test validity, apart from sensitivity and specificity, in order to correctly interpret test results. Clinicians are usually confronted with test results that are either positive or negative. So, the question is: what does a positive or negative test result tell you? That's where the predictive value comes in. In this lesson, we're mainly going to cover the positive predictive value, or PPV. Let's take one of our previous examples again.
[00:00:30] We have a population of 1000: 200 are diseased, 800 are not. 160 of the 200 diseased correctly test positive, and 720 of the non-diseased correctly test negative. So, we have 160 true positives and 720 true negatives. Accordingly, we have 80 false positives and 40 false negatives. So, how should we interpret a positive test result? Well, here's how: 160 true positives, divided by all positives of
[00:01:00] 240, times 100, gives us 67%. This is the positive predictive value. In other words, this means that 67% of all individuals testing positive are truly diseased. And what if the test is negative? Well, as you might have guessed, there's also a negative predictive value, or NPV. And here's how it's calculated: 720 true negatives, divided by all 760 people who tested negative, times 100, equals 95%.
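If you want to check these numbers yourself, here is a minimal Python sketch of the same arithmetic; the variable names are my own, and the counts are taken from the example above.

```python
# Counts from the 2x2 table in the example above (population of 1000).
true_pos, false_pos = 160, 80    # 240 people test positive in total
true_neg, false_neg = 720, 40    # 760 people test negative in total

# Positive predictive value: share of test positives who are truly diseased.
ppv = true_pos / (true_pos + false_pos) * 100   # 160 / 240 ≈ 67%
# Negative predictive value: share of test negatives who are truly disease-free.
npv = true_neg / (true_neg + false_neg) * 100   # 720 / 760 ≈ 95%

print(f"PPV = {ppv:.0f}%")
print(f"NPV = {npv:.0f}%")
```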
[00:01:30] So, 95% of all people who test negative are truly disease-free. Pretty good. These numbers are actually much more helpful to a clinician for interpreting test results. Essentially, she wants to know what the probability of disease is, given a positive or negative test result. So, you might ask: is the predictive value always the same? I often see different specialists interpret the same lab values in a very different way. That's a great question. These specialists should indeed interpret the same values differently, because their
[00:02:00] populations differ in a major way, and that major way has to do with prevalence. Let's look at an example again. Let's pick a population of 1000 individuals and a test with a sensitivity and a specificity of 90%. Let's pick a prevalence of 5%. So, 50 have the disease and 950 don't. With a sensitivity and a specificity of 90%, the test will correctly pick up 90%, or 45, of the individuals with the disease and 90%, or 855, of the individuals without
[00:02:30] the disease. The remaining 5 will be falsely classified as negatives, and 95 as false positives. So, overall, we have 140 people who test positive. So, the positive predictive value is 45, divided by 140, times 100, equalling 32%. Very weak. The test in this population is pretty useless. But now, let's say the prevalence is 20%, so 200 individuals out of the 1000 are diseased and 800
[00:03:00] are non-diseased. The test will correctly identify 90% of the diseased and of the non-diseased, since the sensitivity and specificity are 90%. So: 180 true positives, 20 false negatives, 720 true negatives, and 80 false positives. That means 260 people test positive, 180 true positives and 80 false positives. So, the positive predictive value is 180, divided by 260, times 100, equalling 69%. In this population, the same test is much more useful: a PPV of 69% when the prevalence is 20%, but only 32% when the prevalence is 5%.
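To see the prevalence effect in one place, here is a short Python sketch that reproduces both scenarios; the function name and its default arguments are my own, while the sensitivity, specificity, and prevalences come from the lesson.

```python
# How PPV changes with prevalence when sensitivity and specificity
# are both fixed at 90%, in a population of 1000 (as in the lesson).
def ppv(prevalence, sensitivity=0.9, specificity=0.9, n=1000):
    diseased = n * prevalence
    non_diseased = n - diseased
    true_pos = sensitivity * diseased             # correctly detected cases
    false_pos = (1 - specificity) * non_diseased  # healthy people testing positive
    return true_pos / (true_pos + false_pos) * 100

for prev in (0.05, 0.20):
    print(f"prevalence {prev:.0%}: PPV ≈ {ppv(prev):.0f}%")
# prevalence 5%:  PPV ≈ 32%
# prevalence 20%: PPV ≈ 69%
```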
[00:03:30] Let me tell you a story that has to do with this concept. I once saw a very fit and athletic pilot who was dismissed from flying because a routine ECG showed a left bundle branch block. Every other test, including the echo, was unremarkable. The aviation authorities had issued a directive in which they argued that left bundle branch blocks
[00:04:00] were associated with an increased risk of latent or future cardiomyopathy, and that pilots with a left bundle branch block should therefore be dismissed from service. When I looked at the paper that served as the basis for this directive, I saw that it had been carried out in patients who were either hospitalized or seen in an outpatient cardiology clinic, so in a population with a much higher prevalence of cardiomyopathy than the private practice where the pilot got his ECG. So,
the directive is highly problematic and based on indicators that may not be transferable to the population of pilots, because the positive predictive value of a left bundle branch block for the diagnosis of future cardiomyopathy is probably too low to be useful in this population.