Lung Cancer Screening
Lung cancer is the leading cause of cancer-related deaths in the United States.1 It is estimated that 236,740 new cases of lung and bronchial cancer will have been diagnosed in the United States in 2022 (117,910 in men and 118,830 in women) with 130,180 anticipated deaths (68,820 in men and 61,360 in women).1 Earlier diagnosis has a significant impact on clinical outcome, as 5-year survival increases from 6%-9.5% for distant-stage disease to 33%-44% for regional disease and 60%-75% for localized disease,1 although there have been recent improvements in non-small cell lung cancer outcomes across stages with the advent of targeted and molecular therapies.2 Data from large randomized controlled trials such as the National Lung Screening Trial (NLST) and the Dutch-Belgian Randomized Lung Cancer Screening Trial (Nederlands–Leuvens Longkanker Screenings Onderzoek [NELSON]) support low-dose CT (LDCT) screening in high-risk individuals based on smoking criteria and age.3-5 A noticeable decline in advanced-stage lung cancer diagnosis with a corresponding increase in incidence of localized stage disease was observed between 2013 and 2018 following recommendation for lung cancer screening by the United States Preventive Services Task Force (USPSTF).6 Owing to the increased strength of evidence supporting annual screening with LDCT for high-risk individuals, the USPSTF issued updated screening guidelines in March of 2021, expanding eligibility to adults aged 50 to 80 years who have a 20 pack-year smoking history and currently smoke or have quit within the past 15 years.6 The Centers for Medicare & Medicaid Services (CMS) covers annual LDCT screening for appropriate Medicare beneficiaries with significant smoking history up to 77 years of age if they participate in shared decision-making before their first screening LDCT.7 LDCT screening is also recommended by the 2021 CHEST Guideline and Expert Panel Report on Screening for Lung Cancer.8
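The 2021 USPSTF eligibility criteria quoted above reduce to three checks (age, cumulative smoking exposure, and recency of smoking). The sketch below is illustrative only and is not a clinical decision tool. The function and parameter names are hypothetical, the pack-year calculation uses the conventional definition (average packs per day multiplied by years smoked), and "a 20 pack-year smoking history" is assumed to mean at least 20 pack-years.

```python
from typing import Optional


def pack_years(packs_per_day: float, years_smoked: float) -> float:
    """Conventional definition: average packs per day x years of smoking."""
    return packs_per_day * years_smoked


def meets_uspstf_2021(age: int,
                      pack_year_history: float,
                      currently_smokes: bool,
                      years_since_quit: Optional[float] = None) -> bool:
    """Check the three 2021 USPSTF criteria as quoted in the text.

    Assumption: the thresholds are inclusive (age 50 to 80, >= 20 pack-years,
    quit <= 15 years ago). Hypothetical helper, not an official algorithm.
    """
    in_age_range = 50 <= age <= 80
    enough_exposure = pack_year_history >= 20
    recent_smoker = currently_smokes or (
        years_since_quit is not None and years_since_quit <= 15
    )
    return in_age_range and enough_exposure and recent_smoker


if __name__ == "__main__":
    # One pack per day for 30 years is 30 pack-years.
    print(meets_uspstf_2021(55, pack_years(1.0, 30), currently_smokes=True))
    # Quit 20 years ago: outside the 15-year window.
    print(meets_uspstf_2021(60, 25, currently_smokes=False, years_since_quit=20))
```

Payer coverage rules can differ from the USPSTF criteria (for example, the CMS age limit described above), so any real implementation would need the thresholds to be configurable.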
Although there is consensus on the value of LDCT screening, uncertainty remains about the appropriate duration of screening and age of screening cessation, with NCCN recommending annual screening until the patient is no longer a candidate for definitive treatment.9 There are also potential risks associated with LDCT screening, including false negative as well as false positive results that can lead to unnecessary tests and invasive procedures, complications from the diagnostic workup, overdiagnosis of incidental findings, short-term anxiety due to indeterminate results, and radiation exposure.9 As a result of increased implementation of screening guidelines, the incidence of nodules detected on CT continues to rise, and an estimated 1.5 million nodules are detected each year in the United States.10 Most lung nodules found on LDCT are benign,3,4 such that lung cancer prevalence in the screening setting is 0.8%-2.2% and approximately 0.11% in nodules incidentally detected when imaging is performed for other reasons.10 An LDCT screen is defined as “positive” if the size and morphologic features of the detected nodule result in a recommendation for follow-up testing in addition to recommended annual screening based on published guidelines.8 Approximately 7% of patients with false positive results go on to an invasive procedure, most often bronchoscopy.9,11,12
Indeterminate Pulmonary Nodules (IPN) Risk Assessment and Management Guidelines
The American College of Chest Physicians (ACCP) Evidence-Based Clinical Practice Guidelines define an indeterminate nodule as any nodule that is not calcified in a benign pattern and is lacking clearly benign features such as intranodular fat indicative of hamartoma or a feeding artery and vein typical for arteriovenous malformation.13 Factors such as nodule morphology, size, attenuation, and clinical context confer varying risks of malignancy, and the probability of malignancy (low, intermediate, high) determines further management, which typically consists of continued CT surveillance, PET/CT, or tissue sampling by percutaneous needle biopsy, bronchoscopic biopsy, or surgical biopsy.14,9 Nodule attenuation (solid, part-solid, ground glass opacity), size (diameter or volume), and rate of growth factor strongly into multiple management guidelines.13,15,16,17,9 NCCN guidelines delineate cut-off values for size, follow-up interval, and intervention depending on nodule appearance as solid, part-solid, or non-solid on initial LDCT screen and subsequently consider level of clinical suspicion of lung cancer.9 The Fleischner Society Guidelines for nodules discovered outside of LDCT cancer screening also propose follow-up methods depending on whether a nodule is solid or sub-solid. Additional factors considered for management include the number of nodules and the size or volume of each nodule, along with clinical risk factors.15 The ACCP outlines a similar methodology for indeterminate nodules discovered in the screening setting as well as those detected incidentally, wherein the follow-up strategy is dependent on nodule appearance, size, and risk or probability of malignancy.13
Clinical risk factors associated with a higher risk of malignancy include cigarette smoking, age, occupational and environmental exposures, pulmonary fibrosis, chronic obstructive pulmonary disease (COPD), personal history of lung cancer, and female sex.14 ACCP Practice Guidelines recommend that clinicians estimate the pretest probability of malignancy either qualitatively by using their clinical judgement and/or quantitatively by using a validated model. Pretest probability of malignancy enables selection and subsequent interpretation of diagnostic tests as summarized in Table 1 below.13 Several quantitative risk models exist to facilitate decision-making by incorporating clinical and radiologic factors into a single risk score that summarizes likelihood of malignancy.18-20 These models can be utilized alongside clinical judgement and recommended guidelines to guide the next management step.14 The Herder model incorporates PET avidity into the Mayo Clinic Model to improve diagnostic value,21 whereas the TREAT model was designed for use in the surgical clinic and incorporates hemoptysis and PET avidity, assuming a higher prevalence of malignancy.22 Risk calculator performance varies depending on the lung cancer prevalence and clinical characteristics of the original study population.23 Therefore, risk estimates should be interpreted with caution, taking the patient’s clinical context and factors such as geographical location into consideration.
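To make concrete how a quantitative model can combine clinical and radiologic factors into a single probability, the sketch below applies a generic logistic function to a weighted sum of features. It is a structural illustration only: the feature names, intercept, and weights are hypothetical and are not the coefficients of the Mayo Clinic, Herder, TREAT, or any other published model, and the output must not be used for patient care.

```python
import math
from typing import Mapping


def logistic_risk(intercept: float,
                  weights: Mapping[str, float],
                  features: Mapping[str, float]) -> float:
    """Return a probability from a logistic model: 1 / (1 + exp(-score))."""
    score = intercept + sum(w * features[name] for name, w in weights.items())
    return 1.0 / (1.0 + math.exp(-score))


# HYPOTHETICAL numbers for illustration only. They are NOT taken from any
# published nodule risk calculator.
DEMO_INTERCEPT = -6.0
DEMO_WEIGHTS = {
    "age_years": 0.03,
    "ever_smoker": 0.7,          # 1 if current or former smoker, else 0
    "nodule_diameter_mm": 0.1,
    "spiculated_margin": 1.0,    # 1 if spiculated, else 0
    "upper_lobe": 0.7,           # 1 if upper-lobe location, else 0
}

if __name__ == "__main__":
    smaller_smooth = {"age_years": 55, "ever_smoker": 1, "nodule_diameter_mm": 8,
                      "spiculated_margin": 0, "upper_lobe": 0}
    larger_spiculated = {"age_years": 70, "ever_smoker": 1, "nodule_diameter_mm": 20,
                         "spiculated_margin": 1, "upper_lobe": 1}
    for label, feats in [("smaller, smooth", smaller_smooth),
                         ("larger, spiculated", larger_spiculated)]:
        print(label, round(logistic_risk(DEMO_INTERCEPT, DEMO_WEIGHTS, feats), 3))
```

Because the intercept of such a model absorbs the cancer prevalence of the population in which it was fitted, the same weights can over- or underestimate risk in a different setting, which is one way to read the caution above about calculator performance.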
Recent studies suggest that physicians generally have good intuition regarding risk assessment of IPNs and while many do not document a quantitative prediction of malignancy in advance of tissue diagnosis, qualitative risk statements generally correlate with quantitative risk.25 NCCN Guidelines for Lung Cancer Screening encourage a multidisciplinary approach to evaluation for the suspicion of lung cancer including input from thoracic radiology, pulmonary medicine, and thoracic surgery, and may include the use of a lung nodule risk calculator to inform probability assessment. NCCN cautions that risk calculator use is not a substitute for multidisciplinary pulmonary management as geographic and other factors can influence calculator accuracy.9
Table 1. ACCP: Assessment of the Probability of Malignancy.13
| Assessment Criteria | Probability of Malignancy: Low (<5%) | Probability of Malignancy: Intermediate (5%-65%) | Probability of Malignancy: High (>65%) |
| --- | --- | --- | --- |
| Clinical factors alone (determined by clinical judgement and/or use of a validated model) | Young, less smoking, no prior cancer, smaller nodule size, regular margins, and/or non-upper-lobe location | Mixture of low and high probability features | Older, heavy smoking, prior cancer, larger size, irregular/spiculated margins, and/or upper-lobe location |
| FDG-PET scan results | Low-moderate clinical probability and low FDG-PET activity | Weak or moderate FDG-PET scan activity | Intensely hypermetabolic nodule |
| Nonsurgical biopsy results (bronchoscopy or TTNA) | Specific benign diagnosis | Nondiagnostic | Suspicious for malignancy |
| CT scan surveillance | Resolution or near-complete resolution, progressive or persistent decrease in size, or no growth over ≥ 2 years (solid nodule) or ≥ 3-5 years (subsolid nodule) | NA | Clear evidence of growth |
FDG = fluorodeoxyglucose; NA = not applicable; TTNA = transthoracic needle aspiration
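The probability bands in Table 1 can be expressed as a simple threshold rule. The sketch below is a hypothetical helper, not part of the ACCP guideline; it assumes the probability is supplied as a fraction between 0 and 1 and that the boundary values of 5% and 65% fall in the intermediate band, as the table headings imply.

```python
def accp_probability_category(probability: float) -> str:
    """Map a pretest probability of malignancy (0 to 1) to a Table 1 band.

    Bands as printed in Table 1: low (<5%), intermediate (5%-65%), high (>65%).
    """
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    if probability < 0.05:
        return "low"
    if probability <= 0.65:
        return "intermediate"
    return "high"


if __name__ == "__main__":
    for p in (0.03, 0.05, 0.40, 0.65, 0.80):
        print(f"{p:.0%} -> {accp_probability_category(p)}")
```

Other sources use different cut-offs, so the thresholds would need to be adjusted if a different guideline or study definition is being applied.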
Guidelines largely agree on management of low and high risk IPNs.9,13-17 Nodules with low probability of malignancy should be monitored with CT surveillance, whereas those with high probability of malignancy merit more aggressive evaluation and consideration for surgical resection.26 However, there is heterogeneity in management recommendations of IPNs with intermediate malignancy risk.14 Additional evaluation options for intermediate nodules include short-term interval CT, fluorodeoxyglucose (FDG) PET, non-surgical or surgical biopsy depending on the patient’s risk of malignancy, health status, preference, and clinical setting.14 Studies have also shown that many physicians do not follow management guidelines, introducing further heterogeneity in management and potential for suboptimal care with inadequate work-up, prolonged surveillance or potentially harmful unnecessary procedures.24,27 For example, a study of pulmonary nodule evaluation in the usual care setting outside of a lung cancer screening study or dedicated pulmonary nodule clinic found 18% of patients receiving over-evaluation consisting of prolonged surveillance and biopsy and 27% receiving less intense evaluation than recommended by guidelines, with radiologist recommendation as the strongest predictor of intensity of evaluation.27 However, standardized reporting of radiologic findings has helped to increase guideline compliance.27,28 Additional improvement to risk stratification beyond the available clinical and radiologic features is needed to decrease unnecessary biopsy referrals and costs, especially in the intermediate risk group that is most likely to undergo further testing for benign disease.
Bronchoscopy
Bronchoscopy is a common approach to tissue sampling and is frequently used in patients with IPNs and an intermediate risk of malignancy, for whom guidelines show heterogeneity in management recommendations ranging from CT surveillance to non-surgical biopsy. Bronchoscopy is a relatively safe procedure, with less than 1% of cases complicated by pneumothorax.29 Approximately 500,000 bronchoscopies are performed annually in the United States, of which approximately half are done for lung cancer evaluation.30 Nevertheless, bronchoscopy has lower sensitivity for smaller and peripherally located nodules, and up to 40% of bronchoscopies lead to a non-diagnostic outcome wherein the clinician cannot obtain a clinically actionable benign or malignant diagnosis. Physicians are subsequently faced with the dilemma of whether to monitor such patients with CT surveillance or proceed to a surgical lung biopsy or transthoracic needle biopsy, which are associated with a greater risk of morbidity (such as pneumothorax or hemorrhage) or mortality.31
Biomarkers to Improve Diagnostic Yield of Bronchoscopy
Biomarkers can serve as an adjunct to bronchoscopy by resolving equivocal cytology32 or improving risk-stratification to inform further patient management. Recently published examples include detection of cancer-associated deoxyribonucleic acid (DNA) methylation and gene mutations in bronchial washings performed during fiberoptic bronchoscopy for diagnosis of lung cancer,33 development of an exploratory bronchoalveolar lavage (BAL) genomic classifier aimed at detecting tumor-derived mutations by targeted sequencing of BAL cell-free DNA (cfDNA),34 and validation of a multiple logistic regression model including methylated tumor DNA from the homeobox A9 (HOXA9) gene in bronchial lavage along with clinical factors such as age and smoking status as a supplementary diagnostic tool for lung cancer detection,35 among others. While many approaches are in the research and development pipeline, few have been commercialized and incorporated into clinical practice36 as further studies are required to determine clinical validity and utility. This evidence summary will focus on in-depth review of commercialized molecular assays that aim to improve risk stratification of indeterminate pulmonary nodules following non-diagnostic bronchoscopy, acknowledging that additional molecular assays exist in various stages of development.
First Generation Gene Expression Profiling (GEP) Test: Percepta Bronchial Genomic Classifier (BGC)
The Percepta BGC is a messenger ribonucleic acid (mRNA) assay performed on cytology brushings of bronchial epithelial cells collected during bronchoscopy from current or former smokers undergoing evaluation for suspected lung cancer and is performed in the event of non-diagnostic bronchoscopy. The test uses patient age along with microarray technology to measure the expression of 23 lung cancer associated genes to improve the diagnostic yield of bronchoscopy. The assay was designed as a high sensitivity “rule-out” test for patients with a non-diagnostic bronchoscopy and intermediate risk of malignancy, in order to re-classify intermediate pre-test cancer risk to low post-test risk with a nondiagnostic bronchoscopy and negative classifier result.31
BGC classifier development was rooted in preceding studies that demonstrated a molecular field of injury consisting of gene expression changes in airway epithelial cells as a function of cigarette smoking.37,38 The classifier was developed in current or former smokers undergoing bronchoscopy for suspected lung cancer across 28 centers in the United States, Canada, and Ireland in two independent, prospective, multicenter observational studies, known as the Airway Epithelial Gene Expression in the Diagnosis of Lung Cancer (AEGIS-1 and AEGIS-2) Trials.31 Exclusion criteria consisted of age <21 years, smoking <100 cigarettes, concurrent cancer or history of lung cancer. The prevalence of lung cancer in AEGIS-1 was 74% and 78% in AEGIS-2. The median age of patients in AEGIS-1 was 62 years (interquartile range 55-70) and 64 years (interquartile range 57-71) in AEGIS-2.31 Both trials included a diverse group of patients, with nearly 20% being African American.31 A set of patients from AEGIS-1 (223 patients diagnosed with lung cancer and 76 patients diagnosed with benign disease) was randomly selected for classifier training39 whereas 298 patients from AEGIS-1 and 341 patients from AEGIS-2 were utilized in classifier validation.31
In the validation set, 43% of bronchoscopies were non-diagnostic for lung cancer (95% confidence interval [CI], 39 to 46), including for 25% of patients in whom lung cancer was ultimately diagnosed (95% CI, 21 to 29). A “diagnostic” bronchoscopy was defined as one that yielded a confirmed lung cancer diagnosis. Patients’ pre-bronchoscopy risk of malignancy was assessed by each treating physician based on their subjective assessment and the results were divided into categories of low (<10%), intermediate (10-60%), and high (>60%) probability of malignancy with a corresponding malignancy prevalence of 5%, 41%, and 95%, respectively. Physicians and patients were not informed of classifier results. Bronchoscopy sensitivity for lung cancer detection was 74% (95% CI, 68 to 79) in AEGIS-1 and 76% (95% CI, 71 to 81) in AEGIS-2. In AEGIS-1, the area under the receiver-operating-characteristic curve (AUC) for the classifier was 0.78 (95% CI, 0.73 to 0.83) with a sensitivity of 88% (95% CI, 83 to 92) and a specificity of 47% (95% CI, 37 to 58). The AUC of the classifier in AEGIS-2 was 0.74 (95% CI, 0.68 to 0.80), with a sensitivity of 89% (95% CI, 84 to 92), and a specificity of 47% (95% CI, 36 to 59). There was no statistically significant difference in classifier performance in AEGIS-1 vs AEGIS-2. Combining bronchoscopy with the classifier led to an increased sensitivity of 96% (95% CI, 93 to 98) in AEGIS-1 and 98% (95% CI, 96 to 99) in AEGIS-2, independent of lesion size and location. The negative predictive value (NPV) of the classifier in 101 patients with an intermediate pretest probability of cancer and nondiagnostic bronchoscopy was 91% (95% CI, 75 to 98),31 whereas in 426 patients with a high pretest probability of cancer, the NPV was 38% (95% CI, 15 to 65).31
Overall, 487 patients were diagnosed with lung cancer and 120 had non-diagnostic bronchoscopy, of which 13 were classifier negative (false negatives). Three of these were in the intermediate risk category and 10 had a high pretest risk of malignancy. Patients with false negative results should not experience significant delay to diagnosis as patients with a nondiagnostic bronchoscopy and negative classifier score should undergo CT surveillance as standard of care when an immediate invasive strategy is not utilized.31 The positive predictive value (PPV) in the intermediate risk population was 40% (95% CI 27-55) and 84% (95% CI 75-91) in the high pretest probability group. Given the modest PPV in the intermediate risk group (40%), the authors concluded that a positive classifier result does not warrant decision alteration between invasive strategy and imaging-surveillance.31 A negative classifier result has potential clinical utility in patients with a nondiagnostic bronchoscopy and an intermediate probability of cancer, as a negative classifier score may warrant a more conservative diagnostic strategy involving imaging surveillance instead of invasive procedures as the next step in patient management.31 Analytical performance of the bronchial genomic classifier was reported in Hu et al40.
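The pooled counts in the paragraph above can be cross-checked against the reported sensitivities with simple arithmetic. The sketch below assumes that the 120 non-diagnostic bronchoscopies are a subset of the 487 patients diagnosed with lung cancer and that the 13 false negatives are a subset of those 120. The variable names are illustrative, and the results are pooled across both cohorts, so they are expected to fall between, not exactly match, the per-cohort figures.

```python
# Counts quoted in the text (pooled across AEGIS-1 and AEGIS-2).
cancers_total = 487                 # patients ultimately diagnosed with lung cancer
cancers_nondiagnostic_bronch = 120  # of those, bronchoscopy was non-diagnostic
classifier_false_negatives = 13     # of those, classifier result was negative

# Cancers found by bronchoscopy alone.
found_by_bronch = cancers_total - cancers_nondiagnostic_bronch
bronch_sensitivity = found_by_bronch / cancers_total

# Cancers missed by bronchoscopy but flagged by the classifier.
found_by_classifier = cancers_nondiagnostic_bronch - classifier_false_negatives
classifier_sensitivity_after_nondx = found_by_classifier / cancers_nondiagnostic_bronch

# Combined pathway: detected by either bronchoscopy or the classifier.
combined_sensitivity = (found_by_bronch + found_by_classifier) / cancers_total

if __name__ == "__main__":
    print(f"Bronchoscopy alone:           {bronch_sensitivity:.1%}")
    print(f"Classifier after non-dx scope: {classifier_sensitivity_after_nondx:.1%}")
    print(f"Bronchoscopy plus classifier:  {combined_sensitivity:.1%}")
```

Under these assumptions the pooled values come to roughly 75%, 89%, and 97%, which sit between the per-cohort sensitivities reported for AEGIS-1 and AEGIS-2.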
Clinical utility was evaluated by examining potential procedure reduction in the AEGIS trials as a result of classifier use41 and through a survey of pulmonary physicians presented with clinical cases.42 The Percepta Registry Cohort was established as a multicenter prospective registry including academic and community medical centers aimed at observing physician management of patients with pulmonary nodules following nondiagnostic bronchoscopy in a setting with and without classifier results.43 Lee et al43 found that 34.3% of patients with low or intermediate risk of malignancy and a clinician-designated plan for a subsequent invasive procedure had a reduction in malignancy risk and 73.9% of these patients had a subsequent change in management plan from invasive procedure to surveillance with the majority avoiding a procedure up to 12 months following the initial evaluation. The study did not find a statistically significant delay to diagnosis in the classifier false negative patients nor a significant increase in advanced cancer stage at diagnosis.43
Second Generation GEP: Percepta Genomic Sequencing Classifier (GSC)
The Percepta Genomic Sequencing Classifier (GSC) is a second-generation classifier currently offered for clinical use in replacement of the BGC. It was developed from whole transcriptome RNA sequencing along with clinical factors, with multiple thresholds allowing for both up-classification and down-classification of malignancy risk in patients with non-diagnostic biopsy.8,44 The up-classification is intended to be an improvement on the first generation Bronchial Genomic Classifier, which was designed to be solely a “rule out” test for intermediate-risk patients. The GSC final model uses 1232 genes and four clinical covariates – including pack-years, age, and interfering factors such as inhaled medication use and specimen collection timing.44 The GSC was developed using samples from current and former smokers who underwent bronchoscopy for suspected lung cancer as part of the AEGIS-1 and AEGIS-2 trials and the BGC Registry, wherein patients were split into training and validation cohorts. Classifier performance characteristics from the validation set (N=412) are listed in Table 2 below, demonstrating similar performance as a rule-out test for intermediate and low risk categories compared with the first generation BGC.
Table 2. Lung cancer genomic sequencing classifier validation performance.44
Overall AUC: 73.4% [95% CI 68.3-78.4]
| Pre-test Cancer Risk | Cancer prevalence | Cancer Risk re-stratification | Specificity (% and 95% CI) | Sensitivity (% and 95% CI) | Post-test NPV/PPV (% and 95% CI) | % Re-stratified |
| --- | --- | --- | --- | --- | --- | --- |
| Low | 5% | Low to Very Low | 57.4% [44.8-69.3] | 100% [39.8-100] | 100% NPV [91.0-100] | 54.5% |
| Intermediate | 28.2% | Intermediate to Low | 37.3% [27.9-47.4] | 90.6% [79.3-96.9] | 91.0% NPV [80.8-96.0] | 29.4% |
| Intermediate | 28.2% | Intermediate to High | 94.1% [87.6-97.8] | 28.3% [16.8-42.3] | 65.4% PPV [43.8-82.1] | 12.2% |
| High | 73.6% | High to Very High | 91.2% [76.3-98.1] | 34.0% [25.0-43.8] | 91.5% PPV [77.9-97.0] | 27.3% |
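The post-test predictive values and the percentages re-stratified in Table 2 follow from the sensitivity, specificity, and cancer prevalence in each row through Bayes' theorem. The sketch below is an illustrative recalculation from the rounded table values, not the authors' analysis code; the function name is hypothetical, and small differences from the published figures (for example, in the PPV for the intermediate-to-high row) are expected from rounding.

```python
def predictive_values(sensitivity: float, specificity: float, prevalence: float) -> dict:
    """Derive PPV, NPV, and the share of patients testing positive or negative.

    All inputs are fractions between 0 and 1. Assumes the test-positive and
    test-negative shares are both non-zero (true for every row of Table 2).
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    return {
        "ppv": true_pos / (true_pos + false_pos),
        "npv": true_neg / (true_neg + false_neg),
        "share_positive": true_pos + false_pos,  # up-classified by a rule-in threshold
        "share_negative": true_neg + false_neg,  # down-classified by a rule-out threshold
    }


if __name__ == "__main__":
    # Intermediate to Low row: sensitivity 90.6%, specificity 37.3%, prevalence 28.2%.
    rule_out = predictive_values(0.906, 0.373, 0.282)
    print(f"NPV {rule_out['npv']:.1%}, down-classified {rule_out['share_negative']:.1%}")

    # Intermediate to High row: sensitivity 28.3%, specificity 94.1%, prevalence 28.2%.
    rule_in = predictive_values(0.283, 0.941, 0.282)
    print(f"PPV {rule_in['ppv']:.1%}, up-classified {rule_in['share_positive']:.1%}")
```

Because predictive values depend on prevalence, the same sensitivity and specificity would give a different NPV or PPV in a population with a different underlying cancer rate, which is why the table reports them separately for each pre-test risk group.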
In the intermediate-risk group, 29.4% of patients had an “actionable negative” result, such that if the test were to lead to surveillance imaging in 10 patients, 9 would be expected to have benign lesions and safely avoid further testing, whereas one patient with a malignant lesion could potentially experience a delay in further evaluation.44 A further 12.2% of the intermediate cohort was up-classified from intermediate to high risk with a PPV of 65.4%. Thus, if the test were to result in more aggressive management, approximately two patients with malignancy would experience additional invasive testing or treatment for every one patient with a benign lesion who would do the same.44 The potential impact of the classifier on the rate of invasive procedures was assessed in the AEGIS-1 and AEGIS-2 cohorts.45 This was done by estimating the potential reduction in the number of procedures in patients who have been re-classified by the GSC, assuming that the classifier would have been used in procedure decision-making.45 As a result of classifier use, 50% of patients with benign lesions as well as 29% of those with malignancy undergoing additional invasive procedures prior to definitive surgery could have potentially avoided invasive procedures.44,45 Analytical validation of the GSC was published by Johnson et al46.
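The natural-frequency statements in the paragraph above follow directly from the predictive values in Table 2. As a worked illustration, using the rounded NPV of 91.0% and PPV of 65.4% for the intermediate-risk group:

```latex
\[
10 \times \mathrm{NPV} = 10 \times 0.910 \approx 9 \ \text{benign},
\qquad
10 \times (1 - \mathrm{NPV}) = 10 \times 0.090 \approx 1 \ \text{malignant}
\]
\[
\frac{\mathrm{PPV}}{1 - \mathrm{PPV}} = \frac{0.654}{0.346} \approx 1.9
\quad \text{(about two malignant lesions for every benign lesion among up-classified patients)}
\]
```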
Raval et al47 conducted an observational study that retrospectively assessed data from four clinical sites (two academic and two community medical centers) that regularly use the GSC in clinical practice. 42% of patients had a change in risk category following GSC results compared to pre-procedure risk of malignancy. Potential clinical utility was modeled using performance characteristics from prior studies and modeling was based on hypothetical assumptions that risk up-classification from high to very high would lead to referral for surgical resection as the next management step and down-classification to low or very low risk of malignancy would result in CT surveillance as the next step.47 The authors aim to update this study to assess the true impact of Percepta GSC results in the context of outcomes data including future diagnosis of cancer or benign disease when that information becomes available.47 Finally, Sethi et al48 reported results from a decision impact study demonstrating that up-classification of malignancy risk from high to very high can potentially allow more patients to proceed more rapidly to curative therapy, with a decrease in intervening diagnostic procedures.