Wednesday, October 9, 2019

Performance of a Genomic Sequencing Classifier for the Preoperative Diagnosis of Cytologically Indeterminate Thyroid Nodules.

Article Information
Accepted for Publication: February 25, 2018.
Corresponding Author: Kepal N. Patel, MD, Division of Endocrine Surgery, Department of Surgery, New York University Langone Medical Center, 530 First Ave, Ste 6H, New York, NY 10016 (kepal.patel@nyumc.org).
Published Online: May 23, 2018. doi:10.1001/jamasurg.2018.1153
Author Contributions: Drs Kennedy and Ladenson had full access to all of the data in the study and take responsibility for the integrity of the data and the accuracy of the data analysis.
Study concept and design: Babiarz, Barth, Harrell, Huang, Kennedy, Kloos, LiVolsi, Randolph, Shanik, Walsh, Whitney, Ladenson.
Acquisition, analysis, or interpretation of data: Patel, Angell, Babiarz, Barth, Blevins, Duh, Ghossein, Huang, Kennedy, Kim, Kloos, Randolph, Sadow, Shanik, Sosa, Traweek, Walsh, Whitney, Yeh, Ladenson.
Drafting of the manuscript: Patel, Babiarz, Barth, Harrell, Huang, Kennedy, Kim, Kloos, Randolph, Shanik, Ladenson.
Critical revision of the manuscript for important intellectual content: Patel, Angell, Babiarz, Barth, Blevins, Duh, Ghossein, Huang, Kennedy, Kim, Kloos, LiVolsi, Randolph, Sadow, Shanik, Sosa, Traweek, Walsh, Whitney, Yeh, Ladenson.
Statistical analysis: Barth, Huang, Kennedy, Kim.
Obtained funding: Kennedy.
Administrative, technical, or material support: Babiarz, Barth, Kennedy, Kloos, Randolph, Sadow, Traweek, Whitney.
Study supervision: Patel, Angell, Barth, Kennedy, Kloos, Randolph, Sosa, Walsh.
Conflict of Interest Disclosures: Drs Patel, Blevins, Shanik, and Ladenson have received speaker’s honoraria from Veracyte Inc. Drs Ghossein, LiVolsi, Sadow, and Ladenson serve as consultants for Veracyte Inc. Drs Blevins, Shanik, and Ladenson have received institutional research support from Veracyte Inc. Drs Babiarz, Barth, Huang, Kennedy, Kim, Kloos, and Whitney and Mr Walsh are employees of Veracyte Inc. Drs Babiarz, Barth, Huang, Kennedy, Kim, Kloos, Traweek, and Whitney and Mr Walsh own equity in Veracyte Inc. Dr Sosa is a member of the American Thyroid Association Data Monitoring Committee of the Medullary Thyroid Cancer Consortium, which is supported by GlaxoSmithKline, Novo Nordisk, AstraZeneca, and Eli Lilly. No other disclosures were reported.
Funding/Support: This study was funded by Veracyte Inc.
Role of the Funder/Sponsor: Veracyte Inc drafted the study design and oversaw the data collection, management, and initial analysis. Veracyte Inc had no role in data interpretation; preparation, review, and approval of the manuscript; and the decision to submit the manuscript.
Meeting Presentation: Summary findings from this study were presented as an abstract and oral presentation at the Third World Congress on Thyroid Cancer; July 27-30, 2017; Boston, Massachusetts.
Additional Contributions: We thank the many investigators and patients who provided the fine-needle aspiration samples used here for training and in the independent test set.

Introduction

Thyroid cancer incidence has increased substantially in the United States in recent decades, with evidence to support both an increase in detection and a true increase in occurrence. Thyroid nodules are palpable in 5% of adults and are visualized with contemporary imaging in more than one-third of adults. Malignancy is present in only 5% to 15% of all thyroid nodules, and definitive diagnosis is achieved by surgical histopathology on resected tissue. Unfortunately, thyroid surgery is associated with discomfort, scarring, inconvenience, direct and indirect costs, potential lifelong medication, and occasional surgical complications. Efforts to exclude cancer with clinical assessment alone are admittedly imperfect, and laboratory testing of serum thyroid-stimulating hormone levels and thyroid imaging with radionuclides or ultrasonography identify benignity with high confidence in only 4% to 26% of nodules. Forty years ago, the application of cytology to thyroid nodule specimens obtained by fine-needle aspiration (FNA) biopsy had a substantial effect on patient management by reducing surgery by one-half and doubling the proportion of cancer among patients who underwent surgery. However, approximately one-third of thyroid nodule cytology findings today are cytologically indeterminate, with estimated risks of malignancy ranging from 5% to 30%. Consequently, approximately three-quarters of patients with cytologically indeterminate thyroid nodules have been referred for surgery, even though 80% ultimately prove to have benign nodules.
The practice of using preoperative genomic information for thyroid nodule differential diagnosis is more than a decade old, and several commercial and noncommercial genomic approaches are currently available. Performance data from blinded prospective multicenter validation trials are limited and include the gene expression classifier (GEC), in which a machine learning–derived classification algorithm uses messenger RNA transcript expression levels to categorize cytologically indeterminate FNAs as either benign or suspicious. Altered messenger RNA expression can occur for several reasons, including complex upstream interactions that occur because of sequence changes in key core genes or in relevant peripheral genes, the effect of epigenetic changes that occur without DNA sequence alterations, and both internal and external modifiers, such as inflammation and lifestyle or environment. In a cohort with a 24% prevalence of malignancy, the GEC accurately identified 90% of malignancies (ie, sensitivity) and 52% of benign nodules (ie, specificity) with indeterminate Bethesda III or IV cytology. It intentionally favored high sensitivity over specificity to ensure the accuracy and safety of a benign genomic result. A test with improved specificity for identification of benign nodules and maintained high sensitivity for malignancy detection could spare even more patients from surgery with an accurate benign genomic result (negative predictive value [NPV]) and increase the cancer yield among those with a suspicious result (positive predictive value [PPV]).
Enhanced technologies for characterizing genomic information, including improved methods for the measurement of RNA transcriptome expression and sequencing of nuclear and mitochondrial RNAs, measurement of changes in genomic copy number, including loss of heterozygosity, and the development of enhanced bioinformatics and machine learning strategies, have created the opportunity to develop a new, more robust genomic test. This study describes the blinded clinical validation of the novel genomic sequencing classifier (GSC) on a prospective multicenter–derived set of patients with FNA samples whose referral to surgery and histopathological diagnosis were determined in the absence of genomic information.

Methods

Training and Validation Cohorts

The study was approved by institution-specific institutional review boards as well as by Liberty IRB (DeLand, Florida; now Chesapeake IRB) and Copernicus Group Independent Review Board (Cary, North Carolina). All patients provided written informed consent prior to participating in the study. The training cohort is described in eMethods 1 in the Supplement.

Validation Cohort

Dedicated thyroid nodule FNA specimens and surgical histopathology from nodules 1 cm or larger were collected using a prospective and blinded protocol at 49 academic and community centers in the United States from patients 21 years or older. These samples, stored at −80°C, were previously used to validate the GEC. The details of their enrollment and prespecified inclusion and exclusion criteria have been reported elsewhere. Histopathology diagnoses were previously established by an expert panel of thyroid surgical histopathologists who were blinded to all clinical and molecular data. BRAF V600E DNA mutational reference status was established by testing DNA from all samples with the competitive allele-specific TaqMan polymerase chain reaction, as described in eMethods 2 in the Supplement. This independent validation cohort was prespecified and divided into 2 test sets. The primary test set comprised all patients with Bethesda III and IV samples described in the clinical validation of the Afirma GEC with sufficient RNA remaining. The secondary test set comprised all patients with Bethesda II, V, or VI samples described in the clinical validation of the Afirma GEC that had sufficient RNA remaining and were not randomly assigned to the training set, as described in eMethods 1 in the Supplement.

Blinding of the Independent Test Set

The following steps were implemented to ensure the independent test set was securely blinded throughout algorithm development and validation (eTable 1 in the Supplement). First, each step was documented in a prespecified protocol and time-stamped on execution. Each team member was assigned a single role and allowed access only to information designated for that role. A randomly generated blinded identification number was assigned to each sample in the validation set by information technology engineers who operated independently of all other teams to ensure that all other personnel were unable to link clinical and genomic data. All historic information that could potentially reveal the clinical label on the independent test set was secured in a password-protected folder prior to the start of algorithm development. Information technology engineers conducted performance testing of the validation test set independently of all other teams. RNA purification, library preparation, next-generation sequencing, RNA sequencing pipeline, feature extraction, and quality control methods are described in eMethods 3-6 in the Supplement.

Algorithm Development

Fine-needle aspiration samples (n = 634) were used to build the GSC core ensemble model, as described in eMethods 1 and eTable 2 in the Supplement. The ensemble model consists of 12 independent classifiers: 6 are elastic net logistic regression models and 6 are support vector machines. The 6 models within each category differ from each other according to the gene sets used (eTable 3 in the Supplement).
To minimize overfitting and to accurately reflect classifier performance incorporating random noise, hyperparameter tuning and model selections were performed using repeated nested cross-validation. Hyperparameter tuning was performed within the inner layer of the cross-validation, and the classifier performance was summarized using the outer layer of the 5-fold cross-validation repeated 40 times. For each classifier, the decision boundary was chosen to optimize specificity, with a minimum requirement of 90% sensitivity to detect malignancy.
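The decision-boundary rule described above (maximize specificity while keeping sensitivity at or above 90%) can be illustrated with a short Python sketch. The paper does not describe how the boundary was searched, so the function name `choose_threshold`, the exhaustive scan over observed scores, and the toy scores below are all assumptions made for illustration only.

```python
from typing import Sequence


def choose_threshold(scores: Sequence[float], labels: Sequence[int],
                     min_sensitivity: float = 0.90) -> float:
    """Return the cutoff with the highest specificity whose sensitivity
    is at least min_sensitivity.

    A sample is called "suspicious" when score >= cutoff.
    Labels are 1 for malignant and 0 for benign.
    """
    n_malignant = sum(labels)
    n_benign = len(labels) - n_malignant
    if n_malignant == 0 or n_benign == 0:
        raise ValueError("need at least one malignant and one benign sample")

    best_cutoff, best_specificity = min(scores), -1.0
    for cutoff in sorted(set(scores)):
        true_pos = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cutoff)
        true_neg = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cutoff)
        sensitivity = true_pos / n_malignant
        specificity = true_neg / n_benign
        # Keep only cutoffs that satisfy the sensitivity floor.
        if sensitivity >= min_sensitivity and specificity > best_specificity:
            best_cutoff, best_specificity = cutoff, specificity
    return best_cutoff


# Hypothetical classifier scores (not study data).
benign = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30, 0.45, 0.55, 0.65, 0.75]
malignant = [0.22, 0.40, 0.50, 0.60, 0.70, 0.80, 0.85, 0.90, 0.95, 0.99]
scores = benign + malignant
labels = [0] * len(benign) + [1] * len(malignant)

cutoff = choose_threshold(scores, labels)
print(f"chosen cutoff: {cutoff}")
```

In this toy example, raising the cutoff any further would drop a second malignant sample below the boundary and violate the 90% sensitivity floor, which is the trade-off the rule is meant to control.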
The locked ensemble model uses a total of 10 196 genes, among which are 1115 core genes (eTable 4 in the Supplement). These core genes drive the prediction behavior of the model, and the remaining genes improve classifier stability against assay variability.
In addition to the ensemble model described above, the Afirma GSC system includes 7 other components: a parathyroid cassette, a medullary thyroid cancer (MTC) cassette, a BRAF V600E cassette, RET/PTC1 and RET/PTC3 fusion detection modules, follicular content index, Hürthle cell index, and Hürthle neoplasm index. The first 4 are upstream of the ensemble classifier, targeting specific and rare patient subgroups (eFigure 1 in the Supplement). The last 3 (the follicular content index, Hürthle cell index, and the Hürthle neoplasm index) were developed to further improve the benign vs suspicious classification performance. They were incorporated with the ensemble classifier to form the core benign vs suspicious classifier engine.

Statistical Analysis

Statistical analyses were performed using R statistical software version 3.2.3 (https://www.r-project.org). Continuous variables were compared using t test, and categorical variables were compared using Fisher exact test. We evaluated test performance using sensitivity, specificity, and NPV and PPV based on established methods. All confidence intervals are 2-sided 95% CIs and were computed using the exact binomial test. Test performance was compared between the GSC and GEC using the McNemar χ² test on the matched data set. Significance level in differential gene expression analysis is reported using a false discovery rate–adjusted P value. Two-sided P values less than .05 were used to declare significance.
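As an illustration of the exact binomial confidence intervals mentioned above, the following Python sketch computes a 2-sided Clopper-Pearson interval by bisection on the binomial distribution. The study used R, so this is not the authors' code; the function names are hypothetical, and treating "exact binomial" as the Clopper-Pearson interval is an assumption.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def exact_binomial_ci(x: int, n: int, conf: float = 0.95) -> tuple:
    """Two-sided Clopper-Pearson interval for x successes out of n trials."""
    alpha = 1 - conf

    # Lower bound: the p at which P(X >= x) equals alpha / 2.
    if x == 0:
        lower = 0.0
    else:
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if 1 - binom_cdf(x - 1, n, mid) < alpha / 2:
                lo = mid  # upper tail too small, p must be larger
            else:
                hi = mid
        lower = (lo + hi) / 2

    # Upper bound: the p at which P(X <= x) equals alpha / 2.
    if x == n:
        upper = 1.0
    else:
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if binom_cdf(x, n, mid) > alpha / 2:
                lo = mid  # lower tail too large, p must be larger
            else:
                hi = mid
        upper = (lo + hi) / 2

    return lower, upper


# Sensitivity in the primary test set: 41 of 45 malignant samples called suspicious.
low, high = exact_binomial_ci(41, 45)
print(f"sensitivity 95% CI: {100 * low:.1f}% to {100 * high:.1f}%")
```

The printed interval can be compared with the 79 to 98 range reported for sensitivity in Table 2.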

Results

We used the FNA samples that previously validated the GEC to independently validate the GSC. The earlier GEC validation samples were derived from 4812 nodule aspirations prospectively collected from 3789 patients at 49 clinical sites in the United States over a 2-year period. Of the 210 validation samples with corresponding Bethesda III or IV cytology and blinded postoperative consensus histopathology diagnoses, 191 (91.0%) had sufficient residual RNA for GSC testing. These samples from cytologically indeterminate nodules constituted the blinded primary test set.
The previously established thyroid nodule cytological diagnosis was used again. Patient demographic characteristics and baseline data are shown in Table 1. Age, sex, clinical risk factors, nodule size, histology subtype (eTable 5 in the Supplement), number of FNA passes, prevalence of malignancy (eTable 6 in the Supplement), and proportion of samples collected at community centers did not differ significantly between the primary study population (n = 191) and the GEC clinical validation cohort of samples (n = 210), consistent with unbiased drop out.

Table 1.

Baseline Demographic and Clinical Characteristics of the Study Cohort (a)

| Variable | GEC Validation | GSC Validation |
| --- | --- | --- |
| Total, No. | | |
| Samples | 210 | 191 |
| Patients | 199 | 183 |
| Type of study site, No. (%) of samples | | |
| Academic | 76 (36.2) | 65 (34.0) |
| Community | 134 (63.8) | 126 (66.0) |
| No. of fine-needle aspiration passes, No. (%) of samples | | |
| 1 | 88 (41.9) | 73 (38.2) |
| 2 | 122 (58.1) | 118 (61.8) |
| Age of patients, mean (range), y | 51.2 (22.0-85.0) | 51.7 (22.0-85.0) |
| Sex, No. (%) of patients | | |
| Male | 46 (23.1) | 41 (22.4) |
| Female | 153 (76.9) | 142 (77.6) |
| Risk factors, No. (%) of patients | | |
| Radiation exposure to head, neck, or both | 7 (3.5) | 5 (2.7) |
| Family history of thyroid cancer | 14 (7.0) | 13 (7.1) |
| Nodule | | |
| Size on ultrasonography, median (range), cm | 2.5 (1.0-9.1) | 2.6 (1.0-9.1) |
| Size group, No. (%) of nodules, cm | | |
| 1.00-1.99 | 69 (32.9) | 60 (31.4) |
| 2.00-2.99 | 62 (29.5) | 60 (31.4) |
| 3.00-3.99 | 42 (20.0) | 37 (19.4) |
| ≥4.00 | 37 (17.6) | 34 (17.8) |

Abbreviations: GEC, gene expression classifier; GSC, genomic sequencing classifier.
(a) Statistical tests were performed to compare the 191 GSC nodules with the 19 nodules in the GEC validation that were excluded in the GSC validation because of insufficient RNA quantity. The 2 groups differ only on the number of fine-needle aspiration passes, which is not unexpected, as only samples with sufficient remaining RNA were included in the GSC evaluation.

The Standards for Reporting of Diagnostic Accuracy Studies was developed to improve the quality of reporting diagnostic accuracy studies. eFigure 2 in the Supplement shows the flow of samples through the study in a Standards for Reporting of Diagnostic Accuracy Studies diagram. Of these 191 indeterminate FNAs, 46 (24.1%) were diagnosed as malignant by an expert surgical histopathology panel who were blinded to all cytologic and genomic results and to the local histopathology diagnosis. Results are reported in the order of testing through the GSC test system (eFigure 1 in the Supplement). Initially, all GSC samples are tested for RNA quantity and quality. None of the 191 samples failed. Subsequently, the GSC aimed to identify nodules composed of parathyroid tissue, those with MTC, and those with a BRAF V600E mutation or RET/PTC1 or RET/PTC3 fusion. Samples testing positive for these are included in performance calculations described below, except for samples testing positive for parathyroid tissue, as this result does not indicate a benign or malignant etiology. Among the 191 samples, positive results for parathyroid, MTC, BRAF, and RET/PTC occurred in 0, 1, 3, and 0 samples, respectively. All MTC and BRAF V600E results were concordant with reference methods (eMethods 2 in the Supplement). After this testing, samples were evaluated for follicular cell content by the follicular content index classifier. One sample, negative for the above results, was deemed to have inadequate follicular content and therefore was assigned no result. This sample was excluded from subsequent analyses, leaving 190 samples. Table 2 summarizes clinical performance characteristics for Bethesda III and IV nodules.

Table 2.

Performance of the Genomic Sequencing Classifier (GSC) According to the Final Histopathological Diagnoses and Cytopathological Category
| GSC Result or Metric | Reference Standard: Malignant | Reference Standard: Benign | Value, % (95% CI) |
| --- | --- | --- | --- |
| Performance across the primary test set of Bethesda III and IV indeterminate nodules (n = 190) | | | |
| Suspicious, No./total No. | 41/45 | 46/145 | |
| Benign, No./total No. | 4/45 | 99/145 | |
| Sensitivity | | | 91.1 (79-98) |
| Specificity | | | 68.3 (60-76) |
| NPV | | | 96.1 (90-99) |
| PPV | | | 47.1 (36-58) |
| Prevalence of malignant lesions, % | | | 23.7 |
| Bethesda III: atypia of undetermined significance/follicular lesion of undetermined significance (n = 114 [60.0%]) | | | |
| Suspicious, No./total No. | 26/28 | 25/86 | |
| Benign, No./total No. | 2/28 | 61/86 | |
| Sensitivity | | | 92.9 (76-99) |
| Specificity | | | 70.9 (60-80) |
| NPV | | | 96.8 (89-100) |
| PPV | | | 51.0 (37-65) |
| Prevalence of malignant lesions, % | | | 24.6 |
| Bethesda IV: follicular or Hürthle cell neoplasm or suspicious for follicular neoplasm (n = 76 [40.0%]) | | | |
| Suspicious, No./total No. | 15/17 | 21/59 | |
| Benign, No./total No. | 2/17 | 38/59 | |
| Sensitivity | | | 88.2 (64-99) |
| Specificity | | | 64.4 (51-76) |
| NPV | | | 95.0 (83-99) |
| PPV | | | 41.7 (26-59) |
| Prevalence of malignant lesions, % | | | 22.4 |
| Performance across the secondary test set of Bethesda II, V, and VI nodules (n = 61) (a) | | | |
| Suspicious, No./total No. | 34/34 | 7/26 | |
| Benign, No./total No. | 0/34 | 19/26 | |
| Sensitivity | | | 100 (90-100) |
| Specificity | | | 73.1 (52-88) |
| NPV | | | 100 (82-100) |
| PPV | | | 82.9 (68-93) |
| Prevalence of malignant lesions, % | | | 56.7 |
| Bethesda II: cytopathologically benign (n = 19 [31.1%]) (a) | | | |
| Suspicious, No./total No. | 2/2 | 2/16 | |
| Benign, No./total No. | 0/2 | 14/16 | |
| Sensitivity | | | 100 (16-100) |
| Specificity | | | 87.5 (62-98) |
| NPV | | | 100 (77-100) |
| PPV | | | 50.0 (7-93) |
| Prevalence of malignant lesions, % | | | 11.1 |
| Bethesda V: suspicious for malignancy (n = 23 [37.7%]) | | | |
| Suspicious, No./total No. | 13/13 | 5/10 | |
| Benign, No./total No. | 0/13 | 5/10 | |
| Sensitivity | | | 100 (75-100) |
| Specificity | | | 50.0 (19-81) |
| NPV | | | 100 (48-100) |
| PPV | | | 72.2 (47-90) |
| Prevalence of malignant lesions, % | | | 56.5 |
| Bethesda VI: cytopathologically malignant (n = 19 [31.1%]) | | | |
| Suspicious, No./total No. | 19/19 | 0/0 | |
| Benign, No./total No. | 0/19 | 0/0 | |
| Sensitivity | | | 100 (82-100) |
| PPV | | | 100 (82-100) |
| Prevalence of malignant lesions, % | | | 100 |

Abbreviations: NPV, negative predictive value; PPV, positive predictive value.
(a) One sample has no result because of low follicular content that is not summarized in the table.

The GSC correctly identified 41 of 45 malignant samples as suspicious, yielding a sensitivity of 91.1% (95% CI, 79-98), and 99 of 145 nonmalignant samples were correctly identified as benign by the GSC, yielding a specificity of 68.3% (95% CI, 60-76). Among Bethesda III and IV samples, the NPV was 96.1% (95% CI, 90-99) and the PPV was 47.1% (95% CI, 36-58). Performance of the GSC was similar between Bethesda III and IV categories (Table 2).
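The point estimates in this paragraph follow directly from the 2 × 2 counts in Table 2. The short Python sketch below recomputes them from those counts; the function name `diagnostic_metrics` is hypothetical and the code is offered only as a worked check, not as the authors' analysis script.

```python
def diagnostic_metrics(tp: int, fn: int, fp: int, tn: int) -> dict:
    """Standard diagnostic accuracy measures from a 2 x 2 table.

    tp: malignant and called suspicious    fn: malignant and called benign
    fp: benign and called suspicious       tn: benign and called benign
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "prevalence": (tp + fn) / (tp + fn + fp + tn),
    }


# Primary test set counts from Table 2 (Bethesda III and IV, n = 190).
primary = diagnostic_metrics(tp=41, fn=4, fp=46, tn=99)
for name, value in primary.items():
    print(f"{name}: {100 * value:.1f}%")
```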
Among the 190 Bethesda III and IV samples, 17 (8.9%) were histologically Hürthle cell adenomas and 9 (4.7%) were Hürthle cell carcinomas, while 164 samples (86.3%) were histologically non-Hürthle. For samples with Hürthle histology, the sensitivity was 88.9% (95% CI, 52-100) and the specificity was 58.8% (95% CI, 33-82). For samples with non-Hürthle histology, the sensitivity was 91.7% (95% CI, 78-98) and the specificity was 69.5% (95% CI, 61-77).
A wide variety of malignant subtypes were correctly classified as suspicious (Table 3). Four false-negative cases occurred (Table 4). We assessed whether patient age or sex, malignancy subtype, or nodule size by ultrasonography or on histopathology was associated with false-negative cases, and none were. Comparisons of GSC to GEC results on a per-sample basis are reported in the eAppendix in the Supplement. The performance of the GSC in secondary analyses of nodules with Bethesda II, V, or VI cytopathology is reported in Table 2. Among the entire secondary analysis group, the GSC sensitivity was 100% (95% CI, 90-100) and the specificity was 73.1% (95% CI, 52-88).

Table 3.

Performance of Genomic Sequencing Classifier (GSC) According to Histopathological Subtype
| Histopathological Subtype | Nodules, No. (%) | Result With GSC, Benign, No./Suspicious, No. |
| --- | --- | --- |
| Benign | | |
| Total, No. | 145 | NA |
| Benign follicular nodule | 49 (33.8) | 38/11 |
| Hyperplastic nodule | 5 (3.4) | 5/0 |
| Follicular adenoma | 54 (37.2) | 37/17 |
| Follicular tumor of uncertain malignant potential | 9 (6.2) | 4/5 |
| Well-differentiated tumor of uncertain malignant potential | 8 (5.5) | 4/4 |
| Hürthle cell adenoma | 17 (11.7) | 10/7 |
| Chronic lymphocytic thyroiditis | 2 (1.4) | 1/1 |
| Hyalinizing trabecular adenoma | 1 (0.7) | 0/1 |
| Malignant | | |
| Total, No. | 45 | NA |
| Papillary thyroid carcinoma | 15 (33.3) | 2/13 |
| Tall-cell variant | 1 (2.2) | 0/1 |
| Follicular variant | 11 (24.4) | 1/10 |
| Hürthle cell carcinoma (a) | 9 (20.0) | 1/8 |
| Follicular carcinoma (b) | 7 (15.6) | 0/7 |
| Poorly differentiated carcinoma | 1 (2.2) | 0/1 |
| Medullary thyroid cancer | 1 (2.2) | 0/1 |

Abbreviation: NA, not applicable.
(a) Among the Hürthle cell carcinomas, 7 showed capsular invasion and 2 showed vascular invasion. The false-negative case was previously false-negative on the gene expression classifier.
(b) Among the follicular carcinomas, 3 showed capsular invasion and 4 were well-differentiated carcinomas not otherwise specified.

Table 4.

Cytologic Findings and Histopathological Diagnosis in 4 False-Negative Results on Genomic Sequencing Classification
| Patient No./Sex | Nodule Size on Ultrasonographic Imaging, cm | Nodule Size on Pathological Examination, cm | Bethesda Cytologic Diagnosis | Final Histologic Diagnosis |
| --- | --- | --- | --- | --- |
| 1/M | 1.1 | 1.2 | III | PTC |
| 2/F | 2.5 | 1.5 | III | PTC |
| 3/F | 3.2 | 3.0 | IV | FVPTC |
| 4/F | 2.9 | 3.5 | IV | HCC-v |

Abbreviations: FVPTC, papillary thyroid cancer follicular variant; HCC-v, Hürthle cell carcinoma, vascular invasion; PTC, papillary thyroid cancer.

Discussion

A 2016 meta-analysis reported the risks of malignancy among Bethesda III and IV thyroid nodules to be 17% (95% CI, 11-23) and 25% (95% CI, 20-29), respectively. To safely avoid unnecessary diagnostic surgery among these cytologically indeterminate nodules, a test with a high sensitivity and NPV for malignancy is required. This blinded clinical validation of the GSC in a prospectively collected, representative, universally operated, and histopathologically diagnosed cohort demonstrates the required high NPV across these ranges of cancer prevalence encountered in Bethesda III and IV nodules in clinical practice (Figure). To independently validate the GSC, we implemented a set of strict blinding and deidentification protocols that enabled us to use the same FNA samples previously used to validate the GEC. Use of these samples allowed testing of complete and representative sets of nodules with corresponding surgical histology unaffected by the current widespread use of molecular testing to avoid or encourage surgery.
Figure.
Afirma Genomic Sequencing Classifier Performance Across Differing Risk Populations
There was a negative predictive value (NPV) of 96% (95% CI, 90-99) (A) and a positive predictive value (PPV) of 47% (95% CI, 36-58) (B) at a 24% cancer prevalence in the current Bethesda III and IV cohort. A 2016 meta-analysis reported prevalence of malignancy among Bethesda III and IV nodules as 17% (95% CI, 11-23) and 25% (95% CI, 20-29), respectively. Deriving PPV and NPV at 11% cancer prevalence yielded 98% NPV and 26% PPV, and deriving PPV and NPV at 29% cancer prevalence yielded 95% NPV and 54% PPV.
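The predictive values quoted in this caption for other cancer prevalences can be rederived from sensitivity and specificity with Bayes' rule. The Python sketch below is a worked check under the assumption that sensitivity (41/45) and specificity (99/145) stay fixed as prevalence changes; the function name `predictive_values` is hypothetical.

```python
def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> tuple:
    """Return (PPV, NPV) at a given prevalence, holding sensitivity and
    specificity constant."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Point estimates from the primary test set (Table 2).
SENS = 41 / 45
SPEC = 99 / 145

# 11% and 29% are the prevalence bounds used in the caption;
# 45/190 is the prevalence observed in the primary test set.
for prevalence in (0.11, 45 / 190, 0.29):
    ppv, npv = predictive_values(SENS, SPEC, prevalence)
    print(f"prevalence {100 * prevalence:.1f}%: "
          f"PPV {100 * ppv:.0f}%, NPV {100 * npv:.0f}%")
```

Because NPV falls slowly and PPV rises quickly as prevalence increases, the test's rule-out value is comparatively stable across the range of prevalences discussed here.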
Test sensitivity of the GSC (91%; 95% CI, 79-98) compared with the GEC (89%; 95% CI, 76-96) was maintained, with the point estimate within the counterpart’s 95% CI, and the McNemar χ² test (df = 1) on the matched sample set renders a test statistic of 0 (P > .99). On the other hand, test specificity of the GSC (68%; 95% CI, 60-76) was significantly improved from the GEC (50%; 95% CI, 42-59), with the point estimate outside the counterpart’s 95% CI, and the McNemar χ² test (df = 1) on the matched sample set renders a test statistic of 16.447 (P < .001) (eTable 7 in the Supplement). In practice, this enhanced performance suggests that among Bethesda III and IV nodules that are histopathologically benign, at least one-third more will receive a benign result using the GSC compared with the GEC. At a cancer prevalence of 24%, more than half of tested patients are projected to receive a GSC benign result, and among GSC suspicious nodules, nearly half are anticipated to have cancer on surgical histology. This increased benign call rate is expected to result in more patients being assigned to active observation as opposed to diagnostic surgery. Given the high cost of surgery in the United States among Medicare and private payers, the increased avoidance of diagnostic surgery because of GSC benign results is expected to further improve cost-effectiveness and reduce surgical complications.
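For readers unfamiliar with the McNemar test used in this comparison, the Python sketch below shows how the statistic is formed from the discordant pairs of a matched data set. The paired counts are in eTable 7 in the Supplement and are not reproduced in this text, so the counts 32 and 6 used here are illustrative assumptions that happen to give the reported statistic; they should not be read as the study's actual discordant counts. The function name and the convention of applying the continuity correction only when the discordant counts differ are also assumptions.

```python
from math import erfc, sqrt


def mcnemar_test(b: int, c: int, correct: bool = True) -> tuple:
    """McNemar chi-square test (df = 1) on discordant pair counts.

    b: samples classified correctly by the first test only
    c: samples classified correctly by the second test only
    Returns (statistic, p_value).
    """
    if b + c == 0:
        return 0.0, 1.0
    diff = abs(b - c)
    # Continuity correction, skipped when the discordant counts are equal.
    if correct and diff != 0:
        diff -= 1
    statistic = diff**2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Illustrative discordant counts (assumed, not taken from the paper).
stat, p = mcnemar_test(32, 6)
print(f"statistic = {stat:.3f}, P < .001: {p < 0.001}")

# Equal discordant counts give a statistic of 0 and a P value of 1.
print(mcnemar_test(5, 5))
```

Only the discordant pairs enter the statistic: samples on which both tests agree, whether correctly or not, carry no information about which test performs better.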
While genomic data have been incorporated in clinical management decisions of multiple medical conditions for more than a decade, progress continues toward understanding the complexities of genomic and nongenomic pathways in the development and behavior of disease. Current evidence suggests that most common diseases are associated with small effects from a large number of genes and that most of these contributions are derived from transcriptionally active portions of the genome. This implies that diseases such as thyroid cancer are unlikely to be accounted for by the effects of a small number of genes. The fact that few genomic variants are associated with 100% penetrance toward malignant histology suggests that a complex interaction of multiple factors ultimately determines the benign or malignant nature of thyroid nodules. As the number of these factors expands, it becomes critical to use machine learning and statistical models to interpret their signals in a trained model to derive an accurate diagnosis.
Hürthle lesions exemplify the challenges inherent in complex biology and the opportunity to harness high-dimensional genomic data for predictive model training and subsequent validation. Most Hürthle cell–dominant Bethesda III and IV thyroid nodules have historically undergone surgery given the potential for Hürthle cell carcinoma, yet most have proven to be histologically benign. The GEC identified these samples at a high NPV, but most were categorized as GEC suspicious. We sought to maintain a high NPV while providing more benign results by including 2 dedicated classifiers to work with the core GSC classifier. Among the 26 Hürthle cell adenomas or Hürthle cell carcinomas reported here, the final GSC sensitivity was 88.9% and the specificity was 58.8%; the GEC sensitivity was 88.9% and the specificity was 11.8% among these same neoplasms. Thus, while the overall GSC sensitivity of 91.1% reported here is comparable with that of the GEC (by design), the improved overall GSC specificity of 68.3% results from significantly improved performances among both Hürthle and non-Hürthle specimen types. Given that most histologically benign Hürthle and non-Hürthle specimens are now both identified as GSC benign, GSC testing may further safely reduce unnecessary surgery among both specimen types.
Recently, the histological diagnosis of noninvasive follicular thyroid neoplasm with papillary-like nuclear features (NIFTP) was recognized as a biologically distinct entity with a low risk of malignant behavior following surgical excision, which remains the currently recommended treatment. These lesions were previously described as encapsulated noninvasive follicular variant of papillary thyroid cancer. No NIFTP histopathology diagnoses were available in this independent validation cohort, as it was collected prior to the establishment of this diagnostic category. However, subsequent studies have suggested a high rate of GEC suspicious results among NIFTP cases. The GSC was trained to identify NIFTP cases as suspicious. While removal of NIFTP from the malignant category would reduce the prevalence of cancers among cytological categories and alter the anticipated PPV of GSC tested cases, this exercise would not be clinically meaningful since the goal of a positive GSC test is to identify all thyroid nodules that warrant surgery, which currently remains necessary for NIFTP.
We performed a secondary analysis of 61 Bethesda II, V, or VI samples that also were included in the GEC validation study (Table 2). While performance of a genomic test among these more definitive cytology categories may not predict performance of the test within the Bethesda III and IV categories, the consistency of these performance metrics is reassuring and supportive of the findings in the primary analysis.

Limitations

Limitations of this study include the lack of performance data among children, among nodules that had been previously biopsied, and among samples collected by methods other than 1 or 2 dedicated FNA passes. Another potential limitation is that the prevalence of cancer in this study was toward the higher end of the expected range among Bethesda III and IV nodules, as seen in the Figure. It is possible that a cytologically indeterminate cohort with a significantly lower prevalence of cancer may contain more benign nodules that are easier for the GSC to classify, as seen in Table 2 among nodules with Bethesda II cytopathology. Should that happen, an effectively higher test specificity may occur.

Conclusions

The current trend in thyroid nodule and cancer management is more conservative, with physicians more aware of the burden of unnecessary thyroid surgery and the indolent behavior of most thyroid malignancies confined to the thyroid. Current US guidelines indicate that molecular testing may be used among Bethesda III and IV nodules to provide additional information about the nodule’s risk of malignancy, which, along with patient preference, may guide clinical decision-making. This study demonstrates high test sensitivity and NPV among Bethesda III and IV cytologically indeterminate thyroid nodules across a broad range of nodule sizes (Table 1). As an adjunct to clinical judgment, the GSC is expected to reduce unnecessary diagnostic surgery, improve patient safety, reduce health care costs, and improve patient quality of life.

Notes

Supplement.

eMethods 1. Training cohort.
eMethods 2. Reference methods.
eMethods 3. RNA purification.
eMethods 4. Library preparation.
eMethods 5. Next-generation sequencing.
eMethods 6. RNA sequencing pipeline, feature extraction, and quality control.
eAppendix. Genomic sequence classifier to gene expression classifier comparison on a per-samples basis.
eTable 1. Blinding of the independent test set.
eTable 2. Composition of the core ensemble model training set.
eTable 3. Feature sets used in each classifier within the final ensemble model.
eTable 4. List of 1115 core genes deriving the ensemble model prediction.
eTable 5. Histology subtype comparison between validation cohorts.
eTable 6. Prevalence of malignancy between validation cohorts.
eTable 7. Performance comparison between the genomic sequence classifier and gene expression classifier.
eFigure 1. Afirma gene expression classifier system.
eFigure 2. Standards for Reporting of Diagnostic Accuracy Studies diagram of sample flow through the study.

References

1. Davies L, Welch HG. Current thyroid cancer trends in the United States. JAMA Otolaryngol Head Neck Surg. 2014;140(4):317-322.
2. Lim H, Devesa SS, Sosa JA, Check D, Kitahara CM. Trends in thyroid cancer incidence and mortality in the United States, 1974-2013. JAMA. 2017;317(13):1338-1348.
3. Mazzaferri EL. Management of a solitary thyroid nodule. N Engl J Med. 1993;328(8):553-559.
4. Guth S, Theune U, Aberle J, Galach A, Bamberger CM. Very high prevalence of thyroid nodules detected by high frequency (13 MHz) ultrasound examination. Eur J Clin Invest. 2009;39(8):699-706.
5. Hegedüs L. Clinical practice: the thyroid nodule. N Engl J Med. 2004;351(17):1764-1771.
6. Kamran SC, Marqusee E, Kim MI, et al. Thyroid nodule size and prediction of cancer. J Clin Endocrinol Metab. 2013;98(2):564-570.
7. Haugen BR, Alexander EK, Bible KC, et al. 2015 American Thyroid Association management guidelines for adult patients with thyroid nodules and differentiated thyroid cancer: the American Thyroid Association Guidelines Task Force on thyroid nodules and differentiated thyroid cancer. Thyroid. 2016;26(1):1-133.
8. Li H, Robinson KA, Anton B, Saldanha IJ, Ladenson PW. Cost-effectiveness of a novel molecular test for cytologically indeterminate thyroid nodules. J Clin Endocrinol Metab. 2011;96(11):E1719-E1726.
9. Meltzer C, Klau M, Gurushanthaiah D, et al. Risk of complications after thyroidectomy and parathyroidectomy: a case series with planned chart review. Otolaryngol Head Neck Surg. 2016;155(3):391-401.
10. Hong MJ, Na DG, Baek JH, Sung JY, Kim JH. Cytology-ultrasonography risk-stratification scoring system based on fine-needle aspiration cytology and the Korean-Thyroid Imaging Reporting and Data System. Thyroid. 2017;27(7):953-959.
11. Virmani V, Hammond I. Sonographic patterns of benign thyroid nodules: verification at our institution. AJR Am J Roentgenol. 2011;196(4):891-895.
12. Middleton WD, Teefey SA, Reading CC, et al. Multiinstitutional analysis of thyroid nodule risk stratification using the American College of Radiology Thyroid Imaging Reporting and Data System. AJR Am J Roentgenol. 2017;208(6):1331-1341.
13. Tang AL, Falciglia M, Yang H, Mark JR, Steward DL. Validation of American Thyroid Association ultrasound risk assessment of thyroid nodules selected for ultrasound fine-needle aspiration. Thyroid. 2017;27(8):1077-1082.
14. Bongiovanni M, Spitale A, Faquin WC, Mazzucchelli L, Baloch ZW. The Bethesda System for Reporting Thyroid Cytopathology: a meta-analysis. Acta Cytol. 2012;56(4):333-339.
15. Melillo RM, Santoro M. Molecular biomarkers in thyroid FNA samples. J Clin Endocrinol Metab. 2012;97(12):4370-4373.
16. Cibas ES, Ali SZ. The Bethesda System for Reporting Thyroid Cytopathology. Thyroid. 2009;19(11):1159-1165.
17. Cibas ES, Baloch ZW, Fellegara G, et al. A prospective assessment defining the limitations of thyroid nodule pathologic evaluation. Ann Intern Med. 2013;159(5):325-332.
18. Wang CC, Friedman L, Kennedy GC, et al. A large multicenter correlation study of thyroid nodule cytopathology and histopathology. Thyroid. 2011;21(3):243-251.
19. Onenerk AM, Pusztaszeri MP, Canberk S, Faquin WC. Triage of the indeterminate thyroid aspirate: what are the options for the practicing cytopathologist? Cancer Cytopathol. 2017;125(S6):477-485.
20. Alexander EK, Kennedy GC, Baloch ZW, et al. Preoperative diagnosis of benign thyroid nodules with indeterminate cytology. N Engl J Med. 2012;367(8):705-715.
21. Boyle EA, Li YI, Pritchard JK. An expanded view of complex traits: from polygenic to omnigenic. Cell. 2017;169(7):1177-1186.
22. Herceg Z, Hainaut P. Genetic and epigenetic alterations as biomarkers for cancer detection, diagnosis and prognosis. Mol Oncol. 2007;1(1):26-41.
23. Ravegnini G, Sammarini G, Hrelia P, Angelini S. Key genetic and epigenetic mechanisms in chemical carcinogenesis. Toxicol Sci. 2015;148(1):2-13.
24. Teutsch SM, Bradley LA, Palomaki GE, et al.; EGAPP Working Group. The Evaluation of Genomic Applications in Practice and Prevention (EGAPP) initiative: methods of the EGAPP Working Group. Genet Med. 2009;11(1):3-14.
25. Friedman J, Hastie T, Tibshirani R, Simon N, Narasimhan B, Qian J. Glmnet: lasso and elastic-net regularized generalized linear models. http://CRAN.R-project.org/package=glmnet. Accessed August 15, 2017.
26. Karatzoglou A, Smola A, Hornik K, Zeileis A. Kernlab: an S4 package for kernel methods in R. J Stat Softw. 2004;11(9):1-20. doi:10.18637/jss.v011.i09
27. Krstajic D, Buturovic LJ, Leahy DE, Thomas S. Cross-validation pitfalls when selecting and assessing regression and classification models. J Cheminform. 2014;6(1):10.
28. Altman DG, Bland JM. Diagnostic tests 2: predictive values. BMJ. 1994;309(6947):102.
29. Clopper CJ, Pearson ES. The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika. 1934;26(4):404-413. doi:10.2307/2331986
30. Agresti A. Categorical Data Analysis. New York, NY: John Wiley & Sons; 1990.
31. Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc B. 1995;57(1):289-300.
32. Bossuyt PM, Reitsma JB, Bruns DE, et al.; STARD Group. STARD 2015: an updated list of essential items for reporting diagnostic accuracy studies. Clin Chem. 2015;61(12):1446-1452.
33. Krauss EA, Mahon M, Fede JM, Zhang L. Application of the Bethesda classification for thyroid fine-needle aspiration: institutional experience and meta-analysis. Arch Pathol Lab Med. 2016;140(10):1121-1131.
34. Singer J, Hanna JW, Visaria J, Gu T, McCoy M, Kloos RT. Impact of a gene expression classifier on the long-term management of patients with cytologically indeterminate thyroid nodules. Curr Med Res Opin. 2016;32(7):1225-1232.
35. Kloos RT. Molecular profiling of thyroid nodules: current role for the Afirma gene expression classifier on clinical decision making. Mol Imaging Radionucl Ther. 2017;26(suppl 1):36-49.
36. Strickland KC, Vivero M, Jo VY, et al. Preoperative cytologic diagnosis of noninvasive follicular thyroid neoplasm with papillary-like nuclear features: a prospective analysis. Thyroid. 2016;26(10):1466-1471.
37. Nikiforov YE, Seethala RR, Tallini G, et al. Nomenclature revision for encapsulated follicular variant of papillary thyroid carcinoma: a paradigm shift to reduce overtreatment of indolent tumors. JAMA Oncol. 2016;2(8):1023-1029.
38. Wong KS, Angell TE, Strickland KC, et al. Noninvasive follicular variant of papillary thyroid carcinoma and the Afirma gene-expression classifier. Thyroid. 2016;26(7):911-915.
39. Jiang XS, Harrison GP, Datto MB. Young investigator challenge: molecular testing in noninvasive follicular thyroid neoplasm with papillary-like nuclear features. Cancer Cytopathol. 2016;124(12):893-900.
40. Golding A, Shively D, Bimston DN, Harrell RM. Noninvasive encapsulated follicular variant of papillary thyroid cancer: clinical lessons from a community-based endocrine surgical practice. Int J Surg Oncol. 2017;2017:4689465.
41. Davies L, Welch HG. Thyroid cancer survival in the United States: observational data from 1973 to 2005. Arch Otolaryngol Head Neck Surg. 2010;136(5):440-444.
42. Tuttle RM, Fagin JA, Minkowitz G, et al. Natural history and tumor volume kinetics of papillary thyroid cancers during active surveillance. JAMA Otolaryngol Head Neck Surg. 2017;143(10):1015-1020.
43. Miyauchi A, Ito Y, Oda H. Insights into the management of papillary microcarcinoma of the thyroid. Thyroid. 2018;28(1):23-31.
44. Nou E, Kwong N, Alexander LK, Cibas ES, Marqusee E, Alexander EK. Determination of the optimal time interval for repeat evaluation after a benign thyroid nodule aspiration. J Clin Endocrinol Metab. 2014;99(2):510-516.
45. National Comprehensive Cancer Network. Thyroid carcinoma. NCCN Clinical Practice Guidelines in Oncology. Version 2. https://www.nccn.org/professionals/physician_gls/default.aspx. Accessed August 17, 2017.
