Validation of Online Prognostication Model that Predicts Survival for Women with Early Breast Cancer in Egypt

Introduction
Breast cancer is the most common cancer among females, accounting for 23% of the 1.1 million female cancers diagnosed annually [1-2]. It is also the leading cause of cancer-related deaths among women worldwide, with the highest case fatality rates in developing countries [3]. In Egypt, breast cancer is the most prevalent cancer in women, and age-specific incidence rates rise sharply after the age of 30 [4]. Breast cancer is a heterogeneous disease whose etiology and pathology vary among patients. Metastasis can occur at different stages depending on the biology of the disease and its degree of aggressiveness [5].
[...] or low risk. They are highly efficient; however, they are costly and not easily available in developing countries. Similarly, several programs have emerged over time that estimate survival and calculate the added benefit of a given treatment, such as the Nottingham Prognostic Index (NPI), PREDICT, and Adjuvant! [6].
The Nottingham Prognostic Index (NPI) is a scoring system based on three tumor characteristics: tumor size, grade, and lymph node status. It stratifies patients into three groups with different survival [9]. Adjuvant! is an online model used to predict survival as well as the expected treatment benefits [10].
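As an illustration of how a score built from size, grade, and nodal status yields three groups, the sketch below uses the commonly quoted form of the index, NPI = 0.2 x size (cm) + nodal stage + grade. The nodal staging (0, 1-3, or 4 or more positive nodes) and the cut-offs of 3.4 and 5.4 are assumptions taken from general usage, not from this study; published variants differ, and the function names are hypothetical.

```python
def npi_score(size_cm: float, grade: int, positive_nodes: int) -> float:
    """Commonly quoted NPI: 0.2 * size (cm) + nodal stage (1-3) + grade (1-3)."""
    if grade not in (1, 2, 3):
        raise ValueError("grade must be 1, 2, or 3")
    if size_cm < 0 or positive_nodes < 0:
        raise ValueError("size and node count must be non-negative")
    # Assumed nodal staging: 0 nodes -> 1, 1-3 nodes -> 2, 4 or more -> 3.
    if positive_nodes == 0:
        node_stage = 1
    elif positive_nodes <= 3:
        node_stage = 2
    else:
        node_stage = 3
    return 0.2 * size_cm + node_stage + grade


def npi_group(score: float) -> str:
    """Assumed three-group cut-offs (3.4 and 5.4); other variants exist."""
    if score <= 3.4:
        return "good"
    if score <= 5.4:
        return "moderate"
    return "poor"


# Hypothetical patient: 32 mm tumor, grade II, node-negative.
score = npi_score(size_cm=3.2, grade=2, positive_nodes=0)
group = npi_group(score)
print(f"NPI = {score:.2f} ({group} prognosis group)")
```

Unlike PREDICT, a score of this kind uses no patient-level inputs such as age or mode of detection.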
PREDICT is a freely available online program that helps clinicians estimate a patient's survival based on combined tumor and patient criteria. It was developed as a collaboration between the Cambridge Breast Unit, the University of Cambridge Department of Oncology, and the UK's Eastern Cancer Information and Registration Centre (ECRIC). It was first established in the UK and was validated on a cohort of 5,000 patients [6]. It was first released in 2011, has been widely accepted, and its use has been increasing. It requires entry of the patient's age at diagnosis, mode of detection, hormone-receptor status, tumor grade, tumor size, and number of involved nodes. It provides an average estimate of 5- and 10-year overall survival in women with early breast cancer. It also indicates the added benefit of any given therapy, whether chemotherapy, hormonal therapy, targeted (anti-HER2) therapy, or combinations of these modalities.
PREDICT is a well-calibrated model that is easy to access and offers useful insight into estimated survival and the additional benefit of adjuvant therapy. However, PREDICT has not been validated in any cohort of the Egyptian population. This study aimed to test the utility and reliability of PREDICT as a prognostication model in patients with early breast cancer in Alexandria, Egypt.

Patients and Methods
This study included female patients who were diagnosed with early breast cancer in 2005 and treated with surgery (either breast-conserving surgery or modified radical mastectomy) followed by adjuvant systemic therapy with or without radiotherapy. Data on patient, tumor, and treatment-related characteristics, as well as follow-up, were obtained from the archives of the Department of Clinical Oncology, Faculty of Medicine, and the Department of Cancer Management and Research, Medical Research Institute, University of Alexandria, Egypt, after approval by the ethics committee.
A total of 128 eligible patients with follow-up adequate to calculate the actual 5- and 10-year overall survival (OS) were included in our study. Data were obtained on the patient's age, tumor characteristics (including pathological data on tumor size, number of involved lymph nodes, tumor grade, ER status, and HER2 status based on immunohistochemistry testing), treatment, and follow-up. Treatment data included the type of treatment (surgery, chemotherapy, endocrine therapy, and radiotherapy) and the type of chemotherapy regimen received (no chemotherapy, second-generation doxorubicin-based chemotherapy, or third-generation taxane-based chemotherapy). Data on Ki67 status were not available, as it was not tested in all of the patients at that time.
Patients with unknown tumor size, number of positive lymph nodes, differentiation grade, or estrogen receptor (ER) status were excluded, since PREDICT does not accept missing values for these variables.
The program then produced an estimated 5- and 10-year OS for each patient. It also provided survival estimates for each treatment scenario: overall survival with no adjuvant treatment; the added benefit of adjuvant hormonal therapy alone, chemotherapy alone, or both combined; and the additional benefit of adding trastuzumab to adjuvant chemotherapy and hormone therapy.

Statistical Analysis
The area under the receiver operating characteristic (ROC) curve (AUC) was used to assess how accurately PREDICT estimated actual survival. A p-value of 0.05 was chosen as the cutoff for statistical significance: p-values below 0.05 were taken to indicate a statistically significant difference between predicted and actual survival, while larger values were considered nonsignificant. Different prognostic subgroups were also analyzed.
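To make the AUC concrete, the following is a minimal sketch of one standard way to compute it: the proportion of survivor/non-survivor pairs in which the survivor received the higher predicted survival, with ties counted as half. The function name and the six-patient data set are hypothetical and are not drawn from the study cohort; the statistical software actually used is not stated in the text.

```python
def roc_auc(predicted, survived):
    """Rank-based AUC: P(prediction for a survivor > prediction for a death).

    predicted: predicted survival probabilities, one per patient
    survived:  1 if the patient was alive at the time point, else 0
    """
    pos = [p for p, s in zip(predicted, survived) if s]
    neg = [p for p, s in zip(predicted, survived) if not s]
    if not pos or not neg:
        raise ValueError("need at least one survivor and one death")
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0   # survivor ranked above the death
            elif p == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(pos) * len(neg))


# Hypothetical toy data (not study data).
predicted = [0.95, 0.90, 0.85, 0.70, 0.60, 0.40]
survived = [1, 1, 0, 1, 0, 0]
auc = roc_auc(predicted, survived)
print(f"AUC = {auc:.3f}")
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation of survivors from non-survivors.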

Results
In this study of women with early breast cancer, the mean age at diagnosis was 49 years. Almost all of the patients were symptomatic at presentation (125, 99.2%), whereas only 1.6% of women in this study had mammographic screening-detected breast cancer. The mean tumor size at presentation was 32 mm, and 55 patients (43%) had lymph node involvement. ER was positive in about 115 patients. Data on HER2 status were not available in 121 patients; among patients with available information, HER2 was expressed in only 4. No data on Ki67 were available. Grade II tumors were found in 106 (82.8%) patients. Adjuvant chemotherapy was given to 108 (84.4%) patients, all of whom received anthracycline-based (second-generation) regimens.
Receiver-operating characteristic (ROC) analysis was used to validate the estimates produced by PREDICT, as shown in Figures 1 and 2. The area under the ROC curve (AUC) was used to evaluate the 5- and 10-year overall survival estimates.
In the entire study population, accuracy for 5-year OS was good, with an AUC of 0.787, while the AUC for 10-year OS was 0.649, as shown in Table 1. The predicted 5-year survival ranged from 63% to 98%, with a mean of 91.12% and a median of 93%. The predicted 10-year survival ranged from 40% to 95%, with a mean of 80.42% and a median of 82%, as shown in Table 2.
The predicted number of survivors after 5 years in the entire study population was 125 (97.7%) compared to 123 (96.1%) actual survivors. The difference was 1.6%, which was not significant (p=0.625), as shown in Table 3. Table 4 shows that PREDICT overestimated 5-year OS in subgroups of patients with a good prognosis, for example ER-positive and N0 disease; however, neither difference was statistically significant (p=0.625 and 0.25, respectively). It also underestimated 5-year OS for ER-negative patients, but the difference between predicted and actual survivors (-0.8%) was not statistically significant (p=0.5).
The predicted number of survivors after 10 years in the entire cohort was 77 (60.2%) compared to 81 (63.3%) actual survivors. The difference was -3.1%, which was not significant (p=0.671), as shown in Table 5. Table 6 shows that PREDICT overestimated 10-year OS in subgroups of patients with a good prognosis, for example ER-positive, T1, and N0 disease; however, none of these differences was statistically significant (p=0.76, 0.118, and 1, respectively). It also underestimated 10-year OS for ER-negative patients; in this population, the difference between predicted and actual survivors was -5.5%, which was statistically significant (p=0.016). PREDICT also underestimated 10-year OS in other poor-prognosis subgroups, for example grade III and node-positive (N+) disease; however, neither difference was statistically significant (p=0.125 and 0.405, respectively).
PREDICT underestimated 10-year OS in one age group (>35-50 years); although the difference between predicted and actual survivors was -5.5%, it was not statistically significant (p=0.162).
PREDICT accurately predicted 5-year OS in the entire study population and in all predefined subgroups. Ten-year survival was predicted quite well, although survival was underestimated in ER-negative patients. Although this difference was within the range of 5.5%, it was statistically significant.
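The cohort-level percentages reported in the Results follow from simple arithmetic on the 128 included patients. The short sketch below reproduces them from the survivor counts given in the text; the helper name `compare` is hypothetical, and the sketch does not reproduce the p-values, since the significance test used is not named in the text.

```python
N_PATIENTS = 128  # eligible patients included in the study


def compare(predicted_survivors: int, actual_survivors: int, n: int = N_PATIENTS):
    """Return predicted %, actual %, and their difference in percentage points."""
    predicted_pct = 100 * predicted_survivors / n
    actual_pct = 100 * actual_survivors / n
    return (round(predicted_pct, 1),
            round(actual_pct, 1),
            round(predicted_pct - actual_pct, 1))


five_year = compare(125, 123)   # 5-year counts reported in the text
ten_year = compare(77, 81)      # 10-year counts reported in the text
print("5-year :", five_year)
print("10-year:", ten_year)
```

A positive difference means PREDICT overestimated survival and a negative difference means it underestimated survival, matching the sign convention used in the text.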

Discussion
Generally, PREDICT performed well in estimating 5- and 10-year overall survival, with no statistically significant difference between actual and predicted survival in the entire cohort. PREDICT overestimated 5-year OS in subgroups of patients with a good prognosis, for example ER-positive and N0 disease; however, neither difference was statistically significant (p=0.625 and 0.25, respectively). It also underestimated 5-year OS for ER-negative patients, but the difference between predicted and actual survivors (-0.8%) was not statistically significant (p=0.5). As in the 5-year analysis, PREDICT overestimated 10-year OS in subgroups of patients with a good prognosis, for example ER-positive, T1, and N0 disease; however, none of these differences was statistically significant (p=0.76, 0.118, and 1, respectively). It also underestimated 10-year OS for ER-negative patients; in this subgroup, the difference between predicted and actual survivors was -5.5%, which was statistically significant (p=0.016). PREDICT also underestimated 10-year OS in other poor-prognosis subgroups, for example grade III and node-positive (N+) disease; however, neither difference was statistically significant (p=0.125 and 0.405, respectively).
This finding is consistent with the study by van Maaren et al [11], which validated PREDICT in the Dutch population and was carried out on 10,338 patients with operated, non-metastatic primary invasive breast cancer diagnosed in 2005. In the Dutch population, the AUC for 5-year OS was 0.80. The predicted number of survivors after 5 years was 7595.2 (86.0%) compared to 7723 (87.4%) actual survivors. The difference was -1.4%, which was not significant (p=0.14). In ER-positive patients, the difference between predicted and actual survivors was -0.7% (p=0.53). In ER-negative patients, the difference between predicted and actual survivors was -4.9%, which was statistically significant (p=0.02) but just within the range of 5%. For the entire cohort and the ER-positive patients, the predicted and actual 5-year OS did not differ significantly.
In the entire Dutch validation population, the AUC for 10-year OS was 0.78. The predicted number of survivors after 10 years was 6404 (72.5%) compared to 6493 (73.5%) actual survivors. The difference was -1.0%, which was not significant (p=0.27). In ER-positive patients, the difference between predicted and actual survivors was -0.1% (p=0.92). In ER-negative patients, the difference between predicted and actual survivors was -5.3%, which was statistically significant (p=0.01). Thus, for the entire cohort and the ER-positive patients, the predicted 10-year OS did not differ from the actual 10-year OS, whereas for ER-negative patients a significant underestimation was seen. PREDICT also significantly underestimated 10-year OS in T3 (-13%, p<0.01) and grade III (-3.2%, p=0.03) disease. However, the only difference outside the range of 5% was the underestimation in patients with T3 disease.
Van Maaren et al [11] concluded that PREDICT accurately predicts 5-year and 10-year OS in the overall Dutch validation population. However, both 5- and 10-year OS were underestimated for ER-negative disease.
The finding that 10-year OS was underestimated in ER-negative patients but accurately predicted in ER-positive patients is consistent with the present study, in which PREDICT underestimated 10-year OS in ER-negative patients. This may be related to the biology of ER-negative disease, which is much more aggressive and is therefore assigned worse predicted survival rates.
In a similar study in a Southeast Asian population, Wong et al [12] identified from a prospective breast cancer registry 1480 patients who underwent complete surgical treatment for stage I to III breast cancer from 1998 to 2006. In this study, the AUC for 5-year OS was 0.78. The predicted proportion of survivors after 5 years in the entire cohort was 86.3% compared to 87.6% actual survivors.
In addition, the AUC for 10-year OS was 0.73. The predicted proportion of survivors after 10 years in the entire cohort was 77.5% compared to 74.2% actual survivors. The difference was 3.3%, which was not statistically significant (p=0.12).
PREDICT was also accurate in most subgroups of patients, although in certain subgroups the program tended to overestimate survival. For example, in women younger than 40 years, PREDICT overestimated 5-year OS by 6.8% and 10-year OS by 17.2%. As in the present study, the model tended to underestimate 5-year OS in patients with ER-negative tumors. In the Southeast Asian study, however, this underestimation was statistically significant: the difference between predicted and actual events was -6.0% (p<0.001). No underestimation was reported for the prediction of 10-year survival [12].
A similar study was carried out by Engelhardt et al [13], for validation of PREDICT in a certain group of female patients with early breast cancer younger than 50 years. The study was carried out on 2710 patients with stage I-III breast cancer.
In the study by Engelhardt et al [13], only the estimate of 10-year overall survival was analyzed. The difference between predicted and actual mortality was -1.1, which was nonsignificant (p=0.28) and is consistent with the present study. PREDICT significantly underestimated all-cause mortality for patients <40 years by up to -6.6% [14]. Younger patients tend to present with more advanced-stage and more aggressive disease. Additionally, younger patients are more likely to be hormone receptor-negative. Moreover, because of a lack of awareness of the increasing incidence in the younger population, breast cancer symptoms tend to be attributed to a more benign cause without considering breast cancer as a possibility, eventually leading to a more advanced stage with a poorer outcome.
Ultimately, this tendency of PREDICT to underestimate survival in ER-negative populations makes it an unreliable tool for this subset of patients. In addition, the lack of HER2 data makes it difficult to assess the additive benefit of trastuzumab in either the actual analysis or the predicted values.
To our knowledge, this is the first study in Egypt to validate PREDICT as an online prognostication tool in women diagnosed with early-stage breast cancer.
A limitation of this study is the absence of data on cause-specific mortality, which prevents determining whether differences are due to breast cancer-specific mortality or to unrelated causes of death. Another limitation is the lack of data on Ki67, which was therefore marked as unknown for all patients. Also, nearly all the study subjects were clinically symptomatic at the time of presentation, and symptomatic cancers are more likely to present with unfavorable tumor characteristics than screen-detected cancers. Furthermore, a larger study population is needed to provide a wider database and a more powerful analysis of patients with less favorable tumor characteristics, especially ER-negative patients.
In conclusion, PREDICT is a valuable prognostication tool. It has the advantage of being a free, easily accessible online model, which is of particular benefit in developing countries with limited resources. Moreover, it can offer useful insight to help clinicians determine the appropriate treatment strategy for each patient on an individual basis.

Clinical Practice Points
Breast cancer is a major problem in Egypt. In a low-income country, managing resources in the best possible way would allow the proper therapy to be directed to each patient without unnecessary use of chemotherapy.
PREDICT is an easily accessible online program that allows clinical parameters to be integrated into clinical practice. It was validated in the UK population. Applying this program to our study subjects showed its effectiveness and its potential role in tailoring therapy in a country where access to modern molecular and genetic analysis is difficult.
Clinicians should integrate this program to guide them in decision-making as regards providing the proper therapy for women with early breast cancer in Egypt.