In Europe, an estimated 421,000 new cases of breast cancer were diagnosed in 2008, and 129,000 women died from the disease in the same year [1]. Neoadjuvant chemotherapy (NAC), initially used only for locally advanced breast cancer, is now commonly used in patients with operable but large breast cancer. This strategy allows patients to undergo breast-conserving surgery (BCS) and gives information on the efficacy of chemotherapy [2]. Long-term outcomes are significantly correlated with pathological tumour response rates [3]. In this paper we focus on the potential role of early evaluation of the response to NAC with 18F-FDG PET/CT in patients with operable breast cancer.

What are the established benefits of NAC in patients with operable breast cancer?

NAC increases the chances of performing BCS instead of mastectomy in patients with a large tumour (usually >3 cm) or an unfavourable tumour/breast size index. In the National Surgical Adjuvant Breast and Bowel Project (NSABP) B-18 randomized trial, the BCS rate in the group receiving NAC with four cycles of doxorubicin and cyclophosphamide (AC) was significantly higher than in the group receiving the same chemotherapy postoperatively (67.8 vs 59.8%; p = 0.002) [2, 3].

With NAC, there is also hope that earlier initiation of systemic therapy could treat occult micrometastases more effectively. However, an overall survival benefit for NAC compared with adjuvant chemotherapy in operable breast cancer has not been clearly proven, although subset analyses suggest a benefit restricted to some groups of patients [3]. Nevertheless, NAC is an excellent setting in which to document the response of the tumour to the administered chemotherapy. Absence of residual cancer cells in the primary tumour following NAC is strongly associated with improved disease-free survival (DFS) and overall survival (OS) [3]. Pathological complete response (pCR) after completion of NAC occurs in 13–26% of patients [2, 4]. The percentage is lower in series requiring a pCR not only in the primary tumour but also in the axillary lymph nodes [5].

Why would early prediction of response with imaging techniques be useful?

Predicting the final pathological response early during NAC might offer an opportunity to change strategy when treatment is ineffective. However, breast cancer is a heterogeneous class of tumours that differ in biological profile, treatment possibilities and outcome. Therefore, recognizing early that the administered chemotherapy is ineffective might lead to different decisions depending on individual patient and tumour characteristics.

Several studies examining a possible role for metabolic evaluation with 18F-FDG PET have shown a correlation between early changes in the maximum standardized uptake value (SUVmax) (after one or two courses of chemotherapy) and the final pathological response after completion of NAC [6–14]. However, the ability to implement early 18F-FDG PET as a surrogate marker for treatment efficacy in clinical practice remains unclear because of substantial heterogeneity across studies. Below, we discuss the reasons for this heterogeneity, propose criteria for homogenization and set out specific clinical aims by patient category.

Heterogeneity in defining responders

In every study, a threshold value for the decrease in SUV (ΔSUV) has been proposed to discriminate metabolic responders (a decrease in SUV greater than the threshold) from non-responders. The chosen cutoff is intended to best predict the final pathological response. Unfortunately, the specific threshold proposed varies dramatically across studies (Table 1).
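The responder rule described above can be made concrete with a short calculation. The sketch below is only an illustration and not the method of any cited study: it assumes that ΔSUV is the relative decrease in SUVmax from baseline, expressed as a percentage, and that a decrease equal to the threshold counts as a response. The function names and the example SUV values are hypothetical.

```python
def delta_suv_percent(baseline_suv: float, interim_suv: float) -> float:
    """Relative decrease in SUVmax from baseline, in percent.

    Assumption: delta SUV = 100 * (baseline - interim) / baseline.
    A negative result means that uptake increased.
    """
    if baseline_suv <= 0:
        raise ValueError("baseline SUVmax must be positive")
    return 100.0 * (baseline_suv - interim_suv) / baseline_suv


def is_metabolic_responder(baseline_suv: float,
                           interim_suv: float,
                           threshold_percent: float) -> bool:
    """True when the SUVmax decrease reaches the chosen threshold.

    Assumption: the boundary is inclusive (decrease >= threshold).
    """
    return delta_suv_percent(baseline_suv, interim_suv) >= threshold_percent


# Hypothetical patient whose SUVmax falls from 8.0 to 4.0 (a 50% decrease).
decrease = delta_suv_percent(8.0, 4.0)
responder_at_40 = is_metabolic_responder(8.0, 4.0, 40.0)
responder_at_55 = is_metabolic_responder(8.0, 4.0, 55.0)
print(decrease, responder_at_40, responder_at_55)
```

With these hypothetical values the same patient is a responder at a 40% threshold but not at a 55% threshold, which is why the choice of cutoff matters.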

Table 1 Studies evaluating the decrease of SUVmax during NAC for breast cancer with FDG PET(/CT)

Rousseau et al. found that a ΔSUV of at least 40% after two courses of chemotherapy was the optimal cutoff for separating responders from non-responders early [8]. Two other teams found an optimal cutoff value of 55% after two courses [7, 11]. Berriolo-Riedinger et al. calculated an optimal ΔSUV of 60% from baseline after one course [10]. The major factor that could explain such differences in the SUV thresholds is the lack of a consensus definition of histopathological response (Table 1).

Histopathology after completion of NAC is the reference standard for assessing response [15]. However, a number of different scales are used to evaluate the pathological response after surgery (the Feldman classification, the Chevallier scale, the Sataloff scale, the Honkoop criteria, the Miller-Payne classification, the index described by Symmans, etc.). The scales most frequently used in PET studies are the Sataloff scale, the Honkoop criteria and the Miller-Payne classification [16–18]. These scales are described in Table 2. Not only are different pathological classifications used in PET studies, but authors using the same classification have also grouped patients differently to define pathological responders.

Table 2 The three most frequent pathological scales used to define response in the studies investigating the role of PET in predicting early response

In the study by Rousseau et al., pathological responders were defined as those with tumour regression greater than 50% (Sataloff grades T-A and T-B). On the basis of this definition, Rousseau et al. found an optimal cutoff value of 40% after two courses. In a preliminary analysis of 55 patients at Saint-Louis Hospital, we also found an optimal threshold of 40% when using the same definition of response as Rousseau et al. (Sataloff grades T-A and T-B) [19]. However, although shrinkage of 50% from baseline may be a satisfactory objective for allowing BCS, it is not sufficient to improve OS. Indeed, clinical studies have demonstrated that only patients with complete response (pCR) or minimal residual disease (pMRD) had significantly higher DFS and OS rates [2, 15, 17].

Schwarz-Dose et al. used the Honkoop scale and sought to determine optimal PET criteria for identifying the patients who will achieve no residual invasive tumour (pCR) or only a few scattered foci of microscopic residual tumour (pMRD), both taken as indicators of a satisfactory pathological response. This situation may also correspond to a Sataloff T-A response with minor differences. Schwarz-Dose et al. therefore aimed to predict a more pronounced response than Rousseau et al. did. This could explain why the optimal cutoff value for ΔSUV was more stringent in their study (55% for Schwarz-Dose et al. vs 40% for Rousseau et al.). With criteria similar to those of Schwarz-Dose et al., Schelling et al. also found an optimal decrease of 55% after two courses [7].

The previous studies focussed on response in the primary breast tumour only. However, it is important to note that, for most clinicians, a pCR should concern not only the primary tumour but also the axillary lymph nodes [5, 20]. Berriolo-Riedinger et al. considered as responders the patients who presented a total or near-total therapeutic effect in the primary tumour (Sataloff T-A) and no residual nodal disease (Sataloff N-A/B). It is therefore not surprising that the SUV cutoff was even more stringent (ΔSUV of 60% after one course). Thus, differences in the SUV thresholds across studies are consistent with differences in the pathological criteria used to define response.
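The effect of these differing cutoffs can be illustrated with a small lookup. The sketch below uses only the thresholds and interim time points quoted in the text. The dictionary and function names are hypothetical, the inclusive boundary is an assumption, and applying a single ΔSUV value to every study is a simplification, since Berriolo-Riedinger et al. measured the decrease after one course rather than two.

```python
# Cutoffs quoted in the text: (minimum percentage decrease in SUVmax,
# number of chemotherapy courses given before the interim PET).
STUDY_CUTOFFS = {
    "Rousseau et al.": (40.0, 2),
    "Schelling et al.": (55.0, 2),
    "Schwarz-Dose et al.": (55.0, 2),
    "Berriolo-Riedinger et al.": (60.0, 1),
}


def classify_by_study(delta_suv_percent: float) -> dict:
    """Return, for each study, whether this decrease meets its cutoff.

    Assumption: a decrease equal to the cutoff counts as a response.
    The timing of the interim scan is ignored here (a simplification).
    """
    return {
        study: delta_suv_percent >= cutoff
        for study, (cutoff, _courses) in STUDY_CUTOFFS.items()
    }


# Hypothetical patient with a 50% decrease in SUVmax.
for study, responder in classify_by_study(50.0).items():
    print(f"{study}: {'responder' if responder else 'non-responder'}")
```

A hypothetical 50% decrease meets only the 40% cutoff, so the same patient would be labelled differently depending on the study, which reflects the different pathological endpoints each cutoff was tuned to predict.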

Defining homogeneous groups of pathological response that might serve clear clinical aims

To determine a reproducible optimal decrease in SUV, it is crucial to define consensus criteria of pathological response, so that different researchers can use the same pathological scale. The American Joint Committee on Cancer (AJCC) recommends separating three degrees of response to NAC (complete, partial and no response) and taking into account the response not only in the primary tumour but also in the axillary lymph nodes [21].

We think the histopathological scale used to define response after primary chemotherapy should allow three groups to be identified:

  1. Group 1 could represent patients with pCR, defined as no evidence of residual invasive cancer in either the breast or the axilla [21, 22]. The presence or absence of residual ductal carcinoma in situ after preoperative therapy does not influence long-term DFS or OS. For this reason, pCR should not require the absence of residual ductal carcinoma in situ [21, 22]. Factors associated with a higher likelihood of pCR include tumour size, histology (ductal > lobular), tumour intrinsic subtype (basaloid or HER2 > luminal), hormone receptor status [oestrogen receptor (ER)-negative > ER-positive] and grade (high > low) [22].

  2. Group 2 represents patients with partial response (PR) who still harbour residual invasive cancer cells. On the Sataloff scale, this group corresponds to patients classified as T-A N-C or T-B N-A/B/C (Table 2).

  3. Group 3 comprises patients with a poor response, no response or progression. On the Sataloff scale, this group corresponds to patients classified as T-A/B N-D and T-C/D (Table 2).
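The Sataloff mapping given for groups 2 and 3 can be sketched as a small function. This is an illustration only, with a hypothetical function name. One assumption needs to be stated clearly: the text defines group 1 by the absence of residual invasive cancer, not by Sataloff grades, so assigning the remaining combination (T-A with N-A or N-B) to group 1 is an approximation, because T-A also covers a near-total therapeutic effect.

```python
VALID_GRADES = {"A", "B", "C", "D"}


def response_group(t_grade: str, n_grade: str) -> int:
    """Map Sataloff primary-tumour (T) and nodal (N) grades to groups 1-3."""
    t, n = t_grade.upper(), n_grade.upper()
    if t not in VALID_GRADES or n not in VALID_GRADES:
        raise ValueError("Sataloff grades must be A, B, C or D")

    # Group 3: T-C/D (any N) or T-A/B with N-D.
    if t in {"C", "D"} or n == "D":
        return 3

    # Group 2: T-B with N-A/B/C, or T-A with N-C.
    if t == "B" or n == "C":
        return 2

    # Remaining case: T-A with N-A or N-B.
    # ASSUMPTION: treated as group 1; true pCR must still be confirmed
    # by the absence of residual invasive cancer in breast and axilla.
    return 1


print(response_group("A", "B"))  # T-A N-B
print(response_group("B", "C"))  # T-B N-C
print(response_group("A", "D"))  # T-A N-D
```

Checking group 3 first keeps the remaining rules short, because every later branch can assume T-A/B and N-A/B/C.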

In all cases, whatever the tumour type, NAC will be considered unsatisfactory in group 3, and early prediction should lead to a decision for immediate surgery or a change of therapy. For patients with intermediate response (PR), the aim might vary according to the tumour type. For example, in a patient with an ER-positive tumour, an intermediate response with tumour shrinkage allowing BCS might be considered a reasonable objective. Indeed, obtaining pCR with chemotherapy is rare in these patients [20, 22–24], and treatment will be completed after surgery by adjuvant hormone therapy. However, in patients with triple-negative tumours, a partial response is unsatisfactory.

We strongly believe that criteria of response should take into account tumour characteristics. Breast carcinoma is not a single entity. Gene expression profiling has led to the identification of different molecular breast cancer subtypes (luminal-A, luminal-B, HER2, basal and normal-like), all differing in terms of gene expression, genome alterations, clinical characteristics and outcome [25]. In current clinical practice, immunohistochemistry can also be used to define subgroups with different therapeutic responses and outcomes. We will discuss three broad groups separately: (1) ER-positive tumours with no HER2 overexpression, (2) HER2-positive tumours (which may be ER-positive or ER-negative) and (3) triple-negative tumours, in which ER, progesterone receptor and HER2 are all negative.

ER-positive tumours with no HER2 overexpression

At baseline, as demonstrated in recent studies, most hormone receptor-positive tumours show low FDG uptake [11, 26]. The chemosensitivity of ER-positive tumours is variable and mostly limited; pCR is rarely obtained in this group [20, 22–24]. In the GeparTrio trial, pCR was obtained in 10% of the patients with hormone receptor-positive tumours as compared to 43.2% of those with hormone receptor-negative tumours [20].

Because the rate of histological complete response is low in ER+ patients, factors that could predict chemosensitivity are needed. PET could be one of these factors. In the large series from Schwarz-Dose et al., an initial low uptake in ER+ tumours was predictive of poor response to NAC [11].

HER2 overexpression

The advent of antibody treatment with trastuzumab, which targets the HER2 receptor, has been a breakthrough. In patients with HER2 overexpression, chemotherapy combined with trastuzumab is effective [27, 28]. In the study by Buzdar et al., the pCR rate was 65% in the patients receiving trastuzumab in addition to anthracyclines and taxanes and 26% in patients who did not receive trastuzumab [27]. Identifying the small subset with poor or no response (i.e. group 3 or Sataloff grades T-C/D or N-D) might be important for improving outcome by switching to a different chemotherapy, to another targeted therapy (e.g. lapatinib) and/or to a combination of targeted therapies [29].

Triple-negative tumours

Almost 15% of breast cancers have a triple-negative phenotype. In contrast to the two preceding groups, no targeted therapy is currently available for triple-negative tumours. The rate of disease recurrence is high in this group despite high chemosensitivity. The poorer prognosis of triple-negative breast cancers could be explained by a higher likelihood of relapse in those patients in whom pCR is not achieved [30]. Therefore, when early PET does not point to the possibility of obtaining a complete response, it would be necessary to enrol the patient in a trial of a new therapy (e.g. PARP inhibitor or antiangiogenic therapy) or of dose-dense chemotherapy [31, 32].

Standardization of the method of evaluation

To define the place of PET in evaluating treatment, especially in the neoadjuvant setting of breast cancer, standardization is required.

Standardization in the preparation of patients and in the acquisition of images

For a given patient, the two PET examinations must be done in the same centre with the same instrumentation, and the same methodology must be applied. Rigorous calibration should be performed by the radiation physicist. The time between injection and acquisition needs to be the same. It is important to verify the absence of activity at the injection site (the SUV is underestimated in cases of extravasation of 18F-FDG). Preparation procedures and instrumental factors (resolution of the machine, method of attenuation correction, reconstruction algorithm, etc.) can introduce slight differences in the measurement of SUV and need to be controlled [33].

Optimal date for the interim PET

McDermott et al. performed 18F-FDG PET at baseline, after one cycle, at midpoint and at endpoint [9]. They suggested that the most prudent time to evaluate chemotherapy response was between the end of the first cycle and the midpoint of chemotherapy. Three teams suggested that the best timing to perform the early evaluation was immediately before the third cycle [7, 8, 14]. In our institution, we also perform the interim PET after two courses of NAC [19]. In the study by Schwarz-Dose et al., the same results were obtained irrespective of whether PET was performed after the first course or after the second cycle of chemotherapy [11].

Across the different studies, the SUV decreases rapidly during the first part of treatment; although the decrease in SUVmax continues until the end of chemotherapy, the curve tends to flatten out. Performing PET after the second course might be a good compromise and would still allow an early change of therapy in cases of inefficacy.

Limits of evaluation with PET

An important limitation is that the pretreatment SUV must be high in order to detect a meaningful reduction during treatment. Low-contrast tumours are more difficult to distinguish from background tissues and are more affected by imaging imprecision. This requirement limits the use of PET in patients whose tumours have low initial FDG uptake, which is the case for invasive lobular carcinoma (ILC) [13, 26].

ILC is the second most common histological type of breast cancer (almost 15%) after ductal carcinoma (almost 80%). ILC is well known to show weak FDG uptake [26], and PET might not be suitable for early evaluation in this subtype. The chemosensitivity of lobular carcinoma is low [5, 20]. In the GeparTrio trial, pCR was obtained in 24% of patients with an invasive ductal carcinoma and in only 10.2% of patients with an ILC [20]. Moreover, patients thought to be candidates for BCS after NAC often require repeat surgery because of involved margins [5]. Some authors question the use of NAC in lobular carcinoma [5].

Well-differentiated steroid receptor-positive tumours can sometimes also be a source of low FDG uptake. In the study by Schwarz-Dose et al., 24 patients (23%) with initial tumour SUV < 3 were excluded from further evaluation of response with FDG PET [11]. It is important to note that in this study, the low initial SUV was a predictive factor for chemoresistance: none of the 24 breast carcinomas with a baseline SUV less than 3.0 achieved pCR. Interestingly, these tumours with an SUV less than 3.0 were often steroid receptor positive [11].

Conclusion

PET seems to be a good predictive factor for chemosensitivity in most studies evaluating the early response to primary chemotherapy in breast cancer. However, the ability to implement early PET as a surrogate marker for treatment efficacy in clinical practice remains unclear because of substantial heterogeneity across studies. A consensus definition of histological responder is the first requirement for developing the use of PET. Breast carcinoma is not a single entity, and criteria of response should take into account tumour characteristics and subtypes of breast carcinoma. Finally, with a view to using PET in current practice, standardization of procedures and of the timing of evaluation is needed.