A Guideline for Clinicians Performing Clinical Studies with Fluorescence Imaging

Fluorescence imaging is an emerging imaging technique that has shown many benefits for clinical care. Currently, the field is in rapid clinical translation

more absorption occurs in a highly vascularized liver than in muscle tissue. Improved penetration depth can be obtained by imaging in the near-infrared (NIR) window (i.e., 750-1700 nm). This spectral region benefits from reduced scattering and the lowest absorption by tissue chromophores (e.g., haemoglobin, water). A critical note here is that the signal is heavily surface-weighted due to light attenuation in tissue (i.e., absorption and scattering), and that the spatial resolution decreases with depth due to scattering (Fig. 2) (17).

When the user is aware of the optical properties of the tissue of interest, the biochemical phenomenon or (patho)physiological process to be imaged should be specified. All possible targets, including biomarkers and phenomena/processes, should be examined to determine which is most suitable for localization or evaluation of the target tissue. For example, one can image breast cancer by visualizing nonspecific intratumoural phenomena (e.g., enhanced permeability and retention effect), a specific cell membrane-bound receptor, or a pathophysiological phenomenon in the tumour microenvironment. Methods for target selection have been reported previously (18, 19). Briefly, the potential target should be more abundant in the target tissue than in directly adjacent tissue, which benefits binding sensitivity and specificity and improves the contrast. Target expression is commonly determined by immunohistochemistry. However, it is increasingly questioned whether this is representative of the complete tumour, given tumour heterogeneity and variations in target expression over time. Data-driven methods based on genomic alterations are being studied to identify and prioritize relevant targets for clinical trials(20). In addition, many targets (e.g., cell membrane receptors) are present in a microscopically heterogeneous pattern. For solid tumours that require wide local excision, this heterogeneity does not per se impede guiding the surgeon in tumour resection, since the margin is of primary interest(21, 22, 23). In contrast, in debulking surgery procedures (e.g., glioblastoma surgery), homogeneous contrast is of clinical importance since microscopic residues should be identified in order to excise all tumour tissue(24, 25).

SELECT THE APPROPRIATE IMAGING MODALITY
When selecting FI camera systems for a clinical trial, the system's form factor must fit the expected clinical setting. For instance, tumour visualization in oral cancer can be performed using an open system, but perfusion assessment during minimally invasive surgery requires a laparoscopic system. Next, the user should be aware of the system's performance characteristics to obtain the desired imaging data, as these parameters greatly affect results(10). There are numerous parameters to consider, but one should focus on those that directly influence imaging data, such as the camera detection sensitivity to the desired tracer, depth sensitivity, field illumination homogeneity, spatial and temporal resolution, and dynamic range. The minimum requirements for these parameters should be fine-tuned for a specific imaging study, preferably in cooperation with an engineer and a physicist.

The camera detection sensitivity describes the ability of a FI camera system to detect a certain concentration of a specific contrast agent (i.e., fluorescent dye and corresponding emission wavelength). This should be determined for every combination of FI camera system and fluorescent tracer, since the system's most influential characteristic is its sensitivity to the fluorescent tracer's emission peak. Commercially available FI camera systems are equipped with very specific narrow-band optical filters. A mismatch between the optical filters and the fluorescent tracer results in a low fluorescence intensity and, because the contrast-to-noise ratio (CNR) is low, could lead to the erroneous conclusion that a fluorescent tracer (micro)dose does not accumulate in the region of interest (Fig. 3, panel B).

Depth sensitivity is the ability to measure fluorescence signal at a certain depth. This is largely dependent on the type of light (i.e., coherent or non-coherent) and the wavelength-specific penetration depth of the excitation light. Ideally, devices should evolve to account for this automatically, yet the user should be aware of it for each clinical application of interest(26). For margin assessment, the required imaging depth may vary among tumour types, since the definition of an adequate margin differs. Head and neck cancer requires a tumour-free margin of at least 5 mm, whereas for breast cancer this is at least 1 mm. Although the penetration depth of light increases with longer wavelengths (i.e., NIR versus visible spectrum), this does not automatically translate to increased measurement depth. When deeper tissues are imaged, increased scattering impairs the discrimination between target and surrounding tissue, because the CNR decreases with imaging depth (i.e., low depth sensitivity) (Fig. 2).

Field homogeneity describes how uniformly the region of interest is illuminated. Inhomogeneous field illumination can lead to over- or underestimation of the fluorescent signal throughout the field of view. Perfect field homogeneity is rarely achieved in practice, and only a few FI camera systems have implemented algorithms to improve field homogeneity. Most systems, especially endoscopic ones, have highly inhomogeneous light fields that lead to a steep intensity fall-off towards the edge of the field. The user should validate the field homogeneity prior to every imaging procedure using a calibration phantom. Inhomogeneous field illumination is not an insurmountable problem, as long as the user is aware of it and knows how to interpret and correct for it (27).

Resolution of a FI camera system is characterized by spatial and temporal resolution. The spatial resolution dictates the modality's ability to differentiate between the smallest fluorescent sources. The spatial resolution should be at most half the size of the smallest feature that has to be detected, as described by the Nyquist theorem. The temporal resolution dictates the modality's ability to detect changes in signal over time. This is of importance when a dynamic phenomenon is of interest, such as organ perfusion (e.g., semi-quantitative indocyanine green)(28).

The dynamic range greatly influences the ability to measure fluorescence signal. It is the measure of the highest and lowest amount of measurable light for a set exposure time. A camera system with a low dynamic range can measure either very high or very low signals depending on exposure time, but not both at the same time. In contrast, a camera with a high dynamic range can measure both very bright (i.e., high quantum yield) and very dim (i.e., low quantum yield) fluorescence signals (Fig. 3, panel A).
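The field homogeneity check with a calibration phantom, described above, can be illustrated with a minimal sketch. It assumes that an image of a uniformly fluorescent phantom is available at the same settings as the tissue image, and that dividing by the normalized phantom response is an acceptable correction; both are assumptions, not a method prescribed in this guideline. The function names `homogeneity_ratio` and `flat_field_correct` are hypothetical.

```python
def homogeneity_ratio(phantom):
    """Return min/max intensity of a uniform phantom image (1.0 = perfectly even field)."""
    values = [v for row in phantom for v in row]
    return min(values) / max(values)


def flat_field_correct(image, phantom, eps=1e-9):
    """Divide each pixel by the phantom response at that pixel, normalized to its peak.

    Assumes `image` and `phantom` are equally sized 2D lists acquired with
    identical camera settings, working distance and angle.
    """
    if len(image) != len(phantom) or any(
        len(a) != len(b) for a, b in zip(image, phantom)
    ):
        raise ValueError("image and phantom must have the same shape")
    peak = max(max(row) for row in phantom)
    if peak <= 0:
        raise ValueError("phantom image contains no signal")
    corrected = []
    for img_row, ref_row in zip(image, phantom):
        # eps guards against division by zero in unlit corners of the field
        corrected.append(
            [px / max(ref / peak, eps) for px, ref in zip(img_row, ref_row)]
        )
    return corrected
```

In this sketch, a pixel at the edge of the field that receives half of the peak illumination has its intensity doubled, so that equal tracer concentrations yield equal corrected values across the field of view. Noise at the edge is amplified by the same factor, which is one reason to report the homogeneity ratio alongside corrected data.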

BENCHMARKING OF FLUORESCENCE IMAGING CAMERA SYSTEMS
To compare different FI camera systems, universal standards are required for benchmarking their performance, as is common in other medical imaging modalities(29). As such, solid tissue-mimicking phantoms have been developed to characterize different FI camera systems quantitatively. Wells filled with different concentrations of nanoparticles (i.e., quantum dots) are used to measure i) camera detection sensitivity versus optical properties, ii) depth sensitivity, iii) dynamic range, iv) field homogeneity, and v) spatial resolution(27). We advise that users acquire a FI camera system with high camera detection sensitivity in combination with a high dynamic range. Also, as described above, the camera wavelength specificity and emission light sources (i.e., camera distance, incidence angle, ambient light) and processed according to a strict protocol (27,30,31). Automated log files should be constructed according to a standardized format and recorded for review purposes, safeguarding a quality management system for FI in clinical use. Ideally, these log files are archived with the patient data and imaging results, allowing for calibration in later analysis of batch data, similar to the metadata archived in DICOM images taken with radiologic imaging systems. We propose a quality management system to enable comparative multicentre clinical trials and implementation in general practice, enabling uniformity.

Additionally, FI camera systems should have the option to export raw data without interference of (undesired) image post-processing to obtain (semi-)quantitative data rather than qualitative images. However, commercial intraoperative imaging devices often opt for an underlay for the surgeon's orientation purposes, which impedes quantification(10).
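As an illustration of what a standardized, automated log file could look like, the following is a minimal sketch of one acquisition record serialized to JSON. No standard format is defined in this guideline; the class name `AcquisitionLog`, the field names and the units are all hypothetical and chosen only to mirror the parameters named in the text (camera settings, working distance, incidence angle, ambient light, raw export).

```python
import json
from dataclasses import asdict, dataclass, fields


@dataclass(frozen=True)
class AcquisitionLog:
    """One hypothetical log record per fluorescence image acquisition."""

    system_id: str              # FI camera system identifier
    tracer: str                 # fluorescent tracer used
    exposure_time_ms: float
    gain: float
    working_distance_cm: float
    incidence_angle_deg: float  # 0 = camera perpendicular to the tissue
    ambient_light: str          # free-text description of room lighting
    raw_export: bool            # True if data were stored without post-processing

    def to_json(self) -> str:
        """Serialize with sorted keys so that records are easy to compare."""
        return json.dumps(asdict(self), sort_keys=True)


def missing_fields(record: dict) -> list:
    """Return the names of required fields that are absent from a parsed record."""
    required = [f.name for f in fields(AcquisitionLog)]
    return sorted(name for name in required if name not in record)
```

A completeness check such as `missing_fields` could be run before a record is archived with the patient data, so that incomplete acquisitions are flagged at the time of imaging rather than during later batch analysis.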

FLUORESCENCE CONTRAST
Fluorescence contrast can be either endogenous (i.e., autofluorescence of intrinsic tissue compounds) or exogenous (i.e., administered fluorescent tracer)(32). Although the use of endogenous contrast has some advantages, such as inherent non-toxicity and absence of regulatory issues, we focus on the use of exogenous contrast as this has been shown to increase specificity and detection sensitivity(33). The main criteria for selecting a fluorescent tracer include efficient fluorescence light output (i.e., quantum yield), biodistribution and pharmacokinetic characteristics, signal enhancement strategies (i.e., "always-on" versus "activatable" or "smart") and regulatory approval(11). Lastly, the clinician must be aware of regulatory issues that can result in tremendous costs when designing and using new fluorescent tracers, such as intellectual property, animal toxicity studies, availability of compounds in a good manufacturing practice facility and regulatory approval(34, 35).

Generally, exogenous fluorescent tracers can be divided into targeted and non-targeted tracers. Non-targeted tracers do not bind to disease-specific biomarkers but accumulate passively in the tissue through metabolism or nonspecific uptake (e.g., enhanced permeability and retention effect in tumours). A well-known non-targeted fluorescent tracer is indocyanine green, which has Food and Drug Administration approval for tissue perfusion assessment, sentinel lymph node mapping and biliary duct visualization. As fluorescent dyes themselves are not tumour-

The detected fluorescence is dependent on different specifications of the FI camera system (e.g., exposure time, gain) in combination with the contrast, as well as variable imaging parameters of the experiment itself (e.g., working distance, incident angle and ambient light). Imaging with varying working distances substantially impacts data consistency since the measured intensity is distance-dependent (Fig. 3, panel C). Consequently, a higher fluorescence intensity is detected when the distance between the tissue of interest and the detector decreases, even when the emitted fluorescent light is the same. The camera should be perpendicular to the tissue to maximize the effective surface area of the detector (Fig. 3, panel D). When all variable imaging parameters are standardized in every FI measurement, the imaging data are reproducible and represent the tracer distribution more realistically(26). Ideally, all imaging parameters should also be recorded to allow for post hoc correction.

Although the impact of ambient light in FI is well recognized(45), it is rarely standardized or corrected for. The most common solution is to keep the ambient light to a constant minimum, as relatively few systems can deal with high ambient intensity. The choice of lighting in the operating room can be optimized, typically by minimizing NIR light. NIR light is emitted in particular by commonly used tungsten bulbs, which could simply be replaced by light-emitting diodes. Needless to say, this only reduces the problem for NIR-emitting probes such as indocyanine green.
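The post hoc correction for working distance and camera angle mentioned above can be sketched as follows. This is an illustrative simplification, not a validated correction: it assumes that the detected intensity falls off with the square of the working distance and that tilting the camera reduces the effective detection surface with the cosine of the angle. Real systems may deviate from both assumptions, so any such model should be verified on a phantom. The function name and parameters are hypothetical.

```python
import math


def normalize_intensity(measured, distance_cm, angle_deg, ref_distance_cm):
    """Rescale a measured intensity to a reference distance and a perpendicular view.

    Assumptions (simplified model, to be checked against phantom data):
    - intensity is proportional to 1 / distance**2
    - intensity is proportional to cos(angle), with angle 0 = perpendicular
    """
    if distance_cm <= 0 or ref_distance_cm <= 0:
        raise ValueError("distances must be positive")
    if not 0 <= angle_deg < 90:
        raise ValueError("angle must be in the range [0, 90) degrees")
    distance_factor = (distance_cm / ref_distance_cm) ** 2
    angle_factor = 1.0 / math.cos(math.radians(angle_deg))
    return measured * distance_factor * angle_factor
```

Under these assumptions, a measurement taken at twice the reference distance is multiplied by four, and a measurement taken at a 60 degree tilt is multiplied by two. The correction only works when the distance and angle were recorded at acquisition, which is the reason for registering all imaging parameters.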

REPORTING ON FLUORESCENCE IMAGING DATA
Apart from a standardized imaging protocol, standardized data processing, representation and reporting are necessary for the implementation of FI in standard of care. Contrary to some other imaging techniques (e.g., CT), wide-field FI does not provide quantitative data. Even when imaging parameters are standardized, variations in tissue optical properties affect the fluorescence signal. Additionally, the signal is heavily surface-weighted, meaning that anything closer to the surface will generate more fluorescence signal. These factors need to be taken into account when analysing FI data. The most used semi-quantitative unit is mean fluorescence intensity (MFI), defined as the average pixel intensity within a region of interest. Yet, reporting the MFI as an absolute and quantitative measure without a thoroughly standardized protocol can lead to incorrect conclusions. Since FI is a detection or discrimination method, relative measures (i.e., ratios) are more appropriate, as these express the relation between the target and the background. Commonly used ratios in clinical FI include tumour-to-background ratio, signal-to-background ratio and CNR(46). We advocate the use of CNR, defined as the difference between the target's MFI and the background's MFI, divided by the standard deviation of the background. Using a CNR is favourable since it is more informative on the detectability of the contrast (i.e., target) of interest(47). A high CNR indicates good discrimination between the target and background tissue. Still, the CNR is influenced by the FI camera system's dynamic range and quantum efficiency. For example, using a fluorescent tracer with a relatively high quantum yield together with two different FI camera systems, one with a low and one with a high dynamic range, may result in two very different CNRs. In other words, a FI camera system with a low dynamic range may underestimate the CNR as the signal of the tumour is limited (Fig. 3, panel A). Also, despite the seemingly straightforward definition, these quantities are prone to bias due to the strong dependency on the definition of the surrounding tissue. Ideally, the target and the background are based on the gold standard (i.e., conclusions. This may, for example, lead to erroneous tumour delineation due to scattering in margin assessment when interpreted by different clinicians. Lastly, as mentioned earlier, the FI camera system settings used must be described in detail. Reporting these settings is essential for the reproducibility of study results, as the FI camera system settings severely influence the obtained FI data.
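The MFI and CNR definitions given above can be written out as a short sketch. It assumes that the target and background regions of interest have already been delineated and are supplied as flat lists of pixel intensities; the function names are illustrative. The sample standard deviation (n - 1 denominator) is used here as an assumption, since the text does not state which estimator is intended.

```python
from statistics import mean, stdev


def mfi(pixels):
    """Mean fluorescence intensity: average pixel intensity within a region of interest."""
    return mean(pixels)


def cnr(target_pixels, background_pixels):
    """Contrast-to-noise ratio: (MFI_target - MFI_background) / SD_background."""
    if len(background_pixels) < 2:
        raise ValueError("background needs at least two pixels to estimate noise")
    noise = stdev(background_pixels)  # sample standard deviation (assumption)
    if noise == 0:
        raise ValueError("background has zero variance; CNR is undefined")
    return (mfi(target_pixels) - mfi(background_pixels)) / noise


def tbr(target_pixels, background_pixels):
    """Tumour-to-background ratio, shown for comparison with the CNR."""
    return mfi(target_pixels) / mfi(background_pixels)
```

Because both ratios depend on which pixels are assigned to the background, the same image can yield different values when the background region is drawn differently, which illustrates the bias discussed above.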

CONCLUSION
The rapidly increasing interest in FI has led to substantial improvements in the available FI camera systems and fluorescent tracers. Although FI has shown enormous potential for a variety of indications, the field has not yet achieved clinical implementation. Here, we have provided a guideline for clinicians to perform FI clinical trials (Fig. 1). The same conceptual thinking applies to other optical imaging modalities, such as laser speckle contrast imaging or spectroscopy-based techniques. Similar to the classical medical imaging field, the FI field should focus on training clinicians and supportive staff in a multidisciplinary way to better understand the underlying physics and chemistry. Still, we advise clinicians to collaborate with researchers who have experience with FI camera systems and fluorescent tracers in order to acquire, analyse and interpret the imaging data in an accurate and reproducible manner. To establish the clinical implementation of FI, phase II and III trials need to commence based on a consistent study design, imaging protocol and data analysis. By emphasizing standardization and reproducibility, the full potential of FI can be realized, and its clinical value can be proven.

The team then defines a biological target with the microscopic distribution and required penetration depth in mind. The tracer must match the target and should be selected based on the targeted/non-targeted approach, the tracer's emission peak, the tissue optical properties and the administration route. Simultaneously, the device emission and excitation filters must match the tracer's wavelength. Also, the form factor should be determined along with the desired resolution, sensitivity to light and dynamic range. Prior to every imaging procedure, phantom measurements should be obtained to evaluate performance characteristics over time. The user should set the camera settings such as exposure time, binning, gain and emission light intensity, and the data should be recorded without any pre-processing. Moreover, the camera setup should be identical in every procedure with respect to the working distance, angle of illumination and ambient light levels, to compare results across patients. After data analysis, the performance of the fluorescent tracer and imaging device combination should be reviewed based on the contrast-to-noise ratio. Images should be processed using perceptually uniform colour maps.

Absorption causes light energy to be transferred to the tissue, decreasing the light intensity. Scattering is a process of short-lived absorption of a photon (typically) without energy loss, but with a change of initial direction. Also, scattering decreases the ability to distinguish details. If there is no correction for tissue optical properties, the registered signal is qualitative rather than quantitative.

A. The contrast-to-noise ratio is strongly dependent on the dynamic range of the fluorescence imaging camera system in relation to the fluorescent tracer. When imaging tissue using a fluorescent tracer with a high quantum yield, the system with the high dynamic range would result in a higher contrast-to-noise ratio compared to the low dynamic range system. B. The fluorescence intensity detected by the fluorescence imaging camera system is dependent on the match between the system's optical filter and the emission peak of the fluorescent tracer used. A mismatch between the emission peak and optical filter will result in a suboptimal detected fluorescence intensity (wavelength A) compared to the optimum (wavelength B). C. The fluorescence intensity decreases steeply with increased working distance (approximately with the square of the distance) due to the diverging nature of light. D. When the detector is not placed perpendicular to the tissue of interest, the effective detection surface (EDS) that can detect emitted photons is smaller. As such, fluorescence intensity is falsely reduced, possibly leading to erroneous conclusions.