Test-retest reproducibility of cardiac magnetic resonance imaging in healthy mice at 7-Tesla: effect of anesthetic procedures

Cardiac magnetic resonance (CMR) has emerged as a powerful tool for in vivo assessment of cardiac parameters in experimental animal models of cardiovascular diseases, but its reproducibility in this setting remains poorly explored. To address this issue, we investigated the test-retest reproducibility of preclinical CMR at 7 Tesla in healthy C57BL/6 mice, including an analysis of the impact of different anesthetic procedures (isoflurane or pentobarbital). We also analyzed the intra-study reproducibility and the intra- and inter-observer post-processing reproducibility of CMR images. Test-retest reproducibility was high for left ventricular parameters, especially with the isoflurane anesthetic procedure, whereas right ventricular parameters and deformation measurements were less reproducible, mainly due to physiological variability. Post-processing reproducibility of CMR images was high both within and between observers. These results highlight that anesthetic procedures might influence CMR test-retest reproducibility, an important ethical consideration for longitudinal studies in rodent models of cardiomyopathy to limit the number of animals used.

Heart failure resulting from ischemic or non-ischemic cardiomyopathy is one of the most important causes of death worldwide. In the past two decades, many animal models have been developed to explore heart failure and cardiovascular diseases. New cardiac imaging technologies that permit non-invasive assessment of cardiac function have opened the field of longitudinal analysis of functional changes after therapeutic interventions in these various animal models. Cardiac magnetic resonance (CMR) offers a high degree of intrinsic tissue contrast from tissue relaxation times and bulk flow that is used to obtain volumetric data on heart chambers, myocardial mass, global and regional function, myocardial strain, perfusion and tissue characteristics 1,2. In humans, due to its high reproducibility, CMR is considered a gold standard for non-invasive assessment of both left and right ventricular function 3,4 and left ventricular strain 5. However, CMR in small animals remains challenging and not well standardized 1. The small masses of the mouse body and heart (respectively 20-40 g and 50-135 mg in normal mice) and the high cardiac and respiratory rates of anesthetized animals raise several technical issues. Using dedicated high-field small animal systems, CMR allows dual cardiac and respiratory gated imaging of a fast-beating mouse heart 1. The recent availability of self-gated cardiac imaging 6 has dramatically simplified data acquisition of high temporal resolution images covering the entire ventricular volume. However, even though recent investigations have suggested excellent intra- and inter-observer reproducibility of cardiac function and left ventricular mass measurement using self-gated CMR, longitudinal reproducibility remains poorly investigated 7.
While the feasibility of CMR tagging has been demonstrated in mice for evaluating regional myocardial wall strain 8,9, this approach is based on spatial modulation of magnetization applied to a single-slice ECG-gated fast 2D gradient echo sequence and may be compromised by limited temporal resolution.
Finally, anesthetic drugs and the depth of anesthesia may lead to complex cardiovascular effects through the modulation of heart and respiratory rates or through direct cardiovascular effects, which may potentially impact cardiac functional assessment.
The aim of this study was to assess the reproducibility of CMR in a preclinical setting. We evaluated the inter-/intra-study and inter-/intra-observer reproducibility of CMR in healthy mice using a standardized imaging protocol with two different anesthetic procedures.

Results
Physiological measurements. Twenty mice were employed (ten mice in each anesthetic group).
Physiological parameters are summarized in Table 1. Mean body weight was not different between the isoflurane (group IF) and pentobarbital (group P) groups during the first CMR exam (CMR1) (27.9 ± 0.84 and 28.4 ± 1.41 g, respectively). Mean body weight was stable in both groups IF and P at the second CMR exam (CMR2) one week later. In group IF, mean heart rate (HR) and breath rate (BR) were also not significantly different between CMR1 and CMR2. Conversely, in group P, mean HR was stable between examinations, but mean BR was higher during CMR2 than during CMR1 (58 ± 17 compared to 48 ± 8 inspirations/min, respectively; p = 0.0155). HR was significantly higher and BR significantly lower in animals in group IF compared with group P during both CMR examinations (Table 1). HR and BR profiles throughout the CMR examinations are presented in Fig. 1. HR was higher in animals from group IF at all time points (Fig. 1). Intra-study HR and BR variability (expressed as intra-study standard deviation) were not different between groups (Table 1).
Test-retest reproducibility. In group IF, test-retest reproducibility was high, with a low coefficient of variation (COV) between CMR1 and CMR2 for left ventricular (LV) function parameters that ranged from 5.36 ± 3.62 to 12.97 ± 9.64%. However, the measurements of right ventricular (RV) and strain parameters were less reproducible, except for right ventricular ejection fraction (RVEF) (test-retest COV of 15.96 ± 9.74%). In group P compared with group IF, test-retest COV was significantly higher for left ventricular end diastolic volume (LVEDV), left ventricular end systolic volume (LVESV), left ventricular ejection fraction (LVEF) and right ventricular end systolic volume (RVESV). In this group, the reproducibility threshold was only reached for left ventricular mass (LVM), LVEDV and LVEF (Table 2). Bland-Altman graphs also showed wider limits of agreement for group P (Fig. 2). In addition, intra-class correlations (ICCs) were only significant for LVEDV, LVESV and LVEF in group IF, whereas no ICC was significant in group P (Table 2). The results of pooled mean data from CMR1 and CMR2 were compared between the IF and P groups for all CMR parameters (Table 2).
Table 1. Physiological parameters during cardiac magnetic resonance. Mean heart rate (HR), intra-study HR standard deviation (SD), mean breath rate (BR), intra-study BR SD and body weight during first (CMR1) and second (CMR2) cardiac magnetic resonance exams are presented for both anesthetic procedures (IF: isoflurane; P: pentobarbital). Data are the mean (SD).
Intra-study reproducibility of myocardial strain. For both groups IF and P, the intra-study COV of parameters from the two tagged sequences ranged from 15.43 ± 10.67 to 31.48 ± 25.37%, with no inter-group difference. The reproducibility threshold was reached only for circumferential strain with both anesthetic procedures (Table 3). The ICCs were in line with this poor reproducibility, as no ICC value reached significance (Table 3).
Intra-observer reproducibility. Reproducibility of intra-observer post-processing was good, with COV below 20% for all measured parameters and positive and significant ICC values for all parameters except LVM and right ventricular end diastolic volume (RVEDV) (Table 5).
Table 2. Inter-study reproducibility as evaluated by the absolute difference, coefficient of variation (COV) (expressed in %) and intraclass correlation (ICC) between results from the first (CMR1) and second (CMR2) cardiac magnetic resonance exams, during isoflurane (IF) and pentobarbital (P) anesthesia. Data are the mean (SD) except for ICC [95% confidence interval] (p). † Comparison of inter-study COV between groups IF and P. § Indicates CMR parameters that are different (p < 0.05) between groups IF and P (pooled data from CMR1 and CMR2).

Discussion
In this preclinical CMR study, we observed high test-retest reproducibility for left ventricular volumes and function with isoflurane anesthesia. Reproducibility was lower with pentobarbital anesthesia as well as for right ventricular parameters and ventricular strain analysis. The intra-study reproducibility assessed for tagged sequences was overall poor with both anesthetic procedures. Conversely, post-processing reproducibility, as evaluated by inter- and intra-observer comparisons, was good, with COV below 20% and positive and significant ICC values for most parameters assessed.
Table 3. Intra-study reproducibility as evaluated by the absolute difference, coefficient of variation (COV) (expressed in %) and intraclass correlation (ICC) between results from the first (seq. 1) and second (seq. 2) tagged sequences within the same cardiac magnetic resonance exam during isoflurane (group IF) and pentobarbital (group P) anesthesia. Data are the mean (SD) except for ICC [95% confidence interval] (p). *Comparison of intra-study difference between groups IF and P.
Table 4. Inter-observer reproducibility as evaluated by the absolute difference, coefficient of variation (COV) (expressed in %) and intraclass correlation (ICC) between results from the first (observer1) and second (observer2) observers who post-processed the same CMR examinations. Data are the mean (SD) except for ICC [95% confidence interval] (p value).
Our results highlight the influence of anesthesia on small-animal imaging of the heart. Differences in the cardiovascular effects of various anesthetic drugs are well known. Pentobarbital is a common short-acting barbiturate used for rodent anesthesia that induces different sleep times depending on mouse strain, age, and sex 10. It produces marked respiratory depression and heterogeneous cardiovascular effects depending on the animal species, dose used and expected duration of anesthesia. Indeed, two rat studies by Redfors et al. 11 and Stein et al. 12 showed that HR remained stable from baseline after anesthesia with pentobarbital 25-30 mg/kg IP, whereas HR was depressed with other anesthetic agents (isoflurane, ketamine/xylazine). In these two studies, left ventricular volumes were lower in animals anesthetized with pentobarbital (i.e., those with higher HR) than in animals anesthetized with isoflurane (i.e., those with depressed HR). By contrast, two other studies performed in mice showed that a higher dose of pentobarbital, 50-70 mg/kg IP, induced depressed HR from baseline, with a more pronounced effect with longer duration of anesthesia 13,14. Our findings are in agreement with these latter studies: a dose of pentobarbital 60 mg/kg IP, which is needed to maintain a sleep time of at least 60 min 10, was associated with significantly lower HR compared with animals in the isoflurane group. In addition, mice in the pentobarbital group displayed significantly higher LVEDV than mice from the isoflurane group (62.3 ± 15.1 and 54.1 ± 6.2 µL, respectively; p < 0.05) (Table 2).
Taken together, these data suggest that pentobarbital may have various cardiac effects, with higher pentobarbital doses and longer anesthesia leading to an HR drop, which favors greater diastolic ventricular filling and, consequently, left ventricular enlargement. A similar mechanism was suggested by Kober et al. in a study demonstrating that CMR measurements of left ventricular volumes were biased in mice anesthetized with ketamine/xylazine: this common anesthetic combination induced bradycardia and increased preload, resulting in higher left ventricular volumes compared with isoflurane anesthesia 15. In addition to these dose-related effects, a variable cardiovascular impact of pentobarbital was also observed during long anesthesia, with a two-slope blood pressure pattern (initial decrease after anesthesia induction and secondary increase during anesthesia maintenance). In addition, the vasomotor effects of pentobarbital were less predictable (vasodilatation for some animals and vasoconstriction for others), with higher inter-individual variability of this parameter, compared with isoflurane 13. These inconsistent results might be at least partially attributable to the low reproducibility of anesthesia depth, as suggested by the inter-study variation of BR observed in the present study, despite the use of identical doses of this drug. In addition, the cardiac effects of pentobarbital may vary depending on whether the gas inhaled during anesthesia is ambient air or a mixture, such as O2 + N2 or O2 + N2O 16.
Table 5. Intra-observer reproducibility as evaluated by the absolute difference, coefficient of variation (COV) (expressed in %) and intraclass correlation (ICC) between results from the first (analysis1) and second (analysis2) post-processing of data from the same CMR examinations by the same observer. Data are the mean (SD) except for ICC [95% confidence interval] (p value).
Isoflurane, the leading drug for inhaled anesthesia, is increasingly used in research settings (especially in the cardiovascular field) due to its reputation of neutrality with respect to cardiovascular parameters 17. However, isoflurane may cause more pronounced respiratory depression than pentobarbital, as observed in the present study and by others 11. In addition, some authors have reported vasodilatation properties of this halogenated ether, with potential impacts on blood flow 18,19. Iltis et al. also observed this effect and demonstrated that isoflurane anesthesia was associated with increased myocardial blood flow in Wistar rats compared with pentobarbital 16. However, accumulating evidence favors the use of isoflurane in rodent cardiac imaging studies, such as a preclinical report highlighting the superiority of isoflurane anesthesia based on lower inter-subject variability of CMR left ventricular parameters compared with intraperitoneal anesthesia with the MMF combination (medetomidine, midazolam and fentanyl) 20. Our results provide additional evidence for the use of isoflurane to reduce the test-retest variability of left ventricular volume assessment when two CMR examinations are performed one week apart in the same healthy animals.
Conversely, in our study, test-retest reproducibility of strain analysis, as measured by tissue tagging imaging, was not influenced by the anesthetic procedure and was low overall. However, tissue tagging remains the reference method for strain analysis in humans due to its high reproducibility, with test-retest COV below 10% for circumferential strain and below 20% for radial strain in healthy adults 5. Similar results were obtained in patients with various pathological conditions in whom radial strain was less reproducible than circumferential strain 21. Test-retest strain reproducibility with tissue tagging appears to be higher in humans compared with our mouse results. This difference might be explained by the higher temporal resolution of tagged sequences in humans, which is inversely correlated with heart rate. Indeed, up to thirty tagged images were acquired within the R-R interval for patients whose heart rate was 60-80 bpm, whereas in our study, given an approximate heart rate of 400 bpm in mice, a maximum of 14 images were acquired 5,21. In addition to this insufficient temporal sampling, the number of images per cardiac cycle varied between two examinations or even within an exam session, depending on the mouse heart rate and the maximum number of images that could be acquired within one R-R interval. This might at least partly explain the test-retest and intra-study variability that we observed in our study for myocardial strain.
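The sampling argument above can be made concrete with a little arithmetic. The sketch below is illustrative only: it assumes frames are spread evenly over one R-R interval, the helper names are hypothetical, and the heart rates and frame counts are simply those quoted in the paragraph.

```python
def rr_interval_ms(heart_rate_bpm: float) -> float:
    """Duration of one cardiac cycle (R-R interval) in milliseconds."""
    return 60_000.0 / heart_rate_bpm


def cycle_percent_per_frame(frames_per_cycle: int) -> float:
    """Share of the cardiac cycle covered by each frame, in percent.

    Assumes evenly spaced frames (an assumption, not stated in the text).
    """
    return 100.0 / frames_per_cycle


if __name__ == "__main__":
    # (label, heart rate in bpm, tagged frames per R-R interval) from the text
    cases = [("mouse", 400, 14), ("human", 60, 30), ("human", 80, 30)]
    for label, hr, frames in cases:
        print(
            f"{label}: HR={hr} bpm, R-R={rr_interval_ms(hr):.0f} ms, "
            f"{frames} frames, {cycle_percent_per_frame(frames):.1f}% of cycle per frame"
        )
```

Expressed as a share of the cardiac cycle, each of 14 frames covers about twice as much of the cycle as each of 30 frames, which is the sense in which the mouse acquisition samples the cycle more coarsely.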
The implication of CMR acquisition parameters rather than post-processing issues is further supported by the high inter- and intra-observer reproducibility we observed for left ventricular parameters: analysis and re-analysis of the same data set by the same or another observer showed 5-10% COV. This result is consistent with Schneider et al., who reported a 2-11% COV for inter- and intra-observer reproducibility after 11.2-T CMR mouse exams 22. In our study, inter-/intra-observer reproducibility was lower for right ventricular and radial strain assessment, with an increase in the COV to 20-30%. The lower reproducibility of radial strain has been discussed previously 5,21. Regarding right ventricular issues, there is a paucity of preclinical data, but a previous study showed lower reproducibility for RV evaluation compared with LV, given the complex geometry of the right heart and the challenging delineation of its cavities 23. However, even for strain and RV parameters, ICC values were positive and significant for inter- and intra-observer evaluation, suggesting the robustness of CMR post-processing in our process.
We acknowledge that our work has several limitations. First, we used only the sequences and software available at our research facility, without comparison with other methods. Ventricular function was assessed using a self-gating sequence (IntraGate®) based on a modified retrospectively gated sequence, which can introduce several biases with respect to how this process works, its validation and how the operator uses it. In addition, for post-processing, we used one software package for ventricular cavities (Segment) and another for myocardial strain (OsiriX-InTag). The latter relies on local sine-wave modeling (SinMod) technology. A comparison between the results obtained with this software and with harmonic phase analysis (HARP) software might provide interesting insights regarding the poor strain reproducibility of our study. Such a comparison was reported previously and demonstrated that inter- and intra-observer variability were significantly higher with HARP compared with SinMod software, highlighting a potential role of the post-processing tool in reproducibility evaluation 24. Another limitation of this work is the absence of standardized temporal sampling for strain sequences, secondary to physiological constraints, especially heart rate variability. Systematic use of the same temporal resolution might have decreased the variability of the results.
However, our study has several strengths. This study is the first CMR preclinical study to systematically explore inter-/intra-study and inter-/intra-observer reproducibility, thus permitting the differentiation of limits related to the acquisition or post-processing of CMR images. In addition, no previous study has explored the differential effects of two anesthetic drugs on test-retest reproducibility, an important question for CMR, which is often used in a longitudinal approach. Finally, our research provides data sets for CMR in healthy mice.
In conclusion, preclinical CMR presented high reproducibility for the assessment of left ventricular function under isoflurane anesthesia. Right ventricular function and myocardial strain assessment were less reproducible, mainly due to physiological variability, whereas post-processing appeared robust.

Methods
Animals. All animal procedures conformed with the guidelines from Directive 2010/63/EU of the European Parliament on the protection of animals used for scientific purposes, and specific French laws were followed. All investigations and procedures were approved by the regional animal ethics committee (Cenomexa 054 -n°03854.01). Experiments were performed in 18-22-week-old male C57BL/6 mice weighing 25-30 g at baseline (Janvier Labs, Le Genest St Isle, France), which were housed individually in a temperature-controlled room with ad libitum access to standard mouse chow and water. Animals were randomized into two groups according to the anesthetic procedure applied to perform cardiac magnetic resonance: group IF (isoflurane anesthesia) and group P (pentobarbital anesthesia).
Anesthetic procedures during CMR. Group IF: Anesthesia was induced and maintained with isoflurane (Baxter SAS, Maurepas, France) at concentrations of 5% and 1.5-3%, respectively. Isoflurane was delivered in an oxygen/nitrous oxide mix (0.6 L/min) via spontaneous breathing using a precision vaporizer. Isoflurane rate was titrated throughout the exam to maintain a 40-60/min breath rate (BR).
Group P: Anesthesia was induced and maintained with a 60 mg/kg intra-peritoneal (IP) injection of pentobarbital (CEVA, Libourne, France).
Heart rate (HR) (beats per min − bpm) and BR (inspirations per min − insp/min) were monitored every 15 minutes in all animals throughout all CMR examinations.
Cardiac magnetic resonance (CMR). In both groups, CMR was performed twice, a week apart (CMR1 and CMR2). A 7-T Bruker Pharmascan magnetic resonance system (Bruker Biospin, Ettlingen, Germany) interfaced with a dedicated small-animal ECG and respiratory triggering system (SA Instruments) was used. A quadrature ¹H resonator was used for radiofrequency transmission (inner diameter = 72 mm) in conjunction with a single-loop receive-only surface coil. Mice were placed in the supine position, and body temperature was maintained in a physiological range using a heating pad. Ventricular function was assessed using a black blood self-gated sequence (IntraGate®, Bruker Biospin, Ettlingen, Germany) based on a modified retrospectively gated Fast Low Angle Shot (FLASH) sequence [REF Hiba B 2006]. After multislab survey acquisition in 3 orthogonal axes (axial, coronal, and sagittal), 8 to 9 short axis IntraGate® slices ensuring full coverage of the ventricles from the base to the apex were acquired (slice thickness 0.563 mm; echo time (TE): 2 ms; repetition time (TR): 5.5 ms; flip angle 25°; field of view: 28 × 28 mm²; matrix size: 128 × 128; spatial resolution: 0.219 × 0.219 mm²/pixel), with a temporal resolution of 16 images per cardiac cycle. Sample images of the self-gated sequence IntraGate® are shown in Fig. 3. Left ventricular strain was assessed using two short axis (basal third and apical third) and one long axis tagged CMR acquisitions using a cine 2D FLASH gated sequence (slice thickness 1 mm; TE/TR 3/8 ms; flip angle 15°). The assessment was performed twice within each CMR examination with 2 different matrices but the same spatial resolution to assess intra-study reproducibility (first tagged acquisition: field of view 45 × 45 mm²; matrix size 256 × 256; spatial resolution 0.176 × 0.176 mm²/pixel; second tagged acquisition: field of view 22.4 × 22.4 mm²; matrix size 128 × 128; spatial resolution 0.175 × 0.175 mm²/pixel).
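The in-plane spatial resolutions quoted above follow from dividing the field of view by the matrix size along each axis. The short check below is a sketch with hypothetical names (`PROTOCOLS`, `in_plane_pixel_mm`); only the field-of-view and matrix values come from the text, and square fields of view and matrices are assumed.

```python
def in_plane_pixel_mm(fov_mm: float, matrix: int) -> float:
    """Pixel edge length in mm for a square field of view and square matrix."""
    return fov_mm / matrix


# (field of view in mm, matrix size) as reported for each acquisition
PROTOCOLS = {
    "IntraGate cine": (28.0, 128),
    "tagged acquisition 1": (45.0, 256),
    "tagged acquisition 2": (22.4, 128),
}

if __name__ == "__main__":
    for name, (fov_mm, matrix) in PROTOCOLS.items():
        pixel = in_plane_pixel_mm(fov_mm, matrix)
        print(f"{name}: {pixel:.3f} x {pixel:.3f} mm per pixel")
```

The two tagged acquisitions differ in field of view and matrix size but land on nearly the same pixel size, which is what allows them to be compared for intra-study reproducibility.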
Temporal resolution was variable depending on the maximum number of cardiac frames acquired during a single R-R interval (range, 10-14). Sample images of the tagged cine 2D FLASH gated sequence are shown in Fig. 4.
Cardiac function analysis. Ventricular parameters were analyzed in a blinded manner using Segment software after manual delineation of the right and left ventricular endocardial (excluding papillary muscles) and epicardial borders on all slices at end-diastole and end-systole (Segment v1.8 R1675, Medviso AB, University of Lund, Sweden). Left ventricular mass (LVM, µg), left/right ventricular ejection fraction (LVEF/RVEF, %) and left/right ventricular end diastolic and end systolic volume (LVEDV/RVEDV and LVESV/RVESV, µl) were thereby determined. Strain analysis was performed using the open source software OsiriX (http://www.osirix-viewer.com/) with the InTag plugin. Circumferential (Ecc) and radial (Err) strain was assessed on short axis (SA) frames, and longitudinal (Ell) strain was assessed on the long axis (LA) frame. Samples of strain time curves during the cardiac cycle (InTag plugin, OsiriX) are shown in Fig. 4.
Reproducibility assessment. Test-retest reproducibility was assessed by comparing the results from the two CMR examinations (CMR1 and CMR2) performed one week apart. Intra-study reproducibility was assessed by comparing the results from the two tagged sequences performed within each CMR exam. For inter-observer reproducibility, each CMR exam was processed in a blinded manner by two different observers. For intra-observer reproducibility, a random subset of 5 CMR examinations was processed twice by the same observer, one week apart. For inter- and intra-study reproducibility, data were analyzed separately for each anesthetic procedure, as this parameter may influence the results.
Statistical analysis. Quantitative data are expressed as the mean (standard deviation, SD). Reproducibility was evaluated by the mean (SD) absolute difference and coefficient of variation (COV) (mean relative difference) between results from CMR1 and CMR2, tagged sequence 1 and tagged sequence 2, observer 1 and observer 2 post-processing, and post-processing 1 and 2 of the same examinations by the same observer for test-retest, intra-study, inter-observer and intra-observer reproducibility, respectively. A measurement was considered reproducible when COV was below 20%. Reproducibility was also assessed using the intraclass correlation coefficient (ICC) and its 95% confidence interval (95%CI) under an ANOVA random effect model. As described by Tammemagi et al. 25, the ICC can vary from −1 (perfect disagreement) to 0 (random agreement) and to +1 (perfect agreement). A negative ICC occurs when the between-subject variation is relatively small compared with the within-subject variation. An unpaired Student's t-test was used to compare absolute differences between anesthetic procedures. HR, BR and body weight were compared within and between anesthetic procedures using paired and unpaired Student's t-tests, respectively. A value of p < 0.05 was considered to denote significance.
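As an illustration of the reproducibility metrics described above, the following sketch computes the absolute difference, a per-animal relative difference and an ICC for paired measurements. Several points are assumptions rather than the authors' confirmed procedure: the COV is taken as the absolute difference divided by the pair mean, the ICC is the one-way random-effects form ICC(1,1) for two measurements per animal, and all function names and example values are hypothetical. The analysis in the study itself was run in SPSS.

```python
from statistics import mean, stdev


def relative_difference_percent(a: float, b: float) -> float:
    """Absolute difference of a pair, as a percentage of the pair mean (assumed COV definition)."""
    return abs(a - b) / ((a + b) / 2) * 100.0


def paired_summary(first, second, threshold=20.0):
    """Mean (SD) absolute difference and COV for paired measurements."""
    abs_diff = [abs(a - b) for a, b in zip(first, second)]
    cov = [relative_difference_percent(a, b) for a, b in zip(first, second)]
    return {
        "abs_diff": (mean(abs_diff), stdev(abs_diff)),
        "cov": (mean(cov), stdev(cov)),
        # 20% threshold as stated in the text
        "reproducible": mean(cov) < threshold,
    }


def icc_one_way(first, second) -> float:
    """ICC(1,1): one-way random-effects ANOVA with k = 2 measurements per subject."""
    n, k = len(first), 2
    subject_means = [(a + b) / 2 for a, b in zip(first, second)]
    grand_mean = mean(subject_means)
    ms_between = k * sum((m - grand_mean) ** 2 for m in subject_means) / (n - 1)
    ms_within = sum(
        (a - m) ** 2 + (b - m) ** 2 for a, b, m in zip(first, second, subject_means)
    ) / (n * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


if __name__ == "__main__":
    # Made-up volumes (µl) for five animals at two exams; not data from the study
    exam1 = [52.0, 58.5, 49.0, 61.0, 55.5]
    exam2 = [54.0, 56.0, 51.5, 63.5, 53.0]
    print(paired_summary(exam1, exam2))
    print(icc_one_way(exam1, exam2))
```

With two measurements per subject, this ICC form is bounded below by −1, which matches the range described in the text; it becomes negative when the within-subject mean square exceeds the between-subject mean square.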
Ventricular parameters from CMR1 and CMR2 with both anesthetic procedures were also represented with Bland-Altman graphs, plotting the COV vs. average and the 95% limits of agreement. All tests were two-tailed. IBM® SPSS® 22.0 for Windows® was used as the statistical software.
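For readers unfamiliar with Bland-Altman limits of agreement, a minimal sketch follows. It uses the standard formulation, in which the paired difference is plotted against the pair average and the 95% limits are the mean difference ± 1.96 SD; the text above describes plotting the COV against the average, so this is an approximation of the idea and not the exact graphs of the study. The function name and values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(first, second):
    """Return pair averages, pair differences, bias and 95% limits of agreement."""
    averages = [(a + b) / 2 for a, b in zip(first, second)]
    differences = [b - a for a, b in zip(first, second)]
    bias = mean(differences)
    half_width = 1.96 * stdev(differences)  # sample SD of the differences
    return averages, differences, bias, (bias - half_width, bias + half_width)


if __name__ == "__main__":
    # Made-up paired values, not data from the study
    exam1 = [52.0, 58.5, 49.0, 61.0, 55.5]
    exam2 = [54.0, 56.0, 51.5, 63.5, 53.0]
    averages, differences, bias, limits = bland_altman(exam1, exam2)
    print(f"bias = {bias:.2f}, limits of agreement = {limits[0]:.2f} to {limits[1]:.2f}")
```

Wider limits of agreement indicate larger disagreement between the two exams, which is how the wider limits reported for group P should be read.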