Measurement of Prostate Volume with MRI (A Guide for the Perplexed): Biproximate Method with Analysis of Precision and Accuracy

To review the anatomic basis of prostate boundary selection on T2-weighted magnetic resonance imaging (MRI), to introduce an alternative 3D ellipsoid measuring technique that maximizes precision, to report its intra- and inter-observer reliability, and to advocate its use for research involving multiple observers. We demonstrate prostate boundary anatomy using gross pathology and MRI examples. This provides background for selecting key boundary marks when measuring prostate volume. An alternative ellipsoid volume method is then proposed using these boundaries in an attempt to improve inter-observer precision. An IRB-approved retrospective study of 140 patients with elevated serum prostate specific antigen levels and/or abnormal digital rectal examinations was performed with T2-weighted MRI applying a new (Biproximate) technique. Measurements were made by 2 examiners, correlated with each other for inter-observer precision, and correlated with an expert observer for accuracy. Correlation statistics, linear regression analysis, and tests of means were applied using p ≤ 0.05 as the threshold for significance. Inter-observer correlation (precision) between the two examiners was 0.95. Correlation between these observers and the expert (accuracy) was 0.94 and 0.97, respectively. Intra-observer correlation for the expert was 0.98. Means for inter-rater reliability and accuracy were all the same (p = 0.001). We conclude that the biproximate technique, which uses more precise and reproducible landmarks, yields precise and accurate measurement of total prostate volume.

Measurement of the prostate gland in vivo is commonly required in the management of prostate disorders, both benign and malignant. Measurement is especially useful in the diagnosis and management of adenocarcinoma and benign prostatic hyperplasia (BPH). In the former, knowledge of total prostatic volume is necessary in the calculation of prostate specific antigen (PSA) density (PSAD), a key indicator of the likelihood that elevated PSA is due to malignancy 1,2. Accurate measurement may affect cancer treatment strategies 1,2. The ability to make consistent and accurate measurements of the prostate is also useful in the diagnosis and management of BPH 3,4.
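As a worked illustration of why volume matters here, PSA density is conventionally the serum PSA divided by the total prostate volume. The sketch below assumes PSA in ng/mL and volume in cc; the function name and the example values are hypothetical and are not taken from the study data.

```python
def psa_density(psa_ng_per_ml: float, prostate_volume_cc: float) -> float:
    """Return PSA density: serum PSA (ng/mL) divided by total prostate volume (cc)."""
    if prostate_volume_cc <= 0:
        raise ValueError("prostate volume must be positive")
    return psa_ng_per_ml / prostate_volume_cc


# Hypothetical patient, for illustration only: PSA 6.0 ng/mL, volume 40 cc.
example_psad = psa_density(6.0, 40.0)
print(f"PSAD = {example_psad:.2f} ng/mL/cc")
```

Because the volume sits in the denominator, any error in the measured volume carries directly into the PSAD, which is one reason reproducible landmarks matter.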
Early attempts to estimate total prostatic volume by digital rectal examination (DRE) alone were replaced by more accurate measurement with transrectal ultrasound (TRUS), which showed excellent correlation with gross pathologic specimens using planimetric and calculated ellipsoid volume formula (EVF) techniques 5. Similarly, accuracy compared to pathologic specimens has also been established using 3-plane magnetic resonance imaging (MRI) 6-9. While the foregoing techniques have shown reasonable success, details describing the exact landmarks of the boundaries measured on imaging are rarely provided to assure general reproducibility. Difficulty in identifying landmarks such as the bladder neck and prostatic apex on ultrasound has been mentioned 10,11. Most studies performed with TRUS or MRI underestimate overall pathologic measurement of size by weight 6,7,12-14, although this may be due to inclusion of the seminal vesicles (SV), vas deferens (VD) stumps and periprostatic tissue in weighing the pathologic specimens 8. Measured against an expert panel, in vivo measurement with TRUS overestimates post-operative small specimen size and underestimates large prostates 15,16. However, other investigators using MRI EVF volumetrics report overestimation compared to specimen volume (corrected for density of 1.05 gm/cc) 9.
There are scant studies of inter-observer reliability for the measurement of total prostatic volume using standard methods 10,11,17 . TRUS intra-rater reliability using the ellipsoid formula was examined in one study and was 0.93 12 . Bangma, et al. 18 compared TRUS step-section planigraphy with several geometric multiplane models and

Materials and Methods
Patient selection. This study was approved by our institutional Internal Review Board and Ethics Committee (protocol number 00003941) with a waiver of informed consent. 140 patients were selected from a population of men undergoing MRI because of abnormal PSA, digital rectal examination or both. Ages ranged from 41-76 (mean, 64 years). They were selected by the senior radiologist (NFW) with over 40 years of experience who acted as administrator/expert. All MRI lobar classification types of BPH were represented in the selection 4.
Study design and statistics. Two body imagers (EN, BS) with 10 years of experience reading prostate MRIs were assigned the task of measuring total prostatic volume using T2-weighted MRI on the 140-patient cohort. They were provided with a spreadsheet showing only the patient hospital identification numbers. After the total prostate volumes were determined, these spreadsheets were returned to the administrator. Intra-rater reliability was tested only on the expert's data. Re-measurement of prostate volume was done between 3 and 9 months after the primary measurements. Inter-rater reliability (precision) for volumes, as continuous variables, was analyzed for paired case-to-case correlation using the Pearson product moment and Lin's Coefficient of Concordance 19. Linear regression was calculated to find r² values, Y intercepts, p values and 95% confidence intervals (CI). These were calculated with the open access statistical programs at the National Institute of Water and Atmospheric Research (NIAWA.com) and QI Macro Statistics ® (KnowWare International, Inc., Denver CO, USA). Measurements of the administrator were used as a proxy for gross pathological specimen weight for the determination of rater "accuracy". Correlations were analyzed by linear regression and graphed. A comparison of means was also performed after the data were assessed for normality. Significance was defined as p ≤ 0.05.
Anatomy. Anatomic boundaries and landmarks for measurement of total prostate volume were chosen after a thorough review of the anatomic literature, including the work of Salvadore Gil-Vernet (salvadoregilvernet.com) 20, Robert Meyers 21,22 and others 23-25. The prostate is not only a glandular but also a muscular organ 23,26. Its external boundary, which may be referred to as the external prostatic capsule (EPC), extends posteriorly, laterally, and antero-laterally.
The EPC represents the peripheral compressed stromal (fibro-muscular) component of the latticework supporting the glandular elements 23. This is the outer boundary that is visible on imaging. Walz, et al. 27 refer to this as the "true capsule", although it is not what some anatomists would define as a true "capsule" that can be "peeled off" and separated from the gland. It will be referred to as a "capsule" in this presentation because of its common use in clinical practice and because it is the outer boundary we see on imaging.

MRI acquisition techniques.
The controversy surrounding the naming of this "pseudocapsule" is discussed in several anatomic studies, including those that refer to the periprostatic fascia as the "capsule" 25,27. This fascial plane is not visible on imaging. The anterior external limit of the prostate is less distinct, comprising anterolateral elements of the EPC, the surgical capsule (SC) surrounding enlarged transition zones (TZ) when BPH is present, and the anterior fibro-muscular stroma (AFMS). The multilayered loose fascia surrounding the prostate is disrupted by surgical or post-mortem removal, making correlation between the in vivo gland and the ex vivo specimen more difficult 22.
The "surgical" capsule (SC) is so called because it encases the enlarged transition zone, which on open surgery can be enucleated from surrounding tissues. It represents compressed circular smooth muscle fibers of the pre-prostatic urethral wall (above the verumontanum) and prostatic urethral wall (below the veru) 23. The anterior SC is indistinct to absent unless there is more advanced nodular hyperplasia of the TZ. The AFMS is thickest proximally, where it is formed by smooth muscle fibers from the bladder wall. This boundary is critical to define when performing maximal AP prostate measurement. The above features are delineated in Fig. 1A-C. The EPC is incomplete at the apical peripheral zone (PZ) 22-24 (Fig. 1D).

MRI image analysis & selection of boundaries.
All images were reviewed on high-resolution monitors using Philips Intellispace (Koninklijke Philips N.V.) software. Because of new interest in the urological literature pertaining to herniated portions of enlarged prostate into the bladder base, an alternative scheme of volumetrics was devised to include measurement of intravesical prostatic protrusion (IPP). A new vocabulary is needed to describe these boundaries, e.g., vesico-prostatic angle (VPA), vesico-prostatic line (VPL), and apical line (AL), discussed below. Our approach is designed to provide the greatest reproducibility for intra- and inter-rater reliability.
Transverse measurements were made in the axial plane at the level estimated to show maximal diameter. This plane allows the best visualization of the lateral EPC boundaries for transverse measurement. The axial plane is also the only plane that allows for selection of the anterior boundary by interpolation of the anterior ends of the right and left EPC for accurate antero-posterior (AP) measurement. The mid-sagittal plane is unsatisfactory for that determination. (The coronal plane, though reasonable for transverse measurement, is suboptimal for AP diameter measurement.)
Pericapsular veins and levator ani muscles are confounding anatomical structures that, depending on axial level, lie adjacent to the EPC and make its outer boundary difficult to define (Fig. 2A). Therefore, for consistency, when measuring the transverse diameter, we arbitrarily chose to measure from the inner margins of the EPC.
The maximal AP diameter measurements were made at or near the same axial plane as the transverse measurement, taking care not to include the inferior extension of the anterior bladder detrusor. Measurement was made from the inner margin of the mid-posterior EPC to the mid-anterior AFMS. If there was a midline posterior indentation of the EPC (posterior commissure), the inner point was selected, even though the PZ projected slightly more posteriorly on either side. The anterior landmark chosen varied depending on definable anatomy.
There are three general configurations of the anterior boundary. In the first, there is a discrete continuation of the merged SC and bilateral anterolateral EPC across the midline, allowing for selection of its outer boundary (Fig. 3A). In the second, the EPC is interrupted and invisible in the midline, in which case we drew an interpolated arc-like line connecting the ends of these structures (Fig. 3B). (This is the rationale for using the axial rather than the sagittal plane for AP landmarks.) In the third, there is apparent complete fusion of the AFMS, SC and EPC, completely obliterating visualization of the anterior EPC. This configuration was seen in less than 5% of our cohort. An interpolated arc-like line was drawn if the two antero-lateral ends of the EPC were close enough to connect. Otherwise, the axial AP anterior landmark was taken as the inner boundary of the AFMS (Fig. 3C). The appearance may be caused by the "salami effect" of a trans-axial plane that is oblique with respect to the longitudinal axis of the prostate (Fig. 4).
Length measurement was made in the mid-sagittal plane, defined as the plane that best shows the membranous and bulbous urethra. Landmarks for measuring mid-sagittal length were established by first finding the anterior and posterior vesico-prostatic angles (VPA), defined as the points where the detrusor connects with the prostate, and connecting these angles by a line called the vesico-prostatic line (VPL) (Fig. 5). The VPL arbitrarily defines the level of the bladder "outlet." The inferior landmark was the caudal margin of the PZ defining the apex, where an apical line (AL) was drawn approximately perpendicular to the long axis of the prostate drawn from the bladder outlet to the membranous urethra (Fig. 5). This PZ inferior boundary may occur posteriorly or in a parasagittal plane; therefore, it is essential to use a localizer tool in the coronal view to verify the level of the AL in the mid-sagittal plane (Fig. 6). If the prostate was entirely below the VPL, the mid-sagittal length was defined as the length of a line drawn from the mid-urethra at the level of the AL extending perpendicular to the VPL. If the prostate projected above the VPL, the mid-sagittal length was defined as the sum of the distance below the VPL plus the length of a line drawn from the superior margin of this tissue perpendicular to the VPL (Fig. 5). This segment superior to the VPL represents intravesical prostatic protrusion (IPP).
Total prostatic volume was calculated according to the formula of the geometric model for a prolate ellipsoid:
$$V = \frac{\pi}{6} \times \text{transverse diameter} \times \text{AP diameter} \times \text{length}$$
The size of the retrourethral lobe (RU) was arbitrarily evaluated by measuring the maximal AP diameter as visualized on the mid-sagittal view. This involved three steps:
Step 1: The RU long axis was defined by a line from the cranial margin of the RU to the mid-verumontanum (conjunction of the ejaculatory ducts with the urethra).
Step 2: We traced the approximate course of the urethra from the bladder neck through the level of the verumontanum using the "freehand" tool.
Step 3: We estimated the thickest AP level of the RU as measured from the inner margin of its posterior capsule perpendicular to its long axis, ending at the posterior wall of the urethra (Fig. 7A,B).
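The volume calculation described in this section can be sketched as follows. This is a minimal illustration, not the software used in the study: it assumes the standard ellipsoid coefficient of π/6, diameters in millimetres, and a mid-sagittal length formed as the distance below the VPL plus any IPP above it. The function names and the example measurements are hypothetical.

```python
import math


def biproximate_length_mm(below_vpl_mm: float, ipp_mm: float = 0.0) -> float:
    """Mid-sagittal length: distance below the VPL plus any protrusion above it (IPP)."""
    if below_vpl_mm < 0 or ipp_mm < 0:
        raise ValueError("distances must be non-negative")
    return below_vpl_mm + ipp_mm


def ellipsoid_volume_cc(transverse_mm: float, ap_mm: float, length_mm: float) -> float:
    """Ellipsoid volume, pi/6 times the product of three diameters, in cc (1 cc = 1000 mm^3)."""
    return math.pi / 6.0 * transverse_mm * ap_mm * length_mm / 1000.0


# Hypothetical measurements, for illustration only.
length = biproximate_length_mm(below_vpl_mm=45.0, ipp_mm=5.0)  # 50 mm in total
volume = ellipsoid_volume_cc(transverse_mm=50.0, ap_mm=40.0, length_mm=length)
print(f"length = {length:.1f} mm, volume = {volume:.1f} cc")
```

Keeping the IPP as a separate input means the same measurements yield both the total volume and the protrusion distance, without a second pass over the images.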

Results
Analysis of the alternative method of determining total prostatic volume resulted in an inter-observer correlation (precision) of 0.95 for the two interpreting radiologists (Fig. 8A). Intra-observer correlation (precision) was 0.98 for the senior radiologist (administrator) (Fig. 8B). Correlation between each of the two interpreting radiologists and the administrator (accuracy) was 0.94 and 0.97, respectively (Fig. 9A,B). All correlation p values were < 0.001. Since the data were non-normal in distribution, a non-parametric test of total prostate volume (Wilcoxon signed-rank) was applied, and means for inter-rater reliability (precision) and accuracy compared to the administrator were all the same (p = 0.001) (Fig. 10).
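The two agreement statistics named in the Methods, the Pearson product-moment correlation and Lin's concordance correlation coefficient, can be sketched in plain Python as below. This is an illustrative sketch under stated assumptions, not the NIWA or QI Macros implementation: the concordance coefficient uses 1/n variances and covariance, some software uses n-1 instead, and the paired volumes are invented for the example.

```python
import math


def _check_pairs(x, y):
    """Require two equal-length series with at least two paired observations."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length series with at least 2 pairs")


def pearson_r(x, y):
    """Pearson product-moment correlation between paired measurements."""
    _check_pairs(x, y)
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def lin_ccc(x, y):
    """Lin's concordance correlation coefficient, using 1/n variances and covariance."""
    _check_pairs(x, y)
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    return 2.0 * cov / (var_x + var_y + (mean_x - mean_y) ** 2)


# Hypothetical paired volumes (cc): rater B reads 5 cc higher than rater A in every case.
rater_a = [30.0, 45.0, 60.0, 80.0]
rater_b = [35.0, 50.0, 65.0, 85.0]
print(f"Pearson r = {pearson_r(rater_a, rater_b):.3f}")
print(f"Lin's CCC = {lin_ccc(rater_a, rater_b):.3f}")
```

The example shows why the two are reported together: a constant offset between raters leaves the Pearson correlation at 1 but lowers the concordance coefficient, because concordance penalizes departure from the line of identity.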

Discussion
The purpose of this study was to determine the inter- and intra-rater reliability of an alternative method of measuring total prostatic volume based on well-defined anatomic landmarks in order to improve precision among observers. After review of the anatomy and correlation with MRI, the resultant measurements show excellent statistical agreement.
The conventional transverse measurement method attempts to find the outside EPC margins. Because of the adjacent position of pericapsular veins and, in the mid- and lower levels, the levator ani muscles, these structures are often included in the transverse measurement. MRI is superior to ultrasound in selecting the outer boundary of the EPC. We chose to select the inner boundary of the EPC for axial plane landmarks to improve inter-observer consistency (precision). In the everyday clinical setting, measurement of the outer boundary of the EPC, when visible in the maximal transverse view, is acceptable. We also describe better criteria for a more consistent selection of the anterior and posterior landmarks for measurement in the axial view. Some might take issue with our technique of not including the entire AFMS in some cases wherein the maximal transverse axial plane falls at an extreme proximal level. In these situations, the detrusor (primarily smooth muscle) component of the AFMS is focally thickest and not representative of the lower two-thirds of the prostate. In addition to the "salami" effect mentioned earlier, this may explain why EVF calculations in most studies indicate that volume in glands <35 cc tends to be overestimated 12.
The single most common cause for variation of total prostatic volume between examiners is likely selection of precise lower and upper landmarks in the mid-sagittal plane while measuring length 5,28. Common errors include incorrect identification of the inferior boundary of the apex as the caudal landmark, inconsistent identification of the cranial landmark, making measurements in a parasagittal (rather than mid-sagittal) plane, and measuring from the anterior or posterior prostatic base to apex. Understanding the location of the inferior boundary of the PZ is critical. Experience has shown that relying only on the apparent lower margin of the PZ in the mid-sagittal plane can be misleading and that confirmation in the coronal view using a localizer function is required for accuracy. Collins, et al. 29 pointed out two methods of longitudinal measurement, one using the bladder base as the cranial landmark and the other the superior margin of the prostatic tissue including any "median" lobe (Fig. 11). We refer to our proposed method of length measurement as "biproximate" as it conveniently embraces both.
While measurement of the AP diameter of the RU may be useful for lobar classification of BPH 4, it is non-contributory in determining total prostatic volume. However, the proposed alternative method of volumetrics includes measurement of IPP that may be of use to urologists for patient management 30-39. Some studies indicate that IPP may be an independent variable affecting surgical or medical outcome 40. IPP of greater than 5 mm, as measured on transabdominal ultrasound, has been associated with clinical signs and symptoms of bladder outlet obstruction 30,31 and responsiveness to medical management 31,32,38.
Landmarks presented on MRI in our study are much more accurately visualized than on TRUS and could be expected to produce better intra- and inter-observer concordance than those selected on TRUS. Another advantage of MRI, in contrast to transabdominal ultrasound and TRUS, is that MRI does not require complete bladder distention for adequate visualization of landmarks.
The choice of an expert as proxy for "ground truth" measure of total prostatic volume rather than the pathological specimen may be considered a limitation or advantage of our study. Other clinical research studies have accepted the concept of expert consensus as a "gold standard" [41][42][43] . One limitation of our study is that we used a single expert rather than a panel of experts due to the relative paucity of those with experience using the measurement techniques under examination. Expert opinion is also the current standard for evaluation of MRI prostate segmentation 44 .
One could argue that since the objective should be to make in vivo measurements under normal physiological conditions, the volumes obtained were more realistic than under post-surgical or post-mortem conditions. Ex vivo measurements in the pathology department have disadvantages that reduce accuracy compared to those made in the living subject. The prostate is no longer infused with active circulation. This decreases specimen weight (volume) 15 . Processing of the specimen has been variable in previous studies. Comparison is usually made between the imaging measured volume and the "wet weight" of the fresh specimen on a scale or by water displacement in a graduated cylinder 45 . More accurate weights can be derived from electronic scales 14 .
Since the specific gravity of prostate tissue and water are nearly identical (1.05 gm to 1.0 ml) 13 , weight and volume have traditionally been used interchangeably in the literature, introducing mild inaccuracies when comparing image volumes to pathologic weights in larger glands. Some investigators use this number as a correction factor for comparing weight to volume 9,46 . This assumption fails to take into account that it is likely that there is a spectrum of density values in larger prostates with different proportions of stromal, glandular, and malignant tissues 13 . There is no evidence that the presence of carcinoma changes the accuracy between TRUS and cadaver volumes 47 . The specimen may include the attached seminal vesicles, vas deferens and surrounding soft tissues in many studies 47 . In some, the seminal vesicles and vas deferens are physically removed before weighing 9,14,48 , and in others the estimated mean volume of the seminal vesicles, as determined from the medical literature, is subtracted from the total weight of the prostate and accessory structures 15,46 . Changes in prostate size due to drying or fixation complicate comparisons in cadaver studies 48,49 .
MRI volume imaging eliminates these variables, and it could be argued that MRI volumetrics in the living patient should replace post-mortem measurement as the "gold standard" 6. Hass, et al. 50 have suggested that MRI-measured EVF might be considered the new standard for total volume estimate. In a small early study using semi-automatic planigraphy, MRI showed improved results compared with TRUS 6,9.
The application of artificial intelligence algorithms and the use of segmentation programs to establish prostate volume with MRI were reviewed by Turkbey, et al. 51. These include innovative semi-automatic and automatic computer-based methods to "find" elusive prostate boundaries (edge detection). The software used is available only for research and not for clinical use. These technologies share with geometric orthogonal models, such as the prolate ellipsoid formula (EVF), the fact that they are based on mathematical models and not on direct measurement, e.g., weight. The former require significant boundary adjustments by an expert reader who still must ultimately understand the anatomy and landmarks we have described. At present, these techniques are thought by most to be too labor intensive for speedy clinical use 9,46,50. Fully automatic, unsupervised computer-assisted programs are under study, but performance compared with simpler methods is less well understood 44. The great advantage of a fully automatic method would be precision, since there is no user variation. Benxinque, et al. 52 recently compared prolate ellipsoid and other prostate volume measures to a commercially available fully automated segmentation program. Although fully automatic measurement was not reliable compared to manually adjusted MRI
Figure 11. Two proposed ways to measure prostate length in the mid-sagittal plane. B = Distance from apex to bladder neck, PT = distance from apex to most proximal prostate tissue in sagittal plane (Modified from Collins GN, Ultrasound in Med. & Biol. 1995; 21:1102).