Epidural anesthesia needle guidance by forward-view endoscopic optical coherence tomography and deep learning

Epidural anesthesia requires injection of anesthetic into the epidural space in the spine. Accurate placement of the epidural needle is a major challenge. To address this, we developed a forward-view endoscopic optical coherence tomography (OCT) system for real-time imaging of the tissue in front of the needle tip during the puncture. We tested this OCT system in porcine backbones and developed a set of deep learning models to automatically process the imaging data for needle localization. A series of binary classification models were developed to recognize the five layers of the backbone: fat, interspinous ligament, ligamentum flavum, epidural space, and spinal cord. The classification models provided an average classification accuracy of 96.65%. During the puncture, it is important to maintain a safe distance between the needle tip and the dura mater. Regression models were developed to estimate this distance from the OCT imaging data. Based on the Inception architecture, our models achieved a mean absolute percentage error of 3.05% ± 0.55%. Overall, our results validated the technical feasibility of using this novel imaging strategy to automatically recognize different tissue structures and measure the distance ahead of the needle tip during epidural needle placement.

www.nature.com/scientificreports/

Epidural hematoma and abscess formation might occur due to inaccurate puncture 14,15. Moreover, the neurologic injury caused by inadvertent puncture can lead to other symptoms such as fever or photophobia 16,17.
In current clinical practice, accurate placement of the needle relies on the experience of the anesthesiologist 18. The most common method of detecting the placement of the needle in the epidural space is based on the loss of resistance (LOR) 19. To test the LOR, the anesthesiologist keeps pressing on the plunger of a syringe filled with saline or air while inserting the epidural needle 20. When the needle tip passes through the ligamentum flavum and reaches the epidural space, there is a sudden decrease in resistance that can be sensed by the anesthesiologist 21. Nevertheless, this method has been shown to be inaccurate in predicting needle location, and the actual needle insertion can be deeper into the body than expected 22. Up to 10% of patients undergoing epidural anesthesia are not provided with adequate analgesia when using LOR 23,24, and the LOR technique can fail in up to 53% of attempts without image guidance in more challenging procedures such as cervical epidural injections 25,26. Moreover, complications such as pneumocephalus 27, nerve root compression 28, subcutaneous emphysema 29, and venous air embolism 30 have been shown to be related to the air or liquid injection used in the LOR technique. To improve the success rate of epidural puncture and decrease the number of puncture attempts, there is a strong demand for an effective imaging technique to guide epidural needle insertion.
Imaging modalities such as ultrasound and fluoroscopy are currently utilized during needle access 31,32. However, the complex and articulated encasement of bones allows only a narrow acoustic window for the ultrasound beam 26. Fluoroscopy lacks soft tissue contrast and thus cannot differentiate critical soft tissues (such as blood vessels and nerve roots) that need to be avoided during needle insertion. Moreover, the limited resolution and contrast of fluoroscopy make it difficult to distinguish different tissue layers in front of the needle tip, especially for cervical and thoracic epidural anesthesia, where the epidural space is as narrow as 1-4 mm 33. To improve needle placement accuracy, novel optical imaging systems have been designed and tested. A portable optical epidural needle system based on a fiberoptic bundle was designed to identify the epidural space 34, but optical signal interpretation and needle trajectory identification are limited by the uncertain direction of the needle bevel and the surrounding fluid 35. Additionally, optical spectral analysis has been utilized for tissue differentiation during epidural space identification 36,37. However, the accuracy of the measured spectral results can be compromised by the surrounding tissues and by bleeding during the puncture.
Optical coherence tomography (OCT) is a non-invasive imaging modality that can visualize cross-sections of tissue samples 38. At 10-100 times higher resolution (~ 10 µm) than ultrasound and fluoroscopy, OCT can improve the efficacy of tissue imaging 39. OCT has been integrated with fiber-optic catheters and endoscopes for numerous internal imaging applications 40-43. Fiber-optic OCT probe systems have been proposed for epidural anesthesia needle guidance and provided promising results in identifying the epidural space in pig models 44,45. In a previous study, our group also reported a forward-imaging endoscopic OCT needle device for real-time epidural anesthesia placement guidance and demonstrated its feasibility in piglets in vivo 26. By fitting the OCT needle inside the hollow bore of the epidural needle, no additional invasiveness is introduced by the OCT endoscope. The high scanning speed of the OCT system allows real-time imaging of the tissue in front of the needle, and the tissues in front of the needle tip can be recognized based on their distinct OCT imaging features.
Convolutional neural networks (CNNs) have been widely used for classification of medical images 46,47 and have been applied to OCT images for automatic tissue segmentation in macula-, retina-, and esophagus-related research 48-50. To improve the efficiency of tissue recognition, herein we proposed to use CNNs to classify and recognize different epidural tissue types automatically. In this study, we developed a computer-aided diagnosis (CAD) system based on CNNs to automatically locate the epidural needle tip from the forward-view OCT images. To the best of our knowledge, this is the first attempt to combine a forward-view OCT system with CNNs for guiding the epidural anesthesia procedure. Five epidural layers (fat, interspinous ligament, ligamentum flavum, epidural space, and spinal cord) were imaged to train and test CNN classifiers based on Inception 51, Residual Network 50 (ResNet50) 52, and Xception 53. After the needle tip arrives at the epidural space, the OCT images can then be used to estimate the distance of the needle tip from the dura mater to avoid spinal cord damage. We trained and tested regression models based on Inception, ResNet50, and Xception using OCT images with manually labeled distances. The Inception model achieved the best performance, with a mean absolute percentage error of 3.05% ± 0.55%. These results demonstrated the feasibility of this novel imaging strategy for guiding epidural anesthesia needle placement.

Results
OCT images of five epidural layer categories. The schematic of the experiment using our endoscopic OCT system is shown in Fig. 1A. Cross-sectional 2D OCT image examples of fat, interspinous ligament, ligamentum flavum, epidural space, and spinal cord are shown in Fig. 1B. Because of the gap between the needle tip and the dura mater, the epidural space was the easiest to recognize. Among the other four tissues, the interspinous ligament showed the most obvious imaging features, including the maximum penetration depth and clear transverse stripes due to its thick fiber structure. Compared to the other tissue types, the ligamentum flavum showed higher imaging brightness close to the surface and the shallowest imaging depth. The imaging depths of fat and spinal cord were similar, but the imaging intensity of fat was not as evenly distributed as that of spinal cord. The corresponding histology results are also included in Fig. 1B. These tissues presented different cellular structures and distributions and correlated well with their OCT results, except for fat. The fat tissue was characterized by pockets of adipocytes in the histology, while this feature was not clear in the OCT results. This may be caused by the tissue compression we applied to mimic the clinical insertion scenario. The multi-class classification results are shown in Supplementary Figure 1. The overall accuracies of the multi-class classification models based on Inception reached ~ 66%. Although this was significantly higher than the accuracy of 20% by random guessing, further improvement was needed for clinical use.
Since the multi-class classification results were not satisfactory, we proposed to use sequential binary classifications to improve the accuracy. During the needle placement, the needle is inserted through fat, interspinous ligament, and ligamentum flavum until reaching the epidural space. Continuing the needle insertion beyond the epidural space can puncture the dura and damage the spinal cord. The classification process was thus divided into a sequential process of four binary classifications: fat vs. interspinous ligament, interspinous ligament vs. ligamentum flavum, ligamentum flavum vs. epidural space, and epidural space vs. spinal cord (Table 1).

Estimation of the distance between the needle tip and dura mater by regression. Inception, ResNet50, and Xception were compared for the regression task of estimating the distance from the needle tip to the dura mater. Table 3 lists the mean and standard error of the cross-validation mean absolute percentage error (MAPE) for ResNet50, Xception, and Inception in all testing folds. In every fold, the Inception model outperformed the ResNet50 and Xception models, as indicated by the lowest MAPE. In each testing rotation, a new Inception model was trained using all the images in the seven cross-validation folds and then evaluated on the unseen testing images in the one testing fold. Examples of OCT images with different distances between the needle tip and the tissue are shown in Fig. 3A. A model was trained on 21,000 images belonging to subjects 1, 2, 3, 4, 5, 6, and 8, and tested on 3,000 images belonging to subject 7. The distribution of the errors from the Inception model during the seventh testing fold (i.e., testing images belonging to subject 7) is visualized with the violin plots in Fig. 3B. The MAPE on this testing set was 3.626%, and the mean absolute error (MAE) was 34.093 μm. From the testing results on the Inception architecture, it was evident that the regression model can accurately estimate the distance to the dura mater in most of the OCT images. The distributions of the errors from the Inception model in all the other testing folds can be found in Supplementary Figures 5 and 6.

Discussion
In this study, we validated our endoscopic OCT system for epidural anesthesia guidance. The OCT endoscope can provide 10-100 times higher resolution than conventional medical imaging modalities. Moreover, the proposed endoscopic OCT system is compatible with the clinically used epidural guidance methods (e.g., ultrasound, fluoroscopy, and CT) and will complement these macroscopic methods by providing detailed images in front of the epidural needle. Five different tissue layers, including fat, interspinous ligament, ligamentum flavum, epidural space, and spinal cord, were imaged. To assist OCT image interpretation, a deep learning-based CAD platform was developed to automatically differentiate the tissue layers at the epidural needle tip and predict the distance from the needle tip to the dura mater.
Three convolutional neural network architectures, ResNet50, Xception, and Inception, were tested for image classification and distance regression. The best classification accuracy for the five tissue layers was 60-65%, from a multi-class Inception classifier. The main challenge was the differentiation between fat and spinal cord (Supplementary Table 2) because they have similar features in OCT images (Fig. 1). Based on the needle puncture sequence, we divided the overall classification into four sequential binary classifications: fat vs. interspinous ligament, interspinous ligament vs. ligamentum flavum, ligamentum flavum vs. epidural space, and epidural space vs. spinal cord. The overall prediction accuracies of all four classifications exceeded 90%. ResNet50 presented the best overall performance compared to Xception and Inception. Due to the unique features of the epidural space in OCT images, it was possible to achieve > 99% precision when the needle arrived at the epidural space. Table 2 shows accuracies of ~ 99.8% and 100% when classifying epidural space vs. ligamentum flavum and epidural space vs. spinal cord, respectively. This will allow accurate detection of the epidural space for injection of the anesthetic during epidural anesthesia. The sequential transition from one binary classifier to the next was controlled accurately using a simple logic, which was demonstrated in a video simulating the insertion of a needle through the five tissue layers (Fig. 2). In the future, this can be improved by combining CNNs with recurrent neural networks (RNNs) to handle the temporal dimension of video streaming data 61. Additionally, we developed a CNN regression model to estimate the needle distance to the dura mater upon entry into the epidural space. For the regression task, Inception provided better performance than Xception and ResNet50. The mean relative error was 3.05%, which enabled accurate tracking of the needle tip location in the epidural space.
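To illustrate the sequential hand-off described above, the transition between binary classifiers can be expressed as a simple state machine. The sketch below is illustrative only, not the released implementation: each trained CNN is assumed to be wrapped in a callable that returns the probability that the current frame already shows the next tissue layer, and the 0.5 threshold is an assumption.

```python
# Sketch of the sequential binary-classification hand-off.
# Each stage discriminates the current layer from the next expected layer;
# when the "next layer" class wins, control passes to the next classifier.
# The classifier objects here stand in for trained binary CNN models.

LAYERS = ["fat", "interspinous ligament", "ligamentum flavum",
          "epidural space", "spinal cord"]

def run_cascade(frames, classifiers):
    """Track the tissue layer at the needle tip across a stream of frames.

    frames      -- iterable of OCT images (any object the classifiers accept)
    classifiers -- list of 4 callables; classifiers[i](frame) returns the
                   probability that the frame shows LAYERS[i + 1]
    Returns the predicted layer label for every frame.
    """
    stage = 0
    labels = []
    for frame in frames:
        if stage < len(classifiers) and classifiers[stage](frame) > 0.5:
            stage += 1  # the needle has advanced to the next layer
        labels.append(LAYERS[stage])
    return labels
```

Because the needle only moves forward through the layers, a one-way state machine like this is sufficient; handling needle withdrawal or a video stream with temporal context would require a more elaborate controller (e.g., the CNN-RNN combination mentioned above).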
CNNs have been shown to be a valuable tool in biomedical imaging. Manually configuring CNN architectures for an imaging modality can be a tedious trial-and-error process. ResNet, Inception, and Xception are commonly used architectures for general image classification tasks. Here, we showed that these architectures can be easily adapted for both classification and regression tasks in biomedical imaging applications. The best performance was obtained by ResNet50 for the binary classifications and by Inception for the distance regression.
The nested cross-validation and testing procedure was computationally expensive, but it provided uncertainty quantification of the test performance across subjects. The wall-clock time for training the binary classification models on NVIDIA Volta GPUs was ~ 11 min per validation fold for ResNet50, ~ 32 min for Xception, and ~ 11 min for Inception. The wall-clock time for training the regression models on NVIDIA RTX 3090 GPUs was ~ 50 min per validation fold for ResNet50, ~ 145 min for Xception, and ~ 36 min for Inception. Inference for the binary classifications on NVIDIA Volta GPUs took 13 ms per image on average, and inference for the distance regression on NVIDIA RTX 3090 GPUs took 2.1 ms per image on average. In the future, inference with these large CNN models can be further accelerated by weight pruning and knowledge distillation 62.
In our future hardware design, we will use a GRIN lens with a diameter suitable for the 16-gauge Tuohy needles used in epidural anesthesia 63,64. Furthermore, we will miniaturize our OCT scanner to make the system more portable and convenient for anesthesiologists to use in clinical applications. Finally, we will test the performance of our endoscopic OCT system together with the deep learning-based CAD platform in in-vivo pig experiments. Differences between OCT images acquired under in-vivo and ex-vivo conditions may deteriorate the in-vivo testing results; in that case, we will re-train our models using in-vivo pig data. Additionally, during the in-vivo experiments, there will be blood vessels surrounding the spinal cord 65. To address this, we plan to use Doppler OCT for blood vessel detection to avoid rupturing blood vessels during epidural needle insertion.

Methods
The imaging principle was based on a Michelson interferometer with a reference arm and a sample arm 38. The endoscopic system was built on a swept-source OCT (SS-OCT). The light source was a wavelength-swept laser with 1300 nm central wavelength and 100 nm bandwidth 66, with a maximum A-scan rate of 200 kHz. The light from the laser was first unevenly split by a fiber coupler (FC): 97% of the power was directed through a circulator into the interferometer, and the remaining 3% was input to a Mach-Zehnder interferometer (MZI), which provided the triggering signal for data sampling. The 97% portion was further split by another 50:50 FC into the reference arm and the sample arm. The reflected signal from the reference arm and the backscattered signal from the sample arm interfered with each other and were collected by a balanced detector (BD) for noise reduction. The signal was then sent to a data acquisition board (DAQ) and a computer for post-processing based on the Fourier transform 67.
While imaging samples in air, the axial resolution was 10.6 μm and the lateral resolution was 20 μm. To achieve endoscopic imaging, a gradient-index (GRIN) rod lens was added to the sample arm and fixed in front of the scanning lens of the galvanometer scanning mirror (GSM). The GRIN lens used in this study had a total length of 138 mm, an inner diameter of 1.3 mm, and a view angle of 11.0°, and was protected by thin-wall steel tubing. For dispersion compensation, a second, identical GRIN lens was fixed in front of the reflector (mirror) of the reference arm. In addition, polarization controllers (PCs) were placed in each arm to reduce the noise level.
The GRIN lens utilized in the sample arm was assembled in front of the OCT scanning lens of the GSM. To decrease the reflection from the proximal end surface of the GRIN lens, which significantly degraded the imaging quality, the proximal surface of the GRIN lens was aligned ~ 1.5 mm off the focus of the scanning lens. The GRIN lens had a length of four full (integer) pitches to relay images from the distal end to its proximal surface 68. In the sample arm, the proximal GRIN lens surface was adjusted close to the focal point of the objective after the OCT scanner. Thus, the spatial information from the distal surface (tissue sample) of the GRIN lens was transmitted to the proximal surface and collected by the OCT scanner, so that OCT images of the epidural tissues in front of the GRIN lens could be successfully obtained. Our endoscopic system provided a ~ 1.25 mm field of view (FOV) with a sensitivity of 92 dB.

Data acquisition. Backbones from eight pigs were acquired from local slaughterhouses and cut at the middle before imaging to expose the different tissue layers. From the cross-section of the sample, the different tissue types could be clearly distinguished by their anatomic features and positions, as shown in Fig. 5. To further limit the number of mislabeled samples, two lab members confirmed the tissue types before imaging started. In Fig. 5, the five tissue layers, including fat, interspinous ligament, ligamentum flavum, epidural space, and spinal cord, can be distinguished from their anatomic appearance. The OCT needle was placed against these confirmed tissue layers to obtain their OCT structural images. Following the practice of epidural needle placement, we mimicked the puncturing process by inserting the OCT endoscope through the fat, interspinous ligament, ligamentum flavum, and epidural space of our samples.
Since the targeted position of the anesthetic injection is the epidural space, with a width of ~ 1-6 mm 69, we also obtained OCT images of the epidural space by positioning the needle tip in front of the spinal cord at different distances. To mimic the condition of accidental puncture into the spinal cord, we took OCT images while inserting the endoscope into the spinal cord, applying some force during the insertion. For each backbone sample, 1000 cross-sectional OCT images were obtained from each tissue layer. To decrease noise and increase the deep-learning processing speed, the original images were cropped to smaller sizes that contained only the effective tissue information; images were cropped to 181 × 241 pixels for the tissue classification. The data was uploaded to Zenodo (http://doi.org/10.5281/zenodo.5018581) 70.
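The cropping step described above amounts to simple array slicing. The sketch below illustrates it; the 181 × 241 target size follows the paper, while the region-of-interest offsets are assumptions, since the exact crop window depends on the acquisition geometry.

```python
import numpy as np

def crop_to_roi(image, height=181, width=241, top=0, left=0):
    """Crop a raw OCT B-scan to the region containing tissue signal.

    The 181 x 241 target size follows the paper; the (top, left) offset
    of the region of interest is an assumption and would be chosen per
    acquisition geometry.
    """
    return image[top:top + height, left:left + width]
```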
At the end of imaging, the fat, interspinous ligament, ligamentum flavum, and spinal cord with dura mater of the porcine backbones were excised and processed for histology following the same orientation as the OCT endoscope imaging, for comparison with the corresponding OCT results. The tissues were fixed in 10% formalin, embedded in paraffin, sectioned (4 µm thick), and stained with hematoxylin and eosin (H&E) for histological analysis. Images were analyzed with a Keyence BZ-X800 microscope. Sectioning and H&E staining were carried out by the Tissue Pathology Shared Resource, Stephenson Cancer Center (SCC), University of Oklahoma Health Sciences Center. The hematoxylin (cat# 3801571) and eosin (cat# 3801616) were purchased from Leica Biosystems, and the staining was performed on a Leica ST5020 Automated Multistainer following the H&E staining protocol at the SCC Tissue Pathology core.
Convolutional neural networks. Convolutional neural networks (CNNs) were used to classify OCT images by epidural layer. Three CNN architectures, ResNet50 52, Inception 51, and Xception 53, were imported from the Keras library 71. The output layer of each model was a dense layer whose size corresponded to the number of categories. The images were centered by subtracting the training-set mean pixel value. SGD with Nesterov momentum was used as the optimizer, with a learning rate of 0.01, a momentum of 0.9, and a decay of 0.01. The batch size was 32. Early stopping was used with a patience of 10. The loss function was sparse categorical cross-entropy.
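A minimal sketch of this training setup using the Keras API is shown below. It is not the released implementation (available on GitHub): the three-channel input shape is an assumption, and the learning-rate decay argument is omitted because its name varies across Keras versions.

```python
import tensorflow as tf

def build_binary_classifier(arch="ResNet50", input_shape=(181, 241, 3)):
    """Build one binary tissue classifier, roughly following the training
    setup described in the text: SGD with Nesterov momentum (lr 0.01,
    momentum 0.9) and sparse categorical cross-entropy loss."""
    base_cls = getattr(tf.keras.applications, arch)  # ResNet50 / InceptionV3 / Xception
    model = base_cls(weights=None, input_shape=input_shape, classes=2)
    opt = tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9,
                                  nesterov=True)  # the paper also used decay=0.01
    model.compile(optimizer=opt,
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Early stopping with a patience of 10 epochs, as described in the text
early_stop = tf.keras.callbacks.EarlyStopping(patience=10)
```

Training would then call `model.fit(..., batch_size=32, callbacks=[early_stop])` with the centered images and integer class labels.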
Nested cross-validation and testing 72,73 were used for model selection and benchmarking as described previously 66. This evaluation strategy provides an unbiased estimate of model performance with uncertainty quantification using two nested loops for cross-validation and cross-testing. Images were acquired from eight subjects in this dataset and were divided into eight folds by subject to account for subject-to-subject variability. An eight-fold cross-testing loop was performed by rotating through every subject for testing and using the remaining seven subjects (7000 images) for cross-validation. In the cross-validation, six subjects were used for training and one subject for validation in each rotation. The seven-fold cross-validation loop was used to compare the performance of the three architectures: ResNet50, Xception, and Inception. The model with the best cross-validation performance was automatically selected for performance benchmarking in the corresponding testing fold. Supplementary Figure 7 depicts this evaluation strategy with Subject 1 used for testing. The performance of the overall procedure was evaluated by aggregating the testing performance from all eight testing folds. Grad-CAM 74 was used to generate instance-wise explanations of selected models 75,76.
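The subject-wise rotation described above can be enumerated with a few lines of code. This is only a sketch of the split logic (subject numbering is illustrative); it produces, for each held-out test subject, the seven inner validation/training splits.

```python
def nested_cv_splits(n_subjects=8):
    """Enumerate subject-wise nested cross-validation/testing splits:
    each subject is held out for testing once, and within each rotation
    every remaining subject is held out for validation once, leaving the
    other six subjects for training."""
    subjects = list(range(1, n_subjects + 1))
    for test_subject in subjects:
        rest = [s for s in subjects if s != test_subject]
        # inner: list of (validation subject, training subjects) pairs
        inner = [(val, [s for s in rest if s != val]) for val in rest]
        yield test_subject, inner
```

Splitting by subject rather than by image prevents frames from the same animal from appearing in both training and evaluation sets, which would otherwise inflate the measured performance.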
The computation was performed using the Schooner supercomputer at the University of Oklahoma and the Summit supercomputer at Oak Ridge National Laboratory. The computation on Schooner used five computational nodes, each of which had 40 CPU cores (Intel Xeon Cascade Lake) and 200 GB of RAM. The computation on Summit used up to 10 nodes, each of which had 2 IBM POWER9 processors and 6 NVIDIA Volta GPUs. The complete code for the classification models can be found at https://github.com/thepanlab/Endoscopic_OCT_Epidural.
The classification accuracy of the models was computed as:

Accuracy = (TP + TN) / (TP + TN + FP + FN) (1)

where TP is the number of true positives, TN the number of true negatives, FP the number of false positives, and FN the number of false negatives. Receiver operating characteristic (ROC) curves were used to visualize the relationship between sensitivity and specificity, and the area under the ROC curve (AUC) was used to assess the overall performance of the models.
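Eq. (1) can be computed directly from the predicted and true labels by counting the four confusion-matrix entries, for example:

```python
import numpy as np

def binary_accuracy(y_true, y_pred):
    """Compute classification accuracy from confusion-matrix counts, Eq. (1)."""
    y_true = np.asarray(y_true).astype(bool)
    y_pred = np.asarray(y_pred).astype(bool)
    tp = np.sum(y_pred & y_true)    # true positives
    tn = np.sum(~y_pred & ~y_true)  # true negatives
    fp = np.sum(y_pred & ~y_true)   # false positives
    fn = np.sum(~y_pred & y_true)   # false negatives
    return (tp + tn) / (tp + tn + fp + fn)
```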
Epidural distance prediction using deep learning. OCT images of the epidural space were obtained at distances ranging from approximately 0.2 mm to 2.5 mm between the needle tip and the spinal cord surface (dura mater). A total of 24,000 images from eight subjects were used for this task. For each image, the distance in micrometers (μm) from the epidural needle to the dura mater was manually measured and labeled; this distance label served as the ground truth for computing the loss during training of the regression model. All images were 241 × 681 pixels along the X and Z (depth) axes, with a pixel size of 6.25 µm. The pixel values of each image were scaled to the range of 0-255.
The regression model was developed to automatically estimate the distance from the epidural needle to the dura upon entry into the epidural space. Three architectures, ResNet50, Inception, and Xception, were compared using nested cross-validation and testing as described above. The final output layer consisted of a single neuron with an identity activation function for regression on the continuous distance values 77. The SGD algorithm with Nesterov momentum was used with a learning rate of 0.01, a momentum of 0.9, and a decay rate of 0.01. Training took place with a batch size of 32 over 20 epochs. The mean absolute percentage error (MAPE) and mean absolute error (MAE) were used to evaluate the regression performance due to their intuitive interpretability in terms of relative and absolute error. The MAPE and MAE metrics are defined in Eqs. (2) and (3), respectively. Model training and testing for the regression task were performed on a workstation equipped with dual NVIDIA RTX 3090 GPUs. The complete code for the regression models can be found at https://github.com/thepanlab/Endoscopic_OCT_Epidural.
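A minimal Keras sketch of such a regression network is shown below. To keep the example small, a tiny convolutional backbone stands in for the much larger Inception/ResNet50/Xception backbones actually used, and the single-channel input shape is an assumption; only the single linear output neuron and the loss/metric choices follow the text.

```python
import tensorflow as tf

def build_distance_regressor(input_shape=(681, 241, 1)):
    """Toy stand-in for the distance-regression networks: a small
    convolutional backbone with a single linear output neuron predicting
    the needle-to-dura distance in micrometers."""
    model = tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        tf.keras.layers.Conv2D(16, 3, activation="relu"),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(1, activation="linear"),  # identity activation
    ])
    model.compile(
        optimizer=tf.keras.optimizers.SGD(learning_rate=0.01,
                                          momentum=0.9, nesterov=True),
        loss="mean_absolute_percentage_error",
        metrics=["mean_absolute_error"])
    return model
```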
The MAPE and MAE were computed as:

MAPE = (100% / n) × Σ_i |Y_i − X_i| / Y_i (2)

MAE = (1 / n) × Σ_i |Y_i − X_i| (3)

where Y_i was the ground truth distance, X_i was the predicted distance, and n was the number of OCT images.
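Eqs. (2) and (3) correspond directly to the following NumPy implementation:

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean absolute percentage error, Eq. (2), in percent."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs(y_true - y_pred) / y_true)

def mae(y_true, y_pred):
    """Mean absolute error, Eq. (3), in the units of the distance (um)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.mean(np.abs(y_true - y_pred))
```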

Data availability
The datasets generated and/or analyzed during the current study are available in the GitHub repository, https://github.com/thepanlab/Endoscopic_OCT_Epidural.