Novel clinical device tracking and tissue event characterization using proximally placed audio signal acquisition and processing

We propose a new approach, complementary to image guidance, for monitoring the interaction of medical interventional devices (MID) with human tissue and for surgery augmentation. Acoustic emission data are acquired from the proximal end of the MID, outside the patient, to extract dynamical characteristics of the interaction between the distal tip and the tissue touched or penetrated by the MID. We conducted phantom based experiments (n = 955) to show dynamic tool/tissue interaction during needle passage through tissue (a) and vessel perforation caused by a guide wire (b). We use time-varying auto-regressive (TV-AR) modelling to characterize the dynamic changes and the time-varying maximal energy pole (TV-MEP) for the subsequent analysis of MID/tissue interaction patterns. Qualitative and quantitative analysis showed that the TV-AR spectrum and the TV-MEP indicated the time instants of the needle path through different phantom objects (a) and clearly distinguished a perforation from other generated artefacts (b). We demonstrated that audio signals acquired from the proximal part of an MID could provide valuable additional information to surgeons during minimally invasive procedures.

occlusion 11,12 . The main drawback of these approaches is that the sensors are usually located at the distal end of the guide wire, i.e., invasively in the part of the device that is inserted inside the body. Moreover, because of this distal placement, they degrade the device's clinical efficiency owing to the size of the sensors and the placement of the required cables or wires.
In this work we propose a new approach for MID/tissue interaction monitoring and surgery augmentation using acoustic emission (AE) data acquisition from the proximal end of a conventional clinically used device to extract dynamical characteristics of the interaction between the distal tip and the tissue. The hypothesis is that a change of tissue encountered by the distal end of the MID as it advances causes changes in the characteristics of the signal measured at the proximal end. By applying advanced signal processing techniques to the acquired audio signal it should be possible to identify and characterize significant events related to the interaction of the MID distal tip with the tissue, such as penetration, friction and puncture dynamics. In contrast with recent papers on surgical soundtracks or auditory display 13,14 , which use medical image analysis for sonification, this work aims at acquiring, modifying and amplifying natural sounds of tool/tissue interactions. AE techniques have been extensively used as a non-invasive diagnosis tool in different research and development fields. In mechanical machining processes AE has been widely used for non-invasive online tool condition monitoring 15 , where AE signals are processed for tool wear assessment or, more generally, for machine health monitoring and fault diagnosis. Other areas such as nondestructive testing use AE techniques for detecting and locating flaws in structures 16 . In medicine, with the exception of orthopaedics 17 , AE has been little exploited. A few works have been proposed with a goal similar to our approach 18-20 , but only for acquiring AE from drilling machines in orthopaedic surgery or, more generally, in bone surgery. Our approach is intended for soft tissue applications and involves a completely different challenge.
Experimental setups for two different MIDs, a biopsy needle and a guide wire, were implemented, and time-varying autoregressive (TV-AR) modelling was used for extracting valuable features from the audio signal related to the friction dynamics between the tool and the tissue. Some preliminary results of this work concerning the guide wire experimental setup have been presented in 21 .
Qualitative and quantitative results showed that with the proposed approach the time instants when the needle enters and leaves different phantom objects are clearly identifiable, and that the signal patterns of perforations differ significantly from those of generated artefacts for the guide wire. This demonstrates that audio signals by themselves, acquired from the proximal part of a conventional and only slightly altered MID, could potentially provide valuable additional information to surgeons during minimally invasive procedures. In combination with established imaging guidance this could lead to increased success of selected surgical procedures or reduce false negative events. This could also be used in a future setup for readjusting robotic devices that exclusively rely on preoperative imaging data.

Methods
Experimental setup and AE data acquisition. Experimental setups (ES) were implemented for two different MIDs: an 18G biopsy needle of 200 mm length with the needle core (ITP, Germany) and a 0.014-inch guide wire of 1.8 m length (Boston Scientific, US) placed inside a flushed 1.9 F micro catheter of 1.5 m length. The main objective of the ES for the biopsy needle was to observe dynamical changes in the audio signal when the needle passes through two different tissue structures. In contrast, the guide wire ES was intended to analyse the signal dynamics of perforation in vascular structures.
For both experiments AE signals were acquired using a stethoscope connected to a microphone which was directly and firmly attached to the proximal end of the MID via a 3D printed adapter (see top of Fig. 1). For each MID experiment qualitative and quantitative analyses were performed, and a database for each ES was implemented for the quantitative case.
A gelatine phantom filled with different fruits, chicken breast and liver located 6 cm deep was used for the biopsy needle qualitative analysis, while for quantitative analysis the gelatine phantom was filled with ex-vivo porcine tissue. The needle insertion was performed manually for qualitative analysis and automatically for quantitative analysis.
The qualitative and quantitative guide wire perforation tests were performed on ex-vivo porcine coronary arteries. The guide wire was placed inside a flushed micro catheter. This was shaped to mimic a natural tortuous pathway to the coronaries.
For both ES, the audio signals were recorded in WAV format with a sampling frequency of 44100 Hz. Matlab R2015b was used for the audio signal analysis.
Database implementation for needle experimental setup quantitative analysis. For the needle ES, 80 audio recordings were acquired during automatic insertion of the biopsy needle into an ex-vivo porcine tissue phantom using a testing machine (Zwicki, Zwick GmbH & Co.KG, Ulm) at an insertion velocity of 3 mm/s (see Fig. 1a).
In order to show the repeatability of the approach, the time instants of object entry t_in and exit t_out were manually annotated. For that purpose the force from the testing machine was recorded synchronously with the audio and a video camera was placed in front of the phantom (see Fig. 1a). The objective was to set t_in at the time instant of the first signal deflection, when the force started to change (contact of the needle with the tissue), and to synchronize this time instant with the one observed in the video when the needle touches the tissue. In this way video and audio were also synchronized and t_out was taken directly from the synchronized video (see Fig. 2).
We performed experiments at two additional velocities in order to study the performance of our approach when the needle insertion velocity is changed. Insertion velocities encountered in current clinical procedures can vary widely, but in 2 , insertion velocities between 1 and 10 mm/s were analysed. Experiments at 5 mm/s and 8 mm/s were performed in addition to the set at 3 mm/s described above. We acquired 20 recordings for each additional velocity and applied the same procedure as for the experiments at 3 mm/s.
Database implementation for guide wire experimental setup quantitative analysis. For the guide wire ES, 560 audio signals of 30 seconds duration were recorded during the tip perforation of coronary arteries belonging to 10 porcine hearts (see Fig. 1b). The main objective of the created database was to analyse the performance in classifying the audio signals as a perforation or as an artefact. Therefore, 315 additional recordings with different types of induced guide wire audio artefacts were acquired, including friction between the guide wire and the artery wall (165 recordings) and tiny guide wire bumps (150 recordings).
Audio signal characteristics extraction. The block diagram of Fig. 3 displays the main steps of the signal processing approach used for the extraction of valuable characteristics from the AE signals for both ES. Each signal was first decimated and then bandpass filtered. The resulting signals were modelled using a TV-AR parametrical model for estimating a time-varying (TV) parametrical power spectral density (PSD) and TV poles. Finally an indicator based on tracking the pole of maximal energy was computed.
Signal pre-processing. The audio signal was first decimated to simplify the tracking of dynamical changes using the AR pole representation. The resulting signal was then bandpass filtered to focus the analysis in the frequency range significant for the used stethoscope.
Time-varying auto-regressive modelling for acoustic signature extraction. When an MID crosses a given tissue structure the resulting friction on the cutting edges produces an audio signal whose dynamics involve characteristics that vary strongly in time. Additionally, when the MID passes through two different tissue structures (for example the boundary between two different tissue layers), the audio signal presents important transient dynamics during and after the tissue transition that can change abruptly. All these dynamical characteristics, together with the fact that the audio signals contain significant background noise, and therefore have a poor signal-to-noise ratio (SNR), make the signal difficult to process, and the information that this signal conveys must be decoded. What we propose in this work is to find an acoustic signature in the signal that provides information concerning tissue transition in the needle case and vessel perforation in the guide wire case.
Due to the signal characteristics described above, classical methods for stationary processes can no longer be used to follow the dynamical changes of this signal, and they cannot describe many conditions in processes where transient phenomena are involved. In the presence of time-varying characteristics the classical way to follow these variations is tracking, and TV-AR modelling is well suited for extracting a signature from an audio signal using the pole representation 22-24 . This acoustic signature was extracted in this work from the TV-AR spectrum and the dominant TV-AR pole.
Classical AR modelling for stationary processes is a well-known technique for parametrical spectral estimation, and an extensive literature shows its advantages over non-parametrical methods (for detailed information about classical AR modelling see 25 ). One advantage is that, when an appropriate model is selected, it offers a higher spectral resolution, even for signals with poor SNR, and requires less data than classical methods. Another important advantage for our work is that its TV version allows the decomposition of different TV dynamics through the pole representation, and hence the tracking of those dynamical changes.
The main difference between the stationary AR version and the TV-AR one that we used in this work is that the parameters of the AR model are now time dependent, which results in a time-dependent representation of the transfer function:

$$H[z, n] = \frac{1}{A[z, n]} = \frac{1}{1 + \sum_{k=1}^{p} a_k(n)\, z^{-k}} \qquad (1)$$

where n represents the sample time instant, p the model order and a_k(n) the time-variant AR parameters. This gives rise to a time-varying spectrum

$$S_{AR}(f, n) = \frac{\sigma_e^2(n)}{\left| 1 + \sum_{k=1}^{p} a_k(n)\, e^{-j 2 \pi f k / f_s} \right|^2} \qquad (2)$$

where \sigma_e^2(n) is the variance of the noise driving the model and f_s is the sampling frequency.
As mentioned above, dynamical changes of a nonstationary process can be tracked using a pole approach. Different poles should be associated with different dynamic components of the signal. Because of the complex properties of the MID/tissue interaction audio signal it would be very difficult to track each estimated pole. This is why we decided to track only one pole, the dominant one. We assume that transitions between tissue layers are modelled mainly by one pole which at each time instant contains the maximal energy of the spectrum. We can assume that when the MID passes from one medium to another this dominant pole changes position, and that the time instant when the pole abruptly moves is the time instant of transition between two tissue layers.
For estimating the Time Varying Maximal Energy Pole (TV-MEP), first the poles z_k(n) were obtained by finding the roots of the AR coefficient polynomial in the denominator of the time-varying pole representation of the transfer function obtained from equation (1):

$$H[z, n] = \frac{1}{\prod_{k=1}^{p} \left( 1 - z_k(n)\, z^{-1} \right)} \qquad (3)$$

Then equation (3) was used for computing the evolution of the maximal energy pole, i.e., the pole that, at each time window, had the maximal spectral power. It was computed by first calculating the r time dependent resonant frequencies from the phase angle \theta_k(n) of the corresponding pole in the upper half of the complex plane, as explained in 26 :

$$f_k(n) = \frac{f_s}{2 \pi}\, \theta_k(n), \qquad k = 1, \ldots, r$$

where f_s and f_k(n) correspond to the sampling frequency and to the r resonant frequencies resulting from the poles, respectively. Then the spectral power P_k of the resonant frequency k was obtained from the real part of the residue term r_k. Finally, at each time instant n the maximal energy pole was computed as the frequency belonging to the pole having the maximal power among the r resonant frequencies.
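The pole-tracking step described above can be sketched for a single analysis window as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it assumes the sign convention A(z) = 1 + sum_k a_k z^-k, it takes the residue r_k to be the residue of S(z)/z at the pole z_k (one common choice in AR spectral decomposition; the exact definition used in 26 is not reproduced in this text), and the function name `max_energy_pole` is hypothetical.

```python
import numpy as np

def max_energy_pole(a, fs, sigma2=1.0):
    """Return (frequency_hz, power) of the maximal-energy pole of one window.

    a      : AR coefficients [a_1, ..., a_p], with A(z) = 1 + sum_k a_k z^-k
             (assumed sign convention)
    fs     : sampling frequency in Hz
    sigma2 : driving-noise variance of the AR model
    """
    a = np.asarray(a, dtype=float)
    # Poles are the roots of z^p + a_1 z^(p-1) + ... + a_p.
    poles = np.roots(np.concatenate(([1.0], a)))
    p = len(poles)
    best_freq, best_power = None, -np.inf
    for k, zk in enumerate(poles):
        if zk.imag < -1e-12:
            continue  # lower half plane: covered by the conjugate pole
        others = np.delete(poles, k)
        # Residue of S(z)/z at z_k, with S(z) = sigma2 / (A(z) A(1/z)).
        rk = sigma2 * zk ** (p - 1) / (
            np.prod(zk - others) * np.prod(1.0 - poles * zk)
        )
        # A complex pole and its conjugate contribute equal real parts.
        power = rk.real if abs(zk.imag) <= 1e-12 else 2.0 * rk.real
        # Resonant frequency from the pole phase angle.
        freq = fs * np.angle(zk) / (2.0 * np.pi)
        if power > best_power:
            best_freq, best_power = freq, power
    return best_freq, best_power
```

Calling this function on the AR coefficients of each successive window yields one frequency value per window, which is the kind of trajectory the text refers to as the TV-MEP.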
Accession codes. The algorithms described in the Methods section are available to editors and referees upon request. For the needle ES an algorithm that has as input an audio signal and as output the detected time instants of the porcine tissue entry and exit can be provided. For the guide wire ES the full classification algorithm can be provided.

Results and Discussions
In this section, qualitative and quantitative results are presented for both ES. We analyse performances on detecting abrupt dynamical changes produced by the needle tip during its entry and exit of tissue and on classifying a guide wire event as a perforation or as an artefact.

Model parameter settings.
In this work the TV-AR model parameters were computed over a sliding window of width w with an overlap of Ov. In each window, an AR model of order p was used to estimate the AR parameters using the Yule-Walker method, and for each window the AR spectrum and poles were computed. The values of w, Ov and p were set to 110 ms, 50% and 30 respectively for the needle ES and to 50 ms, 95% and 20 respectively for the guide wire ES.
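The windowed estimation described above can be sketched as follows. This is an illustrative sketch only: it assumes the biased autocorrelation estimator and the sign convention x[n] + sum_k a_k x[n-k] = e[n], applies no window tapering, and the function names `yule_walker` and `tv_ar` are hypothetical.

```python
import numpy as np

def yule_walker(x, p):
    """Estimate AR(p) coefficients [a_1..a_p] and noise variance of one window."""
    x = np.asarray(x, dtype=float)
    x = x - x.mean()
    n = len(x)
    # Biased autocorrelation estimates r[0..p].
    r = np.array([np.dot(x[: n - k], x[k:]) / n for k in range(p + 1)])
    # Toeplitz autocorrelation matrix of the Yule-Walker equations.
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    a = np.linalg.solve(R, -r[1:])
    sigma2 = r[0] + np.dot(a, r[1:])
    return a, sigma2

def tv_ar(x, fs, win_s, overlap, p):
    """Sliding-window AR estimates as a list of (start_time_s, a, sigma2)."""
    w = int(round(win_s * fs))
    hop = max(1, int(round(w * (1.0 - overlap))))
    frames = []
    for start in range(0, len(x) - w + 1, hop):
        a, sigma2 = yule_walker(x[start : start + w], p)
        frames.append((start / fs, a, sigma2))
    return frames
```

With the needle ES settings quoted above, the call would be `tv_ar(x, fs, 0.110, 0.50, 30)`, where `fs` is the sampling frequency after decimation, which the text does not state.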
For the needle ES, the bandpass filter consists of a 7th order Butterworth filter with a passband of 3-6 kHz. Due to the more transient characteristics of guide wire perforation dynamics, the bandpass filter for this signal was implemented using the Discrete Wavelet Transform (DWT). For that purpose the signal was decomposed into 10 scales using a Daubechies DWT and finally reconstructed from selected middle-frequency wavelet scales as presented in 27 .
Qualitative results. Biopsy needle tissue penetration analysis. The ES for the biopsy needle attempts to emulate different tissue structures in order to analyse the different types of AE response that can be obtained as a result of the friction with the cutting edges of the needle tip. The tested tissues were two fruits, persimmon and grape, and two chicken parts, breast and liver, all of them having different structural characteristics. Figure 4 displays the results of needle insertion in the four tested tissue objects. For each one the original signal, its bandpass filtered version, the TV-AR spectrum and the TV-MEP are shown. We can see in the original signal that only for the grape is it possible to clearly identify a dynamical change during the time interval between entry into and exit from the tissue. However, it is difficult to determine the onset and offset of this dynamical change. For the other tissue objects it is not at all evident when exactly the needle enters and passes through the object. Since the 3D printed box is not isolated from the outside, the audio recordings contain different artefacts that can be more or less disturbing. It is therefore necessary to apply a signal processing strategy in order to enhance information that otherwise stays hidden from the human eye. In this sense the bandpass filtered signals already enhance the information obtained from the penetration of the needle into the different tissues.
The needle entering and leaving the persimmon becomes now visually evident and the TV-MEP significantly changes its frequency as a result of change of friction dynamics when the tip passes through the persimmon.
In chicken breast and liver, we can observe that the friction dynamics during needle penetration are more transient than in the fruits, resulting in a more difficult visualization of the process even after the filter is applied. AR modelling, as parametrical technique to analyse data, can help through the pole representation to extract more valuable and analytical information from friction dynamics, as can be seen in the behaviour of the TV-MEP.
The analysis of the grape case shows interesting results starting with the normalized TV-AR spectrum that is shown in Fig. 5a. The grape is a fruit with internal inhomogeneities as a combination of seeds and water and with that its TV-AR spectrum is quite illustrative. We can observe that before and after the entry and exit from the grape the main frequencies are more stable and concentrated in a narrowband location. This can be explained by the homogeneity of the gelatine. When the needle is inside the grape the frequency band is wider, certainly due to the unstable beating of the needle between seeds and water.
Another important analysis can be observed in Fig. 5b. In this figure AE recordings of grapes located at two different depths, 5 and 10 cm, are displayed. We can see in the filtered audio signal that there is nearly no difference in intensity between both grapes, and the TV-MEPs for both cases are similar. This would suggest that the audio is propagated without significant losses from both locations.
Scientific Reports | (2018) 8:12070 | DOI:10.1038/s41598-018-30641-0
Finally it is important to look at the difference in Fig. 2 between the needle entry and exit dynamics in the audio and force signals. It is possible to observe that the transition from gelatine to fat, when the needle enters the tissue, is abrupt in the audio signal while it is slow in the force signal. This would suggest that the transition between two states (needle passing through two different tissues) could be detected as an abrupt change in the audio signal but not in the force signal. Furthermore, the exit from the fat is evident in the audio signal but not in the force signal, indicating that, unlike in the force signal, the dynamical changes occurring when a needle passes a layer boundary are not lost in the audio signal when the event occurs deeper in the body.
Guide wire perforation audio signal characteristics. Figure 6 shows three real audio signals obtained from the proximal end of the guide wire. Three different guide wire event cases are depicted to describe the differentiating signal patterns: heart vessel perforation, friction and a guide wire bump. For each case, the original signal, the TV-AR spectrum and the TV-MEP are shown. Additionally, we display a 3D option for the visualization of the spectrum. The 2D display of the spectrum allows one to see the time spectral lines, and the 3D display gives better insight into the frequency power distribution over time. A visual analysis of the time-varying patterns of the spectrum and of the TV-MEP allows the following to be verified:
• The TV-AR spectrum shows main frequency components that are stable in time during a perforation (inside the segment of higher energy in the TV spectrum), unlike an artefact, where the frequencies are more dispersed in time. This is equivalent to saying that a higher spectral disorder is observed when an artefact occurs than when a perforation occurs.
• During a perforation some characteristic segments can be distinguished in the TV-MEP: 1) a strong overshoot represented by a fast and short rise of the pole frequency, 2) a plateau just after the overshoot until the signal goes back to a stationary stage. It is possible to observe that the overshoot is not present when an artefact occurs and also that the plateau is much more stable when a perforation occurs than when an artefact is generated.
This analysis leads us to believe that a perforation can be identified from other occurring events, since it leaves a clear audio signal trace in the time-frequency domain. First, the overshoot is an important characteristic of a perforation indicating a high concentration and release of energy just before and just after the guide wire tip crosses the vessel wall. We can assume that different elasticities of vessel structures could result in different patterns of the overshoot. But this overshoot should be present when a perforation occurs. Secondly, the audio signal dynamics in the TV-MEP just after the overshoot are stable, which can be explained by the fact that after exiting the vessel wall the dynamic comes back to a stationary state through a damped stable frequency that should be a transformation of the natural frequency of the tissue. This dominant frequency has high spectral energy and is always characterized by a unique pole. The purity of the main frequencies of the spectrum in terms of their stability over time, which is clearly visible in both 2D/3D visualization of the spectrum, confirms the clear response of the vessel wall after excitation.
Quantitative results. Needle experimental setup. For each recording belonging to the needle database, the entry and exit time instants were automatically detected using a simple CUSUM hypothesis testing over the obtained TV-MEP signal 28 . These time instants were compared with the manual references by computing the absolute values of the time errors between automatic and manual annotations.
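As an illustration of this detection step, a two-sided CUSUM test for a shift in the mean of the TV-MEP trajectory can be sketched as below. The text does not give the exact CUSUM variant or its parameters, so the reference-segment estimate of the baseline, the `drift` and `threshold` parameters, and the function name `cusum_detect` are all assumptions made for this sketch.

```python
def cusum_detect(x, n_ref, drift, threshold):
    """Return the first index at which a two-sided CUSUM alarm fires, or None.

    x         : sequence of TV-MEP values (one per analysis window)
    n_ref     : number of initial samples used to estimate the baseline mean
    drift     : allowance subtracted at each step (suppresses slow wander)
    threshold : alarm level for the cumulative sums
    """
    baseline = sum(x[:n_ref]) / n_ref
    g_pos = 0.0  # accumulates upward deviations
    g_neg = 0.0  # accumulates downward deviations
    for i in range(n_ref, len(x)):
        d = x[i] - baseline
        g_pos = max(0.0, g_pos + d - drift)
        g_neg = max(0.0, g_neg - d - drift)
        if g_pos > threshold or g_neg > threshold:
            return i
    return None
```

To obtain both an entry and an exit instant, one option is to run the detector once from the start of the recording and then again from the first alarm with a baseline re-estimated inside the tissue; this, too, is a design assumption rather than the documented procedure.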
In Fig. 7a the errors between manual and automatic annotations are shown for each of the 80 recordings, for both needle entry into and exit from the ex-vivo porcine tissue. In all cases the TV-MEP was sensitive to needle entry and exit. The average and standard deviation of the detection errors were 0.53 ± 0.35 seconds for needle entry and 0.59 ± 0.32 seconds for needle exit. If we consider that the automatic needle insertion velocity with the [...]
For the recordings performed at the additional velocities of 5 and 8 mm/s the average and standard deviation of the detection errors were 0.11 ± 0.29 and 0.16 ± 0.30 seconds for needle entry and −0.09 ± 0.51 and 0.002 ± 0.40 seconds for needle exit, respectively. Taking the velocities into account, the errors are equivalent to 0.56 ± 1.44 mm at entry and −0.43 ± 2.53 mm at exit for the insertion velocity of 5 mm/s, and to 1.27 ± 2.37 mm at entry and 0.02 ± 3.23 mm at exit for the insertion velocity of 8 mm/s. In terms of detection errors in seconds, the results for the three tested velocities are similar for needle entry and exit. In terms of distance, however, the standard deviation increases with the velocity. This is explained by the fact that a given time error corresponds to a larger distance when the velocity is higher.
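The conversion from a time error to a distance error used above is simply a multiplication by the insertion velocity. A minimal sketch follows; the function name `time_error_to_mm` is hypothetical.

```python
def time_error_to_mm(mean_s, std_s, velocity_mm_s):
    """Convert a detection error (mean, std) in seconds to millimetres."""
    return mean_s * velocity_mm_s, std_s * velocity_mm_s

# Needle entry at 5 mm/s, using the rounded values quoted in the text.
mean_mm, std_mm = time_error_to_mm(0.11, 0.29, 5.0)
print(f"{mean_mm:.2f} +/- {std_mm:.2f} mm")
```

Applied to the rounded values in seconds, this gives about 0.55 ± 1.45 mm, close to the reported 0.56 ± 1.44 mm; the small difference is presumably because the reported distances were derived from unrounded time errors.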
Guide wire experimental setup. For the guide wire, 16 features were extracted from the TV-MEP and from the TV-AR spectrum for each of the 875 recordings in the database. These features are computed following the qualitative analysis made previously of the dynamical characteristics of a perforation. For the explanation of the extracted features let P_m(n) be the TV-MEP signal at time instant n and S_AR(f, n) be the TV-AR spectrum computed from equation (2) at time instant n and frequency f. Let also t_on and t_off be the time instants of the beginning and end of the occurring event (perforation or artefact), which are computed following the method presented in 29 . The proposed features can be divided into three groups: overshoot based features, plateau based features and TV-AR spectrum based features.
Overshoot based features: The main objective of this type of feature is to provide quantitative information concerning the overshoot of P_m(n) that is usually present in guide wire perforations. We compute 6 overshoot based features.
Plateau based features: These features are based on the stability of the plateau characteristics in P_m(n). We consider a plateau in P_m(n) as an event occurring just after the overshoot, which involves low signal fluctuations. Therefore the plateau is estimated using the derivative of P_m(n). If this derivative is continuously low enough then the interval is considered as a plateau. Following this concept three plateau signal intervals PL_1(n), PL_2(n) and PL_3(n) are computed by applying three different threshold values to the fluctuations in the derivative of P_m(n). We compute 8 plateau based features.
In the feature definitions, Corr is the correlation of the matrix and Mean is the average of the correlations.
Figure 7b shows a 3D scatter plot of the values obtained for the 875 guide wire audio signals for three different features, each belonging to a different type of feature: the overshoot based F_1, the plateau based F_7 and the TV-AR spectrum based F_16. The scatter plot shows different locations for the points belonging to heart vessel perforation compared with the artefact events.
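The plateau estimation described above (an interval after the overshoot in which the derivative of P_m(n) stays continuously low) can be sketched as follows. The threshold values, the minimum plateau length and the way the overshoot position is supplied are not given in the text; `threshold`, `min_len`, `start` and the function name `plateau_interval` are therefore hypothetical choices for illustration.

```python
def plateau_interval(pm, start, threshold, min_len):
    """Return (first, last) indices of the first plateau at or after `start`.

    pm        : TV-MEP values P_m(n), one per analysis window
    start     : index just after the overshoot (assumed to be known)
    threshold : maximum allowed |P_m(n) - P_m(n-1)| inside a plateau
    min_len   : minimum number of samples for an interval to count
    Returns None when no interval qualifies.
    """
    run_start = None
    for i in range(max(start, 1), len(pm)):
        if abs(pm[i] - pm[i - 1]) <= threshold:
            if run_start is None:
                run_start = i - 1  # the run begins at the previous sample
        else:
            if run_start is not None and i - run_start >= min_len:
                return run_start, i - 1
            run_start = None
    if run_start is not None and len(pm) - run_start >= min_len:
        return run_start, len(pm) - 1
    return None
```

Calling the function with three different `threshold` values would give three intervals playing the role of PL_1(n), PL_2(n) and PL_3(n).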
In order to classify between heart vessel perforation and artefact a Support Vector Machine (SVM) was used as the tool for binary classification (perforation or artefact). SVMs are widely used in applications such as bioinformatics, text mining, face recognition and image processing. They are based on supervised learning and margin maximization, meaning that a labelled dataset (perforations labelled with 1 as target value, artefacts with 0 as target value) is fed to the algorithm to create a high dimensional hyperplane, which serves as the desired decision boundary with maximal distance between the classes 30 . In this work a 16-dimensional space is used.
A balanced training data set plays a crucial role in the performance of an SVM 31 . In order to obtain a meaningful result, the same proportion of perforations and artefacts was used for training. Thus, 100 recorded perforations were used along with 50 friction and 50 bump recordings for training. For the subsequent test of the trained SVM, 460 heart vessel perforation recordings, 115 friction recordings and 100 bump recordings were used. Table 1 shows the obtained results of the SVM classification, where we can see that of the 460 tested heart vessel perforations the algorithm correctly recognized 418 recordings. It can also be seen that of the 215 tested artefacts only 10 were classified as perforations. This demonstrates that the assumptions on overshoot, plateau and time-varying spectrum were highly useful for distinguishing between a perforation and an artefact.
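The counts quoted above can be turned into standard summary rates. The sketch below uses only the numbers given in the text (Table 1 itself is not reproduced here), and the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(tp, fn, fp, tn):
    """Sensitivity, specificity and accuracy from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    return sensitivity, specificity, accuracy

# 418 of 460 perforations recognized; 10 of 215 artefacts assigned as perforation.
sens, spec, acc = binary_metrics(tp=418, fn=460 - 418, fp=10, tn=215 - 10)
print(f"sensitivity={sens:.3f} specificity={spec:.3f} accuracy={acc:.3f}")
```

On these counts the sensitivity is about 90.9%, the specificity about 95.3% and the overall accuracy about 92.3%.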

Conclusions
The presented research approach shows a novel and relatively simple method to obtain valuable information for MID guidance and tissue/device interactions using an attached microphone that records sound propagated along the device's shaft for further analysis and evaluation. These interactions, which start at the tip of an interventional device, can be picked up and detected at the proximal end of a standard clinically used device. Even though this work presents results from experiments performed in a controlled laboratory environment, where many sources of physiological and non-physiological artefacts are not present, we strongly believe that, due to the clear dynamical changes in the signal, the performance of the presented approach should not be significantly affected in real clinical conditions.
The experiments with an implemented database for both MIDs show that valuable information can be extracted from the audio signal through the TV-MEP. It is sensitive to tissue boundary transitions and has a distinguishable signature when the guide wire perforates a vessel. The audio induced by the friction on the cutting edges in needle insertion and perforation of a guide wire are events that involve signal dynamics containing specific time-variant frequency components and dominating energies. For this reason it should be possible to distinguish these events, clinically important, from events or artefacts induced by other unwanted sources during the process.
The results indicate that a first use of this approach should be for verification and/or visual haptic feedback for real time tissue/tissue passage. The obtained processed audio signal contains additional complementary feedback information to the one provided by image guidance. For example, if integrated with ultrasound this would allow verification of needle tip position even when the tip is difficult to distinguish, which happens quite frequently in applications such as regional anaesthesia, where the angle between the US probe and the needle is too small. Another application could be for surgical robot guidance where the audio information could correct or verify a position or be used for haptic robotic feedback.
For applications with relatively clear boundary structure, the presented approach will help to obtain clearly distinguishable and automatically detectable dynamical changes when the MID passes through tissue structures.
We were able to clearly identify boundaries and we can observe that each tested tissue leaves a different acoustic signature when analysed in a time-varying parametrical context. A next development step will be to study whether it is possible to classify tissue through this approach. For that, advanced signal processing algorithms should be developed for taking into account the non-linearities, inhomogeneities and elasticity characteristics of a given tissue.
Also needed is a methodology for translating decoded information into a friendly graphical user interface or acoustic signal and to study and evaluate human performances with respect to a real clinical intervention.