Depth-resolved registration of transesophageal echo to x-ray fluoroscopy using an inverse geometry fluoroscopy system

Charles R. Hatt
Department of Biomedical Engineering, University of Wisconsin-Madison, Madison, Wisconsin 53705

Michael T. Tomkowiak, David A. P. Dunkerley, and Jordan M. Slagowski
Department of Medical Physics, University of Wisconsin-Madison, Madison, Wisconsin 53705

Tobias Funk
Triple Ring Technologies, Inc., Newark, California 94560

Amish N. Raval
Department of Medicine, University of Wisconsin-Madison, Madison, Wisconsin 53792

Michael A. Speidel a)
Departments of Medical Physics and Medicine, University of Wisconsin-Madison, Madison, Wisconsin 53705

(Received 26 July 2015; revised 28 September 2015; accepted for publication 29 October 2015; published 13 November 2015)

Purpose: Image registration between standard x-ray fluoroscopy and transesophageal echocardiography (TEE) has recently been proposed. Scanning-beam digital x-ray (SBDX) is an inverse geometry fluoroscopy system designed for cardiac procedures. This study presents a method for 3D registration of SBDX and TEE images based on the tomosynthesis and 3D tracking capabilities of SBDX.

Methods: The registration algorithm utilizes the stack of tomosynthetic planes produced by the SBDX system to estimate the physical 3D coordinates of salient key-points on the TEE probe. The key-points are used to arrive at an initial estimate of the probe pose, which is then refined using a 2D/3D registration method adapted for inverse geometry fluoroscopy. A phantom study was conducted to evaluate probe pose estimation accuracy relative to the ground truth, as defined by a set of coregistered fiducial markers. This experiment was conducted with varying probe poses and levels of signal difference-to-noise ratio (SDNR).
Additional phantom and in vivo studies were performed to evaluate the correspondence of catheter tip positions in TEE and x-ray images following registration of the two modalities.

Results: Target registration error (TRE) was used to characterize both pose estimation and registration accuracy. In the study of pose estimation accuracy, successful pose estimates (3D TRE < 5.0 mm) were obtained in 97% of cases when the SDNR was 5.9 or higher in seven out of eight poses. Under these conditions, 3D TRE was 2.32 ± 1.88 mm, and 2D (projection) TRE was 1.61 ± 1.36 mm. Probe localization error along the source-detector axis was 0.87 ± 1.31 mm. For the in vivo experiments, mean 3D TRE ranged from 2.6 to 4.6 mm and mean 2D TRE ranged from 1.1 to 1.6 mm. Anatomy extracted from the echo images appeared well aligned when projected onto the SBDX images.

Conclusions: Full 6 DOF image registration between SBDX and TEE is feasible and accurate to within 5 mm. Future studies will focus on real-time implementation and application-specific analysis.

© 2015 American Association of Physicists in Medicine. [http://dx.doi.org/10.1118/1.4935534]

Key words: inverse geometry, scanning-beam digital x-ray, image fusion, fluoroscopy, echocardiography

1. INTRODUCTION

Catheter-based cardiac interventions allow for minimally invasive treatment of structural heart disease, reducing patient trauma and opening up treatment options for patients that are too sick and/or fragile to undergo surgery. While surgeons have the luxury of direct visualization of the treatment site, this comes at the cost of increased risk to the patient, greater morbidity, and longer recovery time. In contrast, interventional cardiologists employ imaging methods such as x-ray fluoroscopy (XRF) and echocardiography (echo) to visualize their devices, identify the anatomy they wish to treat and to avoid, and to monitor the success of the therapy.
Integration of these imaging methods is desirable for optimal clinical workflow and improved therapeutic success.

Image fusion between XRF and transesophageal echocardiography (TEE) has recently been proposed [1–4] and clinically implemented (EchoNavigator, Philips Healthcare). Many structural heart interventions, such as transcatheter aortic valve replacement (TAVR), left atrial appendage closure, and the mitral clip procedure, utilize both XRF and TEE. These procedures may benefit from the enhanced guidance offered by combining information from both image modalities. For example, in TAVR, a prosthetic valve is guided and deployed using XRF, but visualization of the anatomy is poor. If XRF/TEE fusion is enabled, real-time anatomical information from echo can be visualized continuously in the context of the devices without the need for nephrotoxic x-ray contrast.

Med. Phys. 42 (12), December 2015. 0094-2405/2015/42(12)/7022/12/$30.00. © 2015 Am. Assoc. Phys. Med.
XRF/TEE fusion is accomplished using 2D/3D registration techniques [1–3] or magnetic tracking sensors [5,6]. Sensor-based methods require additional hardware and may be inaccurate due to electromagnetic field distortions in the catheterization lab [7]. Generally, 2D/3D registration techniques will estimate the 3D location and orientation (pose) of the TEE probe by comparing the clinical XRF image to simulated XRF images of a 3D probe model (digitally reconstructed radiographs or DRRs). The model pose is iteratively adjusted until the similarity between the clinical XRF and DRR is maximized. After inferring the 3D position of the probe in the C-arm coordinate system, the 3D TEE image data can be registered to XRF data.

Using a typical monoplane XRF C-arm system, the most challenging pose parameters to estimate are the so-called “out-of-plane” parameters, which include Euler angle rotations about the detector axes (pitch and roll) and, in particular, translations along the source-detector axis. This is because varying these parameters typically causes only subtle changes in the device appearance, which in turn do not strongly influence the similarity function maximized during pose estimation. Performing the registration using two x-ray views can help resolve this issue but the increased radiation dose to the patient is a concern.

Scanning-beam digital x-ray (SBDX) is an inverse geometry x-ray fluoroscopy technology designed for dose reduction and tomosynthesis-based 3D device tracking [8]. The basic components of SBDX are a scanning x-ray tube, multihole collimator, high speed photon-counting detector, and a real-time reconstructor (Fig. 1). As the electron beam in the x-ray tube scans over an array of focal spot positions, small-field-of-view images of the patient are captured.
After each frame, the detector data are reconstructed into a stack of full-field-of-view tomosynthesis images (32 planes × 15 frames/s) [9]. The tomosynthesis images are a necessary precursor to the final live image display, termed the composite image. However, the plane stacks can also be exploited for frame-by-frame 3D localization of high-contrast devices. This principle has been previously applied to localize catheter tips and electrodes [10], fiducials [11], and coronary artery centerlines [12].

In this paper, we present the first investigation of SBDX/TEE image registration. The tomosynthesis capability of SBDX is used to obtain the position of the TEE probe along the source-detector axis and an accurate initial estimate of the 3D probe pose. This is followed by refinement of the pose estimate using a 2D/3D registration procedure. A phantom study of 3D and 2D target registration error (TRE) was conducted for a variety of probe orientations and image noise levels in order to quantify the performance of the new pose estimation algorithm. To demonstrate SBDX/TEE image fusion in 3D and 2D visualizations, additional phantom and in vivo studies were conducted. In the 3D visualization, 3D TEE data are registered and fused with 3D catheter tip positions localized from SBDX tomosynthesis imaging [10].

2. ALGORITHM

SBDX/TEE registration is achieved by estimating the 3D pose of the TEE probe based on its appearance in SBDX x-ray images. The pose estimation algorithm has two stages (Secs. 2.B and 2.C). First, an initial estimate of the probe pose is obtained by performing tomosynthesis-based 3D localization of key-points on the probe. Second, the initial pose is refined with a 2D/3D registration algorithm adapted for SBDX’s inverse geometry. Given the 3D pose and a calibration relating the echo image volume to the TEE probe, the echo image data may then be registered to XRF. Details of SBDX image reconstruction, pose estimation, and visualization of the registered images are described in Table I.

FIG. 1. (A) SBDX imaging geometry, demonstrating shift-and-add tomosynthesis at multiple planes. (B) The central rays of the individual beamlets form a cone of rays originating from the center of the detector. The tomosynthesis image pixels are formed by subdividing the lateral shift between these rays. (C) The multiplane composite can be viewed as a “virtual projection” of the in-focus features of the tomosynthesis images.

TABLE I. Glossary of symbols.

Symbol | Description
x, y, z | Coordinate axes of the SBDX system
u, v | Integer column and row indices of a pixel in a reconstructed plane or composite image
$\theta_x, \theta_y, \theta_z, t_x, t_y, t_z$ | Rotation angles and translations corresponding to the x, y, and z axes
I(u, v, z) | Tomosynthesis plane stack
$p_x, p_y$ | Virtual detector element pitch in the x and y directions
$d_x, d_y$, SDD | Distance between center of the virtual detector and the virtual source point, along x, y, and z
M | Denotes local coordinate system attached to the echo probe model
$V(x_j, y_j, z_j)$ | 3D point intensities defining the probe model in coordinate system M
D(u, v) | Digitally reconstructed radiograph of the probe model
$^{\mathrm{SBDX}}T_{M}$ | Transformation from probe model coordinates to SBDX coordinates
$^{\mathrm{CT}}T_{\mathrm{echo}}$ | Transformation from echo image coordinates to CT coordinates used in echo calibration
$^{M}T_{\mathrm{CT}}$ | Transformation from echo calibration CT coordinates to probe model coordinates
$^{M}T_{\mathrm{echo}}$ | Transformation from echo image coordinates to probe model coordinates
TRE | Target registration error

2.A. SBDX image reconstruction

During SBDX imaging, an electron beam is raster-scanned over an array of focal spot positions. A multihole collimator defines a series of narrow overlapping x-ray beamlets directed at the detector.
The detector captures a small image for each collimator hole illumination, and the images are transmitted to GPU-based hardware for real-time reconstruction [9].

A two stage reconstruction process is executed for the detector images acquired in every 1/15 s scan frame. First, digital tomosynthesis is performed in parallel at a stack of 32 planes spaced by 5 mm. As described in Ref. 8, an unfiltered backprojection technique is used (“shift-and-add” tomosynthesis). In the tomosynthesis images, in-plane objects appear sharp, and out-of-plane objects are progressively blurred as the plane-to-object distance increases. In the second stage, a 2D composite image is formed from the tomosynthesis stack in order to display all objects in focus simultaneously. The composite is generated by a plane selection algorithm, which, for each pixel position, selects the pixel value from the tomosynthesis plane with the highest local contrast and sharpness. Field-of-view and frame rate are dictated by the number of focal spots scanned and the number of electron beam dwells per focal spot [8]. In this work, scanning was performed with 71 × 71 holes, 8 dwells per hole per scan frame, and 15 scan frames/s. Composite images and plane stacks were reconstructed at 15 Hz, and the isocenter plane reconstruction measured 11.4 cm wide. The source-to-detector distance (SDD) is fixed at 1500 mm.

The coordinate system of the SBDX C-arm is defined such that x corresponds to the horizontal image direction, y the vertical image direction, and z is the distance along the source-detector axis. The (x, y, z) origin is located at the center of the focal spot array. The pixel coordinates of tomosynthetic and composite images are referred to by the integers u and v. The u- and v-axes are parallel to x- and y-axes, respectively. When a plane stack I(u, v, z) is described, z is assumed to take on the discrete values, in millimeters, corresponding to the plane positions.
The SBDX/TEE registration algorithm uses the 3D plane stack for initial probe pose estimation and the 2D composite image for final pose refinement. Since the SBDX image coordinate system is relevant to these tasks, a brief review is provided here. The pixel pitch in each tomosynthesis plane is defined by dividing the shift distance between adjacent backprojected images into a fixed number (m = 10) of pixels. Since the x-ray beamlets originate from a regularly spaced array of focal spot positions in the source and they all converge to a common point on the detector, the pixel centers for the stack of planes fall along a cone of rays originating from the center of the detector (see Fig. 1). That is, a ray corresponds to fixed (u, v) in the plane stack I(u, v, z). The composite image contains the in-focus pixel value for each of these rays. Thus, the 2D composite can be viewed as an inverted “virtual projection” of the in-focus features in the patient volume, where the “virtual source” is at the center of the detector and the “virtual detector” is located at the source plane. The virtual detector pitch is the focal spot pitch (2.3 mm) divided by m = 10. For more details, we refer the reader to Ref. 13. The use of this virtual projection model in 2D/3D registration is described in Sec. 2.C.

The coordinate system “M” of the TEE probe is defined such that the probe face from which the ultrasound volume emanates points in the positive z-direction (toward the SBDX detector), and the long-axis of the probe points in the negative y-direction of the SBDX system (toward patient inferior). The rotational pose parameters for the probe angle ($\theta_x$, $\theta_y$, and $\theta_z$) correspond to sequential Euler angle rotations about the SBDX coordinate system axes, in the order y → x → z. This corresponds to a rotation about the long-axis of the probe (“roll”), followed by a rotation about the short-axis of the probe (“pitch”), and then finally a rotation of the probe about the z-axis (“yaw”).
Figure 2 demonstrates a TEE probe model after it has been rotated and translated to a position in the SBDX coordinate system.

FIG. 2. The transformation $^{\mathrm{SBDX}}T_{M}$ maps 3D points in the local coordinate system of the probe model (M) to 3D points in the SBDX coordinate system.

2.B. Initial 3D pose estimation from tomosynthesis

An initial estimate of the 3D position and orientation of the probe in the SBDX C-arm coordinate system is obtained from the tomosynthetic plane stack I(u, v, z) generated in a 1/15 s frame period. To obtain the position along the source-detector axis, the method exploits the fact that a device feature appears most in focus in the image plane closest to that feature and is progressively blurred as the plane-to-feature distance increases (Fig. 3). The z-location of the device feature is determined with finer precision than the plane-to-plane spacing by analyzing the distribution of feature sharpness versus the z-coordinate of each plane. The method has three steps: (i) detection of probe key-points in the composite image, (ii) 3D localization of key-points using the tomosynthesis planes, and (iii) principal component analysis (PCA) for orientation estimation (see Fig. 4).

2.B.1. Key-point detection

First, the center pixel of the square transducer face of the TEE probe is located in the composite image. This was done manually, although automatic techniques [14] can also be applied. A segmentation of the TEE probe is then generated by applying the Frangi vesselness filter to the composite image [15] followed by thresholding and dilation with a 10-pixel-wide circular structuring element. The TEE probe is typically the largest high-contrast object in the image. Therefore, the largest connected component is found and all others are removed to produce the probe segmentation mask [Fig. 4(C)].
To detect key-points within the mask, first the gradient magnitude of the composite image is computed following convolution with a Gaussian kernel (σ = 1.0 pixel). Next, a phase-symmetry filter is applied [16] to enhance salient edges in the image while suppressing noisy edges. The result is multiplied by the mask to produce an edge image [Fig. 4(E)]. A local maximum filter is applied where a pixel is set to 1 if it is equal to the maximum value within a 7 × 7 window, and 0 otherwise. The result specifies the initial key-point locations [Fig. 4(F)].

2.B.2. 3D localization of key-points

At each key-point position $(u_k, v_k)$, the image gradient magnitude is sampled at all 32 planes to create a 32-vector of edge strength values [Fig. 4(G)]. The gradient magnitude in each plane of the stack I(u, v, z) is computed using the finite difference method after convolution with a 2D Gaussian kernel (σ = 1.0 pixel). Since the tomosynthetic blurring behavior is locally symmetric about the true object z-position, the vector of edge strength values can be viewed as a sampled version of a function with its centroid located at the true object position. Local edge strengths about the object are obtained by applying a threshold [see Fig. 4(G)]. Denoting the original distribution of edge strengths as $C_k(z)$, the thresholded distribution is

$$\hat{C}_k(z) = \begin{cases} C_k(z) - A & \text{if } C_k(z) > A \\ 0 & \text{otherwise} \end{cases}, \tag{1}$$

where $A = 0.75 \max(C_k(z))$. The z-position $z_k$ of a key-point $(u_k, v_k)$ is then calculated as the center-of-mass of this distribution,

$$z_k = \frac{\sum_{i=1}^{32} z_i \hat{C}_k(z_i)}{\sum_{i=1}^{32} \hat{C}_k(z_i)}. \tag{2}$$

For each vector $\hat{C}_k(z)$, the number of local peaks is found. Vectors with more than one peak are removed from consideration, as they often result in unreliable 3D localization estimates. The localized key-point positions are converted from (u, v, z) to (x, y, z) coordinates using precalculated lookup tables. At this stage of the algorithm, most key-points belong to edges of the probe.
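The thresholded center-of-mass of Eqs. (1) and (2) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names, the plane positions in the usage lines, and the particular definition of a "local peak" (a sample that exceeds its left neighbor and is not smaller than its right neighbor) are assumptions made only for illustration.

```python
def threshold_profile(edge_strength, frac=0.75):
    """Eq. (1): subtract A = frac * max and clip everything at or below A to zero."""
    a = frac * max(edge_strength)
    return [c - a if c > a else 0.0 for c in edge_strength]


def count_peaks(c_hat):
    """Count local maxima in the thresholded profile (peak definition is an assumption)."""
    padded = [0.0] + list(c_hat) + [0.0]
    return sum(
        1
        for i in range(1, len(padded) - 1)
        if padded[i] > padded[i - 1] and padded[i] >= padded[i + 1]
    )


def localize_z(edge_strength, plane_z, frac=0.75):
    """Eq. (2): center of mass of the thresholded profile, or None if unreliable."""
    c_hat = threshold_profile(edge_strength, frac)
    total = sum(c_hat)
    if total == 0.0 or count_peaks(c_hat) != 1:
        return None  # flat or multi-peaked profile is rejected
    return sum(z * c for z, c in zip(plane_z, c_hat)) / total


# Hypothetical plane positions: 32 planes spaced by 5 mm (the offset is arbitrary).
plane_z = [5.0 * i for i in range(32)]
profile = [0.0] * 32
profile[9], profile[10] = 10.0, 10.0   # feature lies midway between two planes
z_est = localize_z(profile, plane_z)   # lands between the two plane positions
```

Because the centroid is a weighted average over neighboring planes, the estimate can fall between plane positions, which is how the method resolves depth more finely than the 5 mm plane spacing.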
However, some key-points have unrealistic z-coordinate values and should be labeled as outliers. To remove them, the median z-coordinate of all key-points is calculated, and any point that is greater than 15 mm away from the median value is removed. (The distance 15 mm was chosen based on the dimensions of the TEE probe.) This mechanism is also designed to reject erroneous z-coordinates caused by overlapping objects, such as a catheter.

FIG. 3. Top row: Tomosynthesis images of a TEE probe head reconstructed at different planes relative to the SBDX source. Bottom row: The edge magnitudes grow weaker as the distance between the probe and the reconstructed image plane increases. The in-focus plane is indicated with the red rectangle.

FIG. 4. (A) Original SBDX composite image. (B) Composite image after applying the Frangi filter. (C) The largest connected component of the filtered image is extracted and dilated to roughly segment the probe. (D) The gradient magnitude of a subwindow of the original image. (E) Phase-symmetry filter applied to the gradient magnitude. (F) Local maxima of the phase-symmetry image. (G) An example of the edge strength for each SBDX plane for a single 2D key-point. For the center-of-mass computation, only values greater than 0.75 of the maximal value are considered. (H) Final 3D key-points, with principal direction computed via principal component analysis.

2.B.3. Initial 3D pose

With the remaining set of 3D key-points, PCA is used to determine a rough 3D pose. PCA finds the directions of the highest variance in N-dimensional data. The first principal component, therefore, defines the direction that the long-axis of the TEE probe is aligned with, which in turn is used to determine the in-plane rotation of the probe ($\theta_z$; yaw) and the out-of-plane pitch ($\theta_x$).
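The median-based outlier rejection and the principal-direction step can be sketched as follows. This is only an illustration under stated assumptions: the function names are hypothetical, and power iteration on the 3 × 3 covariance matrix is substituted for a full PCA because only the first principal component is needed here. The conversion of that direction into yaw and pitch angles is not shown, since the text does not spell it out.

```python
import math
from statistics import median


def reject_z_outliers(points, max_dev_mm=15.0):
    """Drop key-points whose z is more than max_dev_mm from the median z."""
    z_med = median(p[2] for p in points)
    return [p for p in points if abs(p[2] - z_med) <= max_dev_mm]


def principal_direction(points, iters=200):
    """First principal component of 3D points via power iteration (a PCA stand-in)."""
    n = len(points)
    mean = [sum(p[k] for p in points) / n for k in range(3)]
    centered = [[p[k] - mean[k] for k in range(3)] for p in points]
    cov = [
        [sum(c[a] * c[b] for c in centered) / n for b in range(3)]
        for a in range(3)
    ]
    v = [0.6, 0.5, 0.4]  # arbitrary start; must not be orthogonal to the answer
    for _ in range(iters):
        w = [sum(cov[a][b] * v[b] for b in range(3)) for a in range(3)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            return None  # degenerate point set, no direction defined
        v = [x / norm for x in w]
    return v  # unit vector; its sign is arbitrary


# Hypothetical key-points along the probe long axis (y), plus one depth outlier.
keypoints = [(0.0, float(y), 100.0) for y in range(-20, 21, 5)] + [(0.0, 0.0, 150.0)]
inliers = reject_z_outliers(keypoints)
long_axis = principal_direction(inliers)
```

The sign of the returned vector is ambiguous, so a real implementation would need an extra cue (for example, the known transducer face location) to orient the long axis.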
Furthermore, the average z-location of the 3D key-points is used to estimate the central z-coordinate of the probe ($t_z$) by finding the mean z-value of all key-points within 10 mm of the center pixel (chosen based on the size of the TEE probe). Figure 4(H) demonstrates localized probe key-points in three dimensions along with the orientation vector determined by PCA.

2.C. Pose refinement based on 2D/3D registration

After the initial pose estimation step, the final estimation of all pose parameters is achieved through 2D/3D registration. The TEE probe is modeled as a point-cloud model (Sec. 3.A.1), with its own coordinate system M. The pose parameters, applied to the probe model, refer to the three translations ($t_x$, $t_y$, $t_z$) and Euler angle rotations ($\theta_x$, $\theta_y$, $\theta_z$) about the axes (x, y, z) of the SBDX system. The full spatial transformation of the TEE probe is stored in the matrix $^{\mathrm{SBDX}}T_{M}$,

$$^{\mathrm{SBDX}}T_{M} = \begin{pmatrix} 1 & 0 & 0 & t_x \\ 0 & 1 & 0 & t_y \\ 0 & 0 & 1 & t_z \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} c_z & -s_z & 0 & 0 \\ s_z & c_z & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \times \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & c_x & -s_x & 0 \\ 0 & s_x & c_x & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} c_y & 0 & -s_y & 0 \\ 0 & 1 & 0 & 0 \\ s_y & 0 & c_y & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \tag{3}$$

where $c_j = \cos(\theta_j)$ and $s_j = \sin(\theta_j)$.

As explained in Sec. 2.A, the SBDX system geometry is different than the geometry of a standard C-arm imaging system. However, when considering the displayed composite image, the imaging geometry can be viewed as a single virtual inverted cone-beam projection, where the rays originate from the center of the detector and diverge in the direction of the x-ray tube. The matrix P defines the virtual projection geometry from the SDD, the distance of the virtual source to the center of the virtual detector in the x ($d_x$) and y ($d_y$) directions, and the virtual detector element spacing in the x ($p_x$) and y ($p_y$) directions,

$$P = \begin{pmatrix} 1/p_x & 0 & 0 & d_x \\ 0 & 1/p_y & 0 & d_y \\ 0 & 0 & 1 & 0 \\ 0 & 0 & -1/\mathrm{SDD} & 1 \end{pmatrix}. \tag{4}$$

P was calibrated using a helix phantom (Sec. 3.A.2).
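To make Eqs. (3) and (4) concrete, the following Python sketch builds both matrices and projects a single model point. It is a hedged illustration, not the authors' code: the function names are hypothetical, and the final division by the fourth homogeneous component is an assumption about how the 4 × 4 matrix P is meant to be applied.

```python
import math


def matmul(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def pose_matrix(tx, ty, tz, theta_x, theta_y, theta_z):
    """Eq. (3): translation * Rz * Rx * Ry, so the y rotation is applied first."""
    cx, sx = math.cos(theta_x), math.sin(theta_x)
    cy, sy = math.cos(theta_y), math.sin(theta_y)
    cz, sz = math.cos(theta_z), math.sin(theta_z)
    t = [[1, 0, 0, tx], [0, 1, 0, ty], [0, 0, 1, tz], [0, 0, 0, 1]]
    rz = [[cz, -sz, 0, 0], [sz, cz, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
    rx = [[1, 0, 0, 0], [0, cx, -sx, 0], [0, sx, cx, 0], [0, 0, 0, 1]]
    ry = [[cy, 0, -sy, 0], [0, 1, 0, 0], [sy, 0, cy, 0], [0, 0, 0, 1]]
    return matmul(matmul(matmul(t, rz), rx), ry)


def projection_matrix(px, py, dx, dy, sdd):
    """Eq. (4): virtual inverted cone-beam projection."""
    return [[1.0 / px, 0, 0, dx], [0, 1.0 / py, 0, dy],
            [0, 0, 1, 0], [0, 0, -1.0 / sdd, 1]]


def project(p_mat, pose, point):
    """Map a model point to (u, v); the perspective divide is an assumption."""
    m = matmul(p_mat, pose)
    h = [sum(m[i][k] * c for k, c in enumerate((*point, 1.0))) for i in range(4)]
    return h[0] / h[3], h[1] / h[3]


# Nominal geometry from the text: 0.23 mm pitch, SDD = 1500 mm; offsets set to 0.
P = projection_matrix(0.23, 0.23, 0.0, 0.0, 1500.0)
T = pose_matrix(0.0, 0.0, 750.0, 0.0, 0.0, 0.0)   # probe placed at mid-SDD
u, v = project(P, T, (23.0, 0.0, 0.0))
```

With these nominal values, a point 23 mm off-axis at z = 750 mm is magnified by a factor of 2 before being divided by the 0.23 mm pitch.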
The value of SDD is 1500 mm, and the nominal virtual detector element spacing is 2.3 mm/10 = 0.23 mm in both directions.

With these definitions, the 2D/3D registration proceeds as follows: (i) Given a vector of initial pose parameters, ϕ, generate a DRR from a 3D model of the probe. (ii) Compute the similarity between the DRR and the SBDX composite image. (iii) Using a nonlinear optimizer, repeat with different ϕ until the similarity is maximized.

DRRs were generated using a point splatting method, similar to wobbled splatting [17]. Using this method, a DRR is generated by projecting point intensities, usually from a CT volume V(x, y, z), onto the image. Each pixel in the DRR image takes on a value equal to the sum of the values of the voxels that project onto it,

$$D(u_i, v_i) = e^{-\alpha \sum_{j \in S_i} V(x_j, y_j, z_j)}, \tag{5}$$

where $S_i$ is the set of all voxel indices j such that the 3D point $(x_j, y_j, z_j)$ projects onto the 2D detector point $(u_i, v_i)$, i.e., $P \cdot {}^{\mathrm{SBDX}}T_{M} \cdot [x_j, y_j, z_j]^T = [u_i, v_i]^T$. The voxel intensities V are normalized to a positive value range and the parameter α controls the contrast of the DRR. To facilitate cross correlation calculations, α was set to achieve contrast approximately equal to that observed in an x-ray image.

Two similarity metrics were used for optimization: normalized cross correlation (NCC) and gradient cross correlation (GCC). Normalized cross correlation is defined as

$$\mathrm{NCC}(I_1, I_2) = \frac{\sum_i \sum_j \left( I_1(u_i, v_j) - \mu_1 \right) \left( I_2(u_i, v_j) - \mu_2 \right)}{\sigma_1 \sigma_2}, \tag{6}$$

where µ is the image mean and σ is the image standard deviation. GCC is defined as

$$\mathrm{GCC}(I_1, I_2) = 0.5 \cdot \mathrm{NCC}(G_{x1}, G_{x2}) + 0.5 \cdot \mathrm{NCC}(G_{y1}, G_{y2}), \tag{7}$$

where $G_x$ is the image x-gradient and $G_y$ is the image y-gradient.
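The two similarity metrics of Eqs. (6) and (7) can be sketched in a few lines of Python. Two choices below are assumptions rather than details from the text: the sum in Eq. (6) is additionally divided by the pixel count so that identical images score exactly 1 (a constant rescaling that does not move the optimum), and the gradients are taken with central finite differences clamped at the image border.

```python
import math


def ncc(img1, img2):
    """Eq. (6), rescaled by 1/N (an assumption) so the result lies in [-1, 1]."""
    a = [v for row in img1 for v in row]
    b = [v for row in img2 for v in row]
    n = len(a)
    mu_a, mu_b = sum(a) / n, sum(b) / n
    sd_a = math.sqrt(sum((v - mu_a) ** 2 for v in a) / n)
    sd_b = math.sqrt(sum((v - mu_b) ** 2 for v in b) / n)
    if sd_a == 0.0 or sd_b == 0.0:
        return 0.0  # a constant image carries no correlation information
    return sum((x - mu_a) * (y - mu_b) for x, y in zip(a, b)) / (n * sd_a * sd_b)


def gradients(img):
    """Central-difference x and y gradients with clamped borders (an assumption)."""
    rows, cols = len(img), len(img[0])
    gx = [[(img[r][min(c + 1, cols - 1)] - img[r][max(c - 1, 0)]) / 2.0
           for c in range(cols)] for r in range(rows)]
    gy = [[(img[min(r + 1, rows - 1)][c] - img[max(r - 1, 0)][c]) / 2.0
           for c in range(cols)] for r in range(rows)]
    return gx, gy


def gcc(img1, img2):
    """Eq. (7): equal-weight average of the NCC of the x and y gradient images."""
    gx1, gy1 = gradients(img1)
    gx2, gy2 = gradients(img2)
    return 0.5 * ncc(gx1, gx2) + 0.5 * ncc(gy1, gy2)


# Hypothetical 3x3 "images": the second is a brightness/contrast-shifted copy.
drr = [[0.0, 1.0, 2.0], [1.0, 3.0, 5.0], [2.0, 5.0, 9.0]]
xray = [[2.0 * v + 7.0 for v in row] for row in drr]
score_ncc, score_gcc = ncc(drr, xray), gcc(drr, xray)
```

Both metrics are insensitive to a positive linear change in intensity, which is why a DRR only needs contrast that is approximately, not exactly, matched to the x-ray image.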
2D/3D registration consisted of three optimization stages: (i) optimization of the in-plane parameters ($t_x$, $t_y$, and $\theta_z$) using NCC, (ii) all parameters except $t_z$ using NCC, and (iii) all parameters, including the DRR contrast parameter α, using GCC. The Nelder–Mead optimizer was used at every registration stage.

3. METHODS

3.A. Calibrations

3.A.1. Probe model

In order to compute splat rendered DRRs for 2D/3D registration, a point-cloud model of the TEE probe was generated from a cone-beam CT of the probe [18]. This was done by manually segmenting voxels belonging to the TEE probe and then randomly sampling 220 points within the segmented volume. The intensity associated with each point was obtained using linear interpolation from the CT volume.

3.A.2. SBDX C-arm calibration

The 3D/2D transformation matrix (P) describing the SBDX virtual projection geometry was calibrated using a precision manufactured phantom with steel beads arranged in a helical pattern. The helix phantom was placed at approximately the isocenter and imaged. To maximize SNR, a 64-frame average was formed. The image was then manually thresholded to segment each fiducial, and the intensity centroid of each fiducial was calculated. An initial P matrix was generated using the nominal virtual projection geometry, and the helix model pose was manually initialized. Next, the Levenberg–Marquardt algorithm was used to optimize helix model pose with fixed P. Following convergence, the helix pose was fixed, and P was optimized. This was repeated until the fiducial registration error converged to a minimal value.

3.A.3. Echo calibration

FIG. 5. Illustration of TEE probe model to echo image calibration. The probe model is registered to a CT image of the wire phantom. The wires from echo are then registered to the wires in the CT volume. This allows the spatial relationship between the TEE probe model and the echo image volume to be established.

The spatial transform relating the echo image space to the TEE probe model (M), $^{M}T_{\mathrm{echo}}$, was found using a wire phantom. The phantom consisted of a water-filled cylinder
This allows the spatial relationship between the TEE probe model and the echo image volume to be established. containing metallic wires and an entrance port for the TEE probe. A CT image was acquired of the entire setup while a simultaneous 3D echo of the metallic wires was recorded. Using standard intensity-based registration, the echo image of the wires was computationally registered to the wires in the CT image to find CTTecho (see Fig. 5). The probe model (generated from a previously acquired high-resolution CT) was similarly registered to the probe visible in the CT of the phantom to obtain MTCT. These two transforms were combined to obtain MTecho, MTecho= MTCT CTTecho. (8) To find the TRE of the echo volume-to-probe registration, voxels from the echo volume, pecho, and the CT volume, pCT, belonging to the wires were extracted by manually setting an intensity threshold for each image. For each wire voxel in echo, the distance to the nearest wire voxel in the CT image following registration was calculated. The TRE defined as the RMS distance over all of these distances was found to be 1.71 mm, TREecho=  i � min(pCT−CTTecho·pi,echo) �2 2 N . (9) 3.B. Pose estimation accuracy A study was performed to compare the TEE pose estimation results with a ground truth reference at eight different TEE probe orientations (see Fig. 6), and a range of image signal-to- noise ratios. The ground truth was established by embedding the TEE probe within a PVC cylinder covered with spherical steel ball bearings (2.5 and 3 mm diameters). The probe was fixed in the cylinder using silicone rubber. A cone-beam CT of the entire fiducial/probe setup was used to establish the spatial relationship between the probe and fiducials. To measure the ground truth pose, the pose estimation algorithm was applied Medical Physics, Vol. 42, No. 12, December 2015 7028 Hatt et al.: Registration of transesophageal echo to inverse geometry fluoroscopy 7028 F. 6. 
The eight different TEE probe poses tested for the pose estimation experiments. The fiducials attached to the probe are used to estimate the ground truth pose. to the fiducials only. SBDX imaging of the probe/helix was performed at 80 kV, 75 mApeak (36% maximum tube current) in the 71×71 15 frames/s scan mode, with 23.3 cm acrylic in the x-ray beam (Fig. 7). SBDX image reconstructions were performed offline. Five different levels of signal difference-to- noise ratio (SDNR) were generated by randomly sampling and averaging 1, 2, 4, 8, and 16 frames. For each noise level and TEE probe pose, this was repeated 10 times, for a total of 400 experiments. For all experiments, the SDNR was computed as SDNR= µprobe−µbackground σbackground . (10) In order to measure TEE probe and background signal statis- tics, ROI masks were created by manually setting two intensity thresholds, one to segment out the probe and one to sample the background near the probe. σbackground was computed by sub- tracting two consecutive frames, finding the standard deviation of the difference image within the background, and dividing by√ 2. µprobe and µbackground were computed by finding the mean within their respective masks for one image frame. 2D and 3D TREs were used to quantify pose estimation accuracy. For this experiment, the TRE was based on a set of N = 100 virtual points defined in the echo image space, randomly and uniformly distributed within a 50 mm wide cubic volume. The virtual points p in echo space were trans- formed to the C-arm coordinate system using both the ground truth pose and the estimated pose, yielding point sets ptrue and pestimated, respectively. The TRE3D was then computed for each experiment as TRE3D= ∥ptrue−pestimated∥22 N . (11) The TRE2D was computed the same way, but only the x and y coordinates were used in the Euclidean distance computation. 
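The SDNR of Eq. (10) and the TRE of Eq. (11) can be illustrated with the short Python sketch below. It is a sketch under stated assumptions: the function names and toy inputs are hypothetical, the TRE is taken as a root-mean-square over the N points, and the population standard deviation is used for the noise estimate because the text does not say which estimator was applied.

```python
import math
from statistics import mean, pstdev


def sdnr(frame_a, frame_b, probe_idx, background_idx):
    """Eq. (10), with noise from a two-frame difference divided by sqrt(2)."""
    mu_probe = mean(frame_a[i] for i in probe_idx)
    mu_background = mean(frame_a[i] for i in background_idx)
    diff = [frame_a[i] - frame_b[i] for i in background_idx]
    sigma_background = pstdev(diff) / math.sqrt(2.0)
    return (mu_probe - mu_background) / sigma_background


def apply_transform(t, p):
    """Apply a 4x4 homogeneous transform (nested lists) to a 3D point."""
    h = (p[0], p[1], p[2], 1.0)
    return [sum(t[i][k] * h[k] for k in range(4)) for i in range(3)]


def tre(points, t_true, t_estimated, dims=3):
    """Eq. (11) as an RMS distance; dims=2 keeps only x and y for the 2D TRE."""
    total = 0.0
    for p in points:
        a = apply_transform(t_true, p)
        b = apply_transform(t_estimated, p)
        total += sum((a[k] - b[k]) ** 2 for k in range(dims))
    return math.sqrt(total / len(points))


# Hypothetical poses: ground truth is the identity, the estimate is offset in x, y, z.
identity = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
shifted = [[1, 0, 0, 3.0], [0, 1, 0, 4.0], [0, 0, 1, 12.0], [0, 0, 0, 1]]
targets = [(0.0, 0.0, 0.0), (25.0, -25.0, 10.0), (-25.0, 25.0, -10.0)]
tre_3d = tre(targets, identity, shifted)
tre_2d = tre(targets, identity, shifted, dims=2)
```

For a pure translation error every target point moves by the same amount, so the gap between the 3D and 2D values isolates the contribution of the error along the source-detector axis.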
$\mathrm{TRE}_{3D}$ is the total target registration error, while $\mathrm{TRE}_{2D}$ is representative of the error for points in echo in the plane parallel to the XRF detector. For this study, the overall registration error was based purely on pose estimation of the probe and did not include errors in registering the echo image space to the probe model (this additional error is considered in Sec. 3.C).

FIG. 7. The experimental setup for the phantom pose estimation experiment. Left: The TEE probe, embedded in the PVC cylinder with surrounding fiducials, was imaged between layers of acrylic to decrease the signal-to-noise. Right: A zoomed out view of the experiment showing the SBDX C-arm.

3.C. Phantom and in vivo studies of SBDX/TEE registration

Water tank phantom and in vivo experiments were conducted to evaluate two image fusion scenarios: echo-to-SBDX fusion, where features within the 3D echo image were projected onto the 2D SBDX image, and SBDX-to-echo fusion, where the 3D locations of devices from the SBDX image space were transformed into the 3D echo image space. The first scenario maintains the conventional 2D x-ray display format while adding anatomical structures rendered/segmented from echo, whereas the second scenario enables the fusion of 3D SBDX catheter tracking results with the native display format of 3D echocardiography.

3.C.1. Phantom study

For the phantom experiment, a cylindrical polyvinyl alcohol (PVA) phantom with a ventricle sized cylindrical cavity (height = 35 mm, radius = 15 mm) was fabricated. An injection catheter (MyoStar, Biosense Webster) with a metallic tip was guided through a plastic tube on the proximal side of the phantom, until it was positioned against the distal wall of the cavity.
The proximal end of the catheter was attached to a translation stage, and a 5 mm/s catheter pullback was performed under simultaneous echo and SBDX imaging. The resulting trajectory of the catheter tip was a straight path mainly in the negative y-direction. Sequences from two different C-arm angles (15° LAO, 0° CC and 15° LAO, 10° CC) were performed, resulting in different appearances of the TEE probe. Imaging was performed at 80 kV, 75 mApeak. Background x-ray attenuation was provided by 15 cm water, 2 cm of wood, and 1 cm of polyurethane plastic.

To evaluate the 3D TRE of SBDX-to-echo 3D image registration, tomosynthesis-based 3D tracking of the catheter tip in SBDX space was first performed using the algorithm in Ref. 10. The tip coordinate was then transformed to the echo image space using the TEE probe pose estimate and the echo-volume-to-probe calibration. The transformed coordinate was then compared to the catheter tip location as manually identified from the 3D echo images. For this task, the centroid of the reverberation artifact was located, which was presumed to correspond to the metal tip of the catheter (Fig. 8). The 3D TRE was calculated for each frame using the following equation:

$$\mathrm{TRE}_{\mathrm{tip}} = \left\lVert p_{\mathrm{echo}} - {}^{\mathrm{echo}}T_{\mathrm{SBDX}} \cdot p_{\mathrm{SBDX}} \right\rVert_2, \qquad {}^{\mathrm{echo}}T_{\mathrm{SBDX}} = \left( {}^{\mathrm{SBDX}}T_{M} \; {}^{M}T_{\mathrm{echo}} \right)^{-1}. \tag{12}$$

As in the pose estimation accuracy experiments, TRE2D was computed by considering only the x and y coordinates.

3.C.2. In vivo study

A 50 kg healthy swine with 24 cm anterior–posterior chest thickness was imaged in the 71×71, 15 frames/s scan mode with a 100 kV, 120 mApeak (50% maximum tube current) x-ray technique. Procedures were approved by the local Institutional Animal Care and Use Committee. Three image sequences were performed under simultaneous SBDX and TEE guidance.

FIG. 8. Method for catheter segmentation in the in vivo and water tank experiments. The catheter tip was found by determining the line that passed through the 3D reverberation artifact.
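The transform chain used for the catheter tip error in Eq. (12) can be illustrated with 4×4 homogeneous matrices. The Python sketch below is a minimal example under stated assumptions, not the authors' code. The function names are hypothetical, and reading M as the probe model coordinate frame (so that one matrix is the probe pose estimate and the other is the echo-volume-to-probe calibration) is an interpretation of the text.

```python
import numpy as np


def echo_from_sbdx(T_sbdx_from_model, T_model_from_echo):
    """Compose and invert the chain of Eq. (12).

    T_sbdx_from_model: assumed to be the estimated probe pose (model -> SBDX).
    T_model_from_echo: assumed to be the echo-volume-to-probe calibration.
    Both are 4x4 homogeneous transforms.
    """
    return np.linalg.inv(T_sbdx_from_model @ T_model_from_echo)


def tre_tip(p_echo, p_sbdx, T_echo_from_sbdx, dims=3):
    """Distance between the tip seen in echo and the SBDX tip mapped to echo."""
    p_h = np.append(np.asarray(p_sbdx, dtype=float), 1.0)  # homogeneous point
    mapped = (T_echo_from_sbdx @ p_h)[:3]
    diff = np.asarray(p_echo, dtype=float)[:dims] - mapped[:dims]
    return np.linalg.norm(diff)


if __name__ == "__main__":
    # Hypothetical pose: 10 mm translation along x.
    pose = np.eye(4)
    pose[:3, 3] = [10.0, 0.0, 0.0]
    # Hypothetical calibration: 90 degree rotation about z.
    calib = np.eye(4)
    calib[:3, :3] = [[0.0, -1.0, 0.0], [1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]
    T = echo_from_sbdx(pose, calib)
    print(tre_tip([1.0, 0.0, 2.0], [10.0, 1.0, 0.0], T))
```

Setting `dims=2` restricts the distance to the x and y coordinates, which mirrors how the 2D error is described in the text.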
For the first two sequences, an injection catheter (MyoStar) with a metallic tip was guided into the left ventricle (LV). In sequence 1, the catheter tip was manipulated throughout the left ventricle to mimic navigation toward a target site. In sequence 2, the catheter was positioned at a single location against the left ventricular wall to mimic a catheter position confirmation task. In the latter case, the catheter only underwent cardiorespiratory motion. These two sequences were used to evaluate the registration accuracy for a discrete tip.

The third sequence was used to evaluate the qualitative accuracy of anatomic echo-to-SBDX registration. Specifically, a ventriculogram was acquired under simultaneous SBDX and echo imaging in order to compare a standard x-ray ventriculogram with a proposed echo-based ventriculogram, in which the LV is segmented from the echo data and overlaid on the fluoroscopic image. Additional TRE measurements were obtained from the metallic markers of the pigtail injection catheter present in this sequence.

Since the SBDX and echo data were recorded simultaneously on separate systems, temporal synchronization of image frames was necessary. To synchronize the images, each modality was first analyzed to determine the spatial axis with the largest variation of catheter motion. Next, the 1D position of the catheter along that axis was recorded as a 1D signal. Finally, the 1D motion “signals” from both modalities were compared and the time-shift that resulted in the highest normalized cross correlation was used to temporally align the image sequences.

TRE was calculated in the same way as in the phantom study, with the exception of the ventriculogram sequence. For that sequence, the multiple metallic markers present on the pigtail catheter were indistinguishable in the echo image.
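The temporal synchronization step described earlier in this section can be sketched as a search over integer frame shifts for the highest normalized cross correlation. The Python example below is an illustration only. The function names are hypothetical, and it assumes that both motion signals have already been resampled to a common frame rate, which the text does not specify.

```python
import numpy as np


def dominant_axis_signal(positions):
    """Return the 1D catheter position along the axis of largest variance.

    positions: array of shape (n_frames, n_axes).
    """
    positions = np.asarray(positions, dtype=float)
    axis = int(np.argmax(np.var(positions, axis=0)))
    return positions[:, axis]


def best_shift(sig_a, sig_b, max_shift):
    """Find the integer shift s that best aligns sig_a[i] with sig_b[i + s].

    Returns (s, ncc), where ncc is the normalized cross correlation
    (zero-mean, unit-norm) over the overlapping samples.
    """
    a = np.asarray(sig_a, dtype=float)
    b = np.asarray(sig_b, dtype=float)
    best_s, best_ncc = 0, -np.inf
    for s in range(-max_shift, max_shift + 1):
        # Index range of sig_a that has a partner sample in sig_b.
        lo = max(0, -s)
        hi = min(len(a), len(b) - s)
        if hi - lo < 2:
            continue
        x = a[lo:hi] - a[lo:hi].mean()
        y = b[lo + s:hi + s] - b[lo + s:hi + s].mean()
        denom = np.linalg.norm(x) * np.linalg.norm(y)
        if denom == 0.0:
            continue
        ncc = float(x @ y) / denom
        if ncc > best_ncc:
            best_s, best_ncc = s, ncc
    return best_s, best_ncc


if __name__ == "__main__":
    t = np.arange(60)
    motion = lambda u: np.sin(0.3 * u) + 0.05 * u  # synthetic 1D motion
    shift, score = best_shift(motion(t), motion(t - 5), max_shift=8)
    print(shift, score)
```

Restricting the search to a bounded `max_shift` is a design choice for the sketch: periodic cardiac motion can otherwise produce several near-equal correlation peaks one cycle apart.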
Therefore, a spline, secho, was fit to a set of manually segmented points on the catheter in the echo image, and the TRE was the root mean square of the minimal distance between the markers registered from SBDX and the spline,

$$\mathrm{TRE}_{\mathrm{pigtail}} = \sqrt{\frac{\left\lVert \min\left( s_{\mathrm{echo}} - {}^{\mathrm{echo}}T_{\mathrm{SBDX}} \cdot p_{\mathrm{SBDX}} \right) \right\rVert_2^2}{N}}. \tag{13}$$

4. RESULTS

4.A. Pose estimation accuracy

Figures 9(a) and 9(b) show the average TRE3D and TRE2D for all successful registrations obtained in the pose estimation study. Figure 9(c) shows the success rates, defined as the percentage of registrations with a TRE3D less than 5 mm. While this threshold is application dependent, 5 mm was chosen because it represents a registration error that would result in suboptimal placement of a prosthetic valve during TAVR (Ref. 19) or suboptimal catheter-based targeting of therapeutic injections (Ref. 5). The eight poses tested are shown in Fig. 6. The five SDNR levels tested were 5.9±0.3, 9.4±0.8, 14.8±1.2, 22.0±2.0, and 33.4±4.0. With the exception of pose 5, the success rate was 97.1% for all experiments, with a mean TRE3D of 2.32±1.88 mm and TRE2D of 1.61±1.36 mm. Pose 5, with a rotation of θy = 76°, rarely converged, an issue which is addressed in Sec. 5. Considering all poses and SDNR levels, the probe localization z-error along the source-detector axis was 0.87±1.31 mm.

Higher image SDNR tended to reduce TRE (Fig. 10), although for SDNR in the range of 11–35 the TRE did not vary much. For experiments with probe orientations typically seen in clinical cases (|TEE probe roll| < 60°, poses 1–4 and 8), and with SDNR > 18.8, the registration success rate was 100%, the TRE3D was 1.76±0.59 mm, and the TRE2D was 1.40±0.40 mm. For reference, the SDNRs in the in vivo study were 35–39.

FIG. 9. Summary of pose estimation accuracy experiments for varying poses and increasing SDNR (via image averaging). Bars indicate the mean value and error bars indicate one standard deviation. Ten images were generated for each pose and SDNR level. (a) TRE3D. (b) TRE2D. (c) Percentage of successful registrations, defined as TRE3D < 5.0 mm. For each SDNR/pose combination, n = 10. (Truncated result in the top panel: pose 5, 1 frame = 3.86±1.26 mm.)

FIG. 10. Mean TRE2D (top) and TRE3D (bottom) for varying levels of SDNR.

4.B. Water tank phantom and in vivo studies

Table II shows the TRE results for the water tank phantom and in vivo studies. The full image registration pipeline resulted in a mean 3D error of 3.39 mm over all experiments and image frames, with errors in individual frames ranging from 0.22 to 9.1 mm. The water tank experiments showed lower TRE values than the in vivo experiments, which is likely due to the tighter control over experimental variables. Potential causes of errors are outlined in Sec. 5.

TABLE II. Results for the water tank phantom and in vivo experiments.

Sequence        Frames   TRE2D (mm), mean ± std   TRE3D (mm), mean ± std   SDNR
Water tank 1    26       0.95 ± 0.44              1.84 ± 0.63              63.5
Water tank 2    21       1.78 ± 0.79              2.02 ± 0.76              54.7
In vivo 1       53       1.57 ± 0.89              4.64 ± 1.95              35.8
In vivo 2       18       1.15 ± 0.49              2.62 ± 0.49              34.8
In vivo 3       35       1.61 ± 1.11              3.87 ± 1.13              38.8

Figure 11 demonstrates echo-to-SBDX registration and SBDX-to-echo registration for the catheter tip in the second in vivo sequence. In the echo-to-SBDX registration, the catheter tip segmented from echo is registered to the 2D SBDX image (blue circle). The TRE2D values in Table II represent the error in this registration. In the SBDX-to-echo registration, tomosynthesis-based catheter tip tracking is registered to two planes of the echo image volume and displayed as red circles. The error in this process is characterized by TRE3D.

Figure 12 demonstrates an echo-to-SBDX registration of the endocardial surface of the left ventricle.
A 3D ventricular volume was manually segmented from an end-diastolic echo image volume and then registered to SBDX using the TEE probe pose. The segmented 3D volume was then projected onto the SBDX image and the borders of the projected segmentation were displayed. For comparison, a contrast-enhanced ventriculogram was performed with SBDX. Good agreement was observed between the visible borders of the x-ray contrast and the echo-based borders.

5. DISCUSSION

XRF is generally considered the primary imaging modality for guidance of devices in structural heart interventions, but soft tissue visualization is poor, the projection format creates ambiguity, and ionizing radiation dose is an ongoing source of concern. Previous work has demonstrated the potential of SBDX to both reduce dose and provide 3D catheter tracking (Refs. 10 and 20). The registration of 3D echo with SBDX could address the remaining need for real-time soft tissue anatomy in a common visualization environment. To this end, we have developed and evaluated an algorithm for SBDX/TEE registration.

The SBDX/TEE registration algorithm combines tomosynthesis-based 3D localization with a version of 2D/3D registration adapted for inverse geometry x-ray imaging. In the initial pose estimation stage, the algorithm was able to localize the correct z-position of the TEE probe to within 0.87 mm on average. The ability of inverse geometry fluoroscopy to resolve depth in a single image frame is a unique advantage compared to standard fluoroscopy, which generally requires either biplane imaging or multiple acquisitions at different C-arm projection angles to localize the TEE probe in three dimensions.

The study of pose estimation accuracy found TRE3D < 3 mm in individual images, for all experiments conducted at SDNR levels similar to those that were encountered in vivo. At lower SDNR levels, the pose corresponding to a primarily lateral view of the TEE probe (pose 5, Fig. 6) resulted in poor registration convergence.
Visual inspection revealed this was due to an error in the final θy (roll) and θx (pitch) parameters. Additional work is needed to address this issue, but we note that in TAVR procedures performed at our own institution, the occurrence of this pose is extremely rare since the probe is almost always facing toward the x-ray detector while imaging the heart. Future work should also validate pose estimation accuracy in the presence of overlapping high-contrast objects in the field-of-view, such as a catheter.

For the water tank and in vivo studies, a general increase in TRE relative to the pose accuracy study was observed. This was expected because the targets were real catheter tips rather than virtual objects. Under this scenario, additional sources of registration error included localization of the catheter tip in echo and SBDX, echo volume-to-TEE probe calibration error (TRE = 1.71 mm), and potential temporal synchronization errors. For example, the 3D localization of the catheter tip in the SBDX plane stack was expected to introduce approximately 1.0 mm error in the z-direction (Ref. 10).

FIG. 11. Left: A SBDX composite image is shown, with the catheter tip location from echo overlaid onto the image, demonstrating TRE2D. Right: Two orthogonal slices from the 3D echo corresponding to the SBDX composite image on the left. The catheter tip, localized in SBDX, is transformed and overlaid onto the echo image.

FIG. 12. Left: Two orthogonal slices through a 3D echo volumetric frame taken during the in vivo experiment, with a semiautomatically generated segmentation of the left ventricle. Right: The corresponding SBDX composite image, with the left ventricle segmentation borders registered and projected onto the SBDX image.

This study demonstrates two potential approaches to SBDX/echo visualization.
In an echo-centric display, 3D echo images could be augmented with 3D representations of the catheter device derived from SBDX device localization. Alternatively, a live 2D fluoroscopic image could be combined with soft tissue anatomy segmented from simultaneous 3D echo. Future work will investigate the utility of these approaches in different structural heart interventional tasks.

Note that in this initial study, SBDX/TEE image fusion was implemented in . For real-time guidance experiments, implementation on GPU-based hardware will be required. Additionally, future work should include automated procedures for initialization of the registration.

6. CONCLUSIONS

Image registration between a low dose SBDX system and TEE has been demonstrated. A novel 6 degree-of-freedom localization algorithm was presented, and the registration feasibility and accuracy were evaluated in phantoms and in vivo. Future technical work will focus on real-time implementation and fully automatic registration initialization.

ACKNOWLEDGMENT

Partial financial support for this work was provided by NIH Grant No. R01 HL084022.

a) Author to whom correspondence should be addressed. Electronic mail: speidel@wisc.edu

1. G. Gao, G. Penney, Y. Ma, N. Gogin, P. Cathier, A. Arujuna, G. Morton, D. Caulfield, J. Gill, and C. A. Rinaldi, “Registration of 3D trans-esophageal echocardiography to x-ray fluoroscopy using image-based probe tracking,” Med. Image Anal. 16, 38–49 (2012). http://dx.doi.org/10.1016/j.media.2011.05.003
2. P. Lang, P. Seslija, M. W. Chu, D. Bainbridge, G. M. Guiraudon, D. L. Jones, and T. M. Peters, “US–fluoroscopy registration for transcatheter aortic valve implantation,” IEEE Trans. Biomed. Eng. 59, 1444–1453 (2012). http://dx.doi.org/10.1109/TBME.2012.2189392
3. C. Hatt, M. Speidel, and A. Raval, “Robust 5DOF transesophageal echo probe tracking at fluoroscopic frame rates,” in International Conference on Medical Image Computing and Computer Assisted Interventions, edited by N. Navab, J. Hornegger, S. Wells, and A. F. Frangi (Springer, Munich, Germany, 2015).
4. P. Mountney, R. Ionasec, M. Kaizer, S. Mamaghani, W. Wu, T. Chen, M. John, J. Boese, and D. Comaniciu, “Ultrasound and fluoroscopic images fusion by autonomous ultrasound probe detection,” in Medical Image Computing and Computer-Assisted Intervention–MICCAI 2012 (Springer, Berlin Heidelberg, 2012), pp. 544–551. http://dx.doi.org/10.1007/978-3-642-33418-4_67
5. C. R. Hatt, A. K. Jain, V. Parthasarathy, A. Lang, and A. N. Raval, “MRI—3D ultrasound—X-ray image fusion with electromagnetic tracking for transendocardial therapeutic injections: In-vitro validation and in-vivo feasibility,” Comput. Med. Imaging Graphics 37, 162–173 (2013). http://dx.doi.org/10.1016/j.compmedimag.2013.03.006
6. A. Jain, L. Gutierrez, and D. Stanton, “3D TEE registration with x-ray fluoroscopy for interventional cardiac applications,” in Functional Imaging and Modeling of the Heart (Springer, Berlin Heidelberg, 2009), pp. 321–329.
7. L. E. Bø, H. O. Leira, G. A. Tangen, E. F. Hofstad, T. Amundsen, and T. Langø, “Accuracy of electromagnetic tracking with a prototype field generator in an interventional OR setting,” Med. Phys. 39, 399–406 (2012). http://dx.doi.org/10.1118/1.3666768
8. M. A. Speidel, B. P. Wilfley, J. M. Star-Lack, J. A. Heanue, and M. S. Van Lysel, “Scanning-beam digital x-ray (SBDX) technology for interventional and diagnostic cardiac angiography,” Med. Phys. 33, 2714–2727 (2006). http://dx.doi.org/10.1118/1.2208736
9. M. A. Speidel, M. T. Tomkowiak, A. N. Raval, D. A. Dunkerley, J. M. Slagowski, P. Kahn, J. Ku, and T. Funk, “Detector, collimator and real-time reconstructor for a new scanning-beam digital x-ray (SBDX) prototype,” Proc. SPIE 9412, 94121W (2015). http://dx.doi.org/10.1117/12.2081716
10. M. A. Speidel, M. T. Tomkowiak, A. N. Raval, and M. S. Van Lysel, “Three-dimensional tracking of cardiac catheters using an inverse geometry x-ray fluoroscopy system,” Med. Phys. 37, 6377–6389 (2010). http://dx.doi.org/10.1118/1.3515463
11. M. A. Speidel, B. P. Wilfley, A. Hsu, and D. Hristov, “Feasibility of low-dose single-view 3D fiducial tracking concurrent with external beam delivery,” Med. Phys. 39, 2163–2169 (2012). http://dx.doi.org/10.1118/1.3697529
12. M. T. Tomkowiak, A. N. Raval, M. S. Van Lysel, T. Funk, and M. A. Speidel, “Calibration-free coronary artery measurements for interventional device sizing using inverse geometry x-ray fluoroscopy: In vivo validation,” J. Med. Imaging 1, 033504 (2014). http://dx.doi.org/10.1117/1.JMI.1.3.033504
13. M. T. Tomkowiak, M. S. Van Lysel, and M. A. Speidel, “Monoplane stereoscopic imaging method for inverse geometry x-ray fluoroscopy,” Proc. SPIE 8669, 86692W (2013). http://dx.doi.org/10.1117/12.2006238
14. C. Hatt, M. Speidel, and A. Raval, “Hough Forests for real-time, automatic device localization in fluoroscopic images: Application to TAVR,” in International Conference on Medical Image Computing and Computer Assisted Interventions, edited by N. Navab, J. Hornegger, S. Wells, and A. F. Frangi (Springer, Munich, Germany, 2015).
15. A. F. Frangi, W. J. Niessen, K. L. Vincken, and M. A. Viergever, “Multiscale vessel enhancement filtering,” in Medical Image Computing and Computer-Assisted Interventation—MICCAI’98 (Springer, Berlin Heidelberg, 1998), pp. 130–137.
16. P. Kovesi, “Symmetry and asymmetry from local phase,” presented at the Tenth Australian Joint Conference on Artificial Intelligence, Perth, Australia, 1997.
17. W. Birkfellner, R. Seemann, M. Figl, J. Hummel, C. Ede, P. Homolka, X. Yang, P. Niederer, and H. Bergmann, “Wobbled splatting—A fast perspective volume rendering method for simulation of x-ray images from CT,” Phys. Med. Biol. 50, N73–N84 (2005). http://dx.doi.org/10.1088/0031-9155/50/9/N01
18. C. R. Hatt, M. A. Speidel, and A. N. Raval, “Efficient feature-based 2D/3D registration of transesophageal echocardiography to x-ray fluoroscopy for cardiac interventions,” Proc. SPIE 9036, 90361J (2014). http://dx.doi.org/10.1117/12.2043137
19. N. Piazza, P. de Jaegere, C. Schultz, A. E. Becker, P. W. Serruys, and R. H. Anderson, “Anatomy of the aortic valvar complex and its implications for transcatheter implantation of the aortic valve,” Circ.: Cardiovasc. Interventions 1, 74–81 (2008). http://dx.doi.org/10.1161/CIRCINTERVENTIONS.108.780858
20. M. A. Speidel, B. P. Wilfley, J. M. Star-Lack, J. A. Heanue, T. D. Betts, and M. S. Van Lysel, “Comparison of entrance exposure and signal-to-noise ratio between an SBDX prototype and a wide-beam cardiac angiographic system,” Med. Phys. 33, 2728–2743 (2006). http://dx.doi.org/10.1118/1.2198198