arXiv:1204.0067v1 [cs.RO] 31 Mar 2012 1 Estimating Rigid Transformation Between Two Range Maps Using Expectation Maximization Algorithm Shuqing Zeng† Abstract We address the problem of estimating a rigid transformation between two point sets, which is a key module for target tracking system using Light Detection And Ranging (LiDAR). A fast implementation of Expectation-maximization (EM) algorithm is presented whose complexity is O(N) with N the number of scan points. I. INTRODUCTION Rigid registration of two sets of points sampled from a surface has been widely investigated (e.g., [1], [4]–[6], [9]) in computer vision literature. Generally, these methods are designed to tackle range maps with dense points for non-realtime applications. In [2], [8] scans are matched using iterative closest line (ICL), a variant of “normal-distance” form of ICP algorithm [1] originally proposed in computer vision community by [3]. However, the convergence of this approach is sensitive to errors in normal direction estimations [10]. m s j k T w(s , m )j k Fig. 1. Illustration of the proposed algorithm. Fig. 1 illustrates the concept. The light green circles denote the contour of a target M. The red circles are the projection of M under a rigid transformation T, denoted as ˜ M. Let S be the current range image shown as upper triangles. We propose an Expectation-maximization (EM) algorithm [4], [7] to find the rigid transformation such that the projected range image best matches the current image. Each point mj in ˜ M is treated as the center of a Parzen window. There is an edge between sk ∈S and mj if sk lies in the window. The weight of the edge (sk, mj) is based on the proximity between the two vertices. The larger weight of the edge, the thicker the line is shown, and the more force that pulls the corresponding the point mj to sk through T. This document describes a fast implementation of expectation maximization (EM) algorithm [7] to locally match between M and S. By exploiting the sparsity of the locally matching matrix, this imple- mentation scales linearly with the number of points. †Department of Computer Science and Engineering, Michigan State University, East Lansing, MI 48824, e- mail:zengshuq@msu.edu. 2 II. ALGORITHM DERIVATION This section is devoted to the problem of how to estimate the rigid transformation T using (EM) algorithm, giving scan map S and a contour model M. We constructed a bipartite graph B = (M, S, EB) between the vertex set M to S with EB the set of edges. Let m ∈M and s ∈S. An edge exists between the points s and m if and only if ∥s −m∥< W with W a distance threshold. By N(s) ≡{m | (s, m) ∈EB} we denote the neighborhood of s. Scan points are indexed using a lookup hash-table with W/2 resolution. Find the points m near a point s within the radius W involving searching through all the three-by-three neighbor grid of the cell containing s. Since hash table is used, and |N(s)| is bounded, construction graph B is an O(N) operation with N the number of points in a scan. Let sj ∈S be one of the nS scan points, and mk ∈M be one of the nM points from the model. We denote T a rigid transformation from the model to the new scan frame, with the parameter vector y. If sj is the measure of mk (i.e., (sj, mk) ∈B) with a known noise model, we write the density function as p(sj | mk, y) = p(sj | T(mk, y)). In case of an additive and centered Gaussian noise of precision matrix Γ, p(sj | mk, T) = c exp(−∥sj−T(mk,y)∥2 Γ 2 ) where the Mahalanobis norm is defined as ∥x∥2 Γ ≡xT Γx. We use the binary matrix A to represent the correspondence between sj and mk. The entry Ajk = 1 if sj matches mk and 0 otherwise. Assume each scan point sj corresponds to at most one model point. We have ΣkAjk =  1 If N(sj) ̸= ∅ 0 Otherwise. for all scan point index j. For the above equation, we note that for the case N(sj) = ∅, sj is an outlier, and the correspondence sj to mk can be treated as a categorical distribution. In order to apply EM procedure we use a random matching matrix A with each element a binary random variable. Each eligible matching matrix A has a probability p(A) ≡p(A = A). One can verify that ¯Ajk = E{Ajk} = P(Ajk = 1), and the following constraint holds Σk ¯Ajk =  1 If N(sj) ̸= ∅ 0 Otherwise. Considering the distribution of Aj, the j-th row of the A, which is the distribution of assigning the scan point sj to the model point mk, i.e., p(Aj) = Y mk∈N (sj) ( ¯Ajk)Ajk Assuming the scan points are independent, we can write p(A) = Y sj∈S Y mk∈N (sj) ( ¯Ajk)Ajk = Y (sj,mk)∈EB ( ¯Ajk)Ajk (1) An example of p(A) is the noninformative prior probability of the matches: a probability distribution that a given scan point is a measure of a given model point without knowing measurement information: ¯Ajk = πjk =  1 |N (sj)| If N(sj) ̸= ∅ 0 Otherwise. The joint probability of the scan point sj and the corresponding assignment Aj can be expressed as p(sj, Aj | M, y) = Y mk∈N (sj) (πjkp(sj | mk, y))Ajk 3 Providing that the scan points are conditionally independent, the overall joint probability is the product of the each row of A: p(S, A | M, y) = Y (sj,mk)∈EB (πjkp(sj | mk, y))Ajk (2) and the logarithm of marginal distribution can be written as ML(T) = log p(S | M, y) = log X A p(S, A | M, y) ! (3) Unfortunately, Eq. (3) has no closed-form solution and no robust and efficient algorithm to directly minimize it with respect to the parameter y. Noticing that Eq. (3) only involves the logarithm of a sum, we can treat the matching matrix A as latent variables and apply the EM algorithm to iteratively estimate y. Assuming after n-th iteration, the current estimate for y is given by yn, we can compute an updated estimate T such that ML(T) is monotonically increasing, i.e., ∆(y | yn) = ML(y) −ML(yn) > 0 Namely, we want to maximize the difference ∆(y | yn). Now we are ready to state two propositions whose proofs are relegated to Appendix. Proposition 2.1: ∆(y | yn) = EA|S,M,yn{log (p(S, A | M, y))} Proposition 2.2: Given the transformation estimate yn, scan points S and model points M, the posterior of the matching matrix A can be written as p(A | S, M, yn) = Y j,k:(sj,mk)∈EB ( ˆAjk)Ajk (4) where E{A}jk = ˆAjk = ( πjkp(sj|mk,yn) P k πjkp(sj|mk,yn) If N(sj) ̸= ∅ 0 Otherwise. (5) Therefore, we have the following EM algorithm to compute y that maximizes the likelihood defined in Eq. (3). We assume there exists an edge in the graph B between sj and mk in the following derivation. • E-step: Given the previous estimate Tn, we update ˆAjk using Eq. (5). The conditional expectation is computed as ∆(y | yn) = EA|S,M,yn {log p(S, A | M, y)} = E   log  Y j,k πjkp(sj | mk, y)Ajk      = X j,k E{Ajk}(log p(sj | mk, y) + log πjk) = X j,k ˆAjk∥sj −T (mk, y)∥2 Γ + const. (6) where E is EA|S,M,yn in short, and const. is the terms irrelevant to y. • M-step: Compute y to maximize the least-squares expression in Eq. (6). The above EM procedure is repeated until the model is converged, i.e., the difference of log-likelihood between two iterations ∆(y | yn) is less than a small number. The complexity of the above computation for a target in each iteration is O(|EB|). Since the number of neighbors for sj is bounded, the complexity is reduced to O(|S|). Since experimental result shows that only 4-5 epochs are needed for EM iteration 4 to converge. Consequently, the overall complexity for all of the tracked objects is O(N) with N the number of scan points. The following proposition shows how to compute the covariance matrix for the transformation param- eters y. Proposition 2.3: Given y, the covariance matrix R is R = 1 nP X (sj,mk)∈EB ˆAjk(sj −T(mk, y))(sj −T(mk, y))T (7) where nP is the number of the nonzero rows of the matrix ˆA. III. PROOF OF PROPOSITIONS A. Proof of Proposition 2.1 ∆(y | yn) = ML(y) −ML(yn) = log X A p(S, A|M, y) ! −log (p(S | M, yn)) = log X A p(S|A, M, y)p(A|M, y) ! −log p(S|M, yn) = log X A p(A|S, M, yn)p(S|A, M, y)p(A|M, y) p(A|S, M, yn) ! −log p(S|M, yn) ≥ X A p(A|S, M, yn) log p(S|A, M, y)p(A|M, y) p(A|S, M, yn)  −log p(S|M, yn) (8) = X A p(A|S, M, yn) log  p(S|A, M, y)p(A|M, y) p(A|S, M, yn)p(S|M, yn)  where Jansen’s inequality and convexity of logarithm function are applied in deriving Eq. (8). Since we are maximizing ∆(y|yn) with respect to y, we can drop terms that are irrelevant to y, thus ∆(y|yn) = X A p(A|S, M, yn) log (p(S|A, M, y)p(A|M, y)) = X A p(A|S, M, yn) log p(S, A, y|M) p(A, y|M) p(A, y|M) p(y|M)  = X A p(A|S, M, yn) log p(S, A, y|M) p(y|M)  = X A p(A|S, M, yn) log (p(S, A|M, y)) = EA|S,M,yn{log (p(S, A|M, y))} (9) 5 B. Proof of Proposition 2.2 If N(sj) ̸= ∅, the marginal PDF of the j-th row of A is p(sj|M, y) = P k πjkp(sj|mk, y). We assume there exists an edge in the graph B between the scan and model points sj and mk, and scan points are independent each other. One can verify that p(S|M, yn) = Y j X k πjkp(sj|mk, yn) ! Using Bayesian theorem, we have p(A|S, M, yn) = p(S, A|M, yn) p(S|M, yn) = Q j,k (πjkp(sj|mk, yn))Ajk Q j (P k πjkp(sj|mk, yn)) = Q j,k (πjkp(sj|mk, yn))Ajk Q j,k (P k πjkp(sj|mk, yn))Ajk = Y j,k  πjkp(sj|mk, yn) P k πjkp(sj|mk, yn) Ajk Comparing with Eq. (4), the equation Eq. (5) holds. C. Proof of Proposition 2.3 We treat the precision matrix Γ as the uncertainty of unknown transformation parameter y. We use a maximum likelihood approach, which amounts to minimizing Eq. (6) with respect to Γ given a transformation and a set of matches with probabilities: ∂ ∂Γ∆(y|yn) = ∂ ∂Γ X (sj,mk)∈EB ˆAjk ∥sj −T(mk, y)∥2 Γ 2 + log |Γ|−1 2  = 1 2 X (sj,mk)∈EB ˆAjk(sj −T(mk, y))(sj −T(mk, y))T −nP 2 Γ−1 = 0 where nP is the number of nonzero rows of the matrix ˆA. Thereby the covariance matrix R is computed as R = Γ−1 = 1 nP X (sj,mk)∈EB ˆAjk(sj −T(mk))(sj −T(mk))T REFERENCES [1] P. Besl and N. McKay. A method for registration of 3-D shapes. IEEE Trans. Pattern Analaysis and Machine Intelligence, 14(2):239–256, 1992. [2] A. Censi. An icp variant using a point-to-line metric. In IEEE International Conference on Robotics and Automation, pages 19–25, New York, NY, 2008. 6 [3] Y. Chen and G. Medioni. Object modeling by registration of multiple range images. Image and Vision Computing, 10(3):145–155, 1992. [4] S. Granger and X. Pennec. Multi-scale EM-ICP: A fast and robust approach for surface registration. In ECCV, pages 418–432, 2002. [5] A. Jagannathan and E. Miller. Unstructure point cloud matching within graph-theoretic and thermodynamic frameworks. In CVPR, pages 1008–1015, 2005. [6] A. Makadia, A. Patterson, and K. Daniilidis. Fully automatic registration of 3D point clouds. In CVPR, pages 1297–1304, 2006. [7] G. McLachain and T. Krishnan. The EM algorithm and extensions. John Wiley & Sons Inc., New York, second edition, 2008. [8] E. Olson. Real-time correlative scan matching. In IEEE International Conference on Robotics and Automation, pages 4387–4393, Kobe, Japan, 2009. [9] G. Sharp, S. Lee, and D. Wehe. ICP registration using invariant features. IEEE Trans. Pattern Analaysis and Machine Intelligence, 24(1):90–102, 2002. [10] C. Stewart. Uncertainty-driven, point-based image registration. In N. Paragios, Y. Chen, and O. Faugeras, editors, Handbook of Mathematical Models in Computer Vision, chapter 14, pages 221–235. Springer, 2006.