arXiv:1204.0067v1  [cs.RO]  31 Mar 2012
1
Estimating Rigid Transformation Between Two
Range Maps Using Expectation Maximization
Algorithm
Shuqing Zeng†
Abstract
We address the problem of estimating a rigid transformation between two point sets, which is a key
module for target tracking system using Light Detection And Ranging (LiDAR). A fast implementation
of Expectation-maximization (EM) algorithm is presented whose complexity is O(N) with N the number
of scan points.
I. INTRODUCTION
Rigid registration of two sets of points sampled from a surface has been widely investigated (e.g., [1],
[4]–[6], [9]) in computer vision literature. Generally, these methods are designed to tackle range maps
with dense points for non-realtime applications.
In [2], [8] scans are matched using iterative closest line (ICL), a variant of “normal-distance” form of
ICP algorithm [1] originally proposed in computer vision community by [3]. However, the convergence
of this approach is sensitive to errors in normal direction estimations [10].
m
s
j
k
T
w(s  , m )j
k
Fig. 1.
Illustration of the proposed algorithm.
Fig. 1 illustrates the concept. The light green circles denote the contour of a target M. The red circles
are the projection of M under a rigid transformation T, denoted as ˜
M. Let S be the current range image
shown as upper triangles. We propose an Expectation-maximization (EM) algorithm [4], [7] to ﬁnd the
rigid transformation such that the projected range image best matches the current image. Each point mj
in
˜
M is treated as the center of a Parzen window. There is an edge between sk ∈S and mj if sk lies in
the window. The weight of the edge (sk, mj) is based on the proximity between the two vertices. The
larger weight of the edge, the thicker the line is shown, and the more force that pulls the corresponding
the point mj to sk through T.
This document describes a fast implementation of expectation maximization (EM) algorithm [7] to
locally match between M and S. By exploiting the sparsity of the locally matching matrix, this imple-
mentation scales linearly with the number of points.
†Department
of
Computer
Science
and
Engineering,
Michigan
State
University,
East
Lansing,
MI
48824,
e-
mail:zengshuq@msu.edu.
2
II. ALGORITHM DERIVATION
This section is devoted to the problem of how to estimate the rigid transformation T using (EM)
algorithm, giving scan map S and a contour model M.
We constructed a bipartite graph B = (M, S, EB) between the vertex set M to S with EB the set of
edges. Let m ∈M and s ∈S. An edge exists between the points s and m if and only if ∥s −m∥< W
with W a distance threshold. By N(s) ≡{m | (s, m) ∈EB} we denote the neighborhood of s.
Scan points are indexed using a lookup hash-table with W/2 resolution. Find the points m near a
point s within the radius W involving searching through all the three-by-three neighbor grid of the cell
containing s. Since hash table is used, and |N(s)| is bounded, construction graph B is an O(N) operation
with N the number of points in a scan.
Let sj ∈S be one of the nS scan points, and mk ∈M be one of the nM points from the model. We
denote T a rigid transformation from the model to the new scan frame, with the parameter vector y. If
sj is the measure of mk (i.e., (sj, mk) ∈B) with a known noise model, we write the density function as
p(sj | mk, y) = p(sj | T(mk, y)). In case of an additive and centered Gaussian noise of precision matrix
Γ, p(sj | mk, T) = c exp(−∥sj−T(mk,y)∥2
Γ
2
) where the Mahalanobis norm is deﬁned as ∥x∥2
Γ ≡xT Γx.
We use the binary matrix A to represent the correspondence between sj and mk. The entry Ajk = 1
if sj matches mk and 0 otherwise. Assume each scan point sj corresponds to at most one model point.
We have
ΣkAjk =
 1 If N(sj) ̸= ∅
0 Otherwise.
for all scan point index j.
For the above equation, we note that for the case N(sj) = ∅, sj is an outlier, and the correspondence
sj to mk can be treated as a categorical distribution. In order to apply EM procedure we use a random
matching matrix A with each element a binary random variable. Each eligible matching matrix A has
a probability p(A) ≡p(A = A). One can verify that ¯Ajk = E{Ajk} = P(Ajk = 1), and the following
constraint holds
Σk ¯Ajk =
 1 If N(sj) ̸= ∅
0 Otherwise.
Considering the distribution of Aj, the j-th row of the A, which is the distribution of assigning the
scan point sj to the model point mk, i.e.,
p(Aj) =
Y
mk∈N (sj)
( ¯Ajk)Ajk
Assuming the scan points are independent, we can write
p(A) =
Y
sj∈S
Y
mk∈N (sj)
( ¯Ajk)Ajk =
Y
(sj,mk)∈EB
( ¯Ajk)Ajk
(1)
An example of p(A) is the noninformative prior probability of the matches: a probability distribution
that a given scan point is a measure of a given model point without knowing measurement information:
¯Ajk = πjk =

1
|N (sj)|
If N(sj) ̸= ∅
0 Otherwise.
The joint probability of the scan point sj and the corresponding assignment Aj can be expressed as
p(sj, Aj | M, y) =
Y
mk∈N (sj)
(πjkp(sj | mk, y))Ajk
3
Providing that the scan points are conditionally independent, the overall joint probability is the product
of the each row of A:
p(S, A | M, y) =
Y
(sj,mk)∈EB
(πjkp(sj | mk, y))Ajk
(2)
and the logarithm of marginal distribution can be written as
ML(T) = log p(S | M, y) = log
 X
A
p(S, A | M, y)
!
(3)
Unfortunately, Eq. (3) has no closed-form solution and no robust and efﬁcient algorithm to directly
minimize it with respect to the parameter y. Noticing that Eq. (3) only involves the logarithm of a sum,
we can treat the matching matrix A as latent variables and apply the EM algorithm to iteratively estimate
y. Assuming after n-th iteration, the current estimate for y is given by yn, we can compute an updated
estimate T such that ML(T) is monotonically increasing, i.e.,
∆(y | yn) = ML(y) −ML(yn) > 0
Namely, we want to maximize the difference ∆(y | yn).
Now we are ready to state two propositions whose proofs are relegated to Appendix.
Proposition 2.1:
∆(y | yn) = EA|S,M,yn{log (p(S, A | M, y))}
Proposition 2.2: Given the transformation estimate yn, scan points S and model points M, the
posterior of the matching matrix A can be written as
p(A | S, M, yn) =
Y
j,k:(sj,mk)∈EB
( ˆAjk)Ajk
(4)
where
E{A}jk = ˆAjk =
(
πjkp(sj|mk,yn)
P
k πjkp(sj|mk,yn)
If N(sj) ̸= ∅
0 Otherwise.
(5)
Therefore, we have the following EM algorithm to compute y that maximizes the likelihood deﬁned in
Eq. (3). We assume there exists an edge in the graph B between sj and mk in the following derivation.
• E-step: Given the previous estimate Tn, we update ˆAjk using Eq. (5). The conditional expectation
is computed as
∆(y | yn) = EA|S,M,yn {log p(S, A | M, y)}
= E


log

Y
j,k
πjkp(sj | mk, y)Ajk





=
X
j,k
E{Ajk}(log p(sj | mk, y) + log πjk)
=
X
j,k
ˆAjk∥sj −T (mk, y)∥2
Γ + const.
(6)
where E is EA|S,M,yn in short, and const. is the terms irrelevant to y.
• M-step: Compute y to maximize the least-squares expression in Eq. (6).
The above EM procedure is repeated until the model is converged, i.e., the difference of log-likelihood
between two iterations ∆(y | yn) is less than a small number. The complexity of the above computation
for a target in each iteration is O(|EB|). Since the number of neighbors for sj is bounded, the complexity
is reduced to O(|S|). Since experimental result shows that only 4-5 epochs are needed for EM iteration
4
to converge. Consequently, the overall complexity for all of the tracked objects is O(N) with N the
number of scan points.
The following proposition shows how to compute the covariance matrix for the transformation param-
eters y.
Proposition 2.3: Given y, the covariance matrix R is
R = 1
nP
X
(sj,mk)∈EB
ˆAjk(sj −T(mk, y))(sj −T(mk, y))T
(7)
where nP is the number of the nonzero rows of the matrix ˆA.
III. PROOF OF PROPOSITIONS
A. Proof of Proposition 2.1
∆(y | yn) = ML(y) −ML(yn)
= log
 X
A
p(S, A|M, y)
!
−log (p(S | M, yn))
= log
 X
A
p(S|A, M, y)p(A|M, y)
!
−log p(S|M, yn)
= log
 X
A
p(A|S, M, yn)p(S|A, M, y)p(A|M, y)
p(A|S, M, yn)
!
−log p(S|M, yn)
≥
X
A
p(A|S, M, yn) log
p(S|A, M, y)p(A|M, y)
p(A|S, M, yn)

−log p(S|M, yn)
(8)
=
X
A
p(A|S, M, yn) log
 p(S|A, M, y)p(A|M, y)
p(A|S, M, yn)p(S|M, yn)

where Jansen’s inequality and convexity of logarithm function are applied in deriving Eq. (8). Since we
are maximizing ∆(y|yn) with respect to y, we can drop terms that are irrelevant to y, thus
∆(y|yn) =
X
A
p(A|S, M, yn) log (p(S|A, M, y)p(A|M, y))
=
X
A
p(A|S, M, yn) log
p(S, A, y|M)
p(A, y|M)
p(A, y|M)
p(y|M)

=
X
A
p(A|S, M, yn) log
p(S, A, y|M)
p(y|M)

=
X
A
p(A|S, M, yn) log (p(S, A|M, y))
= EA|S,M,yn{log (p(S, A|M, y))}
(9)
5
B. Proof of Proposition 2.2
If N(sj) ̸= ∅, the marginal PDF of the j-th row of A is p(sj|M, y) = P
k πjkp(sj|mk, y). We assume
there exists an edge in the graph B between the scan and model points sj and mk, and scan points are
independent each other. One can verify that
p(S|M, yn) =
Y
j
 X
k
πjkp(sj|mk, yn)
!
Using Bayesian theorem, we have
p(A|S, M, yn) = p(S, A|M, yn)
p(S|M, yn)
=
Q
j,k (πjkp(sj|mk, yn))Ajk
Q
j (P
k πjkp(sj|mk, yn))
=
Q
j,k (πjkp(sj|mk, yn))Ajk
Q
j,k (P
k πjkp(sj|mk, yn))Ajk
=
Y
j,k

πjkp(sj|mk, yn)
P
k πjkp(sj|mk, yn)
Ajk
Comparing with Eq. (4), the equation Eq. (5) holds.
C. Proof of Proposition 2.3
We treat the precision matrix Γ as the uncertainty of unknown transformation parameter y. We
use a maximum likelihood approach, which amounts to minimizing Eq. (6) with respect to Γ given
a transformation and a set of matches with probabilities:
∂
∂Γ∆(y|yn)
= ∂
∂Γ
X
(sj,mk)∈EB
ˆAjk
∥sj −T(mk, y)∥2
Γ
2
+ log |Γ|−1
2

= 1
2
X
(sj,mk)∈EB
ˆAjk(sj −T(mk, y))(sj −T(mk, y))T
−nP
2 Γ−1 = 0
where nP is the number of nonzero rows of the matrix ˆA. Thereby the covariance matrix R is computed
as
R = Γ−1
= 1
nP
X
(sj,mk)∈EB
ˆAjk(sj −T(mk))(sj −T(mk))T
REFERENCES
[1] P. Besl and N. McKay. A method for registration of 3-D shapes. IEEE Trans. Pattern Analaysis and Machine Intelligence,
14(2):239–256, 1992.
[2] A. Censi. An icp variant using a point-to-line metric. In IEEE International Conference on Robotics and Automation,
pages 19–25, New York, NY, 2008.
6
[3] Y. Chen and G. Medioni.
Object modeling by registration of multiple range images.
Image and Vision Computing,
10(3):145–155, 1992.
[4] S. Granger and X. Pennec. Multi-scale EM-ICP: A fast and robust approach for surface registration. In ECCV, pages
418–432, 2002.
[5] A. Jagannathan and E. Miller. Unstructure point cloud matching within graph-theoretic and thermodynamic frameworks.
In CVPR, pages 1008–1015, 2005.
[6] A. Makadia, A. Patterson, and K. Daniilidis. Fully automatic registration of 3D point clouds. In CVPR, pages 1297–1304,
2006.
[7] G. McLachain and T. Krishnan. The EM algorithm and extensions. John Wiley & Sons Inc., New York, second edition,
2008.
[8] E. Olson. Real-time correlative scan matching. In IEEE International Conference on Robotics and Automation, pages
4387–4393, Kobe, Japan, 2009.
[9] G. Sharp, S. Lee, and D. Wehe. ICP registration using invariant features. IEEE Trans. Pattern Analaysis and Machine
Intelligence, 24(1):90–102, 2002.
[10] C. Stewart. Uncertainty-driven, point-based image registration. In N. Paragios, Y. Chen, and O. Faugeras, editors, Handbook
of Mathematical Models in Computer Vision, chapter 14, pages 221–235. Springer, 2006.