This post is mainly a re-formulation and detailed expansion for the section 8.4, "Direct Method" in "14 lectures on Visual SLAM".
When I was working on direct method SLAM, I found the notations used on the book is hard to understand and many details to derive the final equation are omitted. Hence, I wrote this post as a note and record for my own derivation of the Jacobian in Direct Method of SLAM.
Coordinate System and Symbol Table
Assumption
Cam1 and Cam2 have same intrinsic matrix K and are pinhole camera with no distortion (or distortion is corrected for resulting image in pre-process step).
Coordinate System
There are two 3D coordinate systems: the camera 1 (where optic center O1 is the same as world coordinate origin) and the camera 2 (with transformation of cam1 → cam2 is T12=ξ)
There are two 2D coordinate systems (pixel coordinates): the image 1 and image 2. I will use uv coordinate when refer to them.
Symbol Table
Symbol | Domain | Shape | Meaning |
---|
Pi | R3 | (1, 3) | i-th point in the 3-dimentional space (under cam1 coordinate) |
p1,i | R2 | (1, 2) | i-th point's projection on cam1's sensor plane |
p2,i | R2 | (1, 2) | i-th point's projection on cam2's sensor plane |
K | R3×3 | (3, 3) | Camera intrinsic matrix |
ξ | se(3) | (1, 6) | Cam2's pose w.r.t. world coordinate (cam1) |
I1 | R2→R | - | Illuminance map from cam1 (w/ bilinear interpolation) |
I2 | R2→R | - | Illuminance map from cam2 (w/ bilinear interpolation) |
Exp(⋅) | se(3)→R4×4 | - | vector to matrix + Exponential Mapping (Lie Algebra → Transformation Matrix) |
Log(⋅) | R4×4→se(3) | - | Logarithm Mapping + Matrix to vector (Transformation Matrix → Lie Algebra) |
exp(⋅) | se(3)→SE(3) | - | Exponential Mapping (Lie Algebra → Lie Group) |
⋅∧ | R6→R4×4 | - | se(3) vector representation to matrix representation |
⋅~ | - | - | Homogenous coordinate system |
Unless explicitly specified, all the coordinates are represented under heterogeneous coordinate.
Step 1. Reprojection Model
According to the pinhole camera model, we have p1,i~=KPi and
p2,i~=K(Exp(ξ)Pi~)1:3.
By converting homogenous coordinate to heterogeneous coordinate, we can retrieve the p1,i and p2,i from ξ and K.
Step 2. Photometric Error as Model Residual
In direct method, we do not do feature point matching as this process (extracting feature point, convert to feature descriptor, run descriptor matching with or without epipolar geo constraint) is computationally heavy and error-prone.
Instead, we use the "illuminance consistency assumption". That is, between two camera frames t and t+1, the illuminance of a point in 3D space should be consistent. That is,
It(P)=It+1(P)
Naturally, the photometric error is the difference in illuminance between same point in 2 adjacent frames.
ei(ξ)=(I1(p1,i)−I2(p2,i))2
Direct SLAM make use of this assumption heavily. In the bundle adjustment process, we define the total residual of current state estimation as:
R(ξ)=i=1∑Nei(ξ)=i=1∑N(I1(p1,i)−I2(p2,i))2
During the (local) bundle adjustment stage, we want to optimize the pose of cam2, namely ξ∈se(3), to minimize the residual of state estimation. The mathematical formulation of this optimization problem is:
ξ⋆=argξminR(ξ)=argξmini=1∑Nei(ξ)=argξmini=1∑N∥I1(p1,i)−I2(p2,i)∥22=argξmini=1∑N∥I1(KPi)−I2(K(Exp(ξ)Pi~)1:3∥22
Step 4. Perturbation Model and Jacobian for Direct Method
To solve this optimization problem, we can use optimizers like Gauss-Newton or Lagrange-Marquadt. These optimizers typically requires us to provide the jacobian of model. Hence, we need to derive the jacobian J for cam2 pose ξ w.r.t. R. Since R is a simple summation from ei(ξ), we will only derive ∂ei(ξ)/∂ξ here.
To calculate the jacobian of R w.r.t. ξ, there are two main methods: the direct differentiation and perturbation model.
Direct Differentiation (which, we will not use)
∂ξ∂ei(ξ)=δξ→0limδξei(ξ⊕δξ)−ei(ξ)=δξ→0limδξ∥I(KPi)−I2(K(Exp(ξ⊕δξ)Pi~)1:3)∥22−ei(ξ)
However, we have ξ⊕δξ=Log(Exp(ξ)Exp(δξ)) in Lie algebra. Instead, we need to follow the BCH formula with first-order approximation (when δξ is sufficiently small)
Log(Exp(δξ)Exp(ξ))≈Jl(ξ)−1δξ+ξ
However, using BCH formula requires us to calculate the Jl, which is undesired due to its computation complexity. Hence, we will use the perturbation model to calculate the differentiation instead.
Perturbation Model
Comparing to the direct differentiation, where BCH formula is required, perturbation model first add a small perturbation on target function ei, then calculate the derivative of residual w.r.t. the perturbation term.
That is, we will calculate ∂δξ∂ei(ξ⊕δξ) instead of ∂ξ∂ei(ξ).
First, we will derive ei(ξ⊕δξ) (using left-perturbation, the result is different from right-perturbation).
ei(ξ⊕δξ)=I1(KPi)−I2(K(Exp(δξ)Exp(ξ)Pi~)1:3)
The taylor expansion for Exp(δξ) is of form
Exp(δξ)=exp(δξ∧)=n=0∑∞n!1(δξ∧)n
Using a first-order taylor approximation here (should be accurate as δξ is a minute perturbation term), we have
Exp(δξ)≈1+δξ∧
Then, we have
ei(ξ⊕δξ)=I1(KPi)−I2(K(Exp(δξ)Exp(ξ)P~i)1:3)≈I1(KPi)−I2(K((1+δξ∧)Exp(ξ)P~i)1:3)=I1(KPi)−I2(K(Exp(ξ)P~i)1:3+K(δξ∧Exp(ξ)P~i)1:3)
Then we apply another first-order taylor approximation
≈I2(K(Exp(ξ)P~i)1:3+K(δξ∧Exp(ξ)P~i)1:3)≈I2(K(Exp(ξ)P~i)1:3)+∂p2∂I2(p2)∂ξ∂p2δξ
And we have the fully expanded form of e(ξ⊕δξ)
e(ξ⊕δξ)=I1(KPi)−(I2(K(Exp(δξ)Exp(ξ)Pi)1:3)+∂p2∂I2(p2)∂ξ∂p2δξ)
where we can derive the derivative under left perturbation
∂δξ∂e(ξ⊕δξ)≈−∂p2∂I2(p2)∂ξ∂p2