Architecture design
The proposed LiDAR-SLAM-based structure for PNT is illustrated in Fig. 1. GNSS in Fig. 1 refers to the DGNSS with the measurements received from multi-constellation system, which takes GPS, GLONASS, and BeiDou systems into consideration. The more measurements from multi-constellation system will result in more precise positioning solution in diverse scenarios. This is applicable for the following content as well. The initial guess for LiDAR-SLAM comes from two sources. One is the INS/GNSS integrated navigation solution with EKF after Rauch–Tung–Striebel (RTS) smoother (Chiang et al., 2009; Särkkä, 2008), which predicts and updates the velocity, position, and attitude information and feedbacks the bias and scale factor from EKF to INS mechanism. The other is directly from DGNSS stand-alone system with the position and velocity information only. The whole LiDAR-SLAM process includes point cloud preprocessing, DG, 3D NDT scan matching, and FDE. Especially, the FDE mechanism involves the two-step-functions, which are LiDAR Odometry (LO) and LiDAR Mapping (LM).
Initial guesses
INS/GNSS integrated navigation solution
In this research, the INS and DGNSS sensors integration system is developed utilizing the Loosely Coupled (LC) scheme, which needs an optimal estimator to compare the difference between each measurement with predicted variables and establish the error model, in order to estimate the possible error. The Least-Square (LS) method is the most used optimal estimator (Parvazi et al., 2020). To deal with continuous measurements, the Recursive Least-Square (RLS) method, the derivative of LS can update the previous optimal estimates with the current observations. However, RLS is only suitable for time-invariant system, while the Kalman filter can cope with time-varying system (Zangeneh-Nejad et al., 2018). The fundamental equations of EKF based on RLS are as follows.
The state vector \({x}_{k}\) for EKF input is showed in Eq. 1 (Shin and El, 2002; Shin, 2005), where each element involves the values in x, y, and z three axes.
$${x}_{k}={\left[r \, v \, \varphi \, {b}_{a} \, {b}_{g} \, {s}_{a} \, {s}_{g}\right]}_{21\times 1}^{T}$$
(1)
where \(r\) is the position vector in the Earth frame (latitude, longitude, height); \(v\) is velocity vector recorded in North, East, and Downward sequence; \(\varphi\) is the attitude rotating from the body frame to mapping frame, while body frame is known as the vehicle frame, and mapping frame refers to the local level frame; \({b}_{a}\) is the bias of accelerometer in IMU; \({b}_{g}\) is the bias of gyroscope in IMU; \({s}_{a}\) and \({s}_{g}\) are the scale factors of accelerometer and gyroscope, respectively.
With the state vector, we can construct two functions. One is function f, which utilizes the state vector at epoch k to predict the state vector at the next epoch k + 1. The other is function h, which can compute the measurements updated at epoch k + 1 by multiplying the predicted state from function f with observation matrix H and adding the observation error \({v}_{k}\).
$${x}_{k+1}^{-}=f\left( {x}_{k}, {\Phi }_{k}\right)+{w}_{k}$$
(2)
$${z}_{k+1}=h\left( {x}_{k}\right)+{v}_{k}= H{x}_{k}+{v}_{k}$$
(3)
Because the state transition and measurement update model are nonlinear, the previous functions shall pass through partial derivatives before putting them into Kalman Filter, which can be divided into two parts: prediction and measurement update. The prediction process estimates the state and noise at epoch k + 1 from the observation at epoch k, and the estimated value from prediction is denoted with superscript (–).
$${\widehat{x}}_{k+1}^{-}={\Phi }_{k}{\widehat{x}}_{k}$$
(4)
$${\widehat{P}}_{k+1}^{-}={\Phi }_{k}{P}_{k}{\Phi }_{k}+{Q}_{k}$$
(5)
The measurement update process is conducted by updating the state with the observation at epoch k + 1. It put the \({K}_{k+1}\) as Kalman gain into consideration. When the prediction model is more reliable, the weight for residual will be smaller, and vice versa. The combination and evaluation about the weights of observations from two sources (INS and GNSS) are critical (Teunissen & Amiri-Simkooei, 2008), while the suitable weight settings can improve the quality of unknown parameters. Meanwhile, the weighting should also take the grade of the sensor into consideration.
$${K}_{k+1}={\widehat{P}}_{k+1}^{-}{H}^{T}{\left(H{\widehat{P}}_{k+1}^{-}{H}^{T}+{R}_{k+1}\right)}^{-1}$$
(6)
$${\widehat{x}}_{k+1}^{+}={\widehat{x}}_{k+1}^{-}+{K}_{k+1}\left({z}_{k+1}-H{\widehat{x}}_{k+1}^{-}\right)$$
(7)
$${\widehat{P}}_{k+1}^{+}=\left(I-{K}_{k+1}H\right){\widehat{P}}_{k+1}^{-}$$
(8)
DGNSS stand-alone navigation solution
The concept of DGNSS is that the base station and the rover station simultaneously receive GNSS signals, while the latter is mounted on the vehicle. By subtracting the positioning data of the base (also named as the reference station) and rover stations, it can turn out the differential GNSS data. Compared to real-time DGNSS, the post-processing DGNSS applied in this research can achieve higher accuracy (Yoshimura & Hasegawa, 2004). The schematic diagram of DGNSS is shown in Fig. 2a. In this paper, the satellites’ data are continuously received by several Virtual Reference Stations (VRS), which therefore forms a GPS network to estimate the GPS error model in a specific area, the corrections will be send back to the rover station via the network to update the position of the vehicle. It is noteworthy that the established GPS network of multi reference stations can not only reduce the ionosphere and tropospheric delay errors, but also enhance the rover station’s ability to resolve phase ambiguity, increasing the positioning solution reliability.
Because the GNSS only provides position and velocity data, the initial heading at the first epoch comes from the reference, and the heading values at the following epochs are calculated by the positions at current epoch and previous epoch. The method is described in Fig. 2b.
Direct georeferencing
While each sensor is in different frame, the initial guess received by GNSS antenna is in Earth frame, and the point cloud is in LiDAR frame. To fulfill multi-sensors integration, it must transform them into a uniform frame by performing the translation and rotation with the three axes (Chiang et al., 2019), the so-called DG process. In this research, we transform each frame into the mapping frame with the coordinate at the first epoch as the origin. The transformation relationship between each pair of sensor frames is shown in Fig. 3, while \({r}_{b}^{m}\) and \({R}_{b}^{m}\) represent the translation and rotation of body frame with respect to the mapping frame and so on.
After DG, it can define the initial pose and construct the environment where the vehicle is by the initial guess generated in previous sub-section after the compensation of lever arm and boresight. However, the distribution of the point cloud is based on the rough initial guess. To correct the initial pose of the vehicle, it shall go through scan to dynamic map registration to adjust it, which will be mentioned in the next sub-section.
$${r}_{l}^{m}={R}_{b}^{m}{r}_{l}^{m}+{r}_{b}^{m}$$
(9)
$${R}_{l}^{m}={R}_{b}^{m}{R}_{l}^{b}$$
(10)
3D normal distribution transform
Although the initial guess can estimate the pose of the point cloud, in GNSS challenging environment, the initial guess might lead to a large pose offset of the vehicle. To prevent the failure of scan matching caused by the initial pose offset, if the points of each frame are well-matched, it can correct the error in the initial guess. This paper applies NDT for scan matching, which is a distribution-based registration method and more robust to deal with initial pose offset.
As we receive the consecutive frames of point cloud from LiDAR, the previous frame of point cloud is cut in a constant voxel size. Each voxel shall contain at least three points. Later, it processes the NDT by matching points after voxelization from moving frame (current frame) to the distribution of the fixed frame (previous frame) and turns out the normal distribution modeling \(N(q, \sum )\) for each voxel (Akai et al., 2017; Biber & Straßer, 2003).
$$q=\frac{1}{n}\sum_{i=1}^{n}{x}_{i}$$
(11)
$$\sum =\frac{1}{n-1}\sum_{i=1}^{n}\left({x}_{i}-q\right){\left({x}_{i}-q\right)}^{T}$$
(12)
$$p\left(x\right)\sim exp\left(-\frac{{\left({x}_{i}-q\right)}^{T}{\Sigma }^{-1}\left({x}_{i}-q\right)}{2}\right)$$
(13)
where \({x}_{i}\) is the point in the voxel; n is the number of points in a specific voxel; q is the mean value and Σ is the covariance value of the voxel.
After registration, it can calculate the score value. To minimize the score value, it iteratively updates the rotation and translation in three axes of consecutive frames and applies the Newton’s algorithm to optimize. It’s worth noticing that during the coordinate transformation process, the estimation of the weights for unknowns can enhance the registration results (Amiri-Simkooei., 2018), such as the calculation of six Degrees of Freedom (6DOF) may refer to the slant distance from the center to each point to set the corresponding weight.
For the research settings, the iteration won’t stop until it meets some criteria, such as the iteration number of scan matching, or the rotation and translation differences between consecutive frames fall within a predefined tolerance. To prevent the rotation and translation differences from falling into a local minimum, Iterative Discretization Method-NDT (IDM-NDT) registers the point cloud in three rounds by iteratively decreasing the voxel size. The schematic diagram of the NDT process mentioned above is shown in Fig. 4.
$$\left[\begin{array}{c}{x}^{^{\prime}}\\ {y}^{^{\prime}}\\ {z}^{^{\prime}}\end{array}\right]={R}_{z}\left(\psi \right){R}_{y}\left(\theta \right){R}_{x}\left(\phi \right)\left[\begin{array}{c}x\\ y\\ z\end{array}\right]+\left[\begin{array}{c}{t}_{x}\\ {t}_{y}\\ {t}_{z}\end{array}\right]$$
(14)
$$score\left(p\right)=\sum_{i=1}^{n}p\left({x}_{i}\right)$$
(15)
Fault detection and exclusion scheme
While processing the NDT algorithm, the error in height and rotation angle will drift with time. It leads to error propagation and NDT failure, resulting from the miss-registration caused by the large offset in translation and rotation output from previous NDT (Al et al., 2017). To this end, FDE mechanism (Akai et al., 2017) is added after the NDT process to detect the faults and exclude them from LiDAR-SLAM-based navigation solution. The concept of FDE applied in this research is described in Fig. 5. The FDE mechanism operates in two steps: the LiDAR Mapping and LiDAR Odometry. Due to NDT being sensitive to the height offset, FDE for LiDAR Mapping can constraint the translation offset in height and the rotation in pitch and roll, then update the transformation matrix to register the point cloud again and build robust dynamic map for autonomous vehicle without HD map. For the FDE in LiDAR Odometry, it compares the initial guess and the initial guess added with NDT, then outputs a relatively accurate one which is treated as the navigation solution outcome of LiDAR-SLAM.