This section briefly introduces the principles of the SVM LOS/NLOS classifier and discusses the features used for LOS/NLOS classification.

### The architecture of the classifier

The architecture of the SVM classifier contains two stages, offline and online, as shown in Fig. 3. In the offline stage, features for the machine learning approach are extracted from the raw GNSS measurements and labeled using the 3D building models, the ground truth position, and the satellite positions calculated from the GNSS ephemeris. The elevation and azimuth angles of each satellite are calculated from the satellite position and the ground truth position, and the satellite elevation is then compared with the elevation angle of the building edges at the same azimuth angle. A satellite is labeled LOS if its elevation angle is higher than the maximum elevation angle of the buildings at the same azimuth angle, and NLOS otherwise. Finally, the offline labeled dataset is used to train a linear SVM classifier. For a linear SVM classifier, the classification score is calculated by:

$$Score\left( x \right) = \left( {x/s} \right)^{T} \beta + b,$$

(3)

where *x* is the machine learning feature vector, and *s*, *β*, and *b* denote the kernel scale, the vector of fitted linear coefficients, and the bias of the linear SVM classifier, respectively. The predicted LOS/NLOS label is calculated by:

$${\text{Label}} = \left\{ {\begin{array}{*{20}l} {LOS} \hfill & {{\text{Score}} \ge 0} \hfill \\ {NLOS} \hfill & {{\text{Score}} < 0} \hfill \\ \end{array} } \right.,$$

(4)

In the online stage, the feature vector extracted from the raw GNSS measurements is substituted into the SVM score formula to obtain the predicted satellite visibility.
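The offline labeling rule and the online scoring of Eqs. (3) and (4) can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation: the function names, the coefficient values, and the feature ordering `[SNR, NPR, EA, PRC]` are hypothetical, and the building-edge elevation is assumed to have been looked up already from the 3D building model at the satellite's azimuth.

```python
from typing import Sequence


def label_by_elevation(sat_elev_deg: float, building_elev_deg: float) -> str:
    """Offline labeling rule: LOS if the satellite is above the building edge.

    building_elev_deg is assumed to be the maximum building elevation angle
    at the satellite's azimuth, taken from the 3D building model.
    """
    return "LOS" if sat_elev_deg > building_elev_deg else "NLOS"


def svm_score(x: Sequence[float], beta: Sequence[float], b: float, s: float = 1.0) -> float:
    """Linear SVM score of Eq. (3): (x / s)^T beta + b."""
    if len(x) != len(beta):
        raise ValueError("feature and coefficient vectors must have equal length")
    return sum((xi / s) * bi for xi, bi in zip(x, beta)) + b


def predict_label(score: float) -> str:
    """Decision rule of Eq. (4): LOS when the score is non-negative."""
    return "LOS" if score >= 0 else "NLOS"


# Hypothetical parameters and feature vector; these are not fitted values.
beta = [0.1, -2.0, 0.05, -0.5]
b = -4.0
s = 1.0
features = [45.0, 0.1, 60.0, 0.2]  # assumed order: SNR, NPR, EA, PRC

score = svm_score(features, beta, b, s)
label = predict_label(score)
```

In practice *s*, *β*, and *b* would come from the classifier trained on the offline labeled dataset, and the features would normally be scaled consistently between training and prediction.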

### Features of the machine learning model

According to our preliminary results (Xu et al. 2018), LOS and NLOS signals differ in the following features:

### Signal-to-noise ratio (SNR)

The SNR is a conventional variable for predicting satellite visibility, because the reflection and refraction experienced by an NLOS signal during transmission decrease the signal strength in most cases. The signal strength of each received signal can be obtained from the raw GNSS measurements in receiver independent exchange format (RINEX) data. To illustrate real SNR measurements, a dataset of about 20 min was collected in an urban scenario, as shown in Fig. 4. It is evident that there are SNR regions where LOS and NLOS signals coexist, demonstrating that a simple SNR threshold classification might not work perfectly in urban environments.

### Normalized pseudorange residual (NPR)

The pseudorange residual is also a useful feature related to satellite visibility (Hsu et al. 2017). It is computed from the least squares solution, which is a conventional approach to estimating the user position. The least squares solution is computed by:

$${\text{X}} = \left( {{\text{H}}^{{\text{T}}} { }{\text{H}}} \right)^{ - 1} { }{\text{H}}^{{\text{T}}} { }{{{\uprho}}},$$

(5)

where \({\varvec{X}}\) is a vector containing the estimated receiver position and clock bias, \({\varvec{H}}\) is a matrix of unit LOS vectors pointing from the receiver to the satellites, and \({\varvec{\rho}}\) denotes the pseudorange measurements. After the iterations, the pseudorange residual of each satellite is expressed as:

$${\text{Pr}} = {{{\uprho}}} - {\text{H}} \cdot {\text{X}},$$

(6)

However, the estimated position in urban areas often contains a large error, so the pseudorange residual cannot clearly indicate the difference between LOS and NLOS signals. For that reason, the pseudorange residuals of each epoch are normalized as:

$$NPR = \frac{{Pr_{i} - Pr_{min} }}{{Pr_{max} - Pr_{min} }},$$

(7)

where \(Pr_{i}\) is the pseudorange residual of satellite *i*, and \(Pr_{max}\) and \(Pr_{min}\) are the maximum and minimum pseudorange residuals of each epoch. A demonstration of the normalized pseudorange residual is shown in Fig. 5. With an accurate position estimate, the normalized pseudorange residual of an LOS signal is closer to zero than that of an NLOS signal, since the NLOS signal travels an additional propagation path that lengthens its pseudorange.
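As an illustration of Eqs. (6) and (7), the following sketch computes the residuals for one epoch and applies the min-max normalization. It assumes the state vector has already been estimated by least squares (Eq. 5); the function names and the numerical values are hypothetical, and returning zeros when all residuals are equal is an assumption made here because Eq. (7) is undefined in that case.

```python
from typing import List, Sequence


def pseudorange_residuals(
    rho: Sequence[float],
    H: Sequence[Sequence[float]],
    X: Sequence[float],
) -> List[float]:
    """Residuals of Eq. (6): Pr = rho - H . X, one value per satellite."""
    return [
        rho_i - sum(h_ij * x_j for h_ij, x_j in zip(row, X))
        for rho_i, row in zip(rho, H)
    ]


def normalize_residuals(residuals: Sequence[float]) -> List[float]:
    """Min-max normalization of one epoch's residuals, Eq. (7)."""
    if not residuals:
        return []
    pr_min, pr_max = min(residuals), max(residuals)
    span = pr_max - pr_min
    if span == 0.0:
        # Assumption: Eq. (7) is undefined when all residuals are equal.
        return [0.0 for _ in residuals]
    return [(pr - pr_min) / span for pr in residuals]


# Hypothetical residuals in metres for one epoch (illustrative, not measured).
epoch_residuals = [-2.0, 0.0, 3.0, 18.0]
npr = normalize_residuals(epoch_residuals)

# Toy linear system, only to exercise Eq. (6); not a real GNSS geometry.
toy_residuals = pseudorange_residuals(
    rho=[1.5, 2.0, 2.0],
    H=[[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]],
    X=[1.0, 2.0],
)
```

Because the normalization is applied per epoch, the smallest residual in each epoch maps to 0 and the largest to 1, regardless of the absolute size of the positioning error.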

### Elevation angle (EA)

The elevation angle of a satellite is related to its visibility, mainly because a signal arriving at a higher elevation angle is less likely to be blocked by the surrounding buildings. An existing classification algorithm also uses the elevation angle for LOS/NLOS classification (Yozevitch et al. 2016).
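For reference, the elevation and azimuth angles can be obtained by rotating the receiver-to-satellite vector from ECEF into local east-north-up axes. The sketch below uses this standard transformation; the function name is hypothetical, and it assumes that the receiver's geodetic latitude and longitude are available alongside its ECEF coordinates.

```python
import math
from typing import Sequence, Tuple


def elevation_azimuth(
    rx_ecef: Sequence[float],
    sat_ecef: Sequence[float],
    rx_lat_deg: float,
    rx_lon_deg: float,
) -> Tuple[float, float]:
    """Return (elevation, azimuth) in degrees of a satellite seen from the receiver."""
    lat = math.radians(rx_lat_deg)
    lon = math.radians(rx_lon_deg)
    dx, dy, dz = (s - r for s, r in zip(sat_ecef, rx_ecef))

    # Rotate the ECEF line-of-sight vector into local east-north-up (ENU) axes.
    east = -math.sin(lon) * dx + math.cos(lon) * dy
    north = (-math.sin(lat) * math.cos(lon) * dx
             - math.sin(lat) * math.sin(lon) * dy
             + math.cos(lat) * dz)
    up = (math.cos(lat) * math.cos(lon) * dx
          + math.cos(lat) * math.sin(lon) * dy
          + math.sin(lat) * dz)

    elevation = math.degrees(math.atan2(up, math.hypot(east, north)))
    azimuth = math.degrees(math.atan2(east, north)) % 360.0  # clockwise from north
    return elevation, azimuth


# Illustrative geometry: receiver on the equator at zero longitude.
rx = (6378137.0, 0.0, 0.0)
sat_east = (6378137.0 + 1.0e7, 1.0e7, 0.0)   # offset equally up and east
sat_north = (6378137.0 + 1.0e7, 0.0, 1.0e7)  # offset equally up and north

el_e, az_e = elevation_azimuth(rx, sat_east, 0.0, 0.0)
el_n, az_n = elevation_azimuth(rx, sat_north, 0.0, 0.0)
```

The resulting elevation serves both as the EA feature and, together with the azimuth, as the input to the comparison against the building edges in the offline labeling stage.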

### Pseudorange rate consistency (PRC)

The pseudorange rate is the rate of change of the pseudorange measurement between two epochs and is expressed as:

$$\Delta P_{t}^{i} = \rho_{t}^{i} - \rho_{t - 1}^{i} ,$$

(8)

where \(\rho_{t}^{i}\) and \(\rho_{t - 1}^{i}\) are the pseudorange measurements of satellite *i* at epochs *t* and *t*-1. The pseudorange measurement in the raw data comes from the receiver code tracking loop. Meanwhile, the Doppler shift of the signal is estimated by the receiver frequency tracking loop, and the pseudorange rate can be related to the Doppler shift by:

$$\hat{P}_{t}^{i} = \left( { - \lambda_{i} \cdot f_{d,i} } \right)\Delta t,$$

(9)

where \(\lambda_{i}\) is the carrier wavelength, \(f_{d,i}\) is the Doppler shift measurement for satellite \(i\), and \(\Delta t\) is the time interval between the two epochs. Compared with the receiver code tracking loop, the frequency tracking loop is less affected by multipath and reflected paths, so the consistency between the pseudorange rate derived from the pseudorange measurements and that derived from the Doppler shift can reveal the influence of NLOS signals. The pseudorange rate consistency is expressed by:

$$PRC = \hat{P}_{t}^{i} - \Delta P_{t}^{i} ,$$

(10)

where \(\hat{P}_{t}^{i}\) and \(\Delta P_{t}^{i}\) are the pseudorange rates from the Doppler shift and the pseudorange measurements, respectively. The pseudorange rate consistency of LOS signals is more stable and has a smaller absolute value than that of NLOS signals, as shown in Fig. 6.
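The computation in Eqs. (8) to (10) can be illustrated for a single satellite as follows. This is a sketch under stated assumptions: the function names and measurement values are hypothetical, the GPS L1 carrier frequency is used to derive the wavelength, and a positive Doppler shift is taken to mean a decreasing range.

```python
SPEED_OF_LIGHT_M_S = 299_792_458.0
GPS_L1_FREQ_HZ = 1_575.42e6
GPS_L1_WAVELENGTH_M = SPEED_OF_LIGHT_M_S / GPS_L1_FREQ_HZ  # about 0.1903 m


def pseudorange_change(rho_t: float, rho_prev: float) -> float:
    """Eq. (8): change in pseudorange between consecutive epochs, in metres."""
    return rho_t - rho_prev


def doppler_range_change(wavelength_m: float, doppler_hz: float, dt_s: float) -> float:
    """Eq. (9): range change over dt_s predicted from the Doppler shift.

    Sign convention (assumed): positive Doppler means the range is decreasing.
    """
    return -wavelength_m * doppler_hz * dt_s


def pseudorange_rate_consistency(
    rho_t: float,
    rho_prev: float,
    doppler_hz: float,
    dt_s: float = 1.0,
    wavelength_m: float = GPS_L1_WAVELENGTH_M,
) -> float:
    """Eq. (10): Doppler-derived change minus code-derived change."""
    return (doppler_range_change(wavelength_m, doppler_hz, dt_s)
            - pseudorange_change(rho_t, rho_prev))


# Hypothetical measurements for one satellite at a 1 s epoch interval.
prc = pseudorange_rate_consistency(
    rho_t=20_000_190.0,
    rho_prev=20_000_000.0,
    doppler_hz=-1000.0,
)
```

Here the Doppler shift predicts a range increase of about 190.29 m while the code measurements changed by 190 m, so the PRC is roughly 0.29 m. A code measurement corrupted by a reflected path would be expected to produce a larger discrepancy.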

After being generated from the GNSS measurements, the four features are used in the proposed SVM classifier.