Development and evaluation of the refined zenith tropospheric delay (ZTD) models

The tropospheric delay is a significant error source in Global Navigation Satellite System (GNSS) positioning and navigation. It is usually projected into the zenith direction by using a mapping function. It is therefore particularly important to establish a model that can provide a stable and accurate Zenith Tropospheric Delay (ZTD). Because of the regional accuracy differences and poor stability of the traditional ZTD models, this paper proposes two methods to refine the Hopfield and Saastamoinen ZTD models. One adds annual and semi-annual periodic terms, and the other is based on a Back-Propagation Artificial Neural Network (BP-ANN). Using 5-year data from 2011 to 2015 collected at 67 GNSS reference stations in China and its surrounding regions, the four refined models were constructed. The tropospheric products at these GNSS stations were derived from the site-wise Vienna Mapping Function 1 (VMF1). The spatial analysis, temporal analysis, and residual distribution analysis for all six models were conducted using the data from 2016 to 2017. The results show that the refined models effectively improve the accuracy compared with the traditional models. For the Hopfield model, the improvements in the Root Mean Square Error (RMSE) and bias reached 24.5/49.7 and 34.0/52.8 mm, respectively. These values became 8.8/26.7 and 14.7/28.8 mm when the Saastamoinen model was refined using the two methods. This exploration is conducive to GNSS navigation and positioning and GNSS meteorology by providing more accurate tropospheric prior information.


Introduction
While propagating through the neutral atmosphere, Global Navigation Satellite System (GNSS) signals from a satellite to a receiver are delayed and bent due to their interaction with dry gases and water vapor, which is called the tropospheric delay (Bevis et al., 1992; Yao et al., 2018). It is a significant error source in GNSS positioning and navigation, as the delay varies from 2 to 20 m depending on the elevation angle of a satellite (Chen et al., 2020; Penna et al., 2001). The tropospheric delay is generally projected into the zenith direction by using a mapping function, and the Zenith Tropospheric Delay (ZTD) is used to describe the tropospheric influence on the signal propagation. An accurate ZTD is not only an important parameter for GNSS navigation and positioning (Duan et al., 1996; Meng, 2002; Zhang et al., 2017; Zumberge et al., 1997), but also the basis for retrieving Precipitable Water Vapor (PWV) in GNSS meteorology (Li et al., 2014; Yang et al., 2020a; Zheng et al., 2018). A stable and accurate ZTD model is necessary to meet these requirements.
Two types of ZTD models are commonly used: (1) ZTD models driven by the meteorological parameters measured at a site, such as the Hopfield model, the Saastamoinen model and the Black model, which can achieve centimeter-level accuracy when accurately measured meteorological parameters are supplied (Hopfield, 1969; Black & Eisner, 1984; Saastamoinen, 1972); (2) the empirical ZTD models, which require only the location of a site and the time of interest as input. Some empirical models, such as the GZTD series (Yang et al., 2020b; Yao et al., 2013, 2016) and the models of Li et al. (2012, 2015, 2018), are established by trend analysis of long-term ZTD values. Other empirical models, such as the GPT series (Boehm et al., 2007; Bohm et al., 2015; Lagler et al., 2013; Landskron & Boehm, 2018), first model various meteorological quantities and then estimate the ZTD from these estimated meteorological parameters using the formulas of the Saastamoinen model, the Hopfield model, and other models. Thus, the ZTDs estimated by these two kinds of empirical models are generally less accurate than those obtained from the Saastamoinen and Hopfield models with measured meteorological data. However, several studies confirmed that the Saastamoinen and Hopfield models tend to perform poorly when driven by regional meteorological data in a local area (Yang et al., 2020c, 2021). The obvious regional differences in the accuracy of the Saastamoinen and Hopfield models arise because they were constructed from global mean meteorological data and global climate analysis, which makes it difficult for them to describe the ZTD characteristics in certain areas. Therefore, it is necessary to perform a regional refinement of the ZTD models, which can not only optimize the performance of the corresponding parameter models in a specific area, but also improve the accuracy of the empirical ZTD models.

Satellite Navigation
In this paper, two regional refinement methods are proposed for the Hopfield and Saastamoinen models. The first method introduces annual and semi-annual periodic terms into the Saastamoinen and Hopfield models and uses least-squares fitting to establish the regional refined models. The second method adopts a Back-Propagation Artificial Neural Network (BP-ANN) to perform error compensation for the Saastamoinen and Hopfield models.

The refined models
The Hopfield and Saastamoinen models are the most widely used ZTD models. They use the surface pressure, temperature, and water vapor pressure to estimate the ZTD above a specific site. The Hopfield ZTD model is expressed as follows (Hopfield, 1969):
$$ZTD_H = 1.552 \times 10^{-5}\,\frac{P_s}{T_s}\,(h_d - h_s) + 7.46512 \times 10^{-2}\,\frac{e_s}{T_s^2}\,(h_w - h_s) \tag{1}$$
where $ZTD_H$ denotes the ZTD estimate of the Hopfield model; $P_s$, $T_s$ and $e_s$ represent the pressure (in hPa), temperature (in K), and water vapor pressure (in hPa), respectively; $h_s$ denotes the height of the site above the mean sea level (in m); and $h_d = 40136 + 148.72\,(T_s - 273.15)$ m and $h_w = 11000$ m are the heights of the tropopause and the wet tropopause, respectively.
The Saastamoinen ZTD model is represented by the following equation (Saastamoinen, 1972):
$$ZTD_S = \frac{0.002277}{f(\phi, h_s)}\left[P_s + \left(\frac{1255}{T_s} + 0.05\right) e_s\right] \tag{2}$$
where $ZTD_S$ denotes the ZTD estimate of the Saastamoinen model, $\phi$ represents the latitude of the site, and $f$ is the correction of gravitational acceleration caused by the rotation of the Earth, which can be calculated by the following formula, with $h_s$ expressed in km:
$$f(\phi, h_s) = 1 - 0.00266\cos(2\phi) - 0.00028\,h_s$$
Research on the temporal and spatial distributions of ZTD found that ZTD has obvious annual and semi-annual variations (Mao et al. 2013; Myers et al. 2013). The idea of compensating the Hopfield model by adding annual and semi-annual periodic terms has been used before (Yang et al. 2020c). In this paper, we refined the Hopfield and Saastamoinen models in this way. The resulting models, called the Hop-r1 and Saas-r1 models, are expressed as follows:
$$ZTD_{hr1} = ZTD_H + a_{11}\cos\left(\frac{2\pi\,doy}{365.25}\right) + a_{12}\sin\left(\frac{2\pi\,doy}{365.25}\right) + a_{13}\cos\left(\frac{4\pi\,doy}{365.25}\right) + a_{14}\sin\left(\frac{4\pi\,doy}{365.25}\right) + c_1$$
$$ZTD_{sr1} = ZTD_S + a_{21}\cos\left(\frac{2\pi\,doy}{365.25}\right) + a_{22}\sin\left(\frac{2\pi\,doy}{365.25}\right) + a_{23}\cos\left(\frac{4\pi\,doy}{365.25}\right) + a_{24}\sin\left(\frac{4\pi\,doy}{365.25}\right) + c_2$$
where $ZTD_{hr1}$ and $ZTD_{sr1}$ denote the ZTD estimates of the refined Hopfield and Saastamoinen models, $doy$ represents the day of year, $(a_{11}, a_{12})$ and $(a_{21}, a_{22})$ are the annual amplitudes for the two refined models, $(a_{13}, a_{14})$ and $(a_{23}, a_{24})$ are their semi-annual amplitudes, and $c_1$ and $c_2$ denote their constant terms. To calculate the ten coefficients in the above two refined models, we used the accurate tropospheric products of the GNSS stations provided by the site-wise Vienna Mapping Function 1 (VMF1) and adopted the least-squares method.
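The two parameter models and the periodic-term refinement described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation used in this study. The numerical constants are those of the commonly quoted forms of the Hopfield and Saastamoinen formulas, the period of 365.25 days and the cosine/sine ordering of the coefficients are assumptions, and all function names are hypothetical. The fit is checked on synthetic data only.

```python
import math
import numpy as np

def ztd_hopfield(p_s, t_s, e_s, h_s):
    """Hopfield ZTD in metres. p_s, e_s in hPa; t_s in K; h_s in m."""
    h_d = 40136.0 + 148.72 * (t_s - 273.15)  # height of the tropopause (m)
    h_w = 11000.0                            # height of the wet tropopause (m)
    dry = 1.552e-5 * p_s / t_s * (h_d - h_s)
    wet = 7.46512e-2 * e_s / t_s**2 * (h_w - h_s)
    return dry + wet

def ztd_saastamoinen(p_s, t_s, e_s, lat_deg, h_s):
    """Saastamoinen ZTD in metres. h_s is given in m and converted to km."""
    f = (1.0 - 0.00266 * math.cos(2.0 * math.radians(lat_deg))
         - 0.00028 * (h_s / 1000.0))
    return 0.002277 * (p_s + (1255.0 / t_s + 0.05) * e_s) / f

def design_matrix(doy):
    """Columns: annual cos/sin, semi-annual cos/sin, constant term."""
    w = 2.0 * np.pi * np.asarray(doy, dtype=float) / 365.25  # assumed period
    return np.column_stack(
        [np.cos(w), np.sin(w), np.cos(2 * w), np.sin(2 * w), np.ones_like(w)]
    )

def fit_periodic_correction(doy, ztd_model, ztd_ref):
    """Least-squares estimate of (a1, a2, a3, a4, c) from reference ZTD."""
    residual = np.asarray(ztd_ref, dtype=float) - np.asarray(ztd_model, dtype=float)
    coeffs, *_ = np.linalg.lstsq(design_matrix(doy), residual, rcond=None)
    return coeffs

def apply_periodic_correction(doy, ztd_model, coeffs):
    """Refined ZTD = model ZTD + periodic terms + constant."""
    return np.asarray(ztd_model, dtype=float) + design_matrix(doy) @ coeffs

if __name__ == "__main__":
    # Synthetic check: a constant model ZTD plus a known periodic error.
    doy = np.arange(1, 366)
    model = np.full(doy.shape, ztd_hopfield(1013.25, 288.15, 10.0, 0.0))
    truth = apply_periodic_correction(doy, model, np.array([0.03, 0.0, 0.0, -0.01, 0.05]))
    print(fit_periodic_correction(doy, model, truth))
```

In practice the five coefficients of each refined model would be fitted against the site-wise VMF1 reference ZTD rather than against synthetic values.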
An Artificial Neural Network (ANN) is designed to simulate the way in which the human brain analyzes and processes information, and is widely used for classification and regression, including in the geoscience field (Yang et al. 2020d). Composed of an input layer, an output layer and one or more hidden layers, the ANN can efficiently handle the relations between input and output variables and produce better results as more data become available. In this paper, we used the BP-ANN to construct the relationship between the ZTD estimated with the models and the true ZTD values, in order to compensate the ZTD error. Specifically, four input parameters are selected in this research: temperature, pressure, water vapor pressure, and the ZTD estimates of the parameter models. The output parameter is the true ZTD value. We refined the Hopfield and Saastamoinen models based on the BP-ANN, and the resulting models are called the Hop-r2 and Saas-r2 models, respectively. Figure 1 is a flowchart showing the basic process of constructing these two models based on the BP-ANN. The site-wise VMF1 tropospheric products contain the meteorological parameters and the true ZTD values of the selected GNSS stations and can therefore provide the dataset for BP-ANN training. We divide the dataset into a training set and a validation set, accounting for 75% and 25% of the total data, respectively. The training set is used to adjust the weights of the neural network, and the validation set is used to limit overfitting. The BP-ANN structure used for the two refined models is as follows: the input layer has four nodes, matching the number of input parameters; the output layer has a single node, the true ZTD value; and there are two hidden layers with four nodes. The training and activation functions are Levenberg-Marquardt and hyperbolic tangent, respectively. The values of 6000, 0.01 and 0.001 were selected for the maximum training number, learning rate and error threshold, respectively.
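The network layout described above can be illustrated with a small NumPy sketch. It is a minimal example under several assumptions: the hidden layers are taken to have four nodes each, the inputs and target are assumed to be standardized beforehand, and plain batch gradient descent is swapped in for the Levenberg-Marquardt training named in the text, which is considerably more involved. All function names are hypothetical and the data are synthetic.

```python
import numpy as np

def init_network(rng, sizes=(4, 4, 4, 1)):
    """Weights and biases for a 4-4-4-1 network (two hidden layers assumed)."""
    return [(rng.normal(0.0, 0.5, (n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Return the activations of every layer; tanh hidden, linear output."""
    activations = [x]
    for i, (w, b) in enumerate(params):
        z = activations[-1] @ w + b
        activations.append(z if i == len(params) - 1 else np.tanh(z))
    return activations

def mse(params, x, y):
    return float(np.mean((forward(params, x)[-1] - y) ** 2))

def train_step(params, x, y, learning_rate):
    """One back-propagation step of batch gradient descent (not LM)."""
    acts = forward(params, x)
    delta = (acts[-1] - y) / len(x)  # gradient of 0.5 * MSE at the output
    updated = []
    for i in reversed(range(len(params))):
        w, b = params[i]
        grad_w = acts[i].T @ delta
        grad_b = delta.sum(axis=0)
        if i > 0:
            # Propagate through the tanh of the previous hidden layer.
            delta = (delta @ w.T) * (1.0 - acts[i] ** 2)
        updated.append((w - learning_rate * grad_w, b - learning_rate * grad_b))
    return updated[::-1]

def train(x_train, y_train, x_val, y_val, max_epochs=6000,
          learning_rate=0.01, error_threshold=0.001, seed=0):
    """Keep the weights with the lowest validation error to limit overfitting."""
    params = init_network(np.random.default_rng(seed))
    best_params, best_val = params, mse(params, x_val, y_val)
    for _ in range(max_epochs):
        params = train_step(params, x_train, y_train, learning_rate)
        val = mse(params, x_val, y_val)
        if val < best_val:
            best_params, best_val = params, val
        if mse(params, x_train, y_train) < error_threshold:
            break
    return best_params, best_val

if __name__ == "__main__":
    # Synthetic standardized inputs: temperature, pressure, water vapor
    # pressure and model ZTD; the target stands in for the true ZTD.
    rng = np.random.default_rng(1)
    x = rng.normal(size=(400, 4))
    y = 0.5 * x[:, :1] - 0.3 * x[:, 3:]
    n_train = int(0.75 * len(x))  # 75 % training, 25 % validation
    _, val_error = train(x[:n_train], y[:n_train], x[n_train:], y[n_train:])
    print("validation MSE:", val_error)
```

The defaults mirror the maximum training number, learning rate and error threshold quoted in the text; with a different optimizer these values would not necessarily be appropriate.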

Analysis of the refined ZTD models
The 5-year data from 2011 to 2015 collected at 67 GNSS stations in China and surrounding regions are used to construct the above-mentioned four refined models. To assess the performance of the proposed ZTD models, we evaluated the ZTD values estimated with the different models using the true ZTD values from 2016 to 2017 provided by the site-wise VMF1 as references. Thus, there are six models for estimating ZTD in the comparisons: the Hopfield model, the refined Hopfield model with periodic terms (Hop-r1), the refined Hopfield model based on BP-ANN (Hop-r2), the Saastamoinen model, the refined Saastamoinen model with periodic terms (Saas-r1), and the refined Saastamoinen model based on BP-ANN (Saas-r2). Two statistical quantities, i.e., bias and Root Mean Square Error (RMSE), are chosen as the criteria to assess the performance of each model.
The ZTD estimates at all stations in the research area are calculated with these six models and compared with the references at the corresponding times. Figure 2 presents the maps of RMSE, which show the different performances of the six models at each site. They show the influence of site latitude on RMSE: the RMSE is always small at high latitudes and becomes large at middle and low latitudes. The two traditional models perform poorly, especially the Hopfield model. All four refined models improve the accuracy compared with the traditional models, as indicated by the color change of the points in the figure. Both refined Hopfield models still have some stations with poor accuracy. The refined Saastamoinen models do not show this phenomenon, indicating that they are better than the refined Hopfield models. The refined models based on the BP-ANN perform better than those with periodic terms, indicating the advantage of the second refinement method.
The maps of bias for the six models are illustrated in Fig. 3. A negative bias appears at each site for the two traditional models. The absolute value of the bias increases as the latitude decreases for the Saastamoinen model, while the bias is a large negative value at most of the stations for the Hopfield model. After the refinement, the biases of the four refined models at each site are closer to 0, with small positive values at some stations. Note that the Hop-r1 model hardly shows an improvement at some stations. The refined models based on the BP-ANN give the best results, which is consistent with Fig. 2.
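The two criteria can be written compactly. The sketch below assumes that the difference is taken as model ZTD minus reference ZTD, so that a negative bias means the model underestimates the reference; the function name and the sample values are hypothetical.

```python
import math

def bias_and_rmse(ztd_model, ztd_ref):
    """Bias and RMSE of model-minus-reference ZTD differences."""
    diffs = [m - r for m, r in zip(ztd_model, ztd_ref)]
    n = len(diffs)
    bias = sum(diffs) / n
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    return bias, rmse

if __name__ == "__main__":
    # Illustrative values in mm, not taken from the paper.
    model = [2400.0, 2410.0, 2390.0]
    reference = [2410.0, 2410.0, 2410.0]
    print(bias_and_rmse(model, reference))
```

Because the RMSE combines the bias and the scatter of the differences, a model with a large systematic offset has a large RMSE even when its scatter is small.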
The mean RMSE and bias, as well as their maximum and minimum values, of the differences between the ZTD estimated with the six models and the reference ZTD at all stations are summarized in Table 1.
The biases of the Hopfield and Saastamoinen models show apparent seasonal effects with peaks in summer; the maximum negative biases occur in July and are −83.5 and −54.2 mm, respectively. This indicates that the water vapor in the research area changes greatly in summer and that the ZTD estimates of the two traditional models are smaller than the reference ZTD, so error compensation is necessary. Correspondingly, their RMSEs increase from spring to summer and then decrease from summer to autumn, with maximum values of 95.8 mm and 70.6 mm, respectively. For the four refined models, the monthly mean biases and RMSEs show no obvious seasonal changes, and the monthly fluctuations are also small. For example, the maximum and minimum monthly mean RMSEs of the Hop-r2 model appear in July and December, with values of 43.2 mm and 32.1 mm, respectively. For the Saas-r2 model, these values are 42.7 mm and 31.9 mm, in October and December, respectively.
Further, the performance of the six models in estimating ZTD at different Coordinated Universal Time (UTC) epochs is analyzed. Their RMSE distributions at the four UTC epochs are illustrated in Fig. 8. Based on the accuracy from low to high the six models at each UTC epoch are ranked as the Hopfield model, Saastamoinen model, and Saas-r2 model. The accuracy of the Hopfield and Saastamoinen models is similar at the different UTC epochs. The RMSEs of the four refined models at UTC 6:00 and UTC 12:00 are slightly smaller than those at UTC 0:00 and UTC 18:00. The Saas-r2 model has the best performance at each UTC epoch, with RMSEs of 37.8, 34.4, 34.7 and 36.9 mm at the four epochs, respectively.
The histogram of the ZTD residuals, namely the differences between the model-derived ZTDs and the reference ZTDs, is shown in Fig. 9. Most of the ZTD residuals calculated with the Hopfield and Saastamoinen models are less than 0. The percentages of the ZTD residuals more negative than −100/−50 mm are 20.5%/52.3% and 8.7%/32.5% for the Hopfield and Saastamoinen models, respectively, indicating that the residual distribution of the Saastamoinen model is slightly better than that of the Hopfield model. After the refinement, the Hop-r1 model makes the ZTD residual distribution closer to the normal distribution, and the Saas-r2 model brings more ZTD residuals closer to zero. The two refined models based on the BP-ANN achieve the best ZTD residual distributions, which basically follow the normal distribution and are concentrated around 0 mm. For example, the percentages of the ZTD residuals in the range of −10 to 10 mm are 29.0% and 28.6% for the Hop-r2 model and the Saas-r2 model, respectively. When the range changes to −50 to 50 mm, these percentages become 81.5% and 82.4% for the two refined models.
Moreover, the Standard Deviation (SD) of the ZTD residuals is computed for the six models. These values are 50.4 mm and 45.3 mm for the Hopfield and Saastamoinen models, respectively. The two corresponding refined models improve the SD of the Hopfield model by 6.7% and 23.2%, respectively. The Saas-r1 and Saas-r2 models achieve SD values of 43.1 mm and 38.0 mm, an improvement of about 4.9% and 16.1% over the Saastamoinen model.
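The residual statistics quoted above are simple counting measures. The following sketch shows how such percentages could be computed from a list of residuals in mm; the function names and the sample residuals are hypothetical and are not data from this study.

```python
def percent_within(residuals_mm, lower_mm, upper_mm):
    """Percentage of residuals inside the closed interval [lower, upper]."""
    inside = sum(1 for r in residuals_mm if lower_mm <= r <= upper_mm)
    return 100.0 * inside / len(residuals_mm)

def percent_below(residuals_mm, threshold_mm):
    """Percentage of residuals more negative than the threshold."""
    below = sum(1 for r in residuals_mm if r < threshold_mm)
    return 100.0 * below / len(residuals_mm)

if __name__ == "__main__":
    # Illustrative model-minus-reference residuals in mm.
    residuals = [-120.0, -60.0, -8.0, 3.0, 9.0, 40.0, 55.0, -30.0]
    print(percent_within(residuals, -10.0, 10.0))
    print(percent_within(residuals, -50.0, 50.0))
    print(percent_below(residuals, -100.0))
```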
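The percentage improvements can be checked with a short calculation, assuming that the improvement is defined relative to the SD of the traditional model. The Saastamoinen figures reproduce the quoted 4.9% and 16.1%. The refined Hopfield SDs are not given in the text, so the values computed below are only what the quoted percentages imply under this assumed definition. Function names are hypothetical.

```python
def relative_improvement(baseline_mm, refined_mm):
    """Improvement of the refined SD over the baseline SD, in percent."""
    return 100.0 * (baseline_mm - refined_mm) / baseline_mm

def implied_refined_sd(baseline_mm, improvement_percent):
    """Refined SD implied by a baseline SD and a relative improvement."""
    return baseline_mm * (1.0 - improvement_percent / 100.0)

if __name__ == "__main__":
    # Saastamoinen: 45.3 mm baseline, 43.1 mm (Saas-r1) and 38.0 mm (Saas-r2).
    print(round(relative_improvement(45.3, 43.1), 1))
    print(round(relative_improvement(45.3, 38.0), 1))
    # Hopfield: 50.4 mm baseline with quoted improvements of 6.7 % and 23.2 %.
    print(round(implied_refined_sd(50.4, 6.7), 1))
    print(round(implied_refined_sd(50.4, 23.2), 1))
```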

Conclusions
To refine the Saastamoinen model and the Hopfield model, two methods were introduced, namely the method of adding annual and semi-annual periodic terms and the method based on the BP-ANN. Four refined ZTD models were thus established using the ZTD products provided by the site-wise VMF1. Comprehensive comparisons between the four refined models and the two traditional models were conducted using 2 years of data derived from the site-wise VMF1 tropospheric products. The spatial analysis shows that the accuracy of the two traditional models varies spatially and is affected by the latitude of a site. Moreover, the accuracy of the Hopfield model becomes worse as the site height increases. The refined models can effectively overcome the above problems, especially the refined models based on the BP-ANN. For example, the mean bias and RMSE of the Hop-r2 and Saas-r2 models are −2.2/36.5 mm and −2.7/36.0 mm, respectively. The temporal analysis shows that the two traditional models exhibit an obvious seasonal effect and have the worst performance in summer. The four refined models can eliminate the seasonal influence on the estimated ZTD, and the monthly fluctuation also becomes very small. The accuracies of all six models are not affected by different UTC epochs, and the Saas-r2 model has the best performance at each UTC epoch. The analysis of residual distributions shows that the refined models improve the residual distributions compared with the traditional models, especially the models based on the BP-ANN, which make the ZTD residuals follow the normal distribution and concentrate around zero. We constructed the refined models in China and surrounding regions, which can improve the accuracy of the estimated ZTD in this region and therefore provide more accurate tropospheric information for research on GNSS navigation and positioning and GNSS meteorology.
In further research, refined models that are better suited to local areas should be explored, for example by constructing the coefficients of the refined models in the form of dense grids.
Fig. 8 The performance of the six models at different UTC epochs
Fig. 9 Histogram of the residuals between the ZTD derived from six models and the referenced ZTD