Lattice-based clustering and genetic programming for coordinate transformation in GPS applications
Highlights
► Hybrid computational intelligence techniques are used for GPS coordinate transformation.
► A lattice-based clustering method is designed and integrated with genetic programming.
► The size of lattices for clustering is determined by information entropy.
► Regression by genetic programming is trained using the data of each clustered lattice.
► The proposed method has been integrated with GPS applications for coordinate transformation.
Introduction
Coordinate transformation is an important task in georeferencing applications. With its growing popularity, the Global Positioning System (GPS) has become a convenient tool for georeferencing applications (El-Rabbany, 2002; Barbeau et al., 2010). GPS is a satellite-based navigation system that describes a position in the Earth-Centered, Earth-Fixed (ECEF) Cartesian coordinate frame. Information for calculating the coordinates of a position (latitude, longitude, and height h84 in WGS84) is formatted in NMEA-0183 and transmitted by GPS satellites (NGS, 2007; DMA, 1987). In practical GPS applications, it is frequently necessary to convert the coordinates of a position from one geographic coordinate system to another. Typical conversion methods perform a level-wise transformation by referring to various mathematical models. Such models usually involve complicated, nonlinear, algebraic formulas and parameters associated with a specific geodetic datum. A geodetic datum is a reference system officially established for a specific area and is used as a national or continental standard for positioning. For example, TM2 is a 2-D geodetic datum based on GRS67 and has been widely used in many existing GPS applications in the authors' local area. In these applications, the coordinates need to be transformed from WGS84 to TWD97/TWD67 in TM2 to become useful (Tseng and Chang, 1999). Fig. 1 depicts the conversion process.
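As an illustration of one level in such a transformation chain, the following minimal sketch converts WGS84 geodetic coordinates to ECEF Cartesian coordinates using the standard closed-form ellipsoid formulas. It is not the TWD97/TWD67 conversion discussed here; the function name `geodetic_to_ecef` is chosen for illustration, and only the published WGS84 ellipsoid constants are assumed.

```python
import math

# WGS84 ellipsoid constants (standard published values).
WGS84_A = 6378137.0                    # semi-major axis, metres
WGS84_F = 1.0 / 298.257223563          # flattening
WGS84_E2 = WGS84_F * (2.0 - WGS84_F)   # first eccentricity squared


def geodetic_to_ecef(lat_deg, lon_deg, h):
    """Convert WGS84 latitude/longitude (degrees) and ellipsoidal
    height (metres) to ECEF X, Y, Z (metres)."""
    lat = math.radians(lat_deg)
    lon = math.radians(lon_deg)
    # Prime-vertical radius of curvature at this latitude.
    n = WGS84_A / math.sqrt(1.0 - WGS84_E2 * math.sin(lat) ** 2)
    x = (n + h) * math.cos(lat) * math.cos(lon)
    y = (n + h) * math.cos(lat) * math.sin(lon)
    z = (n * (1.0 - WGS84_E2) + h) * math.sin(lat)
    return x, y, z


# A point on the equator at the prime meridian lies on the X axis.
x0, y0, z0 = geodetic_to_ecef(0.0, 0.0, 0.0)
```

Even this single step needs a square root and several trigonometric evaluations, which hints at why a full multi-level conversion accumulates a large operation count.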
Consider the level-wise transformation from WGS84 to TWD97/TWD67 in the local applications. Counting the operators in the transformation formulas shows that obtaining the easting and northing of a position from its latitude, longitude, and h84 takes more than 165 and 186 floating-point operations, respectively (Doong, 2008). Recently, the demand for faster computation and lower energy consumption in handheld GPS devices has received increasing attention. For example, the embedded systems that transmit real-time GPS data for monitoring slope stability, debris flow, bridges, or crustal deformation (Mayer et al., 2010; He et al., 2011; Moschas and Stiros, 2011; He et al., 2004) are usually implemented on wireless, portable devices powered by batteries or solar cells. Because handheld GPS devices have limited resources, low power consumption is an important consideration in implementing such systems. The level-wise transformation has been used for years, and many software tools have been developed based on these formulas. Recent studies attempt to reduce the costs of GPS-based coordinate transformation. For example, Soler et al. (2012) introduced a least squares solution to determine a unique set of geodetic coordinates, with accompanying accuracy predictions. The method is based on given sets of individual (x,y,z) GPS-obtained values and their variance–covariance matrices. Shu and Li (2010) developed an iterative algorithm for the transformation from Cartesian to geodetic coordinates, using the Newton–Raphson method to solve a quartic equation of the Lagrange parameter. Civicioglu (2012) presented the differential search algorithm to solve the problem of transforming geocentric Cartesian coordinates into geodetic coordinates. Computational intelligence methods (Tierra et al., 2008; Wu et al., 2008b; Gullu, 2010) have also recently been applied to estimate the transformation of coordinates.
As in most machine learning applications, a sufficient amount of high-quality training data is the key to success. The learning results may be trivial or meaningless if too much or too little training data is used. Coordinate transformation can be rephrased as a regression problem that builds simpler transformation formulas from a training data set of GPS reference points (Wu et al., 2008a). However, the GPS reference points are not evenly distributed over the application areas. Many reference points are established in the plains, but few are in the mountains. For example, the application area under study covers about 35,915 km² and contains a variety of landforms, most of which are mountains. In the work of Wu et al. (2008a), 2700 reference points were used as training data, most of which are located on the west-side plains. The regression model is accurate and applicable in areas where many (or enough) reference points are established. However, it is not always accurate when applied in areas where few reference points are established.
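To make the regression framing concrete, the sketch below fits a straight line by ordinary least squares to a handful of reference points. It is deliberately simpler than the approach of Wu et al. (2008a): GP searches over the structure of the formula, whereas this example fixes the form to y = a + b·x. The function name `fit_line` and all data values are hypothetical and synthetic, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Synthetic "reference points": longitude (degrees) paired with an
# easting (metres) generated from a made-up linear relation.
lons = [120.0, 120.5, 121.0, 121.5, 122.0]
eastings = [150000.0 + 100000.0 * (lon - 120.0) for lon in lons]

a, b = fit_line(lons, eastings)
predicted = a + b * 121.0  # estimate the easting at an arbitrary longitude
```

Because least squares minimizes the total error over all training points, a region with many reference points dominates the fit, which is the bias described above for sparsely surveyed areas.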
This study aims to improve the performance of regression models for coordinate transformation and thereby the positioning accuracy of GPS applications. Two learning techniques, clustering and genetic programming (GP) (Koza, 1992), are integrated for this purpose. To maximize the performance of both techniques, a lattice-based clustering method is developed. A lattice is a partition of the application area, containing a subset of GPS reference points extracted from the training data set. The lattice size is determined according to the geographic locations and distribution of the data points. Clustering is then performed on lattices, not data points. GP-based regression is then performed on the data contained in the lattices belonging to the same cluster. In this way, the data points contained in different lattices can be considered to have the same importance, and the regression bias caused by the uneven distribution of data can be reduced. The experimental results show that the proposed method improves positioning accuracy beyond that of previous methods. The basic idea of the proposed method is depicted in Fig. 2.
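The partitioning step can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' formulation: the area is assumed to be a rectangle split into a z-by-z grid, the entropy is the Shannon entropy of the share of points per non-empty lattice, and each non-empty lattice is represented by its centroid. The paper's own definitions are not reproduced here, and the function names are hypothetical.

```python
import math
from collections import defaultdict


def assign_to_lattices(points, z, bounds):
    """Group (x, y) points into a z-by-z grid over
    bounds = (xmin, ymin, xmax, ymax); returns only non-empty cells."""
    xmin, ymin, xmax, ymax = bounds
    lattices = defaultdict(list)
    for x, y in points:
        # Clamp so points on the upper boundary fall in the last cell.
        col = min(int((x - xmin) / (xmax - xmin) * z), z - 1)
        row = min(int((y - ymin) / (ymax - ymin) * z), z - 1)
        lattices[(row, col)].append((x, y))
    return dict(lattices)


def occupancy_entropy(lattices):
    """Shannon entropy (bits) of the fraction of points per lattice."""
    total = sum(len(pts) for pts in lattices.values())
    return -sum(
        (len(pts) / total) * math.log2(len(pts) / total)
        for pts in lattices.values()
    )


def lattice_centroids(lattices):
    """One representative per non-empty lattice, so a dense lattice
    and a sparse lattice each count once in later clustering."""
    return {
        key: (
            sum(x for x, _ in pts) / len(pts),
            sum(y for _, y in pts) / len(pts),
        )
        for key, pts in lattices.items()
    }


# Synthetic points: three close together, one isolated.
points = [(0.1, 0.1), (0.2, 0.2), (0.15, 0.12), (0.9, 0.9)]
lattices = assign_to_lattices(points, z=2, bounds=(0.0, 0.0, 1.0, 1.0))
entropy = occupancy_entropy(lattices)
centroids = lattice_centroids(lattices)
```

With these four synthetic points and z=2, the three nearby points share one lattice and the isolated point occupies another, so a clustering step that operates on `centroids` treats both lattices as equally important.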
The remainder of this paper is organized as follows. Section 2 presents a brief review of related studies. In Section 3, the clustering problem and the regression problem related to GPS coordinate transformation are defined. The techniques of lattice-based data clustering and the GP regression models are presented in Section 4. The results of experiments are given in Section 5. Finally, we conclude this study in Section 6.
Related work
In various machine learning applications, training on clustered or partitioned data usually can produce focused results. Below are several interesting studies that integrate clustering with machine learning methods. In the work of Ari and Güvenir (2002), clustered linear regression improved the accuracy of classical linear regression by partitioning training space into subspaces for linear approximations. Haeb-Umbach (2001) integrated bottom-up clustering with maximum likelihood linear
Problem formulation
This section presents the definitions and formulations associated with the proposed method.
Lattice-based data clustering
The idea of lattice-based clustering is based on Eqs. (6)–(9). The concept of lattice-based data clustering is straightforward: the objects to be clustered are lattices, not individual data points. For example, in Fig. 3, the number of data points is 27 in G. When z=8, the number of non-empty lattices is 22; and when z=12, the number of non-empty lattices is 25. A lattice containing 20 points is therefore as important as one containing only two points. In this manner, an isolated
Experiments
In order to demonstrate the effectiveness of the proposed method, several experiments were conducted. In the experiments, training data were standard GPS reference points collected from official databases (MOI, 2006). These reference points are established by years of surveys and verified periodically. Each reference point is associated with its location's coordinates in WGS84 and TWD97. In the experiments, 2748 data points were collected in G as the source data. To evaluate the overall
Discussion and conclusion
This paper presents a lattice-based clustering strategy and its applications for GPS coordinate transformation. The proposed method first partitions training data into lattices by considering the distribution of the geographic reference points. Objects to be clustered are lattices, not data points. Data points of the same cluster are used as inputs for GP for training regression models for GPS coordinate transformation. Because the data contained in the same lattice are considered as of the
Acknowledgments
This work was partially supported by National Science Council (NSC), Taiwan, under Grants no. NSC 95-2218-E-309-004 and 99-2221-E-390-031.
References (36)
An approach for fuzzy rule-base adaptation using on-line clustering. International Journal of Approximate Reasoning (2004)
Ari and Güvenir. Clustered linear regression. Knowledge Based Systems (2002)
Civicioglu. Transforming geocentric cartesian coordinates to geodetic coordinates by using differential search algorithm. Computers & Geosciences (2012)
A fuzzy clustering method of construction of ontology-based user profiles. Advances in Engineering Software (2009)
Function identification for the intrinsic strength and elastic properties of granitic rocks via genetic programming (GP). Computers & Geosciences (2011)
Bayesian identification of clustered outliers in multiple regression. Computational Statistics & Data Analysis (2007)
Moschas and Stiros. Measurement of the dynamic displacements and of the modal frequencies of a short-span pedestrian bridge using GPS and an accelerometer. Engineering Structures (2011)
Genetic algorithm based support vector machine regression in predicting wave transmission of horizontally interlaced multi-layer moored floating pipe breakwater. Advances in Engineering Software (2012)
Shu and Li. An iterative algorithm to compute geodetic coordinates. Computers & Geosciences (2010)
Clustering noisy data in a reduced dimension space via multivariate regression trees. Pattern Recognition (2006)