Constrained optimisation of spatial sampling : a geostatistical approach
Aims

This thesis aims at the development of optimal sampling strategies for geostatistical studies. Special emphasis is on the optimal use of ancillary data, such as co-related imagery, preliminary observations and historic knowledge. Although the object of all studies is the soil, the developed methodology can be used in any scientific field dealing with geostatistics.

In summary, the objectives of this study were:

- Formulation of a range of optimisation criteria that honour a wide variety of aims in soil-related surveys.
- Development of an optimisation algorithm for spatial sampling that is able to handle these different optimisation criteria.
- Incorporation of ancillary data such as co-related imagery, historic knowledge and expert knowledge in the sampling strategy.
- Comparison of the performances of the developed optimisation algorithms with established sampling strategies.
- Application of developed optimisation techniques in practical soil sampling studies.

Outline of major tools

Chapter 2 shows how a phased sampling procedure can optimise environmental risk assessment. Using indicator kriging, probability maps of exceeding environmental threshold levels are used to direct subsequent sampling. The method is applied in a lead-pollution study in the city of Schoonhoven, The Netherlands. It is tested using stochastic simulations, and results are compared to conventional sampling schemes in terms of type-I and type-II errors. The phased sampling schemes have much lower type-I errors than the conventional sampling schemes with comparable type-II errors. They predict almost 70% of the area correctly (polluted or not-polluted), as compared to 55% by conventional schemes.

Chapter 3 introduces the spatial simulated annealing (SSA) algorithm as a general, flexible optimisation method for spatial sampling. Sampling schemes are optimised at the point level, taking into account sampling constraints and preliminary observations. Different optimisation criteria can be handled.
SSA is demonstrated using two optimisation criteria from the literature. The first (the MMSD criterion) aims at even spreading of points over the area. The second (WM criterion) optimises the realised point pair distribution for variogram estimation. For several examples it is shown that SSA is superior to conventional sampling strategies. Improvements up to 30% occur for the first criterion, while an almost complete solution is found for the second criterion. SSA is flexible in adding extra criteria.

Optimising sampling for spatial interpolation

Chapter 4 introduces the MEAN_OK algorithm in SSA, which aims at minimisation of the mean ordinary kriging variance over the research area. It is applied on texture and phosphate content on a river terrace in Thailand. First, sampling is conducted for estimation of the variogram. The variograms thus obtained are used to optimise additional sampling for minimal kriging variance using SSA. This reduces kriging variance of sand percentage from 28.2 to 23.7 (%)². The variograms are used subsequently in a geomorphologically similar area. Optimised sampling schemes for anisotropic variables differ considerably from isotropic ones. Size of kriging neighbourhood has a small but distinct effect on the schemes. The schemes are especially efficient in reducing high kriging variances near boundaries of the area.

Chapter 5 further explores the possibilities of minimising kriging variance using SSA. Next to the MEAN_OK criterion, the MAX_OK criterion is introduced, which minimises maximum kriging variance. Both criteria are compared to a regular grid. Using SSA, the mean kriging variance reduces from 40.64 [unit]² to 39.99 [unit]². The maximum kriging variance reduces from 68.83 [unit]² to 53.36 [unit]². An additional sampling scheme of 10 observations is optimised for an irregularly scattered data set of 100 observations.
This reduces the mean kriging variance from 21.62 [unit]² to 15.83 [unit]², and maximum kriging variance from 70.22 [unit]² to 34.60 [unit]². The influence of variogram parameters on the optimised sampling schemes is investigated. A Gaussian variogram produces a very different sampling scheme than an exponential variogram with similar nugget, sill and (effective) range. A very short range results in random sampling schemes, with observations separated by distances larger than twice the range. For a spherical variogram, the magnitude of the relative nugget effect does not affect the sampling schemes, except for high values.

Chapter 6 introduces the WMSD criterion into SSA, which optimises sampling using a spatial weight function. This allows distinguishing between different areas of priority. A multivariate contamination study in the Rotterdam harbour with five contaminants at two depths shows two subsequent sampling stages with two spatial weight functions. The first stage combines earlier observations and historic knowledge, with emphasis on areas with high priority. The resulting scheme shows a contamination at 17.4% of the samples, with 1.5% heavily contaminated. The second stage uses probability maps of exceeding intermediate threshold values to guide additional sampling to possible hot spots. This yields 26.7% contaminated samples, with 16.7% heavily contaminated. These include new locations that were not detected during the first stage. The WMSD criterion can be used as a valuable tool in decision-making processes.

Optimising sampling for model estimation

Chapter 7 focuses on the use of ancillary data to optimise sampling for precision farming research. Using a cheap, low-tech scoring technique, yield maps were predicted for millet in an on-farm study in Niger. Yield varied from 0 to 2500 kg ha⁻¹. Subsequently, SSA was used to optimise three different sampling schemes. Scheme 1 optimised coverage of the whole area.
Scheme 2 covered the whole yield range, and scheme 3 covered the low producing areas. Using correlation coefficients, scheme 2 found significant correlations between 5 variables and yield. Scheme 1 found only one significant correlation. Using multivariate regression of yield on soil variables, scheme 2 explained 70% of the yield variation. For scheme 1 this was only 37%. Differences between scheme 3 and scheme 1 proved to be significant for distance to shrubs, micro-relief, pH-H2O and CEC. From this study we concluded that shrubs are the main factor influencing yield by catching eroded particles and improving soil fertility. In general, we concluded that the sampling strategy of scheme 2 should be recommended for establishing yield/soil relations. Variograms of micro-relief and yield suggested that spatial correlation is largely confined to distances of 3 to 5 m.

Chapter 8 evaluates a number of sampling strategies for variogram estimation. In the first part, a regular grid is compared to a sampling scheme that optimises the point pair distribution for variogram estimation. This yields unbiased experimental variograms. However, the fluctuation of the experimental variograms is much lower with a regular grid. We concluded from this that the point pair distribution alone is not a useful optimisation criterion for variogram estimation. In the second part, additional observations selected for optimal point pair distribution are compared with randomly drawn additional observations. The random observations result in much higher standard deviations at shorter distances. We concluded from this that for additional short distance observations the point pair distribution is a very useful optimisation criterion. The third part focuses on optimal variogram use. A sampling grid of 81 observations is completed, after preliminary estimation of the variogram, with 19 additional observations for minimal kriging variance. The scheme is compared to a regular grid of 100 observations.
For an exponential field without nugget effect, the use of the phased sampling scheme reduces the mean squared kriging error from 0.39 [unit]² to 0.31 [unit]², and the maximum squared kriging error from 6.05 [unit]² to 4.24 [unit]². For a spherical field with a nugget effect of 33%, mean squared kriging error does not change and maximum squared kriging error decreases from 15.98 [unit]² to 11.52 [unit]². We concluded that minimisation of the squared kriging error is often more relevant than accurate estimation of the variogram. Taking samples just outside the area improved the quality of the prediction in terms of both kriging variance and squared kriging error.
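
The idea behind spatial simulated annealing, as summarised for Chapter 3, can be illustrated with a minimal Python sketch. It is not the thesis implementation. Several things here are assumptions made only for illustration: the MMSD criterion is interpreted as the mean distance from every evaluation node to its nearest sample (the summary only says it aims at even spreading), the study area is a unit square, and the cooling schedule, the shrinking shift length and all parameter names (`c0`, `cooling`, `n_grid`, `n_iter`) are hypothetical choices.

```python
import math
import random


def mmsd(samples, grid):
    """Mean of the shortest distances from each grid node to its nearest sample."""
    return sum(min(math.dist(g, s) for s in samples) for g in grid) / len(grid)


def spatial_simulated_annealing(n_points, size=1.0, n_grid=10, n_iter=2000,
                                c0=0.01, cooling=0.995, seed=0):
    """Search for a sampling scheme with a low MMSD value on a square area.

    Returns (initial_scheme, initial_fitness, best_scheme, best_fitness).
    All defaults are illustrative assumptions, not values from the thesis.
    """
    rng = random.Random(seed)
    # Evaluation nodes: cell centres of an n_grid x n_grid raster.
    step = size / n_grid
    grid = [((i + 0.5) * step, (j + 0.5) * step)
            for i in range(n_grid) for j in range(n_grid)]

    # Start from a random scheme inside the area.
    scheme = [(rng.uniform(0, size), rng.uniform(0, size))
              for _ in range(n_points)]
    fitness = mmsd(scheme, grid)
    initial_scheme, initial_fitness = list(scheme), fitness
    best_scheme, best_fitness = list(scheme), fitness

    c = c0  # annealing control parameter ("temperature")
    for k in range(n_iter):
        # The maximum shift shrinks linearly as the search proceeds.
        h_max = 0.5 * size * (1.0 - k / n_iter)
        idx = rng.randrange(n_points)
        x, y = scheme[idx]
        angle = rng.uniform(0.0, 2.0 * math.pi)
        h = rng.uniform(0.0, h_max)
        nx, ny = x + h * math.cos(angle), y + h * math.sin(angle)
        c *= cooling
        # Sampling constraint: a moved point must stay inside the area.
        if not (0.0 <= nx <= size and 0.0 <= ny <= size):
            continue
        candidate = scheme[:idx] + [(nx, ny)] + scheme[idx + 1:]
        cand_fitness = mmsd(candidate, grid)
        delta = cand_fitness - fitness
        # Metropolis rule: always accept improvements, and accept a
        # deterioration with a probability that falls as c decreases.
        if delta <= 0 or rng.random() < math.exp(-delta / c):
            scheme, fitness = candidate, cand_fitness
            if fitness < best_fitness:
                best_scheme, best_fitness = list(scheme), fitness
    return initial_scheme, initial_fitness, best_scheme, best_fitness


if __name__ == "__main__":
    _, f0, best, f1 = spatial_simulated_annealing(10)
    print(f"MMSD of random start: {f0:.4f}, after annealing: {f1:.4f}")
```

Other criteria mentioned in the summary, such as the mean or maximum ordinary kriging variance, would in this sketch replace the `mmsd` function while the annealing loop stays the same. Preliminary observations could be handled by keeping some points fixed and perturbing only the new ones.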
Main Author: | |
---|---|
Other Authors: | |
Format: | Doctoral thesis |
Language: | English |
Subjects: | geostatistics, kriging, sampling, soil, statistical analysis, bemonsteren, bodem, geostatistiek, statistische analyse |
Online Access: | https://research.wur.nl/en/publications/constrained-optimisation-of-spatial-sampling-a-geostatistical-app |