Kernel-Based Nonparametric Density Estimation and Regression With Statistical Applications

Foster, Peter John (1990) Kernel-Based Nonparametric Density Estimation and Regression With Statistical Applications. PhD thesis, University of Glasgow.



This thesis is concerned with nonparametric kernel density estimation and regression. In particular, it examines techniques for obtaining better estimates than those produced by the standard fixed kernel approach, as well as the use of nonparametric estimates in certain other statistical procedures.

Allowing the degree of smoothing to adapt to the "local" density of the data has been suggested as a means of reducing bias and mean integrated squared error (MISE) relative to fixed kernel density estimators. In chapter two the finite sample properties of two particular adaptive estimators are investigated and compared with those of the fixed kernel method. This is carried out for both univariate and multivariate data from a number of different underlying distributions, which are assumed to be of a known form. Numerical integration techniques are used to calculate exact values for the bias, variance and MISE. A simple smoothing strategy based on Normality is also derived.

In the first part of chapter three, techniques for obtaining fixed kernel density estimators with smaller bias than that of the standard fixed approach are described and their asymptotic properties studied. These fall into three classes: using "higher order" kernels, subtracting a bias-reducing correction factor, and using a multiplicative correction factor. Those with the best asymptotic properties in each class are compared with the standard fixed and the adaptive approaches via a simulation study. In the second part of this chapter, methods for reducing the bias inherent in the Priestley-Chao fixed kernel regression estimator are similarly explored. These techniques are generally analogous to those studied for density estimators, except for a two-stage procedure called "twicing", which is also considered.

In chapter four the problem of obtaining pointwise confidence intervals for the unknown density function is examined. The sampling distributions of the estimators are unknown but can be approximated in two ways: firstly by assuming Normality and secondly by the bootstrap method. Competing approaches are again compared via a simulation study.

In chapter five two density-based tests of multivariate Normality are described. The first is based on a measure of integrated squared error and the second utilises the entropy property of the multivariate Normal (MVN) distribution. Critical values for the test statistics are obtained and a power study is carried out. These powers are also compared with those for an omnibus procedure due to Koziol based on the "radii and angle" properties of the MVN distribution.

In chapter six a procedure for graphically exploring a multivariate data set, based on finding directions of high multivariate density, is proposed. The three main aims are to explore the main features of a p-dimensional (p > 2) density function, to seek non-linear features in the data, and to use pairs of directions in the construction of two-dimensional representations. This approach is illustrated by application to real data sets.

In chapter seven the goodness-of-fit of a logistic regression model based on multiple covariates is assessed by comparing the parametric probabilities with estimates obtained by nonparametric regression. The global discrepancy is assessed using a pseudo-likelihood ratio test statistic, and significance is determined through a simulation procedure. The degree of smoothing plays an important role, so methods for choosing the value of the smoothing parameter are discussed. Also, the use of partial residual plots to determine whether the functional form of a covariate effect has been specified correctly is explored, and a test of linearity to aid in this is proposed and investigated.
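
For readers unfamiliar with the two baseline estimators named in the abstract, the following is a minimal sketch of their standard textbook forms: the fixed kernel density estimator and the Priestley-Chao kernel regression estimator. It is not taken from the thesis. The Gaussian kernel, the function names and the bandwidth argument `h` are all assumptions made for illustration, and the adaptive, bias-corrected and "twicing" variants studied in the thesis are not shown.

```python
import math


def gaussian_kernel(u):
    """Standard Normal density, used here as the kernel K (an assumed choice)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def fixed_kde(x, data, h):
    """Fixed kernel density estimate at x with a single bandwidth h:
    f_hat(x) = (1 / (n h)) * sum_i K((x - X_i) / h)
    """
    n = len(data)
    return sum(gaussian_kernel((x - xi) / h) for xi in data) / (n * h)


def priestley_chao(x, xs, ys, h):
    """Priestley-Chao regression estimate at x:
    m_hat(x) = (1 / h) * sum_{i>=2} (x_i - x_{i-1}) K((x - x_i) / h) y_i
    The design points are sorted first; the smallest one has no spacing
    term and so contributes nothing to the sum.
    """
    pairs = sorted(zip(xs, ys))
    total = 0.0
    for (x_prev, _), (x_i, y_i) in zip(pairs, pairs[1:]):
        total += (x_i - x_prev) * gaussian_kernel((x - x_i) / h) * y_i
    return total / h


if __name__ == "__main__":
    # Illustrative inputs only; these are not data from the thesis.
    sample = [-1.2, -0.4, 0.0, 0.3, 0.9, 1.5]
    print(fixed_kde(0.0, sample, h=0.5))

    design = [i / 100 for i in range(101)]
    response = [math.sin(2 * math.pi * t) for t in design]
    print(priestley_chao(0.25, design, response, h=0.05))
```

In both functions a larger `h` gives a smoother but more biased estimate, which is the trade-off that the adaptive and bias-reduction methods described above aim to improve.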

Item Type: Thesis (PhD)
Qualification Level: Doctoral
Additional Information: Adviser: A W Bowman
Keywords: Statistics
Date of Award: 1990
Depositing User: Enlighten Team
Unique ID: glathesis:1990-76502
Copyright: Copyright of this thesis is held by the author.
Date Deposited: 19 Nov 2019 14:15
Last Modified: 19 Nov 2019 14:15
