Research Projects
Supported by HKU's High Performance Computing Facilities |
|
Researcher: |
Miss On-yee Tang, Department of
Statistics and Actuarial Science |
|
Project Title: |
Estimation of Generalized Linear Mixed Models Using Multiple Imputations |
|
Project Description: |
This project concerns the estimation of the regression parameters of
Generalized Linear Mixed Models, specifically in the analysis of count data
with excess zeroes. The research was motivated by a household health survey
conducted in Indonesia in 1997, which I came across while working as a
research assistant in the Department of Statistics and Actuarial Science of
the University of Hong Kong last summer. The survey employed a multi-stage
cluster sampling scheme (community, household, individual). However, the
data exhibit excess zeroes, which makes the modelling difficult; even the
zero-inflated Poisson (ZIP) model may not always be adequate for real-life
situations. I therefore advocate a random effects model in which the random
effects follow the non-central chi-square distribution with zero degrees of
freedom. The random effects capture the extra variation induced by
unobservable subject-specific heterogeneity, such as the degree of
healthiness of a subject due to environmental factors or genetic
predisposition. |
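As a minimal sketch of why this distribution suits excess zeroes: the non-central chi-square distribution with zero degrees of freedom can be written as a Poisson mixture of central chi-square distributions, which gives it a point mass at zero of exp(-lambda/2), where lambda is the non-centrality parameter, plus a continuous positive part. The Python below samples from it on that basis. The function names, the parameter values and, in particular, the multiplicative way the random effect enters the Poisson mean in `simulate_counts` are assumptions made for illustration, not the project's actual specification.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw one Poisson variate by Knuth's multiplication method (suits small means)."""
    threshold = math.exp(-mean)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ncx2_zero_df(noncentrality, rng):
    """Draw from the non-central chi-square distribution with zero degrees of freedom.

    Poisson-mixture representation: K ~ Poisson(noncentrality / 2), then a
    central chi-square with 2K degrees of freedom. K = 0 gives exactly 0.
    """
    k = sample_poisson(noncentrality / 2.0, rng)
    if k == 0:
        return 0.0
    # A chi-square with 2k degrees of freedom is a Gamma(shape=k, scale=2).
    return rng.gammavariate(k, 2.0)


def simulate_counts(n, base_rate, noncentrality, rng):
    """ASSUMPTION (illustrative only): the random effect multiplies the Poisson mean."""
    counts = []
    for _ in range(n):
        effect = sample_ncx2_zero_df(noncentrality, rng)
        counts.append(sample_poisson(base_rate * effect, rng))
    return counts


rng = random.Random(1997)  # arbitrary seed, for reproducibility
noncentrality = 2.0        # hypothetical value
effects = [sample_ncx2_zero_df(noncentrality, rng) for _ in range(20000)]
zero_share = sum(e == 0.0 for e in effects) / len(effects)
print("simulated share of exact zeros:", round(zero_share, 3))
print("point mass exp(-noncentrality / 2):", round(math.exp(-noncentrality / 2.0), 3))

counts = simulate_counts(20000, base_rate=1.5, noncentrality=noncentrality, rng=rng)
print("share of zero counts:", round(counts.count(0) / len(counts), 3))
```

Under this assumed structure, a subject whose random effect is exactly zero always records a zero count, so the share of zero counts is at least the point mass of the random effect distribution.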
|
Project Duration: |
2 years |
|
Project Significance: |
In medical, health economics and public health investigations, interest
often focuses on count data such as the number of heart attacks, the number
of recurrences of breast cancer tumours, or the number of visits to
emergency services. Investigators are typically interested in the
relationship between explanatory variables and the rate of occurrence of the
event. In the survey data set, the variable of interest is the number of
days on which primary activities were missed due to illness in the last four
weeks, together with potential health and socio-economic explanatory
variables such as gender, level of education and per capita annual household
income, among others. A high rate of missed primary activities (for example,
absence from work) may lower GDP. If we can show that this rate is related
to an explanatory variable that the government can improve at a reduced
cost, the government might formulate new policies on the issue from the
perspective of health economics. The research also facilitates the
investigation of disease processes, caries prevention in dental
epidemiology, and occupational health, so the overall analysis should be
clinically informative. The proposed regression model will be used to
analyze the health and socio-economic variables available in the data set.
The methodology will be applicable to medical and public health surveys and
has implications for health economics, which has been a hot topic in Hong
Kong recently. |
|
Results Achieved: |
Owing to the flexibility of the proposed model and the simplicity of the
algorithm, the estimation is expected to work well in univariate settings.
Because the survey employed a multi-stage cluster sampling scheme and
contains more than one variable of interest, there is strong motivation to
extend the research to multivariate and clustered data. These investigations
are expected to be valuable in practice. |
|
Remarks on the Use of High Performance Computing Cluster: |
Because the proposed model involves a complex distribution, this project
requires a great deal of programming, and I rely on the HPC Cluster for the
lengthy computations. Obtaining more precise parameter estimates also makes
the programs take a long time to finish. Using the HPC Cluster together with
parallel programming greatly reduces the running times. More importantly, it
makes the methodology feasible and efficient enough to be implemented in
practice. |
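The kind of speed-up described here can be sketched as follows, assuming the workload splits into independent simulation chunks. This is not the project's actual program. As a stand-in workload, it estimates the probability of a zero count under a ZIP model, whose exact value is pi + (1 - pi) * exp(-mu). All names, seeds and parameter values are hypothetical, and Python's standard `concurrent.futures` module stands in for whatever parallel framework was used on the cluster.

```python
import math
import random
from concurrent.futures import ProcessPoolExecutor


def simulate_chunk(task):
    """Simulate one chunk of ZIP counts and return (number of zeros, chunk size)."""
    seed, size, zero_prob, poisson_mean = task
    rng = random.Random(seed)  # separate, reproducible stream for each chunk
    threshold = math.exp(-poisson_mean)
    zeros = 0
    for _ in range(size):
        if rng.random() < zero_prob:
            zeros += 1  # structural zero
        elif rng.random() < threshold:
            zeros += 1  # the Poisson part yields 0 with probability exp(-mean)
    return zeros, size


def make_tasks(n_chunks, chunk_size, zero_prob, poisson_mean, base_seed=0):
    """Build one task tuple per chunk; only the seed differs between chunks."""
    return [(base_seed + i, chunk_size, zero_prob, poisson_mean)
            for i in range(n_chunks)]


def combine(results):
    """Pool the per-chunk tallies into a single proportion of zeros."""
    zeros = sum(z for z, _ in results)
    total = sum(n for _, n in results)
    return zeros / total


def run_serial(tasks):
    """Process the chunks one after another in the current process."""
    return combine([simulate_chunk(task) for task in tasks])


def run_parallel(tasks, workers=4):
    """Process the same chunks in separate worker processes."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return combine(list(pool.map(simulate_chunk, tasks)))
```

Because each chunk carries its own seed, `run_serial(tasks)` and `run_parallel(tasks)` return the same estimate for the same task list; only the wall-clock time differs. When run as a script, `run_parallel` should be called from under an `if __name__ == "__main__":` guard so that worker processes can import the module safely.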
|