Research Projects Supported by HKU's High Performance Computing Facilities
Researcher:
Miss On-yee Tang, Department of Statistics and Actuarial Science
Project Title:
Estimation of Generalized Linear Mixed Models Using Multiple Imputations
Project Description:
This project concerns the estimation of regression parameters in Generalized Linear Mixed Models, specifically for the analysis of count data with excess zeros.  The research was motivated by a 1997 household health survey in Indonesia, which I came across while working as a research assistant in the Department of Statistics and Actuarial Science of the University of Hong Kong last summer.  The survey employed a multi-stage cluster sampling scheme (community-household-individuals). The data exhibit excess zeros, which makes modeling difficult; even the zero-inflated Poisson (ZIP) model may not always be adequate for real-life situations. I therefore advocate a random effects model in which the random effects follow a special distribution, the non-central chi-square distribution with zero degrees of freedom, to capture the extra variation induced by unobservable subject-specific heterogeneity, such as a subject's degree of healthiness due to environmental factors or genetic predisposition.
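The non-central chi-square distribution with zero degrees of freedom places positive probability on exactly zero, which is what makes it a natural candidate for data with many zeros. The following is a minimal Python sketch, not the project's own code, that draws from this distribution using its standard Poisson-mixture representation: draw K from a Poisson distribution with mean equal to half the noncentrality parameter, return 0 when K = 0, and otherwise return a central chi-square variate with 2K degrees of freedom. The function names, the noncentrality value of 2.0, and the seed are illustrative assumptions.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw one Poisson(mean) variate using Knuth's multiplication method."""
    threshold = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_ncx2_zero_df(noncentrality, rng):
    """Draw one non-central chi-square variate with zero degrees of freedom.

    Mixture representation: K ~ Poisson(noncentrality / 2); the result is 0
    when K == 0, otherwise a central chi-square with 2K degrees of freedom,
    which equals a Gamma variate with shape K and scale 2.
    """
    k = sample_poisson(noncentrality / 2.0, rng)
    if k == 0:
        return 0.0  # point mass at zero
    return rng.gammavariate(k, 2.0)


def simulate_draws(n, noncentrality, seed=0):
    """Return n independent draws (the seed is an illustrative choice)."""
    rng = random.Random(seed)
    return [sample_ncx2_zero_df(noncentrality, rng) for _ in range(n)]


if __name__ == "__main__":
    lam = 2.0  # hypothetical noncentrality parameter
    draws = simulate_draws(20000, lam, seed=1)
    zero_share = sum(1 for x in draws if x == 0.0) / len(draws)
    print("empirical share of exact zeros:", round(zero_share, 3))
    print("theoretical share exp(-lam/2):", round(math.exp(-lam / 2.0), 3))
    print("empirical mean:", round(sum(draws) / len(draws), 3))
```

The share of exact zeros should be close to exp(-noncentrality / 2) and the sample mean close to the noncentrality parameter. How such a random effect enters the regression model is not specified in this description, so the sketch covers only the distribution itself.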
Project Duration:
2 years


Project Significance:
In medical, health economics and public health investigations, interest often focuses on analyzing count data such as the number of heart attacks, the number of recurrences of breast cancer tumours, and the number of visits to emergency services.  Investigators are often interested in the relationship between explanatory variables and the rate at which an event occurs. The variable of interest in this data set is the number of days of primary activities missed due to illness in the last four weeks, together with potential health and socio-economic explanatory variables such as gender, level of education, per capita annual household income and many more.  A high rate of missed days of primary activities (absence from work) may lead to a low GDP.  If we can show that this rate is related to an explanatory variable that the government can improve at low cost, the government might formulate new policies on the issue from the perspective of health economics. This research also facilitates the investigation of disease processes, caries prevention in dental epidemiology, and occupational health. The overall analysis will be clinically informative.  The data set contains a number of health and socio-economic variables, which will be analyzed with the proposed regression model. The methodology will be applicable to medical and public health surveys and will carry profound implications for health economics, which has recently been a hot topic in Hong Kong.
Results Achieved:
Owing to the flexibility of the proposed model and the simplicity of the algorithm, estimation is expected to work very well in the univariate setting. Because the survey employed a multi-stage cluster sampling scheme and contains more than one variable of interest, there is strong motivation to extend my research to the multivariate and clustered settings. These investigations are expected to be valuable in practice.
Remarks on the Use of High Performance Computing Cluster:
Because the proposed model involves a complex distribution, this project requires a great deal of programming, and I rely on the HPC Cluster for the heavy computations. Obtaining more precise parameter estimates also requires long running times. Using the HPC Cluster together with parallel programming greatly reduces these running times. More importantly, it makes the methodology feasible and efficient enough to be implemented in practice.
