The Delaunay Triangulation Learner


liuyh's picture

liuyh - Posted on 15 September 2017

Project Description: 

We propose a new base learner, called the Delaunay triangulation learner (DTL), which can significantly improve the performance over the standard tree-based ensem-ble methods, under high-dimensional and highly interactive settings. The innovation of the DTL arises from discrete differential geometry, in which Delaunay triangulation is applied as a surface reconstruction algorithm that can accurately describe the prop- erties of a surface using a linear interpolation function. Compared with other triangle meshes, a Delaunay triangle mesh is considered as a near-optimal way to construct a linear interpolation function to approximate the target function with minimal er-ror. Thus, we construct the DTL by applying Delaunay triangulation on the training dataset and fit the model with a linear interpolation function. Furthermore, we pro-pose two appropriate regularization functions to penalize the roughness of the DTL and improve its predictability on the testing dataset. In ensemble learning applica- tions, we propose the bagging DTL and random crystal, where the DTLs are assigned to the subspaces of the feature space, and the feature interactions are captured by Delaunay triangle meshes. Compared with the tree-based approaches, the DTL shows substantial performance enhancement, especially when feature interactions are deeply covered by some weak partial dependencies.

Researcher name: 
Yehong Liu
Researcher position: 
PhD candidate
Researcher department: 
Department of Statistics and Actuarial Science
Researcher email: 
Researcher name: 
Guosheng Yin
Researcher position: 
Professor
Researcher department: 
Department of Statistics and Actuarial Science
Researcher email: 
Research Project Details
Project Duration: 
11/2016 to 07/2018
Project Significance: 
The Delaunay triangulation learner that we proposed has shown better performance that the traditional tree-based methods. This algorithm is more computationally intensive than traditional methods, however, totally parallel-able. We use the HPC to run tests of this algorithm on large scale datasets. This research is supported in part by a grant (17326316) from the Research Grants Council of Hong Kong.
Remarks: 
HPC will help us on the acceleration of our large-scale simulation studies.