Randomized Methods in Large-scale optimization


zhengqu - Posted on 23 February 2016

Project Description: 

In the past decade, several key processes of global society have turned digital. Companies, individuals and governments routinely collect massive amounts of structured and unstructured data--text, images, videos, bookmarks, links--and this is done at unprecedented scales. Moreover, with advances in technology, it is becoming increasingly cheaper to collect and store raw data. All of these changes have brought us into the era of Big Data, brimmed with challenges and opportunities.
Extracting information and insights from data has the potential to inform and enrich numerous aspects of our daily lives, and a ect fundamental processes in science, industry, business, social media, healthcare and defense. However, in order for the opportunities to bear some fruit, we must be able to transform the data into information--and this needs to be done at scales never encountered before. Many information extraction problems of interest can be cast as optimization problems and solved by optimization algorithms. However, traditional optimization algorithms relying on
expensive iterations are not scalable to handle data of today's magnitude. Actually the diculty of scalability is present almost everywhere in numerical analysis. For example, the classical numerical methods such as nite di erence scheme for solving the partial di erential equations arising in optimal control are not directly extendable to high dimensional problems, due to the curse of dimensionality.
This project focuses on developing and analyzing novel optimization algorithms capable of solving big data problems. Algorithms aspiring to achieve this goal must be highly granular and parallel / distributed in nature so as to exploit the power of modern high performance computer systems. Modern optimization methods need to address novel challenges brought up by the big data nature of the problems and need to rely on elements such as acceleration techniques, randomization, asynchronicity and communication avoiding strategies. The methods we design have applications in diverse domains, such as machine learning, engineering and science.