AI researchers tackle the 'curse of dimensionality' with new algorithm

By:
Alan Flurry

In the context of dynamic programming, the curse of dimensionality refers to various phenomena that arise when analyzing and organizing data with hundreds or thousands of dimensions. In order to obtain a statistically sound and reliable result, the amount of data needed to support the result often grows exponentially with the dimensionality.

In a recent paper published in the proceedings of the prestigious machine learning and artificial intelligence conference NeurIPS “Large-scale optimal transport map estimation using projection pursuit”, a UGA team led by statistics professors Ping Ma, Wenxuan Zhong and Yuan Ke present a promising new AI algorithm called “projection pursuit Monge map (PPMM)” capable of reproducing graphs and figures with high similarities:

Existing literature approximates the large-scale OTM by a series of one-dimensional OTM problems through iterative random projection. Such methods, however, suffer from slow or none convergence in practice due to the nature of randomly selected projection directions. Instead, we propose an estimation method of large-scale OTM by combining the idea of projection pursuit regression and sufficient dimension reduction. The proposed method, named projection pursuit Monge map (PPMM), adaptively selects the most informative'' projection direction in each iteration. We theoretically show the proposed dimension reduction method can consistently estimate the most informative'' projection direction in each iteration. Furthermore, the PPMM algorithm weakly convergences to the target large-scale OTM in a reasonable number of steps. Empirically, PPMM is computationally easy and converges fast. We assess its finite sample performance through the applications of Wasserstein distance estimation and generative models.

"Intending to train a machine to draw and generalize abstract concepts in a manner like humans, we train our model on a dataset of hand-drawn sketches, each represented as a handwritten number," the group wrote in describing the algorithm. "In doing so, we created a model that potentially has many applications, from assisting the creative process of an artist, to help teach students how to draw. To be fast and efficient, the algorithm can adaptively select the most “informative” direction each time it analyzes the figure themselves. In this case, we can decrease the computational time to a great extent. 

"Two examples are shown in the paper, The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples. Figure 1 lists some random images generated by PPMM, they are a very similar comparison to the original handwritten digits. In addition to generate images, the algorithm can even smoothly transfer one digit to another. In this case, we construct a bridge that connects two digits. We trained our machine to move freely between digits only based on the information from the images."

Co-authors on the new paper are statistics PhD students Cheng Meng, Jingyi Zhang,  Mengrui Zhang.