Technology That Converts Machine Learning
Pipelines into Artificial Neural Networks

Taking advantage of both traditional machine learning techniques and the latest deep learning techniques

Expected to be used in various real-life AI applications

From left, Ph.D. student Gyeongin Yu and Professor Byung-Gon Chun

Professor Byung-Gon Chun’s team in the Department of Computer Science and Engineering, in collaboration with Microsoft, has developed WindTunnel, a framework that optimizes a traditional machine learning pipeline by converting it into a neural network.
This core technology takes advantage of both traditional machine learning techniques and the latest deep learning techniques, and is expected to be used in various real-world AI applications such as click-rate prediction and recommendation systems.
Deep learning techniques have attracted a great deal of attention for their effectiveness in fields such as computer vision and natural language processing. On the tabular data used in AI applications such as click-rate prediction and recommendation systems, however, traditional machine learning techniques such as linear models and gradient boosted decision trees (GBDT) still perform better.
With traditional machine learning techniques, a number of machine learning models and data transformation operations are usually combined into a single machine learning pipeline, and each component of that pipeline is trained and used separately.
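As a rough illustration of such a pipeline, the following minimal Python sketch chains a categorical feature encoder and a linear model, fitting each stage on its own. The class and function names (MeanTargetEncoder, LinearModel, fit_pipeline, predict_pipeline) are hypothetical and chosen only for this example; they are not part of WindTunnel or any particular library.

```python
from statistics import mean


class MeanTargetEncoder:
    """Hypothetical encoder: maps each category to the mean label seen for it."""

    def fit(self, categories, labels):
        groups = {}
        for category, label in zip(categories, labels):
            groups.setdefault(category, []).append(label)
        self.table = {c: mean(ys) for c, ys in groups.items()}
        self.default = mean(labels)  # fallback for unseen categories
        return self

    def transform(self, categories):
        return [self.table.get(c, self.default) for c in categories]


class LinearModel:
    """One-feature least-squares regressor, y ~ w * x + b."""

    def fit(self, xs, ys):
        mx, my = mean(xs), mean(ys)
        var = sum((x - mx) ** 2 for x in xs)
        cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
        self.w = cov / var if var else 0.0
        self.b = my - self.w * mx
        return self

    def predict(self, xs):
        return [self.w * x + self.b for x in xs]


def fit_pipeline(categories, labels):
    # Stage 1 is fitted first and then frozen.
    encoder = MeanTargetEncoder().fit(categories, labels)
    # Stage 2 is fitted on stage 1's output; its error never flows back to stage 1.
    model = LinearModel().fit(encoder.transform(categories), labels)
    return encoder, model


def predict_pipeline(encoder, model, categories):
    return model.predict(encoder.transform(categories))
```

Because the encoder is frozen before the linear model is fitted, nothing the model learns can adjust the encoding. This stage-by-stage separation is what joint optimization is meant to remove.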
The research team developed a technology that first trains each component of the pipeline individually and then converts the pipeline into an artificial neural network, so that multiple components can be optimized at once through backpropagation. In particular, they proposed a method for translating components that are ordinarily non-differentiable, such as GBDT and categorical feature encoders, into neural networks and optimizing them.
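To give a sense of how a non-differentiable component can be made trainable by gradients, the sketch below relaxes a single decision-tree split into a sigmoid gate and then fine-tunes the threshold and leaf values by gradient descent. It is an illustrative assumption, not the translation method used in WindTunnel: the sigmoid relaxation, the temperature parameter, and all function names are hypothetical, and the hand-derived gradients stand in for what a deep learning framework would compute through backpropagation.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def hard_stump(x, params):
    """Ordinary one-split tree: piecewise constant, so the threshold has no useful gradient."""
    return params["left"] if x < params["threshold"] else params["right"]


def soft_stump(x, params, temperature=0.1):
    """Differentiable relaxation: a sigmoid gate blends the two leaf values."""
    gate = sigmoid((params["threshold"] - x) / temperature)  # close to 1 when x < threshold
    return gate * params["left"] + (1.0 - gate) * params["right"]


def gradients(x, y, params, temperature=0.1):
    """Gradient of 0.5 * (prediction - y)^2 with respect to each parameter."""
    gate = sigmoid((params["threshold"] - x) / temperature)
    err = gate * params["left"] + (1.0 - gate) * params["right"] - y
    dgate = gate * (1.0 - gate) / temperature  # d(gate) / d(threshold)
    return {
        "left": err * gate,
        "right": err * (1.0 - gate),
        "threshold": err * (params["left"] - params["right"]) * dgate,
    }


def mse(data, params, temperature=0.1):
    return sum((soft_stump(x, params, temperature) - y) ** 2 for x, y in data) / len(data)


def fine_tune(data, params, lr=0.1, epochs=200, temperature=0.1):
    """Full-batch gradient descent that updates the threshold and both leaves together."""
    params = dict(params)
    for _ in range(epochs):
        total = {k: 0.0 for k in params}
        for x, y in data:
            g = gradients(x, y, params, temperature)
            for k in total:
                total[k] += g[k]
        for k in params:
            params[k] -= lr * total[k] / len(data)
    return params
```

In this sketch, a stump trained in the usual way would supply the initial threshold and leaf values, and fine_tune would then adjust all three jointly. With a small temperature the soft stump behaves almost like the original hard split, while a larger temperature gives smoother gradients.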
The WindTunnel framework built on this technology achieved higher predictive performance on tabular data than existing methods, and it was assessed as able to lead to further studies that seek a middle ground between traditional machine learning and deep learning techniques.

Meanwhile, the results of the study will be published at the International Conference on Very Large Data Bases (VLDB) 2022.
For further information, please contact Prof. Byung-Gon Chun (bgchun@snu.ac.kr).