Deep learning "cheating": fitting any dataset with a single parameter. This 2019 paper is "hot" again

Heart of machine 2021-10-14 02:35:45

Drawing an elephant with one parameter.

It is said that von Neumann once attended a meeting where a physics researcher was reporting on his progress. Using a very complex model, the researcher tried to argue that the experimental data points fell on a single curve, in line with the model's predictions. Von Neumann remarked that one might as well say the points all lie in the same plane. He then left a famous line: "With four parameters I can fit an elephant, and with five I can make him wiggle his trunk."

This is von Neumann's classic story of "four parameters to draw an elephant, five to make its trunk wiggle".

In 2010, three researchers from the Max Planck Institute of Molecular Cell Biology and Genetics in Germany and the European Molecular Biology Laboratory published a paper that actually drew an elephant with four parameters, as shown below:

Image source:

In the same spirit, a paper published in April 2019, "Real numbers, data science and chaos: How to fit any dataset with a single parameter", recently set off another wave of discussion on Twitter. Its author, Laurent Boué, is now a senior machine learning scientist at Microsoft. In the paper he discusses how to fit any dataset with a single parameter.

Paper link:

The post came from Miles Cranmer, a Princeton PhD student and DeepMind research scientist intern, who wrote: "This paper gives a scalar function with a single parameter, and the function is differentiable and continuous!"

Some readers commented on the study: "Technically speaking, this paper is 'cheating' a bit, because it uses floating-point numbers of arbitrary precision. Provided the number of bits the floating-point number needs is small, this might be a good candidate for a compressed representation. But it is definitely not a 'single' parameter. I agree that the paper is a clever way to encode a dataset into one number and then decode it back to reconstruct individual points."

Others were curious about the standard error of the fitted parameter: with only a single parameter, how large would the error be?

Still others said: "A continuously differentiable function of one parameter can have infinite VC dimension. This paper looks like a version of that trick."
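The VC-dimension remark can be made concrete with a textbook construction that is not taken from the paper itself: the one-parameter classifier sign(sin(αx)) can reproduce every labeling of the points x_i = 10^-i. The sketch below checks this for n = 5 points in ordinary double precision. The helper names `shattering_alpha` and `classify` are made up for this illustration.

```python
import math
from itertools import product


def shattering_alpha(labels):
    """Pick alpha so that sign(sin(alpha * 10**-i)) equals labels[i-1].

    labels is a sequence of -1/+1 values. Each -1 label adds pi * 10**i
    to alpha, which shifts the sine argument at x_i by exactly pi.
    """
    offset = sum((1 - y) // 2 * 10**i for i, y in enumerate(labels, start=1))
    return math.pi * (1 + offset)


def classify(alpha, x):
    """One-parameter classifier: the sign of sin(alpha * x)."""
    return 1 if math.sin(alpha * x) > 0 else -1


n = 5
points = [10.0**-i for i in range(1, n + 1)]

# Try all 2**n labelings and check that one alpha reproduces each of them.
all_ok = all(
    [classify(shattering_alpha(labels), x) for x in points] == list(labels)
    for labels in product((-1, 1), repeat=n)
)
print("every labeling of", n, "points reproduced:", all_ok)
```

Because the same recipe works for any n (given enough digits in α), no finite set size bounds what this single parameter can memorize, which is the sense in which the VC dimension is infinite.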

Overview of the paper

The paper shows how a scalar function with a single real-valued parameter (continuous, differentiable, ...) can approximate datasets of any modality (time series, images, audio, ...). Building on basic concepts from chaos theory, the author takes a pedagogical approach to demonstrate how this real-valued parameter can be tuned to fit all data samples to arbitrary precision.

Real-world data comes in all shapes and sizes, with modalities ranging from traditional structured database schemas to unstructured media sources such as video feeds and audio recordings. Nevertheless, any dataset can ultimately be regarded as a list of values X = [x_0, ..., x_n] that describes the data content regardless of its underlying modality. The paper aims to show that all samples of any dataset X can be reproduced by a simple differentiable equation:

f_α(x) = sin^2(2^{xτ} arcsin(√α))

Here α ∈ R is the real-valued parameter to be learned from the data, and x ∈ [0, ..., n] takes integer values. (τ ∈ N is a constant that controls the desired accuracy.) Following the "fit an elephant" tradition, the study first shows how choosing suitable values of α generates different animal shapes, as shown in Figure 1.
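To see why one number can hold a whole dataset, here is a minimal sketch of the bit-packing idea as we understand it from the paper's formula f_α(x) = sin^2(2^{xτ} arcsin(√α)). Each sample in [0, 1] is mapped through arcsin(√x)/(2π), truncated to τ bits, and the blocks are concatenated into one binary fraction α0. Multiplying by 2^{kτ} and dropping the integer part exposes block k again. The Python standard library has no arbitrary-precision sine, so this sketch stores the exact fraction α0 instead of α = sin^2(2π·α0); that substitution, the names `encode` and `decode`, and the value TAU = 12 are assumptions made for illustration.

```python
import math
from fractions import Fraction

TAU = 12  # bits kept per sample (plays the role of τ); illustrative value


def encode(samples, tau=TAU):
    """Pack samples in [0, 1] into one exact binary fraction alpha0."""
    bits = 0
    for x in samples:
        # Map x into [0, 0.25] so that sin^2(2*pi*z) gives x back.
        z = math.asin(math.sqrt(x)) / (2 * math.pi)
        block = int(z * 2**tau)          # first tau bits of z
        bits = (bits << tau) | block     # append the block
    return Fraction(bits, 2 ** (tau * len(samples)))


def decode(alpha0, k, tau=TAU):
    """Recover sample k: shift by k*tau bits, then apply sin^2(2*pi*.)."""
    shifted = (alpha0 * 2 ** (k * tau)) % 1   # drops the first k blocks
    return math.sin(2 * math.pi * float(shifted)) ** 2


samples = [0.12, 0.5, 0.9, 0.33, 0.0, 1.0, 0.77]
alpha0 = encode(samples)
decoded = [decode(alpha0, k) for k in range(len(samples))]
max_error = max(abs(a - b) for a, b in zip(samples, decoded))
print("max reconstruction error:", max_error)
```

The reconstruction error is bounded by 2π / 2^τ, so raising τ tightens the fit, but the denominator of α0 grows by τ bits for every sample added.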

After demonstrating that f_α can generate doodle-like drawings of any kind, the paper goes on to a "Hello world" demonstration to further illustrate what the method can do. Figure 2 shows how a carefully chosen value of α generates a complex, high-dimensional audio signal that encodes the spoken phrase "Hello world".

In the image modality, alongside the development of specialized hardware and the steady emergence of new neural network architectures, the availability of large-scale labeled training data is widely regarded as one of the most important factors in computer vision becoming a popular and "mature" field.

In this context, the CIFAR-10 dataset is considered a strong benchmark for measuring the performance of new learning algorithms. As Figure 3 shows, the study finds that there is always a value of α for which f_α builds artificial images that reflect the CIFAR-10 categories.

From these examples across modalities, the paper concludes that a model f_α with a simple, differentiable formula can produce any kind of semantically meaningful scatter plot, audio, or visual data (and, similarly, text) using only a single real-valued parameter. This has raised doubts among researchers.

The paper also makes clear that the method cannot generalize. All of the information is encoded directly in the parameter, with no compression and no "learning". Mathematically, a real number can carry infinitely many digits, so it should not be confused with the finite-precision data types implemented by programming languages. For this reason f_α cannot achieve any real generalization. Figure 9 gives an example.
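A quick back-of-the-envelope calculation shows what "no compression" means in practice. If each value costs τ bits inside the parameter, then n values cost n·τ bits in total. The sketch below applies this to one 32×32 RGB CIFAR-10 image with τ = 8 and compares it with the 53-bit significand of a standard double. The function name `parameter_bits` and the choice τ = 8 are assumptions made for illustration, not figures from the paper.

```python
import sys


def parameter_bits(num_values, tau):
    """Bits the single parameter must hold: tau bits for each stored value."""
    return num_values * tau


TAU = 8                        # illustrative precision, one byte per value
cifar_values = 32 * 32 * 3     # pixels times colour channels in one image

bits = parameter_bits(cifar_values, TAU)
raw_bits = cifar_values * 8    # the same image stored as plain 8-bit values

# A double has a 53-bit significand, so it holds only a few such values.
float64_capacity = sys.float_info.mant_dig // TAU

print("bits needed in the parameter:", bits)
print("bits in the raw image:", raw_bits)
print("values that fit in one float64:", float64_capacity)
```

At one byte per value the parameter is exactly as large as the raw image, so the "single parameter" is the dataset written out as one very long number.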

What do you think?
