The mathematics of machine learning and deep learning. Sanjeev Arora's deep learning theory tutorial and Ben Recht's optimization tutorial were both excellent; I'd suggest taking a look at each if you get time. Limitations of deep learning (Towards AI). How can we adapt deep learning to new domains in a principled way? Paper 1 by Agarwal et al. and paper 2 by Carmon et al. Toward theoretical understanding of deep learning, Sanjeev Arora. Non-black-box analyses of simpler problems: subcases of simple neural nets. Simon Du, Wei Hu, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang, on exact computation with an infinitely wide neural net. Implicit acceleration by overparameterization. Sanjeev Arora, Nadav Cohen, Elad Hazan, 2018 (tutorial). Deep learning frameworks: a framework is an environment built on system software that gives programmers a platform for developing and deploying their applications. Sanjeev Arora (born January 1968) is an Indian-American theoretical computer scientist who is best known for his work on probabilistically checkable proofs and, in particular, the PCP theorem.
Sanjeev Satheesh, machine learning at Landing AI (LinkedIn). Deep-learning-free text and sentence embedding, part 2. Jun 25, 2018, Sanjeev Arora, Mikhail Khodak, Nikunj Saunshi (a sketch of the underlying weighted-averaging idea follows this paragraph). We start with sparse linear representations, and give the first algorithm for dictionary learning. Facebook's AI guru LeCun imagines AI's next frontier. Compressibility and generalization in large-scale deep learning. Indeed, the current state of deep learning theory is like the fable of the blind men and the elephant. Deep learning frameworks enable the programmer to build and test their deep-learning-based applications.
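The "deep-learning-free" sentence embeddings referenced above are built from weighted averages of pretrained word vectors; part 1 of that series covered the SIF weighting a/(a + p(w)). A minimal sketch of that averaging idea, assuming word vectors and unigram frequencies are supplied; all names are illustrative, and the full SIF method also subtracts a common principal component across sentences, which is omitted here:

```python
import numpy as np

def sif_embedding(sentence, word_vecs, word_freq, a=1e-3):
    """Weighted average of word vectors, downweighting frequent words.

    word_vecs: dict mapping word -> np.ndarray of shape (d,)
    word_freq: dict mapping word -> unigram probability p(w)
    """
    tokens = [w for w in sentence.lower().split() if w in word_vecs]
    if not tokens:
        return None
    # SIF weight a / (a + p(w)): rare words count more than frequent ones.
    vecs = np.stack([word_vecs[w] * (a / (a + word_freq.get(w, 0.0)))
                     for w in tokens])
    return vecs.mean(axis=0)

# Toy usage with made-up vectors and frequencies.
vecs = {"deep": np.ones(3), "nets": np.full(3, 0.5)}
freqs = {"deep": 1e-4, "nets": 1e-5}
print(sif_embedding("Deep nets", vecs, freqs))
```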
Feb 14, 2018: deep nets generalize well despite having more parameters than the number of training samples. From the viewpoint of machine learning theory, the novelty in this project is the focus on unsupervised settings; by contrast, many traditional frameworks in learning theory, such as PAC learning, SVMs, and online learning, focus on supervised problems. Provable algorithms for machine learning problems. Rong Ge. A dissertation presented to the faculty of Princeton University in candidacy for the degree of Doctor of Philosophy; recommended for acceptance by the Department of Computer Science. Sanjeev Arora, David Steurer, and Constantinos Daskalakis. Understanding deep learning requires rethinking generalization. The group meets two hours a week at TBD and is currently limited to TBD members.
Machine learning is the subfield of computer science concerned with creating programs and machines that learn from data. Some theory and empirics. Sanjeev Arora, Department of Computer Science, Princeton University, Princeton, NJ 08544, USA. Harnessing the power of infinitely wide deep nets on small-data tasks. Sep 22, 2018: this is the second Ahlfors Lecture of Sanjeev Arora, from Princeton University and the Institute for Advanced Study. Toward theoretical understanding of deep learning. Sanjeev Arora, Princeton University and Institute for Advanced Study. Efforts to understand the generalization mystery in deep learning have led to the belief that gradient-based optimization induces a form of implicit regularization, a bias towards models of low complexity. In this deep learning era, machine learning usually boils down to defining a suitable objective (cost) function for the learning task.
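To make "defining a suitable objective function" concrete, here is a minimal sketch: a squared-error objective for a linear predictor, minimized by plain gradient descent. The data, step size, and names are all illustrative:

```python
import numpy as np

def mse_loss(w, X, y):
    """Objective: mean squared error of the linear predictor Xw."""
    return np.mean((X @ w - y) ** 2)

def grad_mse(w, X, y):
    """Gradient of the objective with respect to the parameters w."""
    return 2.0 * X.T @ (X @ w - y) / len(y)

# Gradient descent on the objective.
rng = np.random.default_rng(0)
X, y = rng.normal(size=(100, 5)), rng.normal(size=100)
w = np.zeros(5)
for _ in range(500):
    w -= 0.1 * grad_mse(w, X, y)
print(mse_loss(w, X, y))
```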
The second part of the thesis provides ideas for provably learning deep, sparse representations. Sanjeev Arora, Rong Ge, Behnam Neyshabur, Yi Zhang; submitted on 14 Feb 2018, last revised 26 Nov 2018 (this version, v4). Sanjeev Arora, Simon Du, Wei Hu, Zhiyuan Li, Russ Salakhutdinov, Ruosong Wang. 2017 workshop. Sep 19, 2018: plenary lecture 15, the mathematics of machine learning and deep learning, Sanjeev Arora (abstract). Sanjeev Arora, Simon Du, Wei Hu, Zhiyuan Li, Russ Salakhutdinov, Ruosong Wang. 2019 spotlight. Implicit acceleration by overparameterization. Sanjeev Arora, Nadav Cohen, Elad Hazan. In Proceedings of the 35th International Conference on Machine Learning (PMLR 80, eds. Jennifer Dy and Andreas Krause), pp. 244-253, 2018. PDF: implicit regularization in deep matrix factorization. This post continues Sanjeev's post and describes further attempts to construct elementary and interpretable text embeddings. Professor of Computer Science, Princeton University. Sanjeev Arora, Aditya Bhaskara, Rong Ge, and Tengyu Ma: provable bounds for learning some deep representations. Diving into the limits of deep learning, this article discusses the limitations of deep learning in AI research for a general audience.
Feb 15, 2018: the foundational paper of Goodfellow et al. The current paper shows generalization bounds that are orders of magnitude better in practice (the compression idea behind such bounds is sketched below). Sanjeev Arora, Princeton University, Computer Science.
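The compression idea behind those improved bounds, in hedged form: if a trained net can be replaced by a much smaller net with similar behavior on the training data, generalization can be argued in terms of the compressed size rather than the raw parameter count. The Arora et al. paper builds its compression from noise-stability properties of deep nets; the sketch below uses plain truncated SVD only to illustrate the parameter counting (matrix size and rank are arbitrary):

```python
import numpy as np

def low_rank_compress(W, rank):
    """Truncated-SVD compression of a weight matrix.

    Returns factors (U_r, V_r) with W ~= U_r @ V_r; the compressed
    layer stores rank * (m + n) numbers instead of m * n.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    U_r = U[:, :rank] * s[:rank]   # shape (m, rank), singular values folded in
    V_r = Vt[:rank, :]             # shape (rank, n)
    return U_r, V_r

W = np.random.default_rng(0).normal(size=(512, 512))
U_r, V_r = low_rank_compress(W, rank=32)
print("raw params:", W.size)
print("compressed params:", U_r.size + V_r.size)
print("relative error:", np.linalg.norm(W - U_r @ V_r) / np.linalg.norm(W))
```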
Recent research shows that the following two models are equivalent (sketched after this paragraph). Stronger generalization bounds for deep nets via a compression approach. I am a member of the groups in theoretical computer science and theoretical machine learning. Computational complexity (see my book on this topic), probabilistically checkable proofs (PCPs), computing approximate solutions to NP-hard problems, and related issues.
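The equivalence referred to here, made exact in "On exact computation with an infinitely wide neural net," is between (i) a fully trained, infinitely wide net and (ii) kernel regression with the neural tangent kernel. A minimal sketch of the kernel-regression side, using the order-1 arc-cosine kernel that arises for one-hidden-layer ReLU nets as a stand-in (the exact NTK from the paper has an additional term; data and names are illustrative):

```python
import numpy as np

def arccos_kernel(X, Z):
    """Order-1 arc-cosine kernel (Cho & Saul 2009): the infinite-width
    limit of a one-hidden-layer ReLU feature map."""
    nx = np.linalg.norm(X, axis=1)[:, None]
    nz = np.linalg.norm(Z, axis=1)[None, :]
    cos = np.clip((X @ Z.T) / (nx * nz), -1.0, 1.0)
    theta = np.arccos(cos)
    return nx * nz * (np.sin(theta) + (np.pi - theta) * cos) / np.pi

def kernel_regression(X_tr, y_tr, X_te, ridge=1e-6):
    """Predict with f(x) = k(x, X_tr) @ (K + ridge * I)^{-1} y_tr."""
    K = arccos_kernel(X_tr, X_tr)
    alpha = np.linalg.solve(K + ridge * np.eye(len(y_tr)), y_tr)
    return arccos_kernel(X_te, X_tr) @ alpha

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = np.sin(X[:, 0])
print(kernel_regression(X, y, X[:5]))   # fits the training points closely
```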
Kiran Vodrahalli, 03/20/2018: toward theoretical understanding of deep learning, Sanjeev Arora. Now the problem in deep learning is that the optimization landscape is unknown. Sanjeev Arora, Curtis Callan, and Victor Mikhaylov.
We study the implicit regularization of gradient descent over deep linear neural networks for matrix completion and matrix sensing, a model referred to as deep matrix factorization (see the sketch after this paragraph). Topic modeling (Arora, Ge, Moitra), sparse coding, matrix completion. Is optimization a sufficient language for understanding deep learning?
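A minimal sketch of the deep-matrix-factorization setup from that abstract: parameterize a matrix as a product of factors, run gradient descent on a loss over the observed entries only, and inspect the singular values of the result; the paper's finding is that deeper factorizations bias gradient descent more strongly toward low rank. Sizes, depth, and step size below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n, depth, lr, steps = 20, 3, 0.05, 5000

# Ground truth: a rank-2 matrix; observe roughly half of its entries.
target = rng.normal(size=(n, 2)) @ rng.normal(size=(2, n)) / np.sqrt(n)
mask = rng.random((n, n)) < 0.5

def prod(mats):
    """Product of a list of matrices (identity for the empty list)."""
    out = np.eye(n)
    for M in mats:
        out = out @ M
    return out

# Depth-3 factorization W = W1 @ W2 @ W3, initialized near the identity.
Ws = [np.eye(n) + 0.01 * rng.normal(size=(n, n)) for _ in range(depth)]

for _ in range(steps):
    G = 2 * mask * (prod(Ws) - target)   # grad of the observed-entry loss
    grads = [prod(Ws[:i]).T @ G @ prod(Ws[i + 1:]).T for i in range(depth)]
    for W, g in zip(Ws, grads):
        W -= lr * g                      # gradient step on each factor

# Implicit regularization: the recovered matrix is close to rank 2.
print(np.round(np.linalg.svd(prod(Ws), compute_uv=False)[:6], 3))
```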
Simon Du, Wei Hu, Zhiyuan Li, Ruslan Salakhutdinov, Ruosong Wang, NeurIPS 2019. Learning neural networks with adaptive regularization: Han Zhao, Yao-Hung Hubert Tsai, Ruslan Salakhutdinov, Geoffrey J. Gordon. Our result shows that such encoder-decoder training objectives also cannot guarantee learning of the full distribution, because they cannot prevent serious mode collapse. On exact computation with an infinitely wide neural net. Deep learning for physics: deep learning refers to the use of neural networks to solve learning problems, including learning hidden structure in large and complex data sets. Mamta Arora, Sanjeev Dhawan, Kulvinder Singh (Figure 6: deep stack network).
A survey of deep learning for scientific discovery. Each gate computes a simple nonlinear function, which is applied to a weighted sum of incoming signals (see the sketch after this paragraph). Identifying a new form of noise stability for deep nets. Deep learning has led to rapid progress in open problems of artificial intelligence (recognizing images, playing Go, driving cars, automating translation between languages) and has triggered a new gold rush in the tech sector. Sanjeev Arora, Rong Ge, Ravindran Kannan, Ankur Moitra: computing a nonnegative matrix factorization. In this paper, we propose morphed learning, a privacy-preserving technique for deep learning based on data morphing that allows data owners to share their data without leaking sensitive private information. The next evolution in artificial intelligence may be a matter of dispensing with all the probabilistic tricks of deep learning. We will have several invited talks each day and also spotlight talks by young researchers. Harnessing the power of infinitely wide deep nets on small-data tasks. Mathematics of deep learning, Princeton University (scribe notes). Why and how does optimization find globally good solutions to the deep learning optimization problem?
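As promised above, a direct rendering of "each gate computes a simple nonlinear function of a weighted sum of incoming signals," with ReLU chosen for concreteness and all values illustrative:

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def gate(x, w, b):
    """One unit: a nonlinearity applied to a weighted sum of inputs."""
    return relu(np.dot(w, x) + b)

# A layer is just many gates in parallel; a deep net stacks layers.
def layer(x, W, b):
    return relu(W @ x + b)

x = np.array([1.0, -2.0, 0.5])
print(gate(x, w=np.array([0.3, -0.1, 0.8]), b=0.05))   # 0.95
```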
Some theory and empirics. Sanjeev Arora, Andrej Risteski, Yi Zhang. Sanjeev Arora, Nadav Cohen, Noah Golowich, and Wei Hu. This paper suggests that, sometimes, increasing depth can speed up optimization (a toy illustration follows this paragraph). You will learn about convolutional networks, RNNs, LSTMs, Adam, dropout, BatchNorm, and more.
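A toy, hedged illustration of the depth-speeds-up-optimization claim: fit the same one-dimensional regression with a single parameter w versus the product parameterization w = w1 * w2, and compare loss curves. The paper analyzes the end-to-end dynamics of deep linear nets, where the extra factors act like a momentum-like, adaptive preconditioner; the data, initialization, and step size here are arbitrary:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 3.0 * x

def loss_and_grad(w_eff):
    """Squared loss and its gradient w.r.t. the end-to-end parameter."""
    resid = w_eff * x - y
    return np.mean(resid ** 2), 2 * np.mean(resid * x)

lr = 0.01
w = 0.1                        # depth 1: direct parameterization
w1 = w2 = np.sqrt(0.1)         # depth 2: w_eff = w1 * w2, same start

for step in range(201):
    if step % 50 == 0:
        print(step, loss_and_grad(w)[0], loss_and_grad(w1 * w2)[0])
    _, g = loss_and_grad(w)
    w -= lr * g
    _, g = loss_and_grad(w1 * w2)
    # Chain rule: each factor is updated by the end-to-end gradient
    # times the other factor, which acts like an adaptive step size.
    w1, w2 = w1 - lr * g * w2, w2 - lr * g * w1
```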
Sanjeev Arora: learning algorithms do something similar, except that the settings are more complicated and involve many more variables, sometimes tens of thousands. CRM-CIFAR Deep Learning Summer School, organized by Professors Aaron Courville and Yoshua Bengio, 2016. This workshop seeks to bring together deep learning practitioners and theorists to discuss the progress that has been made on deep learning theory, and to identify promising avenues where theory is possible and useful. They also connect deep learning to notions such as Gaussian processes and kernels. Sanjeev Arora, Princeton University: do GANs actually learn the distribution? Recent works try to give an explanation using PAC-Bayes and margin-based analyses, but do not as yet yield sample complexity bounds better than naive parameter counting. Published as a conference paper at ICLR 2018: do GANs learn the distribution? He is the Charles C. Fitzmorris Professor of Computer Science at Princeton University, and his research interests include computational complexity theory and uses of randomness in computation. Feb 12, 2019: Princeton professor Sanjeev Arora takes a vivid approach to the generalization theory of deep neural networks [15], in which he discusses the generalization mystery of deep learning: why do deep nets generalize despite having far more parameters than training samples?
Brief introduction to deep learning and the alchemy controversy, Sanjeev Arora. Princeton University, Computer Science Department and Center for Computational Intractability, Princeton, NJ 08540, USA. AI has achieved incredible feats thanks to deep learning. A convergence analysis of gradient descent for deep linear neural networks. While some progress has been made recently towards a foundational understanding of deep learning, most theory work has been disjointed, and a coherent picture has yet to emerge. Belkin et al. (2018): to understand deep learning we need to understand kernel learning. Companies are racing to develop hardware that more directly empowers deep learning. Implicit acceleration by overparameterization. Sanjeev Arora, Nadav Cohen, Elad Hazan. Abstract: conventional wisdom in deep learning states that increasing depth improves expressiveness but complicates optimization. But some scientists raise worries about slippage in scientific practices and rigor, likening the process to alchemy. Apr 12, 2020: in this course, you will learn the foundations of deep learning, understand how to build neural networks, and learn how to lead successful machine learning projects.