YiDa.Xu ML notes
Infinity in Deep Learning
Detailed derivation of infinite-width neural networks as (1) a Gaussian Process, using the Central Limit Theorem, and (2) the Neural Tangent Kernel (NTK)
Discusses Neural ODEs, in particular the use of the adjoint equation in parameter training
Sinovation DeeCamp
properties of Softmax, estimating softmax without computing the denominator, probability re-parameterization: Gumbel-Max trick and REBAR algorithm
Expectation-Maximization & Matrix Capsule Networks; Determinantal Point Process & Neural Network compression; Kalman Filter & LSTM; Model estimation & Binary classifier
Video Tutorials for these notes
- I recorded about 20% of these notes as videos in 2015, in Mandarin (all my notes and writings are in English). You may find them on YouTube, bilibili and Youku
3D Geometry Computer vision
3D Geometry Fundamentals
Camera Models, Intrinsic and Extrinsic parameter estimation, Epipolar Geometry, 3D reconstruction, Depth Estimation
Recent Deep 3D Geometry based Research
Recent research on the following topics: Camera Model estimation from a single image, Multi-Person 3D pose estimation from multi-view, GAN-based 3D pose estimation, Deep Structure-from-Motion, Deep Learning based Depth Estimation
Deep Learning
New Research on Softmax function
Out-of-distribution, Neural Network Calibration, Gumbel-Max trick, Stochastic Beam Search (some of these lectures overlap with DeeCamp2019)
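As a small illustration of the Gumbel-Max trick listed above, here is a minimal sketch that is not taken from the notes themselves. It relies on the standard result that adding independent Gumbel(0, 1) noise to each logit and taking the argmax yields a sample from the softmax distribution. The function names `softmax` and `gumbel_max_sample` are illustrative choices, and only the Python standard library is used.

```python
import math
import random
from collections import Counter


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gumbel_max_sample(logits, rng):
    """Draw one category index: argmax_i (logit_i + Gumbel(0, 1) noise)."""
    best_index, best_value = 0, -math.inf
    for i, logit in enumerate(logits):
        u = rng.random()
        while u == 0.0:  # avoid log(0); random() returns values in [0, 1)
            u = rng.random()
        gumbel_noise = -math.log(-math.log(u))
        if logit + gumbel_noise > best_value:
            best_index, best_value = i, logit + gumbel_noise
    return best_index


def empirical_frequencies(logits, n_draws, seed=0):
    """Estimate category frequencies from repeated Gumbel-Max draws."""
    rng = random.Random(seed)
    counts = Counter(gumbel_max_sample(logits, rng) for _ in range(n_draws))
    return [counts[i] / n_draws for i in range(len(logits))]


demo_logits = [1.0, 2.0, 0.5]  # hypothetical example values
print("softmax  :", softmax(demo_logits))
print("empirical:", empirical_frequencies(demo_logits, 20000))
```

With enough draws, the empirical frequencies should approach the softmax probabilities, which is the property the trick exploits without ever normalising inside the sampler.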
Optimisation methods
Optimisation methods in general, not limited to just Deep Learning
Neural Networks
basic neural networks and multilayer perceptron
Convolutional Neural Networks: from basic to recent Research
detailed explanation of CNN, various Loss functions, Centre Loss, Contrastive Loss, Residual Networks, Capsule Networks, YOLO, SSD
Word Embeddings
Word2Vec, skip-gram, GloVe, fastText
Deep Natural Language Processing
RNN, LSTM, Seq2Seq with Attention, Beam search, Attention is all you need, Convolutional Seq2Seq, Pointer Networks
Mathematics for Generative Adversarial Networks
How GAN works, Traditional GAN, Mathematics on W-GAN, Duality and KKT conditions, Info-GAN, Bayesian GAN
Restricted Boltzmann Machine
basic knowledge in Restricted Boltzmann Machine (RBM)
Reinforcement Learning
Reinforcement Learning Basics
basic knowledge of reinforcement learning: Markov Decision Process, Bellman Equation, then moving on to Deep Q-Learning
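To make the Bellman equation concrete, the following is a minimal value-iteration sketch on a toy two-state Markov Decision Process. The MDP, its rewards, and the function name `value_iteration` are all hypothetical and chosen only for illustration; they do not come from the notes.

```python
def value_iteration(transitions, gamma=0.9, tol=1e-10, max_iters=10_000):
    """Repeatedly apply the Bellman optimality backup until values settle.

    transitions[state][action] is a list of (probability, next_state, reward).
    """
    values = {state: 0.0 for state in transitions}
    for _ in range(max_iters):
        delta = 0.0
        for state, actions in transitions.items():
            # V(s) = max_a sum_{s'} p(s'|s,a) * (r + gamma * V(s'))
            best = max(
                sum(p * (r + gamma * values[s2]) for p, s2, r in outcomes)
                for outcomes in actions.values()
            )
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tol:
            break
    return values


# Hypothetical toy MDP: staying in B pays 2 per step, moving A -> B pays 1.
toy_mdp = {
    "A": {"stay": [(1.0, "A", 0.0)], "go": [(1.0, "B", 1.0)]},
    "B": {"stay": [(1.0, "B", 2.0)], "go": [(1.0, "A", 0.0)]},
}
optimal_values = value_iteration(toy_mdp, gamma=0.9)
print(optimal_values)
```

For this toy problem the fixed point can be checked by hand: V(B) = 2 / (1 - 0.9) = 20 and V(A) = 1 + 0.9 * 20 = 19.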
Monte Carlo Tree Search
Monte Carlo Tree Search, AlphaGo learning algorithm
Policy Gradient
Policy Gradient Theorem, Mathematics on Trust Region Optimization in RL, Natural Gradients on TRPO, Proximal Policy Optimization (PPO), Conjugate Gradient Algorithm
Data Science
30-minute introduction to AI and Machine Learning
An extremely gentle 30-minute introduction to AI and Machine Learning. Thanks to my PhD student Haodong Chang for assisting with the editing
Regression methods
Classification: Logistic and Softmax; Regression: Linear, polynomial; Mixed Effects model [costFunction.m] and [soft_max.m]
Recommendation system
collaborative filtering, Factorization Machines, Non-Negative Matrix factorisation, Multiplicative Update Rule
Dimension Reduction
classic PCA and t-SNE
Introduction to Data Analytics and associated Jupyter notebook
Supervised vs Unsupervised Learning, Classification accuracy
Probability and Statistics Background
Bayesian model
revision of the Bayesian model, including the Bayesian predictive model and conditional expectation
Probabilistic Estimation
some useful distributions, conjugacy, MLE, MAP, Exponential family and natural parameters
Statistics Properties
useful statistical properties to help us prove things, including the Chebyshev and Markov inequalities
Probabilistic Model
Expectation Maximisation
Proof of convergence for E-M, examples of E-M through Gaussian Mixture Model, [gmm_demo.m] and [kmeans_demo.m] and [bilibili video]
State Space Model (Dynamic model)
detailed explanation of the Kalman Filter [bilibili video], [kalman_demo.m] and the Hidden Markov Model [bilibili video]
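As a rough companion to the Kalman Filter material, here is a minimal scalar sketch. It is not a port of [kalman_demo.m]; it assumes a simple random-walk state model with Gaussian noise. The function name `kalman_1d` and all parameter names are illustrative.

```python
def kalman_1d(measurements, process_var, measurement_var,
              init_mean=0.0, init_var=1.0):
    """Scalar Kalman filter for x_t = x_{t-1} + w_t, z_t = x_t + v_t.

    w_t ~ N(0, process_var) and v_t ~ N(0, measurement_var).
    Returns a list of (posterior_mean, posterior_variance) per step.
    """
    mean, var = init_mean, init_var
    estimates = []
    for z in measurements:
        # Predict: the random walk keeps the mean and inflates the variance.
        var = var + process_var
        # Update: blend prediction and measurement using the Kalman gain.
        gain = var / (var + measurement_var)
        mean = mean + gain * (z - mean)
        var = (1.0 - gain) * var
        estimates.append((mean, var))
    return estimates


# Hypothetical data: 50 identical readings of a constant signal at 5.0.
track = kalman_1d([5.0] * 50, process_var=0.0, measurement_var=1.0)
print(track[-1])
```

With zero process noise the filter reduces to recursive Bayesian averaging, so after n readings the posterior variance is 1 / (n + 1) for the unit prior and unit measurement variance used here.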
Inference
Variational Inference
explains Variational Bayes for both non-exponential and exponential family distributions, plus stochastic variational inference. [vb_normal_gamma.m] and [bilibili video]
Stochastic Matrices
stochastic matrix, Power Method Convergence Theorem, detailed balance and PageRank algorithm
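The power method and detailed balance mentioned above can be sketched as follows. This is an illustrative example rather than code from the notes: it assumes a row-stochastic matrix `P` whose chain is irreducible and aperiodic, and the function name `stationary_distribution` is hypothetical.

```python
def stationary_distribution(P, tol=1e-12, max_iters=10_000):
    """Power iteration pi <- pi P for a row-stochastic matrix P.

    Converges to the stationary distribution when the chain is
    irreducible and aperiodic (assumed here, not checked).
    """
    n = len(P)
    pi = [1.0 / n] * n  # start from the uniform distribution
    for _ in range(max_iters):
        new_pi = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(new_pi, pi)) < tol:
            return new_pi
        pi = new_pi
    return pi


# Hypothetical two-state chain; each row sums to 1.
P = [[0.9, 0.1],
     [0.5, 0.5]]
pi = stationary_distribution(P)
print(pi)
# Detailed balance check: pi_0 * P[0][1] should equal pi_1 * P[1][0].
print(pi[0] * P[0][1], pi[1] * P[1][0])
```

For this chain the stationary distribution can be solved by hand as (5/6, 1/6). PageRank applies the same iteration to a web-link transition matrix with a damping term added.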
Introduction to Monte Carlo
inverse CDF, rejection, adaptive rejection, importance sampling [adaptive_rejection_sampling.m] and [hybrid_gmm.m]
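As a small example of inverse-CDF sampling, the sketch below draws from an Exponential distribution by inverting its CDF F(x) = 1 - exp(-rate * x). It is independent of the MATLAB files listed above, and the function names are illustrative.

```python
import math
import random


def sample_exponential(rate, rng):
    """Inverse-CDF sampling: solve u = 1 - exp(-rate * x) for x."""
    u = rng.random()  # uniform on [0, 1), so 1 - u lies in (0, 1]
    return -math.log(1.0 - u) / rate


def sample_mean(rate, n_draws, seed=0):
    """Average of n_draws inverse-CDF samples, for a quick sanity check."""
    rng = random.Random(seed)
    return sum(sample_exponential(rate, rng) for _ in range(n_draws)) / n_draws


# The mean of Exponential(rate) is 1 / rate, so this should be near 0.5.
print(sample_mean(rate=2.0, n_draws=50000))
```

The same recipe works for any distribution whose CDF can be inverted in closed form. Rejection and importance sampling cover the cases where it cannot.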
Markov Chain Monte Carlo
M-H, Gibbs, Slice Sampling, Elliptical Slice sampling, Swendsen-Wang, demonstrate collapsed Gibbs using LDA [lda_gibbs_example.m] and [test_autocorrelation.m] and [gibbs.m] and [bilibili video]
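For the Metropolis-Hastings (M-H) entry, here is a minimal random-walk sketch targeting a standard normal density. It is an illustration under stated assumptions (symmetric Gaussian proposal, unnormalised log target) and is not a translation of the MATLAB files above. The names `metropolis_hastings` and `log_standard_normal` are hypothetical.

```python
import math
import random


def metropolis_hastings(log_target, x0, n_samples, step, rng):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    x = x0
    log_p = log_target(x)
    samples = []
    for _ in range(n_samples):
        proposal = x + rng.gauss(0.0, step)
        log_p_new = log_target(proposal)
        # Accept with probability min(1, p(proposal) / p(x)); the proposal
        # is symmetric, so the Hastings correction cancels.
        if math.log(1.0 - rng.random()) < log_p_new - log_p:
            x, log_p = proposal, log_p_new
        samples.append(x)  # a rejected move repeats the current state
    return samples


def log_standard_normal(x):
    """Unnormalised log density of N(0, 1)."""
    return -0.5 * x * x


chain = metropolis_hastings(log_standard_normal, x0=0.0, n_samples=50000,
                            step=2.0, rng=random.Random(0))
chain_mean = sum(chain) / len(chain)
chain_var = sum((s - chain_mean) ** 2 for s in chain) / len(chain)
print(chain_mean, chain_var)
```

Only the ratio of target densities enters the acceptance test, which is why the normalising constant can be dropped. Successive samples are correlated, so summary statistics converge more slowly than with independent draws.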
Particle Filter (Sequential Monte-Carlo)
Sequential Monte-Carlo, Condensation Filter algorithm, Auxiliary Particle Filter [bilibili video]
Advanced Probabilistic Model
Bayesian Nonparametrics (BNP) and its inference basics
Dirichlet Process (DP), Chinese Restaurant Process insights, Slice sampling for DP [dirichlet_process.m] and [bilibili video] and [Jupyter Notebook]
Bayesian Nonparametrics (BNP) extensions
Hierarchical DP, HDP-HMM, Indian Buffet Process (IBP)
Completely Random Measure (early draft – written in 2015)
Lévy-Khintchine representation, Compound Poisson Process, Gamma Process, Negative Binomial Process
Sample correlated integers from HDP and Copula
This is an alternative explanation of our IJCAI 2016 papers. The derivations differ from the published ones, but tell the same story.
Determinantal Point Process
explains the details of the DPP's marginal distribution, L-ensemble, its sampling strategy, and our work on time-varying DPP
Join the discussion