A Summary of Concepts in Machine Learning

Cliff & Flat Regions

Cliff and flat regions in the parameter space of a neural network.
θ and γ are two parameters and L(θ, γ) is the loss function. The region with light color is a cliff and the region with dark color is a flat region.

Whiten

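In deep learning training, whitening is a small trick for speeding up convergence. In the sense used here, whitening means transforming the image pixel values so that they have zero mean and unit variance. (Full whitening additionally decorrelates the features; the simpler zero-mean, unit-variance step is usually called standardization.)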
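The zero-mean, unit-variance step described above can be sketched in a few lines of NumPy. This is a minimal illustration rather than a reference implementation: the function name `standardize_image`, the `eps` constant, and the random 32x32 RGB image are assumptions made up for the example, and statistics are taken over the whole image (per-channel or dataset-wide statistics are common alternatives).

```python
import numpy as np

def standardize_image(image, eps=1e-8):
    """Shift and scale pixel values to zero mean and unit variance.

    Hypothetical helper for illustration; statistics are computed
    over all pixels and channels of the single input image.
    """
    pixels = image.astype(np.float64)
    mean = pixels.mean()
    std = pixels.std()
    # eps guards against division by zero on a constant image.
    return (pixels - mean) / (std + eps)

# A made-up 32x32 RGB image with integer pixel values in [0, 255].
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3))

whitened = standardize_image(image)
print(whitened.mean(), whitened.std())
```

The printed mean should be very close to 0 and the standard deviation very close to 1, up to floating-point error.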

Dropout

Batch Normalization

Weight Normalization

Local Minima

A local minimum.

Ablation Study

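An ablation study removes certain features of a model or algorithm in order to observe how those features affect the model's performance.
In practice, an ablation study is an experiment designed to check whether a structure proposed in a model is actually effective. For example, if you propose some structure and want to know whether it really helps the final result, you compare the results of the network without that structure against the results of the network with it. That comparison is the ablation study.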

An ablation study typically refers to removing some “feature” of the model or algorithm, and seeing how that affects performance.

Examples:

An LSTM has four gated components: the candidate (cell input) transform plus the input, output, and forget gates. We might ask: are all four necessary? What if I remove one? Indeed, lots of experimentation has gone into LSTM variants, the GRU being a notable (and simpler) example.
If certain tricks are used to get an algorithm to work, it’s useful to know whether the algorithm is robust to removing these tricks. For example, DeepMind’s original DQN paper reports using (1) only periodically updating the reference network and (2) using a replay buffer rather than updating online. It’s very useful for the research community to know that both these tricks are necessary, in order to build on top of these results.
If an algorithm is a modification of a previous work, and has multiple differences, researchers want to know what the key difference is.
Simpler is better (inductive prior towards simpler model classes). If you can get the same performance with two models, prefer the simpler one.
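The remove-and-compare procedure described above can be sketched on a toy problem. Everything in this example is an assumption chosen for illustration: the synthetic data, the least-squares model, and the helper name `validation_mse`. The "components" being ablated here are input features, standing in for whatever structure a real model proposes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: the target depends strongly on feature 0,
# weakly on feature 1, and not at all on feature 2 (pure noise).
n = 400
X = rng.normal(size=(n, 3))
y = 3.0 * X[:, 0] + 0.5 * X[:, 1] + 0.1 * rng.normal(size=n)

X_train, X_val = X[:300], X[300:]
y_train, y_val = y[:300], y[300:]

def validation_mse(feature_idx):
    """Fit least squares on the chosen features; return validation MSE."""
    A_train = X_train[:, feature_idx]
    A_val = X_val[:, feature_idx]
    weights, *_ = np.linalg.lstsq(A_train, y_train, rcond=None)
    return float(np.mean((A_val @ weights - y_val) ** 2))

all_features = [0, 1, 2]

# Baseline: the full model with every component present.
results = {"full": validation_mse(all_features)}

# Ablations: remove one component at a time and re-evaluate.
for removed in all_features:
    kept = [i for i in all_features if i != removed]
    results[f"without feature {removed}"] = validation_mse(kept)

for name, mse in results.items():
    print(f"{name}: validation MSE = {mse:.4f}")
```

Removing feature 0 should hurt the most and removing feature 2 should change almost nothing, which is the kind of evidence an ablation table is meant to provide. In a real study each row would usually be averaged over several random seeds.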

Sharp Minima

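Sharp minima.
In 2017, Yoshua Bengio's group published the paper Sharp Minima Can Generalize For Deep Nets, which shows that in deep neural networks a model sitting at a sharp minimum can also generalize well.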

Internal Covariate Shift – ICS
