<Tensorflow> TensorFlow 2.4 Best Practices
Getting Started
Recently I wanted to try a Transformer for image classification, so I looked around for a good example. The official Keras examples have one, so I cloned it to take a look. I had assumed I would need to implement the multi-head attention module myself, but it turns out tf.keras already ships a MultiHeadAttention layer, which is really convenient (an API that only arrived in TensorFlow's latest release, TF 2.4).
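A minimal sketch of how the layer is used (assuming TF >= 2.4; the shapes below are illustrative, not taken from the Keras example):

```python
import tensorflow as tf  # requires TF >= 2.4

# The built-in layer: no hand-rolled attention needed.
mha = tf.keras.layers.MultiHeadAttention(num_heads=4, key_dim=64)

# Self-attention over a sequence of patch embeddings, as in a
# ViT-style classifier: (batch, num_patches, embed_dim).
patches = tf.random.normal((2, 16, 64))
out = mha(query=patches, value=patches, key=patches)
print(out.shape)  # (2, 16, 64)
```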
An end-to-end deep neural network is a black box. Although it can automatically learn discriminative features, it often also fits unimportant ones, partially collapsing onto poor features, and the features we actually want it to learn frequently go unlearned. Injecting hand-designed prior information helps the model pick up the key features. The sections below discuss several ways to add such priors to a model.
This comes from an arXiv paper from January 2020. Its core idea is to make a CNN (taking a classification model as the example) more robust by adding noise to its input images.
Unlike earlier approaches that inject hand-crafted noise, the paper borrows the adversarial-network idea: a noise generator produces the noise and tries to make your classification model (the discriminator) output wrong predictions.
Your classification model, in turn, tries to remain unaffected by the injected noise and still produce correct outputs. After several rounds of this alternating training, the classifier learns to resist many kinds of noise.
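Below is a minimal sketch of that alternating training loop. The generator/classifier architectures, the noise scale eps, and the input shapes are placeholder assumptions of mine, not the paper's exact setup:

```python
import tensorflow as tf

# Placeholder architectures (assumptions, not the paper's models).
noise_gen = tf.keras.Sequential([
    tf.keras.layers.Conv2D(16, 3, padding="same", activation="relu",
                           input_shape=(32, 32, 3)),
    tf.keras.layers.Conv2D(3, 3, padding="same", activation="tanh"),  # noise in [-1, 1]
])
classifier = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, 3, activation="relu", input_shape=(32, 32, 3)),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(10),
])

ce = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True)
g_opt = tf.keras.optimizers.Adam(1e-4)
c_opt = tf.keras.optimizers.Adam(1e-4)

@tf.function
def train_step(images, labels, eps=0.1):
    # Step 1: update the generator to *maximize* the classifier's loss.
    with tf.GradientTape() as tape:
        noisy = images + eps * noise_gen(images)
        g_loss = -ce(labels, classifier(noisy))
    g_grads = tape.gradient(g_loss, noise_gen.trainable_variables)
    g_opt.apply_gradients(zip(g_grads, noise_gen.trainable_variables))

    # Step 2: update the classifier to stay correct on the noisy inputs.
    with tf.GradientTape() as tape:
        noisy = images + eps * noise_gen(images)
        c_loss = ce(labels, classifier(noisy))
    c_grads = tape.gradient(c_loss, classifier.trainable_variables)
    c_opt.apply_gradients(zip(c_grads, classifier.trainable_variables))
    return c_loss
```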
Let's briefly list the optimizers you might reach for in a project.
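As a quick reference, here is how the common ones are constructed in tf.keras (TF 2.4); the learning rates below are illustrative placeholders, not recommendations:

```python
import tensorflow as tf

# Optimizers shipped with tf.keras (TF 2.4).
optimizers = [
    tf.keras.optimizers.SGD(learning_rate=0.01, momentum=0.9),
    tf.keras.optimizers.Adam(learning_rate=1e-3),
    tf.keras.optimizers.RMSprop(learning_rate=1e-3),
    tf.keras.optimizers.Adagrad(learning_rate=0.01),
    tf.keras.optimizers.Adadelta(learning_rate=1.0),
    tf.keras.optimizers.Adamax(learning_rate=2e-3),
    tf.keras.optimizers.Nadam(learning_rate=2e-3),
    tf.keras.optimizers.Ftrl(learning_rate=0.01),
]
for opt in optimizers:
    print(opt.get_config()["name"])
```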
S_a1 * (q_a1 - Z_a1) = S_w1 * (q_w1 - Z_w1) * S_a0 * (q_a0 - Z_a0)

q_a1: quantized activation value in layer 1
S_a1, Z_a1: scale and zero point of layer 1 activations, estimated from data
q_w1: quantized weight in layer 1
S_w1, Z_w1: scale and zero point of layer 1 weights, computed from weight statistics
q_a0: quantized activation value in layer 0
S_a0, Z_a0: scale and zero point of layer 0 activations, estimated from data
As we can see, to compute q_a1 (the quantized activation in layer 1) we need S_w1, Z_w1, q_w1, S_a0, Z_a0, q_a0, and S_a1, Z_a1. Getting S_w1/Z_w1 is simple: we can compute them statically from the minimum and maximum of the weights in each layer. The tricky part is S_a1/Z_a1 and S_a0/Z_a0, which have to be estimated from the training data.
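A small numeric sketch of this arithmetic, assuming the usual 8-bit affine scheme r = S * (q - Z); the helper function names here are hypothetical ones of mine:

```python
import numpy as np

def quant_params(x_min, x_max, num_bits=8):
    # Affine quantization r = S * (q - Z), with q in [0, 2^num_bits - 1].
    qmin, qmax = 0, 2 ** num_bits - 1
    S = (x_max - x_min) / (qmax - qmin)
    Z = int(round(qmin - x_min / S))
    return S, Z

def quantize(r, S, Z, num_bits=8):
    q = np.round(r / S + Z)
    return np.clip(q, 0, 2 ** num_bits - 1).astype(np.uint8)

def dequantize(q, S, Z):
    return S * (q.astype(np.float32) - Z)

# Weights: S_w/Z_w come directly from the weight min/max.
w = np.array([-0.8, -0.1, 0.3, 0.9], dtype=np.float32)
S_w, Z_w = quant_params(w.min(), w.max())
q_w = quantize(w, S_w, Z_w)
print(dequantize(q_w, S_w, Z_w))  # close to the original w

# Activations: S_a/Z_a must instead be estimated from calibration
# data, e.g. by tracking activation min/max over many batches.
```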
As we all know, "support vector" is a concept from the SVM (Support Vector Machine): the data points lying on the margin (the maximum-margin boundary in an SVM) are the ones that matter most to the classifier.
We can train an SVM using only the support vectors and reach the same accuracy as a model trained on the full dataset. So do "support vectors" exist in deep learning as well?
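The SVM side of the claim is easy to check; a quick sketch with scikit-learn on a toy dataset:

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.svm import SVC

# Train on everything, then retrain on the support vectors alone.
X, y = make_blobs(n_samples=200, centers=2, random_state=0)
full = SVC(kernel="linear", C=1.0).fit(X, y)

sv = full.support_                          # indices of the support vectors
sv_only = SVC(kernel="linear", C=1.0).fit(X[sv], y[sv])

# Both models classify the full set (almost) identically.
print(full.score(X, y), sv_only.score(X, y))
```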
For multi-class classification where each sample belongs to exactly one class (single-label), use SOFTMAX cross entropy. If a sample can belong to several classes at once (multi-label), use SIGMOID cross entropy instead.
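A minimal sketch contrasting the two losses in TensorFlow, with toy logits and labels:

```python
import tensorflow as tf

logits = tf.constant([[2.0, 0.5, -1.0]])

# Single-label: classes are mutually exclusive -> softmax cross entropy.
y_onehot = tf.constant([[1.0, 0.0, 0.0]])
softmax_loss = tf.nn.softmax_cross_entropy_with_logits(
    labels=y_onehot, logits=logits)

# Multi-label: each class is an independent yes/no -> sigmoid cross entropy.
y_multi = tf.constant([[1.0, 1.0, 0.0]])
sigmoid_loss = tf.nn.sigmoid_cross_entropy_with_logits(
    labels=y_multi, logits=logits)

print(softmax_loss.numpy(), sigmoid_loss.numpy())
```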