< Tensorflow > Softmax cross entropy & Sigmoid cross entropy

What is the difference?

For multi-class classification, if exactly one category can be correct for each example (single-label classification), you should use SOFTMAX cross entropy. If more than one category can be correct at the same time (multi-label classification), you should use SIGMOID cross entropy.
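As a rough sketch of the two use cases (written against the same TF 1.x API as the example further down; the tensor values here are made up purely for illustration):

import tensorflow as tf

# Single-label case: each row of labels is one-hot, so softmax cross entropy
# makes the classes in a row compete with each other.
single_logits = tf.constant([[2.0, -1.0, 0.5]])
single_labels = tf.constant([[1.0, 0.0, 0.0]])   # exactly one positive class
single_loss = tf.nn.softmax_cross_entropy_with_logits(
    logits=single_logits, labels=single_labels)

# Multi-label case: several classes can be positive at once, so each logit is
# scored independently with sigmoid cross entropy.
multi_logits = tf.constant([[2.0, -1.0, 0.5]])
multi_labels = tf.constant([[1.0, 0.0, 1.0]])    # two positive classes
multi_loss = tf.nn.sigmoid_cross_entropy_with_logits(
    logits=multi_logits, labels=multi_labels)

with tf.Session() as sess:
    print(sess.run([single_loss, multi_loss]))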

The full example below demonstrates the point numerically:

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import tensorflow as tf
import numpy as np

np.set_printoptions(precision=4, suppress=True)


def fun():
    # Binary / multi-label case: one logit per class, labels in {0, 1}.
    x = tf.constant([7, 6, -4], tf.float64)
    x_sig = tf.nn.sigmoid(x)
    z = tf.constant([1, 1, 0], tf.float64)
    # Naive sigmoid cross entropy: -z*log(p) - (1-z)*log(1-p).
    loss_1 = tf.reduce_sum(z * -tf.log(x_sig) + (1 - z) * -tf.log(1 - x_sig))
    # The same loss via the TensorFlow API.
    loss_2 = tf.reduce_sum(tf.nn.sigmoid_cross_entropy_with_logits(logits=x, labels=z))

    # Two-class case: each row holds the logits of two mutually exclusive classes.
    logits = tf.constant([[3, -4], [4, -2], [-2, 2]], tf.float64)
    y_ = tf.constant([[1, 0], [1, 0], [0, 1]], tf.float64)
    # Sigmoid cross entropy applied element-wise to the two-column logits.
    loss_3 = tf.reduce_sum(tf.nn.sigmoid_cross_entropy_with_logits(logits=logits, labels=y_))

    # Naive softmax cross entropy: -sum(y_ * log(softmax(logits))).
    softmax = tf.nn.softmax(logits)
    loss_4 = -tf.reduce_sum(y_ * tf.log(softmax))
    # The same loss via the TensorFlow API.
    loss_5 = tf.reduce_sum(tf.nn.softmax_cross_entropy_with_logits(logits=logits, labels=y_))

    with tf.Session() as sess:
        print('loss1:', sess.run(loss_1))
        print('loss2:', sess.run(loss_2))
        print('loss3:', sess.run(loss_3))
        print('loss4:', sess.run(loss_4))
        print('loss5:', sess.run(loss_5))


fun()

The results are as follows:

loss1: 0.021537079509314265
loss2: 0.02153707950931443
loss3: 0.465671240538279
loss4: 0.021537079509314265
loss5: 0.021537079509314338

loss1 is computed with a naive implementation of sigmoid cross entropy.

loss2 is computed with the standard sigmoid cross entropy from the TensorFlow API.

loss3 is the sigmoid cross entropy of the two-column logits, where every logit is scored as an independent binary prediction.

loss4 is computed with a naive implementation of softmax cross entropy.

loss5 is computed with the standard softmax cross entropy from the TensorFlow API.
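For completeness, loss3 (which the code computes but the original explanation skips) can be reproduced by hand the same way as loss1, by applying the sigmoid formula to each of the six logits independently. This small NumPy check is my own sketch, not part of the original post:

import numpy as np

logits = np.array([[3., -4.], [4., -2.], [-2., 2.]])
labels = np.array([[1., 0.], [1., 0.], [0., 1.]])

p = 1.0 / (1.0 + np.exp(-logits))                              # element-wise sigmoid
loss3 = np.sum(-labels * np.log(p) - (1 - labels) * np.log(1 - p))
print(loss3)                                                   # ~0.4657, matching loss3 above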

As you can see, loss1, loss2, loss4 and loss5 are all equal (up to floating-point error): the sigmoid cross entropy of the single logits matches the softmax cross entropy of the two-column logits.

This is mainly because the sigmoid can be seen as a special case of the softmax: applying the sigmoid to a single logit x gives the same probability as applying a two-way softmax to any pair of logits whose difference is x (for example, the pair [x, 0]). The pairs in the example above were chosen exactly this way: 3 - (-4) = 7, 4 - (-2) = 6 and 2 - (-2) = 4 are the single logits fed to the sigmoid.
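Here is a quick NumPy check of that claim (my own sketch, not part of the original post):

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(v):
    e = np.exp(v - np.max(v))       # shift for numerical stability
    return e / e.sum()

x = 7.0
print(sigmoid(x))                   # ~0.9991
print(softmax([3.0, -4.0])[0])      # same value, since 3 - (-4) = 7
print(softmax([x, 0.0])[0])         # also the same: the canonical two-class form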
