I want to implement the MLP model taught in https://www.coursera.org/learn/machine-learning, using TensorFlow. Here's my implementation:
    # 1 hidden layer MLP
    x = tf.placeholder(tf.float32, shape=[None, 784])
    y = tf.placeholder(tf.float32, shape=[None, 10])

    W_h1 = tf.Variable(tf.random_normal([784, 512]))
    h1 = tf.nn.sigmoid(tf.matmul(x, W_h1))

    W_out = tf.Variable(tf.random_normal([512, 10]))
    y_ = tf.matmul(h1, W_out)

    # cross_entropy = tf.nn.sigmoid_cross_entropy_with_logits(y_, y)
    cross_entropy = tf.reduce_sum(- y * tf.log(y_) - (1 - y) * tf.log(1 - y_), 1)
    loss = tf.reduce_mean(cross_entropy)

    train_step = tf.train.GradientDescentOptimizer(0.05).minimize(loss)

    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))

    # train
    with tf.Session() as s:
        s.run(tf.initialize_all_variables())

        for i in range(10000):
            batch_x, batch_y = mnist.train.next_batch(100)
            s.run(train_step, feed_dict={x: batch_x, y: batch_y})

            if i % 100 == 0:
                train_accuracy = accuracy.eval(feed_dict={x: batch_x, y: batch_y})
                print('step {0}, training accuracy {1}'.format(i, train_accuracy))
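For completeness, the snippet above uses mnist without defining it; it comes from the standard TensorFlow MNIST tutorial helper (a setup sketch, assuming an older 0.x/1.x TensorFlow where this module exists):

    import tensorflow as tf
    from tensorflow.examples.tutorials.mnist import input_data

    # downloads MNIST on first use and exposes mnist.train.next_batch()
    mnist = input_data.read_data_sets('MNIST_data', one_hot=True)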
However, it does not work. I think the definition of the layers is correct; the problem is in cross_entropy. If I use the first one, the one that got commented out, the model converges quickly; if I use the second one, which I think/hope is a translation of the cost equation from the course, the model won't converge.
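A side note on why the two lines behave so differently (my reading of the TF docs, not something stated above): tf.nn.sigmoid_cross_entropy_with_logits expects raw logits and applies the sigmoid internally, while the hand-written formula is only valid if y_ is already a probability in (0, 1). Here y_ is a raw matmul output, so tf.log(y_) can see non-positive values. A minimal sketch of the distinction, reusing h1 and W_out from above (the keyword arguments assume a newer TF version than the positional call in the commented line):

    logits = tf.matmul(h1, W_out)        # raw scores, can be any real number
    probs = tf.nn.sigmoid(logits)        # squashed into (0, 1)

    # the built-in op takes logits and handles the sigmoid + log stably
    ce_builtin = tf.nn.sigmoid_cross_entropy_with_logits(labels=y, logits=logits)

    # the manual formula is only valid on probabilities;
    # tf.log of a non-positive value yields -inf/nan
    ce_manual = - y * tf.log(probs) - (1 - y) * tf.log(1 - probs)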
If you want to take a look at the cost equation, you can find it here.
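For reference, this is the neural-network cost function from the course that the second cross_entropy line is meant to translate (reproduced from memory, without the regularization term):

$$
J(\Theta) = -\frac{1}{m}\sum_{i=1}^{m}\sum_{k=1}^{K}\left[\, y_k^{(i)} \log\big(h_\Theta(x^{(i)})\big)_k + \big(1 - y_k^{(i)}\big) \log\Big(1 - \big(h_\Theta(x^{(i)})\big)_k\Big) \right]
$$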
Update:
I have implemented the same MLP model using numpy and scipy, and it works.
In the TensorFlow code, I added a print line in the training loop, and found out that the elements in y_ are nan... I think it's caused by an arithmetic overflow or something like that.
It was the 0*log(0) issue.
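A quick way to see the failure mode (a small numpy check, not part of the original graph): in IEEE floating point, log(0) is -inf and 0 * (-inf) is nan, so a single saturated output poisons the whole sum:

    import numpy as np

    y_true, y_pred = 1.0, 1.0                 # a confident, correct prediction
    term = (1 - y_true) * np.log(1 - y_pred)  # 0 * log(0) = 0 * (-inf); numpy warns about log(0)
    print(term)                               # nan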
Replacing

    cross_entropy = tf.reduce_sum(- y * tf.log(y_) - (1 - y) * tf.log(1 - y_), 1)

with

    cross_entropy = tf.reduce_sum(
        - y * tf.log(tf.clip_by_value(y_, 1e-10, 1.0))
        - (1 - y) * tf.log(tf.clip_by_value(1 - y_, 1e-10, 1.0)), 1)

solved the problem. Please see "Tensorflow NaN bug?".