python - TensorFlow: 2 layer feed forward neural net


I'm trying to implement a simple fully-connected feed-forward neural net in TensorFlow (Python 3 version). The network has 2 inputs and 1 output, and I'm trying to train it to output the XOR of the 2 inputs. My code is as follows:

    import numpy as np
    import tensorflow as tf

    sess = tf.InteractiveSession()

    inputs = tf.placeholder(tf.float32, shape = [None, 2])
    desired_outputs = tf.placeholder(tf.float32, shape = [None, 1])

    weights_1 = tf.Variable(tf.zeros([2, 3]))
    biases_1 = tf.Variable(tf.zeros([1, 3]))
    layer_1_outputs = tf.nn.sigmoid(tf.matmul(inputs, weights_1) + biases_1)

    weights_2 = tf.Variable(tf.zeros([3, 1]))
    biases_2 = tf.Variable(tf.zeros([1, 1]))
    layer_2_outputs = tf.nn.sigmoid(tf.matmul(layer_1_outputs, weights_2) + biases_2)

    error_function = -tf.reduce_sum(desired_outputs * tf.log(layer_2_outputs))
    train_step = tf.train.GradientDescentOptimizer(0.05).minimize(error_function)

    sess.run(tf.initialize_all_variables())

    training_inputs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]
    training_outputs = [[0.0], [1.0], [1.0], [0.0]]

    for i in range(10000):
        train_step.run(feed_dict = {inputs: np.array(training_inputs), desired_outputs: np.array(training_outputs)})

    print(sess.run(layer_2_outputs, feed_dict = {inputs: np.array([[0.0, 0.0]])}))
    print(sess.run(layer_2_outputs, feed_dict = {inputs: np.array([[0.0, 1.0]])}))
    print(sess.run(layer_2_outputs, feed_dict = {inputs: np.array([[1.0, 0.0]])}))
    print(sess.run(layer_2_outputs, feed_dict = {inputs: np.array([[1.0, 1.0]])}))

It seems simple enough, but the print statements at the end show that the neural net is nowhere near the desired outputs, regardless of the number of training iterations or the learning rate. Can anyone see what I am doing wrong?

Thank you.

Edit: I've also tried the following alternative error function:

    error_function = 0.5 * tf.reduce_sum(tf.sub(layer_2_outputs, desired_outputs) * tf.sub(layer_2_outputs, desired_outputs))

That error function is the sum of the squares of the errors. It always results in the network outputting a value of 0.5, another indication of a mistake somewhere in my code.

Edit 2: I've found that my code works fine for AND and OR, but not for XOR. I'm extremely puzzled now.
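One way to see why AND and OR train while XOR does not is to compute the very first gradient step by hand. The sketch below is plain Python, not TensorFlow. It assumes the all-zero initialization and the sum-of-squares error from the question, and the helper name `output_bias_gradient` is made up for illustration. With every weight at zero, each example produces an output of 0.5, and the gradient reaching the output bias is the sum over the four examples of (output - target) * output * (1 - output).

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def output_bias_gradient(targets):
    """Gradient of 0.5 * sum((out - target)^2) w.r.t. the output bias,
    evaluated at the all-zero initialization (hypothetical helper)."""
    out = sigmoid(0.0)  # every example yields 0.5 when all weights are zero
    # Per example: d(error)/d(pre-activation) = (out - target) * out * (1 - out)
    return sum((out - t) * out * (1.0 - out) for t in targets)


# Targets for the inputs (0,0), (0,1), (1,0), (1,1)
xor_targets = [0.0, 1.0, 1.0, 0.0]
and_targets = [0.0, 0.0, 0.0, 1.0]
or_targets = [0.0, 1.0, 1.0, 1.0]

print("XOR:", output_bias_gradient(xor_targets))
print("AND:", output_bias_gradient(and_targets))
print("OR: ", output_bias_gradient(or_targets))
```

For XOR the two positive and two negative examples cancel and the sum is 0. The output weights receive the same sum scaled by the hidden activation 0.5, and the hidden weights receive it scaled by the zero output weights, so under these assumptions no parameter moves and the output stays at 0.5. For AND and OR the sum is nonzero, so training can get started.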

There are several issues in your code. In the following, I'm going to comment each line to bring you to the solution.

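Note: XOR is not linearly separable, so the network needs at least 1 hidden layer with a nonlinearity. One hidden layer is enough in principle; the solution below uses 2.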
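The separability claim can be illustrated with a small brute-force search in plain Python. This is only a sketch: the function name `find_linear_unit` and the grid of candidate values are arbitrary choices for illustration, and a grid search is not a proof. The general argument is short, though: a single threshold unit would need b <= 0, w1 + b > 0, w2 + b > 0 and w1 + w2 + b <= 0, and adding the two middle inequalities contradicts the other two.

```python
import itertools

# The four possible inputs, in the same order as the training data
POINTS = [(0, 0), (0, 1), (1, 0), (1, 1)]


def find_linear_unit(targets, grid):
    """Return (w1, w2, b) such that (w1*x1 + w2*x2 + b > 0) reproduces
    `targets` on all four points, or None if no grid value works."""
    for w1, w2, b in itertools.product(grid, repeat=3):
        predicted = [int(w1 * x1 + w2 * x2 + b > 0) for x1, x2 in POINTS]
        if predicted == targets:
            return (w1, w2, b)
    return None


# Candidate values from -2.0 to 2.0 in steps of 0.5 (an arbitrary choice)
grid = [k / 2.0 for k in range(-4, 5)]

print("AND:", find_linear_unit([0, 0, 0, 1], grid))
print("OR: ", find_linear_unit([0, 1, 1, 1], grid))
print("XOR:", find_linear_unit([0, 1, 1, 0], grid))  # None
```

AND and OR each have a separating line on this grid, while XOR has none, which is why a network without a hidden layer cannot represent it.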

N.B.: the lines that start with # [!] mark the places where your code was wrong.

    import numpy as np
    import tensorflow as tf

    sess = tf.InteractiveSession()

    # a batch of inputs of 2 values each
    inputs = tf.placeholder(tf.float32, shape=[None, 2])

    # a batch of outputs of 1 value each
    desired_outputs = tf.placeholder(tf.float32, shape=[None, 1])

    # [!] define the number of hidden units in the first layer
    hidden_units = 4

    # connect the 2 inputs to the hidden units
    # [!] initialize the weights with random numbers, to make the network learn
    weights_1 = tf.Variable(tf.truncated_normal([2, hidden_units]))

    # [!] the biases are single values per hidden unit
    biases_1 = tf.Variable(tf.zeros([hidden_units]))

    # connect the 2 inputs to every hidden unit and add the bias
    layer_1_outputs = tf.nn.sigmoid(tf.matmul(inputs, weights_1) + biases_1)

    # [!] the XOR problem is that the function is not linearly separable
    # [!] an MLP (multilayer perceptron) can learn to separate points that are
    # not linearly separable (you can think of it as learning hypercurves,
    # not only hyperplanes)
    # [!] let's add a new layer and change layer 2 to output more than 1 value

    # connect the first hidden units to 2 hidden units in the second hidden layer
    weights_2 = tf.Variable(tf.truncated_normal([hidden_units, 2]))
    # [!] the same as above
    biases_2 = tf.Variable(tf.zeros([2]))

    # connect the hidden units to the second hidden layer
    layer_2_outputs = tf.nn.sigmoid(
        tf.matmul(layer_1_outputs, weights_2) + biases_2)

    # [!] create the new layer
    weights_3 = tf.Variable(tf.truncated_normal([2, 1]))
    biases_3 = tf.Variable(tf.zeros([1]))

    logits = tf.nn.sigmoid(tf.matmul(layer_2_outputs, weights_3) + biases_3)

    # [!] the error function you chose is good for a multiclass classification
    # task, not for XOR
    error_function = 0.5 * tf.reduce_sum(tf.sub(logits, desired_outputs) * tf.sub(logits, desired_outputs))

    train_step = tf.train.GradientDescentOptimizer(0.05).minimize(error_function)

    sess.run(tf.initialize_all_variables())

    training_inputs = [[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]]

    training_outputs = [[0.0], [1.0], [1.0], [0.0]]

    for i in range(20000):
        _, loss = sess.run([train_step, error_function],
                           feed_dict={inputs: np.array(training_inputs),
                                      desired_outputs: np.array(training_outputs)})
        print(loss)

    print(sess.run(logits, feed_dict={inputs: np.array([[0.0, 0.0]])}))
    print(sess.run(logits, feed_dict={inputs: np.array([[0.0, 1.0]])}))
    print(sess.run(logits, feed_dict={inputs: np.array([[1.0, 0.0]])}))
    print(sess.run(logits, feed_dict={inputs: np.array([[1.0, 1.0]])}))

I increased the number of training iterations to be sure that the network converges no matter what the random initialization values are.

The output, after 20000 training iterations, is:

    [[ 0.01759939]]
    [[ 0.97418505]]
    [[ 0.97734243]]
    [[ 0.0310041]]

It looks pretty good.
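The `# [!]` remark about the error function can also be made concrete. The question's `-tf.reduce_sum(desired_outputs * tf.log(layer_2_outputs))` only has a term for targets equal to 1, so a network that outputs values near 1 for every input gets an error near 0. The sketch below, in plain Python rather than TensorFlow, compares it with the full binary cross-entropy. Note that this is a different error function from the sum of squares used in the code above, and the function names are illustrative only.

```python
import math


def partial_cross_entropy(targets, outputs):
    """-sum(t * log(o)): the error function from the question."""
    return -sum(t * math.log(o) for t, o in zip(targets, outputs))


def binary_cross_entropy(targets, outputs):
    """-sum(t * log(o) + (1 - t) * log(1 - o)): also penalizes
    outputs that are high when the target is 0."""
    return -sum(t * math.log(o) + (1.0 - t) * math.log(1.0 - o)
                for t, o in zip(targets, outputs))


xor_targets = [0.0, 1.0, 1.0, 0.0]
always_one = [0.999, 0.999, 0.999, 0.999]  # a net that ignores its inputs
good_fit = [0.02, 0.97, 0.98, 0.03]        # roughly the outputs shown above

for name, outputs in [("always ~1", always_one), ("good fit", good_fit)]:
    print(name,
          "partial:", round(partial_cross_entropy(xor_targets, outputs), 4),
          "full:", round(binary_cross_entropy(xor_targets, outputs), 4))
```

Under the partial error the input-independent outputs score better than the good fit, so minimizing it pushes every output toward 1. The full version reverses the ranking.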

