I am trying to use a multi-layer neural network to predict the nth square.
I have the following training data containing the first 99 squares:

```
1 1
2 4
3 9
4 16
5 25
...
98 9604
99 9801
```

Here is the code:

```python
import numpy as np
import neurolab as nl

# Load input data
text = np.loadtxt('data_sq.txt')

# Separate it into datapoints and labels
data = text[:, :1]
labels = text[:, 1:]

# Define a multilayer neural network with 2 hidden layers;
# First hidden layer consists of 10 neurons
# Second hidden layer consists of 6 neurons
# Output layer consists of 1 neuron
nn = nl.newff([[0, 99]], [10, 6, 1])

# Train the neural network
error_progress = nn.train(data, labels, epochs=2000, show=10, goal=0.01)

# Run the classifier on test datapoints
print(' Test results:')
data_test = [[100], [101]]
for item in data_test:
    print(item, '-->', nn.sim([item])[0])
```

Which prints 1 for both the 100th and 101st squares:

```
Test results:
[100] --> [ 1.]
[101] --> [ 1.]
```

What is the right way to do this?
Recommended answer

Following Filip Malczak's and Seanny123's suggestions and comments, I implemented a neural network in tensorflow to check what happens when we try to teach it to predict (and interpolate) the square function x^2.
Training on a continuous interval
I trained the network on the interval [-7,7] (taking 300 points inside this interval to approximate a continuous range) and then tested it on the interval [-30,30]. The activation function is ReLU, and the network has 3 hidden layers of 50 units each, trained for 500 epochs. The result is depicted in the figure below.
So basically, inside (and close to) the interval [-7,7] the fit is nearly perfect, and outside it the network's output continues more or less linearly. It is nice to see that, at least initially, the slope of the network's output tries to "match" the slope of x^2. If we widen the test interval, the two graphs diverge considerably, as one can see in the figure below:
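One way to see why the extrapolation is linear: a ReLU network computes a piecewise linear function, so past its last kink it can only continue along a straight line. The sketch below is not the trained network from this answer. It is a hand-built one-hidden-layer ReLU network that interpolates x^2 at knots in [-7,7], and the names `make_relu_interpolant` and `knots` are made up for illustration.

```python
import numpy as np


def relu(z):
    return np.maximum(z, 0.0)


def make_relu_interpolant(knots):
    """Build a one-hidden-layer ReLU network that linearly interpolates x**2 at `knots`.

    Hypothetical helper for illustration; the weights are set by hand, not trained.
    """
    knots = np.asarray(knots, dtype=float)
    values = knots ** 2
    slopes = np.diff(values) / np.diff(knots)  # slope of each linear piece
    slope_changes = np.diff(slopes)            # size of the kink at each interior knot

    def network(x):
        x = np.asarray(x, dtype=float)
        # Two ReLU units reproduce the first linear piece on the whole real line.
        out = values[0] + slopes[0] * (relu(x - knots[0]) - relu(knots[0] - x))
        # One ReLU unit per interior knot adds the change in slope at that knot.
        out = out + relu(x[..., None] - knots[1:-1]) @ slope_changes
        return out

    return network


# Knots every 0.5 on the "training" interval [-7, 7].
net = make_relu_interpolant(np.linspace(-7, 7, 29))

# Inside the interval the piecewise linear fit stays close to x**2.
inside = np.linspace(-7, 7, 1001)
max_err_inside = np.max(np.abs(net(inside) - inside ** 2))
print("max error inside [-7, 7]:", max_err_inside)

# Outside the interval there are no more kinks, so the output is a straight line
# whose slope is the slope of the last piece, while x**2 keeps curving upward.
outside = np.array([10.0, 20.0, 30.0])
for x, y in zip(outside, net(outside)):
    print("x = {:5.1f}  network = {:7.1f}  x^2 = {:7.1f}".format(x, y, x ** 2))
```

With these knots the last piece has slope 13.5, so the sketch returns 359.5 at x = 30 against a true value of 900. This is the same qualitative behaviour as in the figures: a good fit inside the interval, a matching slope just outside it, and a growing gap further away.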
Training on even integers
Finally, if I instead train the network on the set of all even integers in the interval [-100,100] and apply it to the set of all integers (even and odd) in this interval, I get:
When training the network to produce the image above, I increased the number of epochs to 2500 to get better accuracy. The rest of the parameters stayed unchanged. So it seems that interpolating "inside" the training interval works quite well (except perhaps in the area around 0, where the fit is a bit worse).
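The code for this second experiment was not posted, so the following is only a sketch of how its data could be prepared. It assumes the same array layout (`train_x`, `train_y`, `test_x` as column vectors) as the code for the first figure; `test_y` and `unseen_x` are names introduced here for illustration.

```python
import numpy as np

# Training inputs: every even integer in [-100, 100], as a column vector.
train_x = np.arange(-100, 101, 2, dtype=np.float32).reshape(-1, 1)
train_y = train_x ** 2

# Test inputs: every integer (even and odd) in the same interval.
test_x = np.arange(-100, 101, dtype=np.float32).reshape(-1, 1)
test_y = test_x ** 2

# The odd integers are the points the network never saw during training,
# so the error on them measures interpolation rather than memorization.
unseen_x = test_x[test_x[:, 0] % 2 != 0]

print(len(train_x), "training points,", len(test_x), "test points,",
      len(unseen_x), "of them unseen")
```

These arrays could then replace the `np.linspace` data in the first-figure code, with `epochs` raised to 2500 as described above.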
Here is the code that I used for the first figure:

```python
# Written against the TensorFlow 1.x graph API (tf.placeholder, tf.Session).
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np

# preparing training data
train_x = np.linspace(-7, 7, 300).reshape(-1, 1)
train_y = train_x ** 2

# setting network features: 3 hidden layers of 50 units and 1 output unit
dimensions = [50, 50, 50, 1]
epochs = 500
batch_size = 5

tf.reset_default_graph()
X = tf.placeholder(tf.float32, shape=[None, 1])
Y = tf.placeholder(tf.float32, shape=[None, 1])

weights = []
biases = []
n_inputs = 1

# initializing variables
for i, n_outputs in enumerate(dimensions):
    with tf.variable_scope("layer_{}".format(i)):
        w = tf.get_variable(
            name="W",
            shape=[n_inputs, n_outputs],
            initializer=tf.random_normal_initializer(mean=0.0, stddev=0.02, seed=42))
        # In TF 1.x the shape is passed to get_variable, not to zeros_initializer.
        b = tf.get_variable(name="b", shape=[n_outputs],
                            initializer=tf.zeros_initializer())
        weights.append(w)
        biases.append(b)
        n_inputs = n_outputs


def forward_pass(X, weights, biases):
    # Note: ReLU is also applied to the last (output) layer,
    # which works here because x**2 is never negative.
    h = X
    for i in range(len(weights)):
        h = tf.add(tf.matmul(h, weights[i]), biases[i])
        h = tf.nn.relu(h)
    return h


output_layer = forward_pass(X, weights, biases)

# sum of squared errors over the batch
cost = tf.reduce_mean(tf.squared_difference(output_layer, Y), 1)
cost = tf.reduce_sum(cost)
optimizer = tf.train.AdamOptimizer(learning_rate=0.01).minimize(cost)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())

    # train the network on shuffled mini-batches
    for i in range(epochs):
        idx = np.arange(len(train_x))
        np.random.shuffle(idx)
        for j in range(len(train_x) // batch_size):
            cur_idx = idx[batch_size * j:batch_size * (j + 1)]
            sess.run(optimizer, feed_dict={X: train_x[cur_idx], Y: train_y[cur_idx]})
        # current_cost = sess.run(cost, feed_dict={X: train_x, Y: train_y})
        # print(current_cost)

    # apply the network on the test data
    test_x = np.linspace(-30, 30, 300)
    network_output = sess.run(output_layer, feed_dict={X: test_x.reshape(-1, 1)})

plt.plot(test_x, test_x ** 2, color='r', label='y=x^2')
plt.plot(test_x, network_output, color='b', label='network output')
plt.legend(loc='center')
plt.show()
```