This is an example of a two-layer perceptron for classification of the CIFAR10 dataset. Network structure: 2 hidden layers. Hidden layer 1: 256 nodes. Hidden layer 2: 128 nodes. 10 output classes.
It uses a simple sigmoid activation function and the Adam optimizer to reduce the cost. Cost is computed as cross entropy with logits.
Questions to ask
1) Why do we use the sigmoid activation function? What are its advantages?
2) What is cross entropy with logits? (A small numerical sketch follows below.)
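As a partial answer to question 2, here is a minimal numpy sketch (not part of the original notebook) of how softmax cross entropy is computed directly from logits; the logit values are made up for illustration.
import numpy as np

# Hypothetical logits for one sample over 3 classes, and its one-hot label.
logits = np.array([2.0, 1.0, 0.1])
label = np.array([1.0, 0.0, 0.0])

# Softmax turns logits into probabilities; subtracting the max is for numerical stability.
probs = np.exp(logits - np.max(logits))
probs /= probs.sum()

# Cross entropy: negative log-probability assigned to the true class.
loss = -np.sum(label * np.log(probs))
print(probs, loss)  # ~[0.66, 0.24, 0.10], loss ~ 0.42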
Data is assumed to be present in the CIFAR10 folder
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
import os
from six.moves import cPickle
%matplotlib inline
The first step is some data exploration. The data is stored in pickled format; the format is described at http://www.cs.utoronto.ca/~kriz/cifar.html
Filenames = {'batch1': 'CIFAR10/cifar-10-batches-py/data_batch_1',
             'batch2': 'CIFAR10/cifar-10-batches-py/data_batch_2',
             'batch3': 'CIFAR10/cifar-10-batches-py/data_batch_3',
             'batch4': 'CIFAR10/cifar-10-batches-py/data_batch_4',
             'batch5': 'CIFAR10/cifar-10-batches-py/data_batch_5'
             }
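Before writing loaders, it helps to peek at what one batch file contains. This quick check is not in the original notebook; it assumes the paths above and the key names documented on the CIFAR10 page.
with open(Filenames['batch1'], 'rb') as f:
    d = cPickle.load(f, encoding='latin1')
print(d.keys())          # expected keys: batch_label, labels, data, filenames
print(d['data'].shape)   # (10000, 3072): each row is one 32x32x3 image
print(d['labels'][:10])  # integer class labels in 0..9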
def getImageData(filename):
    with open(filename, 'rb') as f:
        datadict = cPickle.load(f, encoding='latin1')  # latin1 because the files were pickled under Python 2
    # Each row stores 1024 red, then 1024 green, then 1024 blue values:
    # reshape to (N, 3, 32, 32), then transpose to channel-last (N, 32, 32, 3) for plotting.
    X = datadict['data'].reshape((len(datadict['data']), 3, 32, 32)).transpose(0, 2, 3, 1)
    return X
# A function to display label statistics for a batch file
def display_stats(filename, sample_id):
    with open(filename, 'rb') as f:
        datadict = cPickle.load(f, encoding='latin1')  # latin1 because the files were pickled under Python 2
    X = datadict['data']
    Y = datadict['labels']
    print(len(Y))  # Note: Y is a list
    for _y in set(Y):
        print(_y, Y.count(_y), end=' ')
    print('\nLabel of sample', sample_id, ':', Y[sample_id])  # sample_id was previously unused
display_stats(Filenames["batch2"],4)
Display 25 Random Images in a grid
X_image = getImageData(Filenames["batch2"])
fig, axes1 = plt.subplots(5, 5, figsize=(3, 3))
for j in range(5):
    for k in range(5):
        i = np.random.choice(range(len(X_image)))  # was len(X): X is not defined at this point
        axes1[j][k].set_axis_off()
        axes1[j][k].imshow(X_image[i, :])
Preprocessing
Before we create a network and train the model, we would like to preprocess the data.
We will preprocess the data and save it to disk, which is also useful for later runs. We will also write helper functions that serve the data in batches to the optimization loop used when training the neural network.
During preprocessing we also create a validation set: we keep aside 10% of the training data for validation.
def normalize(image):
    maximum = np.max(image)
    minimum = np.min(image)
    return (image - minimum) / (maximum - minimum)
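A quick sanity check of the min-max scaling (the input values here are illustrative, not from the original notebook):
print(normalize(np.array([0., 63.75, 255.])))  # -> [0., 0.25, 1.]: values mapped linearly into [0, 1]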
# A list of integer labels (0-9) representing 10 different image classes needs to be encoded.
# One-hot encoding: each label selects the corresponding row of a 10 x 10 identity matrix.
def oneHotEncoding(labels):
    maxval = np.max(labels)  # assumes the highest class id actually appears in labels
    return np.eye(maxval + 1)[labels]
# Uncomment to see how one hot encoding on an example set looks like
#print(oneHotEncoding([1,2,3,4,5,6,7,8,9,1,2,3,4]))
def PreProcessAndSaveCIFAR10():
    validFeatures = []
    validLabels = []
    for (filename, path) in Filenames.items():
        with open(path, 'rb') as f:
            datadict = cPickle.load(f, encoding='latin1')  # latin1 because the files were pickled under Python 2
        features = datadict['data'].reshape((len(datadict['data']), 3, 32, 32)).transpose(0, 2, 3, 1)
        labels = datadict['labels']
        validationCount = int(len(features) * 0.1)  # Note: len(features) gives dim 0 of the numpy array
        featureSubset = normalize(features[:-validationCount])  # take the first 90% and normalize it
        labelSubset = oneHotEncoding(labels[:-validationCount])
        validFeatures.extend(features[-validationCount:])  # add the remaining 10% to the validation features
        validLabels.extend(labels[-validationCount:])
        with open("preprocess_" + filename, 'wb') as f:
            cPickle.dump((featureSubset, labelSubset), f)
    # np.array stacks the accumulated list of images into one array; normalize then scales it into [0, 1]
    validFeatures = normalize(np.array(validFeatures))
    validLabels = oneHotEncoding(np.array(validLabels))
    with open("preprocess_valid", 'wb') as f:
        cPickle.dump((validFeatures, validLabels), f)
PreProcessAndSaveCIFAR10()
# A generator that yields the preprocessed data one batch at a time
def loadPreProcessingData(filename, batchSize):
    filename = "preprocess_" + filename
    with open(filename, 'rb') as f:
        features, labels = cPickle.load(f)
    for start in range(0, len(features), batchSize):
        end = min(start + batchSize, len(features))
        yield features[start:end], labels[start:end]
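A quick check of the generator (not in the original notebook), assuming batch1 has already been preprocessed by the cell above:
f, l = next(loadPreProcessingData('batch1', 1024))
print(f.shape, l.shape)  # expected: (1024, 32, 32, 3) and (1024, 10)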
Create the Two Layer Network
numBatches = 5
numSamplesPerBatch = 10000
numSamples = numBatches*numSamplesPerBatch
numClasses = 10
numHidden1 = 256
numHidden2 = 128
numInput = 32*32*3
X = tf.placeholder("float",[None,numInput])
Y = tf.placeholder("float",[None,numClasses])
stddev = 0.1
weights = {
    'h1': tf.Variable(tf.random_normal([numInput, numHidden1], stddev=stddev)),
    'h2': tf.Variable(tf.random_normal([numHidden1, numHidden2], stddev=stddev)),
    'out': tf.Variable(tf.random_normal([numHidden2, numClasses], stddev=stddev))
}
biases = {
    'b1': tf.Variable(tf.random_normal([numHidden1], stddev=stddev)),
    'b2': tf.Variable(tf.random_normal([numHidden2], stddev=stddev)),
    'out': tf.Variable(tf.random_normal([numClasses], stddev=stddev))
}
print ("NETWORK READY")
def multiLayerPerceptron(_X, _weights, _biases):
    layer1 = tf.nn.sigmoid(tf.add(tf.matmul(_X, _weights['h1']), _biases['b1']))
    layer2 = tf.nn.sigmoid(tf.add(tf.matmul(layer1, _weights['h2']), _biases['b2']))
    out = tf.add(tf.matmul(layer2, _weights['out']), _biases['out'])  # raw logits; softmax is applied inside the loss
    return out
pred = multiLayerPerceptron(X,weights,biases)
Some useful functions provided by TensorFlow
cost = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(logits=pred,labels=Y))
optimizer = tf.train.AdamOptimizer(learning_rate=0.01).minimize(cost)
corr = tf.equal(tf.argmax(pred,1),tf.argmax(Y,1))
accuracy = tf.reduce_mean(tf.cast(corr,'float'))
init = tf.global_variables_initializer()
valid_features, valid_labels = cPickle.load(open('preprocess_valid', mode='rb'))
with tf.Session() as sess:
    sess.run(init)
    for epoch in range(50):
        for (filename, path) in Filenames.items():
            for f, l in loadPreProcessingData(filename, 1024):
                sess.run(optimizer, feed_dict={X: f.reshape(len(f), 3072), Y: l})
        # cost on the last training batch of the epoch; accuracy on the held-out validation set
        c = sess.run(cost, feed_dict={X: f.reshape(len(f), 3072), Y: l})
        acc = sess.run(accuracy, feed_dict={X: valid_features.reshape(len(valid_features), 3072), Y: valid_labels})
        if epoch % 10 == 0:
            print("Epoch =", epoch, " Cost = ", c, " Accuracy = ", acc)
What next? This notebook is aimed only at showing how to write a multi-layer perceptron and train it on CIFAR10. Validation accuracy is around 37%, which is poor, so we do not even measure test accuracy. Once we add convolution layers to the network we will measure test accuracy as well. Let's hope that test accuracy on CIFAR10 with a basic CNN is above 50%. (A minimal sketch of a convolution block follows.)
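As a pointer toward that next step, here is a minimal sketch of a single convolution + pooling block in the same TF1 style; the kernel size (5x5) and channel count (32) are arbitrary illustrative choices, not recommendations from this notebook.
# Hypothetical first convolution block; Ximg is the flat input reshaped back into images
Ximg = tf.reshape(X, [-1, 32, 32, 3])
Wc1 = tf.Variable(tf.random_normal([5, 5, 3, 32], stddev=0.1))  # 5x5 kernels, 3 in-channels, 32 out-channels
bc1 = tf.Variable(tf.random_normal([32], stddev=0.1))
conv1 = tf.nn.relu(tf.nn.conv2d(Ximg, Wc1, strides=[1, 1, 1, 1], padding='SAME') + bc1)
pool1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='SAME')  # -> (?, 16, 16, 32)
# pool1 would then be flattened and fed into fully connected layers as above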