Fully Convolutional Network

The networks used previously were trained to classify images. Semantic segmentation is another important problem that can be solved with deep learning, and Fully Convolutional Networks (FCNs) can be employed for it.

This notebook attempts to use an FCN to solve the Kaggle Ultrasound Nerve Segmentation challenge.

Code in this notebook is inspired by https://github.com/jocicmarko/ultrasound-nerve-segmentation/blob/master/train.py

Some important points to remember:

  • Deconvolution in neural networks is not truly deconvolution in the signal-processing sense.
  • A more appropriate name for it is transposed convolution.

  • Convolution can be viewed as a matrix operation. A very good visual tutorial is at http://deeplearning.net/software/theano/tutorial/conv_arithmetic.html

  • One important thing to remember about CNNs is: 'Every filter is small spatially (along width and height), but extends through the full depth of the input volume.'

'Example 1. For example, suppose that the input volume has size [32x32x3], (e.g. an RGB CIFAR-10 image). If the receptive field (or the filter size) is 5x5, then each neuron in the Conv Layer will have weights to a [5x5x3] region in the input volume, for a total of 5*5*3 = 75 weights (and +1 bias parameter).'

'Example 2. Suppose an input volume had size [16x16x20]. Then using an example receptive field size of 3x3, every neuron in the Conv Layer would now have a total of 3*3*20 = 180 connections to the input volume. Notice that, again, the connectivity is local in space (e.g. 3x3), but full along the input depth (20).'

More details at http://cs231n.github.io/convolutional-networks/
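
As a quick sanity check of these parameter counts, and of how a transposed convolution upsamples, the sketch below builds small throwaway Keras layers. It assumes the same Keras version imported later in this notebook and is illustrative only.

In [ ]:
from keras.models import Sequential, Model
from keras.layers import Conv2D, Conv2DTranspose, Input

# Example 1: one 5x5 filter over a [32x32x3] input -> 5*5*3 = 75 weights + 1 bias
m1 = Sequential([Conv2D(1, (5, 5), input_shape=(32, 32, 3))])
print(m1.count_params())  # 76

# Example 2: one 3x3 filter over a [16x16x20] input -> 3*3*20 = 180 weights + 1 bias
m2 = Sequential([Conv2D(1, (3, 3), input_shape=(16, 16, 20))])
print(m2.count_params())  # 181

# A stride-2 transposed convolution doubles the spatial dimensions
inp = Input((8, 8, 4))
out = Conv2DTranspose(2, (2, 2), strides=(2, 2), padding='same')(inp)
print(Model(inp, out).output_shape)  # (None, 16, 16, 2)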

In [1]:
import numpy as np
import os
from skimage.io import imsave,imread
import matplotlib.pyplot as plt
%matplotlib inline

Explore the Dataset: Display Images and Corresponding Masks

In [ ]:
data_path='C:\\NanoDegree\\Kaggle\\NerveSegmentation\\'

def displayImgAndMask(idx):
    train_data_path = os.path.join(data_path,'train')
    images = os.listdir(train_data_path)
    img_name = images[idx]
    
    if 'mask' in img_name:
        img_name = img_name.split('_mask')[0]+'.tif'
        
    print(img_name)

    img = imread(os.path.join(train_data_path,img_name),as_grey=True)
    plt.imshow(img,cmap='gray')
    plt.show()
    
    img_mask_name = img_name.split('.')[0]+'_mask.tif'
    img_mask = imread(os.path.join(train_data_path,img_mask_name))

    plt.imshow(img_mask,cmap='gray')
    plt.show()

displayImgAndMask(5900)

Read All the Images and Masks from the Training Dataset into a NumPy Array

In [ ]:
image_rows = 420
image_cols = 580
def create_train_data():
    train_data_path = os.path.join(data_path,'train')
    images = os.listdir(train_data_path)
    print (len(images))
    # Assuming there is one mask for every image
    numImages = int(len(images)/2)
    print(numImages)
    imageData = np.ndarray((numImages,image_rows,image_cols),dtype = np.uint8)
    imageMaskData = np.ndarray((numImages,image_rows,image_cols),dtype=np.uint8)
    
    i = 0 # Index into the Image Data
    for image in images:
        if 'mask' in image:
            continue
        imageMask = image.split('.')[0]+'_mask.tif'
        
        img = imread(os.path.join(train_data_path,image),as_grey=True)
        imgMask = imread(os.path.join(train_data_path,imageMask),as_grey=True)
        
        imageData[i] = img
        imageMaskData[i] = imgMask
        
        i= i+1
        if i % 500 == 0:
            print("----------Completed reading next 500------------------------")
    np.save('imgs_train.npy',imageData)
    np.save('imgs_train_mask.npy',imageMaskData)
    
    
In [ ]:
create_train_data()

Load Training data from saved npy files

In [2]:
def load_train_data():
    imgs_train = np.load('imgs_train.npy')
    imgs_train_mask = np.load('imgs_train_mask.npy')
    return imgs_train,imgs_train_mask

   

Pre-Processing

  • A simple pre-processing step shrinks the data to 96 x 96. As this is for experimentation purposes, elaborate pre-processing is avoided.
In [3]:
from skimage.transform import resize
img_newRows = 96
img_newCols = 96
def preprocess(imgs):
    imgs_p = np.ndarray((imgs.shape[0],img_newRows,img_newCols),dtype=np.uint8)
    for i in range(imgs.shape[0]):
        # preserve_range=True keeps the original uint8 intensity range instead of rescaling to floats in [0, 1]
        imgs_p[i] = resize(imgs[i],(img_newRows,img_newCols),preserve_range=True)
    print(imgs_p.shape)
    imgs_p=imgs_p[...,np.newaxis] # Note this just adds a new dimension at the end. i.e (5635,96,96) becomes (5635,96,96,1)
    print(imgs_p.shape)
    return imgs_p
    

Why an Additional Dimension of '1' at the End?

A new dimension of 1 is introduced at the end because the images are grayscale and TensorFlow needs a volume as input; e.g., in the CIFAR-10 dataset the input is 32 x 32 x 3.

Colors form the third dimension in CIFAR-10. As there are no colors/channels here, a '1' is added at the end.

It is also important to tell TensorFlow (the backend of Keras) to treat the last dimension as the channel.

The following commands do so:

  • from keras import backend as K
  • K.set_image_data_format('channels_last')
In [9]:
imgs_train, imgs_train_mask = load_train_data()
imgs_train = preprocess(imgs_train)


imgs_train_mask = preprocess(imgs_train_mask)
(5635, 96, 96)
(5635, 96, 96, 1)
In [10]:
plt.imshow(imgs_train[100,...,0],cmap='gray')
plt.show()
plt.imshow(imgs_train_mask[100,...,0],cmap='gray')
plt.show()

Define the loss

The loss is defined via the Dice coefficient, which measures the overlap between the predicted and true masks: dice = 2 * |X ∩ Y| / (|X| + |Y|), ranging from 0 (no overlap) to 1 (perfect overlap). The +1 term in the numerator and denominator smooths the ratio and avoids division by zero when both masks are empty, and the coefficient is negated so that minimizing the loss maximizes the overlap.

In [14]:
from keras import backend as K

def dice_coef(y_true, y_pred):
    y_true_f = K.flatten(y_true)
    y_pred_f = K.flatten(y_pred)
    intersection = K.sum(y_true_f * y_pred_f)
    return (2. * intersection + 1) / (K.sum(y_true_f) + K.sum(y_pred_f) + 1)


def dice_coef_loss(y_true, y_pred):
    return -dice_coef(y_true, y_pred)
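
As a quick numeric sanity check of the formula above, the sketch below evaluates dice_coef on a tiny hand-made pair of masks; it assumes the cell above has been run so that dice_coef and K are defined.

In [ ]:
y_true = np.array([[0., 1., 1., 0.]])
y_pred = np.array([[0., 1., 0., 0.]])
# intersection = 1, so dice = (2*1 + 1) / (2 + 1 + 1) = 0.75
print(K.eval(dice_coef(K.variable(y_true), K.variable(y_pred))))  # ~0.75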

Define Network

A smaller version of the U-Net is used here: two max-pooling downsampling steps, a bottleneck, and two transposed-convolution upsampling steps with skip connections back to the corresponding encoder layers.

In [23]:
from keras.models import Model
from keras.layers import Input, concatenate, Conv2D, MaxPooling2D, Conv2DTranspose
from keras.optimizers import Adam
from keras.callbacks import ModelCheckpoint
from keras import backend as K

def get_unet():
    inputs = Input((img_newRows, img_newCols, 1))
    conv1 = Conv2D(32, (3, 3), activation='relu', padding='same')(inputs)
    conv1 = Conv2D(32, (3, 3), activation='relu', padding='same')(conv1)
    pool1 = MaxPooling2D(pool_size=(2, 2))(conv1)

    conv2 = Conv2D(64, (3, 3), activation='relu', padding='same')(pool1)
    conv2 = Conv2D(64, (3, 3), activation='relu', padding='same')(conv2)
    pool2 = MaxPooling2D(pool_size=(2, 2))(conv2)
    
    conv3 = Conv2D(128, (3, 3), activation='relu', padding='same')(pool2)
    conv3 = Conv2D(128, (3, 3), activation='relu', padding='same')(conv3)

    up4 = concatenate([Conv2DTranspose(64, (2, 2), strides=(2, 2), padding='same')(conv3), conv2], axis=3)
    conv4 = Conv2D(64, (3, 3), activation='relu', padding='same')(up4)
    conv4 = Conv2D(64, (3, 3), activation='relu', padding='same')(conv4)

    up5 = concatenate([Conv2DTranspose(32, (2, 2), strides=(2, 2), padding='same')(conv4), conv1], axis=3)
    conv5 = Conv2D(32, (3, 3), activation='relu', padding='same')(up5)
    conv5 = Conv2D(32, (3, 3), activation='relu', padding='same')(conv5)

    conv6 = Conv2D(1, (1, 1), activation='sigmoid')(conv5)

    model = Model(inputs=[inputs], outputs=[conv6])

    model.compile(optimizer=Adam(lr=1e-5), loss=dice_coef_loss, metrics=[dice_coef])

    return model
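
To verify the architecture, model.summary() lists each layer's output shape and parameter count; a short usage sketch (the model is also constructed in a cell further below):

In [ ]:
model = get_unet()
model.summary()  # prints every layer with its output shape and number of parameters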

Normalize the Data

In [18]:
imgs_train = imgs_train.astype('float32')
mean = np.mean(imgs_train)  # mean for data centering
std = np.std(imgs_train)  # std for data normalization
imgs_train -= mean
imgs_train /= std

imgs_train_mask = imgs_train_mask.astype('float32')
imgs_train_mask /= 255.  # scale masks to [0, 1]
In [19]:
K.set_image_data_format('channels_last')  # TF dimension ordering in this code
In [24]:
model = get_unet()
model_checkpoint = ModelCheckpoint('weights.h5', monitor='val_loss', save_best_only=True)
In [41]:
model.fit(imgs_train, imgs_train_mask, batch_size=128, epochs=10, verbose=1, shuffle=True,
          validation_split=0.2,
          callbacks=[model_checkpoint])
Train on 4508 samples, validate on 1127 samples
Epoch 1/10
4508/4508 [==============================] - 632s - loss: -0.0374 - dice_coef: 0.0374 - val_loss: -0.0394 - val_dice_coef: 0.0394
Epoch 2/10
4508/4508 [==============================] - 617s - loss: -0.0583 - dice_coef: 0.0583 - val_loss: -0.0664 - val_dice_coef: 0.0664
Epoch 3/10
4508/4508 [==============================] - 617s - loss: -0.0969 - dice_coef: 0.0969 - val_loss: -0.1056 - val_dice_coef: 0.1056
Epoch 4/10
4508/4508 [==============================] - 617s - loss: -0.1596 - dice_coef: 0.1596 - val_loss: -0.1411 - val_dice_coef: 0.1411
Epoch 5/10
4508/4508 [==============================] - 617s - loss: -0.2153 - dice_coef: 0.2153 - val_loss: -0.1667 - val_dice_coef: 0.1667
Epoch 6/10
4508/4508 [==============================] - 617s - loss: -0.2475 - dice_coef: 0.2475 - val_loss: -0.1859 - val_dice_coef: 0.1859
Epoch 7/10
4508/4508 [==============================] - 583s - loss: -0.2710 - dice_coef: 0.2710 - val_loss: -0.1939 - val_dice_coef: 0.1939
Epoch 8/10
4508/4508 [==============================] - 583s - loss: -0.2950 - dice_coef: 0.2950 - val_loss: -0.1806 - val_dice_coef: 0.1806
Epoch 9/10
4508/4508 [==============================] - 589s - loss: -0.3057 - dice_coef: 0.3057 - val_loss: -0.2013 - val_dice_coef: 0.2013
Epoch 10/10
4508/4508 [==============================] - 584s - loss: -0.3206 - dice_coef: 0.3206 - val_loss: -0.1883 - val_dice_coef: 0.1883
Out[41]:
<keras.callbacks.History at 0xa549048>
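
The Out[41] value above is a History object whose history dict records the per-epoch metrics. A minimal sketch of plotting it, assuming the fit call is assigned to a variable (e.g. history = model.fit(...)), which the original cell does not do:

In [ ]:
# Assumes the fit call above was assigned: history = model.fit(...)
plt.plot(history.history['dice_coef'], label='train dice_coef')
plt.plot(history.history['val_dice_coef'], label='val dice_coef')
plt.xlabel('epoch')
plt.ylabel('dice coefficient')
plt.legend()
plt.show()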

Test with the Training Data

As we have not trained the complete deep network, and the number of epochs used is small, we know that the test result will not be the best.

We observe that the dice_coef is increasing, i.e. the predictions are getting closer and closer to our training labels.

Plotting them and comparing them with the labels gives a picture of how well the model is trained.

In [42]:
imgs_mask_temp = model.predict(imgs_train[0:10,...],verbose=1)
10/10 [==============================] - 0s
In [46]:
plt.imshow(imgs_mask_temp[1,...,0],cmap='gray')
plt.show()
plt.imshow(imgs_train_mask[1,...,0],cmap='gray')
plt.show()
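
The network's output is a sigmoid probability map in [0, 1]. To turn it into a binary mask comparable with the ground truth, it can be thresholded; a minimal sketch (the 0.5 cutoff is an assumed choice, not from the source):

In [ ]:
# Threshold the predicted probability map at an assumed cutoff of 0.5
binary_mask = (imgs_mask_temp[1, ..., 0] > 0.5).astype(np.uint8)
plt.imshow(binary_mask, cmap='gray')
plt.show()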