SOM / KOHONEN NETWORK

Image Recognition Tutorial

Introduction

This tutorial is intended to give a basic example of how to perform image / character recognition using SOM / Kohonen neural network architectures.
This tutorial uses a basic application called org.joone.samples.editor.SOMImageTester.

You can use the sample application to draw simple black-and-white images and save the output in a file format that Joone recognises.

The example presented in this tutorial shows how to set up a network that recognises the characters 'A' and 'B'. The reader can use the same technique to set up a network that recognises an arbitrary number of characters.

Sample Application Quick Guide

The sample application is fairly self-explanatory, but the guide below describes its main features.

Features

Drawing Area - The high-resolution 'A' image shown above is the area where the user draws custom images.
Image ID - This is the identity of the image; use this number to mark which character the image represents. Only numbers can be entered here, e.g. 1 could mean character 'A' and 2 could mean character 'B'.
Down Sample - This allows you to preview the down-sampled image after drawing. To obtain the down sample, the application first crops the image in the drawing area. The crop bounds are found by locating the left-most, right-most, top-most and bottom-most black pixels. See the image below.


Secondly, the cropped image is scaled down to a 9x9 image. The cropped image is divided into a 9x9 grid of cells, one cell per pixel of the down-sampled image; if a cell contains at least one black pixel, the corresponding pixel in the 9x9 down-sampled image is set to black (see the sketch after this feature list). The application automatically down-samples each image when the user saves the images to a file.
Help
This presents some basic help on the application.
New Image
Creates a new image for drawing a character/image into.
Clear Image
Removes all the black pixels from the current image.
Save Images
Allows all images to be saved to a Joone-format file for use in a File Input Synapse. The format is 81 pixel values followed by the image ID (an example row is sketched in the Data Setup section below).
Quit
Allows you to quit the application.
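
To make the cropping and down-sampling concrete, here is a minimal sketch of the idea in Java. It is not the SOMImageTester source; the class and method names are hypothetical, and it assumes the drawing is held as a simple boolean grid where true means a black pixel.

/** Minimal sketch of crop-and-down-sample, assuming a boolean[][] drawing
 *  where true = black pixel. Not the actual SOMImageTester code. */
public class DownSampler {

    /** Crop to the bounding box of the black pixels, then scale to 9x9. */
    public static boolean[][] downSample(boolean[][] image) {
        // Find the bounds of the drawn character (left/right/top/bottom-most black pixels).
        int minX = Integer.MAX_VALUE, minY = Integer.MAX_VALUE, maxX = -1, maxY = -1;
        for (int y = 0; y < image.length; y++) {
            for (int x = 0; x < image[y].length; x++) {
                if (image[y][x]) {
                    minX = Math.min(minX, x); maxX = Math.max(maxX, x);
                    minY = Math.min(minY, y); maxY = Math.max(maxY, y);
                }
            }
        }
        boolean[][] result = new boolean[9][9];
        if (maxX < 0) return result;                  // nothing drawn
        int width = maxX - minX + 1, height = maxY - minY + 1;

        // Split the cropped area into a 9x9 grid of cells; a cell containing
        // any black pixel turns the corresponding down-sampled pixel black.
        for (int y = minY; y <= maxY; y++) {
            for (int x = minX; x <= maxX; x++) {
                if (image[y][x]) {
                    int cellX = (x - minX) * 9 / width;
                    int cellY = (y - minY) * 9 / height;
                    result[cellY][cellX] = true;
                }
            }
        }
        return result;
    }
}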


Data Setup

Start the example application SOMImageTester. See the basic guide above on how to use the application.
First we need to create several 'A' character images and several 'B' character images that will be used in training and testing.
Draw four 'A' characters in the drawing panel, clicking New Image when you have finished each one. The Down Sample button can be used to see what each character looks like after down-sampling. When you have finished drawing the four 'A' characters, draw four 'B' characters. Then use the Save Images button to save them to a file; remember the file name and location. We will call this file 4As4Bs.txt in this example.

Note that the more samples of a specific character you draw, the better the network will be able to recognise that character. You will also have noticed that each image gets cropped and down-sampled; this stops the network from simply learning the character's size.

We now need a couple of test characters. Close and re-open the application, draw one 'A' character and save it; we will call this file testA.txt. Close and restart the application again, this time draw a 'B' character and save the file; we will call this testB.txt.
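
For reference, each saved image becomes one row of 82 values: the 81 pixel values (1.0 = black, 0.0 = white) in row-major order, followed by the image ID. The snippet below is a hedged sketch of how such a row could be produced from a 9x9 grid, assuming the semicolon-delimited format shown later in this tutorial; the ImageRowWriter class is hypothetical and not part of the sample application.

import java.io.PrintWriter;

/** Hypothetical helper: write one training row for a 9x9 down-sampled image.
 *  81 pixel values in row-major order, then the image ID as the last column. */
public class ImageRowWriter {

    public static void writeRow(PrintWriter out, boolean[][] pixels9x9, int imageId) {
        StringBuilder row = new StringBuilder();
        for (int y = 0; y < 9; y++) {
            for (int x = 0; x < 9; x++) {
                row.append(pixels9x9[y][x] ? "1.0" : "0.0").append(';');
            }
        }
        row.append(imageId);    // the image ID occupies column 82
        out.println(row);
    }
}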

Neural Network Setup

For the neural network we will be using SOM components, so the network will be trained unsupervised. The previously produced file is fed into a Linear Layer of 81 inputs, which in turn feeds a Winner Takes All layer via a Kohonen Synapse. A File Input Synapse is used to load the file. See the image below ...



Note that the Winner Takes All layer has two neurons; this ensures it can classify our two characters.

Input Layer Properties


Note that our input images have 81 inputs, i.e. the 9x9 down-sampled images the application produced earlier.

Linear Layer Properties


Note that the rows here must match the number of inputs from the File Input Synapse, i.e. 81.

Winner Takes All Layer Properties


Note that the height and width should be 2 and 1; either can be 2 but not both. This ensures the layer contains two neurons for our two-character classification.

Control Properties


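The control panel screenshot is not reproduced here; based on the training section below, the key control settings are 8 training patterns (the eight saved characters), 10000 epochs and learning enabled. If you prefer to see the whole setup in code rather than in the GUI editor, the sketch below shows roughly how it might look using the Joone engine classes (LinearLayer, KohonenSynapse, WTALayer, FileInputSynapse, NeuralNet, Monitor). Treat it as an assumption-laden outline rather than a verified listing: exact packages, property setters and the mapping of the editor's width/height properties may differ between Joone versions.

import org.joone.engine.KohonenSynapse;
import org.joone.engine.LinearLayer;
import org.joone.engine.Monitor;
import org.joone.engine.WTALayer;
import org.joone.io.FileInputSynapse;
import org.joone.net.NeuralNet;

import java.io.File;

public class SomCharacterNet {

    public static NeuralNet build() {
        // Input layer: 81 rows, one per pixel of the 9x9 down-sampled image.
        LinearLayer input = new LinearLayer();
        input.setRows(81);

        // Winner Takes All output layer: 2 neurons, one per character class.
        WTALayer output = new WTALayer();
        output.setRows(2);                   // assumed equivalent of width=2, height=1 in the editor

        // Kohonen synapse connects the linear layer to the WTA layer.
        KohonenSynapse kohonen = new KohonenSynapse();
        input.addOutputSynapse(kohonen);
        output.addInputSynapse(kohonen);

        // File input synapse reads the 81 pixel columns from the training file.
        FileInputSynapse fileIn = new FileInputSynapse();
        fileIn.setInputFile(new File("4As4Bs.txt"));   // the file saved earlier
        fileIn.setAdvancedColumnSelector("1-81");      // use only the pixel columns, not the ID
        input.addInputSynapse(fileIn);

        NeuralNet net = new NeuralNet();
        net.addLayer(input, NeuralNet.INPUT_LAYER);
        net.addLayer(output, NeuralNet.OUTPUT_LAYER);

        // Control properties: 8 training patterns (4 'A's + 4 'B's), 10000 epochs, learning on.
        Monitor monitor = net.getMonitor();
        monitor.setTrainingPatterns(8);
        monitor.setTotCicles(10000);         // Joone spells this property 'TotCicles'
        monitor.setLearning(true);
        return net;
    }

    public static void main(String[] args) {
        build().go();                        // start training
    }
}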

Training The Network

Ensure the network has been set up as in the previous section, then run the network. When it has finished 10000 epochs it should have learned how to recognise the characters 'A' and 'B'.

We need to find out which neuron fires on an 'A' character and which one fires on a 'B'.

To do this, attach a File Output Synapse to the Winner Takes All layer. In the File Output Synapse set the file name to something like test.txt, and in the control panel set the number of epochs to 1 and the learning property to false.
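
Continuing the earlier code sketch, the same step might look roughly like this; again, class and method names such as FileOutputSynapse, setFileName, setTotCicles and setLearning are assumptions about the Joone API rather than verified calls.

import org.joone.engine.Monitor;
import org.joone.engine.WTALayer;
import org.joone.io.FileOutputSynapse;
import org.joone.net.NeuralNet;

/** Sketch (assumed Joone API): attach a file output synapse to the WTA layer
 *  and run one epoch with learning disabled. */
public class RecallRun {

    public static void runOnce(NeuralNet net, WTALayer output) {
        FileOutputSynapse fileOut = new FileOutputSynapse();
        fileOut.setFileName("test.txt");   // one row per input pattern, one column per neuron
        output.addOutputSynapse(fileOut);

        Monitor monitor = net.getMonitor();
        monitor.setTotCicles(1);           // a single epoch over the 8 patterns
        monitor.setLearning(false);        // recall only - no weight updates
        net.go();
    }
}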

Run the network again and examine the test.txt file; you should see 8 rows and two columns. Each column represents a neuron and each row one of the 8 input characters. We know that the first four characters were 'A' and the last four were 'B'. Check that test.txt contains 1.0 in the same column for the first four rows and then 1.0 in the other column for the last four rows. On our network it came out like this ...

0.0;1.0
0.0;1.0
0.0;1.0
0.0;1.0
1.0;0.0
1.0;0.0
1.0;0.0
1.0;0.0

So, looking at the first four rows, we now know that neuron 2 fires for character 'A' and neuron 1 fires for character 'B'. It could be the other way round for you.
If at this point the result is not clear-cut, e.g. neuron 2 fires for both 'A' and 'B' characters, then you might not have set up the network correctly, or it may need more training.
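
In your own code, the same check amounts to finding which output neuron produced the 1.0 and mapping it to a character. A minimal sketch is shown below; the neuron-to-character mapping matches the example run above, and your own training may produce the opposite assignment.

public class WinnerMapping {

    /** Map the two WTA outputs to a character. Neuron 2 -> 'A', neuron 1 -> 'B'
     *  here, matching the run above; swap the mapping if yours came out reversed. */
    public static char classify(double[] outputs) {
        int winner = (outputs[1] >= outputs[0]) ? 2 : 1;   // 1-based neuron index
        return (winner == 2) ? 'A' : 'B';
    }
}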

Testing The Network

To test the network, modify the file name in the File Input Synapse, selecting testA.txt in order to test a character 'A'.

We have only one character in this file, so in the control panel set the validation patterns to 1 and the validation mode to true. Run the network again and examine the test.txt file to check whether the correct neuron fired. In our case it was correct ...

0.0;1.0

Neuron 2 fired, indicating that the network recognised the character as an 'A', which is correct.
You can do the same for the testB.txt file.

Using The Network

It is possible to use this network in your own application, but your custom application must present the 81 inputs in row-major order: the 9 pixels of row 1 first, then the 9 pixels of row 2, and so on up to row 9. Direct input from memory requires a Memory Input Synapse (see the sketch below).
An 'on' (black) pixel is represented as 1.0 and an 'off' (white) pixel as 0.0.
The network obviously cannot handle colour, just black (on) and white (off).
Your application will also have to crop the image and down-sample it to the correct 9x9 size.
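
As a rough illustration of feeding the network from memory, the sketch below assumes Joone's MemoryInputSynapse with a setInputArray(double[][]) property; as with the earlier sketches, treat the exact class and method names as assumptions that may vary between Joone versions.

import org.joone.io.MemoryInputSynapse;

public class MemoryInputExample {

    /** Build a memory input synapse from one 9x9 down-sampled image.
     *  Pixels are flattened in row-major order: row 1 first, then row 2, etc.
     *  1.0 = black (on), 0.0 = white (off). Assumed Joone API. */
    public static MemoryInputSynapse fromImage(boolean[][] pixels9x9) {
        double[][] pattern = new double[1][81];
        for (int y = 0; y < 9; y++) {
            for (int x = 0; x < 9; x++) {
                pattern[0][y * 9 + x] = pixels9x9[y][x] ? 1.0 : 0.0;
            }
        }
        MemoryInputSynapse memIn = new MemoryInputSynapse();
        memIn.setInputArray(pattern);
        memIn.setAdvancedColumnSelector("1-81");   // present all 81 pixel columns
        return memIn;
    }
}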

Further Work

Image recognition is a fascinating field and you'll probably want to experiment with recognising different images / objects. At the time of writing, the Joone project is looking at producing an Image Input Synapse that will enable users to present images from files or Java images. If this is available then you could use it to easily load images into the network for training and running.
If this is not available then you will have to write some image pre-processing in your custom application.

Something worth thinking about when looking at image recognition is attributes like colour, size, shape, texture, etc. An extension to this example might be to enable the net to recognise coloured characters independently of the actual colour. If you always present 'A' in green and 'B' in blue during training, the network may simply learn to recognise the colours green and blue; when you then present a green 'B' it will not be recognised in the way you intended. To avoid this you should present each of 'A' and 'B' in a variety of colours during training.

In the classic 'tank hiding in the jungle' example, a research team wanted to train a network to spot tanks hidden in a jungle. They went out and took pictures of tanks hiding in a jungle and pictures with no tanks. They trained the network, and when they tested it the network worked very well. However, to verify the network they went out and took more pictures and tested it again. This time it failed miserably. Why? For the training images the researchers had taken the pictures containing tanks on a sunny day and the pictures without tanks on an overcast, rainy day. The network had simply learned to recognise whether it was sunny or cloudy.