Tips and Tricks for Training Neural Networks in Theano

Theano is a popular Python metaprogramming framework used for Deep Learning on either CPU or GPU. The purpose of this post is to suggest some tips you can apply if you are running into trouble while using Deep Learning on your problem.

  • Constant Validation Error – If you have just started with Theano and are applying a logistic regression model to your problem (MNIST digit recognition is not considered a problem here), you are likely to see a constant validation error while training. If that happens, you need to fix your learning rate by determining the optimal one. Start with 0.1 and keep reducing it by a factor of 10 after every epoch until you see a fall in validation error, then use that learning rate for training (a sketch of this search appears after the list). Tip: whenever you initiate training, always start with a smaller dataset, say 500-1000 samples, and try to overfit your model. Give the same dataset to training, validation and test; you should get 100% test accuracy (near-zero test error). Your network should have more nodes than inputs so that it can fit the data. If this is not happening, there is certainly a bug in your implementation.
  • Gaussian Initialization – By default, the Theano developers' tutorial code initializes weights from a random uniform distribution. Change it to a Gaussian (normal) distribution and you are likely to get improved results (see the second sketch after the list).


  • Reduce Batch Size – When you reach a point where you see no improvement in error after an appreciable number of epochs, reduce your batch size; this can improve your results. The intuition is as follows: say your initial batch size was 10, and a batch happens to contain 1 misclassified sample and 9 correctly classified ones. The cost is averaged over the batch, so the misclassified sample's contribution is reduced by a factor of 10 and may not be large enough to produce a meaningful weight update. If you set your batch size to one, you would see improvement.
  • Weight Saving – You can make all these hyperparameter adjustments only if you have incorporated a weight-saving feature into your code.
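
For the learning-rate search from the first tip, a minimal sketch could look like this; train_one_epoch and validation_error are hypothetical stand-ins for your own training and evaluation code:

```python
# Hypothetical sketch: keep dividing the learning rate by 10
# until the validation error finally starts to fall.
learning_rate = 0.1
previous_err = validation_error(model)   # error before any training
while learning_rate > 1e-6:
    train_one_epoch(model, learning_rate)
    err = validation_error(model)
    if err < previous_err:
        break                  # validation error fell: keep this rate
    previous_err = err
    learning_rate /= 10.0      # still constant: reduce by a factor of 10
```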
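
For the Gaussian initialization tip, here is a minimal sketch of swapping a uniform weight initialization for a normal one; the layer sizes and the standard deviation 0.01 are illustrative values, not prescriptions:

```python
import numpy as np
import theano

rng = np.random.RandomState(1234)
n_in, n_out = 784, 500   # illustrative layer sizes

# Uniform initialization, as in the Theano tutorial code:
W_uniform = theano.shared(np.asarray(
    rng.uniform(low=-np.sqrt(6.0 / (n_in + n_out)),
                high=np.sqrt(6.0 / (n_in + n_out)),
                size=(n_in, n_out)),
    dtype=theano.config.floatX), name='W')

# Gaussian (normal) initialization suggested above:
W_gaussian = theano.shared(np.asarray(
    rng.normal(loc=0.0, scale=0.01, size=(n_in, n_out)),
    dtype=theano.config.floatX), name='W')
```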

To save weights, you can use code along the lines of the sketches below.

Define these functions in your learning model class.

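A minimal sketch, assuming the model keeps its parameters as Theano shared variables self.W and self.b (the class and attribute names are illustrative):

```python
class LogisticRegression(object):
    # ... __init__ creates self.W and self.b as theano.shared variables ...

    def __getstate__(self):
        # Pull the current weight values out of the shared variables
        # as plain numpy arrays, which pickle can serialize.
        return (self.W.get_value(), self.b.get_value())

    def __setstate__(self, state):
        # Push saved weight values back into the shared variables
        # of an already-constructed model.
        W, b = state
        self.W.set_value(W)
        self.b.set_value(b)
```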

Here the weights are saved to myfile.pkl after every epoch:

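A sketch of that saving step, with classifier as the hypothetical model instance from above:

```python
import pickle

# At the end of every epoch, dump the current weights to disk.
with open('myfile.pkl', 'wb') as f:
    pickle.dump(classifier.__getstate__(), f)
```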

Now, the next time you start training, the program will check whether there are previously saved weights it can pick up. If there are none, it will initialize the weights randomly; otherwise it will load the saved weights with the __setstate__ function.

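A sketch of that resume logic at the start of the training script:

```python
import os
import pickle

if os.path.isfile('myfile.pkl'):
    # Previously saved weights exist: load them into the model.
    with open('myfile.pkl', 'rb') as f:
        classifier.__setstate__(pickle.load(f))
# Otherwise the model keeps the random initialization from __init__.
```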

If you follow the aforementioned tips, you can certainly get more out of Deep Learning.

Cheers!
