Decision trees are a common data-mining technique for predicting a target value from several input features. Prediction works by testing an input sample against a series of rules: starting at the root node of the tree, we ask a sequence of questions about the features. Interior nodes are labeled with questions, the branches between them with the possible answers, and each terminal (leaf) node represents an output class. Following the answers for a given sample's attributes eventually lands you in a particular leaf, which gives the prediction.
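The traversal described above can be sketched with a tiny hand-rolled tree. The weather features, thresholds, and class labels here are illustrative assumptions, not from any real dataset:

```python
# Minimal decision-tree prediction: interior nodes hold a question
# (feature + threshold), leaves hold the output class.

def predict(node, sample):
    """Walk from the root, answering one question per interior node,
    until a leaf (the predicted output) is reached."""
    while isinstance(node, dict):          # interior node: ask its question
        branch = "left" if sample[node["feature"]] <= node["threshold"] else "right"
        node = node[branch]                # follow the edge for that answer
    return node                            # leaf: the target value

# Example tree: first ask about humidity, then about temperature.
tree = {
    "feature": "humidity", "threshold": 70,
    "left": "play",                        # humidity <= 70 -> play
    "right": {
        "feature": "temp", "threshold": 25,
        "left": "play",                    # humid but mild -> play
        "right": "stay_in",                # humid and hot -> stay in
    },
}

print(predict(tree, {"humidity": 60, "temp": 30}))  # -> play
print(predict(tree, {"humidity": 80, "temp": 30}))  # -> stay_in
```

A learned tree works the same way; training algorithms such as CART simply choose which question to place at each node.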
Theano is a popular Python metaprogramming framework used for Deep Learning on top of either a CPU or a GPU. The purpose of this blog is to offer some tips you can apply if you are running into trouble while doing Deep Learning on your problem.
- Constant Validation Error – If you have just started with Theano and are applying a logistic regression model to your problem (MNIST digit recognition is not considered a problem here), you are likely to see a constant validation error while training. If that happens, you need to fix your learning rate by determining the optimal one: start with 0.1 and keep reducing it by a factor of 10, retraining each time, until you see the validation error fall, then use that learning rate for training. Tip: whenever you start training, first take a small dataset, say 500-1000 samples, and try to overfit your model. Use the same dataset for training, validation and test; you should reach roughly 100% accuracy, i.e. near-zero test error. Your network should have more nodes than inputs so that it can fit the data. If this does not happen, there is almost certainly a bug in your implementation.
- Gaussian Initialization – By default the Theano developers set the weight initialization to a random uniform distribution. Change it to a Gaussian (normal) distribution and you are likely to get improved results.
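The first tip can be sketched framework-agnostically (plain NumPy rather than Theano): overfit a small synthetic dataset with logistic regression while sweeping the learning rate downward from 0.1 by factors of 10. The dataset size, epoch count, and 5% error threshold are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 20))                     # 500 samples, 20 features
y = (X @ rng.normal(size=20) > 0).astype(float)    # linearly separable labels

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

def train(lr, epochs=100):
    """Gradient descent on the logistic loss; returns the training error."""
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        p = sigmoid(X @ w)
        w -= lr * X.T @ (p - y) / len(y)           # average cross-entropy gradient
    return np.mean((sigmoid(X @ w) > 0.5) != y)

# Start at 0.1 and divide by 10, retraining each time, until the model
# fits the small set almost perfectly (the overfitting sanity check).
for lr in (0.1, 0.01, 0.001):
    err = train(lr)
    print(f"lr={lr}: training error {err:.1%}")
    if err < 0.05:
        break
```

If no learning rate in the sweep drives the training error near zero on a set this small, that points to an implementation bug rather than a tuning problem, which is exactly what the sanity check is for.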
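The Gaussian initialization tip amounts to swapping the sampling distribution while keeping the weight scale comparable. The layer sizes and the 1/sqrt(fan_in) scaling below are common conventions used for illustration, not Theano's exact tutorial defaults:

```python
import numpy as np

rng = np.random.default_rng(42)
fan_in, fan_out = 784, 256                 # e.g. MNIST input -> hidden layer
bound = 1.0 / np.sqrt(fan_in)

# Uniform initialization on [-bound, bound] (the default being replaced)
W_uniform = rng.uniform(-bound, bound, size=(fan_in, fan_out))

# Gaussian (normal) initialization; std bound/sqrt(3) matches the
# uniform's standard deviation, so activation scales stay comparable
W_normal = rng.normal(0.0, bound / np.sqrt(3), size=(fan_in, fan_out))

print(f"uniform std={W_uniform.std():.4f}, normal std={W_normal.std():.4f}")
```

Matching the standard deviations means the change isolates the shape of the distribution, so any improvement comes from the Gaussian itself rather than from a different weight scale.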
Neural Networks have become popular due to their ability to approximate almost any function through feature learning, provided enough data is available. Features are the information you give to the network: the larger the feature set, the more information you provide. They are primarily used to solve classification problems, but research continues into making them work equally well for regression problems.
Right now there is a big hype about Machine Learning and Big Data all around the tech world. This is not surprising, as they have played a significant role in automation, business advancement, and prediction. Alongside them, Deep Learning has also become a popular term in recent times. One interesting fact about deep learning is that it was largely abandoned in the late 1980s, until Geoffrey Hinton's work on deep belief networks in 2006 reignited research in the field.