If you have been using machine learning models for a while, you have probably noticed that some perform well right out of the box while others don’t. That doesn’t mean those models are bad – they just need a bit of hyperparameter tuning.
While the hyperparameter tuning process is not that complex, it’s undoubtedly a long one, considering the time it takes for models to be trained with different combinations of hyperparameters. So, it’s crucial we automate the process to reduce the human effort required as much as possible.
So, today, we will go through the top three ways to tune the hyperparameters of machine learning models, from the most manual and intuitive approach to the quickest one that requires minimal human intervention.
So, let’s start without any further ado.
Note: Here is the notebook if you want to jump straight to the code.
We won’t be building a very complex model here since the article focuses on tuning hyperparameters rather than building huge models. So, we will use a relatively simple but practical dataset of diabetic patients to train this model. Let’s import pandas and load the dataset to get started.
Let’s take a quick look at the dataset:
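A minimal sketch of the loading and inspection step. The article’s exact file isn’t shown, so this assumes a Pima-style diabetes CSV with an `Outcome` label column; a small synthetic stand-in is generated here so the snippet runs on its own – swap in `pd.read_csv("diabetes.csv")` for your real data.

```python
import numpy as np
import pandas as pd

# Assumption: the real data is a CSV ("diabetes.csv") whose binary label
# column is "Outcome", as in the common Pima diabetes dataset. We build a
# small synthetic stand-in so the snippet is self-contained.
rng = np.random.default_rng(0)
n = 500
df = pd.DataFrame({
    "Glucose": rng.integers(70, 200, n),
    "BMI": np.round(rng.uniform(18.0, 45.0, n), 1),
    "Age": rng.integers(21, 70, n),
})
df["Outcome"] = (df["Glucose"] + 2 * df["BMI"] > 190).astype(int)

print(df.head())   # first five rows
print(df.shape)    # (rows, columns)
```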
Now, let’s split the dataset into testing and training datasets using the scikit-learn library.
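The split can be done with `train_test_split`. A synthetic feature matrix stands in for the diabetes data here (an assumption, so the example runs standalone); the 80/20 ratio is also an assumption, since the article doesn’t state one.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the diabetes features (X) and labels (y).
X, y = make_classification(n_samples=500, n_features=8, random_state=42)

# Hold out 20% of the rows for testing, stratified on the label so both
# splits keep the same class balance.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
print(X_train.shape, X_test.shape)
```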
The datasets are ready to be fed to the model. Let’s define a simple decision tree classifier to build a baseline model we can compare to the improved versions with tweaked hyperparameters later. This model will be made with the default hyperparameters, so we’re calling it the baseline model.
Here’s how we can train a simple decision tree model and then find out its accuracy, along with the confusion matrix.
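A sketch of that baseline step, again on a synthetic stand-in dataset (assumption): fit a `DecisionTreeClassifier` with all-default hyperparameters, then score it and print the confusion matrix.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the diabetes data (assumption).
X, y = make_classification(n_samples=500, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Baseline: default hyperparameters, fixed seed for reproducibility.
baseline = DecisionTreeClassifier(random_state=42)
baseline.fit(X_train, y_train)

y_pred = baseline.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))
```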
Here are the accuracy and the confusion matrix for the trained model.
As you can see, we have achieved an accuracy of around 77% on the test dataset without playing around with the hyperparameters. We shall keep this figure in mind when tweaking the hyperparameters in the next steps, so we have something to compare the improved models to.
Method 1: Manual Hyperparameter Tuning
The first and easiest approach to tuning hyperparameters requires no external machinery at all. It’s a little tedious, since we do everything by hand, but it’s straightforward.
Before we dive in, we need to decide which decision-tree hyperparameters to play with. Decision trees have many hyperparameters you can tune to alter the model’s performance; we will focus on the three most commonly used ones, listed below:
· splitter – the strategy used to choose the split at each node (best or random)
· criterion – the function that measures the quality of a split at each node (gini or entropy)
· max_depth – the maximum depth of the decision tree
We’ll define the hyperparameters in the form of a dictionary, then use it to train different models. Let’s see how that can be achieved:
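One way to sketch this: keep a handful of hand-picked combinations as dictionaries, train a tree with each, and print the resulting accuracy. The specific values below are illustrative guesses, not the article’s exact choices, and the data is a synthetic stand-in.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the diabetes data (assumption).
X, y = make_classification(n_samples=500, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Hand-picked hyperparameter combinations (illustrative guesses).
combos = [
    {"splitter": "best", "criterion": "entropy", "max_depth": 3},
    {"splitter": "random", "criterion": "gini", "max_depth": 5},
    {"splitter": "best", "criterion": "gini", "max_depth": 10},
]

for params in combos:
    model = DecisionTreeClassifier(random_state=42, **params)
    model.fit(X_train, y_train)
    acc = accuracy_score(y_test, model.predict(X_test))
    print(params, "->", round(acc, 3))
```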
Here’s how the models performed:
We have already beaten the baseline model just by guessing a few hyperparameter values. However, that’s usually not the case, especially with industry-grade models. We need a way to easily train models over hundreds of combinations, and this approach clearly isn’t it.
Method 2: Loop-based Hyperparameter Tuning
To improve on the previous approach where we manually defined the hyperparameters inside a dictionary, we will use a loop to try out different combinations of the hyperparameters. We can use nested lists to achieve this and try out different combinations automatically. Here’s how that can be done:
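A minimal sketch of the loop-based version, on the same synthetic stand-in data (assumption): three nested loops cover every combination, and the results land in a dataframe sorted by accuracy. The candidate value lists are illustrative.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the diabetes data (assumption).
X, y = make_classification(n_samples=500, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

results = []
for splitter in ["best", "random"]:
    for criterion in ["gini", "entropy"]:
        for max_depth in [3, 5, 7, 10]:
            model = DecisionTreeClassifier(
                splitter=splitter, criterion=criterion,
                max_depth=max_depth, random_state=42,
            )
            model.fit(X_train, y_train)
            acc = accuracy_score(y_test, model.predict(X_test))
            results.append({
                "splitter": splitter, "criterion": criterion,
                "max_depth": max_depth, "accuracy": acc,
            })

# Collect all 2 x 2 x 4 = 16 runs, best accuracy first.
results_df = pd.DataFrame(results).sort_values("accuracy", ascending=False)
print(results_df.head())
```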
Once the code snippet is run, the results will be stored in the dataframe, sorted based on the accuracy values, so we can easily view the best performances. Let’s view the dataframe to see which combinations performed best.
This approach is clearly a big step up from the previous one, but we had to resort to the cursed nested loops. Imagine what would happen with more than three hyperparameters to try – say, ten. Ten levels of nesting would quickly become unmanageable, right?
Method 3: Hyperparameter Tuning with GridSearch
While the previous approach worked well, it hits a wall as the number of hyperparameters grows, since every new hyperparameter adds another level of nesting. To improve on it, we will use the GridSearchCV class that ships with scikit-learn.
Not only does GridSearchCV search for the best hyperparameters automatically, it also cross-validates each candidate and can evaluate them in parallel, so it stays fast even with many hyperparameters. Hence, it’s certainly the best of the approaches we’ve seen so far.
To use GridSearchCV, we first declare a dictionary of the parameters to tune, with hyperparameter names as keys and lists of candidate values as values. After that, we just create a GridSearchCV instance and call its fit() method.
Here’s how it can be done:
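A sketch of that step, on the same synthetic stand-in data (assumption). The grid mirrors the three hyperparameters above; `cv=5` and `n_jobs=-1` are illustrative defaults, not values stated in the article.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the diabetes data (assumption).
X, y = make_classification(n_samples=500, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Keys are hyperparameter names; values are the candidates to try.
param_grid = {
    "splitter": ["best", "random"],
    "criterion": ["gini", "entropy"],
    "max_depth": [3, 5, 7, 10],
}

grid = GridSearchCV(
    DecisionTreeClassifier(random_state=42),
    param_grid,
    cv=5,                 # 5-fold cross-validation on the training set
    scoring="accuracy",
    n_jobs=-1,            # evaluate candidates in parallel
)
grid.fit(X_train, y_train)
print(grid.best_params_)
```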
Then, we can store the results in a dataframe like we did in the previous approach and view the results to find the best hyperparameters.
You can see that GridSearchCV also records a lot of other information we might be interested in. Right now, though, we only care about the mean test scores. So, let’s filter the results, sort them by mean_test_score, and view the cleaner dataframe again.
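One way to sketch the filtering step: turn `cv_results_` into a dataframe, keep only the `param_*` columns plus `mean_test_score`, and sort. The data and grid settings are the same illustrative stand-ins as above.

```python
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the diabetes data (assumption).
X, y = make_classification(n_samples=500, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

param_grid = {
    "splitter": ["best", "random"],
    "criterion": ["gini", "entropy"],
    "max_depth": [3, 5, 7, 10],
}
grid = GridSearchCV(DecisionTreeClassifier(random_state=42), param_grid, cv=5)
grid.fit(X_train, y_train)

# cv_results_ holds timings, per-fold scores, ranks, etc.; keep only the
# parameter columns and the mean test score.
results_df = pd.DataFrame(grid.cv_results_)
cols = ["param_splitter", "param_criterion", "param_max_depth", "mean_test_score"]
clean = results_df[cols].sort_values("mean_test_score", ascending=False)
print(clean.head())
```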
If you just want to find the best hyperparameters and don’t want to see the whole dataframe, here’s how you can do that in a simple line:
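After fitting, the winning combination and its score live on the `best_params_` and `best_score_` attributes; the scaffolding below just rebuilds the fitted grid from the earlier stand-in setup so the snippet runs on its own.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the diabetes data (assumption).
X, y = make_classification(n_samples=500, n_features=8, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
grid = GridSearchCV(
    DecisionTreeClassifier(random_state=42),
    {"splitter": ["best", "random"],
     "criterion": ["gini", "entropy"],
     "max_depth": [3, 5, 7, 10]},
    cv=5,
)
grid.fit(X_train, y_train)

# The best combination and its mean cross-validated accuracy:
print(grid.best_params_, grid.best_score_)
```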
Hyperparameter tuning is a vital topic in machine learning. If you’re an aspiring data scientist, it’s a must-have skill if you want to build ML models that perform above the ordinary.
Throughout this article, we have gone through three primary techniques for finding the best hyperparameters for a given model, starting from the simplest and most manual and working up to the quickest and most automated.
While the manual approaches are great for building intuition, you probably won’t use them in practice, since there are simply too many combinations to try by hand. So, I recommend GridSearchCV as your go-to method whenever you need to tune a model’s hyperparameters.