How to Make a State of the Art Model with Fastai
Image classification with Learning Rate Finder and Progressive Resizing approach for getting the best results in a short amount of time.

When I first started my journey with fastai, I was very excited to build and train a deep learning model that could give amazing results in a short span of time. Now that we’ve seen some of the under-the-hood training jargon that fastai uses, along with some essential PyTorch functions, it’s time to see what a little more effort into building the model yields.
I’ll be linking my previous articles in which I document my learning with fastai at the end of this article. :)
Getting the data
We will need this data to get started: the Rock, Paper, Scissors dataset from Kaggle. This is a multi-class image classification problem with three classes, split across train, valid, and test folders, containing RGB colour images of size 300x300.
Import all things fastai vision and set the path variable.
from fastai.vision.all import *
path = Path('/storage/RockPaperScissors/RockPaperScissors/data/')
path.ls()
Output:
(#4) [Path('/storage/RockPaperScissors/RockPaperScissors/data/test2'), Path('/storage/RockPaperScissors/RockPaperScissors/data/valid'), Path('/storage/RockPaperScissors/RockPaperScissors/data/.DS_Store'), Path('/storage/RockPaperScissors/RockPaperScissors/data/train')]
# make sure to set that path to wherever you've kept your data either locally or online.
Now we shall define a DataBlock for getting the data from the folders. We specify the following to make sure our data is available to the model while writing minimal code:
How to get the image files with the get_image_files function — this simply collects all the image files in our train and valid folders
Getting the classes with parent_label, which makes sure we get the immediate parent folder name as our class name, and finally,
Getting the train and validation split with GrandparentSplitter, which gives us separate datasets for training and validation based on the folders one level up in the hierarchy: our train and valid folders.
def get_dls(bs, size):
    dblock = DataBlock(blocks = (ImageBlock, CategoryBlock),
                       get_items = get_image_files,
                       get_y = parent_label,
                       splitter = GrandparentSplitter(),
                       item_tfms = Resize(size))
    return dblock.dataloaders(path, bs = bs)
This will return a DataLoaders object that gives us batches of size bs with images resized to size.
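As a quick sanity check (a minimal sketch; show_batch and vocab are standard fastai DataLoaders attributes), we can peek at a batch to confirm the resizing and folder-derived labels look right:
dls = get_dls(64, 128)
dls.show_batch(max_n=9)  # grid of images with their rock/paper/scissors labels
print(dls.vocab)         # the class names, derived from the parent folder names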
What is Progressive Resizing and how do we apply it?
As Jeremy Howard says in his book: start training using small images, and end training using large images. Spending most of the epochs training with small images helps training complete much faster, and completing training with large images makes the final accuracy much higher.
This experimental technique has proven extremely useful for reaching much higher accuracies than training with the same image size throughout.
Let’s now see how we can train at multiple sizes, shall we?
We’ll start with a batch size of 64 and a smaller image size of 128x128.
dls = get_dls(64, 128)
Now let’s go ahead and calculate what learning rate we should use for this part of the training.
Finding a suitable learning rate
First, we build a model using transfer learning with the following line.
learn = cnn_learner(dls, resnet34, metrics=accuracy)
Then, we plot a graph to see about finding the learning rate.
learn.lr_find()
The output is a plot visualising how our loss behaves at each candidate value of the learning rate.
It looks like a learning rate of around 1e-3 will be enough to make sure our losses decrease with training. We’ll choose that.
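If you would rather read the suggestion programmatically than eyeball the plot, lr_find also returns a suggestion object (a sketch; the attribute names it exposes, such as valley or lr_min, vary between fastai versions):
suggestion = learn.lr_find()
print(suggestion)  # e.g. SuggestedLRs(...) — the exact fields depend on your fastai version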
learn = cnn_learner(dls, resnet34, metrics=accuracy)
learn.fit_one_cycle(10, 1e-3)
We see quite remarkable results within the first few epochs.
Note: I trained the model on a GPU, which is why each epoch takes mere seconds.
If you train on a CPU only, it’ll take much longer, sometimes around 10 minutes per epoch.
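If you are unsure which device you are training on, a quick check with plain PyTorch (which fastai uses under the hood) settles it:
import torch
print(torch.cuda.is_available())  # True means fastai will place the model on the GPU by default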
Now that we have the model trained on the smaller image sizes, we can proceed to the second part of the training.
We use a batch size of 128 and an image size of 224 for the next stage of fine-tuning the model.
learn.dls = get_dls(128, 224)
learn.fine_tune(5, 1e-3)
As you can see, this yields an accuracy of almost 95%, and it only takes about three minutes to train on the GPU!
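If you want to run the model over the dataset’s held-out test images, something like the following sketch works (assuming the test2 folder from path.ls() above holds the test images; test_dl, get_preds, and export are standard fastai Learner methods, and the filename is just an example):
test_files = get_image_files(path/'test2')
test_dl = learn.dls.test_dl(test_files)
preds, _ = learn.get_preds(dl=test_dl)  # predicted class probabilities for each test image
learn.export('rps_model.pkl')           # save the trained model for later inference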
Concluding…
Fastai enables rapid development of any deep learning task, and as I’ve experimented with it over the previous weeks, I’ve found myself falling more and more in love with its super simple approach. If you’re keen to follow along on this journey with me, make sure to follow me for continued updates as I explore more deep learning tasks with this amazing library.
As I promised earlier, here are the other articles I’ve written for fastai. Happy coding! 😁
A Fast Introduction to fastai: A Fast Introduction to FastAI — My Experience
Pixel similarity approach — under the hood part 1: Fastai — Exploring the Training Process — the Pixel Similarity approach
Stochastic Gradient Descent and training from scratch — under the hood part 2: Fastai — Multi-class Classification with Stochastic Gradient Descent from Scratch
Also, here is the GitHub repo link with all of the code:
yashprakash13/RockPaperScissorsFastAI