Currywurst, or not currywurst?

Using transfer learning with fastai for image classification

Kerem Dede
7 min read · Oct 16, 2023

You are wandering through the streets of Berlin and come across a currywurst Imbiss. Tired from walking from Potsdamer Platz to Brandenburger Tor, you decide to eat a currywurst for the first time in your life. But how do you know whether you are getting a real currywurst!? Let me show you how you can build a model that tells you whether you are about to eat a real currywurst or not.

Data collection

The first part of this project was data collection. I used the duckduckgo_search library to fetch a couple hundred images. An alternative would have been a cloud provider: I tried to set it up with the Azure/Bing image search API, but I couldn't get it working in a reasonable time, so I switched to the not only free but also faster alternative, DuckDuckGo.

Here is the download script using DuckDuckGo search:

from duckduckgo_search import DDGS
import requests
import os

if __name__ == '__main__':
    image_search_keyword = 'space travel'
    folder_for_saving_image_search_results = './data/space_travel/'
    max_results = 150

    with DDGS() as ddgs:
        keywords = image_search_keyword
        ddgs_images_gen = ddgs.images(
            keywords,
            safesearch="off",
            size='Large',
            type_image='photo',
            max_results=max_results,
        )

        for ddgs_image in ddgs_images_gen:
            try:
                # The URL of the image to download
                image_url = ddgs_image['image']

                # Ensure the target folder exists, create it if it doesn't
                os.makedirs(folder_for_saving_image_search_results, exist_ok=True)

                # Build the local file name from the URL, truncated to a safe length
                image_filename = os.path.join(folder_for_saving_image_search_results, os.path.basename(image_url))[:100]

                # Send an HTTP GET request to fetch the image
                response = requests.get(image_url)

                # Check if the request was successful (status code 200)
                if response.status_code == 200:
                    # Open the file in binary write mode and save the image content
                    with open(image_filename, "wb") as file:
                        file.write(response.content)
                    print(f"Image downloaded and saved as {image_filename}")
                else:
                    print(f"Failed to download image. Status code: {response.status_code}")
            except Exception as e:
                print(f"An exception occurred: {e}")

After collecting some data, I put the images into two subfolders, currywurst and not_currywurst, inside the data folder. The names of those subfolders also correspond to the labels / classes.

Folder structure for data

Because what we are doing here is binary classification, you should prepare data not only for the positive case (currywurst) but also for the negative case (not_currywurst). I didn't know that at first, so the resulting model classified everything as currywurst. I expected the model to say that it doesn't recognize any currywurst in a given image and therefore return not_currywurst. But apparently, that's not how it works.
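For reference, the layout I ended up with looks roughly like this (the individual file names are just illustrative):

data/
├── currywurst/
│   ├── currywurst_001.jpg
│   └── ...
└── not_currywurst/
    ├── adana_kebap_001.jpg
    └── ...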

Setting up the loader and learner

In your Jupyter notebook, import the fastai library.

from fastai.vision.all import *

Using transfer learning in fastai is simple. I expected transfer learning to be complicated, but apparently, it does not have to be. At least, not always.

There are two core concepts and a couple of nice utility functions for fine-tuning a model in fastai.

Loader

The first core concept is the DataLoaders object. You tell it where to load the data from and which transformations to apply to that data.

Because we are handling image data here, we will use ImageDataLoaders, which is just a wrapper around DataLoaders with additional image-processing capabilities.

dls = ImageDataLoaders.from_folder(Path('./data/'), valid_pct=0.2, item_tfms=Resize((224, 224)), batch_tfms=aug_transforms())

One thing to note about this loader is that it expects to find train and valid subfolders. If you have all your training and validation data in a single folder, like me, then you need to pass valid_pct as well. Otherwise, the loader won't know how to split your data into a training and a validation set. In that case, you'll get the following error:

TypeError: 'NoneType' object is not iterable

You also see two other parameters that we passed into the loader: item_tfms and batch_tfms. The first defines the transformations applied to each item before the batches are formed; the latter applies transformations to whole batches.

Before we feed this data into our model, it makes sense to validate that the loader loads the data as we expect. For that, there is a nice little function called show_batch, which displays a batch of images with their labels.
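Calling it on the loader is a one-liner; max_n just limits how many images are drawn:

# Preview a handful of (already transformed) training images with their labels
dls.show_batch(max_n=9)

A sample output looks like this: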

A sample output of show_batch function

Learner

The second core concept is the so-called Learner. A learner requires a data loader, a pre-trained model and, optionally, a set of metrics that you care about.

learner = vision_learner(dls, resnet18, metrics=[accuracy])

The learner now has pretrained weights and knows where to find the data through the loader. All you need to do now is fine-tune it so that it adjusts those weights for your particular objective and dataset. You do that by calling the fine_tune method.

learner.fine_tune(5)

The output looks something like this:

Fine tuning the pre-trained model for currywurst classification

I've been in software for years, but I still find it astonishing what you can achieve with a couple of lines of code. We had about 70% accuracy in our first epoch. Cool. In 5 epochs, we got it up to 95%.
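By the way, fine_tune is roughly what makes this transfer learning: it first trains only the newly added classification head while the pretrained body stays frozen, then unfreezes everything and trains the whole network. A simplified sketch of what fine_tune(5) does internally (ignoring fastai's learning-rate defaults) would be:

# Simplified equivalent of learner.fine_tune(5)
learner.freeze()           # keep the pretrained body fixed
learner.fit_one_cycle(1)   # train only the new head for one epoch
learner.unfreeze()         # then train the whole network
learner.fit_one_cycle(5)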

Inspecting the results

Looking at a batch

Because the fastai library is so high-level, you have access to some neat functionality that you otherwise would not bother to implement yourself. At least, not when you are just playing around.

Learner has a function called show_results, which displays a batch of images with the actual and predicted labels. It's nice to have visual feedback on what's going on.
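The call mirrors show_batch; again, max_n is just a display limit:

# Show actual vs. predicted labels for a few validation images
learner.show_results(max_n=9)

A sample result looks like this: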

A sample result for show_results method of learner

In this result, you can see where the misclassifications are and get a feeling for how capable your model has become. For example, if the model classifies a human as a currywurst, that is not good. But if the few exceptions only misclassify Adana kebap as currywurst, well, that can be a rather hard distinction.

Sample Adana kebap. It does resemble currywurst.

Warning: Do not search for kebap images right before lunchtime.

Using a confusion matrix

In fastai, there is a class called ClassificationInterpretation with a factory method that takes in a learner. Once you have an interpreter for your learner, you can plot a confusion matrix to get an overview of how well the model performs on the whole validation set instead of only one batch. The code looks like this:

interpreter = ClassificationInterpretation.from_learner(learner)
interpreter.plot_confusion_matrix()

For binary classification, this draws a plot with four quadrants, where you see your true positives, true negatives, false positives and false negatives.

Confusion matrix for currywurst classifier that has been fine-tuned from a pretrained resnet model

Top losses

Top losses is another feature that shows you which pictures your model has the most difficulty with. This is really helpful, because sometimes your positively and negatively labeled data gets mixed up, or there are problematic images in your positive-case dataset.

You can view those biggest losers with the help of the same interpreter we defined in the previous section. In code:

interpreter.plot_top_losses(10)

The result looks like this:

10 top losses — my labels are too long apparently

Precision vs recall

Lastly, there is the method print_classification_report, which is a more numerical way to see the precision and recall measurements of your model. To get this report printed in your Jupyter cell output, you use the same interpreter as before and call:

interpreter.print_classification_report()

The output looks like the following:

As a beginner in the field, you tend to disregard those insights and focus solely on the “architecture”, but having them so easily accessible makes you want to use them more often.

Sharing your model

You trained your model and now it is time to share it. How do you do that? You can upload your model to Hugging Face and use Gradio to create an interface for people to interact with it.
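The app below expects the fine-tuned learner to be exported to a file named cw_classifier.pkl, so you first need to export it from your training notebook; a minimal way to do that (assuming the learner from above is still in memory) is:

# Export the fine-tuned learner (weights plus the data transforms) to a pickle file
learner.export('cw_classifier.pkl')

With that file in place, the Gradio app can look like this: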

import gradio as gr
from fastai.vision.all import load_learner

def greet(in_image):
    # Load the exported learner (model weights plus data transforms)
    learner = load_learner('cw_classifier.pkl')
    prediction = learner.predict(in_image)

    # prediction[2] holds the probability the model assigns to each class
    estimates_per_class = list(map(float, prediction[2]))
    classes = learner.dls.vocab
    classes_with_estimations = dict(zip(classes, estimates_per_class))

    return classes_with_estimations

iface = gr.Interface(fn=greet, inputs='image', outputs='label')
iface.launch()

Once you put your Gradio user interface definition in an app.py file and push it to a Hugging Face Space, you (and everybody else) will be able to interact with your model.

Conclusion

With fastai, transfer learning for image classification comes down to two lines of code and some data preparation. This is a great way for any ML outsider to get into the field and start experimenting.

If you have any questions or feedback, leave a comment or reach me on LinkedIn.

Have a good day,

Kerem Dede
