# install fastkaggle if not available
try: import fastkaggle
except ModuleNotFoundError:
!pip install -Uq fastkaggle
from fastkaggle import *
Mulit target Example
This is part 4 of the Road to the Top series, in which I show the process I used to tackle the Paddy Doctor competition. The first three parts show the process I took that lead to four 1st place submissions. This part onwards discusses various extensions and advanced tricks you might want to consider – but which I don’t promise will necessarily help your score (you’ll need to do your own experiments to figure this out!) If you haven’t read the rest of this series yet, I’d suggest starting at part 1.
In this notebook we’re going to build a model that doesn’t just predict what disease the rice paddy has, but also predicts what kind of rice is shown.
This might sound like a bad idea. After all, doesn’t that mean that the model has more to do? Mightn’t it get rather distracted from its main task, which is to identify paddy disease?
Perhaps… But in previous projects I’ve often found the opposite to be true, especially when training for quite a few epochs. By giving the model more signal about what is present in a picture, it may be able to use this information to find more interesting features that predict our target of interest. For instance, perhaps some of the features of disease change between varieties.
Multi-output DataLoader
First we’ll repeat the steps we used last time to access the data and ensure all the latest libraries are installed:
from nbdevAuto.functions import kaggle_competition_download
from pathlib import Path
from fastai.vision.all import *
42)
set_seed(
= Path('./Data')
datapath = 'paddy-disease-classification'
name = Path(f'{datapath}/{name}')
path
kaggle_competition_download(name, datapath)
file exists
from fastcore.parallel import *
= Path(f'{path}/train_images') trn_path
Here’s the CSV that Kaggle provides, showing the variety of rice contained in each image – we’ll make image_id
the index of our data frame so that we can look up images directly to grab their variety:
= pd.read_csv(f'{path}/train.csv', index_col='image_id')
df df.head()
label | variety | age | |
---|---|---|---|
image_id | |||
100330.jpg | bacterial_leaf_blight | ADT45 | 45 |
100365.jpg | bacterial_leaf_blight | ADT45 | 45 |
100382.jpg | bacterial_leaf_blight | ADT45 | 45 |
100632.jpg | bacterial_leaf_blight | ADT45 | 45 |
101918.jpg | bacterial_leaf_blight | ADT45 | 45 |
Pandas uses the loc
attribute to look up rows by index. Here’s how we can get the variety of image 100330.jpg
, for instance:
'100330.jpg', 'variety'] df.loc[
'ADT45'
Our DataBlock will be using get_image_files
to get the list of training images, which returns Path
objects. Therefore, to look up an item to get its variety, we’ll need to pass its name
. Here’s a function which does just that:
def get_variety(p): return df.loc[p.name, 'variety']
We’re now ready to create our DataLoaders
. To do this, we’ll use the DataBlock
API, which is a flexible and convenient way to plug pieces of a data processing pipeline together:
= DataBlock(
dls =(ImageBlock,CategoryBlock,CategoryBlock),
blocks=1,
n_inp=get_image_files,
get_items= [parent_label,get_variety],
get_y =RandomSplitter(0.2, seed=42),
splitter=Resize(192, method='squish'),
item_tfms=aug_transforms(size=128, min_scale=0.75)
batch_tfms ).dataloaders(trn_path)
Here’s an explanation of each line:
=(ImageBlock,CategoryBlock,CategoryBlock), blocks
The DataBlock
will create 3 things from each file: an image (the contents of the file), and 2 categorical variables (the disease and the variety).
=1, n_inp
There is 1
input (the image) – and therefore the other two variables (the two categories) are outputs.
=get_image_files, get_items
Use get_image_files
to get a list of inputs.
= [parent_label,get_variety], get_y
To create the two outputs for each file, call two functions: parent_label
(from fastai) and get_variety
(defined above).
=RandomSplitter(0.2, seed=42), splitter
Randomly split the input into 80% train and 20% validation sets.
=Resize(192, method='squish'),
item_tfms=aug_transforms(size=128, min_scale=0.75) batch_tfms
These are the same item and batch transforms we’ve used in previous notebooks.
Let’s take a look at part of a batch of this data:
=6) dls.show_batch(max_n
We can see that fastai has created both the image input and two categorical outputs that we requested!
Replicating the disease model
Now we’ll replicate the same disease model we’ve made before, but have it work with this new data.
The key difference is that our metrics and loss will now receive three things instead of two: the model outputs (i.e. the metric and loss function inputs), and the two targets (disease and variety). Therefore, we need to define slight variations of our metric (error_rate
) and loss function (cross_entropy
) to pass on just the disease
target:
def disease_err(inp,disease,variety): return error_rate(inp,disease)
def disease_loss(inp,disease,variety): return F.cross_entropy(inp,disease)
We’re now ready to create our learner.
There’s just one wrinkle to be aware of. Now that our DataLoaders
is returning multiple targets, fastai doesn’t know how many outputs our model will need. Therefore we have to pass n_out
when we create our Learner
– we need 10
outputs, one for each possible disease:
= 'convnext_small'
arch = vision_learner(dls, arch, loss_func=disease_loss, metrics=disease_err, n_out=10).to_fp16()
learn = 0.01 lr
When we train this model we should get similar results to what we’ve seen with similar models before:
5, lr) learn.fine_tune(
epoch | train_loss | valid_loss | disease_err | time |
---|---|---|---|---|
0 | 1.433345 | 1.037968 | 0.308986 | 01:09 |
epoch | train_loss | valid_loss | disease_err | time |
---|---|---|---|---|
0 | 0.730947 | 0.590886 | 0.190774 | 01:03 |
1 | 0.522005 | 0.622835 | 0.196540 | 01:08 |
2 | 0.313315 | 0.299247 | 0.090822 | 01:00 |
3 | 0.141513 | 0.134452 | 0.040846 | 01:01 |
4 | 0.081356 | 0.110794 | 0.031235 | 01:03 |
Multi-target model
In order to predict both the probability of each disease, and of each variety, we’ll now need the model to output a tensor of length 20, since there are 10 possible diseases, and 10 possible varieties. We can do this by setting n_out=20
:
= vision_learner(dls, arch, n_out=20).to_fp16() learn
We can define disease_loss
just like we did previously, but with one important change: the input tensor is now length 20, not 10, so it doesn’t match the number of possible diseases. We can pick whatever part of the input we want to be used to predict disease. Let’s use the first 10 values:
def disease_loss(inp,disease,variety): return F.cross_entropy(inp[:,:10],disease)
That means we can do the same thing for predicting variety, but use the last 10 values of the input, and set the target to variety
instead of disease
:
def variety_loss(inp,disease,variety): return F.cross_entropy(inp[:,10:],variety)
Our overall loss will then be the sum of these two losses:
def combine_loss(inp,disease,variety): return disease_loss(inp,disease,variety)+variety_loss(inp,disease,variety)
It would be useful to view the error rate for each of the outputs too, so let’s do the same thing for out metrics:
def disease_err(inp,disease,variety): return error_rate(inp[:,:10],disease)
def variety_err(inp,disease,variety): return error_rate(inp[:,10:],variety)
= (disease_err,variety_err) err_metrics
It’s useful to see the loss for each of the outputs too, so we’ll add those as metrics:
= err_metrics+(disease_loss,variety_loss) all_metrics
We’re now ready to create and train our Learner
:
= vision_learner(dls, arch, loss_func=combine_loss, metrics=all_metrics, n_out=20).to_fp16() learn
5, lr) learn.fine_tune(
epoch | train_loss | valid_loss | disease_err | variety_err | disease_loss | variety_loss | time |
---|---|---|---|---|---|---|---|
0 | 2.574120 | 1.809338 | 0.410860 | 0.161941 | 1.277781 | 0.531556 | 00:54 |
epoch | train_loss | valid_loss | disease_err | variety_err | disease_loss | variety_loss | time |
---|---|---|---|---|---|---|---|
0 | 1.099624 | 0.838923 | 0.165786 | 0.114849 | 0.526330 | 0.312593 | 00:54 |
1 | 0.767783 | 0.732510 | 0.133590 | 0.103316 | 0.433259 | 0.299251 | 00:58 |
2 | 0.439430 | 0.287075 | 0.063431 | 0.021144 | 0.223693 | 0.063382 | 00:59 |
3 | 0.207751 | 0.180888 | 0.043729 | 0.010091 | 0.145213 | 0.035676 | 01:00 |
4 | 0.117015 | 0.163008 | 0.038443 | 0.010572 | 0.129077 | 0.033932 | 01:00 |
Conclusion
So, is this useful?
Well… if you actually want a model that predicts multiple things, then yes, definitely! But as to whether it’s going to help us better predict rice disease, I honestly don’t know. I haven’t come across any research that tackles this important question: when can a multi-target model improve the accuracy of the individual targets compared to a single target model? (That doesn’t mean it doesn’t exist of course – perhaps it does and I haven’t found it yet…)
I’ve certainly found in previous projects that there are cases where improvements to single targets can be made by using a multi-target model. I’d guess that it’ll be most useful when you’re having problems with overfitting. So try doing this with more epochs, and let me know how you go!
If you found this notebook useful, please remember to click the little up-arrow at the top to upvote it, since I like to know when people have found my work useful, and it helps others find it too. And if you have any questions or comments, please pop them below – I read every comment I receive.
# This is what I use to push my notebook from my home PC to Kaggle
if not iskaggle:
'jhoward', 'multi-target-road-to-the-top-part-4',
push_notebook(='Multi-target: Road to the Top, Part 4',
titlefile='11-multitask.ipynb',
=comp, private=False, gpu=True) competition