The inputs to our model are images, and the outputs are categories (in this case, “bird” or “forest”).
get_items=get_image_files,
To find all the inputs to our model, run the get_image_files function (which returns a list of all image files in a path).
splitter=RandomSplitter(valid_pct=0.2, seed=42),
Split the data into training and validation sets randomly, using 20% of the data for the validation set.
get_y=parent_label,
The label (y value) for each file is the name of its parent (i.e. the name of the folder it's in, which will be bird or forest).
item_tfms=[Resize(192, method='squish')]
Before training, resize each image to 192x192 pixels by “squishing” it (as opposed to cropping it).
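Putting these arguments together, here is a sketch of the full DataBlock call and the DataLoaders built from it. It assumes path points to the folder of downloaded images; the blocks argument and the batch size of 32 are assumptions that match the setup described above.

from fastai.vision.all import *  # DataBlock, ImageBlock, CategoryBlock, get_image_files, etc.

dls = DataBlock(
    blocks=(ImageBlock, CategoryBlock),               # inputs are images, outputs are categories
    get_items=get_image_files,                        # find all the image files under path
    splitter=RandomSplitter(valid_pct=0.2, seed=42),  # hold out 20% of the data for validation
    get_y=parent_label,                               # label each image by its folder name
    item_tfms=[Resize(192, method='squish')]          # squish every image to 192x192 pixels
).dataloaders(path, bs=32)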
Training the model
Now we’re ready to train our model. The fastest widely used computer vision model is resnet18. You can train this in a few minutes, even on a CPU! (On a GPU, it generally takes under 10 seconds…)
fastai comes with a helpful fine_tune() method which automatically uses best practices for fine-tuning a pre-trained model, so we'll use that.
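Here is a sketch of that step, assuming the dls DataLoaders built above; the choice of 3 epochs and the error_rate metric are just examples:

learn = vision_learner(dls, resnet18, metrics=error_rate)  # pre-trained resnet18, tracking error rate on the validation set
learn.fine_tune(3)  # fine-tune the pre-trained weights for 3 epochs (example count)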
Let’s see what our model thinks about that bird we downloaded at the start:
??PILImage
Init signature: PILImage()
Source:
class PILImage(PILBase):
    "A RGB Pillow `Image` that can show itself and converts to `TensorImage`"
    pass
File:           ~/mambaforge/envs/cfast/lib/python3.11/site-packages/fastai/vision/core.py
Type:           BypassNewMeta
Subclasses:     PILImageBW
is_bird,_,probs = learn.predict(PILImage.create('Data/bird0.jpg'))
print(f"This is a: {is_bird}.")
print(f"Probability it's a bird: {probs[0]:.4f}")