“Building a deep learning model in a few minutes? It takes hours to train! I don’t even have a good enough machine.” Countless data science practitioners say they avoid building deep learning models on their own machines for exactly these reasons.
In fact, you don't need to work at Google or another large technology company to work with deep learning datasets. Building your own neural network from scratch in a matter of minutes, without renting Google's servers, is no longer just a dream. Fast.ai students trained a model on the ImageNet dataset in just 18 minutes. This article will walk through a similar model-building process.
Deep learning is a vast field, so in this article we will narrow our focus to image classification. We will also use a very simple deep learning architecture that still achieves a pretty good accuracy.
You can treat the Python code in this article as a reference for building your own image classification models. Once you have a good grasp of the concepts, go ahead and try the competitions yourself!
What is image classification?
Take a look at the image below:
You will instantly recognize it as a (rather luxurious) car. Take a step back and analyze how you came to that conclusion: you were shown an image and you classified the category it belonged to (a car, in this instance). In a nutshell, that is what image classification is all about.
A given image could belong to any one of n possible categories. Manually checking and classifying images is a very tedious process, and when we are faced with a massive number of images, say 10,000 or even 100,000, the task becomes nearly impossible. How useful would it be if we could automate this entire process and quickly label each image with its corresponding class?
Self-driving cars are a great example of image classification in the real world. To achieve autonomous driving, we can build an image classification model that recognizes various objects on the road, such as vehicles, people, moving objects, and so on. We'll see a couple more use cases later in this article, and there are plenty more applications all around you.
Now that we have a handle on the subject, let's dive into how an image classification model is built, what its prerequisites are, and how to implement it in Python.
Setting up the structure of our image data
In order to solve an image classification problem, our data needs to be in a particular format. We will see this in action in a few sections, but keep these pointers in mind until we get there.
You need two folders, one for the training set and one for the test set. The training set folder contains a .csv file and an image folder:
- The .csv file contains the names of all the training images and their corresponding labels
- The image folder contains all the training images
The test set's .csv file differs from the training set's: it contains the names of all the test images but not their labels. Can you guess why? Our model will be trained on the images in the training set, and label predictions will happen on the test set images.
If your data is not in the format described above, you will need to convert it accordingly (otherwise your predictions will be off and fairly useless).
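As a quick sanity check once your files are in place, you can load the training .csv with pandas and confirm it contains the image names and labels (the 'id' and 'label' column names below match the dataset we use later in this article):
import pandas as pd

train = pd.read_csv('train.csv')
print(train.columns)             # expect columns like 'id' and 'label'
print(train.shape)               # number of training images
print(train['label'].nunique())  # number of distinct classes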
Breaking down the model-building process
Before delving into the Python code, let's take a moment to understand how image classification models are typically designed. We can divide this process broadly into 4 stages, each of which takes a certain share of the overall time:
- Loading and preprocessing the data - 30% of the time
- Defining the model architecture - 10% of the time
- Training the model - 50% of the time
- Evaluating performance - 10% of the time
Let me explain each of the above steps in a bit more detail. This part is important because not every model is built in one go. You need to go back after each iteration, fine-tune the steps, and run it again. A solid understanding of the underlying concepts will go a long way toward accelerating the process.
Stage 1: Loading and preprocessing the data
Data is gold as far as deep learning models are concerned. Your image classification model has a far better chance of performing well if you have a good amount of images in the training set. Also, the shape of the data varies depending on the architecture/framework we use.
Hence, I highly recommend learning the basics of image processing in Python to understand more about how preprocessing works on image data. But we don't need to go that far just yet. To see how our model performs on unseen data (before exposing it to the test set), we need to create a validation set. This is done by partitioning the training set data.
In short, we train the model on the training data and validate it on the validation data. Once we are happy with the model's performance on the validation set, we can use it to make predictions on the test data.
Time required for this step: approximately 2-3 minutes.
Stage 2: Defining the model architecture
This is another crucial step in our deep learning model-building process. We have to define how our model will look, which requires answering questions like these:
- How many convolutional layers do we need?
- What should the activation function for each layer be?
- How many hidden units should each layer have?
And many more. These are essentially the model's hyperparameters, and they play a massive role in deciding how good the predictions will be.
How do we decide on these values? Excellent question! One idea is to choose them based on existing research and studies. Another is to keep trying values until you find the best fit, but this can be quite a time-consuming process.
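To make the trial-and-error idea concrete, here is a minimal sketch of a manual search loop. The build_model helper is hypothetical (not part of this article's code), and the loop assumes the train/validation arrays we create later in the walkthrough:
# Hypothetical sketch of a manual hyperparameter search.
# build_model(filters, hidden_units) is assumed to return a compiled Keras model.
best_acc, best_params = 0, None
for filters in [32, 64]:
    for hidden_units in [64, 128]:
        model = build_model(filters=filters, hidden_units=hidden_units)
        history = model.fit(X_train, y_train, epochs=5,
                            validation_data=(X_test, y_test), verbose=0)
        val_acc = history.history['val_acc'][-1]  # 'val_accuracy' in newer Keras
        if val_acc > best_acc:
            best_acc, best_params = val_acc, (filters, hidden_units)
print(best_acc, best_params)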
Time required for this step: approximately 1 minute.
Stage 3: Training the model
In order to train the model, we require:
- The training images and their corresponding correct labels
- The validation images and their corresponding correct labels (we use these labels only to validate the model, not during the training phase)
We also define the number of epochs in this step. For starters, we will run the model for 10 epochs (you can change the number of epochs later).
Time required for this step: since training requires the model to learn structures from the data, we need around 5 minutes to go through this step.
It's time to make predictions!
Stage 4: Evaluating the performance of the model
Finally, we load the test data (images) and run our preprocessing steps on them as well. We then use the trained model to predict classes for these images.
Time required for this step: no more than 1 minute.
Setting up the problem statement and understanding the data
We will take on a really cool challenge to understand image classification: building a model that classifies a given set of images according to the apparel they show (shirts, trousers, shoes, socks, etc.). This is actually a problem faced by many e-commerce retailers, which makes it an even more interesting computer vision problem.
This challenge, called "Identify the Apparels", is one of the practice problems found on the DataHack platform.
We have a total of 70,000 images (28 x 28 pixels): 60,000 in the training set and 10,000 in the test set. The training images are pre-labeled according to the type of apparel, with 10 classes in total. The test images are, of course, unlabeled. The challenge is to identify the type of apparel present in all the test images.
We will build our model on Google Colab since it provides a free GPU for training.
Steps to create an image classification model
It's time to fire up your Python skills and get your hands dirty. We have finally arrived at the hands-on part! Here are the steps we'll follow:
- Setting up Google Colab
- Importing the libraries
- Loading and preprocessing the data (3 minutes)
- Creating a validation set
- Defining the model structure (1 minute)
- Training the model (5 minutes)
- Making predictions (1 minute)
Let's go through each step in detail:
Step 1: Setting up Google Colab
Since we're importing our data from a Google Drive link, we need to add a few lines of code to our Google Colab notebook. Create a new Python 3 notebook and write the following code block:
!pip install PyDrive
This will install PyDrive. Now we will import some necessary libraries:
import os
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
Next, we will create a drive variable to access Google Drive:
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)
To download the dataset, we will use the ID of the file uploaded on Google Drive:
download = drive.CreateFile({'id': '1BZOv422XJvxFUnGh-0xVeSvgFgqVY45q'})
Replace "id" in the above code with the id of the file. Now we will download this file and unzip it:
download.GetContentFile('train_LbELtWX.zip')
!unzip train_LbELtWX.zip
These code blocks must be run each time you start your notebook.
Step 2: Importing the libraries we'll need during our model-building phase.
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.utils import to_categorical
from keras.preprocessing import image
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from tqdm import tqdm
Step 3: Loading and preprocessing the data. Recall the preprocessing steps we discussed earlier; we will use them here, after loading the data.
train = pd.read_csv('train.csv')
Next, we will read all the training images, store them in a list, and finally convert the list into a numpy array.
# We have grayscale images, so while loading the images we will keep grayscale=True, if you have RGB images, you should set grayscale as False
train_image = []
for i in tqdm(range(train.shape[0])):
img = image.load_img('train/'+train['id'][i].astype('str')+'.png', target_size=(28,28,1), grayscale=True)
img = image.img_to_array(img)
img = img/255
train_image.append(img)
X = np.array(train_image)
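A quick note in case this loop throws an error for you: newer versions of Keras replace load_img's grayscale argument with color_mode, so inside the loop the call would become:
# Newer Keras versions: color_mode='grayscale' replaces grayscale=True
img = image.load_img('train/'+train['id'][i].astype('str')+'.png', target_size=(28,28), color_mode='grayscale')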
Since this is a multi-class classification problem (10 classes), we will one-hot encode the target variable.
y=train['label'].values
y = to_categorical(y)
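To see what one-hot encoding actually does, here is a tiny illustration: each integer label becomes a vector with a 1 at the label's index and 0 everywhere else.
# to_categorical turns integer labels into one-hot vectors, e.g.:
# to_categorical([0, 2, 1], num_classes=3)
# -> [[1., 0., 0.],
#     [0., 0., 1.],
#     [0., 1., 0.]]
print(y.shape)  # (number of training images, 10) after encoding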
Step 4: Creating a validation set from the training data
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42, test_size=0.2)
Step 5: Defining the model structure
We will create a simple architecture with two convolutional layers, one dense hidden layer, and an output layer.
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),activation='relu',input_shape=(28,28,1)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
Next, we'll compile the model we've created:
model.compile(loss='categorical_crossentropy',optimizer='Adam',metrics=['accuracy'])
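If you'd like to double-check the architecture and the number of trainable parameters before training, you can print a layer-by-layer overview:
model.summary()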
Step 6: Training the model.
In this step, we will train the model on the training set images and validate it using (you guessed it) the validation set.
model.fit(X_train, y_train, epochs=10, validation_data=(X_test, y_test))
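As an optional extra (not part of the original walkthrough), you can capture the History object that fit() returns and use the matplotlib we imported earlier to plot training versus validation accuracy across epochs:
# fit() returns a History object; .history stores per-epoch metrics
history = model.fit(X_train, y_train, epochs=10, validation_data=(X_test, y_test))
# Metric keys are 'acc'/'val_acc' in older Keras, 'accuracy'/'val_accuracy' in newer versions
plt.plot(history.history['acc'], label='train')
plt.plot(history.history['val_acc'], label='validation')
plt.xlabel('epoch')
plt.ylabel('accuracy')
plt.legend()
plt.show()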
Step 7: Making predictions!
We will initially follow the same steps we performed when dealing with the training data: load the test images and predict their classes using the model.predict_classes() function.
download = drive.CreateFile({'id': '1KuyWGFEpj7Fr2DgBsW8qsWvjqEzfoJBY'})
download.GetContentFile('test_ScVgIM0.zip')
!unzip test_ScVgIM0.zip
Let's import the test file:
test = pd.read_csv('test.csv')
Now, read in and store all the test images:
test_image = []
for i in tqdm(range(test.shape[0])):
img = image.load_img('test/'+test['id'][i].astype('str')+'.png', target_size=(28,28,1), grayscale=True)
img = image.img_to_array(img)
img = img/255
test_image.append(img)
test = np.array(test_image)
# making predictions
prediction = model.predict_classes(test)
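One caveat: predict_classes() exists only on Sequential models in older Keras and has been removed in recent TensorFlow/Keras releases. If it's unavailable in your version, the equivalent for a softmax output is:
# Equivalent to predict_classes() for a multi-class softmax model
prediction = np.argmax(model.predict(test), axis=-1)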
We will also create a submission file to upload to the DataHack platform page (to see how our results fare on the leaderboard).
download = drive.CreateFile({'id': '1z4QXy7WravpSj-S4Cs9Fk8ZNaX-qh5HF'})
download.GetContentFile('sample_submission_I5njJSF.csv')
# creating submission file
sample = pd.read_csv('sample_submission_I5njJSF.csv')
sample['label'] = prediction
sample.to_csv('sample_cnn.csv', header=True, index=False)
Download this sample_cnn.csv file and upload it on the contest page to generate your score and check your ranking on the leaderboard. This will give you a benchmark solution to get you started with any image classification problem!
You can experiment with hyperparameters and regularization techniques to further improve the performance of your model.
Taking on a different challenge
Let's test our learning on a different dataset. We will tackle the "Identify the Digits" practice problem in this section. Download the dataset and try to solve it yourself before reading further. You already have the tools to solve it - you just need to apply them! Come back here to check your results, or if you get stuck at any point.
In this challenge, we need to identify the digit in a given image. We have a total of 70,000 images: 49,000 labeled images in the training set and 21,000 unlabeled images in the test set. We need to identify/predict the classes of these unlabeled images.
Ready to begin? Awesome! Create a new Python 3 notebook and run the following code:
# Setting up Colab
!pip install PyDrive
import os
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from google.colab import auth
from oauth2client.client import GoogleCredentials
auth.authenticate_user()
gauth = GoogleAuth()
gauth.credentials = GoogleCredentials.get_application_default()
drive = GoogleDrive(gauth)
# Replace the id and filename in the below codes
download = drive.CreateFile({'id': '1ZCzHDAfwgLdQke_GNnHp_4OheRRtNPs-'})
download.GetContentFile('Train_UQcUa52.zip')
!unzip Train_UQcUa52.zip
# Importing libraries
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D
from keras.utils import to_categorical
from keras.preprocessing import image
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from tqdm import tqdm
train = pd.read_csv('train.csv')
# Reading the training images
train_image = []
for i in tqdm(range(train.shape[0])):
img = image.load_img('Images/train/'+train['filename'][i], target_size=(28,28,1), grayscale=True)
img = image.img_to_array(img)
img = img/255
train_image.append(img)
X = np.array(train_image)
# Creating the target variable
y=train['label'].values
y = to_categorical(y)
# Creating validation set
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42, test_size=0.2)
# Define the model structure
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),activation='relu',input_shape=(28,28,1)))
model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(10, activation='softmax'))
# Compile the model
model.compile(loss='categorical_crossentropy',optimizer='Adam',metrics=['accuracy'])
# Training the model
model.fit(X_train, y_train, epochs=10, validation_data=(X_test, y_test))
download = drive.CreateFile({'id': '1zHJR6yiI06ao-UAh_LXZQRIOzBO3sNDq'})
download.GetContentFile('Test_fCbTej3.csv')
test_file = pd.read_csv('Test_fCbTej3.csv')
test_image = []
for i in tqdm(range(test_file.shape[0])):
img = image.load_img('Images/test/'+test_file['filename'][i], target_size=(28,28,1), grayscale=True)
img = image.img_to_array(img)
img = img/255
test_image.append(img)
test = np.array(test_image)
prediction = model.predict_classes(test)
download = drive.CreateFile({'id': '1nRz5bD7ReGrdinpdFcHVIEyjqtPGPyHx'})
download.GetContentFile('Sample_Submission_lxuyBuB.csv')
sample = pd.read_csv('Sample_Submission_lxuyBuB.csv')
sample['filename'] = test_file['filename']
sample['label'] = prediction
sample.to_csv('sample.csv', header=True, index=False)
Submit this file on the practice problem page and you'll get a pretty decent accuracy score. It's a good starting point, but there is always room for improvement. Keep playing around with the hyperparameter values and see if you can improve on the basic model.