Author: Damiano Baschiera
Original title: DATA LOADING AND PROCESSING TUTORIAL
Original author: Sasank Chilamkurthy
Introduction
This tutorial covers how to load, preprocess, and augment data.
First, make sure the following Python libraries are installed:
- scikit-image: for processing image data
- pandas: for handling CSV files
The modules are imported as follows:
This tutorial uses a face pose dataset.
Each face is annotated with 68 facial landmark points, generated with dlib.
Implementation details are on the dlib blog:
https://blog.dlib.net/2014/08/real-time-face-pose-estimation.html
Dataset download link:
https://download.pytorch.org/tutorial/faces.zip
Each row of the dataset's CSV file holds an image name followed by the x, y coordinates of every landmark.
Download and extract the dataset into the data/faces folder, then open face_landmarks.csv to inspect its contents, that is, the annotations. The code is as follows:
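A minimal sketch of reading one annotation row with pandas. The helper name read_landmarks and the tiny two-point in-memory CSV are assumptions used only to keep the example runnable without the download. The real file is data/faces/face_landmarks.csv, with 68 points per row.

```python
from io import StringIO

import pandas as pd


def read_landmarks(csv_source, n):
    """Return (image_name, landmarks) for row n; landmarks has shape (points, 2)."""
    landmarks_frame = pd.read_csv(csv_source)
    img_name = landmarks_frame.iloc[n, 0]
    # The remaining columns alternate x, y; reshape to one (x, y) row per point.
    landmarks = landmarks_frame.iloc[n, 1:].to_numpy(dtype=float).reshape(-1, 2)
    return img_name, landmarks


# Hypothetical stand-in for face_landmarks.csv (column names are assumptions).
demo_csv = StringIO(
    "image_name,part_0_x,part_0_y,part_1_x,part_1_y\n"
    "face_a.jpg,10,20,30,40\n"
    "face_b.jpg,5,6,7,8\n"
)

img_name, landmarks = read_landmarks(demo_csv, 1)
print('Image name: {}'.format(img_name))
print('Landmarks shape: {}'.format(landmarks.shape))
print('First 4 landmarks: {}'.format(landmarks[:4]))
```

For the real dataset, the same helper could be called as read_landmarks('data/faces/face_landmarks.csv', n) for any row index n.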

Next, write a helper function that displays a face image together with its landmarks. The code is as follows:
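One way such a helper might look, sketched with matplotlib. The function name show_landmarks, the optional ax parameter, and the synthetic gray image are assumptions for illustration. With the real data, the image would come from a file in data/faces.

```python
import numpy as np
import matplotlib.pyplot as plt


def show_landmarks(image, landmarks, ax=None):
    """Draw the image and overlay its landmarks as small red dots."""
    if ax is None:
        ax = plt.gca()
    ax.imshow(image)
    # Column 0 holds x coordinates, column 1 holds y coordinates.
    ax.scatter(landmarks[:, 0], landmarks[:, 1], s=10, marker='.', c='r')
    return ax


# Synthetic stand-in data (assumption): a gray 100x120 image and three points.
image = np.full((100, 120, 3), 128, dtype=np.uint8)
landmarks = np.array([[30.0, 40.0], [60.0, 50.0], [90.0, 70.0]])

fig, ax = plt.subplots()
show_landmarks(image, landmarks, ax)
# In an interactive session, call plt.show() to display the figure.
```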

Dataset class
torch.utils.data.Dataset is an abstract class that represents a dataset. A custom dataset should inherit from Dataset and override the following methods:
- __len__: so that len(dataset) returns the size of the dataset
- __getitem__: to support indexing, so that dataset[i] returns the i-th sample
Next, we define a custom class for our face landmarks dataset. The __init__ method reads the dataset's annotation information, while the __getitem__ method reads the actual data. This is mainly a memory consideration: the data does not all have to be read and held in memory at once, but is loaded only when it is needed.
Each sample of the dataset is a dictionary: {'image': image, 'landmarks': landmarks}. The class also takes an optional transform argument, used to preprocess each sample as it is read. The next section shows why transform is useful.
The code for the custom class is as follows:
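A minimal sketch of such a class, assuming the CSV layout described earlier (image name in the first column, landmark coordinates in the rest). The class name FaceLandmarksDataset and the parameter names csv_file and root_dir are illustrative choices.

```python
import os

import pandas as pd
from skimage import io
from torch.utils.data import Dataset


class FaceLandmarksDataset(Dataset):
    """Face landmarks dataset that reads images lazily from disk."""

    def __init__(self, csv_file, root_dir, transform=None):
        # Only the annotation table is loaded up front; images stay on disk.
        self.landmarks_frame = pd.read_csv(csv_file)
        self.root_dir = root_dir
        self.transform = transform

    def __len__(self):
        return len(self.landmarks_frame)

    def __getitem__(self, idx):
        # The image is read here, when the sample is requested.
        img_name = os.path.join(self.root_dir, self.landmarks_frame.iloc[idx, 0])
        image = io.imread(img_name)
        landmarks = self.landmarks_frame.iloc[idx, 1:].to_numpy(dtype=float)
        landmarks = landmarks.reshape(-1, 2)
        sample = {'image': image, 'landmarks': landmarks}

        if self.transform:
            sample = self.transform(sample)
        return sample
```

With the dataset extracted as described above, it could be created with FaceLandmarksDataset(csv_file='data/faces/face_landmarks.csv', root_dir='data/faces/').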
Next is a simple example that uses our custom dataset class. It reads the first 4 samples and displays them:
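A rough sketch of that loop, printing sample shapes instead of plotting. The helper describe_samples is hypothetical, and so that the snippet runs without the download, face_dataset here is a plain list of sample dictionaries with made-up image sizes. Anything that supports len() and indexing, including the custom dataset class, can be passed in the same way.

```python
import numpy as np


def describe_samples(dataset, count=4):
    """Collect (index, image shape, landmarks shape) for the first `count` samples."""
    rows = []
    for i in range(min(count, len(dataset))):
        sample = dataset[i]
        rows.append((i, sample['image'].shape, sample['landmarks'].shape))
    return rows


# Stand-in data (assumption): five samples whose image sizes differ.
sizes = [(324, 215), (500, 333), (250, 258), (434, 290), (300, 300)]
face_dataset = [
    {'image': np.zeros((h, w, 3), dtype=np.uint8), 'landmarks': np.zeros((68, 2))}
    for h, w in sizes
]

for row in describe_samples(face_dataset):
    print(*row)
```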

Transforms
The output of the example above reveals a problem: the images are not all the same size, yet most neural networks expect input images of a fixed size. The next step is therefore to write some preprocessing code. There are three main transforms:
- Rescale: resize the image
- RandomCrop: crop the image at random, which is a form of data augmentation
- ToTensor: convert NumPy images to PyTorch tensors, which requires swapping the axes
These transforms are written as callable classes rather than simple functions, so that their parameters do not have to be passed on every call. For that we need to implement the __call__ method and, where necessary, the __init__ method. A transform is then constructed once, for example tsfm = Transform(params), and applied to a sample with transformed_sample = tsfm(sample).
The Rescale transform is implemented as follows:
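A sketch of how Rescale might be written, assuming samples are the {'image', 'landmarks'} dictionaries described earlier and that an integer output_size means "match the shorter side and keep the aspect ratio":

```python
import numpy as np
from skimage import transform


class Rescale:
    """Rescale the image in a sample to a given size.

    output_size: an int matches the shorter side and keeps the aspect ratio;
    a (height, width) tuple sets the output size directly.
    """

    def __init__(self, output_size):
        assert isinstance(output_size, (int, tuple))
        self.output_size = output_size

    def __call__(self, sample):
        image, landmarks = sample['image'], sample['landmarks']
        h, w = image.shape[:2]
        if isinstance(self.output_size, int):
            if h > w:
                new_h, new_w = self.output_size * h / w, self.output_size
            else:
                new_h, new_w = self.output_size, self.output_size * w / h
        else:
            new_h, new_w = self.output_size
        new_h, new_w = int(new_h), int(new_w)

        img = transform.resize(image, (new_h, new_w))
        # Landmarks are (x, y): x scales with the width, y with the height.
        landmarks = landmarks * [new_w / w, new_h / h]
        return {'image': img, 'landmarks': landmarks}


# Synthetic sample (assumption): a 100x200 image with one landmark.
sample = {'image': np.zeros((100, 200, 3)), 'landmarks': np.array([[50.0, 20.0]])}
scaled = Rescale(50)(sample)
print(scaled['image'].shape, scaled['landmarks'])
```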
The RandomCrop transform is implemented as follows:
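A possible RandomCrop, again assuming dictionary samples. The + 1 in the random offsets is a choice made in this sketch so that a crop exactly as large as the image is still allowed.

```python
import numpy as np


class RandomCrop:
    """Randomly crop the image in a sample.

    output_size: an int gives a square crop; a (height, width) tuple sets both sides.
    """

    def __init__(self, output_size):
        assert isinstance(output_size, (int, tuple))
        if isinstance(output_size, int):
            self.output_size = (output_size, output_size)
        else:
            assert len(output_size) == 2
            self.output_size = output_size

    def __call__(self, sample):
        image, landmarks = sample['image'], sample['landmarks']
        h, w = image.shape[:2]
        new_h, new_w = self.output_size

        # Pick the top-left corner so the crop stays inside the image.
        top = np.random.randint(0, h - new_h + 1)
        left = np.random.randint(0, w - new_w + 1)

        image = image[top: top + new_h, left: left + new_w]
        # Shift the landmarks into the coordinate frame of the crop.
        landmarks = landmarks - [left, top]
        return {'image': image, 'landmarks': landmarks}


# Synthetic sample (assumption): a 10x12 image with one landmark.
sample = {
    'image': np.arange(10 * 12 * 3).reshape(10, 12, 3),
    'landmarks': np.array([[6.0, 5.0]]),
}
cropped = RandomCrop(4)(sample)
print(cropped['image'].shape, cropped['landmarks'])
```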
The ToTensor transform is implemented as follows:
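A short sketch of ToTensor under the same assumptions. The axis swap is needed because NumPy images are stored as height x width x channels, while PyTorch expects channels x height x width.

```python
import numpy as np
import torch


class ToTensor:
    """Convert the ndarrays in a sample to tensors."""

    def __call__(self, sample):
        image, landmarks = sample['image'], sample['landmarks']
        # numpy image: H x W x C  ->  torch image: C x H x W
        image = image.transpose((2, 0, 1))
        return {
            'image': torch.from_numpy(image),
            'landmarks': torch.from_numpy(landmarks),
        }


# Synthetic sample (assumption): a 5x7 RGB image with two landmarks.
sample = {'image': np.zeros((5, 7, 3)), 'landmarks': np.zeros((2, 2))}
tensors = ToTensor()(sample)
print(tensors['image'].size(), tensors['landmarks'].size())
```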
Composing transforms
Next is an example that uses the custom transforms described above.
Suppose we want to rescale the shorter side of the image to 256 and then randomly crop a 224x224 region from it. That means chaining the Rescale and RandomCrop transforms.
torchvision.transforms.Compose is a callable class that chains transforms together. The code is as follows:

Iterating over the entire dataset
We have now defined one class for the dataset and three classes for preprocessing, so we can put them together into a single loading and preprocessing pipeline. The process is as follows:
- The image is first read from its file path
- The preprocessing transforms are applied to every image
- The transforms can also perform data augmentation
The implementation code is as follows:
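The steps above can be sketched with a plain for loop. SyntheticFaces is a hypothetical stand-in that generates random 224x224 images instead of reading files, and to_tensor is a function version of the ToTensor transform. With the real data, the custom dataset class and the composed transforms would take their places.

```python
import numpy as np
import torch
from torch.utils.data import Dataset


class SyntheticFaces(Dataset):
    """Hypothetical stand-in dataset: random images with 68 random landmarks."""

    def __init__(self, num_samples, transform=None):
        self.num_samples = num_samples
        self.transform = transform

    def __len__(self):
        return self.num_samples

    def __getitem__(self, idx):
        if not 0 <= idx < self.num_samples:
            raise IndexError(idx)
        rng = np.random.default_rng(idx)  # deterministic per index
        sample = {
            'image': rng.random((224, 224, 3)),
            'landmarks': rng.random((68, 2)) * 224,
        }
        # The transform runs every time a sample is requested.
        if self.transform:
            sample = self.transform(sample)
        return sample


def to_tensor(sample):
    """Swap axes to C x H x W and convert both arrays to tensors."""
    image = sample['image'].transpose((2, 0, 1))
    return {
        'image': torch.from_numpy(image),
        'landmarks': torch.from_numpy(sample['landmarks']),
    }


transformed_dataset = SyntheticFaces(10, transform=to_tensor)

for i in range(len(transformed_dataset)):
    sample = transformed_dataset[i]
    print(i, sample['image'].size(), sample['landmarks'].size())
    if i == 3:
        break
```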

This is only a simple loop. In practice, loading and processing data usually also involves:
- Splitting the data into batches of a given size
- Shuffling the data
- Loading the data in parallel using multiprocessing
torch.utils.data.DataLoader is an iterator that provides all of these features. The parameters used are shown in the following code. One of them, collate_fn, specifies how samples are combined into a batch, but the default function can also be used.
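A minimal DataLoader sketch. TensorFaces is a hypothetical dataset of random tensors, and num_workers=0 (load in the main process) is chosen here only so the snippet runs anywhere. A positive value enables the parallel loading mentioned above. The collate_as_list function is likewise an illustrative custom collate_fn.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class TensorFaces(Dataset):
    """Hypothetical dataset of random image and landmark tensors."""

    def __init__(self, num_samples):
        self.images = torch.rand(num_samples, 3, 224, 224)
        self.landmarks = torch.rand(num_samples, 68, 2)

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        return {'image': self.images[idx], 'landmarks': self.landmarks[idx]}


dataset = TensorFaces(10)

# The default collate function stacks the tensors of each dictionary key.
dataloader = DataLoader(dataset, batch_size=4, shuffle=True, num_workers=0)
for i_batch, batch in enumerate(dataloader):
    print(i_batch, batch['image'].size(), batch['landmarks'].size())


def collate_as_list(samples):
    """Illustrative collate_fn: keep the samples in a list instead of stacking."""
    return samples


list_loader = DataLoader(dataset, batch_size=4, collate_fn=collate_as_list)
first_list_batch = next(iter(list_loader))
print(type(first_list_batch), len(first_list_batch))
```

With 10 samples and batch_size=4, the loader yields two batches of 4 and a final batch of 2.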

torchvision
Finally, a word on the torchvision library. It provides some common datasets and transforms, so you may not even need to write custom classes. One of the more generic datasets is ImageFolder, which assumes that images are stored one subdirectory per class under a root folder, for example root/ants/ and root/bees/, each holding that class's image files.
Here ants, bees, and so on are the class labels. Generic transforms that operate on PIL.Image, such as RandomHorizontalFlip and Scale (named Resize in current torchvision releases), are also included in torchvision. An example is as follows:
Summary
This tutorial showed how to write a custom class to load your own dataset and how to write preprocessing transforms. It also introduced torchvision and torch.utils.data.DataLoader from PyTorch.
The code for this article is available on GitHub:
https://github.com/ccc013/DeepLearning_Notes/blob/master/Pytorch/pytorch_dataloader_tutorial.ipynb
In addition, the code that uses dlib to generate the face landmarks is here:
https://github.com/ccc013/DeepLearning_Notes/blob/master/Pytorch/create_landmark_dataset.py