Lemon Classification: A Full-Pipeline Walkthrough

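This article walks through the complete workflow of an image classification competition, including data processing, model construction, loss functions, and optimizer selection, and provides working code. After reading it you should understand how to build an image classification network and be equipped to take part in an image classification competition.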


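This is day 17 of my participation in the "Juejin Daily Update Plan · February Writing Challenge". Using the lemon grading competition as an example, this article walks through the complete workflow of an image classification competition: data processing, model construction, the choice of loss function and optimizer, learning-rate scheduling, model training, and inference output. Every module has plenty of tricks; I will cover the theory behind each one and then put it into practice in code. By the end of this course you should understand how to build an image classification network and be able to take part in an image classification competition. (PS: this course is guaranteed to teach you everything, but if you still don't get it, no refunds! No refunds is our service motto.)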


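Contents

  1. Building a pipeline for image tasks (modularization)
  2. General tuning techniques and common approaches
  3. Hands-on tuning for the lemon classification competition
  4. Suggestions and summary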

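Tools for the full image classification competition workflow

  • Programming language

  Python

  • Deep learning framework

  PaddlePaddle 2.0

  • Image preprocessing libraries

  OpenCV

  PIL (Pillow)

  • General-purpose libraries

  NumPy

  Pandas

  Scikit-Learn

  Matplotlib

General workflow for an image classification competition

  1. Data EDA (Pandas, Matplotlib)
  2. Data preprocessing (OpenCV, PIL, Pandas, NumPy, Scikit-Learn)
  3. Define the data reading method for the task, i.e. the Dataset and DataLoader (PaddlePaddle 2.0)
  4. Choose an image classification model and train it (PaddlePaddle 2.0)
  5. Run inference on the test set and submit the results (PaddlePaddle 2.0, Pandas)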

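1. EDA (Exploratory Data Analysis) and data preprocessing

1.1 Data EDA

  Exploratory Data Analysis (EDA) is a way of analyzing existing (raw) data by plotting, tabulating, fitting equations, computing summary statistics, and similar means, in order to discover the structure and patterns in the data. When we first encounter a dataset we usually have no idea where to start, and this is exactly where EDA is most useful.

  For an image classification task, the first step is usually to count the samples in each class and look at the distribution of the training set. Analyzing that distribution helps us understand the task and form a solution strategy. (Insight into the nature of the data matters.)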
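
As a minimal, dependency-free sketch of the class-count step described above, the snippet below tallies labels and reports each class's share. The `labels` list is made-up toy data, not the lemon dataset, and `class_distribution` is a hypothetical helper name.

```python
from collections import Counter

def class_distribution(labels):
    """Return {class: (count, fraction)} sorted by class id."""
    counts = Counter(labels)
    total = len(labels)
    return {c: (n, n / total) for c, n in sorted(counts.items())}

# Toy labels for illustration only (not the real dataset).
labels = [0, 0, 0, 0, 1, 1, 2, 3]
dist = class_distribution(labels)
for cls, (n, frac) in dist.items():
    print(f"class {cls}: {n} samples ({frac:.1%})")

# The ratio between the largest and smallest class is a quick imbalance indicator.
sizes = [n for n, _ in dist.values()]
imbalance_ratio = max(sizes) / min(sizes)
print("imbalance ratio:", imbalance_ratio)
```

A ratio close to 1 suggests a balanced dataset; a large ratio is a hint that resampling tricks may be worth trying.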

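Some advice on data analysis

1. Write down your hypotheses, then dig deeper into the data to test them.

2. Keep a record of your analysis so you don't forget what you did.

3. Show your intermediate results to peers so they can give you broader feedback or suggestions (i.e. be open to everybody).

4. Visualize the results of your analysis.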

!cd data/data71799/ && unzip -q lemon_lesson.zip
!cd data/data71799/lemon_lesson && unzip -q train_images.zip
!cd data/data71799/lemon_lesson && unzip -q test_images.zip
# Import the required libraries

import os
import pandas as pd
import numpy as np
from PIL import Image

import paddle
import paddle.nn as nn
from paddle.io import Dataset
import paddle.vision.transforms as T
import paddle.nn.functional as F
from paddle.metric import Accuracy

import warnings
warnings.filterwarnings("ignore")
# Data EDA: histogram of the class labels
df = pd.read_csv('data/data71799/lemon_lesson/train_images.csv')
d=df['class_num'].hist().get_figure()
# d.savefig('2.jpg')

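The class distribution of the lemon dataset is shown below:

[Figure: histogram of the class distribution in the lemon training set]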

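Key point: common difficulties in image classification competitions

  1. Class imbalance
  2. One-shot and few-shot classification
  3. Fine-grained classification

Difficulties in the lemon classification competition

Limited model size

Small dataset (1,102 training images)

1.2 Data preprocessing

Compose combines a list of preprocessing interfaces into a single transform that is applied to the dataset.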

# Define the data preprocessing pipeline
data_transforms = T.Compose([
    T.Resize(size=(32, 32)),
    T.Transpose(),    # HWC -> CHW
    T.Normalize(
        mean=[0, 0, 0],        # scale pixel values to [0, 1]
        std=[255, 255, 255],
        to_rgb=True)    
])

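Image standardization and normalization

  The two most common image preprocessing methods are standardization and normalization. Standardization centers the data by subtracting the mean and then divides by the standard deviation, so the processed data has zero mean and unit variance (this alone does not make the data normally distributed). Normalization rescales the data proportionally into a fixed interval, most commonly [0, 1].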
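
The two operations can be sketched in a few lines of NumPy. This is a minimal illustration on made-up pixel values; `min_max_normalize` and `standardize` are hypothetical helper names, not Paddle APIs.

```python
import numpy as np

def min_max_normalize(x):
    """Rescale values linearly into [0, 1]."""
    x = np.asarray(x, dtype=np.float64)
    return (x - x.min()) / (x.max() - x.min())

def standardize(x):
    """Shift to zero mean and scale to unit standard deviation."""
    x = np.asarray(x, dtype=np.float64)
    return (x - x.mean()) / x.std()

pixels = np.array([0, 64, 128, 255])  # toy 8-bit pixel values
normalized = min_max_normalize(pixels)
standardized = standardize(pixels)
print(normalized)
print(standardized.mean(), standardized.std())
```

For 8-bit images, dividing by 255 is the min-max case with fixed bounds 0 and 255, which is what Normalize(mean=[0, 0, 0], std=[255, 255, 255]) computes.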

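Benefits

  1. Makes weight initialization easier
  2. Avoids numerical problems in gradient updates
  3. Makes the learning rate easier to tune
  4. Speeds up the search for the optimum

Standardization

[Figure: standardization]

Normalization

[Figure: normalization]

The search for the optimum before normalization, and after normalization

[Figures: optimization path without normalization and with normalization]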

# What is a numerical problem?

421*0.00243 == 0.421*2.43
False
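
The comparison above involves two products that are mathematically equal but are computed from rounded binary floating-point operands. A hedged sketch of the usual workaround, comparing with a tolerance via the standard library's `math.isclose`, is shown below; the tolerance value is an arbitrary choice for illustration.

```python
import math

a = 421 * 0.00243
b = 0.421 * 2.43

# Each decimal factor is rounded to the nearest binary floating-point number,
# so the two products can differ in their last bits.
print(repr(a), repr(b))

# Compare with a relative tolerance instead of ==.
print(math.isclose(a, b, rel_tol=1e-9))
```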
import numpy as np
from PIL import Image
from paddle.vision.transforms import Normalize

normalize_std = Normalize(mean=[127.5, 127.5, 127.5],
                        std=[127.5, 127.5, 127.5],
                        data_format='HWC')

fake_img = Image.fromarray((np.random.rand(300, 320, 3) * 255.).astype(np.uint8))

fake_img = normalize_std(fake_img)
# print(fake_img.shape)
print(fake_img)
[[[ 0.8666667   0.78039217 -0.9137255 ]
  [-0.46666667  0.14509805 -0.08235294]
  [ 0.16078432  0.25490198  0.34117648]
  ...
  [ 0.38039216  0.8666667   0.827451  ]
  [-0.16862746  0.49803922  0.3019608 ]
  [ 0.06666667 -0.49019608 -0.7019608 ]]

 [[ 0.8509804  -0.05882353  0.00392157]
  [-0.8666667   0.9137255   0.67058825]
  [ 0.16078432 -0.6862745   0.88235295]
  ...
  [ 0.41960785 -0.49803922  0.29411766]
  [-0.2627451   0.7019608   0.60784316]
  [ 0.13725491 -0.6627451  -0.09803922]]

 [[-0.3647059  -0.77254903  0.60784316]
  [-0.79607844  0.7647059  -0.23921569]
  [ 0.9607843  -0.8901961   0.75686276]
  ...
  [-0.96862745  0.94509804  0.8352941 ]
  [ 0.75686276 -0.8745098   0.7176471 ]
  [-0.7490196   0.654902   -0.01960784]]

 ...

 [[ 0.5137255   0.41960785  0.67058825]
  [-0.06666667  0.5294118  -0.28627452]
  [-0.8666667  -0.3254902   0.4117647 ]
  ...
  [-0.1764706   0.6392157   0.75686276]
  [-0.27058825 -0.9843137   0.39607844]
  [ 0.33333334 -0.05098039  0.75686276]]

 [[-0.827451    0.16862746  0.6313726 ]
  [-0.99215686 -0.9607843   0.94509804]
  [ 0.77254903  0.16862746 -0.94509804]
  ...
  [ 0.81960785 -0.5372549  -0.75686276]
  [-0.06666667 -0.81960785 -0.5137255 ]
  [ 0.34901962 -0.15294118  0.39607844]]

 [[ 0.7176471   0.18431373  0.7411765 ]
  [ 0.5372549   0.46666667 -0.4117647 ]
  [ 0.01960784  0.23137255 -0.28627452]
  ...
  [ 0.44313726  0.06666667 -0.62352943]
  [-0.78039217  0.88235295 -0.34117648]
  [ 0.92156863  0.16862746 -0.7254902 ]]]
import numpy as np
from PIL import Image
from paddle.vision.transforms import Normalize

normalize = Normalize(mean=[0, 0, 0],
                        std=[255, 255, 255],
                        data_format='HWC')

fake_img = Image.fromarray((np.random.rand(300, 320, 3) * 255.).astype(np.uint8))

fake_img = normalize(fake_img)
# print(fake_img.shape)
print(fake_img)
[[[0.6313726  0.93333334 0.60784316]
  [0.6666667  0.67058825 0.72156864]
  [0.7647059  0.83137256 0.99215686]
  ...
  [0.12156863 0.07450981 0.75686276]
  [0.33333334 0.93333334 0.7058824 ]
  [0.8862745  0.42745098 0.8666667 ]]

 [[0.49411765 0.58431375 0.41568628]
  [0.6509804  0.99215686 0.15294118]
  [0.73333335 0.09019608 0.77254903]
  ...
  [0.56078434 0.74509805 0.04313726]
  [0.91764706 0.74509805 0.64705884]
  [0.92941177 0.80784315 0.57254905]]

 [[0.12156863 0.3137255  0.9372549 ]
  [0.42352942 0.6862745  0.0627451 ]
  [0.62352943 0.6        0.30980393]
  ...
  [0.09411765 0.01176471 0.9372549 ]
  [0.57254905 0.7294118  0.5254902 ]
  [0.40784314 0.43137255 0.2627451 ]]

 ...

 [[0.21176471 0.3372549  0.04705882]
  [0.5647059  0.42352942 0.36862746]
  [0.3254902  0.99607843 0.3254902 ]
  ...
  [0.9607843  0.48235294 0.5921569 ]
  [0.04705882 0.13725491 0.8       ]
  [0.9254902  0.54509807 0.77254903]]

 [[0.79607844 0.2509804  0.09411765]
  [0.6392157  0.09019608 0.64705884]
  [0.2901961  0.07843138 0.45882353]
  ...
  [0.30588236 0.01176471 0.29803923]
  [0.09803922 0.6784314  0.03529412]
  [0.69803923 0.89411765 0.75686276]]

 [[0.35686275 0.7294118  0.24705882]
  [0.8392157  0.18431373 0.9647059 ]
  [0.3372549  0.92941177 0.5294118 ]
  ...
  [0.79607844 0.9254902  0.5921569 ]
  [0.24705882 0.03921569 0.12941177]
  [0.52156866 0.34117648 0.00392157]]]

Splitting the dataset

# Read the data

train_images = pd.read_csv('data/data71799/lemon_lesson/train_images.csv', usecols=['id','class_num'])

# Split into training and validation sets
all_size = len(train_images)
print(all_size)
train_size = int(all_size * 0.8)
train_image_path_list = train_images[:train_size]
val_image_path_list = train_images[train_size:]

print(len(train_image_path_list))
print(len(val_image_path_list))
1102
881
221
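
The split above slices the first 80% of the rows, which preserves class proportions only if the csv rows are already in random order. As a hedged alternative, the sketch below performs a stratified split in plain Python; `stratified_split` and the toy `samples` list are hypothetical and not part of the original project.

```python
import random
from collections import defaultdict

def stratified_split(samples, train_ratio=0.8, seed=0):
    """Split (id, label) pairs so each class keeps roughly the same train/val ratio."""
    by_class = defaultdict(list)
    for sample in samples:
        by_class[sample[1]].append(sample)

    rng = random.Random(seed)
    train, val = [], []
    for label in sorted(by_class):
        items = by_class[label]
        rng.shuffle(items)                      # randomize within the class
        cut = int(len(items) * train_ratio)     # per-class cut point
        train.extend(items[:cut])
        val.extend(items[cut:])
    return train, val

# Toy data: 10 images of class 0 and 5 images of class 1.
samples = [(f"img_{i}.jpg", 0) for i in range(10)] + [(f"img_{i}.jpg", 1) for i in range(10, 15)]
train, val = stratified_split(samples)
print(len(train), len(val))
```

Because the cut is made per class, every class appears in both splits in about the same proportion.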
# Build the Dataset
class MyDataset(paddle.io.Dataset):
    """
    Step 1: inherit from paddle.io.Dataset
    """
    def __init__(self, train_list, val_list, mode='train'):
        """
        Step 2: implement the constructor and define how the data is read
        """
        super(MyDataset, self).__init__()
        self.data = []
        # DataFrames obtained by reading the csv file with pandas
        self.train_images = train_list
        self.test_images = val_list
        if mode == 'train':
            # Rows of the training split
            for row in self.train_images.itertuples():
                self.data.append(['data/data71799/lemon_lesson/train_images/'+getattr(row, 'id'), getattr(row, 'class_num')])
        else:
            # Rows of the validation split (these images also live in train_images/)
            for row in self.test_images.itertuples():
                self.data.append(['data/data71799/lemon_lesson/train_images/'+getattr(row, 'id'), getattr(row, 'class_num')])

    def load_img(self, image_path):
        # Read the image with Pillow and convert it to RGB
        image = Image.open(image_path).convert('RGB')

        return image

    def __getitem__(self, index):
        """
        Step 3: implement __getitem__, which returns a single sample (image, label) for the given index
        """
        image = self.load_img(self.data[index][0])
        label = self.data[index][1]

        return data_transforms(image), np.array(label, dtype='int64')

    def __len__(self):
        """
        Step 4: implement __len__, which returns the total number of samples
        """
        return len(self.data)

Defining the data loaders

#train_loader
train_dataset = MyDataset(train_list=train_image_path_list, val_list=val_image_path_list, mode='train')
train_loader = paddle.io.DataLoader(train_dataset, places=paddle.CPUPlace(), batch_size=128, shuffle=True, num_workers=0)

#val_loader
val_dataset =MyDataset(train_list=train_image_path_list, val_list=val_image_path_list, mode='test')
val_loader = paddle.io.DataLoader(val_dataset, places=paddle.CPUPlace(), batch_size=128, shuffle=True, num_workers=0)
print('=============train dataset=============')
for image, label in train_dataset:
    print('image shape: {}, label: {}'.format(image.shape, label))
    break

for batch_id, data in enumerate(train_loader()):
    x_data = data[0]
    y_data = data[1]
    print(x_data)
    print(y_data)
    break

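2. Choosing a baseline

  In theory, a larger model has more fitting capacity and a larger input image preserves more information. In practice, a more complex model takes longer to train, and so does a larger input size.
At the start of a competition, begin with a simple model (such as ResNet) and quickly run through the whole training and inference pipeline. Choose the classification model according to the complexity of the task; the model with the highest reported accuracy is not necessarily the best fit for the competition.
In a real competition you can also increase the image size step by step: first let the model converge at 64 * 64, then continue training at 128 * 128, and finally at 224 * 224. This can speed up convergence.

A baseline should follow these principles:

  1. Low complexity and a simple code structure.
  2. The loss converges properly and the evaluation metric (e.g. accuracy or AUC) improves accordingly.
  3. Fast iteration, with no fancy model structure, loss function, or image preprocessing.
  4. A correct and simple test script, so that a submission receives a valid score.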

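Key point

Ways to build a network

  Paddle supports building models in two ways: Sequential and SubClass. Choose whichever suits the use case. For a purely sequential, linear network structure, Sequential lets you assemble the network more quickly than SubClass. For more complex structures, use SubClass: declare the layers in the __init__ constructor and use them in forward to implement the forward computation. This allows more flexible network structures.

Building the network with SubClass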

# Define the convolutional neural network
class MyNet(paddle.nn.Layer):
    def __init__(self, num_classes=4):
        super(MyNet, self).__init__()
        self.conv1 = paddle.nn.Conv2D(in_channels=3, out_channels=32, kernel_size=(3, 3), stride=1, padding = 1)
        # self.pool1 = paddle.nn.MaxPool2D(kernel_size=2, stride=2)

        self.conv2 = paddle.nn.Conv2D(in_channels=32, out_channels=64, kernel_size=(3,3),  stride=2, padding = 0)
        # self.pool2 = paddle.nn.MaxPool2D(kernel_size=2, stride=2)

        self.conv3 = paddle.nn.Conv2D(in_channels=64, out_channels=64, kernel_size=(3,3), stride=2, padding = 0)

        self.conv4 = paddle.nn.Conv2D(in_channels=64, out_channels=64, kernel_size=(3,3), stride=2, padding = 1)

        self.flatten = paddle.nn.Flatten()
        self.linear1 = paddle.nn.Linear(in_features=1024, out_features=64)
        self.linear2 = paddle.nn.Linear(in_features=64, out_features=num_classes)

    def forward(self, x):
        x = self.conv1(x)
        x = F.relu(x)
        # x = self.pool1(x)
        # print(x.shape)
        x = self.conv2(x)
        x = F.relu(x)
        # x = self.pool2(x)
        # print(x.shape)

        x = self.conv3(x)
        x = F.relu(x)
        # print(x.shape)
        
        x = self.conv4(x)
        x = F.relu(x)
        # print(x.shape)

        x = self.flatten(x)
        x = self.linear1(x)
        x = F.relu(x)
        x = self.linear2(x)
        return x

Building the network with Sequential


# Sequential-style network definition
# A separate name is used so that the MyNet class defined above is not overwritten.
my_net_seq = nn.Sequential(
    nn.Conv2D(in_channels=3, out_channels=32, kernel_size=(3, 3), stride=1, padding = 1),
    nn.ReLU(),
    nn.Conv2D(in_channels=32, out_channels=64, kernel_size=(3,3),  stride=2, padding = 0),
    nn.ReLU(),
    nn.Conv2D(in_channels=64, out_channels=64, kernel_size=(3,3), stride=2, padding = 0),
    nn.ReLU(),
    nn.Conv2D(in_channels=64, out_channels=64, kernel_size=(3,3), stride=2, padding = 1),
    nn.ReLU(),
    nn.Flatten(),
    nn.Linear(in_features=1024, out_features=64),  # 64 * 4 * 4 for a 32 x 32 input
    nn.ReLU(),
    nn.Linear(in_features=64, out_features=4)
)
# Wrap the model

model = paddle.Model(MyNet())

Visualizing the network structure

Use summary to print the network's structure and parameter information.


model.summary((1, 3, 32, 32))
---------------------------------------------------------------------------
 Layer (type)       Input Shape          Output Shape         Param #    
===========================================================================
   Conv2D-1       [[1, 3, 32, 32]]     [1, 32, 32, 32]          896      
   Conv2D-2      [[1, 32, 32, 32]]     [1, 64, 15, 15]        18,496     
   Conv2D-3      [[1, 64, 15, 15]]      [1, 64, 7, 7]         36,928     
   Conv2D-4       [[1, 64, 7, 7]]       [1, 64, 4, 4]         36,928     
   Flatten-1      [[1, 64, 4, 4]]         [1, 1024]              0       
   Linear-1         [[1, 1024]]            [1, 64]            65,600     
   Linear-2          [[1, 64]]              [1, 4]              260      
===========================================================================
Total params: 159,108
Trainable params: 159,108
Non-trainable params: 0
---------------------------------------------------------------------------
Input size (MB): 0.01
Forward/backward pass size (MB): 0.40
Params size (MB): 0.61
Estimated Total Size (MB): 1.02
---------------------------------------------------------------------------






{'total_params': 159108, 'trainable_params': 159108}
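
The parameter counts in the summary can be reproduced by hand. The sketch below is plain arithmetic, not a Paddle API; `conv2d_params` and `linear_params` are hypothetical helper names, and the layer shapes are the ones from MyNet.

```python
def conv2d_params(in_channels, out_channels, kernel_size):
    """Weights (out * in * k * k) plus one bias per output channel."""
    return out_channels * in_channels * kernel_size * kernel_size + out_channels

def linear_params(in_features, out_features):
    """Weight matrix plus one bias per output feature."""
    return in_features * out_features + out_features

layer_params = [
    conv2d_params(3, 32, 3),    # Conv2D-1
    conv2d_params(32, 64, 3),   # Conv2D-2
    conv2d_params(64, 64, 3),   # Conv2D-3
    conv2d_params(64, 64, 3),   # Conv2D-4
    linear_params(1024, 64),    # Linear-1
    linear_params(64, 4),       # Linear-2
]
total_params = sum(layer_params)
print(layer_params, total_params)
```

The per-layer values and the total agree with the Param # column and the total_params entry printed by model.summary.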

Key point: computing the feature map size

[Figure: feature map size calculation]
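
As a small worked example, the standard convolution output-size formula, floor((size + 2 * padding - kernel_size) / stride) + 1, can be applied layer by layer to MyNet. `conv_output_size` is a hypothetical helper written for this illustration.

```python
def conv_output_size(size, kernel_size, stride=1, padding=0):
    """Spatial output size of a convolution: floor((size + 2p - k) / s) + 1."""
    return (size + 2 * padding - kernel_size) // stride + 1

# (kernel_size, stride, padding) of the four Conv2D layers in MyNet.
layers = [(3, 1, 1), (3, 2, 0), (3, 2, 0), (3, 2, 1)]

def trace_sizes(input_size):
    """Return the feature map size after each conv layer."""
    sizes = []
    size = input_size
    for k, s, p in layers:
        size = conv_output_size(size, k, s, p)
        sizes.append(size)
    return sizes

sizes = trace_sizes(32)
flatten_features = 64 * sizes[-1] * sizes[-1]  # 64 channels after the last conv
print(sizes, flatten_features)
```

For a 32 x 32 input this yields the 32, 15, 7, 4 progression shown in the summary and 1024 flattened features. The same arithmetic with a 224 x 224 input ends at 28 x 28, i.e. 50176 features.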

# Wrap the model
# model = MyNet(num_classes=2)
# # model = mnist
# model = paddle.Model(model)

# Define the optimizer
optim = paddle.optimizer.Adam(learning_rate=0.001, parameters=model.parameters())


# Configure the model
model.prepare(
    optim,
    paddle.nn.CrossEntropyLoss(),
    Accuracy()
    )
# Alternatively, use Paddle's built-in VisualDL callback to save logs to a directory.
# callback = paddle.callbacks.VisualDL(log_dir='visualdl_log_dir')

from visualdl import LogReader, LogWriter

args={
    'logdir':'./vdl',
    'file_name':'vdlrecords.model.log',
    'iters':0,
}

# Configure VisualDL
write = LogWriter(logdir=args['logdir'], file_name=args['file_name'])
# Initialize iters to 0
iters = args['iters'] 

# Custom callback
class Callbk(paddle.callbacks.Callback):
    def __init__(self, write, iters=0):
        self.write = write
        self.iters = iters

    def on_train_batch_end(self, step, logs):

        self.iters += 1

        # Record the loss
        self.write.add_scalar(tag="loss",step=self.iters,value=logs['loss'][0])
        # Record the accuracy
        self.write.add_scalar(tag="acc",step=self.iters,value=logs['acc'])
`./vdl/vdlrecords.model.log` is exists, VisualDL will add logs to it.

# Train and evaluate the model
model.fit(train_loader,
        val_loader,
        log_freq=1,
        epochs=5,
        callbacks=Callbk(write=write, iters=iters),
        verbose=1,
        )
The loss value printed in the log is the current step, and the metric is the average value of previous step.
Epoch 1/5
step 7/7 [==============================] - loss: 0.9400 - acc: 0.4926 - 948ms/step        
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 2/2 [==============================] - loss: 0.9144 - acc: 0.5837 - 715ms/step         
Eval samples: 221
Epoch 2/5
step 7/7 [==============================] - loss: 0.6421 - acc: 0.6913 - 847ms/step        
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 2/2 [==============================] - loss: 0.7182 - acc: 0.7240 - 729ms/step         
Eval samples: 221
Epoch 3/5
step 7/7 [==============================] - loss: 0.4278 - acc: 0.7911 - 810ms/step        
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 2/2 [==============================] - loss: 0.5868 - acc: 0.7873 - 724ms/step         
Eval samples: 221
Epoch 4/5
step 7/7 [==============================] - loss: 0.3543 - acc: 0.8547 - 792ms/step        
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 2/2 [==============================] - loss: 0.4153 - acc: 0.8597 - 738ms/step         
Eval samples: 221
Epoch 5/5
step 7/7 [==============================] - loss: 0.2989 - acc: 0.8956 - 831ms/step        
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 2/2 [==============================] - loss: 0.4725 - acc: 0.8235 - 724ms/step         
Eval samples: 221
# Save the model parameters
# model.save('Hapi_MyCNN')  # save for training
model.save('Hapi_MyCNN1', False)  # save for inference

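Extended topic: visualizing the training process

  The LogWriter above writes its records to ./vdl. To view them, run visualdl --logdir ./vdl --port 8080 on the command line, then open http://127.0.0.1:8080 in a browser to see the training curves.

[Figure: training curves in VisualDL]

Tune, train, record the curves, and analyze the results.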

3. Model inference

import os, time
import matplotlib.pyplot as plt
import paddle
from PIL import Image
import numpy as np

def load_image(img_path):
    '''
    Preprocess an image for inference
    '''
    img = Image.open(img_path).convert('RGB')
    plt.imshow(img)          # draw the image
    plt.show()               # display the image
    
    # Resize
    img = img.resize((32, 32), Image.BILINEAR) # Image.BILINEAR: bilinear interpolation
    img = np.array(img).astype('float32')

    # HWC to CHW 
    img = img.transpose((2, 0, 1))
    
    # Normalize
    img = img / 255         # scale pixel values to [0, 1]
    # mean = [0.31169346, 0.25506335, 0.12432463]   
    # std = [0.34042713, 0.29819837, 0.1375536]
    # img[0] = (img[0] - mean[0]) / std[0]
    # img[1] = (img[1] - mean[1]) / std[1]
    # img[2] = (img[2] - mean[2]) / std[2]
    
    return img

def infer_img(path, model_file_path, use_gpu):
    '''
    Run model inference
    '''
    paddle.set_device('gpu:0') if use_gpu else paddle.set_device('cpu')
    model = paddle.jit.load(model_file_path)
    model.eval() # evaluation mode

    # Preprocess the image to predict
    infer_imgs = []
    infer_imgs.append(load_image(path))
    infer_imgs = np.array(infer_imgs)
    # Class names as used by the dataset: excellent, good, for processing, out of spec
    label_list = ['0:優良', '1:良', '2:加工品', '3:規格外']

    for i in range(len(infer_imgs)):
        data = infer_imgs[i]
        dy_x_data = np.array(data).astype('float32')
        dy_x_data = dy_x_data[np.newaxis,:, : ,:]
        img = paddle.to_tensor(dy_x_data)
        out = model(img)

        print(out[0])
        print(paddle.nn.functional.softmax(out)[0]) # not needed if the model already includes softmax

        lab = np.argmax(out.numpy())  # argmax(): index of the largest value
        print("Sample: {}, predicted as: {}".format(path, label_list[lab]))

    print("*********************************************")
image_path = []

for root, dirs, files in os.walk('work/'):
    # Walk through the images in the work/ folder
    for f in files:
        image_path.append(os.path.join(root, f))

for i in range(len(image_path)):
    # The path matches the name used in model.save('Hapi_MyCNN1', False)
    infer_img(path=image_path[i], use_gpu=True, model_file_path="Hapi_MyCNN1")
    # time.sleep(0.5) # keep the output from interleaving
    break

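Tips for choosing a baseline

  • Model: a low-complexity model allows fast iteration.
  • Optimizer: Adam is recommended, or SGD.
  • Loss function: cross entropy for multi-class classification.
  • Metric: follow the competition's evaluation metric.
  • Data augmentation: it can be empty at first, or just a single HorizontalFlip.
  • Image resolution: start with small images, e.g. 224 * 224.

How to get better at building baselines

  • A robust baseline is a good starting point, and a good start is half the battle.
  • Read the open-source code of top solutions; keep what is good and discard what is not.
  • Accumulate experience, practice often, learn by imitating others, and eventually develop a workflow of your own.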

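Summary of the full competition workflow

Part 1 (data processing)

  1. Data preprocessing
  2. Custom dataset
  3. Define the data loaders

Part 2 (model training)

  1. Build the network
  2. Wrap the model (a Model object is a neural network with training, evaluation, and inference capabilities)
  3. Configure the model (set up the components it needs, such as the optimizer, loss function, and evaluation metric)
  4. Train and validate the model

Part 3 (submitting results)

  1. Model inference
  2. Generate the submission file (pandas)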

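Common tuning techniques

Why do we need tuning techniques?

  • Tuning is a very important step in competitions, and it is unavoidable in day-to-day work as well.
  • A suitable learning rate and an unsuitable one can produce very different results.
  • A large share of the gains from model optimization actually comes from tuning.
  • In competitions where the data has no obvious distinguishing characteristics, the final ranking often depends on your tuning skill.

Next, following the three parts summarized above, I will introduce the tuning techniques commonly used in each step.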

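Data processing

label shuffling

  First, sort the original image list by label.
Then count the samples in each class and find the size of the largest class.
For every class, generate a random permutation whose length equals that largest size.
Take each number in the permutation modulo the class's own sample count to get an index, and use those indices to pick images from the class, producing a random image list for that class.
Finally, concatenate the random lists of all classes and apply random shuffling to obtain the final image list used for training.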
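
The permutation-and-modulo step is easiest to see on a toy case. The sketch below uses made-up class sizes (5 and 2 samples) and only illustrates how the indices are produced; it is not the full resampling routine.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy class sizes: class 0 has 5 samples, class 1 has 2 samples.
class_sizes = {0: 5, 1: 2}
max_num = max(class_sizes.values())

balanced = {}
for label, n in class_sizes.items():
    # A random permutation of 0..max_num-1, folded into the valid index
    # range of this class by taking the remainder.
    balanced[label] = rng.permutation(max_num) % n

for label, idx in balanced.items():
    print(label, idx)
```

Every class ends up with max_num indices: the largest class is merely reordered, while smaller classes repeat their samples until they reach the same length.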

[Figure: illustration of label shuffling]

# Label shuffling

def labelShuffling(dataFrame, groupByName = 'class_num'):

    groupDataFrame = dataFrame.groupby(by=groupByName)
    labels = groupDataFrame.size()
    print("length of label is ", len(labels))
    maxNum = max(labels)
    parts = []
    for i in range(len(labels)):
        print("Processing label  :", i)
        tmpGroupBy = groupDataFrame.get_group(i)
        # Random permutation of 0..maxNum-1, folded into this class's index range
        createdShuffleLabels = np.random.permutation(np.array(range(maxNum))) % labels[i]
        print("Num of the label is : ", labels[i])
        parts.append(tmpGroupBy.iloc[createdShuffleLabels])
        print("Done")
    # DataFrame.append was removed in pandas 2.0, so the parts are concatenated instead
    lst = pd.concat(parts, ignore_index=True)
    # lst.to_csv('test1.csv', index=False)
    return lst


from sklearn.utils import shuffle

# Read the data
train_images = pd.read_csv('data/data71799/lemon_lesson/train_images.csv', usecols=['id','class_num'])

# Apply label shuffling, then shuffle the rows

df = labelShuffling(train_images)
df = shuffle(df)

image_path_list = df['id'].values
label_list = df['class_num'].values
label_list = paddle.to_tensor(label_list, dtype='int64')
label_list = paddle.nn.functional.one_hot(label_list, num_classes=4)

# Split into training and validation sets
all_size = len(image_path_list)
train_size = int(all_size * 0.8)
train_image_path_list = image_path_list[:train_size]
train_label_list = label_list[:train_size]
val_image_path_list = image_path_list[train_size:]
val_label_list = label_list[train_size:]
length of label is  4
Processing label  : 0
Num of the label is :  400
Done
Processing label  : 1
Num of the label is :  255
Done
Processing label  : 2
Num of the label is :  235
Done
Processing label  : 3
Num of the label is :  212
Done

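Image augmentation

  To obtain more data, we only need to make small changes to the existing dataset, such as flipping or cropping. The model treats the slightly modified images as different images. There are two common approaches to data augmentation:
The first is offline augmentation, where the augmented images are generated ahead of time and stored. It is preferred for relatively small datasets.
The second is online (on-the-fly) augmentation, where the transforms are applied as each sample is loaded. It is preferred for larger datasets.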
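
A rough sketch of online augmentation with NumPy is shown below, assuming images are HWC arrays. `random_flip` is a hypothetical helper for illustration and is not the Paddle transform used later in this article.

```python
import numpy as np

def random_flip(image, rng, p_horizontal=0.5, p_vertical=0.5):
    """Randomly flip an HWC image array left-right and/or up-down."""
    if rng.random() < p_horizontal:
        image = image[:, ::-1, :]   # reverse the width axis
    if rng.random() < p_vertical:
        image = image[::-1, :, :]   # reverse the height axis
    return image

rng = np.random.default_rng(0)
image = np.arange(2 * 3 * 3).reshape(2, 3, 3)  # toy 2x3 "image" with 3 channels

# Online augmentation: a fresh random variant is produced every time a sample is read.
augmented = [random_flip(image, rng) for _ in range(4)]
print([a.shape for a in augmented])
```

Offline augmentation would instead save such variants to disk once and train on the enlarged set.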

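Preprocessing methods in Paddle 2.0

  Common augmentations for image classification include flipping, rotation, random cropping, color noise, and translation. Which ones to use depends on the task and on the characteristics of the data. For any given competition, try different augmentation methods repeatedly, because they can have a large effect on model accuracy.

[Figure: preprocessing transforms in Paddle 2.0]

Horizontal flip & vertical flip

[Figures: examples of horizontal and vertical flips]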

import numpy as np
from PIL import Image
from paddle.vision.transforms import RandomHorizontalFlip

transform = RandomHorizontalFlip(prob=0.5)  # prob is the flip probability, not an image size

fake_img = Image.fromarray((np.random.rand(300, 320, 3) * 255.).astype(np.uint8))

fake_img = transform(fake_img)
print(fake_img.size)
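# Define the data preprocessing pipeline
data_transforms = T.Compose([
    T.Resize(size=(224, 224)),
    T.RandomHorizontalFlip(prob=0.5),    # prob is the flip probability, not an image size
    T.RandomVerticalFlip(prob=0.5),
    T.Transpose(),    # HWC -> CHW
    T.Normalize(
        mean=[0, 0, 0],        # scale pixel values to [0, 1]
        std=[255, 255, 255],
        to_rgb=True)    
])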
# Build the Dataset
class MyDataset(paddle.io.Dataset):
    """
    Step 1: inherit from paddle.io.Dataset
    """
    def __init__(self, train_img_list, val_img_list,train_label_list,val_label_list, mode='train'):
        """
        Step 2: implement the constructor, define how the data is read, and select the training or validation split
        """
        super(MyDataset, self).__init__()
        self.img = []
        self.label = []
        # Image names and one-hot labels prepared above
        self.train_images = train_img_list
        self.test_images = val_img_list
        self.train_label = train_label_list
        self.test_label = val_label_list
        if mode == 'train':
            # Samples of the training split
            for img,la in zip(self.train_images, self.train_label):
                self.img.append('data/data71799/lemon_lesson/train_images/'+img)
                self.label.append(la)
        else:
            # Samples of the validation split (these images also live in train_images/)
            for img,la in zip(self.test_images, self.test_label):
                self.img.append('data/data71799/lemon_lesson/train_images/'+img)
                self.label.append(la)

    def load_img(self, image_path):
        # Read the image with Pillow and convert it to RGB
        image = Image.open(image_path).convert('RGB')
        return image

    def __getitem__(self, index):
        """
        Step 3: implement __getitem__, which returns a single sample (image, label) for the given index
        """
        image = self.load_img(self.img[index])
        label = self.label[index]
        # label = paddle.to_tensor(label)
        
        # Apply label smoothing to the one-hot label
        return data_transforms(image), paddle.nn.functional.label_smooth(label)

    def __len__(self):
        """
        Step 4: implement __len__, which returns the total number of samples
        """
        return len(self.img)
#train_loader
train_dataset = MyDataset(train_img_list=train_image_path_list, val_img_list=val_image_path_list, train_label_list=train_label_list, val_label_list=val_label_list, mode='train')
train_loader = paddle.io.DataLoader(train_dataset, places=paddle.CPUPlace(), batch_size=32, shuffle=True, num_workers=0)

#val_loader (built from val_dataset, not train_dataset)
val_dataset = MyDataset(train_img_list=train_image_path_list, val_img_list=val_image_path_list, train_label_list=train_label_list, val_label_list=val_label_list, mode='test')
val_loader = paddle.io.DataLoader(val_dataset, places=paddle.CPUPlace(), batch_size=32, shuffle=True, num_workers=0)

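Model training

Label smoothing (LSR)

  In classification, the last layer is usually a fully connected layer whose outputs are matched against one-hot targets. Combining this encoding with parameter updates that minimize the cross-entropy loss causes some problems. It encourages the model to make the score of the correct class far larger than the others; in other words, the model becomes overconfident in its predictions. Manual annotations, however, may contain errors, and trusting the labels too much leads to overfitting.
Label smoothing mitigates this problem. The idea is to lower our confidence in the labels: for example, reduce the target value from 1 to 0.9, or raise it from 0 to 0.1. In short, label smoothing is a regularization method that adds noise to the label y to constrain the model and reduce overfitting.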

$$\tilde{y}_k = (1 - \epsilon)\, y_k + \epsilon\, \mu_k$$

Here $1 - \epsilon$ and $\epsilon$ are the two weights, $\tilde{y}_k$ is the smoothed label, and $\mu$ is usually the uniform distribution.
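
A minimal NumPy sketch of this formula, assuming a uniform prior over the classes, is shown below. `smooth_labels` is a hypothetical helper written for illustration; it is not the Paddle function used in the dataset class.

```python
import numpy as np

def smooth_labels(one_hot, epsilon=0.1):
    """Mix a one-hot target with a uniform distribution over the classes."""
    one_hot = np.asarray(one_hot, dtype=np.float64)
    num_classes = one_hot.shape[-1]
    uniform = 1.0 / num_classes          # mu_k for a uniform prior
    return (1.0 - epsilon) * one_hot + epsilon * uniform

target = [1, 0, 0, 0]
smoothed = smooth_labels(target, epsilon=0.1)
print(smoothed)
```

With four classes and epsilon = 0.1 the correct class gets 0.9 + 0.1 / 4 = 0.925 and every other class gets 0.025, which are the values that appear in the label tensors printed in this section.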

print('=============train dataset=============')
for image, label in train_dataset:
    print('image shape: {}, label: {}'.format(image.shape, label))
    break
=============train dataset=============
image shape: (3, 224, 224), label: Tensor(shape=[4], dtype=float32, place=CUDAPlace(0), stop_gradient=True,
       [0.92499995, 0.02500000, 0.02500000, 0.02500000])
print('=============train dataset=============')
for image, label in train_dataset:
    print('image shape: {}, label: {}'.format(image.shape, label))
    break
=============train dataset=============
image shape: (3, 224, 224), label: Tensor(shape=[4], dtype=float32, place=CUDAPlace(0), stop_gradient=True,
       [0.02500000, 0.02500000, 0.02500000, 0.92499995])

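Extended topic: one-hot encoding

One-hot encoding represents a categorical variable as a binary vector. The categorical values are first mapped to integers; each integer is then represented as a binary vector that is all zeros except for a 1 at the index of that integer.

Encoding of discrete features falls into two cases:

  1. The values have no natural order, e.g. color: [red, blue]; use one-hot encoding.
  2. The values have a natural order, e.g. size: [X, XL, XXL]; use a numeric mapping such as {X: 1, XL: 2, XXL: 3}, i.e. label encoding.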
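
Both cases can be sketched with the toy features from the list above. The `one_hot` helper and the variable names are hypothetical, chosen only for this example.

```python
import numpy as np

def one_hot(indices, num_classes):
    """Turn integer class ids into one-hot rows."""
    return np.eye(num_classes, dtype=np.int64)[np.asarray(indices)]

# Unordered categories: map to integer ids, then one-hot encode.
colors = ["red", "blue", "red"]
color_to_id = {name: i for i, name in enumerate(sorted(set(colors)))}
color_one_hot = one_hot([color_to_id[c] for c in colors], len(color_to_id))
print(color_one_hot)

# Ordered categories: a plain integer mapping keeps the order.
size_to_rank = {"X": 1, "XL": 2, "XXL": 3}
size_codes = [size_to_rank[s] for s in ["XL", "X", "XXL"]]
print(size_codes)
```

The integer ids assigned to the colors are arbitrary, which is exactly why they are expanded into one-hot vectors rather than fed to a model as numbers.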

Choosing an optimizer

Adam with init_lr=3e-4: 3e-4 is often cited as a good default initial learning rate for Adam. SGD demands more tuning skill.

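Learning-rate scheduling

Why adjust the learning rate?

  When optimizing an objective with gradient descent, the learning rate should get smaller as we approach the global minimum of the loss, so that the model can get as close to that point as possible.

[Figure: convergence with a fixed learning rate versus a decaying learning rate]

As the figure above shows, with a fixed learning rate the model keeps oscillating within a relatively large region around the optimum once it has converged. If the learning rate is reduced as the number of iterations grows, the oscillation at convergence stays within a smaller region. (The curve oscillates on its way to the optimum because every mini-batch contains noise.) Choosing a suitable learning rate is therefore crucial for training. Below are some methods for adjusting the learning rate.

There are many ways to schedule the learning rate, and linear warmup is an important one: it starts from a small learning rate and increases it linearly to the target value over the first training steps.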
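
A linear warmup schedule can be sketched in plain Python as follows. `linear_warmup_lr` is a hypothetical helper, and the settings (0 to 0.5 over 20 steps, then constant) are chosen to mirror the learning rates that appear in the training log later in this article.

```python
def linear_warmup_lr(step, warmup_steps, start_lr, end_lr, base_lr):
    """Learning rate at a given step: linear ramp during warmup, then base_lr."""
    if step < warmup_steps:
        return start_lr + (end_lr - start_lr) * step / warmup_steps
    return base_lr

# Ramp from 0 to 0.5 over 20 steps, then hold 0.5.
schedule = [linear_warmup_lr(s, 20, 0.0, 0.5, 0.5) for s in range(25)]
print(schedule[:5], schedule[20:])
```

Starting small gives the randomly initialized weights a few steps to settle before the full learning rate is applied.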

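Learning-rate scheduling APIs in Paddle 2.0

As noted earlier, the learning rate should shrink as training approaches the minimum of the loss. Cosine annealing does this by following a cosine function: as x increases, the cosine value first decreases slowly, then faster, and then slowly again. Applied to the learning rate, this decay pattern gives good results at very little computational cost.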
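
The slow-fast-slow shape is easy to check numerically. The sketch below uses the common closed-form expression for cosine annealing; `cosine_annealing_lr` is a hypothetical helper, and a framework scheduler may compute the values differently (for example recursively), so treat it as an illustration of the curve only.

```python
import math

def cosine_annealing_lr(step, base_lr, t_max, eta_min=0.0):
    """Closed-form cosine annealing over one half-period (0 <= step <= t_max)."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1.0 + math.cos(math.pi * step / t_max))

lrs = [cosine_annealing_lr(s, base_lr=0.5, t_max=10) for s in range(11)]
print([round(lr, 4) for lr in lrs])

# Per-step decrease: small at the start, largest in the middle, small again at the end.
drops = [a - b for a, b in zip(lrs, lrs[1:])]
print([round(d, 4) for d in drops])
```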

[Figure: cosine annealing learning-rate curve]

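Putting the techniques into practice

This part trains a MobileNetV2 model and applies the techniques described above.

from work.mobilenet import MobileNetV2

# Wrap the model
model_res = MobileNetV2(class_dim=4)
model = paddle.Model(model_res)

# Define the learning-rate schedule and the optimizer
# The logged run below reports a LinearWarmup schedule (0 -> 0.5 over 20 steps),
# so it is the active scheduler; the cosine schedule is kept as an alternative.
# scheduler = paddle.optimizer.lr.CosineAnnealingDecay(learning_rate=0.5, T_max=10, verbose=True)
scheduler = paddle.optimizer.lr.LinearWarmup(learning_rate=0.5, warmup_steps=20, start_lr=0, end_lr=0.5, verbose=True)
optim = paddle.optimizer.SGD(learning_rate=scheduler, parameters=model.parameters())
# optim = paddle.optimizer.Adam(learning_rate=0.001, parameters=model.parameters())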
Epoch 0: LinearWarmup set learning rate to 0.0.

Extended topic: soft labels & hard labels

[Figure: soft labels versus hard labels]
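
To make the distinction concrete, the NumPy sketch below computes cross entropy once with a hard (integer) label and once with a soft (probability vector) label. The function names and the logits are made up for illustration and do not reproduce Paddle's implementation.

```python
import numpy as np

def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def cross_entropy_hard(logits, class_id):
    """Cross entropy with an integer (hard) label."""
    return -log_softmax(logits)[class_id]

def cross_entropy_soft(logits, target_probs):
    """Cross entropy with a probability vector (soft label)."""
    return -(np.asarray(target_probs) * log_softmax(logits)).sum()

logits = np.array([2.0, 0.5, 0.1, -1.0])
hard = cross_entropy_hard(logits, 0)
soft_one_hot = cross_entropy_soft(logits, [1.0, 0.0, 0.0, 0.0])
soft_smoothed = cross_entropy_soft(logits, [0.925, 0.025, 0.025, 0.025])
print(hard, soft_one_hot, soft_smoothed)
```

A one-hot soft label gives the same loss as the hard label, so the hard-label case is a special case of the soft-label one. Smoothed labels are no longer one-hot, which is why the loss in the next cell is created with soft_label=True.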

# Configure the model (soft_label=True because the labels are smoothed one-hot vectors)
model.prepare(
    optim,
    paddle.nn.CrossEntropyLoss(soft_label=True),
    Accuracy()
    )


# Train and evaluate the model
model.fit(train_loader,
        val_loader,
        log_freq=1,
        epochs=5,
        # callbacks=Callbk(write=write, iters=iters),
        verbose=1,
        )

The loss value printed in the log is the current step, and the metric is the average value of previous step.
Epoch 1/5
step  1/40 [..............................] - loss: 1.6536 - acc: 0.1875 - ETA: 17s - 450ms/stepEpoch 1: LinearWarmup set learning rate to 0.025.
step  2/40 [>.............................] - loss: 1.5561 - acc: 0.2969 - ETA: 13s - 367ms/stepEpoch 2: LinearWarmup set learning rate to 0.05.
step  3/40 [=>............................] - loss: 1.4161 - acc: 0.2812 - ETA: 13s - 363ms/stepEpoch 3: LinearWarmup set learning rate to 0.075.
step  4/40 [==>...........................] - loss: 1.5118 - acc: 0.2891 - ETA: 13s - 372ms/stepEpoch 4: LinearWarmup set learning rate to 0.1.
step  5/40 [==>...........................] - loss: 2.2180 - acc: 0.2750 - ETA: 12s - 363ms/stepEpoch 5: LinearWarmup set learning rate to 0.125.
step  6/40 [===>..........................] - loss: 3.7303 - acc: 0.2656 - ETA: 12s - 361ms/stepEpoch 6: LinearWarmup set learning rate to 0.15.
step  7/40 [====>.........................] - loss: 6.1175 - acc: 0.2634 - ETA: 11s - 355ms/stepEpoch 7: LinearWarmup set learning rate to 0.175.
step  8/40 [=====>........................] - loss: 7.1091 - acc: 0.2773 - ETA: 11s - 356ms/stepEpoch 8: LinearWarmup set learning rate to 0.2.
step  9/40 [=====>........................] - loss: 8.5457 - acc: 0.2743 - ETA: 10s - 352ms/stepEpoch 9: LinearWarmup set learning rate to 0.225.
step 10/40 [======>.......................] - loss: 8.1497 - acc: 0.2625 - ETA: 10s - 352ms/stepEpoch 10: LinearWarmup set learning rate to 0.25.
step 11/40 [=======>......................] - loss: 13.1281 - acc: 0.2557 - ETA: 10s - 352ms/stepEpoch 11: LinearWarmup set learning rate to 0.275.
step 12/40 [========>.....................] - loss: 12.6392 - acc: 0.2578 - ETA: 9s - 349ms/step Epoch 12: LinearWarmup set learning rate to 0.3.
step 13/40 [========>.....................] - loss: 12.2208 - acc: 0.2548 - ETA: 9s - 347ms/stepEpoch 13: LinearWarmup set learning rate to 0.325.
step 14/40 [=========>....................] - loss: 13.5863 - acc: 0.2500 - ETA: 8s - 344ms/stepEpoch 14: LinearWarmup set learning rate to 0.35.
step 15/40 [==========>...................] - loss: 10.0763 - acc: 0.2417 - ETA: 8s - 342ms/stepEpoch 15: LinearWarmup set learning rate to 0.375.
step 16/40 [===========>..................] - loss: 10.9785 - acc: 0.2383 - ETA: 8s - 340ms/stepEpoch 16: LinearWarmup set learning rate to 0.4.
step 17/40 [===========>..................] - loss: 11.0326 - acc: 0.2316 - ETA: 7s - 339ms/stepEpoch 17: LinearWarmup set learning rate to 0.425.
step 18/40 [============>.................] - loss: 10.4662 - acc: 0.2361 - ETA: 7s - 338ms/stepEpoch 18: LinearWarmup set learning rate to 0.45.
step 19/40 [=============>................] - loss: 8.3447 - acc: 0.2368 - ETA: 7s - 337ms/step Epoch 19: LinearWarmup set learning rate to 0.475.
step 20/40 [==============>...............] - loss: 7.1285 - acc: 0.2359 - ETA: 6s - 335ms/stepEpoch 20: LinearWarmup set learning rate to 0.5.
step 21/40 [==============>...............] - loss: 6.2685 - acc: 0.2321 - ETA: 6s - 334ms/stepEpoch 21: LinearWarmup set learning rate to 0.5.
step 22/40 [===============>..............] - loss: 4.7648 - acc: 0.2315 - ETA: 5s - 333ms/stepEpoch 22: LinearWarmup set learning rate to 0.5.
step 23/40 [================>.............] - loss: 8.0146 - acc: 0.2351 - ETA: 5s - 332ms/stepEpoch 23: LinearWarmup set learning rate to 0.5.
step 24/40 [=================>............] - loss: 7.5062 - acc: 0.2396 - ETA: 5s - 331ms/stepEpoch 24: LinearWarmup set learning rate to 0.5.
step 25/40 [=================>............] - loss: 4.4639 - acc: 0.2400 - ETA: 4s - 331ms/stepEpoch 25: LinearWarmup set learning rate to 0.5.
step 26/40 [==================>...........] - loss: 8.2850 - acc: 0.2368 - ETA: 4s - 330ms/stepEpoch 26: LinearWarmup set learning rate to 0.5.
step 27/40 [===================>..........] - loss: 4.4414 - acc: 0.2396 - ETA: 4s - 330ms/stepEpoch 27: LinearWarmup set learning rate to 0.5.
step 28/40 [====================>.........] - loss: 2.1917 - acc: 0.2422 - ETA: 3s - 329ms/stepEpoch 28: LinearWarmup set learning rate to 0.5.
step 29/40 [====================>.........] - loss: 1.3481 - acc: 0.2457 - ETA: 3s - 329ms/stepEpoch 29: LinearWarmup set learning rate to 0.5.
step 30/40 [=====================>........] - loss: 1.3644 - acc: 0.2510 - ETA: 3s - 328ms/stepEpoch 30: LinearWarmup set learning rate to 0.5.
step 31/40 [======================>.......] - loss: 1.8781 - acc: 0.2560 - ETA: 2s - 328ms/stepEpoch 31: LinearWarmup set learning rate to 0.5.
step 32/40 [=======================>......] - loss: 1.3190 - acc: 0.2627 - ETA: 2s - 327ms/stepEpoch 32: LinearWarmup set learning rate to 0.5.
step 33/40 [=======================>......] - loss: 1.6382 - acc: 0.2680 - ETA: 2s - 327ms/stepEpoch 33: LinearWarmup set learning rate to 0.5.
step 34/40 [========================>.....] - loss: 3.0187 - acc: 0.2675 - ETA: 1s - 327ms/stepEpoch 34: LinearWarmup set learning rate to 0.5.
step 35/40 [=========================>....] - loss: 1.3880 - acc: 0.2696 - ETA: 1s - 326ms/stepEpoch 35: LinearWarmup set learning rate to 0.5.
step 36/40 [==========================>...] - loss: 1.6008 - acc: 0.2691 - ETA: 1s - 327ms/stepEpoch 36: LinearWarmup set learning rate to 0.5.
step 37/40 [==========================>...] - loss: 1.3834 - acc: 0.2711 - ETA: 0s - 326ms/stepEpoch 37: LinearWarmup set learning rate to 0.5.
step 38/40 [===========================>..] - loss: 1.2529 - acc: 0.2763 - ETA: 0s - 326ms/stepEpoch 38: LinearWarmup set learning rate to 0.5.
step 39/40 [============================>.] - loss: 1.8549 - acc: 0.2788 - ETA: 0s - 326ms/stepEpoch 39: LinearWarmup set learning rate to 0.5.
Epoch 40: LinearWarmup set learning rate to 0.5.
step 40/40 [==============================] - loss: 2.2793 - acc: 0.2766 - 327ms/step          
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 40/40 [==============================] - loss: 2.0073 - acc: 0.2414 - 314ms/step        
Eval samples: 1280
Epoch 2/5
step  1/40 [..............................] - loss: 2.2807 - acc: 0.3750 - ETA: 14s - 373ms/stepEpoch 41: LinearWarmup set learning rate to 0.5.
step  2/40 [>.............................] - loss: 1.9062 - acc: 0.3906 - ETA: 13s - 344ms/stepEpoch 42: LinearWarmup set learning rate to 0.5.
step  3/40 [=>............................] - loss: 1.4720 - acc: 0.4062 - ETA: 12s - 335ms/stepEpoch 43: LinearWarmup set learning rate to 0.5.
step  4/40 [==>...........................] - loss: 1.3937 - acc: 0.3750 - ETA: 11s - 333ms/stepEpoch 44: LinearWarmup set learning rate to 0.5.
step  5/40 [==>...........................] - loss: 1.2439 - acc: 0.4188 - ETA: 11s - 335ms/stepEpoch 45: LinearWarmup set learning rate to 0.5.
step  6/40 [===>..........................] - loss: 2.9683 - acc: 0.4219 - ETA: 11s - 339ms/stepEpoch 46: LinearWarmup set learning rate to 0.5.
step  7/40 [====>.........................] - loss: 2.6396 - acc: 0.4107 - ETA: 11s - 340ms/stepEpoch 47: LinearWarmup set learning rate to 0.5.
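
The learning-rate messages in the log above are consistent with a linear warm-up: the rate rises by 0.025 per training step until it reaches 0.5 at step 20, then stays there. Note that the counter the scheduler prints as "Epoch N" advances once per step, not once per epoch. The sketch below reproduces those values in plain Python. The helper name `linear_warmup_lr` is hypothetical, and its defaults (20 warm-up steps, 0 to 0.5) are read off the log; it illustrates the schedule and is not PaddlePaddle's implementation.

```python
def linear_warmup_lr(step, warmup_steps=20, start_lr=0.0, end_lr=0.5, base_lr=0.5):
    """Learning rate after `step` scheduler updates (hypothetical helper)."""
    if step < warmup_steps:
        # Linear interpolation between start_lr and end_lr during warm-up
        return start_lr + (end_lr - start_lr) * step / warmup_steps
    # After warm-up, fall back to the base learning rate
    return base_lr


schedule = [round(linear_warmup_lr(s), 3) for s in range(23)]
print(schedule[:5])     # [0.0, 0.025, 0.05, 0.075, 0.1]
print(schedule[19:23])  # [0.475, 0.5, 0.5, 0.5]
```

A peak rate of 0.5 is large for plain SGD, which may explain why the loss in the log spikes during warm-up before it settles.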

Complete code

Data loading

# Import the required libraries
from sklearn.utils import shuffle
import os
import pandas as pd
import numpy as np
from PIL import Image

import paddle
import paddle.nn as nn
from paddle.io import Dataset
import paddle.vision.transforms as T
import paddle.nn.functional as F
from paddle.metric import Accuracy

import warnings
warnings.filterwarnings("ignore")

# Read the data
train_images = pd.read_csv('data/data71799/lemon_lesson/train_images.csv', usecols=['id','class_num'])

# Label shuffling: oversample every class to the size of the largest class

def labelShuffling(dataFrame, groupByName='class_num'):

    groupDataFrame = dataFrame.groupby(by=groupByName)
    labels = groupDataFrame.size()
    print("length of label is ", len(labels))
    maxNum = max(labels)
    parts = []
    for i in range(len(labels)):
        print("Processing label  :", i)
        tmpGroupBy = groupDataFrame.get_group(i)
        # Shuffle 0..maxNum-1, then wrap with modulo so that the rows of this
        # class are reused until the class has maxNum rows
        createdShuffleLabels = np.random.permutation(np.array(range(maxNum))) % labels[i]
        print("Num of the label is : ", labels[i])
        parts.append(tmpGroupBy.iloc[createdShuffleLabels])
        print("Done")
    # Concatenate all resampled groups into one DataFrame
    lst = pd.concat(parts, ignore_index=True)
    # lst.to_csv('test1.csv', index=False)
    return lst

# Split into training and validation sets
all_size = len(train_images)
# print(all_size)
train_size = int(all_size * 0.8)
train_image_list = train_images[:train_size]
val_image_list = train_images[train_size:]

df = labelShuffling(train_image_list)
df = shuffle(df)

train_image_path_list = df['id'].values
label_list = df['class_num'].values
label_list = paddle.to_tensor(label_list, dtype='int64')
train_label_list = paddle.nn.functional.one_hot(label_list, num_classes=4)

val_image_path_list = val_image_list['id'].values
val_label_list = val_image_list['class_num'].values
val_label_list = paddle.to_tensor(val_label_list, dtype='int64')
val_label_list = paddle.nn.functional.one_hot(val_label_list, num_classes=4)

# Define the data preprocessing
data_transforms = T.Compose([
    T.Resize(size=(224, 224)),
    T.RandomHorizontalFlip(0.5),    # the argument is the flip probability
    T.RandomVerticalFlip(0.5),
    T.Transpose(),    # HWC -> CHW
    T.Normalize(
        mean=[0, 0, 0],        # scale pixel values to [0, 1]
        std=[255, 255, 255],
        to_rgb=True)    
])

length of label is  4
Processing label  : 0
Num of the label is :  321
Done
Processing label  : 1
Num of the label is :  207
Done
Processing label  : 2
Num of the label is :  181
Done
Processing label  : 3
Num of the label is :  172
Done
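
The counts printed above (321, 207, 181, 172) show what label shuffling does here: every class is resampled to the size of the largest one, so the balanced training set has 4 × 321 = 1284 rows, which at a batch size of 32 gives the 41 steps per epoch seen in the training log. The sketch below replays the index trick (shuffle `0..max-1`, then take the remainder modulo the class size) using only the standard library. The helper name `label_shuffle_indices` and the fixed seed are assumptions made for illustration.

```python
import random


def label_shuffle_indices(class_size, max_size, rng):
    """Row positions within one class that oversample it to max_size rows."""
    positions = list(range(max_size))
    rng.shuffle(positions)
    # The modulo wraps positions beyond the class size back onto existing rows
    return [p % class_size for p in positions]


class_sizes = {0: 321, 1: 207, 2: 181, 3: 172}  # counts printed above
max_size = max(class_sizes.values())
rng = random.Random(0)  # fixed seed, assumed here only for reproducibility

resampled = {c: label_shuffle_indices(n, max_size, rng)
             for c, n in class_sizes.items()}

for c, idx in resampled.items():
    # rows drawn for the class, and how many distinct original rows they cover
    print(c, len(idx), len(set(idx)))

print(sum(len(idx) for idx in resampled.values()))  # 1284
```

Every original row is still used at least once; minority-class rows simply appear two times each on average.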
a = [x for x in train_image_path_list if x in val_image_path_list]
len(a)
146
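
The list comprehension above counts every entry of the training list whose file name also occurs in the validation list, so an image duplicated by label shuffling is counted once per copy. A set-based check, sketched below, reports both that count and the number of distinct overlapping files, and avoids rescanning the validation list for each training entry. The function name `overlap_report` and the file names are hypothetical.

```python
def overlap_report(train_ids, val_ids):
    """Return (hits counting repeats, distinct overlapping ids)."""
    val_set = set(val_ids)
    repeated_hits = sum(1 for x in train_ids if x in val_set)
    distinct_hits = len(set(train_ids) & val_set)
    return repeated_hits, distinct_hits


# Hypothetical file names; "a.jpg" appears twice, as it would after oversampling
train_ids = ["a.jpg", "b.jpg", "a.jpg", "c.jpg"]
val_ids = ["a.jpg", "d.jpg"]

print(overlap_report(train_ids, val_ids))  # (2, 1)
```

Any non-zero overlap means the validation accuracy is partly measured on images the model has already trained on.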
# Build the Dataset
class MyDataset(paddle.io.Dataset):
    """
    Step 1: inherit from paddle.io.Dataset
    """
    def __init__(self, train_img_list, val_img_list,train_label_list,val_label_list, mode='train'):
        """
        Step 2: implement the constructor, define how the data is read,
        and select the training or validation split
        """
        super(MyDataset, self).__init__()
        self.img = []
        self.label = []
        # File names and labels prepared above with pandas
        self.train_images = train_img_list
        self.test_images = val_img_list
        self.train_label = train_label_list
        self.test_label = val_label_list
        if mode == 'train':
            # Use the training split
            for img,la in zip(self.train_images, self.train_label):
                self.img.append('data/data71799/lemon_lesson/train_images/'+img)
                self.label.append(la)
        else:
            # Use the validation split (its images also live under train_images/)
            for img,la in zip(self.test_images, self.test_label):
                self.img.append('data/data71799/lemon_lesson/train_images/'+img)
                self.label.append(la)

    def load_img(self, image_path):
        # Read the image with Pillow and convert it to RGB
        image = Image.open(image_path).convert('RGB')
        return image

    def __getitem__(self, index):
        """
        Step 3: implement __getitem__, which defines how one sample is fetched for a given index and returns it as (image, label)
        """
        image = self.load_img(self.img[index])
        label = self.label[index]
        # label = paddle.to_tensor(label)
        
        return data_transforms(image), paddle.nn.functional.label_smooth(label)

    def __len__(self):
        """
        Step 4: implement __len__, which returns the total number of samples
        """
        return len(self.img)
# train_loader
train_dataset = MyDataset(train_img_list=train_image_path_list, val_img_list=val_image_path_list, train_label_list=train_label_list, val_label_list=val_label_list, mode='train')
train_loader = paddle.io.DataLoader(train_dataset, places=paddle.CPUPlace(), batch_size=32, shuffle=True, num_workers=0)

# val_loader

val_dataset = MyDataset(train_img_list=train_image_path_list, val_img_list=val_image_path_list, train_label_list=train_label_list, val_label_list=val_label_list, mode='test')
val_loader = paddle.io.DataLoader(val_dataset, places=paddle.CPUPlace(), batch_size=32, shuffle=True, num_workers=0)
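
The dataset above returns label-smoothed one-hot targets, which is why the training section uses a cross-entropy loss with soft labels. The plain-Python sketch below shows the arithmetic under two assumptions: a smoothing factor of 0.1 with a uniform prior over the 4 classes, and a loss equal to the negative sum of target times log-softmax. The function names are hypothetical stand-ins written for illustration, not PaddlePaddle calls.

```python
import math


def one_hot(label, num_classes):
    return [1.0 if k == label else 0.0 for k in range(num_classes)]


def label_smooth(target, epsilon=0.1):
    """Blend a one-hot target with a uniform distribution over the classes."""
    k = len(target)
    return [(1.0 - epsilon) * t + epsilon / k for t in target]


def soft_cross_entropy(logits, soft_target):
    """Cross-entropy between a soft target and softmax(logits)."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return -sum(t * (z - log_sum) for z, t in zip(logits, soft_target))


smoothed = label_smooth(one_hot(2, 4))
print([round(v, 3) for v in smoothed])  # [0.025, 0.025, 0.925, 0.025]

# With identical logits the model predicts 1/4 per class, so the loss is ln(4)
loss = soft_cross_entropy([0.0, 0.0, 0.0, 0.0], smoothed)
print(round(loss, 4))  # 1.3863
```

The value ln(4) ≈ 1.386 is a useful reference point when reading the logs: a loss near it means the model is doing no better than guessing uniformly among the four classes.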

Model training

from work.mobilenet import MobileNetV2

# Wrap the model
model_res = MobileNetV2(class_dim=4)
model = paddle.Model(model_res)

# Define the optimizer

scheduler = paddle.optimizer.lr.LinearWarmup(
        learning_rate=0.5, warmup_steps=20, start_lr=0, end_lr=0.5, verbose=True)
optim = paddle.optimizer.SGD(learning_rate=scheduler, parameters=model.parameters())
# optim = paddle.optimizer.Adam(learning_rate=0.001, parameters=model.parameters())

# Configure the model
model.prepare(
    optim,
    paddle.nn.CrossEntropyLoss(soft_label=True),
    Accuracy()
    )

# Train and evaluate the model
model.fit(train_loader,
        val_loader,
        log_freq=1,
        epochs=10,
        # callbacks=Callbk(write=write, iters=iters),
        verbose=1,
        )
Epoch 0: LinearWarmup set learning rate to 0.0.
The loss value printed in the log is the current step, and the metric is the average value of previous step.
Epoch 1/5
step  1/41 [..............................] - loss: 1.4605 - acc: 0.1875 - ETA: 17s - 446ms/stepEpoch 1: LinearWarmup set learning rate to 0.025.
step  2/41 [>.............................] - loss: 1.5771 - acc: 0.2188 - ETA: 14s - 363ms/stepEpoch 2: LinearWarmup set learning rate to 0.05.
step  3/41 [=>............................] - loss: 1.3226 - acc: 0.2812 - ETA: 13s - 361ms/stepEpoch 3: LinearWarmup set learning rate to 0.075.
step  4/41 [=>............................] - loss: 1.2958 - acc: 0.3203 - ETA: 12s - 351ms/stepEpoch 4: LinearWarmup set learning rate to 0.1.
step  5/41 [==>...........................] - loss: 1.6094 - acc: 0.3125 - ETA: 12s - 344ms/stepEpoch 5: LinearWarmup set learning rate to 0.125.
step  6/41 [===>..........................] - loss: 4.1445 - acc: 0.2865 - ETA: 11s - 340ms/stepEpoch 6: LinearWarmup set learning rate to 0.15.
step  7/41 [====>.........................] - loss: 9.7780 - acc: 0.2902 - ETA: 11s - 336ms/stepEpoch 7: LinearWarmup set learning rate to 0.175.
step  8/41 [====>.........................] - loss: 9.8929 - acc: 0.2891 - ETA: 10s - 333ms/stepEpoch 8: LinearWarmup set learning rate to 0.2.
step  9/41 [=====>........................] - loss: 12.4322 - acc: 0.2812 - ETA: 10s - 331ms/stepEpoch 9: LinearWarmup set learning rate to 0.225.
step 10/41 [======>.......................] - loss: 9.7367 - acc: 0.2812 - ETA: 10s - 329ms/step Epoch 10: LinearWarmup set learning rate to 0.25.
step 11/41 [=======>......................] - loss: 10.2395 - acc: 0.2756 - ETA: 9s - 327ms/stepEpoch 11: LinearWarmup set learning rate to 0.275.
step 12/41 [=======>......................] - loss: 15.0872 - acc: 0.2656 - ETA: 9s - 326ms/stepEpoch 12: LinearWarmup set learning rate to 0.3.
step 13/41 [========>.....................] - loss: 6.0130 - acc: 0.2788 - ETA: 9s - 326ms/step Epoch 13: LinearWarmup set learning rate to 0.325.
step 14/41 [=========>....................] - loss: 6.6485 - acc: 0.2701 - ETA: 8s - 326ms/stepEpoch 14: LinearWarmup set learning rate to 0.35.
step 15/41 [=========>....................] - loss: 4.7297 - acc: 0.2646 - ETA: 8s - 325ms/stepEpoch 15: LinearWarmup set learning rate to 0.375.
step 16/41 [==========>...................] - loss: 4.6904 - acc: 0.2617 - ETA: 8s - 324ms/stepEpoch 16: LinearWarmup set learning rate to 0.4.
step 17/41 [===========>..................] - loss: 10.6784 - acc: 0.2574 - ETA: 7s - 324ms/stepEpoch 17: LinearWarmup set learning rate to 0.425.
step 18/41 [============>.................] - loss: 6.7177 - acc: 0.2500 - ETA: 7s - 323ms/step Epoch 18: LinearWarmup set learning rate to 0.45.
step 19/41 [============>.................] - loss: 9.8610 - acc: 0.2484 - ETA: 7s - 322ms/stepEpoch 19: LinearWarmup set learning rate to 0.475.
step 20/41 [=============>................] - loss: 6.4373 - acc: 0.2391 - ETA: 6s - 323ms/stepEpoch 20: LinearWarmup set learning rate to 0.5.
step 21/41 [==============>...............] - loss: 4.8204 - acc: 0.2455 - ETA: 6s - 322ms/stepEpoch 21: LinearWarmup set learning rate to 0.5.
step 22/41 [===============>..............] - loss: 6.1034 - acc: 0.2429 - ETA: 6s - 322ms/stepEpoch 22: LinearWarmup set learning rate to 0.5.
step 23/41 [===============>..............] - loss: 11.9938 - acc: 0.2405 - ETA: 5s - 322ms/stepEpoch 23: LinearWarmup set learning rate to 0.5.
step 24/41 [================>.............] - loss: 4.4950 - acc: 0.2409 - ETA: 5s - 322ms/step Epoch 24: LinearWarmup set learning rate to 0.5.
step 25/41 [=================>............] - loss: 4.4387 - acc: 0.2437 - ETA: 5s - 322ms/stepEpoch 25: LinearWarmup set learning rate to 0.5.
step 26/41 [==================>...........] - loss: 2.6016 - acc: 0.2416 - ETA: 4s - 322ms/stepEpoch 26: LinearWarmup set learning rate to 0.5.
step 27/41 [==================>...........] - loss: 2.4175 - acc: 0.2350 - ETA: 4s - 321ms/stepEpoch 27: LinearWarmup set learning rate to 0.5.
step 28/41 [===================>..........] - loss: 2.0685 - acc: 0.2388 - ETA: 4s - 322ms/stepEpoch 28: LinearWarmup set learning rate to 0.5.
step 29/41 [====================>.........] - loss: 2.0059 - acc: 0.2511 - ETA: 3s - 321ms/stepEpoch 29: LinearWarmup set learning rate to 0.5.
step 30/41 [====================>.........] - loss: 2.8769 - acc: 0.2479 - ETA: 3s - 321ms/stepEpoch 30: LinearWarmup set learning rate to 0.5.
step 31/41 [=====================>........] - loss: 2.0216 - acc: 0.2490 - ETA: 3s - 321ms/stepEpoch 31: LinearWarmup set learning rate to 0.5.
step 32/41 [======================>.......] - loss: 1.9136 - acc: 0.2510 - ETA: 2s - 320ms/stepEpoch 32: LinearWarmup set learning rate to 0.5.
step 33/41 [=======================>......] - loss: 1.6150 - acc: 0.2557 - ETA: 2s - 320ms/stepEpoch 33: LinearWarmup set learning rate to 0.5.
step 34/41 [=======================>......] - loss: 2.8779 - acc: 0.2574 - ETA: 2s - 321ms/stepEpoch 34: LinearWarmup set learning rate to 0.5.
step 35/41 [========================>.....] - loss: 3.5160 - acc: 0.2580 - ETA: 1s - 321ms/stepEpoch 35: LinearWarmup set learning rate to 0.5.
step 36/41 [=========================>....] - loss: 2.4957 - acc: 0.2587 - ETA: 1s - 321ms/stepEpoch 36: LinearWarmup set learning rate to 0.5.
step 37/41 [==========================>...] - loss: 1.9640 - acc: 0.2644 - ETA: 1s - 321ms/stepEpoch 37: LinearWarmup set learning rate to 0.5.
step 38/41 [==========================>...] - loss: 6.7647 - acc: 0.2640 - ETA: 0s - 321ms/stepEpoch 38: LinearWarmup set learning rate to 0.5.
step 39/41 [===========================>..] - loss: 4.1642 - acc: 0.2628 - ETA: 0s - 321ms/stepEpoch 39: LinearWarmup set learning rate to 0.5.
step 40/41 [============================>.] - loss: 1.7224 - acc: 0.2641 - ETA: 0s - 321ms/stepEpoch 40: LinearWarmup set learning rate to 0.5.
Epoch 41: LinearWarmup set learning rate to 0.5.
step 41/41 [==============================] - loss: 5.0040 - acc: 0.2640 - 314ms/step          
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 41/41 [==============================] - loss: 13.1931 - acc: 0.2500 - 282ms/step       
Eval samples: 1284
Epoch 2/5
step  1/41 [..............................] - loss: 8.9741 - acc: 0.2500 - ETA: 14s - 353ms/stepEpoch 42: LinearWarmup set learning rate to 0.5.
step  2/41 [>.............................] - loss: 2.8284 - acc: 0.2656 - ETA: 12s - 322ms/stepEpoch 43: LinearWarmup set learning rate to 0.5.
step  3/41 [=>............................] - loss: 1.5603 - acc: 0.3021 - ETA: 11s - 312ms/stepEpoch 44: LinearWarmup set learning rate to 0.5.
step  4/41 [=>............................] - loss: 1.5512 - acc: 0.3516 - ETA: 11s - 318ms/stepEpoch 45: LinearWarmup set learning rate to 0.5.
step  5/41 [==>...........................] - loss: 1.6047 - acc: 0.3438 - ETA: 11s - 318ms/stepEpoch 46: LinearWarmup set learning rate to 0.5.
step  6/41 [===>..........................] - loss: 1.3139 - acc: 0.3802 - ETA: 11s - 318ms/stepEpoch 47: LinearWarmup set learning rate to 0.5.
step  7/41 [====>.........................] - loss: 1.5003 - acc: 0.3929 - ETA: 10s - 318ms/stepEpoch 48: LinearWarmup set learning rate to 0.5.
step  8/41 [====>.........................] - loss: 1.2434 - acc: 0.4102 - ETA: 10s - 314ms/stepEpoch 49: LinearWarmup set learning rate to 0.5.
step  9/41 [=====>........................] - loss: 3.2426 - acc: 0.4236 - ETA: 9s - 311ms/step Epoch 50: LinearWarmup set learning rate to 0.5.
step 10/41 [======>.......................] - loss: 2.9345 - acc: 0.4156 - ETA: 9s - 308ms/stepEpoch 51: LinearWarmup set learning rate to 0.5.
step 11/41 [=======>......................] - loss: 2.6498 - acc: 0.4119 - ETA: 9s - 306ms/stepEpoch 52: LinearWarmup set learning rate to 0.5.
step 12/41 [=======>......................] - loss: 1.2690 - acc: 0.4141 - ETA: 8s - 304ms/stepEpoch 53: LinearWarmup set learning rate to 0.5.
step 13/41 [========>.....................] - loss: 1.1631 - acc: 0.4135 - ETA: 8s - 304ms/stepEpoch 54: LinearWarmup set learning rate to 0.5.
step 14/41 [=========>....................] - loss: 1.5092 - acc: 0.4152 - ETA: 8s - 302ms/stepEpoch 55: LinearWarmup set learning rate to 0.5.
step 15/41 [=========>....................] - loss: 1.2106 - acc: 0.4250 - ETA: 7s - 301ms/stepEpoch 56: LinearWarmup set learning rate to 0.5.
step 16/41 [==========>...................] - loss: 1.3855 - acc: 0.4375 - ETA: 7s - 300ms/stepEpoch 57: LinearWarmup set learning rate to 0.5.
step 17/41 [===========>..................] - loss: 0.9602 - acc: 0.4540 - ETA: 7s - 300ms/stepEpoch 58: LinearWarmup set learning rate to 0.5.
step 18/41 [============>.................] - loss: 1.0053 - acc: 0.4618 - ETA: 6s - 299ms/stepEpoch 59: LinearWarmup set learning rate to 0.5.
step 19/41 [============>.................] - loss: 1.3082 - acc: 0.4638 - ETA: 6s - 298ms/stepEpoch 60: LinearWarmup set learning rate to 0.5.
step 20/41 [=============>................] - loss: 1.3354 - acc: 0.4672 - ETA: 6s - 298ms/stepEpoch 61: LinearWarmup set learning rate to 0.5.
step 21/41 [==============>...............] - loss: 1.1436 - acc: 0.4747 - ETA: 5s - 297ms/stepEpoch 62: LinearWarmup set learning rate to 0.5.
step 22/41 [===============>..............] - loss: 1.1773 - acc: 0.4815 - ETA: 5s - 297ms/stepEpoch 63: LinearWarmup set learning rate to 0.5.
step 23/41 [===============>..............] - loss: 1.0244 - acc: 0.4905 - ETA: 5s - 296ms/stepEpoch 64: LinearWarmup set learning rate to 0.5.
step 24/41 [================>.............] - loss: 0.9988 - acc: 0.4974 - ETA: 5s - 296ms/stepEpoch 65: LinearWarmup set learning rate to 0.5.
step 25/41 [=================>............] - loss: 1.0273 - acc: 0.5038 - ETA: 4s - 296ms/stepEpoch 66: LinearWarmup set learning rate to 0.5.
step 26/41 [==================>...........] - loss: 1.4507 - acc: 0.5036 - ETA: 4s - 296ms/stepEpoch 67: LinearWarmup set learning rate to 0.5.
step 27/41 [==================>...........] - loss: 1.6826 - acc: 0.5023 - ETA: 4s - 296ms/stepEpoch 68: LinearWarmup set learning rate to 0.5.
step 28/41 [===================>..........] - loss: 1.4181 - acc: 0.5022 - ETA: 3s - 295ms/stepEpoch 69: LinearWarmup set learning rate to 0.5.
step 29/41 [====================>.........] - loss: 0.8785 - acc: 0.5108 - ETA: 3s - 295ms/stepEpoch 70: LinearWarmup set learning rate to 0.5.
step 30/41 [====================>.........] - loss: 0.9180 - acc: 0.5167 - ETA: 3s - 295ms/stepEpoch 71: LinearWarmup set learning rate to 0.5.
step 31/41 [=====================>........] - loss: 0.8932 - acc: 0.5232 - ETA: 2s - 295ms/stepEpoch 72: LinearWarmup set learning rate to 0.5.
step 32/41 [======================>.......] - loss: 1.0629 - acc: 0.5273 - ETA: 2s - 295ms/stepEpoch 73: LinearWarmup set learning rate to 0.5.
step 33/41 [=======================>......] - loss: 0.9054 - acc: 0.5322 - ETA: 2s - 295ms/stepEpoch 74: LinearWarmup set learning rate to 0.5.
step 34/41 [=======================>......] - loss: 0.8587 - acc: 0.5377 - ETA: 2s - 295ms/stepEpoch 75: LinearWarmup set learning rate to 0.5.
step 35/41 [========================>.....] - loss: 0.9035 - acc: 0.5446 - ETA: 1s - 295ms/stepEpoch 76: LinearWarmup set learning rate to 0.5.
step 36/41 [=========================>....] - loss: 0.8399 - acc: 0.5486 - ETA: 1s - 295ms/stepEpoch 77: LinearWarmup set learning rate to 0.5.
step 37/41 [==========================>...] - loss: 0.8279 - acc: 0.5515 - ETA: 1s - 294ms/stepEpoch 78: LinearWarmup set learning rate to 0.5.
step 38/41 [==========================>...] - loss: 0.8625 - acc: 0.5518 - ETA: 0s - 294ms/stepEpoch 79: LinearWarmup set learning rate to 0.5.
step 39/41 [===========================>..] - loss: 0.8525 - acc: 0.5553 - ETA: 0s - 294ms/stepEpoch 80: LinearWarmup set learning rate to 0.5.
step 40/41 [============================>.] - loss: 0.7763 - acc: 0.5594 - ETA: 0s - 294ms/stepEpoch 81: LinearWarmup set learning rate to 0.5.
Epoch 82: LinearWarmup set learning rate to 0.5.
step 41/41 [==============================] - loss: 0.9631 - acc: 0.5592 - 288ms/step          
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 41/41 [==============================] - loss: 0.8044 - acc: 0.5872 - 285ms/step         
Eval samples: 1284
Epoch 3/5
step  1/41 [..............................] - loss: 1.6057 - acc: 0.4375 - ETA: 13s - 344ms/stepEpoch 83: LinearWarmup set learning rate to 0.5.
step  2/41 [>.............................] - loss: 1.6986 - acc: 0.4375 - ETA: 12s - 318ms/stepEpoch 84: LinearWarmup set learning rate to 0.5.
step  3/41 [=>............................] - loss: 1.7700 - acc: 0.3750 - ETA: 11s - 307ms/stepEpoch 85: LinearWarmup set learning rate to 0.5.
step  4/41 [=>............................] - loss: 1.2904 - acc: 0.4062 - ETA: 11s - 302ms/stepEpoch 86: LinearWarmup set learning rate to 0.5.
step  5/41 [==>...........................] - loss: 1.0241 - acc: 0.4437 - ETA: 10s - 299ms/stepEpoch 87: LinearWarmup set learning rate to 0.5.
step  6/41 [===>..........................] - loss: 1.3834 - acc: 0.4896 - ETA: 10s - 297ms/stepEpoch 88: LinearWarmup set learning rate to 0.5.
step  7/41 [====>.........................] - loss: 1.9258 - acc: 0.4821 - ETA: 10s - 295ms/stepEpoch 89: LinearWarmup set learning rate to 0.5.
step  8/41 [====>.........................] - loss: 1.1720 - acc: 0.4961 - ETA: 9s - 294ms/step Epoch 90: LinearWarmup set learning rate to 0.5.
step  9/41 [=====>........................] - loss: 1.0077 - acc: 0.5139 - ETA: 9s - 293ms/stepEpoch 91: LinearWarmup set learning rate to 0.5.
step 10/41 [======>.......................] - loss: 0.8264 - acc: 0.5437 - ETA: 9s - 292ms/stepEpoch 92: LinearWarmup set learning rate to 0.5.
step 11/41 [=======>......................] - loss: 1.2041 - acc: 0.5455 - ETA: 8s - 292ms/stepEpoch 93: LinearWarmup set learning rate to 0.5.
step 12/41 [=======>......................] - loss: 0.9172 - acc: 0.5521 - ETA: 8s - 291ms/stepEpoch 94: LinearWarmup set learning rate to 0.5.
step 13/41 [========>.....................] - loss: 0.9660 - acc: 0.5553 - ETA: 8s - 291ms/stepEpoch 95: LinearWarmup set learning rate to 0.5.
step 14/41 [=========>....................] - loss: 0.9492 - acc: 0.5558 - ETA: 7s - 290ms/stepEpoch 96: LinearWarmup set learning rate to 0.5.
step 15/41 [=========>....................] - loss: 0.9394 - acc: 0.5667 - ETA: 7s - 290ms/stepEpoch 97: LinearWarmup set learning rate to 0.5.
step 16/41 [==========>...................] - loss: 1.2105 - acc: 0.5762 - ETA: 7s - 290ms/stepEpoch 98: LinearWarmup set learning rate to 0.5.
step 17/41 [===========>..................] - loss: 0.9003 - acc: 0.5809 - ETA: 6s - 289ms/stepEpoch 99: LinearWarmup set learning rate to 0.5.
step 18/41 [============>.................] - loss: 0.9375 - acc: 0.5920 - ETA: 6s - 289ms/stepEpoch 100: LinearWarmup set learning rate to 0.5.
step 19/41 [============>.................] - loss: 0.8016 - acc: 0.5921 - ETA: 6s - 289ms/stepEpoch 101: LinearWarmup set learning rate to 0.5.
step 20/41 [=============>................] - loss: 1.0494 - acc: 0.5938 - ETA: 6s - 290ms/stepEpoch 102: LinearWarmup set learning rate to 0.5.
step 21/41 [==============>...............] - loss: 0.8524 - acc: 0.5997 - ETA: 5s - 290ms/stepEpoch 103: LinearWarmup set learning rate to 0.5.
step 22/41 [===============>..............] - loss: 1.0582 - acc: 0.5994 - ETA: 5s - 290ms/stepEpoch 104: LinearWarmup set learning rate to 0.5.
step 23/41 [===============>..............] - loss: 0.8613 - acc: 0.6033 - ETA: 5s - 290ms/stepEpoch 105: LinearWarmup set learning rate to 0.5.
step 24/41 [================>.............] - loss: 0.8158 - acc: 0.6120 - ETA: 4s - 290ms/stepEpoch 106: LinearWarmup set learning rate to 0.5.
step 25/41 [=================>............] - loss: 0.7029 - acc: 0.6175 - ETA: 4s - 290ms/stepEpoch 107: LinearWarmup set learning rate to 0.5.
step 26/41 [==================>...........] - loss: 0.9267 - acc: 0.6214 - ETA: 4s - 290ms/stepEpoch 108: LinearWarmup set learning rate to 0.5.
step 27/41 [==================>...........] - loss: 0.7612 - acc: 0.6250 - ETA: 4s - 290ms/stepEpoch 109: LinearWarmup set learning rate to 0.5.
step 28/41 [===================>..........] - loss: 0.7916 - acc: 0.6283 - ETA: 3s - 290ms/stepEpoch 110: LinearWarmup set learning rate to 0.5.
step 29/41 [====================>.........] - loss: 0.7258 - acc: 0.6347 - ETA: 3s - 290ms/stepEpoch 111: LinearWarmup set learning rate to 0.5.
step 30/41 [====================>.........] - loss: 0.8422 - acc: 0.6365 - ETA: 3s - 290ms/stepEpoch 112: LinearWarmup set learning rate to 0.5.
step 31/41 [=====================>........] - loss: 0.8392 - acc: 0.6391 - ETA: 2s - 290ms/stepEpoch 113: LinearWarmup set learning rate to 0.5.
step 32/41 [======================>.......] - loss: 0.6522 - acc: 0.6445 - ETA: 2s - 289ms/stepEpoch 114: LinearWarmup set learning rate to 0.5.
step 33/41 [=======================>......] - loss: 0.8444 - acc: 0.6477 - ETA: 2s - 289ms/stepEpoch 115: LinearWarmup set learning rate to 0.5.
step 34/41 [=======================>......] - loss: 0.8241 - acc: 0.6498 - ETA: 2s - 289ms/stepEpoch 116: LinearWarmup set learning rate to 0.5.
step 35/41 [========================>.....] - loss: 0.6293 - acc: 0.6562 - ETA: 1s - 289ms/stepEpoch 117: LinearWarmup set learning rate to 0.5.
step 36/41 [=========================>....] - loss: 0.8358 - acc: 0.6571 - ETA: 1s - 289ms/stepEpoch 118: LinearWarmup set learning rate to 0.5.
step 37/41 [==========================>...] - loss: 1.0717 - acc: 0.6562 - ETA: 1s - 289ms/stepEpoch 119: LinearWarmup set learning rate to 0.5.
step 38/41 [==========================>...] - loss: 0.8111 - acc: 0.6587 - ETA: 0s - 289ms/stepEpoch 120: LinearWarmup set learning rate to 0.5.
step 39/41 [===========================>..] - loss: 0.7964 - acc: 0.6635 - ETA: 0s - 289ms/stepEpoch 121: LinearWarmup set learning rate to 0.5.
step 40/41 [============================>.] - loss: 0.7018 - acc: 0.6695 - ETA: 0s - 290ms/stepEpoch 122: LinearWarmup set learning rate to 0.5.
Epoch 123: LinearWarmup set learning rate to 0.5.
step 41/41 [==============================] - loss: 1.9302 - acc: 0.6698 - 284ms/step          
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 41/41 [==============================] - loss: 1.5869 - acc: 0.5218 - 281ms/step         
Eval samples: 1284
Epoch 4/5
step  1/41 [..............................] - loss: 2.1765 - acc: 0.5938 - ETA: 13s - 344ms/stepEpoch 124: LinearWarmup set learning rate to 0.5.
step  2/41 [>.............................] - loss: 1.0811 - acc: 0.6406 - ETA: 12s - 316ms/stepEpoch 125: LinearWarmup set learning rate to 0.5.
step  3/41 [=>............................] - loss: 0.7610 - acc: 0.6771 - ETA: 11s - 305ms/stepEpoch 126: LinearWarmup set learning rate to 0.5.
step  4/41 [=>............................] - loss: 0.6602 - acc: 0.7109 - ETA: 11s - 303ms/stepEpoch 127: LinearWarmup set learning rate to 0.5.
step  5/41 [==>...........................] - loss: 0.6485 - acc: 0.7438 - ETA: 10s - 299ms/stepEpoch 128: LinearWarmup set learning rate to 0.5.
step  6/41 [===>..........................] - loss: 0.6350 - acc: 0.7656 - ETA: 10s - 297ms/stepEpoch 129: LinearWarmup set learning rate to 0.5.
step  7/41 [====>.........................] - loss: 0.7539 - acc: 0.7723 - ETA: 10s - 295ms/stepEpoch 130: LinearWarmup set learning rate to 0.5.
step  8/41 [====>.........................] - loss: 0.8203 - acc: 0.7812 - ETA: 9s - 295ms/step Epoch 131: LinearWarmup set learning rate to 0.5.
step  9/41 [=====>........................] - loss: 0.7477 - acc: 0.7778 - ETA: 9s - 294ms/stepEpoch 132: LinearWarmup set learning rate to 0.5.
step 10/41 [======>.......................] - loss: 0.5125 - acc: 0.7937 - ETA: 9s - 294ms/stepEpoch 133: LinearWarmup set learning rate to 0.5.
step 11/41 [=======>......................] - loss: 0.5680 - acc: 0.8040 - ETA: 8s - 293ms/stepEpoch 134: LinearWarmup set learning rate to 0.5.
step 12/41 [=======>......................] - loss: 0.4912 - acc: 0.8151 - ETA: 8s - 293ms/stepEpoch 135: LinearWarmup set learning rate to 0.5.
step 13/41 [========>.....................] - loss: 0.4919 - acc: 0.8245 - ETA: 8s - 292ms/stepEpoch 136: LinearWarmup set learning rate to 0.5.
step 14/41 [=========>....................] - loss: 0.7733 - acc: 0.8192 - ETA: 7s - 292ms/stepEpoch 137: LinearWarmup set learning rate to 0.5.
step 15/41 [=========>....................] - loss: 0.6591 - acc: 0.8229 - ETA: 7s - 292ms/stepEpoch 138: LinearWarmup set learning rate to 0.5.
step 16/41 [==========>...................] - loss: 0.6335 - acc: 0.8223 - ETA: 7s - 292ms/stepEpoch 139: LinearWarmup set learning rate to 0.5.
step 17/41 [===========>..................] - loss: 0.6573 - acc: 0.8235 - ETA: 6s - 291ms/stepEpoch 140: LinearWarmup set learning rate to 0.5.
step 18/41 [============>.................] - loss: 0.6074 - acc: 0.8281 - ETA: 6s - 291ms/stepEpoch 141: LinearWarmup set learning rate to 0.5.
step 19/41 [============>.................] - loss: 0.4689 - acc: 0.8355 - ETA: 6s - 291ms/stepEpoch 142: LinearWarmup set learning rate to 0.5.
step 20/41 [=============>................] - loss: 0.6040 - acc: 0.8359 - ETA: 6s - 291ms/stepEpoch 143: LinearWarmup set learning rate to 0.5.
step 21/41 [==============>...............] - loss: 0.6394 - acc: 0.8378 - ETA: 5s - 291ms/stepEpoch 144: LinearWarmup set learning rate to 0.5.
step 22/41 [===============>..............] - loss: 0.6323 - acc: 0.8395 - ETA: 5s - 291ms/stepEpoch 145: LinearWarmup set learning rate to 0.5.
step 23/41 [===============>..............] - loss: 0.6662 - acc: 0.8397 - ETA: 5s - 291ms/stepEpoch 146: LinearWarmup set learning rate to 0.5.
step 24/41 [================>.............] - loss: 0.8264 - acc: 0.8398 - ETA: 4s - 291ms/stepEpoch 147: LinearWarmup set learning rate to 0.5.
step 25/41 [=================>............] - loss: 0.8226 - acc: 0.8387 - ETA: 4s - 291ms/stepEpoch 148: LinearWarmup set learning rate to 0.5.
step 26/41 [==================>...........] - loss: 1.0118 - acc: 0.8365 - ETA: 4s - 291ms/stepEpoch 149: LinearWarmup set learning rate to 0.5.
step 27/41 [==================>...........] - loss: 0.4944 - acc: 0.8414 - ETA: 4s - 291ms/stepEpoch 150: LinearWarmup set learning rate to 0.5.
step 28/41 [===================>..........] - loss: 0.6112 - acc: 0.8426 - ETA: 3s - 291ms/stepEpoch 151: LinearWarmup set learning rate to 0.5.
step 29/41 [====================>.........] - loss: 0.4335 - acc: 0.8470 - ETA: 3s - 291ms/stepEpoch 152: LinearWarmup set learning rate to 0.5.
step 30/41 [====================>.........] - loss: 0.4403 - acc: 0.8510 - ETA: 3s - 290ms/stepEpoch 153: LinearWarmup set learning rate to 0.5.
step 31/41 [=====================>........] - loss: 0.4445 - acc: 0.8548 - ETA: 2s - 290ms/stepEpoch 154: LinearWarmup set learning rate to 0.5.
step 32/41 [======================>.......] - loss: 0.4558 - acc: 0.8584 - ETA: 2s - 290ms/stepEpoch 155: LinearWarmup set learning rate to 0.5.
step 33/41 [=======================>......] - loss: 0.5929 - acc: 0.8589 - ETA: 2s - 290ms/stepEpoch 156: LinearWarmup set learning rate to 0.5.
step 34/41 [=======================>......] - loss: 0.6390 - acc: 0.8566 - ETA: 2s - 290ms/stepEpoch 157: LinearWarmup set learning rate to 0.5.
step 35/41 [========================>.....] - loss: 0.9121 - acc: 0.8527 - ETA: 1s - 290ms/stepEpoch 158: LinearWarmup set learning rate to 0.5.
step 36/41 [=========================>....] - loss: 1.1417 - acc: 0.8472 - ETA: 1s - 290ms/stepEpoch 159: LinearWarmup set learning rate to 0.5.
step 37/41 [==========================>...] - loss: 0.8047 - acc: 0.8438 - ETA: 1s - 290ms/stepEpoch 160: LinearWarmup set learning rate to 0.5.
step 38/41 [==========================>...] - loss: 0.6723 - acc: 0.8446 - ETA: 0s - 290ms/stepEpoch 161: LinearWarmup set learning rate to 0.5.
step 39/41 [===========================>..] - loss: 0.6380 - acc: 0.8446 - ETA: 0s - 290ms/stepEpoch 162: LinearWarmup set learning rate to 0.5.
step 40/41 [============================>.] - loss: 0.8412 - acc: 0.8445 - ETA: 0s - 290ms/stepEpoch 163: LinearWarmup set learning rate to 0.5.
Epoch 164: LinearWarmup set learning rate to 0.5.
step 41/41 [==============================] - loss: 1.4908 - acc: 0.8435 - 284ms/step          
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 41/41 [==============================] - loss: 6.3602 - acc: 0.3170 - 279ms/step         
Eval samples: 1284
Epoch 5/5
step  1/41 [..............................] - loss: 1.8366 - acc: 0.5312 - ETA: 13s - 345ms/stepEpoch 165: LinearWarmup set learning rate to 0.5.
step  2/41 [>.............................] - loss: 0.6627 - acc: 0.7188 - ETA: 12s - 317ms/stepEpoch 166: LinearWarmup set learning rate to 0.5.
step  3/41 [=>............................] - loss: 0.5376 - acc: 0.8021 - ETA: 11s - 308ms/stepEpoch 167: LinearWarmup set learning rate to 0.5.
step  4/41 [=>............................] - loss: 0.5863 - acc: 0.8281 - ETA: 11s - 304ms/stepEpoch 168: LinearWarmup set learning rate to 0.5.
step  5/41 [==>...........................] - loss: 0.9443 - acc: 0.8125 - ETA: 10s - 301ms/stepEpoch 169: LinearWarmup set learning rate to 0.5.
step  6/41 [===>..........................] - loss: 0.5346 - acc: 0.8281 - ETA: 10s - 299ms/stepEpoch 170: LinearWarmup set learning rate to 0.5.
step  7/41 [====>.........................] - loss: 0.6858 - acc: 0.8214 - ETA: 10s - 297ms/stepEpoch 171: LinearWarmup set learning rate to 0.5.
step  8/41 [====>.........................] - loss: 0.4342 - acc: 0.8438 - ETA: 9s - 296ms/step Epoch 172: LinearWarmup set learning rate to 0.5.
step  9/41 [=====>........................] - loss: 0.6366 - acc: 0.8472 - ETA: 9s - 295ms/stepEpoch 173: LinearWarmup set learning rate to 0.5.
step 10/41 [======>.......................] - loss: 0.4926 - acc: 0.8594 - ETA: 9s - 295ms/stepEpoch 174: LinearWarmup set learning rate to 0.5.
step 11/41 [=======>......................] - loss: 0.5774 - acc: 0.8608 - ETA: 8s - 294ms/stepEpoch 175: LinearWarmup set learning rate to 0.5.
step 12/41 [=======>......................] - loss: 0.5804 - acc: 0.8672 - ETA: 8s - 294ms/stepEpoch 176: LinearWarmup set learning rate to 0.5.
step 13/41 [========>.....................] - loss: 0.5692 - acc: 0.8702 - ETA: 8s - 294ms/stepEpoch 177: LinearWarmup set learning rate to 0.5.
step 14/41 [=========>....................] - loss: 0.5204 - acc: 0.8705 - ETA: 7s - 294ms/stepEpoch 178: LinearWarmup set learning rate to 0.5.
step 15/41 [=========>....................] - loss: 0.4335 - acc: 0.8792 - ETA: 7s - 293ms/stepEpoch 179: LinearWarmup set learning rate to 0.5.
step 16/41 [==========>...................] - loss: 0.4385 - acc: 0.8828 - ETA: 7s - 293ms/stepEpoch 180: LinearWarmup set learning rate to 0.5.
step 17/41 [===========>..................] - loss: 0.4864 - acc: 0.8842 - ETA: 7s - 292ms/stepEpoch 181: LinearWarmup set learning rate to 0.5.
step 18/41 [============>.................] - loss: 0.5209 - acc: 0.8854 - ETA: 6s - 292ms/stepEpoch 182: LinearWarmup set learning rate to 0.5.
step 19/41 [============>.................] - loss: 0.6156 - acc: 0.8865 - ETA: 6s - 292ms/stepEpoch 183: LinearWarmup set learning rate to 0.5.
step 20/41 [=============>................] - loss: 0.6270 - acc: 0.8859 - ETA: 6s - 291ms/stepEpoch 184: LinearWarmup set learning rate to 0.5.
step 21/41 [==============>...............] - loss: 0.5757 - acc: 0.8854 - ETA: 5s - 291ms/stepEpoch 185: LinearWarmup set learning rate to 0.5.
step 22/41 [===============>..............] - loss: 0.4877 - acc: 0.8892 - ETA: 5s - 291ms/stepEpoch 186: LinearWarmup set learning rate to 0.5.
step 23/41 [===============>..............] - loss: 0.5588 - acc: 0.8913 - ETA: 5s - 291ms/stepEpoch 187: LinearWarmup set learning rate to 0.5.
step 24/41 [================>.............] - loss: 0.4401 - acc: 0.8945 - ETA: 4s - 291ms/stepEpoch 188: LinearWarmup set learning rate to 0.5.
step 25/41 [=================>............] - loss: 0.5182 - acc: 0.8975 - ETA: 4s - 290ms/stepEpoch 189: LinearWarmup set learning rate to 0.5.
step 26/41 [==================>...........] - loss: 0.4689 - acc: 0.9002 - ETA: 4s - 290ms/stepEpoch 190: LinearWarmup set learning rate to 0.5.
step 27/41 [==================>...........] - loss: 0.4507 - acc: 0.9016 - ETA: 4s - 290ms/stepEpoch 191: LinearWarmup set learning rate to 0.5.
step 28/41 [===================>..........] - loss: 0.6368 - acc: 0.9018 - ETA: 3s - 290ms/stepEpoch 192: LinearWarmup set learning rate to 0.5.
step 29/41 [====================>.........] - loss: 0.5875 - acc: 0.9019 - ETA: 3s - 290ms/stepEpoch 193: LinearWarmup set learning rate to 0.5.
step 30/41 [====================>.........] - loss: 0.4567 - acc: 0.9031 - ETA: 3s - 290ms/stepEpoch 194: LinearWarmup set learning rate to 0.5.
step 31/41 [=====================>........] - loss: 0.4439 - acc: 0.9042 - ETA: 2s - 290ms/stepEpoch 195: LinearWarmup set learning rate to 0.5.
step 32/41 [======================>.......] - loss: 0.5571 - acc: 0.9053 - ETA: 2s - 290ms/stepEpoch 196: LinearWarmup set learning rate to 0.5.
step 33/41 [=======================>......] - loss: 0.5097 - acc: 0.9062 - ETA: 2s - 290ms/stepEpoch 197: LinearWarmup set learning rate to 0.5.
step 34/41 [=======================>......] - loss: 0.4536 - acc: 0.9072 - ETA: 2s - 290ms/stepEpoch 198: LinearWarmup set learning rate to 0.5.
step 35/41 [========================>.....] - loss: 0.4585 - acc: 0.9080 - ETA: 1s - 290ms/stepEpoch 199: LinearWarmup set learning rate to 0.5.
step 36/41 [=========================>....] - loss: 0.5545 - acc: 0.9071 - ETA: 1s - 290ms/stepEpoch 200: LinearWarmup set learning rate to 0.5.
step 37/41 [==========================>...] - loss: 0.5444 - acc: 0.9071 - ETA: 1s - 291ms/stepEpoch 201: LinearWarmup set learning rate to 0.5.
step 38/41 [==========================>...] - loss: 0.4621 - acc: 0.9087 - ETA: 0s - 291ms/stepEpoch 202: LinearWarmup set learning rate to 0.5.
step 39/41 [===========================>..] - loss: 0.5259 - acc: 0.9087 - ETA: 0s - 291ms/stepEpoch 203: LinearWarmup set learning rate to 0.5.
step 40/41 [============================>.] - loss: 0.5032 - acc: 0.9102 - ETA: 0s - 291ms/stepEpoch 204: LinearWarmup set learning rate to 0.5.
Epoch 205: LinearWarmup set learning rate to 0.5.
step 41/41 [==============================] - loss: 0.5349 - acc: 0.9104 - 285ms/step          
Eval begin...
The loss value printed in the log is the current batch, and the metric is the average value of previous step.
step 41/41 [==============================] - loss: 0.8855 - acc: 0.8512 - 284ms/step         
Eval samples: 1284
# Save the model parameters
# model.save('Hapi_MyCNN')  # save for training (checkpoint that can be resumed)
model.save('Hapi_MyCNN2', False)  # training=False: export for inference

Model prediction

import os, time
import matplotlib.pyplot as plt
import paddle
from PIL import Image
import numpy as np

def load_image(img_path):
    '''
    Preprocess a single image for prediction.
    '''
    img = Image.open(img_path).convert('RGB')
    # plt.imshow(img)          # draw the image
    # plt.show()               # display the image
    
    # Resize to the 32x32 network input using bilinear interpolation
    img = img.resize((32, 32), Image.BILINEAR)
    img = np.array(img).astype('float32')

    # HWC to CHW 
    img = img.transpose((2, 0, 1))
    
    # Scale pixel values to [0, 1]
    img = img / 255
    # print(img)
    # Optional per-channel standardization (disabled here):
    # mean = [0.31169346, 0.25506335, 0.12432463]   
    # std = [0.34042713, 0.29819837, 0.1375536]
    # img[0] = (img[0] - mean[0]) / std[0]
    # img[1] = (img[1] - mean[1]) / std[1]
    # img[2] = (img[2] - mean[2]) / std[2]
    
    return img

def infer_img(path, model_file_path, use_gpu):
    '''
    Run the exported model on one image and return the predicted label(s).
    '''
    paddle.set_device('gpu:0') if use_gpu else paddle.set_device('cpu')
    model = paddle.jit.load(model_file_path)
    model.eval()  # evaluation (inference) mode, not training mode

    # Preprocess the image to be predicted
    infer_imgs = []
    infer_imgs.append(load_image(path))
    infer_imgs = np.array(infer_imgs)
    # Class names as given in the dataset (Japanese); roughly:
    # excellent, good, processed goods, out of spec
    label_list = ['0:優良', '1:良', '2:加工品', '3:規格外']
    label_pre = []
    for i in range(len(infer_imgs)):
        data = infer_imgs[i]
        dy_x_data = np.array(data).astype('float32')
        dy_x_data = dy_x_data[np.newaxis,:, : ,:]  # add the batch dimension
        img = paddle.to_tensor(dy_x_data)
        out = model(img)

        # print(out[0])
        # print(paddle.nn.functional.softmax(out)[0]) # not needed if the model already includes softmax

        lab = np.argmax(out.numpy())  # argmax(): index of the largest score
        label_pre.append(lab)
        # print(lab)
        # print("Sample: {}, predicted as: {}".format(path, label_list[lab]))
    return label_pre
    # print("*********************************************")
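The preprocessing in `load_image` and the `argmax` step in `infer_img` can also be applied to a whole batch at once. The following is a minimal NumPy-only sketch, not the author's code: `preprocess_array`, `predict_batch`, and the stand-in `dummy_model` are hypothetical names, and the model is assumed to be any callable that maps an `(N, 3, H, W)` array to `(N, num_classes)` scores. The optional `mean`/`std` arguments correspond to the commented-out per-channel standardization above.

```python
import numpy as np

def preprocess_array(img_hwc, mean=None, std=None):
    """Convert an HWC uint8 image array to CHW float32 scaled to [0, 1]."""
    img = np.asarray(img_hwc).astype('float32') / 255.0
    img = img.transpose((2, 0, 1))  # HWC -> CHW
    if mean is not None and std is not None:
        # Reshape to (C, 1, 1) so the statistics broadcast over H and W.
        mean = np.asarray(mean, dtype='float32').reshape(-1, 1, 1)
        std = np.asarray(std, dtype='float32').reshape(-1, 1, 1)
        img = (img - mean) / std
    return img

def predict_batch(model_fn, images):
    """Stack CHW images into one NCHW batch and return one label per image."""
    batch = np.stack(images, axis=0)
    scores = np.asarray(model_fn(batch))
    return scores.argmax(axis=1).tolist()

def dummy_model(batch):
    """Hypothetical stand-in model: scores 4 classes by mean brightness."""
    brightness = batch.mean(axis=(1, 2, 3))
    centers = np.array([0.125, 0.375, 0.625, 0.875], dtype='float32')
    return -np.abs(brightness[:, None] - centers[None, :])

dark = preprocess_array(np.zeros((32, 32, 3), dtype='uint8'))
bright = preprocess_array(np.full((32, 32, 3), 255, dtype='uint8'))
labels = predict_batch(dummy_model, [dark, bright])
print(labels)
```

With the exported Paddle model, `model_fn` could presumably wrap the loaded model in the same way `infer_img` calls it (convert the batch with `paddle.to_tensor`, call the model, take `.numpy()`), which would also avoid reloading the model for every image.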
img_list = os.listdir('data/data71799/lemon_lesson/test_images/')
img_list.sort()
img_list
image_path = []
submit = []
for root, dirs, files in os.walk('data/data71799/lemon_lesson/test_images/'):
    # Images inside the folder
    for f in files:
        image_path.append(os.path.join(root, f))
        submit.append(f)
image_path.sort()       
submit.sort()

key_list = []
for i in range(len(image_path)):
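    # model_file_path must point to a model exported for inference
    # (note that the save cell above writes 'Hapi_MyCNN2').
    key_list.append(infer_img(path=image_path[i], use_gpu=True, model_file_path="Hapi_MyCNN1")[0])
    # time.sleep(0.5)  # avoid garbled output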
    
submit
import pandas as pd

img = pd.DataFrame(submit)
img = img.rename(columns = {0:"id"})
img['class_num'] = key_list


img.to_csv('submit123.csv', index=False)
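Before uploading, it can help to sanity-check the submission table. The sketch below is a hedged example rather than part of the original pipeline: `check_submission` is a hypothetical helper, the file names in the demo data are made up, and it assumes the expected format is exactly what the code above writes (an `id` column and a `class_num` column with the four classes 0 to 3).

```python
import pandas as pd

def check_submission(df, num_classes=4, expected_rows=None):
    """Return a list of problems found in a submission DataFrame."""
    problems = []
    if list(df.columns) != ['id', 'class_num']:
        problems.append(f"unexpected columns: {list(df.columns)}")
        return problems  # the remaining checks need both columns
    if df['id'].duplicated().any():
        problems.append("duplicate ids")
    if df['class_num'].isna().any():
        problems.append("missing predictions")
    valid = df['class_num'].dropna()
    if not valid.between(0, num_classes - 1).all():
        problems.append("class_num outside valid range")
    if expected_rows is not None and len(df) != expected_rows:
        problems.append(f"expected {expected_rows} rows, got {len(df)}")
    return problems

# Made-up demo data, only to show the checks firing.
good = pd.DataFrame({'id': ['test_0000.jpg', 'test_0001.jpg'], 'class_num': [0, 3]})
bad = pd.DataFrame({'id': ['a.jpg', 'a.jpg'], 'class_num': [1, 7]})
print(check_submission(good))
print(check_submission(bad))
```

In the notebook, the same check could be run on the `img` DataFrame, passing `expected_rows=len(image_path)`.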

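Advice and summary

Advice

  • If you are a beginner, aim to complete one competition on your own. Don't chase a ranking, but stay eager to learn new things. At the start of a competition, begin with the simplest model (such as ResNet18) and run through the entire training and prediction pipeline quickly.
  • Be persistent and don't fear failure; you will usually hit plenty of pitfalls along the way. Experiment with data augmentation methods repeatedly, because they strongly affect model accuracy.
  • Set aside enough time to read related papers and look for ideas. For some domains you need at least a basic grasp of the domain knowledge.
  • Have enough compute. The more compute you have, the more experimental ideas you can try per unit of time. Otherwise, in the final stage of large-scale ensembling with big models and large input images, you will be crushed by the top players.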
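To make the advice on data augmentation concrete, here is a minimal NumPy sketch of two common augmentations for CHW images: a random horizontal flip and a random crop after zero padding. This is an illustrative assumption, not the augmentation actually used for this competition; the function names and the `pad=4` default are hypothetical. In a Paddle pipeline, `paddle.vision.transforms` offers ready-made transforms such as `RandomHorizontalFlip` and `RandomCrop`.

```python
import numpy as np

def random_flip(img_chw, rng, p=0.5):
    """Flip the image left-right with probability p."""
    if rng.random() < p:
        return img_chw[:, :, ::-1].copy()
    return img_chw

def random_crop_with_padding(img_chw, rng, pad=4):
    """Zero-pad H and W by `pad`, then crop back to the original size."""
    _, h, w = img_chw.shape
    padded = np.pad(img_chw, ((0, 0), (pad, pad), (pad, pad)), mode='constant')
    top = int(rng.integers(0, 2 * pad + 1))
    left = int(rng.integers(0, 2 * pad + 1))
    return padded[:, top:top + h, left:left + w]

def augment(img_chw, rng):
    """Apply both augmentations; the output shape matches the input shape."""
    return random_crop_with_padding(random_flip(img_chw, rng), rng)

rng = np.random.default_rng(0)  # fixed seed so the demo is repeatable
img = np.arange(3 * 32 * 32, dtype='float32').reshape(3, 32, 32)
out = augment(img, rng)
print(out.shape)
```

Augmentations like these are applied only to training images; validation and test images should go through the deterministic preprocessing alone.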


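Summary

  • Tricks don't work every time, but once you have mastered the methods you can cut down on trial and error and run experiments more efficiently.
  • Hyperparameter tuning plus tricks secure the lower bound of your model's performance and can earn you a decent score, but in most cases they won't bring you victory.
  • There is no universal playbook. Every competition brings a new kind of data, and success often comes from accumulated experience; there is always more to learn.
  • Seeing through to the essence of the data is what really matters.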


In short, what you need for deep learning competitions is patience and GPUs (GPU is all you need).
