(구현) Deep Residual Learning for Image Recognition

Notice

Recent Posts

Recent Comments

Link

12-20 02:58

« 2024/12 »
일	월	화	수	목	금	토
1	2	3	4	5	6	7
8	9	10	11	12	13	14
15	16	17	18	19	20	21
22	23	24	25	26	27	28
29	30	31

Tags more

Archives

Today

Total

관리 메뉴

AI 전문가가 되고싶은 사람

(구현) Deep Residual Learning for Image Recognition 본문

논문

(구현) Deep Residual Learning for Image Recognition

Kimseungwoo0407 2024. 9. 12. 12:56

참고 사이트 :

https://github.com/akamaster/pytorch_resnet_cifar10/blob/master/resnet.py

pytorch_resnet_cifar10/resnet.py at master · akamaster/pytorch_resnet_cifar10

Proper implementation of ResNet-s for CIFAR10/100 in pytorch that matches description of the original paper. - akamaster/pytorch_resnet_cifar10

github.com

Resnet 논문에 나온 파라미터를 참고하여 작성한 코드입니다.

import torch.nn as nn
import torch
import torch.nn.functional as F
# 가중치 초기화 함수 제공
import torch.nn.init as init

# Kaiming 초기화 (ReLU에 적합한 파라미터 초기화 방법)
# 네트워크가 안정적으로 학습할 수 있게 해줌
def _weights_init(m):
    classname = m.__class__.__name__
    if isinstance(m, nn.Linear) or isinstance(m, nn.Conv2d):
        init.kaiming_normal_(m.weight)

# 크기를 맞추기 위해 패딩을 적용할 때 사용하면 간단한 계산 처리 가능
class LambdaLayer(nn.Module):
    def __init__(self, lambd):
        super(LambdaLayer, self).__init__()
        self.lambd = lambd

    def forward(self, x):
        return self.lambd(x)

# 두 개의 3x3 합성곱 레이어와 배치 정규화 레이어로 구성되며, shortcut 경로는 입력과 출력의 크기 및 채널 수를 맞추기 위해 추가된다.
class BasicBlock(nn.Module):
    expansion = 1

    def __init__(self, in_channels, out_channels, stride=1, option='A'):
        super(BasicBlock, self).__init__()
        self.conv1 = nn.Conv2d(in_channels, out_channels, kernel_size=3, stride=stride, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(out_channels)
        self.conv2 = nn.Conv2d(out_channels, out_channels, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(out_channels)

        self.shortcut = nn.Sequential()
        if stride != 1 or in_channels != out_channels:
            # 패딩을 사용해 크기를 맞춘다.
            if option == 'A':
                """
                For CIFAR10 ResNet paper uses option A.
                """
                self.shortcut = LambdaLayer(lambda x:
                                            F.pad(x[:,:,::2,::2], (0,0,0,0, out_channels//4, out_channels//4), "constant",0))
            # 1x1 합성곱으로 채널 수와 크기를 조정한다.
            elif option == "B":
                self.shortcut = nn.Sequential(
                    nn.Conv2d(in_channels, self.expansion * out_channels, kernel_size=1, stride=stride, bias=False),
                    nn.BatchNorm2d(self.expansion * out_channels)
                )

        
        def forward(self,x):
            out = F.relu(self.bn1(self.conv1(x)))
            out = self.bn2(self.conv2(out))
            out += self.shortcut(x)
            out = F.relu(out)
            return out

# conv1은 초기 입력을 처리하는 합성곱 레이어, 입력은 RGB 이미지이므로 채널 수 3
class Resnet(nn.Module):
    def __init__(self, block, num_blocks, num_classes=10):
        super(Resnet,self).__init__()
        self.in_channels=16

        self.conv1 = nn.Conv2d(3, 16, kernel_size=3, stride=1, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(16)
        self.layer1 = self._make_layer(block, 16, num_blocks[0], stride=1)
        self.layer2 = self._make_layer(block, 32, num_blocks[1], stride=2)
        self.layer3 = self._make_layer(block, 64, num_blocks[2], stride=2)
        self.linear = nn.Linear(64, num_classes)

        self.apply(_weights_init)
    # 여러 개의 블록을 쌓는 역할을 한다.
    # 블록마다 stride 값을 조정하여 이미지 크기를 줄이고, 필요한 레이어들을 리스트로 쌓아서 반환
    def _make_layer(self,block,out_channels,num_blocks,stride):
        strides = [stride] + [1]*(num_blocks-1)
        layers = []
        for stride in strides:
            layers.append(block(self.in_channels, out_channels, stride))
            self.in_channels = out_channels * block.expansion

        return nn.Sequential(*layers)

    def forward(self,x):
        out = F.relu(self.bn1(self.conv1(x)))
        out = self.layer1(out)
        out = self.layer2(out)
        out = self.layer3(out)
        out = F.avg_pool2d(out, out.size()[3])
        out = out.view(out.size(0), -1)
        out = self.linear(out)
        return out


def resnet20():
    return Resnet(BasicBlock, [3,3,3])

def resnet32():
    return ResNet(BasicBlock, [5, 5, 5])


def resnet44():
    return ResNet(BasicBlock, [7, 7, 7])


def resnet56():
    return ResNet(BasicBlock, [9, 9, 9])


def resnet110():
    return ResNet(BasicBlock, [18, 18, 18])


def resnet1202():
    return ResNet(BasicBlock, [200, 200, 200])


def test(net):
    import numpy as np
    total_params = 0

    for x in filter(lambda p: p.requires_grad, net.parameters()):
        total_params += np.prod(x.data.numpy().shape)
    print("Total number of params", total_params)
    print("Total layers", len(list(filter(lambda p: p.requires_grad and len(p.data.size())>1, net.parameters()))))


if __name__ == "__main__":
    for net_name in __all__:
        if net_name.startswith('resnet'):
            print(net_name)
            test(globals()[net_name]())
            print()

해당 코드를 실행하면 아래와 같은 결과가 나오는데 이 결과는 논문에서 제시한 파라미터와 매우 비슷하게 작성되었음을 알 수 있다.

'논문' 카테고리의 다른 글

RNN(Recurrent Neural Networks) (0)	2024.11.18
Transformer 리뷰를 위한 공부(완료) (1)	2024.11.18
(공부용) Deep Residual Learning for Image Recognition (4)	2024.09.11
(구현) Rethinking the inception Architecture for computer vision (0)	2024.09.10
(공부용) Rethinking the inception Architecture for computer vision (2)	2024.09.09

'논문' Related Articles

AI 전문가가 되고싶은 사람

(구현) Deep Residual Learning for Image Recognition 본문

(구현) Deep Residual Learning for Image Recognition

'논문' 카테고리의 다른 글

티스토리툴바