Repository: jhkim89/PyramidNet
Branch: master
Commit: 27d168ee586e
Files: 3
Total size: 21.9 KB
Directory structure:
gitextract_o_9qlr6a/
├── README.md
├── addpyramidnet.lua
└── mulpyramidnet.lua
================================================
FILE CONTENTS
================================================
================================================
FILE: README.md
================================================
# PyramidNet
This repository contains the code for the paper:
Dongyoon Han*, Jiwhan Kim*, and Junmo Kim, "Deep Pyramidal Residual Networks", CVPR 2017 (* equal contribution).
arXiv: https://arxiv.org/abs/1610.02915
The code is based on Facebook's implementation of ResNet (https://github.com/facebook/fb.resnet.torch).
### Caffe implementation of PyramidNet: [site](https://github.com/jhkim89/PyramidNet-caffe)
### PyTorch implementation of PyramidNet: [site](https://github.com/dyhan0920/PyramidNet-PyTorch)
## Abstract
Deep convolutional neural networks (DCNNs) have shown remarkable performance in image classification tasks in recent years. Generally, deep neural network architectures are stacks consisting of a large number of convolution layers, and they perform downsampling along the spatial dimension via pooling to reduce memory usage. At the same time, the feature map dimension (i.e., the number of channels) is sharply increased at downsampling locations, which is essential to ensure effective performance because it increases the capability of high-level attributes. Moreover, this also applies to residual networks and is very closely related to their performance. In this research, instead of using downsampling to achieve a sharp increase at each residual unit, we gradually increase the feature map dimension at all the units to involve as many locations as possible. This is discussed in depth together with our new insights as it has proven to be an effective design to improve the generalization ability. Furthermore, we propose a novel residual unit capable of further improving the classification accuracy with our new network architecture. Experiments on benchmark CIFAR datasets have shown that our network architecture has a superior generalization ability compared to the original residual networks.
<p align="center"><img src="https://cloud.githubusercontent.com/assets/22743125/19235579/7e7e33c6-8f2d-11e6-9397-1b505688e92a.png" width="960"></p>
Figure 1: Schematic illustration of (a) basic residual units, (b) bottleneck, (c) wide residual units, and (d) our pyramidal residual units.
<p align="center"><img src="https://cloud.githubusercontent.com/assets/22743125/19235610/bb3d5fd0-8f2d-11e6-84bd-46c9b7a4797a.png" width="640"></p>
Figure 2: Visual illustrations of (a) additive PyramidNet, (b) multiplicative PyramidNet, and (c) comparison of (a) and (b).
## Usage
1. Install Torch (http://torch.ch) and ResNet (https://github.com/facebook/fb.resnet.torch).
2. Add the files addpyramidnet.lua and mulpyramidnet.lua to the folder "models".
3. Manually set the parameter "alpha" in the files addpyramidnet.lua and mulpyramidnet.lua (Line 28).
4. Change the learning rate schedule in the file train.lua: "decay = epoch >= 122 and 2 or epoch >= 81 and 1 or 0" to "decay = epoch >= 225 and 2 or epoch >= 150 and 1 or 0".
5. Train our PyramidNet by running main.lua as shown below:
To train additive PyramidNet-164 (alpha=48) on CIFAR-10 dataset:
```bash
th main.lua -dataset cifar10 -depth 164 -nEpochs 300 -LR 0.1 -netType addpyramidnet -batchSize 128 -shareGradInput true
```
To train additive PyramidNet-164 (alpha=48) with 4 GPUs on CIFAR-100 dataset:
```bash
th main.lua -dataset cifar100 -depth 164 -nEpochs 300 -LR 0.5 -nGPU 4 -nThreads 8 -netType addpyramidnet -batchSize 128 -shareGradInput true
```
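
The schedule change in step 4 can be sketched in plain Lua. This is a minimal illustration rather than the repository's code: `learningRate` is a hypothetical standalone function, and it assumes (as in the train.lua of fb.resnet.torch) that the learning rate is the base `-LR` value multiplied by 0.1 for each decay step.

```lua
-- Hypothetical standalone version of the CIFAR step schedule from step 4.
-- Assumption: learning rate = baseLR * 0.1^decay.
local function learningRate(baseLR, epoch)
   -- decay is 0 before epoch 150, 1 for epochs 150-224, and 2 from epoch 225 on
   local decay = epoch >= 225 and 2 or epoch >= 150 and 1 or 0
   return baseLR * 0.1 ^ decay
end

-- Print the rate around each boundary for a base rate of 0.1.
for _, epoch in ipairs({1, 149, 150, 224, 225, 300}) do
   print(string.format('epoch %3d  lr %.4f', epoch, learningRate(0.1, epoch)))
end
```

With `-nEpochs 300`, the two drops fall at one half and three quarters of training.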
## Results
#### CIFAR
Top-1 error rates on CIFAR-10 and CIFAR-100 datasets. "alpha" denotes the widening factor; "add" and "mul" denote the results obtained with additive and multiplicative pyramidal networks, respectively.
| Network | # of parameters | Output feat. dimension | CIFAR-10 | CIFAR-100 |
| --------------------------------- | --------------- | ---------------------- | ----------- | ----------- |
| PyramidNet-110 (mul), alpha=4.75 | 1.7M | 76 | 4.62 | 23.16 |
| PyramidNet-110 (add), alpha=48 | 1.7M | **64** | 4.62 | 23.31 |
| PyramidNet-110 (mul), alpha=8 | 3.8M | 128 | 4.50 | 20.94 |
| PyramidNet-110 (add), alpha=84 | 3.8M | **100** | 4.27 | 20.21 |
| PyramidNet-110 (mul), alpha=27 | 28.3M | 432 | 4.06 | 18.79 |
| PyramidNet-110 (add), alpha=270 | 28.3M | **286** | **3.73** | **18.25** |
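
The "Output feat. dimension" column follows from the widening rule. The sketch below is a plain-Lua illustration (no Torch required) that mirrors the channel arithmetic of the CIFAR branches in addpyramidnet.lua and mulpyramidnet.lua for basic-block units. `finalWidth` is a hypothetical helper, not a function in this repository, and it assumes 16 starting channels and (depth - 2) / 6 units in each of the three stages. Note that addpyramidnet.lua as shipped selects the bottleneck variant for CIFAR, with the basic-block line commented out.

```lua
local function round(x)
   return math.floor(x + 0.5)
end

-- Hypothetical helper: width of the last residual unit of a basic-block
-- PyramidNet on CIFAR. mode is 'add' (additive) or 'mul' (multiplicative).
local function finalWidth(depth, alpha, mode)
   local units = 3 * (depth - 2) / 6   -- total units over the three stages
   local width = 16                    -- channels after the first convolution
   for _ = 1, units do
      if mode == 'add' then
         width = width + alpha / units        -- grow by a constant step
      else
         width = width * alpha ^ (1 / units)  -- grow by a constant ratio
      end
   end
   -- The running value stays unrounded; only the width used to build a unit
   -- is rounded.
   return round(width)
end

print(finalWidth(110, 48, 'add'))    -- 64
print(finalWidth(110, 4.75, 'mul'))  -- 76
```

The additive network ends at 16 + alpha channels and the multiplicative one at 16 * alpha, which is why the two variants with the same parameter count differ in output dimension.
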
Top-1 error rates of our model with the **bottleneck architecture** on CIFAR-10 and CIFAR-100 datasets. We use the additive pyramidal networks.
| Network | # of parameters | Output feat. dimension | CIFAR-10 | CIFAR-100 |
| --------------------------------- | --------------- | ---------------------- | ----------- | ----------- |
| PyramidNet-164 (add), alpha=48 | 1.7M | 256 | 4.21 | 19.52 |
| PyramidNet-164 (add), alpha=84 | 3.8M | 400 | 3.96 | 18.32 |
| PyramidNet-164 (add), alpha=270 | 27.0M | 1144 | **3.48** | **17.01** |
| PyramidNet-200 (add), alpha=240 | 26.6M | 1024 | **3.44** | **16.51** |
| PyramidNet-236 (add), alpha=220 | 26.8M | 944 | **3.40** | **16.37** |
| PyramidNet-272 (add), alpha=200 | 26.0M | 864 | **3.31** | **16.35** |
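
In the bottleneck variant, the last 1x1 convolution of each unit expands the width by a factor of 4, so the output dimension in the table above is 4 * (16 + alpha). The following plain-Lua check is a sketch under stated assumptions: an integer alpha and (depth - 2) / 9 units per stage, as in addpyramidnet.lua. The names `bottleneckOutputDim` and `unitsPerStage` are hypothetical and used only for illustration.

```lua
-- Hypothetical helper: closed-form output width of an additive bottleneck
-- PyramidNet on CIFAR (16 starting channels, 4x expansion in the last conv).
local function bottleneckOutputDim(alpha)
   return 4 * (16 + alpha)
end

-- Hypothetical helper: units per stage; the depth must make this whole.
local function unitsPerStage(depth)
   local n = (depth - 2) / 9
   assert(n == math.floor(n), 'depth must be of the form 9n + 2')
   return n
end

local rows = {
   {depth = 164, alpha = 48},
   {depth = 164, alpha = 270},
   {depth = 200, alpha = 240},
   {depth = 272, alpha = 200},
}
for _, r in ipairs(rows) do
   print(string.format('PyramidNet-%d, alpha=%d: %d units per stage, output dim %d',
      r.depth, r.alpha, unitsPerStage(r.depth), bottleneckOutputDim(r.alpha)))
end
```

The ImageNet branch starts from 64 channels instead of 16, which gives 4 * (64 + 300) = 1456 for the alpha=300 row of the ImageNet table.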

Figure 3: Performance distribution according to number of parameters on CIFAR-10 (left) and CIFAR-100 (right).
#### ImageNet
Single-model, single-crop (224x224) Top-1 and Top-5 error rates on the ImageNet dataset. We use the additive PyramidNet for our results.
| Network | # of parameters | Output feat. dimension | Top-1 error | Top-5 error |
| ----------------------------------------- | --------------- | ---------------------- | ----------- | ----------- |
| PreResNet-200 | 64.5M | 2048 | 21.66 | 5.79 |
| PyramidNet-200, alpha=300 | 62.1M | 1456 | 20.47 | 5.29 |
| PyramidNet-200, alpha=450, Dropout (0.5) | 116.4M | 2056 | 20.11 | 5.43 |
Model files download: [link](https://1drv.ms/f/s!AmNvwgeB0n4GsiDFDNJWZkEbajJf)
## Notes
1. The parameter "alpha" can only be changed in the files addpyramidnet.lua and mulpyramidnet.lua (Line 28).
2. We recommend using multiple GPUs when training additive PyramidNet with alpha=270 or multiplicative PyramidNet with alpha=27; otherwise you may get an "out of memory" error.
3. We are currently testing our code on the ImageNet dataset. We will upload the results when training is complete.
## Updates
07/17/2017:
1. Caffe implementation of PyramidNet is released.
02/12/2017:
1. Results of the bottleneck architecture on CIFAR datasets are updated.
01/23/2017:
1. Added ImageNet pretrained models.
## Contact
Jiwhan Kim (jhkim89@kaist.ac.kr),
Dongyoon Han (dyhan@kaist.ac.kr),
Junmo Kim (junmo.kim@kaist.ac.kr)
================================================
FILE: addpyramidnet.lua
================================================
-- Implementation of "Deep Pyramidal Residual Networks"
-- ************************************************************************
-- This code incorporates material from:
-- fb.resnet.torch (https://github.com/facebook/fb.resnet.torch)
-- Copyright (c) 2016, Facebook, Inc.
-- All rights reserved.
--
-- This source code is licensed under the BSD-style license found in the
-- LICENSE file in the root directory of this source tree. An additional grant
-- of patent rights can be found in the PATENTS file in the same directory.
--
-- ************************************************************************
local nn = require 'nn'
require 'cunn'
require 'cudnn' -- the cudnn.* modules below need this; harmless if main.lua already loaded it
local Convolution = cudnn.SpatialConvolution
local Avg = cudnn.SpatialAveragePooling
local ReLU = cudnn.ReLU
local Max = nn.SpatialMaxPooling
local SBatchNorm = nn.SpatialBatchNormalization
local function createModel(opt)
local depth = opt.depth
local iChannels
local alpha = 48
-- local alpha = 300
local function round(x)
return math.floor(x+0.5)
end
local function shortcut(nInputPlane, nOutputPlane, stride)
-- Strided, zero-padded identity shortcut
local short = nn.Sequential()
if stride == 2 then
short:add(nn.SpatialAveragePooling(2, 2, 2, 2))
end
if nInputPlane ~= nOutputPlane then
short:add(nn.Padding(1, (nOutputPlane - nInputPlane), 3))
else
short:add(nn.Identity())
end
return short
end
local function basicblock(n, stride)
local nInputPlane = iChannels
iChannels = n
local s = nn.Sequential()
s:add(SBatchNorm(nInputPlane))
s:add(Convolution(nInputPlane,n,3,3,stride,stride,1,1))
s:add(SBatchNorm(n))
s:add(ReLU(true))
s:add(Convolution(n,n,3,3,1,1,1,1))
s:add(SBatchNorm(n))
return nn.Sequential()
:add(nn.ConcatTable()
:add(s)
:add(shortcut(nInputPlane, n, stride)))
:add(nn.CAddTable(true))
end
local function bottleneck(n, stride, type)
local nInputPlane = iChannels
iChannels = n * 4
local s = nn.Sequential()
s:add(SBatchNorm(nInputPlane))
s:add(Convolution(nInputPlane,n,1,1,1,1,0,0))
s:add(SBatchNorm(n))
s:add(ReLU(true))
s:add(Convolution(n,n,3,3,stride,stride,1,1))
s:add(SBatchNorm(n))
s:add(ReLU(true))
s:add(Convolution(n,n*4,1,1,1,1,0,0))
s:add(SBatchNorm(n*4))
return nn.Sequential()
:add(nn.ConcatTable()
:add(s)
:add(shortcut(nInputPlane, n * 4, stride)))
:add(nn.CAddTable(true))
end
-- Creates count residual blocks with specified number of features
local function layer(block, features, count, stride)
local s = nn.Sequential()
if count < 1 then
return s
end
for i=1,count do
s:add(block(features, stride))
end
return s
end
local model = nn.Sequential()
if opt.dataset == 'imagenet' then
-- Configurations for ResNet:
-- num. residual blocks, num features, residual block function
local cfg = {
[18] = {{2, 2, 2, 2}, 512, basicblock},
[34] = {{3, 4, 6, 3}, 512, basicblock},
[50] = {{3, 4, 6, 3}, 2048, bottleneck},
[101] = {{3, 4, 23, 3}, 2048, bottleneck},
[152] = {{3, 8, 36, 3}, 2048, bottleneck},
[200] = {{3, 24, 36, 3}, 2048, bottleneck},
}
assert(cfg[depth], 'Invalid depth: ' .. tostring(depth))
local def, nFeatures, block = table.unpack(cfg[depth])
iChannels = 64
local Channeltemp = 64 -- running (unrounded) channel count; local to avoid leaking a global
local addrate = alpha/(def[1]+def[2]+def[3]+def[4])
print(' | PyramidNet-' .. depth .. ' ImageNet')
model:add(Convolution(3,64,7,7,2,2,3,3))
model:add(SBatchNorm(64))
model:add(ReLU(true))
model:add(Max(3,3,2,2,1,1))
Channeltemp = Channeltemp + addrate
model:add(bottleneck(round(Channeltemp), 1, 1, 'first'))
for i=2,def[1] do
Channeltemp = Channeltemp + addrate
model:add(bottleneck(round(Channeltemp), 1, 1))
end
Channeltemp = Channeltemp + addrate
model:add(bottleneck(round(Channeltemp), 2, 1))
for i=2,def[2] do
Channeltemp = Channeltemp + addrate
model:add(bottleneck(round(Channeltemp), 1, 1))
end
Channeltemp = Channeltemp + addrate
model:add(bottleneck(round(Channeltemp), 2, 1))
for i=2,def[3] do
Channeltemp = Channeltemp + addrate
model:add(bottleneck(round(Channeltemp), 1, 1))
end
Channeltemp = Channeltemp + addrate
model:add(bottleneck(round(Channeltemp), 2, 1))
for i=2,def[4] do
Channeltemp = Channeltemp + addrate
model:add(bottleneck(round(Channeltemp), 1, 1))
end
model:add(nn.Copy(nil, nil, true))
model:add(SBatchNorm(iChannels))
model:add(ReLU(true))
model:add(Avg(7, 7, 1, 1))
model:add(nn.View(iChannels):setNumInputDims(3))
model:add(nn.Linear(iChannels, 1000))
elseif opt.dataset == 'cifar10' or opt.dataset == 'cifar100' then
-- local n = (depth - 2) / 6 -- basicblock
local n = (depth - 2) / 9 -- bottleneck
iChannels = 16
local startChannel = 16
local Channeltemp = 16
local addChannel = alpha/(3*n) -- channels added per unit; local to avoid leaking a global
print(' | PyramidNet-' .. depth .. ' CIFAR')
model:add(Convolution(3,16,3,3,1,1,1,1))
model:add(SBatchNorm(iChannels))
Channeltemp = startChannel
startChannel = startChannel + addChannel
model:add(layer(bottleneck, round(startChannel), 1, 1, 1))
for i=2,n do
Channeltemp = startChannel
startChannel = startChannel + addChannel
model:add(layer(bottleneck, round(startChannel), 1, 1, 1))
end
Channeltemp = startChannel
startChannel = startChannel + addChannel
model:add(layer(bottleneck, round(startChannel), 1, 2, 1))
for i=2,n do
Channeltemp = startChannel
startChannel = startChannel + addChannel
model:add(layer(bottleneck, round(startChannel), 1, 1, 1))
end
Channeltemp = startChannel
startChannel = startChannel + addChannel
model:add(layer(bottleneck, round(startChannel), 1, 2, 1))
for i=2,n do
Channeltemp = startChannel
startChannel = startChannel + addChannel
model:add(layer(bottleneck, round(startChannel), 1, 1, 1))
end
model:add(nn.Copy(nil, nil, true))
model:add(SBatchNorm(iChannels))
model:add(ReLU(true))
model:add(Avg(8, 8, 1, 1))
model:add(nn.View(iChannels):setNumInputDims(3))
if opt.dataset == 'cifar10' then
model:add(nn.Linear(iChannels, 10))
elseif opt.dataset == 'cifar100' then
model:add(nn.Linear(iChannels, 100))
end
else
error('invalid dataset: ' .. opt.dataset)
end
local function ConvInit(name)
for k,v in pairs(model:findModules(name)) do
local n = v.kW*v.kH*v.nOutputPlane
v.weight:normal(0,math.sqrt(2/n))
if cudnn.version >= 4000 then
v.bias = nil
v.gradBias = nil
else
v.bias:zero()
end
end
end
local function BNInit(name)
for k,v in pairs(model:findModules(name)) do
v.weight:fill(1)
v.bias:zero()
end
end
ConvInit('cudnn.SpatialConvolution')
ConvInit('nn.SpatialConvolution')
BNInit('fbnn.SpatialBatchNormalization')
BNInit('cudnn.SpatialBatchNormalization')
BNInit('nn.SpatialBatchNormalization')
for k,v in pairs(model:findModules('nn.Linear')) do
v.bias:zero()
end
model:cuda()
if opt.cudnn == 'deterministic' then
model:apply(function(m)
if m.setMode then m:setMode(1,1,1) end
end)
end
model:get(1).gradInput = nil
return model
end
return createModel
================================================
FILE: mulpyramidnet.lua
================================================
-- Implementation of "Deep Pyramidal Residual Networks"
-- ************************************************************************
-- This code incorporates material from:
-- fb.resnet.torch (https://github.com/facebook/fb.resnet.torch)
-- Copyright (c) 2016, Facebook, Inc.
-- All rights reserved.
--
-- This source code is licensed under the BSD-style license found in the
-- LICENSE file in the root directory of this source tree. An additional grant
-- of patent rights can be found in the PATENTS file in the same directory.
--
-- ************************************************************************
local nn = require 'nn'
require 'cunn'
require 'cudnn' -- the cudnn.* modules below need this; harmless if main.lua already loaded it
local Convolution = cudnn.SpatialConvolution
local Avg = cudnn.SpatialAveragePooling
local ReLU = cudnn.ReLU
local Max = nn.SpatialMaxPooling
local SBatchNorm = nn.SpatialBatchNormalization
local function createModel(opt)
local depth = opt.depth
local iChannels
local alpha = 4.75
local function round(x)
return math.floor(x+0.5)
end
local function shortcut(nInputPlane, nOutputPlane, stride)
-- Strided, zero-padded identity shortcut
local short = nn.Sequential()
if stride == 2 then
short:add(nn.SpatialAveragePooling(2, 2, 2, 2))
end
if nInputPlane ~= nOutputPlane then
short:add(nn.Padding(1, (nOutputPlane - nInputPlane), 3))
else
short:add(nn.Identity())
end
return short
end
local function basicblock(n, stride)
local nInputPlane = iChannels
iChannels = n
local s = nn.Sequential()
s:add(SBatchNorm(nInputPlane))
s:add(Convolution(nInputPlane,n,3,3,stride,stride,1,1))
s:add(SBatchNorm(n))
s:add(ReLU(true))
s:add(Convolution(n,n,3,3,1,1,1,1))
s:add(SBatchNorm(n))
return nn.Sequential()
:add(nn.ConcatTable()
:add(s)
:add(shortcut(nInputPlane, n, stride)))
:add(nn.CAddTable(true))
end
local function bottleneck(n, stride, type)
local nInputPlane = iChannels
iChannels = n * 4
local s = nn.Sequential()
s:add(SBatchNorm(nInputPlane))
s:add(Convolution(nInputPlane,n,1,1,1,1,0,0))
s:add(SBatchNorm(n))
s:add(ReLU(true))
s:add(Convolution(n,n,3,3,stride,stride,1,1))
s:add(SBatchNorm(n))
s:add(ReLU(true))
s:add(Convolution(n,n*4,1,1,1,1,0,0))
s:add(SBatchNorm(n*4))
return nn.Sequential()
:add(nn.ConcatTable()
:add(s)
:add(shortcut(nInputPlane, n * 4, stride)))
:add(nn.CAddTable(true))
end
-- Creates count residual blocks with specified number of features
local function layer(block, features, count, stride)
local s = nn.Sequential()
if count < 1 then
return s
end
for i=1,count do
s:add(block(features, stride))
end
return s
end
local model = nn.Sequential()
if opt.dataset == 'imagenet' then
-- Configurations for ResNet:
-- num. residual blocks, num features, residual block function
local cfg = {
[18] = {{2, 2, 2, 2}, 512, basicblock},
[34] = {{3, 4, 6, 3}, 512, basicblock},
[50] = {{3, 4, 6, 3}, 2048, bottleneck},
[101] = {{3, 4, 23, 3}, 2048, bottleneck},
[152] = {{3, 8, 36, 3}, 2048, bottleneck},
[200] = {{3, 24, 36, 3}, 2048, bottleneck},
}
assert(cfg[depth], 'Invalid depth: ' .. tostring(depth))
local def, nFeatures, block = table.unpack(cfg[depth])
iChannels = 64
local Channeltemp = 64 -- running (unrounded) channel count; local to avoid leaking a global
local addrate = alpha^(1/(def[1]+def[2]+def[3]+def[4]))
print(' | PyramidNet-' .. depth .. ' ImageNet')
model:add(Convolution(3,64,7,7,2,2,3,3))
model:add(SBatchNorm(64))
model:add(ReLU(true))
model:add(Max(3,3,2,2,1,1))
Channeltemp = Channeltemp * addrate
model:add(bottleneck(round(Channeltemp), 1, 1, 'first'))
for i=2,def[1] do
Channeltemp = Channeltemp * addrate
model:add(bottleneck(round(Channeltemp), 1, 1))
end
Channeltemp = Channeltemp * addrate
model:add(bottleneck(round(Channeltemp), 2, 1))
for i=2,def[2] do
Channeltemp = Channeltemp * addrate
model:add(bottleneck(round(Channeltemp), 1, 1))
end
Channeltemp = Channeltemp * addrate
model:add(bottleneck(round(Channeltemp), 2, 1))
for i=2,def[3] do
Channeltemp = Channeltemp * addrate
model:add(bottleneck(round(Channeltemp), 1, 1))
end
Channeltemp = Channeltemp * addrate
model:add(bottleneck(round(Channeltemp), 2, 1))
for i=2,def[4] do
Channeltemp = Channeltemp * addrate
model:add(bottleneck(round(Channeltemp), 1, 1))
end
model:add(nn.Copy(nil, nil, true))
model:add(SBatchNorm(iChannels))
model:add(ReLU(true))
model:add(Avg(7, 7, 1, 1))
model:add(nn.View(iChannels):setNumInputDims(3))
model:add(nn.Linear(iChannels, 1000))
elseif opt.dataset == 'cifar10' or opt.dataset == 'cifar100' then
local n = (depth - 2) / 6
iChannels = 16
local startChannel = 16
local Channeltemp = 16
local addChannel = alpha^(1/(3*n)) -- per-unit widening ratio; local to avoid leaking a global
print(' | PyramidNet-' .. depth .. ' CIFAR')
model:add(Convolution(3,16,3,3,1,1,1,1))
model:add(SBatchNorm(iChannels))
Channeltemp = startChannel
startChannel = startChannel * addChannel
model:add(layer(basicblock, round(startChannel), 1, 1, 1))
for i=2,n do
Channeltemp = startChannel
startChannel = startChannel * addChannel
model:add(layer(basicblock, round(startChannel), 1, 1, 1))
end
Channeltemp = startChannel
startChannel = startChannel * addChannel
model:add(layer(basicblock, round(startChannel), 1, 2, 1))
for i=2,n do
Channeltemp = startChannel
startChannel = startChannel * addChannel
model:add(layer(basicblock, round(startChannel), 1, 1, 1))
end
Channeltemp = startChannel
startChannel = startChannel * addChannel
model:add(layer(basicblock, round(startChannel), 1, 2, 1))
for i=2,n do
Channeltemp = startChannel
startChannel = startChannel * addChannel
model:add(layer(basicblock, round(startChannel), 1, 1, 1))
end
model:add(nn.Copy(nil, nil, true))
model:add(SBatchNorm(iChannels))
model:add(ReLU(true))
model:add(Avg(8, 8, 1, 1))
model:add(nn.View(iChannels):setNumInputDims(3))
if opt.dataset == 'cifar10' then
model:add(nn.Linear(iChannels, 10))
elseif opt.dataset == 'cifar100' then
model:add(nn.Linear(iChannels, 100))
end
else
error('invalid dataset: ' .. opt.dataset)
end
local function ConvInit(name)
for k,v in pairs(model:findModules(name)) do
local n = v.kW*v.kH*v.nOutputPlane
v.weight:normal(0,math.sqrt(2/n))
if cudnn.version >= 4000 then
v.bias = nil
v.gradBias = nil
else
v.bias:zero()
end
end
end
local function BNInit(name)
for k,v in pairs(model:findModules(name)) do
v.weight:fill(1)
v.bias:zero()
end
end
ConvInit('cudnn.SpatialConvolution')
ConvInit('nn.SpatialConvolution')
BNInit('fbnn.SpatialBatchNormalization')
BNInit('cudnn.SpatialBatchNormalization')
BNInit('nn.SpatialBatchNormalization')
for k,v in pairs(model:findModules('nn.Linear')) do
v.bias:zero()
end
model:cuda()
if opt.cudnn == 'deterministic' then
model:apply(function(m)
if m.setMode then m:setMode(1,1,1) end
end)
end
model:get(1).gradInput = nil
return model
end
return createModel