Repository: pathak22/unsupervised-video
Branch: master
Commit: 10780c7a3cb9
Files: 11
Total size: 32.2 KB

Directory structure:
unsupervised-video/

├── .gitignore
├── LICENSE
├── README.md
├── image_transform_layer.py
├── models/
│   ├── download_caffe_models.sh
│   ├── download_torch_models.sh
│   └── download_torch_motion_model.sh
└── motionseg/
    ├── DeepMaskAlexNet.lua
    ├── SpatialSymmetricPadding.lua
    ├── load_motionmodel.lua
    └── utilsModel.lua

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
models/*.tar.gz
models/caffemodels/
models/torchmodels/

================================================
FILE: LICENSE
================================================
MIT License

Copyright (c) 2017 Deepak Pathak

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.

================================================
FILE: README.md
================================================
## Learning Features by Watching Objects Move ##
In CVPR 2017. [[Project Website]](http://cs.berkeley.edu/~pathak/unsupervised_video/).

[Deepak Pathak](https://people.eecs.berkeley.edu/~pathak/), [Ross Girshick](http://www.rossgirshick.info/), [Piotr Doll&aacute;r](https://pdollar.github.io/), [Trevor Darrell](https://people.eecs.berkeley.edu/~trevor/), [Bharath Hariharan](http://home.bharathh.info/)<br/>
University of California, Berkeley<br/>
Facebook AI Research (FAIR)<br/>

<img src="images/overview.jpg" width="550">

This is the code for our [CVPR 2017 paper on Unsupervised Learning using unlabeled videos](http://cs.berkeley.edu/~pathak/unsupervised_video/). This repository contains models trained by the unsupervised motion grouping algorithm both in Caffe and Torch. If you find this work useful in your research, please cite:

    @inproceedings{pathakCVPR17learning,
        Author = {Pathak, Deepak and Girshick, Ross and Doll\'{a}r,
                  Piotr and Darrell, Trevor and Hariharan, Bharath},
        Title = {Learning Features by Watching Objects Move},
        Booktitle = {Computer Vision and Pattern Recognition ({CVPR})},
        Year = {2017}
    }

### 1) Fetching Models for Unsupervised Transfer
The models below contain only the layers that are used for unsupervised transfer learning. For the full model that includes motion segmentation, see the next section.

1. Clone the repository
  ```Shell
  git clone https://github.com/pathak22/unsupervised-video.git
  ```

2. Fetch caffe models
  ```Shell
  cd unsupervised-video/
  bash ./models/download_caffe_models.sh
  # This will populate the `./models/` folder with trained models.
  ```
  The models were initially trained in Torch and then converted to Caffe. Hence, please include the pycaffe-based `image_transform_layer.py` in your folder; it converts the scale and mean of the input image as needed (see the sketch after this list).

3. Fetch torch models
  ```Shell
  cd unsupervised-video/
  bash ./models/download_torch_models.sh
  # This will populate the `./models/` folder with trained models.
  ```
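
As a quick sanity check of step 2, here is a minimal pycaffe sketch of loading one of the converted models together with the transform layer. The prototxt and caffemodel file names are placeholders (the actual names depend on what the tarball extracts into `./models/caffemodels/`), and the input blob is assumed to be named `data_caffe` as in the layer's docstring.

```Python
# Minimal, hedged sketch: the file names below are hypothetical placeholders.
import numpy as np
import caffe  # image_transform_layer.py must be importable (same dir or PYTHONPATH)

caffe.set_mode_cpu()
net = caffe.Net('deploy.prototxt',          # hypothetical deploy prototxt using the Python layer
                'unsup_video.caffemodel',   # hypothetical converted weights
                caffe.TEST)

# Feed a raw BGR image in [0, 255]; the 'data_xform' Python layer rescales and
# normalizes it to the fb.resnet.torch convention before the conv layers see it.
img = np.random.randint(0, 256, size=(1, 3, 227, 227)).astype(np.float32)
net.blobs['data_caffe'].reshape(*img.shape)
net.blobs['data_caffe'].data[...] = img
out = net.forward()
```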

### 2) Fetching Motion Segmentation models
Follow the instructions below to download the full motion segmentation model trained on the automatically selected 205K videos from YFCC100m. I trained it in Torch, but you can train your own model from the full data [available here](https://people.eecs.berkeley.edu/~pathak/unsupervised_video/index.html#data) in any deep learning package using the training details from the paper.
```Shell
cd unsupervised-video/
bash ./models/download_torch_motion_model.sh
# This will populate the `./models/` folder with trained model.

cd motionseg/
th load_motionmodel.lua -input ../models/motionSegmenter_fullModel.t7
```

### 3) Additional Software Packages

We are releasing software packages that were developed in this project but could be generally useful for computer vision research. If you find them useful, please consider citing our work. These include:

(a) <a href='https://github.com/pathak22/videoseg'><b>uNLC [github]</b></a>: Implementation of an unsupervised bottom-up video segmentation algorithm, an unsupervised adaptation of the NLC algorithm by Faktor and Irani, BMVC 2014. For additional details, see Section 5.1 in the <a href="http://cs.berkeley.edu/~pathak/unsupervised_video/">paper</a>.<br/><br/>
(b) <a href='https://github.com/pathak22/pyflow'><b>PyFlow [github]</b></a>: A Python wrapper around Ce Liu's <a href="http://people.csail.mit.edu/celiu/OpticalFlow/" target="_blank">C++ implementation</a> of Coarse2Fine optical flow. It is used inside the uNLC implementation and is also generally useful as an independent package.


================================================
FILE: image_transform_layer.py
================================================
"""
Transform images for compatibility with models trained with
https://github.com/facebook/fb.resnet.torch.

Usage in model prototxt:
layer {
  name: 'data_xform'
  type: 'Python'
  bottom: 'data_caffe'
  top: 'data'
  python_param {
    module: 'image_transform_layer'
    layer: 'TorchImageTransformLayer'
  }
}
"""

import caffe
import numpy as np


class TorchImageTransformLayer(caffe.Layer):
    def setup(self, bottom, top):
        # (1, 3, 1, 1) shaped arrays
        self.PIXEL_MEANS = \
            np.array([[[[0.485]],
                       [[0.456]],
                       [[0.406]]]])
        self.PIXEL_STDS = \
            np.array([[[[0.229]],
                       [[0.224]],
                       [[0.225]]]])
        top[0].reshape(*(bottom[0].shape))

    def forward(self, bottom, top):
        ims = bottom[0].data
        # 1. Permute BGR to RGB and normalize to [0, 1]
        ims = ims[:, [2, 1, 0], :, :] / 255.0
        # 2. Remove channel means
        ims -= self.PIXEL_MEANS
        # 3. Standardize channels
        ims /= self.PIXEL_STDS
        top[0].reshape(*(ims.shape))
        top[0].data[...] = ims

    def backward(self, top, propagate_down, bottom):
        """This layer does not propagate gradients."""
        pass

    def reshape(self, bottom, top):
        """Reshaping happens during the call to forward."""
        pass
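
The forward pass above is just an arithmetic transform; as a standalone reference, the same computation in plain NumPy (outside caffe) looks roughly like this:

```Python
import numpy as np

# Same constants as the layer above, shaped for broadcasting over (N, 3, H, W).
PIXEL_MEANS = np.array([0.485, 0.456, 0.406]).reshape(1, 3, 1, 1)
PIXEL_STDS = np.array([0.229, 0.224, 0.225]).reshape(1, 3, 1, 1)

def torch_transform(ims_bgr_0_255):
    """BGR batch in [0, 255] -> RGB, mean/std-normalized (fb.resnet.torch style)."""
    ims = ims_bgr_0_255[:, [2, 1, 0], :, :] / 255.0  # BGR -> RGB, scale to [0, 1]
    return (ims - PIXEL_MEANS) / PIXEL_STDS

batch = np.random.randint(0, 256, size=(2, 3, 227, 227)).astype(np.float64)
print(torch_transform(batch).shape)  # (2, 3, 227, 227)
```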


================================================
FILE: models/download_caffe_models.sh
================================================
#!/bin/bash

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )/" && pwd )"
cd $DIR

FILE=caffemodels.tar.gz
URL=https://dl.fbaipublicfiles.com/unsupervised-video/$FILE
CHECKSUM=29e4a50f4fc77b0563a201f28577a895

if [ ! -f $FILE ]; then
  echo "Downloading the unsupervised video caffemodels (829MB)..."
  wget $URL -O $FILE
  echo "Unzipping..."
  tar zxvf $FILE
  echo "Downloading Done."
else
  echo "File already exists. Checking md5..."
fi

os=`uname -s`
if [ "$os" = "Linux" ]; then
  checksum=`md5sum $FILE | awk '{ print $1 }'`
elif [ "$os" = "Darwin" ]; then
  checksum=`cat $FILE | md5`
elif [ "$os" = "SunOS" ]; then
  checksum=`digest -a md5 -v $FILE | awk '{ print $4 }'`
fi
if [ "$checksum" = "$CHECKSUM" ]; then
  echo "Checksum is correct. File was correctly downloaded."
  exit 0
else
  echo "Checksum is incorrect. DELETE and download again."
fi
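
The per-OS md5 branches above (and in the other two download scripts) can also be checked portably; here is a small Python sketch of the same verification, using the checksum constant from this script:

```Python
# Portable md5 verification sketch, equivalent to the per-OS branches above.
import hashlib

EXPECTED = '29e4a50f4fc77b0563a201f28577a895'  # CHECKSUM for caffemodels.tar.gz

def md5sum(path, chunk_size=1 << 20):
    """Stream the file through md5 so large tarballs do not need to fit in memory."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        for block in iter(lambda: f.read(chunk_size), b''):
            digest.update(block)
    return digest.hexdigest()

if md5sum('caffemodels.tar.gz') == EXPECTED:
    print('Checksum is correct. File was correctly downloaded.')
else:
    print('Checksum is incorrect. DELETE and download again.')
```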


================================================
FILE: models/download_torch_models.sh
================================================
#!/bin/bash

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )/" && pwd )"
cd $DIR

FILE=torchmodels.tar.gz
URL=https://dl.fbaipublicfiles.com/unsupervised-video/$FILE
CHECKSUM=6ead77d7b387b51426ccc5d3c95f78bb

if [ ! -f $FILE ]; then
  echo "Downloading the unsupervised video torchmodels (803MB)..."
  wget $URL -O $FILE
  echo "Unzipping..."
  tar zxvf $FILE
  echo "Downloading Done."
else
  echo "File already exists. Checking md5..."
fi

os=`uname -s`
if [ "$os" = "Linux" ]; then
  checksum=`md5sum $FILE | awk '{ print $1 }'`
elif [ "$os" = "Darwin" ]; then
  checksum=`cat $FILE | md5`
elif [ "$os" = "SunOS" ]; then
  checksum=`digest -a md5 -v $FILE | awk '{ print $4 }'`
fi
if [ "$checksum" = "$CHECKSUM" ]; then
  echo "Checksum is correct. File was correctly downloaded."
  exit 0
else
  echo "Checksum is incorrect. DELETE and download again."
fi


================================================
FILE: models/download_torch_motion_model.sh
================================================
#!/bin/bash

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )/" && pwd )"
cd $DIR

FILE=torchmodels_motion.tar.gz
URL=https://dl.fbaipublicfiles.com/unsupervised-video/$FILE
CHECKSUM=497efcdf10630cf6fd83d9b367765934

if [ ! -f $FILE ]; then
  echo "Downloading the unsupervised video motion segmentation torchmodel (238MB)..."
  wget $URL -O $FILE
  echo "Unzipping..."
  tar zxvf $FILE
  echo "Downloading Done."
else
  echo "File already exists. Checking md5..."
fi

os=`uname -s`
if [ "$os" = "Linux" ]; then
  checksum=`md5sum $FILE | awk '{ print $1 }'`
elif [ "$os" = "Darwin" ]; then
  checksum=`cat $FILE | md5`
elif [ "$os" = "SunOS" ]; then
  checksum=`digest -a md5 -v $FILE | awk '{ print $4 }'`
fi
if [ "$checksum" = "$CHECKSUM" ]; then
  echo "Checksum is correct. File was correctly downloaded."
  exit 0
else
  echo "Checksum is incorrect. DELETE and download again."
fi


================================================
FILE: motionseg/DeepMaskAlexNet.lua
================================================
--[[ DeepMask model:
When initialized, it creates/loads the common trunk, the maskBranch, and either
the scoreBranch, the colorBranch, or the flowBranch.
---- deepmask class members:
-- self.trunk: the common trunk (pre-trained resnet50)
-- self.maskBranch: the mask head architecture
-- self.scoreBranch: the score head architecture
-- self.colorBranch: the colorization head architecture
-- self.flowBranch: the flow head architecture
]]

require 'nn'
require 'nnx'
require 'cunn'
require 'cudnn'
local utils = paths.dofile('utilsModel.lua')
paths.dofile('SpatialSymmetricPadding.lua')

local DeepMask,_ = torch.class('nn.DeepMask','nn.Container')

-- function: linear2conv (convert an nn.Linear into an equivalent convolution)
local function linear2conv(x)
  if torch.typename(x):find('Linear') then
    -- hard-coding for fc6 and fc7: kSz=kernelSize=inputFeatureMapSize
    local kSz = x.weight:size(2) > 5000 and 6 or 1

    local nInp = x.weight:size(2)/(kSz*kSz)
    local nOut = x.weight:size(1)
    local w = torch.reshape(x.weight,nOut,nInp,kSz,kSz)
    local y = cudnn.SpatialConvolution(nInp,nOut,kSz,kSz,1,1,0,0)
    y.weight:copy(w)
    y.gradWeight:copy(w)
    if x.bias~=nil then
      y.bias:copy(x.bias)
      y.gradBias:copy(x.gradBias)
    end
    return y
  elseif torch.typename(x):find('cudnn.BatchNormalization') then
     x.nDim = 4
     return x
  else
    return x
  end
end

--------------------------------------------------------------------------------
-- function: constructor
function DeepMask:__init(config)
   self.color = config.color
   self.flow = config.flow
   if config.noFC then
      print('| create AlexNet (w/o FCs) Trunk')
   else
      print('| create AlexNet (including FCs) Trunk')
   end
   if config.symmPad then
      print('| using symmetric padding')
   else
      print('| no symmetric padding')
   end
   if config.centralCrop then
      print('| using central cropping')
   else
      print('| no central cropping')
   end
   if config.bottleneck then
      print('| using bottleneck')
   else
      print('| no bottleneck')
   end

   -- create common trunk
   self:createTrunk(config)
   local npt = 0
   local p1  = self.trunk:parameters()
   for k,v in pairs(p1) do npt = npt+v:nElement() end
   print(string.format('| number of parameters trunk: %d', npt))

   if self.flow then
         -- create flow head
         self:createFlowBranch(config)

         local p5, npf = self.flowBranch:parameters(), 0
         for k,v in pairs(p5) do npf = npf+v:nElement() end
         print(string.format('| number of parameters flow branch: %d', npf))
         print(string.format('| number of parameters total: %d', npt+npf))
         return
   end

   -- create mask head
   self:createMaskBranch(config)
   local npm = 0
   local p2  = self.maskBranch:parameters()
   for k,v in pairs(p2) do npm = npm+v:nElement() end
   print(string.format('| number of parameters mask branch: %d', npm))

   if self.color then
      -- create colorization head
      self:createColorBranch(config)

      local p4, npc = self.colorBranch:parameters(), 0
      for k,v in pairs(p4) do npc = npc+v:nElement() end
      print(string.format('| number of parameters color branch: %d', npc))
      print(string.format('| number of parameters total: %d', npt+npm+npc))
   else
      -- create score head
      self:createScoreBranch(config)

      local p3, nps = self.scoreBranch:parameters(), 0
      for k,v in pairs(p3) do nps = nps+v:nElement() end
      print(string.format('| number of parameters score branch: %d', nps))
      print(string.format('| number of parameters total: %d', npt+nps+npm))
   end
end

--------------------------------------------------------------------------------
-- function: create common trunk
function DeepMask:createTrunk(config)
   -- size of feature maps at end of trunk
   if config.padAlexNet then
      if config.iSz==180 then
         -- self.fSz = config.noFC and 12 or 5  -- alexnet_padded w/o dilation
         self.fSz = 12  -- alexnet_padded w/ dilation
      else
         print('Unknown size setting !! Cannot create AlexNet trunk')
         os.exit()
      end
   else
      -- iSz=227 ; for w/ FC
      -- iSz=179 ; for w/o FC
      if config.iSz==160 then
         self.fSz = config.noFC and 8 or -1
      elseif config.iSz==179 then
         self.fSz = config.noFC and 10 or -1
      elseif config.iSz==227 then
         self.fSz = config.noFC and 13 or 1
      else
         print('Unknown size setting !! Cannot create AlexNet trunk')
         os.exit()
      end
   end
   self.channels = config.noFC and 128 or 4096
   self.bottleneck = self.channels*self.fSz*self.fSz

   -- load trunk
   local trunk
   print('    | creating trunk:')
   if #config.useImagenet > 0 then
     print(string.format('    | using Imagenet pre-trained AlexNet: %s',
        config.useImagenet))
     trunk = torch.load(config.useImagenet)
     -- Format of sgross's old fb.resnet training code
     if trunk.state ~= nil then
       trunk = trunk.state.network
     end
     -- remove DataParallelTable
     if torch.type(trunk) == 'nn.DataParallelTable' then
       trunk = trunk:get(1)
     end
     if config.useBN then
       print('    | keeping BatchNorm in pre-trained model (if present)')
     else
       print('    | fixing BatchNorm in pre-trained model (if present)')
       utils.BNtoFixed(trunk, true)
     end
   elseif config.useBN then
     print('    | using AlexNet with BatchNorm from scratch !')
     local alexnet = paths.dofile('./models/alexnetbn.lua')
     trunk = alexnet()
   else
     print('    | using AlexNet without BatchNorm from scratch !')
     local alexnet = config.padAlexNet and paths.dofile(
         './models/alexnet_padded.lua') or paths.dofile('./models/alexnet.lua')
     trunk = alexnet()
   end
  --  print('    | loaded trunk model:')
  --  print(trunk)

   -- remove fc8
   trunk:remove();

   if config.noFC then
      -- remove fc7
      trunk:remove();trunk:remove();
      if torch.typename(trunk.modules[#trunk.modules]):find('BatchNorm') then
        trunk:remove();
      end
      trunk:remove();
      -- remove fc6
      trunk:remove();trunk:remove();
      if torch.typename(trunk.modules[#trunk.modules]):find('BatchNorm') then
        trunk:remove();
      end
      trunk:remove();
      if torch.typename(trunk.modules[#trunk.modules]):find('View') then
        trunk:remove();
      end
      -- remove pool5
      trunk:remove();

      -- crop central pad : see DataSamplerCoco.wSz
      if config.centralCrop then
         trunk:add(nn.SpatialZeroPadding(-1,-1,-1,-1))
      end

      -- add common extra layers
      trunk:add(cudnn.SpatialConvolution(256,128,1,1,1,1))
      if config.useBN then
        trunk:add(cudnn.SpatialBatchNormalization(128))
      end
      trunk:add(nn.ReLU(true))
   else
      if #config.useImagenet > 0 then
        print('    | FC to Conv conversion in pre-trained model')
        local startFCLayer = 16
        if config.useBN then
          startFCLayer = 19
        end
        local j=startFCLayer
        for i=startFCLayer,#trunk.modules do
          if not torch.typename(trunk.modules[i]):find('View') then
           trunk.modules[j] = linear2conv(trunk.modules[i])
           j=j+1
          end
        end
        for j=j,#trunk.modules do
          trunk:remove()
        end
      end

      -- crop central pad : see DataSamplerCoco.wSz
      if config.centralCrop then
         trunk:add(nn.SpatialZeroPadding(-1,-1,-1,-1))
      end
   end
   -- trunk:add(nn.View(config.batch,self.bottleneck))

   -- low-rank bottleneck
   if config.bottleneck then
      trunk:add(nn.Linear(self.bottleneck,512))
      if config.useBN then
         trunk:add(cudnn.BatchNormalization(512))
      end
      self.bottleneck = 512
    end

   -- mirrorPadding
   if config.symmPad then
      utils.updatePadding(trunk, nn.SpatialSymmetricPadding)
   end

   self.trunk = trunk:cuda()

   print('    | finalized trunk model:')
   print(trunk)
   return trunk
end

--------------------------------------------------------------------------------
-- function: create mask branch
function DeepMask:createMaskBranch(config)
   local maskBranch = nn.Sequential()

   -- maskBranch
   if not config.bottleneck then
      maskBranch:add(nn.View(config.batch,self.bottleneck))
   end
   maskBranch:add(nn.Linear(self.bottleneck,config.oSz*config.oSz))
   self.maskBranch = nn.Sequential():add(maskBranch:cuda())

   -- upsampling layer
   if config.gSz > config.oSz then
      local upSample = nn.Sequential()
      upSample:add(nn.Copy('torch.CudaTensor','torch.FloatTensor'))
      upSample:add(nn.View(config.batch,config.oSz,config.oSz))
      upSample:add(nn.SpatialReSamplingEx{owidth=config.gSz,oheight=config.gSz,
         mode='bilinear'})
      upSample:add(nn.View(config.batch,config.gSz*config.gSz))
      upSample:add(nn.Copy('torch.FloatTensor','torch.CudaTensor'))
      self.maskBranch:add(upSample)
   end

   print('    | finalized mask model:')
   print(self.maskBranch)
   return self.maskBranch
end

--------------------------------------------------------------------------------
-- function: create score branch
function DeepMask:createScoreBranch(config)
   local scoreBranch = nn.Sequential()
   if not config.bottleneck then
      scoreBranch:add(nn.View(config.batch,self.bottleneck))
   end
   scoreBranch:add(nn.Dropout(.5))
   scoreBranch:add(nn.Linear(self.bottleneck,1024))
   if config.useBN then
     scoreBranch:add(cudnn.BatchNormalization(1024))
   end
   scoreBranch:add(nn.Threshold(0, 1e-6))

   scoreBranch:add(nn.Dropout(.5))
   scoreBranch:add(nn.Linear(1024,1))

   self.scoreBranch = scoreBranch:cuda()
   print('    | finalized score model:')
   print(self.scoreBranch)
   return self.scoreBranch
end

--------------------------------------------------------------------------------
-- function: create colorization branch
function DeepMask:createColorBranch(config)
   if config.bottleneck then
      print('config.bottleneck in trunk is not supported with Color Task !!')
      os.exit()
   end
   local colorBranch = nn.Sequential()
   colorBranch:add(nn.SpatialFullConvolution(self.channels,256,4,4,2,2,1,1))
   colorBranch:add(nn.ReLU(true))
   colorBranch:add(cudnn.SpatialConvolution(256,313,3,3,1,1,1,1))
   colorBranch:add(nn.SpatialUpSamplingBilinear({oheight=config.cgSz,
                                                   owidth=config.cgSz}))
   self.colorBranch = colorBranch:cuda()
   print('    | finalized color model:')
   print(self.colorBranch)
   return self.colorBranch
end

--------------------------------------------------------------------------------
-- function: create flow branch
function DeepMask:createFlowBranch(config)
   if config.bottleneck then
      print('config.bottleneck in trunk is not supported with Flow Task !!')
      os.exit()
   end
   local flowBranch = nn.Sequential()
   flowBranch:add(cudnn.SpatialConvolution(self.channels,
                                             config.numCl,3,3,1,1,1,1))
   -- upsample if fgSz > 12 (e.g. 100)
   -- flowBranch:add(nn.SpatialUpSamplingBilinear({oheight=config.fgSz,
   --                                                 owidth=config.fgSz}))
   self.flowBranch = flowBranch:cuda()
   print('    | finalized flow model:')
   print(self.flowBranch)
   return self.flowBranch
end

--------------------------------------------------------------------------------
-- function: training
function DeepMask:training()
   self.trunk:training()
   if self.flow then
      self.flowBranch:training()
      return
   end
   self.maskBranch:training()
   if self.color then
      self.colorBranch:training()
   else
      self.scoreBranch:training()
   end
end

--------------------------------------------------------------------------------
-- function: evaluate
function DeepMask:evaluate()
   self.trunk:evaluate()
   if self.flow then
      self.flowBranch:evaluate()
      return
   end
   self.maskBranch:evaluate()
   if self.color then
      self.colorBranch:evaluate()
   else
      self.scoreBranch:evaluate()
   end
end

--------------------------------------------------------------------------------
-- function: to cuda
function DeepMask:cuda()
   self.trunk:cuda()
   if self.flow then
      self.flowBranch:cuda()
      return
   end
   self.maskBranch:cuda()
   if self.color then
      self.colorBranch:cuda()
   else
      self.scoreBranch:cuda()
   end
end

--------------------------------------------------------------------------------
-- function: to float
function DeepMask:float()
   self.trunk:float()
   if self.flow then
      self.flowBranch:float()
      return
   end
   self.maskBranch:float()
   if self.color then
      self.colorBranch:float()
   else
      self.scoreBranch:float()
   end
end

--------------------------------------------------------------------------------
-- function: inference (used for full scene inference)
function DeepMask:inference()
   self:cuda()
   utils.linear2convTrunk(self.trunk,self.fSz)
   self.trunk:evaluate()
   self.trunk:forward(torch.CudaTensor(1,3,800,800))
   if self.flow then
      utils.linear2convHead(self.flowBranch)
      self.flowBranch:evaluate()
      self.flowBranch:forward(torch.CudaTensor(1,512,300,300))
      return
   end

   utils.linear2convHead(self.maskBranch.modules[1])
   self.maskBranch = self.maskBranch.modules[1]
   self.maskBranch:evaluate()
   self.maskBranch:forward(torch.CudaTensor(1,512,300,300))

   if self.color then
      utils.linear2convHead(self.colorBranch)
      self.colorBranch:evaluate()
      self.colorBranch:forward(torch.CudaTensor(1,512,300,300))
   else
      utils.linear2convHead(self.scoreBranch)
      self.scoreBranch:evaluate()
      self.scoreBranch:forward(torch.CudaTensor(1,512,300,300))
   end
end

--------------------------------------------------------------------------------
-- function: clone
function DeepMask:clone(...)
   local f = torch.MemoryFile("rw"):binary()
   f:writeObject(self)
   f:seek(1)
   local clone = f:readObject()
   f:close()

   if select('#',...) > 0 then
      clone.trunk:share(self.trunk,...)
      if self.flow then
         clone.flowBranch:share(self.flowBranch,...)
         return clone
      end
      clone.maskBranch:share(self.maskBranch,...)
      if self.color then
         clone.colorBranch:share(self.colorBranch,...)
      else
         clone.scoreBranch:share(self.scoreBranch,...)
      end
   end

   return clone
end

return DeepMask
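
A note on the `linear2conv` helper above (and `linear2convTrunk`/`linear2convHead` in `utilsModel.lua`): a fully connected layer applied to a flattened k x k feature map is equivalent to a k x k convolution whose kernel is the reshaped FC weight matrix. A small NumPy sketch of that equivalence (the sizes are illustrative only; fc6 uses k = 6 in the code above):

```Python
import numpy as np

n_inp, n_out, k = 4, 8, 6                    # illustrative sizes (fc6 uses k = 6)
W = np.random.randn(n_out, n_inp * k * k)    # nn.Linear weight: (nOut, nInp*k*k)
b = np.random.randn(n_out)
x = np.random.randn(n_inp, k, k)             # a single k x k feature map (C, H, W)

# Fully connected output on the flattened (channel-major) input
fc_out = W @ x.reshape(-1) + b

# The same weights viewed as a conv kernel (nOut, nInp, k, k) and applied at the
# single valid spatial position, i.e. the convolution covers the whole map.
W_conv = W.reshape(n_out, n_inp, k, k)
conv_out = np.tensordot(W_conv, x, axes=([1, 2, 3], [0, 1, 2])) + b

print(np.allclose(fc_out, conv_out))  # True
```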


================================================
FILE: motionseg/SpatialSymmetricPadding.lua
================================================
--[[----------------------------------------------------------------------------
Copyright (c) 2016-present, Facebook, Inc. All rights reserved.
This source code is licensed under the BSD-style license found in the
LICENSE file in the root directory of this source tree. An additional grant
of patent rights can be found in the PATENTS file in the same directory.

SpatialSymmetricPadding module

The forward(A) pads the input array A with mirror reflections of itself.
It is the same function as Matlab padarray(A, padsize, 'symmetric').
The updateGradInput(input, gradOutput) is inherited from nn.SpatialZeroPadding,
where the padded region is treated as constant and the gradients are not
accumulated in the backward pass.
------------------------------------------------------------------------------]]

local SpatialSymmetricPadding, parent =
  torch.class('nn.SpatialSymmetricPadding', 'nn.SpatialZeroPadding')

function SpatialSymmetricPadding:__init(pad_l, pad_r, pad_t, pad_b)
   parent.__init(self, pad_l, pad_r, pad_t, pad_b)
end

function SpatialSymmetricPadding:updateOutput(input)
  assert(input:dim()==4, "only Dimension=4 implemented")
  -- sizes
  local h = input:size(3) + self.pad_t + self.pad_b
  local w = input:size(4) + self.pad_l + self.pad_r
  if w < 1 or h < 1 then error('input is too small') end
  self.output:resize(input:size(1), input:size(2), h, w)
  self.output:zero()
  -- crop input if necessary
  local c_input = input
  if self.pad_t < 0 then
    c_input = c_input:narrow(3, 1 - self.pad_t, c_input:size(3) + self.pad_t)
  end
  if self.pad_b < 0 then
    c_input = c_input:narrow(3, 1, c_input:size(3) + self.pad_b)
  end
  if self.pad_l < 0 then
    c_input = c_input:narrow(4, 1 - self.pad_l, c_input:size(4) + self.pad_l)
  end
  if self.pad_r < 0 then
    c_input = c_input:narrow(4, 1, c_input:size(4) + self.pad_r)
  end
  -- crop output if necessary
  local c_output = self.output
  if self.pad_t > 0 then
    c_output = c_output:narrow(3, 1 + self.pad_t, c_output:size(3) - self.pad_t)
  end
  if self.pad_b > 0 then
    c_output = c_output:narrow(3, 1, c_output:size(3) - self.pad_b)
  end
  if self.pad_l > 0 then
    c_output = c_output:narrow(4, 1 + self.pad_l, c_output:size(4) - self.pad_l)
  end
  if self.pad_r > 0 then
    c_output = c_output:narrow(4, 1, c_output:size(4) - self.pad_r)
  end
  -- copy input to output
  c_output:copy(c_input)
  -- symmetric padding that fills in values on the padded region
  if w<2*self.pad_l or w<2*self.pad_r or h<2*self.pad_t or h<2*self.pad_b then
    error('input is too small')
  end
  for i=1,self.pad_t do
    self.output:narrow(3,self.pad_t-i+1,1):copy(
    self.output:narrow(3,i+self.pad_t,1))
  end
  for i=1,self.pad_b do
    self.output:narrow(3,self.output:size(3)-self.pad_b+i,1):copy(
    self.output:narrow(3,self.output:size(3)-self.pad_b-i+1,1))
  end
  for i=1,self.pad_l do
    self.output:narrow(4,self.pad_l-i+1,1):copy(
    self.output:narrow(4,i+self.pad_l,1))
  end
  for i=1,self.pad_r do
    self.output:narrow(4,self.output:size(4)-self.pad_r+i,1):copy(
    self.output:narrow(4,self.output:size(4)-self.pad_r-i+1,1))
  end
  return self.output
end
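
The result above matches Matlab's padarray(A, padsize, 'symmetric') and NumPy's mode='symmetric' (the border row/column is included in the reflection). A quick NumPy sketch of the same operation on a 4-D (N, C, H, W) array:

```Python
import numpy as np

x = np.arange(2 * 3 * 4 * 5, dtype=np.float32).reshape(2, 3, 4, 5)  # (N, C, H, W)
pad_l, pad_r, pad_t, pad_b = 2, 2, 1, 1

# Mirror-pad only the spatial dimensions; batch and channel stay untouched.
y = np.pad(x, ((0, 0), (0, 0), (pad_t, pad_b), (pad_l, pad_r)), mode='symmetric')
print(y.shape)  # (2, 3, 6, 9)
```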


================================================
FILE: motionseg/load_motionmodel.lua
================================================
require 'nn';
require 'cunn';
require 'cudnn';

paths.dofile('DeepMaskAlexNet.lua');
local cmd = torch.CmdLine()
cmd:text()
cmd:text('Helper script for loading model')
cmd:text()
cmd:option('-input', '', 'Path to input Torch model to be converted')
local config = cmd:parse(arg)

local model = torch.load(config.input);
print(model)
model = model:float()
model:evaluate()


================================================
FILE: motionseg/utilsModel.lua
================================================
-- utility functions for models

local utils = {}

--------------------------------------------------------------------------------
-- SpatialConstDiagonal module
-- all BN modules in resnet to be transformed into SpatialConstDiagonal
if not nn.SpatialConstDiagonal then
   local module, parent = torch.class('nn.SpatialConstDiagonal', 'nn.Module')

   function module:__init(nOutputPlane, inplace)
      parent.__init(self)
      self.a = torch.Tensor(1,nOutputPlane,1,1)
      self.b = torch.Tensor(1,nOutputPlane,1,1)
      self.inplace = inplace
      self:reset()
   end

   function module:reset()
      self.a:fill(1)
      self.b:zero()
   end

   function module:updateOutput(input)
      if self.inplace then
         self.output:set(input)
      else
         self.output:resizeAs(input):copy(input)
      end
      self.output:cmul(self.a:expandAs(input))
      self.output:add(self.b:expandAs(input))
      return self.output
   end

   function module:updateGradInput(input, gradOutput)
      if self.inplace then
         self.gradInput:set(gradOutput)
      else
         self.gradInput:resizeAs(gradOutput):copy(gradOutput)
      end
      self.gradInput:cmul(self.a:expandAs(gradOutput))
      return self.gradInput
   end
end

--------------------------------------------------------------------------------
-- function: goes over a net and recursively replaces modules
-- using callback function
local function replace(self, callback)
   local out = callback(self)
   if self.modules then
      for i=#self.modules,1,-1 do
         local m = self.modules[i]
         local mm = replace(m, callback)
         if mm then self.modules[i] = mm else self:remove(i) end
      end
   end
   return out
end

--------------------------------------------------------------------------------
-- function: replace BN layer to SpatialConstDiagonal
function utils.BNtoFixed(net, ip)
   return replace(
      net,
      function(x)
      if torch.typename(x):find'SpatialBatchNormalization' then
         local no = x.running_mean:numel()
         local y = nn.SpatialConstDiagonal(no, ip):type(x._type)
         if x.running_var then
            x.running_std = x.running_var:pow(-0.5)
         end
         y.a:copy(x.running_std)
         y.b:add(-1,x.running_mean):cmul(x.running_std)
         if x.affine then
            y.a:cmul(x.weight)
            y.b:cmul(x.weight):add(x.bias)
         end
         return y
      else
         return x
      end
   end
   )
end

--------------------------------------------------------------------------------
-- function: replace 0-padding of 3x3 conv into mirror-padding
function utils.updatePadding(net, nn_padding)
   if torch.typename(net) == "nn.Sequential" or
      torch.typename(net) == "nn.ConcatTable" then
      for i = #net.modules,1,-1 do
         local out = utils.updatePadding(net:get(i), nn_padding)
         if out ~= -1 then
            local pw, ph = out[1], out[2]
            net.modules[i] = nn.Sequential():add(nn_padding(pw,pw,ph,ph))
               :add(net.modules[i]):cuda()
         end
      end
   else
      if torch.typename(net) == "nn.SpatialConvolution" or
         torch.typename(net) == "cudnn.SpatialConvolution" then
         if (net.kW == 3 and net.kH == 3) or (net.kW==7 and net.kH==7) or
            (net.kW == 5 and net.kH == 5) then
            local pw, ph = net.padW, net.padH
            net.padW, net.padH = 0, 0
            return {pw,ph}
         end
      end
    end
   return -1
end

--------------------------------------------------------------------------------
-- function: linear2convTrunk
function utils.linear2convTrunk(net,fSz)
   return replace(
   net,
   function(x)
      if torch.typename(x):find('Linear') then
         local nInp,nOut = x.weight:size(2)/(fSz*fSz),x.weight:size(1)
         local w = torch.reshape(x.weight,nOut,nInp,fSz,fSz)
         local y = cudnn.SpatialConvolution(nInp,nOut,fSz,fSz,1,1)
         y.weight:copy(w)
         y.gradWeight:copy(w)
         y.bias:copy(x.bias)
         return y
      elseif torch.typename(x):find('cudnn.BatchNormalization') or
        torch.typename(x):find('nn.BatchNormalization') then
        --  x.nDim = 4
        --  return x
         local nOut = x.running_mean:size(1)
         local y = cudnn.SpatialBatchNormalization(nOut)
         y.weight:copy(x.weight)
         y.bias:copy(x.bias)
         y.gradWeight:copy(x.gradWeight)
         y.gradBias:copy(x.gradBias)
         y.running_mean:copy(x.running_mean)
        --  y.running_var:copy(x.running_var)
        --  y.save_mean:copy(x.save_mean)
        --  y.save_std:copy(x.save_std)
         return y
      elseif torch.typename(x):find('Threshold') then
         return cudnn.ReLU()
      elseif not torch.typename(x):find('View') and
         not torch.typename(x):find('SpatialZeroPadding') then
         return x
      end
   end
   )
end

--------------------------------------------------------------------------------
-- function: linear2convHead
function utils.linear2convHead(net)
   return replace(
   net,
   function(x)
      if torch.typename(x):find('Linear') then
         local nInp,nOut = x.weight:size(2),x.weight:size(1)
         local w = torch.reshape(x.weight,nOut,nInp,1,1)
         local y = cudnn.SpatialConvolution(nInp,nOut,1,1,1,1)
         y.weight:copy(w)
         y.gradWeight:copy(w)
         y.bias:copy(x.bias)
         return y
      elseif torch.typename(x):find('cudnn.BatchNormalization') or
          torch.typename(x):find('nn.BatchNormalization') then
          -- x.nDim = 4
          -- return x
          local nOut = x.running_mean:size(1)
          local y = cudnn.SpatialBatchNormalization(nOut)
          y.weight:copy(x.weight)
          y.bias:copy(x.bias)
          y.gradWeight:copy(x.gradWeight)
          y.gradBias:copy(x.gradBias)
          y.running_mean:copy(x.running_mean)
          -- y.running_var:copy(x.running_var)
          -- y.save_mean:copy(x.save_mean)
          -- y.save_std:copy(x.save_std)
          return y
      elseif torch.typename(x):find('Threshold') then
         return cudnn.ReLU()
      elseif not torch.typename(x):find('View') and
         not torch.typename(x):find('Copy') then
         return x
      end
   end
   )
end

return utils
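
For reference, `utils.BNtoFixed` above folds a frozen SpatialBatchNormalization into the per-channel affine `y = a * x + b` implemented by SpatialConstDiagonal. The same folding in NumPy, as a sketch (eps is ignored here, just as the Lua code uses `running_var:pow(-0.5)` directly):

```Python
import numpy as np

C = 8                                   # number of channels, illustrative
running_mean = np.random.randn(C)
running_var = np.random.rand(C) + 0.5   # keep the variance strictly positive
gamma = np.random.randn(C)              # BN affine weight
beta = np.random.randn(C)               # BN affine bias

# Fold inference-mode BN into a per-channel scale and shift.
inv_std = running_var ** -0.5
a = gamma * inv_std
b = beta - gamma * running_mean * inv_std

x = np.random.randn(2, C, 4, 4)
expand = lambda v: v[None, :, None, None]
bn_out = expand(gamma) * (x - expand(running_mean)) * expand(inv_std) + expand(beta)
folded_out = expand(a) * x + expand(b)
print(np.allclose(bn_out, folded_out))  # True
```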
