Repository: janivanecky/Depth-Estimation
Branch: master
Commit: f29082ca01da
Files: 68
Total size: 348.4 KB
Directory structure:
gitextract_fes_sm48/
├── README.md
├── dataset/
│ ├── README.txt
│ ├── test/
│ │ ├── _solarized.py
│ │ ├── _structure_classes.py
│ │ ├── convert.py
│ │ ├── create_test_lmdb.sh
│ │ ├── crop.py
│ │ └── process_test.sh
│ └── train/
│ ├── create_train_lmdb.sh
│ ├── get_train_scenes.m
│ ├── process_raw.m
│ ├── split_train_set.sh
│ ├── train_augment0.py
│ ├── train_augment1.py
│ └── train_augment2.py
├── eval_depth.py
├── get_depth.py
├── net_deploy.prototxt
├── net_train.prototxt
├── solver.prototxt
├── source/
│ ├── README.txt
│ ├── global_context_network/
│ │ ├── abs/
│ │ │ ├── net_deploy.prototxt
│ │ │ └── net_train.prototxt
│ │ ├── eval_depth.py
│ │ ├── log_abs/
│ │ │ ├── net_deploy.prototxt
│ │ │ └── net_train.prototxt
│ │ ├── norm/
│ │ │ ├── net_deploy.prototxt
│ │ │ └── net_train.prototxt
│ │ ├── sc-inv/
│ │ │ ├── net_deploy.prototxt
│ │ │ └── net_train.prototxt
│ │ ├── solver.prototxt
│ │ ├── test_depth.py
│ │ └── train.py
│ ├── gradient_network/
│ │ ├── abs/
│ │ │ ├── net_deploy.prototxt
│ │ │ └── net_train.prototxt
│ │ ├── eval_grad.py
│ │ ├── filter.prototxt
│ │ ├── norm/
│ │ │ ├── net_deploy.prototxt
│ │ │ └── net_train.prototxt
│ │ ├── solver.prototxt
│ │ ├── test_grad.py
│ │ └── train.py
│ ├── joint/
│ │ ├── architecture_A/
│ │ │ ├── net_deploy.prototxt
│ │ │ └── net_train.prototxt
│ │ ├── architecture_B/
│ │ │ ├── net_deploy.prototxt
│ │ │ └── net_train.prototxt
│ │ ├── eval_depth.py
│ │ ├── eval_grad.py
│ │ ├── filter.prototxt
│ │ ├── solver.prototxt
│ │ ├── test_depth.py
│ │ ├── test_grad.py
│ │ └── train.py
│ └── refining_network/
│ ├── abs/
│ │ ├── net_deploy.prototxt
│ │ └── net_train.prototxt
│ ├── eval_depth.py
│ ├── log_abs/
│ │ ├── net_deploy.prototxt
│ │ └── net_train.prototxt
│ ├── norm_abs/
│ │ ├── net_deploy.prototxt
│ │ └── net_train.prototxt
│ ├── norm_abs_global_only/
│ │ ├── net_deploy.prototxt
│ │ └── net_train.prototxt
│ ├── sc-inv_abs/
│ │ ├── net_deploy.prototxt
│ │ └── net_train.prototxt
│ ├── solver.prototxt
│ ├── test_depth.py
│ └── train.py
└── test_depth.py
================================================
FILE CONTENTS
================================================
================================================
FILE: README.md
================================================
# Depth Estimation by Convolutional Neural Networks
This is the repository for my master's thesis, Depth Estimation by CNNs. You can read the whole thesis
here. Below, I briefly present the solution and the results.
## Architecture:
I use an architecture similar to the one used by Eigen et al., with the difference that I also use a network that estimates
gradients of the depth map:

For the global context network I use a pretrained AlexNet, the gradient network is the convolutional part of AlexNet,
and the refining network is also fully convolutional; more details can be found in the thesis. I trained each part separately:
first the global context network and the gradient network, after which I fixed their parameters and trained the refining network.
### Normalized loss function:
For training the global context network and the refining network I wanted to use a scale-invariant loss similar to the one used by Eigen et al., but I took it a step
further and used a loss function that is scale-and-translation invariant. I would put an equation here, but it can be explained
fairly easily in words: to obtain a normalized depth map, you subtract its mean and divide by its standard deviation.
The normalized loss is then just the squared distance between the normalized output depth map and the normalized target depth map. This turned out to improve the speed of convergence significantly.
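The normalized loss described above can be sketched in a few lines of NumPy. This is my own illustration, not code from the repository; dividing by the standard deviation is what makes the loss invariant to scaling as well as translation:

```python
import numpy as np

def normalize(depth, eps=1e-8):
    # Zero-mean, unit-variance normalization of a depth map
    # (the same operation Caffe's MVN layer performs).
    return (depth - depth.mean()) / (depth.std() + eps)

def normalized_loss(pred, target):
    # Squared distance between the two normalized depth maps,
    # averaged over all pixels.
    diff = normalize(pred) - normalize(target)
    return np.mean(diff ** 2)
```

For example, `normalized_loss(3.0 * d + 2.0, d)` is (numerically) zero for any depth map `d`, which is exactly the scale-and-translation invariance.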
## Trained model
You can download the trained model here.
## Results:
I ran several experiments for the thesis; you can have a look at all of them in Chapter 5 of the thesis. Here I present just
the most significant ones. All experiments were performed on the NYU Depth v2 dataset.
### Comparison of different loss functions
I trained the refining network with different loss functions for 60 000 iterations.

From left to right: input; squared distance loss; squared distance loss in log space; scale invariant loss by Eigen et al.; normalized loss; ground truth
As you can see, the networks trained with the other loss functions produce noticeably worse outputs compared to the network trained with the normalized loss. The difference
shrinks when the network is trained longer (Eigen et al. ran training for ~1.5M iterations; here it is just 60k).
### Comparison to existing solutions
How does the model fare against existing solutions?
I compared the results of my model to the results from papers [1] and [2], both by Eigen et al.
The model with the normalized loss has trouble estimating absolute depth values, but it estimates the relative structure of the depth map fairly well.
To test this, I substituted the mean and variance of the ground truth into the output depth map; I call this the 'model with oracle'.
It achieved state-of-the-art performance on the RMSE metric at the time of writing the thesis. Keep in mind that this model merely aims to show
that a model trained with the normalized loss estimates the structure of the depth map well, regardless of the absolute depth values.
| | [1] | [2] | Proposed model | With Oracle |
| :------------- | -------------:| -----:| --------------:| -----------:|
| RMSE | 0.907 | 0.641 | 1.169 | 0.569 |

From left to right by columns: input image, ground truth; [1], proposed model; [2], model with oracle
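The 'model with oracle' trick described above amounts to re-fitting the mean and spread of the predicted depth map to the ground truth. A minimal NumPy sketch of the idea (the function name `apply_oracle` is mine, not from the repository):

```python
import numpy as np

def apply_oracle(pred, gt, eps=1e-8):
    # Normalize the prediction, then substitute the mean and
    # spread of the ground-truth depth map.
    normalized = (pred - pred.mean()) / (pred.std() + eps)
    return normalized * gt.std() + gt.mean()
```

Evaluating RMSE on `apply_oracle(pred, gt)` instead of `pred` isolates how well the network captured the relative structure of the scene, independent of absolute depth values.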
## Usage
`python test_depth.py INPUT_DIR GT_DIR OUT_DIR SNAPSHOTS_DIR [--log]`
- `INPUT_DIR` is the path to the folder containing input images
- `GT_DIR` is the path to the folder containing ground truth depth maps
- `OUT_DIR` is the path to the folder to which the output depth maps will be written
- `SNAPSHOTS_DIR` is the path to the folder containing .caffemodel files containing trained network models. All models from this folder will be evaluated.
- `--log` switch is used when the depth values that are produced by the network are in log space
### Frameworks/Libraries needed:
* Caffe
* Python2.7: caffe, scipy, scikit-image, numpy, pypng, cv2, Pillow, matplotlib
### A few notes
- input images should be named in the same way as the corresponding ground truths, except that input images have the suffix 'colors' while ground truth images have the suffix 'depth'. Note that these suffixes should precede the file extension, e.g., 'image1_colors.png' and the corresponding depth map 'image1_depth.png'
- along with each .caffemodel file, the corresponding deploy network definition file has to be placed into SNAPSHOTS_DIR, with the same name as the model file but with the extension 'prototxt' instead of 'caffemodel'
- two output folders are actually created: OUT_DIR, which contains output depths fitted onto the ground truth using MVN normalization, and OUT_DIR + '_abs', which contains the raw output depth maps
- note that you need the AlexNet caffemodel for training the global context network, the gradient network and their joint configuration. It can be downloaded here: https://github.com/BVLC/caffe/tree/master/models/bvlc_alexnet
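The file-naming convention from the notes above can be illustrated with a small helper. This is a hypothetical sketch (`pair_files` is not part of the repository):

```python
import os

def pair_files(input_dir, gt_dir):
    # Pair each 'xxx_colors.png' input image with its
    # 'xxx_depth.png' ground-truth depth map.
    pairs = []
    for name in sorted(os.listdir(input_dir)):
        base, ext = os.path.splitext(name)
        if not base.endswith('_colors'):
            continue
        depth_name = base[:-len('_colors')] + '_depth' + ext
        if os.path.exists(os.path.join(gt_dir, depth_name)):
            pairs.append((name, depth_name))
    return pairs
```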
================================================
FILE: dataset/README.txt
================================================
===================================================
Required libraries for python2.7:
===================================================
- caffe, h5py, scipy, scikit-image, numpy, pypng and joblib.
===================================================
How to process the training dataset:
===================================================
1.) Download the RAW NYU Depth v2 dataset (450 GB) from http://horatio.cs.nyu.edu/mit/silberman/nyu_depth_v2/nyu_depth_v2_raw.zip
2.) Extract the RAW dataset into a folder A (the name is not important)
3.) Download the NYU Depth v2 toolbox from http://cs.nyu.edu/~silberman/code/toolbox_nyu_depth_v2.zip
4.) Extract the scripts from the toolbox into a folder 'tools' inside folder A
5.) Run process_raw.m in folder A
6.) Download the labeled NYU Depth v2 dataset from http://horatio.cs.nyu.edu/mit/silberman/nyu_depth_v2/nyu_depth_v2_labeled.mat
7.) Download splits.mat containing the official train/test split from http://horatio.cs.nyu.edu/mit/silberman/indoor_seg_sup/splits.mat
8.) Make sure that the labeled dataset and splits.mat are in the same folder, let's call it folder B
9.) Run get_train_scenes.m in folder B
10.) Run split_train_set.sh in folder B and pass it a single argument, the path to folder A ('......./path/to/folder/A')
11.) Run the scripts train_augment0.py, train_augment1.py and train_augment2.py in folder B
12.) Run create_train_lmdb.sh in folder B and pass it the path to your caffe folder as an argument
13.) You should now have the folders 'train_raw0_lmdb' (dataset version Data0), 'train_raw1_lmdb' (dataset version Data1) and 'train_raw2_lmdb' (dataset version Data2) in folder B
*Note: all referenced scripts can be found in the folder 'train'
===================================================
How to process the testing dataset:
===================================================
1.) Download the labeled NYU Depth v2 dataset from http://horatio.cs.nyu.edu/mit/silberman/nyu_depth_v2/nyu_depth_v2_labeled.mat
2.) Download splits.mat containing the official train/test split from http://horatio.cs.nyu.edu/mit/silberman/indoor_seg_sup/splits.mat
3.) Place all downloaded files into a single folder
4.) Run the script process_test.sh
5.) Run create_test_lmdb.sh and pass it the path to your caffe folder as an argument
6.) You should now have a folder 'test_lmdb' in your working directory
*Note: all referenced scripts can be found in folder 'test'
*Note2: files crop.py, _structure_classes.py, _solarized.py come from https://github.com/deeplearningais/curfil/wiki/Training-and-Prediction-with-the-NYU-Depth-v2-Dataset
================================================
FILE: dataset/test/_solarized.py
================================================
#######################################################################################
# The MIT License
# Copyright (c) 2014 Hannes Schulz, University of Bonn
# Copyright (c) 2013 Benedikt Waldvogel, University of Bonn
# Copyright (c) 2008-2009 Sebastian Nowozin
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
#######################################################################################
colors = [
(0, 43, 54),
(7, 54, 66), # floor
(88, 110, 117),
(101, 123, 131),
(131, 148, 150),
(147, 161, 161), # structure
(238, 232, 213),
(253, 246, 227),
(181, 137, 0), # prop
(203, 75, 22), # furniture
(220, 50, 47),
(211, 54, 130),
(108, 113, 196),
(38, 139, 210),
(42, 161, 152),
(133, 153, 0)
]
================================================
FILE: dataset/test/_structure_classes.py
================================================
#######################################################################################
# The MIT License
# Copyright (c) 2014 Hannes Schulz, University of Bonn
# Copyright (c) 2013 Benedikt Waldvogel, University of Bonn
# Copyright (c) 2008-2009 Sebastian Nowozin
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
#######################################################################################
def get_structure_classes():
structure_classes = dict()
structure_classes['shutters'] = "structure"
structure_classes['shelving'] = "furniture";
structure_classes['leg of table'] = "furniture";
structure_classes['colunn'] = "structure"
structure_classes['scissors'] = "prop";
structure_classes['plate with bottles'] = "prop";
structure_classes['plastic container'] = "prop";
structure_classes['hanging items'] = "prop";
structure_classes['leather sitting stool'] = "furniture";
structure_classes['colors'] = "prop";
structure_classes['trible bed sofa'] = "furniture";
structure_classes['hanging board'] = "structure"
structure_classes['wall frame'] = "prop";
structure_classes['brief case'] = "prop";
structure_classes['leg of chair'] = "furniture";
structure_classes['notice board'] = "structure"
structure_classes['bathroom clearning brush'] = "prop";
structure_classes['chess set'] = "prop";
structure_classes['brush'] = "prop";
structure_classes['cabint'] = "furniture";
structure_classes['noticeboard'] = "structure"
structure_classes['headboard'] = "furniture";
structure_classes['coffee table'] = "furniture";
structure_classes['measuring cup'] = "prop";
structure_classes['bottle of ketchup'] = "prop";
structure_classes['reflection of window shutters'] = "structure"
structure_classes['air conditioner'] = "structure"
structure_classes['air duct'] = "structure"
structure_classes['air vent'] = "structure"
structure_classes['alarm clock'] = "prop";
structure_classes['album'] = "prop";
structure_classes['aluminium foil'] = "prop";
structure_classes['antenna'] = "prop";
structure_classes['apple'] = "prop";
structure_classes['ashtray'] = "prop";
structure_classes['avocado'] = "prop";
structure_classes['baby chair'] = "furniture";
structure_classes['baby gate'] = "structure"
structure_classes['back scrubber'] = "prop";
structure_classes['backpack'] = "prop";
structure_classes['bag'] = "prop";
structure_classes['bag of bagels'] = "prop";
structure_classes['bag of chips'] = "prop";
structure_classes['bag of flour'] = "prop";
structure_classes['bag of hot dog buns'] = "prop";
structure_classes['bag of oreo'] = "prop";
structure_classes['bagel'] = "prop";
structure_classes['baking dish'] = "prop";
structure_classes['ball'] = "prop";
structure_classes['balloon'] = "prop";
structure_classes['banana'] = "prop";
structure_classes['banister'] = "structure"
structure_classes['bar'] = "structure"
structure_classes['bar of soap'] = "prop";
structure_classes['barrel'] = "furniture";
structure_classes['baseball'] = "prop";
structure_classes['basket'] = "prop";
structure_classes['basketball'] = "prop";
structure_classes['basketball hoop'] = "prop";
structure_classes['bassinet'] = "furniture";
structure_classes['bathtub'] = "furniture";
structure_classes['bean bag'] = "furniture";
structure_classes['bed'] = "furniture";
structure_classes['bedding package'] = "prop";
structure_classes['beeper'] = "prop";
structure_classes['belt'] = "prop";
structure_classes['bench'] = "furniture";
structure_classes['bicycle'] = "prop";
structure_classes['bicycle helmet'] = "prop";
structure_classes['bin'] = "prop";
structure_classes['binder'] = "prop";
structure_classes['blackboard'] = "structure"
structure_classes['blanket'] = "prop";
structure_classes['blender'] = "prop";
structure_classes['blinds'] = "structure"
structure_classes['board'] = "structure"
structure_classes['book'] = "prop";
structure_classes['bookend'] = "prop";
structure_classes['bookrack'] = "furniture";
structure_classes['books'] = "prop";
structure_classes['bookshelf'] = "furniture";
structure_classes['bottle'] = "prop";
structure_classes['bottle of comet'] = "prop";
structure_classes['bottle of contact lens solution'] = "prop";
structure_classes['bottle of liquid'] = "prop";
structure_classes['bottle of perfume'] = "prop";
structure_classes['bowl'] = "prop";
structure_classes['box'] = "prop";
structure_classes['box of ziplock bags'] = "prop";
structure_classes['bread'] = "prop";
structure_classes['bread pan'] = "prop";
structure_classes['brick'] = "prop";
structure_classes['briefcase'] = "prop";
structure_classes['broom'] = "prop";
structure_classes['bucket'] = "prop";
structure_classes['bulb'] = "prop";
structure_classes['bunk bed'] = "furniture";
structure_classes['business cards'] = "prop";
structure_classes['butterfly sculpture'] = "prop";
structure_classes['cabinet'] = "furniture";
structure_classes['cable box'] = "prop";
structure_classes['cable modem'] = "prop";
structure_classes['cable rack'] = "structure"
structure_classes['cables'] = "prop";
structure_classes['cactus'] = "prop";
structure_classes['cake'] = "prop";
structure_classes['calculator'] = "prop";
structure_classes['calendar'] = "prop";
structure_classes['camera'] = "prop";
structure_classes['can'] = "prop";
structure_classes['can of food'] = "prop";
structure_classes['can opener'] = "prop";
structure_classes['candelabra'] = "prop";
structure_classes['candle'] = "prop";
structure_classes['candlestick'] = "prop";
structure_classes['cane'] = "prop";
structure_classes['canister'] = "prop";
structure_classes['cans of cat food'] = "prop";
structure_classes['cap stand'] = "prop";
structure_classes['car'] = "prop";
structure_classes['cart'] = "prop";
structure_classes['carton'] = "prop";
structure_classes['case'] = "prop";
structure_classes['casserole dish'] = "prop";
structure_classes['cat'] = "prop";
structure_classes['cat bed'] = "furniture";
structure_classes['cat cage'] = "furniture";
structure_classes['cd'] = "prop";
structure_classes['cd disc'] = "prop";
structure_classes['cd player'] = "prop";
structure_classes['ceiling'] = "structure"
structure_classes['celery'] = "prop";
structure_classes['cell phone'] = "prop";
structure_classes['cell phone charger'] = "prop";
structure_classes['centerpiece'] = "prop";
structure_classes['ceramic frog'] = "prop";
structure_classes['certificate'] = "prop";
structure_classes['chair'] = "furniture";
structure_classes['chalk eraser'] = "prop";
structure_classes['chalkboard'] = "prop";
structure_classes['chandelier'] = "structure"
structure_classes['chapstick'] = "prop";
structure_classes['charger'] = "prop";
structure_classes['charger and wire'] = "prop";
structure_classes['chart'] = "prop";
structure_classes['chart roll'] = "prop";
structure_classes['chart stand'] = "furniture";
structure_classes['charts'] = "prop";
structure_classes['chessboard'] = "prop";
structure_classes['chest'] = "furniture";
structure_classes['child carrier'] = "prop";
structure_classes['chimney'] = "structure"
structure_classes['circuit breaker box'] = "structure"
structure_classes['classroom board'] = "structure"
structure_classes['cleaner'] = "prop";
structure_classes['cleaning wipes'] = "prop";
structure_classes['clipboard'] = "prop";
structure_classes['clock'] = "prop";
structure_classes['cloth bag'] = "prop";
structure_classes['cloth drying stand'] = "furniture";
structure_classes['clothes'] = "prop";
structure_classes['clothing detergent'] = "prop";
structure_classes['clothing dryer'] = "furniture";
structure_classes['clothing drying rack'] = "furniture";
structure_classes['clothing hamper'] = "furniture";
structure_classes['clothing hanger'] = "furniture";
structure_classes['clothing iron'] = "prop";
structure_classes['clothing washer'] = "furniture";
structure_classes['coaster'] = "prop";
structure_classes['coffee bag'] = "prop";
structure_classes['coffee grinder'] = "prop";
structure_classes['coffee machine'] = "prop";
structure_classes['coffee packet'] = "prop";
structure_classes['coffee pot'] = "prop";
structure_classes['coins'] = "prop";
structure_classes['coke bottle'] = "prop";
structure_classes['collander'] = "prop";
structure_classes['cologne'] = "prop";
structure_classes['column'] = "structure"
structure_classes['comb'] = "prop";
structure_classes['comforter'] = "prop";
structure_classes['computer'] = "prop";
structure_classes['computer disk'] = "prop";
structure_classes['conch shell'] = "prop";
structure_classes['cone'] = "prop";
structure_classes['console controller'] = "prop";
structure_classes['console system'] = "prop";
structure_classes['contact lens case'] = "prop";
structure_classes['contact lens solution bottle'] = "prop";
structure_classes['container'] = "prop";
structure_classes['container of skin cream'] = "prop";
structure_classes['cooking pan'] = "prop";
structure_classes['cooking pot cover'] = "prop";
structure_classes['copper vessel'] = "prop";
structure_classes['cordless phone'] = "prop";
structure_classes['cordless telephone'] = "prop";
structure_classes['cork board'] = "prop";
structure_classes['corkscrew'] = "prop";
structure_classes['corn'] = "prop";
structure_classes['counter'] = "structure"
structure_classes['cradle'] = "furniture";
structure_classes['crate'] = "furniture";
structure_classes['crayon'] = "prop";
structure_classes['cream'] = "prop";
structure_classes['cream tube'] = "prop";
structure_classes['crib'] = "furniture";
structure_classes['crock pot'] = "prop";
structure_classes['cup'] = "prop";
structure_classes['curtain'] = "structure"
structure_classes['curtain rod'] = "structure"
structure_classes['cutting board'] = "prop";
structure_classes['cylindrical paper holder'] = "prop";
structure_classes['decanter'] = "prop";
structure_classes['decoration item'] = "prop";
structure_classes['decorative bottle'] = "prop";
structure_classes['decorative dish'] = "prop";
structure_classes['decorative item'] = "prop";
structure_classes['decorative plate'] = "prop";
structure_classes['decorative platter'] = "prop";
structure_classes['deodarent spray bottle'] = "prop";
structure_classes['deoderant'] = "prop";
structure_classes['desk'] = "furniture";
structure_classes['desk drawer'] = "furniture";
structure_classes['desk mat'] = "prop";
structure_classes['desser'] = "furniture";
structure_classes['dish'] = "prop";
structure_classes['dish brush'] = "prop";
structure_classes['dish cover'] = "prop";
structure_classes['dish rack'] = "prop";
structure_classes['dish scrubber'] = "prop";
structure_classes['dishes'] = "prop";
structure_classes['dishwasher'] = "structure"
structure_classes['display board'] = "furniture";
structure_classes['display case'] = "furniture";
structure_classes['display platter'] = "prop";
structure_classes['dog'] = "prop";
structure_classes['dog bed'] = "furniture";
structure_classes['dog bowl'] = "prop";
structure_classes['dog cage'] = "furniture";
structure_classes['dog toy'] = "prop";
structure_classes['doily'] = "furniture";
structure_classes['doll'] = "prop";
structure_classes['doll house'] = "furniture";
structure_classes['dollar bill'] = "prop";
structure_classes['dolly'] = "furniture"
structure_classes['door'] = "structure"
structure_classes['door window reflection'] = "structure"
structure_classes['door curtain'] = "structure"
structure_classes['door facing trimreflection'] = "structure"
structure_classes['door frame'] = "structure"
structure_classes['door knob'] = "prop";
structure_classes['door lock'] = "prop";
structure_classes['door way'] = "structure"
structure_classes['door way arch'] = "structure"
structure_classes['doorreflection'] = "structure"
structure_classes['drain'] = "structure"
structure_classes['drawer'] = "furniture";
structure_classes['drawer handle'] = "prop";
structure_classes['dress wire frame'] = "prop";
structure_classes['dresser'] = "furniture";
structure_classes['drum'] = "prop";
structure_classes['drying rack'] = "furniture";
structure_classes['drying stand'] = "furniture";
structure_classes['duck'] = "prop";
structure_classes['duster'] = "prop";
structure_classes['dvd'] = "prop";
structure_classes['dvd player'] = "prop";
structure_classes['dvds'] = "prop";
structure_classes['earphone'] = "prop";
structure_classes['earplugs'] = "prop";
structure_classes['educational display'] = "furniture";
structure_classes['eggplant'] = "prop";
structure_classes['eggs'] = "prop";
structure_classes['electric box'] = "structure"
structure_classes['electric mixer'] = "prop";
structure_classes['electric toothbrush'] = "prop";
structure_classes['electric toothbrush base'] = "prop";
structure_classes['electrical kettle'] = "prop";
structure_classes['electrical outlet'] = "prop";
structure_classes['electrical plug'] = "prop";
structure_classes['electronic drumset'] = "prop";
structure_classes['envelope'] = "prop";
structure_classes['envelopes'] = "prop";
structure_classes['eraser'] = "prop";
structure_classes['ethernet jack'] = "prop";
structure_classes['excercise ball'] = "prop";
structure_classes['excercise equipment'] = "furniture";
structure_classes['excercise machine'] = "furniture";
structure_classes['exit sign'] = "prop";
structure_classes['eye glasses'] = "prop";
structure_classes['eyeball plastic ball'] = "prop";
structure_classes['face wash cream'] = "prop";
structure_classes['fan'] = "prop";
structure_classes['fashion medal'] = "prop";
structure_classes['faucet'] = "prop";
structure_classes['faucet handle'] = "prop";
structure_classes['fax machine'] = "prop";
structure_classes['fiberglass case'] = "prop";
structure_classes['file'] = "prop";
structure_classes['file box'] = "furniture";
structure_classes['file container'] = "prop";
structure_classes['file holder'] = "prop";
structure_classes['file pad'] = "prop";
structure_classes['file stand'] = "furniture";
structure_classes['filing shelves'] = "furniture";
structure_classes['fire alarm'] = "prop";
structure_classes['fire extinguisher'] = "prop";
structure_classes['fireplace'] = "structure"
structure_classes['fish tank'] = "structure"
structure_classes['flag'] = "prop";
structure_classes['flashcard'] = "prop";
structure_classes['flashlight'] = "prop";
structure_classes['flask'] = "prop";
structure_classes['flask set'] = "prop";
structure_classes['flatbed scanner'] = "prop";
structure_classes['flipboard'] = "furniture";
structure_classes['floor'] = "floor";
structure_classes['floor mat'] = "prop";
structure_classes['flower'] = "prop";
structure_classes['flower basket'] = "prop";
structure_classes['flower box'] = "prop";
structure_classes['flower pot'] = "prop";
structure_classes['folder'] = "prop";
structure_classes['folders'] = "prop";
structure_classes['food processor'] = "prop";
structure_classes['food wrapped on a tray'] = "prop";
structure_classes['foosball table'] = "furniture";
structure_classes['foot rest'] = "furniture";
structure_classes['football'] = "prop";
structure_classes['fork'] = "prop";
structure_classes['framed certificate'] = "prop";
structure_classes['fruit'] = "prop";
structure_classes['fruit basket'] = "prop";
structure_classes['fruit platter'] = "prop";
structure_classes['fruit stand'] = "prop";
structure_classes['fruitplate'] = "prop";
structure_classes['frying pan'] = "prop";
structure_classes['furnace'] = "furniture";
structure_classes['furniture'] = "furniture";
structure_classes['game system'] = "prop";
structure_classes['game table'] = "prop";
structure_classes['garage door'] = "structure"
structure_classes['garbage bag'] = "prop";
structure_classes['garbage bin'] = "furniture";
structure_classes['garlic'] = "prop";
structure_classes['gate'] = "structure"
structure_classes['gift wrapping'] = "prop";
structure_classes['gift wrapping roll'] = "prop";
structure_classes['glass'] = "structure"
structure_classes['glass baking dish'] = "prop";
structure_classes['glass box'] = "prop";
structure_classes['glass container'] = "prop";
structure_classes['glass dish'] = "prop";
structure_classes['glass pane'] = "structure"
structure_classes['glass pot'] = "prop";
structure_classes['glass rack'] = "structure"
structure_classes['glass set'] = "prop";
structure_classes['glass ware'] = "prop";
structure_classes['globe'] = "prop";
structure_classes['globe stand'] = "prop";
structure_classes['glove'] = "prop";
structure_classes['gold piece'] = "prop";
structure_classes['grandfather clock'] = "furniture";
structure_classes['grapefruit'] = "prop";
structure_classes['green screen'] = "structure"
structure_classes['grill'] = "structure"
structure_classes['guitar'] = "prop";
structure_classes['guitar case'] = "prop";
structure_classes['hair brush'] = "prop";
structure_classes['hair dryer'] = "prop";
structure_classes['hamburger bun'] = "prop";
structure_classes['hammer'] = "prop";
structure_classes['hand blender'] = "prop";
structure_classes['hand sanitizer'] = "prop";
structure_classes['hand sanitizer dispenser'] = "prop";
structure_classes['hand sculpture'] = "prop";
structure_classes['hanger'] = "prop";
structure_classes['hangers'] = "prop";
structure_classes['hanging hooks'] = "prop";
structure_classes['hat'] = "prop";
structure_classes['head phone'] = "prop";
structure_classes['head phones'] = "prop";
structure_classes['headband'] = "prop";
structure_classes['headphones'] = "prop";
structure_classes['heater'] = "furniture";
structure_classes['hockey glove'] = "prop";
structure_classes['hockey stick'] = "prop";
structure_classes['hole puncher'] = "prop";
structure_classes['hookah'] = "prop";
structure_classes['hooks'] = "prop";
structure_classes['hoola hoop'] = "prop";
structure_classes['horse toy'] = "prop";
structure_classes['hot dogs'] = "prop";
structure_classes['hot water heater'] = "prop";
structure_classes['humidifier'] = "prop";
structure_classes['id card'] = "prop";
structure_classes['incense candle'] = "prop";
structure_classes['incense holder'] = "prop";
structure_classes['inkwell'] = "prop";
structure_classes['ipad'] = "prop";
structure_classes['ipod'] = "prop";
structure_classes['ipod dock'] = "prop";
structure_classes['iron box'] = "prop";
structure_classes['iron grill'] = "structure"
structure_classes['ironing board'] = "furniture";
structure_classes['jacket'] = "prop";
structure_classes['jar'] = "prop";
structure_classes['jersey'] = "prop";
structure_classes['jug'] = "prop";
structure_classes['juicer'] = "prop";
structure_classes['karate belts'] = "prop";
structure_classes['key'] = "prop";
structure_classes['keyboard'] = "prop";
structure_classes['kichen towel'] = "prop";
structure_classes['kinect'] = "prop";
structure_classes['kitchen container plastic'] = "prop";
structure_classes['kitchen island'] = "structure"
structure_classes['kitchen items'] = "prop";
structure_classes['kitchen utensil'] = "prop";
structure_classes['kitchen utensils'] = "prop";
structure_classes['kiwi'] = "prop";
structure_classes['knife'] = "prop";
structure_classes['knife rack'] = "prop";
structure_classes['knob'] = "prop";
structure_classes['knobs'] = "prop";
structure_classes['label'] = "prop";
structure_classes['ladder'] = "furniture";
structure_classes['ladel'] = "prop";
structure_classes['lamp'] = "prop";
structure_classes['laptop'] = "prop";
structure_classes['laundry basket'] = "prop";
structure_classes['laundry detergent jug'] = "prop";
structure_classes['lazy susan'] = "prop";
structure_classes['leather sofa'] = "furniture";
structure_classes['lectern'] = "furniture";
structure_classes['leg of a girl'] = "prop";
structure_classes['lego'] = "prop";
structure_classes['letter stand'] = "prop";
structure_classes['letters'] = "prop";
structure_classes['lid'] = "prop";
structure_classes['lid of jar'] = "prop";
structure_classes['life jacket'] = "prop";
structure_classes['light'] = "structure"
structure_classes['light bulb'] = "prop";
structure_classes['light switch'] = "structure"
structure_classes['light switchreflection'] = "structure"
structure_classes['lighting track'] = "structure"
structure_classes['lint comb'] = "prop";
structure_classes['lint roller'] = "prop";
structure_classes['litter box'] = "prop";
structure_classes['luggage'] = "prop";
structure_classes['luggage rack'] = "furniture";
structure_classes['lunch bag'] = "prop";
structure_classes['machine'] = "prop";
structure_classes['magazine'] = "prop";
structure_classes['magazine holder'] = "prop";
structure_classes['magic 8ball'] = "prop";
structure_classes['magnet'] = "prop";
structure_classes['mail shelf'] = "structure"
structure_classes['mailshelf'] = "structure"
structure_classes['mail tray'] = "prop";
structure_classes['makeup brush'] = "prop";
structure_classes['manilla envelope'] = "prop";
structure_classes['mantel'] = "structure"
structure_classes['map'] = "prop";
structure_classes['mask'] = "prop";
structure_classes['matchbox'] = "prop";
structure_classes['mattress'] = "furniture";
structure_classes['medal'] = "prop";
structure_classes['medicine tube'] = "prop";
structure_classes['mellon'] = "prop";
structure_classes['menorah'] = "prop";
structure_classes['mens suit'] = "prop";
structure_classes['mens tie'] = "prop";
structure_classes['mezuza'] = "prop";
structure_classes['microphone'] = "prop";
structure_classes['microphone stand'] = "prop";
structure_classes['microwave'] = "prop";
structure_classes['mirror'] = "prop";
structure_classes['model boat'] = "prop";
structure_classes['modem'] = "prop";
structure_classes['money'] = "prop";
structure_classes['monitor'] = "prop";
structure_classes['motion camera'] = "prop";
structure_classes['mouse'] = "prop";
structure_classes['mouse pad'] = "prop";
structure_classes['muffins'] = "prop";
structure_classes['mug hanger'] = "prop";
structure_classes['mug holder'] = "prop";
structure_classes['music keyboard'] = "prop";
structure_classes['music stand'] = "furniture";
structure_classes['music stereo'] = "prop";
structure_classes['nailclipper'] = "prop";
structure_classes['napkin'] = "prop";
structure_classes['napkin dispenser'] = "prop";
structure_classes['napkin holder'] = "prop";
structure_classes['necklace'] = "prop";
structure_classes['necklace holder'] = "prop";
structure_classes['night stand'] = "furniture";
structure_classes['notebook'] = "prop";
structure_classes['notecards'] = "prop";
structure_classes['oil container'] = "prop";
structure_classes['onion'] = "prop";
structure_classes['orange'] = "prop";
structure_classes['orange juicer'] = "prop";
structure_classes['orange plastic cap'] = "prop";
structure_classes['ornamental item'] = "prop";
structure_classes['ornamental plant'] = "prop";
structure_classes['ornamental pot'] = "prop";
structure_classes['ottoman'] = "furniture";
structure_classes['oven'] = "structure"
structure_classes['oven handle'] = "prop";
structure_classes['oven mitt'] = "prop";
structure_classes['package of bedroom sheets'] = "prop";
structure_classes['package of bottled water'] = "prop";
structure_classes['package of water'] = "prop";
structure_classes['pan'] = "prop";
structure_classes['paper'] = "prop";
structure_classes['paper bundle'] = "prop";
structure_classes['paper cutter'] = "prop";
structure_classes['paper holder'] = "prop";
structure_classes['paper rack'] = "prop";
structure_classes['paper towel'] = "prop";
structure_classes['paper towel dispenser'] = "prop";
structure_classes['paper towel holder'] = "prop";
structure_classes['paper tray'] = "prop";
structure_classes['paper weight'] = "prop";
structure_classes['papers'] = "prop";
structure_classes['peach'] = "prop";
structure_classes['pen'] = "prop";
structure_classes['pen box'] = "prop";
structure_classes['pen cup'] = "prop";
structure_classes['pen holder'] = "prop";
structure_classes['pen stand'] = "prop";
structure_classes['pencil'] = "prop";
structure_classes['pencil holder'] = "prop";
structure_classes['pencils pens'] = "prop";
structure_classes['penholder'] = "prop";
structure_classes['pepper'] = "prop";
structure_classes['pepper grinder'] = "prop";
structure_classes['pepper shaker'] = "prop";
structure_classes['perfume'] = "prop";
structure_classes['perfume box'] = "prop";
structure_classes['person'] = "prop";
structure_classes['personal care liquid'] = "prop";
structure_classes['phone jack'] = "structure"
structure_classes['photo'] = "prop";
structure_classes['piano'] = "furniture";
structure_classes['piano bench'] = "furniture";
structure_classes['picture'] = "prop";
structure_classes['picture of fish'] = "prop";
structure_classes['piece of wood'] = "prop";
structure_classes['pig'] = "prop";
structure_classes['pillow'] = "prop";
structure_classes['pineapple'] = "prop";
structure_classes['ping pong racquet'] = "prop";
structure_classes['ping pong table'] = "furniture";
structure_classes['pipe'] = "prop";
structure_classes['pitcher'] = "prop";
structure_classes['pizza box'] = "prop";
structure_classes['placard'] = "prop";
structure_classes['placemat'] = "prop";
structure_classes['plant'] = "prop";
structure_classes['plant pot'] = "prop";
structure_classes['plaque'] = "prop";
structure_classes['plastic bowl'] = "prop";
structure_classes['plastic box'] = "prop";
structure_classes['plastic chair'] = "prop";
structure_classes['plastic crate'] = "prop";
structure_classes['plastic cup of coffee'] = "prop";
structure_classes['plastic dish'] = "prop";
structure_classes['plastic rack'] = "prop";
structure_classes['plastic toy container'] = "prop";
structure_classes['plastic tray'] = "prop";
structure_classes['plastic tub'] = "prop";
structure_classes['plate'] = "prop";
structure_classes['platter'] = "prop";
structure_classes['playpen'] = "furniture";
structure_classes['pool sticks'] = "prop";
structure_classes['pool table'] = "furniture";
structure_classes['poster'] = "prop";
structure_classes['poster board'] = "prop";
structure_classes['poster case'] = "prop";
structure_classes['pot'] = "prop";
structure_classes['potato'] = "prop";
structure_classes['power surge'] = "prop";
structure_classes['printer'] = "prop";
structure_classes['projector'] = "prop";
structure_classes['projector screen'] = "structure"
structure_classes['pump dispenser'] = "prop";
structure_classes['puppy toy'] = "prop";
structure_classes['purse'] = "prop";
structure_classes['quill'] = "prop";
structure_classes['quilt'] = "prop";
structure_classes['radiator'] = "furniture";
structure_classes['radio'] = "prop";
structure_classes['rags'] = "prop";
structure_classes['railing'] = "structure"
structure_classes['range hood'] = "structure"
structure_classes['razor'] = "prop";
structure_classes['refridgerator'] = "furniture";
structure_classes['remote control'] = "prop";
structure_classes['rolled carpet'] = "prop";
structure_classes['rolled up rug'] = "prop";
structure_classes['room divider'] = "furniture";
structure_classes['rope'] = "prop";
structure_classes['router'] = "prop";
structure_classes['rug'] = "prop";
structure_classes['ruler'] = "prop";
structure_classes['salt and pepper'] = "prop";
structure_classes['salt container'] = "prop";
structure_classes['salt shaker'] = "prop";
structure_classes['saucer'] = "prop";
structure_classes['scale'] = "prop";
structure_classes['scarf'] = "prop";
structure_classes['scenary'] = "prop";
structure_classes['scissor'] = "prop";
structure_classes['sculpture'] = "prop";
structure_classes['security camera'] = "prop";
structure_classes['server'] = "prop";
structure_classes['serving dish'] = "prop";
structure_classes['serving platter'] = "prop";
structure_classes['serving spoon'] = "prop";
structure_classes['sewing machine'] = "prop";
structure_classes['shaver'] = "prop";
structure_classes['shaving cream'] = "prop";
structure_classes['sheet'] = "prop";
structure_classes['sheet music'] = "prop";
structure_classes['sheet of metal'] = "prop";
structure_classes['sheets'] = "prop";
structure_classes['shelves'] = "furniture";
structure_classes['shirts in hanger'] = "prop";
structure_classes['shoe'] = "prop";
structure_classes['shoe rack'] = "prop";
structure_classes['shoelace'] = "prop";
structure_classes['shofar'] = "prop";
structure_classes['shopping baskets'] = "prop";
structure_classes['shopping cart'] = "prop";
structure_classes['shorts'] = "prop";
structure_classes['shovel'] = "prop";
structure_classes['show piece'] = "prop";
structure_classes['shower area'] = "structure"
structure_classes['shower base'] = "structure"
structure_classes['shower cap'] = "prop";
structure_classes['shower curtain'] = "structure"
structure_classes['shower glass'] = "structure"
structure_classes['shower head'] = "prop";
structure_classes['shower hose'] = "prop";
structure_classes['shower knob'] = "prop";
structure_classes['shower pipe'] = "prop";
structure_classes['shower tube'] = "prop";
structure_classes['showing plate'] = "prop";
structure_classes['sifter'] = "prop";
structure_classes['sign'] = "prop";
structure_classes['sink'] = "prop";
structure_classes['sink protector'] = "prop";
structure_classes['sissors'] = "prop";
structure_classes['six pack of beer'] = "prop";
structure_classes['slide'] = "furniture";
structure_classes['soap'] = "prop";
structure_classes['soap box'] = "prop";
structure_classes['soap dish'] = "prop";
structure_classes['soap holder'] = "prop";
structure_classes['soap stand'] = "prop";
structure_classes['soap tray'] = "prop";
structure_classes['sock'] = "prop";
structure_classes['sofa'] = "furniture";
structure_classes['soft toy'] = "prop";
structure_classes['soft toy group'] = "prop";
structure_classes['spatula'] = "prop";
structure_classes['speaker'] = "prop";
structure_classes['spice bottle'] = "prop";
structure_classes['spice rack'] = "structure"
structure_classes['spice stand'] = "structure"
structure_classes['sponge'] = "prop";
structure_classes['spoon'] = "prop";
structure_classes['spoon sets'] = "prop";
structure_classes['spoon stand'] = "prop";
structure_classes['squash'] = "prop";
structure_classes['squeeze tube'] = "prop";
structure_classes['stacked bins'] = "prop";
structure_classes['stacked bins boxes'] = "prop";
structure_classes['stacked chairs'] = "furniture";
structure_classes['stacked plastic racks'] = "furniture";
structure_classes['stairs'] = "structure"
structure_classes['stamp'] = "prop";
structure_classes['stand'] = "furniture";
structure_classes['staple remover'] = "prop";
structure_classes['stapler'] = "prop";
structure_classes['steamer'] = "prop";
structure_classes['step stool'] = "prop";
structure_classes['stereo'] = "prop";
structure_classes['stick'] = "prop";
structure_classes['sticker'] = "prop";
structure_classes['sticks'] = "prop";
structure_classes['stones'] = "prop";
structure_classes['stool'] = "prop";
structure_classes['storage basket'] = "prop";
structure_classes['storage bin'] = "prop";
structure_classes['storage box'] = "prop";
structure_classes['storage chest'] = "furniture";
structure_classes['storage rack'] = "furniture";
structure_classes['storage shelvesbooks'] = "furniture";
structure_classes['storage space'] = "structure"
structure_classes['stove'] = "structure"
structure_classes['stove burner'] = "prop";
structure_classes['stroller'] = "furniture";
structure_classes['stuffed animal'] = "prop";
structure_classes['styrofoam object'] = "prop";
structure_classes['suger jar'] = "prop";
structure_classes['suitcase'] = "prop";
structure_classes['surge protect'] = "prop";
structure_classes['surge protector'] = "prop";
structure_classes['switchbox'] = "prop";
structure_classes['table'] = "furniture";
structure_classes['table runner'] = "prop";
structure_classes['tablecloth'] = "prop";
structure_classes['tag'] = "prop";
structure_classes['tape'] = "prop";
structure_classes['tape dispenser'] = "prop";
structure_classes['tea box'] = "prop";
structure_classes['tea cannister'] = "prop";
structure_classes['tea coaster'] = "prop";
structure_classes['tea kettle'] = "prop";
structure_classes['tea pot'] = "prop";
structure_classes['telephone'] = "prop";
structure_classes['telephone cord'] = "prop";
structure_classes['telescope'] = "prop";
structure_classes['television'] = "prop";
structure_classes['tennis racket'] = "prop";
structure_classes['tent'] = "furniture";
structure_classes['thermostat'] = "prop";
structure_classes['tin foil'] = "prop";
structure_classes['tissue'] = "prop";
structure_classes['tissue box'] = "prop";
structure_classes['tissue roll'] = "prop";
structure_classes['toaster'] = "prop";
structure_classes['toaster oven'] = "prop";
structure_classes['toilet'] = "furniture";
structure_classes['toilet bowl brush'] = "prop";
structure_classes['toilet brush'] = "prop";
structure_classes['toilet holder'] = "prop";
structure_classes['toilet paper'] = "prop";
structure_classes['toilet paper holder'] = "prop";
structure_classes['toilet plunger'] = "prop";
structure_classes['toiletries'] = "prop";
structure_classes['toiletries bag'] = "prop";
structure_classes['toothbrush'] = "prop";
structure_classes['toothbrush holder'] = "prop";
structure_classes['toothpaste'] = "prop";
structure_classes['toothpaste holder'] = "prop";
structure_classes['torah'] = "prop";
structure_classes['torch'] = "prop";
structure_classes['towel'] = "prop";
structure_classes['towel rod'] = "structure"
structure_classes['toy'] = "prop";
structure_classes['toy boat'] = "prop";
structure_classes['toy box'] = "prop";
structure_classes['toy car'] = "prop";
structure_classes['toy cash register'] = "prop";
structure_classes['toy chair'] = "prop";
structure_classes['toy chest'] = "prop";
structure_classes['toy cube'] = "prop";
structure_classes['toy cuboid'] = "prop";
structure_classes['toy cylinder'] = "prop";
structure_classes['toy doll'] = "prop";
structure_classes['toy horse'] = "prop";
structure_classes['toy house'] = "prop";
structure_classes['toy kitchen'] = "prop";
structure_classes['toy phone'] = "prop";
structure_classes['toy pyramid'] = "prop";
structure_classes['toy rectangle'] = "prop";
structure_classes['toy shelf'] = "prop";
structure_classes['toy sink'] = "prop";
structure_classes['toy sofa'] = "prop";
structure_classes['toy table'] = "prop";
structure_classes['toy tree'] = "prop";
structure_classes['toy triangle'] = "prop";
structure_classes['toy truck'] = "prop";
structure_classes['toy trucks'] = "prop";
structure_classes['toyhouse'] = "prop";
structure_classes['toys basket'] = "prop";
structure_classes['toys box'] = "prop";
structure_classes['toys rack'] = "furniture";
structure_classes['toys shelf'] = "furniture";
structure_classes['track light'] = "structure"
structure_classes['trampoline'] = "furniture";
structure_classes['travel bag'] = "prop";
structure_classes['tray'] = "prop";
structure_classes['treadmill'] = "furniture";
structure_classes['tree sculpture'] = "structure"
structure_classes['tricycle'] = "prop";
structure_classes['trivet'] = "prop";
structure_classes['trolly'] = "furniture";
structure_classes['trophy'] = "prop";
structure_classes['tub of tupperware'] = "prop";
structure_classes['tumbler'] = "prop";
structure_classes['tuna cans'] = "prop";
structure_classes['tupperware'] = "prop";
structure_classes['tv stand'] = "furniture";
structure_classes['typewriter'] = "prop";
structure_classes['umbrella'] = "prop";
structure_classes['unknown'] = "prop";
structure_classes['urn'] = "prop";
structure_classes['usb drive'] = "prop";
structure_classes['utensil'] = "prop";
structure_classes['utensil container'] = "prop";
structure_classes['utensils'] = "prop";
structure_classes['vacuum cleaner'] = "prop";
structure_classes['vase'] = "prop";
structure_classes['vasoline'] = "prop";
structure_classes['vegetable'] = "prop";
structure_classes['vegetable peeler'] = "prop";
structure_classes['vegetables'] = "prop";
structure_classes['ventilation'] = "structure"
structure_classes['vessel'] = "prop";
structure_classes['vessel set'] = "prop";
structure_classes['vessels'] = "prop";
structure_classes['video game'] = "prop";
structure_classes['vuvuzela'] = "prop";
structure_classes['waffle maker'] = "prop";
structure_classes['walkie talkie'] = "prop";
structure_classes['wall'] = "structure"
structure_classes['wall decoration'] = "prop";
structure_classes['wall divider'] = "furniture";
structure_classes['wall hand sanitizer dispenser'] = "prop";
structure_classes['wall stand'] = "structure"
structure_classes['wallet'] = "prop";
structure_classes['wardrobe'] = "furniture";
structure_classes['washing machine'] = "furniture";
structure_classes['watch'] = "prop";
structure_classes['water carboy'] = "prop";
structure_classes['water cooler'] = "furniture";
structure_classes['water dispenser'] = "prop";
structure_classes['water filter'] = "prop";
structure_classes['water fountain'] = "structure"
structure_classes['water heater'] = "prop";
structure_classes['water purifier'] = "prop";
structure_classes['watermellon'] = "prop";
structure_classes['webcam'] = "prop";
structure_classes['whisk'] = "prop";
structure_classes['whiteboard'] = "structure"
structure_classes['whiteboard eraser'] = "prop";
structure_classes['whiteboard marker'] = "prop";
structure_classes['wii'] = "prop";
structure_classes['window'] = "structure"
structure_classes['window box'] = "structure"
structure_classes['window cover'] = "structure"
structure_classes['window frame'] = "structure"
structure_classes['window seat'] = "structure"
structure_classes['window shelf'] = "structure"
structure_classes['wine accessory'] = "prop";
structure_classes['wine bottle'] = "prop";
structure_classes['wine glass'] = "prop";
structure_classes['wine rack'] = "prop";
structure_classes['wiping cloth'] = "prop";
structure_classes['wire'] = "prop";
structure_classes['wire basket'] = "prop";
structure_classes['wire board'] = "prop";
structure_classes['wire rack'] = "prop";
structure_classes['wire tray'] = "prop";
structure_classes['wooden container'] = "prop";
structure_classes['wooden kitchen utensils'] = "prop";
structure_classes['wooden pillar'] = "structure"
structure_classes['wooden plank'] = "prop";
structure_classes['wooden planks'] = "prop";
structure_classes['wooden toy'] = "prop";
structure_classes['wooden utensil'] = "prop";
structure_classes['wooden utensils'] = "prop";
structure_classes['wreathe'] = "prop";
structure_classes['xbox'] = "prop";
structure_classes['yarmulka'] = "prop";
structure_classes['yellow pepper'] = "prop";
structure_classes['yoga mat'] = "prop";
structure_classes['toy bottle'] = "prop";
structure_classes['lock'] = "prop";
structure_classes['iphone'] = "prop";
structure_classes['napkin ring'] = "prop";
structure_classes['bed sheets'] = "prop";
structure_classes['spot light'] = "prop";
structure_classes['mortar and pestle'] = "prop";
structure_classes['stack of plates'] = "prop";
structure_classes['suit jacket'] = "prop";
structure_classes['coat hanger'] = "prop";
structure_classes['cardboard tube'] = "prop";
structure_classes['toy bin'] = "prop";
structure_classes['roll of paper'] = "prop";
structure_classes['cardboard sheet'] = "prop";
structure_classes['pyramid'] = "prop";
structure_classes['toy plane'] = "prop";
structure_classes['bottle of soap'] = "prop";
structure_classes['box of paper'] = "prop";
structure_classes['trolley'] = "prop";
structure_classes['pool ball'] = "prop";
structure_classes['alarm'] = "prop";
structure_classes['cannister'] = "prop";
structure_classes['ping pong ball'] = "prop";
structure_classes['ping pong racket'] = "prop";
structure_classes['roll of toilet paper'] = "prop";
structure_classes['bottle of listerine'] = "prop";
structure_classes['bottle of hand wash liquid'] = "prop";
structure_classes['banana peel'] = "prop";
structure_classes['heating tray'] = "prop";
structure_classes['measuring cap'] = "prop";
structure_classes['bottle of ketcup'] = "prop";
structure_classes['handle'] = "prop";
structure_classes['lemon'] = "prop";
structure_classes['wine'] = "prop";
structure_classes['boomerang'] = "prop";
structure_classes['button'] = "prop";
structure_classes['decorative bowl'] = "prop";
structure_classes['book holder'] = "prop";
structure_classes['toy apple'] = "prop";
structure_classes['toy dog'] = "prop";
structure_classes['drawer knob'] = "prop";
structure_classes['shoe hanger'] = "prop";
structure_classes['figurine'] = "prop";
structure_classes['soccer ball'] = "prop";
structure_classes['hand weight'] = "prop";
structure_classes['sleeping bag'] = "prop";
structure_classes['trinket'] = "prop";
structure_classes['hand fan'] = "prop";
structure_classes['sculpture of the chrysler building'] = "prop";
structure_classes['sculpture of the eiffel tower'] = "prop";
structure_classes['sculpture of the empire state building'] = "prop";
structure_classes['jeans'] = "prop";
structure_classes['toy stroller'] = "prop";
structure_classes['shelf frame'] = "prop";
structure_classes['cat house'] = "prop";
structure_classes['can of beer'] = "prop";
structure_classes['lamp shade'] = "prop";
structure_classes['bracelet'] = "prop";
structure_classes['indoor fountain'] = "furniture";
structure_classes['decorative egg'] = "prop";
structure_classes['photo album'] = "prop";
structure_classes['decorative candle'] = "prop";
structure_classes['walkietalkie'] = "prop";
structure_classes['floor trim'] = "structure"
structure_classes['mini display platform'] = "prop";
structure_classes['american flag'] = "prop";
structure_classes['vhs tapes'] = "prop";
structure_classes['throw'] = "prop";
structure_classes['newspapers'] = "prop";
structure_classes['mantle'] = "structure"
structure_classes['roll of paper towels'] = "prop";
return structure_classes
================================================
FILE: dataset/test/convert.py
================================================
#!/usr/bin/env python
#######################################################################################
# The MIT License
# Copyright (c) 2014 Hannes Schulz, University of Bonn
# Copyright (c) 2013 Benedikt Waldvogel, University of Bonn
# Copyright (c) 2008-2009 Sebastian Nowozin
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
#######################################################################################
# vim: set fileencoding=utf-8 :
#
# Helper script to convert the NYU Depth v2 dataset Matlab file into a set of
# PNG images in the CURFIL dataset format.
#
# See https://github.com/deeplearningais/curfil/wiki/Training-and-Prediction-with-the-NYU-Depth-v2-Dataset
from __future__ import print_function
from joblib import Parallel, delayed
from skimage import exposure
from skimage.io import imsave
import h5py
import numpy as np
import os
import png
import scipy.io
import sys
from _structure_classes import get_structure_classes
import _solarized
def process_ground_truth(ground_truth):
    colors = dict()
    colors["structure"] = _solarized.colors[5]
    colors["prop"] = _solarized.colors[8]
    colors["furniture"] = _solarized.colors[9]
    colors["floor"] = _solarized.colors[1]
    shape = list(ground_truth.shape) + [3]
    img = np.ndarray(shape=shape, dtype=np.uint8)
    for i in xrange(shape[0]):
        for j in xrange(shape[1]):
            l = ground_truth[i, j]
            if (l == 0):
                img[i, j] = (0, 0, 0)  # background
            else:
                name = classes[names[l - 1]]
                assert name in colors, name
                img[i, j] = colors[name]
    return img
def visualize_depth_image(data):
    # work on a copy so the caller's array is not modified in place
    data = data.copy()
    data[data == 0.0] = np.nan
    maxdepth = np.nanmax(data)
    mindepth = np.nanmin(data)
    data -= mindepth
    data /= (maxdepth - mindepth)
    gray = np.zeros(list(data.shape) + [3], dtype=data.dtype)
    data = (1.0 - data)
    gray[..., :3] = np.dstack((data, data, data))
    # use a greenish color to visualize missing depth
    gray[np.isnan(data), :] = (97, 160, 123)
    gray[np.isnan(data), :] /= 255
    gray = exposure.equalize_hist(gray)
    # set alpha channel
    gray = np.dstack((gray, np.ones(data.shape[:2])))
    gray[np.isnan(data), -1] = 0.5
    return gray * 255
def convert_image(i, scene, img_depth, image, label):
    idx = int(i) + 1
    if idx in train_images:
        train_test = "training"
    else:
        assert idx in test_images, "index %d neither found in training set nor in test set" % idx
        train_test = "testing"
    folder = "%s/%s/%s" % (out_folder, train_test, scene)
    if not os.path.exists(folder):
        os.makedirs(folder)
    img_depth *= 1000.0
    png.from_array(img_depth, 'L;16').save("%s/%05d_depth.png" % (folder, i))
    depth_visualization = visualize_depth_image(img_depth)
    # workaround for a bug in the png module
    depth_visualization = depth_visualization.copy()  # makes it contiguous
    shape = depth_visualization.shape
    depth_visualization.shape = (shape[0], np.prod(shape[1:]))
    depth_image = png.from_array(depth_visualization, "RGBA;8")
    depth_image.save("%s/%05d_depth_visualization.png" % (folder, i))
    imsave("%s/%05d_colors.png" % (folder, i), image)
    ground_truth = process_ground_truth(label)
    imsave("%s/%05d_ground_truth.png" % (folder, i), ground_truth)
if __name__ == "__main__":
    if len(sys.argv) < 4:
        print("usage: %s <h5_file> <train_test_split.mat> <out_folder> [<raw_depth>] [<num_threads>]" % sys.argv[0], file=sys.stderr)
        sys.exit(0)
    h5_file = h5py.File(sys.argv[1], "r")
    # h5py is not able to open that file, but scipy is
    train_test = scipy.io.loadmat(sys.argv[2])
    out_folder = sys.argv[3]
    if len(sys.argv) >= 5:
        raw_depth = bool(int(sys.argv[4]))
    else:
        raw_depth = False
    if len(sys.argv) >= 6:
        num_threads = int(sys.argv[5])
    else:
        num_threads = -1
    test_images = set([int(x) for x in train_test["testNdxs"]])
    train_images = set([int(x) for x in train_test["trainNdxs"]])
    print("%d training images" % len(train_images))
    print("%d test images" % len(test_images))
    if raw_depth:
        print("using raw depth images")
        depth = h5_file['rawDepths']
    else:
        print("using filled depth images")
        depth = h5_file['depths']
    print("reading", sys.argv[1])
    labels = h5_file['labels']
    images = h5_file['images']
    rawDepthFilenames = [u''.join(unichr(c) for c in h5_file[obj_ref]) for obj_ref in h5_file['rawDepthFilenames'][0]]
    names = [u''.join(unichr(c) for c in h5_file[obj_ref]) for obj_ref in h5_file['names'][0]]
    scenes = [u''.join(unichr(c) for c in h5_file[obj_ref]) for obj_ref in h5_file['sceneTypes'][0]]
    rawRgbFilenames = [u''.join(unichr(c) for c in h5_file[obj_ref]) for obj_ref in h5_file['rawRgbFilenames'][0]]
    classes = get_structure_classes()
    print("processing images")
    if num_threads == 1:
        print("single-threaded mode")
        for i, image in enumerate(images):
            print("image", i + 1, "/", len(images))
            convert_image(i, scenes[i], depth[i, :, :].T, image.T, labels[i, :, :].T)
    else:
        Parallel(num_threads, 5)(delayed(convert_image)(i, scenes[i], depth[i, :, :].T, images[i, :, :].T, labels[i, :, :].T) for i in range(len(images)))
    print("finished")
================================================
FILE: dataset/test/create_test_lmdb.sh
================================================
#! /bin/bash
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
for f in test_colors/*.png
do
file=$(basename $f)
depthfile="${file/colors/depth}"
printf "$file $depthfile\n" >> list_ordered.txt
done
sort --random-sort -o list.txt list_ordered.txt
awk < list.txt '{printf $1; printf " 0\n"}' > list_color.txt
awk < list.txt '{printf $2; printf " 0\n"}' > list_depth.txt
rm list.txt
rm list_ordered.txt
mkdir -p test_lmdb
$1/build/tools/convert_imageset -resize_height 27 -resize_width 37 -gray test_depths/ list_depth.txt test_lmdb/test_depth_37x27.lmdb
$1/build/tools/convert_imageset -resize_height 54 -resize_width 74 -gray test_depths/ list_depth.txt test_lmdb/test_depth_74x54.lmdb
$1/build/tools/convert_imageset -resize_height 218 -resize_width 298 test_colors/ list_color.txt test_lmdb/test_color_298x218.lmdb
rm list_depth.txt
rm list_color.txt
================================================
FILE: dataset/test/crop.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
import sys
import PIL
from PIL import Image
import cv2
import cv
import caffe
import argparse
import os.path
import random
from random import randint
parser = argparse.ArgumentParser()
#parser.add_argument("color_folder", help="input folder")
args = parser.parse_args()
for file in os.listdir("test_data"):
    if file.endswith(".png"):
        filePath = 'test_data/' + file
        width, height = 640, 480
        newWidth, newHeight = 420, 320
        borderX = (width - newWidth) / 2
        borderY = (height - newHeight) / 2
        img = Image.open(filePath)
        img = img.crop((borderX, borderY, width - borderX, height - borderY))
        print(filePath)
        if 'depth' in file:
            depthArray = np.array(img)
            depthArray = depthArray.astype(np.float32)
            depthArray /= 65535.0
            depthArray = np.clip(depthArray, 0.0039, 1)
            depthArray *= 6.5535  # 1 - 10 meters
            depthArray *= 255
            depthArray = depthArray.astype(np.uint8)
            depthNew = Image.fromarray(depthArray)
            depthNew.save('test_depths/' + file)
        if 'colors' in file:
            img.save('test_colors/' + file)
================================================
FILE: dataset/test/process_test.sh
================================================
#! /bin/bash
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
python convert.py nyu_depth_v2_labeled.mat splits.mat out 0 1
mkdir -p test_data
find out/testing -name '*colors.png' -exec mv -t test_data {} +
find out/testing -name '*depth.png' -exec mv -t test_data {} +
mkdir -p test_colors
mkdir -p test_depths
python crop.py
================================================
FILE: dataset/train/create_train_lmdb.sh
================================================
#! /bin/bash
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
#Data 0
for f in train_colors0/*.png
do
file=$(basename $f)
depthfile="${file/rgb/depth}"
printf "$file $depthfile\n" >> list_ordered.txt
done
sort --random-sort -o list.txt list_ordered.txt
awk < list.txt '{printf $1; printf " 0\n"}' > list_color.txt
awk < list.txt '{printf $2; printf " 0\n"}' > list_depth.txt
rm list.txt
rm list_ordered.txt
mkdir -p train_raw0_lmdb
$1/build/tools/convert_imageset -resize_height 27 -resize_width 37 -gray train_depths0/ list_depth.txt train_raw0_lmdb/train_raw0_depth_37x27.lmdb
$1/build/tools/convert_imageset -resize_height 54 -resize_width 74 -gray train_depths0/ list_depth.txt train_raw0_lmdb/train_raw0_depth_74x54.lmdb
$1/build/tools/convert_imageset -resize_height 218 -resize_width 298 train_colors0/ list_color.txt train_raw0_lmdb/train_raw0_color_298x218.lmdb
rm list_color.txt
rm list_depth.txt
#Data 1
for f in train_colors1/*.png
do
file=$(basename "$f")
depthfile="${file/rgb/depth}"
printf '%s %s\n' "$file" "$depthfile" >> list_ordered.txt
done
sort --random-sort -o list.txt list_ordered.txt
awk < list.txt '{printf $1; printf " 0\n"}' > list_color.txt
awk < list.txt '{printf $2; printf " 0\n"}' > list_depth.txt
rm list.txt
rm list_ordered.txt
mkdir -p train_raw1_lmdb
$1/build/tools/convert_imageset -resize_height 27 -resize_width 37 -gray train_depths1/ list_depth.txt train_raw1_lmdb/train_raw1_depth_37x27.lmdb
$1/build/tools/convert_imageset -resize_height 54 -resize_width 74 -gray train_depths1/ list_depth.txt train_raw1_lmdb/train_raw1_depth_74x54.lmdb
$1/build/tools/convert_imageset -resize_height 218 -resize_width 298 train_colors1/ list_color.txt train_raw1_lmdb/train_raw1_color_298x218.lmdb
rm list_color.txt
rm list_depth.txt
#Data 2
for f in train_colors2/*.png
do
file=$(basename "$f")
depthfile="${file/rgb/depth}"
printf '%s %s\n' "$file" "$depthfile" >> list_ordered.txt
done
sort --random-sort -o list.txt list_ordered.txt
awk < list.txt '{printf $1; printf " 0\n"}' > list_color.txt
awk < list.txt '{printf $2; printf " 0\n"}' > list_depth.txt
rm list.txt
rm list_ordered.txt
mkdir -p train_raw2_lmdb
$1/build/tools/convert_imageset -resize_height 27 -resize_width 37 -gray train_depths2/ list_depth.txt train_raw2_lmdb/train_raw2_depth_37x27.lmdb
$1/build/tools/convert_imageset -resize_height 54 -resize_width 74 -gray train_depths2/ list_depth.txt train_raw2_lmdb/train_raw2_depth_74x54.lmdb
$1/build/tools/convert_imageset -resize_height 218 -resize_width 298 train_colors2/ list_color.txt train_raw2_lmdb/train_raw2_color_298x218.lmdb
rm list_color.txt
rm list_depth.txt
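The same list-building logic appears three times above. As a compact reference, what each block produces can be sketched in Python (the helper name `build_lists` is an assumption, not part of the repository; the trailing `0` is the dummy label `convert_imageset` expects):

```python
import os
import random

def build_lists(color_dir):
    """Pair each *rgb.png with its *depth.png counterpart, shuffle the
    pairs together, and return the "<file> 0" lines that Caffe's
    convert_imageset tool expects (0 is a dummy label)."""
    pairs = [(f, f.replace('rgb', 'depth'))
             for f in sorted(os.listdir(color_dir)) if f.endswith('.png')]
    random.shuffle(pairs)  # same order for both lists keeps pairs aligned
    color_lines = [c + ' 0' for c, _ in pairs]
    depth_lines = [d + ' 0' for _, d in pairs]
    return color_lines, depth_lines
```

Shuffling the pairs rather than the two files independently is what keeps each color image aligned with its depth map across the two LMDBs.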
================================================
FILE: dataset/train/get_train_scenes.m
================================================
% Master's Thesis - Depth Estimation by Convolutional Neural Networks
% Jan Ivanecky; xivane00@stud.fit.vutbr.cz
data = load('nyu_depth_v2_labeled.mat');
split = load('splits.mat');
for i = 1 : 795
folders(i) = data.scenes(split.trainNdxs(i));
end
folders = unique(folders);
fileID = fopen('train_scenes.txt','w');
for i = 1 :numel(folders)
fprintf(fileID, '%s\n', folders{i});
end
fclose(fileID);
exit();
================================================
FILE: dataset/train/process_raw.m
================================================
% Master's Thesis - Depth Estimation by Convolutional Neural Networks
% Jan Ivanecky; xivane00@stud.fit.vutbr.cz
addpath('tools');
d = dir('.');
isub = [d(:).isdir]; %# returns logical vector
nameFolds = {d(isub).name}';
nameFolds(ismember(nameFolds,{'.','..','tools'})) = [];
nameFolds(~cellfun(@isempty,(regexp(nameFolds,'._out')))) = [];
disp(numel(nameFolds));
count = 0;
outCount = 0;
for f = 1:numel(nameFolds)
disp(f);
disp(nameFolds{f});
files = get_synched_frames(nameFolds{f});
c = numel(files);
disp(strcat('filecount: ',int2str(c)));
files = files(1:5:c);
c = numel(files);
disp(strcat('filecount to process: ',int2str(c)));
count = count + c;
outFolder = strcat(nameFolds{f}, '_out');
if ~exist(outFolder, 'dir')
mkdir(outFolder);
end
parfor idx = 1:c
rgbFilename = strcat(nameFolds{f},'/',files(idx).rawRgbFilename);
depthFilename = strcat(nameFolds{f},'/',files(idx).rawDepthFilename);
outRGBFilename = strcat(nameFolds{f},'_out/',nameFolds{f},num2str(idx),'rgb.png');
outDepthFilename = strcat(nameFolds{f},'_out/',nameFolds{f},num2str(idx),'depth.png');
disp(outRGBFilename);
rgb = imread(rgbFilename);
depth = imread(depthFilename);
depth = swapbytes(depth); % raw depth frames are stored with opposite byte order
[depthOut, rgbOut] = project_depth_map(depth, rgb);
imgDepth = fill_depth_colorization(double(rgbOut) / 255.0, depthOut, 0.8);
imgDepth = imgDepth / 10.0;
imgDepth = crop_image(imgDepth);
rgbOut = crop_image(rgbOut);
imwrite(rgbOut, outRGBFilename);
imwrite(imgDepth, outDepthFilename);
end
D = dir([outFolder, '/*rgb.png']);
Num = length(D);
disp(strcat('output filecount: ',int2str(Num)));
outCount = outCount + Num;
end
disp(count);
disp(outCount);
exit;
================================================
FILE: dataset/train/split_train_set.sh
================================================
#!/bin/bash
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
echo "Extracting images from training scenes..."
mkdir -p train_data_all
while read i; do
dirpath="${1}/${i}"
for d in "${dirpath}"*/
do
cp "$d"/* -t train_data_all
done
done < train_scenes.txt
echo "Moving extracted RGB images to train_rgbs..."
mkdir -p train_colors
find train_data_all -name '*rgb.png' -exec mv -t train_colors {} +
echo "Moving extracted Depth images to train_depths..."
mkdir -p train_depths
find train_data_all -name '*depth.png' -exec mv -t train_depths {} +
rm -r train_data_all
================================================
FILE: dataset/train/train_augment0.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
import sys
import PIL
from PIL import Image
import os.path
import random
from random import randint
try:
os.mkdir('train_colors0')
except OSError:
print('output folder already exists')
try:
os.mkdir('train_depths0')
except OSError:
print('output folder already exists')
counter = 1
for file in os.listdir("train_colors"):
if file.endswith(".png"):
depthFile = file.replace('rgb','depth')
filePath = 'train_colors/' + file
depthFilePath = 'train_depths/' + depthFile
print(str(counter) + filePath + ' ' + depthFilePath)
counter += 1
colorOriginal = Image.open(filePath)
depthOriginal = Image.open(depthFilePath)
width, height = 561, 427
newWidth, newHeight = 420, 320
borderX = (width - newWidth) / 2
borderY = (height - newHeight) / 2
colorNew = colorOriginal.crop((borderX, borderY, width - borderX, height - borderY))
depthNew = depthOriginal.crop((borderX, borderY, width - borderX, height - borderY))
colorNew.save('train_colors0/' + file)
depthNew.save('train_depths0/' + depthFile)
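One subtlety in the center-crop arithmetic above: under Python 2 integer division the odd margins (141 and 107 pixels) floor to 70 and 53, so the resulting crop is 421x321, one pixel larger than the 420x320 target on each axis. A quick check:

```python
# Center-crop box exactly as computed above (Python 2 '/' on ints floors,
# written here with '//' so the result is the same on Python 3).
width, height = 561, 427
new_width, new_height = 420, 320
border_x = (width - new_width) // 2    # 70, not 70.5
border_y = (height - new_height) // 2  # 53, not 53.5
box = (border_x, border_y, width - border_x, height - border_y)
crop_size = (box[2] - box[0], box[3] - box[1])
# crop_size comes out as (421, 321), not (420, 320)
```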
================================================
FILE: dataset/train/train_augment1.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
import sys
import os.path
import random
from random import randint
import PIL
from PIL import Image
import matplotlib.colors  # needed for rgb_to_hsv / hsv_to_rgb below
try:
os.mkdir('train_colors1')
except OSError:
print('output folder already exists')
try:
os.mkdir('train_depths1')
except OSError:
print('output folder already exists')
counter = 1
for file in os.listdir("train_colors"):
if file.endswith(".png"):
depthFile = file.replace('rgb','depth')
filePath = 'train_colors/' + file
depthFilePath = 'train_depths/' + depthFile
print(str(counter) + filePath + ' ' + depthFilePath)
counter += 1
colorOriginal = Image.open(filePath)
depthOriginal = Image.open(depthFilePath)
rotation_std = 2.5
filename = os.path.splitext(file)[0]
depthFilename = os.path.splitext(depthFile)[0]
for i in range(5):
color = colorOriginal
depth = depthOriginal
width, height = 561, 427
newWidth, newHeight = 420, 320
borderX = (width - newWidth) / 2
borderY = (height - newHeight) / 2
if randint(0,2) == 0:
randomTranslationX = 0
randomTranslationY = 0
randomAngle = np.random.normal(0.0, rotation_std)
color = color.rotate(randomAngle)
depth = depth.rotate(randomAngle)
else:
randomScale = random.uniform(0.875, 1.125)
resizeWidth, resizeHeight = int(randomScale * width), int(randomScale * height)
color = color.resize((resizeWidth, resizeHeight), PIL.Image.ANTIALIAS)
depth = depth.resize((resizeWidth, resizeHeight), PIL.Image.ANTIALIAS)
depthArray = np.array(depth)
depthArray = depthArray.astype(np.float32)
depthArray /= randomScale
depthArray = np.clip(depthArray, 0.0, 255.0)
depthArray = depthArray.astype(np.uint8)
depth = Image.fromarray(depthArray)
width, height = color.size
borderX = (width - newWidth) / 2
borderY = (height - newHeight) / 2
randomTranslationX = randint(-borderX + 1,borderX-1)
randomTranslationY = randint(-borderY + 1,borderY-1)
colorNew = color.crop((borderX + randomTranslationX, borderY + randomTranslationY,width - borderX + randomTranslationX, height - borderY + randomTranslationY))
depthNew = depth.crop((borderX + randomTranslationX, borderY + randomTranslationY,width - borderX + randomTranslationX, height - borderY + randomTranslationY))
colorArray = np.array(colorNew)
colorArray = colorArray.astype(np.float32) / 255.0
colorArray = matplotlib.colors.rgb_to_hsv(colorArray)
randomHueShift = random.uniform(-0.05,0.05)
colorArray[:,:,0] += randomHueShift
colorArray[:,:,0] = np.mod(colorArray[:,:,0], 1.0)
randomSaturationShift = random.uniform(-0.05,0.05)
colorArray[:,:,1] += randomSaturationShift
colorArray[:,:,1] = np.clip(colorArray[:,:,1], 0, 1)
randomValueShift = random.uniform(-0.05,0.05)
colorArray[:,:,2] += randomValueShift
colorArray[:,:,2] = np.clip(colorArray[:,:,2], 0, 1)
colorArray = matplotlib.colors.hsv_to_rgb(colorArray) * 255.0
randomContrastChange = random.uniform(205.0,305.0)
colorArray *= randomContrastChange / 255.0
colorArray -= (randomContrastChange - 255.0) / 2.0
colorArray = np.clip(colorArray, 0, 255.0)
colorArray = colorArray.astype(np.uint8)
colorNew = Image.fromarray(colorArray)
colorNew.save('train_colors1/' + filename + str(i) + '.png')
depthNew.save('train_depths1/' + depthFilename + str(i) + '.png')
colorNewH = colorNew.transpose(PIL.Image.FLIP_LEFT_RIGHT)
depthNewH = depthNew.transpose(PIL.Image.FLIP_LEFT_RIGHT)
colorNewH.save('train_colors1/' + filename + str(i) + 'f.png')
depthNewH.save('train_depths1/' + depthFilename + str(i) + 'f.png')
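The contrast change above (x * c / 255 - (c - 255) / 2) is a linear stretch whose fixed point is mid-gray 127.5 for any contrast factor c: values above mid-gray are pushed up, values below are pushed down, and the clip keeps the result in [0, 255]. A small numeric check (the helper name `contrast` is ours, not the script's):

```python
def contrast(x, c):
    # Contrast change as applied in the augmentation above: scale, then
    # shift so that mid-gray (127.5) is left exactly where it was.
    return min(max(x * c / 255.0 - (c - 255.0) / 2.0, 0.0), 255.0)

mid = contrast(127.5, 305.0)     # mid-gray is unchanged
bright = contrast(255.0, 305.0)  # bright values saturate at 255
```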
================================================
FILE: dataset/train/train_augment2.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
import sys
import os.path
import random
from random import randint
import PIL
from PIL import Image
import matplotlib.colors  # needed for rgb_to_hsv / hsv_to_rgb below
try:
os.mkdir('train_colors2')
except OSError:
print('output folder already exists')
try:
os.mkdir('train_depths2')
except OSError:
print('output folder already exists')
counter = 1
for file in os.listdir("train_colors"):
if file.endswith(".png"):
depthFile = file.replace('rgb','depth')
filePath = 'train_colors/' + file
depthFilePath = 'train_depths/' + depthFile
print(str(counter) + filePath + ' ' + depthFilePath)
counter += 1
colorOriginal = Image.open(filePath)
depthOriginal = Image.open(depthFilePath)
rotation_std = 5.0
filename = os.path.splitext(file)[0]
depthFilename = os.path.splitext(depthFile)[0]
for i in range(5):
color = colorOriginal
depth = depthOriginal
width, height = 561, 427
newWidth, newHeight = 420, 320
borderX = (width - newWidth) / 2
borderY = (height - newHeight) / 2
if randint(0,2) == 0:
randomTranslationX = 0
randomTranslationY = 0
randomAngle = np.random.normal(0.0, rotation_std)
color = color.rotate(randomAngle)
depth = depth.rotate(randomAngle)
else:
randomScale = random.uniform(0.75, 1.25)
resizeWidth, resizeHeight = int(randomScale * width), int(randomScale * height)
color = color.resize((resizeWidth, resizeHeight), PIL.Image.ANTIALIAS)
depth = depth.resize((resizeWidth, resizeHeight), PIL.Image.ANTIALIAS)
depthArray = np.array(depth)
depthArray = depthArray.astype(np.float32)
depthArray /= randomScale
depthArray = np.clip(depthArray, 0.0, 255.0)
depthArray = depthArray.astype(np.uint8)
depth = Image.fromarray(depthArray)
width, height = color.size
borderX = (width - newWidth) / 2
borderY = (height - newHeight) / 2
if borderX <= 1:
randomTranslationX = 0
randomTranslationY = 0
else:
randomTranslationX = randint(-borderX + 1,borderX-1)
randomTranslationY = randint(-borderY + 1,borderY-1)
colorNew = color.crop((borderX + randomTranslationX, borderY + randomTranslationY,width - borderX + randomTranslationX, height - borderY + randomTranslationY))
depthNew = depth.crop((borderX + randomTranslationX, borderY + randomTranslationY,width - borderX + randomTranslationX, height - borderY + randomTranslationY))
colorArray = np.array(colorNew)
colorArray = colorArray.astype(np.float32) / 255.0
colorArray = matplotlib.colors.rgb_to_hsv(colorArray)
randomHueShift = random.uniform(-0.1,0.1)
colorArray[:,:,0] += randomHueShift
colorArray[:,:,0] = np.mod(colorArray[:,:,0], 1.0)
randomSaturationShift = random.uniform(-0.1,0.1)
colorArray[:,:,1] += randomSaturationShift
colorArray[:,:,1] = np.clip(colorArray[:,:,1], 0, 1)
randomValueShift = random.uniform(-0.1,0.1)
colorArray[:,:,2] += randomValueShift
colorArray[:,:,2] = np.clip(colorArray[:,:,2], 0, 1)
colorArray = matplotlib.colors.hsv_to_rgb(colorArray) * 255.0
randomContrastChange = random.uniform(175.0,335.0)
colorArray *= randomContrastChange / 255.0
colorArray -= (randomContrastChange - 255.0) / 2.0
colorArray = np.clip(colorArray, 0, 255.0)
colorArray = colorArray.astype(np.uint8)
colorNew = Image.fromarray(colorArray)
colorNew.save('train_colors2/' + filename + str(i) + '.png')
depthNew.save('train_depths2/' + depthFilename + str(i) + '.png')
colorNewH = colorNew.transpose(PIL.Image.FLIP_LEFT_RIGHT)
depthNewH = depthNew.transpose(PIL.Image.FLIP_LEFT_RIGHT)
colorNewH.save('train_colors2/' + filename + str(i) + 'f.png')
depthNewH.save('train_depths2/' + depthFilename + str(i) + 'f.png')
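The depth rescaling in the scale branch above keeps geometry consistent: enlarging the image by factor s makes objects look as if the camera were s times closer, so stored depth is divided by the same factor. A minimal sketch (the function name is an assumption):

```python
import numpy as np

def rescale_depth(depth, scale):
    # After resizing the image by `scale`, divide stored depth by the same
    # factor so apparent size and depth stay geometrically consistent,
    # clipping back into the 8-bit range as the script does.
    return np.clip(depth.astype(np.float32) / scale, 0.0, 255.0).astype(np.uint8)

d = rescale_depth(np.array([100, 200], dtype=np.uint8), 1.25)
# zooming in by 1.25x brings encoded depths 100 and 200 down to 80 and 160
```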
================================================
FILE: eval_depth.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
import matplotlib.pyplot as plt
import sys
from PIL import Image
import cv2
import cv
import caffe
import operator
import argparse
import os
from scipy import misc
from os.path import basename
def LogDepth(depth):
depth = np.maximum(depth, 1.0 / 255.0)
return 0.179581 * np.log(depth) + 1
def AbsoluteRelativeDifference(output, gt):
gt = np.maximum(gt, 1.0 / 255.0)
diff = np.mean(np.absolute(output - gt) / gt)
return diff
def SquaredRelativeDifference(output, gt):
gt = np.maximum(gt, 1.0 / 255.0)
d = output - gt
diff = np.mean((d * d) / gt)
return diff
def RootMeanSquaredError(output, gt):
d = output - gt
diff = np.sqrt(np.mean(d * d))
return diff
def RootMeanSquaredErrorLog(output, gt):
d = LogDepth(output / 10.0) * 10.0 - LogDepth(gt / 10.0) * 10.0
diff = np.sqrt(np.mean(d * d))
return diff
def ScaleInvariantMeanSquaredError(output, gt):
output = LogDepth(output / 10.0) * 10.0
gt = LogDepth(gt / 10.0) * 10.0
d = output - gt
diff = np.mean(d * d)
relDiff = (d.sum() * d.sum()) / float(d.size * d.size)
return diff - relDiff
def Log10Error(output, gt):
output = np.maximum(output, 1.0 / 255.0)
gt = np.maximum(gt, 1.0 / 255.0)
diff = np.mean(np.absolute(np.log10(output) - np.log10(gt)))
return diff
def MVNError(output, gt):
outMean = np.mean(output)
outStd = np.std(output)
output = (output - outMean)/outStd
gtMean = np.mean(gt)
gtStd = np.std(gt)
gt = (gt - gtMean)/gtStd
d = output - gt
diff = np.sqrt(np.mean(d * d))
return diff
def Threshold(output, gt, threshold):
output = np.maximum(output, 1.0 / 255.0)
gt = np.maximum(gt, 1.0 / 255.0)
withinThresholdCount = np.where(np.maximum(output / gt, gt / output) < threshold)[0].size
return withinThresholdCount / float(gt.size)
def Test(out, gt):
absRelDiff = AbsoluteRelativeDifference(out, gt)
sqrRelDiff = SquaredRelativeDifference(out, gt)
RMSE = RootMeanSquaredError(out, gt)
RMSELog = RootMeanSquaredErrorLog(out, gt)
SIMSE = ScaleInvariantMeanSquaredError(out, gt)
threshold1 = Threshold(out, gt, 1.25)
threshold2 = Threshold(out, gt, 1.25 * 1.25)
threshold3 = Threshold(out, gt, 1.25 * 1.25 * 1.25)
log10 = Log10Error(out, gt)
MVN = MVNError(out, gt)
return [absRelDiff, sqrRelDiff, RMSE, RMSELog, SIMSE, log10, MVN, threshold1, threshold2, threshold3]
def PrintTop5(title, result):
length = min(10, len(result))
print
print
print ("TOP " + str(length) + " for " + title)
for i in xrange(length):
print (str(i) + ". " + result[i][0] + ': ' + str(result[i][1]))
print
print
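The delta-threshold metric above reports the fraction of pixels whose prediction/ground-truth ratio, taken in whichever direction is larger, falls under the threshold (conventionally 1.25, 1.25^2, 1.25^3). A toy check using the same formula as Threshold():

```python
import numpy as np

def threshold_accuracy(output, gt, threshold):
    # Same computation as Threshold() above: fraction of pixels whose
    # max(out/gt, gt/out) ratio is below the threshold.
    output = np.maximum(output, 1.0 / 255.0)
    gt = np.maximum(gt, 1.0 / 255.0)
    ratio = np.maximum(output / gt, gt / output)
    return np.where(ratio < threshold)[0].size / float(gt.size)

# ratios are 1.2 (inside) and 1.3 (outside), so accuracy is 0.5
acc = threshold_accuracy(np.array([1.0, 2.0]), np.array([1.2, 2.6]), 1.25)
```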
================================================
FILE: get_depth.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
from __future__ import print_function
import numpy as np
import matplotlib.pyplot as plt
import sys
from PIL import Image
import cv2
import cv
import os.path
os.environ['GLOG_minloglevel'] = '2'
import caffe
import scipy.ndimage
import argparse
import operator
import shutil
WIDTH = 298
HEIGHT = 218
OUT_WIDTH = 74
OUT_HEIGHT = 54
GT_WIDTH = 420
GT_HEIGHT = 320
def testNet(net, img):
net.blobs['X'].data[...] = img
net.forward()
output = net.blobs['depth-refine'].data
output = np.reshape(output, (1,1,OUT_HEIGHT, OUT_WIDTH))
return output
def loadImage(path, channels, width, height):
img = caffe.io.load_image(path)
img = caffe.io.resize(img, (height, width, channels))
img = np.transpose(img, (2,0,1))
img = np.reshape(img, (1,channels,height,width))
return img
def printImage(img, name, channels, width, height):
params = list()
params.append(cv.CV_IMWRITE_PNG_COMPRESSION)
params.append(8)
imgnp = np.reshape(img, (height,width, channels))
imgnp = np.array(imgnp * 255, dtype = np.uint8)
cv2.imwrite(name, imgnp, params)
def eval(out, gt, rawResults):
linearGT = gt * 10.0
linearOut = out * 10.0
rawResults = [x + y for x, y in zip(rawResults, Test(linearOut, linearGT))]
return rawResults
def ProcessToOutput(depth):
depth = np.clip(depth, 0.001, 1000)
return np.clip(2 * 0.179581 * np.log(depth) + 1, 0, 1)
caffe.set_mode_cpu()
parser = argparse.ArgumentParser()
parser.add_argument("input_dir", help="directory with input images")
parser.add_argument("output", help="folder to output to")
parser.add_argument("snaps", help="folder with snapshots to use")
parser.add_argument('--log', action='store_true', default=False)
args = parser.parse_args()
try:
os.mkdir(args.output + "_abs")
except OSError:
print ('Output directory already exists, not creating a new one')
fileCount = len(os.listdir(args.input_dir))
results = [dict() for x in range(10)]
for snapshot in os.listdir(args.snaps):
if not snapshot.endswith("caffemodel"):
continue
currentSnapDir = snapshot.replace(".caffemodel","")
if os.path.exists(args.output + "_abs/" + currentSnapDir):
shutil.rmtree(args.output + "_abs/" + currentSnapDir)
os.mkdir(args.output + "_abs/" + currentSnapDir)
print(currentSnapDir)
sys.stdout.flush()
netFile = snapshot.replace(".caffemodel",".prototxt")
net = caffe.Net(args.snaps + '/' + netFile, args.snaps + '/' + snapshot, caffe.TEST)
rawResults = np.zeros((10))
for count, file in enumerate(os.listdir(args.input_dir)):
out_string = str(count) + '/' + str(fileCount) + ': ' + file
sys.stdout.write('%s\r' % out_string)
sys.stdout.flush()
inputFileName = file
inputFilePath = args.input_dir + '/' + inputFileName
input = loadImage(inputFilePath, 3, WIDTH, HEIGHT)
input *= 255
input -= 127
output = testNet(net, input)
if args.log:
output = np.exp((output - 1) / 0.179581)
outWidth = OUT_WIDTH
outHeight = OUT_HEIGHT
scaleW = float(GT_WIDTH) / float(OUT_WIDTH)
scaleH = float(GT_HEIGHT) / float(OUT_HEIGHT)
output = scipy.ndimage.zoom(output, (1,1,scaleH,scaleW), order=3)
outWidth = int(outWidth * scaleW)
outHeight = int(outHeight * scaleH)
input += 127
input = input / 255.0
input = np.transpose(input, (0,2,3,1))
input = input[:,:,:,(2,1,0)]
absOutput = output.copy()
output = ProcessToOutput(output)
absOutput = ProcessToOutput(absOutput)
filename = os.path.splitext(os.path.basename(inputFileName))[0]
filePathAbs = args.output + '_abs/' + currentSnapDir + '/' + filename + '.png'
printImage(input, filePathAbs, 3, WIDTH, HEIGHT)
printImage(absOutput, filePathAbs.replace('_colors','_depth'), 1, outWidth, outHeight)
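With --log, the script inverts the log-depth encoding used in eval_depth.py, LogDepth(d) = 0.179581 * ln(d) + 1, via exp((x - 1) / 0.179581); the two are exact inverses on (0, 1] (the constant puts roughly the range (1/255, 1] onto (0, 1]). A round-trip check (helper names are ours):

```python
import numpy as np

def log_depth(d):
    # Forward log-depth encoding, as in eval_depth.py's LogDepth().
    return 0.179581 * np.log(np.maximum(d, 1.0 / 255.0)) + 1.0

def inv_log_depth(x):
    # Inverse applied by get_depth.py when --log is passed.
    return np.exp((x - 1.0) / 0.179581)

roundtrip = inv_log_depth(log_depth(0.5))  # recovers 0.5
```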
================================================
FILE: net_deploy.prototxt
================================================
name: "refining_network_norm_abs_deploy"
#INPUTS
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
#GLOBAL NETWORK STARTS HERE
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 0
lr_mult: 0
}
param {
decay_mult: 0
lr_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 0
lr_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37
}
}
}
layer {
name: "mvnDepth-global"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
#GRADIENT NETWORK STARTS HERE
layer {
name: "conv1-grad"
type: "Convolution"
bottom: "X"
top: "conv1-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-grad"
type: "ReLU"
bottom: "conv1-grad"
top: "conv1-grad"
}
layer {
name: "norm1-grad"
type: "LRN"
bottom: "conv1-grad"
top: "norm1-grad"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-grad"
type: "Pooling"
bottom: "norm1-grad"
top: "pool1-grad"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1-grad"
top: "conv2-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2-grad"
type: "ReLU"
bottom: "conv2-grad"
top: "conv2-grad"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2-grad"
top: "conv3-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-grad"
type: "ReLU"
bottom: "conv3-grad"
top: "conv3-grad"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3-grad"
top: "conv4-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-grad"
type: "ReLU"
bottom: "conv4-grad"
top: "conv4-grad"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4-grad"
top: "grad_out"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "mvnGrad"
type: "MVN"
bottom: "grad_out"
top: "gradMVN"
}
#GRADIENT NETWORK ENDS HERE
#PREPROCESSING FOR THE REFINE
layer {
name: "upsample"
type: "Deconvolution"
bottom: "gradMVN"
top: "grad-upsample"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 3
stride: 1
num_output: 2
group: 2
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
layer {
name: "concat-global"
bottom: "grad-upsample"
bottom: "depthMVN"
top: "global-output"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "upsample-global"
type: "Deconvolution"
bottom: "global-output"
top: "est"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 2
stride: 2
num_output: 3
group: 3
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
#GLOBAL NETWORK ENDS HERE
#REFINE NETWORK HERE
layer {
name: "conv1-refine"
type: "Convolution"
bottom: "X"
top: "conv1-refine"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 2
pad: 2
weight_filler {
type: "constant"
# std: 0.001
value: 0
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-refine"
type: "ReLU"
bottom: "conv1-refine"
top: "conv1-refine"
}
layer {
name: "norm1-refine"
type: "LRN"
bottom: "conv1-refine"
top: "norm1-refine"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-refine"
type: "Pooling"
bottom: "norm1-refine"
top: "pool1-refine"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 1
}
}
layer {
name: "concat"
bottom: "pool1-refine"
bottom: "est"
top: "input-refine"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "conv2-refine"
type: "Convolution"
bottom: "input-refine"
top: "conv2-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu2-refine"
type: "ReLU"
bottom: "conv2-refine"
top: "conv2-refine"
}
layer {
name: "conv3-refine"
type: "Convolution"
bottom: "conv2-refine"
top: "conv3-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-refine"
type: "ReLU"
bottom: "conv3-refine"
top: "conv3-refine"
}
layer {
name: "conv4-refine"
type: "Convolution"
bottom: "conv3-refine"
top: "conv4-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-refine"
type: "ReLU"
bottom: "conv4-refine"
top: "conv4-refine"
}
layer {
name: "conv5-refine"
type: "Convolution"
bottom: "conv4-refine"
top: "depth-refine_"
param {
lr_mult: 0.01
decay_mult: 1
}
param {
lr_mult: 0.01
decay_mult: 0
}
convolution_param {
num_output: 1
pad: 1
kernel_size: 3
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "power-refine"
type: "Power"
bottom: "depth-refine_"
top: "depth-refine"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
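To sanity-check the blob shapes implied by the deploy net above (e.g. conv1: 11x11 kernel, stride 4 on the 298x218 input), Caffe's output-size rules can be sketched: convolution rounds down, pooling rounds up. A small helper (names are ours, not Caffe's):

```python
import math

def conv_out(size, kernel, stride=1, pad=0):
    # Caffe convolution output size: floor((size + 2*pad - kernel) / stride) + 1
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel, stride=1, pad=0):
    # Caffe pooling rounds up instead of down
    return int(math.ceil((size + 2 * pad - kernel) / float(stride))) + 1

# conv1 on a 298x218 input: 72x52 feature maps
w1, h1 = conv_out(298, 11, 4), conv_out(218, 11, 4)
# pool1 (3x3, stride 2) then reduces the width 72 to 36
w2 = pool_out(w1, 3, 2)
```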
================================================
FILE: net_train.prototxt
================================================
name: "refining_network_norm_abs_train"
#INPUTS START HERE
#COLOR
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_74x54.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_74x54.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
#INPUTS END HERE
#GLOBAL NETWORK STARTS HERE
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 0
lr_mult: 0
}
param {
decay_mult: 0
lr_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 0
lr_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 outputs from fc-depth
}
}
}
layer {
name: "mvnDepth-global"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
#GRADIENT NETWORK STARTS HERE
layer {
name: "conv1-grad"
type: "Convolution"
bottom: "X"
top: "conv1-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-grad"
type: "ReLU"
bottom: "conv1-grad"
top: "conv1-grad"
}
layer {
name: "norm1-grad"
type: "LRN"
bottom: "conv1-grad"
top: "norm1-grad"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-grad"
type: "Pooling"
bottom: "norm1-grad"
top: "pool1-grad"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1-grad"
top: "conv2-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2-grad"
type: "ReLU"
bottom: "conv2-grad"
top: "conv2-grad"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2-grad"
top: "conv3-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-grad"
type: "ReLU"
bottom: "conv3-grad"
top: "conv3-grad"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3-grad"
top: "conv4-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-grad"
type: "ReLU"
bottom: "conv4-grad"
top: "conv4-grad"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4-grad"
top: "grad_out"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "mvnGrad"
type: "MVN"
bottom: "grad_out"
top: "gradMVN"
}
#GRADIENT NETWORK ENDS HERE
#PREPROCESSING FOR THE REFINE
layer {
name: "upsample"
type: "Deconvolution"
bottom: "gradMVN"
top: "grad-upsample"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 3
stride: 1
num_output: 2
group: 2
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
layer {
name: "concat-global"
bottom: "grad-upsample"
bottom: "depthMVN"
top: "global-output"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "upsample-global"
type: "Deconvolution"
bottom: "global-output"
top: "est"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 2
stride: 2
num_output: 3
group: 3
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
#GLOBAL NETWORK ENDS HERE
#REFINE NETWORK HERE
layer {
name: "conv1-refine"
type: "Convolution"
bottom: "X"
top: "conv1-refine"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 2
pad: 2
weight_filler {
type: "constant"
# std: 0.001
value: 0
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-refine"
type: "ReLU"
bottom: "conv1-refine"
top: "conv1-refine"
}
layer {
name: "norm1-refine"
type: "LRN"
bottom: "conv1-refine"
top: "norm1-refine"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-refine"
type: "Pooling"
bottom: "norm1-refine"
top: "pool1-refine"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 1
}
}
layer {
name: "concat"
bottom: "pool1-refine"
bottom: "est"
top: "input-refine"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "conv2-refine"
type: "Convolution"
bottom: "input-refine"
top: "conv2-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu2-refine"
type: "ReLU"
bottom: "conv2-refine"
top: "conv2-refine"
}
layer {
name: "conv3-refine"
type: "Convolution"
bottom: "conv2-refine"
top: "conv3-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-refine"
type: "ReLU"
bottom: "conv3-refine"
top: "conv3-refine"
}
layer {
name: "conv4-refine"
type: "Convolution"
bottom: "conv3-refine"
top: "conv4-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-refine"
type: "ReLU"
bottom: "conv4-refine"
top: "conv4-refine"
}
layer {
name: "conv5-refine"
type: "Convolution"
bottom: "conv4-refine"
top: "depth-refine_"
param {
lr_mult: 0.01
decay_mult: 1
}
param {
lr_mult: 0.01
decay_mult: 0
}
convolution_param {
num_output: 1
pad: 1
kernel_size: 3
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "power-refine"
type: "Power"
bottom: "depth-refine_"
top: "depth-refine"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
#LOSSES
layer {
name: "mvnDepthRefine"
type: "MVN"
bottom: "depth-refine"
top: "depthMVN-refine"
}
layer {
name: "mvnGT"
type: "MVN"
bottom: "gt"
top: "gtMVN"
}
layer {
name: "lossMVNDepth"
type: "EuclideanLoss"
bottom: "depthMVN-refine"
bottom: "gtMVN"
top: "lossMVNDepth"
loss_weight: 0.5
}
layer {
name: "lossABSDepth"
type: "EuclideanLoss"
bottom: "depth-refine"
bottom: "gt"
top: "lossABSDepth"
loss_weight: 0.5
}
================================================
FILE: solver.prototxt
================================================
net: "net_train.prototxt"
test_iter: 50
test_interval: 500
base_lr: 0.000025
gamma: 0.5
stepsize: 100000
momentum: 0.9
weight_decay: 0.005
lr_policy: "fixed"
display: 100
max_iter: 100000
snapshot: 10000
snapshot_prefix: "snaps/"
solver_mode: GPU
debug_info: 1
================================================
FILE: source/README.txt
================================================
================================================
Structure of this folder
================================================
global_context_network - contains network definitions and example scripts for training and evaluating the global context network from the proposed model
gradient_network - contains network definitions and example scripts for training and evaluating the gradient network from the proposed model
joint - contains network definitions and example scripts for training and evaluating the jointly trained global context and gradient networks from the proposed model
refining_network - contains network definitions and example scripts for training and evaluating the refining network from the proposed model
Each of these folders contains:
-multiple subdirectories, each with a different configuration/loss function of the network. Each subdirectory contains the network definition files for training - 'net_train.prototxt' - and for evaluation - 'net_deploy.prototxt'.
-script 'train.py' for training. Note that this script is just an example and its contents should be modified to fit the desired training process.
-script 'eval_depth.py' or 'eval_grad.py', containing definitions of the error functions used to evaluate performance
-script 'test_depth.py' or 'test_grad.py'. This script is used to evaluate the performance of the network and visualize its output.
-'solver.prototxt' - example of the definition file for the Caffe solver.
================================================
Usage of the 'test_depth.py'/'test_grad.py' scripts:
================================================
python test_depth.py INPUT_DIR GT_DIR OUT_DIR SNAPSHOTS_DIR [--log]
-INPUT_DIR is the path to the folder containing the input images
-GT_DIR is the path to the folder containing the ground truth depth maps
-OUT_DIR is the path to the folder to which the output depth maps will be written
-SNAPSHOTS_DIR is the path to the folder containing the trained network models (.caffemodel files). All models in this folder will be evaluated.
--log switch is used when the depth values produced by the network are in log space
================================================
Frameworks/Libraries needed:
================================================
Caffe
Python2.7:
- caffe, scipy, scikit-image, numpy, pypng, cv2, Pillow, matplotlib
================================================
Few notes
=================================================
-input images should be named in the same way as the corresponding ground truths, except that input images should have the suffix 'colors' while ground truth images should have the suffix 'depth'. Note that these suffixes must precede the file extension, e.g., 'image1_colors.png' and the corresponding depth map 'image1_depth.png'
-along with each .caffemodel file, the corresponding deploy network definition file has to be placed into SNAPSHOTS_DIR, with the same name as the model file but with the extension 'prototxt' instead of 'caffemodel'
-two output folders will actually be created: OUT_DIR, which contains output depths fit onto the ground truth using MVN normalization, and OUT_DIR + '_abs', which contains the raw output depth maps.
-note that you need the AlexNet caffemodel to train the global context network, the gradient network and their joint configuration. It can be downloaded here: https://github.com/BVLC/caffe/tree/master/models/bvlc_alexnet
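The colors/depth naming convention described above can be sketched in a few lines of Python (an illustrative helper only; the function name 'pair_color_with_depth' is hypothetical and not part of this repository):

```python
import os

def pair_color_with_depth(color_filename):
    # Map e.g. 'image1_colors.png' -> 'image1_depth.png' by swapping the
    # suffix that precedes the file extension, as the note above requires.
    root, ext = os.path.splitext(color_filename)
    if not root.endswith('_colors'):
        raise ValueError('expected a *_colors image: %s' % color_filename)
    return root[:-len('colors')] + 'depth' + ext

print(pair_color_with_depth('image1_colors.png'))  # image1_depth.png
```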
================================================
FILE: source/global_context_network/abs/net_deploy.prototxt
================================================
name: "global_context_network_abs_deploy"
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
# CONVOLUTIONAL
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 1
lr_mult: 0.2
}
param {
lr_mult: 0.2
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 outputs from fc-depth
}
}
}
================================================
FILE: source/global_context_network/abs/net_train.prototxt
================================================
name: "global_context_network_abs_train"
#INPUTS
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
# CONVOLUTIONAL
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc-main"
top: "fc-main"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 1
lr_mult: 0.2
}
param {
lr_mult: 0.2
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 outputs from fc-depth
}
}
}
# LOSSES
layer {
name: "lossAbsDepth"
type: "EuclideanLoss"
bottom: "depth"
bottom: "gt"
top: "lossAbsDepth"
loss_weight: 1
}
================================================
FILE: source/global_context_network/eval_depth.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
import matplotlib.pyplot as plt
import sys
from PIL import Image
import cv2
import caffe
import operator
import argparse
import os
from scipy import misc
from os.path import basename
def LogDepth(depth):
depth = np.maximum(depth, 1.0 / 255.0)
return 0.179581 * np.log(depth) + 1
def AbsoluteRelativeDifference(output, gt):
gt = np.maximum(gt, 1.0 / 255.0)
diff = np.mean(np.absolute(output - gt) / gt)
return diff
def SquaredRelativeDifference(output, gt):
gt = np.maximum(gt, 1.0 / 255.0)
d = output - gt
diff = np.mean((d * d) / gt)
return diff
def RootMeanSquaredError(output, gt):
d = output - gt
diff = np.sqrt(np.mean(d * d))
return diff
def RootMeanSquaredErrorLog(output, gt):
d = LogDepth(output / 10.0) * 10.0 - LogDepth(gt / 10.0) * 10.0
diff = np.sqrt(np.mean(d * d))
return diff
def ScaleInvariantMeanSquaredError(output, gt):
output = LogDepth(output / 10.0) * 10.0
gt = LogDepth(gt / 10.0) * 10.0
d = output - gt
diff = np.mean(d * d)
relDiff = (d.sum() * d.sum()) / float(d.size * d.size)
return diff - relDiff
def Log10Error(output, gt):
output = np.maximum(output, 1.0 / 255.0)
gt = np.maximum(gt, 1.0 / 255.0)
diff = np.mean(np.absolute(np.log10(output) - np.log10(gt)))
return diff
def MVNError(output, gt):
outMean = np.mean(output)
outStd = np.std(output)
output = (output - outMean)/outStd
gtMean = np.mean(gt)
gtStd = np.std(gt)
gt = (gt - gtMean)/gtStd
d = output - gt
diff = np.sqrt(np.mean(d * d))
return diff
def Threshold(output, gt, threshold):
output = np.maximum(output, 1.0 / 255.0)
gt = np.maximum(gt, 1.0 / 255.0)
withinThresholdCount = np.where(np.maximum(output / gt, gt / output) < threshold)[0].size
return withinThresholdCount / float(gt.size)
def Test(out, gt):
absRelDiff = AbsoluteRelativeDifference(out, gt)
sqrRelDiff = SquaredRelativeDifference(out, gt)
RMSE = RootMeanSquaredError(out, gt)
RMSELog = RootMeanSquaredErrorLog(out, gt)
SIMSE = ScaleInvariantMeanSquaredError(out, gt)
threshold1 = Threshold(out, gt, 1.25)
threshold2 = Threshold(out, gt, 1.25 * 1.25)
threshold3 = Threshold(out, gt, 1.25 * 1.25 * 1.25)
log10 = Log10Error(out, gt)
MVN = MVNError(out, gt)
return [absRelDiff, sqrRelDiff, RMSE, RMSELog, SIMSE, log10, MVN, threshold1, threshold2, threshold3]
def PrintTop5(title, result):
length = min(10, len(result))
print
print
print ("TOP " + str(length) + " for " + title)
for i in xrange(length):
print (str(i) + ". " + result[i][0] + ': ' + str(result[i][1]))
print
print
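# As a quick sanity check of the metric definitions in this file, the threshold
# accuracy can be re-implemented without numpy and exercised on synthetic values
# (an illustrative sketch; 'threshold_accuracy' is a hypothetical standalone
# name, not part of this module):

```python
def threshold_accuracy(output, gt, threshold):
    # Fraction of values whose ratio max(out/gt, gt/out) falls under the
    # threshold; mirrors Threshold() above, including the 1/255 clamp.
    eps = 1.0 / 255.0
    within = 0
    for o, g in zip(output, gt):
        o, g = max(o, eps), max(g, eps)
        if max(o / g, g / o) < threshold:
            within += 1
    return within / float(len(gt))

gt = [1.0, 2.0, 4.0, 8.0]
print(threshold_accuracy(gt, gt, 1.25))                     # 1.0
print(threshold_accuracy([2.0, 4.0, 8.0, 16.0], gt, 1.25))  # 0.0
```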
================================================
FILE: source/global_context_network/log_abs/net_deploy.prototxt
================================================
name: "global_context_network_log_abs_deploy"
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
# CONVOLUTIONAL
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 1
lr_mult: 0.2
}
param {
lr_mult: 0.2
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 outputs from fc-depth
}
}
}
================================================
FILE: source/global_context_network/log_abs/net_train.prototxt
================================================
name: "global_context_network_log_abs_train"
#INPUTS
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
# CONVOLUTIONAL
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc-main"
top: "fc-main"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 1
lr_mult: 0.2
}
param {
lr_mult: 0.2
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 outputs from fc-depth
}
}
}
# LOSSES
layer {
name: "log"
type: "Log"
bottom: "gt"
top: "lnGt"
log_param {
shift: 0.00392156863
scale: 0.996078431
}
}
layer {
name: "power"
type: "Power"
bottom: "lnGt"
top: "logGt"
power_param {
power: 1
scale: 0.179581
shift: 1.0
}
}
layer {
name: "lossLogAbsDepth"
type: "EuclideanLoss"
bottom: "depth"
bottom: "logGt"
top: "lossLogAbsDepth"
loss_weight: 1
}
================================================
FILE: source/global_context_network/norm/net_deploy.prototxt
================================================
name: "global_context_network_norm_deploy"
#INPUTS
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
# CONVOLUTIONAL
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 1
lr_mult: 0.2
}
param {
lr_mult: 0.2
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 * 37 = 999 fc-depth outputs
}
}
}
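The deploy net maps a 3 x 218 x 298 input to a 27 x 37 depth map, and `fc-depth`'s 999 outputs are exactly 27 * 37. A small sketch (an assumption-checking aid, not part of the thesis code) traces the spatial sizes through the stack using Caffe's documented output-size arithmetic: floor rounding for convolutions, ceil rounding for pooling.

```python
# Sketch: trace spatial sizes through the global context network,
# assuming Caffe's conv (floor) and pool (ceil) output-size formulas.
import math

def conv(h, w, k, s=1, p=0):
    return ((h + 2 * p - k) // s + 1, (w + 2 * p - k) // s + 1)

def pool(h, w, k, s):
    return (int(math.ceil((h - k) / float(s))) + 1,
            int(math.ceil((w - k) / float(s))) + 1)

h, w = 218, 298                 # input blob "X"
h, w = conv(h, w, k=11, s=4)    # conv1 -> 52 x 72
h, w = pool(h, w, k=3, s=2)     # pool1 -> 26 x 36
h, w = conv(h, w, k=5, p=2)     # conv2 (padding keeps the size)
h, w = conv(h, w, k=3, p=1)     # conv3
h, w = conv(h, w, k=3, p=1)     # conv4
h, w = conv(h, w, k=3, p=1)     # conv5
h, w = pool(h, w, k=3, s=2)     # pool5 -> 13 x 18
print(h, w)                     # spatial size feeding fc-main
print(27 * 37)                  # fc-depth output count after reshape
```

The 256 x 13 x 18 `pool5` volume is what `fc-main` flattens, and the `reshape` layer at the end turns the 999-vector back into the 27 x 37 map.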
================================================
FILE: source/global_context_network/norm/net_train.prototxt
================================================
name: "global_context_network_norm_train"
#INPUTS
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
# CONVOLUTIONAL
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc-main"
top: "fc-main"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 1
lr_mult: 0.2
}
param {
lr_mult: 0.2
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 * 37 = 999 fc-depth outputs
}
}
}
# LOSSES
layer {
name: "mvnDepth"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
layer {
name: "mvnGT"
type: "MVN"
bottom: "gt"
top: "gtMVN"
}
layer {
name: "lossMVNDepth"
type: "EuclideanLoss"
bottom: "depthMVN"
bottom: "gtMVN"
top: "lossMVNDepth"
loss_weight: 1
}
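The loss stack above normalizes both the prediction and the ground truth to zero mean and unit variance (MVN) before the Euclidean loss, so only the relative depth structure is penalized. A NumPy stand-in, assuming Caffe's documented MVN and EuclideanLoss semantics (per-sample statistics; loss is sum of squared differences over twice the batch size):

```python
# NumPy sketch of the MVN -> EuclideanLoss stack (not the Caffe code itself).
import numpy as np

def mvn(x, normalize_variance=True):
    # Caffe MVN layer: per-sample mean subtraction, optional std division.
    x = x - x.mean()
    if normalize_variance:
        x = x / (x.std() + 1e-9)
    return x

def euclidean_loss(a, b):
    # Caffe EuclideanLoss: 1/(2N) * sum((a - b)^2), N = batch size.
    return ((a - b) ** 2).sum() / (2.0 * a.shape[0])

rng = np.random.RandomState(0)
depth = rng.rand(1, 1, 27, 37)   # stand-in for the "depth" blob
gt = rng.rand(1, 1, 27, 37)      # stand-in for the "gt" blob
loss = euclidean_loss(mvn(depth), mvn(gt))
```

Because both sides are mean- and variance-normalized, the loss is invariant to any affine rescaling of the predicted depths.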
================================================
FILE: source/global_context_network/sc-inv/net_deploy.prototxt
================================================
name: "global_context_network_sc-inv_deploy"
#INPUTS
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
# CONVOLUTIONAL
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 1
lr_mult: 0.2
}
param {
lr_mult: 0.2
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 * 37 = 999 fc-depth outputs
}
}
}
================================================
FILE: source/global_context_network/sc-inv/net_train.prototxt
================================================
name: "global_context_network_sc-inv_train"
#INPUTS
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
# CONVOLUTIONAL
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc-main"
top: "fc-main"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 1
lr_mult: 0.2
}
param {
lr_mult: 0.2
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 * 37 = 999 fc-depth outputs
}
}
}
# LOSSES
layer {
name: "log"
type: "Log"
bottom: "gt"
top: "lnGt"
log_param {
shift: 0.00392156863
scale: 0.996078431
}
}
layer {
name: "power"
type: "Power"
bottom: "lnGt"
top: "logGt"
power_param {
power: 1
scale: 0.179581
shift: 1.0
}
}
layer {
name: "mnDepth"
type: "MVN"
bottom: "depth"
top: "depthMN"
mvn_param{normalize_variance: false}
}
layer {
name: "mnGT"
type: "MVN"
bottom: "logGt"
top: "gtMN"
mvn_param{normalize_variance: false}
}
layer {
name: "lossMNDepth"
type: "EuclideanLoss"
bottom: "depthMN"
bottom: "gtMN"
top: "lossMNDepth"
loss_weight: 1
}
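In this sc-inv variant the `Log` layer computes ln(0.00392... + 0.996... * gt), i.e. it maps the [0, 1)-scaled ground truth into log space, the `Power` layer (power 1) rescales that to 1 + 0.179581 * ln(...), and MVN with `normalize_variance: false` subtracts only the per-sample mean. Subtracting the mean in log space makes the Euclidean loss invariant to a global multiplicative depth scale, which is the scale-invariant idea. A NumPy sketch, assuming Caffe's documented layer formulas (the random blobs are illustrative only):

```python
# NumPy sketch of the sc-inv loss pipeline: Log -> Power -> mean-only MVN
# -> EuclideanLoss (a stand-in for the Caffe layers above).
import numpy as np

def log_layer(x, shift=0.00392156863, scale=0.996078431):
    # Caffe Log layer with default natural base: y = ln(shift + scale * x)
    return np.log(shift + scale * x)

def power_layer(x, power=1, scale=0.179581, shift=1.0):
    # Caffe Power layer: y = (shift + scale * x) ** power
    return (shift + scale * x) ** power

def mean_center(x):
    # MVN with normalize_variance: false -> subtract the per-sample mean only
    return x - x.mean()

def euclidean_loss(a, b):
    return ((a - b) ** 2).sum() / (2.0 * a.shape[0])

rng = np.random.RandomState(1)
depth = rng.rand(1, 1, 27, 37)        # "depth": the net's (log-scaled) output
gt = rng.rand(1, 1, 27, 37)           # "gt" after the 1/256 data-layer scaling
log_gt = power_layer(log_layer(gt))   # "logGt"
loss = euclidean_loss(mean_center(depth), mean_center(log_gt))
```

Adding any constant offset to `depth` (a multiplicative scale before the log) leaves the loss essentially unchanged, since the mean-centering removes it.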
================================================
FILE: source/global_context_network/solver.prototxt
================================================
net: "net_train.prototxt"
test_iter: 50
test_interval: 500
base_lr: 0.0005
gamma: 0.5       # unused with lr_policy "fixed"
stepsize: 100000 # unused with lr_policy "fixed"
momentum: 0.9
weight_decay: 0.005
lr_policy: "fixed"
display: 100
max_iter: 100000
snapshot: 10000
snapshot_prefix: "snaps/"
solver_mode: GPU
debug_info: 1
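With `lr_policy: "fixed"` the learning rate stays at `base_lr` for all 100000 iterations; `gamma` and `stepsize` would only take effect under the "step" policy, which multiplies the rate by `gamma` every `stepsize` iterations. A minimal sketch of the two schedules named here, following Caffe's documented policy formulas:

```python
# Sketch of Caffe's "fixed" vs. "step" learning-rate policies with this
# solver's values (only "fixed" is active in the config above).
def learning_rate(it, base_lr=0.0005, policy="fixed", gamma=0.5, stepsize=100000):
    if policy == "fixed":
        return base_lr                          # constant rate
    if policy == "step":
        return base_lr * gamma ** (it // stepsize)  # halved every stepsize iters
    raise ValueError("unsupported policy: " + policy)

print(learning_rate(50000))                  # -> 0.0005
print(learning_rate(100000, policy="step"))  # -> 0.00025
```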
================================================
FILE: source/global_context_network/test_depth.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
from __future__ import print_function
import numpy as np
import matplotlib.pyplot as plt
import sys
from PIL import Image
import cv2
import cv
import os.path
os.environ['GLOG_minloglevel'] = '2'
import caffe
import scipy.ndimage
import argparse
import operator
import shutil
from eval_depth import Test, PrintTop5, LogDepth
WIDTH = 298
HEIGHT = 218
OUT_WIDTH = 37
OUT_HEIGHT = 27
GT_WIDTH = 420
GT_HEIGHT = 320
def testNet(net, img):
net.blobs['X'].data[...] = img
net.forward()
output = net.blobs['depth'].data
output = np.reshape(output, (1,1,OUT_HEIGHT, OUT_WIDTH))
return output
def loadImage(path, channels, width, height):
img = caffe.io.load_image(path)
img = caffe.io.resize(img, (height, width, channels))
img = np.transpose(img, (2,0,1))
img = np.reshape(img, (1,channels,height,width))
return img
def printImage(img, name, channels, width, height):
params = list()
params.append(cv.CV_IMWRITE_PNG_COMPRESSION)
params.append(8)
imgnp = np.reshape(img, (height,width, channels))
imgnp = np.array(imgnp * 255, dtype = np.uint8)
cv2.imwrite(name, imgnp, params)
def eval(out, gt, rawResults):
linearGT = gt * 10.0
linearOut = out * 10.0
rawResults = [x + y for x, y in zip(rawResults, Test(linearOut, linearGT))]
return rawResults
def ProcessToOutput(depth):
depth = np.clip(depth, 0.001, 1000)
return np.clip(2 * 0.179581 * np.log(depth) + 1, 0, 1)
caffe.set_mode_cpu()
parser = argparse.ArgumentParser()
parser.add_argument("input_dir", help="directory with input images")
parser.add_argument("gt_dir", help="directory with ground truth files")
parser.add_argument("output", help="folder to output to")
parser.add_argument("snaps", help="folder with snapshots to use")
parser.add_argument('--log', action='store_true', default=False)
args = parser.parse_args()
try:
os.mkdir(args.output)
except OSError:
print ('Output directory already exists, not creating a new one')
try:
os.mkdir(args.output + "_abs")
except OSError:
print ('Output directory already exists, not creating a new one')
fileCount = len([name for name in os.listdir(args.input_dir)])
results = [dict() for x in range(10)]
for snapshot in os.listdir(args.snaps):
if not snapshot.endswith("caffemodel"):
continue
currentSnapDir = snapshot.replace(".caffemodel","")
if os.path.exists(args.output + "/" + currentSnapDir):
shutil.rmtree(args.output + "/" + currentSnapDir)
if os.path.exists(args.output + "_abs/" + currentSnapDir):
shutil.rmtree(args.output + "_abs/" + currentSnapDir)
os.mkdir(args.output + "/" + currentSnapDir)
os.mkdir(args.output + "_abs/" + currentSnapDir)
print(currentSnapDir)
sys.stdout.flush()
netFile = snapshot.replace(".caffemodel",".prototxt")
net = caffe.Net(args.snaps + '/' + netFile, args.snaps + '/' + snapshot, caffe.TEST)
rawResults = np.zeros((10))
for count, file in enumerate(os.listdir(args.input_dir)):
out_string = str(count) + '/' + str(fileCount) + ': ' + file
sys.stdout.write('%s\r' % out_string)
sys.stdout.flush()
inputFileName = file
inputFilePath = args.input_dir + '/' + inputFileName
gtFileName = file.replace('colors', 'depth')
gtFilePath = args.gt_dir + '/' + gtFileName
gt = loadImage(gtFilePath, 1, GT_WIDTH, GT_HEIGHT)
input = loadImage(inputFilePath, 3, WIDTH, HEIGHT)
input *= 255
input -= 127
output = testNet(net, input)
if args.log:
output = np.exp((output - 1) / 0.179581)
outWidth = OUT_WIDTH
outHeight = OUT_HEIGHT
scaleW = float(GT_WIDTH) / float(OUT_WIDTH)
scaleH = float(GT_HEIGHT) / float(OUT_HEIGHT)
output = scipy.ndimage.zoom(output, (1,1,scaleH,scaleW), order=3)
outWidth = int(round(outWidth * scaleW))
outHeight = int(round(outHeight * scaleH))
rawResults = eval(output, gt, rawResults)
input += 127
input = input / 255.0
input = np.transpose(input, (0,2,3,1))
input = input[:,:,:,(2,1,0)]
absOutput = output.copy()
output -= output.mean()
output /= output.std()
output *= gt.std()
output += gt.mean()
gt = ProcessToOutput(gt)
output = ProcessToOutput(output)
absOutput = ProcessToOutput(absOutput)
filename = os.path.splitext(os.path.basename(inputFileName))[0]
filePath = args.output + '/' + currentSnapDir + '/' + filename + '.png'
filePathAbs = args.output + '_abs/' + currentSnapDir + '/' + filename + '.png'
printImage(input, filePath, 3, WIDTH, HEIGHT)
printImage(input, filePathAbs, 3, WIDTH, HEIGHT)
printImage(output, filePath.replace('_colors','_depth'), 1, outWidth, outHeight)
printImage(absOutput, filePathAbs.replace('_colors','_depth'), 1, outWidth, outHeight)
printImage(gt, filePath.replace('_colors', '_gt'), 1, outWidth, outHeight)
printImage(gt, filePathAbs.replace('_colors', '_gt'), 1, outWidth, outHeight)
rawResults[:] = [x / fileCount for x in rawResults]
for i in xrange(10):
results[i][currentSnapDir] = rawResults[i]
titles = ["AbsRelDiff", "SqrRelDiff", "RMSE", "RMSELog", "SIMSE", "Log10", "MVN", "Threshold 1.25","Threshold 1.25^2", "Threshold 1.25^3"]
for i in xrange(10):
results[i] = sorted(results[i].items(), key=operator.itemgetter(1))
if i > 6:
results[i] = list(reversed(results[i]))
PrintTop5(titles[i], results[i])
================================================
FILE: source/global_context_network/train.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
import cv2
import cv
import caffe
from caffe.proto import caffe_pb2
import sys
from google.protobuf import text_format
import argparse
caffe.set_mode_gpu()
solver = caffe.get_solver('solver.prototxt')
solver.net.copy_from('bvlc_alexnet.caffemodel')
solver.solve()
================================================
FILE: source/gradient_network/abs/net_deploy.prototxt
================================================
name: "gradient_network_abs_deploy"
#INPUTS
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
# NET ITSELF
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2"
top: "conv3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4"
top: "grad"
param {
lr_mult: 0.1
decay_mult: 1
}
param {
lr_mult: 0.1
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.002
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "power"
bottom: "grad"
top: "gradient"
type: "Power"
power_param {
power: 1
scale: 0.02
shift: 0
}
}
================================================
FILE: source/gradient_network/abs/net_train.prototxt
================================================
name: "gradient_network_abs_train"
#INPUTS
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
# NET ITSELF
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2"
top: "conv3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "drop"
type: "Dropout"
bottom: "conv4"
top: "conv4"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4"
top: "grad"
param {
lr_mult: 0.1
decay_mult: 1
}
param {
lr_mult: 0.1
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.002
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "power"
bottom: "grad"
top: "grad_out"
type: "Power"
power_param {
power: 1
scale: 0.02
shift: 0
}
}
#LOSS
layer {
name: "gradientFilter"
type: "Convolution"
bottom: "gt"
top: "gtGrad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 0
stride: 1
kernel_size: 3
# filler values are placeholders; with lr_mult: 0 the fixed gradient
# kernels are expected to be set externally (cf. filter.prototxt)
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "lossGradLogAbs"
type: "EuclideanLoss"
bottom: "gtGrad"
bottom: "grad_out"
top: "lossGrad"
loss_weight: 1
}
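The loss compares `grad_out` (the 2-channel output of the conv stack) against `gtGrad` (the 27 x 37 ground truth pushed through the 3 x 3 unpadded gradient filter). Both branches come out at 2 x 25 x 35, which is what makes the EuclideanLoss well-defined. A sketch verifying the sizes, assuming Caffe's conv (floor) and pool (ceil) output arithmetic:

```python
# Sketch: check that the network branch and the ground-truth branch of the
# gradient loss produce matching spatial sizes (Caffe size formulas assumed).
import math

def conv(h, w, k, s=1, p=0):
    return ((h + 2 * p - k) // s + 1, (w + 2 * p - k) // s + 1)

def pool(h, w, k, s):
    return (int(math.ceil((h - k) / float(s))) + 1,
            int(math.ceil((w - k) / float(s))) + 1)

# Network branch: color input 218 x 298
h, w = conv(218, 298, k=11, s=4)   # conv1 -> 52 x 72
h, w = pool(h, w, k=4, s=2)        # pool1 -> 25 x 35
h, w = conv(h, w, k=5, p=2)        # conv2-grad (size-preserving)
h, w = conv(h, w, k=5, p=2)        # conv3-grad
h, w = conv(h, w, k=5, p=2)        # conv4-grad
h, w = conv(h, w, k=5, p=2)        # conv5-grad -> "grad_out"

# Ground-truth branch: 27 x 37 depth through the unpadded 3x3 filter
gh, gw = conv(27, 37, k=3)         # "gtGrad"
print((h, w), (gh, gw))
```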
================================================
FILE: source/gradient_network/eval_grad.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
import matplotlib.pyplot as plt
import sys
from PIL import Image
import cv2
import cv
import caffe
import operator
import argparse
import os
from scipy import misc
from os.path import basename
def RootMeanSquaredError(output, gt):
d = output - gt
diff = np.sqrt(np.mean(d * d))
return diff
def MVNError(output, gt):
outMean = np.mean(output)
outStd = np.std(output)
output = (output - outMean)/outStd
gtMean = np.mean(gt)
gtStd = np.std(gt)
gt = (gt - gtMean)/gtStd
d = output - gt
diff = np.sqrt(np.mean(d * d))
return diff
def Test(out, gt):
RMSE = RootMeanSquaredError(out, gt)
MVN = MVNError(out, gt)
return [RMSE, MVN]
def PrintTop5(title, result):
length = min(10, len(result))
print
print
print ("TOP " + str(length) + " for " + title)
for i in xrange(length):
print (str(i) + ". " + result[i][0] + ': ' + str(result[i][1]))
print
print
================================================
FILE: source/gradient_network/filter.prototxt
================================================
name: "GradientFilter"
input: "X"
input_shape {
dim: 1
dim: 1
dim: 320
dim: 420
}
layer {
name: "gradientFilter"
type: "Convolution"
bottom: "X"
top: "out"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
convolution_param {
num_output: 2
pad: 0
kernel_size: 3
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0
}
}
}
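filter.prototxt only declares a fixed (`lr_mult: 0`) two-output 3 x 3 convolution; the actual derivative kernels are not stored in the prototxt and have to be written into the net's weights at runtime. A hedged NumPy sketch with hypothetical central-difference kernels (the kernel values are assumptions, not the thesis's actual filters), reproducing the layer's unpadded, stride-1 cross-correlation:

```python
import numpy as np

# Two 3x3 derivative kernels (horizontal and vertical central differences);
# hypothetical values -- the real kernels are loaded by the training scripts.
kx = np.array([[0, 0, 0], [-0.5, 0, 0.5], [0, 0, 0]], dtype=np.float32)
ky = kx.T

def valid_corr(img, k):
    # Unpadded (pad: 0, stride: 1) cross-correlation, as the conv layer computes.
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.float32)
    for di in range(3):
        for dj in range(3):
            out += k[di, dj] * img[di:di + h - 2, dj:dj + w - 2]
    return out

depth = np.random.rand(320, 420).astype(np.float32)               # input "X"
grads = np.stack([valid_corr(depth, kx), valid_corr(depth, ky)])  # "out"
```

With the 320 x 420 input shape declared above, the 3 x 3 valid correlation yields a 2 x 318 x 418 gradient blob.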
================================================
FILE: source/gradient_network/norm/net_deploy.prototxt
================================================
name: "gradient_network_norm_deploy"
#INPUTS
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
# NET ITSELF
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2"
top: "conv3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4"
top: "grad"
param {
lr_mult: 0.1
decay_mult: 1
}
param {
lr_mult: 0.1
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "power"
bottom: "grad"
top: "gradient"
type: "Power"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
================================================
FILE: source/gradient_network/norm/net_train.prototxt
================================================
name: "gradient_network_norm_train"
#INPUTS
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
# NET ITSELF
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2"
top: "conv3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "drop"
type: "Dropout"
bottom: "conv4"
top: "conv4"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4"
top: "grad"
param {
lr_mult: 0.1
decay_mult: 1
}
param {
lr_mult: 0.1
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "power"
bottom: "grad"
top: "grad_out"
type: "Power"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
#LOSS
layer {
name: "gradientFilter"
type: "Convolution"
bottom: "gt"
top: "gtGrad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 0
stride: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "gradMVN"
type: "MVN"
bottom: "grad_out"
top: "grad_outMVN"
}
layer {
name: "gtMVN"
type: "MVN"
bottom: "gtGrad"
top: "gtGradMVN"
}
layer {
name: "lossGradMVN"
type: "EuclideanLoss"
bottom: "grad_outMVN"
bottom: "gtGradMVN"
top: "lossGrad"
loss_weight: 1
}
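The loss section above normalizes both predicted and ground-truth gradients with MVN before the Euclidean loss, which makes the loss invariant to scale and offset. A rough numpy sketch of what the three layers compute, assuming per-sample normalization (Caffe's MVN layer actually normalizes each channel separately by default):

```python
import numpy as np

def mvn(x):
    # MVN layer: subtract mean, divide by standard deviation
    return (x - x.mean()) / (x.std() + 1e-9)

def euclidean_loss(pred, gt):
    # EuclideanLoss: sum of squared differences / (2 * batch size)
    d = pred - gt
    return (d * d).sum() / (2.0 * pred.shape[0])

rng = np.random.RandomState(0)
pred = rng.rand(32, 2, 25, 35)   # grad_out: 2 gradient channels
gt = rng.rand(32, 2, 25, 35)     # gtGrad
loss = euclidean_loss(np.stack([mvn(p) for p in pred]),
                      np.stack([mvn(g) for g in gt]))
assert loss >= 0.0
```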
================================================
FILE: source/gradient_network/solver.prototxt
================================================
net: "net_train.prototxt"
test_iter: 50
test_interval: 500
base_lr: 0.00025
gamma: 0.5       # note: ignored, lr_policy "fixed" uses a constant base_lr
stepsize: 100000 # note: ignored with lr_policy "fixed"
momentum: 0.9
weight_decay: 0.005
lr_policy: "fixed"
display: 100
max_iter: 100000
snapshot: 10000
snapshot_prefix: "snaps/"
solver_mode: GPU
debug_info: 1
================================================
FILE: source/gradient_network/test_grad.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
from __future__ import print_function
import numpy as np
import matplotlib.pyplot as plt
import sys
from PIL import Image
import cv2
import cv
import os.path
os.environ['GLOG_minloglevel'] = '2'
import caffe
import scipy.ndimage
import argparse
import operator
import shutil
from eval_grad import Test, PrintTop5
filter = np.zeros((1,3,3))
filter[0,0,:] = (-1,-1,-1)
filter[0,1,:] = (0,0,0)
filter[0,2,:] = (1,1,1)
filter2 = np.zeros((1,3,3))
filter2[0,0,:] = (-1,0,1)
filter2[0,1,:] = (-1,0,1)
filter2[0,2,:] = (-1,0,1)
WIDTH = 298
HEIGHT = 218
OUT_WIDTH = 35
OUT_HEIGHT = 25
GT_WIDTH = 418
GT_HEIGHT = 318
def filterImage(net, gt):
net.blobs['X'].data[...] = gt
net.forward()
return (net.blobs['out'].data[0,0,:,:], net.blobs['out'].data[0,1,:,:])
def testNet(net, img):
net.blobs['X'].data[...] = img
net.forward()
output = net.blobs['gradient'].data
output = np.reshape(output, (1,2,OUT_HEIGHT, OUT_WIDTH))
out1 = output[0,0,:,:]
out2 = output[0,1,:,:]
return out1, out2
def loadImage(path, channels, width, height):
img = caffe.io.load_image(path)
img = caffe.io.resize(img, (height, width, channels))
img = np.transpose(img, (2,0,1))
img = np.reshape(img, (1,channels,height,width))
return img
def printImage(img, name, channels, width, height):
params = [cv2.IMWRITE_PNG_COMPRESSION, 8]  # cv2 constant (was legacy cv.CV_IMWRITE_PNG_COMPRESSION)
imgnp = np.reshape(img, (height,width, channels))
imgnp = np.array(imgnp * 255, dtype = np.uint8)
cv2.imwrite(name, imgnp, params)
def eval(out, gt, rawResults):
linearGT = gt   # depth linearization disabled; gradients are compared directly
linearOut = out
#RAW PIXEL TESTS
rawResults = [x + y for x, y in zip(rawResults, Test(linearOut, linearGT))]
return rawResults
caffe.set_mode_cpu()
parser = argparse.ArgumentParser()
parser.add_argument("input_dir", help="directory with input images")
parser.add_argument("gt_dir", help="directory with ground truths")
parser.add_argument("output", help="folder to output to")
parser.add_argument("snaps", help="folder with snapshots to use")
args = parser.parse_args()
gradNet = caffe.Net("filter.prototxt", caffe.TEST)
gradNet.params['gradientFilter'][0].data[0,...] = filter
gradNet.params['gradientFilter'][0].data[1,...] = filter2
try:
os.mkdir(args.output)
except OSError:
pass  # output directory already exists
fileCount = len([name for name in os.listdir(args.input_dir)])
results = [dict() for x in range(2)]
for snapshot in os.listdir(args.snaps):
if not snapshot.endswith("caffemodel"):
continue
currentSnapDir = snapshot.replace(".caffemodel","")
if os.path.exists(args.output + "/" + currentSnapDir):
shutil.rmtree(args.output + "/" + currentSnapDir)
os.mkdir(args.output + "/" + currentSnapDir)
print(currentSnapDir)
sys.stdout.flush()
netFile = snapshot.replace(".caffemodel",".prototxt")
net = caffe.Net(args.snaps + '/' + netFile, args.snaps + '/' + snapshot, caffe.TEST)
rawResults = np.zeros((2))
for count, file in enumerate(os.listdir(args.input_dir)):
out_string = str(count) + '/' + str(fileCount) + ': ' + file
sys.stdout.write('%s\r' % out_string)
sys.stdout.flush()
inputFileName = file
inputFilePath = args.input_dir + '/' + inputFileName
gtFileName = file.replace('colors', 'depth')
gtFilePath = args.gt_dir + '/' + gtFileName
gt = loadImage(gtFilePath, 1, GT_WIDTH + 2, GT_HEIGHT + 2)
gt1, gt2 = filterImage(gradNet, gt)
gt1 = np.reshape(gt1, (1,1,GT_HEIGHT, GT_WIDTH))
gt2 = np.reshape(gt2, (1,1,GT_HEIGHT, GT_WIDTH))
input = loadImage(inputFilePath, 3, WIDTH, HEIGHT)
input *= 255
input -= 127
out1, out2 = testNet(net, input)
outWidth = OUT_WIDTH
outHeight = OUT_HEIGHT
scaleW = float(GT_WIDTH) / float(OUT_WIDTH)
scaleH = float(GT_HEIGHT) / float(OUT_HEIGHT)
out1 = scipy.ndimage.zoom(out1, (scaleH,scaleW), order=3)
out2 = scipy.ndimage.zoom(out2, (scaleH,scaleW), order=3)
outWidth = int(outWidth * scaleW)    # keep integer dims for printImage's reshape
outHeight = int(outHeight * scaleH)
rawResults = eval(out1, gt1, rawResults)
rawResults = eval(out2, gt2, rawResults)
gt1 = (gt1 - gt1.min())/(gt1.max() - gt1.min())
gt2 = (gt2 - gt2.min())/(gt2.max() - gt2.min())
out1 -= out1.mean()
out1 /= out1.std()
out1 *= gt1.std()
out1 += gt1.mean()
out2 -= out2.mean()
out2 /= out2.std()
out2 *= gt2.std()
out2 += gt2.mean()
input += 127
input = input / 255.0
input = np.transpose(input, (0,2,3,1))
input = input[:,:,:,(2,1,0)]
gt1 = np.clip(gt1, 0, 1)
gt2 = np.clip(gt2, 0, 1)
out1 = np.clip(out1, 0, 1)
out2 = np.clip(out2, 0, 1)
filename = os.path.splitext(os.path.basename(inputFileName))[0]
filePath = args.output + '/' + currentSnapDir + '/' + filename + '.png'
printImage(input, filePath, 3, WIDTH, HEIGHT)
printImage(out1, filePath.replace('_colors','_grad1'), 1, outWidth, outHeight)
printImage(out2, filePath.replace('_colors','_grad2'), 1, outWidth, outHeight)
printImage(gt1, filePath.replace('_colors', '_gt1'), 1, outWidth, outHeight)
printImage(gt2, filePath.replace('_colors', '_gt2'), 1, outWidth, outHeight)
rawResults[:] = [x / (fileCount * 2.0) for x in rawResults]
for i in range(2):
results[i][currentSnapDir] = rawResults[i]
titles = ["RMSE", "MVN"]
for i in range(2):
results[i] = sorted(results[i].items(), key=operator.itemgetter(1))
PrintTop5(titles[i], results[i])
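Test and PrintTop5 are imported from eval_grad, which is not shown in this chunk. Assuming Test returns the (RMSE, MVN) pair named in titles above, the two error measures are roughly:

```python
import numpy as np

def rmse(out, gt):
    # root-mean-squared error on raw values
    d = out - gt
    return np.sqrt(np.mean(d * d))

def mvn_rmse(out, gt):
    # RMSE after mean-variance normalization of both maps,
    # i.e. invariant to affine rescaling of either input
    norm = lambda x: (x - x.mean()) / x.std()
    return rmse(norm(out), norm(gt))

a = np.random.RandomState(0).rand(25, 35)
assert rmse(a, a) == 0.0
assert mvn_rmse(2.0 * a + 3.0, a) < 1e-6
```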
================================================
FILE: source/gradient_network/train.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
import cv2
import caffe
from caffe.proto import caffe_pb2
import sys
from google.protobuf import text_format
import argparse
caffe.set_mode_gpu()
solver = caffe.get_solver('solver.prototxt')
solver.net.copy_from('bvlc_alexnet.caffemodel')
filter = np.zeros((1,3,3))
filter[0,0,:] = (-1,-1,-1)
filter[0,1,:] = (0,0,0)
filter[0,2,:] = (1,1,1)
filter2 = np.zeros((1,3,3))
filter2[0,0,:] = (-1,0,1)
filter2[0,1,:] = (-1,0,1)
filter2[0,2,:] = (-1,0,1)
solver.net.params['gradientFilter'][0].data[0,...] = filter
solver.net.params['gradientFilter'][0].data[1,...] = filter2
solver.solve()
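train.py overwrites the gradientFilter weights with fixed Prewitt-style kernels, so gtGrad becomes the vertical and horizontal gradients of the ground-truth depth. A numpy check of the two kernels, using a hand-rolled 'valid' cross-correlation (which is what Caffe's Convolution layer computes):

```python
import numpy as np

ky = np.array([[-1, -1, -1],
               [ 0,  0,  0],
               [ 1,  1,  1]], dtype=float)  # vertical gradient (d/dy)
kx = np.array([[-1, 0, 1],
               [-1, 0, 1],
               [-1, 0, 1]], dtype=float)    # horizontal gradient (d/dx)

def corr2d(img, k):
    # 'valid' 3x3 cross-correlation, as Caffe's Convolution computes
    H, W = img.shape
    out = np.zeros((H - 2, W - 2))
    for i in range(H - 2):
        for j in range(W - 2):
            out[i, j] = (img[i:i + 3, j:j + 3] * k).sum()
    return out

# A ramp increasing left-to-right has a purely horizontal gradient.
ramp = np.tile(np.arange(8, dtype=float), (8, 1))
assert np.allclose(corr2d(ramp, ky), 0.0)
assert np.allclose(corr2d(ramp, kx), 6.0)  # 3 rows x central difference of 2
```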
================================================
FILE: source/joint/architecture_A/net_deploy.prototxt
================================================
name: "joint_A_deploy"
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
# GLOBAL
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
# GRADIENT
layer {
name: "conv1_g"
type: "Convolution"
bottom: "X"
top: "conv1_g"
param {
lr_mult: 0.0005
decay_mult: 1
}
param {
lr_mult: 0.0005
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "constant"
value: 0.00
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1_g"
type: "ReLU"
bottom: "conv1_g"
top: "conv1_g"
}
layer {
name: "norm1_g"
type: "LRN"
bottom: "conv1_g"
top: "norm1_g"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1_g"
type: "Pooling"
bottom: "norm1_g"
top: "pool1_g"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1_g"
top: "conv2_g"
param {
lr_mult: 0.5
decay_mult: 1
}
param {
lr_mult: 0.5
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2_g"
type: "ReLU"
bottom: "conv2_g"
top: "conv2_g"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2_g"
top: "conv3_g"
param {
lr_mult: 0.5
decay_mult: 1
}
param {
lr_mult: 0.5
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3_g"
type: "ReLU"
bottom: "conv3_g"
top: "conv3_g"
}
# JOINT PART
layer {
name: "concat"
bottom: "conv3"
bottom: "conv3_g"
top: "joint"
type: "Concat"
concat_param {
axis: 1
}
}
# AFTER JOINT GLOBAL
layer {
name: "conv4_joint"
type: "Convolution"
bottom: "joint"
top: "conv4"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 1
lr_mult: 0.2
}
param {
lr_mult: 0.2
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 = fc-depth outputs
}
}
}
# AFTER JOINT GRAD
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "joint"
top: "conv4_g"
param {
lr_mult: 0.5
decay_mult: 1
}
param {
lr_mult: 0.5
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4_g"
type: "ReLU"
bottom: "conv4_g"
top: "conv4_g"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4_g"
top: "gradient"
param {
lr_mult: 0.05
decay_mult: 1
}
param {
lr_mult: 0.05
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.0001
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
================================================
FILE: source/joint/architecture_A/net_train.prototxt
================================================
name: "joint_A_train"
#INPUTS
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
# GLOBAL
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
# GRADIENT
layer {
name: "conv1_g"
type: "Convolution"
bottom: "X"
top: "conv1_g"
param {
lr_mult: 0.0005
decay_mult: 1
}
param {
lr_mult: 0.0005
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "constant"
value: 0.00
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1_g"
type: "ReLU"
bottom: "conv1_g"
top: "conv1_g"
}
layer {
name: "norm1_g"
type: "LRN"
bottom: "conv1_g"
top: "norm1_g"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1_g"
type: "Pooling"
bottom: "norm1_g"
top: "pool1_g"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1_g"
top: "conv2_g"
param {
lr_mult: 0.5
decay_mult: 1
}
param {
lr_mult: 0.5
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2_g"
type: "ReLU"
bottom: "conv2_g"
top: "conv2_g"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2_g"
top: "conv3_g"
param {
lr_mult: 0.5
decay_mult: 1
}
param {
lr_mult: 0.5
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3_g"
type: "ReLU"
bottom: "conv3_g"
top: "conv3_g"
}
# JOINT PART
layer {
name: "concat"
bottom: "conv3"
bottom: "conv3_g"
top: "joint"
type: "Concat"
concat_param {
axis: 1
}
}
# AFTER JOINT GLOBAL
layer {
name: "conv4_joint"
type: "Convolution"
bottom: "joint"
top: "conv4"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc-main"
top: "fc-main"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 1
lr_mult: 0.2
}
param {
lr_mult: 0.2
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 = fc-depth outputs
}
}
}
# AFTER JOINT GRAD
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "joint"
top: "conv4_g"
param {
lr_mult: 0.5
decay_mult: 1
}
param {
lr_mult: 0.5
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4_g"
type: "ReLU"
bottom: "conv4_g"
top: "conv4_g"
}
layer {
name: "drop_g"
type: "Dropout"
bottom: "conv4_g"
top: "conv4_g"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4_g"
top: "grad"
param {
lr_mult: 0.05
decay_mult: 1
}
param {
lr_mult: 0.05
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.0001
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
# LOSSES
layer {
name: "mvnDepth"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
layer {
name: "mvnGT"
type: "MVN"
bottom: "gt"
top: "gtMVN"
}
layer {
name: "lossMVNDepth"
type: "EuclideanLoss"
bottom: "depthMVN"
bottom: "gtMVN"
top: "lossMVNDepth"
loss_weight: 1
}
layer {
name: "gradientFilter"
type: "Convolution"
bottom: "gt"
top: "gtGrad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 0
stride: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "gradMVN"
type: "MVN"
bottom: "grad"
top: "grad_outMVN"
}
layer {
name: "gtMVN"
type: "MVN"
bottom: "gtGrad"
top: "gtGradMVN"
}
layer {
name: "lossGradMVN"
type: "EuclideanLoss"
bottom: "grad_outMVN"
bottom: "gtGradMVN"
top: "lossGrad"
loss_weight: 1
}
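In both joint architectures fc-depth emits 999 values per image, and the Reshape layer views them as a 1-channel 27x37 coarse depth map. A quick numpy check of the shape arithmetic:

```python
import numpy as np

batch = 32
fc_depth = np.zeros((batch, 999), dtype=np.float32)  # fc-depth output
# dim: 0 keeps the batch axis; 999 values become a 1 x 27 x 37 map
depth = fc_depth.reshape(batch, 1, 27, 37)
assert 27 * 37 == 999
assert depth.shape == (32, 1, 27, 37)
```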
================================================
FILE: source/joint/architecture_B/net_deploy.prototxt
================================================
name: "joint_B_deploy"
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
# JOINT
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
# AFTER JOINT GLOBAL
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 1
lr_mult: 0.2
}
param {
lr_mult: 0.2
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 = fc-depth outputs
}
}
}
# AFTER JOINT GRAD
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3"
top: "conv4_g"
param {
lr_mult: 0.5
decay_mult: 1
}
param {
lr_mult: 0.5
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4_g"
type: "ReLU"
bottom: "conv4_g"
top: "conv4_g"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4_g"
top: "gradient"
param {
lr_mult: 0.05
decay_mult: 1
}
param {
lr_mult: 0.05
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.0001
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
================================================
FILE: source/joint/architecture_B/net_train.prototxt
================================================
name: "joint_B_train"
#INPUTS
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
# JOINT
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
# AFTER JOINT GLOBAL
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc-main"
top: "fc-main"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 1
lr_mult: 0.2
}
param {
lr_mult: 0.2
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 = fc-depth outputs
}
}
}
# AFTER JOINT GRAD
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3"
top: "conv4_g"
param {
lr_mult: 0.5
decay_mult: 1
}
param {
lr_mult: 0.5
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4_g"
type: "ReLU"
bottom: "conv4_g"
top: "conv4_g"
}
layer {
name: "drop_g"
type: "Dropout"
bottom: "conv4_g"
top: "conv4_g"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4_g"
top: "grad"
param {
lr_mult: 0.05
decay_mult: 1
}
param {
lr_mult: 0.05
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.0001
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
# LOSSES
layer {
name: "mvnDepth"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
layer {
name: "mvnGT"
type: "MVN"
bottom: "gt"
top: "gtMVN"
}
layer {
name: "lossMVNDepth"
type: "EuclideanLoss"
bottom: "depthMVN"
bottom: "gtMVN"
top: "lossMVNDepth"
loss_weight: 1
}
layer {
name: "gradientFilter"
type: "Convolution"
bottom: "gt"
top: "gtGrad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 0
stride: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "gradMVN"
type: "MVN"
bottom: "grad"
top: "grad_outMVN"
}
layer {
name: "gtMVN"
type: "MVN"
bottom: "gtGrad"
top: "gtGradMVN"
}
layer {
name: "lossGradMVN"
type: "EuclideanLoss"
bottom: "grad_outMVN"
bottom: "gtGradMVN"
top: "lossGrad"
loss_weight: 1
}
================================================
FILE: source/joint/eval_depth.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
def LogDepth(depth):
depth = np.maximum(depth, 1.0 / 255.0)
return 0.179581 * np.log(depth) + 1
def AbsoluteRelativeDifference(output, gt):
gt = np.maximum(gt, 1.0 / 255.0)
diff = np.mean(np.absolute(output - gt) / gt)
return diff
def SquaredRelativeDifference(output, gt):
gt = np.maximum(gt, 1.0 / 255.0)
d = output - gt
diff = np.mean((d * d) / gt)
return diff
def RootMeanSquaredError(output, gt):
d = output - gt
diff = np.sqrt(np.mean(d * d))
return diff
def RootMeanSquaredErrorLog(output, gt):
d = LogDepth(output / 10.0) * 10.0 - LogDepth(gt / 10.0) * 10.0
diff = np.sqrt(np.mean(d * d))
return diff
def ScaleInvariantMeanSquaredError(output, gt):
output = LogDepth(output / 10.0) * 10.0
gt = LogDepth(gt / 10.0) * 10.0
d = output - gt
diff = np.mean(d * d)
relDiff = (d.sum() * d.sum()) / float(d.size * d.size)
return diff - relDiff
def Log10Error(output, gt):
output = np.maximum(output, 1.0 / 255.0)
gt = np.maximum(gt, 1.0 / 255.0)
diff = np.mean(np.absolute(np.log10(output) - np.log10(gt)))
return diff
def MVNError(output, gt):
outMean = np.mean(output)
outStd = np.std(output)
output = (output - outMean)/outStd
gtMean = np.mean(gt)
gtStd = np.std(gt)
gt = (gt - gtMean)/gtStd
d = output - gt
diff = np.sqrt(np.mean(d * d))
return diff
def Threshold(output, gt, threshold):
output = np.maximum(output, 1.0 / 255.0)
gt = np.maximum(gt, 1.0 / 255.0)
withinThresholdCount = np.where(np.maximum(output / gt, gt / output) < threshold)[0].size
return withinThresholdCount / float(gt.size)
def Test(out, gt):
absRelDiff = AbsoluteRelativeDifference(out, gt)
sqrRelDiff = SquaredRelativeDifference(out, gt)
RMSE = RootMeanSquaredError(out, gt)
RMSELog = RootMeanSquaredErrorLog(out, gt)
SIMSE = ScaleInvariantMeanSquaredError(out, gt)
threshold1 = Threshold(out, gt, 1.25)
threshold2 = Threshold(out, gt, 1.25 * 1.25)
threshold3 = Threshold(out, gt, 1.25 * 1.25 * 1.25)
log10 = Log10Error(out, gt)
MVN = MVNError(out, gt)
return [absRelDiff, sqrRelDiff, RMSE, RMSELog, SIMSE, log10, MVN, threshold1, threshold2, threshold3]
def PrintTop5(title, result):
# despite the name, prints up to the top 10 entries
length = min(10, len(result))
print
print
print ("TOP " + str(length) + " for " + title)
for i in xrange(length):
print (str(i) + ". " + result[i][0] + ': ' + str(result[i][1]))
print
print
================================================
FILE: source/joint/eval_grad.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
def RootMeanSquaredError(output, gt):
d = output - gt
diff = np.sqrt(np.mean(d * d))
return diff
def MVNError(output, gt):
outMean = np.mean(output)
outStd = np.std(output)
output = (output - outMean)/outStd
gtMean = np.mean(gt)
gtStd = np.std(gt)
gt = (gt - gtMean)/gtStd
d = output - gt
diff = np.sqrt(np.mean(d * d))
return diff
def Test(out, gt):
RMSE = RootMeanSquaredError(out, gt)
MVN = MVNError(out, gt)
return [RMSE, MVN]
def PrintTop5(title, result):
# despite the name, prints up to the top 10 entries
length = min(10, len(result))
print
print
print ("TOP " + str(length) + " for " + title)
for i in xrange(length):
print (str(i) + ". " + result[i][0] + ': ' + str(result[i][1]))
print
print
================================================
FILE: source/joint/filter.prototxt
================================================
name: "GradientFilter"
input: "X"
input_shape {
dim: 1
dim: 1
dim: 320
dim: 420
}
layer {
name: "gradientFilter"
type: "Convolution"
bottom: "X"
top: "out"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
convolution_param {
num_output: 2
pad: 0
kernel_size: 3
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0
}
}
}
================================================
FILE: source/joint/solver.prototxt
================================================
net: "net_train.prototxt"
test_iter: 50
test_interval: 500
base_lr: 0.0005
# gamma and stepsize take effect only with lr_policy: "step"; unused under "fixed"
gamma: 0.5
stepsize: 100000
momentum: 0.9
weight_decay: 0.005
lr_policy: "fixed"
display: 100
max_iter: 100000
snapshot: 10000
snapshot_prefix: "snaps/"
solver_mode: GPU
debug_info: 1
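`gamma` and `stepsize` here only matter under `lr_policy: "step"`; with `"fixed"` the rate stays at `base_lr` for all 100k iterations. For reference, the step schedule these values would otherwise produce:

```python
def step_lr(base_lr, gamma, stepsize, it):
    # Caffe "step" policy: lr = base_lr * gamma ** floor(it / stepsize)
    return base_lr * gamma ** (it // stepsize)

lr_start = step_lr(0.0005, 0.5, 100000, 0)       # base rate
lr_after = step_lr(0.0005, 0.5, 100000, 100000)  # halved after one step
```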
================================================
FILE: source/joint/test_depth.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
from __future__ import print_function
import numpy as np
import sys
import cv2
import os.path
os.environ['GLOG_minloglevel'] = '2'
import caffe
import scipy.ndimage
import argparse
import operator
import shutil
from eval_depth import Test, PrintTop5, LogDepth
WIDTH = 298
HEIGHT = 218
OUT_WIDTH = 37
OUT_HEIGHT = 27
GT_WIDTH = 420
GT_HEIGHT = 320
def testNet(net, img):
net.blobs['X'].data[...] = img
net.forward()
output = net.blobs['depth'].data
output = np.reshape(output, (1,1,OUT_HEIGHT, OUT_WIDTH))
return output
def loadImage(path, channels, width, height):
img = caffe.io.load_image(path)
img = caffe.io.resize(img, (height, width, channels))
img = np.transpose(img, (2,0,1))
img = np.reshape(img, (1,channels,height,width))
return img
def printImage(img, name, channels, width, height):
params = [cv2.IMWRITE_PNG_COMPRESSION, 8]
imgnp = np.reshape(img, (height, width, channels))
imgnp = np.array(imgnp * 255, dtype=np.uint8)
cv2.imwrite(name, imgnp, params)
def eval(out, gt, rawResults):
linearGT = gt * 10.0
linearOut = out * 10.0
rawResults = [x + y for x, y in zip(rawResults, Test(linearOut, linearGT))]
return rawResults
def ProcessToOutput(depth):
depth = np.clip(depth, 0.001, 1000)
return np.clip(2 * 0.179581 * np.log(depth) + 1, 0, 1)
caffe.set_mode_cpu()
parser = argparse.ArgumentParser()
parser.add_argument("input_dir", help="directory with input images")
parser.add_argument("gt_dir", help="directory with ground truths")
parser.add_argument("output", help="folder to output to")
parser.add_argument("snaps", help="folder with snapshots to use")
parser.add_argument('--log', action='store_true', default=False)
args = parser.parse_args()
try:
os.mkdir(args.output)
except OSError:
print ('Output directory already exists, not creating a new one')
try:
os.mkdir(args.output + "_abs")
except OSError:
print ('Output directory already exists, not creating a new one')
fileCount = len([name for name in os.listdir(args.input_dir)])
results = [dict() for x in range(10)]
for snapshot in os.listdir(args.snaps):
if not snapshot.endswith("caffemodel"):
continue
currentSnapDir = snapshot.replace(".caffemodel","")
if os.path.exists(args.output + "/" + currentSnapDir):
shutil.rmtree(args.output + "/" + currentSnapDir)
if os.path.exists(args.output + "_abs/" + currentSnapDir):
shutil.rmtree(args.output + "_abs/" + currentSnapDir)
os.mkdir(args.output + "/" + currentSnapDir)
os.mkdir(args.output + "_abs/" + currentSnapDir)
print(currentSnapDir)
sys.stdout.flush()
netFile = snapshot.replace(".caffemodel",".prototxt")
net = caffe.Net(args.snaps + '/' + netFile, args.snaps + '/' + snapshot, caffe.TEST)
rawResults = np.zeros((10))
for count, file in enumerate(os.listdir(args.input_dir)):
out_string = str(count) + '/' + str(fileCount) + ': ' + file
sys.stdout.write('%s\r' % out_string)
sys.stdout.flush()
inputFileName = file
inputFilePath = args.input_dir + '/' + inputFileName
gtFileName = file.replace('colors','depth')
gtFilePath = args.gt_dir + '/' + gtFileName
gt = loadImage(gtFilePath, 1, GT_WIDTH, GT_HEIGHT)
input = loadImage(inputFilePath, 3, WIDTH, HEIGHT)
input *= 255
input -= 127
output = testNet(net, input)
if args.log:
output = np.exp((output - 1) / 0.179581)
outWidth = OUT_WIDTH
outHeight = OUT_HEIGHT
scaleW = float(GT_WIDTH) / float(OUT_WIDTH)
scaleH = float(GT_HEIGHT) / float(OUT_HEIGHT)
output = scipy.ndimage.zoom(output, (1,1,scaleH,scaleW), order=3)
outWidth = int(outWidth * scaleW)
outHeight = int(outHeight * scaleH)
rawResults = eval(output, gt, rawResults)
input += 127
input = input / 255.0
input = np.transpose(input, (0,2,3,1))
input = input[:,:,:,(2,1,0)]
absOutput = output.copy()
output -= output.mean()
output /= output.std()
output *= gt.std()
output += gt.mean()
gt = ProcessToOutput(gt)
output = ProcessToOutput(output)
absOutput = ProcessToOutput(absOutput)
filename = os.path.splitext(os.path.basename(inputFileName))[0]
filePath = args.output + '/' + currentSnapDir + '/' + filename + '.png'
filePathAbs = args.output + '_abs/' + currentSnapDir + '/' + filename + '.png'
printImage(input, filePath, 3, WIDTH, HEIGHT)
printImage(input, filePathAbs, 3, WIDTH, HEIGHT)
printImage(output, filePath.replace('_colors','_depth'), 1, outWidth, outHeight)
printImage(absOutput, filePathAbs.replace('_colors','_depth'), 1, outWidth, outHeight)
printImage(gt, filePath.replace('_colors', '_gt'), 1, outWidth, outHeight)
printImage(gt, filePathAbs.replace('_colors', '_gt'), 1, outWidth, outHeight)
rawResults[:] = [x / fileCount for x in rawResults]
for i in xrange(10):
results[i][currentSnapDir] = rawResults[i]
titles = ["AbsRelDiff", "SqrRelDiff", "RMSE", "RMSELog", "SIMSE", "Log10", "MVN", "Threshold 1.25","Threshold 1.25^2", "Threshold 1.25^3"]
for i in xrange(10):
results[i] = sorted(results[i].items(), key=operator.itemgetter(1))
if i > 6:
results[i] = list(reversed(results[i]))
PrintTop5(titles[i], results[i])
================================================
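Before saving the relative output images, `test_depth.py` re-scales the prediction so its mean and standard deviation match the ground truth (the `output -= output.mean()` block). A standalone sketch of that statistic-matching step:

```python
import numpy as np

def match_statistics(output, gt):
    # shift/scale the prediction so its mean and std equal the ground truth's,
    # for a fair side-by-side visualization of relative depth
    out = (output - output.mean()) / output.std()
    return out * gt.std() + gt.mean()

rng = np.random.RandomState(0)
pred = rng.rand(27, 37) * 3.0        # stand-in network output
gt = rng.rand(27, 37) * 10.0 + 2.0   # stand-in ground truth
aligned = match_statistics(pred, gt)
```

The unmodified prediction is written separately to the `_abs` directories, so both the relative and absolute estimates can be inspected.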
FILE: source/joint/test_grad.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
from __future__ import print_function
import numpy as np
import sys
import cv2
import os.path
os.environ['GLOG_minloglevel'] = '2'
import caffe
import scipy.ndimage
import argparse
import operator
import shutil
from eval_grad import Test, PrintTop5
filter = np.zeros((1,3,3))
filter[0,0,:] = (-1,-1,-1)
filter[0,1,:] = (0,0,0)
filter[0,2,:] = (1,1,1)
filter2 = np.zeros((1,3,3))
filter2[0,0,:] = (-1,0,1)
filter2[0,1,:] = (-1,0,1)
filter2[0,2,:] = (-1,0,1)
WIDTH = 298
HEIGHT = 218
OUT_WIDTH = 35
OUT_HEIGHT = 25
GT_WIDTH = 418
GT_HEIGHT = 318
def filterImage(net, gt):
net.blobs['X'].data[...] = gt
net.forward()
return (net.blobs['out'].data[0,0,:,:], net.blobs['out'].data[0,1,:,:])
def testNet(net, img):
net.blobs['X'].data[...] = img
net.forward()
output = net.blobs['gradient'].data
output = np.reshape(output, (1,2,OUT_HEIGHT, OUT_WIDTH))
out1 = output[0,0,:,:]
out2 = output[0,1,:,:]
return out1, out2
def loadImage(path, channels, width, height):
img = caffe.io.load_image(path)
img = caffe.io.resize(img, (height, width, channels))
img = np.transpose(img, (2,0,1))
img = np.reshape(img, (1,channels,height,width))
return img
def printImage(img, name, channels, width, height):
params = [cv2.IMWRITE_PNG_COMPRESSION, 8]
imgnp = np.reshape(img, (height, width, channels))
imgnp = np.array(imgnp * 255, dtype=np.uint8)
cv2.imwrite(name, imgnp, params)
def eval(out, gt, rawResults):
# gradients are compared directly; no log-to-linear conversion needed
rawResults = [x + y for x, y in zip(rawResults, Test(out, gt))]
return rawResults
caffe.set_mode_cpu()
parser = argparse.ArgumentParser()
parser.add_argument("input_dir", help="directory with input images")
parser.add_argument("gt_dir", help="directory with ground truths")
parser.add_argument("output", help="folder to output to")
parser.add_argument("snaps", help="folder with snapshots to use")
args = parser.parse_args()
gradNet = caffe.Net("filter.prototxt", caffe.TEST)
gradNet.params['gradientFilter'][0].data[0,...] = filter
gradNet.params['gradientFilter'][0].data[1,...] = filter2
try:
os.mkdir(args.output)
except OSError:
print('Output directory already exists, not creating a new one')
fileCount = len([name for name in os.listdir(args.input_dir)])
results = [dict() for x in range(2)]
for snapshot in os.listdir(args.snaps):
if not snapshot.endswith("caffemodel"):
continue
currentSnapDir = snapshot.replace(".caffemodel","")
if os.path.exists(args.output + "/" + currentSnapDir):
shutil.rmtree(args.output + "/" + currentSnapDir)
os.mkdir(args.output + "/" + currentSnapDir)
print(currentSnapDir)
sys.stdout.flush()
netFile = snapshot.replace(".caffemodel",".prototxt")
net = caffe.Net(args.snaps + '/' + netFile, args.snaps + '/' + snapshot, caffe.TEST)
rawResults = np.zeros((2))
for count, file in enumerate(os.listdir(args.input_dir)):
out_string = str(count) + '/' + str(fileCount) + ': ' + file
sys.stdout.write('%s\r' % out_string)
sys.stdout.flush()
inputFileName = file
inputFilePath = args.input_dir + '/' + inputFileName
gtFileName = file.replace('colors','depth')
gtFilePath = args.gt_dir + '/' + gtFileName
gt = loadImage(gtFilePath, 1, GT_WIDTH + 2, GT_HEIGHT + 2)
gt1, gt2 = filterImage(gradNet, gt)
gt1 = np.reshape(gt1, (1,1,GT_HEIGHT, GT_WIDTH))
gt2 = np.reshape(gt2, (1,1,GT_HEIGHT, GT_WIDTH))
input = loadImage(inputFilePath, 3, WIDTH, HEIGHT)
input *= 255
input -= 127
out1, out2 = testNet(net, input)
outWidth = OUT_WIDTH
outHeight = OUT_HEIGHT
scaleW = float(GT_WIDTH) / float(OUT_WIDTH)
scaleH = float(GT_HEIGHT) / float(OUT_HEIGHT)
out1 = scipy.ndimage.zoom(out1, (scaleH,scaleW), order=3)
out2 = scipy.ndimage.zoom(out2, (scaleH,scaleW), order=3)
outWidth = int(outWidth * scaleW)
outHeight = int(outHeight * scaleH)
rawResults = eval(out1, gt1, rawResults)
rawResults = eval(out2, gt2, rawResults)
gt1 = (gt1 - gt1.min())/(gt1.max() - gt1.min())
gt2 = (gt2 - gt2.min())/(gt2.max() - gt2.min())
out1 -= out1.mean()
out1 /= out1.std()
out1 *= gt1.std()
out1 += gt1.mean()
out2 -= out2.mean()
out2 /= out2.std()
out2 *= gt2.std()
out2 += gt2.mean()
input += 127
input = input / 255.0
input = np.transpose(input, (0,2,3,1))
input = input[:,:,:,(2,1,0)]
gt1 = np.clip(gt1, 0, 1)
gt2 = np.clip(gt2, 0, 1)
out1 = np.clip(out1, 0, 1)
out2 = np.clip(out2, 0, 1)
filename = os.path.splitext(os.path.basename(inputFileName))[0]
filePath = args.output + '/' + currentSnapDir + '/' + filename + '.png'
printImage(input, filePath, 3, WIDTH, HEIGHT)
printImage(out1, filePath.replace('_colors','_grad1'), 1, outWidth, outHeight)
printImage(out2, filePath.replace('_colors','_grad2'), 1, outWidth, outHeight)
printImage(gt1, filePath.replace('_colors', '_gt1'), 1, outWidth, outHeight)
printImage(gt2, filePath.replace('_colors', '_gt2'), 1, outWidth, outHeight)
rawResults[:] = [x / (fileCount * 2.0) for x in rawResults]
for i in xrange(2):
results[i][currentSnapDir] = rawResults[i]
titles = ["RMSE", "MVN"]
for i in xrange(2):
results[i] = sorted(results[i].items(), key=operator.itemgetter(1))
PrintTop5(titles[i], results[i])
================================================
FILE: source/joint/train.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
import caffe
caffe.set_mode_gpu()
solver = caffe.get_solver('solver.prototxt')
solver.net.copy_from('bvlc_alexnet.caffemodel')
solver.net.params['conv1_g'][0].data[...] = solver.net.params['conv1'][0].data
solver.net.params['conv1_g'][1].data[...] = solver.net.params['conv1'][1].data
filter = np.zeros((1,3,3))
filter[0,0,:] = (-1,-1,-1)
filter[0,1,:] = (0,0,0)
filter[0,2,:] = (1,1,1)
filter2 = np.zeros((1,3,3))
filter2[0,0,:] = (-1,0,1)
filter2[0,1,:] = (-1,0,1)
filter2[0,2,:] = (-1,0,1)
solver.net.params['gradientFilter'][0].data[0,...] = filter
solver.net.params['gradientFilter'][0].data[1,...] = filter2
solver.solve()
================================================
FILE: source/refining_network/abs/net_deploy.prototxt
================================================
name: "refining_network_abs_deploy"
#INPUTS
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
#GLOBAL NETWORK STARTS HERE
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 0
lr_mult: 0
}
param {
decay_mult: 0
lr_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 0
lr_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the batch dimension from the bottom blob
dim: 1
dim: 27
dim: 37 # 1 x 27 x 37 = 999, matching num_output of fc-depth
}
}
}
layer {
name: "mvnDepth-global"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
#GRADIENT NETWORK STARTS HERE
layer {
name: "conv1-grad"
type: "Convolution"
bottom: "X"
top: "conv1-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-grad"
type: "ReLU"
bottom: "conv1-grad"
top: "conv1-grad"
}
layer {
name: "norm1-grad"
type: "LRN"
bottom: "conv1-grad"
top: "norm1-grad"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-grad"
type: "Pooling"
bottom: "norm1-grad"
top: "pool1-grad"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1-grad"
top: "conv2-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2-grad"
type: "ReLU"
bottom: "conv2-grad"
top: "conv2-grad"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2-grad"
top: "conv3-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-grad"
type: "ReLU"
bottom: "conv3-grad"
top: "conv3-grad"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3-grad"
top: "conv4-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-grad"
type: "ReLU"
bottom: "conv4-grad"
top: "conv4-grad"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4-grad"
top: "grad_out"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "mvnGrad"
type: "MVN"
bottom: "grad_out"
top: "gradMVN"
}
#GRADIENT NETWORK ENDS HERE
#PREPROCESSING FOR THE REFINING NETWORK
layer {
name: "upsample"
type: "Deconvolution"
bottom: "gradMVN"
top: "grad-upsample"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 3
stride: 1
num_output: 2
group: 2
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
layer {
name: "concat-global"
bottom: "grad-upsample"
bottom: "depthMVN"
top: "global-output"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "upsample-global"
type: "Deconvolution"
bottom: "global-output"
top: "est"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 2
stride: 2
num_output: 3
group: 3
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
#GLOBAL NETWORK ENDS HERE
#REFINING NETWORK STARTS HERE
layer {
name: "conv1-refine"
type: "Convolution"
bottom: "X"
top: "conv1-refine"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 2
pad: 2
weight_filler {
type: "constant"
value: 0
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-refine"
type: "ReLU"
bottom: "conv1-refine"
top: "conv1-refine"
}
layer {
name: "norm1-refine"
type: "LRN"
bottom: "conv1-refine"
top: "norm1-refine"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-refine"
type: "Pooling"
bottom: "norm1-refine"
top: "pool1-refine"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 1
}
}
layer {
name: "concat"
bottom: "pool1-refine"
bottom: "est"
top: "input-refine"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "conv2-refine"
type: "Convolution"
bottom: "input-refine"
top: "conv2-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu2-refine"
type: "ReLU"
bottom: "conv2-refine"
top: "conv2-refine"
}
layer {
name: "conv3-refine"
type: "Convolution"
bottom: "conv2-refine"
top: "conv3-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-refine"
type: "ReLU"
bottom: "conv3-refine"
top: "conv3-refine"
}
layer {
name: "conv4-refine"
type: "Convolution"
bottom: "conv3-refine"
top: "conv4-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-refine"
type: "ReLU"
bottom: "conv4-refine"
top: "conv4-refine"
}
layer {
name: "conv5-refine"
type: "Convolution"
bottom: "conv4-refine"
top: "depth-refine_"
param {
lr_mult: 0.01
decay_mult: 1
}
param {
lr_mult: 0.01
decay_mult: 0
}
convolution_param {
num_output: 1
pad: 1
kernel_size: 3
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "power-refine"
type: "Power"
bottom: "depth-refine_"
top: "depth-refine"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
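The final `Power` layer computes y = (shift + scale·x)^power; with power = 1 and shift = 0 it simply scales the refined depth by 0.01. A minimal NumPy sketch:

```python
import numpy as np

def power_layer(x, power=1.0, scale=0.01, shift=0.0):
    # Caffe Power layer: y = (shift + scale * x) ** power
    return (shift + scale * x) ** power

y = power_layer(np.array([0.0, 50.0, 100.0]))  # -> [0.0, 0.5, 1.0]
```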
================================================
FILE: source/refining_network/abs/net_train.prototxt
================================================
name: "refining_network_abs_train"
#INPUTS START HERE
#COLOR
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_74x54.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_74x54.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
#INPUTS END HERE
#GLOBAL NETWORK STARTS HERE
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 0
lr_mult: 0
}
param {
decay_mult: 0
lr_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 0
lr_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the batch dimension from the bottom blob
dim: 1
dim: 27
dim: 37 # 1 x 27 x 37 = 999, matching num_output of fc-depth
}
}
}
layer {
name: "mvnDepth-global"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
#GRADIENT NETWORK STARTS HERE
layer {
name: "conv1-grad"
type: "Convolution"
bottom: "X"
top: "conv1-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-grad"
type: "ReLU"
bottom: "conv1-grad"
top: "conv1-grad"
}
layer {
name: "norm1-grad"
type: "LRN"
bottom: "conv1-grad"
top: "norm1-grad"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-grad"
type: "Pooling"
bottom: "norm1-grad"
top: "pool1-grad"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1-grad"
top: "conv2-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2-grad"
type: "ReLU"
bottom: "conv2-grad"
top: "conv2-grad"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2-grad"
top: "conv3-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-grad"
type: "ReLU"
bottom: "conv3-grad"
top: "conv3-grad"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3-grad"
top: "conv4-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-grad"
type: "ReLU"
bottom: "conv4-grad"
top: "conv4-grad"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4-grad"
top: "grad_out"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "mvnGrad"
type: "MVN"
bottom: "grad_out"
top: "gradMVN"
}
#GRADIENT NETWORK ENDS HERE
#PREPROCESSING FOR THE REFINING NETWORK
layer {
name: "upsample"
type: "Deconvolution"
bottom: "gradMVN"
top: "grad-upsample"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 3
stride: 1
num_output: 2
group: 2
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
layer {
name: "concat-global"
bottom: "grad-upsample"
bottom: "depthMVN"
top: "global-output"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "upsample-global"
type: "Deconvolution"
bottom: "global-output"
top: "est"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 2
stride: 2
num_output: 3
group: 3
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
#GLOBAL NETWORK ENDS HERE
#REFINING NETWORK STARTS HERE
layer {
name: "conv1-refine"
type: "Convolution"
bottom: "X"
top: "conv1-refine"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 2
pad: 2
weight_filler {
type: "constant"
# std: 0.001
value: 0
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-refine"
type: "ReLU"
bottom: "conv1-refine"
top: "conv1-refine"
}
layer {
name: "norm1-refine"
type: "LRN"
bottom: "conv1-refine"
top: "norm1-refine"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-refine"
type: "Pooling"
bottom: "norm1-refine"
top: "pool1-refine"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 1
}
}
layer {
name: "concat"
bottom: "pool1-refine"
bottom: "est"
top: "input-refine"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "conv2-refine"
type: "Convolution"
bottom: "input-refine"
top: "conv2-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu2-refine"
type: "ReLU"
bottom: "conv2-refine"
top: "conv2-refine"
}
layer {
name: "conv3-refine"
type: "Convolution"
bottom: "conv2-refine"
top: "conv3-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-refine"
type: "ReLU"
bottom: "conv3-refine"
top: "conv3-refine"
}
layer {
name: "conv4-refine"
type: "Convolution"
bottom: "conv3-refine"
top: "conv4-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-refine"
type: "ReLU"
bottom: "conv4-refine"
top: "conv4-refine"
}
layer {
name: "conv5-refine"
type: "Convolution"
bottom: "conv4-refine"
top: "depth-refine_"
param {
lr_mult: 0.01
decay_mult: 1
}
param {
lr_mult: 0.01
decay_mult: 0
}
convolution_param {
num_output: 1
pad: 1
kernel_size: 3
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "power-refine"
type: "Power"
bottom: "depth-refine_"
top: "depth-refine"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
#LOSSES
layer {
name: "lossABSDepth"
type: "EuclideanLoss"
bottom: "depth-refine"
bottom: "gt"
top: "lossABSDepth"
loss_weight: 1
}
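For reference, the EuclideanLoss layer above computes half the summed squared difference between the two bottom blobs, normalized by batch size. A minimal NumPy sketch of that quantity (the shapes here are illustrative, not the net's actual blob sizes):

```python
import numpy as np

def euclidean_loss(pred, gt):
    # Caffe's EuclideanLoss: sum of squared differences over the whole blob,
    # divided by 2 * N, where N is the batch size (first blob dimension).
    n = pred.shape[0]
    d = pred - gt
    return np.sum(d * d) / (2.0 * n)

# Two batch items of 1x4x4 "depth maps" differing by 1 everywhere:
pred = np.zeros((2, 1, 4, 4))
gt = np.ones((2, 1, 4, 4))
print(euclidean_loss(pred, gt))  # 32 / (2 * 2) = 8.0
```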
================================================
FILE: source/refining_network/eval_depth.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
import matplotlib.pyplot as plt
import sys
from PIL import Image
import cv2
import caffe
import operator
import argparse
import os
from scipy import misc
from os.path import basename
def LogDepth(depth):
depth = np.maximum(depth, 1.0 / 255.0)
return 0.179581 * np.log(depth) + 1
def AbsoluteRelativeDifference(output, gt):
gt = np.maximum(gt, 1.0 / 255.0)
diff = np.mean(np.absolute(output - gt) / gt)
return diff
def SquaredRelativeDifference(output, gt):
gt = np.maximum(gt, 1.0 / 255.0)
d = output - gt
diff = np.mean((d * d) / gt)
return diff
def RootMeanSquaredError(output, gt):
d = output - gt
diff = np.sqrt(np.mean(d * d))
return diff
def RootMeanSquaredErrorLog(output, gt):
d = LogDepth(output / 10.0) * 10.0 - LogDepth(gt / 10.0) * 10.0
diff = np.sqrt(np.mean(d * d))
return diff
def ScaleInvariantMeanSquaredError(output, gt):
output = LogDepth(output / 10.0) * 10.0
gt = LogDepth(gt / 10.0) * 10.0
d = output - gt
diff = np.mean(d * d)
relDiff = (d.sum() * d.sum()) / float(d.size * d.size)
return diff - relDiff
def Log10Error(output, gt):
output = np.maximum(output, 1.0 / 255.0)
gt = np.maximum(gt, 1.0 / 255.0)
diff = np.mean(np.absolute(np.log10(output) - np.log10(gt)))
return diff
def MVNError(output, gt):
outMean = np.mean(output)
outStd = np.std(output)
output = (output - outMean)/outStd
gtMean = np.mean(gt)
gtStd = np.std(gt)
gt = (gt - gtMean)/gtStd
d = output - gt
diff = np.sqrt(np.mean(d * d))
return diff
def Threshold(output, gt, threshold):
output = np.maximum(output, 1.0 / 255.0)
gt = np.maximum(gt, 1.0 / 255.0)
withinThresholdCount = np.where(np.maximum(output / gt, gt / output) < threshold)[0].size
return withinThresholdCount / float(gt.size)
def Test(out, gt):
absRelDiff = AbsoluteRelativeDifference(out, gt)
sqrRelDiff = SquaredRelativeDifference(out, gt)
RMSE = RootMeanSquaredError(out, gt)
RMSELog = RootMeanSquaredErrorLog(out, gt)
SIMSE = ScaleInvariantMeanSquaredError(out, gt)
threshold1 = Threshold(out, gt, 1.25)
threshold2 = Threshold(out, gt, 1.25 * 1.25)
threshold3 = Threshold(out, gt, 1.25 * 1.25 * 1.25)
log10 = Log10Error(out, gt)
MVN = MVNError(out, gt)
return [absRelDiff, sqrRelDiff, RMSE, RMSELog, SIMSE, log10, MVN, threshold1, threshold2, threshold3]
def PrintTop5(title, result):
	# Prints up to the top 10 entries (the name is historical).
	length = min(10, len(result))
	print("")
	print("")
	print("TOP " + str(length) + " for " + title)
	for i in range(length):
		print(str(i) + ". " + result[i][0] + ': ' + str(result[i][1]))
	print("")
	print("")
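The helpers above can be exercised on synthetic arrays; a minimal self-contained sketch (hypothetical values, mirroring RootMeanSquaredError and Threshold above):

```python
import numpy as np

def rmse(output, gt):
    # Mirrors RootMeanSquaredError above.
    d = output - gt
    return np.sqrt(np.mean(d * d))

def threshold_accuracy(output, gt, threshold):
    # Mirrors Threshold above: fraction of pixels whose ratio to ground
    # truth (in either direction) stays under the threshold.
    output = np.maximum(output, 1.0 / 255.0)
    gt = np.maximum(gt, 1.0 / 255.0)
    ratio = np.maximum(output / gt, gt / output)
    return np.count_nonzero(ratio < threshold) / float(gt.size)

# Synthetic 54x74 depth maps: prediction uniformly 10% too deep.
gt = np.full((54, 74), 2.0)
pred = gt * 1.1
print(round(rmse(pred, gt), 3))            # 0.2
print(threshold_accuracy(pred, gt, 1.25))  # 1.0 -- ratio 1.1 is under 1.25
print(threshold_accuracy(pred, gt, 1.05))  # 0.0 -- ratio 1.1 exceeds 1.05
```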
================================================
FILE: source/refining_network/log_abs/net_deploy.prototxt
================================================
name: "refining_network_log_abs_deploy"
#INPUTS
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
#GLOBAL NETWORK STARTS HERE
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 0
lr_mult: 0
}
param {
decay_mult: 0
lr_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 0
lr_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999, matching fc-depth's num_output
}
}
}
layer {
name: "mvnDepth-global"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
#GRADIENT NETWORK STARTS HERE
layer {
name: "conv1-grad"
type: "Convolution"
bottom: "X"
top: "conv1-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-grad"
type: "ReLU"
bottom: "conv1-grad"
top: "conv1-grad"
}
layer {
name: "norm1-grad"
type: "LRN"
bottom: "conv1-grad"
top: "norm1-grad"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-grad"
type: "Pooling"
bottom: "norm1-grad"
top: "pool1-grad"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1-grad"
top: "conv2-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2-grad"
type: "ReLU"
bottom: "conv2-grad"
top: "conv2-grad"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2-grad"
top: "conv3-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-grad"
type: "ReLU"
bottom: "conv3-grad"
top: "conv3-grad"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3-grad"
top: "conv4-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-grad"
type: "ReLU"
bottom: "conv4-grad"
top: "conv4-grad"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4-grad"
top: "grad_out"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "mvnGrad"
type: "MVN"
bottom: "grad_out"
top: "gradMVN"
}
#GRADIENT NETWORK ENDS HERE
#PREPROCESSING FOR THE REFINING NETWORK
layer {
name: "upsample"
type: "Deconvolution"
bottom: "gradMVN"
top: "grad-upsample"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 3
stride: 1
num_output: 2
group: 2
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
layer {
name: "concat-global"
bottom: "grad-upsample"
bottom: "depthMVN"
top: "global-output"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "upsample-global"
type: "Deconvolution"
bottom: "global-output"
top: "est"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 2
stride: 2
num_output: 3
group: 3
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
#GLOBAL NETWORK ENDS HERE
#REFINING NETWORK STARTS HERE
layer {
name: "conv1-refine"
type: "Convolution"
bottom: "X"
top: "conv1-refine"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 2
pad: 2
weight_filler {
type: "constant"
# std: 0.001
value: 0
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-refine"
type: "ReLU"
bottom: "conv1-refine"
top: "conv1-refine"
}
layer {
name: "norm1-refine"
type: "LRN"
bottom: "conv1-refine"
top: "norm1-refine"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-refine"
type: "Pooling"
bottom: "norm1-refine"
top: "pool1-refine"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 1
}
}
layer {
name: "concat"
bottom: "pool1-refine"
bottom: "est"
top: "input-refine"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "conv2-refine"
type: "Convolution"
bottom: "input-refine"
top: "conv2-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu2-refine"
type: "ReLU"
bottom: "conv2-refine"
top: "conv2-refine"
}
layer {
name: "conv3-refine"
type: "Convolution"
bottom: "conv2-refine"
top: "conv3-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-refine"
type: "ReLU"
bottom: "conv3-refine"
top: "conv3-refine"
}
layer {
name: "conv4-refine"
type: "Convolution"
bottom: "conv3-refine"
top: "conv4-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-refine"
type: "ReLU"
bottom: "conv4-refine"
top: "conv4-refine"
}
layer {
name: "conv5-refine"
type: "Convolution"
bottom: "conv4-refine"
top: "depth-refine_"
param {
lr_mult: 0.01
decay_mult: 1
}
param {
lr_mult: 0.01
decay_mult: 0
}
convolution_param {
num_output: 1
pad: 1
kernel_size: 3
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "power-refine"
type: "Power"
bottom: "depth-refine_"
top: "depth-refine"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
================================================
FILE: source/refining_network/log_abs/net_train.prototxt
================================================
name: "refining_network_log_abs_train"
#INPUTS START HERE
#COLOR
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_74x54.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_74x54.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
#INPUTS END HERE
#GLOBAL NETWORK STARTS HERE
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 0
lr_mult: 0
}
param {
decay_mult: 0
lr_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 0
lr_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999, matching fc-depth's num_output
}
}
}
layer {
name: "mvnDepth-global"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
#GRADIENT NETWORK STARTS HERE
layer {
name: "conv1-grad"
type: "Convolution"
bottom: "X"
top: "conv1-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-grad"
type: "ReLU"
bottom: "conv1-grad"
top: "conv1-grad"
}
layer {
name: "norm1-grad"
type: "LRN"
bottom: "conv1-grad"
top: "norm1-grad"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-grad"
type: "Pooling"
bottom: "norm1-grad"
top: "pool1-grad"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1-grad"
top: "conv2-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2-grad"
type: "ReLU"
bottom: "conv2-grad"
top: "conv2-grad"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2-grad"
top: "conv3-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-grad"
type: "ReLU"
bottom: "conv3-grad"
top: "conv3-grad"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3-grad"
top: "conv4-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-grad"
type: "ReLU"
bottom: "conv4-grad"
top: "conv4-grad"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4-grad"
top: "grad_out"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "mvnGrad"
type: "MVN"
bottom: "grad_out"
top: "gradMVN"
}
#GRADIENT NETWORK ENDS HERE
#PREPROCESSING FOR THE REFINING NETWORK
layer {
name: "upsample"
type: "Deconvolution"
bottom: "gradMVN"
top: "grad-upsample"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 3
stride: 1
num_output: 2
group: 2
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
layer {
name: "concat-global"
bottom: "grad-upsample"
bottom: "depthMVN"
top: "global-output"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "upsample-global"
type: "Deconvolution"
bottom: "global-output"
top: "est"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 2
stride: 2
num_output: 3
group: 3
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
#GLOBAL NETWORK ENDS HERE
#REFINING NETWORK STARTS HERE
layer {
name: "conv1-refine"
type: "Convolution"
bottom: "X"
top: "conv1-refine"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 2
pad: 2
weight_filler {
type: "constant"
# std: 0.001
value: 0
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-refine"
type: "ReLU"
bottom: "conv1-refine"
top: "conv1-refine"
}
layer {
name: "norm1-refine"
type: "LRN"
bottom: "conv1-refine"
top: "norm1-refine"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-refine"
type: "Pooling"
bottom: "norm1-refine"
top: "pool1-refine"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 1
}
}
layer {
name: "concat"
bottom: "pool1-refine"
bottom: "est"
top: "input-refine"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "conv2-refine"
type: "Convolution"
bottom: "input-refine"
top: "conv2-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu2-refine"
type: "ReLU"
bottom: "conv2-refine"
top: "conv2-refine"
}
layer {
name: "conv3-refine"
type: "Convolution"
bottom: "conv2-refine"
top: "conv3-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-refine"
type: "ReLU"
bottom: "conv3-refine"
top: "conv3-refine"
}
layer {
name: "conv4-refine"
type: "Convolution"
bottom: "conv3-refine"
top: "conv4-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-refine"
type: "ReLU"
bottom: "conv4-refine"
top: "conv4-refine"
}
layer {
name: "conv5-refine"
type: "Convolution"
bottom: "conv4-refine"
top: "depth-refine_"
param {
lr_mult: 0.01
decay_mult: 1
}
param {
lr_mult: 0.01
decay_mult: 0
}
convolution_param {
num_output: 1
pad: 1
kernel_size: 3
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "power-refine"
type: "Power"
bottom: "depth-refine_"
top: "depth-refine"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
#LOSSES
layer {
name: "log"
type: "Log"
bottom: "gt"
top: "lnGt"
log_param {
shift: 0.00392156863
scale: 0.996078431
}
}
layer {
name: "power"
type: "Power"
bottom: "lnGt"
top: "logGt"
power_param {
power: 1
scale: 0.179581
shift: 1.0
}
}
layer {
name: "lossABSDepth"
type: "EuclideanLoss"
bottom: "depth-refine"
bottom: "logGt"
top: "lossABSDepth"
loss_weight: 1
}
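The log/power pair above maps the ground truth into the same log-depth space as LogDepth in eval_depth.py: the Log layer computes ln(1/255 + (254/255)*gt), a soft clamp at 1/255 (assuming Caffe's natural-log default base), and the Power layer then applies 0.179581*x + 1. A quick NumPy sanity check of that equivalence:

```python
import numpy as np

def log_depth(depth):
    # LogDepth from eval_depth.py: hard clamp at 1/255, then a log mapping.
    depth = np.maximum(depth, 1.0 / 255.0)
    return 0.179581 * np.log(depth) + 1

def caffe_log_power(gt):
    # Log layer: ln(shift + scale * x) with shift = 1/255, scale = 254/255
    # (a soft clamp), followed by the Power layer: 0.179581 * x + 1.
    ln_gt = np.log(0.00392156863 + 0.996078431 * gt)
    return 0.179581 * ln_gt + 1.0

gt = np.linspace(0.0, 1.0, 5)          # normalized ground-truth depths
diff = np.abs(log_depth(gt) - caffe_log_power(gt))
print(diff[gt >= 0.25].max() < 0.005)  # close away from the clamp region
```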
================================================
FILE: source/refining_network/norm_abs/net_deploy.prototxt
================================================
name: "refining_network_norm_abs_deploy"
#INPUTS
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
#GLOBAL NETWORK STARTS HERE
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 0
lr_mult: 0
}
param {
decay_mult: 0
lr_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 0
lr_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999, matching fc-depth's num_output
}
}
}
layer {
name: "mvnDepth-global"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
#GRADIENT NETWORK STARTS HERE
layer {
name: "conv1-grad"
type: "Convolution"
bottom: "X"
top: "conv1-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-grad"
type: "ReLU"
bottom: "conv1-grad"
top: "conv1-grad"
}
layer {
name: "norm1-grad"
type: "LRN"
bottom: "conv1-grad"
top: "norm1-grad"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-grad"
type: "Pooling"
bottom: "norm1-grad"
top: "pool1-grad"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1-grad"
top: "conv2-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2-grad"
type: "ReLU"
bottom: "conv2-grad"
top: "conv2-grad"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2-grad"
top: "conv3-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-grad"
type: "ReLU"
bottom: "conv3-grad"
top: "conv3-grad"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3-grad"
top: "conv4-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-grad"
type: "ReLU"
bottom: "conv4-grad"
top: "conv4-grad"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4-grad"
top: "grad_out"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "mvnGrad"
type: "MVN"
bottom: "grad_out"
top: "gradMVN"
}
#GRADIENT NETWORK ENDS HERE
#PREPROCESSING FOR THE REFINING NETWORK
layer {
name: "upsample"
type: "Deconvolution"
bottom: "gradMVN"
top: "grad-upsample"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 3
stride: 1
num_output: 2
group: 2
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
layer {
name: "concat-global"
bottom: "grad-upsample"
bottom: "depthMVN"
top: "global-output"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "upsample-global"
type: "Deconvolution"
bottom: "global-output"
top: "est"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 2
stride: 2
num_output: 3
group: 3
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
#GLOBAL NETWORK ENDS HERE
#REFINING NETWORK STARTS HERE
layer {
name: "conv1-refine"
type: "Convolution"
bottom: "X"
top: "conv1-refine"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 2
pad: 2
weight_filler {
type: "constant"
# std: 0.001
value: 0
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-refine"
type: "ReLU"
bottom: "conv1-refine"
top: "conv1-refine"
}
layer {
name: "norm1-refine"
type: "LRN"
bottom: "conv1-refine"
top: "norm1-refine"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-refine"
type: "Pooling"
bottom: "norm1-refine"
top: "pool1-refine"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 1
}
}
layer {
name: "concat"
bottom: "pool1-refine"
bottom: "est"
top: "input-refine"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "conv2-refine"
type: "Convolution"
bottom: "input-refine"
top: "conv2-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu2-refine"
type: "ReLU"
bottom: "conv2-refine"
top: "conv2-refine"
}
layer {
name: "conv3-refine"
type: "Convolution"
bottom: "conv2-refine"
top: "conv3-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-refine"
type: "ReLU"
bottom: "conv3-refine"
top: "conv3-refine"
}
layer {
name: "conv4-refine"
type: "Convolution"
bottom: "conv3-refine"
top: "conv4-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-refine"
type: "ReLU"
bottom: "conv4-refine"
top: "conv4-refine"
}
layer {
name: "conv5-refine"
type: "Convolution"
bottom: "conv4-refine"
top: "depth-refine_"
param {
lr_mult: 0.01
decay_mult: 1
}
param {
lr_mult: 0.01
decay_mult: 0
}
convolution_param {
num_output: 1
pad: 1
kernel_size: 3
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "power-refine"
type: "Power"
bottom: "depth-refine_"
top: "depth-refine"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
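The `mvnGrad` and `mvnDepth-global` layers above use Caffe's MVN layer with default settings, which normalizes each channel of each sample to zero mean and unit variance. A minimal NumPy sketch (not from this repo, assuming the defaults `across_channels: false`, `normalize_variance: true`):

```python
import numpy as np

def mvn(blob, eps=1e-9):
    """Mean-variance normalization as in Caffe's MVN layer:
    per sample, per channel, zero mean and unit variance."""
    n, c = blob.shape[:2]
    out = np.empty_like(blob, dtype=np.float64)
    for i in range(n):
        for j in range(c):
            x = blob[i, j].astype(np.float64)
            out[i, j] = (x - x.mean()) / (x.std() + eps)
    return out

# e.g. the coarse "depth" blob of shape (1, 1, 27, 37) entering "mvnDepth-global"
depth = np.random.rand(1, 1, 27, 37)
norm = mvn(depth)
```

Normalizing both the network output and the ground truth this way is what makes the `lossMVNDepth` term scale-invariant: only the relative depth structure is compared.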
================================================
FILE: source/refining_network/norm_abs/net_train.prototxt
================================================
name: "refining_network_norm_abs_train"
#INPUTS START HERE
#COLOR
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_74x54.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_74x54.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
#INPUTS END HERE
#GLOBAL NETWORK STARTS HERE
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 0
lr_mult: 0
}
param {
decay_mult: 0
lr_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 0
lr_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # keep the batch dimension from the bottom blob
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 outputs from fc-depth
}
}
}
layer {
name: "mvnDepth-global"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
#GRADIENT NETWORK STARTS HERE
layer {
name: "conv1-grad"
type: "Convolution"
bottom: "X"
top: "conv1-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-grad"
type: "ReLU"
bottom: "conv1-grad"
top: "conv1-grad"
}
layer {
name: "norm1-grad"
type: "LRN"
bottom: "conv1-grad"
top: "norm1-grad"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-grad"
type: "Pooling"
bottom: "norm1-grad"
top: "pool1-grad"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1-grad"
top: "conv2-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2-grad"
type: "ReLU"
bottom: "conv2-grad"
top: "conv2-grad"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2-grad"
top: "conv3-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-grad"
type: "ReLU"
bottom: "conv3-grad"
top: "conv3-grad"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3-grad"
top: "conv4-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-grad"
type: "ReLU"
bottom: "conv4-grad"
top: "conv4-grad"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4-grad"
top: "grad_out"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "mvnGrad"
type: "MVN"
bottom: "grad_out"
top: "gradMVN"
}
#GRADIENT NETWORK ENDS HERE
#PREPROCESSING FOR THE REFINING NETWORK
layer {
name: "upsample"
type: "Deconvolution"
bottom: "gradMVN"
top: "grad-upsample"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 3
stride: 1
num_output: 2
group: 2
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
layer {
name: "concat-global"
bottom: "grad-upsample"
bottom: "depthMVN"
top: "global-output"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "upsample-global"
type: "Deconvolution"
bottom: "global-output"
top: "est"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 2
stride: 2
num_output: 3
group: 3
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
#GLOBAL NETWORK ENDS HERE
#REFINING NETWORK STARTS HERE
layer {
name: "conv1-refine"
type: "Convolution"
bottom: "X"
top: "conv1-refine"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 2
pad: 2
weight_filler {
type: "constant"
# std: 0.001
value: 0
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-refine"
type: "ReLU"
bottom: "conv1-refine"
top: "conv1-refine"
}
layer {
name: "norm1-refine"
type: "LRN"
bottom: "conv1-refine"
top: "norm1-refine"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-refine"
type: "Pooling"
bottom: "norm1-refine"
top: "pool1-refine"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 1
}
}
layer {
name: "concat"
bottom: "pool1-refine"
bottom: "est"
top: "input-refine"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "conv2-refine"
type: "Convolution"
bottom: "input-refine"
top: "conv2-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu2-refine"
type: "ReLU"
bottom: "conv2-refine"
top: "conv2-refine"
}
layer {
name: "conv3-refine"
type: "Convolution"
bottom: "conv2-refine"
top: "conv3-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-refine"
type: "ReLU"
bottom: "conv3-refine"
top: "conv3-refine"
}
layer {
name: "conv4-refine"
type: "Convolution"
bottom: "conv3-refine"
top: "conv4-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-refine"
type: "ReLU"
bottom: "conv4-refine"
top: "conv4-refine"
}
layer {
name: "conv5-refine"
type: "Convolution"
bottom: "conv4-refine"
top: "depth-refine_"
param {
lr_mult: 0.01
decay_mult: 1
}
param {
lr_mult: 0.01
decay_mult: 0
}
convolution_param {
num_output: 1
pad: 1
kernel_size: 3
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "power-refine"
type: "Power"
bottom: "depth-refine_"
top: "depth-refine"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
#LOSSES
layer {
name: "mvnDepthRefine"
type: "MVN"
bottom: "depth-refine"
top: "depthMVN-refine"
}
layer {
name: "mvnGT"
type: "MVN"
bottom: "gt"
top: "gtMVN"
}
layer {
name: "lossMVNDepth"
type: "EuclideanLoss"
bottom: "depthMVN-refine"
bottom: "gtMVN"
top: "lossMVNDepth"
loss_weight: 0.5
}
layer {
name: "lossABSDepth"
type: "EuclideanLoss"
bottom: "depth-refine"
bottom: "gt"
top: "lossABSDepth"
loss_weight: 0.5
}
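The two loss layers above combine a scale-invariant term (Euclidean loss between MVN-normalized prediction and ground truth) and an absolute term, each weighted 0.5. A rough NumPy sketch of what the solver minimizes (not repo code; Caffe's `EuclideanLoss` is the sum of squared differences divided by twice the batch size, and the `scale: 0.00390625` on the depth inputs is 1/256):

```python
import numpy as np

def mvn(x, eps=1e-9):
    # per-sample, per-channel mean-variance normalization (Caffe MVN defaults)
    m = x.mean(axis=(2, 3), keepdims=True)
    s = x.std(axis=(2, 3), keepdims=True)
    return (x - m) / (s + eps)

def euclidean_loss(a, b):
    # Caffe EuclideanLoss: sum of squared differences / (2 * batch size)
    return ((a - b) ** 2).sum() / (2.0 * a.shape[0])

pred = np.random.rand(16, 1, 54, 74)   # "depth-refine", batch_size 16
gt   = np.random.rand(16, 1, 54, 74)   # "gt" after scaling by 1/256

# loss_weight 0.5 on each term, as in the prototxt above
total = 0.5 * euclidean_loss(mvn(pred), mvn(gt)) + 0.5 * euclidean_loss(pred, gt)
```

The MVN term rewards getting the relative depth layout right regardless of scale, while the absolute term anchors the prediction to metric depth values.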
================================================
FILE: source/refining_network/norm_abs_global_only/net_deploy.prototxt
================================================
name: "refining_network_norm_abs_global_only_deploy"
#INPUTS
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
#GLOBAL NETWORK STARTS HERE
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 0
lr_mult: 0
}
param {
decay_mult: 0
lr_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 0
lr_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # keep the batch dimension from the bottom blob
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 outputs from fc-depth
}
}
}
layer {
name: "mvnDepth-global"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
#PREPROCESSING FOR THE REFINING NETWORK
layer {
name: "upsample-global"
type: "Deconvolution"
bottom: "depthMVN"
top: "est"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 2
stride: 2
num_output: 1
group: 1
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
#GLOBAL NETWORK ENDS HERE
#REFINING NETWORK STARTS HERE
layer {
name: "conv1-refine"
type: "Convolution"
bottom: "X"
top: "conv1-refine"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 2
pad: 2
weight_filler {
type: "constant"
# std: 0.001
value: 0
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-refine"
type: "ReLU"
bottom: "conv1-refine"
top: "conv1-refine"
}
layer {
name: "norm1-refine"
type: "LRN"
bottom: "conv1-refine"
top: "norm1-refine"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-refine"
type: "Pooling"
bottom: "norm1-refine"
top: "pool1-refine"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 1
}
}
layer {
name: "concat"
bottom: "pool1-refine"
bottom: "est"
top: "input-refine"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "conv2-refine"
type: "Convolution"
bottom: "input-refine"
top: "conv2-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu2-refine"
type: "ReLU"
bottom: "conv2-refine"
top: "conv2-refine"
}
layer {
name: "conv3-refine"
type: "Convolution"
bottom: "conv2-refine"
top: "conv3-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-refine"
type: "ReLU"
bottom: "conv3-refine"
top: "conv3-refine"
}
layer {
name: "conv4-refine"
type: "Convolution"
bottom: "conv3-refine"
top: "conv4-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-refine"
type: "ReLU"
bottom: "conv4-refine"
top: "conv4-refine"
}
layer {
name: "conv5-refine"
type: "Convolution"
bottom: "conv4-refine"
top: "depth-refine_"
param {
lr_mult: 0.01
decay_mult: 1
}
param {
lr_mult: 0.01
decay_mult: 0
}
convolution_param {
num_output: 1
pad: 1
kernel_size: 3
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "power-refine"
type: "Power"
bottom: "depth-refine_"
top: "depth-refine"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
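The `Deconvolution` upsampling layers above rely on the `"bilinear"` weight filler with `lr_mult: 0`, i.e. fixed interpolation weights, and `group` equal to `num_output` so each channel is upsampled independently. A sketch of the kernel Caffe's bilinear filler generates (following its `BilinearFiller` formula; this is illustrative code, not part of the repo):

```python
import math
import numpy as np

def bilinear_kernel(k):
    """Weights Caffe's "bilinear" filler produces for a k x k kernel,
    per its BilinearFiller: w[y, x] = (1 - |x/f - c|) * (1 - |y/f - c|)."""
    f = math.ceil(k / 2.0)
    c = (2 * f - 1 - f % 2) / (2.0 * f)
    w = np.zeros((k, k))
    for y in range(k):
        for x in range(k):
            w[y, x] = (1 - abs(x / f - c)) * (1 - abs(y / f - c))
    return w

# kernel_size 3, stride 1: the "upsample" layer for the gradient branch
w3 = bilinear_kernel(3)
# kernel_size 2, stride 2: the 2x "upsample-global" layer
w2 = bilinear_kernel(2)
```

Note that for `kernel_size: 2` the formula degenerates to a single nonzero tap, so the 2x upsampling effectively replicates each input pixel into its 2x2 output block's corner; larger odd kernels give genuine bilinear interpolation weights.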
================================================
FILE: source/refining_network/norm_abs_global_only/net_train.prototxt
================================================
name: "refining_network_norm_abs_global_only_train"
#INPUTS START HERE
#COLOR
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_74x54.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_74x54.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
#INPUTS END HERE
#GLOBAL NETWORK STARTS HERE
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 0
lr_mult: 0
}
param {
decay_mult: 0
lr_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 0
lr_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # keep the batch dimension from the bottom blob
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 outputs from fc-depth
}
}
}
layer {
name: "mvnDepth-global"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
#PREPROCESSING FOR THE REFINING NETWORK
layer {
name: "upsample-global"
type: "Deconvolution"
bottom: "depthMVN"
top: "est"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 2
stride: 2
num_output: 1
group: 1
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
#GLOBAL NETWORK ENDS HERE
#REFINING NETWORK STARTS HERE
layer {
name: "conv1-refine"
type: "Convolution"
bottom: "X"
top: "conv1-refine"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 2
pad: 2
weight_filler {
type: "constant"
# std: 0.001
value: 0
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-refine"
type: "ReLU"
bottom: "conv1-refine"
top: "conv1-refine"
}
layer {
name: "norm1-refine"
type: "LRN"
bottom: "conv1-refine"
top: "norm1-refine"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-refine"
type: "Pooling"
bottom: "norm1-refine"
top: "pool1-refine"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 1
}
}
layer {
name: "concat"
bottom: "pool1-refine"
bottom: "est"
top: "input-refine"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "conv2-refine"
type: "Convolution"
bottom: "input-refine"
top: "conv2-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu2-refine"
type: "ReLU"
bottom: "conv2-refine"
top: "conv2-refine"
}
layer {
name: "conv3-refine"
type: "Convolution"
bottom: "conv2-refine"
top: "conv3-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-refine"
type: "ReLU"
bottom: "conv3-refine"
top: "conv3-refine"
}
layer {
name: "conv4-refine"
type: "Convolution"
bottom: "conv3-refine"
top: "conv4-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-refine"
type: "ReLU"
bottom: "conv4-refine"
top: "conv4-refine"
}
layer {
name: "conv5-refine"
type: "Convolution"
bottom: "conv4-refine"
top: "depth-refine_"
param {
lr_mult: 0.01
decay_mult: 1
}
param {
lr_mult: 0.01
decay_mult: 0
}
convolution_param {
num_output: 1
pad: 1
kernel_size: 3
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "power-refine"
type: "Power"
bottom: "depth-refine_"
top: "depth-refine"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
#LOSSES
layer {
name: "mvnDepthRefine"
type: "MVN"
bottom: "depth-refine"
top: "depthMVN-refine"
}
layer {
name: "mvnGT"
type: "MVN"
bottom: "gt"
top: "gtMVN"
}
layer {
name: "lossMVNDepth"
type: "EuclideanLoss"
bottom: "depthMVN-refine"
bottom: "gtMVN"
top: "lossMVNDepth"
loss_weight: 0.5
}
layer {
name: "lossABSDepth"
type: "EuclideanLoss"
bottom: "depth-refine"
bottom: "gt"
top: "lossABSDepth"
loss_weight: 0.5
}
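Throughout these prototxts, the global branch ends with `fc-depth` producing 999 values per image, which the `Reshape` layer views as a 1 x 27 x 37 coarse depth map. A trivial NumPy illustration of that layer's semantics (illustrative only; `dim: 0` keeps the batch axis from the bottom blob):

```python
import numpy as np

# "fc-depth" emits 999 values per image; Reshape views them as 1 x 27 x 37
# (27 * 37 = 999), matching the downsampled depth resolution.
batch = 16
fc_depth = np.random.rand(batch, 999)
depth = fc_depth.reshape(batch, 1, 27, 37)
```

The reshaped map is then MVN-normalized and bilinearly upsampled to 54 x 74 before being concatenated with the refining branch's first-layer features.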
================================================
FILE: source/refining_network/sc-inv_abs/net_deploy.prototxt
================================================
name: "refining_network_sc-inv_abs_deploy"
#INPUTS
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
#GLOBAL NETWORK STARTS HERE
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 0
lr_mult: 0
}
param {
decay_mult: 0
lr_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 0
lr_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # keep the batch dimension from the bottom blob
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 outputs from fc-depth
}
}
}
layer {
name: "mvnDepth-global"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
#GRADIENT NETWORK STARTS HERE
layer {
name: "conv1-grad"
type: "Convolution"
bottom: "X"
top: "conv1-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-grad"
type: "ReLU"
bottom: "conv1-grad"
top: "conv1-grad"
}
layer {
name: "norm1-grad"
type: "LRN"
bottom: "conv1-grad"
top: "norm1-grad"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-grad"
type: "Pooling"
bottom: "norm1-grad"
top: "pool1-grad"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1-grad"
top: "conv2-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2-grad"
type: "ReLU"
bottom: "conv2-grad"
top: "conv2-grad"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2-grad"
top: "conv3-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-grad"
type: "ReLU"
bottom: "conv3-grad"
top: "conv3-grad"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3-grad"
top: "conv4-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-grad"
type: "ReLU"
bottom: "conv4-grad"
top: "conv4-grad"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4-grad"
top: "grad_out"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "mvnGrad"
type: "MVN"
bottom: "grad_out"
top: "gradMVN"
}
#GRADIENT NETWORK ENDS HERE
#PREPROCESSING FOR THE REFINING NETWORK
layer {
name: "upsample"
type: "Deconvolution"
bottom: "gradMVN"
top: "grad-upsample"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 3
stride: 1
num_output: 2
group: 2
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
layer {
name: "concat-global"
bottom: "grad-upsample"
bottom: "depthMVN"
top: "global-output"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "upsample-global"
type: "Deconvolution"
bottom: "global-output"
top: "est"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 2
stride: 2
num_output: 3
group: 3
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
#GLOBAL NETWORK ENDS HERE
#REFINING NETWORK STARTS HERE
layer {
name: "conv1-refine"
type: "Convolution"
bottom: "X"
top: "conv1-refine"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 2
pad: 2
weight_filler {
type: "constant"
# std: 0.001
value: 0
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-refine"
type: "ReLU"
bottom: "conv1-refine"
top: "conv1-refine"
}
layer {
name: "norm1-refine"
type: "LRN"
bottom: "conv1-refine"
top: "norm1-refine"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-refine"
type: "Pooling"
bottom: "norm1-refine"
top: "pool1-refine"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 1
}
}
layer {
name: "concat"
bottom: "pool1-refine"
bottom: "est"
top: "input-refine"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "conv2-refine"
type: "Convolution"
bottom: "input-refine"
top: "conv2-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu2-refine"
type: "ReLU"
bottom: "conv2-refine"
top: "conv2-refine"
}
layer {
name: "conv3-refine"
type: "Convolution"
bottom: "conv2-refine"
top: "conv3-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-refine"
type: "ReLU"
bottom: "conv3-refine"
top: "conv3-refine"
}
layer {
name: "conv4-refine"
type: "Convolution"
bottom: "conv3-refine"
top: "conv4-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-refine"
type: "ReLU"
bottom: "conv4-refine"
top: "conv4-refine"
}
layer {
name: "conv5-refine"
type: "Convolution"
bottom: "conv4-refine"
top: "depth-refine_"
param {
lr_mult: 0.01
decay_mult: 1
}
param {
lr_mult: 0.01
decay_mult: 0
}
convolution_param {
num_output: 1
pad: 1
kernel_size: 3
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "power-refine"
type: "Power"
bottom: "depth-refine_"
top: "depth-refine"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
================================================
FILE: source/refining_network/sc-inv_abs/net_train.prototxt
================================================
name: "refining_network_sc-inv_abs_train"
#INPUTS START HERE
#COLOR
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_74x54.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_74x54.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
#INPUTS END HERE
#GLOBAL NETWORK STARTS HERE
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 0
lr_mult: 0
}
param {
decay_mult: 0
lr_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 0
lr_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the batch dimension from the bottom blob
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999, matching fc-depth's num_output
}
}
}
layer {
name: "mvnDepth-global"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
#GRADIENT NETWORK STARTS HERE
layer {
name: "conv1-grad"
type: "Convolution"
bottom: "X"
top: "conv1-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-grad"
type: "ReLU"
bottom: "conv1-grad"
top: "conv1-grad"
}
layer {
name: "norm1-grad"
type: "LRN"
bottom: "conv1-grad"
top: "norm1-grad"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-grad"
type: "Pooling"
bottom: "norm1-grad"
top: "pool1-grad"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1-grad"
top: "conv2-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2-grad"
type: "ReLU"
bottom: "conv2-grad"
top: "conv2-grad"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2-grad"
top: "conv3-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-grad"
type: "ReLU"
bottom: "conv3-grad"
top: "conv3-grad"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3-grad"
top: "conv4-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-grad"
type: "ReLU"
bottom: "conv4-grad"
top: "conv4-grad"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4-grad"
top: "grad_out"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "mvnGrad"
type: "MVN"
bottom: "grad_out"
top: "gradMVN"
}
#GRADIENT NETWORK ENDS HERE
#PREPROCESSING FOR THE REFINING NETWORK
layer {
name: "upsample"
type: "Deconvolution"
bottom: "gradMVN"
top: "grad-upsample"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 3
stride: 1
num_output: 2
group: 2
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
layer {
name: "concat-global"
bottom: "grad-upsample"
bottom: "depthMVN"
top: "global-output"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "upsample-global"
type: "Deconvolution"
bottom: "global-output"
top: "est"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 2
stride: 2
num_output: 3
group: 3
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
#GLOBAL NETWORK ENDS HERE
#REFINING NETWORK STARTS HERE
layer {
name: "conv1-refine"
type: "Convolution"
bottom: "X"
top: "conv1-refine"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 2
pad: 2
weight_filler {
type: "constant"
# std: 0.001
value: 0
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-refine"
type: "ReLU"
bottom: "conv1-refine"
top: "conv1-refine"
}
layer {
name: "norm1-refine"
type: "LRN"
bottom: "conv1-refine"
top: "norm1-refine"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-refine"
type: "Pooling"
bottom: "norm1-refine"
top: "pool1-refine"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 1
}
}
layer {
name: "concat"
bottom: "pool1-refine"
bottom: "est"
top: "input-refine"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "conv2-refine"
type: "Convolution"
bottom: "input-refine"
top: "conv2-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu2-refine"
type: "ReLU"
bottom: "conv2-refine"
top: "conv2-refine"
}
layer {
name: "conv3-refine"
type: "Convolution"
bottom: "conv2-refine"
top: "conv3-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-refine"
type: "ReLU"
bottom: "conv3-refine"
top: "conv3-refine"
}
layer {
name: "conv4-refine"
type: "Convolution"
bottom: "conv3-refine"
top: "conv4-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-refine"
type: "ReLU"
bottom: "conv4-refine"
top: "conv4-refine"
}
layer {
name: "conv5-refine"
type: "Convolution"
bottom: "conv4-refine"
top: "depth-refine_"
param {
lr_mult: 0.01
decay_mult: 1
}
param {
lr_mult: 0.01
decay_mult: 0
}
convolution_param {
num_output: 1
pad: 1
kernel_size: 3
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "power-refine"
type: "Power"
bottom: "depth-refine_"
top: "depth-refine"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
#LOSSES
layer {
name: "log"
type: "Log"
bottom: "gt"
top: "lnGt"
log_param {
shift: 0.00392156863
scale: 0.996078431
}
}
layer {
name: "power"
type: "Power"
bottom: "lnGt"
top: "logGt"
power_param {
power: 1
scale: 0.179581
shift: 1.0
}
}
layer {
name: "mvnDepthRefine"
type: "MVN"
bottom: "depth-refine"
top: "depthMVN-refine"
mvn_param{normalize_variance: false}
}
layer {
name: "mvnGT"
type: "MVN"
bottom: "logGt"
top: "gtMVN"
mvn_param{normalize_variance: false}
}
layer {
name: "lossMVNDepth"
type: "EuclideanLoss"
bottom: "depthMVN-refine"
bottom: "gtMVN"
top: "lossMVNDepth"
loss_weight: 0.5
}
layer {
name: "lossABSDepth"
type: "EuclideanLoss"
bottom: "depth-refine"
bottom: "logGt"
top: "lossABSDepth"
loss_weight: 0.5
}
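The Log/Power pair in the loss section encodes ground-truth depth roughly as `0.179581 * ln(depth) + 1` (the Log layer's small shift/scale just guards against `ln(0)` after the 1/256 LMDB scaling), and `test_depth.py` inverts it with `exp((x - 1) / 0.179581)` when `--log` is set. A minimal sketch of that round trip, ignoring the guard terms (function names hypothetical):

```python
import math

C = 0.179581  # scale constant shared by the Power layers and the test scripts

def encode_log_depth(depth):
    """Approximate forward transform of the Log/Power pair: C * ln(depth) + 1."""
    return C * math.log(depth) + 1.0

def decode_log_depth(x):
    """Inverse applied in test_depth.py under --log: exp((x - 1) / C)."""
    return math.exp((x - 1.0) / C)

# The round trip should recover the original depth value.
d = 3.5
assert abs(decode_log_depth(encode_log_depth(d)) - d) < 1e-9
```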
================================================
FILE: source/refining_network/solver.prototxt
================================================
net: "net_train.prototxt"
test_iter: 50
test_interval: 500
base_lr: 0.000025
gamma: 0.5
stepsize: 100000
momentum: 0.9
weight_decay: 0.005
lr_policy: "fixed"
display: 100
max_iter: 100000
snapshot: 10000
snapshot_prefix: "snaps/"
solver_mode: GPU
debug_info: 1
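Note that with `lr_policy: "fixed"` the `gamma` and `stepsize` fields above are inert: the rate stays at `base_lr` for the whole run. Had the policy been `"step"`, Caffe would decay the rate as `base_lr * gamma^floor(iter / stepsize)` — though with `stepsize` equal to `max_iter`, no decay would occur during training anyway. A quick pure-Python sketch of both schedules, using this solver's values:

```python
BASE_LR, GAMMA, STEPSIZE = 0.000025, 0.5, 100000

def lr_fixed(it):
    # "fixed" policy: constant learning rate, gamma/stepsize ignored.
    return BASE_LR

def lr_step(it):
    # "step" policy: multiply by GAMMA once per STEPSIZE iterations.
    return BASE_LR * GAMMA ** (it // STEPSIZE)

assert lr_fixed(99999) == BASE_LR
assert lr_step(99999) == BASE_LR            # still inside the first step
assert lr_step(100000) == BASE_LR * GAMMA   # first decay, at max_iter
```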
================================================
FILE: source/refining_network/test_depth.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
from __future__ import print_function
import numpy as np
import matplotlib.pyplot as plt
import sys
from PIL import Image
import cv2
import cv
import os.path
os.environ['GLOG_minloglevel'] = '2'
import caffe
import scipy.ndimage
import argparse
import operator
import shutil
from eval_depth import Test, PrintTop5, LogDepth
WIDTH = 298
HEIGHT = 218
OUT_WIDTH = 74
OUT_HEIGHT = 54
GT_WIDTH = 420
GT_HEIGHT = 320
def testNet(net, img):
net.blobs['X'].data[...] = img
net.forward()
output = net.blobs['depth-refine'].data
output = np.reshape(output, (1,1,OUT_HEIGHT, OUT_WIDTH))
return output
def loadImage(path, channels, width, height):
img = caffe.io.load_image(path)
img = caffe.io.resize(img, (height, width, channels))
img = np.transpose(img, (2,0,1))
img = np.reshape(img, (1,channels,height,width))
return img
def printImage(img, name, channels, width, height):
params = list()
params.append(cv.CV_IMWRITE_PNG_COMPRESSION)
params.append(8)
imgnp = np.reshape(img, (height,width, channels))
imgnp = np.array(imgnp * 255, dtype = np.uint8)
cv2.imwrite(name, imgnp, params)
def eval(out, gt, rawResults):
linearGT = gt * 10.0
linearOut = out * 10.0
rawResults = [x + y for x, y in zip(rawResults, Test(linearOut, linearGT))]
return rawResults
def ProcessToOutput(depth):
depth = np.clip(depth, 0.001, 1000)
return np.clip(2 * 0.179581 * np.log(depth) + 1, 0, 1)
caffe.set_mode_cpu()
parser = argparse.ArgumentParser()
parser.add_argument("input_dir", help="directory with input images")
parser.add_argument("gt_dir", help="directory with ground truths")
parser.add_argument("output", help="folder to output to")
parser.add_argument("snaps", help="folder with snapshots to use")
parser.add_argument('--log', action='store_true', default=False)
args = parser.parse_args()
try:
os.mkdir(args.output)
except OSError:
print ('Output directory already exists, not creating a new one')
try:
os.mkdir(args.output + "_abs")
except OSError:
print ('Output directory already exists, not creating a new one')
fileCount = len([name for name in os.listdir(args.input_dir)])
results = [dict() for x in range(10)]
for snapshot in os.listdir(args.snaps):
if not snapshot.endswith("caffemodel"):
continue
currentSnapDir = snapshot.replace(".caffemodel","")
if os.path.exists(args.output + "/" + currentSnapDir):
shutil.rmtree(args.output + "/" + currentSnapDir)
if os.path.exists(args.output + "_abs/" + currentSnapDir):
shutil.rmtree(args.output + "_abs/" + currentSnapDir)
os.mkdir(args.output + "/" + currentSnapDir)
os.mkdir(args.output + "_abs/" + currentSnapDir)
print(currentSnapDir)
sys.stdout.flush()
netFile = snapshot.replace(".caffemodel",".prototxt")
net = caffe.Net(args.snaps + '/' + netFile, args.snaps + '/' + snapshot, caffe.TEST)
rawResults = np.zeros((10))
for count, file in enumerate(os.listdir(args.input_dir)):
out_string = str(count) + '/' + str(fileCount) + ': ' + file
sys.stdout.write('%s\r' % out_string)
sys.stdout.flush()
inputFileName = file
inputFilePath = args.input_dir + '/' + inputFileName
gtFileName = file.replace('colors','depth')
gtFilePath = args.gt_dir + '/' + gtFileName
gt = loadImage(gtFilePath, 1, GT_WIDTH, GT_HEIGHT)
input = loadImage(inputFilePath, 3, WIDTH, HEIGHT)
input *= 255
input -= 127
output = testNet(net, input)
if args.log:
output = np.exp((output - 1) / 0.179581)
outWidth = OUT_WIDTH
outHeight = OUT_HEIGHT
scaleW = float(GT_WIDTH) / float(OUT_WIDTH)
scaleH = float(GT_HEIGHT) / float(OUT_HEIGHT)
output = scipy.ndimage.zoom(output, (1,1,scaleH,scaleW), order=3)
outWidth = int(outWidth * scaleW)
outHeight = int(outHeight * scaleH)
rawResults = eval(output, gt, rawResults)
input += 127
input = input / 255.0
input = np.transpose(input, (0,2,3,1))
input = input[:,:,:,(2,1,0)]
absOutput = output.copy()
output -= output.mean()
output /= output.std()
output *= gt.std()
output += gt.mean()
gt = ProcessToOutput(gt)
output = ProcessToOutput(output)
absOutput = ProcessToOutput(absOutput)
filename = os.path.splitext(os.path.basename(inputFileName))[0]
filePath = args.output + '/' + currentSnapDir + '/' + filename + '.png'
filePathAbs = args.output + '_abs/' + currentSnapDir + '/' + filename + '.png'
printImage(input, filePath, 3, WIDTH, HEIGHT)
printImage(input, filePathAbs, 3, WIDTH, HEIGHT)
printImage(output, filePath.replace('_colors','_depth'), 1, outWidth, outHeight)
printImage(absOutput, filePathAbs.replace('_colors','_depth'), 1, outWidth, outHeight)
printImage(gt, filePath.replace('_colors', '_gt'), 1, outWidth, outHeight)
printImage(gt, filePathAbs.replace('_colors', '_gt'), 1, outWidth, outHeight)
rawResults[:] = [x / fileCount for x in rawResults]
for i in xrange(10):
results[i][currentSnapDir] = rawResults[i]
titles = ["AbsRelDiff", "SqrRelDiff", "RMSE", "RMSELog", "SIMSE", "Log10", "MVN", "Threshold 1.25","Threshold 1.25^2", "Threshold 1.25^3"]
for i in xrange(10):
results[i] = sorted(results[i].items(), key=operator.itemgetter(1))
if i > 6:
results[i] = list(reversed(results[i]))
PrintTop5(titles[i], results[i])
================================================
FILE: source/refining_network/train.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
import cv2
import cv
import caffe
from caffe.proto import caffe_pb2
import sys
from google.protobuf import text_format
import argparse
caffe.set_mode_gpu()
solver = caffe.get_solver('solver.prototxt')
# NOTE: the path_to_* variables below are placeholders left by the author;
# point them at the trained global context / gradient network files first.
solver.net.copy_from(path_to_global_context_network_caffemodel)
gradPart = caffe.Net(path_to_gradient_network_definition_file, path_to_gradient_network_caffemodel, caffe.TEST)
params = gradPart.params.keys()
source_params = {pr: (gradPart.params[pr][0].data, gradPart.params[pr][1].data) for pr in params}
target_params = {pr: (solver.net.params[pr][0].data, solver.net.params[pr][1].data) for pr in params}
for pr in params:
if pr == 'conv1':
solver.net.params['conv1-grad'][1].data[...] = source_params[pr][1]  # biases
solver.net.params['conv1-grad'][0].data[...] = source_params[pr][0]  # weights
else:
target_params[pr][1][...] = source_params[pr][1]  # biases
target_params[pr][0][...] = source_params[pr][0]  # weights
alexNet = caffe.Net(path_to_gradient_network_definition_file, 'bvlc_alexnet.caffemodel', caffe.TEST)
solver.net.params['conv1-refine'][1].data[...] = alexNet.params['conv1'][1].data #biases
solver.net.params['conv1-refine'][0].data[...] = alexNet.params['conv1'][0].data #weights
solver.solve()
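The weight transfer in train.py is standard Caffe net surgery: parameter blobs are copied between nets whose layer names match, with `conv1` specially remapped to `conv1-grad`. The renaming logic can be sketched independently of Caffe — plain dicts stand in for `net.params`, and the blob values are hypothetical stand-ins:

```python
# Dicts of {layer_name: (weights, biases)} stand in for Caffe's net.params.
source = {'conv1': ('w1', 'b1'), 'conv2': ('w2', 'b2')}
target = {'conv1-grad': (None, None), 'conv2': (None, None)}

# Copy every source layer; 'conv1' lands in the renamed 'conv1-grad' slot.
for name, blobs in source.items():
    dest = 'conv1-grad' if name == 'conv1' else name
    target[dest] = blobs

assert target['conv1-grad'] == ('w1', 'b1')
assert target['conv2'] == ('w2', 'b2')
```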
================================================
FILE: test_depth.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
from __future__ import print_function
import numpy as np
import matplotlib.pyplot as plt
import sys
from PIL import Image
import cv2
import cv
import os.path
os.environ['GLOG_minloglevel'] = '2'
import caffe
import scipy.ndimage
import argparse
import operator
import shutil
from eval_depth import Test, PrintTop5, LogDepth
WIDTH = 298
HEIGHT = 218
OUT_WIDTH = 74
OUT_HEIGHT = 54
GT_WIDTH = 420
GT_HEIGHT = 320
def testNet(net, img):
net.blobs['X'].data[...] = img
net.forward()
output = net.blobs['depth-refine'].data
output = np.reshape(output, (1,1,OUT_HEIGHT, OUT_WIDTH))
return output
def loadImage(path, channels, width, height):
img = caffe.io.load_image(path)
img = caffe.io.resize(img, (height, width, channels))
img = np.transpose(img, (2,0,1))
img = np.reshape(img, (1,channels,height,width))
return img
def printImage(img, name, channels, width, height):
params = list()
params.append(cv.CV_IMWRITE_PNG_COMPRESSION)
params.append(8)
imgnp = np.reshape(img, (height,width, channels))
imgnp = np.array(imgnp * 255, dtype = np.uint8)
cv2.imwrite(name, imgnp, params)
def eval(out, gt, rawResults):
linearGT = gt * 10.0
linearOut = out * 10.0
rawResults = [x + y for x, y in zip(rawResults, Test(linearOut, linearGT))]
return rawResults
def ProcessToOutput(depth):
depth = np.clip(depth, 0.001, 1000)
return np.clip(2 * 0.179581 * np.log(depth) + 1, 0, 1)
caffe.set_mode_cpu()
parser = argparse.ArgumentParser()
parser.add_argument("input_dir", help="directory with input images")
parser.add_argument("gt_dir", help="directory with ground truths")
parser.add_argument("output", help="folder to output to")
parser.add_argument("snaps", help="folder with snapshots to use")
parser.add_argument('--log', action='store_true', default=False)
args = parser.parse_args()
try:
os.mkdir(args.output)
except OSError:
print ('Output directory already exists, not creating a new one')
try:
os.mkdir(args.output + "_abs")
except OSError:
print ('Output directory already exists, not creating a new one')
fileCount = len([name for name in os.listdir(args.input_dir)])
results = [dict() for x in range(10)]
for snapshot in os.listdir(args.snaps):
if not snapshot.endswith("caffemodel"):
continue
currentSnapDir = snapshot.replace(".caffemodel","")
if os.path.exists(args.output + "/" + currentSnapDir):
shutil.rmtree(args.output + "/" + currentSnapDir)
if os.path.exists(args.output + "_abs/" + currentSnapDir):
shutil.rmtree(args.output + "_abs/" + currentSnapDir)
os.mkdir(args.output + "/" + currentSnapDir)
os.mkdir(args.output + "_abs/" + currentSnapDir)
print(currentSnapDir)
sys.stdout.flush()
netFile = snapshot.replace(".caffemodel",".prototxt")
net = caffe.Net(args.snaps + '/' + netFile, args.snaps + '/' + snapshot, caffe.TEST)
rawResults = np.zeros((10))
for count, file in enumerate(os.listdir(args.input_dir)):
out_string = str(count) + '/' + str(fileCount) + ': ' + file
sys.stdout.write('%s\r' % out_string)
sys.stdout.flush()
inputFileName = file
inputFilePath = args.input_dir + '/' + inputFileName
gtFileName = file.replace('colors','depth')
gtFilePath = args.gt_dir + '/' + gtFileName
gt = loadImage(gtFilePath, 1, GT_WIDTH, GT_HEIGHT)
input = loadImage(inputFilePath, 3, WIDTH, HEIGHT)
input *= 255
input -= 127
output = testNet(net, input)
if args.log:
output = np.exp((output - 1) / 0.179581)
outWidth = OUT_WIDTH
outHeight = OUT_HEIGHT
scaleW = float(GT_WIDTH) / float(OUT_WIDTH)
scaleH = float(GT_HEIGHT) / float(OUT_HEIGHT)
output = scipy.ndimage.zoom(output, (1,1,scaleH,scaleW), order=3)
outWidth = int(outWidth * scaleW)
outHeight = int(outHeight * scaleH)
rawResults = eval(output, gt, rawResults)
input += 127
input = input / 255.0
input = np.transpose(input, (0,2,3,1))
input = input[:,:,:,(2,1,0)]
absOutput = output.copy()
output -= output.mean()
output /= output.std()
output *= gt.std()
output += gt.mean()
gt = ProcessToOutput(gt)
output = ProcessToOutput(output)
absOutput = ProcessToOutput(absOutput)
filename = os.path.splitext(os.path.basename(inputFileName))[0]
filePath = args.output + '/' + currentSnapDir + '/' + filename + '.png'
filePathAbs = args.output + '_abs/' + currentSnapDir + '/' + filename + '.png'
printImage(input, filePath, 3, WIDTH, HEIGHT)
printImage(input, filePathAbs, 3, WIDTH, HEIGHT)
printImage(output, filePath.replace('_colors','_depth'), 1, outWidth, outHeight)
printImage(absOutput, filePathAbs.replace('_colors','_depth'), 1, outWidth, outHeight)
printImage(gt, filePath.replace('_colors', '_gt'), 1, outWidth, outHeight)
printImage(gt, filePathAbs.replace('_colors', '_gt'), 1, outWidth, outHeight)
rawResults[:] = [x / fileCount for x in rawResults]
for i in xrange(10):
results[i][currentSnapDir] = rawResults[i]
titles = ["AbsRelDiff", "SqrRelDiff", "RMSE", "RMSELog", "SIMSE", "Log10", "MVN", "Threshold 1.25","Threshold 1.25^2", "Threshold 1.25^3"]
for i in xrange(10):
results[i] = sorted(results[i].items(), key=operator.itemgetter(1))
if i > 6:
results[i] = list(reversed(results[i]))
PrintTop5(titles[i], results[i])
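Before visualization, test_depth.py re-anchors the prediction to the ground truth's statistics: it removes the prediction's own mean and standard deviation, then applies the ground truth's. That distribution transfer can be isolated as a small numpy helper (function name hypothetical):

```python
import numpy as np

def match_mean_std(pred, ref):
    """Shift/scale pred so its mean and std match ref, as in test_depth.py."""
    out = pred - pred.mean()   # zero-mean
    out /= pred.std()          # unit variance
    out *= ref.std()           # take on the reference spread
    out += ref.mean()          # and the reference mean
    return out

pred = np.array([1.0, 2.0, 3.0, 4.0])
ref = np.array([10.0, 20.0, 30.0, 40.0])
matched = match_mean_std(pred, ref)
assert abs(matched.mean() - ref.mean()) < 1e-9
assert abs(matched.std() - ref.std()) < 1e-9
```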