Repository: janivanecky/Depth-Estimation
Branch: master
Commit: f29082ca01da
Files: 68
Total size: 348.4 KB
Directory structure:
gitextract_fes_sm48/
├── README.md
├── dataset/
│ ├── README.txt
│ ├── test/
│ │ ├── _solarized.py
│ │ ├── _structure_classes.py
│ │ ├── convert.py
│ │ ├── create_test_lmdb.sh
│ │ ├── crop.py
│ │ └── process_test.sh
│ └── train/
│ ├── create_train_lmdb.sh
│ ├── get_train_scenes.m
│ ├── process_raw.m
│ ├── split_train_set.sh
│ ├── train_augment0.py
│ ├── train_augment1.py
│ └── train_augment2.py
├── eval_depth.py
├── get_depth.py
├── net_deploy.prototxt
├── net_train.prototxt
├── solver.prototxt
├── source/
│ ├── README.txt
│ ├── global_context_network/
│ │ ├── abs/
│ │ │ ├── net_deploy.prototxt
│ │ │ └── net_train.prototxt
│ │ ├── eval_depth.py
│ │ ├── log_abs/
│ │ │ ├── net_deploy.prototxt
│ │ │ └── net_train.prototxt
│ │ ├── norm/
│ │ │ ├── net_deploy.prototxt
│ │ │ └── net_train.prototxt
│ │ ├── sc-inv/
│ │ │ ├── net_deploy.prototxt
│ │ │ └── net_train.prototxt
│ │ ├── solver.prototxt
│ │ ├── test_depth.py
│ │ └── train.py
│ ├── gradient_network/
│ │ ├── abs/
│ │ │ ├── net_deploy.prototxt
│ │ │ └── net_train.prototxt
│ │ ├── eval_grad.py
│ │ ├── filter.prototxt
│ │ ├── norm/
│ │ │ ├── net_deploy.prototxt
│ │ │ └── net_train.prototxt
│ │ ├── solver.prototxt
│ │ ├── test_grad.py
│ │ └── train.py
│ ├── joint/
│ │ ├── architecture_A/
│ │ │ ├── net_deploy.prototxt
│ │ │ └── net_train.prototxt
│ │ ├── architecture_B/
│ │ │ ├── net_deploy.prototxt
│ │ │ └── net_train.prototxt
│ │ ├── eval_depth.py
│ │ ├── eval_grad.py
│ │ ├── filter.prototxt
│ │ ├── solver.prototxt
│ │ ├── test_depth.py
│ │ ├── test_grad.py
│ │ └── train.py
│ └── refining_network/
│ ├── abs/
│ │ ├── net_deploy.prototxt
│ │ └── net_train.prototxt
│ ├── eval_depth.py
│ ├── log_abs/
│ │ ├── net_deploy.prototxt
│ │ └── net_train.prototxt
│ ├── norm_abs/
│ │ ├── net_deploy.prototxt
│ │ └── net_train.prototxt
│ ├── norm_abs_global_only/
│ │ ├── net_deploy.prototxt
│ │ └── net_train.prototxt
│ ├── sc-inv_abs/
│ │ ├── net_deploy.prototxt
│ │ └── net_train.prototxt
│ ├── solver.prototxt
│ ├── test_depth.py
│ └── train.py
└── test_depth.py
================================================
FILE CONTENTS
================================================
================================================
FILE: README.md
================================================
# Depth Estimation by Convolutional Neural Networks
This is the repository for my master's thesis, Depth Estimation by CNNs. You can read the whole thesis
here. Below, I briefly present the solution and the results.
## Architecture:
I use an architecture similar to the one used by Eigen et al., with the difference that I also use a network that estimates
gradients of the depth map:

For the global context network I use a pretrained AlexNet, the gradient network is the convolutional part of AlexNet,
and the refining network is also fully convolutional; more details can be found in the thesis. I trained each part separately:
first the global context network and the gradient network, after which I fixed their parameters and trained the refining network.
### Normalized loss function:
For training the global context network and the refining network I wanted to use a scale-invariant loss similar to the one used by Eigen et al., but I took it a step
further and used a loss function that is scale-and-translation invariant. I would put an equation here, but it can be explained
fairly easily in words: to obtain a normalized depth map, you subtract its mean and divide by its standard deviation.
The normalized loss is then just the squared distance between the normalized output depth map and the normalized target depth map. This turned out to improve the speed of convergence significantly.
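The normalized loss described above can be sketched in a few lines of NumPy. This is my own illustration, not code from the repository; dividing by the standard deviation is what makes the loss invariant to scaling as well as translation:

```python
import numpy as np

def normalize(depth, eps=1e-8):
    # Zero-mean, unit-variance normalization of a depth map
    # (the same operation Caffe's MVN layer performs).
    return (depth - depth.mean()) / (depth.std() + eps)

def normalized_loss(pred, target):
    # Squared distance between the two normalized depth maps,
    # averaged over all pixels.
    diff = normalize(pred) - normalize(target)
    return np.mean(diff ** 2)
```

For example, `normalized_loss(3.0 * d + 2.0, d)` is (numerically) zero for any depth map `d`, which is exactly the scale-and-translation invariance.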
## Trained model
You can download the trained model here.
## Results:
I ran several experiments for the thesis; you can have a look at all of them in Chapter 5 of the thesis. Here I present just
the most significant ones. All experiments were performed on the NYU Depth v2 dataset.
### Comparison of different loss functions
I trained the refining network with different loss functions for 60 000 iterations.

From left to right: input; squared distance loss; squared distance loss in log space; scale invariant loss by Eigen et al.; normalized loss; ground truth
As you can see, the networks trained with the other loss functions produce noticeably worse outputs compared to the network trained with the normalized loss. The difference
shrinks when the network is trained longer (Eigen et al. ran training for ~1.5M iterations; here it is just 60k).
### Comparison to existing solutions
How does the model fare against existing solutions?
I compared the results of my model to the results from papers [1] and [2], both by Eigen et al.
The model with the normalized loss has trouble estimating absolute depth values, but it estimates the relative structure of the depth map fairly well.
To test this, I substituted the mean and variance of the ground truth into the output depth map; I call this the 'model with oracle'.
It achieved state-of-the-art performance on the RMSE metric at the time of writing the thesis. Keep in mind that this model merely aims to show
that a model trained with the normalized loss estimates the structure of the depth map well, regardless of the absolute depth values.
| | [1] | [2] | Proposed model | With Oracle |
| :------------- | -------------:| -----:| --------------:| -----------:|
| RMSE | 0.907 | 0.641 | 1.169 | 0.569 |

From left to right by columns: input image, ground truth; [1], proposed model; [2], model with oracle
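The 'model with oracle' trick described above amounts to re-fitting the mean and spread of the predicted depth map to the ground truth. A minimal NumPy sketch of the idea (the function name `apply_oracle` is mine, not from the repository):

```python
import numpy as np

def apply_oracle(pred, gt, eps=1e-8):
    # Normalize the prediction, then substitute the mean and
    # spread of the ground-truth depth map.
    normalized = (pred - pred.mean()) / (pred.std() + eps)
    return normalized * gt.std() + gt.mean()
```

Evaluating RMSE on `apply_oracle(pred, gt)` instead of `pred` isolates how well the network captured the relative structure of the scene, independent of absolute depth values.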
## Usage
`python test_depth.py INPUT_DIR GT_DIR OUT_DIR SNAPSHOTS_DIR [--log]`
- `INPUT_DIR` is the path to the folder containing input images
- `GT_DIR` is the path to the folder containing ground truth depth maps
- `OUT_DIR` is the path to the folder to which the output depth maps will be written
- `SNAPSHOTS_DIR` is the path to the folder containing .caffemodel files containing trained network models. All models from this folder will be evaluated.
- `--log` switch is used when the depth values that are produced by the network are in log space
### Frameworks/Libraries needed:
* Caffe
* Python2.7: caffe, scipy, scikit-image, numpy, pypng, cv2, Pillow, matplotlib
### A few notes
- input images should be named in the same way as the corresponding ground truths, except that input images have the suffix 'colors' while ground truth images have the suffix 'depth'. Note that these suffixes should precede the file extension, e.g., 'image1_colors.png' and the corresponding depth map 'image1_depth.png'
- along with each .caffemodel file, the corresponding deploy network definition file has to be placed into SNAPSHOTS_DIR, with the same name as the model file but with the extension 'prototxt' instead of 'caffemodel'
- two output folders are actually created: OUT_DIR, which contains output depths fitted onto the ground truth using MVN normalization, and OUT_DIR + '_abs', which contains the raw output depth maps
- note that you need the AlexNet caffemodel for training the global context network, the gradient network and their joint configuration. It can be downloaded here: https://github.com/BVLC/caffe/tree/master/models/bvlc_alexnet
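The file-naming convention from the notes above can be illustrated with a small helper. This is a hypothetical sketch (`pair_files` is not part of the repository):

```python
import os

def pair_files(input_dir, gt_dir):
    # Pair each 'xxx_colors.png' input image with its
    # 'xxx_depth.png' ground-truth depth map.
    pairs = []
    for name in sorted(os.listdir(input_dir)):
        base, ext = os.path.splitext(name)
        if not base.endswith('_colors'):
            continue
        depth_name = base[:-len('_colors')] + '_depth' + ext
        if os.path.exists(os.path.join(gt_dir, depth_name)):
            pairs.append((name, depth_name))
    return pairs
```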
================================================
FILE: dataset/README.txt
================================================
===================================================
Required libraries for python2.7:
===================================================
- caffe, h5py, scipy, scikit-image, numpy, pypng and joblib.
===================================================
How to process the training dataset:
===================================================
1.) Download the RAW NYU Depth v2 dataset (450 GB) from http://horatio.cs.nyu.edu/mit/silberman/nyu_depth_v2/nyu_depth_v2_raw.zip
2.) Extract the RAW dataset into a folder A (the name is not important)
3.) Download the NYU Depth v2 toolbox from http://cs.nyu.edu/~silberman/code/toolbox_nyu_depth_v2.zip
4.) Extract the scripts from the toolbox into a folder 'tools' inside folder A
5.) Run process_raw.m in folder A
6.) Download the labeled NYU Depth v2 dataset from http://horatio.cs.nyu.edu/mit/silberman/nyu_depth_v2/nyu_depth_v2_labeled.mat
7.) Download splits.mat containing the official train/test split from http://horatio.cs.nyu.edu/mit/silberman/indoor_seg_sup/splits.mat
8.) Make sure that the labeled dataset and splits.mat are in the same folder, let's call it folder B
9.) Run get_train_scenes.m in folder B
10.) Run split_train_set.sh in folder B and pass it a single argument, the path to folder A ('......./path/to/folder/A')
11.) Run the scripts train_augment0.py, train_augment1.py and train_augment2.py in folder B
12.) Run create_train_lmdb.sh in folder B and pass it the path to your caffe folder as an argument
13.) You should now have the folders 'train_raw0_lmdb' (dataset version Data0), 'train_raw1_lmdb' (dataset version Data1) and 'train_raw2_lmdb' (dataset version Data2) in folder B
*Note: all referenced scripts can be found in the folder 'train'
===================================================
How to process the testing dataset:
===================================================
1.) Download the labeled NYU Depth v2 dataset from http://horatio.cs.nyu.edu/mit/silberman/nyu_depth_v2/nyu_depth_v2_labeled.mat
2.) Download splits.mat containing the official train/test split from http://horatio.cs.nyu.edu/mit/silberman/indoor_seg_sup/splits.mat
3.) Place all downloaded files into a single folder
4.) Run the script process_test.sh
5.) Run create_test_lmdb.sh and pass it the path to your caffe folder as an argument
6.) You should now have a folder 'test_lmdb' in your working directory
*Note: all referenced scripts can be found in folder 'test'
*Note2: files crop.py, _structure_classes.py, _solarized.py come from https://github.com/deeplearningais/curfil/wiki/Training-and-Prediction-with-the-NYU-Depth-v2-Dataset
================================================
FILE: dataset/test/_solarized.py
================================================
#######################################################################################
# The MIT License
# Copyright (c) 2014 Hannes Schulz, University of Bonn
# Copyright (c) 2013 Benedikt Waldvogel, University of Bonn
# Copyright (c) 2008-2009 Sebastian Nowozin
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
#######################################################################################
colors = [
(0, 43, 54),
(7, 54, 66), # floor
(88, 110, 117),
(101, 123, 131),
(131, 148, 150),
(147, 161, 161), # structure
(238, 232, 213),
(253, 246, 227),
(181, 137, 0), # prop
(203, 75, 22), # furniture
(220, 50, 47),
(211, 54, 130),
(108, 113, 196),
(38, 139, 210),
(42, 161, 152),
(133, 153, 0)
]
================================================
FILE: dataset/test/_structure_classes.py
================================================
#######################################################################################
# The MIT License
# Copyright (c) 2014 Hannes Schulz, University of Bonn
# Copyright (c) 2013 Benedikt Waldvogel, University of Bonn
# Copyright (c) 2008-2009 Sebastian Nowozin
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
#######################################################################################
def get_structure_classes():
structure_classes = dict()
structure_classes['shutters'] = "structure"
structure_classes['shelving'] = "furniture";
structure_classes['leg of table'] = "furniture";
structure_classes['colunn'] = "structure"
structure_classes['scissors'] = "prop";
structure_classes['plate with bottles'] = "prop";
structure_classes['plastic container'] = "prop";
structure_classes['hanging items'] = "prop";
structure_classes['leather sitting stool'] = "furniture";
structure_classes['colors'] = "prop";
structure_classes['trible bed sofa'] = "furniture";
structure_classes['hanging board'] = "structure"
structure_classes['wall frame'] = "prop";
structure_classes['brief case'] = "prop";
structure_classes['leg of chair'] = "furniture";
structure_classes['notice board'] = "structure"
structure_classes['bathroom clearning brush'] = "prop";
structure_classes['chess set'] = "prop";
structure_classes['brush'] = "prop";
structure_classes['cabint'] = "furniture";
structure_classes['noticeboard'] = "structure"
structure_classes['headboard'] = "furniture";
structure_classes['coffee table'] = "furniture";
structure_classes['measuring cup'] = "prop";
structure_classes['bottle of ketchup'] = "prop";
structure_classes['reflection of window shutters'] = "structure"
structure_classes['air conditioner'] = "structure"
structure_classes['air duct'] = "structure"
structure_classes['air vent'] = "structure"
structure_classes['alarm clock'] = "prop";
structure_classes['album'] = "prop";
structure_classes['aluminium foil'] = "prop";
structure_classes['antenna'] = "prop";
structure_classes['apple'] = "prop";
structure_classes['ashtray'] = "prop";
structure_classes['avocado'] = "prop";
structure_classes['baby chair'] = "furniture";
structure_classes['baby gate'] = "structure"
structure_classes['back scrubber'] = "prop";
structure_classes['backpack'] = "prop";
structure_classes['bag'] = "prop";
structure_classes['bag of bagels'] = "prop";
structure_classes['bag of chips'] = "prop";
structure_classes['bag of flour'] = "prop";
structure_classes['bag of hot dog buns'] = "prop";
structure_classes['bag of oreo'] = "prop";
structure_classes['bagel'] = "prop";
structure_classes['baking dish'] = "prop";
structure_classes['ball'] = "prop";
structure_classes['balloon'] = "prop";
structure_classes['banana'] = "prop";
structure_classes['banister'] = "structure"
structure_classes['bar'] = "structure"
structure_classes['bar of soap'] = "prop";
structure_classes['barrel'] = "furniture";
structure_classes['baseball'] = "prop";
structure_classes['basket'] = "prop";
structure_classes['basketball'] = "prop";
structure_classes['basketball hoop'] = "prop";
structure_classes['bassinet'] = "furniture";
structure_classes['bathtub'] = "furniture";
structure_classes['bean bag'] = "furniture";
structure_classes['bed'] = "furniture";
structure_classes['bedding package'] = "prop";
structure_classes['beeper'] = "prop";
structure_classes['belt'] = "prop";
structure_classes['bench'] = "furniture";
structure_classes['bicycle'] = "prop";
structure_classes['bicycle helmet'] = "prop";
structure_classes['bin'] = "prop";
structure_classes['binder'] = "prop";
structure_classes['blackboard'] = "structure"
structure_classes['blanket'] = "prop";
structure_classes['blender'] = "prop";
structure_classes['blinds'] = "structure"
structure_classes['board'] = "structure"
structure_classes['book'] = "prop";
structure_classes['bookend'] = "prop";
structure_classes['bookrack'] = "furniture";
structure_classes['books'] = "prop";
structure_classes['bookshelf'] = "furniture";
structure_classes['bottle'] = "prop";
structure_classes['bottle of comet'] = "prop";
structure_classes['bottle of contact lens solution'] = "prop";
structure_classes['bottle of liquid'] = "prop";
structure_classes['bottle of perfume'] = "prop";
structure_classes['bowl'] = "prop";
structure_classes['box'] = "prop";
structure_classes['box of ziplock bags'] = "prop";
structure_classes['bread'] = "prop";
structure_classes['bread pan'] = "prop";
structure_classes['brick'] = "prop";
structure_classes['briefcase'] = "prop";
structure_classes['broom'] = "prop";
structure_classes['bucket'] = "prop";
structure_classes['bulb'] = "prop";
structure_classes['bunk bed'] = "furniture";
structure_classes['business cards'] = "prop";
structure_classes['butterfly sculpture'] = "prop";
structure_classes['cabinet'] = "furniture";
structure_classes['cable box'] = "prop";
structure_classes['cable modem'] = "prop";
structure_classes['cable rack'] = "structure"
structure_classes['cables'] = "prop";
structure_classes['cactus'] = "prop";
structure_classes['cake'] = "prop";
structure_classes['calculator'] = "prop";
structure_classes['calendar'] = "prop";
structure_classes['camera'] = "prop";
structure_classes['can'] = "prop";
structure_classes['can of food'] = "prop";
structure_classes['can opener'] = "prop";
structure_classes['candelabra'] = "prop";
structure_classes['candle'] = "prop";
structure_classes['candlestick'] = "prop";
structure_classes['cane'] = "prop";
structure_classes['canister'] = "prop";
structure_classes['cans of cat food'] = "prop";
structure_classes['cap stand'] = "prop";
structure_classes['car'] = "prop";
structure_classes['cart'] = "prop";
structure_classes['carton'] = "prop";
structure_classes['case'] = "prop";
structure_classes['casserole dish'] = "prop";
structure_classes['cat'] = "prop";
structure_classes['cat bed'] = "furniture";
structure_classes['cat cage'] = "furniture";
structure_classes['cd'] = "prop";
structure_classes['cd disc'] = "prop";
structure_classes['cd player'] = "prop";
structure_classes['ceiling'] = "structure"
structure_classes['celery'] = "prop";
structure_classes['cell phone'] = "prop";
structure_classes['cell phone charger'] = "prop";
structure_classes['centerpiece'] = "prop";
structure_classes['ceramic frog'] = "prop";
structure_classes['certificate'] = "prop";
structure_classes['chair'] = "furniture";
structure_classes['chalk eraser'] = "prop";
structure_classes['chalkboard'] = "prop";
structure_classes['chandelier'] = "structure"
structure_classes['chapstick'] = "prop";
structure_classes['charger'] = "prop";
structure_classes['charger and wire'] = "prop";
structure_classes['chart'] = "prop";
structure_classes['chart roll'] = "prop";
structure_classes['chart stand'] = "furniture";
structure_classes['charts'] = "prop";
structure_classes['chessboard'] = "prop";
structure_classes['chest'] = "furniture";
structure_classes['child carrier'] = "prop";
structure_classes['chimney'] = "structure"
structure_classes['circuit breaker box'] = "structure"
structure_classes['classroom board'] = "structure"
structure_classes['cleaner'] = "prop";
structure_classes['cleaning wipes'] = "prop";
structure_classes['clipboard'] = "prop";
structure_classes['clock'] = "prop";
structure_classes['cloth bag'] = "prop";
structure_classes['cloth drying stand'] = "furniture";
structure_classes['clothes'] = "prop";
structure_classes['clothing detergent'] = "prop";
structure_classes['clothing dryer'] = "furniture";
structure_classes['clothing drying rack'] = "furniture";
structure_classes['clothing hamper'] = "furniture";
structure_classes['clothing hanger'] = "furniture";
structure_classes['clothing iron'] = "prop";
structure_classes['clothing washer'] = "furniture";
structure_classes['coaster'] = "prop";
structure_classes['coffee bag'] = "prop";
structure_classes['coffee grinder'] = "prop";
structure_classes['coffee machine'] = "prop";
structure_classes['coffee packet'] = "prop";
structure_classes['coffee pot'] = "prop";
structure_classes['coins'] = "prop";
structure_classes['coke bottle'] = "prop";
structure_classes['collander'] = "prop";
structure_classes['cologne'] = "prop";
structure_classes['column'] = "structure"
structure_classes['comb'] = "prop";
structure_classes['comforter'] = "prop";
structure_classes['computer'] = "prop";
structure_classes['computer disk'] = "prop";
structure_classes['conch shell'] = "prop";
structure_classes['cone'] = "prop";
structure_classes['console controller'] = "prop";
structure_classes['console system'] = "prop";
structure_classes['contact lens case'] = "prop";
structure_classes['contact lens solution bottle'] = "prop";
structure_classes['container'] = "prop";
structure_classes['container of skin cream'] = "prop";
structure_classes['cooking pan'] = "prop";
structure_classes['cooking pot cover'] = "prop";
structure_classes['copper vessel'] = "prop";
structure_classes['cordless phone'] = "prop";
structure_classes['cordless telephone'] = "prop";
structure_classes['cork board'] = "prop";
structure_classes['corkscrew'] = "prop";
structure_classes['corn'] = "prop";
structure_classes['counter'] = "structure"
structure_classes['cradle'] = "furniture";
structure_classes['crate'] = "furniture";
structure_classes['crayon'] = "prop";
structure_classes['cream'] = "prop";
structure_classes['cream tube'] = "prop";
structure_classes['crib'] = "furniture";
structure_classes['crock pot'] = "prop";
structure_classes['cup'] = "prop";
structure_classes['curtain'] = "structure"
structure_classes['curtain rod'] = "structure"
structure_classes['cutting board'] = "prop";
structure_classes['cylindrical paper holder'] = "prop";
structure_classes['decanter'] = "prop";
structure_classes['decoration item'] = "prop";
structure_classes['decorative bottle'] = "prop";
structure_classes['decorative dish'] = "prop";
structure_classes['decorative item'] = "prop";
structure_classes['decorative plate'] = "prop";
structure_classes['decorative platter'] = "prop";
structure_classes['deodarent spray bottle'] = "prop";
structure_classes['deoderant'] = "prop";
structure_classes['desk'] = "furniture";
structure_classes['desk drawer'] = "furniture";
structure_classes['desk mat'] = "prop";
structure_classes['desser'] = "furniture";
structure_classes['dish'] = "prop";
structure_classes['dish brush'] = "prop";
structure_classes['dish cover'] = "prop";
structure_classes['dish rack'] = "prop";
structure_classes['dish scrubber'] = "prop";
structure_classes['dishes'] = "prop";
structure_classes['dishwasher'] = "structure"
structure_classes['display board'] = "furniture";
structure_classes['display case'] = "furniture";
structure_classes['display platter'] = "prop";
structure_classes['dog'] = "prop";
structure_classes['dog bed'] = "furniture";
structure_classes['dog bowl'] = "prop";
structure_classes['dog cage'] = "furniture";
structure_classes['dog toy'] = "prop";
structure_classes['doily'] = "furniture";
structure_classes['doll'] = "prop";
structure_classes['doll house'] = "furniture";
structure_classes['dollar bill'] = "prop";
structure_classes['dolly'] = "furniture"
structure_classes['door'] = "structure"
structure_classes['door window reflection'] = "structure"
structure_classes['door curtain'] = "structure"
structure_classes['door facing trimreflection'] = "structure"
structure_classes['door frame'] = "structure"
structure_classes['door knob'] = "prop";
structure_classes['door lock'] = "prop";
structure_classes['door way'] = "structure"
structure_classes['door way arch'] = "structure"
structure_classes['doorreflection'] = "structure"
structure_classes['drain'] = "structure"
structure_classes['drawer'] = "furniture";
structure_classes['drawer handle'] = "prop";
structure_classes['dress wire frame'] = "prop";
structure_classes['dresser'] = "furniture";
structure_classes['drum'] = "prop";
structure_classes['drying rack'] = "furniture";
structure_classes['drying stand'] = "furniture";
structure_classes['duck'] = "prop";
structure_classes['duster'] = "prop";
structure_classes['dvd'] = "prop";
structure_classes['dvd player'] = "prop";
structure_classes['dvds'] = "prop";
structure_classes['earphone'] = "prop";
structure_classes['earplugs'] = "prop";
structure_classes['educational display'] = "furniture";
structure_classes['eggplant'] = "prop";
structure_classes['eggs'] = "prop";
structure_classes['electric box'] = "structure"
structure_classes['electric mixer'] = "prop";
structure_classes['electric toothbrush'] = "prop";
structure_classes['electric toothbrush base'] = "prop";
structure_classes['electrical kettle'] = "prop";
structure_classes['electrical outlet'] = "prop";
structure_classes['electrical plug'] = "prop";
structure_classes['electronic drumset'] = "prop";
structure_classes['envelope'] = "prop";
structure_classes['envelopes'] = "prop";
structure_classes['eraser'] = "prop";
structure_classes['ethernet jack'] = "prop";
structure_classes['excercise ball'] = "prop";
structure_classes['excercise equipment'] = "furniture";
structure_classes['excercise machine'] = "furniture";
structure_classes['exit sign'] = "prop";
structure_classes['eye glasses'] = "prop";
structure_classes['eyeball plastic ball'] = "prop";
structure_classes['face wash cream'] = "prop";
structure_classes['fan'] = "prop";
structure_classes['fashion medal'] = "prop";
structure_classes['faucet'] = "prop";
structure_classes['faucet handle'] = "prop";
structure_classes['fax machine'] = "prop";
structure_classes['fiberglass case'] = "prop";
structure_classes['file'] = "prop";
structure_classes['file box'] = "furniture";
structure_classes['file container'] = "prop";
structure_classes['file holder'] = "prop";
structure_classes['file pad'] = "prop";
structure_classes['file stand'] = "furniture";
structure_classes['filing shelves'] = "furniture";
structure_classes['fire alarm'] = "prop";
structure_classes['fire extinguisher'] = "prop";
structure_classes['fireplace'] = "structure"
structure_classes['fish tank'] = "structure"
structure_classes['flag'] = "prop";
structure_classes['flashcard'] = "prop";
structure_classes['flashlight'] = "prop";
structure_classes['flask'] = "prop";
structure_classes['flask set'] = "prop";
structure_classes['flatbed scanner'] = "prop";
structure_classes['flipboard'] = "furniture";
structure_classes['floor'] = "floor";
structure_classes['floor mat'] = "prop";
structure_classes['flower'] = "prop";
structure_classes['flower basket'] = "prop";
structure_classes['flower box'] = "prop";
structure_classes['flower pot'] = "prop";
structure_classes['folder'] = "prop";
structure_classes['folders'] = "prop";
structure_classes['food processor'] = "prop";
structure_classes['food wrapped on a tray'] = "prop";
structure_classes['foosball table'] = "furniture";
structure_classes['foot rest'] = "furniture";
structure_classes['football'] = "prop";
structure_classes['fork'] = "prop";
structure_classes['framed certificate'] = "prop";
structure_classes['fruit'] = "prop";
structure_classes['fruit basket'] = "prop";
structure_classes['fruit platter'] = "prop";
structure_classes['fruit stand'] = "prop";
structure_classes['fruitplate'] = "prop";
structure_classes['frying pan'] = "prop";
structure_classes['furnace'] = "furniture";
structure_classes['furniture'] = "furniture";
structure_classes['game system'] = "prop";
structure_classes['game table'] = "prop";
structure_classes['garage door'] = "structure"
structure_classes['garbage bag'] = "prop";
structure_classes['garbage bin'] = "furniture";
structure_classes['garlic'] = "prop";
structure_classes['gate'] = "structure"
structure_classes['gift wrapping'] = "prop";
structure_classes['gift wrapping roll'] = "prop";
structure_classes['glass'] = "structure"
structure_classes['glass baking dish'] = "prop";
structure_classes['glass box'] = "prop";
structure_classes['glass container'] = "prop";
structure_classes['glass dish'] = "prop";
structure_classes['glass pane'] = "structure"
structure_classes['glass pot'] = "prop";
structure_classes['glass rack'] = "structure"
structure_classes['glass set'] = "prop";
structure_classes['glass ware'] = "prop";
structure_classes['globe'] = "prop";
structure_classes['globe stand'] = "prop";
structure_classes['glove'] = "prop";
structure_classes['gold piece'] = "prop";
structure_classes['grandfather clock'] = "furniture";
structure_classes['grapefruit'] = "prop";
structure_classes['green screen'] = "structure"
structure_classes['grill'] = "structure"
structure_classes['guitar'] = "prop";
structure_classes['guitar case'] = "prop";
structure_classes['hair brush'] = "prop";
structure_classes['hair dryer'] = "prop";
structure_classes['hamburger bun'] = "prop";
structure_classes['hammer'] = "prop";
structure_classes['hand blender'] = "prop";
structure_classes['hand sanitizer'] = "prop";
structure_classes['hand sanitizer dispenser'] = "prop";
structure_classes['hand sculpture'] = "prop";
structure_classes['hanger'] = "prop";
structure_classes['hangers'] = "prop";
structure_classes['hanging hooks'] = "prop";
structure_classes['hat'] = "prop";
structure_classes['head phone'] = "prop";
structure_classes['head phones'] = "prop";
structure_classes['headband'] = "prop";
structure_classes['headphones'] = "prop";
structure_classes['heater'] = "furniture";
structure_classes['hockey glove'] = "prop";
structure_classes['hockey stick'] = "prop";
structure_classes['hole puncher'] = "prop";
structure_classes['hookah'] = "prop";
structure_classes['hooks'] = "prop";
structure_classes['hoola hoop'] = "prop";
structure_classes['horse toy'] = "prop";
structure_classes['hot dogs'] = "prop";
structure_classes['hot water heater'] = "prop";
structure_classes['humidifier'] = "prop";
structure_classes['id card'] = "prop";
structure_classes['incense candle'] = "prop";
structure_classes['incense holder'] = "prop";
structure_classes['inkwell'] = "prop";
structure_classes['ipad'] = "prop";
structure_classes['ipod'] = "prop";
structure_classes['ipod dock'] = "prop";
structure_classes['iron box'] = "prop";
structure_classes['iron grill'] = "structure"
structure_classes['ironing board'] = "furniture";
structure_classes['jacket'] = "prop";
structure_classes['jar'] = "prop";
structure_classes['jersey'] = "prop";
structure_classes['jug'] = "prop";
structure_classes['juicer'] = "prop";
structure_classes['karate belts'] = "prop";
structure_classes['key'] = "prop";
structure_classes['keyboard'] = "prop";
structure_classes['kichen towel'] = "prop";
structure_classes['kinect'] = "prop";
structure_classes['kitchen container plastic'] = "prop";
structure_classes['kitchen island'] = "structure"
structure_classes['kitchen items'] = "prop";
structure_classes['kitchen utensil'] = "prop";
structure_classes['kitchen utensils'] = "prop";
structure_classes['kiwi'] = "prop";
structure_classes['knife'] = "prop";
structure_classes['knife rack'] = "prop";
structure_classes['knob'] = "prop";
structure_classes['knobs'] = "prop";
structure_classes['label'] = "prop";
structure_classes['ladder'] = "furniture";
structure_classes['ladel'] = "prop";
structure_classes['lamp'] = "prop";
structure_classes['laptop'] = "prop";
structure_classes['laundry basket'] = "prop";
structure_classes['laundry detergent jug'] = "prop";
structure_classes['lazy susan'] = "prop";
structure_classes['leather sofa'] = "furniture";
structure_classes['lectern'] = "furniture";
structure_classes['leg of a girl'] = "prop";
structure_classes['lego'] = "prop";
structure_classes['letter stand'] = "prop";
structure_classes['letters'] = "prop";
structure_classes['lid'] = "prop";
structure_classes['lid of jar'] = "prop";
structure_classes['life jacket'] = "prop";
structure_classes['light'] = "structure"
structure_classes['light bulb'] = "prop";
structure_classes['light switch'] = "structure"
structure_classes['light switchreflection'] = "structure"
structure_classes['lighting track'] = "structure"
structure_classes['lint comb'] = "prop";
structure_classes['lint roller'] = "prop";
structure_classes['litter box'] = "prop";
structure_classes['luggage'] = "prop";
structure_classes['luggage rack'] = "furniture";
structure_classes['lunch bag'] = "prop";
structure_classes['machine'] = "prop";
structure_classes['magazine'] = "prop";
structure_classes['magazine holder'] = "prop";
structure_classes['magic 8ball'] = "prop";
structure_classes['magnet'] = "prop";
structure_classes['mail shelf'] = "structure"
structure_classes['mailshelf'] = "structure"
structure_classes['mail tray'] = "prop";
structure_classes['makeup brush'] = "prop";
structure_classes['manilla envelope'] = "prop";
structure_classes['mantel'] = "structure"
structure_classes['map'] = "prop";
structure_classes['mask'] = "prop";
structure_classes['matchbox'] = "prop";
structure_classes['mattress'] = "furniture";
structure_classes['medal'] = "prop";
structure_classes['medicine tube'] = "prop";
structure_classes['mellon'] = "prop";
structure_classes['menorah'] = "prop";
structure_classes['mens suit'] = "prop";
structure_classes['mens tie'] = "prop";
structure_classes['mezuza'] = "prop";
structure_classes['microphone'] = "prop";
structure_classes['microphone stand'] = "prop";
structure_classes['microwave'] = "prop";
structure_classes['mirror'] = "prop";
structure_classes['model boat'] = "prop";
structure_classes['modem'] = "prop";
structure_classes['money'] = "prop";
structure_classes['monitor'] = "prop";
structure_classes['motion camera'] = "prop";
structure_classes['mouse'] = "prop";
structure_classes['mouse pad'] = "prop";
structure_classes['muffins'] = "prop";
structure_classes['mug hanger'] = "prop";
structure_classes['mug holder'] = "prop";
structure_classes['music keyboard'] = "prop";
structure_classes['music stand'] = "furniture";
structure_classes['music stereo'] = "prop";
structure_classes['nailclipper'] = "prop";
structure_classes['napkin'] = "prop";
structure_classes['napkin dispenser'] = "prop";
structure_classes['napkin holder'] = "prop";
structure_classes['necklace'] = "prop";
structure_classes['necklace holder'] = "prop";
structure_classes['night stand'] = "furniture";
structure_classes['notebook'] = "prop";
structure_classes['notecards'] = "prop";
structure_classes['oil container'] = "prop";
structure_classes['onion'] = "prop";
structure_classes['orange'] = "prop";
structure_classes['orange juicer'] = "prop";
structure_classes['orange plastic cap'] = "prop";
structure_classes['ornamental item'] = "prop";
structure_classes['ornamental plant'] = "prop";
structure_classes['ornamental pot'] = "prop";
structure_classes['ottoman'] = "furniture";
structure_classes['oven'] = "structure"
structure_classes['oven handle'] = "prop";
structure_classes['oven mitt'] = "prop";
structure_classes['package of bedroom sheets'] = "prop";
structure_classes['package of bottled water'] = "prop";
structure_classes['package of water'] = "prop";
structure_classes['pan'] = "prop";
structure_classes['paper'] = "prop";
structure_classes['paper bundle'] = "prop";
structure_classes['paper cutter'] = "prop";
structure_classes['paper holder'] = "prop";
structure_classes['paper rack'] = "prop";
structure_classes['paper towel'] = "prop";
structure_classes['paper towel dispenser'] = "prop";
structure_classes['paper towel holder'] = "prop";
structure_classes['paper tray'] = "prop";
structure_classes['paper weight'] = "prop";
structure_classes['papers'] = "prop";
structure_classes['peach'] = "prop";
structure_classes['pen'] = "prop";
structure_classes['pen box'] = "prop";
structure_classes['pen cup'] = "prop";
structure_classes['pen holder'] = "prop";
structure_classes['pen stand'] = "prop";
structure_classes['pencil'] = "prop";
structure_classes['pencil holder'] = "prop";
structure_classes['pencils pens'] = "prop";
structure_classes['penholder'] = "prop";
structure_classes['pepper'] = "prop";
structure_classes['pepper grinder'] = "prop";
structure_classes['pepper shaker'] = "prop";
structure_classes['perfume'] = "prop";
structure_classes['perfume box'] = "prop";
structure_classes['person'] = "prop";
structure_classes['personal care liquid'] = "prop";
structure_classes['phone jack'] = "structure"
structure_classes['photo'] = "prop";
structure_classes['piano'] = "furniture";
structure_classes['piano bench'] = "furniture";
structure_classes['picture'] = "prop";
structure_classes['picture of fish'] = "prop";
structure_classes['piece of wood'] = "prop";
structure_classes['pig'] = "prop";
structure_classes['pillow'] = "prop";
structure_classes['pineapple'] = "prop";
structure_classes['ping pong racquet'] = "prop";
structure_classes['ping pong table'] = "furniture";
structure_classes['pipe'] = "prop";
structure_classes['pitcher'] = "prop";
structure_classes['pizza box'] = "prop";
structure_classes['placard'] = "prop";
structure_classes['placemat'] = "prop";
structure_classes['plant'] = "prop";
structure_classes['plant pot'] = "prop";
structure_classes['plaque'] = "prop";
structure_classes['plastic bowl'] = "prop";
structure_classes['plastic box'] = "prop";
structure_classes['plastic chair'] = "prop";
structure_classes['plastic crate'] = "prop";
structure_classes['plastic cup of coffee'] = "prop";
structure_classes['plastic dish'] = "prop";
structure_classes['plastic rack'] = "prop";
structure_classes['plastic toy container'] = "prop";
structure_classes['plastic tray'] = "prop";
structure_classes['plastic tub'] = "prop";
structure_classes['plate'] = "prop";
structure_classes['platter'] = "prop";
structure_classes['playpen'] = "furniture";
structure_classes['pool sticks'] = "prop";
structure_classes['pool table'] = "furniture";
structure_classes['poster'] = "prop";
structure_classes['poster board'] = "prop";
structure_classes['poster case'] = "prop";
structure_classes['pot'] = "prop";
structure_classes['potato'] = "prop";
structure_classes['power surge'] = "prop";
structure_classes['printer'] = "prop";
structure_classes['projector'] = "prop";
structure_classes['projector screen'] = "structure"
structure_classes['pump dispenser'] = "prop";
structure_classes['puppy toy'] = "prop";
structure_classes['purse'] = "prop";
structure_classes['quill'] = "prop";
structure_classes['quilt'] = "prop";
structure_classes['radiator'] = "furniture";
structure_classes['radio'] = "prop";
structure_classes['rags'] = "prop";
structure_classes['railing'] = "structure"
structure_classes['range hood'] = "structure"
structure_classes['razor'] = "prop";
structure_classes['refridgerator'] = "furniture";
structure_classes['remote control'] = "prop";
structure_classes['rolled carpet'] = "prop";
structure_classes['rolled up rug'] = "prop";
structure_classes['room divider'] = "furniture";
structure_classes['rope'] = "prop";
structure_classes['router'] = "prop";
structure_classes['rug'] = "prop";
structure_classes['ruler'] = "prop";
structure_classes['salt and pepper'] = "prop";
structure_classes['salt container'] = "prop";
structure_classes['salt shaker'] = "prop";
structure_classes['saucer'] = "prop";
structure_classes['scale'] = "prop";
structure_classes['scarf'] = "prop";
structure_classes['scenary'] = "prop";
structure_classes['scissor'] = "prop";
structure_classes['sculpture'] = "prop";
structure_classes['security camera'] = "prop";
structure_classes['server'] = "prop";
structure_classes['serving dish'] = "prop";
structure_classes['serving platter'] = "prop";
structure_classes['serving spoon'] = "prop";
structure_classes['sewing machine'] = "prop";
structure_classes['shaver'] = "prop";
structure_classes['shaving cream'] = "prop";
structure_classes['sheet'] = "prop";
structure_classes['sheet music'] = "prop";
structure_classes['sheet of metal'] = "prop";
structure_classes['sheets'] = "prop";
structure_classes['shelves'] = "furniture";
structure_classes['shirts in hanger'] = "prop";
structure_classes['shoe'] = "prop";
structure_classes['shoe rack'] = "prop";
structure_classes['shoelace'] = "prop";
structure_classes['shofar'] = "prop";
structure_classes['shopping baskets'] = "prop";
structure_classes['shopping cart'] = "prop";
structure_classes['shorts'] = "prop";
structure_classes['shovel'] = "prop";
structure_classes['show piece'] = "prop";
structure_classes['shower area'] = "structure"
structure_classes['shower base'] = "structure"
structure_classes['shower cap'] = "prop";
structure_classes['shower curtain'] = "structure"
structure_classes['shower glass'] = "structure"
structure_classes['shower head'] = "prop";
structure_classes['shower hose'] = "prop";
structure_classes['shower knob'] = "prop";
structure_classes['shower pipe'] = "prop";
structure_classes['shower tube'] = "prop";
structure_classes['showing plate'] = "prop";
structure_classes['sifter'] = "prop";
structure_classes['sign'] = "prop";
structure_classes['sink'] = "prop";
structure_classes['sink protector'] = "prop";
structure_classes['sissors'] = "prop";
structure_classes['six pack of beer'] = "prop";
structure_classes['slide'] = "furniture";
structure_classes['soap'] = "prop";
structure_classes['soap box'] = "prop";
structure_classes['soap dish'] = "prop";
structure_classes['soap holder'] = "prop";
structure_classes['soap stand'] = "prop";
structure_classes['soap tray'] = "prop";
structure_classes['sock'] = "prop";
structure_classes['sofa'] = "furniture";
structure_classes['soft toy'] = "prop";
structure_classes['soft toy group'] = "prop";
structure_classes['spatula'] = "prop";
structure_classes['speaker'] = "prop";
structure_classes['spice bottle'] = "prop";
structure_classes['spice rack'] = "structure"
structure_classes['spice stand'] = "structure"
structure_classes['sponge'] = "prop";
structure_classes['spoon'] = "prop";
structure_classes['spoon sets'] = "prop";
structure_classes['spoon stand'] = "prop";
structure_classes['squash'] = "prop";
structure_classes['squeeze tube'] = "prop";
structure_classes['stacked bins'] = "prop";
structure_classes['stacked bins boxes'] = "prop";
structure_classes['stacked chairs'] = "furniture";
structure_classes['stacked plastic racks'] = "furniture";
structure_classes['stairs'] = "structure"
structure_classes['stamp'] = "prop";
structure_classes['stand'] = "furniture";
structure_classes['staple remover'] = "prop";
structure_classes['stapler'] = "prop";
structure_classes['steamer'] = "prop";
structure_classes['step stool'] = "prop";
structure_classes['stereo'] = "prop";
structure_classes['stick'] = "prop";
structure_classes['sticker'] = "prop";
structure_classes['sticks'] = "prop";
structure_classes['stones'] = "prop";
structure_classes['stool'] = "prop";
structure_classes['storage basket'] = "prop";
structure_classes['storage bin'] = "prop";
structure_classes['storage box'] = "prop";
structure_classes['storage chest'] = "furniture";
structure_classes['storage rack'] = "furniture";
structure_classes['storage shelvesbooks'] = "furniture";
structure_classes['storage space'] = "structure"
structure_classes['stove'] = "structure"
structure_classes['stove burner'] = "prop";
structure_classes['stroller'] = "furniture";
structure_classes['stuffed animal'] = "prop";
structure_classes['styrofoam object'] = "prop";
structure_classes['suger jar'] = "prop";
structure_classes['suitcase'] = "prop";
structure_classes['surge protect'] = "prop";
structure_classes['surge protector'] = "prop";
structure_classes['switchbox'] = "prop";
structure_classes['table'] = "furniture";
structure_classes['table runner'] = "prop";
structure_classes['tablecloth'] = "prop";
structure_classes['tag'] = "prop";
structure_classes['tape'] = "prop";
structure_classes['tape dispenser'] = "prop";
structure_classes['tea box'] = "prop";
structure_classes['tea cannister'] = "prop";
structure_classes['tea coaster'] = "prop";
structure_classes['tea kettle'] = "prop";
structure_classes['tea pot'] = "prop";
structure_classes['telephone'] = "prop";
structure_classes['telephone cord'] = "prop";
structure_classes['telescope'] = "prop";
structure_classes['television'] = "prop";
structure_classes['tennis racket'] = "prop";
structure_classes['tent'] = "furniture";
structure_classes['thermostat'] = "prop";
structure_classes['tin foil'] = "prop";
structure_classes['tissue'] = "prop";
structure_classes['tissue box'] = "prop";
structure_classes['tissue roll'] = "prop";
structure_classes['toaster'] = "prop";
structure_classes['toaster oven'] = "prop";
structure_classes['toilet'] = "furniture";
structure_classes['toilet bowl brush'] = "prop";
structure_classes['toilet brush'] = "prop";
structure_classes['toilet holder'] = "prop";
structure_classes['toilet paper'] = "prop";
structure_classes['toilet paper holder'] = "prop";
structure_classes['toilet plunger'] = "prop";
structure_classes['toiletries'] = "prop";
structure_classes['toiletries bag'] = "prop";
structure_classes['toothbrush'] = "prop";
structure_classes['toothbrush holder'] = "prop";
structure_classes['toothpaste'] = "prop";
structure_classes['toothpaste holder'] = "prop";
structure_classes['torah'] = "prop";
structure_classes['torch'] = "prop";
structure_classes['towel'] = "prop";
structure_classes['towel rod'] = "structure"
structure_classes['toy'] = "prop";
structure_classes['toy boat'] = "prop";
structure_classes['toy box'] = "prop";
structure_classes['toy car'] = "prop";
structure_classes['toy cash register'] = "prop";
structure_classes['toy chair'] = "prop";
structure_classes['toy chest'] = "prop";
structure_classes['toy cube'] = "prop";
structure_classes['toy cuboid'] = "prop";
structure_classes['toy cylinder'] = "prop";
structure_classes['toy doll'] = "prop";
structure_classes['toy horse'] = "prop";
structure_classes['toy house'] = "prop";
structure_classes['toy kitchen'] = "prop";
structure_classes['toy phone'] = "prop";
structure_classes['toy pyramid'] = "prop";
structure_classes['toy rectangle'] = "prop";
structure_classes['toy shelf'] = "prop";
structure_classes['toy sink'] = "prop";
structure_classes['toy sofa'] = "prop";
structure_classes['toy table'] = "prop";
structure_classes['toy tree'] = "prop";
structure_classes['toy triangle'] = "prop";
structure_classes['toy truck'] = "prop";
structure_classes['toy trucks'] = "prop";
structure_classes['toyhouse'] = "prop";
structure_classes['toys basket'] = "prop";
structure_classes['toys box'] = "prop";
structure_classes['toys rack'] = "furniture";
structure_classes['toys shelf'] = "furniture";
structure_classes['track light'] = "structure"
structure_classes['trampoline'] = "furniture";
structure_classes['travel bag'] = "prop";
structure_classes['tray'] = "prop";
structure_classes['treadmill'] = "furniture";
structure_classes['tree sculpture'] = "structure"
structure_classes['tricycle'] = "prop";
structure_classes['trivet'] = "prop";
structure_classes['trolly'] = "furniture";
structure_classes['trophy'] = "prop";
structure_classes['tub of tupperware'] = "prop";
structure_classes['tumbler'] = "prop";
structure_classes['tuna cans'] = "prop";
structure_classes['tupperware'] = "prop";
structure_classes['tv stand'] = "furniture";
structure_classes['typewriter'] = "prop";
structure_classes['umbrella'] = "prop";
structure_classes['unknown'] = "prop";
structure_classes['urn'] = "prop";
structure_classes['usb drive'] = "prop";
structure_classes['utensil'] = "prop";
structure_classes['utensil container'] = "prop";
structure_classes['utensils'] = "prop";
structure_classes['vacuum cleaner'] = "prop";
structure_classes['vase'] = "prop";
structure_classes['vasoline'] = "prop";
structure_classes['vegetable'] = "prop";
structure_classes['vegetable peeler'] = "prop";
structure_classes['vegetables'] = "prop";
structure_classes['ventilation'] = "structure"
structure_classes['vessel'] = "prop";
structure_classes['vessel set'] = "prop";
structure_classes['vessels'] = "prop";
structure_classes['video game'] = "prop";
structure_classes['vuvuzela'] = "prop";
structure_classes['waffle maker'] = "prop";
structure_classes['walkie talkie'] = "prop";
structure_classes['wall'] = "structure"
structure_classes['wall decoration'] = "prop";
structure_classes['wall divider'] = "furniture";
structure_classes['wall hand sanitizer dispenser'] = "prop";
structure_classes['wall stand'] = "structure"
structure_classes['wallet'] = "prop";
structure_classes['wardrobe'] = "furniture";
structure_classes['washing machine'] = "furniture";
structure_classes['watch'] = "prop";
structure_classes['water carboy'] = "prop";
structure_classes['water cooler'] = "furniture";
structure_classes['water dispenser'] = "prop";
structure_classes['water filter'] = "prop";
structure_classes['water fountain'] = "structure"
structure_classes['water heater'] = "prop";
structure_classes['water purifier'] = "prop";
structure_classes['watermellon'] = "prop";
structure_classes['webcam'] = "prop";
structure_classes['whisk'] = "prop";
structure_classes['whiteboard'] = "structure"
structure_classes['whiteboard eraser'] = "prop";
structure_classes['whiteboard marker'] = "prop";
structure_classes['wii'] = "prop";
structure_classes['window'] = "structure"
structure_classes['window box'] = "structure"
structure_classes['window cover'] = "structure"
structure_classes['window frame'] = "structure"
structure_classes['window seat'] = "structure"
structure_classes['window shelf'] = "structure"
structure_classes['wine accessory'] = "prop";
structure_classes['wine bottle'] = "prop";
structure_classes['wine glass'] = "prop";
structure_classes['wine rack'] = "prop";
structure_classes['wiping cloth'] = "prop";
structure_classes['wire'] = "prop";
structure_classes['wire basket'] = "prop";
structure_classes['wire board'] = "prop";
structure_classes['wire rack'] = "prop";
structure_classes['wire tray'] = "prop";
structure_classes['wooden container'] = "prop";
structure_classes['wooden kitchen utensils'] = "prop";
structure_classes['wooden pillar'] = "structure"
structure_classes['wooden plank'] = "prop";
structure_classes['wooden planks'] = "prop";
structure_classes['wooden toy'] = "prop";
structure_classes['wooden utensil'] = "prop";
structure_classes['wooden utensils'] = "prop";
structure_classes['wreathe'] = "prop";
structure_classes['xbox'] = "prop";
structure_classes['yarmulka'] = "prop";
structure_classes['yellow pepper'] = "prop";
structure_classes['yoga mat'] = "prop";
structure_classes['toy bottle'] = "prop";
structure_classes['lock'] = "prop";
structure_classes['iphone'] = "prop";
structure_classes['napkin ring'] = "prop";
structure_classes['bed sheets'] = "prop";
structure_classes['spot light'] = "prop";
structure_classes['mortar and pestle'] = "prop";
structure_classes['stack of plates'] = "prop";
structure_classes['suit jacket'] = "prop";
structure_classes['coat hanger'] = "prop";
structure_classes['cardboard tube'] = "prop";
structure_classes['toy bin'] = "prop";
structure_classes['roll of paper'] = "prop";
structure_classes['cardboard sheet'] = "prop";
structure_classes['pyramid'] = "prop";
structure_classes['toy plane'] = "prop";
structure_classes['bottle of soap'] = "prop";
structure_classes['box of paper'] = "prop";
structure_classes['trolley'] = "prop";
structure_classes['pool ball'] = "prop";
structure_classes['alarm'] = "prop";
structure_classes['cannister'] = "prop";
structure_classes['ping pong ball'] = "prop";
structure_classes['ping pong racket'] = "prop";
structure_classes['roll of toilet paper'] = "prop";
structure_classes['bottle of listerine'] = "prop";
structure_classes['bottle of hand wash liquid'] = "prop";
structure_classes['banana peel'] = "prop";
structure_classes['heating tray'] = "prop";
structure_classes['measuring cap'] = "prop";
structure_classes['bottle of ketcup'] = "prop";
structure_classes['handle'] = "prop";
structure_classes['lemon'] = "prop";
structure_classes['wine'] = "prop";
structure_classes['boomerang'] = "prop";
structure_classes['button'] = "prop";
structure_classes['decorative bowl'] = "prop";
structure_classes['book holder'] = "prop";
structure_classes['toy apple'] = "prop";
structure_classes['toy dog'] = "prop";
structure_classes['drawer knob'] = "prop";
structure_classes['shoe hanger'] = "prop";
structure_classes['figurine'] = "prop";
structure_classes['soccer ball'] = "prop";
structure_classes['hand weight'] = "prop";
structure_classes['sleeping bag'] = "prop";
structure_classes['trinket'] = "prop";
structure_classes['hand fan'] = "prop";
structure_classes['sculpture of the chrysler building'] = "prop";
structure_classes['sculpture of the eiffel tower'] = "prop";
structure_classes['sculpture of the empire state building'] = "prop";
structure_classes['jeans'] = "prop";
structure_classes['toy stroller'] = "prop";
structure_classes['shelf frame'] = "prop";
structure_classes['cat house'] = "prop";
structure_classes['can of beer'] = "prop";
structure_classes['lamp shade'] = "prop";
structure_classes['bracelet'] = "prop";
structure_classes['indoor fountain'] = "furniture";
structure_classes['decorative egg'] = "prop";
structure_classes['photo album'] = "prop";
structure_classes['decorative candle'] = "prop";
structure_classes['walkietalkie'] = "prop";
structure_classes['floor trim'] = "structure"
structure_classes['mini display platform'] = "prop";
structure_classes['american flag'] = "prop";
structure_classes['vhs tapes'] = "prop";
structure_classes['throw'] = "prop";
structure_classes['newspapers'] = "prop";
structure_classes['mantle'] = "structure"
structure_classes['roll of paper towels'] = "prop";
return structure_classes
================================================
FILE: dataset/test/convert.py
================================================
#!/usr/bin/env python
#######################################################################################
# The MIT License
# Copyright (c) 2014 Hannes Schulz, University of Bonn
# Copyright (c) 2013 Benedikt Waldvogel, University of Bonn
# Copyright (c) 2008-2009 Sebastian Nowozin
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in all
# copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
# SOFTWARE.
#######################################################################################
# vim: set fileencoding=utf-8 :
#
# Helper script to convert the NYU Depth v2 dataset Matlab file into a set of
# PNG images in the CURFIL dataset format.
#
# See https://github.com/deeplearningais/curfil/wiki/Training-and-Prediction-with-the-NYU-Depth-v2-Dataset
from __future__ import print_function
from joblib import Parallel, delayed
from skimage import exposure
from skimage.io import imsave
import h5py
import numpy as np
import os
import png
import scipy.io
import sys
from _structure_classes import get_structure_classes
import _solarized
def process_ground_truth(ground_truth):
    colors = dict()
    colors["structure"] = _solarized.colors[5]
    colors["prop"] = _solarized.colors[8]
    colors["furniture"] = _solarized.colors[9]
    colors["floor"] = _solarized.colors[1]
    shape = list(ground_truth.shape) + [3]
    img = np.ndarray(shape=shape, dtype=np.uint8)
    for i in xrange(shape[0]):
        for j in xrange(shape[1]):
            l = ground_truth[i, j]
            if (l == 0):
                img[i, j] = (0, 0, 0)  # background
            else:
                name = classes[names[l - 1]]
                assert name in colors, name
                img[i, j] = colors[name]
    return img
def visualize_depth_image(data):
    # work on a copy so the caller's array is not modified in place
    data = data.copy()
    data[data == 0.0] = np.nan
    maxdepth = np.nanmax(data)
    mindepth = np.nanmin(data)
    data -= mindepth
    data /= (maxdepth - mindepth)
    gray = np.zeros(list(data.shape) + [3], dtype=data.dtype)
    data = (1.0 - data)
    gray[..., :3] = np.dstack((data, data, data))
    # use a greenish color to visualize missing depth
    gray[np.isnan(data), :] = (97, 160, 123)
    gray[np.isnan(data), :] /= 255
    gray = exposure.equalize_hist(gray)
    # set alpha channel
    gray = np.dstack((gray, np.ones(data.shape[:2])))
    gray[np.isnan(data), -1] = 0.5
    return gray * 255
def convert_image(i, scene, img_depth, image, label):
    idx = int(i) + 1
    if idx in train_images:
        train_test = "training"
    else:
        assert idx in test_images, "index %d neither found in training set nor in test set" % idx
        train_test = "testing"
    folder = "%s/%s/%s" % (out_folder, train_test, scene)
    if not os.path.exists(folder):
        os.makedirs(folder)
    img_depth *= 1000.0
    png.from_array(img_depth, 'L;16').save("%s/%05d_depth.png" % (folder, i))
    depth_visualization = visualize_depth_image(img_depth)
    # workaround for a bug in the png module
    depth_visualization = depth_visualization.copy()  # makes it contiguous
    shape = depth_visualization.shape
    depth_visualization.shape = (shape[0], np.prod(shape[1:]))
    depth_image = png.from_array(depth_visualization, "RGBA;8")
    depth_image.save("%s/%05d_depth_visualization.png" % (folder, i))
    imsave("%s/%05d_colors.png" % (folder, i), image)
    ground_truth = process_ground_truth(label)
    imsave("%s/%05d_ground_truth.png" % (folder, i), ground_truth)
if __name__ == "__main__":
    if len(sys.argv) < 4:
        print("usage: %s <h5_file> <train_test_split.mat> <out_folder> [<raw_depth>] [<num_threads>]" % sys.argv[0], file=sys.stderr)
        sys.exit(0)
    h5_file = h5py.File(sys.argv[1], "r")
    # h5py is not able to open that file, but scipy is
    train_test = scipy.io.loadmat(sys.argv[2])
    out_folder = sys.argv[3]
    if len(sys.argv) >= 5:
        raw_depth = bool(int(sys.argv[4]))
    else:
        raw_depth = False
    if len(sys.argv) >= 6:
        num_threads = int(sys.argv[5])
    else:
        num_threads = -1
    test_images = set([int(x) for x in train_test["testNdxs"]])
    train_images = set([int(x) for x in train_test["trainNdxs"]])
    print("%d training images" % len(train_images))
    print("%d test images" % len(test_images))
    if raw_depth:
        print("using raw depth images")
        depth = h5_file['rawDepths']
    else:
        print("using filled depth images")
        depth = h5_file['depths']
    print("reading", sys.argv[1])
    labels = h5_file['labels']
    images = h5_file['images']
    rawDepthFilenames = [u''.join(unichr(c) for c in h5_file[obj_ref]) for obj_ref in h5_file['rawDepthFilenames'][0]]
    names = [u''.join(unichr(c) for c in h5_file[obj_ref]) for obj_ref in h5_file['names'][0]]
    scenes = [u''.join(unichr(c) for c in h5_file[obj_ref]) for obj_ref in h5_file['sceneTypes'][0]]
    rawRgbFilenames = [u''.join(unichr(c) for c in h5_file[obj_ref]) for obj_ref in h5_file['rawRgbFilenames'][0]]
    classes = get_structure_classes()
    print("processing images")
    if num_threads == 1:
        print("single-threaded mode")
        for i, image in enumerate(images):
            print("image", i + 1, "/", len(images))
            convert_image(i, scenes[i], depth[i, :, :].T, image.T, labels[i, :, :].T)
    else:
        Parallel(num_threads, 5)(delayed(convert_image)(i, scenes[i], depth[i, :, :].T, images[i, :, :].T, labels[i, :, :].T) for i in range(len(images)))
    print("finished")
================================================
FILE: dataset/test/create_test_lmdb.sh
================================================
#! /bin/bash
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
for f in test_colors/*.png
do
file=$(basename $f)
depthfile="${file/colors/depth}"
printf "$file $depthfile\n" >> list_ordered.txt
done
sort --random-sort -o list.txt list_ordered.txt
awk < list.txt '{printf $1; printf " 0\n"}' > list_color.txt
awk < list.txt '{printf $2; printf " 0\n"}' > list_depth.txt
rm list.txt
rm list_ordered.txt
mkdir -p test_lmdb
$1/build/tools/convert_imageset -resize_height 27 -resize_width 37 -gray test_depths/ list_depth.txt test_lmdb/test_depth_37x27.lmdb
$1/build/tools/convert_imageset -resize_height 54 -resize_width 74 -gray test_depths/ list_depth.txt test_lmdb/test_depth_74x54.lmdb
$1/build/tools/convert_imageset -resize_height 218 -resize_width 298 test_colors/ list_color.txt test_lmdb/test_color_298x218.lmdb
rm list_depth.txt
rm list_color.txt
================================================
FILE: dataset/test/crop.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
import matplotlib.pyplot as plt
import matplotlib
import sys
import PIL
from PIL import Image
import cv2
import cv
import caffe
import argparse
import os.path
import random
from random import randint
parser = argparse.ArgumentParser()
#parser.add_argument("color_folder", help="input folder")
args = parser.parse_args()
for file in os.listdir("test_data"):
    if file.endswith(".png"):
        filePath = 'test_data/' + file
        width, height = 640, 480
        newWidth, newHeight = 420, 320
        borderX = (width - newWidth) / 2
        borderY = (height - newHeight) / 2
        img = Image.open(filePath)
        img = img.crop((borderX, borderY, width - borderX, height - borderY))
        print(filePath)
        if 'depth' in file:
            depthArray = np.array(img)
            depthArray = depthArray.astype(np.float32)
            depthArray /= 65535.0
            depthArray = np.clip(depthArray, 0.0039, 1)
            depthArray *= 6.5535  # 1 - 10 meters
            depthArray *= 255
            depthArray = depthArray.astype(np.uint8)
            depthNew = Image.fromarray(depthArray)
            depthNew.save('test_depths/' + file)
        if 'colors' in file:
            img.save('test_colors/' + file)
================================================
FILE: dataset/test/process_test.sh
================================================
#! /bin/bash
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
python convert.py nyu_depth_v2_labeled.mat splits.mat out 0 1
mkdir -p test_data
find out/testing -name '*colors.png' -exec mv -t test_data {} +
find out/testing -name '*depth.png' -exec mv -t test_data {} +
mkdir -p test_colors
mkdir -p test_depths
python crop.py
================================================
FILE: dataset/train/create_train_lmdb.sh
================================================
#! /bin/bash
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
#Data 0
for f in train_colors0/*.png
do
file=$(basename $f)
depthfile="${file/rgb/depth}"
printf "$file $depthfile\n" >> list_ordered.txt
done
sort --random-sort -o list.txt list_ordered.txt
awk < list.txt '{printf $1; printf " 0\n"}' > list_color.txt
awk < list.txt '{printf $2; printf " 0\n"}' > list_depth.txt
rm list.txt
rm list_ordered.txt
mkdir -p train_raw0_lmdb
$1/build/tools/convert_imageset -resize_height 27 -resize_width 37 -gray train_depths0/ list_depth.txt train_raw0_lmdb/train_raw0_depth_37x27.lmdb
$1/build/tools/convert_imageset -resize_height 54 -resize_width 74 -gray train_depths0/ list_depth.txt train_raw0_lmdb/train_raw0_depth_74x54.lmdb
$1/build/tools/convert_imageset -resize_height 218 -resize_width 298 train_colors0/ list_color.txt train_raw0_lmdb/train_raw0_color_298x218.lmdb
rm list_color.txt
rm list_depth.txt
#Data 1
for f in train_colors1/*.png
do
file=$(basename "$f")
depthfile="${file/rgb/depth}"
printf '%s %s\n' "$file" "$depthfile" >> list_ordered.txt
done
sort --random-sort -o list.txt list_ordered.txt
awk < list.txt '{printf $1; printf " 0\n"}' > list_color.txt
awk < list.txt '{printf $2; printf " 0\n"}' > list_depth.txt
rm list.txt
rm list_ordered.txt
mkdir -p train_raw1_lmdb
$1/build/tools/convert_imageset -resize_height 27 -resize_width 37 -gray train_depths1/ list_depth.txt train_raw1_lmdb/train_raw1_depth_37x27.lmdb
$1/build/tools/convert_imageset -resize_height 54 -resize_width 74 -gray train_depths1/ list_depth.txt train_raw1_lmdb/train_raw1_depth_74x54.lmdb
$1/build/tools/convert_imageset -resize_height 218 -resize_width 298 train_colors1/ list_color.txt train_raw1_lmdb/train_raw1_color_298x218.lmdb
rm list_color.txt
rm list_depth.txt
#Data 2
for f in train_colors2/*.png
do
file=$(basename "$f")
depthfile="${file/rgb/depth}"
printf '%s %s\n' "$file" "$depthfile" >> list_ordered.txt
done
sort --random-sort -o list.txt list_ordered.txt
awk < list.txt '{printf $1; printf " 0\n"}' > list_color.txt
awk < list.txt '{printf $2; printf " 0\n"}' > list_depth.txt
rm list.txt
rm list_ordered.txt
mkdir -p train_raw2_lmdb
$1/build/tools/convert_imageset -resize_height 27 -resize_width 37 -gray train_depths2/ list_depth.txt train_raw2_lmdb/train_raw2_depth_37x27.lmdb
$1/build/tools/convert_imageset -resize_height 54 -resize_width 74 -gray train_depths2/ list_depth.txt train_raw2_lmdb/train_raw2_depth_74x54.lmdb
$1/build/tools/convert_imageset -resize_height 218 -resize_width 298 train_colors2/ list_color.txt train_raw2_lmdb/train_raw2_color_298x218.lmdb
rm list_color.txt
rm list_depth.txt
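The same list-building logic appears three times above. As a compact reference, what each block produces can be sketched in Python (the helper name `build_lists` is an assumption, not part of the repository; the trailing `0` is the dummy label `convert_imageset` expects):

```python
import os
import random

def build_lists(color_dir):
    """Pair each *rgb.png with its *depth.png counterpart, shuffle the
    pairs together, and return the "<file> 0" lines that Caffe's
    convert_imageset tool expects (0 is a dummy label)."""
    pairs = [(f, f.replace('rgb', 'depth'))
             for f in sorted(os.listdir(color_dir)) if f.endswith('.png')]
    random.shuffle(pairs)  # same order for both lists keeps pairs aligned
    color_lines = [c + ' 0' for c, _ in pairs]
    depth_lines = [d + ' 0' for _, d in pairs]
    return color_lines, depth_lines
```

Shuffling the pairs rather than the two files independently is what keeps each color image aligned with its depth map across the two LMDBs.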
================================================
FILE: dataset/train/get_train_scenes.m
================================================
% Master's Thesis - Depth Estimation by Convolutional Neural Networks
% Jan Ivanecky; xivane00@stud.fit.vutbr.cz
data = load('nyu_depth_v2_labeled.mat');
split = load('splits.mat');
for i = 1 : 795
folders(i) = data.scenes(split.trainNdxs(i));
end
folders = unique(folders);
fileID = fopen('train_scenes.txt','w');
for i = 1 :numel(folders)
fprintf(fileID, '%s\n', folders{i});
end
fclose(fileID);
exit();
================================================
FILE: dataset/train/process_raw.m
================================================
% Master's Thesis - Depth Estimation by Convolutional Neural Networks
% Jan Ivanecky; xivane00@stud.fit.vutbr.cz
addpath('tools');
d = dir('.');
isub = [d(:).isdir]; %# returns logical vector
nameFolds = {d(isub).name}';
nameFolds(ismember(nameFolds,{'.','..','tools'})) = [];
nameFolds(~cellfun(@isempty,(regexp(nameFolds,'._out')))) = [];
disp(numel(nameFolds));
count = 0;
outCount = 0;
for f = 1:numel(nameFolds)
disp(f);
disp(nameFolds{f});
files = get_synched_frames(nameFolds{f});
c = numel(files);
disp(strcat('filecount: ',int2str(c)));
files = files(1:5:c);
c = numel(files);
disp(strcat('filecount to process: ',int2str(c)));
count = count + c;
outFolder = strcat(nameFolds{f}, '_out');
if ~exist(outFolder, 'dir')
mkdir(outFolder);
end
parfor idx = 1:c
rgbFilename = strcat(nameFolds{f},'/',files(idx).rawRgbFilename);
depthFilename = strcat(nameFolds{f},'/',files(idx).rawDepthFilename);
outRGBFilename = strcat(nameFolds{f},'_out/',nameFolds{f},num2str(idx),'rgb.png');
outDepthFilename = strcat(nameFolds{f},'_out/',nameFolds{f},num2str(idx),'depth.png');
disp(outRGBFilename);
rgb = imread(rgbFilename);
depth = imread(depthFilename);
depth = swapbytes(depth); % raw depth frames are stored with opposite byte order
[depthOut, rgbOut] = project_depth_map(depth, rgb);
imgDepth = fill_depth_colorization(double(rgbOut) / 255.0, depthOut, 0.8);
imgDepth = imgDepth / 10.0;
imgDepth = crop_image(imgDepth);
rgbOut = crop_image(rgbOut);
imwrite(rgbOut, outRGBFilename);
imwrite(imgDepth, outDepthFilename);
end
D = dir([outFolder, '/*rgb.png']);
Num = length(D);
disp(strcat('output filecount: ',int2str(Num)));
outCount = outCount + Num;
end
disp(count);
disp(outCount);
exit;
================================================
FILE: dataset/train/split_train_set.sh
================================================
#!/bin/bash
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
echo "Extracting images from training scenes..."
mkdir -p train_data_all
while read i; do
dirpath="${1}/${i}"
for d in "${dirpath}"*/
do
cp "$d"/* -t train_data_all
done
done < train_scenes.txt
echo "Moving extracted RGB images to train_rgbs..."
mkdir -p train_colors
find train_data_all -name '*rgb.png' -exec mv -t train_colors {} +
echo "Moving extracted Depth images to train_depths..."
mkdir -p train_depths
find train_data_all -name '*depth.png' -exec mv -t train_depths {} +
rm -r train_data_all
================================================
FILE: dataset/train/train_augment0.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
import sys
import PIL
from PIL import Image
import os.path
import random
from random import randint
try:
os.mkdir('train_colors0')
except OSError:
print('output folder already exists')
try:
os.mkdir('train_depths0')
except OSError:
print('output folder already exists')
counter = 1
for file in os.listdir("train_colors"):
if file.endswith(".png"):
depthFile = file.replace('rgb','depth')
filePath = 'train_colors/' + file
depthFilePath = 'train_depths/' + depthFile
print(str(counter) + filePath + ' ' + depthFilePath)
counter += 1
colorOriginal = Image.open(filePath)
depthOriginal = Image.open(depthFilePath)
width, height = 561, 427
newWidth, newHeight = 420, 320
borderX = (width - newWidth) / 2
borderY = (height - newHeight) / 2
colorNew = colorOriginal.crop((borderX, borderY, width - borderX, height - borderY))
depthNew = depthOriginal.crop((borderX, borderY, width - borderX, height - borderY))
colorNew.save('train_colors0/' + file)
depthNew.save('train_depths0/' + depthFile)
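One subtlety in the center-crop arithmetic above: under Python 2 integer division the odd margins (141 and 107 pixels) floor to 70 and 53, so the resulting crop is 421x321, one pixel larger than the 420x320 target on each axis. A quick check:

```python
# Center-crop box exactly as computed above (Python 2 '/' on ints floors,
# written here with '//' so the result is the same on Python 3).
width, height = 561, 427
new_width, new_height = 420, 320
border_x = (width - new_width) // 2    # 70, not 70.5
border_y = (height - new_height) // 2  # 53, not 53.5
box = (border_x, border_y, width - border_x, height - border_y)
crop_size = (box[2] - box[0], box[3] - box[1])
# crop_size comes out as (421, 321), not (420, 320)
```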
================================================
FILE: dataset/train/train_augment1.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
import sys
import os.path
import random
from random import randint
import PIL
from PIL import Image
import matplotlib.colors  # needed for rgb_to_hsv / hsv_to_rgb below
try:
os.mkdir('train_colors1')
except OSError:
print('output folder already exists')
try:
os.mkdir('train_depths1')
except OSError:
print('output folder already exists')
counter = 1
for file in os.listdir("train_colors"):
if file.endswith(".png"):
depthFile = file.replace('rgb','depth')
filePath = 'train_colors/' + file
depthFilePath = 'train_depths/' + depthFile
print(str(counter) + filePath + ' ' + depthFilePath)
counter += 1
colorOriginal = Image.open(filePath)
depthOriginal = Image.open(depthFilePath)
rotation_std = 2.5
filename = os.path.splitext(file)[0]
depthFilename = os.path.splitext(depthFile)[0]
for i in range(5):
color = colorOriginal
depth = depthOriginal
width, height = 561, 427
newWidth, newHeight = 420, 320
borderX = (width - newWidth) / 2
borderY = (height - newHeight) / 2
if randint(0,2) == 0:
randomTranslationX = 0
randomTranslationY = 0
randomAngle = np.random.normal(0.0, rotation_std)
color = color.rotate(randomAngle)
depth = depth.rotate(randomAngle)
else:
randomScale = random.uniform(0.875, 1.125)
resizeWidth, resizeHeight = int(randomScale * width), int(randomScale * height)
color = color.resize((resizeWidth, resizeHeight), PIL.Image.ANTIALIAS)
depth = depth.resize((resizeWidth, resizeHeight), PIL.Image.ANTIALIAS)
depthArray = np.array(depth)
depthArray = depthArray.astype(np.float32)
depthArray /= randomScale
depthArray = np.clip(depthArray, 0.0, 255.0)
depthArray = depthArray.astype(np.uint8)
depth = Image.fromarray(depthArray)
width, height = color.size
borderX = (width - newWidth) / 2
borderY = (height - newHeight) / 2
randomTranslationX = randint(-borderX + 1,borderX-1)
randomTranslationY = randint(-borderY + 1,borderY-1)
colorNew = color.crop((borderX + randomTranslationX, borderY + randomTranslationY,width - borderX + randomTranslationX, height - borderY + randomTranslationY))
depthNew = depth.crop((borderX + randomTranslationX, borderY + randomTranslationY,width - borderX + randomTranslationX, height - borderY + randomTranslationY))
colorArray = np.array(colorNew)
colorArray = colorArray.astype(np.float32) / 255.0
colorArray = matplotlib.colors.rgb_to_hsv(colorArray)
randomHueShift = random.uniform(-0.05,0.05)
colorArray[:,:,0] += randomHueShift
colorArray[:,:,0] = np.mod(colorArray[:,:,0], 1.0)
randomSaturationShift = random.uniform(-0.05,0.05)
colorArray[:,:,1] += randomSaturationShift
colorArray[:,:,1] = np.clip(colorArray[:,:,1], 0, 1)
randomValueShift = random.uniform(-0.05,0.05)
colorArray[:,:,2] += randomValueShift
colorArray[:,:,2] = np.clip(colorArray[:,:,2], 0, 1)
colorArray = matplotlib.colors.hsv_to_rgb(colorArray) * 255.0
randomContrastChange = random.uniform(205.0,305.0)
colorArray *= randomContrastChange / 255.0
colorArray -= (randomContrastChange - 255.0) / 2.0
colorArray = np.clip(colorArray, 0, 255.0)
colorArray = colorArray.astype(np.uint8)
colorNew = Image.fromarray(colorArray)
colorNew.save('train_colors1/' + filename + str(i) + '.png')
depthNew.save('train_depths1/' + depthFilename + str(i) + '.png')
colorNewH = colorNew.transpose(PIL.Image.FLIP_LEFT_RIGHT)
depthNewH = depthNew.transpose(PIL.Image.FLIP_LEFT_RIGHT)
colorNewH.save('train_colors1/' + filename + str(i) + 'f.png')
depthNewH.save('train_depths1/' + depthFilename + str(i) + 'f.png')
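The contrast change above (x * c / 255 - (c - 255) / 2) is a linear stretch whose fixed point is mid-gray 127.5 for any contrast factor c: values above mid-gray are pushed up, values below are pushed down, and the clip keeps the result in [0, 255]. A small numeric check (the helper name `contrast` is ours, not the script's):

```python
def contrast(x, c):
    # Contrast change as applied in the augmentation above: scale, then
    # shift so that mid-gray (127.5) is left exactly where it was.
    return min(max(x * c / 255.0 - (c - 255.0) / 2.0, 0.0), 255.0)

mid = contrast(127.5, 305.0)     # mid-gray is unchanged
bright = contrast(255.0, 305.0)  # bright values saturate at 255
```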
================================================
FILE: dataset/train/train_augment2.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
import sys
import os.path
import random
from random import randint
import PIL
from PIL import Image
import matplotlib.colors  # needed for rgb_to_hsv / hsv_to_rgb below
try:
os.mkdir('train_colors2')
except OSError:
print('output folder already exists')
try:
os.mkdir('train_depths2')
except OSError:
print('output folder already exists')
counter = 1
for file in os.listdir("train_colors"):
if file.endswith(".png"):
depthFile = file.replace('rgb','depth')
filePath = 'train_colors/' + file
depthFilePath = 'train_depths/' + depthFile
print(str(counter) + filePath + ' ' + depthFilePath)
counter += 1
colorOriginal = Image.open(filePath)
depthOriginal = Image.open(depthFilePath)
rotation_std = 5.0
filename = os.path.splitext(file)[0]
depthFilename = os.path.splitext(depthFile)[0]
for i in range(5):
color = colorOriginal
depth = depthOriginal
width, height = 561, 427
newWidth, newHeight = 420, 320
borderX = (width - newWidth) / 2
borderY = (height - newHeight) / 2
if randint(0,2) == 0:
randomTranslationX = 0
randomTranslationY = 0
randomAngle = np.random.normal(0.0, rotation_std)
color = color.rotate(randomAngle)
depth = depth.rotate(randomAngle)
else:
randomScale = random.uniform(0.75, 1.25)
resizeWidth, resizeHeight = int(randomScale * width), int(randomScale * height)
color = color.resize((resizeWidth, resizeHeight), PIL.Image.ANTIALIAS)
depth = depth.resize((resizeWidth, resizeHeight), PIL.Image.ANTIALIAS)
depthArray = np.array(depth)
depthArray = depthArray.astype(np.float32)
depthArray /= randomScale
depthArray = np.clip(depthArray, 0.0, 255.0)
depthArray = depthArray.astype(np.uint8)
depth = Image.fromarray(depthArray)
width, height = color.size
borderX = (width - newWidth) / 2
borderY = (height - newHeight) / 2
if borderX <= 1:
randomTranslationX = 0
randomTranslationY = 0
else:
randomTranslationX = randint(-borderX + 1,borderX-1)
randomTranslationY = randint(-borderY + 1,borderY-1)
colorNew = color.crop((borderX + randomTranslationX, borderY + randomTranslationY,width - borderX + randomTranslationX, height - borderY + randomTranslationY))
depthNew = depth.crop((borderX + randomTranslationX, borderY + randomTranslationY,width - borderX + randomTranslationX, height - borderY + randomTranslationY))
colorArray = np.array(colorNew)
colorArray = colorArray.astype(np.float32) / 255.0
colorArray = matplotlib.colors.rgb_to_hsv(colorArray)
randomHueShift = random.uniform(-0.1,0.1)
colorArray[:,:,0] += randomHueShift
colorArray[:,:,0] = np.mod(colorArray[:,:,0], 1.0)
randomSaturationShift = random.uniform(-0.1,0.1)
colorArray[:,:,1] += randomSaturationShift
colorArray[:,:,1] = np.clip(colorArray[:,:,1], 0, 1)
randomValueShift = random.uniform(-0.1,0.1)
colorArray[:,:,2] += randomValueShift
colorArray[:,:,2] = np.clip(colorArray[:,:,2], 0, 1)
colorArray = matplotlib.colors.hsv_to_rgb(colorArray) * 255.0
randomContrastChange = random.uniform(175.0,335.0)
colorArray *= randomContrastChange / 255.0
colorArray -= (randomContrastChange - 255.0) / 2.0
colorArray = np.clip(colorArray, 0, 255.0)
colorArray = colorArray.astype(np.uint8)
colorNew = Image.fromarray(colorArray)
colorNew.save('train_colors2/' + filename + str(i) + '.png')
depthNew.save('train_depths2/' + depthFilename + str(i) + '.png')
colorNewH = colorNew.transpose(PIL.Image.FLIP_LEFT_RIGHT)
depthNewH = depthNew.transpose(PIL.Image.FLIP_LEFT_RIGHT)
colorNewH.save('train_colors2/' + filename + str(i) + 'f.png')
depthNewH.save('train_depths2/' + depthFilename + str(i) + 'f.png')
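The depth rescaling in the scale branch above keeps geometry consistent: enlarging the image by factor s makes objects look as if the camera were s times closer, so stored depth is divided by the same factor. A minimal sketch (the function name is an assumption):

```python
import numpy as np

def rescale_depth(depth, scale):
    # After resizing the image by `scale`, divide stored depth by the same
    # factor so apparent size and depth stay geometrically consistent,
    # clipping back into the 8-bit range as the script does.
    return np.clip(depth.astype(np.float32) / scale, 0.0, 255.0).astype(np.uint8)

d = rescale_depth(np.array([100, 200], dtype=np.uint8), 1.25)
# zooming in by 1.25x brings encoded depths 100 and 200 down to 80 and 160
```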
================================================
FILE: eval_depth.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
import matplotlib.pyplot as plt
import sys
from PIL import Image
import cv2
import cv
import caffe
import operator
import argparse
import os
from scipy import misc
from os.path import basename
def LogDepth(depth):
depth = np.maximum(depth, 1.0 / 255.0)
return 0.179581 * np.log(depth) + 1
def AbsoluteRelativeDifference(output, gt):
gt = np.maximum(gt, 1.0 / 255.0)
diff = np.mean(np.absolute(output - gt) / gt)
return diff
def SquaredRelativeDifference(output, gt):
gt = np.maximum(gt, 1.0 / 255.0)
d = output - gt
diff = np.mean((d * d) / gt)
return diff
def RootMeanSquaredError(output, gt):
d = output - gt
diff = np.sqrt(np.mean(d * d))
return diff
def RootMeanSquaredErrorLog(output, gt):
d = LogDepth(output / 10.0) * 10.0 - LogDepth(gt / 10.0) * 10.0
diff = np.sqrt(np.mean(d * d))
return diff
def ScaleInvariantMeanSquaredError(output, gt):
output = LogDepth(output / 10.0) * 10.0
gt = LogDepth(gt / 10.0) * 10.0
d = output - gt
diff = np.mean(d * d)
relDiff = (d.sum() * d.sum()) / float(d.size * d.size)
return diff - relDiff
def Log10Error(output, gt):
output = np.maximum(output, 1.0 / 255.0)
gt = np.maximum(gt, 1.0 / 255.0)
diff = np.mean(np.absolute(np.log10(output) - np.log10(gt)))
return diff
def MVNError(output, gt):
outMean = np.mean(output)
outStd = np.std(output)
output = (output - outMean)/outStd
gtMean = np.mean(gt)
gtStd = np.std(gt)
gt = (gt - gtMean)/gtStd
d = output - gt
diff = np.sqrt(np.mean(d * d))
return diff
def Threshold(output, gt, threshold):
output = np.maximum(output, 1.0 / 255.0)
gt = np.maximum(gt, 1.0 / 255.0)
withinThresholdCount = np.where(np.maximum(output / gt, gt / output) < threshold)[0].size
return withinThresholdCount / float(gt.size)
def Test(out, gt):
absRelDiff = AbsoluteRelativeDifference(out, gt)
sqrRelDiff = SquaredRelativeDifference(out, gt)
RMSE = RootMeanSquaredError(out, gt)
RMSELog = RootMeanSquaredErrorLog(out, gt)
SIMSE = ScaleInvariantMeanSquaredError(out, gt)
threshold1 = Threshold(out, gt, 1.25)
threshold2 = Threshold(out, gt, 1.25 * 1.25)
threshold3 = Threshold(out, gt, 1.25 * 1.25 * 1.25)
log10 = Log10Error(out, gt)
MVN = MVNError(out, gt)
return [absRelDiff, sqrRelDiff, RMSE, RMSELog, SIMSE, log10, MVN, threshold1, threshold2, threshold3]
def PrintTop5(title, result):
length = min(10, len(result))
print
print
print ("TOP " + str(length) + " for " + title)
for i in xrange(length):
print (str(i) + ". " + result[i][0] + ': ' + str(result[i][1]))
print
print
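The delta-threshold metric above reports the fraction of pixels whose prediction/ground-truth ratio, taken in whichever direction is larger, falls under the threshold (conventionally 1.25, 1.25^2, 1.25^3). A toy check using the same formula as Threshold():

```python
import numpy as np

def threshold_accuracy(output, gt, threshold):
    # Same computation as Threshold() above: fraction of pixels whose
    # max(out/gt, gt/out) ratio is below the threshold.
    output = np.maximum(output, 1.0 / 255.0)
    gt = np.maximum(gt, 1.0 / 255.0)
    ratio = np.maximum(output / gt, gt / output)
    return np.where(ratio < threshold)[0].size / float(gt.size)

# ratios are 1.2 (inside) and 1.3 (outside), so accuracy is 0.5
acc = threshold_accuracy(np.array([1.0, 2.0]), np.array([1.2, 2.6]), 1.25)
```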
================================================
FILE: get_depth.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
from __future__ import print_function
import numpy as np
import matplotlib.pyplot as plt
import sys
from PIL import Image
import cv2
import cv
import os.path
os.environ['GLOG_minloglevel'] = '2'
import caffe
import scipy.ndimage
import argparse
import operator
import shutil
WIDTH = 298
HEIGHT = 218
OUT_WIDTH = 74
OUT_HEIGHT = 54
GT_WIDTH = 420
GT_HEIGHT = 320
def testNet(net, img):
net.blobs['X'].data[...] = img
net.forward()
output = net.blobs['depth-refine'].data
output = np.reshape(output, (1,1,OUT_HEIGHT, OUT_WIDTH))
return output
def loadImage(path, channels, width, height):
img = caffe.io.load_image(path)
img = caffe.io.resize(img, (height, width, channels))
img = np.transpose(img, (2,0,1))
img = np.reshape(img, (1,channels,height,width))
return img
def printImage(img, name, channels, width, height):
params = list()
params.append(cv.CV_IMWRITE_PNG_COMPRESSION)
params.append(8)
imgnp = np.reshape(img, (height,width, channels))
imgnp = np.array(imgnp * 255, dtype = np.uint8)
cv2.imwrite(name, imgnp, params)
def eval(out, gt, rawResults):
linearGT = gt * 10.0
linearOut = out * 10.0
rawResults = [x + y for x, y in zip(rawResults, Test(linearOut, linearGT))]
return rawResults
def ProcessToOutput(depth):
depth = np.clip(depth, 0.001, 1000)
return np.clip(2 * 0.179581 * np.log(depth) + 1, 0, 1)
caffe.set_mode_cpu()
parser = argparse.ArgumentParser()
parser.add_argument("input_dir", help="directory with input images")
parser.add_argument("output", help="folder to output to")
parser.add_argument("snaps", help="folder with snapshots to use")
parser.add_argument('--log', action='store_true', default=False)
args = parser.parse_args()
try:
os.mkdir(args.output + "_abs")
except OSError:
print ('Output directory already exists, not creating a new one')
fileCount = len(os.listdir(args.input_dir))
results = [dict() for x in range(10)]
for snapshot in os.listdir(args.snaps):
if not snapshot.endswith("caffemodel"):
continue
currentSnapDir = snapshot.replace(".caffemodel","")
if os.path.exists(args.output + "_abs/" + currentSnapDir):
shutil.rmtree(args.output + "_abs/" + currentSnapDir)
os.mkdir(args.output + "_abs/" + currentSnapDir)
print(currentSnapDir)
sys.stdout.flush()
netFile = snapshot.replace(".caffemodel",".prototxt")
net = caffe.Net(args.snaps + '/' + netFile, args.snaps + '/' + snapshot, caffe.TEST)
rawResults = np.zeros((10))
for count, file in enumerate(os.listdir(args.input_dir)):
out_string = str(count) + '/' + str(fileCount) + ': ' + file
sys.stdout.write('%s\r' % out_string)
sys.stdout.flush()
inputFileName = file
inputFilePath = args.input_dir + '/' + inputFileName
input = loadImage(inputFilePath, 3, WIDTH, HEIGHT)
input *= 255
input -= 127
output = testNet(net, input)
if args.log:
output = np.exp((output - 1) / 0.179581)
outWidth = OUT_WIDTH
outHeight = OUT_HEIGHT
scaleW = float(GT_WIDTH) / float(OUT_WIDTH)
scaleH = float(GT_HEIGHT) / float(OUT_HEIGHT)
output = scipy.ndimage.zoom(output, (1,1,scaleH,scaleW), order=3)
outWidth = int(outWidth * scaleW)
outHeight = int(outHeight * scaleH)
input += 127
input = input / 255.0
input = np.transpose(input, (0,2,3,1))
input = input[:,:,:,(2,1,0)]
absOutput = output.copy()
output = ProcessToOutput(output)
absOutput = ProcessToOutput(absOutput)
filename = os.path.splitext(os.path.basename(inputFileName))[0]
filePathAbs = args.output + '_abs/' + currentSnapDir + '/' + filename + '.png'
printImage(input, filePathAbs, 3, WIDTH, HEIGHT)
printImage(absOutput, filePathAbs.replace('_colors','_depth'), 1, outWidth, outHeight)
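With --log, the script inverts the log-depth encoding used in eval_depth.py, LogDepth(d) = 0.179581 * ln(d) + 1, via exp((x - 1) / 0.179581); the two are exact inverses on (0, 1] (the constant puts roughly the range (1/255, 1] onto (0, 1]). A round-trip check (helper names are ours):

```python
import numpy as np

def log_depth(d):
    # Forward log-depth encoding, as in eval_depth.py's LogDepth().
    return 0.179581 * np.log(np.maximum(d, 1.0 / 255.0)) + 1.0

def inv_log_depth(x):
    # Inverse applied by get_depth.py when --log is passed.
    return np.exp((x - 1.0) / 0.179581)

roundtrip = inv_log_depth(log_depth(0.5))  # recovers 0.5
```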
================================================
FILE: net_deploy.prototxt
================================================
name: "refining_network_norm_abs_deploy"
#INPUTS
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
#GLOBAL NETWORK STARTS HERE
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 0
lr_mult: 0
}
param {
decay_mult: 0
lr_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 0
lr_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37
}
}
}
layer {
name: "mvnDepth-global"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
#GRADIENT NETWORK STARTS HERE
layer {
name: "conv1-grad"
type: "Convolution"
bottom: "X"
top: "conv1-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-grad"
type: "ReLU"
bottom: "conv1-grad"
top: "conv1-grad"
}
layer {
name: "norm1-grad"
type: "LRN"
bottom: "conv1-grad"
top: "norm1-grad"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-grad"
type: "Pooling"
bottom: "norm1-grad"
top: "pool1-grad"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1-grad"
top: "conv2-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2-grad"
type: "ReLU"
bottom: "conv2-grad"
top: "conv2-grad"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2-grad"
top: "conv3-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-grad"
type: "ReLU"
bottom: "conv3-grad"
top: "conv3-grad"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3-grad"
top: "conv4-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-grad"
type: "ReLU"
bottom: "conv4-grad"
top: "conv4-grad"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4-grad"
top: "grad_out"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "mvnGrad"
type: "MVN"
bottom: "grad_out"
top: "gradMVN"
}
#GRADIENT NETWORK ENDS HERE
#PREPROCESSING FOR THE REFINE
layer {
name: "upsample"
type: "Deconvolution"
bottom: "gradMVN"
top: "grad-upsample"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 3
stride: 1
num_output: 2
group: 2
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
layer {
name: "concat-global"
bottom: "grad-upsample"
bottom: "depthMVN"
top: "global-output"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "upsample-global"
type: "Deconvolution"
bottom: "global-output"
top: "est"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 2
stride: 2
num_output: 3
group: 3
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
#GLOBAL NETWORK ENDS HERE
#REFINE NETWORK HERE
layer {
name: "conv1-refine"
type: "Convolution"
bottom: "X"
top: "conv1-refine"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 2
pad: 2
weight_filler {
type: "constant"
# std: 0.001
value: 0
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-refine"
type: "ReLU"
bottom: "conv1-refine"
top: "conv1-refine"
}
layer {
name: "norm1-refine"
type: "LRN"
bottom: "conv1-refine"
top: "norm1-refine"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-refine"
type: "Pooling"
bottom: "norm1-refine"
top: "pool1-refine"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 1
}
}
layer {
name: "concat"
bottom: "pool1-refine"
bottom: "est"
top: "input-refine"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "conv2-refine"
type: "Convolution"
bottom: "input-refine"
top: "conv2-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu2-refine"
type: "ReLU"
bottom: "conv2-refine"
top: "conv2-refine"
}
layer {
name: "conv3-refine"
type: "Convolution"
bottom: "conv2-refine"
top: "conv3-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-refine"
type: "ReLU"
bottom: "conv3-refine"
top: "conv3-refine"
}
layer {
name: "conv4-refine"
type: "Convolution"
bottom: "conv3-refine"
top: "conv4-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-refine"
type: "ReLU"
bottom: "conv4-refine"
top: "conv4-refine"
}
layer {
name: "conv5-refine"
type: "Convolution"
bottom: "conv4-refine"
top: "depth-refine_"
param {
lr_mult: 0.01
decay_mult: 1
}
param {
lr_mult: 0.01
decay_mult: 0
}
convolution_param {
num_output: 1
pad: 1
kernel_size: 3
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "power-refine"
type: "Power"
bottom: "depth-refine_"
top: "depth-refine"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
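To sanity-check the blob shapes implied by the deploy net above (e.g. conv1: 11x11 kernel, stride 4 on the 298x218 input), Caffe's output-size rules can be sketched: convolution rounds down, pooling rounds up. A small helper (names are ours, not Caffe's):

```python
import math

def conv_out(size, kernel, stride=1, pad=0):
    # Caffe convolution output size: floor((size + 2*pad - kernel) / stride) + 1
    return (size + 2 * pad - kernel) // stride + 1

def pool_out(size, kernel, stride=1, pad=0):
    # Caffe pooling rounds up instead of down
    return int(math.ceil((size + 2 * pad - kernel) / float(stride))) + 1

# conv1 on a 298x218 input: 72x52 feature maps
w1, h1 = conv_out(298, 11, 4), conv_out(218, 11, 4)
# pool1 (3x3, stride 2) then reduces the width 72 to 36
w2 = pool_out(w1, 3, 2)
```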
================================================
FILE: net_train.prototxt
================================================
name: "refining_network_norm_abs_train"
#INPUTS START HERE
#COLOR
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_74x54.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_74x54.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
#INPUTS END HERE
#GLOBAL NETWORK STARTS HERE
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 0
lr_mult: 0
}
param {
decay_mult: 0
lr_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 0
lr_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 outputs from fc-depth
}
}
}
layer {
name: "mvnDepth-global"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
#GRADIENT NETWORK STARTS HERE
layer {
name: "conv1-grad"
type: "Convolution"
bottom: "X"
top: "conv1-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-grad"
type: "ReLU"
bottom: "conv1-grad"
top: "conv1-grad"
}
layer {
name: "norm1-grad"
type: "LRN"
bottom: "conv1-grad"
top: "norm1-grad"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-grad"
type: "Pooling"
bottom: "norm1-grad"
top: "pool1-grad"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1-grad"
top: "conv2-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2-grad"
type: "ReLU"
bottom: "conv2-grad"
top: "conv2-grad"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2-grad"
top: "conv3-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-grad"
type: "ReLU"
bottom: "conv3-grad"
top: "conv3-grad"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3-grad"
top: "conv4-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-grad"
type: "ReLU"
bottom: "conv4-grad"
top: "conv4-grad"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4-grad"
top: "grad_out"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "mvnGrad"
type: "MVN"
bottom: "grad_out"
top: "gradMVN"
}
#GRADIENT NETWORK ENDS HERE
#PREPROCESSING FOR THE REFINE
layer {
name: "upsample"
type: "Deconvolution"
bottom: "gradMVN"
top: "grad-upsample"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 3
stride: 1
num_output: 2
group: 2
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
layer {
name: "concat-global"
bottom: "grad-upsample"
bottom: "depthMVN"
top: "global-output"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "upsample-global"
type: "Deconvolution"
bottom: "global-output"
top: "est"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 2
stride: 2
num_output: 3
group: 3
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
#GLOBAL NETWORK ENDS HERE
#REFINE NETWORK HERE
layer {
name: "conv1-refine"
type: "Convolution"
bottom: "X"
top: "conv1-refine"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 2
pad: 2
weight_filler {
type: "constant"
# std: 0.001
value: 0
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-refine"
type: "ReLU"
bottom: "conv1-refine"
top: "conv1-refine"
}
layer {
name: "norm1-refine"
type: "LRN"
bottom: "conv1-refine"
top: "norm1-refine"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-refine"
type: "Pooling"
bottom: "norm1-refine"
top: "pool1-refine"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 1
}
}
layer {
name: "concat"
bottom: "pool1-refine"
bottom: "est"
top: "input-refine"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "conv2-refine"
type: "Convolution"
bottom: "input-refine"
top: "conv2-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu2-refine"
type: "ReLU"
bottom: "conv2-refine"
top: "conv2-refine"
}
layer {
name: "conv3-refine"
type: "Convolution"
bottom: "conv2-refine"
top: "conv3-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-refine"
type: "ReLU"
bottom: "conv3-refine"
top: "conv3-refine"
}
layer {
name: "conv4-refine"
type: "Convolution"
bottom: "conv3-refine"
top: "conv4-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-refine"
type: "ReLU"
bottom: "conv4-refine"
top: "conv4-refine"
}
layer {
name: "conv5-refine"
type: "Convolution"
bottom: "conv4-refine"
top: "depth-refine_"
param {
lr_mult: 0.01
decay_mult: 1
}
param {
lr_mult: 0.01
decay_mult: 0
}
convolution_param {
num_output: 1
pad: 1
kernel_size: 3
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "power-refine"
type: "Power"
bottom: "depth-refine_"
top: "depth-refine"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
#LOSSES
layer {
name: "mvnDepthRefine"
type: "MVN"
bottom: "depth-refine"
top: "depthMVN-refine"
}
layer {
name: "mvnGT"
type: "MVN"
bottom: "gt"
top: "gtMVN"
}
layer {
name: "lossMVNDepth"
type: "EuclideanLoss"
bottom: "depthMVN-refine"
bottom: "gtMVN"
top: "lossMVNDepth"
loss_weight: 0.5
}
layer {
name: "lossABSDepth"
type: "EuclideanLoss"
bottom: "depth-refine"
bottom: "gt"
top: "lossABSDepth"
loss_weight: 0.5
}
================================================
FILE: solver.prototxt
================================================
net: "net_train.prototxt"
test_iter: 50
test_interval: 500
base_lr: 0.000025
gamma: 0.5
stepsize: 100000
momentum: 0.9
weight_decay: 0.005
lr_policy: "fixed"
display: 100
max_iter: 100000
snapshot: 10000
snapshot_prefix: "snaps/"
solver_mode: GPU
debug_info: 1
================================================
FILE: source/README.txt
================================================
================================================
Structure of this folder
================================================
global_context_network - contains network definitions and example scripts for training and evaluating the global context network from the proposed model
gradient_network - contains network definitions and example scripts for training and evaluating the gradient network from the proposed model
joint - contains network definitions and example scripts for training and evaluating the jointly trained global context and gradient networks from the proposed model
refining_network - contains network definitions and example scripts for training and evaluating the refining network from the proposed model
Each of these folders contains:
-multiple subdirectories, each with a different configuration/loss function of the network. Each subdirectory contains the network definition files for training - 'net_train.prototxt' - and for evaluation - 'net_deploy.prototxt'.
-script 'train.py' for training. Note that this script is just an example and its contents should be modified to fit the desired training process.
-script 'eval_depth.py' or 'eval_grad.py', containing definitions of the error functions used to evaluate performance
-script 'test_depth.py' or 'test_grad.py'. This script is used to evaluate the performance of the network and visualize its output.
-'solver.prototxt' - example of the definition file for the Caffe solver.
================================================
Usage of the 'test_depth.py'/'test_grad.py' scripts:
================================================
python test_depth.py INPUT_DIR GT_DIR OUT_DIR SNAPSHOTS_DIR [--log]
-INPUT_DIR is the path to the folder containing the input images
-GT_DIR is the path to the folder containing the ground truth depth maps
-OUT_DIR is the path to the folder to which the output depth maps will be written
-SNAPSHOTS_DIR is the path to the folder containing the trained network models (.caffemodel files). All models in this folder will be evaluated.
--log switch is used when the depth values produced by the network are in log space
================================================
Frameworks/Libraries needed:
================================================
Caffe
Python2.7:
- caffe, scipy, scikit-image, numpy, pypng, cv2, Pillow, matplotlib
================================================
Few notes
=================================================
-input images should be named in the same way as the corresponding ground truths, except that input images should have the suffix 'colors' while ground truth images should have the suffix 'depth'. Note that these suffixes must precede the file extension, e.g., 'image1_colors.png' and the corresponding depth map 'image1_depth.png'
-along with each .caffemodel file, the corresponding deploy network definition file has to be placed into SNAPSHOTS_DIR, with the same name as the model file but with the extension 'prototxt' instead of 'caffemodel'
-two output folders will actually be created: OUT_DIR, which contains output depths fit onto the ground truth using MVN normalization, and OUT_DIR + '_abs', which contains the raw output depth maps.
-note that you need the AlexNet caffemodel to train the global context network, the gradient network and their joint configuration. It can be downloaded here: https://github.com/BVLC/caffe/tree/master/models/bvlc_alexnet
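The colors/depth naming convention described above can be sketched in a few lines of Python (an illustrative helper only; the function name 'pair_color_with_depth' is hypothetical and not part of this repository):

```python
import os

def pair_color_with_depth(color_filename):
    # Map e.g. 'image1_colors.png' -> 'image1_depth.png' by swapping the
    # suffix that precedes the file extension, as the note above requires.
    root, ext = os.path.splitext(color_filename)
    if not root.endswith('_colors'):
        raise ValueError('expected a *_colors image: %s' % color_filename)
    return root[:-len('colors')] + 'depth' + ext

print(pair_color_with_depth('image1_colors.png'))  # image1_depth.png
```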
================================================
FILE: source/global_context_network/abs/net_deploy.prototxt
================================================
name: "global_context_network_abs_deploy"
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
# CONVOLUTIONAL
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 1
lr_mult: 0.2
}
param {
lr_mult: 0.2
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 outputs from fc-depth
}
}
}
================================================
FILE: source/global_context_network/abs/net_train.prototxt
================================================
name: "global_context_network_abs_train"
#INPUTS
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
# CONVOLUTIONAL
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc-main"
top: "fc-main"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 1
lr_mult: 0.2
}
param {
lr_mult: 0.2
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 outputs from fc-depth
}
}
}
# LOSSES
layer {
name: "lossAbsDepth"
type: "EuclideanLoss"
bottom: "depth"
bottom: "gt"
top: "lossAbsDepth"
loss_weight: 1
}
================================================
FILE: source/global_context_network/eval_depth.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
import matplotlib.pyplot as plt
import sys
from PIL import Image
import cv2
import caffe
import operator
import argparse
import os
from scipy import misc
from os.path import basename
def LogDepth(depth):
depth = np.maximum(depth, 1.0 / 255.0)
return 0.179581 * np.log(depth) + 1
def AbsoluteRelativeDifference(output, gt):
gt = np.maximum(gt, 1.0 / 255.0)
diff = np.mean(np.absolute(output - gt) / gt)
return diff
def SquaredRelativeDifference(output, gt):
gt = np.maximum(gt, 1.0 / 255.0)
d = output - gt
diff = np.mean((d * d) / gt)
return diff
def RootMeanSquaredError(output, gt):
d = output - gt
diff = np.sqrt(np.mean(d * d))
return diff
def RootMeanSquaredErrorLog(output, gt):
d = LogDepth(output / 10.0) * 10.0 - LogDepth(gt / 10.0) * 10.0
diff = np.sqrt(np.mean(d * d))
return diff
def ScaleInvariantMeanSquaredError(output, gt):
output = LogDepth(output / 10.0) * 10.0
gt = LogDepth(gt / 10.0) * 10.0
d = output - gt
diff = np.mean(d * d)
relDiff = (d.sum() * d.sum()) / float(d.size * d.size)
return diff - relDiff
def Log10Error(output, gt):
output = np.maximum(output, 1.0 / 255.0)
gt = np.maximum(gt, 1.0 / 255.0)
diff = np.mean(np.absolute(np.log10(output) - np.log10(gt)))
return diff
def MVNError(output, gt):
outMean = np.mean(output)
outStd = np.std(output)
output = (output - outMean)/outStd
gtMean = np.mean(gt)
gtStd = np.std(gt)
gt = (gt - gtMean)/gtStd
d = output - gt
diff = np.sqrt(np.mean(d * d))
return diff
def Threshold(output, gt, threshold):
output = np.maximum(output, 1.0 / 255.0)
gt = np.maximum(gt, 1.0 / 255.0)
withinThresholdCount = np.where(np.maximum(output / gt, gt / output) < threshold)[0].size
return withinThresholdCount / float(gt.size)
def Test(out, gt):
absRelDiff = AbsoluteRelativeDifference(out, gt)
sqrRelDiff = SquaredRelativeDifference(out, gt)
RMSE = RootMeanSquaredError(out, gt)
RMSELog = RootMeanSquaredErrorLog(out, gt)
SIMSE = ScaleInvariantMeanSquaredError(out, gt)
threshold1 = Threshold(out, gt, 1.25)
threshold2 = Threshold(out, gt, 1.25 * 1.25)
threshold3 = Threshold(out, gt, 1.25 * 1.25 * 1.25)
log10 = Log10Error(out, gt)
MVN = MVNError(out, gt)
return [absRelDiff, sqrRelDiff, RMSE, RMSELog, SIMSE, log10, MVN, threshold1, threshold2, threshold3]
def PrintTop5(title, result):
length = min(10, len(result))
print
print
print ("TOP " + str(length) + " for " + title)
for i in xrange(length):
print (str(i) + ". " + result[i][0] + ': ' + str(result[i][1]))
print
print
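# As a quick sanity check of the metric definitions in this file, the threshold
# accuracy can be re-implemented without numpy and exercised on synthetic values
# (an illustrative sketch; 'threshold_accuracy' is a hypothetical standalone
# name, not part of this module):

```python
def threshold_accuracy(output, gt, threshold):
    # Fraction of values whose ratio max(out/gt, gt/out) falls under the
    # threshold; mirrors Threshold() above, including the 1/255 clamp.
    eps = 1.0 / 255.0
    within = 0
    for o, g in zip(output, gt):
        o, g = max(o, eps), max(g, eps)
        if max(o / g, g / o) < threshold:
            within += 1
    return within / float(len(gt))

gt = [1.0, 2.0, 4.0, 8.0]
print(threshold_accuracy(gt, gt, 1.25))                     # 1.0
print(threshold_accuracy([2.0, 4.0, 8.0, 16.0], gt, 1.25))  # 0.0
```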
================================================
FILE: source/global_context_network/log_abs/net_deploy.prototxt
================================================
name: "global_context_network_log_abs_deploy"
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
# CONVOLUTIONAL
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 1
lr_mult: 0.2
}
param {
lr_mult: 0.2
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 outputs from fc-depth
}
}
}
================================================
FILE: source/global_context_network/log_abs/net_train.prototxt
================================================
name: "global_context_network_log_abs_train"
#INPUTS
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
# CONVOLUTIONAL
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc-main"
top: "fc-main"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 1
lr_mult: 0.2
}
param {
lr_mult: 0.2
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 outputs from fc-depth
}
}
}
# LOSSES
layer {
name: "log"
type: "Log"
bottom: "gt"
top: "lnGt"
log_param {
shift: 0.00392156863
scale: 0.996078431
}
}
layer {
name: "power"
type: "Power"
bottom: "lnGt"
top: "logGt"
power_param {
power: 1
scale: 0.179581
shift: 1.0
}
}
layer {
name: "lossLogAbsDepth"
type: "EuclideanLoss"
bottom: "depth"
bottom: "logGt"
top: "lossLogAbsDepth"
loss_weight: 1
}
================================================
FILE: source/global_context_network/norm/net_deploy.prototxt
================================================
name: "global_context_network_norm_deploy"
#INPUTS
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
# CONVOLUTIONAL
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 1
lr_mult: 0.2
}
param {
lr_mult: 0.2
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 * 37 = 999 fc-depth outputs
}
}
}
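The deploy net maps a 3 x 218 x 298 input to a 27 x 37 depth map, and `fc-depth`'s 999 outputs are exactly 27 * 37. A small sketch (an assumption-checking aid, not part of the thesis code) traces the spatial sizes through the stack using Caffe's documented output-size arithmetic: floor rounding for convolutions, ceil rounding for pooling.

```python
# Sketch: trace spatial sizes through the global context network,
# assuming Caffe's conv (floor) and pool (ceil) output-size formulas.
import math

def conv(h, w, k, s=1, p=0):
    return ((h + 2 * p - k) // s + 1, (w + 2 * p - k) // s + 1)

def pool(h, w, k, s):
    return (int(math.ceil((h - k) / float(s))) + 1,
            int(math.ceil((w - k) / float(s))) + 1)

h, w = 218, 298                 # input blob "X"
h, w = conv(h, w, k=11, s=4)    # conv1 -> 52 x 72
h, w = pool(h, w, k=3, s=2)     # pool1 -> 26 x 36
h, w = conv(h, w, k=5, p=2)     # conv2 (padding keeps the size)
h, w = conv(h, w, k=3, p=1)     # conv3
h, w = conv(h, w, k=3, p=1)     # conv4
h, w = conv(h, w, k=3, p=1)     # conv5
h, w = pool(h, w, k=3, s=2)     # pool5 -> 13 x 18
print(h, w)                     # spatial size feeding fc-main
print(27 * 37)                  # fc-depth output count after reshape
```

The 256 x 13 x 18 `pool5` volume is what `fc-main` flattens, and the `reshape` layer at the end turns the 999-vector back into the 27 x 37 map.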
================================================
FILE: source/global_context_network/norm/net_train.prototxt
================================================
name: "global_context_network_norm_train"
#INPUTS
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
# CONVOLUTIONAL
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc-main"
top: "fc-main"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 1
lr_mult: 0.2
}
param {
lr_mult: 0.2
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 * 37 = 999 fc-depth outputs
}
}
}
# LOSSES
layer {
name: "mvnDepth"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
layer {
name: "mvnGT"
type: "MVN"
bottom: "gt"
top: "gtMVN"
}
layer {
name: "lossMVNDepth"
type: "EuclideanLoss"
bottom: "depthMVN"
bottom: "gtMVN"
top: "lossMVNDepth"
loss_weight: 1
}
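The loss stack above normalizes both the prediction and the ground truth to zero mean and unit variance (MVN) before the Euclidean loss, so only the relative depth structure is penalized. A NumPy stand-in, assuming Caffe's documented MVN and EuclideanLoss semantics (per-sample statistics; loss is sum of squared differences over twice the batch size):

```python
# NumPy sketch of the MVN -> EuclideanLoss stack (not the Caffe code itself).
import numpy as np

def mvn(x, normalize_variance=True):
    # Caffe MVN layer: per-sample mean subtraction, optional std division.
    x = x - x.mean()
    if normalize_variance:
        x = x / (x.std() + 1e-9)
    return x

def euclidean_loss(a, b):
    # Caffe EuclideanLoss: 1/(2N) * sum((a - b)^2), N = batch size.
    return ((a - b) ** 2).sum() / (2.0 * a.shape[0])

rng = np.random.RandomState(0)
depth = rng.rand(1, 1, 27, 37)   # stand-in for the "depth" blob
gt = rng.rand(1, 1, 27, 37)      # stand-in for the "gt" blob
loss = euclidean_loss(mvn(depth), mvn(gt))
```

Because both sides are mean- and variance-normalized, the loss is invariant to any affine rescaling of the predicted depths.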
================================================
FILE: source/global_context_network/sc-inv/net_deploy.prototxt
================================================
name: "global_context_network_sc-inv_deploy"
#INPUTS
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
# CONVOLUTIONAL
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 1
lr_mult: 0.2
}
param {
lr_mult: 0.2
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 * 37 = 999 fc-depth outputs
}
}
}
================================================
FILE: source/global_context_network/sc-inv/net_train.prototxt
================================================
name: "global_context_network_sc-inv_train"
#INPUTS
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
# CONVOLUTIONAL
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc-main"
top: "fc-main"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 1
lr_mult: 0.2
}
param {
lr_mult: 0.2
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 * 37 = 999 fc-depth outputs
}
}
}
# LOSSES
layer {
name: "log"
type: "Log"
bottom: "gt"
top: "lnGt"
log_param {
shift: 0.00392156863
scale: 0.996078431
}
}
layer {
name: "power"
type: "Power"
bottom: "lnGt"
top: "logGt"
power_param {
power: 1
scale: 0.179581
shift: 1.0
}
}
layer {
name: "mnDepth"
type: "MVN"
bottom: "depth"
top: "depthMN"
mvn_param{normalize_variance: false}
}
layer {
name: "mnGT"
type: "MVN"
bottom: "logGt"
top: "gtMN"
mvn_param{normalize_variance: false}
}
layer {
name: "lossMNDepth"
type: "EuclideanLoss"
bottom: "depthMN"
bottom: "gtMN"
top: "lossMNDepth"
loss_weight: 1
}
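In this sc-inv variant the `Log` layer computes ln(0.00392... + 0.996... * gt), i.e. it maps the [0, 1)-scaled ground truth into log space, the `Power` layer (power 1) rescales that to 1 + 0.179581 * ln(...), and MVN with `normalize_variance: false` subtracts only the per-sample mean. Subtracting the mean in log space makes the Euclidean loss invariant to a global multiplicative depth scale, which is the scale-invariant idea. A NumPy sketch, assuming Caffe's documented layer formulas (the random blobs are illustrative only):

```python
# NumPy sketch of the sc-inv loss pipeline: Log -> Power -> mean-only MVN
# -> EuclideanLoss (a stand-in for the Caffe layers above).
import numpy as np

def log_layer(x, shift=0.00392156863, scale=0.996078431):
    # Caffe Log layer with default natural base: y = ln(shift + scale * x)
    return np.log(shift + scale * x)

def power_layer(x, power=1, scale=0.179581, shift=1.0):
    # Caffe Power layer: y = (shift + scale * x) ** power
    return (shift + scale * x) ** power

def mean_center(x):
    # MVN with normalize_variance: false -> subtract the per-sample mean only
    return x - x.mean()

def euclidean_loss(a, b):
    return ((a - b) ** 2).sum() / (2.0 * a.shape[0])

rng = np.random.RandomState(1)
depth = rng.rand(1, 1, 27, 37)        # "depth": the net's (log-scaled) output
gt = rng.rand(1, 1, 27, 37)           # "gt" after the 1/256 data-layer scaling
log_gt = power_layer(log_layer(gt))   # "logGt"
loss = euclidean_loss(mean_center(depth), mean_center(log_gt))
```

Adding any constant offset to `depth` (a multiplicative scale before the log) leaves the loss essentially unchanged, since the mean-centering removes it.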
================================================
FILE: source/global_context_network/solver.prototxt
================================================
net: "net_train.prototxt"
test_iter: 50
test_interval: 500
base_lr: 0.0005
gamma: 0.5       # unused with lr_policy "fixed"
stepsize: 100000 # unused with lr_policy "fixed"
momentum: 0.9
weight_decay: 0.005
lr_policy: "fixed"
display: 100
max_iter: 100000
snapshot: 10000
snapshot_prefix: "snaps/"
solver_mode: GPU
debug_info: 1
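With `lr_policy: "fixed"` the learning rate stays at `base_lr` for all 100000 iterations; `gamma` and `stepsize` would only take effect under the "step" policy, which multiplies the rate by `gamma` every `stepsize` iterations. A minimal sketch of the two schedules named here, following Caffe's documented policy formulas:

```python
# Sketch of Caffe's "fixed" vs. "step" learning-rate policies with this
# solver's values (only "fixed" is active in the config above).
def learning_rate(it, base_lr=0.0005, policy="fixed", gamma=0.5, stepsize=100000):
    if policy == "fixed":
        return base_lr                          # constant rate
    if policy == "step":
        return base_lr * gamma ** (it // stepsize)  # halved every stepsize iters
    raise ValueError("unsupported policy: " + policy)

print(learning_rate(50000))                  # -> 0.0005
print(learning_rate(100000, policy="step"))  # -> 0.00025
```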
================================================
FILE: source/global_context_network/test_depth.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
from __future__ import print_function
import numpy as np
import matplotlib.pyplot as plt
import sys
from PIL import Image
import cv2
import cv
import os.path
os.environ['GLOG_minloglevel'] = '2'
import caffe
import scipy.ndimage
import argparse
import operator
import shutil
from eval_depth import Test, PrintTop5, LogDepth
WIDTH = 298
HEIGHT = 218
OUT_WIDTH = 37
OUT_HEIGHT = 27
GT_WIDTH = 420
GT_HEIGHT = 320
def testNet(net, img):
net.blobs['X'].data[...] = img
net.forward()
output = net.blobs['depth'].data
output = np.reshape(output, (1,1,OUT_HEIGHT, OUT_WIDTH))
return output
def loadImage(path, channels, width, height):
img = caffe.io.load_image(path)
img = caffe.io.resize(img, (height, width, channels))
img = np.transpose(img, (2,0,1))
img = np.reshape(img, (1,channels,height,width))
return img
def printImage(img, name, channels, width, height):
params = list()
params.append(cv.CV_IMWRITE_PNG_COMPRESSION)
params.append(8)
imgnp = np.reshape(img, (height,width, channels))
imgnp = np.array(imgnp * 255, dtype = np.uint8)
cv2.imwrite(name, imgnp, params)
def eval(out, gt, rawResults):
linearGT = gt * 10.0
linearOut = out * 10.0
rawResults = [x + y for x, y in zip(rawResults, Test(linearOut, linearGT))]
return rawResults
def ProcessToOutput(depth):
depth = np.clip(depth, 0.001, 1000)
return np.clip(2 * 0.179581 * np.log(depth) + 1, 0, 1)
caffe.set_mode_cpu()
parser = argparse.ArgumentParser()
parser.add_argument("input_dir", help="directory with input images")
parser.add_argument("gt_dir", help="directory with ground truth files")
parser.add_argument("output", help="folder to output to")
parser.add_argument("snaps", help="folder with snapshots to use")
parser.add_argument('--log', action='store_true', default=False)
args = parser.parse_args()
try:
os.mkdir(args.output)
except OSError:
print ('Output directory already exists, not creating a new one')
try:
os.mkdir(args.output + "_abs")
except OSError:
print ('Output directory already exists, not creating a new one')
fileCount = len([name for name in os.listdir(args.input_dir)])
results = [dict() for x in range(10)]
for snapshot in os.listdir(args.snaps):
if not snapshot.endswith("caffemodel"):
continue
currentSnapDir = snapshot.replace(".caffemodel","")
if os.path.exists(args.output + "/" + currentSnapDir):
shutil.rmtree(args.output + "/" + currentSnapDir)
if os.path.exists(args.output + "_abs/" + currentSnapDir):
shutil.rmtree(args.output + "_abs/" + currentSnapDir)
os.mkdir(args.output + "/" + currentSnapDir)
os.mkdir(args.output + "_abs/" + currentSnapDir)
print(currentSnapDir)
sys.stdout.flush()
netFile = snapshot.replace(".caffemodel",".prototxt")
net = caffe.Net(args.snaps + '/' + netFile, args.snaps + '/' + snapshot, caffe.TEST)
rawResults = np.zeros((10))
for count, file in enumerate(os.listdir(args.input_dir)):
out_string = str(count) + '/' + str(fileCount) + ': ' + file
sys.stdout.write('%s\r' % out_string)
sys.stdout.flush()
inputFileName = file
inputFilePath = args.input_dir + '/' + inputFileName
gtFileName = file.replace('colors', 'depth')
gtFilePath = args.gt_dir + '/' + gtFileName
gt = loadImage(gtFilePath, 1, GT_WIDTH, GT_HEIGHT)
input = loadImage(inputFilePath, 3, WIDTH, HEIGHT)
input *= 255
input -= 127
output = testNet(net, input)
if args.log:
output = np.exp((output - 1) / 0.179581)
outWidth = OUT_WIDTH
outHeight = OUT_HEIGHT
scaleW = float(GT_WIDTH) / float(OUT_WIDTH)
scaleH = float(GT_HEIGHT) / float(OUT_HEIGHT)
output = scipy.ndimage.zoom(output, (1,1,scaleH,scaleW), order=3)
outWidth = int(round(outWidth * scaleW))
outHeight = int(round(outHeight * scaleH))
rawResults = eval(output, gt, rawResults)
input += 127
input = input / 255.0
input = np.transpose(input, (0,2,3,1))
input = input[:,:,:,(2,1,0)]
absOutput = output.copy()
output -= output.mean()
output /= output.std()
output *= gt.std()
output += gt.mean()
gt = ProcessToOutput(gt)
output = ProcessToOutput(output)
absOutput = ProcessToOutput(absOutput)
filename = os.path.splitext(os.path.basename(inputFileName))[0]
filePath = args.output + '/' + currentSnapDir + '/' + filename + '.png'
filePathAbs = args.output + '_abs/' + currentSnapDir + '/' + filename + '.png'
printImage(input, filePath, 3, WIDTH, HEIGHT)
printImage(input, filePathAbs, 3, WIDTH, HEIGHT)
printImage(output, filePath.replace('_colors','_depth'), 1, outWidth, outHeight)
printImage(absOutput, filePathAbs.replace('_colors','_depth'), 1, outWidth, outHeight)
printImage(gt, filePath.replace('_colors', '_gt'), 1, outWidth, outHeight)
printImage(gt, filePathAbs.replace('_colors', '_gt'), 1, outWidth, outHeight)
rawResults[:] = [x / fileCount for x in rawResults]
for i in xrange(10):
results[i][currentSnapDir] = rawResults[i]
titles = ["AbsRelDiff", "SqrRelDiff", "RMSE", "RMSELog", "SIMSE", "Log10", "MVN", "Threshold 1.25","Threshold 1.25^2", "Threshold 1.25^3"]
for i in xrange(10):
results[i] = sorted(results[i].items(), key=operator.itemgetter(1))
if i > 6:
results[i] = list(reversed(results[i]))
PrintTop5(titles[i], results[i])
================================================
FILE: source/global_context_network/train.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
import cv2
import cv
import caffe
from caffe.proto import caffe_pb2
import sys
from google.protobuf import text_format
import argparse
caffe.set_mode_gpu()
solver = caffe.get_solver('solver.prototxt')
solver.net.copy_from('bvlc_alexnet.caffemodel')
solver.solve()
================================================
FILE: source/gradient_network/abs/net_deploy.prototxt
================================================
name: "gradient_network_abs_deploy"
#INPUTS
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
# NET ITSELF
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2"
top: "conv3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4"
top: "grad"
param {
lr_mult: 0.1
decay_mult: 1
}
param {
lr_mult: 0.1
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.002
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "power"
bottom: "grad"
top: "gradient"
type: "Power"
power_param {
power: 1
scale: 0.02
shift: 0
}
}
================================================
FILE: source/gradient_network/abs/net_train.prototxt
================================================
name: "gradient_network_abs_train"
#INPUTS
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
# NET ITSELF
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2"
top: "conv3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "drop"
type: "Dropout"
bottom: "conv4"
top: "conv4"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4"
top: "grad"
param {
lr_mult: 0.1
decay_mult: 1
}
param {
lr_mult: 0.1
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.002
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "power"
bottom: "grad"
top: "grad_out"
type: "Power"
power_param {
power: 1
scale: 0.02
shift: 0
}
}
#LOSS
layer {
name: "gradientFilter"
type: "Convolution"
bottom: "gt"
top: "gtGrad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 0
stride: 1
kernel_size: 3
# filler values are placeholders; with lr_mult: 0 the fixed gradient
# kernels are expected to be set externally (cf. filter.prototxt)
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "lossGradLogAbs"
type: "EuclideanLoss"
bottom: "gtGrad"
bottom: "grad_out"
top: "lossGrad"
loss_weight: 1
}
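The loss compares `grad_out` (the 2-channel output of the conv stack) against `gtGrad` (the 27 x 37 ground truth pushed through the 3 x 3 unpadded gradient filter). Both branches come out at 2 x 25 x 35, which is what makes the EuclideanLoss well-defined. A sketch verifying the sizes, assuming Caffe's conv (floor) and pool (ceil) output arithmetic:

```python
# Sketch: check that the network branch and the ground-truth branch of the
# gradient loss produce matching spatial sizes (Caffe size formulas assumed).
import math

def conv(h, w, k, s=1, p=0):
    return ((h + 2 * p - k) // s + 1, (w + 2 * p - k) // s + 1)

def pool(h, w, k, s):
    return (int(math.ceil((h - k) / float(s))) + 1,
            int(math.ceil((w - k) / float(s))) + 1)

# Network branch: color input 218 x 298
h, w = conv(218, 298, k=11, s=4)   # conv1 -> 52 x 72
h, w = pool(h, w, k=4, s=2)        # pool1 -> 25 x 35
h, w = conv(h, w, k=5, p=2)        # conv2-grad (size-preserving)
h, w = conv(h, w, k=5, p=2)        # conv3-grad
h, w = conv(h, w, k=5, p=2)        # conv4-grad
h, w = conv(h, w, k=5, p=2)        # conv5-grad -> "grad_out"

# Ground-truth branch: 27 x 37 depth through the unpadded 3x3 filter
gh, gw = conv(27, 37, k=3)         # "gtGrad"
print((h, w), (gh, gw))
```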
================================================
FILE: source/gradient_network/eval_grad.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
import matplotlib.pyplot as plt
import sys
from PIL import Image
import cv2
import cv
import caffe
import operator
import argparse
import os
from scipy import misc
from os.path import basename
def RootMeanSquaredError(output, gt):
d = output - gt
diff = np.sqrt(np.mean(d * d))
return diff
def MVNError(output, gt):
outMean = np.mean(output)
outStd = np.std(output)
output = (output - outMean)/outStd
gtMean = np.mean(gt)
gtStd = np.std(gt)
gt = (gt - gtMean)/gtStd
d = output - gt
diff = np.sqrt(np.mean(d * d))
return diff
def Test(out, gt):
RMSE = RootMeanSquaredError(out, gt)
MVN = MVNError(out, gt)
return [RMSE, MVN]
def PrintTop5(title, result):
length = min(10, len(result))
print
print
print ("TOP " + str(length) + " for " + title)
for i in xrange(length):
print (str(i) + ". " + result[i][0] + ': ' + str(result[i][1]))
print
print
================================================
FILE: source/gradient_network/filter.prototxt
================================================
name: "GradientFilter"
input: "X"
input_shape {
dim: 1
dim: 1
dim: 320
dim: 420
}
layer {
name: "gradientFilter"
type: "Convolution"
bottom: "X"
top: "out"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
convolution_param {
num_output: 2
pad: 0
kernel_size: 3
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0
}
}
}
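filter.prototxt only declares a fixed (`lr_mult: 0`) two-output 3 x 3 convolution; the actual derivative kernels are not stored in the prototxt and have to be written into the net's weights at runtime. A hedged NumPy sketch with hypothetical central-difference kernels (the kernel values are assumptions, not the thesis's actual filters), reproducing the layer's unpadded, stride-1 cross-correlation:

```python
import numpy as np

# Two 3x3 derivative kernels (horizontal and vertical central differences);
# hypothetical values -- the real kernels are loaded by the training scripts.
kx = np.array([[0, 0, 0], [-0.5, 0, 0.5], [0, 0, 0]], dtype=np.float32)
ky = kx.T

def valid_corr(img, k):
    # Unpadded (pad: 0, stride: 1) cross-correlation, as the conv layer computes.
    h, w = img.shape
    out = np.zeros((h - 2, w - 2), dtype=np.float32)
    for di in range(3):
        for dj in range(3):
            out += k[di, dj] * img[di:di + h - 2, dj:dj + w - 2]
    return out

depth = np.random.rand(320, 420).astype(np.float32)               # input "X"
grads = np.stack([valid_corr(depth, kx), valid_corr(depth, ky)])  # "out"
```

With the 320 x 420 input shape declared above, the 3 x 3 valid correlation yields a 2 x 318 x 418 gradient blob.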
================================================
FILE: source/gradient_network/norm/net_deploy.prototxt
================================================
name: "gradient_network_norm_deploy"
#INPUTS
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
# NET ITSELF
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2"
top: "conv3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4"
top: "grad"
param {
lr_mult: 0.1
decay_mult: 1
}
param {
lr_mult: 0.1
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "power"
bottom: "grad"
top: "gradient"
type: "Power"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
================================================
FILE: source/gradient_network/norm/net_train.prototxt
================================================
name: "gradient_network_norm_train"
#INPUTS
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
# NET ITSELF
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2"
top: "conv3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "drop"
type: "Dropout"
bottom: "conv4"
top: "conv4"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4"
top: "grad"
param {
lr_mult: 0.1
decay_mult: 1
}
param {
lr_mult: 0.1
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "power"
bottom: "grad"
top: "grad_out"
type: "Power"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
#LOSS
layer {
name: "gradientFilter"
type: "Convolution"
bottom: "gt"
top: "gtGrad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 0
stride: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "gradMVN"
type: "MVN"
bottom: "grad_out"
top: "grad_outMVN"
}
layer {
name: "gtMVN"
type: "MVN"
bottom: "gtGrad"
top: "gtGradMVN"
}
layer {
name: "lossGradMVN"
type: "EuclideanLoss"
bottom: "grad_outMVN"
bottom: "gtGradMVN"
top: "lossGrad"
loss_weight: 1
}
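The loss section above normalizes both predicted and ground-truth gradients with MVN before the Euclidean loss, which makes the loss invariant to scale and offset. A rough numpy sketch of what the three layers compute, assuming per-sample normalization (Caffe's MVN layer actually normalizes each channel separately by default):

```python
import numpy as np

def mvn(x):
    # MVN layer: subtract mean, divide by standard deviation
    return (x - x.mean()) / (x.std() + 1e-9)

def euclidean_loss(pred, gt):
    # EuclideanLoss: sum of squared differences / (2 * batch size)
    d = pred - gt
    return (d * d).sum() / (2.0 * pred.shape[0])

rng = np.random.RandomState(0)
pred = rng.rand(32, 2, 25, 35)   # grad_out: 2 gradient channels
gt = rng.rand(32, 2, 25, 35)     # gtGrad
loss = euclidean_loss(np.stack([mvn(p) for p in pred]),
                      np.stack([mvn(g) for g in gt]))
assert loss >= 0.0
```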
================================================
FILE: source/gradient_network/solver.prototxt
================================================
net: "net_train.prototxt"
test_iter: 50
test_interval: 500
base_lr: 0.00025
gamma: 0.5       # note: ignored, lr_policy "fixed" uses a constant base_lr
stepsize: 100000 # note: ignored with lr_policy "fixed"
momentum: 0.9
weight_decay: 0.005
lr_policy: "fixed"
display: 100
max_iter: 100000
snapshot: 10000
snapshot_prefix: "snaps/"
solver_mode: GPU
debug_info: 1
================================================
FILE: source/gradient_network/test_grad.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
from __future__ import print_function
import numpy as np
import matplotlib.pyplot as plt
import sys
from PIL import Image
import cv2
import cv
import os.path
os.environ['GLOG_minloglevel'] = '2'
import caffe
import scipy.ndimage
import argparse
import operator
import shutil
from eval_grad import Test, PrintTop5
filter = np.zeros((1,3,3))
filter[0,0,:] = (-1,-1,-1)
filter[0,1,:] = (0,0,0)
filter[0,2,:] = (1,1,1)
filter2 = np.zeros((1,3,3))
filter2[0,0,:] = (-1,0,1)
filter2[0,1,:] = (-1,0,1)
filter2[0,2,:] = (-1,0,1)
WIDTH = 298
HEIGHT = 218
OUT_WIDTH = 35
OUT_HEIGHT = 25
GT_WIDTH = 418
GT_HEIGHT = 318
def filterImage(net, gt):
net.blobs['X'].data[...] = gt
net.forward()
return (net.blobs['out'].data[0,0,:,:], net.blobs['out'].data[0,1,:,:])
def testNet(net, img):
net.blobs['X'].data[...] = img
net.forward()
output = net.blobs['gradient'].data
output = np.reshape(output, (1,2,OUT_HEIGHT, OUT_WIDTH))
out1 = output[0,0,:,:]
out2 = output[0,1,:,:]
return out1, out2
def loadImage(path, channels, width, height):
img = caffe.io.load_image(path)
img = caffe.io.resize(img, (height, width, channels))
img = np.transpose(img, (2,0,1))
img = np.reshape(img, (1,channels,height,width))
return img
def printImage(img, name, channels, width, height):
params = [cv2.IMWRITE_PNG_COMPRESSION, 8]  # cv2 constant (was legacy cv.CV_IMWRITE_PNG_COMPRESSION)
imgnp = np.reshape(img, (height,width, channels))
imgnp = np.array(imgnp * 255, dtype = np.uint8)
cv2.imwrite(name, imgnp, params)
def eval(out, gt, rawResults):
linearGT = gt   # depth linearization disabled; gradients are compared directly
linearOut = out
#RAW PIXEL TESTS
rawResults = [x + y for x, y in zip(rawResults, Test(linearOut, linearGT))]
return rawResults
caffe.set_mode_cpu()
parser = argparse.ArgumentParser()
parser.add_argument("input_dir", help="directory with input images")
parser.add_argument("gt_dir", help="directory with ground truths")
parser.add_argument("output", help="folder to output to")
parser.add_argument("snaps", help="folder with snapshots to use")
args = parser.parse_args()
gradNet = caffe.Net("filter.prototxt", caffe.TEST)
gradNet.params['gradientFilter'][0].data[0,...] = filter
gradNet.params['gradientFilter'][0].data[1,...] = filter2
try:
os.mkdir(args.output)
except OSError:
pass  # output directory already exists
fileCount = len([name for name in os.listdir(args.input_dir)])
results = [dict() for x in range(2)]
for snapshot in os.listdir(args.snaps):
if not snapshot.endswith("caffemodel"):
continue
currentSnapDir = snapshot.replace(".caffemodel","")
if os.path.exists(args.output + "/" + currentSnapDir):
shutil.rmtree(args.output + "/" + currentSnapDir)
os.mkdir(args.output + "/" + currentSnapDir)
print(currentSnapDir)
sys.stdout.flush()
netFile = snapshot.replace(".caffemodel",".prototxt")
net = caffe.Net(args.snaps + '/' + netFile, args.snaps + '/' + snapshot, caffe.TEST)
rawResults = np.zeros((2))
for count, file in enumerate(os.listdir(args.input_dir)):
out_string = str(count) + '/' + str(fileCount) + ': ' + file
sys.stdout.write('%s\r' % out_string)
sys.stdout.flush()
inputFileName = file
inputFilePath = args.input_dir + '/' + inputFileName
gtFileName = file.replace('colors', 'depth')
gtFilePath = args.gt_dir + '/' + gtFileName
gt = loadImage(gtFilePath, 1, GT_WIDTH + 2, GT_HEIGHT + 2)
gt1, gt2 = filterImage(gradNet, gt)
gt1 = np.reshape(gt1, (1,1,GT_HEIGHT, GT_WIDTH))
gt2 = np.reshape(gt2, (1,1,GT_HEIGHT, GT_WIDTH))
input = loadImage(inputFilePath, 3, WIDTH, HEIGHT)
input *= 255
input -= 127
out1, out2 = testNet(net, input)
outWidth = OUT_WIDTH
outHeight = OUT_HEIGHT
scaleW = float(GT_WIDTH) / float(OUT_WIDTH)
scaleH = float(GT_HEIGHT) / float(OUT_HEIGHT)
out1 = scipy.ndimage.zoom(out1, (scaleH,scaleW), order=3)
out2 = scipy.ndimage.zoom(out2, (scaleH,scaleW), order=3)
outWidth = int(outWidth * scaleW)    # keep integer dims for printImage's reshape
outHeight = int(outHeight * scaleH)
rawResults = eval(out1, gt1, rawResults)
rawResults = eval(out2, gt2, rawResults)
gt1 = (gt1 - gt1.min())/(gt1.max() - gt1.min())
gt2 = (gt2 - gt2.min())/(gt2.max() - gt2.min())
out1 -= out1.mean()
out1 /= out1.std()
out1 *= gt1.std()
out1 += gt1.mean()
out2 -= out2.mean()
out2 /= out2.std()
out2 *= gt2.std()
out2 += gt2.mean()
input += 127
input = input / 255.0
input = np.transpose(input, (0,2,3,1))
input = input[:,:,:,(2,1,0)]
gt1 = np.clip(gt1, 0, 1)
gt2 = np.clip(gt2, 0, 1)
out1 = np.clip(out1, 0, 1)
out2 = np.clip(out2, 0, 1)
filename = os.path.splitext(os.path.basename(inputFileName))[0]
filePath = args.output + '/' + currentSnapDir + '/' + filename + '.png'
printImage(input, filePath, 3, WIDTH, HEIGHT)
printImage(out1, filePath.replace('_colors','_grad1'), 1, outWidth, outHeight)
printImage(out2, filePath.replace('_colors','_grad2'), 1, outWidth, outHeight)
printImage(gt1, filePath.replace('_colors', '_gt1'), 1, outWidth, outHeight)
printImage(gt2, filePath.replace('_colors', '_gt2'), 1, outWidth, outHeight)
rawResults[:] = [x / (fileCount * 2.0) for x in rawResults]
for i in range(2):
results[i][currentSnapDir] = rawResults[i]
titles = ["RMSE", "MVN"]
for i in range(2):
results[i] = sorted(results[i].items(), key=operator.itemgetter(1))
PrintTop5(titles[i], results[i])
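Test and PrintTop5 are imported from eval_grad, which is not shown in this chunk. Assuming Test returns the (RMSE, MVN) pair named in titles above, the two error measures are roughly:

```python
import numpy as np

def rmse(out, gt):
    # root-mean-squared error on raw values
    d = out - gt
    return np.sqrt(np.mean(d * d))

def mvn_rmse(out, gt):
    # RMSE after mean-variance normalization of both maps,
    # i.e. invariant to affine rescaling of either input
    norm = lambda x: (x - x.mean()) / x.std()
    return rmse(norm(out), norm(gt))

a = np.random.RandomState(0).rand(25, 35)
assert rmse(a, a) == 0.0
assert mvn_rmse(2.0 * a + 3.0, a) < 1e-6
```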
================================================
FILE: source/gradient_network/train.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
import cv2
import caffe
from caffe.proto import caffe_pb2
import sys
from google.protobuf import text_format
import argparse
caffe.set_mode_gpu()
solver = caffe.get_solver('solver.prototxt')
solver.net.copy_from('bvlc_alexnet.caffemodel')
filter = np.zeros((1,3,3))
filter[0,0,:] = (-1,-1,-1)
filter[0,1,:] = (0,0,0)
filter[0,2,:] = (1,1,1)
filter2 = np.zeros((1,3,3))
filter2[0,0,:] = (-1,0,1)
filter2[0,1,:] = (-1,0,1)
filter2[0,2,:] = (-1,0,1)
solver.net.params['gradientFilter'][0].data[0,...] = filter
solver.net.params['gradientFilter'][0].data[1,...] = filter2
solver.solve()
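train.py overwrites the gradientFilter weights with fixed Prewitt-style kernels, so gtGrad becomes the vertical and horizontal gradients of the ground-truth depth. A numpy check of the two kernels, using a hand-rolled 'valid' cross-correlation (which is what Caffe's Convolution layer computes):

```python
import numpy as np

ky = np.array([[-1, -1, -1],
               [ 0,  0,  0],
               [ 1,  1,  1]], dtype=float)  # vertical gradient (d/dy)
kx = np.array([[-1, 0, 1],
               [-1, 0, 1],
               [-1, 0, 1]], dtype=float)    # horizontal gradient (d/dx)

def corr2d(img, k):
    # 'valid' 3x3 cross-correlation, as Caffe's Convolution computes
    H, W = img.shape
    out = np.zeros((H - 2, W - 2))
    for i in range(H - 2):
        for j in range(W - 2):
            out[i, j] = (img[i:i + 3, j:j + 3] * k).sum()
    return out

# A ramp increasing left-to-right has a purely horizontal gradient.
ramp = np.tile(np.arange(8, dtype=float), (8, 1))
assert np.allclose(corr2d(ramp, ky), 0.0)
assert np.allclose(corr2d(ramp, kx), 6.0)  # 3 rows x central difference of 2
```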
================================================
FILE: source/joint/architecture_A/net_deploy.prototxt
================================================
name: "joint_A_deploy"
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
# GLOBAL
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
# GRADIENT
layer {
name: "conv1_g"
type: "Convolution"
bottom: "X"
top: "conv1_g"
param {
lr_mult: 0.0005
decay_mult: 1
}
param {
lr_mult: 0.0005
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "constant"
value: 0.00
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1_g"
type: "ReLU"
bottom: "conv1_g"
top: "conv1_g"
}
layer {
name: "norm1_g"
type: "LRN"
bottom: "conv1_g"
top: "norm1_g"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1_g"
type: "Pooling"
bottom: "norm1_g"
top: "pool1_g"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1_g"
top: "conv2_g"
param {
lr_mult: 0.5
decay_mult: 1
}
param {
lr_mult: 0.5
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2_g"
type: "ReLU"
bottom: "conv2_g"
top: "conv2_g"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2_g"
top: "conv3_g"
param {
lr_mult: 0.5
decay_mult: 1
}
param {
lr_mult: 0.5
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3_g"
type: "ReLU"
bottom: "conv3_g"
top: "conv3_g"
}
# JOINT PART
layer {
name: "concat"
bottom: "conv3"
bottom: "conv3_g"
top: "joint"
type: "Concat"
concat_param {
axis: 1
}
}
# AFTER JOINT GLOBAL
layer {
name: "conv4_joint"
type: "Convolution"
bottom: "joint"
top: "conv4"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 1
lr_mult: 0.2
}
param {
lr_mult: 0.2
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 = fc-depth outputs
}
}
}
# AFTER JOINT GRAD
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "joint"
top: "conv4_g"
param {
lr_mult: 0.5
decay_mult: 1
}
param {
lr_mult: 0.5
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4_g"
type: "ReLU"
bottom: "conv4_g"
top: "conv4_g"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4_g"
top: "gradient"
param {
lr_mult: 0.05
decay_mult: 1
}
param {
lr_mult: 0.05
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.0001
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
================================================
FILE: source/joint/architecture_A/net_train.prototxt
================================================
name: "joint_A_train"
#INPUTS
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
# GLOBAL
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
# GRADIENT
layer {
name: "conv1_g"
type: "Convolution"
bottom: "X"
top: "conv1_g"
param {
lr_mult: 0.0005
decay_mult: 1
}
param {
lr_mult: 0.0005
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "constant"
value: 0.00
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1_g"
type: "ReLU"
bottom: "conv1_g"
top: "conv1_g"
}
layer {
name: "norm1_g"
type: "LRN"
bottom: "conv1_g"
top: "norm1_g"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1_g"
type: "Pooling"
bottom: "norm1_g"
top: "pool1_g"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1_g"
top: "conv2_g"
param {
lr_mult: 0.5
decay_mult: 1
}
param {
lr_mult: 0.5
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2_g"
type: "ReLU"
bottom: "conv2_g"
top: "conv2_g"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2_g"
top: "conv3_g"
param {
lr_mult: 0.5
decay_mult: 1
}
param {
lr_mult: 0.5
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3_g"
type: "ReLU"
bottom: "conv3_g"
top: "conv3_g"
}
# JOINT PART
layer {
name: "concat"
bottom: "conv3"
bottom: "conv3_g"
top: "joint"
type: "Concat"
concat_param {
axis: 1
}
}
# AFTER JOINT GLOBAL
layer {
name: "conv4_joint"
type: "Convolution"
bottom: "joint"
top: "conv4"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc-main"
top: "fc-main"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 1
lr_mult: 0.2
}
param {
lr_mult: 0.2
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 = fc-depth outputs
}
}
}
# AFTER JOINT GRAD
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "joint"
top: "conv4_g"
param {
lr_mult: 0.5
decay_mult: 1
}
param {
lr_mult: 0.5
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4_g"
type: "ReLU"
bottom: "conv4_g"
top: "conv4_g"
}
layer {
name: "drop_g"
type: "Dropout"
bottom: "conv4_g"
top: "conv4_g"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4_g"
top: "grad"
param {
lr_mult: 0.05
decay_mult: 1
}
param {
lr_mult: 0.05
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.0001
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
# LOSSES
layer {
name: "mvnDepth"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
layer {
name: "mvnGT"
type: "MVN"
bottom: "gt"
top: "gtMVN"
}
layer {
name: "lossMVNDepth"
type: "EuclideanLoss"
bottom: "depthMVN"
bottom: "gtMVN"
top: "lossMVNDepth"
loss_weight: 1
}
layer {
name: "gradientFilter"
type: "Convolution"
bottom: "gt"
top: "gtGrad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 0
stride: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "gradMVN"
type: "MVN"
bottom: "grad"
top: "grad_outMVN"
}
layer {
name: "gtMVN"
type: "MVN"
bottom: "gtGrad"
top: "gtGradMVN"
}
layer {
name: "lossGradMVN"
type: "EuclideanLoss"
bottom: "grad_outMVN"
bottom: "gtGradMVN"
top: "lossGrad"
loss_weight: 1
}
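In both joint architectures fc-depth emits 999 values per image, and the Reshape layer views them as a 1-channel 27x37 coarse depth map. A quick numpy check of the shape arithmetic:

```python
import numpy as np

batch = 32
fc_depth = np.zeros((batch, 999), dtype=np.float32)  # fc-depth output
# dim: 0 keeps the batch axis; 999 values become a 1 x 27 x 37 map
depth = fc_depth.reshape(batch, 1, 27, 37)
assert 27 * 37 == 999
assert depth.shape == (32, 1, 27, 37)
```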
================================================
FILE: source/joint/architecture_B/net_deploy.prototxt
================================================
name: "joint_B_deploy"
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
# JOINT
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
# AFTER JOINT GLOBAL
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 1
lr_mult: 0.2
}
param {
lr_mult: 0.2
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 = fc-depth outputs
}
}
}
# AFTER JOINT GRAD
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3"
top: "conv4_g"
param {
lr_mult: 0.5
decay_mult: 1
}
param {
lr_mult: 0.5
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4_g"
type: "ReLU"
bottom: "conv4_g"
top: "conv4_g"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4_g"
top: "gradient"
param {
lr_mult: 0.05
decay_mult: 1
}
param {
lr_mult: 0.05
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.0001
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
================================================
FILE: source/joint/architecture_B/net_train.prototxt
================================================
name: "joint_B_train"
#INPUTS
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_37x27.lmdb"
backend: LMDB
batch_size: 32
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
# JOINT
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
# AFTER JOINT GLOBAL
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0.02
decay_mult: 1
}
param {
lr_mult: 0.02
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 1
}
param {
decay_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "drop6"
type: "Dropout"
bottom: "fc-main"
top: "fc-main"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 1
lr_mult: 0.2
}
param {
lr_mult: 0.2
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 = fc-depth outputs
}
}
}
# AFTER JOINT GRAD
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3"
top: "conv4_g"
param {
lr_mult: 0.5
decay_mult: 1
}
param {
lr_mult: 0.5
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4_g"
type: "ReLU"
bottom: "conv4_g"
top: "conv4_g"
}
layer {
name: "drop_g"
type: "Dropout"
bottom: "conv4_g"
top: "conv4_g"
dropout_param {
dropout_ratio: 0.5
}
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4_g"
top: "grad"
param {
lr_mult: 0.05
decay_mult: 1
}
param {
lr_mult: 0.05
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.0001
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
# LOSSES
layer {
name: "mvnDepth"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
layer {
name: "mvnGT"
type: "MVN"
bottom: "gt"
top: "gtMVN"
}
layer {
name: "lossMVNDepth"
type: "EuclideanLoss"
bottom: "depthMVN"
bottom: "gtMVN"
top: "lossMVNDepth"
loss_weight: 1
}
layer {
name: "gradientFilter"
type: "Convolution"
bottom: "gt"
top: "gtGrad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 0
stride: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "gradMVN"
type: "MVN"
bottom: "grad"
top: "grad_outMVN"
}
layer {
name: "gtMVN"
type: "MVN"
bottom: "gtGrad"
top: "gtGradMVN"
}
layer {
name: "lossGradMVN"
type: "EuclideanLoss"
bottom: "grad_outMVN"
bottom: "gtGradMVN"
top: "lossGrad"
loss_weight: 1
}
================================================
FILE: source/joint/eval_depth.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
def LogDepth(depth):
depth = np.maximum(depth, 1.0 / 255.0)
return 0.179581 * np.log(depth) + 1
def AbsoluteRelativeDifference(output, gt):
gt = np.maximum(gt, 1.0 / 255.0)
diff = np.mean(np.absolute(output - gt) / gt)
return diff
def SquaredRelativeDifference(output, gt):
gt = np.maximum(gt, 1.0 / 255.0)
d = output - gt
diff = np.mean((d * d) / gt)
return diff
def RootMeanSquaredError(output, gt):
d = output - gt
diff = np.sqrt(np.mean(d * d))
return diff
def RootMeanSquaredErrorLog(output, gt):
d = LogDepth(output / 10.0) * 10.0 - LogDepth(gt / 10.0) * 10.0
diff = np.sqrt(np.mean(d * d))
return diff
def ScaleInvariantMeanSquaredError(output, gt):
output = LogDepth(output / 10.0) * 10.0
gt = LogDepth(gt / 10.0) * 10.0
d = output - gt
diff = np.mean(d * d)
relDiff = (d.sum() * d.sum()) / float(d.size * d.size)
return diff - relDiff
def Log10Error(output, gt):
output = np.maximum(output, 1.0 / 255.0)
gt = np.maximum(gt, 1.0 / 255.0)
diff = np.mean(np.absolute(np.log10(output) - np.log10(gt)))
return diff
def MVNError(output, gt):
outMean = np.mean(output)
outStd = np.std(output)
output = (output - outMean)/outStd
gtMean = np.mean(gt)
gtStd = np.std(gt)
gt = (gt - gtMean)/gtStd
d = output - gt
diff = np.sqrt(np.mean(d * d))
return diff
def Threshold(output, gt, threshold):
output = np.maximum(output, 1.0 / 255.0)
gt = np.maximum(gt, 1.0 / 255.0)
withinThresholdCount = np.where(np.maximum(output / gt, gt / output) < threshold)[0].size
return withinThresholdCount / float(gt.size)
def Test(out, gt):
absRelDiff = AbsoluteRelativeDifference(out, gt)
sqrRelDiff = SquaredRelativeDifference(out, gt)
RMSE = RootMeanSquaredError(out, gt)
RMSELog = RootMeanSquaredErrorLog(out, gt)
SIMSE = ScaleInvariantMeanSquaredError(out, gt)
threshold1 = Threshold(out, gt, 1.25)
threshold2 = Threshold(out, gt, 1.25 * 1.25)
threshold3 = Threshold(out, gt, 1.25 * 1.25 * 1.25)
log10 = Log10Error(out, gt)
MVN = MVNError(out, gt)
return [absRelDiff, sqrRelDiff, RMSE, RMSELog, SIMSE, log10, MVN, threshold1, threshold2, threshold3]
def PrintTop5(title, result):
# despite the name, prints up to the top 10 entries
length = min(10, len(result))
print
print
print ("TOP " + str(length) + " for " + title)
for i in xrange(length):
print (str(i) + ". " + result[i][0] + ': ' + str(result[i][1]))
print
print
================================================
FILE: source/joint/eval_grad.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
def RootMeanSquaredError(output, gt):
d = output - gt
diff = np.sqrt(np.mean(d * d))
return diff
def MVNError(output, gt):
outMean = np.mean(output)
outStd = np.std(output)
output = (output - outMean)/outStd
gtMean = np.mean(gt)
gtStd = np.std(gt)
gt = (gt - gtMean)/gtStd
d = output - gt
diff = np.sqrt(np.mean(d * d))
return diff
def Test(out, gt):
RMSE = RootMeanSquaredError(out, gt)
MVN = MVNError(out, gt)
return [RMSE, MVN]
def PrintTop5(title, result):
# despite the name, prints up to the top 10 entries
length = min(10, len(result))
print
print
print ("TOP " + str(length) + " for " + title)
for i in xrange(length):
print (str(i) + ". " + result[i][0] + ': ' + str(result[i][1]))
print
print
================================================
FILE: source/joint/filter.prototxt
================================================
name: "GradientFilter"
input: "X"
input_shape {
dim: 1
dim: 1
dim: 320
dim: 420
}
layer {
name: "gradientFilter"
type: "Convolution"
bottom: "X"
top: "out"
param {
lr_mult: 0
}
param {
lr_mult: 0
}
convolution_param {
num_output: 2
pad: 0
kernel_size: 3
stride: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0
}
}
}
================================================
FILE: source/joint/solver.prototxt
================================================
net: "net_train.prototxt"
test_iter: 50
test_interval: 500
base_lr: 0.0005
# gamma and stepsize take effect only with lr_policy: "step"; unused under "fixed"
gamma: 0.5
stepsize: 100000
momentum: 0.9
weight_decay: 0.005
lr_policy: "fixed"
display: 100
max_iter: 100000
snapshot: 10000
snapshot_prefix: "snaps/"
solver_mode: GPU
debug_info: 1
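`gamma` and `stepsize` here only matter under `lr_policy: "step"`; with `"fixed"` the rate stays at `base_lr` for all 100k iterations. For reference, the step schedule these values would otherwise produce:

```python
def step_lr(base_lr, gamma, stepsize, it):
    # Caffe "step" policy: lr = base_lr * gamma ** floor(it / stepsize)
    return base_lr * gamma ** (it // stepsize)

lr_start = step_lr(0.0005, 0.5, 100000, 0)       # base rate
lr_after = step_lr(0.0005, 0.5, 100000, 100000)  # halved after one step
```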
================================================
FILE: source/joint/test_depth.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
from __future__ import print_function
import numpy as np
import sys
import cv2
import os.path
os.environ['GLOG_minloglevel'] = '2'
import caffe
import scipy.ndimage
import argparse
import operator
import shutil
from eval_depth import Test, PrintTop5, LogDepth
WIDTH = 298
HEIGHT = 218
OUT_WIDTH = 37
OUT_HEIGHT = 27
GT_WIDTH = 420
GT_HEIGHT = 320
def testNet(net, img):
net.blobs['X'].data[...] = img
net.forward()
output = net.blobs['depth'].data
output = np.reshape(output, (1,1,OUT_HEIGHT, OUT_WIDTH))
return output
def loadImage(path, channels, width, height):
img = caffe.io.load_image(path)
img = caffe.io.resize(img, (height, width, channels))
img = np.transpose(img, (2,0,1))
img = np.reshape(img, (1,channels,height,width))
return img
def printImage(img, name, channels, width, height):
params = [cv2.IMWRITE_PNG_COMPRESSION, 8]
imgnp = np.reshape(img, (height, width, channels))
imgnp = np.array(imgnp * 255, dtype=np.uint8)
cv2.imwrite(name, imgnp, params)
def eval(out, gt, rawResults):
linearGT = gt * 10.0
linearOut = out * 10.0
rawResults = [x + y for x, y in zip(rawResults, Test(linearOut, linearGT))]
return rawResults
def ProcessToOutput(depth):
depth = np.clip(depth, 0.001, 1000)
return np.clip(2 * 0.179581 * np.log(depth) + 1, 0, 1)
caffe.set_mode_cpu()
parser = argparse.ArgumentParser()
parser.add_argument("input_dir", help="directory with input images")
parser.add_argument("gt_dir", help="directory with ground truths")
parser.add_argument("output", help="folder to output to")
parser.add_argument("snaps", help="folder with snapshots to use")
parser.add_argument('--log', action='store_true', default=False)
args = parser.parse_args()
try:
os.mkdir(args.output)
except OSError:
print ('Output directory already exists, not creating a new one')
try:
os.mkdir(args.output + "_abs")
except OSError:
print ('Output directory already exists, not creating a new one')
fileCount = len([name for name in os.listdir(args.input_dir)])
results = [dict() for x in range(10)]
for snapshot in os.listdir(args.snaps):
if not snapshot.endswith("caffemodel"):
continue
currentSnapDir = snapshot.replace(".caffemodel","")
if os.path.exists(args.output + "/" + currentSnapDir):
shutil.rmtree(args.output + "/" + currentSnapDir)
if os.path.exists(args.output + "_abs/" + currentSnapDir):
shutil.rmtree(args.output + "_abs/" + currentSnapDir)
os.mkdir(args.output + "/" + currentSnapDir)
os.mkdir(args.output + "_abs/" + currentSnapDir)
print(currentSnapDir)
sys.stdout.flush()
netFile = snapshot.replace(".caffemodel",".prototxt")
net = caffe.Net(args.snaps + '/' + netFile, args.snaps + '/' + snapshot, caffe.TEST)
rawResults = np.zeros((10))
for count, file in enumerate(os.listdir(args.input_dir)):
out_string = str(count) + '/' + str(fileCount) + ': ' + file
sys.stdout.write('%s\r' % out_string)
sys.stdout.flush()
inputFileName = file
inputFilePath = args.input_dir + '/' + inputFileName
gtFileName = file.replace('colors','depth')
gtFilePath = args.gt_dir + '/' + gtFileName
gt = loadImage(gtFilePath, 1, GT_WIDTH, GT_HEIGHT)
input = loadImage(inputFilePath, 3, WIDTH, HEIGHT)
input *= 255
input -= 127
output = testNet(net, input)
if args.log:
output = np.exp((output - 1) / 0.179581)
outWidth = OUT_WIDTH
outHeight = OUT_HEIGHT
scaleW = float(GT_WIDTH) / float(OUT_WIDTH)
scaleH = float(GT_HEIGHT) / float(OUT_HEIGHT)
output = scipy.ndimage.zoom(output, (1,1,scaleH,scaleW), order=3)
outWidth = int(outWidth * scaleW)
outHeight = int(outHeight * scaleH)
rawResults = eval(output, gt, rawResults)
input += 127
input = input / 255.0
input = np.transpose(input, (0,2,3,1))
input = input[:,:,:,(2,1,0)]
absOutput = output.copy()
output -= output.mean()
output /= output.std()
output *= gt.std()
output += gt.mean()
gt = ProcessToOutput(gt)
output = ProcessToOutput(output)
absOutput = ProcessToOutput(absOutput)
filename = os.path.splitext(os.path.basename(inputFileName))[0]
filePath = args.output + '/' + currentSnapDir + '/' + filename + '.png'
filePathAbs = args.output + '_abs/' + currentSnapDir + '/' + filename + '.png'
printImage(input, filePath, 3, WIDTH, HEIGHT)
printImage(input, filePathAbs, 3, WIDTH, HEIGHT)
printImage(output, filePath.replace('_colors','_depth'), 1, outWidth, outHeight)
printImage(absOutput, filePathAbs.replace('_colors','_depth'), 1, outWidth, outHeight)
printImage(gt, filePath.replace('_colors', '_gt'), 1, outWidth, outHeight)
printImage(gt, filePathAbs.replace('_colors', '_gt'), 1, outWidth, outHeight)
rawResults[:] = [x / fileCount for x in rawResults]
for i in xrange(10):
results[i][currentSnapDir] = rawResults[i]
titles = ["AbsRelDiff", "SqrRelDiff", "RMSE", "RMSELog", "SIMSE", "Log10", "MVN", "Threshold 1.25","Threshold 1.25^2", "Threshold 1.25^3"]
for i in xrange(10):
results[i] = sorted(results[i].items(), key=operator.itemgetter(1))
if i > 6:
results[i] = list(reversed(results[i]))
PrintTop5(titles[i], results[i])
================================================
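Before saving the relative output images, `test_depth.py` re-scales the prediction so its mean and standard deviation match the ground truth (the `output -= output.mean()` block). A standalone sketch of that statistic-matching step:

```python
import numpy as np

def match_statistics(output, gt):
    # shift/scale the prediction so its mean and std equal the ground truth's,
    # for a fair side-by-side visualization of relative depth
    out = (output - output.mean()) / output.std()
    return out * gt.std() + gt.mean()

rng = np.random.RandomState(0)
pred = rng.rand(27, 37) * 3.0        # stand-in network output
gt = rng.rand(27, 37) * 10.0 + 2.0   # stand-in ground truth
aligned = match_statistics(pred, gt)
```

The unmodified prediction is written separately to the `_abs` directories, so both the relative and absolute estimates can be inspected.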
FILE: source/joint/test_grad.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
from __future__ import print_function
import numpy as np
import sys
import cv2
import os.path
os.environ['GLOG_minloglevel'] = '2'
import caffe
import scipy.ndimage
import argparse
import operator
import shutil
from eval_grad import Test, PrintTop5
filter = np.zeros((1,3,3))
filter[0,0,:] = (-1,-1,-1)
filter[0,1,:] = (0,0,0)
filter[0,2,:] = (1,1,1)
filter2 = np.zeros((1,3,3))
filter2[0,0,:] = (-1,0,1)
filter2[0,1,:] = (-1,0,1)
filter2[0,2,:] = (-1,0,1)
WIDTH = 298
HEIGHT = 218
OUT_WIDTH = 35
OUT_HEIGHT = 25
GT_WIDTH = 418
GT_HEIGHT = 318
def filterImage(net, gt):
net.blobs['X'].data[...] = gt
net.forward()
return (net.blobs['out'].data[0,0,:,:], net.blobs['out'].data[0,1,:,:])
def testNet(net, img):
net.blobs['X'].data[...] = img
net.forward()
output = net.blobs['gradient'].data
output = np.reshape(output, (1,2,OUT_HEIGHT, OUT_WIDTH))
out1 = output[0,0,:,:]
out2 = output[0,1,:,:]
return out1, out2
def loadImage(path, channels, width, height):
img = caffe.io.load_image(path)
img = caffe.io.resize(img, (height, width, channels))
img = np.transpose(img, (2,0,1))
img = np.reshape(img, (1,channels,height,width))
return img
def printImage(img, name, channels, width, height):
params = [cv2.IMWRITE_PNG_COMPRESSION, 8]
imgnp = np.reshape(img, (height, width, channels))
imgnp = np.array(imgnp * 255, dtype=np.uint8)
cv2.imwrite(name, imgnp, params)
def eval(out, gt, rawResults):
# gradients are compared directly; no log-to-linear conversion needed
rawResults = [x + y for x, y in zip(rawResults, Test(out, gt))]
return rawResults
caffe.set_mode_cpu()
parser = argparse.ArgumentParser()
parser.add_argument("input_dir", help="directory with input images")
parser.add_argument("gt_dir", help="directory with ground truths")
parser.add_argument("output", help="folder to output to")
parser.add_argument("snaps", help="folder with snapshots to use")
args = parser.parse_args()
gradNet = caffe.Net("filter.prototxt", caffe.TEST)
gradNet.params['gradientFilter'][0].data[0,...] = filter
gradNet.params['gradientFilter'][0].data[1,...] = filter2
try:
os.mkdir(args.output)
except OSError:
print('Output directory already exists, not creating a new one')
fileCount = len([name for name in os.listdir(args.input_dir)])
results = [dict() for x in range(2)]
for snapshot in os.listdir(args.snaps):
if not snapshot.endswith("caffemodel"):
continue
currentSnapDir = snapshot.replace(".caffemodel","")
if os.path.exists(args.output + "/" + currentSnapDir):
shutil.rmtree(args.output + "/" + currentSnapDir)
os.mkdir(args.output + "/" + currentSnapDir)
print(currentSnapDir)
sys.stdout.flush()
netFile = snapshot.replace(".caffemodel",".prototxt")
net = caffe.Net(args.snaps + '/' + netFile, args.snaps + '/' + snapshot, caffe.TEST)
rawResults = np.zeros((2))
for count, file in enumerate(os.listdir(args.input_dir)):
out_string = str(count) + '/' + str(fileCount) + ': ' + file
sys.stdout.write('%s\r' % out_string)
sys.stdout.flush()
inputFileName = file
inputFilePath = args.input_dir + '/' + inputFileName
gtFileName = file.replace('colors','depth')
gtFilePath = args.gt_dir + '/' + gtFileName
gt = loadImage(gtFilePath, 1, GT_WIDTH + 2, GT_HEIGHT + 2)
gt1, gt2 = filterImage(gradNet, gt)
gt1 = np.reshape(gt1, (1,1,GT_HEIGHT, GT_WIDTH))
gt2 = np.reshape(gt2, (1,1,GT_HEIGHT, GT_WIDTH))
input = loadImage(inputFilePath, 3, WIDTH, HEIGHT)
input *= 255
input -= 127
out1, out2 = testNet(net, input)
outWidth = OUT_WIDTH
outHeight = OUT_HEIGHT
scaleW = float(GT_WIDTH) / float(OUT_WIDTH)
scaleH = float(GT_HEIGHT) / float(OUT_HEIGHT)
out1 = scipy.ndimage.zoom(out1, (scaleH,scaleW), order=3)
out2 = scipy.ndimage.zoom(out2, (scaleH,scaleW), order=3)
outWidth = int(outWidth * scaleW)
outHeight = int(outHeight * scaleH)
rawResults = eval(out1, gt1, rawResults)
rawResults = eval(out2, gt2, rawResults)
gt1 = (gt1 - gt1.min())/(gt1.max() - gt1.min())
gt2 = (gt2 - gt2.min())/(gt2.max() - gt2.min())
out1 -= out1.mean()
out1 /= out1.std()
out1 *= gt1.std()
out1 += gt1.mean()
out2 -= out2.mean()
out2 /= out2.std()
out2 *= gt2.std()
out2 += gt2.mean()
input += 127
input = input / 255.0
input = np.transpose(input, (0,2,3,1))
input = input[:,:,:,(2,1,0)]
gt1 = np.clip(gt1, 0, 1)
gt2 = np.clip(gt2, 0, 1)
out1 = np.clip(out1, 0, 1)
out2 = np.clip(out2, 0, 1)
filename = os.path.splitext(os.path.basename(inputFileName))[0]
filePath = args.output + '/' + currentSnapDir + '/' + filename + '.png'
printImage(input, filePath, 3, WIDTH, HEIGHT)
printImage(out1, filePath.replace('_colors','_grad1'), 1, outWidth, outHeight)
printImage(out2, filePath.replace('_colors','_grad2'), 1, outWidth, outHeight)
printImage(gt1, filePath.replace('_colors', '_gt1'), 1, outWidth, outHeight)
printImage(gt2, filePath.replace('_colors', '_gt2'), 1, outWidth, outHeight)
rawResults[:] = [x / (fileCount * 2.0) for x in rawResults]
for i in xrange(2):
results[i][currentSnapDir] = rawResults[i]
titles = ["RMSE", "MVN"]
for i in xrange(2):
results[i] = sorted(results[i].items(), key=operator.itemgetter(1))
PrintTop5(titles[i], results[i])
================================================
FILE: source/joint/train.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
import caffe
caffe.set_mode_gpu()
solver = caffe.get_solver('solver.prototxt')
solver.net.copy_from('bvlc_alexnet.caffemodel')
solver.net.params['conv1_g'][0].data[...] = solver.net.params['conv1'][0].data
solver.net.params['conv1_g'][1].data[...] = solver.net.params['conv1'][1].data
filter = np.zeros((1,3,3))
filter[0,0,:] = (-1,-1,-1)
filter[0,1,:] = (0,0,0)
filter[0,2,:] = (1,1,1)
filter2 = np.zeros((1,3,3))
filter2[0,0,:] = (-1,0,1)
filter2[0,1,:] = (-1,0,1)
filter2[0,2,:] = (-1,0,1)
solver.net.params['gradientFilter'][0].data[0,...] = filter
solver.net.params['gradientFilter'][0].data[1,...] = filter2
solver.solve()
================================================
FILE: source/refining_network/abs/net_deploy.prototxt
================================================
name: "refining_network_abs_deploy"
#INPUTS
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
#GLOBAL NETWORK STARTS HERE
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 0
lr_mult: 0
}
param {
decay_mult: 0
lr_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 0
lr_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the batch dimension from the bottom blob
dim: 1
dim: 27
dim: 37 # 1 x 27 x 37 = 999, matching num_output of fc-depth
}
}
}
layer {
name: "mvnDepth-global"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
#GRADIENT NETWORK STARTS HERE
layer {
name: "conv1-grad"
type: "Convolution"
bottom: "X"
top: "conv1-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-grad"
type: "ReLU"
bottom: "conv1-grad"
top: "conv1-grad"
}
layer {
name: "norm1-grad"
type: "LRN"
bottom: "conv1-grad"
top: "norm1-grad"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-grad"
type: "Pooling"
bottom: "norm1-grad"
top: "pool1-grad"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1-grad"
top: "conv2-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2-grad"
type: "ReLU"
bottom: "conv2-grad"
top: "conv2-grad"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2-grad"
top: "conv3-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-grad"
type: "ReLU"
bottom: "conv3-grad"
top: "conv3-grad"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3-grad"
top: "conv4-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-grad"
type: "ReLU"
bottom: "conv4-grad"
top: "conv4-grad"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4-grad"
top: "grad_out"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "mvnGrad"
type: "MVN"
bottom: "grad_out"
top: "gradMVN"
}
#GRADIENT NETWORK ENDS HERE
#PREPROCESSING FOR THE REFINING NETWORK
layer {
name: "upsample"
type: "Deconvolution"
bottom: "gradMVN"
top: "grad-upsample"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 3
stride: 1
num_output: 2
group: 2
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
layer {
name: "concat-global"
bottom: "grad-upsample"
bottom: "depthMVN"
top: "global-output"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "upsample-global"
type: "Deconvolution"
bottom: "global-output"
top: "est"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 2
stride: 2
num_output: 3
group: 3
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
#GLOBAL NETWORK ENDS HERE
#REFINING NETWORK STARTS HERE
layer {
name: "conv1-refine"
type: "Convolution"
bottom: "X"
top: "conv1-refine"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 2
pad: 2
weight_filler {
type: "constant"
value: 0
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-refine"
type: "ReLU"
bottom: "conv1-refine"
top: "conv1-refine"
}
layer {
name: "norm1-refine"
type: "LRN"
bottom: "conv1-refine"
top: "norm1-refine"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-refine"
type: "Pooling"
bottom: "norm1-refine"
top: "pool1-refine"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 1
}
}
layer {
name: "concat"
bottom: "pool1-refine"
bottom: "est"
top: "input-refine"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "conv2-refine"
type: "Convolution"
bottom: "input-refine"
top: "conv2-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu2-refine"
type: "ReLU"
bottom: "conv2-refine"
top: "conv2-refine"
}
layer {
name: "conv3-refine"
type: "Convolution"
bottom: "conv2-refine"
top: "conv3-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-refine"
type: "ReLU"
bottom: "conv3-refine"
top: "conv3-refine"
}
layer {
name: "conv4-refine"
type: "Convolution"
bottom: "conv3-refine"
top: "conv4-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-refine"
type: "ReLU"
bottom: "conv4-refine"
top: "conv4-refine"
}
layer {
name: "conv5-refine"
type: "Convolution"
bottom: "conv4-refine"
top: "depth-refine_"
param {
lr_mult: 0.01
decay_mult: 1
}
param {
lr_mult: 0.01
decay_mult: 0
}
convolution_param {
num_output: 1
pad: 1
kernel_size: 3
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "power-refine"
type: "Power"
bottom: "depth-refine_"
top: "depth-refine"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
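The final `Power` layer computes y = (shift + scale·x)^power; with power = 1 and shift = 0 it simply scales the refined depth by 0.01. A minimal NumPy sketch:

```python
import numpy as np

def power_layer(x, power=1.0, scale=0.01, shift=0.0):
    # Caffe Power layer: y = (shift + scale * x) ** power
    return (shift + scale * x) ** power

y = power_layer(np.array([0.0, 50.0, 100.0]))  # -> [0.0, 0.5, 1.0]
```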
================================================
FILE: source/refining_network/abs/net_train.prototxt
================================================
name: "refining_network_abs_train"
#INPUTS START HERE
#COLOR
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_74x54.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_74x54.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
#INPUTS END HERE
#GLOBAL NETWORK STARTS HERE
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 0
lr_mult: 0
}
param {
decay_mult: 0
lr_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 0
lr_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the batch dimension from the bottom blob
dim: 1
dim: 27
dim: 37 # 1 x 27 x 37 = 999, matching num_output of fc-depth
}
}
}
layer {
name: "mvnDepth-global"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
#GRADIENT NETWORK STARTS HERE
layer {
name: "conv1-grad"
type: "Convolution"
bottom: "X"
top: "conv1-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-grad"
type: "ReLU"
bottom: "conv1-grad"
top: "conv1-grad"
}
layer {
name: "norm1-grad"
type: "LRN"
bottom: "conv1-grad"
top: "norm1-grad"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-grad"
type: "Pooling"
bottom: "norm1-grad"
top: "pool1-grad"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1-grad"
top: "conv2-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2-grad"
type: "ReLU"
bottom: "conv2-grad"
top: "conv2-grad"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2-grad"
top: "conv3-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-grad"
type: "ReLU"
bottom: "conv3-grad"
top: "conv3-grad"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3-grad"
top: "conv4-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-grad"
type: "ReLU"
bottom: "conv4-grad"
top: "conv4-grad"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4-grad"
top: "grad_out"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "mvnGrad"
type: "MVN"
bottom: "grad_out"
top: "gradMVN"
}
#GRADIENT NETWORK ENDS HERE
#PREPROCESSING FOR THE REFINING NETWORK
layer {
name: "upsample"
type: "Deconvolution"
bottom: "gradMVN"
top: "grad-upsample"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 3
stride: 1
num_output: 2
group: 2
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
layer {
name: "concat-global"
bottom: "grad-upsample"
bottom: "depthMVN"
top: "global-output"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "upsample-global"
type: "Deconvolution"
bottom: "global-output"
top: "est"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 2
stride: 2
num_output: 3
group: 3
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
#GLOBAL NETWORK ENDS HERE
#REFINING NETWORK STARTS HERE
layer {
name: "conv1-refine"
type: "Convolution"
bottom: "X"
top: "conv1-refine"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 2
pad: 2
weight_filler {
type: "constant"
# std: 0.001
value: 0
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-refine"
type: "ReLU"
bottom: "conv1-refine"
top: "conv1-refine"
}
layer {
name: "norm1-refine"
type: "LRN"
bottom: "conv1-refine"
top: "norm1-refine"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-refine"
type: "Pooling"
bottom: "norm1-refine"
top: "pool1-refine"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 1
}
}
layer {
name: "concat"
bottom: "pool1-refine"
bottom: "est"
top: "input-refine"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "conv2-refine"
type: "Convolution"
bottom: "input-refine"
top: "conv2-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu2-refine"
type: "ReLU"
bottom: "conv2-refine"
top: "conv2-refine"
}
layer {
name: "conv3-refine"
type: "Convolution"
bottom: "conv2-refine"
top: "conv3-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-refine"
type: "ReLU"
bottom: "conv3-refine"
top: "conv3-refine"
}
layer {
name: "conv4-refine"
type: "Convolution"
bottom: "conv3-refine"
top: "conv4-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-refine"
type: "ReLU"
bottom: "conv4-refine"
top: "conv4-refine"
}
layer {
name: "conv5-refine"
type: "Convolution"
bottom: "conv4-refine"
top: "depth-refine_"
param {
lr_mult: 0.01
decay_mult: 1
}
param {
lr_mult: 0.01
decay_mult: 0
}
convolution_param {
num_output: 1
pad: 1
kernel_size: 3
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "power-refine"
type: "Power"
bottom: "depth-refine_"
top: "depth-refine"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
#LOSSES
layer {
name: "lossABSDepth"
type: "EuclideanLoss"
bottom: "depth-refine"
bottom: "gt"
top: "lossABSDepth"
loss_weight: 1
}
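For reference, the EuclideanLoss layer above computes half the summed squared difference between the two bottom blobs, normalized by batch size. A minimal NumPy sketch of that quantity (the shapes here are illustrative, not the net's actual blob sizes):

```python
import numpy as np

def euclidean_loss(pred, gt):
    # Caffe's EuclideanLoss: sum of squared differences over the whole blob,
    # divided by 2 * N, where N is the batch size (first blob dimension).
    n = pred.shape[0]
    d = pred - gt
    return np.sum(d * d) / (2.0 * n)

# Two batch items of 1x4x4 "depth maps" differing by 1 everywhere:
pred = np.zeros((2, 1, 4, 4))
gt = np.ones((2, 1, 4, 4))
print(euclidean_loss(pred, gt))  # 32 / (2 * 2) = 8.0
```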
================================================
FILE: source/refining_network/eval_depth.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
import matplotlib.pyplot as plt
import sys
from PIL import Image
import cv2
import caffe
import operator
import argparse
import os
from scipy import misc
from os.path import basename
def LogDepth(depth):
depth = np.maximum(depth, 1.0 / 255.0)
return 0.179581 * np.log(depth) + 1
def AbsoluteRelativeDifference(output, gt):
gt = np.maximum(gt, 1.0 / 255.0)
diff = np.mean(np.absolute(output - gt) / gt)
return diff
def SquaredRelativeDifference(output, gt):
gt = np.maximum(gt, 1.0 / 255.0)
d = output - gt
diff = np.mean((d * d) / gt)
return diff
def RootMeanSquaredError(output, gt):
d = output - gt
diff = np.sqrt(np.mean(d * d))
return diff
def RootMeanSquaredErrorLog(output, gt):
d = LogDepth(output / 10.0) * 10.0 - LogDepth(gt / 10.0) * 10.0
diff = np.sqrt(np.mean(d * d))
return diff
def ScaleInvariantMeanSquaredError(output, gt):
output = LogDepth(output / 10.0) * 10.0
gt = LogDepth(gt / 10.0) * 10.0
d = output - gt
diff = np.mean(d * d)
relDiff = (d.sum() * d.sum()) / float(d.size * d.size)
return diff - relDiff
def Log10Error(output, gt):
output = np.maximum(output, 1.0 / 255.0)
gt = np.maximum(gt, 1.0 / 255.0)
diff = np.mean(np.absolute(np.log10(output) - np.log10(gt)))
return diff
def MVNError(output, gt):
outMean = np.mean(output)
outStd = np.std(output)
output = (output - outMean)/outStd
gtMean = np.mean(gt)
gtStd = np.std(gt)
gt = (gt - gtMean)/gtStd
d = output - gt
diff = np.sqrt(np.mean(d * d))
return diff
def Threshold(output, gt, threshold):
output = np.maximum(output, 1.0 / 255.0)
gt = np.maximum(gt, 1.0 / 255.0)
withinThresholdCount = np.where(np.maximum(output / gt, gt / output) < threshold)[0].size
return withinThresholdCount / float(gt.size)
def Test(out, gt):
absRelDiff = AbsoluteRelativeDifference(out, gt)
sqrRelDiff = SquaredRelativeDifference(out, gt)
RMSE = RootMeanSquaredError(out, gt)
RMSELog = RootMeanSquaredErrorLog(out, gt)
SIMSE = ScaleInvariantMeanSquaredError(out, gt)
threshold1 = Threshold(out, gt, 1.25)
threshold2 = Threshold(out, gt, 1.25 * 1.25)
threshold3 = Threshold(out, gt, 1.25 * 1.25 * 1.25)
log10 = Log10Error(out, gt)
MVN = MVNError(out, gt)
return [absRelDiff, sqrRelDiff, RMSE, RMSELog, SIMSE, log10, MVN, threshold1, threshold2, threshold3]
def PrintTop5(title, result):
	# Prints up to the top 10 entries (the name is historical).
	length = min(10, len(result))
	print("")
	print("")
	print("TOP " + str(length) + " for " + title)
	for i in range(length):
		print(str(i) + ". " + result[i][0] + ': ' + str(result[i][1]))
	print("")
	print("")
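The helpers above can be exercised on synthetic arrays; a minimal self-contained sketch (hypothetical values, mirroring RootMeanSquaredError and Threshold above):

```python
import numpy as np

def rmse(output, gt):
    # Mirrors RootMeanSquaredError above.
    d = output - gt
    return np.sqrt(np.mean(d * d))

def threshold_accuracy(output, gt, threshold):
    # Mirrors Threshold above: fraction of pixels whose ratio to ground
    # truth (in either direction) stays under the threshold.
    output = np.maximum(output, 1.0 / 255.0)
    gt = np.maximum(gt, 1.0 / 255.0)
    ratio = np.maximum(output / gt, gt / output)
    return np.count_nonzero(ratio < threshold) / float(gt.size)

# Synthetic 54x74 depth maps: prediction uniformly 10% too deep.
gt = np.full((54, 74), 2.0)
pred = gt * 1.1
print(round(rmse(pred, gt), 3))            # 0.2
print(threshold_accuracy(pred, gt, 1.25))  # 1.0 -- ratio 1.1 is under 1.25
print(threshold_accuracy(pred, gt, 1.05))  # 0.0 -- ratio 1.1 exceeds 1.05
```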
================================================
FILE: source/refining_network/log_abs/net_deploy.prototxt
================================================
name: "refining_network_log_abs_deploy"
#INPUTS
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
#GLOBAL NETWORK STARTS HERE
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 0
lr_mult: 0
}
param {
decay_mult: 0
lr_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 0
lr_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999, matching fc-depth's num_output
}
}
}
layer {
name: "mvnDepth-global"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
#GRADIENT NETWORK STARTS HERE
layer {
name: "conv1-grad"
type: "Convolution"
bottom: "X"
top: "conv1-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-grad"
type: "ReLU"
bottom: "conv1-grad"
top: "conv1-grad"
}
layer {
name: "norm1-grad"
type: "LRN"
bottom: "conv1-grad"
top: "norm1-grad"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-grad"
type: "Pooling"
bottom: "norm1-grad"
top: "pool1-grad"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1-grad"
top: "conv2-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2-grad"
type: "ReLU"
bottom: "conv2-grad"
top: "conv2-grad"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2-grad"
top: "conv3-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-grad"
type: "ReLU"
bottom: "conv3-grad"
top: "conv3-grad"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3-grad"
top: "conv4-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-grad"
type: "ReLU"
bottom: "conv4-grad"
top: "conv4-grad"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4-grad"
top: "grad_out"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "mvnGrad"
type: "MVN"
bottom: "grad_out"
top: "gradMVN"
}
#GRADIENT NETWORK ENDS HERE
#PREPROCESSING FOR THE REFINING NETWORK
layer {
name: "upsample"
type: "Deconvolution"
bottom: "gradMVN"
top: "grad-upsample"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 3
stride: 1
num_output: 2
group: 2
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
layer {
name: "concat-global"
bottom: "grad-upsample"
bottom: "depthMVN"
top: "global-output"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "upsample-global"
type: "Deconvolution"
bottom: "global-output"
top: "est"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 2
stride: 2
num_output: 3
group: 3
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
#GLOBAL NETWORK ENDS HERE
#REFINING NETWORK STARTS HERE
layer {
name: "conv1-refine"
type: "Convolution"
bottom: "X"
top: "conv1-refine"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 2
pad: 2
weight_filler {
type: "constant"
# std: 0.001
value: 0
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-refine"
type: "ReLU"
bottom: "conv1-refine"
top: "conv1-refine"
}
layer {
name: "norm1-refine"
type: "LRN"
bottom: "conv1-refine"
top: "norm1-refine"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-refine"
type: "Pooling"
bottom: "norm1-refine"
top: "pool1-refine"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 1
}
}
layer {
name: "concat"
bottom: "pool1-refine"
bottom: "est"
top: "input-refine"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "conv2-refine"
type: "Convolution"
bottom: "input-refine"
top: "conv2-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu2-refine"
type: "ReLU"
bottom: "conv2-refine"
top: "conv2-refine"
}
layer {
name: "conv3-refine"
type: "Convolution"
bottom: "conv2-refine"
top: "conv3-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-refine"
type: "ReLU"
bottom: "conv3-refine"
top: "conv3-refine"
}
layer {
name: "conv4-refine"
type: "Convolution"
bottom: "conv3-refine"
top: "conv4-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-refine"
type: "ReLU"
bottom: "conv4-refine"
top: "conv4-refine"
}
layer {
name: "conv5-refine"
type: "Convolution"
bottom: "conv4-refine"
top: "depth-refine_"
param {
lr_mult: 0.01
decay_mult: 1
}
param {
lr_mult: 0.01
decay_mult: 0
}
convolution_param {
num_output: 1
pad: 1
kernel_size: 3
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "power-refine"
type: "Power"
bottom: "depth-refine_"
top: "depth-refine"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
================================================
FILE: source/refining_network/log_abs/net_train.prototxt
================================================
name: "refining_network_log_abs_train"
#INPUTS START HERE
#COLOR
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_74x54.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_74x54.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
#INPUTS END HERE
#GLOBAL NETWORK STARTS HERE
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 0
lr_mult: 0
}
param {
decay_mult: 0
lr_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 0
lr_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999, matching fc-depth's num_output
}
}
}
layer {
name: "mvnDepth-global"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
#GRADIENT NETWORK STARTS HERE
layer {
name: "conv1-grad"
type: "Convolution"
bottom: "X"
top: "conv1-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-grad"
type: "ReLU"
bottom: "conv1-grad"
top: "conv1-grad"
}
layer {
name: "norm1-grad"
type: "LRN"
bottom: "conv1-grad"
top: "norm1-grad"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-grad"
type: "Pooling"
bottom: "norm1-grad"
top: "pool1-grad"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1-grad"
top: "conv2-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2-grad"
type: "ReLU"
bottom: "conv2-grad"
top: "conv2-grad"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2-grad"
top: "conv3-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-grad"
type: "ReLU"
bottom: "conv3-grad"
top: "conv3-grad"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3-grad"
top: "conv4-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-grad"
type: "ReLU"
bottom: "conv4-grad"
top: "conv4-grad"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4-grad"
top: "grad_out"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "mvnGrad"
type: "MVN"
bottom: "grad_out"
top: "gradMVN"
}
#GRADIENT NETWORK ENDS HERE
#PREPROCESSING FOR THE REFINING NETWORK
layer {
name: "upsample"
type: "Deconvolution"
bottom: "gradMVN"
top: "grad-upsample"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 3
stride: 1
num_output: 2
group: 2
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
layer {
name: "concat-global"
bottom: "grad-upsample"
bottom: "depthMVN"
top: "global-output"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "upsample-global"
type: "Deconvolution"
bottom: "global-output"
top: "est"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 2
stride: 2
num_output: 3
group: 3
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
#GLOBAL NETWORK ENDS HERE
#REFINING NETWORK STARTS HERE
layer {
name: "conv1-refine"
type: "Convolution"
bottom: "X"
top: "conv1-refine"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 2
pad: 2
weight_filler {
type: "constant"
# std: 0.001
value: 0
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-refine"
type: "ReLU"
bottom: "conv1-refine"
top: "conv1-refine"
}
layer {
name: "norm1-refine"
type: "LRN"
bottom: "conv1-refine"
top: "norm1-refine"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-refine"
type: "Pooling"
bottom: "norm1-refine"
top: "pool1-refine"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 1
}
}
layer {
name: "concat"
bottom: "pool1-refine"
bottom: "est"
top: "input-refine"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "conv2-refine"
type: "Convolution"
bottom: "input-refine"
top: "conv2-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu2-refine"
type: "ReLU"
bottom: "conv2-refine"
top: "conv2-refine"
}
layer {
name: "conv3-refine"
type: "Convolution"
bottom: "conv2-refine"
top: "conv3-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-refine"
type: "ReLU"
bottom: "conv3-refine"
top: "conv3-refine"
}
layer {
name: "conv4-refine"
type: "Convolution"
bottom: "conv3-refine"
top: "conv4-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-refine"
type: "ReLU"
bottom: "conv4-refine"
top: "conv4-refine"
}
layer {
name: "conv5-refine"
type: "Convolution"
bottom: "conv4-refine"
top: "depth-refine_"
param {
lr_mult: 0.01
decay_mult: 1
}
param {
lr_mult: 0.01
decay_mult: 0
}
convolution_param {
num_output: 1
pad: 1
kernel_size: 3
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "power-refine"
type: "Power"
bottom: "depth-refine_"
top: "depth-refine"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
#LOSSES
layer {
name: "log"
type: "Log"
bottom: "gt"
top: "lnGt"
log_param {
shift: 0.00392156863
scale: 0.996078431
}
}
layer {
name: "power"
type: "Power"
bottom: "lnGt"
top: "logGt"
power_param {
power: 1
scale: 0.179581
shift: 1.0
}
}
layer {
name: "lossABSDepth"
type: "EuclideanLoss"
bottom: "depth-refine"
bottom: "logGt"
top: "lossABSDepth"
loss_weight: 1
}
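The log/power pair above maps the ground truth into the same log-depth space as LogDepth in eval_depth.py: the Log layer computes ln(1/255 + (254/255)*gt), a soft clamp at 1/255 (assuming Caffe's natural-log default base), and the Power layer then applies 0.179581*x + 1. A quick NumPy sanity check of that equivalence:

```python
import numpy as np

def log_depth(depth):
    # LogDepth from eval_depth.py: hard clamp at 1/255, then a log mapping.
    depth = np.maximum(depth, 1.0 / 255.0)
    return 0.179581 * np.log(depth) + 1

def caffe_log_power(gt):
    # Log layer: ln(shift + scale * x) with shift = 1/255, scale = 254/255
    # (a soft clamp), followed by the Power layer: 0.179581 * x + 1.
    ln_gt = np.log(0.00392156863 + 0.996078431 * gt)
    return 0.179581 * ln_gt + 1.0

gt = np.linspace(0.0, 1.0, 5)          # normalized ground-truth depths
diff = np.abs(log_depth(gt) - caffe_log_power(gt))
print(diff[gt >= 0.25].max() < 0.005)  # close away from the clamp region
```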
================================================
FILE: source/refining_network/norm_abs/net_deploy.prototxt
================================================
name: "refining_network_norm_abs_deploy"
#INPUTS
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
#GLOBAL NETWORK STARTS HERE
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 0
lr_mult: 0
}
param {
decay_mult: 0
lr_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 0
lr_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the dimension from below
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999, matching fc-depth's num_output
}
}
}
layer {
name: "mvnDepth-global"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
#GRADIENT NETWORK STARTS HERE
layer {
name: "conv1-grad"
type: "Convolution"
bottom: "X"
top: "conv1-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-grad"
type: "ReLU"
bottom: "conv1-grad"
top: "conv1-grad"
}
layer {
name: "norm1-grad"
type: "LRN"
bottom: "conv1-grad"
top: "norm1-grad"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-grad"
type: "Pooling"
bottom: "norm1-grad"
top: "pool1-grad"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1-grad"
top: "conv2-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2-grad"
type: "ReLU"
bottom: "conv2-grad"
top: "conv2-grad"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2-grad"
top: "conv3-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-grad"
type: "ReLU"
bottom: "conv3-grad"
top: "conv3-grad"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3-grad"
top: "conv4-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-grad"
type: "ReLU"
bottom: "conv4-grad"
top: "conv4-grad"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4-grad"
top: "grad_out"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "mvnGrad"
type: "MVN"
bottom: "grad_out"
top: "gradMVN"
}
#GRADIENT NETWORK ENDS HERE
#PREPROCESSING FOR THE REFINING NETWORK
layer {
name: "upsample"
type: "Deconvolution"
bottom: "gradMVN"
top: "grad-upsample"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 3
stride: 1
num_output: 2
group: 2
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
layer {
name: "concat-global"
bottom: "grad-upsample"
bottom: "depthMVN"
top: "global-output"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "upsample-global"
type: "Deconvolution"
bottom: "global-output"
top: "est"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 2
stride: 2
num_output: 3
group: 3
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
#GLOBAL NETWORK ENDS HERE
#REFINING NETWORK STARTS HERE
layer {
name: "conv1-refine"
type: "Convolution"
bottom: "X"
top: "conv1-refine"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 2
pad: 2
weight_filler {
type: "constant"
# std: 0.001
value: 0
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-refine"
type: "ReLU"
bottom: "conv1-refine"
top: "conv1-refine"
}
layer {
name: "norm1-refine"
type: "LRN"
bottom: "conv1-refine"
top: "norm1-refine"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-refine"
type: "Pooling"
bottom: "norm1-refine"
top: "pool1-refine"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 1
}
}
layer {
name: "concat"
bottom: "pool1-refine"
bottom: "est"
top: "input-refine"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "conv2-refine"
type: "Convolution"
bottom: "input-refine"
top: "conv2-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu2-refine"
type: "ReLU"
bottom: "conv2-refine"
top: "conv2-refine"
}
layer {
name: "conv3-refine"
type: "Convolution"
bottom: "conv2-refine"
top: "conv3-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-refine"
type: "ReLU"
bottom: "conv3-refine"
top: "conv3-refine"
}
layer {
name: "conv4-refine"
type: "Convolution"
bottom: "conv3-refine"
top: "conv4-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-refine"
type: "ReLU"
bottom: "conv4-refine"
top: "conv4-refine"
}
layer {
name: "conv5-refine"
type: "Convolution"
bottom: "conv4-refine"
top: "depth-refine_"
param {
lr_mult: 0.01
decay_mult: 1
}
param {
lr_mult: 0.01
decay_mult: 0
}
convolution_param {
num_output: 1
pad: 1
kernel_size: 3
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "power-refine"
type: "Power"
bottom: "depth-refine_"
top: "depth-refine"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
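The `mvnGrad` and `mvnDepth-global` layers above use Caffe's MVN layer with default settings, which normalizes each channel of each sample to zero mean and unit variance. A minimal NumPy sketch (not from this repo, assuming the defaults `across_channels: false`, `normalize_variance: true`):

```python
import numpy as np

def mvn(blob, eps=1e-9):
    """Mean-variance normalization as in Caffe's MVN layer:
    per sample, per channel, zero mean and unit variance."""
    n, c = blob.shape[:2]
    out = np.empty_like(blob, dtype=np.float64)
    for i in range(n):
        for j in range(c):
            x = blob[i, j].astype(np.float64)
            out[i, j] = (x - x.mean()) / (x.std() + eps)
    return out

# e.g. the coarse "depth" blob of shape (1, 1, 27, 37) entering "mvnDepth-global"
depth = np.random.rand(1, 1, 27, 37)
norm = mvn(depth)
```

Normalizing both the network output and the ground truth this way is what makes the `lossMVNDepth` term scale-invariant: only the relative depth structure is compared.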
================================================
FILE: source/refining_network/norm_abs/net_train.prototxt
================================================
name: "refining_network_norm_abs_train"
#INPUTS START HERE
#COLOR
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_74x54.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_74x54.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
#INPUTS END HERE
#GLOBAL NETWORK STARTS HERE
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 0
lr_mult: 0
}
param {
decay_mult: 0
lr_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 0
lr_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # keep the batch dimension from the bottom blob
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 outputs from fc-depth
}
}
}
layer {
name: "mvnDepth-global"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
#GRADIENT NETWORK STARTS HERE
layer {
name: "conv1-grad"
type: "Convolution"
bottom: "X"
top: "conv1-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-grad"
type: "ReLU"
bottom: "conv1-grad"
top: "conv1-grad"
}
layer {
name: "norm1-grad"
type: "LRN"
bottom: "conv1-grad"
top: "norm1-grad"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-grad"
type: "Pooling"
bottom: "norm1-grad"
top: "pool1-grad"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1-grad"
top: "conv2-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2-grad"
type: "ReLU"
bottom: "conv2-grad"
top: "conv2-grad"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2-grad"
top: "conv3-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-grad"
type: "ReLU"
bottom: "conv3-grad"
top: "conv3-grad"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3-grad"
top: "conv4-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-grad"
type: "ReLU"
bottom: "conv4-grad"
top: "conv4-grad"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4-grad"
top: "grad_out"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "mvnGrad"
type: "MVN"
bottom: "grad_out"
top: "gradMVN"
}
#GRADIENT NETWORK ENDS HERE
#PREPROCESSING FOR THE REFINING NETWORK
layer {
name: "upsample"
type: "Deconvolution"
bottom: "gradMVN"
top: "grad-upsample"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 3
stride: 1
num_output: 2
group: 2
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
layer {
name: "concat-global"
bottom: "grad-upsample"
bottom: "depthMVN"
top: "global-output"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "upsample-global"
type: "Deconvolution"
bottom: "global-output"
top: "est"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 2
stride: 2
num_output: 3
group: 3
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
#GLOBAL NETWORK ENDS HERE
#REFINING NETWORK STARTS HERE
layer {
name: "conv1-refine"
type: "Convolution"
bottom: "X"
top: "conv1-refine"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 2
pad: 2
weight_filler {
type: "constant"
# std: 0.001
value: 0
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-refine"
type: "ReLU"
bottom: "conv1-refine"
top: "conv1-refine"
}
layer {
name: "norm1-refine"
type: "LRN"
bottom: "conv1-refine"
top: "norm1-refine"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-refine"
type: "Pooling"
bottom: "norm1-refine"
top: "pool1-refine"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 1
}
}
layer {
name: "concat"
bottom: "pool1-refine"
bottom: "est"
top: "input-refine"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "conv2-refine"
type: "Convolution"
bottom: "input-refine"
top: "conv2-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu2-refine"
type: "ReLU"
bottom: "conv2-refine"
top: "conv2-refine"
}
layer {
name: "conv3-refine"
type: "Convolution"
bottom: "conv2-refine"
top: "conv3-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-refine"
type: "ReLU"
bottom: "conv3-refine"
top: "conv3-refine"
}
layer {
name: "conv4-refine"
type: "Convolution"
bottom: "conv3-refine"
top: "conv4-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-refine"
type: "ReLU"
bottom: "conv4-refine"
top: "conv4-refine"
}
layer {
name: "conv5-refine"
type: "Convolution"
bottom: "conv4-refine"
top: "depth-refine_"
param {
lr_mult: 0.01
decay_mult: 1
}
param {
lr_mult: 0.01
decay_mult: 0
}
convolution_param {
num_output: 1
pad: 1
kernel_size: 3
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "power-refine"
type: "Power"
bottom: "depth-refine_"
top: "depth-refine"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
#LOSSES
layer {
name: "mvnDepthRefine"
type: "MVN"
bottom: "depth-refine"
top: "depthMVN-refine"
}
layer {
name: "mvnGT"
type: "MVN"
bottom: "gt"
top: "gtMVN"
}
layer {
name: "lossMVNDepth"
type: "EuclideanLoss"
bottom: "depthMVN-refine"
bottom: "gtMVN"
top: "lossMVNDepth"
loss_weight: 0.5
}
layer {
name: "lossABSDepth"
type: "EuclideanLoss"
bottom: "depth-refine"
bottom: "gt"
top: "lossABSDepth"
loss_weight: 0.5
}
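The two loss layers above combine a scale-invariant term (Euclidean loss between MVN-normalized prediction and ground truth) and an absolute term, each weighted 0.5. A rough NumPy sketch of what the solver minimizes (not repo code; Caffe's `EuclideanLoss` is the sum of squared differences divided by twice the batch size, and the `scale: 0.00390625` on the depth inputs is 1/256):

```python
import numpy as np

def mvn(x, eps=1e-9):
    # per-sample, per-channel mean-variance normalization (Caffe MVN defaults)
    m = x.mean(axis=(2, 3), keepdims=True)
    s = x.std(axis=(2, 3), keepdims=True)
    return (x - m) / (s + eps)

def euclidean_loss(a, b):
    # Caffe EuclideanLoss: sum of squared differences / (2 * batch size)
    return ((a - b) ** 2).sum() / (2.0 * a.shape[0])

pred = np.random.rand(16, 1, 54, 74)   # "depth-refine", batch_size 16
gt   = np.random.rand(16, 1, 54, 74)   # "gt" after scaling by 1/256

# loss_weight 0.5 on each term, as in the prototxt above
total = 0.5 * euclidean_loss(mvn(pred), mvn(gt)) + 0.5 * euclidean_loss(pred, gt)
```

The MVN term rewards getting the relative depth layout right regardless of scale, while the absolute term anchors the prediction to metric depth values.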
================================================
FILE: source/refining_network/norm_abs_global_only/net_deploy.prototxt
================================================
name: "refining_network_norm_abs_global_only_deploy"
#INPUTS
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
#GLOBAL NETWORK STARTS HERE
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 0
lr_mult: 0
}
param {
decay_mult: 0
lr_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 0
lr_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # keep the batch dimension from the bottom blob
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 outputs from fc-depth
}
}
}
layer {
name: "mvnDepth-global"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
#PREPROCESSING FOR THE REFINING NETWORK
layer {
name: "upsample-global"
type: "Deconvolution"
bottom: "depthMVN"
top: "est"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 2
stride: 2
num_output: 1
group: 1
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
#GLOBAL NETWORK ENDS HERE
#REFINING NETWORK STARTS HERE
layer {
name: "conv1-refine"
type: "Convolution"
bottom: "X"
top: "conv1-refine"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 2
pad: 2
weight_filler {
type: "constant"
# std: 0.001
value: 0
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-refine"
type: "ReLU"
bottom: "conv1-refine"
top: "conv1-refine"
}
layer {
name: "norm1-refine"
type: "LRN"
bottom: "conv1-refine"
top: "norm1-refine"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-refine"
type: "Pooling"
bottom: "norm1-refine"
top: "pool1-refine"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 1
}
}
layer {
name: "concat"
bottom: "pool1-refine"
bottom: "est"
top: "input-refine"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "conv2-refine"
type: "Convolution"
bottom: "input-refine"
top: "conv2-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu2-refine"
type: "ReLU"
bottom: "conv2-refine"
top: "conv2-refine"
}
layer {
name: "conv3-refine"
type: "Convolution"
bottom: "conv2-refine"
top: "conv3-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-refine"
type: "ReLU"
bottom: "conv3-refine"
top: "conv3-refine"
}
layer {
name: "conv4-refine"
type: "Convolution"
bottom: "conv3-refine"
top: "conv4-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-refine"
type: "ReLU"
bottom: "conv4-refine"
top: "conv4-refine"
}
layer {
name: "conv5-refine"
type: "Convolution"
bottom: "conv4-refine"
top: "depth-refine_"
param {
lr_mult: 0.01
decay_mult: 1
}
param {
lr_mult: 0.01
decay_mult: 0
}
convolution_param {
num_output: 1
pad: 1
kernel_size: 3
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "power-refine"
type: "Power"
bottom: "depth-refine_"
top: "depth-refine"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
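The `Deconvolution` upsampling layers above rely on the `"bilinear"` weight filler with `lr_mult: 0`, i.e. fixed interpolation weights, and `group` equal to `num_output` so each channel is upsampled independently. A sketch of the kernel Caffe's bilinear filler generates (following its `BilinearFiller` formula; this is illustrative code, not part of the repo):

```python
import math
import numpy as np

def bilinear_kernel(k):
    """Weights Caffe's "bilinear" filler produces for a k x k kernel,
    per its BilinearFiller: w[y, x] = (1 - |x/f - c|) * (1 - |y/f - c|)."""
    f = math.ceil(k / 2.0)
    c = (2 * f - 1 - f % 2) / (2.0 * f)
    w = np.zeros((k, k))
    for y in range(k):
        for x in range(k):
            w[y, x] = (1 - abs(x / f - c)) * (1 - abs(y / f - c))
    return w

# kernel_size 3, stride 1: the "upsample" layer for the gradient branch
w3 = bilinear_kernel(3)
# kernel_size 2, stride 2: the 2x "upsample-global" layer
w2 = bilinear_kernel(2)
```

Note that for `kernel_size: 2` the formula degenerates to a single nonzero tap, so the 2x upsampling effectively replicates each input pixel into its 2x2 output block's corner; larger odd kernels give genuine bilinear interpolation weights.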
================================================
FILE: source/refining_network/norm_abs_global_only/net_train.prototxt
================================================
name: "refining_network_norm_abs_global_only_train"
#INPUTS START HERE
#COLOR
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_74x54.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_74x54.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
#INPUTS END HERE
#GLOBAL NETWORK STARTS HERE
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 0
lr_mult: 0
}
param {
decay_mult: 0
lr_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 0
lr_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # keep the batch dimension from the bottom blob
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 outputs from fc-depth
}
}
}
layer {
name: "mvnDepth-global"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
#PREPROCESSING FOR THE REFINING NETWORK
layer {
name: "upsample-global"
type: "Deconvolution"
bottom: "depthMVN"
top: "est"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 2
stride: 2
num_output: 1
group: 1
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
#GLOBAL NETWORK ENDS HERE
#REFINING NETWORK STARTS HERE
layer {
name: "conv1-refine"
type: "Convolution"
bottom: "X"
top: "conv1-refine"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 2
pad: 2
weight_filler {
type: "constant"
# std: 0.001
value: 0
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-refine"
type: "ReLU"
bottom: "conv1-refine"
top: "conv1-refine"
}
layer {
name: "norm1-refine"
type: "LRN"
bottom: "conv1-refine"
top: "norm1-refine"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-refine"
type: "Pooling"
bottom: "norm1-refine"
top: "pool1-refine"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 1
}
}
layer {
name: "concat"
bottom: "pool1-refine"
bottom: "est"
top: "input-refine"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "conv2-refine"
type: "Convolution"
bottom: "input-refine"
top: "conv2-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu2-refine"
type: "ReLU"
bottom: "conv2-refine"
top: "conv2-refine"
}
layer {
name: "conv3-refine"
type: "Convolution"
bottom: "conv2-refine"
top: "conv3-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-refine"
type: "ReLU"
bottom: "conv3-refine"
top: "conv3-refine"
}
layer {
name: "conv4-refine"
type: "Convolution"
bottom: "conv3-refine"
top: "conv4-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-refine"
type: "ReLU"
bottom: "conv4-refine"
top: "conv4-refine"
}
layer {
name: "conv5-refine"
type: "Convolution"
bottom: "conv4-refine"
top: "depth-refine_"
param {
lr_mult: 0.01
decay_mult: 1
}
param {
lr_mult: 0.01
decay_mult: 0
}
convolution_param {
num_output: 1
pad: 1
kernel_size: 3
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "power-refine"
type: "Power"
bottom: "depth-refine_"
top: "depth-refine"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
#LOSSES
layer {
name: "mvnDepthRefine"
type: "MVN"
bottom: "depth-refine"
top: "depthMVN-refine"
}
layer {
name: "mvnGT"
type: "MVN"
bottom: "gt"
top: "gtMVN"
}
layer {
name: "lossMVNDepth"
type: "EuclideanLoss"
bottom: "depthMVN-refine"
bottom: "gtMVN"
top: "lossMVNDepth"
loss_weight: 0.5
}
layer {
name: "lossABSDepth"
type: "EuclideanLoss"
bottom: "depth-refine"
bottom: "gt"
top: "lossABSDepth"
loss_weight: 0.5
}
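Throughout these prototxts, the global branch ends with `fc-depth` producing 999 values per image, which the `Reshape` layer views as a 1 x 27 x 37 coarse depth map. A trivial NumPy illustration of that layer's semantics (illustrative only; `dim: 0` keeps the batch axis from the bottom blob):

```python
import numpy as np

# "fc-depth" emits 999 values per image; Reshape views them as 1 x 27 x 37
# (27 * 37 = 999), matching the downsampled depth resolution.
batch = 16
fc_depth = np.random.rand(batch, 999)
depth = fc_depth.reshape(batch, 1, 27, 37)
```

The reshaped map is then MVN-normalized and bilinearly upsampled to 54 x 74 before being concatenated with the refining branch's first-layer features.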
================================================
FILE: source/refining_network/sc-inv_abs/net_deploy.prototxt
================================================
name: "refining_network_sc-inv_abs_deploy"
#INPUTS
layer {
name: "data"
type: "Input"
top: "X"
input_param { shape: { dim: 1 dim: 3 dim: 218 dim: 298 } }
}
#GLOBAL NETWORK STARTS HERE
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 0
lr_mult: 0
}
param {
decay_mult: 0
lr_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 0
lr_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # keep the batch dimension from the bottom blob
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999 outputs from fc-depth
}
}
}
layer {
name: "mvnDepth-global"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
#GRADIENT NETWORK STARTS HERE
layer {
name: "conv1-grad"
type: "Convolution"
bottom: "X"
top: "conv1-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-grad"
type: "ReLU"
bottom: "conv1-grad"
top: "conv1-grad"
}
layer {
name: "norm1-grad"
type: "LRN"
bottom: "conv1-grad"
top: "norm1-grad"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-grad"
type: "Pooling"
bottom: "norm1-grad"
top: "pool1-grad"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1-grad"
top: "conv2-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2-grad"
type: "ReLU"
bottom: "conv2-grad"
top: "conv2-grad"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2-grad"
top: "conv3-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-grad"
type: "ReLU"
bottom: "conv3-grad"
top: "conv3-grad"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3-grad"
top: "conv4-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-grad"
type: "ReLU"
bottom: "conv4-grad"
top: "conv4-grad"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4-grad"
top: "grad_out"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "mvnGrad"
type: "MVN"
bottom: "grad_out"
top: "gradMVN"
}
#GRADIENT NETWORK ENDS HERE
#PREPROCESSING FOR THE REFINING NETWORK
layer {
name: "upsample"
type: "Deconvolution"
bottom: "gradMVN"
top: "grad-upsample"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 3
stride: 1
num_output: 2
group: 2
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
layer {
name: "concat-global"
bottom: "grad-upsample"
bottom: "depthMVN"
top: "global-output"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "upsample-global"
type: "Deconvolution"
bottom: "global-output"
top: "est"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 2
stride: 2
num_output: 3
group: 3
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
#GLOBAL NETWORK ENDS HERE
#REFINING NETWORK STARTS HERE
layer {
name: "conv1-refine"
type: "Convolution"
bottom: "X"
top: "conv1-refine"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 2
pad: 2
weight_filler {
type: "constant"
# std: 0.001
value: 0
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-refine"
type: "ReLU"
bottom: "conv1-refine"
top: "conv1-refine"
}
layer {
name: "norm1-refine"
type: "LRN"
bottom: "conv1-refine"
top: "norm1-refine"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-refine"
type: "Pooling"
bottom: "norm1-refine"
top: "pool1-refine"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 1
}
}
layer {
name: "concat"
bottom: "pool1-refine"
bottom: "est"
top: "input-refine"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "conv2-refine"
type: "Convolution"
bottom: "input-refine"
top: "conv2-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu2-refine"
type: "ReLU"
bottom: "conv2-refine"
top: "conv2-refine"
}
layer {
name: "conv3-refine"
type: "Convolution"
bottom: "conv2-refine"
top: "conv3-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-refine"
type: "ReLU"
bottom: "conv3-refine"
top: "conv3-refine"
}
layer {
name: "conv4-refine"
type: "Convolution"
bottom: "conv3-refine"
top: "conv4-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-refine"
type: "ReLU"
bottom: "conv4-refine"
top: "conv4-refine"
}
layer {
name: "conv5-refine"
type: "Convolution"
bottom: "conv4-refine"
top: "depth-refine_"
param {
lr_mult: 0.01
decay_mult: 1
}
param {
lr_mult: 0.01
decay_mult: 0
}
convolution_param {
num_output: 1
pad: 1
kernel_size: 3
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "power-refine"
type: "Power"
bottom: "depth-refine_"
top: "depth-refine"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
================================================
FILE: source/refining_network/sc-inv_abs/net_train.prototxt
================================================
name: "refining_network_sc-inv_abs_train"
#INPUTS START HERE
#COLOR
layer {
name: "train_color"
type: "Data"
top: "X"
data_param {
source: "train_raw2_lmdb/train_raw2_color_298x218.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
mean_value: 127
}
include {
phase: TRAIN
}
}
layer {
name: "train_depth"
type: "Data"
top: "gt"
data_param {
source: "train_raw2_lmdb/train_raw2_depth_74x54.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
scale: 0.00390625
}
include {
phase: TRAIN
}
}
layer {
name: "test_color"
type: "Data"
top: "X"
data_param {
source: "test_lmdb/test_color_298x218.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
mean_value: 127
}
include {
phase: TEST
}
}
layer {
name: "test_depth"
type: "Data"
top: "gt"
data_param {
source: "test_lmdb/test_depth_74x54.lmdb"
backend: LMDB
batch_size: 16
}
transform_param {
scale: 0.00390625
}
include {
phase: TEST
}
}
#INPUTS END HERE
#GLOBAL NETWORK STARTS HERE
layer {
name: "conv1"
type: "Convolution"
bottom: "X"
top: "conv1"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1"
type: "ReLU"
bottom: "conv1"
top: "conv1"
}
layer {
name: "norm1"
type: "LRN"
bottom: "conv1"
top: "norm1"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1"
type: "Pooling"
bottom: "norm1"
top: "pool1"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
layer {
name: "conv2"
type: "Convolution"
bottom: "pool1"
top: "conv2"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2"
type: "ReLU"
bottom: "conv2"
top: "conv2"
}
layer {
name: "norm2"
type: "LRN"
bottom: "conv2"
top: "norm2"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "conv3"
type: "Convolution"
bottom: "norm2"
top: "conv3"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3"
type: "ReLU"
bottom: "conv3"
top: "conv3"
}
layer {
name: "conv4"
type: "Convolution"
bottom: "conv3"
top: "conv4"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4"
type: "ReLU"
bottom: "conv4"
top: "conv4"
}
layer {
name: "conv5"
type: "Convolution"
bottom: "conv4"
top: "conv5"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 1
kernel_size: 3
group: 2
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu5"
type: "ReLU"
bottom: "conv5"
top: "conv5"
}
layer {
name: "pool5"
type: "Pooling"
bottom: "conv5"
top: "pool5"
pooling_param {
pool: MAX
kernel_size: 3
stride: 2
}
}
# MAIN
layer {
name: "fc-main"
type: "InnerProduct"
bottom: "pool5"
top: "fc-main"
param {
decay_mult: 0
lr_mult: 0
}
param {
decay_mult: 0
lr_mult: 0
}
inner_product_param {
num_output: 1024
weight_filler {
type: "xavier"
std: 0.005
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu6"
type: "ReLU"
bottom: "fc-main"
top: "fc-main"
}
layer {
name: "fc-depth"
type: "InnerProduct"
bottom: "fc-main"
top: "fc-depth"
param {
decay_mult: 0
lr_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
inner_product_param {
num_output: 999
weight_filler {
type: "gaussian"
std: 0.001
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "reshape"
type: "Reshape"
bottom: "fc-depth"
top: "depth"
reshape_param {
shape {
dim: 0 # copy the batch dimension from the bottom blob
dim: 1
dim: 27
dim: 37 # 27 x 37 = 999, matching fc-depth's num_output
}
}
}
layer {
name: "mvnDepth-global"
type: "MVN"
bottom: "depth"
top: "depthMVN"
}
#GRADIENT NETWORK STARTS HERE
layer {
name: "conv1-grad"
type: "Convolution"
bottom: "X"
top: "conv1-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 4
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-grad"
type: "ReLU"
bottom: "conv1-grad"
top: "conv1-grad"
}
layer {
name: "norm1-grad"
type: "LRN"
bottom: "conv1-grad"
top: "norm1-grad"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-grad"
type: "Pooling"
bottom: "norm1-grad"
top: "pool1-grad"
pooling_param {
pool: MAX
kernel_size: 4
stride: 2
}
}
layer {
name: "conv2-grad"
type: "Convolution"
bottom: "pool1-grad"
top: "conv2-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 256
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu2-grad"
type: "ReLU"
bottom: "conv2-grad"
top: "conv2-grad"
}
layer {
name: "conv3-grad"
type: "Convolution"
bottom: "conv2-grad"
top: "conv3-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-grad"
type: "ReLU"
bottom: "conv3-grad"
top: "conv3-grad"
}
layer {
name: "conv4-grad"
type: "Convolution"
bottom: "conv3-grad"
top: "conv4-grad"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 384
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-grad"
type: "ReLU"
bottom: "conv4-grad"
top: "conv4-grad"
}
layer {
name: "conv5-grad"
type: "Convolution"
bottom: "conv4-grad"
top: "grad_out"
param {
lr_mult: 0
decay_mult: 0
}
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
num_output: 2
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "mvnGrad"
type: "MVN"
bottom: "grad_out"
top: "gradMVN"
}
#GRADIENT NETWORK ENDS HERE
#PREPROCESSING FOR THE REFINING NETWORK
layer {
name: "upsample"
type: "Deconvolution"
bottom: "gradMVN"
top: "grad-upsample"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 3
stride: 1
num_output: 2
group: 2
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
layer {
name: "concat-global"
bottom: "grad-upsample"
bottom: "depthMVN"
top: "global-output"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "upsample-global"
type: "Deconvolution"
bottom: "global-output"
top: "est"
param {
lr_mult: 0
decay_mult: 0
}
convolution_param {
kernel_size: 2
stride: 2
num_output: 3
group: 3
pad: 0
weight_filler: {
type: "bilinear"
}
bias_term: false
}
}
#GLOBAL NETWORK ENDS HERE
#REFINING NETWORK STARTS HERE
layer {
name: "conv1-refine"
type: "Convolution"
bottom: "X"
top: "conv1-refine"
param {
lr_mult: 0.001
decay_mult: 1
}
param {
lr_mult: 0.001
decay_mult: 0
}
convolution_param {
num_output: 96
kernel_size: 11
stride: 2
pad: 2
weight_filler {
type: "constant"
# std: 0.001
value: 0
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu1-refine"
type: "ReLU"
bottom: "conv1-refine"
top: "conv1-refine"
}
layer {
name: "norm1-refine"
type: "LRN"
bottom: "conv1-refine"
top: "norm1-refine"
lrn_param {
local_size: 5
alpha: 0.0001
beta: 0.75
}
}
layer {
name: "pool1-refine"
type: "Pooling"
bottom: "norm1-refine"
top: "pool1-refine"
pooling_param {
pool: MAX
kernel_size: 2
stride: 2
pad: 1
}
}
layer {
name: "concat"
bottom: "pool1-refine"
bottom: "est"
top: "input-refine"
type: "Concat"
concat_param {
axis: 1
}
}
layer {
name: "conv2-refine"
type: "Convolution"
bottom: "input-refine"
top: "conv2-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 1
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.0
}
}
}
layer {
name: "relu2-refine"
type: "ReLU"
bottom: "conv2-refine"
top: "conv2-refine"
}
layer {
name: "conv3-refine"
type: "Convolution"
bottom: "conv2-refine"
top: "conv3-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0
}
}
}
layer {
name: "relu3-refine"
type: "ReLU"
bottom: "conv3-refine"
top: "conv3-refine"
}
layer {
name: "conv4-refine"
type: "Convolution"
bottom: "conv3-refine"
top: "conv4-refine"
param {
lr_mult: 1
decay_mult: 1
}
param {
lr_mult: 1
decay_mult: 0
}
convolution_param {
num_output: 64
pad: 2
kernel_size: 5
group: 2
weight_filler {
type: "xavier"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.1
}
}
}
layer {
name: "relu4-refine"
type: "ReLU"
bottom: "conv4-refine"
top: "conv4-refine"
}
layer {
name: "conv5-refine"
type: "Convolution"
bottom: "conv4-refine"
top: "depth-refine_"
param {
lr_mult: 0.01
decay_mult: 1
}
param {
lr_mult: 0.01
decay_mult: 0
}
convolution_param {
num_output: 1
pad: 1
kernel_size: 3
group: 1
weight_filler {
type: "gaussian"
std: 0.01
}
bias_filler {
type: "constant"
value: 0.5
}
}
}
layer {
name: "power-refine"
type: "Power"
bottom: "depth-refine_"
top: "depth-refine"
power_param {
power: 1
scale: 0.01
shift: 0
}
}
#LOSSES
layer {
name: "log"
type: "Log"
bottom: "gt"
top: "lnGt"
log_param {
shift: 0.00392156863
scale: 0.996078431
}
}
layer {
name: "power"
type: "Power"
bottom: "lnGt"
top: "logGt"
power_param {
power: 1
scale: 0.179581
shift: 1.0
}
}
layer {
name: "mvnDepthRefine"
type: "MVN"
bottom: "depth-refine"
top: "depthMVN-refine"
mvn_param{normalize_variance: false}
}
layer {
name: "mvnGT"
type: "MVN"
bottom: "logGt"
top: "gtMVN"
mvn_param{normalize_variance: false}
}
layer {
name: "lossMVNDepth"
type: "EuclideanLoss"
bottom: "depthMVN-refine"
bottom: "gtMVN"
top: "lossMVNDepth"
loss_weight: 0.5
}
layer {
name: "lossABSDepth"
type: "EuclideanLoss"
bottom: "depth-refine"
bottom: "logGt"
top: "lossABSDepth"
loss_weight: 0.5
}
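The Log/Power pair in the loss section encodes ground-truth depth roughly as `0.179581 * ln(depth) + 1` (the Log layer's small shift/scale just guards against `ln(0)` after the 1/256 LMDB scaling), and `test_depth.py` inverts it with `exp((x - 1) / 0.179581)` when `--log` is set. A minimal sketch of that round trip, ignoring the guard terms (function names hypothetical):

```python
import math

C = 0.179581  # scale constant shared by the Power layers and the test scripts

def encode_log_depth(depth):
    """Approximate forward transform of the Log/Power pair: C * ln(depth) + 1."""
    return C * math.log(depth) + 1.0

def decode_log_depth(x):
    """Inverse applied in test_depth.py under --log: exp((x - 1) / C)."""
    return math.exp((x - 1.0) / C)

# The round trip should recover the original depth value.
d = 3.5
assert abs(decode_log_depth(encode_log_depth(d)) - d) < 1e-9
```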
================================================
FILE: source/refining_network/solver.prototxt
================================================
net: "net_train.prototxt"
test_iter: 50
test_interval: 500
base_lr: 0.000025
gamma: 0.5
stepsize: 100000
momentum: 0.9
weight_decay: 0.005
lr_policy: "fixed"
display: 100
max_iter: 100000
snapshot: 10000
snapshot_prefix: "snaps/"
solver_mode: GPU
debug_info: 1
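Note that with `lr_policy: "fixed"` the `gamma` and `stepsize` fields above are inert: the rate stays at `base_lr` for the whole run. Had the policy been `"step"`, Caffe would decay the rate as `base_lr * gamma^floor(iter / stepsize)` — though with `stepsize` equal to `max_iter`, no decay would occur during training anyway. A quick pure-Python sketch of both schedules, using this solver's values:

```python
BASE_LR, GAMMA, STEPSIZE = 0.000025, 0.5, 100000

def lr_fixed(it):
    # "fixed" policy: constant learning rate, gamma/stepsize ignored.
    return BASE_LR

def lr_step(it):
    # "step" policy: multiply by GAMMA once per STEPSIZE iterations.
    return BASE_LR * GAMMA ** (it // STEPSIZE)

assert lr_fixed(99999) == BASE_LR
assert lr_step(99999) == BASE_LR            # still inside the first step
assert lr_step(100000) == BASE_LR * GAMMA   # first decay, at max_iter
```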
================================================
FILE: source/refining_network/test_depth.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
from __future__ import print_function
import numpy as np
import matplotlib.pyplot as plt
import sys
from PIL import Image
import cv2
import cv
import os.path
os.environ['GLOG_minloglevel'] = '2'
import caffe
import scipy.ndimage
import argparse
import operator
import shutil
from eval_depth import Test, PrintTop5, LogDepth
WIDTH = 298
HEIGHT = 218
OUT_WIDTH = 74
OUT_HEIGHT = 54
GT_WIDTH = 420
GT_HEIGHT = 320
def testNet(net, img):
net.blobs['X'].data[...] = img
net.forward()
output = net.blobs['depth-refine'].data
output = np.reshape(output, (1,1,OUT_HEIGHT, OUT_WIDTH))
return output
def loadImage(path, channels, width, height):
img = caffe.io.load_image(path)
img = caffe.io.resize(img, (height, width, channels))
img = np.transpose(img, (2,0,1))
img = np.reshape(img, (1,channels,height,width))
return img
def printImage(img, name, channels, width, height):
params = list()
params.append(cv.CV_IMWRITE_PNG_COMPRESSION)
params.append(8)
imgnp = np.reshape(img, (height,width, channels))
imgnp = np.array(imgnp * 255, dtype = np.uint8)
cv2.imwrite(name, imgnp, params)
def eval(out, gt, rawResults):
linearGT = gt * 10.0
linearOut = out * 10.0
rawResults = [x + y for x, y in zip(rawResults, Test(linearOut, linearGT))]
return rawResults
def ProcessToOutput(depth):
depth = np.clip(depth, 0.001, 1000)
return np.clip(2 * 0.179581 * np.log(depth) + 1, 0, 1)
caffe.set_mode_cpu()
parser = argparse.ArgumentParser()
parser.add_argument("input_dir", help="directory with input images")
parser.add_argument("gt_dir", help="directory with ground truths")
parser.add_argument("output", help="folder to output to")
parser.add_argument("snaps", help="folder with snapshots to use")
parser.add_argument('--log', action='store_true', default=False)
args = parser.parse_args()
try:
os.mkdir(args.output)
except OSError:
print ('Output directory already exists, not creating a new one')
try:
os.mkdir(args.output + "_abs")
except OSError:
print ('Output directory already exists, not creating a new one')
fileCount = len([name for name in os.listdir(args.input_dir)])
results = [dict() for x in range(10)]
for snapshot in os.listdir(args.snaps):
if not snapshot.endswith("caffemodel"):
continue
currentSnapDir = snapshot.replace(".caffemodel","")
if os.path.exists(args.output + "/" + currentSnapDir):
shutil.rmtree(args.output + "/" + currentSnapDir)
if os.path.exists(args.output + "_abs/" + currentSnapDir):
shutil.rmtree(args.output + "_abs/" + currentSnapDir)
os.mkdir(args.output + "/" + currentSnapDir)
os.mkdir(args.output + "_abs/" + currentSnapDir)
print(currentSnapDir)
sys.stdout.flush()
netFile = snapshot.replace(".caffemodel",".prototxt")
net = caffe.Net(args.snaps + '/' + netFile, args.snaps + '/' + snapshot, caffe.TEST)
rawResults = np.zeros((10))
for count, file in enumerate(os.listdir(args.input_dir)):
out_string = str(count) + '/' + str(fileCount) + ': ' + file
sys.stdout.write('%s\r' % out_string)
sys.stdout.flush()
inputFileName = file
inputFilePath = args.input_dir + '/' + inputFileName
gtFileName = file.replace('colors','depth')
gtFilePath = args.gt_dir + '/' + gtFileName
gt = loadImage(gtFilePath, 1, GT_WIDTH, GT_HEIGHT)
input = loadImage(inputFilePath, 3, WIDTH, HEIGHT)
input *= 255
input -= 127
output = testNet(net, input)
if args.log:
output = np.exp((output - 1) / 0.179581)
outWidth = OUT_WIDTH
outHeight = OUT_HEIGHT
scaleW = float(GT_WIDTH) / float(OUT_WIDTH)
scaleH = float(GT_HEIGHT) / float(OUT_HEIGHT)
output = scipy.ndimage.zoom(output, (1,1,scaleH,scaleW), order=3)
outWidth = int(outWidth * scaleW)
outHeight = int(outHeight * scaleH)
rawResults = eval(output, gt, rawResults)
input += 127
input = input / 255.0
input = np.transpose(input, (0,2,3,1))
input = input[:,:,:,(2,1,0)]
absOutput = output.copy()
output -= output.mean()
output /= output.std()
output *= gt.std()
output += gt.mean()
gt = ProcessToOutput(gt)
output = ProcessToOutput(output)
absOutput = ProcessToOutput(absOutput)
filename = os.path.splitext(os.path.basename(inputFileName))[0]
filePath = args.output + '/' + currentSnapDir + '/' + filename + '.png'
filePathAbs = args.output + '_abs/' + currentSnapDir + '/' + filename + '.png'
printImage(input, filePath, 3, WIDTH, HEIGHT)
printImage(input, filePathAbs, 3, WIDTH, HEIGHT)
printImage(output, filePath.replace('_colors','_depth'), 1, outWidth, outHeight)
printImage(absOutput, filePathAbs.replace('_colors','_depth'), 1, outWidth, outHeight)
printImage(gt, filePath.replace('_colors', '_gt'), 1, outWidth, outHeight)
printImage(gt, filePathAbs.replace('_colors', '_gt'), 1, outWidth, outHeight)
rawResults[:] = [x / fileCount for x in rawResults]
for i in xrange(10):
results[i][currentSnapDir] = rawResults[i]
titles = ["AbsRelDiff", "SqrRelDiff", "RMSE", "RMSELog", "SIMSE", "Log10", "MVN", "Threshold 1.25","Threshold 1.25^2", "Threshold 1.25^3"]
for i in xrange(10):
results[i] = sorted(results[i].items(), key=operator.itemgetter(1))
if i > 6:
results[i] = list(reversed(results[i]))
PrintTop5(titles[i], results[i])
================================================
FILE: source/refining_network/train.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
import numpy as np
import cv2
import cv
import caffe
from caffe.proto import caffe_pb2
import sys
from google.protobuf import text_format
import argparse
caffe.set_mode_gpu()
solver = caffe.get_solver('solver.prototxt')
# NOTE: the path_to_* variables below are placeholders left by the author;
# point them at the trained global context / gradient network files first.
solver.net.copy_from(path_to_global_context_network_caffemodel)
gradPart = caffe.Net(path_to_gradient_network_definition_file, path_to_gradient_network_caffemodel, caffe.TEST)
params = gradPart.params.keys()
source_params = {pr: (gradPart.params[pr][0].data, gradPart.params[pr][1].data) for pr in params}
target_params = {pr: (solver.net.params[pr][0].data, solver.net.params[pr][1].data) for pr in params}
for pr in params:
if pr == 'conv1':
solver.net.params['conv1-grad'][1].data[...] = source_params[pr][1]  # biases
solver.net.params['conv1-grad'][0].data[...] = source_params[pr][0]  # weights
else:
target_params[pr][1][...] = source_params[pr][1]  # biases
target_params[pr][0][...] = source_params[pr][0]  # weights
alexNet = caffe.Net(path_to_gradient_network_definition_file, 'bvlc_alexnet.caffemodel', caffe.TEST)
solver.net.params['conv1-refine'][1].data[...] = alexNet.params['conv1'][1].data #biases
solver.net.params['conv1-refine'][0].data[...] = alexNet.params['conv1'][0].data #weights
solver.solve()
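The weight transfer in train.py is standard Caffe net surgery: parameter blobs are copied between nets whose layer names match, with `conv1` specially remapped to `conv1-grad`. The renaming logic can be sketched independently of Caffe — plain dicts stand in for `net.params`, and the blob values are hypothetical stand-ins:

```python
# Dicts of {layer_name: (weights, biases)} stand in for Caffe's net.params.
source = {'conv1': ('w1', 'b1'), 'conv2': ('w2', 'b2')}
target = {'conv1-grad': (None, None), 'conv2': (None, None)}

# Copy every source layer; 'conv1' lands in the renamed 'conv1-grad' slot.
for name, blobs in source.items():
    dest = 'conv1-grad' if name == 'conv1' else name
    target[dest] = blobs

assert target['conv1-grad'] == ('w1', 'b1')
assert target['conv2'] == ('w2', 'b2')
```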
================================================
FILE: test_depth.py
================================================
#!/usr/bin/env python
# Master's Thesis - Depth Estimation by Convolutional Neural Networks
# Jan Ivanecky; xivane00@stud.fit.vutbr.cz
from __future__ import print_function
import numpy as np
import matplotlib.pyplot as plt
import sys
from PIL import Image
import cv2
import cv
import os.path
os.environ['GLOG_minloglevel'] = '2'
import caffe
import scipy.ndimage
import argparse
import operator
import shutil
from eval_depth import Test, PrintTop5, LogDepth
WIDTH = 298
HEIGHT = 218
OUT_WIDTH = 74
OUT_HEIGHT = 54
GT_WIDTH = 420
GT_HEIGHT = 320
def testNet(net, img):
net.blobs['X'].data[...] = img
net.forward()
output = net.blobs['depth-refine'].data
output = np.reshape(output, (1,1,OUT_HEIGHT, OUT_WIDTH))
return output
def loadImage(path, channels, width, height):
img = caffe.io.load_image(path)
img = caffe.io.resize(img, (height, width, channels))
img = np.transpose(img, (2,0,1))
img = np.reshape(img, (1,channels,height,width))
return img
def printImage(img, name, channels, width, height):
params = list()
params.append(cv.CV_IMWRITE_PNG_COMPRESSION)
params.append(8)
imgnp = np.reshape(img, (height,width, channels))
imgnp = np.array(imgnp * 255, dtype = np.uint8)
cv2.imwrite(name, imgnp, params)
def eval(out, gt, rawResults):
linearGT = gt * 10.0
linearOut = out * 10.0
rawResults = [x + y for x, y in zip(rawResults, Test(linearOut, linearGT))]
return rawResults
def ProcessToOutput(depth):
depth = np.clip(depth, 0.001, 1000)
return np.clip(2 * 0.179581 * np.log(depth) + 1, 0, 1)
caffe.set_mode_cpu()
parser = argparse.ArgumentParser()
parser.add_argument("input_dir", help="directory with input images")
parser.add_argument("gt_dir", help="directory with ground truths")
parser.add_argument("output", help="folder to output to")
parser.add_argument("snaps", help="folder with snapshots to use")
parser.add_argument('--log', action='store_true', default=False)
args = parser.parse_args()
try:
os.mkdir(args.output)
except OSError:
print ('Output directory already exists, not creating a new one')
try:
os.mkdir(args.output + "_abs")
except OSError:
print ('Output directory already exists, not creating a new one')
fileCount = len([name for name in os.listdir(args.input_dir)])
results = [dict() for x in range(10)]
for snapshot in os.listdir(args.snaps):
if not snapshot.endswith("caffemodel"):
continue
currentSnapDir = snapshot.replace(".caffemodel","")
if os.path.exists(args.output + "/" + currentSnapDir):
shutil.rmtree(args.output + "/" + currentSnapDir)
if os.path.exists(args.output + "_abs/" + currentSnapDir):
shutil.rmtree(args.output + "_abs/" + currentSnapDir)
os.mkdir(args.output + "/" + currentSnapDir)
os.mkdir(args.output + "_abs/" + currentSnapDir)
print(currentSnapDir)
sys.stdout.flush()
netFile = snapshot.replace(".caffemodel",".prototxt")
net = caffe.Net(args.snaps + '/' + netFile, args.snaps + '/' + snapshot, caffe.TEST)
rawResults = np.zeros((10))
for count, file in enumerate(os.listdir(args.input_dir)):
out_string = str(count) + '/' + str(fileCount) + ': ' + file
sys.stdout.write('%s\r' % out_string)
sys.stdout.flush()
inputFileName = file
inputFilePath = args.input_dir + '/' + inputFileName
gtFileName = file.replace('colors','depth')
gtFilePath = args.gt_dir + '/' + gtFileName
gt = loadImage(gtFilePath, 1, GT_WIDTH, GT_HEIGHT)
input = loadImage(inputFilePath, 3, WIDTH, HEIGHT)
input *= 255
input -= 127
output = testNet(net, input)
if args.log:
output = np.exp((output - 1) / 0.179581)
outWidth = OUT_WIDTH
outHeight = OUT_HEIGHT
scaleW = float(GT_WIDTH) / float(OUT_WIDTH)
scaleH = float(GT_HEIGHT) / float(OUT_HEIGHT)
output = scipy.ndimage.zoom(output, (1,1,scaleH,scaleW), order=3)
outWidth = int(outWidth * scaleW)
outHeight = int(outHeight * scaleH)
rawResults = eval(output, gt, rawResults)
input += 127
input = input / 255.0
input = np.transpose(input, (0,2,3,1))
input = input[:,:,:,(2,1,0)]
absOutput = output.copy()
output -= output.mean()
output /= output.std()
output *= gt.std()
output += gt.mean()
gt = ProcessToOutput(gt)
output = ProcessToOutput(output)
absOutput = ProcessToOutput(absOutput)
filename = os.path.splitext(os.path.basename(inputFileName))[0]
filePath = args.output + '/' + currentSnapDir + '/' + filename + '.png'
filePathAbs = args.output + '_abs/' + currentSnapDir + '/' + filename + '.png'
printImage(input, filePath, 3, WIDTH, HEIGHT)
printImage(input, filePathAbs, 3, WIDTH, HEIGHT)
printImage(output, filePath.replace('_colors','_depth'), 1, outWidth, outHeight)
printImage(absOutput, filePathAbs.replace('_colors','_depth'), 1, outWidth, outHeight)
printImage(gt, filePath.replace('_colors', '_gt'), 1, outWidth, outHeight)
printImage(gt, filePathAbs.replace('_colors', '_gt'), 1, outWidth, outHeight)
rawResults[:] = [x / fileCount for x in rawResults]
for i in xrange(10):
results[i][currentSnapDir] = rawResults[i]
titles = ["AbsRelDiff", "SqrRelDiff", "RMSE", "RMSELog", "SIMSE", "Log10", "MVN", "Threshold 1.25","Threshold 1.25^2", "Threshold 1.25^3"]
for i in xrange(10):
results[i] = sorted(results[i].items(), key=operator.itemgetter(1))
if i > 6:
results[i] = list(reversed(results[i]))
PrintTop5(titles[i], results[i])
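Before visualization, test_depth.py re-anchors the prediction to the ground truth's statistics: it removes the prediction's own mean and standard deviation, then applies the ground truth's. That distribution transfer can be isolated as a small numpy helper (function name hypothetical):

```python
import numpy as np

def match_mean_std(pred, ref):
    """Shift/scale pred so its mean and std match ref, as in test_depth.py."""
    out = pred - pred.mean()   # zero-mean
    out /= pred.std()          # unit variance
    out *= ref.std()           # take on the reference spread
    out += ref.mean()          # and the reference mean
    return out

pred = np.array([1.0, 2.0, 3.0, 4.0])
ref = np.array([10.0, 20.0, 30.0, 40.0])
matched = match_mean_std(pred, ref)
assert abs(matched.mean() - ref.mean()) < 1e-9
assert abs(matched.std() - ref.std()) < 1e-9
```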