Repository: aurooj/Hand-Segmentation-in-the-Wild
Branch: master
Commit: 2a8dc04394d5
Files: 26
Total size: 89.5 KB

Directory structure:
gitextract_b8gevuvl/
├── README.md
├── _config.yml
├── data/
│   └── readme.txt
├── egohands+.md
├── license.txt
├── matlab_scripts/
│   ├── getObjectForHand.m
│   ├── load_generate_gt_imgs2.m
│   ├── load_generate_gt_imgs_hands_objects2.m
│   ├── load_generate_gt_imgs_objects_only.m
│   └── reshapeAreaCoords.m
└── refinenet_files/
    ├── computeAccuracyEYTH.m
    ├── computeAccuracyEgoHands.m
    ├── computeAccuracyGTEA.m
    ├── computeAccuracyHoF.m
    ├── demo_refinenet_test_example_egohands.m
    ├── demo_refinenet_test_example_eyth.m
    ├── demo_refinenet_test_example_gtea.m
    ├── demo_refinenet_test_example_hof.m
    ├── gen_class_info_ego.m
    ├── gen_class_info_gtea.m
    ├── getIOU.m
    ├── multiscale_evaluation.m
    ├── my_gen_ds_info_egohands.m
    ├── my_gen_ds_info_eyth.m
    ├── my_gen_ds_info_gtea.m
    └── my_gen_ds_info_hof.m

================================================
FILE CONTENTS
================================================

================================================
FILE: README.md
================================================

## Analysis of Hand Segmentation in the Wild

### Abstract
A large number of works in egocentric vision have concentrated on action and object recognition. Detection and segmentation of hands in first-person videos, however, has been explored less. For many applications in this domain, it is necessary to accurately segment not only the hands of the camera wearer but also the hands of others with whom they are interacting. Here, we take an in-depth look at the hand segmentation problem. In the quest for robust hand segmentation methods, we evaluated the performance of state-of-the-art semantic segmentation methods, both off-the-shelf and fine-tuned, on existing datasets. We fine-tune RefineNet, a leading semantic segmentation method, for hand segmentation and find that it does much better than the best contenders. Existing hand segmentation datasets are collected in laboratory settings. To overcome this limitation, we contribute two new datasets: a) EgoYouTubeHands, consisting of egocentric videos containing hands in the wild, and b) HandOverFace, for analyzing the performance of our models in the presence of similar-appearance occlusions. We further explore whether conditional random fields can help refine the generated hand segmentations. To demonstrate the benefit of accurate hand maps, we train a CNN for hand-based activity recognition and achieve higher accuracy when the CNN is trained using hand maps produced by the fine-tuned RefineNet. Finally, we annotate a subset of the EgoHands dataset for fine-grained action recognition and show that an accuracy of 58.6% can be achieved by just looking at a single hand pose, which is much better than the chance level (12.5%).

* [[Paper]](http://openaccess.thecvf.com/content_cvpr_2018/papers/Urooj_Analysis_of_Hand_CVPR_2018_paper.pdf)
* [[Project]](https://aurooj.github.io/Hand-Segmentation-in-the-Wild/)

### Code
We have uploaded the additional files needed to train, test, and evaluate our models' performance. Code for multiscale evaluation is also provided. See the folder ```refinenet_files```.

To test the models:
* Download the refinenet code from the authors' [github repository](https://github.com/guosheng/refinenet).
* Copy the files provided in the ```refinenet_files``` folder to the ```refinenet/main``` folder.
* Place the refinenet-based hand segmentation model (see the Models section) in the ```refinenet/model_trained``` folder.
* For instance, to test the model trained on the EgoHands dataset, copy the ```refinenet_res101_egohands.mat``` file into the ```refinenet/model_trained``` folder. Set the path to the test images folder in ```demo_refinenet_test_example_egohands.m``` and run the script.
* The demo code is the same as the original refinenet demo files except for minor changes.

### Models
You can download our refinenet-based hand segmentation models using the links given below:
* [refinenet_res101_egohands.mat](https://drive.google.com/file/d/1u7yGIafopsn_w-RHGt1wzO-8XgmL-1zu/view?usp=sharing)
* [refinenet_res101_eyth.mat](https://drive.google.com/file/d/12HRYXdHWOGkl71QqUdlijCq2w2ARa6-M/view?usp=sharing)
* [refinenet_res101_gtea.mat](https://drive.google.com/file/d/1yCnpTpBuBF8wYoM4_E1o8dAWjFS0BkxM/view?usp=sharing)
* [refinenet_res101_hof.mat](https://drive.google.com/file/d/1AOY8EQ9LRNYFusgFxHEhE_fAifNdsayh/view?usp=sharing)

### Datasets
We used 4 hand segmentation datasets in our work, two of which (the EgoYouTubeHands and HandOverFace datasets) were collected as part of our contribution:
* [EgoHands dataset](http://vision.soic.indiana.edu/projects/egohands/)
* EgoYouTubeHands (EYTH) dataset [[download]](https://drive.google.com/file/d/1EwjJx-V-Gq7NZtfiT6LZPLGXD2HN--qT/view?usp=sharing)
* [GTEA dataset](http://www.cbi.gatech.edu/fpv/)
* HandOverFace (HOF) dataset [[download]](https://drive.google.com/open?id=1hHUvINGICvOGcaDgA5zMbzAIUv7ewDd3)

#### Warning!
Thanks to [Rafael Redondo Tejedor](https://github.com/valillon), who pointed out some minor mistakes in the dataset:
* For the HandOverFace dataset, the images 216.jpg and 221.jpg in the original-size folder are actually GIFs.
* There were minor annotation errors in the xml files for the images 10.jpg and 225.jpg, which were pointed out and corrected by Rafael Redondo Tejedor.
* The current link to the dataset has updated xml files for the above-mentioned annotation errors.

#### NEW! 02/23/2021!
Hand masks for GTEA (cropped at the wrist) have been uploaded under the [data](https://github.com/aurooj/Hand-Segmentation-in-the-Wild/tree/master/data) directory.

Links to the videos used for the EYTH dataset are given below. Each video is 3-6 minutes long. We cleaned the dataset before annotation and discarded unnecessary frames (e.g., frames containing text, or frames where hands were out of view for a long time).

[![vid4](http://img.youtube.com/vi/dYZm7jB9YA4/0.jpg)](https://www.youtube.com/watch?v=dYZm7jB9YA4&feature=youtu.be&hd=1 "vid4")
[![vid6](http://img.youtube.com/vi/5RTJ4dymKfo/0.jpg)](https://www.youtube.com/watch?v=5RTJ4dymKfo&feature=youtu.be&hd=1 "vid6")
[![vid9](http://img.youtube.com/vi/vG9vfjdcmRw/0.jpg)](https://www.youtube.com/watch?v=vG9vfjdcmRw&feature=youtu.be&hd=1 "vid9")

#### NEW!
The test set for the HandOverFace dataset is uploaded [here](https://drive.google.com/file/d/1-OmtqYBVmAstCzOKz8xpatw6lk9hUm--/view?usp=sharing).

Example images from the EgoYouTubeHands dataset:
![EYTH](images/eyth.jpg)

Example images from the HandOverFace dataset:
![HOF](images/hof.jpg)

* **EgoHands+** dataset: To study fine-level action recognition, we provide additional annotations for a subset of the EgoHands dataset. You can find more details [here](https://github.com/aurooj/Hand-Segmentation-in-the-Wild/blob/master/egohands%2B.md) and download the dataset from this [download](https://drive.google.com/file/d/1WwGNsOhjk3hIKEnDoCKplFvMNroMCxtZ/view?usp=sharing) link. A minimal sketch of parsing its hand-attribute format follows below.
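For reference, here is a minimal sketch (not part of the released scripts; the sample string and variable names are illustrative) of splitting the EgoHands+ hand-attribute format 'hand_type,actions,object,subject' documented in ```egohands+.md```:

```matlab
% Illustrative only: split one EgoHands+ attribute string into its fields.
attr  = 'left,picking,placing,cards,other';  % ambiguous pose with two actions
parts = strsplit(attr, ',');
hand_type = parts{1};        % 'left' or 'right'
actions   = parts(2:end-2);  % one or more action labels, e.g. {'picking','placing'}
object    = parts{end-1};    % 'cards', 'jenga_block', 'chess_piece', 'puzzle_piece', or '_'
subject   = parts{end};      % 'first_person' or 'other'
```

The same field positions are assumed by ```getObjectForHand.m``` in ```matlab_scripts```.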
## Results

### Qualitative Results
Hand segmentation results for all datasets:
![All datasets:](images/crfs.jpg)

### CVPR Poster
![cvpr-poster](images/cvpr2018-AUK.jpg)

### License
We only provide the annotations for the videos in EYTH and the images in HOF. The copyright to the videos belongs to YouTube, and the images are collected from the internet. The license for our work is in the license.txt file.

### Acknowledgements
We would like to thank undergraduate students Cristopher Matos and Jose-Valentin Sera-Josef, and MS student Shiven Goyal, for helping us with data annotation.

### Citation
If this work and/or the datasets are useful for your research, please cite our paper.

### Questions?
Please contact 'aishaurooj@gmail.com'

================================================
FILE: _config.yml
================================================
theme: jekyll-theme-hacker

================================================
FILE: data/readme.txt
================================================
gtea_cropped.zip contains the cropped hand masks for the GTEA train and test sets (see section 3.1.3 in the paper for details).
We used the LabelMe tool to draw bounding boxes around the hand masks provided by the GTEA authors, and then kept only the mask pixels inside the bounding boxes for our use.

================================================
FILE: egohands+.md
================================================
To study fine-level action recognition, we labeled a subset of video frames from the [Egohands](http://vision.soic.indiana.edu/projects/egohands/) dataset at the hand level. We annotated a small subset of 8 videos (800 frames), 2 from each coarse-level outdoor (courtyard) activity in the EgoHands dataset, at the hand-pose level. We labeled each hand pose with one of the following 16 activities: **holding, picking, placing, resting, moving, replacing, thinking, pulling, pushing, stacking, adjusting, matching, pressing, highfive, pointing,** and **catching**. Ambiguous hand poses are annotated with multiple possible labels (e.g., picking and placing are sometimes difficult to infer at the single-frame level).

![EgoHands+](images/egohands+.png)

We have annotated hands with action labels in two settings:
* coarse hand maps: where we just outlined the hand boundaries
* fine hand maps: where we took extra care to outline details such as fingers as much as possible.

We used [LabelMe](http://labelme.csail.mit.edu/Release3.0/) for annotations. Annotation files are in .xml format, and we provide MATLAB scripts (based on the LabelMe toolbox API) to parse these annotations and generate hand masks. However, one can easily write their own script using the LabelMe toolbox to manipulate these annotations as needed.

Each "hand" object is labeled with attributes in the following format: 'hand_type,actions,object,subject'
* 'hand_type' is which hand it is ('left' or 'right'), relative to the person the hand belongs to
* 'actions' is all of the actions the hand appears to be doing based on that single frame. If the action is ambiguous, we annotate it with multiple possible actions, separated by commas. Ex:
  * 1 action: left,picking,cards,other
  * 2 actions: left,picking,placing,cards,other
* 'object' is one of these 4 objects:
  * cards
  * jenga_block
  * chess_piece
  * puzzle_piece

If the hand is not manipulating any objects, we simply put an underscore '_' separated by commas.
Ex: left,resting,_,first_person
* 'subject' is either 'first_person' or 'other'

Each "object" is labeled and named as one of the following:
* cards
* jenga_block
* chess_piece
* puzzle_piece

In the attributes, we labeled who is manipulating the object (either 'first_person' or 'other'). If both are manipulating the same object, we just put both, separated by a comma ('first_person,other').

## Usage
1. Install the LabelMe MATLAB toolbox as instructed [here](http://labelme2.csail.mit.edu/Release3.0/browserTools/php/matlab_toolbox.php).
2. Download [EgoHands+](https://1drv.ms/u/s!AtxSFigVVA5JhNtsRdvgmxvB2c1rPg).
3. We have borrowed some code from the EgoHands dataset's [page](http://vision.soic.indiana.edu/projects/egohands/), which is already uploaded here. For their complete API, you can refer to the original project.
4. Place our matlab_scripts in the LabelMe toolbox folder.
5. Set up paths for the directory with the EgoHands+ dataset and the destination directory.
6. Run ```load_generate_gt_imgs_hands_objects2.m``` to generate masks for the hands+objects setup. This also generates a text file listing the images along with their labels.
7. Run ```load_generate_gt_imgs_objects_only.m``` to generate masks for the objects-only setup.
8. Run ```load_generate_gt_imgs2.m``` to generate masks for the hands-only setup.

================================================
FILE: license.txt
================================================
MIT License

Copyright (c) 2018 UCF CRCV

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

================================================
FILE: matlab_scripts/getObjectForHand.m
================================================
function [ obj ] = getObjectForHand( hand_obj, all_obj )
% getObjectForHand returns the object being handled by a specific hand.
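% Hand attributes follow the comma-separated format documented in egohands+.md:
%   'hand_type,actions,object,subject', e.g. 'left,picking,cards,other'.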
% Input: hand_obj, all_obj % Output: obj: Object being handled by hand_obj is_both_hands = 0; attributes = hand_obj.attributes; hand_type = getHandType(attributes); hand_idx = vec2ind(hand_type'); % get index for hand type from one-hot vector obj_type = getObjectType(attributes); object_exists = ~strcmp(obj_type,'_'); switch hand_idx case 1 % other_left hand %return object being handled by this hand, if any if(object_exists) datarow = getObject(all_obj, obj_type,'other','left'); end case 2 % other_right hand %return object being handled by this hand, if any if(object_exists) datarow = getObject(all_obj, obj_type, 'other', 'right'); end case 3 % my_left hand %return object being handled by this hand, if any if(object_exists) datarow = getObject(all_obj, obj_type,'first_person','left'); end case 4 % my_right hand %return object being handled by this hand, if any if(object_exists) datarow = getObject(all_obj, obj_type,'first_person','right'); end otherwise disp('Error: Invalid hand type.'); end if(object_exists) for i = 1:size(datarow,2) poly = datarow(i).polygon; poly = [poly.x poly.y]; obj{i} = reshapeAreaCoords(poly); end else obj = cell(0) end end function [type] = getHandType(hand_attr) % this function returns hand type based on hand attributes in one hot % encoding format. output will be of format [is_other_left is_other_right is_my_left is_my_right] % Sample: [0 0 0 1] means hand type is first person's right hand. is_other_left = ~isempty(regexp(hand_attr,'left','once')) && ~isempty(regexp(hand_attr,'other','once')); is_other_right = ~isempty(regexp(hand_attr,'right','once')) && ~isempty(regexp(hand_attr,'other','once')); is_my_left = ~isempty(regexp(hand_attr,'left','once')) && ~isempty(regexp(hand_attr,'first_person','once')); is_my_right = ~isempty(regexp(hand_attr,'right','once')) && ~isempty(regexp(hand_attr,'first_person','once')); type = [is_other_left is_other_right is_my_left is_my_right]; end function [obj_type] = getObjectType(hand_attr) % this function returns object type when input is hand attributes % object type can be 'cards', 'puzzle_piece', 'chess_piece', 'jenga_block' or '_' % where '_' means no object is being handled by this hand. attr_parts = strsplit(hand_attr,','); size_of_attributes_parts = size(attr_parts); obj_type = attr_parts(end-1); end function [action] = getActionType(hand_attr) %this function returns action type when input is hand attributes % action type can be one of the 16 action classes attr_parts = strsplit(hand_attr,','); action = attr_parts(2:end-2); end function [out_rows] = getObject(objects, object_name,hand_type,hand_side) %works for only string values out_rows = struct(); %fieldvalue for i=1:length(objects) %if object is same as object type and hand is also same as hand %type if(strcmp(objects(i).name,char(object_name)) && ~isempty(regexp(objects(i).attributes,hand_type,'once')) && ~isempty(regexp(objects(i).attributes,hand_side,'once'))) out_rows = struct(objects(i)); %% to do: check if the object is for the correct hand(may be check some common coordinates ) end end end ================================================ FILE: matlab_scripts/load_generate_gt_imgs2.m ================================================ %%%%%%%%%%this script produces masked images for hands only when hands %%%%%%%%%%are annotated with details.%%%%%%%%%%%%%%%%%%%% %%%%This script also visualizes the output %%%%%%%% images written to the output directory. 
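%%%%%%%% Requires the LabelMe MATLAB toolbox (LMdatabase) on the path;
%%%%%%%% see the Usage section of egohands+.md for setup instructions.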
%%%%%%%% Comment out all imshow() statements if you want to speedup the %%%%%%%% execution of this script.%%%%%%%%%%%%%%%%%%%% clear all; close all; folders = {'cards_courtyard_h_s','cards_courtyard_s_h','chess_courtyard_h_s'... ,'chess_courtyard_s_h','jenga_courtyard_b_h','jenga_courtyard_h_b',... 'puzzle_courtyard_b_s','puzzle_courtyard_s_b'}; DIR = 'path/to/output/directory'; SRC_DIR = 'path/to/folder/with/annotations/'; fileID = fopen('annotations.txt','w');%open a file for writing annotations to it. for id = 1:length(folders) HOMEANNOTATIONS = fullfile(SRC_DIR,folders(id)); D = LMdatabase(HOMEANNOTATIONS{1}); filters = strsplit(char(folders(id)),'_');%split folder name into components like cards,courtyard, s,h activity = upper(char(filters(1))); loc = upper(char(filters(2))); actor1 = upper(char(filters(3))); actor2 = upper(char(filters(4))); % vid = getMetaBy('Location',loc,'Activity',activity,'Viewer',actor1,'Partner',actor2); % frames = vid.labelled_frames; if ~exist(char(fullfile(DIR,folders(id))), 'dir') mkdir(char(fullfile(DIR,folders(id)))); end for idx = 1:length(D) try objects = D(idx).annotation.object; img_name = D(idx).annotation.filename; img = imread(fullfile(SRC_DIR,char(folders(id)),'\Images\users\shivengoyal',char(folders(id)),D(idx).annotation.filename)); [row,col,ch]=size(img); % both = both + isBothHands(objects); for obj_id = 1:length(objects) if(~isempty(regexp(objects(obj_id).name,'hand','once'))) attributes = objects(obj_id).attributes; if(~isempty(regexp(attributes,'left','once')) && ~isempty(regexp(attributes,'other','once'))) poly = objects(obj_id).polygon; BW = poly2mask(double(poly.x),double(poly.y), row, col); ur_left_img = bsxfun(@times, img, cast(BW, 'like', img)); %ur_left_img = getSegImageForEachHand(vid,idx,img,'your_left',[]); imshow(ur_left_img); if(~isempty(ur_left_img)) fn = strcat(img_name(1:end-4),'_ul.jpg'); imwrite(ur_left_img,char(fullfile(DIR,folders(id),fn))); end elseif(~isempty(regexp(attributes,'right','once')) && ~isempty(regexp(attributes,'other','once'))) poly = objects(obj_id).polygon; BW = poly2mask(double(poly.x),double(poly.y), row, col); ur_right_img = bsxfun(@times, img, cast(BW, 'like', img)); %ur_right_img = getSegImageForEachHand(vid,idx,img,'your_right',[]); imshow(ur_right_img); if(~isempty(ur_right_img)) fn = strcat(img_name(1:end-4),'_ur.jpg'); imwrite(ur_right_img,char(fullfile(DIR,folders(id),fn))); end elseif(~isempty(regexp(attributes,'left','once')) && ~isempty(regexp(attributes,'first_person','once'))) poly = objects(obj_id).polygon; BW = poly2mask(double(poly.x),double(poly.y), row, col); my_left_img = bsxfun(@times, img, cast(BW, 'like', img)); %my_left_img = getSegImageForEachHand(vid,idx,img,'my_left',[]); imshow(my_left_img); if(~isempty(my_left_img)) fn = strcat(img_name(1:end-4),'_ml.jpg'); imwrite(my_left_img,char(fullfile(DIR,folders(id),fn))); end elseif(~isempty(regexp(attributes,'right','once')) && ~isempty(regexp(attributes,'first_person','once'))) poly = objects(obj_id).polygon; BW = poly2mask(double(poly.x),double(poly.y), row, col); my_right_img = bsxfun(@times, img, cast(BW, 'like', img)); %my_right_img = getSegImageForEachHand(vid,idx,img,'my_right',[]); imshow(my_right_img); if(~isempty(my_right_img)) fn = strcat(img_name(1:end-4),'_mr.jpg'); imwrite(my_right_img,char(fullfile(DIR,folders(id),fn))); end end attributes = strsplit(char(attributes),','); for i = 1:length(attributes) n = char(attributes(i)); switch n case 'holding' str = char(strcat(fullfile(DIR,folders(id),fn),' 0')); 
fprintf(fileID,'%s\n',str); case 'picking' str = char(strcat(fullfile(DIR,folders(id),fn),' 1')); fprintf(fileID,'%s\n',str); case 'placing' str = char(strcat(fullfile(DIR,folders(id),fn),' 2')); fprintf(fileID,'%s\n',str); case 'resting' str = char(strcat(fullfile(DIR,folders(id),fn),' 3')); fprintf(fileID,'%s\n',str); case 'moving' str = char(strcat(fullfile(DIR,folders(id),fn),' 4')); fprintf(fileID,'%s\n',str); case 'replacing' str = char(strcat(fullfile(DIR,folders(id),fn),' 5')); fprintf(fileID,'%s\n',str); case 'thinking' str = char(strcat(fullfile(DIR,folders(id),fn),' 6')); fprintf(fileID,'%s\n',str); case 'pulling' str = char(strcat(fullfile(DIR,folders(id),fn),' 7')); fprintf(fileID,'%s\n',str); case 'pushing' str = char(strcat(fullfile(DIR,folders(id),fn),' 8')); fprintf(fileID,'%s\n',str); case 'stacking' str = char(strcat(fullfile(DIR,folders(id),fn),' 9')); fprintf(fileID,'%s\n',str); case 'adjusting' str = char(strcat(fullfile(DIR,folders(id),fn),' 10')); fprintf(fileID,'%s\n',str); case 'matching' str = char(strcat(fullfile(DIR,folders(id),fn),' 11')); fprintf(fileID,'%s\n',str); case 'pressing' str = char(strcat(fullfile(DIR,folders(id),fn),' 12')); fprintf(fileID,'%s\n',str); case 'highfive' str = char(strcat(fullfile(DIR,folders(id),fn),' 13')); fprintf(fileID,'%s\n',str); case 'pointing' str = char(strcat(fullfile(DIR,folders(id),fn),' 14')); fprintf(fileID,'%s\n',str); case 'catching' str = char(strcat(fullfile(DIR,folders(id),fn),' 15')); fprintf(fileID,'%s\n',str); otherwise disp(n) end end end end catch ME msg = ME.message continue; end end end fclose(fileID); ================================================ FILE: matlab_scripts/load_generate_gt_imgs_hands_objects2.m ================================================ %%%%%%%%%%this script produces masked images for hand_objects when hands %%%%%%%%%%are annotated detailed. This script also visualizes the output %%%%%%%%%%%%%%%%%%%% images written to the output directory. %%%%%%%% Comment out all imshow() statements if you want to speedup the %%%%%%%% execution of this script. clear all; close all; folders = {'cards_courtyard_h_s','cards_courtyard_s_h','chess_courtyard_h_s',... 'chess_courtyard_s_h','jenga_courtyard_b_h','jenga_courtyard_h_b',... 'puzzle_courtyard_b_s','puzzle_courtyard_s_b'}; %DIR to save your resulting images DIR = 'path\to\save\output\images\'; SRC_DIR = 'path\to\egohands\fine\annotations\folders\'; fileID = fopen('inputfile_fineannotations.txt','w');%open a file for writing image paths along with annotations to it. % iterating over each folder for id = 1:length(folders) HOMEANNOTATIONS = fullfile(SRC_DIR,folders(id)); D = LMdatabase(HOMEANNOTATIONS{1});%read annotations for each video using Labelme library function filters = strsplit(char(folders(id)),'_');%split folder name into components like cards,courtyard, s,h activity = upper(char(filters(1))); loc = upper(char(filters(2))); actor1 = upper(char(filters(3))); actor2 = upper(char(filters(4))); % if DIR for output images doesn't exist already, then create it. 
if ~exist(char(fullfile(DIR,folders(id))), 'dir') mkdir(char(fullfile(DIR,folders(id)))); end %for each image for idx = 1:length(D) %get all annotated objects in that image objects = D(idx).annotation.object; img_name = D(idx).annotation.filename; img = imread(fullfile(SRC_DIR,char(folders(id)),'\Images\users\shivengoyal',char(folders(id)),D(idx).annotation.filename)); [row,col,ch]=size(img); for obj_id = 1:length(objects) try %if the object is a hand if(~isempty(regexp(objects(obj_id).name,'hand','once'))) %get the object in that hand obj = getObjectForHand(objects(obj_id),objects); %read the attributes for that hand object attributes = objects(obj_id).attributes; %if the hand is left hand of third person if(~isempty(regexp(attributes,'left','once')) && ~isempty(regexp(attributes,'other','once'))) if(~isempty(obj)) %if hand is manipulating some object for i = 1:length(obj) poly = objects(obj_id).polygon; BW = poly2mask(double(poly.x),double(poly.y), row, col); BW = BW+poly2mask(double(obj{i}(1:2:end)),double(obj{i}(2:2:end)),row,col); ur_left_img = bsxfun(@times, img, cast(BW, 'like', img)); imshow(BW) imshow(ur_left_img) if(~isempty(ur_left_img)) %write image in destination folder fn = strcat(img_name(1:end-4),'_',num2str(i),'_ul.jpg');%filename for image imwrite(ur_left_img,char(fullfile(DIR,folders(id),fn))); end end else %if hand is not manipulating any object poly = objects(obj_id).polygon; BW = poly2mask(double(poly.x),double(poly.y), row, col); ur_left_img = bsxfun(@times, img, cast(BW, 'like', img)); imshow(ur_left_img); if(~isempty(ur_left_img)) %write image in destination folder fn = strcat(img_name(1:end-4),'_ul.jpg');%filename for image imwrite(ur_left_img,char(fullfile(DIR,folders(id),fn))); end end %if the hand is right hand of third person elseif(~isempty(regexp(attributes,'right','once')) && ~isempty(regexp(attributes,'other','once'))) if(~isempty(obj))%if hand is manipulating some object for i = 1:length(obj) poly = objects(obj_id).polygon; BW = poly2mask(double(poly.x),double(poly.y), row, col); BW = BW+poly2mask(double(obj{i}(1:2:end)),double(obj{i}(2:2:end)),row,col); ur_right_img = bsxfun(@times, img, cast(BW, 'like', img)); imshow(ur_right_img); if(~isempty(ur_right_img)) fn = strcat(img_name(1:end-4),'_',num2str(i),'_ur.jpg'); imwrite(ur_right_img,char(fullfile(DIR,folders(id),fn))); end end else %if hand does not have any object in it poly = objects(obj_id).polygon; BW = poly2mask(double(poly.x),double(poly.y), row, col); ur_right_img = bsxfun(@times, img, cast(BW, 'like', img)); imshow(ur_right_img); if(~isempty(ur_right_img)) fn = strcat(img_name(1:end-4),'_ur.jpg'); imwrite(ur_right_img,char(fullfile(DIR,folders(id),fn))); end end %if the hand is left hand of first person elseif(~isempty(regexp(attributes,'left','once')) && ~isempty(regexp(attributes,'first_person','once'))) if(~isempty(obj))%if hand is manipulating some object for i = 1:length(obj) poly = objects(obj_id).polygon; BW = poly2mask(double(poly.x),double(poly.y), row, col); BW = BW+poly2mask(double(obj{i}(1:2:end)),double(obj{i}(2:2:end)),row,col); my_left_img = bsxfun(@times, img, cast(BW, 'like', img)); imshow(my_left_img); if(~isempty(my_left_img)) fn = strcat(img_name(1:end-4),'_',num2str(i),'_ml.jpg'); imwrite(my_left_img,char(fullfile(DIR,folders(id),fn))); end end else %if hand is not manipulating any object poly = objects(obj_id).polygon; BW = poly2mask(double(poly.x),double(poly.y), row, col); my_left_img = bsxfun(@times, img, cast(BW, 'like', img)); imshow(my_left_img); 
if(~isempty(my_left_img)) fn = strcat(img_name(1:end-4),'_ml.jpg'); imwrite(my_left_img,char(fullfile(DIR,folders(id),fn))); end end %if the hand is "right" hand of first person elseif(~isempty(regexp(attributes,'right','once')) && ~isempty(regexp(attributes,'first_person','once'))) if(~isempty(obj))%if hand is manipulating some object for i = 1:length(obj) poly = objects(obj_id).polygon; BW = poly2mask(double(poly.x),double(poly.y), row, col); BW = BW+poly2mask(double(obj{i}(1:2:end)),double(obj{i}(2:2:end)),row,col); my_right_img = bsxfun(@times, img, cast(BW, 'like', img)); % imshow(my_right_img); if(~isempty(my_right_img)) fn = strcat(img_name(1:end-4),'_',num2str(i),'_mr.jpg'); imwrite(my_right_img,char(fullfile(DIR,folders(id),fn))); end end else %if hand is not manipulating any object poly = objects(obj_id).polygon; BW = poly2mask(double(poly.x),double(poly.y), row, col); my_right_img = bsxfun(@times, img, cast(BW, 'like', img)); imshow(my_right_img); if(~isempty(my_right_img)) fn = strcat(img_name(1:end-4),'_mr.jpg'); imwrite(my_right_img,char(fullfile(DIR,folders(id),fn))); end end end %write the path/to/img with its label in txt file which will be used %as an input to the caffe network for training and testing attributes = strsplit(char(attributes),','); % assign a label for corresponding action e.g., if % attributes of hand has action 'holding', then label is 0. for i = 1:length(attributes) n = char(attributes(i)); switch n case 'holding' str = char(strcat(fullfile(DIR,folders(id),fn),' 0')); fprintf(fileID,'%s\n',str);%write to file case 'picking' str = char(strcat(fullfile(DIR,folders(id),fn),' 1')); fprintf(fileID,'%s\n',str); case 'placing' str = char(strcat(fullfile(DIR,folders(id),fn),' 2')); fprintf(fileID,'%s\n',str); case 'resting' str = char(strcat(fullfile(DIR,folders(id),fn),' 3')); fprintf(fileID,'%s\n',str); case 'moving' str = char(strcat(fullfile(DIR,folders(id),fn),' 4')); fprintf(fileID,'%s\n',str); case 'replacing' str = char(strcat(fullfile(DIR,folders(id),fn),' 5')); fprintf(fileID,'%s\n',str); case 'thinking' str = char(strcat(fullfile(DIR,folders(id),fn),' 6')); fprintf(fileID,'%s\n',str); case 'pulling' str = char(strcat(fullfile(DIR,folders(id),fn),' 7')); fprintf(fileID,'%s\n',str); case 'pushing' str = char(strcat(fullfile(DIR,folders(id),fn),' 8')); fprintf(fileID,'%s\n',str); case 'stacking' str = char(strcat(fullfile(DIR,folders(id),fn),' 9')); fprintf(fileID,'%s\n',str); case 'adjusting' str = char(strcat(fullfile(DIR,folders(id),fn),' 10')); fprintf(fileID,'%s\n',str); case 'matching' str = char(strcat(fullfile(DIR,folders(id),fn),' 11')); fprintf(fileID,'%s\n',str); case 'pressing' str = char(strcat(fullfile(DIR,folders(id),fn),' 12')); fprintf(fileID,'%s\n',str); case 'highfive' str = char(strcat(fullfile(DIR,folders(id),fn),' 13')); fprintf(fileID,'%s\n',str); case 'pointing' str = char(strcat(fullfile(DIR,folders(id),fn),' 14')); fprintf(fileID,'%s\n',str); case 'catching' str = char(strcat(fullfile(DIR,folders(id),fn),' 15')); fprintf(fileID,'%s\n',str); otherwise disp(n)%displays non-actions attributes, useful for debugging end end end catch ME msg = ME.message ME.stack.name ME.stack.line continue; end end end end fclose(fileID); ================================================ FILE: matlab_scripts/load_generate_gt_imgs_objects_only.m ================================================ %%%%%%%%%%this script produces masked images for objects bring handled by %%%%%%%%%%hands.This script also visualizes the output %%%%%%%%%%%%%%%%%%%% images 
written to the output directory. %%%%%%%% Comment out all imshow() statements if you want to speedup the %%%%%%%% execution of this script.%%%%%%%%%%%%%%%%%%%% clear all; close all; folders = {'cards_courtyard_h_s','cards_courtyard_s_h','chess_courtyard_h_s'... ,'chess_courtyard_s_h','jenga_courtyard_b_h','jenga_courtyard_h_b',... 'puzzle_courtyard_b_s','puzzle_courtyard_s_b'}; %DIR to save your resulting images DIR = 'path\to\save\output\images\'; SRC_DIR = 'path\to\egohands\fine\annotations\folders\'; fileID = fopen('inputfile_fineannot.txt','w');%open a file for writing annotations to it. % iterating over each folder for id = 1:length(folders) HOMEANNOTATIONS = fullfile(SRC_DIR,folders(id)); D = LMdatabase(HOMEANNOTATIONS{1});%read annotations for each video using Labelme library function filters = strsplit(char(folders(id)),'_');%split folder name into components like cards,courtyard, s,h activity = upper(char(filters(1))); loc = upper(char(filters(2))); actor1 = upper(char(filters(3))); actor2 = upper(char(filters(4))); %get labelled frames from original dataset using these filters vid = getMetaBy('Location',loc,'Activity',activity,'Viewer',actor1,'Partner',actor2); frames = vid.labelled_frames; % if DIR for output images doesn't exist already, then create it. if ~exist(char(fullfile(DIR,folders(id))), 'dir') mkdir(char(fullfile(DIR,folders(id)))); end %for each image for idx = 1:length(D) %get all annotated objects in that image objects = D(idx).annotation.object; img_name = D(idx).annotation.filename; img = imread(fullfile(SRC_DIR,char(folders(id)),'\Images\users\shivengoyal',char(folders(id)),D(idx).annotation.filename)); [row,col,ch]=size(img); for obj_id = 1:length(objects) try %if the object is a hand if(~isempty(regexp(objects(obj_id).name,'hand','once'))) %get the object in that hand obj = getObjectForHand(objects(obj_id),objects); %read the attributes for that hand object attributes = objects(obj_id).attributes; %if the hand is left hand of third person if(~isempty(regexp(attributes,'left','once')) && ~isempty(regexp(attributes,'other','once'))) if(~isempty(obj)) %if hand is manipulating some object for i = 1:length(obj) BW = poly2mask(double(obj{i}(1:2:end)),double(obj{i}(2:2:end)),row,col); ur_left_img = bsxfun(@times, img, cast(BW, 'like', img)); imshow(BW) imshow(ur_left_img) if(~isempty(ur_left_img)) %write image in destination folder fn = strcat(img_name(1:end-4),'_',num2str(i),'_ul.jpg');%filename for image imwrite(ur_left_img,char(fullfile(DIR,folders(id),fn))); end end end %if the hand is right hand of third person elseif(~isempty(regexp(attributes,'right','once')) && ~isempty(regexp(attributes,'other','once'))) if(~isempty(obj))%if hand is manipulating some object for i = 1:length(obj) BW = poly2mask(double(obj{i}(1:2:end)),double(obj{i}(2:2:end)),row,col); ur_right_img = bsxfun(@times, img, cast(BW, 'like', img)); imshow(ur_right_img); if(~isempty(ur_right_img)) fn = strcat(img_name(1:end-4),'_',num2str(i),'_ur.jpg'); imwrite(ur_right_img,char(fullfile(DIR,folders(id),fn))); end end end %if the hand is left hand of first person elseif(~isempty(regexp(attributes,'left','once')) && ~isempty(regexp(attributes,'first_person','once'))) if(~isempty(obj))%if hand is manipulating some object for i = 1:length(obj) BW = poly2mask(double(obj{i}(1:2:end)),double(obj{i}(2:2:end)),row,col); my_left_img = bsxfun(@times, img, cast(BW, 'like', img)); imshow(my_left_img); if(~isempty(my_left_img)) fn = strcat(img_name(1:end-4),'_',num2str(i),'_ml.jpg'); 
imwrite(my_left_img,char(fullfile(DIR,folders(id),fn))); end end end %if the hand is right hand of first person elseif(~isempty(regexp(attributes,'right','once')) && ~isempty(regexp(attributes,'first_person','once'))) if(~isempty(obj))%if hand is manipulating some object for i = 1:length(obj) BW = poly2mask(double(obj{i}(1:2:end)),double(obj{i}(2:2:end)),row,col); my_right_img = bsxfun(@times, img, cast(BW, 'like', img)); imshow(my_right_img); if(~isempty(my_right_img)) fn = strcat(img_name(1:end-4),'_',num2str(i),'_mr.jpg'); imwrite(my_right_img,char(fullfile(DIR,folders(id),fn))); end end end end %write the path/to/img with its label in txt file which will be used %as an input to the caffe network for training and testing attributes = strsplit(char(attributes),','); % assign a label for corresponding action like if % attributes of hand has action 'holding', then label is 0. for i = 1:length(attributes) n = char(attributes(i)); if(~isempty(obj)) switch n case 'holding' str = char(strcat(fullfile(DIR,folders(id),fn),' 0')); fprintf(fileID,'%s\n',str);%write to file case 'picking' str = char(strcat(fullfile(DIR,folders(id),fn),' 1')); fprintf(fileID,'%s\n',str); case 'placing' str = char(strcat(fullfile(DIR,folders(id),fn),' 2')); fprintf(fileID,'%s\n',str); case 'resting' str = char(strcat(fullfile(DIR,folders(id),fn),' 3')); fprintf(fileID,'%s\n',str); case 'moving' str = char(strcat(fullfile(DIR,folders(id),fn),' 4')); fprintf(fileID,'%s\n',str); case 'replacing' str = char(strcat(fullfile(DIR,folders(id),fn),' 5')); fprintf(fileID,'%s\n',str); case 'thinking' str = char(strcat(fullfile(DIR,folders(id),fn),' 6')); fprintf(fileID,'%s\n',str); case 'pulling' str = char(strcat(fullfile(DIR,folders(id),fn),' 7')); fprintf(fileID,'%s\n',str); case 'pushing' str = char(strcat(fullfile(DIR,folders(id),fn),' 8')); fprintf(fileID,'%s\n',str); case 'stacking' str = char(strcat(fullfile(DIR,folders(id),fn),' 9')); fprintf(fileID,'%s\n',str); case 'adjusting' str = char(strcat(fullfile(DIR,folders(id),fn),' 10')); fprintf(fileID,'%s\n',str); case 'matching' str = char(strcat(fullfile(DIR,folders(id),fn),' 11')); fprintf(fileID,'%s\n',str); case 'pressing' str = char(strcat(fullfile(DIR,folders(id),fn),' 12')); fprintf(fileID,'%s\n',str); case 'highfive' str = char(strcat(fullfile(DIR,folders(id),fn),' 13')); fprintf(fileID,'%s\n',str); case 'pointing' str = char(strcat(fullfile(DIR,folders(id),fn),' 14')); fprintf(fileID,'%s\n',str); case 'catching' str = char(strcat(fullfile(DIR,folders(id),fn),' 15')); fprintf(fileID,'%s\n',str); otherwise disp(n)%displays non-actions attributes, useful for debugging end end end end catch ME msg = ME.message ME.stack.name ME.stack.line continue; end end end end fclose(fileID); ================================================ FILE: matlab_scripts/reshapeAreaCoords.m ================================================ function shape2 = reshapeAreaCoords(shape) shape2 = zeros(1, 2*length(shape)); shape2(1:2:end) = shape(:,1)'; shape2(2:2:end) = shape(:,2)'; end ================================================ FILE: refinenet_files/computeAccuracyEYTH.m ================================================ close all; clear all; gt_dir = '/home/zxi/refinenet/datasets/eyth/SegmentationClass/testset'; pred_dir = '/home/zxi/refinenet/cache_data/Multiscale_evaluation_refinenet/eyth_on_eyth'; im_dir = '/home/zxi/refinenet/datasets/eyth/JPEGImages/testset'; dir1 = dir(fullfile(gt_dir)); %remove '.' and '..' 
dir1=dir1(~ismember({dir1.name},{'.','..'}));
idx = 1;
num_of_frames = 258;
IOU = zeros(1,num_of_frames,'double');
Precision = zeros(1,num_of_frames,'double');
Recall = zeros(1,num_of_frames,'double');
for i = 1:length(dir1)
    vid_name=dir1(i).name;
    %read frames from folder
    img_files = dir(fullfile(gt_dir,vid_name));
    %remove '.' and '..' from directories
    img_files=img_files(~ismember({img_files.name},{'.','..'}));
    for j = 1:length(img_files)
        gt_fr_name = img_files(j).name;
        im = imread(fullfile(im_dir,vid_name,strcat(gt_fr_name(1:end-4),'.jpg')));
        gt_map = imread(fullfile(gt_dir,dir1(i).name,gt_fr_name));
        %now get predicted output for the ground truth
        pred_frame_path = fullfile(pred_dir,vid_name, 'predict_result_mask',strcat(gt_fr_name(1:end-4),'.png'));
        try
            pred_im = imread(pred_frame_path);
            if(~isequal(size(pred_im),size(gt_map))) % isequal: an elementwise ~= on size vectors misfires in an if
                pred_im = imresize(pred_im,size(gt_map));
            end
        catch
            continue;
        end
        [iou,prec,recall] = getIOU(pred_im,gt_map,1);
        if(iou==0)
            num_of_frames = num_of_frames - 1;
        end
        IOU(idx)=iou;
        Precision(idx)=prec;
        Recall(idx)=recall;
        idx = idx+1;
    end
end
%fprintf(1, 'The mean Intersection_over_Union for cards videos is %d\n The mean Intersection_over_Union for chess videos is %d\n The mean Intersection_over_Union for jenga videos is %d\n The mean Intersection_over_Union for puzzle videos is %d\n '...
%    ,mean_cards_IOU,mean_chess_IOU,mean_jenga_IOU,mean_puzzle_IOU);
mean_IOU = sum(IOU(:))/num_of_frames
mean_prec = sum(Precision(:))/num_of_frames
mean_recall = sum(Recall(:))/num_of_frames
fprintf(1, 'The mean Intersection_over_Union for all videos is %f\n',mean_IOU);

================================================
FILE: refinenet_files/computeAccuracyEgoHands.m
================================================
close all;
clear all;
gt_dir = '/path/to/refinenet/datasets/EgoHands/SegmentationClass/test';
pred_dir = '/path/to/refinenet/cache_data/EgoHands/';
im_dir = '/path/to/refinenet/datasets/EgoHands';
dir1 = dir(fullfile(gt_dir));
%remove '.' and '..' from directories
dir1=dir1(~ismember({dir1.name},{'.','..'}));
idx = 1;
num_of_frames = 800;
IOU = zeros(1,num_of_frames,'double');
Precision = zeros(1,num_of_frames,'double');
Recall = zeros(1,num_of_frames,'double');
for i = 1:length(dir1)
    vid_name=dir1(i).name;
    %read frames from folder
    img_files = dir(fullfile(gt_dir,vid_name));
    %remove '.' and '..' from directories
    img_files=img_files(~ismember({img_files.name},{'.','..'}));
    for j = 1:length(img_files)
        gt_fr_name = img_files(j).name;
        gt_map = imread(fullfile(gt_dir,dir1(i).name,gt_fr_name));
        %now get predicted output for the ground truth
        pred_frame_path = fullfile(pred_dir,vid_name,strcat(gt_fr_name(1:end-4),'.png'));
        pred_im = double(imread(pred_frame_path));
        if(~isequal(size(pred_im),size(gt_map))) % isequal also handles size vectors of different lengths
            gt_map = imresize(gt_map,size(pred_im));
        end
        if(~islogical(gt_map))
            gt_map = imbinarize(gt_map);
        end
        pred_im = pred_im./255;
        pred_im = imbinarize(pred_im);
        [iou,prec,recall] = getIOU(pred_im, gt_map, 1);
        IOU(idx)=iou;
        Precision(idx)=prec;
        Recall(idx)=recall;
        idx = idx+1;
    end
end
mean_IOU = sum(IOU(:))/num_of_frames
mean_prec = sum(Precision(:))/num_of_frames
mean_recall = sum(Recall(:))/num_of_frames
fprintf(1, 'The mean Intersection_over_Union for all videos is %f\n',mean_IOU);

================================================
FILE: refinenet_files/computeAccuracyGTEA.m
================================================
close all;
clear all;
gt_dir = '/home/zxi/refinenet/datasets/gtea/SegmentationClass/testset';
pred_dir = '/home/zxi/refinenet/cache_data/Multiscale_evaluation_refinenet/gtea/test/predict_result_mask';
im_dir = '/home/zxi/refinenet/datasets/gtea/JPEGImages/testset';
dir1 = dir(fullfile(pred_dir));
%remove '.' and '..' from directories
dir1=dir1(~ismember({dir1.name},{'.','..'}));
num_of_frames = length(dir1);
IOU = zeros(1,num_of_frames,'double');
Precision = zeros(1,num_of_frames,'double');
Recall = zeros(1,num_of_frames,'double');
for j = 1:length(dir1)
    gt_fr_name = dir1(j).name;
    im = imread(fullfile(im_dir,strcat(gt_fr_name(1:end-4),'.jpg')));
    gt_map = imread(fullfile(gt_dir,strcat(gt_fr_name(1:end-4),'.jpg')));
    %now get predicted output for the ground truth
    pred_frame_path = fullfile(pred_dir,strcat(gt_fr_name(1:end-4),'.png'));
    pred_im = imread(pred_frame_path);
    gt_map = imresize(gt_map,size(pred_im)); % resize GT to the prediction size
    if(~islogical(gt_map))
        gt_map = imbinarize(gt_map);
    end
    [iou,prec,recall] = getIOU(pred_im,gt_map,1);
    IOU(j)=iou;
    Precision(j)=prec;
    Recall(j)=recall;
end
mean_IOU = sum(IOU(:))/num_of_frames
mean_prec = sum(Precision(:))/num_of_frames
mean_recall = sum(Recall(:))/num_of_frames
fprintf(1, 'The mean Intersection_over_Union for all images is %f\n',mean_IOU);

================================================
FILE: refinenet_files/computeAccuracyHoF.m
================================================
close all;
clear all;
gt_dir = '/home/zxi/refinenet/datasets/ADE20K_hand_images/SegmentationClass/test';
pred_dir = '/home/zxi/refinenet/cache_data/Multiscale_evaluation_refinenet/ade_on_ade/test/predict_result_mask';
im_dir = '/home/zxi/refinenet/datasets/ADE20K_hand_images/JPEGImages/test';
dir1 = dir(fullfile(pred_dir)); %remove '.' and '..'
from directories dir1=dir1(~ismember({dir1.name},{'.','..'})); idx = 1; num_of_frames = length(dir1); IOU = zeros(1,num_of_frames,'double'); Precision = zeros(1,num_of_frames,'double'); Recall = zeros(1,num_of_frames,'double'); for j = 1:length(dir1) gt_fr_name = dir1(j).name; im = imread(fullfile(im_dir,strcat(gt_fr_name(1:end-4),'.jpg'))); gt_map = imread(fullfile(gt_dir,dir1(j).name)); %now get predicted output for the ground truth pred_frame_path = fullfile(pred_dir,strcat(gt_fr_name(1:end-4),'.png')); pred_im = imread(pred_frame_path); if(~islogical(gt_map)) gt_map = imbinarize(gt_map); %works in matlab 2017 %gt_map = im2bw(gt_map,0.1); %use this line for matlab version < 2017 end [iou,prec,recall,mcr] = getIOU(pred_im,gt_map,1); IOU(j)=iou; Precision(j)=prec; Recall(j)=recall; end mean_IOU = sum(IOU(:))/num_of_frames mean_prec = sum(Precision(:))/num_of_frames mean_recall = sum(Recall(:))/num_of_frames fprintf(1, 'The mean Intersection_over_Union for all videos is %d\n',mean_IOU); ================================================ FILE: refinenet_files/demo_refinenet_test_example_egohands.m ================================================ % Author: Guosheng Lin (guosheng.lin@gmail.com) % perform segmentation prediction on user provided images. % specify the location of your images, e.g., using the following % folder which contains serveral example images: % ds_config.img_data_dir='../datasets/custom_data'; function demo_refinenet_test_example_egohands() %%% testset folders = {'CARDS_COURTYARD_B_T','CARDS_OFFICE_S_B','CHESS_COURTYARD_B_T','CHESS_LIVINGROOM_T_H','JENGA_LIVINGROOM_S_T','JENGA_OFFICE_H_T','PUZZLE_COURTYARD_H_T','PUZZLE_LIVINGROOM_T_B'}; %%% valset %folders = {'JENGA_COURTYARD_T_S','CHESS_COURTYARD_H_S','PUZZLE_OFFICE_S_T','CARDS_LIVINGROOM_S_H'} for i = 1:length(folders) rng('shuffle'); addpath('./my_utils'); dir_matConvNet='../libs/matconvnet/matlab'; run(fullfile(dir_matConvNet, 'vl_setupnn.m')); run_config=[]; ds_config=[]; run_config.use_gpu=true; % run_config.use_gpu=false; run_config.gpu_idx=1; % result dir: result_name=[char(folders(i))]; result_dir=fullfile('../cache_data', 'egohands', result_name); % the folder that contains testing images: ds_config.img_data_dir=strcat('/home/zxi/refinenet/datasets/EgoHands/JPEGImages/',char(folders(i))); % using a trained model which is trained on VOC 2012 % run_config.trained_model_path='../model_trained/refinenet_res101_voc2012.mat'; % ds_config.class_info=gen_class_info_voc(); % using the object parsing model run_config.trained_model_path='../model_trained/refinenet_res101_egohands.mat'; ds_config.class_info=gen_class_info_ego(); % for voc trained model, control the size of input images run_config.input_img_short_edge_min=450; run_config.input_img_short_edge_max=600; % set the input image scales, useful for multi-scale evaluation % e.g. using multiple scale settings (1.0 0.8 0.6) and average the resulting score maps. 
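% Note: for multi-scale evaluation, run this script once per scale (0.6, 0.8, 1.0),
% changing input_img_scale below and suffixing the result folder with the scale,
% then average the cached score maps with multiscale_evaluation.m (which looks
% for result folders named <folder_name>_<scale>).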
run_config.input_img_scale=1.0; run_config.gen_net_opts_fn=@gen_net_opts_model_type1; run_config.run_evaonly=true; ds_config.use_custom_data=true; ds_config.use_dummy_gt=true; run_config.use_dummy_gt=ds_config.use_dummy_gt; ds_config.ds_name='tmp_data'; run_config.root_cache_dir=result_dir; mkdir_notexist(run_config.root_cache_dir); run_config.model_name=result_name; diary_dir=run_config.root_cache_dir; mkdir_notexist(diary_dir); diary(fullfile(diary_dir, 'output.txt')); diary on run_dir_name=fileparts(mfilename('fullpath')); [~, run_dir_name]=fileparts(run_dir_name); run_config.run_dir_name=run_dir_name; run_config.run_file_name=mfilename(); ds_info=gen_dataset_info(ds_config); my_diary_flush(); train_opts=run_config.gen_net_opts_fn(run_config, ds_info.class_info); imdb=my_gen_imdb(train_opts, ds_info); data_norm_info=[]; data_norm_info.image_mean=128; imdb.ref.data_norm_info=data_norm_info; if run_config.use_gpu gpu_num=gpuDeviceCount; if gpu_num>=1 gpuDevice(run_config.gpu_idx); else error('no gpu found!'); end end [net_config, net_exp_info]=prepare_running_model(train_opts); my_net_tool(train_opts, imdb, net_config, net_exp_info); fprintf('\n\n--------------------------------------------------\n\n'); disp('results are saved in:'); disp(run_config.root_cache_dir); my_diary_flush(); diary off end end ================================================ FILE: refinenet_files/demo_refinenet_test_example_eyth.m ================================================ % Author: Guosheng Lin (guosheng.lin@gmail.com) % perform segmentation prediction on user provided images. % specify the location of your images, e.g., using the following % folder which contains serveral example images: % ds_config.img_data_dir='../datasets/custom_data'; function demo_refinenet_test_example_eyth() folders = {'vid4','vid6','vid9'}; for i = 1:length(folders) rng('shuffle'); addpath('./my_utils'); dir_matConvNet='../libs/matconvnet/matlab'; run(fullfile(dir_matConvNet, 'vl_setupnn.m')); run_config=[]; ds_config=[]; run_config.use_gpu=true; % run_config.use_gpu=false; run_config.gpu_idx=1; % result dir: result_name=[char(folders(i))] result_dir=fullfile('../cache_data', 'eyth_on_eyth_val/val_1', result_name); % the folder that contains testing images: ds_config.img_data_dir=strcat('../datasets/eyth/JPEGImages/test/',char(folders(i))); % using a trained model which is trained on VOC 2012 % run_config.trained_model_path='../model_trained/refinenet_res101_voc2012.mat'; % ds_config.class_info=gen_class_info_voc(); % using the object parsing model run_config.trained_model_path='../model_trained/refinenet_res101_eyth.mat'; ds_config.class_info=gen_class_info_ego(); % for voc trained model, control the size of input images run_config.input_img_short_edge_min=450; run_config.input_img_short_edge_max=600; % set the input image scales, useful for multi-scale evaluation % e.g. using multiple scale settings (1.0 0.8 0.6) and average the resulting score maps. 
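% Note: as in the egohands demo, repeat this run for each scale (0.6, 0.8, 1.0),
% editing input_img_scale below, if you want to feed multiscale_evaluation.m.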
run_config.input_img_scale=1.0; run_config.gen_net_opts_fn=@gen_net_opts_model_type1; run_config.run_evaonly=true; ds_config.use_custom_data=true; ds_config.use_dummy_gt=true; run_config.use_dummy_gt=ds_config.use_dummy_gt; ds_config.ds_name='tmp_data'; run_config.root_cache_dir=result_dir; mkdir_notexist(run_config.root_cache_dir); run_config.model_name=result_name; diary_dir=run_config.root_cache_dir; mkdir_notexist(diary_dir); diary(fullfile(diary_dir, 'output.txt')); diary on run_dir_name=fileparts(mfilename('fullpath')); [~, run_dir_name]=fileparts(run_dir_name); run_config.run_dir_name=run_dir_name; run_config.run_file_name=mfilename(); ds_info=gen_dataset_info(ds_config); my_diary_flush(); train_opts=run_config.gen_net_opts_fn(run_config, ds_info.class_info); imdb=my_gen_imdb(train_opts, ds_info); data_norm_info=[]; data_norm_info.image_mean=128; imdb.ref.data_norm_info=data_norm_info; if run_config.use_gpu gpu_num=gpuDeviceCount; if gpu_num>=1 gpuDevice(run_config.gpu_idx); else error('no gpu found!'); end end [net_config, net_exp_info]=prepare_running_model(train_opts); my_net_tool(train_opts, imdb, net_config, net_exp_info); fprintf('\n\n--------------------------------------------------\n\n'); disp('results are saved in:'); disp(run_config.root_cache_dir); my_diary_flush(); diary off end end ================================================ FILE: refinenet_files/demo_refinenet_test_example_gtea.m ================================================ % Author: Guosheng Lin (guosheng.lin@gmail.com) % perform segmentation prediction on user provided images. % specify the location of your images, e.g., using the following % folder which contains serveral example images: % ds_config.img_data_dir='../datasets/custom_data'; function demo_refinenet_test_example_gtea() rng('shuffle'); addpath('./my_utils'); dir_matConvNet='../libs/matconvnet/matlab'; run(fullfile(dir_matConvNet, 'vl_setupnn.m')); run_config=[]; ds_config=[]; %run_config.use_gpu=true; run_config.use_gpu=false; run_config.gpu_idx=1; % result dir: % result_name=[char(folders(i))]%['result_' datestr(now, 'YYYYmmDDHHMMSS') '_evaonly_custom_data' char(folders(i))] result_name='val_1.0'; result_dir=fullfile('../cache_data', 'gtea_on_gtea', result_name); % the folder that contains testing images: %ds_config.img_data_dir=strcat('../datasets/EgoHands/Testset/',char(folders(i))); ds_config.img_data_dir='../datasets/gtea/JPEGImages/val'; % using a trained model which is trained on VOC 2012 % run_config.trained_model_path='../model_trained/refinenet_res101_voc2012.mat'; % ds_config.class_info=gen_class_info_voc(); % using the object parsing model run_config.trained_model_path='../model_trained/refinenet_res101_gtea.mat'; ds_config.class_info=gen_class_info_ego(); % for voc trained model, control the size of input images run_config.input_img_short_edge_min=450; run_config.input_img_short_edge_max=600; % set the input image scales, useful for multi-scale evaluation % e.g. using multiple scale settings (1.0 0.8 0.6) and average the resulting score maps. 
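% Note: result_name='val_1.0' above already follows the <folder_name>_<scale>
% naming that multiscale_evaluation.m expects; rerun with input_img_scale=0.8
% and 0.6 (and matching result names) for multi-scale averaging.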
run_config.input_img_scale=1.0; run_config.gen_net_opts_fn=@gen_net_opts_model_type1; run_config.run_evaonly=true; ds_config.use_custom_data=true; ds_config.use_dummy_gt=true; run_config.use_dummy_gt=ds_config.use_dummy_gt; ds_config.ds_name='tmp_data'; run_config.root_cache_dir=result_dir; mkdir_notexist(run_config.root_cache_dir); run_config.model_name=result_name; diary_dir=run_config.root_cache_dir; mkdir_notexist(diary_dir); diary(fullfile(diary_dir, 'output.txt')); diary on run_dir_name=fileparts(mfilename('fullpath')); [~, run_dir_name]=fileparts(run_dir_name); run_config.run_dir_name=run_dir_name; run_config.run_file_name=mfilename(); ds_info=gen_dataset_info(ds_config); my_diary_flush(); train_opts=run_config.gen_net_opts_fn(run_config, ds_info.class_info); imdb=my_gen_imdb(train_opts, ds_info); data_norm_info=[]; data_norm_info.image_mean=128; imdb.ref.data_norm_info=data_norm_info; if run_config.use_gpu gpu_num=gpuDeviceCount; if gpu_num>=1 gpuDevice(run_config.gpu_idx); else error('no gpu found!'); end end [net_config, net_exp_info]=prepare_running_model(train_opts); my_net_tool(train_opts, imdb, net_config, net_exp_info); fprintf('\n\n--------------------------------------------------\n\n'); disp('results are saved in:'); disp(run_config.root_cache_dir); my_diary_flush(); diary off end ================================================ FILE: refinenet_files/demo_refinenet_test_example_hof.m ================================================ % Author: Guosheng Lin (guosheng.lin@gmail.com) % perform segmentation prediction on user provided images in HandOnFace dataset % specify the location of your images, e.g., using the following % folder which contains serveral example images: % ds_config.img_data_dir='../datasets/custom_data'; function demo_refinenet_test_example_hof() rng('shuffle'); addpath('./my_utils'); dir_matConvNet='../libs/matconvnet/matlab'; run(fullfile(dir_matConvNet, 'vl_setupnn.m')); run_config=[]; ds_config=[]; run_config.use_gpu=true; % run_config.use_gpu=false; run_config.gpu_idx=1; % result dir: result_name='test_1'; result_dir=fullfile('../cache_data', 'hof', result_name); % the folder that contains testing images: ds_config.img_data_dir=strcat('../datasets/hof/test'); %ds_config.img_data_dir='../datasets/EgoHands_Temporarily_Removed_Folders/TestImages'; % using a trained model which is trained on VOC 2012 % run_config.trained_model_path='../model_trained/refinenet_res101_voc2012.mat'; % ds_config.class_info=gen_class_info_voc(); % using the object parsing model run_config.trained_model_path='../model_trained/refinenet_res101_hof.mat'; ds_config.class_info=gen_class_info_ego(); % for voc trained model, control the size of input images run_config.input_img_short_edge_min=450; run_config.input_img_short_edge_max=600; % set the input image scales, useful for multi-scale evaluation % e.g. using multiple scale settings (1.0 0.8 0.6) and average the resulting score maps. 
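% Note: rerun with other values of input_img_scale (e.g. 0.8, 0.6) and
% scale-suffixed result folders if multi-scale averaging via
% multiscale_evaluation.m is desired.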
run_config.input_img_scale=1.0;
run_config.gen_net_opts_fn=@gen_net_opts_model_type1;
run_config.run_evaonly=true;
ds_config.use_custom_data=true;
ds_config.use_dummy_gt=true;
run_config.use_dummy_gt=ds_config.use_dummy_gt;
ds_config.ds_name='tmp_data';
run_config.root_cache_dir=result_dir;
mkdir_notexist(run_config.root_cache_dir);
run_config.model_name=result_name;
diary_dir=run_config.root_cache_dir;
mkdir_notexist(diary_dir);
diary(fullfile(diary_dir, 'output.txt'));
diary on
run_dir_name=fileparts(mfilename('fullpath'));
[~, run_dir_name]=fileparts(run_dir_name);
run_config.run_dir_name=run_dir_name;
run_config.run_file_name=mfilename();
ds_info=gen_dataset_info(ds_config);
my_diary_flush();
train_opts=run_config.gen_net_opts_fn(run_config, ds_info.class_info);
imdb=my_gen_imdb(train_opts, ds_info);
data_norm_info=[];
data_norm_info.image_mean=128;
imdb.ref.data_norm_info=data_norm_info;
if run_config.use_gpu
    gpu_num=gpuDeviceCount;
    if gpu_num>=1
        gpuDevice(run_config.gpu_idx);
    else
        error('no gpu found!');
    end
end
[net_config, net_exp_info]=prepare_running_model(train_opts);
my_net_tool(train_opts, imdb, net_config, net_exp_info);
fprintf('\n\n--------------------------------------------------\n\n');
disp('results are saved in:');
disp(run_config.root_cache_dir);
my_diary_flush();
diary off
end

================================================
FILE: refinenet_files/gen_class_info_ego.m
================================================
function class_info=gen_class_info_ego()
class_info=[];
class_info.class_names = { 'background', 'hand','void'}';
void_class_value=255;
class_info.class_label_values=uint8([0:1 void_class_value]);
class_info.background_label_value=uint8(0);
class_info.void_label_values=uint8(void_class_value);
% addpath ../libs/VOCdevkit_2012/VOCcode
class_info.mask_cmap = VOClabelcolormap(256);
class_info=process_class_info(class_info);
end

================================================
FILE: refinenet_files/gen_class_info_gtea.m
================================================
function class_info=gen_class_info_gtea()
class_info=[];
class_info.class_names = { 'background', 'hand','void'}';
void_class_value=255;
class_info.class_label_values=uint8([0:1 void_class_value]);
class_info.background_label_value=uint8(0);
class_info.void_label_values=uint8(void_class_value);
% addpath ../libs/VOCdevkit_2012/VOCcode
class_info.mask_cmap = VOClabelcolormap(256);
class_info=process_class_info(class_info);
end

================================================
FILE: refinenet_files/getIOU.m
================================================
function [ IOU, Precision, Recall,MCR ] = getIOU( pred_map,gt_map,get_other_scores)
%this function computes IOU given a prediction map and the respective
%ground truth map. Both pred_map and gt_map should be 2D binary maps.
% INPUT: pred_map: a binary image prediction map
%        gt_map: a binary image ground truth map
%        get_other_scores: An optional flag which by default is zero.
%        Returns Precision, Recall and MCR (misclassification rate) scores if set to 1.
% OUTPUT: IOU: Intersection over Union score, a scalar in the range [0,1].
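% Pixel-count definitions used below:
%   IOU = TP/(TP+FP+FN),  Precision = TP/(TP+FP),
%   Recall = TP/(TP+FN),  MCR = (FP+FN)/num_pxls.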
IOU=0; Precision=0; Recall=0; MCR=0;

if(nargin < 2)
    error('getIOU: pred_map and gt_map are required inputs');
end
if(nargin < 3)
    get_other_scores = 0;
end

% compute true positives
true_pos = (pred_map.*gt_map);
num_pxls = size(pred_map,1)*size(pred_map,2);
TP_count = sum(true_pos(:));
% compute false positives
FP_count = sum(pred_map(:)) - TP_count;
% compute false negatives
FN_count = sum(gt_map(:)) - TP_count;

% all scores stay at 0 when there are no true positives
if(TP_count ~= 0)
    IOU = (TP_count/(TP_count+FP_count+FN_count));
    if(get_other_scores)
        Precision = (TP_count/(TP_count+FP_count));
        Recall = (TP_count/(TP_count+FN_count));
        MCR = (FP_count+FN_count)/num_pxls;
    else
        Precision = 0;
        Recall = 0;
        MCR = 0;
    end
end

end


================================================
FILE: refinenet_files/multiscale_evaluation.m
================================================

% This script generates final prediction maps by averaging the score maps
% across different scales.
% It assumes that the per-scale score maps have already been generated in the
% source folder, using the naming convention <folder_name>_<scale>,
% e.g., egohands_0.6, egohands_0.8 and egohands_1.0 when the scales match the
% ones set in this file.
% e.g., if scales = [0.8 1.0], then this script will only look for folders
% with suffixes _0.8 and _1.0 respectively.

close all;
clear all;

src_dir = '/home/zxi/refinenet/cache_data/gtea_on_gtea/';
dest_dir = '/home/zxi/Desktop/research/densecrf_on_all_datasets/gtea-valset/';

scales = [0.6 0.8 1.0];
img_size = [216 384];  % for EgoHands
%img_size = [405 720]; % for GTEA

folder_name = 'val';
folders = dir(fullfile(src_dir, strcat(folder_name, '_', num2str(scales(1), 1))));
% remove '.' and '..' from directories
folders = folders(~ismember({folders.name}, {'.','..'}));
%folders = {};

for i = 1:length(folders)
    % for each file, read the score map from its respective folder for all
    % scales, take the average score map, and get the prediction map for it;
    % then write it back to the destination folder.
    folder = strcat(folder_name, '_', num2str(scales(1), 1));
    path_to_sc_maps = fullfile(src_dir, folder, folders(i).name, 'predict_result_full');
    files = dir(path_to_sc_maps);
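    % the per-scale folders are assumed to contain identically named .mat
    % score maps, since the file names listed here are reused to load the
    % matching map from every <folder_name>_<scale> folder below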
    % remove '.' and '..' from directories
    files = files(~ismember({files.name}, {'.','..'}));

    for j = 1:length(files)
        % re-derive the first-scale folder here: path_to_sc_maps is
        % overwritten inside the scale loop below, so it must be reset
        % for every file
        folder = strcat(folder_name, '_', num2str(scales(1), 1));
        path_to_sc_maps = fullfile(src_dir, folder, folders(i).name, 'predict_result_full');
        path_to_one_map = fullfile(path_to_sc_maps, files(j).name);
        sc_map1 = load(path_to_one_map);
        sc_map1 = double(sc_map1.data_obj.score_map);
        score_map_size = size(sc_map1);
        score_map_size = score_map_size(1:2);
        if any(img_size ~= score_map_size)
            % resize in the log domain, clamping at -20, so scores stay positive
            sc_map1 = log(sc_map1);
            sc_map1 = max(sc_map1, -20);
            sc_map1 = my_resize(single(sc_map1), img_size);
            sc_map1 = exp(sc_map1);
        end
        sum_sc = double(sc_map1); % initialize with the first score map

        for k = 2:length(scales)
            folder = strcat(folder_name, '_', num2str(scales(k), 1));
            path_to_sc_maps = fullfile(src_dir, folder, folders(i).name, 'predict_result_full');
            path_to_one_map = fullfile(path_to_sc_maps, files(j).name);
            sc_map = load(path_to_one_map);
            sc_map = double(sc_map.data_obj.score_map);
            score_map_size = size(sc_map);
            score_map_size = score_map_size(1:2);
            if any(img_size ~= score_map_size)
                sc_map = log(sc_map);
                sc_map = max(sc_map, -20);
                sc_map = my_resize(single(sc_map), img_size);
                sc_map = exp(sc_map);
            end
            sum_sc = sum_sc + double(sc_map);
        end

        meansc_map = sum_sc/length(scales); % average score map over multiple scales
        imshow(meansc_map)

        file_name = files(j).name;
        file_name = strcat(file_name(1:end-4), '.png');

        % save the averaged score maps (comment this block out if they are not needed)
        path_to_dest_scmaps = fullfile(dest_dir, 'score_maps', folders(i).name);
        if(~exist(path_to_dest_scmaps, 'dir'))
            mkdir(path_to_dest_scmaps);
        end
        map_to_write = gather(meansc_map);
        save(fullfile(path_to_dest_scmaps, files(j).name), 'map_to_write');

        % take the per-pixel argmax over class scores and map it to label values
        [~, predict_mask] = max(meansc_map, [], 3);
        % imshow(predict_mask, [])
        class_info = gen_class_info_ego();
        predict_mask_data = class_info.class_label_values(predict_mask);
        %unique(predict_mask_data) % uncomment for debug only

        path_to_dest = fullfile(dest_dir, folders(i).name, 'predict_result_mask');
        if(~exist(path_to_dest, 'dir'))
            mkdir(path_to_dest)
        end
        imwrite(logical(predict_mask_data), fullfile(path_to_dest, file_name));
    end
end


================================================
FILE: refinenet_files/my_gen_ds_info_egohands.m
================================================

function ds_info=my_gen_ds_info_egohands(ds_config)

ds_dir=fullfile('../datasets', 'EgoHands');

train_idx_file=fullfile(ds_dir, 'ImageSets/Segmentation/train.txt');
fid=fopen(train_idx_file);
train_file_names=textscan(fid, '%s');
train_file_names=train_file_names{1};
fclose(fid);

val_idx_file=fullfile(ds_dir, 'ImageSets/Segmentation/val.txt');
fid=fopen(val_idx_file);
val_file_names=textscan(fid, '%s');
val_file_names=val_file_names{1};
fclose(fid);

train_num=length(train_file_names);
img_names=cat(1, train_file_names, val_file_names);
img_num=length(img_names);

img_files=cell(img_num, 1);
mask_files=cell(img_num, 1);
for t_idx=1:img_num
    file_name=img_names{t_idx};
    img_files{t_idx}=[file_name(1:end-4) '.jpg'];
    mask_file_name = strrep(file_name, 'JPEGImages', 'SegmentationClass');
    mask_files{t_idx}=[mask_file_name(1:end-4) '.png'];
end

train_idxes=1:train_num;
val_idxes=train_num+1:img_num;

ds_info=[];
ds_info.img_names=img_names;
ds_info.img_files=img_files;
ds_info.mask_files=mask_files;
ds_info.train_idxes=uint32(train_idxes');
ds_info.test_idxes=uint32(val_idxes');

img_dir=fullfile(ds_dir, 'JPEGImages');
mask_dir=fullfile(ds_dir, 'SegmentationClass');

data_dirs=[];
data_dirs{1}=img_dir;
data_dirs{2}=mask_dir;
% every image maps to data_dirs{1}, every mask to data_dirs{2}
data_dir_idxes_img=zeros([img_num 1], 'uint8')+1;
data_dir_idxes_mask=zeros([img_num 1], 'uint8')+2;
ds_info.data_dir_idxes_img=data_dir_idxes_img;
ds_info.data_dir_idxes_mask=data_dir_idxes_mask;
ds_info.data_dirs=data_dirs;
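% expected on-disk layout, inferred from the paths above (names illustrative):
%   ../datasets/EgoHands/ImageSets/Segmentation/{train,val}.txt  (one relative image path per line)
%   ../datasets/EgoHands/JPEGImages/<name>.jpg                   (input frames)
%   ../datasets/EgoHands/SegmentationClass/<name>.png            (ground-truth hand masks)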
ds_info.ds_dir=ds_dir;
ds_info.class_info=gen_class_info_ego();
ds_info.ds_name='EgoHands';

ds_info=process_ds_info_classification(ds_info, ds_config);

end


================================================
FILE: refinenet_files/my_gen_ds_info_eyth.m
================================================

function ds_info=my_gen_ds_info_eyth(ds_config)

ds_dir=fullfile('../datasets', 'eyth');

train_idx_file=fullfile(ds_dir, 'ImageSets/Segmentation/train.txt');
fid=fopen(train_idx_file);
train_file_names=textscan(fid, '%s');
train_file_names=train_file_names{1};
fclose(fid);

val_idx_file=fullfile(ds_dir, 'ImageSets/Segmentation/val.txt');
fid=fopen(val_idx_file);
val_file_names=textscan(fid, '%s');
val_file_names=val_file_names{1};
fclose(fid);

train_num=length(train_file_names);
img_names=cat(1, train_file_names, val_file_names);
img_num=length(img_names);

img_files=cell(img_num, 1);
mask_files=cell(img_num, 1);
for t_idx=1:img_num
    file_name=img_names{t_idx};
    img_files{t_idx}=[file_name(1:end-4) '.jpg'];
    mask_file_name = strrep(file_name, 'JPEGImages', 'SegmentationClass');
    mask_files{t_idx}=[mask_file_name(1:end-4) '.png'];
end

train_idxes=1:train_num;
val_idxes=train_num+1:img_num;

ds_info=[];
ds_info.img_names=img_names;
ds_info.img_files=img_files;
ds_info.mask_files=mask_files;
ds_info.train_idxes=uint32(train_idxes');
ds_info.test_idxes=uint32(val_idxes');

img_dir=fullfile(ds_dir, 'JPEGImages');
mask_dir=fullfile(ds_dir, 'SegmentationClass');

data_dirs=[];
data_dirs{1}=img_dir;
data_dirs{2}=mask_dir;
data_dir_idxes_img=zeros([img_num 1], 'uint8')+1;
data_dir_idxes_mask=zeros([img_num 1], 'uint8')+2;
ds_info.data_dir_idxes_img=data_dir_idxes_img;
ds_info.data_dir_idxes_mask=data_dir_idxes_mask;
ds_info.data_dirs=data_dirs;

ds_info.ds_dir=ds_dir;
ds_info.class_info=gen_class_info_ego();
ds_info.ds_name='eyth';

ds_info=process_ds_info_classification(ds_info, ds_config);

end


================================================
FILE: refinenet_files/my_gen_ds_info_gtea.m
================================================

function ds_info=my_gen_ds_info_gtea(ds_config)

ds_dir=fullfile('../datasets', 'gtea');

train_idx_file=fullfile(ds_dir, 'ImageSets/Segmentation/train.txt');
fid=fopen(train_idx_file);
train_file_names=textscan(fid, '%s');
train_file_names=train_file_names{1};
fclose(fid);

val_idx_file=fullfile(ds_dir, 'ImageSets/Segmentation/val.txt');
fid=fopen(val_idx_file);
val_file_names=textscan(fid, '%s');
val_file_names=val_file_names{1};
fclose(fid);

train_num=length(train_file_names);
img_names=cat(1, train_file_names, val_file_names);
img_num=length(img_names);

img_files=cell(img_num, 1);
mask_files=cell(img_num, 1);
for t_idx=1:img_num
    file_name=img_names{t_idx};
    img_files{t_idx}=[file_name(1:end-4) '.jpg'];
    mask_file_name = strrep(file_name, 'JPEGImages', 'SegmentationClass');
    mask_files{t_idx}=[mask_file_name(1:end-4) '.png'];
end

train_idxes=1:train_num;
val_idxes=train_num+1:img_num;

ds_info=[];
ds_info.img_names=img_names;
ds_info.img_files=img_files;
ds_info.mask_files=mask_files;
ds_info.train_idxes=uint32(train_idxes');
ds_info.test_idxes=uint32(val_idxes');

img_dir=fullfile(ds_dir, 'JPEGImages');
mask_dir=fullfile(ds_dir, 'SegmentationClass');

data_dirs=[];
data_dirs{1}=img_dir;
data_dirs{2}=mask_dir;
data_dir_idxes_img=zeros([img_num 1], 'uint8')+1;
data_dir_idxes_mask=zeros([img_num 1], 'uint8')+2;
ds_info.data_dir_idxes_img=data_dir_idxes_img;
ds_info.data_dir_idxes_mask=data_dir_idxes_mask;
ds_info.data_dirs=data_dirs;

ds_info.ds_dir=ds_dir;
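% all four dataset scripts (EgoHands, EYTH, GTEA, HOF) reuse gen_class_info_ego(),
% since every dataset shares the same background/hand/void labeling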
ds_info.class_info=gen_class_info_ego();
ds_info.ds_name='gtea';

ds_info=process_ds_info_classification(ds_info, ds_config);

end


================================================
FILE: refinenet_files/my_gen_ds_info_hof.m
================================================

function ds_info=my_gen_ds_info_hof(ds_config)

ds_dir=fullfile('../datasets', 'hof');

train_idx_file=fullfile(ds_dir, 'ImageSets/Segmentation/train.txt');
fid=fopen(train_idx_file);
train_file_names=textscan(fid, '%s');
train_file_names=train_file_names{1};
fclose(fid);

val_idx_file=fullfile(ds_dir, 'ImageSets/Segmentation/val.txt');
fid=fopen(val_idx_file);
val_file_names=textscan(fid, '%s');
val_file_names=val_file_names{1};
fclose(fid);

train_num=length(train_file_names);
img_names=cat(1, train_file_names, val_file_names);
img_num=length(img_names);

img_files=cell(img_num, 1);
mask_files=cell(img_num, 1);
for t_idx=1:img_num
    file_name=img_names{t_idx};
    img_files{t_idx}=[file_name(1:end-4) '.jpg'];
    mask_file_name = strrep(file_name, 'JPEGImages', 'SegmentationClass');
    mask_files{t_idx}=[mask_file_name(1:end-4) '.png'];
end

train_idxes=1:train_num;
val_idxes=train_num+1:img_num;

ds_info=[];
ds_info.img_names=img_names;
ds_info.img_files=img_files;
ds_info.mask_files=mask_files;
ds_info.train_idxes=uint32(train_idxes');
ds_info.test_idxes=uint32(val_idxes');

img_dir=fullfile(ds_dir, 'JPEGImages');
mask_dir=fullfile(ds_dir, 'SegmentationClass');

data_dirs=[];
data_dirs{1}=img_dir;
data_dirs{2}=mask_dir;
data_dir_idxes_img=zeros([img_num 1], 'uint8')+1;
data_dir_idxes_mask=zeros([img_num 1], 'uint8')+2;
ds_info.data_dir_idxes_img=data_dir_idxes_img;
ds_info.data_dir_idxes_mask=data_dir_idxes_mask;
ds_info.data_dirs=data_dirs;

ds_info.ds_dir=ds_dir;
ds_info.class_info=gen_class_info_ego();
ds_info.ds_name='hof';

ds_info=process_ds_info_classification(ds_info, ds_config);

end
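A minimal usage sketch (not part of the repository) showing how getIOU could be
used to score a folder of predicted masks against ground-truth masks; the
directory paths here are hypothetical, and the computeAccuracy*.m scripts in
this folder implement the full evaluation:

    % average per-image IOU between predicted and ground-truth binary masks
    pred_dir = '../cache_data/hof/test_1/predict_result_mask';  % hypothetical path
    gt_dir   = '../datasets/hof/masks';                         % hypothetical path
    files = dir(fullfile(pred_dir, '*.png'));
    ious = zeros(length(files), 1);
    for n = 1:length(files)
        pred = imread(fullfile(pred_dir, files(n).name)) > 0;   % binarize prediction
        gt   = imread(fullfile(gt_dir, files(n).name)) > 0;     % binarize ground truth
        ious(n) = getIOU(pred, gt);
    end
    fprintf('mean IOU over %d images: %.4f\n', length(files), mean(ious));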