Repository: NVIDIA-AI-IOT/trt_pose_hand
Branch: main
Commit: bb7536dcdc67
Files: 17
Total size: 91.8 KB
Directory structure:
gitextract_gm96bpqz/
├── LICENSE.md
├── README.md
├── cursor_control_live_demo.ipynb
├── data_collection/
│ └── gesture_data_collection.ipynb
├── dataloader.py
├── gesture_classification_live_demo.ipynb
├── gesture_classifier.py
├── gesture_data_collection_pose.ipynb
├── gesture_training/
│ ├── realold_svmmodel.sav
│ └── train_gesture_classification.ipynb
├── live_hand_pose.ipynb
├── mini_paint_live_demo.ipynb
├── model/
│ └── README.md
├── preprocess/
│ ├── gesture.json
│ └── hand_pose.json
├── preprocessdata.py
└── svmmodel.sav
================================================
FILE CONTENTS
================================================
================================================
FILE: LICENSE.md
================================================
Copyright (c) 2020, NVIDIA CORPORATION. All rights reserved.
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
================================================
FILE: README.md
================================================
# Hand Pose Estimation And Classification
This project is an extension of TRT Pose for hand pose detection. The project includes
- Pretrained models for hand pose estimation capable of running in real time on Jetson Xavier NX.
- Scripts for applications of hand pose estimation
    - Hand gesture recognition (hand pose classification)
    - Cursor control
    - Mini-Paint type of application
- Pretrained model for gesture recognition
## Getting Started
### Step 1 - Install trt_pose and its dependencies
Make sure to follow all the instructions for trt_pose and install all of its dependencies.
Follow step 1 and step 2 from https://github.com/NVIDIA-AI-IOT/trt_pose.
### Step 2 - Install dependencies for hand pose
`pip install traitlets`
### Step 3 - Download the model weight
| Model | Weight |
|-------|---------|
| hand_pose_resnet18_baseline_att_224x224_A | [download model](https://drive.google.com/file/d/1NCVo0FiooWccDzY7hCc5MAKaoUpts3mo/view?usp=sharing) |
1. Download the model weight using the link above.
2. Place the downloaded weight in the [model](model/) directory.
### Step 4 - Run hand pose and its applications
A) Hand Pose demo
- Open and follow the live_hand_pose.ipynb notebook. A condensed sketch of the notebook's model-loading step is shown below.
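The model loading and TensorRT optimization step is shared by all the demo notebooks. The following is a minimal sketch of it, condensed from the notebooks and assuming the weight from Step 3 was placed in the model/ directory:

```python
import json
import os

import torch
import trt_pose.coco
import trt_pose.models

# Hand topology: the keypoints and the skeleton links between them.
with open('preprocess/hand_pose.json', 'r') as f:
    hand_pose = json.load(f)

topology = trt_pose.coco.coco_category_to_topology(hand_pose)
num_parts = len(hand_pose['keypoints'])
num_links = len(hand_pose['skeleton'])

model = trt_pose.models.resnet18_baseline_att(num_parts, 2 * num_links).cuda().eval()

WIDTH, HEIGHT = 224, 224
data = torch.zeros((1, 3, HEIGHT, WIDTH)).cuda()

# Convert to TensorRT once with torch2trt and cache the optimized engine.
OPTIMIZED_MODEL = 'model/hand_pose_resnet18_att_244_244_trt.pth'
if not os.path.exists(OPTIMIZED_MODEL):
    model.load_state_dict(torch.load('model/hand_pose_resnet18_att_244_244.pth'))
    import torch2trt
    model_trt = torch2trt.torch2trt(model, [data], fp16_mode=True,
                                    max_workspace_size=1 << 25)
    torch.save(model_trt.state_dict(), OPTIMIZED_MODEL)

from torch2trt import TRTModule
model_trt = TRTModule()
model_trt.load_state_dict(torch.load(OPTIMIZED_MODEL))
```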

B) Hand gesture recognition (hand pose classification)
- Install dependencies
    - scikit-learn
        - `pip install -U scikit-learn`
        - or install it from source

The current gesture classification model supports six classes (fist, pan, stop, fine, peace, no hand).
More gestures can be added by creating your own dataset and training an SVM model on it.
An SVM model weight (svmmodel.sav) is provided for inference; a minimal loading sketch is shown below.
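At inference time, the demo notebooks convert the detected joints into distance features with preprocessdata.find_distance and feed them to the saved SVM. A minimal sketch, reusing model_trt, num_parts, and the preprocessdata helper from the demo notebooks:

```python
import pickle

# Load the pretrained gesture classifier shipped with this repo.
clf = pickle.load(open('svmmodel.sav', 'rb'))

# joints is the per-frame keypoint list returned by
# preprocessdata.joints_inference(image, counts, objects, peaks).
dist_bn_joints = preprocessdata.find_distance(joints)

# The all-zero second row is a dummy sample, as in the demo notebooks;
# only the first prediction is used. The class names live in
# preprocess/gesture.json.
gesture = clf.predict([dist_bn_joints, [0] * num_parts * num_parts])
gesture_class = gesture[0]
```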

To build your own hand gesture classification on top of the hand pose estimation, follow these steps:
- Create your own dataset using gesture_data_collection.ipynb or gesture_data_collection_pose.ipynb.
This lets you record the gestures you want to classify (e.g. thumbs up, fist, etc.).
The notebook automatically creates a dataset with images and labels that is ready to be trained for gesture classification.
- Train using the train_gesture_classification.ipynb notebook file. It uses an SVM from scikit-learn; a condensed training sketch follows this list.
Other types of models can also be experimented with.
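For reference, a condensed sketch of the training step from train_gesture_classification.ipynb, where `hand` is the dataloader from dataloader.py and `joints_train`/`joints_test` are the distance features computed for each image:

```python
import pickle

from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Scale the distance features, then fit an RBF-kernel SVM, as in the notebook.
clf = make_pipeline(StandardScaler(), SVC(gamma='auto', kernel='rbf'))
clf, predicted = preprocessdata.trainsvm(clf, joints_train, joints_test,
                                         hand.labels_train, hand.labels_test)

# Persist the trained classifier for the live demos.
pickle.dump(clf, open('svmmodel.sav', 'wb'))
```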
C) Cursor control application
- Install dependencies
    - pyautogui
        - `python3 -m pip install pyautogui`
        - On Jetson, install it from source
- Open and follow the cursor_control_live_demo.ipynb notebook.
- This will allow you to control your mouse cursor on your desktop using the hand gesture classification: when your hand gesture is pan, you control the cursor; when it is click, a left click is performed. A sketch of the coordinate mapping is shown below.
(Demo GIFs, omitted from this extract: buying a tuna sandwich :) and navigating a map.)
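A minimal sketch of the cursor mapping used in cursor_control_live_demo.ipynb: the tracked joint's frame coordinates are scaled to screen coordinates (the /256 divisor is as written in the notebook, even though the capture frames are 224x224):

```python
import pyautogui

screenWidth, screenHeight = pyautogui.size()
cursor_joint = 6  # joint index the notebook tracks for the cursor

def move_cursor(joints):
    # joints[i] == [0, 0] means joint i was not detected in this frame
    x, y = joints[cursor_joint]
    if [x, y] != [0, 0]:
        pyautogui.moveTo(x * screenWidth / 256, y * screenHeight / 256)
```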
D) Mini-Paint
A mini paint app that lets you draw, erase, and clear on your camera screen. Open and follow the mini_paint_live_demo.ipynb notebook.
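The exact drawing logic lives in mini_paint_live_demo.ipynb; the sketch below is only a rough illustration under assumed names (draw, erase, and clear are hypothetical class labels, and joint 8 is an assumed fingertip index), showing how a persistent canvas can be updated from the classified gesture and blended onto each frame:

```python
import cv2
import numpy as np

canvas = np.zeros((224, 224, 3), dtype=np.uint8)  # persistent paint layer

def update_canvas(gesture, joints):
    x, y = joints[8]  # assumed fingertip joint index
    if gesture == 'draw':       # hypothetical class name
        cv2.circle(canvas, (x, y), 2, (0, 255, 0), -1)
    elif gesture == 'erase':    # hypothetical class name
        cv2.circle(canvas, (x, y), 8, (0, 0, 0), -1)
    elif gesture == 'clear':    # hypothetical class name
        canvas[:] = 0

def render(image):
    # Blend the paint layer onto the current camera frame for display.
    return cv2.addWeighted(image, 1.0, canvas, 1.0, 0)
```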
-------------------------------------------------------------------------------------------------------------------------------------
The model was trained using the training script in trt_pose and hand pose data collected at NVIDIA.
Model details: resnet18 baseline with attention, 224x224 input (hand_pose_resnet18_baseline_att_224x224_A).
-------------------------------------------------------------------------------------------------------------------------------------
## See also
- [trt_pose](https://github.com/NVIDIA-AI-IOT/trt_pose) - Real-time pose estimation accelerated with NVIDIA TensorRT
- [deepstream_pose_estimation](https://github.com/NVIDIA-AI-IOT/deepstream_pose_estimation) - [trt_pose](https://github.com/NVIDIA-AI-IOT/trt_pose) deepstream integration
- [ros2_trt_pose](https://github.com/NVIDIA-AI-IOT/ros2_trt_pose) - ROS 2 package for "trt_pose": real-time human pose estimation on NVIDIA Jetson Platform
- [torch2trt](http://github.com/NVIDIA-AI-IOT/torch2trt) - An easy-to-use PyTorch to TensorRT converter
## References
Cao, Zhe, et al. "Realtime multi-person 2d pose estimation using part affinity fields." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.
Xiao, Bin, Haiping Wu, and Yichen Wei. "Simple baselines for human pose estimation and tracking." Proceedings of the European Conference on Computer Vision (ECCV). 2018.
================================================
FILE: cursor_control_live_demo.ipynb
================================================
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Matplotlib created a temporary config/cache directory at /tmp/matplotlib-q__yfjpz because the default path (/home/mikyas/.cache/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.\n"
]
}
],
"source": [
"import json\n",
"import cv2\n",
"import matplotlib.pyplot as plt\n",
"import matplotlib.image as mpimg \n",
"import trt_pose.coco\n",
"import math\n",
"import os\n",
"import numpy as np\n",
"import traitlets\n",
"import pickle \n",
"import pyautogui\n",
"import time "
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<All keys matched successfully>"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"with open('hand_pose.json', 'r') as f:\n",
" hand_pose = json.load(f)\n",
"\n",
"topology = trt_pose.coco.coco_category_to_topology(hand_pose)\n",
"import trt_pose.models\n",
"\n",
"num_parts = len(hand_pose['keypoints'])\n",
"num_links = len(hand_pose['skeleton'])\n",
"\n",
"model = trt_pose.models.resnet18_baseline_att(num_parts, 2 * num_links).cuda().eval()\n",
"import torch\n",
"\n",
"\n",
"WIDTH = 224\n",
"HEIGHT = 224\n",
"data = torch.zeros((1, 3, HEIGHT, WIDTH)).cuda()\n",
"\n",
"if not os.path.exists('hand_pose_resnet18_att_244_244_trt.pth'):\n",
" MODEL_WEIGHTS = 'hand_pose_resnet18_att_244_244.pth'\n",
" model.load_state_dict(torch.load(MODEL_WEIGHTS))\n",
" import torch2trt\n",
" model_trt = torch2trt.torch2trt(model, [data], fp16_mode=True, max_workspace_size=1<<25)\n",
" OPTIMIZED_MODEL = 'hand_pose_resnet18_att_244_244_trt.pth'\n",
" torch.save(model_trt.state_dict(), OPTIMIZED_MODEL)\n",
"\n",
"\n",
"OPTIMIZED_MODEL = 'hand_pose_resnet18_att_244_244_trt.pth'\n",
"from torch2trt import TRTModule\n",
"\n",
"model_trt = TRTModule()\n",
"model_trt.load_state_dict(torch.load(OPTIMIZED_MODEL))"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"from trt_pose.draw_objects import DrawObjects\n",
"from trt_pose.parse_objects import ParseObjects\n",
"\n",
"parse_objects = ParseObjects(topology,cmap_threshold=0.15, link_threshold=0.15)\n",
"draw_objects = DrawObjects(topology)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"\n",
"import torchvision.transforms as transforms\n",
"import PIL.Image\n",
"\n",
"mean = torch.Tensor([0.485, 0.456, 0.406]).cuda()\n",
"std = torch.Tensor([0.229, 0.224, 0.225]).cuda()\n",
"device = torch.device('cuda')\n",
"\n",
"def preprocess(image):\n",
" global device\n",
" device = torch.device('cuda')\n",
" image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n",
" image = PIL.Image.fromarray(image)\n",
" image = transforms.functional.to_tensor(image).to(device)\n",
" image.sub_(mean[:, None, None]).div_(std[:, None, None])\n",
" return image[None, ...]"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.pipeline import make_pipeline\n",
"from sklearn.preprocessing import StandardScaler\n",
"from sklearn.svm import SVC\n",
"clf = make_pipeline(StandardScaler(), SVC(gamma='auto', kernel='rbf'))"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"from preprocessdata import preprocessdata\n",
"preprocessdata = preprocessdata(topology, num_parts)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"svm_train = False\n",
"if svm_train:\n",
" clf, predicted = preprocessdata.trainsvm(clf, joints_train, joints_test, labels_train, hand.labels_test)\n",
" filename = 'svmmodel.sav'\n",
" pickle.dump(clf, open(filename, 'wb'))\n",
"else:\n",
" filename = 'svmmodel.sav'\n",
" clf = pickle.load(open(filename, 'rb'))"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"from jetcam.usb_camera import USBCamera\n",
"from jetcam.csi_camera import CSICamera\n",
"from jetcam.utils import bgr8_to_jpeg\n",
"\n",
"camera = USBCamera(width=WIDTH, height=HEIGHT, capture_fps=30, capture_device=1)\n",
"#camera = CSICamera(width=WIDTH, height=HEIGHT, capture_fps=30)\n",
"\n",
"camera.running = True"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "ae138f658a94450a9f45d6f81506aac0",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Image(value=b'', format='jpeg', height='256', width='256')"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import ipywidgets\n",
"from IPython.display import display\n",
"\n",
"\n",
"image_w = ipywidgets.Image(format='jpeg', width=224, height=224)\n",
"display(image_w)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"def draw_joints(image, joints):\n",
" count = 0\n",
" for i in joints:\n",
" if i==[0,0]:\n",
" count+=1\n",
" if count>= 19:\n",
" return \n",
" for i in joints:\n",
" cv2.circle(image, (i[0],i[1]), 2, (0,0,255), 1)\n",
" cv2.circle(image, (joints[0][0],joints[0][1]), 2, (255,0,255), 1)\n",
" for i in hand_pose['skeleton']:\n",
" if joints[i[0]-1][0]==0 or joints[i[1]-1][0] == 0:\n",
" break\n",
" cv2.line(image, (joints[i[0]-1][0],joints[i[0]-1][1]), (joints[i[1]-1][0],joints[i[1]-1][1]), (0,255,0), 1)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"screenWidth, screenHeight = pyautogui.size()\n",
"p_text = 'none'\n",
"p_sc = 0\n",
"cur_x, cur_y = pyautogui.position()\n",
"fixed_x, fixed_y = pyautogui.position()\n",
"pyautogui.FAILSAFE = False\n",
"t0 = time.time()\n",
"def control_cursor(text, joints):\n",
" global p_text\n",
" global p_sc\n",
" global t0\n",
" global cur_x\n",
" global cur_y\n",
" global fixed_x, fixed_y\n",
" cursor_joint = 6\n",
" if p_text!=\"pan\":\n",
" #pyautogui.position()\n",
" fixed_x = joints[cursor_joint][0]\n",
" fixed_y = joints[cursor_joint][1] \n",
" if p_text!=\"click\" and text==\"click\":\n",
" pyautogui.mouseUp(((joints[cursor_joint][0])*screenWidth)/256, ((joints[cursor_joint][1])*screenHeight)/256, button= 'left')\n",
" pyautogui.click()\n",
" if text == \"pan\":\n",
" if joints[cursor_joint]!=[0,0]:\n",
" pyautogui.mouseUp(((joints[cursor_joint][0])*screenWidth)/256, ((joints[cursor_joint][1])*screenHeight)/256, button= 'left')\n",
"\n",
" pyautogui.moveTo(((joints[cursor_joint][0])*screenWidth)/256, ((joints[cursor_joint][1])*screenHeight)/256)\n",
" if text == \"scroll\":\n",
" \n",
" if joints[cursor_joint]!=[0,0] and joints[0]!=[0,0]:\n",
" pyautogui.mouseUp(((joints[cursor_joint][0])*screenWidth)/256, ((joints[cursor_joint][1])*screenHeight)/256, button= 'left')#to_scroll = (joints[8][1]-joints[0][1])/10\n",
" to_scroll = (p_sc-joints[cursor_joint][1])\n",
" if to_scroll>0:\n",
" to_scroll = 1\n",
" else:\n",
" to_scroll = -1\n",
" pyautogui.scroll(int(to_scroll),x=(joints[cursor_joint][0]*screenWidth)/256, y=(joints[cursor_joint][1]*screenHeight)/256)\n",
" if text == \"zoom\":\n",
" \n",
" \n",
" pyautogui.keyDown('ctrl')\n",
" if joints[cursor_joint]!=[0,0] and joints[0]!=[0,0]:\n",
" pyautogui.mouseUp(((joints[cursor_joint][0])*screenWidth)/256, ((joints[cursor_joint][1])*screenHeight)/256, button= 'left')\n",
" \n",
" to_scroll = (p_sc-joints[cursor_joint][1])\n",
" if to_scroll>0:\n",
" to_scroll = 1\n",
" else:\n",
" to_scroll = -1\n",
" t1 = time.time()\n",
" #print(t1-t0)\n",
" if t1-t0>1: \n",
" pyautogui.scroll(int(to_scroll),x=(joints[cursor_joint][0]*screenWidth)/256, y=(joints[cursor_joint][1]*screenHeight)/256)\n",
" t0 = time.time()\n",
" pyautogui.keyUp('ctrl')\n",
" \n",
" \n",
" if text == \"drag\":\n",
" \n",
" if joints[cursor_joint]!=[0,0]:\n",
" pyautogui.mouseDown(((joints[cursor_joint][0])*screenWidth)/256, ((joints[cursor_joint][1])*screenHeight)/256, button= 'left')\n",
" \n",
" \n",
" p_text = text\n",
" p_sc = joints[cursor_joint][1]"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"def execute(change):\n",
" image = change['new']\n",
" data = preprocess(image)\n",
" cmap, paf = model_trt(data)\n",
" cmap, paf = cmap.detach().cpu(), paf.detach().cpu()\n",
" counts, objects, peaks = parse_objects(cmap, paf)#, cmap_threshold=0.15, link_threshold=0.15)\n",
" #draw_objects(image, counts, objects, peaks)\n",
" joints = preprocessdata.joints_inference(image, counts, objects, peaks)\n",
" \n",
" dist_bn_joints = preprocessdata.find_distance(joints)\n",
" gesture = clf.predict([dist_bn_joints,[0]*num_parts*num_parts])\n",
" gesture_joints = gesture[0]\n",
" preprocessdata.prev_queue.append(gesture_joints)\n",
" preprocessdata.prev_queue.pop(0)\n",
" preprocessdata.print_label(image, preprocessdata.prev_queue)\n",
" #draw_joints(image, joints)\n",
" control_cursor(preprocessdata.text, joints)\n",
" image_w.value = bgr8_to_jpeg(image)\n"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"execute({'new': camera.value})"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"camera.observe(execute, names='value')"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"camera.unobserve_all()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#camera.running = False"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.9"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
================================================
FILE: data_collection/gesture_data_collection.ipynb
================================================
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import os\n",
"import json"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"This notebook creates a dataset (images and labels as a json file). The dataset created can be used for pose classification. \n",
"In order to create a new dataset for gesture recoginition specify the following parameters \n",
"\n",
"**no_of_classes** - Number of classes to be created. i.e. For hand pose the number of hand gestures to be created.\n",
"\n",
"**path_dir** - Path to the directory to be created\n",
"\n",
"**dataset_name** - The name of the dataset to be created\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"def create_directories_for_classes(no_of_classes, path_dir, dataset_name):\n",
" dir_ = os.path.join(path_dir, dataset_name)\n",
" for i in range(no_of_classes):\n",
" dir_to_create = os.path.join(dir_,\"%s\" % (i+1))\n",
" try:\n",
" os.makedirs(dir_to_create)\n",
" except FileExistsError:\n",
" print(os.path.join(\"The following directory was not created because it already exsists\", dir_ , ))\n"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [],
"source": [
"dir_datasets = '/home/mikyas/mike_dataset/jj/'\n",
"dataset_name = \"hand_dataset\"\n",
"no_of_classes = 5\n",
"create_directories_for_classes(no_of_classes, dir_datasets, dataset_name )"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import ipywidgets.widgets as widgets\n",
"dir_ = os.path.join(dir_datasets, dataset_name)\n",
"curr_class_no = 1\n",
"button_layout = widgets.Layout(width='128px', height='32px')\n",
"curr_dir = os.path.join(dir_,'%s'%curr_class_no )\n",
"collecting_button = widgets.Button(description= 'Collect Class ' + str(curr_class_no), button_style='success', layout=button_layout)\n",
"prev_button = widgets.Button(description='Previous Class', button_style='primary', layout=button_layout)\n",
"nxt_button = widgets.Button(description='Next Class', button_style='info', layout=button_layout)\n",
"\n",
"dir_count = widgets.IntText(layout=button_layout, value=len(os.listdir(curr_dir)))\n",
"dir_count.continuous_update"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [],
"source": [
"from uuid import uuid1\n",
"def save_snapshot(directory):\n",
" image_path = os.path.join(directory, str(uuid1()) + '.jpg')\n",
" with open(image_path, 'wb') as f:\n",
" f.write(image_w.value)\n",
"def save_dir():\n",
" global curr_dir, dir_count\n",
" save_snapshot(curr_dir)\n",
" dir_count.value = len(os.listdir(curr_dir))\n",
"def prev_dir():\n",
" global curr_class_no, curr_dir, no_of_classes\n",
" if curr_class_no>1:\n",
" curr_class_no-=1\n",
" curr_dir = os.path.join(dir_,'%s'%curr_class_no )\n",
" collecting_button.description = 'Collect Class ' + str(curr_class_no)\n",
" dir_count.value = len(os.listdir(curr_dir))\n",
" dir_count.continuous_update\n",
"def nxt_dir():\n",
" global curr_class_no, curr_dir, no_of_classes\n",
" if curr_class_no<no_of_classes:\n",
" curr_class_no+=1\n",
" curr_dir = os.path.join(dir_,'%s'%curr_class_no )\n",
" collecting_button.description = 'Collect Class ' + str(curr_class_no)\n",
" dir_count.value = len(os.listdir(curr_dir))\n",
"\n",
" \n",
"\n",
"collecting_button.on_click(lambda x: save_dir())\n",
"nxt_button.on_click(lambda x: nxt_dir())\n",
"prev_button.on_click(lambda x: prev_dir())\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"from jetcam.usb_camera import USBCamera\n",
"from jetcam.csi_camera import CSICamera\n",
"from jetcam.utils import bgr8_to_jpeg\n",
"WIDTH = 256\n",
"HEIGHT = 256\n",
"camera = USBCamera(width=WIDTH, height=HEIGHT, capture_fps=30, capture_device=1)\n",
"#camera = CSICamera(width=WIDTH, height=HEIGHT, capture_fps=50)\n",
"\n",
"camera.running = True"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"import ipywidgets\n",
"from IPython.display import display\n",
"\n",
"image_w = ipywidgets.Image(format='jpeg', width=256, height=256)\n"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "fe359c9b5fe04069acc2f1d3549d5415",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Image(value=b'\\xff\\xd8\\xff\\xe0\\x00\\x10JFIF\\x00\\x01\\x01\\x00\\x00\\x01\\x00\\x01\\x00\\x00\\xff\\xdb\\x00C\\x00\\x02\\x01\\x0…"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "025ec73f99464909b3ba3aca3fa1799c",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"HBox(children=(IntText(value=0, layout=Layout(height='32px', width='128px')), Button(button_style='success', d…"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "71a2b3ee3ecc4ed9b66a5f9a3ee6fe02",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"HBox(children=(Button(button_style='info', description='Next Class', layout=Layout(height='32px', width='128px…"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "7c858c4fc67d4a51a0143bd55bbe5068",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"HBox(children=(Button(button_style='primary', description='Previous Class', layout=Layout(height='32px', width…"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"display(image_w)\n",
"display(widgets.HBox([dir_count, collecting_button]))\n",
"display(widgets.HBox([ nxt_button]))\n",
"display(widgets.HBox([ prev_button]))"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"def execute(change):\n",
" image = change['new']\n",
" image_w.value = bgr8_to_jpeg(image[:, ::-1, :]) "
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"execute({'new': camera.value})"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"camera.observe(execute, names='value')"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"camera.unobserve_all()\n",
"camera.running = False"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [],
"source": [
"def generate_labels(dir_, dataset_name):\n",
" labels = []\n",
" starting_label = 1\n",
" for i in range(len(os.listdir(dir_))):\n",
" dir_to_check = os.path.join(dir_,\"%s\" % (i+1))\n",
" for j in range(len(os.listdir(dir_to_check))):\n",
" labels.append(starting_label)\n",
" starting_label+=1\n",
" labels_to_dict = {\"labels\": labels}\n",
" with open((dir_+'.json'), 'w') as outfile:\n",
" json.dump(labels_to_dict, outfile)\n",
" return labels "
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": [
"def rename_images(dir_):\n",
" overall_count = 0\n",
" #dir_ = dir_+dataset_name\n",
" for i in range(len(os.listdir(dir_))):\n",
" dir_to_check = os.path.join(dir_,\"%s\" % (i+1))\n",
" dir_to_check+='/'\n",
" for count, filename in enumerate(os.listdir(dir_to_check)):\n",
" dst = \"%08d.jpg\"% overall_count\n",
" src = dir_to_check+filename\n",
" dst = dir_to_check+dst \n",
" os.rename(src, dst)\n",
" overall_count+=1"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [],
"source": [
"generate_labels(dir_, dataset_name)\n",
"rename_images(dir_)"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"5\n",
"/home/mikyas/mike_dataset/jj/hand_dataset\n",
"0\n",
"1\n",
"2\n",
"3\n",
"4\n"
]
},
{
"data": {
"text/plain": [
"'/home/mikyas/mike_dataset/jj/hand_dataset/hand_dataset.json'"
]
},
"execution_count": 36,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import shutil\n",
"dir_training = dir_datasets +'/trainig/'\n",
"try:\n",
" os.makedirs(dir_training)\n",
"except FileExistsError:\n",
" print(os.path.join(\"The following directory was not created because it already exsists\", dir_ , ))\n",
"for i in range(len(os.listdir(dir_))):\n",
" dir_to_check = os.path.join(dir_,\"%s\" % (i+1))+'/'\n",
" for count, filename in enumerate(os.listdir(dir_to_check)):\n",
" src = dir_to_check+filename\n",
" shutil.move(src,dir_training)\n",
" os.rmdir(dir_to_check)\n",
"shutil.move(dir_training,dir_)\n",
"shutil.move(dir_+'.json',dir_)"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"camera.running = True"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.9"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
================================================
FILE: dataloader.py
================================================
import os
import json
import cv2
import numpy as np
from sklearn import preprocessing


class dataloader:
    def __init__(self, path, label_file, test_label):
        self.train_path = path + "training/"
        self.test_path = path + "testing/"
        self.label_path = path + label_file
        self.test_label_path = path + test_label
        self.train_data = []
        self.train_images = []
        self.train_file_name = []
        self.test_data = []
        self.test_images = []
        self.test_file_name = []
        self.labels_train = []
        self.labels_test = []
        self.load_hand_dataset(self.train_path, self.test_path)
        self._assert_exist(self.label_path)
        self._assert_exist(self.test_label_path)
        self.load_labels(self.label_path, self.test_label_path)
        self.TOTAL_SAMPLES_PER_CLASS = 100

    def _assert_exist(self, label_path):
        msg = 'File is not available: %s' % label_path
        assert os.path.exists(label_path), msg

    def load_labels(self, label_path, test_label):
        """
        Loads the training and testing labels from a json file.
        """
        self._assert_exist(label_path)
        self._assert_exist(test_label)
        with open(label_path, 'r') as f:
            label_data = json.load(f)
            self.labels_train = label_data["labels"]
        with open(test_label, 'r') as f:
            test_label = json.load(f)
            self.labels_test = test_label["labels"]

    def scaled_data(self, train_data, test_data):
        """
        This method helps with scaling/normalizing data.
        """
        raw_scaler = preprocessing.StandardScaler().fit(train_data)
        scaled_train_data = raw_scaler.transform(train_data)
        scaled_test_data = raw_scaler.transform(test_data)
        return scaled_train_data, scaled_test_data, raw_scaler

    def load_hand_dataset(self, train_path, test_path):
        """
        This method loads the images for training the classifier.
        """
        WIDTH = 256
        HEIGHT = 256
        for filename in sorted(os.listdir(train_path)):
            self.train_file_name.append(filename)
            image = cv2.imread(train_path + filename)
            image = cv2.resize(image, (WIDTH, HEIGHT), interpolation=cv2.INTER_AREA)
            # flatten to a 256 * 256 * 3 = 196608 element feature vector
            self.train_data.append(np.reshape(np.array(image), 196608))
            self.train_images.append(image)
        for filename in sorted(os.listdir(test_path)):
            self.test_file_name.append(filename)
            image = cv2.imread(test_path + filename)
            image = cv2.resize(image, (WIDTH, HEIGHT), interpolation=cv2.INTER_AREA)
            self.test_images.append(image)
            self.test_data.append(np.reshape(np.array(image), 196608))

    def smaller_dataset(self, dataset, no_samples_per_class, no_of_classes):
        """
        This method helps training with a subset of the dataset.
        """
        start = 0
        end = no_samples_per_class
        new_dataset = []
        labels = []
        for i in range(no_of_classes):
            new_data = dataset[start:end]
            start = start + self.TOTAL_SAMPLES_PER_CLASS
            end = start + no_samples_per_class
            new_dataset.extend(new_data)
            labels.extend([i + 1] * no_samples_per_class)
        return new_dataset, labels
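# Example usage (hypothetical dataset path), following
# gesture_training/train_gesture_classification.ipynb:
#   hand = dataloader('/path/to/hand_dataset/', 'hand_dataset.json',
#                     'hand_dataset_test.json')
#   train_subset, labels = hand.smaller_dataset(hand.train_images, 100, 6)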
================================================
FILE: gesture_classification_live_demo.ipynb
================================================
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Matplotlib created a temporary config/cache directory at /tmp/matplotlib-rlsubdmo because the default path (/home/mikyas/.cache/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.\n"
]
}
],
"source": [
"import json\n",
"import cv2\n",
"import matplotlib.pyplot as plt\n",
"import matplotlib.image as mpimg \n",
"import trt_pose.coco\n",
"import math\n",
"import os\n",
"import numpy as np\n",
"import traitlets\n",
"import pickle \n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<All keys matched successfully>"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"with open('preprocess/hand_pose.json', 'r') as f:\n",
" hand_pose = json.load(f)\n",
"\n",
"topology = trt_pose.coco.coco_category_to_topology(hand_pose)\n",
"import trt_pose.models\n",
"\n",
"num_parts = len(hand_pose['keypoints'])\n",
"num_links = len(hand_pose['skeleton'])\n",
"\n",
"model = trt_pose.models.resnet18_baseline_att(num_parts, 2 * num_links).cuda().eval()\n",
"import torch\n",
"\n",
"\n",
"WIDTH = 224\n",
"HEIGHT = 224\n",
"data = torch.zeros((1, 3, HEIGHT, WIDTH)).cuda()\n",
"\n",
"if not os.path.exists('model/hand_pose_resnet18_att_244_244_trt.pth'):\n",
" MODEL_WEIGHTS = 'model/hand_pose_resnet18_att_244_244.pth'\n",
" model.load_state_dict(torch.load(MODEL_WEIGHTS))\n",
" import torch2trt\n",
" model_trt = torch2trt.torch2trt(model, [data], fp16_mode=True, max_workspace_size=1<<25)\n",
" OPTIMIZED_MODEL = 'model/hand_pose_resnet18_att_244_244_trt.pth'\n",
" torch.save(model_trt.state_dict(), OPTIMIZED_MODEL)\n",
"\n",
"\n",
"OPTIMIZED_MODEL = 'model/hand_pose_resnet18_att_244_244_trt.pth'\n",
"from torch2trt import TRTModule\n",
"\n",
"model_trt = TRTModule()\n",
"model_trt.load_state_dict(torch.load(OPTIMIZED_MODEL))"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"from trt_pose.draw_objects import DrawObjects\n",
"from trt_pose.parse_objects import ParseObjects\n",
"\n",
"parse_objects = ParseObjects(topology,cmap_threshold=0.12, link_threshold=0.15)\n",
"draw_objects = DrawObjects(topology)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"\n",
"import torchvision.transforms as transforms\n",
"import PIL.Image\n",
"\n",
"mean = torch.Tensor([0.485, 0.456, 0.406]).cuda()\n",
"std = torch.Tensor([0.229, 0.224, 0.225]).cuda()\n",
"device = torch.device('cuda')\n",
"\n",
"def preprocess(image):\n",
" global device\n",
" device = torch.device('cuda')\n",
" image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n",
" image = PIL.Image.fromarray(image)\n",
" image = transforms.functional.to_tensor(image).to(device)\n",
" image.sub_(mean[:, None, None]).div_(std[:, None, None])\n",
" return image[None, ...]"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.pipeline import make_pipeline\n",
"from sklearn.preprocessing import StandardScaler\n",
"from sklearn.svm import SVC\n",
"clf = make_pipeline(StandardScaler(), SVC(gamma='auto', kernel='rbf'))"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"from preprocessdata import preprocessdata\n",
"preprocessdata = preprocessdata(topology, num_parts)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"svm_train = False\n",
"if svm_train:\n",
" clf, predicted = preprocessdata.trainsvm(clf, joints_train, joints_test, hand.labels_train, hand.labels_test)\n",
" filename = 'svmmodel.sav'\n",
" pickle.dump(clf, open(filename, 'wb'))\n",
"else:\n",
" filename = 'svmmodel.sav'\n",
" clf = pickle.load(open(filename, 'rb'))"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"with open('preprocess/gesture.json', 'r') as f:\n",
" gesture = json.load(f)\n",
"gesture_type = gesture[\"classes\"]\n"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"from jetcam.usb_camera import USBCamera\n",
"from jetcam.csi_camera import CSICamera\n",
"from jetcam.utils import bgr8_to_jpeg\n",
"\n",
"camera = USBCamera(width=WIDTH, height=HEIGHT, capture_fps=30, capture_device=1)\n",
"#camera = CSICamera(width=WIDTH, height=HEIGHT, capture_fps=30)\n",
"\n",
"camera.running = True"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"def draw_joints(image, joints):\n",
" count = 0\n",
" for i in joints:\n",
" if i==[0,0]:\n",
" count+=1\n",
" if count>= 3:\n",
" return \n",
" for i in joints:\n",
" cv2.circle(image, (i[0],i[1]), 2, (0,0,255), 1)\n",
" cv2.circle(image, (joints[0][0],joints[0][1]), 2, (255,0,255), 1)\n",
" for i in hand_pose['skeleton']:\n",
" if joints[i[0]-1][0]==0 or joints[i[1]-1][0] == 0:\n",
" break\n",
" cv2.line(image, (joints[i[0]-1][0],joints[i[0]-1][1]), (joints[i[1]-1][0],joints[i[1]-1][1]), (0,255,0), 1)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "33e9829c8753413d8dbc0bc40221d92c",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Image(value=b'', format='jpeg', height='256', width='256')"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import ipywidgets\n",
"from IPython.display import display\n",
"\n",
"\n",
"image_w = ipywidgets.Image(format='jpeg', width=224, height=224)\n",
"display(image_w)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"def execute(change):\n",
" image = change['new']\n",
" data = preprocess(image)\n",
" cmap, paf = model_trt(data)\n",
" cmap, paf = cmap.detach().cpu(), paf.detach().cpu()\n",
" counts, objects, peaks = parse_objects(cmap, paf)\n",
" joints = preprocessdata.joints_inference(image, counts, objects, peaks)\n",
" draw_joints(image, joints)\n",
" #draw_objects(image, counts, objects, peaks)\n",
" dist_bn_joints = preprocessdata.find_distance(joints)\n",
" gesture = clf.predict([dist_bn_joints,[0]*num_parts*num_parts])\n",
" gesture_joints = gesture[0]\n",
" preprocessdata.prev_queue.append(gesture_joints)\n",
" preprocessdata.prev_queue.pop(0)\n",
" preprocessdata.print_label(image, preprocessdata.prev_queue, gesture_type)\n",
" image_w.value = bgr8_to_jpeg(image)\n"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"execute({'new': camera.value})"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"camera.observe(execute, names='value')"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"#camera.unobserve_all()"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"#camera.running = False"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.9"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
================================================
FILE: gesture_classifier.py
================================================
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


class gesture_classifier:
    def __init__(self):
        pass

    def svm_accuracy(self, test_predicted, labels_test):
        """Returns the fraction of test samples whose predicted label matches."""
        predicted = []
        for i in range(len(labels_test)):
            if labels_test[i] == test_predicted[i]:
                predicted.append(0)
            else:
                predicted.append(1)
        accuracy = 1 - sum(predicted) / len(labels_test)
        return accuracy

    def trainsvm(self, clf, train_data, test_data, labels_train, labels_test):
        """Fits the classifier on the training set and returns it with its test predictions."""
        clf.fit(train_data, labels_train)
        predicted_test = clf.predict(test_data)
        return clf, predicted_test
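# Example usage (hypothetical data; clf is a scikit-learn estimator such as
# make_pipeline(StandardScaler(), SVC())):
#   gc = gesture_classifier()
#   clf, predicted_test = gc.trainsvm(clf, train_data, test_data,
#                                     labels_train, labels_test)
#   accuracy = gc.svm_accuracy(predicted_test, labels_test)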
================================================
FILE: gesture_data_collection_pose.ipynb
================================================
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Matplotlib created a temporary config/cache directory at /tmp/matplotlib-bhsk56tr because the default path (/home/mikyas/.cache/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.\n"
]
}
],
"source": [
"import json\n",
"import cv2\n",
"import matplotlib.pyplot as plt\n",
"import matplotlib.image as mpimg \n",
"import trt_pose.coco\n",
"import math\n",
"import os\n",
"import numpy as np\n",
"import traitlets\n",
"import sys\n"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<All keys matched successfully>"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"with open('preprocess/hand_pose.json', 'r') as f:\n",
" hand_pose = json.load(f)\n",
"\n",
"topology = trt_pose.coco.coco_category_to_topology(hand_pose)\n",
"import trt_pose.models\n",
"\n",
"num_parts = len(hand_pose['keypoints'])\n",
"num_links = len(hand_pose['skeleton'])\n",
"\n",
"model = trt_pose.models.resnet18_baseline_att(num_parts, 2 * num_links).cuda().eval()\n",
"import torch\n",
"\n",
"\n",
"WIDTH = 224\n",
"HEIGHT = 224\n",
"data = torch.zeros((1, 3, HEIGHT, WIDTH)).cuda()\n",
"\n",
"if not os.path.exists('model/hand_pose_resnet18_att_244_244_trt.pth'):\n",
" MODEL_WEIGHTS = 'model/hand_pose_resnet18_att_244_244.pth'\n",
" model.load_state_dict(torch.load(MODEL_WEIGHTS))\n",
" import torch2trt\n",
" model_trt = torch2trt.torch2trt(model, [data], fp16_mode=True, max_workspace_size=1<<25)\n",
" OPTIMIZED_MODEL = 'model/hand_pose_resnet18_att_244_244_trt.pth'\n",
" torch.save(model_trt.state_dict(), OPTIMIZED_MODEL)\n",
"\n",
"\n",
"OPTIMIZED_MODEL = 'model/hand_pose_resnet18_att_244_244_trt.pth'\n",
"from torch2trt import TRTModule\n",
"\n",
"model_trt = TRTModule()\n",
"model_trt.load_state_dict(torch.load(OPTIMIZED_MODEL))"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"from trt_pose.draw_objects import DrawObjects\n",
"from trt_pose.parse_objects import ParseObjects\n",
"\n",
"parse_objects = ParseObjects(topology,cmap_threshold=0.15, link_threshold=0.15)\n",
"draw_objects = DrawObjects(topology)"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"def create_directories_for_classes(no_of_classes, path_dir, dataset_name):\n",
" dir_ = os.path.join(path_dir, dataset_name)\n",
" for i in range(no_of_classes):\n",
" dir_to_create = os.path.join(dir_,\"%s\" % (i+1))\n",
" try:\n",
" os.makedirs(dir_to_create)\n",
" except FileExistsError:\n",
" print(os.path.join(\"The following directory was not created because it already exsists\", dir_ , ))\n"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"/home/mikyas/data_collection/hand_dataset_train\n",
"/home/mikyas/data_collection/hand_dataset_train\n",
"/home/mikyas/data_collection/hand_dataset_train\n",
"/home/mikyas/data_collection/hand_dataset_train\n",
"/home/mikyas/data_collection/hand_dataset_train\n",
"/home/mikyas/data_collection/hand_dataset_train\n"
]
}
],
"source": [
"dir_datasets = '/home/mikyas/data_collection/' #give the path to where you want to save you collected data\n",
"dataset_name = \"hand_dataset_train\" #change this to hand_dataset_test when you are collecting data for test\n",
"no_of_classes = 6\n",
"create_directories_for_classes(no_of_classes, dir_datasets, dataset_name )"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"False"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import ipywidgets.widgets as widgets\n",
"dir_ = os.path.join(dir_datasets, dataset_name)\n",
"curr_class_no = 1\n",
"button_layout = widgets.Layout(width='128px', height='32px')\n",
"curr_dir = os.path.join(dir_,'%s'%curr_class_no )\n",
"collecting_button = widgets.Button(description= 'Collect Class ' + str(curr_class_no), button_style='success', layout=button_layout)\n",
"prev_button = widgets.Button(description='Previous Class', button_style='primary', layout=button_layout)\n",
"nxt_button = widgets.Button(description='Next Class', button_style='info', layout=button_layout)\n",
"\n",
"dir_count = widgets.IntText(layout=button_layout, value=len(os.listdir(curr_dir)))\n",
"dir_count.continuous_update"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"from uuid import uuid1\n",
"def save_snapshot(directory):\n",
" image_path = os.path.join(directory, str(uuid1()) + '.jpg')\n",
" with open(image_path, 'wb') as f:\n",
" f.write(image_s.value)\n",
"def save_dir():\n",
" global curr_dir, dir_count\n",
" save_snapshot(curr_dir)\n",
" dir_count.value = len(os.listdir(curr_dir))\n",
"def prev_dir():\n",
" global curr_class_no, curr_dir, no_of_classes\n",
" if curr_class_no>1:\n",
" curr_class_no-=1\n",
" curr_dir = os.path.join(dir_,'%s'%curr_class_no )\n",
" collecting_button.description = 'Collect Class ' + str(curr_class_no)\n",
" dir_count.value = len(os.listdir(curr_dir))\n",
" dir_count.continuous_update\n",
"def nxt_dir():\n",
" global curr_class_no, curr_dir, no_of_classes\n",
" if curr_class_no<no_of_classes:\n",
" curr_class_no+=1\n",
" curr_dir = os.path.join(dir_,'%s'%curr_class_no )\n",
" collecting_button.description = 'Collect Class ' + str(curr_class_no)\n",
" dir_count.value = len(os.listdir(curr_dir))\n",
"\n",
" \n",
"\n",
"collecting_button.on_click(lambda x: save_dir())\n",
"nxt_button.on_click(lambda x: nxt_dir())\n",
"prev_button.on_click(lambda x: prev_dir())"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"\n",
"import torchvision.transforms as transforms\n",
"import PIL.Image\n",
"\n",
"mean = torch.Tensor([0.485, 0.456, 0.406]).cuda()\n",
"std = torch.Tensor([0.229, 0.224, 0.225]).cuda()\n",
"device = torch.device('cuda')\n",
"\n",
"def preprocess(image):\n",
" global device\n",
" device = torch.device('cuda')\n",
" image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n",
" image = PIL.Image.fromarray(image)\n",
" image = transforms.functional.to_tensor(image).to(device)\n",
" image.sub_(mean[:, None, None]).div_(std[:, None, None])\n",
" return image[None, ...]"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"from preprocessdata import preprocessdata\n",
"preprocessdata = preprocessdata(topology, num_parts)\n",
"from gesture_classifier import gesture_classifier\n",
"gesture_classifier = gesture_classifier()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, let's define a function that will preprocess the image, which is originally in BGR8 / HWC format."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"def draw_joints(image, joints):\n",
" count = 0\n",
" for i in joints:\n",
" if i==[0,0]:\n",
" count+=1\n",
" if count>= 3:\n",
" return \n",
" for i in joints:\n",
" cv2.circle(image, (i[0],i[1]), 2, (0,0,255), 1)\n",
" cv2.circle(image, (joints[0][0],joints[0][1]), 2, (255,0,255), 1)\n",
" for i in hand_pose['skeleton']:\n",
" if joints[i[0]-1][0]==0 or joints[i[1]-1][0] == 0:\n",
" break\n",
" cv2.line(image, (joints[i[0]-1][0],joints[i[0]-1][1]), (joints[i[1]-1][0],joints[i[1]-1][1]), (0,255,0), 1)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"from jetcam.usb_camera import USBCamera\n",
"from jetcam.csi_camera import CSICamera\n",
"from jetcam.utils import bgr8_to_jpeg\n",
"\n",
"camera = USBCamera(width=WIDTH, height=HEIGHT, capture_fps=30, capture_device=1)\n",
"#camera = CSICamera(width=WIDTH, height=HEIGHT, capture_fps=30)\n",
"\n",
"camera.running = True"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [],
"source": [
"import ipywidgets\n",
"from IPython.display import display\n",
"\n",
"\n",
"image_w = ipywidgets.Image(format='jpeg', width=224, height=224)\n",
"image_s = ipywidgets.Image(format='jpeg', width=224, height=224)\n"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "375b9e6d61384dc1beb1b7cc25a0666b",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Image(value=b'', format='jpeg', height='224', width='224')"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "380b01ad84b340c1a9342ccea6ad1b46",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"HBox(children=(IntText(value=0, layout=Layout(height='32px', width='128px')), Button(button_style='success', d…"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "d954768e10cf46fb963d60c733baf892",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"HBox(children=(Button(button_style='info', description='Next Class', layout=Layout(height='32px', width='128px…"
]
},
"metadata": {},
"output_type": "display_data"
},
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "13412c28b4ca46a7b41ab27decca0214",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"HBox(children=(Button(button_style='primary', description='Previous Class', layout=Layout(height='32px', width…"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"display(image_w)\n",
"display(widgets.HBox([dir_count, collecting_button]))\n",
"display(widgets.HBox([ nxt_button]))\n",
"display(widgets.HBox([ prev_button]))"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"def execute(change):\n",
" image = change['new']\n",
" image_s.value = bgr8_to_jpeg(image[:, ::-1, :])\n",
" data = preprocess(image)\n",
" cmap, paf = model_trt(data)\n",
" cmap, paf = cmap.detach().cpu(), paf.detach().cpu()\n",
" counts, objects, peaks = parse_objects(cmap, paf)\n",
" joints = preprocessdata.joints_inference(image, counts, objects, peaks)\n",
" draw_joints(image, joints)\n",
" #draw_objects(image, counts, objects, peaks)# try this for multiple hand pose prediction \n",
" image_w.value = bgr8_to_jpeg(image[:, ::-1, :])"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"execute({'new': camera.value})"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [],
"source": [
"camera.observe(execute, names='value')"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
"camera.unobserve_all()"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [],
"source": [
"#camera.running = False"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"def generate_labels(dir_, dataset_name):\n",
" labels = []\n",
" starting_label = 1\n",
" for i in range(len(os.listdir(dir_))):\n",
" dir_to_check = os.path.join(dir_,\"%s\" % (i+1))\n",
" for j in range(len(os.listdir(dir_to_check))):\n",
" labels.append(starting_label)\n",
" starting_label+=1\n",
" labels_to_dict = {\"labels\": labels}\n",
" with open((dir_+'.json'), 'w') as outfile:\n",
" json.dump(labels_to_dict, outfile)\n",
" return labels "
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [],
"source": [
"def rename_images(dir_):\n",
" overall_count = 0\n",
" #dir_ = dir_+dataset_name\n",
" for i in range(len(os.listdir(dir_))):\n",
" dir_to_check = os.path.join(dir_,\"%s\" % (i+1))\n",
" dir_to_check+='/'\n",
" for count, filename in enumerate(os.listdir(dir_to_check)):\n",
" dst = \"%08d.jpg\"% overall_count\n",
" src = dir_to_check+filename\n",
" dst = dir_to_check+dst \n",
" os.rename(src, dst)\n",
" overall_count+=1"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"generate_labels(dir_, dataset_name)\n",
"rename_images(dir_)"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"'/home/mikyas/chitoku/data_collection/hand_dataset_test/hand_dataset_test.json'"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"import shutil\n",
"dir_training = dir_datasets +'/training/'#change this to /test/ when you are collecting data for test\n",
"try:\n",
" os.makedirs(dir_training)\n",
"except FileExistsError:\n",
" print(os.path.join(\"The following directory was not created because it already exsists\", dir_ , ))\n",
"for i in range(len(os.listdir(dir_))):\n",
" dir_to_check = os.path.join(dir_,\"%s\" % (i+1))+'/'\n",
" for count, filename in enumerate(os.listdir(dir_to_check)):\n",
" src = dir_to_check+filename\n",
" shutil.move(src,dir_training)\n",
" os.rmdir(dir_to_check)\n",
"shutil.move(dir_training,dir_)\n",
"shutil.move(dir_+'.json',dir_)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.9"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
================================================
FILE: gesture_training/train_gesture_classification.ipynb
================================================
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Matplotlib created a temporary config/cache directory at /tmp/matplotlib-mn8hww0h because the default path (/home/mikyas/.cache/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.\n"
]
}
],
"source": [
"import json\n",
"import cv2\n",
"import matplotlib.pyplot as plt\n",
"import matplotlib.image as mpimg \n",
"import trt_pose.coco\n",
"import math\n",
"import os\n",
"import numpy as np\n",
"import traitlets\n",
"import pickle \n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<All keys matched successfully>"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"with open('hand_pose.json', 'r') as f:\n",
" hand_pose = json.load(f)\n",
"\n",
"topology = trt_pose.coco.coco_category_to_topology(hand_pose)\n",
"import trt_pose.models\n",
"\n",
"num_parts = len(hand_pose['keypoints'])\n",
"num_links = len(hand_pose['skeleton'])\n",
"\n",
"model = trt_pose.models.resnet18_baseline_att(num_parts, 2 * num_links).cuda().eval()\n",
"import torch\n",
"\n",
"\n",
"WIDTH = 256\n",
"HEIGHT = 256\n",
"data = torch.zeros((1, 3, HEIGHT, WIDTH)).cuda()\n",
"\n",
"if not os.path.exists('resnet18_244x224_epoch_4150_trt.pth'):\n",
" MODEL_WEIGHTS = 'resnet18_244x224_epoch_4150.pth'\n",
" model.load_state_dict(torch.load(MODEL_WEIGHTS))\n",
" import torch2trt\n",
" model_trt = torch2trt.torch2trt(model, [data], fp16_mode=True, max_workspace_size=1<<25)\n",
" OPTIMIZED_MODEL = 'resnet18_244x224_epoch_4150_trt.pth'\n",
" torch.save(model_trt.state_dict(), OPTIMIZED_MODEL)\n",
"\n",
"\n",
"OPTIMIZED_MODEL = 'resnet18_244x224_epoch_4150_trt.pth'\n",
"from torch2trt import TRTModule\n",
"\n",
"model_trt = TRTModule()\n",
"model_trt.load_state_dict(torch.load(OPTIMIZED_MODEL))"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"from trt_pose.draw_objects import DrawObjects\n",
"from trt_pose.parse_objects import ParseObjects\n",
"\n",
"parse_objects = ParseObjects(topology,cmap_threshold=0.12, link_threshold=0.15)\n",
"draw_objects = DrawObjects(topology)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"\n",
"import torchvision.transforms as transforms\n",
"import PIL.Image\n",
"\n",
"mean = torch.Tensor([0.485, 0.456, 0.406]).cuda()\n",
"std = torch.Tensor([0.229, 0.224, 0.225]).cuda()\n",
"device = torch.device('cuda')\n",
"\n",
"def preprocess(image):\n",
" global device\n",
" device = torch.device('cuda')\n",
" image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n",
" image = PIL.Image.fromarray(image)\n",
" image = transforms.functional.to_tensor(image).to(device)\n",
" image.sub_(mean[:, None, None]).div_(std[:, None, None])\n",
" return image[None, ...]"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.pipeline import make_pipeline\n",
"from sklearn.preprocessing import StandardScaler\n",
"from sklearn.svm import SVC\n",
"clf = make_pipeline(StandardScaler(), SVC(gamma='auto', kernel='rbf'))"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"from preprocessdata import preprocessdata\n",
"preprocessdata = preprocessdata(topology, num_parts)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"from dataloader import dataloader\n",
"path = \"/home/mikyas/mike_dataset/gestures/hand_dataset/\"\n",
"label_file = \"hand_dataset.json\"\n",
"test_label = \"hand_dataset_test.json\"\n",
"hand = dataloader(path, label_file, test_label)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def data_preprocess(images):\n",
" dist_bn_joints_all_data = []\n",
" for im in images:\n",
" im = im[:, ::-1, :]\n",
" data_im = preprocess(im)\n",
" cmap, paf = model_trt(data_im)\n",
" cmap, paf = cmap.detach().cpu(), paf.detach().cpu()\n",
" counts, objects, peaks = parse_objects(cmap, paf)\n",
" joints = preprocessdata.joints_inference(im, counts, objects, peaks)\n",
" dist_bn_joints = preprocessdata.find_distance(joints)\n",
" dist_bn_joints_all_data.append(dist_bn_joints)\n",
" return dist_bn_joints_all_data"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"def smaller_dataset(dataset, no_samples_per_class, no_of_classes):\n",
" total_samples_per_class =100\n",
" start = 0\n",
" end = no_samples_per_class\n",
" new_dataset = []\n",
" labels = []\n",
" for i in range(no_of_classes):\n",
" new_data = dataset[start:end]\n",
" start = start+total_samples_per_class\n",
" end = start+no_samples_per_class\n",
" new_dataset.extend(new_data)\n",
" labels.extend([i+1]*no_samples_per_class)\n",
" return new_dataset, labels"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"train_images, labels_train = hand.smaller_dataset(hand.train_images,100,6)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"joints_train = data_preprocess(hand.train_images)\n",
"joints_test = data_preprocess(hand.test_images)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"svm_train = False\n",
"if svm_train:\n",
" clf, predicted = preprocessdata.trainsvm(clf, joints_train, joints_test, hand.labels_train, hand.labels_test)\n",
" filename = 'svmmodel_new.sav'\n",
" pickle.dump(clf, open(filename, 'wb'))\n",
"else:\n",
" filename = 'svmmodel.sav'\n",
" clf = pickle.load(open(filename, 'rb'))"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"preprocessdata.svm_accuracy(clf.predict(joints_test), hand.labels_test)\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"clf.predict([joints_test[40],[0]*num_parts*num_parts])"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"clf.predict(joints_test)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"from jetcam.usb_camera import USBCamera\n",
"from jetcam.csi_camera import CSICamera\n",
"from jetcam.utils import bgr8_to_jpeg\n",
"\n",
"camera = USBCamera(width=WIDTH, height=HEIGHT, capture_fps=30, capture_device=1)\n",
"#camera = CSICamera(width=WIDTH, height=HEIGHT, capture_fps=30)\n",
"\n",
"camera.running = True"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "0eb636d637824f2596b9f26ee5c970c1",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Image(value=b'', format='jpeg', height='256', width='256')"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import ipywidgets\n",
"from IPython.display import display\n",
"\n",
"\n",
"image_w = ipywidgets.Image(format='jpeg', width=256, height=256)\n",
"display(image_w)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"def execute(change):\n",
" image = change['new']\n",
" data = preprocess(image)\n",
" cmap, paf = model_trt(data)\n",
" cmap, paf = cmap.detach().cpu(), paf.detach().cpu()\n",
" counts, objects, peaks = parse_objects(cmap, paf)#, cmap_threshold=0.15, link_threshold=0.15)\n",
" draw_objects(image, counts, objects, peaks)\n",
" joints = preprocessdata.joints_inference(image, counts, objects, peaks)\n",
" dist_bn_joints = preprocessdata.find_distance(joints)\n",
" gesture = clf.predict([dist_bn_joints,[0]*num_parts*num_parts])\n",
" gesture_joints = gesture[0]\n",
" preprocessdata.prev_queue.append(gesture_joints)\n",
" preprocessdata.prev_queue.pop(0)\n",
" preprocessdata.print_label(image, preprocessdata.prev_queue)\n",
" image_w.value = bgr8_to_jpeg(image)\n"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"execute({'new': camera.value})"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [],
"source": [
"camera.observe(execute, names='value')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"camera.unobserve_all()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#camera.running = False"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.9"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
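Note: the execute() cell above stabilizes the per-frame SVM output with a short queue; a label is only accepted once the same class has been predicted for num_frames consecutive frames (see preprocessdata.print_label). A minimal standalone sketch of that smoothing, with illustrative names:

num_frames = 4
prev_queue = [num_frames] * num_frames   # filler values, as in preprocessdata.__init__

def smooth(class_id, labels):
    """Push the newest SVM class id (1-based); return a label only on full agreement."""
    prev_queue.append(class_id)
    prev_queue.pop(0)
    for k in range(1, len(labels) + 1):
        if prev_queue == [k] * num_frames:
            return labels[k - 1]
    return None   # no stable gesture yet

labels = ["fist", "pan", "stop", "peace", "ok", "no hand", "no hand"]
for pred in [2, 2, 2, 2]:
    print(smooth(pred, labels))   # None, None, None, 'pan'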
================================================
FILE: live_hand_pose.ipynb
================================================
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Matplotlib created a temporary config/cache directory at /tmp/matplotlib-73_p9iw7 because the default path (/home/mikyas/.cache/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.\n"
]
}
],
"source": [
"import json\n",
"import cv2\n",
"import matplotlib.pyplot as plt\n",
"import matplotlib.image as mpimg \n",
"import trt_pose.coco\n",
"import math\n",
"import os\n",
"import numpy as np\n",
"import traitlets\n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<All keys matched successfully>"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"with open('preprocess/hand_pose.json', 'r') as f:\n",
" hand_pose = json.load(f)\n",
"\n",
"topology = trt_pose.coco.coco_category_to_topology(hand_pose)\n",
"import trt_pose.models\n",
"\n",
"num_parts = len(hand_pose['keypoints'])\n",
"num_links = len(hand_pose['skeleton'])\n",
"\n",
"model = trt_pose.models.resnet18_baseline_att(num_parts, 2 * num_links).cuda().eval()\n",
"import torch\n",
"\n",
"\n",
"WIDTH = 224\n",
"HEIGHT = 224\n",
"data = torch.zeros((1, 3, HEIGHT, WIDTH)).cuda()\n",
"\n",
"if not os.path.exists('model/hand_pose_resnet18_att_244_244_trt.pth'):\n",
" MODEL_WEIGHTS = 'model/hand_pose_resnet18_att_244_244.pth'\n",
" model.load_state_dict(torch.load(MODEL_WEIGHTS))\n",
" import torch2trt\n",
" model_trt = torch2trt.torch2trt(model, [data], fp16_mode=True, max_workspace_size=1<<25)\n",
" OPTIMIZED_MODEL = 'model/hand_pose_resnet18_att_244_244_trt.pth'\n",
" torch.save(model_trt.state_dict(), OPTIMIZED_MODEL)\n",
"\n",
"\n",
"OPTIMIZED_MODEL = 'model/hand_pose_resnet18_att_244_244_trt.pth'\n",
"from torch2trt import TRTModule\n",
"\n",
"model_trt = TRTModule()\n",
"model_trt.load_state_dict(torch.load(OPTIMIZED_MODEL))"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"from trt_pose.draw_objects import DrawObjects\n",
"from trt_pose.parse_objects import ParseObjects\n",
"\n",
"parse_objects = ParseObjects(topology,cmap_threshold=0.15, link_threshold=0.15)\n",
"draw_objects = DrawObjects(topology)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"\n",
"import torchvision.transforms as transforms\n",
"import PIL.Image\n",
"\n",
"mean = torch.Tensor([0.485, 0.456, 0.406]).cuda()\n",
"std = torch.Tensor([0.229, 0.224, 0.225]).cuda()\n",
"device = torch.device('cuda')\n",
"\n",
"def preprocess(image):\n",
" global device\n",
" device = torch.device('cuda')\n",
" image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n",
" image = PIL.Image.fromarray(image)\n",
" image = transforms.functional.to_tensor(image).to(device)\n",
" image.sub_(mean[:, None, None]).div_(std[:, None, None])\n",
" return image[None, ...]"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"from preprocessdata import preprocessdata\n",
"preprocessdata = preprocessdata(topology, num_parts)\n",
"from gesture_classifier import gesture_classifier\n",
"gesture_classifier = gesture_classifier()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Next, let's define a function that will preprocess the image, which is originally in BGR8 / HWC format."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"def draw_joints(image, joints):\n",
" count = 0\n",
" for i in joints:\n",
" if i==[0,0]:\n",
" count+=1\n",
" if count>= 3:\n",
" return \n",
" for i in joints:\n",
" cv2.circle(image, (i[0],i[1]), 2, (0,0,255), 1)\n",
" cv2.circle(image, (joints[0][0],joints[0][1]), 2, (255,0,255), 1)\n",
" for i in hand_pose['skeleton']:\n",
" if joints[i[0]-1][0]==0 or joints[i[1]-1][0] == 0:\n",
" break\n",
" cv2.line(image, (joints[i[0]-1][0],joints[i[0]-1][1]), (joints[i[1]-1][0],joints[i[1]-1][1]), (0,255,0), 1)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"from jetcam.usb_camera import USBCamera\n",
"from jetcam.csi_camera import CSICamera\n",
"from jetcam.utils import bgr8_to_jpeg\n",
"\n",
"camera = USBCamera(width=WIDTH, height=HEIGHT, capture_fps=30, capture_device=1)\n",
"#camera = CSICamera(width=WIDTH, height=HEIGHT, capture_fps=30)\n",
"\n",
"camera.running = True"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "f0833455fca64ca1be4515b072fd6941",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Image(value=b'', format='jpeg', height='224', width='224')"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import ipywidgets\n",
"from IPython.display import display\n",
"\n",
"\n",
"image_w = ipywidgets.Image(format='jpeg', width=224, height=224)\n",
"display(image_w)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"def execute(change):\n",
" image = change['new']\n",
" data = preprocess(image)\n",
" cmap, paf = model_trt(data)\n",
" cmap, paf = cmap.detach().cpu(), paf.detach().cpu()\n",
" counts, objects, peaks = parse_objects(cmap, paf)\n",
" joints = preprocessdata.joints_inference(image, counts, objects, peaks)\n",
" draw_joints(image, joints)\n",
" #draw_objects(image, counts, objects, peaks)# try this for multiple hand pose prediction \n",
" image_w.value = bgr8_to_jpeg(image[:, ::-1, :])"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"execute({'new': camera.value})"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"camera.observe(execute, names='value')"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"camera.unobserve_all()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#camera.running = False"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.9"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
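Aside: on a machine without jetcam, the same per-frame pipeline can be driven by plain OpenCV capture. This is a sketch under stated assumptions: it presumes the notebook's preprocess, model_trt, parse_objects, preprocessdata, and draw_joints are already defined in scope, and that a webcam is reachable as device 0; neither assumption comes from the repository.

import cv2

cap = cv2.VideoCapture(0)   # assumed webcam index

while True:
    ok, frame = cap.read()
    if not ok:
        break
    frame = cv2.resize(frame, (224, 224))   # the model expects 224x224 BGR8 input
    data = preprocess(frame)
    cmap, paf = model_trt(data)
    cmap, paf = cmap.detach().cpu(), paf.detach().cpu()
    counts, objects, peaks = parse_objects(cmap, paf)
    joints = preprocessdata.joints_inference(frame, counts, objects, peaks)
    draw_joints(frame, joints)
    cv2.imshow('hand pose', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):   # press q to quit
        break

cap.release()
cv2.destroyAllWindows()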
================================================
FILE: mini_paint_live_demo.ipynb
================================================
{
"cells": [
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"Matplotlib created a temporary config/cache directory at /tmp/matplotlib-ljzfvzxa because the default path (/home/mikyas/.cache/matplotlib) is not a writable directory; it is highly recommended to set the MPLCONFIGDIR environment variable to a writable directory, in particular to speed up the import of Matplotlib and to better support multiprocessing.\n"
]
}
],
"source": [
"import json\n",
"import cv2\n",
"import matplotlib.pyplot as plt\n",
"import matplotlib.image as mpimg \n",
"import trt_pose.coco\n",
"import math\n",
"import os\n",
"import numpy as np\n",
"import traitlets\n",
"import pickle \n"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"<All keys matched successfully>"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"with open('preprocess/hand_pose.json', 'r') as f:\n",
" hand_pose = json.load(f)\n",
"\n",
"topology = trt_pose.coco.coco_category_to_topology(hand_pose)\n",
"import trt_pose.models\n",
"\n",
"num_parts = len(hand_pose['keypoints'])\n",
"num_links = len(hand_pose['skeleton'])\n",
"\n",
"model = trt_pose.models.resnet18_baseline_att(num_parts, 2 * num_links).cuda().eval()\n",
"import torch\n",
"\n",
"\n",
"WIDTH = 224\n",
"HEIGHT = 224\n",
"data = torch.zeros((1, 3, HEIGHT, WIDTH)).cuda()\n",
"\n",
"if not os.path.exists('model/hand_pose_resnet18_att_244_244_trt.pth'):\n",
" MODEL_WEIGHTS = 'model/hand_pose_resnet18_att_244_244.pth'\n",
" model.load_state_dict(torch.load(MODEL_WEIGHTS))\n",
" import torch2trt\n",
" model_trt = torch2trt.torch2trt(model, [data], fp16_mode=True, max_workspace_size=1<<25)\n",
" OPTIMIZED_MODEL = 'model/hand_pose_resnet18_att_244_244_trt.pth'\n",
" torch.save(model_trt.state_dict(), OPTIMIZED_MODEL)\n",
"\n",
"\n",
"OPTIMIZED_MODEL = 'model/hand_pose_resnet18_att_244_244_trt.pth'\n",
"from torch2trt import TRTModule\n",
"\n",
"model_trt = TRTModule()\n",
"model_trt.load_state_dict(torch.load(OPTIMIZED_MODEL))"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"from trt_pose.draw_objects import DrawObjects\n",
"from trt_pose.parse_objects import ParseObjects\n",
"\n",
"parse_objects = ParseObjects(topology,cmap_threshold=0.12, link_threshold=0.15)\n",
"draw_objects = DrawObjects(topology)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [],
"source": [
"\n",
"import torchvision.transforms as transforms\n",
"import PIL.Image\n",
"\n",
"mean = torch.Tensor([0.485, 0.456, 0.406]).cuda()\n",
"std = torch.Tensor([0.229, 0.224, 0.225]).cuda()\n",
"device = torch.device('cuda')\n",
"\n",
"def preprocess(image):\n",
" global device\n",
" device = torch.device('cuda')\n",
" image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)\n",
" image = PIL.Image.fromarray(image)\n",
" image = transforms.functional.to_tensor(image).to(device)\n",
" image.sub_(mean[:, None, None]).div_(std[:, None, None])\n",
" return image[None, ...]"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"from sklearn.pipeline import make_pipeline\n",
"from sklearn.preprocessing import StandardScaler\n",
"from sklearn.svm import SVC\n",
"clf = make_pipeline(StandardScaler(), SVC(gamma='auto', kernel='rbf'))"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"from preprocessdata import preprocessdata\n",
"preprocessdata = preprocessdata(topology, num_parts)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"svm_train = False\n",
"if svm_train:\n",
" clf, predicted = preprocessdata.trainsvm(clf, joints_train, joints_test, hand.labels_train, hand.labels_test)\n",
" filename = 'svmmodel.sav'\n",
" pickle.dump(clf, open(filename, 'wb'))\n",
"else:\n",
" filename = 'svmmodel.sav'\n",
" clf = pickle.load(open(filename, 'rb'))"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [],
"source": [
"from jetcam.usb_camera import USBCamera\n",
"from jetcam.csi_camera import CSICamera\n",
"from jetcam.utils import bgr8_to_jpeg\n",
"\n",
"camera = USBCamera(width=WIDTH, height=HEIGHT, capture_fps=30, capture_device=1)\n",
"#camera = CSICamera(width=WIDTH, height=HEIGHT, capture_fps=30)\n",
"\n",
"camera.running = True"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"def draw_joints(image, joints):\n",
" count = 0\n",
" for i in joints:\n",
" if i==[0,0]:\n",
" count+=1\n",
" if count>= 7:\n",
" return \n",
" for i in joints:\n",
" cv2.circle(image, (i[0],i[1]), 2, (0,0,255), 1)\n",
" cv2.circle(image, (joints[0][0],joints[0][1]), 2, (255,0,255), 1)\n",
" for i in hand_pose['skeleton']:\n",
" if joints[i[0]-1][0]==0 or joints[i[1]-1][0] == 0:\n",
" break\n",
" cv2.line(image, (joints[i[0]-1][0],joints[i[0]-1][1]), (joints[i[1]-1][0],joints[i[1]-1][1]), (0,255,0), 1)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"with open('preprocess/gesture.json', 'r') as f:\n",
" gesture = json.load(f)\n",
"gesture_type = gesture[\"paint\"]"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"pen = []\n",
"rectangle = []\n",
"def draw(image, joints):\n",
" global pen\n",
" global rectangle\n",
" if preprocessdata.text==\"draw\":\n",
" pen.append((joints[6][0], joints[6][1]))\n",
" for i in range(len(pen)):\n",
" cv2.circle(image, pen[i], 1,(0,0,0), 2)\n",
" if preprocessdata.text==\"line\":\n",
" if joints[5]!=[0,0]:\n",
" rectangle.append((joints[6][0], joints[6][1]))\n",
" for i in range(len(rectangle)):\n",
" if i > 0:\n",
" if rectangle[i]!=[0,0]:\n",
" cv2.line(image,rectangle[i-1], rectangle[i], (0,0,0), 2)\n",
" if preprocessdata.text==\"erase\":\n",
" to_be_erased = []\n",
" for i in range(10):\n",
" for j in range(10):\n",
" \n",
" x = (joints[6][0]+i, joints[6][1]+j)\n",
" if x[0]>=0 or x[1]>=0:\n",
" to_be_erased.append(x)\n",
" for i in to_be_erased:\n",
" if i in pen:\n",
" pen.remove(i) \n",
" \n",
" if preprocessdata.text==\"clear\":\n",
" pen.clear()\n",
" rectangle.clear()"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"application/vnd.jupyter.widget-view+json": {
"model_id": "7e1c0d5b318f4e998bbe4312c5d829f2",
"version_major": 2,
"version_minor": 0
},
"text/plain": [
"Image(value=b'', format='jpeg', height='256', width='256')"
]
},
"metadata": {},
"output_type": "display_data"
}
],
"source": [
"import ipywidgets\n",
"from IPython.display import display\n",
"\n",
"\n",
"image_w = ipywidgets.Image(format='jpeg', width=224, height=224)\n",
"display(image_w)"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"def execute(change):\n",
" image = change['new']\n",
" data = preprocess(image)\n",
" cmap, paf = model_trt(data)\n",
" cmap, paf = cmap.detach().cpu(), paf.detach().cpu()\n",
" counts, objects, peaks = parse_objects(cmap, paf)\n",
" joints = preprocessdata.joints_inference(image, counts, objects, peaks)\n",
" #draw_objects(image, counts, objects, peaks )\n",
" draw_joints(image, joints)\n",
" dist_bn_joints = preprocessdata.find_distance(joints)\n",
" gesture = clf.predict([dist_bn_joints,[0]*num_parts*num_parts])\n",
" gesture_joints = gesture[0]\n",
" preprocessdata.prev_queue.append(gesture_joints)\n",
" preprocessdata.prev_queue.pop(0)\n",
" preprocessdata.print_label(image, preprocessdata.prev_queue, gesture_type)\n",
" draw(image, joints)\n",
" #image = image[:, ::-1, :]\n",
" image_w.value = bgr8_to_jpeg(image)\n"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [],
"source": [
"execute({'new': camera.value})"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"camera.observe(execute, names='value')"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"camera.unobserve_all()"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#camera.running = False"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.9"
}
},
"nbformat": 4,
"nbformat_minor": 4
}
================================================
FILE: model/README.md
================================================
# Download and put your model here
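A quick, illustrative sanity check (not part of the repository) before running the demo notebooks, which look for the downloaded weight under this exact name:

import os

WEIGHTS = 'model/hand_pose_resnet18_att_244_244.pth'   # name expected by the notebooks
assert os.path.exists(WEIGHTS), 'download the weight (see top-level README) and place it in model/'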
================================================
FILE: preprocess/gesture.json
================================================
{"paint": ["clear","draw", "click", "line", "erase", "no hand", "no hand"], "mouse": ["drag","pan", "click", "zoom", "scroll", "no hand", "no hand"], "classes": ["fist","pan", "stop", "peace", "ok", "no hand", "no hand"]}
================================================
FILE: preprocess/hand_pose.json
================================================
{"supercategory": "hand", "id": 1, "name": "hand", "keypoints": ["palm","thumb_1", "thumb_2", "thumb_3", "thumb_4", "index_finger_1", "index_finger_2", "index_finger_3", "index_finger_4", "middle_finger_1", "middle_finger_2", "middle_finger_3", "middle_finger_4", "ring_finger_1", "ring_finger_2", "ring_finger_3", "ring_finger_4", "baby_finger_1", "baby_finger_2", "baby_finger_3", "baby_finger_4"], "skeleton": [[1, 5], [1, 9], [1, 13], [1, 17], [1, 21], [2, 3], [3, 4], [4, 5], [6, 7], [7, 8], [8, 9], [10, 11], [11, 12], [12, 13], [14, 15], [15, 16], [16, 17], [18, 19], [19, 20], [20, 21]]}
================================================
FILE: preprocessdata.py
================================================
import math
import pickle
import cv2
class preprocessdata:
def __init__(self, topology, num_parts):
self.joints = []
self.dist_bn_joints = []
self.topology = topology
self.num_parts = num_parts
self.text = "no hand"
self.num_frames = 4
        self.prev_queue = [self.num_frames] * self.num_frames
def svm_accuracy(self, test_predicted, labels_test):
""""
This method calculates the accuracy of the model
Input: test_predicted - predicted test classes
labels_test
Output: accuracy - of the model
"""
predicted = []
for i in range(len(labels_test)):
if labels_test[i]==test_predicted[i]:
predicted.append(0)
else:
predicted.append(1)
accuracy = 1 - sum(predicted)/len(labels_test)
return accuracy
def trainsvm(self, clf, train_data, test_data, labels_train, labels_test):
"""
This method trains the different gestures
Input: clf - Sk-learn model pipeline to train, You can choose an SVM, linear regression, etc
train_data - preprocessed training image data -in this case the distance between the joints
test_data - preprocessed testing image data -in this case the distance between the joints
labels_train - labels for training images
labels_test - labels for testing images
Output: trained model, predicted_test_classes
"""
clf.fit(train_data,labels_train)
predicted_test = clf.predict(test_data)
return clf, predicted_test
#def loadsvmweights():
def joints_inference(self, image, counts, objects, peaks):
"""
This method returns predicted joints from an image/frame
Input: image, counts, objects, peaks
Output: predicted joints
"""
joints_t = []
height = image.shape[0]
width = image.shape[1]
K = self.topology.shape[0]
count = int(counts[0])
for i in range(count):
obj = objects[0][i]
C = obj.shape[0]
for j in range(C):
k = int(obj[j])
picked_peaks = peaks[0][j][k]
joints_t.append([round(float(picked_peaks[1]) * width), round(float(picked_peaks[0]) * height)])
joints_pt = joints_t[:self.num_parts]
rest_of_joints_t = joints_t[self.num_parts:]
#when it does not predict a particular joint in the same association it will try to find it in a different association
for i in range(len(rest_of_joints_t)):
l = i%self.num_parts
if joints_pt[l] == [0,0]:
joints_pt[l] = rest_of_joints_t[i]
#if nothing is predicted
if count == 0:
joints_pt = [[0,0]]*self.num_parts
return joints_pt
def find_distance(self, joints):
"""
This method finds the distance between each joints
Input: a list that contains the [x,y] positions of the 21 joints
Output: a list that contains the distance between the joints
"""
joints_features = []
for i in joints:
for j in joints:
dist_between_i_j = math.sqrt((i[0]-j[0])**2+(i[1]-j[1])**2)
joints_features.append(dist_between_i_j)
return joints_features
def print_label(self, image, gesture_joints, gesture_type):
"""
This method prints the gesture class detected.
Example. Incase of the cursor control application it shows if your gesture is a click or other type of gesture
"""
font = cv2.FONT_HERSHEY_SIMPLEX
color = (255, 0, 0)
org = (50, 50)
thickness = 2
fontScale = 0.5
if self.prev_queue == [1]* self.num_frames:
self.text = gesture_type[0]
elif self.prev_queue == [2]* self.num_frames:
self.text = gesture_type[1]
elif self.prev_queue == [3]* self.num_frames:
self.text = gesture_type[2]
elif self.prev_queue == [4]* self.num_frames:
self.text = gesture_type[3]
elif self.prev_queue == [5]* self.num_frames:
self.text = gesture_type[4]
elif self.prev_queue == [6]* self.num_frames:
self.text = gesture_type[5]
elif self.prev_queue == [7]*self.num_frames:
self.text = gesture_type[6]
image = cv2.putText(image, self.text, org, font,
fontScale, color, thickness, cv2.LINE_AA)
return image
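Usage note: find_distance returns every pairwise joint distance, i.e. num_parts * num_parts = 441 features per frame, which is why the demo notebooks pad clf.predict with a [0]*num_parts*num_parts dummy row (predict expects a 2-D array and the demos keep only gesture[0]). A minimal sketch with hypothetical joint positions:

import math

num_parts = 21
joints = [[x, 0] for x in range(num_parts)]   # hypothetical [x, y] joint positions

# same metric as preprocessdata.find_distance
features = [math.sqrt((i[0] - j[0])**2 + (i[1] - j[1])**2)
            for i in joints for j in joints]
assert len(features) == num_parts * num_parts   # 441 features

# the demos then call: clf.predict([features, [0] * num_parts * num_parts])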