Repository: garion9013/impl-pruning-TF
Branch: master
Commit: a185420c7bf8
Files: 16
Total size: 50.0 MB
Directory structure:
gitextract_z8dibw7e/
├── README.md
├── config.py
├── deploy_test.py
├── deploy_test_pruned.py
├── draw_histogram.py
├── model_ckpt_dense
├── model_ckpt_dense.meta
├── model_ckpt_dense_pruned
├── model_ckpt_sparse_retrained
├── papl.py
├── read_model.py
├── sparse_model_extreme/
│ ├── model_ckpt_dense_pruned
│ ├── model_ckpt_dense_retrained
│ └── model_ckpt_sparse_retrained
├── thspace.py
└── train.py
================================================
FILE CONTENTS
================================================
================================================
FILE: README.md
================================================
## TensorFlow implementation of "Iterative Pruning"
**CAUTION**: This repository is out of date.
I have checked that TensorFlow (>1.3) supports *sparse_matmul*, which seems to be a
more correct way to implement iterative pruning. This work was done naively against a very old
version (0.8.0), so I do not recommend using this code for serious work. There will be no further updates or maintenance.
---
This work is based on "Learning both Weights and Connections for Efficient
Neural Networks" [Han et al.](http://arxiv.org/pdf/1506.02626v3.pdf) @ NIPS '15.
Note that this work only aims to quantify the technique's effect on latency (within TensorFlow);
it is not an optimal implementation. Some details are therefore simplified (e.g. number of iterations, adjusted dropout ratio, etc.).
I applied Iterative Pruning on a small MNIST CNN model (13MB, originally), which can be
accessed from [TensorFlow Tutorials](https://www.tensorflow.org/versions/r0.8/tutorials/mnist/pros/index.html).
After pruning a given percentage of the weights, I simply retrained for two epochs in
each case and obtained compressed models (as small as 2.6MB with 90% pruned) with a minor loss of accuracy
(99.17% -> 98.99% with 90% pruned and retraining). Again, this is not optimal.
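The prune-then-retrain step described above can be sketched in plain NumPy. This is a minimal illustration, not the repository's code: `prune_by_threshold` and `masked_sgd_step` are hypothetical names, and a plain SGD update stands in for the Adam optimizer used in `train.py`.

```python
import numpy as np

def prune_by_threshold(weights, thresh):
    """Zero every weight whose magnitude is below thresh.

    Returns the pruned copy and a boolean keep-mask (True = weight survived).
    """
    keep_mask = np.abs(weights) >= thresh
    return weights * keep_mask, keep_mask

def masked_sgd_step(weights, grad, keep_mask, lr=0.1):
    """One SGD update in which pruned positions receive no gradient,
    so they stay at zero during retraining."""
    return weights - lr * (grad * keep_mask)

w = np.array([[0.20, -0.01],
              [0.003, -0.40]], dtype=np.float32)
pruned, mask = prune_by_threshold(w, thresh=0.05)

grad = np.ones_like(w)  # stand-in gradient, for illustration only
updated = masked_sgd_step(pruned, grad, mask)
```

Masking the gradient is what keeps the pruned connections removed while the surviving weights are retrained.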
## Issues
Due to the lack of support for SparseTensor and its operations in TensorFlow (0.8.0),
this implementation has some limitations. This work uses [*embedding_lookup_sparse*](https://www.tensorflow.org/versions/r0.8/api_docs/python/nn.html#embedding_lookup_sparse) to compute sparse matrix-vector multiplication.
That op is not designed solely for sparse matrix-vector multiplication, so its performance may be sub-optimal. (I'm not sure.)
Also, TensorFlow represents a sparse matrix as \<index, value\> pairs rather than
the typical CSR format, which is more compact and performant.
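As a rough illustration of what a sparse matrix-vector product over \<index, value\> pairs computes, here is a pure-Python sketch. `coo_matvec` is a hypothetical helper, not a TensorFlow API; it only mirrors the arithmetic of a weighted sum per output row.

```python
def coo_matvec(indices, values, shape, x):
    """Compute y = W * x, where W is given as (row, col) index pairs
    plus the matching non-zero values."""
    rows, cols = shape
    assert len(x) == cols
    y = [0.0] * rows
    for (i, j), w in zip(indices, values):
        y[i] += w * x[j]  # accumulate each non-zero's contribution to row i
    return y

# W = [[2, 0, 0],
#      [0, 0, 3]]
indices = [(0, 0), (1, 2)]
values = [2.0, 3.0]
y = coo_matvec(indices, values, (2, 3), [1.0, 5.0, 10.0])
print(y)  # [2.0, 30.0]
```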
In summary, this implementation has the following limitations.
1. *embedding_lookup_sparse* doesn't support ```broadcasting```, which prevents users from running tests with the normal test datasets.
2. Performance may be somewhat sub-optimal.
3. Because a "Sparse Variable" is not supported, manual dense-to-sparse and sparse-to-dense transformation is required.
4. The same approach may also be applicable to 4D convolution tensors, but it is a bit tricky.
5. The current *embedding_lookup_sparse* forces an additional matrix transpose, dimension squeeze, and dimension reshape.
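To make the \<index, value\> versus CSR comparison concrete, the following sketch estimates raw payload sizes for the `fc1` layer. It assumes 4-byte indices and 4-byte values (matching the `int32`/`float32` dtypes in `train.py`) and ignores any checkpoint file overhead, so the figures are estimates rather than measured file sizes. The helper names are hypothetical.

```python
def dense_bytes(rows, cols, value_bytes=4):
    """Bytes for a dense matrix of rows x cols values."""
    return rows * cols * value_bytes

def coo_bytes(nnz, ndim=2, index_bytes=4, value_bytes=4):
    """Bytes for <index, value> pairs: ndim indices plus one value per non-zero."""
    return nnz * (ndim * index_bytes + value_bytes)

def csr_bytes(nnz, rows, index_bytes=4, value_bytes=4):
    """Bytes for CSR: one column index and one value per non-zero,
    plus a row-pointer array of length rows + 1."""
    return nnz * (index_bytes + value_bytes) + (rows + 1) * index_bytes

rows, cols = 1024, 7 * 7 * 64  # fc1 shape after the transpose in train.py
nnz = 320939                   # "fc1_nnz" from thspace.th90

print(dense_bytes(rows, cols))  # 12845056
print(coo_bytes(nnz))           # 3851268
print(csr_bytes(nnz, rows))     # 2571612
```

Under these assumptions, the pair format only beats dense storage once fewer than a third of the weights remain, while CSR breaks even at roughly half.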
## File descriptions and usages
model_ckpt_dense: original model<br>
model_ckpt_dense_pruned: 90% pruned-only model<br>
model_ckpt_sparse_retrained: 90% pruned and retrained model<br>
#### Python package requirements
```bash
sudo apt-get install python-scipy python-numpy python-matplotlib
```
To regenerate these sparse models, first edit ```config.py``` to set your threshold configuration,
and then run training with the second (pruning and retraining) and third (generate the sparse form of the weight data) round options.
```bash
./train.py -2 -3
```
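The thresholds in `thspace.py` appear to correspond to what `read_model.py -r <ratio>` prints: the magnitude at a given position in the sorted weights. A minimal pure-Python sketch of that idea follows; `threshold_for_ratio` is a hypothetical name, and returning `0.0` for a zero ratio is an assumption of this sketch rather than the script's behavior.

```python
def threshold_for_ratio(weights, ratio):
    """Return the magnitude of the weight at position ratio * N
    in the list sorted by absolute value."""
    magnitudes = sorted(abs(w) for w in weights)
    k = int(len(magnitudes) * ratio)
    if k == 0:
        return 0.0  # nothing to prune (assumption of this sketch)
    return magnitudes[k - 1]

weights = [0.5, -0.1, 0.03, -0.9, 0.2]
th = threshold_for_ratio(weights, 0.4)
print(th)  # 0.1
```

Note that `papl.prune_dense` compares with a strict `<`, so a weight whose magnitude equals the threshold is kept.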
To run inference on a single image (seven.png) and measure its latency,
```bash
./deploy_test.py -d -m model_ckpt_dense
./deploy_test_pruned.py -d -m model_ckpt_sparse_retrained
```
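Both deploy scripts append each measured latency to `performance_ref.log`, one value per line. A small sketch for averaging such a log is shown below. `mean_latency` is a hypothetical helper, and the sample values are made up for illustration; they are not measurements from this repository.

```python
def mean_latency(log_lines):
    """Average the latencies (seconds) found in an iterable of log lines,
    skipping blank lines."""
    samples = [float(line) for line in log_lines if line.strip()]
    if not samples:
        raise ValueError("no latency samples found")
    return sum(samples) / len(samples)

# Made-up sample lines in the format written by papl.log()
example_lines = ["0.0112\n", "0.0109\n", "0.0115\n", "0.0111\n", "0.0113\n"]
avg = mean_latency(example_lines)
print("%.4f s" % avg)

# With a real log file:
# with open("performance_ref.log") as f:
#     print(mean_latency(f))
```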
To test the dense models,
```bash
./deploy_test.py -t -m model_ckpt_dense
./deploy_test.py -t -m model_ckpt_dense_pruned
./deploy_test.py -t -m model_ckpt_dense_retrained
```
To draw a histogram that shows the weight distribution,
```bash
# After running train.py (it generates .dat files)
./draw_histogram.py
```
## Performance
Results are currently somewhat mediocre or degraded, due to the indirection and additional storage overhead of the sparse matrix form.
It may also be because the model is too small (12.49MB).
#### Storage overhead
Baseline: 12.49 MB<br>
10 % pruned: 21.86 MB<br>
20 % pruned: 19.45 MB<br>
30 % pruned: 17.05 MB<br>
40 % pruned: 14.64 MB<br>
50 % pruned: 12.23 MB<br>
60 % pruned: 9.83 MB<br>
70 % pruned: 7.42 MB<br>
80 % pruned: 5.02 MB<br>
90 % pruned: 2.61 MB<br>
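As a sanity check on the baseline figure, the parameter count of the dense model can be computed from the variable shapes declared in `train.py`. The sketch assumes 4 bytes per `float32` parameter and reads "MB" as 2^20 bytes. It does not attempt to reproduce the pruned sizes, which also depend on how the indices are stored.

```python
# Variable shapes as declared in train.py
layer_shapes = {
    "w_conv1": (5, 5, 1, 32),  "b_conv1": (32,),
    "w_conv2": (5, 5, 32, 64), "b_conv2": (64,),
    "w_fc1": (7 * 7 * 64, 1024), "b_fc1": (1024,),
    "w_fc2": (1024, 10),       "b_fc2": (10,),
}

def num_elements(shape):
    """Product of all dimensions in shape."""
    n = 1
    for dim in shape:
        n *= dim
    return n

total_params = sum(num_elements(s) for s in layer_shapes.values())
size_mib = total_params * 4 / 2.0 ** 20  # 4 bytes per float32

print(total_params)              # 3274634
print("%.2f MiB" % size_mib)     # 12.49 MiB
```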
#### CPU performance (5 times averaged)
CPU: Intel Core i5-2500 @ 3.3 GHz,
LLC size: 6 MB
<img src=http://younghwanoh.github.io/images/cpu-desktop.png alt=http://younghwanoh.github.io/images/cpu-desktop.png>
Baseline: 0.01118040085 s<br>
10 % pruned: 1.919299984 s<br>
20 % pruned: 0.2325239658 s<br>
30 % pruned: 0.2111079693 s<br>
40 % pruned: 0.1982570648 s<br>
50 % pruned: 0.1691776752 s<br>
60 % pruned: 0.1305227757 s<br>
70 % pruned: 0.116039753 s<br>
80 % pruned: 0.103564167 s<br>
90 % pruned: 0.1058168888 s<br>
#### GPU performance (5 times averaged)
GPU: Nvidia Geforce GTX650 @ 1.058 GHz,
LLC size: 256 KB
<img src=http://younghwanoh.github.io/images/gpu-desktop.png alt=http://younghwanoh.github.io/images/gpu-desktop.png>
Baseline: 0.1475181845 s<br>
10 % pruned: 0.2954540253 s<br>
20 % pruned: 0.2665398121 s<br>
30 % pruned: 0.2585638046 s<br>
40 % pruned: 0.2090051651 s<br>
50 % pruned: 0.1995279789 s<br>
60 % pruned: 0.1815193653 s<br>
70 % pruned: 0.1436806202 s<br>
80 % pruned: 0.135668993 s<br>
90 % pruned: 0.1218701839 s<br>
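To read the two tables side by side, the ratio of the 90 % pruned latency to the baseline latency can be computed directly from the numbers above. `latency_ratio` is a hypothetical helper; values above 1 mean the pruned model is slower.

```python
# Latencies in seconds, copied from the tables above
cpu = {"baseline": 0.01118040085, "pruned_90": 0.1058168888}
gpu = {"baseline": 0.1475181845, "pruned_90": 0.1218701839}

def latency_ratio(results):
    """Pruned latency divided by baseline latency (>1 means slower)."""
    return results["pruned_90"] / results["baseline"]

cpu_ratio = latency_ratio(cpu)
gpu_ratio = latency_ratio(gpu)
print("CPU: %.2fx of baseline" % cpu_ratio)
print("GPU: %.2fx of baseline" % gpu_ratio)
```

On these measurements the sparse model is roughly nine times slower than the dense baseline on the CPU, and somewhat faster than the baseline on the GPU.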
================================================
FILE: config.py
================================================
#!/usr/bin/python
import thspace as ths
def _complex_concat(a, b):
    tmp = []
    for i in a:
        for j in b:
            tmp.append(i+j)
    return tmp
def _add_prefix(a):
    tmp = []
    for idx, val in enumerate(a):
        tmp.append("w_" + val)
        # tmp.append("b_" + val)
    return tmp
# Pruning threshold setting (90 % off)
th = ths.th90
# CNN settings for pruned training
target_layer = ["fc1", "fc2"]
retrain_iterations = 10
# Output data lists: do not change this
target_all_layer = _add_prefix(target_layer)
target_dat = _complex_concat(target_all_layer, [".dat"])
target_p_dat = _complex_concat(target_all_layer, ["_p.dat"])
target_tp_dat = _complex_concat(target_all_layer, ["_tp.dat"])
weight_all = target_dat + target_p_dat + target_tp_dat
syn_all = ["in_conv1.syn", "in_conv2.syn", "in_fc1.syn", "in_fc2.syn"]
# Data settings
show_zero = False
# Graph settings
alpha = 0.75
color = "green"
pdf_prefix = ""
================================================
FILE: deploy_test.py
================================================
#!/usr/bin/python
import sys
sys.dont_write_bytecode = True
import tensorflow as tf
import numpy as np
import argparse
import papl
import config
argparser = argparse.ArgumentParser()
argparser.add_argument("-t", "--test", action="store_true", help="Run test")
argparser.add_argument("-d", "--deploy", action="store_true", help="Run deploy with seven.png")
argparser.add_argument("-s", "--print_syn", action="store_true", help="Print synapses to .syn")
argparser.add_argument("-m", "--model", default="./model_ckpt_dense", help="Specify a target model file")
args = argparser.parse_args()
if (args.test or args.deploy or args.print_syn) == True:
    from tensorflow.examples.tutorials.mnist import input_data
    mnist = input_data.read_data_sets('/tmp/data/', one_hot=True)
else:
    argparser.print_help()
    sys.exit()
# sess = tf.InteractiveSession(config=tf.ConfigProto(gpu_options=tf.GPUOptions(allow_growth=True)))
sess = tf.InteractiveSession()
# sess = tf.Session()
def imgread(path):
    tmp = papl.imread(path)
    img = np.zeros((28,28,1))
    img[:,:,0]=tmp[:,:,0]
    img = np.reshape(img, img.size)
    return img
# Restore values of variables
saver = tf.train.import_meta_graph(args.model+'.meta')
saver.restore(sess, args.model)
# Calc results
if args.test == True:
    # Evaluate test sets
    import time
    accuracy = tf.get_collection("accuracy")[0]
    # To avoid OOM, run validation with 500/10000 test dataset
    b = time.time()
    result = 0
    for i in range(20):
        batch = mnist.test.next_batch(500)
        result += sess.run(accuracy, feed_dict={"x:0": batch[0],
                                                "y_:0": batch[1],
                                                "keep_prob:0": 1.0})
    result /= 20
    a = time.time()
    print("Test accuracy %g" % result)
    print "Time: %s s" % (a-b)
elif args.deploy == True:
    # Infer a single image & check its latency
    import time
    img = imgread('seven.png')
    y_conv = tf.get_collection("y_conv")[0]
    b = time.time()
    result = sess.run(tf.argmax(y_conv,1), feed_dict={"x:0":[img],
                                                      "y_:0":mnist.test.labels,
                                                      "keep_prob:0": 1.0})
    a = time.time()
    print "Output: %s" % result
    print "Time: %s s" % (a-b)
    papl.log("performance_ref.log", a-b)
elif args.print_syn == True:
    # Print synapses (Input data of each neuron)
    img = imgread('seven.png')
    target_syn = config.syn_all
    synapses = [ tf.get_collection(elem.split(".")[0])[0] for elem in target_syn ]
    for i,j in zip(synapses, config.syn_all):
        syn = sess.run(i, feed_dict={"x:0":[img],
                                     "y_:0":mnist.test.labels,
                                     "keep_prob:0": 1.0})
        papl.print_synapse_nps(syn, j)
    print "Done! Synapse data is printed to x.syn"
================================================
FILE: deploy_test_pruned.py
================================================
#!/usr/bin/python
import sys
sys.dont_write_bytecode = True
import tensorflow as tf
import numpy as np
import argparse
import config
import papl
argparser = argparse.ArgumentParser()
argparser.add_argument("-t", "--test", action="store_true", help="Run test")
argparser.add_argument("-d", "--deploy", action="store_true", help="Run deploy with seven.png")
argparser.add_argument("-m", "--model", default="./model_ckpt_sparse_retrained", help="Specify a target model file")
args = argparser.parse_args()
if (args.test) == True:
    print "Error: TensorFlow 0.8 doesn't support broadcasts on sparse operations, cannot run test set now"
    sys.exit()
elif (args.deploy) == True:
    from tensorflow.examples.tutorials.mnist import input_data
    mnist = input_data.read_data_sets('/tmp/data/', one_hot=True)
else:
    argparser.print_help()
    sys.exit()
# sess = tf.InteractiveSession(config=tf.ConfigProto(gpu_options=tf.GPUOptions(allow_growth=True)))
sess = tf.InteractiveSession(config=tf.ConfigProto(device_count={'GPU':0}))
# sess = tf.Session()
def imgread(path):
    tmp = papl.imread(path)
    img = np.zeros((28,28,1))
    img[:,:,0]=tmp[:,:,0]
    return img
# Declare weight variables
sparse_w={
"w_conv1": tf.Variable(tf.truncated_normal([5, 5, 1, 32], stddev=0.1), name="w_conv1"),
"b_conv1": tf.Variable(tf.constant(0.1, shape=[32]), name="b_conv1"),
"w_conv2": tf.Variable(tf.truncated_normal([5, 5, 32, 64], stddev=0.1), name="w_conv2"),
"b_conv2": tf.Variable(tf.constant(0.1, shape=[64]), name="b_conv2"),
"w_fc1": tf.Variable(tf.zeros([config.th["fc1_nnz"]], dtype=tf.float32),name="w_fc1"),
"w_fc1_idx": tf.Variable(tf.zeros([config.th["fc1_nnz"],2],dtype=tf.int32), name="w_fc1_idx"),
"w_fc1_shape":tf.Variable(tf.zeros([2], dtype=tf.int32), name="w_fc1_shape"),
"b_fc1": tf.Variable(tf.zeros([1024], dtype=tf.float32), name="b_fc1"),
"w_fc2": tf.Variable(tf.zeros([config.th["fc2_nnz"]], dtype=tf.float32),name="w_fc2"),
"w_fc2_idx": tf.Variable(tf.zeros([config.th["fc2_nnz"],2],dtype=tf.int32), name="w_fc2_idx"),
"w_fc2_shape":tf.Variable(tf.zeros([2], dtype=tf.int32), name="w_fc2_shape"),
"b_fc2": tf.Variable(tf.zeros([10], dtype=tf.float32), name="b_fc2"),
}
def sparse_cnn_model(weights):
    def conv2d(x, W):
        return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
    def max_pool_2x2(x):
        return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                              strides=[1, 2, 2, 1], padding='SAME')
    h_conv1 = tf.nn.relu(conv2d(x_image, weights["w_conv1"]) + weights["b_conv1"])
    h_pool1 = max_pool_2x2(h_conv1)
    h_conv2 = tf.nn.relu(conv2d(h_pool1, weights["w_conv2"]) + weights["b_conv2"])
    h_pool2 = max_pool_2x2(h_conv2)
    h_pool2_flat = tf.squeeze(tf.reshape(h_pool2, [-1, 7*7*64]))
    h_fc1 = tf.nn.relu(tf.nn.embedding_lookup_sparse(h_pool2_flat, weights["w_fc1_ids"], weights["w_fc1"], combiner="sum") + weights["b_fc1"])
    h_fc1_drop = tf.squeeze(tf.nn.dropout(h_fc1, keep_prob))
    y_conv = tf.nn.relu(tf.nn.embedding_lookup_sparse(h_fc1_drop, weights["w_fc2_ids"], weights["w_fc2"], combiner="sum") + weights["b_fc2"])
    y_conv = tf.nn.softmax(tf.reshape(y_conv, [1, -1]))
    return y_conv
# Restore values of variables
saver = tf.train.Saver()
saver.restore(sess, args.model)
# Retrieve SparseTensor from serialized dense variables
sparse_w["w_fc1"] = tf.SparseTensor(sparse_w["w_fc1_idx"].eval(),
sparse_w["w_fc1"].eval(),
sparse_w["w_fc1_shape"].eval())
sparse_w["w_fc2"] = tf.SparseTensor(sparse_w["w_fc2_idx"].eval(),
sparse_w["w_fc2"].eval(),
sparse_w["w_fc2_shape"].eval())
sparse_w["w_fc1_ids"] = tf.SparseTensor(sparse_w["w_fc1_idx"].eval(),
sparse_w["w_fc1_idx"].eval()[:,1],
sparse_w["w_fc1_shape"].eval())
sparse_w["w_fc2_ids"] = tf.SparseTensor(sparse_w["w_fc2_idx"].eval(),
sparse_w["w_fc2_idx"].eval()[:,1],
sparse_w["w_fc2_shape"].eval())
# Construct a sparse model with retrieved variables
if args.test == True:
    x = tf.placeholder("float", shape=[None, 784])
    x_image = tf.reshape(x, [-1,28,28,1])
elif args.deploy == True:
    img = imgread("./seven.png")
    x = tf.placeholder("float", shape=[None, 28, 28, 1])
    x_image = x
y_ = tf.placeholder("float", shape=[None, 10])
keep_prob = tf.placeholder("float")
y_conv = sparse_cnn_model(sparse_w)
# Calc results
if args.test == True:
    # Evaluate test sets
    import time
    correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    # To avoid OOM, run validation with 500/10000 test dataset
    b = time.time()
    result = 0
    for i in range(20):
        batch = mnist.test.next_batch(500)
        result += accuracy.eval(feed_dict={x: batch[0],
                                           y_: batch[1],
                                           keep_prob: 1.0})
    result /= 20
    a = time.time()
    print("test accuracy %g" % result)
    print "time: %s s" % (a-b)
elif args.deploy == True:
    # Infer a single image & check its latency
    import time
    b = time.time()
    result = sess.run(tf.argmax(y_conv,1), feed_dict={x:[img], y_:mnist.test.labels, keep_prob: 1.0})
    a = time.time()
    print "output: %s" % result
    print "time: %s s" % (a-b)
    papl.log("performance_ref.log", a-b)
================================================
FILE: draw_histogram.py
================================================
#!/usr/bin/python
import sys
sys.dont_write_bytecode = True
import papl
import config
papl.draw_histogram(config.weight_all, step=0.01)
papl.draw_histogram(config.syn_all, step=1)
================================================
FILE: model_ckpt_dense
================================================
[File too large to display: 12.5 MB]
================================================
FILE: model_ckpt_dense_pruned
================================================
[File too large to display: 12.5 MB]
================================================
FILE: papl.py
================================================
import csv
import matplotlib
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter
from matplotlib.backends.backend_pdf import PdfPages
import numpy as np
import sys
sys.dont_write_bytecode = True
import config
# =====================================================================================
# Private methods
# =====================================================================================
def _saveToPdf(output):
    pp = PdfPages(output)
    plt.savefig(pp, format='pdf')
    pp.close()
    plt.close()
# Manipulate y-axis of histogram
def _to_percent(y, position):
    # tick locations calculated from fraction (global).
    s = str(y*100)
    # The percent symbol needs escaping in latex
    if matplotlib.rcParams['text.usetex'] == True:
        return s + r'$\%$'
    else:
        return s + '%'
# Calc min, max position of each histogram bin
def _minRuler(array):
    minimum = min(array)
    print " - min: ", minimum
    offset = minimum % step
    return minimum - offset
def _maxRuler(array):
    maximum = max(array)
    print " - max: ", maximum
    offset = maximum % step
    return maximum - offset + step
# =====================================================================================
# Start main methods (tools)
# =====================================================================================
# Input: x.dat from global variables (config) or arguments
# Output: histogram. x.pdf
# Histogram settings are configurable through config.py
def draw_histogram(*target, **kwargs):
    if len(target) == 1:
        target = target[0]
        assert type(target) == list
        file_list = target
    else:
        file_list = config.weight_all
    global step
    step = kwargs["step"]
    for target in file_list:
        print "Target: ", target
        try:
            with open(config.pdf_prefix+"%s" % target) as text:
                x = np.float32(text.read().rstrip("\n").split("\n"))
            # norm = np.ones_like(x) / float(len(x))
            norm = np.ones_like(x)
            binspace = np.arange(_minRuler(x), _maxRuler(x), step)
            n, bins, patches = plt.hist(x, bins=binspace, weights=norm,
                                        alpha=config.alpha, facecolor=config.color)
            # formatter = FuncFormatter(_to_percent)
            # plt.gca().yaxis.set_major_formatter(formatter)
            plt.grid(True)
            _saveToPdf(config.pdf_prefix+"%s.pdf" % target.split(".")[0])
        except IOError as e:
            print "Warning: I/O error({0}) - {1}".format(e.errno, e.strerror)
            pass
        except:
            print "Unexpected error:", sys.exc_info()[0]
            raise
    print "Graphs are drawn!"
# Input: model object list, Output: human-readable form of model as x.dat
def print_weight_vars(obj_dict, weight_obj_list, fname_list, show_zero=False):
    for elem, fname in zip(weight_obj_list, fname_list):
        weight_arr = obj_dict[elem].eval()
        ndim = weight_arr.size
        flat_weight_space = weight_arr.reshape(ndim)
        with open(fname, "w") as filelog:
            if show_zero == False:
                flat_weight_space = flat_weight_space[flat_weight_space != 0]
            writeLine = csv.writer(filelog, delimiter='\n')
            writeLine.writerow(flat_weight_space)
# Input: synapse, Output: human-readable form of model as x.syn
def print_synapse_nps(syn_arr, fname, show_zero=False):
    ndim = syn_arr.size
    flat_syn_space = syn_arr.reshape(ndim)
    with open(fname, "w") as filelog:
        if show_zero == False:
            flat_syn_space = flat_syn_space[flat_syn_space != 0]
        writeLine = csv.writer(filelog, delimiter='\n')
        writeLine.writerow(flat_syn_space)
# Input: sparse model object list, Output: human-readable form of model as x.dat
def print_sparse_weight_vars(obj_dict, weight_obj_list, fname_list):
    for elem, fname in zip(weight_obj_list, fname_list):
        weight_arr = obj_dict[elem].eval().values
        ndim = weight_arr.size
        flat_weight_space = weight_arr.reshape(ndim)
        with open(fname, "w") as filelog:
            writeLine = csv.writer(filelog, delimiter='\n')
            writeLine.writerow(flat_weight_space)
# Input: n-d dense array, Output: pruned array with threshold
def prune_dense(weight_arr, name="None", thresh=0.005, **kwargs):
    """Apply weight pruning with threshold """
    under_threshold = abs(weight_arr) < thresh
    weight_arr[under_threshold] = 0
    count = np.sum(under_threshold)
    print "Non-zero count (%s): %s" % (name, weight_arr.size - count)
    # Return the keep-mask (True where a weight survived). Use ~ because
    # unary minus on boolean arrays is rejected by newer NumPy releases.
    return weight_arr, ~under_threshold, count
# Input: anonymous dimension array and its pruning threshold,
# Output: indices - index list of non-zero elements
# values - value list of non-zero elements
# shape - original shape of matrix
def prune_tf_sparse(weight_arr, name="None", thresh=0.005):
    assert isinstance(weight_arr, np.ndarray)
    under_threshold = abs(weight_arr) < thresh
    weight_arr[under_threshold] = 0
    values = weight_arr[weight_arr != 0]
    indices = np.transpose(np.nonzero(weight_arr))
    shape = list(weight_arr.shape)
    count = np.sum(under_threshold)
    print "Non-zero count (Sparse %s): %s" % (name, weight_arr.size - count)
    return [indices, values, shape]
# Input: file name and text, Output: log file
def log(fname, log):
    with open(fname, "a") as wobj:
        wobj.write(str(log)+"\n")
# Input: Path to target image, Output: ndarray resized to fixed (28,28)
def imread(path):
    import numpy as np
    # "import Image" only works with the legacy PIL layout; this form also works with Pillow
    from PIL import Image
    return np.array(Image.open(path).resize((28,28), resample=2))
================================================
FILE: read_model.py
================================================
#!/usr/bin/python
import sys
sys.dont_write_bytecode = True
import tensorflow as tf
import papl
import argparse
argparser = argparse.ArgumentParser()
argparser.add_argument("-m", "--model", required=True, help="Specify serialized input model")
argparser.add_argument("-r", "--ratio", help="Specify ratio")
args = argparser.parse_args()
def read_model_obj_with_sorted_ratio(fname, ratio):
    saver = tf.train.Saver()
    saver.restore(sess, fname)
    print str(ratio*100)+" %"
    target_obj_list = [weights[elem] for elem in papl.config.target_all_layer]
    for elem in target_obj_list:
        arr = elem.eval()
        arr = list(arr.reshape(arr.size))
        # Sort by magnitude (same ordering as the old cmp-based sort)
        arr.sort(key=abs)
        print "\""+elem.name[:-2]+"\": ", abs(arr[int(len(arr)*ratio)-1]), ","
def print_raw_matrix(fname):
    saver = tf.train.Saver()
    saver.restore(sess, fname)
    import numpy as np
    np.save("w_fc1.raw", weights["w_fc1"].eval())
    np.save("w_fc2.raw", weights["w_fc2"].eval())
def read_model_obj(fname):
    saver = tf.train.Saver()
    import os.path
    try:
        assert os.path.isfile(fname)
        saver.restore(sess, fname)
        switcher = {
            "model_ckpt_dense": papl.config.target_dat,
            "model_ckpt_dense_pruned": papl.config.target_p_dat,
            "model_ckpt_dense_retrained": papl.config.target_tp_dat
        }
        papl.print_weight_vars(weights, papl.config.target_all_layer, switcher.get(args.model))
    except AssertionError:
        print "Warning: No such files or directory\n"
        pass
    except:
        import sys
        print "Unexpected error:", sys.exc_info()[0]
weights = {
"w_conv1": tf.Variable(tf.truncated_normal([5, 5, 1, 32], stddev=0.1), name="w_conv1"),
"b_conv1": tf.Variable(tf.constant(0.1, shape=[32]), name="b_conv1"),
"w_conv2": tf.Variable(tf.truncated_normal([5, 5, 32, 64], stddev=0.1), name="w_conv2"),
"b_conv2": tf.Variable(tf.constant(0.1, shape=[64]), name="b_conv2"),
"w_fc1": tf.Variable(tf.truncated_normal([7*7*64, 1024], stddev=0.1), name="w_fc1"),
"b_fc1": tf.Variable(tf.constant(0.1, shape=[1024]), name="b_fc1"),
"w_fc2": tf.Variable(tf.truncated_normal([1024, 10], stddev=0.1), name="w_fc2"),
"b_fc2": tf.Variable(tf.constant(0.1, shape=[10]), name="b_fc2")
}
sess = tf.InteractiveSession()
if __name__ == "__main__":
    if bool(args.ratio) == False:
        read_model_obj(args.model)
    else:
        read_model_obj_with_sorted_ratio(args.model, float(args.ratio))
    # print_raw_matrix(args.model)
================================================
FILE: sparse_model_extreme/model_ckpt_dense_pruned
================================================
[File too large to display: 12.5 MB]
================================================
FILE: sparse_model_extreme/model_ckpt_dense_retrained
================================================
[File too large to display: 12.5 MB]
================================================
FILE: thspace.py
================================================
th10 = {
"fc1_nnz": 2890138 ,
"fc2_nnz": 9216 ,
"w_conv1": 0.0143598 ,
"b_conv1": 0.0639633 ,
"w_conv2": 0.0122113 ,
"b_conv2": 0.0764835 ,
"w_fc1": 0.0121145 ,
"b_fc1": 0.0907546 ,
"w_fc2": 0.0132132 ,
"b_fc2": 0.0911222
}
th20 = {
"fc1_nnz": 2569011 ,
"fc2_nnz": 8192 ,
"w_conv1": 0.0303244 ,
"b_conv1": 0.0703646 ,
"w_conv2": 0.0243528 ,
"b_conv2": 0.0805169 ,
"w_fc1": 0.024393 ,
"b_fc1": 0.0929042 ,
"w_fc2": 0.0266463 ,
"b_fc2": 0.0943206
}
th30 = {
"fc1_nnz": 2247885 ,
"fc2_nnz": 7168 ,
"w_conv1": 0.0473049 ,
"b_conv1": 0.075084 ,
"w_conv2": 0.0371282 ,
"b_conv2": 0.0821582 ,
"w_fc1": 0.0370279 ,
"b_fc1": 0.0944582 ,
"w_fc2": 0.0407012 ,
"b_fc2": 0.0944
}
th40 = {
"fc1_nnz": 1926757 ,
"fc2_nnz": 6143 ,
"w_conv1": 0.0619981 ,
"b_conv1": 0.0783646 ,
"w_conv2": 0.0506691 ,
"b_conv2": 0.0849416 ,
"w_fc1": 0.0503098 ,
"b_fc1": 0.0957049 ,
"w_fc2": 0.0552152 ,
"b_fc2": 0.0960752
}
th50 = {
"fc1_nnz": 1605631 ,
"fc2_nnz": 5120 ,
"w_conv1": 0.0762394 ,
"b_conv1": 0.0791745 ,
"w_conv2": 0.0650136 ,
"b_conv2": 0.0858885 ,
"w_fc1": 0.0645222 ,
"b_fc1": 0.0967964 ,
"w_fc2": 0.0705915 ,
"b_fc2": 0.0978322
}
th60 = {
"fc1_nnz": 1284506 ,
"fc2_nnz": 4095 ,
"w_conv1": 0.0936658 ,
"b_conv1": 0.0817409 ,
"w_conv2": 0.0805099 ,
"b_conv2": 0.0873334 ,
"w_fc1": 0.0801966 ,
"b_fc1": 0.0979769 ,
"w_fc2": 0.0870296 ,
"b_fc2": 0.0996566
}
th70 = {
"fc1_nnz": 963379 ,
"fc2_nnz": 3071 ,
"w_conv1": 0.110689 ,
"b_conv1": 0.0830755 ,
"w_conv2": 0.0988535 ,
"b_conv2": 0.088152 ,
"w_fc1": 0.0979934 ,
"b_fc1": 0.0991785 ,
"w_fc2": 0.105566 ,
"b_fc2": 0.100518
}
th80 = {
"fc1_nnz": 642247 ,
"fc2_nnz": 2048 ,
"w_conv1": 0.130691 ,
"b_conv1": 0.0912763 ,
"w_conv2": 0.120732 ,
"b_conv2": 0.0898623 ,
"w_fc1": 0.119522 ,
"b_fc1": 0.100626 ,
"w_fc2": 0.126458 ,
"b_fc2": 0.112008
}
th90 = {
"fc1_nnz": 320939 ,
"fc2_nnz": 1014 ,
"w_conv1": 0.162963 ,
"b_conv1": 0.0956728 ,
"w_conv2": 0.150202 ,
"b_conv2": 0.0928398 ,
"w_fc1": 0.148615 ,
"b_fc1": 0.102556 ,
"w_fc2": 0.15566 ,
"b_fc2": 0.112008
}
th95 = {
"fc1_nnz": 160566 ,
"fc2_nnz": 513 ,
"w_fc1": 0.169592 ,
"b_fc1": 0.103936 ,
"w_fc2": 0.17703 ,
"b_fc2": 0.112008
}
th99 = {
"fc1_nnz": 32111 ,
"fc2_nnz": 103 ,
"w_fc1": 0.1975 ,
"b_fc1": 0.10679 ,
"w_fc2": 0.21004 ,
"b_fc2": 0.112008
}
================================================
FILE: train.py
================================================
#!/usr/bin/python
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import sys
sys.dont_write_bytecode = True
import tensorflow as tf
import numpy as np
import argparse
import papl
import scipy.sparse as sp
argparser = argparse.ArgumentParser()
argparser.add_argument("-1", "--first_round", action="store_true",
help="Run 1st-round: train with 20000 iterations")
argparser.add_argument("-2", "--second_round", action="store_true",
help="Run 2nd-round: apply pruning and its additional training")
argparser.add_argument("-3", "--third_round", action="store_true",
help="Run 3rd-round: transform model to a sparse format and save it")
argparser.add_argument("-m", "--checkpoint", default="./model_ckpt_dense",
help="Target checkpoint model file for 2nd and 3rd round")
args = argparser.parse_args()
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets('/tmp/data/', one_hot=True)
if (args.first_round or args.second_round or args.third_round) == False:
argparser.print_help()
sys.exit()
sess = tf.InteractiveSession()
def apply_prune(weights):
    total_fc_byte = 0
    total_fc_csr_byte = 0
    total_nnz_elem = 0
    total_origin_elem = 0
    dict_nzidx = {}
    for target in papl.config.target_layer:
        wl = "w_" + target
        print(wl + " threshold:\t" + str(papl.config.th[wl]))
        # Get target layer's weights
        weight_obj = weights[wl]
        weight_arr = weight_obj.eval()
        # Apply pruning
        weight_arr, w_nzidx, w_nnz = papl.prune_dense(weight_arr, name=wl,
                                                      thresh=papl.config.th[wl])
        # Store pruned weights as tensorflow objects
        dict_nzidx[wl] = w_nzidx
        sess.run(weight_obj.assign(weight_arr))
    return dict_nzidx
def apply_prune_on_grads(grads_and_vars, dict_nzidx):
    # Mask gradients with pruned elements
    for key, nzidx in dict_nzidx.items():
        count = 0
        for grad, var in grads_and_vars:
            if var.name == key+":0":
                nzidx_obj = tf.cast(tf.constant(nzidx), tf.float32)
                grads_and_vars[count] = (tf.mul(nzidx_obj, grad), var)
            count += 1
    return grads_and_vars
def gen_sparse_dict(dense_w):
    sparse_w = dense_w
    for target in papl.config.target_all_layer:
        target_arr = np.transpose(dense_w[target].eval())
        sparse_arr = papl.prune_tf_sparse(target_arr, name=target)
        sparse_w[target+"_idx"]=tf.Variable(tf.constant(sparse_arr[0],dtype=tf.int32),
                                            name=target+"_idx")
        sparse_w[target]=tf.Variable(tf.constant(sparse_arr[1],dtype=tf.float32),
                                     name=target)
        sparse_w[target+"_shape"]=tf.Variable(tf.constant(sparse_arr[2],dtype=tf.int32),
                                              name=target+"_shape")
    return sparse_w
dense_w={
"w_conv1": tf.Variable(tf.truncated_normal([5,5,1,32],stddev=0.1), name="w_conv1"),
"b_conv1": tf.Variable(tf.constant(0.1,shape=[32]), name="b_conv1"),
"w_conv2": tf.Variable(tf.truncated_normal([5,5,32,64],stddev=0.1), name="w_conv2"),
"b_conv2": tf.Variable(tf.constant(0.1,shape=[64]), name="b_conv2"),
"w_fc1": tf.Variable(tf.truncated_normal([7*7*64,1024],stddev=0.1), name="w_fc1"),
"b_fc1": tf.Variable(tf.constant(0.1,shape=[1024]), name="b_fc1"),
"w_fc2": tf.Variable(tf.truncated_normal([1024,10],stddev=0.1), name="w_fc2"),
"b_fc2": tf.Variable(tf.constant(0.1,shape=[10]), name="b_fc2")
}
def dense_cnn_model(weights):
    def conv2d(x, W):
        return tf.nn.conv2d(x, W, strides=[1, 1, 1, 1], padding='SAME')
    def max_pool_2x2(x):
        return tf.nn.max_pool(x, ksize=[1, 2, 2, 1],
                              strides=[1, 2, 2, 1], padding='SAME')
    x_image = tf.reshape(x, [-1,28,28,1])
    h_conv1 = tf.nn.relu(conv2d(x_image, weights["w_conv1"]) + weights["b_conv1"])
    tf.add_to_collection("in_conv1", x_image)
    h_pool1 = max_pool_2x2(h_conv1)
    tf.add_to_collection("in_conv2", h_pool1)
    h_conv2 = tf.nn.relu(conv2d(h_pool1, weights["w_conv2"]) + weights["b_conv2"])
    h_pool2 = max_pool_2x2(h_conv2)
    h_pool2_flat = tf.reshape(h_pool2, [-1, 7*7*64])
    tf.add_to_collection("in_fc1", h_pool2_flat)
    h_fc1 = tf.nn.relu(tf.matmul(h_pool2_flat, weights["w_fc1"]) + weights["b_fc1"])
    h_fc1_drop = tf.nn.dropout(h_fc1, keep_prob)
    tf.add_to_collection("in_fc2", h_fc1_drop)
    y_conv=tf.nn.softmax(tf.matmul(h_fc1_drop, weights["w_fc2"]) + weights["b_fc2"])
    return y_conv
def test(y_infer, message="None."):
    correct_prediction = tf.equal(tf.argmax(y_infer,1), tf.argmax(y_,1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    # To avoid OOM, run validation with 500/10000 test dataset
    result = 0
    for i in range(20):
        batch = mnist.test.next_batch(500)
        result += accuracy.eval(feed_dict={x: batch[0],
                                           y_: batch[1],
                                           keep_prob: 1.0})
    result /= 20
    print(message+" %g\n" % result)
    return result
def check_file_exists(key):
    import os
    fileList = os.listdir(".")
    count = 0
    for elem in fileList:
        if elem.find(key) >= 0:
            count += 1
    return key + ("-"+str(count) if count>0 else "")
# Construct a dense model
x = tf.placeholder("float", shape=[None, 784], name="x")
y_ = tf.placeholder("float", shape=[None, 10], name="y_")
keep_prob = tf.placeholder("float", name="keep_prob")
y_conv = dense_cnn_model(dense_w)
tf.add_to_collection("y_conv", y_conv)
saver = tf.train.Saver()
if args.first_round == True:
    # First round: Train baseline dense model
    cross_entropy = -tf.reduce_sum(y_*tf.log(tf.clip_by_value(y_conv,1e-10,1.0)))
    train_step = tf.train.AdamOptimizer(1e-4).minimize(cross_entropy)
    correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    tf.add_to_collection("accuracy", accuracy)
    sess.run(tf.initialize_all_variables())
    for i in range(20000):
        batch = mnist.train.next_batch(50)
        if i%100 == 0:
            train_accuracy = accuracy.eval(feed_dict={
                x:batch[0], y_: batch[1], keep_prob: 1.0})
            print("step %d, training accuracy %g"%(i, train_accuracy))
        train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    # Test
    score = test(y_conv, message="First-round prune-only test accuracy")
    papl.log("baseline_accuracy.log", score)
    # Save model objects to readable format
    papl.print_weight_vars(dense_w, papl.config.target_all_layer,
                           papl.config.target_dat, show_zero=papl.config.show_zero)
    # Save model objects to serialized format
    saver.save(sess, "./model_ckpt_dense")
if args.second_round == True:
    # Second round: Retrain pruned model, start with default model: model_ckpt_dense
    saver.restore(sess, args.checkpoint)
    # Apply pruning on this context
    dict_nzidx = apply_prune(dense_w)
    # Save model objects to readable format
    papl.print_weight_vars(dense_w, papl.config.target_all_layer,
                           papl.config.target_p_dat, show_zero=papl.config.show_zero)
    # Test prune-only networks
    score = test(y_conv, message="Second-round prune-only test accuracy")
    papl.log("prune_accuracy.log", score)
    # Save model objects to serialized format
    saver.save(sess, "./model_ckpt_dense_pruned")
    # Retrain networks
    cross_entropy = -tf.reduce_sum(y_*tf.log(tf.clip_by_value(y_conv,1e-10,1.0)))
    trainer = tf.train.AdamOptimizer(1e-4)
    grads_and_vars = trainer.compute_gradients(cross_entropy)
    grads_and_vars = apply_prune_on_grads(grads_and_vars, dict_nzidx)
    train_step = trainer.apply_gradients(grads_and_vars)
    correct_prediction = tf.equal(tf.argmax(y_conv,1), tf.argmax(y_,1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, "float"))
    # Initialize variables touched for the first time (mostly from accuracy calc.)
    for var in tf.all_variables():
        if tf.is_variable_initialized(var).eval() == False:
            sess.run(tf.initialize_variables([var]))
    # Retrain for config.retrain_iterations additional steps
    for i in range(papl.config.retrain_iterations):
        batch = mnist.train.next_batch(50)
        if i%100 == 0:
            train_accuracy = accuracy.eval(feed_dict={
                x:batch[0], y_: batch[1], keep_prob: 1.0})
            print("step %d, training accuracy %g"%(i, train_accuracy))
        train_step.run(feed_dict={x: batch[0], y_: batch[1], keep_prob: 0.5})
    # Save retrained variables in a dense form
    # key = check_file_exists("model_ckpt_dense_retrained")
    # saver.save(sess, key)
    saver.save(sess, "model_ckpt_dense_retrained")
    # Test the retrained model
    score = test(y_conv, message="Second-round final test accuracy")
    papl.log("final_accuracy.log", score)
if args.third_round == True:
    # Third round: Transform iteratively pruned model to a sparse format and save it
    if args.second_round == False:
        saver.restore(sess, "./model_ckpt_dense_pruned")
    # Transform final weights to a sparse form
    sparse_w = gen_sparse_dict(dense_w)
    # Initialize new variables in a sparse form
    for var in tf.all_variables():
        if tf.is_variable_initialized(var).eval() == False:
            sess.run(tf.initialize_variables([var]))
    # Save model objects to readable format
    papl.print_weight_vars(dense_w, papl.config.target_all_layer,
                           papl.config.target_tp_dat, show_zero=papl.config.show_zero)
    # Save model objects to serialized format
    final_saver = tf.train.Saver(sparse_w)
    final_saver.save(sess, "./model_ckpt_sparse_retrained")
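
# ---------------------------------------------------------------------------
# Illustrative sketch (not part of the original script). The second and third
# rounds above can be pictured on a single weight matrix with plain NumPy.
# Every name below (sketch_prune, sketch_masked_step, sketch_to_sparse), the
# ">= thresh" rule and the plain-SGD update are assumptions made for
# illustration only. The script's real logic lives in apply_prune,
# apply_prune_on_grads and gen_sparse_dict, which work on TensorFlow
# variables and retrain with the Adam optimizer.
# ---------------------------------------------------------------------------
import numpy as np


def sketch_prune(weights, thresh):
    """Zero every weight whose magnitude is below thresh; return (pruned, mask)."""
    mask = np.abs(weights) >= thresh
    return weights * mask, mask


def sketch_masked_step(weights, grads, mask, lr=1e-4):
    """One gradient step in which pruned positions get no update and stay zero."""
    return weights - lr * (grads * mask)


def sketch_to_sparse(weights):
    """Return (indices, values, shape) of the non-zero entries, in row-major order."""
    nonzero = weights != 0
    return np.argwhere(nonzero), weights[nonzero], weights.shape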