Repository: lengstrom/fast-style-transfer
Branch: master
Commit: 0d3d981f7ab9
Files: 13
Total size: 39.9 KB
Directory structure:
gitextract_gzfrkqti/
├── .github/
│ └── FUNDING.yml
├── .gitignore
├── CITATION.cff
├── README.md
├── docs.md
├── evaluate.py
├── setup.sh
├── src/
│ ├── optimize.py
│ ├── transform.py
│ ├── utils.py
│ └── vgg.py
├── style.py
└── transform_video.py
================================================
FILE CONTENTS
================================================
================================================
FILE: .github/FUNDING.yml
================================================
# These are supported funding model platforms
github: [lengstrom] # Replace with up to 4 GitHub Sponsors-enabled usernames e.g., [user1, user2]
================================================
FILE: .gitignore
================================================
# Byte-compiled / optimized / DLL files
deps.txt
archive
saver
*~
styles
pngs
preds
*.sw*
data
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
env/
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
*.egg-info/
.installed.cfg
*.egg
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*,cover
.hypothesis/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
target/
# IPython Notebook
.ipynb_checkpoints
# pyenv
.python-version
# celery beat schedule file
celerybeat-schedule
# dotenv
.env
# virtualenv
venv/
ENV/
# Spyder project settings
.spyderproject
# Rope project settings
.ropeproject
# PyCharm
.idea
# checkpoint
checkpoint
================================================
FILE: CITATION.cff
================================================
# YAML 1.2
---
authors:
  -
    family-names: Engstrom
    given-names: Logan
cff-version: "1.1.0"
date-released: 2016-10-31
message: "If you use this software, please cite it using these metadata."
repository-code: "https://github.com/lengstrom/fast-style-transfer"
title: "Fast Style Transfer"
version: "1.0"
...
================================================
FILE: README.md
================================================
## Fast Style Transfer in [TensorFlow](https://github.com/tensorflow/tensorflow)
Add styles from famous paintings to any photo in a fraction of a second! [You can even style videos!](#video-stylization)
<p align = 'center'>
<img src = 'examples/style/udnie.jpg' height = '246px'>
<img src = 'examples/content/stata.jpg' height = '246px'>
<a href = 'examples/results/stata_udnie.jpg'><img src = 'examples/results/stata_udnie_header.jpg' width = '627px'></a>
</p>
<p align = 'center'>
It takes 100ms on a 2015 Titan X to style the MIT Stata Center (1024×680) like Udnie, by Francis Picabia.
</p>
Our implementation is based on a combination of Gatys' [A Neural Algorithm of Artistic Style](https://arxiv.org/abs/1508.06576), Johnson's [Perceptual Losses for Real-Time Style Transfer and Super-Resolution](http://cs.stanford.edu/people/jcjohns/eccv16/), and Ulyanov's [Instance Normalization](https://arxiv.org/abs/1607.08022).
### Sponsorship
Please consider sponsoring my work on this project!
### License
Copyright (c) 2016 Logan Engstrom. Contact me for commercial use (or rather any use that is not academic research) (email: engstrom at my university's domain dot edu). Free for research use, as long as proper attribution is given and this copyright notice is retained.
## Video Stylization
Here we transformed every frame in a video, then combined the results. [Click to go to the full demo on YouTube!](https://www.youtube.com/watch?v=xVJwwWQlQ1o) The style here is Udnie, as above.
<div align = 'center'>
<a href = 'https://www.youtube.com/watch?v=xVJwwWQlQ1o'>
<img src = 'examples/results/fox_udnie.gif' alt = 'Stylized fox video. Click to go to YouTube!' width = '800px' height = '400px'>
</a>
</div>
See how to generate these videos [here](#stylizing-video)!
## Image Stylization
We added styles from various paintings to a photo of Chicago. Click on thumbnails to see full applied style images.
<div align='center'>
<img src = 'examples/content/chicago.jpg' height="200px">
</div>
<div align = 'center'>
<a href = 'examples/style/wave.jpg'><img src = 'examples/thumbs/wave.jpg' height = '200px'></a>
<img src = 'examples/results/chicago_wave.jpg' height = '200px'>
<img src = 'examples/results/chicago_udnie.jpg' height = '200px'>
<a href = 'examples/style/udnie.jpg'><img src = 'examples/thumbs/udnie.jpg' height = '200px'></a>
<br>
<a href = 'examples/style/rain_princess.jpg'><img src = 'examples/thumbs/rain_princess.jpg' height = '200px'></a>
<img src = 'examples/results/chicago_rain_princess.jpg' height = '200px'>
<img src = 'examples/results/chicago_la_muse.jpg' height = '200px'>
<a href = 'examples/style/la_muse.jpg'><img src = 'examples/thumbs/la_muse.jpg' height = '200px'></a>
<br>
<a href = 'examples/style/the_shipwreck_of_the_minotaur.jpg'><img src = 'examples/thumbs/the_shipwreck_of_the_minotaur.jpg' height = '200px'></a>
<img src = 'examples/results/chicago_wreck.jpg' height = '200px'>
<img src = 'examples/results/chicago_the_scream.jpg' height = '200px'>
<a href = 'examples/style/the_scream.jpg'><img src = 'examples/thumbs/the_scream.jpg' height = '200px'></a>
</div>
## Implementation Details
Our implementation uses TensorFlow to train a fast style transfer network. We use roughly the same transformation network as described in Johnson, except that batch normalization is replaced with Ulyanov's instance normalization, and the scaling/offset of the output `tanh` layer is slightly different. We use a loss function close to the one described in Gatys, using VGG19 instead of VGG16 and typically using "shallower" layers than in Johnson's implementation (e.g. we use `relu1_1` rather than `relu1_2`). Empirically, this results in larger scale style features in transformations.
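
The two departures from Johnson's network mentioned above can be illustrated outside TensorFlow. The following is a minimal NumPy sketch, not the code the project runs: it mirrors the arithmetic of `_instance_norm` and the final `tanh` scaling in `src/transform.py`, but the function names `instance_norm` and `scale_output` and the toy input are assumptions made for illustration.

```python
import numpy as np


def instance_norm(x, scale=None, shift=None, epsilon=1e-3):
    """Normalize each (image, channel) slice over its spatial axes.

    x has shape (batch, rows, cols, channels), the NHWC layout used in
    src/transform.py. scale and shift stand in for the learned variables.
    """
    channels = x.shape[-1]
    scale = np.ones(channels) if scale is None else scale
    shift = np.zeros(channels) if shift is None else shift
    # Statistics are computed per image and per channel, never across the batch.
    mu = x.mean(axis=(1, 2), keepdims=True)
    sigma_sq = x.var(axis=(1, 2), keepdims=True)
    normalized = (x - mu) / np.sqrt(sigma_sq + epsilon)
    return scale * normalized + shift


def scale_output(pre_activation):
    """Map the last layer's output to pixel values: tanh(x) * 150 + 255/2."""
    return np.tanh(pre_activation) * 150 + 255.0 / 2


# Toy input (hypothetical data): two 8x8 "images" with four channels.
rng = np.random.default_rng(0)
batch = rng.normal(loc=5.0, scale=3.0, size=(2, 8, 8, 4))
normed = instance_norm(batch)
pixels = scale_output(np.array([-10.0, 0.0, 10.0]))
```

Because the scaled `tanh` spans roughly -22.5 to 277.5, outputs can fall outside the displayable range; `save_img` in `src/utils.py` clips to `[0, 255]` before writing.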
## Virtual Environment Setup (Anaconda) - Windows/Linux
Tested on
| Spec             | Value              |
|------------------|--------------------|
| Operating System | Windows 10 Home    |
| GPU              | Nvidia RTX 2080 Ti |
| CUDA Version     | 11.0               |
| Driver Version   | 445.75             |
### Step 1: Install Anaconda
https://docs.anaconda.com/anaconda/install/
### Step 2: Build a virtual environment
Run the following commands in sequence in Anaconda Prompt:
```
conda create -n tf-gpu tensorflow-gpu=2.1.0
conda activate tf-gpu
conda install jupyterlab
jupyter lab
```
Run the following command in the notebook or just conda install the package:
```
!pip install moviepy==1.0.2
```
Then follow the commands below to use fast-style-transfer.
## Documentation
### Training Style Transfer Networks
Use `style.py` to train a new style transfer network. Run `python style.py` to view all the possible parameters. Training takes 4-6 hours on a Maxwell Titan X. [More detailed documentation here](docs.md#stylepy). **Before you run this, you should run `setup.sh`**. Example usage:

    python style.py --style path/to/style/img.jpg \
      --checkpoint-dir checkpoint/path \
      --test path/to/test/img.jpg \
      --test-dir path/to/test/dir \
      --content-weight 1.5e1 \
      --checkpoint-iterations 1000 \
      --batch-size 20
### Evaluating Style Transfer Networks
Use `evaluate.py` to evaluate a style transfer network. Run `python evaluate.py` to view all the possible parameters. Evaluation takes 100 ms per frame (when batch size is 1) on a Maxwell Titan X, and several seconds per frame on a CPU. [More detailed documentation here](docs.md#evaluatepy). **Models for evaluation are [located here](https://drive.google.com/drive/folders/0B9jhaT37ydSyRk9UX0wwX3BpMzQ?resourcekey=0-Z9LcNHC-BTB4feKwm4loXw&usp=sharing)**. Example usage:

    python evaluate.py --checkpoint path/to/style/model.ckpt \
      --in-path dir/of/test/imgs/ \
      --out-path dir/for/results/
### Stylizing Video
Use `transform_video.py` to transfer style into a video. Run `python transform_video.py` to view all the possible parameters. Requires `ffmpeg`. [More detailed documentation here](docs.md#transform_videopy). Example usage:

    python transform_video.py --in-path path/to/input/vid.mp4 \
      --checkpoint path/to/style/model.ckpt \
      --out-path out/video.mp4 \
      --device /gpu:0 \
      --batch-size 4
### Requirements
You will need the following to run the above:
- TensorFlow 0.11.0
- Python 2.7.9, Pillow 3.4.2, scipy 0.18.1, numpy 1.11.2
- If you want to train (and don't want to wait for 4 months):
  - A decent GPU
  - All the required NVIDIA software to run TF on a GPU (cuda, etc)
- ffmpeg 3.1.3 if you want to stylize video
### Citation
```
@misc{engstrom2016faststyletransfer,
author = {Logan Engstrom},
title = {Fast Style Transfer},
year = {2016},
howpublished = {\url{https://github.com/lengstrom/fast-style-transfer/}},
note = {commit xxxxxxx}
}
```
### Attributions/Thanks
- This project could not have happened without the advice (and GPU access) given by [Anish Athalye](http://www.anishathalye.com/).
- The project also borrowed some code from Anish's [Neural Style](https://github.com/anishathalye/neural-style/)
- Some readme/docs formatting was borrowed from Justin Johnson's [Fast Neural Style](https://github.com/jcjohnson/fast-neural-style)
- The image of the Stata Center at the very beginning of the README was taken by [Juan Paulo](https://juanpaulo.me/)
### Related Work
- Michael Ramos ported this network [to use CoreML on iOS](https://medium.com/@rambossa/diy-prisma-fast-style-transfer-app-with-coreml-and-tensorflow-817c3b90dacd)
================================================
FILE: docs.md
================================================
## style.py
`style.py` trains networks that can transfer styles from artwork into images.
**Flags**
- `--checkpoint-dir`: Directory to save checkpoint in. Required.
- `--style`: Path to style image. Required.
- `--train-path`: Path to training images folder. Default: `data/train2014`.
- `--test`: Path to content image to test network on at every checkpoint iteration. Default: no image.
- `--test-dir`: Path to directory to save test images in. Required if `--test` is passed a value.
- `--epochs`: Epochs to train for. Default: `2`.
- `--batch-size`: Batch size for training. Default: `4`.
- `--checkpoint-iterations`: Number of iterations to go for between checkpoints. Default: `2000`.
- `--vgg-path`: Path to VGG19 network (default). Can pass VGG16 if you want to try out other loss functions. Default: `data/imagenet-vgg-verydeep-19.mat`.
- `--content-weight`: Weight of content in loss function. Default: `7.5e0`.
- `--style-weight`: Weight of style in loss function. Default: `1e2`.
- `--tv-weight`: Weight of total variation term in loss function. Default: `2e2`.
- `--learning-rate`: Learning rate for optimizer. Default: `1e-3`.
- `--slow`: For debugging loss function. Direct optimization on pixels using Gatys' approach. Uses `test` image as content value, `test_dir` for saving fully optimized images.
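
How `--content-weight`, `--style-weight`, and `--tv-weight` interact is easier to see with the loss written out. The sketch below restates, in plain NumPy and for a single image and a single feature layer, the three terms that `src/optimize.py` builds in TensorFlow. It is an illustration under those simplifying assumptions (the real loss sums the style term over five VGG layers and averages over the batch), and all function names here are hypothetical.

```python
import numpy as np

# Default weights as listed in the flags above.
CONTENT_WEIGHT = 7.5e0
STYLE_WEIGHT = 1e2
TV_WEIGHT = 2e2


def sum_sq(t):
    """Sum of squares, i.e. 2 * tf.nn.l2_loss(t)."""
    return float(np.sum(np.square(t)))


def gram_matrix(features):
    """Channel-by-channel correlations of a (rows, cols, channels) feature map."""
    flat = features.reshape(-1, features.shape[-1])
    return flat.T @ flat / flat.size


def content_term(features, target):
    return sum_sq(features - target) / target.size


def style_term(features, style_gram):
    return sum_sq(gram_matrix(features) - style_gram) / style_gram.size


def tv_term(image):
    """Total variation: squared differences between neighboring pixels."""
    dy = image[1:, :, :] - image[:-1, :, :]
    dx = image[:, 1:, :] - image[:, :-1, :]
    return sum_sq(dx) / dx.size + sum_sq(dy) / dy.size


def total_loss(content, style, tv, content_weight=CONTENT_WEIGHT,
               style_weight=STYLE_WEIGHT, tv_weight=TV_WEIGHT):
    return content_weight * content + style_weight * style + tv_weight * tv


# Toy data: features that already match their targets, and a 2x2 image
# whose second row is brighter than its first.
feats = np.ones((2, 2, 3))
gram = gram_matrix(feats)
image = np.zeros((2, 2, 1))
image[1] = 1.0
loss = total_loss(content_term(feats, feats), style_term(feats, gram), tv_term(image))
```

In this toy case the content and style terms vanish, so the whole loss comes from the total variation term. Raising `--tv-weight` penalizes exactly this kind of pixel-to-pixel change and so favors smoother outputs.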
## evaluate.py
`evaluate.py` evaluates trained networks given a checkpoint directory or `ckpt` file. If evaluating images from a directory, every image in the directory must have the same dimensions unless `--allow-different-dimensions` is passed.
**Flags**
- `--checkpoint`: Directory or `ckpt` file to load checkpoint from. Required.
- `--in-path`: Path of image or directory of images to transform. Required.
- `--out-path`: Path to write the transformed image to, or directory to write transformed images to (if `--in-path` is a directory). Required.
- `--device`: Device used to transform image. Default: `/gpu:0`.
- `--batch-size`: Batch size used to evaluate images. In particular meant for directory transformations. Default: `4`.
- `--allow-different-dimensions`: Allow different image dimensions. Default: not enabled
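
The batching behavior behind `--batch-size` and `--allow-different-dimensions` can be sketched without TensorFlow. In `evaluate.py`, `ffwd` runs full batches and then handles any leftover images one at a time, and `ffwd_different_dimensions` first groups images by shape so that each group can share one fixed-shape graph. The helper names below (`group_by_shape`, `plan_batches`, `shape_of`) and the file names are hypothetical and only illustrate that planning logic.

```python
from collections import defaultdict


def group_by_shape(paths, shape_of):
    """Group paths by image shape; shape_of is any callable path -> shape tuple."""
    groups = defaultdict(list)
    for path in paths:
        groups[shape_of(path)].append(path)
    return dict(groups)


def plan_batches(paths, batch_size):
    """Split paths into full batches, then process any remainder singly."""
    if not paths:
        return []
    batch_size = min(len(paths), batch_size)
    n_full = len(paths) // batch_size
    batches = [paths[i * batch_size:(i + 1) * batch_size] for i in range(n_full)]
    batches += [[p] for p in paths[n_full * batch_size:]]
    return batches


# Stand-in for reading image shapes from disk (hypothetical files).
fake_shapes = {
    "a.jpg": (480, 640, 3),
    "b.jpg": (480, 640, 3),
    "c.jpg": (256, 256, 3),
}
groups = group_by_shape(list(fake_shapes), fake_shapes.get)
plan = plan_batches(["f%d.jpg" % i for i in range(7)], batch_size=4)
```

With seven images and a batch size of four, the plan is one batch of four followed by three single-image batches, so directories whose size is a multiple of the batch size are evaluated most efficiently.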
## transform_video.py
`transform_video.py` transforms videos into stylized videos given a style transfer net.
**Flags**
- `--checkpoint`: Directory or `ckpt` file to load checkpoint from. Required.
- `--in-path`: Path to video to transfer style to. Required.
- `--out-path`: Path to out video. Required.
- `--tmp-dir`: Directory to put temporary processing files in. Default: generates a randomly named hidden directory, then deletes it once execution completes.
- `--device`: Device to evaluate frames with. Default: `/gpu:0`.
- `--batch-size`: Batch size for evaluating images. Default: `4`.
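
Video frames rarely divide evenly by `--batch-size`. `ffwd_video` in `evaluate.py` handles the final partial batch by repeating the last real frame to fill the buffer and writing only the real outputs. The sketch below reproduces that idea with NumPy alone; `stylize_in_batches`, `run_batch`, and `fake_net` are hypothetical names, and `fake_net` merely doubles pixel values in place of a trained network.

```python
import numpy as np


def stylize_in_batches(frames, batch_size, run_batch):
    """Yield one output per input frame, feeding run_batch fixed-size batches."""
    buffer = None
    count = 0
    for frame in frames:
        if buffer is None:
            buffer = np.zeros((batch_size,) + frame.shape, dtype=np.float32)
        buffer[count] = frame
        count += 1
        if count == batch_size:
            yield from run_batch(buffer)[:count]
            count = 0
    if count:
        # Pad the partial batch with copies of the last real frame,
        # then keep only the outputs that correspond to real frames.
        buffer[count:] = buffer[count - 1]
        yield from run_batch(buffer)[:count]


calls = []


def fake_net(batch):
    """Stand-in for the style network: records the batch shape, doubles values."""
    calls.append(batch.shape)
    return batch * 2.0


frames = [np.full((2, 3, 3), i, dtype=np.float32) for i in range(6)]
outputs = list(stylize_in_batches(frames, batch_size=4, run_batch=fake_net))
```

Padding keeps the input shape constant, which matters because the placeholder in `ffwd_video` is built with a fixed batch dimension.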
================================================
FILE: evaluate.py
================================================
from __future__ import print_function
import sys
sys.path.insert(0, 'src')
import transform, numpy as np, vgg, pdb, os
import tensorflow as tf
from utils import save_img, get_img, exists, list_files
from argparse import ArgumentParser
from collections import defaultdict
import time
import json
import subprocess
from moviepy.video.io.VideoFileClip import VideoFileClip
import moviepy.video.io.ffmpeg_writer as ffmpeg_writer
BATCH_SIZE = 4
DEVICE = '/gpu:0'
def ffwd_video(path_in, path_out, checkpoint_dir, device_t='/gpu:0', batch_size=4):
    video_clip = VideoFileClip(path_in, audio=False)
    video_writer = ffmpeg_writer.FFMPEG_VideoWriter(path_out, video_clip.size, video_clip.fps, codec="libx264",
                                                    preset="medium", bitrate="2000k",
                                                    audiofile=path_in, threads=None,
                                                    ffmpeg_params=None)

    g = tf.Graph()
    soft_config = tf.compat.v1.ConfigProto(allow_soft_placement=True)
    soft_config.gpu_options.allow_growth = True
    with g.as_default(), g.device(device_t), \
            tf.compat.v1.Session(config=soft_config) as sess:
        batch_shape = (batch_size, video_clip.size[1], video_clip.size[0], 3)
        img_placeholder = tf.compat.v1.placeholder(tf.float32, shape=batch_shape,
                                                   name='img_placeholder')

        preds = transform.net(img_placeholder)
        saver = tf.compat.v1.train.Saver()
        if os.path.isdir(checkpoint_dir):
            ckpt = tf.train.get_checkpoint_state(checkpoint_dir)
            if ckpt and ckpt.model_checkpoint_path:
                saver.restore(sess, ckpt.model_checkpoint_path)
            else:
                raise Exception("No checkpoint found...")
        else:
            saver.restore(sess, checkpoint_dir)

        X = np.zeros(batch_shape, dtype=np.float32)

        def style_and_write(count):
            for i in range(count, batch_size):
                X[i] = X[count - 1]  # Use last frame to fill X
            _preds = sess.run(preds, feed_dict={img_placeholder: X})
            for i in range(0, count):
                video_writer.write_frame(np.clip(_preds[i], 0, 255).astype(np.uint8))

        frame_count = 0  # Number of frames written to X so far
        for frame in video_clip.iter_frames():
            X[frame_count] = frame
            frame_count += 1
            if frame_count == batch_size:
                style_and_write(frame_count)
                frame_count = 0

        if frame_count != 0:
            style_and_write(frame_count)

        video_writer.close()
def ffwd(data_in, paths_out, checkpoint_dir, device_t='/gpu:0', batch_size=4):
    assert len(paths_out) > 0
    is_paths = isinstance(data_in[0], str)
    # get img_shape
    if is_paths:
        assert len(data_in) == len(paths_out)
        img_shape = get_img(data_in[0]).shape
    else:
        # data_in is an array of images: one image per output path
        assert len(data_in) == len(paths_out)
        img_shape = data_in[0].shape

    g = tf.Graph()
    batch_size = min(len(paths_out), batch_size)
    curr_num = 0
    soft_config = tf.compat.v1.ConfigProto(allow_soft_placement=True)
    soft_config.gpu_options.allow_growth = True
    with g.as_default(), g.device(device_t), \
            tf.compat.v1.Session(config=soft_config) as sess:
        batch_shape = (batch_size,) + img_shape
        img_placeholder = tf.compat.v1.placeholder(tf.float32, shape=batch_shape,
                                                   name='img_placeholder')

        preds = transform.net(img_placeholder)
        saver = tf.compat.v1.train.Saver()
        if os.path.isdir(checkpoint_dir):
            ckpt = tf.train.get_checkpoint_state(checkpoint_dir)
            if ckpt and ckpt.model_checkpoint_path:
                saver.restore(sess, ckpt.model_checkpoint_path)
            else:
                raise Exception("No checkpoint found...")
        else:
            saver.restore(sess, checkpoint_dir)

        # Number of full batches; leftovers are handled after the loop
        num_iters = len(paths_out) // batch_size
        for i in range(num_iters):
            pos = i * batch_size
            curr_batch_out = paths_out[pos:pos+batch_size]
            if is_paths:
                curr_batch_in = data_in[pos:pos+batch_size]
                X = np.zeros(batch_shape, dtype=np.float32)
                for j, path_in in enumerate(curr_batch_in):
                    img = get_img(path_in)
                    assert img.shape == img_shape, \
                        'Images have different dimensions. ' + \
                        'Resize images or use --allow-different-dimensions.'
                    X[j] = img
            else:
                X = data_in[pos:pos+batch_size]

            _preds = sess.run(preds, feed_dict={img_placeholder: X})
            for j, path_out in enumerate(curr_batch_out):
                save_img(path_out, _preds[j])

        remaining_in = data_in[num_iters*batch_size:]
        remaining_out = paths_out[num_iters*batch_size:]
    if len(remaining_in) > 0:
        ffwd(remaining_in, remaining_out, checkpoint_dir,
             device_t=device_t, batch_size=1)
def ffwd_to_img(in_path, out_path, checkpoint_dir, device='/cpu:0'):
    paths_in, paths_out = [in_path], [out_path]
    ffwd(paths_in, paths_out, checkpoint_dir, batch_size=1, device_t=device)

def ffwd_different_dimensions(in_path, out_path, checkpoint_dir,
                              device_t=DEVICE, batch_size=4):
    in_path_of_shape = defaultdict(list)
    out_path_of_shape = defaultdict(list)
    for i in range(len(in_path)):
        in_image = in_path[i]
        out_image = out_path[i]
        shape = "%dx%dx%d" % get_img(in_image).shape
        in_path_of_shape[shape].append(in_image)
        out_path_of_shape[shape].append(out_image)
    for shape in in_path_of_shape:
        print('Processing images of shape %s' % shape)
        ffwd(in_path_of_shape[shape], out_path_of_shape[shape],
             checkpoint_dir, device_t, batch_size)
def build_parser():
    parser = ArgumentParser()
    parser.add_argument('--checkpoint', type=str,
                        dest='checkpoint_dir',
                        help='dir or .ckpt file to load checkpoint from',
                        metavar='CHECKPOINT', required=True)

    parser.add_argument('--in-path', type=str,
                        dest='in_path', help='dir or file to transform',
                        metavar='IN_PATH', required=True)

    help_out = 'destination (dir or file) of transformed file or files'
    parser.add_argument('--out-path', type=str,
                        dest='out_path', help=help_out, metavar='OUT_PATH',
                        required=True)

    parser.add_argument('--device', type=str,
                        dest='device', help='device to perform compute on',
                        metavar='DEVICE', default=DEVICE)

    parser.add_argument('--batch-size', type=int,
                        dest='batch_size', help='batch size for feedforwarding',
                        metavar='BATCH_SIZE', default=BATCH_SIZE)

    parser.add_argument('--allow-different-dimensions', action='store_true',
                        dest='allow_different_dimensions',
                        help='allow different image dimensions')

    return parser
def check_opts(opts):
    exists(opts.checkpoint_dir, 'Checkpoint not found!')
    exists(opts.in_path, 'In path not found!')
    if os.path.isdir(opts.out_path):
        exists(opts.out_path, 'out dir not found!')
        assert opts.batch_size > 0

def main():
    parser = build_parser()
    opts = parser.parse_args()
    check_opts(opts)

    if not os.path.isdir(opts.in_path):
        if os.path.exists(opts.out_path) and os.path.isdir(opts.out_path):
            out_path = \
                os.path.join(opts.out_path, os.path.basename(opts.in_path))
        else:
            out_path = opts.out_path

        ffwd_to_img(opts.in_path, out_path, opts.checkpoint_dir,
                    device=opts.device)
    else:
        files = list_files(opts.in_path)
        full_in = [os.path.join(opts.in_path, x) for x in files]
        full_out = [os.path.join(opts.out_path, x) for x in files]
        if opts.allow_different_dimensions:
            ffwd_different_dimensions(full_in, full_out, opts.checkpoint_dir,
                                      device_t=opts.device, batch_size=opts.batch_size)
        else:
            ffwd(full_in, full_out, opts.checkpoint_dir, device_t=opts.device,
                 batch_size=opts.batch_size)

if __name__ == '__main__':
    main()
================================================
FILE: setup.sh
================================================
#! /bin/bash
mkdir data
cd data
wget http://www.vlfeat.org/matconvnet/models/beta16/imagenet-vgg-verydeep-19.mat
mkdir bin
wget http://msvocds.blob.core.windows.net/coco2014/train2014.zip
unzip -q train2014.zip
================================================
FILE: src/optimize.py
================================================
from __future__ import print_function
import functools
import vgg, pdb, time
import tensorflow as tf, numpy as np, os
import transform
from utils import get_img
STYLE_LAYERS = ('relu1_1', 'relu2_1', 'relu3_1', 'relu4_1', 'relu5_1')
CONTENT_LAYER = 'relu4_2'
DEVICES = 'CUDA_VISIBLE_DEVICES'
# np arr, np arr
def optimize(content_targets, style_target, content_weight, style_weight,
             tv_weight, vgg_path, epochs=2, print_iterations=1000,
             batch_size=4, save_path='saver/fns.ckpt', slow=False,
             learning_rate=1e-3, debug=False):
    if slow:
        batch_size = 1
    mod = len(content_targets) % batch_size
    if mod > 0:
        print("Train set has been trimmed slightly..")
        content_targets = content_targets[:-mod]

    style_features = {}

    batch_shape = (batch_size,256,256,3)
    style_shape = (1,) + style_target.shape
    print(style_shape)

    # precompute style features
    with tf.Graph().as_default(), tf.device('/cpu:0'), tf.compat.v1.Session() as sess:
        style_image = tf.compat.v1.placeholder(tf.float32, shape=style_shape, name='style_image')
        style_image_pre = vgg.preprocess(style_image)
        net = vgg.net(vgg_path, style_image_pre)
        style_pre = np.array([style_target])
        for layer in STYLE_LAYERS:
            features = net[layer].eval(feed_dict={style_image:style_pre})
            features = np.reshape(features, (-1, features.shape[3]))
            gram = np.matmul(features.T, features) / features.size
            style_features[layer] = gram

    with tf.Graph().as_default(), tf.compat.v1.Session() as sess:
        X_content = tf.compat.v1.placeholder(tf.float32, shape=batch_shape, name="X_content")
        X_pre = vgg.preprocess(X_content)

        # precompute content features
        content_features = {}
        content_net = vgg.net(vgg_path, X_pre)
        content_features[CONTENT_LAYER] = content_net[CONTENT_LAYER]

        if slow:
            preds = tf.Variable(
                tf.random.normal(X_content.get_shape()) * 0.256
            )
            preds_pre = preds
        else:
            preds = transform.net(X_content/255.0)
            preds_pre = vgg.preprocess(preds)

        net = vgg.net(vgg_path, preds_pre)

        content_size = _tensor_size(content_features[CONTENT_LAYER])*batch_size
        assert _tensor_size(content_features[CONTENT_LAYER]) == _tensor_size(net[CONTENT_LAYER])
        content_loss = content_weight * (2 * tf.nn.l2_loss(
            net[CONTENT_LAYER] - content_features[CONTENT_LAYER]) / content_size
        )

        style_losses = []
        for style_layer in STYLE_LAYERS:
            layer = net[style_layer]
            bs, height, width, filters = map(lambda i:i,layer.get_shape())
            size = height * width * filters
            feats = tf.reshape(layer, (bs, height * width, filters))
            feats_T = tf.transpose(a=feats, perm=[0,2,1])
            grams = tf.matmul(feats_T, feats) / size
            style_gram = style_features[style_layer]
            style_losses.append(2 * tf.nn.l2_loss(grams - style_gram)/style_gram.size)

        style_loss = style_weight * functools.reduce(tf.add, style_losses) / batch_size

        # total variation denoising
        tv_y_size = _tensor_size(preds[:,1:,:,:])
        tv_x_size = _tensor_size(preds[:,:,1:,:])
        y_tv = tf.nn.l2_loss(preds[:,1:,:,:] - preds[:,:batch_shape[1]-1,:,:])
        x_tv = tf.nn.l2_loss(preds[:,:,1:,:] - preds[:,:,:batch_shape[2]-1,:])
        tv_loss = tv_weight*2*(x_tv/tv_x_size + y_tv/tv_y_size)/batch_size

        # overall loss
        loss = content_loss + style_loss + tv_loss

        train_step = tf.compat.v1.train.AdamOptimizer(learning_rate).minimize(loss)
        sess.run(tf.compat.v1.global_variables_initializer())
        import random
        uid = random.randint(1, 100)
        print("UID: %s" % uid)
        for epoch in range(epochs):
            num_examples = len(content_targets)
            iterations = 0
            while iterations * batch_size < num_examples:
                start_time = time.time()
                curr = iterations * batch_size
                step = curr + batch_size
                X_batch = np.zeros(batch_shape, dtype=np.float32)
                for j, img_p in enumerate(content_targets[curr:step]):
                    X_batch[j] = get_img(img_p, (256,256,3)).astype(np.float32)

                iterations += 1
                assert X_batch.shape[0] == batch_size

                feed_dict = {
                    X_content:X_batch
                }

                train_step.run(feed_dict=feed_dict)
                end_time = time.time()
                delta_time = end_time - start_time
                if debug:
                    print("UID: %s, batch time: %s" % (uid, delta_time))
                is_print_iter = int(iterations) % print_iterations == 0
                if slow:
                    is_print_iter = epoch % print_iterations == 0
                is_last = epoch == epochs - 1 and iterations * batch_size >= num_examples
                should_print = is_print_iter or is_last
                if should_print:
                    to_get = [style_loss, content_loss, tv_loss, loss, preds]
                    test_feed_dict = {
                        X_content:X_batch
                    }

                    tup = sess.run(to_get, feed_dict = test_feed_dict)
                    _style_loss,_content_loss,_tv_loss,_loss,_preds = tup
                    losses = (_style_loss, _content_loss, _tv_loss, _loss)
                    if slow:
                        _preds = vgg.unprocess(_preds)
                    else:
                        saver = tf.compat.v1.train.Saver()
                        res = saver.save(sess, save_path)
                    yield(_preds, losses, iterations, epoch)
def _tensor_size(tensor):
    from operator import mul
    return functools.reduce(mul, (d for d in tensor.get_shape()[1:]), 1)
================================================
FILE: src/transform.py
================================================
import tensorflow as tf, pdb
WEIGHTS_INIT_STDEV = .1
def net(image):
    conv1 = _conv_layer(image, 32, 9, 1)
    conv2 = _conv_layer(conv1, 64, 3, 2)
    conv3 = _conv_layer(conv2, 128, 3, 2)
    resid1 = _residual_block(conv3, 3)
    resid2 = _residual_block(resid1, 3)
    resid3 = _residual_block(resid2, 3)
    resid4 = _residual_block(resid3, 3)
    resid5 = _residual_block(resid4, 3)
    conv_t1 = _conv_tranpose_layer(resid5, 64, 3, 2)
    conv_t2 = _conv_tranpose_layer(conv_t1, 32, 3, 2)
    conv_t3 = _conv_layer(conv_t2, 3, 9, 1, relu=False)
    preds = tf.nn.tanh(conv_t3) * 150 + 255./2
    return preds

def _conv_layer(net, num_filters, filter_size, strides, relu=True):
    weights_init = _conv_init_vars(net, num_filters, filter_size)
    strides_shape = [1, strides, strides, 1]
    net = tf.nn.conv2d(input=net, filters=weights_init, strides=strides_shape, padding='SAME')
    net = _instance_norm(net)
    if relu:
        net = tf.nn.relu(net)

    return net

def _conv_tranpose_layer(net, num_filters, filter_size, strides):
    weights_init = _conv_init_vars(net, num_filters, filter_size, transpose=True)

    batch_size, rows, cols, in_channels = [i for i in net.get_shape()]
    new_rows, new_cols = int(rows * strides), int(cols * strides)

    new_shape = [batch_size, new_rows, new_cols, num_filters]
    tf_shape = tf.stack(new_shape)
    strides_shape = [1,strides,strides,1]

    net = tf.nn.conv2d_transpose(net, weights_init, tf_shape, strides_shape, padding='SAME')
    net = _instance_norm(net)
    return tf.nn.relu(net)

def _residual_block(net, filter_size=3):
    tmp = _conv_layer(net, 128, filter_size, 1)
    return net + _conv_layer(tmp, 128, filter_size, 1, relu=False)

def _instance_norm(net, train=True):
    batch, rows, cols, channels = [i for i in net.get_shape()]
    var_shape = [channels]
    mu, sigma_sq = tf.nn.moments(x=net, axes=[1,2], keepdims=True)
    shift = tf.Variable(tf.zeros(var_shape))
    scale = tf.Variable(tf.ones(var_shape))
    epsilon = 1e-3
    normalized = (net-mu)/(sigma_sq + epsilon)**(.5)
    return scale * normalized + shift

def _conv_init_vars(net, out_channels, filter_size, transpose=False):
    _, rows, cols, in_channels = [i for i in net.get_shape()]
    if not transpose:
        weights_shape = [filter_size, filter_size, in_channels, out_channels]
    else:
        weights_shape = [filter_size, filter_size, out_channels, in_channels]

    weights_init = tf.Variable(tf.random.truncated_normal(weights_shape, stddev=WEIGHTS_INIT_STDEV, seed=1), dtype=tf.float32)
    return weights_init
================================================
FILE: src/utils.py
================================================
import numpy as np, os, sys
import imageio
from PIL import Image

def save_img(out_path, img):
    img = np.clip(img, 0, 255).astype(np.uint8)
    imageio.imwrite(out_path, img)

def scale_img(style_path, style_scale):
    scale = float(style_scale)
    o0, o1, o2 = imageio.imread(style_path, pilmode='RGB').shape
    new_shape = (int(o0 * scale), int(o1 * scale), o2)
    # Load the style image and resize it to the scaled shape
    style_target = get_img(style_path, img_size=new_shape)
    return style_target

def get_img(src, img_size=False):
    img = imageio.imread(src, pilmode='RGB')
    # Stack grayscale input into three identical channels
    if not (len(img.shape) == 3 and img.shape[2] == 3):
        img = np.dstack((img,img,img))
    if img_size != False:
        # img_size is (rows, cols, channels); PIL's resize expects (width, height)
        img = np.array(Image.fromarray(img).resize((img_size[1], img_size[0])))
    return img

def exists(p, msg):
    assert os.path.exists(p), msg

def list_files(in_path):
    # Collect file names from the top level of in_path only
    files = []
    for (dirpath, dirnames, filenames) in os.walk(in_path):
        files.extend(filenames)
        break

    return files
================================================
FILE: src/vgg.py
================================================
# Copyright (c) 2015-2016 Anish Athalye. Released under GPLv3.
import tensorflow as tf
import numpy as np
import scipy.io
import pdb
MEAN_PIXEL = np.array([ 123.68 , 116.779, 103.939])
def net(data_path, input_image):
    layers = (
        'conv1_1', 'relu1_1', 'conv1_2', 'relu1_2', 'pool1',

        'conv2_1', 'relu2_1', 'conv2_2', 'relu2_2', 'pool2',

        'conv3_1', 'relu3_1', 'conv3_2', 'relu3_2', 'conv3_3',
        'relu3_3', 'conv3_4', 'relu3_4', 'pool3',

        'conv4_1', 'relu4_1', 'conv4_2', 'relu4_2', 'conv4_3',
        'relu4_3', 'conv4_4', 'relu4_4', 'pool4',

        'conv5_1', 'relu5_1', 'conv5_2', 'relu5_2', 'conv5_3',
        'relu5_3', 'conv5_4', 'relu5_4'
    )

    data = scipy.io.loadmat(data_path)
    mean = data['normalization'][0][0][0]
    mean_pixel = np.mean(mean, axis=(0, 1))
    weights = data['layers'][0]

    net = {}
    current = input_image
    for i, name in enumerate(layers):
        kind = name[:4]
        if kind == 'conv':
            kernels, bias = weights[i][0][0][0][0]
            # matconvnet: weights are [width, height, in_channels, out_channels]
            # tensorflow: weights are [height, width, in_channels, out_channels]
            kernels = np.transpose(kernels, (1, 0, 2, 3))
            bias = bias.reshape(-1)
            current = _conv_layer(current, kernels, bias)
        elif kind == 'relu':
            current = tf.nn.relu(current)
        elif kind == 'pool':
            current = _pool_layer(current)
        net[name] = current

    assert len(net) == len(layers)
    return net

def _conv_layer(input, weights, bias):
    conv = tf.nn.conv2d(input=input, filters=tf.constant(weights), strides=(1, 1, 1, 1),
                        padding='SAME')
    return tf.nn.bias_add(conv, bias)

def _pool_layer(input):
    return tf.nn.max_pool2d(input=input, ksize=(1, 2, 2, 1), strides=(1, 2, 2, 1),
                            padding='SAME')

def preprocess(image):
    return image - MEAN_PIXEL

def unprocess(image):
    return image + MEAN_PIXEL
================================================
FILE: style.py
================================================
from __future__ import print_function
import sys, os, pdb
sys.path.insert(0, 'src')
import numpy as np, scipy.misc
from optimize import optimize
from argparse import ArgumentParser
from utils import save_img, get_img, exists, list_files
import evaluate
CONTENT_WEIGHT = 7.5e0
STYLE_WEIGHT = 1e2
TV_WEIGHT = 2e2
LEARNING_RATE = 1e-3
NUM_EPOCHS = 2
CHECKPOINT_DIR = 'checkpoints'
CHECKPOINT_ITERATIONS = 2000
VGG_PATH = 'data/imagenet-vgg-verydeep-19.mat'
TRAIN_PATH = 'data/train2014'
BATCH_SIZE = 4
DEVICE = '/gpu:0'
FRAC_GPU = 1
def build_parser():
    parser = ArgumentParser()
    parser.add_argument('--checkpoint-dir', type=str,
                        dest='checkpoint_dir', help='dir to save checkpoint in',
                        metavar='CHECKPOINT_DIR', required=True)

    parser.add_argument('--style', type=str,
                        dest='style', help='style image path',
                        metavar='STYLE', required=True)

    parser.add_argument('--train-path', type=str,
                        dest='train_path', help='path to training images folder',
                        metavar='TRAIN_PATH', default=TRAIN_PATH)

    parser.add_argument('--test', type=str,
                        dest='test', help='test image path',
                        metavar='TEST', default=False)

    parser.add_argument('--test-dir', type=str,
                        dest='test_dir', help='test image save dir',
                        metavar='TEST_DIR', default=False)

    parser.add_argument('--slow', dest='slow', action='store_true',
                        help='gatys\' approach (for debugging, not supported)',
                        default=False)

    parser.add_argument('--epochs', type=int,
                        dest='epochs', help='num epochs',
                        metavar='EPOCHS', default=NUM_EPOCHS)

    parser.add_argument('--batch-size', type=int,
                        dest='batch_size', help='batch size',
                        metavar='BATCH_SIZE', default=BATCH_SIZE)

    parser.add_argument('--checkpoint-iterations', type=int,
                        dest='checkpoint_iterations', help='checkpoint frequency',
                        metavar='CHECKPOINT_ITERATIONS',
                        default=CHECKPOINT_ITERATIONS)

    parser.add_argument('--vgg-path', type=str,
                        dest='vgg_path',
                        help='path to VGG19 network (default %(default)s)',
                        metavar='VGG_PATH', default=VGG_PATH)

    parser.add_argument('--content-weight', type=float,
                        dest='content_weight',
                        help='content weight (default %(default)s)',
                        metavar='CONTENT_WEIGHT', default=CONTENT_WEIGHT)

    parser.add_argument('--style-weight', type=float,
                        dest='style_weight',
                        help='style weight (default %(default)s)',
                        metavar='STYLE_WEIGHT', default=STYLE_WEIGHT)

    parser.add_argument('--tv-weight', type=float,
                        dest='tv_weight',
                        help='total variation regularization weight (default %(default)s)',
                        metavar='TV_WEIGHT', default=TV_WEIGHT)

    parser.add_argument('--learning-rate', type=float,
                        dest='learning_rate',
                        help='learning rate (default %(default)s)',
                        metavar='LEARNING_RATE', default=LEARNING_RATE)

    return parser

def check_opts(opts):
    exists(opts.checkpoint_dir, "checkpoint dir not found!")
    exists(opts.style, "style path not found!")
    exists(opts.train_path, "train path not found!")
    # --test and --test-dir must be given together, and both must exist.
    if opts.test or opts.test_dir:
        exists(opts.test, "test img not found!")
        exists(opts.test_dir, "test directory not found!")
    exists(opts.vgg_path, "vgg network data not found!")
    assert opts.epochs > 0
    assert opts.batch_size > 0
    assert opts.checkpoint_iterations > 0
    assert os.path.exists(opts.vgg_path)
    assert opts.content_weight >= 0
    assert opts.style_weight >= 0
    assert opts.tv_weight >= 0
    assert opts.learning_rate >= 0

def _get_files(img_dir):
    files = list_files(img_dir)
    return [os.path.join(img_dir,x) for x in files]

def main():
    parser = build_parser()
    options = parser.parse_args()
    check_opts(options)

    style_target = get_img(options.style)
    if not options.slow:
        content_targets = _get_files(options.train_path)
    elif options.test:
        content_targets = [options.test]
    else:
        # Slow mode optimizes a single image, so it needs --test as the content target.
        parser.error('--slow requires --test')

    kwargs = {
        "slow":options.slow,
        "epochs":options.epochs,
        "print_iterations":options.checkpoint_iterations,
        "batch_size":options.batch_size,
        "save_path":os.path.join(options.checkpoint_dir,'fns.ckpt'),
        "learning_rate":options.learning_rate
    }

    if options.slow:
        # Slow mode replaces small epoch counts and learning rates with larger values.
        if options.epochs < 10:
            kwargs['epochs'] = 1000
        if options.learning_rate < 1:
            kwargs['learning_rate'] = 1e1

    args = [
        content_targets,
        style_target,
        options.content_weight,
        options.style_weight,
        options.tv_weight,
        options.vgg_path
    ]

    for preds, losses, i, epoch in optimize(*args, **kwargs):
        style_loss, content_loss, tv_loss, loss = losses

        print('Epoch %d, Iteration: %d, Loss: %s' % (epoch, i, loss))
        to_print = (style_loss, content_loss, tv_loss)
        print('style: %s, content:%s, tv: %s' % to_print)
        if options.test:
            assert options.test_dir != False
            preds_path = '%s/%s_%s.png' % (options.test_dir,epoch,i)
            if not options.slow:
                # Stylize the test image using the checkpoint directory.
                evaluate.ffwd_to_img(options.test,preds_path,
                                     options.checkpoint_dir)
            else:
                # Save the prediction yielded by optimize() for this step.
                save_img(preds_path, preds)
    ckpt_dir = options.checkpoint_dir
    cmd_text = 'python evaluate.py --checkpoint %s ...' % ckpt_dir
    print("Training complete. For evaluation:\n `%s`" % cmd_text)

if __name__ == '__main__':
    main()
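
# A minimal sketch (not part of the original script) of assembling a training
# command from flags defined in build_parser() above. All paths are hypothetical
# placeholders, and the numeric values simply repeat the defaults listed at the
# top of this file.

```python
import shlex

# Hypothetical paths; replace them with real files and directories.
style_image = "path/to/style.jpg"
checkpoint_dir = "checkpoints"   # check_opts() requires this directory to exist
test_image = "path/to/test.jpg"
test_dir = "path/to/test_dir"    # must also exist whenever --test is given

# Only flags defined in build_parser() appear here; omitted flags keep their defaults.
cmd = [
    "python", "style.py",
    "--style", style_image,
    "--checkpoint-dir", checkpoint_dir,
    "--test", test_image,
    "--test-dir", test_dir,
    "--content-weight", "7.5",
    "--checkpoint-iterations", "2000",
    "--batch-size", "4",
]

# Quote each part so the printed line can be pasted into a shell.
command_line = " ".join(shlex.quote(part) for part in cmd)
print(command_line)
```

# --test and --test-dir are passed together because check_opts() validates both
# as soon as either one is set.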
================================================
FILE: transform_video.py
================================================
from __future__ import print_function
from argparse import ArgumentParser
import sys
sys.path.insert(0, 'src')
import random, evaluate
from utils import exists

TMP_DIR = '.fns_frames_%s/' % random.randint(0,99999)
DEVICE = '/gpu:0'
BATCH_SIZE = 4

def build_parser():
    parser = ArgumentParser()
    parser.add_argument('--checkpoint', type=str,
                        dest='checkpoint', help='checkpoint directory or .ckpt file',
                        metavar='CHECKPOINT', required=True)

    parser.add_argument('--in-path', type=str,
                        dest='in_path', help='in video path',
                        metavar='IN_PATH', required=True)

    parser.add_argument('--out-path', type=str,
                        dest='out', help='path to save processed video to',
                        metavar='OUT', required=True)

    # NOTE: --tmp-dir and --no-disk are parsed but not used by main() below.
    parser.add_argument('--tmp-dir', type=str, dest='tmp_dir',
                        help='tmp dir for processing', metavar='TMP_DIR',
                        default=TMP_DIR)

    parser.add_argument('--device', type=str, dest='device',
                        help='device for eval. CPU discouraged. ex: \'/gpu:0\'',
                        metavar='DEVICE', default=DEVICE)

    parser.add_argument('--batch-size', type=int,
                        dest='batch_size',help='batch size for eval. default 4.',
                        metavar='BATCH_SIZE', default=BATCH_SIZE)

    # NOTE: with type=bool, argparse turns any non-empty string (even 'False') into True.
    parser.add_argument('--no-disk', type=bool, dest='no_disk',
                        help='Don\'t save intermediate files to disk. Default False',
                        metavar='NO_DISK', default=False)
    return parser

def check_opts(opts):
    # Not called by main(); exists() takes a path and an assertion message.
    exists(opts.checkpoint, 'Checkpoint not found!')
    exists(opts.in_path, 'In path not found!')

def main():
    parser = build_parser()
    opts = parser.parse_args()
    evaluate.ffwd_video(opts.in_path, opts.out, opts.checkpoint, opts.device, opts.batch_size)

if __name__ == '__main__':
    main()
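
# A minimal, self-contained sketch (not part of the original script) of how
# argparse handles a boolean option declared with type=bool, as --no-disk is
# above, compared with a store_true flag. parse_no_disk is a hypothetical helper
# written only for this illustration.

```python
from argparse import ArgumentParser

def parse_no_disk(argv, use_flag):
    """Parse argv with one of two hypothetical definitions of --no-disk."""
    parser = ArgumentParser()
    if use_flag:
        # Flag style: the option takes no value; its presence means True.
        parser.add_argument('--no-disk', dest='no_disk', action='store_true')
    else:
        # type=bool style: the value string is passed through bool().
        parser.add_argument('--no-disk', type=bool, dest='no_disk', default=False)
    return parser.parse_args(argv).no_disk

# bool('False') is True because any non-empty string is truthy.
print(parse_no_disk(['--no-disk', 'False'], use_flag=False))
# With store_true, the value depends only on whether the flag is present.
print(parse_no_disk([], use_flag=True))
print(parse_no_disk(['--no-disk'], use_flag=True))
```

# Switching the real option to store_true would change its command-line syntax
# (no value after the flag), so this is a design trade-off, not a drop-in fix.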