Full Code of golsun/deep-RL-trading for AI

Repository: golsun/deep-RL-trading
Branch: master
Commit: 4109834e9178
Files: 19
Total size: 36.7 KB

Directory structure:
gitextract_vp_b1_jb/

├── .gitignore
├── LICENSE
├── README.md
├── data/
│   ├── PairSamplerDB/
│   │   ├── randjump_100,1(10, 30)[]_A/
│   │   │   ├── db.pickle
│   │   │   └── param.json
│   │   └── randjump_100,1(10, 30)[]_B/
│   │       ├── db.pickle
│   │       └── param.json
│   └── SinSamplerDB/
│       ├── concat_half_base_A/
│       │   ├── db.pickle
│       │   └── param.json
│       └── concat_half_base_B/
│           ├── db.pickle
│           └── param.json
├── env.yml
└── src/
    ├── agents.py
    ├── emulator.py
    ├── lib.py
    ├── main.py
    ├── sampler.py
    ├── simulators.py
    └── visualizer.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================

*.pyc


================================================
FILE: LICENSE
================================================
MIT License

Copyright (c) 2018 Xiang Gao

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


================================================
FILE: README.md
================================================

# **Playing trading games with deep reinforcement learning**

This repo contains the code for this [paper](https://arxiv.org/abs/1803.03916). Deep reinforcement learning is used to find optimal strategies in these two scenarios:
* Momentum trading: capture the underlying dynamics
* Arbitrage trading: utilize the hidden relation among the inputs

Several neural networks are compared: 
* Recurrent Neural Networks (GRU/LSTM)
* Convolutional Neural Network (CNN)
* Multi-Layer Perceptron (MLP)
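
Each architecture corresponds to a `model_type` string handled by `get_model()` in `main.py` (`'RNN'` builds a GRU, and `'pretrained'` reloads a saved model). For example:

    model_type = 'RNN'   # one of 'MLP', 'conv', 'RNN', 'ConvRNN', 'pretrained'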

### Dependencies

You can get all dependencies via the [Anaconda](https://conda.io/docs/user-guide/tasks/manage-environments.html#creating-an-environment-from-an-environment-yml-file) environment file, [env.yml](https://github.com/golsun/deep-RL-time-series/blob/master/env.yml):

    conda env create -f env.yml

### Play with it
Just run the main script

    python main.py

You can play with the model parameters (specified in main.py). If you get good results, or run into any trouble, please contact me at gxiang1228@gmail.com
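
### Generating a database (optional)

`main()` loads a pre-built episode database from `data/` (the repo ships one of each kind, suffixed `_A` and `_B`). A minimal sketch of building one yourself, mirroring `test_SinSampler()` in `sampler.py` — the folder name `concat_half_base_C` is illustrative, and `build_db` refuses to overwrite an existing folder (run from `src/`):

    from sampler import SinSampler
    sampler = SinSampler('concat_half_base',
        window_episode=180, noise_amplitude_ratio=0.5,
        period_range=(10, 40), amplitude_range=(5, 80))
    sampler.build_db(n_episodes=100, fld='../data/SinSamplerDB/concat_half_base_C')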


================================================
FILE: data/PairSamplerDB/randjump_100,1(10, 30)[]_A/param.json
================================================
{"n_episodes": 100, "title": "randjump(5, (10, 30), 1, [])", "window_episode": 180, "forecast_horizon_range": [10, 30], "max_change_perc": 30.0, "noise_level": 5, "n_section": 1, "n_var": 2}

================================================
FILE: data/PairSamplerDB/randjump_100,1(10, 30)[]_B/param.json
================================================
{"n_episodes": 100, "title": "randjump(5, (10, 30), 1, [])", "window_episode": 180, "forecast_horizon_range": [10, 30], "max_change_perc": 30.0, "noise_level": 5, "n_section": 1, "n_var": 2}

================================================
FILE: data/SinSamplerDB/concat_half_base_A/param.json
================================================
{"n_episodes": 100, "title": "ConcatHalfSin+Base(0.5, (10, 40), (5, 80))", "window_episode": 180, "noise_amplitude_ratio": 0.5, "period_range": [10, 40], "amplitude_range": [5, 80], "can_half_period": true}

================================================
FILE: data/SinSamplerDB/concat_half_base_B/param.json
================================================
{"n_episodes": 100, "title": "ConcatHalfSin+Base(0.5, (10, 40), (5, 80))", "window_episode": 180, "noise_amplitude_ratio": 0.5, "period_range": [10, 40], "amplitude_range": [5, 80], "can_half_period": true}

================================================
FILE: env.yml
================================================
name: drlts
channels:
  - defaults
dependencies:
  - ca-certificates=2018.03.07=0
  - certifi=2018.4.16=py36_0
  - h5py=2.7.1=py36h39cdac5_0
  - hdf5=1.10.1=ha036c08_1
  - intel-openmp=2018.0.0=8
  - keras=2.1.5=py36_0
  - libcxx=4.0.1=h579ed51_0
  - libcxxabi=4.0.1=hebd6815_0
  - libedit=3.1=hb4e282d_0
  - libffi=3.2.1=h475c297_4
  - libgfortran=3.0.1=h93005f0_2
  - libprotobuf=3.5.2=h2cd40f5_0
  - mkl=2018.0.2=1
  - ncurses=6.0=hd04f020_2
  - numpy=1.12.1=py36h8871d66_1
  - openssl=1.0.2o=h26aff7b_0
  - pandas=0.22.0=py36h0a44026_0
  - pip=9.0.3=py36_0
  - protobuf=3.5.2=py36h0a44026_0
  - python=3.6.5=hc167b69_0
  - python-dateutil=2.7.2=py36_0
  - pytz=2018.4=py36_0
  - pyyaml=3.12=py36h2ba1e63_1
  - readline=7.0=hc1231fa_4
  - scipy=1.0.1=py36hcaad992_0
  - setuptools=39.0.1=py36_0
  - six=1.11.0=py36h0e22d5e_1
  - sqlite=3.23.1=hf1716c9_0
  - tensorflow=1.1.0=np112py36_0
  - tk=8.6.7=h35a86e2_3
  - werkzeug=0.14.1=py36_0
  - wheel=0.31.0=py36_0
  - xz=5.2.3=h0278029_2
  - yaml=0.1.7=hc338f04_2
  - zlib=1.2.11=hf3cbc9b_2



================================================
FILE: src/agents.py
================================================
from lib import *

class Agent:

	def __init__(self, model, 
		batch_size=32, discount_factor=0.95):

		self.model = model
		self.batch_size = batch_size
		self.discount_factor = discount_factor
		self.memory = []


	def remember(self, state, action, reward, next_state, done, next_valid_actions):
		self.memory.append((state, action, reward, next_state, done, next_valid_actions))


	def replay(self):
		# one-step Q-learning on a random minibatch from memory:
		# target q = r + discount_factor * max_a' Q(s', a') over valid a'
		batch = random.sample(self.memory, min(len(self.memory), self.batch_size))
		for state, action, reward, next_state, done, next_valid_actions in batch:
			q = reward
			if not done:
				q += self.discount_factor * np.nanmax(self.get_q_valid(next_state, next_valid_actions))
			self.model.fit(state, action, q)


	def get_q_valid(self, state, valid_actions):
		q = self.model.predict(state)
		q_valid = [np.nan] * len(q)
		for action in valid_actions:
			q_valid[action] = q[action]
		return q_valid


	def act(self, state, exploration, valid_actions):
		# epsilon-greedy: with probability `exploration`, pick a random valid action
		if np.random.random() > exploration:
			q_valid = self.get_q_valid(state, valid_actions)
			if np.nanmin(q_valid) != np.nanmax(q_valid):
				return np.nanargmax(q_valid)
		return random.sample(valid_actions, 1)[0]


	def save(self, fld):
		makedirs(fld)

		attr = {
			'batch_size':self.batch_size, 
			'discount_factor':self.discount_factor, 
			#'memory':self.memory
			}

		pickle.dump(attr, open(os.path.join(fld, 'agent_attr.pickle'),'wb'))
		self.model.save(fld)

	def load(self, fld):
		path = os.path.join(fld, 'agent_attr.pickle')
		print(path)
		attr = pickle.load(open(path,'rb'))
		for k in attr:
			setattr(self, k, attr[k])
		self.model.load(fld)


def add_dim(x, shape):
	# prepend a batch dimension of size 1
	return np.reshape(x, (1,) + shape)



class QModelKeras:
	# ref: https://keon.io/deep-q-learning/
	
	def init(self):
		pass

	def build_model(self):
		pass

	def __init__(self, state_shape, n_action):
		self.state_shape = state_shape
		self.n_action = n_action
		self.attr2save = ['state_shape','n_action','model_name']
		self.init()


	def save(self, fld):
		makedirs(fld)
		with open(os.path.join(fld, 'model.json'), 'w') as json_file:
			json_file.write(self.model.to_json())
		self.model.save_weights(os.path.join(fld, 'weights.hdf5'))

		attr = dict()
		for a in self.attr2save:
			attr[a] = getattr(self, a)
		pickle.dump(attr, open(os.path.join(fld, 'Qmodel_attr.pickle'),'wb'))

	def load(self, fld, learning_rate):
		json_str = open(os.path.join(fld, 'model.json')).read()
		self.model = keras.models.model_from_json(json_str)
		self.model.load_weights(os.path.join(fld, 'weights.hdf5'))
		self.model.compile(loss='mse', optimizer=keras.optimizers.Adam(lr=learning_rate))

		attr = pickle.load(open(os.path.join(fld, 'Qmodel_attr.pickle'), 'rb'))
		for a in attr:
			setattr(self, a, attr[a])

	def predict(self, state):
		q = self.model.predict(
			add_dim(state, self.state_shape)
			)[0]
		
		if np.isnan(max(q)):
			print('state'+str(state))
			print('q'+str(q))
			raise ValueError

		return q

	def fit(self, state, action, q_action):
		# update only the taken action's Q-value toward the target q_action;
		# the other actions keep their current predictions
		q = self.predict(state)
		q[action] = q_action

		self.model.fit(
			add_dim(state, self.state_shape), 
			add_dim(q, (self.n_action,)), 
			epochs=1, verbose=0)



class QModelMLP(QModelKeras):
	# multi-layer perceptron (MLP), i.e., dense layers only

	def init(self):
		self.qmodel = 'MLP'	

	def build_model(self, n_hidden, learning_rate, activation='relu'):

		model = keras.models.Sequential()
		model.add(keras.layers.Reshape(
			(self.state_shape[0]*self.state_shape[1],), 
			input_shape=self.state_shape))

		for i in range(len(n_hidden)):
			model.add(keras.layers.Dense(n_hidden[i], activation=activation))
			#model.add(keras.layers.Dropout(drop_rate))
		
		model.add(keras.layers.Dense(self.n_action, activation='linear'))
		model.compile(loss='mse', optimizer=keras.optimizers.Adam(lr=learning_rate))
		self.model = model
		self.model_name = self.qmodel + str(n_hidden)
		


class QModelRNN(QModelKeras):
	"""
	https://keras.io/getting-started/sequential-model-guide/#example
	note param doesn't grow with len of sequence
	"""

	def _build_model(self, Layer, n_hidden, dense_units, learning_rate, activation='relu'):

		model = keras.models.Sequential()
		model.add(keras.layers.Reshape(self.state_shape, input_shape=self.state_shape))
		m = len(n_hidden)
		for i in range(m):
			model.add(Layer(n_hidden[i],
				return_sequences=(i<m-1)))
		for i in range(len(dense_units)):
			model.add(keras.layers.Dense(dense_units[i], activation=activation))
		model.add(keras.layers.Dense(self.n_action, activation='linear'))
		model.compile(loss='mse', optimizer=keras.optimizers.Adam(lr=learning_rate))
		self.model = model
		self.model_name = self.qmodel + str(n_hidden) + str(dense_units)
		


class QModelLSTM(QModelRNN):
	def init(self):
		self.qmodel = 'LSTM'
	def build_model(self, n_hidden, dense_units, learning_rate, activation='relu'):
		Layer = keras.layers.LSTM
		self._build_model(Layer, n_hidden, dense_units, learning_rate, activation)


class QModelGRU(QModelRNN):
	def init(self):
		self.qmodel = 'GRU'
	def build_model(self, n_hidden, dense_units, learning_rate, activation='relu'):
		Layer = keras.layers.GRU
		self._build_model(Layer, n_hidden, dense_units, learning_rate, activation)



class QModelConv(QModelKeras):
	"""
	ref: https://keras.io/layers/convolutional/
	"""
	def init(self):
		self.qmodel = 'Conv'

	def build_model(self, 
		filter_num, filter_size, dense_units, 
		learning_rate, activation='relu', dilation=None, use_pool=None):

		if use_pool is None:
			use_pool = [True]*len(filter_num)
		if dilation is None:
			dilation = [1]*len(filter_num)

		model = keras.models.Sequential()
		model.add(keras.layers.Reshape(self.state_shape, input_shape=self.state_shape))
		
		for i in range(len(filter_num)):
			model.add(keras.layers.Conv1D(filter_num[i], kernel_size=filter_size[i], dilation_rate=dilation[i], 
				activation=activation, use_bias=True))
			if use_pool[i]:
				model.add(keras.layers.MaxPooling1D(pool_size=2))
		
		model.add(keras.layers.Flatten())
		for i in range(len(dense_units)):
			model.add(keras.layers.Dense(dense_units[i], activation=activation))
		model.add(keras.layers.Dense(self.n_action, activation='linear'))
		model.compile(loss='mse', optimizer=keras.optimizers.Adam(lr=learning_rate))
		
		self.model = model

		self.model_name = self.qmodel + str([a for a in
			zip(filter_num, filter_size, dilation, use_pool)
			])+' + '+str(dense_units)

		

class QModelConvRNN(QModelKeras):
	"""
	https://keras.io/getting-started/sequential-model-guide/#example
	note param doesn't grow with len of sequence
	"""

	def _build_model(self, RNNLayer, conv_n_hidden, RNN_n_hidden, dense_units, learning_rate, 
		conv_kernel_size=3, use_pool=False, activation='relu'):

		model = keras.models.Sequential()
		model.add(keras.layers.Reshape(self.state_shape, input_shape=self.state_shape))

		for i in range(len(conv_n_hidden)):
			model.add(keras.layers.Conv1D(conv_n_hidden[i], kernel_size=conv_kernel_size, 
				activation=activation, use_bias=True))
			if use_pool:
				model.add(keras.layers.MaxPooling1D(pool_size=2))
		m = len(RNN_n_hidden)
		for i in range(m):
			model.add(RNNLayer(RNN_n_hidden[i],
				return_sequences=(i<m-1)))
		for i in range(len(dense_units)):
			model.add(keras.layers.Dense(dense_units[i], activation=activation))

		model.add(keras.layers.Dense(self.n_action, activation='linear'))
		model.compile(loss='mse', optimizer=keras.optimizers.Adam(lr=learning_rate))
		self.model = model
		self.model_name = self.qmodel + str(conv_n_hidden) + str(RNN_n_hidden) + str(dense_units)
		

class QModelConvLSTM(QModelConvRNN):
	def init(self):
		self.qmodel = 'ConvLSTM'
	def build_model(self, conv_n_hidden, RNN_n_hidden, dense_units, learning_rate, 
		conv_kernel_size=3, use_pool=False, activation='relu'):
		Layer = keras.layers.LSTM
		self._build_model(Layer, conv_n_hidden, RNN_n_hidden, dense_units, learning_rate, 
		conv_kernel_size, use_pool, activation)


class QModelConvGRU(QModelConvRNN):
	def init(self):
		self.qmodel = 'ConvGRU'
	def build_model(self, conv_n_hidden, RNN_n_hidden, dense_units, learning_rate, 
		conv_kernel_size=3, use_pool=False, activation='relu'):
		Layer = keras.layers.GRU
		self._build_model(Layer, conv_n_hidden, RNN_n_hidden, dense_units, learning_rate, 
		conv_kernel_size, use_pool, activation)







def load_model(fld, learning_rate):
	s = open(os.path.join(fld,'QModel.txt'),'r').read().strip()
	qmodels = {
		'Conv':QModelConv,
		'DenseOnly':QModelMLP,
		'MLP':QModelMLP,
		'LSTM':QModelLSTM,
		'GRU':QModelGRU,
		'ConvLSTM':QModelConvLSTM,	# saved by the QModelConvRNN subclasses
		'ConvGRU':QModelConvGRU,	# but missing from the original mapping
		}
	qmodel = qmodels[s](None, None)
	qmodel.load(fld, learning_rate)
	return qmodel
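

if __name__ == '__main__':
	# Hedged usage sketch (not part of the original file): build a small MLP
	# Q-model and wrap it in an Agent, mirroring the 'MLP' branch of
	# get_model() in main.py. Shapes and hyperparameters are illustrative.
	qm = QModelMLP(state_shape=(40, 1), n_action=3)
	qm.build_model(n_hidden=[16]*5, learning_rate=1e-4, activation='tanh')
	demo_agent = Agent(qm, batch_size=8, discount_factor=0.8)
	qm.model.summary()
	print(qm.model_name)	# 'MLP[16, 16, 16, 16, 16]'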




================================================
FILE: src/emulator.py
================================================
from lib import *

# by Xiang Gao, 2018



def find_ideal(p, just_once):
	if not just_once:
		# max profit if every upward move were captured: sum of positive diffs
		diff = np.array(p[1:]) - np.array(p[:-1])
		return sum(np.maximum(np.zeros(diff.shape), diff))
	else:
		# best single buy-then-sell trade
		best = 0.
		for i in range(len(p)-1):
			best = max(best, max(p[i+1:]) - p[i])
		return best


class Market:
	"""
	state 			prices over the past window_state steps, each variable
					normalized by its mean over the window;
					ndarray of shape (window_state, sampler.n_var), i.e., 2D
					which is self.state_shape

	action 			three actions
					0:	do nothing (wait if flat, close if holding)
					1:	open a position
					2: 	keep the position
	"""
	
	def reset(self, rand_price=True):
		# rand_price=False replays the previous episode's prices
		# (used by the "safe" evaluation pass in Simulator.train)
		self.empty = True
		if rand_price:
			prices, self.title = self.sampler.sample()
			price = np.reshape(prices[:,0], prices.shape[0])

			self.prices = prices.copy()
			self.price = price/price[0]*100
			self.t_max = len(self.price) - 1

		self.max_profit = find_ideal(self.price[self.t0:], False)
		self.t = self.t0
		return self.get_state(), self.get_valid_actions()


	def get_state(self, t=None):
		if t is None:
			t = self.t
		state = self.prices[t - self.window_state + 1: t + 1, :].copy()
		for i in range(self.sampler.n_var):
			norm = np.mean(state[:,i])
			state[:,i] = (state[:,i]/norm - 1.)*100	
		return state

	def get_valid_actions(self):
		if self.empty:
			return [0, 1]	# wait, open
		else:
			return [0, 2]	# close, keep


	def get_noncash_reward(self, t=None, empty=None):
		if t is None:
			t = self.t
		if empty is None:
			empty = self.empty
		# P&L of holding from t to t+1: pay open_cost when opening,
		# and amplify losses by a factor (1 + risk_averse)
		reward = self.direction * (self.price[t+1] - self.price[t])
		if empty:
			reward -= self.open_cost
		if reward < 0:
			reward *= (1. + self.risk_averse)
		return reward


	def step(self, action):

		done = False
		if action == 0:		# wait/close
			reward = 0.
			self.empty = True
		elif action == 1:	# open
			reward = self.get_noncash_reward()
			self.empty = False
		elif action == 2:	# keep
			reward = self.get_noncash_reward()
		else:
			raise ValueError('no such action: '+str(action))

		self.t += 1
		return self.get_state(), reward, self.t == self.t_max, self.get_valid_actions()


	def __init__(self, 
		sampler, window_state, open_cost,
		direction=1., risk_averse=0.):

		self.sampler = sampler
		self.window_state = window_state
		self.open_cost = open_cost
		self.direction = direction
		self.risk_averse = risk_averse

		self.n_action = 3
		self.state_shape = (window_state, self.sampler.n_var)
		self.action_labels = ['empty','open','keep']
		self.t0 = window_state - 1
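

# Hedged demo (not part of the original file): drive Market through one
# episode with a made-up stub sampler, always holding a position.
# _StubSampler and _demo_market are illustrative helpers, not repo API.
class _StubSampler:
	n_var = 1
	def sample(self):
		t = np.arange(60, dtype=float)
		prices = (100. + 10. * np.sin(t / 5.)).reshape(-1, 1)
		return prices, 'stub sine'


def _demo_market():
	env = Market(_StubSampler(), window_state=10, open_cost=3.3)
	state, valid_actions = env.reset()
	done, total = False, 0.
	while not done:
		action = valid_actions[-1]	# open if flat, otherwise keep
		state, reward, done, valid_actions = env.step(action)
		total += reward
	print('total reward: %.2f (ideal: %.2f)' % (total, env.max_profit))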


if __name__ == '__main__':
	_demo_market()	# the original called test_env(), which was never defined


================================================
FILE: src/lib.py
================================================
import random, os, datetime, pickle, json, keras, sys
import pandas as pd
#import matplotlib.pyplot as plt	# needed by visualizer.py; not pinned in env.yml
import numpy as np

OUTPUT_FLD = os.path.join('..','results')
# machine-specific path, only used by sampler.read_data()
PRICE_FLD = '/Users/xianggao/Dropbox/distributed/code_db/price coinbase/vm-w7r-db'

def makedirs(fld):
	if not os.path.exists(fld):
		os.makedirs(fld)


================================================
FILE: src/main.py
================================================
#!/usr/bin/env python3

from lib import *
from sampler import *
from agents import *
from emulator import *
from simulators import *
from visualizer import *


def get_model(model_type, env, learning_rate, fld_load):

	print_t = False

	if model_type == 'MLP':
		m = 16
		layers = 5
		hidden_size = [m]*layers
		model = QModelMLP(env.state_shape, env.n_action)
		model.build_model(hidden_size, learning_rate=learning_rate, activation='tanh')
	
	elif model_type == 'conv':

		m = 16
		layers = 2
		filter_num = [m]*layers
		filter_size = [3] * len(filter_num)
		#use_pool = [False, True, False, True]
		#use_pool = [False, False, True, False, False, True]
		use_pool = None
		#dilation = [1,2,4,8]
		dilation = None
		dense_units = [48,24]
		model = QModelConv(env.state_shape, env.n_action)
		model.build_model(filter_num, filter_size, dense_units, learning_rate, 
			dilation=dilation, use_pool=use_pool)

	elif model_type == 'RNN':

		m = 32
		layers = 3
		hidden_size = [m]*layers
		dense_units = [m,m]
		model = QModelGRU(env.state_shape, env.n_action)
		model.build_model(hidden_size, dense_units, learning_rate=learning_rate)
		print_t = True

	elif model_type == 'ConvRNN':
	
		m = 8
		conv_n_hidden = [m,m]
		RNN_n_hidden = [m,m]
		dense_units = [m,m]
		model = QModelConvGRU(env.state_shape, env.n_action)
		model.build_model(conv_n_hidden, RNN_n_hidden, dense_units, learning_rate=learning_rate)
		print_t = True

	elif model_type == 'pretrained':
		# the original assigned to the undefined `agent.model`;
		# assign to `model` so the function can return it
		model = load_model(fld_load, learning_rate)

	else:
		raise ValueError
		
	return model, print_t


def main():

	"""
	it is recommended to generate database usng sampler.py before run main
	"""

	model_type = 'conv'; exploration_init = 1.; fld_load = None
	n_episode_training = 1000
	n_episode_testing = 100
	open_cost = 3.3
	#db_type = 'SinSamplerDB'; db = 'concat_half_base_'; Sampler = SinSampler
	db_type = 'PairSamplerDB'; db = 'randjump_100,1(10, 30)[]_'; Sampler = PairSampler
	batch_size = 8
	learning_rate = 1e-4
	discount_factor = 0.8
	exploration_decay = 0.99
	exploration_min = 0.01
	window_state = 40

	fld = os.path.join('..','data',db_type,db+'A')
	sampler = Sampler('load', fld=fld)
	env = Market(sampler, window_state, open_cost)
	model, print_t = get_model(model_type, env, learning_rate, fld_load)
	model.model.summary()
	#return

	agent = Agent(model, discount_factor=discount_factor, batch_size=batch_size)
	visualizer = Visualizer(env.action_labels)

	fld_save = os.path.join(OUTPUT_FLD, sampler.title, model.model_name, 
		str((env.window_state, sampler.window_episode, agent.batch_size, learning_rate,
			agent.discount_factor, exploration_decay, env.open_cost)))
	
	print('='*20)
	print(fld_save)
	print('='*20)

	simulator = Simulator(agent, env, visualizer=visualizer, fld_save=fld_save)
	simulator.train(n_episode_training, save_per_episode=1, exploration_decay=exploration_decay, 
		exploration_min=exploration_min, print_t=print_t, exploration_init=exploration_init)
	#agent.model = load_model(os.path.join(fld_save,'model'), learning_rate)

	#print('='*20+'\nin-sample testing\n'+'='*20)
	simulator.test(n_episode_testing, save_per_episode=1, subfld='in-sample testing')

	"""
	fld = os.path.join('data',db_type,db+'B')
	sampler = SinSampler('load',fld=fld)
	simulator.env.sampler = sampler
	simulator.test(n_episode_testing, save_per_episode=1, subfld='out-of-sample testing')
	"""
	

if __name__ == '__main__':
	main()


================================================
FILE: src/sampler.py
================================================
from lib import *

def read_data(date, instrument, time_step):
	# reads raw prices from PRICE_FLD (machine-specific path, see lib.py);
	# resample(..., how='last') is the pandas 0.22 API pinned in env.yml
	path = os.path.join(PRICE_FLD, date, instrument+'.csv')
	if not os.path.exists(path):
		print('no such file: '+path)
		return None

	df_raw = pd.read_csv(path, parse_dates=['time'], index_col='time')
	df = df_raw.resample(time_step, how='last').fillna(method='ffill')
	return df['spot'].values



class Sampler:

	def load_db(self, fld):

		self.db = pickle.load(open(os.path.join(fld, 'db.pickle'),'rb'))
		param = json.load(open(os.path.join(fld, 'param.json'),'rb'))
		self.i_db = 0
		self.n_db = param['n_episodes']
		self.sample = self.__sample_db
		for attr in param:
			if hasattr(self, attr):
				setattr(self, attr, param[attr])
		self.title = 'DB_'+param['title']


	def build_db(self, n_episodes, fld):
		db = []
		for i in range(n_episodes):
			prices, title = self.sample()
			db.append((prices, '[%i]_'%i+title))
		os.makedirs(fld)	# don't overwrite existing fld
		pickle.dump(db, open(os.path.join(fld, 'db.pickle'),'wb'))
		param = {'n_episodes':n_episodes}
		for k in self.attrs:
			param[k] = getattr(self, k)
		json.dump(param, open(os.path.join(fld, 'param.json'),'w'))


	def __sample_db(self):
		prices, title = self.db[self.i_db]
		self.i_db += 1
		if self.i_db == self.n_db:
			self.i_db = 0
		return prices, title



class PairSampler(Sampler):

	def __init__(self, game,
		window_episode=None, forecast_horizon_range=None, max_change_perc=10., noise_level=10., n_section=1,
		fld=None, windows_transform=[]):

		self.window_episode = window_episode
		self.forecast_horizon_range = forecast_horizon_range
		self.max_change_perc = max_change_perc
		self.noise_level = noise_level
		self.n_section = n_section
		self.windows_transform = windows_transform
		self.n_var = 2 + len(self.windows_transform) # price, signal

		self.attrs = ['title', 'window_episode', 'forecast_horizon_range', 
			'max_change_perc', 'noise_level', 'n_section', 'n_var']
		param_str = str((self.noise_level, self.forecast_horizon_range, self.n_section, self.windows_transform))

		if game == 'load':
			self.load_db(fld)
		elif game in ['randwalk','randjump']:
			self.__rand = getattr(self, '_PairSampler__'+game)
			self.sample = self.__sample
			self.title = game + param_str
		else:
			raise ValueError


	def __randwalk(self, l):
		# dense random-walk increments; the "signal" is the same increments
		# shifted forecast_horizon steps ahead, i.e. it leaks future price moves
		change = (np.random.random(l + self.forecast_horizon_range[1]) - 0.5) * 2 * self.max_change_perc/100
		forecast_horizon = random.randrange(self.forecast_horizon_range[0], self.forecast_horizon_range[1])
		return change[:l], change[forecast_horizon: forecast_horizon + l], forecast_horizon


	def __randjump(self, l):
		# like __randwalk, but with 15-30 sparse jumps instead of dense increments
		change = [0.] * (l + self.forecast_horizon_range[1])
		n_jump = random.randrange(15,30)
		for i in range(n_jump):
			t = random.randrange(len(change))
			change[t] = (np.random.random() - 0.5) * 2 * self.max_change_perc/100
		forecast_horizon = random.randrange(self.forecast_horizon_range[0], self.forecast_horizon_range[1])
		return change[:l], change[forecast_horizon: forecast_horizon + l], forecast_horizon



	def __sample(self):

		L = self.window_episode
		if bool(self.windows_transform):
			L += max(self.windows_transform)
		l0 = L // self.n_section	# integer division: l is used as a length below
		l1 = L

		d_price = []
		d_signal = []
		forecast_horizon = []

		for i in range(self.n_section):
			if i == self.n_section - 1:
				l = l1
			else:
				l = l0
				l1 -= l0
			d_price_i, d_signal_i, horizon_i = self.__rand(l)
			d_price = np.append(d_price, d_price_i)
			d_signal = np.append(d_signal, d_signal_i)
			forecast_horizon.append(horizon_i)

		price = 100. * (1. + np.cumsum(d_price))
		signal = 100. * (1. + np.cumsum(d_signal)) + \
				np.random.random(len(price)) * self.noise_level

		price += (100 - min(price))
		signal += (100 - min(signal))

		inputs = [price[-self.window_episode:], signal[-self.window_episode:]]
		for w in self.windows_transform:
			inputs.append(signal[-self.window_episode - w: -w])

		return np.array(inputs).T, 'forecast_horizon='+str(forecast_horizon)




class SinSampler(Sampler):

	def __init__(self, game, 
		window_episode=None, noise_amplitude_ratio=None, period_range=None, amplitude_range=None,
		fld=None):

		self.n_var = 1	# price only

		self.window_episode = window_episode
		self.noise_amplitude_ratio = noise_amplitude_ratio
		self.period_range = period_range
		self.amplitude_range = amplitude_range
		self.can_half_period = False

		self.attrs = ['title','window_episode', 'noise_amplitude_ratio', 'period_range', 'amplitude_range', 'can_half_period']

		param_str = str((
			self.noise_amplitude_ratio, self.period_range, self.amplitude_range
			))
		if game == 'single':
			self.sample = self.__sample_single_sin
			self.title = 'SingleSin'+param_str
		elif game == 'concat':
			self.sample = self.__sample_concat_sin
			self.title = 'ConcatSin'+param_str
		elif game == 'concat_half':
			self.can_half_period = True
			self.sample = self.__sample_concat_sin
			self.title = 'ConcatHalfSin'+param_str
		elif game == 'concat_half_base':
			self.can_half_period = True
			self.sample = self.__sample_concat_sin_w_base
			self.title = 'ConcatHalfSin+Base'+param_str
			self.base_period_range = (int(2*self.period_range[1]), 4*self.period_range[1])
			self.base_amplitude_range = (20,80)
		elif game == 'load':
			self.load_db(fld)
		else:
			raise ValueError


	def __rand_sin(self, 
		period_range=None, amplitude_range=None, noise_amplitude_ratio=None, full_episode=False):

		if period_range is None:
			period_range = self.period_range
		if amplitude_range is None:
			amplitude_range = self.amplitude_range
		if noise_amplitude_ratio is None:
			noise_amplitude_ratio = self.noise_amplitude_ratio

		period = random.randrange(period_range[0], period_range[1])
		amplitude = random.randrange(amplitude_range[0], amplitude_range[1])
		noise = noise_amplitude_ratio * amplitude

		if full_episode:
			length = self.window_episode
		else:
			if self.can_half_period:
				length = int(random.randrange(1,4) * 0.5 * period)
			else:
				length = period

		p = 100. + amplitude * np.sin(np.array(range(length)) * 2 * 3.1416 / period)
		p += np.random.random(p.shape) * noise

		return p, '100+%isin((2pi/%i)t)+%ie'%(amplitude, period, noise)




	def __sample_concat_sin(self):
		prices = []
		p = []
		while True:
			p = np.append(p, self.__rand_sin(full_episode=False)[0])
			if len(p) > self.window_episode:
				break
		prices.append(p[:self.window_episode])
		return np.array(prices).T, 'concat sin'

	def __sample_concat_sin_w_base(self):
		prices = []
		p = []
		while True:
			p = np.append(p, self.__rand_sin(full_episode=False)[0])
			if len(p) > self.window_episode:
				break
		base, base_title = self.__rand_sin(
			period_range=self.base_period_range, 
			amplitude_range=self.base_amplitude_range, 
			noise_amplitude_ratio=0., 
			full_episode=True)
		prices.append(p[:self.window_episode] + base)
		return np.array(prices).T, 'concat sin + base: '+base_title
			
	def __sample_single_sin(self):
		prices = []
		funcs = []
		p, func = self.__rand_sin(full_episode=True)
		prices.append(p)
		funcs.append(func)
		return np.array(prices).T, str(funcs)





def test_SinSampler():

	window_episode = 180
	window_state = 40
	noise_amplitude_ratio = 0.5
	period_range = (10,40)
	amplitude_range = (5,80)
	game = 'concat_half_base'
	instruments = ['fake']

	sampler = SinSampler(game, 
		window_episode, noise_amplitude_ratio, period_range, amplitude_range)
	n_episodes = 100
	"""
	for i in range(100):
		plt.plot(sampler.sample(instruments)[0])
		plt.show()
		"""
	fld = os.path.join('data','SinSamplerDB',game+'_B')
	sampler.build_db(n_episodes, fld)



def test_PairSampler():
	fhr = (10,30)
	n_section = 1
	max_change_perc = 30.
	noise_level = 5
	game = 'randjump'
	windows_transform = []

	sampler = PairSampler(game, window_episode=180, forecast_horizon_range=fhr, 
		n_section=n_section, noise_level=noise_level, max_change_perc=max_change_perc, windows_transform=windows_transform)
	
	#plt.plot(sampler.sample()[0]);plt.show()
	#"""
	n_episodes = 100
	fld = os.path.join('data','PairSamplerDB',
		game+'_%i,%i'%(n_episodes, n_section)+str(fhr)+str(windows_transform)+'_B')
	sampler.build_db(n_episodes, fld)
	#"""




if __name__ == '__main__':
	test_SinSampler()
	test_PairSampler()


================================================
FILE: src/simulators.py
================================================
from lib import *



class Simulator:

	def play_one_episode(self, exploration, training=True, rand_price=True, print_t=False):

		state, valid_actions = self.env.reset(rand_price=rand_price)
		done = False
		env_t = 0
		try:
			env_t = self.env.t
		except AttributeError:
			pass

		cum_rewards = [np.nan] * env_t
		actions = [np.nan] * env_t
		states = [None] * env_t
		prev_cum_rewards = 0.

		while not done:
			if print_t:
				print(self.env.t)

			action = self.agent.act(state, exploration, valid_actions)
			next_state, reward, done, valid_actions = self.env.step(action)

			cum_rewards.append(prev_cum_rewards+reward)
			prev_cum_rewards = cum_rewards[-1]
			actions.append(action)
			states.append(next_state)

			if training:
				self.agent.remember(state, action, reward, next_state, done, valid_actions)
				self.agent.replay()

			state = next_state

		return cum_rewards, actions, states


	def train(self, n_episode, 
		save_per_episode=10, exploration_decay=0.995, exploration_min=0.01, print_t=False, exploration_init=1.):

		fld_model = os.path.join(self.fld_save,'model')
		makedirs(fld_model)	# don't overwrite if already exists
		with open(os.path.join(fld_model,'QModel.txt'),'w') as f:
			f.write(self.agent.model.qmodel)

		exploration = exploration_init
		fld_save = os.path.join(self.fld_save,'training')

		makedirs(fld_save)
		MA_window = 100		# window for the rolling median of performance
		safe_total_rewards = []
		explored_total_rewards = []
		explorations = []
		path_record = os.path.join(fld_save,'record.csv')

		with open(path_record,'w') as f:
			f.write('episode,game,exploration,explored,safe,MA_explored,MA_safe\n')

		for n in range(n_episode):

			print('\ntraining...')
			exploration = max(exploration_min, exploration * exploration_decay)
			explorations.append(exploration)
			# one exploratory episode (trains the agent on a fresh price path)...
			explored_cum_rewards, explored_actions, _ = self.play_one_episode(exploration, print_t=print_t)
			explored_total_rewards.append(100.*explored_cum_rewards[-1]/self.env.max_profit)
			# ...then replay the same prices greedily to measure the learned policy;
			# both totals are reported as % of the ideal profit (env.max_profit)
			safe_cum_rewards, safe_actions, _ = self.play_one_episode(0, training=False, rand_price=False, print_t=False)
			safe_total_rewards.append(100.*safe_cum_rewards[-1]/self.env.max_profit)

			MA_total_rewards = np.median(explored_total_rewards[-MA_window:])
			MA_safe_total_rewards = np.median(safe_total_rewards[-MA_window:])

			ss = [
				str(n), self.env.title.replace(',',';'), '%.1f'%(exploration*100.), 
				'%.1f'%(explored_total_rewards[-1]), '%.1f'%(safe_total_rewards[-1]),
				'%.1f'%MA_total_rewards, '%.1f'%MA_safe_total_rewards,
				]
			
			with open(path_record,'a') as f:
				f.write(','.join(ss)+'\n')
				print('\t'.join(ss))

			
			if n%save_per_episode == 0:
				print('saving results...')
				self.agent.save(fld_model)

				"""
				self.visualizer.plot_a_episode(
					self.env, self.agent.model, 
					explored_cum_rewards, explored_actions,
					safe_cum_rewards, safe_actions,
					os.path.join(fld_save, 'episode_%i.png'%(n)))

				self.visualizer.plot_episodes(
					explored_total_rewards, safe_total_rewards, explorations, 
					os.path.join(fld_save, 'total_rewards.png'),
					MA_window)
					"""




	def test(self, n_episode, save_per_episode=10, subfld='testing'):

		fld_save = os.path.join(self.fld_save, subfld)
		makedirs(fld_save)
		MA_window = 100		# window for the rolling median of performance
		safe_total_rewards = []
		path_record = os.path.join(fld_save,'record.csv')

		with open(path_record,'w') as f:
			f.write('episode,game,pnl,rel,MA\n')

		for n in range(n_episode):
			print('\ntesting...')
			
			safe_cum_rewards, safe_actions, _ = self.play_one_episode(0, training=False, rand_price=True)
			safe_total_rewards.append(100.*safe_cum_rewards[-1]/self.env.max_profit)
			MA_safe_total_rewards = np.median(safe_total_rewards[-MA_window:])
			ss = [str(n), self.env.title.replace(',',';'), 
				'%.1f'%(safe_cum_rewards[-1]),
				'%.1f'%(safe_total_rewards[-1]), 
				'%.1f'%MA_safe_total_rewards]
			
			with open(path_record,'a') as f:
				f.write(','.join(ss)+'\n')
				print('\t'.join(ss))

			
			if n%save_per_episode == 0:
				print('saving results...')

				"""
				self.visualizer.plot_a_episode(
					self.env, self.agent.model, 
					[np.nan]*len(safe_cum_rewards), [np.nan]*len(safe_actions),
					safe_cum_rewards, safe_actions,
					os.path.join(fld_save, 'episode_%i.png'%(n)))

				self.visualizer.plot_episodes(
					None, safe_total_rewards, None, 
					os.path.join(fld_save, 'total_rewards.png'),
					MA_window)
					"""
					



	def __init__(self, agent, env, 
		visualizer, fld_save):

		self.agent = agent
		self.env = env
		self.visualizer = visualizer
		self.fld_save = fld_save





if __name__ == '__main__':
	a = [1, 2, 3]
	print(np.mean(a[-100:]))	# scratch check of the moving-window slicing

================================================
FILE: src/visualizer.py
================================================
from lib import *



def get_tick_labels(bins, ticks):

	ticklabels = []
	for i in ticks:
		if i < len(bins):
			ticklabels.append('%.2f'%(bins[int(i)]))
		else:
			ticklabels.append('%.2f'%(bins[-1])+'+')

	return ticklabels



class Visualizer:

	def __init__(self, action_labels):
		self.n_action = len(action_labels)
		self.action_labels = action_labels


	def plot_a_episode(self, 
		env, model,
		explored_cum_rewards, explored_actions, 
		safe_cum_rewards, safe_actions,
		fig_path):

		f, axs = plt.subplots(3,1,sharex=True, figsize=(14,14))
		ax_price, ax_action, ax_Q = axs  

		ls = ['-','--']
		for i in range(min(2,env.prices.shape[1])):
			p = env.prices[:,i]/env.prices[0,i]*100 - 100
			ax_price.plot(p, 'k'+ls[i], label='input%i - 100'%i)

		ax_price.plot(explored_cum_rewards, 'b', label='explored P&L')
		ax_price.plot(safe_cum_rewards, 'r', label='safe P&L')
		ax_price.legend(loc='best', frameon=False)
		ax_price.set_title(env.title+', ideal: %.1f, safe: %.1f, explored: %1.f'%(
			env.max_profit, safe_cum_rewards[-1], explored_cum_rewards[-1]))

		ax_action.plot(explored_actions, 'b', label='explored')
		ax_action.plot(safe_actions, 'r', label='safe', linewidth=2)
		ax_action.set_ylim(-0.4, self.n_action-0.6)
		ax_action.set_ylabel('action')
		ax_action.set_yticks(range(self.n_action))
		ax_action.legend(loc='best', frameon=False)
		
		style = ['k','r','b']
		qq = []
		for t in range(env.t0):		# xrange is Python 2; env.yml pins Python 3.6
			qq.append([np.nan] * self.n_action)
		for t in range(env.t0, env.t_max):
			qq.append(model.predict(env.get_state(t))) 
		for i in range(self.n_action):
			ax_Q.plot([float(qq[t][i]) for t in range(len(qq))], 
				style[i], label=self.action_labels[i])
		ax_Q.set_ylabel('Q')
		ax_Q.legend(loc='best', frameon=False)
		ax_Q.set_xlabel('t')

		plt.subplots_adjust(wspace=0.4)
		plt.savefig(fig_path)
		plt.close()



	def plot_episodes(self, 
		explored_total_rewards, safe_total_rewards, explorations, 
		fig_path, MA_window=100):

		f = plt.figure(figsize=(14,10))	# width, height in inch (100 pixel)
		if explored_total_rewards is None:
			f, ax_reward = plt.subplots()
		else:
			figshape = (3,1)
			ax_reward = plt.subplot2grid(figshape, (0, 0), rowspan=2)
			ax_exploration = plt.subplot2grid(figshape, (2, 0), sharex=ax_reward)

		tt = range(len(safe_total_rewards))

		if explored_total_rewards is not None:
			# top-level pd.rolling_* functions: pandas 0.22 API as pinned in
			# env.yml (removed in pandas 0.23+)
			ma = pd.rolling_median(np.array(explored_total_rewards), window=MA_window, min_periods=1)
			std = pd.rolling_std(np.array(explored_total_rewards), window=MA_window, min_periods=3)
			ax_reward.plot(tt, explored_total_rewards,'bv', fillstyle='none')
			ax_reward.plot(tt, ma, 'b', label='explored ma', linewidth=2)
			ax_reward.plot(tt, std, 'b--', label='explored std', linewidth=2)

		ma = pd.rolling_median(np.array(safe_total_rewards), window=MA_window, min_periods=1)
		std = pd.rolling_std(np.array(safe_total_rewards), window=MA_window, min_periods=3)
		ax_reward.plot(tt, safe_total_rewards,'ro', fillstyle='none')
		ax_reward.plot(tt, ma,'r', label='safe ma', linewidth=2)
		ax_reward.plot(tt, std,'r--', label='safe std', linewidth=2)

		ax_reward.axhline(y=0, color='k', linestyle=':')
		#ax_reward.axhline(y=60, color='k', linestyle=':')
		ax_reward.set_ylabel('total reward')
		ax_reward.legend(loc='best', frameon=False)
		ax_reward.yaxis.tick_right()
		ylim = ax_reward.get_ylim()
		ax_reward.set_ylim((max(-100,ylim[0]), min(100,ylim[1])))

		if explored_total_rewards is not None:
			ax_exploration.plot(tt, np.array(explorations)*100., 'k')
			ax_exploration.set_ylabel('exploration')
			ax_exploration.set_xlabel('episode')

		plt.savefig(fig_path)
		plt.close()
		



def test_visualizer():

	f = plt.figure()#figsize=(5,8))
	axs_action = []
	ncol = 3
	nrow = 2

	clim = (0,1)

	ax = plt.subplot2grid((nrow, ncol), (0,ncol-1))
	ax.matshow(np.random.random((2,2)), cmap='RdYlBu_r', clim=clim)

	for action in range(3):
		row = 1 + action // ncol	# integer division: subplot2grid needs ints
		col = action % ncol
		ax = plt.subplot2grid((nrow, ncol), (row,col))
		cax = ax.matshow(np.random.random((2,2)), cmap='RdYlBu_r', clim=clim)
	

	ax = plt.subplot2grid((nrow, ncol), (0,0), colspan=ncol-1)
	cbar = f.colorbar(cax, ax=ax)

	plt.show()




class VisualizerSequential:

	def config(self):
		pass

	def __init__(self, model):
		self.model = model
		self.layers = []
		for layer in self.model.layers:
			self.layers.append(str(layer.name))

		self.inter_models = dict()
		model_input = self.model.input
		for layer in self.layers:
			self.inter_models[layer] = keras.models.Model(
								inputs=model_input,
                                outputs=self.model.get_layer(layer).output)
		self.config()



class VisualizerConv1D(VisualizerSequential):

	def config(self):

		self.n_channel = int(self.model.input.shape[2])
		n_col = self.n_channel
		for layer in self.layers:
			shape = self.inter_models[layer].output.shape
			if len(shape) == 3:
				n_col = max(n_col, shape[2])

		self.figshape = (len(self.layers)+1, int(n_col))


	def plot(self, x):

		f = plt.figure(figsize=(30,30))	
		
		for i in range(self.n_channel):
			ax = plt.subplot2grid(self.figshape, (0,i))
			ax.plot(x[0,:,i], '.-')
			ax.set_title('input, channel %i'%i)

		for i_layer in range(len(self.layers)):
			layer = self.layers[i_layer]
			z = self.inter_models[layer].predict(x)
			print('plotting '+layer)
			if len(z.shape) == 3:
				for i in range(z.shape[2]):
					ax = plt.subplot2grid(self.figshape, (i_layer+1, i))
					ax.plot(z[0,:,i], '.-')
					ax.set_title(layer+' filter %i'%i)
			else:
				ax = plt.subplot2grid(self.figshape, (i_layer+1, 0))
				ax.plot(z[0,:], '.-')
				ax.set_title(layer)


		ax.set_ylim(-100,100)


	def print_w(self):
		layer = self.layers[0]
		ww = self.inter_models[layer].get_weights()
		for w in ww:
			print(w.shape)
			print(w)
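

if __name__ == '__main__':
	# Hedged usage sketch (not part of the original file): inspect a tiny
	# Conv1D net layer by layer. The toy model is an illustrative assumption,
	# not one of the repo's trained models.
	toy = keras.models.Sequential([
		keras.layers.Conv1D(4, kernel_size=3, activation='relu', input_shape=(40, 1)),
		keras.layers.Flatten(),
		keras.layers.Dense(3),
		])
	viz = VisualizerConv1D(toy)
	print(viz.figshape)	# (n_layers + 1, widest channel count)
	viz.print_w()		# weights of the first layer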

SYMBOL INDEX (87 symbols across 7 files)

FILE: src/agents.py
  class Agent (line 3) | class Agent:
    method __init__ (line 5) | def __init__(self, model,
    method remember (line 14) | def remember(self, state, action, reward, next_state, done, next_valid...
    method replay (line 18) | def replay(self):
    method get_q_valid (line 27) | def get_q_valid(self, state, valid_actions):
    method act (line 35) | def act(self, state, exploration, valid_actions):
    method save (line 43) | def save(self, fld):
    method load (line 55) | def load(self, fld):
  function add_dim (line 64) | def add_dim(x, shape):
  class QModelKeras (line 69) | class QModelKeras:
    method init (line 72) | def init(self):
    method build_model (line 75) | def build_model(self):
    method __init__ (line 78) | def __init__(self, state_shape, n_action):
    method save (line 85) | def save(self, fld):
    method load (line 96) | def load(self, fld, learning_rate):
    method predict (line 106) | def predict(self, state):
    method fit (line 118) | def fit(self, state, action, q_action):
  class QModelMLP (line 129) | class QModelMLP(QModelKeras):
    method init (line 132) | def init(self):
    method build_model (line 135) | def build_model(self, n_hidden, learning_rate, activation='relu'):
  class QModelRNN (line 153) | class QModelRNN(QModelKeras):
    method _build_model (line 159) | def _build_model(self, Layer, n_hidden, dense_units, learning_rate, ac...
  class QModelLSTM (line 176) | class QModelLSTM(QModelRNN):
    method init (line 177) | def init(self):
    method build_model (line 179) | def build_model(self, n_hidden, dense_units, learning_rate, activation...
  class QModelGRU (line 184) | class QModelGRU(QModelRNN):
    method init (line 185) | def init(self):
    method build_model (line 187) | def build_model(self, n_hidden, dense_units, learning_rate, activation...
  class QModelConv (line 193) | class QModelConv(QModelKeras):
    method init (line 197) | def init(self):
    method build_model (line 200) | def build_model(self,
  class QModelConvRNN (line 232) | class QModelConvRNN(QModelKeras):
    method _build_model (line 238) | def _build_model(self, RNNLayer, conv_n_hidden, RNN_n_hidden, dense_un...
  class QModelConvLSTM (line 262) | class QModelConvLSTM(QModelConvRNN):
    method init (line 263) | def init(self):
    method build_model (line 265) | def build_model(self, conv_n_hidden, RNN_n_hidden, dense_units, learni...
  class QModelConvGRU (line 272) | class QModelConvGRU(QModelConvRNN):
    method init (line 273) | def init(self):
    method build_model (line 275) | def build_model(self, conv_n_hidden, RNN_n_hidden, dense_units, learni...
  function load_model (line 287) | def load_model(fld, learning_rate):

FILE: src/emulator.py
  function find_ideal (line 7) | def find_ideal(p, just_once):
  class Market (line 20) | class Market:
    method reset (line 32) | def reset(self, rand_price=True):
    method get_state (line 47) | def get_state(self, t=None):
    method get_valid_actions (line 56) | def get_valid_actions(self):
    method get_noncash_reward (line 63) | def get_noncash_reward(self, t=None, empty=None):
    method step (line 76) | def step(self, action):
    method __init__ (line 94) | def __init__(self,

FILE: src/lib.py
  function makedirs (line 9) | def makedirs(fld):

FILE: src/main.py
  function get_model (line 11) | def get_model(model_type, env, learning_rate, fld_load):
  function main (line 68) | def main():

FILE: src/sampler.py
  function read_data (line 3) | def read_data(date, instrument, time_step):
  class Sampler (line 15) | class Sampler:
    method load_db (line 17) | def load_db(self, fld):
    method build_db (line 30) | def build_db(self, n_episodes, fld):
    method __sample_db (line 43) | def __sample_db(self):
  class PairSampler (line 52) | class PairSampler(Sampler):
    method __init__ (line 54) | def __init__(self, game,
    method __randwalk (line 80) | def __randwalk(self, l):
    method __randjump (line 86) | def __randjump(self, l):
    method __sample (line 97) | def __sample(self):
  class SinSampler (line 136) | class SinSampler(Sampler):
    method __init__ (line 138) | def __init__(self, game,
    method __rand_sin (line 177) | def __rand_sin(self,
    method __sample_concat_sin (line 207) | def __sample_concat_sin(self):
    method __sample_concat_sin_w_base (line 217) | def __sample_concat_sin_w_base(self):
    method __sample_single_sin (line 232) | def __sample_single_sin(self):
  function test_SinSampler (line 244) | def test_SinSampler():
  function test_PairSampler (line 267) | def test_PairSampler():

FILE: src/simulators.py
  class Simulator (line 5) | class Simulator:
    method play_one_episode (line 7) | def play_one_episode(self, exploration, training=True, rand_price=True...
    method train (line 44) | def train(self, n_episode,
    method test (line 109) | def test(self, n_episode, save_per_episode=10, subfld='testing'):
    method __init__ (line 155) | def __init__(self, agent, env,

FILE: src/visualizer.py
  function get_tick_labels (line 5) | def get_tick_labels(bins, ticks):
  class Visualizer (line 18) | class Visualizer:
    method __init__ (line 20) | def __init__(self, action_labels):
    method plot_a_episode (line 25) | def plot_a_episode(self,
    method plot_episodes (line 71) | def plot_episodes(self,
  function test_visualizer (line 117) | def test_visualizer():
  class VisualizerSequential (line 144) | class VisualizerSequential:
    method config (line 146) | def config(self):
    method __init__ (line 149) | def __init__(self, model):
  class VisualizerConv1D (line 165) | class VisualizerConv1D(VisualizerSequential):
    method config (line 167) | def config(self):
    method plot (line 179) | def plot(self, x):
    method print_w (line 206) | def print_w(self):
