Repository: wenkesj/holdem
Branch: master
Commit: b2089d54122c
Files: 8
Total size: 33.8 KB

Directory structure:
gitextract_e7zk95ig/
├── .gitignore
├── README.md
├── example.py
├── holdem/
│   ├── __init__.py
│   ├── env.py
│   ├── player.py
│   └── utils.py
└── setup.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
/*.egg-info
/dist
/build
/test
/examples
__pycache__
*.DS_Store

================================================
FILE: README.md
================================================
# holdem

:warning: **This is an experimental API, it will most definitely contain bugs, but that's why you are here!**

```sh
pip install holdem
```

Afaik, this is the first [OpenAI Gym](https://github.com/openai/gym) _No-Limit Texas Hold'em_* (NLTH) environment written in Python. It's an experiment to build a Gym environment that is synchronous and can support any number of players but also appeal to the general public that wants to learn how to "solve" NLTH.

*Python 3 supports arbitrary length integers :money_with_wings:

Right now, this is a work in progress, but I believe the API is mature enough for some preliminary experiments. Join me in making some interesting progress on multi-agent Gym environments.

# Usage

There is limited documentation at the moment. I'll try to make this less painful to understand.

## `env = holdem.TexasHoldemEnv(n_seats, max_limit=100000, debug=False)`

Creates a gym environment representing an NLTH table from the parameters:

+ `n_seats` - number of seats available at the table. No players are initially allocated to the table. You must call `env.add_player(seat_id, ...)` to populate the table.
+ `max_limit` - used to define the `gym.spaces` API for the class. It does not actually determine any NLTH limits; it exists only to support `gym.spaces.Discrete`.
+ `debug` - add debug statements to play, will probably be removed in the future.

### `env.add_player(seat_id, stack=2000)`

Adds a player to the table according to the specified seat (`seat_id`) and the initial amount of chips allocated to the player's `stack`. If the table does not have enough seats according to the `n_seats` used by the constructor, a `gym.error.Error` will be raised.

### `(player_states, community_states) = env.reset()`

Calling `env.reset` resets the NLTH table to a new hand state. It does not reset any of the players' stacks, or reset any of the blinds. New behavior is reserved for a special, future portion of the API that is yet another feature that is not standard in Gym environments and is a work in progress.

The observation returned is a `tuple` of the following by index:

0. `player_states` - a `tuple` where each entry is `tuple(player_info, player_hand)`. All states and hands can be gathered with `(player_infos, player_hands) = zip(*player_states)`.
    + `player_infos` - is a `list` of `int` features describing the individual player. It contains the following by index:
      0. `[0, 1]` - `0` - seat is empty, `1` - seat is not empty.
      1. `[0, n_seats - 1]` - player's id, where they are sitting.
      2. `[0, inf]` - player's current stack.
      3. `[0, 1]` - player is playing the current hand.
      4. `[0, inf]` - the player's current handrank according to `treys.Evaluator.evaluate(hand, community)`.
      5. `[0, 1]` - `0` - player has not played this round, `1` - player has played this round.
      6. `[0, 1]` - `0` - player is currently not betting, `1` - player is betting.
      7. `[0, 1]` - `0` - player is currently not all-in, `1` - player is all-in.
      8. `[0, inf]` - player's last sidepot.
    + `player_hands` - is a `list` of `int` features describing the cards in the player's pocket. The values are encoded based on the `treys.Card` integer representation.
1. `community_states` - a `tuple(community_infos, community_cards)` where:
    + `community_infos` - a `list` by index:
      0. `[0, n_seats - 1]` - location of the dealer button, where big blind is posted.
      1. `[0, inf]` - the current small blind amount.
      2. `[0, inf]` - the current big blind amount.
      3. `[0, inf]` - the current total amount in the community pot.
      4. `[0, inf]` - the last posted raise amount.
      5. `[0, inf]` - minimum required raise amount, if above 0.
      6. `[0, inf]` - the amount required to call.
      7. `[0, n_seats - 1]` - the current player required to take an action.
    + `community_cards` - is a `list` of `int` features describing the cards in the community. The values are encoded based on the `treys.Card` integer representation. There are 5 `int` in the list, where `-1` represents that there is no card present.

# Example

```python
import gym
import holdem

def play_out_hand(env, n_seats):
    # reset environment, gather relevant observations
    (player_states, (community_infos, community_cards)) = env.reset()
    (player_infos, player_hands) = zip(*player_states)

    # display the table, cards and all
    env.render(mode='human')

    terminal = False
    while not terminal:
        # play safe actions: check when no one else has raised, call when raised.
        actions = holdem.safe_actions(community_infos, n_seats=n_seats)
        (player_states, (community_infos, community_cards)), rews, terminal, info = env.step(actions)
        env.render(mode='human')

env = gym.make('TexasHoldem-v1') # holdem.TexasHoldemEnv(4)

# start with 2 players
env.add_player(0, stack=2000) # add a player to seat 0 with 2000 "chips"
env.add_player(1, stack=2000) # add another player to seat 1 with 2000 "chips"
# play out a hand
play_out_hand(env, env.n_seats)

# add one more player
env.add_player(2, stack=2000) # add another player to seat 2 with 2000 "chips"
# play out another hand
play_out_hand(env, env.n_seats)
```

================================================
FILE: example.py
================================================
# -*- coding: utf-8 -*-
#
# Copyright (c) 2018 Sam Wenke (samwenke@gmail.com)
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
import gym
import holdem

def play_out_hand(env, n_seats):
    # reset environment, gather relevant observations
    (player_states, (community_infos, community_cards)) = env.reset()
    (player_infos, player_hands) = zip(*player_states)

    # display the table, cards and all
    env.render(mode='human')

    terminal = False
    while not terminal:
        # play safe actions: check when no one else has raised, call when raised.
        actions = holdem.safe_actions(community_infos, n_seats=n_seats)
        (player_states, (community_infos, community_cards)), rews, terminal, info = env.step(actions)
        env.render(mode='human')

env = gym.make('TexasHoldem-v1') # holdem.TexasHoldemEnv(4)

# start with 2 players
env.add_player(0, stack=2000) # add a player to seat 0 with 2000 "chips"
env.add_player(1, stack=2000) # add another player to seat 1 with 2000 "chips"
# play out a hand
play_out_hand(env, env.n_seats)

# add one more player
env.add_player(2, stack=2000) # add another player to seat 2 with 2000 "chips"
# play out another hand
play_out_hand(env, env.n_seats)

================================================
FILE: holdem/__init__.py
================================================
# -*- coding: utf-8 -*-
from gym.envs.registration import register

from .env import TexasHoldemEnv
from .utils import card_to_str, hand_to_str, safe_actions, action_table

register(
    id='TexasHoldem-v0',
    entry_point='holdem.env:TexasHoldemEnv',
    kwargs={'n_seats': 2, 'debug': False},
)

register(
    id='TexasHoldem-v1',
    entry_point='holdem.env:TexasHoldemEnv',
    kwargs={'n_seats': 4, 'debug': False},
)

register(
    id='TexasHoldem-v2',
    entry_point='holdem.env:TexasHoldemEnv',
    kwargs={'n_seats': 8, 'debug': False},
)

================================================
FILE: holdem/env.py
================================================
# -*- coding: utf-8 -*-
#
# Copyright (c) 2016 Aleksander Beloi (beloi.alex@gmail.com)
# Copyright (c) 2018 Sam Wenke (samwenke@gmail.com)
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
from gym import Env, error, spaces, utils
from gym.utils import seeding

from treys import Card, Deck, Evaluator

from .player import Player
from .utils import hand_to_str, format_action


class TexasHoldemEnv(Env, utils.EzPickle):
    # [small blind, big blind] pairs; the big blind is twice the small blind
    # in most levels (the 500 level previously read 10000, an apparent typo).
    BLIND_INCREMENTS = [[10,25], [25,50], [50,100], [75,150], [100,200],
                        [150,300], [200,400], [300,600], [400,800], [500,1000],
                        [600,1200], [800,1600], [1000,2000]]

    def __init__(self, n_seats, max_limit=100000, debug=False):
        n_suits = 4                     # s,h,d,c
        n_ranks = 13                    # 2,3,4,5,6,7,8,9,T,J,Q,K,A
        n_community_cards = 5           # flop, turn, river
        n_pocket_cards = 2
        n_stud = 5

        self.n_seats = n_seats

        self._blind_index = 0
        [self._smallblind, self._bigblind] = TexasHoldemEnv.BLIND_INCREMENTS[0]
        self._deck = Deck()
        self._evaluator = Evaluator()

        self.community = []
        self._round = 0
        self._button = 0
        self._discard = []

        self._side_pots = [0] * n_seats
        self._current_sidepot = 0 # index of _side_pots
        self._totalpot = 0
        self._tocall = 0
        self._lastraise = 0
        self._number_of_hands = 0

        # fill seats with dummy players
        self._seats = [Player(i, stack=0, emptyplayer=True) for i in range(n_seats)]
        self.emptyseats = n_seats
        self._player_dict = {}
        self._current_player = None
        self._debug = debug
        self._last_player = None
        self._last_actions = None

        self.observation_space = spaces.Tuple([
            spaces.Tuple([                # players
                spaces.MultiDiscrete([
                    1,                    # emptyplayer
                    n_seats - 1,          # seat
                    max_limit,            # stack
                    1,                    # is_playing_hand
                    max_limit,            # handrank
                    1,                    # playedthisround
                    1,                    # is_betting
                    1,                    # isallin
                    max_limit,            # last side pot
                ]),
                spaces.Tuple([
                    spaces.MultiDiscrete([    # hand
                        n_suits,          # suit, can be negative one if it's not available.
                        n_ranks,          # rank, can be negative one if it's not available.
                    ])
                ] * n_pocket_cards)
            ] * n_seats),
            spaces.Tuple([
                spaces.Discrete(n_seats - 1), # big blind location
                spaces.Discrete(max_limit),   # small blind
                spaces.Discrete(max_limit),   # big blind
                spaces.Discrete(max_limit),   # pot amount
                spaces.Discrete(max_limit),   # last raise
                spaces.Discrete(max_limit),   # minimum amount to raise
                spaces.Discrete(max_limit),   # how much needed to call by current player.
                spaces.Discrete(n_seats - 1), # current player seat location.
                spaces.MultiDiscrete([    # community cards
                    n_suits - 1,          # suit
                    n_ranks - 1,          # rank
                    1,                    # is_flopped
                ]),
            ] * n_stud),
        ])

        self.action_space = spaces.Tuple([
            spaces.MultiDiscrete([
                3,                     # action_id
                max_limit,             # raise_amount
            ]),
        ] * n_seats)

    def seed(self, seed=None):
        _, seed = seeding.np_random(seed)
        return [seed]

    def add_player(self, seat_id, stack=2000):
        """Add a player to the environment seat with the given stack (chipcount)"""
        player_id = seat_id
        if player_id not in self._player_dict:
            new_player = Player(player_id, stack=stack, emptyplayer=False)
            if self._seats[player_id].emptyplayer:
                self._seats[player_id] = new_player
                new_player.set_seat(player_id)
            else:
                raise error.Error('Seat already taken.')
            self._player_dict[player_id] = new_player
            self.emptyseats -= 1

    def remove_player(self, seat_id):
        """Remove a player from the environment seat."""
        player_id = seat_id
        try:
            idx = self._seats.index(self._player_dict[player_id])
            self._seats[idx] = Player(0, stack=0, emptyplayer=True)
            del self._player_dict[player_id]
            self.emptyseats += 1
        except (KeyError, ValueError):
            # unknown seat: the dict lookup raises KeyError, the list lookup ValueError
            pass

    def reset(self):
        self._reset_game()
        self._ready_players()
        self._number_of_hands = 1
        [self._smallblind, self._bigblind] = TexasHoldemEnv.BLIND_INCREMENTS[0]
        if (self.emptyseats < len(self._seats) - 1):
            players = [p for p in self._seats if p.playing_hand]
            self._new_round()
            self._round = 0
            self._current_player = self._first_to_act(players)
            self._post_smallblind(self._current_player)
            self._current_player = self._next(players, self._current_player)
            self._post_bigblind(self._current_player)
            self._current_player = self._next(players, self._current_player)
            self._tocall = self._bigblind
            self._round = 0
            self._deal_next_round()
            self._folded_players = []
        return self._get_current_reset_returns()

    def step(self, actions):
        """
        CHECK = 0
        CALL = 1
        RAISE = 2
        FOLD = 3

        RAISE_AMT = [0, minraise]
        """
        if len(actions) != len(self._seats):
            raise error.Error('actions must be same shape as number of seats.')

        if self._current_player is None:
            raise error.Error('Round cannot be played without 2 or more players.')

        if self._round == 4:
            raise error.Error('Rounds already finished, needs to be reset.')

        players = [p for p in self._seats if p.playing_hand]
        if len(players) == 1:
            raise error.Error('Round cannot be played with one player.')

        self._last_player = self._current_player
        self._last_actions = actions

        if not self._current_player.playedthisround and len([p for p in players if not p.isallin]) >= 1:
            if self._current_player.isallin:
                self._current_player = self._next(players, self._current_player)
                return self._get_current_step_returns(False)

            move = self._current_player.player_move(
                self._output_state(self._current_player), actions[self._current_player.player_id])

            if move[0] == 'call':
                self._player_bet(self._current_player, self._tocall)
                if self._debug:
                    print('Player', self._current_player.player_id, move)
                self._current_player = self._next(players, self._current_player)
            elif move[0] == 'check':
                self._player_bet(self._current_player, self._current_player.currentbet)
                if self._debug:
                    print('Player', self._current_player.player_id, move)
                self._current_player = self._next(players, self._current_player)
            elif move[0] == 'raise':
                self._player_bet(self._current_player, move[1]+self._current_player.currentbet)
                if self._debug:
                    print('Player', self._current_player.player_id, move)
                for p in players:
                    if p != self._current_player:
                        p.playedthisround = False
                self._current_player = self._next(players, self._current_player)
            elif move[0] == 'fold':
                self._current_player.playing_hand = False
                folded_player = self._current_player
                if self._debug:
                    print('Player', self._current_player.player_id, move)
                self._current_player = self._next(players, self._current_player)
                players.remove(folded_player)
                self._folded_players.append(folded_player)
                # break if a single player left
                if len(players) == 1:
                    self._resolve(players)

        if all([player.playedthisround for player in players]):
            self._resolve(players)

        terminal = False
        if all([player.isallin for player in players]):
            while self._round < 4:
                self._deal_next_round()
                self._round += 1
        if self._round == 4 or len(players) == 1:
            terminal = True
            self._resolve_round(players)
        return self._get_current_step_returns(terminal)

    def render(self, mode='human', close=False):
        print('total pot: {}'.format(self._totalpot))
        if self._last_actions is not None:
            pid = self._last_player.player_id
            print('last action by player {}:'.format(pid))
            print(format_action(self._last_player, self._last_actions[pid]))

        (player_states, community_states) = self._get_current_state()
        (player_infos, player_hands) = zip(*player_states)
        (community_infos, community_cards) = community_states

        print('community:')
        print('-' + hand_to_str(community_cards))
        print('players:')
        for idx, hand in enumerate(player_hands):
            print('{}{}stack: {}'.format(idx, hand_to_str(hand), self._seats[idx].stack))

    def _resolve(self, players):
        self._current_player = self._first_to_act(players)
        self._resolve_sidepots(players + self._folded_players)
        self._new_round()
        self._deal_next_round()
        if self._debug:
            print('totalpot', self._totalpot)

    def _deal_next_round(self):
        if self._round == 0:
            self._deal()
        elif self._round == 1:
            self._flop()
        elif self._round == 2:
            self._turn()
        elif self._round == 3:
            self._river()

    def _increment_blinds(self):
        self._blind_index = min(self._blind_index + 1, len(TexasHoldemEnv.BLIND_INCREMENTS) - 1)
        [self._smallblind, self._bigblind] = TexasHoldemEnv.BLIND_INCREMENTS[self._blind_index]

    def _post_smallblind(self, player):
        if self._debug:
            print('player ', player.player_id, 'small blind', self._smallblind)
        self._player_bet(player, self._smallblind)
        player.playedthisround = False

    def _post_bigblind(self, player):
        if self._debug:
            print('player ', player.player_id, 'big blind', self._bigblind)
        self._player_bet(player, self._bigblind)
        player.playedthisround = False
        self._lastraise = self._bigblind

    def _player_bet(self, player, total_bet):
        # relative_bet is how much _additional_ money is the player betting this turn,
        # on top of what they have already contributed
        # total_bet is the total contribution by player to pot in this round
        relative_bet = min(player.stack, total_bet - player.currentbet)
        player.bet(relative_bet + player.currentbet)

        self._totalpot += relative_bet
        self._tocall = max(self._tocall, total_bet)
        if self._tocall > 0:
            self._tocall = max(self._tocall, self._bigblind)
        self._lastraise = max(self._lastraise, relative_bet - self._lastraise)

    def _first_to_act(self, players):
        if self._round == 0 and len(players) == 2:
            return self._next(sorted(
                players + [self._seats[self._button]], key=lambda x:x.get_seat()),
                self._seats[self._button])
        try:
            first = [player for player in players if player.get_seat() > self._button][0]
        except IndexError:
            first = players[0]
        return first

    def _next(self, players, current_player):
        idx = players.index(current_player)
        return players[(idx+1) % len(players)]

    def _deal(self):
        for player in self._seats:
            if player.playing_hand:
                player.hand = self._deck.draw(2)

    def _flop(self):
        self._discard.append(self._deck.draw(1)) #burn
        self.community = self._deck.draw(3)

    def _turn(self):
        self._discard.append(self._deck.draw(1)) #burn
        self.community.append(self._deck.draw(1))

    def _river(self):
        self._discard.append(self._deck.draw(1)) #burn
        self.community.append(self._deck.draw(1))

    def _ready_players(self):
        for p in self._seats:
            if not p.emptyplayer and p.sitting_out:
                p.sitting_out = False
                p.playing_hand = True

    def _resolve_sidepots(self, players_playing):
        players = [p for p in players_playing if p.currentbet]
        if self._debug:
            print('current bets: ', [p.currentbet for p in players])
            print('playing hand: ', [p.playing_hand for p in players])
        if not players:
            return
        try:
            smallest_bet = min([p.currentbet for p in players if p.playing_hand])
        except ValueError:
            for p in players:
                self._side_pots[self._current_sidepot] += p.currentbet
                p.currentbet = 0
            return

        smallest_players_allin = [p for p, bet in zip(players, [p.currentbet for p in players]) if bet == smallest_bet and p.isallin]

        for p in players:
            self._side_pots[self._current_sidepot] += min(smallest_bet, p.currentbet)
            p.currentbet -= min(smallest_bet, p.currentbet)
            p.lastsidepot = self._current_sidepot

        if smallest_players_allin:
            self._current_sidepot += 1
            self._resolve_sidepots(players)
        if self._debug:
            print('sidepots: ', self._side_pots)

    def _new_round(self):
        for player in self._player_dict.values():
            player.currentbet = 0
            player.playedthisround = False
        self._round += 1
        self._tocall = 0
        self._lastraise = 0

    def _resolve_round(self, players):
        if len(players) == 1:
            players[0].refund(sum(self._side_pots))
            self._totalpot = 0
        else:
            # compute hand ranks
            for player in players:
                player.handrank = self._evaluator.evaluate(player.hand, self.community)

            # trim side_pots to only include the non-empty side pots
            temp_pots = [pot for pot in self._side_pots if pot > 0]

            # compute who wins each side pot and pay winners
            for pot_idx,_ in enumerate(temp_pots):
                # find players involved in given side_pot, compute the winner(s)
                pot_contributors = [p for p in players if p.lastsidepot >= pot_idx]
                winning_rank = min([p.handrank for p in pot_contributors])
                winning_players = [p for p in pot_contributors if p.handrank == winning_rank]

                for player in winning_players:
                    split_amount = int(self._side_pots[pot_idx]/len(winning_players))
                    if self._debug:
                        print('Player', player.player_id, 'wins side pot (', int(self._side_pots[pot_idx]/len(winning_players)), ')')
                    player.refund(split_amount)
                    self._side_pots[pot_idx] -= split_amount

                # any remaining chips after splitting go to the winner in the earliest position
                if self._side_pots[pot_idx]:
                    earliest = self._first_to_act([player for player in winning_players])
                    earliest.refund(self._side_pots[pot_idx])

    def _reset_game(self):
        playing = 0
        for player in self._seats:
            if not player.emptyplayer and not player.sitting_out:
                player.reset_hand()
                playing += 1
        self.community = []
        self._current_sidepot = 0
        self._totalpot = 0
        self._side_pots = [0] * len(self._seats)
        self._deck.shuffle()

        if playing:
            self._button = (self._button + 1) % len(self._seats)
            while not self._seats[self._button].playing_hand:
                self._button = (self._button + 1) % len(self._seats)

    def _output_state(self, current_player):
        return {
            'players': [player.player_state() for player in self._seats],
            'community': self.community,
            'my_seat': current_player.get_seat(),
            'pocket_cards': current_player.hand,
            'pot': self._totalpot,
            'button': self._button,
            'tocall': (self._tocall - current_player.currentbet),
            'stack': current_player.stack,
            'bigblind': self._bigblind,
            'player_id': current_player.player_id,
            'lastraise': self._lastraise,
            'minraise': max(self._bigblind, self._lastraise + self._tocall),
        }

    def _pad(self, l, n, v):
        if (not l) or (l is None):
            l = []
        return l + [v] * (n - len(l))

    def _get_current_state(self):
        player_states = []
        for player in self._seats:
            player_features = [
                int(player.emptyplayer),
                int(player.get_seat()),
                int(player.stack),
                int(player.playing_hand),
                int(player.handrank),
                int(player.playedthisround),
                int(player.betting),
                int(player.isallin),
                int(player.lastsidepot),
            ]
            player_states.append((player_features, self._pad(player.hand, 2, -1)))
        community_states = ([
            int(self._button),
            int(self._smallblind),
            int(self._bigblind),
            int(self._totalpot),
            int(self._lastraise),
            int(max(self._bigblind, self._lastraise + self._tocall)),
            int(self._tocall - self._current_player.currentbet),
            int(self._current_player.player_id),
        ], self._pad(self.community, 5, -1))
        return (tuple(player_states), community_states)

    def _get_current_reset_returns(self):
        return self._get_current_state()

    def _get_current_step_returns(self, terminal):
        obs = self._get_current_state()
        # TODO, make this something else?
        rew = [player.stack for player in self._seats]
        return obs, rew, terminal, [] # TODO, return some info?
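
Note on side pots: `_resolve_sidepots` above peels off the smallest outstanding bet into the current pot and recurses when that bet belongs to an all-in player. The same layering idea can be sketched in isolation. This is a minimal illustration, not the environment's code: `split_side_pots` is a hypothetical helper, it works on a plain dict of contributions, and it ignores folded players, which the environment tracks separately.

```python
def split_side_pots(bets):
    """Split per-player contributions into layered pots.

    `bets` maps a player id to the total chips that player has put in.
    Returns a list of (amount, eligible_player_ids) tuples, main pot first.
    Hypothetical helper for illustration only; not part of holdem.
    """
    remaining = {pid: bet for pid, bet in bets.items() if bet > 0}
    pots = []
    while remaining:
        # The smallest outstanding contribution caps the current layer.
        layer = min(remaining.values())
        eligible = sorted(remaining)
        pots.append((layer * len(eligible), eligible))
        # Peel the layer off every contributor; drop players who are covered.
        remaining = {pid: bet - layer
                     for pid, bet in remaining.items() if bet > layer}
    return pots


# Player 0 is all-in for 100 while players 1 and 2 each put in 300.
pots = split_side_pots({0: 100, 1: 300, 2: 300})
print(pots)
```

Each layer is contested only by the players who paid into it, which is why a short-stacked all-in player can win the main pot but not the side pot.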

================================================
FILE: holdem/player.py
================================================
# -*- coding: utf-8 -*-
#
# Copyright (c) 2016 Aleksander Beloi (beloi.alex@gmail.com)
# Copyright (c) 2018 Sam Wenke (samwenke@gmail.com)
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
from random import randint

from gym import error
from treys import Card


class Player(object):

    CHECK = 0
    CALL = 1
    RAISE = 2
    FOLD = 3

    def __init__(self, player_id, stack=2000, emptyplayer=False):
        self.player_id = player_id

        self.hand = []
        self.stack = stack
        self.currentbet = 0
        self.lastsidepot = 0
        self._seat = -1
        self.handrank = -1

        # flags for table management
        self.emptyplayer = emptyplayer
        self.betting = False
        self.isallin = False
        self.playing_hand = False
        self.playedthisround = False
        self.sitting_out = True

    def get_seat(self):
        return self._seat

    def set_seat(self, value):
        self._seat = value

    def reset_hand(self):
        # clear the pocket cards (previously assigned to an unused `_hand` attribute)
        self.hand = []
        self.playedthisround = False
        self.betting = False
        self.isallin = False
        self.currentbet = 0
        self.lastsidepot = 0
        self.playing_hand = (self.stack != 0)

    def bet(self, bet_size):
        self.playedthisround = True
        if not bet_size:
            return
        self.stack -= (bet_size - self.currentbet)
        self.currentbet = bet_size
        if self.stack == 0:
            self.isallin = True

    def refund(self, amount):
        self.stack += amount

    def player_state(self):
        return (self.get_seat(), self.stack, self.playing_hand, self.betting, self.player_id)

    def reset_stack(self):
        self.stack = 2000

    def update_localstate(self, table_state):
        self.stack = table_state.get('stack')
        self.hand = table_state.get('pocket_cards')

    # cleanup
    def player_move(self, table_state, action):
        self.update_localstate(table_state)
        bigblind = table_state.get('bigblind')
        tocall = min(table_state.get('tocall', 0), self.stack)
        minraise = table_state.get('minraise', 0)

        [action_idx, raise_amount] = action
        raise_amount = int(raise_amount)
        action_idx = int(action_idx)

        if tocall == 0:
            assert action_idx in [Player.CHECK, Player.RAISE]
            if action_idx == Player.RAISE:
                if raise_amount < minraise:
                    raise error.Error('raise must be greater than minraise {}'.format(minraise))
                if raise_amount > self.stack:
                    raise error.Error('raise must be less than maxraise {}'.format(self.stack))
                move_tuple = ('raise', raise_amount)
            elif action_idx == Player.CHECK:
                move_tuple = ('check', 0)
            else:
                raise error.Error('invalid action ({}) must be check (0) or raise (2)'.format(action_idx))
        else:
            if action_idx not in [Player.RAISE, Player.CALL, Player.FOLD]:
                raise error.Error('invalid action ({}) must be raise (2), call (1), or fold (3)'.format(action_idx))
            if action_idx == Player.RAISE:
                if raise_amount < minraise:
                    raise error.Error('raise must be greater than minraise {}'.format(minraise))
                if raise_amount > self.stack:
                    raise error.Error('raise must be less than maxraise {}'.format(self.stack))
                move_tuple = ('raise', raise_amount)
            elif action_idx == Player.CALL:
                move_tuple = ('call', tocall)
            elif action_idx == Player.FOLD:
                move_tuple = ('fold', -1)
            else:
                raise error.Error('invalid action ({}) must be raise (2), call (1), or fold (3)'.format(action_idx))
        return move_tuple

================================================
FILE: holdem/utils.py
================================================
# -*- coding: utf-8 -*-
#
# Copyright (c) 2018 Sam Wenke (samwenke@gmail.com)
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
from treys import Card


class action_table:
    CHECK = 0
    CALL = 1
    RAISE = 2
    FOLD = 3
    NA = 0


def format_action(player, action):
    color = False
    try:
        from termcolor import colored
        # for mac, linux: http://pypi.python.org/pypi/termcolor
        # can use for windows: http://pypi.python.org/pypi/colorama
        color = True
    except ImportError:
        pass
    [aid, raise_amt] = action
    if aid == action_table.CHECK:
        text = '_ check'
        if color:
            text = colored(text, 'white')
        return text
    if aid == action_table.CALL:
        text = '- call, current bet: {}'.format(player.currentbet)
        if color:
            text = colored(text, 'yellow')
        return text
    if aid == action_table.RAISE:
        text = '^ raise, current bet: {}'.format(raise_amt)
        if color:
            text = colored(text, 'green')
        return text
    if aid == action_table.FOLD:
        text = 'x fold'
        if color:
            text = colored(text, 'red')
        return text


def card_to_str(card):
    if card == -1:
        return ''
    return Card.int_to_pretty_str(card)


def hand_to_str(hand):
    output = " "
    for i in range(len(hand)):
        c = hand[i]
        if c == -1:
            if i != len(hand) - 1:
                output += '[ ],'
            else:
                output += '[ ] '
            continue
        if i != len(hand) - 1:
            output += str(Card.int_to_pretty_str(c)) + ','
        else:
            output += str(Card.int_to_pretty_str(c)) + ' '
    return output


def safe_actions(community_infos, n_seats):
    current_player = community_infos[-1]
    to_call = community_infos[-2]
    actions = [[action_table.CHECK, action_table.NA]] * n_seats
    if to_call > 0:
        actions[current_player] = [action_table.CALL, action_table.NA]
    return actions

================================================
FILE: setup.py
================================================
# -*- coding: utf-8 -*-
#
# Copyright (c) 2018 Sam Wenke (samwenke@gmail.com)
#
# Permission is hereby granted, free of charge, to any person obtaining
# a copy of this software and associated documentation files (the "Software"),
# to deal in the Software without restriction, including without limitation
# the rights to use, copy, modify, merge, publish, distribute, sublicense,
# and/or sell copies of the Software, and to permit persons to whom the
# Software is furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
from setuptools import setup, find_packages

with open("README.md") as readme:
    long_description = readme.read()

setup(
    name='holdem',
    version='1.0.0',
    long_description=long_description,
    url='https://github.com/wenkesj/holdem',
    author='Sam Wenke',
    author_email='samwenke@gmail.com',
    license='MIT',
    description=('OpenAI Gym No-Limit Texas Holdem Environment.'),
    packages=find_packages(exclude=['test', 'examples']),
    install_requires=['treys', 'gym'],
    platforms='any',
)
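
Note on reading observations: the README documents `community_infos` as a positional list of eight integers, and `safe_actions` reads it with negative indices. Wrapping the list in a named tuple can make agent code easier to follow. The sketch below is an assumption-laden convenience, not part of the package: `CommunityInfo`, its field names, and `decode_community_infos` are hypothetical, and only the field order is taken from the README.

```python
from collections import namedtuple

# Field names are illustrative; the order follows the README's index list.
CommunityInfo = namedtuple('CommunityInfo', [
    'button', 'small_blind', 'big_blind', 'pot',
    'last_raise', 'min_raise', 'to_call', 'current_player',
])


def decode_community_infos(community_infos):
    """Wrap the 8-element community_infos list in a named tuple."""
    if len(community_infos) != len(CommunityInfo._fields):
        raise ValueError('expected {} values, got {}'.format(
            len(CommunityInfo._fields), len(community_infos)))
    return CommunityInfo(*community_infos)


# Hand-built observation: button at seat 0, blinds 10/25, seat 1 to act.
info = decode_community_infos([0, 10, 25, 35, 25, 50, 25, 1])
print(info.to_call, info.current_player)
```

Because a named tuple is still a tuple, `info[-1]` and `info[-2]` keep working, so the wrapped value stays compatible with code that indexes positionally.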