Repository: christopherjenness/NBA-player-movement
Branch: master
Commit: 543cf41f5b11
Files: 8
Total size: 94.3 KB

Directory structure:
gitextract_z95h9qgz/

├── .gitignore
├── README.md
└── game/
    ├── allgames.txt
    ├── game.py
    ├── pbpevents.txt
    ├── scrape_games.py
    ├── spacing_analysis.py
    └── velocity_analysis.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
*~
*.DS_Store
*__pycache__/
*.pyc


================================================
FILE: README.md
================================================
# NBA player tracking visualization and analysis

This library contains useful methods for visualizing and analyzing NBA player tracking data.

The data is located here and contains all player and ball locations for NBA games from the 2015-16 season.  Play-by-play data is obtained from stats.nba.com.

Example visualizations are shown below.

## System Requirements
* `curl`
* `ffmpeg`
* `p7zip`

## TODO
* Long term solution for play-by-play data.  This may break at any moment.  [See here](https://github.com/christopherjenness/NBA-player-movement/issues/5)
* Python 3 support.  [See here](https://github.com/christopherjenness/NBA-player-movement/issues/4)

## Visualization
Note: these examples use `watch_play()` to visualize plays, which is extremely slow.  `animate_play()` is much faster since it streams frames directly to ffmpeg without writing them to disk first.

To visualize games from the tracking data, the `Game` class in `game.py` is used.
```python
from game import Game
game = Game('01.08.2016', 'POR', 'GSW')
game.watch_play(game_time=6, length=120, commentary=False)
```
![NoCommentary](examples/GSWatPORnocommentary.gif)
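
The argument order matters here. Judging from `Game.__init__` in `game.py`, the tracking ID is built as `date.team2.at.team1`, so the first team passed is the home team and the second is the away team. The sketch below mirrors that naming so a game can be checked against `game/allgames.txt` before constructing a `Game`. The helper names `tracking_archive_name` and `is_available` are hypothetical and not part of the library.

```python
def tracking_archive_name(date, home, away):
    """Build the archive name 'MM.DD.YYYY.AWAY.at.HOME.7z' (hypothetical helper)."""
    return '{date}.{away}.at.{home}.7z'.format(date=date, home=home, away=away)


def is_available(date, home, away, listing):
    """Return True if the archive name appears in an iterable of names."""
    wanted = tracking_archive_name(date, home, away)
    return any(line.strip() == wanted for line in listing)


# Two names copied from game/allgames.txt; in practice, pass the open file.
listing = ['01.08.2016.GSW.at.POR.7z', '01.09.2016.GSW.at.SAC.7z']

print(tracking_archive_name('01.08.2016', 'POR', 'GSW'))
print(is_available('01.08.2016', 'POR', 'GSW', listing))  # home=POR, away=GSW
print(is_available('01.08.2016', 'GSW', 'POR', listing))  # teams swapped
```

With the teams swapped, the generated name does not match any listed archive, which is a likely cause of a failed download.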

To easily follow the flow of the game, commentary can be added.
```python
game.watch_play(game_time=6, length=120, commentary=True)
```

![Commentary](examples/GSWatPOR.gif)

If you are interested in a single player, they can easily be tracked.
```python
game.watch_play(game_time=2007, length=10, highlight_player='Stephen Curry', commentary=False)
```

![Curry3](examples/Curry3.gif)
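
The `game_time` values used above are seconds elapsed since the start of the game. This follows from how `game.py` derives its `game_time` column: `(period - 1) * 720 + (720 - seconds remaining in the quarter)`. Below is a minimal sketch of that conversion in both directions, assuming 12-minute regulation quarters only. The function names are hypothetical and not part of the library.

```python
QUARTER_SECONDS = 720  # 12-minute quarters, the constant used in game.py


def clock_to_game_time(period, clock):
    """Convert a period and a 'M:SS' game clock to elapsed seconds."""
    minutes, seconds = clock.split(':')
    remaining = int(minutes) * 60 + int(seconds)
    return (period - 1) * QUARTER_SECONDS + (QUARTER_SECONDS - remaining)


def game_time_to_clock(game_time):
    """Convert elapsed seconds back to (period, 'M:SS' remaining)."""
    period = int(game_time // QUARTER_SECONDS) + 1
    remaining = QUARTER_SECONDS - int(game_time % QUARTER_SECONDS)
    return period, '{}:{:02d}'.format(remaining // 60, remaining % 60)


print(clock_to_game_time(3, '2:33'))
print(game_time_to_clock(2007))
```

Under this formula, `game_time=2007` corresponds to 2:33 remaining in the third quarter.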

All of a player's actions can be extracted and viewed with a single method call.  Currently, the action can be one of ['all_FG', 'made_FG', 'miss_FG', 'rebound'], but this method can be easily extended to include any action.

```python
game.watch_player_actions("Stephen Curry", "made_FG")
"""
This method will output a video for each of Steph's made FGs in the game;
however, I am displaying just one of them.
"""
```

![CurryFG](examples/CurryFG.gif)

## Analysis (In Progress)

Here, we analyze two aspects of basketball that are difficult to address without player tracking data:
* Defensive Spacing
* Player/Team Velocity

### Defensive Spacing

NBA commentators often praise offensive teams who can "space the defense".  Essentially, if an offensive team can draw defenders out to the three point line, passing lanes will open up and drives to the basket will be less clogged and more efficient.  Here we analyze how effectively teams can space the defense.

The workhorse of this analysis is `scipy.spatial.ConvexHull`, which computes the convex hull of the defenders' positions (larger convex hull = more spaced defense).  This can be visualized:

```python
game.watch_play(121, 10, commentary=False, show_spacing='home')
```

![SpacingPlay](examples/GSWspacing.gif)
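
As a rough illustration of the measurement itself, the sketch below computes the area enclosed by a set of defender coordinates. It is not taken from `spacing_analysis.py`. The function name `spacing_area` and the coordinates are made up for the example, and positions are assumed to be in feet. Note that for 2-D input SciPy reports the enclosed area as `hull.volume`, while `hull.area` is the perimeter.

```python
import numpy as np
from scipy.spatial import ConvexHull


def spacing_area(xy):
    """Area of the convex hull around a list of (x, y) positions."""
    points = np.asarray(xy, dtype=float)
    hull = ConvexHull(points)
    # In two dimensions, `volume` is the enclosed area; `area` is the perimeter.
    return hull.volume


# Hypothetical defenders: four on the corners of a 10 x 10 ft square, one inside.
spread_out = [(10, -30), (20, -30), (20, -20), (10, -20), (15, -25)]
# The same shape shrunk to a 4 x 4 ft square.
packed_in = [(13, -27), (17, -27), (17, -23), (13, -23), (15, -25)]

print(spacing_area(spread_out))
print(spacing_area(packed_in))
```

The interior defender does not change the result, since only the outermost players define the hull.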

`spacing_analysis.py` contains the code for the following analysis.  To process the data, only "set plays" were analyzed.  Since "transition plays" have unique spacing properties, we limited this analysis to "standard" plays where the offense and defense are set.

Which teams are best at spacing the defense? (Remember, spacing the defense more is thought to be better).  If we average over all time points for each team, we get the following:

![SpacingBar](examples/DefensiveSpacing.png)

Interestingly, we see that Detroit is the best team at spacing defenses.  [This is something that has been anecdotally documented by Mike Prada, and the data back up his claims.](http://www.sbnation.com/nba/2015/1/9/7517125/detroit-pistons-winning-streak-josh-smith-released)  Additionally, teams like Cleveland that are thought to have a modern offense are great at spacing the defense.

But the question is: **Does spacing the defense help you win?**  Here we look at the score differential vs defensive spacing and we see a positive correlation.  In fact, spacing the defense an extra 5 square feet correlates with increasing the score differential 4.25 points! 

![SpacingScore](examples/SpacingVsScore.png)

If you stare at this graph long enough, you can notice it also shows the level of home court advantage in the NBA.  If you are interested, you can read an analysis of home court advantage I did [here.](https://github.com/christopherjenness/my-pdfs/blob/master/NBAHomeTeamAdvantage.pdf)

**How can a team space the defense better?**  Intuitively, spacing your offense will draw out the defense.  The plot below looks at each game, and plots how spaced the offenses and defenses were.  Clearly, a more spaced offense correlates with a more spaced defense.

![SpacingOffDeff](examples/OffenseVsDefense.png)

But when you break it down by team, how effectively can each team space the defense?  Below is a plot of each team's average offensive spacing plotted against how well they can space the opponent's defense.  As expected, if a team has a well spaced offense, their opponent's defense is more spaced.  There are a few interesting exceptions though.

![TeamSpacing](examples/Spacing_scatter.png)

Notice Toronto (TOR).  Toronto has a hard time spacing the defense even though they space out their offense.  This is likely due to their star DeMar DeRozan being a shooting liability.  Defenders don't need to guard him out on the 3PT line, so they can keep the paint clogged.

Notice San Antonio (SAS).  San Antonio can effectively space the defense without spacing out their offense.  This may be due to having one of the best 3PT shooters in the league, Kawhi Leonard, who needs to be guarded religiously at the 3PT line.

Currently, I'm working on breaking down defensive spacing per play to see the effect on individual plays instead of aggregated game data.  This is yielding interesting insights.

### Player Velocity

Player tracking data provides insight into team and player velocity.  Here we analyze how player speed affects the flow of the game.  The analysis code can be found in `velocity_analysis.py`.
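
For intuition, speed can be estimated from consecutive tracking frames as distance travelled divided by elapsed time. The sketch below shows that calculation and is not the code in `velocity_analysis.py`. It assumes positions are in feet, timestamps are in seconds and strictly increasing, and the function name `speeds` is hypothetical.

```python
import numpy as np


def speeds(xy, times):
    """Speed between consecutive frames, in distance units per second."""
    xy = np.asarray(xy, dtype=float)
    times = np.asarray(times, dtype=float)
    # Straight-line distance covered between each pair of consecutive frames.
    step = np.linalg.norm(np.diff(xy, axis=0), axis=1)
    # Elapsed time between the same pairs (assumed to be strictly positive).
    dt = np.diff(times)
    return step / dt


# Made-up track: a player moving 5 ft every half second.
positions = [(0.0, 0.0), (3.0, 4.0), (6.0, 8.0)]
timestamps = [0.0, 0.5, 1.0]

print(speeds(positions, timestamps))
```

Averaging these per-frame speeds over all players on a team would give one possible team-level velocity.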

Using the visualization shown above, team velocities can be shown as the game progresses:

```python
game = Game('01.08.2016', 'POR', 'GSW')
watch_play_velocities(game, game_time=7, length=54)
```

![TeamVelocity](examples/TeamVelocity.gif)

Alternatively, individual player velocities can be visualized:

```python
game = Game('01.08.2016', 'POR', 'GSW')
watch_play_velocities(game, game_time=7, length=54, highlight_player='Stephen Curry')
```

![StephVelocity](examples/CurryVelocity.gif)

Different teams have different offensive/defensive schemes that require different amounts of running.  When we break down velocity by team, we can look at how much effort each team's scheme takes.  (Note: I threw out all transition data, since I was interested in 'set' plays).


![OffenseVelocity](examples/VelocityOffenseTeams.png)

What we see makes sense: the Spurs, who are known for their "flowing" offense, have the most running incorporated into their offense.

![DefenseVelocity](examples/VelocityDefenseTeams.png)

Looking at defense is a bit more complicated.  Defensive velocity takes into account a number of things: closing out, switching, zoning, etc.  We will need to break these down to get real insight.

One aspect of basketball that is currently hard to evaluate is player fatigue.  By tracking player velocity, we can measure how much it decreases over the course of a game and use that as a metric for fatigue.

What we see is that some teams, such as the Indiana Pacers, decrease in offensive velocity as the game progresses (each dot is the average velocity of a single game).

![INDfatigue](examples/INDfatige.png)

Interestingly, while the Spurs have the highest offensive velocity in the league, they show no fatigue over the course of a game.  This reflects the speculated culture of the Spurs.

![SASfatigue](examples/SASfatige.png)

This will be more insightful when we break down fatigue by player, since different players are affected differently.


================================================
FILE: game/allgames.txt
================================================
01.01.2016.CHA.at.TOR.7z
01.01.2016.DAL.at.MIA.7z
01.01.2016.NYK.at.CHI.7z
01.01.2016.ORL.at.WAS.7z
01.01.2016.PHI.at.LAL.7z
01.02.2016.BKN.at.BOS.7z
01.02.2016.DEN.at.GSW.7z
01.02.2016.DET.at.IND.7z
01.02.2016.HOU.at.SAS.7z
01.02.2016.MEM.at.UTA.7z
01.02.2016.MIL.at.MIN.7z
01.02.2016.NOP.at.DAL.7z
01.02.2016.OKC.at.CHA.7z
01.02.2016.ORL.at.CLE.7z
01.02.2016.PHI.at.LAC.7z
01.02.2016.PHX.at.SAC.7z
01.03.2016.ATL.at.NYK.7z
01.03.2016.CHI.at.TOR.7z
01.03.2016.MIA.at.WAS.7z
01.03.2016.PHX.at.LAL.7z
01.03.2016.POR.at.DEN.7z
01.04.2016.BOS.at.BKN.7z
01.04.2016.CHA.at.GSW.7z
01.04.2016.HOU.at.UTA.7z
01.04.2016.IND.at.MIA.7z
01.04.2016.MEM.at.POR.7z
01.04.2016.MIN.at.PHI.7z
01.04.2016.ORL.at.DET.7z
01.04.2016.SAC.at.OKC.7z
01.04.2016.SAS.at.MIL.7z
01.04.2016.TOR.at.CLE.7z
01.05.2016.GSW.at.LAL.7z
01.05.2016.MIL.at.CHI.7z
01.05.2016.NYK.at.ATL.7z
01.05.2016.SAC.at.DAL.7z
01.06.2016.CHA.at.PHX.7z
01.06.2016.CLE.at.WAS.7z
01.06.2016.DAL.at.NOP.7z
01.06.2016.DEN.at.MIN.7z
01.06.2016.DET.at.BOS.7z
01.06.2016.IND.at.ORL.7z
01.06.2016.LAC.at.POR.7z
01.06.2016.MEM.at.OKC.7z
01.06.2016.NYK.at.MIA.7z
01.06.2016.TOR.at.BKN.7z
01.06.2016.UTA.at.SAS.7z
01.07.2016.ATL.at.PHI.7z
01.07.2016.BOS.at.CHI.7z
01.07.2016.LAL.at.SAC.7z
01.07.2016.UTA.at.HOU.7z
01.08.2016.CLE.at.MIN.7z
01.08.2016.DAL.at.MIL.7z
01.08.2016.DEN.at.MEM.7z
01.08.2016.GSW.at.POR.7z
01.08.2016.IND.at.NOP.7z
01.08.2016.MIA.at.PHX.7z
01.08.2016.NYK.at.SAS.7z
01.08.2016.OKC.at.LAL.7z
01.08.2016.ORL.at.BKN.7z
01.08.2016.TOR.at.WAS.7z
01.09.2016.BKN.at.DET.7z
01.09.2016.CHA.at.LAC.7z
01.09.2016.CHI.at.ATL.7z
01.09.2016.GSW.at.SAC.7z
01.09.2016.MIA.at.UTA.7z
01.09.2016.TOR.at.PHI.7z
01.09.2016.WAS.at.ORL.7z
01.10.2016.BOS.at.MEM.7z
01.10.2016.CHA.at.DEN.7z
01.10.2016.CLE.at.PHI.7z
01.10.2016.DAL.at.MIN.7z
01.10.2016.IND.at.HOU.7z
01.10.2016.MIL.at.NYK.7z
01.10.2016.NOP.at.LAC.7z
01.10.2016.OKC.at.POR.7z
01.10.2016.UTA.at.LAL.7z
01.11.2016.MIA.at.GSW.7z
01.11.2016.SAS.at.BKN.7z
01.11.2016.WAS.at.CHI.7z
01.12.2016.BOS.at.NYK.7z
01.12.2016.CHI.at.MIL.7z
01.12.2016.CLE.at.DAL.7z
01.12.2016.HOU.at.MEM.7z
01.12.2016.NOP.at.LAL.7z
01.12.2016.OKC.at.MIN.7z
01.12.2016.PHX.at.IND.7z
01.12.2016.SAS.at.DET.7z
01.13.2016.ATL.at.CHA.7z
01.13.2016.DAL.at.OKC.7z
01.13.2016.GSW.at.DEN.7z
01.13.2016.IND.at.BOS.7z
01.13.2016.MIA.at.LAC.7z
01.13.2016.MIL.at.WAS.7z
01.13.2016.MIN.at.HOU.7z
01.13.2016.NOP.at.SAC.7z
01.13.2016.NYK.at.BKN.7z
01.13.2016.UTA.at.POR.7z
01.14.2016.CHI.at.PHI.7z
01.14.2016.CLE.at.SAS.7z
01.14.2016.DET.at.MEM.7z
01.14.2016.LAL.at.GSW.7z
01.14.2016.SAC.at.UTA.7z
01.15.2016.ATL.at.MIL.7z
01.15.2016.CHA.at.NOP.7z
01.15.2016.CLE.at.HOU.7z
01.15.2016.DAL.at.CHI.7z
01.15.2016.MIA.at.DEN.7z
01.15.2016.MIN.at.OKC.7z
01.15.2016.PHX.at.BOS.7z
01.15.2016.POR.at.BKN.7z
01.15.2016.WAS.at.IND.7z
01.18.2016.BKN.at.TOR.7z
01.18.2016.BOS.at.DAL.7z
01.18.2016.CHI.at.DET.7z
01.18.2016.GSW.at.CLE.7z
01.18.2016.HOU.at.LAC.7z
01.18.2016.NOP.at.MEM.7z
01.18.2016.ORL.at.ATL.7z
01.18.2016.PHI.at.NYK.7z
01.18.2016.POR.at.WAS.7z
01.18.2016.UTA.at.CHA.7z
01.19.2016.IND.at.PHX.7z
01.19.2016.MIL.at.MIA.7z
01.19.2016.MIN.at.NOP.7z
01.19.2016.OKC.at.DEN.7z
01.20.2016.ATL.at.POR.7z
01.20.2016.BOS.at.TOR.7z
01.20.2016.CHA.at.OKC.7z
01.20.2016.CLE.at.BKN.7z
01.20.2016.DET.at.HOU.7z
01.20.2016.GSW.at.CHI.7z
01.20.2016.MIA.at.WAS.7z
01.20.2016.MIN.at.DAL.7z
01.20.2016.PHI.at.ORL.7z
01.20.2016.SAC.at.LAL.7z
01.20.2016.UTA.at.NYK.7z
01.21.2016.DET.at.NOP.7z
01.21.2016.MEM.at.DEN.7z
01.22.2016.CHA.at.ORL.7z
01.22.2016.CHI.at.BOS.7z
01.22.2016.IND.at.GSW.7z
01.22.2016.LAC.at.NYK.7z
01.22.2016.MIA.at.TOR.7z
01.22.2016.MIL.at.HOU.7z
01.22.2016.OKC.at.DAL.7z
01.22.2016.SAS.at.LAL.7z
01.22.2016.UTA.at.BKN.7z
01.23.2016.ATL.at.PHX.7z
01.23.2016.CHI.at.CLE.7z
01.23.2016.DET.at.DEN.7z
01.23.2016.IND.at.SAC.7z
01.23.2016.LAL.at.POR.7z
01.23.2016.MEM.at.MIN.7z
01.23.2016.MIL.at.NOP.7z
01.23.2016.NYK.at.CHA.7z
01.23.2016.UTA.at.WAS.7z
10.27.2015.CLE.at.CHI.7z
10.27.2015.DET.at.ATL.7z
10.27.2015.NOP.at.GSW.7z
10.28.2015.CLE.at.MEM.7z
10.28.2015.DEN.at.HOU.7z
10.28.2015.IND.at.TOR.7z
10.28.2015.LAC.at.SAC.7z
10.28.2015.MIN.at.LAL.7z
10.28.2015.NOP.at.POR.7z
10.28.2015.NYK.at.MIL.7z
10.28.2015.PHI.at.BOS.7z
10.28.2015.SAS.at.OKC.7z
10.28.2015.UTA.at.DET.7z
10.28.2015.WAS.at.ORL.7z
10.29.2015.ATL.at.NYK.7z
10.29.2015.DAL.at.LAC.7z
10.29.2015.MEM.at.IND.7z
10.30.2015.BKN.at.SAS.7z
10.30.2015.CHA.at.ATL.7z
10.30.2015.CHI.at.DET.7z
10.30.2015.LAL.at.SAC.7z
10.30.2015.MIA.at.CLE.7z
10.30.2015.MIN.at.DEN.7z
10.30.2015.OKC.at.ORL.7z
10.30.2015.POR.at.PHX.7z
10.30.2015.TOR.at.BOS.7z
10.30.2015.UTA.at.PHI.7z
10.30.2015.WAS.at.MIL.7z
10.31.2015.BKN.at.MEM.7z
10.31.2015.GSW.at.NOP.7z
10.31.2015.NYK.at.WAS.7z
10.31.2015.PHX.at.POR.7z
10.31.2015.SAC.at.LAC.7z
10.31.2015.UTA.at.IND.7z
11.01.2015.ATL.at.CHA.7z
11.01.2015.DAL.at.LAL.7z
11.01.2015.DEN.at.OKC.7z
11.01.2015.HOU.at.MIA.7z
11.01.2015.MIL.at.TOR.7z
11.01.2015.ORL.at.CHI.7z
11.01.2015.SAS.at.BOS.7z
11.02.2015.CLE.at.PHI.7z
11.02.2015.MEM.at.GSW.7z
11.02.2015.MIL.at.BKN.7z
11.02.2015.OKC.at.HOU.7z
11.02.2015.PHX.at.LAC.7z
11.02.2015.POR.at.MIN.7z
11.02.2015.SAS.at.NYK.7z
11.03.2015.ATL.at.MIA.7z
11.03.2015.CHI.at.CHA.7z
11.03.2015.DEN.at.LAL.7z
11.03.2015.IND.at.DET.7z
11.03.2015.MEM.at.SAC.7z
11.03.2015.ORL.at.NOP.7z
11.03.2015.TOR.at.DAL.7z
11.04.2015.BKN.at.ATL.7z
11.04.2015.BOS.at.IND.7z
11.04.2015.LAC.at.GSW.7z
11.04.2015.NYK.at.CLE.7z
11.04.2015.ORL.at.HOU.7z
11.04.2015.PHI.at.MIL.7z
11.04.2015.POR.at.UTA.7z
11.04.2015.SAC.at.PHX.7z
11.04.2015.SAS.at.WAS.7z
11.04.2015.TOR.at.OKC.7z
11.05.2015.CHA.at.DAL.7z
11.05.2015.MEM.at.POR.7z
11.05.2015.MIA.at.MIN.7z
11.05.2015.OKC.at.CHI.7z
11.05.2015.UTA.at.DEN.7z
11.06.2015.ATL.at.NOP.7z
11.06.2015.DEN.at.GSW.7z
11.06.2015.DET.at.PHX.7z
11.06.2015.HOU.at.SAC.7z
11.06.2015.LAL.at.BKN.7z
11.06.2015.MIA.at.IND.7z
11.06.2015.MIL.at.NYK.7z
11.06.2015.PHI.at.CLE.7z
11.06.2015.TOR.at.ORL.7z
11.06.2015.WAS.at.BOS.7z
11.07.2015.BKN.at.MIL.7z
11.07.2015.CHA.at.SAS.7z
11.07.2015.GSW.at.SAC.7z
11.07.2015.HOU.at.LAC.7z
11.07.2015.MEM.at.UTA.7z
11.07.2015.MIN.at.CHI.7z
11.07.2015.NOP.at.DAL.7z
11.07.2015.ORL.at.PHI.7z
11.07.2015.WAS.at.ATL.7z
11.08.2015.DET.at.POR.7z
11.08.2015.IND.at.CLE.7z
11.08.2015.LAL.at.NYK.7z
11.08.2015.PHX.at.OKC.7z
11.08.2015.TOR.at.MIA.7z
11.09.2015.DET.at.GSW.7z
11.09.2015.MEM.at.LAC.7z
11.09.2015.MIN.at.ATL.7z
11.09.2015.ORL.at.IND.7z
11.09.2015.POR.at.DEN.7z
11.09.2015.SAS.at.SAC.7z
11.10.2015.BOS.at.MIL.7z
11.10.2015.CHA.at.MIN.7z
11.10.2015.DAL.at.NOP.7z
11.10.2015.LAL.at.MIA.7z
11.10.2015.NYK.at.TOR.7z
11.10.2015.OKC.at.WAS.7z
11.10.2015.UTA.at.CLE.7z
11.11.2015.BKN.at.HOU.7z
11.11.2015.DET.at.SAC.7z
11.11.2015.GSW.at.MEM.7z
11.11.2015.IND.at.BOS.7z
11.11.2015.LAC.at.DAL.7z
11.11.2015.LAL.at.ORL.7z
11.11.2015.MIL.at.DEN.7z
11.11.2015.NOP.at.ATL.7z
11.11.2015.NYK.at.CHA.7z
11.11.2015.SAS.at.POR.7z
11.11.2015.TOR.at.PHI.7z
11.12.2015.GSW.at.MIN.7z
11.12.2015.LAC.at.PHX.7z
11.12.2015.UTA.at.MIA.7z
11.13.2015.ATL.at.BOS.7z
11.13.2015.BKN.at.SAC.7z
11.13.2015.CHA.at.CHI.7z
11.13.2015.CLE.at.NYK.7z
11.13.2015.HOU.at.DEN.7z
11.13.2015.LAL.at.DAL.7z
11.13.2015.MIN.at.IND.7z
11.13.2015.NOP.at.TOR.7z
11.13.2015.PHI.at.OKC.7z
11.13.2015.POR.at.MEM.7z
11.13.2015.UTA.at.ORL.7z
11.14.2015.BKN.at.GSW.7z
11.14.2015.CLE.at.MIL.7z
11.14.2015.DAL.at.HOU.7z
11.14.2015.DEN.at.PHX.7z
11.14.2015.DET.at.LAC.7z
11.14.2015.ORL.at.WAS.7z
11.14.2015.PHI.at.SAS.7z
11.15.2015.BOS.at.OKC.7z
11.15.2015.DET.at.LAL.7z
11.15.2015.MEM.at.MIN.7z
11.15.2015.NOP.at.NYK.7z
11.15.2015.POR.at.CHA.7z
11.15.2015.TOR.at.SAC.7z
11.15.2015.UTA.at.ATL.7z
11.16.2015.BOS.at.HOU.7z
11.16.2015.DAL.at.PHI.7z
11.16.2015.IND.at.CHI.7z
11.16.2015.LAL.at.PHX.7z
11.16.2015.OKC.at.MEM.7z
11.16.2015.POR.at.SAS.7z
11.17.2015.ATL.at.BKN.7z
11.17.2015.CHA.at.NYK.7z
11.17.2015.CLE.at.DET.7z
11.17.2015.DEN.at.NOP.7z
11.17.2015.MIL.at.WAS.7z
11.17.2015.MIN.at.MIA.7z
11.17.2015.TOR.at.GSW.7z
11.18.2015.BKN.at.CHA.7z
11.18.2015.CHI.at.PHX.7z
11.18.2015.DAL.at.BOS.7z
11.18.2015.DEN.at.SAS.7z
11.18.2015.IND.at.PHI.7z
11.18.2015.MIN.at.ORL.7z
11.18.2015.NOP.at.OKC.7z
11.18.2015.POR.at.HOU.7z
11.18.2015.SAC.at.ATL.7z
11.18.2015.TOR.at.UTA.7z
11.19.2015.GSW.at.LAC.7z
11.19.2015.MIL.at.CLE.7z
11.19.2015.SAC.at.MIA.7z
11.20.2015.BKN.at.BOS.7z
11.20.2015.CHI.at.GSW.7z
11.20.2015.DET.at.MIN.7z
11.20.2015.HOU.at.MEM.7z
11.20.2015.LAC.at.POR.7z
11.20.2015.NYK.at.OKC.7z
11.20.2015.PHI.at.CHA.7z
11.20.2015.PHX.at.DEN.7z
11.20.2015.SAS.at.NOP.7z
11.20.2015.TOR.at.LAL.7z
11.20.2015.UTA.at.DAL.7z
11.21.2015.ATL.at.CLE.7z
11.21.2015.MEM.at.SAS.7z
11.21.2015.MIL.at.IND.7z
11.21.2015.NYK.at.HOU.7z
11.21.2015.PHI.at.MIA.7z
11.21.2015.SAC.at.ORL.7z
11.21.2015.WAS.at.DET.7z
11.22.2015.BOS.at.BKN.7z
11.22.2015.DAL.at.OKC.7z
11.22.2015.GSW.at.DEN.7z
11.22.2015.PHX.at.NOP.7z
11.22.2015.POR.at.LAL.7z
11.22.2015.TOR.at.LAC.7z
11.23.2015.DET.at.MIL.7z
11.23.2015.NYK.at.MIA.7z
11.23.2015.OKC.at.UTA.7z
11.23.2015.ORL.at.CLE.7z
11.23.2015.PHI.at.MIN.7z
11.23.2015.PHX.at.SAS.7z
11.23.2015.SAC.at.CHA.7z
11.24.2015.BOS.at.ATL.7z
11.24.2015.CHI.at.POR.7z
11.24.2015.DAL.at.MEM.7z
11.24.2015.IND.at.WAS.7z
11.24.2015.LAC.at.DEN.7z
11.24.2015.LAL.at.GSW.7z
11.25.2015.ATL.at.MIN.7z
11.25.2015.BKN.at.OKC.7z
11.25.2015.CLE.at.TOR.7z
11.25.2015.DAL.at.SAS.7z
11.25.2015.MEM.at.HOU.7z
11.25.2015.MIA.at.DET.7z
11.25.2015.NOP.at.PHX.7z
11.25.2015.NYK.at.ORL.7z
11.25.2015.PHI.at.BOS.7z
11.25.2015.SAC.at.MIL.7z
11.25.2015.UTA.at.LAC.7z
11.25.2015.WAS.at.CHA.7z
11.27.2015.ATL.at.MEM.7z
11.27.2015.CHI.at.IND.7z
11.27.2015.CLE.at.CHA.7z
11.27.2015.DET.at.OKC.7z
11.27.2015.GSW.at.PHX.7z
11.27.2015.MIA.at.NYK.7z
11.27.2015.MIL.at.ORL.7z
11.27.2015.MIN.at.SAC.7z
11.27.2015.PHI.at.HOU.7z
11.27.2015.SAS.at.DEN.7z
11.27.2015.WAS.at.BOS.7z
11.28.2015.ATL.at.SAS.7z
11.28.2015.DEN.at.DAL.7z
11.28.2015.LAL.at.POR.7z
11.28.2015.NOP.at.UTA.7z
11.28.2015.SAC.at.GSW.7z
11.28.2015.TOR.at.WAS.7z
11.29.2015.BOS.at.ORL.7z
11.29.2015.DET.at.BKN.7z
11.29.2015.HOU.at.NYK.7z
11.29.2015.IND.at.LAL.7z
11.29.2015.MIL.at.CHA.7z
11.29.2015.MIN.at.LAC.7z
11.29.2015.PHI.at.MEM.7z
11.29.2015.PHX.at.TOR.7z
11.30.2015.BOS.at.MIA.7z
11.30.2015.DAL.at.SAC.7z
11.30.2015.DEN.at.MIL.7z
11.30.2015.GSW.at.UTA.7z
11.30.2015.HOU.at.DET.7z
11.30.2015.OKC.at.ATL.7z
11.30.2015.POR.at.LAC.7z
11.30.2015.SAS.at.CHI.7z
12.01.2015.DAL.at.POR.7z
12.01.2015.LAL.at.PHI.7z
12.01.2015.MEM.at.NOP.7z
12.01.2015.ORL.at.MIN.7z
12.01.2015.PHX.at.BKN.7z
12.01.2015.WAS.at.CLE.7z
12.02.2015.DEN.at.CHI.7z
12.02.2015.GSW.at.CHA.7z
12.02.2015.LAL.at.WAS.7z
12.02.2015.MIL.at.SAS.7z
12.02.2015.NOP.at.HOU.7z
12.02.2015.PHI.at.NYK.7z
12.02.2015.PHX.at.DET.7z
12.02.2015.TOR.at.ATL.7z
12.03.2015.DEN.at.TOR.7z
12.03.2015.IND.at.POR.7z
12.03.2015.OKC.at.MIA.7z
12.03.2015.ORL.at.UTA.7z
12.03.2015.SAS.at.MEM.7z
12.04.2015.BKN.at.NYK.7z
12.04.2015.CLE.at.NOP.7z
12.04.2015.HOU.at.DAL.7z
12.04.2015.LAL.at.ATL.7z
12.04.2015.MIL.at.DET.7z
12.04.2015.PHX.at.WAS.7z
12.05.2015.BOS.at.SAS.7z
12.05.2015.CHA.at.CHI.7z
12.05.2015.CLE.at.MIA.7z
12.05.2015.DEN.at.PHI.7z
12.05.2015.GSW.at.TOR.7z
12.05.2015.IND.at.UTA.7z
12.05.2015.NYK.at.MIL.7z
12.05.2015.ORL.at.LAC.7z
12.05.2015.POR.at.MIN.7z
12.05.2015.SAC.at.HOU.7z
12.06.2015.DAL.at.WAS.7z
12.06.2015.GSW.at.BKN.7z
12.06.2015.LAL.at.DET.7z
12.06.2015.PHX.at.MEM.7z
12.06.2015.SAC.at.OKC.7z
12.07.2015.BOS.at.NOP.7z
12.07.2015.DAL.at.NYK.7z
12.07.2015.DET.at.CHA.7z
12.07.2015.LAC.at.MIN.7z
12.07.2015.LAL.at.TOR.7z
12.07.2015.PHX.at.CHI.7z
12.07.2015.POR.at.MIL.7z
12.07.2015.SAS.at.PHI.7z
12.07.2015.WAS.at.MIA.7z
12.08.2015.GSW.at.IND.7z
12.08.2015.HOU.at.BKN.7z
12.08.2015.OKC.at.MEM.7z
12.08.2015.ORL.at.DEN.7z
12.08.2015.POR.at.CLE.7z
12.08.2015.UTA.at.SAC.7z
12.09.2015.ATL.at.DAL.7z
12.09.2015.CHI.at.BOS.7z
12.09.2015.HOU.at.WAS.7z
12.09.2015.LAC.at.MIL.7z
12.09.2015.LAL.at.MIN.7z
12.09.2015.MEM.at.DET.7z
12.09.2015.MIA.at.CHA.7z
12.09.2015.NYK.at.UTA.7z
12.09.2015.ORL.at.PHX.7z
12.09.2015.SAS.at.TOR.7z
12.10.2015.ATL.at.OKC.7z
12.10.2015.LAC.at.CHI.7z
12.10.2015.NYK.at.SAC.7z
12.10.2015.PHI.at.BKN.7z
12.11.2015.CHA.at.MEM.7z
12.11.2015.CLE.at.ORL.7z
12.11.2015.DET.at.PHI.7z
12.11.2015.GSW.at.BOS.7z
12.11.2015.LAL.at.SAS.7z
12.11.2015.MIA.at.IND.7z
12.11.2015.MIL.at.TOR.7z
12.11.2015.MIN.at.DEN.7z
12.11.2015.OKC.at.UTA.7z
12.11.2015.POR.at.PHX.7z
12.11.2015.WAS.at.NOP.7z
12.12.2015.BOS.at.CHA.7z
12.12.2015.GSW.at.MIL.7z
12.12.2015.IND.at.DET.7z
12.12.2015.LAC.at.BKN.7z
12.12.2015.LAL.at.HOU.7z
12.12.2015.NOP.at.CHI.7z
12.12.2015.NYK.at.POR.7z
12.12.2015.SAS.at.ATL.7z
12.12.2015.WAS.at.DAL.7z
12.13.2015.MEM.at.MIA.7z
12.13.2015.MIN.at.PHX.7z
12.13.2015.PHI.at.TOR.7z
12.13.2015.UTA.at.OKC.7z
12.14.2015.HOU.at.DEN.7z
12.14.2015.LAC.at.DET.7z
12.14.2015.MIA.at.ATL.7z
12.14.2015.NOP.at.POR.7z
12.14.2015.ORL.at.BKN.7z
12.14.2015.PHI.at.CHI.7z
12.14.2015.PHX.at.DAL.7z
12.14.2015.TOR.at.IND.7z
12.14.2015.UTA.at.SAS.7z
12.14.2015.WAS.at.MEM.7z
12.15.2015.CLE.at.BOS.7z
12.15.2015.DEN.at.MIN.7z
12.15.2015.HOU.at.SAC.7z
12.15.2015.MIL.at.LAL.7z
12.16.2015.BOS.at.DET.7z
12.16.2015.CHA.at.ORL.7z
12.16.2015.DAL.at.IND.7z
12.16.2015.MEM.at.CHI.7z
12.16.2015.MIA.at.BKN.7z
12.16.2015.MIL.at.LAC.7z
12.16.2015.MIN.at.NYK.7z
12.16.2015.NOP.at.UTA.7z
12.16.2015.PHI.at.ATL.7z
12.16.2015.PHX.at.GSW.7z
12.16.2015.POR.at.OKC.7z
12.16.2015.WAS.at.SAS.7z
12.17.2015.HOU.at.LAL.7z
12.17.2015.OKC.at.CLE.7z
12.17.2015.TOR.at.CHA.7z
12.18.2015.ATL.at.BOS.7z
12.18.2015.BKN.at.IND.7z
12.18.2015.DEN.at.UTA.7z
12.18.2015.DET.at.CHI.7z
12.18.2015.LAC.at.SAS.7z
12.18.2015.MEM.at.DAL.7z
12.18.2015.MIL.at.GSW.7z
12.18.2015.NOP.at.PHX.7z
12.18.2015.NYK.at.PHI.7z
12.18.2015.POR.at.ORL.7z
12.18.2015.SAC.at.MIN.7z
12.18.2015.TOR.at.MIA.7z
12.19.2015.CHA.at.WAS.7z
12.19.2015.CHI.at.NYK.7z
12.19.2015.IND.at.MEM.7z
12.19.2015.LAC.at.HOU.7z
12.19.2015.LAL.at.OKC.7z
12.20.2015.ATL.at.ORL.7z
12.20.2015.MIL.at.PHX.7z
12.20.2015.MIN.at.BKN.7z
12.20.2015.NOP.at.DEN.7z
12.20.2015.PHI.at.CLE.7z
12.20.2015.POR.at.MIA.7z
12.20.2015.SAC.at.TOR.7z
12.21.2015.BKN.at.CHI.7z
12.21.2015.CHA.at.HOU.7z
12.21.2015.IND.at.SAS.7z
12.21.2015.MIN.at.BOS.7z
12.21.2015.OKC.at.LAC.7z
12.21.2015.ORL.at.NYK.7z
12.21.2015.PHX.at.UTA.7z
12.21.2015.POR.at.ATL.7z
12.21.2015.SAC.at.WAS.7z
12.22.2015.DAL.at.TOR.7z
12.22.2015.DET.at.MIA.7z
12.22.2015.LAL.at.DEN.7z
12.22.2015.MEM.at.PHI.7z
12.23.2015.BOS.at.CHA.7z
12.23.2015.DAL.at.BKN.7z
12.23.2015.DEN.at.PHX.7z
12.23.2015.DET.at.ATL.7z
12.23.2015.HOU.at.ORL.7z
12.23.2015.MEM.at.WAS.7z
12.23.2015.NYK.at.CLE.7z
12.23.2015.OKC.at.LAL.7z
12.23.2015.PHI.at.MIL.7z
12.23.2015.POR.at.NOP.7z
12.23.2015.SAC.at.IND.7z
12.23.2015.SAS.at.MIN.7z
12.23.2015.UTA.at.GSW.7z
12.25.2015.CHI.at.OKC.7z
12.25.2015.CLE.at.GSW.7z
12.25.2015.LAC.at.LAL.7z
12.25.2015.NOP.at.MIA.7z
12.25.2015.SAS.at.HOU.7z
12.26.2015.BOS.at.DET.7z
12.26.2015.CHI.at.DAL.7z
12.26.2015.CLE.at.POR.7z
12.26.2015.DEN.at.SAS.7z
12.26.2015.HOU.at.NOP.7z
12.26.2015.IND.at.MIN.7z
12.26.2015.LAC.at.UTA.7z
12.26.2015.MEM.at.CHA.7z
12.26.2015.MIA.at.ORL.7z
12.26.2015.NYK.at.ATL.7z
12.26.2015.PHI.at.PHX.7z
12.26.2015.TOR.at.MIL.7z
12.26.2015.WAS.at.BKN.7z
12.27.2015.DEN.at.OKC.7z
12.27.2015.LAL.at.MEM.7z
12.27.2015.NYK.at.BOS.7z
12.27.2015.POR.at.SAC.7z
12.28.2015.ATL.at.IND.7z
12.28.2015.BKN.at.MIA.7z
12.28.2015.CLE.at.PHX.7z
12.28.2015.LAC.at.WAS.7z
12.28.2015.LAL.at.CHA.7z
12.28.2015.MIL.at.DAL.7z
12.28.2015.MIN.at.SAS.7z
12.28.2015.NOP.at.ORL.7z
12.28.2015.PHI.at.UTA.7z
12.28.2015.SAC.at.GSW.7z
12.28.2015.TOR.at.CHI.7z
12.29.2015.ATL.at.HOU.7z
12.29.2015.CLE.at.DEN.7z
12.29.2015.DET.at.NYK.7z
12.29.2015.MIA.at.MEM.7z
12.29.2015.MIL.at.OKC.7z
12.30.2015.BKN.at.ORL.7z
12.30.2015.DEN.at.POR.7z
12.30.2015.GSW.at.DAL.7z
12.30.2015.IND.at.CHI.7z
12.30.2015.LAC.at.CHA.7z
12.30.2015.LAL.at.BOS.7z
12.30.2015.PHI.at.SAC.7z
12.30.2015.PHX.at.SAS.7z
12.30.2015.UTA.at.MIN.7z
12.30.2015.WAS.at.TOR.7z
12.31.2015.GSW.at.HOU.7z
12.31.2015.LAC.at.NOP.7z
12.31.2015.MIL.at.IND.7z
12.31.2015.MIN.at.DET.7z
12.31.2015.PHX.at.OKC.7z
12.31.2015.POR.at.UTA.7z


================================================
FILE: game/game.py
================================================
"""
Library for retrieving basketball player-tracking and play-by-play data.
"""

import matplotlib
matplotlib.use('TkAgg')

import os
import warnings
import json
from subprocess import Popen, PIPE
import pandas as pd
import matplotlib.pyplot as plt
from matplotlib.patches import Circle, Rectangle, Arc, Polygon
import numpy as np
import seaborn as sns
from scipy.spatial import ConvexHull

# Initialize project: create the scratch directory if it is missing
if not os.path.isdir('temp'):
    os.makedirs('temp')
datalink = None
curl_request = None


class Game(object):
    """
    Class for basketball game.
    Contains play by play and player tracking data and methods for
    analysis and plotting.
    """

    def __init__(self, date, team1, team2):
        """
        Args:
            date (str): 'MM.DD.YYYY', date of game
            team1 (str): 'XXX', abbreviation of the home team as it
                appears in the tracking data file name
            team2 (str): 'XXX', abbreviation of the away team as it
                appears in the tracking data file name

        Attributes:
            date (str): 'MM.DD.YYYY', date of game
            team1 (str): 'XXX', abbreviation of the home team as it
                appears in the tracking data file name
            team2 (str): 'XXX', abbreviation of the away team as it
                appears in the tracking data file name
            tracking_id (str): id to access player tracking data
                Due to the way the SportVU data is stored, tracking_id
                is complicated: 'MM.DD.YYYY.AWAYTEAM.at.HOMETEAM'
                For Example: 01.13.2016.GSW.at.DEN
            tracking_data (dict): Dictionary of unstructured tracking
                data scraped from github.
            game_id (str): ID for game.  Luckily, SportVU and play by
                play use the same game ID
            pbp (pd.DataFrame): Play by play data.  33 columns per pbp
                instance.
            moments (pd.DataFrame): DataFrame of player tracking data.
                Each entry is a single snap-shot of where the players
                are at a given time on the court.
                Columns: ['quarter', 'universe_time', 'quarter_time',
                'shot_clock', 'positions', 'game_time'].
                moments['positions'] contains a list of where each player
                and the ball are located.
            player_ids (dict): dictionary of {player: player_id} for
                all players in game.
            away_id (int): ID of away team
            home_id (int): ID of home team
            team_colors (dict): dictionary of colors for each team and
                ball. Used for plotting.
            home_team (str): 'XXX', abbreviation of home team
            away_team (str): 'XXX', abbreviation of away team
        """
        self.date = date
        self.team1 = team1
        self.team2 = team2
        self.flip_direction = False
        self.tracking_id = ('{self.date}.{self.team2}.at.{self.team1}'
                            .format(self=self))
        self.tracking_data = None
        self.game_id = None
        self.pbp = None
        self.moments = None
        self.player_ids = None
        self._get_tracking_data()
        self._get_playbyplay_data()
        self._format_tracking_data()
        self._get_player_ids()
        self.away_id = self.tracking_data['events'][0]['visitor']['teamid']
        self.home_id = self.tracking_data['events'][0]['home']['teamid']
        self.team_colors = {-1: "orange",
                            self.away_id: "blue",
                            self.home_id: "red"}
        self.home_team = (self.tracking_data['events'][0]['home']
                          ['abbreviation'])
        self.away_team = (self.tracking_data['events'][0]['visitor']
                          ['abbreviation'])
        self.flip_direction = False
        self._determine_direction()
        print('All data is loaded')

    def _get_tracking_data(self):
        """
        Helper function for retrieving tracking data
        Tracking Data is provided by NBA.com,
        hosted at: https://www.github.com/neilmj
        """
        # Retrieve and extract data into /temp folder

        os.system(("curl {datalink} -o temp/zipdata"
                   .format(datalink=datalink)))
        os.system("7za -o./temp x temp/zipdata")
        os.remove("./temp/zipdata")

        # Extract game ID from extracted file name.
        for file in os.listdir('./temp'):
            if os.path.splitext(file)[1] == '.json':
                self.game_id = file[:-5]

        # Load tracking data and remove json file
        with open('temp/{self.game_id}.json'.format(self=self)) as data_file:
            self.tracking_data = json.load(data_file)  # Load this json
        os.remove('./temp/{self.game_id}.json'.format(self=self))
        return self

    def _get_playbyplay_data(self):
        """
        Helper function for retrieving play-by-play data.
        Play-by-play data is obtained via API call to NBA.com
        This service is likely to go down at any moment and ruin this
        whole project.
        """
        os.system(curl_request)
        # load play by play into pandas DataFrame
        with open(("{cwd}/temp/pbp_{self.game_id}.json"
                   .format(cwd=os.getcwd(), self=self))) as json_file:
            parsed = json.load(json_file)['resultSets'][0]
        os.remove(("{cwd}/temp/pbp_{self.game_id}.json"
                   .format(cwd=os.getcwd(), self=self)))
        self.pbp = pd.DataFrame(parsed['rowSet'])
        self.pbp.columns = parsed['headers']

        # Get time remaining in quarter to cross-reference tracking data
        self.pbp['Qmin'] = (self.pbp['PCTIMESTRING'].str
                            .split(':', expand=True)[0])
        self.pbp['Qsec'] = (self.pbp['PCTIMESTRING'].str
                            .split(':', expand=True)[1])
        self.pbp['Qtime'] = (self.pbp['Qmin'].astype(int)*60 +
                             self.pbp['Qsec'].astype(int))
        self.pbp['game_time'] = ((self.pbp['PERIOD'] - 1) * 720 +
                                 (720 - self.pbp['Qtime']))

        # Format score so that it makes sense: 'XX-XX'
        self.pbp['SCORE'] = (self.pbp['SCORE']
                             .ffill()
                             .fillna('0 - 0'))
        return self

    def _get_player_ids(self):
        """
        Helper function for returning player ids for all players in game.
        Note: This data may also be somewhere more conveniently
            accessible in tracking_data.
        """
        ids = {}
        for index, row in self.pbp.iterrows():
            if row['PLAYER1_NAME'] not in ids:
                ids[row['PLAYER1_NAME']] = row['PLAYER1_ID']
            if row['PLAYER2_NAME'] not in ids:
                ids[row['PLAYER2_NAME']] = row['PLAYER2_ID']
            if row['PLAYER3_NAME'] not in ids:
                ids[row['PLAYER3_NAME']] = row['PLAYER3_ID']
        # Drop the placeholder entry for empty player slots, if present
        ids.pop(None, None)
        self.player_ids = ids
        return self

    def _format_tracking_data(self):
        """
        Helper function to format tracking data into pandas DataFrame
        """
        events = pd.DataFrame(self.tracking_data['events'])
        moments = []
        # Extract 'moments': Each moment is an individual frame
        for row in events['moments']:
            for inner_row in row:
                moments.append(inner_row)
        moments = pd.DataFrame(moments)
        moments = moments.drop_duplicates(subset=[1])
        moments = moments.reset_index()

        moments.columns = ['index', 'quarter', 'universe_time', 'quarter_time',
                           'shot_clock', 'unknown', 'positions']
        moments['game_time'] = (moments.quarter - 1) * 720 + \
                               (720 - moments.quarter_time)
        moments.drop(['index', 'unknown'], axis=1, inplace=True)
        self.moments = moments
        return self

    def _draw_court(self, color="gray", lw=2, grid=False, zorder=0):
        """
        Helper function to draw court.
        Modified from Savvas Tjortjoglou with contribution
            from Michael Wheelock
        S. Tjortjoglou: http://savvastjortjoglou.com/nba-shot-sharts.html
        M. Wheelock: https://www.linkedin.com/in/michael-s-wheelock-a5635a66
        """
        ax = plt.gca()

        # Create the court lines
        outer = Rectangle((0, -50), width=94, height=50, color=color,
                          zorder=zorder, fill=False, lw=lw)
        l_hoop = Circle((5.35, -25), radius=.75, lw=lw, fill=False,
                        color=color, zorder=zorder)
        r_hoop = Circle((88.65, -25), radius=.75, lw=lw, fill=False,
                        color=color, zorder=zorder)
        l_backboard = Rectangle((4, -28), 0, 6, lw=lw, color=color,
                                zorder=zorder)
        r_backboard = Rectangle((90, -28), 0, 6, lw=lw, color=color,
                                zorder=zorder)
        l_outer_box = Rectangle((0, -33), 19, 16, lw=lw, fill=False,
                                color=color, zorder=zorder)
        l_inner_box = Rectangle((0, -31), 19, 12, lw=lw, fill=False,
                                color=color, zorder=zorder)
        r_outer_box = Rectangle((75, -33), 19, 16, lw=lw, fill=False,
                                color=color, zorder=zorder)
        r_inner_box = Rectangle((75, -31), 19, 12, lw=lw, fill=False,
                                color=color, zorder=zorder)
        l_free_throw = Circle((19, -25), radius=6, lw=lw, fill=False,
                              color=color, zorder=zorder)
        r_free_throw = Circle((75, -25), radius=6, lw=lw, fill=False,
                              color=color, zorder=zorder)
        l_corner_a = Rectangle((0, -3), 14, 0, lw=lw, color=color,
                               zorder=zorder)
        l_corner_b = Rectangle((0, -47), 14, 0, lw=lw, color=color,
                               zorder=zorder)
        r_corner_a = Rectangle((80, -3), 14, 0, lw=lw, color=color,
                               zorder=zorder)
        r_corner_b = Rectangle((80, -47), 14, 0, lw=lw, color=color,
                               zorder=zorder)
        l_arc = Arc((5, -25), 47.5, 47.5, theta1=292, theta2=68, lw=lw,
                    color=color, zorder=zorder)
        r_arc = Arc((89, -25), 47.5, 47.5, theta1=112, theta2=248,
                    lw=lw, color=color, zorder=zorder)
        half_court = Rectangle((47, -50), 0, 50, lw=lw, color=color,
                               zorder=zorder)
        hc_big_circle = Circle((47, -25), radius=6, lw=lw, fill=False,
                               color=color, zorder=zorder)
        hc_sm_circle = Circle((47, -25), radius=2, lw=lw, fill=False,
                              color=color, zorder=zorder)
        court_elements = [l_hoop, l_backboard, l_outer_box, outer,
                          l_inner_box, l_free_throw, l_corner_a,
                          l_corner_b, l_arc, r_hoop, r_backboard,
                          r_outer_box, r_inner_box, r_free_throw,
                          r_corner_a, r_corner_b, r_arc, half_court,
                          hc_big_circle, hc_sm_circle]

        # Add the court elements onto the axes
        for element in court_elements:
            ax.add_patch(element)

        return ax

    def watch_play(self, game_time, length, highlight_player=None,
                   commentary=True, show_spacing=None):
        """
        DEPRECATED.  See animate_play() for a similar (faster) method

        Method for viewing plays in game.
        Outputs video file of play in {cwd}/temp

        Args:
            game_time (int): time in game to start video
                (seconds into the game).
                Currently game_time can also be a tuple of length
                two with (starting_frame, ending_frame) if you want
                to watch a play using frames instead of game time.
            length (int): length of play to watch (seconds)
            highlight_player (str): If not None, video will highlight
                the circle of the given player for easy tracking.
            commentary (bool): Whether to include play-by-play
                commentary underneath video
            show_spacing (str in ['home', 'away']): show convex hull
                of home or away team.
                if None, does not display any convex hull

        Returns: an instance of self, and outputs video file of play
        """
        warnings.warn(("watch_play is extremely slow. "
                       "Use animate_play for similar functionality, "
                       "but greater efficiency"))

        if isinstance(game_time, tuple):
            starting_frame = game_time[0]
            ending_frame = game_time[1]
        else:
            # Get starting and ending frame from requested game_time and length
            starting_frame = self.moments[self.moments.game_time.round() ==
                                          game_time].index.values[0]
            ending_frame = self.moments[self.moments.game_time.round() ==
                                        game_time + length].index.values[0]

        # Make video of each frame
        for frame in range(starting_frame, ending_frame):
            self.plot_frame(frame, highlight_player=highlight_player,
                            commentary=commentary, show_spacing=show_spacing)
        command = ('ffmpeg -framerate 20 -start_number {starting_frame} '
                   '-i %d.png -c:v libx264 -r 30 -pix_fmt yuv420p -vf '
                   '"scale=trunc(iw/2)*2:trunc(ih/2)*2" {starting_frame}'
                   '.mp4').format(starting_frame=starting_frame)
        os.chdir('temp')
        os.system(command)
        os.chdir('..')

        # Delete images
        for file in os.listdir('./temp'):
            if os.path.splitext(file)[1] == '.png':
                os.remove('./temp/{file}'.format(file=file))

        return self

    def watch_player_actions(self, player_name, action, length=15, max_vids=5):
        """
        Method for viewing all plays a player in the game had of a
        specified type.
        For example: all of Damian Lillard's FG attempts in the game
        Outputs video file for each play in {cwd}/temp

        Args:
            player_name (str): Name of player for which to produce videos.
                Currently, player_name must be perfectly formatted and
                capitalized, since no string processing is performed.
            action (str) {'all_FG', 'made_FG', 'miss_FG', 'rebound'}:
                Action type of interest
            length (int): length of play to watch (seconds) for each action.
            max_vids (int): Maximum number of videos to produce.
                max_vids=None if all videos are desired.  If max_vids
                is less than the total number of actions in the game, the
                earliest actions are made into videos.

        Returns: an instance of self, and outputs video file of plays
        """
        player_action_times = self._get_player_actions(player_name, action)
        for index, time in enumerate(player_action_times):
            if index == max_vids:
                break
            self.watch_play(time-length, length,
                            highlight_player=player_name,
                            commentary=False)
        return self

    def _get_commentary(self, game_time, commentary_length=6,
                        commentary_depth=10):
        """
        Helper function for returning play by play events for a
        given game time.

        Args:
            game_time (int): game time (in seconds) for which to
                retrieve commentary for
            commentary_length (int): Number of play-by-play calls to
                include in commentary
            commentary_depth (int): Number of seconds to look in past
                to retrieve play-by-play calls
                commentary_depth=10 looks at previous 10 seconds of
                game for play-by-play calls

        Returns: tuple of information (commentary_script, score)
            commentary_script (str): string of commentary
                Most recent play-by-play calls, separated by line breaks
            score (str): Score at current time 'XX - XX'
        """
        commentary = [' ' for _ in range(commentary_length)]
        commentary[0] = '.'
        count = 0
        score = "0 - 0"
        for game_second in range(game_time - commentary_depth, game_time + 2):
            for index, row in self.pbp[self.pbp.game_time ==
                                       game_second].iterrows():
                calls = []
                if row['HOMEDESCRIPTION']:
                    calls.append('{self.home_team}: '.format(self=self) +
                                 str(row['HOMEDESCRIPTION']))
                if row['VISITORDESCRIPTION']:
                    calls.append('{self.away_team}: '.format(self=self) +
                                 str(row['VISITORDESCRIPTION']))
                if row['NEUTRALDESCRIPTION']:
                    calls.append(str(row['NEUTRALDESCRIPTION']))
                # Never write past the end of the commentary list
                for call in calls:
                    if count < commentary_length:
                        commentary[count] = call
                        count += 1
                score = str(row['SCORE'])
        commentary_script = """{commentary[0]}
                                \n{commentary[1]}
                                \n{commentary[2]}
                                \n{commentary[3]}
                                \n{commentary[4]}
                                \n{commentary[5]}
                                """.format(commentary=commentary)
        return (commentary_script, score)

    def _get_player_actions(self, player_name, action):
        """
        Helper function to get all times a player performed a specific action

        Args:
            player_name (str): name of player to get all actions for
            action {'all_FG', 'made_FG', 'miss_FG', 'rebound'}:
                Type of action to get all times for.

        Returns:
            times (list): list of game times at which the player
                performed the specified action
        """
        player_id = self.player_ids[player_name]
        action_dict = {'all_FG': [1, 2], 'made_FG': [1],
                       'miss_FG': [2], 'rebound': [4]}
        action_df = self.pbp[(self.pbp['PLAYER1_ID'] == player_id) &
                             (self.pbp['EVENTMSGTYPE']
                              .isin(action_dict[action]))]
        times = list(action_df['game_time'])
        return times

    def _get_moment_details(self, frame_number, highlight_player=None):
        """
        Helper function for getting important information for a given frame

        Args:
            frame_number (int): Frame in game to retrieve data for
                frame_number is the row label used to look up player
                    tracking data in self.moments
            highlight_player (str): Name of player to be highlighted
                in downstream plotting.
                if None, no player is highlighted.

        Returns: tuple of data
            game_time (int): seconds into game of current moment
            x_pos (list): list of x coordinates for all players and ball
            y_pos (list): list of y coordinates for all players and ball
            colors (list): color coding of each player/ball for coordinate data
            sizes (list): size of each player/ball
                (used for showing ball height)
            quarter (int): Game quarter
            shot_clock (str): shot clock
            game_clock (str): game clock
            edges (list): list of marker edge sizes of each player for video.
                useful when trying to highlight a player by making
                their edge thicker.
            universe_time (int): Time in the universe, in msec
        """
        current_moment = self.moments.loc[frame_number]
        game_time = int(np.round(current_moment['game_time']))
        universe_time = int(current_moment['universe_time'])
        x_pos, y_pos, colors, sizes, edges = [], [], [], [], []
        # Get player positions
        for player in current_moment.positions:
            x_pos.append(player[2])
            y_pos.append(player[3])
            colors.append(self.team_colors[player[0]])
            # Use ball height for size (useful to see a shot)
            if player[0] == -1:
                sizes.append(max(150 - 2*(player[4] - 5)**2, 10))
            else:
                sizes.append(200)
            # highlight_player makes their outline much thicker on the video
            if (highlight_player and
                    player[1] == self.player_ids[highlight_player]):
                edges.append(5)
            else:
                edges.append(0.5)
        # Unfortunately, the plot is below the y axis,
        # so the y positions need to be corrected
        y_pos = np.array(y_pos) - 50
        shot_clock = current_moment.shot_clock
        if np.isnan(shot_clock):
            shot_clock = 24.00
        shot_clock = str(shot_clock).split('.')[0]
        game_min, game_sec = divmod(current_moment.quarter_time, 60)
        game_clock = "%02d:%02d" % (game_min, game_sec)
        quarter = current_moment.quarter
        return (game_time, x_pos, y_pos, colors, sizes, quarter,
                shot_clock, game_clock, edges, universe_time)

    def plot_frame(self, frame_number, highlight_player=None,
                   commentary=True, show_spacing=False,
                   plot_spacing=False, pipe=None):
        """
        Creates an individual frame of the game.
        Outputs .png file in {cwd}/temp

        Args:
            frame_number (int): number of frame in game to create
                frame_number is the row label used to look up player
                tracking data in self.moments
            highlight_player (str): Name of player to highlight
                (by making their outline thicker).
                if None, no player is highlighted
            commentary (bool): if True, add play-by-play commentary
                under frame
            show_spacing (str in ['home', 'away']): show convex hull
                of home or away team
                if None, does not display any convex hull
            pipe (subprocess.Popen): Popen object with an open pipe
                to send the image to.  If None, the image is written
                to disk instead of being sent to the pipe.

        Returns: an instance of self, and outputs .png file of frame
            If pipe, ARGB values are sent to pipe object instead of
            writing to disk.

        TODO be able to call this method by game time instead of frame_number
        """
        (game_time, x_pos, y_pos, colors, sizes,
         quarter, shot_clock, game_clock, edges,
         universe_time) = self._get_moment_details(frame_number,
                                                   highlight_player=highlight_player)
        (commentary_script, score) = self._get_commentary(game_time)
        fig = plt.figure(figsize=(12, 6))
        self._draw_court()
        frame = plt.gca()
        frame.axes.get_xaxis().set_ticks([])
        frame.axes.get_yaxis().set_ticks([])
        plt.scatter(x_pos, y_pos, c=colors, s=sizes, alpha=0.85,
                    linewidths=edges)
        plt.xlim(-5, 100)
        plt.ylim(-55, 5)
        sns.set_style('dark')
        if commentary:
            plt.figtext(0.23, -.6, commentary_script, size=20)
        plt.figtext(0.43, 0.125, shot_clock, size=18)
        plt.figtext(0.5, 0.125, 'Q'+str(quarter), size=18)
        plt.figtext(0.57, 0.125, str(game_clock), size=18)
        plt.figtext(0.43, .85,
                    self.away_team + "  " + score + "  " + self.home_team,
                    size=18)
        if highlight_player:
            plt.figtext(0.17, 0.85, highlight_player, size=18)
        # Add team color indicators to top of frame
        plt.scatter([30, 67], [2.5, 2.5], s=100,
                    c=[self.team_colors[self.away_id],
                       self.team_colors[self.home_id]])
        if show_spacing:
            # Show convex hull on frame
            xy_pos = np.column_stack((np.array(x_pos), np.array(y_pos)))
            if show_spacing == 'home':
                points = xy_pos[1:6, :]
            if show_spacing == 'away':
                points = xy_pos[6:, :]
            hull = ConvexHull(points)
            hull_points = points[hull.vertices, :]
            polygon = Polygon(hull_points, alpha=0.3, color='gray')
            ax = plt.gca()
            ax.add_patch(polygon)
        if pipe:
            # Write ARGB values to pipe
            fig.canvas.draw()
            string = fig.canvas.tostring_argb()
            pipe.stdin.write(string)
            plt.close()
            if commentary:
                fig = plt.figure(figsize=(12, 6))
                plt.figtext(.2, .4, commentary_script, size=20)
                fig.canvas.draw()
                string = fig.canvas.tostring_argb()
                pipe.stdin.write(string)
            plt.close()

        else:
            # Save image to disk
            plt.savefig('temp/{frame_number}.png'
                        .format(frame_number=frame_number),
                        bbox_inches='tight')
            plt.close()
        return self

    def _in_formation(self, frame_number):
        """
        This is a complicated method to explain, but it is actually
        very simple.
        It determines if the game is in a set offense/defense.
        It basically returns True if a normal play is being run,
        and False if the game is in transition, out of bounds,
        free throw, etc.  It is useful for analyzing plays that teams
        run, and discarding all extraneous times from the game.
        """
        # Get relevant moment details
        details = self._get_moment_details(frame_number)
        x_pos = np.array(details[1])
        shot_clock = details[6]
        # Determine if offense/defense is set
        if float(shot_clock) < 23:
            if (x_pos < 47).all() or (x_pos > 47).all():
                return True
        return False

    def get_spacing_area(self, frame_number):
        """
        Calculates convex hull of home and away team for a given frame.
        Useful for analyzing the spacing of teams.

        Args:
            frame_number (int): number of frame in game to calculate
                team convex hulls

        Returns: tuple of data (home_area, away_area)
            home_area (float): convex hull area of home team
            away_area (float): convex hull area of away team

        """
        details = self._get_moment_details(frame_number)
        x_pos = np.array(details[1])
        y_pos = np.array(details[2])
        xy_pos = np.column_stack((x_pos, y_pos))
        home_area = ConvexHull(xy_pos[1:6, :]).area
        away_area = ConvexHull(xy_pos[6:, :]).area
        return (home_area, away_area)

    def get_offensive_team(self, frame_number):
        """
        Determines which team is on offense.
        Currently only works if team is in set offense or defense.

        Args:
            frame_number (int): number of frame in game to determine
                offensive team

        Returns:
            str in ['home', 'away']
        """
        details = self._get_moment_details(frame_number)
        x_pos = np.array(details[1])
        quarter = details[5]
        if len(x_pos) != 11:
            return None
        if self.flip_direction:
            if (x_pos < 47).all() and quarter in [1, 2]:
                return 'away'
            if (x_pos > 47).all() and quarter in [3, 4]:
                return 'away'
            if (x_pos < 47).all() and quarter in [3, 4]:
                return 'home'
            if (x_pos > 47).all() and quarter in [1, 2]:
                return 'home'
        if (x_pos < 47).all() and quarter in [1, 2]:
            return 'home'
        if (x_pos > 47).all() and quarter in [3, 4]:
            return 'home'
        if (x_pos < 47).all() and quarter in [3, 4]:
            return 'away'
        if (x_pos > 47).all() and quarter in [1, 2]:
            return 'away'
        return None

    def _determine_direction(self):
        """
        Helper function to determine which direction the home team is going.
        Surprisingly, this is not consistent and depends on the game.
        Currently, this method detects which side the players start on and is
        ~90% accurate
        """
        incorrect_count = 0
        correct_count = 0
        for frame in range(0, 10000, 100):
            details = self._get_moment_details(frame)
            home_team_x = details[1][1:6]
            away_team_x = details[1][6:]
            if np.mean(home_team_x) < np.mean(away_team_x):
                incorrect_count += 1
            else:
                correct_count += 1
        if incorrect_count > correct_count:
            self.flip_direction = True
        return None

    def get_frame(self, game_time):
        """
        Converts a game time to a frame number.  Useful all over the place.

        Args:
            game_time (int): game time in seconds of interest

        Returns:
            frame (int): frame number of game time
        """
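        rounded_times = self.moments.game_time.round()
        test_time = game_time
        # Step back one second at a time until a frame is found.
        # game_time is never negative, so stop at zero instead of
        # looping forever when no earlier frame exists.
        while test_time >= 0:
            frames = self.moments[rounded_times == test_time].index.values
            if len(frames) > 0:
                return frames[0]
            test_time -= 1
        raise ValueError('No frame found at or before game_time {}'
                         .format(game_time))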

    def get_play_frames(self, event_num, play_type='offense'):
        """
        Args:
            event_num (int): EVENTNUM of interest in games.pbp
                NOTE: Check pbpevents.txt for event numbers
            play_type (str in ['offense', 'defense']): Team of interest
                is offense or defense

        Returns:
            tuple of (start_frame (int), end_frame (int)): first and
                last frame numbers of the play of interest, or None
                if no frame with the event's team on offense is found
        """
        play_index = self.pbp[self.pbp['EVENTNUM'] == event_num].index[0]
        event_team = str(self.pbp[self.pbp['EVENTNUM'] == event_num]
                         .PLAYER1_TEAM_ABBREVIATION.head(1).values[0])
        if event_team == self.home_team:
            target_team = 'home'
        if event_team == self.away_team:
            target_team = 'away'
        end_time = int(self.pbp[self.pbp['EVENTNUM'] == event_num].game_time)
        # To find lower bound on starting frame of the play,
        # determining when previous play ended
        putative_start_time = int(self.pbp.loc[play_index - 1].game_time)
        putative_start_frame = self.get_frame(putative_start_time)
        end_frame = self.get_frame(end_time)
        for test_frame in range(putative_start_frame, end_frame):
            if self.get_offensive_team(test_frame) == target_team:
                break
        # If the previous loop never found an offensive play,
        # the function returns None
        else:
            return None
        # Add two seconds to game time to let the players settle into position
        settled_time = round(self.moments.loc[test_frame].game_time + 2)
        start_frame = self.get_frame(settled_time)
        return (start_frame, end_frame)

    def animate_play(self, game_time, length, highlight_player=None,
                     commentary=True, show_spacing=None):
        """
        Method for animating plays in game.
        Outputs video file of play in {cwd}/temp.
        Individual frames are streamed directly to ffmpeg without writing them
        to the disk, which is a great speed improvement over watch_play

        Args:
            game_time (int): time in game to start video
                (seconds into the game).
                Currently game_time can also be a tuple of length two
                with (starting_frame, ending_frame) if you want to
                watch a play using frames instead of game time.
            length (int): length of play to watch (seconds)
            highlight_player (str): If not None, video will highlight
                the circle of the given player for easy tracking.
            commentary (bool): Whether to include play-by-play commentary in
                the animation
            show_spacing (str) in ['home', 'away']: show convex hull
                spacing of home or away team.
                If None, does not show spacing.

        Returns: an instance of self, and outputs video file of play
        """
        if isinstance(game_time, tuple):
            starting_frame = game_time[0]
            ending_frame = game_time[1]
        else:
            # Get starting and ending frame from requested game_time and length
            starting_frame = self.moments[self.moments.game_time.round() ==
                                          game_time].index.values[0]
            ending_frame = self.moments[self.moments.game_time.round() ==
                                        game_time + length].index.values[0]

        # Make video of each frame
        filename = "./temp/{game_time}.mp4".format(game_time=game_time)
        if commentary:
            size = (960, 960)
        else:
            size = (960, 480)
        cmdstring = ('ffmpeg',
                     '-y', '-r', '20',  # fps
                     '-s', '%dx%d' % size,  # size of image string
                     '-pix_fmt', 'argb',  # Stream argb data from matplotlib
                     '-f', 'rawvideo',  '-i', '-',
                     '-vcodec', 'libx264', filename)

        # Stream plots to pipe
        pipe = Popen(cmdstring, stdin=PIPE)
        for frame in range(starting_frame, ending_frame):
            self.plot_frame(frame, highlight_player=highlight_player,
                            commentary=commentary, show_spacing=show_spacing,
                            pipe=pipe)
        pipe.stdin.close()
        pipe.wait()
        return self


================================================
FILE: game/pbpevents.txt
================================================
Description of Play-by-play ‘EVENTMSGTYPE’

1: Made FG
2: Miss FG
3: FT Attempt
4: Rebound
5: Turnover
6: Foul
7: Lane Violation (?)
8: Substitution 
9: Timeout
10: Jump Ball
11: (?)
12: Quarter Start
13: Quarter End
14: (?)
15: (?)
16: (?)
17: (?)
18: (?)
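
The codes above can be used as a simple lookup when reading the play-by-play table. The snippet below is only a sketch: the names EVENT_TYPES and describe_event are hypothetical and are not part of game.py, and the codes marked (?) above are deliberately left out rather than guessed.

```python
# Hypothetical lookup built from the list above.
# Codes whose meaning is marked (?) are omitted on purpose.
EVENT_TYPES = {
    1: 'Made FG',
    2: 'Miss FG',
    3: 'FT Attempt',
    4: 'Rebound',
    5: 'Turnover',
    6: 'Foul',
    8: 'Substitution',
    9: 'Timeout',
    10: 'Jump Ball',
    12: 'Quarter Start',
    13: 'Quarter End',
}


def describe_event(code):
    """Return a readable label for an EVENTMSGTYPE code."""
    return EVENT_TYPES.get(code, 'Unknown ({})'.format(code))
```

For example, describe_event(4) gives the label for a rebound, and any code not in the table is reported as unknown instead of raising an error.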

================================================
FILE: game/scrape_games.py
================================================
"""
Quick script to get all games in the database and save to text file.
"""

from bs4 import BeautifulSoup
try:
    from urllib.request import urlopen  # Python 3
except ImportError:
    from urllib2 import urlopen  # Python 2


def scrape():
    page = urlopen(('https://github.com/sealneaward/'
                    'nba-movement-data/tree/master/data')).read()
    # Name the parser explicitly so results do not depend on what is installed
    soup = BeautifulSoup(page, 'html.parser')
    with open('allgames.txt', 'w') as f:
        for anchor in soup.findAll('a', class_="js-navigation-open"):
            if anchor.text.endswith('.7z') and len(anchor.text) == 24:
                f.write(anchor.text + '\n')
    return


if __name__ == '__main__':
    scrape()


================================================
FILE: game/spacing_analysis.py
================================================
"""
Scripts for analyzing spacing of NBA tracking data.

The workhorse statistic for spacing is "Convex Hull"
"""

import os
import pickle
import pandas as pd
import numpy as np
import seaborn as sns
import matplotlib.pyplot as plt
from sklearn import linear_model
from game import Game


def extract_games():
    """
    Extract games from allgames.txt

    Returns:
        list: list of games.  Each element of the list is
            [date, home_team, away_team]
            example element: ['01.01.2016', 'TOR', 'CHI']
    """

    games = []
    with open('allgames.txt', 'r') as game_file:
        for line in game_file:
            game = line.split('.')
            date = "{game[0]}.{game[1]}.{game[2]}".format(game=game)
            away = game[3]
            home = game[5]
            games.append([date, home, away])
    return games


def get_spacing_statistics(date, home_team, away_team, write_file=False,
                           write_score=False, write_game=False):
    """
    Calculates spacing statistics for each frame in game

    Args:
        date (str): date of game in form 'MM.DD.YYYY'.  Example: '01.01.2016'
        home_team (str): home team in form 'XXX'. Example: 'TOR'
        away_team (str): away team in form 'XXX'. Example: 'CHI'
        write_file (bool): If True, write pickle file of spacing
            statistics into data/spacing directory
        write_score (bool): If True, write pickle file of game score
            into data/score directory
        write_game (bool): If True, write pickle file of tracking data
            into data/game directory
            Note: This file is ~100MB.

    Returns:
        tuple: tuple of data (home_offense_areas, home_defense_areas,
               away_offense_areas, away_defense_areas), where each
               element of the tuple is a list of convex hull areas
               for each frame in the game.
    """
    filename = ("{date}-{away_team}-"
                "{home_team}.p").format(date=date,
                                        away_team=away_team,
                                        home_team=home_team)
    # Do not recalculate spacing data if already saved to disk
    if filename in os.listdir('./data/spacing'):
        return
    game = Game(date, home_team, away_team)
    # Write game data to disk
    if write_game:
        pickle.dump(game, open('data/game/' + filename, "wb"))
    home_offense_areas, home_defense_areas = [], []
    away_offense_areas, away_defense_areas = [], []
    print(date, home_team, away_team)
    for frame in range(len(game.moments)):
        offensive_team = game.get_offensive_team(frame)
        if offensive_team:
            home_area, away_area = game.get_spacing_area(frame)
            if offensive_team == 'home':
                home_offense_areas.append(home_area)
                away_defense_areas.append(away_area)
            if offensive_team == 'away':
                home_defense_areas.append(home_area)
                away_offense_areas.append(away_area)
    results = (home_offense_areas, home_defense_areas,
               away_offense_areas, away_defense_areas)
    # Write spacing data to disk
    if write_file:
        with open('data/spacing/' + filename, "wb") as spacing_file:
            pickle.dump(results, spacing_file)
    # Write game scores to disk
    if write_score:
        # filename already ends in '.p', so it is used as-is
        score = game.pbp['SCORE'].iloc[-1]
        with open('data/score/' + filename, "wb") as score_file:
            pickle.dump(score, score_file)

    return results


def write_spacing(gamelist):
    """
    Writes all spacing statistics to data/spacing directory for each game
    """
    for game in gamelist:
        try:
            get_spacing_statistics(game[0], game[1], game[2],
                                   write_file=True, write_score=True)
        except Exception:
            with open('errorlog.txt', 'a') as myfile:
                myfile.write("{game} Could not extract spacing data\n"
                             .format(game=game))


def plot_spacing(date, home_team, away_team, defense=True, save_plot=False):
    """
    Plots team's spacing distribution in a game.

    Args:
        date (str): date of game in form 'MM.DD.YYYY'.  Example: '01.01.2016'
        home_team (str): home team in form 'XXX'. Example: 'TOR'
        away_team (str): away team in form 'XXX'. Example: 'CHI'
        defense (bool): if True, plot defensive spacing.
            if False, plot offensive spacing
        save_plot (bool): if True, save plot to temp/ directory

    Returns: None
        Also, shows plt.hist of team spacing during game.
        If spacing data is not saved in data/spacing, nothing is plotted.

    """
    # Spacing data is pickled with a '.p' extension
    # (see get_spacing_statistics())
    filename = ("{date}-{away_team}-"
                "{home_team}.p").format(date=date, away_team=away_team,
                                        home_team=home_team)
    if filename in os.listdir('data/spacing'):
        with open("data/spacing/" + filename, "rb") as spacing_file:
            data = pickle.load(spacing_file)
    else:
        return None
    plt.figure()
    if defense:
        plt.hist(data[1], bins=100, alpha=0.4, label=home_team)
        plt.hist(data[3], bins=100, alpha=0.4, label=away_team)
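    else:
        plt.hist(data[0], bins=100, alpha=0.4, label=home_team)
        # Away offensive spacing is the third element of the tuple
        plt.hist(data[2], bins=100, alpha=0.4, label=away_team)
    plt.xlim(20, 100)
    plt.legend(loc='upper right')
    # Save before showing; the figure may be cleared once plt.show() returns
    if save_plot:
        plt.savefig('temp/spacing{date}.png'.format(date=date))
    plt.show()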
    return None


def get_spacing_details(game):
    """
    Calculates mean spacing for game.

    Args:
        game (list): game in form [date, home_team, away_team]
            Example: ['01.01.2016', 'TOR', 'CHI']

    Returns: tuple of data  (home_points, away_points, home_offense_areas,
        home_defense_areas, away_offense_areas, away_defense_areas)

        home_points (int): Points scored by home team
        away_points (int): Points scored by away team
        home_offense_area (float): Average spacing (sq ft) of home
            team while on offense
        home_defense_area (float): Average spacing (sq ft) of home
            team while on defense
        away_offense_area (float): Average spacing (sq ft) of away
            team while on offense
        away_defense_area (float): Average spacing (sq ft) of away
            team while on defense

        If game not saved in data/spacing directory, returns None

    """

    fname = "{game[0]}-{game[2]}-{game[1]}.p".format(game=game)
    if (fname in os.listdir('data/spacing') and
            fname in os.listdir('data/score')):
        data = pickle.load(open("data/spacing/"+fname, "rb"))
        score = pickle.load(open("data/score/"+fname, "rb")).split(' ')
        away_points, home_points = score[0], score[2]
        means = tuple(map(np.mean, data))
        return (int(home_points), int(away_points), *means)
    else:
        return None


def get_spacing_df(gamelist):
    """
    Organizes spacing data from all games into a DataFrame

    Args:
        gamelist (list): list of games where each element
            [date, home_team, away_team]
            example element: ['01.01.2016', 'TOR', 'CHI']
    Returns: pd.DataFrame
        DataFrame of spacing data with columns: ['home_points', 'away_points',
            'home_offense_areas', 'home_defense_areas', 'away_offense_areas',
            'away_defense_areas', 'away_team', 'home_team', 'space_dif',
            'home_win']
        within DataFrame:
            home_win (int): 1 if home team won, -1 if lost
            space_dif (float): difference (sq ft) between away team's
                defensive spacing and home team's defensive spacing
    """
    details = []
    for game in gamelist:
        detail = get_spacing_details(game)
        if detail:
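            # game is (date, home_team, away_team); append the teams in the
            # order of the column names below (away_team, then home_team)
            details.append((*detail, game[2], game[1]))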
    df = pd.DataFrame(details)
    df.columns = ['home_points', 'away_points', 'home_offense_areas',
                  'home_defense_areas', 'away_offense_areas',
                  'away_defense_areas', 'away_team', 'home_team']
    df['space_dif'] = df.away_defense_areas - df.home_defense_areas
    df['home_win'] = np.sign(df.home_points - df.away_points)
    df = df[df.home_offense_areas > 80]
    return df


def plot_offense_vs_defense_spacing(spacing_data):
    """
    Plot of offensive vs. defensive spacing for games

    Args:
        spacing_data (pd.DataFrame): Dataframe with columns of spacing data
            ['home_offense_areas', 'home_defense_areas',
             'away_offense_areas', 'away_defense_areas']

    Returns None
        Also, saves plot to temp/OffenseVsDefense.png
    """
    # Pass x and y by keyword; newer seaborn releases reject positional data
    sns.regplot(x=spacing_data.away_offense_areas,
                y=spacing_data.home_defense_areas,
                fit_reg=True, color=sns.color_palette()[0],
                ci=None)
    sns.regplot(x=spacing_data.home_offense_areas,
                y=spacing_data.away_defense_areas,
                fit_reg=False, color=sns.color_palette()[0],
                ci=None)
    plt.xlabel('Average Offensive Spacing (sq ft)', fontsize=16)
    plt.ylabel('Average Defensive Spacing (sq ft)', fontsize=16)
    plt.title('Offensive spacing robustly induces defensive spacing',
              fontsize=16)
    plt.savefig('temp/OffenseVsDefense.png')
    plt.close()
    return None


def plot_defense_spacing_vs_score(spacing_data):
    """
    Plot of team's defensive spacing vs score differential for games

    Args:
        spacing_data (pd.DataFrame): Dataframe with columns of spacing data
            ['home_offense_areas', 'home_defense_areas',
             'away_offense_areas', 'away_defense_areas']

    Returns None
        Also, saves plot to temp/SpacingVsScore.png
    """
    y = spacing_data.home_points - spacing_data.away_points
    x = spacing_data.away_defense_areas - spacing_data.home_defense_areas
    sns.regplot(x=x, y=y, ci=None)
    plt.xlabel('Home Team Defensive Spacing Differential (sq ft)',
               fontsize=16)
    plt.ylabel('Home Team Score Differential (pts)', fontsize=16)
    plt.title('Spacing the defense correlates with outscoring opponents',
              fontsize=16)
    plt.savefig('temp/SpacingVsScore.png')
    plt.close()


def plot_defense_spacing_vs_wins(spacing_data):
    """
    Plot of team's defensive spacing vs wins (binary: 0, 1) for games

    Args:
        spacing_data (pd.DataFrame): Dataframe with columns of spacing data
            ['home_offense_areas', 'home_defense_areas',
             'away_offense_areas', 'away_defense_areas']

    Returns None
        Also, saves plot to temp/SpacingVsWins.png
    """
    clf = linear_model.LogisticRegression(C=1)
    X = np.array(spacing_data.space_dif)
    X = X[:, np.newaxis]
    y = np.array(spacing_data.home_win)
    y_adjusted = (y+1) / 2
    clf.fit(X, y)
    plt.scatter(X.ravel(), y_adjusted, color=sns.color_palette()[0],
                s=600, alpha=1, marker='|')
    plt.xlim(-10, 10)
    X_test = np.linspace(-10, 10, 300)
    X_test = X_test[:, np.newaxis]

    def model(x):
        return 1 / (1 + np.exp(-x))

    # Evaluate the fitted logistic curve, P(home win), over the test range
    log_fit = model(X_test * clf.coef_ + clf.intercept_).ravel()
    plt.scatter(X_test.ravel(), log_fit)
    plt.xlabel('Home Team Defensive Spacing Differential (sq ft)', fontsize=16)
    plt.ylabel('Home Team Win', fontsize=16)
    plt.title('Spacing the Defense Correlates with winning', fontsize=16)
    plt.savefig('temp/SpacingVsWins.png')
    plt.close()


def plot_team_defensive_spacing(spacing_data):
    """
    Plot of team's defensive spacing (bar graph)

    Args:
        spacing_data (pd.DataFrame): Dataframe with columns of spacing data
            ['home_offense_areas', 'home_defense_areas',
             'away_offense_areas', 'away_defense_areas']

    Returns None
        Also, saves plot to temp/DefensiveSpacing.png
    """
    df = pd.DataFrame()
    df['home'] = spacing_data.groupby('home_team')['away_defense_areas'].sum()
    df['home_count'] = spacing_data.groupby('home_team')['away_defense_areas'].count()
    df['away'] = spacing_data.groupby('away_team')['home_defense_areas'].sum()
    df['away_count'] = spacing_data.groupby('away_team')['home_defense_areas'].count()
    df['average_induced_space'] = (df.home + df.away) / (df.away_count + df.home_count)
    df['average_induced_space'].sort_values().plot(kind='bar', color=sns.color_palette()[0])
    plt.xlabel('', fontsize=16)
    plt.ylabel("Opponent's Defensive Spacing (sq ft)", fontsize=16)
    plt.ylim(60, 70)
    plt.title("Team's ability to space the defense", fontsize=18)
    plt.savefig('temp/DefensiveSpacing.png')
    plt.close()


def plot_teams_ability_to_space_defense(spacing_data):
    """
    Plots teams ability to space defense given their offensive spacing
        (scatter plot)

    Args:
        spacing_data (pd.DataFrame): Dataframe with columns of spacing data
            ['home_offense_areas', 'home_defense_areas',
             'away_offense_areas', 'away_defense_areas']

    Returns None
        Also, saves plot to temp/Spacing_scatter.png
    """
    df = spacing_data.groupby('home_team').count()
    df['home'] = spacing_data.groupby('home_team')['away_defense_areas'].sum()
    df['home_count'] = spacing_data.groupby('home_team')['away_defense_areas'].count()
    df['away'] = spacing_data.groupby('away_team')['home_defense_areas'].sum()
    df['away_count'] = spacing_data.groupby('away_team')['home_defense_areas'].count()
    df['average_induced_space'] = (df.home + df.away) / (df.away_count + df.home_count)

    df['home_offense'] = spacing_data.groupby('home_team')['home_offense_areas'].sum()
    df['home_offense_count'] = spacing_data.groupby('home_team')['home_offense_areas'].count()
    df['away_offense'] = spacing_data.groupby('away_team')['away_offense_areas'].sum()
    df['away_offense_count'] = spacing_data.groupby('away_team')['away_offense_areas'].count()
    df['average_offense_space'] = (df.home_offense +
                                   df.away_offense) / (df.away_offense_count +
                                                       df.home_offense_count)
    plt.scatter(df['average_induced_space'],
                df['average_offense_space'],
                s=74, alpha=0.7,
                c=sns.color_palette()[0])
    for team, row in df.iterrows():
        if team in ['DEN', 'SAS', 'LAC', 'CLE', 'DET', 'WAS', 'TOR',
                    'MIL', 'ORL', 'DAL']:
            plt.annotate(team,
                         xy=[row['average_induced_space'] - 0.15,
                             row['average_offense_space'] + 0.1])
    # Axis labels follow the scatter call above: x is the opponent's
    # (induced) defensive spacing, y is the team's own offensive spacing
    plt.xlabel("Average Opponent's Defensive Spacing (sq ft)", fontsize=16)
    plt.ylabel('Average Offensive Spacing (sq ft)', fontsize=16)
    plt.title("Team's ability to space opponent's defense", fontsize=16)
    plt.savefig('temp/Spacing_scatter.png')
    plt.close()


if __name__ == "__main__":
    """
    Calls functions to generate plots.  Comment out plots you do not need.
    If spacing data has not been calculated, uncomment
    'write_spacing(all_games)', which will calculate the spacing data for
    all games and save it to disk.
    """

    all_games = extract_games()
    # Uncomment if writing spacing first time
    # write_spacing(all_games)
    spacing_data = get_spacing_df(all_games)
    plot_offense_vs_defense_spacing(spacing_data)
    plot_defense_spacing_vs_score(spacing_data)
    plot_defense_spacing_vs_wins(spacing_data)
    plot_team_defensive_spacing(spacing_data)
    plot_teams_ability_to_space_defense(spacing_data)


================================================
FILE: game/velocity_analysis.py
================================================
"""
Analysis of NBA player velocities.
"""

import os
import pickle
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import pandas as pd
from game import Game

# Initialize Project
# os.chdir() does not expand '~', so expand it explicitly
os.chdir(os.path.expanduser('~/Desktop/Personal/SportVU/NBA-player-movement'))


def extract_games():
    """
    Extract games from allgames.txt

    Returns:
        list: list of games.  Each element in the list is a tuple
            (date, home_team, away_team)
        example element: ('01.01.2016', 'TOR', 'CHI')
    """

    games = []
    with open('allgames.txt', 'r') as game_file:
        for line in game_file:
            game = line.split('.')
            date = "{game[0]}.{game[1]}.{game[2]}".format(game=game)
            away = game[3]
            home = game[5]
            games.append((date, home, away))
    return games


def calculate_velocities(game, frame, highlight_player=None):
    """
    Calculates team or player velocity for a frame in a game

    Args:
        game (Game): Game instance to get data from
        frame (int): number of frame in game to calculate velocities
            frame gets player tracking data from moments.ix[frame]
        highlight_player (str): Name of player to calculate velocity of.
            if None, cumulative team velocities are calculated.

    Returns: tuple of data (game_time, home_velocity, away_velocity)
        game_time (int): universe time of the frame
        home_velocity (float): cumulative velocity (ft/msec) of home team
        away_velocity (float): cumulative velocity (ft/msec) of away team

        If highlight_player is given and is on the court, the tuple is
        (game_time, player_velocity) instead
    """
    details = game._get_moment_details(frame,
                                       highlight_player=highlight_player)
    game_time = details[9]

    # Highlighted player's edge value (details[8]) is 5 instead of 0.5
    # Use this fact to retrieve the index of the player of interest
    if highlight_player:
        if 5 in details[8]:
            player_index = details[8].index(5)
        else:
            highlight_player = None

    # The first frame has no previous frame, so velocity is reported as 0
    if frame == 0:
        if highlight_player:
            return (game_time, 0)
        return (game_time, 0, 0)

    previous_details = game._get_moment_details(frame - 1)

    # If not all the players are on the court, there is an error in the data
    if len(details[1]) != 11 or \
       len(details[2]) != 11 or \
       len(previous_details[1]) != 11 or \
       len(previous_details[2]) != 11:
        return (game_time, 0, 0)

    delta_x = np.array(details[1]) - np.array(previous_details[1])
    delta_y = np.array(details[2]) - np.array(previous_details[2])
    # Straight-line distance (ft) each entity moved between the two frames
    distance_traveled = np.hypot(delta_x, delta_y)
    delta_time = details[9] - previous_details[9]
    # Note, universe time is in msec
    velocity = distance_traveled / delta_time
    if highlight_player:
        return (game_time, velocity[player_index])
    home_velocity = sum(velocity[1:6])
    away_velocity = sum(velocity[6:])
    return (game_time, home_velocity, away_velocity)
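

# Illustrative sketch (not called elsewhere in this module): the per-player
# speed computed in calculate_velocities() is the straight-line distance
# moved between two consecutive frames divided by the elapsed universe time.
# The helper name, its arguments and the ft/sec conversion are assumptions
# made for this example only.
def example_speeds_ft_per_sec(previous_xy, current_xy, delta_msec):
    """
    Sketch: speed (ft/sec) of each tracked entity between two frames.

    Args:
        previous_xy (list): (x, y) positions (ft) in the earlier frame
        current_xy (list): (x, y) positions (ft) in the later frame
        delta_msec (float): elapsed time between the frames (msec)

    Returns:
        list: one speed per entity, in ft/sec
    """
    speeds = []
    for (x_0, y_0), (x_1, y_1) in zip(previous_xy, current_xy):
        distance = ((x_1 - x_0) ** 2 + (y_1 - y_0) ** 2) ** 0.5
        # 1000 msec per second converts ft/msec to ft/sec
        speeds.append(distance / delta_msec * 1000)
    return speeds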


def plot_velocity_frame(game, frame_number, ax, highlight_player=None):
    """
    Draws an individual frame of the game on the given axes.

    Args:
        game (Game): Game instance to get data from
        frame_number (int): number of frame in game to create
            frame_number gets player tracking data from
            moments.ix[frame_number]
        ax (matplotlib.axes.Axes): axes on which the players are drawn
        highlight_player (str): Name of player to highlight (by making
            their outline thicker).
            if None, no player is highlighted

    Returns: None
        Draws the frame on ax.  See README.md for example
    """
    (game_time, x_pos, y_pos, colors, sizes,
     quarter, shot_clock, game_clock, edges,
     universe_time) = game._get_moment_details(frame_number,
                                               highlight_player=highlight_player)
    (commentary_script, score) = game._get_commentary(game_time)

    game._draw_court()
    frame = plt.gca()
    frame.axes.get_xaxis().set_ticks([])
    frame.axes.get_yaxis().set_ticks([])
    ax.scatter(x_pos, y_pos, c=colors, s=sizes, alpha=0.85, linewidths=edges)
    plt.xlim(-5, 100)
    plt.ylim(-55, 5)
    sns.set_style('dark')
    plt.figtext(0.43, 0.105, shot_clock, size=18)
    plt.figtext(0.5, 0.105, 'Q'+str(quarter), size=18)
    plt.figtext(0.57, 0.105, str(game_clock), size=18)
    plt.figtext(0.43, .442,
                game.away_team + "  " + score + "  " + game.home_team,
                size=18)

    # Add team color indicators to top of frame
    ax.scatter([30, 67], [2.5, 2.5], s=100,
               c=[game.team_colors[game.away_id],
                  game.team_colors[game.home_id]])


def watch_play_velocities(game, game_time, length, highlight_player=None):
    """
    Creates a movie of a play which includes a plot of the
        real-time velocities.

    Args:
        game (Game): Game instance to get data from
        game_time (int): time in game to start video (seconds into the game).
        length (int): length of play to watch (seconds)
        highlight_player (str): If not None, video will highlight the circle of
            the given player for easy tracking, and also display that
            player's velocity

    Returns: None and outputs video file of play with
        velocity plot. See README.md for example
    """
    starting_frames = game.moments[game.moments.game_time.round() ==
                                   game_time]
    starting_frame = starting_frames.index.values[0]
    ending_frames = game.moments[game.moments.game_time.round() ==
                                 game_time + length]
    ending_frame = ending_frames.index.values[0]

    indices = list(range(ending_frame - starting_frame))

    if highlight_player:
        player_velocities = [calculate_velocities(game, frame,
                                                  highlight_player=highlight_player)[1]
                             for frame in range(starting_frame, ending_frame)]
        max_velocity = max(player_velocities)
    else:
        # Compute each frame's velocities once and split them by team
        team_velocities = [calculate_velocities(game, frame)
                           for frame in range(starting_frame, ending_frame)]
        home_velocities = [velocity[1] for velocity in team_velocities]
        away_velocities = [velocity[2] for velocity in team_velocities]
        all_velocities = home_velocities + away_velocities
        max_velocity = max(all_velocities)

    # Plot each frame
    for index, frame in enumerate(range(starting_frame, ending_frame)):
        f, (ax1, ax2) = plt.subplots(2, figsize=(12, 12))
        plot_velocity_frame(game, frame, ax=ax2,
                            highlight_player=highlight_player)
        ax1.set_xlim([0, len(indices)])
        ax1.set_ylim([0, max_velocity * 1.2])
        if highlight_player:
            ax1.plot(indices[:index+1], player_velocities[:index+1],
                     c='black', label=highlight_player)
        else:
            ax1.plot(indices[:index+1], home_velocities[:index+1],
                     c=game.team_colors[game.home_id], label=game.home_team)
            ax1.plot(indices[:index+1], away_velocities[:index+1],
                     c=game.team_colors[game.away_id], label=game.away_team)
        ax1.set_yticklabels([])
        ax1.set_xticklabels([])
        ax1.set_ylabel('Velocity', fontsize=22)
        if highlight_player:
            ax1.set_title(highlight_player, fontsize=24)
        else:
            ax1.legend(fontsize=18)
        plt.savefig('temp/' + str(index) + '.png')
        plt.close()

    # Make video of each frame
    command = ('ffmpeg -framerate 20 -start_number 0 -i %d.png -c:v '
               'libx264 -r 30 -pix_fmt yuv420p -vf '
               '"scale=trunc(iw/2)*2:trunc(ih/2)*2" {starting_frame}'
               '.mp4').format(starting_frame=starting_frame)
    os.chdir('temp')
    os.system(command)
    os.chdir('..')

    # Delete images
    for file in os.listdir('./temp'):
        if os.path.splitext(file)[1] == '.png':
            os.remove('./temp/{file}'.format(file=file))

    return


def get_velocity_statistics(date, home_team, away_team, write_file=False,
                            write_score=False, write_game=False):
    """
    Calculates velocity statistics for each frame in game

    Args:
        date (str): date of game in form 'MM.DD.YYYY'.  Example: '01.01.2016'
        home_team (str): home team in form 'XXX'. Example: 'TOR'
        away_team (str): away team in form 'XXX'. Example: 'CHI'
        write_file (bool): If True, write pickle file of velocity
            statistics into data/velocity directory
        write_score (bool): If True, write pickle file of game score
            into data/score directory
        write_game (bool): If True, write pickle file of tracking data
            into data/game directory
            Note: This file is ~100MB.

    Returns:
        tuple: tuple of data (home_offense_velocities, home_defense_velocities,
               away_offense_velocities, away_defense_velocities), where each
               element of the tuple is a list of tuples
               (frame, game_time, velocity) for each frame in the game.
    """
    filename = ("{date}-{away_team}-"
                "{home_team}.p").format(date=date, away_team=away_team,
                                        home_team=home_team)
    # Do not recalculate velocity data if already saved to disk
    if filename in os.listdir('./data/velocity/'):
        return
    game = Game(date, home_team, away_team)
    # Write game data to disk
    if write_game:
        pickle.dump(game, open('data/game/' + filename, "wb"))
    home_offense_velocities, home_defense_velocities = [], []
    away_offense_velocities, away_defense_velocities = [], []
    print(date, home_team, away_team)
    for frame in range(1, len(game.moments)):
        offensive_team = game.get_offensive_team(frame)
        if offensive_team:
            (game_time, home_velocity,
             away_velocity) = calculate_velocities(game, frame)
            if offensive_team == 'home':
                home_offense_velocities.append((frame, game_time,
                                                home_velocity))
                away_defense_velocities.append((frame, game_time,
                                                away_velocity))
            if offensive_team == 'away':
                home_defense_velocities.append((frame, game_time,
                                                home_velocity))
                away_offense_velocities.append((frame, game_time,
                                                away_velocity))
    results = (home_offense_velocities, home_defense_velocities,
               away_offense_velocities, away_defense_velocities)
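    # Build the extension-less base name once so that both writes below
    # work regardless of which flags are set
    filename = ("{date}-{away_team}-"
                "{home_team}").format(date=date,
                                      away_team=away_team,
                                      home_team=home_team)
    # Write velocity data to disk
    if write_file:
        with open('data/velocity/' + filename + '.p', "wb") as velocity_file:
            pickle.dump(results, velocity_file)
    # Write game scores to disk
    if write_score:
        # Final score is the last row of the play-by-play data
        score = game.pbp['SCORE'].iloc[-1]
        with open('data/score/' + filename + '.p', "wb") as score_file:
            pickle.dump(score, score_file)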

    return (home_offense_velocities, home_defense_velocities,
            away_offense_velocities, away_defense_velocities)


def write_velocity(gamelist):
    """
    Writes all velocity statistics to data/velocity directory for each game
    """
    for game in gamelist:
        try:
            get_velocity_statistics(game[0], game[1], game[2],
                                    write_file=True, write_score=True)
        except Exception:
            with open('errorlog_velocity.txt', 'a') as myfile:
                myfile.write("{game} Could not extract velocity data\n"
                             .format(game=game))


def extract_velocity(gamelist):
    """
    Loads velocity data, calculates average offensive and defensive
        velocity for each game in gamelist
        Note: requires velocity data to be written for each game in
        data/velocity and data/score (see get_velocity_statistics())

    Args:
        gamelist (list):  list of games.  Each element in the list is a tuple
            (date, home_team, away_team).
            example element: ('01.01.2016', 'TOR', 'CHI')

    Returns (pd.DataFrame): Dataframe of velocity data with columns:
        0: Home Offensive Velocity
        1: Away Offensive Velocity
        2: Home Defensive Velocity
        3: Away Defensive Velocity
        4: Away Score
        5: Home Score
        6: Away Team
        7: Home Team
    """
    data = []
    for game in gamelist:
        away_team = game[2]
        home_team = game[1]
        print(away_team, home_team)
        filename = ("{date}-{away_team}-"
                    "{home_team}").format(date=game[0],
                                          away_team=away_team,
                                          home_team=home_team)

        # Load velocity/score data
        try:
            velocity_data = pickle.load(open('data/velocity/'
                                             + filename + '.p',
                                             'rb'))
            score_data = pickle.load(open('data/score/'
                                          + filename + '.p',
                                          'rb'))
        except Exception:
            print('velocity data not written for: ', game)
            continue

        away_score, home_score = extract_scores(score_data)
        # Organize velocity data by team and offense/defense
        HOV = pd.DataFrame(velocity_data[0])
        HDV = pd.DataFrame(velocity_data[1])
        AOV = pd.DataFrame(velocity_data[2])
        ADV = pd.DataFrame(velocity_data[3])

        # Cut out erroneous velocity data
        # This is due to Frame-skipping in the SVU data
        # For example, from the last frame of a quarter to the
        # first frame of the next quarter, etc.
        HOV = HOV[HOV[2] < 0.15]
        AOV = AOV[AOV[2] < 0.15]
        HDV = HDV[HDV[2] < 0.15]
        ADV = ADV[ADV[2] < 0.15]

        game_data = (HOV[2].mean(), AOV[2].mean(), HDV[2].mean(),
                     ADV[2].mean(), away_score, home_score, away_team,
                     home_team)
        data.append(game_data)
    return pd.DataFrame(data)


def extract_fatigue(gamelist):
    """
    Loads velocity data, calculates average offensive and defensive
        velocity for each quarter for each game in gamelist
        Note: requires velocity data to be written for each game in
        data/velocity and data/score (see get_velocity_statistics())

    Args:
        gamelist (list):  list of games.  Each element in the list is a tuple
            (date, home_team, away_team).
            example element: ('01.01.2016', 'TOR', 'CHI')

    Returns (pd.DataFrame): Dataframe of velocity data with columns:
        Tm: team
        Pos: Offense or Defense
        1: 1st Quarter Mean Velocity
        2: 2nd Quarter Mean Velocity
        3: 3rd Quarter Mean Velocity
        4: 4th Quarter Mean Velocity
    """
    data = []
    for game in gamelist:
        away_team = game[2]
        home_team = game[1]
        print(away_team, home_team)
        filename = ("{date}-{away_team}-"
                    "{home_team}").format(date=game[0],
                                          away_team=away_team,
                                          home_team=home_team)

        # Load velocity/score data
        try:
            velocity_data = pickle.load(open('data/velocity/'
                                             + filename + '.p',
                                             'rb'))
            score_data = pickle.load(open('data/score/'
                                          + filename + '.p',
                                          'rb'))
        except Exception:
            print('velocity data not written for: ', game)
            continue

        away_score, home_score = extract_scores(score_data)
        # Organize velocity data by team and offense/defense
        HOV = pd.DataFrame(velocity_data[0])
        HDV = pd.DataFrame(velocity_data[1])
        AOV = pd.DataFrame(velocity_data[2])
        ADV = pd.DataFrame(velocity_data[3])

        # Cut out erroneous velocity data
        # This is due to Frame-skipping in the SVU data
        # For example, from the last frame of a quarter to the
        # first frame of the next quarter, etc.
        HOV = HOV[HOV[2] < 0.15]
        AOV = AOV[AOV[2] < 0.15]
        HDV = HDV[HDV[2] < 0.15]
        ADV = ADV[ADV[2] < 0.15]

        # Approximate quarters by cutting each table's frames into four
        # equal slices.  Each table is sliced by its own length, since the
        # four tables do not contain the same number of frames.
        quarter_velocities = {}
        for quarter in [1, 2, 3, 4]:
            quarter_means = []
            for table in (HOV, HDV, AOV, ADV):
                starting_frame = int(len(table)/4 * (quarter-1))
                ending_frame = int(len(table)/4 * quarter)
                quarter_means.append(table.iloc[starting_frame:
                                                ending_frame][2].mean())
            quarter_velocities[quarter] = quarter_means
        df = pd.DataFrame(quarter_velocities)
        df['Tm'] = [home_team, home_team, away_team, away_team]
        df['Pos'] = ['Off', 'Def', 'Off', 'Def']

        game_data = (df, away_score, home_score, away_team, home_team)
        data.append(game_data)
    df = pd.DataFrame()
    for i in range(len(data)):
        df = pd.concat((df, data[i][0]))
    df = pd.melt(df, ['Tm', 'Pos'], [1, 2, 3, 4])
    return df
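

# Illustrative sketch (not called elsewhere in this module): extract_fatigue()
# approximates quarters by cutting a game's frames into four equal slices and
# averaging each slice.  This plain-Python helper shows that slicing on a
# list of numbers; its name is hypothetical and it is not tied to the real
# quarter boundaries of a game.
def example_quarter_means(values):
    """
    Sketch: mean of each of four equal slices of values.

    Args:
        values (list): numbers ordered by frame

    Returns:
        list: four means, or None for a slice that contains no values
    """
    quarter_means = []
    for quarter in [1, 2, 3, 4]:
        starting_index = int(len(values) / 4 * (quarter - 1))
        ending_index = int(len(values) / 4 * quarter)
        chunk = values[starting_index:ending_index]
        quarter_means.append(sum(chunk) / len(chunk) if chunk else None)
    return quarter_means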


def velocity_plots(df):
    """
    Makes bar plots of mean offensive and defensive velocity for each team

    Args:
        df (pd.DataFrame): dataframe of velocity data
            Note: use extract_velocity() to obtain this data

    Returns:
        None
        Saves plots to examples/
    """

    # Organize velocity data
    home = df[[0, 2, 5, 7]]
    away = df[[1, 3, 4, 6]]
    home.columns = ['Off', 'Def', 'Pts', 'Tm']
    away.columns = ['Off', 'Def', 'Pts', 'Tm']
    all_dat = pd.concat((home, away))
    ave = all_dat.groupby('Tm').mean()

    # Plot of offense velocity by team
    plt.figure()
    sns.barplot(x='Tm', y='Off', data=all_dat,
                order=ave.sort_values('Off').index,
                color=sns.xkcd_rgb["pale red"])
    plt.ylim(0.022, 0.03)
    locs, labels = plt.xticks()
    plt.setp(labels, rotation=90)
    locs, labels = plt.yticks()
    plt.yticks(locs, ["%.1f" % x for x in locs*1000])
    plt.ylabel('Mean Offensive Velocity (ft/sec)')
    plt.xlabel('')
    plt.title('Offensive Velocity')
    plt.savefig('examples/VelocityOffenseTeams')

    # Plot of defense velocity by team
    plt.figure()
    sns.barplot(x='Tm', y='Def', data=all_dat,
                order=ave.sort_values('Def').index,
                color=sns.xkcd_rgb["pale red"])
    plt.ylim(0.018, 0.024)
    locs, labels = plt.xticks()
    plt.setp(labels, rotation=90)
    locs, labels = plt.yticks()
    plt.yticks(locs, ["%.1f" % x for x in locs*1000])
    plt.ylabel('Mean Defensive Velocity (ft/sec)')
    plt.xlabel('')
    plt.title('Defensive Velocity')
    plt.savefig('examples/VelocityDefenseTeams')


def fatigue_plots(df):
    """
    Makes plots showing game fatigue for SAS and IND

    Args:
        df (pd.DataFrame): dataframe of fatigue data
            Note: use extract_fatigue() to obtain this data

    Returns:
        None
        Saves plots to examples/
    """
    plt.figure()
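    # Combine both conditions in one mask; chained boolean indexing applies
    # the second mask to an already-filtered frame
    sns.swarmplot(x='variable', y='value',
                  data=df[(df.Pos == 'Off') & (df.Tm == 'IND')])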
    plt.title('Indiana Pacers Fatigue')
    plt.xlabel('Quarter')
    plt.ylabel('Mean Offensive Velocity (ft/sec)')
    plt.ylim(0.015, 0.034)
    locs, labels = plt.yticks()
    plt.yticks(locs, ["%.1f" % x for x in locs*1000])
    plt.savefig('examples/INDfatige')

    plt.figure()
    sns.swarmplot(x='variable', y='value',
                  data=df[(df.Pos == 'Off') & (df.Tm == 'SAS')])
    plt.title('San Antonio Spurs Fatigue')
    plt.xlabel('Quarter')
    plt.ylabel('Mean Offensive Velocity (ft/sec)')
    locs, labels = plt.yticks()
    plt.yticks(locs, ["%.1f" % x for x in locs*1000])
    plt.savefig('examples/SASfatige')


def extract_scores(score_data):
    """
    Organizes score data from string to tuple

    Args:
        score_data (str): string of form 'AWAYSCORE - HOMESCORE'
            Example: '111 - 105'

    Returns:
        scores (tuple): tuple of form (away_score, home_score) where
            each score is an int
    """
    # Split once; int() tolerates the whitespace around each score
    parts = score_data.split('-')
    scores = (int(parts[0]), int(parts[1]))
    return scores


def set_plot_params(size):
    """
    Sets font size on plots.  16-22 is a good range.

    Args:
        size (int): font size applied to body text, axes titles,
            axis labels, tick labels, and legends

    Returns:
        None
    """
    plt.rc('font', size=size)
    plt.rc('axes', titlesize=size)
    plt.rc('axes', labelsize=size)
    plt.rc('xtick', labelsize=size)
    plt.rc('ytick', labelsize=size)
    plt.rc('legend', fontsize=size)


if __name__ == "__main__":
    set_plot_params(16)
    all_games = extract_games()
    write_velocity(all_games)
    velocity_data = extract_velocity(all_games)
    velocity_plots(velocity_data)
    fatigue_data = extract_fatigue(all_games)
    fatigue_plots(fatigue_data)