Repository: Ryo-N7/soccer_ggplots
Branch: master
Commit: c41c144ec8a4
Files: 248
Total size: 134.0 MB

Directory structure:
gitextract_i_d1l2y0/

├── .gitignore
├── Africa Cup of Nations 2019/
│   └── afcon.Rmd
├── Asian Cup 2019/
│   ├── asian_cup_2019.rmd
│   ├── japan_qatar.Rmd
│   ├── jpn_aus_waffle.Rmd
│   ├── jpn_aus_waffle.md
│   ├── jpn_saudi.Rmd
│   ├── visualize_asian_cup_2019.knit.md
│   ├── visualize_asian_cup_2019.md
│   ├── visualize_asian_cup_2019.rmd
│   └── visualize_asian_cup_2019.utf8.md
├── Bundesliga 2018-2019/
│   └── player_goal_contribution_matrix.Rmd
├── Bundesliga 2019-2020/
│   ├── buli_age_utility.Rmd
│   ├── buli_dribbling_1920_hinrunde.Rmd
│   ├── buli_goalkeepers_1920_hinrunde.Rmd
│   ├── buli_progressive_passing_1920_hinrunde.Rmd
│   ├── buli_shot_quality_1920_hinrunde.Rmd
│   └── goal_contrib_graph_1920_hinrunde.Rmd
├── Champions League & Europa League 2019-2020/
│   └── europa_league_eloRatings.Rmd
├── Copa America 2019/
│   ├── 1-copa_america2019.md
│   ├── COPY-2019-06-18-visualize-copa-america.md
│   ├── copa_america2019.md
│   ├── copa_america2019.rmd
│   └── copa_extras.Rmd
├── Eredivisie 2018-2019/
│   └── player_goal_contribution_matrix.rmd
├── Europe 2021-2022/
│   ├── fbref_sca_waffle_blogpost.Rmd
│   ├── fbref_sca_waffle_blogpost.md
│   └── fbref_sca_waffle_raw.Rmd
├── J-League 2018/
│   ├── j_league.rmd
│   ├── j_league_avg_age_value.rmd
│   ├── jleague_age_utility.Rmd
│   ├── player_goal_contribution_matrix.Rmd
│   └── player_turnover.Rmd
├── J-League 2019/
│   ├── goal_minutes.Rmd
│   ├── jleague_age_utility_2019.Rmd
│   └── jleague_summary_2019_season.Rmd
├── J-League 2020/
│   ├── jleague_2020_review_code.Rmd
│   └── jleague_age_utility_2020.Rmd
├── Japan National Team/
│   ├── japan_kirin_cup.Rmd
│   ├── japan_korea_rivalry.rmd
│   └── japan_worldcup.rmd
├── LICENSE
├── La Liga 2018-2019/
│   └── player_goal_contribution_matrix.Rmd
├── La Liga 2019-2020/
│   ├── age_utility_LaLiga.Rmd
│   └── laliga_goalkeepers_1920_3420.Rmd
├── Ligue 1 2018-2019/
│   └── player_goal_contribution_matrix.rmd
├── Lionel Messi/
│   ├── check_coords.Rmd
│   ├── check_y_coordinates.Rmd
│   ├── code_pass_upsetplot.Rmd
│   ├── explore_messi.Rmd
│   ├── messi_pass_upsetplots.Rmd
│   ├── statsbomb_tutorialBlogOne.Rmd
│   └── statsbomb_tutorialBlogOne.md
├── Premier League 2018-2019/
│   ├── LFC_ELO_Ratings.rmd
│   ├── LFC_goals_timeframe.rmd
│   ├── Premier_League_Center_of_Gravity.rmd
│   ├── appearances_season_players.Rmd
│   ├── appearances_season_split_manager.Rmd
│   ├── epl_wages.rmd
│   ├── liverpool_age_utility.rmd
│   ├── liverpoolfc_goals.rmd
│   ├── north_west_derby.rmd
│   └── player_goal_contribution_matrix.Rmd
├── Premier League 2019-2020/
│   ├── 2019-11-21-visualize-EPL-part-1.Rmd
│   ├── 2019-11-28-visualize-EPL-part-2.Rmd
│   ├── 2019-11-28-visualize-EPL-part-3.Rmd
│   ├── epl_goalkeepers_1920_11920.Rmd
│   ├── goal_contrib_graph_1920_MD21.Rmd
│   └── premierleague_top_goalscorers.Rmd
├── README.md
├── Serie A 2018-2019/
│   └── player_goal_contribution_matrix.Rmd
├── Serie A 2019-2020/
│   ├── age_utility_serieA.Rmd
│   └── serieA_goalkeepers_1920_1-23-20.Rmd
├── Women's World Cup 2019/
│   ├── tidytuesday.Rmd
│   └── tidytuesday_statsbomb.Rmd
├── World Cup 2018/
│   ├── RMarkdown/
│   │   ├── blog posts/
│   │   │   ├── soccer_plots_part1.md
│   │   │   ├── soccer_plots_part1.rmd
│   │   │   ├── soccer_plots_part2.md
│   │   │   ├── soccer_plots_part2.rmd
│   │   │   ├── soccer_plots_part3.md
│   │   │   ├── soccer_plots_part3.rmd
│   │   │   └── soccer_plots_part3_DS+.rmd
│   │   ├── ggsoccer_graphs.rmd
│   │   ├── group_goals.csv
│   │   ├── group_table_final_matchday.rmd
│   │   ├── historical_kits.rmd
│   │   ├── joyplot_goals.rmd
│   │   ├── presentation.rmd
│   │   ├── soccer_plots_part4.rmd
│   │   ├── worldcup_goal_plots.rmd
│   │   ├── worldcup_goal_plots_DRAFT.rmd
│   │   └── worldcup_ideas.rmd
│   ├── anim_save_try.r
│   ├── other articles/
│   │   ├── world_cup_BBC_charts.rmd
│   │   └── worldcup_player_data.rmd
│   └── scripts/
│       └── kit_read().r
├── data/
│   ├── Dortmund
│   ├── EPL_shots_data_df_raw.RDS
│   ├── EPL_shots_data_df_raw_matchday13.RDS
│   ├── EPL_shots_data_df_raw_matchday15.RDS
│   ├── FCTokyo_2019_age_utility_df.RDS
│   ├── J-League_2020_review/
│   │   ├── interval_goaltimes_all_df_jleague_2020.csv
│   │   ├── jleague_2020_individual_xG.csv
│   │   ├── jleague_2020_shooting_df.csv
│   │   ├── jleague_2020_situation_all_df.csv
│   │   ├── jleague_age_utility_df_2020.csv
│   │   ├── jleague_table_2020_cleaned.csv
│   │   └── team_xG_J-League-2020.csv
│   ├── J-League_2021_mid_review/
│   │   ├── Gamba-2021.csv
│   │   ├── J-League_2021_mid_league_table.csv
│   │   ├── interval_goaltimes_all_df_jleague_2021_mid.RDS
│   │   ├── jleague_2021_mid_shooting_clean_df.RDS
│   │   ├── jleague_2021_mid_shooting_clean_df.csv
│   │   ├── jleague_2021_mid_shooting_df.xlsx
│   │   ├── jleague_2021_mid_squad_standard_against.csv
│   │   ├── jleague_2021_mid_squad_standard_for.csv
│   │   ├── jleague_2021_situation_all_df.RDS
│   │   ├── jleague_2021_situation_df.xlsx
│   │   ├── jleague_age_utility_df_2021_mid.RDS
│   │   ├── jleague_age_utility_df_2021_mid.csv
│   │   ├── jleague_age_utility_df_2021_mid_raw.RDS
│   │   ├── jleague_table_2021_mid_cleaned.RDS
│   │   ├── jleague_table_2021_mid_cleaned.csv
│   │   ├── jleague_xg_player_2021_mid.RDS
│   │   ├── jleague_xg_player_2021_mid.csv
│   │   ├── team_xG_J-League-2021_mid.RDS
│   │   └── team_xG_J-League-2021_mid.csv
│   ├── Mainz
│   ├── afcon_squads_df_raw.RDS
│   ├── appearances_df_LFC_10_11.RDS
│   ├── appearances_df_LFC_15_16.RDS
│   ├── appearances_df_LFC_18_19.RDS
│   ├── appearances_df_LFC_19_20.RDS
│   ├── appearances_df_raw_LFC_10_11.RDS
│   ├── appearances_df_raw_LFC_15_16.RDS
│   ├── appearances_df_raw_LFC_18_19.RDS
│   ├── appearances_df_raw_LFC_19_20.RDS
│   ├── base_LFC_10_11_dates_df.RDS
│   ├── base_LFC_15_16_dates_df.RDS
│   ├── base_LFC_18_19_dates_df.RDS
│   ├── base_LFC_19_20_dates_df.RDS
│   ├── br_cr.RDS
│   ├── buli_age_utility_df_MD24_1920.RDS
│   ├── buli_player_dribbling_hinrunde_clean.RDS
│   ├── buli_player_dribbling_stats_hinrunde.csv
│   ├── buli_player_goalkeeping_hinrunde_clean.RDS
│   ├── buli_player_goalkeeping_stats_hinrunde.csv
│   ├── buli_player_passing_hinrunde_clean.RDS
│   ├── buli_player_passing_stats_hinrunde.csv
│   ├── buli_player_regular_goalkeeping_stats_hinrunde.csv
│   ├── buli_player_shooting_hinrunde_clean.RDS
│   ├── buli_player_shooting_stats_hinrunde.csv
│   ├── buli_player_stats_hinrunde.RDS
│   ├── buli_player_stats_hinrunde.csv
│   ├── buli_squad_stats_hinrunde.RDS
│   ├── buli_squad_stats_hinrunde.csv
│   ├── bundesliga_goal_contrib_clean_df.RDS
│   ├── bundesliga_goal_contrib_df_soccerway.RDS
│   ├── championship_age_utility_df_MD35_1920.RDS
│   ├── copa_america2019_squads_clean.RDS
│   ├── copa_america2019_squads_raw.RDS
│   ├── copa_america_understat.RDS
│   ├── copa_campeones_clean.RDS
│   ├── copa_top_scorers.RDS
│   ├── eng_champ_location.RDS
│   ├── epl_age_utility_df_MD27_1920.RDS
│   ├── epl_age_utility_df_MD28_1920.RDS
│   ├── epl_goal_contrib_clean_df.RDS
│   ├── epl_goal_contrib_df.RDS
│   ├── epl_goal_contrib_df_soccerway.RDS
│   ├── epl_player_defensive_actions_stats_MD29.csv
│   ├── epl_player_goalkeeping_MD23_clean.RDS
│   ├── epl_player_goalkeeping_stats_MD23.csv
│   ├── epl_player_regular_goalkeeping_stats_MD23.csv
│   ├── epl_player_stats_MD20.RDS
│   ├── epl_player_stats_MD20.csv
│   ├── epl_player_stats_MD21.RDS
│   ├── epl_player_stats_MD21.csv
│   ├── epl_player_stats_MD21_2.RDS
│   ├── epl_player_stats_MD21_2.csv
│   ├── epl_squad_stats_MD20.RDS
│   ├── epl_squad_stats_MD20.csv
│   ├── epl_squad_stats_MD21.RDS
│   ├── epl_squad_stats_MD21.csv
│   ├── epl_squad_stats_MD21_2.RDS
│   ├── epl_squad_stats_MD21_2.csv
│   ├── eredivisie_goal_contrib_clean_df.RDS
│   ├── eredivisie_goal_contrib_df_soccerway.RDS
│   ├── federation_affiliations/
│   │   ├── AFC
│   │   ├── CAF
│   │   ├── Concacaf
│   │   ├── Conmebol
│   │   ├── OFC
│   │   └── UEFA
│   ├── goal_contrib3_df.RDS
│   ├── goal_contrib_clean_df.RDS
│   ├── goal_contrib_df.RDS
│   ├── goal_timeline_df_raw_42920.RDS
│   ├── goalcontrib_webscrape_tutorial.RDS
│   ├── gpg_data.RDS
│   ├── j_league_2018_age_value.RDS
│   ├── j_league_2019_age_value.RDS
│   ├── jleague2019_goal_contrib_raw_df.RDS
│   ├── jleague2019_shot_data.csv
│   ├── jleague_2021_END/
│   │   ├── jleague_age_utility_df_2021_end.csv
│   │   ├── jleague_table_2021_end_cleaned.csv
│   │   └── xGDiff_all_matches_per_team.csv
│   ├── jleague_2022_end/
│   │   ├── jleague_age_utility_df_2022_end.csv
│   │   └── jleague_table_2022_end_cleaned.csv
│   ├── jleague_2022_mid/
│   │   ├── jleague_age_utility_df_2022_mid.csv
│   │   └── jleague_table_2022_mid_cleaned.csv
│   ├── jleague_age_utility_df_2019.RDS
│   ├── jleague_goal_contrib_clean_df.RDS
│   ├── jp_bel.RDS
│   ├── jp_col.RDS
│   ├── jp_pol.RDS
│   ├── jp_sen.RDS
│   ├── laliga_age_utility_df_MD25_1920.RDS
│   ├── laliga_goal_contrib_clean_df.RDS
│   ├── laliga_goal_contrib_df_soccerway.RDS
│   ├── laliga_player_goalkeeping_MD26_clean.RDS
│   ├── laliga_player_goalkeeping_stats_MD26.csv
│   ├── laliga_player_regular_goalkeeping_stats_MD26.csv
│   ├── lewa_shot_contrib.RDS
│   ├── ligueUn_goal_contrib_clean_df.RDS
│   ├── ligueUn_goal_contrib_df_soccerway.RDS
│   ├── liverpool
│   ├── messi_data_clean.RDS
│   ├── messi_data_raw.RDS
│   ├── premierleague_1516_1920_results.RDS
│   ├── premierleague_klopp_results.RDS
│   ├── results.csv
│   ├── results_copa_cleaned.RDS
│   ├── results_jp_asia.RDS
│   ├── sca_big5_demo.RDS
│   ├── sca_big5_demo.csv
│   ├── serieA_age_utility_df_2020-03-01_1920.RDS
│   ├── serieA_goal_contrib_clean_df.RDS
│   ├── serieA_goal_contrib_df_soccerway.RDS
│   ├── serieA_player_goalkeeping_MD20_clean.RDS
│   ├── serieA_player_goalkeeping_stats_1-23-20.csv
│   ├── serieA_player_regular_goalkeeping_stats_1-23-20.csv
│   ├── spi_matches.csv
│   ├── squad_LFC_18_19_df.RDS
│   ├── squad_LFC_19_20_df.RDS
│   ├── team_situation_data.json
│   └── wwc_final_raw.RDS
└── soccer_ggplots.Rproj

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
.Rproj.user
.Rhistory
.RData
.Ruserdata
notes

================================================
FILE: Africa Cup of Nations 2019/afcon.Rmd
================================================
---
title: "Untitled"
author: "RN7"
date: "6/22/2019"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

```{r echo=FALSE, message=FALSE, warning=FALSE}
pacman::p_load(tidyverse, polite, scales, ggimage, ggforce, ggtextures, DT, 
               cowplot, rvest, glue, extrafont, ggrepel, magick)
loadfonts()
```


## AFCON theme


```{r}
theme_afcon <- function(
  title.size = 24,
  subtitle.size = 14,
  caption.size = 8,
  axis.text.size = 14,
  axis.text.x.size = 12,
  axis.text.y.size = 12,
  axis.title.size = 16,
  strip.text.size = 18,
  panel.grid.major.x = element_line(size = 0.5, color = "white"),
  panel.grid.major.y = element_line(size = 0.5, color = "white"),
  panel.grid.minor.x = element_blank(),
  panel.grid.minor.y = element_blank(),
  axis.ticks = element_line(color = "white")) {
  ## Theme:
  theme(text = element_text(family = "Roboto Condensed", color = "white"),
        plot.title = element_text(family = "Roboto Condensed", face = "bold", 
                                  size = title.size, color = "yellow"),
        plot.subtitle = element_text(size = subtitle.size),
        plot.caption = element_text(size = caption.size),
        panel.background = element_rect(fill = "#CE1127"),
        plot.background = element_rect(fill = "#000000"),
        axis.text = element_text(size = axis.text.size, color = "white"),
        axis.text.x = element_text(size = axis.text.x.size, color = "white"),
        axis.text.y = element_text(size = axis.text.y.size, color = "white"),
        axis.title = element_text(size = axis.title.size),
        axis.line.x = element_blank(),
        axis.line.y = element_blank(),
        panel.grid.major.x = panel.grid.major.x,
        panel.grid.major.y = panel.grid.major.y,
        panel.grid.minor.x = panel.grid.minor.x,
        panel.grid.minor.y = panel.grid.minor.y,
        strip.text = element_text(color = "yellow", face = "bold", 
                                  size = strip.text.size, 
                                  margin = margin(4.4, 4.4, 4.4, 4.4)),
        strip.background = element_blank(),
        axis.ticks = axis.ticks
        )
}
```


```{r}
iris %>% 
  ggplot(aes(Sepal.Width, Sepal.Length)) +
  geom_point() +
  labs(title = "balaljld", 
       subtitle = "lajdl") +
  theme_afcon()
```



## Top Goalscorers

```{r}
base_url <- "https://en.wikipedia.org/wiki/Africa_Cup_of_Nations_records_and_statistics"

session <- bow(base_url)

afcon_goalscorers_raw <- scrape(session) %>% 
  html_nodes("table.wikitable:nth-child(9)") %>% 
  html_table() %>% 
  flatten_df()
  
afcon_goalscorers_raw %>% 
  slice(1:5) %>% 
  ggplot(aes(x = Scorers, y = Goals)) +
  geom_col() +
  coord_flip() + 
  theme_afcon()
```





## Tournament wins

2nd place, 3rd place? Do I seriously have to iterate over every tournament? -_-

```{r}
afcon_champions_raw <- scrape(session) %>% 
  html_nodes("table.wikitable:nth-child(69)") %>% 
  html_table() %>% 
  flatten_df()

afcon_champions_clean <- afcon_champions_raw %>% 
  janitor::clean_names() %>% 
  select(-rank) %>% 
  mutate(team = team %>% str_replace("\\[.*\\]", "")) %>% 
  arrange(desc(titles))
```




## Goals per squad

Most squads don't have goals scored listed!

```{r}
squad_url <- "https://en.wikipedia.org/wiki/2019_Africa_Cup_of_Nations_squads"

session <- bow(squad_url)

xpaths <- 1:24 %>% 
  map(., ~glue("//*[@id='mw-content-text']/div/table[{.x}]"))

squads_df_raw <- scrape(session) %>% 
  html_node(xpath = '//*[@id="toc"]') %>%  
  html_text() %>% 
  str_split("\n") %>% 
  unlist() %>% 
  tibble::enframe() %>% 
  rename(country = value) %>% 
  filter(str_detect(country, "^[1-6]\\."), !str_detect(country, "Group")) %>% 
  separate(country, c("group", "delete", "country"), sep = c(1, 3)) %>% 
  slice(1:24) %>% 
  mutate(group = LETTERS[as.numeric(group)], 
         country = str_trim(country), 
         xpaths = xpaths,
         squads = map(xpaths, ~ scrape(session) %>% 
                        html_node(xpath = .x) %>% 
                        html_table())) %>% 
  unnest(squads)

saveRDS(squads_df_raw, "../data/afcon_squads_df_raw.RDS")
```



================================================
FILE: Asian Cup 2019/asian_cup_2019.rmd
================================================
---
title: "Untitled"
author: "RN7"
date: "December 26, 2018"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

# Load packages

```{r message=FALSE}
pacman::p_load(tidyverse, scales, lubridate, ggrepel, stringi, magick, gapminder,
               glue, extrafont, rvest, ggtextures, cowplot, countrycode, ggimage,
               polite)
# necessary for Roboto Condensed font
loadfonts()
```

# Top goal scorers

```{r}
ac_top_scorers <- data.frame(
  player = c("Ali Daei", "Ali Daei", "Ali Daei",
             "Lee Dong Gook", "Lee Dong Gook",
             "Naohiro Takahara", "Naohiro Takahara",
             "Jassem Al-Houwaidi", "Jassem Al-Houwaidi",
             "Younis Mahmoud", "Younis Mahmoud", "Younis Mahmoud", "Younis Mahmoud"),
  country = c("Iran", "Iran", "Iran",
              "South Korea", "South Korea",
              "Japan", "Japan",
              "Kuwait", "Kuwait",
              "Iraq", "Iraq", "Iraq", "Iraq"),
  tournament = c("1996", "2000", "2004",
                 "2000", "2004",
                 "2000", "2004",
                 "1996", "2000",
                 "2004", "2007", "2011", "2015"),
  goals = as.numeric(c(8, 3, 3,
            6, 4,
            5, 4,
            6, 2,
            1, 4, 1, 2)),
  total_goals = as.numeric(c(14, 14, 14,
            10, 10,
            9, 9,
            8, 8,
            8, 8, 8, 8))
)

# soccer ball images
# https://i.pinimg.com/originals/e7/d7/19/e7d7190f0b5b3abd4f6c17e2c7989ec3.jpg
# https://www.emoji.co.uk/files/microsoft-emojis/activity-windows10/8356-soccer-ball.png
ac_top_scorers <- ac_top_scorers %>% 
  mutate(image = case_when(
    tournament == "2004" ~ "http://football-balls.com/ball_files/2004-asian-cup-adidas-roteiro-official-match-ball.png",
    tournament == "2007" ~ "http://football-balls.com/ball_files/2007-asian-cup-mercurial-veloci-official-match-ball.png",
    tournament == "2011" ~ "http://football-balls.com/ball_files/2011-asian-cup-nike-total-90-tracer-official-match-ball.png",
    tournament == "2015" ~ "http://football-balls.com/ball_files/2015-asian-cup-nike-ordem-2-official-match-ball.png",
    TRUE ~ "https://www.emoji.co.uk/files/microsoft-emojis/activity-windows10/8356-soccer-ball.png")) %>% 
  mutate(country_code = country %>% 
           countrycode(., origin = "country.name", destination = "iso2c"))


ac_top_scorers %>% 
  gather(key = "player", value = "tournament")

ac_top_scorers %>% 
  distinct(player, .keep_all = TRUE)
```


```{r, fig.width=8, fig.height=6}
ac_top_graph <- ac_top_scorers %>% 
  distinct(player, .keep_all = TRUE) %>% 
  ggplot(aes(x = reorder(player, total_goals), y = total_goals,
             image = image)) +
  #ggimage::geom_emoji(aes(image = '26bd'), size = 0.06) +
  geom_isotype_col(img_width = grid::unit(1, "native"), img_height = NULL,
    ncol = NA, nrow = 1, hjust = 0, vjust = 0.5) +
  coord_flip() +
  #geom_flag(y = -1.5, aes(image = country_code), size = 0.1) +
  scale_y_continuous(breaks = c(0, 2, 4, 6, 8, 10, 12, 14),
                     expand = c(0, 0), 
                     limits = c(0, 15)) +
  #expand_limits(y = -2) +
  ggthemes::theme_solarized() +
  labs(title = "Top Scorers of the Asian Cup!",
       subtitle = "Most goals in a single tournament: 8 (Ali Daei, 1996)",
       y = "Number of Goals", x = NULL) +
  theme(text = element_text(family = "Roboto Condensed"),
        title = element_text(size = 18),
        plot.subtitle = element_text(size = 14),
        axis.text = element_text(size = 12),
        axis.line.y = element_blank(),
        panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        axis.ticks.y = element_blank())

# scale them differently as flag sizes are different...
pimage <- axis_canvas(ac_top_graph, axis = 'y') + 
  draw_image("https://upload.wikimedia.org/wikipedia/commons/c/ca/Flag_of_Iran.svg", 
             y = 13, scale = 1.5) +
  draw_image("https://upload.wikimedia.org/wikipedia/commons/0/09/Flag_of_South_Korea.svg", 
             y = 10, scale = 1.7) +
  draw_image("https://upload.wikimedia.org/wikipedia/en/9/9e/Flag_of_Japan.svg", 
             y = 7, scale = 1.7) +
  draw_image("https://upload.wikimedia.org/wikipedia/commons/f/f6/Flag_of_Iraq.svg", 
             y = 4, scale = 1.6) +
  draw_image("https://upload.wikimedia.org/wikipedia/commons/a/aa/Flag_of_Kuwait.svg", 
             y = 1, scale = 1.2)

# insert the image strip into the bar plot and draw  
ggdraw(insert_yaxis_grob(ac_top_graph, pimage, position = "left"))
```


```{r}
theme_void
ggthemes::theme_wsj
ggthemes::theme_solarized
```





# Goals scored per tournament

```{r}
wiki_url <- "https://en.wikipedia.org"
acup_url <- "https://en.wikipedia.org/wiki/AFC_Asian_Cup"

cup_links <- read_html(acup_url) %>% 
  html_nodes("br+ i a") %>% 
  html_attr("href") %>% 
  .[-17:-18]


# build the link/cup table directly (as_data_frame() is deprecated)
acup_df <- tibble(link = cup_links) %>% 
  mutate(cup = str_remove(link, "\\/wiki\\/") %>% str_replace_all("_", " "))



goals_info <- function(x) {
  goal_info <- wiki_url %>% 
    html_session() %>% 
    jump_to(x) %>% 
    html_nodes(".vcalendar") %>% 
    html_table(header = FALSE) %>% 
    flatten_df() %>% 
    spread(key = X1, value = X2) %>% 
    select(`Goals scored`) %>% 
    mutate(`Goals scored` = str_remove_all(`Goals scored`, pattern = ".*\\(") %>% 
             str_extract_all("\\d+\\.*\\d*") %>% as.numeric)
}

team_num_info <- function(x) {
  team_num_info <- wiki_url %>% 
    html_session() %>% 
    jump_to(x) %>% 
    html_nodes(".vcalendar") %>% 
    html_table(header = FALSE) %>% 
    flatten_df() %>% 
    spread(key = X1, value = X2) %>% 
    select(`Teams`) %>% 
    mutate(`Teams` = as.numeric(`Teams`))
}

match_num_info <- function(x) {
  match_num_info <- wiki_url %>% 
    html_session() %>% 
    jump_to(x) %>% 
    html_nodes(".vcalendar") %>% 
    html_table(header = FALSE) %>% 
    flatten_df() %>% 
    spread(key = X1, value = X2) %>% 
    janitor::clean_names() %>% 
    select(matches_played) %>% 
    mutate(matches_played = as.numeric(matches_played))
}


# all together:
goals_data <- acup_df %>% 
  mutate(goals_per_game = map(acup_df$link, goals_info) %>% unlist,
         team_num = map(acup_df$link, team_num_info) %>% unlist,
         match_num = map(acup_df$link, match_num_info) %>% unlist)

```

```{r}
ac_goals_df <- goals_data %>% 
  mutate(label = cup %>% str_extract("[0-9]+") %>% str_replace("..", "'"),
         team_num = case_when(
           is.na(team_num) ~ 16,
           TRUE ~ team_num
         )) %>% 
  arrange(cup) %>% 
  mutate(label = factor(label, label),
         team_num = c(4, 4, 4, 5, 6, 6, 10, 10, 10, 8, 12, 12, 16, 16, 16, 16))
```


```{r}
ac_goals_df %>% 
  ggplot(aes(x = label, y = goals_per_game, group = 1)) +
  geom_line() +
  #geom_point() +
  scale_y_continuous(limits = c(NA, 5.35),
                     breaks = c(1.5, 2, 2.5, 3, 3.5, 4, 4.5)) +
  labs(x = "Tournament (Year)", y = "Goals per Game"#,
       # title = "Goals per Game throughout the Asian Cup.",
       # subtitle = "Odd dip throughout the 80s to early 90s..."
       ) +
  theme_minimal() +
  theme(text = element_text(family = "Roboto Condensed"),
        #title = element_text(size = 18),
        axis.title = element_text(size = 12),
        axis.text = element_text(size = 10)) +
  annotate(geom = "label", x = "'56", y = 5.23, family = "Roboto Condensed",
           color = "black", #fill = "grey",
           label = "Total Number of Games Played:", hjust = 0) +
  annotate(geom = "text", x = "'60", y = 4.9, 
           label = "6", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 1, xend = 3, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'68", y = 4.9, 
           label = "10", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 3.8, xend = 4.2, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'72", y = 4.9, 
           label = "13", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 4.8, xend = 5.2, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'76", y = 4.9, 
           label = "10", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 5.8, xend = 6.2, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'84", y = 4.9, 
           label = "24", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 7, xend = 9, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'92", y = 4.9, 
           label = "16", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 9.8, xend = 10.2, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = 11.5, y = 4.9, 
           label = "26", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 11, xend = 12, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = 14.5, y = 4.9, 
           label = "32", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 13, xend = 16, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = 9, y = 4, family = "Roboto Condensed",
           label = glue("
                        Incredibly low number of goals in Group B
                        (15 in 10 Games) and in Knock-Out Stages
                        (4 goals in 4, only one scored in normal time)")) +
  annotate(geom = "segment", x = 9, xend = 9, y = 1.6, yend = 3.6,
           color = "red") +
  ggimage::geom_emoji(aes(image = '26bd'), size = 0.03) 

ggsave(filename = paste0(here::here("Asian Cup 2019"), "/gpg_plot.png"))
plot <- image_read(paste0(here::here("Asian Cup 2019"), "/gpg_plot.png"))
```

## add logo

```{r}
logo_raw <- image_read("https://upload.wikimedia.org/wikipedia/en/a/ad/2019_afc_asian_cup_logo.png")

logo <- logo_raw %>% 
  image_scale("100") %>% 
  image_background("grey", flatten = TRUE) %>% 
  image_border("grey", "600x10") %>% 
  image_annotate(text = glue("Goals per Game throughout the Asian Cup"),
                 color = "white", size = 30, 
                 location = "+10+50", gravity = "northwest")

final_plot <- image_append(image_scale(c(logo, plot), "500"), stack = TRUE)

logo_proc <- logo_raw %>% image_scale("100")

# create blank canvas
a <- image_blank(width = 6, height = 0.8, color = "white")

# combine with logo and shift it to the left, to the left
b <- image_composite(image_scale(a, "x100"), image_scale(logo_proc, "x60"), 
                     offset = "+500+25")
logo_2 <- b %>% 
  image_annotate(text = glue("Goals per Game throughout the Asian Cup"),
                 color = "black", size = 18, font = "Roboto Condensed",
                 location = "+63+50", gravity = "northwest")

final2_plot <- image_append(image_scale(c(logo_2, plot), "500"), stack = TRUE)
final2_plot
image_write(final2_plot,
            paste0(here::here("Asian Cup 2019"), "/gpg_plot_final.png"))
```

- annotate number of matches played on top as strip
- fit avg goals per game per each sequence of # of matches >>> kinda like splines
- add AFC logo? top right corner
- put top scorer country as geom_point?
- soccer ball emoji as geom_point?
- patchwork to include top scoring countries + other additional info


## add logo 2.0

```{r}
ac_goals_df %>% 
  ggplot(aes(x = label, y = goals_per_game, group = 1)) +
  geom_line() +
  scale_y_continuous(limits = c(NA, 5.35),
                     breaks = c(1.5, 2, 2.5, 3, 3.5, 4, 4.5)) +
  labs(x = "Tournament (Year)", y = "Goals per Game",
       title = "Goals per Game throughout the Asian Cup.") +
  theme_minimal() +
  theme(text = element_text(family = "Roboto Condensed"),
        #title = element_text(size = 18),
        axis.title = element_text(size = 12),
        axis.text = element_text(size = 10)) +
  annotate(geom = "label", x = "'56", y = 5.23, family = "Roboto Condensed",
           color = "black", 
           label = "Total Number of Games Played:", hjust = 0) +
  annotate(geom = "text", x = "'60", y = 4.9, 
           label = "6", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 1, xend = 3, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'68", y = 4.9, 
           label = "10", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 3.8, xend = 4.2, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'72", y = 4.9, 
           label = "13", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 4.8, xend = 5.2, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'76", y = 4.9, 
           label = "10", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 5.8, xend = 6.2, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'84", y = 4.9, 
           label = "24", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 7, xend = 9, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'92", y = 4.9, 
           label = "16", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 9.8, xend = 10.2, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = 11.5, y = 4.9, 
           label = "26", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 11, xend = 12, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = 14.5, y = 4.9, 
           label = "32", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 13, xend = 16, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = 9, y = 4, family = "Roboto Condensed",
           label = glue("
                        Incredibly low number of goals in Group B
                        (15 in 10 Games) and in Knock-Out Stages
                        (4 goals in 4, only one scored in normal time)")) +
  annotate(geom = "segment", x = 9, xend = 9, y = 1.6, yend = 3.6,
           color = "red") +
  ggimage::geom_emoji(aes(image = '26bd'), size = 0.03) 

ggsave(filename = paste0(here::here("Asian Cup 2019"), "/gpg_title_plot.png"))
plot <- image_read(paste0(here::here("Asian Cup 2019"), "/gpg_title_plot.png"))
```


```{r}
add_logo <- function(plot_path, logo_path, logo_position, logo_scale = 10){

    # Requires magick R Package https://github.com/ropensci/magick

    # Useful error message for logo position
    if (!logo_position %in% c("top right", "top left", "bottom right", "bottom left")) {
        stop("Error Message: Uh oh! Logo Position not recognized\n  Try: logo_position = 'top left', 'top right', 'bottom left', or 'bottom right'")
    }

    # read in raw images
    plot <- magick::image_read(plot_path)
    logo_raw <- magick::image_read(logo_path)

    # get dimensions of plot for scaling
    plot_height <- magick::image_info(plot)$height
    plot_width <- magick::image_info(plot)$width

    # default scale to 1/10th width of plot
    # Can change with logo_scale
    logo <- magick::image_scale(logo_raw, as.character(plot_width/logo_scale))

    # Get width and height of logo
    logo_width <- magick::image_info(logo)$width
    logo_height <- magick::image_info(logo)$height

    # Set position of logo
    # Position starts at 0,0 at top left
    # Using 0.01 for 1% - aesthetic padding

    if (logo_position == "top right") {
        x_pos = plot_width - logo_width - 0.01 * plot_width
        y_pos = 0.01 * plot_height
    } else if (logo_position == "top left") {
        x_pos = 0.01 * plot_width
        y_pos = 0.01 * plot_height
    } else if (logo_position == "bottom right") {
        x_pos = plot_width - logo_width - 0.01 * plot_width
        y_pos = plot_height - logo_height - 0.01 * plot_height
    } else if (logo_position == "bottom left") {
        x_pos = 0.01 * plot_width
        y_pos = plot_height - logo_height - 0.01 * plot_height
    }

    # Compose the actual overlay
    magick::image_composite(plot, logo, offset = paste0("+", x_pos, "+", y_pos))

}
```



```{r}
add_logo(plot_path = "gpg_title_plot.png",
         logo_path = "https://upload.wikimedia.org/wikipedia/en/a/ad/2019_afc_asian_cup_logo.png",
         logo_position = "top right",
         logo_scale = 15) -> plot_2.0

plot_2.0
```






```{r}
wiki_url %>% 
  html_session() %>% 
  jump_to("/wiki/2015_AFC_Asian_Cup") %>% 
  html_nodes(".vcalendar") %>% 
  html_table(header = FALSE) %>% 
  flatten_df() %>% 
  spread(key = X1, value = X2) %>% 
  select(`Goals scored`) %>% 
  mutate(`Goals scored` = str_remove_all(`Goals scored`, pattern = ".*\\(") %>% 
           str_extract_all("\\d+\\.*\\d*") %>% as.numeric)
```


```{r}
###

one_cup <- "https://en.wikipedia.org/wiki/1968_AFC_Asian_Cup"

copa <- one_cup %>% 
  read_html() %>% 
  html_nodes(".vcalendar") %>% 
  html_table(header = FALSE) %>% 
  flatten_df() %>% 
  spread(key = X1, value = X2) %>% 
  janitor::clean_names() %>% 
  select(goals_scored) %>% 
  # clean_names() renamed the column to goals_scored, so refer to it by that name
  mutate(goals_scored = str_remove_all(goals_scored, pattern = ".*\\(") %>% 
           str_extract_all("\\d+\\.*\\d*") %>% as.numeric) # \\(.*)

```








# Asian Cup record

```{r}
# .navigation-not-searchable+ .jquery-tablesorter

acup_url <- "https://en.wikipedia.org/wiki/AFC_Asian_Cup"

session <- bow(acup_url)

acup_winners_raw <- scrape(session) %>% 
  html_nodes("#mw-content-text > div > table:nth-child(30)") %>% 
  html_table() %>% 
  flatten_df()
```

```{r}
acup_winners_clean <- acup_winners_raw %>% 
  janitor::clean_names() %>% 
  slice(1:8) %>% 
  select(-fourth_place, -total_top_four) %>% 
  separate(winners, into = c("first_num", "first_place_year"), sep = " ", extra = "merge") %>% 
  separate(runners_up, into = c("second_num", "second_place_year"), sep = " ", extra = "merge") %>% 
  separate(third_place, into = c("third_num", "third_place_year"), sep = " ", extra = "merge") %>% 
  mutate_all(~ str_replace_all(., "–", "0")) %>% 
  mutate_at(vars(contains("num")), as.numeric) %>% 
  mutate(team = if_else(team == "Israel1", "Israel", team)) %>% 
  gather(key = "key", value = "value", -team, 
         -first_place_year, -second_place_year, -third_place_year) %>% 
  mutate(key = case_when(
           key == "first_num" ~ "Champions",
           key == "second_num" ~ "Runners-up",
           key == "third_num" ~ "Third Place"
         ),
         key = key %>% fct_relevel(c("Champions", "Runners-up", "Third Place"))) %>% 
  # hack-ish solution?
  arrange(key, value) %>% 
  mutate(team = as_factor(team),
         order = row_number(),
         image = team %>% 
           countrycode(., origin = "country.name", destination = "iso2c"))
```

```{r, fig.width = 8, fig.height = 6}
a <- acup_winners_clean %>% 
  ggplot(aes(value, team, color = key)) +
  geom_point(size = 5) +
  scale_color_manual(values = c("Champions" = "#FFCC33",
                                "Runners-up" = "#999999",
                                "Third Place" = "#CC6600"),
                     guide = "none") +
  labs(x = "Number of Occurrences",
       title = "Winners & Losers of the Asian Cup (1956-2015)",
       subtitle = glue("
                       Ordered by number of Asian Cup(s) won.
                       Four-time Champions, Japan, only won their first in 1992!"),
       caption = glue("
                      Note: Israel was expelled by the AFC in 1974 while Australia joined the AFC in 2006.
                      Source: Wikipedia
                      By @R_by_Ryo")) +
  facet_wrap(~key) +
  theme_minimal() +
  theme(text = element_text(family = "Roboto Condensed"),
        title = element_text(size = 18),
        plot.subtitle = element_text(size = 12),
        axis.title.y = element_blank(),
        axis.title.x = element_text(size = 12),
        axis.text.y = element_text(size = 14),
        axis.text.x = element_text(size = 12),
        plot.caption = element_text(hjust = 0, size = 10),
        panel.border = element_rect(fill = NA, colour = "grey20"),
        panel.grid.minor.x = element_blank(),
        strip.text = element_text(size = 16))

ggsave("../Asian Cup 2019/asiancup_winners.png", width = 8, height = 6)

# a +
#   geom_flag(data = acup_winners_clean, 
#             x = -2, aes(image = image), size = 0.15) +
#   expand_limits(x = -2)

# insert_yaxis_grob()

#null_dev_env <- new.env(parent = emptyenv())
# set_null_device("pdf")
# cairo_pdf()
# ggdraw(add_sub(a, label = "Source: Wikipedia\nBy @R_by_Ryo", x = 0.95, size = 8))
# 
# Cairo::Cairo(1000, 750, "test.png", bg = "white")
# last_plot()
# dev.off()
```




# Working hours & Productivity

```{r}
acup_url <- "https://en.wikipedia.org/wiki/2019_AFC_Asian_Cup"

acup_kickoff <- acup_url %>% 
  read_html() %>% 
  html_nodes("time") %>% 
  html_text() %>% 
  enframe(name = NULL)  # one-column tibble named `value`

# time
```

- add 90 minutes to each kickoff time to get the full match duration
- do NOT count extra time or stoppage time
- cross-reference with other local time zones
- count up the total number of hours for each spectator country
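
The steps above could be sketched roughly as follows. This is only a minimal illustration under assumptions: the helper name `work_overlap_hours()`, the 09:00-17:00 working day, and the example kickoff time are hypothetical and are not taken from the scraped data. It also assumes a match does not run past midnight on the spectator's clock.

```r
library(lubridate)

# Hypothetical helper: hours of a match (kickoff + 90 minutes, ignoring
# extra time and stoppage time) that fall inside a 09:00-17:00 working day
# in the spectator's time zone.
work_overlap_hours <- function(kickoff_local, spectator_tz,
                               work_start = 9, work_end = 17) {
  # express kickoff and final whistle on the spectator's clock
  start <- with_tz(kickoff_local, tzone = spectator_tz)
  end   <- start + minutes(90)

  # working day on the spectator's calendar date of kickoff
  day        <- floor_date(start, unit = "day")
  work_begin <- day + hours(work_start)
  work_stop  <- day + hours(work_end)

  # length of the intersection, floored at zero when there is no overlap
  overlap <- difftime(pmin(end, work_stop), pmax(start, work_begin),
                      units = "hours")
  pmax(as.numeric(overlap), 0)
}

# Illustrative kickoff: 15:00 local time in the UAE (UTC+4)
kickoff <- ymd_hm("2019-01-09 15:00", tz = "Asia/Dubai")

work_overlap_hours(kickoff, "Asia/Tokyo")     # 20:00 in Tokyo, outside working hours
work_overlap_hours(kickoff, "Europe/London")  # 11:00 in London, inside working hours
```

Summing the result per spectator country would then give the total working hours "lost" to the tournament.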

```{r}
acup_kickoff %>% 
  mutate(match_num = row_number(),
         match_type = if_else(between(match_num, 37, 51), 
                              "Knock-Out Stage", "Group Stage"),
         time = value %>% str_replace("\\(.*\\)", "")) %>% 
  mutate(time2 = hm(time),
         time_ac2 = force_tz(time2, "Asia/Dubai"), # time based in UTC +4
         time_jp = with_tz(time_ac2, tz = "Asia/Tokyo"), # UTC +4 time converted to Japan
         time_jp_end = time_jp + hm("2 0"),
         time_ac = with_tz(time2, tz = "America/New_York"),
         timeZONE2 = tz(time_ac)) %>% 
  mutate(diff = make_difftime(hour = 2),
         int = as.interval(diff, time_ac2),
         jp_start = hm("09:00") %>% with_tz(tz = "Asia/Tokyo"),
         jp_end = hm("12:00") %>% with_tz(tz = "Asia/Tokyo"),
         jp_work = as.interval(jp_start, jp_end),
         overlap = int_overlaps(jp_work, int))


acup_kickoff %>% 
  mutate(match_num = row_number(),
         match_type = if_else(between(match_num, 37, 51), 
                              "Knock-Out Stage", "Group Stage"),
         time = value %>% str_replace("\\(.*\\)", "")) %>% 
  mutate(time2 = dmy_hm(time),
         time3 = hour(time2))
```



```{r}
#tz <- "Asia/"
acup_times_df <- acup_kickoff %>% 
  mutate(time = value %>% str_replace("\\(.*\\)", "")) %>% 
  mutate(time2 = dmy_hm(time)) %>%  # proper time var
  arrange(time2) %>% 
  mutate(
         # match num must arrangeDESC time2
         match_num = row_number(),
         match_type = if_else(between(match_num, 37, 51), 
                              "Knock-Out Stage", "Group Stage"),
         # 
         is_weekday = wday(time2, label = TRUE),
         time_ac2 = force_tz(time2, "Asia/Dubai"), # time based in UTC +4
         time_jp = with_tz(time_ac2, tz = "Asia/Tokyo"), # UTC +4 time converted to Japan
         time_jp_end = time_jp + hm("2 0"),
         time_ac = with_tz(time_ac2, tz = "America/New_York")) %>% # UTC +4 time to NYC
  mutate(diff = make_difftime(hour = 2),
         int = as.interval(diff, time_ac2),
         # jp_start = ymd_hms("2019-01-05 09:00:00", tz = "Asia/Tokyo"),
         # jp_end = ymd_hms("2019-01-05 12:00:00", tz = "Asia/Tokyo"),
         jp_match = as.interval(time_jp, time_jp_end),
         overlap = int_overlaps(jp_match, int)) %>% 
  select(jp_match, int, overlap)

# now create 9AM-5PM for Japan for each of those days on the match days!!
# is_weekday? T/F

# jp_work (the 9AM-5PM interval) is not built yet, so compare the match interval for now
acup_times_df$jp_match %within% acup_times_df$int

```

`ymd_hms("2011-07-01 09:00:00", tz = "Pacific/Auckland")`




```{r}
acup_kickoff %>% 
  mutate(time = value %>% str_replace("\\(.*\\)", "")) %>% 
  mutate(time2 = dmy_hm(time)) %>%  # proper time var
  arrange(time2) %>% 
  mutate(match_num = row_number(),
         match_type = if_else(between(match_num, 37, 51), 
                              "Knock-Out Stage", "Group Stage"),
         # 
         is_weekday = wday(time2, label = TRUE),
         time_ac2 = force_tz(time2, "Asia/Dubai"), # time based in UTC +4
         time_ny = with_tz(time_ac2, tz = "America/New_York"), # converted to NYC
         time_ny_end = time_ny + hm("2 0")) %>% 
  mutate(work_time = time %>% str_sub(end = -6),
         work_time_begin = glue("{work_time} 09:00") %>% 
           dmy_hm(tz = "America/New_York"),
         work_time_end = glue("{work_time} 17:00") %>% 
           dmy_hm(tz = "America/New_York")) %>% 
  # create intervals
  mutate(diff = make_difftime(hour = 2),
         int = as.interval(diff, time_ac2),
         int_ny_work = as.interval(work_time_begin, work_time_end),
         ny_match = as.interval(time_ny, time_ny_end),
         overlap = int_overlaps(ny_match, int_ny_work)) %>% 
  # hours of each match that fall inside the NY working day (0 if no overlap)
  mutate(overlap_hours = pmax(
           as.numeric(difftime(pmin(time_ny_end, work_time_end),
                               pmax(time_ny, work_time_begin),
                               units = "hours")), 0)) %>% 
  select(ny_match, int, overlap, overlap_hours)
```


## Japan time

```{r}
acup_kickoff %>% 
  mutate(time = value %>% str_replace("\\(.*\\)", "")) %>% 
  mutate(time2 = dmy_hm(time)) %>%  # proper time var
  arrange(time2) %>% 
  mutate(match_num = row_number(),
         match_type = if_else(between(match_num, 37, 51), 
                              "Knock-Out Stage", "Group Stage"),
         # 
         is_weekday = wday(time2, label = TRUE),
         time_ac2 = force_tz(time2, "Asia/Dubai"), # time based in UTC +4
         time_jp = with_tz(time_ac2, tz = "Asia/Tokyo"), # UTC +4 time converted to Japan
         time_jp_end = time_jp + hm("2 0")) %>% 
  mutate(work_time = time %>% str_sub(end = -6),
         work_time_begin = glue("{work_time} 09:00") %>% 
           dmy_hm(tz = "Asia/Tokyo"),
         work_time_end = glue("{work_time} 17:00") %>% 
           dmy_hm(tz = "Asia/Tokyo")) %>% 
  # create intervals
  mutate(diff = make_difftime(hour = 2),
         int = as.interval(diff, time_ac2),
         int_jp_work = as.interval(work_time_begin, work_time_end),
         jp_match = as.interval(time_jp, time_jp_end),
         overlap = int_overlaps(jp_match, int_jp_work)) %>% 
  select(jp_match, int, overlap)
```


# Japan vs. RIVALS

- use Kaggle international results
- results UP TO END OF WORLD CUP
- filter AFC Asian Cup, Friendly, AFC Asian Cup qualification
- filter Asia >>> take out UEFA countries like Kazakhstan, Georgia, Israel, etc.


- https://www.kaggle.com/phjulien/a-journey-through-the-history-of-soccer/
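
The filtering steps listed above could look roughly like this. It is a sketch on a toy table: the rows are illustrative placeholders rather than values read from the Kaggle file, and the cutoff date assumes "end of World Cup" means the 2018 World Cup final.

```r
library(dplyr)

# Toy stand-in for the Kaggle results data (illustrative rows only)
results_toy <- tibble(
  date       = as.Date(c("2011-01-29", "2018-07-02", "2018-09-11", "2017-03-28")),
  home_team  = c("Australia", "Belgium", "Japan", "Japan"),
  away_team  = c("Japan", "Japan", "Costa Rica", "Thailand"),
  tournament = c("AFC Asian Cup", "FIFA World Cup", "Friendly",
                 "FIFA World Cup qualification")
)

# competitions to keep, as listed in the notes above
keep_tournaments <- c("AFC Asian Cup", "Friendly", "AFC Asian Cup qualification")

# assumed cutoff: the date of the 2018 World Cup final
world_cup_end <- as.Date("2018-07-15")

results_filtered <- results_toy %>%
  filter(home_team == "Japan" | away_team == "Japan",
         tournament %in% keep_tournaments,
         date <= world_cup_end)

results_filtered
```

Removing non-AFC opponents is a separate step that depends on the federation affiliation table.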

```{r}
federation_files <- Sys.glob("../data/federation_affiliations/*")

df_federations = data.frame(country = NULL, federation = NULL)
for (f in federation_files) {
    federation = basename(f)
    content = read.csv(f, header=FALSE)
    content <- cbind(content,federation=rep(federation, dim(content)[1]))
    df_federations <- rbind(df_federations, content)
}

colnames(df_federations) <- c("country", "federation")

df_federations <- df_federations %>% 
  mutate(country = as.character(country) %>% str_trim(side = "both"))
```

```{r}
results_raw <- read_csv("../data/results.csv")

results_japan_raw <- results_raw %>% 
  filter(home_team == "Japan" | away_team == "Japan") %>% 
  rename(venue_country = country, 
         venue_city = city) %>% 
  mutate(match_num = row_number())

# combine with federation affiliations
results_japan_home <- results_japan_raw %>% 
  left_join(df_federations, 
            by = c("home_team" = "country")) %>% 
  mutate(federation = as.character(federation)) %>% 
  rename(home_federation = federation)

results_japan_away <- results_japan_raw %>% 
  left_join(df_federations, 
            by = c("away_team" = "country")) %>% 
  mutate(federation = as.character(federation)) %>% 
  rename(away_federation = federation)

# combine home-away
results_japan_cleaned <- results_japan_home %>% 
  full_join(results_japan_away)

results_japan_cleaned %>% 
  filter(is.na(home_federation)) %>% 
  pull(home_team) %>% 
  unique()

results_japan_cleaned %>% 
  filter(is.na(away_federation)) %>% 
  pull(away_team) %>% 
  unique()

```



```{r}
results_japan_cleaned <- results_japan_cleaned %>% 
  mutate(
    home_federation = case_when(
      home_team %in% c(
        "China", "Manchukuo", "Burma", "Korea Republic", "Vietnam Republic",
        "Korea DPR", "Brunei") ~ "AFC",
      home_team == "USA" ~ "Concacaf",
      home_team == "Bosnia-Herzegovina" ~ "UEFA",
      TRUE ~ home_federation),
    away_federation = case_when(
      away_team %in% c(
        "China", "Manchukuo", "Burma", "Korea Republic", "Vietnam Republic",
        "Korea DPR", "Brunei", "Taiwan") ~ "AFC",
      away_team == "USA" ~ "Concacaf",
      away_team == "Bosnia-Herzegovina" ~ "UEFA",
      TRUE ~ away_federation
    ))
```

Now that it's nice and cleaned up I can reshape it so that the data is set from Japan's perspective.

```{r}
# reshape to Japan p.o.v.

results_jp_asia <- results_japan_cleaned %>% 
  # filter only for Japan games and AFC opponents
  filter(home_team == "Japan" | away_team == "Japan",
         home_federation == "AFC" & away_federation == "AFC") %>% 
  select(-contains("federation"), -contains("venue"),
         -neutral, -match_num,
         date, home_team, home_score, away_team, away_score, tournament) %>% 
  # reshape columns to Japan vs. opponent
  mutate(
    opponent = case_when(
      away_team != "Japan" ~ away_team,
      home_team != "Japan" ~ home_team),
    home_away = case_when(
      home_team == "Japan" ~ "home",
      away_team == "Japan" ~ "away"),
    japan_goals = case_when(
      home_team == "Japan" ~ home_score,
      away_team == "Japan" ~ away_score),
    opp_goals = case_when(
      home_team != "Japan" ~ home_score,
      away_team != "Japan" ~ away_score)) %>% 
  # results
  mutate(
    result = case_when(
      japan_goals > opp_goals ~ "Win",
      japan_goals < opp_goals ~ "Loss",
      japan_goals == opp_goals ~ "Draw"),
    result = result %>% as_factor() %>% fct_relevel(c("Win", "Draw", "Loss"))) %>% 
  select(-contains("score"), -contains("team"))


#results_jp_asia %>% View()
```



```{r}
results_jp_asia %>% 
  filter(opponent == "Uzbekistan") %>% 
  group_by(result) %>% 
  count()

results_jp_asia %>% 
  filter(opponent == "Turkmenistan")

results_jp_asia %>% 
  filter(opponent == "Oman") %>% 
  knitr::kable()


results_jp_asia %>% 
  filter(opponent %in% c("Oman", "Uzbekistan", "Turkmenistan")) %>% 
  group_by(result, opponent) %>% 
  tally()

results_jp_asia %>% 
  filter(opponent %in% c("Oman", "Uzbekistan", "Turkmenistan")) %>% 
  group_by(result, opponent) %>% 
  summarize(j_g = sum(japan_goals),
            o_g = sum(opp_goals),
            n = n()) %>% 
  spread(result, n)
```



```{r}
results_jp_asia %>% 
  filter(opponent %in% c("Australia", "Korea Republic", "Iran")) %>% 
  group_by(result, opponent) %>% 
  mutate(n = n()) %>% 
  ungroup() %>% 
  group_by(result, opponent) %>% 
  summarize(j_g = sum(japan_goals),
            o_g = sum(opp_goals),
            n = n()) %>% 
  ungroup() %>% 
  spread(result, n) %>% 
  group_by(opponent) %>% 
  mutate(j_g = sum(j_g),
         o_g = sum(o_g),
         Win = sum(Win, na.rm = TRUE),
         Draw = sum(Draw, na.rm = TRUE),
         Loss = sum(Loss, na.rm = TRUE)) %>% 
  distinct()

```

Thankfully, South Korea should be on the other side of the bracket, and we would only meet Iran in the semifinals.

Japan could meet Australia in the quarter-finals, but without Aaron Mooy they're a much weaker side.

Japan have unfortunately lost our rising star, Nakajima, to injury, but we have replaced him with World Cup hero Takashi Inui.

```{r, echo=FALSE}
results_jp_asia %>% 
  filter(opponent %in% c("Oman", "Vietnam", "India")) %>% 
  group_by(result, opponent) %>% 
  mutate(n = n()) %>% 
  ungroup() %>% 
  group_by(home_away, result, opponent) %>% 
  summarize(j_goals = sum(japan_goals),
         oppo_goals = sum(opp_goals),
         n = n()) %>% 
  ungroup() %>% 
  arrange(opponent, result) %>% 
  spread(result, n) %>% 
  group_by(opponent, home_away) %>% 
  mutate(Win = if ("Win" %in% names(.)) Win else 0,
         Draw = if ("Draw" %in% names(.)) Draw else 0,
         Loss = if ("Loss" %in% names(.)) Loss else 0) %>% 
  summarize(Win = sum(Win, na.rm = TRUE),
         Draw = sum(Draw, na.rm = TRUE),
         Loss = sum(Loss, na.rm = TRUE),
         j_goals = sum(j_goals),
         o_goals = sum(oppo_goals)) %>% 
  ungroup() %>% 
  group_by(opponent) %>% 
  do(add_row(.,
             opponent = .$opponent %>% unique(),
             home_away = "total",
             Win = sum(.$Win, na.rm = TRUE),
             Draw = sum(.$Draw, na.rm = TRUE),
             Loss = sum(.$Loss, na.rm = TRUE),
             j_goals = sum(.$j_goals),
             o_goals = sum(.$o_goals)))
```



# waffle charts

```{r}
library(waffle)

tibble(
  team = c("Liverpool FC", "Draw", "Man. Utd"),
  values = c(55, 46, 68)
) -> liv_man

cols <- c("Liverpool FC" = "red", 
          "Draw" = "grey",
          "Man. Utd" = "black")

liv_man %>% 
  mutate(team = as_factor(team) %>% fct_relevel("Liverpool FC", "Draw", "Man. Utd")) %>% 
  ggplot(aes(fill = team, values = values)) +
  geom_waffle(color = "white", size = 1.125, n_rows = 6) +
  scale_x_discrete(expand = c(0, 0)) +
  scale_y_discrete(expand = c(0, 0)) +
  scale_fill_manual(values = cols, name = NULL) +
  #ggthemes::scale_fill_tableau(name=NULL) +
  coord_equal() +
  hrbrthemes::theme_ipsum_rc(grid = "") +
  theme_enhance_waffle() +
  labs(title = "The North West Derby")
```


## japan_versus function

```{r}
japan_versus <- function(data, ...) {
  # filter 
  filter_vars <- enquos(...)
  
  jp_vs <- data %>% 
    filter(!!!filter_vars) %>% 
    # count results type per opponent
    group_by(result, opponent) %>% 
    mutate(n = n()) %>% 
    ungroup() %>% 
    # sum amount of goals by Japan and opponent
    group_by(result, opponent) %>% 
    summarize(j_g = sum(japan_goals),
              o_g = sum(opp_goals),
              n = n()) %>% 
    ungroup() %>% 
    # spread results over multiple columns
    spread(result, n) %>% 
    # 1. failsafe against no type of result against an opponent
    # 2. sum up counts per opponent
    group_by(opponent) %>% 
    mutate(Win = if ("Win" %in% names(.)) Win else 0,
           Draw = if ("Draw" %in% names(.)) Draw else 0,
           Loss = if ("Loss" %in% names(.)) Loss else 0) %>% 
    summarize(Win = sum(Win, na.rm = TRUE),
              Draw = sum(Draw, na.rm = TRUE),
              Loss = sum(Loss, na.rm = TRUE),
              `Goals For` = sum(j_g),
              `Goals Against` = sum(o_g))
  
  return(jp_vs)
}
```

```{r, fig.height = 4, fig.width=3}
library(waffle)
library(extrafont)
loadfonts(device = "win")

results_jp_asia <- readRDS("../data/results_jp_asia.RDS")

glimpse(results_jp_asia)

jp_aus <- results_jp_asia %>% 
  japan_versus(opponent == "Australia") %>% 
  select(-opponent, Japan = Win, Australia = Loss) %>% 
  gather(key = "team", value = "values", -`Goals For`, -`Goals Against`) %>% 
  select(-contains("Goals"))

waffle(
  jp_aus, rows = 4, size = 1, 
  title = glue("
               Japan vs. Australia: 
               The New 'Asian' Rivalry"),
  colors = c("red", "grey", "blue"), 
  use_glyph = "futbol", glyph_size = 5,
  legend_pos = "bottom"
)

```



```{r}
pal <- c("Japan" = "blue", "Draw" = "grey", "South Korea" = "red")

results_jp_asia %>% 
  japan_versus(opponent == "Korea Republic") %>% 
  select(-opponent, Japan = Win, `South Korea` = Loss) %>% 
  gather(key = "team", value = "values", -`Goals For`, -`Goals Against`) %>% 
  ggplot(aes(fill = team, values = values)) +
  geom_waffle(color = "white", size = 1.125, n_rows = 6) +
  scale_x_discrete(expand = c(0, 0)) +
  scale_y_discrete(expand = c(0, 0)) +
  scale_fill_manual(values = pal, name = NULL) +
  coord_equal() +
  hrbrthemes::theme_ipsum_rc(grid="") +
  theme_enhance_waffle() +
  labs(title = "Japan vs. South Korea")

```



## time between start-finish + distance travelled

bar plot + calendar plot

================================================
FILE: Asian Cup 2019/japan_qatar.Rmd
================================================
---
title: "Untitled"
author: "RN7"
date: "February 1, 2019"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## R Markdown

```{r message=FALSE}
pacman::p_load(tidyverse, scales, lubridate, ggrepel, stringi, magick, 
               glue, extrafont, rvest, ggtextures, cowplot, ggimage, polite)
```

```{r}
federation_files <- Sys.glob("../data/federation_affiliations/*")

df_federations = data.frame(country = NULL, federation = NULL)
for (f in federation_files) {
    federation = basename(f)
    content = read.csv(f, header=FALSE)
    content <- cbind(content,federation=rep(federation, dim(content)[1]))
    df_federations <- rbind(df_federations, content)
}

colnames(df_federations) <- c("country", "federation")

df_federations <- df_federations %>% 
  mutate(country = as.character(country) %>% str_trim(side = "both"))
```

Now to load the results data and then join it with the affiliations data.

```{r, message=FALSE}
results_raw <- read_csv("../data/results.csv")

results_japan_raw <- results_raw %>% 
  filter(home_team == "Japan" | away_team == "Japan") %>% 
  rename(venue_country = country, 
         venue_city = city) %>% 
  mutate(match_num = row_number())

# combine with federation affiliations
results_japan_home <- results_japan_raw %>% 
  left_join(df_federations, 
            by = c("home_team" = "country")) %>% 
  mutate(federation = as.character(federation)) %>% 
  rename(home_federation = federation) 

results_japan_away <- results_japan_raw %>% 
  left_join(df_federations, 
            by = c("away_team" = "country")) %>% 
  mutate(federation = as.character(federation)) %>% 
  rename(away_federation = federation)

# combine home-away
results_japan_cleaned <- results_japan_home %>% 
  full_join(results_japan_away)
```

Next I need to fix the federation for teams whose names didn't match anything in the federation affiliation data set. For example, "South Korea" is listed as "Korea Republic" in the Kaggle data set.
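
One way to find which names need this manual recoding is an anti-join against the affiliation table. The snippet below is a sketch on hypothetical toy tables (`results_toy` and `federations_toy` are made-up stand-ins, not the real data).

```r
library(dplyr)

# Toy stand-ins (hypothetical rows) for the results and affiliation tables
results_toy <- tibble(
  home_team = c("Japan", "Korea Republic", "Japan"),
  away_team = c("Korea Republic", "Japan", "USA")
)
federations_toy <- tibble(
  country    = c("Japan", "South Korea", "United States"),
  federation = c("AFC", "AFC", "Concacaf")
)

# every team name that appears in the results, home or away
teams <- tibble(country = unique(c(results_toy$home_team, results_toy$away_team)))

# names with no counterpart in the affiliation table need manual recoding
unmatched <- teams %>% 
  anti_join(federations_toy, by = "country")

unmatched$country
```

The names returned by a check like this are the ones that have to be assigned a federation by hand.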

```{r}
results_japan_cleaned <- results_japan_cleaned %>% 
  mutate(
    home_federation = case_when(
      home_team %in% c(
        "China", "Manchukuo", "Burma", "Korea Republic", "Vietnam Republic",
        "Korea DPR", "Brunei") ~ "AFC",
      home_team == "USA" ~ "Concacaf",
      home_team == "Bosnia-Herzegovina" ~ "UEFA",
      TRUE ~ home_federation),
    away_federation = case_when(
      away_team %in% c(
        "China", "Manchukuo", "Burma", "Korea Republic", "Vietnam Republic",
        "Korea DPR", "Brunei", "Taiwan") ~ "AFC",
      away_team == "USA" ~ "Concacaf",
      away_team == "Bosnia-Herzegovina" ~ "UEFA",
      TRUE ~ away_federation
    ))
```

Now that it's nice and cleaned up I can reshape it so that the data is set from Japan's perspective.

```{r}
results_jp_asia <- results_japan_cleaned %>% 
  # filter only for Japan games and AFC opponents
  filter(home_team == "Japan" | away_team == "Japan",
         home_federation == "AFC" & away_federation == "AFC") %>% 
  select(-contains("federation"), -contains("venue"),
         -neutral, -match_num,
         date, home_team, home_score, away_team, away_score, tournament) %>% 
  # reshape columns to Japan vs. opponent
  mutate(
    opponent = case_when(
      away_team != "Japan" ~ away_team,
      home_team != "Japan" ~ home_team),
    home_away = case_when(
      home_team == "Japan" ~ "home",
      away_team == "Japan" ~ "away"),
    japan_goals = case_when(
      home_team == "Japan" ~ home_score,
      away_team == "Japan" ~ away_score),
    opp_goals = case_when(
      home_team != "Japan" ~ home_score,
      away_team != "Japan" ~ away_score)) %>% 
  # label results from Japan's perspective
  mutate(
    result = case_when(
      japan_goals > opp_goals ~ "Win",
      japan_goals < opp_goals ~ "Loss",
      japan_goals == opp_goals ~ "Draw"),
    result = result %>% as_factor() %>% fct_relevel(c("Win", "Draw", "Loss"))) %>% 
  select(-contains("score"), -contains("team"))
```


```{r}
results_jp_asia %>% filter(opponent == "Qatar") %>% knitr::kable()
```


```{r}
japan_versus <- function(data, ...) {
  # filter 
  filter_vars <- enquos(...)
  
  jp_vs <- data %>% 
    filter(!!!filter_vars) %>% 
    # count results type per opponent
    group_by(result, opponent) %>% 
    mutate(n = n()) %>% 
    ungroup() %>% 
    # sum amount of goals by Japan and opponent
    group_by(result, opponent) %>% 
    summarize(j_g = sum(japan_goals),
              o_g = sum(opp_goals),
              n = n()) %>% 
    ungroup() %>% 
    # spread results over multiple columns
    spread(result, n) %>% 
    # 1. failsafe against no type of result against an opponent
    # 2. sum up counts per opponent
    group_by(opponent) %>% 
    mutate(Win = if ("Win" %in% names(.)) Win else 0,
           Draw = if ("Draw" %in% names(.)) Draw else 0,
           Loss = if ("Loss" %in% names(.)) Loss else 0) %>% 
    summarize(Win = sum(Win, na.rm = TRUE),
              Draw = sum(Draw, na.rm = TRUE),
              Loss = sum(Loss, na.rm = TRUE),
              `Goals For` = sum(j_g),
              `Goals Against` = sum(o_g))
  
  return(jp_vs)
}
```



```{r}
results_jp_asia %>% 
  japan_versus(opponent == "Qatar") %>% 
  knitr::kable(format = "html",
               caption = "Japan's Record vs. Qatar") %>% 
  kableExtra::kable_styling(full_width = FALSE) %>% 
  kableExtra::add_header_above(c(" ", "Result" = 3, "Goals" = 2))
```

```{r}
results_jp_asia %>% 
  japan_versus(opponent == "Iran") %>% 
  knitr::kable(format = "html",
               caption = "Japan's Record vs. Iran") %>% 
  kableExtra::kable_styling(full_width = FALSE) %>% 
  kableExtra::add_header_above(c(" ", "Result" = 3, "Goals" = 2))
```





```{r}
library(ggsoccer)
library(SBpitch)

create_Pitch(grass_colour = "#538032", 
line_colour =  "#ffffff", 
background_colour = "#538032", 
goal_colour = "#000000")


dat <- data.frame(x = c(92, 80, 80, 80, 80, 
                        70, 70, 62,
                        65, 65,
                        55),
                   y = c(50, 10, 40, 65, 90, 
                         40, 60, 50,
                         10, 90,
                         50),
                  lab = c("Gonda", "Sakai", "Yoshida", "Tomiyasu", "Nagatomo",
                          "Shiotani", "Shibasaki", "Minamino",
                          "Haraguchi", "Doan",
                          "Osako"))
dat %>%
  ggplot(aes(x = x, y = y)) +
  annotate_pitch(fill = "#538032", 
                 colour = "white") +
  geom_label(aes(label = lab)) +
  theme_pitch() +
  coord_flip(xlim = c(49, 101),
             ylim = c(-1, 101))
```



================================================
FILE: Asian Cup 2019/jpn_aus_waffle.Rmd
================================================
---
title: "Untitled"
author: "RN7"
date: "January 12, 2019"
output: 
  md_document:
    variant: markdown_github
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

A new rival, Australia, emerged to challenge Japan in Asia when they joined the AFC in 2006. From the come-from-behind defeat in the group stage of the 2006 World Cup (still one of my most painful memories as a Japanese football fan...) to an extra-time win in the 2011 Asian Cup Final, Japan and Australia have clashed dramatically over the past decade.

Using the `waffle` package I can create a graphic that summarizes the results between the two sides.


```{r}
japan_versus <- function(data, ...) {
  # filter 
  filter_vars <- enquos(...)
  
  jp_vs <- data %>% 
    filter(!!!filter_vars) %>% 
    # count results type per opponent
    group_by(result, opponent) %>% 
    mutate(n = n()) %>% 
    ungroup() %>% 
    # sum amount of goals by Japan and opponent
    group_by(result, opponent) %>% 
    summarize(j_g = sum(japan_goals),
              o_g = sum(opp_goals),
              n = n()) %>% 
    ungroup() %>% 
    # spread results over multiple columns
    spread(result, n) %>% 
    # 1. failsafe against no type of result against an opponent
    # 2. sum up counts per opponent
    group_by(opponent) %>% 
    mutate(Win = if ("Win" %in% names(.)) Win else 0,
           Draw = if ("Draw" %in% names(.)) Draw else 0,
           Loss = if ("Loss" %in% names(.)) Loss else 0) %>% 
    summarize(Win = sum(Win, na.rm = TRUE),
              Draw = sum(Draw, na.rm = TRUE),
              Loss = sum(Loss, na.rm = TRUE),
              `Goals For` = sum(j_g),
              `Goals Against` = sum(o_g))
  
  return(jp_vs)
}
```

```{r, fig.height = 4, fig.width=3}
library(glue)
library(dplyr)
library(tidyr)
library(waffle)
library(extrafont)
loadfonts(device = "win")

results_jp_asia <- readRDS("../data/results_jp_asia.RDS")


jp_aus <- results_jp_asia %>% 
  japan_versus(opponent == "Australia") %>% 
  select(-opponent, Japan = Win, Australia = Loss) %>% 
  gather(key = "team", value = "values", -`Goals For`, -`Goals Against`) %>% 
  select(-contains("Goals"))

# Waffle plot!
waffle(
  jp_aus, rows = 4, size = 1, 
  title = glue("
               Japan vs. Australia: 
               The New 'Asian' Rivalry"),
  colors = c("red", "grey", "blue"), 
  use_glyph = "futbol", glyph_size = 5,
  legend_pos = "bottom"
)

```



================================================
FILE: Asian Cup 2019/jpn_aus_waffle.md
================================================
A new rival, Australia, emerged to challenge Japan in Asia when they
joined the AFC in 2006. From the come-from-behind defeat in the group
stage of the 2006 World Cup (still one of my most painful memories as a
Japanese football fan…) to an extra-time win in the 2011 Asian Cup
Final, Japan and Australia have clashed dramatically over the past decade.

Using the `waffle` package I can create a graphic that summarizes the
results between the two sides.

``` r
japan_versus <- function(data, ...) {
  # filter 
  filter_vars <- enquos(...)
  
  jp_vs <- data %>% 
    filter(!!!filter_vars) %>% 
    # count results type per opponent
    group_by(result, opponent) %>% 
    mutate(n = n()) %>% 
    ungroup() %>% 
    # sum amount of goals by Japan and opponent
    group_by(result, opponent) %>% 
    summarize(j_g = sum(japan_goals),
              o_g = sum(opp_goals),
              n = n()) %>% 
    ungroup() %>% 
    # spread results over multiple columns
    spread(result, n) %>% 
    # 1. failsafe against no type of result against an opponent
    # 2. sum up counts per opponent
    group_by(opponent) %>% 
    mutate(Win = if ("Win" %in% names(.)) Win else 0,
           Draw = if ("Draw" %in% names(.)) Draw else 0,
           Loss = if ("Loss" %in% names(.)) Loss else 0) %>% 
    summarize(Win = sum(Win, na.rm = TRUE),
              Draw = sum(Draw, na.rm = TRUE),
              Loss = sum(Loss, na.rm = TRUE),
              `Goals For` = sum(j_g),
              `Goals Against` = sum(o_g))
  
  return(jp_vs)
}
```

``` r
library(glue)
library(dplyr)
```

    ## 
    ## Attaching package: 'dplyr'

    ## The following object is masked from 'package:glue':
    ## 
    ##     collapse

    ## The following objects are masked from 'package:stats':
    ## 
    ##     filter, lag

    ## The following objects are masked from 'package:base':
    ## 
    ##     intersect, setdiff, setequal, union

``` r
library(tidyr)
```

    ## Warning: package 'tidyr' was built under R version 3.5.2

``` r
library(waffle)
```

    ## Loading required package: ggplot2

    ## Warning: package 'ggplot2' was built under R version 3.5.2

``` r
library(extrafont)
```

    ## Registering fonts with R

``` r
loadfonts(device = "win")
```

``` r
results_jp_asia <- readRDS("../data/results_jp_asia.RDS")


jp_aus <- results_jp_asia %>% 
  japan_versus(opponent == "Australia") %>% 
  select(-opponent, Japan = Win, Australia = Loss) %>% 
  gather(key = "team", value = "values", -`Goals For`, -`Goals Against`) %>% 
  select(-contains("Goals"))

# Waffle plot!
waffle(
  jp_aus, rows = 4, size = 1, 
  title = glue("
               Japan vs. Australia: 
               The New 'Asian' Rivalry"),
  colors = c("red", "grey", "blue"), 
  use_glyph = "futbol", glyph_size = 5,
  legend_pos = "bottom"
)
```

![](jpn_aus_waffle_files/figure-markdown_github/unnamed-chunk-2-1.png)


================================================
FILE: Asian Cup 2019/jpn_saudi.Rmd
================================================
---
title: "Untitled"
author: "RN7"
date: "January 21, 2019"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

## R Markdown
```{r message=FALSE}
pacman::p_load(tidyverse, scales, lubridate, ggrepel, stringi, magick, 
               glue, extrafont, rvest, ggtextures, cowplot, ggimage, polite)
```

```{r}
federation_files <- Sys.glob("../data/federation_affiliations/*")

df_federations <- data.frame(country = NULL, federation = NULL)
for (f in federation_files) {
  # the file name is used as the federation label for every country in the file
  federation <- basename(f)
  content <- read.csv(f, header = FALSE)
  content <- cbind(content, federation = rep(federation, nrow(content)))
  df_federations <- rbind(df_federations, content)
}

colnames(df_federations) <- c("country", "federation")

df_federations <- df_federations %>% 
  mutate(country = as.character(country) %>% str_trim(side = "both"))
```
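
As an aside, the same loop can be sketched with `purrr::map_dfr()`. This is only an illustration under assumptions: the two temporary files and their contents are made-up stand-ins for the real affiliation files, and `df_federations_alt` is a hypothetical name.

```{r}
library(dplyr)
library(purrr)
library(stringr)

# Made-up stand-ins for the affiliation files (file name = federation name)
tmp_dir <- file.path(tempdir(), "federation_affiliations")
dir.create(tmp_dir, showWarnings = FALSE)
writeLines(c("Japan", " Australia "), file.path(tmp_dir, "AFC"))
writeLines(c("Germany", "France"), file.path(tmp_dir, "UEFA"))

federation_files <- Sys.glob(file.path(tmp_dir, "*"))

df_federations_alt <- federation_files %>% 
  # name each path by its file name so .id can carry it into the result
  set_names(basename(.)) %>% 
  map_dfr(~ read.csv(.x, header = FALSE, col.names = "country",
                     stringsAsFactors = FALSE),
          .id = "federation") %>% 
  mutate(country = str_trim(country, side = "both")) %>% 
  select(country, federation)

df_federations_alt
```

The `.id` argument does the work of the `cbind()` step in the loop, so no empty data frame has to be grown row by row.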

Now to load the results data and then join it with the affiliations data.

```{r, message=FALSE}
results_raw <- read_csv("../data/results.csv")

results_japan_raw <- results_raw %>% 
  filter(home_team == "Japan" | away_team == "Japan") %>% 
  rename(venue_country = country, 
         venue_city = city) %>% 
  mutate(match_num = row_number())

# combine with federation affiliations
results_japan_home <- results_japan_raw %>% 
  left_join(df_federations, 
            by = c("home_team" = "country")) %>% 
  mutate(federation = as.character(federation)) %>% 
  rename(home_federation = federation) 

results_japan_away <- results_japan_raw %>% 
  left_join(df_federations, 
            by = c("away_team" = "country")) %>% 
  mutate(federation = as.character(federation)) %>% 
  rename(away_federation = federation)

# combine home-away
results_japan_cleaned <- results_japan_home %>% 
  full_join(results_japan_away)
```
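
A small sketch of what the join does when a team name is missing from the affiliation data. The rows below are made up for illustration and the `toy_*` names are hypothetical.

```{r}
library(dplyr)

# Made-up rows: "Korea Republic" has no exact match in the affiliation table
toy_results <- tibble(
  home_team = c("Japan", "Korea Republic"),
  away_team = c("Australia", "Japan")
)
toy_federations <- tibble(
  country    = c("Japan", "Australia", "South Korea"),
  federation = c("AFC", "AFC", "AFC")
)

toy_joined <- toy_results %>% 
  left_join(toy_federations, by = c("home_team" = "country")) %>% 
  rename(home_federation = federation)

# Teams left without a federation after the join
toy_joined %>% 
  filter(is.na(home_federation)) %>% 
  distinct(home_team)
```

`left_join()` keeps every match and fills the federation with `NA` where the names differ, so listing the `NA` rows is a quick way to find the teams that need manual treatment.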

Next I need to fill in the federation for teams that had no match in the federation affiliation data set because their names differ between the two sources. For example, "South Korea" is listed as "Korea Republic" in the Kaggle data set.

```{r}
results_japan_cleaned <- results_japan_cleaned %>% 
  mutate(
    home_federation = case_when(
      home_team %in% c(
        "China", "Manchukuo", "Burma", "Korea Republic", "Vietnam Republic",
        "Korea DPR", "Brunei", "Taiwan") ~ "AFC",
      home_team == "USA" ~ "Concacaf",
      home_team == "Bosnia-Herzegovina" ~ "UEFA",
      TRUE ~ home_federation),
    away_federation = case_when(
      away_team %in% c(
        "China", "Manchukuo", "Burma", "Korea Republic", "Vietnam Republic",
        "Korea DPR", "Brunei", "Taiwan") ~ "AFC",
      away_team == "USA" ~ "Concacaf",
      away_team == "Bosnia-Herzegovina" ~ "UEFA",
      TRUE ~ away_federation
    ))
```

Now that the data is cleaned up, I can reshape it so that every match is described from Japan's perspective.

```{r}
results_jp_asia <- results_japan_cleaned %>% 
  # filter only for Japan games and AFC opponents
  filter(home_team == "Japan" | away_team == "Japan",
         home_federation == "AFC" & away_federation == "AFC") %>% 
  select(-contains("federation"), -contains("venue"),
         -neutral, -match_num,
         date, home_team, home_score, away_team, away_score, tournament) %>% 
  # reshape columns to Japan vs. opponent
  mutate(
    opponent = case_when(
      away_team != "Japan" ~ away_team,
      home_team != "Japan" ~ home_team),
    home_away = case_when(
      home_team == "Japan" ~ "home",
      away_team == "Japan" ~ "away"),
    japan_goals = case_when(
      home_team == "Japan" ~ home_score,
      away_team == "Japan" ~ away_score),
    opp_goals = case_when(
      home_team != "Japan" ~ home_score,
      away_team != "Japan" ~ away_score)) %>% 
  # label results from Japan's perspective
  mutate(
    result = case_when(
      japan_goals > opp_goals ~ "Win",
      japan_goals < opp_goals ~ "Loss",
      japan_goals == opp_goals ~ "Draw"),
    result = result %>% as_factor() %>% fct_relevel(c("Win", "Draw", "Loss"))) %>% 
  select(-contains("score"), -contains("team"))
```
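
The reshaping idea can be checked on a few made-up matches. This is a simplified sketch, not the chunk above: it uses `if_else()` instead of the paired `case_when()` conditions, and the teams and scores are placeholders.

```{r}
library(dplyr)

# Made-up matches; "Team A" and "Team B" are placeholders
toy_matches <- tibble(
  home_team  = c("Japan", "Team A", "Japan"),
  home_score = c(2, 3, 1),
  away_team  = c("Team A", "Japan", "Team B"),
  away_score = c(0, 1, 1)
)

toy_jp <- toy_matches %>% 
  mutate(
    # pick the "other" side of each match relative to Japan
    opponent    = if_else(home_team == "Japan", away_team, home_team),
    japan_goals = if_else(home_team == "Japan", home_score, away_score),
    opp_goals   = if_else(home_team == "Japan", away_score, home_score),
    result = case_when(
      japan_goals > opp_goals ~ "Win",
      japan_goals < opp_goals ~ "Loss",
      TRUE ~ "Draw")) %>% 
  select(opponent, japan_goals, opp_goals, result)

toy_jp
```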


```{r}
results_jp_asia %>% filter(opponent == "Saudi Arabia") %>% knitr::kable()
```


```{r}
japan_versus <- function(data, ...) {
  # filter 
  filter_vars <- enquos(...)
  
  jp_vs <- data %>% 
    filter(!!!filter_vars) %>% 
    # count each result type and sum goals by Japan and opponent
    group_by(result, opponent) %>% 
    summarize(j_g = sum(japan_goals),
              o_g = sum(opp_goals),
              n = n()) %>% 
    ungroup() %>% 
    # spread results over multiple columns
    spread(result, n) %>% 
    # 1. failsafe against no type of result against an opponent
    # 2. sum up counts per opponent
    group_by(opponent) %>% 
    mutate(Win = if ("Win" %in% names(.)) Win else 0,
           Draw = if ("Draw" %in% names(.)) Draw else 0,
           Loss = if ("Loss" %in% names(.)) Loss else 0) %>% 
    summarize(Win = sum(Win, na.rm = TRUE),
              Draw = sum(Draw, na.rm = TRUE),
              Loss = sum(Loss, na.rm = TRUE),
              `Goals For` = sum(j_g),
              `Goals Against` = sum(o_g))
  
  return(jp_vs)
}
```
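
The `enquos()` and `!!!` pattern in `japan_versus()` can be tried in isolation. The sketch below uses a hypothetical helper, `count_results()`, and made-up rows; it only demonstrates how filter conditions passed through `...` reach `filter()`.

```{r}
library(dplyr)

# Hypothetical helper: forwards any filter conditions passed through `...`
count_results <- function(data, ...) {
  filter_vars <- enquos(...)   # capture the conditions without evaluating them
  data %>% 
    filter(!!!filter_vars) %>% # splice them into filter() as separate arguments
    count(opponent, result)
}

# Made-up rows
toy <- tibble(
  opponent = c("Team A", "Team A", "Team B"),
  result   = c("Win", "Draw", "Win")
)

count_results(toy, opponent == "Team A")
count_results(toy, opponent == "Team A", result == "Win")
```

Because the conditions are spliced in, the caller can pass one condition, several, or none at all without the function needing a fixed set of arguments.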



```{r}
results_jp_asia %>% 
  japan_versus(opponent == "Saudi Arabia") %>% 
  knitr::kable(format = "html",
               caption = "Japan's Record vs. Saudi Arabia") %>% 
  kableExtra::kable_styling(full_width = FALSE) %>% 
  kableExtra::add_header_above(c(" ", "Result" = 3, "Goals" = 2))
```

```{r}
results_jp_asia %>% 
  japan_versus(opponent == "Vietnam") %>% 
  knitr::kable(format = "html",
               caption = "Japan's Record vs. Vietnam") %>% 
  kableExtra::kable_styling(full_width = FALSE) %>% 
  kableExtra::add_header_above(c(" ", "Result" = 3, "Goals" = 2))
```



================================================
FILE: Asian Cup 2019/visualize_asian_cup_2019.knit.md
================================================
Another year, another big soccer/football tournament! This time it’s the
top international competition in Asia, the Asian Cup hosted in the
U.A.E. I’ll be covering (responsible) web-scraping, data wrangling
(tidyverse ftw!), and of course, data visualization with `ggplot2`.

Let’s get started!

Packages
--------

``` r
pacman::p_load(tidyverse, scales, lubridate, ggrepel, stringi, magick, 
               glue, extrafont, rvest, ggtextures, cowplot, ggimage, polite)
# Roboto Condensed font (from hrbrthemes or just Google it)
loadfonts()
```

Top Goalscorers of the Asian Cup
--------------------------------

The first thing I looked at was, “Who were the top goalscorers in the
history of the Asian Cup?”

Here I use the [polite](https://github.com/dmi3kno/polite) package to
take a look at the `robots.txt` for the web page and see if it is OK to
web scrape. It’s good to make things like this a habit!

First you pass the URL to the `bow()` function, check that you are
indeed allowed to scrape, then use `scrape()` to retrieve data, and the
rest is the usual `rvest` web-scraping workflow.

``` r
topg_url <- "https://en.wikipedia.org/wiki/AFC_Asian_Cup_records_and_statistics"

session <- bow(topg_url)

ac_top_scorers <- scrape(session) %>%
  html_nodes("table.wikitable:nth-child(29)") %>% 
  html_table() %>% 
  flatten_df() %>% 
  select(-Ref.) %>% 
  set_names(c("total_goals", "player", "country"))
```
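
If you want to try the `rvest` part of this workflow without hitting a website, here is a rough offline sketch. Everything in it is an assumption for illustration: the inline HTML and its rows are placeholders rather than real records, and I use `purrr::pluck(1)` to pick the first table instead of `flatten_df()`.

``` r
library(rvest)
library(dplyr)

# Inline stand-in for a scraped page; the rows are placeholders, not real records
toy_html <- read_html('
  <table class="wikitable">
    <tr><th>Goals</th><th>Player</th><th>Country</th><th>Ref.</th></tr>
    <tr><td>3</td><td>Player A</td><td>Country A</td><td>[1]</td></tr>
    <tr><td>2</td><td>Player B</td><td>Country B</td><td>[2]</td></tr>
  </table>')

toy_scorers <- toy_html %>% 
  html_nodes("table.wikitable") %>%  # select the table by CSS class
  html_table() %>%                   # returns a list with one data frame per table
  purrr::pluck(1) %>%                # keep the first (and only) table
  select(-Ref.) %>% 
  purrr::set_names(c("total_goals", "player", "country"))

toy_scorers
```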

For brevity, let’s only take a look at the top 5 goal scorers. I’ll also
`mutate()` in a nice image of a soccer ball for the data points on the
plot.

``` r
ac_top_scorers <- ac_top_scorers %>% 
  head(5) %>% 
  mutate(image = "https://www.emoji.co.uk/files/microsoft-emojis/activity-windows10/8356-soccer-ball.png")
```

Now it’s ready! This is slightly different from your standard bar graph
as I use the `geom_isotype_col()` function from `ggtextures` to build
each bar out of soccer ball images. Compared to other functions in
`ggtextures`, `geom_isotype_col()` allows each image to correspond to
the value of the variable you are plotting, in this case 1 ball = 1 goal!

``` r
ac_top_graph <- ac_top_scorers %>% 
  ggplot(aes(x = reorder(player, total_goals), y = total_goals,
             image = image)) +
  geom_isotype_col(img_width = grid::unit(1, "native"), img_height = NULL,
    ncol = NA, nrow = 1, hjust = 0, vjust = 0.5) +
  coord_flip() +
  scale_y_continuous(breaks = c(0, 2, 4, 6, 8, 10, 12, 14),
                     expand = c(0, 0), 
                     limits = c(0, 15)) +
  ggthemes::theme_solarized() +
  labs(title = "Top Scorers of the Asian Cup",
       subtitle = "Most goals in a single tournament: 8 (Ali Daei, 1996)",
       y = "Number of Goals", x = NULL,
       caption = glue("
                      Source: Wikipedia
                      By @R_by_Ryo")) +
  theme(text = element_text(family = "Roboto Condensed"),
        plot.title = element_text(size = 22),
        plot.subtitle = element_text(size = 14),
        axis.text = element_text(size = 14),
        axis.title.x = element_text(size = 16),
        axis.line.y = element_blank(),
        panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        axis.ticks.y = element_blank())

ac_top_graph
```

![](visualize_asian_cup_2019_files/figure-markdown_github/top%20goal%20scorers%20plot-1.png)

OK, not bad. However, wouldn’t it be nice to add a bit more information
for context? Specifically, which country these players came from. So
let’s add some flags along the y-axis!

There are lots of different ways to do this (like `geom_flag()` from the
`ggimage` package) but I ended up doing it the `cowplot` way. I had to
tweak the scales a bit as the flags came in different sizes. When you
plot, you just insert the image strip into the bar plot with
`axis_canvas()` and combine all the parts together with `ggdraw()`!

``` r
axis_image <- axis_canvas(ac_top_graph, axis = 'y') + 
  draw_image("https://upload.wikimedia.org/wikipedia/commons/c/ca/Flag_of_Iran.svg", 
             y = 13, scale = 1.5) +
  draw_image("https://upload.wikimedia.org/wikipedia/commons/0/09/Flag_of_South_Korea.svg", 
             y = 10, scale = 1.7) +
  draw_image("https://upload.wikimedia.org/wikipedia/en/9/9e/Flag_of_Japan.svg", 
             y = 7, scale = 1.7) +
  draw_image("https://upload.wikimedia.org/wikipedia/commons/f/f6/Flag_of_Iraq.svg", 
             y = 4, scale = 1.6) +
  draw_image("https://upload.wikimedia.org/wikipedia/commons/a/aa/Flag_of_Kuwait.svg", 
             y = 1, scale = 1.2)

ggdraw(insert_yaxis_grob(ac_top_graph, axis_image, position = "left"))
```

<img src="visualize_asian_cup_2019_files/figure-markdown_github/draw_image-1.png" style="display: block; margin: auto;" />

Ideally I wanted the soccer balls to be the official balls from the
tournament that the player scored in. However, I couldn’t find a nice
emoji-fied/icon-ized version and there was also the “small” problem in
that there was no “official” Asian Cup ball until the 2004 tournament in
China! You can take a look at the official Asian Cup balls
[here](http://football-balls.com/balls/asian-cup).

Winners of the Asian Cup
------------------------

We saw that the top goal scorers came from Iran, South Korea, Japan,
Iraq, and Kuwait but did their goal scoring exploits lead their nations
to glory? Let’s find out!

When web-scraping I really like using `flatten_df()` after
`html_table()` as I don’t have to use the awkward looking `.[[1]]`
within my piped workflow.

``` r
acup_url <- "https://en.wikipedia.org/wiki/AFC_Asian_Cup"

session <- bow(acup_url)

acup_winners_raw <- scrape(session) %>% 
  html_nodes("table:nth-child(31)") %>% 
  html_table() %>% 
  flatten_df()
```

Now I can use the `clean_names()` function to quickly clean up my names
(mainly when I can’t be bothered to `set_names()` them myself…).

The next step uses `separate()` to split each of the 1st to 3rd place
columns into the number of times a team finished there and the years in
which it did so.

The variants of `mutate()` are then used to tidy the string columns of
the data into numeric type.

I use `gather()` so each team will have a row for each of the rank
positions (1st-3rd).

Finally, I arrange the data in a way that the facets will be ordered in
the way that I want.

``` r
acup_winners_clean <- acup_winners_raw %>% 
  janitor::clean_names() %>% 
  slice(1:8) %>% 
  select(-fourth_place, -semi_finalists, -total_top_four) %>% 
  separate(winners, into = c("Champions", "first_place_year"), 
           sep = " ", extra = "merge") %>% 
  separate(runners_up, into = c("Runners-up", "second_place_year"), 
           sep = " ", extra = "merge") %>% 
  separate(third_place, into = c("Third Place", "third_place_year"), 
           sep = " ", extra = "merge") %>% 
  mutate_all(funs(str_replace_all(., "–", "0"))) %>% 
  mutate_at(vars(contains("num")), funs(as.numeric)) %>% 
  mutate(team = if_else(team == "Israel1", "Israel", team)) %>% 
  gather(key = "key", value = "value", -team, 
         -first_place_year, -second_place_year, -third_place_year) %>% 
  mutate(key = key %>% 
           fct_relevel(c("Champions", "Runners-up", "Third Place"))) %>% 
  arrange(key, value) %>% 
  mutate(team = as_factor(team),
         order = row_number())
```
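
To make the `separate()` step a bit more concrete, here is a small sketch on a made-up table. The cell format "count (years)" and the placeholder teams are assumptions, and I add `fill = "right"` so that the rows without years give `NA` quietly.

``` r
library(dplyr)
library(tidyr)
library(stringr)

# Made-up cells in an assumed "count (years)" format; "–" marks no placing
toy_table <- tibble(
  team    = c("Team A", "Team B"),
  winners = c("2 (1990, 1994)", "–")
)

toy_clean <- toy_table %>% 
  # split at the first space only: count on the left, years on the right
  separate(winners, into = c("Champions", "first_place_year"),
           sep = " ", extra = "merge", fill = "right") %>% 
  mutate(Champions = as.numeric(str_replace(Champions, "–", "0")))

toy_clean
```

With `extra = "merge"` everything after the first space stays together, so the list of years is not cut off at its own spaces.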

I plot using facets on the “key” variable (containing the rank data) so
that we can see how many times each team placed as Champions to Third
Place. I also use the `glue()` function here to format the multi-line
captions and titles in a neat way.

``` r
acup_winners_clean %>% 
  ggplot(aes(value, team, color = key)) +
  geom_point(size = 5) +
  scale_color_manual(values = c("Champions" = "#FFCC33",
                                "Runners-up" = "#999999",
                                "Third Place" = "#CC6600"),
                     guide = FALSE) +
  labs(x = "Number of Occurrences",
       title = "Winners & Losers of the Asian Cup!",
       subtitle = glue("
                       Ordered by number of Asian Cup(s) won.
                       Four-time Champions, Japan, only won their first in 1992!"),
       caption = glue("
                      Note: Israel was expelled by the AFC in 1974 while Australia joined the AFC in 2006.
                      Source: Wikipedia
                      By @R_by_Ryo")) +
  facet_wrap(~key) +
  theme_minimal() +
  theme(text = element_text(family = "Roboto Condensed"),
        title = element_text(size = 18),
        plot.subtitle = element_text(size = 12),
        axis.title.y = element_blank(),
        axis.title.x = element_text(size = 12),
        axis.text.y = element_text(size = 14),
        axis.text.x = element_text(size = 12),
        plot.caption = element_text(hjust = 0, size = 10),
        panel.border = element_rect(fill = NA, colour = "grey20"),
        panel.grid.minor.x = element_blank(),
        strip.text = element_text(size = 16)) 
```

<img src="visualize_asian_cup_2019_files/figure-markdown_github/unnamed-chunk-108-1.png" style="display: block; margin: auto;" />

Goals per Game
--------------

One new thing I learned very recently, while working on this viz in
fact, was using magrittr aliases!

[Tweet by Emil Hvitfeldt](https://twitter.com/Emil_Hvitfeldt/status/1081080919073542144)

When web scraping I usually wind up having to use `.[x]` or `.[[x]]`
but now I can just use `extract()` or `extract2()` respectively to do
the same thing!

``` r
wiki_url <- "https://en.wikipedia.org"
session <- bow(wiki_url)
acup_url <- "https://en.wikipedia.org/wiki/AFC_Asian_Cup"
session_cup <- bow(acup_url)

cup_links <- scrape(session_cup) %>% 
  html_nodes("br+ i a") %>% 
  html_attr("href") %>% 
  magrittr::extract(-17:-18)

acup_df <- cup_links %>% 
  as_data_frame() %>% 
  mutate(cup = str_remove(value, "\\/wiki\\/") %>% str_replace_all("_", " ")) %>% 
  rename(link = value)
```
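
Here is a tiny sketch of the two aliases on made-up inputs (the link strings and the `tables` list are placeholders). I call them with the `magrittr::` prefix because `tidyr` also exports a function named `extract()`.

``` r
library(magrittr)

# Made-up inputs
links  <- c("/wiki/a", "/wiki/b", "/wiki/c", "/wiki/d")
tables <- list(data.frame(x = 1:2), data.frame(y = 3))

# extract() is `[`: drop the 3rd and 4th elements
kept <- links %>% magrittr::extract(-3:-4)

# extract2() is `[[`: pull out the first element of a list
first_table <- tables %>% magrittr::extract2(1)

identical(kept, links[-3:-4])
identical(first_table, tables[[1]])
```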

Another cool thing I found while scraping this data was the `jump_to()`
function that allows you to navigate to a new URL. This makes
`map()`-ing over multiple URL links from a base URL very easy! Here, the
base URL is the AFC Asian Cup Wikipedia page and the function iterates
over each of the links to the URL of the respective tournament pages.
Another way that I could’ve done this was to `map()` over the different
years of the tournaments, as the Wikipedia URLs for each edition of the
Asian Cup differ only in the year at the start of the page name.

``` r
goals_info <- function(x) {
  goal_info <- session %>% 
    jump_to(x) %>% 
    html_nodes(".vcalendar") %>% 
    html_table(header = FALSE) %>% 
    flatten_df() %>% 
    spread(key = X1, value = X2) %>% 
    select(`Goals scored`) %>% 
    mutate(`Goals scored` = str_remove_all(`Goals scored`, pattern = ".*\\(") %>% 
             str_extract_all("\\d+\\.*\\d*") %>% as.numeric)
}

team_num_info <- function(x) {
  team_num_info <- session %>% 
    jump_to(x) %>% 
    html_nodes(".vcalendar") %>% 
    html_table(header = FALSE) %>% 
    flatten_df() %>% 
    spread(key = X1, value = X2) %>% 
    select(`Teams`) %>% 
    mutate(`Teams` = as.numeric(`Teams`))
}

match_num_info <- function(x) {
  match_num_info <- session %>% 
    jump_to(x) %>% 
    html_nodes(".vcalendar") %>% 
    html_table(header = FALSE) %>% 
    flatten_df() %>% 
    spread(key = X1, value = X2) %>% 
    janitor::clean_names() %>% 
    select(matches_played) %>% 
    mutate(matches_played = as.numeric(matches_played))
}

# all together:
goals_data <- acup_df %>% 
  mutate(goals_per_game = map(acup_df$link, goals_info) %>% unlist,
         team_num = map(acup_df$link, team_num_info) %>% unlist,
         match_num = map(acup_df$link, match_num_info) %>% unlist)
```
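
The string handling inside `goals_info()` is easier to follow on a single value. This is just a sketch: the input string is an assumed shape for the infobox text ("total (average per match)"), not something copied from the page.

``` r
library(stringr)

# Assumed infobox text, shaped like "total (average per match)"
goals_scored <- "27 (4.5 per match)"

goals_per_game <- goals_scored %>% 
  str_remove_all(".*\\(") %>%      # drop everything up to the opening bracket
  str_extract("\\d+\\.*\\d*") %>%  # keep the first number that follows
  as.numeric()

goals_per_game
```

The three helper functions each scrape the same page once, so another option would be a single function that returns all three values, but I kept them separate here.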

Next, clean it up a bit and add in the number of teams that participated
in each tournament.

``` r
ac_goals_df <- goals_data %>% 
  mutate(label = cup %>% str_extract("[0-9]+") %>% str_replace("..", "'"),
         team_num = case_when(
           is.na(team_num) ~ 16,
           TRUE ~ team_num
         )) %>% 
  arrange(cup) %>% 
  mutate(label = factor(label, label),
         team_num = c(4, 4, 4, 5, 6, 6, 10, 10, 10, 8, 12, 12, 16, 16, 16, 16))

glimpse(ac_goals_df)
```

    ## Observations: 16
    ## Variables: 6
    ## $ link           <chr> "/wiki/1956_AFC_Asian_Cup", "/wiki/1960_AFC_Asi...
    ## $ cup            <chr> "1956 AFC Asian Cup", "1960 AFC Asian Cup", "19...
    ## $ goals_per_game <dbl> 4.50, 3.17, 2.17, 3.20, 2.92, 2.50, 3.17, 1.83,...
    ## $ team_num       <dbl> 4, 4, 4, 5, 6, 6, 10, 10, 10, 8, 12, 12, 16, 16...
    ## $ match_num      <dbl> 6, 6, 6, 10, 13, 10, 24, 24, 24, 16, 26, 26, 32...
    ## $ label          <fct> '56, '60, '64, '68, '72, '76, '80, '84, '88, '9...
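
For reference, here is how the short year labels come about, as a small sketch on three tournament names. The point of passing the labels as their own factor levels is to keep the tournaments in chronological order instead of alphabetical order.

``` r
library(stringr)

cups <- c("1956 AFC Asian Cup", "1960 AFC Asian Cup", "2015 AFC Asian Cup")

labels <- cups %>% 
  str_extract("[0-9]+") %>%  # first run of digits, e.g. "1956"
  str_replace("..", "'")     # swap the first two characters for an apostrophe

# Levels in order of appearance; a plain factor() would sort "'15" first
labels <- factor(labels, levels = labels)

levels(labels)
```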

Now we make a line graph but with LOTS of `annotate()` code to add in
comments, labels, and segments for the labels. At the end I use
`geom_emoji()` to add a soccer ball to the plot for each of the data
points.

``` r
plot <- ac_goals_df %>% 
  ggplot(aes(x = label, y = goals_per_game, group = 1)) +
  geom_line() +
  scale_y_continuous(limits = c(NA, 5.35),
                     breaks = c(1.5, 2, 2.5, 3, 3.5, 4, 4.5)) +
  labs(x = "Tournament (Year)", y = "Goals per Game") +
  theme_minimal() +
  theme(text = element_text(family = "Roboto Condensed"),
        axis.title = element_text(size = 12),
        axis.text = element_text(size = 12)) +
  annotate(geom = "label", x = "'56", y = 5.15, family = "Roboto Condensed",
           color = "black", 
           label = "Total Number of Games Played:", hjust = 0) +
  annotate(geom = "text", x = "'60", y = 4.9, 
           label = "6", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 1, xend = 3, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'68", y = 4.9, 
           label = "10", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 3.8, xend = 4.2, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'72", y = 4.9, 
           label = "13", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 4.8, xend = 5.2, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'76", y = 4.9, 
           label = "10", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 5.8, xend = 6.2, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'84", y = 4.9, 
           label = "24", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 7, xend = 9, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'92", y = 4.9, 
           label = "16", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 9.8, xend = 10.2, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = 11.5, y = 4.9, 
           label = "26", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 11, xend = 12, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = 14.5, y = 4.9, 
           label = "32", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 13, xend = 16, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = 9, y = 4, family = "Roboto Condensed",
           label = glue("
                        Incredibly low amount of goals in Group B
                        (15 in 10 Games) and in Knock-Out Stages
                        (4 goals in 4, only 1 scored in normal time)")) +
  annotate(geom = "segment", x = 9, xend = 9, y = 1.65, yend = 3.75,
           color = "red") +
  ggimage::geom_emoji(aes(image = '26bd'), size = 0.03) 

plot
```

<img src="visualize_asian_cup_2019_files/figure-markdown_github/unnamed-chunk-109-1.png" style="display: block; margin: auto;" />

``` r
ggsave(filename = paste0(here::here("Asian Cup 2019"), "/gpg_plot.png"), 
       width = 8, height = 7, dpi = 300)
plot <- image_read(paste0(here::here("Asian Cup 2019"), "/gpg_plot.png"))
```

However, I’m not finished yet! I wanted to make this look a bit
more “official”, so I tried adding the Asian Cup logo to the top
right corner. There are probably other ways to do this, especially
with grobs, but I was reminded of
[this](https://www.danielphadley.com/ggplot-logo/) blog post by [Daniel
Hadley](https://twitter.com/danielphadley), who used the `magick` package
to add a footer with a logo onto a `ggplot` object. I’ve used `magick`
before for animations and this was a good chance to try it out for image
editing. Unlike Daniel Hadley’s example, I needed the logo in the top
right corner, so I created a blank canvas with `image_blank()` and then
placed everything on top of that with `image_composite()` and
`image_append()`.

``` r
logo_raw <- image_read("https://upload.wikimedia.org/wikipedia/en/a/ad/2019_afc_asian_cup_logo.png")

logo_proc <- logo_raw %>% image_scale("600")

# create blank canvas
a <- image_blank(width = 1000, height = 100, color = "white")
# combine with logo image and shift logo to the right
b <- image_composite(image_scale(a, "x100"), image_scale(logo_proc, "x75"), 
                     offset = "+880+25")
# add in the title text
logo_header <- b %>% 
  image_annotate(text = glue("Goals per Game throughout the history of the Asian Cup"),
                 color = "black", size = 24, font = "Roboto Condensed",
                 location = "+63+50", gravity = "northwest")

# combine it all together! 
final2_plot <- image_append(image_scale(c(logo_header, plot), "1000"), stack = TRUE)

# image_write(final2_plot,
#             glue("{here::here('Asian Cup 2019')}/gpg_plot_final.png"))

final2_plot
```

<img src="visualize_asian_cup_2019_files/figure-markdown_github/unnamed-chunk-110-1.png" width="1000" style="display: block; margin: auto;" />

All in all, it took a while to tweak the positions of the text and logo
image, but for a first try it worked well. There is definitely room for
improvement with regard to sizing and scaling, though.
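
The hard-coded `offset = "+880+25"` came from trial and error. As a rough sketch of one way to cut down on the tweaking, the offset could instead be computed from the sizes that `image_info()` reports. The navy rectangle below is only a stand-in for the logo, and `margin`, `x_off` and `y_off` are names made up for this example.

``` r
library(magick)

# blank header canvas, same size as the one used for the real plot
canvas <- image_blank(width = 1000, height = 100, color = "white")
# stand-in for the scaled logo (assumption: any small image works here)
logo <- image_blank(width = 90, height = 75, color = "navy")

margin <- 30L
# right-align the logo with a margin and centre it vertically
x_off <- image_info(canvas)$width - image_info(logo)$width - margin
y_off <- (image_info(canvas)$height - image_info(logo)$height) %/% 2L

header <- image_composite(canvas, logo,
                          offset = sprintf("+%d+%d", x_off, y_off))
```

If the logo or canvas size changes, the offset follows automatically instead of needing another round of manual adjustment.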

Ultimately, I couldn’t find much information on why the tournaments in
the 80s in particular were such low-scoring affairs. I wasn’t alive to
watch those games on TV, nor could I find any illuminating articles or
blog posts on the style of Asian football back in the 80s. This was also
before Japan really got into soccer, so there wasn’t anything I could
find in Japanese either.

Japan’s record vs. Group F opponents and rivals
-----------------------------------------------

Japan is the most successful team in the competition with 4
championships, but who are their opponents in the group stage and how
have they fared against them? While I’m at it, I will also check their
record against long-time continental rivals such as Iran, South Korea,
and Saudi Arabia and, more recently, Australia.

The data I’m going to use comes from
[Kaggle](https://www.kaggle.com/martj42/international-football-results-from-1872-to-2017)
which has all international football results from 1872 to the World Cup
final last year. To add in the federation affiliation (UEFA, AFC, etc.)
for each of the countries I slightly modified some code from one of the
kernels, [“A Journey Through The History of
Soccer”](https://www.kaggle.com/phjulien/a-journey-through-the-history-of-soccer/)
by PH Julien.

``` r
federation_files <- Sys.glob("../data/federation_affiliations/*")

df_federations = data.frame(country = NULL, federation = NULL)
for (f in federation_files) {
    federation = basename(f)
    content = read.csv(f, header=FALSE)
    content <- cbind(content,federation=rep(federation, dim(content)[1]))
    df_federations <- rbind(df_federations, content)
}

colnames(df_federations) <- c("country", "federation")

df_federations <- df_federations %>% 
  mutate(country = as.character(country) %>% str_trim(side = "both"))
```

Now to load the results data and then join it with the affiliations
data.

``` r
results_raw <- read_csv("../data/results.csv")

results_japan_raw <- results_raw %>% 
  filter(home_team == "Japan" | away_team == "Japan") %>% 
  rename(venue_country = country, 
         venue_city = city) %>% 
  mutate(match_num = row_number())

# combine with federation affiliations
results_japan_home <- results_japan_raw %>% 
  left_join(df_federations, 
            by = c("home_team" = "country")) %>% 
  mutate(federation = as.character(federation)) %>% 
  rename(home_federation = federation) 

results_japan_away <- results_japan_raw %>% 
  left_join(df_federations, 
            by = c("away_team" = "country")) %>% 
  mutate(federation = as.character(federation)) %>% 
  rename(away_federation = federation)

# combine home-away
results_japan_cleaned <- results_japan_home %>% 
  full_join(results_japan_away)
```

Next I need to fill in the federation by hand for teams that had no
match in the federation affiliation data set. For example, “South Korea”
is “Korea Republic” in the Kaggle data set.

``` r
results_japan_cleaned <- results_japan_cleaned %>% 
  mutate(
    home_federation = case_when(
      home_team %in% c(
        "China", "Manchukuo", "Burma", "Korea Republic", "Vietnam Republic",
        "Korea DPR", "Brunei") ~ "AFC",
      home_team == "USA" ~ "Concacaf",
      home_team == "Bosnia-Herzegovina" ~ "UEFA",
      TRUE ~ home_federation),
    away_federation = case_when(
      away_team %in% c(
        "China", "Manchukuo", "Burma", "Korea Republic", "Vietnam Republic",
        "Korea DPR", "Brunei", "Taiwan") ~ "AFC",
      away_team == "USA" ~ "Concacaf",
      away_team == "Bosnia-Herzegovina" ~ "UEFA",
      TRUE ~ away_federation
    ))
```
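
A quick way to spot which team names need this kind of manual patch is an `anti_join()` between the results and the federation table. The sketch below uses two tiny hand-made tables (`toy_results` and `toy_federations` are made up for illustration and are not the Kaggle data).

``` r
library(dplyr)

# toy stand-ins for the results and federation tables (not the real data)
toy_results <- tibble(home_team = c("Japan", "Korea Republic", "Iran"))
toy_federations <- tibble(
  country = c("Japan", "South Korea", "Iran"),
  federation = "AFC"
)

# keep only the teams that have no matching row in the federation table
unmatched <- toy_results %>% 
  distinct(home_team) %>% 
  anti_join(toy_federations, by = c("home_team" = "country"))
```

Here `unmatched` contains only “Korea Republic”, which is the kind of name that then goes into the `case_when()` call.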

Now that it’s nice and cleaned up I can reshape it so that the data is
set from Japan’s perspective.

``` r
results_jp_asia <- results_japan_cleaned %>% 
  # filter only for Japan games and AFC opponents
  filter(home_team == "Japan" | away_team == "Japan",
         home_federation == "AFC" & away_federation == "AFC") %>% 
  select(-contains("federation"), -contains("venue"),
         -neutral, -match_num,
         date, home_team, home_score, away_team, away_score, tournament) %>% 
  # reshape columns to Japan vs. opponent
  mutate(
    opponent = case_when(
      away_team != "Japan" ~ away_team,
      home_team != "Japan" ~ home_team),
    home_away = case_when(
      home_team == "Japan" ~ "home",
      away_team == "Japan" ~ "away"),
    japan_goals = case_when(
      home_team == "Japan" ~ home_score,
      away_team == "Japan" ~ away_score),
    opp_goals = case_when(
      home_team != "Japan" ~ home_score,
      away_team != "Japan" ~ away_score)) %>% 
  # label results from Japan's perspective
  mutate(
    result = case_when(
      japan_goals > opp_goals ~ "Win",
      japan_goals < opp_goals ~ "Loss",
      japan_goals == opp_goals ~ "Draw"),
    result = result %>% as_factor() %>% fct_relevel(c("Win", "Draw", "Loss"))) %>% 
  select(-contains("score"), -contains("team"))
```

With all that done we can take a look at how Japan have done against
certain opponents by using `filter()`.

``` r
results_jp_asia %>% 
  filter(opponent == "Jordan",
         tournament == "AFC Asian Cup")
```

    ## # A tibble: 3 x 7
    ##   date       tournament    opponent home_away japan_goals opp_goals result
    ##   <date>     <chr>         <chr>    <chr>           <int>     <int> <fct> 
    ## 1 2004-07-31 AFC Asian Cup Jordan   home                1         1 Draw  
    ## 2 2011-01-09 AFC Asian Cup Jordan   home                1         1 Draw  
    ## 3 2015-01-20 AFC Asian Cup Jordan   home                2         0 Win

Unfortunately, this data set doesn’t record extra time or penalty
shoot-outs: Japan’s Quarter-Final meeting with Jordan in 2004 is listed
as a draw even though Japan secured a route to the semis by winning 4-3
on penalties!

I can create a function that’ll filter for certain opponents and
tournaments and aggregate the results. Because the second argument is
`...`, tidy evaluation lets me pass in any number of filter conditions
on opponent, tournament, etc. The `if`/`else` expressions guard against
cases where Japan never had a given type of result against the selected
opponents and make sure that a column populated by 0s is created.

``` r
japan_versus <- function(data, ...) {
  # filter 
  filter_vars <- enquos(...)
  
  jp_vs <- data %>% 
    filter(!!!filter_vars) %>% 
    # per result type and opponent: count matches and sum goals
    group_by(result, opponent) %>% 
    summarize(j_g = sum(japan_goals),
              o_g = sum(opp_goals),
              n = n()) %>% 
    ungroup() %>% 
    # spread results over multiple columns
    spread(result, n) %>% 
    # 1. failsafe against no type of result against an opponent
    # 2. sum up counts per opponent
    group_by(opponent) %>% 
    mutate(Win = if ("Win" %in% names(.)) Win else 0,
           Draw = if ("Draw" %in% names(.)) Draw else 0,
           Loss = if ("Loss" %in% names(.)) Loss else 0) %>% 
    summarize(Win = sum(Win, na.rm = TRUE),
              Draw = sum(Draw, na.rm = TRUE),
              Loss = sum(Loss, na.rm = TRUE),
              `Goals For` = sum(j_g),
              `Goals Against` = sum(o_g))
  
  return(jp_vs)
}
```
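
As a side note on the `...` argument: the dots can also be forwarded straight to `filter()` without capturing them with `enquos()` first. The sketch below shows this on a toy table; `filter_matches()` and `toy_results` are hypothetical names used only for this illustration.

``` r
library(dplyr)

# toy match data from Japan's perspective (made up for illustration)
toy_results <- tibble(
  opponent = c("China", "China", "Jordan"),
  home_away = c("home", "away", "home"),
  japan_goals = c(2, 0, 1)
)

# hypothetical helper: every condition in `...` is passed on to filter()
filter_matches <- function(data, ...) {
  data %>% filter(...)
}

china_home <- filter_matches(toy_results,
                             opponent == "China", home_away == "home")
```

Either style accepts any number of conditions, and they are combined with a logical AND just like in a regular `filter()` call.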

Now let’s try it out a bit.

``` r
japan_versus(data = results_jp_asia, 
             opponent == "China")
```

    ## # A tibble: 1 x 6
    ##   opponent   Win  Draw  Loss `Goals For` `Goals Against`
    ##   <chr>    <int> <int> <int>       <int>           <int>
    ## 1 China       14     8    10          54              45

I can put in multiple filter conditions if needed as well.

``` r
japan_versus(data = results_jp_asia,
             home_away == "home",
             opponent %in% c("Palestine", "Vietnam", "India"))
```

    ## # A tibble: 3 x 6
    ##   opponent    Win  Draw  Loss `Goals For` `Goals Against`
    ##   <chr>     <int> <dbl> <dbl>       <int>           <int>
    ## 1 India         2     0     0          13               0
    ## 2 Palestine     1     0     0           4               0
    ## 3 Vietnam       1     0     0           1               0

As you can see, Japan has never lost or drawn at home against India,
Palestine, or Vietnam, so the data has no rows with “Draw” or “Loss” in
the result column for these games. With the function I created I was
able to fill in those missing result types with 0s!
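
For comparison, here is a sketch of a different way to get the same zero-filling. It is an alternative, not what `japan_versus()` does. It assumes `result` is a factor with all three levels and a version of `tidyr` that has `pivot_wider()`; `toy_results` is made-up data.

``` r
library(dplyr)
library(tidyr)

# toy data with wins only; "Draw" and "Loss" exist as factor levels
toy_results <- tibble(
  opponent = c("India", "India", "Vietnam"),
  result = factor(c("Win", "Win", "Win"), levels = c("Win", "Draw", "Loss"))
)

toy_record <- toy_results %>% 
  count(opponent, result) %>% 
  # complete() expands over all factor levels and fills the gaps with 0
  complete(opponent, result, fill = list(n = 0L)) %>% 
  pivot_wider(names_from = result, values_from = n)
```

Because `complete()` works from the factor levels, the `Draw` and `Loss` columns appear even though no such rows exist in the input.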

Let’s check Japan’s performance against our main rivals in the Asian
Cup. Here I make the tables look a lot nicer with the options in
`knitr::kable()` and the `kableExtra` package.

``` r
results_jp_asia %>% 
  japan_versus(opponent %in% c("Iran", "Korea Republic", "Saudi Arabia"),
               tournament == "AFC Asian Cup") %>% 
  knitr::kable(format = "html",
               caption = "Japan vs. Historic Rivals in the Asian Cup") %>% 
  kableExtra::kable_styling(full_width = FALSE) %>% 
  kableExtra::add_header_above(c(" ", "Result" = 3, "Goals" = 2))
```

<table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;">
<caption>
Japan vs. Historic Rivals in the Asian Cup
</caption>
<thead>
<tr>
<th style="border-bottom:hidden" colspan="1">
</th>
<th style="border-bottom:hidden; padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="3">
Result

</th>
<th style="border-bottom:hidden; padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="2">
Goals

</th>
</tr>
<tr>
<th style="text-align:left;">
opponent
</th>
<th style="text-align:right;">
Win
</th>
<th style="text-align:right;">
Draw
</th>
<th style="text-align:right;">
Loss
</th>
<th style="text-align:right;">
Goals For
</th>
<th style="text-align:right;">
Goals Against
</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left;">
Iran
</td>
<td style="text-align:right;">
1
</td>
<td style="text-align:right;">
2
</td>
<td style="text-align:right;">
0
</td>
<td style="text-align:right;">
1
</td>
<td style="text-align:right;">
0
</td>
</tr>
<tr>
<td style="text-align:left;">
Korea Republic
</td>
<td style="text-align:right;">
0
</td>
<td style="text-align:right;">
2
</td>
<td style="text-align:right;">
1
</td>
<td style="text-align:right;">
2
</td>
<td style="text-align:right;">
4
</td>
</tr>
<tr>
<td style="text-align:left;">
Saudi Arabia
</td>
<td style="text-align:right;">
4
</td>
<td style="text-align:right;">
0
</td>
<td style="text-align:right;">
1
</td>
<td style="text-align:right;">
13
</td>
<td style="text-align:right;">
4
</td>
</tr>
</tbody>
</table>

Now let’s take a look at how Japan have historically played against the
other teams in Group F of this year’s Asian Cup.

``` r
results_jp_asia %>% 
  japan_versus(opponent %in% c("Oman", "Uzbekistan", "Turkmenistan")) %>% 
  knitr::kable(format = "html",
               caption = "Japan's Record vs. Group F Teams") %>% 
  kableExtra::kable_styling(full_width = FALSE) %>% 
  kableExtra::add_header_above(c(" ", "Result" = 3, "Goals" = 2))
```

<table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;">
<caption>
Japan’s Record vs. Group F Teams
</caption>
<thead>
<tr>
<th style="border-bottom:hidden" colspan="1">
</th>
<th style="border-bottom:hidden; padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="3">
Result

</th>
<th style="border-bottom:hidden; padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="2">
Goals

</th>
</tr>
<tr>
<th style="text-align:left;">
opponent
</th>
<th style="text-align:right;">
Win
</th>
<th style="text-align:right;">
Draw
</th>
<th style="text-align:right;">
Loss
</th>
<th style="text-align:right;">
Goals For
</th>
<th style="text-align:right;">
Goals Against
</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left;">
Oman
</td>
<td style="text-align:right;">
8
</td>
<td style="text-align:right;">
3
</td>
<td style="text-align:right;">
0
</td>
<td style="text-align:right;">
19
</td>
<td style="text-align:right;">
4
</td>
</tr>
<tr>
<td style="text-align:left;">
Uzbekistan
</td>
<td style="text-align:right;">
6
</td>
<td style="text-align:right;">
3
</td>
<td style="text-align:right;">
1
</td>
<td style="text-align:right;">
28
</td>
<td style="text-align:right;">
9
</td>
</tr>
</tbody>
</table>

We see no rows here for Turkmenistan. This is because, until just this
past week, Japan had **never** played against them in a friendly or
competitive game!

Conclusion
==========

Although Japan’s first game was quite horrible, I’m hoping it’ll shake
the players and coaches out of their complacency so that they don’t
underestimate our opponents in the next two games.

Japan

South Korea and Iran

Thankfully, South Korea should be on the other side of the bracket, and
we would only meet Iran in the semifinals (provided both teams finish
top of their respective groups).

Japan could meet Australia in the Quarters, but without Aaron Mooy
they’re a much weaker side, as shown in their abject loss to Jordan in
their opening match.

Even with the loss of new star Nakajima, the fact that we can replace
him with a player of the calibre of Takashi Inui, and have Hannover
regular Genki Haraguchi step up from the bench, shows how much Japanese
football has progressed these past 25 years.

It’s a changing of the guard for Japan, but we’ve got quality players in
Europe as well as some depth, with more Japanese players heading to
Europe from a young age.

It was quite awe-inspiring to see how the number of Japanese players
playing for foreign clubs has been steadily increasing since the 1988
Asian Cup squad. Maybe that could be another idea for a visualization?

This tournament should be a first stepping stone for this new generation
of players to make a big impact at the next World Cup in 2022, so keep
your eye out for this bunch of players!


================================================
FILE: Asian Cup 2019/visualize_asian_cup_2019.md
================================================
Another year, another big soccer/football tournament! This time it’s the
top international competition in Asia, the Asian Cup hosted in the
U.A.E. I’ll be covering (responsible) web-scraping, data wrangling
(tidyverse ftw!), and of course, data visualization with `ggplot2`.

Let’s get started!

Packages
--------

``` r
pacman::p_load(tidyverse, scales, lubridate, ggrepel, stringi, magick, 
               glue, extrafont, rvest, ggtextures, cowplot, ggimage, polite)
# Roboto Condensed font (from hrbrthemes or just Google it)
loadfonts()
```

Top Goalscorers of the Asian Cup
--------------------------------

The first thing I looked at was, “Who were the top goalscorers in the
history of the Asian Cup?”

Here I use the [polite](https://github.com/dmi3kno/polite) package to
take a look at the `robots.txt` for the web page and see if it is OK to
web scrape. It’s good to make things like this a habit!

First you pass the URL to the `bow()` function, check that you are
indeed allowed to scrape, then use `scrape()` to retrieve data, and the
rest is the usual `rvest` web-scraping workflow.

``` r
topg_url <- "https://en.wikipedia.org/wiki/AFC_Asian_Cup_records_and_statistics"

session <- bow(topg_url)

ac_top_scorers <- scrape(session) %>%
  html_nodes("table.wikitable:nth-child(29)") %>% 
  html_table() %>% 
  flatten_df() %>% 
  select(-Ref.) %>% 
  set_names(c("total_goals", "player", "country"))
```

For brevity, let’s only take a look at the top 5 goal scorers. I’ll also
`mutate()` in a nice image of a soccer ball for the data points on the
plot.

``` r
ac_top_scorers <- ac_top_scorers %>% 
  head(5) %>% 
  mutate(image = "https://www.emoji.co.uk/files/microsoft-emojis/activity-windows10/8356-soccer-ball.png")
```

Now it’s ready! Slightly different to your standard bar graph here as I
use the `geom_isotype_col()` function from `ggtextures` to create a bar
of soccer ball images. Compared to other functions in `ggtextures`,
`geom_isotype_col()` allows each image to correspond to the value of the
variable you are plotting, in this case 1 ball = 1 goal!

``` r
ac_top_graph <- ac_top_scorers %>% 
  ggplot(aes(x = reorder(player, total_goals), y = total_goals,
             image = image)) +
  geom_isotype_col(img_width = grid::unit(1, "native"), img_height = NULL,
    ncol = NA, nrow = 1, hjust = 0, vjust = 0.5) +
  coord_flip() +
  scale_y_continuous(breaks = c(0, 2, 4, 6, 8, 10, 12, 14),
                     expand = c(0, 0), 
                     limits = c(0, 15)) +
  ggthemes::theme_solarized() +
  labs(title = "Top Scorers of the Asian Cup",
       subtitle = "Most goals in a single tournament: 8 (Ali Daei, 1996)",
       y = "Number of Goals", x = NULL,
       caption = glue("
                      Source: Wikipedia
                      By @R_by_Ryo")) +
  theme(text = element_text(family = "Roboto Condensed"),
        plot.title = element_text(size = 22),
        plot.subtitle = element_text(size = 14),
        axis.text = element_text(size = 14),
        axis.title.x = element_text(size = 16),
        axis.line.y = element_blank(),
        panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        axis.ticks.y = element_blank())

ac_top_graph
```

![](visualize_asian_cup_2019_files/figure-markdown_github/top%20goal%20scorers%20plot-1.png)

OK, not bad. However, wouldn’t it be nice to add a bit more information
for context, specifically which country each player came from? So
let’s add some flags along the y-axis!

There are lots of different ways to do this (like `geom_flag()` from the
`ggimage` package) but I ended up doing it the `cowplot` way. I had to
tweak the scales a bit as the flags came in different sizes. When you
plot, you just insert the image strip into the bar plot with
`axis_canvas()` and combine all the parts together with `ggdraw()`!

``` r
axis_image <- axis_canvas(ac_top_graph, axis = 'y') + 
  draw_image("https://upload.wikimedia.org/wikipedia/commons/c/ca/Flag_of_Iran.svg", 
             y = 13, scale = 1.5) +
  draw_image("https://upload.wikimedia.org/wikipedia/commons/0/09/Flag_of_South_Korea.svg", 
             y = 10, scale = 1.7) +
  draw_image("https://upload.wikimedia.org/wikipedia/en/9/9e/Flag_of_Japan.svg", 
             y = 7, scale = 1.7) +
  draw_image("https://upload.wikimedia.org/wikipedia/commons/f/f6/Flag_of_Iraq.svg", 
             y = 4, scale = 1.6) +
  draw_image("https://upload.wikimedia.org/wikipedia/commons/a/aa/Flag_of_Kuwait.svg", 
             y = 1, scale = 1.2)

ggdraw(insert_yaxis_grob(ac_top_graph, axis_image, position = "left"))
```

<img src="visualize_asian_cup_2019_files/figure-markdown_github/draw_image-1.png" style="display: block; margin: auto;" />

Ideally I wanted the soccer balls to be the official balls from the
tournament that each player scored in. However, I couldn’t find a nice
emoji-fied/icon-ized version, and there was also the “small” problem
that there was no “official” Asian Cup ball until the 2004 tournament in
China! You can take a look at the official Asian Cup balls
[here](http://football-balls.com/balls/asian-cup).

Winners of the Asian Cup
------------------------

We saw that the top goal scorers came from Iran, South Korea, Japan,
Iraq, and Kuwait but did their goal scoring exploits lead their nations
to glory? Let’s find out!

When web-scraping I really like using `flatten_df()` after
`html_table()` as I don’t have to use the awkward looking `.[[1]]`
within my piped workflow.

``` r
acup_url <- "https://en.wikipedia.org/wiki/AFC_Asian_Cup"

session <- bow(acup_url)

acup_winners_raw <- scrape(session) %>% 
  html_nodes("table:nth-child(31)") %>% 
  html_table() %>% 
  flatten_df()
```

Now I can use the `clean_names()` function from `janitor` to quickly
clean up my names (mainly when I can’t be bothered to `set_names()` them
myself…).

The next step is to use `separate()` to split each of the 1st to 3rd
place columns into the number of times a team finished there and the
years in which it did so.

The scoped variants of `mutate()` are then used to replace the “–”
placeholders with 0 and to tidy the string columns of the data into
numeric type.

I use `gather()` so each team will have a row for each of the rank
positions (1st-3rd).

Finally, I arrange the data in a way that the facets will be ordered in
the way that I want.

``` r
acup_winners_clean <- acup_winners_raw %>% 
  janitor::clean_names() %>% 
  slice(1:8) %>% 
  select(-fourth_place, -semi_finalists, -total_top_four) %>% 
  separate(winners, into = c("Champions", "first_place_year"), 
           sep = " ", extra = "merge") %>% 
  separate(runners_up, into = c("Runners-up", "second_place_year"), 
           sep = " ", extra = "merge") %>% 
  separate(third_place, into = c("Third Place", "third_place_year"), 
           sep = " ", extra = "merge") %>% 
  mutate_all(~ str_replace_all(., "–", "0")) %>% 
  mutate_at(vars(contains("num")), as.numeric) %>% 
  mutate(team = if_else(team == "Israel1", "Israel", team)) %>% 
  gather(key = "key", value = "value", -team, 
         -first_place_year, -second_place_year, -third_place_year) %>% 
  mutate(key = key %>% 
           fct_relevel(c("Champions", "Runners-up", "Third Place"))) %>% 
  arrange(key, value) %>% 
  mutate(team = as_factor(team),
         order = row_number())
```
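
To make the `separate()` step more concrete, here is a minimal sketch on a hand-typed table. It assumes the scraped cells look like a count followed by the years in parentheses; `toy_winners` is made up for illustration and is not the scraped table.

``` r
library(tidyr)

# assumed cell format: "<count> (<years>)"
toy_winners <- tibble::tibble(
  team = c("Japan", "Iran"),
  winners = c("4 (1992, 2000, 2004, 2011)", "3 (1968, 1972, 1976)")
)

# split on the first space only; extra = "merge" keeps the years together
toy_sep <- separate(toy_winners, winners,
                    into = c("Champions", "first_place_year"),
                    sep = " ", extra = "merge")
```

With `extra = "merge"` everything after the first space stays in the second column, so the spaces between the years do not create extra pieces.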

I plot using facets on the “key” variable (containing the rank data) so
that we can see how many times each team placed as Champions to Third
Place. I also use the `glue()` function here to format the multi-line
captions and titles in a neat way.

``` r
acup_winners_clean %>% 
  ggplot(aes(value, team, color = key)) +
  geom_point(size = 5) +
  scale_color_manual(values = c("Champions" = "#FFCC33",
                                "Runners-up" = "#999999",
                                "Third Place" = "#CC6600"),
                     guide = "none") +
  labs(x = "Number of Occurrence",
       title = "Winners & Losers of the Asian Cup!",
       subtitle = glue("
                       Ordered by number of Asian Cup(s) won.
                       Four-time Champions, Japan, only won their first in 1992!"),
       caption = glue("
                      Note: Israel was expelled by the AFC in 1974 while Australia joined the AFC in 2006.
                      Source: Wikipedia
                      By @R_by_Ryo")) +
  facet_wrap(~key) +
  theme_minimal() +
  theme(text = element_text(family = "Roboto Condensed"),
        title = element_text(size = 18),
        plot.subtitle = element_text(size = 12),
        axis.title.y = element_blank(),
        axis.title.x = element_text(size = 12),
        axis.text.y = element_text(size = 14),
        axis.text.x = element_text(size = 12),
        plot.caption = element_text(hjust = 0, size = 10),
        panel.border = element_rect(fill = NA, colour = "grey20"),
        panel.grid.minor.x = element_blank(),
        strip.text = element_text(size = 16)) 
```

<img src="visualize_asian_cup_2019_files/figure-markdown_github/unnamed-chunk-108-1.png" style="display: block; margin: auto;" />

Goals per Game
--------------

One new thing I learned very recently, while working on this viz in
fact, was using magrittr aliases!

[Tweet from @Emil_Hvitfeldt](https://twitter.com/Emil_Hvitfeldt/status/1081080919073542144)

When web scraping I usually wind up having to use `.[x]` or `.[[x]]`,
but now I can just use `extract()` or `extract2()` respectively to do
the same thing!

``` r
wiki_url <- "https://en.wikipedia.org"
session <- bow(wiki_url)
acup_url <- "https://en.wikipedia.org/wiki/AFC_Asian_Cup"
session_cup <- bow(acup_url)

cup_links <- scrape(session_cup) %>% 
  html_nodes("br+ i a") %>% 
  html_attr("href") %>% 
  magrittr::extract(-17:-18)

# as_data_frame() is deprecated, so build the tibble directly
acup_df <- tibble(link = cup_links) %>% 
  mutate(cup = str_remove(link, "\\/wiki\\/") %>% str_replace_all("_", " "))
```
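
As a small self-contained check of the aliases, the sketch below compares the bracket forms with `extract()` and `extract2()` on toy objects (`links` and `tables` are made up for illustration). One thing to keep in mind is that `tidyr` also exports a function called `extract()`, which is a good reason to write the namespaced `magrittr::extract()` when the tidyverse is loaded.

``` r
library(magrittr)

# toy stand-ins for a vector of links and a list of scraped tables
links <- c("/wiki/a", "/wiki/b", "/wiki/c", "/wiki/d")
tables <- list(data.frame(x = 1:2), data.frame(y = 3:4))

# `[` and its alias extract(): drop the 3rd and 4th elements
kept_bracket <- links %>% .[-3:-4]
kept_alias <- links %>% extract(-3:-4)

# `[[` and its alias extract2(): pull out the first table
first_bracket <- tables %>% .[[1]]
first_alias <- tables %>% extract2(1)
```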

Another cool thing I found while scraping this data was the `jump_to()`
function, which lets you navigate to a new URL. This makes `map()`-ing
over multiple links from a base URL very easy! Here, the base URL is
`https://en.wikipedia.org` (the `session` object) and each function
jumps from it to the link of the respective tournament page. Another
way that I could’ve done this was to `map()` over the years of the
tournaments, as the Wikipedia URLs for each edition of the Asian Cup
differ only in the year at the start of the page name.

``` r
goals_info <- function(x) {
  goal_info <- session %>% 
    jump_to(x) %>% 
    html_nodes(".vcalendar") %>% 
    html_table(header = FALSE) %>% 
    flatten_df() %>% 
    spread(key = X1, value = X2) %>% 
    select(`Goals scored`) %>% 
    mutate(`Goals scored` = str_remove_all(`Goals scored`, pattern = ".*\\(") %>% 
             str_extract_all("\\d+\\.*\\d*") %>% as.numeric)
}

team_num_info <- function(x) {
  team_num_info <- session %>% 
    jump_to(x) %>% 
    html_nodes(".vcalendar") %>% 
    html_table(header = FALSE) %>% 
    flatten_df() %>% 
    spread(key = X1, value = X2) %>% 
    select(`Teams`) %>% 
    mutate(`Teams` = as.numeric(`Teams`))
}

match_num_info <- function(x) {
  match_num_info <- session %>% 
    jump_to(x) %>% 
    html_nodes(".vcalendar") %>% 
    html_table(header = FALSE) %>% 
    flatten_df() %>% 
    spread(key = X1, value = X2) %>% 
    janitor::clean_names() %>% 
    select(matches_played) %>% 
    mutate(matches_played = as.numeric(matches_played))
}

# all together:
goals_data <- acup_df %>% 
  mutate(goals_per_game = map(acup_df$link, goals_info) %>% unlist,
         team_num = map(acup_df$link, team_num_info) %>% unlist,
         match_num = map(acup_df$link, match_num_info) %>% unlist)
```
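
The year-based alternative could be sketched as follows. The tournament years are typed in by hand here, and `years` and `cup_links_alt` are names made up for this example. The resulting paths have the same `/wiki/<year>_AFC_Asian_Cup` shape as the scraped links.

``` r
# every four years from 1956 to 2004, then 2007, 2011 and 2015
years <- c(seq(1956, 2004, by = 4), 2007, 2011, 2015)

# build the relative links without scraping the overview page
cup_links_alt <- paste0("/wiki/", years, "_AFC_Asian_Cup")
```

The trade-off is that the years have to be maintained by hand, whereas scraping the links picks up the change from the 2004 to the 2007 schedule automatically.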

Next, clean it up a bit and add in the number of teams that participated
in each tournament.

``` r
ac_goals_df <- goals_data %>% 
  mutate(label = cup %>% str_extract("[0-9]+") %>% str_replace("..", "'"),
         team_num = case_when(
           is.na(team_num) ~ 16,
           TRUE ~ team_num
         )) %>% 
  arrange(cup) %>% 
  mutate(label = factor(label, label),
         team_num = c(4, 4, 4, 5, 6, 6, 10, 10, 10, 8, 12, 12, 16, 16, 16, 16))

glimpse(ac_goals_df)
```

    ## Observations: 16
    ## Variables: 6
    ## $ link           <chr> "/wiki/1956_AFC_Asian_Cup", "/wiki/1960_AFC_Asi...
    ## $ cup            <chr> "1956 AFC Asian Cup", "1960 AFC Asian Cup", "19...
    ## $ goals_per_game <dbl> 4.50, 3.17, 2.17, 3.20, 2.92, 2.50, 3.17, 1.83,...
    ## $ team_num       <dbl> 4, 4, 4, 5, 6, 6, 10, 10, 10, 8, 12, 12, 16, 16...
    ## $ match_num      <dbl> 6, 6, 6, 10, 13, 10, 24, 24, 24, 16, 26, 26, 32...
    ## $ label          <fct> '56, '60, '64, '68, '72, '76, '80, '84, '88, '9...

Now we make a line graph but with LOTS of `annotate()` code to add in
comments, labels, and segments for the labels. At the end I use
`geom_emoji()` to add a soccer ball to the plot for each of the data
points.

``` r
plot <- ac_goals_df %>% 
  ggplot(aes(x = label, y = goals_per_game, group = 1)) +
  geom_line() +
  scale_y_continuous(limits = c(NA, 5.35),
                     breaks = c(1.5, 2, 2.5, 3, 3.5, 4, 4.5)) +
  labs(x = "Tournament (Year)", y = "Goals per Game") +
  theme_minimal() +
  theme(text = element_text(family = "Roboto Condensed"),
        axis.title = element_text(size = 12),
        axis.text = element_text(size = 12)) +
  annotate(geom = "label", x = "'56", y = 5.15, family = "Roboto Condensed",
           color = "black", 
           label = "Total Number of Games Played:", hjust = 0) +
  annotate(geom = "text", x = "'60", y = 4.9, 
           label = "6", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 1, xend = 3, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'68", y = 4.9, 
           label = "10", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 3.8, xend = 4.2, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'72", y = 4.9, 
           label = "13", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 4.8, xend = 5.2, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'76", y = 4.9, 
           label = "10", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 5.8, xend = 6.2, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'84", y = 4.9, 
           label = "24", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 7, xend = 9, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'92", y = 4.9, 
           label = "16", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 9.8, xend = 10.2, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = 11.5, y = 4.9, 
           label = "26", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 11, xend = 12, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = 14.5, y = 4.9, 
           label = "32", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 13, xend = 16, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = 9, y = 4, family = "Roboto Condensed",
           label = glue("
                        Incredibly low amount of goals in Group B
                        (15 in 10 Games) and in Knock-Out Stages
                        (4 goals in 4, only 1 scored in normal time)")) +
  annotate(geom = "segment", x = 9, xend = 9, y = 1.65, yend = 3.75,
           color = "red") +
  ggimage::geom_emoji(aes(image = '26bd'), size = 0.03) 

plot
```

<img src="visualize_asian_cup_2019_files/figure-markdown_github/unnamed-chunk-109-1.png" style="display: block; margin: auto;" />

``` r
ggsave(filename = paste0(here::here("Asian Cup 2019"), "/gpg_plot.png"), 
       width = 8, height = 7, dpi = 300)
plot <- image_read(paste0(here::here("Asian Cup 2019"), "/gpg_plot.png"))
```

However, I’m not finished yet! I wanted to make this look a bit more
“official” so I tried adding the Asian Cup logo to the top right
corner. There are probably other ways to do this, especially with
grobs, but I was reminded of
[this](https://www.danielphadley.com/ggplot-logo/) blog post by [Daniel
Hadley](https://twitter.com/danielphadley), who used the `magick` package
to add a footer with a logo onto a `ggplot` object. I’ve used `magick`
before for animations and this was a good chance to try it out for image
editing. Unlike Daniel Hadley’s example, I needed the logo in the top
right corner, so I created a blank canvas with `image_blank()` and then
placed everything on top of it with `image_composite()` and
`image_append()`.

``` r
logo_raw <- image_read("https://upload.wikimedia.org/wikipedia/en/a/ad/2019_afc_asian_cup_logo.png")

logo_proc <- logo_raw %>% image_scale("600")

# create blank canvas
a <- image_blank(width = 1000, height = 100, color = "white")
# combine with logo image and shift logo to the right
b <- image_composite(image_scale(a, "x100"), image_scale(logo_proc, "x75"), 
                     offset = "+880+25")
# add in the title text
logo_header <- b %>% 
  image_annotate(text = glue("Goals per Game throughout the history of the Asian Cup"),
                 color = "black", size = 24, font = "Roboto Condensed",
                 location = "+63+50", gravity = "northwest")

# combine it all together! 
final2_plot <- image_append(image_scale(c(logo_header, plot), "1000"), stack = TRUE)

# image_write(final2_plot,
#             glue("{here::here('Asian Cup 2019')}/gpg_plot_final.png"))

final2_plot
```

<img src="visualize_asian_cup_2019_files/figure-markdown_github/unnamed-chunk-110-1.png" width="1000" style="display: block; margin: auto;" />

All in all it took a while to tweak the positions of the text and logo
image, but for a first try it worked well. There is definitely room for
improvement in sizing and scaling, though.
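
To get a feel for how the canvas size and the `offset` string interact
without downloading anything, here is a minimal sketch that uses
solid-colour rectangles as stand-ins for the logo and the plot. The
sizes, colours, and object names are made up for illustration; only
`image_blank()`, `image_composite()`, `image_append()`, and
`image_info()` come from `magick`.

``` r
library(magick)

# stand-ins: a white header strip, a small "logo", and a "plot" body
header <- image_blank(width = 400, height = 50, color = "white")
logo   <- image_blank(width = 40, height = 30, color = "red")
body   <- image_blank(width = 400, height = 200, color = "#E5E5E5")

# offset = "+x+y" is measured from the top-left corner of the header,
# so x = 350 leaves a 10px margin to the right of the 40px-wide logo
header_with_logo <- image_composite(header, logo, offset = "+350+10")

# stack = TRUE appends top-to-bottom instead of left-to-right
stacked <- image_append(c(header_with_logo, body), stack = TRUE)

image_info(stacked)[, c("width", "height")]
```

Because both pieces are 400px wide, the stacked image keeps that width
and its height is the sum of the two heights, which makes it easier to
reason about where the logo will end up before swapping in the real
images.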

Ultimately, I couldn’t find much information on why the tournaments in
the 80s in particular were such low-scoring affairs. I wasn’t alive to
watch those games on TV, nor could I find any illuminating articles or
blog posts on the style of Asian football back then. This was also
before Japan really got into soccer, so there wasn’t anything I could
find in Japanese either.

Japan’s record vs. Group F opponents and rivals
-----------------------------------------------

Japan is the most successful team in the competition with 4
championships, but who are their opponents in the group stage and how
have they fared against them? While I’m at it I will also check their
record against long-time continental rivals such as Iran, South Korea,
and Saudi Arabia, and more recently, Australia.

The data I’m going to use comes from
[Kaggle](https://www.kaggle.com/martj42/international-football-results-from-1872-to-2017)
which has all international football results from 1872 to the World Cup
final last year. To add in the federation affiliation (UEFA, AFC, etc.)
for each of the countries I slightly modified some code from one of the
kernels, [“A Journey Through The History of
Soccer”](https://www.kaggle.com/phjulien/a-journey-through-the-history-of-soccer/)
by PH Julien.

``` r
federation_files <- Sys.glob("../data/federation_affiliations/*")

df_federations = data.frame(country = NULL, federation = NULL)
for (f in federation_files) {
    federation = basename(f)
    content = read.csv(f, header=FALSE)
    content <- cbind(content,federation=rep(federation, dim(content)[1]))
    df_federations <- rbind(df_federations, content)
}

colnames(df_federations) <- c("country", "federation")

df_federations <- df_federations %>% 
  mutate(country = as.character(country) %>% str_trim(side = "both"))
```

Now to load the results data and then join it with the affiliations
data.

``` r
results_raw <- read_csv("../data/results.csv")

results_japan_raw <- results_raw %>% 
  filter(home_team == "Japan" | away_team == "Japan") %>% 
  rename(venue_country = country, 
         venue_city = city) %>% 
  mutate(match_num = row_number())

# combine with federation affiliations
results_japan_home <- results_japan_raw %>% 
  left_join(df_federations, 
            by = c("home_team" = "country")) %>% 
  mutate(federation = as.character(federation)) %>% 
  rename(home_federation = federation) 

results_japan_away <- results_japan_raw %>% 
  left_join(df_federations, 
            by = c("away_team" = "country")) %>% 
  mutate(federation = as.character(federation)) %>% 
  rename(away_federation = federation)

# combine home-away
results_japan_cleaned <- results_japan_home %>% 
  full_join(results_japan_away)
```

Next I need to fill in the federations for teams that didn’t have a
match in the federation affiliation data set. For example, “South Korea”
is “Korea Republic” in the Kaggle data set.

``` r
results_japan_cleaned <- results_japan_cleaned %>% 
  mutate(
    home_federation = case_when(
      home_team %in% c(
        "China", "Manchukuo", "Burma", "Korea Republic", "Vietnam Republic",
        "Korea DPR", "Brunei") ~ "AFC",
      home_team == "USA" ~ "Concacaf",
      home_team == "Bosnia-Herzegovina" ~ "UEFA",
      TRUE ~ home_federation),
    away_federation = case_when(
      away_team %in% c(
        "China", "Manchukuo", "Burma", "Korea Republic", "Vietnam Republic",
        "Korea DPR", "Brunei", "Taiwan") ~ "AFC",
      away_team == "USA" ~ "Concacaf",
      away_team == "Bosnia-Herzegovina" ~ "UEFA",
      TRUE ~ away_federation
    ))
```
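
One way to find which team names need this kind of manual fix is an
anti-join between the results and the affiliation table. This is only a
sketch on toy data (the two small tibbles below are invented, and this
is not necessarily how the list above was put together):

``` r
library(dplyr)

# toy stand-ins for the results and federation tables
toy_matches <- tibble(home_team = c("Japan", "Korea Republic", "USA"))
toy_federations <- tibble(country    = c("Japan", "South Korea"),
                          federation = c("AFC", "AFC"))

# keep only the team names that have no partner in the federation table
unmatched <- toy_matches %>% 
  distinct(home_team) %>% 
  anti_join(toy_federations, by = c("home_team" = "country"))

unmatched
```

Every name left in `unmatched` is a candidate for a `case_when()`
branch like the ones above.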

Now that it’s nice and cleaned up I can reshape it so that the data is
set from Japan’s perspective.

``` r
results_jp_asia <- results_japan_cleaned %>% 
  # filter only for Japan games and AFC opponents
  filter(home_team == "Japan" | away_team == "Japan",
         home_federation == "AFC" & away_federation == "AFC") %>% 
  select(-contains("federation"), -contains("venue"),
         -neutral, -match_num,
         date, home_team, home_score, away_team, away_score, tournament) %>% 
  # reshape columns to Japan vs. opponent
  mutate(
    opponent = case_when(
      away_team != "Japan" ~ away_team,
      home_team != "Japan" ~ home_team),
    home_away = case_when(
      home_team == "Japan" ~ "home",
      away_team == "Japan" ~ "away"),
    japan_goals = case_when(
      home_team == "Japan" ~ home_score,
      away_team == "Japan" ~ away_score),
    opp_goals = case_when(
      home_team != "Japan" ~ home_score,
      away_team != "Japan" ~ away_score)) %>% 
  # label results from Japan's perspective
  mutate(
    result = case_when(
      japan_goals > opp_goals ~ "Win",
      japan_goals < opp_goals ~ "Loss",
      japan_goals == opp_goals ~ "Draw"),
    result = result %>% as_factor() %>% fct_relevel(c("Win", "Draw", "Loss"))) %>% 
  select(-contains("score"), -contains("team"))
```

With all that done we can take a look at how Japan have done against
certain opponents by using `filter()`.

``` r
results_jp_asia %>% 
  filter(opponent == "Jordan",
         tournament == "AFC Asian Cup")
```

    ## # A tibble: 3 x 7
    ##   date       tournament    opponent home_away japan_goals opp_goals result
    ##   <date>     <chr>         <chr>    <chr>           <int>     <int> <fct> 
    ## 1 2004-07-31 AFC Asian Cup Jordan   home                1         1 Draw  
    ## 2 2011-01-09 AFC Asian Cup Jordan   home                1         1 Draw  
    ## 3 2015-01-20 AFC Asian Cup Jordan   home                2         0 Win

Unfortunately, this data set doesn’t record extra-time or penalty
shoot-out results. Japan’s Quarter-Final meeting with Jordan in 2004 is
listed as a draw, but Japan actually secured a route to the semis by
winning 4-3 on penalties!

I can create a function that’ll filter for certain opponents and
tournaments and aggregate the results. Because the second argument is
`...`, `tidyeval` lets me pass in any number of filter conditions for an
opponent, tournament, etc. The `if else` statements protect against
cases where Japan never had a given type of result against an opponent
(no losses, say) and make sure that a column populated by 0s is created
instead.

``` r
japan_versus <- function(data, ...) {
  # filter 
  filter_vars <- enquos(...)
  
  jp_vs <- data %>% 
    filter(!!!filter_vars) %>% 
    # count results and sum goals per result type and opponent
    group_by(result, opponent) %>% 
    summarize(j_g = sum(japan_goals),
              o_g = sum(opp_goals),
              n = n()) %>% 
    ungroup() %>% 
    # spread results over multiple columns
    spread(result, n) %>% 
    # 1. failsafe against no type of result against an opponent
    # 2. sum up counts per opponent
    group_by(opponent) %>% 
    mutate(Win = if("Win" %in% names(.)){return(Win)} else{return(0)},
         Draw = if("Draw" %in% names(.)){return(Draw)} else{return(0)},
         Loss = if("Loss" %in% names(.)){return(Loss)} else{return(0)}) %>% 
    summarize(Win = sum(Win, na.rm = TRUE),
              Draw = sum(Draw, na.rm = TRUE),
              Loss = sum(Loss, na.rm = TRUE),
              `Goals For` = sum(j_g),
              `Goals Against` = sum(o_g))
  
  return(jp_vs)
}
```
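
As a stripped-down illustration of the `...` forwarding on its own (not
part of the original analysis; `keep_rows()` and the toy data are
hypothetical), the conditions are captured with `enquos()` and spliced
into `filter()` with `!!!`:

``` r
library(dplyr)

# capture any number of filter conditions and splice them into filter()
keep_rows <- function(data, ...) {
  conditions <- enquos(...)
  filter(data, !!!conditions)
}

toy <- tibble(opponent  = c("China", "Oman", "China"),
              home_away = c("home", "away", "away"))

# both conditions must hold, exactly as if typed inside filter() directly
china_away <- keep_rows(toy, opponent == "China", home_away == "away")
china_away
```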

Now let’s try it out a bit.

``` r
japan_versus(data = results_jp_asia, 
             opponent == "China")
```

    ## # A tibble: 1 x 6
    ##   opponent   Win  Draw  Loss `Goals For` `Goals Against`
    ##   <chr>    <int> <int> <int>       <int>           <int>
    ## 1 China       14     8    10          54              45

I can put in multiple filter conditions if needed as well.

``` r
japan_versus(data = results_jp_asia,
             home_away == "home",
             opponent %in% c("Palestine", "Vietnam", "India"))
```

    ## # A tibble: 3 x 6
    ##   opponent    Win  Draw  Loss `Goals For` `Goals Against`
    ##   <chr>     <int> <dbl> <dbl>       <int>           <int>
    ## 1 India         2     0     0          13               0
    ## 2 Palestine     1     0     0           4               0
    ## 3 Vietnam       1     0     0           1               0

As you can see, Japan has never lost or drawn at home against India,
Palestine, or Vietnam, so in the filtered data there weren’t any rows
with “Loss” or “Draw” in the results column. With the function I created
I was able to impute results that didn’t exist and fill them in with 0s!
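
A different way to get the same zero-filling, shown here only as a
sketch on toy data and not what `japan_versus()` does, is to lean on the
factor levels of `result`: `complete()` expands a factor to all of its
levels, so missing result types appear as explicit rows before
reshaping. This assumes a version of `tidyr` that has `pivot_wider()`.

``` r
library(dplyr)
library(tidyr)

# toy data: only wins, but `result` still knows about all three levels
toy_results <- tibble(
  opponent = c("India", "India", "Vietnam"),
  result   = factor(c("Win", "Win", "Win"),
                    levels = c("Win", "Draw", "Loss"))
)

toy_record <- toy_results %>% 
  count(opponent, result) %>% 
  # add the missing opponent/result combinations with a count of 0
  complete(opponent, result, fill = list(n = 0L)) %>% 
  pivot_wider(names_from = result, values_from = n)

toy_record
```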

Let’s check Japan’s performance against our main rivals in the Asian
Cup. Here I make the tables look a lot nicer with the `kable()` function
from `knitr` and the options in the `kableExtra` package.

``` r
results_jp_asia %>% 
  japan_versus(opponent %in% c("Iran", "Korea Republic", "Saudi Arabia"),
               tournament == "AFC Asian Cup") %>% 
  knitr::kable(format = "html",
               caption = "Japan vs. Historic Rivals in the Asian Cup") %>% 
  kableExtra::kable_styling(full_width = FALSE) %>% 
  kableExtra::add_header_above(c(" ", "Result" = 3, "Goals" = 2))
```

<table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;">
<caption>
Japan vs. Historic Rivals in the Asian Cup
</caption>
<thead>
<tr>
<th style="border-bottom:hidden" colspan="1">
</th>
<th style="border-bottom:hidden; padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="3">
Result

</th>
<th style="border-bottom:hidden; padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="2">
Goals

</th>
</tr>
<tr>
<th style="text-align:left;">
opponent
</th>
<th style="text-align:right;">
Win
</th>
<th style="text-align:right;">
Draw
</th>
<th style="text-align:right;">
Loss
</th>
<th style="text-align:right;">
Goals For
</th>
<th style="text-align:right;">
Goals Against
</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left;">
Iran
</td>
<td style="text-align:right;">
1
</td>
<td style="text-align:right;">
2
</td>
<td style="text-align:right;">
0
</td>
<td style="text-align:right;">
1
</td>
<td style="text-align:right;">
0
</td>
</tr>
<tr>
<td style="text-align:left;">
Korea Republic
</td>
<td style="text-align:right;">
0
</td>
<td style="text-align:right;">
2
</td>
<td style="text-align:right;">
1
</td>
<td style="text-align:right;">
2
</td>
<td style="text-align:right;">
4
</td>
</tr>
<tr>
<td style="text-align:left;">
Saudi Arabia
</td>
<td style="text-align:right;">
4
</td>
<td style="text-align:right;">
0
</td>
<td style="text-align:right;">
1
</td>
<td style="text-align:right;">
13
</td>
<td style="text-align:right;">
4
</td>
</tr>
</tbody>
</table>

Now let’s take a look at how Japan have historically played against the
other teams in Group F of this year’s Asian Cup.

``` r
results_jp_asia %>% 
  japan_versus(opponent %in% c("Oman", "Uzbekistan", "Turkmenistan")) %>% 
  knitr::kable(format = "html",
               caption = "Japan's Record vs. Group F Teams") %>% 
  kableExtra::kable_styling(full_width = FALSE) %>% 
  kableExtra::add_header_above(c(" ", "Result" = 3, "Goals" = 2))
```

<table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;">
<caption>
Japan’s Record vs. Group F Teams
</caption>
<thead>
<tr>
<th style="border-bottom:hidden" colspan="1">
</th>
<th style="border-bottom:hidden; padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="3">
Result

</th>
<th style="border-bottom:hidden; padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="2">
Goals

</th>
</tr>
<tr>
<th style="text-align:left;">
opponent
</th>
<th style="text-align:right;">
Win
</th>
<th style="text-align:right;">
Draw
</th>
<th style="text-align:right;">
Loss
</th>
<th style="text-align:right;">
Goals For
</th>
<th style="text-align:right;">
Goals Against
</th>
</tr>
</thead>
<tbody>
<tr>
<td style="text-align:left;">
Oman
</td>
<td style="text-align:right;">
8
</td>
<td style="text-align:right;">
3
</td>
<td style="text-align:right;">
0
</td>
<td style="text-align:right;">
19
</td>
<td style="text-align:right;">
4
</td>
</tr>
<tr>
<td style="text-align:left;">
Uzbekistan
</td>
<td style="text-align:right;">
6
</td>
<td style="text-align:right;">
3
</td>
<td style="text-align:right;">
1
</td>
<td style="text-align:right;">
28
</td>
<td style="text-align:right;">
9
</td>
</tr>
</tbody>
</table>

We see no rows here for Turkmenistan. This is because, until just this
past week, Japan had **never** played against them in a friendly or
competitive game!

Conclusion
==========

Although Japan’s first game was quite horrible, I’m hoping it’ll shake
the players and coaches out of their complacency so that they don’t
underestimate our opponents in the next two games.

Thankfully, South Korea should be on the other side of the bracket, and
we would also only meet Iran in the semifinals (provided both teams
finish top of their respective groups).

Japan could meet Australia in the Quarters, but without Aaron Mooy
they’re a much weaker side, as shown by their abject loss to Jordan in
their opening match.

Even after losing new star Nakajima, the fact that we can replace him
with a player of the calibre of Takashi Inui, with Hannover regular
Genki Haraguchi stepping up from the bench, shows how much Japanese
football has progressed these past 25 years.

It’s a changing of the guard for Japan, but we’ve got quality players in
Europe as well as some depth too, with more Japanese players heading to
Europe from a young age.

It was quite awe-inspiring seeing how the number of Japanese players
playing for foreign clubs has been steadily increasing since the 1988
Asian Cup squad. Maybe that could be another idea for a visualization?

This tournament should be a first stepping stone for this new generation
of players to make a big impact at the next World Cup in 2022, so keep
an eye out for this bunch of players!


================================================
FILE: Asian Cup 2019/visualize_asian_cup_2019.rmd
================================================
---
title: "Untitled"
author: "RN7"
always_allow_html: yes
output: 
  md_document:
    variant: markdown_github
---

```{r setup, include=FALSE, warning=FALSE, message=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

Another year, another big soccer/football tournament! This time it's the top international competition in Asia, the Asian Cup hosted in the U.A.E. I'll be covering (responsible) web-scraping, data wrangling (tidyverse ftw!), and of course, data visualization with `ggplot2`.

Let's get started!

## Packages

```{r message=FALSE}
pacman::p_load(tidyverse, scales, lubridate, ggrepel, stringi, magick, 
               glue, extrafont, rvest, ggtextures, cowplot, ggimage, polite)
# Roboto Condensed font (from hrbrmstrthemes or just Google it)
loadfonts()
```

## Top Goalscorers of the Asian Cup

The first thing I looked at was, "Who were the top goalscorers in the history of the Asian Cup?"

Here I use the [polite](https://github.com/dmi3kno/polite) package to take a look at the `robots.txt` for the web page and see if it is OK to web scrape. It's good to make things like this a habit!

First you pass the URL to the `bow()` function, check that you are indeed allowed to scrape, then use `scrape()` to retrieve data, and the rest is the usual `rvest` web-scraping workflow.

```{r}
topg_url <- "https://en.wikipedia.org/wiki/AFC_Asian_Cup_records_and_statistics"

session <- bow(topg_url)

ac_top_scorers <- scrape(session) %>%
  html_nodes("table.wikitable:nth-child(29)") %>% ## 6/22/2019: it's (36) now and the players have changed...
  html_table() %>% 
  flatten_df() %>% 
  select(-Ref.) %>% 
  set_names(c("total_goals", "player", "country"))
```

For brevity, let's only take a look at the top 5 goal scorers. I'll also `mutate()` in a nice image of a soccer ball for the data points on the plot.

```{r}
ac_top_scorers <- ac_top_scorers %>% 
  head(5) %>% 
  mutate(image = "https://www.emoji.co.uk/files/microsoft-emojis/activity-windows10/8356-soccer-ball.png")
```

Now it's ready! Slightly different to your standard bar graph here as I use the `geom_isotype_col()` function from `ggtextures` to create a bar of soccer ball images. Compared to other functions in `ggtextures`, `geom_isotype_col()` allows each image to correspond to the value of the variable you are plotting, in this case 1 ball = 1 goal!

```{r top goal scorers plot, fig.width=8, fig.height=6}
ac_top_graph <- ac_top_scorers %>% 
  ggplot(aes(x = reorder(player, total_goals), y = total_goals,
             image = image)) +
  geom_isotype_col(img_width = grid::unit(1, "native"), img_height = NULL,
    ncol = NA, nrow = 1, hjust = 0, vjust = 0.5) +
  coord_flip() +
  scale_y_continuous(breaks = c(0, 2, 4, 6, 8, 10, 12, 14),
                     expand = c(0, 0), 
                     limits = c(0, 15)) +
  ggthemes::theme_solarized() +
  labs(title = "Top Scorers of the Asian Cup",
       subtitle = "Most goals in a single tournament: 8 (Ali Daei, 1996)",
       y = "Number of Goals", x = NULL,
       caption = glue("
                      Source: Wikipedia
                      By @R_by_Ryo")) +
  theme(text = element_text(family = "Roboto Condensed"),
        plot.title = element_text(size = 22),
        plot.subtitle = element_text(size = 14),
        axis.text = element_text(size = 14),
        axis.title.x = element_text(size = 16),
        axis.line.y = element_blank(),
        panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        axis.ticks.y = element_blank())

ac_top_graph
```

OK, not bad. However, wouldn't it be nice to add a bit more information for context? Specifically, which country these players came from. So let's add some flags along the y-axis!

There are lots of different ways to do this (like `geom_flag()` from the `ggimage` package) but I ended up doing it the `cowplot` way. I had to tweak the scales a bit as the flags came in different sizes. When you plot, you just insert the image strip into the bar plot with `axis_canvas()` and combine all the parts together with `ggdraw()`!

```{r draw_image, fig.width=8, fig.height=6, fig.align='center'}
axis_image <- axis_canvas(ac_top_graph, axis = 'y') + 
  draw_image("https://upload.wikimedia.org/wikipedia/commons/c/ca/Flag_of_Iran.svg", 
             y = 13, scale = 1.5) +
  draw_image("https://upload.wikimedia.org/wikipedia/commons/0/09/Flag_of_South_Korea.svg", 
             y = 10, scale = 1.7) +
  draw_image("https://upload.wikimedia.org/wikipedia/en/9/9e/Flag_of_Japan.svg", 
             y = 7, scale = 1.7) +
  draw_image("https://upload.wikimedia.org/wikipedia/commons/f/f6/Flag_of_Iraq.svg", 
             y = 4, scale = 1.6) +
  draw_image("https://upload.wikimedia.org/wikipedia/commons/a/aa/Flag_of_Kuwait.svg", 
             y = 1, scale = 1.2)

ggdraw(insert_yaxis_grob(ac_top_graph, axis_image, position = "left"))
```

Ideally I wanted the soccer balls to be the official balls from the tournament that the player scored in. However, I couldn't find a nice emoji-fied/icon-ized version and there was also the "small" problem in that there was no "official" Asian Cup ball until the 2004 tournament in China! You can take a look at the official Asian Cup balls [here](http://football-balls.com/balls/asian-cup).

## Winners of the Asian Cup

We saw that the top goal scorers came from Iran, South Korea, Japan, Iraq, and Kuwait but did their goal scoring exploits lead their nations to glory? Let's find out!

When web-scraping I really like using `flatten_df()` after `html_table()` as I don't have to use the awkward looking `.[[1]]` within my piped workflow.

```{r}
acup_url <- "https://en.wikipedia.org/wiki/AFC_Asian_Cup"

session <- bow(acup_url)

acup_winners_raw <- scrape(session) %>% 
  html_nodes("table:nth-child(31)") %>% 
  html_table() %>% 
  flatten_df()
```

Now I can use the `clean_names()` function to quickly clean up my names (mainly when I can't be bothered to `set_names()` them myself...).

Next, `separate()` splits each placement column (1st to 3rd) into the number of times a team finished there and the years in which that happened.

The variants of `mutate()` are then used to tidy the string columns of the data into numeric type.

I use `gather()` so each team will have a row for each of the rank positions (1st-3rd). 

Finally, I arrange the data in a way that the facets will be ordered in the way that I want.

```{r clean winners df, warning=FALSE}
acup_winners_clean <- acup_winners_raw %>% 
  janitor::clean_names() %>% 
  slice(1:8) %>% 
  select(-fourth_place, -semi_finalists, -total_top_four) %>% 
  separate(winners, into = c("Champions", "first_place_year"), 
           sep = " ", extra = "merge") %>% 
  separate(runners_up, into = c("Runners-up", "second_place_year"), 
           sep = " ", extra = "merge") %>% 
  separate(third_place, into = c("Third Place", "third_place_year"), 
           sep = " ", extra = "merge") %>% 
  mutate_all(funs(str_replace_all(., "–", "0"))) %>% 
  mutate_at(vars(contains("num")), funs(as.numeric)) %>% 
  mutate(team = if_else(team == "Israel1", "Israel", team)) %>% 
  gather(key = "key", value = "value", -team, 
         -first_place_year, -second_place_year, -third_place_year) %>% 
  mutate(key = key %>% 
           fct_relevel(c("Champions", "Runners-up", "Third Place"))) %>% 
  arrange(key, value) %>% 
  mutate(team = as_factor(team),
         order = row_number())
```
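
To make the `separate()` and `gather()` steps easier to follow, here is a minimal sketch on a toy table. The two-row tibble is made up to mimic how I assume a scraped cell looks (a count followed by the years in parentheses); it is not the scraped data itself.

```{r}
library(dplyr)
library(tidyr)

toy_winners <- tibble(
  team    = c("Japan", "Iran"),
  winners = c("4 (1992, 2000, 2004, 2011)", "3 (1968, 1972, 1976)")
)

toy_long <- toy_winners %>% 
  # split on the first space only; extra = "merge" keeps the rest together
  separate(winners, into = c("Champions", "first_place_year"),
           sep = " ", extra = "merge") %>% 
  # one row per team and rank position, with the count in "value"
  gather(key = "key", value = "value", -team, -first_place_year)

toy_long
```

Note that `value` is still a character column at this point, which is why a conversion step is needed before the counts can be treated as numbers.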

I plot using facets on the "key" variable (containing the rank data) so that we can see how many times each team placed as Champions to Third Place. I also use the `glue()` function here to format the multi-line captions and titles in a neat way.

```{r, fig.width=8, fig.height=6, fig.align='center'}
acup_winners_clean %>% 
  ggplot(aes(value, team, color = key)) +
  geom_point(size = 5) +
  scale_color_manual(values = c("Champions" = "#FFCC33",
                                "Runners-up" = "#999999",
                                "Third Place" = "#CC6600"),
                     guide = FALSE) +
  labs(x = "Number of Occurrences",
       title = "Winners & Losers of the Asian Cup!",
       subtitle = glue("
                       Ordered by number of Asian Cup(s) won.
                       Four-time Champions, Japan, only won their first in 1992!"),
       caption = glue("
                      Note: Israel was expelled by the AFC in 1974 while Australia joined the AFC in 2006.
                      Source: Wikipedia
                      By @R_by_Ryo")) +
  facet_wrap(~key) +
  theme_minimal() +
  theme(text = element_text(family = "Roboto Condensed"),
        title = element_text(size = 18),
        plot.subtitle = element_text(size = 12),
        axis.title.y = element_blank(),
        axis.title.x = element_text(size = 12),
        axis.text.y = element_text(size = 14),
        axis.text.x = element_text(size = 12),
        plot.caption = element_text(hjust = 0, size = 10),
        panel.border = element_rect(fill = NA, colour = "grey20"),
        panel.grid.minor.x = element_blank(),
        strip.text = element_text(size = 16)) 
```

## Goals per Game

One new thing I learned very recently, while working on this viz in fact, was using magrittr aliases! 

See [this tweet by Emil Hvitfeldt](https://twitter.com/Emil_Hvitfeldt/status/1081080919073542144).

Usually for web scraping I always wind up having to use `.[x]` or `.[[x]]` but now I can just use `extract()` or `extract2()` respectively to do the same thing!

```{r GPG base links}
wiki_url <- "https://en.wikipedia.org"
session <- bow(wiki_url)
acup_url <- "https://en.wikipedia.org/wiki/AFC_Asian_Cup"
session_cup <- bow(acup_url)

cup_links <- scrape(session_cup) %>% 
  html_nodes("br+ i a") %>% 
  html_attr("href") %>% 
  magrittr::extract(-17:-18)

acup_df <- cup_links %>% 
  as_data_frame() %>% 
  mutate(cup = str_remove(value, "\\/wiki\\/") %>% str_replace_all("_", " ")) %>% 
  rename(link = value)
```
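
The aliases are easy to check on a toy list (the objects below are made up for illustration): `extract()` behaves like `[` and `extract2()` like `[[`, and negative indices drop elements just as in the `-17:-18` call above.

```{r}
library(magrittr)

tables <- list(first = c(10, 20, 30), second = c("a", "b"))

# same as tables[1]: still a list of length one
as_list <- tables %>% extract(1)

# same as tables[[1]]: the vector itself
as_vector <- tables %>% extract2(1)

# negative indices drop elements, here the 3rd and 4th
trimmed <- c(5, 6, 7, 8) %>% extract(-3:-4)
```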

Another cool thing I found while scraping this data was the `jump_to()` function, which lets you navigate to a new URL within a session. This makes `map()`-ing over multiple links from a base URL very easy! Here, the base URL is `https://en.wikipedia.org` and the function iterates over the links to each of the tournament pages. Another way I could've done this was to `map()` over the years of the tournaments, as the Wikipedia page of each edition of the Asian Cup only differs in the year at the start of the page name.

```{r goal info functions, warning=FALSE}
goals_info <- function(x) {
  goal_info <- session %>% 
    jump_to(x) %>% 
    html_nodes(".vcalendar") %>% 
    html_table(header = FALSE) %>% 
    flatten_df() %>% 
    spread(key = X1, value = X2) %>% 
    select(`Goals scored`) %>% 
    mutate(`Goals scored` = str_remove_all(`Goals scored`, pattern = ".*\\(") %>% 
             str_extract_all("\\d+\\.*\\d*") %>% as.numeric)
}

team_num_info <- function(x) {
  team_num_info <- session %>% 
    jump_to(x) %>% 
    html_nodes(".vcalendar") %>% 
    html_table(header = FALSE) %>% 
    flatten_df() %>% 
    spread(key = X1, value = X2) %>% 
    select(`Teams`) %>% 
    mutate(`Teams` = as.numeric(`Teams`))
}

match_num_info <- function(x) {
  match_num_info <- session %>% 
    jump_to(x) %>% 
    html_nodes(".vcalendar") %>% 
    html_table(header = FALSE) %>% 
    flatten_df() %>% 
    spread(key = X1, value = X2) %>% 
    janitor::clean_names() %>% 
    select(matches_played) %>% 
    mutate(matches_played = as.numeric(matches_played))
}

# all together:
goals_data <- acup_df %>% 
  mutate(goals_per_game = map(acup_df$link, goals_info) %>% unlist,
         team_num = map(acup_df$link, team_num_info) %>% unlist,
         match_num = map(acup_df$link, match_num_info) %>% unlist)
```

Next, clean it up a bit and add in the number of teams that participated in each tournament.

```{r clean up}
ac_goals_df <- goals_data %>% 
  mutate(label = cup %>% str_extract("[0-9]+") %>% str_replace("..", "'"),
         team_num = case_when(
           is.na(team_num) ~ 16,
           TRUE ~ team_num
         )) %>% 
  arrange(cup) %>% 
  mutate(label = factor(label, label),
         team_num = c(4, 4, 4, 5, 6, 6, 10, 10, 10, 8, 12, 12, 16, 16, 16, 16))

glimpse(ac_goals_df)
```
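
The label-building step is compact, so here it is on its own as a small sketch (the two cup names are just sample inputs): `str_extract()` pulls out the year, `str_replace("..", "'")` swaps its first two characters for an apostrophe, and `factor(label, label)` fixes the level order to the order of the rows.

```{r}
library(stringr)
library(magrittr)

cups <- c("1956 AFC Asian Cup", "2015 AFC Asian Cup")

# "1956 AFC Asian Cup" -> "1956" -> "'56"
labels <- cups %>% str_extract("[0-9]+") %>% str_replace("..", "'")

# levels follow the row order rather than alphabetical order
label_fct <- factor(labels, labels)
levels(label_fct)
```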

Now we make a line graph but with LOTS of `annotate()` code to add in comments, labels, and segments for the labels. At the end I use `geom_emoji()` to add a soccer ball to the plot for each of the data points.

```{r, fig.align='center'}
plot <- ac_goals_df %>% 
  ggplot(aes(x = label, y = goals_per_game, group = 1)) +
  geom_line() +
  scale_y_continuous(limits = c(NA, 5.35),
                     breaks = c(1.5, 2, 2.5, 3, 3.5, 4, 4.5)) +
  labs(x = "Tournament (Year)", y = "Goals per Game") +
  theme_minimal() +
  theme(text = element_text(family = "Roboto Condensed"),
        axis.title = element_text(size = 12),
        axis.text = element_text(size = 12)) +
  annotate(geom = "label", x = "'56", y = 5.15, family = "Roboto Condensed",
           color = "black", 
           label = "Total Number of Games Played:", hjust = 0) +
  annotate(geom = "text", x = "'60", y = 4.9, 
           label = "6", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 1, xend = 3, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'68", y = 4.9, 
           label = "10", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 3.8, xend = 4.2, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'72", y = 4.9, 
           label = "13", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 4.8, xend = 5.2, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'76", y = 4.9, 
           label = "10", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 5.8, xend = 6.2, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'84", y = 4.9, 
           label = "24", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 7, xend = 9, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'92", y = 4.9, 
           label = "16", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 9.8, xend = 10.2, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = 11.5, y = 4.9, 
           label = "26", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 11, xend = 12, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = 14.5, y = 4.9, 
           label = "32", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 13, xend = 16, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = 9, y = 4, family = "Roboto Condensed",
           label = glue("
                        Incredibly low amount of goals in Group B
                        (15 in 10 Games) and in Knock-Out Stages
                        (4 goals in 4, only 1 scored in normal time)")) +
  annotate(geom = "segment", x = 9, xend = 9, y = 1.65, yend = 3.75,
           color = "red") +
  ggimage::geom_emoji(aes(image = '26bd'), size = 0.03) 

plot
ggsave(filename = paste0(here::here("Asian Cup 2019"), "/gpg_plot.png"), 
       width = 8, height = 7, dpi = 300)
plot <- image_read(paste0(here::here("Asian Cup 2019"), "/gpg_plot.png"))
```
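The long run of `annotate()` calls could also be driven from small data frames, which keeps all the label and bracket positions in one place. The following is only a sketch with toy values; `toy_df`, `games_labels`, and `games_segments` are hypothetical names and numbers, not the objects or code that produced the figure above.

```r
library(ggplot2)

# Toy stand-in for ac_goals_df (values are illustrative only)
toy_df <- data.frame(
  label = factor(c("'56", "'60", "'64", "'68"),
                 levels = c("'56", "'60", "'64", "'68")),
  goals_per_game = c(4.5, 3.2, 2.2, 3.2)
)

# One row per "games played" label; x is the position on the discrete axis
games_labels <- data.frame(
  x = c(2, 4),
  y = 4.9,
  label = c("6", "10")
)

# One row per horizontal bracket drawn under a label
games_segments <- data.frame(
  x = c(1, 3.8),
  xend = c(3, 4.2),
  y = 4.8
)

p <- ggplot(toy_df, aes(x = label, y = goals_per_game, group = 1)) +
  geom_line() +
  geom_text(data = games_labels, aes(x = x, y = y, label = label),
            inherit.aes = FALSE) +
  geom_segment(data = games_segments,
               aes(x = x, xend = xend, y = y, yend = y),
               inherit.aes = FALSE) +
  scale_y_continuous(limits = c(NA, 5.35)) +
  theme_minimal()
```

Adding a tournament then means adding a row to each data frame rather than two more `annotate()` calls.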

However, I'm not finished yet! I wanted to make this look a bit more "official" so I tried adding the Asian Cup logo to the top right corner. There are probably other ways to do this, especially with grobs, but I was reminded of [this](https://www.danielphadley.com/ggplot-logo/) blog post by [Daniel Hadley](https://twitter.com/danielphadley) who used the `magick` package to add a footer with a logo onto a `ggplot` object. I've used `magick` before for animations and this was a good chance to try it out for image editing. Unlike Daniel Hadley's example, I needed the logo in the right corner of a header, so I created a blank canvas with `image_blank()`, placed the logo on it with `image_composite()`, and stacked the header on top of the plot with `image_append()`.

```{r, fig.align='center'}
logo_raw <- image_read("https://upload.wikimedia.org/wikipedia/en/a/ad/2019_afc_asian_cup_logo.png")

logo_proc <- logo_raw %>% image_scale("600")

# create blank canvas
a <- image_blank(width = 1000, height = 100, color = "white")
# combine with logo image and shift logo to the right
b <- image_composite(image_scale(a, "x100"), image_scale(logo_proc, "x75"), 
                     offset = "+880+25")
# add in the title text
logo_header <- b %>% 
  image_annotate(text = glue("Goals per Game Throughout the History of the Asian Cup"),
                 color = "black", size = 24, font = "Roboto Condensed",
                 location = "+63+50", gravity = "northwest")

# combine it all together! 
final2_plot <- image_append(image_scale(c(logo_header, plot), "1000"), stack = TRUE)

# image_write(final2_plot,
#             glue("{here::here('Asian Cup 2019')}/gpg_plot_final.png"))

final2_plot
```

All in all it took a while to tweak the positions of the text and logo image but for my first try it worked well. There is definitely room for improvement in regards to sizing and scaling though.

Ultimately, I couldn't find much information on why those tournaments in the 80s in particular were such low-scoring affairs. I wasn't alive to watch those games on TV, nor could I find any illuminating articles or blog posts on the style of Asian football back in the 80s. This was also before Japan really got into soccer, so there wasn't anything I could find in Japanese either.

## Japan's record vs. Group F opponents and rivals

Japan is the most successful team in the competition with 4 championships but who are their opponents in the group stages and how have they fared against them? While I'm at it I will also check their records against long-time continental rivals such as Iran, South Korea, Saudi Arabia and more recently, Australia.

The data I'm going to use comes from [Kaggle](https://www.kaggle.com/martj42/international-football-results-from-1872-to-2017) which has all international football results from 1872 to the World Cup final last year. To add in the federation affiliation (UEFA, AFC, etc.) for each of the countries I slightly modified some code from one of the kernels, ["A Journey Through The History of Soccer"](https://www.kaggle.com/phjulien/a-journey-through-the-history-of-soccer/) by PH Julien.

```{r}
federation_files <- Sys.glob("../data/federation_affiliations/*")

df_federations = data.frame(country = NULL, federation = NULL)
for (f in federation_files) {
    federation = basename(f)
    content = read.csv(f, header=FALSE)
    content <- cbind(content,federation=rep(federation, dim(content)[1]))
    df_federations <- rbind(df_federations, content)
}

colnames(df_federations) <- c("country", "federation")

df_federations <- df_federations %>% 
  mutate(country = as.character(country) %>% str_trim(side = "both"))
```
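The same loop can be written without growing a data frame inside `for`. This is a minimal sketch under the assumption that each file is named after its federation and holds one country per line; the temporary files and the name `df_federations_alt` are made up for illustration, and `readLines()` is swapped in for `read.csv()` here.

```r
library(purrr)
library(dplyr)

# Hypothetical stand-in files: one file per federation, one country per line
tmp_dir <- file.path(tempdir(), "federation_affiliations")
dir.create(tmp_dir, showWarnings = FALSE)
writeLines(c("Japan", "Iran "), file.path(tmp_dir, "AFC"))
writeLines(c("Germany", " France"), file.path(tmp_dir, "UEFA"))

federation_files <- Sys.glob(file.path(tmp_dir, "*"))

df_federations_alt <- federation_files %>%
  map(function(f) {
    tibble(
      country = trimws(readLines(f)),  # strip stray whitespace around names
      federation = basename(f)         # the file name doubles as the label
    )
  }) %>%
  bind_rows()
```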

Now to load the results data and then join it with the affiliations data.

```{r, message=FALSE}
results_raw <- read_csv("../data/results.csv")

results_japan_raw <- results_raw %>% 
  filter(home_team == "Japan" | away_team == "Japan") %>% 
  rename(venue_country = country, 
         venue_city = city) %>% 
  mutate(match_num = row_number())

# combine with federation affiliations
results_japan_home <- results_japan_raw %>% 
  left_join(df_federations, 
            by = c("home_team" = "country")) %>% 
  mutate(federation = as.character(federation)) %>% 
  rename(home_federation = federation) 

results_japan_away <- results_japan_raw %>% 
  left_join(df_federations, 
            by = c("away_team" = "country")) %>% 
  mutate(federation = as.character(federation)) %>% 
  rename(away_federation = federation)

# combine home-away
results_japan_cleaned <- results_japan_home %>% 
  full_join(results_japan_away)
```
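Since both joins start from the same table, the home and away affiliations can also be attached in a single chain, which avoids the final `full_join()`. A minimal sketch with toy data (the `toy_*` objects are hypothetical and only stand in for the real tables):

```r
library(dplyr)

toy_results <- tibble(
  home_team = c("Japan", "Iran"),
  away_team = c("Germany", "Japan")
)
toy_federations <- tibble(
  country = c("Japan", "Iran", "Germany"),
  federation = c("AFC", "AFC", "UEFA")
)

toy_joined <- toy_results %>%
  # look up the home side's federation
  left_join(toy_federations, by = c("home_team" = "country")) %>%
  rename(home_federation = federation) %>%
  # then look up the away side's federation
  left_join(toy_federations, by = c("away_team" = "country")) %>%
  rename(away_federation = federation)
```

Renaming right after each join keeps the second `federation` column from clashing with the first.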

Next I need to fill in the federation for teams that didn't find a match in the federation affiliation data set because their names differ, for example, "South Korea" is "Korea Republic" in the Kaggle data set.

```{r}
results_japan_cleaned <- results_japan_cleaned %>% 
  mutate(
    home_federation = case_when(
      home_team %in% c(
        "China", "Manchukuo", "Burma", "Korea Republic", "Vietnam Republic",
        "Korea DPR", "Brunei") ~ "AFC",
      home_team == "USA" ~ "Concacaf",
      home_team == "Bosnia-Herzegovina" ~ "UEFA",
      TRUE ~ home_federation),
    away_federation = case_when(
      away_team %in% c(
        "China", "Manchukuo", "Burma", "Korea Republic", "Vietnam Republic",
        "Korea DPR", "Brunei", "Taiwan") ~ "AFC",
      away_team == "USA" ~ "Concacaf",
      away_team == "Bosnia-Herzegovina" ~ "UEFA",
      TRUE ~ away_federation
    ))
```

Now that it's nice and cleaned up I can reshape it so that the data is set from Japan's perspective.

```{r}
results_jp_asia <- results_japan_cleaned %>% 
  # filter only for Japan games and AFC opponents
  filter(home_team == "Japan" | away_team == "Japan",
         home_federation == "AFC" & away_federation == "AFC") %>% 
  select(-contains("federation"), -contains("venue"),
         -neutral, -match_num,
         date, home_team, home_score, away_team, away_score, tournament) %>% 
  # reshape columns to Japan vs. opponent
  mutate(
    opponent = case_when(
      away_team != "Japan" ~ away_team,
      home_team != "Japan" ~ home_team),
    home_away = case_when(
      home_team == "Japan" ~ "home",
      away_team == "Japan" ~ "away"),
    japan_goals = case_when(
      home_team == "Japan" ~ home_score,
      away_team == "Japan" ~ away_score),
    opp_goals = case_when(
      home_team != "Japan" ~ home_score,
      away_team != "Japan" ~ away_score)) %>% 
  # label results from Japan's perspective
  mutate(
    result = case_when(
      japan_goals > opp_goals ~ "Win",
      japan_goals < opp_goals ~ "Loss",
      japan_goals == opp_goals ~ "Draw"),
    result = result %>% as_factor() %>% fct_relevel(c("Win", "Draw", "Loss"))) %>% 
  select(-contains("score"), -contains("team"))
```

With all that done we can take a look at how Japan have done against certain opponents by using `filter()`.

```{r}
results_jp_asia %>% 
  filter(opponent == "Jordan",
         tournament == "AFC Asian Cup")
```

Unfortunately, this data set doesn't record extra time or penalty shoot-outs: Japan's Quarter-Final meeting with Jordan in 2004 ended with Japan securing a route to the semis, 4-3 on penalties, but that detail doesn't appear here!

I can create a function that'll filter for certain opponents and tournaments and aggregate the results. Because the second argument is `...`, tidy evaluation (`enquos()` to capture the conditions and `!!!` to splice them into `filter()`) allows me to input any kind of filter condition for an opponent, tournament, etc. The `if else` statements protect against cases where Japan never had that type of result against an opponent and make sure that a column populated by 0s is created.

```{r}
japan_versus <- function(data, ...) {
  # filter 
  filter_vars <- enquos(...)
  
  jp_vs <- data %>% 
    filter(!!!filter_vars) %>% 
    # count results type per opponent
    group_by(result, opponent) %>% 
    mutate(n = n()) %>% 
    ungroup() %>% 
    # sum amount of goals by Japan and opponent
    group_by(result, opponent) %>% 
    summarize(j_g = sum(japan_goals),
              o_g = sum(opp_goals),
              n = n()) %>% 
    ungroup() %>% 
    # spread results over multiple columns
    spread(result, n) %>% 
    # 1. failsafe against no type of result against an opponent
    # 2. sum up counts per opponent
    group_by(opponent) %>% 
    # a plain if/else is enough here; return() is not needed inside mutate()
    mutate(Win = if ("Win" %in% names(.)) Win else 0,
           Draw = if ("Draw" %in% names(.)) Draw else 0,
           Loss = if ("Loss" %in% names(.)) Loss else 0) %>% 
    summarize(Win = sum(Win, na.rm = TRUE),
              Draw = sum(Draw, na.rm = TRUE),
              Loss = sum(Loss, na.rm = TRUE),
              `Goals For` = sum(j_g),
              `Goals Against` = sum(o_g))
  
  return(jp_vs)
}
```
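Another way to get the zero-filled columns, shown here only as a sketch on toy data, is to count the results and let `tidyr::complete()` add the missing opponent/result combinations before widening. It assumes `result` is a factor with all three levels and uses `pivot_wider()` (tidyr 1.0 or later) instead of `spread()`; `toy_matches` and `toy_record` are hypothetical names.

```r
library(dplyr)
library(tidyr)

toy_matches <- tibble(
  opponent = c("India", "India", "China", "China"),
  result = factor(c("Win", "Win", "Win", "Loss"),
                  levels = c("Win", "Draw", "Loss"))
)

toy_record <- toy_matches %>%
  count(opponent, result) %>%
  # complete() uses every factor level, so unseen results get n = 0
  complete(opponent, result, fill = list(n = 0L)) %>%
  pivot_wider(names_from = result, values_from = n)
```

With this approach the `Win`, `Draw`, and `Loss` columns always exist, so no `if else` check on the column names is needed.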

Now let's try it out a bit.

```{r}
japan_versus(data = results_jp_asia, 
             opponent == "China")
```

I can put in multiple filter conditions if needed as well.

```{r}
japan_versus(data = results_jp_asia,
             home_away == "home",
             opponent %in% c("Palestine", "Vietnam", "India"))
```

As you can see, in these home games Japan has never lost or drawn against India, Palestine, or Vietnam, so the data contains no rows with "Loss" or "Draw" in the `result` column for these opponents. With the function I created, result types that never occurred are filled in with 0s!

Let's check Japan's performance against our main rivals in the Asian Cup. Here I make the tables look a lot nicer with `knitr::kable()` and the options in the `kableExtra` package.

```{r, fig.align='center'}
results_jp_asia %>% 
  japan_versus(opponent %in% c("Iran", "Korea Republic", "Saudi Arabia"),
               tournament == "AFC Asian Cup") %>% 
  knitr::kable(format = "html",
               caption = "Japan vs. Historic Rivals in the Asian Cup") %>% 
  kableExtra::kable_styling(full_width = FALSE) %>% 
  kableExtra::add_header_above(c(" ", "Result" = 3, "Goals" = 2))
```

Now let's take a look at how Japan have historically played against the other teams in Group F of this year's Asian Cup.

```{r}
results_jp_asia %>% 
  japan_versus(opponent %in% c("Oman", "Uzbekistan", "Turkmenistan")) %>% 
  knitr::kable(format = "html",
               caption = "Japan's Record vs. Group F Teams") %>% 
  kableExtra::kable_styling(full_width = FALSE) %>% 
  kableExtra::add_header_above(c(" ", "Result" = 3, "Goals" = 2))
```

We see no rows here for Turkmenistan. This is due to the fact that until just this past week Japan had **never** played against them in a friendly or competitive game!

# Conclusion

Although Japan's first game was quite horrible, I'm hoping it'll wake the players and coaches out of their complacency so that they don't underestimate our opponents in the next two games.

Thankfully, South Korea should be on the other side of the bracket, and we would only meet Iran in the semifinals (provided both teams finish top of their respective groups).

Japan could meet Australia in the Quarters but without Aaron Mooy they're a much weaker side, as shown in their abject loss to Jordan in their opening match.

Even after losing new star Nakajima, the fact that we can replace him with a player of the calibre of Takashi Inui, with Hannover regular Genki Haraguchi stepping up from the bench, shows how much Japanese football has progressed these past 25 years.

It's a changing of the guard for Japan but we've got quality players in Europe as well as some depth too, with more Japanese players heading to Europe from a young age.

It was quite awe-inspiring seeing how the number of Japanese players playing for foreign clubs has been steadily increasing since the 1988 Asian Cup squad. Maybe that could be another idea for a visualization?

This tournament should be a first stepping stone for this new generation of players to make a big impact at the next World Cup in 2022, so keep your eye out for this bunch of players!

================================================
FILE: Asian Cup 2019/visualize_asian_cup_2019.utf8.md
================================================
---
title: "Untitled"
author: "RN7"
always_allow_html: yes
output: 
  md_document:
    variant: markdown_github
---



Another year, another big soccer/football tournament! This time it's the top international competition in Asia, the Asian Cup hosted in the U.A.E. I'll be covering (responsible) web-scraping, data wrangling (tidyverse ftw!), and of course, data visualization with `ggplot2`.

Let's get started!

## Packages


```r
pacman::p_load(tidyverse, scales, lubridate, ggrepel, stringi, magick, 
               glue, extrafont, rvest, ggtextures, cowplot, ggimage, polite)
# Roboto Condensed font (from hrbrthemes or just Google it)
loadfonts()
```

## Top Goalscorers of the Asian Cup

The first thing I looked at was, "Who were the top goalscorers in the history of the Asian Cup?"

Here I use the [polite](https://github.com/dmi3kno/polite) package to take a look at the `robots.txt` for the web page and see if it is OK to web scrape. It's good to make things like this a habit!

First you pass the URL to the `bow()` function, check that you are indeed allowed to scrape, then use `scrape()` to retrieve data, and the rest is the usual `rvest` web-scraping workflow.


```r
topg_url <- "https://en.wikipedia.org/wiki/AFC_Asian_Cup_records_and_statistics"

session <- bow(topg_url)

ac_top_scorers <- scrape(session) %>%
  html_nodes("table.wikitable:nth-child(29)") %>% 
  html_table() %>% 
  flatten_df() %>% 
  select(-Ref.) %>% 
  set_names(c("total_goals", "player", "country"))
```

For brevity, let's only take a look at the top 5 goal scorers. I'll also `mutate()` in a nice image of a soccer ball for the data points on the plot.


```r
ac_top_scorers <- ac_top_scorers %>% 
  head(5) %>% 
  mutate(image = "https://www.emoji.co.uk/files/microsoft-emojis/activity-windows10/8356-soccer-ball.png")
```

Now it's ready! Slightly different to your standard bar graph here as I use the `geom_isotype_col()` function from `ggtextures` to create a bar of soccer ball images. Compared to other functions in `ggtextures`, `geom_isotype_col()` allows each image to correspond to the value of the variable you are plotting, in this case 1 ball = 1 goal!


```r
ac_top_graph <- ac_top_scorers %>% 
  ggplot(aes(x = reorder(player, total_goals), y = total_goals,
             image = image)) +
  geom_isotype_col(img_width = grid::unit(1, "native"), img_height = NULL,
    ncol = NA, nrow = 1, hjust = 0, vjust = 0.5) +
  coord_flip() +
  scale_y_continuous(breaks = c(0, 2, 4, 6, 8, 10, 12, 14),
                     expand = c(0, 0), 
                     limits = c(0, 15)) +
  ggthemes::theme_solarized() +
  labs(title = "Top Scorers of the Asian Cup",
       subtitle = "Most goals in a single tournament: 8 (Ali Daei, 1996)",
       y = "Number of Goals", x = NULL,
       caption = glue("
                      Source: Wikipedia
                      By @R_by_Ryo")) +
  theme(text = element_text(family = "Roboto Condensed"),
        plot.title = element_text(size = 22),
        plot.subtitle = element_text(size = 14),
        axis.text = element_text(size = 14),
        axis.title.x = element_text(size = 16),
        axis.line.y = element_blank(),
        panel.grid.minor = element_blank(),
        panel.background = element_blank(),
        axis.ticks.y = element_blank())

ac_top_graph
```

![](visualize_asian_cup_2019_files/figure-markdown_github/top goal scorers plot-1.png)

OK, not bad. However, wouldn't it be nice to add a bit more information for context? Specifically, which country these players came from. So let's add some flags along the y-axis!

There are lots of different ways to do this (like `geom_flag()` from the `ggimage` package) but I ended up doing it the `cowplot` way. I had to tweak the scales a bit as the flags came in different sizes. When you plot, you create the image strip with `axis_canvas()`, insert it into the bar plot with `insert_yaxis_grob()`, and draw all the parts together with `ggdraw()`!


```r
axis_image <- axis_canvas(ac_top_graph, axis = 'y') + 
  draw_image("https://upload.wikimedia.org/wikipedia/commons/c/ca/Flag_of_Iran.svg", 
             y = 13, scale = 1.5) +
  draw_image("https://upload.wikimedia.org/wikipedia/commons/0/09/Flag_of_South_Korea.svg", 
             y = 10, scale = 1.7) +
  draw_image("https://upload.wikimedia.org/wikipedia/en/9/9e/Flag_of_Japan.svg", 
             y = 7, scale = 1.7) +
  draw_image("https://upload.wikimedia.org/wikipedia/commons/f/f6/Flag_of_Iraq.svg", 
             y = 4, scale = 1.6) +
  draw_image("https://upload.wikimedia.org/wikipedia/commons/a/aa/Flag_of_Kuwait.svg", 
             y = 1, scale = 1.2)

ggdraw(insert_yaxis_grob(ac_top_graph, axis_image, position = "left"))
```

<img src="visualize_asian_cup_2019_files/figure-markdown_github/draw_image-1.png" style="display: block; margin: auto;" />

Ideally I wanted the soccer balls to be the official balls from the tournament that the player scored in. However, I couldn't find a nice emoji-fied/icon-ized version and there was also the "small" problem in that there was no "official" Asian Cup ball until the 2004 tournament in China! You can take a look at the official Asian Cup balls [here](http://football-balls.com/balls/asian-cup).

## Winners of the Asian Cup

We saw that the top goal scorers came from Iran, South Korea, Japan, Iraq, and Kuwait but did their goal scoring exploits lead their nations to glory? Let's find out!

When web-scraping I really like using `flatten_df()` after `html_table()` as I don't have to use the awkward looking `.[[1]]` within my piped workflow.


```r
acup_url <- "https://en.wikipedia.org/wiki/AFC_Asian_Cup"

session <- bow(acup_url)

acup_winners_raw <- scrape(session) %>% 
  html_nodes("table:nth-child(31)") %>% 
  html_table() %>% 
  flatten_df()
```

Now I can use the `clean_names()` function to quickly clean up my names (mainly when I can't be bothered to `set_names()` them myself...).

The next steps are splitting up the number of times a team placed between 1st and 3rd and the year that occurred with `separate()`. 

The variants of `mutate()` are then used to tidy the string columns of the data into numeric type.

I use `gather()` so each team will have a row for each of the rank positions (1st-3rd). 

Finally, I arrange the data in a way that the facets will be ordered in the way that I want.


```r
acup_winners_clean <- acup_winners_raw %>% 
  janitor::clean_names() %>% 
  slice(1:8) %>% 
  select(-fourth_place, -semi_finalists, -total_top_four) %>% 
  separate(winners, into = c("Champions", "first_place_year"), 
           sep = " ", extra = "merge") %>% 
  separate(runners_up, into = c("Runners-up", "second_place_year"), 
           sep = " ", extra = "merge") %>% 
  separate(third_place, into = c("Third Place", "third_place_year"), 
           sep = " ", extra = "merge") %>% 
  mutate_all(~ str_replace_all(., "–", "0")) %>% 
  mutate_at(vars(contains("num")), as.numeric) %>% 
  mutate(team = if_else(team == "Israel1", "Israel", team)) %>% 
  gather(key = "key", value = "value", -team, 
         -first_place_year, -second_place_year, -third_place_year) %>% 
  mutate(key = key %>% 
           fct_relevel(c("Champions", "Runners-up", "Third Place"))) %>% 
  arrange(key, value) %>% 
  mutate(team = as_factor(team),
         order = row_number())
```
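To make the `separate()` step concrete, here is a small sketch on a single made-up cell. The string format ("count (years)") is an assumption about what the scraped column looks like, and `toy_winners` and `toy_split` are hypothetical names. With `sep = " "` and `extra = "merge"`, only the first space is used as a split point and everything after it stays together in the second column.

```r
library(tidyr)

# Hypothetical cell in the style of the scraped table: "count (years)"
toy_winners <- tibble::tibble(team = "Japan",
                              winners = "4 (1992, 2000, 2004, 2011)")

toy_split <- toy_winners %>%
  separate(winners, into = c("Champions", "first_place_year"),
           sep = " ", extra = "merge")
```

Without `extra = "merge"`, the years after the second space would be dropped with a warning.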

I plot using facets on the "key" variable (containing the rank data) so that we can see how many times each team placed as Champions to Third Place. I also use the `glue()` function here to format the multi-line captions and titles in a neat way.


```r
acup_winners_clean %>% 
  ggplot(aes(value, team, color = key)) +
  geom_point(size = 5) +
  scale_color_manual(values = c("Champions" = "#FFCC33",
                                "Runners-up" = "#999999",
                                "Third Place" = "#CC6600"),
                     guide = "none") +
  labs(x = "Number of Occurrences",
       title = "Winners & Losers of the Asian Cup!",
       subtitle = glue("
                       Ordered by number of Asian Cup(s) won.
                       Four-time Champions, Japan, only won their first in 1992!"),
       caption = glue("
                      Note: Israel was expelled by the AFC in 1974 while Australia joined the AFC in 2006.
                      Source: Wikipedia
                      By @R_by_Ryo")) +
  facet_wrap(~key) +
  theme_minimal() +
  theme(text = element_text(family = "Roboto Condensed"),
        title = element_text(size = 18),
        plot.subtitle = element_text(size = 12),
        axis.title.y = element_blank(),
        axis.title.x = element_text(size = 12),
        axis.text.y = element_text(size = 14),
        axis.text.x = element_text(size = 12),
        plot.caption = element_text(hjust = 0, size = 10),
        panel.border = element_rect(fill = NA, colour = "grey20"),
        panel.grid.minor.x = element_blank(),
        strip.text = element_text(size = 16)) 
```

<img src="visualize_asian_cup_2019_files/figure-markdown_github/unnamed-chunk-108-1.png" style="display: block; margin: auto;" />

## Goals per Game

One new thing I learned very recently, while working on this viz in fact, was using magrittr aliases! 

[Tweet by Emil Hvitfeldt](https://twitter.com/Emil_Hvitfeldt/status/1081080919073542144)

When web scraping I usually wind up having to use `.[x]` or `.[[x]]` but now I can just use `extract()` or `extract2()` respectively to do the same thing!


```r
wiki_url <- "https://en.wikipedia.org"
session <- bow(wiki_url)
acup_url <- "https://en.wikipedia.org/wiki/AFC_Asian_Cup"
session_cup <- bow(acup_url)

cup_links <- scrape(session_cup) %>% 
  html_nodes("br+ i a") %>% 
  html_attr("href") %>% 
  magrittr::extract(-17:-18)

acup_df <- cup_links %>% 
  enframe(name = NULL) %>%  # as_data_frame() is deprecated; this also gives a `value` column
  mutate(cup = str_remove(value, "\\/wiki\\/") %>% str_replace_all("_", " ")) %>% 
  rename(link = value)
```
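As a quick illustration of the aliases, here is a minimal sketch on a toy list (`links` and the three result objects are made-up names). The `magrittr::` prefix is used because `tidyr` also exports a different function called `extract()`.

```r
library(magrittr)

links <- list(first = c("a", "b", "c"), second = c("d", "e"))

# same as links["first"]: still a list, of length 1
as_list <- links %>% magrittr::extract("first")

# same as links[["first"]]: the character vector itself
as_vector <- links %>% magrittr::extract2("first")

# a negative index drops elements, here the third one
without_third <- as_vector %>% magrittr::extract(-3)
```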

Another cool thing I found while scraping this data was the `jump_to()` function, which lets you navigate to a new URL within a session. This makes `map()`-ing over multiple links from a base URL very easy! Here, the base URL is Wikipedia itself (the `session` created from `wiki_url`) and each function jumps from there to the page of the respective tournament. Another way that I could've done this was to `map()` over the different years of the tournaments, as the Wikipedia page of each edition of the Asian Cup only differs in the year at the start of the page name.


```r
goals_info <- function(x) {
  goal_info <- session %>% 
    jump_to(x) %>% 
    html_nodes(".vcalendar") %>% 
    html_table(header = FALSE) %>% 
    flatten_df() %>% 
    spread(key = X1, value = X2) %>% 
    select(`Goals scored`) %>% 
    mutate(`Goals scored` = str_remove_all(`Goals scored`, pattern = ".*\\(") %>% 
             str_extract_all("\\d+\\.*\\d*") %>% as.numeric)
}

team_num_info <- function(x) {
  team_num_info <- session %>% 
    jump_to(x) %>% 
    html_nodes(".vcalendar") %>% 
    html_table(header = FALSE) %>% 
    flatten_df() %>% 
    spread(key = X1, value = X2) %>% 
    select(`Teams`) %>% 
    mutate(`Teams` = as.numeric(`Teams`))
}

match_num_info <- function(x) {
  match_num_info <- session %>% 
    jump_to(x) %>% 
    html_nodes(".vcalendar") %>% 
    html_table(header = FALSE) %>% 
    flatten_df() %>% 
    spread(key = X1, value = X2) %>% 
    janitor::clean_names() %>% 
    select(matches_played) %>% 
    mutate(matches_played = as.numeric(matches_played))
}

# all together:
goals_data <- acup_df %>% 
  mutate(goals_per_game = map(acup_df$link, goals_info) %>% unlist,
         team_num = map(acup_df$link, team_num_info) %>% unlist,
         match_num = map(acup_df$link, match_num_info) %>% unlist)
```
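The string handling inside `goals_info()` is easier to follow on a single value. The sketch below runs the same two regular expressions on a made-up infobox string; the exact wording of the real Wikipedia entries is an assumption here, and `goals_raw` is a hypothetical name.

```r
library(stringr)

# Hypothetical infobox value; the real strings come from the Wikipedia pages
goals_raw <- "27 (4.5 per match)"

goals_per_game <- goals_raw %>%
  str_remove_all(pattern = ".*\\(") %>%  # drop everything up to the opening bracket
  str_extract("\\d+\\.*\\d*") %>%        # keep the first number that follows
  as.numeric()
```

Removing the text before the bracket first is what stops the total goal count from being picked up instead of the per-game average.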

Next, clean it up a bit and add in the number of teams that participated in each tournament.


```r
ac_goals_df <- goals_data %>% 
  mutate(label = cup %>% str_extract("[0-9]+") %>% str_replace("..", "'"),
         team_num = case_when(
           is.na(team_num) ~ 16,
           TRUE ~ team_num
         )) %>% 
  arrange(cup) %>% 
  mutate(label = factor(label, label),
         team_num = c(4, 4, 4, 5, 6, 6, 10, 10, 10, 8, 12, 12, 16, 16, 16, 16))

glimpse(ac_goals_df)
```

```
## Observations: 16
## Variables: 6
## $ link           <chr> "/wiki/1956_AFC_Asian_Cup", "/wiki/1960_AFC_Asi...
## $ cup            <chr> "1956 AFC Asian Cup", "1960 AFC Asian Cup", "19...
## $ goals_per_game <dbl> 4.50, 3.17, 2.17, 3.20, 2.92, 2.50, 3.17, 1.83,...
## $ team_num       <dbl> 4, 4, 4, 5, 6, 6, 10, 10, 10, 8, 12, 12, 16, 16...
## $ match_num      <dbl> 6, 6, 6, 10, 13, 10, 24, 24, 24, 16, 26, 26, 32...
## $ label          <fct> '56, '60, '64, '68, '72, '76, '80, '84, '88, '9...
```
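The `label` column is built in two small steps that are worth seeing on their own. This is just a sketch on two hand-typed tournament names; `cups`, `labels`, and `labels_fct` are hypothetical names.

```r
library(stringr)

cups <- c("1956 AFC Asian Cup", "2004 AFC Asian Cup")

labels <- cups %>%
  str_extract("[0-9]+") %>%  # pull out the year: "1956", "2004"
  str_replace("..", "'")     # swap the first two characters for an apostrophe

# factor(x, levels = x) keeps the chronological order for the x-axis;
# the default alphabetical levels would put "'04" before "'56"
labels_fct <- factor(labels, levels = labels)
```

This is why the data is arranged by `cup` before `factor(label, label)` is called: the row order becomes the axis order.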

Now we make a line graph but with LOTS of `annotate()` code to add in comments, labels, and segments for the labels. At the end I use `geom_emoji()` to add a soccer ball to the plot for each of the data points.


```r
plot <- ac_goals_df %>% 
  ggplot(aes(x = label, y = goals_per_game, group = 1)) +
  geom_line() +
  scale_y_continuous(limits = c(NA, 5.35),
                     breaks = c(1.5, 2, 2.5, 3, 3.5, 4, 4.5)) +
  labs(x = "Tournament (Year)", y = "Goals per Game") +
  theme_minimal() +
  theme(text = element_text(family = "Roboto Condensed"),
        axis.title = element_text(size = 12),
        axis.text = element_text(size = 12)) +
  annotate(geom = "label", x = "'56", y = 5.15, family = "Roboto Condensed",
           color = "black", 
           label = "Total Number of Games Played:", hjust = 0) +
  annotate(geom = "text", x = "'60", y = 4.9, 
           label = "6", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 1, xend = 3, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'68", y = 4.9, 
           label = "10", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 3.8, xend = 4.2, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'72", y = 4.9, 
           label = "13", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 4.8, xend = 5.2, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'76", y = 4.9, 
           label = "10", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 5.8, xend = 6.2, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'84", y = 4.9, 
           label = "24", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 7, xend = 9, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = "'92", y = 4.9, 
           label = "16", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 9.8, xend = 10.2, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = 11.5, y = 4.9, 
           label = "26", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 11, xend = 12, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = 14.5, y = 4.9, 
           label = "32", family = "Roboto Condensed") +
  annotate(geom = "segment", x = 13, xend = 16, y = 4.8, yend = 4.8) +
  annotate(geom = "text", x = 9, y = 4, family = "Roboto Condensed",
           label = glue("
                        Incredibly low amount of goals in Group B
                        (15 in 10 Games) and in Knock-Out Stages
                        (4 goals in 4, only 1 scored in normal time)")) +
  annotate(geom = "segment", x = 9, xend = 9, y = 1.65, yend = 3.75,
           color = "red") +
  ggimage::geom_emoji(aes(image = '26bd'), size = 0.03) 

plot
```

<img src="visualize_asian_cup_2019_files/figure-markdown_github/unnamed-chunk-109-1.png" style="display: block; margin: auto;" />

```r
ggsave(filename = paste0(here::here("Asian Cup 2019"), "/gpg_plot.png"), 
       width = 8, height = 7, dpi = 300)
plot <- image_read(paste0(here::here("Asian Cup 2019"), "/gpg_plot.png"))
```

However, I'm not finished yet! I wanted to make this look a bit more "official" so I tried adding the Asian Cup logo to the top right corner. There are probably other ways to do this, especially with grobs, but I was reminded of [this](https://www.danielphadley.com/ggplot-logo/) blog post by [Daniel Hadley](https://twitter.com/danielphadley) who used the `magick` package to add a footer with a logo onto a `ggplot` object. I've used `magick` before for animations and this was a good chance to try it out for image editing. Unlike Daniel Hadley's example, I needed the logo in the right corner of a header, so I created a blank canvas with `image_blank()`, placed the logo on it with `image_composite()`, and stacked the header on top of the plot with `image_append()`.


```r
logo_raw <- image_read("https://upload.wikimedia.org/wikipedia/en/a/ad/2019_afc_asian_cup_logo.png")

logo_proc <- logo_raw %>% image_scale("600")

# create blank canvas
a <- image_blank(width = 1000, height = 100, color = "white")
# combine with logo image and shift logo to the right
b <- image_composite(image_scale(a, "x100"), image_scale(logo_proc, "x75"), 
                     offset = "+880+25")
# add in the title text
logo_header <- b %>% 
  image_annotate(text = glue("Goals per Game throughout the history of the Asian Cup"),
                 color = "black", size = 24, font = "Roboto Condensed",
                 location = "+63+50", gravity = "northwest")

# combine it all together! 
final2_plot <- image_append(image_scale(c(logo_header, plot), "1000"), stack = TRUE)

# image_write(final2_plot,
#             glue("{here::here('Asian Cup 2019')}/gpg_plot_final.png"))

final2_plot
```

<img src="visualize_asian_cup_2019_files/figure-markdown_github/unnamed-chunk-110-1.png" width="1000" style="display: block; margin: auto;" />

All in all it took a while to tweak the positions of the text and logo image but for my first try it worked well. There is definitely room for improvement in regards to sizing and scaling though.

Ultimately, I couldn't find much information on why those tournaments in the 80s in particular were such low-scoring affairs. I wasn't alive to watch those games on TV, nor could I find any illuminating articles or blog posts on the style of Asian football back in the 80s. This was also before Japan really got into soccer, so there wasn't anything I could find in Japanese either.

## Japan's record vs. Group F opponents and rivals

Japan is the most successful team in the competition with 4 championships but who are their opponents in the group stages and how have they fared against them? While I'm at it I will also check their records against long-time continental rivals such as Iran, South Korea, Saudi Arabia and more recently, Australia.

The data I'm going to use comes from [Kaggle](https://www.kaggle.com/martj42/international-football-results-from-1872-to-2017) which has all international football results from 1872 to the World Cup final last year. To add in the federation affiliation (UEFA, AFC, etc.) for each of the countries I slightly modified some code from one of the kernels, ["A Journey Through The History of Soccer"](https://www.kaggle.com/phjulien/a-journey-through-the-history-of-soccer/) by PH Julien.


```r
federation_files <- Sys.glob("../data/federation_affiliations/*")

df_federations = data.frame(country = NULL, federation = NULL)
for (f in federation_files) {
    federation = basename(f)
    content = read.csv(f, header=FALSE)
    content <- cbind(content,federation=rep(federation, dim(content)[1]))
    df_federations <- rbind(df_federations, content)
}

colnames(df_federations) <- c("country", "federation")

df_federations <- df_federations %>% 
  mutate(country = as.character(country) %>% str_trim(side = "both"))
```
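As a side note, the same loop can be written without growing a data frame row by row. The sketch below is only an illustration: it assumes each federation file holds one country name per line (which is what the `read.csv(header = FALSE)` call above suggests), and it writes two made-up stand-in files to a temporary folder instead of touching the real `../data/federation_affiliations/` directory. The names `tmp_dir`, `demo_files`, and `df_federations_demo` are hypothetical.

```r
library(purrr)
library(dplyr)

# made-up stand-in files: one country per line, file name = federation
tmp_dir <- tempfile("federations_")
dir.create(tmp_dir)
writeLines(c("Japan", "Iran "), file.path(tmp_dir, "AFC"))
writeLines("Germany", file.path(tmp_dir, "UEFA"))

demo_files <- list.files(tmp_dir, full.names = TRUE)

df_federations_demo <- demo_files %>% 
  # name each path by its federation so bind_rows() can use it as an id
  set_names(basename(demo_files)) %>% 
  map(~ read.csv(.x, header = FALSE, col.names = "country",
                 stringsAsFactors = FALSE)) %>% 
  bind_rows(.id = "federation") %>% 
  # strip stray whitespace around country names
  mutate(country = trimws(country)) %>% 
  select(country, federation)

df_federations_demo
```

Naming the list elements before `bind_rows()` is what carries the file name over into the `federation` column.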

Now to load the results data and then join it with the affiliations data.


```r
results_raw <- read_csv("../data/results.csv")

results_japan_raw <- results_raw %>% 
  filter(home_team == "Japan" | away_team == "Japan") %>% 
  rename(venue_country = country, 
         venue_city = city) %>% 
  mutate(match_num = row_number())

# combine with federation affiliations
results_japan_home <- results_japan_raw %>% 
  left_join(df_federations, 
            by = c("home_team" = "country")) %>% 
  mutate(federation = as.character(federation)) %>% 
  rename(home_federation = federation) 

results_japan_away <- results_japan_raw %>% 
  left_join(df_federations, 
            by = c("away_team" = "country")) %>% 
  mutate(federation = as.character(federation)) %>% 
  rename(away_federation = federation)

# combine home-away
results_japan_cleaned <- results_japan_home %>% 
  full_join(results_japan_away)
```

Next I need to fill in the federation for teams whose names didn't find a match in the federation affiliation data set. For example, South Korea appears as "Korea Republic" in the Kaggle results.


```r
results_japan_cleaned <- results_japan_cleaned %>% 
  mutate(
    home_federation = case_when(
      home_team %in% c(
        "China", "Manchukuo", "Burma", "Korea Republic", "Vietnam Republic",
        "Korea DPR", "Brunei") ~ "AFC",
      home_team == "USA" ~ "Concacaf",
      home_team == "Bosnia-Herzegovina" ~ "UEFA",
      TRUE ~ home_federation),
    away_federation = case_when(
      away_team %in% c(
        "China", "Manchukuo", "Burma", "Korea Republic", "Vietnam Republic",
        "Korea DPR", "Brunei", "Taiwan") ~ "AFC",
      away_team == "USA" ~ "Concacaf",
      away_team == "Bosnia-Herzegovina" ~ "UEFA",
      TRUE ~ away_federation
    ))
```
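One way to find which team names need this kind of manual fix is an `anti_join()` between the team names in the results and the affiliation table. This is a minimal sketch on made-up toy tables (`toy_matches` and `toy_federations` are hypothetical stand-ins, not the real data), so the names it returns are purely illustrative.

```r
library(dplyr)

# toy stand-ins for the results and affiliation tables
toy_matches <- tibble(
  home_team = c("Japan", "Japan", "Korea Republic"),
  away_team = c("Korea Republic", "Iran", "Japan")
)
toy_federations <- tibble(
  country    = c("Japan", "Iran", "South Korea"),
  federation = "AFC"
)

# every team that appears in a match but has no row in the affiliation table
unmatched_teams <- tibble(
  country = unique(c(toy_matches$home_team, toy_matches$away_team))
) %>% 
  anti_join(toy_federations, by = "country")

unmatched_teams
```

Any team listed in `unmatched_teams` would end up with an `NA` federation after the `left_join()` and is a candidate for the `case_when()` patch above.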

Now that it's nice and cleaned up I can reshape it so that the data is set from Japan's perspective.


```r
results_jp_asia <- results_japan_cleaned %>% 
  # filter only for Japan games and AFC opponents
  filter(home_team == "Japan" | away_team == "Japan",
         home_federation == "AFC" & away_federation == "AFC") %>% 
  select(-contains("federation"), -contains("venue"),
         -neutral, -match_num,
         date, home_team, home_score, away_team, away_score, tournament) %>% 
  # reshape columns to Japan vs. opponent
  mutate(
    opponent = case_when(
      away_team != "Japan" ~ away_team,
      home_team != "Japan" ~ home_team),
    home_away = case_when(
      home_team == "Japan" ~ "home",
      away_team == "Japan" ~ "away"),
    japan_goals = case_when(
      home_team == "Japan" ~ home_score,
      away_team == "Japan" ~ away_score),
    opp_goals = case_when(
      home_team != "Japan" ~ home_score,
      away_team != "Japan" ~ away_score)) %>% 
  # label results from Japan's perspective
  mutate(
    result = case_when(
      japan_goals > opp_goals ~ "Win",
      japan_goals < opp_goals ~ "Loss",
      japan_goals == opp_goals ~ "Draw"),
    result = result %>% as_factor() %>% fct_relevel(c("Win", "Draw", "Loss"))) %>% 
  select(-contains("score"), -contains("team"))
```

With all that done we can take a look at how Japan have done against certain opponents by using `filter()`.


```r
results_jp_asia %>% 
  filter(opponent == "Jordan",
         tournament == "AFC Asian Cup")
```

```
## # A tibble: 3 x 7
##   date       tournament    opponent home_away japan_goals opp_goals result
##   <date>     <chr>         <chr>    <chr>           <int>     <int> <fct> 
## 1 2004-07-31 AFC Asian Cup Jordan   home                1         1 Draw  
## 2 2011-01-09 AFC Asian Cup Jordan   home                1         1 Draw  
## 3 2015-01-20 AFC Asian Cup Jordan   home                2         0 Win
```

Unfortunately, this data set doesn't record extra time or penalty shootouts: Japan's quarter-final against Jordan in 2004 shows up as a 1-1 draw, even though Japan secured their route to the semis 4-3 on penalties!

I can create a function that filters for certain opponents and tournaments and aggregates the results. Because the second argument is `...`, tidy evaluation lets me pass in any kind of filter condition (opponent, tournament, etc.). The `if`/`else` statements guard against cases where Japan never had a certain type of result against the selected opponents, and make sure that a column populated by 0s is created instead.


```r
japan_versus <- function(data, ...) {
  # filter 
  filter_vars <- enquos(...)
  
  jp_vs <- data %>% 
    filter(!!!filter_vars) %>% 
    # count results and sum goals by Japan and opponent, per result type
    group_by(result, opponent) %>% 
    summarize(j_g = sum(japan_goals),
              o_g = sum(opp_goals),
              n = n()) %>% 
    ungroup() %>% 
    # spread results over multiple columns
    spread(result, n) %>% 
    # 1. failsafe: if a result type never occurred, create its column as 0
    # 2. sum up counts per opponent
    group_by(opponent) %>% 
    mutate(Win = if ("Win" %in% names(.)) Win else 0,
           Draw = if ("Draw" %in% names(.)) Draw else 0,
           Loss = if ("Loss" %in% names(.)) Loss else 0) %>% 
    summarize(Win = sum(Win, na.rm = TRUE),
              Draw = sum(Draw, na.rm = TRUE),
              Loss = sum(Loss, na.rm = TRUE),
              `Goals For` = sum(j_g),
              `Goals Against` = sum(o_g))
  
  return(jp_vs)
}
```

Now let's try it out a bit.


```r
japan_versus(data = results_jp_asia, 
             opponent == "China")
```

```
## # A tibble: 1 x 6
##   opponent   Win  Draw  Loss `Goals For` `Goals Against`
##   <chr>    <int> <int> <int>       <int>           <int>
## 1 China       14     8    10          54              45
```

I can put in multiple filter conditions if needed as well.


```r
japan_versus(data = results_jp_asia,
             home_away == "home",
             opponent %in% c("Palestine", "Vietnam", "India"))
```

```
## # A tibble: 3 x 6
##   opponent    Win  Draw  Loss `Goals For` `Goals Against`
##   <chr>     <int> <dbl> <dbl>       <int>           <int>
## 1 India         2     0     0          13               0
## 2 Palestine     1     0     0           4               0
## 3 Vietnam       1     0     0           1               0
```

As you can see, Japan has never lost or drawn in these matches against India, Palestine, or Vietnam, so the filtered data has no rows with "Draw" or "Loss" in the `result` column. With the function I created I was able to fill in those missing result types with 0s!

Let's check Japan's performance against our main rivals in the Asian Cup. Here I make the tables look a lot nicer with `knitr::kable()` and the options in the `kableExtra` package.
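For comparison, the zero-filling can also be done with `tidyr::complete()`, which expands a factor to all of its levels, including ones that never occur. This is an alternative sketch and not the approach used in `japan_versus()`; the toy data below is made up and only mimics the shape of `results_jp_asia`.

```r
library(dplyr)
library(tidyr)

# made-up rows in the same shape as `results_jp_asia`
toy_results <- tibble(
  opponent = c("India", "India", "Vietnam"),
  result   = factor(c("Win", "Win", "Win"),
                    levels = c("Win", "Draw", "Loss"))
)

toy_record <- toy_results %>% 
  count(opponent, result) %>% 
  # add the opponent/result combinations that never happened, with n = 0
  complete(opponent, result, fill = list(n = 0L)) %>% 
  pivot_wider(names_from = result, values_from = n)

toy_record
```

Because `result` is a factor with all three levels, `complete()` creates the "Draw" and "Loss" rows even though neither appears in the toy data, so no `if`/`else` check is needed.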


```r
results_jp_asia %>% 
  japan_versus(opponent %in% c("Iran", "Korea Republic", "Saudi Arabia"),
               tournament == "AFC Asian Cup") %>% 
  knitr::kable(format = "html",
               caption = "Japan vs. Historic Rivals in the Asian Cup") %>% 
  kableExtra::kable_styling(full_width = FALSE) %>% 
  kableExtra::add_header_above(c(" ", "Result" = 3, "Goals" = 2))
```

<table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;">
<caption>Japan vs. Historic Rivals in the Asian Cup</caption>
 <thead>
<tr>
<th style="border-bottom:hidden" colspan="1"></th>
<th style="border-bottom:hidden; padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="3"><div style="border-bottom: 1px solid #ddd; padding-bottom: 5px;">Result</div></th>
<th style="border-bottom:hidden; padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="2"><div style="border-bottom: 1px solid #ddd; padding-bottom: 5px;">Goals</div></th>
</tr>
  <tr>
   <th style="text-align:left;"> opponent </th>
   <th style="text-align:right;"> Win </th>
   <th style="text-align:right;"> Draw </th>
   <th style="text-align:right;"> Loss </th>
   <th style="text-align:right;"> Goals For </th>
   <th style="text-align:right;"> Goals Against </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> Iran </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 0 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Korea Republic </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 2 </td>
   <td style="text-align:right;"> 4 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Saudi Arabia </td>
   <td style="text-align:right;"> 4 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 13 </td>
   <td style="text-align:right;"> 4 </td>
  </tr>
</tbody>
</table>

Now let's take a look at how Japan have historically played against the other teams in Group F of this year's Asian Cup.


```r
results_jp_asia %>% 
  japan_versus(opponent %in% c("Oman", "Uzbekistan", "Turkmenistan")) %>% 
  knitr::kable(format = "html",
               caption = "Japan's Record vs. Group F Teams") %>% 
  kableExtra::kable_styling(full_width = FALSE) %>% 
  kableExtra::add_header_above(c(" ", "Result" = 3, "Goals" = 2))
```

<table class="table" style="width: auto !important; margin-left: auto; margin-right: auto;">
<caption>Japan's Record vs. Group F Teams</caption>
 <thead>
<tr>
<th style="border-bottom:hidden" colspan="1"></th>
<th style="border-bottom:hidden; padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="3"><div style="border-bottom: 1px solid #ddd; padding-bottom: 5px;">Result</div></th>
<th style="border-bottom:hidden; padding-bottom:0; padding-left:3px;padding-right:3px;text-align: center; " colspan="2"><div style="border-bottom: 1px solid #ddd; padding-bottom: 5px;">Goals</div></th>
</tr>
  <tr>
   <th style="text-align:left;"> opponent </th>
   <th style="text-align:right;"> Win </th>
   <th style="text-align:right;"> Draw </th>
   <th style="text-align:right;"> Loss </th>
   <th style="text-align:right;"> Goals For </th>
   <th style="text-align:right;"> Goals Against </th>
  </tr>
 </thead>
<tbody>
  <tr>
   <td style="text-align:left;"> Oman </td>
   <td style="text-align:right;"> 8 </td>
   <td style="text-align:right;"> 3 </td>
   <td style="text-align:right;"> 0 </td>
   <td style="text-align:right;"> 19 </td>
   <td style="text-align:right;"> 4 </td>
  </tr>
  <tr>
   <td style="text-align:left;"> Uzbekistan </td>
   <td style="text-align:right;"> 6 </td>
   <td style="text-align:right;"> 3 </td>
   <td style="text-align:right;"> 1 </td>
   <td style="text-align:right;"> 28 </td>
   <td style="text-align:right;"> 9 </td>
  </tr>
</tbody>
</table>

We see no row here for Turkmenistan. This is because, until just this past week, Japan had **never** played against them in a friendly or competitive game!

# Conclusion

Although Japan's first game was quite horrible, I'm hoping it'll shake the players and coaches out of their complacency so that they don't underestimate our opponents in the next two games.

Thankfully, of our main rivals, South Korea should be on the other side of the bracket, and we would only meet Iran in the semifinals (provided both teams finish top of their respective groups).

Japan could meet Australia in the quarter-finals, but without Aaron Mooy they're a much weaker side, as shown in their abject loss to Jordan in their opening match.

Even with the loss of new star Nakajima, the fact that players of the calibre of Takashi Inui and Hannover regular Genki Haraguchi can step up from the bench to replace him shows how much Japanese football has progressed these past 25 years.

It's a changing of the guard for Japan, but we've got quality players in Europe as well as some depth too, with more Japanese players heading to Europe from a young age.

It was quite awe-inspiring to see how the number of Japanese players playing for foreign clubs has steadily increased since the 1988 Asian Cup squad. Maybe that could be another idea for a visualization?

This tournament should be a first stepping stone for this new generation of players to make a big impact at the next World Cup in 2022, so keep an eye out for this bunch of players!


================================================
FILE: Bundesliga 2018-2019/player_goal_contribution_matrix.Rmd
================================================
---
title: "Bundesliga"
author: "RN7"
date: "5/24/2019"
output: html_document
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```

# pkgs

```{r, message=FALSE, warning=FALSE}
pacman::p_load(tidyverse, polite, scales, ggimage, ggforce,
               rvest, glue, extrafont, ggrepel, magick)
loadfonts()
```

## add_logo

```{r}
add_logo <- function(plot_path, logo_path, logo_position, logo_scale = 10){

    # Requires magick R Package https://github.com/ropensci/magick

    # Useful error message for logo position
    if (!logo_position %in% c("top right", "top left", "bottom right", "bottom left")) {
        stop("Error Message: Uh oh! Logo Position not recognized\n  Try: logo_position = 'top left', 'top right', 'bottom left', or 'bottom right'")
    }

    # read in raw images
    plot <- magick::image_read(plot_path)
    logo_raw <- magick::image_read(logo_path)

    # get dimensions of plot for scaling
    plot_height <- magick::image_info(plot)$height
    plot_width <- magick::image_info(plot)$width

    # default scale to 1/10th width of plot
    # Can change with logo_scale
    logo <- magick::image_scale(logo_raw, as.character(plot_width/logo_scale))

    # Get width and height of logo
    logo_width <- magick::image_info(logo)$width
    logo_height <- magick::image_info(logo)$height

    # Set position of logo
    # Position starts at 0,0 at top left
    # Using 0.01 for 1% - aesthetic padding

    if (logo_position == "top right") {
        x_pos = plot_width - logo_width - 0.01 * plot_width
        y_pos = 0.01 * plot_height
    } else if (logo_position == "top left") {
        x_pos = 0.01 * plot_width
        y_pos = 0.01 * plot_height
    } else if (logo_position == "bottom right") {
        x_pos = plot_width - logo_width - 0.01 * plot_width
        y_pos = plot_height - logo_height - 0.01 * plot_height
    } else if (logo_position == "bottom left") {
        x_pos = 0.01 * plot_width
        y_pos = plot_height - logo_height - 0.01 * plot_height
    }

    # Compose the actual overlay
    magick::image_composite(plot, logo, offset = paste0("+", x_pos, "+", y_pos))
}
```
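To make the offset arithmetic in `add_logo()` concrete, here is the "bottom right" branch worked through by hand. The pixel dimensions are hypothetical and not taken from any real plot or logo; only base R is used.

```r
# hypothetical dimensions in pixels
plot_width  <- 1000
plot_height <- 600
logo_width  <- 100
logo_height <- 50

# "bottom right": keep a 1% padding from the right and bottom edges
x_pos <- plot_width - logo_width - 0.01 * plot_width
y_pos <- plot_height - logo_height - 0.01 * plot_height

# offset string in the "+x+y" form that add_logo() passes to image_composite()
offset <- paste0("+", x_pos, "+", y_pos)
offset
```

The offset is measured from the top-left corner of the plot, which is why the logo's own width and height have to be subtracted for the right and bottom positions.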


# Bundesliga

## webscrape soccerway

```{r}
url <- "https://us.soccerway.com/national/germany/bundesliga/20182019/regular-season/r47657/"

session <- bow(url)

team_links <- scrape(session) %>% 
  html_nodes("#page_competition_1_block_competition_tables_7_block_competition_league_table_1_table .large-link a") %>% 
  html_attr("href")

team_links_df <- team_links %>% 
  enframe(name = NULL) %>% 
  separate(value, c(NA, NA, NA, "team_name", "team_num"), sep = "/") %>% 
  mutate(link = glue("
                     https://us.soccerway.com/teams/germany/{team_name}/{team_num}/squad/"),
         stat_link = glue("{link %>% str_replace('squad', 'statistics')}"))

# for each team link:

player_name_info <- function(session) {
  
  player_name_info <- scrape(session) %>% 
    html_nodes("#page_team_1_block_team_squad_3-table .name.large-link") %>% 
    html_text()
}

num_goals_info <- function(session) {

  num_goals_info <- scrape(session) %>% 
    html_nodes(".goals") %>% 
    html_text()
  
  num_goals_info_clean <- num_goals_info[-1]
}

num_assists_info <- function(session) {

  num_assists_info <- scrape(session) %>% 
    html_nodes(".assists") %>% 
    html_text()
  
  num_assists_info_clean <- num_assists_info[-1]
}

team_goals_info <- function(session) {
  team_goals_info <- scrape(session) %>% 
    html_nodes("tr.first:nth-child(6) > td:nth-child(2)") %>% 
    html_text()
}

# BIG FUNCTION
bundesliga_stats_info <- function(link, statlink) {
  
  session <- bow(link)
  session2 <- bow(statlink)
  
  player_name <- player_name_info(session = session)

  num_goals <- num_goals_info(session = session)

  num_assists <- num_assists_info(session = session)
  
  team_goals <- team_goals_info(session = session2)
  
  resultados <- list(player_name, num_goals, num_assists, team_goals)
  col_names <- c("name", "goals", "assists", "team_goals") 
  
  bundesliga_stats <- resultados %>% 
    reduce(cbind) %>% 
    as_tibble() %>% 
    set_names(col_names) 
  
}
```

### all at once

```{r}
# ALL 18 TEAMS AT ONCE, WILL TAKE A WHILE:
bundesliga_goal_contribution_df_ALL <- map2(.x = team_links_df$link,
                .y = team_links_df$stat_link,
                ~ bundesliga_stats_info(link = .x, statlink = .y))

bundesliga_goal_contribution_df <- bundesliga_goal_contribution_df_ALL %>% 
  set_names(team_links_df$team_name) %>% 
  bind_rows(.id = "team_name")

## save
saveRDS(bundesliga_goal_contribution_df, file = glue("{here::here()}/data/bundesliga_goal_contrib_df_soccerway.RDS"))
```

## clean

```{r}
bundesliga_goal_contribution_clean_df <- bundesliga_goal_contribution_df %>% 
  mutate_at(.vars = c("goals", "assists"), 
            ~str_replace(., "-", "0") %>% as.numeric) %>% 
  mutate(team = team_name %>% str_replace_all(., "-", " ") %>% str_to_title,
         total_goals = as.numeric(team_goals)) %>% 
  group_by(team) %>% 
  mutate(total_assists = sum(assists),
         goal_contrib = goals/total_goals,
         assist_contrib = assists/total_goals) %>% 
  ungroup() %>% 
  select(-team_name, -team_goals)

## save
saveRDS(bundesliga_goal_contribution_clean_df, 
        file = glue("{here::here()}/data/bundesliga_goal_contrib_clean_df.RDS"))
bundesliga_goal_contribution_clean_df <- readRDS(file = glue("{here::here()}/data/bundesliga_goal_contrib_clean_df.RDS"))
```
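A small worked example of the cleaning step, on made-up numbers, may help show what the contribution columns mean. `toy_scraped` is a hypothetical stand-in for the scraped data, and `across()` (dplyr 1.0.0 or later) is used here in place of `mutate_at()`; this is a sketch, not a replacement for the chunk above.

```r
library(dplyr)
library(stringr)

# made-up scraped values: soccerway-style "-" means zero
toy_scraped <- tibble(
  name       = c("Player A", "Player B"),
  goals      = c("10", "-"),
  assists    = c("-", "5"),
  team_goals = c("40", "40")
)

toy_clean <- toy_scraped %>% 
  # turn "-" into "0", then convert the text columns to numbers
  mutate(across(c(goals, assists),
                ~ as.numeric(str_replace(.x, "-", "0"))),
         total_goals = as.numeric(team_goals),
         # share of the team's goals that the player scored or assisted
         goal_contrib = goals / total_goals,
         assist_contrib = assists / total_goals)

toy_clean
```

With these toy numbers, Player A scored a quarter of the team's 40 goals, while Player B assisted one eighth of them.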

## plot

`
Condensed preview — 248 files, each showing path, character count, and a content snippet. Download the .json file for the full structured content (10,758K chars).
[
  {
    "path": ".gitignore",
    "chars": 45,
    "preview": ".Rproj.user\n.Rhistory\n.RData\n.Ruserdata\nnotes"
  },
  {
    "path": "Africa Cup of Nations 2019/afcon.Rmd",
    "chars": 4190,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"6/22/2019\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts"
  },
  {
    "path": "Asian Cup 2019/asian_cup_2019.rmd",
    "chars": 35798,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"December 26, 2018\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nkni"
  },
  {
    "path": "Asian Cup 2019/japan_qatar.Rmd",
    "chars": 6675,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"February 1, 2019\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknit"
  },
  {
    "path": "Asian Cup 2019/jpn_aus_waffle.Rmd",
    "chars": 2474,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"January 12, 2019\"\noutput: \n  md_document:\n    variant: markdown_github\n---\n\n`"
  },
  {
    "path": "Asian Cup 2019/jpn_aus_waffle.md",
    "chars": 15712,
    "preview": "A new rival, Australia, emerged to challenge Japan in Asia as they\njoined the AFC in 2006. From the come-from-behind def"
  },
  {
    "path": "Asian Cup 2019/jpn_saudi.Rmd",
    "chars": 5784,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"January 21, 2019\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknit"
  },
  {
    "path": "Asian Cup 2019/visualize_asian_cup_2019.knit.md",
    "chars": 31574,
    "preview": "Another year, another big soccer/football tournament! This time it’s the\ntop international competition in Asia, the Asia"
  },
  {
    "path": "Asian Cup 2019/visualize_asian_cup_2019.md",
    "chars": 31576,
    "preview": "Another year, another big soccer/football tournament! This time it’s the\ntop international competition in Asia, the Asia"
  },
  {
    "path": "Asian Cup 2019/visualize_asian_cup_2019.rmd",
    "chars": 27052,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\nalways_allow_html: yes\noutput: \n  md_document:\n    variant: markdown_github\n---\n\n```"
  },
  {
    "path": "Asian Cup 2019/visualize_asian_cup_2019.utf8.md",
    "chars": 31940,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\nalways_allow_html: yes\noutput: \n  md_document:\n    variant: markdown_github\n---\n\n\n\nA"
  },
  {
    "path": "Bundesliga 2018-2019/player_goal_contribution_matrix.Rmd",
    "chars": 9437,
    "preview": "---\ntitle: \"Bundesliga\"\nauthor: \"RN7\"\ndate: \"5/24/2019\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::op"
  },
  {
    "path": "Bundesliga 2019-2020/buli_age_utility.Rmd",
    "chars": 8614,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"5/15/2020\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts"
  },
  {
    "path": "Bundesliga 2019-2020/buli_dribbling_1920_hinrunde.Rmd",
    "chars": 14144,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"1/16/2020\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts"
  },
  {
    "path": "Bundesliga 2019-2020/buli_goalkeepers_1920_hinrunde.Rmd",
    "chars": 14531,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"1/18/2020\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts"
  },
  {
    "path": "Bundesliga 2019-2020/buli_progressive_passing_1920_hinrunde.Rmd",
    "chars": 13494,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"1/14/2020\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts"
  },
  {
    "path": "Bundesliga 2019-2020/buli_shot_quality_1920_hinrunde.Rmd",
    "chars": 11846,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"1/15/2020\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts"
  },
  {
    "path": "Bundesliga 2019-2020/goal_contrib_graph_1920_hinrunde.Rmd",
    "chars": 21458,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"1/8/2020\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts_"
  },
  {
    "path": "Champions League & Europa League 2019-2020/europa_league_eloRatings.Rmd",
    "chars": 13043,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"9/1/2019\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts_"
  },
  {
    "path": "Copa America 2019/1-copa_america2019.md",
    "chars": 67439,
    "preview": "Another summer and another edition of the Copa América! Along with the\nAfrica Cup of Nations, Nations League finals, the"
  },
  {
    "path": "Copa America 2019/COPY-2019-06-18-visualize-copa-america.md",
    "chars": 67829,
    "preview": "---\nlayout: post\ntitle: \"Visualizing the Copa América: Historical Records, Squad Profiles, and Player Profiles with xG s"
  },
  {
    "path": "Copa America 2019/copa_america2019.md",
    "chars": 76178,
    "preview": "Another summer and another edition of the Copa América! Along with the\nAfrica Cup of Nations, Nations League finals, the"
  },
  {
    "path": "Copa America 2019/copa_america2019.rmd",
    "chars": 67086,
    "preview": "---\ntitle: \"Visualizing the Copa América: Historical Records, Squad Profiles, and Player Profiles with xG statistics!\"\na"
  },
  {
    "path": "Copa America 2019/copa_extras.Rmd",
    "chars": 11434,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"6/15/2019\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts"
  },
  {
    "path": "Eredivisie 2018-2019/player_goal_contribution_matrix.rmd",
    "chars": 9872,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"5/25/2019\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts"
  },
  {
    "path": "Europe 2021-2022/fbref_sca_waffle_blogpost.Rmd",
    "chars": 16148,
    "preview": "---\ntitle: \"Solution to the 'preserving the sum after rounding' problem in a soccer waffle viz!\"\nalways_allow_html: yes\n"
  },
  {
    "path": "Europe 2021-2022/fbref_sca_waffle_blogpost.md",
    "chars": 22666,
    "preview": "---\nlayout: post\ntitle: \"J.League Soccer 2021 Season Review!\"\ntags: [japan, jleague, soccer, football, ggplot2, tidyvers"
  },
  {
    "path": "Europe 2021-2022/fbref_sca_waffle_raw.Rmd",
    "chars": 8489,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"1/13/2022\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts"
  },
  {
    "path": "J-League 2018/j_league.rmd",
    "chars": 1165,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"December 25, 2018\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nkni"
  },
  {
    "path": "J-League 2018/j_league_avg_age_value.rmd",
    "chars": 8806,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"August 25, 2018\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr"
  },
  {
    "path": "J-League 2018/jleague_age_utility.Rmd",
    "chars": 23053,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"March 9, 2019\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::"
  },
  {
    "path": "J-League 2018/player_goal_contribution_matrix.Rmd",
    "chars": 16307,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"March 19, 2019\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr:"
  },
  {
    "path": "J-League 2018/player_turnover.Rmd",
    "chars": 578,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"February 24, 2019\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nkni"
  },
  {
    "path": "J-League 2019/goal_minutes.Rmd",
    "chars": 1659,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"10/19/2019\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opt"
  },
  {
    "path": "J-League 2019/jleague_age_utility_2019.Rmd",
    "chars": 5879,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"2/8/2020\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts_"
  },
  {
    "path": "J-League 2019/jleague_summary_2019_season.Rmd",
    "chars": 7748,
    "preview": "---\ntitle: \"Untitled\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts_chunk$set(echo = TRUE)\n```\n\n# P"
  },
  {
    "path": "J-League 2020/jleague_2020_review_code.Rmd",
    "chars": 25158,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"1/14/2021\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts"
  },
  {
    "path": "J-League 2020/jleague_age_utility_2020.Rmd",
    "chars": 49842,
    "preview": "---\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts_chunk$set(echo = TRUE)\n```\n\n# Packages\n```{r, mes"
  },
  {
    "path": "Japan National Team/japan_kirin_cup.Rmd",
    "chars": 4193,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"6/5/2019\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts_"
  },
  {
    "path": "Japan National Team/japan_korea_rivalry.rmd",
    "chars": 14236,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"September 18, 2018\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nkn"
  },
  {
    "path": "Japan National Team/japan_worldcup.rmd",
    "chars": 4524,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"November 13, 2018\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nkni"
  },
  {
    "path": "LICENSE",
    "chars": 34523,
    "preview": "                    GNU AFFERO GENERAL PUBLIC LICENSE\n                       Version 3, 19 November 2007\n\n Copyright (C)"
  },
  {
    "path": "La Liga 2018-2019/player_goal_contribution_matrix.Rmd",
    "chars": 10156,
    "preview": "---\ntitle: \"La Liga\"\nauthor: \"RN7\"\ndate: \"5/24/2019\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts_"
  },
  {
    "path": "La Liga 2019-2020/age_utility_LaLiga.Rmd",
    "chars": 10363,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"2/29/2020\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts"
  },
  {
    "path": "La Liga 2019-2020/laliga_goalkeepers_1920_3420.Rmd",
    "chars": 18858,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"1/19/2020\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts"
  },
  {
    "path": "Ligue 1 2018-2019/player_goal_contribution_matrix.rmd",
    "chars": 9779,
    "preview": "---\ntitle: \"Ligue 1\"\nauthor: \"RN7\"\ndate: \"5/25/2019\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts_"
  },
  {
    "path": "Lionel Messi/check_coords.Rmd",
    "chars": 26885,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"8/3/2019\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts_"
  },
  {
    "path": "Lionel Messi/check_y_coordinates.Rmd",
    "chars": 20214,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"8/4/2019\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts_"
  },
  {
    "path": "Lionel Messi/code_pass_upsetplot.Rmd",
    "chars": 33723,
    "preview": "---\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts_chunk$set(echo = TRUE, fig.height=6, fig.width=8)"
  },
  {
    "path": "Lionel Messi/explore_messi.Rmd",
    "chars": 44893,
    "preview": "---\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts_chunk$set(echo = TRUE)\n```\n\n# Packages\n\n```{r, me"
  },
  {
    "path": "Lionel Messi/messi_pass_upsetplots.Rmd",
    "chars": 52825,
    "preview": "---\ntitle: \"Untitled\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts_chunk$set(echo = TRUE, fig.heig"
  },
  {
    "path": "Lionel Messi/statsbomb_tutorialBlogOne.Rmd",
    "chars": 53806,
    "preview": "---\ntitle: \"Visualizing Soccer with StatsBomb Data and R, Part 1: Simple xG and Pass Partner Plots!\"\nalways_allow_html: "
  },
  {
    "path": "Lionel Messi/statsbomb_tutorialBlogOne.md",
    "chars": 55851,
    "preview": "This will be **Part 1** of what I hope to be a multi-part series of\nplotting soccer event-level data with R! This is mor"
  },
  {
    "path": "Premier League 2018-2019/LFC_ELO_Ratings.rmd",
    "chars": 17189,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"December 20, 2018\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nkni"
  },
  {
    "path": "Premier League 2018-2019/LFC_goals_timeframe.rmd",
    "chars": 3902,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"November 24, 2018\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nkni"
  },
  {
    "path": "Premier League 2018-2019/Premier_League_Center_of_Gravity.rmd",
    "chars": 6056,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"December 8, 2018\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknit"
  },
  {
    "path": "Premier League 2018-2019/appearances_season_players.Rmd",
    "chars": 25708,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"10/10/2019\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opt"
  },
  {
    "path": "Premier League 2018-2019/appearances_season_split_manager.Rmd",
    "chars": 46895,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"6/23/2019\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts"
  },
  {
    "path": "Premier League 2018-2019/epl_wages.rmd",
    "chars": 1665,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"November 2, 2018\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknit"
  },
  {
    "path": "Premier League 2018-2019/liverpool_age_utility.rmd",
    "chars": 7816,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"August 2, 2018\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr:"
  },
  {
    "path": "Premier League 2018-2019/liverpoolfc_goals.rmd",
    "chars": 3400,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"August 12, 2018\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr"
  },
  {
    "path": "Premier League 2018-2019/north_west_derby.rmd",
    "chars": 10484,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"December 16, 2018\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nkni"
  },
  {
    "path": "Premier League 2018-2019/player_goal_contribution_matrix.Rmd",
    "chars": 15072,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"5/17/2019\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts"
  },
  {
    "path": "Premier League 2019-2020/2019-11-21-visualize-EPL-part-1.Rmd",
    "chars": 12651,
    "preview": "---\ntitle: \"Visualizing the Premier League So Far, Part 1: Overview with xPts Tables and xG Plots\"\nalways_allow_html: ye"
  },
  {
    "path": "Premier League 2019-2020/2019-11-28-visualize-EPL-part-2.Rmd",
    "chars": 14468,
    "preview": "---\ntitle: \"Visualizing the Premier League So Far, Part 2: Stats from Open Play and Set Pieces\"\nalways_allow_html: yes\no"
  },
  {
    "path": "Premier League 2019-2020/2019-11-28-visualize-EPL-part-3.Rmd",
    "chars": 661,
    "preview": "---\ntitle: \"Visualizing the Premier League So Far, Part 2: Stats by 15-Minute Intervals\"\nalways_allow_html: yes\noutput: "
  },
  {
    "path": "Premier League 2019-2020/epl_goalkeepers_1920_11920.Rmd",
    "chars": 18028,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"1/19/2020\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts"
  },
  {
    "path": "Premier League 2019-2020/goal_contrib_graph_1920_MD21.Rmd",
    "chars": 18107,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"1/1/2020\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts_"
  },
  {
    "path": "Premier League 2019-2020/premierleague_top_goalscorers.Rmd",
    "chars": 21196,
    "preview": "---\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts_chunk$set(echo = TRUE)\n```\n\ntimeline bump chart r"
  },
  {
    "path": "README.md",
    "chars": 8512,
    "preview": "## Animate goals, Visualize stats, Video analyses, etc. from the World Cup, Premier League, Copa America, and beyond. (U"
  },
  {
    "path": "Serie A 2018-2019/player_goal_contribution_matrix.Rmd",
    "chars": 9945,
    "preview": "---\ntitle: \"Serie A\"\nauthor: \"RN7\"\ndate: \"5/24/2019\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts_"
  },
  {
    "path": "Serie A 2019-2020/age_utility_serieA.Rmd",
    "chars": 8357,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"3/1/2020\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts_"
  },
  {
    "path": "Serie A 2019-2020/serieA_goalkeepers_1920_1-23-20.Rmd",
    "chars": 15099,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"1/23/2020\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts"
  },
  {
    "path": "Women's World Cup 2019/tidytuesday.Rmd",
    "chars": 15689,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"7/9/2019\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts_"
  },
  {
    "path": "Women's World Cup 2019/tidytuesday_statsbomb.Rmd",
    "chars": 6014,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"7/10/2019\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::opts"
  },
  {
    "path": "World Cup 2018/RMarkdown/blog posts/soccer_plots_part1.md",
    "chars": 17845,
    "preview": "Recreate Your Favorite World Cup Goals!\n---------------------------------------\n\nAfter posting a couple of my World Cup "
  },
  {
    "path": "World Cup 2018/RMarkdown/blog posts/soccer_plots_part1.rmd",
    "chars": 25367,
    "preview": "---\ntitle: \"Visualize the World Cup with R! Part 1: Recreating Goals with ggsoccer and ggplot2\"\nauthor: \"RN7\"\ndate: \"Jun"
  },
  {
    "path": "World Cup 2018/RMarkdown/blog posts/soccer_plots_part2.md",
    "chars": 13134,
    "preview": "This is **Part 2** of my World Cup Data Viz Series (See Part 1 [here](https://ryo-n7.github.io/2018-06-29-visualize-worl"
  },
  {
    "path": "World Cup 2018/RMarkdown/blog posts/soccer_plots_part2.rmd",
    "chars": 15460,
    "preview": "---\ntitle: \"World Cup Drama: Visualizing Changes in the Group Table During the Final Matchday\"\nauthor: \"RN7\"\ndate: \"June"
  },
  {
    "path": "World Cup 2018/RMarkdown/blog posts/soccer_plots_part3.md",
    "chars": 32184,
    "preview": "**Animating** the Goals of the World Cup: Comparing the **old** vs. **new** `gganimate` and `tweenr` API!\n\nWelcome to **"
  },
  {
    "path": "World Cup 2018/RMarkdown/blog posts/soccer_plots_part3.rmd",
    "chars": 31616,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"July 24, 2018\"\noutput: \n  md_document:\n    variant: markdown_github\n---\n\n```{"
  },
  {
    "path": "World Cup 2018/RMarkdown/blog posts/soccer_plots_part3_DS+.rmd",
    "chars": 32213,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"July 24, 2018\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::"
  },
  {
    "path": "World Cup 2018/RMarkdown/ggsoccer_graphs.rmd",
    "chars": 11358,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"June 15, 2018\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::"
  },
  {
    "path": "World Cup 2018/RMarkdown/group_goals.csv",
    "chars": 1013,
    "preview": "\"value\"\n\"Group A\"\n\"12'\"\n\"43'\"\n\"90+1'\"\n\"71'\"\n\"90+4'\"\n\"89'\"\n\"47'(o.g.)\"\n\"59'\"\n\"62'\"\n\"73'(pen.)\"\n\"23'\"\n\"10'\"\n\"23'(o.g.)\"\n\"9"
  },
  {
    "path": "World Cup 2018/RMarkdown/group_table_final_matchday.rmd",
    "chars": 28801,
    "preview": "---\ntitle: \"final matchday group\"\nauthor: \"RN7\"\ndate: \"June 27, 2018\"\noutput: html_document\n---\n\n```{r setup, include=FA"
  },
  {
    "path": "World Cup 2018/RMarkdown/historical_kits.rmd",
    "chars": 22327,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"June 24, 2018\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::"
  },
  {
    "path": "World Cup 2018/RMarkdown/joyplot_goals.rmd",
    "chars": 7267,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"June 30, 2018\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::"
  },
  {
    "path": "World Cup 2018/RMarkdown/presentation.rmd",
    "chars": 10349,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"June 22, 2018\"\noutput:\n  word_document: default\n  pdf_document: default\n  htm"
  },
  {
    "path": "World Cup 2018/RMarkdown/soccer_plots_part4.rmd",
    "chars": 506,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"July 18, 2018\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::"
  },
  {
    "path": "World Cup 2018/RMarkdown/worldcup_goal_plots.rmd",
    "chars": 30908,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"June 20, 2018\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::"
  },
  {
    "path": "World Cup 2018/RMarkdown/worldcup_goal_plots_DRAFT.rmd",
    "chars": 14182,
    "preview": "---\ntitle: \"worldcup_goal_plots_DRAFT\"\nauthor: \"RN7\"\ndate: \"July 13, 2018\"\noutput: html_document\n---\n\n```{r setup, inclu"
  },
  {
    "path": "World Cup 2018/RMarkdown/worldcup_ideas.rmd",
    "chars": 1021,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"June 23, 2018\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::"
  },
  {
    "path": "World Cup 2018/anim_save_try.r",
    "chars": 2782,
    "preview": "library(ggplot2)    # general plotting base\nlibrary(dplyr)      # data manipulation/tidying\nlibrary(ggsoccer)   # draw s"
  },
  {
    "path": "World Cup 2018/other articles/world_cup_BBC_charts.rmd",
    "chars": 2685,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"June 11, 2018\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::"
  },
  {
    "path": "World Cup 2018/other articles/worldcup_player_data.rmd",
    "chars": 3157,
    "preview": "---\ntitle: \"Untitled\"\nauthor: \"RN7\"\ndate: \"June 26, 2018\"\noutput: html_document\n---\n\n```{r setup, include=FALSE}\nknitr::"
  },
  {
    "path": "World Cup 2018/scripts/kit_read().r",
    "chars": 187,
    "preview": "kit_read <- function(path) {\n  \n  japan_kits <- list.files(path = path, pattern = \"*.gif\", full.names = TRUE) %>% \n    m"
  },
  {
    "path": "data/Dortmund",
    "chars": 297427,
    "preview": "Rank,Club,Country,Level,Elo,From,To\nNone,Dortmund,FRG,0,1555.7598877,1949-06-06,1949-06-06\nNone,Dortmund,FRG,0,1573.2734"
  },
  {
    "path": "data/J-League_2020_review/interval_goaltimes_all_df_jleague_2020.csv",
    "chars": 8451,
    "preview": "percFor,goalFor,totalFor,time,percAgainst,goalAgainst,totalAgainst,goalAG,team_name,mediangolsFor,mediangolsAgainst\n0.03"
  },
  {
    "path": "data/J-League_2020_review/jleague_2020_individual_xG.csv",
    "chars": 2634,
    "preview": "player_name,team_name,xG,goals,penalty_goals,shots,appearances,npGoals,npShots,npShotsPG,npxG,xGPerShot,npxGPerShot,xgdi"
  },
  {
    "path": "data/J-League_2020_review/jleague_2020_shooting_df.csv",
    "chars": 2487,
    "preview": "Squad,# Pl,90s,Gls,Sh,SoT,SoT%,Sh/90,SoT/90,G/Sh,G/SoT,PK,PKatt,Gls_against,Sh_against,SoT_against,SoT%_against,Sh/90_ag"
  },
  {
    "path": "data/J-League_2020_review/jleague_2020_situation_all_df.csv",
    "chars": 12962,
    "preview": "team_name,GS_total,situation,goals_scored,prop_score,GA_total,goals_against,prop_against,avg_prop_score,avg_prop_against"
  },
  {
    "path": "data/J-League_2020_review/jleague_age_utility_df_2020.csv",
    "chars": 65520,
    "preview": "first_name,last_name,age,minutes,bday,join,leave,min_perc,join_age,age_now,fname,player,team_name,season\nShusaku,Nishika"
  },
  {
    "path": "data/J-League_2020_review/jleague_table_2020_cleaned.csv",
    "chars": 978,
    "preview": "Team,W,D,L,Pts,GF,GA,GDiff,xG,xGA,xGDiff\nKawasaki Frontale,26,5,3,83,88,31,57,82.21,35.05,47.16\nGamba Osaka,20,5,9,65,46"
  },
  {
    "path": "data/J-League_2020_review/team_xG_J-League-2020.csv",
    "chars": 2938,
    "preview": "team_name,xG_perGame,goalsFor_perGame,xGA_perGame,goalsAgainst_perGame,img,team_jp_footballab,goals,goalsAgainst,xG,xGA,"
  },
  {
    "path": "data/J-League_2021_mid_review/Gamba-2021.csv",
    "chars": 3759,
    "preview": "Date,Time,Comp,Round,Day,Venue,Result,GF,GA,Opponent,Poss,Attendance,Captain,Formation,Referee,Match Report,Notes\n2021-0"
  },
  {
    "path": "data/J-League_2021_mid_review/J-League_2021_mid_league_table.csv",
    "chars": 1824,
    "preview": "Rk,Squad,MP,W,D,L,GF,GA,GD,Pts,Last 5,Attendance,Top Team Scorer,Goalkeeper,Notes\n1,Kawa Frontale,22,18,4,0,53,15,+38,58"
  },
  {
    "path": "data/J-League_2021_mid_review/jleague_2021_mid_shooting_clean_df.csv",
    "chars": 2449,
    "preview": "Squad,# Pl,90s,Gls,Sh,SoT,SoT%,Sh/90,SoT/90,G/Sh,G/SoT,PK,PKatt,Gls_against,Sh_against,SoT_against,SoT%_against,Sh/90_ag"
  },
  {
    "path": "data/J-League_2021_mid_review/jleague_2021_mid_squad_standard_against.csv",
    "chars": 2062,
    "preview": ",,,,Playing Time,Playing Time,Playing Time,Playing Time,Performance,Performance,Performance,Performance,Performance,Perf"
  },
  {
    "path": "data/J-League_2021_mid_review/jleague_2021_mid_squad_standard_for.csv",
    "chars": 2001,
    "preview": ",,,,Playing Time,Playing Time,Playing Time,Playing Time,Performance,Performance,Performance,Performance,Performance,Perf"
  },
  {
    "path": "data/J-League_2021_mid_review/jleague_age_utility_df_2021_mid.csv",
    "chars": 69394,
    "preview": "first_name,last_name,age,minutes,bday,join,leave,min_perc,join_age,age_now,fname,player,team_name,season,matchday\nMiki,Y"
  },
  {
    "path": "data/J-League_2021_mid_review/jleague_table_2021_mid_cleaned.csv",
    "chars": 1046,
    "preview": "Team,W,D,L,Pts,GF,GA,GD,xG,xGA,xGDiff\nKawasaki Frontale,18,4,0,58,53,15,38,36.79,15.64,21.15\nYokohama F. Marinos,14,4,2,"
  },
  {
    "path": "data/J-League_2021_mid_review/jleague_xg_player_2021_mid.csv",
    "chars": 3672,
    "preview": "player_name,team_name,xG,goals,penalty_goals,shots,appearances,npGoals,npShots,npShotsPG,npxG,xGPerShot,npxGPerShot,xgdi"
  },
  {
    "path": "data/J-League_2021_mid_review/team_xG_J-League-2021_mid.csv",
    "chars": 2049,
    "preview": "Squad,xG_perGame,goalsFor_perGame,xGA_perGame,goalsAgainst_perGame,matches,goals,goalsAgainst,xG,xGA,xG_perGame_avg,xGA_"
  },
  {
    "path": "data/Mainz",
    "chars": 182207,
    "preview": "Rank,Club,Country,Level,Elo,From,To\nNone,Mainz,FRG,1,1357.1763916,1951-07-01,1951-08-19\nNone,Mainz,FRG,1,1364.53930664,1"
  },
  {
    "path": "data/buli_player_dribbling_stats_hinrunde.csv",
    "chars": 50681,
    "preview": ",,,,,,Performance,Performance,Performance,Performance,Performance,Performance,Performance,Performance,Performance,Perfor"
  },
  {
    "path": "data/buli_player_goalkeeping_stats_hinrunde.csv",
    "chars": 5236,
    "preview": ",,,,,,Goals,Goals,Goals,Goals,Goals,Expected,Expected,Expected,Expected,Launched,Launched,Launched,Passes,Passes,Passes,"
  },
  {
    "path": "data/buli_player_passing_stats_hinrunde.csv",
    "chars": 62689,
    "preview": ",,,,,,Assists,Assists,Assists,Assists,Total,Total,Total,Short,Short,Short,Medium,Medium,Medium,Long,Long,Long,,,,,,,,,,\n"
  },
  {
    "path": "data/buli_player_regular_goalkeeping_stats_hinrunde.csv",
    "chars": 3650,
    "preview": ",,,,,Playing Time,Playing Time,Playing Time,Performance,Performance,Performance,Performance,Performance,Performance,Perf"
  },
  {
    "path": "data/buli_player_shooting_stats_hinrunde.csv",
    "chars": 52365,
    "preview": "Rk,Player,Nation,Pos,Squad,90s,Gls,PK,PKatt,Sh,SoT,FK,SoT%,Sh/90,SoT/90,G/Sh,G/SoT,xG,npxG,npxG/Sh,G-xG,np:G-xG,Matches\n"
  },
  {
    "path": "data/buli_player_stats_hinrunde.csv",
    "chars": 65607,
    "preview": ",,,,,,,Playing Time,Playing Time,Playing Time,Performance,Performance,Performance,Performance,Performance,Performance,Pe"
  },
  {
    "path": "data/buli_squad_stats_hinrunde.csv",
    "chars": 2303,
    "preview": ",,Playing Time,Playing Time,Playing Time,Performance,Performance,Performance,Performance,Performance,Performance,Per 90 "
  },
  {
    "path": "data/epl_player_defensive_actions_stats_MD29.csv",
    "chars": 66852,
    "preview": ",,,,,,,,Tackles,Tackles,Tackles,Tackles,Tackles,Vs Dribbles,Vs Dribbles,Vs Dribbles,Vs Dribbles,Pressures,Pressures,Pres"
  },
  {
    "path": "data/epl_player_goalkeeping_stats_MD23.csv",
    "chars": 5889,
    "preview": ",,,,,,Goals,Goals,Goals,Goals,Goals,Expected,Expected,Expected,Expected,Launched,Launched,Launched,Passes,Passes,Passes,"
  },
  {
    "path": "data/epl_player_regular_goalkeeping_stats_MD23.csv",
    "chars": 4084,
    "preview": ",,,,,Playing Time,Playing Time,Playing Time,Performance,Performance,Performance,Performance,Performance,Performance,Perf"
  },
  {
    "path": "data/epl_player_stats_MD20.csv",
    "chars": 69532,
    "preview": ",,,,,,,Playing Time,Playing Time,Playing Time,Performance,Performance,Performance,Performance,Performance,Performance,Pe"
  },
  {
    "path": "data/epl_player_stats_MD21.csv",
    "chars": 69820,
    "preview": ",,,,,,,Playing Time,Playing Time,Playing Time,Performance,Performance,Performance,Performance,Performance,Performance,Pe"
  },
  {
    "path": "data/epl_player_stats_MD21_2.csv",
    "chars": 69918,
    "preview": ",,,,,,,Playing Time,Playing Time,Playing Time,Performance,Performance,Performance,Performance,Performance,Performance,Pe"
  },
  {
    "path": "data/epl_squad_stats_MD20.csv",
    "chars": 2523,
    "preview": ",,Playing Time,Playing Time,Playing Time,Performance,Performance,Performance,Performance,Performance,Performance,Per 90 "
  },
  {
    "path": "data/epl_squad_stats_MD21.csv",
    "chars": 2523,
    "preview": ",,Playing Time,Playing Time,Playing Time,Performance,Performance,Performance,Performance,Performance,Performance,Per 90 "
  },
  {
    "path": "data/epl_squad_stats_MD21_2.csv",
    "chars": 2523,
    "preview": ",,Playing Time,Playing Time,Playing Time,Performance,Performance,Performance,Performance,Performance,Performance,Per 90 "
  },
  {
    "path": "data/federation_affiliations/AFC",
    "chars": 498,
    "preview": " Afghanistan\n Australia\n Bahrain\n Bangladesh\n Bhutan\n Brunei Darussalam\n Cambodia\n China PR\n Chinese Taipei\n Guam\n Hong "
  },
  {
    "path": "data/federation_affiliations/CAF",
    "chars": 569,
    "preview": " Algeria\n Angola\n Benin\n Botswana\n Burkina Faso\n Burundi\n Cameroon\n Cape Verde\n Central African Republic\n Chad\n Comoros\n"
  },
  {
    "path": "data/federation_affiliations/Concacaf",
    "chars": 534,
    "preview": " Anguilla\n Antigua and Barbuda\n Aruba\n Bahamas\n Barbados\n Belize\n Bermuda\n Bonaire\n British Virgin Islands\n Canada\n Caym"
  },
  {
    "path": "data/federation_affiliations/Conmebol",
    "chars": 89,
    "preview": " Argentina\n Bolivia\n Brazil\n Chile\n Colombia\n Ecuador\n Paraguay\n Peru\n Uruguay\n Venezuela"
  },
  {
    "path": "data/federation_affiliations/OFC",
    "chars": 153,
    "preview": " American Samoa\n Cook Islands\n Fiji\n Kiribati\n New Caledonia\n New Zealand\n Niue\n Papua New Guinea\n Samoa\n Solomon Island"
  },
  {
    "path": "data/federation_affiliations/UEFA",
    "chars": 567,
    "preview": " Albania\n Andorra\n Armenia\n Austria\n Azerbaijan\n Belarus\n Belgium\n Bosnia and Herzegovina\n Bulgaria\n Croatia\n Cyprus\n Cz"
  },
  {
    "path": "data/jleague2019_shot_data.csv",
    "chars": 51883,
    "preview": "Rk,Player,Nation,Pos,Squad,90s,Gls,PK,PKatt,Sh,SoT,FK,SoT%,Sh/90,SoT/90,G/Sh,G/SoT,Matches\n1,Hiroki Abe\\Hiroki-Abe,jp JP"
  },
  {
    "path": "data/jleague_2021_END/jleague_age_utility_df_2021_end.csv",
    "chars": 77054,
    "preview": "first_name,last_name,age,minutes,bday,join,leave,min_perc,join_age,age_now,fname,player,team_name,season,matchday\nTakuya"
  },
  {
    "path": "data/jleague_2021_END/jleague_table_2021_end_cleaned.csv",
    "chars": 1081,
    "preview": "Team,W,D,L,Pts,GF,GA,GD,xG,xGA,xGDiff\nKawasaki Frontale,28,8,2,92,81,28,53,64.98,35.11,29.87\nYokohama F. Marinos,24,7,7,"
  },
  {
    "path": "data/jleague_2021_END/xGDiff_all_matches_per_team.csv",
    "chars": 319747,
    "preview": "\"\",\"home_goals\",\"away_goals\",\"home_xG\",\"away_xG\",\"match_id\",\"home_team\",\"away_team\",\"matchday\",\"base_link\",\"xG_graphic_l"
  },
  {
    "path": "data/jleague_2022_end/jleague_age_utility_df_2022_end.csv",
    "chars": 78200,
    "preview": "\"\",\"first_name\",\"last_name\",\"age\",\"minutes\",\"bday\",\"join\",\"leave\",\"min_perc\",\"join_age\",\"age_now\",\"fname\",\"player\",\"team"
  },
  {
    "path": "data/jleague_2022_end/jleague_table_2022_end_cleaned.csv",
    "chars": 1046,
    "preview": "Team,Matches,W,D,L,Pts,GF,GA,GD,xG,xGA,xGDiff\nYokohama Marinos,34,20,8,6,68,70,35,35,57.32,36.86,20.46\nKawasaki Frontale"
  },
  {
    "path": "data/jleague_2022_mid/jleague_age_utility_df_2022_mid.csv",
    "chars": 66311,
    "preview": "\"\",\"first_name\",\"last_name\",\"age\",\"minutes\",\"bday\",\"join\",\"leave\",\"min_perc\",\"join_age\",\"age_now\",\"fname\",\"player\",\"team"
  },
  {
    "path": "data/jleague_2022_mid/jleague_table_2022_mid_cleaned.csv",
    "chars": 996,
    "preview": "Team,Matches,W,D,L,Pts,GF,GA,GD,xG,xGA,xGDiff\nYokohama Marinos,16,9,4,3,31,30,17,13,24.81,17.63,7.18\nKashima Antlers,16,"
  },
  {
    "path": "data/laliga_player_goalkeeping_stats_MD26.csv",
    "chars": 6606,
    "preview": ",,,,,,,,Goals,Goals,Goals,Goals,Goals,Expected,Expected,Expected,Expected,Launched,Launched,Launched,Passes,Passes,Passe"
  },
  {
    "path": "data/laliga_player_regular_goalkeeping_stats_MD26.csv",
    "chars": 4670,
    "preview": ",,,,,,,Playing Time,Playing Time,Playing Time,Performance,Performance,Performance,Performance,Performance,Performance,Pe"
  },
  {
    "path": "data/liverpool",
    "chars": 290748,
    "preview": "Rank,Club,Country,Level,Elo,From,To\nNone,Liverpool,ENG,1,1551.14025879,1946-07-07,1946-08-31\nNone,Liverpool,ENG,1,1561.3"
  },
  {
    "path": "data/results.csv",
    "chars": 2912886,
    "preview": "date,home_team,away_team,home_score,away_score,tournament,city,country,neutral\n1872-11-30,Scotland,England,0,0,Friendly,"
  },
  {
    "path": "data/sca_big5_demo.csv",
    "chars": 438198,
    "preview": "\"\",\"Season_End_Year\",\"Squad\",\"Comp\",\"Player\",\"Nation\",\"Pos\",\"Age\",\"Born\",\"Mins_Per_90\",\"SCA_SCA\",\"SCA90_SCA\",\"PassLive_S"
  },
  {
    "path": "data/serieA_player_goalkeeping_stats_1-23-20.csv",
    "chars": 5305,
    "preview": ",,,,,,Goals,Goals,Goals,Goals,Goals,Expected,Expected,Expected,Expected,Launched,Launched,Launched,Passes,Passes,Passes,"
  },
  {
    "path": "data/serieA_player_regular_goalkeeping_stats_1-23-20.csv",
    "chars": 3722,
    "preview": ",,,,,Playing Time,Playing Time,Playing Time,Performance,Performance,Performance,Performance,Performance,Performance,Perf"
  },
  {
    "path": "data/spi_matches.csv",
    "chars": 3242477,
    "preview": "date,league_id,league,team1,team2,spi1,spi2,prob1,prob2,probtie,proj_score1,proj_score2,importance1,importance2,score1,s"
  },
  {
    "path": "data/team_situation_data.json",
    "chars": 5626,
    "preview": "[\"<script>\\n\\tvar statisticsData = JSON.parse('[{\\\"situation\\\":{\\\"OpenPlay\\\":{\\\"shots\\\":143,\\\"goals\\\":17,\\\"xG\\\":19.66852"
  },
  {
    "path": "soccer_ggplots.Rproj",
    "chars": 205,
    "preview": "Version: 1.0\n\nRestoreWorkspace: Default\nSaveWorkspace: Default\nAlwaysSaveHistory: Default\n\nEnableCodeIndexing: Yes\nUseSp"
  }
]

// ... and 94 more files (download for full content)

About this extraction

This page contains the full source code of the Ryo-N7/soccer_ggplots GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 248 files (134.0 MB), approximately 2.6M tokens.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.
