Setup
This exercise was developed by Thierry Mayer for the International Trade and Finance Course. The dataset needed for this exercise is available in Stata format at this dropbox link. Download the file, and read it into R with the function read_stata from the haven package.
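A minimal sketch of the loading step. The name of the file behind the Dropbox link is not given here, so the sketch does not guess it: it writes a small toy Stata file (hypothetical values) to a temporary location and reads that back with `read_stata`. For the exercise, `path` would point to the downloaded file instead.

```r
# Requires the haven package: install.packages("haven")
library(haven)

# Toy stand-in for the real dataset (hypothetical values), written to a
# temporary .dta file so that this sketch runs without the download.
toy <- data.frame(
  year  = c(1995, 1995),
  iso_o = c("FRA", "DEU"),
  iso_d = c("DEU", "FRA"),
  flow  = c(10.5, 12.1),
  stringsAsFactors = FALSE
)
path <- tempfile(fileext = ".dta")
write_dta(toy, path)

# For the exercise, replace `path` with the location of the downloaded file.
d <- read_stata(path)
names(d)
dim(d)
```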
Exploring the data
What variables are included in the data?
## [1] "year" "iso_o" "iso_d" "contig" "comlang_off"
## [6] "distw" "pop_o" "pop_d" "gdp_o" "gdp_d"
## [11] "comcur" "fta_wto" "flow"
How many observations do we have in total?
## [1] 1106870 13
How many unique countries do we have in the columns iso_o and iso_d (origin/destination)?
## [1] 208
## [1] 208
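The three questions above can be answered with a few base R calls. The sketch below uses a toy data frame `d` with hypothetical values in place of the real data; only the column names `year`, `iso_o`, `iso_d` and `flow` are taken from the exercise.

```r
# Toy stand-in for the trade data (hypothetical values).
d <- data.frame(
  year  = c(1995, 1995, 1995, 1996),
  iso_o = c("FRA", "FRA", "DEU", "FRA"),
  iso_d = c("DEU", "USA", "FRA", "DEU"),
  flow  = c(10, 20, 12, 11),
  stringsAsFactors = FALSE
)

names(d)   # which variables are included
dim(d)     # number of rows and number of columns

n_origin <- length(unique(d$iso_o))   # distinct origin countries
n_dest   <- length(unique(d$iso_d))   # distinct destination countries
```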
- How does the total number of observations evolve over the years? That is, how many rows of data do we have for each year?
- What about countries? How many origin countries (iso_o) do we have by year?
- How often does each country appear as iso_d within a year? Make a table that counts how often each country appears as iso_d per year!
## # A tibble: 1,106,870 x 3
## # Groups: year [69]
## year iso_d n
## <dbl> <chr> <int>
## 1 1984 ABW 2
## 2 1984 ABW 2
## 3 1985 ABW 1
## 4 1986 ABW 1
## 5 1987 ABW 1
## 6 1988 ABW 5
## 7 1988 ABW 5
## 8 1988 ABW 5
## 9 1988 ABW 5
## 10 1988 ABW 5
## # ... with 1,106,860 more rows
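One possible way to produce these counts, sketched in base R on toy data (hypothetical values) so that it runs without extra packages. With dplyr, `count(d, year, iso_d)` would give the same table. Unlike the printout above, which repeats the count on every row, this version collapses to one row per year and destination.

```r
# Toy stand-in for the trade data (hypothetical values).
d <- data.frame(
  year  = c(1995, 1995, 1995, 1996, 1996),
  iso_o = c("FRA", "USA", "DEU", "FRA", "USA"),
  iso_d = c("DEU", "DEU", "FRA", "DEU", "FRA"),
  stringsAsFactors = FALSE
)

# Rows of data per year
rows_per_year <- table(d$year)

# Distinct origin countries per year
origins_per_year <- tapply(d$iso_o, d$year, function(x) length(unique(x)))

# How often each country appears as destination within a year:
# add a column of ones and sum it within each (year, iso_d) cell
dest_counts <- aggregate(n ~ year + iso_d, data = transform(d, n = 1L), FUN = sum)
dest_counts[order(dest_counts$year, dest_counts$iso_d), ]
```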
Do all countries trade with each other? How many country pairs would we observe if each country traded with every other country? Produce a graph that illustrates cross-country trade. You could think of a square matrix \(M\) with as many rows and columns as there are unique countries. Rows index origin countries and columns index destination countries. You could fill the matrix like this, where \(i,j\) index origin and destination country:
\[
M(i,j) = \begin{cases} 1 & \text{if flow}_{ij}>0 \\
0 & \text{else.}
\end{cases}
\] Your graph should visualize this matrix somehow. Make the graph for two years, 1948 and 2016, and compute the share of trading countries in each of them. 
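A minimal sketch of the matrix construction for a single year, using toy data with hypothetical values. It reads "share of trading countries" as the share of ordered country pairs with a positive flow, which is an assumption; `image()` is just one of several ways to draw the matrix.

```r
# Toy data for one year (hypothetical values).
d <- data.frame(
  iso_o = c("FRA", "FRA", "DEU", "USA"),
  iso_d = c("DEU", "USA", "FRA", "FRA"),
  flow  = c(10, 0, 12, 5),
  stringsAsFactors = FALSE
)

countries <- sort(unique(c(d$iso_o, d$iso_d)))
n <- length(countries)

# M[i, j] = 1 if origin i has a positive flow to destination j, else 0
M <- matrix(0, nrow = n, ncol = n, dimnames = list(countries, countries))
pos <- d[d$flow > 0, ]
M[cbind(pos$iso_o, pos$iso_d)] <- 1

# Share of possible ordered pairs (a country with itself excluded) that trade
share_trading <- sum(M) / (n * (n - 1))

# Dark cells mark trading pairs; origin on the x axis, destination on the y axis
image(1:n, 1:n, M, col = c("white", "black"),
      xlab = "origin", ylab = "destination", axes = FALSE)
axis(1, at = 1:n, labels = countries)
axis(2, at = 1:n, labels = countries)
```

For the exercise, the same steps would be run once on the 1948 rows and once on the 2016 rows.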
Gravity
Compute a new variable called gravity, defined as
\[
\text{gravity}_{odt} = \frac{GDP_{ot} \cdot GDP_{dt}}{GDP_{wt}\cdot distance_{od}}
\]
where indices \(o,d,t\) stand for origin, destination and year. The index \(w\) means world, i.e. \(GDP_{wt}\) is the sum of GDP over all destination countries in year \(t\). You need to be careful here because some countries don’t have any data in certain years (as we know from above), so there will be missing values. When you prepare this computation, apply the following cleaning protocol to your data:
- You need to be careful in computing world GDP. Look back at point 6. above for why. Using dplyr, I would compute world GDP by year first, and then merge it back onto the main dataset.
- Group the data by year.
- Compute the share of gdp_o and gdp_d in world GDP and drop observations smaller than the first percentile of either share.
- Transform flow into flow/1000, i.e. trade flows in thousand dollars.
- Compute gravity as above.
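The cleaning protocol above can be sketched as follows. This is a base R version on toy data (all values hypothetical, including the tiny country `LUX` added so that the percentile filter drops something); the column name `distw` for distance is taken from the variable list, while `gdp_w`, `share_o`, `share_d` and `flow1000` are names chosen here. The key point is that world GDP is summed over each destination country once per year, not over all rows.

```r
# Toy data for one year (hypothetical values).
d <- data.frame(
  year  = 1995,
  iso_o = c("FRA", "FRA", "DEU", "DEU", "USA", "USA", "LUX", "DEU"),
  iso_d = c("DEU", "USA", "FRA", "USA", "FRA", "DEU", "FRA", "LUX"),
  gdp_o = c(1.6, 1.6, 2.6, 2.6, 7.6, 7.6, 0.02, 2.6),
  gdp_d = c(2.6, 7.6, 1.6, 7.6, 1.6, 2.6, 1.6, 0.02),
  distw = c(900, 7500, 900, 7900, 7500, 7900, 300, 200),
  flow  = c(50000, 30000, 60000, 40000, 25000, 45000, 900, 1200),
  stringsAsFactors = FALSE
)

# World GDP by year: count each destination country once per year, because
# every country shows up in many rows (one per trading partner).
by_country <- unique(d[!is.na(d$gdp_d), c("year", "iso_d", "gdp_d")])
world <- aggregate(gdp_d ~ year, data = by_country, FUN = sum)
names(world)[2] <- "gdp_w"
d <- merge(d, world, by = "year")

# GDP shares and first-percentile cutoffs, both computed within each year
d$share_o <- d$gdp_o / d$gdp_w
d$share_d <- d$gdp_d / d$gdp_w
p01 <- function(x) quantile(x, 0.01, na.rm = TRUE)
keep <- d$share_o >= ave(d$share_o, d$year, FUN = p01) &
        d$share_d >= ave(d$share_d, d$year, FUN = p01)
d <- d[which(keep), ]   # which() also drops rows where a share is missing

# Flows in thousand dollars, then the gravity variable
d$flow1000 <- d$flow / 1000
d$gravity  <- d$gdp_o * d$gdp_d / (d$gdp_w * d$distw)
```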
Gravity Regression
Run a regression of the log of trade flows on the log of gravity, using only data for the year 1995. Interpret the coefficient obtained. In a scatterplot, represent the relationship between the log of trade flows and the log of the gravity prediction, together with the regression line, which is very close to a 45 degree line for the 1995 data. How should we interpret the distance of each point to this 45 degree line?
##
## Call:
## lm(formula = log(flow1000) ~ log(gravity), data = d95)
##
## Residuals:
## Min 1Q Median 3Q Max
## -12.836 -1.192 0.162 1.414 8.501
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.847702 0.052725 -16.08 <2e-16 ***
## log(gravity) 1.036308 0.005989 173.04 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 2.295 on 17801 degrees of freedom
## Multiple R-squared: 0.6272, Adjusted R-squared: 0.6271
## F-statistic: 2.994e+04 on 1 and 17801 DF, p-value: < 2.2e-16

How do the slope coefficient estimates vary by year? You could run the above regression for each year, collect the slopes, and plot them against year.
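A sketch of the year-by-year regression on simulated data. The data frame `sim` is a hypothetical stand-in for the cleaned dataset, generated with a true slope of 1; the variable names `flow1000`, `gravity` and `d95` follow the regression output above. Because both sides are in logs, the slope reads as an elasticity.

```r
set.seed(1)
# Simulated stand-in for the cleaned data (hypothetical): log flows follow
# log gravity with a true slope of 1 plus noise, for three years.
sim <- data.frame(
  year    = rep(c(1994, 1995, 1996), each = 200),
  gravity = exp(rnorm(600, mean = 5, sd = 2))
)
sim$flow1000 <- exp(-1 + 1 * log(sim$gravity) + rnorm(600, sd = 2))

# Single-year regression, as for 1995 in the exercise
d95 <- sim[sim$year == 1995, ]
fit95 <- lm(log(flow1000) ~ log(gravity), data = d95)
coef(fit95)

# Residuals are the vertical distances to the fitted line:
# positive values mean more trade than the gravity variable predicts.
head(resid(fit95))

# One slope per year: fit the same model on each year's subset
slopes <- sapply(split(sim, sim$year), function(df) {
  coef(lm(log(flow1000) ~ log(gravity), data = df))[["log(gravity)"]]
})
plot(as.numeric(names(slopes)), slopes, type = "b",
     xlab = "year", ylab = "slope on log(gravity)")
```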

Effect of Free Trade Agreements
Do the same scatterplot, but highlight in a different color the pairs of countries engaged in a Free Trade Agreement (fta_wto = 1 for those in the database). Is the effect of agreements clear from the graph? I used function dplyr::sample_frac to randomly select 10% of rows from the 1995 data in order to avoid overplotting.
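A possible version of this plot, sketched in base R on simulated data (hypothetical values, with an assumed FTA coefficient of 0.5). The `sample()` call is a base R stand-in for `dplyr::sample_frac(sim, 0.1)`; the colors are arbitrary choices.

```r
set.seed(3)
# Simulated stand-in for the 1995 data (hypothetical values).
sim <- data.frame(gravity = exp(rnorm(1000, mean = 5, sd = 2)),
                  fta_wto = rbinom(1000, size = 1, prob = 0.2))
sim$flow1000 <- exp(-1 + log(sim$gravity) + 0.5 * sim$fta_wto +
                      rnorm(1000, sd = 2))

# Keep a random 10% of rows to reduce overplotting
idx <- sample(nrow(sim), size = round(0.1 * nrow(sim)))
s <- sim[idx, ]

# Scatterplot with FTA pairs highlighted, plus the regression line
plot(log(s$gravity), log(s$flow1000),
     col = ifelse(s$fta_wto == 1, "red", "grey40"), pch = 16,
     xlab = "log(gravity)", ylab = "log(flow / 1000)")
abline(lm(log(flow1000) ~ log(gravity), data = s))
legend("topleft", legend = c("FTA", "no FTA"),
       col = c("red", "grey40"), pch = 16)
```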

Investigate this more with Regressions
Run the following regressions using the 1995 data as above.
A classical gravity equation with only GDPs and distance (in logs) explaining the log of trade flows. That is, instead of the computed gravity variable from above, we include its components individually: \[\begin{align}
\log(gravity)_{odt} &= \log\left( \frac{GDP_{ot} \cdot GDP_{dt}}{GDP_{wt} \cdot distance_{od}}\right) \\
&= \log(GDP_{ot}) + \log(GDP_{dt}) - \log(GDP_{wt}) - \log(distance_{od})
\end{align}\] Within a single year, \(\log(GDP_{wt})\) is a constant and is absorbed by the intercept, and so you are supposed to investigate \[
\log \left( \frac{flow_{odt}}{1000} \right) = \log(GDP_{ot}) + \log(GDP_{dt}) - \log(distance_{od})
\]
Introduce the fta_wto dummy variable in that regression. What is the impact of a free trade agreement (fta_wto = 1) on expected trade flows? To answer that last question, remember that for a zero-one dummy \(d\), \[\begin{align}
\ln y &= a + b d \\
y &= \exp(a + b d) \\
E[y|d=0] &= \exp(a)\\
E[y|d=1] &= \exp(a + b)\\
\Delta E[y|d] &= \exp(a + b) - \exp(a)\\
\%\Delta E[y|d] &= \frac{\exp(a + b) - \exp(a)}{\exp(a)}\\
&= e^{a + b - a} - 1 = \exp(b) - 1
\end{align}\]
Introduce common language and contiguity. Again compute the impact of having a common official language and of being contiguous countries.
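The dummy-variable formula \(\exp(b) - 1\) can be checked numerically. The sketch below defines a small helper `pct_effect` (a name chosen here) and applies it to a regression on simulated data, where the variable names and the true dummy coefficient of 0.5 are hypothetical.

```r
# Percentage change in expected y when a 0/1 dummy switches on,
# given its coefficient b in a regression with log(y) on the left
pct_effect <- function(b) 100 * (exp(b) - 1)

pct_effect(0.5)   # about 64.9 percent, noticeably more than 100 * b = 50

# Simulated data (hypothetical): the true dummy coefficient is 0.5
set.seed(2)
n <- 2000
sim <- data.frame(
  lgdp_o = rnorm(n, mean = 25, sd = 2),
  lgdp_d = rnorm(n, mean = 25, sd = 2),
  ldist  = rnorm(n, mean = 8, sd = 1),
  fta    = rbinom(n, size = 1, prob = 0.2)
)
sim$lflow <- -10 + sim$lgdp_o + sim$lgdp_d - sim$ldist +
  0.5 * sim$fta + rnorm(n)

# Gravity regression with the components entered individually plus the dummy
fit <- lm(lflow ~ lgdp_o + lgdp_d + ldist + fta, data = sim)
pct_effect(coef(fit)[["fta"]])
```

The same helper applies to the coefficients on comlang_off and contig once those dummies are added to the regression.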
================================================
FILE: index.Rmd
================================================
---
title: "Introduction to Econometrics with R"
author: "Florian Oswald, Vincent Viers, Jean-Marc Robin, Pierre Villedieu, Gustave Kenedi "
date: "`r Sys.Date()`"
site: bookdown::bookdown_site
output: bookdown::gitbook
documentclass: book
bibliography: ["packages.bib","book.bib"]
biblio-style: apalike
link-citations: yes
url: 'https\://scpoecon.github.io/ScPoEconometrics/'
favicon: "favicon.gif"
github-repo: ScPoEcon/ScPoEconometrics
description: "SciencesPo UG Econometrics online textbook. Almost no Maths."
---
```{r, setup, include=FALSE}
# knitr::opts_chunk$set(comment=ScPoEconometrics:::getprompt(),fig.align = 'center')
knitr::opts_chunk$set(fig.align = 'center')
```
# Syllabus {-}

Welcome to Introductory Econometrics for 2nd year undergraduates at ScPo! On this page we outline the course and present the Syllabus. 2018/2019 was the first time that we taught this course in this format, so we are in year 3 now.
### Objective {-}
We teach this course split over two levels and two semesters: *Introduction* and *Advanced*. Having taken the *Introduction* course is a requirement to enroll in *Advanced*.
The *Introduction* course aims to teach you the basics of data analysis needed in a Social Sciences oriented University like SciencesPo. We purposefully start at a level that assumes no prior knowledge about statistics whatsoever. Our objective is to have you understand and be able to interpret linear regression analysis. We will not rely on maths and statistics, but practical learning in order to teach the main concepts. We also add the principal elements of causal inference, such that you will start being able to distinguish between simple statistical correlation and actual causation.
The *Advanced* course will continue in the semester *after* you have taken the *Introduction* course, following the same philosophy of staying away as much as possible from formal derivations and proofs. We treat important further classical econometric topics like Instrumental Variables, Panel Data, Discrete Dependent Variables. Towards the end of the course we reserve a good amount of time to give an overview of *Statistical Learning*. We will study and apply important concepts from machine learning in an accessible way.
### Course Structure {-}
Either course is taught in several different groups across various campuses of SciencesPo. All groups will go over the same material, do the same exercises, and will have the same assessments.
Groups meet once per week for 2 hours. The main purpose of the weekly meetings is to clarify any questions, and to work together through tutorials. The little theory we need will be covered in this book, and **you are expected to read through this in your own time** before coming to class.
### Introduction Course: Syllabus and Requirements {-}
**Requirements**
The only requirement is that **you bring your own personal computer** to each session. We will be using the free statistical computing language [`R`](https://www.r-project.org) very intensively. Before coming to the first session, please install `R` and `RStudio` as explained at the beginning of chapter \@ref(R-intro).
**Syllabus**
1. Introduction: Chapters 1.1 and 1.2 from this book, Introduction from *Mastering Metrics*, *The Credibility Revolution in Empirical Economics* by Angrist and Pischke (JEP 2010)
2. Summarizing, Visualizing and Tidying Data: Chapter 2 of this book, Chapters 2 and 3 from [ModernDive](https://moderndive.com)
3. Continues with previous session.
4. Simple Linear Regression: Chapter \@ref(linreg) of this book, Chapter 5 of [ModernDive](https://moderndive.com)
5. Introduction to Causality: Chapter \@ref(causality) of this book, Chapter 1 of Mastering Metrics, Potential Outcomes Model in *Causal Inference, The Mixtape* by Scott Cunningham
6. Multiple Linear Regression: Chapter \@ref(multiple-reg)
7. Sampling: Chapter 7 of [ModernDive](https://moderndive.com)
8. Confidence Interval and Hypothesis Testing: Chapters 8 and 9 of [ModernDive](https://moderndive.com)
9. Regression Inference: Chapter \@ref(std-errors) of this book, Chapter 10 of [ModernDive](https://moderndive.com)
10. Differences-in-Differences: Chapter 5 of Mastering Metrics, Card and Krueger (AER 1994)
11. Regression Discontinuity: Chapter 4 of Mastering Metrics, Carpenter and Dobkin (AEJ, Applied, 2009), Imbens and Lemieux (Journal of Econometrics, 2008), Lee and Lemieux (JEL 2010)
12. Review Session
### Advanced Course: Syllabus and Requirements {-}
**Requirements**
*You must have taken the Intro course before, or a course with similar syllabus at your home institution.*
**Syllabus**
1. Logistics, Organisation, Recap 1 from Intro Course
2. Recap 2 from Intro Course
3. Intro to `data.table`
4. Instrumental Variables and Causality 1
5. Instrumental Variables and Causality 2
6. Instrumental Variables and Causality 3
7. Panel Data: What, How and Why?
8. Discrete Outcomes: Logit and Probit
9. Intro to Statistical Learning 1: Taxonomy and Intro to Machine Learning
10. Intro to Statistical Learning 2: Model Validation
11. Intro to Statistical Learning 3: Unsupervised Learning
12. Recap / Buffer 1
13. Recap / Buffer 2
### Slides {-}
**Introductory Level**
There are slides for each book chapter at a [dedicated website](https://github.com/ScPoEcon/ScPoEconometrics-Slides).
**Advanced Level**
We host slides [here](https://github.com/ScPoEcon/Advanced-Metrics-slides).
### This Book and Other Material {-}
What you are looking at is an online textbook. You can therefore look at it in your browser (as you are doing just now), on your mobile phone or tablet, but you can also download it as a `pdf` file or as an `epub` file for your ebook-reader. We don't have any ambition to actually produce and publish a *book* for now, so you should just see this as a way to disseminate our lecture notes to you.
Next to the book, the second part of the course material is an extensive suite of tutorials and interactive demonstrations, all contained in the `R` package associated with this book, which you will install in chapter 1.
### Open Source {-}
The book and all other content for this course are hosted under an open source license on GitHub. You can contribute to the book by just clicking on the appropriate *edit* symbol in the top bar of this page. Other teachers who want to use our material can freely do so, observing the terms of the license on the [GitHub repository](https://github.com/ScPoEcon/ScPoEconometrics).
### Assessments {-}
We will assess participation in class, quizzes on Moodle and take-home exams.
### Communication {-}
We will communicate exclusively on our Slack group. You will get an invitation email to join from your instructor in due course.
================================================
FILE: inst/CITATION
================================================
rref <- bibentry(
  bibtype = "Manual",
  title = "Introduction to Econometrics with R",
  author = c(person("Florian", "Oswald", role = c("aut", "cre")),
             person("Jean-Marc", "Robin", role = c("ctb")),
             person("Vincent", "Viers", role = c("aut", "ctb"))),
  organization = "SciencesPo, Department of Economics",
  address = "Paris, France",
  year = "2018",
  url = "https://scpoecon.github.io/ScPoEconometrics/")
================================================
FILE: inst/datasets/airline-safety.csv
================================================
"airline","avail_seat_km_per_week","type","value","period"
"Aer Lingus",320906734,"incidents",2,"1985_1999"
"Aeroflot*",1197672318,"incidents",76,"1985_1999"
"Aerolineas Argentinas",385803648,"incidents",6,"1985_1999"
"Aeromexico*",596871813,"incidents",3,"1985_1999"
"Air Canada",1865253802,"incidents",2,"1985_1999"
"Air France",3004002661,"incidents",14,"1985_1999"
"Air India*",869253552,"incidents",2,"1985_1999"
"Air New Zealand*",710174817,"incidents",3,"1985_1999"
"Alaska Airlines*",965346773,"incidents",5,"1985_1999"
"Alitalia",698012498,"incidents",7,"1985_1999"
"All Nippon Airways",1841234177,"incidents",3,"1985_1999"
"American*",5228357340,"incidents",21,"1985_1999"
"Austrian Airlines",358239823,"incidents",1,"1985_1999"
"Avianca",396922563,"incidents",5,"1985_1999"
"British Airways*",3179760952,"incidents",4,"1985_1999"
"Cathay Pacific*",2582459303,"incidents",0,"1985_1999"
"China Airlines",813216487,"incidents",12,"1985_1999"
"Condor",417982610,"incidents",2,"1985_1999"
"COPA",550491507,"incidents",3,"1985_1999"
"Delta / Northwest*",6525658894,"incidents",24,"1985_1999"
"Egyptair",557699891,"incidents",8,"1985_1999"
"El Al",335448023,"incidents",1,"1985_1999"
"Ethiopian Airlines",488560643,"incidents",25,"1985_1999"
"Finnair",506464950,"incidents",1,"1985_1999"
"Garuda Indonesia",613356665,"incidents",10,"1985_1999"
"Gulf Air",301379762,"incidents",1,"1985_1999"
"Hawaiian Airlines",493877795,"incidents",0,"1985_1999"
"Iberia",1173203126,"incidents",4,"1985_1999"
"Japan Airlines",1574217531,"incidents",3,"1985_1999"
"Kenya Airways",277414794,"incidents",2,"1985_1999"
"KLM*",1874561773,"incidents",7,"1985_1999"
"Korean Air",1734522605,"incidents",12,"1985_1999"
"LAN Airlines",1001965891,"incidents",3,"1985_1999"
"Lufthansa*",3426529504,"incidents",6,"1985_1999"
"Malaysia Airlines",1039171244,"incidents",3,"1985_1999"
"Pakistan International",348563137,"incidents",8,"1985_1999"
"Philippine Airlines",413007158,"incidents",7,"1985_1999"
"Qantas*",1917428984,"incidents",1,"1985_1999"
"Royal Air Maroc",295705339,"incidents",5,"1985_1999"
"SAS*",682971852,"incidents",5,"1985_1999"
"Saudi Arabian",859673901,"incidents",7,"1985_1999"
"Singapore Airlines",2376857805,"incidents",2,"1985_1999"
"South African",651502442,"incidents",2,"1985_1999"
"Southwest Airlines",3276525770,"incidents",1,"1985_1999"
"Sri Lankan / AirLanka",325582976,"incidents",2,"1985_1999"
"SWISS*",792601299,"incidents",2,"1985_1999"
"TACA",259373346,"incidents",3,"1985_1999"
"TAM",1509195646,"incidents",8,"1985_1999"
"TAP - Air Portugal",619130754,"incidents",0,"1985_1999"
"Thai Airways",1702802250,"incidents",8,"1985_1999"
"Turkish Airlines",1946098294,"incidents",8,"1985_1999"
"United / Continental*",7139291291,"incidents",19,"1985_1999"
"US Airways / America West*",2455687887,"incidents",16,"1985_1999"
"Vietnam Airlines",625084918,"incidents",7,"1985_1999"
"Virgin Atlantic",1005248585,"incidents",1,"1985_1999"
"Xiamen Airlines",430462962,"incidents",9,"1985_1999"
"Aer Lingus",320906734,"fatal_accidents",0,"1985_1999"
"Aeroflot*",1197672318,"fatal_accidents",14,"1985_1999"
"Aerolineas Argentinas",385803648,"fatal_accidents",0,"1985_1999"
"Aeromexico*",596871813,"fatal_accidents",1,"1985_1999"
"Air Canada",1865253802,"fatal_accidents",0,"1985_1999"
"Air France",3004002661,"fatal_accidents",4,"1985_1999"
"Air India*",869253552,"fatal_accidents",1,"1985_1999"
"Air New Zealand*",710174817,"fatal_accidents",0,"1985_1999"
"Alaska Airlines*",965346773,"fatal_accidents",0,"1985_1999"
"Alitalia",698012498,"fatal_accidents",2,"1985_1999"
"All Nippon Airways",1841234177,"fatal_accidents",1,"1985_1999"
"American*",5228357340,"fatal_accidents",5,"1985_1999"
"Austrian Airlines",358239823,"fatal_accidents",0,"1985_1999"
"Avianca",396922563,"fatal_accidents",3,"1985_1999"
"British Airways*",3179760952,"fatal_accidents",0,"1985_1999"
"Cathay Pacific*",2582459303,"fatal_accidents",0,"1985_1999"
"China Airlines",813216487,"fatal_accidents",6,"1985_1999"
"Condor",417982610,"fatal_accidents",1,"1985_1999"
"COPA",550491507,"fatal_accidents",1,"1985_1999"
"Delta / Northwest*",6525658894,"fatal_accidents",12,"1985_1999"
"Egyptair",557699891,"fatal_accidents",3,"1985_1999"
"El Al",335448023,"fatal_accidents",1,"1985_1999"
"Ethiopian Airlines",488560643,"fatal_accidents",5,"1985_1999"
"Finnair",506464950,"fatal_accidents",0,"1985_1999"
"Garuda Indonesia",613356665,"fatal_accidents",3,"1985_1999"
"Gulf Air",301379762,"fatal_accidents",0,"1985_1999"
"Hawaiian Airlines",493877795,"fatal_accidents",0,"1985_1999"
"Iberia",1173203126,"fatal_accidents",1,"1985_1999"
"Japan Airlines",1574217531,"fatal_accidents",1,"1985_1999"
"Kenya Airways",277414794,"fatal_accidents",0,"1985_1999"
"KLM*",1874561773,"fatal_accidents",1,"1985_1999"
"Korean Air",1734522605,"fatal_accidents",5,"1985_1999"
"LAN Airlines",1001965891,"fatal_accidents",2,"1985_1999"
"Lufthansa*",3426529504,"fatal_accidents",1,"1985_1999"
"Malaysia Airlines",1039171244,"fatal_accidents",1,"1985_1999"
"Pakistan International",348563137,"fatal_accidents",3,"1985_1999"
"Philippine Airlines",413007158,"fatal_accidents",4,"1985_1999"
"Qantas*",1917428984,"fatal_accidents",0,"1985_1999"
"Royal Air Maroc",295705339,"fatal_accidents",3,"1985_1999"
"SAS*",682971852,"fatal_accidents",0,"1985_1999"
"Saudi Arabian",859673901,"fatal_accidents",2,"1985_1999"
"Singapore Airlines",2376857805,"fatal_accidents",2,"1985_1999"
"South African",651502442,"fatal_accidents",1,"1985_1999"
"Southwest Airlines",3276525770,"fatal_accidents",0,"1985_1999"
"Sri Lankan / AirLanka",325582976,"fatal_accidents",1,"1985_1999"
"SWISS*",792601299,"fatal_accidents",1,"1985_1999"
"TACA",259373346,"fatal_accidents",1,"1985_1999"
"TAM",1509195646,"fatal_accidents",3,"1985_1999"
"TAP - Air Portugal",619130754,"fatal_accidents",0,"1985_1999"
"Thai Airways",1702802250,"fatal_accidents",4,"1985_1999"
"Turkish Airlines",1946098294,"fatal_accidents",3,"1985_1999"
"United / Continental*",7139291291,"fatal_accidents",8,"1985_1999"
"US Airways / America West*",2455687887,"fatal_accidents",7,"1985_1999"
"Vietnam Airlines",625084918,"fatal_accidents",3,"1985_1999"
"Virgin Atlantic",1005248585,"fatal_accidents",0,"1985_1999"
"Xiamen Airlines",430462962,"fatal_accidents",1,"1985_1999"
"Aer Lingus",320906734,"fatalities",0,"1985_1999"
"Aeroflot*",1197672318,"fatalities",128,"1985_1999"
"Aerolineas Argentinas",385803648,"fatalities",0,"1985_1999"
"Aeromexico*",596871813,"fatalities",64,"1985_1999"
"Air Canada",1865253802,"fatalities",0,"1985_1999"
"Air France",3004002661,"fatalities",79,"1985_1999"
"Air India*",869253552,"fatalities",329,"1985_1999"
"Air New Zealand*",710174817,"fatalities",0,"1985_1999"
"Alaska Airlines*",965346773,"fatalities",0,"1985_1999"
"Alitalia",698012498,"fatalities",50,"1985_1999"
"All Nippon Airways",1841234177,"fatalities",1,"1985_1999"
"American*",5228357340,"fatalities",101,"1985_1999"
"Austrian Airlines",358239823,"fatalities",0,"1985_1999"
"Avianca",396922563,"fatalities",323,"1985_1999"
"British Airways*",3179760952,"fatalities",0,"1985_1999"
"Cathay Pacific*",2582459303,"fatalities",0,"1985_1999"
"China Airlines",813216487,"fatalities",535,"1985_1999"
"Condor",417982610,"fatalities",16,"1985_1999"
"COPA",550491507,"fatalities",47,"1985_1999"
"Delta / Northwest*",6525658894,"fatalities",407,"1985_1999"
"Egyptair",557699891,"fatalities",282,"1985_1999"
"El Al",335448023,"fatalities",4,"1985_1999"
"Ethiopian Airlines",488560643,"fatalities",167,"1985_1999"
"Finnair",506464950,"fatalities",0,"1985_1999"
"Garuda Indonesia",613356665,"fatalities",260,"1985_1999"
"Gulf Air",301379762,"fatalities",0,"1985_1999"
"Hawaiian Airlines",493877795,"fatalities",0,"1985_1999"
"Iberia",1173203126,"fatalities",148,"1985_1999"
"Japan Airlines",1574217531,"fatalities",520,"1985_1999"
"Kenya Airways",277414794,"fatalities",0,"1985_1999"
"KLM*",1874561773,"fatalities",3,"1985_1999"
"Korean Air",1734522605,"fatalities",425,"1985_1999"
"LAN Airlines",1001965891,"fatalities",21,"1985_1999"
"Lufthansa*",3426529504,"fatalities",2,"1985_1999"
"Malaysia Airlines",1039171244,"fatalities",34,"1985_1999"
"Pakistan International",348563137,"fatalities",234,"1985_1999"
"Philippine Airlines",413007158,"fatalities",74,"1985_1999"
"Qantas*",1917428984,"fatalities",0,"1985_1999"
"Royal Air Maroc",295705339,"fatalities",51,"1985_1999"
"SAS*",682971852,"fatalities",0,"1985_1999"
"Saudi Arabian",859673901,"fatalities",313,"1985_1999"
"Singapore Airlines",2376857805,"fatalities",6,"1985_1999"
"South African",651502442,"fatalities",159,"1985_1999"
"Southwest Airlines",3276525770,"fatalities",0,"1985_1999"
"Sri Lankan / AirLanka",325582976,"fatalities",14,"1985_1999"
"SWISS*",792601299,"fatalities",229,"1985_1999"
"TACA",259373346,"fatalities",3,"1985_1999"
"TAM",1509195646,"fatalities",98,"1985_1999"
"TAP - Air Portugal",619130754,"fatalities",0,"1985_1999"
"Thai Airways",1702802250,"fatalities",308,"1985_1999"
"Turkish Airlines",1946098294,"fatalities",64,"1985_1999"
"United / Continental*",7139291291,"fatalities",319,"1985_1999"
"US Airways / America West*",2455687887,"fatalities",224,"1985_1999"
"Vietnam Airlines",625084918,"fatalities",171,"1985_1999"
"Virgin Atlantic",1005248585,"fatalities",0,"1985_1999"
"Xiamen Airlines",430462962,"fatalities",82,"1985_1999"
"Aer Lingus",320906734,"incidents",0,"2000_2014"
"Aeroflot*",1197672318,"incidents",6,"2000_2014"
"Aerolineas Argentinas",385803648,"incidents",1,"2000_2014"
"Aeromexico*",596871813,"incidents",5,"2000_2014"
"Air Canada",1865253802,"incidents",2,"2000_2014"
"Air France",3004002661,"incidents",6,"2000_2014"
"Air India*",869253552,"incidents",4,"2000_2014"
"Air New Zealand*",710174817,"incidents",5,"2000_2014"
"Alaska Airlines*",965346773,"incidents",5,"2000_2014"
"Alitalia",698012498,"incidents",4,"2000_2014"
"All Nippon Airways",1841234177,"incidents",7,"2000_2014"
"American*",5228357340,"incidents",17,"2000_2014"
"Austrian Airlines",358239823,"incidents",1,"2000_2014"
"Avianca",396922563,"incidents",0,"2000_2014"
"British Airways*",3179760952,"incidents",6,"2000_2014"
"Cathay Pacific*",2582459303,"incidents",2,"2000_2014"
"China Airlines",813216487,"incidents",2,"2000_2014"
"Condor",417982610,"incidents",0,"2000_2014"
"COPA",550491507,"incidents",0,"2000_2014"
"Delta / Northwest*",6525658894,"incidents",24,"2000_2014"
"Egyptair",557699891,"incidents",4,"2000_2014"
"El Al",335448023,"incidents",1,"2000_2014"
"Ethiopian Airlines",488560643,"incidents",5,"2000_2014"
"Finnair",506464950,"incidents",0,"2000_2014"
"Garuda Indonesia",613356665,"incidents",4,"2000_2014"
"Gulf Air",301379762,"incidents",3,"2000_2014"
"Hawaiian Airlines",493877795,"incidents",1,"2000_2014"
"Iberia",1173203126,"incidents",5,"2000_2014"
"Japan Airlines",1574217531,"incidents",0,"2000_2014"
"Kenya Airways",277414794,"incidents",2,"2000_2014"
"KLM*",1874561773,"incidents",1,"2000_2014"
"Korean Air",1734522605,"incidents",1,"2000_2014"
"LAN Airlines",1001965891,"incidents",0,"2000_2014"
"Lufthansa*",3426529504,"incidents",3,"2000_2014"
"Malaysia Airlines",1039171244,"incidents",3,"2000_2014"
"Pakistan International",348563137,"incidents",10,"2000_2014"
"Philippine Airlines",413007158,"incidents",2,"2000_2014"
"Qantas*",1917428984,"incidents",5,"2000_2014"
"Royal Air Maroc",295705339,"incidents",3,"2000_2014"
"SAS*",682971852,"incidents",6,"2000_2014"
"Saudi Arabian",859673901,"incidents",11,"2000_2014"
"Singapore Airlines",2376857805,"incidents",2,"2000_2014"
"South African",651502442,"incidents",1,"2000_2014"
"Southwest Airlines",3276525770,"incidents",8,"2000_2014"
"Sri Lankan / AirLanka",325582976,"incidents",4,"2000_2014"
"SWISS*",792601299,"incidents",3,"2000_2014"
"TACA",259373346,"incidents",1,"2000_2014"
"TAM",1509195646,"incidents",7,"2000_2014"
"TAP - Air Portugal",619130754,"incidents",0,"2000_2014"
"Thai Airways",1702802250,"incidents",2,"2000_2014"
"Turkish Airlines",1946098294,"incidents",8,"2000_2014"
"United / Continental*",7139291291,"incidents",14,"2000_2014"
"US Airways / America West*",2455687887,"incidents",11,"2000_2014"
"Vietnam Airlines",625084918,"incidents",1,"2000_2014"
"Virgin Atlantic",1005248585,"incidents",0,"2000_2014"
"Xiamen Airlines",430462962,"incidents",2,"2000_2014"
"Aer Lingus",320906734,"fatal_accidents",0,"2000_2014"
"Aeroflot*",1197672318,"fatal_accidents",1,"2000_2014"
"Aerolineas Argentinas",385803648,"fatal_accidents",0,"2000_2014"
"Aeromexico*",596871813,"fatal_accidents",0,"2000_2014"
"Air Canada",1865253802,"fatal_accidents",0,"2000_2014"
"Air France",3004002661,"fatal_accidents",2,"2000_2014"
"Air India*",869253552,"fatal_accidents",1,"2000_2014"
"Air New Zealand*",710174817,"fatal_accidents",1,"2000_2014"
"Alaska Airlines*",965346773,"fatal_accidents",1,"2000_2014"
"Alitalia",698012498,"fatal_accidents",0,"2000_2014"
"All Nippon Airways",1841234177,"fatal_accidents",0,"2000_2014"
"American*",5228357340,"fatal_accidents",3,"2000_2014"
"Austrian Airlines",358239823,"fatal_accidents",0,"2000_2014"
"Avianca",396922563,"fatal_accidents",0,"2000_2014"
"British Airways*",3179760952,"fatal_accidents",0,"2000_2014"
"Cathay Pacific*",2582459303,"fatal_accidents",0,"2000_2014"
"China Airlines",813216487,"fatal_accidents",1,"2000_2014"
"Condor",417982610,"fatal_accidents",0,"2000_2014"
"COPA",550491507,"fatal_accidents",0,"2000_2014"
"Delta / Northwest*",6525658894,"fatal_accidents",2,"2000_2014"
"Egyptair",557699891,"fatal_accidents",1,"2000_2014"
"El Al",335448023,"fatal_accidents",0,"2000_2014"
"Ethiopian Airlines",488560643,"fatal_accidents",2,"2000_2014"
"Finnair",506464950,"fatal_accidents",0,"2000_2014"
"Garuda Indonesia",613356665,"fatal_accidents",2,"2000_2014"
"Gulf Air",301379762,"fatal_accidents",1,"2000_2014"
"Hawaiian Airlines",493877795,"fatal_accidents",0,"2000_2014"
"Iberia",1173203126,"fatal_accidents",0,"2000_2014"
"Japan Airlines",1574217531,"fatal_accidents",0,"2000_2014"
"Kenya Airways",277414794,"fatal_accidents",2,"2000_2014"
"KLM*",1874561773,"fatal_accidents",0,"2000_2014"
"Korean Air",1734522605,"fatal_accidents",0,"2000_2014"
"LAN Airlines",1001965891,"fatal_accidents",0,"2000_2014"
"Lufthansa*",3426529504,"fatal_accidents",0,"2000_2014"
"Malaysia Airlines",1039171244,"fatal_accidents",2,"2000_2014"
"Pakistan International",348563137,"fatal_accidents",2,"2000_2014"
"Philippine Airlines",413007158,"fatal_accidents",1,"2000_2014"
"Qantas*",1917428984,"fatal_accidents",0,"2000_2014"
"Royal Air Maroc",295705339,"fatal_accidents",0,"2000_2014"
"SAS*",682971852,"fatal_accidents",1,"2000_2014"
"Saudi Arabian",859673901,"fatal_accidents",0,"2000_2014"
"Singapore Airlines",2376857805,"fatal_accidents",1,"2000_2014"
"South African",651502442,"fatal_accidents",0,"2000_2014"
"Southwest Airlines",3276525770,"fatal_accidents",0,"2000_2014"
"Sri Lankan / AirLanka",325582976,"fatal_accidents",0,"2000_2014"
"SWISS*",792601299,"fatal_accidents",0,"2000_2014"
"TACA",259373346,"fatal_accidents",1,"2000_2014"
"TAM",1509195646,"fatal_accidents",2,"2000_2014"
"TAP - Air Portugal",619130754,"fatal_accidents",0,"2000_2014"
"Thai Airways",1702802250,"fatal_accidents",1,"2000_2014"
"Turkish Airlines",1946098294,"fatal_accidents",2,"2000_2014"
"United / Continental*",7139291291,"fatal_accidents",2,"2000_2014"
"US Airways / America West*",2455687887,"fatal_accidents",2,"2000_2014"
"Vietnam Airlines",625084918,"fatal_accidents",0,"2000_2014"
"Virgin Atlantic",1005248585,"fatal_accidents",0,"2000_2014"
"Xiamen Airlines",430462962,"fatal_accidents",0,"2000_2014"
"Aer Lingus",320906734,"fatalities",0,"2000_2014"
"Aeroflot*",1197672318,"fatalities",88,"2000_2014"
"Aerolineas Argentinas",385803648,"fatalities",0,"2000_2014"
"Aeromexico*",596871813,"fatalities",0,"2000_2014"
"Air Canada",1865253802,"fatalities",0,"2000_2014"
"Air France",3004002661,"fatalities",337,"2000_2014"
"Air India*",869253552,"fatalities",158,"2000_2014"
"Air New Zealand*",710174817,"fatalities",7,"2000_2014"
"Alaska Airlines*",965346773,"fatalities",88,"2000_2014"
"Alitalia",698012498,"fatalities",0,"2000_2014"
"All Nippon Airways",1841234177,"fatalities",0,"2000_2014"
"American*",5228357340,"fatalities",416,"2000_2014"
"Austrian Airlines",358239823,"fatalities",0,"2000_2014"
"Avianca",396922563,"fatalities",0,"2000_2014"
"British Airways*",3179760952,"fatalities",0,"2000_2014"
"Cathay Pacific*",2582459303,"fatalities",0,"2000_2014"
"China Airlines",813216487,"fatalities",225,"2000_2014"
"Condor",417982610,"fatalities",0,"2000_2014"
"COPA",550491507,"fatalities",0,"2000_2014"
"Delta / Northwest*",6525658894,"fatalities",51,"2000_2014"
"Egyptair",557699891,"fatalities",14,"2000_2014"
"El Al",335448023,"fatalities",0,"2000_2014"
"Ethiopian Airlines",488560643,"fatalities",92,"2000_2014"
"Finnair",506464950,"fatalities",0,"2000_2014"
"Garuda Indonesia",613356665,"fatalities",22,"2000_2014"
"Gulf Air",301379762,"fatalities",143,"2000_2014"
"Hawaiian Airlines",493877795,"fatalities",0,"2000_2014"
"Iberia",1173203126,"fatalities",0,"2000_2014"
"Japan Airlines",1574217531,"fatalities",0,"2000_2014"
"Kenya Airways",277414794,"fatalities",283,"2000_2014"
"KLM*",1874561773,"fatalities",0,"2000_2014"
"Korean Air",1734522605,"fatalities",0,"2000_2014"
"LAN Airlines",1001965891,"fatalities",0,"2000_2014"
"Lufthansa*",3426529504,"fatalities",0,"2000_2014"
"Malaysia Airlines",1039171244,"fatalities",537,"2000_2014"
"Pakistan International",348563137,"fatalities",46,"2000_2014"
"Philippine Airlines",413007158,"fatalities",1,"2000_2014"
"Qantas*",1917428984,"fatalities",0,"2000_2014"
"Royal Air Maroc",295705339,"fatalities",0,"2000_2014"
"SAS*",682971852,"fatalities",110,"2000_2014"
"Saudi Arabian",859673901,"fatalities",0,"2000_2014"
"Singapore Airlines",2376857805,"fatalities",83,"2000_2014"
"South African",651502442,"fatalities",0,"2000_2014"
"Southwest Airlines",3276525770,"fatalities",0,"2000_2014"
"Sri Lankan / AirLanka",325582976,"fatalities",0,"2000_2014"
"SWISS*",792601299,"fatalities",0,"2000_2014"
"TACA",259373346,"fatalities",3,"2000_2014"
"TAM",1509195646,"fatalities",188,"2000_2014"
"TAP - Air Portugal",619130754,"fatalities",0,"2000_2014"
"Thai Airways",1702802250,"fatalities",1,"2000_2014"
"Turkish Airlines",1946098294,"fatalities",84,"2000_2014"
"United / Continental*",7139291291,"fatalities",109,"2000_2014"
"US Airways / America West*",2455687887,"fatalities",23,"2000_2014"
"Vietnam Airlines",625084918,"fatalities",0,"2000_2014"
"Virgin Atlantic",1005248585,"fatalities",0,"2000_2014"
"Xiamen Airlines",430462962,"fatalities",0,"2000_2014"
================================================
FILE: inst/datasets/corr50.csv
================================================
-1.5769,-0.107
-0.4231,5.72
1.2308,-2.6454
1.2308,1.2776
2.2692,5.72
4.1154,1.2776
4.0385,-1.8954
5.3462,8.893
4.4231,8.3738
5.0385,6.9892
6.1923,4.5661
5.9615,0.4123
7.4615,4.8546
7.5385,6.9892
9.1923,6.1238
3.8846,3.9892
2.3462,1.2776
8.7692,5.0276
8.7692,7.5084
-0.6923,1.7969
================================================
FILE: inst/datasets/example-data.csv
================================================
"x","y","z"
1,"Hello",TRUE
3,"Hello",FALSE
5,"Hello",TRUE
7,"Hello",FALSE
9,"Hello",TRUE
1,"Hello",FALSE
3,"Hello",TRUE
5,"Hello",FALSE
7,"Hello",TRUE
9,"Goodbye",FALSE
================================================
FILE: packages.bib
================================================
@Manual{R-Ecdat,
title = {Ecdat: Data Sets for Econometrics},
author = {Yves Croissant},
year = {2016},
note = {R package version 0.3-1},
url = {https://CRAN.R-project.org/package=Ecdat},
}
@Manual{R-Ecfun,
title = {Ecfun: Functions for Ecdat},
author = {Spencer Graves},
year = {2016},
note = {R package version 0.1-7},
url = {https://CRAN.R-project.org/package=Ecfun},
}
@Manual{R-ScPoEconometrics,
title = {ScPoEconometrics: ScPoEconometrics},
author = {Florian Oswald},
year = {2018},
note = {R package version 0.1.8},
url = {https://github.com/ScPoEcon/ScPoEconometrics},
}
@Manual{R-base,
title = {R: A Language and Environment for Statistical Computing},
author = {{R Core Team}},
organization = {R Foundation for Statistical Computing},
address = {Vienna, Austria},
year = {2018},
url = {https://www.R-project.org/},
}
@Manual{R-bindrcpp,
title = {bindrcpp: An 'Rcpp' Interface to Active Bindings},
author = {Kirill Müller},
year = {2018},
note = {R package version 0.2.2},
url = {https://CRAN.R-project.org/package=bindrcpp},
}
@Manual{R-bookdown,
title = {bookdown: Authoring Books and Technical Documents with R Markdown},
author = {Yihui Xie},
year = {2018},
note = {R package version 0.7},
url = {https://CRAN.R-project.org/package=bookdown},
}
@Manual{R-dplyr,
title = {dplyr: A Grammar of Data Manipulation},
author = {Hadley Wickham and Romain François and Lionel Henry and Kirill Müller},
year = {2018},
note = {R package version 0.7.6},
url = {https://CRAN.R-project.org/package=dplyr},
}
@Manual{R-ggplot2,
title = {ggplot2: Create Elegant Data Visualisations Using the Grammar of Graphics},
author = {Hadley Wickham and Winston Chang and Lionel Henry and Thomas Lin Pedersen and Kohske Takahashi and Claus Wilke and Kara Woo},
year = {2018},
note = {R package version 3.0.0},
url = {https://CRAN.R-project.org/package=ggplot2},
}
@Manual{R-knitr,
title = {knitr: A General-Purpose Package for Dynamic Report Generation in R},
author = {Yihui Xie},
year = {2018},
note = {R package version 1.20},
url = {https://CRAN.R-project.org/package=knitr},
}
@Manual{R-mvtnorm,
title = {mvtnorm: Multivariate Normal and t Distributions},
author = {Alan Genz and Frank Bretz and Tetsuhisa Miwa and Xuefei Mi and Torsten Hothorn},
year = {2018},
note = {R package version 1.0-8},
url = {https://CRAN.R-project.org/package=mvtnorm},
}
@Manual{R-plotly,
title = {plotly: Create Interactive Web Graphics via 'plotly.js'},
author = {Carson Sievert and Chris Parmer and Toby Hocking and Scott Chamberlain and Karthik Ram and Marianne Corvellec and Pedro Despouy},
year = {2017},
note = {R package version 4.7.1},
url = {https://CRAN.R-project.org/package=plotly},
}
@Manual{R-readr,
title = {readr: Read Rectangular Text Data},
author = {Hadley Wickham and Jim Hester and Romain Francois},
year = {2017},
note = {R package version 1.1.1},
url = {https://CRAN.R-project.org/package=readr},
}
@Manual{R-readxl,
title = {readxl: Read Excel Files},
author = {Hadley Wickham and Jennifer Bryan},
year = {2018},
note = {R package version 1.1.0},
url = {https://CRAN.R-project.org/package=readxl},
}
@Manual{R-reshape2,
title = {reshape2: Flexibly Reshape Data: A Reboot of the Reshape Package},
author = {Hadley Wickham},
year = {2017},
note = {R package version 1.4.3},
url = {https://CRAN.R-project.org/package=reshape2},
}
@Manual{R-rmarkdown,
title = {rmarkdown: Dynamic Documents for R},
author = {JJ Allaire and Yihui Xie and Jonathan McPherson and Javier Luraschi and Kevin Ushey and Aron Atkins and Hadley Wickham and Joe Cheng and Winston Chang},
year = {2018},
note = {R package version 1.10},
url = {https://CRAN.R-project.org/package=rmarkdown},
}
@Manual{R-tidyr,
title = {tidyr: Easily Tidy Data with 'spread()' and 'gather()' Functions},
author = {Hadley Wickham and Lionel Henry},
year = {2018},
note = {R package version 0.8.1},
url = {https://CRAN.R-project.org/package=tidyr},
}
================================================
FILE: preamble.tex
================================================
\usepackage{tcolorbox}
\usepackage{booktabs}
\usepackage{amsthm}
\newenvironment{note}{\begin{tcolorbox}[colback=blue!5!white,colframe=blue!75!black]}{\end{tcolorbox}}
\newenvironment{notel}{\begin{tcolorbox}[colback=blue!5!white,colframe=blue!75!black]}{\end{tcolorbox}}
\newenvironment{warning}{\begin{tcolorbox}[colback=orange!5!white,colframe=orange]}{\end{tcolorbox}}
\newenvironment{warningl}{\begin{tcolorbox}[colback=orange!5!white,colframe=orange]}{\end{tcolorbox}}
\newenvironment{tip}{\begin{tcolorbox}[colback=green!5!white,colframe=green]}{\end{tcolorbox}}
\makeatletter
\def\thm@space@setup{%
\thm@preskip=8pt plus 2pt minus 4pt
\thm@postskip=\thm@preskip
}
\makeatother
================================================
FILE: previous_travis.yml
================================================
language: r
os:
- linux
- osx
before_install:
# - if [ $TRAVIS_OS_NAME = linux ]; then sudo apt-get update; fi
- if [ $TRAVIS_OS_NAME = linux ]; then sudo apt-get install -y ghostscript; sudo apt-get install -y libmagick++-dev; sudo add-apt-repository -y ppa:cran/poppler;sudo apt-get install -y libpoppler-cpp-dev; sudo apt-get install -y libv8-dev ; sudo apt-get install -y libudunits2-dev libgdal-dev libgeos-dev libproj-dev libfontconfig1-dev;fi
- if [[ "$TRAVIS_OS_NAME" == "osx" ]]; then brew install llvm; brew install v8; brew install poppler;
  export PATH="/usr/local/opt/llvm/bin:$PATH" &&
  export LDFLAGS="-L/usr/local/opt/llvm/lib" &&
  export CFLAGS="-I/usr/local/opt/llvm/include"; fi
cache:
  packages: yes
  directories:
  - $TRAVIS_BUILD_DIR/_bookdown_files
sudo: false
pandoc_version: 1.19.2.1
before_script:
- chmod +x ./_build.sh
- chmod +x ./_deploy.sh
- if [ $TRAVIS_OS_NAME = osx ]; then brew tap homebrew/cask; brew cask install phantomJS; brew install imagemagick@6; fi
script:
- R CMD build .
- R CMD INSTALL *tar.gz
- if [ $TRAVIS_OS_NAME = osx ]; then R CMD check *tar.gz ; fi
- if [ $TRAVIS_OS_NAME = linux ]; then R CMD check *tar.gz; fi
- if [ $TRAVIS_OS_NAME = osx ] && [[ $TRAVIS_COMMIT_MESSAGE != *"[nobook]"* ]]; then ./_build.sh && ./_deploy.sh; fi
================================================
FILE: style.css
================================================
p.caption {
color: #777;
margin-top: 10px;
}
p code {
white-space: inherit;
}
pre {
word-break: normal;
word-wrap: normal;
}
pre code {
white-space: inherit;
}
/*
* Admonitions
*
* Colors (title, body)
* warning: #f0b37e #ffedcc (orange)
* note: #6ab0de #e7f2fa (blue)
* tip: #1abc9c #dbfaf4 (green)
*/
.note {
padding: 0.5em;
background-color: #e7f2fa;
border-radius: 5px;
text-align: center;
}
.notel {
padding: 0.5em;
background-color: #e7f2fa;
border-radius: 5px;
text-align: left;
}
.warning {
padding: 0.5em;
background-color: #f0b37e;
border-radius: 5px;
text-align: center;
}
.warningl {
padding: 0.5em;
background-color: #f0b37e;
border-radius: 5px;
text-align: left;
}
.tip {
padding: 0.5em;
background-color: #dbfaf4;
border-radius: 5px;
text-align: center;
}
================================================
FILE: teachers/ForTeachers.md
================================================
# Meta Info For Teachers
This document contains info for teachers (at SciencesPo and elsewhere) who want to teach this course.
## Content
1. [Outline and Philosophy](#outline-and-philosophy)
2. [Details](#details)
3. [TODO list teachers](#todo-list-teachers)
4. [Student/Teacher feedback from first iteration of course](#student-and-teacher-feedback)
## Outline and Philosophy
* This is an introductory course to econometrics taught to 2nd year students at SciencesPo
* The course is mandatory for the Economics and Society major.
* Based on our experience teaching this course for many years, the traditional setup of teaching econometrics was found to be unsuitable.
* The traditional curriculum assumes some basic maths knowledge, summation notation for example, as well as some basic statistics.
* Both maths and stats are taught in the first year.
* It seems that for many students this is too abstract (or not interesting).
* The distribution of student evaluations was always *bimodal*: some students thought it was great, but didn’t go far enough, and a relatively larger number thought it was much too hard and they didn’t get much out of it.
* This edition of the course uses only minimal maths and statistics
* We are focusing on the lower mode of the student evaluation distribution mentioned above.
* We will use `R` to illustrate key concepts interactively.
* **Important**: this is not a course *about `R`*, in the sense that our primary goal is not to teach students how to program. (This is a very laudable goal in general, but we are constrained in this sense.)
* Our primary goal is for students to understand the basics of linear regression, *using `R`*. They will be exposed to some very basic `R` programming.
## Details
* course structure
* material: Everything the students need is contained in an [online code repository](https://github.com/ScPoEcon/ScPoEconometrics). In particular, this contains an `R` package with
* code that produces interactive `apps`, i.e. small server applications, used for illustration
* `tutorials`, which are worked examples that require some student input for completion
* code that produces the associated textbook
* textbook: The textbook is online at [https://scpoecon.github.io/ScPoEconometrics/](https://scpoecon.github.io/ScPoEconometrics/).
* It’s readable online in a browser (also on a mobile device), as an `epub` on an ebook reader, or as a `pdf`.
* It is still work in progress (contributions welcome!). Particularly chapter 1 needs drastic shortening.
* sessions: standard weekly meetings, 12 times per semester, 2 hours per session. The focus of the meetings will be to work on the tutorials, either alone or in small teams. The teacher will start each session with a short overview of the relevant chapter from the online textbook. The main task of the teacher will be to help students along the way and to break after each 20 min interval (or so) with short quizzes (more below).
* The book should be for home study, practical exercises are done in class
* Grades: Some weighted average between a final exam and bi-weekly online quizzes.
* Exams
- Both Exams and online quizzes rely on the amazing [R-exams](http://www.r-exams.org) package.
- We produce a pool of template questions, and the package generates random numbers to populate the questions with. Cheating becomes very hard.
* The package produces solutions and scannable pdfs for automatic grading.
* One final exam.
* can produce a mock exam before
* pen and paper. We could allow students to use a computer for calculations during the exam, but this carries a high risk of cheating or technical problems.
* Each teacher should supply as many exam questions as possible. We need a question bank from which to choose.
* More on this below; see [TODO](#todo-list-teachers).
* Online Quizzes (Homework)
- Part of the grade.
* weekly or bi-weekly
* Serve the purpose to make sure that they read the book
* [automatically put on moodle](https://moodle.sciences-po.fr/mod/quiz/view.php?id=114720)
* Can be automatically generated from our questions pool.
* We used moodle, but this works for pretty much all other online learning platforms.
* Kahoots
- Not part of the grade.
* Kahoot! is an online quiz platform widely used in teaching
* Kahoots should be given to students in class, just to have some fun and check they understand what is going on. They are played on a mobile phone or in a browser. Students choose nicknames. The best (fastest correct) answer wins, and a podium is shown at the end.
* quick demo:
* Students need to be able to see your screen.
* I (or you!) create intermediate quizzes before class at https://create.kahoot.it/
* In class, you launch a kahoot from *my kahoots* (click on *play*).
* students go to https://kahoot.it and enter quiz pin
* teachers should sign up and I can share my kahoots with them. see [TODO](#todo-list-teachers)
* Here is the [kahoot for chapter 2](https://play.kahoot.it/#/k/9dfe2cc0-ea38-491a-9e0b-fb55867fcdda)
* Communication
* slack: this is a chatroom-like environment that I have tested successfully in my other courses.
* every group gets their separate channel
* every teacher is responsible to manage questions in their group’s channel
* general questions should be asked in the #general channel
* using this technology is a viable way for me to maintain a global view of how this course is going in the various locations. If I can see what you and your students are talking about, we can react fast to adapt the course. On the other hand, if I have to read through several threaded emails back and forth between you and your students before I can understand what the problem is, this will be much harder (read: *impossible*) to do.
* I **strongly recommend** to communicate with your students via slack, not via email.
* When working with software and computers, there is **ALWAYS** another student who has exactly the same problem as the one you are currently emailing. The economies of scale are almost unlimited in this domain.
* You can for once share `code` in a readable way.
* I would prefer if you communicated with me as well on slack. You can send private messages.
## TODO list teachers
To ensure consistency in the department's approach to the *Introduction to Econometrics* curriculum, instructors are strongly encouraged to follow the guidelines below.
1. sign up to slack: send me an email at florian.oswald@sciencespo.fr so I can add you
2. get a free account on github.com
3. have a look at our course [code repository](https://github.com/ScPoEcon/ScPoEconometrics)
1. In particular, look at the [current list of issues](https://github.com/ScPoEcon/ScPoEconometrics/issues) and file new ones
4. Install `R`
5. Install the `R` package as described on the readme of the [code repository](https://github.com/ScPoEcon/ScPoEconometrics).
6. go through **all** the apps. Instructions always on the same readme.
1. This is important.
2. Please run all apps. If you find any trouble, please [file an issue](https://github.com/ScPoEcon/ScPoEconometrics/issues).
3. Make sure you understand what each app is supposed to teach. If it’s not clear, [file an issue](https://github.com/ScPoEcon/ScPoEconometrics/issues).
4. Feel free to suggest other apps! By [filing an issue](https://github.com/ScPoEcon/ScPoEconometrics/issues).
7. Create questions.
1. Have a look at the textbook for the level of difficulty you should aim at
2. You will be associated to the [private exams repo](https://github.com/floswald/ScPoMetricsExams) as soon as you send me your github user name (Point 2. above!). External teachers, please send me an email with that request.
3. I would like to get at the *very least* 4 questions from each teacher. They can be a mixture of short and long questions.
8. Have a close look at [the textbook](https://scpoecon.github.io/ScPoEconometrics/). If you have any suggestions about anything at all please [file an issue](https://github.com/ScPoEcon/ScPoEconometrics/issues).
9. Sign up for a free account at [https://kahoot.com/](https://kahoot.com/) so we can share short quizzes.
10. Please be vocal. This course is an experiment and we are sailing in uncharted waters. Every comment you have will be valuable for us. So [file an issue](https://github.com/ScPoEcon/ScPoEconometrics/issues), post a message on slack, or get in touch otherwise with anything at all!
11. Thank you for participating!
## Student and Teacher Feedback
### Course Iteration 1: September 2018. ScPo Paris and Regional Campuses.
#### Teachers half-term feedback:
##### T1
- Few problems at the beginning concerning the installation of packages: many people had to change their security options in order to install the packages. Now everything is working smoothly.
- Few people had problems opening the slides using safari and google chrome.
- I think that some of the students would like to see more “real world” examples (as the one on California student test scores).
- Two exchange students seem to have troubles understanding basic math concepts (one of them was not able to understand a simple linear equation).
- 10 to 15 students reported some issue with the quiz.
- They seem to like the format of the course.
##### T2
- Installations of R, RStudio, and packages were ok at the end of the first course
- I do not use the slides, I follow the book, projecting RStudio from my laptop
- Students do not use Slack but ask their questions during the course
- (personal opinion) the tidyverse framework arrives too early for students to see its value
- No problem with the quiz (we've tested only the first)
##### T3
- no specific problems with the package. Sometimes students using Macs have more difficulties because they need to adapt certain lines of code concerning import of files (folder paths etc.)
- students are sometimes surprised that certain functions can use only particular types of objects as arguments.
- several students had to retake the test because of the server collapse. Overall, the results are good, low grades are rare.
- Overall, nothing very peculiar or worrisome so far in my groups
##### T4
1. The main problem concerns the ScPoEconometrics package. Sometimes it suddenly stops working although it worked the day before. Otherwise, everything goes well.
2. Some students find that the book is hard to follow. Aside from the slides, I give them a synthesis of the R codes at the end of each chapter.
3. Students would like to know the weight of the moodle quizzes.
##### T5
1. Some students had problems when they updated the package and ran a tutorial; fortunately it seems to be OK as of the last session. More students had problems with the first test, but with the second one only one student has had problems so far. Students in my groups rarely interact in Slack, some even never check the messages.
2. Agree that the tidyverse seems to be technical and students were not very interested at this early stage.
3. The average of the quizzes is good.
4. I think the command should be kept simple since it’s hard for some students even to replicate the command.
##### T6
1. Some students had problems with package installation at the beginning. They seem to know how to interact with slack but don’t really use it.
2. Agree with the past comments about tidyverse, it’s too early for them to understand its interest. They seem to like real world examples. I think that having a kind of small applied project would be helpful as it seems that they just try to reproduce class results and not to play with R. I have 1 student with almost no math background. I think the slides on OLS transformations (normalization, demeaning) may be too cryptic at this stage.
3. 3-4 students had problems with the quiz. The average grade is very nice for now.
##### My response to teacher feedback
So my experience was overall similar to what you are writing, just to reassure you. Going forward, i.e. for the next edition of the course, I take the following messages out of what you wrote:
1. no tidyverse, or only later
2. more real world examples a la `Caschool` and/or an applied project
3. Slides on OLS transformation too much math
4. Some find book hard to follow.
All of those are good points. Let me just put some more realism into each point by highlighting that nothing comes for free. Again, this is mainly for my own future benefit, but please feel free to discuss.
1. The Tidyverse approach to cleaning data is easier than the corresponding solution using base R. This is related to the *real world* point: the example of reading an Excel dataset downloaded from the web is _very_ real world in this sense. You will always have to reshape the data somehow, and I am doubtful whether the base R route is simpler to understand.
2. Can produce more worked examples or projects. We did as much as we could with the tutorials so far, clearly the more the better.
3. I explicitly say that the math is only for whoever is interested on those particular slides. I think we should at least give the option for those interested to get a chance to see how stuff works. Debatable.
4. I need more info as to which parts of the book they find hard to follow. Please don’t say *all of it*.
#### Student Feedback
* At the time of writing, the official course evaluation on behalf of students has not yet been published. To be added here.
* I got some informal feedback during the semester.
1. some moodle/exam questions are not suitable for exams. For example, the question about a distribution being left/right skewed or unimodal etc. is not always unambiguously clear, given the random nature of the data.
2. the crashing moodle server caused some real pain. giving people grades under conditions of such technical frailty was quite borderline and I was tempted not to use the moodle quizzes in the grades at all.
3. The federated structure of SciencesPo (central paris and regional campuses) caused some frustration. It is hard to synchronize classrooms at a distance. Some people didn't find slack helpful.
4. Students who performed poorly on the final exam thought it was too hard/unfair. Students who performed well thought it was not unfair. Not much to learn from this. 80% of exam questions were using the *identical* template previously used in one of the online quizzes.
================================================
FILE: teachers/app-timeline.md
================================================
# App and Tutorial Schedule
This doc sets out a rough timeline for when to do which app or tutorial.
## Chapter 1
Nothing
## Chapter 2: Summarizing Data
* After slide *scatter plots*, do `runTutorial('chapter2')`.
- Discrete Data
- Continuous Data
- Estimation based on a sample
* Immediately after, `runTutorial('correlation')`
* After that, introduce the `aboutApp()` function. do `aboutApp("corr_continous")` (slightly different app, but fitting explanation)
* Finally in that chapter, I would recommend to go through the *entire* worked example in the book at 2.4.1 "Reading .csv data in the tidy way"
## Chapter 3: Linear Regression
This is by far the most important chapter, so we have a lot of apps. You should take as long for this chapter as you feel is necessary. It's the core of the course.
* After you showed figure 3.1 do `launchApp('reg_simple_arrows')`
* continue with `launchApp('reg_simple')`. explanation for the squares comes later, at this stage this is just intuition.
* continue in 3.1.2 to introduce SSR
* after this point, they **must** have the simple formula (3.1) and what each part means in their head for the rest of their lives. make sure that is the case.
* go back to `launchApp('reg_simple')` and explain the squares
* now `launchApp('SSR_cone')` and tell them that OLS solves exactly this minimization problem. spend good time there, explain all the numbers that are visible and that they can drag the 3D graph with their mouse to see better.
* now do `launchApp('reg_full')`. explain
- there are 10 different examples
- what happens when you increase the noise level?
- you should spend a lot of time with this app.
* Now we have the basics down. Next we talk about some simple restrictions on the basic model.
* what happens if we demean both x and y? `launchApp('demeaned_reg')`
* constrained regression: what if we have only an intercept, or only a slope? how does our result improve (with 0 intercept, say), if we then demean the data? `launchApp('reg_constrained')`
* what happens if we rescale either x or y or both by some number? say, what if instead of measuring wage in a regression in euros, we now measure it in 1000s of euros?
1. `runTutorial('rescaling')`
1. `launchApp('rescale')`
* go back to 3.1.3 in the book and define the simple formulae for both coefficients
* 3.1.4
- `launchApp('anscombe')`
- `launchApp('datasaurus')`
* Work through book 3.3 example till the end
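The demeaning and rescaling points in this chapter can be checked numerically before class. The following is a minimal sketch on simulated data; the variable names, the seed and the data-generating process are illustrative assumptions and are not taken from the package apps.

```r
# Simulated data: all names and numbers here are illustrative assumptions
set.seed(1)
x <- rnorm(100, mean = 5)
y <- 2 + 3 * x + rnorm(100)

# Baseline regression of y on x
b <- coef(lm(y ~ x))

# Demeaning both variables keeps the slope and sets the intercept to (numerically) zero
xd <- x - mean(x)
yd <- y - mean(y)
b_demeaned <- coef(lm(yd ~ xd))

# Measuring x in units of 1000 multiplies the slope by 1000; the intercept is unchanged
x_k <- x / 1000
b_rescaled <- coef(lm(y ~ x_k))

print(rbind(baseline = b, demeaned = b_demeaned, rescaled = b_rescaled))
```

Comparing the three rows side by side gives a quick sanity check to run alongside `launchApp('demeaned_reg')` and `launchApp('rescale')`.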
## Chapter 4: Standard Errors
* `launchApp('sampling')`
* `launchApp('standard_errors_simple')`
* `launchApp('standard_errors_changeN')`
* `launchApp('confidence_intervals')`
* `runTutorial('non_normal')`
## Chapter 5: Multiple Regression
* `launchApp('reg_multivariate')`
* `runTutorial('lm_example')`
* `launchApp('multicollinearity')`
## Chapter 6: Categorical Variables
* `launchApp('reg_dummy')`
* `launchApp('reg_dummy_example')`
================================================
FILE: teachers/session1-ouline.md
================================================
# Session 1
Teacher brings a laptop with Slack, R and RStudio installed. Our package code is installed on the laptop. The laptop is connected to the projector. ScPo-provided hardware allows neither Slack nor the installation of our package, so it is not useful.
## Welcome!
* Who am I?
* name
* experience (research, teaching, other)
* What this course tries to teach you?
* We want to teach you the basics of data analysis and Econometrics.
* We want you to try things out, rather than to be able to prove them formally
* For those of you very eager to derive formal and more rigorous insights, there will be ample opportunity later on, in a Masters or a PhD
* Our aim is for *everybody* to understand and to be able to run a linear regression with `R`.
* This is a brand new course.
* This means that we are quite happy to show you plenty of new things, but you should be aware there are still some rough edges. Please be patient if something does not work as expected - we are here to help!
## Meetings
* We meet once per week
* please bring your laptop each time
## Exam and Grading
* There will be quizzes on Moodle roughly every two weeks.
* There will be a final exam on paper.
* We will do online quizzes on kahoot.com, but those will not be part of your grade.
## Today
* We will talk about some logistical details first. You will need your computer running and connected to the internet, so why not start up now?
* Then we will have a first look at `R`.
### Communication
* We will talk to each other on Slack.
* Who is not yet signed up to Slack?
* This is a much better way to talk about issues with computer code than email
1. it *looks* nicer than in an email
2. Slack is like a chatroom, so other people see what you say. Odds are that several people have the same or a similar problem as you, so this is much more efficient in a chatroom.
* Let me quickly show you Slack. You should open Slack on your computer now as well.
1. [WAITS for all]
1. In the left panel you can see all the channels you are subscribed to. You can see I am subscribed to more channels than you are.
1. You should subscribe to *my* channel, so we can talk about things in this classroom. Just click on `Channels` and start typing my first name. You will see my channel appear, click on it, and finally click *join channel* at the bottom.
1. This channel is your first reference for any questions you have about the course.
1. Let me check that you are all in my channel now
1. [checks *members* in right panel]
1. I'll post an example message now in our channel to say hello to you all.
1. [posts hello message into their channel]
1. You can **react** to any post by clicking on the appropriate symbol at the top right corner of the post.
1. [reacts to hello message just posted]
1. Let me show you now how to nicely format computer code in a slack post. it's easy.
1. [starts typing x + y = 3 and alerts students to the appearing info just below the text box]
1. We want this to be formatted like ``code``. So we put this in backticks `` ` ``, like so: `` `x + y = 3` `` [hits enter]
1. If you want to write multiple lines of code, you could start with three backticks, and create a new line with `shift` and `enter` (`enter` alone sends the message!):
````
```
x = 3
y = 4
x + y
```
````
1. you can also attach files by clicking on the plus symbol.
1. Please don't post in the #general channel, as this is for public announcements for all courses.
1. Finally, you can send direct messages by clicking on a username, or on *direct messages* in your left panel.
### RStudio
* You all have R and RStudio installed?
* If not, install now and look at your neighbor's screen
* Let's all open RStudio!
* [make sure you have standard layout, from top left to bottom right source, environment, console, files/plots]
* open an empty script
* here is the console (bottom left): write some commands into it
* show that variables show up in environment if you assign a value (top right)
* make a base plot (not ggplot) and show where it appears
* write 2 lines of code in the open script file, execute each line (place cursor on line and hit cmd+enter or click on run)
* save the script file somewhere by clicking on the save symbol
* type `help(plot)` in the console and explain help file
### Let's get going with R!
* open https://scpoecon.github.io/ScPoEconometrics/R-intro.html and project to wall
* explain how the **book** works
* left: TOC
* menu bar on top:
* make TOC disappear
* search for a term
* choose text type
* edit this page of the book on github.com (to suggest a change or if you found a mistake)
* download as pdf or as epub.
* If you like what you see, on the right you can tweet and post to facebook about this book.
* All the code you see in the book actually works. so please copy and paste from it as much as you can!
### Continue with Slides!
* Start at 1.2.1: First Glossary
* Do some basics from 1.3
* Do 1.4
* Do 1.5
* Do 1.7 and install the package!
* make them load the library and check the version!
* keep going over the chapter:
* ideally you have your RStudio screen open and type commands as you go along
* we want them to type as many commands as possible!!!
* go until Task 1:
* 2 minute break!
* who is having any trouble with their computers, please come and see me now.
* then do task 1
* then keep going
================================================
FILE: teachers/tasks_ch1.Rmd
================================================
---
title: "tasks for session 1"
author: "Florian Oswald"
date: "8/18/2018"
output:
  pdf_document: default
  html_document: default
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
# task 1
1. Create a vector of five ones, i.e. `[1,1,1,1,1]`
`rep(1,5)`
1. Notice that the colon operator `a:b` is just short for *construct a sequence **from** `a` **to** `b`*. Create a vector that counts down from 10 to 0, i.e. it looks like `10,9,8,7,6,5,4,3,2,1,0`!
`10:0`
1. the `rep` function takes additional arguments `times` (as above), and `each`, which tells you how often *each element* should be repeated (as opposed to the entire input vector). Use `rep` to create a vector that looks like this: `1 1 1 2 2 2 3 3 3 1 1 1 2 2 2 3 3 3`
`rep(1:3,times=2,each=3)`
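As an optional illustration for the last item, the chunk below contrasts `times` and `each` on their own before combining them. It is a small sketch for projecting in class, not part of the student handout.

```{r}
rep(1:3, times = 2)            # 1 2 3 1 2 3
rep(1:3, each = 2)             # 1 1 2 2 3 3
rep(1:3, times = 2, each = 3)  # 1 1 1 2 2 2 3 3 3 1 1 1 2 2 2 3 3 3
```

When both arguments are given, `each` is applied first and the result is then repeated `times` times.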
# task 2
1. Create a vector filled with 10 numbers drawn from the uniform distribution (hint: use function `runif`) and store them in `x`. `x = runif(10)`
1. Using logical subsetting as above, get all the elements of `x` which are larger than 0.5, and store them in `y`. `y = x[x>0.5]`
1. using the function `which`, store the *indices* of all the elements of `x` which are larger than 0.5 in `iy`. `iy = which(x>0.5)`
1. Check that `y` and `x[iy]` are identical. `identical(y,x[iy])` or `all(y == x[iy])`
# Task 3
1. Create a vector containing `1,2,3,4,5` called v. `v = 1:5`
1. Create a (2,5) matrix `m` containing the data `1,2,3,4,5,6,7,8,9,10`. The first row should be `1,2,3,4,5`. `m = matrix(data = 1:10,nrow=2,ncol=5,byrow=TRUE)`
1. Perform matrix multiplication of `m` with `v`. Use the command `%*%`. What dimension does the output have? `dim(m %*% v)`, which is `2 1`.
1. Why does `v %*% m` not work? non-conformable
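The conformability argument can be demonstrated with a short chunk. This is a sketch for the teacher's screen; wrapping the failing product in `try()` is an addition here so that the document still knits.

```{r}
v <- 1:5
m <- matrix(data = 1:10, nrow = 2, ncol = 5, byrow = TRUE)

dim(m %*% v)   # 2 1: (2 x 5) times (5 x 1)

# v %*% m fails: R can treat v as 1 x 5 or as 5 x 1, and neither fits the 2 rows of m
res <- try(v %*% m, silent = TRUE)
inherits(res, "try-error")   # TRUE

# Transposing m makes the product conformable: (1 x 5) times (5 x 2)
dim(v %*% t(m))   # 1 2
```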
# Task 4
1. Copy and paste the above code for `ex_list` into your R session. Remember that `list` can hold any kind of `R` object. Like...another list! So, create a new list `new_list` that has two fields: a first field called "this" with string content `"is awesome"`, and a second field called "ex_list" that contains `ex_list`. `new_list = list(this = "is awesome", ex_list = ex_list)`
1. Accessing members is like in a plain list, just with several layers now. Get the element `c` from `ex_list` in `new_list`! `new_list$ex_list$c`
1. Compose a new string out of the first element in `new_list`, the element under label `this`. Use the function `paste` to print `R is awesome` to your screen. `paste("R",new_list$this)`
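
The nested access can be tried out with the sketch below. The contents of `ex_list` here are a stand-in assumed for illustration; in class, use the `ex_list` defined in the book.

```r
# Stand-in for the book's `ex_list` (assumed contents, for illustration only).
ex_list  <- list(a = c(1, 2, 3, 4), b = TRUE, c = "Hello!")
new_list <- list(this = "is awesome", ex_list = ex_list)

inner_c  <- new_list$ex_list$c              # `$` chained through both layers
inner_c2 <- new_list[["ex_list"]][["c"]]    # same element via `[[`

sentence <- paste("R", new_list$this)       # paste separates with a space by default

print(inner_c)
print(identical(inner_c, inner_c2))
print(sentence)
```
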
# Task 5
1. How many observations are there in `mtcars`? `nrow(mtcars)`
1. How many variables? `ncol(mtcars)`
1. What is the average value of `mpg`? `mean(mtcars$mpg)`
1. What is the average value of `mpg` for cars with more than 4 cylinders, i.e. with `cyl>4`? `mean(subset(mtcars,subset=cyl>4)$mpg)`
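
The answers above can be checked in one go. The second route to the conditional mean, via logical indexing on the column, is an alternative added here and is not required by the task.

```r
n_obs  <- nrow(mtcars)                          # number of observations (rows)
n_vars <- ncol(mtcars)                          # number of variables (columns)

mean_all <- mean(mtcars$mpg)                    # unconditional mean of mpg
mean_sub <- mean(subset(mtcars, cyl > 4)$mpg)   # mean of mpg for cars with cyl > 4
mean_idx <- mean(mtcars$mpg[mtcars$cyl > 4])    # same quantity via logical indexing

print(c(n_obs, n_vars))
print(mean_all)
print(all.equal(mean_sub, mean_idx))
```
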
# Task 6
1. Write a for loop that counts down from 10 to 1, printing the value of the iterator to the screen.
```{r}
for (i in 10:1){
  print(i)
}
```
1. Modify that loop to write "i iterations to go" where `i` is the iterator
```{r}
for (i in 10:1){
  print(paste(i,"iterations to go"))
}
```
1. Modify that loop so that each iteration takes roughly one second. You can achieve that by adding the command `Sys.sleep(1)` below the line that prints "i iterations to go".
```{r}
for (i in 10:1){
  print(paste(i,"iterations to go"))
  Sys.sleep(1)
}
```
================================================
FILE: teachers/tasks_ch2.Rmd
================================================
---
title: "tasks for chapter 2"
author: "Florian Oswald"
date: "8/18/2018"
output:
  pdf_document: default
  html_document: default
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
# Task
1. Make sure to have the `mpg` dataset loaded by typing `data(mpg)` (and `library(ggplot2)` if you haven't!). Use the `table` function to find out how many cars were built by *mercury*. `table(mpg$manufacturer)`, `4`.
1. What is the average year in which the Audis in this dataset were built? Use the function `mean` on the subset of column `year` that corresponds to `audi`. (Be careful: subsetting a `tibble` returns a `tibble`, not a vector, so extract the `year` column after you have subset the `tibble`.) `mean(subset(mpg,subset=manufacturer=="audi")$year)`, `mean(mpg[mpg$manufacturer=="audi","year"]$year)`
1. Use the `dplyr` piping syntax from above first with `group_by` and then with `summarise(newvar=your_expression)` to find the mean `year` by manufacturer!
```{r}
library(ggplot2)
library(dplyr)
mpg %>%
  group_by(manufacturer) %>%
  summarise(year=mean(year))
```
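
For students who ask how the same split-and-summarise idea looks without `dplyr`, here is a base R sketch. It deliberately uses the built-in `mtcars` data instead of `mpg`, so it runs without `ggplot2`; that swap, and the choice of `tapply` and `aggregate`, are assumptions made for illustration and not part of the task.

```r
# Mean mpg by number of cylinders, base R only.
by_cyl <- tapply(mtcars$mpg, mtcars$cyl, mean)
print(by_cyl)

# Same numbers as a data frame, closer in shape to the summarise() output.
by_cyl_df <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
print(by_cyl_df)
```

`tapply` returns a named array with one entry per group, while `aggregate` returns one row per group, which is the layout `group_by` followed by `summarise` produces.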
================================================
FILE: toc.css
================================================
#TOC ul,
#TOC li,
#TOC span,
#TOC a {
margin: 0;
padding: 0;
position: relative;
}
#TOC {
line-height: 1;
border-radius: 5px 5px 0 0;
background: #141414;
background: linear-gradient(to bottom, #333333 0%, #141414 100%);
border-bottom: 2px solid #0fa1e0;
width: auto;
}
#TOC:after,
#TOC ul:after {
content: '';
display: block;
clear: both;
}
#TOC a {
background: #141414;
background: linear-gradient(to bottom, #333333 0%, #141414 100%);
color: #ffffff;
display: block;
padding: 19px 20px;
text-decoration: none;
text-shadow: none;
}
#TOC ul {
list-style: none;
}
#TOC > ul > li {
display: inline-block;
float: left;
margin: 0;
}
#TOC > ul > li > a {
color: #ffffff;
}
#TOC > ul > li:hover:after {
content: '';
display: block;
width: 0;
height: 0;
position: absolute;
left: 50%;
bottom: 0;
border-left: 10px solid transparent;
border-right: 10px solid transparent;
border-bottom: 10px solid #0fa1e0;
margin-left: -10px;
}
#TOC > ul > li:first-child > a {
border-radius: 5px 0 0 0;
}
#TOC.align-right > ul > li:first-child > a,
#TOC.align-center > ul > li:first-child > a {
border-radius: 0;
}
#TOC.align-right > ul > li:last-child > a {
border-radius: 0 5px 0 0;
}
#TOC > ul > li.active > a,
#TOC > ul > li:hover > a {
color: #ffffff;
box-shadow: inset 0 0 3px #000000;
background: #070707;
background: linear-gradient(to bottom, #262626 0%, #070707 100%);
}
#TOC .has-sub {
z-index: 1;
}
#TOC .has-sub:hover > ul {
display: block;
}
#TOC .has-sub ul {
display: none;
position: absolute;
width: 200px;
top: 100%;
left: 0;
}
#TOC .has-sub ul li a {
background: #0fa1e0;
border-bottom: 1px dotted #31b7f1;
filter: none;
display: block;
line-height: 120%;
padding: 10px;
color: #ffffff;
}
#TOC .has-sub ul li:hover a {
background: #0c7fb0;
}
#TOC ul ul li:hover > a {
color: #ffffff;
}
#TOC .has-sub .has-sub:hover > ul {
display: block;
}
#TOC .has-sub .has-sub ul {
display: none;
position: absolute;
left: 100%;
top: 0;
}
#TOC .has-sub .has-sub ul li a {
background: #0c7fb0;
border-bottom: 1px dotted #31b7f1;
}
#TOC .has-sub .has-sub ul li a:hover {
background: #0a6d98;
}
#TOC ul ul li.last > a,
#TOC ul ul li:last-child > a,
#TOC ul ul ul li.last > a,
#TOC ul ul ul li:last-child > a,
#TOC .has-sub ul li:last-child > a,
#TOC .has-sub ul li.last > a {
border-bottom: 0;
}
#TOC ul {
font-size: 1.2rem;
}