Repository: linkedin/school-of-sre
Branch: main
Commit: 22979a6e89d4
Files: 115
Total size: 708.8 KB

Directory structure:
gitextract_zk337uau/

├── .github/
│   └── workflows/
│       ├── build.yml
│       └── gh-deploy.yml
├── .gitignore
├── LICENSE
├── NOTICE
├── courses/
│   ├── CODE_OF_CONDUCT.md
│   ├── CONTRIBUTING.md
│   ├── index.md
│   ├── level101/
│   │   ├── big_data/
│   │   │   ├── evolution.md
│   │   │   ├── intro.md
│   │   │   └── tasks.md
│   │   ├── databases_nosql/
│   │   │   ├── further_reading.md
│   │   │   ├── intro.md
│   │   │   └── key_concepts.md
│   │   ├── databases_sql/
│   │   │   ├── backup_recovery.md
│   │   │   ├── concepts.md
│   │   │   ├── conclusion.md
│   │   │   ├── innodb.md
│   │   │   ├── intro.md
│   │   │   ├── lab.md
│   │   │   ├── mysql.md
│   │   │   ├── operations.md
│   │   │   ├── query_performance.md
│   │   │   ├── replication.md
│   │   │   └── select_query.md
│   │   ├── git/
│   │   │   ├── branches.md
│   │   │   ├── conclusion.md
│   │   │   ├── git-basics.md
│   │   │   └── github-hooks.md
│   │   ├── linux_basics/
│   │   │   ├── command_line_basics.md
│   │   │   ├── conclusion.md
│   │   │   ├── intro.md
│   │   │   └── linux_server_administration.md
│   │   ├── linux_networking/
│   │   │   ├── conclusion.md
│   │   │   ├── dns.md
│   │   │   ├── http.md
│   │   │   ├── intro.md
│   │   │   ├── ipr.md
│   │   │   ├── tcp.md
│   │   │   └── udp.md
│   │   ├── messagequeue/
│   │   │   ├── further_reading.md
│   │   │   ├── intro.md
│   │   │   └── key_concepts.md
│   │   ├── metrics_and_monitoring/
│   │   │   ├── alerts.md
│   │   │   ├── best_practices.md
│   │   │   ├── command-line_tools.md
│   │   │   ├── conclusion.md
│   │   │   ├── introduction.md
│   │   │   ├── observability.md
│   │   │   └── third-party_monitoring.md
│   │   ├── python_web/
│   │   │   ├── intro.md
│   │   │   ├── python-concepts.md
│   │   │   ├── python-web-flask.md
│   │   │   ├── sre-conclusion.md
│   │   │   └── url-shorten-app.md
│   │   ├── security/
│   │   │   ├── conclusion.md
│   │   │   ├── fundamentals.md
│   │   │   ├── intro.md
│   │   │   ├── network_security.md
│   │   │   ├── threats_attacks_defences.md
│   │   │   └── writing_secure_code.md
│   │   └── systems_design/
│   │       ├── availability.md
│   │       ├── conclusion.md
│   │       ├── fault-tolerance.md
│   │       ├── intro.md
│   │       └── scalability.md
│   ├── level102/
│   │   ├── .level102
│   │   ├── containerization_and_orchestration/
│   │   │   ├── conclusion.md
│   │   │   ├── containerization_with_docker.md
│   │   │   ├── intro.md
│   │   │   ├── intro_to_containers.md
│   │   │   └── orchestration_with_kubernetes.md
│   │   ├── continuous_integration_and_continuous_delivery/
│   │   │   ├── cicd_brief_history.md
│   │   │   ├── conclusion.md
│   │   │   ├── continuous_delivery_release_pipeline.md
│   │   │   ├── continuous_integration_build_pipeline.md
│   │   │   ├── introduction.md
│   │   │   ├── introduction_to_cicd.md
│   │   │   └── jenkins_cicd_pipeline_hands_on_lab.md
│   │   ├── linux_intermediate/
│   │   │   ├── archiving_backup.md
│   │   │   ├── bashscripting.md
│   │   │   ├── conclusion.md
│   │   │   ├── introduction.md
│   │   │   ├── introvim.md
│   │   │   ├── package_management.md
│   │   │   └── storage_media.md
│   │   ├── networking/
│   │   │   ├── conclusion.md
│   │   │   ├── infrastructure-features.md
│   │   │   ├── introduction.md
│   │   │   ├── rtt.md
│   │   │   ├── scale.md
│   │   │   └── security.md
│   │   ├── system_calls_and_signals/
│   │   │   ├── conclusion.md
│   │   │   ├── intro.md
│   │   │   ├── signals.md
│   │   │   └── system_calls.md
│   │   ├── system_design/
│   │   │   ├── conclusion.md
│   │   │   ├── intro.md
│   │   │   ├── large-system-design.md
│   │   │   ├── resiliency.md
│   │   │   ├── scaling-beyond-the-datacenter.md
│   │   │   └── scaling.md
│   │   └── system_troubleshooting_and_performance/
│   │       ├── conclusion.md
│   │       ├── important-tools.md
│   │       ├── introduction.md
│   │       ├── performance-improvements.md
│   │       ├── troubleshooting-example.md
│   │       └── troubleshooting.md
│   ├── sre_community.md
│   └── stylesheets/
│       └── custom.css
├── mkdocs.yml
├── overrides/
│   └── partials/
│       ├── header.html
│       ├── nav-item.html
│       └── nav.html
└── requirements.txt

================================================
FILE CONTENTS
================================================

================================================
FILE: .github/workflows/build.yml
================================================
name: Build mkdocs


on:
  pull_request:

  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:


jobs:
  build:
    runs-on: ubuntu-latest

    steps:
      - uses: actions/checkout@v6
      - name: Set up Python
        uses: actions/setup-python@v6
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
      - name: Build
        run: |
          mkdocs build


================================================
FILE: .github/workflows/gh-deploy.yml
================================================
name: Deploy to gh-pages

# Controls when the action will run. 
on:
  # Triggers the workflow on push events, but only for the main branch
  push:
    branches: [ main ]

  # Allows you to run this workflow manually from the Actions tab
  workflow_dispatch:

# A workflow run is made up of one or more jobs that can run sequentially or in parallel
jobs:
  build-and-deploy:
    # The type of runner that the job will run on
    runs-on: ubuntu-latest

    # Steps represent a sequence of tasks that will be executed as part of the job
    steps:
      # Checks-out your repository under $GITHUB_WORKSPACE, so your job can access it
      - uses: actions/checkout@v6
        with:
          # fetch-depth: 0 fetches the full history of all branches and tags. Needed because the deploy pushes to the gh-pages branch
          fetch-depth: 0
      - name: Set up Python
        uses: actions/setup-python@v6
        with:
          python-version: '3.11'
      - name: Install dependencies
        run: |
          python -m pip install --upgrade pip
          pip install -r requirements.txt
      - name: Deploy
        run: |
          git config user.name github-actions
          git config user.email github-actions@github.com
          mkdocs gh-deploy


================================================
FILE: .gitignore
================================================
.DS_Store
.venv
site/
.vscode/


================================================
FILE: LICENSE
================================================
Creative Commons Attribution 4.0 International Public License

By exercising the Licensed Rights (defined below), You accept and agree to be bound by the terms and conditions of this Creative Commons Attribution 4.0 International Public License ("Public License"). To the extent this Public License may be interpreted as a contract, You are granted the Licensed Rights in consideration of Your acceptance of these terms and conditions, and the Licensor grants You such rights in consideration of benefits the Licensor receives from making the Licensed Material available under these terms and conditions.

Section 1 – Definitions.

Adapted Material means material subject to Copyright and Similar Rights that is derived from or based upon the Licensed Material and in which the Licensed Material is translated, altered, arranged, transformed, or otherwise modified in a manner requiring permission under the Copyright and Similar Rights held by the Licensor. For purposes of this Public License, where the Licensed Material is a musical work, performance, or sound recording, Adapted Material is always produced where the Licensed Material is synched in timed relation with a moving image.
Adapter's License means the license You apply to Your Copyright and Similar Rights in Your contributions to Adapted Material in accordance with the terms and conditions of this Public License.
Copyright and Similar Rights means copyright and/or similar rights closely related to copyright including, without limitation, performance, broadcast, sound recording, and Sui Generis Database Rights, without regard to how the rights are labeled or categorized. For purposes of this Public License, the rights specified in Section 2(b)(1)-(2) are not Copyright and Similar Rights.
Effective Technological Measures means those measures that, in the absence of proper authority, may not be circumvented under laws fulfilling obligations under Article 11 of the WIPO Copyright Treaty adopted on December 20, 1996, and/or similar international agreements.
Exceptions and Limitations means fair use, fair dealing, and/or any other exception or limitation to Copyright and Similar Rights that applies to Your use of the Licensed Material.
Licensed Material means the artistic or literary work, database, or other material to which the Licensor applied this Public License.
Licensed Rights means the rights granted to You subject to the terms and conditions of this Public License, which are limited to all Copyright and Similar Rights that apply to Your use of the Licensed Material and that the Licensor has authority to license.
Licensor means the individual(s) or entity(ies) granting rights under this Public License.
Share means to provide material to the public by any means or process that requires permission under the Licensed Rights, such as reproduction, public display, public performance, distribution, dissemination, communication, or importation, and to make material available to the public including in ways that members of the public may access the material from a place and at a time individually chosen by them.
Sui Generis Database Rights means rights other than copyright resulting from Directive 96/9/EC of the European Parliament and of the Council of 11 March 1996 on the legal protection of databases, as amended and/or succeeded, as well as other essentially equivalent rights anywhere in the world.
You means the individual or entity exercising the Licensed Rights under this Public License. Your has a corresponding meaning.
Section 2 – Scope.

License grant.
Subject to the terms and conditions of this Public License, the Licensor hereby grants You a worldwide, royalty-free, non-sublicensable, non-exclusive, irrevocable license to exercise the Licensed Rights in the Licensed Material to:
reproduce and Share the Licensed Material, in whole or in part; and
produce, reproduce, and Share Adapted Material.
Exceptions and Limitations. For the avoidance of doubt, where Exceptions and Limitations apply to Your use, this Public License does not apply, and You do not need to comply with its terms and conditions.
Term. The term of this Public License is specified in Section 6(a).
Media and formats; technical modifications allowed. The Licensor authorizes You to exercise the Licensed Rights in all media and formats whether now known or hereafter created, and to make technical modifications necessary to do so. The Licensor waives and/or agrees not to assert any right or authority to forbid You from making technical modifications necessary to exercise the Licensed Rights, including technical modifications necessary to circumvent Effective Technological Measures. For purposes of this Public License, simply making modifications authorized by this Section 2(a)(4) never produces Adapted Material.
Downstream recipients.
Offer from the Licensor – Licensed Material. Every recipient of the Licensed Material automatically receives an offer from the Licensor to exercise the Licensed Rights under the terms and conditions of this Public License.
No downstream restrictions. You may not offer or impose any additional or different terms or conditions on, or apply any Effective Technological Measures to, the Licensed Material if doing so restricts exercise of the Licensed Rights by any recipient of the Licensed Material.
No endorsement. Nothing in this Public License constitutes or may be construed as permission to assert or imply that You are, or that Your use of the Licensed Material is, connected with, or sponsored, endorsed, or granted official status by, the Licensor or others designated to receive attribution as provided in Section 3(a)(1)(A)(i).
Other rights.

Moral rights, such as the right of integrity, are not licensed under this Public License, nor are publicity, privacy, and/or other similar personality rights; however, to the extent possible, the Licensor waives and/or agrees not to assert any such rights held by the Licensor to the limited extent necessary to allow You to exercise the Licensed Rights, but not otherwise.
Patent and trademark rights are not licensed under this Public License.
To the extent possible, the Licensor waives any right to collect royalties from You for the exercise of the Licensed Rights, whether directly or through a collecting society under any voluntary or waivable statutory or compulsory licensing scheme. In all other cases the Licensor expressly reserves any right to collect such royalties.
Section 3 – License Conditions.

Your exercise of the Licensed Rights is expressly made subject to the following conditions.

Attribution.

If You Share the Licensed Material (including in modified form), You must:

retain the following if it is supplied by the Licensor with the Licensed Material:
identification of the creator(s) of the Licensed Material and any others designated to receive attribution, in any reasonable manner requested by the Licensor (including by pseudonym if designated);
a copyright notice;
a notice that refers to this Public License;
a notice that refers to the disclaimer of warranties;
a URI or hyperlink to the Licensed Material to the extent reasonably practicable;
indicate if You modified the Licensed Material and retain an indication of any previous modifications; and
indicate the Licensed Material is licensed under this Public License, and include the text of, or the URI or hyperlink to, this Public License.
You may satisfy the conditions in Section 3(a)(1) in any reasonable manner based on the medium, means, and context in which You Share the Licensed Material. For example, it may be reasonable to satisfy the conditions by providing a URI or hyperlink to a resource that includes the required information.
If requested by the Licensor, You must remove any of the information required by Section 3(a)(1)(A) to the extent reasonably practicable.
If You Share Adapted Material You produce, the Adapter's License You apply must not prevent recipients of the Adapted Material from complying with this Public License.
Section 4 – Sui Generis Database Rights.

Where the Licensed Rights include Sui Generis Database Rights that apply to Your use of the Licensed Material:

for the avoidance of doubt, Section 2(a)(1) grants You the right to extract, reuse, reproduce, and Share all or a substantial portion of the contents of the database;
if You include all or a substantial portion of the database contents in a database in which You have Sui Generis Database Rights, then the database in which You have Sui Generis Database Rights (but not its individual contents) is Adapted Material; and
You must comply with the conditions in Section 3(a) if You Share all or a substantial portion of the contents of the database.
For the avoidance of doubt, this Section 4 supplements and does not replace Your obligations under this Public License where the Licensed Rights include other Copyright and Similar Rights.
Section 5 – Disclaimer of Warranties and Limitation of Liability.

Unless otherwise separately undertaken by the Licensor, to the extent possible, the Licensor offers the Licensed Material as-is and as-available, and makes no representations or warranties of any kind concerning the Licensed Material, whether express, implied, statutory, or other. This includes, without limitation, warranties of title, merchantability, fitness for a particular purpose, non-infringement, absence of latent or other defects, accuracy, or the presence or absence of errors, whether or not known or discoverable. Where disclaimers of warranties are not allowed in full or in part, this disclaimer may not apply to You.
To the extent possible, in no event will the Licensor be liable to You on any legal theory (including, without limitation, negligence) or otherwise for any direct, special, indirect, incidental, consequential, punitive, exemplary, or other losses, costs, expenses, or damages arising out of this Public License or use of the Licensed Material, even if the Licensor has been advised of the possibility of such losses, costs, expenses, or damages. Where a limitation of liability is not allowed in full or in part, this limitation may not apply to You.
The disclaimer of warranties and limitation of liability provided above shall be interpreted in a manner that, to the extent possible, most closely approximates an absolute disclaimer and waiver of all liability.
Section 6 – Term and Termination.

This Public License applies for the term of the Copyright and Similar Rights licensed here. However, if You fail to comply with this Public License, then Your rights under this Public License terminate automatically.
Where Your right to use the Licensed Material has terminated under Section 6(a), it reinstates:

automatically as of the date the violation is cured, provided it is cured within 30 days of Your discovery of the violation; or
upon express reinstatement by the Licensor.
For the avoidance of doubt, this Section 6(b) does not affect any right the Licensor may have to seek remedies for Your violations of this Public License.
For the avoidance of doubt, the Licensor may also offer the Licensed Material under separate terms or conditions or stop distributing the Licensed Material at any time; however, doing so will not terminate this Public License.
Sections 1, 5, 6, 7, and 8 survive termination of this Public License.
Section 7 – Other Terms and Conditions.

The Licensor shall not be bound by any additional or different terms or conditions communicated by You unless expressly agreed.
Any arrangements, understandings, or agreements regarding the Licensed Material not stated herein are separate from and independent of the terms and conditions of this Public License.
Section 8 – Interpretation.

For the avoidance of doubt, this Public License does not, and shall not be interpreted to, reduce, limit, restrict, or impose conditions on any use of the Licensed Material that could lawfully be made without permission under this Public License.
To the extent possible, if any provision of this Public License is deemed unenforceable, it shall be automatically reformed to the minimum extent necessary to make it enforceable. If the provision cannot be reformed, it shall be severed from this Public License without affecting the enforceability of the remaining terms and conditions.
No term or condition of this Public License will be waived and no failure to comply consented to unless expressly agreed to by the Licensor.
Nothing in this Public License constitutes or may be interpreted as a limitation upon, or waiver of, any privileges and immunities that apply to the Licensor or You, including from the legal processes of any jurisdiction or authority.


================================================
FILE: NOTICE
================================================
Copyright 2020 LinkedIn Corporation
All Rights Reserved.

Licensed under the Creative Commons Attribution 4.0 International Public License (the "License").
See LICENSE in the project root for license information.

This product includes:
* N/A


================================================
FILE: courses/CODE_OF_CONDUCT.md
================================================
This code of conduct outlines expectations for participation in LinkedIn-managed open source communities, as well as steps for reporting unacceptable behavior. We are committed to providing a welcoming and inspiring community for all. People violating this code of conduct may be banned from the community.

Our open source communities strive to:

* **Be friendly and patient:** Remember you might not be communicating in someone else's primary spoken or programming language, and others may not have your level of understanding.
* **Be welcoming:** Our communities welcome and support people of all backgrounds and identities. This includes, but is not limited to, members of any race, ethnicity, culture, national origin, color, immigration status, social and economic class, educational level, sex, sexual orientation, gender identity and expression, age, size, family status, political belief, religion, and mental and physical ability.
* **Be respectful:** We are a world-wide community of professionals, and we conduct ourselves professionally. Disagreement is no excuse for poor behavior and poor manners. Disrespectful and unacceptable behavior includes, but is not limited to:
    * Violent threats or language.
    * Discriminatory or derogatory jokes and language.
    * Posting sexually explicit or violent material.
    * Posting, or threatening to post, people's personally identifying information ("doxing").
    * Insults, especially those using discriminatory terms or slurs.
    * Behavior that could be perceived as sexual attention.
    * Advocating for or encouraging any of the above behaviors.
* **Understand disagreements:** Disagreements, both social and technical, are useful learning opportunities. Seek to understand the other viewpoints and resolve differences constructively.
* This code is not exhaustive or complete. It serves to capture our common understanding of a productive, collaborative environment. We expect the code to be followed in spirit as much as in the letter.

### Scope

This code of conduct applies to all repos and communities for LinkedIn-managed open source projects regardless of whether or not the repo explicitly calls out its use of this code. The code also applies in public spaces when an individual is representing a project or its community. Examples include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers.

Note: Some LinkedIn-managed communities have codes of conduct that pre-date this document and issue resolution process. While communities are not required to change their code, they are expected to use the resolution process outlined here. The review team will coordinate with the communities involved to address your concerns.

### Reporting Code of Conduct Issues

We encourage all communities to resolve issues on their own whenever possible. This builds a broader and deeper understanding and ultimately a healthier interaction. In the event that an issue cannot be resolved locally, please feel free to report your concerns by contacting [oss@linkedin.com](mailto:oss@linkedin.com).

In your report, please include:

*   Your contact information.
*   Names (real, usernames or pseudonyms) of any individuals involved. If there are additional witnesses, please include them as well.
*   Your account of what occurred, and if you believe the incident is ongoing. If there is a publicly available record (e.g. a mailing list archive or a public chat log), please include a link or attachment.
*   Any additional information that may be helpful.

All reports will be reviewed by a multi-person team and will result in a response that is deemed necessary and appropriate to the circumstances. Where additional perspectives are needed, the team may seek insight from others with relevant expertise or experience. The confidentiality of the person reporting the incident will be kept at all times. Involved parties are never part of the review team.

Anyone asked to stop unacceptable behavior is expected to comply immediately. If an individual engages in unacceptable behavior, the review team may take any action they deem appropriate, including a permanent ban from the community.

_This code of conduct is based on the [Microsoft](https://opensource.microsoft.com/codeofconduct/) Open Source Code of Conduct which was based on the [template](http://todogroup.org/opencodeofconduct) established by the [TODO Group](http://todogroup.org/) and used by numerous other large communities (e.g., [Facebook](https://code.facebook.com/pages/876921332402685/open-source-code-of-conduct), [Yahoo](https://yahoo.github.io/codeofconduct), [Twitter](https://engineering.twitter.com/opensource/code-of-conduct), [GitHub](http://todogroup.org/opencodeofconduct/#opensource@github.com)) and the Scope section from the [Contributor Covenant version 1.4](http://contributor-covenant.org/version/1/4/)._

================================================
FILE: courses/CONTRIBUTING.md
================================================
We realise that the initial content we created is just a starting point, and we hope that the community can help in the journey of refining and extending the content.

As a contributor, you represent that the content you submit is not plagiarised. By submitting the content, you (and, if applicable, your employer) are licensing the submitted content to LinkedIn and the open source community subject to the Creative Commons Attribution 4.0 International Public License.

*Repository URL*: [https://github.com/linkedin/school-of-sre](https://github.com/linkedin/school-of-sre)

### Contributing Guidelines
Ensure that your contribution follows these guidelines:

* It should be about principles and concepts that can be applied in any company or individual project. Do not focus on particular tools or tech stacks (which usually change over time).
* It should adhere to the [Code of Conduct](/school-of-sre/CODE_OF_CONDUCT/).
* It should be relevant to the roles and responsibilities of an SRE.
* It should be locally tested (see "Building and testing locally" below) and well-formatted.
* It is good practice to open an issue first and discuss your changes before submitting a pull request. This way, you can incorporate ideas from others before you even start.

### Building and testing locally
Run the following commands to build and view the site locally before opening a PR.

```shell
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
mkdocs build
mkdocs serve
```

### Opening a PR
Follow the [GitHub PR workflow](https://guides.github.com/introduction/flow/) for your contributions.

Fork this repo, create a feature branch, commit your changes and open a PR to this repo.

================================================
FILE: courses/index.md
================================================
# School of SRE

<img src="img/sos.png" width=200 >

Site Reliability Engineers (SREs) sit at the intersection of software engineering and systems engineering. While there are potentially infinite permutations and combinations of how infrastructure and software components can be put together to achieve an objective, focusing on foundational skills allows SREs to work with complex systems and software, regardless of whether these systems are proprietary, third-party, open systems, run on cloud/on-prem infrastructure, etc. In particular, it is important to gain a deep understanding of how these areas of systems and infrastructure relate to and interact with each other. The combination of software and systems engineering skills is rare and is generally built over time with exposure to a wide variety of infrastructure, systems, and software.

SREs bring in engineering practices to keep the site up. Each distributed system is an agglomeration of many components. SREs validate business requirements, convert them to SLAs for each of the components that constitute the distributed system, monitor and measure adherence to SLAs, re-architect or scale out to mitigate or avoid SLA breaches, add these learnings as feedback to new systems or projects and thereby reduce operational toil. Hence SREs play a vital role right from the day 0 design of the system. 
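
As a rough illustration of turning an end-to-end requirement into per-component targets, the following minimal sketch does the arithmetic for a request path in which every component must be up. The component names and availability figures are hypothetical, and the calculation assumes independent failures, which real systems only approximate.

```python
from math import prod


def serial_availability(component_availabilities):
    """Availability of a request path that needs every component to be up.

    Assumes component failures are independent (a simplification).
    """
    return prod(component_availabilities)


def per_component_target(end_to_end_target, n_components):
    """Equal availability each of n serial components needs to meet the target."""
    return end_to_end_target ** (1.0 / n_components)


# Hypothetical three-component request path; values are illustrative only.
components = {"load_balancer": 0.9999, "app_server": 0.999, "database": 0.9995}

overall = serial_availability(components.values())
target = per_component_target(0.999, len(components))

print(f"end-to-end availability: {overall:.4%}")
print(f"per-component target for 99.9% overall: {target:.5%}")
```

The point of the sketch is that the end-to-end figure is always lower than that of the weakest component, so each component needs a stricter target than the business-level requirement.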

In early 2019, we started visiting campuses across India to recruit the best and brightest minds to make sure LinkedIn and all the services that make up its complex technology stack are always available for everyone. This critical function at LinkedIn falls under the purview of the Site Engineering team and Site Reliability Engineers (SREs), who are Software Engineers specializing in reliability.

As we continued on this journey, we started getting a lot of questions from these campuses about what exactly the site reliability engineering role entails and how someone could learn the skills and disciplines involved to become a successful site reliability engineer. Fast forward a few months, and a few of these campus students had joined LinkedIn either as interns or as full-time engineers to become a part of the Site Engineering team; we also had a few lateral hires who joined our organization and were not from a traditional SRE background. That's when a few of us got together and started to think about how we could onboard new graduate engineers to the Site Engineering team.

There are very few resources out there guiding someone on the basic skill sets one has to acquire as a beginner SRE. Because of the lack of these resources, we felt that individuals have a tough time getting into open positions in the industry. We created the School Of SRE as a starting point for anyone wanting to build their career as an SRE.
In this course, we focus on building strong foundational skills. The course is structured to provide real-life examples and to show how each of these topics plays an important role in the day-to-day responsibilities of an SRE. Currently, we cover the following topics under the School Of SRE:
 
-   Level 101
    -   Fundamentals Series
        -   [Linux Basics](https://linkedin.github.io/school-of-sre/level101/linux_basics/intro/)
        -   [Git](https://linkedin.github.io/school-of-sre/level101/git/git-basics/)
        -   [Linux Networking](https://linkedin.github.io/school-of-sre/level101/linux_networking/intro/)
    -   [Python and Web](https://linkedin.github.io/school-of-sre/level101/python_web/intro/)
    -   Data
        -   [Relational Databases (MySQL)](https://linkedin.github.io/school-of-sre/level101/databases_sql/intro/)
        -   [NoSQL Concepts](https://linkedin.github.io/school-of-sre/level101/databases_nosql/intro/)
        -   [Big Data](https://linkedin.github.io/school-of-sre/level101/big_data/intro/)
    -   [Systems Design](https://linkedin.github.io/school-of-sre/level101/systems_design/intro/)
    -   [Metrics and Monitoring](https://linkedin.github.io/school-of-sre/level101/metrics_and_monitoring/introduction/)
    -   [Security](https://linkedin.github.io/school-of-sre/level101/security/intro/)

-   Level 102
    -   [Linux Intermediate](https://linkedin.github.io/school-of-sre/level102/linux_intermediate/introduction/)
    -   Linux Advanced
        -   [Containers and Orchestration](https://linkedin.github.io/school-of-sre/level102/containerization_and_orchestration/intro/)
        -   [System Calls and Signals](https://linkedin.github.io/school-of-sre/level102/system_calls_and_signals/intro/)
    -   [Networking](https://linkedin.github.io/school-of-sre/level102/networking/introduction/)
    -   [System Design](https://linkedin.github.io/school-of-sre/level102/system_design/intro/)
    -   [System Troubleshooting and Performance Improvements](https://linkedin.github.io/school-of-sre/level102/system_troubleshooting_and_performance/introduction/) 
    -   [Continuous Integration and Continuous Delivery](https://linkedin.github.io/school-of-sre/level102/continuous_integration_and_continuous_delivery/introduction/)

We believe continuous learning will help in acquiring deeper knowledge and competencies in order to expand your skill sets. Every module includes references that can guide further learning. Our hope is that by going through these modules, you will be able to build the essential skills required for a Site Reliability Engineer.

At LinkedIn, we are using this curriculum for onboarding our non-traditional hires and new college grads into the SRE role. We had multiple rounds of successful onboarding experiences with new employees and the course helped them be productive in a very short period of time. This motivated us to open source the content for helping other organizations in onboarding new engineers into the role and provide guidance for aspiring individuals to get into the role. We realize that the initial content we created is just a starting point and we hope that the community can help in the journey of refining and expanding the content. Check out [the contributing guide](./CONTRIBUTING.md) to get started.


================================================
FILE: courses/level101/big_data/evolution.md
================================================
# Evolution of Hadoop

![Evolution of hadoop](images/hadoop_evolution.png)

# Architecture of Hadoop

1. **HDFS**
    1. The Hadoop Distributed File System (HDFS) is a distributed file system designed to run on commodity hardware. It has many similarities with existing distributed file systems. However, the differences from other distributed file systems are significant.
    2. HDFS is highly fault-tolerant and is designed to be deployed on low-cost hardware. HDFS provides high throughput access to application data and is suitable for applications that have large datasets.
    3. HDFS is part of the [Apache Hadoop Core project](https://github.com/apache/hadoop).

    ![HDFS Architecture](images/hdfs_architecture.png)

    The main components of HDFS include:
    1. NameNode: is the arbitrator and central repository of file namespace in the cluster. The NameNode executes the operations such as opening, closing, and renaming files and directories.
    2. DataNode: manages the storage attached to the node on which it runs. It is responsible for serving all read and write requests. It performs operations such as creation, deletion, and replication of blocks on instructions from the NameNode.
    3. Client: Responsible for getting the required metadata from the NameNode and then communicating with the DataNodes for reads and writes. </br></br></br>

2. **YARN**
    YARN stands for “Yet Another Resource Negotiator”. It was introduced in Hadoop 2.0 to remove the bottleneck on the Job Tracker which was present in Hadoop 1.0. YARN was described as a “Redesigned Resource Manager” at the time of its launch, but it has since evolved to be known as a large-scale distributed operating system used for Big Data processing.

    ![YARN Architecture](images/yarn_architecture.gif)
    
    The main components of YARN architecture include:
    1. Client: It submits map-reduce (MR) jobs to the resource manager.
    2. Resource Manager: It is the master daemon of YARN and is responsible for resource assignment and management among all the applications. Whenever it receives a processing request, it forwards it to the corresponding node manager and allocates resources for the completion of the request accordingly. It has two major components:
        1. Scheduler: It performs scheduling based on the allocated application and available resources. It is a pure scheduler, which means that it does not perform other tasks such as monitoring or tracking and does not guarantee a restart if a task fails. The YARN scheduler supports plugins such as Capacity Scheduler and Fair Scheduler to partition the cluster resources.
        2. Application manager: It is responsible for accepting the application and negotiating the first container from the resource manager. It also restarts the Application Master container on failure.
    3. Node Manager: It takes care of an individual node in the Hadoop cluster and manages applications and workflow on that particular node. Its primary job is to keep up with the Resource Manager. It monitors resource usage, performs log management, and also kills a container based on directions from the resource manager. It is also responsible for creating the container process and starting it at the request of the Application Master.
    4. Application Master: An application is a single job submitted to a framework. The application master is responsible for negotiating resources with the resource manager, tracking the status, and monitoring the progress of a single application. The application master requests the container from the node manager by sending a Container Launch Context (CLC) which includes everything an application needs to run. Once the application is started, it sends health reports to the resource manager from time to time.
    5. Container: It is a collection of physical resources such as RAM, CPU cores, and disk on a single node. The containers are invoked by Container Launch Context (CLC) which is a record that contains information such as environment variables, security tokens, dependencies, etc.</br></br>


# MapReduce framework

![MapReduce Framework](images/map_reduce.jpg)

1. The term MapReduce represents two separate and distinct tasks that Hadoop programs perform: the Map job and the Reduce job. The Map job takes datasets as input and processes them to produce key-value pairs. The Reduce job takes the output of the Map job, i.e. the key-value pairs, and aggregates them to produce the desired results.
2. Hadoop MapReduce (Hadoop Map/Reduce) is a software framework for distributed processing of large datasets on computing clusters. MapReduce helps to split the input dataset into a number of parts and run a program on all the parts in parallel.
3. The Word count example below demonstrates the usage of the MapReduce framework:

![Word Count Example](images/mapreduce_example.jpg)
</br></br>
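
The word count flow can also be mimicked in plain Python. This is a minimal single-process sketch, not Hadoop code: the function names `map_phase`, `shuffle_phase`, and `reduce_phase` and the sample input are assumptions made for illustration. In a real cluster, the map and reduce steps would run on many nodes in parallel.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by key, as the framework would."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Reduce: aggregate the values of each key into a single count."""
    return {key: sum(values) for key, values in grouped.items()}


def word_count(lines):
    """Run map, shuffle, and reduce one after another on the input lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return reduce_phase(shuffle_phase(pairs))


# Hypothetical input; each line could be handed to a different mapper.
sample = ["Deer Bear River", "Car Car River", "Deer Car Bear"]
print(word_count(sample))
```

The shuffle step is what lets each reducer work independently: all values for one key end up in the same group before any aggregation happens.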

# Other tooling around Hadoop

1. [**Hive**](https://hive.apache.org/)
    1. Uses a language called HQL which is very SQL-like. It gives non-programmers the ability to query and analyze data in Hadoop. It is basically an abstraction layer on top of map-reduce.
    2. Ex. HQL query:
        1. `SELECT pet.name, comment FROM pet JOIN event ON (pet.name = event.name);`
    3. The same query in MySQL:
        1. `SELECT pet.name, comment FROM pet, event WHERE pet.name = event.name;`
2. [**Pig**](https://pig.apache.org/)
    1. Uses a scripting language called Pig Latin, which is more workflow driven. You don't need to be an expert Java programmer, but you do need some coding skills. It is also an abstraction layer on top of map-reduce.
    2. Here is a quick question for you:
    What is the output of running the Pig queries in the right column against the data present in the left column in the below image?

    ![Pig Example](images/pig_example.png)

    Output:
    <pre><code>
    7,Komal,Nayak,24,9848022334,trivendram
    8,Bharathi,Nambiayar,24,9848022333,Chennai
    5,Trupthi,Mohanthy,23,9848022336,Bhuwaneshwar
    6,Archana,Mishra,23,9848022335,Chennai
    </code></pre>

3. [**Spark**](https://spark.apache.org/)
    1. Spark provides primitives for in-memory cluster computing that allows user programs to load data into a cluster’s memory and query it repeatedly, making it well-suited to machine learning algorithms.
4. [**Presto**](https://prestodb.io/)
    1. Presto is a high performance, distributed SQL query engine for Big Data.
    2. Its architecture allows users to query a variety of data sources such as Hadoop, AWS S3, Alluxio, MySQL, Cassandra, Kafka, and MongoDB.
    3. Example Presto query:
    <pre><code>
    USE studentDB;
    SHOW TABLES;
    SELECT roll_no, name FROM studentDB.studentDetails WHERE section='A' LIMIT 5;
    </code></pre>   
    
</br>

# Data Serialisation and storage

1. In order to transport data over the network or to store it on persistent storage, we translate data structures or object state into a binary or textual form. We call this process serialization.
2. Apache Avro is one such serialization framework. Avro data is stored in a container file (a `.avro` file) and its schema (the `.avsc` file) is stored with the data file.
3. Apache Hive provides support to store a table as Avro and can also query data in this serialisation format.
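
To make the idea of textual versus binary serialization concrete, here is a small sketch using only the Python standard library. It does not use Avro: the record, the `LAYOUT` format string, and the `deserialize_binary` helper are assumptions chosen for illustration. The fixed binary layout plays a role loosely similar to a schema, since both writer and reader must agree on it.

```python
import json
import struct

# Hypothetical record used only for illustration.
record = {"roll_no": 7, "name": "Komal", "age": 24}

# Textual form: JSON is human-readable and carries its field names.
text_bytes = json.dumps(record).encode("utf-8")

# Binary form with an assumed ad hoc layout (not Avro):
# little-endian unsigned 32-bit roll_no, unsigned 8-bit age,
# and a 16-byte zero-padded name.
LAYOUT = "<IB16s"
binary_bytes = struct.pack(
    LAYOUT, record["roll_no"], record["age"], record["name"].encode("utf-8")
)


def deserialize_binary(data):
    """Rebuild the record from the binary form using the same layout."""
    roll_no, age, raw_name = struct.unpack(LAYOUT, data)
    name = raw_name.rstrip(b"\x00").decode("utf-8")
    return {"roll_no": roll_no, "name": name, "age": age}


print(json.loads(text_bytes) == record)
print(deserialize_binary(binary_bytes) == record)
print(len(text_bytes), "bytes as JSON,", len(binary_bytes), "bytes as binary")
```

The binary form cannot be decoded without knowing the layout, which is one reason formats such as Avro keep the schema alongside the data.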


================================================
FILE: courses/level101/big_data/intro.md
================================================
# Big Data

## Prerequisites

- Basics of Linux File systems.
- Basic understanding of System Design.

## What to expect from this course

This course covers the basics of Big Data and how it has evolved to become what it is today. We will take a look at a few realistic scenarios where Big Data would be a perfect fit. An interesting assignment on designing a Big Data system is followed by understanding the architecture of Hadoop and the tooling around it.

## What is not covered under this course

Writing programs to draw analytics from data.

## Course Contents

1. [Overview of Big Data](https://linkedin.github.io/school-of-sre/level101/big_data/intro/#overview-of-big-data)
2. [Usage of Big Data Techniques](https://linkedin.github.io/school-of-sre/level101/big_data/intro/#usage-of-big-data-techniques)
3. [Evolution of Hadoop](https://linkedin.github.io/school-of-sre/level101/big_data/evolution/)
4. [Architecture of Hadoop](https://linkedin.github.io/school-of-sre/level101/big_data/evolution/#architecture-of-hadoop)
    1. HDFS
    2. Yarn
5. [MapReduce Framework](https://linkedin.github.io/school-of-sre/level101/big_data/evolution/#mapreduce-framework)
6. [Other Tooling Around Hadoop](https://linkedin.github.io/school-of-sre/level101/big_data/evolution/#other-tooling-around-hadoop)
    1. Hive
    2. Pig
    3. Spark
    4. Presto
7. [Data Serialization and Storage](https://linkedin.github.io/school-of-sre/level101/big_data/evolution/#data-serialisation-and-storage)


# Overview of Big Data

1. Big Data is a collection of large datasets that cannot be processed using traditional computing techniques. It is not a single technique or a tool; rather, it has become a complete subject, which involves various tools, techniques, and frameworks.
2. Big Data could consist of
    1. Structured data
    2. Unstructured data
    3. Semi-structured data
3. Characteristics of Big Data:
    1. Volume
    2. Variety
    3. Velocity
    4. Variability
4. Examples of Big Data generation include stock exchanges, social media sites, jet engines, etc.


# Usage of Big Data Techniques

1. Take the example of the traffic lights problem.
    1. There are more than 300,000 traffic lights in the US as of 2018.
    2. Let us assume that we placed a device on each of them to collect metrics and send it to a central metrics collection system.
    3. If each of the IoT devices sends 10 events per minute, we have `300000 x 10 x 60 x 24 = 432 x 10 ^ 7` events per day.
    4. How would you go about processing that and telling me how many of the signals were “green” at 10:45 am on a particular day?
2. Consider the next example on Unified Payments Interface (UPI) transactions:
    1. We had about 1.15 billion UPI transactions in the month of October 2019 in India.
    2. If we try to extrapolate this data to about a year and try to find out some common payments that were happening through a particular UPI ID, how do you suggest we go about that?
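
The arithmetic in the traffic lights example can be checked with a few lines of Python. The event size below is an assumed figure, added only to get a rough feel for the daily data volume; it does not come from any measurement.

```python
TRAFFIC_LIGHTS = 300_000     # number of devices, from the example above
EVENTS_PER_MINUTE = 10       # events sent by each device
MINUTES_PER_DAY = 60 * 24

events_per_day = TRAFFIC_LIGHTS * EVENTS_PER_MINUTE * MINUTES_PER_DAY
print(f"{events_per_day:,} events per day")

# Assumed average event size, purely for a rough storage estimate.
ASSUMED_BYTES_PER_EVENT = 100
gigabytes_per_day = events_per_day * ASSUMED_BYTES_PER_EVENT / 10**9
print(f"about {gigabytes_per_day:.0f} GB per day at {ASSUMED_BYTES_PER_EVENT} bytes per event")
```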


================================================
FILE: courses/level101/big_data/tasks.md
================================================
# Tasks and conclusion

## Post-training tasks:

1. Try setting up your own three-node Hadoop cluster. 
    1. A VM-based solution can be found [here](http://hortonworks.com/wp-content/uploads/2015/04/Import_on_VBox_4_07_2015.pdf)
2. Write a simple Spark/MR job of your choice and understand how to generate analytics from data.
    1. Sample dataset can be found [here](https://grouplens.org/datasets/movielens/)

## References:
1. [Hadoop documentation](http://hadoop.apache.org/docs/current/)
2. [HDFS Architecture](http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/HdfsDesign.html)
3. [YARN Architecture](http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html)
4. [Google GFS paper](https://storage.googleapis.com/pub-tools-public-publication-data/pdf/035fc972c796d33122033a0614bc94cff1527999.pdf)


================================================
FILE: courses/level101/databases_nosql/further_reading.md
================================================
# Conclusion

We have covered basic concepts of NoSQL databases. There is much more to learn and do. We hope this course gives you a good start and inspires you to explore further.

# Further reading

NoSQL:

[https://hostingdata.co.uk/nosql-database/](https://hostingdata.co.uk/nosql-database/)

[https://www.mongodb.com/nosql-explained](https://www.mongodb.com/nosql-explained)

[https://www.mongodb.com/nosql-explained/nosql-vs-sql](https://www.mongodb.com/nosql-explained/nosql-vs-sql)

CAP Theorem

[http://www.julianbrowne.com/article/brewers-cap-theorem](http://www.julianbrowne.com/article/brewers-cap-theorem)

Scalability

[http://www.slideshare.net/jboner/scalability-availability-stability-patterns](http://www.slideshare.net/jboner/scalability-availability-stability-patterns)

Eventual Consistency

[https://www.allthingsdistributed.com/2008/12/eventually_consistent.html](https://www.allthingsdistributed.com/2008/12/eventually_consistent.html)

[https://www.toptal.com/big-data/consistent-hashing](https://www.toptal.com/big-data/consistent-hashing)

[https://web.stanford.edu/class/cs244/papers/chord_TON_2003.pdf](https://web.stanford.edu/class/cs244/papers/chord_TON_2003.pdf)


================================================
FILE: courses/level101/databases_nosql/intro.md
================================================
# NoSQL Concepts

## Prerequisites
- [Relational Databases](https://linkedin.github.io/school-of-sre/level101/databases_sql/intro/)

## What to expect from this course

At the end of training, you will understand what a NoSQL database is, what advantages and disadvantages it has over a traditional RDBMS, what the different types of NoSQL databases are, and some of the underlying concepts & trade-offs with respect to NoSQL.


## What is not covered under this course

We will not be deep diving into any specific NoSQL database. 


## Course Contents



*   [Introduction to NoSQL](https://linkedin.github.io/school-of-sre/level101/databases_nosql/intro/#introduction)
*   [CAP Theorem](https://linkedin.github.io/school-of-sre/level101/databases_nosql/key_concepts/#cap-theorem)
*   [Data versioning](https://linkedin.github.io/school-of-sre/level101/databases_nosql/key_concepts/#versioning-of-data-in-distributed-systems)
*   [Partitioning](https://linkedin.github.io/school-of-sre/level101/databases_nosql/key_concepts/#partitioning)
*   [Hashing](https://linkedin.github.io/school-of-sre/level101/databases_nosql/key_concepts/#hashing)
*   [Quorum](https://linkedin.github.io/school-of-sre/level101/databases_nosql/key_concepts/#quorum)


## Introduction

When people use the term “NoSQL database”, they typically use it to refer to any non-relational database. Some say the term “NoSQL” stands for “non SQL” while others say it stands for “not only SQL.” Either way, most agree that NoSQL databases are databases that store data in a format other than relational tables.

A common misconception is that NoSQL databases or non-relational databases don’t store relationship data well. NoSQL databases can store relationship data—they just store it differently than relational databases do. In fact, when compared with SQL databases, many find modeling relationship data in NoSQL databases to be _easier_, because related data doesn’t have to be split between tables.

Such databases have existed since the late 1960s, but the name "NoSQL" was only coined in the early 21st century. NASA used a NoSQL database to track inventory for the Apollo mission. NoSQL databases emerged in the late 2000s as the cost of storage dramatically decreased. Gone were the days of needing to create a complex, difficult-to-manage data model simply for the purposes of reducing data duplication. Developers (rather than storage) were becoming the primary cost of software development, so NoSQL databases optimized for developer productivity. With the rise of the Agile development methodology, NoSQL databases were developed with a focus on scaling and fast performance, while also allowing frequent application changes and making programming easier.


### Types of NoSQL databases:

Over time due to the way these NoSQL databases were developed to suit requirements at different companies, we ended up with quite a few types of them. However, they can be broadly classified into 4 types. Some of the databases can overlap between different types. They are:

1. **Document databases:** They store data in documents similar to [JSON](https://www.json.org/json-en.html) (JavaScript Object Notation) objects. Each document contains pairs of fields and values. The values can typically be a variety of types including things like strings, numbers, booleans, arrays, or objects, and their structures typically align with objects developers are working with in code. The advantages include an intuitive data model & flexible schemas. Because of their variety of field value types and powerful query languages, document databases are great for a wide variety of use cases and can be used as a general purpose database. They can horizontally scale-out to accommodate large data volumes. Ex: MongoDB, Couchbase

2. **Key-Value databases:** These are a simpler type of database where each item contains keys and values. A value can typically only be retrieved by referencing its key, so learning how to query for a specific key-value pair is typically simple. Key-value databases are great for use cases where you need to store large amounts of data but you don’t need to perform complex queries to retrieve it. Common use cases include storing user preferences or caching. Ex: [Redis](https://redis.io/), [DynamoDB](https://aws.amazon.com/dynamodb/), [Voldemort](https://www.project-voldemort.com/voldemort/)/[Venice](https://engineering.linkedin.com/blog/2017/04/building-venice--a-production-software-case-study) (LinkedIn).

3. **Wide-Column stores:** They store data in tables, rows, and dynamic columns. Wide-column stores provide a lot of flexibility over relational databases because each row is not required to have the same columns. Many consider wide-column stores to be two-dimensional key-value databases. Wide-column stores are great for when you need to store large amounts of data and you can predict what your query patterns will be. Wide-column stores are commonly used for storing Internet of Things data and user profile data. [Cassandra](https://cassandra.apache.org/) and [HBase](https://hbase.apache.org/) are two of the most popular wide-column stores.
4. **Graph Databases:** These databases store data in nodes and edges. Nodes typically store information about people, places, and things while edges store information about the relationships between the nodes. The underlying storage mechanism of graph databases can vary. Some depend on a relational engine and “store” the graph data in a table (although a table is a logical element, therefore this approach imposes another level of abstraction between the graph database, the graph database management system and the physical devices where the data is actually stored). Others use a key-value store or document-oriented database for storage, making them inherently NoSQL structures. Graph databases excel in use cases where you need to traverse relationships to look for patterns such as social networks, fraud detection, and recommendation engines. Ex: [Neo4j](https://neo4j.com/) 
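
As a rough illustration of how the four models shape data differently, the sketch below represents some user data with plain Python structures. It does not use any real database client; the field names, keys, and the `followers_of` helper are all hypothetical.

```python
# Document: one self-contained, JSON-like document per user.
document = {
    "_id": "u1",
    "name": "Asha",
    "skills": ["linux", "python"],
    "address": {"city": "Bengaluru"},
}

# Key-value: an opaque value that is looked up only by its key.
key_value = {"user:u1:theme": "dark", "user:u1:lang": "en"}

# Wide-column: row key -> dynamic columns; rows need not share columns.
wide_column = {
    "u1": {"name": "Asha", "city": "Bengaluru"},
    "u2": {"name": "Ravi", "last_login": "2021-01-01"},
}

# Graph: nodes plus edges that carry the relationship between nodes.
nodes = {"u1": {"name": "Asha"}, "u2": {"name": "Ravi"}}
edges = [("u1", "FOLLOWS", "u2")]


def followers_of(user_id):
    """Traverse the edges to find who follows the given user."""
    return [src for src, rel, dst in edges if rel == "FOLLOWS" and dst == user_id]


print(document["address"]["city"])   # nested data lives inside the document
print(key_value["user:u1:theme"])    # lookup by exact key only
print(sorted(wide_column["u2"]))     # this row has its own set of columns
print(followers_of("u2"))            # a relationship query over edges
```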

### **Comparison** 

<table>
  <tr>
   <td>
   </td>
   <td>Performance
   </td>
   <td>Scalability
   </td>
   <td>Flexibility
   </td>
   <td>Complexity
   </td>
   <td>Functionality
   </td>
  </tr>
  <tr>
   <td>Key Value
   </td>
   <td>high
   </td>
   <td>high
   </td>
   <td>high
   </td>
   <td>none
   </td>
   <td>Variable
   </td>
  </tr>
  <tr>
   <td>Document stores
   </td>
   <td>high
   </td>
   <td>Variable (high)
   </td>
   <td>high
   </td>
   <td>low
   </td>
   <td>Variable (low)
   </td>
  </tr>
  <tr>
   <td>Column DB
   </td>
   <td>high
   </td>
   <td>high
   </td>
   <td>moderate
   </td>
   <td>low
   </td>
   <td>minimal
   </td>
  </tr>
  <tr>
   <td>Graph
   </td>
   <td>Variable
   </td>
   <td>Variable
   </td>
   <td>high
   </td>
   <td>high
   </td>
   <td>Graph theory
   </td>
  </tr>
</table>



### Differences between SQL and NoSQL

The table below summarizes the main differences between SQL and NoSQL databases.

<table>
  <tr>
   <td>
   </td>
   <td>SQL Databases
   </td>
   <td>NoSQL Databases
   </td>
  </tr>
  <tr>
   <td>Data Storage Model
   </td>
   <td>Tables with fixed rows and columns
   </td>
   <td>Document: JSON documents, Key-value: key-value pairs, Wide-column: tables with rows and dynamic columns, Graph: nodes and edges
   </td>
  </tr>
  <tr>
   <td>Primary Purpose
   </td>
   <td>General purpose
   </td>
   <td>Document: general purpose, Key-value: large amounts of data with simple lookup queries, Wide-column: large amounts of data with predictable query patterns, Graph: analyzing and traversing relationships between connected data
   </td>
  </tr>
  <tr>
   <td>Schemas
   </td>
   <td>Rigid
   </td>
   <td>Flexible
   </td>
  </tr>
  <tr>
   <td>Scaling
   </td>
   <td>Vertical (scale-up with a larger server)
   </td>
   <td>Horizontal (scale-out across commodity servers)
   </td>
  </tr>
  <tr>
   <td>Multi-Record <a href="https://en.wikipedia.org/wiki/ACID">ACID </a>Transactions
   </td>
   <td>Supported
   </td>
   <td>Most do not support multi-record ACID transactions. However, some like MongoDB do.
   </td>
  </tr>
  <tr>
   <td>Joins
   </td>
   <td>Typically required
   </td>
   <td>Typically not required
   </td>
  </tr>
  <tr>
   <td>Data to Object Mapping
   </td>
   <td>Requires ORM (object-relational mapping)
   </td>
   <td>Many do not require ORMs. Document DB documents map directly to data structures in most popular programming languages.
   </td>
  </tr>
</table>

### Advantages

*   **Flexible Data Models**

    Most NoSQL systems feature flexible schemas. A flexible schema means you can easily modify your database schema to add or remove fields to support evolving application requirements. This facilitates continuous development of new application features without database operation overhead.

*   **Horizontal Scaling**

    Most NoSQL systems allow you to scale horizontally, which means you can add cheaper commodity hardware whenever you want to scale a system. On the other hand, SQL systems generally scale vertically (a more powerful server). NoSQL systems can also host huge datasets when compared to traditional SQL systems.

*   **Fast Queries**

    NoSQL can generally be a lot faster than traditional SQL systems due to data denormalization and horizontal scaling. Most NoSQL systems also tend to store similar data together facilitating faster query responses. 

*   **Developer productivity**

    NoSQL systems tend to map data based on the programming data structures. As a result, developers need to perform fewer data transformations leading to increased productivity & fewer bugs.


================================================
FILE: courses/level101/databases_nosql/key_concepts.md
================================================
# Key Concepts

Let's look at some of the key concepts when we talk about NoSQL or distributed systems.

### CAP Theorem

In a keynote titled “[Towards Robust Distributed Systems](https://sites.cs.ucsb.edu/~rich/class/cs293b-cloud/papers/Brewer_podc_keynote_2000.pdf)” at ACM’s PODC symposium in 2000, Eric Brewer came up with the so-called CAP-theorem which is widely adopted today by large web companies as well as in the NoSQL community. The CAP acronym stands for **C**onsistency, **A**vailability & **P**artition Tolerance.

*   **Consistency**

    It refers to how consistent a system is after an execution. A distributed system is called consistent when a write made by a source is available for all readers of that shared data. Different NoSQL systems support different levels of consistency.

*   **Availability**

    It refers to how a system responds to loss of functionality of different systems due to hardware and software failures. A high availability implies that a system is still available to handle operations (reads and writes) when a certain part of the system is down due to a failure or upgrade.

*   **Partition Tolerance**

    It is the ability of the system to continue operations in the event of a network partition. A network partition occurs when a failure causes two or more islands of networks where the systems can’t talk to each other across the islands temporarily or permanently. 


Brewer states that one can choose at most two of these three characteristics (consistency, availability and partition tolerance) in a shared-data system. A growing number of use cases in large-scale applications value reliability, implying that availability & redundancy are more valuable than consistency. As a result, these systems struggle to meet ACID properties. They attain availability by loosening the consistency requirement, i.e. Eventual Consistency.

**Eventual Consistency** means that all readers will see writes, as time goes on: “In a steady state, the system will eventually return the last written value”. Clients therefore may face an inconsistent state of data as updates are in progress. For instance, in a replicated database updates may go to one node which replicates the latest version to all other nodes that contain a replica of the modified dataset so that the replica nodes eventually will have the latest version. 

NoSQL systems support different levels of eventual consistency models. For example:

*   **Read Your Own Writes Consistency**

    Clients will see their updates immediately after they are written. The reads can hit nodes other than the one where it was written. However, they might not see updates by other clients immediately.

*   **Session Consistency**

    Clients will see the updates to their data within a session scope. This generally indicates that reads & writes occur on the same server. Other clients using the same nodes will receive the same updates.

*   **Causal Consistency**

    A system provides causal consistency if the following condition holds: write operations that are related by potential causality are seen by each process of the system in order. Different processes may observe concurrent writes in different orders.

Eventual consistency is useful if concurrent updates of the same partitions of data are unlikely and if clients do not immediately depend on reading updates issued by themselves or by other clients.

The consistency model chosen for the system (or parts of it) determines where requests are routed, e.g. to which replicas.
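
How a consistency model affects request routing can be sketched in a few lines. The toy example below shows one possible way to provide read-your-own-writes behaviour: the client remembers the version it last wrote and only reads from a follower that has caught up, falling back to the leader otherwise. The `Replica` and `Client` classes are hypothetical and do not correspond to any particular database.

```python
class Replica:
    """A toy replica that stores each value together with a version number."""

    def __init__(self):
        self.data = {}  # key -> (version, value)

    def apply(self, key, version, value):
        self.data[key] = (version, value)

    def read(self, key):
        return self.data.get(key, (0, None))


class Client:
    """Routes reads so that this client always sees its own writes."""

    def __init__(self, leader, follower):
        self.leader = leader
        self.follower = follower
        self.last_written = {}  # key -> version written by this client

    def write(self, key, value):
        version = self.leader.read(key)[0] + 1
        self.leader.apply(key, version, value)
        self.last_written[key] = version

    def read(self, key):
        version, value = self.follower.read(key)
        if version >= self.last_written.get(key, 0):
            return value, "follower"
        # The follower is stale for this client's own write: use the leader.
        return self.leader.read(key)[1], "leader"


leader, follower = Replica(), Replica()
client = Client(leader, follower)
client.write("city", "Chennai")
print(client.read("city"))             # follower has not caught up yet
follower.apply("city", 1, "Chennai")   # replication arrives later
print(client.read("city"))
```

Other clients that never wrote the key would be served by the follower right away and could briefly see the old value, which is the trade-off eventual consistency accepts.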

**CAP alternatives illustration**

<table>
  <tr>
   <td>Choice
   </td>
   <td>Traits
   </td>
   <td>Examples
   </td>
  </tr>
  <tr>
   <td>Consistency + Availability
<p>
(Forfeit Partitions)
   </td>
   <td>2-phase commits
<p>
Cache invalidation protocols
   </td>
   <td>Single-site databases, cluster databases
<p>
LDAP
<p>
xFS file system 
   </td>
  </tr>
  <tr>
   <td>Consistency + Partition tolerance
<p>
 (Forfeit Availability)
   </td>
   <td>Pessimistic locking
<p>
Make minority partitions unavailable
   </td>
   <td>Distributed databases, distributed locking, majority protocols
   </td>
  </tr>
  <tr>
   <td>Availability + Partition tolerance (Forfeit Consistency)
   </td>
   <td>expirations/leases 
<p>
conflict resolution, optimistic
   </td>
   <td>DNS
<p>
Web caching
   </td>
  </tr>
</table>


### Versioning of Data in distributed systems

When data is distributed across nodes, it can be modified on different nodes at the same time (assuming strict consistency is not enforced). Questions arise on conflict resolution for concurrent updates. Some of the popular conflict resolution mechanisms are:

*   **Timestamps**

    This is the most obvious solution. You sort updates based on chronological order and choose the latest update. However, this relies on clock synchronization across different parts of the infrastructure. This gets even more complicated when parts of systems are spread across different geographic locations. 

*   **Optimistic Locking**

    You associate a unique value like a clock or counter with every data update. When a client wants to update data, it has to specify which version of data needs to be updated. This would mean you need to keep track of history of the data versions. 

*   **Vector Clocks**

    A vector clock is defined as a tuple of clock values from each node. In a distributed environment, each node maintains a tuple of such clock values which represent the state of the node itself and its peers/replicas. A clock value may be a real timestamp derived from a local clock or a version number.


![alt_text](images/vector_clocks.png "Vector Clocks")
                
<p align="center"><span style="text-decoration:underline;  font-weight:bold;">Vector clocks illustration</span></p>

Vector clocks have the following advantages over other conflict resolution mechanisms:

1. No dependency on synchronized clocks
2. No total ordering of revision numbers required for causal reasoning
3. No need to store and maintain multiple versions of the data on different nodes
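
The comparison rule behind vector clocks can be sketched as follows. This is a minimal illustration that represents a clock as a dictionary of per-node counters; the function names `increment`, `merge`, and `compare` and the node names are assumptions, not the API of any specific database.

```python
def increment(clock, node):
    """Return a copy of the clock with this node's counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a, b):
    """Element-wise maximum, used when reconciling two clocks."""
    return {n: max(a.get(n, 0), b.get(n, 0)) for n in set(a) | set(b)}


def compare(a, b):
    """Return 'before', 'after', 'equal' or 'concurrent' for clock a vs b."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"
    if b_le_a:
        return "after"
    return "concurrent"


v1 = increment({}, "A")    # first write, handled by node A
v2 = increment(v1, "B")    # node B updates after seeing v1
v3 = increment(v1, "C")    # node C updates after seeing v1, unaware of v2
print(compare(v1, v2))     # v1 causally precedes v2
print(compare(v2, v3))     # neither saw the other: a conflict to resolve
resolved = increment(merge(v2, v3), "A")
print(compare(v3, resolved))
```

When `compare` reports `concurrent`, the system (or the client) has to reconcile the two versions; the merged and incremented clock then supersedes both.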

### Partitioning

When the amount of data crosses the capacity of a single node, we need to think of splitting data, creating replicas for load balancing & disaster recovery. Depending on how dynamic the infrastructure is, we have a few approaches that we can take.

1. **Memory cached**

    These are partitioned in-memory databases that are primarily used for transient data. These databases are generally used as a front for a traditional RDBMS. The most frequently used data is replicated from an RDBMS into a memory database to facilitate fast queries and to take the load off the backend DBs. Common examples are Memcached and Couchbase.

2. **Clustering**

    Traditional cluster mechanisms abstract away the cluster topology from clients. A client need not know where the actual data is residing and which node it is talking to. Clustering is very commonly used in traditional RDBMS where it can help scaling the persistent layer to a certain extent. 

3. **Separating reads from writes**

    In this method, you will have multiple replicas hosting the same data. The incoming writes are typically sent to a single node (Leader) or multiple nodes (multi-Leader), while the rest of the replicas (Followers) handle read requests. The leader replicates writes asynchronously to all followers, so replication lag can’t be completely avoided. Sometimes a leader can crash before it replicates all the data to a follower. When this happens, the follower with the most up-to-date data can be turned into a leader. As a result, it is hard to enforce full consistency in this model. You also need to consider the ratio of read vs write traffic. This model won’t make sense when writes are higher than reads. The replication methods can also vary widely. Some systems do a complete transfer of state periodically, while others use a delta state transfer approach. You could also transfer the state by transferring the operations in order. The followers can then apply the same operations as the leader to catch up.

4. **Sharding**

    Sharding refers to dividing data in such a way that data is distributed evenly (both in terms of storage & processing power) across a cluster of nodes. It can also imply data locality, which means similar & related data is stored together to facilitate faster access. A shard in turn can be further replicated to meet load balancing or disaster recovery requirements. A single shard replica might take in all writes (single leader) or multiple replicas can take writes (multi-leader). Reads can be distributed across multiple replicas. Since data is now distributed across multiple nodes, clients should be able to consistently figure out where data is hosted. We will look at some of the common techniques below. The downside of sharding is that joins between shards are not possible, so an upstream/downstream application has to aggregate the results from multiple shards.

![alt_text]( images/database_sharding.png "Sharding")

<p align="center"><span style="text-decoration:underline;  font-weight:bold;">Sharding example</span> </p>
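
The "Separating reads from writes" approach described above can be sketched as a toy in-memory model. This is a minimal illustration, not a real replication protocol: the `Replica` and `Cluster` classes and the explicit `replicate()` step are hypothetical names standing in for asynchronous replication. It shows why a read served by a follower can be stale until replication catches up.

```python
import itertools


class Replica:
    """A node holding a copy of the data (hypothetical, in-memory only)."""

    def __init__(self, name):
        self.name = name
        self.data = {}


class Cluster:
    """Routes writes to the leader and reads to followers in round-robin order."""

    def __init__(self, leader, followers):
        self.leader = leader
        self.followers = followers
        self._next_reader = itertools.cycle(followers)
        self._pending = []  # writes accepted by the leader but not yet replicated

    def write(self, key, value):
        self.leader.data[key] = value
        self._pending.append((key, value))

    def read(self, key):
        return next(self._next_reader).data.get(key)

    def replicate(self):
        """Ship pending writes, in order, to every follower."""
        for key, value in self._pending:
            for follower in self.followers:
                follower.data[key] = value
        self._pending.clear()


cluster = Cluster(Replica("leader"), [Replica("f1"), Replica("f2")])
cluster.write("user:1", "alice")
stale = cluster.read("user:1")  # followers have not received the write yet
cluster.replicate()
fresh = cluster.read("user:1")  # followers have caught up
print(stale, fresh)
```

If the leader crashed before `replicate()` ran, the pending write would exist on no follower, which is the consistency gap the text describes.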

### Hashing

A hash function is a function that maps one piece of data—typically describing some kind of object, often of arbitrary size—to another piece of data, typically an integer, known as _hash code_, or simply _hash_. In a partitioned database, it is important to consistently map a key to a server/replica. 

For example, you can use a simple modulo function as the hash:

_p = k mod n_

where

- _p_ is the partition,
- _k_ is the primary key, and
- _n_ is the number of nodes.

The downside of this simple hash is that, whenever the cluster topology changes, the data distribution also changes. When you are dealing with memory caches, it will be easy to distribute partitions around. Whenever a node joins/leaves a topology, partitions can reorder themselves, a cache miss can be re-populated from backend DB. However, when you look at persistent data, it is not possible as the new node doesn’t have the data needed to serve it. This brings us to consistent hashing.
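
To make this downside concrete, here is a small sketch that counts how many keys change partition when a fifth node is added to a four-node cluster. The key range and node counts are arbitrary illustrative values.

```python
def partition(key: int, n_nodes: int) -> int:
    """Map an integer primary key to a partition using p = k mod n."""
    return key % n_nodes


keys = range(1000)  # hypothetical integer primary keys

# Compare each key's placement on a 4-node cluster with a 5-node cluster.
moved = sum(1 for k in keys if partition(k, 4) != partition(k, 5))
print(f"{moved} of {len(keys)} keys change partition when n goes from 4 to 5")
```

With these values, 800 of the 1000 keys land on a different partition, so most of the data would have to be copied between nodes after a single topology change.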

#### Consistent Hashing

Consistent hashing is a distributed hashing scheme that operates independently of the number of servers or objects in a distributed _hash table_ by assigning them a position on an abstract circle, or _hash ring_. This allows servers and objects to scale without affecting the overall system.

Say that our hash function *h*() generates a 32-bit integer. Then, to determine to which server we will send a key *k*, we find the server *s* whose hash *h*(*s*) is the smallest integer that is larger than *h*(*k*). To make the process simpler, we assume the table is circular, which means that if we cannot find a server with a hash larger than *h*(*k*), we wrap around and start looking from the beginning of the array.

![alt_text]( images/consistent_hashing.png "Consistent Hashing")


<p align="center"><span style="text-decoration:underline;  font-weight:bold;">Consistent hashing illustration</span></p>

In consistent hashing, when a server is removed or added, only the keys from that server are relocated. For example, if server S<sub>3</sub> is removed, all keys from server S<sub>3</sub> are moved to server S<sub>4</sub>, but keys stored on servers S<sub>4</sub> and S<sub>2</sub> are not relocated. There is one problem, however: when server S<sub>3</sub> is removed, its keys are not distributed equally among the remaining servers S<sub>4</sub> and S<sub>2</sub>. They are all assigned to server S<sub>4</sub>, which increases the load on S<sub>4</sub>.
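
The lookup rule and the removal behaviour can be sketched as follows. This is a minimal illustration under assumptions: the hash function (the first four bytes of MD5) and the `HashRing` class are choices made for this example, and the positions of the servers on the ring depend on that hash, so the successor of S<sub>3</sub> here need not be S<sub>4</sub> as in the figure.

```python
import bisect
import hashlib


def h(value: str) -> int:
    """Hash a string to a 32-bit integer (first 4 bytes of MD5, an arbitrary choice)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:4], "big")


class HashRing:
    def __init__(self, servers):
        # A sorted list of (position, server) pairs represents the ring.
        self._ring = sorted((h(s), s) for s in servers)

    def lookup(self, key: str) -> str:
        """Return the first server whose hash is larger than the key's hash, wrapping around."""
        positions = [pos for pos, _ in self._ring]
        i = bisect.bisect_right(positions, h(key))
        return self._ring[i % len(self._ring)][1]

    def remove(self, server: str) -> None:
        self._ring = [(p, s) for p, s in self._ring if s != server]


ring = HashRing(["S1", "S2", "S3", "S4"])
keys = [f"key-{i}" for i in range(1000)]
before = {k: ring.lookup(k) for k in keys}
ring.remove("S3")
after = {k: ring.lookup(k) for k in keys}

moved = [k for k in keys if before[k] != after[k]]
# Only keys that lived on S3 move, and they all land on S3's successor on the ring.
print(len(moved), "keys moved to", {after[k] for k in moved})
```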

To distribute the load evenly among servers when a server is added or removed, consistent hashing creates a fixed number of replicas (known as virtual nodes) of each server and distributes them along the circle. So instead of server labels S<sub>1</sub>, S<sub>2</sub> and S<sub>3</sub>, we will have S<sub>10</sub>,S<sub>11</sub>,…,S<sub>19</sub>, S<sub>20</sub>,S<sub>21</sub>,…,S<sub>29</sub> and S<sub>30</sub>,S<sub>31</sub>,…,S<sub>39</sub>. The number of replicas per server is also known as its _weight_ and can be chosen depending on the situation.

All keys which are mapped to replicas S<sub>ij</sub> are stored on server S<sub>i</sub>. To find a key, we do the same thing, find the position of the key on the circle and then move forward until you find a server replica. If the server replica is S<sub>ij</sub>, then the key is stored in server S<sub>i</sub>.

Suppose server S<sub>3</sub> is removed. Then all S<sub>3</sub> replicas with labels S<sub>30</sub>,S<sub>31</sub>,…,S<sub>39</sub> must be removed. The object keys adjacent to the S<sub>3X</sub> labels will be automatically re-assigned to S<sub>1X</sub>, S<sub>2X</sub> and S<sub>4X</sub>. All keys originally assigned to S<sub>1</sub>, S<sub>2</sub> & S<sub>4</sub> will not be moved.

Similar things happen if we add a server. Suppose we want to add a server S<sub>5</sub> as a replacement of S<sub>3</sub>, then we need to add labels S<sub>50</sub>,S<sub>51</sub>,…,S<sub>59</sub>. In the ideal case, one-fourth of keys from S<sub>1</sub>, S<sub>2</sub> and S<sub>4</sub> will be reassigned to S<sub>5</sub>.
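
The effect of virtual nodes can be sketched in the same style. The label scheme `S1#0 … S1#9` used here is a hypothetical stand-in for the S<sub>10</sub>…S<sub>19</sub> labels in the text, and the hash function and key names are assumptions made for illustration.

```python
import bisect
import hashlib
from collections import Counter


def h(value: str) -> int:
    """Hash a string to a 32-bit integer (first 4 bytes of MD5, an arbitrary choice)."""
    return int.from_bytes(hashlib.md5(value.encode()).digest()[:4], "big")


def build_ring(servers, vnodes=10):
    """Place `vnodes` labels per server on the ring; each label maps back to its server."""
    return sorted((h(f"{s}#{j}"), s) for s in servers for j in range(vnodes))


def lookup(ring, key):
    """Find the first virtual node after the key's position and return its server."""
    positions = [pos for pos, _ in ring]
    i = bisect.bisect_right(positions, h(key)) % len(ring)
    return ring[i][1]


keys = [f"key-{i}" for i in range(10_000)]
ring = build_ring(["S1", "S2", "S3", "S4"])
before = {k: lookup(ring, k) for k in keys}

# Removing S3 means removing every one of its virtual nodes.
ring_without_s3 = [(p, s) for p, s in ring if s != "S3"]
after = {k: lookup(ring_without_s3, k) for k in keys}

receivers = Counter(after[k] for k in keys if before[k] == "S3")
print("S3's keys were redistributed to:", dict(receivers))
```

Because the virtual nodes of S<sub>3</sub> are scattered around the ring, its keys are typically spread over several of the remaining servers instead of all landing on one neighbour, while keys on the other servers stay where they were.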

When applied to persistent storage, further issues arise. If a node leaves, the data stored on it becomes unavailable unless it was replicated to other nodes beforehand. Conversely, when a new node joins, adjacent nodes are no longer responsible for some pieces of data that they still store but are no longer asked for, because requesting clients no longer hash the corresponding objects to them. To address this issue, a replication factor (r) can be introduced.

Introducing replicas in a partitioning scheme—besides reliability benefits—also makes it possible to spread workload for read requests that can go to any physical node responsible for a requested piece of data. Scalability doesn’t work if the clients have to decide between multiple versions of the dataset, because they need to read from a quorum of servers which in turn reduces the efficiency of load balancing. 

### Quorum

Quorum is the minimum number of nodes in a cluster that must be online and be able to communicate with each other. If any additional node failure occurs beyond this threshold, the cluster will stop running.

To attain a quorum, you need a majority of the nodes. Commonly, it is ⌊N/2⌋ + 1 (N/2 rounded down, plus one), where _N_ is the total number of nodes in the system. For example:

- In a 3-node cluster, you need 2 nodes for a majority.

- In a 5-node cluster, you need 3 nodes for a majority.

- In a 6-node cluster, you need 4 nodes for a majority. 
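
The majority rule above can be written as a one-line function. The sketch below also prints how many node failures each cluster size tolerates while still keeping a quorum; the helper name `quorum` is chosen for this example.

```python
def quorum(n_nodes: int) -> int:
    """Smallest majority of an n-node cluster: floor(N/2) + 1."""
    return n_nodes // 2 + 1


for n in (3, 5, 6):
    print(f"{n}-node cluster: quorum = {quorum(n)}, tolerates {n - quorum(n)} failures")
```

Note that the 6-node cluster tolerates the same number of failures as the 5-node cluster, which is one reason odd cluster sizes are a common choice.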

![alt_text](images/Quorum.png "image_tooltip")


<p align="center"> <span style="text-decoration:underline; font-weight:bold;">Quorum example</span> </p>

Network problems can cause communication failures among cluster nodes. One set of nodes might be able to communicate together across a functioning part of a network but not be able to communicate with a different set of nodes in another part of the network. This is known as split brain or cluster partitioning.

Now the partition which has quorum is allowed to continue running the application. The other partitions are removed from the cluster.

For example, in a 5-node cluster, consider what happens if nodes 1, 2, and 3 can communicate with each other but not with nodes 4 and 5. Nodes 1, 2, and 3 constitute a majority, and they continue running as a cluster. Nodes 4 and 5, being a minority, stop running as a cluster. If node 3 then loses communication with the other nodes, no partition has a majority and all nodes stop running as a cluster. However, all functioning nodes will continue to listen for communication, so that when the network begins working again, the cluster can form and begin to run.
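
The 5-node scenario above can be checked with a short sketch. The function name `surviving_partition` and the representation of partitions as sets of node numbers are assumptions made for this example.

```python
def surviving_partition(partitions, cluster_size):
    """Return the partition that holds quorum, or None if no partition does."""
    needed = cluster_size // 2 + 1
    for nodes in partitions:
        if len(nodes) >= needed:
            return nodes
    return None


# Nodes 1-3 can talk to each other, nodes 4-5 are cut off: the majority side keeps running.
first = surviving_partition([{1, 2, 3}, {4, 5}], cluster_size=5)

# Node 3 then loses contact as well: no partition reaches 3 nodes, so the cluster stops.
second = surviving_partition([{1, 2}, {3}, {4, 5}], cluster_size=5)

print(first, second)
```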

The diagram below demonstrates quorum selection on a cluster partitioned into two sets.

![alt_text](images/cluster_quorum.png "image_tooltip")

<p align="center"><span style="text-decoration:underline; font-weight:bold;">Cluster Quorum example</span></p>



================================================
FILE: courses/level101/databases_sql/backup_recovery.md
================================================
### Backup and Recovery
Backups are a very crucial part of any database setup. They are generally a copy of the data that can be used to reconstruct the data in case of any major or minor crisis with the database. In general terms, backups can be of two types:

- **Physical Backup** - the data directory as it is on the disk
- **Logical Backup** - the table structure and records in it

Both the above kinds of backups are supported by MySQL with different tools. It is the job of the SRE to identify which should be used when.

#### Mysqldump
This utility is available with MySQL installation. It helps in getting the logical backup of the database. It outputs a set of SQL statements to reconstruct the data. It is not recommended to use `mysqldump` for large tables as it might take a lot of time and the file size will be huge. However, for small tables it is the best and the quickest option.

```shell
mysqldump [options] > dump_output.sql
```

There are certain options that can be used with `mysqldump` to get an appropriate dump of the database.

To dump all the databases:

```shell
mysqldump -u<user> -p<pwd> --all-databases > all_dbs.sql
```

To dump specific databases:

```shell
mysqldump -u<user> -p<pwd> --databases db1 db2 db3 > dbs.sql
```

To dump a single database:

```shell
mysqldump -u<user> -p<pwd> --databases db1 > db1.sql
```
OR
```shell
mysqldump -u<user> -p<pwd> db1 > db1.sql
```

The difference between the above two commands is that the latter one does not contain the `CREATE DATABASE` command in the backup output. 

To dump specific tables in a database:

```shell
mysqldump -u<user> -p<pwd> db1 table1 table2 > db1_tables.sql
```

To dump only table structures and no data:

```shell
mysqldump -u<user> -p<pwd> --no-data db1 > db1_structure.sql
```

To dump only table data and no `CREATE` statements:

```shell
mysqldump -u<user> -p<pwd> --no-create-info db1 > db1_data.sql
```

To dump only specific records from a table:

```shell
mysqldump -u<user> -p<pwd> --no-create-info db1 table1 --where="salary>80000" > db1_table1_80000.sql
```

`mysqldump` can also produce output in CSV, other delimited-text, or XML format where a use case needs it. The backup from the `mysqldump` utility is offline, i.e. when the backup finishes, it will not contain the changes that were made to the database while the backup was running. For example, if the backup started at 3:00 pm and finished at 4:00 pm, it will not have the changes made to the database between 3:00 and 4:00 pm.

**Restoring** from `mysqldump` can be done in the following ways:

From shell

```shell
mysql -u<user> -p<pwd> < all_dbs.sql
```
OR

From shell, if the database is already created:

```shell
mysql -u<user> -p<pwd> db1 < db1.sql
```

From within MySQL shell:

```shell
mysql> source all_dbs.sql
```

#### Percona XtraBackup
This utility is installed separately from the MySQL server and is open source, provided by Percona. It takes full or partial physical backups of the database. The backup is online, i.e. it includes the changes made to the database while the backup was running, unlike the `mysqldump` behavior described at the end of the previous section.

- **Full Backup** - the complete backup of the database. 
- **Partial Backup** - Incremental 
  - **Cumulative** - After one full backup, the next backups will have changes post the full backup. For example, we took a full backup on Sunday, from Monday onwards every backup will have changes after Sunday; so, Tuesday’s backup will have Monday’s changes as well, Wednesday’s backup will have changes of Monday and Tuesday as well and so on.
  - **Differential** - After one full backup, the next backups will have changes post the previous incremental backup. For example, we took a full backup on Sunday, Monday will have changes done after Sunday, Tuesday will have changes done after Monday, and so on.

![partial backups - differential and cumulative](images/partial_backup.png "Differential and Cumulative Backups")

Percona XtraBackup allows us to take both full and incremental backups as we desire. Incremental backups take less space than a daily full backup, but restoring from them takes longer than restoring from a full backup.
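
The difference in restore effort between the two incremental styles defined above can be sketched as follows. The function `backups_needed` is a hypothetical helper, with day 0 standing for the Sunday full backup from the example.

```python
def backups_needed(restore_day: int, strategy: str) -> list[int]:
    """Days whose backups are needed to restore to `restore_day` (day 0 is the full backup)."""
    if restore_day == 0:
        return [0]
    if strategy == "cumulative":
        # Each cumulative backup holds every change since the full backup.
        return [0, restore_day]
    if strategy == "differential":
        # Each differential backup holds only the changes since the previous backup.
        return list(range(0, restore_day + 1))
    raise ValueError(f"unknown strategy: {strategy}")


days = ["Sun", "Mon", "Tue", "Wed"]
for strategy in ("cumulative", "differential"):
    chain = [days[d] for d in backups_needed(3, strategy)]
    print(f"{strategy}: restore Wednesday from {chain}")
```

Cumulative backups need only the full backup plus the latest incremental, at the cost of each incremental growing larger; differential backups stay small but every one in the chain must be applied.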

**Creating a full backup**

```shell
xtrabackup --defaults-file=<location to my.cnf> --user=<mysql user> --password=<mysql password> --backup --target-dir=<location of target directory>
```

Example:

```shell
xtrabackup --defaults-file=/etc/my.cnf --user=some_user --password=XXXX --backup --target-dir=/mnt/data/backup/
```

Some other options

- `--stream` - can be used to stream the backup files to standard output in a specified format. `xbstream` is the only option for now.
- `--tmpdir` - set this to a `tmp` directory to be used for temporary files while taking backups.
- `--parallel` - set this to the number of threads that can be used to copy data files to the target directory in parallel.
- `--compress` - by default - `quicklz` is used. Set this to have the backup in compressed format. Each file is a `.qp` compressed file and can be extracted by `qpress` file archiver.
- `--decompress` - decompresses all the files which were compressed with the `.qp` extension. It will not delete the `.qp` files after decompression. To do that, use `--remove-original` along with this. Please note that the `decompress` option should be run separately from the `xtrabackup` command that used the compress option.

**Preparing a backup**

Once the backup is done with the `--backup` option, we need to prepare it in order to restore it. This is done to make the data files consistent with point-in-time. There might have been some transactions going on while the backup was being executed and those have changed the data files. When we prepare a backup, all those transactions are applied to the data files.

```shell
xtrabackup --prepare --target-dir=<where backup is taken>
```

Example:

```shell
xtrabackup --prepare --target-dir=/mnt/data/backup/
```

It is not recommended to halt a process which is preparing the backup as that might cause data file corruption and backup cannot be used further. The backup will have to be taken again.

**Restoring a Full Backup**

To restore the backup which is created and prepared from above commands, just copy everything from the backup `target-dir` to the `data-dir` of MySQL server, change the ownership of all files to MySQL user (the Linux user used by MySQL server) and start MySQL.

Or the below command can be used as well,

```shell
xtrabackup --defaults-file=/etc/my.cnf --copy-back --target-dir=/mnt/data/backup/
```

**Note** - the backup has to be prepared in order to restore it.

**Creating Incremental backups**

Percona XtraBackup can create incremental backups, i.e., backups that contain only the changes made since the last backup. Every InnoDB page contains a log sequence number (LSN), and the LSN reached is also printed in one of the last lines of the backup and prepare commands.

```shell
xtrabackup: Transaction log of lsn <LSN> to <LSN> was copied.
```
OR
```shell
InnoDB: Shutdown completed; log sequence number <LSN>
<timestamp> completed OK!
```

This indicates that the backup has been taken up to the log sequence number mentioned. This is key information for understanding incremental backups and working towards automating them. Incremental backups do not compare data files for changes; instead, they go through the InnoDB pages and compare their LSN to the last backup’s LSN. So, without one full backup, the incremental backups are useless.

The `xtrabackup` command creates an `xtrabackup_checkpoints` file which has the information about the LSN of the backup. Below are the key contents of the file:

```shell
backup_type = full-backuped | incremental
from_lsn = 0 (full backup) | to_lsn of last backup <LSN>
to_lsn = <LSN>
last_lsn = <LSN>
```
There is a difference between `to_lsn` and `last_lsn`. When the `last_lsn` is more than `to_lsn` that means there are transactions that ran while we took the backup and are yet to be applied. That is what `--prepare` is used for.
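
As a rough illustration of how this file could be inspected when automating backups, the sketch below parses the `key = value` lines shown above and compares `to_lsn` with `last_lsn`. The sample values are made up for this example, and `parse_checkpoints` is a hypothetical helper, not part of XtraBackup.

```python
def parse_checkpoints(text: str) -> dict:
    """Parse `key = value` lines in the format shown above into a dictionary."""
    info = {}
    for line in text.splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            info[key.strip()] = value.strip()
    return info


# Illustrative file contents; real LSNs are much larger numbers.
sample = """\
backup_type = incremental
from_lsn = 1500
to_lsn = 2000
last_lsn = 2200
"""

info = parse_checkpoints(sample)
pending = int(info["last_lsn"]) - int(info["to_lsn"])
print(f"{info['backup_type']} backup, LSN range still to be applied by --prepare: {pending}")
```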

To take incremental backups, first, we require one full backup.

```shell
xtrabackup --defaults-file=/etc/my.cnf --user=some_user --password=XXXX --backup --target-dir=/mnt/data/backup/full/
```

Let’s assume the contents of the `xtrabackup_checkpoints` file to be as follows:

```shell
backup_type = full-backuped
from_lsn = 0
to_lsn = 1000
last_lsn = 1000
```
Now that we have one full backup, we can have an incremental backup that takes the changes. We will go with differential incremental backups.

```shell
xtrabackup --defaults-file=/etc/my.cnf --user=some_user --password=XXXX --backup --target-dir=/mnt/data/backup/incr1/ --incremental-basedir=/mnt/data/backup/full/
```

There are delta files created in the `incr1` directory, such as `ibdata1.delta` and `db1/tbl1.ibd.delta`, with the changes relative to the full backup directory. The `xtrabackup_checkpoints` file will thus have the following contents.

```shell
backup_type = incremental
from_lsn = 1000
to_lsn = 1500
last_lsn = 1500
```

Hence, the `from_lsn` here is equal to the `to_lsn` of the last backup or the `basedir` provided for the incremental backups. For the next incremental backup, we can use this incremental backup as the `basedir`.

```shell
xtrabackup --defaults-file=/etc/my.cnf --user=some_user --password=XXXX --backup --target-dir=/mnt/data/backup/incr2/ --incremental-basedir=/mnt/data/backup/incr1/
```

The `xtrabackup_checkpoints` file will thus have the following contents:

```shell
backup_type = incremental
from_lsn = 1500
to_lsn = 2000
last_lsn = 2200
```
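
Since each incremental backup starts where its base ended, a chain of differential incremental backups can be sanity-checked by comparing LSNs. The sketch below is a hypothetical check using the example values from this section; the dictionaries stand in for parsed checkpoint files.

```python
def chain_is_valid(checkpoints: list[dict]) -> bool:
    """The first backup must start at LSN 0; each later from_lsn must equal the previous to_lsn."""
    if not checkpoints or checkpoints[0]["from_lsn"] != 0:
        return False
    return all(
        cur["from_lsn"] == prev["to_lsn"]
        for prev, cur in zip(checkpoints, checkpoints[1:])
    )


full = {"backup_type": "full-backuped", "from_lsn": 0, "to_lsn": 1000}
incr1 = {"backup_type": "incremental", "from_lsn": 1000, "to_lsn": 1500}
incr2 = {"backup_type": "incremental", "from_lsn": 1500, "to_lsn": 2000}

print(chain_is_valid([full, incr1, incr2]))  # complete chain
print(chain_is_valid([full, incr2]))         # incr1 is missing, so there is an LSN gap
```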

**Preparing Incremental backups**

Preparing incremental backups is not the same as preparing a full backup. When prepare runs, two operations are performed - *committed transactions are applied on the data files* and *uncommitted transactions are rolled back*. While preparing incremental backups, we have to skip rollback of uncommitted transactions as it is likely that they might get committed in the next incremental backup. If we rollback uncommitted transactions, the further incremental backups cannot be applied.

We use `--apply-log-only` option along with `--prepare` to avoid the rollback phase. 

From the last section, we had the following directories with complete backup:

```shell
/mnt/data/backup/full
/mnt/data/backup/incr1
/mnt/data/backup/incr2
```

First, we prepare the full backup, but only with the `--apply-log-only` option.

```shell
xtrabackup --prepare --apply-log-only --target-dir=/mnt/data/backup/full
```

The output of the command will contain the following at the end.

```shell
InnoDB: Shutdown completed; log sequence number 1000
<timestamp> completed OK!
```

Note that the LSN mentioned at the end is the same as the `to_lsn` from the `xtrabackup_checkpoints` file created for the full backup.

Next, we apply the changes from the first incremental backup to the full backup.

```shell
xtrabackup --prepare --apply-log-only --target-dir=/mnt/data/backup/full --incremental-dir=/mnt/data/backup/incr1
```

This applies the delta files in the incremental directory to the full backup directory. It rolls the data files in the full backup directory forward to the time of incremental backup and applies the redo logs as usual.

Lastly, we apply the last incremental backup same as the previous one with just a small change.

```shell
xtrabackup --prepare --target-dir=/mnt/data/backup/full --incremental-dir=/mnt/data/backup/incr2
```

We do not have to use the `--apply-log-only` option with it. It applies the *incr2 delta files* to the full backup data files, taking them forward, applies the redo logs to them, and finally rolls back the uncommitted transactions to produce the final result. The data now present in the full backup directory can be used to restore.

**Note**: To create cumulative incremental backups, the `incremental-basedir` should always be the full backup directory for every incremental backup. While preparing, we can start with the full backup with the `--apply-log-only` option and use just the last incremental backup for the final `--prepare` as that has all the changes since the full backup. 
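
When automating this, the order of the prepare steps matters more than the commands themselves. The sketch below only builds the command strings in the order described above, using the options already shown in this section; `prepare_commands` is a hypothetical helper and it does not run anything.

```python
def prepare_commands(full_dir: str, incremental_dirs: list[str]) -> list[str]:
    """Build the xtrabackup --prepare sequence for a full backup plus its incrementals."""
    if not incremental_dirs:
        # A lone full backup is prepared in one step, including the rollback phase.
        return [f"xtrabackup --prepare --target-dir={full_dir}"]
    commands = [f"xtrabackup --prepare --apply-log-only --target-dir={full_dir}"]
    for i, incr in enumerate(incremental_dirs):
        is_last = i == len(incremental_dirs) - 1
        # Only the final incremental is prepared without --apply-log-only.
        apply_log_only = "" if is_last else "--apply-log-only "
        commands.append(
            f"xtrabackup --prepare {apply_log_only}"
            f"--target-dir={full_dir} --incremental-dir={incr}"
        )
    return commands


cmds = prepare_commands(
    "/mnt/data/backup/full",
    ["/mnt/data/backup/incr1", "/mnt/data/backup/incr2"],
)
for cmd in cmds:
    print(cmd)
```

For cumulative incremental backups, the same function could be called with only the last incremental directory in the list, matching the note above.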

**Restoring Incremental backups**

Once all the above steps are completed, restoring is the same as done for a full backup.

#### Further Reading

- [MySQL Point-In-Time-Recovery](https://dev.mysql.com/doc/refman/8.0/en/point-in-time-recovery.html)
- [Another MySQL backup tool - mysqlpump](https://dev.mysql.com/doc/refman/8.0/en/mysqlpump.html)
- [Another MySQL backup tool - mydumper](https://github.com/maxbube/mydumper/tree/master/docs)
- [A comparison between mysqldump, mysqlpump and mydumper](https://mydbops.wordpress.com/2019/03/26/mysqldump%E2%80%8B-vs-mysqlpump-vs-mydumper/)
- [Backup Best Practices](https://www.percona.com/blog/2020/05/27/best-practices-for-mysql-backups/)


================================================
FILE: courses/level101/databases_sql/concepts.md
================================================
*   Relational DBs are used for data storage. Even a file can be used to store data, but relational DBs are designed with specific goals:
    *   Efficiency
    *   Ease of access and management
    *   Organized
    *   Handle relations between data (represented as tables)
*   Transaction: a unit of work that can comprise multiple statements, executed together
*   ACID properties

    Set of properties that guarantee data integrity of DB transactions

    *   Atomicity: Each transaction is atomic (succeeds or fails completely)
    *   Consistency: Transactions only result in valid state (which includes rules, constraints, triggers etc.)
    *   Isolation: Each transaction is executed independently of others safely within a concurrent system
    *   Durability: Completed transactions will not be lost due to any later failures

	Let’s take some examples to illustrate the above properties.

    *   Account A has a balance of ₹200 & B has ₹400. Account A is transferring ₹100 to Account B. This transaction has a deduction from sender and an addition into the recipient’s balance. If the first operation passes successfully while the second fails, A’s balance would be ₹100 while B would be having ₹400 instead of ₹500. **Atomicity** in a DB ensures this partially failed transaction is rolled back.
    *   If the second operation above fails, it leaves the DB inconsistent (sum of balance of accounts before and after the operation is not the same). **Consistency** ensures that this does not happen.
    *   There are three operations, one to calculate interest for A’s account,  another to add that to A’s account, then transfer ₹100 from B to A. Without **isolation** guarantees, concurrent execution of these 3 operations may lead to a different outcome every time.
    *   What happens if the system crashes before the transactions are written to disk? **Durability** ensures that the changes are applied correctly during recovery.
*   Relational data
    *   Tables represent relations
    *   Columns (fields) represent attributes
    *   Rows are individual records
    *   Schema describes the structure of DB
*   SQL

    A query language to interact with and manage data.

    [CRUD operations](https://stackify.com/what-are-crud-operations/)&mdash;create, read, update, delete queries

    Management operations&mdash;create DBs/tables/indexes, backup, import/export, users, access controls, etc

    *Exercise*: Classify the below queries into the four types&mdash;DDL (definition), DML (manipulation), DCL (control) and TCL (transactions) and explain in detail.

        insert, create, drop, delete, update, commit, rollback, truncate, alter, grant, revoke

    You can practise these in the [lab section](https://linkedin.github.io/school-of-sre/level101/databases_sql/lab/).

*   Constraints

    Rules for data that can be stored. Query fails if you violate any of these defined on a table.

	*Primary key*: One or more columns that contain UNIQUE values, and cannot contain NULL values. A table can have only ONE primary key. An index on it is created by default.

    *Foreign key*: Links two tables together. Its value(s) match a primary key in a different table

	*Not null*: Does not allow null values

	*Unique*: Value of column must be unique across all rows

	*Default*: Provides a default value for a column if none is specified during insert

    *Check*: Allows only particular values (like Balance >= 0)


*   [Indexes](https://datageek.blog/en/2018/06/05/rdbms-basics-indexes-and-clustered-indexes/)

	Most indexes use B+ tree structure.

	Why use them: Speeds up queries (in large tables that fetch only a few rows, min/max queries, by eliminating rows from consideration, etc)

	*Types of indexes*: unique, primary key, fulltext, secondary

	Write-heavy loads and queries that mostly do full table scans or access a large number of rows do not benefit from indexes.


*   [Joins](https://www.sqlservertutorial.net/sql-server-basics/sql-server-joins/)

	Allows you to fetch related data from multiple tables, linking them together with some common field. Powerful but also resource-intensive and makes scaling databases difficult. This is the cause of many slow performing queries when run at scale, and the solution is almost always to find ways to reduce the joins.


*   [Access control](https://dev.mysql.com/doc/refman/8.0/en/access-control.html)

	DBs have privileged accounts for admin tasks, and regular accounts for clients. There are fine-grained controls on what actions (DDL, DML, etc. discussed earlier) are allowed for these accounts.

	The DB first verifies the user credentials (authentication), and then examines whether this user is permitted to perform the request (authorization) by looking up this information in some internal tables.

	Other controls include activity auditing that allows examining the history of actions done by a user, and resource limits which define the number of queries, connections, etc. allowed.
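
The atomicity example and the *Check* constraint from the list above can be tried out with a small sketch. It uses Python's built-in `sqlite3` module and an in-memory SQLite database instead of MySQL, purely so that it runs without a server; the table and function names are made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    "name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("A", 200), ("B", 400)])
conn.commit()


def transfer(conn, sender, recipient, amount):
    """Move `amount` between accounts; both updates commit together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, recipient),
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, sender),
            )
        return True
    except sqlite3.IntegrityError:
        return False


ok = transfer(conn, "A", "B", 100)      # both updates succeed
failed = transfer(conn, "A", "B", 500)  # debit violates the CHECK, so the credit is rolled back
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, failed, balances)
```

In the failed transfer the credit to B has already been executed when the debit from A violates the constraint, yet B's balance is unchanged afterwards because the whole transaction is rolled back.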


### Popular databases

Commercial, closed source: Oracle, Microsoft SQL Server, IBM DB2

Open source with optional paid support: MySQL, MariaDB, PostgreSQL

Individuals and small companies have always preferred open source DBs because of the huge cost associated with commercial software.

In recent times, even large organizations have moved away from commercial software to open source alternatives because of the flexibility and cost savings associated with it.

Lack of support is no longer a concern because of the paid support available from the developer and third parties.

MySQL is the most widely used open source DB, and it is widely supported by hosting providers, making it easy for anyone to use. It is part of the popular Linux-Apache-MySQL-PHP ([LAMP](https://en.wikipedia.org/wiki/LAMP_(software_bundle))) stack that became popular in the 2000s. We have many more choices for a programming language, but the rest of that stack is still widely used.


================================================
FILE: courses/level101/databases_sql/conclusion.md
================================================
# Conclusion
We have covered basic concepts of SQL databases. We have also covered some of the tasks that an SRE may be responsible for&mdash;there is so much more to learn and do. We hope this course gives you a good start and inspires you to explore further.


### Further reading

*   More practice with online resources like [this one](https://www.w3resource.com/sql-exercises/index.php)
*   [Normalization](https://beginnersbook.com/2015/05/normalization-in-dbms/)
*   [Routines](https://dev.mysql.com/doc/refman/8.0/en/stored-routines.html), [triggers](https://dev.mysql.com/doc/refman/8.0/en/trigger-syntax.html)
*   [Views](https://www.essentialsql.com/what-is-a-relational-database-view/)
*   [Transaction isolation levels](https://dev.mysql.com/doc/refman/8.0/en/innodb-transaction-isolation-levels.html)
*   [Sharding](https://www.digitalocean.com/community/tutorials/understanding-database-sharding)
*   [Setting up HA](https://severalnines.com/database-blog/introduction-database-high-availability-mysql-mariadb), [monitoring](https://blog.serverdensity.com/how-to-monitor-mysql/), [backups](https://dev.mysql.com/doc/refman/8.0/en/backup-methods.html)

================================================
FILE: courses/level101/databases_sql/innodb.md
================================================
### Why should you use this?

General purpose, row level locking, ACID support, transactions, crash recovery and multi-version concurrency control, etc.


### Architecture

![alt_text](images/innodb_architecture.png "InnoDB components")


### Key components:

*   Memory:
    *   Buffer pool: LRU cache of frequently used data (table and index) to be processed directly from memory, which speeds up processing. Important for tuning performance.
    *   Change buffer: Caches changes to secondary index pages when those pages are not in the buffer pool and merges it when they are fetched. Merging may take a long time and impact live queries. It also takes up part of the buffer pool. Avoids the extra I/O to read secondary indexes in.
    *   Adaptive hash index: Supplements InnoDB’s B-Tree indexes with fast hash lookup tables like a cache. Slight performance penalty for misses, also adds maintenance overhead of updating it. Hash collisions cause AHI rebuilding for large DBs.
    *   Log buffer: Holds log data before flush to disk.

        The size of each of the memory areas above is configurable and has a large impact on performance. Choosing values requires careful analysis of the workload and available resources, followed by benchmarking and tuning.

*   Disk:
    *   Tables: Stores data within rows and columns.
    *   Indexes: Helps find rows with specific column values quickly, avoids full table scans.
    *   Redo Logs: all transactions are written to them, and after a crash, the recovery process corrects data written by incomplete transactions and replays any pending ones.
    *   Undo Logs: Records associated with a single transaction that contain information about how to undo the latest change made by that transaction.
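
To illustrate the idea behind the buffer pool, here is a toy LRU cache. It is a plain LRU for illustration only: InnoDB's actual buffer pool uses a variation of the LRU algorithm, and the `LRUCache` class and `load_page` callback are hypothetical names for this sketch.

```python
from collections import OrderedDict


class LRUCache:
    """Toy page cache: evicts the least recently used page when full."""

    def __init__(self, capacity, load_page):
        self.capacity = capacity
        self.load_page = load_page  # called on a miss, stands in for a disk read
        self.pages = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get(self, page_id):
        if page_id in self.pages:
            self.pages.move_to_end(page_id)  # mark as most recently used
            self.hits += 1
            return self.pages[page_id]
        self.misses += 1
        page = self.load_page(page_id)
        self.pages[page_id] = page
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used page
        return page


cache = LRUCache(capacity=2, load_page=lambda pid: f"page-{pid}")
for pid in [1, 2, 1, 3, 2]:
    cache.get(pid)
print(cache.hits, cache.misses, list(cache.pages))
```

Every miss corresponds to a disk read, which is why the size of the buffer pool relative to the working set matters so much for performance.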



================================================
FILE: courses/level101/databases_sql/intro.md
================================================
# Relational Databases

### Prerequisites
*   Complete [Linux course](https://linkedin.github.io/school-of-sre/level101/linux_basics/intro/)
*   Install Docker (for lab section)

### What to expect from this course
You will have an understanding of what relational databases are, their advantages, and some MySQL specific concepts.

### What is not covered under this course
*   In-depth implementation details

*   Advanced topics like normalization, sharding

*   Specific tools for administration

### Introduction
The main purpose of database systems is to manage data. This includes storage, adding new data, deleting unused data, updating existing data, retrieving data within a reasonable response time, other maintenance tasks to keep the system running, etc.

### Pre-reads
[RDBMS Concepts](https://beginnersbook.com/2015/04/rdbms-concepts/)

### Course Contents
- [Key Concepts](https://linkedin.github.io/school-of-sre/level101/databases_sql/concepts/)
- [MySQL Architecture](https://linkedin.github.io/school-of-sre/level101/databases_sql/mysql/#mysql-architecture)
- [InnoDB](https://linkedin.github.io/school-of-sre/level101/databases_sql/innodb/)
- [Backup and Recovery](https://linkedin.github.io/school-of-sre/level101/databases_sql/backup_recovery/)
- [MySQL Replication](https://linkedin.github.io/school-of-sre/level101/databases_sql/replication/)
- Operational Concepts
    - [SELECT Query](https://linkedin.github.io/school-of-sre/level101/databases_sql/select_query/)
    - [Query Performance](https://linkedin.github.io/school-of-sre/level101/databases_sql/query_performance/)
- [Lab](https://linkedin.github.io/school-of-sre/level101/databases_sql/lab/)
- [Further Reading](https://linkedin.github.io/school-of-sre/level101/databases_sql/conclusion/#further-reading)


================================================
FILE: courses/level101/databases_sql/lab.md
================================================
**Prerequisites**

Install Docker

**Setup**

Create a working directory named `sos` or something similar, and `cd` into it.

Enter the following into a file named `my.cnf` under a directory named `custom`:

```shell
sos $ cat custom/my.cnf
[mysqld]

# These settings apply to MySQL server
# You can set port, socket path, buffer size etc.
# Below, we are configuring slow query settings

slow_query_log=1
slow_query_log_file=/var/log/mysqlslow.log
long_query_time=1
```
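
Option files use an INI-like layout: a `[group]` header followed by `name=value` lines. As a rough illustration, the settings above can be read back with Python's standard `configparser`. This is only an approximation of how MySQL parses option files (it does not handle MySQL-specific directives), and the `read_mysqld_options` helper is a hypothetical name.

```python
import configparser

# Same content as custom/my.cnf above.
MY_CNF = """
[mysqld]

# These settings apply to MySQL server
slow_query_log=1
slow_query_log_file=/var/log/mysqlslow.log
long_query_time=1
"""


def read_mysqld_options(text):
    """Return the options under the [mysqld] header as a dict of strings."""
    # allow_no_value lets the parser accept bare flags that have no `=value` part.
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.read_string(text)
    return dict(parser["mysqld"])


options = read_mysqld_options(MY_CNF)
print(options["slow_query_log_file"])     # /var/log/mysqlslow.log
print(float(options["long_query_time"]))  # 1.0
```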

Start a container and enable slow query log with the following:

```shell
sos $ docker run --name db -v custom:/etc/mysql/conf.d -e MYSQL_ROOT_PASSWORD=realsecret -d mysql:8
sos $ docker cp custom/my.cnf $(docker ps -qf "name=db"):/etc/mysql/conf.d/custom.cnf
sos $ docker restart $(docker ps -qf "name=db")
```

Import a sample database:

```shell
sos $ git clone git@github.com:datacharmer/test_db.git
sos $ docker cp test_db $(docker ps -qf "name=db"):/home/test_db/
sos $ docker exec -it $(docker ps -qf "name=db") bash
root@3ab5b18b0c7d:/# cd /home/test_db/
root@3ab5b18b0c7d:/# mysql -uroot -prealsecret mysql < employees.sql
root@3ab5b18b0c7d:/etc# touch /var/log/mysqlslow.log
root@3ab5b18b0c7d:/etc# chown mysql:mysql /var/log/mysqlslow.log
```

_Workshop 1: Run some sample queries_

Run the following:

```shell
$ mysql -uroot -prealsecret mysql
mysql>

# inspect DBs and tables
# the last 4 are MySQL internal DBs

mysql> SHOW DATABASES;
+--------------------+
| Database           |
+--------------------+
| employees          |
| information_schema |
| mysql              |
| performance_schema |
| sys                |
+--------------------+

mysql> USE employees;
mysql> SHOW TABLES;
+----------------------+
| Tables_in_employees  |
+----------------------+
| current_dept_emp     |
| departments          |
| dept_emp             |
| dept_emp_latest_date |
| dept_manager         |
| employees            |
| salaries             |
| titles               |
+----------------------+

# read a few rows
mysql> SELECT * FROM employees LIMIT 5;

# filter data by conditions
mysql> SELECT * FROM employees WHERE gender = 'M' LIMIT 5;

# find count of particular data
mysql> SELECT COUNT(*) FROM employees WHERE first_name = 'Sachin'; 
```

_Workshop 2: Use explain and explain analyze to profile a query, identify and add indexes required for improving performance_

```shell
# View all indexes on table 
# (\G prints each row vertically; replace it with a ; to get table output)

mysql> SHOW INDEX FROM employees FROM employees\G
*************************** 1. row ***************************
        Table: employees
   Non_unique: 0
     Key_name: PRIMARY
 Seq_in_index: 1
  Column_name: emp_no
    Collation: A
  Cardinality: 299113
     Sub_part: NULL
       Packed: NULL
         Null:
   Index_type: BTREE
      Comment:
Index_comment:
      Visible: YES
   Expression: NULL

# This query uses an index, identified by 'key' field
# By prefixing explain keyword to the command, 
# we get query plan (including key used)

mysql> EXPLAIN SELECT * FROM employees WHERE emp_no < 10005\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: employees
   partitions: NULL
         type: range
possible_keys: PRIMARY
          key: PRIMARY
      key_len: 4
          ref: NULL
         rows: 4
     filtered: 100.00
        Extra: Using where

# Compare that to the next query which does not utilize any index

mysql> EXPLAIN SELECT first_name, last_name FROM employees WHERE first_name = 'Sachin'\G
*************************** 1. row ***************************
           id: 1
  select_type: SIMPLE
        table: employees
   partitions: NULL
         type: ALL
possible_keys: NULL
          key: NULL
      key_len: NULL
          ref: NULL
         rows: 299113
     filtered: 10.00
        Extra: Using where

# Let's see how much time this query takes

mysql> EXPLAIN ANALYZE SELECT first_name, last_name FROM employees WHERE first_name = 'Sachin'\G
*************************** 1. row ***************************
EXPLAIN: -> Filter: (employees.first_name = 'Sachin')  (cost=30143.55 rows=29911) (actual time=28.284..3952.428 rows=232 loops=1)
    -> Table scan on employees  (cost=30143.55 rows=299113) (actual time=0.095..1996.092 rows=300024 loops=1)


# Cost (estimated by query planner) is 30143.55
# actual time=28.284ms for first row, 3952.428ms for all rows
# Now let's try adding an index and running the query again

mysql> CREATE INDEX idx_firstname ON employees(first_name);
Query OK, 0 rows affected (1.25 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> EXPLAIN ANALYZE SELECT first_name, last_name FROM employees WHERE first_name = 'Sachin';
+--------------------------------------------------------------------------------------------------------------------------------------------+
| EXPLAIN                                                                                                                                    |
+--------------------------------------------------------------------------------------------------------------------------------------------+
| -> Index lookup on employees using idx_firstname (first_name='Sachin')  (cost=81.20 rows=232) (actual time=0.551..2.934 rows=232 loops=1)
 |
+--------------------------------------------------------------------------------------------------------------------------------------------+
1 row in set (0.01 sec)

# Actual time=0.551ms for first row
# 2.934ms for all rows. A huge improvement!
# Also notice that the query involves only an index lookup,
# and no table scan (reading all rows of the table),
# which vastly reduces load on the DB.
```
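
Why an index lookup touches so few rows can be seen in a small simulation. The sketch below is a toy model, not MySQL's B-tree implementation: the table contents are made up, and a sorted Python list searched with `bisect` stands in for the index.

```python
import bisect

# Toy table of (emp_no, first_name) rows; the names and sizes are made up.
NAMES = ["Georgi", "Sachin", "Parto", "Anneke"]
rows = [(10001 + i, NAMES[i % len(NAMES)]) for i in range(8000)]


def table_scan(rows, first_name):
    """Examine every row, like the `Table scan on employees` plan."""
    matches = [emp_no for emp_no, name in rows if name == first_name]
    return matches, len(rows)  # (result, rows examined)


# Toy secondary index: (first_name, emp_no) pairs kept in sorted order.
index = sorted((name, emp_no) for emp_no, name in rows)


def index_lookup(index, first_name):
    """Binary-search to the first matching entry, then read only the matching run."""
    start = bisect.bisect_left(index, (first_name,))
    end = start
    while end < len(index) and index[end][0] == first_name:
        end += 1
    return [emp_no for _, emp_no in index[start:end]], end - start


scan_result, scan_examined = table_scan(rows, "Sachin")
idx_result, idx_examined = index_lookup(index, "Sachin")
print(scan_examined, idx_examined)  # 8000 2000
```

Both functions return the same employees, but the scan examines every row while the lookup reads only the entries that match.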

_Workshop 3: Identify slow queries on a MySQL server_

```shell
# Run the command below in two terminal tabs to open two shells into the container.

$ docker exec -it $(docker ps -qf "name=db") bash

# Open a `mysql` prompt in one of them and execute this command
# We have configured the server to log queries that take longer than 1s,
# so this `sleep(3)` will be logged

$ mysql -uroot -prealsecret mysql
mysql> select sleep(3);

# Now, in the other terminal, tail the slow log to find details about the query

root@62c92c89234d:/etc# tail -f /var/log/mysqlslow.log
/usr/sbin/mysqld, Version: 8.0.21 (MySQL Community Server - GPL). started with:
Tcp port: 3306  Unix socket: /var/run/mysqld/mysqld.sock
Time                 Id Command    Argument

# Time: 2020-11-26T14:53:44.822348Z
# User@Host: root[root] @ localhost []  Id:     9
# Query_time: 5.404938  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 1
use employees;
# Time: 2020-11-26T14:53:58.015736Z
# User@Host: root[root] @ localhost []  Id:     9
# Query_time: 10.000225  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 1

SET timestamp=1606402428;
select sleep(3);
```
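
The entries in the log above follow a regular layout, so they are easy to process with a script. The following is a minimal sketch, assuming every entry starts with a `# Time:` line as in the output shown; `parse_slow_log` is a hypothetical helper, not part of any MySQL tooling.

```python
import re

# Entries copied from the slow log output above.
SAMPLE_LOG = """\
# Time: 2020-11-26T14:53:44.822348Z
# User@Host: root[root] @ localhost []  Id:     9
# Query_time: 5.404938  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 1
use employees;
# Time: 2020-11-26T14:53:58.015736Z
# User@Host: root[root] @ localhost []  Id:     9
# Query_time: 10.000225  Lock_time: 0.000000 Rows_sent: 1  Rows_examined: 1

SET timestamp=1606402428;
select sleep(3);
"""

QUERY_TIME = re.compile(r"^# Query_time:\s+([\d.]+)")


def parse_slow_log(text):
    """Return one (query_time_seconds, [statements]) tuple per log entry."""
    entries = []
    current = None
    for line in text.splitlines():
        if line.startswith("# Time:"):
            # A new entry begins.
            current = {"query_time": None, "statements": []}
            entries.append(current)
            continue
        if current is None:
            continue  # server banner lines before the first entry
        match = QUERY_TIME.match(line)
        if match:
            current["query_time"] = float(match.group(1))
        elif line.strip() and not line.startswith("#"):
            current["statements"].append(line.strip())
    return [(e["query_time"], e["statements"]) for e in entries]


for query_time, statements in parse_slow_log(SAMPLE_LOG):
    print(query_time, statements)
```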

These were simulated examples with minimal complexity. In real life, the queries would be much more complex and the explain/analyze and slow query logs would have more details.


================================================
FILE: courses/level101/databases_sql/mysql.md
================================================
### MySQL architecture

![alt_text](images/mysql_architecture.png "MySQL architecture diagram")

MySQL architecture enables you to select the right storage engine for your needs, and abstracts away all implementation details from the end users (application engineers and [DBA](https://en.wikipedia.org/wiki/Database_administrator)) who only need to know a consistent stable API.

Application layer:

*   Connection handling: each client gets its own connection which is cached for the duration of access
*   Authentication: server checks (username, password, host) info of client and allows/rejects connection
*   Security: server determines whether the client has privileges to execute each query (check a user's privileges with the `SHOW GRANTS` command)

Server layer:

*   Services and utilities: backup/restore, replication, cluster, etc
*   SQL interface: clients run queries for data access and manipulation
*   SQL parser: creates a parse tree from the query (lexical/syntactic/semantic analysis and code generation)
*   Optimizer: optimizes queries using various algorithms and data available to it (table-level stats), modifies queries, order of scanning, indexes to use, etc. (check with `EXPLAIN` command)
*   Caches and buffers: the query cache stores query results (it was removed in MySQL 8.0), and the buffer pool (InnoDB) stores table and index data in [LRU](https://en.wikipedia.org/wiki/Cache_replacement_policies#Least_recently_used_(LRU)) fashion

Storage engine options:

*   InnoDB: most widely used, transaction support, ACID compliant, supports row-level locking, crash recovery and multi-version concurrency control. Default since MySQL 5.5.
*   MyISAM: fast, does not support transactions, provides table-level locking, great for read-heavy workloads, mostly in web and data warehousing. Default up to MySQL 5.1.
*   Archive: optimised for high speed inserts, compresses data as it is inserted, does not support transactions, ideal for storing and retrieving large amounts of seldom referenced historical, archived data
*   Memory: tables in memory. Fastest engine, supports table-level locking, does not support transactions, ideal for creating temporary tables or quick lookups, data is lost after a shutdown
*   CSV: stores data in CSV files, great for integrating into other applications that use this format
*   … etc.

It is possible to migrate from one storage engine to another. But this migration locks tables for all operations and is not online, as it changes the physical layout of the data. It takes a long time and is generally not recommended. Hence, choosing the right storage engine at the beginning is important.

The general guideline is to use InnoDB unless you have a specific need for one of the other storage engines.

Running `mysql> SHOW ENGINES;` shows you the supported engines on your MySQL server.

================================================
FILE: courses/level101/databases_sql/operations.md
================================================
*   Explain and explain+analyze

	`EXPLAIN <query>` shows the query plan chosen by the optimizer, including how tables are joined, which tables/rows are scanned, etc.

	`EXPLAIN ANALYZE` runs the query and shows the above plus additional info like estimated cost, actual number of rows returned, time taken, etc.

	This knowledge is useful to tweak queries and add indexes.

	Watch this performance tuning [tutorial video](https://www.youtube.com/watch?v=pjRTLPeUOug).

	Check out the [lab section](https://linkedin.github.io/school-of-sre/level101/databases_sql/lab/) for a hands-on about indexes.

*   [Slow query logs](https://dev.mysql.com/doc/refman/5.7/en/slow-query-log.html)

	Used to identify slow queries (configurable threshold), enabled in config or dynamically with a query.

	Check out the [lab section](https://linkedin.github.io/school-of-sre/level101/databases_sql/lab/) about identifying slow queries.

*   User management

	This includes creation and changes to users, like managing privileges, changing password etc.

*   Backup and restore strategies, pros and cons

	- Logical backup using `mysqldump` - slower but can be done online

	- Physical backup (copy data directory or use XtraBackup) -  quick backup/recovery. Copying data directory requires locking or shut down. XtraBackup is an improvement because it supports backups without shutting down (hot backup).

	- Others - PITR, snapshots etc.

*   Crash recovery process using redo logs

	After a crash, when you restart the server, it reads the redo logs and replays modifications to recover.

*   Monitoring MySQL

	- Key MySQL metrics: reads, writes, query runtime, errors, slow queries, connections, running threads, InnoDB metrics

	- Key OS metrics: CPU, load, memory, disk I/O, network


*   Replication

    Copies data from one instance to one or more instances. Helps in horizontal scaling, data protection, analytics and performance. Uses a binlog dump thread on the primary, and replication I/O and SQL threads on the secondary. Strategies include the standard asynchronous, semi-synchronous or group replication.

*   High Availability

    Ability to cope with failure at software, hardware and network level. Essential for anyone who needs 99.9%+ uptime. Can be implemented with replication or clustering solutions from MySQL, Percona, Oracle, etc. Requires expertise to set up and maintain. Failover can be manual, scripted or using tools like Orchestrator.

*   [Data directory](https://dev.mysql.com/doc/refman/8.0/en/data-directory.html)

    Data is stored in a particular directory, with nested directories for the data contained in each database. There are also MySQL log files, InnoDB log files, server process ID file and some other configs. The data directory is configurable.

*   [MySQL configuration](https://dev.mysql.com/doc/refman/5.7/en/server-configuration.html)

    This can be done by passing [parameters during startup](https://dev.mysql.com/doc/refman/5.7/en/server-options.html), or in a [file](https://dev.mysql.com/doc/refman/8.0/en/option-files.html). There are a few [standard paths](https://dev.mysql.com/doc/refman/8.0/en/option-files.html#option-file-order) where MySQL looks for config files, `/etc/my.cnf` is one of the commonly used paths. These options are organized under headers (`mysqld` for server and `mysql` for client), you can explore them more in the lab that follows.

*   [Logs](https://dev.mysql.com/doc/refman/5.7/en/server-logs.html)

    MySQL has logs for various purposes - general query log, errors, binary logs (for replication), slow query log. Only error log is enabled by default (to reduce I/O and storage requirement), the others can be enabled when required - by specifying config parameters at startup or running commands at runtime. [Log destination](https://dev.mysql.com/doc/refman/5.7/en/log-destinations.html) can also be tweaked with config parameters.


================================================
FILE: courses/level101/databases_sql/query_performance.md
================================================
### Query Performance Improvement
Query performance is a crucial aspect of relational databases. If not tuned correctly, `SELECT` queries can become slow and painful for the application and for the MySQL server as well. The important task is to identify the slow queries and improve their performance by either rewriting them or creating proper indexes on the tables involved.

#### The Slow Query Log
The slow query log contains SQL statements that take longer to execute than the value set in the config parameter `long_query_time`. These queries are the candidates for optimization. There are some good utilities to summarize slow query logs, like `mysqldumpslow` (provided by MySQL itself) and `pt-query-digest` (provided by Percona). The following config parameters are used to enable the log and effectively catch slow queries:

| Variable | Explanation | Example value |
| --- | --- | --- |
| slow_query_log | Enables or disables slow query logs | ON |
| slow_query_log_file | The location of the slow query log | /var/lib/mysql/mysql-slow.log |
| long_query_time | Threshold time in seconds. A query that takes longer than this is logged in the slow query log | 5 |
| log_queries_not_using_indexes | When enabled with the slow query log, the queries which do not make use of any index are also logged in the slow query log even though they take less time than long_query_time. | ON |

So, for this section, we will enable `slow_query_log`, set `long_query_time` to **0.3 (300 ms)**, and enable `log_queries_not_using_indexes` as well.

Below are the queries that we will execute on the `employees` database.

1. 
    ```
    SELECT * FROM employees WHERE last_name = 'Koblick'
    ```
1. 
    ```
    SELECT * FROM salaries WHERE salary >= 100000
    ```
1. 
    ```
    SELECT * FROM titles WHERE title = 'Manager'
    ```
1. 
    ```
    SELECT * FROM employees WHERE year(hire_date) = 1995
    ```
1. 
    ```
    SELECT year(e.hire_date), max(s.salary) FROM employees e JOIN salaries s ON e.emp_no=s.emp_no GROUP BY year(e.hire_date)
    ```

Now, queries **1**, **3** and **4** executed in under 300 ms, but if we check the slow query log, we will find them logged because they do not use any index. Queries **2** and **5** take longer than 300 ms and also do not use any index.

Use the following command to get the summary of the slow query log:

```shell
mysqldumpslow /var/lib/mysql/mysql-slow.log
```

![slow query log analysis](images/mysqldumpslow_out.png "slow query log analysis")

The snapshot also contains some other queries that were logged along with the ones mentioned above. `mysqldumpslow` replaces the actual values used in queries with _N_ (for numbers) and _S_ (for strings). This can be overridden with the `-a` option; however, that increases the number of output lines if different values are used in similar queries.
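
The value abstraction described above can be approximated in a few lines. This is a rough sketch using regular expressions, not the actual `mysqldumpslow` implementation; `abstract_query` is a hypothetical helper, and the `'Dredge'` and `90000` variants are made-up inputs added to show grouping.

```python
import re
from collections import Counter


def abstract_query(sql):
    """Replace string literals with 'S' and numeric literals with N."""
    sql = re.sub(r"'(?:[^'\\]|\\.)*'", "'S'", sql)  # quoted strings first
    sql = re.sub(r"\b\d+(?:\.\d+)?\b", "N", sql)     # then standalone numbers
    return sql


queries = [
    "SELECT * FROM employees WHERE last_name = 'Koblick'",
    "SELECT * FROM employees WHERE last_name = 'Dredge'",
    "SELECT * FROM salaries WHERE salary >= 100000",
    "SELECT * FROM salaries WHERE salary >= 90000",
]

# Queries that differ only in their literal values collapse into one pattern.
summary = Counter(abstract_query(q) for q in queries)
for pattern, count in summary.items():
    print(count, pattern)
```

Grouping by pattern rather than by exact text is what keeps the summary short: four logged statements become two lines, each with a count of 2.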

#### The EXPLAIN Plan
The `EXPLAIN` command is used with any query that we want to analyze. It describes the query execution plan, how MySQL sees and executes the query. `EXPLAIN` works with `SELECT`, `INSERT`, `UPDATE` and `DELETE` statements. It tells about different aspects of the query like, how tables are joined, indexes used or not, etc. The important thing here is to understand the basic `EXPLAIN` plan output of a query to determine its performance. 

Let's take the following query as an example,

```shell
mysql> EXPLAIN SELECT * FROM salaries WHERE salary = 100000;
+----+-------------+----------+------------+------+---------------+------+---------+------+---------+----------+-------------+
| id | select_type | table    | partitions | type | possible_keys | key  | key_len | ref  | rows    | filtered | Extra       |
+----+-------------+----------+------------+------+---------------+------+---------+------+---------+----------+-------------+
|  1 | SIMPLE      | salaries | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 2838426 |    10.00 | Using where |
+----+-------------+----------+------------+------+---------------+------+---------+------+---------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
```

The key aspects to understand in the above output are:

- **Partitions** - the partitions from which rows would be matched by the query. It is only relevant if the table is partitioned.
- **Possible_keys** - the list of indexes that were considered during creation of the execution plan.
- **Key** - the index that will be used while executing the query.
- **Rows** - the estimated number of rows MySQL expects to examine to execute the query.
- **Filtered** - the estimated percentage of examined rows that remain after the table condition is applied. The maximum value, 100, means no examined rows were filtered out, which is the most optimized case.
- **Extra** - this tells some extra information on how MySQL evaluates, whether the query is using only `WHERE` clause to match target rows, any index or temporary table, etc.

So, for the above query, we can determine that there are no partitions, there are no candidate indexes to be used and so no index is used at all, over 2M rows are estimated to be examined and only about 10% of them are expected to be included in the result, and lastly, only a `WHERE` clause is used to match the target rows.
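
The `rows` and `filtered` columns can be combined into a rough estimate of how many rows survive the condition. A small sketch of that arithmetic, using the values from the plan above (the function name is hypothetical, and both inputs are optimizer estimates rather than measured counts):

```python
def estimated_matching_rows(rows, filtered_pct):
    """Estimated rows left after the WHERE condition: rows * filtered / 100."""
    return rows * filtered_pct / 100


# Values taken from the EXPLAIN output for the salaries query above.
print(estimated_matching_rows(2838426, 10.00))  # 283842.6
```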

#### Creating an Index
Indexes are used to speed up selecting relevant rows for a given column value. Without an index, MySQL starts with the first row and goes through the entire table to find matching rows. If the table has too many rows, the operation becomes costly. With indexes, MySQL determines the position to start looking for the data without reading the full table.

A primary key is also an index; it is the fastest one and is stored along with the table data. Secondary indexes are stored outside of the table data and are used to further enhance the performance of SQL statements. Indexes are mostly stored as B-Trees, with some exceptions: spatial indexes use R-Trees and memory tables use hash indexes.

There are 2 ways to create indexes:

- While creating a table - if we know beforehand the columns that will drive the most number of `WHERE` clauses in `SELECT` queries, then we can put an index over them while creating a table.
- Altering a Table - To improve the performance of a troubling query, we create an index on a table which already has data in it using `ALTER` or `CREATE INDEX` command. This operation does not block the table but might take some time to complete depending on the size of the table.

Let’s look at the query that we discussed in the previous section. It’s clear that scanning over 2M records is not a good idea when only 10% of those records are actually in the resultset. 

Hence, we create an index on the salary column of the salaries table.

```SQL
CREATE INDEX idx_salary ON salaries(salary)
```
OR

```SQL
ALTER TABLE salaries ADD INDEX idx_salary(salary)
```

And the same explain plan now looks like this:

```shell
mysql> EXPLAIN SELECT * FROM salaries WHERE salary = 100000;
+----+-------------+----------+------------+------+---------------+------------+---------+-------+------+----------+-------+
| id | select_type | table    | partitions | type | possible_keys | key        | key_len | ref   | rows | filtered | Extra |
+----+-------------+----------+------------+------+---------------+------------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | salaries | NULL       | ref  | idx_salary    | idx_salary | 4       | const |   13 |   100.00 | NULL  |
+----+-------------+----------+------------+------+---------------+------------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
```

Now the index used is `idx_salary`, the one we just created. The index helped examine only 13 records, all of which are in the resultset. The query execution time is also reduced from over 700ms to almost negligible.

Let’s look at another example. Here, we are searching for a specific combination of `first_name` and `last_name`. But, we might also search based on `last_name` only.

```shell
mysql> EXPLAIN SELECT * FROM employees WHERE last_name = 'Dredge' AND first_name = 'Yinghua';
+----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+-------------+
| id | select_type | table     | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra       |
+----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+-------------+
|  1 | SIMPLE      | employees | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 299468 |     1.00 | Using where |
+----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
```

Now only 1% of the almost 300K records is in the resultset. Although the query is quick because we have only 300K records, it will be a pain when the number of records runs into millions. In this case, we create an index on `last_name` and `first_name`, not separately, but as a composite index including both columns.

```SQL
CREATE INDEX idx_last_first ON employees(last_name, first_name)
```

```shell
mysql> EXPLAIN SELECT * FROM employees WHERE last_name = 'Dredge' AND first_name = 'Yinghua';
+----+-------------+-----------+------------+------+----------------+----------------+---------+-------------+------+----------+-------+
| id | select_type | table     | partitions | type | possible_keys  | key            | key_len | ref         | rows | filtered | Extra |
+----+-------------+-----------+------------+------+----------------+----------------+---------+-------------+------+----------+-------+
|  1 | SIMPLE      | employees | NULL       | ref  | idx_last_first | idx_last_first | 124     | const,const |    1 |   100.00 | NULL  |
+----+-------------+-----------+------------+------+----------------+----------------+---------+-------------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
```

We put `last_name` before `first_name` while creating the index because the optimizer starts from the leftmost prefix of the index while evaluating the query. For example, if we have a 3-column index like `idx(c1, c2, c3)`, then the index can serve searches on (c1), (c1, c2) or (c1, c2, c3). In our case, if your `WHERE` clause has only `first_name`, this index won’t work.

```shell
mysql> EXPLAIN SELECT * FROM employees WHERE first_name = 'Yinghua';
+----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+-------------+
| id | select_type | table     | partitions | type | possible_keys | key  | key_len | ref  | rows   | filtered | Extra       |
+----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+-------------+
|  1 | SIMPLE      | employees | NULL       | ALL  | NULL          | NULL | NULL    | NULL | 299468 |    10.00 | Using where |
+----+-------------+-----------+------------+------+---------------+------+---------+------+--------+----------+-------------+
1 row in set, 1 warning (0.00 sec)
```

But, if you have only the `last_name` in the `WHERE` clause, it will work as expected.

```shell
mysql> EXPLAIN SELECT * FROM employees WHERE last_name = 'Dredge';
+----+-------------+-----------+------------+------+----------------+----------------+---------+-------+------+----------+-------+
| id | select_type | table     | partitions | type | possible_keys  | key            | key_len | ref   | rows | filtered | Extra |
+----+-------------+-----------+------------+------+----------------+----------------+---------+-------+------+----------+-------+
|  1 | SIMPLE      | employees | NULL       | ref  | idx_last_first | idx_last_first | 66      | const |  200 |   100.00 | NULL  |
+----+-------------+-----------+------------+------+----------------+----------------+---------+-------+------+----------+-------+
1 row in set, 1 warning (0.00 sec)
```
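
The leftmost-prefix behaviour seen in the three plans above can be captured in a tiny helper. This is a simplified sketch that considers only equality conditions; the real optimizer also weighs range conditions, statistics and other access methods, and `usable_prefix` is a hypothetical name.

```python
def usable_prefix(index_columns, equality_columns):
    """Return the leading index columns that the given equality conditions can use."""
    prefix = []
    for column in index_columns:
        if column not in equality_columns:
            break  # a gap in the prefix stops index usage for later columns
        prefix.append(column)
    return prefix


idx_last_first = ["last_name", "first_name"]

print(usable_prefix(idx_last_first, {"last_name", "first_name"}))  # ['last_name', 'first_name']
print(usable_prefix(idx_last_first, {"last_name"}))                # ['last_name']
print(usable_prefix(idx_last_first, {"first_name"}))               # []
```

An empty result corresponds to the `type: ALL` plan for the `first_name`-only query, while a non-empty prefix corresponds to the `ref` lookups on `idx_last_first`.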

For another example, use the following queries:

```SQL
CREATE TABLE employees_2 LIKE employees;
CREATE TABLE salaries_2 LIKE salaries;
ALTER TABLE salaries_2 DROP PRIMARY KEY;
```

We made copies of `employees` and `salaries` tables without the Primary Key of `salaries` table to understand an example of `SELECT` with `JOIN`.

When you have queries like the below, it becomes tricky to identify the pain point of the query.

```shell
mysql> SELECT e.first_name, e.last_name, s.salary, e.hire_date FROM employees_2 e JOIN salaries_2 s ON e.emp_no=s.emp_no WHERE e.last_name='Dredge';
1860 rows in set (4.44 sec)
```

This query is taking about 4.5 seconds to complete with 1860 rows in the resultset. Let’s look at the Explain plan. There will be 2 records in the Explain plan as 2 tables are used in the query.

```shell
mysql> EXPLAIN SELECT e.first_name, e.last_name, s.salary, e.hire_date FROM employees_2 e JOIN salaries_2 s ON e.emp_no=s.emp_no WHERE e.last_name='Dredge';
+----+-------------+-------+------------+--------+------------------------+---------+---------+--------------------+---------+----------+-------------+
| id | select_type | table | partitions | type   | possible_keys          | key     | key_len | ref                | rows    | filtered | Extra       |
+----+-------------+-------+------------+--------+------------------------+---------+---------+--------------------+---------+----------+-------------+
|  1 | SIMPLE      | s     | NULL       | ALL    | NULL                   | NULL    | NULL    | NULL               | 2837194 |   100.00 | NULL        |
|  1 | SIMPLE      | e     | NULL       | eq_ref | PRIMARY,idx_last_first | PRIMARY | 4       | employees.s.emp_no |       1 |     5.00 | Using where |
+----+-------------+-------+------------+--------+------------------------+---------+---------+--------------------+---------+----------+-------------+
2 rows in set, 1 warning (0.00 sec)
```

These are in order of evaluation, i.e. `salaries_2` is evaluated first and then `employees_2` is joined to it. As the plan shows, MySQL scans almost all the rows of the `salaries_2` table and tries to match the `employees_2` rows as per the `JOIN` condition. Although the `WHERE` clause is used in fetching the final resultset, the index corresponding to the `WHERE` clause is not used for the `employees_2` table.

If the join is done on two indexed columns that have the same data type, it is generally faster. So, let’s create an index on the `emp_no` column of `salaries_2` table and analyze the query again.

```SQL
CREATE INDEX idx_empno ON salaries_2(emp_no)
```

```shell
mysql> EXPLAIN SELECT e.first_name, e.last_name, s.salary, e.hire_date FROM employees_2 e JOIN salaries_2 s ON e.emp_no=s.emp_no WHERE e.last_name='Dredge';
+----+-------------+-------+------------+------+------------------------+----------------+---------+--------------------+------+----------+-------+
| id | select_type | table | partitions | type | possible_keys          | key            | key_len | ref                | rows | filtered | Extra |
+----+-------------+-------+------------+------+------------------------+----------------+---------+--------------------+------+----------+-------+
|  1 | SIMPLE      | e     | NULL       | ref  | PRIMARY,idx_last_first | idx_last_first | 66      | const              |  200 |   100.00 | NULL  |
|  1 | SIMPLE      | s     | NULL       | ref  | idx_empno              | idx_empno      | 4       | employees.e.emp_no |    9 |   100.00 | NULL  |
+----+-------------+-------+------------+------+------------------------+----------------+---------+--------------------+------+----------+-------+
2 rows in set, 1 warning (0.00 sec)
```

Now, not only did the index help the optimizer to examine only a few rows in both tables, it reversed the order of the tables in evaluation. The `employees_2` table is evaluated first and rows are selected as per the index respective to the `WHERE` clause. Then, the records are joined to `salaries_2` table as per the index used due to the `JOIN` condition. The execution time of the query came down **from 4.5s to 0.02s**.

```shell
mysql> SELECT e.first_name, e.last_name, s.salary, e.hire_date FROM employees_2 e JOIN salaries_2 s ON e.emp_no=s.emp_no WHERE e.last_name='Dredge'\G
1860 rows in set (0.02 sec)
```
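
A quick way to compare the two join plans is to multiply the `rows` estimates of each plan: the product roughly indicates how many row combinations MySQL has to examine. The sketch below applies that arithmetic to the two `EXPLAIN` outputs above; the function name is hypothetical and the inputs are optimizer estimates.

```python
from math import prod


def estimated_row_combinations(plan_rows):
    """Multiply the per-table `rows` estimates of a join plan."""
    return prod(plan_rows)


before_index = estimated_row_combinations([2837194, 1])  # s scanned fully, then e by PRIMARY
after_index = estimated_row_combinations([200, 9])       # e by idx_last_first, then s by idx_empno

print(before_index)  # 2837194
print(after_index)   # 1800
```

The estimate of 1800 after adding the index is close to the 1860 rows the query actually returns, while the plan without the index has to walk through millions of rows to produce the same result.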

================================================
FILE: courses/level101/databases_sql/replication.md
================================================
### MySQL Replication
Replication enables data from one MySQL host (termed as Primary) to be copied to another MySQL host (termed as Replica). MySQL Replication is asynchronous in nature by default, but it can be changed to semi-synchronous with some configurations.

Some common applications of MySQL replication are:

- **Read-scaling** - as multiple hosts can replicate the data from a single primary host, we can set up as many replicas as we need and scale reads through them, i.e. application writes will go to a single primary host and the reads can balance between all the replicas that are there. Such a setup can improve the write performance as well, as the primary is dedicated to only updates and not reads.
- **Backups using replicas** - the backup process can sometimes be a little heavy. But if we have replicas configured, then we can use one of them to get the backup without affecting the primary data at all.
- **Disaster Recovery** - a replica in some other geographical region paves a proper path to configure disaster recovery.
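
The read-scaling idea above can be sketched as a tiny router. This is only an illustration, not a production proxy: the `ReadWriteRouter` class and the host names are hypothetical, and it assumes that any statement starting with `SELECT` is a read.

```python
from itertools import cycle


class ReadWriteRouter:
    """Send writes to the primary and spread reads across replicas (round-robin)."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # Fall back to the primary for reads when no replica is configured.
        self._readers = cycle(replicas or [primary])

    def route(self, statement):
        # Assumption: only plain SELECT statements count as reads.
        is_read = statement.lstrip().upper().startswith("SELECT")
        return next(self._readers) if is_read else self.primary


router = ReadWriteRouter("primary-1", ["replica-1", "replica-2"])
targets = [
    router.route("SELECT * FROM employees"),
    router.route("UPDATE employees SET last_name = 'X' WHERE emp_no = 1"),
    router.route("SELECT COUNT(*) FROM salaries"),
]
```

Because replication is asynchronous by default, a read routed to a replica may return slightly stale data.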

MySQL supports different types of synchronizations as well:

- **Asynchronous** - this is the default synchronization method. It is one-way, i.e. one host serves as primary and one or more hosts as replica. We will discuss this method throughout the replication topic.

![replication topologies](images/replication_topologies.png "Different Replication Scenarios")

- **Semi-Synchronous** - in this type of synchronization, a commit performed on the primary host is blocked until at least one replica acknowledges that it has received the transaction. Once any one replica acknowledges, control is returned to the session that performed the transaction. This gives stronger consistency guarantees than asynchronous replication, but commits are slower.
- **Delayed** - we can deliberately lag the replica in a typical MySQL replication by the number of seconds the use case requires. This type of replication safeguards against severe human errors such as dropping or corrupting data on the primary. For example, in the above diagram for Delayed Replication, if a `DROP DATABASE` is executed by mistake on the primary, we still have 30 minutes to recover the data from R2, as that command has not been replicated on R2 yet.
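
The recovery window that delayed replication gives us can be modelled with a few lines of Python. This is a toy model under stated assumptions: `BinlogEvent` and `applied_on_replica` are hypothetical names, and time is just a count of seconds.

```python
from dataclasses import dataclass


@dataclass
class BinlogEvent:
    committed_at: int  # seconds since an arbitrary start, on the primary
    statement: str


def applied_on_replica(events, now, delay_seconds):
    """Return the statements a replica delayed by delay_seconds has applied at time now."""
    return [e.statement for e in events if e.committed_at + delay_seconds <= now]


events = [
    BinlogEvent(0, "INSERT INTO t VALUES (1)"),
    BinlogEvent(3600, "DROP DATABASE app"),  # the mistake, one hour in
]

# Five minutes after the mistake:
delayed = applied_on_replica(events, now=3900, delay_seconds=1800)
undelayed = applied_on_replica(events, now=3900, delay_seconds=0)
```

Here the replica delayed by 30 minutes still holds the data, while a replica with no delay has already applied the `DROP DATABASE`.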

**Pre-Requisites**

Before we dive into setting up replication, we should know about the binary logs. Binary logs play a very important role in MySQL replication. Binary logs, commonly known as *binlogs*, contain events describing the changes made to the database, such as table structure changes and data changes via DML operations. They are not used to log `SELECT` statements. For replication, the primary sends information about the changes made to the database to the replicas using its `binlogs`, and the replicas make the same data changes.

With respect to MySQL replication, the binary log format can be of two types, which decide the main type of replication:

- Statement-Based Replication or SBR
- Row-Based Replication or RBR

**Statement-Based Binlog Format**

Originally, the replication in MySQL was based on SQL statements getting replicated and executed on the replica from the primary. This is called statement-based logging. The `binlog` contains the exact SQL statement run by the session. 

![SBR update example](images/sbr_example_update_1.png "SBR update example")

So, if we run the above statements to insert 3 records and then update all 3 in a single update statement, they will be logged exactly as we executed them.

![SBR binlog](images/sbr_binlog_view_1.png "SBR binlog")

**Row-Based Binlog Format**

The row-based format is the default in recent MySQL releases. It is quite different from the statement format: here, row events are logged instead of statements. In the above example, one update statement affected 3 records, but the `binlog` had only one `UPDATE` statement; with the row-based format, the `binlog` will have an event for each record updated.

![RBR update example](images/rbr_example_update_1.png "RBR update example")

![RBR binlog](images/rbr_binlog_view_1.png "RBR binlog")
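
The difference in what gets logged can be sketched with a toy model. The real binlog is a binary format; the `log_update` helper and the table below are hypothetical and only count events.

```python
def log_update(rows, predicate, new_value, statement):
    """Apply an update and return (statement_log, row_log) as each format would record it."""
    statement_log = [statement]  # statement-based: the SQL text, logged once
    row_log = []                 # row-based: one event per changed row
    for row in rows:
        if predicate(row):
            before = dict(row)
            row["name"] = new_value
            row_log.append({"before": before, "after": dict(row)})
    return statement_log, row_log


table = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}, {"id": 3, "name": "c"}]
stmt_log, row_log = log_update(
    table, lambda r: r["id"] <= 3, "z", "UPDATE t SET name = 'z' WHERE id <= 3"
)
```

One statement that touches three rows yields one statement-based entry but three row-based events, which is why row-based logs tend to be larger.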


**Statement-Based v/s Row-Based binlogs**

Let’s have a look at the operational differences between statement-based and row-based binlogs. 

| Statement-Based              | Row-Based                                                          |
|------------------------------|--------------------------------------------------------------------|
| Logs SQL statements as executed | Logs row events based on SQL statements executed |
| Takes less disk space | Takes more disk space |
| Restoring using binlogs is faster | Restoring using binlogs is slower |
| When used for replication, if a statement uses a non-deterministic function whose value is computed at execution time, like `sysdate()`, `uuid()` etc, the output could be different on the replica, which makes it inconsistent. | Whatever is executed becomes a row event with values, so there will be no problem if such functions are used in SQL statements. |
| Only statements are logged so no other row events are generated. | A lot of events are generated when a table is copied into another using `INSERT INTO SELECT`. |

**Note**: There is another type of `binlog` format called **Mixed**. With mixed logging, statement-based is used by default but it switches to row-based in certain cases. If MySQL cannot guarantee that statement-based logging is safe for the statements executed, it issues a warning and switches to row-based for those statements.
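
Why functions like `uuid()` are unsafe for statement-based logging can be shown with a small simulation. The helper names are hypothetical, and Python's `uuid.uuid4()` stands in for MySQL's `UUID()`.

```python
import uuid


def replay_statement(statement_fn):
    """Statement-based: the replica re-executes the statement, re-evaluating functions."""
    return statement_fn()


def replay_row_event(row_image):
    """Row-based: the replica copies the row values the primary already computed."""
    return dict(row_image)


def insert_with_uuid():
    # Stand-in for: INSERT INTO t (id) VALUES (UUID())
    return {"id": str(uuid.uuid4())}


primary_row = insert_with_uuid()
sbr_replica_row = replay_statement(insert_with_uuid)
rbr_replica_row = replay_row_event(primary_row)
```

The statement-based replica ends up with a different `id` than the primary, while the row-based replica holds an identical row.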

We will be using binary log format as Row for the entire replication topic.

**Replication in Motion**

![replication in motion](images/replication_function.png "Replication in motion")

The above figure indicates how a typical MySQL replication works.

1. `Replica_IO_Thread` is responsible for fetching the binlog events from the primary binary logs to the replica.
2. On the Replica host, relay logs are created which are exact copies of the binary logs. If the binary logs on primary are in row format, the relay logs will be the same.
3. `Replica_SQL_Thread` applies the relay logs on the replica MySQL server.
4. If `log-bin` is enabled on the replica, then the replica will have its own binary logs as well. If `log-slave-updates` is enabled, then it will have the updates from the primary logged in the binlogs as well.
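
The four steps above can be sketched as a small in-memory model. This is a simplification under stated assumptions: the `Primary` and `Replica` classes are hypothetical, events are plain key/value pairs, and the two threads are modelled as methods called one after the other.

```python
class Primary:
    def __init__(self):
        self.binlog = []  # ordered list of row events

    def write(self, key, value):
        self.binlog.append((key, value))


class Replica:
    def __init__(self, log_replica_updates=False):
        self.relay_log = []
        self.data = {}
        self.binlog = []
        self.read_pos = 0     # how far the IO thread has read the primary binlog
        self.applied_pos = 0  # how far the SQL thread has applied the relay log
        self.log_replica_updates = log_replica_updates

    def io_thread_step(self, primary):
        """Copy new binlog events from the primary into the local relay log."""
        new_events = primary.binlog[self.read_pos:]
        self.relay_log.extend(new_events)
        self.read_pos += len(new_events)

    def sql_thread_step(self):
        """Apply relay log events that have not been applied yet."""
        for key, value in self.relay_log[self.applied_pos:]:
            self.data[key] = value
            if self.log_replica_updates:
                self.binlog.append((key, value))
        self.applied_pos = len(self.relay_log)


primary = Primary()
primary.write("emp_no:10001", "Georgi")
primary.write("emp_no:10002", "Bezalel")

replica = Replica(log_replica_updates=True)
replica.io_thread_step(primary)
fetched_not_applied = dict(replica.data)  # still empty: the SQL thread has not run
replica.sql_thread_step()
```

Keeping the two positions separate mirrors why a replica can have fetched events it has not yet applied.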

#### Setting up Replication
In this section, we will set up a simple asynchronous replication. The binlogs will be in row-based format. The replication will be set up on two fresh hosts with no prior data present. There are two different ways in which we can set up replication. 

- **Binlog based** - Each replica keeps a record of the binlog coordinates on the primary - current binlog and position in the binlog till where it has read and processed. So, at a time different replicas might be reading different parts of the same binlog.
- **GTID based** - Every transaction gets an identifier called a global transaction identifier or GTID. There is no need to keep a record of binlog coordinates; as long as the replica has all the GTIDs executed on the primary, it is consistent with the primary. A typical GTID has the form `server_uuid:#`, where `#` is a positive integer.
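
The GTID consistency check described above amounts to a set comparison. The following sketch parses GTID sets of the form `server_uuid:1-3` and reports what a replica is missing; the function names are hypothetical, and this is not how MySQL implements it internally.

```python
def parse_gtid_set(text):
    """Parse 'uuid:1-3:7,uuid2:5' into {uuid: set of transaction ids}."""
    result = {}
    for part in text.replace("\n", "").split(","):
        part = part.strip()
        if not part:
            continue
        server_uuid, *intervals = part.split(":")
        ids = result.setdefault(server_uuid, set())
        for interval in intervals:
            start, _, end = interval.partition("-")
            ids.update(range(int(start), int(end or start) + 1))
    return result


def missing_on_replica(primary_set, replica_set):
    """Return the transactions executed on the primary but not on the replica."""
    primary, replica = parse_gtid_set(primary_set), parse_gtid_set(replica_set)
    return {
        server_uuid: sorted(ids - replica.get(server_uuid, set()))
        for server_uuid, ids in primary.items()
        if ids - replica.get(server_uuid, set())
    }


SERVER_UUID = "e17d0920-d00e-11eb-a3e6-000d3aa00f87"
in_sync = missing_on_replica(f"{SERVER_UUID}:1-5", f"{SERVER_UUID}:1-5")
lagging = missing_on_replica(f"{SERVER_UUID}:1-5", f"{SERVER_UUID}:1-3")
```

An empty result means the replica has executed everything the primary has.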

We will set up GTID-based replication in the following section, but will also discuss the binlog-based replication setup.

**Primary Host Configurations**

The following config parameters should be present in the primary `my.cnf` file for setting up GTID-based replication.

```
server-id - a unique ID for the mysql server
log-bin - the binlog location
binlog-format - ROW | STATEMENT (we will use ROW)
gtid-mode - ON
enforce-gtid-consistency - ON (allows execution of only those statements which can be logged using GTIDs)
```

**Replica Host Configurations**

The following config parameters should be present in the replica `my.cnf` file for setting up replication.

```
server-id - different than the primary host
log-bin - (optional, if you want replica to log its own changes as well)
binlog-format - depends on the above
gtid-mode - ON
enforce-gtid-consistency - ON
log-slave-updates - ON (if binlog is enabled, then we can enable this. This enables the replica to log the changes coming from the primary along with its own changes. Helps in setting up chain replication)
```
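
As a rough aid, the parameters listed above can be checked programmatically before starting replication. The sketch below uses Python's `configparser` on a `my.cnf`-style text. The sample values (`server-id = 2`, `log-bin = mysql-bin`) and the `gtid_config_problems` helper are assumptions for illustration, and a real `my.cnf` may contain directives such as `!includedir` that `configparser` cannot read.

```python
import configparser

REQUIRED_FOR_GTID = {
    "server-id": None,  # any unique value is acceptable
    "gtid-mode": "ON",
    "enforce-gtid-consistency": "ON",
}


def gtid_config_problems(cnf_text):
    """Return a list of problems found in the [mysqld] section of a my.cnf-style text."""
    parser = configparser.ConfigParser(allow_no_value=True, interpolation=None)
    parser.read_string(cnf_text)
    if not parser.has_section("mysqld"):
        return ["missing [mysqld] section"]
    # MySQL accepts both dashes and underscores in option names; normalise to dashes.
    options = {k.replace("_", "-"): v for k, v in parser.items("mysqld")}
    problems = []
    for name, expected in REQUIRED_FOR_GTID.items():
        if name not in options:
            problems.append(f"{name} is not set")
        elif expected and (options[name] or "").upper() != expected:
            problems.append(f"{name} should be {expected}")
    return problems


# Hypothetical replica configuration using the parameters listed above.
SAMPLE_REPLICA_CNF = """
[mysqld]
server-id = 2
log-bin = mysql-bin
binlog-format = ROW
gtid-mode = ON
enforce-gtid-consistency = ON
log-slave-updates = ON
"""

problems = gtid_config_problems(SAMPLE_REPLICA_CNF)
```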

**Replication User**

Every replica connects to the primary using a `mysql` user for replicating. So there must be a `mysql` user account for the same on the primary host. Any user can be used for this purpose provided it has `REPLICATION SLAVE` privilege. If the sole purpose is replication, then we can have a user with only the required privilege.

On the primary host:

```shell
mysql> CREATE USER 'repl_user'@'<replica_IP>' IDENTIFIED BY 'xxxxx';

mysql> GRANT REPLICATION SLAVE ON *.* TO repl_user@'<replica_IP>';
```

**Obtaining Starting position from Primary**

Run the following command on the primary host:

```shell
mysql> SHOW MASTER STATUS\G
*************************** 1. row ***************************
             File: mysql-bin.000001
         Position: 73
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: e17d0920-d00e-11eb-a3e6-000d3aa00f87:1-3
1 row in set (0.00 sec)
```

If we are working with binary log-based replication, the top two output lines (`File` and `Position`) are the most important ones. They tell us the current binlog on the primary host and the position up to which it has been written. For fresh hosts we know that no data has been written, so we can directly set up replication using the very first `binlog` file and position 4. If we are setting up replication from a backup, the way we obtain the starting position changes. For GTIDs, `Executed_Gtid_Set` is the value where the primary is right now. Again, for a fresh setup, we don’t have to specify anything about the starting point and it will start from transaction ID 1, but when we set up from a backup, the backup will contain the GTID positions up to which the backup was taken.
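
When this step is scripted, the coordinates can be pulled out of the vertical output shown above. The parser below is a minimal sketch; `parse_vertical_output` is a hypothetical helper that assumes each field sits on its own `Key: value` line.

```python
def parse_vertical_output(text):
    """Parse the 'Key: value' lines of the mysql client's vertical output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Skip the row separator line and anything that is not a key/value pair.
        if sep and not line.lstrip().startswith("*"):
            fields[key.strip()] = value.strip()
    return fields


MASTER_STATUS = """
*************************** 1. row ***************************
             File: mysql-bin.000001
         Position: 73
     Binlog_Do_DB:
 Binlog_Ignore_DB:
Executed_Gtid_Set: e17d0920-d00e-11eb-a3e6-000d3aa00f87:1-3
1 row in set (0.00 sec)
"""

status = parse_vertical_output(MASTER_STATUS)
coordinates = (status["File"], int(status["Position"]))
```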

**Setting up Replica**

The replication setup must know about the primary host, the user and password to connect, the binlog coordinates (for binlog-based replication) or the GTID auto-position parameter.
The following command is used for setting up:

```SQL
CHANGE MASTER TO
master_host = '<primary host IP>',
master_port = <primary host port - default=3306>,
master_user = 'repl_user',
master_password = 'xxxxx',
master_auto_position = 1;
```

**Note**: The `CHANGE MASTER TO` command has been replaced with `CHANGE REPLICATION SOURCE TO` from MySQL 8.0.23 onwards. All the *master* and *slave* keywords are likewise replaced with *source* and *replica*.

If it is binlog-based replication, then instead of `master_auto_position`, we need to specify the binlog coordinates.

```
master_log_file = 'mysql-bin.000001',
master_log_pos = 4
```
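
The two variants differ only in their last options, which the following sketch makes explicit by rendering the statement for either mode. The `change_source_statement` helper and the IP address are hypothetical, and the function does no escaping of quotes, so it is not suitable for untrusted input.

```python
def change_source_statement(host, user, password, port=3306,
                            log_file=None, log_pos=None, new_syntax=True):
    """Render the replication setup statement for GTID or binlog-coordinate mode."""
    command = "CHANGE REPLICATION SOURCE TO" if new_syntax else "CHANGE MASTER TO"
    prefix = "source" if new_syntax else "master"
    options = [
        f"{prefix}_host = '{host}'",
        f"{prefix}_port = {int(port)}",
        f"{prefix}_user = '{user}'",
        f"{prefix}_password = '{password}'",
    ]
    if log_file is None:
        options.append(f"{prefix}_auto_position = 1")        # GTID-based
    else:
        options.append(f"{prefix}_log_file = '{log_file}'")  # binlog-based
        options.append(f"{prefix}_log_pos = {int(log_pos)}")
    return command + "\n" + ",\n".join(options) + ";"


gtid_stmt = change_source_statement("10.0.0.1", "repl_user", "xxxxx", new_syntax=False)
binlog_stmt = change_source_statement(
    "10.0.0.1", "repl_user", "xxxxx", log_file="mysql-bin.000001", log_pos=4
)
```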

**Starting Replication and Check Status**

Now that everything is configured, we just need to start the replication on the replica via the following command:

```SQL
START SLAVE;
```

OR from MySQL 8.0.22 onwards,

```SQL
START REPLICA;
```

We can check whether the replication is running successfully with the following command:

```SQL
SHOW SLAVE STATUS\G
```

OR from MySQL 8.0.22 onwards,

```SQL
SHOW REPLICA STATUS\G
```

```shell
mysql> SHOW REPLICA STATUS\G
*************************** 1. row ***************************
             Replica_IO_State: Waiting for master to send event
                  Source_Host: <primary IP>
                  Source_User: repl_user
                  Source_Port: <primary port>
                Connect_Retry: 60
              Source_Log_File: mysql-bin.000001
          Read_Source_Log_Pos: 852
               Relay_Log_File: mysql-relay-bin.000002
                Relay_Log_Pos: 1067
        Relay_Source_Log_File: mysql-bin.000001
           Replica_IO_Running: Yes
          Replica_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB:
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Source_Log_Pos: 852
              Relay_Log_Space: 1283
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Source_SSL_Allowed: No
           Source_SSL_CA_File:
           Source_SSL_CA_Path:
              Source_SSL_Cert:
            Source_SSL_Cipher:
               Source_SSL_Key:
        Seconds_Behind_Source: 0
Source_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Source_Server_Id: 1
                  Source_UUID: e17d0920-d00e-11eb-a3e6-000d3aa00f87
             Source_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
    Replica_SQL_Running_State: Slave has read all relay log; waiting for more updates
           Source_Retry_Count: 86400
                  Source_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Source_SSL_Crl:
           Source_SSL_Crlpath:
           Retrieved_Gtid_Set: e17d0920-d00e-11eb-a3e6-000d3aa00f87:1-3
            Executed_Gtid_Set: e17d0920-d00e-11eb-a3e6-000d3aa00f87:1-3
                Auto_Position: 1
         Replicate_Rewrite_DB:
                 Channel_Name:
           Source_TLS_Version:
       Source_public_key_path:
        Get_Source_public_key: 0
            Network_Namespace:
1 row in set (0.00 sec)
```

Some of the parameters are explained below:

| Parameters               | Description                                                |
|--------------------------|------------------------------------------------------------|
| Relay_Source_Log_File    | the primary’s binlog file containing the most recent event executed by the replica’s SQL thread |
| Exec_Source_Log_Pos      | the position in the above file up to which the replica has executed events. These two parameters are of utmost importance when binlog-based replication is used    |
| Replica_IO_Running       | IO thread of replica is running or not                         |
| Replica_SQL_Running      | SQL thread of replica is running or not                       |
| Seconds_Behind_Source    | the difference of seconds when a statement was executed on Primary and then on Replica. This indicates how much replication lag is there |
| Source_UUID              | the uuid of the primary host |
| Retrieved_Gtid_Set       | the GTIDs fetched from the primary host by the replica to be executed |
| Executed_Gtid_Set        | the GTIDs executed on the replica. This set remains the same for the entire cluster if the replicas are in sync |
| Auto_Position            | it directs the replica to fetch the next GTID automatically|
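
A monitoring script would typically reduce these fields to a single verdict. The sketch below assumes the status has already been read into a dict of strings; the `replication_health` helper and the 30-second lag threshold are arbitrary choices for illustration.

```python
def replication_health(status, max_lag_seconds=30):
    """Classify a replica from a dict of SHOW REPLICA STATUS fields."""
    if status.get("Replica_IO_Running") != "Yes":
        return "io_thread_stopped"
    if status.get("Replica_SQL_Running") != "Yes":
        return "sql_thread_stopped"
    lag = status.get("Seconds_Behind_Source")
    if lag in (None, "NULL", ""):
        return "lag_unknown"
    if int(lag) > max_lag_seconds:
        return "lagging"
    return "healthy"


ok = replication_health(
    {"Replica_IO_Running": "Yes", "Replica_SQL_Running": "Yes", "Seconds_Behind_Source": "0"}
)
behind = replication_health(
    {"Replica_IO_Running": "Yes", "Replica_SQL_Running": "Yes", "Seconds_Behind_Source": "120"}
)
```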

**Create a Replica for the already setup cluster**

The steps discussed in the previous section cover setting up replication on two fresh hosts. When we have to set up a replica for a host which is already serving applications, a backup of the primary is used: either a fresh backup taken for the replica (which should only be done if the primary is serving little traffic) or a recently taken backup.

If the size of the databases on the MySQL primary server is small (less than 100 GB is recommended), then `mysqldump` can be used to take the backup with the following options.

```shell
mysqldump -uroot -p -hhost_ip -P3306 --all-databases --single-transaction --master-data=1 > primary_host.bkp
```

- `--single-transaction` - this option starts a transaction before taking the backup, which ensures it is consistent. As transactions are isolated from each other, no other writes affect the backup.
- `--master-data` - this option is required if binlog-based replication is desired to be set up. It includes the binary log file and log file position in the backup file.

When GTID mode is enabled and `mysqldump` is executed, it includes the executed GTID set, which is used to start the replica after the backup position. The `mysqldump` output file will contain the following:

![GTID info in mysqldump](images/mysqldump_gtid_text.png "GTID info in mysqldump")

It is recommended to comment these out before restoring; otherwise they could throw errors. Also, using `--master-data=2` will automatically comment out the `master_log_file` line.

Similarly, when taking a backup of the host using `xtrabackup`, the `xtrabackup_info` file contains the information about the binlog file and file position, as well as the GTID executed set.

```
server_version = 8.0.25
start_time = 2021-06-22 03:45:17
end_time = 2021-06-22 03:45:20
lock_time = 0
binlog_pos = filename 'mysql-bin.000007', position '196', GTID of the last change 'e17d0920-d00e-11eb-a3e6-000d3aa00f87:1-5'
innodb_from_lsn = 0
innodb_to_lsn = 18153149
partial = N
incremental = N
format = file
compressed = N
encrypted = N
```
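
If the restore is automated, the `binlog_pos` line can be parsed to obtain the values needed in the next step. This is a sketch that assumes the line has exactly the layout shown above; `parse_binlog_pos` is a hypothetical helper.

```python
import re

BINLOG_POS_RE = re.compile(
    r"filename '(?P<file>[^']+)', position '(?P<pos>\d+)'"
    r"(?:, GTID of the last change '(?P<gtid>[^']*)')?"
)


def parse_binlog_pos(info_text):
    """Extract binlog file, position and GTID set from xtrabackup_info contents."""
    for line in info_text.splitlines():
        key, _, value = line.partition("=")
        if key.strip() == "binlog_pos":
            match = BINLOG_POS_RE.search(value)
            if match:
                return match.group("file"), int(match.group("pos")), match.group("gtid")
    return None


SAMPLE_INFO = (
    "lock_time = 0\n"
    "binlog_pos = filename 'mysql-bin.000007', position '196', "
    "GTID of the last change 'e17d0920-d00e-11eb-a3e6-000d3aa00f87:1-5'\n"
    "innodb_from_lsn = 0\n"
)
backup_position = parse_binlog_pos(SAMPLE_INFO)
```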

Now, after setting up the MySQL server on the desired host, restore the backup taken using either of the above methods. If the intended way is binlog-based replication, then use the binlog file and position info in the following command:

```
CHANGE REPLICATION SOURCE TO
source_host = 'primary_ip',
source_port = 3306,
source_user = 'repl_user',
source_password = 'xxxxx',
source_log_file = 'mysql-bin.000007',
source_log_pos = 196;
```

If the replication needs to be set up via GTIDs, then run the commands below to tell the replica about the GTIDs already executed. On the replica host, run the following commands:

```
RESET MASTER;

SET GLOBAL gtid_purged = 'e17d0920-d00e-11eb-a3e6-000d3aa00f87:1-5';

CHANGE REPLICATION SOURCE TO
source_host = 'primary_ip',
source_port = 3306,
source_user = 'repl_user',
source_password = 'xxxxx',
source_auto_position = 1;
```

The `RESET MASTER` command resets the binary log position to its initial state. It can be skipped if the host is a freshly installed MySQL server, but since we restored a backup, it is necessary. The `gtid_purged` global variable tells the replica which GTIDs have already been executed, so that replication can start after them. Then, in the `CHANGE REPLICATION SOURCE TO` command, we set `source_auto_position` to 1, which automatically fetches the next GTID to proceed.

#### Further Reading

- [More applications of Replication](https://dev.mysql.com/doc/refman/8.0/en/replication-solutions.html)
- [Automated Failovers using MySQL Orchestrator](https://github.com/openark/orchestrator/tree/master/docs)

================================================
FILE: courses/level101/databases_sql/select_query.md
================================================
### SELECT Query
The most commonly used command while working with MySQL is `SELECT`. It is used to fetch the resultset from one or more tables. 
The general form of a typical select query looks like:

```
SELECT expr
FROM table1
[WHERE condition]
[GROUP BY column_list HAVING condition]
[ORDER BY column_list ASC|DESC]
[LIMIT #]
```

The above general form contains some commonly used clauses of a `SELECT` query:

- **expr** - comma-separated column list or * (for all columns)
- **WHERE** - a condition is provided; only the records for which it is true are selected.
- **GROUP BY** - groups the entire resultset based on the column list provided. An aggregate function is recommended to be present in the select expression of the query. **HAVING** complements grouping by putting a condition on the selected or any other aggregate function, so that only the groups satisfying it are returned.
- **ORDER BY** - sorts the resultset based on the column list in ascending or descending order.
- **LIMIT** - commonly used to limit the number of records.

Let’s have a look at some examples for a better understanding of the above. The dataset used for the examples below is available [here](https://dev.mysql.com/doc/employee/en/employees-installation.html) and is free to use.

**Select all records**

```shell
mysql> SELECT * FROM employees LIMIT 5;
+--------+------------+------------+-----------+--------+------------+
| emp_no | birth_date | first_name | last_name | gender | hire_date  |
+--------+------------+------------+-----------+--------+------------+
|  10001 | 1953-09-02 | Georgi     | Facello   | M      | 1986-06-26 |
|  10002 | 1964-06-02 | Bezalel    | Simmel    | F      | 1985-11-21 |
|  10003 | 1959-12-03 | Parto      | Bamford   | M      | 1986-08-28 |
|  10004 | 1954-05-01 | Chirstian  | Koblick   | M      | 1986-12-01 |
|  10005 | 1955-01-21 | Kyoichi    | Maliniak  | M      | 1989-09-12 |
+--------+------------+------------+-----------+--------+------------+
5 rows in set (0.00 sec)
```

**Select specific fields for all records**

```shell
mysql> SELECT first_name, last_name, gender FROM employees LIMIT 5;
+------------+-----------+--------+
| first_name | last_name | gender |
+------------+-----------+--------+
| Georgi     | Facello   | M      |
| Bezalel    | Simmel    | F      |
| Parto      | Bamford   | M      |
| Chirstian  | Koblick   | M      |
| Kyoichi    | Maliniak  | M      |
+------------+-----------+--------+
5 rows in set (0.00 sec)
```

**Select all records Where hire_date >= January 1, 1990**

```shell
mysql> SELECT * FROM employees WHERE hire_date >= '1990-01-01' LIMIT 5;
+--------+------------+------------+-------------+--------+------------+
| emp_no | birth_date | first_name | last_name   | gender | hire_date  |
+--------+------------+------------+-------------+--------+------------+
|  10008 | 1958-02-19 | Saniya     | Kalloufi    | M      | 1994-09-15 |
|  10011 | 1953-11-07 | Mary       | Sluis       | F      | 1990-01-22 |
|  10012 | 1960-10-04 | Patricio   | Bridgland   | M      | 1992-12-18 |
|  10016 | 1961-05-02 | Kazuhito   | Cappelletti | M      | 1995-01-27 |
|  10017 | 1958-07-06 | Cristinel  | Bouloucos   | F      | 1993-08-03 |
+--------+------------+------------+-------------+--------+------------+
5 rows in set (0.01 sec)
```

**Select first_name and last_name from all records where the year of birth_date >= 1960 AND gender = ‘F’**

```shell
mysql> SELECT first_name, last_name FROM employees WHERE year(birth_date) >= 1960 AND gender='F' LIMIT 5;
+------------+-----------+
| first_name | last_name |
+------------+-----------+
| Bezalel    | Simmel    |
| Duangkaew  | Piveteau  |
| Divier     | Reistad   |
| Jeong      | Reistad   |
| Mingsen    | Casley    |
+------------+-----------+
5 rows in set (0.00 sec)
```

**Display the total number of records**

```shell
mysql> SELECT COUNT(*) FROM employees;
+----------+
| COUNT(*) |
+----------+
|   300024 |
+----------+
1 row in set (0.05 sec)
```

**Display gender-wise count of all records**

```shell
mysql> SELECT gender, COUNT(*) FROM employees GROUP BY gender;
+--------+----------+
| gender | COUNT(*) |
+--------+----------+
| M      |   179973 |
| F      |   120051 |
+--------+----------+
2 rows in set (0.14 sec)
```

**Display the year of hire_date and number of employees hired that year, also only those years where more than 20k employees were hired**

```shell
mysql> SELECT year(hire_date), COUNT(*) FROM employees GROUP BY year(hire_date) HAVING COUNT(*) > 20000;
+-----------------+----------+
| year(hire_date) | COUNT(*) |
+-----------------+----------+
|            1985 |    35316 |
|            1986 |    36150 |
|            1987 |    33501 |
|            1988 |    31436 |
|            1989 |    28394 |
|            1990 |    25610 |
|            1991 |    22568 |
|            1992 |    20402 |
+-----------------+----------+
8 rows in set (0.14 sec)
```

**Display all records ordered by their hire_date in descending order. If hire_date is the same, then in order of their birth_date ascending order**

```shell
mysql> SELECT * FROM employees ORDER BY hire_date DESC, birth_date ASC LIMIT 5;
+--------+------------+------------+-----------+--------+------------+
| emp_no | birth_date | first_name | last_name | gender | hire_date  |
+--------+------------+------------+-----------+--------+------------+
| 463807 | 1964-06-12 | Bikash     | Covnot    | M      | 2000-01-28 |
| 428377 | 1957-05-09 | Yucai      | Gerlach   | M      | 2000-01-23 |
| 499553 | 1954-05-06 | Hideyuki   | Delgrande | F      | 2000-01-22 |
| 222965 | 1959-08-07 | Volkmar    | Perko     | F      | 2000-01-13 |
|  47291 | 1960-09-09 | Ulf        | Flexer    | M      | 2000-01-12 |
+--------+------------+------------+-----------+--------+------------+
5 rows in set (0.12 sec)
```

### SELECT - JOINS
The `JOIN` clause is used to produce a combined resultset from two or more tables based on certain conditions. It can also be used with `UPDATE` and `DELETE` statements, but we will focus on the `SELECT` query.

Following is a basic general form for joins:

```
SELECT table1.col1, table2.col1, ... (any combination)
FROM
table1 <join_type> table2
ON (or USING depends on join_type) table1.column_for_joining = table2.column_for_joining
WHERE …
```

Any number of columns can be selected, but it is recommended to select only those which are relevant to increase the readability of the resultset. All other clauses like `WHERE`, `GROUP BY` are not mandatory.
Let’s discuss the types of JOINs supported by MySQL Syntax.

**Inner Join**

This joins table A with table B on a condition. Only the records where the condition is True are selected in the resultset.

Display some details of employees along with their salary:

```shell
mysql> SELECT e.emp_no,e.first_name,e.last_name,s.salary FROM employees e JOIN salaries s ON e.emp_no=s.emp_no LIMIT 5;
+--------+------------+-----------+--------+
| emp_no | first_name | last_name | salary |
+--------+------------+-----------+--------+
|  10001 | Georgi     | Facello   |  60117 |
|  10001 | Georgi     | Facello   |  62102 |
|  10001 | Georgi     | Facello   |  66074 |
|  10001 | Georgi     | Facello   |  66596 |
|  10001 | Georgi     | Facello   |  66961 |
+--------+------------+-----------+--------+
5 rows in set (0.00 sec)
```

A similar result can be achieved by:

```shell
mysql> SELECT e.emp_no,e.first_name,e.last_name,s.salary FROM employees e JOIN salaries s USING (emp_no) LIMIT 5;
+--------+------------+-----------+--------+
| emp_no | first_name | last_name | salary |
+--------+------------+-----------+--------+
|  10001 | Georgi     | Facello   |  60117 |
|  10001 | Georgi     | Facello   |  62102 |
|  10001 | Georgi     | Facello   |  66074 |
|  10001 | Georgi     | Facello   |  66596 |
|  10001 | Georgi     | Facello   |  66961 |
+--------+------------+-----------+--------+
5 rows in set (0.00 sec)
```

And also by:

```shell
mysql> SELECT e.emp_no,e.first_name,e.last_name,s.salary FROM employees e NATURAL JOIN salaries s LIMIT 5;
+--------+------------+-----------+--------+
| emp_no | first_name | last_name | salary |
+--------+------------+-----------+--------+
|  10001 | Georgi     | Facello   |  60117 |
|  10001 | Georgi     | Facello   |  62102 |
|  10001 | Georgi     | Facello   |  66074 |
|  10001 | Georgi     | Facello   |  66596 |
|  10001 | Georgi     | Facello   |  66961 |
+--------+------------+-----------+--------+
5 rows in set (0.00 sec)
```

**Outer Join**

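There are two main types:

- **LEFT** - joining complete table A with table B on a condition. All the records from table A are selected, but from table B, only those records are selected where the condition is True.
- **RIGHT** - the exact opposite of the `LEFT JOIN`: all the records from table B are selected, but from table A, only those records are selected where the condition is True.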

Let us assume the below tables for understanding `LEFT JOIN` better.

```shell
mysql> SELECT * FROM dummy1;
+----------+------------+
| same_col | diff_col_1 |
+----------+------------+
|        1 | A          |
|        2 | B          |
|        3 | C          |
+----------+------------+

mysql> SELECT * FROM dummy2;
+----------+------------+
| same_col | diff_col_2 |
+----------+------------+
|        1 | X          |
|        3 | Y          |
+----------+------------+
```

A simple `LEFT JOIN` will look like the one below:

```shell
mysql> SELECT * FROM dummy1 d1 LEFT JOIN dummy2 d2 ON d1.same_col=d2.same_col;
+----------+------------+----------+------------+
| same_col | diff_col_1 | same_col | diff_col_2 |
+----------+------------+----------+------------+
|        1 | A          |        1 | X          |
|        3 | C          |        3 | Y          |
|        2 | B          |     NULL | NULL       |
+----------+------------+----------+------------+
3 rows in set (0.00 sec)
```

Which can also be written as:

```shell
mysql> SELECT * FROM dummy1 d1 LEFT JOIN dummy2 d2 USING(same_col);
+----------+------------+------------+
| same_col | diff_col_1 | diff_col_2 |
+----------+------------+------------+
|        1 | A          | X          |
|        3 | C          | Y          |
|        2 | B          | NULL       |
+----------+------------+------------+
3 rows in set (0.00 sec)
```

And also as:

```shell
mysql> SELECT * FROM dummy1 d1 NATURAL LEFT JOIN dummy2 d2;
+----------+------------+------------+
| same_col | diff_col_1 | diff_col_2 |
+----------+------------+------------+
|        1 | A          | X          |
|        3 | C          | Y          |
|        2 | B          | NULL       |
+----------+------------+------------+
3 rows in set (0.00 sec)
```
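
A `RIGHT JOIN` is not shown above, so here is a small sketch of what it returns. It uses Python's built-in `sqlite3` module as a stand-in for MySQL, and because older SQLite versions lack `RIGHT JOIN`, it expresses `dummy1 RIGHT JOIN dummy2` as the equivalent `dummy2 LEFT JOIN dummy1`. The extra row `(4, 'Z')` in `dummy2` is an assumption added so that an unmatched row exists.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dummy1 (same_col INTEGER, diff_col_1 TEXT);
    CREATE TABLE dummy2 (same_col INTEGER, diff_col_2 TEXT);
    INSERT INTO dummy1 VALUES (1, 'A'), (2, 'B'), (3, 'C');
    INSERT INTO dummy2 VALUES (1, 'X'), (3, 'Y'), (4, 'Z');
""")

# dummy2 LEFT JOIN dummy1 keeps every dummy2 row, which is what
# dummy1 RIGHT JOIN dummy2 would return.
right_join_rows = conn.execute("""
    SELECT d1.same_col, d1.diff_col_1, d2.same_col, d2.diff_col_2
    FROM dummy2 d2
    LEFT JOIN dummy1 d1 ON d1.same_col = d2.same_col
    ORDER BY d2.same_col
""").fetchall()
```

Every row of `dummy2` appears in the result; the row with no match in `dummy1` carries `NULL` (`None` in Python) in the `dummy1` columns, and the row `(2, 'B')` from `dummy1` is absent.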

**Cross Join**

This does a cross product of table A and table B without any condition. It doesn’t have a lot of applications in the real world.

A simple `CROSS JOIN` looks like this:

```shell
mysql> SELECT * FROM dummy1 CROSS JOIN dummy2;
+----------+------------+----------+------------+
| same_col | diff_col_1 | same_col | diff_col_2 |
+----------+------------+----------+------------+
|        1 | A          |        3 | Y          |
|        1 | A          |        1 | X          |
|        2 | B          |        3 | Y          |
|        2 | B          |        1 | X          |
|        3 | C          |        3 | Y          |
|        3 | C          |        1 | X          |
+----------+------------+----------+------------+
6 rows in set (0.01 sec)
```

One use case that can come in handy is when you have to fill in some missing entries. For example, all the entries from `dummy1` must be inserted into a similar table `dummy3`, and each record must have 3 entries with statuses 1, 5 and 7.

```shell
mysql> DESC dummy3;
+----------+----------+------+-----+---------+-------+
| Field    | Type     | Null | Key | Default | Extra |
+----------+----------+------+-----+---------+-------+
| same_col | int      | YES  |     | NULL    |       |
| value    | char(15) | YES  |     | NULL    |       |
| status   | smallint | YES  |     | NULL    |       |
+----------+----------+------+-----+---------+-------+
3 rows in set (0.02 sec)
```

Either you create an `INSERT` query script with as many entries as in `dummy1` or use `CROSS JOIN` to produce the required resultset.

```shell
mysql> SELECT * FROM dummy1 
CROSS JOIN 
(SELECT 1 UNION SELECT 5 UNION SELECT 7) T2 
ORDER BY same_col;
+----------+------------+---+
| same_col | diff_col_1 | 1 |
+----------+------------+---+
|        1 | A          | 1 |
|        1 | A          | 5 |
|        1 | A          | 7 |
|        2 | B          | 1 |
|        2 | B          | 5 |
|        2 | B          | 7 |
|        3 | C          | 1 |
|        3 | C          | 5 |
|        3 | C          | 7 |
+----------+------------+---+
9 rows in set (0.00 sec)
```

The **T2** section in the above query is called a *sub-query*. We will discuss the same in the next section.
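
To complete the use case, the resultset can be fed into an `INSERT ... SELECT`. The sketch below again uses Python's `sqlite3` as a stand-in for MySQL; mapping `diff_col_1` to the `value` column and naming the sub-query column `status` are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dummy1 (same_col INTEGER, diff_col_1 TEXT);
    CREATE TABLE dummy3 (same_col INTEGER, value TEXT, status INTEGER);
    INSERT INTO dummy1 VALUES (1, 'A'), (2, 'B'), (3, 'C');

    INSERT INTO dummy3 (same_col, value, status)
    SELECT d1.same_col, d1.diff_col_1, t2.status
    FROM dummy1 d1
    CROSS JOIN (SELECT 1 AS status UNION SELECT 5 UNION SELECT 7) t2;
""")

rows = conn.execute(
    "SELECT same_col, value, status FROM dummy3 ORDER BY same_col, status"
).fetchall()
```

Each of the 3 records in `dummy1` is paired with each of the 3 statuses, so `dummy3` ends up with 9 rows.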

**Natural Join**

This implicitly selects the common column from table A and table B and performs an inner join.

```shell
mysql> SELECT e.emp_no,e.first_name,e.last_name,s.salary FROM employees e NATURAL JOIN salaries s LIMIT 5;
+--------+------------+-----------+--------+
| emp_no | first_name | last_name | salary |
+--------+------------+-----------+--------+
|  10001 | Georgi     | Facello   |  60117 |
|  10001 | Georgi     | Facello   |  62102 |
|  10001 | Georgi     | Facello   |  66074 |
|  10001 | Georgi     | Facello   |  66596 |
|  10001 | Georgi     | Facello   |  66961 |
+--------+------------+-----------+--------+
5 rows in set (0.00 sec)
```

Notice how `NATURAL JOIN` and `USING` ensure that the common column is displayed only once if you are not explicitly selecting columns for the query.

**Some More Examples**

Display `emp_no`, `salary`, `title` and `dept` of the employees where salary > 80000.

```shell
mysql> SELECT e.emp_no, s.salary, t.title, d.dept_no 
FROM  
employees e 
JOIN salaries s USING (emp_no) 
JOIN titles t USING (emp_no) 
JOIN dept_emp d USING (emp_no) 
WHERE s.salary > 80000 
LIMIT 5;
+--------+--------+--------------+---------+
| emp_no | salary | title        | dept_no |
+--------+--------+--------------+---------+
|  10017 |  82163 | Senior Staff | d001    |
|  10017 |  86157 | Senior Staff | d001    |
|  10017 |  89619 | Senior Staff | d001    |
|  10017 |  91985 | Senior Staff | d001    |
|  10017 |  96122 | Senior Staff | d001    |
+--------+--------+--------------+---------+
5 rows in set (0.00 sec)
```

Display title-wise count of employees in each department ordered by `dept_no`:

```shell
mysql> SELECT d.dept_no, t.title, COUNT(*) 
FROM titles t 
LEFT JOIN dept_emp d USING (emp_no) 
GROUP BY d.dept_no, t.title 
ORDER BY d.dept_no 
LIMIT 10;
+---------+--------------------+----------+
| dept_no | title              | COUNT(*) |
+---------+--------------------+----------+
| d001    | Manager            |        2 |
| d001    | Senior Staff       |    13940 |
| d001    | Staff              |    16196 |
| d002    | Manager            |        2 |
| d002    | Senior Staff       |    12139 |
| d002    | Staff              |    13929 |
| d003    | Manager            |        2 |
| d003    | Senior Staff       |    12274 |
| d003    | Staff              |    14342 |
| d004    | Assistant Engineer |     6445 |
+---------+--------------------+----------+
10 rows in set (1.32 sec)
```

#### SELECT - Subquery
A subquery is a query nested inside another query. It generally produces a smaller result set that can be used to power a `SELECT` query in many ways. It can be used in a `WHERE` condition, or in place of a `JOIN` where a `JOIN` would be overkill.
Subqueries used in the `FROM` clause are also termed derived tables. They must have a table alias in the `SELECT` query.

Let’s look at some examples of subqueries.

Here, we got the department name from the `departments` table by a subquery which used `dept_no` from `dept_emp` table.

```shell
mysql> SELECT e.emp_no, 
(SELECT dept_name FROM departments WHERE dept_no=d.dept_no) dept_name FROM employees e 
JOIN dept_emp d USING (emp_no) 
LIMIT 5;
+--------+-----------------+
| emp_no | dept_name       |
+--------+-----------------+
|  10001 | Development     |
|  10002 | Sales           |
|  10003 | Production      |
|  10004 | Production      |
|  10005 | Human Resources |
+--------+-----------------+
5 rows in set (0.01 sec)
```

Here, we used the `AVG` query (which gets the average salary) as a subquery to list the employees whose highest salary is more than the average.

```shell
mysql> SELECT AVG(salary) FROM salaries;
+-------------+
| AVG(salary) |
+-------------+
|  63810.7448 |
+-------------+
1 row in set (0.80 sec)

mysql> SELECT e.emp_no, MAX(s.salary) 
FROM employees e 
NATURAL JOIN salaries s 
GROUP BY e.emp_no 
HAVING MAX(s.salary) > (SELECT AVG(salary) FROM salaries) 
LIMIT 10;
+--------+---------------+
| emp_no | MAX(s.salary) |
+--------+---------------+
|  10001 |         88958 |
|  10002 |         72527 |
|  10004 |         74057 |
|  10005 |         94692 |
|  10007 |         88070 |
|  10009 |         94443 |
|  10010 |         80324 |
|  10013 |         68901 |
|  10016 |         77935 |
|  10017 |         99651 |
+--------+---------------+
10 rows in set (0.56 sec)
```

================================================
FILE: courses/level101/git/branches.md
================================================
# Working With Branches

Let's come back to our local repo, which has two commits. So far, what we have is a single line of history: commits are chained one after another. But sometimes you may need to work on two different features in parallel in the same repo. One option would be to make a new folder/repo with the same code and use that for the other feature. But there's a better way: use _branches_. Since git follows a tree-like structure for commits, we can use branches to work on different sets of features. From a commit, two or more branches can be created, and branches can also be merged.

Using branches, there can exist multiple lines of history, and we can check out any of them and work on it. Checking out, as we discussed earlier, simply means replacing the contents of the directory (repo) with the snapshot at the checked-out version.

Let's create a branch and see what it looks like:

```bash
$ git branch b1
$ git log --oneline --graph
* 7f3b00e (HEAD -> master, b1) adding file 2
* df2fb7a adding file 1
```

We created a branch called `b1`. `git log` tells us that `b1` also points to the last commit (`7f3b00e`), but `HEAD` is still pointing to `master`. If you remember, `HEAD` points to the commit/reference you are currently checked out at. So if we check out `b1`, `HEAD` should point to that. Let's confirm:

```bash
$ git checkout b1
Switched to branch 'b1'
$ git log --oneline --graph
* 7f3b00e (HEAD -> b1, master) adding file 2
* df2fb7a adding file 1
```

`b1` still points to the same commit, but `HEAD` now points to `b1`. Since we created the branch at commit `7f3b00e`, there will be two lines of history starting from this commit. Whichever branch you are checked out on is the line of history that will progress.
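The relationship between a new branch, `master` and `HEAD` can also be checked with plumbing commands instead of reading the log decorations. The following is a minimal sketch, assuming a throwaway repo created with `mktemp -d` and a made-up identity (`SRE Student`); the commit IDs it prints will differ from the ones shown in this course.

```bash
# Throwaway repo; the path and identity below are illustrative assumptions
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # keep the branch name used in this course
git config user.name "SRE Student"
git config user.email "student@example.com"

echo "I am file 1" > file1.txt
git add file1.txt
git commit -q -m "adding file 1"

# Create the branch and switch to it
git branch b1
git checkout -q b1

# Resolve each reference to what it points at
master_id=$(git rev-parse master)
b1_id=$(git rev-parse b1)
head_ref=$(git symbolic-ref HEAD)

echo "master -> $master_id"
echo "b1     -> $b1_id"
echo "HEAD   -> $head_ref"
```

Both branch names resolve to the same commit ID, and only `HEAD` changed when we switched branches.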

At this moment, we are checked out on branch `b1`, so making a new commit will advance the branch reference `b1` to that commit, and the commit `b1` currently points to will become its parent. Let's do that.

```bash
# Creating a file and making a commit
$ echo "I am a file in b1 branch" > b1.txt
$ git add b1.txt
$ git commit -m "adding b1 file"
[b1 872a38f] adding b1 file
1 file changed, 1 insertion(+)
create mode 100644 b1.txt

# The new line of history
$ git log --oneline --graph
* 872a38f (HEAD -> b1) adding b1 file
* 7f3b00e (master) adding file 2
* df2fb7a adding file 1
$
```

Do note that `master` is still pointing to the commit it was pointing to before. We can now check out the `master` branch and make commits there. This will result in another line of history starting from commit `7f3b00e`.

```bash
# checkout to master branch
$ git checkout master
Switched to branch 'master'

# Creating a new commit on master branch
$ echo "new file in master branch" > master.txt
$ git add master.txt
$ git commit -m "adding master.txt file"
[master 60dc441] adding master.txt file
1 file changed, 1 insertion(+)
create mode 100644 master.txt

# The history line
$ git log --oneline --graph
* 60dc441 (HEAD -> master) adding master.txt file
* 7f3b00e adding file 2
* df2fb7a adding file 1
```

Notice how branch `b1` is not visible here since we are on `master`. Let's try to visualize both to get the whole picture:

```bash
$ git log --oneline --graph --all
* 60dc441 (HEAD -> master) adding master.txt file
| * 872a38f (b1) adding b1 file
|/
* 7f3b00e adding file 2
* df2fb7a adding file 1
```

The above tree structure should make things clear. Notice the clear branch/fork at commit `7f3b00e`. This is how we create branches. There are now two separate lines of history on which feature development can be done independently.
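The fork point that is visible in the graph can also be asked of git directly with `git merge-base`. This is a small sketch under the same assumptions as before (a throwaway repo, a made-up identity, and a single base commit instead of two), so the IDs will not match the ones in the course output.

```bash
# Throwaway repo; the path and identity below are illustrative assumptions
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "SRE Student"
git config user.email "student@example.com"

# Common history
echo "I am file 1" > file1.txt
git add file1.txt
git commit -q -m "adding file 1"
fork_point=$(git rev-parse HEAD)

# Branch off, then commit on both lines of history
git branch b1
echo "new file in master branch" > master.txt
git add master.txt
git commit -q -m "adding master.txt file"

git checkout -q b1
echo "I am a file in b1 branch" > b1.txt
git add b1.txt
git commit -q -m "adding b1 file"

# Ask git for the most recent commit shared by both branches
base=$(git merge-base master b1)
echo "fork point: $fork_point"
echo "merge base: $base"
```

The two printed IDs are the same commit: the place where the histories diverged.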

**To reiterate, internally, git is just a tree of commits. Branch names (human readable) are pointers to those commits in the tree. We use various git commands to work with the tree structure and references. Git accordingly modifies contents of our repo.**

## Merges

Now say the feature you were working on in branch `b1` is complete and you need to merge it into the `master` branch, where the final version of the code goes. First, you `checkout` the `master` branch and `pull` the latest code from `upstream` (e.g., GitHub). Then you merge your code from `b1` into `master`. There are two ways this can be done.

Here is the current history:

```bash
$ git log --oneline --graph --all
* 60dc441 (HEAD -> master) adding master.txt file
| * 872a38f (b1) adding b1 file
|/
* 7f3b00e adding file 2
* df2fb7a adding file 1
```

**Option 1: Directly merge the branch.** Merging the branch `b1` into `master` will result in a new merge commit. This will merge changes from two different lines of history and create a new commit of the result.

```bash
$ git merge b1
Merge made by the 'recursive' strategy.
b1.txt | 1 +
1 file changed, 1 insertion(+)
create mode 100644 b1.txt
$ git log --oneline --graph --all
*   8fc28f9 (HEAD -> master) Merge branch 'b1'
|\
| * 872a38f (b1) adding b1 file
* | 60dc441 adding master.txt file
|/
* 7f3b00e adding file 2
* df2fb7a adding file 1
```

You can see that a new merge commit was created (`8fc28f9`). You will be prompted for the commit message. If there are a lot of branches in the repo, this approach will end up producing a lot of merge commits, which looks ugly compared to a single line of development history. So let's look at an alternative approach.

First, let's [reset](https://git-scm.com/docs/git-reset) our last merge and go to the previous state.

```bash
$ git reset --hard 60dc441
HEAD is now at 60dc441 adding master.txt file
$ git log --oneline --graph --all
* 60dc441 (HEAD -> master) adding master.txt file
| * 872a38f (b1) adding b1 file
|/
* 7f3b00e adding file 2
* df2fb7a adding file 1
```

**Option 2: Rebase.** Now, instead of merging two branches which have a common base (commit: `7f3b00e`), let us rebase branch `b1` onto the current master. **What this means is: take the commits of branch `b1` (those after commit `7f3b00e`, up to commit `872a38f`) and rebase them (put them on top of) master (`60dc441`).**

```bash
# Switch to b1
$ git checkout b1
Switched to branch 'b1'

# Rebase (b1 which is current branch) on master
$ git rebase master
First, rewinding head to replay your work on top of it...
Applying: adding b1 file

# The result
$ git log --oneline --graph --all
* 5372c8f (HEAD -> b1) adding b1 file
* 60dc441 (master) adding master.txt file
* 7f3b00e adding file 2
* df2fb7a adding file 1
```

You can see that `b1` had one commit, whose parent was `7f3b00e`. Since we rebased it on master (`60dc441`), that becomes the parent now. Because the parent changed, the commit also got a new ID (`5372c8f` instead of `872a38f`). As a side effect, the history has become a single line. Now if we were to merge `b1` into `master`, it would simply mean changing `master` to point to `5372c8f`, which is where `b1` points. Let's try it:

```bash
# checkout to master since we want to merge code into master
$ git checkout master
Switched to branch 'master'

# the current history, where b1 is based on master
$ git log --oneline --graph --all
* 5372c8f (b1) adding b1 file
* 60dc441 (HEAD -> master) adding master.txt file
* 7f3b00e adding file 2
* df2fb7a adding file 1


# Performing the merge, notice the "fast-forward" message
$ git merge b1
Updating 60dc441..5372c8f
Fast-forward
b1.txt | 1 +
1 file changed, 1 insertion(+)
create mode 100644 b1.txt

# The Result
$ git log --oneline --graph --all
* 5372c8f (HEAD -> master, b1) adding b1 file
* 60dc441 adding master.txt file
* 7f3b00e adding file 2
* df2fb7a adding file 1
```

Now you see both `b1` and `master` pointing to the same commit. Your code has been merged into the master branch and it can be pushed. We also have a clean line of history! :D
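The whole rebase-then-merge flow can be reproduced and checked in a scratch repo. The sketch below assumes a throwaway directory and a made-up identity, and it uses `--ff-only` (a standard `git merge` option) so the merge would refuse to run rather than silently create a merge commit.

```bash
# Throwaway repo; the path and identity below are illustrative assumptions
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "SRE Student"
git config user.email "student@example.com"

echo "I am file 1" > file1.txt
git add file1.txt
git commit -q -m "adding file 1"

# Diverge: one commit on master, one on b1
git branch b1
echo "new file in master branch" > master.txt
git add master.txt
git commit -q -m "adding master.txt file"
master_tip=$(git rev-parse master)

git checkout -q b1
echo "I am a file in b1 branch" > b1.txt
git add b1.txt
git commit -q -m "adding b1 file"
old_b1=$(git rev-parse b1)

# Replay b1's commit on top of master; this creates a new commit object
git rebase -q master
new_b1=$(git rev-parse b1)
new_parent=$(git rev-parse b1~1)

# master is now an ancestor of b1, so the merge only moves the pointer
git checkout -q master
git merge -q --ff-only b1

merges=$(git rev-list --merges --count master)
echo "b1 before rebase: $old_b1"
echo "b1 after rebase:  $new_b1"
echo "merge commits on master: $merges"
```

The rebased commit has a different ID from the original because its parent changed, and `master` ends up on that commit without any merge commit in its history.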


================================================
FILE: courses/level101/git/conclusion.md
================================================
## What next from here?

There are a lot of git commands and features which we have not explored here. But with the base built-up, be sure to explore concepts like

- Cherrypick
- Squash
- Amend
- Stash
- Reset



================================================
FILE: courses/level101/git/git-basics.md
================================================
# Git

## Prerequisites

1. Have Git installed [https://git-scm.com/downloads](https://git-scm.com/downloads)
2. Have taken any high-level git tutorial or the following LinkedIn Learning courses
      - [https://www.linkedin.com/learning/git-essential-training-the-basics/](https://www.linkedin.com/learning/git-essential-training-the-basics/)
      - [https://www.linkedin.com/learning/git-branches-merges-and-remotes/](https://www.linkedin.com/learning/git-branches-merges-and-remotes/)
      - [The Official Git Docs](https://git-scm.com/doc)

## What to expect from this course

As an engineer in the field of computer science, having knowledge of version control tools is almost a requirement. While there are a lot of version control tools today, like SVN, Mercurial, etc., Git is perhaps the most used one, and in this course we will be working with Git. While this course does not start with Git 101 and expects basic knowledge of git as a prerequisite, it will reintroduce the git concepts you already know, with details covering what is happening under the hood as you execute various `git` commands, so that the next time you run a `git` command, you will be able to press `enter` more confidently!

## What is not covered under this course

Advanced usage and specifics of internal implementation details of Git.

## Course Contents

 1. [Git Basics](https://linkedin.github.io/school-of-sre/level101/git/git-basics/#git-basics)
 2. [Working with Branches](https://linkedin.github.io/school-of-sre/level101/git/branches/)
 3. [Git with Github](https://linkedin.github.io/school-of-sre/level101/git/github-hooks/#git-with-github)
 4. [Hooks](https://linkedin.github.io/school-of-sre/level101/git/github-hooks/#hooks)

## Git Basics

Though you might be aware already, let's revisit why we need a version control system. As the project grows and multiple developers start working on it, an efficient method for collaboration is warranted. Git helps the team collaborate easily and also maintains the history of the changes happening with the codebase.

### Creating a Git Repo

Any folder can be converted into a git repository. After executing the following command, we will see a `.git` folder within the folder, which makes our folder a git repository. **The `.git` folder is the enabler for all the magic that git does.**

```bash
# creating an empty folder and changing current dir to it
$ cd /tmp
$ mkdir school-of-sre
$ cd school-of-sre/

# initialize a git repo
$ git init
Initialized empty Git repository in /private/tmp/school-of-sre/.git/
```

As the output says, an empty git repo has been initialized in our folder. Let's take a look at what is there.

```bash
$ ls .git/
HEAD        config      description hooks       info        objects     refs
```

There are a bunch of folders and files in the `.git` folder. As I said, all these enable git to do its magic. We will look into some of these folders and files. But for now, what we have is an empty git repository.

### Tracking a File

Now, as you might already know how to do, let us create a new file in our repo (we will refer to the folder as the _repo_ from now on) and check `git status`:

```bash
$ echo "I am file 1" > file1.txt
$ git status
On branch master

No commits yet

Untracked files:
 (use "git add <file>..." to include in what will be committed)

       file1.txt

nothing added to commit but untracked files present (use "git add" to track)
```

The current git status says `No commits yet` and there is one untracked file. Since we just created the file, git is not tracking it. We explicitly need to ask git to track files and folders (also check out [gitignore](https://git-scm.com/docs/gitignore)). We do that via the `git add` command, as suggested in the above output. Then, we go ahead and create a commit.

```bash
$ git add file1.txt
$ git status
On branch master

No commits yet

Changes to be committed:
 (use "git rm --cached <file>..." to unstage)

       new file:   file1.txt

$ git commit -m "adding file 1"
[master (root-commit) df2fb7a] adding file 1
1 file changed, 1 insertion(+)
create mode 100644 file1.txt
```

Notice how after adding the file, `git status` says `Changes to be committed:`. What it means is whatever is listed there, will be included in the next commit. Then, we go ahead and create a commit, with an attached message via `-m`.

### More About a Commit

A commit is a snapshot of the repo. Whenever a commit is made, a snapshot of the current state of the repo (the folder) is taken and saved. Each commit has a unique ID (`df2fb7a` for the commit we made in the previous step). As we keep adding/changing more and more contents and keep making commits, all those snapshots are stored by git. Again, all this magic happens inside the `.git` folder. This is where all these snapshots or versions are stored _in an efficient manner._

### Adding More Changes

Let us create one more file and commit the change. It would look the same as the previous commit we made.

```bash
$ echo "I am file 2" > file2.txt
$ git add file2.txt
$ git commit -m "adding file 2"
[master 7f3b00e] adding file 2
1 file changed, 1 insertion(+)
create mode 100644 file2.txt
```

A new commit with ID `7f3b00e` has been created. You can issue `git status` at any time to see the state of the repository.

**IMPORTANT: Note that commit IDs are long strings (SHA hashes), but we can also refer to a commit by its first few characters, as long as that prefix is unique in the repo (the outputs here show 7). We will be using shorter and longer commit IDs interchangeably.**

Now that we have two commits, let's visualize them:

```bash
$ git log --oneline --graph
* 7f3b00e (HEAD -> master) adding file 2
* df2fb7a adding file 1
```

`git log`, as the name suggests, prints the log of all the git commits. Here you see two additional arguments: `--oneline` prints a shorter version of the log, i.e., only the abbreviated commit ID and the commit message, not the person who made the commit or when; `--graph` prints it in graph format.

**Now at this moment, the commits might look like just one per line, but all commits are stored internally by git in a tree-like data structure. That means there can be two or more child commits of a given commit, and not just a single line of commits. We will look more into this part when we get to the Branches section. For now, this is our commit history:**

```bash
   df2fb7a ===> 7f3b00e
```

### Are commits really linked?

As I just said, the two commits we made are linked via a tree-like data structure, and we saw how they are linked. But let's actually verify it. Everything in git is an object. Newly created files are stored as objects. Changes to files are stored as objects, and even commits are objects. To view the contents of an object, we can use the following command with the object's ID. We will take a look at the contents of the second commit:

```bash
$ git cat-file -p 7f3b00e
tree ebf3af44d253e5328340026e45a9fa9ae3ea1982
parent df2fb7a61f5d40c1191e0fdeb0fc5d6e7969685a
author Sanket Patel <spatel1@linkedin.com> 1603273316 -0700
committer Sanket Patel <spatel1@linkedin.com> 1603273316 -0700

adding file 2
```

Take note of the `parent` attribute in the above output. It points to the commit ID of the first commit we made. So this proves that they are linked! Additionally, you can see the second commit's message in this object. As I said, all this magic is enabled by the `.git` folder, and the object we are looking at is also in that folder.

```bash
$ ls .git/objects/7f/3b00eaa957815884198e2fdfec29361108d6a9
.git/objects/7f/3b00eaa957815884198e2fdfec29361108d6a9
```

It is stored in the `.git/objects/` folder, in a subfolder named after the first two characters of its ID. All the files and the changes to them are stored in this folder as well.
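The path of a freshly created object can be derived from its full ID: the first two characters name the subfolder and the rest name the file. Here is a minimal sketch, assuming a throwaway repo with a made-up identity; it applies to newly written ("loose") objects like the ones in this course.

```bash
# Throwaway repo; the path and identity below are illustrative assumptions
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "SRE Student"
git config user.email "student@example.com"

echo "I am file 1" > file1.txt
git add file1.txt
git commit -q -m "adding file 1"

# Full ID of the commit we just made
sha=$(git rev-parse HEAD)

# Split it the same way git lays it out on disk
dir=$(echo "$sha" | cut -c1-2)
file=$(echo "$sha" | cut -c3-)

# Ask git what kind of object this ID names
obj_type=$(git cat-file -t "$sha")

echo "object type: $obj_type"
ls ".git/objects/$dir/$file"
```

`git cat-file -t` reports the object's type, which complements the `-p` option used above to print its contents.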

### The Version Control part of Git

We can already see two commits (versions) in our git log. One thing a version control tool gives you is the ability to browse back and forth in history. For example, some of your users are running an old version of the code and they are reporting an issue. In order to debug the issue, you need access to the old code, but the one in your current repo is the latest code. In this example, you are working on the second commit (`7f3b00e`) and someone reported an issue with the code snapshot at commit `df2fb7a`. This is how you would get access to the code at any older commit.

```bash
# Current contents, two files present
$ ls
file1.txt file2.txt

# checking out to (an older) commit
$ git checkout df2fb7a
Note: checking out 'df2fb7a'.

You are in 'detached HEAD' state. You can look around, make experimental
changes and commit them, and you can discard any commits you make in this
state without impacting any branches by performing another checkout.

If you want to create a new branch to retain commits you create, you may
do so (now or later) by using -b with the checkout command again. Example:

 git checkout -b <new-branch-name>

HEAD is now at df2fb7a adding file 1

# checking contents, can verify it has old contents
$ ls
file1.txt
```

So this is how we would get access to old versions/snapshots. All we need is a _reference_ to that snapshot. Upon executing `git checkout ...`, what git does for you is use the `.git` folder, see what was the state of things (files and folders) at that version/reference and replace the contents of current directory with those contents. The then-existing content will no longer be present in the local dir (repo) but we can and will still get access to them because they are tracked via `git commit` and `.git` folder has them stored/tracked.

### Reference

I mentioned in the previous section that we need a _reference_ to the version. A git repo is made of a tree of commits, and each commit has a unique ID. But the unique ID is not the only way to reference a commit. There are multiple ways to reference commits. For example, `HEAD` is a reference to the current commit. _Whatever commit your repo is checked out at, `HEAD` will point to that._ `HEAD~1` is a reference to the previous commit. So while checking out the previous version in the section above, we could have done `git checkout HEAD~1`.

Similarly, `master` is also a reference (to a branch). Since git uses a tree-like structure to store commits, there will of course be branches, and the default branch is called `master`. Master (or any branch reference) will point to the latest commit in the branch. Even though we have checked out the previous commit in our repo, `master` still points to the latest commit. And we can get back to the latest version by doing a `checkout` of the `master` reference:

```bash
$ git checkout master
Previous HEAD position was df2fb7a adding file 1
Switched to branch 'master'

# now we will see latest code, with two files
$ ls
file1.txt file2.txt
```

Note: instead of `master` in the above command, we could have used the commit's ID as well, although that would leave us in the 'detached HEAD' state again rather than on the branch.
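Every kind of reference described here ultimately resolves to a full commit ID, and `git rev-parse` shows that resolution. The following is a sketch in a throwaway repo (made-up identity, so the IDs differ from the course output).

```bash
# Throwaway repo; the path and identity below are illustrative assumptions
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "SRE Student"
git config user.email "student@example.com"

echo "I am file 1" > file1.txt
git add file1.txt
git commit -q -m "adding file 1"
first=$(git rev-parse HEAD)

echo "I am file 2" > file2.txt
git add file2.txt
git commit -q -m "adding file 2"

# Different names, resolved to full commit IDs
head_id=$(git rev-parse HEAD)
master_id=$(git rev-parse master)
prev_id=$(git rev-parse HEAD~1)

# An abbreviated ID resolves back to the same full ID
short_id=$(git rev-parse --short HEAD)
from_short=$(git rev-parse "$short_id")

echo "HEAD    = $head_id"
echo "master  = $master_id"
echo "HEAD~1  = $prev_id"
echo "$short_id = $from_short"
```

`HEAD` and `master` name the same commit here, `HEAD~1` names the first commit, and the short ID is just another spelling of the full one.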

### References and The Magic

Let's look at the state of things. There are two commits, and the `master` and `HEAD` references are pointing to the latest commit:

```bash
$ git log --oneline --graph
* 7f3b00e (HEAD -> master) adding file 2
* df2fb7a adding file 1
```

The magic? Let's examine these files:

```bash
$ cat .git/refs/heads/master
7f3b00eaa957815884198e2fdfec29361108d6a9
```

Voilà! Where `master` is pointing is stored in a file. **Whenever git needs to know where the master reference is pointing, it just reads the file above, and whenever git needs to update where master points, it just updates that file.** So when you create a new commit, it is created on top of the current commit, and the master file is updated with the new commit's ID.

Similarly, for the `HEAD` reference:

```bash
$ cat .git/HEAD
ref: refs/heads/master
```

We can see `HEAD` is pointing to a reference called `refs/heads/master`. So `HEAD` will point wherever `master` points.
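The same information is available through `git symbolic-ref`, which also makes the 'detached HEAD' state from the earlier section easy to recognise: when a commit is checked out directly, `HEAD` no longer names a branch and the command fails. A minimal sketch in a throwaway repo with a made-up identity:

```bash
# Throwaway repo; the path and identity below are illustrative assumptions
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "SRE Student"
git config user.email "student@example.com"

echo "I am file 1" > file1.txt
git add file1.txt
git commit -q -m "adding file 1"
echo "I am file 2" > file2.txt
git add file2.txt
git commit -q -m "adding file 2"

# On a branch: HEAD is a symbolic reference to that branch
on_branch=$(git symbolic-ref HEAD)
echo "HEAD -> $on_branch"

# Check out the previous commit directly (detached HEAD)
git checkout -q HEAD~1

# -q makes symbolic-ref exit non-zero quietly when HEAD is not a branch
if git symbolic-ref -q HEAD > /dev/null; then
  state="attached"
else
  state="detached"
fi
echo "state after checking out a commit: $state"

# Go back to the branch
git checkout -q master
```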

### Little Adventure

We discussed how git will update the files as we execute commands. But let's try to do it ourselves, by hand, and see what happens.

```bash
$ git log --oneline --graph
* 7f3b00e (HEAD -> master) adding file 2
* df2fb7a adding file 1
```

Now, let's change `master` to point to the previous/first commit.

```bash
$ echo df2fb7a61f5d40c1191e0fdeb0fc5d6e7969685a > .git/refs/heads/master
$ git log --oneline --graph
* df2fb7a (HEAD -> master) adding file 1

# RESETTING TO ORIGINAL
$ echo 7f3b00eaa957815884198e2fdfec29361108d6a9 > .git/refs/heads/master
$ git log --oneline --graph
* 7f3b00e (HEAD -> master) adding file 2
* df2fb7a adding file 1
```

We just edited the `master` reference file, and now we can see only the first commit in the git log. Undoing the change to the file brings the state back to the original. Not so much magic, is it?
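Editing files under `.git` by hand is fine for learning, but git also ships a command for this, `git update-ref`, which checks that the ID names an existing object before moving the reference. The sketch below repeats the adventure with it, in a throwaway repo with a made-up identity.

```bash
# Throwaway repo; the path and identity below are illustrative assumptions
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.name "SRE Student"
git config user.email "student@example.com"

echo "I am file 1" > file1.txt
git add file1.txt
git commit -q -m "adding file 1"
first=$(git rev-parse HEAD)

echo "I am file 2" > file2.txt
git add file2.txt
git commit -q -m "adding file 2"
second=$(git rev-parse HEAD)

# Point master at the first commit
git update-ref refs/heads/master "$first"
after_move=$(git rev-parse master)
git log --oneline --graph

# Reset to the original
git update-ref refs/heads/master "$second"
after_restore=$(git rev-parse master)
git log --oneline --graph
```

As with the manual edit, only the reference moves; the files in the working directory are left untouched.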


================================================
FILE: courses/level101/git/github-hooks.md
================================================
# Git with GitHub

Till now, all the operations we did were in our local repo, but git also helps us work in a collaborative environment. GitHub is one place on the Internet where you can centrally host your git repos and collaborate with other developers.

Most of the workflow will remain the same as we discussed, with the addition of a couple of things:

 1. Pull: to pull latest changes from GitHub (the central) repo
 2. Push: to push your changes to GitHub repo so that it's available to all people

GitHub has written nice guides and tutorials about this and you can refer to them here:

- [GitHub Hello World](https://guides.github.com/activities/hello-world/)
- [Git Handbook](https://guides.github.com/introduction/git-handbook/)

## Hooks

Git has another nice feature called hooks. Hooks are basically scripts which will be called when a certain event happens. Here is where hooks are located:

```bash
$ ls .git/hooks/
applypatch-msg.sample     fsmonitor-watchman.sample pre-applypatch.sample     pre-push.sample           pre-receive.sample        update.sample
commit-msg.sample         post-update.sample        pre-commit.sample         pre-rebase.sample         prepare-commit-msg.sample
```

The names are self-explanatory. These hooks are useful when you want to do certain things when a certain event happens. If you want to run tests before pushing code, you would want to set up a `pre-push` hook. Let's try to create a pre-commit hook.

```bash
$ echo "echo this is from pre commit hook" > .git/hooks/pre-commit
$ chmod +x .git/hooks/pre-commit
```

We basically create a file called `pre-commit` in hooks folder and make it executable. Now if we make a commit, we should see the message getting printed.

```bash
$ echo "sample file" > sample.txt
$ git add sample.txt
$ git commit -m "adding sample file"
this is from pre commit hook     # <===== THE MESSAGE FROM HOOK EXECUTION
[master 9894e05] adding sample file
1 file changed, 1 insertion(+)
create mode 100644 sample.txt
```
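A hook can do more than print a message: if a `pre-commit` hook exits with a non-zero status, git aborts the commit. The sketch below illustrates that with a made-up policy (refusing files that end in `.draft`); the policy, file names and identity are assumptions for illustration, and the hook starts with a `#!/bin/sh` line so it does not depend on how the system runs scripts without one.

```bash
# Throwaway repo; the path and identity below are illustrative assumptions
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "SRE Student"
git config user.email "student@example.com"

# Write the hook (hypothetical policy: no *.draft files in a commit)
mkdir -p .git/hooks
cat > .git/hooks/pre-commit <<'EOF'
#!/bin/sh
# List the staged file names and look for the forbidden suffix
if git diff --cached --name-only | grep -q '\.draft$'; then
  echo "pre-commit: refusing to commit .draft files" >&2
  exit 1
fi
EOF
chmod +x .git/hooks/pre-commit

# First attempt: a forbidden file is staged, so the hook blocks the commit
echo "work in progress" > notes.draft
git add notes.draft
if git commit -q -m "adding draft"; then first="committed"; else first="blocked"; fi

# Second attempt: unstage the draft, stage an allowed file instead
git rm -q --cached notes.draft
echo "sample file" > sample.txt
git add sample.txt
if git commit -q -m "adding sample file"; then second="committed"; else second="blocked"; fi

echo "first attempt:  $first"
echo "second attempt: $second"
```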


================================================
FILE: courses/level101/linux_basics/command_line_basics.md
================================================
# Command Line Basics

## Lab Environment Setup

One can use an online Bash interpreter to run all the commands that are provided as examples in this course. This will also help you in getting a hands-on experience of various Linux commands.

[REPL](https://repl.it/languages/bash) is one of the popular online Bash interpreters for running Linux commands. We will be using it for running all the commands mentioned in this course.

## What is a Command

A command is a program that tells the operating system to perform
specific work. Programs are stored as files in Linux. Therefore, a
command is also a file which is stored somewhere on the disk.

Commands may also take additional arguments as input from the user.
These arguments are called command line arguments. Knowing how to use
the commands is important and there are many ways to get help in Linux,
especially for commands. Almost every command will have some form of
documentation; most commands accept a command-line argument `-h` or
`--help` that displays a reasonable amount of documentation. But the
most popular documentation system in Linux is called `man` pages, short
for manual pages.

Using `--help` to show the documentation for `ls` command.

![](images/linux/commands/image19.png)
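Since a command is just a file on disk, the shell can be asked which file it would run. A small sketch using the shell builtin `command -v`; the exact path printed depends on the system.

```shell
# Ask the shell which file backs the ls command
ls_path=$(command -v ls)
echo "ls is the file: $ls_path"

# It is an ordinary file with the executable bit set
ls -l "$ls_path"
```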

## File System Organization

The Linux file system has a hierarchical (or tree-like) structure with
its highest-level directory called `root` (denoted by `/`). Directories
present inside the root directory store files related to the system.
These directories in turn can store system files, application
files or user-related files.

![](images/linux/commands/image17.png)

| Directory  | Description                                                                    |
|------------|--------------------------------------------------------------------------------| 
| bin        | Executable programs for the most commonly used commands reside in `bin`        |
| dev        | This directory contains files related to devices on the system                 |
| etc        | This directory contains all the system configuration files                     |
| home       | This directory contains user-related files and directories                     |       
| lib        | This directory contains all the library files                                  |
| mnt        | This directory contains files related to mounted devices on the system         |
| proc       | This directory contains files related to the running processes on the system   |
| root       | This directory contains root user-related files and directories                | 
| sbin       | This directory contains programs used for system administration                |
| tmp        | This directory is used to store temporary files on the system                  |
| usr        | This directory is used to store application programs on the system             |

## Commands for Navigating the File System

There are three basic commands which are used frequently to navigate the
file system:

- ls

- pwd

- cd

We will now try to understand what each command does and how to use
these commands. You should also practice the given examples on the
online Bash shell.

### pwd (print working directory)

At any given moment of time, we will be standing in a certain directory.
To get the name of the directory in which we are standing, we can use
the `pwd` command in Linux.

![](images/linux/commands/image2.png)

We will now use the `cd` command to move to a different directory and then
print the working directory.

![](images/linux/commands/image20.png)

### cd (change directory)

The `cd` command can be used to change the working directory. Using the
command, you can move from one directory to another.

In the below example, we are initially in the `root` directory. We have
then used the `cd` command to change the directory.

![](images/linux/commands/image3.png)

### ls (list files and directories)

The `ls` command is used to list the contents of a directory. It will list
down all the files and folders present in the given directory.

If we just type `ls` in the shell, it will list all the files and
directories present in the current directory.

![](images/linux/commands/image7.png)

We can also provide the directory name as argument to `ls` command. It
will then list all the files and directories inside the given directory.

![](images/linux/commands/image4.png)
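The three navigation commands can be tried together in a scratch area. This is a minimal sketch; the directory and file names (`projects`, `notes.txt`) are made up for illustration, and `mktemp -d` is used so nothing outside a temporary directory is touched.

```shell
# Scratch area with one directory and one file (names are illustrative)
work=$(mktemp -d)
mkdir "$work/projects"
touch "$work/projects/notes.txt"

# Move into the scratch area and look around
cd "$work"
pwd
ls

# Move one level down; pwd now ends in "projects"
cd projects
current=$(basename "$(pwd)")
listing=$(ls)
echo "now in: $current, containing: $listing"

# ".." is the parent directory, so this goes back up
cd ..
ls projects
```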

## Commands for Manipulating Files

There are five basic commands which are used frequently to manipulate
files:

- touch

- mkdir

- cp

- mv

- rm

We will now try to understand what each command does and how to use
these commands. You should also practice the given examples on the
online Bash shell.

### touch (create new file)

The `touch` command can be used to create an empty new file.
This command is very useful for many other purposes, but we will discuss
the simplest use case of creating a new file.

General syntax of using `touch` command:

```shell
touch <file_name>
```

![](images/linux/commands/image9.png)
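As a text alternative to the screenshot, the following sketch creates an empty file in a temporary directory and checks its size. The file name `report.txt` is an assumption for illustration.

```shell
# Work in a scratch directory
work=$(mktemp -d)
cd "$work"

# Create an empty file
touch report.txt
ls -l report.txt

# The new file has no content
size=$(wc -c < report.txt | tr -d ' ')
echo "size in bytes: $size"

# Running touch on an existing file only updates its timestamps
touch report.txt
```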

### mkdir (create new directories)

The `mkdir` command is used to create directories. You can use `ls` command
to verify that the new directory is created.

General syntax of using `mkdir` command:

```shell
mkdir <directory_name>
```

![](images/linux/commands/image11.png)
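A short sketch of `mkdir` in a temporary directory follows. The directory names are made up, and the `-p` option (which creates any missing parent directories) goes slightly beyond the basic syntax above.

```shell
# Work in a scratch directory
work=$(mktemp -d)
cd "$work"

# Create a single directory
mkdir logs

# Create a nested path in one go; -p creates missing parents
mkdir -p app/config/env

# Verify with ls
ls
ls app/config
```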

### rm (delete files and directories)

The `rm` command can be used to delete files and directories. It is very
important to note that this command permanently deletes the files and
directories. It's almost impossible to recover these files and
directories once you have executed `rm` command on them successfully. Do
run this command with care.

General syntax of using `rm` command:

```shell
rm <file_name>
```

Let's try to understand the `rm` command with an example. We will try to
delete the file and directory we created using `touch` and `mkdir` command
respectively.

![](images/linux/commands/image18.png)
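One detail worth trying out: plain `rm` removes files but refuses to remove a directory, which needs the `-r` (recursive) option. The sketch below works only inside a temporary directory, with made-up names.

```shell
# Work in a scratch directory
work=$(mktemp -d)
cd "$work"
touch old.txt
mkdir old_dir
touch old_dir/inner.txt

# Deleting a file
rm old.txt

# Plain rm refuses to delete a directory
if rm old_dir 2> /dev/null; then
  plain_rm="removed"
else
  plain_rm="refused"
fi
echo "rm on a directory: $plain_rm"

# -r deletes the directory and everything inside it
rm -r old_dir
ls
```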

### cp (copy files and directories)

The `cp` command is used to copy files and directories from one location
to another. Do note that the `cp` command does not change the original
files or directories. The originals and their copies both exist after the
`cp` command has run successfully.

General syntax of using `cp` command:

```shell
cp <source_path> <destination_path>
```

We are currently in the `/home/runner` directory. We will use the `mkdir`
command to create a new directory named `test_directory`. We will now
try to copy the `_test_runner.py` file to the directory we created just
now.

![](images/linux/commands/image23.png)

Do note that nothing happened to the original `_test_runner.py` file.
It's still there in the current directory. A new copy of it got created
inside the `test_directory`.

![](images/linux/commands/image14.png)

We can also use the `cp` command to copy the whole directory from one
location to another. Let's try to understand this with an example.

![](images/linux/commands/image12.png)

We again used the `mkdir` command to create a new directory called
`another_directory`. We then used the `cp` command along with an
additional argument `-r` to copy the `test_directory`.
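
Both forms of `cp` can be sketched as follows. The file name `report.txt` is a stand-in for `_test_runner.py` (which only exists in the online shell), and the commands run in a temporary directory; treat this as an illustration rather than a reproduction of the screenshots.

```shell
# Work in a throwaway directory with a stand-in file.
cd "$(mktemp -d)"
touch report.txt
mkdir test_directory another_directory

cp report.txt test_directory/             # copy a single file into a directory
cp -r test_directory another_directory/   # -r copies a whole directory
ls another_directory/test_directory       # shows report.txt
ls                                        # the original report.txt is still here
```

Because `another_directory` already exists, the copied directory ends up inside it as `another_directory/test_directory`.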

### mv (move files and directories)

The `mv` command can be used either to move files or directories from one
location to another, or to rename files or directories. Do note that
moving files and copying them are very different. When you move files or
directories, they no longer exist at the original location.

General syntax of using `mv` command:

```shell
mv <source_path> <destination_path>
```

In this example, we will use the `mv` command to move the
`_test_runner.py` file to `test_directory`. In this case, the file
already exists in `test_directory`, so the `mv` command will simply replace it.
**Do note that the original file no longer exists in the current directory
after the `mv` command has run successfully.**

![](images/linux/commands/image26.png)

We can also use the `mv` command to move a directory from one location to
another. In this case, we do not need to use the `-r` flag that we did
while using the `cp` command. Do note that the original directory will not
exist if we use `mv` command.

One of the important uses of the `mv` command is to rename files and
directories. Let's see how we can use this command for renaming.

We have first changed our location to `test_directory`. We then use the
`mv` command to rename the `_test_runner.py` file to `test.py`.

![](images/linux/commands/image29.png)
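
The move and the rename described above can be sketched in a few lines. As before, `report.txt` is a hypothetical stand-in for `_test_runner.py`, and a temporary directory keeps the experiment isolated.

```shell
# Work in a throwaway directory with a stand-in file.
cd "$(mktemp -d)"
touch report.txt
mkdir test_directory

mv report.txt test_directory/   # move: report.txt no longer exists in this directory
cd test_directory
mv report.txt test.txt          # rename: same directory, new name
ls                              # shows test.txt
```

Moving and renaming are the same operation for `mv`: in both cases the source path stops existing and the destination path takes its place.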

## Commands for Viewing Files

There are five basic commands which are used frequently to view the
files:

- cat

- head

- tail

- more

- less

We will now try to understand what each command does and how to use
these commands. You should also practice the given examples on the
online Bash shell.

We will create a new file called `numbers.txt` and insert numbers from 1
to 100 in this file. Each number will be in a separate line.

![](images/linux/commands/image21.png)

Do not worry about the above command for now. It's an advanced command
used to generate numbers, and a redirection operator then writes these
numbers to the file. We will discuss I/O redirection in a later
section.
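
If you cannot make out the command in the screenshot, one common way to build such a file is shown below. This is an assumption about how the file could be created, not necessarily the exact command used above; it relies on the `seq` utility, which is available on most Linux systems.

```shell
# Work in a throwaway directory.
cd "$(mktemp -d)"

seq 1 100 > numbers.txt   # write the numbers 1 to 100, one per line, into numbers.txt
wc -l numbers.txt         # count the lines in the file to confirm there are 100
```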


### cat

The simplest use of the `cat` command is to print the contents of a file on
your output screen. This command is very useful and can be used for many
other purposes. We will study other use cases later.

![](images/linux/commands/image1.png)

You can try to run the above command and you will see numbers being
printed from 1 to 100 on your screen. You will need to scroll up to view
all the numbers.

### head

The `head` command displays the first 10 lines of the file by default. We
can include additional arguments to display as many lines as we want
from the top.

In this example, we are only able to see the first 10 lines from the
file when we use the `head` command.

![](images/linux/commands/image15.png)

To see a different number of lines from the start of the file, use the
`-n` option followed by the number of lines you want.

![](images/linux/commands/image16.png)

### tail

The `tail` command displays the last 10 lines of the file by default. We
can include additional arguments to display as many lines as we want
from the end of the file.

![](images/linux/commands/image22.png)

To see a different number of lines from the end of the file, use the
`-n` option followed by the number of lines you want.

![](images/linux/commands/image10.png)

In this example, we are only able to see the last 5 lines from the file
when we use the `tail` command with explicit `-n` option.
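
The behaviour of `head` and `tail` described above can be checked side by side. This sketch assumes a `numbers.txt` holding the numbers 1 to 100, recreated here with `seq` in a temporary directory; the line counts passed to `-n` are arbitrary examples.

```shell
# Recreate the sample file in a throwaway directory.
cd "$(mktemp -d)"
seq 1 100 > numbers.txt

head numbers.txt        # first 10 lines: 1 to 10
head -n 3 numbers.txt   # first 3 lines: 1, 2, 3
tail numbers.txt        # last 10 lines: 91 to 100
tail -n 5 numbers.txt   # last 5 lines: 96 to 100
```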


### more

The `more` command displays the contents of a file or a command output,
one screen at a time, which is useful when the file is large (e.g., log files).
It also allows forward navigation and limited backward navigation in the file.

![](images/linux/commands/image33.png)

The `more` command displays as much as can fit on the current screen and waits for user input to advance. To move forward, press `Enter` to advance the output by one line, or `Space` to advance it by one screen.

### less

The `less` command is an improved version of `more`. It displays the contents of a file or a command output, one page at a time.
It allows backward as well as forward navigation in the file and also has search options. We can use the arrow keys to move backward or forward by one line. To move forward by one page, press `Space`, and to move backward by one page, press `b`.
You can also jump instantly to the beginning or the end of a file by pressing `g` or `G` respectively.


## Echo Command in Linux

The `echo` command is one of the simplest commands used in the
shell. It is comparable to the `print` function found in many
programming languages.

The `echo` command prints the given input string on the screen.

![](images/linux/commands/image34.png)
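
A minimal text example of the same idea follows; the strings are arbitrary and chosen only for illustration.

```shell
echo "Hello, SRE"        # prints: Hello, SRE
echo one two three       # several arguments are printed separated by single spaces
```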

## Text Processing Commands

In the previous section, we learned how to view the content of a file.
In many cases, we will be interested in performing the below operations:

- Print only the lines which contain a particular word(s)

- Replace a particular word with another word in a file

- Sort the lines in a particular order

There are three basic commands which are used frequently to process
texts:

- grep

- sed

- sort

We will now try to understand what each command does and how to use
these commands. You should also practice the given examples on the
online Bash shell.

We will create a new file called `numbers.txt` and insert numbers from 1
to 10 in this file. Each number will be in a separate line.

![](images/linux/commands/image8.png)

### grep

The `grep` command in its simplest form can be used to search for particular
words in a text file. It displays all the lines in a file that
contain a particular input. The word we want to search for is provided as
an input to the `grep` command.

General syntax of using `grep` command:

```shell
grep <word_to_search> <file_name>
```

In this example, we are trying to search for a string "1" in this file.
The `grep` command outputs the lines where it found this string.

![](images/linux/commands/image36.png)
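
The same search can be reproduced as text. This sketch assumes `numbers.txt` holds the numbers 1 to 10 and recreates it with `seq` in a temporary directory.

```shell
# Recreate the sample file in a throwaway directory.
cd "$(mktemp -d)"
seq 1 10 > numbers.txt

grep 1 numbers.txt   # prints 1 and 10, the only lines that contain the character "1"
```

Note that `grep` matches the text anywhere in the line, which is why `10` is printed as well as `1`.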

### sed

The `sed` command in its simplest form can be used to replace a text in a
file.

General syntax of using the `sed` command for replacement:

```shell
sed 's/<text_to_replace>/<replacement_text>/' <file_name>
```

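Let's try to replace "1" with "3" in the file using the `sed` command.
Without a trailing `g` flag, only the first occurrence on each line is
replaced, which is enough here because no line contains more than one "1".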

![](images/linux/commands/image31.png)

The content of the file does not change in the above
example, because `sed` only prints the result. To write the changes back
to the file, we have to use the additional `-i` option.
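
The difference between printing the result and editing the file in place can be sketched as below. The example assumes `numbers.txt` holds the numbers 1 to 10. It uses `-i.bak`, a form of `-i` that also keeps a backup copy named `numbers.txt.bak`; this spelling is chosen here because both GNU and BSD `sed` accept it.

```shell
# Recreate the sample file in a throwaway directory.
cd "$(mktemp -d)"
seq 1 10 > numbers.txt

sed 's/1/3/' numbers.txt          # prints the modified lines; the file is unchanged
head -n 1 numbers.txt             # still prints 1

sed -i.bak 's/1/3/' numbers.txt   # edits the file in place, keeping numbers.txt.bak
head -n 1 numbers.txt             # now prints 3
tail -n 1 numbers.txt             # prints 30 (was 10)
```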

### sort

The `sort` command can be used to sort the lines of a file passed to it as an
argument. By default, it will sort in increasing order.

Let's first see the content of the file before trying to sort it.

![](images/linux/commands/image27.png)

Now, we will try to sort the file using the `sort` command. The `sort`
command sorts the content in lexicographical order.

![](images/linux/commands/image32.png)

The content of the file will not change in the above
example.
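
Lexicographical order can be surprising for numbers, so here is a small sketch. The file `unsorted.txt` and its contents are made up for illustration, and the `-n` option (numeric sort) is shown only for comparison; it is not covered in the text above.

```shell
# Create a small sample file in a throwaway directory.
cd "$(mktemp -d)"
printf '10\n9\n2\n1\n' > unsorted.txt

sort unsorted.txt      # lexicographical order: 1, 10, 2, 9
sort -n unsorted.txt   # numeric order:         1, 2, 9, 10
cat unsorted.txt       # the file itself is untouched: 10, 9, 2, 1
```

In lexicographical order `10` comes before `2` because the comparison is done character by character, and "1" sorts before "2".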

## I/O Redirection

Each open file is assigned a file descriptor. A file descriptor is a
unique identifier for an open file within a process. There are always three
default files open: `stdin` (the keyboard), `stdout` (the screen), and
`stderr` (error messages output to the screen). These files can be
redirected.

Everything is a file in Linux -
[https://unix.stackexchange.com/questions/225537/everything-is-a-file](https://unix.stackexchange.com/questions/225537/everything-is-a-file)

Until now, we have displayed all the output on the screen, which is the
standard output. We can use some special operators to redirect the
output of a command to files or even to the input of other commands.
I/O redirection is a very powerful feature.

In the below example, we have used the `>` operator to redirect the
output of `ls` command to `output.txt` file.

![](images/linux/commands/image30.png)

In the below example, we have redirected the output from `echo` command to
a file.

![](images/linux/commands/image13.png)

We can also redirect the output of a command as an input to another
command. This is possible with the help of pipes.

In the below example, we have passed the output of `cat` command as an
input to `grep` command using pipe (`|`) operator.

![](images/linux/commands/image6.png)

In the below example, we have passed the output of `sort` command as an
input to `uniq` command using pipe (`|`) operator. The `uniq` command
collapses adjacent duplicate lines, which is why the input is sorted
first; the result is that each number is printed only once.

![](images/linux/commands/image28.png)
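
The redirection and pipe examples above can be summarized in one sketch. The file names `output.txt` and `dupes.txt` and their contents are invented for illustration, and the `>>` (append) operator is an addition that the screenshots above may not show.

```shell
# Work in a throwaway directory.
cd "$(mktemp -d)"

echo "first line" > output.txt     # > creates the file, or overwrites it if it exists
echo "second line" >> output.txt   # >> appends to the file instead of overwriting it
cat output.txt                     # prints both lines

printf '3\n1\n3\n2\n1\n' > dupes.txt
cat dupes.txt | grep 3             # pipe: grep reads the output of cat, prints 3 twice
sort dupes.txt | uniq              # prints 1, 2, 3 (duplicates collapsed after sorting)
```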

I/O redirection -
[https://tldp.org/LDP/abs/html/io-redirection.html](https://tldp.org/LDP/abs/html/io-redirection.html)


================================================
FILE: courses/level101/linux_basics/conclusion.md
================================================
# Conclusion

We have covered the basics of Linux operating systems and basic commands used in Linux.
We have also covered the Linux server administration commands.

We hope that this course will make it easier for you to operate on the command line.

## Applications in SRE Role

1. As an SRE, you will be required to perform some general tasks on these Linux servers. You will also be using the command line when you are troubleshooting issues.
2. Moving from one location to another in the filesystem will require the help of `ls`, `pwd` and `cd` commands.
3. You may need to search for specific information in the log files. The `grep` command would be very useful here. I/O redirection will come in handy if you want to store the output in a file or pass it as an input to another command.
4. The `tail` command is very useful for viewing the latest data in a log file.
5. Different users will have different permissions depending on their roles. For security reasons, we also do not want everyone in the company to have access to our servers. User permissions can be restricted with the `chown`, `chmod` and `chgrp` commands.
6. `ssh` is one of the most frequently used commands for an SRE. Troubleshooting and performing basic administration tasks are only possible if we are able to log in to the server.
7. What if we want to run an Apache server or NGINX on a server? We will first install it using the package manager. Package management commands become important here.
8. Managing services on servers is another critical responsibility of an SRE. `systemd`-related commands can help in troubleshooting issues. If a service goes down, we can start it using the `systemctl start` command. We can also stop a service in case it is not needed.
9. Monitoring is another core responsibility of an SRE. Memory and CPU are two important system-level metrics which should be monitored. Commands like `top` and `free` are quite helpful here.
10. If a service throws an error, how do we find out the root cause of the error? We will certainly need to check lo

Condensed preview — 115 files, each showing path, character count, and a content snippet. Download the .json file or copy for the full structured content (748K chars).

[
  {
    "path": ".github/workflows/build.yml",
    "chars": 528,
    "preview": "name: Build mkdocs\n\n\non:\n  pull_request:\n\n  # Allows you to run this workflow manually from the Actions tab\n  workflow_d"
  },
  {
    "path": ".github/workflows/gh-deploy.yml",
    "chars": 1235,
    "preview": "name: Deploy to gh-pages\n\n# Controls when the action will run. \non:\n  # Triggers the workflow on push or pull request ev"
  },
  {
    "path": ".gitignore",
    "chars": 31,
    "preview": ".DS_Store\n.venv\nsite/\n.vscode/\n"
  },
  {
    "path": "LICENSE",
    "chars": 12700,
    "preview": "Creative Commons Attribution 4.0 International Public License\n\nBy exercising the Licensed Rights (defined below), You ac"
  },
  {
    "path": "NOTICE",
    "chars": 243,
    "preview": "Copyright 2020 LinkedIn Corporation\nAll Rights Reserved.\n\nLicensed under the Creative Commons Attribution 4.0 Internatio"
  },
  {
    "path": "courses/CODE_OF_CONDUCT.md",
    "chars": 5056,
    "preview": "This code of conduct outlines expectations for participation in LinkedIn-managed open source communities, as well as ste"
  },
  {
    "path": "courses/CONTRIBUTING.md",
    "chars": 1667,
    "preview": "We realise that the initial content we created is just a starting point and our hope is that the community can help in t"
  },
  {
    "path": "courses/index.md",
    "chars": 6170,
    "preview": "# School of SRE\n\n<img src=\"img/sos.png\" width=200 >\n\nSite Reliability Engineers (SREs) sits at the intersection of softw"
  },
  {
    "path": "courses/level101/big_data/evolution.md",
    "chars": 7245,
    "preview": "# Evolution of Hadoop\n\n![Evolution of hadoop](images/hadoop_evolution.png)\n\n# Architecture of Hadoop\n\n1. **HDFS**\n    1."
  },
  {
    "path": "courses/level101/big_data/intro.md",
    "chars": 2966,
    "preview": "# Big Data\n\n## Prerequisites\n\n- Basics of Linux File systems.\n- Basic understanding of System Design.\n\n## What to expect"
  },
  {
    "path": "courses/level101/big_data/tasks.md",
    "chars": 844,
    "preview": "# Tasks and conclusion\n\n## Post-training tasks:\n\n1. Try setting up your own three-node Hadoop cluster. \n    1. A VM-base"
  },
  {
    "path": "courses/level101/databases_nosql/further_reading.md",
    "chars": 1196,
    "preview": "# Conclusion\n\nWe have covered basic concepts of NoSQL databases. There is much more to learn and do. We hope this course"
  },
  {
    "path": "courses/level101/databases_nosql/intro.md",
    "chars": 9706,
    "preview": "# NoSQL Concepts\n\n## Prerequisites\n- [Relational Databases](https://linkedin.github.io/school-of-sre/level101/databases_"
  },
  {
    "path": "courses/level101/databases_nosql/key_concepts.md",
    "chars": 16680,
    "preview": "# Key Concepts\n\nLets looks at some of the key concepts when we talk about NoSQL or distributed systems.\n\n### CAP Theorem"
  },
  {
    "path": "courses/level101/databases_sql/backup_recovery.md",
    "chars": 12844,
    "preview": "### Backup and Recovery\nBackups are a very crucial part of any database setup. They are generally a copy of the data tha"
  },
  {
    "path": "courses/level101/databases_sql/concepts.md",
    "chars": 5940,
    "preview": "*   Relational DBs are used for data storage. Even a file can be used to store data, but relational DBs are designed wit"
  },
  {
    "path": "courses/level101/databases_sql/conclusion.md",
    "chars": 1165,
    "preview": "# Conclusion\nWe have covered basic concepts of SQL databases. We have also covered some of the tasks that an SRE may be "
  },
  {
    "path": "courses/level101/databases_sql/innodb.md",
    "chars": 1711,
    "preview": "### Why should you use this?\n\nGeneral purpose, row level locking, ACID support, transactions, crash recovery and multi-v"
  },
  {
    "path": "courses/level101/databases_sql/intro.md",
    "chars": 1792,
    "preview": "# Relational Databases\n\n### Prerequisites\n*   Complete [Linux course](https://linkedin.github.io/school-of-sre/level101/"
  },
  {
    "path": "courses/level101/databases_sql/lab.md",
    "chars": 6999,
    "preview": "**Prerequisites**\n\nInstall Docker\n\n**Setup**\n\nCreate a working directory named `sos` or something similar, and `cd` into"
  },
  {
    "path": "courses/level101/databases_sql/mysql.md",
    "chars": 2781,
    "preview": "### MySQL architecture\n\n![alt_text](images/mysql_architecture.png \"MySQL architecture diagram\")\n\nMySQL architecture enab"
  },
  {
    "path": "courses/level101/databases_sql/operations.md",
    "chars": 3839,
    "preview": "*   Explain and explain+analyze\n\n\t`EXPLAIN <query>` analyzes query plans from the optimizer, including how tables are jo"
  },
  {
    "path": "courses/level101/databases_sql/query_performance.md",
    "chars": 16314,
    "preview": "### Query Performance Improvement\nQuery Performance is a very crucial aspect of relational databases. If not tuned corre"
  },
  {
    "path": "courses/level101/databases_sql/replication.md",
    "chars": 18478,
    "preview": "### MySQL Replication\nReplication enables data from one MySQL host (termed as Primary) to be copied to another MySQL hos"
  },
  {
    "path": "courses/level101/databases_sql/select_query.md",
    "chars": 17069,
    "preview": "### SELECT Query\nThe most commonly used command while working with MySQL is `SELECT`. It is used to fetch the resultset "
  },
  {
    "path": "courses/level101/git/branches.md",
    "chars": 7561,
    "preview": "# Working With Branches\n\nComing back to our local repo which has two commits. So far, what we have is a single line of h"
  },
  {
    "path": "courses/level101/git/conclusion.md",
    "chars": 212,
    "preview": "## What next from here?\n\nThere are a lot of git commands and features which we have not explored here. But with the base"
  },
  {
    "path": "courses/level101/git/git-basics.md",
    "chars": 12584,
    "preview": "# Git\n\n## Prerequisites\n\n1. Have Git installed [https://git-scm.com/downloads](https://git-scm.com/downloads)\n2. Have ta"
  },
  {
    "path": "courses/level101/git/github-hooks.md",
    "chars": 1989,
    "preview": "# Git with GitHub\n\nTill now all the operations we did were in our local repo while git also helps us in a collaborative "
  },
  {
    "path": "courses/level101/linux_basics/command_line_basics.md",
    "chars": 15972,
    "preview": "# Command Line Basics\n\n## Lab Environment Setup\n\nOne can use an online Bash interpreter to run all the commands that are"
  },
  {
    "path": "courses/level101/linux_basics/conclusion.md",
    "chars": 2601,
    "preview": "# Conclusion\n\nWe have covered the basics of Linux operating systems and basic commands used in Linux.\nWe have also cover"
  },
  {
    "path": "courses/level101/linux_basics/intro.md",
    "chars": 9603,
    "preview": "# Linux Basics\n\n## Introduction\n### Prerequisites\n\n- Should be comfortable in using any operating systems like Windows, "
  },
  {
    "path": "courses/level101/linux_basics/linux_server_administration.md",
    "chars": 21671,
    "preview": "# Linux Server Administration\n\nIn this course, will try to cover some of the common tasks that a Linux\nserver administra"
  },
  {
    "path": "courses/level101/linux_networking/conclusion.md",
    "chars": 1080,
    "preview": "# Conclusion\n\nWith this, we have traversed through the TCP/IP stack completely. We hope there will be a different perspe"
  },
  {
    "path": "courses/level101/linux_networking/dns.md",
    "chars": 9333,
    "preview": "# DNS\n\nDomain Names are the simple human-readable names for websites. The Internet understands only IP addresses, but si"
  },
  {
    "path": "courses/level101/linux_networking/http.md",
    "chars": 12117,
    "preview": "# HTTP\n\nTill this point we have only got the IP address of [linkedin.com](https://www.linkedin.com/). The HTML page of ["
  },
  {
    "path": "courses/level101/linux_networking/intro.md",
    "chars": 2178,
    "preview": "# Linux Networking Fundamentals\n\n## Prerequisites\n\n- High-level knowledge of commonly used jargon in TCP/IP stack like D"
  },
  {
    "path": "courses/level101/linux_networking/ipr.md",
    "chars": 3462,
    "preview": "# IP Routing and Data Link Layer\nWe will dig how packets that leave the client reach the server and vice versa. When the"
  },
  {
    "path": "courses/level101/linux_networking/tcp.md",
    "chars": 4882,
    "preview": "# TCP\n\nTCP is a transport layer protocol like UDP but it guarantees reliability, flow control and congestion control.\nTC"
  },
  {
    "path": "courses/level101/linux_networking/udp.md",
    "chars": 3407,
    "preview": "# UDP\n\n\nUDP is a transport layer protocol. DNS is an application layer protocol that runs on top of UDP (most of the tim"
  },
  {
    "path": "courses/level101/messagequeue/further_reading.md",
    "chars": 1314,
    "preview": "# Conclusion\n\nWe have covered basic concepts of Message Services. There is much more to learn and do. We hope this cours"
  },
  {
    "path": "courses/level101/messagequeue/intro.md",
    "chars": 8296,
    "preview": "# Messaging services\n\n\n## What to expect from this course\n\nAt the end of training, you will have an understanding of wha"
  },
  {
    "path": "courses/level101/messagequeue/key_concepts.md",
    "chars": 6390,
    "preview": "# Key Concepts\n\nLet's looks at some of the key concepts when we talk about messaging system\n\n\n### Delivery guarantees\t\t\t"
  },
  {
    "path": "courses/level101/metrics_and_monitoring/alerts.md",
    "chars": 1646,
    "preview": "##\n\n# Proactive monitoring using alerts\nEarlier we discussed different ways to collect key metric data points\nfrom a ser"
  },
  {
    "path": "courses/level101/metrics_and_monitoring/best_practices.md",
    "chars": 1656,
    "preview": "##\n\n# Best practices for monitoring\n\nWhen setting up monitoring for a service, keep the following best\npractices in mind"
  },
  {
    "path": "courses/level101/metrics_and_monitoring/command-line_tools.md",
    "chars": 4524,
    "preview": "##\n\n# Command-line tools\nMost of the Linux distributions today come with a set of tools that\nmonitor the system's perfor"
  },
  {
    "path": "courses/level101/metrics_and_monitoring/conclusion.md",
    "chars": 2237,
    "preview": "# Conclusion\n\nA robust monitoring and alerting system is necessary for maintaining and\ntroubleshooting a system. A dashb"
  },
  {
    "path": "courses/level101/metrics_and_monitoring/introduction.md",
    "chars": 13109,
    "preview": "##\n\n# Prerequisites\n\n-   [Linux  Basics](https://linkedin.github.io/school-of-sre/level101/linux_basics/intro/)\n\n-   [Py"
  },
  {
    "path": "courses/level101/metrics_and_monitoring/observability.md",
    "chars": 7981,
    "preview": "##\n\n# Observability\n\nEngineers often use observability when referring to building reliable\nsystems. *Observability* is a"
  },
  {
    "path": "courses/level101/metrics_and_monitoring/third-party_monitoring.md",
    "chars": 1886,
    "preview": "##\n\n# Third-party monitoring\n\nToday most cloud providers offer a variety of monitoring solutions. In\naddition, a number "
  },
  {
    "path": "courses/level101/python_web/intro.md",
    "chars": 9454,
    "preview": "# Python and The Web\n\n## Prerequisites\n\n- Basic understanding of Python language.\n- Basic familiarity with Flask framewo"
  },
  {
    "path": "courses/level101/python_web/python-concepts.md",
    "chars": 6668,
    "preview": "# Some Python Concepts\n\nThough you are expected to know python and its syntax at basic level, let us discuss some fundam"
  },
  {
    "path": "courses/level101/python_web/python-web-flask.md",
    "chars": 3845,
    "preview": "# Python, Web and Flask\n\nBack in the old days, websites were simple. They were simple static html contents. A webserver "
  },
  {
    "path": "courses/level101/python_web/sre-conclusion.md",
    "chars": 5057,
    "preview": "# Conclusion\n\n## Scaling The App\n\nThe design and development is just a part of the journey. We will need to setup contin"
  },
  {
    "path": "courses/level101/python_web/url-shorten-app.md",
    "chars": 5225,
    "preview": "# The URL Shortening App\n\nLet's build a very simple URL-shortening app using Flask and try to incorporate all aspects of"
  },
  {
    "path": "courses/level101/security/conclusion.md",
    "chars": 2479,
    "preview": "# Conclusion\n\nNow that you have completed this course on Security you are now aware of the possible security threats to "
  },
  {
    "path": "courses/level101/security/fundamentals.md",
    "chars": 33214,
    "preview": "# Part I: Fundamentals\n\n## Introduction to Security Overview for SRE\n\n- If you look closely, both Site Reliability Engin"
  },
  {
    "path": "courses/level101/security/intro.md",
    "chars": 1694,
    "preview": "# Security\n\n## Prerequisites\n\n1. [Linux Basics](https://linkedin.github.io/school-of-sre/level101/linux_basics/intro/)\n\n"
  },
  {
    "path": "courses/level101/security/network_security.md",
    "chars": 43376,
    "preview": "# Part II: Network Security\n\n## Introduction\n\n- TCP/IP is the dominant networking technology today. It is a five-layer a"
  },
  {
    "path": "courses/level101/security/threats_attacks_defences.md",
    "chars": 22459,
    "preview": "# Part III: Threats, Attacks & Defense\n\n## DNS Protection\n\n### Cache Poisoning Attack\n\n- Since DNS responses are cached,"
  },
  {
    "path": "courses/level101/security/writing_secure_code.md",
    "chars": 5080,
    "preview": "# PART IV: Writing Secure Code & More\n\nThe first and most important step in reducing security and reliability issues is "
  },
  {
    "path": "courses/level101/systems_design/availability.md",
    "chars": 5649,
    "preview": "# HA - Availability - Common “Nines”\nAvailability is generally expressed as “Nines”, common ‘Nines’  are listed below.\n\n"
  },
  {
    "path": "courses/level101/systems_design/conclusion.md",
    "chars": 598,
    "preview": "# Conclusion\n\nArmed with these principles, we hope the course will give a fresh perspective to design software systems. "
  },
  {
    "path": "courses/level101/systems_design/fault-tolerance.md",
    "chars": 4874,
    "preview": "# Fault Tolerance\n\nFailures are not avoidable in any system and will happen all the time, hence we need to build systems"
  },
  {
    "path": "courses/level101/systems_design/intro.md",
    "chars": 2610,
    "preview": "# Systems Design\n\n## Prerequisites\n\nFundamentals of common software system components:\n\n- [Linux Basics](https://linkedi"
  },
  {
    "path": "courses/level101/systems_design/scalability.md",
    "chars": 13949,
    "preview": "# Scalability\n**What does scalability mean for a system/service?** A system is composed of services/components, each ser"
  },
  {
    "path": "courses/level102/.level102",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "courses/level102/containerization_and_orchestration/conclusion.md",
    "chars": 628,
    "preview": "# Conclusion\n\nIn this sub-module we have toured the world of containers starting from why we use containers, how contain"
  },
  {
    "path": "courses/level102/containerization_and_orchestration/containerization_with_docker.md",
    "chars": 6482,
    "preview": "## Introduction\n\nDocker has gained huge popularity among other container engines since it was released to the public in "
  },
  {
    "path": "courses/level102/containerization_and_orchestration/intro.md",
    "chars": 5062,
    "preview": "# Containers and orchestration\n\n## Introduction\n\nContainers, Docker and Kubernetes are \"cool\" terms that are being spoke"
  },
  {
    "path": "courses/level102/containerization_and_orchestration/intro_to_containers.md",
    "chars": 15000,
    "preview": "## What are containers\n\nHere's a popular definition of containers according to [Docker](https://www.docker.com/resources"
  },
  {
    "path": "courses/level102/containerization_and_orchestration/orchestration_with_kubernetes.md",
    "chars": 13403,
    "preview": "## Introduction\n\nNow we finally arrive at the most awaited part: running and managing containers at scale. So far, we ha"
  },
  {
    "path": "courses/level102/continuous_integration_and_continuous_delivery/cicd_brief_history.md",
    "chars": 3250,
    "preview": "## The Evolution of the CI/CD\n\nTraditional development approaches have been around for a very long time. The [waterfall "
  },
  {
    "path": "courses/level102/continuous_integration_and_continuous_delivery/conclusion.md",
    "chars": 2283,
    "preview": "## Applications in SRE Role\n\nThe Monitoring, Automation and Eliminating the toil are some of the core pillars of the SRE"
  },
  {
    "path": "courses/level102/continuous_integration_and_continuous_delivery/continuous_delivery_release_pipeline.md",
    "chars": 1596,
    "preview": "***Continuous Delivery*** means deploying the application builds more frequently in the non-production environments such"
  },
  {
    "path": "courses/level102/continuous_integration_and_continuous_delivery/continuous_integration_build_pipeline.md",
    "chars": 1772,
    "preview": "CI is a software development practice where members of a team integrate their work frequently. Each integration is verif"
  },
  {
    "path": "courses/level102/continuous_integration_and_continuous_delivery/introduction.md",
    "chars": 1876,
    "preview": "## Prerequisites\n1.\t[Software Development and Maintenance](https://en.wikibooks.org/wiki/Introduction_to_Software_Engine"
  },
  {
    "path": "courses/level102/continuous_integration_and_continuous_delivery/introduction_to_cicd.md",
    "chars": 849,
    "preview": "Continuous Integration and Continuous Delivery, also known as CI/CD, is a set of processes that helps in faster integrat"
  },
  {
    "path": "courses/level102/continuous_integration_and_continuous_delivery/jenkins_cicd_pipeline_hands_on_lab.md",
    "chars": 8629,
    "preview": "## Jenkins based CI/CD Pipeline\n\nJenkins is an open-source continuous integration server for orchestrating the CI/CD pip"
  },
  {
    "path": "courses/level102/linux_intermediate/archiving_backup.md",
    "chars": 4316,
    "preview": "\n# Archiving and Backup\n\n## Introduction\nOne of the things SREs make sure of is the services are up all the time (at lea"
  },
  {
    "path": "courses/level102/linux_intermediate/bashscripting.md",
    "chars": 10440,
    "preview": "# Bash Scripting\n\n## Introduction\nAs an SRE, the Linux system sits at the core of our day to day work and so is bash scr"
  },
  {
    "path": "courses/level102/linux_intermediate/conclusion.md",
    "chars": 860,
    "preview": "# Conclusion\n\nUnderstanding package management is very crucial as an SRE, we always want the right set of software with "
  },
  {
    "path": "courses/level102/linux_intermediate/introduction.md",
    "chars": 7900,
    "preview": "\n# Linux-Intermediate\n\n## Prerequisites \n\n- Expect to have gone through the School Of SRE [<u>Linux Basics</u>](https://"
  },
  {
    "path": "courses/level102/linux_intermediate/introvim.md",
    "chars": 1862,
    "preview": "\n# Introduction to Vim\n\n## Introduction\nAs an SRE we several times log into into the servers and make changes to the con"
  },
  {
    "path": "courses/level102/linux_intermediate/package_management.md",
    "chars": 2235,
    "preview": "# Package Management\n## Introduction \n\nOne of the main features of any operating system is the ability to run other prog"
  },
  {
    "path": "courses/level102/linux_intermediate/storage_media.md",
    "chars": 13601,
    "preview": "# Storage Media\n## Introduction\n\nStorage media are devices which are used to store data and information. Linux has amazi"
  },
  {
    "path": "courses/level102/networking/conclusion.md",
    "chars": 964,
    "preview": "\nThis course would have given some background on deploying services in datacentre and various parameters to consider and"
  },
  {
    "path": "courses/level102/networking/infrastructure-features.md",
    "chars": 8536,
    "preview": "> *Some of the aspects to consider are, whether the underlying data\ncentre infrastructure supports ToR resiliency, i.e. "
  },
  {
    "path": "courses/level102/networking/introduction.md",
    "chars": 6882,
    "preview": "\n# Prerequisites\n\nIt is recommended to have basic knowledge of network security, TCP and\ndatacenter setup and the common"
  },
  {
    "path": "courses/level102/networking/rtt.md",
    "chars": 1689,
    "preview": "> *Latency plays a key role in determining the overall performance of the\ndistributed service/application, where calls a"
  },
  {
    "path": "courses/level102/networking/scale.md",
    "chars": 7032,
    "preview": "> *Deploying large scale applications, require a better understanding of\ninfrastructure capabilities, in terms of resour"
  },
  {
    "path": "courses/level102/networking/security.md",
    "chars": 8474,
    "preview": "> *This section will cover threat vectors faced by services facing\nexternal/internal clients. Potential mitigation optio"
  },
  {
    "path": "courses/level102/system_calls_and_signals/conclusion.md",
    "chars": 1418,
    "preview": "# Conclusion\n\nOne of the main goals of a SRE is to improve the reliability of high scale systems. Inorder to achieve thi"
  },
  {
    "path": "courses/level102/system_calls_and_signals/intro.md",
    "chars": 2411,
    "preview": "# System Calls and Signals\n\n## Prerequisites\n\n- [Linux Basics](https://linkedin.github.io/school-of-sre/level101/linux_b"
  },
  {
    "path": "courses/level102/system_calls_and_signals/signals.md",
    "chars": 13823,
    "preview": "## Introduction to interrupts and signals\n\nAn interrupt is an event that alters the normal execution flow of a program a"
  },
  {
    "path": "courses/level102/system_calls_and_signals/system_calls.md",
    "chars": 16696,
    "preview": "## Introduction\n\nA system call is a controlled entry point into the kernel, allowing a process to\nrequest the kernel to "
  },
  {
    "path": "courses/level102/system_design/conclusion.md",
    "chars": 671,
    "preview": "We have looked at designing a sytem from the scratch, scaling it up from a single server to multiple datacenters and hun"
  },
  {
    "path": "courses/level102/system_design/intro.md",
    "chars": 5378,
    "preview": "# System Design\n\n## Prerequisites\n\n- [School of SRE - System Design - Phase I](https://linkedin.github.io/school-of-sre/"
  },
  {
    "path": "courses/level102/system_design/large-system-design.md",
    "chars": 8490,
    "preview": "\nDesigning a system usually starts out to be abstract - we have large functional blocks that need to work together and a"
  },
  {
    "path": "courses/level102/system_design/resiliency.md",
    "chars": 5342,
    "preview": "A resilient system is one that can keep functioning in the face of\nadversity. With our application, there can be numerou"
  },
  {
    "path": "courses/level102/system_design/scaling-beyond-the-datacenter.md",
    "chars": 4857,
    "preview": "## Caching static assets\n\nExtending the existing caching solution a bit, we arrive at Content Delivery Networks(CDNs). C"
  },
  {
    "path": "courses/level102/system_design/scaling.md",
    "chars": 9352,
    "preview": "\nIn the Phase 1 of this course, we had seen AKF [scale cube](https://akfpartners.com/growth-blog/scale-cube) and how it "
  },
  {
    "path": "courses/level102/system_troubleshooting_and_performance/conclusion.md",
    "chars": 1250,
    "preview": "Complex systems have many factors which can go wrong. It can be a bad design & architecture, poorly managed code, poor p"
  },
  {
    "path": "courses/level102/system_troubleshooting_and_performance/important-tools.md",
    "chars": 2475,
    "preview": "### Important linux commands\n\nHaving knowledge of following commands will help find issues faster. Elaborating each comm"
  },
  {
    "path": "courses/level102/system_troubleshooting_and_performance/introduction.md",
    "chars": 5000,
    "preview": "# System troubleshooting and performance improvements\n\n## Prerequisites\n\n* [Linux Basics](https://linkedin.github.io/sch"
  },
  {
    "path": "courses/level102/system_troubleshooting_and_performance/performance-improvements.md",
    "chars": 5198,
    "preview": "Performance tools are an important part of development/operations lifecycle, Its highly important for understanding appl"
  },
  {
    "path": "courses/level102/system_troubleshooting_and_performance/troubleshooting-example.md",
    "chars": 4518,
    "preview": "In this section we will see an example of an issue and try to troubleshoot it, and at the end a few famous troubleshooti"
  },
  {
    "path": "courses/level102/system_troubleshooting_and_performance/troubleshooting.md",
    "chars": 3604,
    "preview": "Troubleshooting system failures can be tricky or tedious at times. In this practice we need to examine the end-to-end fl"
  },
  {
    "path": "courses/sre_community.md",
    "chars": 529,
    "preview": "We are having an active [LinkedIn](https://www.linkedin.com) community for School of SRE.\n \n**Please join the group via*"
  },
  {
    "path": "courses/stylesheets/custom.css",
    "chars": 568,
    "preview": "div.md-content img { border: 4px solid #ddd; padding: 12px; }\n.callout {\n  padding: 20px;\n  margin: 20px 0;\n  border: 1p"
  },
  {
    "path": "mkdocs.yml",
    "chars": 8399,
    "preview": "site_url: https://linkedin.github.io/school-of-sre/\nsite_name: School Of SRE\ndocs_dir: courses\ntheme:\n  name: material\n "
  },
  {
    "path": "overrides/partials/header.html",
    "chars": 2425,
    "preview": "{% block libs %}\n   <script async defer data-domain=\"linkedin.github.io\" src=\"https://tracking.eskratch.com/js/plausible"
  },
  {
    "path": "overrides/partials/nav-item.html",
    "chars": 7836,
    "preview": "<!-- Render navigation link status -->\n{% macro render_status(nav_item, type) %}\n  {% set class = \"md-status md-status--"
  },
  {
    "path": "overrides/partials/nav.html",
    "chars": 817,
    "preview": "{% import \"partials/nav-item.html\" as item with context %}\n\n<!-- Determine classes -->\n{% set class = \"md-nav md-nav--pr"
  },
  {
    "path": "requirements.txt",
    "chars": 52,
    "preview": "mkdocs==1.5.3\nmkdocs-material==9.5.12\njinja2>=3.0.2\n"
  }
]

