[
  {
    "path": ".travis.yml",
    "content": "language: python\n\nsudo: false\n\ninstall: \n    - pip install doc8\n    - npm install -g write-good\n\nscript:\n    - doc8 README.rst\n    - write-good README.rst --so --thereIs --cliches\n"
  },
  {
    "path": "README.rst",
    "content": "What happens when...\n====================\n\nThis repository is an attempt to answer the age-old interview question \"What\nhappens when you type google.com into your browser's address box and press\nenter?\"\n\nExcept instead of the usual story, we're going to try to answer this question\nin as much detail as possible. No skipping out on anything.\n\nThis is a collaborative process, so dig in and try to help out! There are tons\nof details missing, just waiting for you to add them! So send us a pull\nrequest, please!\n\nThis is all licensed under the terms of the `Creative Commons Zero`_ license.\n\nRead this in `简体中文`_ (simplified Chinese), `日本語`_ (Japanese), `한국어`_\n(Korean) and `Spanish`_. NOTE: these have not been reviewed by the alex/what-happens-when\nmaintainers.\n\nTable of Contents\n====================\n\n.. contents::\n   :backlinks: none\n   :local:\n\nThe \"g\" key is pressed\n----------------------\nThe following sections explain the physical keyboard actions\nand the OS interrupts. When you press the key \"g\" the browser receives the\nevent and the auto-complete functions kick in.\nDepending on your browser's algorithm and if you are in\nprivate/incognito mode or not various suggestions will be presented\nto you in the dropdown below the URL bar. Most of these algorithms sort\nand prioritize results based on search history, bookmarks, cookies, and\npopular searches from the internet as a whole. As you are typing\n\"google.com\" many blocks of code run and the suggestions will be refined\nwith each keypress. It may even suggest \"google.com\" before you finish typing\nit.\n\nThe \"enter\" key bottoms out\n---------------------------\n\nTo pick a zero point, let's choose the Enter key on the keyboard hitting the\nbottom of its range. At this point, an electrical circuit specific to the enter\nkey is closed (either directly or capacitively). 
This allows a small amount of\ncurrent to flow into the logic circuitry of the keyboard, which scans the state\nof each key switch, debounces the electrical noise of the rapid intermittent\nclosure of the switch, and converts it to a keycode integer, in this case 13.\nThe keyboard controller then encodes the keycode for transport to the computer.\nThis is now almost universally over a Universal Serial Bus (USB) or Bluetooth\nconnection, but historically has been over PS/2 or ADB connections.\n\n*In the case of the USB keyboard:*\n\n- The USB circuitry of the keyboard is powered by the 5V supply provided over\n  pin 1 from the computer's USB host controller.\n\n- The keycode generated is stored by internal keyboard circuitry memory in a\n  register called \"endpoint\".\n\n- The host USB controller polls that \"endpoint\" every ~10ms (minimum value\n  declared by the keyboard), so it gets the keycode value stored on it.\n\n- This value goes to the USB SIE (Serial Interface Engine) to be converted in\n  one or more USB packets that follow the low-level USB protocol.\n\n- Those packets are sent by a differential electrical signal over D+ and D-\n  pins (the middle 2) at a maximum speed of 1.5 Mb/s, as an HID\n  (Human Interface Device) device is always declared to be a \"low-speed device\"\n  (USB 2.0 compliance).\n\n- This serial signal is then decoded at the computer's host USB controller, and\n  interpreted by the computer's Human Interface Device (HID) universal keyboard\n  device driver.  The value of the key is then passed into the operating\n  system's hardware abstraction layer.\n\n*In the case of Virtual Keyboard (as in touch screen devices):*\n\n- When the user puts their finger on a modern capacitive touch screen, a\n  tiny amount of current gets transferred to the finger. This completes the\n  circuit through the electrostatic field of the conductive layer and\n  creates a voltage drop at that point on the screen. 
The\n  ``screen controller`` then raises an interrupt reporting the coordinate of\n  the keypress.\n\n- Then the mobile OS notifies the currently focused application of a press event\n  in one of its GUI elements (which now is the virtual keyboard application\n  buttons).\n\n- The virtual keyboard can now raise a software interrupt for sending a\n  'key pressed' message back to the OS.\n\n- This interrupt notifies the currently focused application of a 'key pressed'\n  event.\n\n\nInterrupt fires [NOT for USB keyboards]\n---------------------------------------\n\nThe keyboard sends signals on its interrupt request line (IRQ), which is mapped\nto an ``interrupt vector`` (integer) by the interrupt controller. The CPU uses\nthe ``Interrupt Descriptor Table`` (IDT) to map the interrupt vectors to\nfunctions (``interrupt handlers``) which are supplied by the kernel. When an\ninterrupt arrives, the CPU indexes the IDT with the interrupt vector and runs\nthe appropriate handler. Thus, the kernel is entered.\n\n(On Windows) A ``WM_KEYDOWN`` message is sent to the app\n--------------------------------------------------------\n\nThe HID transport passes the key down event to the ``KBDHID.sys`` driver which\nconverts the HID usage into a scancode. In this case, the scan code is\n``VK_RETURN`` (``0x0D``). The ``KBDHID.sys`` driver interfaces with the\n``KBDCLASS.sys`` (keyboard class driver). This driver is responsible for\nhandling all keyboard and keypad input in a secure manner. It then calls into\n``Win32K.sys`` (after potentially passing the message through 3rd party\nkeyboard filters that are installed). This all happens in kernel mode.\n\n``Win32K.sys`` figures out what window is the active window through the\n``GetForegroundWindow()`` API. This API provides the window handle of the\nbrowser's address box. The main Windows \"message pump\" then calls\n``SendMessage(hWnd, WM_KEYDOWN, VK_RETURN, lParam)``. 
``lParam`` is a bitmask\nthat indicates further information about the keypress: repeat count (0 in this\ncase), the actual scan code (can be OEM dependent, but generally wouldn't be\nfor ``VK_RETURN``), whether extended keys (e.g. alt, shift, ctrl) were also\npressed (they weren't), and some other state.\n\nThe Windows ``SendMessage`` API is a straightforward function that\nadds the message to a queue for the particular window handle (``hWnd``).\nLater, the main message processing function (called a ``WindowProc``) assigned\nto the ``hWnd`` is called in order to process each message in the queue.\n\nThe window (``hWnd``) that is active is actually an edit control and the\n``WindowProc`` in this case has a message handler for ``WM_KEYDOWN`` messages.\nThis code looks within the 3rd parameter that was passed to ``SendMessage``\n(``wParam``) and, because it is ``VK_RETURN`` knows the user has hit the ENTER\nkey.\n\n(On OS X) A ``KeyDown`` NSEvent is sent to the app\n--------------------------------------------------\n\nThe interrupt signal triggers an interrupt event in the I/O Kit kext keyboard\ndriver. The driver translates the signal into a key code which is passed to the\nOS X ``WindowServer`` process. Resultantly, the ``WindowServer`` dispatches an\nevent to any appropriate (e.g. active or listening) applications through their\nMach port where it is placed into an event queue. Events can then be read from\nthis queue by threads with sufficient privileges calling the\n``mach_ipc_dispatch`` function. This most commonly occurs through, and is\nhandled by, an ``NSApplication`` main event loop, via an ``NSEvent`` of\n``NSEventType`` ``KeyDown``.\n\n(On GNU/Linux) the Xorg server listens for keycodes\n---------------------------------------------------\n\nWhen a graphical ``X server`` is used, ``X`` will use the generic event\ndriver ``evdev`` to acquire the keypress. 
A re-mapping of keycodes to scancodes\nis made with ``X server`` specific keymaps and rules.\nWhen the scancode mapping of the key pressed is complete, the ``X server``\nsends the character to the ``window manager`` (DWM, metacity, i3, etc.), so the\n``window manager`` in turn sends the character to the focused window.\nThe graphical API of the window that receives the character prints the\nappropriate font symbol in the appropriate focused field.\n\nParse URL\n---------\n\n* The browser now has the following information contained in the URL (Uniform\n  Resource Locator):\n\n    - ``Protocol``  \"http\"\n        Use 'Hyper Text Transfer Protocol'\n\n    - ``Resource``  \"/\"\n        Retrieve main (index) page\n\n\nIs it a URL or a search term?\n-----------------------------\n\nWhen no protocol or valid domain name is given, the browser proceeds to feed\nthe text given in the address box to the browser's default web search engine.\nIn many cases the URL has a special piece of text appended to it to tell the\nsearch engine that it came from a particular browser's URL bar.\n\nConvert non-ASCII Unicode characters in the hostname\n----------------------------------------------------\n\n* The browser checks the hostname for characters that are not in ``a-z``,\n  ``A-Z``, ``0-9``, ``-``, or ``.``.\n* Since the hostname is ``google.com`` there won't be any, but if there were\n  the browser would apply `Punycode`_ encoding to the hostname portion of the\n  URL.\n\nCheck HSTS list\n---------------\n* The browser checks its \"preloaded HSTS (HTTP Strict Transport Security)\"\n  list. This is a list of websites that have requested to be contacted via\n  HTTPS only.\n* If the website is in the list, the browser sends its request via HTTPS\n  instead of HTTP. Otherwise, the initial request is sent via HTTP.\n  (Note that a website can still use the HSTS policy *without* being in the\n  HSTS list.  
The first HTTP request to the website by a user will receive a\n  response requesting that the user only send HTTPS requests.  However, this\n  single HTTP request could potentially leave the user vulnerable to a\n  `downgrade attack`_, which is why the HSTS list is included in modern web\n  browsers.)\n\nDNS lookup\n----------\n\n* Browser checks if the domain is in its cache. (to see the DNS Cache in\n  Chrome, go to `chrome://net-internals/#dns <chrome://net-internals/#dns>`_).\n* If not found, the browser calls ``gethostbyname`` library function (varies by\n  OS) to do the lookup.\n* ``gethostbyname`` checks if the hostname can be resolved by reference in the\n  local ``hosts`` file (whose location `varies by OS`_) before trying to\n  resolve the hostname through DNS.\n* If ``gethostbyname`` does not have it cached nor can find it in the ``hosts``\n  file then it makes a request to the DNS server configured in the network\n  stack. This is typically the local router or the ISP's caching DNS server.\n* If the DNS server is on the same subnet the network library follows the\n  ``ARP process`` below for the DNS server.\n* If the DNS server is on a different subnet, the network library follows\n  the ``ARP process`` below for the default gateway IP.\n\n\nARP process\n-----------\n\nIn order to send an ARP (Address Resolution Protocol) broadcast the network\nstack library needs the target IP address to lookup. It also needs to know the\nMAC address of the interface it will use to send out the ARP broadcast.\n\nThe ARP cache is first checked for an ARP entry for our target IP. If it is in\nthe cache, the library function returns the result: Target IP = MAC.\n\nIf the entry is not in the ARP cache:\n\n* The route table is looked up, to see if the Target IP address is on any of\n  the subnets on the local route table. If it is, the library uses the\n  interface associated with that subnet. 
If it is not, the library uses the\n  interface that has the subnet of our default gateway.\n\n* The MAC address of the selected network interface is looked up.\n\n* The network library sends a Layer 2 (data link layer of the `OSI model`_)\n  ARP request:\n\n``ARP Request``::\n\n    Sender MAC: interface:mac:address:here\n    Sender IP: interface.ip.goes.here\n    Target MAC: FF:FF:FF:FF:FF:FF (Broadcast)\n    Target IP: target.ip.goes.here\n\nDepending on what type of hardware is between the computer and the router:\n\nDirectly connected:\n\n* If the computer is directly connected to the router, the router responds\n  with an ``ARP Reply`` (see below).\n\nHub:\n\n* If the computer is connected to a hub, the hub will broadcast the ARP\n  request out of all other ports. If the router is connected on the same \"wire\",\n  it will respond with an ``ARP Reply`` (see below).\n\nSwitch:\n\n* If the computer is connected to a switch, the switch will check its local\n  CAM/MAC table to see which port has the MAC address we are looking for. 
If\n  the switch has no entry for the MAC address, it will rebroadcast the ARP\n  request to all other ports.\n\n* If the switch has an entry in the MAC/CAM table, it will send the ARP request\n  to the port that has the MAC address we are looking for.\n\n* If the router is on the same \"wire\", it will respond with an ``ARP Reply``\n  (see below).\n\n``ARP Reply``::\n\n    Sender MAC: target:mac:address:here\n    Sender IP: target.ip.goes.here\n    Target MAC: interface:mac:address:here\n    Target IP: interface.ip.goes.here\n\nNow that the network library has the MAC address of either our DNS server or\nthe default gateway, it can resume its DNS process:\n\n* The DNS client establishes a socket to UDP port 53 on the DNS server,\n  using a source port above 1023.\n* If the response size is too large, TCP will be used instead.\n* If the local/ISP DNS server does not have it, then a recursive search is\n  requested and the query flows up the list of DNS servers until the SOA is\n  reached; if a record is found, the answer is returned.\n\nOpening of a socket\n-------------------\nOnce the browser receives the IP address of the destination server, it takes\nthat and the given port number from the URL (the HTTP protocol defaults to port\n80, and HTTPS to port 443), and makes a call to the system library function\nnamed ``socket`` and requests a TCP socket stream - ``AF_INET/AF_INET6`` and\n``SOCK_STREAM``.\n\n* This request is first passed to the Transport Layer where a TCP segment is\n  crafted. The destination port is added to the header, and a source port is\n  chosen from within the kernel's dynamic port range (``ip_local_port_range``\n  in Linux).\n* This segment is sent to the Network Layer, which wraps an additional IP\n  header. The IP address of the destination server as well as that of the\n  current machine is inserted to form a packet.\n* The packet next arrives at the Link Layer. 
A frame header is added that\n  includes the MAC address of the machine's NIC as well as the MAC address of\n  the gateway (local router). As before, if the kernel does not know the MAC\n  address of the gateway, it must broadcast an ARP query to find it.\n\nAt this point the packet is ready to be transmitted through one of:\n\n* `Ethernet`_\n* `WiFi`_\n* `Cellular data network`_\n\nFor most home or small business Internet connections the packet will pass from\nyour computer, possibly through a local network, and then through a modem\n(MOdulator/DEModulator) which converts digital 1's and 0's into an analog\nsignal suitable for transmission over telephone, cable, or wireless telephony\nconnections. On the other end of the connection is another modem which converts\nthe analog signal back into digital data to be processed by the next `network\nnode`_ where the source and destination addresses would be analyzed further.\n\nMost larger businesses and some newer residential connections will have fiber\nor direct Ethernet connections in which case the data remains digital and\nis passed directly to the next `network node`_ for processing.\n\nEventually, the packet will reach the router managing the local subnet. From\nthere, it will continue to travel to the autonomous system's (AS) border\nrouters, other ASes, and finally to the destination server. Each router along\nthe way extracts the destination address from the IP header and routes the\npacket to the appropriate next hop. The time to live (TTL) field in the IP\nheader is decremented by one by each router the packet passes through. 
The packet will be dropped if\nthe TTL field reaches zero or if the current router has no space in its queue\n(perhaps due to network congestion).\n\nThis send and receive happens multiple times following the TCP connection flow:\n\n* Client chooses an initial sequence number (ISN) and sends the packet to the\n  server with the SYN bit set to indicate it is setting the ISN\n* Server receives SYN and if it's in an agreeable mood:\n   * Server chooses its own initial sequence number\n   * Server sets SYN to indicate it is choosing its ISN\n   * Server copies the (client ISN + 1) to its ACK field and adds the ACK flag\n     to indicate it is acknowledging receipt of the first packet\n* Client acknowledges the connection by sending a packet:\n   * Increases its own sequence number\n   * Increases the receiver acknowledgment number\n   * Sets the ACK flag\n* Data is transferred as follows:\n   * As one side sends N data bytes, it increases its SEQ by that number\n   * When the other side acknowledges receipt of that packet (or a string of\n     packets), it sends an ACK packet with the ACK value equal to the last\n     received sequence from the other\n* To close the connection:\n   * The closer sends a FIN packet\n   * The other side ACKs the FIN packet and sends its own FIN\n   * The closer acknowledges the other side's FIN with an ACK\n\nTLS handshake\n-------------\n* The client computer sends a ``ClientHello`` message to the server with its\n  Transport Layer Security (TLS) version, list of cipher algorithms and\n  compression methods available.\n\n* The server replies with a ``ServerHello`` message to the client with the\n  TLS version, selected cipher, selected compression methods and the server's\n  public certificate signed by a CA (Certificate Authority). 
The certificate\n  contains a public key that will be used by the client to encrypt the rest of\n  the handshake until a symmetric key can be agreed upon.\n\n* The client verifies the server's digital certificate against its list of\n  trusted CAs. If trust can be established based on the CA, the client\n  generates a string of pseudo-random bytes and encrypts them with the server's\n  public key. These random bytes can be used to determine the symmetric key.\n\n* The server decrypts the random bytes using its private key and uses these\n  bytes to generate its own copy of the symmetric master key.\n\n* The client sends a ``Finished`` message to the server, encrypting a hash of\n  the transmission up to this point with the symmetric key.\n\n* The server generates its own hash, and then decrypts the client-sent hash\n  to verify that it matches. If it does, it sends its own ``Finished`` message\n  to the client, also encrypted with the symmetric key.\n\n* From now on the TLS session transmits the application (HTTP) data encrypted\n  with the agreed symmetric key.\n\nIf a packet is dropped\n----------------------\n\nSometimes, due to network congestion or flaky hardware connections, TLS packets\nwill be dropped before they get to their final destination. The sender then has\nto decide how to react. The algorithm for this is called `TCP congestion\ncontrol`_. This varies depending on the sender; the most common algorithms are\n`cubic`_ on newer operating systems and `New Reno`_ on almost all others.\n\n* Client chooses a `congestion window`_ based on the `maximum segment size`_\n  (MSS) of the connection.\n* For each packet acknowledged, the window grows by one MSS, which roughly\n  doubles it every round trip, until it reaches the 'slow-start threshold'.\n  In some implementations, this threshold is adaptive.\n* After reaching the slow-start threshold, the window increases additively for\n  each packet acknowledged. 
If a packet is dropped, the window reduces\n  exponentially until another packet is acknowledged.\n\nHTTP protocol\n-------------\n\nIf the web browser used was written by Google, instead of sending an HTTP\nrequest to retrieve the page, it will send a request to try and negotiate with\nthe server an \"upgrade\" from HTTP to the SPDY protocol.\n\nIf the client is using the HTTP protocol and does not support SPDY, it sends a\nrequest to the server of the form::\n\n    GET / HTTP/1.1\n    Host: google.com\n    Connection: close\n    [other headers]\n\nwhere ``[other headers]`` refers to a series of colon-separated key-value pairs\nformatted as per the HTTP specification and separated by single newlines.\n(This assumes the web browser being used doesn't have any bugs violating the\nHTTP spec. This also assumes that the web browser is using ``HTTP/1.1``,\notherwise it may not include the ``Host`` header in the request and the version\nspecified in the ``GET`` request will either be ``HTTP/1.0`` or ``HTTP/0.9``.)\n\nHTTP/1.1 defines the \"close\" connection option for the sender to signal that\nthe connection will be closed after completion of the response. For example,\n\n    Connection: close\n\nHTTP/1.1 applications that do not support persistent connections MUST include\nthe \"close\" connection option in every message.\n\nAfter sending the request and headers, the web browser sends a single blank\nnewline to the server indicating that the content of the request is done.\n\nThe server responds with a response code denoting the status of the request and\nresponds with a response of the form::\n\n    200 OK\n    [response headers]\n\nFollowed by a single newline, and then sends a payload of the HTML content of\n``www.google.com``. 
The server may then either close the connection, or if\nheaders sent by the client requested it, keep the connection open to be reused\nfor further requests.\n\nIf the HTTP headers sent by the web browser included sufficient information for\nthe webserver to determine if the version of the file cached by the web\nbrowser has been unmodified since the last retrieval (i.e., if the web browser\nsent an ``If-None-Match`` header carrying a previously received ``ETag``\nvalue), it may instead respond with a response of the form::\n\n    304 Not Modified\n    [response headers]\n\nand no payload, and the web browser instead retrieves the HTML from its cache.\n\nAfter parsing the HTML, the web browser (and server) repeats this process\nfor every resource (image, CSS, favicon.ico, etc.) referenced by the HTML page,\nexcept instead of ``GET / HTTP/1.1`` the request will be\n``GET /$(URL relative to www.google.com) HTTP/1.1``.\n\nIf the HTML referenced a resource on a different domain than\n``www.google.com``, the web browser goes back to the steps involved in\nresolving the other domain, and follows all steps up to this point for that\ndomain. The ``Host`` header in the request will be set to the appropriate\nserver name instead of ``google.com``.\n\nHTTP Server Request Handle\n--------------------------\nThe HTTPD (HTTP Daemon) server is the one handling the requests/responses on\nthe server side. The most common HTTPD servers are Apache or nginx for Linux\nand IIS for Windows.\n\n* The HTTPD (HTTP Daemon) receives the request.\n* The server breaks down the request into the following parameters:\n   * HTTP Request Method (either ``GET``, ``HEAD``, ``POST``, ``PUT``,\n     ``PATCH``, ``DELETE``, ``CONNECT``, ``OPTIONS``, or ``TRACE``). 
In the\n     case of a URL entered directly into the address bar, this will be ``GET``.\n   * Domain, in this case - google.com.\n   * Requested path/page, in this case - / (as no specific path/page was\n     requested, / is the default path).\n* The server verifies that there is a Virtual Host configured on the server\n  that corresponds with google.com.\n* The server verifies that google.com can accept GET requests.\n* The server verifies that the client is allowed to use this method\n  (by IP, authentication, etc.).\n* If the server has a rewrite module installed (like mod_rewrite for Apache or\n  URL Rewrite for IIS), it tries to match the request against one of the\n  configured rules. If a matching rule is found, the server uses that rule to\n  rewrite the request.\n* The server goes to pull the content that corresponds with the request,\n  in our case it will fall back to the index file, as \"/\" is the main file\n  (some cases can override this, but this is the most common method).\n* The server parses the file according to the handler. If Google\n  is running on PHP, the server uses PHP to interpret the index file, and\n  streams the output to the client.\n\nBehind the scenes of the Browser\n----------------------------------\n\nOnce the server supplies the resources (HTML, CSS, JS, images, etc.)\nto the browser it undergoes the below process:\n\n* Parsing - HTML, CSS, JS\n* Rendering - Construct DOM Tree → Render Tree → Layout of Render Tree →\n  Painting the render tree\n\nBrowser\n-------\n\nThe browser's functionality is to present the web resource you choose, by\nrequesting it from the server and displaying it in the browser window.\nThe resource is usually an HTML document, but may also be a PDF,\nimage, or some other type of content. The location of the resource is\nspecified by the user using a URI (Uniform Resource Identifier).\n\nThe way the browser interprets and displays HTML files is specified\nin the HTML and CSS specifications. 
These specifications are maintained\nby the W3C (World Wide Web Consortium) organization, which is the\nstandards organization for the web.\n\nBrowser user interfaces have a lot in common with each other. Among the\ncommon user interface elements are:\n\n* An address bar for inserting a URI\n* Back and forward buttons\n* Bookmarking options\n* Refresh and stop buttons for refreshing or stopping the loading of\n  current documents\n* Home button that takes you to your home page\n\n**Browser High-Level Structure**\n\nThe components of the browsers are:\n\n* **User interface:** The user interface includes the address bar,\n  back/forward button, bookmarking menu, etc. Every part of the browser\n  display except the window where you see the requested page.\n* **Browser engine:** The browser engine marshals actions between the UI\n  and the rendering engine.\n* **Rendering engine:** The rendering engine is responsible for displaying\n  requested content. For example if the requested content is HTML, the\n  rendering engine parses HTML and CSS, and displays the parsed content on\n  the screen.\n* **Networking:** The networking handles network calls such as HTTP requests,\n  using different implementations for different platforms behind a\n  platform-independent interface.\n* **UI backend:** The UI backend is used for drawing basic widgets like combo\n  boxes and windows. This backend exposes a generic interface that is not\n  platform-specific.\n  Underneath it uses operating system user interface methods.\n* **JavaScript engine:** The JavaScript engine is used to parse and\n  execute JavaScript code.\n* **Data storage:** The data storage is a persistence layer. The browser may\n  need to save all sorts of data locally, such as cookies. Browsers also\n  support storage mechanisms such as localStorage, IndexedDB, WebSQL and\n  FileSystem.\n\nHTML parsing\n------------\n\nThe rendering engine starts getting the contents of the requested\ndocument from the networking layer. 
This will usually be done in 8 kB chunks.\n\nThe primary job of the HTML parser is to parse the HTML markup into a parse tree.\n\nThe output tree (the \"parse tree\") is a tree of DOM element and attribute\nnodes. DOM is short for Document Object Model. It is the object representation\nof the HTML document and the interface of HTML elements to the outside world\nlike JavaScript. The root of the tree is the \"Document\" object. Prior to\nany manipulation via scripting, the DOM has an almost one-to-one relation to\nthe markup.\n\n**The parsing algorithm**\n\nHTML cannot be parsed using the regular top-down or bottom-up parsers.\n\nThe reasons are:\n\n* The forgiving nature of the language.\n* The fact that browsers have traditional error tolerance to support\n  well-known cases of invalid HTML.\n* The parsing process is reentrant. For other languages, the source doesn't\n  change during parsing, but in HTML, dynamic code (such as script elements\n  containing ``document.write()`` calls) can add extra tokens, so the parsing\n  process actually modifies the input.\n\nUnable to use the regular parsing techniques, the browser utilizes a custom\nparser for parsing HTML. The parsing algorithm is described in\ndetail by the HTML5 specification.\n\nThe algorithm consists of two stages: tokenization and tree construction.\n\n**Actions when the parsing is finished**\n\nThe browser begins fetching external resources linked to the page (CSS, images,\nJavaScript files, etc.).\n\nAt this stage the browser marks the document as interactive and starts\nparsing scripts that are in \"deferred\" mode: those that should be\nexecuted after the document is parsed. The document state is\nset to \"complete\" and a \"load\" event is fired.\n\nNote there is never an \"Invalid Syntax\" error on an HTML page. 
Browsers fix\nany invalid content and go on.\n\nCSS interpretation\n------------------\n\n* Parse CSS files, ``<style>`` tag contents, and ``style`` attribute\n  values using `\"CSS lexical and syntax grammar\"`_.\n* Each CSS file is parsed into a ``StyleSheet object``, where each object\n  contains CSS rules with selectors and objects corresponding to the CSS\n  grammar.\n* A CSS parser can be top-down or bottom-up when a specific parser generator\n  is used.\n\nPage Rendering\n--------------\n\n* Create a 'Frame Tree' or 'Render Tree' by traversing the DOM nodes, and\n  calculating the CSS style values for each node.\n* Calculate the preferred width of each node in the 'Frame Tree' bottom-up\n  by summing the preferred width of the child nodes and the node's\n  horizontal margins, borders, and padding.\n* Calculate the actual width of each node top-down by allocating each node's\n  available width to its children.\n* Calculate the height of each node bottom-up by applying text wrapping and\n  summing the child node heights and the node's margins, borders, and padding.\n* Calculate the coordinates of each node using the information calculated\n  above.\n* More complicated steps are taken when elements are ``floated``,\n  positioned ``absolutely`` or ``relatively``, or other complex features\n  are used. See\n  http://dev.w3.org/csswg/css2/ and http://www.w3.org/Style/CSS/current-work\n  for more details.\n* Create layers to describe which parts of the page can be animated as a group\n  without being re-rasterized. Each frame/render object is assigned to a layer.\n* Textures are allocated for each layer of the page.\n* The frame/render objects for each layer are traversed and drawing commands\n  are executed for their respective layer. 
This may be rasterized by the CPU\n  or drawn on the GPU directly using D2D/SkiaGL.\n* All of the above steps may reuse calculated values from the last time the\n  webpage was rendered, so that incremental changes require less work.\n* The page layers are sent to the compositing process where they are combined\n  with layers for other visible content like the browser chrome, iframes\n  and addon panels.\n* Final layer positions are computed and the composite commands are issued\n  via Direct3D/OpenGL. The GPU command buffer(s) are flushed to the GPU for\n  asynchronous rendering and the frame is sent to the window server.\n\nGPU Rendering\n-------------\n\n* During the rendering process the graphical computing layers can use the\n  general-purpose ``CPU`` or the graphics processor ``GPU``.\n\n* When using the ``GPU`` for graphical rendering computations, the graphical\n  software layers split the task into multiple pieces, so they can take\n  advantage of the ``GPU``'s massive parallelism for the floating-point\n  calculations required for the rendering process.\n\n\nWindow Server\n-------------\n\nPost-rendering and user-induced execution\n-----------------------------------------\n\nAfter rendering has been completed, the browser executes JavaScript code as a result\nof some timing mechanism (such as a Google Doodle animation) or user\ninteraction (typing a query into the search box and receiving suggestions).\nPlugins such as Flash or Java may execute as well, although not at this time on\nthe Google homepage. Scripts can cause additional network requests to be\nperformed, as well as modify the page or its layout, causing another round of\npage rendering and painting.\n\n.. _`Creative Commons Zero`: https://creativecommons.org/publicdomain/zero/1.0/\n.. _`\"CSS lexical and syntax grammar\"`: http://www.w3.org/TR/CSS2/grammar.html\n.. _`Punycode`: https://en.wikipedia.org/wiki/Punycode\n.. _`Ethernet`: http://en.wikipedia.org/wiki/IEEE_802.3\n.. 
_`WiFi`: https://en.wikipedia.org/wiki/IEEE_802.11\n.. _`Cellular data network`: https://en.wikipedia.org/wiki/Cellular_data_communication_protocol\n.. _`analog-to-digital converter`: https://en.wikipedia.org/wiki/Analog-to-digital_converter\n.. _`network node`: https://en.wikipedia.org/wiki/Computer_network#Network_nodes\n.. _`TCP congestion control`: https://en.wikipedia.org/wiki/TCP_congestion_control\n.. _`cubic`: https://en.wikipedia.org/wiki/CUBIC_TCP\n.. _`New Reno`: https://en.wikipedia.org/wiki/TCP_congestion_control#TCP_New_Reno\n.. _`congestion window`: https://en.wikipedia.org/wiki/TCP_congestion_control#Congestion_window\n.. _`maximum segment size`: https://en.wikipedia.org/wiki/Maximum_segment_size\n.. _`varies by OS` : https://en.wikipedia.org/wiki/Hosts_%28file%29#Location_in_the_file_system\n.. _`简体中文`: https://github.com/skyline75489/what-happens-when-zh_CN\n.. _`한국어`: https://github.com/SantonyChoi/what-happens-when-KR\n.. _`日本語`: https://github.com/tettttsuo/what-happens-when-JA\n.. _`downgrade attack`: http://en.wikipedia.org/wiki/SSL_stripping\n.. _`OSI Model`: https://en.wikipedia.org/wiki/OSI_model\n.. _`Spanish`: https://github.com/gonzaleztroyano/what-happens-when-ES\n"
  }
]