Repository: tuhdo/os01
Branch: master
Commit: a7587edc10c5
Files: 48
Total size: 2.0 MB
Directory structure:
gitextract_3fx0xerz/
├── .gitignore
├── CHANGELOG.md
├── README.md
├── _config.yml
├── book_src/
│   ├── Operating Systems From 0 to 1.lyx
│   ├── Operating Systems From 0 to 1.txt
│   ├── images/
│   │   ├── .rid
│   │   ├── 02/
│   │   │   └── layer_translation.graphml
│   │   ├── 03/
│   │   │   └── .rid
│   │   ├── 04/
│   │   │   ├── .rid
│   │   │   └── modrm32.tex
│   │   ├── 05/
│   │   │   └── .rid
│   │   ├── 06/
│   │   │   └── .rid
│   │   ├── 07/
│   │   │   └── .rid
│   │   └── 08/
│   │       └── .rid
│   └── references.bib
└── code/
    ├── chapter7/
    │   └── os/
    │       ├── .gdbinit
    │       ├── Makefile
    │       ├── bootloader/
    │       │   ├── Makefile
    │       │   ├── bootloader.asm
    │       │   └── debug.elf
    │       ├── build/
    │       │   ├── bootloader/
    │       │   │   └── bootloader.o
    │       │   └── os/
    │       │       └── sample.o
    │       ├── disk.img
    │       └── os/
    │           ├── Makefile
    │           └── sample.asm
    └── chapter8/
        └── os/
            ├── .gdbinit
            ├── Makefile
            ├── bootloader/
            │   ├── Makefile
            │   ├── bootloader.asm
            │   ├── bootloader.dbg
            │   └── bootloader.lds
            ├── build/
            │   ├── bootloader/
            │   │   ├── bootloader.elf
            │   │   ├── bootloader.o
            │   │   └── bootloader.o.elf
            │   ├── disk.img
            │   ├── main.o
            │   └── os/
            │       ├── main.o
            │       ├── os
            │       ├── os.debug
            │       └── sample.o
            ├── disk.img
            └── os/
                ├── Makefile
                ├── main
                ├── main.c
                ├── os
                ├── os.lds
                └── sample.asm
================================================
FILE CONTENTS
================================================
================================================
FILE: .gitignore
================================================
*.png
*.lyx~
================================================
FILE: CHANGELOG.md
================================================
## 0.0.1 (2017-02-15)
* [#6] Fix the figure `far_jmp_ex.svg` in chapter 4, where the segment and the offset in memory are reversed.
* [#7] Fix example 5.3.1: change NULL section to .interp section.
* [#8] Fix command output to reflect the source code.
* [#9] abort(), not .abort(). A function call, not a section.
* [#10] Use `__FUNCTION__` for consistency.
* [#11] Fix incorrect filename.
* [#12] Fix confusing sentence.
* [#13] Fix a typo.
================================================
FILE: README.md
================================================
[Donate](https://www.paypal.com/cgi-bin/webscr?cmd=_donations&business=tuhdo1710%40gmail%2ecom&lc=VN&item_number=tuhdo&currency_code=USD&bn=PP%2dDonationsBF%3aDonate%2dPayPal%2dgreen%2esvg%3aNonHosted)
[Operating Systems: From 0 to 1](https://tuhdo.github.io/os01/)
=============================
This book helps you gain the foundational knowledge required to write an
operating system from scratch. Hence the title, 0 to 1.
After completing this book, at the very least you will learn:
- How to write an operating system from scratch by reading hardware datasheets.
In the real world, it works like that. You won't be able to consult Google for
a quick answer.
- A big picture of how the layers of a computer relate to one another, from hardware to software.
- How to write code independently. It's pointless to copy and paste code. Real learning
happens when you solve problems on your own. Some examples are given to kick-start
your work, but most problems are yours to conquer. However, the solutions are
available online for you to examine after giving it a good try.
- Linux as a development environment and how to use common tools for low-level
programming.
- x86 assembly in-depth.
- How a program is structured so that an operating system can run it.
- How to debug a program running directly on hardware with gdb and QEMU.
- Linking and loading on bare metal x86_64, with pure C. No standard library. No
runtime overhead.
[Download the book](https://github.com/tuhdo/os01/blob/master/Operating_Systems_From_0_to_1.pdf)
# The pedagogy of the book
> You give a poor man a fish and you feed him for a day. You teach him to fish
> and you give him an occupation that will feed him for a lifetime.
This was the guiding principle while I was writing the book. The book does
not try to teach you everything, but enough to enable you to learn by yourself.
The book itself, at this point, is quite "complete": once you master part 1 and
part 2 (which consist of 8 chapters), you can drop the book and continue on
your own. For example, you can continue your journey
on [OSDev wiki](http://wiki.osdev.org/Main_Page); in fact, after you study
everything in part 1 and part 2, you only meet
the [minimum requirement](http://wiki.osdev.org/Required_Knowledge) by OSDev
Wiki (well, not quite, the book actually goes deeper for the suggested topics).
Or, if you consider developing an OS for fun impractical, you can continue
with a Linux-specific book, such as this free
book [Linux Insides](https://0xax.gitbooks.io/linux-insides/content/), or other
popular Linux kernel books. The book tries hard to provide you with a strong
foundation, and that's why part 1 and part 2 were released first.
The book teaches you core concepts, such as x86 Assembly, ELF, linking and
debugging on bare metal, etc., but more importantly, where such information
comes from. For example, instead of just teaching x86 Assembly, it also teaches
how to use reference manuals from Intel. Learning to read the official
manuals is important because only the hardware manufacturers themselves
understand how their hardware works. If you only learn from secondary
resources because it is easier, you will never gain a complete understanding of
the hardware you are programming for. Have you ever read a book on Assembly and
wondered where all the information came from? How does the author know
everything he says is correct? And how does one seem to magically know so much
about hardware programming? This book gives pointers to such questions.
As an example, you should skim through chapter 4, "x86 Assembly and C", to see
how it makes use of the Intel manual, Volume 2. In
the process, it shows you how to use the official manuals.
Part 3 is planned as a series of specifications that a reader will implement to
complete each operating system component. It does not contain code aside from a
few examples. Part 3 is just there to shorten the reader's time when reading the
official manuals by giving hints where to read, explaining difficult concepts
and how to use the manuals to debug. In short, the implementation is up to the
reader to work on his or her own; the chapters are just like university assignments.
# Prerequisites
Know some circuit concepts:
+ Basic Concepts of Electricity: atoms, electrons, protons, neutrons, current flow.
+ Ohm's law
However, if you know absolutely nothing about electricity, you can quickly learn it here:
<http://www.allaboutcircuits.com/textbook/>, by reading chapter 1 and chapter 2.
C programming. In particular:
+ Variable and function declarations/definitions
+ While and for loops
+ Pointers and function pointers
+ Fundamental algorithms and data structures in C
Linux basics:
+ Know how to navigate directories from the command line
+ Know how to invoke a command with options
+ Know how to pipe output to another program
Touch typing. Since we are going to use Linux, touch typing helps. I know typing
speed does not relate to problem-solving, but at least your typing speed should
be fast enough not to let it get in the way and degrade the learning experience.
In general, I assume that the reader has basic C programming knowledge, and can
use an IDE to build and run a program.
# Status:
* Part 1
    - Chapter 1: Complete
    - Chapter 2: Complete
    - Chapter 3: Almost complete. Currently, the book relies on the Intel manual to fully explain the x86 execution environment.
    - Chapter 4: Complete
    - Chapter 5: Complete
    - Chapter 6: Complete
* Part 2
    - Chapter 7: Complete
    - Chapter 8: Complete
* Part 3
    - Chapter 9: Incomplete
    - Chapter 10: Incomplete
    - Chapter 11: Incomplete
    - Chapter 12: Incomplete
    - Chapter 13: Incomplete
... and future chapters not included yet ...
In the future, I hope to expand part 3 to cover more than the first 2 parts. But
for the time being, I will try to finish the above chapters first.
# Sample OS
[This repository](https://github.com/tuhdo/sample-os) is the sample OS for the
book, intended as reference material for part 3. It covers 10 chapters
of the "System Programming Guide" (Intel Manual Volume 3), along with a simple
keyboard and video driver for input and output. However, at the moment, only the
following features are implemented:
- Protected mode.
- Creating and managing processes with TSS (Task State Segment).
- Interrupts
- LAPIC.
Paging and I/O are not yet implemented. I will try to implement them as the book progresses.
# Contributing
If you find any grammatical issues, please report them using GitHub Issues. Or, if
some sentence or paragraph is difficult to understand, feel free to open an
issue with the following title format: `[page number][type] Descriptive Title`.
For example: `[pg.9][grammar] Incorrect verb usage`.
`type` can be one of the following:
- `Typo`: indicates typing mistake.
- `Grammar`: indicates incorrect grammar usage.
- `Style`: indicates a style improvement.
- `Content`: indicates problems with the content.
Even better, you can make a pull request with the provided book source. The main
content of the book is in the file "Operating Systems From 0 to 1.lyx". You can
edit the .txt file, then I will integrate the changes manually. It is a
workaround for now, since LyX can produce a huge diff that makes it impossible to
review changes.
The book is in development, so please bear with me if the English irritates you.
I really appreciate it.
Finally, if you like the project and if it is possible, please donate to help
this project and keep it going.
# Got questions?
If you have any questions related to the material or the development of the book,
feel free to [open a GitHub issue](https://github.com/tuhdo/os01/issues/new).
================================================
FILE: _config.yml
================================================
theme: jekyll-theme-architect
================================================
FILE: book_src/Operating Systems From 0 to 1.lyx
================================================
#LyX 2.3 created this file. For more info see http://www.lyx.org/
\lyxformat 544
\begin_document
\begin_header
\save_transient_properties true
\origin unavailable
\textclass tufte-book
\begin_preamble
% DO NOT ALTER THIS PREAMBLE!!!
%
% This preamble is designed to ensure that the manual prints
% out as advertised. If you mess with this preamble,
% parts of the manual may not print out as expected. If you
% have problems LaTeXing this file, please contact
% the documentation team
% email: lyx-docs@lists.lyx.org
\usepackage[makeindex]{imakeidx}
\makeindex[intoc]
\usepackage{hyperref}
% if pdflatex is used
\usepackage{ifpdf}
\ifpdf
% set fonts for nicer pdf view
\IfFileExists{lmodern.sty}
{\usepackage{lmodern}}{}
\fi % end if pdflatex is used
% the pages of the TOC are numbered roman
% and a PDF-bookmark for the TOC is added
\pagenumbering{roman}
\let\myTOC\tableofcontents
\renewcommand{\tableofcontents}{%
\pdfbookmark[1]{\contentsname}{}
\myTOC
\cleardoublepage
\pagenumbering{arabic}}
\usepackage{xcolor}
% extra space for tables
\newcommand{\extratablespace}[1]{\noalign{\vskip#1}}
\usepackage{titlesec}
\usepackage{graphicx}
\definecolor{mygray}{gray}{0.3.}
\definecolor{lightgray}{gray}{0.6}
\definecolor{green}{rgb}{0.31, 0.78, 0.47}
\definecolor{yellow}{rgb}{1.0, 0.87, 0.0}
\definecolor{cyan}{rgb}{0.0, 0.72, 0.92}
\hyphenation{MovCursor}
\usepackage{multirow}
% add numbers to chapters, sections, subsections
\setcounter{secnumdepth}{4}
% section format
\titleformat{\section}%
{\normalfont\LARGE\bfseries\color{mygray}}% format applied to label+text
{\llap{\colorbox{mygray}{\parbox{3.5cm}{\hfill\color{white}\thesection}}}}% label
{1em}% horizontal separation between label and title body
{}% before the title body
[]% after the title body
\titleformat{\subsection}%
{\color{gray}\normalfont\large\itshape}
{}
{0em}
{\large\thesubsection\hspace{0.6em}}
[{\titlerule[0.8pt]}]
\usepackage[activate={true, nocompatibility}, final, tracking=true, kerning=true, spacing=true, factor=1100, stretch=20, shrink=20]{microtype}
\hyphenpenalty=10
\exhyphenpenalty=10
\doublehyphendemerits=10
\finalhyphendemerits=5000
\uchyph=0
\titleformat{\chapter}[display]
{\normalfont\bfseries\color{mygray}}
{\filleft\hspace*{-60pt}%
\rotatebox[origin=c]{90}{%
\normalfont\color{black}\Large%
\textls[180]{\textsc{ }}%
}\hspace{10pt}%
{\setlength\fboxsep{0pt}%
\colorbox{mygray}{\parbox[c][3cm][c]{3.5cm}{%
\centering\color{white}\fontsize{80}{90}\fontfamily{lmtt}\selectfont\thechapter}%
}}%
}
{10pt}
{\raggedleft\Huge\itshape\bfseries\fontfamily{pzc}\selectfont}
\usepackage{listings}
\lstset{
basicstyle=\ttfamily,
columns=fullflexible,
breaklines=true,
escapeinside={@|}{|@}
}
\usepackage{bookmark}
% table of contents styling
\usepackage{titletoc}
\usepackage{etoolbox}
\newcommand\frontformat{%
\titlecontents{chapter}[0em]
{\itshape}{\contentslabel{0em}}
{}{\normalfont\titlerule*[1pc]{.}\contentspage}}
\newcommand\mainformat{%
\titlecontents{chapter}[1.4em]
{\addvspace{10pt}\bfseries}{\contentslabel{1.5em}}
{}{\normalfont\titlerule*[1pc]{.}\bfseries\contentspage}
}
\newcommand\backformat{%
\titlecontents{chapter}[1.5em]
{\addvspace{10pt}\itshape}{\contentslabel{1.5em}}
{\hspace*{-1.5em}}{\normalfont\titlerule*[1pc]{.}\contentspage}}
\titlecontents{section}[3.8em]
{\itshape}{\contentslabel{2.3em}}
{\hspace*{-2.3em}}{\titlerule*[1pc]{.}\contentspage}
\apptocmd{\frontmatter}{\frontformat}{}{}
\apptocmd{\mainmatter}{\mainformat}{}{}
\apptocmd{\appendix}{\backformat}{}{}
% caption customization
\usepackage[font=small,labelfont=bf]{caption}
\usepackage[many]{tcolorbox}
\definecolor{greentitle}{RGB}{61,170,61}
\definecolor{greentitleback}{RGB}{216,233,213}
\newtcolorbox{shelloutput16.6}[1][]{%
breakable,
enhanced,
title=Output,
arc=0mm,
auto outer arc,
colback=white,
boxrule=1pt,
leftrule=5pt,
before skip = 0mm,
fonttitle=\bfseries\texttt\smaller,
enlarge top initially by=5mm,
width=16.6cm,
attach boxed title to top left={xshift=-15.8mm,yshift=-5.72mm},
boxed title style={skin=enhancedfirst jigsaw,size=small,arc=0mm,bottom=0mm,
interior style={fill=none,
top color=mygray,
bottom color=mygray}},
#1
}
\definecolor{whitesmoke}{rgb}{0.96, 0.96, 0.96}
\newtcolorbox{shelloutput}[1][]{%
breakable,
enhanced,
colback=white,
title=Output,
arc=0mm,
auto outer arc,
boxrule=1pt,
leftrule=5pt,
fonttitle=\bfseries\texttt\smaller,
enlarge top initially by=5mm,
attach boxed title to top left={xshift=-15.8mm,yshift=-5.72mm},
boxed title style={skin=enhancedfirst jigsaw,size=small,arc=0mm,bottom=0mm,
interior style={fill=none,
top color=mygray,
bottom color=mygray}},
#1
}
\newtcolorbox{shellcommand}[1][]{%
enlarge top initially by=5mm,
}
\RequirePackage{ragged2e}
\setlength{\RaggedRightRightskip}{\z@ plus 0.01\hsize}
\end_preamble
\options bibliography=totoc,index=totoc,BCOR7.5mm,titlepage,captions=tableheading
\use_default_options true
\begin_modules
eqs-within-sections
figs-within-sections
logicalmkup
multicol
shapepar
algorithm2e
tcolorbox
theorems-ams-bytype
enumitem
tabs-within-sections
theorems-ams-extended-bytype
theorems-sec-bytype
fix-cm
fixltx2e
\end_modules
\maintain_unincluded_children false
\begin_local_layout
Format 7
InsetLayout CharStyle:MenuItem
LyxType charstyle
LabelString menu
LatexType command
LatexName menuitem
Font
Family Sans
EndFont
Preamble
\newcommand*{\menuitem}[1]{{\sffamily #1}}
EndPreamble
End
\end_local_layout
\language english
\language_package none
\inputencoding auto
\fontencoding global
\font_roman "lmodern" "default"
\font_sans "lmss" "default"
\font_typewriter "lmtt" "default"
\font_math "auto" "auto"
\font_default_family default
\use_non_tex_fonts false
\font_sc false
\font_osf false
\font_sf_scale 110 100
\font_tt_scale 100 100
\use_microtype false
\use_dash_ligatures false
\graphics default
\default_output_format default
\output_sync 1
\output_sync_macro "\synctex=1"
\bibtex_command default
\index_command default
\paperfontsize 12
\spacing onehalf
\use_hyperref true
\pdf_title "Operating Systems: From 0 to 1"
\pdf_author "Tu, Do Hoang"
\pdf_subject "LyX's additional features documentation"
\pdf_keywords "LyX, Documentation, Additional"
\pdf_bookmarks true
\pdf_bookmarksnumbered true
\pdf_bookmarksopen false
\pdf_bookmarksopenlevel 1
\pdf_breaklinks false
\pdf_pdfborder false
\pdf_colorlinks true
\pdf_backref false
\pdf_pdfusetitle false
\pdf_quoted_options "linkcolor=black, citecolor=black, urlcolor=blue, filecolor=blue, pdfpagelayout=OneColumn, pdfnewwindow=true, pdfstartview=XYZ, plainpages=false"
\papersize a4paper
\use_geometry true
\use_package amsmath 1
\use_package amssymb 1
\use_package cancel 0
\use_package esint 0
\use_package mathdots 1
\use_package mathtools 0
\use_package mhchem 1
\use_package stackrel 0
\use_package stmaryrd 0
\use_package undertilde 0
\cite_engine natbib
\cite_engine_type authoryear
\biblio_style plain
\use_bibtopic true
\use_indices false
\paperorientation portrait
\suppress_date false
\justification true
\use_refstyle 0
\use_minted 0
\notefontcolor #aa007f
\index Index
\shortcut idx
\color #008000
\end_index
\leftmargin 2cm
\rightmargin 2cm
\secnumdepth 2
\tocdepth 1
\paragraph_separation skip
\defskip smallskip
\is_math_indent 1
\math_indentation default
\math_numbering_side default
\quotes_style english
\dynamic_quotes 0
\papercolumns 1
\papersides 2
\paperpagestyle fancy
\listings_params "language=C,commentstyle={\color{lightgray}\itshape},emphstyle={\itshape},breaklines=true,basicstyle={\ttfamily},stringstyle={\color{gray}},frame=shadowbox,rulesepcolor={\color{black}}"
\bullet 0 0 17 -1
\tracking_changes false
\output_changes false
\html_math_output 0
\html_css_as_file 0
\html_be_strict true
\end_header
\begin_body
\begin_layout Title
\noindent
Operating Systems:
\begin_inset Newline newline
\end_inset
From 0 to 1
\end_layout
\begin_layout Author
Tu, Do Hoang
\end_layout
\begin_layout Standard
\begin_inset Newpage cleardoublepage
\end_inset
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
setcounter{page}{1}% Start page number with 1
\end_layout
\begin_layout Plain Layout
\backslash
renewcommand{
\backslash
thepage}{
\backslash
Roman{page}}% Roman numerals for page counter
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\begin_inset CommandInset toc
LatexCommand tableofcontents
\end_inset
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
setcounter{page}{1}% Start page number with 1
\end_layout
\begin_layout Plain Layout
\backslash
renewcommand{
\backslash
thepage}{
\backslash
roman{page}}% Roman numerals for page counter
\end_layout
\end_inset
\end_layout
\begin_layout Chapter*
\emph on
Preface
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
addcontentsline{toc}{chapter}{Preface}
\end_layout
\end_inset
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_layout Standard
\noindent
\align left
Greetings!
\end_layout
\begin_layout Standard
\noindent
You've probably asked yourself at least once how an operating system is
written from the ground up.
You might even have years of programming experience under your belt, yet
your understanding of operating systems may still be a collection of abstract
concepts not grounded in actual implementation.
To those who've never built one, an operating system may seem like magic:
a mysterious thing that can control hardware while handling a programmer's
requests via the API of their favorite programming language.
Learning how to build an operating system seems intimidating and difficult;
no matter how much you learn, it never feels like you know enough.
You're probably reading this book right now to gain a better understanding
of operating systems and become a better software engineer.
\end_layout
\begin_layout Standard
If that is the case, this book is for you.
By going through this book, you will be able to find the missing pieces
that are essential and enable you to implement your own operating system
from scratch! Yes, from scratch, without going through any existing operating
system layer to prove to yourself that you are an operating system developer.
You may ask,
\begin_inset Quotes eld
\end_inset
Isn't it more practical to learn the internals of Linux?
\begin_inset Quotes erd
\end_inset
.
\end_layout
\begin_layout Standard
Yes...
\end_layout
\begin_layout Standard
and no.
\end_layout
\begin_layout Standard
Learning Linux can help your workflow at your day job.
However, if you follow that route, you still won't achieve the ultimate
goal of writing an actual operating system.
By writing your own operating system, you will gain knowledge that you
will not be able to glean just from learning Linux.
\end_layout
\begin_layout Standard
Here's a list of some benefits of writing your own OS:
\end_layout
\begin_layout Itemize
You will learn how a computer works at the hardware level, and you will
learn to write software to manage that hardware directly.
\end_layout
\begin_layout Itemize
You will learn the fundamentals of operating systems, allowing you to adapt
to any operating system, not just Linux.
\end_layout
\begin_layout Itemize
To hack on Linux internals effectively, you'll need to write at least one operating
system on your own.
This is just like applications programming: to write a large application,
you'll need to start with simple ones.
\end_layout
\begin_layout Itemize
You will open pathways to various low-level programming domains such as
reverse engineering, exploits, building virtual machines, game console
emulation and more.
Assembly language will become one of your most indispensable tools for
low-level analysis.
(But that does not mean you have to write your operating system in Assembly!)
\end_layout
\begin_layout Itemize
Writing an operating system is fun!
\end_layout
\begin_layout Section*
\emph on
Why another book on Operating Systems?
\end_layout
\begin_layout Standard
There are many books and courses on this topic made by famous professors
and experts out there already.
Who am I to write a book on such an advanced topic? While it's true that
many quality resources exist, I find them lacking.
Do any of them show you how to compile your C code and the C runtime library
independent of an existing operating system? Most books on operating system
design and implementation only discuss the software side; how the operating
system communicates with the hardware is skipped.
These important hardware details are left out, and it's difficult for a self-learner
to find relevant resources on the Internet.
The aim of this book is to bridge that gap: not only will you learn how
to program hardware directly, but also how to read official documents from
hardware vendors to program it.
You no longer have to seek out resources to help yourself interpret hardware
manuals and documentation: you can do it yourself.
Lastly, I wrote this book from an autodidact's perspective.
I made this book as self-contained as possible so you can spend more time
learning and less time guessing or seeking out information on the Internet.
\end_layout
\begin_layout Standard
One of the core focuses of this book is to guide you through the process
of reading official documentation from vendors to implement your software.
Official documents from hardware vendors like Intel are critical for implementi
ng an operating system or any other software that directly controls the
hardware.
At a minimum, an operating system developer needs to be able to comprehend
these documents and implement software based on a set of hardware requirements.
Thus, the first chapter is dedicated to discussing relevant documents and
their importance.
\end_layout
\begin_layout Standard
Another distinct feature of this book is that it is
\begin_inset Quotes eld
\end_inset
Hello World
\begin_inset Quotes erd
\end_inset
centric.
Most examples revolve around variants of a
\begin_inset Quotes eld
\end_inset
Hello World
\begin_inset Quotes erd
\end_inset
program, which will acquaint you with core concepts.
These concepts must be learned before attempting to write an operating
system.
Anything beyond a simple
\begin_inset Quotes eld
\end_inset
Hello World
\begin_inset Quotes erd
\end_inset
example gets in the way of teaching the concepts, thus lengthening the
time spent on getting started writing an operating system.
\end_layout
\begin_layout Standard
Let's dive in.
With this book, I hope to provide enough foundational knowledge that will
open doors for you to make sense of other resources.
This book will greatly benefit students who've just finished their first
C/C++ course.
Imagine how cool it would be to show prospective employers that you've
already built an operating system.
\end_layout
\begin_layout Section*
\emph on
Prerequisites
\end_layout
\begin_layout Itemize
Basic knowledge of circuits
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_deeper
\begin_layout Itemize
Basic Concepts of Electricity: atoms, electrons, protons, neutrons, current
flow.
\end_layout
\begin_layout Itemize
Ohm's law
\end_layout
\begin_layout Standard
If you are unfamiliar with these concepts, you can quickly learn them here:
\begin_inset Flex URL
status collapsed
\begin_layout Plain Layout
http://www.allaboutcircuits.com/textbook/
\end_layout
\end_inset
, by reading chapter 1 and chapter 2.
\end_layout
\end_deeper
\begin_layout Itemize
C programming.
In particular:
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_deeper
\begin_layout Itemize
Variable and function declarations/definitions
\end_layout
\begin_layout Itemize
While and for loops
\end_layout
\begin_layout Itemize
Pointers and function pointers
\end_layout
\begin_layout Itemize
Fundamental algorithms and data structures in C
\end_layout
\end_deeper
\begin_layout Itemize
Linux basics:
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_deeper
\begin_layout Itemize
Know how to navigate directories from the command line
\end_layout
\begin_layout Itemize
Know how to invoke a command with options
\end_layout
\begin_layout Itemize
Know how to pipe output to another program
\end_layout
\end_deeper
\begin_layout Itemize
Touch typing.
Since we are going to use Linux, touch typing helps.
I know typing speed does not relate to problem-solving, but at least your
typing speed should be fast enough not to let it get in the way and degrade
the learning experience.
\end_layout
\begin_layout Standard
In general, I assume that the reader has basic C programming knowledge,
and can use an IDE to build and run a program.
\end_layout
\begin_layout Section*
\emph on
What you will learn in this book
\end_layout
\begin_layout Itemize
How to write an operating system from scratch by reading hardware datasheets.
In the real world, you will not be able to consult Google for a quick answer.
\end_layout
\begin_layout Itemize
How to write code independently.
It's pointless to copy and paste code.
Real learning happens when you solve problems on your own.
Some examples are provided to help kick start your work, but most problems
are yours to conquer.
However, the solutions are available online for you to examine after giving
it a good try.
\end_layout
\begin_layout Itemize
A big picture of how the layers of a computer relate to one another, from
hardware to software.
\end_layout
\begin_layout Itemize
How to use Linux as a development environment, along with common tools for
low-level programming.
\end_layout
\begin_layout Itemize
How a program is structured so that an operating system can run it.
\end_layout
\begin_layout Itemize
How to debug a program running directly on hardware with
\family typewriter
gdb
\family default
and QEMU.
\end_layout
\begin_layout Itemize
Linking and loading on bare metal x86_64, with pure C.
No standard library.
No runtime overhead.
\end_layout
\begin_layout Section*
\emph on
What this book is not about
\end_layout
\begin_layout Itemize
\begin_inset Flex NewThought
status open
\begin_layout Plain Layout
Electrical Engineering
\end_layout
\end_inset
: The book discusses some concepts from electronics and electrical engineering
only to the extent of how software operates on bare metal.
\end_layout
\begin_layout Itemize
\begin_inset Flex Noun
status open
\begin_layout Plain Layout
Books on how to use Linux or any other OS
\end_layout
\end_inset
: Though Linux is used as a development environment and as a medium to demonstra
te high-level operating system concepts, it is not the focus of this book.
\end_layout
\begin_layout Itemize
\begin_inset Flex Noun
status open
\begin_layout Plain Layout
Linux Kernel development
\end_layout
\end_inset
: There are already many high-quality books out there on this subject.
\end_layout
\begin_layout Itemize
\begin_inset Flex Noun
status open
\begin_layout Plain Layout
Operating system books focused on algorithms
\end_layout
\end_inset
: This book focuses more on an actual hardware platform, Intel x86_64, and
how to write an OS that utilizes the OS support that the hardware platform
provides.
\end_layout
\begin_layout Section*
The organization of the book
\end_layout
\begin_layout Description
Part
\begin_inset space ~
\end_inset
1 provides a foundation for learning operating systems.
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_deeper
\begin_layout Itemize
Chapter 1 briefly explains the importance of domain documents.
Documents are crucial for the learning experience, so they deserve a chapter.
\end_layout
\begin_layout Itemize
Chapter 2 explains the layers of abstractions from hardware to software.
The idea is to provide insight into how code runs physically.
\end_layout
\begin_layout Itemize
Chapter 3 provides the general architecture of a computer, then introduces
a sample computer model that you will use to write an operating system.
\end_layout
\begin_layout Itemize
Chapter 4 introduces the x86 assembly language through the use of the Intel
manuals, along with commonly used instructions.
This chapter gives detailed examples of how high-level syntax corresponds
to low-level assembly, enabling you to read generated assembly code comfortably.
It is necessary to read assembly code when debugging an operating system.
\end_layout
\begin_layout Itemize
Chapter 5 dissects ELF in detail.
Only by understanding the structure of a program at the binary level
can you build one that runs on bare metal.
\end_layout
\begin_layout Itemize
Chapter 6 introduces the
\family typewriter
gdb
\family default
debugger with extensive examples for commonly used commands.
After acquainting the reader with
\family typewriter
gdb
\family default
, it then provides insight into how a debugger works.
This knowledge is essential for building a debuggable program on the bare
metal.
\end_layout
\end_deeper
\begin_layout Description
Part
\begin_inset space ~
\end_inset
2 presents how to write a bootloader to bootstrap a kernel.
Hence the name
\emph on
\begin_inset Quotes eld
\end_inset
Groundwork
\begin_inset Quotes erd
\end_inset
\emph default
.
After mastering this part, the reader can continue with the next part,
which is a guide for writing an operating system.
However, if the reader does not like the presentation, he or she can look
elsewhere, such as OSDev Wiki:
\begin_inset Flex URL
status open
\begin_layout Plain Layout
http://wiki.osdev.org/
\end_layout
\end_inset
.
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_deeper
\begin_layout Itemize
Chapter 7 introduces what the bootloader is, how to write one in assembly,
and how to load it on QEMU, a hardware emulator.
This process involves typing repetitive and long commands, so GNU Make
is applied to improve productivity by automating the repetitive parts and
simplifying the interaction with the project.
This chapter also demonstrates the use of GNU Make in context.
\end_layout
\begin_layout Itemize
Chapter 8 introduces linking by explaining the relocation process when combining
object files.
Together with a bootloader written in assembly and an operating system written
in C, this is the last piece of the puzzle required for building debuggable
programs on bare metal.
\end_layout
\end_deeper
\begin_layout Description
Part
\begin_inset space ~
\end_inset
3 provides guidance on how to write an operating system, as you should implement
an operating system on your own and be proud of your creation.
The guidance consists of simple and coherent explanations of necessary
concepts, from hardware to software, to implement the features of an operating
system.
Without such guidance, you will waste time gathering information spread
through various documents and the Internet.
It then provides a plan on how to map the concepts to code.
\end_layout
\begin_layout Section*
\emph on
Acknowledgments
\end_layout
\begin_layout Standard
Thank you, my beloved family.
Thank you, the contributors.
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
mainmatter
\end_layout
\begin_layout Plain Layout
\backslash
renewcommand{
\backslash
thepage}{
\backslash
arabic{page}}% Arabic numerals for page counter
\end_layout
\begin_layout Plain Layout
\backslash
setcounter{page}{1}% Start page number with 1
\end_layout
\end_inset
\end_layout
\begin_layout Part
Preliminary
\end_layout
\begin_layout Chapter
Domain documents
\end_layout
\begin_layout Section
Problem domains
\end_layout
\begin_layout Standard
In the real world, software engineering focuses not only on software
but also on the problem domain it is trying to address.
\end_layout
\begin_layout Quote
A
\emph on
\begin_inset Index idx
status open
\begin_layout Plain Layout
problem domain
\end_layout
\end_inset
\begin_inset Marginal
status open
\begin_layout Plain Layout
\series bold
\emph on
problem domain
\end_layout
\end_inset
problem domain
\emph default
is
\emph on
the part of the world
\emph default
where the computer is to produce effects, together with the means available
to produce them, directly or indirectly.
\begin_inset CommandInset citation
LatexCommand citep
key "Kovitz_psr"
literal "true"
\end_inset
\end_layout
\begin_layout Standard
A
\emph on
problem domain
\begin_inset Index idx
status open
\begin_layout Plain Layout
problem domain
\end_layout
\end_inset
\emph default
is anything outside of programming that a software engineer needs to understand
to produce correct code that can achieve the desired effects.
\begin_inset Quotes eld
\end_inset
Directly
\begin_inset Quotes erd
\end_inset
means anything that the software can control to produce the desired
effects, e.g.
keyboards, printers, monitors, other software, etc.
\begin_inset Quotes eld
\end_inset
Indirectly
\begin_inset Quotes erd
\end_inset
means anything not part of the software but relevant to the problem domain
e.g.
appropriate people to be informed by the software when some event happens,
students that move to correct classrooms according to the schedule generated
by the software.
To write a finance application, a software engineer needs to learn sufficient
finance concepts to understand the
\begin_inset Marginal
status open
\begin_layout Plain Layout
\series bold
requirements
\end_layout
\end_inset
\emph on
\begin_inset Index idx
status open
\begin_layout Plain Layout
requirements
\end_layout
\end_inset
requirements
\emph default
of a customer and implement them correctly.
\end_layout
\begin_layout Quote
Requirements are the effects that the machine is to exert in the problem
domain by virtue of its programming.
\end_layout
\begin_layout Standard
Programming alone is not too complicated; programming to solve a problem
in a particular domain is
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
We refer to the concept of
\begin_inset Quotes eld
\end_inset
programming
\begin_inset Quotes erd
\end_inset
here as the ability to write code in a language, without necessarily having
any or all software engineering knowledge.
\end_layout
\end_inset
.
A software engineer needs to understand not only how to implement the software
but also the problem domain that it tries to solve, which might require
in-depth expert knowledge.
The software engineer must also select the right programming techniques
for the problem domain he is trying to solve, because many techniques
that are effective in one domain might not be in another.
For example, many types of applications do not require highly performant
code, but a short time to market.
In this case, interpreted languages are widely popular because they can satisfy
such a need.
However, for writing huge 3D games or operating systems, compiled languages
are dominant because they can generate the most efficient code required for
such applications.
\end_layout
\begin_layout Standard
Often, it is too much for a software engineer to learn non-trivial domains
(which might require a bachelor's degree or above to understand).
Also, it is easier for a
\emph on
\begin_inset Index idx
status open
\begin_layout Plain Layout
domain expert
\end_layout
\end_inset
domain expert
\emph default
to learn enough programming to break down the problem domain into parts
small enough for the software engineers to implement.
Sometimes, domain experts implement the software themselves.
\end_layout
\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
Problem domains: Software and Non-software.
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset space \hfill{}
\end_inset
\begin_inset Graphics
filename images/01/domains_general.pdf
\end_inset
\begin_inset space \hfill{}
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
One example of such a scenario is the domain presented in this book:
\emph on
operating system
\emph default
.
A certain amount of electrical engineering (EE) knowledge is required to
implement an operating system.
If a computer science (CS) curriculum does not include minimum EE courses,
students in the curriculum have little chance to implement a working operating
system.
Even if they can implement one, they either need to invest a significant
amount of time to study on their own, or they fill in code in a predefined
framework just to understand high-level algorithms.
For that reason, EE students have an easier time implementing an OS, as
they only need to study a few core CS courses.
In fact, only
\emph on
\begin_inset Quotes eld
\end_inset
C programming
\begin_inset Quotes erd
\end_inset
\emph default
and
\emph on
\begin_inset Quotes eld
\end_inset
Algorithms and Data Structures
\begin_inset Quotes erd
\end_inset
\emph default
classes are usually enough to get them started writing code for device
drivers, and later generalize it into an
\emph on
operating system.
\end_layout
\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status collapsed
\begin_layout Plain Layout
\end_layout
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
Operating System domain.
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset space \hfill{}
\end_inset
\begin_inset Graphics
filename images/01/domains_os_example.pdf
\end_inset
\begin_inset space \hfill{}
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
One thing to note is that software is its own problem domain.
A problem domain is not necessarily separate from software.
Compilers, 3D graphics, games, cryptography, artificial intelligence, etc.,
are parts of software engineering domains (actually it is more of a computer
science domain than a software engineering domain).
In general, a software-exclusive domain creates software to be used by
other software.
The operating system is also a domain, but it overlaps with other domains
such as electrical engineering.
To implement an operating system effectively, one must learn enough
of the external domain.
How much learning is enough for a software engineer? At the minimum, a
software engineer should be knowledgeable enough to understand the documents
prepared by hardware engineers for using (i.e.
programming) their devices.
\end_layout
\begin_layout Standard
Learning a programming language, even C or Assembly, does not mean a software
engineer can automatically be good at hardware programming or any related
low-level programming domains.
One can spend 10 years, 20 years or an entire life writing C/C++ code
and still be unable to write an operating system, simply because of ignorance
of the relevant domain knowledge.
Similarly, learning English does not automatically make a person
good at reading math books written in English.
Much more than that is needed.
Knowing one or two programming languages is not enough.
If a programmer writes software for a living, he had better be specialized
in one or two problem domains outside of software if he does not want his
job taken by domain experts who learn programming in their spare time.
\end_layout
\begin_layout Section
Documents for implementing a problem domain
\end_layout
\begin_layout Standard
Documents are essential for learning a problem domain (and actually, anything)
since information can be passed down in a reliable way.
It is evident that written text has been used for thousands of years
to pass knowledge from generation to generation.
Documents are integral parts of non-trivial projects.
Without the documents:
\end_layout
\begin_layout Itemize
New people will find it much harder to join a project.
\end_layout
\begin_layout Itemize
It is harder to maintain a project because people may forget important unresolve
d bugs or quirks in their system.
\end_layout
\begin_layout Itemize
It is challenging for customers to understand the product they are going
to use.
However, documents do not need to be written in book format.
They can be anything from HTML format to database format to be displayed
by a graphical user interface.
Important information must be stored somewhere safe and readily accessible.
\end_layout
\begin_layout Standard
There are many types of documents.
However, to facilitate the understanding of a problem domain, these two
documents need to be written:
\emph on
software requirement document
\emph default
and
\emph on
software specification
\emph default
.
\end_layout
\begin_layout Subsection
Software Requirement Document
\end_layout
\begin_layout Standard
\emph on
\begin_inset Index idx
status open
\begin_layout Plain Layout
Software requirement document
\end_layout
\end_inset
\begin_inset Marginal
status open
\begin_layout Plain Layout
\series bold
\emph on
Software requirement
\end_layout
\end_inset
Software requirement document
\emph default
includes both a list of requirements and a description of the problem domain
\begin_inset CommandInset citation
LatexCommand citep
key "Kovitz_psr"
literal "true"
\end_inset
.
\end_layout
\begin_layout Standard
Software solves a business problem.
Which problems to solve, however, is decided by the customer.
These requests form a list of requirements that our software needs
to fulfill.
However, an enumerated list of features is seldom useful in delivering
software.
As stated in the previous section, the tricky part is not programming alone
but programming according to a problem domain.
The bulk of software design and implementation depends upon the knowledge
of the problem domain.
The better the domain is understood, the higher the quality of the software can be.
For example, building houses has been practiced for thousands of years and
is well understood, so it is easy to build a high-quality house; software
is no different.
Code that is difficult to understand is usually due to the author's ignorance
of a problem domain.
In the context of this book, we seek to understand the low-level workings
of various hardware devices.
\end_layout
\begin_layout Standard
Because software quality depends upon the understanding of the problem domain,
most of a software requirement document should consist of a description of
the problem domain.
\end_layout
\begin_layout Standard
Be aware that software requirements are not:
\end_layout
\begin_layout Description
What
\begin_inset space ~
\end_inset
vs
\begin_inset space ~
\end_inset
How
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_deeper
\begin_layout Standard
\begin_inset Quotes eld
\end_inset
what
\begin_inset Quotes erd
\end_inset
and
\begin_inset Quotes eld
\end_inset
how
\begin_inset Quotes erd
\end_inset
are vague terms.
What is the
\begin_inset Quotes eld
\end_inset
what
\begin_inset Quotes erd
\end_inset
? Is it nouns only? If so, what if a customer requires the software to perform
specific steps of operations, such as the purchasing procedure for a customer
on a website?
Does it include
\begin_inset Quotes eld
\end_inset
verbs
\begin_inset Quotes erd
\end_inset
now? However, isn't the
\begin_inset Quotes eld
\end_inset
how
\begin_inset Quotes erd
\end_inset
supposed to be step by step operations? Anything can be the
\begin_inset Quotes eld
\end_inset
what
\begin_inset Quotes erd
\end_inset
and anything can be the
\begin_inset Quotes eld
\end_inset
how
\begin_inset Quotes erd
\end_inset
.
\end_layout
\end_deeper
\begin_layout Description
Sketches
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_deeper
\begin_layout Standard
Software requirement document is all about the problem domain.
It should not be a high-level description of an implementation.
Some problems might seem straightforward to map directly from their domain
description to the structure of an implementation.
For example:
\end_layout
\begin_layout Itemize
Users are given a list of books in a
\series bold
\emph on
drop-down menu
\series default
\emph default
to choose.
\end_layout
\begin_layout Itemize
Books are stored in a
\series bold
\emph on
linked list
\series default
\emph default
.
\end_layout
\begin_layout Itemize
etc
\end_layout
\begin_layout Standard
In the future, instead of a drop-down menu, all books might be listed directly
on a page as thumbnails.
Books might be reimplemented as a graph in which each node is a book, to
support finding related books when a recommender is added in the next version.
The requirement document then needs updating again to remove all the outdated
implementation details, which requires additional effort to maintain.
When the effort of syncing with the implementation becomes too much, the
developers give up on documentation, and everyone starts
ranting about how useless documentation is.
\end_layout
\begin_layout Standard
More often than not there is no straightforward one-to-one mapping.
For example, a regular computer user expects an OS to be something that
runs programs with a GUI, or their favorite computer games.
But for such requirements, an operating system is implemented as multiple
layers, each hiding the details from the upper layers.
To implement an operating system, a large body of knowledge from multiple
fields is required, especially if the operating system runs on non-PC devices.
\end_layout
\begin_layout Standard
It's best to include information related to the problem domain in the requiremen
t document.
A good way to test the quality of a requirement document is to provide
it to a domain expert for proofreading, to ensure he can understand the
material thoroughly.
A requirement document is also useful as a help document later, or makes
writing one much easier.
\end_layout
\end_deeper
\begin_layout Subsection
Software Specification
\end_layout
\begin_layout Standard
\emph on
\begin_inset Index idx
status open
\begin_layout Plain Layout
Software specification
\end_layout
\end_inset
\emph default
\begin_inset Marginal
status open
\begin_layout Plain Layout
\series bold
\emph on
Software specification
\end_layout
\end_inset
\emph on
Software specification
\emph default
document states rules relating desired behavior of the output devices to
all possible behavior of the input devices, as well as any rules that other
parts of the problem domain must obey.
\begin_inset CommandInset citation
LatexCommand cite
key "Kovitz_psr"
literal "true"
\end_inset
\end_layout
\begin_layout Standard
Simply put, software specification is interface design, with constraints
for the problem domain to follow, e.g.
the software may accept only certain types of input, such as English
but no other language.
For a hardware device, a specification is always needed, as software depends
on its hardwired behaviors.
In fact, hardware specifications are mostly well-defined, down to the tiniest
details.
They need to be that way because once hardware is physically manufactured,
there is no going back, and any defect is devastating
to the company in both finances and reputation.
\end_layout
\begin_layout Standard
Note that, similar to a requirement document, a specification only concerns
interface design.
If implementation details leak in, it becomes a burden to keep the specification
in sync with the actual implementation, and the specification is soon abandoned.
\end_layout
\begin_layout Standard
Another important remark is that, though a specification document is important,
it does not have to be produced
\emph on
before
\emph default
the implementation.
It can be prepared in any order: before or after a complete implementation;
or at the same time with the implementation, when some part is done, and
the interface is ready to be recorded in the specification.
Regardless of the method, what matters is a complete specification at the end.
\end_layout
\begin_layout Section
Documents for writing an x86 Operating System
\end_layout
\begin_layout Standard
When the problem domain is different from the software domain, the requirement
document and the specification are usually separate.
However, if the problem domain is inside software, the specification most often
includes both, and the content of both can be mixed together.
As the previous sections demonstrated the importance of documents, to implement
an OS we need to collect relevant documents to gain sufficient domain
knowledge.
These documents are as follows:
\end_layout
\begin_layout Itemize
Intel® 64 and IA-32 Architectures Software Developer’s Manual (Volume 1,
2, 3)
\end_layout
\begin_layout Itemize
Intel® 3 Series Express Chipset Family Datasheet
\end_layout
\begin_layout Itemize
System V Application Binary Interface
\end_layout
\begin_layout Standard
Aside from Intel's official website, the website of this book also hosts
the documents for convenience
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
Intel may change the links to the documents as they update their website,
so this book doesn't contain any link to the documents to avoid confusion
for readers.
\end_layout
\end_inset
.
\end_layout
\begin_layout Standard
Intel documents divide the requirement and specification sections clearly,
but give the sections different names.
The counterpart of the requirement document is a section called
\emph on
\begin_inset Quotes eld
\end_inset
Functional Description
\begin_inset Quotes erd
\end_inset
\emph default
, which consists mostly of domain description; for specification,
\emph on
\begin_inset Quotes eld
\end_inset
Register Description
\begin_inset Quotes erd
\end_inset
\emph default
section describes all programming interfaces.
Both documents carry no unnecessary implementation details
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
As it should be: those details are trade secrets.
\end_layout
\end_inset
.
Intel documents are also great examples of how to write good requirements/specifications,
as explained in this chapter.
\end_layout
\begin_layout Standard
Other than the Intel documents, other documents will be introduced in the
relevant chapters.
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
chapter[From hardware to software: Layers of abstraction]{From hardware
to software:
\backslash
\backslash
Layers of abstraction}
\end_layout
\end_inset
\end_layout
\begin_layout Standard
This chapter gives an intuition of how hardware and software are connected,
and how software is represented physically.
\end_layout
\begin_layout Section
The physical implementation of a bit
\end_layout
\begin_layout Standard
All electronic devices, from simple to complex, manipulate the flow of electrons
to achieve desired effects in the real world.
Computers are no exception.
When we write software, we indirectly manipulate electrical current at
the physical level, in such a way that the underlying machine produces
desired effects.
To understand the process, we consider a simple light bulb.
A light bulb can be switched between two states, on and off: off means
the number 0, and on means 1.
\begin_inset Float marginfigure
wide false
sideways false
status collapsed
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
A lightbulb
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/02/bulb.svg
scale 15
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
However, one problem is that such a switch requires manual intervention
from a human.
What is required is an automatic switch controlled by the voltage level.
To enable automatic switching of electrical signals, a device called the
\emph on
transistor
\emph default
was invented by William Shockley, John Bardeen and Walter Brattain.
This invention started the whole computer industry.
\end_layout
\begin_layout Standard
At the core, a
\emph on
\begin_inset Marginal
status open
\begin_layout Plain Layout
\series bold
\emph on
transistor
\end_layout
\end_inset
transistor
\begin_inset Index idx
status open
\begin_layout Plain Layout
transistor
\end_layout
\end_inset
\emph default
is just a resistor whose resistance varies with an input voltage.
\begin_inset Float marginfigure
wide false
sideways false
status collapsed
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
Modern transistor
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/02/transistor.svg
scale 25
\end_inset
\end_layout
\end_inset
With this property, a transistor can be used as a current amplifier (more
voltage, less resistance) or as a switch that turns electrical signals off
and on (blocking and unblocking electron flow) based on a voltage level.
At 0
\begin_inset space \thinspace{}
\end_inset
V, no current can pass through a transistor because its resistance is high
enough to block the electrical flow, so it acts like a circuit with an open
switch (light bulb off).
Conversely, at +3.5
\begin_inset space \thinspace{}
\end_inset
V, the resistance drops and current can flow through the transistor,
so it acts like a circuit with a closed
switch.
\begin_inset Marginal
status collapsed
\begin_layout Plain Layout
If you want a deeper explanation of transistors e.g.
how electrons move, you should look at the video
\begin_inset Quotes eld
\end_inset
How semiconductors work
\begin_inset Quotes erd
\end_inset
on YouTube, by Ben Eater.
\end_layout
\end_inset
\end_layout
\begin_layout Standard
A bit has two states, 0 and 1, and is the building block of all digital
systems and software.
Similar to a light bulb that can be turned on and off, bits are made out
of the electrical current from the power source: bit 0 is represented
by 0
\begin_inset space \thinspace{}
\end_inset
V (no electron flow), and bit 1 by +3.5
\begin_inset space \thinspace{}
\end_inset
V to +5
\begin_inset space \thinspace{}
\end_inset
V (electron flow).
A transistor implements a bit correctly, as it can regulate the electron
flow based on the voltage level.
\end_layout
\begin_layout Subsection
MOSFET transistors
\end_layout
\begin_layout Standard
The invention of the classic transistor opened a whole new world of micro digital
devices.
Prior to the invention, vacuum tubes, which are just fancier light bulbs,
were used to represent 0 and 1, and required a human to turn them on and off.
\begin_inset Marginal
status open
\begin_layout Plain Layout
\series bold
\emph on
MOSFET
\end_layout
\end_inset
\emph on
\begin_inset Index idx
status open
\begin_layout Plain Layout
MOSFET
\end_layout
\end_inset
MOSFET
\emph default
, or
\series bold
\emph on
M
\series default
etal–
\series bold
O
\series default
xide–
\series bold
S
\series default
emiconductor
\series bold
F
\series default
ield-
\series bold
E
\series default
ffect
\series bold
T
\series default
ransistor
\emph default
, invented in 1959 by Dawon Kahng and Martin M.
(John) Atalla at Bell Labs, is an improved version of the classic transistor
that is more suitable for digital devices, as it switches between the two
states 0 and 1 faster, is more stable, consumes less power and is easier
to produce.
\end_layout
\begin_layout Standard
There are also two types of MOSFETs analogous to two types of transistors:
n-MOSFET and p-MOSFET.
n-MOSFET and p-MOSFET are also called NMOS and PMOS transistors for short.
\end_layout
\begin_layout Section
Beyond transistors: digital logic gates
\end_layout
\begin_layout Standard
All digital devices are designed with logic gates.
A
\emph on
\begin_inset Index idx
status open
\begin_layout Plain Layout
logic gate
\end_layout
\end_inset
\emph default
\begin_inset Marginal
status open
\begin_layout Plain Layout
\series bold
\emph on
logic gate
\end_layout
\end_inset
\emph on
logic gate
\emph default
is a device that implements a boolean function.
Each logic gate includes a number of inputs and an output.
All computer operations are built from the combinations of logic gates,
which are just combinations of boolean functions.
\begin_inset Float marginfigure
wide false
sideways false
status collapsed
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
Example: NAND gate
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/02/Nand-gate.svg
scale 30
\end_inset
\end_layout
\begin_layout Plain Layout
\end_layout
\end_inset
\end_layout
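\begin_layout Standard
As a hedged illustration of how a computer operation can be built purely from combinations of logic gates, the sketch below models three gates as C functions and wires them into a 1-bit full adder, then chains four of those into a 4-bit adder.
The adder, the function names and the 4-bit width are assumptions chosen for illustration; they do not come from this chapter.
\end_layout

```c
#include <assert.h>

/* Each gate is a Boolean function of 1-bit inputs (0 or 1). */
int and_gate(int a, int b) { return a & b; }
int or_gate(int a, int b)  { return a | b; }
int xor_gate(int a, int b) { return a ^ b; }

/* Full adder: adds three 1-bit inputs and yields a sum bit and a carry bit.
   It is nothing more than five gates wired together. */
void full_adder(int a, int b, int carry_in, int *sum, int *carry_out)
{
    int partial = xor_gate(a, b);
    *sum = xor_gate(partial, carry_in);
    *carry_out = or_gate(and_gate(a, b), and_gate(partial, carry_in));
}

/* 4-bit ripple-carry adder (illustrative): the carry of each full adder
   feeds the next one. Only the low 4 bits of x and y are used. */
unsigned add4(unsigned x, unsigned y)
{
    unsigned result = 0;
    int carry = 0;
    for (int i = 0; i < 4; i++) {
        int sum;
        full_adder((int)((x >> i) & 1u), (int)((y >> i) & 1u), carry, &sum, &carry);
        result |= (unsigned)sum << i;
    }
    /* Bit 4 of the result holds the final carry. */
    return result | ((unsigned)carry << 4);
}
```

\begin_layout Standard
Calling add4 with 5 and 6 walks the four full adders from the lowest bit to the highest, exactly as a carry would ripple through a chain of hardware gates.
\end_layout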
\begin_layout Subsection
The theory behind logic gates
\end_layout
\begin_layout Standard
Logic gates accept only binary inputs
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
Input that is either a 0 or 1.
\end_layout
\end_inset
and produce binary outputs.
In other words, logic gates are functions that transform binary values.
Fortunately, a branch of math that deals exclusively with binary values
already existed, called
\emph on
Boolean Algebra
\emph default
, developed in the 19
\begin_inset script superscript
\begin_layout Plain Layout
th
\end_layout
\end_inset
century by George Boole.
With a sound mathematical theory as a foundation, logic gates were created
\emph on
.
\emph default
As logic gates implement Boolean functions, a set of Boolean functions is
\begin_inset Index idx
status collapsed
\begin_layout Plain Layout
functionally complete
\end_layout
\end_inset
\begin_inset Marginal
status collapsed
\begin_layout Plain Layout
\series bold
\emph on
functionally complete
\end_layout
\end_inset
\emph on
functionally complete
\emph default
if all other Boolean functions can be constructed from it.
Later, Charles Sanders Peirce (during 1880 – 1881) proved that either the
NOR or the NAND function alone is enough to create all other Boolean logic
functions.
Thus NOR and NAND gates are functionally complete
\begin_inset CommandInset citation
LatexCommand cite
key "Peirce"
literal "true"
\end_inset
.
Gates are simply the implementations of Boolean logic functions; therefore, a
NAND or NOR gate alone is enough to implement
\series bold
\emph on
all
\series default
\emph default
other logic gates.
The simplest gates a CMOS circuit can implement are inverters (NOT gates),
and from the inverters come NAND gates.
With NAND gates, we can confidently implement everything else.
This is why the invention of transistors, and then of CMOS circuits, revolutionized
the computer industry.
\begin_inset Marginal
status collapsed
\begin_layout Plain Layout
If you want to understand why and how from NAND gate we can create all Boolean
functions and a computer, I suggest the course
\emph on
Build a Modern Computer from First Principles: From Nand to Tetris
\emph default
available on Coursera:
\begin_inset Flex URL
status open
\begin_layout Plain Layout
https://www.coursera.org/learn/build-a-computer
\end_layout
\end_inset
.
To go even further after the course, you should take the series
\emph on
Computational Structures
\emph default
on Edx.
\end_layout
\end_inset
\end_layout
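\begin_layout Standard
The claim that NAND alone is functionally complete can be checked on a small scale.
The following is a minimal sketch, not a proof: NAND is the only primitive, every other gate is built by wiring NAND gates together, and a self-check compares each derived gate with the corresponding C operator for every input combination.
All function names are hypothetical and chosen for illustration.
\end_layout

```c
#include <assert.h>

/* NAND is the only primitive; inputs and outputs are 0 or 1. */
int nand_gate(int a, int b) { return !(a && b); }

/* Every gate below is wired purely from nand_gate. */
int not_gate(int a)        { return nand_gate(a, a); }
int and_gate(int a, int b) { return not_gate(nand_gate(a, b)); }
int or_gate(int a, int b)  { return nand_gate(not_gate(a), not_gate(b)); }
int nor_gate(int a, int b) { return not_gate(or_gate(a, b)); }
int xor_gate(int a, int b)
{
    int n = nand_gate(a, b);
    return nand_gate(nand_gate(a, n), nand_gate(b, n));
}

/* Exhaustively compare each derived gate with C's own operators. */
void check_gates(void)
{
    for (int a = 0; a <= 1; a++) {
        assert(not_gate(a) == !a);
        for (int b = 0; b <= 1; b++) {
            assert(and_gate(a, b) == (a & b));
            assert(or_gate(a, b)  == (a | b));
            assert(nor_gate(a, b) == !(a | b));
            assert(xor_gate(a, b) == (a ^ b));
        }
    }
}
```

\begin_layout Standard
Because each gate has at most two 1-bit inputs, four input combinations cover every case, so the check is exhaustive for these gates.
\end_layout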
\begin_layout Standard
We should realize and appreciate how powerful the Boolean functions available
in all programming languages are.
\end_layout
\begin_layout Subsection
Logic Gate implementation: CMOS circuit
\end_layout
\begin_layout Standard
Underlying every logic gate is a circuit called
\series bold
\emph on
\begin_inset Marginal
status open
\begin_layout Plain Layout
\series bold
\emph on
CMOS
\end_layout
\end_inset
CMOS
\series default
\emph default
\begin_inset Index idx
status open
\begin_layout Plain Layout
\emph on
CMOS
\end_layout
\end_inset
-
\series bold
\emph on
C
\series default
omplementary
\series bold
MOS
\series default
FET
\emph default
.
CMOS consists of two complementary transistors,
\emph on
NMOS
\emph default
and
\emph on
PMOS.
\emph default
The simplest CMOS circuit is an inverter or a
\emph on
NOT
\emph default
gate:
\end_layout
\begin_layout Standard
\begin_inset VSpace vfill
\end_inset
\end_layout
\begin_layout Standard
\begin_inset Newpage pagebreak
\end_inset
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
begin{figure}
\end_layout
\begin_layout Plain Layout
\backslash
caption{Electron flows of an inverter.
Input is on the left side and output on the right side.
The upper component is a PMOS and the lower component is an NMOS; both are
connected to the input and output.
(Source: Created with
\backslash
url{http://www.falstad.com/circuit/})}
\end_layout
\begin_layout Plain Layout
\backslash
subfloat[When input is low]{
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics[scale=0.6]{images/02/inverter-0}}
\backslash
hfill{}
\backslash
subfloat[When input is high]{
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics[scale=0.6]{images/02/inverter-1}}
\end_layout
\begin_layout Plain Layout
\backslash
end{figure}
\end_layout
\end_inset
\end_layout
\begin_layout Standard
From NOT gate, a NAND gate can be created:
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
begin{figure}
\backslash
caption{Electron flows of a NAND gate.}
\end_layout
\begin_layout Plain Layout
\backslash
subfloat[Input = 00, Output = 1]{
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics[scale=0.52]{images/02/nand-00}}
\backslash
hfill{}
\end_layout
\begin_layout Plain Layout
\backslash
subfloat[Input = 01, Output = 1]{
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics[scale=0.52]{images/02/nand-01}}
\backslash
hfill{}
\end_layout
\begin_layout Plain Layout
\backslash
subfloat[Input = 10, Output = 1]{
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics[scale=0.52]{images/02/nand-10}}
\backslash
hfill{}
\end_layout
\begin_layout Plain Layout
\backslash
subfloat[Input = 11, Output = 0]{
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics[scale=0.52]{images/02/nand-11}}
\end_layout
\begin_layout Plain Layout
\backslash
end{figure}
\end_layout
\end_inset
\end_layout
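\begin_layout Standard
The inverter and the NAND gate in the figures above can be approximated with a switch-level model in C.
This is a sketch under stated assumptions: a PMOS is treated as a switch that conducts when its input is 0, an NMOS as a switch that conducts when its input is 1, and the NAND gate is assumed to use the standard textbook arrangement of PMOS transistors in parallel towards the supply and NMOS transistors in series towards ground.
The function names are hypothetical.
\end_layout

```c
#include <assert.h>

/* Switch-level view of the two transistor types. */
int pmos_conducts(int gate) { return gate == 0; }
int nmos_conducts(int gate) { return gate == 1; }

/* Inverter: one PMOS between the supply and the output,
   one NMOS between the output and ground. */
int cmos_not(int in)
{
    int pull_up   = pmos_conducts(in);
    int pull_down = nmos_conducts(in);
    assert(pull_up != pull_down);  /* exactly one network conducts */
    return pull_up;                /* output is 1 when tied to the supply */
}

/* k-input NAND (k >= 1): k PMOS in parallel and k NMOS in series,
   so the gate uses k transistors of each type. */
int cmos_nand(const int *in, int k)
{
    int pull_up = 0;    /* parallel: any conducting PMOS is enough */
    int pull_down = 1;  /* series: every NMOS must conduct */
    for (int i = 0; i < k; i++) {
        pull_up   = pull_up   || pmos_conducts(in[i]);
        pull_down = pull_down && nmos_conducts(in[i]);
    }
    assert(pull_up != pull_down);
    return pull_up;
}
```

\begin_layout Standard
For two inputs, the four input combinations reproduce the outputs shown in the NAND figure: only the input 11 yields 0.
\end_layout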
\begin_layout Standard
From the NAND gate, we have all other gates.
As demonstrated, such simple circuitry performs the logical operators
found in day-to-day programming languages, e.g.
the NOT operator
\family typewriter
~
\family default
is executed directly by an inverter circuit, and operator
\family typewriter
&
\family default
is executed by an AND circuit and so on.
Code does not run on a magic black box.
In contrast, code execution is precise and transparent, often as simple
as running some hardwired circuit.
When we write software, we simply manipulate electrical current at the
physical level to run appropriate circuits to produce desired outcomes.
However, this whole process somehow does not relate to any thought involving
electrical current.
That is the real magic and will be explained soon.
\end_layout
\begin_layout Standard
One interesting property of CMOS is that
\series bold
\emph on
a k-input gate uses k PMOS and k NMOS transistors
\series default
\emph default
\begin_inset CommandInset citation
LatexCommand citep
key "John_digital"
literal "true"
\end_inset
.
All logic gates are built from pairs of NMOS and PMOS transistors, and gates
are the building blocks of all digital devices from simple to complex,
including any computer.
Thanks to this pattern, it is possible to separate the actual physical
circuit implementation from the logical implementation.
Digital designs are made with logic gates and are later
\begin_inset Quotes eld
\end_inset
compiled
\begin_inset Quotes erd
\end_inset
into physical circuits.
In fact, later we will see that logic gates become a language that describes
how circuits operate.
Understanding how CMOS works is important for understanding how a computer
is designed and, as a consequence, how a computer works
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
Again, if you want to understand how logic gates make a computer, consider
the courses on Coursera and edX suggested earlier.
\end_layout
\end_inset
.
\end_layout
\begin_layout Standard
Finally, an implemented circuit with its wires and transistors is stored
physically in a package called a
\emph on
chip
\emph default
.
A
\emph on
chip
\begin_inset Index idx
status open
\begin_layout Plain Layout
chip
\end_layout
\end_inset
\emph default
 is a substrate that an integrated circuit is etched onto.
However, in the consumer market, a chip also refers to a completely packaged
integrated circuit.
The meaning depends on the context.
\begin_inset Float marginfigure
wide false
sideways false
status collapsed
\begin_layout Plain Layout
\end_layout
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
74HC00 chip physical view
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/02/74hc00_nxp_physical.jpg
scale 60
\end_inset
\end_layout
\end_inset
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_layout Standard
\noindent
\align left
\begin_inset CommandInset line
LatexCommand rule
offset "-4ex"
width "100col%"
height "1.5pt"
\end_inset
\end_layout
\begin_layout Example
74HC00 is a chip with four 2-input NAND gates.
The chip comes with 8 input pins and 4 output pins, 1 pin for connecting
to a voltage source and 1 pin for connecting to the ground.
This device is the physical implementation of NAND gates that we can physically
touch and use.
But instead of just a single gate, the chip comes with 4 gates that can
be combined.
Each combination enables a different logic function, effectively creating
other logic gates.
This feature is what makes the chip popular.
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_deeper
\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
74HC00 logic diagrams (Source: 74HC00 datasheet,
\begin_inset Flex URL
status collapsed
\begin_layout Plain Layout
http://www.scrpdf.com/pdf/Semiconductors_new/Logic/74HCT/74HC_HCT00.pdf
\end_layout
\end_inset
)
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Float figure
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
Logic diagram of 74HC00
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/02/7400_block_diagram.png
\end_inset
\end_layout
\end_inset
\begin_inset space \hfill{}
\end_inset
\begin_inset Float figure
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
Logic diagram of one NAND gate
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/02/7400_logic_diagram.png
\end_inset
\end_layout
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
Each of the gates above is just a simple NAND circuit with the electron
flows, as demonstrated earlier.
Yet, many of these NAND-gate chips combined can build a simple computer.
Software, at the physical level, is just electron flows.
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
begin{figure}
\end_layout
\begin_layout Plain Layout
\backslash
caption{Gates built from NAND gates. Each NAND gate accepts 2 input signals and generates
1 output signal.}
\end_layout
\begin_layout Plain Layout
\backslash
subfloat[NOT gate]{
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics[scale=0.5]{images/02/not-gate}}
\backslash
hfill{}
\end_layout
\begin_layout Plain Layout
\backslash
subfloat[AND gate]{
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics[scale=0.5]{images/02/and-gate}}
\backslash
hfill{}
\backslash
\backslash
[0.5cm]
\end_layout
\begin_layout Plain Layout
\backslash
subfloat[OR gate]{
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics[scale=0.5]{images/02/or-gate}}
\backslash
qquad
\end_layout
\begin_layout Plain Layout
\backslash
subfloat[NOR gate]{
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics[scale=0.5]{images/02/nor-gate}}
\end_layout
\begin_layout Plain Layout
\backslash
end{figure}
\end_layout
\end_inset
\end_layout
\begin_layout Standard
How can the above gates be created with a 74HC00? It is simple: as every gate
has 2 input pins and 1 output pin, we can wire the output of one NAND gate
to an input of another NAND gate, thus chaining NAND gates together to
produce the diagrams above.
\begin_inset Separator latexpar
\end_inset
\end_layout
\end_deeper
\begin_layout Standard
\noindent
\align left
\begin_inset CommandInset line
LatexCommand rule
offset "4ex"
width "100col%"
height "1.5pt"
\end_inset
\end_layout
\begin_layout Section
Beyond Logic Gates: Machine Language
\end_layout
\begin_layout Subsection
Machine language
\end_layout
\begin_layout Standard
Because a hardware device is built from gates, and gates only accept series
of 0s and 1s, a hardware device only understands 0s and 1s.
However, a device only takes 0s and 1s in a systematic way.
\emph on
\begin_inset Marginal
status collapsed
\begin_layout Plain Layout
\series bold
\emph on
Machine language
\end_layout
\end_inset
Machine language
\begin_inset Index idx
status collapsed
\begin_layout Plain Layout
Machine language
\end_layout
\end_inset
\emph default
 is a collection of unique bit patterns that a device can identify and respond
to by performing a corresponding action.
A
\emph on
machine instruction
\emph default
is a unique bit pattern that a device can identify.
In a computer system, the main device with its own language is called the
\series bold
\emph on
CPU
\series default
-
\series bold
C
\series default
entral
\series bold
P
\series default
rocessing
\series bold
U
\series default
nit
\emph default
, which controls all activities going on inside a computer.
For example, in the x86 architecture, the pattern
\family typewriter
00000101
\family default
 tells the CPU to add a number to its accumulator register, and
\family typewriter
11110100
\family default
 tells it to halt.
In the early days of computers, people had to write programs entirely in binary.
\end_layout
\begin_layout Standard
Why does such a bit pattern cause a device to do something? The reason is
that underlying each instruction is a small circuit that implements the
instruction.
Similar to how a function/subroutine in a computer program is called by
its name, a bit pattern is the name of a little function inside a CPU that
gets executed when the CPU encounters the pattern.
\end_layout
\begin_layout Standard
Note that a CPU is not the only device with its own language.
CPU is just a name for the hardware device that controls a computer
system.
A hardware device may not be a CPU but still have its own language.
A device with its own machine language is a
\emph on
programmable device
\emph default
, since a user can use the language to command the device to perform different
actions.
For example, a printer has its set of commands for instructing it how to
print a page.
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_layout Standard
\noindent
\align left
\begin_inset CommandInset line
LatexCommand rule
offset "-4ex"
width "100col%"
height "1.5pt"
\end_inset
\end_layout
\begin_layout Example
\begin_inset CommandInset label
LatexCommand label
name "exa:74HC00-chip-can"
\end_inset
A user can use a 74HC00 chip without knowing its internals, needing only the
interface for using the device.
First, we need to know its layout:
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_deeper
\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
74HC00 Pin Layout (Source: 74HC00 datasheet,
\begin_inset Flex URL
status collapsed
\begin_layout Plain Layout
http://www.nxp.com/documents/data_sheet/74HC_HCT00.pdf
\end_layout
\end_inset
)
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset space \hfill{}
\end_inset
\begin_inset Graphics
filename images/02/7400_pin_configuration.pdf
\end_inset
\begin_inset space \hfill{}
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
Then, the functionality of each pin:
\end_layout
\begin_layout Standard
\begin_inset Float table
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
Pin Description (Source: 74HC00 datasheet,
\begin_inset Flex URL
status collapsed
\begin_layout Plain Layout
http://www.nxp.com/documents/data_sheet/74HC_HCT00.pdf
\end_layout
\end_inset
)
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Tabular
<lyxtabular version="3" rows="6" columns="3">
<features tabularvalignment="middle">
<column alignment="left" valignment="top" width="3cm">
<column alignment="left" valignment="top" width="3cm">
<column alignment="left" valignment="top" width="3cm">
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
Symbol
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
Pin
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
Description
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1A to 4A
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1, 4, 9, 12
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
data input
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1B to 4B
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
2, 5, 10, 13
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
data input
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1Y to 4Y
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
3, 6, 8, 11
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
data output
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
GND
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
7
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
ground (0
\begin_inset space \thinspace{}
\end_inset
V)
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
V
\begin_inset script subscript
\begin_layout Plain Layout
cc
\end_layout
\end_inset
\begin_inset script subscript
\begin_layout Plain Layout
\end_layout
\end_inset
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
14
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
supply voltage
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
Finally, how to use the pins:
\end_layout
\begin_layout Standard
\begin_inset Float table
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
Functional Description
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Tabular
<lyxtabular version="3" rows="6" columns="3">
<features tabularvalignment="middle">
<column alignment="left" valignment="top" width="3cm">
<column alignment="left" valignment="top" width="3cm">
<column alignment="left" valignment="top" width="3cm">
<row>
<cell multicolumn="1" alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
Input
\end_layout
\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
Output
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
nA
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
nB
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
nY
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
L
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
L
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
H
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
L
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
X
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
H
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
X
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
L
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
H
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
H
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
H
\end_layout
\end_inset
</cell>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
L
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\begin_inset Marginal
status collapsed
\begin_layout Itemize
n is a number, either 1, 2, 3, or 4
\end_layout
\begin_layout Itemize
H = HIGH voltage level; L = LOW voltage level; X = don’t care.
\end_layout
\end_inset
The functional description provides a truth table with all possible pin
inputs and outputs, which also describes the usage of all pins in the device.
A user does not need to know the implementation, only such a table, to use
the device.
We can say that the truth table above is the machine language of the device.
Since the device is digital, its language is a collection of binary strings:
\end_layout
\begin_layout Itemize
The device has 8 input pins, and this means it accepts binary strings of
8 bits.
\end_layout
\begin_layout Itemize
The device has 4 output pins, and this means it produces binary strings
of 4 bits from the 8-bit inputs.
\end_layout
\begin_layout Standard
The set of input strings is what the device understands, and the set
of output strings is what the device can speak.
Together, they make the language of the device.
Even though this device is simple, its language contains
quite a few binary strings:
\begin_inset Formula $2^{8}+2^{4}=272$
\end_inset
.
However, this number is tiny compared with that of a complex device like a CPU,
with hundreds of pins.
\end_layout
\begin_layout Standard
Left as is, the 74HC00 is simply a NAND device with two 4-bit inputs
\begin_inset Foot
status open
\begin_layout Plain Layout
Or simply a 4-bit NAND gate, as it can accept at most 4 bits per
input.
\end_layout
\end_inset
.
\end_layout
\begin_layout Standard
\begin_inset Tabular
<lyxtabular version="3" rows="3" columns="13">
<features tabularvalignment="middle">
<column alignment="left" valignment="top">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell multicolumn="1" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
Input
\end_layout
\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell multicolumn="1" alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
Output
\end_layout
\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
<cell multicolumn="2" alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
Pin
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\series bold
1A
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\series bold
1B
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\series bold
2A
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\series bold
2B
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\series bold
3A
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\series bold
3B
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\series bold
4A
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\series bold
4B
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\series bold
1Y
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\series bold
2Y
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\series bold
3Y
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
\series bold
4Y
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="left" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
Value
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
1
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
\begin_layout Standard
\begin_inset VSpace medskip
\end_inset
\end_layout
\begin_layout Standard
The inputs and outputs, presented visually:
\end_layout
\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status open
\begin_layout Plain Layout
\end_layout
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
Pins when receiving digital signals that correspond to a binary string.
Green signals are inputs; blue signals are outputs.
\end_layout
\end_inset
\begin_inset space \hfill{}
\end_inset
\begin_inset Graphics
filename images/02/7400_bin_string1.pdf
\end_inset
\begin_inset space \hfill{}
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
On the other hand, if an OR gate is implemented, we can only build one 2-input
OR gate from a 74HC00, as it requires 3 NAND gates: 2 input NAND gates and
1 output NAND gate.
Each input NAND gate represents only a 1-bit input of the OR gate.
In the following figure, the pins of each input NAND gate are always set
to the same value (either both inputs are A or both inputs are B) to represent
a single-bit input for the final OR gate:
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
begin{figure*}
\end_layout
\begin_layout Plain Layout
\backslash
caption{2-bit OR gate implementation}
\end_layout
\begin_layout Plain Layout
\backslash
subfloat[2-bit OR gate logic diagram, built from 3 NAND gates with 4 pins
just
\end_layout
\begin_layout Plain Layout
for 2 bits of input.
\backslash
newline]{
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics[scale=0.6]{images/02/or-gate-ex}}
\backslash
hfill{}
\end_layout
\begin_layout Plain Layout
\backslash
subfloat[Pin 3A and 3B take the values from 1Y and 2Y.]{
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics{images/02/or-gate-layout-ex}}
\end_layout
\begin_layout Plain Layout
\backslash
end{figure*}
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\begin_inset VSpace bigskip
\end_inset
\begin_inset Float margintable
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
Truth table of OR logic diagram.
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset space \hfill{}
\end_inset
\begin_inset Tabular
<lyxtabular version="3" rows="5" columns="5">
<features tabularvalignment="middle">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
A
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
B
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
C
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
D
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
Y
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
0
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
1
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
0
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
1
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\begin_inset space \hfill{}
\end_inset
\end_layout
\end_inset
\end_layout
\end_deeper
\begin_layout Standard
\noindent
\align left
\begin_inset CommandInset line
LatexCommand rule
offset "4ex"
width "100col%"
height "1.5pt"
\end_inset
\end_layout
\begin_layout Standard
To implement a 4-bit OR gate, we need a total of four 74HC00 chips configured
as OR gates, packaged as a single chip as in figure
\begin_inset CommandInset ref
LatexCommand ref
reference "or-chip-74hc00"
\end_inset
.
\end_layout
\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
4-bit OR chip made from four 74HC00 devices
\end_layout
\end_inset
\begin_inset CommandInset label
LatexCommand label
name "or-chip-74hc00"
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset space \hfill{}
\end_inset
\begin_inset Graphics
filename images/02/4-bit-or-gate-layout.pdf
scale 41
\end_inset
\begin_inset space \hfill{}
\end_inset
\end_layout
\end_inset
\end_layout
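\begin_layout Standard
The construction above can also be sketched in software.
The following is a minimal illustration in C, not a description of the real chip: the function names nand1, or1 and or4 are made up for this sketch.
It builds OR from three NAND operations exactly as in the truth table (C = NAND(A, A), D = NAND(B, B), Y = NAND(C, D)), then repeats that per bit to get a 4-bit OR.
\end_layout
\begin_layout LyX-Code
```c
#include <stdint.h>

/* One NAND gate on single bits (each argument is 0 or 1). */
uint8_t nand1(uint8_t a, uint8_t b)
{
    return (uint8_t)(!(a && b));
}

/* OR built from three NAND gates, following the truth table:
 * C = NAND(A, A), D = NAND(B, B), Y = NAND(C, D). */
uint8_t or1(uint8_t a, uint8_t b)
{
    uint8_t c = nand1(a, a);   /* C = NOT A */
    uint8_t d = nand1(b, b);   /* D = NOT B */
    return nand1(c, d);        /* Y = A OR B */
}

/* 4-bit OR: one or1() per bit position, like four chips side by side. */
uint8_t or4(uint8_t x, uint8_t y)
{
    uint8_t result = 0;
    for (int bit = 0; bit < 4; bit++) {
        uint8_t a = (uint8_t)((x >> bit) & 1u);
        uint8_t b = (uint8_t)((y >> bit) & 1u);
        result |= (uint8_t)(or1(a, b) << bit);
    }
    return result;
}
```
\end_layout
\begin_layout Standard
Calling or4 with the operands 1100 and 0011 should give 1111, the same answer the wired chips would produce.
\end_layout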
\begin_layout Subsection
Assembly Language
\end_layout
\begin_layout Standard
Assembly language is a symbolic representation of binary machine code that
gives bit patterns mnemonic names.
It was a vast improvement over the days when programmers had to write 0s and 1s.
For example, instead of writing
\family typewriter
000000101
\family default
, a programmer simply writes
\family typewriter
hlt
\family default
to stop a computer.
Such an abstraction makes the instructions executed by a CPU easier to remember.
More instructions can be memorized, less time is spent looking
up bit patterns in the CPU manual and, as a result, code is
written faster.
\end_layout
\begin_layout Standard
Understanding assembly language is crucial for low-level programming domains,
even to this day.
The more instructions a programmer wants to understand, the deeper the understanding
of machine architecture that is required.
\end_layout
\begin_layout Example
We can build a device with 2 assembly instructions:
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_deeper
\begin_layout LyX-Code
or <op1>, <op2>
\end_layout
\begin_layout LyX-Code
nand <op1>, <op2>
\end_layout
\begin_layout Itemize
\family typewriter
or
\family default
accepts two 4-bit operands.
This corresponds to a 4-bit OR gate device built from four 74HC00 chips.
\end_layout
\begin_layout Itemize
\family typewriter
nand
\family default
accepts two 4-bit operands.
This corresponds to a single 74HC00 chip, left as is.
\end_layout
\begin_layout Standard
Essentially, the gates in the example
\begin_inset CommandInset ref
LatexCommand ref
reference "exa:74HC00-chip-can"
\end_inset
implement the instructions.
Up to this point, we have only specified inputs and outputs and manually fed them
to a device.
That is, to perform an operation:
\end_layout
\begin_layout Itemize
Pick a device by hand.
\end_layout
\begin_layout Itemize
Manually put electrical signals into pins.
\end_layout
\begin_layout Standard
First, we want to automate the process of device selection.
That is, we want to simply write an assembly instruction and have the device that
implements the instruction selected automatically.
Solving this problem is easy:
\end_layout
\begin_layout Itemize
Give each instruction an index in binary code, called
\emph on
operation code
\emph default
or
\emph on
opcode
\emph default
for short, and embed it as part of input.
The value for each instruction is specified as in table
\begin_inset CommandInset ref
LatexCommand formatted
reference "ex-ins-ops"
\end_inset
.
\begin_inset Float margintable
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
Instruction-Opcode mapping.
\end_layout
\end_inset
\begin_inset CommandInset label
LatexCommand label
name "ex-ins-ops"
\end_inset
\end_layout
\begin_layout Plain Layout
\noindent
\align center
\begin_inset Tabular
<lyxtabular version="3" rows="3" columns="2">
<features tabularvalignment="middle">
<column alignment="center" valignment="top">
<column alignment="center" valignment="top">
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
Instruction
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\series bold
Binary Code
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
nand
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
00
\end_layout
\end_inset
</cell>
</row>
<row>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
or
\end_layout
\end_inset
</cell>
<cell alignment="center" valignment="top" topline="true" bottomline="true" leftline="true" rightline="true" usebox="none">
\begin_inset Text
\begin_layout Plain Layout
\family typewriter
01
\end_layout
\end_inset
</cell>
</row>
</lyxtabular>
\end_inset
\end_layout
\end_inset
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_deeper
\begin_layout Standard
Each input now contains additional data at the beginning: an opcode.
For example, the instruction:
\end_layout
\begin_layout LyX-Code
\color red
nand
\color inherit
1100, 1100
\end_layout
\begin_layout Standard
corresponds to the binary string:
\family typewriter
\color red
00
\color inherit
11001100
\family default
.
The first two bits
\family typewriter
\color red
00
\family default
\color inherit
encode a
\family typewriter
nand
\family default
instruction, as listed in the table above.
\end_layout
\end_deeper
\begin_layout Itemize
Add another device to select a device, based on a binary code peculiar to
an instruction.
\end_layout
\begin_layout Standard
Such a device is called a
\emph on
decoder
\emph default
, an important component in a CPU that decides which circuit to use.
In the above example, when feeding
\family typewriter
\color red
00
\color inherit
11001100
\family default
to the decoder, because the opcode is
\family typewriter
\color red
00
\family default
\color inherit
, the data are sent to the NAND device for computation.
\end_layout
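\begin_layout Standard
The decoder's job can be sketched in a few lines of C.
This is only an illustration under stated assumptions: an instruction is taken to be 10 bits wide (a 2-bit opcode followed by two 4-bit operands, as in the binary string above), and the names nand_device, or_device and decode_and_execute are invented for the sketch.
\end_layout
\begin_layout LyX-Code
```c
#include <stdint.h>

enum { OP_NAND = 0x0, OP_OR = 0x1 };   /* opcodes from the table above */

/* The two "devices"; each works on 4-bit values. */
uint8_t nand_device(uint8_t a, uint8_t b)
{
    return (uint8_t)(~(a & b) & 0xFu);
}

uint8_t or_device(uint8_t a, uint8_t b)
{
    return (uint8_t)((a | b) & 0xFu);
}

/* Decoder: split a 10-bit instruction into opcode and operands,
 * then route the operands to the device the opcode selects.
 * Returns the 4-bit result, or -1 for an unassigned opcode. */
int decode_and_execute(uint16_t instruction)
{
    uint8_t opcode = (uint8_t)((instruction >> 8) & 0x3u);  /* first two bits */
    uint8_t op1    = (uint8_t)((instruction >> 4) & 0xFu);  /* next four bits */
    uint8_t op2    = (uint8_t)(instruction & 0xFu);         /* last four bits */

    switch (opcode) {
    case OP_NAND: return nand_device(op1, op2);
    case OP_OR:   return or_device(op1, op2);
    default:      return -1;
    }
}
```
\end_layout
\begin_layout Standard
The binary string 0011001100 is the number 0x0CC, so decode_and_execute(0x0CC) selects the NAND device and should yield 0011.
In hardware the selection is done by gates rather than a switch statement, but the routing idea is the same.
\end_layout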
\begin_layout Standard
Finally, writing assembly code is just an easier way to write binary strings
that a device can understand.
When we write assembly code and save it in a text file, a program called an
\emph on
\begin_inset Marginal
status open
\begin_layout Plain Layout
\series bold
\emph on
assembler
\end_layout
\end_inset
assembler
\begin_inset Index idx
status open
\begin_layout Plain Layout
assembler
\end_layout
\end_inset
\emph default
translates the text file into binary strings that a device can understand.
So, how can an assembler exist in the first place? Assume this is the first
assembler in the world; it must then be written in binary code.
For the next version, life is easier: the programmers write the assembler
in assembly code, then use the first version to assemble it.
The binary strings an assembler produces are then stored in another device, from which they can later be
retrieved and sent to a decoder.
A
\emph on
\begin_inset Index idx
status open
\begin_layout Plain Layout
storage device
\end_layout
\end_inset
\begin_inset Marginal
status open
\begin_layout Plain Layout
\series bold
\emph on
storage device
\end_layout
\end_inset
storage device
\emph default
is the device that stores machine instructions; it is an array of circuits
for saving 0 and 1 states.
\end_layout
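\begin_layout Standard
To make the assembler's role concrete, here is a minimal sketch in C of an assembler for the two-instruction device.
It is an illustration only: the function names is_4bit and assemble_line are invented, operands are assumed to be written as four binary digits, and anything after the second operand is ignored.
\end_layout
\begin_layout LyX-Code
```c
#include <stdio.h>
#include <string.h>

/* Returns 1 if s consists of exactly four '0'/'1' characters. */
int is_4bit(const char *s)
{
    return strlen(s) == 4 && strspn(s, "01") == 4;
}

/* Assemble one line such as "nand 1100, 1100" into a 10-character
 * binary string: 2-bit opcode followed by the two 4-bit operands.
 * out must have room for 11 bytes. Returns 0 on success, -1 on error. */
int assemble_line(const char *line, char out[11])
{
    char mnemonic[8], op1[8], op2[8];
    const char *opcode;

    /* Operands are read as text because they are already bit strings. */
    if (sscanf(line, " %7s %7[01] , %7[01]", mnemonic, op1, op2) != 3)
        return -1;
    if (!is_4bit(op1) || !is_4bit(op2))
        return -1;

    /* Opcode values follow the instruction-opcode table. */
    if (strcmp(mnemonic, "nand") == 0)
        opcode = "00";
    else if (strcmp(mnemonic, "or") == 0)
        opcode = "01";
    else
        return -1;

    snprintf(out, 11, "%s%s%s", opcode, op1, op2);
    return 0;
}
```
\end_layout
\begin_layout Standard
Given the line nand 1100, 1100, the sketch should produce the string 0011001100, which is what would be written to the storage device.
\end_layout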
\begin_layout Standard
A decoder is built out of logic gates, like other digital devices.
A storage device, however, can be anything that can store 0s and 1s and from
which they can be retrieved.
It can be a magnetized device that uses magnetism to store
information, or it can be made out of electrical circuits that change
and remember states when a voltage is applied.
Regardless of the technology used, as long as the device can store data
and the data can be retrieved, it suffices.
Indeed, modern devices are so complex that it is impossible and unnecessary
to understand every implementation detail.
Instead, we only need to learn the interfaces, e.g.
the pins, that the devices expose.
\end_layout
\begin_layout Standard
\begin_inset VSpace vfill
\end_inset
\end_layout
\begin_layout Standard
\begin_inset Newpage pagebreak
\end_inset
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
begin{figure*}
\end_layout
\begin_layout Plain Layout
\backslash
caption{A decoder retrieves the current instruction pointed to by the arrow
and selects the NAND device to execute the
\backslash
texttt{nand} instruction.}
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics{images/02/decoder-ex}
\end_layout
\begin_layout Plain Layout
\backslash
end{figure*}
\end_layout
\end_inset
\end_layout
\begin_layout Standard
A computer essentially implements this process:
\end_layout
\begin_layout Itemize
\emph on
Fetch
\emph default
an instruction from a storage device.
\end_layout
\begin_layout Itemize
\emph on
Decode
\emph default
the instruction.
\end_layout
\begin_layout Itemize
\emph on
Execute
\emph default
the instruction.
\end_layout
\begin_layout Standard
Or in short, a
\begin_inset Index idx
status open
\begin_layout Plain Layout
f@fetch – decode – execute
\end_layout
\end_inset
fetch – decode – execute cycle.
The above device is extremely rudimentary, but it already represents a
computer with a
\emph on
fetch
\emph default
–
\emph on
decode
\emph default
–
\emph on
execute
\emph default
cycle.
More instructions can be implemented by adding more devices, allocating
more opcodes for the instructions, and updating the decoder accordingly.
The Apollo Guidance Computer, a digital computer produced for the Apollo
space program from 1961 to 1972, was built entirely with NOR gates, the
other choice besides NAND gates for creating all other logic gates.
Similarly, if we keep improving our hypothetical device, it eventually
becomes a full-fledged computer.
\end_layout
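\begin_layout Standard
The whole cycle fits in one short loop.
The sketch below is a hedged illustration in C, not a model of any real CPU: the storage device is assumed to be an array of 16-bit words each holding one 10-bit instruction, and the function name run and its parameters are invented for the example.
\end_layout
\begin_layout LyX-Code
```c
#include <stddef.h>
#include <stdint.h>

/* Run every instruction in storage[] once and write each 4-bit result
 * to results[]. An instruction is a 2-bit opcode followed by two 4-bit
 * operands. Returns the number of instructions executed. */
size_t run(const uint16_t *storage, size_t count, uint8_t *results)
{
    size_t pc;  /* index of the current instruction (the "arrow") */

    for (pc = 0; pc < count; pc++) {
        uint16_t instruction = storage[pc];                     /* fetch */

        uint8_t opcode = (uint8_t)((instruction >> 8) & 0x3u);  /* decode */
        uint8_t a = (uint8_t)((instruction >> 4) & 0xFu);
        uint8_t b = (uint8_t)(instruction & 0xFu);

        switch (opcode) {                                       /* execute */
        case 0x0:  /* nand */
            results[pc] = (uint8_t)(~(a & b) & 0xFu);
            break;
        case 0x1:  /* or */
            results[pc] = (uint8_t)((a | b) & 0xFu);
            break;
        default:   /* unassigned opcode: stop the machine */
            return pc;
        }
    }
    return pc;
}
```
\end_layout
\begin_layout Standard
Adding an instruction to this toy machine means adding one more case to the switch, which mirrors adding a device and updating the decoder.
\end_layout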
\end_deeper
\begin_layout Subsection
Programming Languages
\end_layout
\begin_layout Standard
Assembly language is a step up from writing 0s and 1s.
As time went by, people realized that many pieces of assembly code had
repeating patterns of usage.
It would be nice if, instead of writing the same blocks of code
over and over in many places, we could simply refer to such blocks with
easier-to-use text forms.
For example, a block of assembly code checks whether one variable is greater
than another and, if so, executes one block of code, otherwise executes another block
of code; in C, such a block of assembly code is represented by an
\begin_inset Flex Code
status open
\begin_layout Plain Layout
if
\end_layout
\end_inset
statement that is close to human language.
\end_layout
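\begin_layout Standard
The pattern can be seen by writing the same logic twice in C.
This is a sketch for illustration: the function names max_if and max_goto are made up, and the mnemonics in the comments are generic rather than taken from a specific instruction set.
The second function uses goto to imitate the compare-and-jump shape that the assembly version would have.
\end_layout
\begin_layout LyX-Code
```c
/* High-level form: the comparison and both branches are one statement. */
int max_if(int a, int b)
{
    int result;
    if (a > b) {
        result = a;
    } else {
        result = b;
    }
    return result;
}

/* The same logic spelled out the way assembly does it: compare,
 * jump over the block that should not run, fall through otherwise. */
int max_goto(int a, int b)
{
    int result;
    if (!(a > b)) goto else_branch;   /* cmp a, b ; jump if not greater */
    result = a;                       /* "then" block */
    goto done;                        /* skip the "else" block */
else_branch:
    result = b;                       /* "else" block */
done:
    return result;
}
```
\end_layout
\begin_layout Standard
Both functions return the same value for every input; the first one simply hides the jumps and labels behind a single statement.
\end_layout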
\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
Repeated assembly patterns are generalized into a new language.
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset space \hfill{}
\end_inset
\begin_inset Graphics
filename images/02/asm_to_proglang.pdf
scale 60
\end_inset
\begin_inset space \hfill{}
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
People created text forms to represent common blocks of assembly code, such
as the
\family typewriter
if
\family default
syntax above, then wrote a program to translate the text forms into assembly
code.
The program that translates such text forms to machine code is called a
\begin_inset Marginal
status open
\begin_layout Plain Layout
\series bold
\emph on
compiler
\end_layout
\end_inset
\emph on
compiler
\emph default
\begin_inset Index idx
status open
\begin_layout Plain Layout
\emph on
compiler
\end_layout
\end_inset
:
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
begin{figure*}
\end_layout
\begin_layout Plain Layout
\backslash
caption{From high-level language back to low-level language.}
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics{images/02/proglang_to_asm}
\end_layout
\begin_layout Plain Layout
\backslash
end{figure*}
\end_layout
\begin_layout Plain Layout
\end_layout
\end_inset
\end_layout
\begin_layout Standard
Any software logic a programming language can implement, hardware can also
implement.
The reverse is also true: any hardware logic that is implemented in a circuit
can be reimplemented in a programming language.
The simple reason is that programming languages, assembly languages,
machine languages and logic gates are all just languages for expressing computations.
It is impossible for software to implement something the hardware is incapable
of, because a programming language is just a simpler way to use the underlying
hardware.
At the end of the day, programs are translated to machine
instructions that are valid for a CPU.
Otherwise, the code cannot run and the software is useless.
Conversely, software can do everything the hardware that runs it
can, as programming languages are just an easier way to use the hardware.
\begin_layout Standard
In reality, even though all languages are equivalent in power, not all of
them are capable of expressing each other's programs.
Programming languages vary between two ends of a spectrum: high level and
low level.
\end_layout
\begin_layout Standard
The higher level a programming language is, the more distant it becomes
from the hardware.
In some high-level programming languages, such as Python, a programmer
cannot manipulate underlying hardware, despite being able to deliver the
same computations as low-level programming languages.
The reason is that high-level languages want to hide hardware details to
free programmers from dealing with irrelevant details not related to current
problem domains.
Such convenience, however, is not free: it requires software to carry
extra code for managing hardware details (e.g.
memory), thus making the code run slower, and it makes hardware programming
difficult or impossible.
The more abstractions a programming language imposes, the more difficult
it is to write low-level software, such as hardware drivers or an operating
system.
This is the reason why C is usually the language of choice for writing an
operating system: C is just a thin wrapper around the underlying hardware,
making it easy to understand how exactly a hardware device runs when executing
a certain piece of C code.
\end_layout
\begin_layout Standard
Each programming language represents a way of thinking about programs.
Higher-level programming languages help to focus on problem domains that
are not related to hardware at all, and where programmer performance is
more important than computer performance.
Lower-level programming languages help to focus on the inner workings of
a machine, and thus are best suited for problem domains that are related to
controlling hardware.
That is why so many languages exist.
Use the right tools for the right job to achieve the best results.
\end_layout
\begin_layout Standard
\begin_inset Note Note
status collapsed
\begin_layout Plain Layout
Explain the two ways to create new abstractions in programming languages
\end_layout
\end_inset
\end_layout
\begin_layout Section
Abstraction
\end_layout
\begin_layout Standard
\emph on
Abstraction
\begin_inset Index idx
status open
\begin_layout Plain Layout
Abstraction
\end_layout
\end_inset
\emph default
is a technique for hiding complexity that is irrelevant to the problem in
context.
For example, imagine writing programs without any layer other than the lowest
one: circuits.
Not only would a person need an in-depth understanding of how circuits work,
but designing a circuit would also be much more obscure, because the designer must
look at the raw circuits while thinking at a higher level, such as logic gates.
It is a distracting process, as a designer must constantly translate
ideas into circuits.
With abstraction, a designer can simply think the high-level ideas through
and translate them into circuits later.
Not only is this more efficient, but it is also more accurate, as a designer
can focus all his efforts on verifying the design with high-level thinking.
When a new designer arrives, he can easily understand the high-level designs,
and thus can continue to develop or maintain existing systems.
\end_layout
\begin_layout Subsection
Why abstraction works
\end_layout
\begin_layout Standard
In all the layers, abstraction manifests itself:
\end_layout
\begin_layout Itemize
Logic gates abstract away the details of CMOS.
\end_layout
\begin_layout Itemize
Machine language abstracts away the details of logic gates.
\end_layout
\begin_layout Itemize
Assembly language abstracts away the details of machine languages.
\end_layout
\begin_layout Itemize
A programming language abstracts away the details of assembly languages.
\end_layout
\begin_layout Standard
We see a repeating pattern in how lower layers build upper layers:
\end_layout
\begin_layout Itemize
A lower layer has a recurring pattern.
Then, this recurring pattern is taken out and a language is built on top of
it.
\end_layout
\begin_layout Itemize
A higher layer strips away layer-specific (non-recurring) details to focus
on the recurring details.
\end_layout
\begin_layout Itemize
The recurring details are given a new and simpler language than the languages
of the lower layers.
\end_layout
\begin_layout Standard
The key realization is that every layer is just
\emph on
a more convenient language to
\series bold
describe
\series default
the lower layer
\emph default
.
Only after a description is fully created with the language of the higher
layer is it then
\emph on
implemented
\emph default
with the language of the lower layer.
\end_layout
\begin_layout Itemize
The CMOS layer has a recurring pattern that makes sure logic gates are reliably
translated to CMOS circuits:
\series bold
\emph on
a k-input gate uses k PMOS and k NMOS transistors
\series default
\emph default
\begin_inset CommandInset citation
LatexCommand citep
key "John_digital"
literal "true"
\end_inset
.
Since digital devices use CMOS exclusively, a language arose to describe
higher level ideas while hiding CMOS circuits: Logic Gates.
\end_layout
\begin_layout Itemize
Logic Gates hide the language of circuits and focus on how to implement
primitive Boolean functions and combine them to create new functions.
All logic gates receive input and generate output as binary numbers.
Thanks to this recurring pattern, logic gates are hidden away behind a
new language: Assembly, which is a set of predefined binary patterns that
cause the underlying gates to perform an action.
\end_layout
\begin_layout Itemize
Soon, people realized that many recurring patterns arose within Assembly
language.
Repeated blocks of Assembly code that express the same or similar ideas
appear in Assembly source files.
There were many such ideas that could be reliably translated into Assembly
code.
Thus, the ideas were extracted and built into the high-level programming
languages that every programmer learns today.
\end_layout
\begin_layout Standard
Recurring patterns are the key to abstraction.
Recurring patterns are why abstraction works.
Without them, no language can be built, and thus no abstraction.
Fortunately, humans have already developed a systematic discipline for studying
patterns: Mathematics.
As quoted from the British mathematician G.
H.
Hardy
\begin_inset CommandInset citation
LatexCommand citeyearpar
key "Hardy"
literal "true"
\end_inset
:
\end_layout
\begin_layout Quote
A mathematician, like a painter or a poet, is a maker of patterns.
If his patterns are more permanent than theirs, it is because they are
made with ideas.
\end_layout
\begin_layout Standard
Isn't a mathematical formula a representation of a pattern? Doesn't a variable
represent values with the same properties, given by constraints? Mathematics
provides a formal system to identify and describe existing patterns in
nature.
For that reason, this system can certainly be applied in the digital world,
which is just a subset of the real world.
Mathematics can be used as a common language to make translation between
layers easier, and to help with the understanding of layers.
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
begin{figure*}
\end_layout
\begin_layout Plain Layout
\backslash
caption{Mathematics as a universal language for all layers.
Since all layers
\end_layout
\begin_layout Plain Layout
can express mathematics with their technologies, each layer can be
\end_layout
\begin_layout Plain Layout
translated into another layer.}
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics[scale=0.5]{images/02/layer_translation}
\end_layout
\begin_layout Plain Layout
\backslash
end{figure*}
\end_layout
\end_inset
\end_layout
\begin_layout Subsection
Why abstraction reduces complexity
\end_layout
\begin_layout Standard
Abstraction by building languages certainly improves productivity by stripping
away details irrelevant to a problem.
Imagine writing programs without any other layer except the lowest layer:
circuits.
This is how complexity emerges: when high-level ideas are expressed in a
lower-level language, as the example above demonstrated.
Unfortunately, this is the case with software, as programming languages
at the moment place more emphasis on software than on problem domains.
That is, without prior knowledge, code written in a language is unable
to express by itself the knowledge of its target domain.
In other words,
\emph on
a language is expressive if its syntax is designed to express the problem
domain it is trying to solve
\emph default
.
That is, it expresses
\emph on
what
\emph default
it will do rather than
\emph on
how
\emph default
it will do it.
Consider this example:
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_layout Standard
\noindent
\align left
\begin_inset CommandInset line
LatexCommand rule
offset "-4ex"
width "100col%"
height "1.5pt"
\end_inset
\end_layout
\begin_layout Example
Graphviz (
\begin_inset Flex URL
status open
\begin_layout Plain Layout
http://www.graphviz.org/
\end_layout
\end_inset
) is visualization software that provides a language, called
\family typewriter
dot
\family default
, for describing graphs:
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_deeper
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
begin{figure*}
\end_layout
\begin_layout Plain Layout
\backslash
caption{From graph description to graph.}
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics{images/02/digraph}
\end_layout
\begin_layout Plain Layout
\backslash
end{figure*}
\end_layout
\end_inset
\end_layout
\begin_layout Standard
As can be seen, the code itself clearly expresses how the graph is connected.
Even a non-programmer can understand and use such a language easily.
An implementation in C would be more troublesome, and that's assuming that
the functions for drawing graphs are already available.
To draw a line, in C we might write something like:
\end_layout
\begin_layout LyX-Code
draw_line(a, b);
\end_layout
\begin_layout Standard
However, it is still verbose compared with:
\end_layout
\begin_layout LyX-Code
a -> b;
\end_layout
\begin_layout Standard
Also,
\family typewriter
a
\family default
and
\family typewriter
b
\family default
must be defined in C, compared to the implicit nodes in the
\family typewriter
dot
\family default
language.
However, even if we do not factor in the verbosity, C still has a limitation:
it cannot change its syntax to suit the problem domain.
A domain-specific language might even be more verbose, but it makes a domain
more understandable.
If a problem domain must be expressed in C, then it is constrained by the
syntax of C.
Since C is not a language specialized for a problem domain, but is
a
\emph on
general-purpose
\emph default
programming language, the domain knowledge is buried within the implementation
details.
As a result, a C programmer is needed to decipher and extract the domain
knowledge out.
If the domain knowledge cannot be extracted, then the software cannot be
further developed.
\end_layout
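\begin_layout Standard
The contrast can be made concrete with a small sketch in C.
Everything named here is hypothetical and not part of Graphviz: struct edge and edges_to_dot are invented for the illustration.
The function only turns a list of edges into a one-line dot description, yet most of its lines deal with buffers and sizes rather than with the graph.
\end_layout
\begin_layout LyX-Code
```c
#include <stdio.h>
#include <string.h>

/* One directed edge between two named nodes. */
struct edge {
    const char *from;
    const char *to;
};

/* Write the edges as a one-line dot digraph into out (size bytes).
 * Returns 0 on success, -1 if the buffer is too small. */
int edges_to_dot(const struct edge *edges, size_t count,
                 char *out, size_t size)
{
    size_t used;
    int n = snprintf(out, size, "digraph {");

    if (n < 0 || (size_t)n >= size)
        return -1;
    used = (size_t)n;

    for (size_t i = 0; i < count; i++) {
        n = snprintf(out + used, size - used, " %s -> %s;",
                     edges[i].from, edges[i].to);
        if (n < 0 || (size_t)n >= size - used)
            return -1;
        used += (size_t)n;
    }

    n = snprintf(out + used, size - used, " }");
    if (n < 0 || (size_t)n >= size - used)
        return -1;
    return 0;
}
```
\end_layout
\begin_layout Standard
The domain knowledge, that a connects to b and b connects to c, occupies one initializer line in the caller, while the rest is implementation detail that a domain expert would have to read around.
\end_layout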
\end_deeper
\begin_layout Standard
\begin_inset Separator parbreak
\end_inset
\end_layout
\begin_layout Example
Linux is full of applications controlled by domain-specific languages whose
files are placed in the
\family typewriter
/etc
\family default
directory, such as a web server.
Instead of reprogramming the software, a domain-specific language is made
for it.
\end_layout
\begin_layout Standard
\noindent
\align left
\begin_inset CommandInset line
LatexCommand rule
offset "4ex"
width "100col%"
height "1.5pt"
\end_inset
\end_layout
\begin_layout Standard
In general, code that can express a problem domain must be understandable
by a domain expert.
Even within the software domain, building a language out of repeated programming
patterns is useful.
It makes people aware of the existence of such patterns in code and thus makes
software easier to maintain, as software structure is visible as a language.
Only a programming language that is capable of morphing itself to suit
a problem domain can achieve that goal.
Such a language is called a
\emph on
programmable programming language
\emph default
.
Unfortunately, this approach of making software structure visible is not
favored among programmers, as a new language must be made for it, along
with a new toolchain to support it.
Thus, software structure and domain knowledge are buried within code written
in the syntax of a general-purpose language, and if a programmer is not
familiar with or even aware of the existence of a code pattern, then it is hopeless
to understand the code.
A prime example is reading C code that controls hardware, e.g.
an operating system: if a programmer knows absolutely nothing about hardware,
then it is impossible to read and write operating system code in C, even
with 20 years of experience writing application code in C.
\end_layout
\begin_layout Standard
With abstraction, a software engineer can also understand the inner workings
of a device without specialized knowledge of physical circuit design, which enables
the software engineer to write code that controls a device.
The separation between logical and physical implementation also entails
that gate designs can be reused even when the underlying technologies change.
For example, in some distant future biological computers could be a reality,
and gates might be implemented not as CMOS but as some kind of biological
device, e.g.
as living cells; in either technology, electrical or biological, as long
as logic gates are physically realized, the same computer design could
be implemented.
\end_layout
\begin_layout Chapter
Computer Architecture
\end_layout
\begin_layout Standard
To write lower-level code, a programmer must understand the architecture
of a computer.
It is similar to writing programs in a software framework: one must
know what kinds of problems the framework solves, and how to use the framework
through its provided software interfaces.
But before getting to the definition of computer architecture,
we must understand what exactly a computer is, as many people still think
that a computer is the regular machine we put on a desk or, at best, a server.
Computers come in various shapes and sizes, including devices that people
never imagine are computers and that code can run on.
\end_layout
\begin_layout Section
What is a computer?
\end_layout
\begin_layout Standard
A
\emph on
\begin_inset Marginal
status open
\begin_layout Plain Layout
\series bold
\emph on
computer
\end_layout
\end_inset
computer
\begin_inset Index idx
status open
\begin_layout Plain Layout
computer
\end_layout
\end_inset
\emph default
is a hardware device that consists of at least a processor (CPU), a memory
device and input/output interfaces.
All computers can be grouped into two types:
\end_layout
\begin_layout Description
Single-purpose
\begin_inset space ~
\end_inset
computer is a computer built at the
\emph on
hardware level
\emph default
for specific tasks.
Examples are dedicated application encoders/decoders, timers, and image/video/sound
processors.
\end_layout
\begin_layout Description
General-purpose
\begin_inset space ~
\end_inset
computer is a computer that can be programmed (without modifying its hardware)
to emulate various features of single-purpose computers.
\end_layout
\begin_layout Subsection
Server
\end_layout
\begin_layout Standard
A
\emph on
\begin_inset Index idx
status open
\begin_layout Plain Layout
server
\end_layout
\end_inset
\begin_inset Marginal
status open
\begin_layout Plain Layout
\series bold
\emph on
server
\end_layout
\end_inset
server
\emph default
is a general-purpose high-performance computer with huge resources to provide
large-scale services for a broad audience.
The audience consists of people whose personal computers are connected to the server.
\end_layout
\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
Blade servers.
Each blade server is a computer with a modular design optimized for the
use of physical space and energy.
The enclosure of blade servers is called a
\emph on
chassis
\emph default
. (Source:
\begin_inset CommandInset href
LatexCommand href
name "Wikimedia"
target "https://commons.wikimedia.org/wiki/File:Wikimedia_Foundation_Servers-8055_35.jpg"
literal "false"
\end_inset
, author: Victorgrigas)
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset space \hfill{}
\end_inset
\begin_inset Graphics
filename images/03/Wikimedia_Foundation_Servers-8055_35.jpg
scale 80
\end_inset
\begin_inset space \hfill{}
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Subsection
Desktop Computer
\end_layout
\begin_layout Standard
A
\emph on
\begin_inset Marginal
status open
\begin_layout Plain Layout
\series bold
\emph on
desktop computer
\end_layout
\end_inset
desktop computer
\begin_inset Index idx
status collapsed
\begin_layout Plain Layout
desktop computer
\end_layout
\end_inset
\emph default
is a general-purpose computer with an input and output system designed
for a human user, with moderate resources sufficient for everyday use.
The input system usually includes a mouse and a keyboard, while the output
system usually consists of a monitor that can display a large number of
pixels.
The computer is enclosed in a chassis large enough to hold various
computer components such as a processor, a motherboard, a power supply,
a hard drive, etc.
\end_layout
\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
A typical desktop computer.
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset space \hfill{}
\end_inset
\begin_inset Graphics
filename images/03/computer-158675.svg
scale 50
\end_inset
\begin_inset space \hfill{}
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Subsection
Mobile Computer
\end_layout
\begin_layout Standard
A
\emph on
\begin_inset Index idx
status open
\begin_layout Plain Layout
mobile computer
\end_layout
\end_inset
\begin_inset Marginal
status open
\begin_layout Plain Layout
\series bold
\emph on
mobile computer
\end_layout
\end_inset
mobile computer
\emph default
is similar to a desktop computer but has fewer resources and can be carried
around.
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
begin{figure}
\end_layout
\begin_layout Plain Layout
\backslash
caption{Mobile computers}
\end_layout
\begin_layout Plain Layout
\backslash
subfloat[A laptop]{
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics{images/03/macbook}}
\backslash
hfill{}
\end_layout
\begin_layout Plain Layout
\backslash
subfloat[A tablet]{
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics{images/03/tablet}}
\backslash
hfill{}
\end_layout
\begin_layout Plain Layout
\backslash
subfloat[A mobile phone]{
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics{images/03/mobile_phone}}
\end_layout
\begin_layout Plain Layout
\backslash
end{figure}
\end_layout
\end_inset
\end_layout
\begin_layout Subsection
Game Consoles
\end_layout
\begin_layout Standard
Game consoles are similar to desktop computers but are optimized for gaming.
Instead of a keyboard and a mouse, the input system of a game console consists of
game controllers, which are devices with a few buttons for controlling
on-screen objects; the output system is a television.
The chassis is similar to that of a desktop computer but is smaller.
Game consoles use custom processors and graphics processors, but these are similar
to the ones in desktop computers.
For example, the first Xbox uses a custom Intel Pentium III processor.
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
begin{figure*}
\end_layout
\begin_layout Plain Layout
\backslash
caption{Current-gen Game Consoles}
\end_layout
\begin_layout Plain Layout
\backslash
subfloat[A PlayStation 4]{
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics{images/03/PS4-Console-wDS4}}
\backslash
hfill{}
\end_layout
\begin_layout Plain Layout
\backslash
subfloat[An Xbox One]{
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics{images/03/Xbox_One_Console_Set}}
\backslash
hfill{}
\end_layout
\begin_layout Plain Layout
\backslash
subfloat[A Wii U]{
\end_layout
\begin_layout Plain Layout
\backslash
hfill{}
\backslash
includegraphics[scale=0.7]{images/03/Wii_U_Console_and_Gamepad}}
\backslash
hfill{}
\end_layout
\begin_layout Plain Layout
\backslash
hfill
\backslash
break
\end_layout
\begin_layout Plain Layout
\backslash
end{figure*}
\end_layout
\end_inset
\end_layout
\begin_layout Standard
Handheld game consoles are similar to game consoles, but incorporate both
the input and output systems along with the computer in a single package.
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
begin{figure}
\end_layout
\begin_layout Plain Layout
\backslash
caption{Some Handheld Consoles}
\end_layout
\begin_layout Plain Layout
\backslash
subfloat[A Nintendo DS]{
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics{images/03/256px-Nintendo-DS-Lite-w-stylus}}
\backslash
hfill{}
\end_layout
\begin_layout Plain Layout
\backslash
subfloat[A PS Vita]{
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics{images/03/PlayStation-Vita}}
\backslash
hfill{}
\end_layout
\begin_layout Plain Layout
\backslash
end{figure}
\end_layout
\end_inset
\end_layout
\begin_layout Subsection
Embedded Computer
\end_layout
\begin_layout Standard
An
\emph on
\begin_inset Marginal
status open
\begin_layout Plain Layout
\series bold
\emph on
embedded computer
\end_layout
\end_inset
embedded computer
\begin_inset Index idx
status open
\begin_layout Plain Layout
embedded computer
\end_layout
\end_inset
\emph default
is a single-board or single-chip computer with limited resources designed
for integrating into larger hardware devices.
\begin_inset Float marginfigure
wide false
sideways false
status collapsed
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
An Intel 82815 Graphics and Memory Controller Hub embedded on a PC motherboard.
(Source:
\begin_inset CommandInset href
LatexCommand href
name "Wikimedia"
target "https://commons.wikimedia.org/wiki/File:Intel_82815_GMCH.jpg"
literal "false"
\end_inset
, author: Qurren)
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset space \hfill{}
\end_inset
\begin_inset Graphics
filename images/03/Intel_82815_GMCH.jpg
scale 50
\end_inset
\begin_inset space \hfill{}
\end_inset
\end_layout
\end_inset
\begin_inset Float marginfigure
wide false
sideways false
status collapsed
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
A PIC microcontroller.
(Source:
\begin_inset CommandInset href
LatexCommand href
name "Microchip"
target "http://www.microchip.com/wwwproducts/en/PIC18F4620"
literal "false"
\end_inset
)
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset space \hfill{}
\end_inset
\begin_inset Graphics
filename images/03/medium-PIC18F4620-PDIP-40.png
scale 50
\end_inset
\begin_inset space \hfill{}
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
A
\emph on
\begin_inset Marginal
status open
\begin_layout Plain Layout
\series bold
\emph on
microcontroller
\end_layout
\end_inset
microcontroller
\begin_inset Index idx
status open
\begin_layout Plain Layout
Microcontroller
\end_layout
\end_inset
\emph default
is an embedded computer designed for controlling other hardware devices.
A microcontroller is mounted on a chip.
Microcontrollers are general-purpose computers, but their resources are so limited
that each is only able to perform one or a few specialized tasks.
These computers are used for a single purpose, but they are still general-purpose
since it is possible to program them to perform different tasks, depending
on the requirements, without changing the underlying hardware.
\end_layout
\begin_layout Standard
Another type of embedded computer is
\emph on
system-on-chip
\emph default
.
A
\emph on
system-on-chip
\begin_inset Index idx
status open
\begin_layout Plain Layout
system-on-chip
\end_layout
\end_inset
\emph default
is a full computer on a single chip.
Though a microcontroller is also housed on a chip, its purpose is different:
to control some hardware.
A microcontroller is usually simpler and more limited in hardware resources
as it specializes in only one purpose when running, whereas a system-on-chip
is a general-purpose computer that can serve multiple purposes.
A system-on-chip can run like a regular desktop computer: it is capable
of loading an operating system and running various applications.
A system-on-chip is typically found in a smartphone or tablet, such as the Apple A5 SoC
used in the iPad 2 and iPhone 4S, or the Qualcomm Snapdragon used in many Android
phones.
\begin_inset Float marginfigure
wide false
sideways false
status collapsed
\begin_layout Plain Layout
\end_layout
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
Apple A5 SoC
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset space \hfill{}
\end_inset
\begin_inset Graphics
filename images/03/128px-Apple_A5_Chip.jpg
\end_inset
\begin_inset space \hfill{}
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
Be it a microcontroller or a system-on-chip, there must be an environment
where these devices can connect to other devices.
This environment is a circuit board called a
\emph on
\begin_inset Index idx
status collapsed
\begin_layout Plain Layout
PCB
\end_layout
\end_inset
PCB
\emph default
–
\emph on
\begin_inset Index idx
status collapsed
\begin_layout Plain Layout
Printed Circuit Board
\end_layout
\end_inset
\series bold
P
\series default
rinted
\series bold
C
\series default
ircuit
\series bold
B
\series default
oard.
\emph default
A
\emph on
printed circuit board
\begin_inset Index idx
status collapsed
\begin_layout Plain Layout
Printed Circuit Board
\end_layout
\end_inset
\emph default
is a physical board that contains conductive lines and pads that let current flow
between electrical and electronic components.
Without a PCB, devices cannot be combined to create a larger device.
As long as these devices are hidden inside a larger device and contribute
to its operation at a higher-level layer for a higher-level
purpose, they are embedded devices.
Writing a program for an embedded device is therefore called
\emph on
\begin_inset Index idx
status open
\begin_layout Plain Layout
embedded programming
\end_layout
\end_inset
embedded programming
\emph default
.
Embedded computers are used in automatically controlled devices including
power tools, toys, implantable medical devices, office machines, engine
control systems, appliances, remote controls and other types of embedded
systems.
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
begin{figure*}
\end_layout
\begin_layout Plain Layout
\backslash
caption{Raspberry Pi B+ Rev 1.2, a single-board computer that includes both
a system-on-chip and a microcontroller.}
\end_layout
\begin_layout Plain Layout
\backslash
subfloat[Functional View.
\end_layout
\begin_layout Plain Layout
\backslash
newline The SoC is a Broadcom BCM2835.
\end_layout
\begin_layout Plain Layout
\backslash
newline The microcontroller is the Ethernet Controller LAN9514.
\end_layout
\begin_layout Plain Layout
\backslash
newline (Source:
\backslash
protect
\backslash
href{https://commons.wikimedia.org/wiki/File:Raspberry_Pi_B
\backslash
%2B_rev_1.2.svg}{Wikimedia}, author: Efa2)]{
\end_layout
\begin_layout Plain Layout
\backslash
hfill{}
\backslash
includegraphics[scale=1.1]{images/03/Raspberry_Pi_B}
\backslash
hfill{}}
\backslash
hfill{}
\end_layout
\begin_layout Plain Layout
\backslash
subfloat[Physical View]{
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics[scale=0.075]{images/03/Raspberry_Pi_2_Model_B}}
\end_layout
\begin_layout Plain Layout
\backslash
end{figure*}
\end_layout
\end_inset
\end_layout
\begin_layout Standard
The line between a microcontroller and a system-on-chip is blurry.
If hardware keeps becoming more powerful, a microcontroller can get
enough resources to run a minimal operating system for multiple specialized
purposes.
Conversely, a system-on-chip is powerful enough to handle the job of a
microcontroller.
However, using a system-on-chip as a microcontroller would not be a wise
choice: the price rises significantly, and hardware resources are wasted
since the software written for a microcontroller requires little computing
power.
\end_layout
\begin_layout Subsection
Field Programmable Gate Array
\end_layout
\begin_layout Standard
\emph on
\begin_inset Marginal
status open
\begin_layout Plain Layout
\series bold
\emph on
Field Programmable Gate Array
\end_layout
\end_inset
Field Programmable Gate Array
\begin_inset Index idx
status collapsed
\begin_layout Plain Layout
Field Programmable Gate Array
\end_layout
\end_inset
\emph default
(
\emph on
\begin_inset Index idx
status collapsed
\begin_layout Plain Layout
FPGA
\end_layout
\end_inset
FPGA
\emph default
) is a hardware device containing an array of reconfigurable gates, which makes its
circuit structure programmable after it has been shipped from the factory
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
This is why it is called
\series bold
\emph on
Field
\series default
\emph default
Programmable Gate Array.
It is changeable
\begin_inset Quotes eld
\end_inset
in the field
\begin_inset Quotes erd
\end_inset
where it is applied.
\end_layout
\end_inset
.
Recall that in the previous chapter, each 74HC00 chip can be configured
as a gate, and a more sophisticated device can be built by combining multiple
74HC00 chips.
In a similar manner, each FPGA device contains thousands of units called
\emph on
logic blocks
\emph default
, each of which is more complicated than a 74HC00 chip and can be configured
to implement a Boolean logic function.
These logic blocks can be chained together to create a high-level hardware
feature.
This high-level feature is usually a dedicated algorithm that needs high-speed
processing.
\end_layout
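\begin_layout Standard
To make the idea of a configurable logic block concrete, here is a minimal
sketch in C.
It assumes, purely for illustration, that a logic block has two inputs and
stores its configuration as a four-entry truth table; the names logic_block,
evaluate and the CFG_ constants are hypothetical, and real logic blocks are
more elaborate.
\end_layout

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative model only: a 2-input logic block whose behaviour is
 * set by a 4-bit truth table. Bit i holds the output for the input
 * pair (a, b), where i = a * 2 + b. */
struct logic_block {
    uint8_t truth_table;
};

/* Example configurations (hypothetical names). */
enum {
    CFG_AND  = 0x8, /* output 1 only for a=1, b=1 */
    CFG_OR   = 0xE, /* output 1 unless a=0, b=0   */
    CFG_XOR  = 0x6, /* output 1 when a != b       */
    CFG_NAND = 0x7  /* output 0 only for a=1, b=1 */
};

/* Look up the output for the given inputs in the configured table. */
int evaluate(const struct logic_block *blk, int a, int b)
{
    assert((a == 0 || a == 1) && (b == 0 || b == 1));
    int index = a * 2 + b;
    return (blk->truth_table >> index) & 1;
}
```

\begin_layout Standard
Reconfiguring the block means nothing more than storing a different truth
table: the same evaluate function then behaves as a different gate.
\end_layout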
\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
FPGA Architecture (Source:
\begin_inset CommandInset href
LatexCommand href
name "National Instruments"
target "http://www.ni.com/tutorial/6097/en/"
literal "false"
\end_inset
)
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset space \hfill{}
\end_inset
\begin_inset Graphics
filename images/03/fpga_400x212.jpg
scale 80
\end_inset
\begin_inset space \hfill{}
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
Digital devices can be designed by combining logic gates, without regard to
the actual circuit components, since the physical circuits are just multiples
of CMOS circuits.
Digital hardware, including various components in a computer, is designed
by writing code, much as a regular programmer does, in a language that describes
how gates are wired together.
This language is called a
\emph on
Hardware Description Language
\begin_inset Index idx
status open
\begin_layout Plain Layout
Hardware Description Language
\end_layout
\end_inset
\emph default
.
Later the hardware description is compiled to a description of connected
electronic components called a
\emph on
\begin_inset Index idx
status open
\begin_layout Plain Layout
netlist
\end_layout
\end_inset
netlist
\emph default
, which is a more detailed description of how gates are connected.
\end_layout
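\begin_layout Standard
A netlist can be pictured as a plain table of gates and the wires (nets)
that connect them.
The following C sketch is only an illustration of that idea, not the format
used by any real tool: it assumes every gate is a NAND gate, and the names
nand_gate, xor_netlist and evaluate_netlist are made up for this example.
The table wires four NAND gates into an XOR function.
\end_layout

```c
#include <assert.h>
#include <stddef.h>

enum { NUM_INPUTS = 2, NUM_GATES = 4, NUM_NETS = NUM_INPUTS + NUM_GATES };

/* One netlist entry: a NAND gate that reads two nets.
 * Gate number g drives net number NUM_INPUTS + g. */
struct nand_gate {
    int in_a;
    int in_b;
};

/* Nets 0 and 1 are the inputs a and b; net 5 is the XOR output. */
const struct nand_gate xor_netlist[NUM_GATES] = {
    { 0, 1 }, /* net 2 = NAND(a, b)         */
    { 0, 2 }, /* net 3 = NAND(a, net 2)     */
    { 1, 2 }, /* net 4 = NAND(b, net 2)     */
    { 3, 4 }, /* net 5 = NAND(net 3, net 4) */
};

/* Evaluate the gates in order and return the value of the last net. */
int evaluate_netlist(const struct nand_gate *gates, int a, int b)
{
    int nets[NUM_NETS];

    assert(gates != NULL);
    nets[0] = a;
    nets[1] = b;
    for (int g = 0; g < NUM_GATES; g++) {
        int out = NUM_INPUTS + g;
        /* A gate may only read nets that already have a value. */
        assert(gates[g].in_a < out && gates[g].in_b < out);
        nets[out] = !(nets[gates[g].in_a] && nets[gates[g].in_b]);
    }
    return nets[NUM_NETS - 1];
}
```

\begin_layout Standard
Changing the circuit means changing only the table of connections; the
evaluation loop stays the same.
\end_layout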
\begin_layout Standard
The difference between an FPGA and other embedded computers is that programs
in an FPGA are implemented at the digital logic level, while programs in embedded
computers like microcontrollers or system-on-chip devices are implemented
at the assembly code level.
An algorithm written for an FPGA device is a description of the algorithm
in logic gates; the FPGA device follows this description to configure
itself to run the algorithm.
An algorithm written for a microcontroller consists of assembly instructions
that a processor can understand and act on.
\end_layout
\begin_layout Standard
FPGAs are applied in cases where the specialized operations are unsuitable
or too costly to run on a regular computer, such as real-time medical image
processing, cruise control systems, circuit prototyping, video encoding/decoding,
etc.
These applications require high-speed processing that is not achievable
with a regular processor, because a processor spends a significant amount
of time executing many non-specialized instructions - which might add
up to thousands of instructions or more - to implement one specialized operation,
and thus needs more circuits at the physical level to carry out the same operation.
An FPGA device carries no such overhead; instead, it runs a single specialized
operation implemented directly in hardware.
\end_layout
\begin_layout Subsection
Application-Specific Integrated Circuit
\end_layout
\begin_layout Standard
An
\emph on
\begin_inset Index idx
status collapsed
\begin_layout Plain Layout
Application-Specific Integrated Circuit
\end_layout
\end_inset
\series bold
A
\series default
pplication-
\series bold
S
\series default
pecific
\series bold
I
\series default
ntegrated
\series bold
C
\series default
ircuit
\emph default
(or
\emph on
ASIC
\series bold
\begin_inset Index idx
status collapsed
\begin_layout Plain Layout
ASIC
\end_layout
\end_inset
\series default
\emph default
) is a chip designed for a particular purpose rather than for general-purpose
use.
An ASIC does not contain a generic array of logic blocks that can be reconfigured
to adapt to any operation like an FPGA; instead, every logic block in an
ASIC is made and optimized for the circuit itself.
An FPGA can be considered the prototyping stage of an ASIC, and an ASIC
the final stage of circuit production.
An ASIC is even more specialized than an FPGA, so it can achieve even higher
performance.
However, ASICs are very costly to manufacture, and once the circuits are
made, a design error means that everything is thrown away, unlike FPGA
devices, which can simply be reprogrammed because of the generic gate array.
\end_layout
\begin_layout Section
Computer Architecture
\end_layout
\begin_layout Standard
The previous section examined various classes of computers.
Regardless of shape and size, every computer is designed according to an architecture,
from the high level down to the low level.
\end_layout
\begin_layout Standard
\begin_inset Formula
\[
Computer\,Architecture=Instruction\,Set\,Architecture+Computer\,Organization+Hardware
\]
\end_inset
\end_layout
\begin_layout Standard
At the highest level is the Instruction Set Architecture.
\end_layout
\begin_layout Standard
At the middle level is the Computer Organization.
\end_layout
\begin_layout Standard
At the lowest level is the Hardware.
\end_layout
\begin_layout Subsection
Instruction Set Architecture
\end_layout
\begin_layout Standard
An
\emph on
instruction set
\begin_inset Index idx
status open
\begin_layout Plain Layout
instruction set
\end_layout
\end_inset
\emph default
is the basic set of commands and instructions that a microprocessor understands
and can carry out.
\end_layout
\begin_layout Standard
An
\series bold
\emph on
I
\series default
nstruction
\series bold
S
\series default
et
\series bold
A
\series default
rchitecture
\emph default
\begin_inset Index idx
status open
\begin_layout Plain Layout
Instruction Set Architecture
\end_layout
\end_inset
, or
\emph on
\begin_inset Index idx
status open
\begin_layout Plain Layout
ISA
\end_layout
\end_inset
\series bold
ISA
\series default
\emph default
, is the design of an environment that implements an instruction set.
Essentially, it is a runtime environment similar to the interpreters of high-level
languages.
The design includes all the instructions, registers, interrupts, memory
models (how memory is arranged for use by programs), addressing modes,
I/O, etc., of a CPU.
The more features (e.g.
more instructions) a CPU has, the more circuits are required to implement
it.
\end_layout
\begin_layout Subsection
Computer organization
\end_layout
\begin_layout Standard
\emph on
\begin_inset Marginal
status open
\begin_layout Plain Layout
\series bold
\emph on
Computer organization
\end_layout
\end_inset
Computer organization
\begin_inset Index idx
status open
\begin_layout Plain Layout
Computer organization
\end_layout
\end_inset
\emph default
is the functional view of the design of a computer.
In this view, the hardware components of a computer are presented as boxes
with inputs and outputs that connect to each other and form the design of
the computer.
Two computers may have the same ISA but different organizations.
For example, both AMD and Intel processors implement the x86 ISA, but the hardware
components of each processor that make up the environment for the ISA
are not the same.
\end_layout
\begin_layout Standard
Computer organizations may vary depending on a manufacturer's design, but they
all originate from the Von Neumann architecture
\begin_inset Foot
status collapsed
\begin_layout Plain Layout
\emph on
John von Neumann
\emph default
was a mathematician and physicist who described the computer architecture
that now bears his name.
\end_layout
\end_inset
:
\end_layout
\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
Von Neumann Architecture
\end_layout
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset space \space{}
\end_inset
\begin_inset Graphics
filename images/03/von_neumann_architecture.pdf
scale 50
\end_inset
\begin_inset space \space{}
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Description
CPU
\emph on
\begin_inset Index idx
status open
\begin_layout Plain Layout
CPU
\end_layout
\end_inset
\emph default
continuously fetches instructions from main memory and executes them.
\end_layout
\begin_layout Description
Memory
\begin_inset Index idx
status open
\begin_layout Plain Layout
Memory
\end_layout
\end_inset
stores program code and data.
\end_layout
\begin_layout Description
Bus
\begin_inset Index idx
status open
\begin_layout Plain Layout
Bus
\end_layout
\end_inset
is a set of electrical wires for sending raw bits between the other components.
\end_layout
\begin_layout Description
I/O
\begin_inset space ~
\end_inset
Devices
\begin_inset Index idx
status open
\begin_layout Plain Layout
I/O Devices
\end_layout
\end_inset
are devices that give input to a computer, e.g.
a keyboard, mouse or sensor, and devices that take output from a computer, e.g.
a monitor takes information sent from the CPU and displays it, and an LED turns on or off
according to a pattern computed by the CPU.
\end_layout
\begin_layout Standard
A Von Neumann computer operates by storing its instructions in main memory;
the CPU repeatedly fetches those instructions into its internal storage
and executes them, one after another.
Data is transferred through a data bus between the CPU, memory and I/O devices,
and the location in a device where the data is read or written is sent by
the CPU through the address bus.
This architecture completely implements the
\emph on
\begin_inset Index idx
status open
\begin_layout Plain Layout
f@fetch – decode – execute
\end_layout
\end_inset
fetch – decode – execute
\emph default
cycle.
\end_layout
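\begin_layout Standard
The fetch, decode and execute steps can be sketched as a loop in C.
The machine below is entirely made up for illustration: it has a 16-byte
memory, one accumulator register, and four hypothetical two-byte instructions
(an opcode followed by an operand address).
None of the names (cpu, run, OP_LOAD and so on) correspond to a real ISA.
\end_layout

```c
#include <assert.h>
#include <stdint.h>

enum { MEM_SIZE = 16 };

/* Hypothetical opcodes; each instruction is an opcode byte
 * followed by one operand byte (a memory address). */
enum { OP_HALT = 0, OP_LOAD = 1, OP_ADD = 2, OP_STORE = 3 };

struct cpu {
    uint8_t pc;  /* address of the next instruction */
    uint8_t acc; /* accumulator register */
};

/* Run until HALT (or an unknown opcode, which this sketch also
 * treats as a halt). Returns the number of instructions fetched. */
int run(struct cpu *cpu, uint8_t mem[MEM_SIZE])
{
    int executed = 0;

    for (;;) {
        /* Fetch: read the opcode and operand from memory. */
        assert(cpu->pc + 1 < MEM_SIZE);
        uint8_t opcode = mem[cpu->pc];
        uint8_t operand = mem[cpu->pc + 1];
        assert(operand < MEM_SIZE);
        cpu->pc += 2;
        executed++;

        /* Decode the opcode, then execute the matching action. */
        switch (opcode) {
        case OP_LOAD:  cpu->acc = mem[operand];            break;
        case OP_ADD:   cpu->acc = cpu->acc + mem[operand]; break;
        case OP_STORE: mem[operand] = cpu->acc;            break;
        case OP_HALT:
        default:       return executed;
        }
    }
}
```

\begin_layout Standard
Both the program and the numbers it works on live in the same mem array,
which is the defining property of the stored-program design described above.
\end_layout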
\begin_layout Standard
Early computers were exact implementations of the Von Neumann
architecture, with the CPU, memory and I/O devices communicating through the
same bus.
Today, a computer has more buses, each specialized in one type of traffic.
However, at the core, they are still Von Neumann machines.
To write an OS for a Von Neumann computer, a programmer needs to be able
to understand and write code that controls the core components: CPU, memory,
I/O devices, and buses.
\end_layout
\begin_layout Standard
\emph on
\begin_inset Index idx
status open
\begin_layout Plain Layout
CPU
\end_layout
\end_inset
\series bold
CPU
\series default
\emph default
, or
\series bold
\emph on
C
\series default
entral
\series bold
P
\series default
rocessing
\series bold
U
\series default
nit
\emph default
\begin_inset Index idx
status open
\begin_layout Plain Layout
Central Processing Unit
\end_layout
\end_inset
, is the heart and brain of any computer system.
Understanding the CPU is essential to writing an OS from scratch:
\end_layout
\begin_layout Itemize
To use other devices, a programmer needs to control the CPU so that it uses the
programming interfaces of those devices.
The CPU is the only way, as it is the only device a programmer can
use directly and the only device that understands code written by a programmer.
\end_layout
\begin_layout Itemize
In a CPU, many OS concepts are already implemented directly in hardware,
e.g.
task switching, paging.
A kernel programmer needs to know how to use these hardware features to
avoid duplicating such concepts in software and thus wasting computer resources.
\end_layout
\begin_layout Itemize
CPU built-in OS features boost both OS performance and developer productivity
because those features are implemented in actual hardware, the lowest possible level,
and developers are freed from implementing them in software.
\end_layout
\begin_layout Itemize
To effectively use the CPU, a programmer needs to understand the documentation
provided by the CPU manufacturer.
For example,
\begin_inset CommandInset href
LatexCommand href
name "Intel® 64 and IA-32 Architectures Software Developer Manuals"
target "http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html"
literal "false"
\end_inset
.
\end_layout
\begin_layout Itemize
After understanding one CPU architecture well, it is easier to learn other
CPU architectures.
\end_layout
\begin_layout Standard
A CPU is an implementation of an ISA, effectively the implementation of
an assembly language (which varies depending on the CPU architecture).
Assembly language is one of the interfaces provided for software
engineers to control a CPU, and thus to control a computer.
But how can every device in a computer be controlled with access only to
the CPU? The simple answer is that a CPU can communicate with other devices,
and thus command them, through these two interfaces:
\end_layout
\begin_layout Description
\emph on
Registers
\emph default
\emph on
\begin_inset Index idx
status open
\begin_layout Plain Layout
Registers
\end_layout
\end_inset
\begin_inset Marginal
status collapsed
\begin_layout Plain Layout
\series bold
\emph on
Registers
\end_layout
\end_inset
\emph default
are a hardware component for high-speed data access and communication with
other hardware devices.
Registers allow software to control hardware directly by writing to registers
of a device, or receive information from a hardware device when reading from
registers of a device.
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_deeper
\begin_layout Standard
Not all registers are used for communication with other devices.
In a CPU, most registers are used as high-speed storage for temporary data.
Other devices that a CPU can communicate with always have a set of registers
for interfacing with the CPU.
\end_layout
\end_deeper
\begin_layout Description
\emph on
Port
\emph default
\emph on
\begin_inset Index idx
status open
\begin_layout Plain Layout
Port
\end_layout
\end_inset
\begin_inset Marginal
status collapsed
\begin_layout Plain Layout
\series bold
\emph on
Port
\end_layout
\end_inset
\emph default
is a specialized register in a hardware device used for communication with
other devices.
When data is written to a port, it causes a hardware device to perform
some operation according to values written to the port.
The difference between a port and a register is that a port does not store
data, but passes the data on to some other circuit.
\end_layout
\begin_layout Standard
These two interfaces are extremely important, as they are the only interfaces
for controlling hardware with software.
Writing device drivers is essentially learning the functionality of each
register and how to use them properly to control the device.
\end_layout
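\begin_layout Standard
The difference between a register and a port can be illustrated with a small
simulation in C.
The device below is hypothetical (an imaginary LED controller), and all names
such as led_device, write_status_reg and write_command_port are assumptions
made for this sketch; real devices document their own registers and ports
in their datasheets.
\end_layout

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical LED controller used only for illustration. */
struct led_device {
    uint8_t status_reg; /* register: stores the current LED pattern */
    int     toggles;    /* internal circuit state, changed via the port */
};

/* Made-up command values accepted by the command port. */
enum { CMD_TOGGLE_ALL = 0x01, CMD_CLEAR = 0x02 };

/* A register stores what is written to it and returns it on a read. */
void write_status_reg(struct led_device *dev, uint8_t value)
{
    assert(dev != NULL);
    dev->status_reg = value;
}

uint8_t read_status_reg(const struct led_device *dev)
{
    assert(dev != NULL);
    return dev->status_reg;
}

/* A port keeps nothing for itself: the written value is passed on to
 * another circuit, which performs an operation based on it. */
void write_command_port(struct led_device *dev, uint8_t command)
{
    assert(dev != NULL);
    switch (command) {
    case CMD_TOGGLE_ALL:
        dev->status_reg = (uint8_t)~dev->status_reg;
        dev->toggles++;
        break;
    case CMD_CLEAR:
        dev->status_reg = 0;
        break;
    default:
        break; /* unknown commands are ignored in this sketch */
    }
}
```

\begin_layout Standard
A driver for this imaginary device would consist of little more than calls
to these three functions in the right order.
\end_layout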
\begin_layout Standard
\emph on
\begin_inset Marginal
status open
\begin_layout Plain Layout
\series bold
\emph on
Memory
\end_layout
\end_inset
Memory
\begin_inset Index idx
status open
\begin_layout Plain Layout
Memory
\end_layout
\end_inset
\emph default
is a storage device that stores information.
Memory consists of many cells.
Each cell is a byte with its own address number, so a CPU can use that address
number to access an exact location in memory.
Memory is where software instructions (in the form of machine language)
are stored and retrieved to be executed by the CPU; memory also stores the data
needed by software.
Memory in a Von Neumann machine does not distinguish between bytes that
are data and bytes that are software instructions.
It is up to the software to decide; if data bytes are somehow fetched
and executed as instructions, the CPU still executes them as long as the bytes represent
valid instructions, but the results will be undesirable.
To a CPU, there is no code and data; both are merely different types of
data for it to act on: one tells it how to do something in a specific manner,
and the other is the material it needs to carry out that action.
\end_layout
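\begin_layout Standard
The point that memory itself does not separate code from data can be shown
with a few lines of C.
In this sketch, memory is an array of bytes indexed by address, and the two
opcodes are invented for the example; the function names read_as_data and
read_as_instruction are likewise hypothetical.
The same cell is read twice, once as a number and once as an instruction.
\end_layout

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { MEM_SIZE = 8 };

/* Opcodes of a made-up machine, for illustration only. */
enum { OP_HALT = 0x00, OP_INC = 0x01 };

/* Read the cell at `addr` as plain data. */
uint8_t read_as_data(const uint8_t *mem, size_t addr)
{
    assert(addr < MEM_SIZE);
    return mem[addr];
}

/* Read the very same cell as an instruction and return its name. */
const char *read_as_instruction(const uint8_t *mem, size_t addr)
{
    assert(addr < MEM_SIZE);
    switch (mem[addr]) {
    case OP_HALT: return "HALT";
    case OP_INC:  return "INC";
    default:      return "invalid";
    }
}
```

\begin_layout Standard
Nothing in the array marks a cell as one kind or the other; only the way
the byte is used decides its meaning.
\end_layout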
\begin_layout Standard
The RAM is controlled by a device called a
\emph on
memory controller
\begin_inset Index idx
status open
\begin_layout Plain Layout
memory controller
\end_layout
\end_inset
\emph default
.
Currently, most processors have this device embedded, so the CPU has a
dedicated memory bus connecting the processor to the RAM.
On older CPUs
\begin_inset Foot
status open
\begin_layout Plain Layout
CPUs produced prior to 2009
\end_layout
\end_inset
, however, this device was located in a separate chip known as the
\series bold
MCH
\series default
or
\series bold
\emph on
M
\series default
emory
\series bold
C
\series default
ontroller
\series bold
H
\series default
ub
\begin_inset Index idx
status open
\begin_layout Plain Layout
Memory Controller Hub
\end_layout
\end_inset
\emph default
.
In this case, the CPU does not communicate directly with the RAM, but with
the MCH chip, and this chip then accesses the memory to read or write data.
The integrated design provides better performance since there is no middleman
in the communication between the CPU and the memory.
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
begin{figure*}
\end_layout
\begin_layout Plain Layout
\backslash
caption{CPU - Memory Communication}
\end_layout
\begin_layout Plain Layout
\backslash
subfloat[Old CPU]{
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics[scale=0.4]{images/03/cpu_chipset_memory}}
\backslash
hfill{}
\end_layout
\begin_layout Plain Layout
\backslash
subfloat[Modern CPU]{
\end_layout
\begin_layout Plain Layout
\backslash
includegraphics[scale=0.4]{images/03/cpu_memory_chipset}}
\end_layout
\begin_layout Plain Layout
\backslash
hfill
\backslash
break
\end_layout
\begin_layout Plain Layout
\backslash
end{figure*}
\end_layout
\end_inset
\end_layout
\begin_layout Standard
At the physical level, RAM is implemented as a grid of cells that each contain
a transistor and an electrical device called a
\emph on
\begin_inset Marginal
status open
\begin_layout Plain Layout
\series bold
\emph on
capacitor
\end_layout
\end_inset
capacitor
\emph default
\begin_inset Index idx
status open
\begin_layout Plain Layout
capacitor
\end_layout
\end_inset
, which stores charge for short periods of time.
The transistor controls access to the capacitor; when switched on, it allows
a small charge to be read from or written to the capacitor.
The charge on the capacitor slowly dissipates, requiring the inclusion
of a refresh circuit to periodically read values from the cells and write
them back after amplification from an external power source.
\end_layout
\begin_layout Standard
\emph on
\begin_inset Index idx
status open
\begin_layout Plain Layout
Bus
\end_layout
\end_inset
\begin_inset Marginal
status open
\begin_layout Plain Layout
\series bold
\emph on
Bus
\end_layout
\end_inset
Bus
\emph default
is a subsystem that transfers data between computer components or between
computers.
Physically, buses are just electrical wires that connect all components
together, and each wire transfers a single bit of data at a time.
The total number of wires is called
\emph on
\begin_inset Index idx
status open
\begin_layout Plain Layout
bus width
\end_layout
\end_inset
\begin_inset Marginal
status open
\begin_layout Plain Layout
\series bold
\emph on
bus width
\end_layout
\end_inset
bus width
\emph default
, and depends on how many wires a CPU can support.
If a CPU can only accept 16 bits at a time, then the bus has 16 wires connecting
a component to the CPU, which means the CPU can only retrieve 16
bits of data at a time.
\end_layout
\begin_layout Subsection
Hardware
\end_layout
\begin_layout Standard
Hardware is a specific implementation of a computer.
A line of processors implements the same instruction set architecture and
uses nearly identical organizations, but its models differ in hardware
implementation.
For example, the Core i7 family provides a model for desktop computers
that is more powerful but consumes more energy, while another model for
laptops is less performant but more energy efficient.
To write software for a hardware device, we seldom need to understand its
hardware implementation if documents are available.
Computer organization and especially the instruction set architecture are
more relevant to an operating system programmer.
For that reason, the next chapter is devoted to studying the x86 instruction
set architecture in depth.
\end_layout
\begin_layout Section
x86 architecture
\end_layout
\begin_layout Standard
A
\emph on
\begin_inset Index idx
status open
\begin_layout Plain Layout
chipset
\end_layout
\end_inset
chipset
\emph default
is a chip with multiple functions.
Historically, a chipset was actually a set of individual chips, each
responsible for one function, e.g.
memory controller, graphics controller, network controller, power controller,
etc.
As hardware progressed, the set of chips was incorporated into a single
chip, which is more space, energy, and cost efficient.
In a desktop computer, various hardware devices are connected to each other
through a PCB called a
\emph on
\begin_inset Index idx
status open
\begin_layout Plain Layout
motherboard
\end_layout
\end_inset
motherboard
\emph default
.
Each CPU needs a compatible motherboard that can host it.
Each motherboard is defined by its chipset model, which determines the
environment that a CPU can control.
This environment typically consists of
\end_layout
\begin_layout Itemize
one or more slots for the CPU
\end_layout
\begin_layout Itemize
a chipset of two chips which are the Northbridge and Southbridge chips
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_deeper
\begin_layout Itemize
The Northbridge chip is responsible for the high-performance communication
between the CPU, main memory and the graphics card.
\end_layout
\begin_layout Itemize
Southbridge chip is responsible for the communication with I/O devices and
other devices that are not performance sensitive.
\end_layout
\end_deeper
\begin_layout Itemize
slots for memory sticks
\end_layout
\begin_layout Itemize
one or more slots for graphics cards.
\end_layout
\begin_layout Itemize
generic slots for other devices, e.g.
network card, sound card.
\end_layout
\begin_layout Itemize
ports for I/O devices, e.g.
keyboard, mouse, USB.
\end_layout
\begin_layout Standard
\begin_inset Float figure
wide false
sideways false
status open
\begin_layout Plain Layout
\begin_inset Caption Standard
\begin_layout Plain Layout
Motherboard organization.
\end_layout
\end_inset
\begin_inset CommandInset label
LatexCommand label
name "mobo-organization"
\end_inset
\end_layout
\begin_layout Plain Layout
\begin_inset Graphics
filename images/03/Motherboard_diagram.svg
\end_inset
\end_layout
\end_inset
\end_layout
\begin_layout Standard
To write a complete operating system, a programmer needs to understand how
to program these devices.
After all, an operating system manages hardware automatically to free application
programs from doing so.
However, of all the components, learning to program the CPU is the most
important, as it is the component present in any computer, regardless of
the type of computer.
For this reason, the primary focus of this book will be on how to program
an x86 CPU.
Even when focusing solely on this device, a reasonably good minimal operating
system can be written.
The reason is that not all computers include all the devices as in a normal
desktop computer.
For example, an embedded computer might only have a CPU and limited internal
memory, with pins for getting input and producing an output; yet, operating
systems were written for such devices.
\end_layout
\begin_layout Standard
However, learning how to program an x86 CPU is a daunting task, with 3 primary
manuals written for it: almost 500 pages for volume 1, over 2000 pages
for volume 2 and over 1000 pages for volume 3.
It is an impressive feat for a programmer to master every aspect of x86
CPU programming.
\end_layout
\begin_layout Section
Intel Q35 Chipset
\end_layout
\begin_layout Standard
Q35 is an Intel chipset released in September 2007.
Q35 is used as an example of a high-level computer organization because
later we will use QEMU to emulate a Q35 system, which is the latest Intel
system that QEMU can emulate.
Though released in 2007, Q35 is still relatively close to current hardware,
and the knowledge can be reused for current chipset models.
With a Q35 chipset, the emulated CPU is also relatively up-to-date with
features present in current-day CPUs, so we can use the latest software
manuals from Intel.
\end_layout
\begin_layout Standard
Figure
\begin_inset CommandInset ref
LatexCommand vref
reference "mobo-organization"
\end_inset
shows a typical current-day motherboard organization, which Q35 largely
shares.
\end_layout
\begin_layout Section
x86 Execution Environment
\end_layout
\begin_layout Standard
An
\emph on
execution environment
\begin_inset Index idx
status open
\begin_layout Plain Layout
execution environment
\end_layout
\end_inset
\emph default
is an environment that provides the facility to make code executable.
The execution environment needs to address the following questions:
\end_layout
\begin_layout Itemize
\begin_inset Flex Noun
status open
\begin_layout Plain Layout
Supported
\begin_inset space ~
\end_inset
operations?
\end_layout
\end_inset
data transfer, arithmetic, control, floating-point, etc.
\end_layout
\begin_layout Itemize
\begin_inset Flex Noun
status open
\begin_layout Plain Layout
Where are operands stored?
\end_layout
\end_inset
registers, memory, stack, accumulator
\end_layout
\begin_layout Itemize
\begin_inset Flex Noun
status open
\begin_layout Plain Layout
How many explicit operands are there for each instruction?
\end_layout
\end_inset
0, 1, 2, or 3
\end_layout
\begin_layout Itemize
\begin_inset Flex Noun
status open
\begin_layout Plain Layout
How is the operand location specified?
\end_layout
\end_inset
register, immediate, indirect, etc.
\end_layout
\begin_layout Itemize
\begin_inset Flex Noun
status open
\begin_layout Plain Layout
What type and size of operands are supported?
\end_layout
\end_inset
byte, int, float, double, string, vector, etc.
\end_layout
\begin_layout Itemize
\begin_inset Flex Noun
status open
\begin_layout Plain Layout
etc.
\end_layout
\end_inset
\end_layout
\begin_layout Standard
For the remainder of this chapter, please continue by reading chapter 3
of Intel Manual Volume 1,
\emph on
\begin_inset Quotes eld
\end_inset
Basic Execution Environment
\begin_inset Quotes erd
\end_inset
\emph default
.
\end_layout
\begin_layout Chapter
x86 Assembly and C
\end_layout
\begin_layout Standard
In this chapter, we will explore assembly language, and how it connects
to C.
But why should we do so? Isn't it better to trust the compiler, plus no
one writes assembly anymore?
\end_layout
\begin_layout Standard
Not quite.
Surely, the compiler at its current state of the art is trustworthy, and
we do not need to write code in assembly,
\emph on
most of the time
\emph default
.
A compiler can generate code, but as mentioned previously, a high-level
language is a collection of patterns of a lower-level language.
It does not cover everything that a hardware platform provides.
As a consequence, not every assembly instruction can be generated by a
compiler, so we still need to write assembly code for these circumstances
to access hardware-specific features.
Since hardware-specific features require writing assembly code, debugging
requires reading it.
We might spend even more time reading than writing.
Working with low-level code that interacts directly with hardware, assembly
code is unavoidable.
Also, understanding how a compiler generates assembly code can improve a
programmer's productivity.
For example, if a job or school assignment requires us to write assembly
code, we can simply write it in C, then let
\family typewriter
gcc
\family default
do the hard work of writing the assembly code for us.
We merely collect the generated assembly code, modify it as needed and are
done with the assignment.
\end_layout
\begin_layout Standard
We will use
\family typewriter
objdump
\family default
extensively, along with how to use Intel documents to aid in understanding
x86 assembly code.
\end_layout
\begin_layout Section
objdump
\end_layout
\begin_layout Standard
\begin_inset Flex Code
status open
\begin_layout Plain Layout
objdump
\emph on
\begin_inset Index idx
status open
\begin_layout Plain Layout
objdump
\end_layout
\end_inset
\end_layout
\end_inset
is a program that displays information about object files.
It will be handy later to debug incorrect layout from manual linking.
Now, we use
\begin_inset Flex Code
status open
\begin_layout Plain Layout
objdump
\end_layout
\end_inset
to examine how high-level source code maps to assembly code.
For now, we ignore the output and learn how to use the command first.
Suppose that we have an executable binary named
\begin_inset Flex Code
status open
\begin_layout Plain Layout
hello
\end_layout
\end_inset
compiled from a
\begin_inset Flex Code
status open
\begin_layout Plain Layout
hello.c
\end_layout
\end_inset
that prints
\begin_inset Quotes eld
\end_inset
Hello World
\begin_inset Quotes erd
\end_inset
, it is simple to use
\begin_inset Flex Code
status open
\begin_layout Plain Layout
objdump
\end_layout
\end_inset
:
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
begin{tcolorbox}[enlarge top initially by=5mm]
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\family typewriter
$ objdump -d hello
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
end{tcolorbox}
\end_layout
\end_inset
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_layout Standard
\noindent
\align left
\begin_inset Flex Code
status open
\begin_layout Plain Layout
-d
\end_layout
\end_inset
option displays the disassembled contents of executable sections only.
A
\emph on
\begin_inset Index idx
status open
\begin_layout Plain Layout
section
\end_layout
\end_inset
section
\emph default
is a block of memory that contains either program code or data.
A code section is executable by the CPU, while a data section is not executable.
Non-executable sections, such as
\begin_inset Flex Code
status open
\begin_layout Plain Layout
.data
\end_layout
\end_inset
and
\begin_inset Flex Code
status open
\begin_layout Plain Layout
.bss
\end_layout
\end_inset
(for storing program data), debug sections, etc., are not displayed.
We will learn more about sections when studying the ELF binary file format in
chapter
\begin_inset CommandInset ref
LatexCommand vref
reference "chap:The-Anatomy-of-a-program"
\end_inset
.
On the other hand:
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
begin{tcolorbox}[enlarge top initially by=5mm]
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\family typewriter
$ objdump -D hello
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
end{tcolorbox}
\end_layout
\end_inset
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_layout Standard
\noindent
\align left
where
\begin_inset Flex Code
status open
\begin_layout Plain Layout
-D
\end_layout
\end_inset
option displays the disassembled contents of all sections.
If
\begin_inset Flex Code
status open
\begin_layout Plain Layout
-D
\end_layout
\end_inset
is given,
\begin_inset Flex Code
status open
\begin_layout Plain Layout
-d
\end_layout
\end_inset
is implicitly assumed.
\begin_inset Flex Code
status open
\begin_layout Plain Layout
objdump
\end_layout
\end_inset
is mostly used for inspecting assembly code, so
\begin_inset Flex Code
status open
\begin_layout Plain Layout
-d
\end_layout
\end_inset
is the option used most often.
\end_layout
\begin_layout Standard
The output overruns the terminal screen.
To make it easier to read, pipe all the output to
\begin_inset Flex Code
status open
\begin_layout Plain Layout
less
\end_layout
\end_inset
:
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
begin{tcolorbox}[enlarge top initially by=5mm]
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\family typewriter
$ objdump -d hello | less
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
end{tcolorbox}
\end_layout
\end_inset
\end_layout
\begin_layout Standard
To intermix source code and assembly, the binary must be compiled with
\begin_inset Flex Code
status open
\begin_layout Plain Layout
-g
\end_layout
\end_inset
option to include debugging information in it, then add the
\begin_inset Flex Code
status open
\begin_layout Plain Layout
-S
\end_layout
\end_inset
option:
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
begin{tcolorbox}[enlarge top initially by=5mm]
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\family typewriter
$ objdump -S hello | less
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
end{tcolorbox}
\end_layout
\end_inset
\end_layout
\begin_layout Standard
The default syntax used by
\begin_inset Flex Code
status open
\begin_layout Plain Layout
objdump
\end_layout
\end_inset
is AT&T syntax.
To change it to the familiar Intel syntax:
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
begin{tcolorbox}[enlarge top initially by=5mm]
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\family typewriter
$ objdump -M intel -D hello | less
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
end{tcolorbox}
\end_layout
\end_inset
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_layout Standard
\noindent
\align left
When using the
\family typewriter
-M
\family default
option, either
\family typewriter
-D
\family default
or
\family typewriter
-d
\family default
must be explicitly supplied.
Next, we will use
\begin_inset Flex Code
status open
\begin_layout Plain Layout
objdump
\end_layout
\end_inset
to examine how compiled C data and code are represented in machine code.
\end_layout
\begin_layout Standard
Finally, since we will write a 32-bit kernel, we will need to compile
a 32-bit binary and examine it in 32-bit mode:
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
begin{tcolorbox}[enlarge top initially by=5mm]
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\family typewriter
$ objdump -M i386,intel -D hello | less
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
end{tcolorbox}
\end_layout
\end_inset
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_layout Standard
\noindent
\align left
\begin_inset Flex Code
status open
\begin_layout Plain Layout
-M i386
\end_layout
\end_inset
tells objdump to display assembly content using 32-bit layout.
Knowing the difference between 32-bit and 64-bit is crucial for writing
kernel code.
We will examine this matter later on when writing our kernel.
\end_layout
\begin_layout Section
Reading the output
\end_layout
\begin_layout Standard
The start of the output displays the file format of the object file:
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_layout LyX-Code
\noindent
\align left
hello: file format elf64-x86-64
\end_layout
\begin_layout Standard
After that line comes a series of disassembled sections:
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_layout LyX-Code
\noindent
\align left
Disassembly of section .interp:
\end_layout
\begin_layout LyX-Code
...
\end_layout
\begin_layout LyX-Code
Disassembly of section .note.ABI-tag:
\end_layout
\begin_layout LyX-Code
...
\end_layout
\begin_layout LyX-Code
Disassembly of section .note.gnu.build-id:
\end_layout
\begin_layout LyX-Code
...
\end_layout
\begin_layout LyX-Code
...
\end_layout
\begin_layout LyX-Code
etc
\end_layout
\begin_layout Standard
Finally, each disassembled section displays its actual content - which is
a sequence of assembly instructions - with the following format:
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_layout LyX-Code
\noindent
\align left
\color red
4004d6
\color inherit
:
\color blue
55
\color inherit
\color green
push rbp
\end_layout
\begin_layout Itemize
The
\color red
first column
\color inherit
is the address of an assembly instruction.
In the above example, the address is
\family typewriter
0x4004d6
\family default
.
\end_layout
\begin_layout Itemize
The
\color blue
second column
\color inherit
is the assembly instruction in raw hex values, i.e. its machine code.
In the above example, the value is
\family typewriter
0x55
\family default
.
\end_layout
\begin_layout Itemize
The
\color green
third column
\color inherit
is the assembly instruction.
Depending on the section, the assembly instruction might be meaningful or
meaningless.
For example, if the assembly instructions are in a
\family typewriter
.text
\family default
section, then the assembly instructions are actual program code.
On the other hand, if the assembly instructions are displayed in a
\family typewriter
.data
\family default
section, then we can safely ignore the displayed instructions.
The reason is that
\begin_inset Flex Code
status open
\begin_layout Plain Layout
objdump
\end_layout
\end_inset
doesn't know which hex values are code and which are data, so it blindly
translates every hex value into assembly instructions.
In the above example, the assembly instruction is
\family typewriter
push rbp
\family default
.
\end_layout
\begin_layout Itemize
The optional fourth column is a comment, which appears when there is a reference
to an address, to show what that address refers to.
For example, the comment in
\color blue
blue
\color inherit
:
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_deeper
\begin_layout Full Width
\family typewriter
\begin_inset space ~
\end_inset
\begin_inset space ~
\end_inset
\begin_inset space ~
\end_inset
\begin_inset space ~
\end_inset
lea r12,
\color red
[rip+0x2008ee]
\color inherit
\color blue
# 600e10 <__frame_dummy_init_array_entry>
\end_layout
\begin_layout Standard
informs us that the address referenced by
\family typewriter
\color red
[rip+0x2008ee]
\family default
\color inherit
is
\family typewriter
\color blue
0x600e10
\family default
\color inherit
, where the variable
\family typewriter
__frame_dummy_init_array_entry
\family default
resides.
\end_layout
\end_deeper
\begin_layout Standard
A disassembled section may also contain
\emph on
labels
\emph default
.
A label is a name given to an assembly instruction.
The label denotes the purpose of an assembly block to a human reader, to
make it easier to understand.
For example,
\family typewriter
.text
\family default
section carries many such labels to denote where pieces of code in a program
start; the
\family typewriter
.text
\family default
section below carries two functions:
\family typewriter
\color red
_start
\family default
\color inherit
and
\family typewriter
\color red
deregister_tm_clones
\family default
\color inherit
.
The
\family typewriter
\color red
_start
\family default
\color inherit
function starts at address
\family typewriter
\color blue
4003e0
\family default
\color inherit
, which is annotated to the left of the function name.
Right below the
\family typewriter
\color red
_start
\color inherit
\family default
label is the first instruction, also at address
\family typewriter
\color blue
4003e0
\family default
\color inherit
.
This means that a label is simply a name for a memory address.
The function
\family typewriter
deregister_tm_clones
\family default
also shares the same format as every function in the section.
\end_layout
\begin_layout LyX-Code
0000000000
\color blue
4003e0
\color inherit
\color red
<_start>
\color inherit
:
\end_layout
\begin_layout LyX-Code
\color blue
4003e0
\color inherit
: 31 ed xor ebp,ebp
\end_layout
\begin_layout LyX-Code
4003e2: 49 89 d1 mov r9,rdx
\end_layout
\begin_layout LyX-Code
4003e5: 5e pop rsi
\end_layout
\begin_layout LyX-Code
...more assembly code....
\end_layout
\begin_layout LyX-Code
\end_layout
\begin_layout LyX-Code
0000000000
\color blue
400410
\color inherit
\color red
<deregister_tm_clones>
\color inherit
:
\end_layout
\begin_layout LyX-Code
\color blue
400410
\color inherit
: b8 3f 10 60 00 mov eax,0x60103f
\end_layout
\begin_layout LyX-Code
400415: 55 push rbp
\end_layout
\begin_layout LyX-Code
400416: 48 2d 38 10 60 00 sub rax,0x601038
\end_layout
\begin_layout LyX-Code
...more assembly code....
\end_layout
\begin_layout Section
Intel manuals
\end_layout
\begin_layout Standard
The best way to understand and use assembly language properly is to understand
precisely the underlying computer architecture and what each machine instruction
does.
The most reliable sources for this are the documents provided by
vendors.
After all, hardware vendors are the ones who made the machines.
To understand Intel's instruction set, we need the document
\begin_inset Quotes eld
\end_inset
\emph on
Intel 64 and IA-32 architectures software developer's manual combined volumes
2A, 2B, 2C, and 2D: Instruction set reference, A-Z
\emph default
\begin_inset Quotes erd
\end_inset
.
The document can be retrieved here:
\begin_inset Flex URL
status open
\begin_layout Plain Layout
https://software.intel.com/en-us/articles/intel-sdm
\end_layout
\end_inset
.
\end_layout
\begin_layout Itemize
Chapter 1 provides brief information about the manual, and the comment notations
used in the book.
\end_layout
\begin_layout Itemize
Chapter 2 provides an in-depth explanation of the anatomy of an assembly
instruction, which we will investigate in the next section.
\end_layout
\begin_layout Itemize
Chapters 3 to 5 provide the details of every instruction of the x86_64
architecture.
\end_layout
\begin_layout Itemize
Chapter 6 provides information about safer mode extensions.
We won't need to use this chapter.
\end_layout
\begin_layout Standard
The first volume
\begin_inset Quotes eld
\end_inset
\emph on
Intel® 64 and IA-32 Architectures Software Developer’s Manual Volume 1:
Basic Architecture
\emph default
\begin_inset Quotes erd
\end_inset
describes the basic architecture and programming environment of Intel
processors.
In the book, Chapter 5 gives a summary of all Intel instructions, by
grouping the instructions into different categories.
We only need to learn the general-purpose instructions listed in
\emph on
chapter 5.1
\emph default
for our OS.
\emph on
Chapter 7
\emph default
describes the purpose of each category.
Gradually, we will learn all of these instructions.
\end_layout
\begin_layout Exercise
Read section 1.3 in volume 2, excluding sections 1.3.5 and 1.3.7.
\end_layout
\begin_layout Section
Experiment with assembly code
\end_layout
\begin_layout Standard
The subsequent sections examine the anatomy of an assembly instruction.
To fully understand it, it is necessary to write code and see the code in
its actual form displayed as hex numbers.
For this purpose, we use
\family typewriter
nasm
\family default
assembler to write a few lines of assembly code and see the generated code.
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_layout Full Width
\noindent
\align left
\begin_inset CommandInset line
LatexCommand rule
offset "-4ex"
width "100col%"
height "1.5pt"
\end_inset
\end_layout
\begin_layout Example
Suppose we want to see the machine code generated for this instruction:
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_deeper
\begin_layout LyX-Code
jmp eax
\end_layout
\begin_layout Standard
Then, we open an editor, e.g.
Emacs, create a new file, write the code and save it in a file, e.g.
\family typewriter
test.asm
\family default
.
Then, in the terminal, run the command:
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
begin{tcolorbox}[enlarge top initially by=5mm]
\end_layout
\end_inset
\end_layout
\begin_layout Standard
\family typewriter
$ nasm -f bin test.asm -o test
\end_layout
\begin_layout Standard
\begin_inset ERT
status open
\begin_layout Plain Layout
\backslash
end{tcolorbox}
\end_layout
\end_inset
\begin_inset Separator latexpar
\end_inset
\end_layout
\begin_layout S
gitextract_3fx0xerz/
├── .gitignore
├── CHANGELOG.md
├── README.md
├── _config.yml
├── book_src/
│ ├── Operating Systems From 0 to 1.lyx
│ ├── Operating Systems From 0 to 1.txt
│ ├── images/
│ │ ├── .rid
│ │ ├── 02/
│ │ │ └── layer_translation.graphml
│ │ ├── 03/
│ │ │ └── .rid
│ │ ├── 04/
│ │ │ ├── .rid
│ │ │ └── modrm32.tex
│ │ ├── 05/
│ │ │ └── .rid
│ │ ├── 06/
│ │ │ └── .rid
│ │ ├── 07/
│ │ │ └── .rid
│ │ └── 08/
│ │ └── .rid
│ └── references.bib
└── code/
├── chapter7/
│ └── os/
│ ├── .gdbinit
│ ├── Makefile
│ ├── bootloader/
│ │ ├── Makefile
│ │ ├── bootloader.asm
│ │ └── debug.elf
│ ├── build/
│ │ ├── bootloader/
│ │ │ └── bootloader.o
│ │ └── os/
│ │ └── sample.o
│ ├── disk.img
│ └── os/
│ ├── Makefile
│ └── sample.asm
└── chapter8/
└── os/
├── .gdbinit
├── Makefile
├── bootloader/
│ ├── Makefile
│ ├── bootloader.asm
│ ├── bootloader.dbg
│ └── bootloader.lds
├── build/
│ ├── bootloader/
│ │ ├── bootloader.elf
│ │ ├── bootloader.o
│ │ └── bootloader.o.elf
│ ├── disk.img
│ ├── main.o
│ └── os/
│ ├── main.o
│ ├── os
│ ├── os.debug
│ └── sample.o
├── disk.img
└── os/
├── Makefile
├── main
├── main.c
├── os
├── os.lds
└── sample.asm
SYMBOL INDEX (1 symbols across 1 files)
FILE: code/chapter8/os/os/main.c
function main (line 1) | void main(){}
Condensed preview — 48 files, each showing path, character count, and a content snippet. Download the .json file or copy for the full structured content (2,324K chars).
[
{
"path": ".gitignore",
"chars": 13,
"preview": "*.png\n*.lyx~\n"
},
{
"path": "CHANGELOG.md",
"chars": 459,
"preview": "## 0.0.1 (2017-02-15)\n\n * [#6] Fix the figure `far_jmp_ex.svg` in chapter 4, where the segment and the offset in memory"
},
{
"path": "README.md",
"chars": 7867,
"preview": "\n[](https://www.paypal.com/cgi-bin/webscr?cmd=_donations&"
},
{
"path": "_config.yml",
"chars": 29,
"preview": "theme: jekyll-theme-architect"
},
{
"path": "book_src/Operating Systems From 0 to 1.lyx",
"chars": 1440088,
"preview": "#LyX 2.3 created this file. For more info see http://www.lyx.org/\r\n\\lyxformat 544\r\n\\begin_document\r\n\\begin_header\r\n\\save"
},
{
"path": "book_src/Operating Systems From 0 to 1.txt",
"chars": 562536,
"preview": "Operating Systems:\r\nFrom 0 to 1\r\n\r\nTu, Do Hoang\r\n\r\n\r\n\r\n\r\n\r\n\r\nTable of Contents\r\n\r\n Preface\r\n Why another book"
},
{
"path": "book_src/images/.rid",
"chars": 32,
"preview": "58f78e5b0bedbe9328c87d501e6feda8"
},
{
"path": "book_src/images/02/layer_translation.graphml",
"chars": 21011,
"preview": "<?xml version=\"1.0\" encoding=\"UTF-8\" standalone=\"no\"?>\r\n<graphml xmlns=\"http://graphml.graphdrawing.org/xmlns\" xmlns:jav"
},
{
"path": "book_src/images/03/.rid",
"chars": 32,
"preview": "f5ac4f7ffcf0c9cdf703fb078663fd4e"
},
{
"path": "book_src/images/04/.rid",
"chars": 32,
"preview": "49527604b893e9c9a27465cd47b4ab70"
},
{
"path": "book_src/images/04/modrm32.tex",
"chars": 13905,
"preview": "\\begin{tabular}{|>{\\raggedright}p{4cm}|>{\\raggedright}p{1cm}|>{\\raggedright}p{1cm}|>{\\raggedright}p{1cm}|>{\\raggedright}"
},
{
"path": "book_src/images/05/.rid",
"chars": 32,
"preview": "a9ba32dde86cd02374a576cbb347870e"
},
{
"path": "book_src/images/06/.rid",
"chars": 32,
"preview": "978a2a6b43366794c5decbda95bf1e08"
},
{
"path": "book_src/images/07/.rid",
"chars": 32,
"preview": "58310471881fcc80a0b152521c5de8ae"
},
{
"path": "book_src/images/08/.rid",
"chars": 32,
"preview": "1ba7fe42ddc5a44d1184e22d6c2fa796"
},
{
"path": "book_src/references.bib",
"chars": 2762,
"preview": "% This file was created with JabRef 2.10.\r\n% Encoding: UTF8\r\n\r\n\r\n@InBook{Hardy,\r\n Title = {A Mathema"
},
{
"path": "code/chapter7/os/.gdbinit",
"chars": 260,
"preview": "define hook-stop\n # Translate the segment:offset into a physical address\n printf \"[%4x:%4x] \", $cs, $eip\nend\nset a"
},
{
"path": "code/chapter7/os/Makefile",
"chars": 542,
"preview": "BUILD_DIR=build\nBOOTLOADER=$(BUILD_DIR)/bootloader/bootloader.o\nOS=$(BUILD_DIR)/os/sample.o\nDISK_IMG=disk.img\n\nall: boot"
},
{
"path": "code/chapter7/os/bootloader/Makefile",
"chars": 242,
"preview": "BUILD_DIR=../build/bootloader\n\nBOOTLOADER_SRCS := $(wildcard *.asm)\nBOOTLOADER_OBJS := $(patsubst %.asm, $(BUILD_DIR)/%."
},
{
"path": "code/chapter7/os/bootloader/bootloader.asm",
"chars": 969,
"preview": ";******************************************\n; Bootloader.asm\n; A Simple Bootloader\n;************************************"
},
{
"path": "code/chapter7/os/os/Makefile",
"chars": 202,
"preview": "BUILD_DIR=../build/os\n\nOS_SRCS := $(wildcard *.asm)\nOS_OBJS := $(patsubst %.asm, $(BUILD_DIR)/%.o, $(OS_SRCS))\n\nall: $(O"
},
{
"path": "code/chapter7/os/os/sample.asm",
"chars": 144,
"preview": ";******************************************\n; sample.asm\t\t\n; A Sample Program\n;*****************************************"
},
{
"path": "code/chapter8/os/.gdbinit",
"chars": 260,
"preview": "define hook-stop\n # Translate the segment:offset into a physical address\n printf \"[%4x:%4x] \", $cs, $eip\nend\nset a"
},
{
"path": "code/chapter8/os/Makefile",
"chars": 691,
"preview": "BUILD_DIR=build\nBOOTLOADER=$(BUILD_DIR)/bootloader/bootloader.o\nOS=$(BUILD_DIR)/os/os\nDISK_IMG=disk.img\n\nall: bootdisk\n\n"
},
{
"path": "code/chapter8/os/bootloader/Makefile",
"chars": 354,
"preview": "BUILD_DIR=../build/bootloader\n\nBOOTLOADER_SRCS := $(wildcard *.asm)\nBOOTLOADER_OBJS := $(patsubst %.asm, $(BUILD_DIR)/%."
},
{
"path": "code/chapter8/os/bootloader/bootloader.asm",
"chars": 970,
"preview": ";******************************************\n; Bootloader.asm\n; A Simple Bootloader\n;************************************"
},
{
"path": "code/chapter8/os/bootloader/bootloader.lds",
"chars": 205,
"preview": "OUTPUT(bootloader);\n\nPHDRS\n{\n headers PT_NULL;\n text PT_LOAD FILEHDR PHDRS ;\n data PT_LOAD ;\n}\n\n\nSECTIONS\n{\n . = SIZ"
},
{
"path": "code/chapter8/os/os/Makefile",
"chars": 410,
"preview": "BUILD_DIR=../build/os\nOS=$(BUILD_DIR)/os\n\nCFLAGS+=-ffreestanding -nostdlib -m32 -gdwarf-4 -ggdb3\n\nOS_SRCS := $(wildcard"
},
{
"path": "code/chapter8/os/os/main.c",
"chars": 14,
"preview": "void main(){}\n"
},
{
"path": "code/chapter8/os/os/os.lds",
"chars": 214,
"preview": "ENTRY(main);\n\nPHDRS\n{\n headers PT_PHDR FILEHDR PHDRS;\n code PT_LOAD;\n}\n\nSECTIONS\n{\n .text 0x600: ALIGN(0x100) { *(.t"
},
{
"path": "code/chapter8/os/os/sample.asm",
"chars": 144,
"preview": ";******************************************\n; sample.asm\t\t\n; A Sample Program\n;*****************************************"
}
]
// ... and 17 more files