Repository: WebAssembly/design
Branch: main
Commit: 6e43416ebc8b
Files: 27
Total size: 123.4 KB
Directory structure:
gitextract_zxmj6d79/
├── .gitignore
├── BinaryEncoding.md
├── CAndC++.md
├── CodeOfConduct.md
├── Contributing.md
├── DynamicLinking.md
├── Events.md
├── FAQ.md
├── FeatureTest.md
├── FutureFeatures.md
├── HighLevelGoals.md
├── JITLibrary.md
├── JS.md
├── LICENSE
├── MVP.md
├── Modules.md
├── NonWeb.md
├── Nondeterminism.md
├── Portability.md
├── README.md
├── Rationale.md
├── Security.md
├── Semantics.md
├── TextFormat.md
├── Tooling.md
├── UseCases.md
└── Web.md
================================================
FILE CONTENTS
================================================
================================================
FILE: .gitignore
================================================
*~
*#
.#*
.*.swp
out
================================================
FILE: BinaryEncoding.md
================================================
# Binary Encoding
This file historically contained the description of WebAssembly's binary encoding.
For the current description, see the [normative documentation](http://webassembly.github.io/spec/core/binary/index.html).
================================================
FILE: CAndC++.md
================================================
# Guide for C/C++ developers
WebAssembly is being designed to support C and C++ code well, right from
the start in [the MVP](MVP.md). The following explains the outlook for
C and C++ developers.
## Porting C and C++ code to WebAssembly
### Platform features
WebAssembly has a pretty conventional ISA: 8-bit bytes, two's complement
integers, little-endian, and a lot of other normal properties. Reasonably
portable C/C++ code should port to WebAssembly without difficulty.
WebAssembly has 32-bit and 64-bit architecture variants, called wasm32 and
wasm64. wasm32 has an ILP32 data model, meaning that `int`, `long`, and
pointer types are all 32-bit, while the `long long` type is 64-bit. wasm64
has an LP64 data model, meaning that `long` and pointer types will be
64-bit, while `int` is 32-bit.
[The MVP](MVP.md) will support only wasm32; support for wasm64 will be
added in the future to support
[64-bit address spaces :unicorn:][future 64-bit].
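The two data models can be told apart from the widths of `int`, `long`, and pointers alone. The following is a minimal sketch of that classification; the enum and function names are made up for illustration, and `current_data_model` reports the model of whatever target the file is compiled for, not necessarily wasm32.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical labels for the data models named in the text. */
enum data_model { MODEL_ILP32, MODEL_LP64, MODEL_OTHER };

/* Classify a data model from the byte widths of int, long, and pointers. */
enum data_model classify_data_model(size_t int_bytes, size_t long_bytes,
                                    size_t pointer_bytes) {
    assert(int_bytes > 0 && long_bytes > 0 && pointer_bytes > 0);
    if (int_bytes == 4 && long_bytes == 4 && pointer_bytes == 4)
        return MODEL_ILP32;   /* wasm32: int, long, pointers all 32-bit */
    if (int_bytes == 4 && long_bytes == 8 && pointer_bytes == 8)
        return MODEL_LP64;    /* wasm64: long and pointers 64-bit */
    return MODEL_OTHER;
}

/* Data model of the target this translation unit is compiled for. */
enum data_model current_data_model(void) {
    return classify_data_model(sizeof(int), sizeof(long), sizeof(void *));
}
```

Code that stores pointers in `long` or assumes `sizeof(long) == 8` is the kind most likely to behave differently between the two variants.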
`float` and `double` are the IEEE 754-2019 single- and double-precision types,
which are native in WebAssembly. `long double` is the IEEE 754-2019
quad-precision type. WebAssembly does not have a builtin quad-precision type
or associated operators, so `long double` is software-emulated in library code
linked into WebAssembly applications that need it.
For performance and compatibility with other platforms, `float` and
`double` are recommended for most uses.
### Language Support
C and C++ language conformance is largely determined by individual compiler
support, but WebAssembly includes all the functionality that popular C and C++
compilers need to support high-quality implementations.
While [the MVP](MVP.md) will be fully functional, additional features enabling
greater performance will be added soon after, including:
* Support for [multi-threaded execution with shared memory][future threads].
* [Zero-cost C++ exception handling][future exceptions].
C++ exceptions can be implemented without this, but this feature will
enable them to have lower runtime overhead.
* Support for [128-bit SIMD][future simd]. SIMD will be
exposed to C/C++ through explicit APIs such as [LLVM's vector extensions]
and [GCC's vector extensions], auto-vectorization, and emulated APIs from
other platforms such as `<xmmintrin.h>`.
[LLVM's vector extensions]: https://clang.llvm.org/docs/LanguageExtensions.html#vectors-and-extended-vectors
[GCC's vector extensions]: https://gcc.gnu.org/onlinedocs/gcc/Vector-Extensions.html
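As a rough illustration of the explicit-API route, the sketch below uses the `vector_size` attribute that both GCC and Clang accept. The type and function names are hypothetical, and whether the operations lower to 128-bit SIMD instructions depends on the compiler and target.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Four 32-bit floats in one 128-bit vector (GCC/Clang vector extension). */
typedef float f32x4 __attribute__((vector_size(16)));

/* Load four floats from an ordinary array; src need not be 16-byte aligned. */
f32x4 load_f32x4(const float *src) {
    assert(src != NULL);
    f32x4 v;
    memcpy(&v, src, sizeof v);
    return v;
}

/* Element-wise a * b + c; ordinary operators apply lane by lane. */
f32x4 mul_add(f32x4 a, f32x4 b, f32x4 c) {
    return a * b + c;
}

/* Sum the four lanes of v into one scalar. */
float horizontal_sum(f32x4 v) {
    return v[0] + v[1] + v[2] + v[3];
}
```

Because the arithmetic is written with plain operators, the same source also compiles for targets without SIMD support, where the compiler falls back to scalar code.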
### APIs
WebAssembly applications can use high-level C/C++ APIs such as the C
and C++ standard libraries, OpenGL, SDL, pthreads, and others, just as
in normal C/C++ development. Under the covers, these libraries
implement their functionality by using low-level facilities provided by
WebAssembly implementations. On [the Web](Web.md), they utilize
Web APIs (for example, OpenGL is executed on WebGL, libc date and
time methods use the browser's Date functionality, etc.).
[In other contexts](NonWeb.md), other low-level mechanisms may be used.
### ABIs
In [the MVP](MVP.md), WebAssembly does not yet have a stable ABI for
libraries. Developers will need to ensure that all code linked into an
application is compiled with the same compiler and options.
In the future, when WebAssembly is extended to support
[dynamic linking](DynamicLinking.md), stable ABIs are
expected to be defined in accompaniment.
### Undefined and Implementation-defined Behavior
#### Undefined Behavior
WebAssembly doesn't change the C or C++ languages. Things which cause
undefined behavior in C or C++ are still bugs when compiling for WebAssembly
[even when the corresponding behavior in WebAssembly itself is defined](Nondeterminism.md#note-for-users-of-c-c-and-similar-languages).
C and C++ optimizers still assume that undefined behavior won't occur,
so such bugs can still lead to surprising behavior.
For example, while unaligned memory access is
[fully defined](Semantics.md#alignment) in WebAssembly, C and C++ compilers
make no guarantee that a (non-packed) unaligned memory access at the source
level is harmlessly translated into an unaligned memory access in WebAssembly.
And in practice, popular C and C++ compilers do optimize on the assumption that
alignment rules are followed, meaning that they don't always preserve program
behavior otherwise.
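One common way to stay within the language rules is to route potentially unaligned accesses through `memcpy`, which compilers typically turn into a single load or store where the target allows it. This is a sketch with hypothetical helper names, not a recommendation taken from the WebAssembly documents themselves.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Risky form, shown only as a comment:
 *     uint32_t v = *(const uint32_t *)p;
 * This is undefined behavior in C when p is misaligned for uint32_t, even
 * though an unaligned WebAssembly load is itself fully defined. */

/* Read a 32-bit value from an address with no alignment requirement. */
uint32_t load_u32_unaligned(const void *p) {
    assert(p != NULL);
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}

/* Write a 32-bit value to an address with no alignment requirement. */
void store_u32_unaligned(void *p, uint32_t v) {
    assert(p != NULL);
    memcpy(p, &v, sizeof v);
}
```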
On WebAssembly, the primary [nondeterminism](Nondeterminism.md) and
[security](Security.md) invariants are always maintained. Demons can't actually
fly out your nose, as that would constitute an escape from the sandbox. And,
callstacks can't become corrupted.
Other than that, programs which invoke undefined behavior at the source language
level may be compiled into WebAssembly programs which do anything else,
including corrupting the contents of the application's linear memory, calling APIs with
arbitrary parameters, hanging, trapping, or consuming arbitrary amounts of
resources (within the limits).
[Tools are being developed and ported](Tooling.md) to help developers find
and fix such bugs in their code.
#### Implementation-Defined Behavior
Most implementation-defined behavior in C and C++ is dependent on the compiler
rather than on the underlying platform. For those details that are dependent
on the platform, on WebAssembly they follow naturally from having 8-bit bytes,
32-bit and 64-bit two's complement integers, and
[32-bit and 64-bit IEEE-754-2019-style floating point support](https://webassembly.github.io/spec/core/exec/numerics.html#floating-point).
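Code that depends on these platform properties can check them explicitly. The sketch below tests the properties listed above using only standard headers; the function names are hypothetical, and the checks describe the compilation target, whatever it happens to be.

```c
#include <assert.h>
#include <float.h>
#include <limits.h>
#include <stdint.h>

/* Returns 1 if the compilation target has the properties the text lists
 * for WebAssembly, and 0 otherwise. */
int matches_wasm_platform_assumptions(void) {
    int ok = 1;
    ok &= (CHAR_BIT == 8);                                /* 8-bit bytes */
    ok &= (sizeof(int32_t) == 4 && sizeof(int64_t) == 8); /* 32/64-bit ints */
    ok &= ((-1 & 3) == 3);                                /* two's complement */
    ok &= (FLT_RADIX == 2);                               /* binary floating point */
    ok &= (sizeof(float) == 4 && FLT_MANT_DIG == 24);     /* IEEE single layout */
    ok &= (sizeof(double) == 8 && DBL_MANT_DIG == 53);    /* IEEE double layout */
    return ok;
}

/* Abort early (in builds with assertions enabled) on an unexpected target. */
void require_wasm_like_platform(void) {
    assert(matches_wasm_platform_assumptions());
}
```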
## Portability of compiled code
WebAssembly can be efficiently implemented on a wide variety of platforms,
provided they can satisfy certain
[basic expectations](Portability.md#assumptions-for-efficient-execution).
WebAssembly has very limited [nondeterminism](Nondeterminism.md), so it is
expected that compiled WebAssembly programs will behave very consistently
across different implementations, and across different versions of the same
implementation.
[future 64-bit]: https://github.com/WebAssembly/memory64
[future threads]: https://github.com/WebAssembly/design/issues/1073
[future simd]: https://github.com/WebAssembly/design/issues/1075
[future exceptions]: https://github.com/WebAssembly/design/issues/1078
================================================
FILE: CodeOfConduct.md
================================================
The code of conduct has moved to https://github.com/WebAssembly/meetings/blob/main/CODE_OF_CONDUCT.md.
================================================
FILE: Contributing.md
================================================
# Contributing to WebAssembly
Interested in contributing to WebAssembly? We suggest you start by:
* Acquainting yourself with the
[Code of Ethics and Professional Conduct](CodeOfConduct.md).
* Joining the
[W3C Community Group]. Please do this before starting new proposals,
attending meetings, or making significant contributions. It provides the legal
framework that protects participants and supports standardization. And make
sure you're registered as affiliated with your company or organization in the
Community Group, if any.
* Then, you may wish to:
- [ask questions],
- discuss existing [phase-0 proposals] and [phase-1+ proposals],
- attend [meetings],
- discuss [tooling conventions],
- or make [new proposals]!
If your issue or PR has not received comments, you can ping the relevant spec
editor or [champion]. It is polite to ping no more than once a week.
Happy assembly!
[W3C Community Group]: https://www.w3.org/community/webassembly/
[Code of Ethics and Professional Conduct]: https://github.com/WebAssembly/meetings/blob/main/CODE_OF_CONDUCT.md
[meetings]: https://github.com/WebAssembly/meetings
[phase-1+ proposals]: https://github.com/WebAssembly/proposals
[phase-0 proposals]: https://github.com/WebAssembly/design/issues
[ask questions]: https://github.com/WebAssembly/design/discussions
[new proposals]: https://github.com/WebAssembly/meetings/blob/main/process/proposal.md
[tooling conventions]: https://github.com/WebAssembly/tool-conventions
[champion]: https://github.com/WebAssembly/proposals
================================================
FILE: DynamicLinking.md
================================================
# Dynamic linking
WebAssembly enables load-time and run-time (`dlopen`) dynamic linking in the
MVP by having multiple [instantiated modules](Modules.md)
share functions, [linear memories](Semantics.md#linear-memory),
[tables](Semantics.md#table) and [constants](Semantics.md#constants)
using module [imports](Modules.md#imports) and [exports](Modules.md#exports). In
particular, since all (non-local) state that a module can access can be imported
and exported and thus shared between separate modules' instances, toolchains
have the building blocks to implement dynamic loaders.
Since the manner in which modules are loaded and instantiated is defined by the
host environment (e.g., the [JavaScript API](JS.md)), dynamic linking requires
use of host-specific functionality to link two modules. At a minimum, the host
environment must provide a way to dynamically instantiate modules while
connecting exports to imports.
The simplest load-time dynamic linking scheme between modules A and B can be
achieved by having module A export functions, tables and memories that are
imported by B. A C++ toolchain can expose this functionality by using the
same function attributes currently used to export/import symbols from
native DSOs/DLLs:
```
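#ifdef _WIN32
# define EXPORT __declspec(dllexport)
# define IMPORT __declspec(dllimport)
#else
/* Elsewhere, default visibility marks a symbol as importable/exportable. */
# define EXPORT __attribute__ ((visibility ("default")))
# define IMPORT __attribute__ ((visibility ("default")))
#endif
typedef void (**PF)();             /* pointer to a function pointer */
IMPORT PF imp();                   /* imported function returning a PF */
EXPORT void exp() { (*imp())(); }  /* load the function pointer, then call it */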
```
This code would, at a minimum, generate a WebAssembly module with imports for:
* the function `imp`
* the heap used to perform the load, when dereferencing the return value of `imp`
* the table used to perform the pointer-to-function call
and exports for:
* the function `exp`
A more realistic module using libc would have more imports including:
* an immutable `i32` global import for the offset in linear memory to place
global [data segments](Modules.md#data-section) and later use as a constant
base address when loading and storing from globals
* an immutable `i32` global import for the offset into the indirect function
table at which to place the module's indirectly called functions and later
compute their indices for address-of
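The role of the memory-base import can be pictured with a small host-side simulation. Everything in this sketch is an assumption made for illustration: linear memory is a plain byte array, the `memory_base` field stands in for the immutable `i32` global import, and the module's only global lives at module-relative offset 0. The table-base import would work analogously for function table indices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { SIM_MEMORY_BYTES = 1024, COUNTER_OFFSET = 0 };

/* Simulated linear memory shared by all module instances. */
static unsigned char sim_linear_memory[SIM_MEMORY_BYTES];

/* The module's data segment: one 32-bit global with initial value 42. */
static const uint32_t data_segment[1] = { 42 };

/* A module instance as seen by the loader (hypothetical layout). */
struct module_instance {
    uint32_t memory_base;  /* stands in for the immutable i32 global import */
};

/* Loader: pick a base, then copy the data segment to that offset. */
void instantiate(struct module_instance *m, uint32_t memory_base) {
    assert((size_t)memory_base + sizeof data_segment <= (size_t)SIM_MEMORY_BYTES);
    m->memory_base = memory_base;
    memcpy(sim_linear_memory + memory_base, data_segment, sizeof data_segment);
}

/* Module code: every global access adds the constant base address. */
uint32_t load_counter(const struct module_instance *m) {
    uint32_t v;
    memcpy(&v, sim_linear_memory + m->memory_base + COUNTER_OFFSET, sizeof v);
    return v;
}

void store_counter(const struct module_instance *m, uint32_t v) {
    memcpy(sim_linear_memory + m->memory_base + COUNTER_OFFSET, &v, sizeof v);
}
```

Instantiating the same module twice at different bases gives each instance its own copy of the global while both share one linear memory.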
One extra detail is what to use as the [module name](Modules.md#imports) for
imports (since WebAssembly has a two-level namespace). One option is to have a
single default module name for all C/C++ imports/exports (which then allows the
toolchain to put implementation-internal names in a separate namespace, avoiding
the need for `__`-prefix conventions).
To implement run-time dynamic linking (e.g., `dlopen` and `dlsym`):
* `dlopen` would compile and instantiate a new module, storing the compiled
instance in a host-environment table, returning the index to the caller.
* `dlsym` would be given this index, pull the instance out of the table,
search the instance's exports, append the found function to the function
table (using host-defined functionality in the MVP, but directly from
WebAssembly code in the
[future :unicorn:][future types]) and return the
table index of the appended element to the caller.
Note that the representation of a C function-pointer in WebAssembly is an index
into a function table, so the above scheme lines up perfectly with the
function-pointer return value of `dlsym`.
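The scheme above can be sketched as a host-side simulation in plain C. All names here (`sim_dlopen`, `sim_dlsym`, `sim_call_indirect`, the table sizes) are hypothetical, a "module" is reduced to a list of named exports, and failure is reported as `-1` rather than a null pointer because index 0 is a valid table slot in this sketch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef int (*func_t)(int);

/* One named export of a simulated module instance. */
struct export_entry { const char *name; func_t func; };

/* A simulated instantiated module: just its list of exports. */
struct instance { const struct export_entry *exports; size_t export_count; };

enum { MAX_INSTANCES = 8, MAX_TABLE_ENTRIES = 32 };
static struct instance instance_table[MAX_INSTANCES];  /* host-environment table */
static size_t instance_count;
static func_t function_table[MAX_TABLE_ENTRIES];       /* indirect function table */
static size_t function_count;

/* dlopen analogue: register an instance and return its index, or -1. */
int sim_dlopen(const struct export_entry *exports, size_t export_count) {
    if (instance_count == MAX_INSTANCES) return -1;
    instance_table[instance_count].exports = exports;
    instance_table[instance_count].export_count = export_count;
    return (int)instance_count++;
}

/* dlsym analogue: find the export, append it to the function table, and
 * return the index of the appended element, or -1 if not found or full. */
int sim_dlsym(int handle, const char *name) {
    if (handle < 0 || (size_t)handle >= instance_count) return -1;
    const struct instance *inst = &instance_table[handle];
    for (size_t i = 0; i < inst->export_count; i++) {
        if (strcmp(inst->exports[i].name, name) == 0) {
            if (function_count == MAX_TABLE_ENTRIES) return -1;
            function_table[function_count] = inst->exports[i].func;
            return (int)function_count++;
        }
    }
    return -1;
}

/* call_indirect analogue: a "function pointer" is just a table index. */
int sim_call_indirect(int table_index, int arg) {
    assert(table_index >= 0 && (size_t)table_index < function_count);
    return function_table[table_index](arg);
}

/* A tiny "library" to load. */
static int twice(int x) { return 2 * x; }
static int negate(int x) { return -x; }
const struct export_entry demo_exports[] = { {"twice", twice}, {"negate", negate} };
```

A caller would obtain a handle from `sim_dlopen(demo_exports, 2)`, resolve `"twice"` with `sim_dlsym`, and invoke the result through `sim_call_indirect`, mirroring how a C function pointer is used after `dlsym`.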
More complicated dynamic linking functionality (e.g., interposition, weak
symbols, etc.) can be simulated efficiently by assigning a function table
index to each weak/mutable symbol, calling the symbol via `call_indirect` on that
index, and mutating the underlying element as needed.
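A minimal sketch of this idea, again simulated in plain C with hypothetical names: one table slot is reserved for the weak symbol, every caller goes through that slot, and interposition is a single store to the slot.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*hook_fn)(int);

/* Default definition of the weak symbol, and a stronger replacement. */
int default_impl(int x) { return x + 1; }
int override_impl(int x) { return x + 100; }

enum { WEAK_SYMBOL_SLOT = 0, SLOT_COUNT = 1 };

/* Stand-in for the function table; slot 0 belongs to the weak symbol. */
static hook_fn symbol_slots[SLOT_COUNT] = { default_impl };

/* Every call goes through the slot, like call_indirect on a fixed index. */
int call_weak_symbol(int x) {
    assert(symbol_slots[WEAK_SYMBOL_SLOT] != NULL);
    return symbol_slots[WEAK_SYMBOL_SLOT](x);
}

/* Interpose by mutating the table element; returns the previous target. */
hook_fn interpose(hook_fn replacement) {
    hook_fn previous = symbol_slots[WEAK_SYMBOL_SLOT];
    symbol_slots[WEAK_SYMBOL_SLOT] = replacement;
    return previous;
}
```

Callers never need to be patched: they keep calling through the same index, and only the element behind it changes.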
After the MVP, we would like to standardize a single [ABI][] per source
language, allowing for WebAssembly libraries to interface with each other
regardless of compiler. Specifying an ABI requires that all ABI-related
future features (like SIMD, multiple return values and exception handling)
have been implemented. While it is highly recommended for compilers targeting
WebAssembly to adhere to the specified ABI for interoperability, WebAssembly
runtimes will be ABI agnostic, so it will be possible to use a non-standard ABI
for specialized purposes.
[ABI]: https://en.wikipedia.org/wiki/Application_binary_interface
[future types]: https://github.com/WebAssembly/reference-types/blob/master/proposals/reference-types/Overview.md#language-extensions
================================================
FILE: Events.md
================================================
## Past Events
| Date | Title | Slides | Video | Presenter(s) |
|-----:|-------|:------:|:-----:|--------------|
| October 29th 2015 | LLVM Developers' Meeting | [slides](https://llvm.org/devmtg/2015-10/slides/BastienGohman-WebAssembly-HereBeDragons.pdf) | [video](https://www.youtube.com/watch?v=5W7NkofUtAw) | JF Bastien and Dan Gohman |
| November 10th 2015 | BlinkOn 5: WebAssembly | | [video](https://youtu.be/iCSAUHpPbiU) | Nick Bray |
| December 9th 2015 | Emscripten and WebAssembly | [slides](https://kripken.github.io/talks/wasm.html) | | Alon Zakai |
| January 31st 2016 | FOSDEM LLVM room | [slides](https://fosdem.org/2016/schedule/event/llvm_webassembly) | | JF Bastien and Dan Gohman |
| June 13th 2016 | NYLUG Presents: WebAssembly: A New Compiler Target For The Web | | [video](https://www.youtube.com/watch?v=RByPdCN1RQ4) | Luke Wagner |
| July 7th 2016 | VMSS16: A Little on V8 and WebAssembly | [slides](https://ia601208.us.archive.org/16/items/vmss16/titzer.pdf) | [video](https://www.youtube.com/watch?v=BRNxM8szTPA) | Ben L. Titzer |
| September 14th 2016 | WebAssembly: birth of a virtual ISA | | [video](https://www.youtube.com/watch?v=vmzz17JGPHI) | Ben Smith |
| September 21st 2016 | C++ on the Web: Let's have some serious fun | | [video](https://www.youtube.com/watch?v=jXMtQ2fTl4c) | Dan Gohman |
| September 22nd 2016 | WebAssembly: high speed at low cost for everyone | [paper](http://www.mlworkshop.org/2016-1.pdf) | | Andreas Rossberg |
| October 31st 2016 | VMIL16: WebAssembly from wire to machine code: a view inside V8's implementation | | | Ben L. Titzer |
| November 7th 2016 | Empire Node: How Webassembly Will Change the Way You Write Javascript | | [video](https://www.youtube.com/watch?v=kq2HBddiyh0) | Seth Samuel |
| November 12th 2016 | Chrome Dev Summit Advanced JS performance with V8 and WebAssembly | | [video](https://www.youtube.com/watch?v=PvZdTZ1Nl5o) | Seth Thompson |
| June 26th 2017 | Bringing the Web up to Speed with WebAssembly | [paper](https://github.com/WebAssembly/spec/blob/master/papers/pldi2017.pdf) | [video](https://www.youtube.com/watch?v=AFy5TdrFG9Y) | Andreas Rossberg |
| October 5th 2017 | Node.js Interactive: WebAssembly and the Future of the Web | [paper](https://kgryte.github.io/talks-nodejs-interactive-2017/#/splash) | [video](https://www.youtube.com/watch?v=iJL59lh4IJA) | Athan Reines |
| October 5th 2017 | Node.js Interactive: What's a Wasm? | | [video](https://www.youtube.com/watch?v=gk9ERa7UYPM) | Paul Milham |
## Upcoming
| Date | Title | Slides | Video | Presenter(s) |
|-----:|-------|:------:|:-----:|--------------|
================================================
FILE: FAQ.md
================================================
# FAQ
## Why create a new standard when there is already asm.js?
... especially since pthreads ([Mozilla pthreads][], [Chromium pthreads][]) and
SIMD ([simd.js][], [Chromium SIMD][], [simd.js in asm.js][]) are coming to
JavaScript.
[Mozilla pthreads]: https://blog.mozilla.org/javascript/2015/02/26/the-path-to-parallel-javascript/
[Chromium pthreads]: https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/d-0ibJwCS24
[simd.js]: https://hacks.mozilla.org/2014/10/introducing-simd-js/
[Chromium SIMD]: https://groups.google.com/a/chromium.org/forum/#!topic/blink-dev/2PIOEJG_aYY
[simd.js in asm.js]: http://discourse.specifiction.org/t/request-for-comments-simd-js-in-asm-js/676
There are two main benefits WebAssembly provides:
1. The kind of binary format being considered for WebAssembly can be natively
decoded much faster than JavaScript can be parsed ([experiments][] show more
than 20× faster). On mobile, large compiled codes can easily take 20–40
seconds *just to parse*, so native decoding (especially when combined with
other techniques like [streaming][] for better-than-gzip compression) is
critical to providing a good cold-load user experience.
2. By avoiding the simultaneous asm.js constraints of [AOT][]-[compilability][]
and good performance even on engines without
[specific asm.js optimizations][], a new standard makes it *much easier* to
add the [features :unicorn:][future general] required to reach native
levels of performance.
[experiments]: BinaryEncoding.md#why-a-binary-encoding-instead-of-a-text-only-representation
[streaming]: https://www.w3.org/TR/streams-api/
[AOT]: http://asmjs.org/spec/latest/#ahead-of-time-compilation
[compilability]: https://blog.mozilla.org/luke/2014/01/14/asm-js-aot-compilation-and-startup-performance/
[specific asm.js optimizations]: https://blog.mozilla.org/luke/2015/02/18/microsoft-announces-asm-js-optimizations/#asmjs-opts
Of course, every new standard introduces new costs (maintenance, attack surface,
code size) that must be offset by the benefits. WebAssembly minimizes costs by
having a design that allows (though does not require) a browser to implement
WebAssembly inside its *existing* JavaScript engine (thereby reusing the
JavaScript engine's existing compiler backend, ES6 module loading frontend,
security sandboxing mechanisms and other supporting VM components). Thus, in
cost, WebAssembly should be comparable to a big new JavaScript feature, not a
fundamental extension to the browser model.
Comparing the two, even for engines which already optimize asm.js, the benefits
outweigh the costs.
## What are WebAssembly's use cases?
WebAssembly was designed with [a variety of use cases in mind](UseCases.md).
## Can WebAssembly be polyfilled?
We think so. There was an early
[prototype](https://github.com/WebAssembly/polyfill-prototype-1) with demos
[[1](https://lukewagner.github.io/AngryBotsPacked),
[2](https://lukewagner.github.io/PlatformerGamePacked)], which showed
that decoding a binary WebAssembly-like format into asm.js can be efficient.
And as the WebAssembly design has changed there have been
[more](https://github.com/WebAssembly/polyfill-prototype-2)
[experiments](https://github.com/WebAssembly/binaryen/blob/master/src/wasm2asm.h)
with polyfilling.
Overall, optimism has been increasing for quick adoption of WebAssembly in
browsers, which is great, but it has decreased the motivation to work on a
polyfill.
It is also the case that polyfilling WebAssembly to asm.js is less urgent
because of the existence of alternatives, for example, a reverse polyfill -
compiling
[asm.js to WebAssembly](https://github.com/WebAssembly/binaryen/blob/master/src/asm2wasm.h) -
exists, and it allows shipping a single build that can run as either
asm.js or WebAssembly. It is also possible to build a project into
two parallel asm.js and WebAssembly builds by just
[flipping a switch](https://github.com/kripken/emscripten/wiki/WebAssembly)
in emscripten, which avoids polyfill time on the client entirely. A third
option, for code that is not performance-critical, is to use a compiled WebAssembly interpreter
such as
[binaryen.js](https://github.com/WebAssembly/binaryen/blob/master/test/binaryen.js/test.js).
However, a WebAssembly polyfill is still an interesting idea and should in
principle be possible.
## Is WebAssembly only for C/C++ programmers?
As explained in the [high-level goals](HighLevelGoals.md), to achieve a Minimum
Viable Product, the initial focus is on [C/C++](CAndC++.md).
However, by [integrating with JavaScript at the ES6 Module interface](Modules.md#integration-with-es6-modules),
web developers don't need to write C++ to take advantage of libraries that others have written;
reusing a modular C++ library can be as simple as [using a module from JavaScript](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Modules).
Beyond the MVP, another [high-level goal](HighLevelGoals.md)
is to improve support for languages other than C/C++. This includes [allowing WebAssembly code to
allocate and access garbage-collected (JavaScript, DOM, Web API) objects
:unicorn:][future garbage collection].
Even before GC support is added to WebAssembly, it is possible to compile a language's VM
to WebAssembly (assuming it's written in portable C/C++) and this has already been demonstrated
([1](https://ruby.dj), [2](https://kripken.github.io/lua.vm.js/lua.vm.js.html),
[3](https://syntensity.blogspot.com/2010/12/python-demo.html)). However, "compile the VM" strategies
increase the size of distributed code, lose browser devtools integration, can have cross-language
cycle-collection problems and miss optimizations that require integration with the browser.
## Which compilers can I use to build WebAssembly programs?
WebAssembly initially focuses on [C/C++](CAndC++.md), and a new, clean
WebAssembly backend is being developed in upstream clang/LLVM, which can then be
used by LLVM-based projects like [Emscripten][] and [PNaCl][].
As WebAssembly evolves it will support more languages than C/C++, and we hope
that other compilers will support it as well, even for the C/C++ language, for
example [GCC][]. The WebAssembly working group found it easier to start with
LLVM support because they had more experience with that toolchain from their
[Emscripten][] and [PNaCl][] work.
[Emscripten]: http://emscripten.org
[PNaCl]: https://developer.chrome.com/docs/native-client/
[GCC]: https://gcc.gnu.org
We hope that proprietary compilers also gain WebAssembly support, but we'll let
vendors speak about their own platforms.
The [WebAssembly Community Group][] would be delighted to collaborate with more
compiler vendors, take their input into consideration in WebAssembly itself, and
work with them on ABI matters.
[WebAssembly Community Group]: https://www.w3.org/community/webassembly/
## Will WebAssembly support View Source on the Web?
Yes! WebAssembly defines a [text format](TextFormat.md) to be rendered when
developers view the source of a WebAssembly module in any developer tool. Also,
a specific goal of the text format is to allow developers to write WebAssembly
modules by hand for testing, experimenting, optimizing, learning and teaching
purposes. In fact, by dropping all the
[coercions required by asm.js validation](http://asmjs.org/spec/latest/#introduction),
the WebAssembly text format should be much more natural to read and write than
asm.js. Outside the browser, command-line and online tools that convert between
text and binary will also be made readily available. Lastly, a scalable form of
source maps is also being considered as part of the WebAssembly
[tooling story](Tooling.md).
## What's the story for Emscripten users?
Existing Emscripten users will get the option to build their projects to
WebAssembly, by flipping a flag. Initially, Emscripten's asm.js output would be
converted to WebAssembly, but eventually Emscripten would use WebAssembly
throughout the pipeline. This painless transition is enabled by the
[high-level goal](HighLevelGoals.md) that WebAssembly integrate well with the
Web platform (including allowing synchronous calls into and out of JavaScript)
which makes WebAssembly compatible with Emscripten's current asm.js compilation
model.
## Is WebAssembly trying to replace JavaScript?
No! WebAssembly is designed to be a complement to, not replacement of,
JavaScript. While WebAssembly will, over time, allow many languages to be
compiled to the Web, JavaScript has an incredible amount of momentum and will
remain the single, privileged (as described
[above](FAQ.md#is-webassembly-only-for-cc-programmers)) dynamic language of the
Web. Furthermore, it is expected that JavaScript and WebAssembly will be used
together in a number of configurations:
* Whole, compiled C++ apps that leverage JavaScript to glue things together.
* HTML/CSS/JavaScript UI around a main WebAssembly-controlled center canvas,
allowing developers to leverage the power of web frameworks to build
accessible, web-native-feeling experiences.
* Mostly HTML/CSS/JavaScript app with a few high-performance WebAssembly modules
(e.g., graphing, simulation, image/sound/video processing, visualization,
animation, compression, etc., examples which we can already see in asm.js
today) allowing developers to reuse popular WebAssembly libraries just like
JavaScript libraries today.
* When WebAssembly
[gains the ability to access garbage-collected objects :unicorn:][future garbage collection],
those objects will be shared with JavaScript, and not live in a walled-off
world of their own.
## Why not just use LLVM bitcode as a binary format?
The [LLVM](https://llvm.org/) compiler infrastructure has a lot of attractive
qualities: it has an existing intermediate representation (LLVM IR) and binary
encoding format (bitcode). It has code generation backends targeting many
architectures and is actively developed and maintained by a large community. In
fact, PNaCl already uses LLVM as a basis for its binary
format. However, the goals and requirements that LLVM was designed to meet are
subtly mismatched with those of WebAssembly.
WebAssembly has several requirements and goals for its Instruction Set
Architecture (ISA) and binary encoding:
* Portability: The ISA must be the same for every machine architecture.
* Stability: The ISA and binary encoding must not change over time (or change
only in ways that can be kept backward-compatible).
* Small encoding: The representation of a program should be as small as possible
for transmission over the Internet.
* Fast decoding: The binary format should be fast to decompress and decode for
fast startup of programs.
* Fast compiling: The ISA should be fast to compile (and suitable for either
AOT- or JIT-compilation) for fast startup of programs.
* Minimal [nondeterminism](Nondeterminism.md): The behavior of programs should
be as predictable and deterministic as possible (and should be the same on
every architecture, a stronger form of the portability requirement stated
above).
LLVM IR is meant to make compiler optimizations easy to implement, and to
represent the constructs and semantics required by C, C++, and other languages
on a large variety of operating systems and architectures. This means that by
default the IR is not portable (the same program has different representations
for different architectures) or stable (it changes over time as optimization and
language requirements change). It has representations for a huge variety of
information that is useful for implementing mid-level compiler optimizations but
is not useful for code generation (but which represents a large surface area for
codegen implementers to deal with). It also has undefined behavior (largely
similar to that of C and C++) which makes some classes of optimization feasible
or more powerful, but can lead to unpredictable behavior at runtime. LLVM's
binary format (bitcode) was designed for temporary on-disk serialization
of the IR for link-time optimization, and not for stability or compressibility
(although it does have some features for both of those).
None of these problems are insurmountable. For example, PNaCl defines a small
portable
[subset](https://developer.chrome.com/native-client/reference/pnacl-bitcode-abi)
of the IR with reduced undefined behavior, and a stable version of the bitcode
encoding. It also employs several techniques to improve startup
performance. However, each customization, workaround, and special solution means
less benefit from the common infrastructure. We believe that by taking our
experience with LLVM and designing an IR and binary encoding for our goals and
requirements, we can do much better than adapting a system designed for other
purposes.
Note that this discussion applies to use of LLVM IR as a standardized
format. LLVM's clang frontend and midlevel optimizers can still be used to
generate WebAssembly code from C and C++, and will use LLVM IR in their
implementation similarly to how PNaCl and Emscripten do today.
## Why is there no fast-math mode with relaxed floating point semantics?
Optimizing compilers commonly have fast-math flags which permit the compiler to
relax the rules around floating point in order to optimize more
aggressively. This can include assuming that NaNs or infinities don't occur,
ignoring the difference between negative zero and positive zero, making
algebraic manipulations which change how rounding is performed or when overflow
might occur, or replacing operators with approximations that are cheaper to
compute.
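Two of these relaxations can be made concrete with a short sketch. The function names are hypothetical, and the stated results assume IEEE 754 `double` arithmetic compiled without any fast-math flags.

```c
#include <assert.h>

/* The same three values summed with two different groupings. A fast-math
 * compiler is free to turn one form into the other. */
double sum_left_to_right(double a, double b, double c) { return (a + b) + c; }
double sum_right_to_left(double a, double b, double c) { return a + (b + c); }

/* Negative zero compares equal to positive zero but is observable through
 * division: 1.0 / -0.0 is negative infinity under IEEE 754. */
int is_negative_zero(double z) { return z == 0.0 && 1.0 / z < 0.0; }

/* Usage: the two groupings disagree because 1.0 is absorbed when it is
 * added to -1e20 first, but survives when the large values cancel first. */
void check_reassociation_changes_result(void) {
    assert(sum_left_to_right(1e20, -1e20, 1.0) == 1.0);
    assert(sum_right_to_left(1e20, -1e20, 1.0) == 0.0);
}
```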
These optimizations effectively introduce nondeterminism; it isn't possible to
determine how the code will behave without knowing the specific choices made by
the optimizer. This often isn't a serious problem in native code scenarios,
because all the nondeterminism is resolved by the time native code is
produced. Since most hardware doesn't have floating point nondeterminism,
developers have an opportunity to test the generated code, and then count on it
behaving consistently for all users thereafter.
WebAssembly implementations run on the user side, so there is no opportunity for
developers to test the final behavior of the code. Nondeterminism at this level
could cause distributed WebAssembly programs to behave differently in different
implementations, or change over time. WebAssembly does have
[some nondeterminism](Nondeterminism.md) in cases where the tradeoffs warrant
it, but fast-math flags are not believed to be important enough:
* Many of the important fast-math optimizations happen in the mid-level
optimizer of a compiler, before WebAssembly code is emitted. For example,
loop vectorization that depends on floating point reassociation can still be
done at this level if the user applies the appropriate fast-math flags, so
WebAssembly programs can still enjoy these benefits. As another example,
compilers can replace floating point division with floating point
multiplication by a reciprocal in WebAssembly programs just as they do for
other platforms.
* Mid-level compiler optimizations may also be augmented by implementing them
in a [JIT library](JITLibrary.md) in WebAssembly. This would allow them to
perform optimizations that benefit from having
[information about the target](FeatureTest.md) and information about the
source program semantics such as fast-math flags at the same time. For
example, if SIMD types wider than 128-bit are added, it's expected that there
would be feature tests allowing WebAssembly code to determine which SIMD
types to use on a given platform.
* When WebAssembly
[adds an FMA operator :unicorn:][future floating point],
folding multiply and add sequences into FMA operators will be possible.
* WebAssembly doesn't include its own math functions like `sin`, `cos`, `exp`,
`pow`, and so on. WebAssembly's strategy for such functions is to allow them
to be implemented as library routines in WebAssembly itself (note that x86's
`sin` and `cos` instructions are slow and imprecise and are generally avoided
these days anyway). Users wishing to use faster and less precise math
functions on WebAssembly can simply select a math library implementation
which does so.
* Most of the individual floating point operators that WebAssembly does have
already map to individual fast instructions in hardware. Telling `add`,
`sub`, or `mul` they don't have to worry about NaN, for example, doesn't make
them any faster, because NaN is handled quickly and transparently in hardware
on all modern platforms.
* WebAssembly has no floating point traps, status register, dynamic rounding
modes, or signalling NaNs, so optimizations that depend on the absence of
these features are all safe.
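
As a small illustration of why reassociation cannot be applied silently, the following C sketch evaluates the same three-term sum with two different groupings. The function names are made up for this example, and it assumes IEEE 754 `double` arithmetic and a compiler invoked without fast-math flags.

```c
/* Sums three doubles left to right: (a + b) + c. */
double sum_left(double a, double b, double c) {
    return (a + b) + c;
}

/* Sums the same three doubles with the other grouping: a + (b + c).
 * An optimizer running under fast-math flags is allowed to turn
 * sum_left into this form. */
double sum_right(double a, double b, double c) {
    return a + (b + c);
}

/* Returns 1 when the two groupings disagree for the given inputs. */
int reassociation_changes_result(double a, double b, double c) {
    return sum_left(a, b, c) != sum_right(a, b, c);
}
```

With `a = 1e100`, `b = -1e100`, and `c = 1.0`, the left-to-right sum is `1.0` while the other grouping yields `0.0`, because `1.0` is lost to rounding when it is added to `-1e100` first.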
## What about `mmap`?
The [`mmap`](https://pubs.opengroup.org/onlinepubs/009695399/functions/mmap.html)
syscall has many useful features. While these are all packed into one overloaded
syscall in POSIX, WebAssembly unpacks this functionality into multiple
operators:
* the MVP starts with the ability to grow linear memory via a
[`memory.grow`] operator;
* proposed
[future features :unicorn:][future memory control] would
allow the application to change the protection and mappings for pages in the
contiguous range `0` to `memory.size`.
A significant feature of `mmap` that is missing from the above list is the
ability to allocate disjoint virtual address ranges. The reasoning for this
omission is:
* The above functionality is sufficient to allow a user-level libc to implement
full, compatible `mmap` with what appears to be noncontiguous memory
allocation (but which, under the hood, is just coordinated use of `memory.grow`
and `mprotect`/`map_file`/`map_shmem`/`madvise`).
* The benefit of allowing noncontiguous virtual address allocation would be if
it allowed the engine to interleave a WebAssembly module's linear memory with
other memory allocations in the same process (in order to mitigate virtual
address space fragmentation). There are two problems with this:
- This interleaving with unrelated allocations does not currently admit
efficient security checks to prevent one module from corrupting data outside
its heap (see discussion in
[#285](https://github.com/WebAssembly/design/pull/285)).
- This interleaving would require making allocation nondeterministic, and
nondeterminism is something that WebAssembly generally
[tries to avoid](Nondeterminism.md).
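
The simplest part of that libc layer, anonymous allocation built on growing linear memory, might look roughly like the sketch below. It is a simulation written in plain C: `toy_memory_grow` only mimics the return convention of `memory.grow` (previous size in pages, or -1 on failure), and every name and the 16-page limit are assumptions made for this example. Protection changes and file mappings are not modeled.

```c
#include <stdint.h>

#define WASM_PAGE_SIZE 65536u            /* WebAssembly pages are 64 KiB */
#define TOY_MAX_PAGES  16u               /* hypothetical limit for this simulation */
#define TOY_MAP_FAILED ((uint32_t)-1)

/* Simulated size of linear memory, in pages. */
static uint32_t toy_pages = 1;

/* Mimics memory.grow: returns the previous size in pages, or
 * (uint32_t)-1 when the request cannot be satisfied. */
uint32_t toy_memory_grow(uint32_t delta_pages) {
    if (delta_pages > TOY_MAX_PAGES - toy_pages) {
        return (uint32_t)-1;
    }
    uint32_t old_pages = toy_pages;
    toy_pages += delta_pages;
    return old_pages;
}

/* Anonymous-mapping flavor of mmap: rounds the length up to whole pages,
 * grows linear memory, and returns the byte offset of the new region. */
uint32_t toy_mmap_anon(uint32_t length) {
    if (length == 0) {
        return TOY_MAP_FAILED;
    }
    uint32_t pages = (length - 1) / WASM_PAGE_SIZE + 1;
    uint32_t old_pages = toy_memory_grow(pages);
    if (old_pages == (uint32_t)-1) {
        return TOY_MAP_FAILED;
    }
    return old_pages * WASM_PAGE_SIZE;
}
```

Every region handed out this way sits at the end of one contiguous range, which is the property the reasoning above relies on.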
## Why have wasm32 and wasm64, instead of just an abstract `size_t`?
The amount of linear memory needed to hold an abstract `size_t` would then also
need to be determined by an abstraction, and then partitioning the linear memory
address space into segments for different purposes would be more complex. The
size of each segment would depend on how many `size_t`-sized objects are stored
in it. This is theoretically doable, but it would add complexity and there would
be more work to do at application startup time.
Also, allowing applications to statically know the pointer size can allow them
to be optimized more aggressively. Optimizers can better fold and simplify
integer expressions when they have full knowledge of the bitwidth. And, knowing
memory sizes and layouts for various types allows one to know how many trailing
zeros there are in various pointer types.
Also, C and C++ deeply conflict with the concept of an abstract `size_t`.
Constructs like `sizeof` are required to be fully evaluated in the front-end
of the compiler because they can participate in type checking. And even before
that, it's common to have predefined macros which indicate pointer sizes,
allowing code to be specialized for pointer sizes at the very earliest stages of
compilation. Once specializations are made, information is lost, scuttling
attempts to introduce abstractions.
And finally, it's still possible to add an abstract `size_t` in the future if
the need arises and practicalities permit it.
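
The point about early specialization can be seen in ordinary portable C. The sketch below picks a type at preprocessing time based on `UINTPTR_MAX` from `<stdint.h>`; `POINTER_BITS`, `hash_word`, and `pointer_bits` are hypothetical names for this example, and the code assumes a C11 compiler with 8-bit bytes.

```c
#include <stdint.h>

/* Choose a pointer width during preprocessing, before any later
 * compilation stage could substitute an abstract size. */
#if UINTPTR_MAX == 0xFFFFFFFFu
#define POINTER_BITS 32
typedef uint32_t hash_word;   /* hypothetical type specialized for 32-bit targets */
#elif UINTPTR_MAX == 0xFFFFFFFFFFFFFFFFu
#define POINTER_BITS 64
typedef uint64_t hash_word;   /* hypothetical type specialized for 64-bit targets */
#else
#error "unsupported pointer width"
#endif

/* sizeof is a compile-time constant, so it can take part in checks
 * that the front-end must resolve. */
_Static_assert(sizeof(hash_word) * 8 == POINTER_BITS,
               "hash_word matches the pointer width");

/* Returns the pointer width that the preprocessor selected. */
int pointer_bits(void) {
    return POINTER_BITS;
}
```

Once the preprocessor has discarded the branch that was not taken, nothing downstream can recover it, which is the information loss described above.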
## Why have wasm32 and wasm64, instead of just using 8 bytes for storing pointers?
A great number of applications don't ever need as much as 4 GiB of memory.
Forcing all these applications to use 8 bytes for every pointer they store would
significantly increase the amount of memory they require, and decrease their
effective utilization of important hardware resources such as cache and memory
bandwidth.
The motivations and performance effects here should be essentially the same as
those that motivated the development of the
[x32 ABI](https://en.wikipedia.org/wiki/X32_ABI) for Linux.
Even Knuth found it worthwhile to weigh in on this issue, in
[a flame about 64-bit pointers](https://www-cs-faculty.stanford.edu/~uno/news08.html).
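
To put rough numbers on the memory cost discussed in this section, the C sketch below compares a list node that stores a 4-byte link with one that stores an 8-byte link. The struct and function names are invented for this example, and exact sizes depend on the ABI's alignment rules.

```c
#include <stddef.h>
#include <stdint.h>

/* A list node whose link is a 4-byte offset, as on wasm32. */
struct node32 {
    uint32_t next;
    int32_t  value;
};

/* The same node with an 8-byte link, as on wasm64. */
struct node64 {
    uint64_t next;
    int32_t  value;
};

/* Bytes needed to store `count` nodes of each kind. */
size_t bytes_for_node32(size_t count) { return count * sizeof(struct node32); }
size_t bytes_for_node64(size_t count) { return count * sizeof(struct node64); }
```

On ABIs that align 64-bit integers to 8 bytes, `struct node64` typically occupies 16 bytes against 8 for `struct node32`, so the same list takes about twice the memory and cache space.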
## Will I be able to access proprietary platform APIs (e.g. Android / iOS)?
Yes, but it will depend on the _WebAssembly embedder_. Inside a browser you'll
get access to the same HTML5 and other browser-specific APIs which are also
accessible through regular JavaScript. However, if a wasm VM is provided as an
[“app execution platform”](NonWeb.md) by a specific vendor, it might provide
access to [proprietary platform-specific APIs](Portability.md#api) of e.g.
Android / iOS.
[future general]: FutureFeatures.md
[future garbage collection]: https://github.com/WebAssembly/proposals/issues/16
[future floating point]: https://github.com/WebAssembly/design/issues/1391
[future memory control]: https://github.com/WebAssembly/memory-control
[`memory.grow`]: https://webassembly.github.io/spec/core/syntax/instructions.html#syntax-instr-memory
================================================
FILE: FeatureTest.md
================================================
See [rationale](Rationale.md#feature-testing---motivating-scenarios) for motivating scenarios.
# Feature Test
[Post-MVP :unicorn:][future general], applications will be able to query which features are
supported via
[`has_feature` or a similar API :unicorn:][future feature testing]. This
accounts for the pragmatic reality that features are shipped in different orders
at different times by different engines.
What follows is a sketch of what such a feature testing capability could look
like.
Since some WebAssembly features add operators and all WebAssembly code in a
module is validated ahead-of-time, the usual JavaScript feature detection
pattern:
```
if (foo)
  foo();
else
  alternativeToFoo();
```
won't work in WebAssembly (if `foo` isn't supported, `foo()` will fail to
validate).
Instead, applications may use one of the following strategies:
1. Compile several versions of a module, each assuming different feature support
and use `has_feature` tests to determine which version to load.
2. During the ["specific" layer decoding](BinaryEncoding.md), which will happen
in user code in the MVP *anyway*, use `has_feature` to determine which features
are supported and then translate unsupported feature use into either a polyfill
or a trap.
Both of these options could be automatically provided by the toolchain and
controlled by compiler flags. Since `has_feature` is a constant expression,
it can be constant-folded by WebAssembly engines.
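
Strategy 1 can be sketched as a small selection routine. The code below is an illustration in C, not a proposed API: `feature_query`, `module_variant`, `pick_module`, the two stub engines, and the file names are all invented for this example, and the `has_feature` query it stands in for is itself only a sketch.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature query; stands in for a future has_feature API. */
typedef bool (*feature_query)(const char *feature);

/* One prebuilt variant of a module and the feature it requires
 * (NULL means it needs nothing beyond the baseline). */
struct module_variant {
    const char *file;
    const char *required_feature;
};

/* Returns the file of the first variant whose requirement is met.
 * Variants are expected to be ordered from most to least specialized.
 * Returns NULL if no variant is usable. */
const char *pick_module(const struct module_variant *variants, size_t count,
                        feature_query has_feature) {
    for (size_t i = 0; i < count; i++) {
        const char *needed = variants[i].required_feature;
        if (needed == NULL || has_feature(needed)) {
            return variants[i].file;
        }
    }
    return NULL;
}

/* Stub engines for illustration: one supports no optional features,
 * the other supports only "simd". */
bool engine_without_features(const char *feature) { (void)feature; return false; }
bool engine_with_simd(const char *feature) { return strcmp(feature, "simd") == 0; }

/* Example variant table, most specialized first. */
static const struct module_variant app_variants[] = {
    { "app.simd.wasm",     "simd" },
    { "app.baseline.wasm", NULL   },
};
```

Calling `pick_module(app_variants, 2, engine_with_simd)` selects `app.simd.wasm`, while the engine without optional features falls through to `app.baseline.wasm`.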
To illustrate, consider 4 examples:
* [`i32.bitrev` :unicorn:][future integer] - Strategy 2
could be used to translate `(i32.bitrev x)` into an equivalent expression
that reverses the bits of `x` using shifts, masks, and `i32.or`.
* [Threads :unicorn:][future threads] - If an application uses `#ifdef` extensively
to produce thread-enabled/disabled builds, Strategy 1 would be appropriate.
However, if the application was able to abstract use of threading to a few
primitives, Strategy 2 could be used to patch in the right primitive
implementation.
* [`mprotect` :unicorn:][future memory control] - If engines
aren't able to use OS signal handling to implement `mprotect` efficiently,
`mprotect` may become a permanently optional feature. For uses of `mprotect`
that are not necessary for correctness (but rather just catching bugs),
`mprotect` could be replaced with `nop`. If `mprotect` was necessary for
correctness but an alternative strategy existed that did not rely on
`mprotect`, `mprotect` could be replaced with an `abort()` call, relying on
the application to test `(has_feature "mprotect")` to avoid calling the
`abort()`. The `has_feature` query could be exposed to C++ code via
the existing `__builtin_cpu_supports`.
* [SIMD][future simd] - When SIMD operators have a good-enough
polyfill, e.g., `f32x4.fma` via `f32x4.mul`/`add`, Strategy 2 could be used
(similar to the `i32.bitrev` example above). However, when a SIMD feature has no
efficient polyfill (e.g., `f64x2`, which introduces both operators *and*
types), alternative algorithms need to be provided and selected at load time.
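
For an operator such as a 32-bit bit reversal, the kind of replacement that Strategy 2 would patch in can be written with shifts and masks alone. The sketch below is in C rather than in WebAssembly's own formats, and `bitrev32` is a name chosen for this example.

```c
#include <stdint.h>

/* Reverses the bit order of a 32-bit value by swapping progressively
 * larger groups: single bits, pairs, nibbles, bytes, then halves. */
uint32_t bitrev32(uint32_t x) {
    x = ((x >> 1) & 0x55555555u) | ((x & 0x55555555u) << 1);
    x = ((x >> 2) & 0x33333333u) | ((x & 0x33333333u) << 2);
    x = ((x >> 4) & 0x0F0F0F0Fu) | ((x & 0x0F0F0F0Fu) << 4);
    x = ((x >> 8) & 0x00FF00FFu) | ((x & 0x00FF00FFu) << 8);
    x = (x >> 16) | (x << 16);
    return x;
}
```

Each step maps onto shift, `and`, and `or` operators that are available everywhere, which is what makes this kind of feature a good candidate for a polyfill.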
As a hypothetical (not implemented) example polyfilling the SIMD `f64x2`
feature, the C++ compiler could provide a new function attribute that indicated
that one function was an optimized, but feature-dependent, version of another
function (similar to the
[`ifunc` attribute](https://gcc.gnu.org/onlinedocs/gcc-4.7.2/gcc/Function-Attributes.html#index-g_t_0040code_007bifunc_007d-attribute-2529),
but without the callback):
```
#include <xmmintrin.h>  // __m128, _mm_add_ps
#include <emmintrin.h>  // __m128d, _mm_add_pd
void foo(...) {
  __m128 x, y;           // -> f32x4 locals
  ...
  x = _mm_add_ps(x, y);  // -> f32x4.add
  ...
}
void foo_f64x2(...) __attribute__((optimizes("foo","f64x2"))) {
  __m128d x, y;          // -> f64x2 locals
  ...
  x = _mm_add_pd(x, y);  // -> f64x2.add
  ...
}
...
foo(...);                // calls either foo or foo_f64x2
```
In this example, the toolchain could emit both `foo` and `foo_f64x2` as
function definitions in the "specific layer" binary format. The load-time
polyfill would then replace `foo` with `foo_f64x2` if
`(has_feature "f64x2")`. Many other strategies are possible to allow finer or
coarser granularity substitution. Since this is all in userspace, the strategy
can evolve over time.
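
The substitution step itself can be approximated in plain C with a function pointer that is resolved once. This is only a sketch under stated assumptions: the `has_f64x2` flag stands in for the result of a `(has_feature "f64x2")` query, all names are hypothetical, and the "optimized" variant is ordinary scalar code that marks where `f64x2` operators would be used.

```c
#include <stdbool.h>
#include <stddef.h>

/* Signature shared by the baseline routine and its optimized variant. */
typedef void (*add_fn)(const double *a, const double *b, double *out, size_t n);

/* Baseline version: one element per iteration. */
void add_arrays(const double *a, const double *b, double *out, size_t n) {
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}

/* Stand-in for the feature-dependent version. Real code would use f64x2
 * operators; this sketch handles two elements per iteration in plain C
 * to mark where they would go. */
void add_arrays_f64x2(const double *a, const double *b, double *out, size_t n) {
    size_t i = 0;
    for (; i + 1 < n; i += 2) {
        out[i] = a[i] + b[i];
        out[i + 1] = a[i + 1] + b[i + 1];
    }
    if (i < n) {
        out[i] = a[i] + b[i];   /* leftover element when n is odd */
    }
}

/* Load-time substitution: decide once which definition callers get. */
add_fn resolve_add_arrays(bool has_f64x2) {
    return has_f64x2 ? add_arrays_f64x2 : add_arrays;
}
```

Unlike this runtime indirection, a load-time polyfill could rewrite the call targets directly, so callers would pay no extra cost after the module is loaded.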
See also the [better feature testing support :unicorn:][future feature testing]
future feature.
[future general]: FutureFeatures.md
[future feature testing]: https://github.com/WebAssembly/design/issues/1280
[future integer]: https://github.com/WebAssembly/design/issues/1382
[future threads]: https://github.com/WebAssembly/design/issues/1073
[future simd]: https://github.com/WebAssembly/design/issues/1075
[future memory control]: https://github.com/WebAssembly/memory-control
================================================
FILE: FutureFeatures.md
================================================
# Future Features for WebAssembly
For a list of standard and standard-track features and their implementation
status in major WebAssembly implementations, see
[the WebAssembly features page].
For a list of all currently active proposals and their champions, see
[the proposals page].
For a list of discussions about possible future features, see
[the design repo's issue tracker].
[the WebAssembly features page]: https://webassembly.org/features/
[the proposals page]: https://github.com/WebAssembly/proposals/
[the design repo's issue tracker]: https://github.com/WebAssembly/design/issues
================================================
FILE: HighLevelGoals.md
================================================
# WebAssembly High-Level Goals
1. Define a [portable](Portability.md), size- and load-time-efficient
[binary format](MVP.md#binary-format) to serve as a compilation target which
can be compiled to execute at native speed by taking advantage of common
hardware capabilities available on a wide range of platforms, including
[mobile](https://en.wikipedia.org/wiki/Mobile_device) and
[IoT](https://en.wikipedia.org/wiki/Internet_of_Things).
2. Specify and implement incrementally:
* develop new features as independent [proposals];
* maintain [layering], with a core spec focused on pure sandboxed
computation, with host interactions factored out into higher
specification layers;
* preserve backwards compatibility;
* prioritize new features according to feedback and experience; and
* avoid biasing towards any one programming language family.
3. Design to execute within and integrate well with the *existing*
[Web platform](Web.md):
* maintain the versionless, [feature-tested](FeatureTest.md) and backwards-compatible evolution story of the Web;
* execute in the same semantic universe as JavaScript;
* allow synchronous calls to and from JavaScript;
* enforce the same-origin and permissions security policies;
* access browser functionality through the same Web APIs that are accessible
to JavaScript; and
* define a human-editable text format that is convertible to and from the
binary format, supporting View Source functionality.
4. Design to support [non-browser embeddings](NonWeb.md) as well.
5. Make a great platform:
* promote [compilers and tools targeting WebAssembly];
* enable other useful [tooling](Tooling.md);
* maintain a high level of [determinism]; and
* specify using [formal semantics].
[compilers and tools targeting WebAssembly]: https://webassembly.org/getting-started/developers-guide/
[proposals]: https://github.com/WebAssembly/proposals/
[layering]: https://webassembly.org/specs/
[determinism]: Nondeterminism.md
[formal semantics]: https://github.com/WebAssembly/spec
================================================
FILE: JITLibrary.md
================================================
# JIT and Optimization Library
WebAssembly's Just-in-Time compilation (JIT)
interface will likely be fairly low-level, exposing general-purpose primitives
rather than higher-level functionality. Still, there is a need for higher-level
functionality, and for greater flexibility than the WebAssembly spec can provide.
There is also a need for experimentation, particularly in the area of
applications wishing to dynamically generate new code, to determine which features
and interfaces are most appropriate. JIT and Optimization libraries that would run
inside WebAssembly and provide support and higher-level features would fit this
need very well.
Such libraries wouldn't be part of the WebAssembly spec itself, but the concept
is relevant to discuss here because features that we can expect to address in
libraries are features that we may not need to add to the spec. This strategy
can help keep the spec itself simple and reduce the surface area of features
required of every spec implementation.
And, libraries will facilitate light-weight experimentation with new features
that we may eventually want to add to WebAssembly itself. In a library layer,
we can quickly iterate, experiment, and gain real-world insight, before adding
features to the spec itself and freezing all the details. And as new features
are standardized, libraries will become the polyfills which will help those
features gain adoption.
This raises the question of how we should decide which features belong in the
spec, and which belong in a library. Some of the fundamental advantages of
putting functionality in a library rather than in the spec and in implementations
themselves include:
* A library can freely choose to offer greater degrees of undefined behavior,
implementation-defined behavior, unspecified behavior, and so on. This means
it can perform much more aggressive optimizations, including many that are
extremely common in optimizing compilers and might otherwise seem missing in
the WebAssembly spec itself:
* Constant folding, strength reduction, and code motion of math functions
such as `sin`, `cos`, `exp`, `log`, `pow`, and `atan2`.
* Performing aggressive expression simplifications that depend on assuming
that integer arithmetic doesn't overflow.
* Performing GVN with redundant load elimination, and other optimizations
based on aliasing rules that incur undefined behavior if they are violated.
* Vectorization that utilizes both floating point reassociation and
awareness of the underlying platform through
[feature testing](FeatureTest.md).
* A library can support higher-level features, and features that are tailored
to certain applications, whereas the WebAssembly spec itself is limited to
general-purpose primitives. Possible examples of this are:
* A richer type system, which could include things like complex, rational,
arbitrary bitwidth integers, non-power-of-2 SIMD types, interval
arithmetic, etc.
* A higher-level type system, which could include basic polymorphism of
various kinds (either with true dynamism or with monomorphisation).
* Richer control flow constructs.
* A broader set of operators, such as string-handling operators,
data type serialization, testing facilities, and linear algebra
operators, all of which can benefit from being integrated at the
language level.
Since every feature required in the spec itself will need to be implemented
by all implementations, domain-specific features run the risk of making
people "pay for what they don't use". With libraries, people need
only pay for the features they choose to use.
* A library can evolve over time to meet the changing needs of higher-level
languages. In practice, compiler IRs such as LLVM IR evolve to add new
features, change existing features, and sometimes remove features, and these
kinds of changes are much harder to do in a spec.
The library approach also means that applications using a particular version
of a library can get consistent behavior and performance, because of the
determinism of the underlying WebAssembly platform.
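
The integer-overflow item in the list above can be made concrete with a short C sketch. WebAssembly's `i32.add` wraps modulo 2^32, so a simplification that assumes overflow never happens changes the result for some inputs; a library could offer it as an opt-in, whereas the core semantics cannot. The function names are invented for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* 32-bit addition with wraparound, matching what i32.add computes.
 * The sum is formed on unsigned values, where wrapping is well defined
 * in C; converting back to signed is implementation-defined before C23
 * but wraps on mainstream compilers. */
int32_t wrapping_add_i32(int32_t a, int32_t b) {
    return (int32_t)((uint32_t)a + (uint32_t)b);
}

/* "x + 1 > x" evaluated faithfully with wrapping arithmetic. */
bool successor_is_greater(int32_t x) {
    return wrapping_add_i32(x, 1) > x;
}

/* The same test after a simplification that assumes overflow never
 * happens: the comparison folds to a constant. */
bool successor_is_greater_assuming_no_overflow(int32_t x) {
    (void)x;
    return true;
}
```

The two versions agree everywhere except at `INT32_MAX`, where the wrapping version returns `false`.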
A wide range of approaches is possible:
* "Customized WebAssembly". This might involve a library whose input format
is conceptually WebAssembly but with some additional features. The library
could optimize and then lower those features, leaving standard WebAssembly
to present to the underlying implementation.
* "Bring Your Own Compiler". There's nothing stopping one from bundling
full-fledged AOT-style compilers that compile an independent source language
or IR into WebAssembly right there in WebAssembly itself. Obviously this
will involve tradeoffs in terms of download size and startup time, but it
would allow a unique degree of flexibility.
* And many things in between.
================================================
FILE: JS.md
================================================
# JavaScript API
This file historically contained the description of WebAssembly's JavaScript API.
For the current description, see the [normative documentation](http://webassembly.github.io/spec/js-api/index.html).
================================================
FILE: LICENSE
================================================
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "{}"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright {yyyy} {name of copyright owner}
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
================================================
FILE: MVP.md
================================================
# Minimum Viable Product
WebAssembly initially launched with a "minimum viable product" version (the "MVP"),
which was complete enough for some use cases; however, many important features were
deferred. WebAssembly has added many [features] since then.
[features]: https://webassembly.org/features/
================================================
FILE: Modules.md
================================================
# Modules
This file historically contained the description of WebAssembly's module structure.
For the current description, see the [normative documentation](https://webassembly.github.io/spec/core/syntax/modules.html).
================================================
FILE: NonWeb.md
================================================
# Non-Web Embeddings
While WebAssembly is designed to run [on the Web](Web.md), it is
also desirable for it to be able to execute well in other environments,
including everything from minimal shells for testing to full-blown
application environments e.g. on servers in datacenters, on IoT devices,
or mobile/desktop apps. It may even be desirable to execute WebAssembly
embedded within larger programs.
Non-Web environments may provide different APIs than Web
environments.
Non-Web environments include JavaScript VMs (e.g. [node.js]); however,
WebAssembly is also capable of being executed without a JavaScript VM present.
The table in the [features page] of the website lists several such
implementations.
[node.js]: https://nodejs.org
[features page]: https://webassembly.org/features/
[WASI] is a set of standards-track APIs being developed by the [WASI Subgroup]
intended for use in many environments, including in non-browser and non-JS
engines.
[WASI]: https://wasi.dev
[WASI Subgroup]: https://github.com/WebAssembly/wasi
The WebAssembly spec itself will not try to define any large portable libc-like
library. However, certain features that are core to WebAssembly semantics that
are similar to functions found in native libc *would* be part of the core
WebAssembly spec as primitive operators (e.g., the `memory.grow` operator, which
is similar to the `sbrk` function on many systems, and in the future, perhaps
operators supporting `dlopen` functionality).
Where there is overlap between the Web and popular non-Web environments,
shared specs could be proposed, but these would be separate from the WebAssembly
spec. A symmetric example in JavaScript would be the in-progress
[Loader](https://whatwg.github.io/loader) spec, which is proposed for both
Web and node.js environments and is distinct from the JavaScript spec.
However, for most cases it is expected that, to achieve portability at the
source code level, communities would build libraries that mapped from a
source-level interface such as POSIX or SDL to the host environment's builtin
capabilities, either at build time or runtime, possibly making use of
[feature testing](FeatureTest.md), [dynamic linking](DynamicLinking.md),
static linking, or polyfilling.
In general, by preserving spec layering and keeping the core WebAssembly spec
independent of the host environment and linking specs, WebAssembly can be
used as a portable binary format on many platforms, in many environments, for
many use cases, bringing great benefits in portability, tooling and
language-agnosticity.
[shared-everything linking]: https://github.com/WebAssembly/component-model/blob/main/design/mvp/examples/SharedEverythingDynamicLinking.md
================================================
FILE: Nondeterminism.md
================================================
# Nondeterminism in WebAssembly
WebAssembly is a [portable](Portability.md) [sandboxed](Security.md) platform
with limited, local, nondeterminism.
* *Limited*: nondeterministic execution can only occur in a small number of
well-defined cases (described below) and, in those cases, the implementation
may select from a limited set of possible behaviors.
* *Local*: when nondeterministic execution occurs, the effect is local;
there is no "spooky action at a distance".
The [rationale](Rationale.md) document explains why WebAssembly is designed as
described in this document.
The following is a list of the places where the WebAssembly specification
currently admits nondeterminism:
* New features will be added to WebAssembly, which means different implementations
will have different support for each feature.
* The sequence of calls of exported functions, and the values of the incoming
arguments and return values from the outside environment, are not
determined by the Wasm spec.
* With `shared` memory that can be accessed by multiple threads, the results of
load, read-modify-write, wait, and awake operators are nondeterministic.
* Except when otherwise specified, when an arithmetic operator returns NaN,
there is nondeterminism in determining the specific bits of the NaN. However,
wasm does still provide the guarantee that NaN values returned from an operation
will not have 1 bits in their fraction field that aren't set in any NaN values
in the input operands, except for the most significant bit of the fraction field
(which most operators set to 1).
* Except when otherwise specified, when an arithmetic operator with a floating
point result type receives no NaN input values and produces a NaN result
value, the sign bit of the NaN result value is nondeterministic.
* The [relaxed SIMD] instructions have nondeterministic results.
* Environment-dependent resource limits may be exhausted. A few examples:
- Memory allocation may fail.
- The runtime can fail to allocate a physical page when a memory location is first
accessed (e.g. through a load or store), even if that memory was virtually reserved
by the maximum size property of the [memory section](Modules.md#linear-memory-section).
- Program stack may get exhausted (e.g., because function call depth is too big,
or functions have too many locals, or infinite recursion). Note that this stack
isn't located in the program-accessible linear memory.
- Resources such as handles may get exhausted.
- Any other resource could get exhausted at any time. Caveat emptor.
The following proposals would add additional nondeterminism:
* [Flexible-vectors], adding nondeterminism in the length of the vectors.
Users of C, C++, and similar languages should be aware that operators which
have defined or constrained behavior in WebAssembly itself may nonetheless still
have undefined behavior
[at the source code level](CAndC++.md#undefined-behavior).
[relaxed SIMD]: https://github.com/WebAssembly/relaxed-simd
[Flexible-vectors]: https://github.com/WebAssembly/flexible-vectors
================================================
FILE: Portability.md
================================================
# Portability
WebAssembly's [binary format](BinaryEncoding.md) is designed to be executable
efficiently on a variety of operating systems and instruction set architectures,
[on the Web](Web.md) and [off the Web](NonWeb.md).
## Assumptions for Efficient Execution
Execution environments which, despite
[limited, local, nondeterminism](Nondeterminism.md), don't offer
the following characteristics may be able to execute WebAssembly modules
nonetheless. In some cases they may have to emulate behavior that the host
hardware or operating system doesn't offer so that WebAssembly modules execute
*as-if* the behavior were supported. This sometimes will lead to poor
performance.
As WebAssembly's standardization goes forward we expect to formalize these
requirements, and how WebAssembly will adapt to new platforms that didn't
necessarily exist when WebAssembly was first designed.
WebAssembly portability assumes that execution environments offer the following
characteristics:
* 8-bit bytes.
* Memory addressable at byte granularity.
* Support unaligned memory accesses or reliable trapping that allows software
emulation thereof.
* Two's complement signed integers in 32 bits and optionally 64 bits.
* IEEE 754-2019 32-bit and 64-bit floating point, except for
[a few exceptions][floating-point operations].
* Little-endian byte ordering.
* Memory regions which can be efficiently addressed with 32-bit
pointers or indices.
* wasm64 additionally supports linear memory bigger than
[4 GiB with 64-bit pointers or indices][future 64-bit].
* Enforce secure isolation between WebAssembly modules and other modules or
processes executing on the same machine.
* An execution environment which offers forward progress guarantees to all
threads of execution (even when executing in a non-parallel manner).
* Availability of lock-free atomic memory operators, when naturally aligned, for
8-, 16-, and 32-bit accesses. At a minimum this must include an atomic
compare-and-exchange operator (or equivalent load-linked/store-conditional).
* wasm64 additionally requires lock-free atomic memory operators, when naturally
aligned, for 64-bit accesses.
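
To make the byte-level assumptions above concrete (8-bit bytes, byte-addressable memory, little-endian ordering, unaligned accesses), here is a minimal Python sketch that models a linear memory as a `bytearray`. The helper names `store_i32` and `load_i32` are hypothetical and exist only for this illustration; this is a model of the observable behavior, not a description of how an engine is implemented.

```python
import struct

# A toy linear memory: 16 zeroed bytes.
memory = bytearray(16)


def store_i32(mem, addr, value):
    """Store a 32-bit integer at a byte address in little-endian order."""
    struct.pack_into("<I", mem, addr, value & 0xFFFFFFFF)


def load_i32(mem, addr):
    """Load an unsigned 32-bit integer from a byte address (little-endian)."""
    return struct.unpack_from("<I", mem, addr)[0]


# Address 1 is not a multiple of 4, so this access is unaligned.
store_i32(memory, 1, 0x11223344)
stored_bytes = bytes(memory[1:5])  # least significant byte comes first
round_trip = load_i32(memory, 1)
```

A host whose hardware is big-endian or cannot perform unaligned accesses would have to emulate this behavior, which is where the performance cost mentioned earlier comes from.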
## API
WebAssembly does not specify any APIs or syscalls, only an
[import mechanism](Modules.md) where the set of available imports is defined
by the host environment. In a [Web](Web.md) environment, functionality is
accessed through the Web APIs defined by the
[Web Platform](https://en.wikipedia.org/wiki/Open_Web_Platform).
[Non-Web](NonWeb.md) environments can choose to implement standard Web APIs,
standard non-Web APIs (e.g. POSIX), or invent their own.
## Source-level
Portability at the C/C++ level can be achieved by programming to
a standard API (e.g., POSIX) and relying on the compiler and/or libraries to map
the standard interface to the host environment's available imports either at
compile-time (via `#ifdef`) or run-time (via [feature detection]
and dynamic [loading](Modules.md)/[linking](DynamicLinking.md)).
[future 64-bit]: https://github.com/WebAssembly/memory64/blob/main/proposals/memory64/Overview.md
[floating-point operations]: https://webassembly.github.io/spec/core/exec/numerics.html#floating-point-operations
[feature detection]: https://github.com/WebAssembly/design/issues/1280
================================================
FILE: README.md
================================================
# WebAssembly Design
This repository is part of the [WebAssembly Community Group] and hosts the
[issue tracker] for [phase 0 proposals], a [discussion forum] for questions
and general discussion, and a collection of high-level non-normative
[design documents].
For more information about WebAssembly itself, see [the website]! For
information about contributing to WebAssembly, see [the Contributing document].
Please follow our [Code of Ethics and Professional Conduct].
## Design documents
Some of the design documents in this repository are out of date. We're
gradually working on updating these documents. The following are some documents
which are fairly up to date:
- [WebAssembly High-Level Goals](HighLevelGoals.md)
- [What does Portability mean in Wasm?](Portability.md)
- [Design Rationale](Rationale.md)
- [WebAssembly's Security model](Security.md)
- [Nondeterminism in WebAssembly](Nondeterminism.md)
[issue tracker]: https://github.com/WebAssembly/design/issues
[phase 0 proposals]: https://github.com/WebAssembly/meetings/blob/main/process/phases.md#0-pre-proposal-individual-contributor
[discussion forum]: https://github.com/WebAssembly/design/discussions
[the Contributing document]: Contributing.md
[the website]: https://webassembly.org
[Code of Ethics and Professional Conduct]: https://github.com/WebAssembly/meetings/blob/main/CODE_OF_CONDUCT.md
[design documents]: #design-documents
[WebAssembly Community Group]: https://www.w3.org/community/webassembly/
================================================
FILE: Rationale.md
================================================
# Design Rationale
This document describes rationales for WebAssembly's design decisions, acting as
footnotes to the main design text, keeping the main specification easier to
read, and making it easier to revisit decisions later without having to plow
through all the issues and pull requests. This rationale document tries to list
how decisions were made, and where tradeoffs were made for the sake of language
ergonomics, portability, performance, security, and Getting Things Done.
WebAssembly was designed incrementally, with multiple implementations being
pursued concurrently. As the MVP stabilizes and we get experience from real-world
codebases, we'll revisit the alternatives listed below, reevaluate the tradeoffs
and update the [design](Semantics.md) before the MVP is finalized.
## Why a stack machine?
Why not an AST, or a register- or SSA-based bytecode?
* We started with an AST and generalized to a [structured stack machine](Semantics.md). ASTs allow a
dense encoding and efficient decoding, compilation, and interpretation.
The structured stack machine of WebAssembly generalizes the ASTs allowed in previous versions, while enabling
efficiency gains in interpretation and baseline compilation, as well as a straightforward
design for multi-return functions.
* The stack machine allows a smaller binary encoding than registers or SSA (see [JSZap][] and [Slim Binaries][]),
and structured control flow allows simpler and more efficient verification, including decoding directly
to a compiler's internal SSA form.
* [Polyfill prototype][] shows simple and efficient translation to asm.js.
[JSZap]: https://research.microsoft.com/en-us/projects/jszap/
[Slim Binaries]: https://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.108.1711
[Polyfill prototype]: https://github.com/WebAssembly/polyfill-prototype-1
## Why not a fully-general stack machine?
The WebAssembly stack machine is restricted to structured control flow and structured
use of the stack. This greatly simplifies one-pass verification, avoiding a fixpoint computation
like that of other stack machines such as the Java Virtual Machine (prior to [stack maps](https://docs.oracle.com/javase/specs/jvms/se7/html/jvms-4.html)).
This also simplifies compilation and manipulation of WebAssembly code by other tools.
Further generalization of the WebAssembly stack machine is planned post-MVP, such as the
addition of multiple return values from control flow constructs and function calls.
## Basic Types Only
WebAssembly only represents [a few types](Semantics.md#Types).
* More complex types can be formed from these basic types. It's up to the source
language compiler to express its own types in terms of the basic machine
types. This allows WebAssembly to present itself as a virtual ISA, and lets
compilers target it as they would any other ISA.
* These types are directly representable on all modern CPU architectures.
* Smaller types (such as `i8` and `i16`) are usually no more efficient and in
languages like C/C++ are only semantically meaningful for memory accesses
since arithmetic gets widened to `i32` or `i64`. Avoiding them at least for MVP
makes it easier to implement a WebAssembly VM.
* Other types (such as `f16`, `i128`) aren't widely supported by existing
hardware and can be supported by runtime libraries if developers wish to use
them. Hardware support is sometimes uneven, e.g. some support load/store of
`f16` only whereas other hardware also supports scalar arithmetic on `f16`,
and yet other hardware only supports SIMD arithmetic on `f16`. They can be
added to WebAssembly later without compromising MVP.
* More complex object types aren't semantically useful for MVP: WebAssembly
seeks to provide the primitive building blocks upon which higher-level
constructs can be built. They may become useful to support other languages,
especially when considering [garbage collection][future garbage collection].
## Load/Store Addressing
Load/store instructions include an immediate offset used for
[addressing](Semantics.md#Addressing). This is intended to simplify folding
of offsets into complex address modes in hardware, and to simplify bounds
checking optimizations. It offloads some of the optimization work to the
compiler that targets WebAssembly, executing on the developer's machine, instead
of performing that work in the WebAssembly compiler on the user's machine.
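
As a rough illustration of how the immediate offset combines with the dynamic address operand, the following Python sketch models the bounds check for a single access. It assumes the rule from the core specification that the sum of the address and the offset is computed without wrapping, so a large offset cannot wrap back into bounds. The names `Trap` and `effective_address` are hypothetical and used only here.

```python
class Trap(Exception):
    """Stands in for a WebAssembly trap in this model (hypothetical name)."""


def effective_address(base, offset, access_size, memory_size):
    """Return base + offset if the whole access is in bounds, else trap.

    base is the dynamic i32 address operand and offset is the immediate
    carried by the instruction; both are treated as unsigned.
    """
    # Python integers are unbounded, so this sum cannot wrap around.
    address = (base & 0xFFFFFFFF) + offset
    if address + access_size > memory_size:
        raise Trap("out of bounds memory access")
    return address
```

Because the offset is a compile-time constant, an engine can often fold it into the hardware addressing mode or cover a range of small offsets with a single check.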
## Alignment Hints
Load/store instructions contain
[alignment hints](Semantics.md#Alignment). This makes it easier to generate
efficient code on certain hardware architectures.
Either tooling or an explicit opt-in "debug mode" in the spec could allow
execution of a module in a mode that threw exceptions on misaligned access.
This mode would incur some runtime cost for branching on most platforms which is
why it isn't the specified default.
## Out of Bounds
The ideal semantics is for
[out-of-bounds accesses](Semantics.md#Out-of-Bounds) to trap, but the
implications are not yet fully clear.
There are several possible variations on this design being discussed and
experimented with. More measurement is required to understand the associated
tradeoffs.
* After an out-of-bounds access, the instance can no longer execute code and
any outstanding JavaScript [ArrayBuffer][]s aliasing the linear memory are
detached.
* This would primarily allow hoisting bounds checks above effectful
operators.
* This can be viewed as a mild security measure under the assumption that
while the sandbox is still ensuring safety, the instance's internal state
is incoherent and further execution could lead to Bad Things (e.g., XSS
attacks).
* To allow for potentially more-efficient memory sandboxing, the semantics
could allow for a nondeterministic choice between one of the following when
an out-of-bounds access occurred.
* The ideal trap semantics.
* Loads return an unspecified value.
* Stores are either ignored or store to an unspecified location in the
linear memory.
* Either tooling or an explicit opt-in "debug mode" in the spec should allow
execution of a module in a mode that threw exceptions on out-of-bounds
access.
[ArrayBuffer]: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/ArrayBuffer
## Linear Memory Resizing
To allow efficient engines to employ virtual-memory based techniques for bounds
checking, memory sizes are required to be page-aligned.
For portability across a range of CPU architectures and operating systems,
WebAssembly defines a fixed page size.
Programs can depend on this fixed page size and still remain portable across all
WebAssembly engines.
64KiB represents the least common multiple of the page sizes of many platforms and CPUs.
In the future, WebAssembly may offer the ability to use larger page sizes on
some platforms for increased TLB efficiency.
The `memory.grow` operator returns the old memory size. This is desirable for
using `memory.grow` independently on multiple threads, so that each thread can
know where the region it allocated starts. The obvious alternative would be for
such threads to communicate manually; however, WebAssembly implementations will likely
already be communicating between threads in order to properly allocate the sum
of the allocation requests, so it's expected that they can provide the needed
information without significant extra effort.
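
The following Python sketch models why returning the old size is convenient: the returned page count tells the caller exactly where its newly allocated region begins. It is a single-threaded toy model, assuming the fixed 64KiB page size described above and the core specification's convention that a failed grow returns -1. The class name `LinearMemory` is hypothetical.

```python
PAGE_SIZE = 64 * 1024  # fixed WebAssembly page size in bytes


class LinearMemory:
    """Toy model of a linear memory with an optional maximum (in pages)."""

    def __init__(self, initial_pages, max_pages=None):
        self.data = bytearray(initial_pages * PAGE_SIZE)
        self.max_pages = max_pages

    def size(self):
        """Current size in pages."""
        return len(self.data) // PAGE_SIZE

    def grow(self, delta_pages):
        """Grow by delta_pages; return the old size in pages, or -1."""
        old_pages = self.size()
        new_pages = old_pages + delta_pages
        if self.max_pages is not None and new_pages > self.max_pages:
            return -1  # growth refused, memory is left unchanged
        self.data.extend(bytes(delta_pages * PAGE_SIZE))
        return old_pages


mem = LinearMemory(initial_pages=1, max_pages=3)
first = mem.grow(1)                # old size: start of the new region, in pages
region_start = first * PAGE_SIZE   # byte address where the new region begins
second = mem.grow(1)
third = mem.grow(1)                # would exceed the maximum of 3 pages
```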
The [optional maximum size](Modules.md#linear-memory-section) is designed to
address a number of competing constraints:
1. Allow WebAssembly modules to grab large regions of contiguous memory in a
32-bit address space early in an application's startup before the virtual
address space becomes fragmented by execution of the application.
2. Allow many small WebAssembly instances to execute in a single 32-bit process.
(For example, it is common for a single web application to use dozens of
libraries, each of which may, over time, include WebAssembly modules as
implementation details.)
3. Avoid *forcing* every developer using WebAssembly to understand their precise
maximum heap usage.
4. When [threading and shared memory][future threads] are added to WebAssembly
post-MVP, the design should not require memory growth
to `realloc` since this implies significant implementation complexity,
security hazards, and optimization challenges.
The optional maximum addresses these constraints:
* (1) is addressed by specifying a large maximum memory size. Simply setting a
large *initial* memory size has problems due to (3) and the fact that a
failure to allocate the initial memory is a fatal error, which makes the choice of "how
big?" difficult.
* (2) and (3) are addressed by making the maximum optional combined with the
implied implementation that, on 32-bit, engines will not allocate
significantly more than the current memory size, *and* the compiler sets the
initial size to just enough to hold static data.
* (4) is addressed assuming that, when threading is added, a new, optional
"shared" flag is added to the memory section that must be set to enable shared
memory and the shared flag forces the maximum to be specified. In this case,
shared memory never moves; the only thing that changes is that the bounds
grow, which does not have all the abovementioned hazards. In particular, any
extant `SharedArrayBuffer`s that alias linear memory stay valid without
any updates.
## Linear memory disabled if no linear memory section
See [#107](https://github.com/WebAssembly/spec/pull/107).
## Control Flow
Structured control flow provides simple and size-efficient binary encoding and
compilation. Any control flow--even irreducible--can be transformed into structured
control flow with the
[Relooper](https://github.com/kripken/emscripten/raw/master/docs/paper.pdf)
[algorithm](https://dl.acm.org/citation.cfm?id=2048224&CFID=670868333&CFTOKEN=46181900),
with guaranteed low code size overhead, and typically minimal throughput
overhead (except for pathological cases of irreducible control
flow). Alternative approaches can generate reducible control flow via node
splitting, which can reduce throughput overhead, at the cost of increasing
code size (potentially very significantly in pathological cases).
Also,
[more expressive control flow constructs :unicorn:][future flow control]
may be added in the future.
## Nop
The nop operator does not produce a value or cause side effects.
It is nevertheless useful for compilers and tools, which sometimes need to replace instructions with a `nop`. Without a `nop` instruction, code generators would use alternative *does-nothing* opcode patterns that consume space in a module and may have a runtime cost. Finding an appropriate opcode that does nothing but has the appropriate type for the node's location is nontrivial. The existence of many different ways to encode `nop` - often mixed in the same module - would reduce the efficiency of compression algorithms.
## Locals
C/C++ makes it possible to take the address of a function's local values and
pass this pointer to callees or to other threads. Since WebAssembly's local
variables are outside the address space, C/C++ compilers implement address-taken
variables by creating a separate stack data structure within linear memory. This
stack is sometimes called the "aliased" stack, since it is used for variables
which may be pointed to by pointers.
Since the aliased stack appears to the WebAssembly engine as normal memory,
WebAssembly optimizations that would target the aliased stack need to be more
general, and thus more complicated. We observe that common compiler
optimizations done before the WebAssembly code is produced, such as LLVM's
global value numbering, effectively split address-taken variables into many
small ranges that can often be allocated as local variables. Thus our
expectation is that any loss of optimization potential here is minimal.
Conversely, non-address taken values which are usually on the stack are instead
represented as locals inside functions. This effectively means that WebAssembly
has an infinite set of registers, and can choose to spill values as it sees fit
in a manner unobservable to the hosted code. This implies that there's a
separate stack, unaddressable from hosted code, which is also used to spill
return values. This allows strong security properties to be enforced, but does
mean that two stacks are maintained (one by the VM, the other by the compiler
which targets WebAssembly) which can lead to some inefficiencies.
Local variables are not in Static Single Assignment (SSA) form, meaning that
multiple incoming SSA values which have separate liveness can "share" the
storage represented by a local through the `set_local` operator. From an SSA
perspective, this means that multiple independent values can share a local
variable in WebAssembly, which is effectively a kind of pre-coloring that clever
producers can use to pre-color variables and give hints to a WebAssembly VM's
register allocation algorithms, offloading some of the optimization work from
the WebAssembly VM.
## Variable-Length Argument Lists ("varargs")
C and C++ compilers are expected to implement variable-length argument lists by
storing arguments in a buffer in linear memory and passing a pointer to the
buffer. This greatly simplifies WebAssembly VM implementations by punting this
ABI consideration to the front-end compiler. It does negatively impact
performance, but variable-length calls are already somewhat slow.
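
A minimal Python sketch of this convention follows. The buffer layout (consecutive little-endian 4-byte integers) and the function names are assumptions made for illustration; the actual layout is an ABI decision made by the toolchain, not by WebAssembly.

```python
import struct

memory = bytearray(64 * 1024)  # one page of toy linear memory


def sum_varargs(count, args_ptr):
    """Callee: read `count` i32 arguments from the buffer at args_ptr."""
    total = 0
    for i in range(count):
        total += struct.unpack_from("<i", memory, args_ptr + 4 * i)[0]
    return total


def call_sum(values, buffer_ptr):
    """Caller: spill the variadic arguments to memory, then pass a pointer."""
    for i, value in enumerate(values):
        struct.pack_into("<i", memory, buffer_ptr + 4 * i, value)
    # Only fixed-arity values cross the call boundary: a count and a pointer.
    return sum_varargs(len(values), buffer_ptr)
```

From the engine's point of view, `sum_varargs` is an ordinary function with two integer parameters, which is why no special support is needed in the VM.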
## Multiple Return Values
WebAssembly's MVP does not support multiple return values from functions because
they aren't strictly necessary for the earliest anticipated use cases (and it's
a *minimum* viable product), and they would introduce some complexity for some
implementations. However, multiple return values are a very useful feature, and
are relevant to ABIs, so they're likely to be added soon after the MVP.
## Indirect Calls
The table-based scheme for indirect function calls was motivated by the need
to represent function pointers as integer values that can be stored into the
linear memory, as well as to enforce basic safety properties, such as
ensuring that calling a function with the wrong signature does not destroy the safety
guarantees of WebAssembly. In particular, an exact signature match implies
an internal machine-level ABI match, which some engines require to ensure safety.
An indirection also avoids a possible information leak through raw code addresses.
Languages like C and C++ that compile to WebAssembly also imposed
requirements, such as the uniqueness of function pointers and the ability
to compare function pointers to data pointers, or treat data as function
pointers.
Several alternatives to direct indices with a heterogeneous indirect function table
were considered, from alternatives with multiple tables to statically typed function
pointers that can be mapped back and forth to integers. With the added complication
of dynamic linking and dynamic code generation, none of these alternatives perfectly
fit the requirements.
The current design requires two dynamic checks when invoking a function pointer:
a bounds check against the size of the indirect function table and a signature check
for the function at that index against an expected signature. Some dynamic optimization
techniques (e.g. inline caches, or a one-element cache), can reduce the number of
checks in common cases. Other techniques such as trading a bounds check for a mask or
segregating the table per signature to require only a bounds check could be considered
in the future. Also, if tables are small enough, an engine can internally use per-signature
tables filled with failure handlers to avoid one check.
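
The two dynamic checks can be sketched in Python as follows. This is a toy model: signatures are represented as tuples of type names, the table is a plain list, and the names `Trap`, `call_indirect`, `SIG_BINARY`, and `SIG_UNARY` are hypothetical.

```python
class Trap(Exception):
    """Stands in for a WebAssembly trap in this model (hypothetical name)."""


# A signature is modeled as (parameter types, result types).
SIG_BINARY = (("i32", "i32"), ("i32",))
SIG_UNARY = (("i32",), ("i32",))

# Heterogeneous table: each entry pairs a function with its signature.
table = [
    (SIG_BINARY, lambda a, b: (a + b) & 0xFFFFFFFF),
    (SIG_UNARY, lambda a: (-a) & 0xFFFFFFFF),
]


def call_indirect(table, index, expected_sig, *args):
    # Check 1: bounds check against the size of the table.
    if not 0 <= index < len(table):
        raise Trap("undefined table element")
    actual_sig, func = table[index]
    # Check 2: exact signature match against the caller's expectation.
    if actual_sig != expected_sig:
        raise Trap("indirect call signature mismatch")
    return func(*args)
```

A function pointer in the source language is then just the integer `index`, which can be stored in linear memory like any other `i32`.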
## Control Flow Instructions with Values
Control flow instructions such as `br`, `br_if`, `br_table`, `if` and `if-else` can
transfer stack values in WebAssembly. These primitives are useful building blocks for
WebAssembly producers, e.g. in compiling expression languages. They offer significant
size reduction by avoiding the need for `set_local`/`get_local` pairs in the common case
of an expression with only one immediate use. Control flow instructions can then model
expressions with result values, thus allowing even more opportunities to further reduce
`set_local`/`get_local` usage (which constitute 30-40% of total bytes in the
[polyfill prototype](https://github.com/WebAssembly/polyfill-prototype-1)).
`br`-with-value and `if` constructs that return values can also model `phis` which
appear in SSA representations of programs.
## Limited Local Nondeterminism
There are a few obvious cases where nondeterminism is essential to the API, such
as random number generators, date/time functions or input events. The
WebAssembly specification is strict when it comes to other sources of
[limited local nondeterminism](Nondeterminism.md) of operators: it specifies
all possible corner cases, and specifies a single outcome when this can be done
reasonably.
Ideally, WebAssembly would be fully deterministic because a fully deterministic
platform is easier to:
* Reason about.
* Implement.
* Test portably.
Nondeterminism is only specified as a compromise when there is no other
practical way to:
* Achieve [portable](Portability.md) native performance.
* Lower resource usage.
* Reduce implementation complexity (both of WebAssembly VMs as well as compilers
generating WebAssembly binaries).
* Allow usage of new hardware features.
* Allow implementations to [security-harden](Security.md) certain use cases.
When nondeterminism is allowed into WebAssembly it is always done in a limited
and local manner. This prevents the entire program from being invalid, as would
be the case with C++ undefined behavior.
As WebAssembly gets implemented and tested with multiple languages on multiple
architectures we may revisit some of the design decisions:
* When all relevant hardware implements an operation the same way, there's no
need for nondeterminism in WebAssembly semantics. One such
example is floating-point: at a high level, most operators follow
IEEE-754 semantics, so it is not necessary to specify WebAssembly's
floating-point operators differently from IEEE-754.
* When different languages have different expectations then it's unfortunate if
WebAssembly measurably penalizes one's performance by enforcing determinism
which that language doesn't care about, but which another language may want.
## NaN bit-pattern nondeterminism.
NaNs produced by floating-point instructions in WebAssembly have
nondeterministic bit patterns in most circumstances. The bit pattern of a NaN
is not usually significant, however there are a few ways that it can be
observed:
- a `reinterpret` conversion to an integer type
- a `store` to linear memory followed by a load with a different type or index
- a NaN stored to an imported or exported global variable or linear memory may
be observed by the outside environment
- a NaN passed to a `call` or `call_indirect` to an imported function may
be observed by the outside environment
- a return value of an exported function may be observed by the outside
environment
- `copysign` can be used to copy the sign bit onto a non-NaN value, where
it can then be observed
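
As a small illustration of the first and last items in the list above, the following Python sketch builds two NaNs with different bit patterns and observes the difference through a reinterpretation and through `copysign`. The helper names are hypothetical, and the sketch assumes a host where Python floats are IEEE 754 doubles whose quiet-NaN bits survive a round trip through `struct`, which holds on common platforms.

```python
import math
import struct


def f64_from_bits(bits):
    """Build a float from a 64-bit pattern (models f64.reinterpret_i64)."""
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def f64_reinterpret_as_i64(value):
    """Expose the 64-bit pattern of a float (models i64.reinterpret_f64)."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


# Two quiet NaNs that differ in sign bit and payload.
nan_a = f64_from_bits(0x7FF8000000000000)
nan_b = f64_from_bits(0xFFF8000000000001)

# Both are simply "NaN" to ordinary arithmetic and comparisons...
both_nan = math.isnan(nan_a) and math.isnan(nan_b)

# ...but reinterpreting them as integers reveals different bits,
bits_differ = f64_reinterpret_as_i64(nan_a) != f64_reinterpret_as_i64(nan_b)

# and copysign moves the NaN's sign bit onto an ordinary value.
sign_of_b = math.copysign(1.0, nan_b)
```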
The motivation for nondeterminism in NaN bit patterns is that popular platforms
have differing behavior. IEEE 754-2019 makes some recommendations, but has few
hard requirements in this area, and in practice there is significant divergence,
for example:
- When an instruction with no NaN inputs produces a NaN output, x86 produces
a NaN with the sign bit set, while ARM and others produce a NaN with it
unset.
- When an instruction has multiple NaN inputs, x86 always returns the first
NaN (converted to a quiet NaN if needed), while ARMv8 returns the first
signaling NaN (converted to a quiet NaN) if one is present, and otherwise
returns the first quiet NaN.
- Some hardware architectures have found that returning one of the input NaNs
has a cost, and prefer to return a NaN with a fixed bit pattern instead.
- LLVM (used in some WebAssembly implementations) doesn't guarantee that it
won't commute `fadd`, `fmul` and other instructions, so it's not possible
to rely on the "first" NaN being preserved as such.
- IEEE 754-2019 itself recommends architectures use NaN bits to provide
architecture-specific debugging facilities.
IEEE 754-2019 6.2 says that instructions returning a NaN *should* return one of
their input NaNs. In WebAssembly, implementations may do this, however they are
not required to. Since IEEE 754-2019 states this as a "should" (as opposed to a
"shall"), it isn't a requirement for IEEE 754-2019 conformance.
An alternative design would be to require engines to always "canonicalize"
NaNs whenever their bits could be observed. This would eliminate the
nondeterminism and provide slightly better portability, since it would hide
hardware-specific NaN propagation behavior. However, it is theorized that this
would add an unacceptable amount of overhead, and that the benefit is marginal
since most programs are unaffected by this issue.
## Support for NaN-boxing.
In general, WebAssembly's floating point instructions provide the guarantee that
if all NaNs passed to an instruction are "canonical", the result is "canonical",
where canonical means the most significant bit of the fraction field is 1, and
the trailing bits are all 0.
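
The definition of "canonical" above can be written as a predicate on raw bit patterns. The following Python sketch does so for both widths; the function names are hypothetical, and the sign bit is deliberately ignored, matching the definition.

```python
def is_canonical_nan_f32(bits):
    """True if a 32-bit pattern is a canonical NaN (sign bit ignored)."""
    exponent = (bits >> 23) & 0xFF
    fraction = bits & ((1 << 23) - 1)
    # All-ones exponent; only the top fraction bit is set.
    return exponent == 0xFF and fraction == 1 << 22


def is_canonical_nan_f64(bits):
    """True if a 64-bit pattern is a canonical NaN (sign bit ignored)."""
    exponent = (bits >> 52) & 0x7FF
    fraction = bits & ((1 << 52) - 1)
    return exponent == 0x7FF and fraction == 1 << 51
```

A NaN-boxing interpreter could use a check of this kind only on values entering from outside, relying on the guarantee above for results of arithmetic on canonical inputs.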
This is intended to support interpreters running on WebAssembly that use
NaN-boxing, because they don't have to canonicalize the output of an arithmetic
instruction if they know the inputs are canonical.
When one or more of the inputs of an instruction are non-canonical NaNs, the
resulting NaN bit pattern is nondeterministic. This is intended to accommodate
the diversity in NaN behavior among popular hardware architectures.
Note that the sign bit is still nondeterministic in a canonical NaN. This is
also to accommodate popular hardware architectures; for example, x86 generates
NaNs with the sign bit set to 1 while other architectures generate NaNs with it
set to 0. And as above, the cost of canonicalizing NaNs is believed to be
greater than the benefit.
NaNs generated by JS or other entities in the external environment are not
required to be canonical, so exported function arguments, imported function
return values, and values stored in exported variables or memory may be
non-canonical.
## Integer operations
WebAssembly's signed integer divide rounds its result toward zero. This is not
because of a lack of sympathy for
[better alternatives](https://python-history.blogspot.com/2010/08/why-pythons-integer-division-floors.html),
but out of practicality. Because all popular hardware today implements
rounding toward zero, and because C and many other languages now specify
rounding to zero, having WebAssembly in the middle doing something different
would mean divisions would have to be doubly complicated.
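
Python happens to be one of the languages that floors instead, which makes the difference easy to see. The sketch below models a 32-bit signed divide that rounds toward zero; it assumes the core specification's behavior of trapping on division by zero and on the one overflowing case. The names `Trap` and `i32_div_s` are hypothetical helpers for this illustration.

```python
class Trap(Exception):
    """Stands in for a WebAssembly trap in this model (hypothetical name)."""


INT32_MIN = -(2 ** 31)


def i32_div_s(a, b):
    """Signed 32-bit division that rounds toward zero."""
    if b == 0:
        raise Trap("integer divide by zero")
    if a == INT32_MIN and b == -1:
        # The mathematical result, 2**31, is not representable as an i32.
        raise Trap("integer overflow")
    magnitude = abs(a) // abs(b)
    same_sign = (a < 0) == (b < 0)
    return magnitude if same_sign else -magnitude


truncated = i32_div_s(-7, 2)  # rounds toward zero
floored = -7 // 2             # Python's built-in floor division
```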
Similarly, WebAssembly's shift operators mask their shift counts to the number
of bits in the shifted value. Confusingly, this means that shifting a 32-bit
value by 32 bits is an identity operation, and that a left shift is not
equivalent to a multiplication by a power of 2 because the overflow behavior
is different. Nevertheless, because several popular hardware architectures
today implement this masking behavior, and those that don't can typically
emulate it with a single extra mask instruction, and because several popular
source languages, including JavaScript and C#, have come to specify this
behavior too, we reluctantly adopt this behavior as well.
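
The masking behavior can be sketched as follows for 32-bit values. The function names mirror the instruction names, but the code is only an emulation written for illustration.

```python
MASK32 = 0xFFFF_FFFF


def i32_shl(value: int, count: int) -> int:
    """Left shift with the count masked to the low 5 bits (0..31)."""
    return (value << (count & 31)) & MASK32


def i32_shr_u(value: int, count: int) -> int:
    """Logical right shift with the same count masking."""
    return (value & MASK32) >> (count & 31)
```

With this definition `i32_shl(3, 32)` returns `3`, because the count is masked to `0`, while multiplying `3` by `2**32` and wrapping to 32 bits gives `0`. This is the difference from multiplication by a power of 2 that the paragraph above describes.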
WebAssembly has three classes of integer operations: signed, unsigned, and
sign-agnostic. The signed and unsigned instructions have the property that
whenever they can't return their mathematically expected value (such as when
an overflow occurs, or when their operand is outside their domain), they
trap, in order to avoid silently returning an incorrect value.
Note that the `add`, `sub`, and `mul` operators are categorized as
sign-agnostic. Because of the magic of two's complement representation, they
may be used for both signed and unsigned purposes. Note that this (very
conveniently!) means that engines don't need to add extra overflow-checking
code for these most common of arithmetic operators on the most popular
hardware platforms.
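
A small sketch of why `add` can be sign-agnostic: the wrapped 32-bit result has the same bit pattern whether the operands are read as signed or unsigned. The helpers `to_bits` and `as_signed` are hypothetical conveniences for moving between Python integers and 32-bit patterns.

```python
MASK32 = 0xFFFF_FFFF


def to_bits(value: int) -> int:
    """Encode a signed or unsigned integer as a 32-bit two's complement pattern."""
    return value & MASK32


def as_signed(bits: int) -> int:
    """Reinterpret a 32-bit pattern as a signed integer."""
    return bits - (1 << 32) if bits & 0x8000_0000 else bits


def i32_add(lhs_bits: int, rhs_bits: int) -> int:
    """Wrapping 32-bit addition; no overflow check is needed."""
    return (lhs_bits + rhs_bits) & MASK32
```

Adding the patterns for `-5` and `3` gives `0xFFFFFFFE`, which reads as `-2` when signed and as `4294967294` (that is, `4294967291 + 3`) when unsigned.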
## Motivating Scenarios for Feature Testing
1. [Post-MVP :unicorn:][future general],
[`i32.bitrev` :unicorn:][future bitrev] is introduced. A
WebAssembly developer updates their toolkit so that the compiler may leverage
`i32.bitrev`. The developer's WebAssembly module works correctly both on
execution environments at MVP, as well as those supporting `i32.bitrev`.
* A variant of this, where a few more new opcodes are available, the compiler
is updated to be able to leverage all of them, but not all execution targets
support all of them. The developer wants to reach as many of their customers as
possible, while at the same time providing them with the best experience
possible. The developer has to balance the cost of the test matrix resulting
from the combinations of possible feature configurations.
2. [Post-MVP :unicorn:][future general], module authors may now use
[Threading][future threads]
APIs in the browser. A developer wants to leverage multithreading in their
module.
* In one variant of the scenario, our developer does not want to pay the
engineering cost of developing and supporting a threaded and non-threaded
version of their code. They opt not to support MVP targets, and only support
post-MVP targets. End-users (browser users) on MVP-only targets get some message
indicating they need post-MVP support.
* In another variant, our developer explicitly authors both MVP-only and
post-MVP (with threads) code.
3. [SIMD][future simd] support is not universally
equivalent on all targets. While polyfill variants of SIMD APIs are available,
a developer prefers writing dedicated SIMD and non-SIMD versions of their
compression algorithm, because the non-SIMD version performs better in
environments without SIMD support, when compared to the SIMD polyfill. They
package their compression code for reuse by third parties.
4. An application author is assembling together an application by reusing
modules such as those developed in the scenarios above. The application author's
development environment is able to quickly and correctly identify the platform
dependencies (e.g. threading, SIMD) and communicate back to the application
author the implications these dependencies have on the end-application. Some
APIs exposed from the threading-aware module are only pertinent to environments
supporting threading. As a consequence, the application author needs to write
specialized code when threads are/are not supported. (Note: we should understand
this scenario for both forms of WebAssembly reuse currently imagined: dynamic
linking and static imports.)
5. The compression algorithm described in scenario 3 is deployed on a
restrictive execution environment, as part of an application. In this
environment, a process may not change memory page access protection flags (e.g.
certain gaming consoles, to investigate server side deployment scenarios). The
compression module is compiled by the WebAssembly environment, enabling the
configuration most specific to the target (i.e. with/without Threads, SIMD,
etc).
* A variant of this scenario where the environment is additionally separating
storage into system-visible and application-visible, the latter not being able
to contain machine-executable code (certain phones, to investigate if gaming
consoles or server side have a similar sandboxing mechanism).
## Why a binary encoding?
Given that text is so compressible and it is well known that it is hard to beat
gzipped source, is there any win from having a binary format over a text format?
Yes:
* Large reductions in payload size can still significantly decrease the
compressed file size.
* Experimental results from a
[polyfill prototype](https://github.com/WebAssembly/polyfill-prototype-1) show the
gzipped binary format to be about 20-30% smaller than the corresponding
gzipped asm.js.
* A binary format that represents the names of variables and functions with raw
indices instead of strings is much faster to decode: array indexing
vs. dictionary lookup.
* Experimental results from a
[polyfill prototype](https://github.com/WebAssembly/polyfill-prototype-1) show that
decoding the binary format is about 23× faster than parsing the
corresponding asm.js source. The measurement used
[this demo](https://github.com/lukewagner/AngryBotsPacked) and compared
*just* parsing in SpiderMonkey (no validation, IR generation) to *just*
decoding in the polyfill (no asm.js code generation).
* A binary format allows many optimizations for code size and decoding speed that would
not be possible on a source form.
## Why a layered binary encoding?
* We can do better than generic compression because we are aware of the code
structure and other details:
* For example, macro compression that
[deduplicates AST trees](https://github.com/WebAssembly/design/issues/58#issuecomment-101863032)
can focus on ASTs + their children, thus having `O(nodes)` entities
to worry about, compared to generic compression which in principle would
need to look at `O(bytes*bytes)` entities. Such macros would allow the
logical equivalent of `#define ADD1(x) (x+1)`, i.e., to be
parameterized. Simpler macros (`#define ADDX1 (x+1)`) can implement useful
features like constant pools.
* Another example is reordering of functions and some internal nodes, which
we know does not change semantics, but
[can improve general compression](https://www.rfk.id.au/blog/entry/cromulate-improve-compressibility/).
* JITs and simple developer tooling do not benefit from compression, so layering allows
the related development and maintenance burden to be offloaded to reusable tools/libraries.
* Each of the layers works to find compression opportunities to the best of
its abilities, without encroaching upon the subsequent layer's compression
opportunities.
* [Existing web standards](https://www.w3.org/TR/PNG/) demonstrate many of
the advantages of a layered encoding strategy.
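
The point about structure-aware compression working on `O(nodes)` entities can be sketched with a simple subtree-interning pass. This is plain hash-consing, chosen here for illustration; it is not the macro scheme from the linked issue, and the tuple-based AST shape and the `dedup` function are assumptions of this example.

```python
from typing import Dict, List


def dedup(node, table: Dict[tuple, int], pool: List[tuple]) -> int:
    """Intern `node` bottom-up and return its index in `pool`.

    An inner node is a tuple (opcode, child, child, ...); anything else is a leaf.
    Each distinct subtree is stored once, so identical subtrees share one entry.
    """
    if isinstance(node, tuple):
        children = tuple(dedup(child, table, pool) for child in node[1:])
        key = ("node", node[0]) + children
    else:
        key = ("leaf", node)
    if key not in table:
        table[key] = len(pool)
        pool.append(key)
    return table[key]


# (x * y) + (x * y): seven AST nodes, but only four distinct subtrees.
table: Dict[tuple, int] = {}
pool: List[tuple] = []
root = dedup(("add", ("mul", "x", "y"), ("mul", "x", "y")), table, pool)
```

The pass visits each node once and keys on child indices, so it never has to compare arbitrary byte ranges the way a generic compressor does.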
## Why "polymorphic" stack typing after control transfer?
For a stack machine, the fundamental property that type checking must ensure is that for every _edge_ in a control flow graph (formed by the individual instructions), the assumptions about the stack match up.
At the same time, type checking should be fast, so it should be expressible as a linear traversal.
There is no control edge from an unconditional control transfer instruction (like `br`, `br_table`, `return`, or `unreachable`) to the textually following instruction -- the next instruction is unreachable.
Consequently, no constraint has to be imposed between the two regarding the stack.
In a linear type checking algorithm this can be expressed by allowing the checker to assume any stack type at that point.
In type system terms, the instruction can thus be said to be _polymorphic_ in its stack type.
This solution is canonical in the sense that it induces the minimal type constraints necessary to ensure soundness, while also maintaining the following useful structural properties about valid (i.e., well-typed) code:
* It is composable: if instruction sequences A and B are valid, then block(A,B) is valid (e.g. at empty stack type).
* It is decomposable: if block(A,B) is valid, then both A and B are valid.
* It generalises an expression language: every expression is directly expressible, even if it contains control transfer.
* It is agnostic to reachability: neither the specification nor producers need to define, check, or otherwise care about reachability (unless they choose to, e.g. to perform dead code elimination).
* It maintains basic transformations: for example, (i32.const 1) (br_if $l) can always be replaced with (br $l).
Some of these properties are relevant to support certain compilation techniques without artificial complication for producers.
As a small but representative example, consider a textbook one-pass compiler, which would often use something like the following scheme for compiling expressions:
```
compile(A + B) = compile(A); compile(B); emit(i32.add)
compile(A / B) = compile(A); compile(B); emit(dup i32.eqz (br_if $div_zero) drop i32.div)
compile(A / 0) = compile(A); emit(br $div_zero)
```
The third case is a specialisation of the second, a simple optimisation.
Without polymorphic typing of the `br` instruction, however, this simple scheme would not work, since `compile((x/0)+y)` would result in invalid code. Worse, it can lead to subtle compiler bugs.
Similar situations arise in production compilers, for example, in the asm.js-to-wasm compilers that some of the WebAssembly VMs/tools are implementing.
Other solutions that have been discussed would lose most of the above properties. For example:
* Disallowing certain forms of unreachable code (whether by rules or by construction) would break all but decomposability.
* Allowing but not type-checking unreachable code would break decomposability and require the spec to provide a syntactic definition of reachability (which is unlikely to match the definition otherwise used by producers).
* Requiring the stack to be empty after a jump remains agnostic to reachability and maintains decomposability, but still breaks all other properties.
It is worth noting that this kind of type checking, in general, is not unusual.
For example, the JVM also poses no constraint on the stack type after a jump (however, in its modern form, it recommends type annotations in the form of _stack maps_, which are then required after jumps to make the instantiation of the polymorphic typing rule explicit).
Moreover, programming languages that allow control transfer _expressions_ usually type them polymorphically as well (e.g., `throw`/`raise`, which is an expression in some languages).
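
The linear checking idea from this section can be sketched for a single block of `i32` code. This is a simplified illustration under several assumptions: instructions are bare strings without immediates, every local is `i32`, `br` targets a label that takes no operands, and there is no nesting. Real validators track this state per control frame. The `polymorphic` flag exists only to contrast with the "stack must be empty after a jump" alternative.

```python
from typing import List


class ValidationError(Exception):
    """Raised when an instruction sequence is not well-typed."""


# Unconditional transfers handled by this sketch (assumed to take no operands).
UNCONDITIONAL = {"br", "unreachable"}


def validate(instrs: List[str], results: List[str], polymorphic: bool = True) -> None:
    """Type-check `instrs` in one linear pass against the expected `results`."""
    stack: List[str] = []
    unreachable = False  # True once the rest of the block cannot be reached

    def pop(expected: str) -> None:
        if stack:
            actual = stack.pop()
            if actual != expected:
                raise ValidationError(f"expected {expected}, found {actual}")
        elif not unreachable:
            raise ValidationError(f"stack underflow, expected {expected}")
        # Otherwise the polymorphic stack supplies a value of any needed type.

    for instr in instrs:
        if instr in ("i32.const", "local.get"):
            stack.append("i32")
        elif instr in ("i32.add", "i32.div_s"):
            pop("i32")
            pop("i32")
            stack.append("i32")
        elif instr in UNCONDITIONAL:
            stack.clear()
            unreachable = polymorphic
        else:
            raise ValidationError(f"unknown instruction {instr}")

    for expected in reversed(results):
        pop(expected)
    if stack:
        raise ValidationError("values left on the stack")
```

With these rules the output of `compile((x/0)+y)`, modeled as `["local.get", "br", "local.get", "i32.add"]`, validates against a result of `["i32"]`, while the same call with `polymorphic=False` raises `ValidationError` at `i32.add`.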
[future general]: FutureFeatures.md
[future flow control]: https://github.com/WebAssembly/design/issues/796
[future bitrev]: https://github.com/WebAssembly/design/issues/1382
[future threads]: https://github.com/WebAssembly/design/issues/1073
[future simd]: https://github.com/WebAssembly/design/issues/1075
[future garbage collection]: https://github.com/WebAssembly/proposals/issues/16
================================================
FILE: Security.md
================================================
# Security
The security model of WebAssembly has two important goals: (1) protect *users*
from buggy or malicious modules, and (2) provide *developers* with useful
primitives and mitigations for developing safe applications, within the
constraints of (1).
## Users
Each WebAssembly module executes within a sandboxed environment separated from
the host runtime using fault isolation techniques. This implies:
* Applications execute independently, and can't escape the sandbox without
going through appropriate APIs.
* Applications generally execute deterministically
[with limited exceptions](Nondeterminism.md).
Additionally, each module is subject to the security policies of its embedding.
Within a [web browser](Web.md), this includes restrictions on information flow
through [same-origin policy][]. On a [non-web](NonWeb.md) platform, this could
include the POSIX security model.
## Developers
The design of WebAssembly promotes safe programs by eliminating dangerous
features from its execution semantics, while maintaining compatibility with
programs written for [C/C++](CAndC%2B%2B.md).
Modules must declare all accessible functions and their associated types
at load time, even when [dynamic linking](DynamicLinking.md) is used. This
allows implicit enforcement of [control-flow integrity][] (CFI) through
structured control-flow. Since compiled code is immutable and not observable at
runtime, WebAssembly programs are protected from control flow hijacking attacks.
* [Function calls](Semantics.md#calls) must specify the index of a target
that corresponds to a valid entry in the
[function index space](Modules.md#function-index-space) or
[table index space](Modules.md#table-index-space).
* [Indirect function calls](Rationale.md#indirect-calls) are subject to a type
signature check at runtime; the type signature of the selected indirect
function must match the type signature specified at the call site.
* A protected call stack that is invulnerable to buffer overflows in the
module heap ensures safe function returns.
* [Branches](Semantics.md#branches-and-nesting) must point to valid
destinations within the enclosing function.
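
The runtime check on indirect calls described in the list above might look roughly like the following. The `Func` record, the tuple encoding of signatures, and the trap messages are assumptions made for this sketch, not an engine's actual data structures.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple


class Trap(Exception):
    """Stand-in for a WebAssembly trap."""


# (parameter types, result types), e.g. (("i32", "i32"), ("i32",))
Signature = Tuple[Tuple[str, ...], Tuple[str, ...]]


@dataclass
class Func:
    signature: Signature
    body: Callable[..., object]


def call_indirect(table: List[Optional[Func]], index: int, expected: Signature, *args):
    """Call table[index] only if it exists and its signature matches the call site."""
    if not 0 <= index < len(table):
        raise Trap("table index out of bounds")
    func = table[index]
    if func is None:
        raise Trap("uninitialized table element")
    if func.signature != expected:
        raise Trap("indirect call signature mismatch")
    return func.body(*args)


I32_BINOP: Signature = (("i32", "i32"), ("i32",))
F64_UNOP: Signature = (("f64",), ("f64",))

table: List[Optional[Func]] = [
    Func(I32_BINOP, lambda a, b: a + b),
    Func(F64_UNOP, lambda x: x * 2.0),
    None,
]
```

Calling `call_indirect(table, 0, I32_BINOP, 2, 3)` succeeds, whereas using index `1` with the same expected signature, or any index outside the table, raises `Trap` before any callee code runs.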
Variables in C/C++ can be lowered to two different primitives in WebAssembly,
depending on their scope. [Local variables](Semantics.md#local-variables)
with fixed scope and [global variables](Semantics.md#global-variables) are
represented as fixed-type values stored by index. The former are initialized
to zero by default and are stored in the protected call stack, whereas
the latter are located in the [global index space](Modules.md#global-index-space)
and can be imported from external modules. Local variables with
[unclear static scope](Rationale.md#locals) (e.g. are used by the address-of
operator, or are of type `struct` and returned by value) are stored in a separate
user-addressable stack in [linear memory](Semantics.md#linear-memory) at
compile time. This is an isolated memory region with fixed maximum size that is
zero initialized by default. References to this memory are computed with
infinite precision to avoid wrapping and simplify bounds checking. WebAssembly
modules may also have [multiple linear memory sections], which are independent
of each other. In the future,
[finer-grained memory operations][future memory control]
(e.g. shared memory, page protection, large pages, etc.) may be implemented.
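
The remark about computing references with infinite precision can be illustrated with a short sketch. Python integers do not wrap, so the sum below models the unbounded computation directly; the function name and parameters are hypothetical.

```python
class Trap(Exception):
    """Stand-in for a WebAssembly trap."""


MASK32 = 0xFFFF_FFFF


def effective_address(base: int, offset: int, access_size: int, memory_size: int) -> int:
    """Return base + offset if the whole access fits in memory, else trap.

    `base` and `offset` are unsigned 32-bit values. Their sum is not wrapped
    to 32 bits, so a single comparison against the memory size is enough.
    """
    address = base + offset
    if address + access_size > memory_size:
        raise Trap("out of bounds memory access")
    return address


# What a wrapping 32-bit add would have produced for the same operands.
wrapped = (0xFFFF_FFFF + 8) & MASK32
```

Here `wrapped` is `7`, which would look like a valid address in a 64 KiB memory, while `effective_address(0xFFFF_FFFF, 8, 4, 65536)` traps because the unwrapped sum is far outside the memory.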
[Traps](Semantics.md#traps) are used to immediately terminate execution and
signal abnormal behavior to the execution environment. In a browser, this is
represented as a JavaScript exception. Support for
module-defined trap handlers
may be implemented in the future. Operations that can trap include:
* specifying an invalid index in any index space,
* performing an indirect function call with a mismatched signature,
* exceeding the maximum size of the protected call stack,
* accessing [out-of-bounds](Rationale.md#out-of-bounds) addresses in linear
memory,
* executing illegal arithmetic operations (e.g. division or remainder by
zero, signed division overflow, etc.).
### Memory Safety
Compared to traditional C/C++ programs, these semantics obviate certain classes
of memory safety bugs in WebAssembly. Buffer overflows, which occur when data
exceeds the boundaries of an object and accesses adjacent memory regions, cannot
affect local or global variables stored in index space, because they are fixed-size and
addressed by index. Data stored in linear memory can overwrite adjacent objects,
since bounds checking is performed at linear memory region granularity and is
not context-sensitive. However, the presence of control-flow integrity and
protected call stacks prevents direct code injection attacks. Thus,
common mitigations such as [data execution prevention][] (DEP) and
[stack smashing protection][] (SSP) are not needed by WebAssembly programs.
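
The granularity of the bounds check can be shown with a toy model of linear memory. The `LinearMemory` class and the buffer layout are assumptions of this sketch; the point is only that an overflow inside the region silently corrupts a neighbouring object, while an access past the end of the region traps.

```python
class Trap(Exception):
    """Stand-in for a WebAssembly trap."""


class LinearMemory:
    """A zero-initialized byte region with region-level bounds checks only."""

    def __init__(self, size: int) -> None:
        self.data = bytearray(size)

    def store(self, address: int, payload: bytes) -> None:
        if address < 0 or address + len(payload) > len(self.data):
            raise Trap("out of bounds memory access")
        self.data[address:address + len(payload)] = payload

    def load(self, address: int, length: int) -> bytes:
        if address < 0 or address + length > len(self.data):
            raise Trap("out of bounds memory access")
        return bytes(self.data[address:address + length])


memory = LinearMemory(32)
memory.store(8, b"SECRET!!")   # an object placed right after an 8-byte buffer at 0
memory.store(0, b"A" * 12)     # a 12-byte write into the 8-byte buffer overflows
```

After the second store, the object at address 8 reads `b"AAAAET!!"`: the overflow was not detected because it stayed inside the memory region.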
Another common class of memory safety errors involves unsafe pointer usage and
[undefined behavior](CAndC%2B%2B.md#undefined-behavior). This includes
dereferencing pointers to unallocated memory (e.g. `NULL`), or freed memory
allocations. In WebAssembly, the semantics of pointers have been eliminated for
function calls and variables with fixed static scope, allowing references to
invalid indexes in any index space to trigger a validation error at load time,
or at worst a trap at runtime. Accesses to linear memory are bounds-checked at
the region level, potentially resulting in a trap at runtime. These memory
region(s) are isolated from the internal memory of the runtime, and are set to
zero by default unless otherwise initialized.
Nevertheless, other classes of bugs are not obviated by the semantics of
WebAssembly. Although attackers cannot perform direct code injection attacks,
it is possible to hijack the control flow of a module using code reuse attacks
against indirect calls. However, conventional [return-oriented programming][]
(ROP) attacks using short sequences of instructions ("gadgets") are not possible
in WebAssembly, because control-flow integrity ensures that call targets are
valid functions declared at load time. Likewise, race conditions, such as
[time of check to time of use][] (TOCTOU) vulnerabilities, are possible in
WebAssembly, since no execution or scheduling guarantees are provided beyond
in-order execution and [atomic memory primitives][threads].
Similarly, [side channel attacks][] can occur, such as timing attacks against
modules. In the future, additional protections may be provided by runtimes or
the toolchain, such as code diversification or memory randomization (similar to
[address space layout randomization][] (ASLR)), or [bounded pointers][] ("fat"
pointers).
### Control-Flow Integrity
The effectiveness of control-flow integrity can be measured based on its
completeness. Generally, there are three types of external control-flow
transitions that need to be protected, because the callee may not be trusted:
1. Direct function calls,
2. Indirect function calls,
3. Returns.
Together, (1) and (2) are commonly referred to as "forward-edge", since they
correspond to forward edges in a directed control-flow graph. Likewise (3) is
commonly referred to as "back-edge", since it corresponds to back edges in a
directed control-flow graph. More specialized function calls, such as tail
calls, can be viewed as a combination of (1) and (3).
Typically, this is implemented using runtime instrumentation. During
compilation, the compiler generates an expected control flow graph of program
execution, and inserts runtime instrumentation at each call site to verify that
the transition is safe. Sets of expected call targets are constructed from the
set of all possible call targets in the program, unique identifiers are assigned
to each set, and the instrumentation checks whether the current call target is
a member of the expected call target set. If this check succeeds, then the
original call is allowed to proceed, otherwise a failure handler is executed,
which typically terminates the program.
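
The instrumentation scheme described above can be sketched as follows. The target sets, their identifiers, and the `checked_call` wrapper are hypothetical stand-ins for what a compiler would generate; they are not taken from any particular implementation.

```python
from typing import Callable, Dict, FrozenSet


class CfiViolation(Exception):
    """Raised by the failure handler when a call target is not expected."""


def add(a: int, b: int) -> int:
    return a + b


def sub(a: int, b: int) -> int:
    return a - b


def log(message: str) -> str:
    return f"log: {message}"


# Assumed result of the compile-time analysis: each set of expected call
# targets gets a unique identifier, and each call site records one identifier.
TARGET_SETS: Dict[int, FrozenSet[Callable]] = {
    0: frozenset({add, sub}),
    1: frozenset({log}),
}


def checked_call(set_id: int, target: Callable, *args):
    """Instrumented call site: verify set membership before transferring control."""
    if target not in TARGET_SETS[set_id]:
        raise CfiViolation(f"{target.__name__} is not in target set {set_id}")
    return target(*args)
```

A call site tagged with set `0` may reach `add` or `sub`, so `checked_call(0, add, 2, 3)` proceeds, while `checked_call(0, log, "x")` is stopped by the failure handler.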
In WebAssembly, the execution semantics implicitly guarantee the safety of (1)
through usage of explicit function section indexes, and (3) through a protected
call stack. Additionally, the type signature of indirect function calls is
already checked at runtime, effectively implementing coarse-grained type-based
control-flow integrity for (2). All of this is achieved without explicit runtime
instrumentation in the module. However, as discussed
[previously](#memory-safety), this protection does not prevent code reuse
attacks with function-level granularity against indirect calls.
#### Clang/LLVM CFI
The Clang/LLVM compiler infrastructure includes a [built-in implementation] of
fine-grained control flow integrity, which has been extended to support the
WebAssembly target. It is available in Clang/LLVM with the
[WebAssembly backend][new WebAssembly backend].
Enabling fine-grained control-flow integrity (by passing `-fsanitize=cfi` to
emscripten) has a number of advantages over the default WebAssembly
configuration. Not only does this better defend against code reuse attacks that
leverage indirect function calls (2), but it also enhances the built-in function
signature checks by operating at the C/C++ type level, which is semantically
richer than the WebAssembly [type level](Semantics.md#types), which consists
of only four value types. Currently, enabling this feature has a small
performance cost for each indirect call, because an integer range check is
used to verify that the target index is trusted, but this may be eliminated in
the future by leveraging built-in support for
[multiple indirect tables] with homogeneous type
in WebAssembly.
[address space layout randomization]: https://en.wikipedia.org/wiki/Address_space_layout_randomization
[bounded pointers]: https://en.wikipedia.org/wiki/Bounded_pointer
[built-in implementation]: https://clang.llvm.org/docs/ControlFlowIntegrity.html
[control-flow integrity]: https://research.microsoft.com/apps/pubs/default.aspx?id=64250
[data execution prevention]: https://en.wikipedia.org/wiki/Executable_space_protection
[forward-edge control-flow integrity]: https://www.usenix.org/node/184460
[new WebAssembly backend]: https://github.com/WebAssembly/binaryen#cc-source--webassembly-llvm-backend--s2wasm--webassembly
[return-oriented programming]: https://en.wikipedia.org/wiki/Return-oriented_programming
[same-origin policy]: https://www.w3.org/Security/wiki/Same_Origin_Policy
[side channel attacks]: https://en.wikipedia.org/wiki/Side-channel_attack
[stack smashing protection]: https://en.wikipedia.org/wiki/Buffer_overflow_protection#Random_canaries
[time of check to time of use]: https://en.wikipedia.org/wiki/Time_of_check_to_time_of_use
[multiple linear memory sections]: https://github.com/WebAssembly/multi-memory/blob/main/proposals/multi-memory/Overview.md
[multiple indirect tables]: https://github.com/WebAssembly/reference-types/blob/master/proposals/reference-types/Overview.md
[threads]: https://github.com/WebAssembly/threads/blob/main/proposals/threads/Overview.md
[future memory control]: https://github.com/WebAssembly/memory-control
================================================
FILE: Semantics.md
================================================
# Semantics
This file historically contained the description of WebAssembly's instruction set.
For the current description, see the [normative documentation](http://webassembly.github.io/spec/core/exec/index.html).
================================================
FILE: TextFormat.md
================================================
# Text Format
This file historically contained the description of WebAssembly's text format.
For the current description, see the [normative documentation](http://webassembly.github.io/spec/core/text/index.html).
================================================
FILE: Tooling.md
================================================
# Tooling support
Tooling for in-browser execution is often of uneven quality. WebAssembly aims at
making it possible to support truly great tooling by exposing
[low-level capabilities][] instead of prescribing which tooling should be
built. This enables:
* Porting of existing and familiar tooling to WebAssembly;
* Building new tooling that's particularly well suited to WebAssembly.
[low-level capabilities]: https://extensiblewebmanifesto.org
WebAssembly development should be self-hosting, and not just as a cute hack but
as an enjoyable platform that developers actively seek out because the tools they
want and need *just work*. Developers have high expectations, and meeting these
expectations on tooling means WebAssembly has the features required to build
rich applications for non-developers.
The tooling we expect to support includes:
* Editors:
- Editors such as vim and emacs should *just work*.
* Compilers and language virtual machines:
- Compilers for languages which can target WebAssembly (C/C++, Rust, Go, C#)
should be able to run in WebAssembly themselves and emit a WebAssembly module
that can then be executed.
- Virtual machines for languages such as bash, Python, Ruby should work.
- Virtual machines which use a just-in-time compiler (JavaScript VMs, luajit,
pypy) should be able to support a new just-in-time backend for WebAssembly.
* Debuggers:
- Basic browser integration can be done through source map support.
- Full integration for languages like C++ requires more standardization effort
on debugging information format as well as permissions for interrupting
programs, inspecting their state, and modifying their state.
- Debug information is better delivered on-demand instead of built-in to a
WebAssembly module.
* Sanitizers for [non-memory-safe](Security.md#memory-safety) languages: asan,
tsan, msan, ubsan. Efficient support of sanitizers may require improving:
- Trapping support;
- Shadow stack techniques (often implemented through `mmap`'s `MAP_FIXED`).
* Opt-in [security](Security.md) enhancements for developers' own code:
developers targeting WebAssembly may want their own code to be sandboxed
further than what WebAssembly implementations require to protect users.
* Profilers:
- Sample-based;
- Instrumentation-based.
* Process dump: local variables, call stack, heap, global variables, list of
threads.
* JavaScript+WebAssembly size optimization tool: huge mixed
WebAssembly+JavaScript applications, in which WebAssembly calls into
JavaScript libraries that communicate with the rest of the Web platform, need
tooling to perform dead code stripping and global optimization across the API
boundary.
In many cases, the tooling will be pure WebAssembly without any tool-specific
support from WebAssembly. This won't be possible for debugging, but should be
entirely possible for sanitizers.
================================================
FILE: UseCases.md
================================================
# Use Cases
WebAssembly's [high-level goals](HighLevelGoals.md) define *what* WebAssembly
aims to achieve, and in *which order*. *How* WebAssembly achieves its goals is
documented for [Web](Web.md) and [non-Web](NonWeb.md) platforms. The following
is an unordered and incomplete list of applications/domains/computations that
would benefit from WebAssembly and are being considered as use cases during the
design of WebAssembly.
## Inside the browser
* Better execution for languages and toolkits that are currently cross-compiled
to the Web (C/C++, GWT, …).
* Image / video editing.
* Games:
- Casual games that need to start quickly.
- AAA games that have heavy assets.
- Game portals (mixed-party/origin content).
* Peer-to-peer applications (games, collaborative editing, decentralized and
centralized).
* Music applications (streaming, caching).
* Image recognition.
* Live video augmentation (e.g. putting hats on people's heads).
* VR and augmented reality (very low latency).
* CAD applications.
* Scientific visualization and simulation.
* Interactive educational software, and news articles.
* Platform simulation / emulation (ARC, DOSBox, QEMU, MAME, …).
* Language interpreters and virtual machines.
* POSIX user-space environment, allowing porting of existing POSIX applications.
* Developer tooling (editors, compilers, debuggers, …).
* Remote desktop.
* VPN.
* Encryption.
* Local web server.
* Common NPAPI users, within the web's security model and APIs.
* Fat client for enterprise applications (e.g. databases).
## Outside the browser
* Game distribution service (portable and secure).
* Server-side compute of untrusted code.
* Server-side application.
* Hybrid native apps on mobile devices.
* Symmetric computations across multiple nodes.
## How WebAssembly can be used
* Entire code base in WebAssembly.
* Main frame in WebAssembly, but the UI is in JavaScript / HTML.
* Re-use existing code by targeting WebAssembly, embedded in a larger
JavaScript / HTML application. This could be anything from simple helper
libraries, to compute-oriented task offload.
================================================
FILE: Web.md
================================================
# Web Embedding
Unsurprisingly, one of WebAssembly's primary purposes is to run on the Web,
for example embedded in Web browsers (though this is
[not its only purpose](NonWeb.md)).
This means integrating with the Web ecosystem, leveraging Web APIs, supporting
the Web's security model, preserving the Web's portability, and designing in
room for evolutionary development. Many of these goals are clearly
reflected in WebAssembly's [high-level goals](HighLevelGoals.md). In
particular, WebAssembly MVP will be no looser from a security point of view
than if the module was JavaScript.
More concretely, the following is a list of points of contact between WebAssembly
and the rest of the Web platform that have been considered:
## JavaScript API
A [JavaScript API](JS.md) is provided which allows JavaScript to compile
WebAssembly modules, perform limited reflection on compiled modules, store
and retrieve compiled modules from offline storage, instantiate compiled modules
with JavaScript imports, call the exported functions of instantiated modules,
alias the exported memory of instantiated modules, etc.
The Web embedding includes additional methods useful in that context.
In non-web embeddings, these APIs may not be present.
### Additional Web Embedding API
This section historically contained the description of WebAssembly's Web API.
For the current description, see the [normative documentation](https://webassembly.github.io/spec/web-api/index.html).
## Modules
WebAssembly's [modules](Modules.md) allow for natural [integration with
the ES6 module system](https://github.com/WebAssembly/esm-integration).
## Security
WebAssembly's [security](Security.md) model depends on the
[same-origin policy], with [cross-origin resource sharing (CORS)] and
[subresource integrity] to enable distribution through content
distribution networks and to implement [dynamic linking](DynamicLinking.md).
## WebIDL
There are various proposals in flight which may support future work toward
WebIDL bindings for WebAssembly, including [JS String builtins],
[source-phase imports], and the [component model].
There are also tools to provide this functionality today by generating
JS wrapper code, for example [Emscripten's WebIDL Binder],
[the wasm-webidl-bindings Rust crate], and
[jco's experimental WebIDL Imports support].
[same-origin policy]: https://www.w3.org/Security/wiki/Same_Origin_Policy
[cross-origin resource sharing (CORS)]: https://www.w3.org/TR/cors/
[subresource integrity]: https://www.w3.org/TR/SRI/
[JS String builtins]: https://github.com/WebAssembly/js-string-builtins/
[source-phase imports]: https://github.com/tc39/proposal-source-phase-imports
[component model]: https://github.com/WebAssembly/component-model
[Emscripten's WebIDL Binder]: https://emscripten.org/docs/porting/connecting_cpp_and_javascript/WebIDL-Binder.html
[the wasm-webidl-bindings Rust crate]: https://github.com/rustwasm/wasm-webidl-bindings
[jco's experimental WebIDL Imports support]: https://github.com/bytecodealliance/jco/blob/main/docs/src/transpiling.md#experimental-webidl-imports
"chars": 11358,
"preview": " Apache License\n Version 2.0, January 2004\n "
},
{
"path": "MVP.md",
"chars": 300,
"preview": "# Minimum Viable Product\n\nWebAssembly initially launched with a \"minimum viable product\" version (the \"MVP\"),\nwhich was "
},
{
"path": "Modules.md",
"chars": 221,
"preview": "# Modules\n\nThis file historically contained the description of WebAssembly's module structure.\n\nFor the current descript"
},
{
"path": "NonWeb.md",
"chars": 2698,
"preview": "# Non-Web Embeddings\n\nWhile WebAssembly is designed to run [on the Web](Web.md), it is\nalso desirable for it to be able "
},
{
"path": "Nondeterminism.md",
"chars": 3140,
"preview": "# Nondeterminism in WebAssembly\n\nWebAssembly is a [portable](Portability.md) [sandboxed](Security.md) platform\nwith limi"
},
{
"path": "Portability.md",
"chars": 3280,
"preview": "# Portability\n\nWebAssembly's [binary format](BinaryEncoding.md) is designed to be executable\nefficiently on a variety of"
},
{
"path": "README.md",
"chars": 1488,
"preview": "# WebAssembly Design\n\nThis repository is part of the [WebAssembly Community Group] and hosts the\n[issue tracker] for [ph"
},
{
"path": "Rationale.md",
"chars": 34570,
"preview": "# Design Rationale\n\nThis document describes rationales for WebAssembly's design decisions, acting as\nfootnotes to the ma"
},
{
"path": "Security.md",
"chars": 11187,
"preview": "# Security\n\nThe security model of WebAssembly has two important goals: (1) protect *users*\nfrom buggy or malicious modul"
},
{
"path": "Semantics.md",
"chars": 217,
"preview": "# Semantics\n\nThis file historically contained the description of WebAssembly's instruction set.\n\nFor the current descrip"
},
{
"path": "TextFormat.md",
"chars": 215,
"preview": "# Text Format\n\nThis file historically contained the description of WebAssembly's text format.\n\nFor the current descripti"
},
{
"path": "Tooling.md",
"chars": 2891,
"preview": "# Tooling support\n\nTooling for in-browser execution is often of uneven quality. WebAssembly aims at\nmaking it possible t"
},
{
"path": "UseCases.md",
"chars": 2100,
"preview": "# Use Cases\n\nWebAssembly's [high-level goals](HighLevelGoals.md) define *what* WebAssembly\naims to achieve, and in *whic"
},
{
"path": "Web.md",
"chars": 3094,
"preview": "# Web Embedding\n\nUnsurprisingly, one of WebAssembly's primary purposes is to run on the Web,\nfor example embedded in Web"
}
]
About this extraction
This page contains the full source code of the WebAssembly/design GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 27 files (123.4 KB), approximately 27.9k tokens. Use this with OpenClaw, Claude, ChatGPT, Cursor, Windsurf, or any other AI tool that accepts text input. You can copy the full output to your clipboard or download it as a .txt file.
Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.