Internal change

PiperOrigin-RevId: 266889781
Change-Id: Ibea87a7bb5fafb50ae3d09f7b0df876beecaf087
Wiktor Garbacz 2019-09-03 03:01:04 -07:00 committed by Copybara-Service
parent da3c6c138e
commit daa3defac0
13 changed files with 0 additions and 1174 deletions

@ -1,37 +0,0 @@
# Examples
We have prepared some examples, which might help you to implement your first
Sandboxed API library.
## Sum
A demo library implementing a few [C functions](../examples/sum/lib/sum.c) and a
single [C++ function](../examples/sum/lib/sum_cpp.cc).
It uses Protocol Buffers to exchange data between host code and the SAPI Library.
* The sandbox definition can be found in the
[sandbox.h](../examples/sum/lib/sandbox.h) file.
* The (automatically generated) function annotation file (a file providing
prototypes of sandboxed functions) can be found in
`bazel-out/genfiles/sandboxed_api/examples/sum/lib/sum-sapi.sapi.h`
after a Bazel build.
* The actual execution logic (a.k.a. host code) making use of the exported
sandboxed procedures can be found in [main_sum.cc](../examples/sum/main_sum.cc).
## zlib
This is a demo implementation (functional, but currently not used in production)
for the zlib library exporting some of its functions, and making them available
to the [host code](../examples/zlib/main_zlib.cc).
The demonstrated functionality of the host code is decoding of zlib streams
from stdin to stdout.
This SAPI library doesn't use a `sandbox.h` file: it relies on the default
Sandbox2 policy and an embedded SAPI library, so there is no need to provide the
`sapi::Sandbox::GetLibPath()` or `sapi::Sandbox::GetPolicy()` methods.
The zlib SAPI can be found in [//sapi_sandbox/examples/zlib](../examples/zlib),
along with its [host code](../examples/zlib/main_zlib.cc).

@ -1,76 +0,0 @@
# Host Code
## Description
The *host code* is the actual code making use of the functionality offered by
its contained/isolated/sandboxed counterpart, i.e. a [SAPI Library](library.md).
Such code implements the logic that any program making use of a typical library
would: it calls functions exported by the library, passing data to it and
receiving data from it.
Given that the SAPI Library lives in a separate and contained/sandboxed process,
calling such functions directly is not possible. Therefore the SAPI project
provides tools which create an API object that proxies accesses to sandboxed
libraries.
More on that can be found under [library](library.md).
## Variables
In order to make sure that host code can access variables and memory blocks in
a remote process, SAPI provides a comprehensive set of C++ classes. These try to
make the implementation of the main logic code simpler. To do this you will
sometimes have to use those objects instead of typical data types known from C.
For example, instead of passing a plain array of three `int`s to a sandboxed
function, you wrap it in the following object and pass that:
```cpp
int arr[3] = {1, 2, 3};
sapi::v::Array<int> sarr(arr, ABSL_ARRAYSIZE(arr));
```
[Read more](variables.md) on the internal data representation used in host
code.
## Transactions
When you use a typical library of functions, you do not have to worry about the
fact that a call to a library might fail at runtime, as the linker ensures all
necessary functions are available after compilation.
Unfortunately with the SAPI, the sandboxed library lives in a separate process,
therefore we need to check for all kinds of problems related to passing such
calls via our RPC layer.
Users of SAPI need to check - in addition to regular errors returned by the
native API of a library - for errors returned by the RPC layer. Sometimes these
errors might not be interesting, for example when doing bulk processing and you
would just restart the sandbox.
Handling these errors would mean following each call to a SAPI library with
an additional check of the SAPI RPC layer. To make handling of such
cases easier we have implemented the `::sapi::Transaction` class.
This module makes sure that all function calls to the sandboxed library
complete without any RPC-level problems; otherwise it returns a relevant error.
Read more about this module under [Transactions](transactions.md).
## Sandbox restarts
Many sandboxees handle sensitive user input. This data might be at risk if the
sandboxee is corrupted at some point and stores data between runs: imagine
an ImageMagick sandbox that starts sending out pictures from the previous run. To
avoid this, stop reusing sandboxes by restarting
the sandboxee with `::sapi::Sandbox::Restart()`, or with
`::sapi::Transaction::Restart()` when using transactions.
**Restarting the sandboxee will invalidate any references to the sandboxee!**
This means passed file descriptors/allocated memory will not exist anymore.
Note: Restarting the sandboxee takes some time, about *75-80 ms* on modern
machines (more if network namespaces are used).

@ -1,29 +0,0 @@
# How it works
## Overview
The Sandboxed API project lets you run library code in a sandboxed
environment, isolated with the help of [Sandbox2](../sandbox2/README.md).
Our goal is to provide developers with tools to prepare such libraries for the
sandboxing process, as well as the APIs needed to communicate (i.e. make function
calls and receive results) with such a library.
All calls to the sandboxed library are passed over our custom RPC implementation
to a sandboxed process, and the results are passed back to the caller.
![SAPI Diagram](images/sapi-overview.png)
The project also provides [primitives](variables.md) for manual and
automatic (based on custom pointer attributes) memory synchronization (arrays,
structures) between the SAPI Libraries and the host code.
A [high-level Transactions API](transactions.md) provides monitoring of SAPI
Libraries, and restarts them if they fail (e.g. due to security violations,
crashes or resource exhaustion).
## Getting started
Read our [Get Started](getting-started.md) page to set up your first Sandboxed
API project.

@ -1,123 +0,0 @@
# Library
## BUILD.bazel
Here, you'll prepare a build target that your [host code](host-code.md)
will make use of.
Start by preparing a `sapi_library()` target in your `BUILD.bazel`
file.
For reference, you can take a peek at a working example from the
[zlib example](../examples/zlib/BUILD.bazel).
```python
load(
    "//sandboxed_api/tools/generator:sapi_generator.bzl",
    "sapi_library",
)

sapi_library(
    name = "zlib-sapi",
    srcs = [],  # Extra code compiled with the SAPI library
    hdrs = [],  # Leave empty if embedded SAPI libraries are used, and the
                # default sandbox policy is sufficient.
    embed = True,  # This is the default
    functions = [
        "deflateInit_",
        "deflate",
        "deflateEnd",
    ],
    lib = "@zlib//:zlibonly",
    lib_name = "Zlib",
    namespace = "sapi::zlib",
)
```
* **`name`** - name for your SAPI target
* **`srcs`** - any additional sources that you'd like to include with your
Sandboxed API library - typically, it's not necessary, unless you want to
provide your SAPI Library sandbox definition in a .cc file, and not in the
`sandbox.h` file.
* **`hdrs`** - as with **`srcs`**. Typically your sandbox definition (`sandbox.h`)
should go here. Leave it empty if an embedded SAPI library is used and the default
sandbox policy is sufficient.
* **`functions`** - a list of functions that you'd like to use in your host
code. Leaving this list empty will try to export and wrap all functions found
in the library.
* **`embed`** - whether the SAPI library should be embedded inside host code,
so the SAPI Sandbox can be initialized with the
`::sapi::Sandbox::Sandbox(FileToc*)` constructor. This is the default.
* **`lib`** - (mandatory) the library target you want to sandbox and expose to
the host code.
* **`lib_name`** - (mandatory) name of the object which is proxying your library
functions from the `functions` list. You will call functions from the
sandboxed library via this object.
* **`input_files`** - list of source files that the SAPI interface generator
should scan for the library's function declarations. The library's exported headers
are always scanned, so `input_files` can usually be left empty.
* **`namespace`** - a C++ namespace identifier to place API object defined in
`lib_name` into. Defaults to `sapigen`.
* **`deps`** - a list of any additional dependency targets to add. Typically
not necessary.
* **`header`** - name of the header file to use instead of the generated one.
Do not use if you want to auto-generate the code.
## `sapi_library()` Rule Targets
For the above definition, the `sapi_library()` build rule provides the following
targets:
* **`zlib-sapi`** - sandboxed library, a substitution for a normal `cc_library`;
consists of **`zlib-sapi.bin`** and sandbox dependencies
* **`zlib-sapi.interface`** - generated library interface
* **`zlib-sapi.embed`** - `cc_embed_data()` target used to embed the sandboxee in
the binary. See `bazel/embed_data.bzl`.
* **`zlib-sapi.bin`** - sandboxee binary, consists of a small communication stub
and the library that is being sandboxed.
## Interface Generation
The __`zlib-sapi`__ target creates the target library with a small communication stub
wrapped in [Sandbox2](../sandbox2/README.md). To be able to use the stub and
code within the sandbox, you should generate the interface file.
There are two options:
1. Add a dependency on __`zlib-sapi.interface`__. This will auto-generate a
header that you can include in your code - the header name is of the form:
__`TARGET_NAME`__`.sapi.h`.
2. Run `bazel build TARGET_NAME.interface`, save the generated header in your
project and include it in the code. You will also need to add the `header`
argument to the `sapi_library()` rule to indicate that you will skip code
generation.
## Sandbox Description (`sandbox.h`)
**Note**: If the default SAPI Sandbox policy is sufficient, and the constructor
used is **`::sapi::Sandbox::Sandbox(FileToc*)`**, then this file might not be
necessary.
In this step you will prepare the sandbox definition file (typically named
`sandbox.h`) for your library.
The goal of this is to tell the SAPI code where the sandboxed library can be
found, and how it should be contained.
First, tell the SAPI code what your sandboxed library should be
allowed to do in terms of security policies and other process constraints. To
do that, implement and instantiate an object based on
the [::sapi::Sandbox](../sandbox.h) class.
This object will also specify where your SAPI Library can be found
and how it should be executed (though you can depend on default settings).
A working example of such a SAPI object definition file can be found
[here](../examples/sum/lib/sandbox.h).
To familiarize yourself with the Sandbox2 policies, you might want to
take a look at the [Sandbox2 documentation](../sandbox2/README.md).

@ -1,52 +0,0 @@
# Sandboxing Code
Sometimes, a piece of code carries a lot of security risk. Examples include:
* Commercial binary-only code to do document parsing. Document parsing often
goes wrong, and binary-only means no opportunity to fix it up.
* A web browser's core HTML parsing and rendering. This is such a large amount
of code that there will be security bugs.
* A JavaScript engine in Java. Accidents here would permit arbitrary calls to
Java methods.
Where a piece of code is very risky, and directly exposed to untrusted users
and untrusted input, it is sometimes desirable to sandbox this code. The hardest
thing about sandboxing is making the call whether the risk warrants the effort
to sandbox.
There are many approaches to sandboxing, including virtualization, jail
environments, network segregation and restricting the permissions code runs
with. This page covers technologies available to do the latter: restricting the
permissions code runs with. See the following tables, depending on which
technology you are using:
## General Sandboxing
Project/technology | Description
-------------------------------------------|------------
[Sandbox2](../sandbox2/README.md) | Linux sandboxing using namespaces, resource limits and seccomp-bpf syscall filters. Provides the underlying sandboxing technology for Sandboxed API.
[gVisor](https://github.com/google/gvisor) | Uses hardware virtualization and a small syscall emulation layer implemented in Go.
## Sandbox command-line tools
Project/technology | Description
---------------------|------------
[Firejail](https://github.com/netblue30/firejail) | Lightweight sandboxing tool implemented as a SUID program with minimal dependencies.
[Minijail](https://android.googlesource.com/platform/external/minijail/) | The sandboxing and containment tool used in Chrome OS and Android. Provides an executable and a library that can be used to launch and sandbox other programs and code.
[NSJail](nsjail.com) | Process isolation for Linux using namespaces, resource limits and seccomp-bpf syscall filters. Can optionally make use of [Kafel](https://github.com/google/kafel/), a custom domain specific language, for specifying syscall policies.
## C/C++
Project/technology | Description
-------------------------|------------
[Sandboxed API](..) | Reusable sandboxes for C/C++ libraries using Sandbox2.
(Portable) Native Client | **(Deprecated)** Powerful technique to sandbox C/C++ binaries by compiling to a restricted subset of x86 (NaCl)/LLVM bytecode (PNaCl).
## Graphical/Desktop Applications
Project/technology | Description
----------------------------------------------|------------
[Flatpak](https://github.com/flatpak/flatpak) | Built on top of [Bubblewrap](https://github.com/projectatomic/bubblewrap), provides sandboxing for Linux desktop applications. Puts an emphasis on packaging and distribution of native apps.

@ -1,133 +0,0 @@
# Transactions
## Introduction
When using SAPI, there is another layer around library calls that might
fail, which is why all library function prototypes return `::sapi::StatusOr<T>`
instead of `T`. In the event that the library function invocation fails (e.g.
because of a sandbox violation), the return value will contain details about
the error that occurred.
In order to deal with those exceptional situations, the high-level
`::sapi::Transaction` module can be used.
### `::sapi::Transaction`
With SAPI we are trying to isolate the [host code](host-code.md) from such
problems in the sandboxed library, giving the caller the ability to restart or
abort the problematic data processing request.
The transaction class goes one step further and automatically repeats processes
that have failed.
The usual pattern when dealing with libraries looks like this:
```cpp
LibInit();
while (data = NextDataToProcess()) {
result += LibProcessData(data);
}
LibClose();
```
This translates to this code when using SAPI:
```cpp
::sapi::Status Init(::sapi::Sandbox* sandbox) {
  LibraryAPI lib(sandbox);
  SAPI_RETURN_IF_ERROR(lib.LibInit());
  return ::sapi::OkStatus();
}

::sapi::Status Finish(::sapi::Sandbox* sandbox) {
  // ...
}

::sapi::Status handle_data(::sapi::Sandbox* sandbox, Data data_to_process,
                           Result* out) {
  LibraryAPI lib(sandbox);
  SAPI_ASSIGN_OR_RETURN(*out, lib.LibProcessData(data_to_process));
  return ::sapi::OkStatus();
}

void handle() {
  // ...
  ::sapi::BasicTransaction transaction(Init, Finish);
  while (data = NextDataToProcess()) {
    Result result;  // Same type as the `out` parameter of handle_data().
    transaction.Run(handle_data, data, &result);
    // ...
  }
  // ...
}
```
The transaction class makes sure to reinitialize the library if an
error occurs during the `handle_data` invocation - more on this later.
SAPI transactions can be used in two different ways, depending on your
requirements:
* Implementing a transaction class inheriting from `::sapi::Transaction`,
* Using function pointers passed to `::sapi::BasicTransaction`, see above.
Both methods allow you to specify the following three functions:
* `::sapi::Transaction::Init()`, which will be called **only once** during each
transaction to the sandboxed library (and, also, during each restart of the
transaction). It's similar to calling a `LibInit()` function from a typical
C/C++ library.
* `::sapi::Transaction::Main()`, which will be called for each call to
`::sapi::Transaction::Run()`.
* `::sapi::Transaction::Finish()`, which will be called during the
`::sapi::Transaction` object destruction, resembling the call to a typical
`LibClose()` function call.
### Transaction Restarts
If any kind of problem arises during execution of the
`Init()`/`Main()`/`Finish()` methods, e.g. they return a failure code due
to a library error, a sandboxed process crash, or a security sandbox violation,
the transaction will be restarted (by default, `kDefaultRetryCnt` times, see
[transaction.h](../transaction.h)).
During such restarts the `Init()`/`Main()` flow is observed (i.e. the `Init()`
function is called again), and if repeated calls to the
`::sapi::Transaction::Run()` method return errors, then the whole method
returns an error to its caller.
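The restart flow can be pictured with a small, self-contained sketch. It does not use the SAPI classes: `Status`, `RunWithRetries` and the `retry_count` parameter are hypothetical stand-ins chosen for illustration, and the actual logic lives in `transaction.h`.

```cpp
#include <functional>
#include <string>

// Hypothetical stand-in for ::sapi::Status: an empty message means success.
struct Status {
  std::string error;
  bool ok() const { return error.empty(); }
};

// Runs init() followed by main_fn(). If either step fails, the whole pair is
// retried starting from init(), up to `retry_count` additional times. The
// error of the last attempt is returned if every attempt fails.
Status RunWithRetries(const std::function<Status()>& init,
                      const std::function<Status()>& main_fn,
                      int retry_count) {
  Status last;
  for (int attempt = 0; attempt <= retry_count; ++attempt) {
    last = init();
    if (last.ok()) {
      last = main_fn();
    }
    if (last.ok()) {
      return last;
    }
    // A real transaction would restart the sandboxee at this point, which is
    // why init() has to run again on the next attempt.
  }
  return last;
}
```

In this sketch a failing `main_fn` always triggers a fresh `init()`, mirroring the `Init()`/`Main()` flow described above; once the retries are exhausted, the caller sees the last error.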
### Sandbox/RPC Error handling
Although the automatically generated [SAPI library
interface](library.md#Interface-Generation) tries to be as similar as possible to the
original library function prototypes, we need a way to signal Sandbox/RPC
errors. Instead of providing the return value directly, SAPI makes use of
`::sapi::StatusOr<T>` for return types `T` != `void`, or `::sapi::Status` for
functions returning `void`.
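To illustrate the idea independently of SAPI, here is a minimal sketch of a "value or error" return type. `ResultOr`, `SandboxedSum` and the `rpc_healthy` flag are hypothetical names invented for this example; the real `::sapi::StatusOr<T>` has a richer interface.

```cpp
#include <string>
#include <utility>

// Hypothetical, simplified stand-in for ::sapi::StatusOr<T>: it holds either
// a value of type T or an error message.
template <typename T>
class ResultOr {
 public:
  static ResultOr Ok(T value) {
    ResultOr r;
    r.ok_ = true;
    r.value_ = std::move(value);
    return r;
  }
  static ResultOr Error(std::string message) {
    ResultOr r;
    r.error_ = std::move(message);
    return r;
  }
  bool ok() const { return ok_; }
  const T& value() const { return value_; }
  const std::string& error() const { return error_; }

 private:
  ResultOr() = default;
  bool ok_ = false;
  T value_{};
  std::string error_;
};

// Wraps a plain `int sum(int, int)` so that a transport failure is reported
// separately from the function's own result. `rpc_healthy` simulates the
// state of the RPC channel to the sandboxee.
ResultOr<int> SandboxedSum(int a, int b, bool rpc_healthy) {
  if (!rpc_healthy) {
    return ResultOr<int>::Error("RPC channel closed");
  }
  return ResultOr<int>::Ok(a + b);
}
```

The caller checks `ok()` before touching `value()`, which is the step that macros such as `SAPI_ASSIGN_OR_RETURN` perform for you.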
Example of how to use the API (from the sum example):
```cpp
::sapi::Status SumTransaction::Main() {
  SumApi f(GetSandbox());

  // ::sapi::StatusOr<int> sum(int a, int b)
  SAPI_ASSIGN_OR_RETURN(int v, f.sum(1000, 337));
  // ...

  // ::sapi::Status sums(sapi::v::Ptr* params)
  SumParams params;
  params.mutable_data()->a = 1111;
  params.mutable_data()->b = 222;
  params.mutable_data()->ret = 0;
  SAPI_RETURN_IF_ERROR(f.sums(params.PtrBoth()));
  // ...

  // Look up the remote symbol `sumsymbol` and copy its value to the host.
  int* ssaddr;
  SAPI_RETURN_IF_ERROR(GetSandbox()->Symbol(
      "sumsymbol", reinterpret_cast<void**>(&ssaddr)));
  ::sapi::v::Int sumsymbol;
  sumsymbol.SetRemote(ssaddr);
  SAPI_RETURN_IF_ERROR(GetSandbox()->TransferFromSandboxee(&sumsymbol));
  // ...
  return ::sapi::OkStatus();
}
```

@ -1,69 +0,0 @@
# Variables
Typically, you'll be able to use native C-types to deal with the SAPI Library,
but sometimes some special types will be required. This mainly happens when
passing pointers to simple types, and pointers to memory blocks (structures,
arrays). Because you operate on local process memory (of the host code), when
calling a function taking a pointer, it must be converted into a corresponding
pointer inside the sandboxed process (SAPI Library) memory.
Take a look at the [SAPI directory](..). The `var_*.h` files provide classes
and templates representing various types of data, e.g. `::sapi::v::UChar`
represents well-known `unsigned char` while `::sapi::v::Array<int>` represents
an array of integers (`int[]`).
## Pointers
When creating your host code, you'll be generally using functions exported by
an auto-generated SAPI interface header file from your SAPI Library. Most of
them will take simple types (or typedef'd types), but when a pointer is needed,
you need to wrap it with the `::sapi::v::Ptr` template class.
Most types that you will use provide the following methods:
* `::PtrNone()`: this pointer, when passed to the SAPI Library function,
doesn't synchronize the underlying memory between the host code process and
the SAPI Library process.
* `::PtrBefore()`: when passed to the SAPI Library function, will synchronize
memory of the object it points to, before the call takes place. This means,
that the local memory of the pointed variable will be transferred to the
SAPI Library process before the call is initiated.
* `::PtrAfter()`: this pointer will synchronize memory of the object it points
to, after the call has taken place. This means, that the remote memory of a
pointed variable will be transferred to the host code process' memory, after
the call has been completed.
* `::PtrBoth()`: combines the functionality of both `::PtrBefore()` and
`::PtrAfter()`
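The four synchronization modes can be modelled without SAPI. In the sketch below, `Sync`, `CallWithSync` and the two vectors are hypothetical: `local` stands for host memory, `remote` for the copy living in the sandboxed process, and the copies stand in for the transfers that SAPI performs over its RPC layer.

```cpp
#include <functional>
#include <vector>

// Hypothetical modes mirroring PtrNone/PtrBefore/PtrAfter/PtrBoth.
enum class Sync { kNone, kBefore, kAfter, kBoth };

// Simulates a call into the sandboxee. The sandboxed function only ever sees
// `remote`; `mode` decides when `local` and `remote` are synchronized.
void CallWithSync(std::vector<int>& local, std::vector<int>& remote, Sync mode,
                  const std::function<void(std::vector<int>&)>& sandboxed_fn) {
  if (mode == Sync::kBefore || mode == Sync::kBoth) {
    remote = local;  // Host -> sandboxee, before the call.
  }
  sandboxed_fn(remote);
  if (mode == Sync::kAfter || mode == Sync::kBoth) {
    local = remote;  // Sandboxee -> host, after the call.
  }
}
```

Choosing the narrowest mode that still works avoids needless copies: an input-only buffer needs only the "before" transfer, an output-only buffer only the "after" transfer.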
## Structures
When a pointer to a structure is used inside a call to a SAPI Library, that
structure needs to be created with the `::sapi::v::Struct` template. You can use
the `PtrNone()`/`PtrBefore()`/`PtrAfter()`/`PtrBoth()` methods of this template to obtain
a relevant `::sapi::v::Ptr` object that can be used in SAPI Library function
calls.
## Arrays
The `::sapi::v::Array` template lets you wrap an existing array of elements
or have one created dynamically for you (please take a look at its
constructors to decide which one you would like to use).
The use of pointers is analogous to [Structures](#structures).
## Examples
Our canonical [sum library](../examples/sum/main_sum.cc) demonstrates use of
pointers to call sandboxed functions in its corresponding SAPI Library.
You might also want to take a look at the [Examples](examples.md) page to
familiarize yourself with other working examples of libraries sandboxed
with SAPI.
* [sum library](../examples/sum/main_sum.cc)
* [stringop](../examples/stringop/main_stringop.cc)
* [zlib](../examples/zlib/main_zlib.cc)

@ -1,119 +0,0 @@
# Examples
## Overview
We have prepared a few examples to demonstrate how to use sandbox2 depending on
your situation and how to write policies.
You can find them in [//sandboxed_api/sandbox2/examples](../examples). Read on
for detailed explanations.
## CRC4
The CRC4 example is an intentionally buggy calculation of a CRC4 checksum. It
demonstrates how to sandbox another program and how to communicate with it.
* [crc4bin.cc](../examples/crc4/crc4bin.cc): is the program we want to sandbox
(the *sandboxee*)
* [crc4sandbox.cc](../examples/crc4/crc4sandbox.cc): is the sandbox program that
will run it (the *executor*).
How it works:
1. The *executor* starts the *sandboxee* from its file path using
`::sandbox2::GetDataDependencyFilePath()`.
2. The *executor* sends input to the *sandboxee* over the communication channel
`Comms` using `SendBytes()`.
3. The *sandboxee* calculates the CRC4 and sends its reply back to the
*executor* over the communication channel `Comms`; the *executor* receives it
with `RecvUint32()`.
If the program makes any syscall other than those needed for communicating
(`read()` and `write()`), it is killed for policy violation.
## static
The static example demonstrates how to sandbox a statically linked binary, such
as a third-party binary for which you do not have the source and which is
therefore not aware that it will be sandboxed.
* [static_bin.cc](../examples/static/static_bin.cc): the *sandboxee* is a
static C binary that converts ASCII text from standard input to uppercase.
* [static_sandbox.cc](../examples/static/static_sandbox.cc): the *executor*
with its policy, limits and using a file descriptor for *sandboxee* input.
How it works:
1. The *executor* starts the *sandboxee* from its file path using
`::sandbox2::GetDataDependencyFilePath()`, just like for **CRC4**.
2. It sets up limits, opens a file descriptor on `/proc/version` and marks it
to be mapped in the *sandboxee* with `MapFd`.
3. The policy allows some syscalls (`open`) to return an error (`ENOENT`),
rather than being killed for policy violation. This can be useful when
sandboxing a third party program where we cannot modify which syscalls are
made, but we can make them fail gracefully.
## tool
The tool example is both a tool to develop your own policies and experiment with
**sandbox2** APIs as well as a demonstration of its features.
* [sandbox2tool.cc](../examples/tool/sandbox2tool.cc): the *executor*
demonstrating
* how to run another binary sandboxed,
* how to set up filesystem checks, and
* how the *executor* can run the *sandboxee* asynchronously to read its
output progressively
Try it yourself:
```bash
bazel run //sandboxed_api/sandbox2/examples/tool:sandbox2tool -- \
/bin/cat /etc/hostname
```
Flags:
* `--sandbox2tool_keep_env` to keep current environment variables
* `--sandbox2tool_redirect_fd1` to receive the *sandboxee* STDOUT_FILENO (1)
and output it locally
* `--sandbox2tool_cpu_timeout` to set CPU timeout in seconds
* `--sandbox2tool_walltime_timeout` to set wall-time timeout in seconds
* `--sandbox2tool_file_size_creation_limit` to set the maximum size of created
files
* `--sandbox2tool_cwd` to set sandbox current working directory
## custom_fork
The custom_fork example demonstrates how to create a sandbox, which will
initialize the binary, and then wait for `fork()` requests coming from the
parent executor.
This mode offers potentially increased performance compared with other types of
sandboxing, as here, creating new instances of sandboxees doesn't require
executing new binaries, just `fork()`-ing the existing ones.
* [custom_fork_bin.cc](../examples/custom_fork): is the custom fork-server,
receiving requests to `fork()` (via `ForkingClient::WaitAndFork()`) in order to
spawn new sandboxees
* [custom_fork_sandbox.cc](../examples/custom_fork/custom_fork_sandbox.cc): is
the executor, which starts a custom fork server. Then it sends requests to it
(via new executors) to spawn (via `fork()`) new sandboxees.
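The performance argument rests on a property of `fork()`: the child starts with a copy of the parent's already initialized state instead of initializing from scratch. The following sketch shows only that property; `RunInForkedChild` is a hypothetical helper and is not how the Sandbox2 fork-server is implemented.

```cpp
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#include <functional>

// Forks the current (already initialized) process, runs `work` in the child
// and returns the child's exit code, or -1 on failure.
int RunInForkedChild(const std::function<int()>& work) {
  pid_t pid = fork();
  if (pid < 0) {
    return -1;
  }
  if (pid == 0) {
    // Child: sees a copy of everything the parent had set up before fork().
    _exit(work());
  }
  int status = 0;
  if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) {
    return -1;
  }
  return WEXITSTATUS(status);
}
```

Any expensive setup done once in the parent (loading tables, parsing configuration) is available to every child without being repeated, which is what makes the fork-server approach attractive when startup cost dominates.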
## network
Enabling the network namespace prevents the sandboxed process from connecting to
the outside world. This example demonstrates how to deal with this problem.
Namespaces are enabled when either
`::sandbox2::PolicyBuilder::EnableNamespaces()` or some other
function that enables namespaces, such as `AddFile()`, is called. To work around
the restriction, we can initialize a connection inside the executor and pass the
socket file descriptor via `::sandbox2::Comms::SendFD()`. The sandboxee receives
the socket by using `::sandbox2::Comms::RecvFD()` and can then use this socket
to exchange data as usual.
* [network_bin.cc](../examples/network/network_bin.cc): is the program we want to
sandbox (the sandboxee).
* [network_sandbox.cc](../examples/network/network_sandbox.cc): is the sandbox
program that will run it (the executor).
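On Linux, the usual mechanism for handing an open descriptor to another process is `SCM_RIGHTS` ancillary data on a Unix-domain socket. The sketch below shows that mechanism with plain POSIX calls; `SendFd` and `RecvFd` are hypothetical helpers, and treating them as equivalent to what `Comms::SendFD()`/`Comms::RecvFD()` do internally is an assumption.

```cpp
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>

#include <cstring>

// Sends `fd` over the connected Unix-domain socket `sock` as SCM_RIGHTS
// ancillary data. Returns true on success.
bool SendFd(int sock, int fd) {
  char dummy = 'F';  // One byte of regular data accompanies the descriptor.
  struct iovec iov;
  iov.iov_base = &dummy;
  iov.iov_len = sizeof(dummy);

  alignas(struct cmsghdr) char control[CMSG_SPACE(sizeof(int))];
  std::memset(control, 0, sizeof(control));

  struct msghdr msg;
  std::memset(&msg, 0, sizeof(msg));
  msg.msg_iov = &iov;
  msg.msg_iovlen = 1;
  msg.msg_control = control;
  msg.msg_controllen = sizeof(control);

  struct cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
  cmsg->cmsg_level = SOL_SOCKET;
  cmsg->cmsg_type = SCM_RIGHTS;
  cmsg->cmsg_len = CMSG_LEN(sizeof(int));
  std::memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

  return sendmsg(sock, &msg, 0) == static_cast<ssize_t>(sizeof(dummy));
}

// Receives a descriptor sent with SendFd(). Returns -1 on failure.
int RecvFd(int sock) {
  char dummy = 0;
  struct iovec iov;
  iov.iov_base = &dummy;
  iov.iov_len = sizeof(dummy);

  alignas(struct cmsghdr) char control[CMSG_SPACE(sizeof(int))];
  std::memset(control, 0, sizeof(control));

  struct msghdr msg;
  std::memset(&msg, 0, sizeof(msg));
  msg.msg_iov = &iov;
  msg.msg_iovlen = 1;
  msg.msg_control = control;
  msg.msg_controllen = sizeof(control);

  if (recvmsg(sock, &msg, 0) <= 0) {
    return -1;
  }
  struct cmsghdr* cmsg = CMSG_FIRSTHDR(&msg);
  if (cmsg == nullptr || cmsg->cmsg_level != SOL_SOCKET ||
      cmsg->cmsg_type != SCM_RIGHTS ||
      cmsg->cmsg_len != CMSG_LEN(sizeof(int))) {
    return -1;
  }
  int fd = -1;
  std::memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
  return fd;
}
```

The receiver gets a new descriptor number that refers to the same open socket, so a connection established by the executor keeps working inside a sandboxee that has no network access of its own.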

@ -1,123 +0,0 @@
# FAQ
## Can I use threads?
Yes, threads are supported in sandbox2.
### All threads must be sandboxed
Because of the way Linux works, the seccomp-bpf policy is applied to the current
thread only: this means other existing threads do not get the policy, but future
threads will inherit the policy.
If you are using sandbox2 in the
[default mode](getstarted.md#a-Execute-a-binary-with-sandboxing-already-enabled)
where sandboxing is enabled before `execve()`, all threads will inherit the
policy, and there is no problem. This is the preferred mode of sandboxing.
If you want to use the
[second mode](getstarted.md#b-Tell-the-executor-when-to-be-sandboxed) where the
executor has
`set_enable_sandbox_before_exec(false)` and the sandboxee tells the executor
when it wants to be sandboxed with `SandboxMeHere()`, then the filter still
needs to be applied to all threads. Otherwise, there is a risk of a sandbox
escape: malicious code could migrate from a sandboxed thread to an unsandboxed
thread.
The Linux kernel introduced the TSYNC flag in version 3.17, which allows
applying a policy to all threads. Before this flag, it was only possible to
apply the policy on a thread-by-thread basis.
If sandbox2 detects that it is running on a kernel without TSYNC support and you
call `SandboxMeHere()` from a multi-threaded program, sandbox2 will abort, since
this would compromise the safety of the sandbox.
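For diagnostics, you can compare the running kernel's release string against the 3.17 threshold mentioned above. This is a heuristic sketch only: `KernelAtLeast` and `RunningKernelMayHaveTsync` are hypothetical helpers, not part of sandbox2, and distribution kernels may backport features, so the version number is a hint rather than proof.

```cpp
#include <sys/utsname.h>

#include <cstdio>
#include <string>

// Returns true if a release string such as "4.19.0-6-amd64" denotes a kernel
// version of at least major.minor. Unparseable strings yield false.
bool KernelAtLeast(const std::string& release, int major, int minor) {
  int got_major = 0;
  int got_minor = 0;
  if (std::sscanf(release.c_str(), "%d.%d", &got_major, &got_minor) != 2) {
    return false;
  }
  return got_major > major || (got_major == major && got_minor >= minor);
}

// Applies the check to the running kernel, using the release reported by
// uname(2).
bool RunningKernelMayHaveTsync() {
  struct utsname info;
  if (uname(&info) != 0) {
    return false;
  }
  return KernelAtLeast(info.release, 3, 17);
}
```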
## How should I compile my sandboxee?
If you are not careful, it is easy to inherit a lot of dependencies and side effects
(extra syscalls, file accesses or even network connections) which make
sandboxing harder (tracking down all side effects) and less safe (because the
syscall and file policies are wider). Some compile options can help reduce this:
* Statically compile the sandboxee binary to avoid dynamic linking, which uses a
lot of syscalls (`open()`/`openat()`, `mmap()`, etc.). Also, since Bazel adds
`pie` by default but static linking is incompatible with it, use the features
flag to force it off.
That is, use the following options in
[cc_binary](https://docs.bazel.build/versions/master/be/c-cpp.html#cc_binary)
rules:
```python
linkstatic = 1,
features = [
"fully_static_link", # link libc statically
"-pie",
],
```
*However:* this has the downside of reducing ASLR heap entropy (from 30 bits
to 8 bits), making exploits easier. Decide carefully what is preferable
depending on your sandbox implementation and policy:
* **not static**: good heap ASLR, potentially harder to get initial code
execution but at the cost of a less effective sandbox policy, potentially
easier to break out of.
* **static**: bad heap ASLR, potentially easier to get initial code execution
but a more effective sandbox policy, potentially harder to break out of.
It is an unfortunate choice to make because the compiler does not support
static PIE (Position Independent Executables). PIE is implemented by having
the binary be a dynamic object, and the dynamic loader maps it at a random
location before executing it. Because the heap is traditionally placed
at a random offset after the base address of the binary (and expanded with
the `brk` syscall), for static binaries the heap ASLR entropy is only
this offset, because there is no PIE.
For examples of these compiling options, look at the
[static](examples.md#static) example
[BUILD.bazel](../examples/static/BUILD.bazel): `static_bin.cc` is compiled
statically, which allows us to have a very tight syscall policy. This works
nicely for sandboxing third party binaries too.
## Can I sandbox 32-bit x86 binaries?
Sandbox2 can only sandbox the same arch as it was compiled with.
In addition, support for 32-bit x86 has been removed from Sandbox2. If you try
to use a 64-bit x86 executor to sandbox a 32-bit x86 binary, or a 64-bit x86
binary making 32-bit syscalls (via `int 0x80`), both will generate a sandbox
violation that can be identified with the architecture label *[X86-32]*.
The reason behind this behavior is that syscall numbers are different between
architectures and since the syscall policy is written in the architecture of the
executor, it would be dangerous to allow a different architecture for the
sandboxee. Indeed, allowing a seemingly harmless syscall that in fact means
another more harmful syscall could open up the sandbox to an escape.
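A few well-known syscall numbers make the danger concrete. The lookup below is an illustrative sketch (`Arch` and `SyscallName` are hypothetical names and the table is deliberately tiny); the numbers themselves are the standard Linux assignments for x86-64 and 32-bit x86.

```cpp
#include <string>

enum class Arch { kX86_64, kX86_32 };

// Returns the syscall name for a handful of well-known numbers, or "unknown".
std::string SyscallName(Arch arch, int nr) {
  if (arch == Arch::kX86_64) {
    switch (nr) {
      case 0: return "read";
      case 1: return "write";
      case 2: return "open";
      case 3: return "close";
      case 59: return "execve";
      case 60: return "exit";
    }
  } else {
    switch (nr) {
      case 1: return "exit";
      case 2: return "fork";
      case 3: return "read";
      case 4: return "write";
      case 5: return "open";
      case 6: return "close";
      case 11: return "execve";
    }
  }
  return "unknown";
}
```

A policy written for x86-64 that allows number 2 intends to permit `open`; interpreted as a 32-bit syscall, the same number means `fork`.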
## Any limits on the number of sandboxes an executor process can request?
For each sandboxee instance (new process spawned from the forkserver) a new
thread is created - that's where the limitation would lie.
## Can an Executor request the creation of more than one Sandbox?
No. There is a 1:1 correspondence - an `Executor` instance stores the PID of the
sandboxee, manages the `Comms` instance to the `Sandbox` instance, etc.
## Can I use sandbox2 from Go?
Yes. Write your executor in C++ and expose it to Go via SWIG.
## Why do I get `Function not implemented` inside `forkserver.cc`?
Sandbox2 only supports running on reasonably new kernels. Our current cut-off is
the 3.19 kernel though that might change in the future. The reason for this is
that we are using relatively new kernel features including user namespaces and
seccomp with the TSYNC flag.
If you are running on prod, this should not be an issue, since almost the entire
fleet is running a new enough kernel. If you have any issues with this, please
contact us.
If you are running on Debian or Ubuntu, updating your kernel is as easy as
`apt-get install linux-image-[recent version]`.

@ -1,356 +0,0 @@
# Getting started with Sandbox2
## Introduction
In this guide, you will learn how to create your own sandbox, policy and tweaks.
It is meant as a guide, alongside the [examples](examples.md) and code
documentation in the header files.
## 1. Choose an executor
Sandboxing starts with an *executor* (see [How it works](howitworks.md)), which
will be responsible for running the *sandboxee*. The API for this is in
[executor.h](../executor.h). It is very flexible to let you choose what works
best for your use case.
### a. Execute a binary with sandboxing already enabled
This is the simplest and safest way to use sandboxing. For examples see
[static](examples.md#static) and [sandboxed tool](examples.md#tool).
```c++
#include "sandboxed_api/sandbox2/executor.h"
std::string path = "path/to/binary";
// args[0] will become the sandboxed process' argv[0], typically the path to
// the binary.
std::vector<std::string> args = {path};
auto executor = absl::make_unique<sandbox2::Executor>(path, args);
```
### b. Tell the executor when to be sandboxed
This offers you the flexibility to be unsandboxed during initialization, then
choose when to enter sandboxing by calling
`::sandbox2::Client::SandboxMeHere()`. The code has to be careful to always
call this or it would be unsafe to proceed, and it has to be single-threaded
(read why in the [FAQ](faq.md#Can-I-use-threads)). For an example see
[crc4](examples.md#CRC4).
Note: The [filesystem restrictions](#Filesystem-checks) will be in effect right
from the start of your sandboxee. Using this mode allows you to enable the
syscall filter later on from the sandboxee.
```c++
#include "sandboxed_api/sandbox2/executor.h"
std::string path = "path/to/binary";
std::vector<std::string> args = {path};
auto executor = absl::make_unique<sandbox2::Executor>(path, args);
executor->set_enable_sandbox_before_exec(false);
```
### c. Prepare a binary, wait for fork requests, and sandbox on your own
This mode allows you to start a binary, prepare it for sandboxing, and - at the
specific moment of your binary's lifecycle - make it available for the
executor. The executor will send fork requests to your binary, which will
`fork()` (via `::sandbox2::ForkingClient::WaitAndFork()`). The newly created
process will be ready to be sandboxed with
`::sandbox2::Client::SandboxMeHere()`. This mode comes with a few downsides,
however: for example, it pulls in more dependencies in your sandboxee and
does not play well with namespaces, so it is only recommended if you have
tight performance requirements.
For an example see [custom_fork](examples.md#custom_fork).
```c++
#include "sandboxed_api/sandbox2/executor.h"
// Start the custom ForkServer
std::string path = "path/to/binary";
std::vector<std::string> args = {path};
auto fork_executor = absl::make_unique<sandbox2::Executor>(path, args);
fork_executor->StartForkServer();
// Initialize Executor with Comms channel to the ForkServer
auto executor = absl::make_unique<sandbox2::Executor>(
fork_executor->ipc()->GetComms());
```
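The request/fork cycle of this mode can be sketched with plain POSIX calls. This is only an illustration of the idea of a fork server, not Sandbox2's actual protocol: the one-byte request format and the function names are assumptions made for this example, and the worker simply exits with the request value where a real sandboxee would enable its sandbox and do its work.

```c++
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs in the fork-server process: for every one-byte request read from
// `sock`, fork a worker, wait for it and report its exit code back.
void ServeForkRequests(int sock) {
  char request;
  while (read(sock, &request, 1) == 1) {
    int code = -1;
    pid_t worker = fork();
    if (worker == 0) {
      // A real sandboxee would enter its sandbox here before doing any work.
      _exit(request);  // Stand-in for the actual work.
    }
    int status = 0;
    if (worker > 0 && waitpid(worker, &status, 0) == worker &&
        WIFEXITED(status)) {
      code = WEXITSTATUS(status);
    }
    if (write(sock, &code, sizeof(code)) != sizeof(code)) break;
  }
}

// Starts the fork server and returns the requester's end of the socket pair,
// or -1 on error. A real implementation would also keep the server's PID so
// that it can be reaped later.
int StartForkServer() {
  int sv[2];
  if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;
  pid_t server = fork();
  if (server < 0) {
    close(sv[0]);
    close(sv[1]);
    return -1;
  }
  if (server == 0) {
    close(sv[0]);
    ServeForkRequests(sv[1]);
    _exit(0);
  }
  close(sv[1]);
  return sv[0];
}

// Requester side: sends one fork request and returns the worker's exit code,
// or -1 on error.
int RequestFork(int sock, char request) {
  int code = -1;
  if (write(sock, &request, 1) != 1) return -1;
  if (read(sock, &code, sizeof(code)) != sizeof(code)) return -1;
  return code;
}
```

The point of the design is that the expensive initialization happens once in the server process, and each request only pays for a `fork()`.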
## 2. Creating a policy
Once you have an executor you need to define the policy for the sandboxee: this
will restrict the syscalls and arguments that the sandboxee can make as well as
the files it can access. For instance, a policy could allow `read()` on a given
file descriptor (e.g. `0` for stdin) but not another.
To create a [policy object][filter], use the
[PolicyBuilder](../policybuilder.h). It comes with helper functions that allow
many common operations (such as `AllowSystemMalloc()`), whitelist syscalls
(`AllowSyscall()`) or grant access to files (`AddFile()`).
If you want to restrict syscall arguments or need to perform more complicated
checks, you can specify a raw seccomp-bpf filter using the bpf helper macros
from the Linux kernel. See the [kernel documentation][filter] for more
information about BPF. If you find yourself writing repetitive BPF code that
you think should have a usability wrapper, feel free to file a feature request.
Coming up with the syscalls to whitelist is still a bit of manual work
unfortunately. Create a policy with the syscalls you know your binary needs and
run it with a common workload. If a violation gets triggered, whitelist the
syscall and repeat the process. If you run into a violation that you think might
be risky to whitelist and the program handles errors gracefully, you can try to
make it return an error instead with `BlockSyscallWithErrno()`.
[filter]: https://www.kernel.org/doc/Documentation/networking/filter.txt
```c++
#include "sandboxed_api/sandbox2/policy.h"
#include "sandboxed_api/sandbox2/policybuilder.h"
#include "sandboxed_api/sandbox2/util/bpf_helper.h"
std::unique_ptr<sandbox2::Policy> CreatePolicy() {
  return sandbox2::PolicyBuilder()
      .AllowSyscall(__NR_read)  // See also AllowRead()
      .AllowTime()              // Allow time, gettimeofday and clock_gettime
      .AddPolicyOnSyscall(__NR_write, {
          ARG(0),         // fd is the first argument of write (argument #0)
          JEQ(1, ALLOW),  // Allow write only on fd 1
          KILL,           // Kill if not fd 1
      })
      .AddPolicyOnSyscall(__NR_mprotect, {
          ARG_32(2),  // prot is a 32-bit wide argument, so it's OK to use the
                      // *_32 macro here
          JNE32(PROT_READ | PROT_WRITE, KILL),  // prot must be RW, otherwise
                                                // kill the process
          ARG(1),             // len is a 64-bit argument
          JNE(0x1000, KILL),  // Allow single-page calls only, otherwise kill
                              // the process
          ALLOW,              // Allow the syscall to proceed if prot and
                              // len match
      })
      // Allow the open() syscall but always return "not found".
      .BlockSyscallWithErrno(__NR_open, ENOENT)
      .BuildOrDie();
}
```
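To make the semantics of such a policy easier to reason about, here is a plain C++ model of the decisions the example policy is meant to take. It is not BPF and does not use the Sandbox2 API: the `Sys` enum, the `Decision` struct and `Evaluate()` are hypothetical names for this sketch, and it assumes that any syscall not mentioned in the policy counts as a violation.

```c++
#include <sys/mman.h>

#include <cerrno>
#include <cstdint>

// Hypothetical stand-ins for the syscalls used in the example policy.
enum class Sys { kRead, kTime, kWrite, kMprotect, kOpen, kOther };

struct Decision {
  enum Action { kAllow, kKill, kErrno } action;
  int error;  // Only meaningful when action == kErrno.
};

// Models the outcome of the example policy for one syscall and its arguments.
Decision Evaluate(Sys nr, const uint64_t args[6]) {
  switch (nr) {
    case Sys::kRead:
    case Sys::kTime:
      return {Decision::kAllow, 0};
    case Sys::kWrite:
      // Allowed only when the first argument (fd) is 1.
      return {args[0] == 1 ? Decision::kAllow : Decision::kKill, 0};
    case Sys::kMprotect: {
      // prot is compared as a 32-bit value, len as a 64-bit value.
      const bool prot_ok = static_cast<uint32_t>(args[2]) ==
                           static_cast<uint32_t>(PROT_READ | PROT_WRITE);
      const bool len_ok = args[1] == 0x1000;
      return {prot_ok && len_ok ? Decision::kAllow : Decision::kKill, 0};
    }
    case Sys::kOpen:
      // The call is not executed; the caller sees ENOENT instead.
      return {Decision::kErrno, ENOENT};
    default:
      return {Decision::kKill, 0};
  }
}
```

Writing the expected decisions down like this before translating them into BPF macros can help when reviewing a policy.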
Tip: Test for the most used syscalls at the beginning so you can allow them
early without consulting the rest of the policy.
### Filesystem checks
The default way to grant access to files is by using the `AddFile()` class of
functions of the `PolicyBuilder`. This will automatically enable user namespace
support that allows us to create a custom chroot for the sandboxee and gives you
some other features such as creating tmpfs mounts.
```c++
sandbox2::PolicyBuilder()
// ...
.AddFile("/etc/localtime")
.AddDirectory("/usr/share/fonts")
.AddTmpfs("/tmp")
.BuildOrDie();
```
## 3. Adjusting limits
Sandboxing by restricting syscalls is one thing, but if the job can run
indefinitely or exhaust RAM and other resources that is not good either.
Therefore, by default the sandboxee runs under tight execution limits, which can
be adjusted using the [Limits](../limits.h) class, available by calling
`limits()` on the `Executor` object created earlier. For an example see [sandbox
tool](examples.md#tool).
```c++
// Restrict the address space size of the sandboxee to 4 GiB.
executor->limits()->set_rlimit_as(4ULL << 30);
// Kill sandboxee with SIGXFSZ if it writes more than 1 GiB to the filesystem.
executor->limits()->set_rlimit_fsize(1ULL << 30);
// Number of file descriptors which can be used by the sandboxee.
executor->limits()->set_rlimit_nofile(1ULL << 10);
// The sandboxee is not allowed to create core files.
executor->limits()->set_rlimit_core(0);
// Maximum 300s of real CPU time.
executor->limits()->set_rlimit_cpu(300);
// Maximum 120s of wall time.
executor->limits()->set_walltime_limit(absl::Seconds(120));
```
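For readers unfamiliar with resource limits, the following sketch shows the underlying POSIX mechanism with `getrlimit()`/`setrlimit()`. It assumes that the limits configured above correspond to the kernel resource limits of the same name (`RLIMIT_AS`, `RLIMIT_CORE` and so on); the helper functions are hypothetical and not part of Sandbox2.

```c++
#include <sys/resource.h>

#include <cstdint>

// Lowers the soft limit of `resource` to `value`, keeping the hard limit.
// Returns false if the limit could not be read or set, for instance when
// `value` exceeds the current hard limit.
bool SetSoftLimit(int resource, uint64_t value) {
  struct rlimit lim;
  if (getrlimit(resource, &lim) != 0) return false;
  lim.rlim_cur = static_cast<rlim_t>(value);
  return setrlimit(resource, &lim) == 0;
}

// Returns the current soft limit of `resource`, or 0 if it cannot be read.
uint64_t GetSoftLimit(int resource) {
  struct rlimit lim;
  if (getrlimit(resource, &lim) != 0) return 0;
  return static_cast<uint64_t>(lim.rlim_cur);
}
```

For example, `SetSoftLimit(RLIMIT_CORE, 0)` disables core files for the calling process and the children it starts afterwards.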
## 4. Running the sandboxee
With our executor and policy ready, we can now create the `Sandbox2` object and
run it synchronously. For an example see [static](examples.md#static).
```c++
#include "sandboxed_api/sandbox2/sandbox2.h"
sandbox2::Sandbox2 s2(std::move(executor), std::move(policy));
auto result = s2.Run(); // Synchronous
LOG(INFO) << "Result of sandbox execution: " << result.ToString();
```
You can also run it asynchronously, for instance to communicate with the
sandboxee. For examples see [crc4](examples.md#CRC4) and [sandbox
tool](examples.md#tool).
```c++
#include "sandboxed_api/sandbox2/sandbox2.h"
sandbox2::Sandbox2 s2(std::move(executor), std::move(policy));
if (s2.RunAsync()) {
  // Communicate with the sandboxee here; use s2.Kill() to kill it if needed.
}
auto result = s2.AwaitResult();
LOG(INFO) << "Final execution status: " << result.ToString();
```
## 5. Communicating with the sandboxee
The executor can communicate with the sandboxee with file descriptors.
Depending on your situation, that can be all that you need (e.g., to share a
file with the sandboxee or to read the sandboxee standard output).
If you need more communication logic, you can implement your own protocol or
reuse our convenient **comms** API able to send integers, strings, byte
buffers, protobufs or file descriptors. Bonus: in addition to C++, we also
provide a pure-C comms library, so it can be used easily when sandboxing C
third-party projects.
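If you decide to implement your own protocol on top of a file descriptor, a length-prefixed message format is a common starting point. The sketch below is a generic POSIX example under that assumption, not the wire format of the comms API; all function names are made up for this illustration, and it does not retry on `EINTR`.

```c++
#include <sys/socket.h>
#include <unistd.h>

#include <cstdint>
#include <string>

// Writes exactly `len` bytes to `fd`. Returns false on error.
bool WriteAll(int fd, const void* data, size_t len) {
  const char* p = static_cast<const char*>(data);
  while (len > 0) {
    ssize_t n = write(fd, p, len);
    if (n <= 0) return false;
    p += n;
    len -= static_cast<size_t>(n);
  }
  return true;
}

// Reads exactly `len` bytes from `fd`. Returns false on error or early EOF.
bool ReadAll(int fd, void* data, size_t len) {
  char* p = static_cast<char*>(data);
  while (len > 0) {
    ssize_t n = read(fd, p, len);
    if (n <= 0) return false;
    p += n;
    len -= static_cast<size_t>(n);
  }
  return true;
}

// Sends a 32-bit length (host byte order) followed by the payload.
bool SendMessage(int fd, const std::string& msg) {
  const uint32_t len = static_cast<uint32_t>(msg.size());
  return WriteAll(fd, &len, sizeof(len)) &&
         WriteAll(fd, msg.data(), msg.size());
}

// Receives one message written by SendMessage().
bool RecvMessage(int fd, std::string* msg) {
  uint32_t len = 0;
  if (!ReadAll(fd, &len, sizeof(len))) return false;
  msg->resize(len);
  return len == 0 || ReadAll(fd, &(*msg)[0], len);
}
```

Since both ends run on the same machine, host byte order is sufficient here. A receiver that does not trust its peer should additionally cap the accepted length before resizing the buffer.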
### a. Sharing file descriptors
Using the [IPC](../ipc.h) (*Inter-Process Communication*) API, you can either:
* use `MapFd()` to map file descriptors from the executor to the sandboxee, for
instance to share a file opened from the executor for use in the sandboxee,
as it is done in the [static](examples.md#static) example.
```c++
// The executor opened /proc/version and passes it to the sandboxee as stdin
executor->ipc()->MapFd(proc_version_fd, STDIN_FILENO);
```
or
* use `ReceiveFd()` to create a socketpair endpoint, for instance to read the
sandboxee standard output or standard error, as it is done in the
[sandbox tool](examples.md#tool) example.
```c++
// The executor receives a file descriptor of the sandboxee stdout
int recv_fd1 = executor->ipc()->ReceiveFd(STDOUT_FILENO);
```
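The effect of mapping a descriptor onto the child's stdin can be illustrated with plain `fork()` and `dup2()`. This is a sketch of the general idea only and makes no claim about how `MapFd()` is implemented; `RunChildWithStdin()` is a hypothetical helper whose child just reports how many bytes it could read from its stdin.

```c++
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Forks a child that sees `source_fd` as its stdin. The child reads once from
// stdin and exits with the number of bytes read (255 on error). Returns the
// child's exit code, or -1 if the child could not be started or waited for.
int RunChildWithStdin(int source_fd) {
  pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) {
    // Make the shared descriptor available under the well-known number 0.
    if (dup2(source_fd, STDIN_FILENO) < 0) _exit(255);
    char buf[64];
    ssize_t n = read(STDIN_FILENO, buf, sizeof(buf));
    _exit(n < 0 ? 255 : static_cast<int>(n));
  }
  int status = 0;
  if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status)) return -1;
  return WEXITSTATUS(status);
}
```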
### b. Using the comms API
Using the [comms](../comms.h) API, you can send integers, strings or byte
buffers. For an example see [crc4](examples.md#CRC4).
To use comms, first get it from the executor IPC:
```c++
auto* comms = executor->ipc()->GetComms();
```
To send data to the sandboxee, use one of the `Send*` family of functions.
For instance in the case of [crc4](examples.md#CRC4), the executor sends an
`unsigned char buf[size]` with `SendBytes(buf, size)`:
```c++
if (!(comms->SendBytes(static_cast<const uint8_t*>(buf), size))) {
/* handle error */
}
```
To receive data from the sandboxee, use one of the `Recv*` functions. For
instance in the case of [crc4](examples.md#CRC4), the executor receives the
checksum into a 32-bit unsigned integer:
```c++
uint32_t crc4;
if (!(comms->RecvUint32(&crc4))) {
/* handle error */
}
```
### c. Sharing data with buffers
In some situations, it can be useful to share memory between executor and
sandboxee in order to exchange large amounts of data without expensive copies
being sent back and forth. The [buffer API](../buffer.h) serves this use
case: the executor creates a `Buffer`, either by size and data to be passed, or
directly from a file descriptor, and passes it to the sandboxee using
`comms->SendFD()` in the executor and `comms->RecvFD()` in the sandboxee.
For example, to create a buffer in the executor, send its file descriptor to
the sandboxee, and afterwards see what the sandboxee did with it:
```c++
sandbox2::Buffer buffer;
buffer.Create(1ULL << 20); // 1 MiB
s2.RunAsync();
comms->SendFD(buffer.GetFD());
auto result = s2.AwaitResult();
uint8_t* buf = buffer.buffer(); // As modified by sandboxee
size_t len = buffer.size();
```
On the other side the sandboxee receives the buffer file descriptor, creates the
buffer object and can work with it:
```c++
int fd;
comms.RecvFD(&fd);
sandbox2::Buffer buffer;
buffer.Setup(fd);
uint8_t *buf = buffer.GetBuffer();
memset(buf, 'X', buffer.GetSize()); /* work with the buffer */
```
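The reason passing a file descriptor avoids copies is that both sides can map the same pages. The following generic sketch demonstrates this with a temporary file and `mmap()`; it is an assumption-laden illustration (the temporary-file approach, the `/tmp` path and the helper names are choices made for this example) and not a description of how `Buffer` is implemented.

```c++
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

#include <cstdint>
#include <cstdlib>

// Creates an already unlinked temporary file of `size` bytes and returns its
// file descriptor, or -1 on error. The descriptor is what would be sent to
// the other process.
int CreateSharedFd(size_t size) {
  char path[] = "/tmp/shared_buffer_XXXXXX";
  int fd = mkstemp(path);
  if (fd < 0) return -1;
  unlink(path);  // The file lives on for as long as a descriptor is open.
  if (ftruncate(fd, static_cast<off_t>(size)) != 0) {
    close(fd);
    return -1;
  }
  return fd;
}

// Maps `size` bytes of `fd` as shared, writable memory. Returns nullptr on
// error.
uint8_t* MapShared(int fd, size_t size) {
  void* p = mmap(nullptr, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
  return p == MAP_FAILED ? nullptr : static_cast<uint8_t*>(p);
}
```

Two mappings of the same descriptor, whether in one process or in two, observe each other's writes without any data being sent over a socket.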
## 6. Exiting
If running the sandbox synchronously, then `Run` will only return when it's
finished:
```c++
auto result = s2.Run();
LOG(INFO) << "Final execution status: " << result.ToString();
```
If running asynchronously, you can decide at any time to kill the sandboxee:
```c++
s2.Kill();
```
Or just wait for completion and the final execution status:
```c++
auto result = s2.AwaitResult();
LOG(INFO) << "Final execution status: " << result.ToString();
```
## 7. Test
Like regular code, your sandbox implementation should have tests. Sandbox tests
are not meant to test the program correctness, but instead to check whether the
sandboxed program can run without issues like sandbox violations. This also
makes sure that the policy is correct.
A sandboxed program is tested the same way it would run in production, with the
arguments and input files it would normally process.
It can be as simple as a shell test or C++ tests using subprocesses. Check out
[the examples](examples.md) for inspiration.
## Conclusion
Thanks for reading this far, we hope you liked our guide and now feel empowered
to create your own sandboxes to help keep your users safe.
Creating sandboxes and policies is a difficult task prone to subtle errors. To
remain on the safe side, have a security expert review your policy and code.

View File

@ -1,57 +0,0 @@
# How it works
## Overview
The sandbox technology is organized around 2 processes:
* An **executor** sets up and runs the *sandboxee*:
* Also known as *parent*, *supervisor* or *monitor*
* By itself is not sandboxed
* Is regular C++ code using the Sandbox2 API
* The **sandboxee**, a child program running in the sandboxed environment:
* Also known as *child* or *sandboxed process*
* Receives its policy from the executor and applies it
* Can come in different shapes:
* Another binary, like in the [crc4](../examples/crc4/crc4sandbox.cc) and
[static](../examples/static/static_sandbox.cc) examples
* A third party binary for which you do not have the source
Purpose/goal:
* Restrict the sandboxee to a set of allowed syscalls and their arguments
* The tighter the policy, the better
Example:
A really tight policy could deny all except reads and writes on standard
input and output file descriptors. Inside this sandbox, a program could take
input, process it, and send the output back.
* The processing is not allowed to make any other syscall, or else it is killed
for policy violation.
* If the processing is compromised (code execution by a malicious user), it
cannot do anything bad other than producing bad output (that the executor and
others still need to handle correctly).
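As an illustration of a program that fits such a tight policy, the sketch below processes a stream using only `read()` and `write()` on two descriptors. The upper-casing transformation and the function name are arbitrary choices for this example; note that a real program also needs to be allowed to exit.

```c++
#include <unistd.h>

#include <cctype>

// Copies `in_fd` to `out_fd`, upper-casing ASCII letters on the way. The only
// syscalls issued by this function are read() and write().
bool UppercaseStream(int in_fd, int out_fd) {
  char buf[4096];
  for (;;) {
    ssize_t n = read(in_fd, buf, sizeof(buf));
    if (n == 0) return true;  // End of input.
    if (n < 0) return false;
    for (ssize_t i = 0; i < n; ++i) {
      buf[i] = static_cast<char>(
          std::toupper(static_cast<unsigned char>(buf[i])));
    }
    ssize_t off = 0;
    while (off < n) {
      ssize_t w = write(out_fd, buf + off, static_cast<size_t>(n - off));
      if (w <= 0) return false;
      off += w;
    }
  }
}
```

Called as `UppercaseStream(STDIN_FILENO, STDOUT_FILENO)`, the processing stays within the reads and writes on standard input and output described above.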
## Sandbox Policies
The sandbox relies on **seccomp-bpf** provided by the Linux kernel. **seccomp**
is a Linux kernel facility for sandboxing and **BPF** is a way to write syscall
filters (the very same used for network filters). Read more about
[seccomp-bpf on Wikipedia](https://en.wikipedia.org/wiki/Seccomp#seccomp-bpf).
In practice, you will generate your policy using our
[PolicyBuilder class](../policybuilder.h). If you need more complex rules, you
can specify raw BPF macros, like in the [crc4](../examples/crc4/crc4sandbox.cc)
example.
Filesystem accesses are restricted with the help of Linux
[user namespaces](http://man7.org/linux/man-pages/man7/user_namespaces.7.html).
User namespaces make it possible to drop the sandboxee into a custom chroot environment
without requiring root privileges.
## Getting Started
Read our [Getting started](getting-started.md) page to set up your first
sandbox.