The idea here is to have a `Network` object that contains functions for
network operations and an optional userdata object that can manage those
network operations. This allows e.g. a fuzzer to replace the network
functions with no-ops or fuzzer inputs, reducing the need for `#ifdef`s.
Cimple cannot actually find these without also causing false positives,
but I found them with cimple before removing the code causing false
positives again.
I don't know if this will actually work, or how many of these "fixes" I
need to get msan to be happy on CI. For me locally, it all works fine.
On CI, for some reason it's not fine even though I run in the exact same
docker image as CI.
For coverity, which continues to think we're overrunning buffers when
at this point it's easy to prove we're not. Here would be the corrected
coverity finding:
6. Condition packet_length <= 105 /* 1 + 32 * 2 + 24 + 16 */, taking false branch.
7. Condition packet_length > 1024, taking false branch.
Now packet_length must be > 105 and <= 1024.
12. Condition len1 == packet_length - (89 /* 1 + 32 * 2 + 24 */) - 16, taking true branch.
len1 must be > 0 (105 - 89 - 16) and <= 919 (1024 - 89 - 16).
14. decr: Decrementing len1. The value of len1 is now between 0 and 919 (inclusive).
This is where coverity goes wrong: it thinks len1 could be up to 2147483629.
15. buffer access should be OK. Coverity thinks it's not.
Nothing very noteworthy, I just came across this and made it slightly
more readable.
I'm not making this function `bool` right now because it's used in NGC
and that will break.
As a side-effect, DHT now always accepts LAN discovery packets, even
when LAN discovery is disabled. When LAN discovery is disabled, those
packets are ignored.
These were found by the new cimple type check which is completely
unforgiving to implicit boolean conversions. After this PR, there should
be no more implicit int-to-bool or bool-to-int conversions.
These were found by the new stronger type check in cimple. The one
bugfix is in `crypto_sha512_cmp`, which used to think `crypto_verify_32`
returns bool while actually it's -1/0/1.
This uses mallocfail to further increase coverage using the existing
tests. Also:
* Moved the non-auto "tox_one_test" to gtest. This should be
split into smaller tests later.
* Changed `hole_punching` to `bool`.
Missed a few of those in check-c. check-cimple now catches these with a
stronger type check.
Other changes:
* `ptr + int` must always have the `ptr` first, so `int + ptr` is not
allowed anymore.
* `close` and `time` were shadowing libc functions. `file_data` was
shadowed in a function (and is not a good function name anyway), so
renamed to `send_file_data` which is more descriptive.
* Within a function, all local variables of the same name must have the
same type.
* The `strerror_r` change wasn't necessary, but I kept it because it
seems a bit clearer to me now. `#ifdef`s inside functions are a bit
confusing sometimes.
Previously we would try to send three random TCP relays that we're
connected to to each friend once every 5 minutes. The problem with
this method is that it could take an extraordinarily long time
to share every relay; some relays might be consistently skipped
while others might be sent repeatedly. Moreover, there's no
guarantee that the nodes you try to send are actually online.
This leads to a prety unreliable and flaky way of sharing.
Now we reduce the timer to two minutes, and cycle through the list
trying 3 nodes each share attempt. This guarantees that every online
node in our list gets shared with every friend after a fixed amount of
time (which depends on how many nodes are in the list)
One of these was creating a single 262144 byte stack frame. We now have
a way to check and limit the allocation size of a VLA. The `Cmp_Data`
ones were also fairly large. Now, no allocation is larger than 2KiB
(though rtp.c allocates close to that much).
This commit adds functionality for clients to interact with
the DHT, sending getnodes requests to their peers and receiving
nodes in getnodes responses.
These help static analysis and ubsan. We should eventually have all
functions annotated like this with a cimple check to make sure every
pointer has an explicit nullability annotation. The `nullable`
annotation does nothing in GCC, but will be used by cimple to validate
that every parameter has defined nullability.
It was kind of thread-safe, maybe, but there was a data race that makes
tsan unhappy. We now do interface detection once per Tox instance
instead of once per process.
* Function arguments must use `foo_cb *p` and can't just use `foo_cb p`
* You can no longer cast function pointers (if it's incompatible, you
must wrap the callback). I'm avoiding this with tokstyle exclusions.
strerror_r() has two versions: GNU-specific and XSI-compliant. The XSI
version always stores the string in the provided buffer, but the GNU
version might store it in the provided buffer or it might use some
immutable static buffer instead. Since we always free the error string,
we might end up freeing the immutable static buffer.
Also added a whole bunch of logging that I needed while debugging the
issue. The solution in the end is that bootstrap needs to resolve IPs,
and getaddrinfo fails in the browser. Most of the time we bootstrap
against IPs anyway, so trying to parse as IP address first will shortcut
that.
The brackets serve no purpose and make us do extra string
parsing when using the output for other things
Also removed a useless call to ip_ntoa in LAN_discovery.c
We still have them in toxav. That will need to be cleaned up later.
Flexible array members have very limited usefulness. In this particular
case, it's almost entirely useless. It confuses static analysers and is
yet one more C feature we need to understand and support. It is also the
only reason we need special support in tokstyle for calloc with a `+`
operator in the member size.
Use of `strcpy` in these particular cases was safe, but it's hard to
tell and also useless. `strcpy` would effectively need to do another
`strlen` which we already did.
Also removed sprintf, which was also safe in this case but it's easier to
be "obviously safe", especially for static analysers.
We have a more portable wrapper that is now also thread-safe. Also
stopped using sprintf in the one place we used it. This doesn't really
help much, but it allows us to forbid sprintf globally.
Also added a valgrind build to run it on every pull request. I've had to
disable a few tests because valgrind makes those run infinitely slowly,
consistently timing them out.
It's nice we are able to compile with `tcc`. Let's not break that.
CompCert is also neat, but its interpreter mode doesn't work on tox, so
we only use the compiler.
/var/lib/tox-bootstrapd on the host is owned by hosts's tox-bootstrapd
and chowned 700, but the container attempts to access it as its own
tox-bootstrapd user with possibly different uid:gid, which will fail if
host's tox-bootstrapd user has different uid:gid than the tox-bootstrapd
user inside the container.
This change makes the container use host's tox-bootstrapd uid:gid, which
fixes the issue.
The android warnings are disabled now because they suggest using
linux-only extensions of libc. Useful for android indeed, but we're
targeting non-android and non-linux systems as well.
The default stack size for musl-libc is 128kb. Therefore we should try to keep stack
allocations well below this limit in order to avoid stack overflows.
We no longer allow `int a, b;`. In the few cases where we used it, we
instead better
* limit the scope of the identifier (e.g. in a for-init-decl)
* split the line and have 2 separate declarators, because the
identifiers designate different types of things (e.g. friend numbers
and group numbers).
This check puts all of our code in a C++ anonymous namespace, which is
effectively making all functions `static`. This allows the compiler to
determine that a function is unused, so we can delete it.
* Use-after-free because we free network before dht in one case.
* Various unchecked allocs in tests (not so important).
* We used to not check whether ping arrays were actually allocated in DHT.
* `ping_kill` and `ping_array_kill` used to crash when passing NULL.
Also:
* Added an assert in all public API functions to ensure tox isn't NULL.
The error message you get from that is a bit nicer than "Segmentation
fault" when clients (or our tests) do things wrong.
* Decreased the sleep time in iterate_all_wait from 20ms to 5ms.
Everything seems to still work with 5ms, and this greatly decreases
the amount of time spent per test run, making oomer run much faster.
* Use fully static build for the bootstrap daemon.
* Store a sha256sum of the binary in the repo.
* Updated documentation for it.
* Add support for fully static build in cmake.
* Enable the docker build on every PR, so we catch changes to the
checksum. I realise this is adding toil, but having the checksum is
valuable for security of released binaries.
tox-bootstrapd can use around 600 TCP sockets during TCP server's normal
functioning. Many systems default to having a soft limit of 1024 open file
descriptors, which we are close to reaching, so it was suggested we bump that
limit to a higher number. iphy suggested increasing it to 32768.
Don't know why codes with macro dosen't work.
As it's only a few expensive, just code it without macro for now.
\#if (MIN_LOGGER_LEVEL == LOG_TRACE) || (MIN_LOGGER_LEVEL == LOG_DEBUG)
fprintf(stderr, "[%s] %s:%d(%s) %s\n", strlevel, file, line, func, message);
\#endif
Reduced by, e.g.:
* `file_transfer_test`: 33% of the `clock_gettime` calls.
* `tox_many_test`: 53% of the `clock_gettime` calls.
Other tests will see similar improvements. Real world applications will
be closer to 40-50% improvement, since tox_many_test has 100 nodes, while
file_transfer_test has 2 nodes.
It turns out, `unix_time` is also monotonic, and is used as such, so I've
renamed the new functions to `mono_time_*`.
2018-07-08:
```
00:01 <@irungentoo> the idea used to be that the unix_time() function
could go backward in time but I think I might have started using it like
if it could not after I changed it so that it would never go back in time
```
Rules:
1. Constants are uppercase names: THE_CONSTANT.
2. SUE[1] types start with an uppercase letter and have at least one
lowercase letter in it: The_Type, THE_Type.
3. Function types end in "_cb": tox_friend_connection_cb.
4. Variable and function names are all lowercase: the_function.
This makes it easier for humans reading the code to determine what an
identifier means. I'm not convinced by the enum type name change, but I
don't know a better rule. Currently, a lot of enum types are spelled like
constants, which is confusing.
[1] struct/union/enum