diff --git a/CppCoreGuidelines.md b/CppCoreGuidelines.md index 3f0e83e..2943a40 100644 --- a/CppCoreGuidelines.md +++ b/CppCoreGuidelines.md @@ -1,6 +1,6 @@ # C++ Core Guidelines -May 8, 2017 +May 23, 2017 Editors: @@ -67,7 +67,7 @@ You can sample rules for specific language features: [prefer initialization](#Rc-initialize) -- [copy](#Rc-copy-semantics) -- [move](#Rc-move-semantics) -- -[other operations](Rc-matched) -- +[other operations](#Rc-matched) -- [default](#Rc-eqdefault) * `class`: [data](#Rc-org) -- @@ -108,7 +108,7 @@ You can sample rules for specific language features: [may not fail](#Rc-dtor-fail) * exception: [errors](#S-errors) -- -[`throw`](Re-throw) -- +[`throw`](#Re-throw) -- [for errors only](#Re-errors) -- [`noexcept`](#Re-noexcept) -- [minimize `try`](#Re-catch) -- @@ -131,7 +131,7 @@ You can sample rules for specific language features: [lambdas](#Rf-capture-vs-overload) * `inline`: [small functions](#Rf-inline) -- -[in headers](Rs-inline) +[in headers](#Rs-inline) * initialization: [always](#Res-always) -- [prefer `{}`](#Res-list) -- @@ -142,7 +142,7 @@ You can sample rules for specific language features: * lambda expression: [when to use](#SS-lambdas) * operator: -[conventional](Ro-conventional) -- +[conventional](#Ro-conventional) -- [avoid conversion operators](#Ro-conventional) -- [and lambdas](#Ro-lambda) * `public`, `private`, and `protected`: @@ -166,7 +166,7 @@ You can sample rules for specific language features: * `virtual`: [interfaces](#Ri-abstract) -- [not `virtual`](#Rc-concrete) -- -[destructor](Rc-dtor-virtual) -- +[destructor](#Rc-dtor-virtual) -- [never fail](#Rc-dtor-fail) You can look at design concepts used to express the rules: @@ -1201,6 +1201,7 @@ Interface rule summary: * [I.24: Avoid adjacent unrelated parameters of the same type](#Ri-unrelated) * [I.25: Prefer abstract classes as interfaces to class hierarchies](#Ri-abstract) * [I.26: If you want a cross-compiler ABI, use a C-style subset](#Ri-abi) +* [I.30: Encapsulate rule violations](#Ri-encapsulate) See also @@ -1532,6 +1533,10 @@ Once language support becomes available (e.g., see the [contract proposal](http: `Expects()` can also be used to check a condition in the middle of an algorithm. +##### Note + +No, using `unsigned` is not a good way to sidestep the problem of [ensuring that a value is nonnegative](#Res-nonnegative). + ##### Enforcement (Not enforceable) Finding the variety of ways preconditions can be asserted is not feasible. Warning about those that can be easily identified (`assert()`) has questionable value in the absence of a language facility. @@ -1787,8 +1792,9 @@ Consider returning the result by value (use move semantics if the result is larg return res; } -**Alternative**: Pass ownership using a "smart pointer", such as `unique_ptr` (for exclusive ownership) and `shared_ptr` (for shared ownership). -However, that is less elegant and less efficient unless reference semantics are needed. +**Alternative**: [Pass ownership](#Rr-smartptrparam) using a "smart pointer", such as `unique_ptr` (for exclusive ownership) and `shared_ptr` (for shared ownership). +However, that is less elegant and often less efficient than returning the object itself, +so use smart pointers only if reference semantics are needed. **Alternative**: Sometimes older code can't be modified because of ABI compatibility requirements or lack of resources. In that case, mark owning pointers using `owner` from the [guideline support library](#S-gsl): @@ -1812,7 +1818,7 @@ caller, so that its lifetime is handled by the caller. Viewed another way: ownership transferring APIs are relatively rare compared to pointer-passing APIs, so the default is "no ownership transfer." -**See also**: [Argument passing](#Rf-conventional) and [value return](#Rf-T-return). +**See also**: [Argument passing](#Rf-conventional), [use of smart pointer arguments](#Rr-smartptrparam), and [value return](#Rf-T-return). ##### Enforcement @@ -2131,6 +2137,69 @@ If you use a single compiler, you can use full C++ in interfaces. That may requi (Not enforceable) It is difficult to reliably identify where an interface forms part of an ABI. +### I.30: Encapsulate rule violations + +##### Reason + +To keep code simple and safe. +Sometimes, ugly, unsafe, or error-prone techniques are necessary for logical or performance reasons. +If so, keep them local, rather than "infecting" interfaces so that larger groups of programmers have to be aware of the +subtleties. +Implementation complexity should, if at all possible, not leak through interfaces into user code. + +##### Example + +Consider a program that, depending on some form of input (e.g., arguments to `main`), should consume input +from a file, from the command line, or from standard input. +We might write + + bool owned; + owner inp; + switch (source) { + case std_in: owned = false; inp = &cin; + case command_line: owned = true; inp = new istringstream{argv[2]}; + case file: owned = true; inp = new ifstream{argv[2]}; + } + istream& in = *inp; + +This violated the rule [against uninitialized variables](#Res-always), +the rule against [ignoring ownership](#Ri-raw), +and the rule [against magic constants](#Res-magic) . +In particular, someone has to remember to somewhere write + + if (owned) delete inp; + +We could handle this particular example by using `unique_ptr` with a special deleter that does nothing for `cin`, +but that's complicated for novices (who can easily encounter this problem) and the example is an example of a more general +problem where a property that we would like to consider static (here, ownership) needs infrequently be addressed +at run time. +The common, most frequent, and safest examples can be handled statically, so we don't want to add cost and complexity to those. +But we must also cope with the uncommon, less-safe, and necessarily more expensive cases. +Such examples are discussed in [[Str15]](http://www.stroustrup.com/resource-model.pdf). + +So, we write a class + + class Istream { [[gsl::suppress(lifetime)]] + public: + enum Opt { from_line = 1 }; + Istream() { } + Istream(zstring p) :owned{true}, inp{new ifstream{p}} {} // read from file + Istream(zstring p, Opt) :owned{true}, inp{new istringstream{p}} {} // read from command line + ~Itream() { if (owned) delete inp; } + operator istream& () { return *inp; } + private: + bool owned = false; + istream* inp = &cin; + }; + +Now, the dynamic nature of `istream` ownership has been encapsulated. +Presumably, a bit of checking for potential errors would be added in real code. + +##### Enforcement + +* Hard, it is hard to decide what rule-breaking code is essential +* flag rule suppression that enable rule-violations to cross interfaces + # F: Functions A function specifies an action or a computation that takes the system from one consistent state to the next. It is the fundamental building block of programs. @@ -2188,6 +2257,7 @@ Other function rules: * [F.52: Prefer capturing by reference in lambdas that will be used locally, including passed to algorithms](#Rf-reference-capture) * [F.53: Avoid capturing by reference in lambdas that will be used nonlocally, including returned, stored on the heap, or passed to another thread](#Rf-value-capture) * [F.54: If you capture `this`, capture all variables explicitly (no default capture)](#Rf-this-capture) +* [F.55: Don't use `va_arg` arguments](#F-varargs) Functions have strong similarities to lambdas and function objects so see also Section ???. @@ -3730,6 +3800,50 @@ This is under active discussion in standardization, and may be addressed in a fu * Flag any lambda capture-list that specifies a default capture and also captures `this` (whether explicitly or via default capture) +### F.55: Don't use `va_arg` arguments + +##### Reason + +Reading from a `va_arg` assumes that the correct type was actually passed. +Passing to varargs assumes the correct type will be read. +This is fragile because it cannot generally be enforced to be safe in the language and so relies on programmer discipline to get it right. + +##### Example + + int sum(...) { + // ... + while (/*...*/) + result += va_arg(list, int); // BAD, assumes it will be passed ints + // ... + } + + sum(3, 2); // ok + sum(3.14159, 2.71828); // BAD, undefined + + template + auto sum(Args... args) { // GOOD, and much more flexible + return (... + args); // note: C++17 "fold expression" + } + + sum(3, 2); // ok: 5 + sum(3.14159, 2.71828); // ok: ~5.85987 + +##### Alternatives + +* overloading +* variadic templates +* `variant` arguments +* `initializer_list` (homogeneous) + +##### Note + +Declaring a `...` parameter is sometimes useful for techniques that don't involve actual argument passing, notably to declare "take-anything" functions so as to disable "everything else" in an overload set or express a catchall case in a template metaprogram. + +##### Enforcement + +* Issue a diagnostic for using `va_list`, `va_start`, or `va_arg`. +* Issue a diagnostic for passing an argument to a vararg parameter of a function that does not offer an overload for a more specific type in the position of the vararg. To fix: Use a different function, or `[[suppress(types)]]`. + # C: Classes and Class Hierarchies A class is a user-defined type, for which a programmer can define the representation, operations, and interfaces. @@ -4232,7 +4346,7 @@ Constructor rules: * [C.40: Define a constructor if a class has an invariant](#Rc-ctor) * [C.41: A constructor should create a fully initialized object](#Rc-complete) * [C.42: If a constructor cannot construct a valid object, throw an exception](#Rc-throw) -* [C.43: Ensure that a class has a default constructor](#Rc-default0) +* [C.43: Ensure that a value type class has a default constructor](#Rc-default0) * [C.44: Prefer default constructors to be simple and non-throwing](#Rc-default00) * [C.45: Don't define a default constructor that only initializes data members; use member initializers instead](#Rc-default) * [C.46: By default, declare single-argument constructors `explicit`](#Rc-explicit) @@ -4961,7 +5075,9 @@ Leaving behind an invalid object and relying on users to consistently check an ` There are domains, such as some hard-real-time systems (think airplane controls) where (without additional tool support) exception handling is not sufficiently predictable from a timing perspective. There the `is_valid()` technique must be used. In such cases, check `is_valid()` consistently and immediately to simulate [RAII](#Rr-raii). -**Alternative**: If you feel tempted to use some "post-constructor initialization" or "two-stage initialization" idiom, try not to do that. +##### Alternative + +If you feel tempted to use some "post-constructor initialization" or "two-stage initialization" idiom, try not to do that. If you really have to, look at [factory functions](#Rc-factory). ##### Note @@ -4972,13 +5088,22 @@ Another reason is been to delay initialization until an object is needed; the so ##### Enforcement -### C.43: Ensure that a class has a default constructor +??? + +### C.43: Ensure that a value type class has a default constructor ##### Reason Many language and library facilities rely on default constructors to initialize their elements, e.g. `T a[10]` and `std::vector v(10)`. +A default constructor often simplifies the task of defining a suitable [moved-from state](#???). -##### Example , bad +##### Note + +We have not (yet) formally defined [value type](#SS-concrete), but think of it as a class that behaves much as an `int`: +it can be copied using `=` and usually compared using `==`. +It is closely related to the notion of Regular type from [EoP](http://elementsofprogramming.com/) and [the Palo Alto TR](http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2012/n3351.pdf). + +##### Example class Date { // BAD: no default constructor public: @@ -4990,17 +5115,17 @@ Many language and library facilities rely on default constructors to initialize vector vd2(1000, Date{Month::October, 7, 1885}); // alternative The default constructor is only auto-generated if there is no user-declared constructor, hence it's impossible to initialize the vector `vd1` in the example above. +The absence of a default value can cause surprises for users and complicate its use, so if one can be reasonably defined, it should be. +`Date` is chosen to encourage thought: There is no "natural" default date (the big bang is too far back in time to be useful for most people), so this example is non-trivial. `{0, 0, 0}` is not a valid date in most calendar systems, so choosing that would be introducing something like floating-point's `NaN`. However, most realistic `Date` classes have a "first date" (e.g. January 1, 1970 is popular), so making that the default is usually trivial. -##### Example - class Date { public: Date(int dd, int mm, int yyyy); - Date() = default; // See also C.45 + Date() = default; // [See also](#Rc-default) // ... private: int dd = 1; @@ -5047,9 +5172,41 @@ Assuming that you want initialization, an explicit default initialization can he int i {}; // default initialize (to 0) }; +##### Example + +There are classes that simply don't have a reasonable default. + +A class designed to be useful only as a base does not need a default constructor because it cannot be constructed by itself: + + struct Shape { // pure interface: all members are pure virtual functions + void draw() = 0; + void rotate(int) = 0; + // ... + }; + +A class that must acquire a resource during construction: + + lock_guard g {mx}; // guard the mutex mx + lock_guard g2; // error: guarding nothing + +##### Note + +A class that has a "special state" that must be handled separately from other states by member functions or users causes extra work +(and most likely more errors). For example + + ofstream out {"Foobar"}; + // ... + out << log(time, transaction); + +If `Foobar` couldn't be opened for writing and `out` wasn't set to throw exceptions upon errors, the output operations become no-ops. +The implementation must take care of that case, and users must remember to test for success. + +Pointers, even smart pointers, that can point to nothing (null pointers) are an example of this. +Having a default constructor is not a panacea; ideally it defaults to a meaningful state such as `std::string`s `""` and `std::vector`s `{}`. + ##### Enforcement -* Flag classes without a default constructor +* Flag classes that are copyable by `=` or comparable with `==` without a default constructor ### C.44: Prefer default constructors to be simple and non-throwing @@ -6065,7 +6222,7 @@ Asymmetric treatment of operands is surprising and a source of errors where conv ##### Example - class X { + struct X { string name; int number; }; @@ -6257,6 +6414,7 @@ Accessing objects in a hierarchy rule summary: * [C.150: Use `make_unique()` to construct objects owned by `unique_ptr`s](#Rh-make_unique) * [C.151: Use `make_shared()` to construct objects owned by `shared_ptr`s](#Rh-make_shared) * [C.152: Never assign a pointer to an array of derived class objects to a pointer to its base](#Rh-array) +* [C.153: Prefer virtual function to casting](#Rh-use-virtual) ### C.120: Use class hierarchies to represent concepts with inherent hierarchical structure (only) @@ -6982,7 +7140,7 @@ Factoring out `Utility` makes sense if many derived classes share significant "i Obviously, the example is too "theoretical", but it is hard to find a *small* realistic example. `Interface` is the root of an [interface hierarchy](#Rh-abstract) -and `Utility` is the root of an [implementation hierarchy](Rh-kind). +and `Utility` is the root of an [implementation hierarchy](#Rh-kind). Here is [a slightly more realistic example](https://www.quora.com/What-are-the-uses-and-advantages-of-virtual-base-class-in-C%2B%2B/answer/Lance-Diduck) with an explanation. ##### Note @@ -7046,7 +7204,6 @@ Diagnose name hiding ##### Reason Capping a hierarchy with `final` is rarely needed for logical reasons and can be damaging to the extensibility of a hierarchy. -Capping an individual virtual function with `final` is error-prone as that `final` can easily be overlooked when defining/overriding a set of functions. ##### Example, bad @@ -7057,38 +7214,6 @@ Capping an individual virtual function with `final` is error-prone as that `fina class My_improved_widget : public My_widget { /* ... */ }; // error: can't do that -##### Example, bad - - struct Interface { - virtual int f() = 0; - virtual int g() = 0; - }; - - class My_implementation : public Interface { - int f() override; - int g() final; // I want g() to be FAST! - // ... - }; - - class Better_implementation : public My_implementation { - int f(); - int g(); - // ... - }; - - void use(Interface* p) - { - int x = p->f(); // Better_implementation::f() - int y = p->g(); // My_implementation::g() Surprise? - } - - // ... - - use(new Better_implementation{}); - -The problem is easy to see in a small example, but in a large hierarchy with many virtual functions, tools are required for reliably spotting such problems. -Consistent use of `override` would catch this. - ##### Note Not every class is meant to be a base class. @@ -7097,6 +7222,11 @@ This rule is about using `final` on classes with virtual functions meant to be i ##### Note +Capping an individual virtual function with `final` is error-prone as `final` can easily be overlooked when defining/overriding a set of functions. +Fortunately, the compiler catches such mistakes: You cannot re-declare/re-open a `final` member in a derived class. + +##### Note + Claims of performance improvements from `final` should be substantiated. Too often, such claims are based on conjecture or experience with other languages. @@ -7208,10 +7338,37 @@ Flag all slicing. } } +Use of the other casts can violate type safety and cause the program to access a variable that is actually of type `X` to be accessed as if it were of an unrelated type `Z`: + + void user2(B* pb) // bad + { + D* pd = static_cast(pb); // I know that pb really points to a D; trust me + // ... use D's interface ... + } + + void user3(B* pb) // unsafe + { + if (some_condition) { + D* pd = static_cast(pb); // I know that pb really points to a D; trust me + // ... use D's interface ... + } + else { + // ... make do with B's interface ... + } + } + + void f() + { + B b; + user(&b); // OK + user2(&b); // bad error + user3(&b); // OK *if* the programmmer got the some_condition check right + } + ##### Note Like other casts, `dynamic_cast` is overused. -[Prefer virtual functions to casting](#???). +[Prefer virtual functions to casting](#Rh-use-virtual). Prefer [static polymorphism](#???) to hierarchy navigation where it is possible (no run-time resolution necessary) and reasonably convenient. @@ -7228,6 +7385,7 @@ Consider: struct B { const char* name {"B"}; + // if pb1->id() == pb2->id() *pb1 is the same type as *pb2 virtual const char* id() const { return name; } // ... }; @@ -7246,8 +7404,8 @@ Consider: cout << pb1->id(); // "B" cout << pb2->id(); // "D" - if (pb1->id() == pb2->id()) // *pb1 is the same type as *pb2 - if (pb2->id() == "D") { // looks innocent + + if (pb1->id() == "D") { // looks innocent D* pd = static_cast(pb1); // ... } @@ -7274,9 +7432,19 @@ However, compatibility makes changes difficult even if all agree that an effort In very rare cases, if you have measured that the `dynamic_cast` overhead is material, you have other means to statically guarantee that a downcast will succeed (e.g., you are using CRTP carefully), and there is no virtual inheritance involved, consider tactically resorting `static_cast` with a prominent comment and disclaimer summarizing this paragraph and that human attention is needed under maintenance because the type system can't verify correctness. Even so, in our experience such "I know what I'm doing" situations are still a known bug source. +##### Exception + +Consider: + + template + class Dx : B { + // ... + }; + ##### Enforcement -Flag all uses of `static_cast` for downcasts, including C-style casts that perform a `static_cast`. +* Flag all uses of `static_cast` for downcasts, including C-style casts that perform a `static_cast`. +* This rule is part of the [type-safety profile](#Pro-type-downcast). ### C.147: Use `dynamic_cast` to a reference type when failure to find the required class is considered an error @@ -7427,6 +7595,23 @@ Subscripting the resulting base pointer will lead to invalid object access and p * Flag all combinations of array decay and base to derived conversions. * Pass an array as a `span` rather than as a pointer, and don't let the array name suffer a derived-to-base conversion before getting into the `span` + +### C.153: Prefer virtual function to casting + +##### Reason + +A virtual function call is safe, whereas casting is error-prone. +A virtual function call reaches the most derived function, whereas a cast may reach an intermediate class and therefore +give a wrong result (especially as a hierarchy is modified during maintenance). + +##### Example + + ??? + +##### Enforcement + +See [C.146](#Rh-dynamic_cast) and ??? + ## C.over: Overloading and overloaded operators You can overload ordinary functions, template functions, and operators. @@ -9248,6 +9433,7 @@ Expression rules: * [ES.61: Delete arrays using `delete[]` and non-arrays using `delete`](#Res-del) * [ES.62: Don't compare pointers into different arrays](#Res-arr2) * [ES.63: Don't slice](#Res-slice) +* [ES.64: Use the `T{e}`notation for construction](#Res-construct) Statement rules: @@ -9273,6 +9459,8 @@ Arithmetic rules: * [ES.103: Don't overflow](#Res-overflow) * [ES.104: Don't underflow](#Res-underflow) * [ES.105: Don't divide by zero](#Res-zero) +* [ES.106: Don't try to avoid negative values by using `unsigned`](#Res-nonnegative) +* [ES.107: Don't use `unsigned` for subscripts](#Res-subscripts) ### ES.1: Prefer the standard library to other libraries and to "handcrafted code" @@ -9874,6 +10062,29 @@ Creating optimal and equivalent code from all of these examples should be well w (but don't make performance claims without measuring; a compiler may very well not generate optimal code for every example and there may be language rules preventing some optimization that you would have liked in a particular case). +##### Example + +This rule covers member variables. + + class X { + public: + X(int i, int ci) : m2{i}, cm2{ci} {} + // ... + + private: + int m1 = 7; + int m2; + int m3; + + const int cm1 = 7; + const int cm2; + const int cm3; + }; + +The compiler will flag the uninitialized `cm3` because it is a `const`, but it will not catch the lack of initialization of `m3`. +Usually, a rare spurious member initialization is worth the absence of errors from lack of initialization and often an optimizer +can eliminate a redundant initialization (e.g., an initialization that occurs immediately before an assignment). + ##### Note Complex initialization has been popular with clever programmers for decades. @@ -10273,17 +10484,6 @@ It nicely encapsulates local initialization, including cleaning up scratch varia If at all possible, reduce the conditions to a simple set of alternatives (e.g., an `enum`) and don't mix up selection and initialization. -##### Example - - bool owned = false; - owner inp = [&]{ - switch (source) { - case default: owned = false; return &cin; - case command_line: owned = true; return new istringstream{argv[2]}; - case file: owned = true; return new ifstream{argv[2]}; - }(); - istream& in = *inp; - ##### Enforcement Hard. At best a heuristic. Look for an uninitialized variable followed by a loop assigning to it. @@ -10424,6 +10624,7 @@ This is basically the way `printf` is implemented. * Flag definitions of C-style variadic functions. * Flag `#include ` and `#include ` + ## ES.stmt: Statements Statements control the flow of control (except for function calls and exception throws, which are expressions). @@ -11025,17 +11226,197 @@ You should know enough not to need parentheses for: Complicated pointer manipulation is a major source of errors. -* Do all pointer arithmetic on a `span` (exception ++p in simple loop???) -* Avoid pointers to pointers -* ??? +##### Note + +Use `gsl::span` instead. +Pointers should [only refer to single objects](#Ri-array). +Pointer arithmetic is fragile and easy to get wrong, the source of many, many bad bugs and security violations. +`span` is a bounds-checked, safe type for accessing arrays of data. +Access into an array with known bounds using a constant as a subscript can be validated by the compiler. + +##### Example, bad + + void f(int* p, int count) + { + if (count < 2) return; + + int* q = p + 1; // BAD + + ptrdiff_t d; + int n; + d = (p - &n); // OK + d = (q - p); // OK + + int n = *p++; // BAD + + if (count < 6) return; + + p[4] = 1; // BAD + + p[count - 1] = 2; // BAD + + use(&p[0], 3); // BAD + } + +##### Example, good + + void f(span a) // BETTER: use span in the function declaration + { + if (a.length() < 2) return; + + int n = a[0]; // OK + + span q = a.subspan(1); // OK + + if (a.length() < 6) return; + + a[4] = 1; // OK + + a[count - 1] = 2; // OK + + use(a.data(), 3); // OK + } + +##### Note + +Subscripting with a variable is difficult for both tools and humans to validate as safe. +`span` is a run-time bounds-checked, safe type for accessing arrays of data. +`at()` is another alternative that ensures single accesses are bounds-checked. +If iterators are needed to access an array, use the iterators from a `span` constructed over the array. + +##### Example, bad + + void f(array a, int pos) + { + a[pos / 2] = 1; // BAD + a[pos - 1] = 2; // BAD + a[-1] = 3; // BAD (but easily caught by tools) -- no replacement, just don't do this + a[10] = 4; // BAD (but easily caught by tools) -- no replacement, just don't do this + } + +##### Example, good + +Use a `span`: + + void f1(span a, int pos) // A1: Change parameter type to use span + { + a[pos / 2] = 1; // OK + a[pos - 1] = 2; // OK + } + + void f2(array arr, int pos) // A2: Add local span and use that + { + span a = {arr, pos} + a[pos / 2] = 1; // OK + a[pos - 1] = 2; // OK + } + +Use a `at()`: + + void f3(array a, int pos) // ALTERNATIVE B: Use at() for access + { + at(a, pos / 2) = 1; // OK + at(a, pos - 1) = 2; // OK + } + +##### Example, bad + + void f() + { + int arr[COUNT]; + for (int i = 0; i < COUNT; ++i) + arr[i] = i; // BAD, cannot use non-constant indexer + } + +##### Example, good + +Use a `span`: + + void f1() + { + int arr[COUNT]; + span av = arr; + for (int i = 0; i < COUNT; ++i) + av[i] = i; + } + +Use a `span` and range-`for`: + + void f1a() + { + int arr[COUNT]; + span av = arr; + int i = 0; + for (auto& e : av) + e = i++; + } + +Use `at()` for access: + + void f2() + { + int arr[COUNT]; + for (int i = 0; i < COUNT; ++i) + at(arr, i) = i; + } + +Use a range-`for`: + + void f3() + { + int arr[COUNT]; + for (auto& e : arr) + e = i++; + } + +##### Note + +Tooling can offer rewrites of array accesses that involve dynamic index expressions to use `at()` instead: + + static int a[10]; + + void f(int i, int j) + { + a[i + j] = 12; // BAD, could be rewritten as ... + at(a, i + j) = 12; // OK -- bounds-checked + } ##### Example - ??? +Turning an array into a pointer (as the language does essentially always) removes opportunities for checking, so avoid it + + void g(int* p); + + void f() + { + int a[5]; + g(a); // BAD: are we trying to pass an array? + g(&a[0]); // OK: passing one object + } + +If you want to pass an array, say so: + + void g(int* p, size_t length); // old (dangerous) code + + void g1(span av); // BETTER: get g() changed. + + void f2() + { + int a[5]; + span av = a; + + g(av.data(), av.length()); // OK, if you have no choice + g1(a); // OK -- no decay here, instead use implicit span ctor + } ##### Enforcement -We need a heuristic limiting the complexity of pointer arithmetic statement. +* Flag any arithmetic operation on an expression of pointer type that results in a value of pointer type. +* Flag any indexing expression on an expression or variable of array type (either static array or `std::array`) where the indexer is not a compile-time constant expression with a value between `0` or and the upper bound of the array. +* Flag any expression that would rely on implicit conversion of an array type to a pointer type. + +This rule is part of the [bounds-safety profile](#SS-bounds). + ### ES.43: Avoid expressions with undefined order of evaluation @@ -11076,15 +11457,21 @@ C++17 tightens up the rules for the order of evaluation, but the order of evalua int i = 0; f(++i, ++i); -The call will most likely be `f(0, 1)` or `f(1, 0)`, but you don't know which. Technically, the behavior is undefined. +The call will most likely be `f(0, 1)` or `f(1, 0)`, but you don't know which. +Technically, the behavior is undefined. +In C++17, this code does not have undefined behavior, but it is still not specified which argument is evaluated first. ##### Example -??? overloaded operators can lead to order of evaluation problems (shouldn't :-() +Overloaded operators can lead to order of evaluation problems: - f1()->m(f2()); // m(f1(), f2()) + f1()->m(f2()); // m(f1(), f2()) cout << f1() << f2(); // operator<<(operator<<(cout, f1()), f2()) +In C++17, these examples work as expected (left to right) and assignments are evaluated right to left (just as ='s binding is right-to-left) + + f1() = f2(); // undefined behavior in C++14; in C++17, f2() is evaluated before f1() + ##### Enforcement Can be detected by a good analyzer. @@ -11230,12 +11617,22 @@ are seriously overused as well as a major source of errors. If you feel the need for a lot of casts, there may be a fundamental design problem. +##### Alternatives + +Casts are widely (mis) used. Modern C++ has constructs that eliminate the need for casts in many contexts, such as + +* Use templates +* Use `std::variant` + + ##### Enforcement * Force the elimination of C-style casts * Warn against named casts -* Warn if there are many functional style casts (there is an obvious problem in quantifying 'many'). * Warn against unnecessary casts. +* Warn if there are many functional style casts (there is an obvious problem in quantifying 'many'). +* The [type profile](#Pro-type-reinterpretcast) bans `reinterpret_cast`. + ### ES.49: If you must use a cast, use a named cast @@ -11286,25 +11683,100 @@ conversions between types that might result in loss of precision. (It is a compilation error to try to initialize a `float` from a `double` in this fashion, for example.) +##### Note + +`reinterpret_cast` can be essential, but the essential uses (e.g., turning a machine address into pointer) are not type safe: + + auto p = reinterpret_cast(0x800); // inherently dangerous + + ##### Enforcement * Flag C-style and functional casts. -* Suggest brace initialization or gsl::narrow_cast instead of named casts on arithmetic types. +* The [type profile](#Pro-type-reinterpretcast) bans `reinterpret_cast`. ### ES.50: Don't cast away `const` ##### Reason It makes a lie out of `const`. +If the variable is actually declared `const`, the result of "casting away `const`" is undefined behavior. -##### Note +##### Example, bad -Usually the reason to "cast away `const`" is to allow the updating of some transient information of an otherwise immutable object. -Examples are caching, memoization, and precomputation. -Such examples are often handled as well or better using `mutable` or an indirection than with a `const_cast`. + void f(const int& i) + { + const_cast(i) = 42; // BAD + } + + static int i = 0; + static const int j = 0; + + f(i); // silent side effect + f(j); // undefined behavior ##### Example +Sometimes, you may be tempted to resort to `const_cast` to avoid code duplication, such as when two accessor functions that differ only in `const`-ness have similar implementations. For example: + + class Bar; + + class Foo { + public: + // BAD, duplicates logic + Bar& get_bar() { + /* complex logic around getting a non-const reference to my_bar */ + } + + const Bar& get_bar() const { + /* same complex logic around getting a const reference to my_bar */ + } + private: + Bar my_bar; + }; + +Instead, prefer to share implementations. Normally, you can just have the non-`const` function call the `const` function. However, when there is complex logic this can lead to the following pattern that still resorts to a `const_cast`: + + class Foo { + public: + // not great, non-const calls const version but resorts to const_cast + Bar& get_bar() { + return const_cast(static_cast(*this).get_bar()); + } + const Bar& get_bar() const { + /* the complex logic around getting a const reference to my_bar */ + } + private: + Bar my_bar; + }; + +Although this pattern is safe when applied correctly, because the caller must have had a non-`const` object to begin with, it's not ideal because the safety is hard to enforce automatically as a checker rule. + +Instead, prefer to put the common code in a common helper function -- and make it a template so that it deduces `const`. This doesn't use any `const_cast` at all: + + class Foo { + public: // good + Bar& get_bar() { return get_bar_impl(*this); } + const Bar& get_bar() const { return get_bar_impl(*this); } + private: + Bar my_bar; + + template // good, deduces whether T is const or non-const + static auto get_bar_impl(T& t) -> decltype(t.get_bar()) + { /* the complex logic around getting a possibly-const reference to my_bar */ } + }; + +##### Exception + +You may need to cast away `const` when calling `const`-incorrect functions. +Prefer to wrap such functions in inline `const`-correct wrappers to encapsulate the cast in one place. + +##### Example + +Sometimes, "cast away `const`" is to allow the updating of some transient information of an otherwise immutable object. +Examples are caching, memoization, and precomputation. +Such examples are often handled as well or better using `mutable` or an indirection than with a `const_cast`. + Consider keeping previously computed results around for a costly operation: int compute(int x); // compute a value for x; assume this to be costly @@ -11393,7 +11865,8 @@ In any variant, we must guard against data races on the `cache` in multithreaded ##### Enforcement -Flag `const_cast`s. See also [Type.3: Don't use `const_cast` to cast away `const` (i.e., at all)](#Pro-type-constcast) for the related Profile. +* Flag `const_cast`s. +* This rule is part of the [type-safety profile](#Pro-type-constcast) for the related Profile. ### ES.55: Avoid the need for range checking @@ -11642,6 +12115,97 @@ For example: Warn against slicing. +### ES.64: Use the `T{e}`notation for construction + +##### Reason + +The `T{e}` construction syntax makes it explicit that construction is desired. +The `T{e}` construction syntax doesn't allow narrowing. +`T{e}` is the only safe and general expression for constructing a value of type `T` from an expression `e`. +The casts notations `T(e)` and `(T)e` are neither safe nor general. + +##### Example + +For built-in types, the construction notation protects against narrowing and reinterpretation + + void use(char ch, int i, double d, char* p, long long lng) + { + int x1 = int{ch}; // OK, but redundant + int x2 = int{d}; // error: double->int narrowing; use a cast if you need to + int x3 = int{p}; // error: pointer to->int; use a reinterpret_cast if you really need to + int x4 = int{lng}; // error: long long->int narrowing; use a cast if you need to + + int y1 = int(ch); // OK, but redundant + int y2 = int(d); // bad: double->int narrowing; use a cast if you need to + int y3 = int(p); // bad: pointer to->int; use a reinterpret_cast if you really need to + int y4 = int(lng); // bad: long->int narrowing; use a cast if you need to + + int z1 = (int)ch; // OK, but redundant + int z2 = (int)d; // bad: double->int narrowing; use a cast if you need to + int z3 = (int)p; // bad: pointer to->int; use a reinterpret_cast if you really need to + int z4 = (int)lng; // bad: long long->int narrowing; use a cast if you need to + } + +The integer to/from pointer conversions are implementation defined when using the `T(e)` or `(T)e` notations, and non-portable +between platforms with different integer and pointer sizes. + +##### Note + +[Avoid casts](#Res-casts) (explicit type conversion) and if you must [prefer named casts](#Res-casts-named). + +##### Note + +When unambiguous, the `T` can be left out of `T{e}`. + + complex f(complex); + + auto z = f({2*pi, 1}); + +##### Note + +The construction notation is the most general [initializer notation](#Res-list). + +##### Exception + +`std::vector` and other containers were defined before we had `{}` as a notation for construction. +Consider: + + vector vs {10}; // ten empty strings + vector vi1 {1, 2, 3, 4, 5, 6, 7, 8, 9, 10}; // ten elements 1..10 + vector vi2 {10}; // one element with the value 10 + +How do we get a `vector` of 10 default initialized `int`s? + + vector v3(10); // ten elements with value 0 + +The use of `()` rather than `{}` for number of elements is conventional (going back to the early 1980s), hard to change, but still +a design error: for a container where the element type can be confused with the number of elements, we have an ambiguity that +must be resolved. +The conventional resolution is to interpret `{10}` as a list of one element and use `(10)` to distinguish a size. + +This mistake need not be repeated in new code. +We can define a type to represent the number of elements: + + struct Count { int n }; + + template + class Vector { + public: + Vector(Count n); // n default-initialized elements + Vector(initializer_list init); // init.size() elements + // ... + }; + + Vector v1{10}; + Vector v2{Count{10}}; + Vector v3{Count{10}}; // yes, there is still a very minor problem + +The main problem left is to find a suitable name for `Count`. + +##### Enforcement + +Flag the C-style `(T)e` and functional-style `T(e)` casts. + ## Arithmetic ### ES.100: Don't mix signed and unsigned arithmetic @@ -11851,6 +12415,126 @@ This also applies to `%`. * Flag division by an integral value that could be zero + +### ES.106: Don't try to avoid negative values by using `unsigned` + +##### Reason + +Choosing `unsigned` implies many changes to the usual behavior of integers, including modulo arithmetic, +can suppress warnings related to overflow, +and opens the door for errors related to signed/unsigned mixes. +Using `unsigned` doesn't actually eliminate the possibility of negative values. + +##### Example + + unsigned int u1 = -2; // OK: the value of u1 is 4294967294 + int i1 = -2; + unsigned int u2 = i1; // OK: the value of u2 is 4294967294 + int i2 = u2; // OK: the value of i2 is -2 + +These problems with such (perfectly legal) constructs are hard to spot in real code and are the source of many real-world errors. +Consider: + + unsigned area(unsigned height, unsigned width) { return height*width; } // [see also](#Ri-expects) + // ... + int height; + cin >> height; + auto a = area(height, 2); // if the input is -2 a becomes 4294967292 + +Remember that `-1` when assigned to an `unsigned int` becomes the largest `unsigned int`. +Also, since unsigned arithmetic is modulo arithmetic the multiplication didn't overflow, it wrapped around. + +##### Example + + unsigned max = 100000; // "accidental typo", I mean to say 10'000 + unsigned short x = 100; + while (x < max) x += 100; // infinite loop + +Had `x` been a signed `short`, we could have warned about the undefined behavior upon overflow. + +##### Alternatives + +* use signed integers and check for `x >= 0` +* use a positive integer type +* use an integer subrange type +* `Assert(-1 < x)` + +For example + + struct Positive { + int val; + Positive(int x) :val{x} { Assert(0 < x); } + operator int() { return val; } + }; + + int f(Positive arg) {return arg }; + + int r1 = f(2); + int r2 = f(-2); // throws + +##### Note + +??? + +##### Enforcement + +Hard: there is a lot of code using `unsigned` and we don't offer a practical positive number type. + + +### ES.107: Don't use `unsigned` for subscripts + +##### Reason + +To avoid signed/unsigned confusion. +To enable better optimization. +To enable better error detection. + +##### Example, bad + + vector vec {1, 2, 3, 4, 5}; + + for (int i=0; i < vec.size(); i+=2) // mix int and unsigned + cout << vec[i] << '\n'; + for (unsigned i=0; i < vec.size(); i+=2) // risk wraparound + cout << vec[i] << '\n'; + for (vector::size_type i=0; i < vec.size(); i+=2) // verbose + cout << vec[i] << '\n'; + for (auto i=0; i < vec.size(); i+=2) // mix int and unsigned + cout << vec[i] << '\n'; + +##### Note + +The built-in array uses signed subscripts. +The standard-library containers use unsigned subscripts. +Thus, no perfect and fully compatible solution is possible. +Given the known problems with unsigned and signed/unsigned mixtures, better stick to (signed) integers. + +##### Example + + template + struct My_container { + public: + // ... + T& operator[](int i); // not unsigned + // ... + }; + +##### Example + + ??? demonstrate improved code generation and potential for error detection ??? + +##### Alternatives + +Alternatives for users + +* use algorithms +* use range-for +* use iterators/pointers + +##### Enforcement + +Very tricky as long as the standard-library containers get it wrong. + # Per: Performance ??? should this section be in the main guide??? @@ -12496,7 +13180,7 @@ It simply has nothing to do with concurrency. } Here we have a problem: -This is perfectly good code in a single-threaded program, but have two treads execute this and +This is perfectly good code in a single-threaded program, but have two threads execute this and there is a race condition on `free_slots` so that two threads might get the same value and `free_slots`. That's (obviously) a bad data race, so people trained in other languages may try to fix it like this: @@ -12583,12 +13267,9 @@ Concurrency rule summary: * [CP.21: Use `std::lock()` or `std::scoped_lock` to acquire multiple `mutex`es](#Rconc-lock) * [CP.22: Never call unknown code while holding a lock (e.g., a callback)](#Rconc-unknown) * [CP.23: Think of a joining `thread` as a scoped container](#Rconc-join) -* [CP.24: Think of a detached `thread` as a global container](#Rconc-detach) -* [CP.25: Prefer `gsl::raii_thread` over `std::thread` unless you plan to `detach()`](#Rconc-raii_thread) -* [CP.26: Prefer `gsl::detached_thread` over `std::thread` if you plan to `detach()`](#Rconc-detached_thread) -* [CP.27: Use plain `std::thread` for `thread`s that detach based on a run-time condition (only)](#Rconc-thread) -* [CP.28: Remember to join scoped `thread`s that are not `detach()`ed](#Rconc-join-undetached) -* [CP.30: Do not pass pointers to local variables to non-`raii_thread`s](#Rconc-pass) +* [CP.24: Think of a `thread` as a global container](#Rconc-detach) +* [CP.25: Prefer `gsl::joining_thread` over `std::thread`](#Rconc-joining_thread) +* [CP.26: Don't `detach()` a thread](#Rconc-detached_thread) * [CP.31: Pass small amounts of data between threads by value, rather than by reference or pointer](#Rconc-data-by-value) * [CP.32: To share ownership between unrelated `thread`s use `shared_ptr`](#Rconc-shared) * [CP.40: Minimize context switching](#Rconc-switch) @@ -12756,26 +13437,25 @@ If a `thread` joins, we can safely pass pointers to objects in the scope of the void some_fct(int* p) { int x = 77; - raii_thread t0(f, &x); // OK - raii_thread t1(f, p); // OK - raii_thread t2(f, &glob); // OK + joining_thread t0(f, &x); // OK + joining_thread t1(f, p); // OK + joining_thread t2(f, &glob); // OK auto q = make_unique(99); - raii_thread t3(f, q.get()); // OK + joining_thread t3(f, q.get()); // OK // ... } -An `raii_thread` is a `std::thread` with a destructor that joined and cannot be `detached()`. +A `gsl::joining_thread` is a `std::thread` with a destructor that joins and that cannot be `detached()`. By "OK" we mean that the object will be in scope ("live") for as long as a `thread` can use the pointer to it. The fact that `thread`s run concurrently doesn't affect the lifetime or ownership issues here; these `thread`s can be seen as just a function object called from `some_fct`. ##### Enforcement -Ensure that `raii_thread`s don't `detach()`. +Ensure that `joining_thread`s don't `detach()`. After that, the usual lifetime and ownership (for local objects) enforcement applies. - -### CP.24: Think of a detached `thread` as a global container +### CP.24: Think of a `thread` as a global container ##### Reason @@ -12820,87 +13500,28 @@ Even objects with static storage duration can be problematic if used from detach thread continues until the end of the program, it might be running concurrently with the destruction of objects with static storage duration, and thus accesses to such objects might race. -##### Enforcement +##### Note + +This rule is redundant if you [don't `detach()`](#Rconc-detached_thread) and [use `gsl::joining_thread`](#Rconc-joining_thread). +However, converting code to follow those guidelines could be difficult and even impossible for third-party libraries. +In such cases, the rule becomes essential for lifetime safety and type safety. + In general, it is undecidable whether a `detach()` is executed for a `thread`, but simple common cases are easily detected. If we cannot prove that a `thread` does not `detach()`, we must assume that it does and that it outlives the scope in which it was constructed; After that, the usual lifetime and ownership (for global objects) enforcement applies. +##### Enforcement -### CP.25: Prefer `gsl::raii_thread` over `std::thread` unless you plan to `detach()` +Flag attempts to pass local variables to a thread that might `detach()`. + +### CP.25: Prefer `gsl::joining_thread` over `std::thread` ##### Reason -An `raii_thread` is a thread that joins at the end of its scope. - +A `joining_thread` is a thread that joins at the end of its scope. Detached threads are hard to monitor. - -??? Place all "immortal threads" on the free store rather than `detach()`? - -##### Example - - ??? - -##### Enforcement - -??? - -### CP.26: Prefer `gsl::detached_thread` over `std::thread` if you plan to `detach()` - -##### Reason - -Often, the need to `detach` is inherent in the `thread`s task. -Documenting that aids comprehension and helps static analysis. - -##### Example - - void heartbeat(); - - void use() - { - gsl::detached_thread t1(heartbeat); // obviously need not be joined - std::thread t2(heartbeat); // do we need to join? (read the code for heartbeat()) - // ... - } - -Flag unconditional `detach` on a plain `thread` - - -### CP.27: Use plain `std::thread` for `thread`s that detach based on a run-time condition (only) - -##### Reason - -`thread`s that are supposed to unconditionally `join` or unconditionally `detach` can be clearly identified as such. -The plain `thread`s should be assumed to use the full generality of `std::thread`. - -##### Example - - void tricky(thread* t, int n) - { - // ... - if (is_odd(n)) - t->detach(); - // ... - } - - void use(int n) - { - thread t { tricky, this, n }; - // ... - // ... should I join here? ... - } - -##### Enforcement - -??? - - - -### CP.28: Remember to join scoped `thread`s that are not `detach()`ed - -##### Reason - -A `thread` that has not been `detach()`ed when it is destroyed terminates the program. +It is harder to ensure absence of errors in detached threads (and potentially detached threads) ##### Example, bad @@ -12933,40 +13554,97 @@ A `thread` that has not been `detach()`ed when it is destroyed terminates the pr t2.join(); } // one bad bug left -??? Is `cout` synchronized? + +##### Example, bad + +The code determining whether to `join()` or `detach()` may be complicated and even decided in the thread of functions called from it or functions called by the function that creates a thread: + + void tricky(thread* t, int n) + { + // ... + if (is_odd(n)) + t->detach(); + // ... + } + + void use(int n) + { + thread t { tricky, this, n }; + // ... + // ... should I join here? ... + } + +This seriously complicates lifetime analysis, and in not too unlikely cases makes lifetime analysis impossible. +This implies that we cannot safely refer to local objects in `use()` from the thread or refer to local objects in the thread from `use()`. + +##### Note + +Make "immortal threads" globals, put them in an enclosing scope, or put them on the on the free store rather than `detach()`. +[don't `detach`](#Rconc-detached_thread). + +##### Note + +Because of old code and third party libraries using `std::thread` this rule can be hard to introduce. ##### Enforcement -* Flag `join`s for `raii_thread`s ??? -* Flag `detach`s for `detached_thread`s +Flag uses of `std::thread`: +* Suggest use of `gsl::joining_thread`. +* Suggest ["exporting ownership"](#Rconc-detached_thread) to an enclosing scope if it detaches. +* Seriously warn if it is not obvious whether if joins of detaches. -### CP.30: Do not pass pointers to local variables to non-`raii_thread`s +### CP.26: Don't `detach()` a thread ##### Reason -In general, you cannot know whether a non-`raii_thread` will outlive the scope of the variables, so that those pointers will become invalid. +Often, the need to outlive the scope of its creation is inherent in the `thread`s task, +but implementing that idea by `detach` makes it harder to monitor and communicate with the detached thread. +In particular, it is harder (though not impossible) to ensure that the thread completed as expected or lives for as long as expected. -##### Example, bad +##### Example + + void heartbeat(); void use() { - int x = 7; - thread t0 { f, ref(x) }; + std::thread t(heartbeat); // don't join; heartbeat is meant to run forever + t.detach(); // ... - t0.detach(); } -The `detach` may not be so easy to spot. -Use a `raii_thread` or don't pass the pointer. +This is a reasonable use of a thread, for which `detach()` is commonly used. +There are problems, though. +How do we monitor the detached thread to see if it is alive? +Something might go wrong with the heartbeat, and losing a heartbeat can be very serious in a system for which it is needed. +So, we need to communicate with the heartbeat thread +(e.g., through a stream of messages or notification events using a `condition_variable`). -##### Example, bad +An alternative, and usually superior solution is to control its lifetime by placing it in a scope outside its point of creation (or activation). +For example: - ??? put pointer to a local on a queue that is read by a longer-lived thread ??? + void heartbeat(); -##### Enforcement + gsl::joining_thread t(heartbeat); // heartbeat is meant to run "forever" -Flag pointers to locals passed in the constructor of a plain `thread`. +This heartbeat will (barring error, hardware problems, etc.) run for as long as the program does. + +Sometimes, we need to separate the point of creation from the point of ownership: + + void heartbeat(); + + unique_ptr tick_tock {nullptr}; + + void use() + { + // heartbeat is meant to run as long as tick_tock lives + tick_tock = make_unique(heartbeat); + // ... + } + +#### Enforcement + +Flag `detach()`. ### CP.31: Pass small amounts of data between threads by value, rather than by reference or pointer @@ -13084,10 +13762,10 @@ Instead, we could have a set of pre-created worker threads processing the messag void workers() // set up worker threads (specifically 4 worker threads) { - raii_thread w1 {worker}; - raii_thread w2 {worker}; - raii_thread w3 {worker}; - raii_thread w4 {worker}; + joining_thread w1 {worker}; + joining_thread w2 {worker}; + joining_thread w3 {worker}; + joining_thread w4 {worker}; } ##### Note @@ -14116,10 +14794,16 @@ To prevent slicing. // ... } -Instead, use: +Instead, use a reference: catch (exception& e) { /* ... */ } +of - typically better still - a `const` reference: + + catch (const exception& e) { /* ... */ } + +Most handlers do not modify their exception and in general we [recommend use of `const`](#Res-const). + ##### Enforcement Flag by-value exceptions if their types are part of a hierarchy (could require whole-program analysis to be perfect). @@ -17047,6 +17731,7 @@ Source file rule summary: * [SF.7: Don't write `using namespace` in a header file](#Rs-using-directive) * [SF.8: Use `#include` guards for all `.h` files](#Rs-guards) * [SF.9: Avoid cyclic dependencies among source files](#Rs-cycles) +* [SF.10: Avoid dependencies on implicitly `#included` names](#Rs-implicit) * [SF.20: Use `namespace`s to express logical structure](#Rs-namespace) * [SF.21: Don't use an unnamed (anonymous) namespace in a header](#Rs-unnamed) @@ -17381,6 +18066,76 @@ Eliminate cycles; don't just break them with `#include` guards. Flag all cycles. + +### SF.10: Avoid dependencies on implicitly `#included` names + +##### Reason + +Avoid surprises. +Avoid having to change `#include`s if an `#include`d header changes. +Avoid accidentally becoming dependent on implementation details and logically separate entities included in a header. + +##### Example + + #include + using namespace std; + + void use() // bad + { + string s; + cin >> s; // fine + getline(cin, s); // error: getline() not defined + if (s == "surprise") { // error == not defined + // ... + } + } + + exposes the definition of `std::string` ("why?" makes for a fun trivia question), +but it is not required to do so by transitively including the entire `` header, +resulting in the popular beginner question "why doesn't `getline(cin,s);` work?" +or even an occasional "`string`s cannot be compared with `==`). + +The solution is to explicitly `#include`: + + #include + #include + using namespace std; + + void use() + { + string s; + cin >> s; // fine + getline(cin, s); // fine + if (s == "surprise") { // fine + // ... + } + } + +##### Note + +Some headers exist exactly to collect a set of consistent declarations from a variety of headers. +For example: + + // basic_std_lib.h: + + #include + #include + #include + #include + #include + #include + +a user can now get that set of declarations with a single `#include`" + + #include "basic_std_lib.h" + +This rule against implicit inclusion is not meant to prevent such deliberate aggregation. + +##### Enforcement + +Enforcement would require some knowledge about what in a header is meant to be "exported" to users and what is there to enable implementation. +No really good solution is possible until we have modules. + ### SF.20: Use `namespace`s to express logical structure ##### Reason @@ -17451,6 +18206,7 @@ Standard-library rule summary: * [SL.1: Use libraries wherever possible](#Rsl-lib) * [SL.2: Prefer the standard library to other libraries](#Rsl-sl) * [SL.3: Do not add non-standard entities to namespace `std`](#sl-std) +* [SL.4: Use the standard library in a type-safe manner](#sl-safe) * ??? ### SL.1: Use libraries wherever possible @@ -17485,6 +18241,21 @@ Additions to `std` may clash with future versions of the standard. Possible, but messy and likely to cause problems with platforms. +### SL.4: Use the standard library in a type-safe manner + +##### Reason + +Because, obviously, breaking this rule can lead to undefined behavior, memory corruption, and all kinds of other bad errors. + +##### Note + +This is a semi-philosophical meta-rule, which needs many supporting concrete rules. +We need it as a umbrella for the more specific rules. + +Summary of more specific rules: + +* [SL.4: Use the standard library in a type-safe manner](#sl-safe) + ## SL.con: Containers @@ -17494,6 +18265,7 @@ Container rule summary: * [SL.con.1: Prefer using STL `array` or `vector` instead of a C array](#Rsl-arrays) * [SL.con.2: Prefer using STL `vector` by default unless you have a reason to use a different container](#Rsl-vector) +* [SL.con.3: Avoid bounds errors](#Rsl-bounds) * ??? ### SL.con.1: Prefer using STL `array` or `vector` instead of a C array @@ -17518,6 +18290,10 @@ For a variable-length array, use `std::vector`, which additionally can change it std::vector w(initial_size); // ok +##### Note + +Use `gsl::span` for non-owning references into a container. + ##### Enforcement * Flag declaration of a C array inside a function or class that also declares an STL container (to avoid excessive noisy warnings on legacy non-STL code). To fix: At least change the C array to a `std::array`. @@ -17547,6 +18323,87 @@ If you have a good reason to use another container, use that instead. For exampl * Flag a `vector` whose size never changes after construction (such as because it's `const` or because no non-`const` functions are called on it). To fix: Use an `array` instead. +### SL.con.3: Avoid bounds errors + +##### Reason + +Read or write beyond an allocated range of elements typically leads to bad errors, wrong results, crashes, and security violations. + +##### Note + +The standard-library functions that apply to ranges of elements all have (or could have) bounds-safe overloads that take `span`. +Standard types such as `vector` can be modified to perform bounds-checks under the bounds profile (in a compatible way, such as by adding contracts), or used with `at()`. + +Ideally, the in-bounds guarantee should be statically enforced. +For example: + +* a range-`for` cannot loop beyond the range of the container to which it is applied +* a `v.begin(),v.end()` is easily determined to be bounds safe + +Such loops are as fast as any unchecked/unsafe equivalent. + +Often a simple pre-check can eliminate the need for checking of individual indices. +For example + +* for `v.begin(),v.begin()+i` the `i` can easily be checked against `v.size()` + +Such loops can be much faster than individually checked element accesses. + +##### Example, bad + + void f() + { + array a, b; + memset(a.data(), 0, 10); // BAD, and contains a length error (length = 10 * sizeof(int)) + memcmp(a.data(), b.data(), 10); // BAD, and contains a length error (length = 10 * sizeof(int)) + } + +Also, `std::array<>::fill()` or `std::fill()` or even an empty initializer are better candidate than `memset()`. + +##### Example, good + + void f() + { + array a, b, c{}; // c is initialized to zero + a.fill(0); + fill(b.begin(), b.end(), 0); // std::fill() + fill(b, 0); // std::fill() + Ranges TS + + if ( a == b ) { + // ... + } + } + +##### Example + +If code is using an unmodified standard library, then there are still workarounds that enable use of `std::array` and `std::vector` in a bounds-safe manner. Code can call the `.at()` member function on each class, which will result in an `std::out_of_range` exception being thrown. Alternatively, code can call the `at()` free function, which will result in fail-fast (or a customized action) on a bounds violation. + + void f(std::vector& v, std::array a, int i) + { + v[0] = a[0]; // BAD + v.at(0) = a[0]; // OK (alternative 1) + at(v, 0) = a[0]; // OK (alternative 2) + + v.at(0) = a[i]; // BAD + v.at(0) = a.at(i); // OK (alternative 1) + v.at(0) = at(a, i); // OK (alternative 2) + } + +##### Enforcement + +* Issue a diagnostic for any call to a standard library function that is not bounds-checked. +??? insert link to a list of banned functions + +This rule is part of the [bounds profile](#SS-bounds). + +**TODO Notes**: + +* Impact on the standard library will require close coordination with WG21, if only to ensure compatibility even if never standardized. +* We are considering specifying bounds-safe overloads for stdlib (especially C stdlib) functions like `memcmp` and shipping them in the GSL. +* For existing stdlib functions and types like `vector` that are not fully bounds-checked, the goal is for these features to be bounds-checked when called from code with the bounds profile on, and unchecked when called from legacy code, possibly using contracts (concurrently being proposed by several WG21 members). + + + ## SL.str: String Text manipulation is a huge topic. @@ -18542,15 +19399,24 @@ An implementation of this profile shall recognize the following patterns in sour Type safety profile summary: -* [Type.1: Don't use `reinterpret_cast`](#Pro-type-reinterpretcast) -* [Type.2: Don't use `static_cast` downcasts. Use `dynamic_cast` instead](#Pro-type-downcast) -* [Type.3: Don't use `const_cast` to cast away `const` (i.e., at all)](#Pro-type-constcast) -* [Type.4: Don't use C-style `(T)expression` casts that would perform a `static_cast` downcast, `const_cast`, or `reinterpret_cast`](#Pro-type-cstylecast) -* [Type.4.1: Don't use `T(expression)` for casting](#Pro-fct-style-cast) -* [Type.5: Don't use a variable before it has been initialized](#Pro-type-init) -* [Type.6: Always initialize a member variable](#Pro-type-memberinit) -* [Type.7: Avoid accessing members of raw unions. Prefer `variant` instead](#Pro-fct-style-cast) -* [Type.8: Avoid reading from varargs or passing vararg arguments. Prefer variadic template parameters instead](#Pro-type-varargs) +* Type.1: Don't use `reinterpret_cast`: +A strict version of [Avoid casts](#Res-casts) and [prefer named casts](#Res-casts-named). +* Type.2: Don't use `static_cast` to downcast: +[Use `dynamic_cast` instead](#Rh-dynamic_cast). +* Type.3: Don't use `const_cast` to cast away `const` (i.e., at all): +[Don't cast away const](#Res-casts-const). +* Type.4: Don't use C-style `(T)expression` or functional `T(expression)` casts: +Prefer [construction](#Res-construct) or [named casts](#Res-cast-named). +* Type.5: Don't use a variable before it has been initialized: +[always initialize](#Res-always). +* Type.6: Always initialize a member variable: +[always initialize](#Res-always), +possibly using [default constructors](#Rc-default0) or +[default member initializers](#Rc-in-class-initializers). +* Type.7: Avoid naked union: +[Use `variant` instead](#Ru-naked). +* Type.8: Avoid varargs: +[Don't use `va_arg` arguments](#F-varargs). ##### Impact @@ -18559,572 +19425,180 @@ Exception may be thrown to indicate errors that cannot be detected statically (a Note that this type-safety can be complete only if we also have [Bounds safety](#SS-bounds) and [Lifetime safety](#SS-lifetime). Without those guarantees, a region of memory could be accessed independent of which object, objects, or parts of objects are stored in it. -### Type.1: Don't use `reinterpret_cast`. - -##### Reason - -Use of these casts can violate type safety and cause the program to access a variable that is actually of type `X` to be accessed as if it were of an unrelated type `Z`. - -##### Example, bad - - std::string s = "hello world"; - double* p = reinterpret_cast(&s); // BAD - -##### Enforcement - -* Issue a diagnostic for any use of `reinterpret_cast`. To fix: Consider using a `variant` instead. -* Issue a separate diagnostic for any reinterpret_cast to void*. To fix: Remove the cast and allow the implicit type conversion. -* Issue a separate diagnostic for any reinterpret_cast from void*. To fix: Use static_cast instead - -### Type.2: Don't use `static_cast` downcasts. Use `dynamic_cast` instead. - -##### Reason - -Use of these casts can violate type safety and cause the program to access a variable that is actually of type `X` to be accessed as if it were of an unrelated type `Z`. - -##### Example, bad - - class Base { public: virtual ~Base() = 0; }; - - class Derived1 : public Base { }; - - class Derived2 : public Base { - std::string s; - public: - std::string get_s() { return s; } - }; - - Derived1 d1; - Base* p1 = &d1; // ok, implicit conversion to pointer to Base is fine - - // BAD, tries to treat d1 as a Derived2, which it is not - Derived2* p2 = static_cast(p1); - // tries to access d1's nonexistent string member, instead sees arbitrary bytes near d1 - cout << p2->get_s(); - -##### Example, bad - - struct Foo { int a, b; }; - struct Foobar : Foo { int bar; }; - - void use(int i, Foo& x) - { - if (0 < i) { - Foobar& x1 = dynamic_cast(x); // error: Foo is not polymorphic - Foobar& x2 = static_cast(x); // bad - // ... - } - // ... - } - - // ... - - use(99, *new Foo{1, 2}); // not a Foobar - -If a class hierarchy isn't polymorphic, avoid casting. -It is entirely unsafe. -Look for a better design. -See also [C.146](#Rh-dynamic_cast). - -##### Enforcement - -Issue a diagnostic for any use of `static_cast` to downcast, meaning to cast from a pointer or reference to `X` to a pointer or reference to a type that is not `X` or an accessible base of `X`. To fix: If this is a downcast or cross-cast then use a `dynamic_cast` instead, otherwise consider using a `variant` instead. - -### Type.3: Don't use `const_cast` to cast away `const` (i.e., at all). - -##### Reason - -Casting away `const` is a lie. If the variable is actually declared `const`, it's a lie punishable by undefined behavior. - -##### Example, bad - - void f(const int& i) - { - const_cast(i) = 42; // BAD - } - - static int i = 0; - static const int j = 0; - - f(i); // silent side effect - f(j); // undefined behavior - -##### Example - -Sometimes you may be tempted to resort to `const_cast` to avoid code duplication, such as when two accessor functions that differ only in `const`-ness have similar implementations. For example: - - class Bar; - - class Foo { - public: - // BAD, duplicates logic - Bar& get_bar() { - /* complex logic around getting a non-const reference to my_bar */ - } - - const Bar& get_bar() const { - /* same complex logic around getting a const reference to my_bar */ - } - private: - Bar my_bar; - }; - -Instead, prefer to share implementations. Normally, you can just have the non-`const` function call the `const` function. However, when there is complex logic this can lead to the following pattern that still resorts to a `const_cast`: - - class Foo { - public: - // not great, non-const calls const version but resorts to const_cast - Bar& get_bar() { - return const_cast(static_cast(*this).get_bar()); - } - const Bar& get_bar() const { - /* the complex logic around getting a const reference to my_bar */ - } - private: - Bar my_bar; - }; - -Although this pattern is safe when applied correctly, because the caller must have had a non-`const` object to begin with, it's not ideal because the safety is hard to enforce automatically as a checker rule. - -Instead, prefer to put the common code in a common helper function -- and make it a template so that it deduces `const`. This doesn't use any `const_cast` at all: - - class Foo { - public: // good - Bar& get_bar() { return get_bar_impl(*this); } - const Bar& get_bar() const { return get_bar_impl(*this); } - private: - Bar my_bar; - - template // good, deduces whether T is const or non-const - static auto get_bar_impl(T& t) -> decltype(t.get_bar()) - { /* the complex logic around getting a possibly-const reference to my_bar */ } - }; - -##### Exception - -You may need to cast away `const` when calling `const`-incorrect functions. Prefer to wrap such functions in inline `const`-correct wrappers to encapsulate the cast in one place. - -##### See also: - -[ES.50, Don't cast away `const`](#Res-casts-const) for more discussion. - -##### Enforcement - -Issue a diagnostic for any use of `const_cast`. To fix: Either don't use the variable in a non-`const` way, or don't make it `const`. - -### Type.4: Don't use C-style `(T)expression` casts that would perform a `static_cast` downcast, `const_cast`, or `reinterpret_cast`. - -##### Reason - -Use of these casts can violate type safety and cause the program to access a variable that is actually of type `X` to be accessed as if it were of an unrelated type `Z`. -Note that a C-style `(T)expression` cast means to perform the first of the following that is possible: a `const_cast`, a `static_cast`, a `static_cast` followed by a `const_cast`, a `reinterpret_cast`, or a `reinterpret_cast` followed by a `const_cast`. This rule bans `(T)expression` only when used to perform an unsafe cast. - -##### Example, bad - - std::string s = "hello world"; - double* p0 = (double*)(&s); // BAD - - class Base { public: virtual ~Base() = 0; }; - - class Derived1 : public Base { }; - - class Derived2 : public Base { - std::string s; - public: - std::string get_s() { return s; } - }; - - Derived1 d1; - Base* p1 = &d1; // ok, implicit conversion to pointer to Base is fine - - // BAD, tries to treat d1 as a Derived2, which it is not - Derived2* p2 = (Derived2*)(p1); - // tries to access d1's nonexistent string member, instead sees arbitrary bytes near d1 - cout << p2->get_s(); - - void f(const int& i) { - (int&)(i) = 42; // BAD - } - - static int i = 0; - static const int j = 0; - - f(i); // silent side effect - f(j); // undefined behavior - -##### Enforcement - -Issue a diagnostic for any use of a C-style `(T)expression` cast that would invoke a `static_cast` downcast, `const_cast`, or `reinterpret_cast`. To fix: Use a `dynamic_cast`, `const`-correct declaration, or `variant`, respectively. - -### Type.4.1: Don't use `T(expression)` for casting. - -##### Reason - -If `e` is of a built-in type, `T(e)` is equivalent to the error-prone `(T)e`. - -##### Example, bad - - int* p = f(x); - auto i = int(p); // Potential damaging cast; don't or use `reinterpret_cast` - - short s = short(i); // potentially narrowing; don't or use `narrow` or `narrow_cast` - -##### Note - -The {}-syntax makes the desire for construction explicit and doesn't allow narrowing - - f(Foo{bar}); - -##### Enforcement - -Flag `T(e)` if used for `e` of a built-in type. - -### Type.5: Don't use a variable before it has been initialized. - -[ES.20: Always initialize an object](#Res-always) is required. - -### Type.6: Always initialize a member variable. - -##### Reason - -Before a variable has been initialized, it does not contain a deterministic valid value of its type. It could contain any arbitrary bit pattern, which could be different on each call. - -##### Example - - struct X { int i; }; - - X x; - use(x); // BAD, x has not been initialized - - X x2{}; // GOOD - use(x2); - -##### Enforcement - -* Issue a diagnostic for any constructor of a non-trivially-constructible type that does not initialize all member variables. To fix: Write a data member initializer, or mention it in the member initializer list. -* Issue a diagnostic when constructing an object of a trivially constructible type without `()` or `{}` to initialize its members. To fix: Add `()` or `{}`. - -### Type.7: Avoid accessing members of raw unions. Prefer `variant` instead. - -##### Reason - -Reading from a union member assumes that member was the last one written, and writing to a union member assumes another member with a nontrivial destructor had its destructor called. This is fragile because it cannot generally be enforced to be safe in the language and so relies on programmer discipline to get it right. - -##### Example - - union U { int i; double d; }; - - U u; - u.i = 42; - use(u.d); // BAD, undefined - - variant u; - u = 42; // u now contains int - use(u.get()); // ok - use(u.get()); // throws ??? update this when standardization finalizes the variant design - -Note that just copying a union is not type-unsafe, so safe code can pass a union from one piece of unsafe code to another. - -##### Enforcement - -* Issue a diagnostic for accessing a member of a union. To fix: Use a `variant` instead. - -### Type.8: Avoid reading from varargs or passing vararg arguments. Prefer variadic template parameters instead. - -##### Reason - -Reading from a vararg assumes that the correct type was actually passed. Passing to varargs assumes the correct type will be read. This is fragile because it cannot generally be enforced to be safe in the language and so relies on programmer discipline to get it right. - -##### Example - - int sum(...) { - // ... - while (/*...*/) - result += va_arg(list, int); // BAD, assumes it will be passed ints - // ... - } - - sum(3, 2); // ok - sum(3.14159, 2.71828); // BAD, undefined - - template - auto sum(Args... args) { // GOOD, and much more flexible - return (... + args); // note: C++17 "fold expression" - } - - sum(3, 2); // ok: 5 - sum(3.14159, 2.71828); // ok: ~5.85987 - -Note: Declaring a `...` parameter is sometimes useful for techniques that don't involve actual argument passing, notably to declare "take-anything" functions so as to disable "everything else" in an overload set or express a catchall case in a template metaprogram. - -##### Enforcement - -* Issue a diagnostic for using `va_list`, `va_start`, or `va_arg`. To fix: Use a variadic template parameter list instead. -* Issue a diagnostic for passing an argument to a vararg parameter of a function that does not offer an overload for a more specific type in the position of the vararg. To fix: Use a different function, or `[[suppress(types)]]`. ## Pro.bounds: Bounds safety profile -This profile makes it easier to construct code that operates within the bounds of allocated blocks of memory. It does so by focusing on removing the primary sources of bounds violations: pointer arithmetic and array indexing. One of the core features of this profile is to restrict pointers to only refer to single objects, not arrays. +This profile makes it easier to construct code that operates within the bounds of allocated blocks of memory. +It does so by focusing on removing the primary sources of bounds violations: pointer arithmetic and array indexing. +One of the core features of this profile is to restrict pointers to only refer to single objects, not arrays. -For the purposes of this document, bounds-safety is defined to be the property that a program does not use a variable to access memory outside of the range that was allocated and assigned to that variable. (Note that the safety is intended to be complete when combined also with [Type safety](#SS-type) and [Lifetime safety](#SS-lifetime), which cover other unsafe operations that allow bounds violations, such as type-unsafe casts that 'widen' pointers.) - -The following are under consideration but not yet in the rules below, and may be better in other profiles: - -* ??? - -An implementation of this profile shall recognize the following patterns in source code as non-conforming and issue a diagnostic. +We define bounds-safety to be the property that a program does not use an object to access memory outside of the range that was allocated for it. +Bounds safety is intended to be complete only when combined with [Type safety](#SS-type) and [Lifetime safety](#SS-lifetime), +which cover other unsafe operations that allow bounds violations. Bounds safety profile summary: -* [Bounds.1: Don't use pointer arithmetic. Use `span` instead](#Pro-bounds-arithmetic) -* [Bounds.2: Only index into arrays using constant expressions](#Pro-bounds-arrayindex) -* [Bounds.3: No array-to-pointer decay](#Pro-bounds-decay) -* [Bounds.4: Don't use standard library functions and types that are not bounds-checked](#Pro-bounds-stdlib) +* Bounds.1: Don't use pointer arithmetic. Use `span` instead: +[Pass pointers to single objects (only)](#Ri-array) and [Keep pointer arithmetic simple](#Res-simple). +* Bounds.2: Only index into arrays using constant expressions: +[Pass pointers to single objects (only)](#Ri-array) and [Keep pointer arithmetic simple](#Res-simple). +* Bounds.3: No array-to-pointer decay: +[Pass pointers to single objects (only)](#Ri-array) and [Keep pointer arithmetic simple](#Res-simple). +* Bounds.4: Don't use standard library functions and types that are not bounds-checked: +[Use the standard library in a type-safe manner](#Rsl-bounds). + +##### Impact + +Bounds safety implies that access to an object - notably arrays - does not access beyond the object's memory allocation. +This eliminates a large class of insidious and hard-to-find errors, including the (in)famous "buffer overflow" errors. +This closes security loopholes as well as a prominent source of memory corruption (when writing out of bounds). +Even an out-of-bounds access is "just a read", it can lead to invariant violations (when the accessed isn't of the assumed type) +and "mysterious values." -### Bounds.1: Don't use pointer arithmetic. Use `span` instead. - -##### Reason - -Pointers should only refer to single objects, and pointer arithmetic is fragile and easy to get wrong. `span` is a bounds-checked, safe type for accessing arrays of data. - -##### Example, bad - - void f(int* p, int count) - { - if (count < 2) return; - - int* q = p + 1; // BAD - - ptrdiff_t d; - int n; - d = (p - &n); // OK - d = (q - p); // OK - - int n = *p++; // BAD - - if (count < 6) return; - - p[4] = 1; // BAD - - p[count - 1] = 2; // BAD - - use(&p[0], 3); // BAD - } - -##### Example, good - - void f(span a) // BETTER: use span in the function declaration - { - if (a.length() < 2) return; - - int n = a[0]; // OK - - span q = a.subspan(1); // OK - - if (a.length() < 6) return; - - a[4] = 1; // OK - - a[count - 1] = 2; // OK - - use(a.data(), 3); // OK - } - -##### Enforcement - -Issue a diagnostic for any arithmetic operation on an expression of pointer type that results in a value of pointer type. - -### Bounds.2: Only index into arrays using constant expressions. - -##### Reason - -Dynamic accesses into arrays are difficult for both tools and humans to validate as safe. `span` is a bounds-checked, safe type for accessing arrays of data. `at()` is another alternative that ensures single accesses are bounds-checked. If iterators are needed to access an array, use the iterators from a `span` constructed over the array. - -##### Example, bad - - void f(array a, int pos) - { - a[pos / 2] = 1; // BAD - a[pos - 1] = 2; // BAD - a[-1] = 3; // BAD -- no replacement, just don't do this - a[10] = 4; // BAD -- no replacement, just don't do this - } - -##### Example, good - - // ALTERNATIVE A: Use a span - - // A1: Change parameter type to use span - void f1(span a, int pos) - { - a[pos / 2] = 1; // OK - a[pos - 1] = 2; // OK - } - - // A2: Add local span and use that - void f2(array arr, int pos) - { - span a = {arr, pos} - a[pos / 2] = 1; // OK - a[pos - 1] = 2; // OK - } - - // ALTERNATIVE B: Use at() for access - void f3(array a, int pos) - { - at(a, pos / 2) = 1; // OK - at(a, pos - 1) = 2; // OK - } - -##### Example, bad - - void f() - { - int arr[COUNT]; - for (int i = 0; i < COUNT; ++i) - arr[i] = i; // BAD, cannot use non-constant indexer - } - -##### Example, good - - // ALTERNATIVE A: Use a span - void f1() - { - int arr[COUNT]; - span av = arr; - for (int i = 0; i < COUNT; ++i) - av[i] = i; - } - - // ALTERNATIVE Aa: Use a span and range-for - void f1a() - { - int arr[COUNT]; - span av = arr; - int i = 0; - for (auto& e : av) - e = i++; - } - - // ALTERNATIVE B: Use at() for access - void f2() - { - int arr[COUNT]; - for (int i = 0; i < COUNT; ++i) - at(arr, i) = i; - } - -##### Enforcement - -Issue a diagnostic for any indexing expression on an expression or variable of array type (either static array or `std::array`) where the indexer is not a compile-time constant expression. - -Issue a diagnostic for any indexing expression on an expression or variable of array type (either static array or `std::array`) where the indexer is not a value between `0` or and the upper bound of the array. - -**Rewrite support**: Tooling can offer rewrites of array accesses that involve dynamic index expressions to use `at()` instead: - - static int a[10]; - - void f(int i, int j) - { - a[i + j] = 12; // BAD, could be rewritten as ... - at(a, i + j) = 12; // OK -- bounds-checked - } - -### Bounds.3: No array-to-pointer decay. - -##### Reason - -Pointers should not be used as arrays. `span` is a bounds-checked, safe alternative to using pointers to access arrays. - -##### Example, bad - - void g(int* p, size_t length); - - void f() - { - int a[5]; - g(a, 5); // BAD - g(&a[0], 1); // OK - } - -##### Example, good - - void g(int* p, size_t length); - void g1(span av); // BETTER: get g() changed. - - void f() - { - int a[5]; - span av = a; - - g(av.data(), av.length()); // OK, if you have no choice - g1(a); // OK -- no decay here, instead use implicit span ctor - } - -##### Enforcement - -Issue a diagnostic for any expression that would rely on implicit conversion of an array type to a pointer type. - -### Bounds.4: Don't use standard library functions and types that are not bounds-checked. - -##### Reason - -These functions all have bounds-safe overloads that take `span`. Standard types such as `vector` can be modified to perform bounds-checks under the bounds profile (in a compatible way, such as by adding contracts), or used with `at()`. - -##### Example, bad - - void f() - { - array a, b; - memset(a.data(), 0, 10); // BAD, and contains a length error (length = 10 * sizeof(int)) - memcmp(a.data(), b.data(), 10); // BAD, and contains a length error (length = 10 * sizeof(int)) - } - -Also, `std::array<>::fill()` or `std::fill()` or even an empty initializer are better candidate than `memset()`. - -##### Example, good - - void f() - { - array a, b, c{}; // c is initialized to zero - a.fill(0); - fill(b.begin(), b.end(), 0); // std::fill() - fill(b, 0); // std::fill() + Ranges TS - - if ( a == b ) { - // ... - } - } - -##### Example - -If code is using an unmodified standard library, then there are still workarounds that enable use of `std::array` and `std::vector` in a bounds-safe manner. Code can call the `.at()` member function on each class, which will result in an `std::out_of_range` exception being thrown. Alternatively, code can call the `at()` free function, which will result in fail-fast (or a customized action) on a bounds violation. - - void f(std::vector& v, std::array a, int i) - { - v[0] = a[0]; // BAD - v.at(0) = a[0]; // OK (alternative 1) - at(v, 0) = a[0]; // OK (alternative 2) - - v.at(0) = a[i]; // BAD - v.at(0) = a.at(i); // OK (alternative 1) - v.at(0) = at(a, i); // OK (alternative 2) - } - -##### Enforcement - -* Issue a diagnostic for any call to a standard library function that is not bounds-checked. ??? insert link to a list of banned functions - -**TODO Notes**: - -* Impact on the standard library will require close coordination with WG21, if only to ensure compatibility even if never standardized. -* We are considering specifying bounds-safe overloads for stdlib (especially C stdlib) functions like `memcmp` and shipping them in the GSL. -* For existing stdlib functions and types like `vector` that are not fully bounds-checked, the goal is for these features to be bounds-checked when called from code with the bounds profile on, and unchecked when called from legacy code, possibly using contracts (concurrently being proposed by several WG21 members). - ## Pro.lifetime: Lifetime safety profile -See /docs folder for the initial design. The formal rules are in progress (as of March 2017). +See /docs folder for the initial design. The detailed formal rules are in progress (as of May 2017). + +The following are specific rules that are being enforced. + +Lifetime safety profile summary: + +* [Lifetime.1: Don't dereference a possibly invalid pointer.](#Pro-lifetime-invalid-deref) +* [Lifetime.2: Don't dereference a possibly null pointer.](#Pro-lifetime-null-deref) +* [Lifetime.3: Don't pass a possibly invalid pointer to a function.](#Pro-lifetime-invalid-argument) + +??? These rules will be moved into the main-line sections of these guidelines ??? + +### Lifetime.1: Don't dereference a possibly invalid pointer. + +##### Reason + +It is undefined behavior. + +To resolve the problem, either extend the lifetime of the object the pointer is intended to refer to, or shorten the lifetime of the pointer (move the dereference to before the pointed-to object's lifetime ends). + +##### Example, bad + + void f() + { + int x = 0; + int* p = &x; + + if (condition()) { + int y = 0; + p = &y; + } // invalidates p + + *p = 42; // BAD, p might be invalid if the branch was taken + } + +##### Example, good + + void f() + { + int x = 0; + int* p = &x; + + int y = 0; + if (condition()) { + p = &y; + } + + *p = 42; // OK, p points to x or y and both are still in scope + } + +##### Enforcement + +* Issue a diagnostic for any dereference of a pointer that could have been invalidated (could point to an object that was destroyed) along a local code path leading to the dereference. To fix: Extend the lifetime of the pointed-to object, or move the dereference to before the pointed-to object's lifetime ends. + + + +### Lifetime.2: Don't dereference a possibly null pointer. + +##### Reason + +It is undefined behavior. + +##### Example, bad + + void f(int* p1) + { + *p1 = 42; // BAD, p1 might be null + + int i = 0; + int* p2 = condition() ? &i : nullptr; + *p2 = 42; // BAD, p2 might be null + } + +##### Example, good + + void f(int* p1, not_null p3) + { + if (p1 != nullptr) { + *p1 = 42; // OK, must be not null in this branch + } + + int i = 0; + int* p2 = condition() ? &i : nullptr; + if (p2 != nullptr) { + *p2 = 42; // OK, must be not null in this branch + } + + *p3 = 42; // OK, not_null does not need to be tested for nullness + } + +##### Enforcement + +* Issue a diagnostic for any dereference of a pointer that could have been set to null along a local code path leading to the dereference. To fix: Add a null check and dereference the pointer only in a branch that has tested to ensure non-null. + + + +### Lifetime.3: Don't pass a possibly invalid pointer to a function. + +##### Reason + +The function cannot do anything useful with the pointer. + +To resolve the problem, either extend the lifetime of the object the pointer is intended to refer to, or shorten the lifetime of the pointer (move the function call to before the pointed-to object's lifetime ends). + +##### Example, bad + + void f(int*); + + void g() + { + int x = 0; + int* p = &x; + + if (condition()) { + int y = 0; + p = &y; + } // invalidates p + + f(p); // BAD, p might be invalid if the branch was taken + } + +##### Example, good + + void f() + { + int x = 0; + int* p = &x; + + int y = 0; + if (condition()) { + p = &y; + } + + f(p); // OK, p points to x or y and both are still in scope + } + +##### Enforcement + +* Issue a diagnostic for any function argument that is a pointer that could have been invalidated (could point to an object that was destroyed) along a local code path leading to the dereference. To fix: Extend the lifetime of the pointed-to object, or move the function call to before the pointed-to object's lifetime ends. + + # GSL: Guideline support library @@ -19232,6 +19706,7 @@ for example, `Expects(p!=nullptr)` will become `[[expects: p!=nullptr]]`. * `narrow` // `narrow(x)` is `static_cast(x)` if `static_cast(x) == x` or it throws `narrowing_error` * `[[implicit]]` // "Marker" to put on single-argument constructors to explicitly make them non-explicit. * `move_owner` // `p = move_owner(q)` means `p = q` but ??? +* `joining_thread` // a RAII style version of `std::thread` that joins. ## GSL.concept: Concepts