Added Per.7

Design to enable optimization
This commit is contained in:
Bjarne Stroustrup 2016-09-04 11:08:48 -04:00
parent 443c1a4afb
commit 0aea7d997b

View File

@ -1,6 +1,6 @@
# <a name="main"></a>C++ Core Guidelines
August 28, 2016
September 4, 2016
Editors:
@ -9212,9 +9212,9 @@ Simplify refactoring.
##### Example
void use(int arg) // bad: uninitialized variable
void use(int arg)
{
int i;
int i; // bad: uninitialized variable
// ...
i = 7; // initialize i
}
@ -11065,6 +11065,7 @@ Performance rule summary:
* [Per.4: Don't assume that complicated code is necessarily faster than simple code](#Rper-simple)
* [Per.5: Don't assume that low-level code is necessarily faster than high-level code](#Rper-low)
* [Per.6: Don't make claims about performance without measurements](#Rper-measure)
* [Per.7: Design to enable optimization](#Rper-efficiency)
* [Per.10: Rely on the static type system](#Rper-type)
* [Per.11: Move computation from run time to compile time](#Rper-Comp)
* [Per.12: Eliminate redundant aliases](#Rper-alias)
@ -11178,6 +11179,165 @@ Often, you will be surprised.
???
### <a name="Rper-efficiency"></a>Per.7: Design to enable optimization
##### Reason
Because we often need to optimize the initial design.
Because a design that ignore the possibility of later improvement is hard to change.
##### Example
From the C (and C++) standard:
void qsort (void* base, size_t num, size_t size, int (*compar)(const void*,const void*));
When did you even want to sort memory?
Really, we sort sequences of elements, typically stored in containers.
A call to `qsort` throws away much useful information (e.g., the element type), forces the user to repeat information
already known (e.g., the element size), and forces the user to write extra code (e.g., a function to compare `double`s).
This implies added work for the programmer, is error prone, and deprives the compiler of information needed for optimization.
double data[100];
// ... fill a ...
qsort(data,100,sizeof(double),compare_doubles); // 100 chunks of memory of sizeof(double) starting at address data using the order defined by compare_doubles
From the point of view of interface design is that `qsort` throws away useful information.
We can do better (in C++98)
template<typename Iter>
void sort(Iter b, Iter e); // sort [b:e)
sort(data,data+100);
Here, we use the compiler's knowledge about the size of the array, the type of elements, and how to compare `double`s.
With C++11 plus [concepts](#???), we can do better still
void sort(Sortable& c); // Sortable specifies that c must be a random-access sequence of elements comparable with <
sort(c);
The key is to pass sufficient information for a good implementation to be chosen.
In this, the `sort` interfaces shown here still have a weakness:
They implicitly rely on the element type having less-than (`<`) defined.
To complete the interface, we need a second version that accepts a comparison criteria:
void sort(Sortable& c, Predicate<Value_type<Sortable>> p); // compare elements of c using p
The standard-library specification of `sort` offers those two versions,
but the semantics is expressed in English rather than code using concepts.
##### Note
Premature optimization is said to be [the root of all evil](#Rper-Knuth), but that's not a reason to despise performance.
It is never premature to consider what makes a design amenable to improvement, and improved performance is a commonly desired improvement.
Aim to build a set of habits that by default results in efficient, maintainable, and optimizable code.
In particular, when you write a function that is not a one-off implementation detail, consider
* Information passing:
Prefer clean [interfaces](#S-interfaces) carrying sufficient information for later improvement of implementation.
Note that information flows into and out of an implementation through the interfaces we provide.
* Compact data: By default, [use compact data](#Rper-compact), such as `std::vector` and [access it in a systematic fashion](#Rper-access).
If you think you need a linked structure, try to craft the interface so that this structure isn't seen by users.
* Function argument passing and return:
Distinguish between mutable and non-mutable data.
Don't impose a resource management burden on your users.
Don't impose spurious run-time indirections on your users.
Use [conventional ways](#Rf-conventional) of passing information through an interface;
unconventional and/or "optimized" ways of passing data can seriously complicate later reimplementation.
* Abstraction:
Don't overgeneralize; a design that tries to cater for every possible use (and misuse) and defers every design desison for later
(using compile-time or run-time indirections) is usually a complicated, bloated, hard-to-understand mess.
Generalize from concrete examples, preserving performance as we generalize.
Do not generalize based on mere speculation about future needs.
The ideal is zero-overhead generalization.
* Libraries:
Use libraries with good interfaces.
If no library is available build one yourself and immitate the interface style from a good library.
The [standard library](#S-stdlib) is a good first place to look for inspiration.
* Isolation:
Isolate your code from messy and/or old style code by providing an interface of your choosing to it.
This is sometimes called "providing a wrapper" for the useful/necessary but messy code.
Don't let bad designs "bleed into" your code.
##### Example
Consider:
template <class ForwardIterator, class T>
bool binary_search (ForwardIterator first, ForwardIterator last, const T& val);
`binary_search(begin(c),end(c),7)` will tell you whether `7` is in `c` or not.
However, it will not tell you where that `7` is or whether there are more than one `7`.
Sometimes, just passing the minimal amount of information back (here, `true` or `false`) is sufficient, but a good interface passes
needed information back to the caller. Therefore, the standard library also offers
template <class ForwardIterator, class T>
ForwardIterator lower_bound(ForwardIterator first, ForwardIterator last, const T& val);
`lower_bound` returns an iterator to the first match if any, otherwise `last`.
However, `lower_bound` still doesn't return enough information for all uses, so the standard library also offers
template <class ForwardIterator, class T>
pair<ForwardIterator,ForwardIterator>
equal_range (ForwardIterator first, ForwardIterator last, const T& val);
`equal_range` returns a `pair` of iterators specifying the first and one beyond last match.
auto r = equal_range(begin(c),end(c),7);
for (auto p = r.first(); p!=r.second(), ++p)
cout << *p << '\n';
Obviously, these three interfaces are implemented by the same basic code.
They are simply three ways of presenting the basic binary search algorithm to users,
ranging from the simplest ("make simple things simple!")
to returning complete, but not always needed, information ("don't hide useful information").
Naturally, crafting such a set of interfaces requires experience and domain knowledge.
##### Note
Do not simply craft the interface to match the first implementation and the first use case you think of.
Once your first initial implementation is complete, review it; once you deploy it, mistakes will be hard to remedy.
##### Note
A need for efficiency does not imply a need for [low-level code](#Rper-low).
High-level code does not imply slow or bloated.
##### Note
Things have costs.
Don't be paranoid about costs (modern computers really are very fast),
but have a rough idea of the order of magnitude of cost of what you use.
For example, have a rough idea of the cost of
a memory access,
a function call,
a string comparison,
a system call,
a disk access,
and a message through a network.
##### Note
If you can only think of one implementation, you probably don't have something for which you can devise a stable interface.
Maybe, it is just an implementation detail - not every piece of code needs a stable interface - but pause and consider.
One question that can be useful is
"what interface would be needed if this operation should be implemented using multiple threads? be vectorized?"
##### Note
This rule does not contradict the [Don't optimize prematurely](#Rper-Knuth) rule.
It complements it encouraging developers enable later - appropriate and non-premature - optimization, if and where needed.
##### Enforcement
Tricky.
Maybe looing for `void*` function arguments will find examples of interfaces that hinder later optimization.
### <a name="Rper-type"></a>Per.10: Rely on the static type system
##### Reason