mirror of
https://github.com/isocpp/CppCoreGuidelines.git
synced 2024-03-22 13:30:58 +08:00
Added a first cut of string guidelines
This commit is contained in:
parent
6bcfaa4fbe
commit
fc5222ca26
|
@ -1,6 +1,6 @@
|
|||
# <a name="main"></a>C++ Core Guidelines
|
||||
|
||||
April 9, 2017
|
||||
April 16, 2017
|
||||
|
||||
|
||||
Editors:
|
||||
|
@ -2078,7 +2078,7 @@ Parameter passing semantic rules:
|
|||
* [F.22: Use `T*` or `owner<T*>` or a smart pointer to designate a single object](#Rf-ptr)
|
||||
* [F.23: Use a `not_null<T>` to indicate "null" is not a valid value](#Rf-nullptr)
|
||||
* [F.24: Use a `span<T>` or a `span_p<T>` to designate a half-open sequence](#Rf-range)
|
||||
* [F.25: Use a `zstring` or a `not_null<zstring>` to designate a C-style string](#Rf-string)
|
||||
* [F.25: Use a `zstring` or a `not_null<zstring>` to designate a C-style string](#Rf-zstring)
|
||||
* [F.26: Use a `unique_ptr<T>` to transfer ownership where a pointer is needed](#Rf-unique_ptr)
|
||||
* [F.27: Use a `shared_ptr<T>` to share ownership](#Rf-shared_ptr)
|
||||
|
||||
|
@ -3066,7 +3066,7 @@ Passing a `span` object as an argument is exactly as efficient as passing a pair
|
|||
|
||||
(Complex) Warn where accesses to pointer parameters are bounded by other parameters that are integral types and suggest they could use `span` instead.
|
||||
|
||||
### <a name="Rf-string"></a>F.25: Use a `zstring` or a `not_null<zstring>` to designate a C-style string
|
||||
### <a name="Rf-zstring"></a>F.25: Use a `zstring` or a `not_null<zstring>` to designate a C-style string
|
||||
|
||||
##### Reason
|
||||
|
||||
|
@ -17102,8 +17102,278 @@ If you have a good reason to use another container, use that instead. For exampl
|
|||
|
||||
## <a name="SS-string"></a>SL.str: String
|
||||
|
||||
Text manipulation is a huge topic.
|
||||
`std::string` doesn't cover all of it.
|
||||
This section primarily tries to clarify `std::string`'s relation to `char*`, `zstring`, `string_view`, and `gsl::string_span`.
|
||||
The important issue of non-ASCII character sets and encodings (e.g., `wchar_t`, Unicode, and UTF-8) will be covered elsewhere.
|
||||
|
||||
See also [regular expressions](#SS-regex).
|
||||
|
||||
Here, we use "sequence of characters" or "string" to refer to a sequence of characters meant to be read as text (somehow, eventually).
|
||||
We don't consider
|
||||
|
||||
String summary:
|
||||
|
||||
* [SL.str.1: Use `std::string` to own character sequences](#Rstr-string)
|
||||
* [SL.str.2: Use `std::string_view` or `gsl::string_span` to refer to character sequences](#Rstr-view)
|
||||
* [SL.str.3: Use `zstring` or `czstring` to refer to a C-style, zero-terminated, sequence of characters](#Rstr-zstring)
|
||||
* [SL.str.4: Use `char*` to refer to a single character](#Rstr-char*)
|
||||
* [SL.str.5: Use `std::byte` to refer to byte values that do not necessarily represent characters](#Rstr-byte)
|
||||
|
||||
* [SL.str.10: Use `std::string` when you need to perform locale-sensitive string operations](#Rstr-locale)
|
||||
* [SL.str.11: Use `gsl::string_span` rather than `std::string_view` when you need to mutate a string](#Rstr-span)
|
||||
* [SL.str.12: Use the `s` suffix for string literals meant to be standard-library `string`s](#Rstr-s)
|
||||
|
||||
See also
|
||||
|
||||
* [F.24 span](#Rf-range)
|
||||
* [F.25 zstring](#Rf-zstring)
|
||||
|
||||
|
||||
### <a name="Rstr-string"></a>SL.str.1: Use `std::string` to own character sequences
|
||||
|
||||
##### Reason
|
||||
|
||||
`string` correctly handles allocation, ownership, copying, gradual expansion, and offers a variety of useful operations.
|
||||
|
||||
##### Example
|
||||
|
||||
vector<string> read_until(const string& terminator)
|
||||
{
|
||||
vector<string> res;
|
||||
for (string s; cin>>s && s!=terminator; ) // read a word
|
||||
res.push_back(s);
|
||||
return res;
|
||||
}
|
||||
|
||||
Note how `>>` and `!=` are provided for `string` (as examples of useful operations) and there are no explicit
|
||||
allocations, deallocations, or range checks (`string` takes care of those).
|
||||
|
||||
In C++17, we might use `string_view` as the argument, rather than `const string&`, to allow more flexibility to callers:
|
||||
|
||||
vector<string> read_until(string_view terminator) // C++17
|
||||
{
|
||||
vector<string> res;
|
||||
for (string s; cin>>s && s!=terminator; ) // read a word
|
||||
res.push_back(s);
|
||||
return res;
|
||||
}
|
||||
|
||||
The `gsl::string_span` is a current alternative offering most of the benefits of `std::string_view` for simple examples:
|
||||
|
||||
vector<string> read_until(string_span terminator)
|
||||
{
|
||||
vector<string> res;
|
||||
for (string s; cin>>s && s!=terminator; ) // read a word
|
||||
res.push_back(s);
|
||||
return res;
|
||||
}
|
||||
|
||||
##### Example, bad
|
||||
|
||||
Don't use C-style strings for operations that require non-trivial memory management:
|
||||
|
||||
char* cat(const char* s1, const char* s2) // beware!
|
||||
// return s1 + '.' + s2
|
||||
{
|
||||
int l1 = strlen(s1);
|
||||
int l2 = strlen(s2);
|
||||
char* p = (char*)malloc(l1+l2+2);
|
||||
strncpy(p,s1,l1);
|
||||
p[l1] = '.';
|
||||
strncpy(p+l1+1,s2,l2);
|
||||
p[l1+l2+1] = 0;
|
||||
return p;
|
||||
}
|
||||
|
||||
Did we get that right?
|
||||
Will the caller remember to `free()` the returned pointer?
|
||||
Will this code pass a security review?
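For comparison, here is a minimal sketch of the same operation written with `std::string`. The signature is an assumption made for illustration, not part of the original example; the point is only that allocation, sizing, and cleanup are handled by `string`.

```cpp
#include <string>

// Hypothetical counterpart to cat() above: returns s1 + '.' + s2.
// The returned std::string owns its memory, so the caller has nothing to free.
std::string cat(const std::string& s1, const std::string& s2)
{
    std::string res;
    res.reserve(s1.size() + s2.size() + 1);   // optional: avoids a reallocation
    res += s1;
    res += '.';
    res += s2;
    return res;
}
```

There is no length arithmetic to get wrong and no terminator to place by hand.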
|
||||
|
||||
##### Note
|
||||
|
||||
Do not assume that `string` is slower than lower-level techniques without measurement and remember that not all code is performance critical.
|
||||
[Don't optimize prematurely](#Rper-Knuth).
|
||||
|
||||
##### Enforcement
|
||||
|
||||
???
|
||||
|
||||
### <a name="Rstr-view"></a>SL.str.2: Use `std::string_view` or `gsl::string_span` to refer to character sequences
|
||||
|
||||
##### Reason
|
||||
|
||||
`std::string_view` or `gsl::string_span` provides simple and (potentially) safe access to character sequences independently of how
|
||||
those sequences are allocated and stored.
|
||||
|
||||
##### Example
|
||||
|
||||
vector<string> read_until(string_span terminator);
|
||||
|
||||
void user(zstring p, const string& s, string_span ss)
|
||||
{
|
||||
auto v1 = read_until(p);
|
||||
auto v2 = read_until(s);
|
||||
auto v3 = read_until(ss);
|
||||
// ...
|
||||
}
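The same calling pattern can be sketched with `std::string_view` (C++17). The helper `count_char` is hypothetical, introduced only to give the view something to do; the sketch assumes the C-style string argument is not null.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: counts occurrences of c in any character sequence.
std::size_t count_char(std::string_view sv, char c)
{
    std::size_t n = 0;
    for (char x : sv)
        if (x == c) ++n;
    return n;
}

// One function serves callers holding a C-style string, a std::string, or a view.
// p is assumed to be non-null and zero-terminated.
std::size_t user(const char* p, const std::string& s, std::string_view sv)
{
    return count_char(p, 'a') + count_char(s, 'a') + count_char(sv, 'a');
}
```

None of the three calls copies the characters; each just constructs a view over storage the caller already owns.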
|
||||
|
||||
##### Note
|
||||
|
||||
???
|
||||
|
||||
##### Enforcement
|
||||
|
||||
???
|
||||
|
||||
### <a name="Rstr-zstring"></a>SL.str.3: Use `zstring` or `czstring` to refer to a C-style, zero-terminated, sequence of characters
|
||||
|
||||
##### Reason
|
||||
|
||||
Readability.
|
||||
Statement of intent.
|
||||
A plain `char*` can be a pointer to a single character, a pointer to an array of characters, a pointer to a C-style (zero-terminated) string, or even to a small integer.
|
||||
Distinguishing these alternatives prevents misunderstandings and bugs.
|
||||
|
||||
##### Example
|
||||
|
||||
void f1(const char* s); // s is probably a string
|
||||
|
||||
All we know is that it is supposed to be the `nullptr` or point to at least one character.
|
||||
|
||||
void f1(zstring s); // s is a C-style string or the nullptr
|
||||
void f1(czstring s); // s is a C-style string constant or the nullptr
|
||||
void f1(std::byte* s); // s is a pointer to a byte (C++17)
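A small sketch of how such a declaration reads in use. The aliases below are local stand-ins written out for this example, on the assumption that the GSL versions amount to `char*` and `const char*`; `length_or_zero` is a hypothetical function.

```cpp
#include <cstddef>
#include <cstring>

// Local stand-ins for the GSL aliases (assumption: they are plain pointer aliases).
using zstring  = char*;         // zero-terminated, mutable characters
using czstring = const char*;   // zero-terminated, read-only characters

// The parameter type states that s is a C-style string (or the nullptr),
// so calling strlen() on a non-null s is justified by the interface.
std::size_t length_or_zero(czstring s)
{
    return s ? std::strlen(s) : 0;
}
```

The alias adds no run-time cost; it only documents the intent that a bare `const char*` leaves open.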
|
||||
|
||||
##### Note
|
||||
|
||||
Don't convert a C-style string to `string` unless there is a reason to.
|
||||
|
||||
##### Note
|
||||
|
||||
Like any other "plain pointer", a `zstring` should not represent ownership.
|
||||
|
||||
##### Note
|
||||
|
||||
There are billions of lines of C++ "out there"; most use `char*` and `const char*` without documenting intent.
|
||||
They are used in a wide variety of ways, including to represent ownership and as generic pointers to memory (instead of `void*`).
|
||||
It is hard to separate these uses, so this guideline is hard to follow.
|
||||
This is one of the major sources of bugs in C and C++ programs, so it is worthwhile to follow this guideline wherever feasible.
|
||||
|
||||
##### Enforcement
|
||||
|
||||
* Flag uses of `[]` on a `char*`
|
||||
* Flag uses of `delete` on a `char*`
|
||||
* Flag uses of `free()` on a `char*`
|
||||
|
||||
### <a name="Rstr-char*"></a>SL.str.4: Use `char*` to refer to a single character
|
||||
|
||||
##### Reason
|
||||
|
||||
The variety of uses of `char*` in current code is a major source of errors.
|
||||
|
||||
##### Example, bad
|
||||
|
||||
char arr[] = {'a', 'b', 'c'};
|
||||
|
||||
void print(const char* p)
|
||||
{
|
||||
cout << p << '\n';
|
||||
}
|
||||
|
||||
void use()
|
||||
{
|
||||
print(arr); // run-time error; potentially very bad
|
||||
}
|
||||
|
||||
The array `arr` is not a C-style string because it is not zero-terminated.
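One way to make the example safe, sketched with `std::string_view` (C++17): the view carries its own length, so no terminator is required. The `print` signature here takes an output stream purely so the result can be inspected; that parameter is an assumption of this sketch.

```cpp
#include <sstream>
#include <string>
#include <string_view>

char arr[] = {'a', 'b', 'c'};   // still not zero-terminated

// The view knows how many characters it refers to, so output stops after them.
void print(std::ostream& os, std::string_view sv)
{
    os << sv << '\n';
}

std::string use()
{
    std::ostringstream os;
    print(os, std::string_view{arr, sizeof arr});   // length passed explicitly
    return os.str();
}
```

Here the caller states the length once, at the point where it is known, instead of relying on a terminator that is not there.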
|
||||
|
||||
##### Alternative
|
||||
|
||||
See [`zstring`](#Rstr-zstring), [`string`](#Rstr-string), and [`string_span`](#Rstr-view).
|
||||
|
||||
##### Enforcement
|
||||
|
||||
* Flag uses of `[]` on a `char*`
|
||||
|
||||
### <a name="Rstr-byte"></a>SL.str.5: Use `std::byte` to refer to byte values that do not necessarily represent characters
|
||||
|
||||
##### Reason
|
||||
|
||||
Use of `char*` to represent a pointer to something that is not necessarily a character causes confusion
|
||||
and disables valuable optimizations.
|
||||
|
||||
##### Example
|
||||
|
||||
???
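One possible illustration, offered as a sketch rather than as the guideline's own example: a function over raw memory that takes `std::byte` (C++17) instead of `char`. The function name and the choice of an XOR checksum are assumptions made for this example.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: XOR of all bytes in a buffer.
// A std::byte* parameter says "raw memory", not "characters".
std::uint8_t xor_checksum(const std::byte* data, std::size_t n)
{
    std::byte acc{0};
    for (std::size_t i = 0; i < n; ++i)
        acc ^= data[i];                          // bitwise operators are defined for std::byte
    return std::to_integer<std::uint8_t>(acc);   // conversion to a number is explicit
}
```

Because `std::byte` has no arithmetic and no stream output, accidentally treating the buffer as text does not compile.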
|
||||
|
||||
##### Note
|
||||
|
||||
C++17
|
||||
|
||||
##### Enforcement
|
||||
|
||||
???
|
||||
|
||||
|
||||
### <a name="Rstr-locale"></a>SL.str.10: Use `std::string` when you need to perform locale-sensitive string operations
|
||||
|
||||
##### Reason
|
||||
|
||||
`std::string` supports standard-library [`locale` facilities](#Rstr-locale).
|
||||
|
||||
##### Example
|
||||
|
||||
???
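One possible illustration, as a sketch: `std::locale` can itself be used as a comparison object for `std::string`s, which applies the locale's collation rules. The function name is hypothetical, and the default of the classic "C" locale is chosen only because it is always available; a real program would pass the locale it cares about.

```cpp
#include <algorithm>
#include <locale>
#include <string>
#include <vector>

// Sorts words by the collation order of loc.
// std::locale::operator() compares two std::strings using the locale's collate facet.
std::vector<std::string> sorted_by_locale(std::vector<std::string> words,
                                          const std::locale& loc = std::locale::classic())
{
    std::sort(words.begin(), words.end(), loc);
    return words;
}
```

The comparison interface takes `std::string` directly, so no conversion is needed at the call site.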
|
||||
|
||||
##### Note
|
||||
|
||||
???
|
||||
|
||||
##### Enforcement
|
||||
|
||||
???
|
||||
### <a name="Rstr-span"></a>SL.str.11: Use `gsl::string_span` rather than `std::string_view` when you need to mutate a string
|
||||
|
||||
##### Reason
|
||||
|
||||
`std::string_view` is read-only.
|
||||
|
||||
##### Example
|
||||
|
||||
???
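One possible illustration, as a sketch. To stay self-contained it does not use the GSL; `char_span` below is a hypothetical, minimal stand-in for a mutable character span such as `gsl::string_span`, and both function names are assumptions.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical stand-in for a mutable character span: a non-owning pointer plus a length.
struct char_span {
    char* data;
    std::size_t size;
};

// Mutation needs a mutable span; the characters are rewritten in place.
void to_upper(char_span s)
{
    for (std::size_t i = 0; i < s.size; ++i)
        s.data[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(s.data[i])));
}

// Reading needs only a string_view; an assignment such as sv[0] = 'x' would not compile here.
bool starts_with_upper(std::string_view sv)
{
    return !sv.empty() && std::isupper(static_cast<unsigned char>(sv[0])) != 0;
}
```

A caller holding a `std::string s` could write `to_upper(char_span{s.data(), s.size()})` in C++17.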
|
||||
|
||||
##### Note
|
||||
|
||||
???
|
||||
|
||||
##### Enforcement
|
||||
|
||||
The compiler will flag attempts to write to a `string_view`.
|
||||
|
||||
### <a name="Rstr-s"></a>SL.str.12: Use the `s` suffix for string literals meant to be standard-library `string`s
|
||||
|
||||
##### Reason
|
||||
|
||||
Direct expression of an idea minimizes mistakes.
|
||||
|
||||
##### Example
|
||||
|
||||
auto pp1 = make_pair("Tokyo",9.00); // {C-style string,double} intended?
|
||||
pair<string,double> pp2 = {"Tokyo",9.00}; // a bit verbose
|
||||
auto pp3 = make_pair("Tokyo"s,9.00); // {std::string,double} // C++14
|
||||
pair pp4 = {"Tokyo"s,9.00}; // {std::string,double} // C++17
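The deduced types can be checked at compile time. This is a sketch assuming C++17 (for `is_same_v` and message-less `static_assert`); the `s` suffix itself needs a using-directive for `std::string_literals` (or `std::literals`) to be visible.

```cpp
#include <string>
#include <type_traits>
#include <utility>

using namespace std::string_literals;   // makes the s suffix available

auto pp1 = std::make_pair("Tokyo", 9.00);    // first is const char*
auto pp3 = std::make_pair("Tokyo"s, 9.00);   // first is std::string

static_assert(std::is_same_v<decltype(pp1.first), const char*>);
static_assert(std::is_same_v<decltype(pp3.first), std::string>);
```

Without the suffix, the pair holds a pointer to the literal rather than a `string`, which may not be what was intended.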
|
||||
|
||||
|
||||
##### Note
|
||||
|
||||
C++17
|
||||
|
||||
##### Enforcement
|
||||
|
||||
???
|
||||
|
||||
|
||||
## <a name="SS-io"></a>SL.io: Iostream
|
||||
|
||||
???
|
||||
|
|