The std::string_view offers the benefits of std::string's interface without the cost of constructing an std::string object.
Introduced in C++17, std::string_view is a read-only, non-owning reference to a char sequence. The motivation behind it is that functions commonly need a read-only reference to a std::string-like object whose exact type does not matter. The drawback of taking const std::string& in those situations is that the caller must supply a std::string object, so one gets constructed whenever the argument is something else, such as a string literal. Here is a simple case in point:
// foo requires a std::string-like object
void foo(const std::string& s) {
    if (s.length() >= 6 && s[2] == 'V') {
        // extract a part of the string
        auto d = s.substr(2, 4);
        // d is a std::string
        // ...
    }
    // ...
}
// A std::string is constructed
// Readable, easy to use, but expensive
foo("A Very Long String");
Constructing a std::string object can be expensive because it often (though not always) requires dynamic memory allocation. Where that cost is a concern, readability and ease of use are frequently sacrificed by passing a const char* and a length instead:
/* Better performance, but we have to give up the
   benefits of the std::string interface. */
void foo(const char* str, size_t length) {
    // written to work with const char* and length
    // ...
}
const char* str = "A Very Long String";
foo(str, strlen(str));
What makes std::string_view better than const std::string& is that it eliminates the need to have a std::string object in the first place. Typically, a std::string_view consists of two members: a const char* that points to the start of the char array, and the size. Here is our simple example rewritten with std::string_view:
// foo requires a std::string-like object
void foo(std::string_view s) {
    if (s.length() >= 6 && s[2] == 'V') {
        // extract a part of the string
        auto d = s.substr(2, 4);
        // d is a std::string_view
        // ...
    }
    // ...
}
// A std::string_view is constructed.
// Better performance, readable, and easy to use.
foo("A Very Long String");
Note that the substr method of std::string_view returns a std::string_view. The following illustration shows how std::string_view objects conceptually refer to char sequences:
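The same relationship can be checked in code. The following is a minimal sketch, assuming a C++17 compiler; the helper names `sharesBuffer` and `middleOf` are hypothetical and exist only for illustration. It shows that the view returned by substr points into the original char array rather than at a copy.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical helper: true when `part` lies entirely inside the character
// range referred to by `whole`, i.e. no characters were copied.
// Only meaningful when both views refer to the same underlying array.
inline bool sharesBuffer(std::string_view whole, std::string_view part) {
    return part.data() >= whole.data() &&
           part.data() + part.size() <= whole.data() + whole.size();
}

// Hypothetical helper: returns the view that foo() obtains from substr(2, 4).
inline std::string_view middleOf(std::string_view s) {
    assert(s.size() >= 6);   // same length check as in foo()
    return s.substr(2, 4);   // a new view; the characters are not copied
}
```

For the literal "A Very Long String", `middleOf` yields a view whose data() equals the original data() plus 2 and whose size is 4.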
Applications that construct and copy a substantial number of string objects can benefit significantly, in both performance and code readability, from using std::string_view.
Assume a hypothetical trading system application that uses a large number of option contract OSI symbols (e.g., "AAPL 131101C00470000"). An OSI symbol is a 21-character identifier that encodes various attributes of an option contract. The application loads a delimited list of all the symbols from a file into a buffer. The symbols in the buffer are then split and stored in a std::unordered_set of std::string objects:
#include <algorithm>      // std::find
#include <cstddef>        // size_t
#include <string>
#include <unordered_set>

// A type alias for symbol
using Symbol = std::string;
/* A routine to split the symbols and load
   them into a collection */
template<typename C>
void loadSymbols(const char* source, size_t len,
                 char delim, C& coll) {
    const char* first = source;
    const char* last = source + len;
    while (true) {
        // find the delimiter location
        auto delimPos = std::find(first, last, delim);
        // check whether a delimiter was found
        if (delimPos == last)
            break; // no more delimiters
        // insert the Symbol in coll
        coll.insert(Symbol(first, delimPos - first));
        // advance the first pointer to the next token
        first = delimPos + 1;
    }
}
// Somewhere else
/* buffer holding a '|' delimited symbol list.
   Could be a std::vector<char> */
std::string buf = "SPX 191115C02820000|"
                  "SPX 191115P02820000|"
                  /*many more symbols...*/;
// the symbols collection
std::unordered_set<Symbol> symbols;
// load symbols
loadSymbols(buf.c_str(), buf.size(), '|', symbols);
At various places in the application, the symbols are looked up in the symbols collection, copied when necessary, and stored in other STL containers when needed. Nowhere are the symbols modified. Holding a vast number of std::string objects is costly in terms of both performance and memory usage, particularly when dynamic memory allocation is involved. To minimize dynamic allocation, a typical implementation of std::string stores a small string inside the object itself, in a char array; this is called short/small string optimization (SSO). However, the maximum size of such a small string is implementation-dependent and could very well be below 21 chars.
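Whether a 21-char string fits in the inline storage can be probed on a given implementation. The sketch below is not a standard facility: `storedInline` and `osiSymbolFitsInline` are hypothetical helpers that simply test whether data() points inside the std::string object itself, which is how SSO implementations typically behave.

```cpp
#include <cassert>
#include <functional>
#include <string>

// Hypothetical helper: reports whether the characters of `s` live inside
// the std::string object itself (SSO) rather than in a separate buffer.
inline bool storedInline(const std::string& s) {
    const char* objBegin = reinterpret_cast<const char*>(&s);
    const char* objEnd = objBegin + sizeof(std::string);
    std::less<const char*> before;  // well-defined order for unrelated pointers
    return !before(s.data(), objBegin) && before(s.data(), objEnd);
}

// Hypothetical convenience wrapper for the 21-character case discussed above.
inline bool osiSymbolFitsInline() {
    const std::string symbol(21, 'x');  // same length as an OSI symbol
    assert(symbol.size() == 21);
    return storedInline(symbol);        // result depends on the standard library
}
```

The result of `osiSymbolFitsInline()` varies between standard library implementations, which is exactly why the application cannot rely on SSO to avoid allocations.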
This application can undoubtedly benefit from using std::string_view. The first thing we have to do is change the Symbol type alias from std::string to std::string_view:
// A type alias for symbol
using Symbol = std::string_view;
That is the only change we need for the above example code to work. A real-world application would likely require more changes, however. For one, the buffer that holds the symbol list must stay alive and unmodified for as long as the symbols are in use (here, the lifetime of the application); otherwise, all the symbol std::string_views would be left dangling.
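One way to make that lifetime requirement hard to violate is to keep the buffer and the views in the same object. The following is a minimal sketch under that assumption; the `SymbolTable` class and its members are hypothetical and not part of the example application.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_set>
#include <utility>

// Hypothetical wrapper: owns the buffer and the views into it, so the views
// cannot outlive the characters they refer to.
class SymbolTable {
public:
    SymbolTable(std::string buffer, char delim) : buf_(std::move(buffer)) {
        const char* first = buf_.data();
        const char* last = first + buf_.size();
        while (true) {
            const char* delimPos = std::find(first, last, delim);
            if (delimPos == last)
                break;  // no more delimiters
            // Construct the view in place from pointer and length.
            symbols_.emplace(first, static_cast<std::size_t>(delimPos - first));
            first = delimPos + 1;
        }
        assert(first <= last);  // never advanced past the end of the buffer
    }

    // Copying or moving could leave the views pointing into another
    // object's buffer, so both are disabled in this sketch.
    SymbolTable(const SymbolTable&) = delete;
    SymbolTable& operator=(const SymbolTable&) = delete;

    bool contains(std::string_view symbol) const {
        return symbols_.find(symbol) != symbols_.end();
    }
    std::size_t size() const { return symbols_.size(); }

private:
    std::string buf_;                               // owns the characters
    std::unordered_set<std::string_view> symbols_;  // views into buf_
};
```

Constructing a `SymbolTable` from the '|' delimited list above and calling `contains("SPX 191115C02820000")` performs the lookup without creating a std::string for the key.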
Another change could come from the fact that std::string_view has no c_str() member to return a null-terminated string. We would have to convert the std::string_view to a std::string wherever a null-terminated string is required:
using namespace std::literals::string_view_literals;
auto v = "This is a view"sv; // 'sv' is the string_view literal suffix
// to get a null-terminated string
std::cout << std::string(v).c_str() << "\n";
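The conversion matters because data() on a view of a substring points into the middle of a larger array, so a C function that scans for '\0' reads past the end of the view. The sketch below illustrates this; the helper names `lengthSeenByC` and `lengthAfterCopy` are hypothetical, and the first helper is only safe here because the underlying array is assumed to be null-terminated.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <string_view>

// Length as seen by a C API that scans for '\0' starting at v.data().
// Assumes the underlying array is null-terminated somewhere; in general,
// data() of a std::string_view need not be terminated at all.
inline std::size_t lengthSeenByC(std::string_view v) {
    assert(v.data() != nullptr);  // a default-constructed view has no array
    return std::strlen(v.data());
}

// Length seen by the same C API after copying the view into a std::string,
// whose c_str() is always null-terminated at the end of the view's contents.
inline std::size_t lengthAfterCopy(std::string_view v) {
    return std::strlen(std::string(v).c_str());
}
```

For a view of the first symbol in the '|' delimited list, `lengthSeenByC` reports the length of the whole list, while `lengthAfterCopy` reports the length of the symbol alone.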
std::string_view is an excellent utility for good performance and readability where only a std::string-like interface is required. However, care must be taken to ensure that a std::string_view does not outlive the char sequence it refers to.