HOW LONG IS A PIECE OF STRING? OR, TO
be precise, what is a string? For C
programmers the answer has always
been easy: it is a null-terminated sequence
of characters. This is lamentably still the answer
of choice for some C++ programmers. For others
working day-to-day with a specific proprietary or
in-house library, it is possible that their answer may
be the resident string class. C++ programmers
educated in the standard library will respond that
there is a string type defined by the standard,
and that that type is the one true answer for C++.
But there is a much greater diversity than is
suggested by these singular or orthodox answers.
OK, so what is a string? A reasonably high-level
answer would be a sequence of characters. A
slightly more precise answer would be a value
that represents a sequence of characters. The shift
in emphasis is a subtle but worthwhile one.
Why objects aren’t equal
Objects live in a class-based society where equal rights
are not conferred on all. Many programmers
believe that all classes have the same basic rights
as one another and that all should follow the
same rules of development. For instance, they
should all have functional copy constructors and
assignment operators. Sometimes it’s comforting
to have a mechanical set of rules to follow, but it
can also be a means of going wrong with confidence.
While some guidelines can be mechanical and
automatic, most of those concerned with the
bigger design picture must be framed in more general
terms to be of genuine long-term use [1, 2].
Returning to copying, does it make sense to
support copying for all types? Copying strings seems
like a useful idea, as does copying lists, monetary
quantities, dates and so on. But what about
copying objects that represent entities in the
application domain, such as bank accounts or
payment transactions? Copying a bank account is
generally considered to be a criminal offence on
this side of the keyboard.
What about copying objects that play an
infrastructure role in programs, such as databases
or resource wrappers? Remembering that in various
situations temporary objects can be generated or
elided at will by the compiler, is copying a whole
database the kind of operation that is casual
enough to support via something as transparent
as a copy constructor? Considering the issue of
resource wrappers, what does it mean to uniquely
acquire a specific system resource...and then copy
it? Such encapsulation was the basis of the scoped
type, which was explored and defined in previous
instalments of this column [3, 4]. Copying was strictly
forbidden because, regardless of whether or not it
was technically achievable, it made no actual
sense. All too often there is a temptation to “solve
the solution” rather than “solve the problem”.
When you look at objects this way it becomes clear
that there are some fundamental differences between
different kinds of objects, and that these differences
go beyond the simple feature of copyability that I
picked on for demonstration. It turns out that
for most kinds of objects, copying is either a non-
requirement or its absence is actually the requirement.
C++ WORKSHOP
Stringing things along
Kevlin Henney discusses the use of strings in C++, and finds that one type
doesn’t fit all
APPLICATION DEVELOPMENT ADVISOR • www.appdevadvisor.co.uk
JULY–AUGUST 2002
FACTS AT A GLANCE
• Strings come in all shapes, sizes and programmers’ preferences. They are
value types that hold sequences of characters for appropriate definitions
of value, character and sequence.
• Value object types are different from many other object types because they
focus on fine-grained information rather than application entities.
Therefore they are copyable, normally live on the stack or in other objects,
and are not generally accessed by pointers.
• The core language offers C-style, null-terminated strings.
• The standard library offers a single string type. Unfortunately, it is by no
means the best encapsulation. However, it should still be used in
preference to proprietary or lower-level solutions.
For value objects, copying is not only sensible, it is pretty much a basic
requirement. Value objects represent fine-grained pieces of information
in a system, such as numeric quantities or textual data. They can
be manipulated, composed and combined freely, often with the
intuitive assistance of operators. Subscript into a string s at
index i? Use s[i]. Concatenate two strings, a and b, together? Take
the small leap to consider concatenation to be a form of addition,
and a + b follows. Values are passed around either by copy or by
reference, but not by pointer. Why not by pointer? Because a pointer
emphasises indirection and object identity, two things that are of
little importance—and indeed act as cumbersome distractions—
when working with value objects. Values live most comfortably
on the stack or as data members of other objects, bound to their
enclosing scope’s lifetime, rather than freely on the heap. Of course,
their internal representation may live on the heap, but that
detail should be self-contained [3] and hidden behind a veil of
encapsulation.
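As a minimal sketch of these value semantics, the following uses std::string ahead of its proper introduction below. The function names join, char_at and capitalised are illustrative assumptions, not part of any library.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Passed by reference-to-const, returned by copy: no pointers in sight.
std::string join(const std::string &a, const std::string &b)
{
    return a + b;   // concatenation expressed as addition
}

// Subscripting reads a single character out of the value.
// The caller is responsible for ensuring that i < s.size().
char char_at(const std::string &s, std::size_t i)
{
    return s[i];
}

// The parameter is deliberately taken by copy: changing it leaves
// the caller's string untouched.
std::string capitalised(std::string copy)
{
    if (!copy.empty())
        copy[0] = static_cast<char>(
            std::toupper(static_cast<unsigned char>(copy[0])));
    return copy;
}
```

Calling capitalised(name) yields a new value and leaves name as it was, which is the behaviour one expects of a value rather than of an entity with identity.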
OK, so now that we’ve understood that a string is some kind
of a value, what are we offered in the standard? And how does that
measure up against our expectations?
In the core language, inherited from C, we have a strongly
reinforced convention that a string is a null-terminated array of
characters. The character type can be either a char (conventionally
ASCII or one of its relatives) or wchar_t, which may hold a suitable
wide-character type (e.g. 32-bit Unicode) and which ranks as possibly
the worst keyword ever to make it into a serious programming
language. The null character has the integer value 0, and is
expressed most conveniently as '\0' or char(0) for char, or
L'\0' or wchar_t(0) for wchar_t.
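To make the convention concrete, here is a hand-rolled length function for both character types. The name length is an illustrative assumption; the C library equivalents are std::strlen and std::wcslen.

```cpp
#include <cstddef>

// Counts characters up to, but not including, the terminating null character.
std::size_t length(const char *text)
{
    std::size_t count = 0;
    while (text[count] != '\0')
        ++count;
    return count;
}

// The same convention for wide characters, using the wide null character.
std::size_t length(const wchar_t *text)
{
    std::size_t count = 0;
    while (text[count] != L'\0')
        ++count;
    return count;
}
```

Note that the terminator occupies storage without contributing to the length: the literal "string" has a length of 6 but a sizeof of 7.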
The death knell for NULL?
Note that the common C programmer mistake of using NULL for
this purpose is just that: a mistake. In C, NULL is intended to denote
the null pointer—not the null character—and depending on the
platform, it may be defined as 0, 0L or (void *) 0. In C++ it is
commonly accepted that NULL should not be used at all, because
it can lead to surprising ambiguity during overload resolution.
This is a point I will return to in a moment.
So how do we work with built-in strings? Clumsily. The only
appropriate operator on offer is for subscripting, i.e. string[index],
and everything else must be done either by hand or by the str
and mem function families defined in the C standard, e.g. strlen
to find the length and strcpy to perform unmanaged copying.
To define a general string value, you must either allocate suitable
space dynamically on the heap (remembering to clear it up
later) or declare an array of sufficient size. Neither of these two
means of allocation allows convenient resizing, although heap-based
allocation does at least allow flexibility in initial size and the
opportunity to replace one string with another—assuming that
your pointer bookkeeping is up to scratch.
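The bookkeeping involved can be sketched as follows. The helper names duplicate and append are hypothetical, introduced only to show where the manual allocation, copying and clean-up fall.

```cpp
#include <cstring>

// Returns a heap-allocated copy of text.
// The caller owns the result and must release it with delete[].
char *duplicate(const char *text)
{
    char *copy = new char[std::strlen(text) + 1];   // + 1 for the '\0'
    std::strcpy(copy, text);   // unmanaged: trusts the size computed above
    return copy;
}

// "Resizing" means allocating anew, copying across and releasing the old
// buffer. owned must have come from new[]; it is invalid after the call.
char *append(char *owned, const char *suffix)
{
    char *result = new char[std::strlen(owned) + std::strlen(suffix) + 1];
    std::strcpy(result, owned);
    std::strcat(result, suffix);
    delete[] owned;
    return result;
}
```

Every call site has to remember who owns which buffer, and forgetting the final delete[] leaks memory.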
Arrays cannot be copied by assignment, by copy construction,
as return values or as function arguments. The first three cases
are illegal and the last is sneakily substituted with a pointer. In
other words,
void write(const char line[80]);
is rewritten by the compiler as:
void write(const char *line);
If you really want to ensure that an 80-character array is passed
through, you have to resort to the syntactically challenging:
void write(const char (&line)[80]);
A point about pointers
This is hard-nosed about its requirement, because only a compile-
time declared array of 80 characters will do. The upshot of all this
is that, intentionally or otherwise, built-in strings are normally
passed around as pointers, and are most commonly thought of
this way. Pointers apparently behave as values for many purposes:
they can be assigned and passed around freely, and they have some
operator support. However, the copying results in aliasing and
there is no management of the string in any way. There is also the
small matter of null. Strings considered as pointers can refer either
to a sequence of characters...or not. This extra out-of-band
value sometimes sees action in expressing special circumstances,
particularly exceptional outcomes. C++ has better mechanisms
for signalling bad news, and the use of null to signal special “not
applicable” values is an occasionally useful programming trick that
receives too much airtime. Empty strings are often useful
alternatives. However, an empty string is a string, whereas a null
pointer is not.
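A short sketch may help to pin down the three points just made: the reference-to-array parameter, aliasing through copied pointers, and the difference between a null pointer and an empty string. The function names capacity, aliases and is_empty_string are illustrative assumptions.

```cpp
#include <cstddef>

// Accepts only an array declared with exactly 80 elements.
// Because no pointer substitution takes place, sizeof sees the whole array.
std::size_t capacity(const char (&line)[80])
{
    return sizeof(line);
}

// Copying a pointer copies the address, not the characters, so a write
// through the copy is visible through the original. Note that this
// overwrites the first character of the caller's buffer.
bool aliases(char *original)
{
    char *copy = original;
    copy[0] = 'S';
    return original[0] == 'S';
}

// A null pointer is not a string at all, whereas an empty string is a
// string that happens to have no characters before its terminator.
bool is_empty_string(const char *text)
{
    return text != 0 && text[0] == '\0';
}
```

Passing an array of any other size to capacity, or a plain pointer, is rejected at compile time.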
The legitimacy of the null pointer value in the context of strings
as pointers also gives us an unpleasant side effect. Given:
void write(const char *line);
void write(int);
The common macro definition of NULL as 0 means that the
following code will pick out the second write function rather than
the more obviously pointer-oriented first write:
write(NULL); // wrong write
So NULL is misleading as a pointer value, given that the following
disambiguates the call:
write(static_cast<const char *>(0)); // right write
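The overload resolution can be observed directly with the sketch below. It uses the literal 0 in place of NULL, on the assumption that NULL is defined as 0; other definitions may select differently or fail to compile. The write overloads here return a label instead of writing anything, and with_zero and with_cast are hypothetical helper names.

```cpp
#include <string>

// Each overload reports which one was selected.
std::string write(const char *) { return "pointer"; }
std::string write(int)          { return "int"; }

// 0 is an int before it is a null pointer constant, so write(int) is an
// exact match and wins.
std::string with_zero()
{
    return write(0);
}

// The cast fixes the argument type, leaving write(const char *) as the
// exact match.
std::string with_cast()
{
    return write(static_cast<const char *>(0));
}
```

In C++11 and later, the nullptr keyword selects the pointer overload without a cast, which is one more reason to leave NULL alone.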
To sum up, C-style strings don’t measure up at all well against
our value type expectations. They can still be valuable in certain
contexts, but marks out of ten for general-purpose use hug the
lower end of the scale as well as the lower layers of abstraction.
The standard provides the std::basic_string class template for
all your string needs—or at least, that is the intent. The main
motivation for templating is to capture the different possible character
types in a single body of code: strings of char, wchar_t or a
character type of your own invention all use the same implementation
code. A couple of standard typedefs offer you the most commonly
used names for working with std::basic_string:
namespace std
{
typedef basic_string<char> string;
typedef basic_string<wchar_t> wstring;
}
What basic_string does well is to act as an encapsulated type
from the perspective of memory management. On the other hand,
it is mired in convenience. It has lots of little features that are thought
to be convenient, but the general effect of including them all is
counterproductive. The redundancy and complexity start before
we even reach the class keyword in its definition:
namespace std
{
template
<
typename char_type,
typename traits = char_traits<char_type>,
typename allocator_type = allocator<char_type>
>
class basic_string
{
...
};
}
Using your own traits
Not content simply with offering parameterisable variation on
character type, you can, if you wish, provide your own set of traits
for working with your selected character type [5]. At first it looks
like the second parameter could be used for dealing with
differences between character sets, i.e. a useful hook for addressing
internationalisation. However, not only is it not that useful,
but a template parameter on the core string type is a long way from
being the best way to deal with internationalisation issues such
as locale-sensitive string comparison. The second parameter
offers a set of unrelated facilities, some of which are focused on
character values on their own, some of which are focused on
memory manipulation of character type sequences, and the
remainder of which are focused on I/O. For a sensibly defined
character type, the policy parameter offers no useful extras; for
poorly defined character types, the solution is to fix the character
type rather than shore it up with a policy parameter.
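For readers who have not seen a custom traits class, the usual textbook illustration is a case-insensitive string. It is shown here as a sketch of the mechanism, not as a recommendation. The names ci_char_traits and ci_string are assumptions for illustration, and the folding relies on std::toupper, which is exactly the kind of simplistic treatment that falls short of real locale-sensitive comparison.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Inherits everything from the standard traits and overrides only the
// comparison-related operations.
struct ci_char_traits : std::char_traits<char>
{
    static char fold(char c)
    {
        return static_cast<char>(
            std::toupper(static_cast<unsigned char>(c)));
    }
    static bool eq(char a, char b) { return fold(a) == fold(b); }
    static bool lt(char a, char b) { return fold(a) < fold(b); }

    // Three-way comparison of the first n characters of a and b.
    static int compare(const char *a, const char *b, std::size_t n)
    {
        for (std::size_t i = 0; i != n; ++i)
        {
            if (lt(a[i], b[i])) return -1;
            if (lt(b[i], a[i])) return 1;
        }
        return 0;
    }

    // Returns a pointer to the first match for c in [s, s + n), or null.
    static const char *find(const char *s, std::size_t n, char c)
    {
        for (std::size_t i = 0; i != n; ++i)
            if (eq(s[i], c)) return s + i;
        return 0;
    }
};

typedef std::basic_string<char, ci_char_traits> ci_string;
```

A ci_string is a different type from std::string, so the two do not mix without explicit conversion, which hints at why this parameter is a poor home for comparison policy.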
Adding to the syntax soup is the third parameter: the allocator
type. This wart can be found on all of the standard container classes.
Allocators were an attempt to parameterise the memory model
used for allocating container elements, prompted originally by
historical shortcomings of the amateurish DOS memory model.
Originally it was hoped that the idea of generalising memory models
would afford transparent access to persistent storage or shared memory
representations. As the process of standardisation nailed down the
semantics of allocators, their complexity increased and their
generality decreased. We now know that they are useful for
remarkably little [6], and probably even less than that [7].
Fortunately, the second and third parameters are defaulted,
which means that out of three degrees of parameterisation, only
one is of any use. But you can at least ignore them in most cases.
They represent failed experiments in generality and policy-based
design. Such gratuitous but limiting overgeneralisation is also
found in the function members offered in the basic_string
interface.
The most obvious example is with functions that search for a
given character or substring, of which there are twenty-four in
basic_string. You might think, given that number, that all your
searching needs would be covered. You would be wrong. Those
twenty-four member functions offer less behaviour than the
fourteen searching algorithms defined in the standard <algorithm>
header. In other words, basic_string duplicates existing extensible
functionality, but without the extensibility or functionality.
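The difference in extensibility can be sketched by setting a member search beside two algorithm-based ones. The helper names (is_digit, same_letter, index_of, first_digit and find_ignoring_case) are illustrative assumptions; the library calls are std::string::find, std::find_if and std::search.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

bool is_digit(char c)
{
    return std::isdigit(static_cast<unsigned char>(c)) != 0;
}

bool same_letter(char a, char b)
{
    return std::tolower(static_cast<unsigned char>(a)) ==
           std::tolower(static_cast<unsigned char>(b));
}

// Member search: fixed to exact matching, reports an index or npos.
std::string::size_type index_of(const std::string &text,
                                const std::string &word)
{
    return text.find(word);
}

// Algorithm search: the caller supplies the criterion as a predicate.
std::string::const_iterator first_digit(const std::string &text)
{
    return std::find_if(text.begin(), text.end(), is_digit);
}

// Algorithm search for a subsequence under a caller-defined equality.
std::string::const_iterator find_ignoring_case(const std::string &text,
                                               const std::string &word)
{
    return std::search(text.begin(), text.end(),
                       word.begin(), word.end(), same_letter);
}
```

The returned iterators refer into the string passed as text, so that string must outlive them. None of the member find functions accepts a predicate, so the last two searches have no direct member equivalent.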
Jekyll, meet Hyde
Another feature that you may notice if you look at the interface
is that basic_string seems to be caught between two interface styles:
on the one hand, there are a lot of functions that work in terms
of numeric indices and on the other, there are a lot of familiar STL-
based functions that work in terms of iterators. This Jekyll and
Hyde interface is a historical artefact that reflects the traditional
and index-based style of interface first adopted for strings, and
the later incorporation of STL functionality to make strings look
like STL sequences. Given the choice, focus on the consistency,
expressiveness and extensibility of the STL approach rather than
the other more limited functions.
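The two styles can be seen side by side in a small sketch. The function names middle_by_index, middle_by_iterator and reversed are illustrative assumptions.

```cpp
#include <algorithm>
#include <string>

// Index style: positions and counts are numbers that only basic_string
// knows how to interpret.
std::string middle_by_index(const std::string &s,
                            std::string::size_type from,
                            std::string::size_type count)
{
    return s.substr(from, count);
}

// Iterator style: the same range expressed as [first, last).
// The caller must ensure that from + count <= s.size().
std::string middle_by_iterator(const std::string &s,
                               std::string::size_type from,
                               std::string::size_type count)
{
    return std::string(s.begin() + from, s.begin() + from + count);
}

// With iterators, the standard algorithms apply to strings as they do to
// any other sequence.
std::string reversed(std::string s)
{
    std::reverse(s.begin(), s.end());
    return s;
}
```

The iterator form carries over unchanged to std::vector<char> or any other sequence, whereas substr is specific to strings.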
The desire to please all of the people all of the time manifests
itself even in something as innocuous as subscripting. The
intuitive way to subscript an index is using operator[], but
there is also an at function that adds a difference and a twist: access
is range-checked with an exception, whereas there is no requirement
that operator[] must result in defined behaviour for any out-of-
bounds access. What at considers to be out of bounds is subtly
different to what operator[] considers to be out of bounds—there’s
nothing like consistent design...and that is nothing like it at all.
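The subtle difference shows up at an index equal to the string's size. In this sketch, at_throws and one_past_the_end are hypothetical helper names. The behaviour relied on is that at throws std::out_of_range for any index not less than size(), whereas operator[] on a const string is defined for an index equal to size() and yields the null character.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Reports whether at() rejects the index by throwing std::out_of_range.
bool at_throws(const std::string &s, std::size_t index)
{
    try
    {
        static_cast<void>(s.at(index));
        return false;
    }
    catch (const std::out_of_range &)
    {
        return true;
    }
}

// For a const string, subscripting at size() is defined and returns the
// null character, so it is not considered out of bounds.
char one_past_the_end(const std::string &s)
{
    return s[s.size()];
}
```

So for a three-character string, index 3 is out of bounds for at but within bounds for the const operator[].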
So, what is the design objective of at? I’m not sure that I can
state it in a way that is consistent with the way that the more natural
operator[] is used and how the rest of the interface supported
by STL sequences is expressed. If the aim is to provide an option
for range-checked access, that is certainly laudable, but why is this
feature provided only for indexing and not for iteration? The
inconsistency in the design seems to dilute its intention somewhat.
If it is a genuine design goal, then it should be reflected in the interface
design as a whole. As it is, the quality-of-failure offered by at is limited
at best and, at worst, wasted.
Why ‘at’ isn’t good enough
Consider how such a function might be used: the idiomatic approach
to indexing is to use operator[]. To do anything else takes a conscious
effort to adopt an alternative. Once that conscious effort is
made, the problem that at attempts to address has already been
solved. The standard documents the intent of such exceptions as,
“The distinguishing characteristic of logic errors is that they are
due to errors in the internal logic of the program. In theory, they
are preventable.” Once you make the decision to use at, you are
at the same doorstep as making the problem (and the use of at)
preventable. It is not whether or not you should have range checking,
but whether it makes any sense to have both checked and
unchecked options available in the same class.
The train of thought that leads to using at also leads away from
it—“I had better use at instead of operator[] because I might index
out of range...oh, OK, now that I know that, I will just check and
fix the code so that mistake doesn’t happen.” This reminds me
of a set of macros I once saw intended to make C programming
easier for Fortran programmers (including IF, THEN and so on).
There was a definition for ONE that was 0 (!). It was supposed
to allow for loops to be written using traditional Fortran one-based
indexing rather than C’s zero-based indexing. The user was
supposed to make the effort to write ONE instead of 1... by which
point they had already remembered enough to write 0 instead!
It is not that I don’t believe in at because I have a belief in unsafe
code; I don’t believe in at because it doesn’t work to make code
safe. It tries to address a broad quality-of-failure configuration issue
with a single member function, which is definitely the wrong hammer
for the screws it is trying to drive in. Programmers who believe
that at will make their program safer and more secure probably
do not understand enough about safety and security to make that
stick.
I may appear to be coming down unnecessarily hard on
std::basic_string. After all, I use it all the time and advocate it
frequently [8], and it satisfies the need for self-containedness [1] one expects
of a value object type. My concern is that outside of a subset of
its features, it is quite a poor design: it leads by example from the
rear, not the front. The failed ambition to please all comers has
resulted in something of a dog’s dinner of an interface. Additionally,
it embodies the flawed assumption that the diverse needs of
different string users can be accommodated in one type. The design
starts off in the wrong place, and goes downhill from there. So,
in summary, use std::string and std::wstring in preference to
raw C-style strings and proprietary string types, but don’t use
std::basic_string as a model example of design.
References
1. Kevlin Henney, “Six of the best”, Application Development
Advisor, May 2002.
2. Kevlin Henney, “Half a dozen of the other”, Application
Development Advisor, June 2002.
3. Kevlin Henney, “Making an exception”, Application
Development Advisor, May 2001.
4. Kevlin Henney, “One careful owner”, Application
Development Advisor, June 2001.
5. Kevlin Henney, “Flag waiving”, Application Development
Advisor, April 2002.
6. Scott Meyers, Effective STL, Addison-Wesley, 2001.
7. Kevlin Henney, email to Scott Meyers detailing why
allocators and shared memory do not mix, now listed in
the errata for Effective STL, www.aristeia.com.
8. Kevlin Henney, “The miseducation of C++”, Application
Development Advisor, April 2001.
Kevlin Henney is an independent software development consultant
and trainer. He can be reached at www.curbralan.com
C++ WORKSHOP

More Related Content

What's hot

Python Interview questions 2020
Python Interview questions 2020Python Interview questions 2020
Python Interview questions 2020
VigneshVijay21
 
Impact of indentation in programming
Impact of indentation in programmingImpact of indentation in programming
Impact of indentation in programming
ijpla
 
INTRODUCTION TO C PROGRAMMING
INTRODUCTION TO C PROGRAMMINGINTRODUCTION TO C PROGRAMMING
INTRODUCTION TO C PROGRAMMING
Abhishek Dwivedi
 
Data Types, Variables, and Constants in C# Programming
Data Types, Variables, and Constants in C# ProgrammingData Types, Variables, and Constants in C# Programming
Data Types, Variables, and Constants in C# Programming
Sherwin Banaag Sapin
 
Essential c notes singh projects
Essential c notes singh projectsEssential c notes singh projects
Essential c notes singh projects
SINGH PROJECTS
 
Pointers andmemory
Pointers andmemoryPointers andmemory
Pointers andmemory
ariffast
 
Introduction to programming using c
Introduction to programming using cIntroduction to programming using c
Introduction to programming using c
Reham Maher El-Safarini
 
Sienna 12 huffman
Sienna 12 huffmanSienna 12 huffman
Sienna 12 huffman
chidabdu
 
ANOMALY DETECTION IN ARABIC TEXTS USING NGRAMS AND SELF ORGANIZING MAPS
ANOMALY DETECTION IN ARABIC TEXTS USING NGRAMS AND SELF ORGANIZING MAPSANOMALY DETECTION IN ARABIC TEXTS USING NGRAMS AND SELF ORGANIZING MAPS
ANOMALY DETECTION IN ARABIC TEXTS USING NGRAMS AND SELF ORGANIZING MAPS
IJCSEA Journal
 
Anomaly Detection in Arabic Texts using Ngrams and Self Organizing Maps
Anomaly Detection in Arabic Texts using Ngrams and Self Organizing MapsAnomaly Detection in Arabic Texts using Ngrams and Self Organizing Maps
Anomaly Detection in Arabic Texts using Ngrams and Self Organizing Maps
IJCSEA Journal
 
Chapter 13.1.1
Chapter 13.1.1Chapter 13.1.1
Chapter 13.1.1
patcha535
 
A Closer Look at Data Types, Variables and Expressions
A Closer Look at Data Types, Variables and ExpressionsA Closer Look at Data Types, Variables and Expressions
A Closer Look at Data Types, Variables and Expressions
Inan Mashrur
 
Introduction of c_language
Introduction of c_languageIntroduction of c_language
Introduction of c_language
SINGH PROJECTS
 
A New Approach to Optimal Code Formatting
A New Approach to Optimal Code FormattingA New Approach to Optimal Code Formatting
A New Approach to Optimal Code Formatting
Phillip Yelland
 
C presentation book
C presentation bookC presentation book
C presentation book
krunal1210
 
C programming tutorial
C programming tutorialC programming tutorial
C programming tutorial
Mohit Saini
 

What's hot (16)

Python Interview questions 2020
Python Interview questions 2020Python Interview questions 2020
Python Interview questions 2020
 
Impact of indentation in programming
Impact of indentation in programmingImpact of indentation in programming
Impact of indentation in programming
 
INTRODUCTION TO C PROGRAMMING
INTRODUCTION TO C PROGRAMMINGINTRODUCTION TO C PROGRAMMING
INTRODUCTION TO C PROGRAMMING
 
Data Types, Variables, and Constants in C# Programming
Data Types, Variables, and Constants in C# ProgrammingData Types, Variables, and Constants in C# Programming
Data Types, Variables, and Constants in C# Programming
 
Essential c notes singh projects
Essential c notes singh projectsEssential c notes singh projects
Essential c notes singh projects
 
Pointers andmemory
Pointers andmemoryPointers andmemory
Pointers andmemory
 
Introduction to programming using c
Introduction to programming using cIntroduction to programming using c
Introduction to programming using c
 
Sienna 12 huffman
Sienna 12 huffmanSienna 12 huffman
Sienna 12 huffman
 
ANOMALY DETECTION IN ARABIC TEXTS USING NGRAMS AND SELF ORGANIZING MAPS
ANOMALY DETECTION IN ARABIC TEXTS USING NGRAMS AND SELF ORGANIZING MAPSANOMALY DETECTION IN ARABIC TEXTS USING NGRAMS AND SELF ORGANIZING MAPS
ANOMALY DETECTION IN ARABIC TEXTS USING NGRAMS AND SELF ORGANIZING MAPS
 
Anomaly Detection in Arabic Texts using Ngrams and Self Organizing Maps
Anomaly Detection in Arabic Texts using Ngrams and Self Organizing MapsAnomaly Detection in Arabic Texts using Ngrams and Self Organizing Maps
Anomaly Detection in Arabic Texts using Ngrams and Self Organizing Maps
 
Chapter 13.1.1
Chapter 13.1.1Chapter 13.1.1
Chapter 13.1.1
 
A Closer Look at Data Types, Variables and Expressions
A Closer Look at Data Types, Variables and ExpressionsA Closer Look at Data Types, Variables and Expressions
A Closer Look at Data Types, Variables and Expressions
 
Introduction of c_language
Introduction of c_languageIntroduction of c_language
Introduction of c_language
 
A New Approach to Optimal Code Formatting
A New Approach to Optimal Code FormattingA New Approach to Optimal Code Formatting
A New Approach to Optimal Code Formatting
 
C presentation book
C presentation bookC presentation book
C presentation book
 
C programming tutorial
C programming tutorialC programming tutorial
C programming tutorial
 

Viewers also liked

One Careful Owner
One Careful OwnerOne Careful Owner
One Careful Owner
Kevlin Henney
 
Collections for States
Collections for StatesCollections for States
Collections for States
Kevlin Henney
 
Worse Is Better, for Better or for Worse
Worse Is Better, for Better or for WorseWorse Is Better, for Better or for Worse
Worse Is Better, for Better or for Worse
Kevlin Henney
 
The Perfect Couple
The Perfect CoupleThe Perfect Couple
The Perfect Couple
Kevlin Henney
 
Learning Curve
Learning CurveLearning Curve
Learning Curve
Kevlin Henney
 
Inside Requirements
Inside RequirementsInside Requirements
Inside Requirements
Kevlin Henney
 
Substitutability
SubstitutabilitySubstitutability
Substitutability
Kevlin Henney
 
Bound and Checked
Bound and CheckedBound and Checked
Bound and Checked
Kevlin Henney
 
Exceptional Naming
Exceptional NamingExceptional Naming
Exceptional Naming
Kevlin Henney
 
Flag Waiving
Flag WaivingFlag Waiving
Flag Waiving
Kevlin Henney
 
The Next Best String
The Next Best StringThe Next Best String
The Next Best String
Kevlin Henney
 
Put to the Test
Put to the TestPut to the Test
Put to the Test
Kevlin Henney
 
The Programmer
The ProgrammerThe Programmer
The Programmer
Kevlin Henney
 

Viewers also liked (13)

One Careful Owner
One Careful OwnerOne Careful Owner
One Careful Owner
 
Collections for States
Collections for StatesCollections for States
Collections for States
 
Worse Is Better, for Better or for Worse
Worse Is Better, for Better or for WorseWorse Is Better, for Better or for Worse
Worse Is Better, for Better or for Worse
 
The Perfect Couple
The Perfect CoupleThe Perfect Couple
The Perfect Couple
 
Learning Curve
Learning CurveLearning Curve
Learning Curve
 
Inside Requirements
Inside RequirementsInside Requirements
Inside Requirements
 
Substitutability
SubstitutabilitySubstitutability
Substitutability
 
Bound and Checked
Bound and CheckedBound and Checked
Bound and Checked
 
Exceptional Naming
Exceptional NamingExceptional Naming
Exceptional Naming
 
Flag Waiving
Flag WaivingFlag Waiving
Flag Waiving
 
The Next Best String
The Next Best StringThe Next Best String
The Next Best String
 
Put to the Test
Put to the TestPut to the Test
Put to the Test
 
The Programmer
The ProgrammerThe Programmer
The Programmer
 

Similar to Stringing Things Along

Highly Strung
Highly StrungHighly Strung
Highly Strung
Kevlin Henney
 
Form Follows Function
Form Follows FunctionForm Follows Function
Form Follows Function
Kevlin Henney
 
If I Had a Hammer...
If I Had a Hammer...If I Had a Hammer...
If I Had a Hammer...
Kevlin Henney
 
Project lexical analyser compiler _1.pdf
Project lexical analyser compiler _1.pdfProject lexical analyser compiler _1.pdf
Project lexical analyser compiler _1.pdf
abhimanyukumar28203
 
Big Brother helps you
Big Brother helps youBig Brother helps you
Big Brother helps you
PVS-Studio
 
C# slid
C# slidC# slid
C# slid
pacatarpit
 
Rust presentation convergeconf
Rust presentation convergeconfRust presentation convergeconf
Rust presentation convergeconf
Krishna Kumar Thokala
 
Writing Efficient Code Feb 08
Writing Efficient Code Feb 08Writing Efficient Code Feb 08
Writing Efficient Code Feb 08
Ganesh Samarthyam
 
Data Type is a basic classification which identifies.docx
Data Type is a basic classification which identifies.docxData Type is a basic classification which identifies.docx
Data Type is a basic classification which identifies.docx
theodorelove43763
 
java vs C#
java vs C#java vs C#
C –FAQ:
C –FAQ:C –FAQ:
C interview-questions-techpreparation
C interview-questions-techpreparationC interview-questions-techpreparation
C interview-questions-techpreparation
Kushaal Singla
 
Difference between Java and c#
Difference between Java and c#Difference between Java and c#
Difference between Java and c#
Sagar Pednekar
 
Essential c
Essential cEssential c
Essential c
Raghu Dathesh
 
Essential c
Essential cEssential c
Essential c
Amit Kapoor
 
Essential c
Essential cEssential c
Essential c
Saurabh Bhoomkar
 
Essential c
Essential cEssential c
Essential c
Kshitij Parashar
 
python and perl
python and perlpython and perl
python and perl
Mara Angelica Refraccion
 
434090527-C-Cheat-Sheet. pdf C# program
434090527-C-Cheat-Sheet. pdf  C# program434090527-C-Cheat-Sheet. pdf  C# program
434090527-C-Cheat-Sheet. pdf C# program
MAHESHV559910
 
C Interview Questions for Fresher
C Interview Questions for FresherC Interview Questions for Fresher
C Interview Questions for Fresher
Javed Ahmad
 

Similar to Stringing Things Along (20)

Highly Strung
Highly StrungHighly Strung
Highly Strung
 
Form Follows Function
Form Follows FunctionForm Follows Function
Form Follows Function
 
If I Had a Hammer...
If I Had a Hammer...If I Had a Hammer...
If I Had a Hammer...
 
Project lexical analyser compiler _1.pdf
Project lexical analyser compiler _1.pdfProject lexical analyser compiler _1.pdf
Project lexical analyser compiler _1.pdf
 
Big Brother helps you
Big Brother helps youBig Brother helps you
Big Brother helps you
 
C# slid
C# slidC# slid
C# slid
 
Rust presentation convergeconf
Rust presentation convergeconfRust presentation convergeconf
Rust presentation convergeconf
 
Writing Efficient Code Feb 08
Writing Efficient Code Feb 08Writing Efficient Code Feb 08
Writing Efficient Code Feb 08
 
Data Type is a basic classification which identifies.docx
Data Type is a basic classification which identifies.docxData Type is a basic classification which identifies.docx
Data Type is a basic classification which identifies.docx
 
java vs C#
java vs C#java vs C#
java vs C#
 
C –FAQ:
C –FAQ:C –FAQ:
C –FAQ:
 
C interview-questions-techpreparation
C interview-questions-techpreparationC interview-questions-techpreparation
C interview-questions-techpreparation
 
Difference between Java and c#
Difference between Java and c#Difference between Java and c#
Difference between Java and c#
 
Essential c
Essential cEssential c
Essential c
 
Essential c
Essential cEssential c
Essential c
 
Essential c
Essential cEssential c
Essential c
 
Essential c
Essential cEssential c
Essential c
 
python and perl
python and perlpython and perl
python and perl
 
434090527-C-Cheat-Sheet. pdf C# program
434090527-C-Cheat-Sheet. pdf  C# program434090527-C-Cheat-Sheet. pdf  C# program
434090527-C-Cheat-Sheet. pdf C# program
 
C Interview Questions for Fresher
C Interview Questions for FresherC Interview Questions for Fresher
C Interview Questions for Fresher
 

Stringing Things Along

How long is a piece of string? Or, to be precise, what is a string? For C programmers the answer has always been easy: it is a null-terminated sequence of characters. This is lamentably still the answer of choice for some C++ programmers. For others working day-to-day with a specific proprietary or in-house library, it is possible that their answer may be the resident string class. C++ programmers educated in the standard library will respond that there is a string type defined by the standard, and that that type is the one true answer for C++. But there is a much greater diversity than is suggested by these singular or orthodox answers.

OK, so what is a string? A reasonably high-level answer would be a sequence of characters. A slightly more precise answer would be a value that represents a sequence of characters. The shift in emphasis is a subtle but worthwhile one.

Why objects aren’t equal

Objects live in a class-based society where equal rights are not conferred on all. Many programmers believe that all classes have the same basic rights as one another and that all should follow the same rules of development. For instance, they should all have functional copy constructors and assignment operators. Sometimes it’s comforting to have a mechanical set of rules to follow, but it can also be a means of going wrong with confidence. While some guidelines can be mechanical and automatic, most of those concerned with the bigger design picture must be framed in more general terms to be of genuine long-term use [1, 2].

Returning to copying, does it make sense to support copying for all types? Copying strings seems like a useful idea, as does copying lists, monetary quantities, dates and so on. But what about copying objects that represent entities in the application domain, such as bank accounts or payment transactions? Copying a bank account is generally considered to be a criminal offence on this side of the keyboard.
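The distinction drawn above, between values that copy freely and entities that should not be copied at all, can be stated directly in code. What follows is a minimal sketch rather than anything from the column itself: Money and BankAccount are hypothetical names, and it assumes C++11 or later for "= delete" and <type_traits> (in 2002 the equivalent idiom was to declare the copy operations private and leave them undefined).

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical value type: a fine-grained piece of information.
// The compiler-generated copy operations are exactly what is wanted.
struct Money {
    long pence;
};

// Hypothetical entity type: it has identity, so copying is switched off.
class BankAccount {
public:
    explicit BankAccount(const std::string &holder) : holder_(holder), balance_() {}

    BankAccount(const BankAccount &) = delete;
    BankAccount &operator=(const BankAccount &) = delete;

    void deposit(Money amount) { balance_.pence += amount.pence; }
    Money balance() const { return balance_; }          // a value, returned by copy
    const std::string &holder() const { return holder_; }

private:
    std::string holder_;
    Money balance_;
};

static_assert(std::is_copy_constructible<Money>::value, "values copy freely");
static_assert(!std::is_copy_constructible<BankAccount>::value, "entities do not");

void copy_example() {
    BankAccount account("A. Holder");
    account.deposit(Money{150});
    Money snapshot = account.balance();   // copying a value is fine
    assert(snapshot.pence == 150);
    // BankAccount forged = account;      // would not compile: copying is deleted
}
```

The commented-out line is the point of the exercise: the compiler, rather than a coding guideline, refuses to copy the account.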
What about copying objects that play an infrastructure role in programs, such as databases or resource wrappers? Remembering that in various situations temporary objects can be generated or elided at will by the compiler, is copying a whole database the kind of operation that is casual enough to support via something as transparent as a copy constructor? Considering the issue of resource wrappers, what does it mean to uniquely acquire a specific system resource... and then copy it? Such encapsulation was the basis of the scoped type, which was explored and defined in previous instalments of this column [3, 4]. Copying was strictly forbidden because, regardless of whether or not it was technically achievable, it made no actual sense. All too often there is a temptation to “solve the solution” rather than “solve the problem”.

When you look at objects this way it becomes clear that there are some fundamental differences between different kinds of objects, and that these differences go beyond the simple feature of copyability that I picked on for demonstration. It turns out that for most kinds of objects, copying is either a non-requirement or its absence is actually the requirement.

For value objects, copying is not only sensible; it is pretty much a basic requirement. Value objects represent fine-grained pieces of information in a system, such as numeric quantities or textual data. They can be manipulated, composed and combined freely, often with the intuitive assistance of operators. Subscript into a string s at index i? Use s[i]. Concatenate two strings, a and b, together? Take the small leap to consider concatenation to be a form of addition, and a + b follows.

Values are passed around either by copy or by reference, but not by pointer. Why not by pointer? Because a pointer emphasises indirection and object identity, two things that are of little importance (and indeed act as cumbersome distractions) when working with value objects. Values live most comfortably on the stack or as data members of other objects, bound to their enclosing scope’s lifetime, rather than freely on the heap. Of course, their internal representation may live on the heap, but that detail should be self-contained [3] and hidden behind a veil of encapsulation.

FACTS AT A GLANCE
  • Strings come in all shapes, sizes and programmers’ preferences. They are value types that hold sequences of characters for appropriate definitions of value, character and sequence.
  • Value object types are different from many other object types because they focus on fine-grained information rather than application entities. Therefore they are copyable, normally live on the stack or in other objects, and are not generally accessed by pointers.
  • The core language offers C-style, null-terminated strings.
  • The standard library offers a single string type. Unfortunately, it is by no means the best encapsulation. However, it should still be used in preference to proprietary or lower-level solutions.

C++ WORKSHOP
Kevlin Henney discusses the use of strings in C++, and finds that one type doesn’t fit all.
Application Development Advisor, July–August 2002, www.appdevadvisor.co.uk

OK, so now that we’ve understood that a string is some kind of a value, what are we offered in the standard? And how does that measure up against our expectations?

In the core language, inherited from C, we have a strongly reinforced convention that a string is a null-terminated array of characters. The character type can be either char (conventionally ASCII or one of its relatives) or wchar_t, which may hold a suitable wide-character type (e.g. 32-bit Unicode) and which ranks as possibly the worst keyword ever to make it into a serious programming language. The null character has the integer value 0, and is expressed most conveniently as '\0' or char(0) for char, or L'\0' or wchar_t(0) for wchar_t.

The death knell for NULL?

Note that the common C programmer mistake of using NULL for this purpose is just that: a mistake. In C, NULL is intended to denote the null pointer, not the null character, and depending on the platform, it may be defined as 0, 0L or (void *) 0.
In C++ it is commonly accepted that NULL should not be used at all, because it can lead to surprising ambiguity during overload resolution. This is a point I will return to in a moment.

So how do we work with built-in strings? Clumsily. The only appropriate operator on offer is for subscripting, i.e. string[index], and everything else must be done either by hand or by the str and mem function families defined in the C standard, e.g. strlen to find the length and strcpy to perform unmanaged copying.

To define a general string value, you must either allocate suitable space dynamically on the heap (remembering to clear it up later) or declare an array of sufficient size. Neither of these two means of allocation allows convenient resizing, although heap-based allocation does at least allow flexibility in initial size and the opportunity to replace one string with another, assuming that your pointer bookkeeping is up to scratch.

Arrays cannot be copied by assignment, by copy construction, as return values or as function arguments. The first three cases are illegal and the last is sneakily substituted with a pointer. In other words,

    void write(const char line[80]);

is rewritten by the compiler as:

    void write(const char *line);

If you really want to ensure that an 80-character array is passed through, you have to resort to the syntactically challenging:

    void write(const char (&line)[80]);

This is hard-nosed about its requirement, because only a compile-time declared array of 80 characters will do.

A point about pointers

The upshot of all this is that, intentionally or otherwise, built-in strings are normally passed around as pointers, and are most commonly thought of this way. Pointers apparently behave as values for many purposes: they can be assigned and passed around freely, and they have some operator support. However, the copying results in aliasing and there is no management of the string in any way. There is also the small matter of null.
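Before turning to null, the parameter adjustment and the aliasing described above can be checked in a few lines. This is an illustrative sketch only: checked_size and copy_is_alias are hypothetical helper names, write is given an empty body purely so that the example links, and std::is_same assumes C++11 or later.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Declared with a bound of 80, but the bound is ignored: the parameter
// type is adjusted to const char *.
void write(const char line[80]) { (void)line; }

static_assert(std::is_same<decltype(&write), void (*)(const char *)>::value,
              "an array parameter is really a pointer parameter");

// A reference to an array keeps the bound: only an array of exactly
// 80 chars will bind, and sizeof sees the whole array.
std::size_t checked_size(const char (&line)[80]) {
    return sizeof(line);
}

// Copying a pointer copies the address, not the characters.
bool copy_is_alias() {
    char original[] = "string";
    char *copy = original;      // no characters are copied
    copy[0] = 'S';              // so this changes original as well
    return original[0] == 'S';
}

void pointer_example() {
    const char line[80] = "eighty characters of storage";
    write(line);                          // decays to a pointer at the call
    assert(checked_size(line) == 80);
    assert(copy_is_alias());
    // const char shorter[40] = "";
    // checked_size(shorter);             // would not compile: wrong bound
}
```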
Strings considered as pointers can refer either to a sequence of characters... or not. This extra out-of-band value sometimes sees action in expressing special circumstances, particularly exceptional outcomes. C++ has better mechanisms for signalling bad news, and the use of null to signal special not applicable values is an occasionally useful programming trick that receives too much airtime. Empty strings are often useful alternatives. However, an empty string is a string, whereas a null pointer is not.

The legitimacy of the null pointer value in the context of strings as pointers also gives us an unpleasant side effect. Given:

    void write(const char *line);
    void write(int);

The common macro definition of NULL as 0 means that the following code will pick out the second write function rather than the more obviously pointer-oriented first write:

    write(NULL); // wrong write

So NULL is misleading as a pointer value, given that the following disambiguates the call:

    write(static_cast<const char *>(0)); // right write

To sum up, C-style strings don’t measure up at all well against our value type expectations. They can still be valuable in certain contexts, but marks out of ten for general-purpose use hug the lower end of the scale as well as the lower layers of abstraction.

The standard provides the std::basic_string class template for all your string needs, or at least, that is the intent. The main motivation for templating is to capture the different possible character types in a single body of code: strings of char, wchar_t or a character type of your own invention all use the same implementation code. A couple of standard typedefs offer you the most commonly used names for working with std::basic_string:

    namespace std
    {
        typedef basic_string<char> string;
        typedef basic_string<wchar_t> wstring;
    }
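As a quick check against the value type expectations set out earlier, the sketch below confirms that the two typedefs name instantiations of the same template and that copies of a std::string do not alias one another. It is only an illustration: shout and value_example are hypothetical names, and std::is_same assumes C++11 or later.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// The standard typedefs are just names for two instantiations of one template,
// with the second and third template parameters left at their defaults.
static_assert(std::is_same<std::string, std::basic_string<char>>::value,
              "string is basic_string<char>");
static_assert(std::is_same<std::wstring, std::basic_string<wchar_t>>::value,
              "wstring is basic_string<wchar_t>");

// Passed and returned by copy: the caller's string is never touched.
std::string shout(std::string text) {
    text += "!";
    return text;
}

void value_example() {
    std::string a = "string";
    std::string b = shout(a);
    assert(a == "string");               // the original is unchanged
    assert(b == "string!");
    assert(b[0] == 's');                 // subscripting with operator[]
    assert(a + b == "stringstring!");    // concatenation as addition

    std::wstring w = L"wide";            // same interface, wider characters
    assert(w + L"r" == L"wider");
}
```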
What basic_string does well is to act as an encapsulated type from the perspective of memory management. On the other hand, it is mired in convenience. It has lots of little features that are thought to be convenient, but the general effect of including them all is counterproductive. The redundancy and complexity starts before we even reach the class keyword in its definition:

    namespace std
    {
        template
        <
            typename char_type,
            typename traits = char_traits<char_type>,
            typename allocator_type = allocator<char_type>
        >
        class basic_string
        {
            ...
        };
    }

Using your own traits

Not content simply with offering parameterisable variation on character type, basic_string also lets you, if you wish, provide your own set of traits for working with your selected character type [5]. At first it looks like the second parameter could be used for dealing with differences between character sets, i.e. a useful hook for addressing internationalisation. However, not only is it not that useful, but a template parameter on the core string type is a long way from being the best way to deal with internationalisation issues such as locale-sensitive string comparison.

The second parameter offers a set of unrelated facilities, some of which are focused on character values on their own, some of which are focused on memory manipulation of character type sequences, and the remainder of which are focused on I/O. For a sensibly defined character type, the policy parameter offers no useful extras; for poorly defined character types, the solution is to fix the character type rather than shore it up with a policy parameter.

Adding to the syntax soup is the third parameter: the allocator type. This wart can be found on all of the standard container classes. Allocators were an attempt to parameterise the memory model used for allocating container elements, prompted originally by historical shortcomings of the amateurish DOS memory model.
Originally it was hoped that the idea of generalising memory models would afford transparent access to persistent storage or shared memory representations. As the process of standardisation nailed down the semantics of allocators, their complexity increased and their generality decreased. We now know that they are useful for remarkably little [6], and probably even less than that [7].

Out of three degrees of parameterisation, then, only one is of any use. Fortunately, the second and third parameters are defaulted, so you can at least ignore them in most cases. They represent failed experiments in generality and policy-based design.

Such gratuitous but limiting overgeneralisation is also found in the function members offered in the basic_string interface. The most obvious example is with functions that search for a given character or substring, of which there are twenty-four in basic_string. You might think, given that number, that all your searching needs would be covered. You would be wrong. Those twenty-four member functions offer less behaviour than the fourteen searching algorithms defined in the standard <algorithm> header. In other words, basic_string duplicates existing extensible functionality, but without the extensibility or functionality.

Jekyll, meet Hyde

Another feature that you may notice if you look at the interface is that basic_string seems to be caught between two interface styles: on the one hand, there are a lot of functions that work in terms of numeric indices and on the other, there are a lot of familiar STL-based functions that work in terms of iterators. This Jekyll and Hyde interface is a historical artefact that reflects the traditional and index-based style of interface first adopted for strings, and the later incorporation of STL functionality to make strings look like STL sequences. Given the choice, focus on the consistency, expressiveness and extensibility of the STL approach rather than the other more limited functions.
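The two interface styles can be put side by side. In this sketch the function names are hypothetical and the lambda assumes C++11 or later (in 2002 a function object would have played the same part). The member find works in indices and reports failure with npos, whereas std::find_if works in iterators and takes an arbitrary predicate, which none of the member search functions accept.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Index style: the member function returns a position, or npos on failure.
std::string::size_type first_space_by_index(const std::string &text) {
    return text.find(' ');
}

// Iterator style: the algorithm returns an iterator, or end() on failure,
// and the search criterion is whatever predicate the caller supplies.
std::string::const_iterator first_digit(const std::string &text) {
    return std::find_if(text.begin(), text.end(),
                        [](unsigned char c) { return std::isdigit(c) != 0; });
}

void search_example() {
    const std::string text = "route 66 west";
    assert(first_space_by_index(text) == 5);
    assert(first_digit(text) - text.begin() == 6);

    const std::string plain = "nothing";
    assert(first_space_by_index(plain) == std::string::npos);
    assert(first_digit(plain) == plain.end());
}
```

The same first_digit would work unchanged on a std::vector<char> or any other sequence with iterators, which is the extensibility the member functions lack.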
The desire to please all of the people all of the time manifests itself even in something as innocuous as subscripting. The intuitive way to subscript a string is using operator[], but there is also an at function that adds a difference and a twist: access is range-checked with an exception, whereas there is no requirement that operator[] must result in defined behaviour for any out-of-bounds access. What at considers to be out of bounds is subtly different to what operator[] considers to be out of bounds: there’s nothing like consistent design... and that is nothing like it at all.

So, what is the design objective of at? I’m not sure that I can state it in a way that is consistent with the way that the more natural operator[] is used and how the rest of the interface supported by STL sequences is expressed. If the aim is to provide an option for range-checked access, that is certainly laudable, but why is this feature provided only for indexing and not for iteration? The inconsistency in the design seems to dilute its intention somewhat. If it is a genuine design goal, then it should be reflected in the interface design as a whole. As it is, the quality-of-failure offered by at is limited at best and, at worst, wasted.

Why ‘at’ isn’t good enough

Consider how such a function might be used: the idiomatic approach to indexing is to use operator[]. To do anything else takes a conscious effort to adopt an alternative. Once that conscious effort is made, the problem that at attempts to address has already been solved. The standard documents the intent of such exceptions as, “The distinguishing characteristic of logic errors is that they are due to errors in the internal logic of the program. In theory, they are preventable.” Once you make the decision to use at, you are at the same doorstep as making the problem (and the use of at) preventable.
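The subtle difference in what counts as out of bounds shows up at the index equal to size(). The sketch below is a small illustration, with at_throws and bounds_example as hypothetical names: for a const string, operator[] at that index is defined and yields the null character, whereas at treats the same index as an error and throws std::out_of_range.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Returns true if at() rejects the index by throwing std::out_of_range.
bool at_throws(const std::string &text, std::string::size_type index) {
    try {
        (void)text.at(index);
        return false;
    } catch (const std::out_of_range &) {
        return true;
    }
}

void bounds_example() {
    const std::string text = "abc";

    assert(!at_throws(text, 2));            // last character: fine for both
    assert(text[2] == 'c');

    assert(at_throws(text, text.size()));   // out of bounds as far as at is concerned
    assert(text[text.size()] == '\0');      // defined for const operator[]

    // text[text.size() + 1] has no defined behaviour at all, so it is
    // deliberately not exercised here.
}
```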
The question is not whether you should have range checking, but whether it makes any sense to have both checked and unchecked options available in the same class. The train of thought that leads to using at also leads away from it: “I had better use at instead of operator[] because I might index out of range... oh, OK, now that I know that, I will just check and fix the code so that mistake doesn’t happen.”

This reminds me of a set of macros I once saw intended to make C programming easier for Fortran programmers (including IF, THEN and so on). There was a definition for ONE that was 0 (!). It was supposed to allow for loops to be written using traditional Fortran one-based indexing rather than C’s zero-based indexing. The user was supposed to make the effort to write ONE instead of 1... by which point they had already remembered enough to write 0 instead!

It is not that I don’t believe in at because I have a belief in unsafe code; I don’t believe in at because it doesn’t work to make code safe. It tries to address a broad quality-of-failure configuration issue with a single member function, which is definitely the wrong hammer for the screws it is trying to drive in. Programmers who believe that at will make their program safer and more secure probably do not understand enough about safety and security to make that stick.

I may appear to be coming down unnecessarily hard on std::basic_string. After all, I use it all the time and advocate it frequently [8], and it satisfies the need for self-containedness [1] one expects of a value object type. My concern is that outside of a subset of its features, it is quite a poor design: it leads by example from the rear, not the front. The failed ambition to please all comers has resulted in something of a dog’s dinner of an interface. Additionally, it embodies the flawed assumption that the diverse needs of different string users can be accommodated in one type. The design starts off in the wrong place, and goes downhill from there.
So, in summary, use std::string and std::wstring in preference to raw C-style strings and proprietary string types, but don’t use std::basic_string as a model example of design.

References
1. Kevlin Henney, “Six of the best”, Application Development Advisor, May 2002.
2. Kevlin Henney, “Half a dozen of the other”, Application Development Advisor, June 2002.
3. Kevlin Henney, “Making an exception”, Application Development Advisor, May 2001.
4. Kevlin Henney, “One careful owner”, Application Development Advisor, June 2001.
5. Kevlin Henney, “Flag waiving”, Application Development Advisor, April 2002.
6. Scott Meyers, Effective STL, Addison-Wesley, 2001.
7. Kevlin Henney, email to Scott Meyers detailing why allocators and shared memory do not mix, now listed in the errata for Effective STL, www.aristeia.com.
8. Kevlin Henney, “The miseducation of C++”, Application Development Advisor, April 2001.

Kevlin Henney is an independent software development consultant and trainer. He can be reached at www.curbralan.com