30 Localization library [localization]

30.4 Standard locale categories [locale.categories]

30.4.2 The ctype category [category.ctype]

30.4.2.5 Class template codecvt [locale.codecvt]

30.4.2.5.1 General [locale.codecvt.general]

namespace std { class codecvt_base { public: enum result { ok, partial, error, noconv }; }; template<class internT, class externT, class stateT> class codecvt : public locale::facet, public codecvt_base { public: using intern_type = internT; using extern_type = externT; using state_type = stateT; explicit codecvt(size_t refs = 0); result out( stateT& state, const internT* from, const internT* from_end, const internT*& from_next, externT* to, externT* to_end, externT*& to_next) const; result unshift( stateT& state, externT* to, externT* to_end, externT*& to_next) const; result in( stateT& state, const externT* from, const externT* from_end, const externT*& from_next, internT* to, internT* to_end, internT*& to_next) const; int encoding() const noexcept; bool always_noconv() const noexcept; int length(stateT&, const externT* from, const externT* end, size_t max) const; int max_length() const noexcept; static locale::id id; protected: ~codecvt(); virtual result do_out( stateT& state, const internT* from, const internT* from_end, const internT*& from_next, externT* to, externT* to_end, externT*& to_next) const; virtual result do_in( stateT& state, const externT* from, const externT* from_end, const externT*& from_next, internT* to, internT* to_end, internT*& to_next) const; virtual result do_unshift( stateT& state, externT* to, externT* to_end, externT*& to_next) const; virtual int do_encoding() const noexcept; virtual bool do_always_noconv() const noexcept; virtual int do_length(stateT&, const externT* from, const externT* end, size_t max) const; virtual int do_max_length() const noexcept; }; }
The class codecvt<internT, externT, stateT> is for use when converting from one character encoding to another, such as from wide characters to multibyte characters or between wide character encodings such as UTF-32 and EUC.
The stateT argument selects the pair of character encodings being mapped between.
The specializations required in Table 104 ([locale.category]) convert the implementation-defined native character set.
codecvt<char, char, mbstate_t> implements a degenerate conversion; it does not convert at all.
The specialization codecvt<char16_t, char8_t, mbstate_t> converts between the UTF-16 and UTF-8 encoding forms, and the specialization codecvt<char32_t, char8_t, mbstate_t> converts between the UTF-32 and UTF-8 encoding forms.
codecvt<wchar_t, char, mbstate_t> converts between the native character sets for ordinary and wide characters.
Specializations on mbstate_t perform conversion between encodings known to the library implementer.
Other encodings can be converted by specializing on a program-defined stateT type.
Objects of type stateT can contain any state that is useful to communicate to or from the specialized do_in or do_out members.

30.4.2.5.2 Members [locale.codecvt.members]

result out( stateT& state, const internT* from, const internT* from_end, const internT*& from_next, externT* to, externT* to_end, externT*& to_next) const;
Returns: do_out(state, from, from_end, from_next, to, to_end, to_next).
result unshift(stateT& state, externT* to, externT* to_end, externT*& to_next) const;
Returns: do_unshift(state, to, to_end, to_next).
result in( stateT& state, const externT* from, const externT* from_end, const externT*& from_next, internT* to, internT* to_end, internT*& to_next) const;
Returns: do_in(state, from, from_end, from_next, to, to_end, to_next).
int encoding() const noexcept;
Returns: do_encoding().
bool always_noconv() const noexcept;
Returns: do_always_noconv().
int length(stateT& state, const externT* from, const externT* from_end, size_t max) const;
Returns: do_length(state, from, from_end, max).
int max_length() const noexcept;
Returns: do_max_length().

30.4.2.5.3 Virtual functions [locale.codecvt.virtuals]

result do_out( stateT& state, const internT* from, const internT* from_end, const internT*& from_next, externT* to, externT* to_end, externT*& to_next) const; result do_in( stateT& state, const externT* from, const externT* from_end, const externT*& from_next, internT* to, internT* to_end, internT*& to_next) const;
Preconditions: (from <= from_end && to <= to_end) is well-defined and true; state is initialized, if at the beginning of a sequence, or else is equal to the result of converting the preceding characters in the sequence.
Effects: Translates characters in the source range [from, from_end), placing the results in sequential positions starting at destination to.
Converts no more than (from_end - from) source elements, and stores no more than (to_end - to) destination elements.
Stops if it encounters a character it cannot convert.
It always leaves the from_next and to_next pointers pointing one beyond the last element successfully converted.
If returns noconv, internT and externT are the same type and the converted sequence is identical to the input sequence [from, from_next).
to_next is set equal to to, the value of state is unchanged, and there are no changes to the values in [to, to_end).
A codecvt facet that is used by basic_filebuf ([file.streams]) shall have the property that if do_out(state, from, from_end, from_next, to, to_end, to_next) would return ok, where from != from_end, then do_out(state, from, from + 1, from_next, to, to_end, to_next) shall also return ok, and that if do_in(state, from, from_end, from_next, to, to_end, to_next) would return ok, where to != to_end, then do_in(state, from, from_end, from_next, to, to + 1, to_next) shall also return ok.246
[Note 1: 
As a result of operations on state, it can return ok or partial and set from_next == from and to_next != to.
— end note]
Returns: An enumeration value, as summarized in Table 106.
Table 106: do_in/do_out result values [tab:locale.codecvt.inout]
Value
Meaning
ok
completed the conversion
partial
not all source characters converted
error
encountered a character in [from, from_end) that cannot be converted
noconv
internT and externT are the same type, and input sequence is identical to converted sequence
A return value of partial, if (from_next == from_end), indicates that either the destination sequence has not absorbed all the available destination elements, or that additional source elements are needed before another destination element can be produced.
Remarks: Its operations on state are unspecified.
[Note 2: 
This argument can be used, for example, to maintain shift state, to specify conversion options (such as count only), or to identify a cache of seek offsets.
— end note]
result do_unshift(stateT& state, externT* to, externT* to_end, externT*& to_next) const;
Preconditions: (to <= to_end) is well-defined and true; state is initialized, if at the beginning of a sequence, or else is equal to the result of converting the preceding characters in the sequence.
Effects: Places characters starting at to that should be appended to terminate a sequence when the current stateT is given by state.247
Stores no more than (to_end - to) destination elements, and leaves the to_next pointer pointing one beyond the last element successfully stored.
Returns: An enumeration value, as summarized in Table 107.
Table 107: do_unshift result values [tab:locale.codecvt.unshift]
Value
Meaning
ok
completed the sequence
partial
space for more than to_end - to destination elements was needed to terminate a sequence given the value of state
error
an unspecified error has occurred
noconv
no termination is needed for this state_type
int do_encoding() const noexcept;
Returns: -1 if the encoding of the externT sequence is state-dependent; else the constant number of externT characters needed to produce an internal character; or 0 if this number is not a constant.248
bool do_always_noconv() const noexcept;
Returns: true if do_in() and do_out() return noconv for all valid argument values.
codecvt<char, char, mbstate_t> returns true.
int do_length(stateT& state, const externT* from, const externT* from_end, size_t max) const;
Preconditions: (from <= from_end) is well-defined and true; state is initialized, if at the beginning of a sequence, or else is equal to the result of converting the preceding characters in the sequence.
Effects: The effect on the state argument is as if it called do_in(state, from, from_end, from, to, to+max, to) for to pointing to a buffer of at least max elements.
Returns: (from_next-from) where from_next is the largest value in the range [from, from_end] such that the sequence of values in the range [from, from_next) represents max or fewer valid complete characters of type internT.
The specialization codecvt<char, char, mbstate_t>, returns the lesser of max and (from_end-from).
int do_max_length() const noexcept;
Returns: The maximum value that do_length(state, from, from_end, 1) can return for any valid range [from, from_end) and stateT value state.
The specialization codecvt<char, char, mbstate_t>​::​do_max_length() returns 1.
246)246)
Informally, this means that basic_filebuf assumes that the mappings from internal to external characters is 1 to N: that a codecvt facet that is used by basic_filebuf can translate characters one internal character at a time.
247)247)
Typically these will be characters to return the state to stateT().
248)248)
If encoding() yields -1, then more than max_length() externT elements can be consumed when producing a single internT character, and additional externT elements can appear at the end of a sequence after those that yield the final internT character.