libusual
0.1
Low-level UTF8 handling.
Functions
int utf8_get_char(const char **src_p, const char *srcend)
Parse Unicode codepoint from UTF8 stream.
bool utf8_put_char(unsigned int c, char **dst_p, const char *dstend)
Write Unicode codepoint as UTF8 sequence.
int utf8_char_size(unsigned int c)
Return UTF8 sequence length based on Unicode codepoint.
int utf8_seq_size(unsigned char c)
Return UTF8 sequence length based on first byte.
int utf8_validate_seq(const char *src, const char *srcend)
Return sequence length if all bytes are valid, 0 otherwise.
int utf8_get_char(const char **src_p, const char *srcend)
Parse Unicode codepoint from UTF8 stream.
On an invalid UTF8 sequence, returns the negative byte value and increases src_p by one.
src_p | Location of data pointer. Will be incremented in-place. |
srcend | Pointer to end of data. |
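The decoding contract described above can be illustrated with a minimal sketch. It does not call libusual; sketch_get_char and sketch_count_chars are hypothetical stand-ins written for this example, and the exact rules they apply (rejecting overlong forms and surrogates, returning the negated first byte on error, requiring at least one input byte) are assumptions rather than confirmed library behavior.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for utf8_get_char(); not the libusual source. */
static int sketch_get_char(const char **src_p, const char *srcend)
{
    const unsigned char *p = (const unsigned char *)*src_p;
    const unsigned char *end = (const unsigned char *)srcend;
    unsigned int c, min;
    int len, i;

    assert(p < end);    /* assumption: caller supplies at least one byte */

    if (p[0] < 0x80) {                      /* 0xxxxxxx: ASCII */
        *src_p += 1;
        return p[0];
    } else if ((p[0] & 0xE0) == 0xC0) {     /* 110xxxxx: 2 bytes */
        len = 2; c = p[0] & 0x1F; min = 0x80;
    } else if ((p[0] & 0xF0) == 0xE0) {     /* 1110xxxx: 3 bytes */
        len = 3; c = p[0] & 0x0F; min = 0x800;
    } else if ((p[0] & 0xF8) == 0xF0) {     /* 11110xxx: 4 bytes */
        len = 4; c = p[0] & 0x07; min = 0x10000;
    } else {
        goto bad;                           /* stray tail byte or 0xF8..0xFF */
    }

    if (end - p < len)
        goto bad;                           /* sequence is truncated */
    for (i = 1; i < len; i++) {
        if ((p[i] & 0xC0) != 0x80)
            goto bad;                       /* tail byte must be 10xxxxxx */
        c = (c << 6) | (p[i] & 0x3F);
    }
    /* Assumption: overlong forms, surrogates and > U+10FFFF are rejected. */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        goto bad;

    *src_p += len;
    return (int)c;

bad:
    *src_p += 1;                            /* skip exactly one byte */
    return -(int)p[0];                      /* negative byte value */
}

/* Walk a buffer, counting decoded items and how many were invalid bytes. */
static size_t sketch_count_chars(const char *s, size_t n, size_t *n_bad)
{
    const char *p = s, *end = s + n;
    size_t count = 0;

    *n_bad = 0;
    while (p < end) {
        if (sketch_get_char(&p, end) < 0)
            (*n_bad)++;
        count++;
    }
    return count;
}
```

Because an invalid byte advances the pointer by exactly one, a loop of this shape always terminates and resynchronizes on the next byte, which is presumably why the documented function reports errors this way.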
bool utf8_put_char(unsigned int c, char **dst_p, const char *dstend)
Write Unicode codepoint as UTF8 sequence.
Skips invalid Unicode values without error.
c | Unicode codepoint. |
dst_p | Location of dest pointer, will be increased in-place. |
dstend | Pointer to end of buffer. |
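A minimal sketch of an encoder with the same signature follows. sketch_put_char is a hypothetical stand-in, not the libusual implementation. Two points are assumptions: that "invalid Unicode values" means surrogates and values above U+10FFFF, and that a false return means the sequence did not fit before dstend.

```c
#include <stdbool.h>

/* Hypothetical stand-in for utf8_put_char(); not the libusual source. */
static bool sketch_put_char(unsigned int c, char **dst_p, const char *dstend)
{
    char *dst = *dst_p;
    int len;

    /* Assumption: invalid values are skipped silently, nothing is written. */
    if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return true;

    len = c < 0x80 ? 1 : c < 0x800 ? 2 : c < 0x10000 ? 3 : 4;

    /* Assumption: false signals that the buffer has no room left. */
    if (dstend - dst < len)
        return false;

    switch (len) {
    case 1:
        *dst++ = (char)c;
        break;
    case 2:
        *dst++ = (char)(0xC0 | (c >> 6));
        *dst++ = (char)(0x80 | (c & 0x3F));
        break;
    case 3:
        *dst++ = (char)(0xE0 | (c >> 12));
        *dst++ = (char)(0x80 | ((c >> 6) & 0x3F));
        *dst++ = (char)(0x80 | (c & 0x3F));
        break;
    default:
        *dst++ = (char)(0xF0 | (c >> 18));
        *dst++ = (char)(0x80 | ((c >> 12) & 0x3F));
        *dst++ = (char)(0x80 | ((c >> 6) & 0x3F));
        *dst++ = (char)(0x80 | (c & 0x3F));
        break;
    }

    *dst_p = dst;   /* advance the caller's pointer past the written bytes */
    return true;
}
```

In this sketch the destination pointer is left untouched when nothing is written, so a caller can grow the buffer and retry the same codepoint.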
int utf8_char_size(unsigned int c)
Return UTF8 sequence length based on Unicode codepoint.
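The mapping from codepoint to encoded length follows the standard UTF-8 ranges, which can be sketched as below. sketch_char_size is a hypothetical name, and returning 0 for values above U+10FFFF is an assumption; the documentation does not say what the library returns for out-of-range input.

```c
/* Hypothetical stand-in for utf8_char_size(); not the libusual source. */
static int sketch_char_size(unsigned int c)
{
    if (c < 0x80)
        return 1;       /* U+0000..U+007F */
    if (c < 0x800)
        return 2;       /* U+0080..U+07FF */
    if (c < 0x10000)
        return 3;       /* U+0800..U+FFFF */
    if (c <= 0x10FFFF)
        return 4;       /* U+10000..U+10FFFF */
    return 0;           /* assumption: out of range maps to 0 */
}
```

A function like this lets a caller size an output buffer before encoding anything.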
int utf8_seq_size(unsigned char c)
Return UTF8 sequence length based on first byte.
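The length of a sequence can be read from the high bits of its first byte. The sketch below shows that classification. sketch_seq_size is a hypothetical name, and returning 0 for bytes that cannot start a sequence is an assumption about error reporting, not documented behavior.

```c
/* Hypothetical stand-in for utf8_seq_size(); not the libusual source. */
static int sketch_seq_size(unsigned char c)
{
    if (c < 0x80)
        return 1;       /* 0xxxxxxx: single-byte (ASCII) */
    if (c < 0xC0)
        return 0;       /* 10xxxxxx: tail byte, cannot start a sequence */
    if (c < 0xE0)
        return 2;       /* 110xxxxx */
    if (c < 0xF0)
        return 3;       /* 1110xxxx */
    if (c < 0xF8)
        return 4;       /* 11110xxx */
    return 0;           /* 0xF8..0xFF never appear in UTF-8 */
}
```

This looks only at the lead byte, so it says how many bytes to expect, not whether the bytes that follow are well-formed.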
int utf8_validate_seq(const char *src, const char *srcend)
Return sequence length if all bytes are valid, 0 otherwise.
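One way to validate a single sequence without decoding it is to check the lead byte, then restrict the allowed range of the second byte, as in the well-formed byte sequence table of the Unicode Standard. The sketch below takes that approach. sketch_validate_seq is a hypothetical stand-in, and how strict the real function is (for example, whether it rejects overlong forms or surrogates) is not stated in this documentation, so the strictness here is an assumption.

```c
/* Hypothetical stand-in for utf8_validate_seq(); not the libusual source. */
static int sketch_validate_seq(const char *src, const char *srcend)
{
    const unsigned char *p = (const unsigned char *)src;
    const unsigned char *end = (const unsigned char *)srcend;
    unsigned char lo = 0x80, hi = 0xBF;     /* default range of second byte */
    int len, i;

    if (p >= end)
        return 0;                           /* empty input */
    if (p[0] < 0x80)
        return 1;                           /* ASCII */

    if (p[0] >= 0xC2 && p[0] <= 0xDF)
        len = 2;                            /* 0xC0, 0xC1 would be overlong */
    else if (p[0] >= 0xE0 && p[0] <= 0xEF)
        len = 3;
    else if (p[0] >= 0xF0 && p[0] <= 0xF4)
        len = 4;                            /* 0xF5.. exceeds U+10FFFF */
    else
        return 0;

    /* Narrow the second byte for the lead bytes that need it. */
    if (p[0] == 0xE0)
        lo = 0xA0;                          /* excludes overlong 3-byte forms */
    else if (p[0] == 0xED)
        hi = 0x9F;                          /* excludes surrogates */
    else if (p[0] == 0xF0)
        lo = 0x90;                          /* excludes overlong 4-byte forms */
    else if (p[0] == 0xF4)
        hi = 0x8F;                          /* excludes values > U+10FFFF */

    if (end - p < len)
        return 0;                           /* truncated sequence */
    if (p[1] < lo || p[1] > hi)
        return 0;
    for (i = 2; i < len; i++) {
        if ((p[i] & 0xC0) != 0x80)
            return 0;                       /* tail byte must be 10xxxxxx */
    }
    return len;
}
```

Returning the length on success means the same call can both validate a sequence and tell the caller how far to advance.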