libusual  0.1
usual/utf8.h File Reference

Low-level UTF8 handling.

Functions

int utf8_get_char (const char **src_p, const char *srcend)
 Parse Unicode codepoint from UTF8 stream.
bool utf8_put_char (unsigned int c, char **dst_p, const char *dstend)
 Write Unicode codepoint as UTF8 sequence.
int utf8_char_size (unsigned int c)
 Return UTF8 seq length based on Unicode codepoint.
int utf8_seq_size (unsigned char c)
 Return UTF8 seq length based on first byte.
int utf8_validate_seq (const char *src, const char *srcend)
 Return sequence length if all bytes are valid, 0 otherwise.

Detailed Description

Low-level UTF8 handling.


Function Documentation

int utf8_get_char ( const char **  src_p,
const char *  srcend 
)

Parse Unicode codepoint from UTF8 stream.

On an invalid UTF8 sequence, returns the negative byte value and increases *src_p by one.

Parameters:
src_p   Location of data pointer. Will be incremented in-place.
srcend  Pointer to end of data.
Returns:
Unicode codepoint or negative byte value on error.
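
The contract described above (codepoint on success, negative byte value plus a one-byte advance on error) can be illustrated with a small stand-in decoder. This is a minimal sketch, not libusual's implementation: the names sketch_get_char and sketch_count are hypothetical, and the choice to reject overlong forms, surrogates, and values above U+10FFFF is an assumption the reference does not state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in with the documented contract; not libusual code. */
static int sketch_get_char(const char **src_p, const char *srcend)
{
    const unsigned char *p = (const unsigned char *)*src_p;
    const unsigned char *end = (const unsigned char *)srcend;
    int c, tails, min, i;

    assert(p < end);                          /* at least one byte must be readable */

    if (p[0] < 0x80) {                        /* 0xxxxxxx: single byte */
        *src_p += 1;
        return p[0];
    }
    if (p[0] >= 0xC0 && p[0] < 0xE0) {        /* 110xxxxx: one tail byte */
        c = p[0] & 0x1F; tails = 1; min = 0x80;
    } else if (p[0] >= 0xE0 && p[0] < 0xF0) { /* 1110xxxx: two tail bytes */
        c = p[0] & 0x0F; tails = 2; min = 0x800;
    } else if (p[0] >= 0xF0 && p[0] < 0xF8) { /* 11110xxx: three tail bytes */
        c = p[0] & 0x07; tails = 3; min = 0x10000;
    } else {
        goto bad;                             /* stray tail byte or invalid lead */
    }
    if (end - p <= tails)
        goto bad;                             /* sequence is truncated */
    for (i = 1; i <= tails; i++) {
        if ((p[i] & 0xC0) != 0x80)
            goto bad;                         /* tail must look like 10xxxxxx */
        c = (c << 6) | (p[i] & 0x3F);
    }
    /* Assumed policy: reject overlong forms, surrogates, and > U+10FFFF. */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        goto bad;
    *src_p += tails + 1;
    return c;
bad:
    *src_p += 1;                              /* documented: step over one byte */
    return -(int)p[0];
}

/* Count decoded codepoints, treating each bad byte as one error. */
static size_t sketch_count(const char *src, const char *srcend, size_t *errors)
{
    size_t n = 0;

    *errors = 0;
    while (src < srcend) {
        if (sketch_get_char(&src, srcend) < 0)
            (*errors)++;
        else
            n++;
    }
    return n;
}
```

Because the pointer always moves forward by at least one byte, a loop such as sketch_count terminates on any input, including one that is entirely invalid.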
bool utf8_put_char ( unsigned int  c,
char **  dst_p,
const char *  dstend 
)

Write Unicode codepoint as UTF8 sequence.

Skips invalid Unicode values without error.

Parameters:
c       Unicode codepoint.
dst_p   Location of dest pointer, will be increased in-place.
dstend  Pointer to end of buffer.
Returns:
false if there is not enough room, true otherwise.
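
The three outcomes described above (written, skipped, no room) can be sketched with a stand-in encoder. This is not libusual's code: sketch_put_char is a hypothetical name, and treating surrogates and values above U+10FFFF as the "invalid Unicode values" is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in with the documented contract; not libusual code. */
static bool sketch_put_char(unsigned int c, char **dst_p, const char *dstend)
{
    char *dst = *dst_p;
    int len;

    assert(dst <= dstend);                  /* write position must be in the buffer */

    /* Assumed meaning of "invalid": skip silently and report success. */
    if (c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return true;

    len = c < 0x80 ? 1 : c < 0x800 ? 2 : c < 0x10000 ? 3 : 4;
    if (dstend - dst < len)
        return false;                       /* not enough room; nothing written */

    if (len == 1) {
        dst[0] = (char)c;
    } else if (len == 2) {
        dst[0] = (char)(0xC0 | (c >> 6));
        dst[1] = (char)(0x80 | (c & 0x3F));
    } else if (len == 3) {
        dst[0] = (char)(0xE0 | (c >> 12));
        dst[1] = (char)(0x80 | ((c >> 6) & 0x3F));
        dst[2] = (char)(0x80 | (c & 0x3F));
    } else {
        dst[0] = (char)(0xF0 | (c >> 18));
        dst[1] = (char)(0x80 | ((c >> 12) & 0x3F));
        dst[2] = (char)(0x80 | ((c >> 6) & 0x3F));
        dst[3] = (char)(0x80 | (c & 0x3F));
    }
    *dst_p = dst + len;                     /* advance only after a full write */
    return true;
}
```

Note that a true return does not by itself mean bytes were written; a caller that needs to detect skipped values can compare *dst_p before and after the call.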
int utf8_char_size ( unsigned int  c)

Return UTF8 seq length based on Unicode codepoint.
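
A length-by-codepoint lookup of this kind follows the standard UTF-8 range boundaries, and is typically used to size an output buffer before encoding. The sketch below is illustrative only: sketch_char_size and sketch_encoded_len are hypothetical names, and returning 0 for values above U+10FFFF is an assumption, since the reference does not say what the library returns there.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in; not libusual code. */
static int sketch_char_size(unsigned int c)
{
    if (c < 0x80)
        return 1;                 /* U+0000..U+007F */
    if (c < 0x800)
        return 2;                 /* U+0080..U+07FF */
    if (c < 0x10000)
        return 3;                 /* U+0800..U+FFFF */
    if (c <= 0x10FFFF)
        return 4;                 /* U+10000..U+10FFFF */
    return 0;                     /* assumption: out of range yields 0 */
}

/* Total bytes needed to encode n codepoints. */
static size_t sketch_encoded_len(const unsigned int *cps, size_t n)
{
    size_t total = 0, i;

    assert(cps != NULL || n == 0);
    for (i = 0; i < n; i++)
        total += (size_t)sketch_char_size(cps[i]);
    return total;
}
```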

int utf8_seq_size ( unsigned char  c)

Return UTF8 seq length based on first byte.
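
Classifying by first byte alone relies on the UTF-8 lead-byte bit patterns; the tail bytes are not inspected. A minimal sketch follows, under the assumption that bytes which cannot start a sequence yield 0 (the reference does not specify this). The names sketch_seq_size and sketch_count_leads are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in; not libusual code. */
static int sketch_seq_size(unsigned char c)
{
    if (c < 0x80)
        return 1;                 /* 0xxxxxxx */
    if (c < 0xC0)
        return 0;                 /* 10xxxxxx: tail byte, cannot start a sequence */
    if (c < 0xE0)
        return 2;                 /* 110xxxxx */
    if (c < 0xF0)
        return 3;                 /* 1110xxxx */
    if (c < 0xF8)
        return 4;                 /* 11110xxx */
    return 0;                     /* assumption: invalid lead byte yields 0 */
}

/* Count sequences by lead byte, stepping one byte over anything unusable. */
static size_t sketch_count_leads(const char *src, const char *srcend)
{
    size_t n = 0;

    assert(src <= srcend);
    while (src < srcend) {
        int len = sketch_seq_size((unsigned char)*src);

        if (len == 0 || srcend - src < len) {
            src++;                /* stray tail, bad lead, or truncated sequence */
            continue;
        }
        src += len;
        n++;
    }
    return n;
}
```

Since only the lead byte is examined, this kind of scan is fast but does not prove the skipped bytes are well-formed.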

int utf8_validate_seq ( const char *  src,
const char *  srcend 
)

Return sequence length if all bytes are valid, 0 otherwise.
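
A length-or-zero validator lends itself to a simple whole-buffer check: advance by the returned length and stop at the first 0. The sketch below is a stand-in rather than libusual's code. The names sketch_validate_seq and sketch_is_valid_utf8 are hypothetical, and the exact definition of "valid" used here (the well-formed byte ranges of the UTF-8 standard, which exclude overlong forms, surrogates, and values above U+10FFFF) is an assumption about what the library checks.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in; not libusual code. */
static int sketch_validate_seq(const char *src, const char *srcend)
{
    const unsigned char *p = (const unsigned char *)src;
    int len, i;

    assert(src <= srcend);
    if (src == srcend)
        return 0;                               /* nothing to validate */

    if (p[0] < 0x80)
        return 1;
    else if (p[0] >= 0xC2 && p[0] < 0xE0)
        len = 2;                                /* 0xC0 and 0xC1 would be overlong */
    else if (p[0] >= 0xE0 && p[0] < 0xF0)
        len = 3;
    else if (p[0] >= 0xF0 && p[0] < 0xF5)
        len = 4;
    else
        return 0;                               /* tail byte or invalid lead */

    if (srcend - src < len)
        return 0;                               /* truncated */
    for (i = 1; i < len; i++)
        if ((p[i] & 0xC0) != 0x80)
            return 0;                           /* tail must be 10xxxxxx */

    if (p[0] == 0xE0 && p[1] < 0xA0)
        return 0;                               /* overlong 3-byte form */
    if (p[0] == 0xED && p[1] > 0x9F)
        return 0;                               /* UTF-16 surrogate range */
    if (p[0] == 0xF0 && p[1] < 0x90)
        return 0;                               /* overlong 4-byte form */
    if (p[0] == 0xF4 && p[1] > 0x8F)
        return 0;                               /* above U+10FFFF */
    return len;
}

/* True if the whole buffer is a run of valid sequences. */
static bool sketch_is_valid_utf8(const char *src, const char *srcend)
{
    while (src < srcend) {
        int len = sketch_validate_seq(src, srcend);

        if (len == 0)
            return false;
        src += len;
    }
    return true;
}
```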