libusual  0.1
usual/regex.h File Reference

POSIX regular expression API, provided by either libc or internally.

Data Structures

struct  regex_t
 Compiled regex.
struct  regmatch_t
 Match location.

Defines

Standard flags to regcomp()
#define REG_EXTENDED
 Use POSIX Extended Regex Syntax instead of Basic Syntax.
#define REG_ICASE
 Do case-insensitive matching.
#define REG_NOSUB
 Report only success or failure; do not fill in match locations.
#define REG_NEWLINE
 Do newline-sensitive matching.
Standard flags to regexec()
#define REG_NOTBOL
 The start of the string is not the beginning of a line, so ^ should not match there.
#define REG_NOTEOL
 The end of the string is not the end of a line, so $ should not match there.
Standard error codes
#define REG_NOMATCH
 Match not found.
#define REG_BADBR
 Bad {} repeat specification.
#define REG_BADPAT
 General problem with regular expression.
#define REG_BADRPT
 Repeat used without preceding non-repeat element.
#define REG_EBRACE
 Syntax error with {}.
#define REG_EBRACK
 Syntax error with [].
#define REG_ECOLLATE
 Bad collation reference.
#define REG_ECTYPE
 Bad character class reference.
#define REG_EESCAPE
 Trailing backslash.
#define REG_EPAREN
 Syntax error with ().
#define REG_ERANGE
 Bad endpoint in range.
#define REG_ESPACE
 No memory.
#define REG_ESUBREG
 Bad subgroup reference.
Other defines
#define RE_DUP_MAX
 Maximum repeat count the user can specify via {}.
Non-standard flags for regcomp()
#define REG_RELAXED_SYNTAX
 Allow a few common non-standard escapes such as \d, \s and \w.
#define REG_RELAXED_MATCHING
 Don't permute groups in an attempt to get the longest match.
#define REG_RELAXED
 Turn on both REG_RELAXED_SYNTAX and REG_RELAXED_MATCHING.

Typedefs

typedef long regoff_t
 Type for offset in match.

Functions

int regcomp (regex_t *rx, const char *re, int flags)
 Compile regex.
int regexec (const regex_t *rx, const char *str, size_t nmatch, regmatch_t pmatch[], int eflags)
 Execute regex on a string.
size_t regerror (int err, const regex_t *rx, char *dst, size_t dstlen)
 Give error description.
void regfree (regex_t *rx)
 Free resources allocated by regcomp().

Detailed Description

POSIX regular expression API, provided by either libc or internally.

The internal regex engine is used only if the OS does not provide <regex.h> (e.g. Windows) or if --with-internal-regex is passed when configuring libusual.

Features of internal regex (uregex).

A simple recursive matcher whose only notable features are small size and POSIX compatibility. It supports both Extended Regular Expressions (ERE) and Basic Regular Expressions (BRE).

Supported syntax

   Both: . * ^ $ [] [[:cname:]]
   ERE: () {} | + ?
   BRE: \(\) \{\} \1-9
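
The difference between the two syntaxes can be sketched with a small helper. The name `matches` is hypothetical, and the example includes the system <regex.h>; when building against libusual the include would presumably be <usual/regex.h>.

```c
#include <regex.h>   /* system header assumed; with libusual: <usual/regex.h> */
#include <stddef.h>

/* Hypothetical helper: returns 1 if str matches pattern compiled with
 * cflags, 0 if it does not, and -1 if the pattern fails to compile. */
static int matches(const char *pattern, int cflags, const char *str)
{
    regex_t rx;
    int rc;

    if (regcomp(&rx, pattern, cflags) != 0)
        return -1;
    /* nmatch is 0, so no match locations are requested. */
    rc = regexec(&rx, str, 0, NULL, 0);
    regfree(&rx);
    return rc == 0 ? 1 : 0;
}
```

With this helper, the BRE pattern `^a\{2\}$` (flags 0) and the ERE pattern `^a{2}$` (flags REG_EXTENDED) should both accept the string "aa".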

With REG_RELAXED_SYNTAX, the following common escapes will be available:

    Both: \b\B\d\D\s\S\w\W
    BRE:  \|
    ERE:  \1-9

With REG_RELAXED_MATCHING, the engine returns the first match found after applying the leftmost-longest rule to each element. It skips the combinatorial search that would be needed to guarantee the longest overall match.

Skipped POSIX features

Global defines

Compatibility


Define Documentation

#define REG_EXTENDED

Use POSIX Extended Regex Syntax instead of Basic Syntax.

#define REG_ICASE

Do case-insensitive matching.

#define REG_NOSUB

Report only success or failure; regexec() does not fill in match locations.
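
In POSIX, REG_NOSUB is meant for callers that only need a yes/no answer. A minimal sketch, assuming the system <regex.h> and a hypothetical helper name `has_digits`:

```c
#include <regex.h>   /* system header assumed; with libusual: <usual/regex.h> */
#include <stddef.h>

/* Hypothetical helper: returns 1 if line contains a run of digits,
 * 0 if it does not, and -1 if the pattern fails to compile. */
static int has_digits(const char *line)
{
    regex_t rx;
    int rc;

    /* REG_NOSUB: only success or failure is wanted, so regexec()
     * is called with nmatch 0 and no regmatch_t array. */
    if (regcomp(&rx, "[0-9]+", REG_EXTENDED | REG_NOSUB) != 0)
        return -1;
    rc = regexec(&rx, line, 0, NULL, 0);
    regfree(&rx);
    return rc == 0 ? 1 : 0;
}
```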

#define REG_NEWLINE

Do newline-sensitive matching: ^ and $ also match after and before a newline inside the string.
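
The effect of REG_NEWLINE under POSIX semantics can be sketched as follows. The helper name `has_line_b` is hypothetical and the system <regex.h> is assumed; this page does not state how completely the internal engine implements the flag.

```c
#include <regex.h>   /* system header assumed; with libusual: <usual/regex.h> */
#include <stddef.h>

/* Hypothetical helper: returns 1 if text contains a line that is exactly
 * "b" according to the given compile flags, 0 if not, -1 on compile error. */
static int has_line_b(const char *text, int cflags)
{
    regex_t rx;
    int rc;

    if (regcomp(&rx, "^b$", cflags) != 0)
        return -1;
    rc = regexec(&rx, text, 0, NULL, 0);
    regfree(&rx);
    return rc == 0 ? 1 : 0;
}
```

For the text "a\nb\nc", the anchors only see the start and end of the whole string unless REG_NEWLINE is added to the flags.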

#define REG_NOTBOL

The start of the string is not the beginning of a line, so ^ should not match there.
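
A typical use of REG_NOTBOL is scanning a string for repeated matches: after the first match, the remaining text no longer starts at the beginning of the line. The sketch below assumes the system <regex.h>, the POSIX regmatch_t members rm_so and rm_eo, and a hypothetical helper name `count_matches`.

```c
#include <regex.h>   /* system header assumed; with libusual: <usual/regex.h> */
#include <stddef.h>

/* Hypothetical helper: count non-overlapping matches of an ERE in str.
 * Returns -1 if the pattern does not compile. */
static int count_matches(const char *pattern, const char *str)
{
    regex_t rx;
    regmatch_t m[1];
    const char *p = str;
    int eflags = 0;
    int count = 0;

    if (regcomp(&rx, pattern, REG_EXTENDED) != 0)
        return -1;

    while (regexec(&rx, p, 1, m, eflags) == 0) {
        count++;
        if (m[0].rm_eo > m[0].rm_so) {
            p += m[0].rm_eo;            /* continue after the match */
        } else if (p[m[0].rm_eo] != '\0') {
            p += m[0].rm_eo + 1;        /* empty match: step one byte */
        } else {
            break;                      /* empty match at end of string */
        }
        /* p now points into the middle of the line, so ^ must not match. */
        eflags = REG_NOTBOL;
    }

    regfree(&rx);
    return count;
}
```

Without REG_NOTBOL on the later calls, a pattern such as `^a` would match again at every restart position.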

#define REG_NOTEOL

The end of the string is not the end of a line, so $ should not match there.

#define REG_NOMATCH

Match not found.

#define REG_BADBR

Bad {} repeat specification.

#define REG_BADPAT

General problem with regular expression.

#define REG_BADRPT

Repeat used without preceding non-repeat element.

#define REG_EBRACE

Syntax error with {}.

#define REG_EBRACK

Syntax error with [].

#define REG_ECOLLATE

Bad collation reference.

#define REG_ECTYPE

Bad character class reference.

#define REG_EESCAPE

Trailing backslash.

#define REG_EPAREN

Syntax error with ().

#define REG_ERANGE

Bad endpoint in range.

#define REG_ESPACE

No memory.

#define REG_ESUBREG

Bad subgroup reference.

#define RE_DUP_MAX

Maximum repeat count the user can specify via {}.

#define REG_RELAXED_SYNTAX

Allow a few common non-standard escapes:

   \b - word boundary
   \B - not a word boundary
   \d - digit
   \D - non-digit
   \s - space
   \S - non-space
   \w - word char
   \W - non-word char
   \/ - /

#define REG_RELAXED_MATCHING

Don't permute groups in an attempt to get the longest match.

May give a minor speed win at the expense of strict POSIX compatibility.

#define REG_RELAXED

Turn on both REG_RELAXED_SYNTAX and REG_RELAXED_MATCHING.


Typedef Documentation

typedef long regoff_t

Type for offset in match.


Function Documentation

int regcomp (regex_t *rx, const char *re, int flags)

Compile regex.

Parameters:
   rx - Pre-allocated regex_t structure to fill.
   re - Regex as zero-terminated string.
   flags - See above for regcomp() flags.

int regexec (const regex_t *rx, const char *str, size_t nmatch, regmatch_t pmatch[], int eflags)

Execute regex on a string.

Parameters:
   rx - Regex previously initialized with regcomp().
   str - Zero-terminated string to match.
   nmatch - Number of elements in the pmatch array.
   pmatch - Array that receives the match locations.
   eflags - Execution flags. Supported flags: REG_NOTBOL, REG_NOTEOL.
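
How nmatch and pmatch work together can be sketched with capture groups: element 0 receives the whole match and elements 1..n the parenthesized groups. The helper name `parse_pair` and the "key=value" input format are hypothetical, and the system <regex.h> with the POSIX rm_so/rm_eo members is assumed.

```c
#include <regex.h>   /* system header assumed; with libusual: <usual/regex.h> */
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: split "key=value" (lowercase key, numeric value)
 * into two caller-provided buffers. Returns 0 on success, -1 otherwise. */
static int parse_pair(const char *str, char *key, size_t keylen,
                      char *val, size_t vallen)
{
    regex_t rx;
    regmatch_t m[3];    /* m[0]: whole match, m[1] and m[2]: the groups */
    int rc = -1;

    if (regcomp(&rx, "^([a-z]+)=([0-9]+)$", REG_EXTENDED) != 0)
        return -1;

    if (regexec(&rx, str, 3, m, 0) == 0) {
        size_t klen = (size_t)(m[1].rm_eo - m[1].rm_so);
        size_t vlen = (size_t)(m[2].rm_eo - m[2].rm_so);

        /* Copy only if both pieces fit, including the terminator. */
        if (klen < keylen && vlen < vallen) {
            memcpy(key, str + m[1].rm_so, klen);
            key[klen] = '\0';
            memcpy(val, str + m[2].rm_so, vlen);
            val[vlen] = '\0';
            rc = 0;
        }
    }

    regfree(&rx);
    return rc;
}
```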

size_t regerror (int err, const regex_t *rx, char *dst, size_t dstlen)

Give error description.

Parameters:
   err - Error code returned by regcomp() or regexec().
   rx - Regex structure used in regcomp() or regexec().
   dst - Destination buffer.
   dstlen - Size of dst.
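
A common pattern is to turn a failed regcomp() into a readable message right away. This is a minimal sketch, assuming the system <regex.h>; the helper name `compile_or_explain` is hypothetical.

```c
#include <regex.h>   /* system header assumed; with libusual: <usual/regex.h> */
#include <stddef.h>

/* Hypothetical helper: compile pattern into rx. On failure, write a
 * description into errbuf; on success, leave errbuf as an empty string.
 * Returns the regcomp() result (0 on success). */
static int compile_or_explain(regex_t *rx, const char *pattern, int cflags,
                              char *errbuf, size_t errlen)
{
    int err = regcomp(rx, pattern, cflags);

    if (err != 0)
        regerror(err, rx, errbuf, errlen);
    else if (errlen > 0)
        errbuf[0] = '\0';
    return err;
}
```

On success the caller owns the compiled regex and is expected to release it with regfree().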

void regfree (regex_t *rx)

Free resources allocated by regcomp().

Parameters:
   rx - Regex previously filled by regcomp().
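
The usual lifecycle is to compile once, execute many times, and free once. A minimal sketch, assuming the system <regex.h>; the helper name `count_matching` is hypothetical.

```c
#include <regex.h>   /* system header assumed; with libusual: <usual/regex.h> */
#include <stddef.h>

/* Hypothetical helper: count how many of the n strings match an ERE.
 * Returns -1 if the pattern does not compile. */
static int count_matching(const char *pattern, const char *strs[], size_t n)
{
    regex_t rx;
    size_t i;
    int count = 0;

    /* Compile once. */
    if (regcomp(&rx, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return -1;

    /* Reuse the compiled regex for every input string. */
    for (i = 0; i < n; i++) {
        if (regexec(&rx, strs[i], 0, NULL, 0) == 0)
            count++;
    }

    /* Free once, after the last regexec() call. */
    regfree(&rx);
    return count;
}
```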