
cjheath/strpp


strpp - StringPlusPlus - pronounced "strip"

A value-oriented C++ library implementing maps, arrays and strings (with slices), raw Unicode processing, Variant type, pattern-matching and parsing. Designed to minimise dynamic memory allocation and memory safety issues, strpp provides efficient but advanced functionality on machines with restricted memory, such as many embedded systems.

Strpp uses copy-on-write to implement value semantics on shared objects (Unicode strings, generic arrays and maps) using thread-safe atomic reference-counting. Values may be passed by copying references, which means you can pass complex structures cheaply without much fear of aliasing, memory leaks, or object lifetime violations.
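The copy-on-write idea can be sketched in standard C++ as follows. This is an illustration only, not strpp's implementation: `CowString` and its members are hypothetical names, and `std::shared_ptr` stands in for the library's own atomic reference counting.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical illustration of copy-on-write; NOT the strpp StrVal API.
class CowString {
public:
    explicit CowString(std::string s = "")
        : body_(std::make_shared<std::string>(std::move(s))) {}

    // Copying shares the body; only the atomic reference count changes.
    CowString(const CowString&) = default;
    CowString& operator=(const CowString&) = default;

    const std::string& str() const { return *body_; }
    bool shares_body_with(const CowString& other) const {
        return body_ == other.body_;
    }

    // Every mutator unshares first, so other holders never see the change.
    void append(const std::string& tail) {
        unshare();
        body_->append(tail);
    }

private:
    void unshare() {
        // If anyone else holds the body, take a private copy before writing.
        if (body_.use_count() > 1)
            body_ = std::make_shared<std::string>(*body_);
    }

    std::shared_ptr<std::string> body_;
};
```

After `CowString b = a;` both objects share one body, and `b.append("++")` leaves `a` unchanged. The sketch assumes that each `CowString` object is itself used from one thread at a time; only the shared body's reference count is updated atomically.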

The Variant class provides type-safe support for passing any data object, as is common in interpreted languages (Perl, Ruby, Python).
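For readers unfamiliar with the idea, the following sketch shows a type-safe "any value" holder built on the C++17 standard library. It is not the strpp Variant class: `Value` and `type_name` are hypothetical names, and strpp's own set of supported types may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <variant>

// Hypothetical stand-in for a dynamically typed value; NOT the strpp Variant API.
using Value = std::variant<std::monostate, bool, std::int64_t, double, std::string>;

// Report which alternative a Value currently holds.
inline std::string type_name(const Value& v) {
    struct Namer {
        std::string operator()(std::monostate) const { return "none"; }
        std::string operator()(bool) const { return "boolean"; }
        std::string operator()(std::int64_t) const { return "integer"; }
        std::string operator()(double) const { return "float"; }
        std::string operator()(const std::string&) const { return "string"; }
    };
    return std::visit(Namer{}, v);
}
```

Access with `std::get_if<T>(&v)` yields a null pointer when the held type is not `T`, so a mismatched access is detected rather than silently reinterpreting the data.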

The <rx> library provides a regular expression implementation using StrVal that is ReDoS-safe (using the Thompson algorithm).

These regular expressions are, however, deprecated in favour of greedy PEG expressions with look-ahead assertions. All PEG operators are in the prefix position, so a pattern can be executed efficiently without a compilation step or memory allocation. These Pegular Expressions are also composed into full (non-regular) PEG grammars, from which a parser generator produces compact table-driven parsers that allocate no memory. The Px compiler also generates grammar documentation using JavaScript and SVG, and will produce TextMate syntax-highlighting patterns for IDEs, as well as text generators to assist implementation of pretty-printer output. Parser template parameters allow capturing parse results, with a generic Abstract Syntax Tree builder for any grammar specified in Px.

Raw Unicode character processing

#include <char_encoding.h>

See Unicode

Rapid inlined encoding and decoding of 32-bit character values in 1 to 6 bytes of UTF-8. Errors in UTF-8 coding are handled gracefully.
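As a rough sketch of what such an encoder and decoder look like, the code below implements the original 1-to-6-byte UTF-8 scheme, which carries up to 31 bits of payload. The names `utf8_encode`, `utf8_decode` and `kBadChar` are hypothetical and are not taken from char_encoding.h. The error policy shown (return a sentinel and consume one byte) is one assumed way of recovering from bad input, and overlong encodings are not rejected here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helpers; NOT the char_encoding.h API.
constexpr std::uint32_t kBadChar = 0xFFFFFFFFu;  // sentinel for malformed input

// Encode cp into buf (at least 6 bytes). Returns the byte count,
// or 0 if cp needs more than 31 bits.
inline std::size_t utf8_encode(std::uint32_t cp, unsigned char* buf) {
    if (cp < 0x80) { buf[0] = static_cast<unsigned char>(cp); return 1; }
    std::size_t len;
    if (cp < 0x800) len = 2;
    else if (cp < 0x10000) len = 3;
    else if (cp < 0x200000) len = 4;
    else if (cp < 0x4000000) len = 5;
    else if (cp < 0x80000000u) len = 6;
    else return 0;
    // Fill continuation bytes from the end, 6 payload bits each.
    for (std::size_t i = len - 1; i > 0; --i) {
        buf[i] = static_cast<unsigned char>(0x80 | (cp & 0x3F));
        cp >>= 6;
    }
    // Lead byte: len one-bits, a zero bit, then the remaining payload.
    buf[0] = static_cast<unsigned char>((0xFF00u >> len) | cp);
    return len;
}

// Decode one character starting at p and store the bytes consumed in *len.
// On a malformed sequence, returns kBadChar with *len == 1 so that the
// caller can skip a single byte and resynchronise.
inline std::uint32_t utf8_decode(const unsigned char* p, std::size_t* len) {
    const unsigned char lead = p[0];
    *len = 1;
    if (lead < 0x80) return lead;
    std::size_t n;
    if ((lead & 0xE0) == 0xC0) n = 2;
    else if ((lead & 0xF0) == 0xE0) n = 3;
    else if ((lead & 0xF8) == 0xF0) n = 4;
    else if ((lead & 0xFC) == 0xF8) n = 5;
    else if ((lead & 0xFE) == 0xFC) n = 6;
    else return kBadChar;  // stray continuation byte, or 0xFE/0xFF
    std::uint32_t cp = lead & (0x7Fu >> n);
    for (std::size_t i = 1; i < n; ++i) {
        // A NUL terminator fails this check, so we never read past it.
        if ((p[i] & 0xC0) != 0x80) return kBadChar;
        cp = (cp << 6) | (p[i] & 0x3Fu);
    }
    *len = n;
    return cp;
}
```

For example, U+20AC encodes to the three bytes E2 82 AC, and decoding those bytes returns 0x20AC with a length of 3.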

#include <charpointer.h>

#include <char_ptr.h>

#include <utf8_ptr.h>

CharPointer selects either 8-bit (char_ptr) or UTF-8 (utf8_ptr) boxed character pointers. Each kind comes in three forms: unguarded pointers, pointers that will not advance past a NUL, and pointers that also will not retreat before the start. All of them promote correct code by restricting the unguarded pointer behaviour of the C and C++ languages.
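The guarded behaviour can be pictured with a small 8-bit example. `GuardedCharPtr` is a hypothetical class written for this illustration; it is not the interface of char_ptr.h or utf8_ptr.h.

```cpp
#include <cassert>

// Hypothetical guarded pointer; NOT the char_ptr.h API.
// It refuses to advance past the terminating NUL or to retreat
// before the position it was constructed with.
class GuardedCharPtr {
public:
    explicit GuardedCharPtr(const char* s) : start_(s), cur_(s) {}

    char operator*() const { return *cur_; }

    GuardedCharPtr& operator++() {
        if (*cur_ != '\0') ++cur_;   // stop at the NUL
        return *this;
    }
    GuardedCharPtr& operator--() {
        if (cur_ > start_) --cur_;   // stop at the start
        return *this;
    }

    bool at_start() const { return cur_ == start_; }
    bool at_end() const { return *cur_ == '\0'; }

private:
    const char* start_;
    const char* cur_;
};
```

Incrementing such a pointer any number of times over "ab" leaves it on the NUL, and decrementing it at the start is a no-op, so a scanning loop with an off-by-one error cannot walk outside the string.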

Error and ErrNum type

#include <error.h>

See Error

Unicode strings: the StrVal class

#include <strval.h>

See StrVal

Array with slices

#include <array.h>

See Array

COWMap, copy-on-write map template

#include <cowmap.h>

See CowMap

Variant Data Type

#include <variant.h>

See Variant

Prefix Regular Expression pattern matching

#include <pegexp.h>

See Pegexp

PEG parsing

#include <peg.h>

See Peg

Px, a PEG parser generator

See Px

Cross-platform C++ Threading support

#include <thread.h>

#include <lockfree.h>

#include <condition.h>

Thread is a base class which requires a virtual run() method and provides suspend, resume, join, joinAny, yield(ms), etc. lockfree.h implements an atomic Latch class allowing construction of lock-free code, and condition.h provides a cross-platform implementation of condition variables.
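The base-class-with-virtual-run() pattern looks roughly like this when built on `std::thread`. `WorkerThread` and `Counter` are hypothetical names used only for this sketch. It does not reproduce the thread.h interface, and it omits suspend, resume, joinAny and yield.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical base class; NOT the thread.h API.
class WorkerThread {
public:
    virtual ~WorkerThread() { join(); }

    // Launch run() on a new thread.
    void start() { worker_ = std::thread([this] { run(); }); }

    // Wait for run() to return; safe to call more than once.
    void join() {
        if (worker_.joinable()) worker_.join();
    }

protected:
    virtual void run() = 0;

private:
    std::thread worker_;
};

// A subclass supplies the work by overriding run().
class Counter : public WorkerThread {
public:
    explicit Counter(int limit) : limit_(limit) {}

    // Join here so run() never touches a partly destroyed object.
    ~Counter() override { join(); }

    int total() const { return total_.load(); }

protected:
    void run() override {
        for (int i = 1; i <= limit_; ++i) total_ += i;
    }

private:
    int limit_;
    std::atomic<int> total_{0};
};
```

A caller constructs `Counter c(100)`, calls `c.start()` and then `c.join()`, after which `c.total()` holds the sum computed on the worker thread. Joining in the derived destructor is a deliberate choice in this sketch: by the time the base destructor runs, the derived members that run() uses are already gone.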

Regular Expressions

A ReDoS-resistant Thompson-style regexp compiler/interpreter using StrVal.

#include <strregex.h>

See Rx

LICENSE

The MIT License. See the LICENSE file for details.

About

C++ Unicode library for Array and String, Regexp and PEG parsing, using ref-counted slices with value semantics
