mxmString Class Reference
[modularMX Runtime Platform Core]

String of characters, featuring methods greatly facilitating string handling.

#include <mxmString.h>

Inheritance diagram for mxmString: mxmObject

Public Member Functions

 mxmString ()
 Constructs empty "" string.
 mxmString (const char *txt)
 The text is copied.
 mxmString (const mxmString &txt)
 Copy constructor, text is copied.
 mxmString (int numba)
 Number is converted using std::sprintf(buffer, "%d", numba).
mxmString & operator= (const char *txt)
mxmString & operator= (const mxmString &txt)
mxmString & operator+= (const mxmString &txt)
mxmString & operator+= (const char *txt)
bool operator== (const mxmString &txt) const
bool operator!= (const mxmString &txt) const
bool operator<= (const mxmString &txt) const
 Is-prefix-of operator.
mxmStringList operator, (const mxmString &other_string)
 Returns an mxmStringList with the string as first entry and the other_string as second entry.
void setText (const char *txt)
 The text is copied.
const char * text (void) const
 Only valid until the next operation on the string.
void setNull (void)
 Explicitly makes the string a null string.
bool isNull (void) const
 Tells whether or not the string is a null string.
int length (void) const
 Returns the number of characters currently in the string.
void setLength (int len, char fill_character= ' ')
 Either removes trailing characters or appends fill characters to make the string have the specified length.
int characterPosition (char c) const
 Returns the position of the first occurrence of the specified character, counting from 0, or -1 in case the character is not in the string.
void append (const mxmString &txt)
 Appends the specified string to the end of the current string.
bool split (mxmString &left, mxmString &right, const mxmString &separators, bool search_from_behind=false) const
 Non-destructively splits the string in two parts.
bool splitUsingSeparatorStrings (mxmString &left, mxmString &right, const mxmStringList &separator_strings)
 Non-destructively splits the string in two parts, using string-valued separators.
void trim (const mxmString &delimiters)
 Destructively removes delimiter characters from either side of the string.
void purgeCharacters (const mxmString &characters_to_purge, bool invert_logic=false, const char *replacement_character=0)
 Removes from the string all occurrences of the characters specified.
void replace (mxmString const &search, mxmString const &replacement)
 Searches for all occurrences of search and replaces them by replacement.
mxmString HTMLEscaped (bool const strict=false) const
 Escapes delicate characters when string is to appear on HTML web page.
void escapeHTML (bool const strict=false)
 Same as HTMLEscaped() but escapes the string on which it is called upon.
mxmString urlPercentSymbolEscaped (const mxmString &characters_to_escape) const
 Returns a version of the string where the characters_to_escape are replaced by the respective %xy substitutions used in URL encoding.
void escapeURLPercentSymbol (mxmString const &characters_to_escape)
 Same as urlPercentSymbolEscaped() but works on the string on which it is called upon.
bool unescapeURLPercentSymbols ()
 Unescapes %xy symbols as used in URL encoding.
mxmString urlPercentSymbolsUnescaped (void) const
 Returns version of string where %xy symbols as used in URLs are unescaped.
bool unescapeURL ()
 Unescapes URL-like character substitutions.
mxmString urlUnescaped (void) const
 Returns URL unescaped version of the string.
mxmString base64 (void) const
 Encodes the string as Base 64.
mxmString left (int character_num) const
 Returns the first character_num characters as substring.
mxmString mid (int character_start, int character_len) const
 Starting from position character_start, returns character_len characters as mxmString.
mxmString left (const mxmString &valid_characters) const
 Returns the longest prefix exclusively consisting of the specified valid characters.
int findSubString (mxmString const &search, int start_from=0) const
 Returns the position of the first occurrence of search at or after position start_from in the string upon which this method is invoked.
bool subString (const mxmString &another_string) const
 Tells whether or not the specified string is a substring of the string upon which this method is invoked.
mxm::smart< mxmStringList > tokenize (const mxmString &separators, const mxmString &characters_to_trim="") const
 Splits the string into tokens, honoring the separator characters specified.
mxm::smart< mxmStringList > tokenizeUsingSeparatorStrings (const mxmStringList &separator_strings, const mxmString &characters_to_trim="") const
 Splits the string into tokens, using string-valued separators.
bool toInt (int &target_int)
 Converts the string to an integer; returns false in case the string is empty or null, and true otherwise.

Static Public Member Functions

static mxmString nullString (void)
 Returns a null-string.
static bool areEqualIgnoringCase (const mxmString &txt1, const mxmString &txt2)
 Case-insensitively compares two strings.

Private Attributes

char * TextBuffer
int BufferedChar
bool CharBuffered
void * StableABIDataExtension

Static Private Attributes

static const char * Base64EncodeTable

Detailed Description

String of characters, featuring methods greatly facilitating string handling.

We introduced this class as a replacement for the standard C++ string processing facilities we abandoned with libstdc++. You can do all sorts of nifty stuff to strings, such as:

     mxmString test_txt, another_txt,
               key, value;

     test_txt.setText("fudel");                   // set text
     test_txt = "funz";                           // assignment operator
     another_txt = test_txt;                      // deep copy
     test_txt.append(another_txt);                // append method
     test_txt.append("4567");                     //   -- ditto --
     another_txt = another_txt + test_txt;        // concatenation operator
     another_txt = test_txt + "one-two-three";    //   -- ditto --
     another_txt = "one-two-three" + test_txt;    //   -- ditto --
     another_txt += test_txt;                     // appending operator
     another_txt += "<<<<";                       //   -- ditto --
     std::printf("txt=%s\n", test_txt.text());    // actually use text
     std::printf("len=%d\n", test_txt.length());  // get string length
     test_txt.split(key, value, "|");             // split string in two
     key.trim(" \r\n"); value.trim(" \r\n");      // trim whitespaces and stuff
     std::printf("key=%s, value=%s\n",            // process key/value pair
                 key.text(), value.text());
     std::printf("Pos of char '1' is %d\n",       // locate a character
                 test_txt.characterPosition('1'));

Note that this class is intended for strings, not for larger amounts of text: the performance of its methods degrades as the number of stored characters grows.

Empty Strings and null-Strings
Internally, the class distinguishes between an empty string and a null-string. To the user, however, both variants behave exactly the same way, so you typically won't notice the difference unless you look for it by calling isNull(). Note that null-strings are perfectly legal and are destroyed properly when deleted. Apart from historical considerations, the null-string is interesting because it provides a special state that can be exploited and given meaning in certain scenarios, such as with mxmPerlStyleHash objects.
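
The distinction can be pictured with a small stand-in type. This is only a sketch of the described behavior, not the SDK class: the name NullableText, the makeNull() factory and the flag-based representation are assumptions made for illustration.

```cpp
#include <string>

// Hypothetical stand-in for illustration only; this is not the SDK class.
class NullableText {
public:
    NullableText() : is_null_(false) {}   // empty "" string, not null
    explicit NullableText(const std::string& s) : text_(s), is_null_(false) {}

    // Builds a value in the special null state (assumed factory name).
    static NullableText makeNull() {
        NullableText t;
        t.is_null_ = true;
        return t;
    }

    bool isNull() const { return is_null_; }
    int length() const { return static_cast<int>(text_.size()); }

    // The null flag is ignored on purpose: null compares equal to empty.
    bool operator==(const NullableText& other) const { return text_ == other.text_; }

private:
    std::string text_;
    bool is_null_;
};
```

In this sketch only isNull() reveals the difference; length and comparison treat both states alike, which is what lets a container use the null state as a "no value" marker.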

Author:
[khe] Kai Hergenroether


Member Function Documentation

bool mxmString::areEqualIgnoringCase (const mxmString &txt1, const mxmString &txt2) [static]

Case-insensitively compares two strings.

null-strings behave like empty strings.

int mxmString::findSubString (mxmString const &search, int start_from=0) const

Returns the position of the first occurrence of search at or after position start_from in the string upon which this method is invoked.

If start_from is omitted, the search starts from the beginning of the string. The first character is at position 0.

Returns:
  • -1 if the search string is not found or empty.
  • otherwise the position of the first occurrence at or after position start_from
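
The return contract above can be sketched on top of std::string. The helper name find_sub_string is hypothetical, and treating a negative start_from as "not found" is an assumption the documentation does not confirm.

```cpp
#include <string>

// Hypothetical helper mirroring the documented contract of findSubString().
int find_sub_string(const std::string& haystack, const std::string& search,
                    int start_from = 0) {
    // An empty search string yields -1 (documented); rejecting a negative
    // start position is an assumption of this sketch.
    if (search.empty() || start_from < 0) return -1;
    const std::string::size_type pos =
        haystack.find(search, static_cast<std::string::size_type>(start_from));
    return pos == std::string::npos ? -1 : static_cast<int>(pos);
}
```

For example, searching "one-two-one" for "one" gives 0, and the same search with start_from set to 1 gives 8.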

mxmString mxmString::HTMLEscaped (bool const strict=false) const

Escapes delicate characters when string is to appear on HTML web page.

Note: There is no corresponding unescape function as unescaping is done through the browser's rendering engine.

This function basically converts quotation marks and the opening angle bracket (<) to their equivalent HTML entities. Escaping quotation marks prevents HTML attribute quotes from being confused with quotes in the content; escaping the angle bracket prevents content from being interpreted as an HTML element.

In addition to the above, the strict version escapes the ampersand (&), so that content which may contain HTML entities is not interpreted by the browser engine.

Usage notes

Use HTMLEscaped() to escape user input that is printed out on an HTML page. This prevents XSS (execution of arbitrary JavaScript code).

The strict option must be used to prevent browser engine from interpreting HTML entities and as such changing the content of the string, e.g. for

Rules:

  • Non-strict: escape user-supplied data (e.g. form field input) as soon as it is to be processed (e.g. printed on a web page, used in attributes of HTML elements, fed to i18n functions).
  • Non-strict: escape the output of i18n functions if the string is to be used as the value of an attribute of an HTML element (e.g. the value attribute of an INPUT form element), to prevent confusion of quotation marks while still allowing HTML entities to be used.
  • Strict: escape textual content that has to be transferred literally (see examples above).

Remember: Non-strict HTML escape can be applied repeatedly without mangling the string!
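
The difference between the two modes can be sketched as follows. The function html_escaped is a hypothetical std::string helper, and the exact entity set (double quote to &quot;, < to &lt;, and & to &amp; in strict mode only) is an assumption derived from the description above, not the SDK's actual table.

```cpp
#include <string>

// Hypothetical sketch of non-strict vs. strict escaping (assumed entity set).
std::string html_escaped(const std::string& in, bool strict = false) {
    std::string out;
    out.reserve(in.size());
    for (char c : in) {
        switch (c) {
            case '"': out += "&quot;"; break;
            case '<': out += "&lt;";   break;
            case '&':
                // Only the strict mode touches ampersands.
                if (strict) out += "&amp;"; else out += c;
                break;
            default:  out += c;
        }
    }
    return out;
}
```

Because the non-strict mode leaves ampersands alone, its output contains no character that it would escape again, which is why repeated application is harmless. The strict mode would turn &quot; into &amp;quot; on a second pass.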

mxmString mxmString::mid (int character_start, int character_len) const

Starting from position character_start, returns character_len characters as mxmString.

First character is at position 0.

If character_start is negative, starts that far from the end of the string.
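
The negative-start rule can be sketched with std::string. The free function mid below is hypothetical; clamping an out-of-range start and returning an empty string for non-positive lengths are assumptions, since the documentation does not describe those cases.

```cpp
#include <string>

// Hypothetical sketch of mid() semantics on std::string.
std::string mid(const std::string& s, int character_start, int character_len) {
    const int n = static_cast<int>(s.size());
    if (character_start < 0) character_start += n;  // count from the end
    if (character_start < 0) character_start = 0;   // assumption: clamp to start
    if (character_start >= n || character_len <= 0) return std::string();
    // substr() itself limits the length to the remaining characters.
    return s.substr(character_start, character_len);
}
```

With "abcdef", a start of 1 and length 3 gives "bcd", while a start of -2 and length 2 gives "ef".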

bool mxmString::operator<= (const mxmString &txt) const

Is-prefix-of operator.

An empty or null-string is a prefix of every string.
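
A prefix test of this kind can be sketched as below. The helper is_prefix_of is hypothetical, and the reading that a <= b means "a is a prefix of b" is an assumption based on the operator's brief description.

```cpp
#include <string>

// Hypothetical sketch: true if candidate is a prefix of full.
bool is_prefix_of(const std::string& candidate, const std::string& full) {
    return candidate.size() <= full.size() &&
           full.compare(0, candidate.size(), candidate) == 0;
}
```

An empty candidate has size 0, so the comparison trivially succeeds for any full string.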

bool mxmString::operator== (const mxmString &txt) const

null-strings behave like empty strings.

void mxmString::replace (mxmString const &search, mxmString const &replacement)

Searches for all occurrences of search and replaces them by replacement.

Relies on findSubString() and as such does not replace anything if search string is empty.
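
A find-then-replace loop of this shape might look as follows on std::string. The helper replace_all is hypothetical, and continuing the search after the inserted replacement (so that replaced text is not rescanned) is an assumption about the behavior.

```cpp
#include <string>

// Hypothetical sketch of a replace-all loop built on a substring search.
void replace_all(std::string& s, const std::string& search,
                 const std::string& replacement) {
    if (search.empty()) return;  // an empty search string replaces nothing
    std::string::size_type pos = 0;
    while ((pos = s.find(search, pos)) != std::string::npos) {
        s.replace(pos, search.size(), replacement);
        pos += replacement.size();  // assumption: do not rescan the replacement
    }
}
```

Skipping past the replacement keeps the loop from running forever when the replacement contains the search string, as in replacing "-" by "--".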

void mxmString::setText (const char *txt)

The text is copied.

bool mxmString::split (mxmString &left, mxmString &right, const mxmString &separators, bool search_from_behind=false) const

Non-destructively splits the string in two parts.

The split will be performed at the position of the first occurrence of one of the separators. If search_from_behind is set to true, then the split will be performed at the last separator occurrence.

Returns:
true in case a separator was found and the split was performed, and false otherwise. If the string could not be split, it is copied to left and right is set to "".
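
The split and its return contract can be sketched on std::string. The helper split_once is hypothetical; dropping the separator character from both parts is an assumption that the description does not state explicitly.

```cpp
#include <string>

// Hypothetical sketch of split(): separators is a set of single characters.
bool split_once(const std::string& s, std::string& left, std::string& right,
                const std::string& separators, bool search_from_behind = false) {
    const std::string::size_type pos = search_from_behind
        ? s.find_last_of(separators)
        : s.find_first_of(separators);
    if (pos == std::string::npos) {
        const std::string whole = s;  // copy first in case left aliases s
        left = whole;
        right.clear();
        return false;
    }
    const std::string l = s.substr(0, pos);
    const std::string r = s.substr(pos + 1);  // assumption: separator is dropped
    left = l;
    right = r;
    return true;
}
```

Splitting "a|b|c" at "|" gives "a" and "b|c"; with search_from_behind set, it gives "a|b" and "c".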

bool mxmString::splitUsingSeparatorStrings (mxmString &left, mxmString &right, const mxmStringList &separator_strings)

Non-destructively splits the string in two parts, using string-valued separators.

The split will be performed at the position of the first occurrence of one of the separator strings.

Returns:
true in case a separator was found and the split was performed, and false otherwise. If the string could not be split, it is copied to left and right is set to "".
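
With string-valued separators, the split position is the earliest match of any separator. A sketch under assumptions: the helper name, the use of std::vector in place of mxmStringList, ignoring empty separators, and letting the first listed separator win a tie are all choices made for illustration.

```cpp
#include <string>
#include <vector>

// Hypothetical sketch: split at the earliest occurrence of any separator string.
bool split_using_separator_strings(const std::string& s, std::string& left,
                                   std::string& right,
                                   const std::vector<std::string>& separator_strings) {
    std::string::size_type best_pos = std::string::npos;
    std::string::size_type best_len = 0;
    for (const std::string& sep : separator_strings) {
        if (sep.empty()) continue;  // assumption: empty separators are ignored
        const std::string::size_type pos = s.find(sep);
        if (pos < best_pos) {       // npos is the largest value, so misses never win
            best_pos = pos;
            best_len = sep.size();
        }
    }
    if (best_pos == std::string::npos) {
        const std::string whole = s;
        left = whole;
        right.clear();
        return false;
    }
    const std::string l = s.substr(0, best_pos);
    const std::string r = s.substr(best_pos + best_len);
    left = l;
    right = r;
    return true;
}
```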

bool mxmString::subString (const mxmString &another_string) const

Tells whether or not the specified string is a substring of the string upon which this method is invoked.

subString() conforms to the GLIBC string search strstr().

Returns:
true if another_string is empty.

bool mxmString::toInt (int &target_int)

Converts the string to an integer and stores the result in target_int. Returns false in case the string is empty or null, and true otherwise.

The value stored in target_int is either the atoi() result of a non-empty, non-null string, or 0.
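
One reading of this contract, sketched on std::string: the helper to_int is hypothetical, and the interpretation that target_int receives the atoi() result (or 0 for an empty string) is an assumption.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical sketch of the toInt() contract as read above.
bool to_int(const std::string& s, int& target_int) {
    if (s.empty()) {
        target_int = 0;
        return false;
    }
    target_int = std::atoi(s.c_str());  // atoi() yields 0 for non-numeric text
    return true;
}
```

Under this reading, a true result only says that the string was non-empty: a string such as "abc" still returns true and stores 0, so callers that need validation have to check the text themselves.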

mxm::smart< mxmStringList > mxmString::tokenize (const mxmString &separators, const mxmString &characters_to_trim="") const

Splits the string into tokens, honoring the separator characters specified.

The method returns a list of mxmString s representing the tokens generated from the string. The whole data structure is dynamically generated on the heap.

The function will never generate empty token strings.
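
A tokenizer with these properties can be sketched on std::string. The free function tokenize and its std::vector return type are stand-ins for the SDK's mxm::smart< mxmStringList >; trimming each token before the emptiness check is an assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical sketch: split at separator characters, trim, drop empty tokens.
std::vector<std::string> tokenize(const std::string& s,
                                  const std::string& separators,
                                  const std::string& characters_to_trim = "") {
    std::vector<std::string> tokens;
    std::string::size_type start = 0;
    while (start <= s.size()) {
        std::string::size_type end = s.find_first_of(separators, start);
        if (end == std::string::npos) end = s.size();
        const std::string token = s.substr(start, end - start);
        // Position of the first character that survives trimming, if any.
        const std::string::size_type first =
            token.find_first_not_of(characters_to_trim);
        if (first != std::string::npos) {
            const std::string::size_type last =
                token.find_last_not_of(characters_to_trim);
            tokens.push_back(token.substr(first, last - first + 1));
        }
        start = end + 1;
    }
    return tokens;
}
```

Tokens that are empty, or that consist only of characters to trim, never reach the result, so consecutive separators do not produce empty entries.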

mxm::smart< mxmStringList > mxmString::tokenizeUsingSeparatorStrings (const mxmStringList &separator_strings, const mxmString &characters_to_trim="") const

Splits the string into tokens, using string-valued separators.

Note that analogously to the character-valued separator version of split(), this method will not produce any empty tokens for concatenations of multiple separator strings.

bool mxmString::unescapeURL ()

Unescapes URL-like character substitutions.

Currently, URL unescaping does the following:

  • it resolves + characters to spaces
  • it unescapes the URL-style percent symbols of the form %xy
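
The two steps above can be sketched as a single pass over a std::string. The names unescape_url and hex_value are hypothetical, and returning false while leaving the string untouched on a malformed %xy sequence is an assumption about the error behavior.

```cpp
#include <string>

// Returns the numeric value of a hex digit, or -1 if c is not one.
static int hex_value(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical sketch: '+' becomes a space, %xy becomes the byte 0xXY.
bool unescape_url(std::string& s) {
    std::string out;
    for (std::string::size_type i = 0; i < s.size(); ++i) {
        if (s[i] == '+') {
            out += ' ';
        } else if (s[i] == '%') {
            if (i + 2 >= s.size()) return false;  // truncated escape
            const int hi = hex_value(s[i + 1]);
            const int lo = hex_value(s[i + 2]);
            if (hi < 0 || lo < 0) return false;   // not two hex digits
            out += static_cast<char>(hi * 16 + lo);
            i += 2;
        } else {
            out += s[i];
        }
    }
    s = out;  // only committed on success (assumption)
    return true;
}
```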

bool mxmString::unescapeURLPercentSymbols ()

Unescapes %xy symbols as used in URL encoding.

Returns:
true on success.


Member Data Documentation

const char * mxmString::Base64EncodeTable [static, private]
 

Initial value:

     "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
     "abcdefghijklmnopqrstuvwxyz"
     "0123456789"
     "+/"

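A table of this form is the standard Base64 alphabet: each 6-bit group of the input indexes one character. The sketch below shows how such a table is typically used; the function base64_encode is hypothetical, and padding the output with '=' follows the common convention rather than anything stated for mxmString::base64().

```cpp
#include <string>

// Hypothetical sketch of Base64 encoding driven by the 64-character table.
std::string base64_encode(const std::string& in) {
    static const char table[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        "abcdefghijklmnopqrstuvwxyz"
        "0123456789"
        "+/";
    std::string out;
    for (std::string::size_type i = 0; i < in.size(); i += 3) {
        // Pack up to three bytes into a 24-bit group, missing bytes count as 0.
        unsigned n = static_cast<unsigned char>(in[i]) << 16;
        if (i + 1 < in.size()) n |= static_cast<unsigned char>(in[i + 1]) << 8;
        if (i + 2 < in.size()) n |= static_cast<unsigned char>(in[i + 2]);
        out += table[(n >> 18) & 63];
        out += table[(n >> 12) & 63];
        // Assumption: incomplete groups are padded with '='.
        out += (i + 1 < in.size()) ? table[(n >> 6) & 63] : '=';
        out += (i + 2 < in.size()) ? table[n & 63] : '=';
    }
    return out;
}
```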

Generated on Fri Jun 29 17:21:05 2007 for MxPEG SDK by  doxygen 1.4.6