C++ bindings for cgul_microxml
#include <cgul_microxml_cxx.h>
Public Member Functions
| cgul_microxml_cxx () |
| cgul_microxml_cxx (const char *fname) |
| cgul_microxml_cxx (FILE *f) |
| cgul_microxml_cxx (const char *buffer, size_t buffer_size) |
| cgul_microxml_cxx (cgul_stream_reader_t sr) |
virtual | ~cgul_microxml_cxx () |
virtual void | open_fname (const char *fname) |
virtual void | open_file (FILE *f) |
virtual void | open_memory (const char *buffer, size_t buffer_size) |
virtual void | close () |
virtual cgul_stream_reader_t | get_stream_reader () |
virtual void | set_stream_reader (cgul_stream_reader_t sr) |
virtual const char * | get_presentation_name () |
virtual void | set_presentation_name (const char *presentation_name) |
virtual size_t | get_maximum_entity_length () |
virtual void | set_maximum_entity_length (size_t entity_length_max) |
virtual size_t | get_line_number () |
virtual size_t | get_column_number () |
virtual void | read_next_token () |
virtual bool | is_document_start () |
virtual bool | is_document_end () |
virtual bool | is_element_start () |
virtual bool | is_element_end () |
virtual bool | is_text () |
virtual bool | is_comment () |
virtual const char * | get_element_name () |
virtual cgul_string_t | get_attribute_value (const char *attribute_name) |
virtual cgul_hash_t | get_attributes () |
virtual cgul_hash_t | take_attributes () |
virtual void | delete_attributes (cgul_hash_t attributes) |
virtual bool | is_empty_element () |
virtual cgul_string_t | get_text () |
virtual bool | is_white_space () |
virtual bool | get_skip_white_space () |
virtual void | set_skip_white_space (int skip) |
virtual size_t | get_maximum_text_length () |
virtual void | set_maximum_text_length (size_t text_length_max) |
virtual cgul_string_t | get_comment () |
virtual bool | get_skip_comments () |
virtual void | set_skip_comments (int skip) |
virtual size_t | get_maximum_comment_length () |
virtual void | set_maximum_comment_length (size_t comment_length_max) |
virtual bool | get_skip_prolog_or_epilog_tokens () |
virtual void | set_skip_prolog_or_epilog_tokens (int skip) |
virtual cgul_microxml_t | get_obj () const |
virtual cgul_microxml_t | take_obj () |
virtual void | set_obj (cgul_microxml_t rhs) |
Static Public Member Functions
static char * | escape (const char *s) |
This class provides the C++ bindings for C cgul_microxml
objects. The main purpose of this class is to convert the C-style function calls and exception handling in cgul_microxml
into C++-style function calls and exception handling.
|
inline |
Default constructor for cgul_microxml_cxx
. cgul_microxml__set_stream_reader()
should be called before calling read_next_token()
. If an error occurs, an exception is thrown.
References cgul_microxml__new().
|
inline |
Construct a new cgul_microxml_cxx
instance and call open_fname()
passing it fname
. The file will be closed when this instance is deleted. If an error occurs, an exception is thrown.
[in] | fname | file name |
References cgul_microxml__new_from_fname().
|
inline |
Construct a new cgul_microxml_cxx
instance and call open_file()
passing it f
. The class does not take ownership of f
. Thus, the client is still responsible for calling fclose()
on it. If an error occurs, an exception is thrown.
[in] | f | file |
References cgul_microxml__new_from_file().
|
inline |
Construct a new cgul_microxml_cxx
instance and call open_memory()
passing it buffer
and buffer_size
. This class does not take ownership of buffer
so the client is still responsible for freeing buffer
if necessary. If an error occurs, an exception is thrown.
[in] | buffer | memory buffer |
[in] | buffer_size | size of buffer in bytes |
References cgul_microxml__new_from_memory().
|
inline |
Construct a new cgul_microxml_cxx
instance and call set_stream_reader()
passing it sr
. If an error occurs, an exception is thrown.
[in] | sr | stream reader |
References cgul_microxml__new_from_stream_reader().
|
inlinevirtual |
Destructor.
References cgul_microxml__delete().
|
inlinestatic |
Static method used to escape the special MicroXML characters '&', '<', '>', '"', and "'" in the string s
. The client is responsible for calling free()
on the pointer returned. If an error occurs, an exception is thrown.
[in] | s | UTF-8 input string |
References cgul_microxml__escape().
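The effect of escaping can be sketched without the library. The function below is an illustration only, not the implementation of cgul_microxml__escape(): the name escape_sketch is hypothetical, and the sketch assumes the five characters are replaced with the named references that MicroXML defines (amp, lt, gt, quot, apos). Whether the library emits named or numeric references is not stated here. All five characters are ASCII, so the multi-byte sequences of a UTF-8 string pass through unchanged.

```cpp
#include <cassert>
#include <string>

// Illustration only: NOT the library's implementation.
// Replaces '&', '<', '>', '"', and '\'' with named character references.
std::string escape_sketch(const char* s)
{
    assert(s != nullptr);  // assumption: the caller never passes a null pointer
    std::string out;
    for (const char* p = s; *p != '\0'; ++p) {
        switch (*p) {
        case '&':  out += "&amp;";  break;
        case '<':  out += "&lt;";   break;
        case '>':  out += "&gt;";   break;
        case '"':  out += "&quot;"; break;
        case '\'': out += "&apos;"; break;
        default:   out += *p;       break;  // every other byte is copied as-is
        }
    }
    return out;
}
```

Unlike escape(), this sketch returns a std::string, so there is nothing for the caller to free().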
|
inlinevirtual |
Open the MicroXML file with file name fname
. If a file or memory buffer is already open, it is closed before attempting to open the new file. The new file will be closed when this instance is deleted. If an error occurs, an exception is thrown.
[in] | fname | file name |
References cgul_microxml__open_fname().
|
inlinevirtual |
Open the MicroXML file f
. If a file or memory buffer is already open, it is closed before attempting to open the new file. This class does not take ownership of f
. Thus, the client is still responsible for calling fclose()
on f
. If an error occurs, an exception is thrown.
[in] | f | input file |
References cgul_microxml__open_file().
|
inlinevirtual |
Open the MicroXML file contained in the memory buffer buffer
holding buffer_size
bytes (not Unicode characters). If a file or memory buffer is already open, it is closed before attempting to open the new memory buffer. This class does not take ownership of buffer
. Thus, the client is still responsible for freeing the buffer if necessary. This method does not alter buffer
, but buffer
should not be changed while it is being used by this class. If an error occurs, an exception is thrown.
[in] | buffer | memory buffer |
[in] | buffer_size | buffer size in bytes (not Unicode characters) |
References cgul_microxml__open_memory().
|
inlinevirtual |
Close the open MicroXML file (if any). If an error occurs, an exception is thrown.
References cgul_microxml__close().
|
inlinevirtual |
Return the stream reader. If no stream reader is currently selected, NULL
is returned.
References cgul_microxml__get_stream_reader().
|
inlinevirtual |
Set the stream reader sr
. This class does not take ownership of sr
. Thus, the client is still responsible for deleting it. If sr
is NULL
, the old stream reader will no longer be used making it safe to delete the old stream reader without having to set a new one. If an error occurs, an exception is thrown.
The only constraint on sr
is that it must yield Unicode characters. If the stream is UTF-8 or starts with a Unicode Byte-Order Mark (BOM), cgul_stream_reader_cxx
should automatically detect the correct decoder to use; otherwise, manually set sr
to use one of the following decoders:
cgul_stream_reader_cxx::decode_utf8
cgul_stream_reader_cxx::decode_utf16be
cgul_stream_reader_cxx::decode_utf16le
cgul_stream_reader_cxx::decode_utf32be
cgul_stream_reader_cxx::decode_utf32le
If you need to parse MicroXML fragments, multiple block readers can be added to the stream reader at any time using cgul_stream_reader_cxx::add_block_reader()
.
[in] | sr | stream reader |
References cgul_microxml__set_stream_reader().
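For readers unfamiliar with Byte-Order Marks, the sketch below shows how a BOM at the start of a buffer maps to the five encodings listed above. It illustrates the general technique and is not the detection logic of cgul_stream_reader_cxx, which is not documented here. The names bom_kind and detect_bom are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical result type: one value per decoder named above, plus "none".
enum class bom_kind { none, utf8, utf16be, utf16le, utf32be, utf32le };

// Inspect the first bytes of a buffer and report which BOM, if any, it starts with.
bom_kind detect_bom(const unsigned char* bytes, std::size_t size)
{
    assert(bytes != nullptr || size == 0);
    // The 4-byte UTF-32 marks are tested first because the UTF-32LE mark
    // (FF FE 00 00) begins with the UTF-16LE mark (FF FE).
    if (size >= 4 && bytes[0] == 0x00 && bytes[1] == 0x00 &&
        bytes[2] == 0xFE && bytes[3] == 0xFF) {
        return bom_kind::utf32be;
    }
    if (size >= 4 && bytes[0] == 0xFF && bytes[1] == 0xFE &&
        bytes[2] == 0x00 && bytes[3] == 0x00) {
        return bom_kind::utf32le;
    }
    if (size >= 3 && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF) {
        return bom_kind::utf8;
    }
    if (size >= 2 && bytes[0] == 0xFE && bytes[1] == 0xFF) {
        return bom_kind::utf16be;
    }
    if (size >= 2 && bytes[0] == 0xFF && bytes[1] == 0xFE) {
        return bom_kind::utf16le;
    }
    return bom_kind::none;  // no BOM: the text above implies UTF-8 is then assumed
}
```

A stream that is neither UTF-8 nor BOM-prefixed yields bom_kind::none here, which is the case where the decoder has to be set manually.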
|
inlinevirtual |
Return the presentation name. If an input source is not currently set, an exception is thrown.
References cgul_microxml__get_presentation_name().
|
inlinevirtual |
Set the presentation name. The presentation name is primarily used for error reporting and can be set to anything, but it does need to be set again if the input file changes. If the presentation name is not set, it defaults to the file name that was opened or to "FILE"
or "MEMORY"
if using a FILE*
or memory buffer respectively. An exception is thrown if an attempt is made to set the presentation name without first setting the input file.
[in] | presentation_name | presentation name |
References cgul_microxml__set_presentation_name().
|
inlinevirtual |
Return the maximum entity length. If set to 0
, the length is limited only by available memory. The default is 16K.
References cgul_microxml__get_maximum_entity_length().
|
inlinevirtual |
Set the maximum entity length. If set to 0
, the length is limited only by available memory. The default is 16K.
[in] | entity_length_max | maximum entity length |
References cgul_microxml__set_maximum_entity_length().
|
inlinevirtual |
Return the starting line number for the current token. If no input source is set, 0
is returned, and an exception is thrown.
References cgul_microxml__get_line_number().
|
inlinevirtual |
Return the starting column number for the current token. If no input source is set, 0
is returned, and an exception is thrown.
References cgul_microxml__get_column_number().
|
inlinevirtual |
Read the next token from the MicroXML input source. This method can be used to implement a pull parser. If an error occurs, an exception is thrown.
References cgul_microxml__read_next_token().
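A pull-parser loop built on read_next_token() might look like the sketch below. The loop is written as a template so that it compiles without the library header; it uses only method names documented on this page and is expected, but not verified, to work with a cgul_microxml_cxx instance. It assumes that read_next_token() must be called once to reach the document-start token. The scripted_parser type is a hypothetical stand-in that replays a fixed token list so the loop can be exercised in isolation.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Collect element names in document order and check that start and end
// tags balance. Parser is any type exposing the methods documented here.
template <class Parser>
std::vector<std::string> collect_element_names(Parser& parser)
{
    std::vector<std::string> names;
    std::size_t depth = 0;
    for (;;) {
        parser.read_next_token();      // throws if an error occurs
        if (parser.is_document_end()) {
            break;                     // reading past this token is an error
        }
        if (parser.is_element_start()) {
            names.push_back(parser.get_element_name());
            ++depth;
        } else if (parser.is_element_end()) {
            assert(depth > 0);
            --depth;
        }
        // Document-start, text, and comment tokens are ignored in this sketch.
    }
    assert(depth == 0);
    return names;
}

// Hypothetical stand-in, NOT part of the library: replays a fixed token list.
struct scripted_parser {
    enum kind { document_start, element_start, element_end, text, document_end };
    struct token { kind k; const char* name; };

    std::vector<token> script;
    std::size_t next = 0;
    const token* current = nullptr;

    // at() throws std::out_of_range when reading past the last token.
    void read_next_token() { current = &script.at(next++); }
    bool is_document_end() const { return current && current->k == document_end; }
    bool is_element_start() const { return current && current->k == element_start; }
    bool is_element_end() const { return current && current->k == element_end; }
    const char* get_element_name() const { assert(current); return current->name; }
};
```

With the real class, the same function would be called as collect_element_names(parser) after constructing the parser from a file name, FILE*, memory buffer, or stream reader.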
|
inlinevirtual |
Return whether the current token indicates the start of the document. This is always the first token.
References cgul_microxml__is_document_start().
|
inlinevirtual |
Return whether the current token indicates the end of the document. This is always the last token if no errors occur. Attempting to get the next token after this one results in an error.
References cgul_microxml__is_document_end().
|
inlinevirtual |
Return whether the start tag for an element has been read. Use get_element_name()
and get_attributes()
to get the name and attributes for the element. Use is_empty_element()
to determine if the element is an empty element (e.g., <foo/>
).
References cgul_microxml__is_element_start().
|
inlinevirtual |
Return whether the end tag for an element has been read or, if an empty element was parsed earlier, whether an end tag is being simulated on behalf of the earlier empty element. This way the following two MicroXML samples generate the same token stream:
<foo></foo>
<foo/>
Use get_element_name()
to get the name of the element. Use is_empty_element()
to determine if the end tag is being simulated because the matching start tag is an empty element.
References cgul_microxml__is_element_end().
|
inlinevirtual |
Return whether text has been read. Use get_text()
to get the text.
References cgul_microxml__is_text().
|
inlinevirtual |
Return whether a comment has been read. Use get_comment()
to get the comment.
References cgul_microxml__is_comment().
|
inlinevirtual |
Return the name of the current element. The string is owned by this class so the client must not attempt to free it. This method is valid when processing the start or end tag of an element; otherwise, an exception is thrown.
Element names are interned by the cgul_microxml_cxx
instance making them valid until the cgul_microxml_cxx
instance is deleted or reset. It is common for elements to be repeated. In this case, this method returns the same pointer for the repeated elements which allows DOMs to be created in a memory efficient manner. Because the element names are shared, it is important that the client treat them as immutable which should be natural in most contexts.
References cgul_microxml__get_element_name().
|
inlinevirtual |
Return the value for the attribute name attribute_name
. If no attribute has that name, NULL
is returned. Ownership of the attribute value remains with the cgul_microxml_cxx
instance so the client must not attempt to delete it. This method is valid when processing the start or end tag of an element; otherwise, an exception is thrown.
Unlike element names and attribute names, attribute values are only valid until the next token is read unless cgul_string__take_value()
is called on the returned attribute value in which case the client is responsible for calling free()
on the value taken.
The attribute value can be converted to numeric types using methods from cgul_string
like cgul_string__to_int()
.
[in] | attribute_name | attribute name |
References cgul_microxml__get_attribute_value().
|
inlinevirtual |
Return the entire attribute map associated with the current element. The keys and values stored in the attributes map are owned by this class so the client must not attempt to free or delete them. This method is valid when processing the start tag of an element; otherwise, an exception is thrown.
The keys in the attributes map correspond to the names of the attributes and are C-style strings. The keys are valid until this cgul_microxml_cxx
instance is deleted or reset.
The values in the attributes map correspond to the attribute values and are cgul_string
instances (not C-style strings). Unlike the keys, the values are only valid until the next token is read unless cgul_string__take_value()
is called on the attribute values in which case the client is responsible for calling free()
on each value taken.
The attribute values can be converted to numeric types using methods from cgul_string
like cgul_string__to_int()
.
You can iterate over the attribute names and values as follows:
    cgul_hash_node_t attr = NULL;
    attr = cgul_hash__get_front(cex, attrs);
    for ( ; attr ; attr = cgul_hash_node__get_next(cex, attr))
      {
        // Get the key/value pair for this attribute.
        const char* key = (const char*)cgul_hash_node__get_key(cex, attr);
        cgul_string_t tmp = (cgul_string_t)cgul_hash_node__get_value(cex, attr);
        const char* value = cgul_string__get_value(cex, tmp);
        ...
      }
References cgul_microxml__get_attributes().
|
inlinevirtual |
Return the attributes map associated with the current element. This method transfers ownership of the attributes map requiring the client to delete the map by calling cgul_microxml__delete_attributes()
. The element attributes can only be taken once per element. This method is valid when processing the start tag of an element; otherwise, an exception is thrown.
The keys in the attributes map correspond to the names of the attributes and are C-style strings. The keys are only valid until cgul_microxml__delete_attributes()
is called or until this cgul_microxml_cxx
instance is deleted or reset. So do not try to use the keys after deleting uxml
! (The reason for this is that attribute names are interned in a symbol table owned by this class so that multiple occurrences of the same attribute name can share the same symbol without requiring extra memory for a duplicate string.)
The values in the attributes map correspond to the attribute values and are cgul_string
instances (not C-style strings). Unlike the keys, the values are valid until cgul_microxml__delete_attributes()
is called even if uxml
has been deleted or reset.
The attribute values can be converted to numeric types using methods from cgul_string
like cgul_string__to_int()
.
You can iterate over the attribute names and values as follows:
    cgul_hash_node_t attr = NULL;
    attr = cgul_hash__get_front(cex, attrs);
    for ( ; attr ; attr = cgul_hash_node__get_next(cex, attr))
      {
        // Get the key/value pair for this attribute.
        const char* key = (const char*)cgul_hash_node__get_key(cex, attr);
        cgul_string_t tmp = (cgul_string_t)cgul_hash_node__get_value(cex, attr);
        const char* value = cgul_string__get_value(cex, tmp);
        ...
      }
References cgul_microxml__take_attributes().
|
inlinevirtual |
Convenience method used to delete the attributes map attributes
returned by the method take_attributes()
. The client must not use attributes
after this method returns.
Do not use this method on attributes maps returned by get_attributes()
. Only use it on attributes maps returned by take_attributes()
.
[in] | attributes | attributes map |
References cgul_microxml__delete_attributes().
|
inlinevirtual |
Return whether the element is an empty element (e.g., <foo/>
). This method is valid when processing the start or end tag of an element; otherwise, an exception is thrown.
References cgul_microxml__is_empty_element().
|
inlinevirtual |
Return the current text. The returned value is owned by this class so the client must not attempt to delete it. This method is valid when processing text; otherwise, an exception is thrown.
The text returned is only valid until the next token is read unless cgul_string__take_value()
is called on the returned value in which case the client is responsible for calling free()
on the text taken.
The text can be converted to numeric types using methods from cgul_string
like cgul_string__to_int()
.
References cgul_microxml__get_text().
|
inlinevirtual |
Return whether the current text consists only of white space characters. The MicroXML specification defines white space as tab, new line, or space characters. This method is valid when processing text; otherwise, an exception is thrown.
References cgul_microxml__is_white_space().
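The white space definition quoted above can be expressed as a small standalone check. This is a sketch of the rule, not the library's implementation; the name is_pure_white_space is hypothetical, and treating an empty string as white space is an assumption made here.

```cpp
#include <cassert>
#include <cstddef>

// Return true if every byte is a tab, new line, or space.
// Assumption: an empty string counts as pure white space.
bool is_pure_white_space(const char* text, std::size_t length)
{
    assert(text != nullptr || length == 0);
    for (std::size_t i = 0; i < length; ++i) {
        const char c = text[i];
        if (c != '\t' && c != '\n' && c != ' ') {
            return false;  // carriage return and all other characters fail the test
        }
    }
    return true;
}
```

Because the three white space characters are ASCII, scanning bytes gives the same answer as scanning Unicode characters in a UTF-8 string.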
|
inlinevirtual |
Return whether to skip text under the root element that is pure white space. By default, this white space is not skipped. For white space in the prolog and epilog, see get_skip_prolog_or_epilog_tokens()
.
References cgul_microxml__get_skip_white_space().
|
inlinevirtual |
Set whether to skip text under the root element that is pure white space. This is convenient when the document being parsed is formatted with line breaks and indentation that are not significant. By default, this white space is not skipped. For white space in the prolog and epilog, see set_skip_prolog_or_epilog_tokens()
.
If enabled, set_maximum_text_length()
should be used to increase the maximum size of each block of text (if necessary) so that at least one character that is not white space is always included in each block of significant white space.
[in] | skip | whether to skip text that is pure white space |
References cgul_microxml__set_skip_white_space().
|
inlinevirtual |
Return the maximum size in Unicode characters of text returned by get_text()
. If set to 0
, the length is limited only by available memory. The default is 16K.
References cgul_microxml__get_maximum_text_length().
|
inlinevirtual |
Set the maximum size in Unicode characters of text returned by get_text()
. If set to 0
, the length is limited only by available memory. The default is 16K.
[in] | text_length_max | maximum text length in Unicode characters |
References cgul_microxml__set_maximum_text_length().
|
inlinevirtual |
Return the current comment. The returned value is owned by this class so the client must not attempt to delete it. This method is valid when processing comments; otherwise, an exception is thrown.
The comment returned is only valid until the next token is read unless cgul_string__take_value()
is called on the returned value in which case the client is responsible for calling free()
on the comment taken.
References cgul_microxml__get_comment().
|
inlinevirtual |
Return whether to skip comments under the root element. By default, these comments are skipped. For comments in the prolog and epilog, see get_skip_prolog_or_epilog_tokens()
.
References cgul_microxml__get_skip_comments().
|
inlinevirtual |
Set whether to skip comments under the root element. By default, these comments are skipped. For comments in the prolog and epilog, see set_skip_prolog_or_epilog_tokens()
.
[in] | skip | whether to skip comments |
References cgul_microxml__set_skip_comments().
|
inlinevirtual |
Return the maximum comment length. If set to 0
, the length is limited only by available memory. The default is 16K.
References cgul_microxml__get_maximum_comment_length().
|
inlinevirtual |
Set the maximum comment length. If set to 0
, the length is limited only by available memory. The default is 16K.
[in] | comment_length_max | maximum comment length |
References cgul_microxml__set_maximum_comment_length().
|
inlinevirtual |
Return whether to skip tokens in the prolog or epilog. These are the white space and comment tokens that surround the root element. By default, these tokens are skipped. For white space and comments under the root element, see get_skip_white_space()
and get_skip_comments()
.
References cgul_microxml__get_skip_prolog_or_epilog_tokens().
|
inlinevirtual |
Set whether to skip tokens in the prolog or epilog. These are the white space and comment tokens that surround the root element. By default, these tokens are skipped. For white space and comments under the root element, see set_skip_white_space()
and set_skip_comments()
.
[in] | skip | whether to skip tokens in the prolog or epilog |
References cgul_microxml__set_skip_prolog_or_epilog_tokens().
|
inlinevirtual |
Get the underlying cgul_microxml
object.
|
inlinevirtual |
Take the underlying cgul_microxml
object. This means the underlying object will not be deleted when the wrapper goes out of scope. Also, because you have taken the underlying object, no other methods should be called on this wrapper's instance. Lastly, after taking the underlying object, it is the caller's responsibility to delete the underlying object by calling cgul_microxml__delete()
.
|
inlinevirtual |
Set the new underlying object to rhs
. This causes the old underlying object to be deleted which invalidates any outstanding pointers to or iterators for the old underlying object.
This instance takes ownership of rhs
which means rhs
will be automatically deleted when the C++ wrapper is deleted. To prevent automatic deletion of rhs
, call take_obj()
when the C++ wrapper is no longer needed.
[in] | rhs | right-hand side |
References cgul_microxml__delete().
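The ownership rules for get_obj(), take_obj(), and set_obj() can be illustrated with a small stand-in wrapper. Everything in the sketch is hypothetical (fake_object, fake_delete, owning_wrapper); it mirrors the behavior described above rather than the actual source of cgul_microxml_cxx. In particular, it assumes that take_obj() leaves the wrapper holding a null handle, and it guards set_obj() against being passed the handle it already owns.

```cpp
#include <cassert>

// Hypothetical stand-in for an opaque C handle and its delete function.
struct fake_object { int* delete_count; };
typedef fake_object* fake_handle_t;

inline void fake_delete(fake_handle_t h)
{
    if (h != nullptr) {
        ++*h->delete_count;  // record the deletion so it can be observed
        delete h;
    }
}

// Hypothetical wrapper that mirrors the documented get/take/set semantics.
class owning_wrapper {
public:
    explicit owning_wrapper(fake_handle_t obj) : obj_(obj) {}
    ~owning_wrapper() { fake_delete(obj_); }
    owning_wrapper(const owning_wrapper&) = delete;
    owning_wrapper& operator=(const owning_wrapper&) = delete;

    // Borrow: the wrapper keeps ownership.
    fake_handle_t get_obj() const { return obj_; }

    // Transfer ownership to the caller, who must now delete the object.
    fake_handle_t take_obj()
    {
        fake_handle_t taken = obj_;
        obj_ = nullptr;
        return taken;
    }

    // Delete the old object and take ownership of rhs.
    void set_obj(fake_handle_t rhs)
    {
        if (rhs != obj_) {  // assumption: setting the same handle is a no-op
            fake_delete(obj_);
            obj_ = rhs;
        }
    }

private:
    fake_handle_t obj_;
};
```

After take_obj(), the destructor of this sketch has nothing left to delete, which matches the statement that the underlying object is not deleted when the wrapper goes out of scope.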