cgul_crlf_cxx Class Reference

C++ bindings for cgul_crlf.

#include <cgul_crlf_cxx.h>

Public Member Functions

 cgul_crlf_cxx ()
 
virtual ~cgul_crlf_cxx ()
 
virtual void reset (unsigned long offset=0)
 
virtual int get_strip_utf8_bom () const
 
virtual void set_strip_utf8_bom (int strip_utf8_bom)
 
virtual void convert (char *buf, unsigned long int bsize)
 
virtual const char * get_line () const
 
virtual unsigned long get_line_count () const
 
virtual unsigned long get_line_offset () const
 
virtual const char * get_remainder ()
 
virtual void convert_file (FILE *fin, FILE *fout, const char *eol)
 
virtual cgul_crlf_t get_obj () const
 
virtual cgul_crlf_t take_obj ()
 
virtual void set_obj (cgul_crlf_t rhs)
 

Detailed Description

This class provides the C++ bindings for C cgul_crlf objects. The main purpose of this class is to convert the C-style function calls and exception handling in cgul_crlf into C++-style function calls and exception handling.

Constructor & Destructor Documentation

§ cgul_crlf_cxx()

cgul_crlf_cxx::cgul_crlf_cxx ( )
inline

Default Constructor. If memory cannot be allocated, an exception is thrown.

References cgul_crlf__new().

Referenced by set_obj().

§ ~cgul_crlf_cxx()

virtual cgul_crlf_cxx::~cgul_crlf_cxx ( )
inline, virtual

Destructor.

References cgul_crlf__delete().

Member Function Documentation

§ reset()

virtual void cgul_crlf_cxx::reset ( unsigned long  offset = 0)
inline, virtual

This method is used to reset the object so that it can process a new stream of text or process the same stream of text after seeking to a different location in the stream.

The client must inform this class of the new offset into the underlying file so that subsequent calls to get_line_offset() can return correct values. The value of offset should be zero-based. Thus, to start processing at the beginning of a new file call this method with offset set to 0.

Calling this method resets the line count to zero. This can be used by the client to implement a line counter that does not overflow.

Calling this method does not reset whether a leading UTF-8 byte-order mark (BOM) is stripped.

Parameters
[in]  offset  zero-based offset

References cgul_crlf__reset().

§ get_strip_utf8_bom()

virtual int cgul_crlf_cxx::get_strip_utf8_bom ( ) const
inline, virtual

This method returns whether the leading UTF-8 byte-order mark (BOM) should be stripped from the first line if it is present.

Returns
whether to strip a leading UTF-8 byte-order mark

References cgul_crlf__get_strip_utf8_bom().

§ set_strip_utf8_bom()

virtual void cgul_crlf_cxx::set_strip_utf8_bom ( int  strip_utf8_bom)
inline, virtual

By default, this class detects the leading UTF-8 byte-order mark (BOM) and strips it from the first line returned by get_line() if it is present. It then clears its internal flag so that BOMs internal to the text file will be returned. This is generally what you want because the leading BOM is not significant but the internal BOMs are.

You can alter the way this class handles the leading BOM by calling this method with strip_utf8_bom set to 0. This will cause the leading BOM to be returned as part of the first line. This can be useful, for example, if you just want to convert the text file and are not interested in its contents.

The value of strip_utf8_bom is remembered across calls to reset().

It should be noted that most operating systems do not save UTF-8 text files with a leading BOM because UTF-8 is a character stream and, as such, does not have byte-order problems; however, Microsoft Windows adds the BOM to its UTF-8 text files presumably to help distinguish UTF-8 text files from text files with different encodings.
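If you want to know in advance whether a block starts with a BOM, the check is small enough to do by hand. The following is a minimal sketch, not part of cgul_crlf; has_utf8_bom is a hypothetical helper name, and it only tests for the three bytes 0xEF 0xBB 0xBF that encode U+FEFF in UTF-8.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of cgul_crlf): returns true when the first
// three bytes of data are the UTF-8 encoding of U+FEFF (EF BB BF).
bool has_utf8_bom(const char* data, std::size_t size)
{
    assert(data != nullptr || size == 0);
    return size >= 3 &&
           static_cast<unsigned char>(data[0]) == 0xEF &&
           static_cast<unsigned char>(data[1]) == 0xBB &&
           static_cast<unsigned char>(data[2]) == 0xBF;
}
```

A check like this is only meaningful on the first block of a stream; the same three bytes later in the text are ordinary content.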

Parameters
[in]  strip_utf8_bom  whether to strip leading UTF-8 byte-order mark

References cgul_crlf__set_strip_utf8_bom().

§ convert()

virtual void cgul_crlf_cxx::convert ( char *  buf,
unsigned long int  bsize 
)
inline, virtual

The caller feeds this method a block of text in buf of size bsize. The blocks you feed this method can end anywhere; they do not have to end exactly on a line boundary. This method knows how to splice together partial lines from the last call to form arbitrarily long lines using any of the common EOL markers: "\n", "\r", or "\r\n".

After each call to this method, you MUST call get_line() iteratively until it returns NULL before feeding this method another block.

Do not alter buf until after you have exhausted get_line(). To avoid copying each block, this method often (but not always) inserts NUL characters directly into buf to produce the lines returned by get_line(), so those lines can point into buf itself.

After feeding the last block to this method and exhausting get_line(), you should call get_remainder() to fetch what remained if the last line had no trailing EOL marker.

This method dynamically allocates space to hold the lines that are split across calls to this method. If an error occurs, an exception is thrown, and the object will be in an undefined state.

WARNING: Because this method embeds NUL characters directly into buf to produce the lines returned by get_line(), buf must be writable. What might not be obvious is that this means buf probably should not be allocated on the stack because many operating systems have security mechanisms to prevent unexpected writes to stack variables.
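To make the in-place behavior concrete, here is a minimal sketch of splitting a block by overwriting EOL bytes with NUL. It is an illustration only, not the library's implementation: split_in_place and tail_start are hypothetical names, the sketch handles a single block without splicing partial lines across calls, and it treats a "\r" at the very end of the block as a complete EOL.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustration only (hypothetical helper, not cgul_crlf code).
// Splits buf[0..n) into lines by writing '\0' over each EOL marker.
// Returns pointers into buf for every terminated line and stores in
// tail_start the index where the unterminated remainder begins
// (tail_start == n when the block ended exactly on an EOL).
std::vector<const char*> split_in_place(char* buf, std::size_t n,
                                        std::size_t& tail_start)
{
    assert(buf != nullptr || n == 0);
    std::vector<const char*> lines;
    std::size_t start = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (buf[i] != '\n' && buf[i] != '\r') continue;
        const bool crlf = buf[i] == '\r' && i + 1 < n && buf[i + 1] == '\n';
        buf[i] = '\0';                 // terminate the line inside buf itself
        lines.push_back(buf + start);
        if (crlf) ++i;                 // skip the '\n' of a "\r\n" pair
        start = i + 1;
    }
    tail_start = start;
    return lines;
}
```

Because the returned pointers alias buf, overwriting buf before the lines have been consumed would corrupt them, which is the same constraint convert() places on its caller.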

Parameters
[in]  buf    buffer
[in]  bsize  buffer size

References cgul_crlf__convert().

§ get_line()

virtual const char* cgul_crlf_cxx::get_line ( ) const
inline, virtual

After seeding this object by calling convert(), you call this method to fetch the next line. If a line is ready, this method returns it. If no line is ready, this method returns NULL. The caller should not try to call free() on the line returned because it points either into the buffer passed to convert() or into storage owned by this object.

If this method does not return NULL, you should keep calling it until it does. Once it returns NULL, you can either refill this object by calling convert() with the next block or call get_remainder() to finish.
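The calling sequence described above (convert(), drain get_line(), repeat, then get_remainder()) can be sketched as a small loop. This is a hedged example rather than library code: for_each_line, sink, and block_size are hypothetical names, the 4096-byte default is arbitrary, and the function is written as a template so that it works with any object exposing the three documented members, such as a cgul_crlf_cxx instance. Exceptions thrown by convert() simply propagate to the caller.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper: reads fin block by block and passes every line
// to sink. Crlf is any type with the convert(), get_line(), and
// get_remainder() members documented in this reference.
template <class Crlf, class Sink>
void for_each_line(Crlf& crlf, std::FILE* fin, Sink sink,
                   std::size_t block_size = 4096)
{
    assert(fin != nullptr && block_size > 0);
    // Heap-allocated, writable block; convert() may write NULs into it.
    std::vector<char> buf(block_size);
    std::size_t n;
    while ((n = std::fread(buf.data(), 1, buf.size(), fin)) > 0) {
        crlf.convert(buf.data(), static_cast<unsigned long>(n));
        // Drain every ready line before buf is overwritten by fread().
        while (const char* line = crlf.get_line()) {
            sink(line);
        }
    }
    // A final line with no trailing EOL marker is only available here.
    if (const char* rest = crlf.get_remainder()) {
        sink(rest);
    }
}
```

The file should be opened in binary mode so that the C runtime does not translate EOL markers before convert() sees them. The pointers passed to sink are only valid until the next call to convert(), so a sink that needs to keep a line should copy it.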

Returns
next line of text

References cgul_crlf__get_line().

§ get_line_count()

virtual unsigned long cgul_crlf_cxx::get_line_count ( ) const
inline, virtual

This method returns the total number of lines returned by get_line() and get_remainder(). The line count is one-based. No attempt is made to prevent the return value from overflowing. So, the caller is responsible for verifying the return value.

Calls to reset() reset the line count to zero. This can be used by the client to implement a line counter that does not overflow.

Returns
line count

References cgul_crlf__get_line_count().

§ get_line_offset()

virtual unsigned long cgul_crlf_cxx::get_line_offset ( ) const
inline, virtual

This method returns the offset of the last line returned by get_line() or get_remainder(). The offset is zero-based. If you are feeding a binary stream into convert() and if the stream is also a random access stream, you can use the return value to directly seek to the line as follows:

    fseek(f, offset, SEEK_SET);

Because the prototype for fseek() requires a long for the offset parameter, no attempt is made to prevent the return value from overflowing. So, the caller is responsible for verifying the return value.
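One way to verify the return value before seeking is to compare it against LONG_MAX. The sketch below assumes nothing about the library beyond the unsigned long return type; fits_in_long and seek_to_line are hypothetical helper names.

```cpp
#include <cassert>
#include <climits>
#include <cstdio>

// Hypothetical helper: true when offset can be converted to the long
// that fseek() expects without changing its value.
bool fits_in_long(unsigned long offset)
{
    return offset <= static_cast<unsigned long>(LONG_MAX);
}

// Hypothetical helper: seeks f to offset. Returns false when the offset
// is out of range for fseek() or when fseek() itself reports failure.
bool seek_to_line(std::FILE* f, unsigned long offset)
{
    assert(f != nullptr);
    if (!fits_in_long(offset)) return false;
    return std::fseek(f, static_cast<long>(offset), SEEK_SET) == 0;
}
```

A caller would pass the value returned by get_line_offset() as offset and treat a false result as "this line cannot be reached with fseek()".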

Note that the offset returned is the number of bytes from the start of the file to the current line. This is not necessarily the same as the number of characters, which depends on how the file is encoded.
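The difference between bytes and characters is easy to see for UTF-8. The following sketch counts code points by skipping continuation bytes; utf8_code_points is a hypothetical helper name, and the function assumes its input is valid UTF-8.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: counts code points in a NUL-terminated string that
// is assumed to be valid UTF-8. Continuation bytes have the bit pattern
// 10xxxxxx and are not counted.
std::size_t utf8_code_points(const char* s)
{
    assert(s != nullptr);
    std::size_t count = 0;
    for (; *s != '\0'; ++s) {
        if ((static_cast<unsigned char>(*s) & 0xC0) != 0x80) ++count;
    }
    return count;
}
```

For a line containing a two-byte character such as U+00E9, std::strlen() reports one more byte than utf8_code_points() reports code points, so a byte offset cannot be used as a character index.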

To get the offset of the remainder, just call get_remainder() before calling this method.

This method throws an exception if, after converting a new block, it is called before get_line() is called.

Returns
zero-based offset into file for the current line

References cgul_crlf__get_line_offset().

§ get_remainder()

virtual const char* cgul_crlf_cxx::get_remainder ( )
inline, virtual

This is the last method you should call, and it should only be called once. It should be called only after all the blocks have been fed to convert() and only after get_line() has been exhausted. At this point, all that is left is the remainder.

This method returns NULL if a remainder does not exist. The only time a remainder exists is if the last line in the file is missing the final EOL marker.

After calling this function, use get_line_offset() to get the offset of the remainder.

The caller should not try to call free() on the pointer returned because it points to an internal string that will be freed when the underlying object is deleted (for this wrapper, when the destructor calls cgul_crlf__delete()).

Returns
text remaining after the final EOL

References cgul_crlf__get_remainder().

§ convert_file()

virtual void cgul_crlf_cxx::convert_file ( FILE *  fin,
FILE *  fout,
const char *  eol 
)
inline, virtual

This method copies fin to fout, stripping the original EOL markers and replacing them with eol. Both fin and fout must have been opened in binary mode. This method internally uses a cgul_crlf object to perform the conversion. If an error occurs, an exception is thrown.

Note that this method uses C-style files because portably dealing with C++ files is very difficult.
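Because this method throws on error, it can help to hold the two FILE pointers in objects that close them automatically. The following is a minimal sketch using only the C and C++ standard libraries; file_closer, unique_file, and open_binary are hypothetical names that are not part of cgul.

```cpp
#include <cassert>
#include <cstdio>
#include <memory>

// Hypothetical deleter: closes a FILE* when its owner goes out of scope.
struct file_closer {
    void operator()(std::FILE* f) const { if (f) std::fclose(f); }
};
using unique_file = std::unique_ptr<std::FILE, file_closer>;

// Hypothetical helper: opens path in binary mode ("rb" or "wb").
// The returned pointer is empty when fopen() fails.
unique_file open_binary(const char* path, bool for_writing)
{
    assert(path != nullptr);
    return unique_file(std::fopen(path, for_writing ? "wb" : "rb"));
}
```

With both files opened this way and checked for success, a call of the form convert_file(in.get(), out.get(), "\n") would leave the files closed even if an exception propagates out of the conversion.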

Parameters
[in]   fin   input file
[out]  fout  output file
[in]   eol   new EOL marker

References cgul_crlf__convert_file().

§ get_obj()

virtual cgul_crlf_t cgul_crlf_cxx::get_obj ( ) const
inline, virtual

Get the underlying cgul_crlf object.

Returns
underlying object

§ take_obj()

virtual cgul_crlf_t cgul_crlf_cxx::take_obj ( )
inline, virtual

Take the underlying cgul_crlf object. This means the underlying object will not be deleted when the wrapper goes out of scope. Also, because you have taken the underlying object, no other methods should be called on this wrapper's instance. Lastly, after taking the underlying object, it is the caller's responsibility to delete the underlying object by calling cgul_crlf__delete().

Returns
underlying object

§ set_obj()

virtual void cgul_crlf_cxx::set_obj ( cgul_crlf_t  rhs)
inline, virtual

Set the new underlying object to rhs. This causes the old underlying object to be deleted, which invalidates any outstanding pointers to or iterators for the old underlying object.

This instance takes ownership of rhs, which means rhs will be automatically deleted when the C++ wrapper is deleted. To prevent automatic deletion of rhs, call take_obj() when the C++ wrapper is no longer needed.
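The ownership rules for get_obj(), take_obj(), and set_obj() can be modeled with a toy wrapper. The sketch below is not the cgul_crlf_cxx source: toy_handle, toy_delete, and toy_wrapper are hypothetical stand-ins for the C handle, its delete function, and the C++ wrapper, and the model only mirrors the behavior described in this reference.

```cpp
#include <cassert>

// Hypothetical stand-in for a C handle; it records deletions in a counter.
struct toy_handle { int* deleted_count; };

// Hypothetical stand-in for the C delete function.
inline void toy_delete(toy_handle* h)
{
    if (h) { ++*h->deleted_count; delete h; }
}

// Model of the wrapper's ownership rules, not the real class.
class toy_wrapper {
public:
    explicit toy_wrapper(toy_handle* h) : obj_(h) {}
    ~toy_wrapper() { toy_delete(obj_); }          // deletes what it still owns
    toy_wrapper(const toy_wrapper&) = delete;
    toy_wrapper& operator=(const toy_wrapper&) = delete;

    toy_handle* get_obj() const { return obj_; }  // wrapper keeps ownership
    toy_handle* take_obj()                        // caller becomes the owner
    {
        toy_handle* h = obj_;
        obj_ = nullptr;
        return h;
    }
    void set_obj(toy_handle* rhs)                 // old object is deleted
    {
        toy_delete(obj_);
        obj_ = rhs;
    }
private:
    toy_handle* obj_;
};
```

In this model, a handle obtained through take_obj() outlives the wrapper and must be deleted by the caller, while a handle passed to set_obj() is deleted by the wrapper's destructor unless it is taken back first.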

Parameters
[in]  rhs  right-hand side

References cgul_crlf__delete(), and cgul_crlf_cxx().


The documentation for this class was generated from the following file: