parser for multipart MIME messages
Typedefs | |
typedef size_t(* | cgul_mime__produce_t) (cgul_exception_t *cex, char *block, size_t block_size, void *data) |
typedef int(* | cgul_mime__consume_t) (cgul_exception_t *cex, unsigned long int part, cgul_mime__section_t section, char *block, size_t block_size, void *data) |
Functions | |
CGUL_EXPORT unsigned long int | cgul_mime__parse (cgul_exception_t *cex, const char *boundary, const char *eol, cgul_mime__produce_t produce, void *produce_data, cgul_mime__consume_t consume, void *consume_data) |
CGUL_EXPORT unsigned long int | cgul_mime__parse_memory (cgul_exception_t *cex, char *block, size_t block_size, const char *boundary, const char *eol, cgul_mime__consume_t consume, void *consume_data) |
CGUL_EXPORT unsigned long int | cgul_mime__parse_file (cgul_exception_t *cex, FILE *fin, const char *boundary, const char *eol, cgul_mime__consume_t consume, void *consume_data) |
CGUL_EXPORT unsigned long int | cgul_mime__parse_fname (cgul_exception_t *cex, const char *fname, const char *boundary, const char *eol, cgul_mime__consume_t consume, void *consume_data) |
CGUL_EXPORT unsigned long int | cgul_mime__split_memory (cgul_exception_t *cex, char *block, size_t block_size, const char *dname, const char *pname, const char *hname, const char *bname, const char *ename, const char *boundary, const char *eol) |
CGUL_EXPORT unsigned long int | cgul_mime__split_file (cgul_exception_t *cex, FILE *fin, const char *dname, const char *pname, const char *hname, const char *bname, const char *ename, const char *boundary, const char *eol) |
CGUL_EXPORT unsigned long int | cgul_mime__split_fname (cgul_exception_t *cex, const char *fname, const char *dname, const char *pname, const char *hname, const char *bname, const char *ename, const char *boundary, const char *eol) |
Parser for multipart MIME messages. This multipart MIME parser should be sufficient for handling HTTP uploads. It should also scale well, handling large multipart MIME messages efficiently.
typedef size_t(* cgul_mime__produce_t) (cgul_exception_t *cex, char *block, size_t block_size, void *data)
This typedef is the interface for the callback function invoked by cgul_mime__parse() to produce blocks from the original MIME source so that the blocks can be parsed.
Functions that implement this interface should read bytes into the buffer block, which extends for block_size bytes, and return the number of bytes read into block. This function should return 0 if and only if EOF is reached. It is permissible to return a short read count greater than zero; if the parser still needs more data, it will simply invoke the callback again. If an error occurs, this function should throw an exception.
The value of data is the same as what was initially passed into cgul_mime__parse() when the callback was registered, allowing the client to pass arbitrary data to the producer.
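The producer contract described above can be illustrated with a small sketch. It is not taken from the library: the struct mem_source cursor and the name mem_producer are hypothetical, the exception type is declared as an opaque stand-in so the fragment compiles without the CGUL headers, and error reporting through cex is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Opaque stand-in; real code would get this type from the CGUL headers. */
typedef struct cgul_exception cgul_exception_t;

/* Hypothetical client data: a cursor over a message held in memory. */
struct mem_source {
    const char *bytes;   /* next unread byte */
    size_t remaining;    /* bytes left to hand out */
};

/* Matches the cgul_mime__produce_t signature: copies up to block_size
   bytes into block and returns the count, which is 0 only once the
   source is exhausted. */
static size_t mem_producer(cgul_exception_t *cex, char *block,
                           size_t block_size, void *data)
{
    struct mem_source *src = data;
    size_t n;

    (void)cex; /* this sketch never throws */
    assert(block != NULL && src != NULL);

    n = src->remaining < block_size ? src->remaining : block_size;
    memcpy(block, src->bytes, n);
    src->bytes += n;
    src->remaining -= n;
    return n;
}
```

A pointer to the struct mem_source would be passed as produce_data, so the same cursor comes back in data on every call.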
[in,out] | cex | c-style exception |
[in] | block | block |
[in] | block_size | block size |
[in] | data | client data |
Returns the number of bytes read into block, or zero if EOF
typedef int(* cgul_mime__consume_t) (cgul_exception_t *cex, unsigned long int part, cgul_mime__section_t section, char *block, size_t block_size, void *data)
This typedef is the interface for the callback function invoked by the following functions to consume the blocks generated by the parser: cgul_mime__parse(), cgul_mime__parse_file(), cgul_mime__parse_fname(), and cgul_mime__parse_memory().
Functions that implement this interface will be passed each parsed block in block, its size in bytes in block_size, and the MIME section type section from which the data originated. The function will also be passed the current part part of the multipart MIME file that is being parsed. The value for part will be 0 for the preamble and the epilogue, 1 for the first part, 2 for the second part, and so on.
The value of data is the same as what was initially passed in when the callback was registered, allowing the client to pass arbitrary data to the consumer. If an error occurs, this function should throw an exception.
[in,out] | cex | c-style exception |
[in] | part | MIME part of the multipart message |
[in] | section | MIME section where the block originates |
[in] | block | block |
[in] | block_size | block size |
[in] | data | client data |
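As an illustration of the consumer interface, the sketch below totals the bytes delivered for each part, using index 0 for the preamble and epilogue as described above. The names struct part_sizes, MAX_PARTS, and count_bytes are hypothetical, and both library types are replaced by stand-ins so the fragment compiles on its own. The meaning of the int return value is not described in this section, so returning 0 here is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins; real code would get these types from the CGUL headers. */
typedef struct cgul_exception cgul_exception_t;
typedef int cgul_mime__section_t;

enum { MAX_PARTS = 16 }; /* arbitrary limit for this sketch */

/* Hypothetical client data: byte totals indexed by part number. */
struct part_sizes {
    size_t bytes[MAX_PARTS + 1];
};

/* Matches the cgul_mime__consume_t signature. A section may arrive in
   several blocks, so the sizes are accumulated rather than assigned. */
static int count_bytes(cgul_exception_t *cex, unsigned long int part,
                       cgul_mime__section_t section, char *block,
                       size_t block_size, void *data)
{
    struct part_sizes *sizes = data;

    (void)cex;
    (void)section;
    (void)block;
    assert(sizes != NULL);

    if (part <= MAX_PARTS) {
        sizes->bytes[part] += block_size;
    }
    return 0; /* assumed success value */
}
```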
enum cgul_mime__section_t |
The sections of a multipart MIME file.
CGUL_EXPORT unsigned long int cgul_mime__parse (cgul_exception_t *cex, const char *boundary, const char *eol, cgul_mime__produce_t produce, void *produce_data, cgul_mime__consume_t consume, void *consume_data)
This is the generic parsing function that works by parsing blocks of data supplied by the produce function. For each block supplied by produce, it invokes the client's callback function consume at least once, passing in a block that holds a subset of bytes from the current MIME message, the size of the block, the MIME section type, and the client data consume_data. This function returns the total number of MIME parts. If an error occurs, 0 is returned, and an exception is thrown.
The MIME parts are separated from each other by the MIME boundary boundary. Boundary separators are required to start with "--", but the "Content-Type" header specifies the boundary without this "--" prefix. So that the boundary read from the "Content-Type" header can be passed directly into this function, boundary must not include the "--" prefix; this function inserts it automatically.
According to the MIME standard, all EOL sequences should be "\r\n", but in practice MIME messages often use the native EOL sequence instead. As a result, the client needs to pass in the correct EOL sequence eol. If eol is NULL, this function will attempt to detect the EOL sequence automatically by scanning the first 16K of the input. If it cannot detect the EOL sequence, it will use "\r\n".
[in,out] | cex | c-style exception |
[in] | boundary | MIME boundary |
[in] | eol | EOL sequence |
[in] | produce | callback function to produce MIME blocks |
[in] | produce_data | client data for produce |
[in] | consume | callback function to consume parsed blocks |
[in] | consume_data | client data for consume |
Referenced by cgul_mime_cxx::parse().
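Because boundary is expected exactly as it appears in the "Content-Type" header, a client typically has to pull it out of the header value first. The helper below is a minimal sketch of that step and is not part of the library: the name extract_boundary is hypothetical, and it assumes the parameter name is written in lowercase as "boundary=".

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies the boundary parameter of a Content-Type value into out,
   without adding any "--" prefix. Returns 0 on success, or -1 if the
   parameter is missing, empty, or does not fit in out. */
static int extract_boundary(const char *content_type, char *out,
                            size_t out_size)
{
    const char *key = "boundary=";
    const char *p;
    size_t len;

    assert(content_type != NULL && out != NULL);

    p = strstr(content_type, key);
    if (p == NULL) {
        return -1;
    }
    p += strlen(key);

    if (*p == '"') {
        /* Quoted form: take everything up to the closing quote. */
        const char *end = strchr(++p, '"');
        if (end == NULL) {
            return -1;
        }
        len = (size_t)(end - p);
    } else {
        /* Unquoted form: stop at a separator or whitespace. */
        len = strcspn(p, "; \t\r\n");
    }

    if (len == 0 || len >= out_size) {
        return -1;
    }
    memcpy(out, p, len);
    out[len] = '\0';
    return 0;
}
```

The resulting string can then be passed as boundary unchanged, since the parser adds the "--" prefix itself.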
CGUL_EXPORT unsigned long int cgul_mime__parse_memory (cgul_exception_t *cex, char *block, size_t block_size, const char *boundary, const char *eol, cgul_mime__consume_t consume, void *consume_data)
This function is an adapter for cgul_mime__parse() for parsing multipart MIME messages already resident in memory. It registers a special "producer" that allows the parser to directly access the MIME message in memory rather than having to copy it block-by-block into the parser's internal buffer.
This function works by parsing a multipart MIME message that is in memory at block and extends for block_size bytes. The client's callback function consume is invoked once for each MIME section, passing in a block that holds all the bytes from the current MIME section, the size of the block, the MIME section type, and the client data consume_data. This function returns the total number of MIME parts. If an error occurs, 0 is returned, and an exception is thrown.
Even though this function passes all the bytes from each MIME section to the consume callback in a single call, it may be worthwhile to write consume so that it can also be used with cgul_mime__parse_file(), which is similar but generally requires multiple calls to consume in order to pass in the same amount of data.
The MIME parts are separated from each other by the MIME boundary boundary. Boundary separators are required to start with "--", but the "Content-Type" header specifies the boundary without this "--" prefix. So that the boundary read from the "Content-Type" header can be passed directly into this function, boundary must not include the "--" prefix; this function inserts it automatically.
According to the MIME standard, all EOL sequences should be "\r\n", but in practice MIME messages often use the native EOL sequence instead. As a result, the client needs to pass in the correct EOL sequence eol. If eol is NULL, this function will attempt to detect the EOL sequence automatically by scanning the first 16K of the message. If it cannot detect the EOL sequence, it will use "\r\n".
[in,out] | cex | c-style exception |
[in] | block | block |
[in] | block_size | block size |
[in] | boundary | MIME boundary |
[in] | eol | EOL sequence |
[in] | consume | callback function to consume parsed blocks |
[in] | consume_data | client data for consume |
Referenced by cgul_mime_cxx::parse_memory().
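One way to write a consumer that works with both cgul_mime__parse_memory() and cgul_mime__parse_file() is to append every block to a buffer and start over only when the part or section changes. The sketch below assumes that approach. The names struct collector and collect are hypothetical, the library types are stand-ins so the fragment compiles on its own, and the return values 0 and -1 are assumptions because the return convention is not described here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Stand-ins; real code would get these types from the CGUL headers. */
typedef struct cgul_exception cgul_exception_t;
typedef int cgul_mime__section_t;

/* Hypothetical client data: holds the bytes of the current section. */
struct collector {
    int started;                  /* nonzero after the first block */
    unsigned long int part;       /* part of the section in buf */
    cgul_mime__section_t section; /* section type of the data in buf */
    size_t sections_seen;         /* number of distinct sections so far */
    char *buf;
    size_t len;
    size_t cap;
};

/* Matches the cgul_mime__consume_t signature. It behaves the same
   whether a section arrives in one block or in many. */
static int collect(cgul_exception_t *cex, unsigned long int part,
                   cgul_mime__section_t section, char *block,
                   size_t block_size, void *data)
{
    struct collector *c = data;

    (void)cex; /* a real consumer would throw on allocation failure */
    assert(c != NULL);

    if (!c->started || c->part != part || c->section != section) {
        /* A new section begins: reset the buffer. */
        c->started = 1;
        c->part = part;
        c->section = section;
        c->len = 0;
        c->sections_seen++;
    }
    if (block_size == 0) {
        return 0;
    }
    if (c->len + block_size > c->cap) {
        size_t cap = (c->len + block_size) * 2;
        char *grown = realloc(c->buf, cap);
        if (grown == NULL) {
            return -1; /* assumed failure value */
        }
        c->buf = grown;
        c->cap = cap;
    }
    memcpy(c->buf + c->len, block, block_size);
    c->len += block_size;
    return 0; /* assumed success value */
}
```

The caller owns buf and should release it with free() once parsing has finished.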
CGUL_EXPORT unsigned long int cgul_mime__parse_file (cgul_exception_t *cex, FILE *fin, const char *boundary, const char *eol, cgul_mime__consume_t consume, void *consume_data)
This function is an adapter for cgul_mime__parse() for parsing multipart MIME messages from a file. It registers a "producer" that copies the MIME message from the file block-by-block into the parser's internal buffer.
This function works by parsing a multipart MIME message in the file fin. For each block read from the file, it invokes the client's callback function consume at least once, passing in a block that holds a subset of bytes from the current MIME section, the size of the block, the MIME section type, and the client data consume_data. This function returns the total number of MIME parts. If an error occurs, 0 is returned, and an exception is thrown.
The MIME parts are separated from each other by the MIME boundary boundary. Boundary separators are required to start with "--", but the "Content-Type" header specifies the boundary without this "--" prefix. So that the boundary read from the "Content-Type" header can be passed directly into this function, boundary must not include the "--" prefix; this function inserts it automatically.
According to the MIME standard, all EOL sequences should be "\r\n", but in practice MIME messages often use the native EOL sequence instead. As a result, the client needs to pass in the correct EOL sequence eol. If eol is NULL, this function will attempt to detect the EOL sequence automatically by scanning the first 16K of the file. If it cannot detect the EOL sequence, it will use "\r\n".
[in,out] | cex | c-style exception |
[in] | fin | input file |
[in] | boundary | MIME boundary |
[in] | eol | EOL sequence |
[in] | consume | callback function to consume parsed blocks |
[in] | consume_data | client data for consume |
Referenced by cgul_mime_cxx::parse_file().
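The text does not say how the EOL sequence is detected when eol is NULL, so the following is only one plausible heuristic, not the library's algorithm: look at the first line terminator within the first 16K bytes and fall back to "\r\n" when none is found. The function name detect_eol is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns the EOL sequence suggested by the first line terminator
   found in the first 16K of block, or "\r\n" if none is found. */
static const char *detect_eol(const char *block, size_t block_size)
{
    size_t limit = block_size < 16384 ? block_size : 16384;
    size_t i;

    assert(block != NULL);

    for (i = 0; i < limit; i++) {
        if (block[i] == '\n') {
            return "\n";
        }
        if (block[i] == '\r') {
            /* A carriage return followed by a line feed is CRLF. */
            if (i + 1 < block_size && block[i + 1] == '\n') {
                return "\r\n";
            }
            return "\r";
        }
    }
    return "\r\n"; /* default named by the documentation */
}
```

A client that already knows the EOL sequence can skip detection entirely by passing it in eol.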
CGUL_EXPORT unsigned long int cgul_mime__parse_fname (cgul_exception_t *cex, const char *fname, const char *boundary, const char *eol, cgul_mime__consume_t consume, void *consume_data)
This function is an adapter for cgul_mime__parse() for parsing multipart MIME messages from a file. It registers a "producer" that copies the MIME message from the file block-by-block into the parser's internal buffer.
This function works by parsing a multipart MIME message in the file with name fname. For each block read from the file, it invokes the client's callback function consume at least once, passing in a block that holds a subset of bytes from the current MIME section, the size of the block, the MIME section type, and the client data consume_data. This function returns the total number of MIME parts. If an error occurs, 0 is returned, and an exception is thrown.
The MIME parts are separated from each other by the MIME boundary boundary. Boundary separators are required to start with "--", but the "Content-Type" header specifies the boundary without this "--" prefix. So that the boundary read from the "Content-Type" header can be passed directly into this function, boundary must not include the "--" prefix; this function inserts it automatically.
According to the MIME standard, all EOL sequences should be "\r\n", but in practice MIME messages often use the native EOL sequence instead. As a result, the client needs to pass in the correct EOL sequence eol. If eol is NULL, this function will attempt to detect the EOL sequence automatically by scanning the first 16K of the file. If it cannot detect the EOL sequence, it will use "\r\n".
[in,out] | cex | c-style exception |
[in] | fname | name of the input file |
[in] | boundary | MIME boundary |
[in] | eol | EOL sequence |
[in] | consume | callback function to consume parsed blocks |
[in] | consume_data | client data for consume |
Referenced by cgul_mime_cxx::parse_fname().
CGUL_EXPORT unsigned long int cgul_mime__split_memory (cgul_exception_t *cex, char *block, size_t block_size, const char *dname, const char *pname, const char *hname, const char *bname, const char *ename, const char *boundary, const char *eol)
This function is an adapter for cgul_mime__parse_memory() that splits a multipart MIME message in memory into multiple files on the host file system.
Split the multipart MIME message in memory starting at block and extending for block_size bytes into preamble, header, body, and epilogue sections. The directory where the files should be written is given by dname. The name of the preamble file is given by pname, and the name of the epilogue file is given by ename. The names for the header files will have the form hname-%lu, and the names for the body files will have the form bname-%lu, where "%lu" will be replaced by the part number (starting with 1). If any of pname, hname, bname, or ename is NULL, the corresponding files will not be written. This function returns the total number of MIME parts. If an error occurs, 0 is returned, and an exception is thrown.
The MIME parts are separated from each other by the MIME boundary boundary. Boundary separators are required to start with "--", but the "Content-Type" header specifies the boundary without this "--" prefix. So that the boundary read from the "Content-Type" header can be passed directly into this function, boundary must not include the "--" prefix; this function inserts it automatically.
According to the MIME standard, all EOL sequences should be "\r\n", but in practice MIME messages often use the native EOL sequence instead. As a result, the client needs to pass in the correct EOL sequence eol. If eol is NULL, this function will attempt to detect the EOL sequence automatically by scanning the MIME message. If it cannot detect the EOL sequence, it will use "\r\n".
[in,out] | cex | c-style exception |
[in] | block | block |
[in] | block_size | block size |
[in] | dname | directory name where output files will be created |
[in] | pname | name for the preamble file |
[in] | hname | base name for header files |
[in] | bname | base name for body files |
[in] | ename | name for the epilogue file |
[in] | boundary | MIME boundary |
[in] | eol | EOL sequence |
Referenced by cgul_mime_cxx::split_memory().
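The naming scheme for the per-part output files can be sketched with snprintf(). This is an illustration rather than the library's code: the helper name part_path is hypothetical, and joining dname and the file name with "/" is an assumption about the host file system.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Builds "<dname>/<base>-<part>" in out, where base plays the role of
   hname or bname. Returns 0 on success, or -1 if the result does not
   fit in out_size bytes. */
static int part_path(char *out, size_t out_size, const char *dname,
                     const char *base, unsigned long int part)
{
    int n;

    assert(out != NULL && dname != NULL && base != NULL);

    n = snprintf(out, out_size, "%s/%s-%lu", dname, base, part);
    return (n >= 0 && (size_t)n < out_size) ? 0 : -1;
}
```

With dname "out" and hname "header", the first part's header file would be named "out/header-1" under this assumption.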
CGUL_EXPORT unsigned long int cgul_mime__split_file (cgul_exception_t *cex, FILE *fin, const char *dname, const char *pname, const char *hname, const char *bname, const char *ename, const char *boundary, const char *eol)
This function is an adapter for cgul_mime__parse_file() that splits a multipart MIME message in a file into multiple files on the host file system.
Split the multipart MIME message in the file fin into preamble, header, body, and epilogue sections. The directory where the files should be written is given by dname. The name of the preamble file is given by pname, and the name of the epilogue file is given by ename. The names for the header files will have the form hname-%lu, and the names for the body files will have the form bname-%lu, where "%lu" will be replaced by the part number (starting with 1). If any of pname, hname, bname, or ename is NULL, the corresponding files will not be written. This function returns the total number of MIME parts. If an error occurs, 0 is returned, and an exception is thrown.
The MIME parts are separated from each other by the MIME boundary boundary. Boundary separators are required to start with "--", but the "Content-Type" header specifies the boundary without this "--" prefix. So that the boundary read from the "Content-Type" header can be passed directly into this function, boundary must not include the "--" prefix; this function inserts it automatically.
According to the MIME standard, all EOL sequences should be "\r\n", but in practice MIME messages often use the native EOL sequence instead. As a result, the client needs to pass in the correct EOL sequence eol. If eol is NULL, this function will attempt to detect the EOL sequence automatically by scanning the first 16K of the file. If it cannot detect the EOL sequence, it will use "\r\n".
[in,out] | cex | c-style exception |
[in] | fin | MIME input file |
[in] | dname | directory name where output files will be created |
[in] | pname | name for the preamble file |
[in] | hname | base name for header files |
[in] | bname | base name for body files |
[in] | ename | name for the epilogue file |
[in] | boundary | MIME boundary |
[in] | eol | EOL sequence |
Referenced by cgul_mime_cxx::split_file().
CGUL_EXPORT unsigned long int cgul_mime__split_fname (cgul_exception_t *cex, const char *fname, const char *dname, const char *pname, const char *hname, const char *bname, const char *ename, const char *boundary, const char *eol)
This function is an adapter for cgul_mime__parse_fname() that splits a multipart MIME message in a file into multiple files on the host file system.
Split the multipart MIME message in the file with name fname into preamble, header, body, and epilogue sections. The directory where the files should be written is given by dname. The name of the preamble file is given by pname, and the name of the epilogue file is given by ename. The names for the header files will have the form hname-%lu, and the names for the body files will have the form bname-%lu, where "%lu" will be replaced by the part number (starting with 1). If any of pname, hname, bname, or ename is NULL, the corresponding files will not be written. This function returns the total number of MIME parts. If an error occurs, 0 is returned, and an exception is thrown.
The MIME parts are separated from each other by the MIME boundary boundary. Boundary separators are required to start with "--", but the "Content-Type" header specifies the boundary without this "--" prefix. So that the boundary read from the "Content-Type" header can be passed directly into this function, boundary must not include the "--" prefix; this function inserts it automatically.
According to the MIME standard, all EOL sequences should be "\r\n", but in practice MIME messages often use the native EOL sequence instead. As a result, the client needs to pass in the correct EOL sequence eol. If eol is NULL, this function will attempt to detect the EOL sequence automatically by scanning the first 16K of the file. If it cannot detect the EOL sequence, it will use "\r\n".
[in,out] | cex | c-style exception |
[in] | fname | name of MIME input file |
[in] | dname | directory name where output files will be created |
[in] | pname | name for the preamble file |
[in] | hname | base name for header files |
[in] | bname | base name for body files |
[in] | ename | name for the epilogue file |
[in] | boundary | MIME boundary |
[in] | eol | EOL sequence |
Referenced by cgul_mime_cxx::split_fname().