2025-04-29 02:51:27 -04:00

String parsing with strparse

The functions take input via a pointer to a pointer, which lets them advance the pointer on success. That in turn allows "chaining" of functions, as in this example that gets a word, a space and then a second word:

if(Curl_str_word(&line, &word1, MAX) ||
   Curl_str_singlespace(&line) ||
   Curl_str_word(&line, &word2, MAX))
  fprintf(stderr, "ERROR\n");

The input pointer must point to a null terminated buffer area or these functions risk continuing "off the edge".
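To illustrate why the pointer-to-pointer convention makes chaining work, here is a minimal sketch with stand-in functions. The names demo_str, demo_word, demo_singlespace and demo_two_words are hypothetical and written only for this example; they mirror the documented behavior but are not curl's implementation.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct Curl_str: a pointer into the input
   plus a length. The bytes are not null terminated. */
struct demo_str {
  const char *str;
  size_t len;
};

/* Stand-in word parser: reads bytes up to a space or the end of the
   string. Returns 0 on success, non-zero on error. */
static int demo_word(const char **linep, struct demo_str *out, size_t max)
{
  const char *s = *linep;
  size_t len = 0;
  out->str = NULL;
  out->len = 0;
  while(s[len] && s[len] != ' ') {
    if(++len > max)
      return 1; /* word longer than max */
  }
  if(!len)
    return 2; /* empty word */
  out->str = s;
  out->len = len;
  *linep = s + len; /* the caller's pointer moves only on success */
  return 0;
}

/* Stand-in single-space parser. Returns 0 on success. */
static int demo_singlespace(const char **linep)
{
  if(**linep != ' ')
    return 1;
  (*linep)++;
  return 0;
}

/* Parse "<word> <word>". The || chain stops at the first failing step,
   and each step starts where the previous one left off. */
static int demo_two_words(const char *line, struct demo_str *w1,
                          struct demo_str *w2)
{
  return demo_word(&line, w1, 64) ||
         demo_singlespace(&line) ||
         demo_word(&line, w2, 64);
}
```

Because every step returns zero on success, the short-circuiting || operator gives a single error check for the whole sequence.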

Strings

The functions that return string information do so by populating a struct Curl_str:

struct Curl_str {
  char *str;
  size_t len;
};

Access the struct fields with Curl_str() for the pointer and Curl_strlen() for the length rather than using the struct fields directly.
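A pointer plus a length typically refers to bytes inside the parsed input, so the data should not be assumed to be null terminated. The sketch below shows one way to copy such a string into a terminated buffer. The struct demo_str, the macros demo_strptr and demo_strlen, and the function demo_copy are hypothetical names for this example; the macros only play the role the text gives to Curl_str() and Curl_strlen().

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct Curl_str. */
struct demo_str {
  const char *str;
  size_t len;
};

/* Hypothetical accessors, used instead of touching the fields. */
#define demo_strptr(x) ((x)->str)
#define demo_strlen(x) ((x)->len)

/* Copy the bytes into buf and null terminate them. Returns 0 on
   success, non-zero if buf is too small to hold the string and the
   terminator. */
static int demo_copy(const struct demo_str *in, char *buf, size_t bufsize)
{
  size_t len = demo_strlen(in);
  if(len >= bufsize)
    return 1;
  if(len)
    memcpy(buf, demo_strptr(in), len);
  buf[len] = 0;
  return 0;
}
```

When only printing is needed, a precision-limited format such as "%.*s" with the length cast to int avoids the copy.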

Curl_str_init

void Curl_str_init(struct Curl_str *out)

This initializes a string struct. The parser functions that store info in strings always initialize the string themselves, so this stand-alone use is often not necessary.

Curl_str_assign

void Curl_str_assign(struct Curl_str *out, const char *str, size_t len)

Set a pointer and associated length in the string struct.

Curl_str_word

int Curl_str_word(char **linep, struct Curl_str *out, const size_t max);

Get a sequence of bytes until the first space or the end of the string. Return non-zero on error. There is no escaping mechanism, so a word can never contain a space. The word must be at least one byte, otherwise it is considered an error.

max is the maximum accepted word length in bytes; a longer word is an error.

On a successful return, linep is updated to point to the byte immediately following the parsed word.

Curl_str_until

int Curl_str_until(char **linep, struct Curl_str *out, const size_t max,
                   char delim);

Like Curl_str_word but instead of parsing to a space, it parses up to the custom delimiter given in delim, which must be a non-zero byte.

max is the maximum accepted word length in bytes; a longer word is an error.

The parsed word must be at least one byte, otherwise it is considered an error.
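As an illustration of delimiter-based parsing, the sketch below splits off the part before a '=' in a "name=value" style input. The names demo_str and demo_until are hypothetical stand-ins, and the choice to leave the pointer at the delimiter without consuming it is an assumption of this sketch.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct Curl_str. */
struct demo_str {
  const char *str;
  size_t len;
};

/* Read bytes up to delim or the end of the string. Returns 0 on
   success, non-zero if the word is empty or longer than max. On
   success *linep points at the delimiter (assumption of this sketch)
   or at the terminating null byte. */
static int demo_until(const char **linep, struct demo_str *out,
                      size_t max, char delim)
{
  const char *s = *linep;
  size_t len = 0;
  out->str = NULL;
  out->len = 0;
  while(s[len] && s[len] != delim) {
    if(++len > max)
      return 1; /* longer than max */
  }
  if(!len)
    return 2; /* empty word */
  out->str = s;
  out->len = len;
  *linep = s + len;
  return 0;
}
```

A caller would typically follow a successful call with a single-byte check for the delimiter before parsing the next field.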

Curl_str_untilnl

int Curl_str_untilnl(char **linep, struct Curl_str *out, const size_t max);

Like Curl_str_until but instead parses until it finds a "newline byte". That means either a CR (ASCII 13) or an LF (ASCII 10) octet.

max is the maximum accepted word length in bytes; a longer word is an error.

The parsed word must be at least one byte, otherwise it is considered an error.

Curl_str_cspn

int Curl_str_cspn(const char **linep, struct Curl_str *out, const char *cspn);

Get a sequence of characters until one of the bytes in the cspn string matches. Similar to the strcspn function.
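The comparison with strcspn can be made concrete with a small sketch built directly on that standard function. The names demo_str and demo_cspn are hypothetical, and treating an empty match as an error is an assumption made here, since the text above does not state it.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct Curl_str. */
struct demo_str {
  const char *str;
  size_t len;
};

/* Collect bytes until one of the bytes in reject is found, or until
   the end of the string. Returns 0 on success. Assumption: a match of
   zero bytes is reported as an error. */
static int demo_cspn(const char **linep, struct demo_str *out,
                     const char *reject)
{
  const char *s = *linep;
  size_t len = strcspn(s, reject); /* length of the leading span */
  out->str = NULL;
  out->len = 0;
  if(!len)
    return 1;
  out->str = s;
  out->len = len;
  *linep = s + len; /* now points at the rejecting byte or the end */
  return 0;
}
```

With the input "host:8080/path" and the reject set ":/", this yields the four bytes "host" and leaves the pointer at the colon.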

Curl_str_quotedword

int Curl_str_quotedword(char **linep, struct Curl_str *out, const size_t max);

Get a "quoted" word. This means everything that is provided within a leading and an ending double quote character. Escaping is not supported, so the word cannot itself contain a double quote.

max is the maximum accepted word length in bytes; a longer word is an error.

The parsed word must be at least one byte, otherwise it is considered an error.
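The quoted-word rules can be sketched as follows. The names demo_str and demo_quotedword are hypothetical stand-ins; the sketch follows the text above in rejecting an empty quoted word, and it assumes the pointer ends up just past the closing quote.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct Curl_str. */
struct demo_str {
  const char *str;
  size_t len;
};

/* Parse "...": the result excludes both quotes. Returns 0 on success,
   non-zero on a missing quote, an empty word or a word longer than
   max. */
static int demo_quotedword(const char **linep, struct demo_str *out,
                           size_t max)
{
  const char *s = *linep;
  size_t len = 0;
  out->str = NULL;
  out->len = 0;
  if(*s != '"')
    return 1; /* no opening quote */
  s++;
  while(s[len] && s[len] != '"') {
    if(++len > max)
      return 2; /* longer than max */
  }
  if(s[len] != '"')
    return 3; /* no closing quote */
  if(!len)
    return 4; /* empty word */
  out->str = s;
  out->len = len;
  *linep = s + len + 1; /* assumption: step past the closing quote */
  return 0;
}
```

Since there is no escaping, the first double quote after the opening one always ends the word.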

Curl_str_single

int Curl_str_single(char **linep, char byte);

Advance over a single character provided in byte. Return non-zero on error.

Curl_str_singlespace

int Curl_str_singlespace(char **linep);

Advance over a single ASCII space. Return non-zero on error.

Curl_str_passblanks

void Curl_str_passblanks(char **linep);

Advance over all spaces and tabs.

Curl_str_trimblanks

void Curl_str_trimblanks(struct Curl_str *out);

Trim off blanks (spaces and tabs) from the start and the end of the given string.
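The difference between passing blanks in the input and trimming an already parsed string is sketched below. The names demo_str, demo_passblanks and demo_trimblanks are hypothetical stand-ins written for this example.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct Curl_str. */
struct demo_str {
  const char *str;
  size_t len;
};

static int demo_isblank(char c)
{
  return c == ' ' || c == '\t';
}

/* Advance the input pointer past any run of spaces and tabs. This
   cannot fail: zero blanks is fine. */
static void demo_passblanks(const char **linep)
{
  while(demo_isblank(**linep))
    (*linep)++;
}

/* Shrink a parsed string from both ends. Nothing is copied or
   modified in the underlying buffer; only the pointer and length
   change. */
static void demo_trimblanks(struct demo_str *s)
{
  while(s->len && demo_isblank(*s->str)) {
    s->str++;
    s->len--;
  }
  while(s->len && demo_isblank(s->str[s->len - 1]))
    s->len--;
}
```

Trimming works on the length-delimited string, so it never reads past the parsed bytes even when the surrounding buffer continues.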

Curl_str_number

int Curl_str_number(char **linep, curl_size_t *nump, size_t max);

Get an unsigned decimal number not larger than max. Leading zeroes are just swallowed. Return non-zero on error. Returns error if there was not a single digit.
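Parsing a number with an upper bound needs an overflow-safe check before each digit is added. The sketch below shows one common way to do that. The function demo_number is a hypothetical stand-in, and unsigned long long is used here purely for illustration rather than the types in the prototype above.

```c
/* Parse an unsigned decimal number not larger than max. Returns 0 on
   success, 1 if there is no digit, 2 if the value would exceed max.
   The caller's pointer moves only on success. */
static int demo_number(const char **linep, unsigned long long *nump,
                       unsigned long long max)
{
  const char *s = *linep;
  unsigned long long num = 0;
  *nump = 0;
  if(*s < '0' || *s > '9')
    return 1; /* not a single digit */
  while(*s >= '0' && *s <= '9') {
    unsigned long long d = (unsigned long long)(*s - '0');
    /* num * 10 + d <= max, rewritten so that nothing can overflow */
    if(d > max || num > (max - d) / 10)
      return 2;
    num = num * 10 + d;
    s++;
  }
  *nump = num;
  *linep = s;
  return 0;
}
```

Leading zeroes need no special case in this form: they contribute nothing to the value and are consumed like any other digit.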

Curl_str_numblanks

int Curl_str_numblanks(char **linep, curl_size_t *nump);

Get an unsigned 63-bit decimal number. Leading blanks and zeroes are skipped. Returns non-zero on error. Returns error if there was not a single digit.

Curl_str_hex

int Curl_str_hex(char **linep, curl_size_t *nump, size_t max);

Get an unsigned hexadecimal number not larger than max. Leading zeroes are just swallowed. Return non-zero on error. Returns error if there was not a single digit. Does not handle a 0x prefix.

Curl_str_octal

int Curl_str_octal(char **linep, curl_size_t *nump, size_t max);

Get an unsigned octal number not larger than max. Leading zeroes are just swallowed. Return non-zero on error. Returns error if there was not a single digit.
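The hexadecimal and octal parsers differ from the decimal one only in which digits they accept. A minimal sketch with the base as a parameter is shown below. The names demo_digit and demo_based are hypothetical, and unsigned long long is an illustrative choice of type, not the one in the prototypes above.

```c
/* Value of digit c in the given base (at most 16), or -1 if c is not
   a valid digit in that base. */
static int demo_digit(char c, unsigned int base)
{
  unsigned int v;
  if(c >= '0' && c <= '9')
    v = (unsigned int)(c - '0');
  else if(c >= 'a' && c <= 'f')
    v = (unsigned int)(c - 'a') + 10;
  else if(c >= 'A' && c <= 'F')
    v = (unsigned int)(c - 'A') + 10;
  else
    return -1;
  return (v < base) ? (int)v : -1;
}

/* Parse an unsigned number in the given base, not larger than max.
   Returns 0 on success, 1 if there is no digit, 2 if the value would
   exceed max. */
static int demo_based(const char **linep, unsigned long long *nump,
                      unsigned long long max, unsigned int base)
{
  const char *s = *linep;
  unsigned long long num = 0;
  int d = demo_digit(*s, base);
  *nump = 0;
  if(d < 0)
    return 1; /* not a single digit */
  while(d >= 0) {
    unsigned long long dv = (unsigned long long)d;
    if(dv > max || num > (max - dv) / base)
      return 2; /* would exceed max */
    num = num * base + dv;
    d = demo_digit(*++s, base);
  }
  *nump = num;
  *linep = s;
  return 0;
}
```

In this sketch, base 16 applied to the input "0x10" parses only the leading "0" and stops at the 'x', which is what not handling a 0x prefix means in practice.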

Curl_str_newline

int Curl_str_newline(char **linep);

Check for a single CR or LF. Return non-zero on error.

Curl_str_casecompare

int Curl_str_casecompare(struct Curl_str *str, const char *check);

Returns true if the provided string in the str argument matches the check string case insensitively.

Curl_str_cmp

int Curl_str_cmp(struct Curl_str *str, const char *check);

Returns true if the provided string in the str argument matches the check string case sensitively. This is not the same return code as strcmp.
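The return convention is the opposite of strcmp, which returns zero on a match. The sketch below shows length-aware comparisons that return true on a match. The names demo_str, demo_cmp and demo_casecompare are hypothetical stand-ins, and using tolower from the C library for case folding is an assumption of this sketch.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct Curl_str. */
struct demo_str {
  const char *str;
  size_t len;
};

/* Returns 1 if the string equals check exactly, else 0. The lengths
   are compared first because the parsed bytes are not null
   terminated. */
static int demo_cmp(const struct demo_str *s, const char *check)
{
  size_t clen = strlen(check);
  if(s->len != clen)
    return 0;
  return !clen || !memcmp(s->str, check, clen);
}

/* Returns 1 if the string equals check when case is ignored, else 0. */
static int demo_casecompare(const struct demo_str *s, const char *check)
{
  size_t i;
  if(s->len != strlen(check))
    return 0;
  for(i = 0; i < s->len; i++) {
    if(tolower((unsigned char)s->str[i]) !=
       tolower((unsigned char)check[i]))
      return 0;
  }
  return 1;
}
```

A caller can therefore write the test as a plain condition, such as if(demo_cmp(&word, "GET")), without comparing against zero.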

Curl_str_nudge

int Curl_str_nudge(struct Curl_str *str, size_t num);

Removes num bytes from the beginning (left) of the string kept in str. If num is larger than the string, it instead returns an error.
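Nudging only adjusts the pointer and the length. A minimal sketch, using the hypothetical names demo_str and demo_nudge:

```c
#include <stddef.h>

/* Hypothetical stand-in for struct Curl_str. */
struct demo_str {
  const char *str;
  size_t len;
};

/* Drop num bytes from the start of the string. Returns 0 on success,
   non-zero (leaving the string untouched) if num is larger than the
   string. */
static int demo_nudge(struct demo_str *s, size_t num)
{
  if(num > s->len)
    return 1;
  s->str += num;
  s->len -= num;
  return 0;
}
```

Nudging by exactly the string length succeeds in this sketch and leaves an empty string.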