---
c: Copyright (C) Daniel Stenberg, <daniel@haxx.se>, et al.
SPDX-License-Identifier: curl
Title: curl_url_set
Section: 3
Source: libcurl
See-also:
- CURLOPT_CURLU (3)
- curl_url (3)
- curl_url_cleanup (3)
- curl_url_dup (3)
- curl_url_get (3)
- curl_url_strerror (3)
Protocol:
- All
---
# NAME
curl_url_set - set a URL part
# SYNOPSIS
~~~c
#include <curl/curl.h>

CURLUcode curl_url_set(CURLU *url,
                       CURLUPart part,
                       const char *content,
                       unsigned int flags);
~~~
# DESCRIPTION
The *url* handle to work on, passed in as the first argument, must be a
handle previously created by curl_url(3) or curl_url_dup(3).
This function sets or updates individual URL components, or parts, held by the
URL object the handle identifies.
The *part* argument should identify the particular URL part (see list
below) to set or change, with *content* pointing to a null-terminated
string with the new contents for that URL part. The contents should be in the
form and encoding they would use in a URL: URL encoded.
When setting a part in the URL object that was already set, the new
*content* replaces the data previously stored for that part.
The caller does not have to keep *content* around after a successful call
as this function copies the content.
Setting a part to a NULL pointer removes that part's contents from the
*CURLU* handle.
By default, this API only accepts URLs using schemes for protocols that are
supported built-in. To make libcurl parse URLs generically even for schemes it
does not know about, the **CURLU_NON_SUPPORT_SCHEME** flags bit must be
set. Otherwise, this function returns *CURLUE_UNSUPPORTED_SCHEME* for URL
schemes it does not recognize.
This function has an 8 MB maximum length limit for all provided input strings.
In the real world, excessively long fields in URLs cause problems even if this
API accepts them.
When setting or updating contents of individual URL parts, this API might
accept data that would not be otherwise possible to set in the string when it
gets populated as a result of a full URL parse. Beware. If done so, extracting
a full URL later on from such components might render an invalid URL.
The *flags* argument is a bitmask with independent features.
# PARTS
## CURLUPART_URL
Allows the full URL of the handle to be replaced. If the handle already is
populated with a URL, the new URL can be relative to the previous.
When successfully setting a new URL, relative or absolute, the handle's
contents are replaced with the components of the newly set URL.
Pass a pointer to a null-terminated string in the *content* parameter. The
string must point to a correctly formatted "RFC 3986+" URL or be a NULL
pointer.
Unless *CURLU_NO_AUTHORITY* is set, a blank hostname is not allowed in
the URL.
## CURLUPART_SCHEME
The scheme cannot be URL encoded on set. libcurl only accepts setting schemes up
to 40 bytes long.
## CURLUPART_USER
## CURLUPART_PASSWORD
## CURLUPART_OPTIONS
The options field is an optional field that might follow the password in the
userinfo part. It is only recognized/used when parsing URLs for the following
schemes: pop3, smtp and imap. This function however allows users to
independently set this field.
## CURLUPART_HOST
The hostname. If it is an International Domain Name (IDN), the string must be
encoded as your locale says or as UTF-8 (when WinIDN is used). If it is a
bracketed IPv6 numeric address, it may contain a zone id (or you can use
*CURLUPART_ZONEID*).
Unless *CURLU_NO_AUTHORITY* is set, setting a blank hostname is not allowed.
## CURLUPART_ZONEID
If the hostname is a numeric IPv6 address, this field can also be set.
## CURLUPART_PORT
The port number cannot be URL encoded on set. The given port number is
provided as a string and the decimal number in it must be between 0 and
65535. Anything else returns an error.
## CURLUPART_PATH
If a path is set in the URL without a leading slash, a slash is prepended
automatically.
## CURLUPART_QUERY
The query part gets spaces converted to pluses when asked to URL encode on set
with the *CURLU_URLENCODE* bit.
If used together with the *CURLU_APPENDQUERY* bit, the provided part is
appended on the end of the existing query.
The question mark in the URL is not part of the actual query contents.
## CURLUPART_FRAGMENT
The hash sign in the URL is not part of the actual fragment contents.
# FLAGS
The flags argument is zero, one or more bits set in a bitmask.
## CURLU_APPENDQUERY
Can be used when setting the *CURLUPART_QUERY* component. The provided new
part is then appended at the end of the existing query - and if the previous
part did not end with an ampersand (&), an ampersand gets inserted before the
new appended part.
When *CURLU_APPENDQUERY* is used together with *CURLU_URLENCODE*, the
first '=' symbol is not URL encoded.
## CURLU_NON_SUPPORT_SCHEME
If set, allows curl_url_set(3) to set a non-supported scheme.
## CURLU_URLENCODE
When set, curl_url_set(3) URL encodes the part on entry, except for
scheme, port and URL.
When setting the path component with URL encoding enabled, the slash character
is skipped.
The query part gets space-to-plus conversion before the URL conversion.
This URL encoding is charset unaware and converts the input in a byte-by-byte
manner.
## CURLU_DEFAULT_SCHEME
If set, allows the URL to be set without a scheme and then sets that to the
default scheme: HTTPS. Overrides the *CURLU_GUESS_SCHEME* option if both
are set.
## CURLU_GUESS_SCHEME
If set, allows the URL to be set without a scheme and it instead "guesses"
which scheme was intended based on the hostname. If the outermost
subdomain name matches DICT, FTP, IMAP, LDAP, POP3 or SMTP then that scheme is
used, otherwise it picks HTTP. Conflicts with the *CURLU_DEFAULT_SCHEME*
option which takes precedence if both are set.
## CURLU_NO_AUTHORITY
If set, skips authority checks. The RFC allows individual schemes to omit the
host part (normally the only mandatory part of the authority), but libcurl
cannot know whether this is permitted for custom schemes. Specifying the flag
permits empty authority sections, similar to how file scheme is handled.
## CURLU_PATH_AS_IS
When set for **CURLUPART_URL**, this skips the normalization of the
path. That is the procedure where libcurl otherwise removes sequences of
dot-slash and dot-dot etc. The same option used for transfers is called
CURLOPT_PATH_AS_IS(3).
## CURLU_ALLOW_SPACE
If set, the URL parser allows space (ASCII 32) where possible. The URL syntax
normally does not allow spaces anywhere; they should instead be encoded as %20
or '+'. When spaces are allowed, they are still not allowed in the scheme.
When space is used and allowed in a URL, it is stored as-is unless
*CURLU_URLENCODE* is also set, which then makes libcurl URL encode the
space before it is stored. This affects how the URL is constructed when
curl_url_get(3) is subsequently used to extract the full URL or
individual parts. (Added in 7.78.0)
## CURLU_DISALLOW_USER
If set, the URL parser does not accept embedded credentials for the
**CURLUPART_URL**, and instead returns **CURLUE_USER_NOT_ALLOWED** for
such URLs.
# EXAMPLE
~~~c
#include <curl/curl.h>

int main(void)
{
  CURLUcode rc;
  CURLU *url = curl_url();
  if(!url)
    return 1; /* out of memory */
  rc = curl_url_set(url, CURLUPART_URL, "https://example.com", 0);
  if(!rc) {
    /* change it to an FTP URL */
    rc = curl_url_set(url, CURLUPART_SCHEME, "ftp", 0);
  }
  curl_url_cleanup(url);
  return rc ? 1 : 0;
}
~~~
# AVAILABILITY
Added in 7.62.0. CURLUPART_ZONEID was added in 7.65.0.
# RETURN VALUE
Returns a *CURLUcode* error value, which is CURLUE_OK (0) if everything
went fine. See the libcurl-errors(3) man page for the full list with
descriptions.
The input string passed to curl_url_set(3) must be shorter than eight
million bytes. Otherwise this function returns **CURLUE_MALFORMED_INPUT**.
If this function returns an error, no URL part is set.