> ## Documentation Index
> Fetch the complete documentation index at: https://private-7c7dfe99-mintlify-3a82795f.mintlify.site/llms.txt
> Use this file to discover all available pages before exploring further.

> Documentation for functions used to work with URLs

# Functions for working with URLs

<h2 id="overview">
  Overview
</h2>

<Note>
  The functions mentioned in this section are optimized for maximum performance and for the most part do not follow the RFC-3986 standard.
  Functions which implement RFC-3986 have `RFC` appended to their function name and are generally slower.
</Note>

You can generally use the non-`RFC` function variants when working with publicly registered domains that contain neither user strings nor `@` symbols.
The table below details which symbols in a URL can (`✔`) or cannot (`✗`) be parsed by the respective `RFC` and non-`RFC` variants:

| Symbol | non-`RFC` | `RFC` |
| ------ | --------- | ----- |
| ' '    | ✗         | ✗     |
| \t     | ✗         | ✗     |
| \<     | ✗         | ✗     |
| >      | ✗         | ✗     |
| %      | ✗         | ✔\*   |
| \{     | ✗         | ✗     |
| }      | ✗         | ✗     |
| \|     | ✗         | ✗     |
| \\\\   | ✗         | ✗     |
| ^      | ✗         | ✗     |
| \~     | ✗         | ✔\*   |
| \[     | ✗         | ✗     |
| ]      | ✗         | ✔     |
| ;      | ✗         | ✔\*   |
| =      | ✗         | ✔\*   |
| &      | ✗         | ✔\*   |

symbols marked `*` are sub-delimiters in RFC 3986 and allowed for user info following the `@` symbol.

There are two types of URL functions:

* Functions that extract parts of a URL. If the relevant part isn't present in a URL, an empty string is returned.
* Functions that remove part of a URL. If the URL does not have anything similar, the URL remains unchanged.

<Note>
  The functions below are generated from the `system.functions` system table.
</Note>

{/*AUTOGENERATED_START*/}

<h2 id="URLHierarchy">
  URLHierarchy
</h2>

Introduced in: v1.1.0

Returns an array containing the URL, truncated at the end by the symbols `/`, `?` and `#` in the path and query string. Consecutive separator characters are counted as one. The result includes the protocol and host as the first element, with progressively longer paths forming a hierarchy.

**Syntax**

```sql theme={null}
URLHierarchy(url)
```

**Arguments**

* `url` — The URL to process. [`String`](/reference/data-types/string)

**Returned value**

Returns an array of progressively longer URLs forming a hierarchy. [`Array(String)`](/reference/data-types/array)

**Examples**

**Basic usage**

```sql title=Query theme={null}
SELECT URLHierarchy('https://example.com/a/b?c=1')
```

```response title=Response theme={null}
['https://example.com/','https://example.com/a/','https://example.com/a/b','https://example.com/a/b?c=1']
```

<h2 id="URLPathHierarchy">
  URLPathHierarchy
</h2>

Introduced in: v1.1.0

Returns an array containing the path component of the URL, truncated at the end by the symbols `/`, `?` and `#`. Unlike `URLHierarchy`, the result does not include the protocol and host — it starts from the path. Consecutive separator characters are counted as one.

**Syntax**

```sql theme={null}
URLPathHierarchy(url)
```

**Arguments**

* `url` — The URL to process. [`String`](/reference/data-types/string)

**Returned value**

Returns an array of progressively longer URL path components forming a hierarchy. [`Array(String)`](/reference/data-types/array)

**Examples**

**Basic usage**

```sql title=Query theme={null}
SELECT URLPathHierarchy('https://example.com/a/b?c=1')
```

```response title=Response theme={null}
['/a/','/a/b','/a/b?c=1']
```

<h2 id="cutFragment">
  cutFragment
</h2>

Introduced in: v1.1.0

Removes the fragment identifier, including the number sign, from a URL.

**Syntax**

```sql theme={null}
cutFragment(url)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)

**Returned value**

Returns the URL with fragment identifier removed. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT cutFragment('http://example.com/path?query=value#fragment123');
```

```response title=Response theme={null}
┌─cutFragment('http://example.com/path?query=value#fragment123')─┐
│ http://example.com/path?query=value                            │
└────────────────────────────────────────────────────────────────┘
```

<h2 id="cutQueryString">
  cutQueryString
</h2>

Introduced in: v1.1.0

Removes the query string, including the question mark from a URL.

**Syntax**

```sql theme={null}
cutQueryString(url)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)

**Returned value**

Returns the URL with query string removed. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT cutQueryString('http://example.com/path?query=value&param=123#fragment');
```

```response title=Response theme={null}
┌─cutQueryString('http://example.com/path?query=value&param=123#fragment')─┐
│ http://example.com/path#fragment                                         │
└──────────────────────────────────────────────────────────────────────────┘
```

<h2 id="cutQueryStringAndFragment">
  cutQueryStringAndFragment
</h2>

Introduced in: v1.1.0

Removes the query string and fragment identifier, including the question mark and number sign, from a URL.

**Syntax**

```sql theme={null}
cutQueryStringAndFragment(url)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)

**Returned value**

Returns the URL with query string and fragment identifier removed. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT cutQueryStringAndFragment('http://example.com/path?query=value&param=123#fragment');
```

```response title=Response theme={null}
┌─cutQueryStringAndFragment('http://example.com/path?query=value&param=123#fragment')─┐
│ http://example.com/path                                                             │
└─────────────────────────────────────────────────────────────────────────────────────┘
```

<h2 id="cutToFirstSignificantSubdomain">
  cutToFirstSignificantSubdomain
</h2>

Introduced in: v1.1.0

Returns the part of the domain that includes top-level subdomains up to the [first significant subdomain](/reference/functions/regular-functions/url-functions#firstSignificantSubdomain).

**Syntax**

```sql theme={null}
cutToFirstSignificantSubdomain(url)
```

**Arguments**

* `url` — URL or domain string to process. [`String`](/reference/data-types/string)

**Returned value**

Returns the part of the domain that includes top-level subdomains up to the first significant subdomain if possible, otherwise returns an empty string. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT
    cutToFirstSignificantSubdomain('https://news.clickhouse.com.tr/'),
    cutToFirstSignificantSubdomain('www.tr'),
    cutToFirstSignificantSubdomain('tr');
```

```response title=Response theme={null}
┌─cutToFirstSignificantSubdomain('https://news.clickhouse.com.tr/')─┬─cutToFirstSignificantSubdomain('www.tr')─┬─cutToFirstSignificantSubdomain('tr')─┐
│ clickhouse.com.tr                                                 │ tr                                       │                                      │
└───────────────────────────────────────────────────────────────────┴──────────────────────────────────────────┴──────────────────────────────────────┘
```

<h2 id="cutToFirstSignificantSubdomainCustom">
  cutToFirstSignificantSubdomainCustom
</h2>

Introduced in: v21.1.0

Returns the part of the domain that includes top-level subdomains up to the first significant subdomain. Accepts custom [TLD list](https://en.wikipedia.org/wiki/List_of_Internet_top-level_domains) name. This function can be useful if you need a fresh TLD list or if you have a custom list.

**Configuration example**

```yaml theme={null}
<!-- <top_level_domains_path>/var/lib/clickhouse/top_level_domains/</top_level_domains_path> -->
<top_level_domains_lists>
    <!-- https://publicsuffix.org/list/public_suffix_list.dat -->
    <public_suffix_list>public_suffix_list.dat</public_suffix_list>
    <!-- NOTE: path is under top_level_domains_path -->
</top_level_domains_lists>
```

**Syntax**

```sql theme={null}
cutToFirstSignificantSubdomainCustom(url, tld_list_name)
```

**Arguments**

* `url` — URL or domain string to process. [`String`](/reference/data-types/string)
* `tld_list_name` — Name of the custom TLD list configured in ClickHouse. [`const String`](/reference/data-types/string)

**Returned value**

Returns the part of the domain that includes top-level subdomains up to the first significant subdomain. [`String`](/reference/data-types/string)

**Examples**

**Using custom TLD list for non-standard domains**

```sql title=Query theme={null}
SELECT cutToFirstSignificantSubdomainCustom('bar.foo.there-is-no-such-domain', 'public_suffix_list')
```

```response title=Response theme={null}
foo.there-is-no-such-domain
```

<h2 id="cutToFirstSignificantSubdomainCustomRFC">
  cutToFirstSignificantSubdomainCustomRFC
</h2>

Introduced in: v22.10.0

Returns the part of the domain that includes top-level subdomains up to the first significant subdomain.
Accepts custom [TLD list](https://en.wikipedia.org/wiki/List_of_Internet_top-level_domains) name.
This function can be useful if you need a fresh TLD list or if you have a custom list.
Similar to [cutToFirstSignificantSubdomainCustom](#cutToFirstSignificantSubdomainCustom) but conforms to RFC 3986.

**Configuration example**

```xml theme={null}
<!-- <top_level_domains_path>/var/lib/clickhouse/top_level_domains/</top_level_domains_path> -->
<top_level_domains_lists>
    <!-- https://publicsuffix.org/list/public_suffix_list.dat -->
    <public_suffix_list>public_suffix_list.dat</public_suffix_list>
    <!-- NOTE: path is under top_level_domains_path -->
</top_level_domains_lists>
```

**Syntax**

```sql theme={null}
cutToFirstSignificantSubdomainCustomRFC(url, tld_list_name)
```

**Arguments**

* `url` — URL or domain string to process according to RFC 3986. - `tld_list_name` — Name of the custom TLD list configured in ClickHouse.

**Returned value**

Returns the part of the domain that includes top-level subdomains up to the first significant subdomain. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT cutToFirstSignificantSubdomainCustomRFC('www.foo', 'public_suffix_list');
```

```response title=Response theme={null}
┌─cutToFirstSignificantSubdomainCustomRFC('www.foo', 'public_suffix_list')─────┐
│ www.foo                                                                      │
└──────────────────────────────────────────────────────────────────────────────┘
```

<h2 id="cutToFirstSignificantSubdomainCustomWithWWW">
  cutToFirstSignificantSubdomainCustomWithWWW
</h2>

Introduced in: v21.1.0

Returns the part of the domain that includes top-level subdomains up to the first significant subdomain without stripping 'www'. Accepts custom TLD list name. It can be useful if you need a fresh TLD list or if you have a custom list.

**Configuration example**

````yaml theme={null}
<!-- <top_level_domains_path>/var/lib/clickhouse/top_level_domains/</top_level_domains_path> -->
<top_level_domains_lists>
    <!-- https://publicsuffix.org/list/public_suffix_list.dat -->
    <public_suffix_list>public_suffix_list.dat</public_suffix_list>
    <!-- NOTE: path is under top_level_domains_path -->
</top_level_domains_lists>
    

**Syntax**

```sql
cutToFirstSignificantSubdomainCustomWithWWW(url, tld_list_name)
````

**Arguments**

* `url` — URL or domain string to process. - `tld_list_name` — Name of the custom TLD list configured in ClickHouse.

**Returned value**

Part of the domain that includes top-level subdomains up to the first significant subdomain without stripping 'www'. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT cutToFirstSignificantSubdomainCustomWithWWW('www.foo', 'public_suffix_list');
```

```response title=Response theme={null}
┌─cutToFirstSignificantSubdomainCustomWithWWW('www.foo', 'public_suffix_list')─┐
│ www.foo                                                                      │
└──────────────────────────────────────────────────────────────────────────────┘
```

<h2 id="cutToFirstSignificantSubdomainCustomWithWWWRFC">
  cutToFirstSignificantSubdomainCustomWithWWWRFC
</h2>

Introduced in: v22.10.0

Returns the part of the domain that includes top-level subdomains up to the first significant subdomain without stripping `www`.
Accepts custom TLD list name.
It can be useful if you need a fresh TLD list or if you have a custom list.
Similar to [cutToFirstSignificantSubdomainCustomWithWWW](#cutToFirstSignificantSubdomainCustomWithWWW) but conforms to [RFC 3986](https://datatracker.ietf.org/doc/html/rfc3986).

**Configuration example**

````xml theme={null}
{/* <top_level_domains_path>/var/lib/clickhouse/top_level_domains/</top_level_domains_path> */}
<top_level_domains_lists>
    {/* https://publicsuffix.org/list/public_suffix_list.dat */}
    <public_suffix_list>public_suffix_list.dat</public_suffix_list>
    {/* NOTE: path is under top_level_domains_path */}
</top_level_domains_lists>
    

**Syntax**

```sql
cutToFirstSignificantSubdomainCustomWithWWWRFC(url, tld_list_name)
````

**Arguments**

* `url` — URL or domain string to process according to RFC 3986. - `tld_list_name` — Name of the custom TLD list configured in ClickHouse.

**Returned value**

Returns the part of the domain that includes top-level subdomains up to the first significant subdomain without stripping `www`. [`String`](/reference/data-types/string)

**Examples**

**RFC 3986 parsing preserving www with custom TLD list**

```sql title=Query theme={null}
SELECT cutToFirstSignificantSubdomainCustomWithWWWRFC('https://www.subdomain.example.custom', 'public_suffix_list')
```

```response title=Response theme={null}
www.example.custom
```

<h2 id="cutToFirstSignificantSubdomainRFC">
  cutToFirstSignificantSubdomainRFC
</h2>

Introduced in: v22.10.0

Returns the part of the domain that includes top-level subdomains up to the ["first significant subdomain"](/reference/functions/regular-functions/url-functions#firstSignificantSubdomain). Similar to [`cutToFirstSignificantSubdomain`](#cutToFirstSignificantSubdomain) but conforms to [RFC 3986](https://datatracker.ietf.org/doc/html/rfc3986).

**Syntax**

```sql theme={null}
cutToFirstSignificantSubdomainRFC(url)
```

**Arguments**

* `url` — URL or domain string to process according to RFC 3986. [`String`](/reference/data-types/string)

**Returned value**

Returns the part of the domain that includes top-level subdomains up to the first significant subdomain if possible, otherwise returns an empty string. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT
    cutToFirstSignificantSubdomain('http://user:password@example.com:8080'),
    cutToFirstSignificantSubdomainRFC('http://user:password@example.com:8080');
```

```response title=Response theme={null}
┌─cutToFirstSignificantSubdomain('http://user:password@example.com:8080')─┬─cutToFirstSignificantSubdomainRFC('http://user:password@example.com:8080')─┐
│                                                                         │ example.com                                                                │
└─────────────────────────────────────────────────────────────────────────┴────────────────────────────────────────────────────────────────────────────┘
```

<h2 id="cutToFirstSignificantSubdomainWithWWW">
  cutToFirstSignificantSubdomainWithWWW
</h2>

Introduced in: v20.12.0

Returns the part of the domain that includes top-level subdomains up to the "first significant subdomain", without stripping '[www](http://www).'.

Similar to [`cutToFirstSignificantSubdomain`](#cutToFirstSignificantSubdomain) but preserves the '[www](http://www).' prefix if present.

**Syntax**

```sql theme={null}
cutToFirstSignificantSubdomainWithWWW(url)
```

**Arguments**

* `url` — URL or domain string to process. [`String`](/reference/data-types/string)

**Returned value**

Returns the part of the domain that includes top-level subdomains up to the first significant subdomain (with www) if possible, otherwise returns an empty string. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT
    cutToFirstSignificantSubdomainWithWWW('https://news.clickhouse.com.tr/'),
    cutToFirstSignificantSubdomainWithWWW('www.tr'),
    cutToFirstSignificantSubdomainWithWWW('tr');
```

```response title=Response theme={null}
┌─cutToFirstSignificantSubdomainWithWWW('https://news.clickhouse.com.tr/')─┬─cutToFirstSignificantSubdomainWithWWW('www.tr')─┬─cutToFirstSignificantSubdomainWithWWW('tr')─┐
│ clickhouse.com.tr                                                        │ www.tr                                          │                                             │
└──────────────────────────────────────────────────────────────────────────┴─────────────────────────────────────────────────┴─────────────────────────────────────────────┘
```

<h2 id="cutToFirstSignificantSubdomainWithWWWRFC">
  cutToFirstSignificantSubdomainWithWWWRFC
</h2>

Introduced in: v22.10.0

Returns the part of the domain that includes top-level subdomains up to the "first significant subdomain", without stripping 'www'. Similar to [`cutToFirstSignificantSubdomainWithWWW`](#cutToFirstSignificantSubdomainWithWWW) but conforms to [RFC 3986](https://datatracker.ietf.org/doc/html/rfc3986).

**Syntax**

```sql theme={null}
cutToFirstSignificantSubdomainWithWWWRFC(url)
```

**Arguments**

* `url` — URL or domain string to process according to RFC 3986.

**Returned value**

Returns the part of the domain that includes top-level subdomains up to the first significant subdomain (with 'www') if possible, otherwise returns an empty string [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT
    cutToFirstSignificantSubdomainWithWWW('http:%2F%2Fwwwww.nova@mail.ru/economicheskiy'),
    cutToFirstSignificantSubdomainWithWWWRFC('http:%2F%2Fwwwww.nova@mail.ru/economicheskiy');
```

```response title=Response theme={null}
┌─cutToFirstSignificantSubdomainWithWWW('http:%2F%2Fwwwww.nova@mail.ru/economicheskiy')─┬─cutToFirstSignificantSubdomainWithWWWRFC('http:%2F%2Fwwwww.nova@mail.ru/economicheskiy')─┐
│                                                                                       │ mail.ru                                                                                  │
└───────────────────────────────────────────────────────────────────────────────────────┴──────────────────────────────────────────────────────────────────────────────────────────┘
```

<h2 id="cutURLParameter">
  cutURLParameter
</h2>

Introduced in: v1.1.0

Removes the `name` parameter from a URL, if present.
This function does not encode or decode characters in parameter names, e.g. `Client ID` and `Client%20ID` are treated as different parameter names.

**Syntax**

```sql theme={null}
cutURLParameter(url, name)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)
* `name` — Name of URL parameter. [`String`](/reference/data-types/string) or [`Array(String)`](/reference/data-types/array)

**Returned value**

URL with `name` URL parameter removed. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT
    cutURLParameter('http://bigmir.net/?a=b&c=d&e=f#g', 'a') AS url_without_a,
    cutURLParameter('http://bigmir.net/?a=b&c=d&e=f#g', ['c', 'e']) AS url_without_c_and_e;
```

```response title=Response theme={null}
┌─url_without_a────────────────┬─url_without_c_and_e──────┐
│ http://bigmir.net/?c=d&e=f#g │ http://bigmir.net/?a=b#g │
└──────────────────────────────┴──────────────────────────┘
```

<h2 id="cutWWW">
  cutWWW
</h2>

Introduced in: v1.1.0

Removes the leading `www.`, if present, from the URL's domain.

**Syntax**

```sql theme={null}
cutWWW(url)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)

**Returned value**

Returns the URL with leading `www.` removed from the domain. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT cutWWW('http://www.example.com/path?query=value#fragment');
```

```response title=Response theme={null}
┌─cutWWW('http://www.example.com/path?query=value#fragment')─┐
│ http://example.com/path?query=value#fragment               │
└────────────────────────────────────────────────────────────┘
```

<h2 id="decodeURLComponent">
  decodeURLComponent
</h2>

Introduced in: v1.1.0

Takes a URL-encoded string as input and decodes it back to its original, readable form.

**Syntax**

```sql theme={null}
decodeURLComponent(url)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)

**Returned value**

Returns the decoded URL. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT decodeURLComponent('http://127.0.0.1:8123/?query=SELECT%201%3B') AS DecodedURL;
```

```response title=Response theme={null}
┌─DecodedURL─────────────────────────────┐
│ http://127.0.0.1:8123/?query=SELECT 1; │
└────────────────────────────────────────┘
```

<h2 id="decodeURLFormComponent">
  decodeURLFormComponent
</h2>

Introduced in: v1.1.0

Decodes URL-encoded strings using form encoding rules ([RFC-1866](https://www.rfc-editor.org/rfc/rfc1866.html)), where `+` signs are converted to spaces and percent-encoded characters are decoded.

**Syntax**

```sql theme={null}
decodeURLFormComponent(url)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)

**Returned value**

Returns the decoded URL. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT decodeURLFormComponent('http://127.0.0.1:8123/?query=SELECT%201+2%2B3') AS DecodedURL;
```

```response title=Response theme={null}
┌─DecodedURL────────────────────────────────┐
│ http://127.0.0.1:8123/?query=SELECT 1 2+3 │
└───────────────────────────────────────────┘
```

<h2 id="domain">
  domain
</h2>

Introduced in: v1.1.0

Extracts the hostname from a URL.

The URL can be specified with or without a protocol.

**Syntax**

```sql theme={null}
domain(url)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)

**Returned value**

Returns the host name if the input string can be parsed as a URL, otherwise an empty string. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT domain('svn+ssh://some.svn-hosting.com:80/repo/trunk');
```

```response title=Response theme={null}
┌─domain('svn+ssh://some.svn-hosting.com:80/repo/trunk')─┐
│ some.svn-hosting.com                                   │
└────────────────────────────────────────────────────────┘
```

<h2 id="domainRFC">
  domainRFC
</h2>

Introduced in: v22.10.0

Extracts the hostname from a URL.
Similar to [`domain`](#domain), but [RFC 3986](https://datatracker.ietf.org/doc/html/rfc3986) conformant.

**Syntax**

```sql theme={null}
domainRFC(url)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)

**Returned value**

Returns the host name if the input string can be parsed as a URL, otherwise an empty string. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT
    domain('http://user:password@example.com:8080/path?query=value#fragment'),
    domainRFC('http://user:password@example.com:8080/path?query=value#fragment');
```

```response title=Response theme={null}
┌─domain('http://user:password@example.com:8080/path?query=value#fragment')─┬─domainRFC('http://user:password@example.com:8080/path?query=value#fragment')─┐
│                                                                           │ example.com                                                                  │
└───────────────────────────────────────────────────────────────────────────┴──────────────────────────────────────────────────────────────────────────────┘
```

<h2 id="domainWithoutWWW">
  domainWithoutWWW
</h2>

Introduced in: v1.1.0

Returns the domain of a URL without leading `www.` if present.

**Syntax**

```sql theme={null}
domainWithoutWWW(url)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)

**Returned value**

Returns the domain name if the input string can be parsed as a URL (without leading `www.`), otherwise an empty string. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT domainWithoutWWW('http://paul@www.example.com:80/');
```

```response title=Response theme={null}
┌─domainWithoutWWW('http://paul@www.example.com:80/')─┐
│ example.com                                         │
└─────────────────────────────────────────────────────┘
```

<h2 id="domainWithoutWWWRFC">
  domainWithoutWWWRFC
</h2>

Introduced in: v1.1.0

Returns the domain without leading `www.` if present. Similar to [`domainWithoutWWW`](#domainWithoutWWW) but conforms to [RFC 3986](https://datatracker.ietf.org/doc/html/rfc3986).

**Syntax**

```sql theme={null}
domainWithoutWWWRFC(url)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)

**Returned value**

Returns the domain name if the input string can be parsed as a URL (without leading `www.`), otherwise an empty string. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT
    domainWithoutWWW('http://user:password@www.example.com:8080/path?query=value#fragment'),
    domainWithoutWWWRFC('http://user:password@www.example.com:8080/path?query=value#fragment');
```

```response title=Response theme={null}
┌─domainWithoutWWW('http://user:password@www.example.com:8080/path?query=value#fragment')─┬─domainWithoutWWWRFC('http://user:password@www.example.com:8080/path?query=value#fragment')─┐
│                                                                                         │ example.com                                                                                │
└─────────────────────────────────────────────────────────────────────────────────────────┴────────────────────────────────────────────────────────────────────────────────────────────┘
```

<h2 id="encodeURLComponent">
  encodeURLComponent
</h2>

Introduced in: v22.3.0

Takes a regular string and converts it into a URL-encoded (percent-encoded) format where special characters are replaced with their percent-encoded equivalents.

**Syntax**

```sql theme={null}
encodeURLComponent(url)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)

**Returned value**

Returns the encoded URL. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT encodeURLComponent('http://127.0.0.1:8123/?query=SELECT 1;') AS EncodedURL;
```

```response title=Response theme={null}
┌─EncodedURL───────────────────────────────────────────────┐
│ http%3A%2F%2F127.0.0.1%3A8123%2F%3Fquery%3DSELECT%201%3B │
└──────────────────────────────────────────────────────────┘
```

<h2 id="encodeURLFormComponent">
  encodeURLFormComponent
</h2>

Introduced in: v22.3.0

Encodes strings using form encoding rules ([RFC-1866](https://www.rfc-editor.org/rfc/rfc1866.html)), where spaces are converted to + signs and special characters are percent-encoded.

**Syntax**

```sql theme={null}
encodeURLFormComponent(url)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)

**Returned value**

Returns the encoded URL. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT encodeURLFormComponent('http://127.0.0.1:8123/?query=SELECT 1 2+3') AS EncodedURL;
```

```response title=Response theme={null}
┌─EncodedURL────────────────────────────────────────────────┐
│ http%3A%2F%2F127.0.0.1%3A8123%2F%3Fquery%3DSELECT+1+2%2B3 │
└───────────────────────────────────────────────────────────┘
```

<h2 id="extractURLParameter">
  extractURLParameter
</h2>

Introduced in: v1.1.0

Returns the value of the `name` parameter in the URL, if present, otherwise an empty string is returned.
If there are multiple parameters with this name, the first occurrence is returned.
The function assumes that the parameter in the `url` parameter is encoded in the same way as in the `name` argument.

**Syntax**

```sql theme={null}
extractURLParameter(url, name)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)
* `name` — Parameter name. [`String`](/reference/data-types/string)

**Returned value**

Returns the value of the URL parameter with the specified name. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT extractURLParameter('http://example.com/?param1=value1&param2=value2', 'param1');
```

```response title=Response theme={null}
┌─extractURLPa⋯, 'param1')─┐
│ value1                   │
└──────────────────────────┘
```

<h2 id="extractURLParameterNames">
  extractURLParameterNames
</h2>

Introduced in: v1.1.0

Returns an array of name strings corresponding to the names of URL parameters.
The values are not decoded.

**Syntax**

```sql theme={null}
extractURLParameterNames(url)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)

**Returned value**

Returns an array of name strings corresponding to the names of URL parameters. [`Array(String)`](/reference/data-types/array)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT extractURLParameterNames('http://example.com/?param1=value1&param2=value2');
```

```response title=Response theme={null}
┌─extractURLPa⋯m2=value2')─┐
│ ['param1','param2']      │
└──────────────────────────┘
```

<h2 id="extractURLParameters">
  extractURLParameters
</h2>

Introduced in: v1.1.0

Returns an array of `name=value` strings corresponding to the URL parameters.
The values are not decoded.

**Syntax**

```sql theme={null}
extractURLParameters(url)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)

**Returned value**

Returns an array of `name=value` strings corresponding to the URL parameters. [`Array(String)`](/reference/data-types/array)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT extractURLParameters('http://example.com/?param1=value1&param2=value2');
```

```response title=Response theme={null}
┌─extractURLParame⋯&param2=value2')─┐
│ ['param1=value1','param2=value2'] │
└───────────────────────────────────┘
```

<h2 id="firstSignificantSubdomain">
  firstSignificantSubdomain
</h2>

Introduced in: v1.1.0

Returns the "first significant subdomain".

The first significant subdomain is a second-level domain if it is 'com', 'net', 'org', or 'co'.
Otherwise, it is a third-level domain.

For example, firstSignificantSubdomain('[https://news.clickhouse.com/](https://news.clickhouse.com/)') = 'clickhouse', firstSignificantSubdomain ('[https://news.clickhouse.com.tr/](https://news.clickhouse.com.tr/)') = 'clickhouse'.

The list of "insignificant" second-level domains and other implementation details may change in the future.

**Syntax**

```sql theme={null}
firstSignificantSubdomain(url)
```

**Arguments**

* None.

**Returned value**

**Examples**

**firstSignificantSubdomain**

```sql title=Query theme={null}
SELECT firstSignificantSubdomain('https://news.clickhouse.com/')
```

```response title=Response theme={null}
```

<h2 id="firstSignificantSubdomainCustom">
  firstSignificantSubdomainCustom
</h2>

Introduced in: v21.1.0

Returns the first significant subdomain of a URL using a custom TLD (Top-Level Domain) list. The custom TLD list name refers to a configuration that defines which domain suffixes should be treated as top-level domains. This is useful for non-standard TLD hierarchies. The function uses a simplified URL parsing algorithm that assumes the protocol and everything following are stripped.

**Syntax**

```sql theme={null}
firstSignificantSubdomainCustom(url, tld_list_name)
```

**Arguments**

* `url` — The URL to extract the subdomain from. [`String`](/reference/data-types/string)
* `tld_list_name` — Name of the custom TLD list from the configuration. [`String`](/reference/data-types/string)

**Returned value**

Returns the first significant subdomain. [`String`](/reference/data-types/string)

**Examples**

**Basic usage**

```sql title=Query theme={null}
SELECT firstSignificantSubdomainCustom('https://news.example.com', 'public_suffix_list')
```

```response title=Response theme={null}
example
```

<h2 id="firstSignificantSubdomainCustomRFC">
  firstSignificantSubdomainCustomRFC
</h2>

Introduced in: v22.10.0

Similar to `firstSignificantSubdomainCustom` but uses RFC 3986 compliant URL parsing instead of the simplified algorithm.

**Syntax**

```sql theme={null}
firstSignificantSubdomainCustomRFC(url, tld_list_name)
```

**Arguments**

* `url` — The URL to extract the subdomain from. [`String`](/reference/data-types/string)
* `tld_list_name` — Name of the custom TLD list from the configuration. [`String`](/reference/data-types/string)

**Returned value**

Returns the first significant subdomain. [`String`](/reference/data-types/string)

**Examples**

**Basic usage**

```sql title=Query theme={null}
SELECT firstSignificantSubdomainCustomRFC('https://news.example.com', 'public_suffix_list')
```

```response title=Response theme={null}
example
```

<h2 id="firstSignificantSubdomainRFC">
  firstSignificantSubdomainRFC
</h2>

Introduced in: v22.10.0

Returns the "first significant subdomain" according to RFC 1034.

**Syntax**

```sql theme={null}
firstSignificantSubdomainRFC(url)
```

**Arguments**

* None.

**Returned value**

**Examples**

<h2 id="fragment">
  fragment
</h2>

Introduced in: v1.1.0

Returns the fragment identifier without the initial hash symbol.

**Syntax**

```sql theme={null}
fragment(url)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)

**Returned value**

Returns the fragment identifier without the initial hash symbol. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT fragment('https://clickhouse.com/docs/getting-started/quick-start/cloud#1-create-a-clickhouse-service');
```

```response title=Response theme={null}
┌─fragment('http⋯ouse-service')─┐
│ 1-create-a-clickhouse-service │
└───────────────────────────────┘
```

<h2 id="netloc">
  netloc
</h2>

Introduced in: v20.5.0

Extracts network locality (`username:password@host:port`) from a URL.

**Syntax**

```sql theme={null}
netloc(url)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)

**Returned value**

Returns `username:password@host:port` from a given URL. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT netloc('http://paul@www.example.com:80/');
```

```response title=Response theme={null}
┌─netloc('http⋯e.com:80/')─┐
│ paul@www.example.com:80  │
└──────────────────────────┘
```

<h2 id="path">
  path
</h2>

Introduced in: v1.1.0

Returns the path without query string from a URL.

**Syntax**

```sql theme={null}
path(url)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)

**Returned value**

Returns the path of the URL without query string. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT path('https://clickhouse.com/docs/sql-reference/functions/url-functions/?query=value');
```

```response title=Response theme={null}
┌─path('https://clickhouse.com/en/sql-reference/functions/url-functions/?query=value')─┐
│ /docs/sql-reference/functions/url-functions/                                         │
└──────────────────────────────────────────────────────────────────────────────────────┘
```

<h2 id="pathFull">
  pathFull
</h2>

Introduced in: v1.1.0

The same as [`path`](#path), but includes the query string and fragment of the URL.

**Syntax**

```sql theme={null}
pathFull(url)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)

**Returned value**

Returns the path of the URL including query string and fragment. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT pathFull('https://clickhouse.com/docs/sql-reference/functions/url-functions/?query=value#section');
```

```response title=Response theme={null}
┌─pathFull('https://clickhouse.com⋯unctions/?query=value#section')─┐
│ /docs/sql-reference/functions/url-functions/?query=value#section │
└──────────────────────────────────────────────────────────────────┘
```

<h2 id="port">
  port
</h2>

Introduced in: v20.5.0

Returns the port of a URL, or the `default_port` if the URL contains no port or cannot be parsed.

**Syntax**

```sql theme={null}
port(url[, default_port])
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)
* `default_port` — Optional. The default port number to be returned. `0` by default. [`UInt16`](/reference/data-types/int-uint)

**Returned value**

Returns the port of the URL, or the default port if there is no port in the URL or in case of a validation error. [`UInt16`](/reference/data-types/int-uint)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT port('https://clickhouse.com:8443/docs'), port('https://clickhouse.com/docs', 443);
```

```response title=Response theme={null}
┌─port('https://clickhouse.com:8443/docs')─┬─port('https://clickhouse.com/docs', 443)─┐
│                                     8443 │                                      443 │
└──────────────────────────────────────────┴──────────────────────────────────────────┘
```

<h2 id="portRFC">
  portRFC
</h2>

Introduced in: v22.10.0

Returns the port or `default_port` if the URL contains no port or cannot be parsed.
Similar to [`port`](#port), but [RFC 3986](https://datatracker.ietf.org/doc/html/rfc3986) conformant.

**Syntax**

```sql theme={null}
portRFC(url[, default_port])
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)
* `default_port` — Optional. The default port number to be returned. `0` by default. [`UInt16`](/reference/data-types/int-uint)

**Returned value**

Returns the port or the default port if there is no port in the URL or in case of a validation error. [`UInt16`](/reference/data-types/int-uint)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT port('http://user:password@example.com:8080/'), portRFC('http://user:password@example.com:8080/');
```

```response title=Response theme={null}
┌─port('http:/⋯com:8080/')─┬─portRFC('htt⋯com:8080/')─┐
│                        0 │                     8080 │
└──────────────────────────┴──────────────────────────┘
```

<h2 id="protocol">
  protocol
</h2>

Introduced in: v1.1.0

Extracts the protocol from a URL.

Examples of typical returned values: http, https, ftp, mailto, tel, magnet.

**Syntax**

```sql theme={null}
protocol(url)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)

**Returned value**

Returns the protocol of the URL, or an empty string if it cannot be determined. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT protocol('https://clickhouse.com/');
```

```response title=Response theme={null}
┌─protocol('https://clickhouse.com/')─┐
│ https                               │
└─────────────────────────────────────┘
```

<h2 id="queryString">
  queryString
</h2>

Introduced in: v1.1.0

Returns the query string of a URL without the initial question mark, `#` and everything after `#`.

**Syntax**

```sql theme={null}
queryString(url)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)

**Returned value**

Returns the query string of the URL without the initial question mark and fragment. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT queryString('https://clickhouse.com/docs?query=value&param=123#section');
```

```response title=Response theme={null}
┌─queryString(⋯3#section')─┐
│ query=value&param=123    │
└──────────────────────────┘
```

<h2 id="queryStringAndFragment">
  queryStringAndFragment
</h2>

Introduced in: v1.1.0

Returns the query string and fragment identifier of a URL.

**Syntax**

```sql theme={null}
queryStringAndFragment(url)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)

**Returned value**

Returns the query string and fragment identifier of the URL. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT queryStringAndFragment('https://clickhouse.com/docs?query=value&param=123#section');
```

```response title=Response theme={null}
┌─queryStringAnd⋯=123#section')─┐
│ query=value&param=123#section │
└───────────────────────────────┘
```

<h2 id="topLevelDomain">
  topLevelDomain
</h2>

Introduced in: v1.1.0

Extracts the the top-level domain from a URL.

<Note>
  The URL can be specified with or without a protocol.
  For example:

  ```text theme={null}
  svn+ssh://some.svn-hosting.com:80/repo/trunk
  some.svn-hosting.com:80/repo/trunk
  https://clickhouse.com/time/
  ```
</Note>

**Syntax**

```sql theme={null}
topLevelDomain(url)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)

**Returned value**

Returns the domain name if the input string can be parsed as a URL. Otherwise, an empty string. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT topLevelDomain('svn+ssh://www.some.svn-hosting.com:80/repo/trunk');
```

```response title=Response theme={null}
┌─topLevelDomain('svn+ssh://www.some.svn-hosting.com:80/repo/trunk')─┐
│ com                                                                │
└────────────────────────────────────────────────────────────────────┘
```

<h2 id="topLevelDomainRFC">
  topLevelDomainRFC
</h2>

Introduced in: v22.10.0

Extracts the the top-level domain from a URL.
Similar to [`topLevelDomain`](#topLevelDomain), but conforms to [RFC 3986](https://datatracker.ietf.org/doc/html/rfc3986).

**Syntax**

```sql theme={null}
topLevelDomainRFC(url)
```

**Arguments**

* `url` — URL. [`String`](/reference/data-types/string)

**Returned value**

Domain name if the input string can be parsed as a URL. Otherwise, an empty string. [`String`](/reference/data-types/string)

**Examples**

**Usage example**

```sql title=Query theme={null}
SELECT topLevelDomain('http://foo:foo%41bar@foo.com'), topLevelDomainRFC('http://foo:foo%41bar@foo.com');
```

```response title=Response theme={null}
┌─topLevelDomain('http://foo:foo%41bar@foo.com')─┬─topLevelDomainRFC('http://foo:foo%41bar@foo.com')─┐
│                                                │ com                                               │
└────────────────────────────────────────────────┴───────────────────────────────────────────────────┘
```
