> Documentation for the Regexp format

# Regexp

| Input | Output | Alias |
| ----- | ------ | ----- |
| ✔     | ✗      |       |

<h2 id="description">
  Description
</h2>

The `Regexp` format parses every line of imported data according to the provided regular expression.

**Usage**

The regular expression from the [format\_regexp](/reference/settings/formats#format_regexp) setting is applied to every line of imported data. The number of subpatterns in the regular expression must equal the number of columns in the imported dataset.

Lines of the imported data must be separated by a newline character `'\n'` or a DOS-style newline `"\r\n"`.

The content of every matched subpattern is parsed with the method of the corresponding column data type, according to the [format\_regexp\_escaping\_rule](/reference/settings/formats#format_regexp_escaping_rule) setting.

If the regular expression does not match the line and [format\_regexp\_skip\_unmatched](/reference/settings/formats#format_regexp_skip_unmatched) is set to 1, the line is silently skipped. Otherwise, an exception is thrown.
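The rules above can be modeled with a short Python sketch. This is an illustration only, not ClickHouse's implementation: the function name `extract_rows` and its parameters are hypothetical, Python's `re` module stands in for re2, and the sketch assumes the expression must match the whole line. Parsing each captured string into its column type under the escaping rule is not modeled.

```python
import re


def extract_rows(lines, pattern, n_columns, skip_unmatched=False):
    """Hypothetical model of the per-line matching step."""
    regex = re.compile(pattern)
    # The number of subpatterns must equal the number of columns.
    if regex.groups != n_columns:
        raise ValueError(
            f"pattern has {regex.groups} subpatterns, expected {n_columns}"
        )

    rows = []
    for line in lines:
        # Accept both '\n' and DOS-style '\r\n' line endings.
        if line.endswith("\r\n"):
            text = line[:-2]
        elif line.endswith("\n"):
            text = line[:-1]
        else:
            text = line

        match = regex.fullmatch(text)  # assumption: whole-line match
        if match is None:
            if skip_unmatched:
                continue  # behaves like format_regexp_skip_unmatched=1
            raise ValueError(f"line does not match the pattern: {text!r}")
        rows.append(match.groups())
    return rows


sample = ["a=1\n", "junk\r\n", "b=2\r\n"]
print(extract_rows(sample, r"(\w+)=(\d+)", 2, skip_unmatched=True))
# prints [('a', '1'), ('b', '2')]
```

With `skip_unmatched=False`, the same input raises an error on the `junk` line, mirroring the exception described above.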

<h2 id="example-usage">
  Example usage
</h2>

Consider the file `data.tsv`:

```text title="data.tsv" theme={null}
id: 1 array: [1,2,3] string: str1 date: 2020-01-01
id: 2 array: [1,2,3] string: str2 date: 2020-01-02
id: 3 array: [1,2,3] string: str3 date: 2020-01-03
```

and table `imp_regex_table`:

```sql title="Query" theme={null}
CREATE TABLE imp_regex_table (id UInt32, array Array(UInt32), string String, date Date) ENGINE = Memory;
```

We'll insert the data from the aforementioned file into the table above using the following query:

```bash title="Query" theme={null}
$ cat data.tsv | clickhouse-client --query "INSERT INTO imp_regex_table SETTINGS format_regexp='id: (.+?) array: (.+?) string: (.+?) date: (.+?)', format_regexp_escaping_rule='Escaped', format_regexp_skip_unmatched=0 FORMAT Regexp;"
```

We can now `SELECT` the data from the table to see how the `Regexp` format parsed the data from the file:

```sql title="Query" theme={null}
SELECT * FROM imp_regex_table;
```

```text title="Response" theme={null}
┌─id─┬─array───┬─string─┬───────date─┐
│  1 │ [1,2,3] │ str1   │ 2020-01-01 │
│  2 │ [1,2,3] │ str2   │ 2020-01-02 │
│  3 │ [1,2,3] │ str3   │ 2020-01-03 │
└────┴─────────┴────────┴────────────┘
```
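To see which text each subpattern captures before it is parsed into a column value, the expression from the example can be tried on one line of `data.tsv`. The sketch below is an approximation: it uses Python's `re` module rather than re2, and assumes the expression is matched against the whole line. The `COLUMNS` list is an illustrative helper that mirrors the table definition.

```python
import re

# Expression from the INSERT query in the example above.
PATTERN = r"id: (.+?) array: (.+?) string: (.+?) date: (.+?)"
# Column names of imp_regex_table, in order (illustrative helper).
COLUMNS = ["id", "array", "string", "date"]

line = "id: 1 array: [1,2,3] string: str1 date: 2020-01-01"

# Assumption: the expression must cover the entire line.
match = re.fullmatch(PATTERN, line)
captured = dict(zip(COLUMNS, match.groups()))
print(captured)
# prints {'id': '1', 'array': '[1,2,3]', 'string': 'str1', 'date': '2020-01-01'}
```

Each captured string is then interpreted according to the column type and the chosen escaping rule, which is how `[1,2,3]` becomes an `Array(UInt32)` value in the table.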

<h2 id="format-settings">
  Format settings
</h2>

When working with the `Regexp` format, you can use the following settings:

* `format_regexp` — [String](/reference/data-types/string). Contains a regular expression in [re2](https://github.com/google/re2/wiki/Syntax) syntax.

* `format_regexp_escaping_rule` — [String](/reference/data-types/string). The following escaping rules are supported:

  * CSV (similarly to [CSV](/reference/formats/CSV/CSV))
  * JSON (similarly to [JSONEachRow](/reference/formats/JSON/JSONEachRow))
  * Escaped (similarly to [TSV](/reference/formats/TabSeparated/TabSeparated))
  * Quoted (similarly to [Values](/reference/formats/Values))
  * Raw (extracts subpatterns as a whole, no escaping rules, similarly to [TSVRaw](/reference/formats/TabSeparated/TabSeparated))

* `format_regexp_skip_unmatched` — [UInt8](/reference/data-types/int-uint). Defines whether to throw an exception when the `format_regexp` expression does not match a line of the imported data. Can be set to `0` (throw an exception) or `1` (silently skip the unmatched line).
