> ## Documentation Index
> Fetch the complete documentation index at: https://private-7c7dfe99-mintlify-3a82795f.mintlify.site/llms.txt
> Use this file to discover all available pages before exploring further.

> Provides a read-only table-like interface to Apache Iceberg tables in Amazon S3, Azure, HDFS or locally stored.

# iceberg

Provides a read-only table-like interface to Apache [Iceberg](https://iceberg.apache.org/) tables in Amazon S3, Azure, HDFS or locally stored.

<h2 id="syntax">
  Syntax
</h2>

```sql theme={null}
icebergS3(url [, NOSIGN | access_key_id, secret_access_key, [session_token]] [,format] [,compression_method] [,extra_credentials])
icebergS3(named_collection[, option=value [,..]])

icebergAzure(connection_string|storage_account_url, container_name, blobpath, [,account_name], [,account_key] [,format] [,compression_method])
icebergAzure(named_collection[, option=value [,..]])

icebergHDFS(path_to_table, [,format] [,compression_method])
icebergHDFS(named_collection[, option=value [,..]])

icebergLocal(path_to_table, [,format] [,compression_method])
icebergLocal(named_collection[, option=value [,..]])
```

<h2 id="arguments">
  Arguments
</h2>

Description of the arguments coincides with description of arguments in table functions `s3`, `azureBlobStorage`, `HDFS` and `file` correspondingly.
`format` stands for the format of data files in the Iceberg table.

For `icebergS3`, an optional `extra_credentials` parameter can be used to pass a `role_arn` for role-based access in ClickHouse Cloud. See [Secure S3](/products/cloud/guides/data-sources/accessing-s3-data-securely) for configuration steps.

<h3 id="returned-value">
  Returned value
</h3>

A table with the specified structure for reading data in the specified Iceberg table.

<h3 id="example">
  Example
</h3>

```sql theme={null}
SELECT * FROM icebergS3('http://test.s3.amazonaws.com/clickhouse-bucket/test_table', 'test', 'test')
```

<Warning>
  ClickHouse currently supports reading v1 and v2 of the Iceberg format via the `icebergS3`, `icebergAzure`, `icebergHDFS` and `icebergLocal` table functions and `IcebergS3`, `icebergAzure`, `IcebergHDFS` and `IcebergLocal` table engines.
</Warning>

<h2 id="defining-a-named-collection">
  Defining a named collection
</h2>

Here is an example of configuring a named collection for storing the URL and credentials:

```xml theme={null}
<clickhouse>
    <named_collections>
        <iceberg_conf>
            <url>http://test.s3.amazonaws.com/clickhouse-bucket/</url>
            <access_key_id>test<access_key_id>
            <secret_access_key>test</secret_access_key>
            <format>auto</format>
            <structure>auto</structure>
        </iceberg_conf>
    </named_collections>
</clickhouse>
```

```sql theme={null}
SELECT * FROM icebergS3(iceberg_conf, filename = 'test_table')
DESCRIBE icebergS3(iceberg_conf, filename = 'test_table')
```

<h2 id="iceberg-writes-catalogs">
  Using a data catalog
</h2>

Iceberg tables can also be used with various data catalogs, such as the [REST Catalog](https://iceberg.apache.org/rest-catalog-spec/), [AWS Glue Data Catalog](https://docs.aws.amazon.com/prescriptive-guidance/latest/serverless-etl-aws-glue/aws-glue-data-catalog.html) and [Unity Catalog](https://www.unitycatalog.io/).

<Warning>
  When using a catalog, most users will want to use the `DataLakeCatalog` database engine, which connects ClickHouse to your catalog to discover your tables. You can use this database engine instead of manually creating individual tables with `IcebergS3` table engine.
</Warning>

To use them, create a table with the `IcebergS3` engine and provide the necessary settings.

For example, using REST Catalog with MinIO storage:

```sql theme={null}
CREATE TABLE `database_name.table_name`
ENGINE = IcebergS3(
  'http://minio:9000/warehouse-rest/table_name/',
  'minio_access_key',
  'minio_secret_key'
)
```

Or, using AWS Glue Data Catalog with S3:

```sql theme={null}
CREATE TABLE `my_database.my_table`  
ENGINE = IcebergS3(
  's3://my-data-bucket/warehouse/my_database/my_table/',
  'aws_access_key',
  'aws_secret_key'
)
```

<h2 id="schema-evolution">
  Schema Evolution
</h2>

At the moment, with the help of CH, you can read iceberg tables, the schema of which has changed over time. We currently support reading tables where columns have been added and removed, and their order has changed. You can also change a column where a value is required to one where NULL is allowed. Additionally, we support permitted type casting for simple types, namely:  

* int -> long
* float -> double
* decimal(P, S) -> decimal(P', S) where P' > P.

Currently, it is not possible to change nested structures or the types of elements within arrays and maps.

<h2 id="partition-pruning">
  Partition Pruning
</h2>

ClickHouse supports partition pruning during SELECT queries for Iceberg tables, which helps optimize query performance by skipping irrelevant data files. To enable partition pruning, set `use_iceberg_partition_pruning = 1`. For more information about iceberg partition pruning address [https://iceberg.apache.org/spec/#partitioning](https://iceberg.apache.org/spec/#partitioning)

<h2 id="time-travel">
  Time Travel
</h2>

ClickHouse supports time travel for Iceberg tables, allowing you to query historical data with a specific timestamp or snapshot ID.

<h2 id="deleted-rows">
  Processing of tables with deleted rows
</h2>

Currently, only Iceberg tables with [position deletes](https://iceberg.apache.org/spec/#position-delete-files) are supported.

The following deletion methods are **not supported**:

* [Equality deletes](https://iceberg.apache.org/spec/#equality-delete-files)
* [Deletion vectors](https://iceberg.apache.org/spec/#deletion-vectors) (introduced in v3)

<h3 id="basic-usage">
  Basic usage
</h3>

```sql theme={null}
SELECT * FROM example_table ORDER BY 1 
SETTINGS iceberg_timestamp_ms = 1714636800000
```

```sql theme={null}
SELECT * FROM example_table ORDER BY 1 
SETTINGS iceberg_snapshot_id = 3547395809148285433
```

Note: You cannot specify both `iceberg_timestamp_ms` and `iceberg_snapshot_id` parameters in the same query.

<h3 id="important-considerations">
  Important considerations
</h3>

* **Snapshots** are typically created when:

* New data is written to the table

* Some kind of data compaction is performed

* **Schema changes typically don't create snapshots** - This leads to important behaviors when using time travel with tables that have undergone schema evolution.

<h3 id="example-scenarios">
  Example scenarios
</h3>

All scenarios are written in Spark because CH doesn't support writing to Iceberg tables yet.

<h4 id="scenario-1">
  Scenario 1: Schema Changes Without New Snapshots
</h4>

Consider this sequence of operations:

```sql theme={null}
-- Create a table with two columns
 CREATE TABLE IF NOT EXISTS spark_catalog.db.time_travel_example (
 order_number bigint, 
 product_code string
 ) 
 USING iceberg 
 OPTIONS ('format-version'='2')

- - Insert data into the table
 INSERT INTO spark_catalog.db.time_travel_example VALUES 
   (1, 'Mars')

 ts1 = now() // A piece of pseudo code

- - Alter table to add a new column
 ALTER TABLE spark_catalog.db.time_travel_example ADD COLUMN (price double)

 ts2 = now()

- - Insert data into the table
 INSERT INTO spark_catalog.db.time_travel_example VALUES (2, 'Venus', 100)

  ts3 = now()

- - Query the table at each timestamp
 SELECT * FROM spark_catalog.db.time_travel_example TIMESTAMP AS OF ts1;

+------------+------------+
|order_number|product_code|
+------------+------------+
|           1|        Mars|
+------------+------------+
 SELECT * FROM spark_catalog.db.time_travel_example TIMESTAMP AS OF ts2;

+------------+------------+
|order_number|product_code|
+------------+------------+
|           1|        Mars|
+------------+------------+

 SELECT * FROM spark_catalog.db.time_travel_example TIMESTAMP AS OF ts3;

+------------+------------+-----+
|order_number|product_code|price|
+------------+------------+-----+
|           1|        Mars| NULL|
|           2|       Venus|100.0|
+------------+------------+-----+
```

Query results at different timestamps:

* At ts1 & ts2: Only the original two columns appear
* At ts3: All three columns appear, with NULL for the price of the first row

<h4 id="scenario-2">
  Scenario 2:  Historical vs. Current Schema Differences
</h4>

A time travel query at a current moment might show a different schema than the current table:

```sql theme={null}
-- Create a table
  CREATE TABLE IF NOT EXISTS spark_catalog.db.time_travel_example_2 (
  order_number bigint, 
  product_code string
  ) 
  USING iceberg 
  OPTIONS ('format-version'='2')

-- Insert initial data into the table
  INSERT INTO spark_catalog.db.time_travel_example_2 VALUES (2, 'Venus');

-- Alter table to add a new column
  ALTER TABLE spark_catalog.db.time_travel_example_2 ADD COLUMN (price double);

  ts = now();

-- Query the table at a current moment but using timestamp syntax

  SELECT * FROM spark_catalog.db.time_travel_example_2 TIMESTAMP AS OF ts;

    +------------+------------+
    |order_number|product_code|
    +------------+------------+
    |           2|       Venus|
    +------------+------------+

-- Query the table at a current moment
  SELECT * FROM spark_catalog.db.time_travel_example_2;
    +------------+------------+-----+
    |order_number|product_code|price|
    +------------+------------+-----+
    |           2|       Venus| NULL|
    +------------+------------+-----+
```

This happens because `ALTER TABLE` doesn't create a new snapshot but for the current table Spark takes value of `schema_id` from the latest metadata file, not a snapshot.

<h4 id="scenario-3">
  Scenario 3:  Historical vs. Current Schema Differences
</h4>

The second one is that while doing time travel you can't get state of table before any data was written to it:

```sql theme={null}
-- Create a table
  CREATE TABLE IF NOT EXISTS spark_catalog.db.time_travel_example_3 (
  order_number bigint, 
  product_code string
  ) 
  USING iceberg 
  OPTIONS ('format-version'='2');

  ts = now();

-- Query the table at a specific timestamp
  SELECT * FROM spark_catalog.db.time_travel_example_3 TIMESTAMP AS OF ts; -- Finises with error: Cannot find a snapshot older than ts.
```

In Clickhouse the behavior is consistent with Spark. You can mentally replace Spark Select queries with Clickhouse Select queries and it will work the same way.

<h2 id="metadata-file-resolution">
  Metadata File Resolution
</h2>

When using the `iceberg` table function in ClickHouse, the system needs to locate the correct metadata.json file that describes the Iceberg table structure. Here's how this resolution process works:

<h3 id="candidate-search">
  Candidate Search (in Priority Order)
</h3>

1. **Direct Path Specification**:
   \*If you set `iceberg_metadata_file_path`, the system will use this exact path by combining it with the Iceberg table directory path.

* When this setting is provided, all other resolution settings are ignored.

2. **Table UUID Matching**:
   \*If `iceberg_metadata_table_uuid` is specified, the system will:
   \*Look only at `.metadata.json` files in the `metadata` directory
   \*Filter for files containing a `table-uuid` field matching your specified UUID (case-insensitive)

3. **Default Search**:
   \*If neither of the above settings are provided, all `.metadata.json` files in the `metadata` directory become candidates

<h3 id="most-recent-file">
  Selecting the Most Recent File
</h3>

After identifying candidate files using the above rules, the system determines which one is the most recent:

* If `iceberg_recent_metadata_file_by_last_updated_ms_field` is enabled:

* The file with the largest `last-updated-ms` value is selected

* Otherwise:

* The file with the highest version number is selected

* (Version appears as `V` in filenames formatted as `V.metadata.json` or `V-uuid.metadata.json`)

**Note**: All mentioned settings are table function settings (not global or query-level settings) and must be specified as shown below:

```sql theme={null}
SELECT * FROM iceberg('s3://bucket/path/to/iceberg_table', 
    SETTINGS iceberg_metadata_table_uuid = 'a90eed4c-f74b-4e5b-b630-096fb9d09021');
```

**Note**: While Iceberg Catalogs typically handle metadata resolution, the `iceberg` table function in ClickHouse directly interprets files stored in S3 as Iceberg tables, which is why understanding these resolution rules is important.

<h2 id="metadata-cache">
  Metadata cache
</h2>

`Iceberg` table engine and table function support metadata cache storing the information of manifest files, manifest list and metadata json. The cache is stored in memory. This feature is controlled by setting `use_iceberg_metadata_files_cache`, which is enabled by default.

<h2 id="aliases">
  Aliases
</h2>

Table function `iceberg` is an alias to `icebergS3` now.

<h2 id="virtual-columns">
  Virtual Columns
</h2>

* `_path` — Path to the file. Type: `LowCardinality(String)`.
* `_file` — Name of the file. Type: `LowCardinality(String)`.
* `_size` — Size of the file in bytes. Type: `Nullable(UInt64)`. If the file size is unknown, the value is `NULL`.
* `_time` — Last modified time of the file. Type: `Nullable(DateTime)`. If the time is unknown, the value is `NULL`.
* `_etag` — The etag of the file. Type: `LowCardinality(String)`. If the etag is unknown, the value is `NULL`.

<h2 id="writes-into-iceberg-table">
  Writes into iceberg table
</h2>

Starting from version 25.7, ClickHouse supports modifications of user’s Iceberg tables.

Currently, this is an experimental feature, so you first need to enable it:

```sql theme={null}
SET allow_insert_into_iceberg = 1;
```

<h3 id="create-iceberg-table">
  Creating table
</h3>

To create your own empty Iceberg table, use the same commands as for reading, but specify the schema explicitly.
Writes supports all data formats from iceberg specification, such as Parquet, Avro, ORC.

<h3 id="example-iceberg-writes-create">
  Example
</h3>

```sql theme={null}
CREATE TABLE iceberg_writes_example
(
    x Nullable(String),
    y Nullable(Int32)
)
ENGINE = IcebergLocal('/home/scanhex12/iceberg_example/')
```

Note: To create a version hint file, enable the `iceberg_use_version_hint` setting.
If you want to compress the metadata.json file, specify the codec name in the `iceberg_metadata_compression_method` setting.

<h3 id="writes-inserts">
  INSERT
</h3>

After creating a new table, you can insert data using the usual ClickHouse syntax.

<h3 id="example-iceberg-writes-insert">
  Example
</h3>

```sql theme={null}
INSERT INTO iceberg_writes_example VALUES ('Pavel', 777), ('Ivanov', 993);

SELECT *
FROM iceberg_writes_example
FORMAT VERTICAL;

Row 1:
──────
x: Pavel
y: 777

Row 2:
──────
x: Ivanov
y: 993
```

<h3 id="iceberg-writes-delete">
  DELETE
</h3>

Deleting extra rows in the merge-on-read format is also supported in ClickHouse.
This query will create a new snapshot with position delete files.

<h3 id="example-iceberg-writes-delete">
  Example
</h3>

```sql theme={null}
ALTER TABLE iceberg_writes_example DELETE WHERE x != 'Ivanov';

SELECT *
FROM iceberg_writes_example
FORMAT VERTICAL;

Row 1:
──────
x: Ivanov
y: 993
```

<h3 id="iceberg-writes-schema-evolution">
  Schema evolution
</h3>

ClickHouse allows you to add, drop, modify, or rename columns with simple types (non-tuple, non-array, non-map).

<h3 id="example-iceberg-writes-evolution">
  Example
</h3>

```sql theme={null}
ALTER TABLE iceberg_writes_example MODIFY COLUMN y Nullable(Int64);
SHOW CREATE TABLE iceberg_writes_example;

   ┌─statement─────────────────────────────────────────────────┐
1. │ CREATE TABLE default.iceberg_writes_example              ↴│
   │↳(                                                        ↴│
   │↳    `x` Nullable(String),                                ↴│
   │↳    `y` Nullable(Int64)                                  ↴│
   │↳)                                                        ↴│
   │↳ENGINE = IcebergLocal('/home/scanhex12/iceberg_example/') │
   └───────────────────────────────────────────────────────────┘

ALTER TABLE iceberg_writes_example ADD COLUMN z Nullable(Int32);
SHOW CREATE TABLE iceberg_writes_example;

   ┌─statement─────────────────────────────────────────────────┐
1. │ CREATE TABLE default.iceberg_writes_example              ↴│
   │↳(                                                        ↴│
   │↳    `x` Nullable(String),                                ↴│
   │↳    `y` Nullable(Int64),                                 ↴│
   │↳    `z` Nullable(Int32)                                  ↴│
   │↳)                                                        ↴│
   │↳ENGINE = IcebergLocal('/home/scanhex12/iceberg_example/') │
   └───────────────────────────────────────────────────────────┘

SELECT *
FROM iceberg_writes_example
FORMAT VERTICAL;

Row 1:
──────
x: Ivanov
y: 993
z: ᴺᵁᴸᴸ

ALTER TABLE iceberg_writes_example DROP COLUMN z;
SHOW CREATE TABLE iceberg_writes_example;
   ┌─statement─────────────────────────────────────────────────┐
1. │ CREATE TABLE default.iceberg_writes_example              ↴│
   │↳(                                                        ↴│
   │↳    `x` Nullable(String),                                ↴│
   │↳    `y` Nullable(Int64)                                  ↴│
   │↳)                                                        ↴│
   │↳ENGINE = IcebergLocal('/home/scanhex12/iceberg_example/') │
   └───────────────────────────────────────────────────────────┘

SELECT *
FROM iceberg_writes_example
FORMAT VERTICAL;

Row 1:
──────
x: Ivanov
y: 993

ALTER TABLE iceberg_writes_example RENAME COLUMN y TO value;
SHOW CREATE TABLE iceberg_writes_example;

   ┌─statement─────────────────────────────────────────────────┐
1. │ CREATE TABLE default.iceberg_writes_example              ↴│
   │↳(                                                        ↴│
   │↳    `x` Nullable(String),                                ↴│
   │↳    `value` Nullable(Int64)                              ↴│
   │↳)                                                        ↴│
   │↳ENGINE = IcebergLocal('/home/scanhex12/iceberg_example/') │
   └───────────────────────────────────────────────────────────┘

SELECT *
FROM iceberg_writes_example
FORMAT VERTICAL;

Row 1:
──────
x: Ivanov
value: 993
```

<h3 id="iceberg-writes-compaction">
  Compaction
</h3>

ClickHouse supports compaction iceberg table. Currently, it can merge position delete files into data files while updating metadata. Previous snapshot IDs and timestamps remain unchanged, so the time-travel feature can still be used with the same values.

How to use it:

```sql theme={null}
SET allow_experimental_iceberg_compaction = 1

OPTIMIZE TABLE iceberg_writes_example;

SELECT *
FROM iceberg_writes_example
FORMAT VERTICAL;

Row 1:
──────
x: Ivanov
y: 993
```

<h3 id="iceberg-expire-snapshots">
  Expire Snapshots
</h3>

Iceberg tables accumulate snapshots with each INSERT, DELETE, or UPDATE operation. Over time, this can lead to a large number of snapshots and associated data files. The `expire_snapshots` command removes old snapshots and cleans up data files that are no longer referenced by any retained snapshot.

**Syntax:**

```sql theme={null}
ALTER TABLE iceberg_table EXECUTE expire_snapshots(
    ['timestamp']
    [, expire_before = 'timestamp']
    [, retention_period = '3d']
    [, retain_last = 100]
    [, snapshot_ids = [1, 2, 3, 4]]
    [, dry_run = 1]
);
```

By default, which snapshots to keep is determined by the [retention policy](#iceberg-snapshot-retention-policy) (table properties `min-snapshots-to-keep`, `max-snapshot-age-ms`, and per-ref overrides). When `snapshot_ids` is specified, the retention policy is bypassed and only the listed snapshots are considered for expiration.

**Arguments:**

* `'timestamp'` (positional) or `expire_before = 'timestamp'` — a datetime string (e.g., `'2024-06-01 00:00:00'`) interpreted in the **server's timezone**. Acts as a safety fuse: snapshots whose `timestamp-ms` is at or after this value are protected from expiration, even if the retention policy would otherwise expire them. Can be combined with `snapshot_ids`, in which case listed snapshots at or newer than the timestamp are not expired.
* `retention_period = '<duration>'` — overrides the table-level `history.expire.max-snapshot-age-ms` for this invocation only. Snapshots older than this duration (measured from now) become candidates for expiration. The value is a duration string consisting of one or more `{number}{unit}` pairs concatenated together. Supported units: `y` (365 days), `w` (7 days), `d` (24 hours), `h` (60 minutes), `m` (60 seconds), `s` (1 second), `ms` (1 millisecond). Units can be combined, e.g. `'3d'`, `'12h'`, `'1d12h30m'`, `'500ms'`.
* `retain_last = N` — overrides the table-level `history.expire.min-snapshots-to-keep` for this invocation only. At least `N` snapshots are always retained regardless of age.
* `snapshot_ids = [id1, id2, ...]` — expires exactly the listed snapshot IDs (except snapshots referenced by current snapshot, branches, or tags). This mode bypasses the retention policy entirely and cannot be combined with `retention_period` or `retain_last`.
* `dry_run = 1` — computes what would be expired and returns metrics without writing new metadata or deleting files.

<Note>
  `retention_period` and `retain_last` override only the **table-level** retention defaults. Per-ref (branch/tag) retention overrides configured in the Iceberg table properties (e.g., `refs.<branch>.min-snapshots-to-keep`) are never overridden — they always take effect as specified in the table metadata.
</Note>

**Example:**

```sql theme={null}
SET allow_insert_into_iceberg = 1;

-- Create some snapshots by inserting data
INSERT INTO iceberg_table VALUES (1);
INSERT INTO iceberg_table VALUES (2);
INSERT INTO iceberg_table VALUES (3);

-- Expire using retention policy only
ALTER TABLE iceberg_table EXECUTE expire_snapshots();

-- Expire with a safety fuse: protect snapshots newer than the timestamp (positional syntax)
ALTER TABLE iceberg_table EXECUTE expire_snapshots('2025-01-01 00:00:00');

-- Same using the named argument form
ALTER TABLE iceberg_table EXECUTE expire_snapshots(expire_before = '2025-01-01 00:00:00');

-- Override retention parameters for one execution
ALTER TABLE iceberg_table EXECUTE expire_snapshots(retention_period = '3d', retain_last = 10);

-- Expire explicit snapshots
ALTER TABLE iceberg_table EXECUTE expire_snapshots(snapshot_ids = [101, 102, 103]);

-- Dry-run preview (no metadata updates, no file deletes)
ALTER TABLE iceberg_table EXECUTE expire_snapshots(retention_period = '1d', dry_run = 1);
```

**Output:**

The command returns a table with two columns (`metric_name String`, `metric_value Int64`) containing one row per metric. The metric names follow the [Iceberg spec](https://iceberg.apache.org/docs/latest/spark-procedures/#output):

| metric\_name                          | Description                                             |
| ------------------------------------- | ------------------------------------------------------- |
| `deleted_data_files_count`            | Number of data files deleted                            |
| `deleted_position_delete_files_count` | Number of position delete files deleted                 |
| `deleted_equality_delete_files_count` | Number of equality delete files deleted                 |
| `deleted_manifest_files_count`        | Number of manifest files deleted                        |
| `deleted_manifest_lists_count`        | Number of manifest list files deleted                   |
| `deleted_statistics_files_count`      | Number of statistics files deleted (always 0 currently) |
| `dry_run`                             | `1` for dry-run mode, `0` for normal execution          |

The command performs the following steps:

1. Evaluates the retention policy (see below) to determine which snapshots must be preserved
2. If a timestamp argument was provided, additionally protects all snapshots at or newer than that timestamp
3. Expires snapshots that are neither retained by the policy nor protected by the timestamp fuse
4. Computes which files are exclusively associated with expired snapshots
5. In normal mode: generates new metadata without the expired snapshots
6. In normal mode: physically deletes unreachable manifest lists, manifest files, and data files
7. In `dry_run = 1` mode: skips steps 5 and 6 and only returns the calculated metrics

<h4 id="iceberg-snapshot-retention-policy">
  Snapshot Retention Policy
</h4>

The `expire_snapshots` command respects the [Iceberg snapshot retention policy](https://iceberg.apache.org/spec/#snapshot-retention-policy). Retention is configured via Iceberg table properties and per-reference overrides:

| Property                               | Scope | Default                                                                    | Description                                                                                         |
| -------------------------------------- | ----- | -------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------- |
| `history.expire.min-snapshots-to-keep` | Table | `iceberg_expire_default_min_snapshots_to_keep` (default `1`)               | Minimum number of snapshots to keep in each branch's ancestor chain                                 |
| `history.expire.max-snapshot-age-ms`   | Table | `iceberg_expire_default_max_snapshot_age_ms` (default `432000000`, 5 days) | Maximum age (in ms) of snapshots to retain in a branch                                              |
| `history.expire.max-ref-age-ms`        | Table | `iceberg_expire_default_max_ref_age_ms` (default `∞`)                      | Maximum age (in ms) for a snapshot reference (branch or tag) before the reference itself is removed |

Each snapshot reference (`refs` in the Iceberg metadata) can override these with per-ref fields: `min-snapshots-to-keep`, `max-snapshot-age-ms`, and `max-ref-age-ms`.

**Retention evaluation:**

* **For each branch** (including `main`): the ancestor chain is walked starting from the branch head. Snapshots are retained while either of these conditions is true:
  * The snapshot is one of the first `min-snapshots-to-keep` in the chain
  * The snapshot's age is within `max-snapshot-age-ms` (i.e., `now - timestamp-ms <= max-snapshot-age-ms`)
* **For tags**: the tagged snapshot is retained unless the tag has exceeded its `max-ref-age-ms`, in which case the tag reference is removed
* **Non-main references** whose age exceeds `max-ref-age-ms` are removed entirely (the `main` branch is never removed)
* **Dangling references** that point to non-existent snapshots are removed with a warning
* **The current snapshot is always preserved**, regardless of retention settings

**Required privileges:**

The `ALTER TABLE EXECUTE` privilege is required, which is a child of `ALTER TABLE` in the ClickHouse access control hierarchy. You can grant it specifically or via the parent:

```sql theme={null}
-- Grant only EXECUTE permission
GRANT ALTER TABLE EXECUTE ON my_iceberg_table TO my_user;

-- Or grant all ALTER TABLE permissions (includes ALTER TABLE EXECUTE)
GRANT ALTER TABLE ON my_iceberg_table TO my_user;
```

<Note>
  * Only Iceberg format version 2 tables are supported (v1 snapshots do not guarantee `manifest-list`, which is required to safely identify files for cleanup)
  * The current snapshot is always preserved, even if it is older than the specified timestamp
  * Requires the `allow_insert_into_iceberg` setting to be enabled
  * Requires the `allow_experimental_expire_snapshots` setting to be enabled
  * The catalog's own authorization (REST catalog auth, AWS Glue IAM, etc.) is enforced independently when ClickHouse updates the metadata
</Note>

<h3 id="iceberg-remove-orphan-files">
  Remove Orphan Files
</h3>

Orphan files are files on storage that are not referenced by any snapshot in the Iceberg table metadata. They accumulate from failed writes, partial cleanup after compaction, and interrupted operations, causing unbounded storage growth. The `remove_orphan_files` command identifies and removes these orphan files.

**Syntax:**

```sql theme={null}
-- Positional form: single unnamed older_than argument
ALTER TABLE iceberg_table EXECUTE remove_orphan_files('timestamp')

-- Named form
ALTER TABLE iceberg_table EXECUTE remove_orphan_files(
    older_than = 'timestamp',
    location = 'path',
    dry_run = 0|1
)

-- No arguments: use all defaults (older_than = 3 days ago)
ALTER TABLE iceberg_table EXECUTE remove_orphan_files()
```

**Parameters:**

| Parameter    | Type                 | Default                                                                 | Description                                                                                                                                                |
| ------------ | -------------------- | ----------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `older_than` | `String` (timestamp) | 3 days ago (configurable via `iceberg_orphan_files_older_than_seconds`) | Only consider files with a last-modified time older than this timestamp as orphan candidates. Safety guard against deleting files from in-progress writes. |
| `location`   | `String`             | Table location                                                          | Restrict the scan to a specific subdirectory under the table location (e.g., `'data/'` or `'metadata/'`).                                                  |
| `dry_run`    | `UInt64`             | `0`                                                                     | When `1`, identify orphan files and return the result summary without actually deleting anything.                                                          |

**Examples:**

```sql theme={null}
-- Remove orphan files older than a specific timestamp
ALTER TABLE iceberg_table EXECUTE remove_orphan_files('2026-03-01 00:00:00');

-- Dry run: preview which files would be deleted
ALTER TABLE iceberg_table EXECUTE remove_orphan_files(dry_run = 1);

-- Scan only the data directory
ALTER TABLE iceberg_table EXECUTE remove_orphan_files(
    older_than = '2026-03-01 00:00:00',
    location = 'data/'
);

-- Combine positional older_than with named arguments
ALTER TABLE iceberg_table EXECUTE remove_orphan_files(
    '2026-03-01 00:00:00',
    dry_run = 1
);
```

**Output:**

The command returns a table with `metric_name` and `metric_value` columns showing the count of deleted (or would-be-deleted in dry\_run mode) files by category. File categories are classified using best-effort heuristics based on file naming conventions; files that do not match any specific pattern default to `deleted_data_files_count`:

| metric\_name                            | metric\_value |
| --------------------------------------- | ------------- |
| deleted\_data\_files\_count             | 5             |
| deleted\_position\_delete\_files\_count | 2             |
| deleted\_equality\_delete\_files\_count | 0             |
| deleted\_manifest\_files\_count         | 3             |
| deleted\_manifest\_lists\_count         | 1             |
| deleted\_metadata\_files\_count         | 0             |
| deleted\_statistics\_files\_count       | 0             |
| skipped\_missing\_metadata\_count       | 0             |
| failed\_deletions\_count                | 0             |

**Settings:**

| Setting                                   | Type     | Default           | Description                                                             |
| ----------------------------------------- | -------- | ----------------- | ----------------------------------------------------------------------- |
| `allow_iceberg_remove_orphan_files`       | `Bool`   | `false`           | Gate setting to enable the feature (experimental).                      |
| `iceberg_orphan_files_older_than_seconds` | `UInt64` | `259200` (3 days) | Default `older_than` threshold in seconds when the argument is omitted. |

<Note>
  * **Requires Iceberg format version 2 (or higher).** Version 1 tables are rejected because they lack `manifest-list` pointers in snapshots, which are needed to safely determine the reachable file set. Running the command on a v1 table returns a `BAD_ARGUMENTS` error.
  * Requires both `allow_insert_into_iceberg` and `allow_iceberg_remove_orphan_files` settings to be enabled
  * It is recommended to run `expire_snapshots` before `remove_orphan_files` so that files uniquely referenced by expired snapshots are cleaned up first
  * Use `dry_run = 1` to preview orphan files before deletion
  * The `older_than` threshold protects against deleting files from in-progress writes — the default 3-day threshold provides a generous safety margin
</Note>

<h2 id="see-also">
  See Also
</h2>

* [Iceberg engine](/reference/engines/table-engines/integrations/iceberg)
* [Iceberg cluster table function](/reference/functions/table-functions/icebergCluster)
