# Destination file

## Overview

A **destination file** defines where and how data files generated by the platform are delivered (for example via SFTP or cloud storage). It is configured directly in the **computing console**.

On its own, a destination file does not actively export data. It becomes effective when a **feed** (through the *stream-to-file* capability) references the destination file using its auto-generated **token**. In that case, the feed exports data chunks to the delivery file system, which aggregates them into a single file and delivers it to the targeted destination.

Typical use cases include:

* Sending an audience segment's user identifiers to an external partner for activation.
* Supporting custom connectors relying on file-based delivery.

### Supported protocols & object storages

A destination file can target one of the following systems:

* **SFTP** – Secure file transfer over SSH
* **S3** – Amazon S3 buckets
* **Google Cloud Storage** – mediarithmics GCS buckets

The destination `type` property defines how and where the files are physically uploaded.

## Destination file setup

The setup is done in the Computing console > File destinations.

{% stepper %}
{% step %}

### Create the destination file

1. Click the *New file delivery destination* button.
2. Fill in the required [configuration fields](#configuration-fields).
3. Save.
   {% endstep %}

{% step %}

### Add the credentials

1. In the action menu of the created file delivery destination, select the *Credentials* option.
2. Copy the provided template (different for each storage type) and fill it in with the required fields.
3. Save.

{% hint style="info" %}
Once saved, the credential file can no longer be read, but it can still be overwritten.
{% endhint %}
{% endstep %}

{% step %}

### Validate the connection

In the action menu of the created file delivery destination, select the *Validate* option.
{% endstep %}

{% step %}

### Use it

Use the token to reference the destination file.
{% endstep %}
{% endstepper %}

{% hint style="info" %}
To validate the connection, the integration user must have an Editor role on the object storage (or equivalent) with permissions to create and delete a temporary empty file for verification.
{% endhint %}

## Configuration fields

### General information

| Field              | Description                                                                                                             |
| ------------------ | ----------------------------------------------------------------------------------------------------------------------- |
| **Name**           | Human-readable name used in the UI.                                                                                     |
| **Technical name** | Internal identifier used by connectors. This value is referenced in plugin code but has no functional impact by itself. |
| **Type**           | Protocol or object storage type: `SFTP`, `S3`, or `GOOGLE_CLOUD_STORAGE`.                                               |
| **Deduplicate**    | When enabled, removes *perfect duplicates* within a file (two rows that are strictly identical).                        |

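As a rough illustration of what removing *perfect duplicates* means, the Python sketch below keeps the first occurrence of each strictly identical row. The function name `drop_perfect_duplicates` and the sample rows are hypothetical; this is not the platform's actual implementation.

```python
def drop_perfect_duplicates(rows):
    """Keep the first occurrence of each strictly identical row, preserving order."""
    seen = set()
    unique = []
    for row in rows:
        if row not in seen:
            seen.add(row)
            unique.append(row)
    return unique


# Hypothetical rows: only the third one is a perfect duplicate of an earlier row.
rows = ["user_1,segment_a", "user_2,segment_a", "user_1,segment_a", "user_1,segment_b"]
print(drop_perfect_duplicates(rows))
```

Rows that differ by even one character (such as `user_1,segment_b` above) are kept.
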
### File information

| Field                 | Description                                                                                                                                                 |
| --------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **File path macro**   | Path (from the root of the destination) where the file will be written. Must **not start with `/`** and must **end with `/`**. Supports dynamic macros.     |
| **File name macro**   | File name to generate. Supports dynamic macros.                                                                                                             |
| **Compression**       | File compression mode: `NONE`, `GZIP`, or `ZIP`.                                                                                                            |
| **Encoding**          | Character encoding. Recommended and supported value: `UTF-8`.                                                                                               |
| **File header macro** | Optional header written at the top of the file (for example column names). Can be left empty. Supports macros.                                              |

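The two constraints on the file path macro can be checked before saving the configuration. The Python sketch below is only an illustration; `check_file_path_macro` is a hypothetical helper, not part of the platform.

```python
def check_file_path_macro(path_macro):
    """Return the list of rule violations for a file path macro (empty if valid)."""
    problems = []
    if path_macro.startswith("/"):
        problems.append("must not start with '/'")
    if not path_macro.endswith("/"):
        problems.append("must end with '/'")
    return problems


print(check_file_path_macro("data/exports/"))   # no violation
print(check_file_path_macro("/data/exports"))   # both rules violated
```

An empty list means the path satisfies both rules; the check does not evaluate any macro the path contains.
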
### Upload trigger

You must select one of the trigger types:

1. **Size-based**: Files are uploaded immediately when they reach the specified size limit. Otherwise, a secondary time-based interval ensures data is delivered even if the size limit is not reached.
2. **Frequency-based**: Files are uploaded based on a strict daily limit. This restricts the maximum number of files sent to the destination within a rolling 24-hour period.

If you have selected Size-based, fill in the following fields:

| Field                  | Description                                                                                                                                                                   |
| ---------------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **File max size (KB)** | <p>As soon as the aggregated data reaches this size, the file is immediately uploaded to the destination.</p><p>Recommended value: <strong>100,000 KB (≈100 MB)</strong>.</p> |
| **Min upload per day** | Controls the upload **frequency** (number of upload windows per rolling 24h period). Acts as a secondary trigger when file size is not reached.                               |

If you have selected Frequency-based, fill in the following field:

| Field                  | Description                                                                                                                                                                      |
| ---------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Max upload per day** | Divides a rolling 24-hour period into equal upload windows to cap the daily file count. Uploads are only delayed if the destination server is still processing the previous file. |

## Technical appendix

### Macros and templating

All macros rely on **Freemarker** templating. Macros can be used in:

* File path
* File name
* File header

They allow dynamic file organization based on time, segments, or feeds configuration.

#### Commonly used variables in macros

| Macro                    | Description                                                                                    |
| ------------------------ | ---------------------------------------------------------------------------------------------- |
| `DATE`                   | Evaluates the date at which the file is generated.                                             |
| `PROPERTY`               | Evaluates a property defined in the plugin configuration.                                      |
| `SEGMENT.ID`             | Returns the segment ID.                                                                        |
| `SEGMENT.TECHNICAL_NAME` | Returns the segment technical name.                                                            |
| `GROUPING_KEY`           | Evaluates the `grouping_key` defined in the plugin code. Used only in the context of a plugin. |
| `RANDOM_UUID`            | Generates a random UUID for uniqueness.                                                        |

#### Date formatting example

Freemarker date formatting can be used to structure folders:

```
data/partner/events/${DATE?string["yyyy/MM/dd"]}/
```

This example generates a hierarchical path such as:

```
data/partner/events/2025/03/18/
```
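
The platform evaluates this macro with Freemarker. To preview the resulting path outside the platform, the same date layout can be reproduced in Python, as sketched below. The helper `dated_path` is hypothetical, and `%Y/%m/%d` is the `strftime` counterpart of the `yyyy/MM/dd` pattern.

```python
from datetime import datetime


def dated_path(prefix, generated_at):
    """Append a year/month/day folder hierarchy to a path prefix."""
    # "%Y/%m/%d" mirrors the "yyyy/MM/dd" pattern used in the macro.
    return f"{prefix}{generated_at.strftime('%Y/%m/%d')}/"


print(dated_path("data/partner/events/", datetime(2025, 3, 18)))
# prints: data/partner/events/2025/03/18/
```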

<details>

<summary>Example configuration</summary>

**File path macro**

```
data/exports/${SEGMENT.TECHNICAL_NAME}/${DATE?string["yyyy/MM"]}/
```

**File name macro**

```
events_${GROUPING_KEY}_${RANDOM_UUID}.csv
```

This setup produces monthly-partitioned files, uniquely identified per execution. Files are compressed only when the **Compression** field is set to `GZIP` or `ZIP`.

</details>

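To get a feel for what the example configuration expands to, the Python sketch below substitutes sample values into the two macros. It is not a Freemarker engine: `preview_macro` is a hypothetical helper that only understands plain `${NAME}` placeholders and `${DATE?string["..."]}` with the `yyyy`, `MM`, and `dd` pattern letters, and every variable value is made up for the preview.

```python
import re
from datetime import datetime

# Translation of the few date pattern letters used on this page.
JAVA_TO_STRFTIME = {"yyyy": "%Y", "MM": "%m", "dd": "%d"}

PLACEHOLDER = re.compile(r'\$\{([A-Z_.]+)(?:\?string\["([^"]*)"\])?\}')


def preview_macro(macro, variables, generated_at):
    """Expand ${NAME} and ${DATE?string["pattern"]} placeholders for a preview."""
    def expand(match):
        name, pattern = match.group(1), match.group(2)
        if name == "DATE" and pattern is not None:
            for java, py in JAVA_TO_STRFTIME.items():
                pattern = pattern.replace(java, py)
            return generated_at.strftime(pattern)
        return str(variables[name])  # raises KeyError for unknown variables

    return PLACEHOLDER.sub(expand, macro)


# Hypothetical values, chosen only for this preview.
variables = {
    "SEGMENT.TECHNICAL_NAME": "high_value_customers",
    "GROUPING_KEY": "1234",
    "RANDOM_UUID": "3f2b8c1e-0000-4000-8000-000000000000",
}
generated_at = datetime(2025, 3, 18)

path = preview_macro(
    'data/exports/${SEGMENT.TECHNICAL_NAME}/${DATE?string["yyyy/MM"]}/',
    variables,
    generated_at,
)
name = preview_macro("events_${GROUPING_KEY}_${RANDOM_UUID}.csv", variables, generated_at)
print(path + name)
```

With these sample values, the file lands under `data/exports/high_value_customers/2025/03/`.
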
### Grouping key

The **grouping key** is a concept used **exclusively in plugins** (during the `user_segment_update` phase) to control how records are aggregated.

It defines **which records belong to the same logical file stream**. Records sharing the same grouping key value are grouped together and written into the same file(s).

The grouping key is **defined in the** **plugin code**, not in the destination file configuration. The destination file can then reference it through the `${GROUPING_KEY}` macro.

#### Common grouping strategies

* **Grouping by `segment_id`**\
  Accumulates records belonging to the same segment.
  * Can mix data coming from **multiple plugin instances** as long as they target the same segment.
  * Useful for segment-centric exports.
* **Grouping by `datamart_id`**\
  Accumulates records coming from all plugin instances within the same datamart.
  * Produces consolidated exports at the datamart level.
* **Grouping by `feed_id`**\
  Accumulates records coming from a **single plugin instance only**.
  * Ensures strict isolation between feeds.

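The effect of each strategy can be pictured with a small Python sketch that buckets the same records by different keys. The record fields and the `group_records` helper are hypothetical; in practice the grouping key is set in the plugin code.

```python
from collections import defaultdict


def group_records(records, key_field):
    """Bucket records by key_field: one bucket per logical file stream."""
    streams = defaultdict(list)
    for record in records:
        streams[record[key_field]].append(record)
    return dict(streams)


# Hypothetical records emitted by three feeds of the same datamart.
records = [
    {"feed_id": "f1", "segment_id": "s1", "datamart_id": "d1", "user": "u1"},
    {"feed_id": "f2", "segment_id": "s1", "datamart_id": "d1", "user": "u2"},
    {"feed_id": "f3", "segment_id": "s2", "datamart_id": "d1", "user": "u3"},
]

for field in ("segment_id", "datamart_id", "feed_id"):
    sizes = {key: len(group) for key, group in group_records(records, field).items()}
    print(field, sizes)
```

Grouping by `segment_id` merges feeds `f1` and `f2` into one stream, grouping by `datamart_id` yields a single consolidated stream, and grouping by `feed_id` keeps the three feeds separate.
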
### File size & upload frequency behavior

File uploads are controlled by **two effective parameters**. A file is uploaded as soon as **one of the active conditions is met**.

#### File max size (primary trigger)

* The **file max size** is always the **main limiting factor**.
* While writing records, as soon as the file reaches the configured size limit, it is **immediately uploaded** to the destination.
* This behavior applies regardless of time-based settings.

#### Min upload per day (secondary trigger)

Despite its name, this parameter should be understood as:

> **Number of upload windows per rolling 24-hour period**

* It defines a **time-based flush interval**.
* Even if the file size limit is not reached, the file will be uploaded at the end of each interval.
* The period is **rolling**, not aligned to fixed clock boundaries.

The interval duration is calculated as:

```
24 hours / min_upload_per_day
```
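
The interplay between the size trigger and the time-based flush can be sketched in Python as follows. The function names are hypothetical, and the sketch leaves out the case where no data has been received, in which nothing is uploaded.

```python
from datetime import timedelta


def flush_interval(min_upload_per_day):
    """Time-based flush interval: 24 hours divided by the number of windows."""
    return timedelta(hours=24) / min_upload_per_day


def should_upload(current_size_kb, elapsed, file_max_size_kb, min_upload_per_day):
    """Upload as soon as either the size limit or the flush interval is reached."""
    size_reached = current_size_kb >= file_max_size_kb
    interval_elapsed = elapsed >= flush_interval(min_upload_per_day)
    return size_reached or interval_elapsed


print(flush_interval(1))   # 1 day, 0:00:00
print(flush_interval(24))  # 1:00:00
print(should_upload(100000, timedelta(minutes=5), 100000, 24))  # True: size limit reached
print(should_upload(2500, timedelta(minutes=59), 100000, 24))   # False: neither condition met
```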

#### Max upload per day

This parameter divides a rolling 24-hour period into equal upload windows to cap the daily file count.

The interval duration is calculated as:

```
24 hours / max_upload_per_day
```

Files are then uploaded so that the number of files sent over the period never exceeds the configured maximum.

If a file with exactly the same name is already present on the destination server, the upload is delayed and the incoming data is aggregated into the next file upload.

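The window arithmetic can be sketched in Python as shown below. The `upload_windows` helper and the `period_start` value are hypothetical: how the platform anchors the rolling period is not specified on this page.

```python
from datetime import datetime, timedelta


def upload_windows(period_start, max_upload_per_day):
    """Split the 24 hours following period_start into equal upload windows."""
    window = timedelta(hours=24) / max_upload_per_day
    return [
        (period_start + i * window, period_start + (i + 1) * window)
        for i in range(max_upload_per_day)
    ]


# With max_upload_per_day = 4, each window lasts 6 hours.
for start, end in upload_windows(datetime(2025, 3, 18, 9, 0), 4):
    print(start.isoformat(), "->", end.isoformat())
```

At most one file is expected per window, which is what caps the daily file count.
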
#### Examples

**Example 1**

```
File max size (KB) = 100000
Min upload per day = 1
```

Behavior:

* Files are uploaded immediately when they reach **100 MB**.
* If the size is not reached, the file is uploaded at the end of a **rolling 24-hour period**.

This configuration:

* Does **not guarantee daily delivery**: no file is sent if the feed produces no data.
* Is suitable when file volume is unpredictable and may require more than one file per day.

**Example 2**

```
File max size (KB) = 100000
Min upload per day = 24
```

Behavior:

* Files are uploaded immediately when they reach **100 MB**.
* If the size is not reached, the file is uploaded at the end of a **rolling 1-hour period**.

This configuration:

* Caps delivery to **at most one file per hour**.
* Is **not suitable** if more than one file per hour is required.

### Key notes & best practices

* Use **compression** for large exports to optimize transfer and storage costs.
* Prefer date-based partitioning in paths to simplify downstream processing and retention.


