# Transformations

Transformations process the result of the inner `sources` before it is passed to the [dataset](https://developer.mediarithmics.io/dashboards/datasets-and-data-sources) or to the next transformation.

```json5
"dataset": {
    "type": "transformation-1",
    // Transformations always take a list of sources
    // Even if only one is used
    "sources": [
        {
            // Transformations can be chained
            "type": "transformation-2"
            "sources": [
                {
                    // End the end we have one more 
                    // query data sources
                    "type": "OTQL"
                    ...
                }
            ]
        }
    ]
}
```

## `to-list`

This puts multiple numbers from multiple queries into a [key / value dataset](https://developer.mediarithmics.io/datasets-and-data-sources#key-value-datasets). It is important to set the key of each number with the `series_title` property. It takes from one to any number of `sources`.

```json5
"dataset": {
    "type": "to-list",
    "sources": [
        {
            "type": "OTQL",
            "query_id": "666", // SELECT @count{} FROM UserPoint WHERE...
            "series_title": "Unknown"
        },
        {
            "type": "OTQL",
            "query_id": "777", // SELECT @count{} FROM UserPoint WHERE...
            "series_title": "With online account"
        },
        {
            "type": "OTQL",
            "query_id": "888", // SELECT @count{} FROM UserPoint WHERE...
            "series_title": "With fidelity program"
        }
    ]
}
```

## `join`

This joins two [key / value datasets](https://developer.mediarithmics.io/datasets-and-data-sources#key-value-datasets) into a [key / values](https://developer.mediarithmics.io/datasets-and-data-sources#key-values-datasets) dataset. It is important to set the name of each series with the `series_title` property. It takes from one to any number of `sources`.

```json5
"dataset": {
    "type": "join",
    "sources": [
        {
            "type": "OTQL",
            "query_id": 666, // Select {interests @map} FROM UserPoint WHERE...
            "series_title": "Unkwnown"
        },
        {
            "type": "OTQL",
            "query_id": 777, // Select {interests @map} FROM UserPoint WHERE...
            "series_title": "With fidelity program"
        }
    ]
}
```

## `to-percentages`

This calculates the representation of each value in the complete dataset. Only one `sources` is accepted.&#x20;

You usually want to use the `format: percentage` option of the associated data visualisation to automatically change the labels and tooltips and formats to display `percentage% (count)`

```json5
"type": "Radar",
"dataset": {
    "type": "to-percentages",
    "sources": [{
        "type": "OTQL",
        "query_id": 666 // Select {interests @map} FROM UserPoint WHERE...
    }]
},
"options": {
    "format": "percentage"
}
```

## `ratio`

This calculates the ratio between two numbers (`source 1 / source 2 * 100)`. It only accepts two `sources` that should each return numbers.&#x20;

```json5
"dataset": {
     "type": "ratio",
     "sources": [
          {
          "type": "OTQL",
          "query_id": "666" // SELECT @count{} FROM UserPoint WHERE...
          }, // Returns 100k
          {
          "type": "OTQL",
          "query_id": "777" // SELECT @count{} FROM UserPoint
          } // Returns 200k
     ]
}
// Result is 100k/200k*100 = 50
```

## `index`

This calculates the representation of values from a [key / value dataset](https://developer.mediarithmics.io/datasets-and-data-sources#key-value-datasets) in comparison to an other [key / value dataset](https://developer.mediarithmics.io/datasets-and-data-sources#key-value-datasets).&#x20;

For example, if 10% of the users in a segment viewed content associated with tag 1, while 5% of the users in the whole datamart viewed content associated with this same tag, the index of tag 1 in segment in comparison to the whole datamart is `10 / 5 * 100 = 200`.

This is typically used to see which values are more/less represented in the first data source compared to the second one. An index above 100 means the value is more represented in the first data source than in the second, a value under 100 means the value is less represented in the first data source than in the second.

This is usually represented in a `Bars` chart with a `plot_line_value` of 100 and an `index` format :&#x20;

![](https://4196284719-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2F-MMuoqM-5hJ5JY0WnAKL%2Fuploads%2F0rLSR4qlROR1ddPtX5aY%2Fimage.png?alt=media\&token=48832fba-d028-4ce8-9af1-7413a97da2fe)

For each value in the first dataset, it automatically calculates its percentage representation in the first and the second source, then does the formula `source value (in percentages) / comparison value (in percentages) * 100` .

```json5
"type": "Bars",
"dataset": {
    "type": "index",
    // Use limits like "limit:20" wisely
    // as the index will be calculated for each return value, then ordered.
    // If you only do a @map with a limit of 10 elements returned and you are asking
    // to show the top 10 indexes, you will have the top 10 indexes from the top 10 values
    // A value could be in position 20 by numbers, but in position 2 by index
    "sources": [
        {
            // This query adapts to the current segment
            "type": "OTQL",
            "query_id": 666, // SELECT {interests @map} FROM UserPoint
            "series_title": "Segment"
        },
        {
            // Same query without adapting to the current segment
            // and always returns data for the whole datamart
            "type": "OTQL",
            "query_id": 666,
            "series_title": "Datamart"
            "adapt_to_scope": false
        }
    ],
    "options": {
        "limit": 10 // Number of elements to display. 10 by default
        "order": "Ascending" | "Descending" // Descending by default
        // This means that indexes will only be calculated for values 
        // representing 0.65% of values in source 1.
        "minimum_percentage" : 0.65 // 0 by default. Values between 0 and 100
    }
},
"options": {
    "type": "bar",
    "plotLineValue": 100,
    "format": "index" // So that the index is correctly displayed in tooltips
}
```

## `format-dates`

This formats timestamps and date fields to the specified date format. Available date formats are [Moment.js date formats](https://momentjs.com/docs/#/displaying/).

Use this transformation to allow the display of friendly dates to the user or to allow joining multiple data sources into the same dataset by putting returning dates in the same format.

```json5
"dataset": {
     "type": "format-dates",
     "sources": [ // Only one source allowed
          {
               "type": "OTQL",
               "query_id": "666" // SELECT {date @date_histogram} FROM UserEvent WHERE...
          },
     ],
      "date_options": {
      "format": "YYYY-MM-DD"
    }
}
```

Dates must be in the `2021-11-05T00:00:00.000Z` format or in timestamp to be formatted.\
\
Typical compatible queries are :&#x20;

* [OTQL queries](https://developer.mediarithmics.io/querying-your-data/otql-queries#date-operators) returning timestamps or @date\_histogram.&#x20;
* [Activities analytics](https://developer.mediarithmics.io/data-streams/data-ingestion/activities-analytics) queries returning the date\_time dimension
* [Collection volumes](https://developer.mediarithmics.io/advanced-usages/platform-monitoring/collection-volumes) queries returning the date\_time dimension.

## `reduce`

This transforms a key / value or key / values dataset into a single number to be displayed in [Metric charts](https://developer.mediarithmics.io/charts#metric-charts).&#x20;

* `avg` calculates the average of values&#x20;
* `count` calculates the number of values&#x20;
* `first` returns the first value&#x20;
* `last` returns the last value&#x20;
* `max` returns the maximum value&#x20;
* `min` returns the minimum value&#x20;
* `sum` returns the sum of all values&#x20;

```json5
"type": "Metric",
"dataset": {
    "type": "reduce",
    "sources": [{
        "type": "OTQL",
        "query_id": 666 // Select {interests @map} FROM UserPoint WHERE...
    }],
    "reduce_options": {
        // avg || count || first || last || max || min || sum
        "type": "count" 
    }
},
```

## `get-decorators`

This transforms identifiers such as channel IDs, compartment IDs and segment IDs into the corresponding channel names, compartment names and segment names.

```json5
// This returns channel IDs associated with the value
"dataset":
     {
          "type": "OTQL",
          "query_id": "666" // SELECT {channel_id @map} FROM UserEvent WHERE...
     }
}

// This returns channel names associated with the value
"dataset": {
     "type": "get-decorators",
     "sources": [ // Only one source allowed
          {
               "type": "OTQL",
               "query_id": "666" // SELECT {channel_id @map} FROM UserEvent WHERE...
          },
     ],
     "decorators_options": {
          "model_type": "CHANNELS", // CHANNELS || COMPARTMENTS || SEGMENTS
          // Optional if the data source returns sub buckets, 
          // to define the transformation for those sub buckets
          "buckets": { 
               // Recursive
          	"buckets": {
            		"model_type": "SEGMENTS"
          	}
          }
     }
}
```
