Data model
A schema is applied to a datamart and defines what mediarithmics should index and make available through queries.
Each schema is written in the GraphQL schema language and defines an object tree index, which lets you run fast Object Tree Query Language (OTQL) queries to search users.

Sample Schema

This schema defines all available mediarithmics objects with the standard properties. When defining your schema, you can start from this schema and add/remove properties based on your needs and the data you ingest into the platform.
The number of referenced properties affects query performance, so declare only the properties you actually need rather than copying all the default ones.
UserPoint is the root element of any mediarithmics schema, and only one index can currently be created. This may change in future releases to allow you to build several indexes.
type UserPoint @TreeIndexRoot(index:"USER_INDEX") {
  id: ID!
  creation_ts: Timestamp! @TreeIndex(index:"USER_INDEX")
  agents: [UserAgent!]!
  accounts: [UserAccount!]!
  emails: [UserEmail!]!
  activities: [UserActivity!]!
  profiles: [UserProfile!]!
  segments: [UserSegment!]!
  choices: [UserChoice!]!
  scenarios: [UserScenario!]!
}

type UserAgent {
  id: ID!
  creation_ts: Timestamp!
  last_activity_ts: Timestamp
  user_agent_info: UserAgentInfo @Function(name:"DeviceInfo", params:["id"])
}

type UserAccount {
  id: ID!
  compartment_id: String! @TreeIndex(index:"USER_INDEX")
  user_account_id: String!
  creation_ts: Timestamp!
}

type UserEmail {
  id: ID!
  email: String
  creation_ts: Timestamp!
  last_activity_ts: Timestamp
}

type UserSegment {
  id: ID! @TreeIndex(index:"USER_INDEX")
  creation_ts: Timestamp! @TreeIndex(index:"USER_INDEX")
  last_modified_ts: Timestamp! @TreeIndex(index:"USER_INDEX")
  expiration_ts: Timestamp @TreeIndex(index:"USER_INDEX")
}

type UserProfile {
  id: ID!
  compartment_id: String!
  user_account_id: String
  creation_ts: Timestamp!
  last_modified_ts: Timestamp!
}

type UserActivity {
  id: ID!
  type: UserActivityType!
  source: UserActivitySource!
  ts: Timestamp!
  duration: Int
  events: [UserEvent!]!
}

type UserEvent {
  id: ID!
  name: String!
  ts: Timestamp!
}

type UserScenario {
  id: ID! @TreeIndex(index:"USER_INDEX")
  scenario_id: String! @TreeIndex(index:"USER_INDEX")
  execution_id: String! @TreeIndex(index:"USER_INDEX")
  node_id: String! @TreeIndex(index:"USER_INDEX")
  callback_ts: Timestamp @TreeIndex(index:"USER_INDEX")
  start_ts: Timestamp! @TreeIndex(index:"USER_INDEX")
  node_start_ts: Timestamp! @TreeIndex(index:"USER_INDEX")
  active: Boolean @TreeIndex(index:"USER_INDEX")
}

type UserChoice {
  id: ID!
  processing_id: String! @TreeIndex(index:"USER_INDEX")
  choice_acceptance_value: Boolean! @TreeIndex(index:"USER_INDEX")
  creation_ts: Timestamp!
  user_account_id: String
  compartment_id: String
  email_hash: String
  user_agent_id: String
  channel_id: String
  choice_ts: Timestamp! @TreeIndex(index:"USER_INDEX")
}

type UserAgentInfo {
  form_factor: FormFactor
  brand: String
  browser_family: BrowserFamily
  browser_version: String
  carrier: String
  model: String
  os_family: OperatingSystemFamily
  os_version: String
  agent_type: UserAgentType
}

Syntax highlights

The ! operator

The ! operator marks elements as mandatory. That means the element is expected not to be null.
type MyType {
  user_account_id: String # doesn't necessarily have a user account
  user_account_id: String! # has a user account
  events: [UserEvent!]! # has a list of events, in which each event can't be null
  events: [UserEvent!] # doesn't necessarily have a list of events, but lists can't have null elements
}
If you add the ! operator to a field that happens to have null values, the entire object won't be indexed.
It is hard to ensure a field will always have a value in all the data you'll put into the platform, whatever the ingestion method. Therefore, we recommend not using this operator in your schema for fields other than the predefined ones.
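The effect of the ! operator on indexing can be pictured with a small sketch. This is only an illustration of the rule stated above, not the platform's implementation: the `REQUIRED_FIELDS` tuple and the `is_indexable` helper are hypothetical names introduced for the example.

```python
from typing import Any

# Hypothetical list of mandatory fields, mirroring `id: ID!` and `ts: Timestamp!`.
REQUIRED_FIELDS = ("id", "ts")


def is_indexable(obj: dict[str, Any], required: tuple[str, ...] = REQUIRED_FIELDS) -> bool:
    """Return True only if every mandatory field is present and not None."""
    return all(obj.get(field) is not None for field in required)


events = [
    {"id": "e1", "ts": 1632753811859, "name": "page_view"},
    {"id": "e2", "ts": None, "name": "basket_view"},  # null in a mandatory field
]

# The second event is dropped as a whole, not just its null field.
indexed = [event for event in events if is_indexable(event)]
```

Under this assumption, a single null value in a mandatory field is enough to exclude the whole object, which is why the operator is best kept for fields you fully control.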

The ID type

This type is treated as a keyword string, but it marks the value as an identifier that is not meant to be meaningful to a user.
type UserChoice {
  id: ID!
}

Basic types

Several native types are available for use in your schema.
type UserProfile {
  id: ID!
  creation_ts: Timestamp
  email: String
  age: Int
  active: Boolean
}

Timestamps and dates

As a best practice, import date values as Timestamp.
To display the value as a date and time in query results or exports, use the Date type.
// Origin activity
{
  ...
  "$ts": 1632753811859,
  "other_date": "2021-09-27T14:43:31.859Z",
  "other_ts": 1632753811859
  ...
}
type UserActivity {
  ...
  ts: Timestamp @TreeIndex(index:"USER_INDEX")
  other_date: Date
  other_ts: Timestamp
  date: Date @Function(name:"ISODate", params:["ts"])
  ...
}

## Doing SELECT { ts other_date other_ts date } ...
## returns
## "ts": 1632753811859,
## "other_date": "2021-09-27T14:43:31.859Z",
## "other_ts": 1632753811859,
## "date": "2021-09-27T14:43:31.859Z",
You usually get data as Timestamp and generate the Date type from the Timestamp with the ISODate function. If not, then ensure you get data in the correct format. There is no implicit conversion between timestamps and dates.
// Origin activity
{
  ...
  "other_date": 1632753811859,
  ...
}

type UserActivity {
  # This won't work as received data is a timestamp.
  other_date: Date
}

## SELECT { other_date } ...
## throws an error
Both types are compatible with Date operators in queries. When creating a date from a timestamp, put the @TreeIndex directive on only one of the two fields: this saves space in the index, and both types have the same capabilities in queries.

Directives

@TreeIndexRoot

This directive marks the root element of an Object Tree Index. The index property sets the name of the Object Tree Index.
It should always be USER_INDEX, as multiple indexes are not currently supported.
type UserPoint @TreeIndexRoot(index:"USER_INDEX") {
}

@TreeIndex

This directive makes a field available in the WHERE clause and in Aggregation operations of your OTQL queries. Fields that don't have this directive can't be used in the WHERE clause but can still be retrieved in the SELECT clause.
type UserEvent {
  id: ID!
  ts: Timestamp!
  # url and referrer properties are now available in WHERE clauses
  url: String @TreeIndex(index:"USER_INDEX")
  referrer: String @TreeIndex(index:"USER_INDEX")
}
Don't mark every field with this directive. Some fields, such as first name or last name, will never be used in WHERE clauses and would only make your index larger.
The @TreeIndex directive is mandatory for some default properties. They already have that directive in the default schema, and you shouldn't remove it, or your schema won't be validated.
When registering a String in a Tree Index with the directive @TreeIndex, you can specify how the field should be indexed, depending on how you want to use it later.
Two modes are available, text and keyword.
type myType {
  mystring: String @TreeIndex(index:"USER_INDEX", data_type: "text")
  secondstring: String @TreeIndex(index:"USER_INDEX", data_type: "keyword")
}

String indexed as text

This mode treats your value as a set of words (e.g. a text). For example, the value 'The QUICK brown fox JuMpS, over the Lazy doG.' will be considered as the list of:
  • the
  • quick
  • brown
  • fox
  • jumps
  • over
  • lazy
  • dog
As you can see, some transformations were applied before storing the data:
  • all the words were converted to lowercase, so all string operators are case insensitive on a field indexed with data_type: "text"
  • the original string was split, and the splitting characters were removed (here, the space, the comma and the period)
The method used to split the text into words is described in great detail here. The most common characters that trigger a split are (non-exhaustive list):
  • .
  • -
  • '
  • "
  • ,
  • ;
  • ?
  • !
The data_type: "text" mode should be used when you're working with:
  • Full sentences (ex. a Page Title)
  • URLs
  • List of keywords (separated by a splitting character as listed above)
  • similar text
Generally, this mode is used when you don't have great control over the value being collected in this field, and you want to do "broad" queries based on it.
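The transformation described for the text mode can be approximated with a short sketch. It assumes that splitting happens on whitespace plus the characters listed above and nothing else; the real analyzer is not specified here and most likely handles more cases. The `text_tokens` helper and `SPLIT_PATTERN` are hypothetical names.

```python
import re

# Assumption: whitespace plus the (non-exhaustive) split characters listed above.
SPLIT_PATTERN = re.compile(r"""[\s.\-'",;?!]+""")


def text_tokens(value: str) -> list[str]:
    """Lowercase the value, split it, and keep each distinct word once, in order."""
    seen: dict[str, None] = {}
    for word in SPLIT_PATTERN.split(value.lower()):
        if word:  # skip the empty strings produced at the edges of the value
            seen.setdefault(word, None)
    return list(seen)


tokens = text_tokens("The QUICK brown fox JuMpS, over the Lazy doG.")
```

With this approximation, the sample sentence yields the same eight lowercase words as the list above, and a query on "quick" would match regardless of the original casing.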

String indexed as keyword

This mode is used to consider your value as a single word. No transformation is done with the provided value. The data_type: "keyword" mode should be used when you're working with:
  • Single values
  • Ids passed as text (ex: UUIDs, productId, categoryId, etc.)
  • Every time that you already know the values that are passed in the field (e.g. when the field data is linked to a taxonomy)
  • etc.
Generally, this mode is used when you have great control over the value being collected in this field, and you want to match it exactly in later queries.

@Property

By default, the path associated with each of your properties is the name of these properties. You can change this behavior with the @Property directive.
type UserEvent {
  id: ID!
  ts: Timestamp!
  name: String!
  # We are creating shortcuts to the $url, $referrer and $items properties
  # that are normally in a $properties object in the user event.
  # This will make them easier to query
  url: String @Property(path:"$properties.$url")
  referrer: String @Property(path:"$properties.$referrer")
  products: [Product] @Property(path:"$properties.$items")
}

type Product {
  # Here we simply change the name into id and name instead of $id and $name
  id: String @TreeIndex(index:"USER_INDEX") @Property(path:"$id")
  name: String @TreeIndex(index:"USER_INDEX") @Property(path:"$name")
}
All the properties in the default schema already redefine their path. For example, the creation_ts property in the UserPoint object points to the $creation_ts property in the stored data. The declaration should theoretically use the @Property directive, but this is unnecessary because the platform does that work for you.
type UserPoint {
  # What should have been declared
  creation_ts: Timestamp! @Property(path:"$creation_ts")
  # What is declared as a shortcut
  creation_ts: Timestamp!
}
type Product {
  # We do have to use the @Property directive as those properties
  # don't exist in the default schema for a Product object type
  id: String @Property(path:"$id")
  name: String @Property(path:"$name")
}

Taking value from multiple paths

You can define multiple paths to get the data from. If the first path is empty, the second one is used, and so on.
In this example, a user activity's channel ID is either the site ID or the app ID, depending on the user activity's context.
type MyType {
  channel_id: String @Property(paths:["$site_id", "$app_id"])
}
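The fallback between paths can be sketched as follows. This is a rough model of the behavior described above, assuming that "empty" means a missing, null, or empty-string value; `get_path` and `first_value` are hypothetical helpers, not platform functions.

```python
from typing import Any, Optional


def get_path(obj: dict[str, Any], path: str) -> Optional[Any]:
    """Follow a dotted path such as "$properties.$url"; return None if a step is missing."""
    current: Any = obj
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


def first_value(obj: dict[str, Any], paths: list[str]) -> Optional[Any]:
    """Return the value found at the first path that is neither missing nor empty."""
    for path in paths:
        value = get_path(obj, path)
        if value not in (None, ""):
            return value
    return None


site_activity = {"$site_id": "1234"}
app_activity = {"$app_id": "5678"}

site_channel = first_value(site_activity, ["$site_id", "$app_id"])
app_channel = first_value(app_activity, ["$site_id", "$app_id"])
```

Here a site activity resolves `channel_id` from `$site_id`, while an app activity falls through to `$app_id`, so both kinds of activity can be queried through the same field.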

Available tokens

You can use the [parent] token to go up in the object tree when defining a path.
type MyType {
  creative_id: String @Property(path:"[parent].[parent].$origin.$creative_id")
}
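One way to picture the [parent] token is as a step up a list of ancestors. The sketch below assumes the object tree can be represented as a lineage list (root first, current object last); that representation, the `resolve` helper, and the sample property values are all assumptions made for illustration.

```python
from typing import Any, Optional

PARENT = "[parent]"


def resolve(lineage: list[dict[str, Any]], path: str) -> Optional[Any]:
    """Resolve `path` from the last object in `lineage` (root first, current object last)."""
    steps = path.split(".")
    index = len(lineage) - 1
    # Each leading [parent] token moves one level up the tree.
    while steps and steps[0] == PARENT:
        index -= 1
        steps = steps[1:]
    if index < 0:
        return None  # the path climbs above the root of the lineage
    current: Any = lineage[index]
    for key in steps:
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# Hypothetical data: an activity containing an event containing an item.
activity = {"$origin": {"$creative_id": "c-42"}}
event = {"$event_name": "$item_view"}
item = {"$id": "sku-1"}

creative_id = resolve([activity, event, item], "[parent].[parent].$origin.$creative_id")
```

In this model, two [parent] tokens take the lookup from the item up to the activity, where `$origin.$creative_id` is then read as a normal dotted path.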

@Mirror

This directive allows you to create custom types based on predefined types.
# UserEvent type has been renamed ArticleView
# Not really interesting and should be avoided
type ArticleView @Mirror(object_type:"UserEvent") {}

# More advanced usage: ArticleView objects are UserEvents
# with a name of "navigation.article"
type ArticleView @Mirror(object_type:"UserEvent", filter:"name == \"navigation.article\"") {}

Sample usage: custom types with filters

type UserPoint @TreeIndexRoot(index:"USER_INDEX") {
  ###
  basketviews: [BasketView]
  productviews: [ProductView]
}

type BasketView @Mirror(object_type:"UserEvent", filter:"name == \"$basket_view\"") {}
type ProductView @Mirror(object_type:"UserEvent", filter:"name == \"$page_view\"") {}
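Conceptually, a mirrored type with a filter behaves like a filtered view over the original objects. The sketch below models only the name-equality filters used in the examples above; the `mirror` helper and the sample events are hypothetical, and the platform evaluates the actual filter expression itself.

```python
from typing import Any


def mirror(events: list[dict[str, Any]], name: str) -> list[dict[str, Any]]:
    """Keep only the events whose name equals `name`."""
    return [event for event in events if event.get("name") == name]


# Hypothetical events attached to one user point.
events = [
    {"id": "e1", "name": "$basket_view"},
    {"id": "e2", "name": "$page_view"},
    {"id": "e3", "name": "$basket_view"},
]

basket_views = mirror(events, "$basket_view")
product_views = mirror(events, "$page_view")
```

Each mirrored list contains the same underlying events, only narrowed down by name, which is why `basketviews` and `productviews` can be queried as if they were separate object types.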

@Function

The @Function directive is used to declare a calculated field with a set of predefined functions.

ISODate

This function creates a date from a timestamp.
type MyType {
  # creation_date is a Date created from the timestamp creation_ts
  creation_date: Date! @Function(name:"ISODate", params:["creation_ts"])
}
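The conversion from a timestamp to a date can be reproduced outside the platform with a few lines. This sketch assumes timestamps are Unix epoch milliseconds and that the resulting date is an ISO 8601 UTC string with millisecond precision, as in the sample values shown in this page; `iso_date` is a hypothetical helper, not the platform function.

```python
from datetime import datetime, timezone


def iso_date(ts_ms: int) -> str:
    """Format a millisecond Unix timestamp as an ISO 8601 UTC string."""
    # Integer division avoids floating-point rounding on the seconds part.
    seconds = datetime.fromtimestamp(ts_ms // 1000, tz=timezone.utc)
    return seconds.strftime("%Y-%m-%dT%H:%M:%S.") + f"{ts_ms % 1000:03d}Z"


creation_date = iso_date(1632753811859)
```

For the sample timestamp 1632753811859, this produces "2021-09-27T14:43:31.859Z", the same value as the date shown for that timestamp in the Timestamps and dates example.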

DeviceInfo

This function extracts device information for an agent identifier.
type UserAgent {
  id: ID! @TreeIndex(index:"USER_INDEX")
  user_agent_info: UserAgentInfo @Function(name:"DeviceInfo", params:["id"])
}
The UserAgentInfo class has the following properties:
type UserAgentInfo {
  form_factor: FormFactor
  brand: String
  browser_family: BrowserFamily
  browser_version: String
  carrier: String
  model: String
  os_family: OperatingSystemFamily
  os_version: String
  agent_type: UserAgentType
}

### The following enums are predefined.
### It is not necessary to define them

enum FormFactor {
  WEARABLE_COMPUTER
  TABLET
  SMARTPHONE
  GAME_CONSOLE
  SMART_TV
  PERSONAL_COMPUTER
  OTHER
}

enum BrowserFamily {
  OTHER
  CHROME
  IE
  FIREFOX
  SAFARI
  OPERA
  STOCK_ANDROID
  BOT
  EMAIL_CLIENT
  MICROSOFT_EDGE
}

enum OperatingSystemFamily {
  OTHER
  WINDOWS
  MAC_OS
  LINUX
  ANDROID
  IOS
}

enum UserAgentType {
  WEB_BROWSER
  MOBILE_APP
}

@ReferenceTable

When users create queries using your schema, they usually remember the names of the elements they are searching for but not their identifiers.
You can add the @ReferenceTable directive to fields storing channel, compartment, and segment identifiers. That way, users get autocompletion on the element's name instead of its identifier when creating their queries.
type UserSegment {
  id: ID! @ReferenceTable(type:"CORE_OBJECT", model_type:"SEGMENTS") @TreeIndex(index:"USER_INDEX")
}

type UserActivity {
  channel_id: String @ReferenceTable(model_type:"CHANNELS", type:"CORE_OBJECT") @TreeIndex(index:"USER_INDEX") @Property(paths:["$site_id", "$app_id"])
}

type UserProfile {
  compartment_id: String! @ReferenceTable(model_type:"COMPARTMENTS", type:"CORE_OBJECT") @TreeIndex(index:"USER_INDEX")
}

type UserEvent {
  channel_id: String @ReferenceTable(model_type:"CHANNELS", type:"CORE_OBJECT") @Property(paths:["[parent].$site_id", "[parent].$app_id"]) @TreeIndex(index:"USER_INDEX")
}

@EdgeAvailability

This directive marks properties as usable in queries when creating Edge segments.
type UserAccount {
  id: ID!
  # This property won't be usable in Edge segment queries
  compartment_id: String!
  # This property will be usable in Edge segment queries
  user_account_id: String! @TreeIndex(index:"USER_INDEX") @EdgeAvailability
}

Best practices

Do not index ISODate function result

Do not index the output of the ISODate Function. You should index the timestamp value only.
# DO
type UserAgent {
  creation_ts: Timestamp! @TreeIndex(index:"USER_INDEX")
  creation_date: Date! @Function(name:"ISODate", params:["creation_ts"])
  user_agent_info: UserAgentInfo @Function(params:["id"], name:"DeviceInfo")
  id: ID!
  last_activity_ts: Timestamp
}

# DON'T
type UserAgent {
  creation_ts: Timestamp!
  creation_date: Date! @Function(name:"ISODate", params:["creation_ts"]) @TreeIndex(index:"USER_INDEX")
  user_agent_info: UserAgentInfo @Function(params:["id"], name:"DeviceInfo")
  id: ID!
  last_activity_ts: Timestamp
}

UserEvent indexed twice

In some scenarios, you may need events both directly in the UserPoint and in user activities, for example to use frequency OTQL directives on UserEvents while also building queries on several events that occurred in a single activity.
In any other case, do not duplicate the UserEvent. Use it either in the user point or in the user activity.
# Only do if in a specific scenario requiring it
type UserPoint @TreeIndexRoot(index:"USER_INDEX") {
  ###
  activities: [UserActivity!]!
  events: [UserEvent!]!
}

type UserActivity {
  ###
  events: [UserEvent!]!
}

type UserEvent @Mirror(object_type:"UserEvent") {
  name: String! @TreeIndex(index:"USER_INDEX")
  id: ID!
  ts: Timestamp!
}