Relation properties
Relations enable the non-technical users to use dimensions and measures from other datasets when there is a relation between the source dataset and the target datasets. Typically, Rakam generates JOIN statements in SQL.
There are two different ways to define the relations, either via relationships
test in dbt or under meta.rakam.relations
in you have more complex logic.
Creating relation via dbt test
seeds:
- name: countries
columns:
- name: iso_code
models:
- name: pageviews
columns:
- name: country
tests:
- not_null
- relationships:
to: ref('countries')
field: iso_code
rakam: # Includes info about the relationship
type: left_join
relationship: many_to_many
Creating relation via meta.rakam
meta.rakam
In some cases, you need to write a SQL expression to build up a relation between the datasets. In this case, you can use meta.rakam.relations
as follows:
seeds:
- name: countries
columns:
- name: iso_code
models:
- name: pageviews
meta:
rakam:
relations:
countries:
to: ref('countries')
type: left_join
sql: "get_iso_code({TABLE}.country) = {TARGET}.iso_code"
relationship: many_to_many
Here is the full list of properties that you can use:
relationship:
relationship:
Available values are one_to_one
, one_to_many
, many_to_many
, and many_to_many
.
The default is:
type: one_to_one
type:
type:
Available values are left_join
, right_join
, inner_join
, full_join
If you use left_join
, the following SQL expression is generated:
SELECT ...
LEFT JOIN target ON (source.source_col = target.target_col)
The default is:
type: left_join
label:
label:
The value that will be visible in the user interface. The default value is the relation name but you can override it as follows:
label: Countries
description:
description:
Example:
label: The country the user signed up from
hidden:
hidden:
Hides the relation in the user interface. It's useful in case you want to test the relationship in the ELT layer with dbt but hide it from the end-users.
Default:
hidden: false
Symmetric Aggregates
Self joins
You can join the same table or model more than once by specifying multiple join relations with different names. Here is an example:
relations:
customers:
relationship: 'one_to_one'
type: 'left_join'
to: ref('apps')
vendors:
relationship: 'one_to_one'
type: 'inner_join'
to: ref('apps')
Join through another relation
In some cases, you may want to join a model that doesn't have a direct relationship to your model. Let's say that you have three models; events
, campaigns
, and ad_networks
. There is a relation between events
and campaigns
model via events.campaign_id = campaigns.id
and campaigns
to ad_networks
via campaigns.ad_network_id = ad_networks.id
.
Rather than calculating the number of unique users who came from a specific campaign, you want to see different ad networks and see how they perform compared to each other. Since you don't have a direct relationship between the events
and ad_networks
model, you can't easily select the columns and define the relation. Instead, you need to join the campaigns
model to be able to join ad_networks
model. In that case, you can use the following SQL relation from events
model to ad_networks
model.
relations:
campaign:
relationship: one_to_one
type: left_join
to: ref('campaigns')
source: campaign_id
target: id
ad_network:
relationship: many_to_one
type: left_join
to: ref('ad_networks')
sql: '{TARGET}}id = {relation.campaign.dimension.ad_network_id}'
Since you're sql
references the campaign
relation in events
model, the join statements for both campaigns
and ad_networks
model will be included automatically when you select a dimension from ad_network
model.
Updated over 3 years ago