
Data output shape of a dynamic block
A dynamic block must return a list of 2 lists of dictionaries (e.g. List[List[Dict]]).
For example, a data loader block returns the following output:
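A minimal sketch of such an output, written as a plain Python function so that it runs outside of a pipeline. The function name load_users is hypothetical, the block decorator a real data loader would carry is omitted, and the user and block_uuid values mirror the examples used later in this section. It also assumes the metadata list lines up one-to-one with the data list.

```python
from typing import Dict, List


def load_users() -> List[List[Dict]]:
    # 1st list: the data handed to the dynamically created blocks.
    users = [
        {"id": 100, "name": "user_1"},
        {"id": 200, "name": "user_2"},
        {"id": 300, "name": "user_3"},
    ]
    # 2nd list: metadata (assumed here to be one dictionary per data item).
    metadata = [
        {"block_uuid": "for_user_1"},
        {"block_uuid": "for_user_2"},
        {"block_uuid": "for_user_3"},
    ]
    return [users, metadata]


output = load_users()
print(output[0])  # data for downstream blocks
print(output[1])  # metadata for downstream blocks
```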
Input data for downstream blocks
The 1st item in the list is a list of dictionaries. These dictionaries contain the data that is passed down to all downstream blocks of the dynamic block.
Note: The transformer block is returning a list as its output data. The items in that list are used as positional arguments in any downstream block of this transformer block.
Metadata for downstream blocks (optional)
The 2nd item in the list is a list of dictionaries. These dictionaries contain metadata for each downstream block.

Key | Description | Required |
---|---|---|
block_uuid | This value is used in combination with the downstream block’s original UUID to construct a unique UUID across all dynamic blocks. This value must be unique within the same list of metadata dictionaries. | Yes |

For example, if the downstream block's original UUID is anonymize_user_data and the metadata for the 1st dynamically created block is { "block_uuid": "for_user_1" }, then the dynamically created block's UUID is anonymize_user_data:for_user_1.
The convention is [original_block_uuid]:[metadata_block_uuid].
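The naming convention can be sketched as a small helper. The function dynamic_block_uuid is hypothetical and only illustrates the string format described above; it is not part of any library API.

```python
def dynamic_block_uuid(original_block_uuid: str, metadata: dict) -> str:
    # Join the downstream block's original UUID and the metadata's
    # block_uuid with a colon, following the convention above.
    return f"{original_block_uuid}:{metadata['block_uuid']}"


uuid = dynamic_block_uuid("anonymize_user_data", {"block_uuid": "for_user_1"})
print(uuid)
```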
Dynamically created blocks
Every downstream block of a dynamic block is referred to as a dynamically created block. The number of blocks created is determined by the output data that the dynamic block returns. For example, if a dynamic block returns data for 3 users with the metadata block UUIDs for_user_1, for_user_2, and for_user_3, and it has a downstream transformer block with the UUID anonymize_user_data, then the following 3 blocks are created:

Dynamic block UUID | Return output data |
---|---|
anonymize_user_data:for_user_1 | [{ "id": 100, "name": "user_1" }] |
anonymize_user_data:for_user_2 | [{ "id": 200, "name": "user_2" }] |
anonymize_user_data:for_user_3 | [{ "id": 300, "name": "user_3" }] |
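
The table above can be reproduced with a small simulation. This is only a sketch: create_dynamic_blocks is a hypothetical helper that mimics the expansion in plain Python, and the body of anonymize_user_data is a placeholder that returns its input wrapped in a list, since the actual transformer code is not shown here.

```python
from typing import Callable, Dict, List


def create_dynamic_blocks(
    output: List[List[Dict]],
    block_uuid: str,
    block_fn: Callable[[Dict], List[Dict]],
) -> Dict[str, List[Dict]]:
    # Split the dynamic block's output into data and metadata.
    data, metadata = output
    results = {}
    # One block is created per data item; its UUID follows the
    # [original_block_uuid]:[metadata_block_uuid] convention.
    for item, meta in zip(data, metadata):
        results[f"{block_uuid}:{meta['block_uuid']}"] = block_fn(item)
    return results


def anonymize_user_data(user: Dict) -> List[Dict]:
    # Placeholder transformer (assumption): returns the user unchanged.
    return [user]


dynamic_output = [
    [
        {"id": 100, "name": "user_1"},
        {"id": 200, "name": "user_2"},
        {"id": 300, "name": "user_3"},
    ],
    [
        {"block_uuid": "for_user_1"},
        {"block_uuid": "for_user_2"},
        {"block_uuid": "for_user_3"},
    ],
]

created = create_dynamic_blocks(
    dynamic_output, "anonymize_user_data", anonymize_user_data
)
for uuid, result in created.items():
    print(uuid, result)
```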
Downstream blocks of dynamically created blocks
If a dynamically created block has downstream blocks, those downstream blocks are also created multiple times. The number of times each one is created corresponds to how many blocks were created dynamically from the dynamic block. In the current example, 3 blocks were created dynamically (3 users, 1 downstream block from the dynamic block). If the above transformer block has 2 downstream blocks, then 6 more blocks will be created. For example, if the transformer block anonymize_user_data has 2 downstream blocks with the UUIDs clean_column_names and compute_engagement_score, the following blocks will also be dynamically created:
clean_column_names:for_user_1
clean_column_names:for_user_2
clean_column_names:for_user_3
compute_engagement_score:for_user_1
compute_engagement_score:for_user_2
compute_engagement_score:for_user_3
The input data for each of these downstream blocks (e.g. clean_column_names, compute_engagement_score) will come from its upstream block (e.g. anonymize_user_data:for_user_1, anonymize_user_data:for_user_2, anonymize_user_data:for_user_3).
For example, the blocks clean_column_names:for_user_1 and compute_engagement_score:for_user_1 will have input data with the following shape and value:
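A minimal sketch of this fan-out, assuming each downstream block receives the output of its upstream dynamically created block unchanged. The helper fan_out is hypothetical and only illustrates the naming and data flow described above.

```python
from typing import Dict, List


def fan_out(
    upstream_outputs: Dict[str, List[Dict]],
    downstream_uuids: List[str],
) -> Dict[str, List[Dict]]:
    inputs = {}
    for upstream_uuid, output in upstream_outputs.items():
        # Reuse the part after the colon, e.g. "for_user_1".
        suffix = upstream_uuid.split(":", 1)[1]
        for downstream_uuid in downstream_uuids:
            # Each downstream block is created once per upstream block.
            inputs[f"{downstream_uuid}:{suffix}"] = output
    return inputs


upstream_outputs = {
    "anonymize_user_data:for_user_1": [{"id": 100, "name": "user_1"}],
    "anonymize_user_data:for_user_2": [{"id": 200, "name": "user_2"}],
    "anonymize_user_data:for_user_3": [{"id": 300, "name": "user_3"}],
}

inputs = fan_out(
    upstream_outputs, ["clean_column_names", "compute_engagement_score"]
)
print(len(inputs))
print(inputs["clean_column_names:for_user_1"])
print(inputs["compute_engagement_score:for_user_1"])
```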
Reduce output
By default, dynamically created blocks will create more dynamically created blocks from their own downstream blocks. A dynamically created block will pass its return output data to its downstream blocks and so on. However, a dynamically created block can be configured to reduce all the outputs across each dynamically created block and combine them into a single list of items. If we configure the above transformer block anonymize_user_data to reduce its output, then there will only be 2 downstream blocks instead of 6. The 2 downstream blocks will be clean_column_names and compute_engagement_score.
The input data to each of these 2 downstream blocks will be:
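A hedged sketch of what reducing could look like, using the outputs from the table of dynamically created blocks. The helper reduce_outputs is hypothetical, and flattening the per-block lists into one list is an assumption based on the phrase "a single list of items"; the actual runtime may shape the combined value differently.

```python
from typing import Dict, List


def reduce_outputs(child_outputs: Dict[str, List[Dict]]) -> List[Dict]:
    combined: List[Dict] = []
    # Concatenate every dynamically created block's output, in order.
    for output in child_outputs.values():
        combined.extend(output)
    return combined


child_outputs = {
    "anonymize_user_data:for_user_1": [{"id": 100, "name": "user_1"}],
    "anonymize_user_data:for_user_2": [{"id": 200, "name": "user_2"}],
    "anonymize_user_data:for_user_3": [{"id": 300, "name": "user_3"}],
}

reduced = reduce_outputs(child_outputs)
# Both clean_column_names and compute_engagement_score would receive
# this same combined list under the assumption stated above.
print(reduced)
```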
Dynamic SQL blocks
SQL blocks can be dynamic or be a child of a dynamic block. Here is an example pipeline with code:
data_loader.py: dynamic block
dynamic_sql_loader: dynamic block, dynamic child
transformer: dynamic child