General information
Parses columns that contain JSON-formatted data into separate columns.
Description
Brick Locations
Bricks → Data Manipulation → Flatten JSON
Brick Parameters
- Columns with JSON
Columns that contain the JSON-formatted data to parse. Multiple columns can be selected by clicking the + button.
- Tags to exclude
Tags to exclude from the dataset. Multiple tags can be selected by clicking the + button.
To remove a large number of tags, select the tags to keep instead and enable the ‘Remove all except selected’ flag.
Excluded tags are not parsed into columns; they are collected in the ‘unparsed_tags’ column.
- Max level
If checked, the parsing depth is limited. The max level can be set to any positive integer. Nested dictionaries at levels deeper than the max level stay unparsed.
- Omit
- Omit column names
A checkbox that allows you to skip the initial column name in the resulting column names.
- Omit complex names
A checkbox that allows you not to generate complex names (all levels separated by dots) and to keep only the last level.
- Brick frozen
Enables frozen run for the brick.
Brick Inputs/Outputs
- Inputs
The brick takes a dataset.
- Outputs
The brick produces the dataset with the JSON columns parsed.
Example of usage
Let’s have a look at the usage of the Flatten JSON Brick.
As an example, we can take this dataset with JSON objects in the first column.
To parse those dictionaries into separate columns with single values, we can connect the dataset to the Flatten JSON Brick, select column ‘0’ in ‘Columns with JSON’ settings and run the pipeline.
As a result, we get a dataset with all the values from the JSON cells.
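The flattening step can be illustrated with a minimal Python sketch. It is not the brick's implementation; the `flatten` helper and the sample cell are hypothetical, and the sketch only assumes that nested keys are joined with dots and prefixed with the source column name (‘0’ here).

```python
import json

def flatten(obj, prefix):
    """Recursively flatten nested dicts into {dotted_name: value}."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}"
        if isinstance(value, dict):
            # Descend into nested dictionaries, extending the dotted name.
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat

# Hypothetical JSON cell taken from column '0'.
cell = '{"a": 1, "b": {"c": 2}, "h": {"i": 3}}'
row = flatten(json.loads(cell), prefix="0")
print(row)  # {'0.a': 1, '0.b.c': 2, '0.h.i': 3}
```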
We can specify the tags to exclude or keep in the ‘Tags to exclude’ section.
Let’s filter out everything except the tags ‘a’ and ‘h.i’ with the ‘Remove all except selected’ flag.
We get the dataset with the tags ‘a’ and ‘h.i’ parsed. All the other tags are located in the ‘unparsed_tags’ column.
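A rough sketch of the ‘Remove all except selected’ behaviour, under stated assumptions: the helper names and sample cell are hypothetical, and the way leftovers are stored in ‘unparsed_tags’ (a JSON string keyed by dotted tag name) is a guess for illustration, not the brick's documented format.

```python
import json

def flatten(obj, prefix=""):
    """Flatten nested dicts into {dotted_tag: value}."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat

def keep_selected(cell, column, keep):
    """Parse only the tags in `keep`; collect the rest in 'unparsed_tags'."""
    flat = flatten(json.loads(cell))
    parsed = {f"{column}.{tag}": v for tag, v in flat.items() if tag in keep}
    leftovers = {tag: v for tag, v in flat.items() if tag not in keep}
    # Assumption: leftovers are serialized back to a JSON string.
    parsed["unparsed_tags"] = json.dumps(leftovers)
    return parsed

cell = '{"a": 1, "b": {"c": 2}, "h": {"i": 3}}'  # hypothetical sample
result = keep_selected(cell, column="0", keep={"a", "h.i"})
print(result)  # {'0.a': 1, '0.h.i': 3, 'unparsed_tags': '{"b.c": 2}'}
```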
To skip the ‘0.’ at the beginning of column names, check ‘Omit column names’.
To generate simple names without the intermediate levels, check ‘Omit complex names’.
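The effect of the two naming options can be sketched as a small function. The function `column_name` is hypothetical, and the sketch assumes that ‘Omit complex names’ keeps only the last level with no column prefix; how the brick combines the two flags should be checked against its actual output.

```python
def column_name(column, path, omit_column_name=False, omit_complex_names=False):
    """Build the output column name for a tag found at `path` inside `column`."""
    if omit_complex_names:
        # Assumption: only the last level survives, e.g. 'i' for ['h', 'i'].
        return path[-1]
    parts = list(path) if omit_column_name else [column, *path]
    return ".".join(parts)

path = ["h", "i"]
default_name = column_name("0", path)                          # '0.h.i'
no_column = column_name("0", path, omit_column_name=True)      # 'h.i'
last_level = column_name("0", path, omit_complex_names=True)   # 'i'
print(default_name, no_column, last_level)
```

In this sketch, keeping only the last level maps tags such as ‘b.i’ and ‘h.i’ to the same name ‘i’, so short names are best suited to JSON whose leaf keys are unique.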
If we check ‘Enable max level’ and set the max level to 1, the dictionaries are parsed down to level 1, and any dictionaries nested deeper are left unparsed.
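A depth limit of this kind can be sketched as follows. The `flatten` helper and sample data are hypothetical, and the level convention (top-level keys are level 0, and a max level of 1 expands one step of nesting) is an assumption rather than the brick's confirmed behaviour.

```python
def flatten(obj, prefix, max_level=None, level=0):
    """Flatten nested dicts, leaving dicts below `max_level` as they are."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}"
        can_descend = max_level is None or level < max_level
        if isinstance(value, dict) and can_descend:
            flat.update(flatten(value, name, max_level, level + 1))
        else:
            # Scalars, and dicts past the depth limit, are stored unchanged.
            flat[name] = value
    return flat

cell = {"a": 1, "h": {"i": 3, "j": {"k": 4}}}  # hypothetical sample
limited = flatten(cell, "0", max_level=1)
full = flatten(cell, "0")
print(limited)  # {'0.a': 1, '0.h.i': 3, '0.h.j': {'k': 4}}
print(full)     # {'0.a': 1, '0.h.i': 3, '0.h.j.k': 4}
```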