Flatten JSON

General information

Parses columns that contain JSON-formatted data into separate columns.

Description

Brick Locations

Bricks → Data Manipulation → Flatten JSON

Brick Parameters

  • Columns with JSON
    • Columns that contain the JSON-formatted data to parse. Multiple columns can be selected by clicking the + button.
  • Tags to exclude
    • Tags to exclude from the dataset. Multiple tags can be selected by clicking the + button.
      To remove a large number of tags, select the tags to keep instead and set the ‘Remove all except selected’ flag.
      Excluded tags are not parsed into columns; they are collected in the ‘unparsed_tags’ column.
  • Max level
    • If checked, a maximum nesting level is applied. The max level can be any positive integer. Nested dictionaries at levels deeper than the max level stay unparsed.
  • Omit
    • Omit column names
      • If checked, the source column name is left out of the resulting column names.
    • Omit complex names
      • If checked, complex names (all levels separated by dots) are not generated; only the last level is kept.
  • Brick frozen
    • Enables frozen run for the brick.

Brick Inputs/Outputs

  • Inputs
    • The Brick takes a dataset.
  • Outputs
    • The Brick produces the dataset with the JSON columns parsed into separate columns.

Example of usage

Let’s look at how the Flatten JSON Brick is used.
As an example, we take a dataset with JSON objects in the first column, named ‘0’.
To parse those dictionaries into separate columns with single values, we connect the dataset to the Flatten JSON Brick, select column ‘0’ in the ‘Columns with JSON’ setting, and run the pipeline.
As a result, we get a dataset with all the values from the JSON cells.
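The basic flattening step can be sketched in plain Python. This is only an illustration of the idea, not the Brick's actual implementation: the `flatten` helper and the sample cell are hypothetical, and the sample is not the data from the original example. The dotted naming with the leading ‘0.’ follows the column names described in this section.

```python
import json

def flatten(value, prefix):
    """Recursively flatten nested dicts into {dotted_name: value} pairs."""
    if not isinstance(value, dict):
        return {prefix: value}
    flat = {}
    for key, child in value.items():
        flat.update(flatten(child, f"{prefix}.{key}"))
    return flat

# Hypothetical JSON cell taken from column '0' (assumed sample data).
cell = '{"a": 1, "b": {"c": 2, "d": {"e": 3}}, "h": {"i": 4}}'

row = flatten(json.loads(cell), prefix="0")
print(row)  # {'0.a': 1, '0.b.c': 2, '0.b.d.e': 3, '0.h.i': 4}
```

Each nested key becomes its own column, and the path to the value is recorded in the column name.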
We can specify the tags to exclude or keep in the ‘Tags to exclude’ section.
Let’s filter everything except for the tags ‘a’ and ‘h.i’ with the ‘Remove all except selected’ flag.
The resulting dataset has the tags ‘a’ and ‘h.i’ parsed into columns. All the other tags are placed in the ‘unparsed_tags’ column.
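The tag filtering above can be approximated with the following sketch. It rests on several assumptions: tags are matched by their dotted path without the column prefix (as in ‘h.i’), and the ‘unparsed_tags’ cell is modelled as a plain dictionary, since the exact format the Brick uses is not documented here. The helper names and the sample cell are hypothetical.

```python
import json

def flatten(value, prefix=""):
    """Flatten nested dicts into {dotted_tag: value} pairs (no column prefix)."""
    if not isinstance(value, dict):
        return {prefix: value}
    flat = {}
    for key, child in value.items():
        name = f"{prefix}.{key}" if prefix else key
        flat.update(flatten(child, name))
    return flat

def split_tags(flat, selected, keep_only_selected=False):
    """Split flattened tags into parsed columns and leftover unparsed tags."""
    parsed, unparsed = {}, {}
    for tag, value in flat.items():
        is_selected = tag in selected
        # With the flag set, selected tags are kept; otherwise they are excluded.
        keep = is_selected if keep_only_selected else not is_selected
        (parsed if keep else unparsed)[tag] = value
    return parsed, unparsed

# Hypothetical JSON cell from column '0' (assumed sample data).
cell = '{"a": 1, "b": {"c": 2, "d": {"e": 3}}, "h": {"i": 4}}'

parsed, unparsed = split_tags(
    flatten(json.loads(cell)), {"a", "h.i"}, keep_only_selected=True
)
row = {f"0.{tag}": value for tag, value in parsed.items()}
row["unparsed_tags"] = unparsed  # assumed representation of the leftover tags
print(row)
```

Calling `split_tags` with `keep_only_selected=False` inverts the selection, which corresponds to listing the tags to exclude directly.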
To skip the ‘0.’ at the beginning of column names, check ‘Omit column names’.
To generate simple names that leave out the intermediate levels, check ‘Omit complex names’.
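The effect of the two naming checkboxes can be illustrated with a small sketch. The `output_name` helper is hypothetical, and it assumes that the two options act independently, so that ‘Omit complex names’ alone still keeps the column prefix. How the Brick resolves two tags that end in the same last level is not covered here.

```python
def output_name(column, tag, omit_column_name=False, omit_complex_name=False):
    """Build an output column name from a source column and a dotted tag path."""
    if omit_complex_name:
        tag = tag.rsplit(".", 1)[-1]  # keep only the last level of the path
    return tag if omit_column_name else f"{column}.{tag}"

# Hypothetical tag paths found in column '0' (assumed sample data).
tags = ["a", "b.c", "b.d.e", "h.i"]

default_names = [output_name("0", t) for t in tags]
no_column = [output_name("0", t, omit_column_name=True) for t in tags]
simple = [output_name("0", t, omit_complex_name=True) for t in tags]

print(default_names)  # ['0.a', '0.b.c', '0.b.d.e', '0.h.i']
print(no_column)      # ['a', 'b.c', 'b.d.e', 'h.i']
print(simple)         # ['0.a', '0.c', '0.e', '0.i']
```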
If we check ‘Enable max level’ and set the max level to 1, the dictionaries are parsed only down to level 1, and dictionaries nested deeper are left unparsed.
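A depth-limited flatten might look like the sketch below. The level numbering is an assumption: level 1 is taken to mean the top-level keys of the JSON object, so with a max level of 1 any nested dictionary is kept whole as a single value. The Brick may count levels differently. The `flatten` helper and the sample cell are hypothetical.

```python
import json

def flatten(value, prefix, max_level=None, level=1):
    """Flatten nested dicts, keeping dicts deeper than max_level as single values."""
    flat = {}
    for key, child in value.items():
        name = f"{prefix}.{key}"
        within_limit = max_level is None or level < max_level
        if isinstance(child, dict) and within_limit:
            flat.update(flatten(child, name, max_level, level + 1))
        else:
            flat[name] = child  # scalar, or a dict left unparsed
    return flat

# Hypothetical JSON cell from column '0' (assumed sample data).
cell = json.loads('{"a": 1, "b": {"c": 2, "d": {"e": 3}}, "h": {"i": 4}}')

level_1 = flatten(cell, "0", max_level=1)
level_2 = flatten(cell, "0", max_level=2)
print(level_1)  # {'0.a': 1, '0.b': {'c': 2, 'd': {'e': 3}}, '0.h': {'i': 4}}
print(level_2)  # {'0.a': 1, '0.b.c': 2, '0.b.d': {'e': 3}, '0.h.i': 4}
```

Leaving `max_level` unset flattens every level, which matches running the Brick without the max level enabled.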