The brick approximates a selected column with the specified trend type and returns the coefficients of the approximation.
Bricks → Analytics → Local Trend Analysis
List of possible columns for selection. This field is obligatory for any setting selected. The parameter accepts only columns with integer/float data types.
List of possible functions to use for the approximation. This field is obligatory for any setting selected. Users can choose one of the following: linear, exponential, polynomial, logarithmic, or power.
The default value is “linear”.
Integer. The argument appears only when the Polynomial option is chosen for the Trendlines parameter. It sets the degree of the polynomial function, that is, the highest power of the variable that occurs in the polynomial.
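As a rough illustration of what the degree controls, the sketch below fits a polynomial with NumPy's `np.polyfit`. It is not the brick's implementation; the sample data and variable names are made up for the example. A polynomial of degree n has n + 1 coefficients, and `np.polyfit` returns them starting from the highest power.

```python
import numpy as np

# Made-up data that follows an exact quadratic: y = 3x^2 - 2x + 1
x = np.array([0.0, 1.0, 2.0, 3.0, 4.0])
y = 3.0 * x**2 - 2.0 * x + 1.0

degree = 2  # highest power of x in the fitted polynomial

# np.polyfit returns degree + 1 coefficients, highest power first
coefs = np.polyfit(x, y, deg=degree)
```

Here `coefs` holds three values, one each for the terms x^2, x, and the constant.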
- Window size
Integer. Defines the size of the sliding window; the values in each window are then approximated with the chosen trendline. The default value is 5.
Float. Applies an artificial shift to the chosen column: the bias is added to each value in the column. The default value is 0.
Float. Applies an artificial rescaling to the chosen column: each value in the column is multiplied by the scale. The default value is 1.
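A minimal sketch of the bias and scale transformations, using NumPy on made-up values. The description does not state the order in which bias and scale are combined when both are set, so each is shown on its own; the variable names are illustrative only.

```python
import numpy as np

values = np.array([10.0, 12.0, 15.0])  # made-up column values

bias = 2.0   # the brick's default is 0, which leaves the data unshifted
scale = 0.5  # the brick's default is 1, which leaves the data unscaled

shifted = values + bias    # bias is added to every value
rescaled = values * scale  # every value is multiplied by scale
```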
The brick takes a dataset.
The brick produces a dataset with new columns that hold the coefficients of approximation and the R2 score calculated for each point. For example, if we apply the Local Trend Analysis brick to the column “age” with the “linear” trendline, the output dataset will contain the new columns “coef_0”, “coef_1”, and “R2”. “coef_0” is the coefficient of the highest degree.
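To make the output format more concrete, here is a minimal sketch of a sliding-window linear fit written with NumPy and pandas. It is an illustration under stated assumptions, not the brick's code: the function name `local_linear_trend`, the use of a trailing window, the positions 0..window-1 as the x values, and the omission of rows that lack a full window are all choices made for this example. Only the column names “coef_0”, “coef_1”, and “R2” and the ordering (highest degree first) come from the brick description.

```python
import numpy as np
import pandas as pd


def local_linear_trend(values: pd.Series, window: int = 5) -> pd.DataFrame:
    """Fit y = coef_0 * x + coef_1 on each trailing window (hypothetical helper)."""
    x = np.arange(window, dtype=float)  # assumed x values: position inside the window
    index, rows = [], []
    for end in range(window, len(values) + 1):
        y = values.iloc[end - window:end].to_numpy(dtype=float)
        coef_0, coef_1 = np.polyfit(x, y, deg=1)  # slope first, then intercept
        fitted = coef_0 * x + coef_1
        ss_res = np.sum((y - fitted) ** 2)        # residual sum of squares
        ss_tot = np.sum((y - y.mean()) ** 2)      # total sum of squares
        r2 = 1.0 - ss_res / ss_tot if ss_tot > 0 else np.nan
        index.append(values.index[end - 1])       # label the row by the window's last point
        rows.append((coef_0, coef_1, r2))
    return pd.DataFrame(rows, columns=["coef_0", "coef_1", "R2"], index=index)


# Made-up "age" column for illustration
age = pd.Series([20, 22, 24, 26, 28, 31], name="age")
trend = local_linear_trend(age, window=5)
```

Each row of `trend` describes the line fitted to the window that ends at that row, together with the R2 score of that fit.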
Let’s use this brick on the sunflowers dataset. General information about the dataset is given below:
- Month (datetime) - observation month
- Sunspots (float) - count of sunspots observed
We will apply the Local Trend Analysis brick to the column “Sunspots” with a window size equal to the length of the data in the column, and then compare the coefficients with those computed in Excel.
Let’s move on to the brick configuration. Here we will use all trendlines and compare them:
Now let’s compare the results:
For linear trendline:
This gives us the coefficients of the linear equation. Now let’s look at the Excel result, which is the same:
For exponential trendline:
Let’s compare with the Excel results:
For polynomial trendline with power of 2:
The Excel coefficients are shown on the graph:
They are identical.
For logarithmic trendline:
The Excel results are shown on the graph:
The coefficients are the same.
For power trendline:
The results are slightly different. That is because we use non-linear optimization techniques to find the coefficients of approximation.
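The sketch below illustrates why two fitting approaches can disagree for a power trendline y = a * x^b. It compares a straight-line fit on log-transformed data (the linearized approach commonly used for spreadsheet trendlines) with a non-linear least-squares fit via SciPy's `curve_fit`. The choice of `curve_fit`, the synthetic data, and the function names are assumptions made for illustration; the description does not say which optimizer the brick uses.

```python
import numpy as np
from scipy.optimize import curve_fit


def power_law(x, a, b):
    """Power trendline model: y = a * x^b."""
    return a * np.power(x, b)


def fit_power_loglinear(x, y):
    """Fit ln(y) = ln(a) + b * ln(x) with ordinary least squares."""
    b, ln_a = np.polyfit(np.log(x), np.log(y), deg=1)
    return np.exp(ln_a), b


def fit_power_nonlinear(x, y):
    """Minimise the squared error in the original (untransformed) space."""
    (a, b), _ = curve_fit(power_law, x, y, p0=(1.0, 1.0))
    return a, b


# Synthetic data: an exact power law and a noisy copy of it
x = np.arange(1.0, 51.0)
exact = 2.5 * x**0.7
rng = np.random.default_rng(0)
noisy = exact * np.exp(rng.normal(0.0, 0.2, size=x.size))

exact_log = fit_power_loglinear(x, exact)
exact_nl = fit_power_nonlinear(x, exact)
noisy_log = fit_power_loglinear(x, noisy)
noisy_nl = fit_power_nonlinear(x, noisy)
```

On exact power-law data both fits recover the same a and b. On noisy data they generally differ, because the log transform changes how the errors are weighted.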
Let’s look at a practical example: we apply the Local Trend Analysis brick to the default sunflower dataset with a window size of 200 and a polynomial trendline of degree 2 to see how the brick performs:
Then we build a dashboard view for the R2 score, which shows how well the fitted coefficients approximate the data:
A mean R2 score of 0.7 is a good result for such data.