- Modern Big Data Processing with Hadoop
- V. Naresh Kumar Prashant Shindgikar
- 2025-04-04 17:12:20
Variance
This technique suits numeric data types and can also be applied to date/time values.
It takes a statistical approach: we algorithmically vary the input data by +/- X percent. The value of X depends on the analysis being done, and it should be small enough that the overall reading of the business figures is not distorted.
Let's see a few examples:
| Input Data | Output Data | Method | Explanation |
| --- | --- | --- | --- |
| 100 | 110 | Fixed variance | Increase by 10% |
| 100 | 90 | Fixed variance | Decrease by 10% |
| 1-Jan-2000 | 1-Feb-2000 | Fixed variance | Add 1 month |
| 1-Aug-2000 | 1-Jul-2000 | Fixed variance | Reduce by 1 month |
| 100 | 101 | Dynamic variance | 1% to 5% increase or decrease |
| 100 | 105 | Dynamic variance | 1% to 5% increase or decrease |
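
The examples above can be sketched in a few lines of Python. This is only an illustrative sketch, not code from the book: the function names `fixed_variance`, `dynamic_variance`, and `shift_months` are hypothetical, and the choice to draw the dynamic percentage uniformly from the 1% to 5% range with a random sign is an assumption about how "dynamic variance" might be implemented.

```python
import calendar
import random
from datetime import date


def fixed_variance(value, percent):
    """Shift a number by a fixed, signed percentage (hypothetical helper)."""
    return value * (1 + percent / 100)


def dynamic_variance(value, min_percent=1, max_percent=5, rng=None):
    """Shift a number by a random percentage in [min_percent, max_percent].

    Assumption: the percentage is drawn uniformly and the direction
    (increase or decrease) is chosen at random.
    """
    rng = rng if rng is not None else random
    percent = rng.uniform(min_percent, max_percent)
    sign = rng.choice((-1, 1))
    return value * (1 + sign * percent / 100)


def shift_months(d, months):
    """Shift a date by whole months, clamping the day to the month's end."""
    index = d.year * 12 + (d.month - 1) + months
    year, month0 = divmod(index, 12)
    month = month0 + 1
    last_day = calendar.monthrange(year, month)[1]
    return d.replace(year=year, month=month, day=min(d.day, last_day))


print(round(fixed_variance(100, 10), 2))   # 110.0
print(round(fixed_variance(100, -10), 2))  # 90.0
print(shift_months(date(2000, 1, 1), 1))   # 2000-02-01
print(shift_months(date(2000, 8, 1), -1))  # 2000-07-01

# Seeding the generator makes a masking run repeatable; the result lies
# between 95 and 99 or between 101 and 105.
print(round(dynamic_variance(100, rng=random.Random(42)), 2))
```

Fixed variance keeps the relative differences between records intact, while dynamic variance makes it harder to recover the original values; which one fits depends on the analysis the masked data has to support.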