Erasing

As the name suggests, erasing causes data loss when applied to the input data. Whether we apply it depends on the significance of the data we are dealing with. A typical example of this technique is setting a NULL value for every record in a column. Since NULL data cannot be used to infer anything meaningful, this technique helps ensure that confidential data is not sent to the later phases of data processing.
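As a rough illustration of setting a NULL value for every record in a column, here is a minimal Python sketch. It assumes records are held as dictionaries and uses `None` to stand in for NULL. The function name `erase_columns`, the field names, and the second sample record are hypothetical and not part of the original text.

```python
from typing import Any, Dict, Iterable, List

Record = Dict[str, Any]


def erase_columns(records: Iterable[Record], columns: Iterable[str]) -> List[Record]:
    """Return copies of the records with the given columns set to None (NULL)."""
    to_erase = set(columns)
    return [
        # Keep every key so the record shape is unchanged; only the value is erased.
        {key: (None if key in to_erase else value) for key, value in record.items()}
        for record in records
    ]


# Hypothetical sample data, used only for illustration.
records = [
    {"name": "Ravi", "salary_inr": 1000, "mobile": "0123456789"},
    {"name": "Asha", "salary_inr": 2000, "mobile": "0987654321"},
]

erased = erase_columns(records, ["name", "mobile"])
print(erased[0])  # {'name': None, 'salary_inr': 1000, 'mobile': None}
```

The sketch returns new dictionaries rather than modifying the input, so the original records stay intact until the caller decides to discard them.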

Let's take a few examples of erasing:

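Input Data                       | Output Data                  | What's erased
---------------------------------|------------------------------|-----------------------
NULL earns 1000 INR per month    | Ravi earns NULL per month    | Salary and name
NULL mobile number is 0123456789 | Ravi's mobile number is NULL | Mobile number and name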

Looking at these examples, you might wonder why we nullify these values. This technique is useful when we are not interested in the PII itself, only in a summary such as how many salary records or mobile number records exist in our database or input.
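To make the summary use case concrete, the following sketch counts how many records still carry a value in each column after names have been erased. It assumes the same dictionary-based records with `None` as NULL. The helper `count_present` and the sample data are hypothetical.

```python
from typing import Any, Dict, List

Record = Dict[str, Any]


def count_present(records: List[Record], column: str) -> int:
    """Count the records whose value for the column is not None (NULL)."""
    return sum(1 for record in records if record.get(column) is not None)


# Hypothetical records in which the name column has already been erased.
erased_records = [
    {"name": None, "salary_inr": 1000, "mobile": "0123456789"},
    {"name": None, "salary_inr": 2000, "mobile": None},
    {"name": None, "salary_inr": None, "mobile": "0987654321"},
]

summary = {
    column: count_present(erased_records, column)
    for column in ("name", "salary_inr", "mobile")
}
print(summary)  # {'name': 0, 'salary_inr': 2, 'mobile': 2}
```

The counts for salary and mobile number remain available even though no record can be tied back to a person by name.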

This concept can be extended to other use cases as well.