How to remove PII from a CSV file?
Learn why and how to remove sensitive data from .csv, .xlsx, or Google Sheets files
CSV, Excel, and Google Sheets files are used daily across business applications. They are then shared over email, Slack, or support tickets (Zendesk, Intercom, Salesforce, ServiceNow, SAP). These files may contain sensitive PII, PHI, or confidential data that must be handled securely. Let's learn a) why you should remove sensitive data from CSV or Excel files and b) how to do it.
Personally Identifiable Information (PII) refers to any data that could potentially identify a specific individual. This can include data such as name, social security number, date and place of birth, mother's maiden name, or biometric records.
There are several reasons why one might want to remove PII from a CSV (Comma Separated Values) file or XLSX (Microsoft Excel Open XML Spreadsheet):
If you know which columns in an Excel sheet or CSV file hold PII or PHI, deleting those columns is straightforward. Most of the time, however, the schema of the file is unknown, so it is not clear which values are PII or sensitive.
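When the sensitive columns are known in advance, dropping them can be sketched in a few lines of Python using the standard library `csv` module. The `drop_columns` helper and the sample data are illustrative assumptions, not part of any Strac API.

```python
import csv
import io


def drop_columns(csv_text: str, pii_columns: set[str]) -> str:
    """Return csv_text with the named columns removed (hypothetical helper)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Keep the original column order, minus the columns flagged as PII.
    kept = [name for name in (reader.fieldnames or []) if name not in pii_columns]
    out = io.StringIO()
    # extrasaction="ignore" silently skips the dropped keys in each row dict.
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return out.getvalue()


# Made-up sample rows for illustration only.
sample = (
    "Id,Gender,Birthdate,Utm Campaign\n"
    "1,F,1990-04-12,spring\n"
    "2,M,1985-11-03,summer\n"
)
cleaned = drop_columns(sample, {"Gender", "Birthdate"})
print(cleaned)
```

This only helps when the column names are known up front; it does nothing for a file whose schema is unknown.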
This is where Strac's machine-learning technology and its depth of coverage of PII data elements come into the picture. Strac will automatically remove, mask, or redact sensitive data from all types of documents: .pdf, .jpeg, .docx, .csv, and more.
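To make the unknown-schema problem concrete, here is a deliberately naive stand-in that scans every cell against a couple of regular expressions instead of relying on column names. It is not Strac's method, which this article describes as machine-learning based. The pattern set, the `redact_cells` helper, and the `[REDACTED]` placeholder are all assumptions for illustration.

```python
import csv
import io
import re

# Hypothetical, deliberately tiny pattern set; real detectors cover far more.
PATTERNS = {
    "email": re.compile(r"[^@\s]+@[^@\s]+\.[^@\s]+"),
    "ssn": re.compile(r"\d{3}-\d{2}-\d{4}"),
}


def classify(value: str):
    """Return the label of the first pattern matching the whole cell, else None."""
    for label, pattern in PATTERNS.items():
        if pattern.fullmatch(value.strip()):
            return label
    return None


def redact_cells(csv_text: str, placeholder: str = "[REDACTED]") -> str:
    """Replace every data cell that matches a known pattern with a placeholder."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(rows[0])  # the header row is passed through unchanged
    for row in rows[1:]:
        writer.writerow([placeholder if classify(cell) else cell for cell in row])
    return out.getvalue()


# Made-up sample: the "Contact" header gives no hint about what the cells hold.
sample = (
    "Id,Contact,Note\n"
    "1,jane@example.com,renewal\n"
    "2,123-45-6789,call back\n"
)
redacted = redact_cells(sample)
print(redacted)
```

Pattern matching like this misses anything without a rigid format, such as names and street addresses, and cannot tell a birthdate from a policy date, which is why value-level detection needs more context than a regular expression provides.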
Let's take an example. Below is a screenshot of an Excel file that has the columns Id, Gender, Birthdate, Last Name, First Name, Address, Policy Creation Date, Utm Campaign, and Utm Content. We want to remove all PII data automatically.
With Strac's automatic redaction, masking, or removal of data, powered by its proprietary machine-learning technology, the gender, birthdate, last name, first name, and address values are removed automatically. The result looks like this:
The example above removes PII from CSV, Excel, or Google Sheets files. You can also configure Strac to mask or tokenize the data within the file instead. You can learn more about the different masking techniques here: https://www.strac.io/integrations/postgres
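The difference between masking and tokenizing a value can be sketched as follows. This is a minimal illustration under stated assumptions, not Strac's implementation: `mask` and `tokenize` are hypothetical helpers, the `tok_` prefix is invented, and the keyed hash used here is a one-way stand-in, whereas production tokenization often stores a reversible mapping in a secure vault.

```python
import hashlib
import hmac


def mask(value: str, visible: int = 4, fill: str = "*") -> str:
    """Hide all but the last `visible` characters of a value."""
    # Short values are hidden entirely so nothing leaks.
    if visible <= 0 or len(value) <= visible:
        return fill * len(value)
    hidden = len(value) - visible
    return fill * hidden + value[hidden:]


def tokenize(value: str, key: bytes) -> str:
    """Map a value to a stable, opaque token using a keyed hash (HMAC-SHA256)."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()
    # Truncated for readability; the same input and key always give the same token.
    return "tok_" + digest[:16]


key = b"demo-key-do-not-use"  # assumption: a real key would come from a secret store
masked = mask("123-45-6789")
token = tokenize("123-45-6789", key)
print(masked)
print(token)
```

Masking keeps part of the value readable for a human, while a deterministic token hides the value completely but still lets rows that share the same underlying value be joined or counted.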
After (CSV with PII in Columns)
If you have any questions or want to learn how to remove PII from CSV files, whether in an email, a Zendesk ticket, a Slack message, or any SaaS app, or want API access, please book a meeting with us.