@brootware
Last active August 24, 2022 06:00

You can use gtredactkit as an API. To do so:

from gtredactkit import runner
data = """this is my IP: 102.23.5.1
My router is : 10.10.10.1
71.159.188.33
81.141.167.45
165.65.59.139
64.248.67.225
https://tech.gov.sg
My email is harold@mail.com
this is my IP: 102.23.5.1
My router is: 10.10.10.1
71.159.188.33
81.141.167.45
165.65.59.139
64.248.67.225
Base64 data
QVBJX1RPS0VO
UzNjcjN0UGFzc3dvcmQ=
U3VwM3JTM2NyZXRQQHNzd29yZA==
Singapore NRIC
G0022121F
F2121200F
G1021022E
S1022221L
G1222221C
S0000212Q
F2120212E
S0021001P
"""
sensitive_data_list = runner.api_identify_sensitive_data(data)
print(sensitive_data_list)

""" ['102.23.5.1', '10.10.10.1', '71.159.188.33', '81.141.167.45', '165.65.59.139', '64.248.67.225', 'https://tech.gov.sg', 'harold@mail.com', 'mail.com', '102.23.5.1', '10.10.10.1', '71.159.188.33', '81.141.167.45', '165.65.59.139', '64.248.67.225', 'QVBJX1RPS0VO', 'UzNjcjN0UGFzc3dvcmQ=', 'U3VwM3JTM2NyZXRQQHNzd29yZA==', 'G0022121F', 'F2121200F', 'G1021022E', 'S1022221L', 'G1222221C', 'S0000212Q', 'F2120212E', 'S0021001P'] """

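The list printed above contains repeated entries, because the sample text repeats the same addresses. If you only need each value once, a minimal sketch such as the one below removes duplicates while keeping first-seen order. The helper name unique_in_order is hypothetical and not part of gtredactkit, and the short list stands in for the value returned by runner.api_identify_sensitive_data.

```python
def unique_in_order(items):
    """Drop repeated entries while keeping the order in which they first appear."""
    # dict keys are unique and preserve insertion order (Python 3.7+).
    return list(dict.fromkeys(items))


# A shortened copy of the list printed above, standing in for the API's return value.
found = [
    "102.23.5.1",
    "10.10.10.1",
    "harold@mail.com",
    "mail.com",
    "102.23.5.1",
    "10.10.10.1",
]

unique_values = unique_in_order(found)
print(unique_values)
```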
Basic usage

You can redact a text string from a PowerShell terminal as shown below.

redactor 'this is my ip:127.0.0.1. my email is broot@outlook.com. secret link is github.com'

This creates a redacted file and a hashshadow file. The hashshadow file maps each token to its original value, so you can use it later to unredact.

{
  "5f7aa522-86e5-4ca7-83ae-09fbb5a1044b": "broot@outlook.com",
  "983b017a-98a5-4763-aa6d-a8ad69db20bc": "github.com",
  "a9581c73-05cb-428e-8c62-8bf1521a8aa1": "127.0.0.1"
}
this is my ip:a9581c73-05cb-428e-8c62-8bf1521a8aa1. my email is 5f7aa522-86e5-4ca7-83ae-09fbb5a1044b. secret link is 983b017a-98a5-4763-aa6d-a8ad69db20bc
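The two outputs above suggest a simple tokenization scheme: each sensitive value is swapped for a UUID, and the token-to-value pairs are kept in a JSON mapping. The sketch below illustrates that idea only; it is not the tool's actual implementation, and the function name redact and its signature are assumptions made for illustration.

```python
import json
import uuid


def redact(text, sensitive_values):
    """Replace each sensitive value with a fresh UUID token.

    Returns the redacted text and a token -> original value mapping.
    """
    mapping = {}
    # Replace longer values first so a value containing a shorter one is not split.
    for value in sorted(set(sensitive_values), key=len, reverse=True):
        token = str(uuid.uuid4())
        mapping[token] = value
        text = text.replace(value, token)
    return text, mapping


text = "this is my ip:127.0.0.1. my email is broot@outlook.com. secret link is github.com"
redacted, shadow = redact(text, ["127.0.0.1", "broot@outlook.com", "github.com"])

print(redacted)
print(json.dumps(shadow, indent=2, sort_keys=True))
```

Because the tokens are random, the output differs on every run; only the mapping lets you get back to the original text.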

To redact a single file from the terminal:

redactor test.txt

To unredact a redacted file:

redactor redacted_test.txt -u .hashshadow_test.txt.json

This will create an unredacted file which contains the original unmasked data.

this is my ip:127.0.0.1. my email is broot@outlook.com. secret link is github.com
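Conceptually, unredaction is a lookup in the hashshadow mapping. The sketch below reverses the example from the Basic usage section using the mapping shown there. It is an illustration under that assumption rather than the tool's own code, and the function name unredact is hypothetical.

```python
import json


def unredact(redacted_text, shadow):
    """Swap every token found in the hashshadow mapping back to its original value."""
    for token, original in shadow.items():
        redacted_text = redacted_text.replace(token, original)
    return redacted_text


# Mapping and redacted text copied from the Basic usage example.
shadow = json.loads("""
{
  "5f7aa522-86e5-4ca7-83ae-09fbb5a1044b": "broot@outlook.com",
  "983b017a-98a5-4763-aa6d-a8ad69db20bc": "github.com",
  "a9581c73-05cb-428e-8c62-8bf1521a8aa1": "127.0.0.1"
}
""")
redacted = (
    "this is my ip:a9581c73-05cb-428e-8c62-8bf1521a8aa1. "
    "my email is 5f7aa522-86e5-4ca7-83ae-09fbb5a1044b. "
    "secret link is 983b017a-98a5-4763-aa6d-a8ad69db20bc"
)

restored = unredact(redacted, shadow)
print(restored)
```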

Advanced usage

You can redact multiple files in a folder, including its subdirectories, and write the output to a newly created directory.

Consider the folder below, which contains multiple log files and a subdirectory.

tree foldertoredact 
foldertoredact
├── cctest.txt
├── ip_test 3.txt
├── ip_test 4.txt
├── ip_test.txt
├── nric.txt
└── subdir
    └── ip_test copy.txt

1 directory, 6 files

To redact all of the log files in the folder and place them in a new folder:

redactor foldertoredact -d newfolder
tree newfolder
newfolder
├── redacted_cctest.txt
├── redacted_ip_test 3.txt
├── redacted_ip_test 4.txt
├── redacted_ip_test copy.txt
├── redacted_ip_test.txt
└── redacted_nric.txt

0 directories, 6 files
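In the listing above, files from subdir end up directly in newfolder with a redacted_ prefix. The sketch below shows one way such a flat output plan could be computed. It is an assumption based only on the listing, not the tool's implementation, and plan_redacted_paths is a hypothetical helper. The demo builds a small temporary tree so that it runs anywhere.

```python
import tempfile
from pathlib import Path


def plan_redacted_paths(source_dir, output_dir):
    """Map every file under source_dir to a flat 'redacted_<name>' path in output_dir."""
    source, output = Path(source_dir), Path(output_dir)
    plan = {}
    for path in sorted(source.rglob("*")):
        if path.is_file():
            # Subdirectories are flattened: only the file name is kept.
            plan[path] = output / f"redacted_{path.name}"
    return plan


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "foldertoredact"
    (root / "subdir").mkdir(parents=True)
    (root / "ip_test.txt").write_text("10.10.10.1\n")
    (root / "subdir" / "ip_test copy.txt").write_text("10.10.10.1\n")

    plan = plan_redacted_paths(root, Path(tmp) / "newfolder")
    planned_names = sorted(dst.name for dst in plan.values())
    print(planned_names)
```

Note that with a flat layout like this, two files with the same name in different subdirectories would map to the same output path, so a real implementation would need to handle that case.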

Besides the core regex patterns for SG NRIC, domain names, emails, IP addresses, base64 strings and credit cards, you can also define custom regex patterns to redact additional data from your log files.

To redact using custom regex patterns, create a custom JSON file in the format below and pass it with the -c option.

redactor file -c customregex.json
[
    {
        "pattern": "^([a-zA-Z0-9_-]*:[a-zA-Z0-9_-]+@github.com*)$",
        "type": [
            "API Keys",
            "Credentials",
            "Bug Bounty",
            "GitHub"
        ]
    },
    {
        "pattern": "(?i)^(arn:(?P<Partition>[^:\\n]*):(?P<Service>[^:\\n]*):(?P<Region>[^:\\n]*):(?P<AccountID>[^:\\n]*):(?P<Ignore>(?P<ResourceType>[^:\\/\\n]*)[:\\/])?(?P<Resource>.*))$",
        "type": [
            "Identifiers",
            "Networking",
            "AWS",
            "Bug Bounty"
        ]
    },
    {
        "pattern": "(?i)^((facebook|fb)(.{0,20})?['\\\"][0-9a-f]{32}['\\\"])$",
        "type": [
            "API Keys",
            "Bug Bounty",
            "Credentials",
            "Facebook"
        ]
    }
]
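To check a custom pattern file before passing it to the tool, you can load it with Python's standard json and re modules. The sketch below uses two of the entries above. Matching one line at a time is an assumption made here because the patterns are anchored with ^ and $; how the tool itself applies them is not stated. The helper names load_patterns and find_matches are hypothetical.

```python
import json
import re

# Two entries in the same format as customregex.json (raw string keeps the JSON escapes intact).
CUSTOM_JSON = r"""
[
    {
        "pattern": "^([a-zA-Z0-9_-]*:[a-zA-Z0-9_-]+@github.com*)$",
        "type": ["API Keys", "Credentials", "Bug Bounty", "GitHub"]
    },
    {
        "pattern": "(?i)^(arn:(?P<Partition>[^:\\n]*):(?P<Service>[^:\\n]*):(?P<Region>[^:\\n]*):(?P<AccountID>[^:\\n]*):(?P<Ignore>(?P<ResourceType>[^:\\/\\n]*)[:\\/])?(?P<Resource>.*))$",
        "type": ["Identifiers", "Networking", "AWS", "Bug Bounty"]
    }
]
"""


def load_patterns(raw_json):
    """Compile each entry's "pattern" and keep its "type" labels alongside."""
    return [(re.compile(entry["pattern"]), entry["type"]) for entry in json.loads(raw_json)]


def find_matches(text, patterns):
    """Return (line, labels) for every line matched by one of the custom patterns."""
    hits = []
    for line in text.splitlines():
        candidate = line.strip()
        for regex, labels in patterns:
            if regex.search(candidate):
                hits.append((candidate, labels))
    return hits


patterns = load_patterns(CUSTOM_JSON)
log = "user:s3cr3t@github.com\nnothing to see here\narn:aws:s3:::my-bucket"
hits = find_matches(log, patterns)
for line, labels in hits:
    print(line, "->", labels)
```

A pattern that fails to compile raises re.error in load_patterns, which makes typos in the JSON file easy to spot before a redaction run.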

Welcome to the GtRedactKit documentation page! To have this tool installed on your GSIB, please ask your AFM to push it down via Software Center. We are currently in the process of listing this software in the WOG App Library for self-service download and ratings.

What is it about?

GovTech GIG, as the Infrastructure Engineer Capability Center, provides functional leadership to WOG. As part of this initiative, we will publish tools that agencies can leverage for their tech operations work.

Scenario:

When we seek support from product principals, we sometimes need to send them logs containing sensitive internal IP addresses, URLs, email addresses, SOE IDs and so on. Engineers then have to eyeball and redact such data manually, which is time-consuming and error-prone. This tool lets engineers automate the process, saving time and reducing both operational overhead and errors.

Why use a tool?

The goal is to redact sensitive data such as internal IP addresses, emails, domain names, hostnames and SOE-IDs before sending logs to product principals for troubleshooting. Sure, you can use sed and grep to redact sensitive data, but then the original data is lost. The redactor CLI tokenizes the sensitive data so that it can be un-redacted later if you need to dive deeper into certain parts of the log file during troubleshooting.

How does it work?

GtRedactKit is a Python-based command-line tool that helps you automate the redaction of sensitive data from log files. The tool can be used on GSIB. Engineers can redact and un-redact sensitive log data with it. The core redaction engine handles the data types listed below and can be extended to other types through user-defined regular expressions.

It redacts and unredacts the following from your text files: 📄 ✍️

  • SG NRIC 🆔
  • credit cards 🏧
  • URLs 🌐
  • emails ✉️
  • IPv4 📟
  • IPv6 📟
  • base64 🅱️
  • SOE-ID 🆔
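As a rough illustration of how regex-based identification of a few of these data types can work, the sketch below uses deliberately simplified patterns for IPv4 addresses, emails and the SG NRIC format. These patterns are assumptions written for this example and are not the tool's core patterns. The NRIC pattern only checks the shape (prefix letter, seven digits, suffix letter) and does not validate the checksum.

```python
import re

# Simplified, illustrative patterns; the tool's own patterns may differ.
PATTERNS = {
    "ipv4": re.compile(
        r"\b(?:(?:25[0-5]|2[0-4]\d|1?\d?\d)\.){3}(?:25[0-5]|2[0-4]\d|1?\d?\d)\b"
    ),
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "sg_nric": re.compile(r"\b[STFG]\d{7}[A-Z]\b"),
}


def identify(text):
    """Return every match per data type as {type_name: [matches]}."""
    return {name: regex.findall(text) for name, regex in PATTERNS.items()}


sample = "this is my ip:127.0.0.1. my email is broot@outlook.com. NRIC G0022121F"
found = identify(sample)
print(found)
```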

Benefits

Saves manual labor, which is time-consuming and error-prone.

More information on usage of the tool below.

API Usage
