Jeff Vestal jeffvestal

  • Chicago
from faker import Faker
import random
import datetime
# Create an instance of the Faker class
faker = Faker()
# Define the number of logs
num_logs = 1000000

Main Components

  • NER model for Location/Org/People identification
  • Regex patterns in a script processor to identify PII with standard structures.
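
To illustrate the regex component, here is a minimal Python sketch of pattern-based redaction. It is not the gist's script processor: the pattern set, the `PII_PATTERNS` name, and the `redact` helper are assumptions made for illustration, and the patterns only cover a few US-style formats.

```python
import re

# Hypothetical patterns for illustration only; the actual pipeline's
# patterns are not shown in this excerpt.
PII_PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def redact(text: str) -> str:
    """Replace each match of a known PII pattern with a <LABEL> placeholder."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"<{label}>", text)
    return text


if __name__ == "__main__":
    print(redact("SSN 123-45-6789, mail jane@example.com, call 312-555-0100"))
```

Regexes work well for PII with a fixed shape (numbers, emails), which is why free-form entities such as names and places are left to the NER model.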

Ingest Pipeline

PUT _ingest/pipeline/pii_script-redact

Click Here for Example Jupyter Notebooks

Short Link to this gist - ela.st/operationalize-nlp

NER - Named Entity Recognition

NER models can be used in two ways in Elasticsearch:

  1. On ingest, new documents can be run through an ingest pipeline, which sends one of the fields through the model; the resulting entity fields are indexed along with the original document. This is useful later when you want to filter quickly on an entity or run aggregations.
  2. Strings can be submitted directly to a model using the _infer endpoint. The model processes the string and the response includes any identified entities. The output is the same as in #1, but the call is ad hoc and the results are not stored.
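
The two usage modes above can be sketched as request bodies built in Python. This is a rough sketch under stated assumptions, not the gist's own configuration: `my-ner-model` is a hypothetical model ID, `message` and `ml.ner` are hypothetical field names, `text_field` is assumed to be the model's input field, and the exact `_infer` path has varied between Elasticsearch versions, so check the documentation for your release.

```python
import json

MODEL_ID = "my-ner-model"  # hypothetical model ID, not taken from the gist


def ner_pipeline(model_id: str, source_field: str = "message") -> dict:
    """Body for PUT _ingest/pipeline/<name>: run NER on each incoming document."""
    return {
        "description": "Run NER on ingest (illustrative)",
        "processors": [
            {
                "inference": {
                    "model_id": model_id,
                    # Assumed target field for the model output.
                    "target_field": "ml.ner",
                    # Map the document field to the model's assumed input field.
                    "field_map": {source_field: "text_field"},
                }
            }
        ],
    }


def infer_request(model_id: str, text: str) -> tuple[str, dict]:
    """Path and body for an ad hoc inference call; results are not stored."""
    path = f"/_ml/trained_models/{model_id}/_infer"  # path is version-dependent
    body = {"docs": [{"text_field": text}]}
    return path, body


if __name__ == "__main__":
    print(json.dumps(ner_pipeline(MODEL_ID), indent=2))
    path, body = infer_request(MODEL_ID, "Jeff lives in Chicago.")
    print("POST", path)
    print(json.dumps(body, indent=2))
```

The functions only build the JSON payloads; sending them (for example from Dev Tools or an HTTP client) is left out so the sketch runs without a cluster.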
@jeffvestal
jeffvestal / indexing derivative (rate)
Last active December 6, 2022 21:05
Agg to calculate the rate of ingest
will update shortly
@jeffvestal
jeffvestal / -elastic-nlp-load-huggingface-model-with-zero-shot-example.ipynb
Last active September 7, 2022 15:45
Elastic - NLP - Load HuggingFace Model with Zero Shot Example
@jeffvestal
jeffvestal / elastic-classification.churn-transform-demo
Last active November 5, 2020 21:29
Dev Tools Data from Elastic Classification Churn Demo
# Preview and create transform
# PUT _transform/churn
POST _transform/_preview
{
"source": {
"index": [
"churn_demo-calls",
"churn_demo-customers"
]
},