@Wapiti08
Wapiti08 / custom_biluo_label.py
Last active January 27, 2021 16:28
Given a CSV with token, label as the columns, output a CSV with token, BILUO label as the columns
'''
Examples in original_csv:
APT,Sharpshooter
APT,Sandworm Team
APT,Blue Mockingbird
APT,Playful Dragon
techniques,Compromise Software Supply Chain
techniques,Supply Chain Compromise
...
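A minimal sketch of the conversion this gist describes, reading the label,phrase rows shown above and emitting one BILUO-tagged token per row (the file names and the exact tag scheme of the original script are assumptions):

```python
import csv

def phrase_to_biluo(label, phrase):
    """Tag each token of a phrase with BILUO: U- for single tokens,
    B-/I-/L- for the begin/inside/last tokens of multi-word phrases."""
    tokens = phrase.split()
    if len(tokens) == 1:
        return [(tokens[0], f"U-{label}")]
    tagged = [(tokens[0], f"B-{label}")]
    tagged += [(tok, f"I-{label}") for tok in tokens[1:-1]]
    tagged.append((tokens[-1], f"L-{label}"))
    return tagged

# hypothetical input/output file names
with open("original.csv", newline="") as src, open("biluo.csv", "w", newline="") as dst:
    writer = csv.writer(dst)
    for label, phrase in csv.reader(src):
        writer.writerows(phrase_to_biluo(label, phrase))
```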
@Wapiti08
Wapiti08 / fuzz_test.py
Created November 22, 2020 17:33
nlp_test_2
sentence3 = "Adversaries may bypass UAC mechanisms to elevate process privileges on system. Windows User Account Control (UAC) allows a program to elevate its privileges (tracked as integrity levels ranging from low to high) to perform a task under administrator-level permissions, possibly by prompting the user for confirmation. The impact to the user ranges from denying the operation under high enforcement to allowing the user to perform the action if they are in the local administrators group and click through the prompt or allowing them to enter an administrator password to complete the action. "
technique3 = "Bypass User Account Control "
# ========== Test for sentence ============
docx_textacy3 = spacy_lang(sentence3)
tokens3 = to_tokenized_text(docx_textacy3)
# merge entities and noun chunks into one token
spans3 = list(docx_textacy3.ents) + list(docx_textacy3.noun_chunks)
spans3 = spacy.util.filter_spans(spans3)
merge_spans(spans3, docx_textacy3)
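A possible continuation of this test, fuzzy-matching the technique name against the merged spans, might look like the sketch below (not the gist's actual code; the scorer choice is an assumption):

```python
from fuzzywuzzy import fuzz, process

# candidate spans: after merge_spans, entities and noun chunks are single tokens
candidates3 = [tok.text for tok in docx_textacy3]
best_match, score = process.extractOne(technique3.strip(), candidates3,
                                       scorer=fuzz.token_set_ratio)
print(f"best span: {best_match!r}, ratio: {score}")
```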
@Wapiti08
Wapiti08 / WordCount.scala
Created November 21, 2020 10:35
Word count methods in Scala
```
object WordCount {
  def main(args: Array[String]): Unit = {
    val arr = Array("hello flink", "hello spark", "hello spark")
    val arr_total = arr.flatMap(x => x.split(" "))
    // val counts = arr_total.map(word => word -> 1).groupBy(_._1).map(x => x._1 -> x._2.size)
    val counts = arr_total.groupBy(w => w).mapValues(_.size)
    println(counts)
  }
}
```
```
val counts = arr_total.map((_,1)).reduceByKey(_+_).collect()
```
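Note that `reduceByKey` is an operation on Spark RDDs rather than on plain Scala collections, so this second variant assumes `arr_total` has been parallelized first, e.g. `sc.parallelize(arr).flatMap(_.split(" "))` with an active `SparkContext`.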
@Wapiti08
Wapiti08 / nlp test
Created November 18, 2020 14:03
fuzz_match_spans_tokens
'''
to verify the average ratio for fuzzy technique token matching
'''
from fuzzywuzzy import process
import textacy
from textacy.spacier.doc_extensions import to_tokenized_text, to_tagged_text
import spacy
from textacy.spacier.utils import merge_spans
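A rough sketch of how the average ratio described above might be computed over several technique/span pairs (the pairs and scorer here are illustrative, not taken from the gist):

```python
from fuzzywuzzy import fuzz

# illustrative technique/span pairs; the real script builds these from spaCy docs
pairs = [
    ("Bypass User Account Control", "bypass UAC mechanisms"),
    ("Supply Chain Compromise", "Compromise Software Supply Chain"),
]
ratios = [fuzz.token_set_ratio(technique, span) for technique, span in pairs]
print("average ratio:", sum(ratios) / len(ratios))
```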
@Wapiti08
Wapiti08 / Online_Update.py
Last active October 4, 2020 08:32
Online Update for Deeplog
# ======================= calling part =============================
# predict the testing with reporting the errors
predict.exec_anomaly_predict(exe_model, test_x, test_y)
# predict the testing with tracing back and save the result
exec_anomaly_indexes = predict.exec_anomaly_trace(exe_model, self.log_type, test_x, test_y)
# check whether the result is true positive
true_positive_indexes, false_positive_indexes = \
    predict_feedback.read_exec_result(exec_anomaly_indexes, self.trace_dataframe_location,
                                      self.trace_dict_location)
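A minimal sketch of the online-update step this feedback could feed into, assuming exe_model is a Keras sequence model and confirmed false positives are replayed as additional normal training data (all names and hyperparameters here are hypothetical):

```python
import numpy as np

# sequences that were flagged as anomalies but confirmed normal by feedback
fp_x = np.asarray(test_x)[false_positive_indexes]
fp_y = np.asarray(test_y)[false_positive_indexes]

# fine-tune the existing model so these patterns are treated as normal next time
exe_model.fit(fp_x, fp_y, epochs=3, batch_size=16, verbose=0)
exe_model.save("exec_model_updated.h5")
```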
@Wapiti08
Wapiti08 / ELK_pipelines.md
Created July 3, 2020 00:33
ELK_pipelines_collection
# file --> logstash --> elasticsearch
input {
  file {
    path => "**/anomaly_logs"
  }
}

filter {
 grok {
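A minimal output stage completing the file → logstash → elasticsearch flow named above could look like this (the host and index name are assumptions):

output {
  elasticsearch {
    hosts => ["localhost:9200"]
    index => "anomaly-logs-%{+YYYY.MM.dd}"
  }
}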
@Wapiti08
Wapiti08 / image_nn.md
Last active June 14, 2020 07:13
An easy way to do pre-processing for image processing in a neural network

The way to do pre-processing

from keras.preprocessing.image import ImageDataGenerator

train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    train_dir,
    batch_size=20,
    target_size=(150, 150)
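A sketch of how such a generator is usually consumed in training; the CNN, directory path, and class_mode below are illustrative assumptions, not part of the gist:

```python
from keras import layers, models
from keras.preprocessing.image import ImageDataGenerator

# minimal CNN just to show how the generator plugs into training (illustrative)
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation='relu', input_shape=(150, 150, 3)),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dense(1, activation='sigmoid'),
])
model.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['acc'])

train_datagen = ImageDataGenerator(rescale=1./255)
train_generator = train_datagen.flow_from_directory(
    'data/train',                      # hypothetical directory
    target_size=(150, 150),
    batch_size=20,
    class_mode='binary')

# each batch is (20 images of 150x150x3 rescaled to [0, 1], 20 labels)
model.fit_generator(train_generator, steps_per_epoch=100, epochs=30)
```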
@Wapiti08
Wapiti08 / arg_map.md
Last active June 12, 2020 02:34
The trick to read .env variables in a Docker image during build

It is possible to map the .env environment variables to build ARGs that the Dockerfile can use during build.

docker-compose.yml

version: "x"
services:

  xxx:
    build:
 # note the space between "context:" and "."
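A fuller sketch of the mapping (the service name, variable name, and value are hypothetical):

docker-compose.yml

  services:
    xxx:
      build:
        context: .
        args:
          API_KEY: ${API_KEY}   # resolved from the .env file next to the compose file

.env

  API_KEY=some-secret

Dockerfile

  ARG API_KEY
  ENV API_KEY=${API_KEY}   # make the build arg available in the image environment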
@Wapiti08
Wapiti08 / jq.md
Created June 4, 2020 03:49
Instructions for the jq JSON query command on Ubuntu

Install jq:

sudo apt-get install jq

Two main ways to parse a JSON file:

  • cat xxx.json | jq '.' | less
  • jq '.' xxx.json
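For example, to pull one field out of every object in an array (the field names here are purely illustrative):

  jq '.results[].name' xxx.json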
@Wapiti08
Wapiti08 / spark.md
Last active July 19, 2020 03:20
Instructions on how to run Spark locally

When you want to run Spark with a Jupyter Notebook:

Download a Spark version compatible with your Hadoop version.

In my case, spark-2.4.5-bin-hadoop2.7.

Download Spark locally

After you uncompress the package:
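A common way to finish the setup is to export a few environment variables so that launching PySpark opens a notebook (a sketch, not necessarily the original instructions; paths assume the version mentioned above):

export SPARK_HOME=~/spark-2.4.5-bin-hadoop2.7
export PATH=$SPARK_HOME/bin:$PATH
export PYSPARK_DRIVER_PYTHON=jupyter
export PYSPARK_DRIVER_PYTHON_OPTS=notebook

# running `pyspark` now starts a Jupyter Notebook with Spark available
pyspark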