@richiercyrus
Last active July 14, 2023 19:08
Jupyter Notebook demonstrating the usefulness of Apple's Endpoint Security Framework.
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Import Libraries"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from pyspark.sql import SparkSession\n",
"from pyspark.sql.functions import explode\n",
"from pyspark.sql.functions import count, col"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Create SparkSession"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"spark = SparkSession.builder \\\n",
" .appName(\"HELK Reader\") \\\n",
" .master(\"spark://helk-spark-master:7077\") \\\n",
" .enableHiveSupport() \\\n",
" .getOrCreate()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Verify Spark Variable"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
" <div>\n",
" <p><b>SparkSession - hive</b></p>\n",
" \n",
" <div>\n",
" <p><b>SparkContext</b></p>\n",
"\n",
" <p><a href=\"http://1e333a5a6fbf:4040\">Spark UI</a></p>\n",
"\n",
" <dl>\n",
" <dt>Version</dt>\n",
" <dd><code>v2.4.4</code></dd>\n",
" <dt>Master</dt>\n",
" <dd><code>spark://helk-spark-master:7077</code></dd>\n",
" <dt>AppName</dt>\n",
" <dd><code>HELK Reader</code></dd>\n",
" </dl>\n",
" </div>\n",
" \n",
" </div>\n",
" "
],
"text/plain": [
"<pyspark.sql.session.SparkSession at 0x7f4e6d48d1d0>"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"spark"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Initiate Elasticsearch Dataframe Reader"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"es_reader = (spark.read\n",
" .format(\"org.elasticsearch.spark.sql\")\n",
" .option(\"inferSchema\", \"true\")\n",
" .option(\"es.read.field.as.array.include\", \"metadata,metadata.origin_codesigningflags,metadata.env_variables,metadata.mmapflags,metadata.mmapprotection\")\n",
" .option(\"es.nodes\",\"helk-elasticsearch:9200\")\n",
" .option(\"es.net.http.auth.user\",\"elastic\")\n",
" .option(\"es.net.http.auth.pass\",\"elasticpassword\")\n",
")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The cell above configures a Spark DataFrame reader for the data stored in Elasticsearch. The `es.read.field.as.array.include` option tells the connector to treat the `metadata` field and the listed subfields as arrays. The Elasticsearch username and password are also passed so the reader can authenticate to the Elasticsearch API."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Load Data from Elasticsearch: ESF Index"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [],
"source": [
"esf_df = es_reader.load(\"indexme-*/\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The data of interest, originating from the macOS system, was sent to a Kafka topic (esf), enriched by Logstash, and stored in Elasticsearch under the `indexme-*` index pattern. The command above reads all of the data from the indices matching `indexme-*` into a Spark DataFrame."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Show ESF Spark DataFrame"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"DataFrame[@timestamp: timestamp, @version: string, eventtype: string, metadata: array<struct<ProcessArgs:string,binarypath:string,destinationfilepath:string,env_variables:array<string>,extendedattr:string,fileoffset:bigint,filepath:string,filesize:bigint,gid:bigint,max_protection:bigint,mmapflags:array<string>,mmapprotection:array<string>,origin_binarypath:string,origin_cdhash:string,origin_codesigningflags:array<string>,origin_pid:bigint,origin_platform_binary:boolean,origin_ppid:bigint,origin_signingid:string,origin_teamid:string,origin_uid:bigint,path_truncated:boolean,pid:bigint,ppid:bigint,size:bigint,sourcefilepath:string,sourcepath:string,uid:bigint,user_class:string,user_client:bigint>>, timestamp: timestamp]"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"esf_df"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"+-------------------------------+----------------+\n",
"|eventtype |count(eventtype)|\n",
"+-------------------------------+----------------+\n",
"|ES_EVENT_TYPE_NOTIFY_GET_TASK |1 |\n",
"|ES_EVENT_TYPE_NOTIFY_OPEN |641 |\n",
"|ES_EVENT_NOTIFY_FORK |168 |\n",
"|ES_EVENT_TYPE_NOTIFY_MMAP |89 |\n",
"|ES_EVENT_NOTIFY_EXIT |164 |\n",
"|ES_EVENT_TYPE_NOTIFY_WRITE |8909 |\n",
"|ES_EVENT_TYPE_NOTIFY_IOKIT_OPEN|2 |\n",
"|ES_EVENT_TYPE_NOTIFY_CLOSE |751 |\n",
"|ES_EVENT_TYPE_NOTIFY_CREATE |15 |\n",
"|ES_EVENT_TYPE_NOTIFY_SETOWNER |20 |\n",
"|ES_EVENT_NOTIFY_EXEC |65 |\n",
"|ES_EVENT_TYPE_NOTIFY_SETEXTATTR|196 |\n",
"|ES_EVENT_TYPE_NOTIFY_RENAME |11 |\n",
"+-------------------------------+----------------+\n",
"\n"
]
}
],
"source": [
"esf_df.select(\"eventtype\") \\\n",
" .groupBy(\"eventtype\") \\\n",
" .agg(count(\"eventtype\")) \\\n",
" .show(30, False)"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"root\n",
" |-- ProcessArgs: string (nullable = true)\n",
" |-- binarypath: string (nullable = true)\n",
" |-- destinationfilepath: string (nullable = true)\n",
" |-- env_variables: array (nullable = true)\n",
" | |-- element: string (containsNull = true)\n",
" |-- extendedattr: string (nullable = true)\n",
" |-- fileoffset: long (nullable = true)\n",
" |-- filepath: string (nullable = true)\n",
" |-- filesize: long (nullable = true)\n",
" |-- gid: long (nullable = true)\n",
" |-- max_protection: long (nullable = true)\n",
" |-- mmapflags: array (nullable = true)\n",
" | |-- element: string (containsNull = true)\n",
" |-- mmapprotection: array (nullable = true)\n",
" | |-- element: string (containsNull = true)\n",
" |-- origin_binarypath: string (nullable = true)\n",
" |-- origin_cdhash: string (nullable = true)\n",
" |-- origin_codesigningflags: array (nullable = true)\n",
" | |-- element: string (containsNull = true)\n",
" |-- origin_pid: long (nullable = true)\n",
" |-- origin_platform_binary: boolean (nullable = true)\n",
" |-- origin_ppid: long (nullable = true)\n",
" |-- origin_signingid: string (nullable = true)\n",
" |-- origin_teamid: string (nullable = true)\n",
" |-- origin_uid: long (nullable = true)\n",
" |-- path_truncated: boolean (nullable = true)\n",
" |-- pid: long (nullable = true)\n",
" |-- ppid: long (nullable = true)\n",
" |-- size: long (nullable = true)\n",
" |-- sourcefilepath: string (nullable = true)\n",
" |-- sourcepath: string (nullable = true)\n",
" |-- uid: long (nullable = true)\n",
" |-- user_class: string (nullable = true)\n",
" |-- user_client: long (nullable = true)\n",
"\n"
]
}
],
"source": [
"esf_df.filter(\"eventtype == 'ES_EVENT_NOTIFY_EXEC'\") \\\n",
" .select(\"metadata\",explode(esf_df.metadata)) \\\n",
" .select(\"col.*\").printSchema()"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"+-----------------------------------------------------------------------------------------------------------+-----------------+\n",
"|binarypath |count(binarypath)|\n",
"+-----------------------------------------------------------------------------------------------------------+-----------------+\n",
"|/usr/libexec/xpcproxy |1 |\n",
"|/usr/bin/uname |1 |\n",
"|null |0 |\n",
"|/usr/bin/egrep |54 |\n",
"|/System/Applications/Calculator.app/Contents/MacOS/Calculator |1 |\n",
"|/usr/bin/dirname%1998A42B8DF1DCF44A3C9C58B4A24D323CB93 |2 |\n",
"|/System/Library/CoreServices/Applications/Feedback Assistant.app/Contents/Library/LaunchServices/seedusaged|1 |\n",
"|/usr/bin/cut |1 |\n",
"|/Users/johnappleseed/Downloads/macos_execute_from_memory-master/main |2 |\n",
"|/usr/bin/dirname |1 |\n",
"+-----------------------------------------------------------------------------------------------------------+-----------------+\n",
"\n"
]
}
],
"source": [
"esf_df.filter(\"eventtype == 'ES_EVENT_NOTIFY_EXEC'\") \\\n",
" .select(\"metadata\",explode(esf_df.metadata)) \\\n",
" .select(\"col.*\").select(\"binarypath\") \\\n",
" .groupBy(\"binarypath\") \\\n",
" .agg(count(\"binarypath\")) \\\n",
" .show(30,False)"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"+-------------------------------------------------------------+----+--------------------------------------------------------------------+----------+\n",
"|binarypath |pid |origin_binarypath |origin_pid|\n",
"+-------------------------------------------------------------+----+--------------------------------------------------------------------+----------+\n",
"|/System/Applications/Calculator.app/Contents/MacOS/Calculator|1376|/Users/johnappleseed/Downloads/macos_execute_from_memory-master/main|1376 |\n",
"+-------------------------------------------------------------+----+--------------------------------------------------------------------+----------+\n",
"\n"
]
}
],
"source": [
"esf_df.filter(\"eventtype == 'ES_EVENT_NOTIFY_EXEC'\") \\\n",
" .select(\"metadata\",explode(esf_df.metadata)) \\\n",
" .select(\"col.*\") \\\n",
" .filter(\"binarypath =='/System/Applications/Calculator.app/Contents/MacOS/Calculator'\") \\\n",
" .select(\"binarypath\",\"pid\",\"origin_binarypath\",\"origin_pid\") \\\n",
" .show(10,False)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"+--------------------------+----------------+\n",
"|eventtype |count(eventtype)|\n",
"+--------------------------+----------------+\n",
"|ES_EVENT_TYPE_NOTIFY_OPEN |3 |\n",
"|ES_EVENT_TYPE_NOTIFY_MMAP |1 |\n",
"|ES_EVENT_TYPE_NOTIFY_WRITE|1 |\n",
"|ES_EVENT_TYPE_NOTIFY_CLOSE|3 |\n",
"|ES_EVENT_NOTIFY_EXEC |1 |\n",
"+--------------------------+----------------+\n",
"\n"
]
}
],
"source": [
"esf_df.select(\"eventtype\",\"metadata\",explode(esf_df.metadata)) \\\n",
" .select(\"eventtype\",\"col.*\") \\\n",
" .filter(\"origin_binarypath =='/Users/johnappleseed/Downloads/macos_execute_from_memory-master/main' \\\n",
" AND origin_pid=1376\") \\\n",
" .select(\"eventtype\") \\\n",
" .groupBy(\"eventtype\") \\\n",
" .agg(count(\"eventtype\")) \\\n",
" .show(30, False)"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"root\n",
" |-- ProcessArgs: string (nullable = true)\n",
" |-- binarypath: string (nullable = true)\n",
" |-- destinationfilepath: string (nullable = true)\n",
" |-- env_variables: array (nullable = true)\n",
" | |-- element: string (containsNull = true)\n",
" |-- extendedattr: string (nullable = true)\n",
" |-- fileoffset: long (nullable = true)\n",
" |-- filepath: string (nullable = true)\n",
" |-- filesize: long (nullable = true)\n",
" |-- gid: long (nullable = true)\n",
" |-- max_protection: long (nullable = true)\n",
" |-- mmapflags: array (nullable = true)\n",
" | |-- element: string (containsNull = true)\n",
" |-- mmapprotection: array (nullable = true)\n",
" | |-- element: string (containsNull = true)\n",
" |-- origin_binarypath: string (nullable = true)\n",
" |-- origin_cdhash: string (nullable = true)\n",
" |-- origin_codesigningflags: array (nullable = true)\n",
" | |-- element: string (containsNull = true)\n",
" |-- origin_pid: long (nullable = true)\n",
" |-- origin_platform_binary: boolean (nullable = true)\n",
" |-- origin_ppid: long (nullable = true)\n",
" |-- origin_signingid: string (nullable = true)\n",
" |-- origin_teamid: string (nullable = true)\n",
" |-- origin_uid: long (nullable = true)\n",
" |-- path_truncated: boolean (nullable = true)\n",
" |-- pid: long (nullable = true)\n",
" |-- ppid: long (nullable = true)\n",
" |-- size: long (nullable = true)\n",
" |-- sourcefilepath: string (nullable = true)\n",
" |-- sourcepath: string (nullable = true)\n",
" |-- uid: long (nullable = true)\n",
" |-- user_class: string (nullable = true)\n",
" |-- user_client: long (nullable = true)\n",
"\n"
]
}
],
"source": [
"esf_df.filter(\"eventtype == 'ES_EVENT_TYPE_NOTIFY_MMAP'\") \\\n",
" .select(\"metadata\",explode(esf_df.metadata)) \\\n",
" .select(\"col.*\").printSchema()"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"+-------------------------+-------------+----------------------+--------------------------------------------------------------------+---------------------------------------------------------------------------+\n",
"|eventtype |mmapflags |mmapprotection |origin_binarypath |sourcepath |\n",
"+-------------------------+-------------+----------------------+--------------------------------------------------------------------+---------------------------------------------------------------------------+\n",
"|ES_EVENT_TYPE_NOTIFY_MMAP|[MAP_PRIVATE]|[PROT_READ, PROT_NONE]|/Users/johnappleseed/Downloads/macos_execute_from_memory-master/main|/Users/johnappleseed/Downloads/macos_execute_from_memory-master/test.bundle|\n",
"+-------------------------+-------------+----------------------+--------------------------------------------------------------------+---------------------------------------------------------------------------+\n",
"\n"
]
}
],
"source": [
"esf_df.filter(\"eventtype == 'ES_EVENT_TYPE_NOTIFY_MMAP'\") \\\n",
" .select(\"eventtype\",\"metadata\",explode(esf_df.metadata)) \\\n",
" .select(\"eventtype\",\"col.*\") \\\n",
" .filter(\"origin_binarypath =='/Users/johnappleseed/Downloads/macos_execute_from_memory-master/main' \\\n",
" AND origin_pid=1376\") \\\n",
" .select(\"eventtype\",\"mmapflags\",\"mmapprotection\",\"origin_binarypath\",\"sourcepath\") \\\n",
" .show(10,False)"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"root\n",
" |-- ProcessArgs: string (nullable = true)\n",
" |-- binarypath: string (nullable = true)\n",
" |-- destinationfilepath: string (nullable = true)\n",
" |-- env_variables: array (nullable = true)\n",
" | |-- element: string (containsNull = true)\n",
" |-- extendedattr: string (nullable = true)\n",
" |-- fileoffset: long (nullable = true)\n",
" |-- filepath: string (nullable = true)\n",
" |-- filesize: long (nullable = true)\n",
" |-- gid: long (nullable = true)\n",
" |-- max_protection: long (nullable = true)\n",
" |-- mmapflags: array (nullable = true)\n",
" | |-- element: string (containsNull = true)\n",
" |-- mmapprotection: array (nullable = true)\n",
" | |-- element: string (containsNull = true)\n",
" |-- origin_binarypath: string (nullable = true)\n",
" |-- origin_cdhash: string (nullable = true)\n",
" |-- origin_codesigningflags: array (nullable = true)\n",
" | |-- element: string (containsNull = true)\n",
" |-- origin_pid: long (nullable = true)\n",
" |-- origin_platform_binary: boolean (nullable = true)\n",
" |-- origin_ppid: long (nullable = true)\n",
" |-- origin_signingid: string (nullable = true)\n",
" |-- origin_teamid: string (nullable = true)\n",
" |-- origin_uid: long (nullable = true)\n",
" |-- path_truncated: boolean (nullable = true)\n",
" |-- pid: long (nullable = true)\n",
" |-- ppid: long (nullable = true)\n",
" |-- size: long (nullable = true)\n",
" |-- sourcefilepath: string (nullable = true)\n",
" |-- sourcepath: string (nullable = true)\n",
" |-- uid: long (nullable = true)\n",
" |-- user_class: string (nullable = true)\n",
" |-- user_client: long (nullable = true)\n",
"\n"
]
}
],
"source": [
"esf_df.filter(\"eventtype == 'ES_EVENT_TYPE_NOTIFY_OPEN'\") \\\n",
" .select(\"metadata\",explode(esf_df.metadata)) \\\n",
" .select(\"col.*\").printSchema()"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"+-------------------------+---------------------------------------------------------------------------+--------------------------------------------------------------------+\n",
"|eventtype |filepath |origin_binarypath |\n",
"+-------------------------+---------------------------------------------------------------------------+--------------------------------------------------------------------+\n",
"|ES_EVENT_TYPE_NOTIFY_OPEN|/Users/johnappleseed/Downloads/macos_execute_from_memory-master |/Users/johnappleseed/Downloads/macos_execute_from_memory-master/main|\n",
"|ES_EVENT_TYPE_NOTIFY_OPEN|/dev/dtracehelper |/Users/johnappleseed/Downloads/macos_execute_from_memory-master/main|\n",
"|ES_EVENT_TYPE_NOTIFY_OPEN|/Users/johnappleseed/Downloads/macos_execute_from_memory-master/test.bundle|/Users/johnappleseed/Downloads/macos_execute_from_memory-master/main|\n",
"+-------------------------+---------------------------------------------------------------------------+--------------------------------------------------------------------+\n",
"\n"
]
}
],
"source": [
"esf_df.filter(\"eventtype == 'ES_EVENT_TYPE_NOTIFY_OPEN'\") \\\n",
" .select(\"eventtype\",\"metadata\",explode(esf_df.metadata)) \\\n",
" .select(\"eventtype\",\"col.*\") \\\n",
" .filter(\"origin_binarypath =='/Users/johnappleseed/Downloads/macos_execute_from_memory-master/main' \\\n",
" AND origin_pid=1376\") \\\n",
" .select(\"eventtype\",\"filepath\",\"origin_binarypath\") \\\n",
" .show(10,False)"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"+--------------------------------------------------------------------+-------------------------------------------------------------+---------------------------------------------------------------------------+-------------+----------------------+\n",
"|parent_process_path |process_path |memory_mapped_file |mmapflags |mmapprotection |\n",
"+--------------------------------------------------------------------+-------------------------------------------------------------+---------------------------------------------------------------------------+-------------+----------------------+\n",
"|/Users/johnappleseed/Downloads/macos_execute_from_memory-master/main|/System/Applications/Calculator.app/Contents/MacOS/Calculator|/Users/johnappleseed/Downloads/macos_execute_from_memory-master/test.bundle|[MAP_PRIVATE]|[PROT_READ, PROT_NONE]|\n",
"+--------------------------------------------------------------------+-------------------------------------------------------------+---------------------------------------------------------------------------+-------------+----------------------+\n",
"\n"
]
}
],
"source": [
"execEventsDF = esf_df.filter(\"eventtype == 'ES_EVENT_NOTIFY_EXEC'\") \\\n",
" .select(\"metadata\",explode(esf_df.metadata)) \\\n",
" .select(\"col.*\") \\\n",
" .select(\"binarypath\",\"pid\",\"origin_binarypath\",\"origin_pid\")\n",
"\n",
"mmapEventsDF = esf_df.filter(\"eventtype == 'ES_EVENT_TYPE_NOTIFY_MMAP'\") \\\n",
" .select(\"eventtype\",\"metadata\",explode(esf_df.metadata)) \\\n",
" .select(\"eventtype\",\"col.*\") \\\n",
" .select(\"eventtype\",\"mmapflags\",\"mmapprotection\",\"origin_binarypath\",\"origin_pid\",\"sourcepath\")\n",
"\n",
"# Join ES_EVENT_NOTIFY_EXEC and ES_EVENT_TYPE_NOTIFY_MMAP events on origin_pid and origin_binarypath\n",
"# to get the executed process, its parent, and the file the parent memory-mapped.\n",
"# This is an inner join, so only parents that also have an MMAP event are returned.\n",
"execEventsDF.join(mmapEventsDF, ['origin_pid', 'origin_binarypath']) \\\n",
" .select(col(\"origin_binarypath\").alias(\"parent_process_path\"), \\\n",
" col(\"binarypath\").alias(\"process_path\"),\n",
" col(\"sourcepath\").alias(\"memory_mapped_file\"),\n",
" \"mmapflags\", \"mmapprotection\").show(10,False)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "PySpark_Python3",
"language": "python",
"name": "pyspark3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.6"
}
},
"nbformat": 4,
"nbformat_minor": 2
}
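
The final cell of the notebook pairs each exec event with the mmap events that share the same `origin_pid` and `origin_binarypath`. As a minimal sketch of that join logic outside Spark, the plain-Python function below performs the same inner join on lists of dictionaries. The function name `join_exec_with_mmap` is hypothetical, and the `/usr/bin/uname` record (with its `/bin/sh` parent and pid 42) is made up to show a row that the join drops; the other values mirror the notebook output. This illustrates the idea only and is not how Spark executes the join.

```python
from collections import defaultdict


def join_exec_with_mmap(exec_events, mmap_events):
    """Inner-join exec and mmap records on (origin_pid, origin_binarypath)."""
    # Index mmap events by the join key so each exec event is matched quickly.
    mmap_by_key = defaultdict(list)
    for m in mmap_events:
        mmap_by_key[(m["origin_pid"], m["origin_binarypath"])].append(m)

    rows = []
    for e in exec_events:
        key = (e["origin_pid"], e["origin_binarypath"])
        # Exec events whose parent has no mmap event produce no rows (inner join).
        for m in mmap_by_key.get(key, []):
            rows.append({
                "parent_process_path": e["origin_binarypath"],
                "process_path": e["binarypath"],
                "memory_mapped_file": m["sourcepath"],
                "mmapflags": m["mmapflags"],
                "mmapprotection": m["mmapprotection"],
            })
    return rows


MAIN = "/Users/johnappleseed/Downloads/macos_execute_from_memory-master/main"

exec_events = [
    {"binarypath": "/System/Applications/Calculator.app/Contents/MacOS/Calculator",
     "pid": 1376, "origin_binarypath": MAIN, "origin_pid": 1376},
    # Hypothetical record: its parent has no mmap event, so the join drops it.
    {"binarypath": "/usr/bin/uname",
     "pid": 43, "origin_binarypath": "/bin/sh", "origin_pid": 42},
]

mmap_events = [
    {"origin_binarypath": MAIN, "origin_pid": 1376,
     "sourcepath": "/Users/johnappleseed/Downloads/macos_execute_from_memory-master/test.bundle",
     "mmapflags": ["MAP_PRIVATE"], "mmapprotection": ["PROT_READ", "PROT_NONE"]},
]

rows = join_exec_with_mmap(exec_events, mmap_events)
for row in rows:
    print(row["parent_process_path"], "->", row["process_path"],
          "mapped", row["memory_mapped_file"])
```

Only the Calculator row survives, because it is the only exec event whose parent also memory-mapped a file. Joining on both the pid and the binary path, as the notebook does, reduces the chance of matching unrelated processes that happen to reuse a pid.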