@BoredHackerBlog
Last active April 30, 2022 01:58
Apache Spark / PySpark eve.json search
In [1]: from pyspark.sql import SparkSession
In [2]: spark = SparkSession \
   ...:     .builder \
   ...:     .appName("example") \
   ...:     .getOrCreate()
22/04/29 18:55:18 WARN Utils: Your hostname, ubuntu resolves to a loopback address: 127.0.1.1; using 192.168.95.155 instead (on interface ens33)
22/04/29 18:55:18 WARN Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Using Spark's default log4j profile: org/apache/spark/log4j-defaults.properties
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
22/04/29 18:55:18 WARN NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
In [3]: alerts = spark.read.json('alerts-only.json', multiLine=True)
In [4]: alerts.printSchema()
root
 |-- alert: struct (nullable = true)
 |    |-- action: string (nullable = true)
 |    |-- category: string (nullable = true)
 |    |-- gid: long (nullable = true)
 |    |-- rev: long (nullable = true)
 |    |-- severity: long (nullable = true)
 |    |-- signature: string (nullable = true)
 |    |-- signature_id: long (nullable = true)
 |-- app_proto: string (nullable = true)
 |-- app_proto_expected: string (nullable = true)
 |-- app_proto_tc: string (nullable = true)
 |-- app_proto_ts: string (nullable = true)
 |-- dest_ip: string (nullable = true)
 |-- dest_port: long (nullable = true)
 |-- email: struct (nullable = true)
 |    |-- status: string (nullable = true)
 |-- event_type: string (nullable = true)
 |-- flow: struct (nullable = true)
 |    |-- bytes_toclient: long (nullable = true)
 |    |-- bytes_toserver: long (nullable = true)
 |    |-- pkts_toclient: long (nullable = true)
 |    |-- pkts_toserver: long (nullable = true)
 |    |-- start: string (nullable = true)
 |-- flow_id: long (nullable = true)
 |-- http: struct (nullable = true)
 |    |-- hostname: string (nullable = true)
 |    |-- http_content_type: string (nullable = true)
 |    |-- http_method: string (nullable = true)
 |    |-- http_refer: string (nullable = true)
 |    |-- http_user_agent: string (nullable = true)
 |    |-- length: long (nullable = true)
 |    |-- protocol: string (nullable = true)
 |    |-- redirect: string (nullable = true)
 |    |-- status: long (nullable = true)
 |    |-- url: string (nullable = true)
 |-- pcap_cnt: long (nullable = true)
 |-- proto: string (nullable = true)
 |-- smtp: struct (nullable = true)
 |    |-- helo: string (nullable = true)
 |    |-- mail_from: string (nullable = true)
 |    |-- rcpt_to: array (nullable = true)
 |    |    |-- element: string (containsNull = true)
 |-- src_ip: string (nullable = true)
 |-- src_port: long (nullable = true)
 |-- timestamp: string (nullable = true)
 |-- tls: struct (nullable = true)
 |    |-- fingerprint: string (nullable = true)
 |    |-- issuerdn: string (nullable = true)
 |    |-- notafter: string (nullable = true)
 |    |-- notbefore: string (nullable = true)
 |    |-- serial: string (nullable = true)
 |    |-- sni: string (nullable = true)
 |    |-- subject: string (nullable = true)
 |    |-- version: string (nullable = true)
 |-- tx_id: long (nullable = true)
 |-- vars: struct (nullable = true)
 |    |-- flowbits: struct (nullable = true)
 |    |    |-- ET.BotccIP: boolean (nullable = true)
 |    |    |-- ET.ELFDownload: boolean (nullable = true)
 |    |    |-- ET.ETERNALBLUE: boolean (nullable = true)
 |    |    |-- ET.Evil: boolean (nullable = true)
 |    |    |-- ET.HB.Request.CI: boolean (nullable = true)
 |    |    |-- ET.HB.Request.SI: boolean (nullable = true)
 |    |    |-- ET.HB.Response.CI: boolean (nullable = true)
 |    |    |-- ET.MalformedTLSHB: boolean (nullable = true)
 |    |    |-- ET.TorIP: boolean (nullable = true)
 |    |    |-- ET.http.binary: boolean (nullable = true)
 |    |    |-- ET.smb.binary: boolean (nullable = true)
 |    |    |-- ET.zbot.dat: boolean (nullable = true)
 |    |    |-- exe.no.referer: boolean (nullable = true)
 |    |    |-- is_proto_irc: boolean (nullable = true)
 |    |-- flowints: struct (nullable = true)
 |    |    |-- http.anomaly.count: long (nullable = true)
 |    |    |-- smtp.anomaly.count: long (nullable = true)
 |    |    |-- tls.anomaly.count: long (nullable = true)
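Several nested field names in the schema above contain literal dots (for example `ET.TorIP` under `vars.flowbits`). Spark treats a dot in a column path as a struct separator, so those names have to be wrapped in backticks when they are referenced. A minimal sketch of building such a path follows; `quote_path` is a hypothetical helper, not part of the original session, and it does not handle names that themselves contain backticks.

```python
def quote_path(parts):
    """Join struct field names into a Spark column path.

    Names that contain a dot are wrapped in backticks so Spark does not
    read the dot as a struct separator. Hypothetical helper for illustration.
    """
    return ".".join(f"`{p}`" if "." in p else p for p in parts)


tor_flag = quote_path(["vars", "flowbits", "ET.TorIP"])
print(tor_flag)  # vars.flowbits.`ET.TorIP`
```

Assuming the `alerts` DataFrame from this session, the resulting string could be passed to `alerts.select(tor_flag)` or interpolated into a `spark.sql(...)` query.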
In [5]: alerts.createOrReplaceTempView("alerts")
In [6]: spark.sql("select alert.signature from alerts where alert.signature like '%ET SCAN%'").show()
+--------------------+
|           signature|
+--------------------+
|ET SCAN Potential...|
|ET SCAN Behaviora...|
|ET SCAN Behaviora...|
|ET SCAN Behaviora...|
|ET SCAN Behaviora...|
|ET SCAN Behaviora...|
|ET SCAN Behaviora...|
|ET SCAN Potential...|
|ET SCAN Non-Allow...|
|ET SCAN Nessus FT...|
|ET SCAN NETWORK O...|
|ET SCAN NETWORK I...|
|ET SCAN Possible ...|
+--------------------+
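The output above is cut off at 20 characters per cell, which hides most of each signature. The sketch below expresses the same filter through the DataFrame API so the result can be shown untruncated. `like_pattern` and `find_signatures` are hypothetical names, the `alerts` DataFrame is assumed to be the one loaded in this session, and the `distinct()` call is an addition that the original query does not make.

```python
def like_pattern(term):
    """Wrap a search term in SQL LIKE wildcards (no escaping of % or _)."""
    return f"%{term}%"


def find_signatures(alerts_df, term):
    """Return the distinct alert signatures that contain `term`.

    `alerts_df` is assumed to be the DataFrame loaded in the session above.
    pyspark is imported lazily so the helpers can be defined without Spark.
    """
    from pyspark.sql import functions as F

    signature = F.col("alert.signature")
    return (
        alerts_df
        .where(signature.like(like_pattern(term)))
        .select(signature)
        .distinct()  # assumption: duplicates are not needed; drop to match the SQL
    )
```

With the session's DataFrame, `find_signatures(alerts, "ET SCAN").show(truncate=False)` should print the full signature strings instead of the 20-character previews.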