This is a GSoC 2021 proposal for SNARE (under the honeypot project)

Candidate Introduction

  • OM PARIKH
  • InfoSec practitioner (specializing in fuzzing, cryptography and binary exploitation)
  • Sophomore, Part-time Blockchain developer & InfoSec intern @ Matic
  • Member @ Appwrite, contributed to 10+ OSS projects
  • GitHub, Stack Overflow, Ethereum Stack Exchange, LinkedIn
  • Creator & Maintainer of NFTminter (15+ stars on GitHub and 500+ daily visits on the website)

Task Statement (SNARE 3.1)

  1. Re-write Cloner
  2. Upgrade the SNARE codebase to be compatible with the latest aiohttp version
  3. Fix the CI pipeline
  4. Improve Test Coverage
  5. Work on critical issues ( #236, #7 and #284 )
  6. Package Publishing & Documentation

Task Analysis

  • Solving the tasks above requires an intermediate knowledge of Python and familiarity with InfoSec fundamentals, Git, client-server HTTP communication, unit & E2E testing, and packaging & publishing.
  • The tasks should be tackled in the order given in the Task Statement section for a smooth workflow.
  • #5 in the Task Statement may require extensive debugging (the codebase is lightweight, so this should not be a problem).

What I have done so far

  • Set up a local TANNER + SNARE development environment and ran the test suite
  • Tested the core components (Tanner handler, dorks & cloner)
  • Studied the breaking changes from aiohttp 3.4 to aiohttp 3.7 (changelog)
  • Examined the SNARE & TANNER logs after exposing them to artificial traffic

1. Re-write Cloner

  • Currently, the cloner relies on BeautifulSoup for all the heavy lifting of parsing each response, and uses asyncio's Queue to follow the other hyperlinks found in the fetched response.
  • This is inefficient for multiple reasons (as per the discussion on Slack) and leaves a wide range of corner cases to be covered.
  • pywebcopy is a good fit for this: it makes the user experience smooth and has less overhead than something like Selenium (which loads a headless browser in memory).
  • pywebcopy is stable, reliable and has good test coverage.
  • It is backed by lxml, requests, beautifulsoup4, pyquery and requests_html (most of which SNARE already uses natively for the same purpose) and supports authentication, bypass_robots & cookies.
  • The package can be integrated smoothly by adding helper functions that wrap selected core pywebcopy methods.
  • ETA : 22-25 Hours
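
For reference, the queue-based link traversal described in the first bullet can be sketched as follows. This is a simplified illustration and not SNARE's actual cloner: it uses the standard library's html.parser in place of BeautifulSoup, and the names LinkExtractor, crawl, fake_fetch and SITE are hypothetical. The fetch coroutine is injected so the sketch runs against an in-memory site without network access.

```python
import asyncio
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect the href values of all <a> tags in a document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


async def crawl(root_url, fetch, max_depth=2):
    """Breadth-first crawl of same-host pages, starting at root_url.

    `fetch` is a hypothetical coroutine that returns the HTML body for a
    URL, or None when the page cannot be retrieved.
    """
    host = urlparse(root_url).netloc
    queue = asyncio.Queue()
    await queue.put((root_url, 0))
    seen = {root_url}
    pages = {}

    while not queue.empty():
        url, depth = await queue.get()
        body = await fetch(url)
        if body is None:
            continue  # broken link: nothing to store or parse
        pages[url] = body
        if depth >= max_depth:
            continue  # keep the page but do not follow its links
        parser = LinkExtractor()
        parser.feed(body)
        for href in parser.links:
            # Resolve relative links and drop any #fragment
            absolute = urljoin(url, href).split("#", 1)[0]
            if urlparse(absolute).netloc == host and absolute not in seen:
                seen.add(absolute)
                await queue.put((absolute, depth + 1))
    return pages


# In-memory stand-in for a website (assumption, for illustration only)
SITE = {
    "http://example.test/": '<a href="/about">About</a> <a href="http://other.test/">Ext</a>',
    "http://example.test/about": '<a href="/">Home</a> <a href="/missing">Broken</a>',
}


async def fake_fetch(url):
    return SITE.get(url)


cloned = asyncio.run(crawl("http://example.test/", fake_fetch))
print(sorted(cloned))
```

Even this toy version has to handle relative links, fragments, external hosts, already-visited pages and broken links by hand, which hints at the kind of corner cases a dedicated library would take over.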

2. Upgrade the SNARE codebase to be compatible with the latest aiohttp version

  • An older version of aiohttp (an almost 3-year-old package) and aiohttp_jinja2 are used in tanner_handler, html_handler, server, middleware and their respective test cases. (The cloner will already be fixed by that time.)
  • There were very few breaking changes; however, the response error-handling functions (such as handle 400 & handle 500) cause trouble with TANNER and need to be fixed.
  • The failed CI builds from dependabot's automatic dependency upgrades give a lot of hints.
  • ETA: 17-20 Hours

3. Fix the CI pipeline

  • The Travis config still uses Python 3.5 as its primary version, and this should be changed ASAP!
  • The config calls pip install directly, while it should use flags such as --no-cache-dir for more deterministic builds.
  • A few other minor improvements
  • ETA: 5-7 Hours

4. Improve Test Coverage

  • Current test coverage is around 64% (although snare/ has 90% coverage); this is comparatively low and needs to be increased for a more robust architecture.
  • Replace the test cases to match the new cloner version and the upgraded aiohttp.
  • Add more tests for the middlewares and server (as they are the ones lowering overall test coverage)
  • ETA: 20-25 Hours

5. Work on critical issues

  • Solving the open SNARE issues mentioned above (#236, #7 and #284) would help end users a lot.
  • I will start with #284, in which SNARE records the proxy's IP as the attacker's IP. This can be solved by debugging the headers dict and checking for the X-Forwarded-For header.
  • SSL support for the server can be added with Python's ssl library and the upgraded aiohttp. For example:
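import ssl
from aiohttp import web

app = web.Application()
app.add_routes([..., ...])  # route definitions elided

# Purpose.CLIENT_AUTH creates a context suitable for server-side sockets
ssl_context = ssl.create_default_context(ssl.Purpose.CLIENT_AUTH)
# Load the server certificate and its private key
ssl_context.load_cert_chain('path/to/domain.crt', 'path/to/domain.key')

# Serve the application over HTTPS
web.run_app(app, ssl_context=ssl_context)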
  • Work on other issues after discussing their priority with the community
  • ETA: After completing all the tasks above and reserving 15 hours for the task below, I will spend the remaining time on this.
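
For #284, the header check might look roughly like the sketch below. It is a minimal illustration under stated assumptions, not SNARE's code: resolve_client_ip and trusted_proxies are hypothetical names, and the headers argument is a plain mapping. Because X-Forwarded-For can be set by any client, the sketch only honours it when the direct peer is a known proxy, and it walks the chain from the right so that the first address not belonging to a trusted proxy is returned.

```python
import ipaddress


def resolve_client_ip(peer_ip, headers, trusted_proxies):
    """Return the most likely origin IP of a request.

    peer_ip: address of the direct TCP peer (the proxy, when one is used).
    headers: mapping of request header names to values.
    trusted_proxies: set of proxy addresses whose headers are accepted.
    """
    # Ignore the header unless the request came through a proxy we control
    if peer_ip not in trusted_proxies:
        return peer_ip

    # Header names are case-insensitive, so normalise them before the lookup
    lowered = {name.lower(): value for name, value in headers.items()}
    forwarded = lowered.get("x-forwarded-for", "")

    # The value has the form "client, proxy1, proxy2"; scan right to left
    for candidate in reversed([part.strip() for part in forwarded.split(",")]):
        try:
            ipaddress.ip_address(candidate)
        except ValueError:
            break  # missing or malformed entry: fall back to the peer address
        if candidate not in trusted_proxies:
            return candidate
    return peer_ip


print(resolve_client_ip("10.0.0.1", {"X-Forwarded-For": "203.0.113.7"}, {"10.0.0.1"}))
```

In an aiohttp handler the inputs would presumably come from request.remote and request.headers; the set of trusted proxies would have to be supplied through configuration.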

6. Package Publishing & Documentation

  • This includes building setup.py after completing the major tasks above and ensuring the metadata is correct.
  • After publishing, check installation of the package via different methods (pip, egg, wheel) and ensure stability.
  • Improve the current documentation to cover the minimal basic details, serving as a usage walkthrough (not including co-existence with TANNER, as that will be done in GSoD)
  • ETA: 15-17 Hours

Community Bonding

  • Read the codebase once more, thoroughly and quickly (1-2 days)
  • Discuss implementation-specific details and adjust the workflow according to the changes suggested
  • Start contributing as quickly as possible

Others

  • I will be able to devote at least 45 hours a week
  • I was not able to contribute actively before GSoC due to some health issues and an exam (glad both are completely resolved now!)
  • #1, #2 and #3 will be finished before the first evaluation
  • I am open to changes to the proposal and to new ideas if I have missed anything