@cigzigwon
Created September 16, 2022 15:45
Crawly Config for WAFs and JL File Writer
import Config

# Settings for the author's own :revo application; Crawly itself does not read these.
config :revo,
       :user_agent,
       "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0"

config :revo, :wait_intervals, [30_000, 45_000, 60_000, 65_000, 76_000]

config :crawly,
  # A single concurrent request per domain.
  concurrent_requests_per_domain: 1,
  closespider_timeout: 1,
  manager_operations_timeout: 5 * 60_000,
  middlewares: [
    Crawly.Middlewares.DomainFilter,
    Crawly.Middlewares.UniqueRequest,
    {Crawly.Middlewares.UserAgent,
     user_agents: [
       # "Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)",
       "Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:84.0) Gecko/20100101 Firefox/84.0"
     ]},
    # Options passed through to the HTTP client for each request.
    {Crawly.Middlewares.RequestOptions, [timeout: 16_000]}
  ],
  pipelines: [
    # Encode each item as JSON, then write it as one line of a .jl file.
    Crawly.Pipelines.JSONEncoder,
    {Crawly.Pipelines.WriteToFile, folder: "/app/priv/data", extension: "jl"}
  ]
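
The gist does not show how `:wait_intervals` is consumed. One plausible use, given the WAF focus, is to pause for a randomly chosen interval between crawls. The sketch below assumes the values are milliseconds; the module `WaitIntervalSketch` and its functions are hypothetical and are not part of the gist or of Crawly.

```elixir
defmodule WaitIntervalSketch do
  # Hypothetical helper: not part of the gist or of Crawly.
  # Fallback used when the :wait_intervals key is not configured.
  @default_intervals [30_000, 45_000, 60_000, 65_000, 76_000]

  # Returns one interval chosen at random from the configured list.
  # Assumption: the values are milliseconds.
  def pick(app \\ :revo) do
    app
    |> Application.get_env(:wait_intervals, @default_intervals)
    |> Enum.random()
  end

  # Sleeps for a randomly chosen interval, then calls the given
  # zero-arity function and returns its result.
  def throttled(fun, app \\ :revo) when is_function(fun, 0) do
    app |> pick() |> Process.sleep()
    fun.()
  end
end
```

A caller could wrap whatever starts the next crawl, for example `WaitIntervalSketch.throttled(fn -> start_next_crawl() end)`, where `start_next_crawl/0` stands in for the application's own function.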