@hmmhmmhm
Last active July 9, 2020 01:04
Anti-Cloudflare Script
import puppeteer from 'puppeteer'
import puppeteerExtra from 'puppeteer-extra'
import pluginStealth from 'puppeteer-extra-plugin-stealth'
import cloudscraper from 'cloudscraper'
import {User} from "../domain/user";
import {accessing} from "../handlers/nasHandler";
let scraper: any = cloudscraper
let browser;
export const getBrowser = async () => {
  if (!browser) {
    browser = await init({isHeadless: false});
  }
  return browser;
}
export const init = async ({ isHeadless = true }) => {
  console.log(`🚧 Initial startup in progress...`)
  console.log(`🚧 Starting headless Chrome...`)
  console.log(`🚧 You can quit at any time with Ctrl+C.\n`)
  try {
    const args = [
      '--no-sandbox',
      '--disable-setuid-sandbox',
      '--disable-infobars',
      '--window-position=0,0',
      '--ignore-certificate-errors',
      '--ignore-certificate-errors-spki-list',
      //'--user-agent="Mozilla/6.0 (Macintosh; Intel Mac OS X 10_12_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/65.0.3312.0 Safari/537.36"',
    ]
    const options = {
      args,
      headless: isHeadless,
      ignoreHTTPSErrors: true,
      userDataDir: './tmp',
    }
    // @ts-ignore
    puppeteerExtra.use(pluginStealth())
    const browser = await puppeteerExtra.launch(options)
    console.log(`🚧 Headless Chrome has started.`)
    return browser
  } catch (e) {
    console.log(`🚧 An error occurred while running headless Chrome.`)
    console.log(e)
  }
  return undefined
}
// Navigate to a page
export const navigatePage = async (
  page: puppeteer.Page,
  targetUrl: string,
  waitCode: string = 'networkidle0'
) => {
  try {
    let hookHeaders: any = await scrapeCloudflareHttpHeaderCookie(targetUrl)
    // Anti Cloud Flare
    await page.setRequestInterception(true)
    // Drop handlers left over from earlier attempts so that each request is
    // continued only once (this also removes any other 'request' listeners).
    page.removeAllListeners('request')
    page.on('request', request => {
      const headers = request.headers()
      // Overrides must be passed under the `headers` key; scraped values win.
      request.continue({ headers: { ...headers, ...hookHeaders } })
    })
    // @ts-ignore
    await page.goto(targetUrl, {
      waitUntil: ['load', waitCode],
    })
    return true
  } catch (e) {
    console.log('An error occurred while accessing the page.')
    console.log('Retrying access to this page in 5 seconds.')
    console.log(`Problematic page: ${targetUrl}\n`)
    await page.waitFor(5000)
    return false
  }
}
// Safe page navigation
export const saftyGoto = async (
  page: puppeteer.Page,
  targetUrl: string,
  waitCode: string = 'networkidle0'
) => {
  // Keep trying to connect until it succeeds.
  let isSuccess = false
  while (!isSuccess) isSuccess = await navigatePage(page, targetUrl, waitCode)
}
export const delay = time => new Promise(res => setTimeout(res, time))
// Initialize the scraper
export const scrapeCloudflareHttpHeaderCookie = url =>
  new Promise((resolve, reject) =>
    scraper.get(url, function(error, response, body) {
      if (error) {
        reject(error)
      } else {
        // Resolve with the headers cloudscraper sent on its request.
        resolve(response.request.headers)
      }
    })
  )
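
The request hook in `navigatePage` combines the headers of the intercepted request with the headers that cloudscraper used. A minimal sketch of that merge follows, assuming that Puppeteer reports header names in lower case while the scraped headers may use mixed case. `HeaderMap` and `mergeRequestHeaders` are hypothetical names for illustration and are not part of the original script.

```typescript
// Hypothetical helper, not part of the original gist.
type HeaderMap = Record<string, string>

// Merge scraped headers over the page's own request headers.
// Names are lower-cased so that 'User-Agent' and 'user-agent'
// end up as a single entry, with the scraped value taking priority.
const mergeRequestHeaders = (
  pageHeaders: HeaderMap,
  scrapedHeaders: HeaderMap
): HeaderMap => {
  const merged: HeaderMap = { ...pageHeaders }
  for (const name of Object.keys(scrapedHeaders)) {
    merged[name.toLowerCase()] = String(scrapedHeaders[name])
  }
  return merged
}
```

Inside the hook, the result could be passed as `request.continue({ headers: mergeRequestHeaders(request.headers(), hookHeaders) })`. Lower-casing is a design choice made here to avoid sending the same header twice under different spellings.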
@hmmhmmhm
Author

hmmhmmhm commented Jul 9, 2020

I have made and am sharing a project template
that works with the latest version, Puppeteer 5.
https://github.com/hmmhmmhm/puppeteer-template
