kafkacat -F ./kafkacat.auth -p 0 -t dlq-lcc-dy0v1 -C -f '\nKey (%K bytes): %k
Value (%S bytes): %s
Timestamp: %T
Partition: %p
Offset: %o
Headers: %h\n'
[
  {
    "user_group": [
      "dashboarding"
    ],
    "query_group": [],
    "name": "dashboarding",
    "memory_percent_to_use": 70,
    "query_concurrency": 15,
    "concurrency_scaling": "auto"
// spark-shell --jars /home/otto/algebird-core_2.10-0.9.0.jar,/home/mforns/refinery-core-0.0.9.jar
import java.util.Date
import java.text.SimpleDateFormat
import org.wikimedia.analytics.refinery.core.PageviewDefinition
import org.wikimedia.analytics.refinery.core.Webrequest
import scala.math.pow
import org.apache.spark.rdd.RDD
import com.twitter.algebird.QTree
import org.apache.spark.sql.{ Encoders, SaveMode }
val readPath = "/wmf/data/event/PrefUpdate/year=*/month=*/day=*/hour=*/*.parquet"
val propertyWhitelistFilter = "event.property in ('skin', 'mfMode', 'mf_amc_optin', 'VectorSkinVersion', 'popups', 'popupsreferencepreviews', 'discussiontools-betaenable', 'betafeatures-auto-enroll', 'echo-notifications-blacklist', 'email-blacklist', 'growthexperiments-help-panel-tog-help-panel', 'growthexperiments-homepage-enable', 'growthexperiments-homepage-pt-link')"
case class UserAgent(
  browser_family: String,
  browser_major: String,
  browser_minor: String,
use wmf;
select distinct base_name
from
  mediarequest
where year=2019
  and month=12
  and day=1
  and hour=1
  and base_name like '%commons%'
curl -v -H 'Content-Type: text/plain' -d '{"$schema": "/mediawiki/client/error/1.0.0", "meta": {"stream": "mediawiki.client.error"}, "message": "test event", "type": "TEST", "url": "http://otto-test.org", "user_agent": "otto test"}' 'https://intake-logging.wikimedia.org/v1/events?hasty=true'
mw.config.values.wgWMEClientErrorIntakeURL
/**
{"$schema": "/mediawiki/client/error/1.0.0", "meta": {"stream": "mediawiki.client.error"}, "message": "test event", "type": "TEST", "url": "http://otto-test.org", "user_agent": "otto test"}
**/
mw.track( 'global.error', {
	errorMessage: 'ayayaya',
	url: 'https://intake-logging.wikimedia.org/v1/events?hasty=true',
	lineNumber: 1,
	columnNumber: 1,
import java.net.URLDecoder
import java.net.URLEncoder
import org.apache.spark.sql.functions._
val urlDecoder = (u: String) => URLDecoder.decode(u.replaceAll("%(?![0-9a-fA-F]{2})", "%25").replaceAll("\\+", "%2B"), "UTF-8")
val urlEncoder = (u: String) => URLEncoder.encode(u, "UTF-8")
val countSlashes = (u: String) => u.count(_ == '/')
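The two replaceAll calls in urlDecoder are easy to misread, so here is a minimal sketch of what they appear to guard against. It assumes the intent is to make URLDecoder tolerant of stray '%' characters and to keep '+' from being turned into a space. The name safeDecode is hypothetical and only used for illustration; it can be pasted into spark-shell or a plain Scala REPL.

```scala
import java.net.URLDecoder
import scala.util.Try

// Hypothetical helper: same steps as urlDecoder, split up and named.
def safeDecode(u: String): String = {
  // A '%' that is not followed by two hex digits is not a valid escape;
  // rewrite it to "%25" so it decodes back to a literal '%'.
  val escapedPercents = u.replaceAll("%(?![0-9a-fA-F]{2})", "%25")
  // URLDecoder treats '+' as an encoded space; rewrite it to "%2B"
  // so it decodes back to a literal '+'.
  val escapedPluses = escapedPercents.replaceAll("\\+", "%2B")
  URLDecoder.decode(escapedPluses, "UTF-8")
}

// Without the normalization, a stray '%' makes URLDecoder throw
// and a '+' silently becomes a space.
val rawStrayPercent = Try(URLDecoder.decode("100%", "UTF-8"))
val rawPlus = URLDecoder.decode("a+b", "UTF-8")

println(rawStrayPercent.isFailure) // true
println(rawPlus)                   // a b
println(safeDecode("100%"))        // 100%
println(safeDecode("a+b"))         // a+b
println(safeDecode("a%2Fb"))       // a/b
```

Valid escapes such as %2F are still decoded, so countSlashes applied to the decoded value counts encoded slashes as well as literal ones.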
spark2-shell --master yarn --executor-memory 8G --executor-cores 4 --driver-memory 16G --conf spark.dynamicAllocation.maxExecutors=64 --conf spark.executor.memoryOverhead=2048 --jars /srv/deployment/analytics/refinery/artifacts/refinery-job.jar,/srv/deployment/analytics/refinery/artifacts/refinery-hive.jar
Three levels (job -> attempt -> application) to reach Cassandra
The newly built Cassandra jar (with exclusions) is at /tmp/oozie-nuria
hdfs dfs -rm -r /tmp/oozie-nuria; hdfs dfs -mkdir /tmp/oozie-nuria; hdfs dfs -put oozie/* /tmp/oozie-nuria
Start job: