Takeshi Yamamuro (maropu) — gists
- Complex Embeddings for Simple Link Prediction: https://arxiv.org/abs/1606.06357
- Vertex AI Matching Engine: https://cloud.google.com/vertex-ai/docs/matching-engine
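For the ComplEx model in the first link, the score of a triple (s, r, o) is Re(⟨w_r, e_s, conj(e_o)⟩) over complex-valued embeddings. A minimal NumPy sketch of that scoring function — the dimension and the random vectors are illustrative assumptions, not trained embeddings:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 4  # embedding dimension (illustrative)

# Complex-valued embeddings for subject, object, and relation
e_s = rng.standard_normal(d) + 1j * rng.standard_normal(d)
e_o = rng.standard_normal(d) + 1j * rng.standard_normal(d)
w_r = rng.standard_normal(d) + 1j * rng.standard_normal(d)

# ComplEx score: Re(<w_r, e_s, conj(e_o)>) = Re(sum_k w_r[k] * e_s[k] * conj(e_o[k]))
score = float(np.real(np.sum(w_r * e_s * np.conj(e_o))))
```

In practice the embeddings are learned by ranking true triples above corrupted ones; the sketch only shows the trilinear scoring step.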
$ ./build/mvn clean test -DmemoryFiles=rerun.txt
$ cat
TestFailed Some(org.apache.spark.api.python.RepairSuite) org.apache.spark.api.python.RepairSuite None
TestFailed Some(org.apache.spark.api.python.DepGraphSuite) org.apache.spark.api.python.DepGraphSuite Some(computeFunctionalDepMap)
$ ./build/mvn clean test -DtestsFiles=rerun.txt
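The recorded lines can also be post-processed programmatically. Below is a hypothetical parser — the regex and function name are assumptions based only on the two `TestFailed` lines shown above:

```python
import re

def parse_rerun_line(line):
    # Matches lines of the form:
    #   "TestFailed Some(<suite>) <suite> None"
    #   "TestFailed Some(<suite>) <suite> Some(<test name>)"
    m = re.match(r"TestFailed Some\((\S+)\) (\S+) (?:None|Some\((\S+)\))", line)
    if m is None:
        return None
    # group(3) is None when the whole suite, not a single test, is recorded
    return m.group(2), m.group(3)
```

For example, the second recorded line parses to the suite name plus the single failed test, `computeFunctionalDepMap`.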
Run starting. Expected test count is: 5
DepGraphSuite:
13:29:58.598 WARN org.apache.spark.util.Utils: Your hostname, maropus-MacBook-Pro.local resolves to a loopback address: 127.0.0.1; using 192.168.3.2 instead (on interface en0)
13:29:58.599 WARN org.apache.spark.util.Utils: Set SPARK_LOCAL_IP if you need to bind to another address
Spark session available as 'spark'.
Welcome to
      ____              __
     / __/__  ___ _____/ /__
    _\ \/ _ \/ _ `/ __/ '_/
   /___/ .__/\_,_/_/ /_/\_\   version 3.2.0-SNAPSHOT
      /_/
Using Scala version 2.12.10 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_181)
Type in expressions to have them evaluated.
import time
from collections import Counter

import pytest
from pyspark.accumulators import AccumulatorParam
from pyspark.sql.functions import col, pandas_udf, PandasUDFType

class UdfMetricAccumulatorParam(AccumulatorParam):
    def zero(self, value):
        # dict.update returns None, so the dict itself must be returned
        init_value = {}
        init_value.update(value)
        return init_value
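The dict-merging semantics such an accumulator needs — a `zero` plus an `addInPlace` that sums per-key counts — can be exercised without a SparkContext. The standalone class and its `addInPlace` below are assumptions for illustration; the original snippet only defines `zero`:

```python
from collections import Counter

class DictMetricParam:
    # Hypothetical standalone analog of UdfMetricAccumulatorParam
    def zero(self, value):
        return {}

    def addInPlace(self, v1, v2):
        # Sum counts per key, e.g. {"rows": 3} + {"rows": 2, "errs": 1}
        merged = Counter(v1)
        merged.update(v2)
        return dict(merged)

param = DictMetricParam()
acc = param.zero({})
for update in ({"rows": 3}, {"rows": 2, "errs": 1}):
    acc = param.addInPlace(acc, update)
print(acc)  # {'rows': 5, 'errs': 1}
```

With Spark, an instance of the real `AccumulatorParam` subclass would be passed to `sc.accumulator(...)` so per-task UDF metrics merge the same way on the driver.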
@pytest.hookimpl(hookwrapper=True)
def pytest_report_teststatus(report, config):
    outcome = yield
    res = outcome.get_result()
    attr_name = "___TIME___"
    if report.when == "setup":
        # HACK: store the start time in `config`
        setattr(config, attr_name, time.time())
    elif report.when == "call":
        # compute the elapsed time from the start time stored above
        elapsed = time.time() - getattr(config, attr_name)
# export SPARK_HOME=<YOUR_SPARK_V3_0>
$ git clone https://github.com/maropu/spark-tpcds-datagen.git
$ cd spark-tpcds-datagen
$ ./bin/datagen --master=local[*] --conf spark.driver.memory=8g --scale-factor 10 --output-location /tmp/tpcds-sf-10
scala> :paste
import org.apache.spark.sql.catalyst.catalog.CatalogColumnStat
import org.apache.spark.sql.execution.datasources.LogicalRelation
import org.apache.spark.sql.types.DataType
sql("SET spark.sql.cbo.enabled=true")
# https://qiita.com/9_ties/items/3bdb177384937ddc88df
# https://homes.cs.washington.edu/~pedrod/papers/mlj05.pdf
import pandas as pd
import numpy as np
from scipy.special import logsumexp
from itertools import product
const = ['A', 'B']
preds = [('Smokes', 1), ('Cancer', 1), ('Friends', 2)] # Predicate and arity
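Given these constants and predicates, the ground atoms of the MLN are every predicate applied to every tuple of constants of matching arity. A short sketch of that enumeration, reusing the variable names from the snippet above:

```python
from itertools import product

const = ['A', 'B']
preds = [('Smokes', 1), ('Cancer', 1), ('Friends', 2)]  # predicate name and arity

# Every predicate applied to every tuple of constants of its arity
ground_atoms = [
    (name,) + args
    for name, arity in preds
    for args in product(const, repeat=arity)
]
print(len(ground_atoms))  # 2 + 2 + 4 = 8 ground atoms
```

Each ground atom is a binary variable of the ground Markov network, so the network here has 2^8 possible worlds.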
///////// Invocation of Scala collection object methods /////////
---
scala> import scala.reflect.runtime.universe._
scala> val mapClazz = scala.collection.immutable.Map.getClass
mapClazz: Class[_ <: scala.collection.immutable.Map.type] = class scala.collection.immutable.Map$
scala> val mirror = runtimeMirror(mapClazz.getClassLoader)
mirror: reflect.runtime.universe.Mirror = JavaMirror with ...