Roman Zykov rzykov

rzykov / XgBoostRankSparkScala.scala
Last active March 31, 2022 22:43
XGBoost Spark - ranking problem
import _root_.ml.dmlc.xgboost4j.scala.spark.XGBoost
import org.apache.spark.ml.feature.LabeledPoint
import org.apache.spark.rdd.RDD // needed for the RDD[...] types used below
// Feature and Relevance are types defined outside this excerpt.
def encodeFeaturesToLabeledPoint(features: RDD[Feature], relevance: Option[RDD[Relevance]], workers: Int)
                                (implicit parallel: Int): (RDD[LabeledPoint], Seq[String], Seq[Seq[Int]]) = {
  val missingValue = Double.NaN
  val names = features
    .map { _.name }