Install poetry to manage dependencies (langchain, jupyter)
Requires pipx
Considered Keras and PyTorch to immerse myself in the Python mindset and learn something brand new. Deep Learning seemed challenging enough 😉
Looked at the GitHub repos and found that Keras is 99.9% Python (with 0.1% Shell), while PyTorch is just 48.4% Python, with the rest spread across many other languages.
I've also got the O'Reilly book about Keras by Aurélien Géron.
Guided by Development Guidelines.
$ brew install --cask miniconda
https://www.anaconda.com/blog/a-faster-conda-for-a-growing-community
conda install -c conda-forge --solver=libmamba ...
The following is a list of Hadoop properties that let Spark use HDFS more effectively.
Spark properties prefixed with spark.hadoop. are used to configure the Hadoop Configuration that Spark broadcasts to tasks. Use spark.sparkContext.hadoopConfiguration to review the properties.
spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version = 2
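A minimal sketch of how the prefix works, assuming a standalone application started in a fresh JVM with Spark on the classpath. The local master and the application name are made up for illustration. Spark copies every spark.hadoop.-prefixed property into the Hadoop Configuration with the prefix stripped, so the property should be readable under its plain Hadoop name.

```scala
import org.apache.spark.sql.SparkSession

// Assumption: a throwaway local session; the app name is hypothetical
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("hadoop-props-demo")
  .config("spark.hadoop.mapreduce.fileoutputcommitter.algorithm.version", "2")
  .getOrCreate()

// The spark.hadoop. prefix is dropped when the property is copied
// into the Hadoop Configuration
val version = spark.sparkContext.hadoopConfiguration
  .get("mapreduce.fileoutputcommitter.algorithm.version")
println(version)
```

In spark-shell a session already exists, so getOrCreate returns it and the builder setting does not reach the Hadoop Configuration. There the property would have to be passed at startup instead, e.g. with --conf.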
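import java.nio.charset.StandardCharsets
import java.util
import org.apache.kafka.common.serialization.Serializer

case class Person(id: Long, name: String)
class PersonSerializer extends Serializer[Person] {
  override def configure(configs: util.Map[String, _], isKey: Boolean): Unit = {}
  // Encode a Person as the UTF-8 bytes of "id,name"
  override def serialize(topic: String, data: Person): Array[Byte] = {
    println(s">>> serialize($topic, $data)")
    s"${data.id},${data.name}".getBytes(StandardCharsets.UTF_8)
  }
  override def close(): Unit = {}
}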
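To read the records back, a matching deserializer could look like the sketch below. PersonDeserializer is a hypothetical counterpart that is not part of these notes. It assumes the "id,name" text layout produced by the serializer and UTF-8 encoding. Person is repeated only to keep the snippet self-contained.

```scala
import java.nio.charset.StandardCharsets
import java.util
import org.apache.kafka.common.serialization.Deserializer

case class Person(id: Long, name: String)

// Hypothetical counterpart to PersonSerializer (an assumption, not from the notes)
class PersonDeserializer extends Deserializer[Person] {
  override def configure(configs: util.Map[String, _], isKey: Boolean): Unit = {}
  override def deserialize(topic: String, data: Array[Byte]): Person =
    if (data == null) null // Kafka passes null for records without a value
    else {
      // Split on the first comma only so names that contain commas survive
      val Array(id, name) = new String(data, StandardCharsets.UTF_8).split(",", 2)
      Person(id.toLong, name)
    }
  override def close(): Unit = {}
}

// "people" is a made-up topic name
val person = new PersonDeserializer()
  .deserialize("people", "1,Jacek".getBytes(StandardCharsets.UTF_8))
```

Splitting with a limit of 2 is a design choice for this layout: only the id is guaranteed to be comma-free, so everything after the first comma is treated as the name.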
// Let's create a sample dataset with just a single line, i.e. a Facebook profile
val facebookProfile = "ActivitiesDescription:703 likes, 0 talking about this, 4 were here; Category:; Email:joe@pvhvac.com; Hours:Mon-Fri: 8:00 am - 5:00 pm; Likes:703; Link:https://www.facebook.com/pvhvac; Location:165 W Wieuca Rd NE, Ste 310, Atlanta, Georgia; Name:PV Heating & Air; NumberOfPictures:0; NumberOfReviews:26; Phone:(404) 798-9672; ShortDescription:We specialize in residential a/c, heating, indoor air quality & home performance.; Url:http://www.pvhvac.com; Visitors:4"
val fbs = Seq(facebookProfile).toDF("profile")
scala> fbs.show(truncate = false)
+----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------