Thiago Araujo thiau

thiau / mongo_spark_python.md
Last active September 24, 2019 18:53
Using MongoDB with Spark and Python
  1. Download Spark from Downloads | Apache Spark
  2. Install Python from Download Python | Python.org
  3. Run export SPARK_HOME=/opt/spark to set the Spark home folder
  4. Extract the contents of the Spark download into $SPARK_HOME
  5. Install PySpark: pip install pyspark
  6. Install the mongo-spark dependencies by launching PySpark with the connector package (the _2.11 suffix is the Scala version and should match your Spark build):
    • $SPARK_HOME/bin/pyspark --packages org.mongodb.spark:mongo-spark-connector_2.11:2.4.1
  7. Find the “Ivy Default Cache set to” line in the output and copy the files from that folder to $SPARK_HOME/jars
  8. Copy the following code: