@wey-gu
Last active November 23, 2021 10:52
  • Get the Nebula cluster's metad ports
$ docker port nebula-docker-compose_metad0_1 | grep ^9559
9559/tcp -> 0.0.0.0:49189
$ docker port nebula-docker-compose_metad1_1 | grep ^9559
9559/tcp -> 0.0.0.0:49190
$ docker port nebula-docker-compose_metad2_1 | grep ^9559
9559/tcp -> 0.0.0.0:49188

Here the three metad ports are 49188, 49189, and 49190.
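The host port is the part after the last colon in each mapping line. A small sketch for pulling it out programmatically (the loop at the bottom is commented out; it assumes docker is available and the compose container names shown above):

```shell
# Extract the host port from a `docker port` mapping line,
# e.g. "9559/tcp -> 0.0.0.0:49189" -> 49189
get_host_port() {
  echo "${1##*:}"   # strip everything up to the last ':'
}

get_host_port "9559/tcp -> 0.0.0.0:49189"   # prints 49189

# With docker available, collect all three at once:
# for i in 0 1 2; do
#   docker port "nebula-docker-compose_metad${i}_1" | grep '^9559' \
#     | awk -F: '{print $NF}'
# done
```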

  • Start a Spark container as the Exchange runtime
docker run --name spark-master --network nebula-docker-compose_nebula-net \
    -h spark-master -e ENABLE_INIT_DAEMON=false -d \
    bde2020/spark-master:2.4.5-hadoop2.7
  • Enter the Spark container and download the Exchange jar package

Note: type exit to leave the container when you are done.

docker exec -it spark-master bash
cd /root/
wget https://repo1.maven.org/maven2/com/vesoft/nebula-exchange/2.6.0/nebula-exchange-2.6.0.jar

vi player.json

{"id":"player100","age":42,"name":"Tim Duncan"}
{"id":"player101","age":36,"name":"Tony Parker"}
{"id":"player102","age":33,"name":"LaMarcus Aldridge"}
{"id":"player103","age":32,"name":"Rudy Gay"}
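Alternatively, the same file can be created without an editor via a heredoc (a sketch; the four records match the lines above, one JSON object per line):

```shell
# Write the sample data as line-delimited JSON, one object per line,
# matching the format shown above.
cat > player.json <<'EOF'
{"id":"player100","age":42,"name":"Tim Duncan"}
{"id":"player101","age":36,"name":"Tony Parker"}
{"id":"player102","age":33,"name":"LaMarcus Aldridge"}
{"id":"player103","age":32,"name":"Rudy Gay"}
EOF

wc -l player.json   # expect 4 records
```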
  • Create a Nebula Exchange config file; refer to https://docs.nebula-graph.io/2.6.1/nebula-exchange/use-exchange/ex-ug-import-from-json/
    • graph: list at least one graphd endpoint; use your machine's IP (not 127.0.0.1) with port 9669
    • meta: the metad ports obtained in the first step
    • space: the graph space to import into
    • tags: the files to be imported as vertices
    • edges: the files to be imported as edges

This is an example:

vi json-exchange.conf

{
  # Spark relation config
  spark: {
    app: {
      name: "Nebula Exchange 2.6"
    }
    master: local
    driver: {
      cores: 1
      maxResultSize: 1G
    }
    executor: {
      memory: 1G
    }
    cores:{
      max: 16
    }
  }

  # Nebula Graph relation config
  nebula: {
    address:{
      graph:["192.168.8.127:9669"]
      meta:["192.168.8.127:49190","192.168.8.127:49189","192.168.8.127:49188"]
    }
    user: root
    pswd: nebula
    space: basketballplayer

    # nebula client connection parameters
    connection {
      # socket connect & execute timeout, unit: millisecond
      timeout: 30000
    }

    error: {
      # max number of failures, if the number of failures is bigger than max, then exit the application.
      max: 32
      # failed import job will be recorded in output path
      output: /tmp/errors
    }

    # use google's RateLimiter to limit the requests send to NebulaGraph
    rate: {
      # the stable throughput of RateLimiter
      limit: 1024
      # Acquires a permit from RateLimiter, unit: MILLISECONDS
      # if it can't be obtained within the specified timeout, then give up the request.
      timeout: 1000
    }
  }

  # Processing tags
  # There are tag config examples for different dataSources.
  tags: [

    # file json
    {
      name: player
      type: {
        source: json
        sink: client
      }
      path: "file:///root/player.json"
      fields: [age,name]
      nebula.fields: [age, name]
      vertex: {
        field:id
      }
      batch: 256
      partition: 32
    }

  ]
}
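The addresses in the example above are specific to one machine. Before submitting, replace 192.168.8.127 with your own host IP, and make sure the meta ports match your output from the first step. A hedged sketch of the substitution, assuming a Linux host (hostname -I; on macOS use ipconfig getifaddr en0 instead):

```shell
# Swap the example IP in the config for this machine's IP.
MY_IP=$(hostname -I 2>/dev/null | awk '{print $1}')
: "${MY_IP:=192.168.8.127}"   # keep the example value if detection fails
if [ -f json-exchange.conf ]; then
  sed -i "s/192\.168\.8\.127/${MY_IP}/g" json-exchange.conf
fi
echo "using host IP: ${MY_IP}"
```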
  • Submit the import job from inside the spark-master container

docker exec -it spark-master bash
cd /root/
/spark/bin/spark-submit --master local \
   --class com.vesoft.nebula.exchange.Exchange nebula-exchange-2.6.0.jar \
   -c json-exchange.conf