- Get the Nebula cluster's metad ports
$ docker port nebula-docker-compose_metad0_1 | grep ^9559
9559/tcp -> 0.0.0.0:49189
$ docker port nebula-docker-compose_metad1_1 | grep ^9559
9559/tcp -> 0.0.0.0:49190
$ docker port nebula-docker-compose_metad2_1 | grep ^9559
9559/tcp -> 0.0.0.0:49188
Here, the three metad ports are 49188, 49189, and 49190.
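These host ports are assigned dynamically, so re-check them whenever the containers are recreated. If you would rather see all of the mappings at once, something like the following should also work (assuming the default container names from nebula-docker-compose):
docker ps --filter name=metad --format "table {{.Names}}\t{{.Ports}}"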
- Start a Spark container as the Exchange runtime
docker run --name spark-master --network nebula-docker-compose_nebula-net \
-h spark-master -e ENABLE_INIT_DAEMON=false -d \
bde2020/spark-master:2.4.5-hadoop2.7
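Before moving on, you can optionally confirm the container is up, for example:
docker ps --filter name=spark-master --format "table {{.Names}}\t{{.Status}}"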
- Enter the Spark container and download the Exchange jar package (note: you can run exit to leave the container later)
docker exec -it spark-master bash
cd /root/
wget https://repo1.maven.org/maven2/com/vesoft/nebula-exchange/2.6.0/nebula-exchange-2.6.0.jar
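Maven Central also publishes a .sha1 checksum next to each artifact, so you can optionally sanity-check the download (assuming sha1sum is available in the container):
wget https://repo1.maven.org/maven2/com/vesoft/nebula-exchange/2.6.0/nebula-exchange-2.6.0.jar.sha1
cat nebula-exchange-2.6.0.jar.sha1
sha1sum nebula-exchange-2.6.0.jar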
- Create a fake data file named player.json
vi player.json
{"id":"player100","age":42,"name":"Tim Duncan"}
{"id":"player101","age":36,"name":"Tony Parker"}
{"id":"player102","age":33,"name":"LaMarcus Aldridge"}
{"id":"player103","age":32,"name":"Rudy Gay"}
- Create a Nebula Exchange config file; refer to https://docs.nebula-graph.io/2.6.1/nebula-exchange/use-exchange/ex-ug-import-from-json/
- graph: put at least one graphd endpoint; it should be your machine's IP (not 127.0.0.1) with port 9669
- meta: refer to the first step and put the metad ports here
- space: the graph space you would like to import into
- tags: the files to be imported as vertices
- edges: the files to be imported as edges
Here is an example:
vi json-exchange.conf
{
  # Spark related config
  spark: {
    app: {
      name: Nebula Exchange 2.6
    }
    master: local
    driver: {
      cores: 1
      maxResultSize: 1G
    }
    executor: {
      memory: 1G
    }
    cores: {
      max: 16
    }
  }

  # Nebula Graph related config
  nebula: {
    address: {
      graph: ["192.168.8.127:9669"]
      meta: ["192.168.8.127:49190","192.168.8.127:49189","192.168.8.127:49188"]
    }
    user: root
    pswd: nebula
    space: basketballplayer

    # Nebula client connection parameters
    connection {
      # socket connect & execute timeout, unit: millisecond
      timeout: 30000
    }

    error: {
      # max number of failures; if the number of failures exceeds max, the application exits
      max: 32
      # failed import jobs are recorded in the output path
      output: /tmp/errors
    }

    # use Google's RateLimiter to limit the requests sent to Nebula Graph
    rate: {
      # the stable throughput of RateLimiter
      limit: 1024
      # time to wait when acquiring a permit from RateLimiter, unit: millisecond
      # if a permit cannot be obtained within the timeout, the request is given up
      timeout: 1000
    }
  }

  # Processing tags
  # There are tag config examples for different data sources.
  tags: [
    # file json
    {
      name: player
      type: {
        source: json
        sink: client
      }
      path: "file:///root/player.json"
      fields: [age, name]
      nebula.fields: [age, name]
      vertex: {
        field: id
      }
      batch: 256
      partition: 32
    }
  ]
}
- Create the schema for the data; an example is here: https://docs.nebula-graph.io/2.6.1/nebula-exchange/use-exchange/ex-ug-import-from-json/#step_1_create_the_schema_in_nebula_graph, and a minimal sketch is shown below
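Here is a minimal sketch matching the player.json data above, run from nebula-console against your graphd (the linked page also creates team, follow, and serve, which this walkthrough does not need):
CREATE SPACE basketballplayer(partition_num=10, replica_factor=1, vid_type=fixed_string(30));
# wait a couple of heartbeat cycles (about 20 seconds) for the new space to become available
USE basketballplayer;
CREATE TAG player(name string, age int);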
- Run the Exchange application inside the Spark container
docker exec -it spark-master bash
cd /root/
/spark/bin/spark-submit --master local \
--class com.vesoft.nebula.exchange.Exchange nebula-exchange-2.6.0.jar \
-c json-exchange.conf
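After the spark-submit job finishes, you can verify the import from nebula-console (connect to your graphd with the user/password from the config above), for example:
USE basketballplayer;
FETCH PROP ON player "player100";
SUBMIT JOB STATS;
SHOW STATS;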