- Get the Nebula cluster's metad ports
$ docker port nebula-docker-compose_metad0_1 | grep ^9559
9559/tcp -> 0.0.0.0:49189
$ docker port nebula-docker-compose_metad1_1 | grep ^9559
9559/tcp -> 0.0.0.0:49190
$ docker port nebula-docker-compose_metad2_1 | grep ^9559
9559/tcp -> 0.0.0.0:49188
Here, the three metad ports are 49188, 49189, and 49190.
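These host ports are assigned dynamically, so re-check them whenever the containers are recreated. If you would rather see all of the mappings at once, something like the following should also work (assuming the default container names from nebula-docker-compose):
docker ps --filter name=metad --format "table {{.Names}}\t{{.Ports}}"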
- Start a Spark container as the Exchange runtime
docker run --name spark-master --network nebula-docker-compose_nebula-net \
-h spark-master -e ENABLE_INIT_DAEMON=false -d \
bde2020/spark-master:2.4.5-hadoop2.7
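Before moving on, you can optionally confirm the container is up, for example:
docker ps --filter name=spark-master --format "table {{.Names}}\t{{.Status}}"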
- Enter the Spark container and download the Exchange jar package (note: you can run exit to leave the container later)
docker exec -it spark-master bash
cd /root/
wget https://repo1.maven.org/maven2/com/vesoft/nebula-exchange/2.6.0/nebula-exchange-2.6.0.jar
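Maven Central also publishes a .sha1 checksum next to each artifact, so you can optionally sanity-check the download (assuming sha1sum is available in the container):
wget https://repo1.maven.org/maven2/com/vesoft/nebula-exchange/2.6.0/nebula-exchange-2.6.0.jar.sha1
cat nebula-exchange-2.6.0.jar.sha1
sha1sum nebula-exchange-2.6.0.jar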
- Create a fake data file named player.json
vi player.json
{"id":"player100","age":42,"name":"Tim Duncan"}
{"id":"player101","age":36,"name":"Tony Parker"}
{"id":"player102","age":33,"name":"LaMarcus Aldridge"}
{"id":"player103","age":32,"name":"Rudy Gay"}
- Create a Nebula Exchange config file; refer to https://docs.nebula-graph.io/2.6.1/nebula-exchange/use-exchange/ex-ug-import-from-json/
- graph: put at least one graphd endpoint; it should be your machine's IP (not 127.0.0.1) with port 9669
- meta: refer to the first step and put the metad ports here
- space: the graph space you would like to import into
- tags: the files to be imported as vertices
- edges: the files to be imported as edges
Here is an example:
vi json-exchange.conf
{
  # Spark related config
  spark: {
    app: {
      name: Nebula Exchange 2.6
    }
    master: local
    driver: {
      cores: 1
      maxResultSize: 1G
    }
    executor: {
      memory: 1G
    }
    cores: {
      max: 16
    }
  }

  # Nebula Graph related config
  nebula: {
    address: {
      graph: ["192.168.8.127:9669"]
      meta: ["192.168.8.127:49190","192.168.8.127:49189","192.168.8.127:49188"]
    }
    user: root
    pswd: nebula
    space: basketballplayer

    # Nebula client connection parameters
    connection {
      # socket connect & execute timeout, unit: millisecond
      timeout: 30000
    }

    error: {
      # max number of failures; if the number of failures exceeds max, the application exits
      max: 32
      # failed import jobs are recorded in the output path
      output: /tmp/errors
    }

    # use Google's RateLimiter to limit the requests sent to Nebula Graph
    rate: {
      # the stable throughput of RateLimiter
      limit: 1024
      # time to wait when acquiring a permit from RateLimiter, unit: millisecond
      # if a permit cannot be obtained within the timeout, the request is given up
      timeout: 1000
    }
  }

  # Processing tags
  # There are tag config examples for different data sources.
  tags: [
    # file json
    {
      name: player
      type: {
        source: json
        sink: client
      }
      path: "file:///root/player.json"
      fields: [age, name]
      nebula.fields: [age, name]
      vertex: {
        field: id
      }
      batch: 256
      partition: 32
    }
  ]
}
- Create the schema for the data; an example is here: https://docs.nebula-graph.io/2.6.1/nebula-exchange/use-exchange/ex-ug-import-from-json/#step_1_create_the_schema_in_nebula_graph, and a minimal sketch is shown below
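Here is a minimal sketch matching the player.json data above, run from nebula-console against your graphd (the linked page also creates team, follow, and serve, which this walkthrough does not need):
CREATE SPACE basketballplayer(partition_num=10, replica_factor=1, vid_type=fixed_string(30));
# wait a couple of heartbeat cycles (about 20 seconds) for the new space to become available
USE basketballplayer;
CREATE TAG player(name string, age int);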
- Run the Exchange application inside the Spark container
docker exec -it spark-master bash
cd /root/
/spark/bin/spark-submit --master local \
--class com.vesoft.nebula.exchange.Exchange nebula-exchange-2.6.0.jar \
-c json-exchange.conf
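After the spark-submit job finishes, you can verify the import from nebula-console (connect to your graphd with the user/password from the config above), for example:
USE basketballplayer;
FETCH PROP ON player "player100";
SUBMIT JOB STATS;
SHOW STATS;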