little jack 777ki

🎯
Focusing
@sjs7007
sjs7007 / newTrainer.py
Created September 22, 2016 00:02
Distributed TensorFlow example: between-graph replication with asynchronous updates. Credit to ischlag.github.io for the script.
'''
Distributed TensorFlow example using data parallelism with shared model parameters.
Trains a simple sigmoid neural network on MNIST for 20 epochs on three machines using one parameter server.
Replace the hardcoded host URLs below with your own hosts.
Run like this:
pc-01$ python example.py --job-name="ps" --task_index=0
pc-02$ python example.py --job-name="worker" --task_index=0
pc-03$ python example.py --job-name="worker" --task_index=1
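The parameter-server layout described in this preview (one "ps" task holding the shared parameters, several "worker" tasks pushing updates without waiting for each other) can be sketched without TensorFlow. The following is only an illustration under stated assumptions: threads stand in for machines, a single scalar weight stands in for the network, and the names `ParameterServer`, `worker`, and `train` are hypothetical and not taken from the gist.

```python
import random
import threading


class ParameterServer:
    """Hypothetical stand-in for the "ps" job: owns the shared weight."""

    def __init__(self, w=0.0):
        self.w = w
        self.updates = 0
        self._lock = threading.Lock()

    def pull(self):
        # Workers read the current value of the shared parameter.
        with self._lock:
            return self.w

    def push(self, grad, lr):
        # Workers apply a gradient step; no coordination between workers.
        with self._lock:
            self.w -= lr * grad
            self.updates += 1


def worker(ps, shard, epochs, lr):
    """Hypothetical stand-in for a "worker" job training on its own data shard."""
    for _ in range(epochs):
        for x, y in shard:
            w = ps.pull()  # may already be stale when the gradient is pushed
            grad = 2.0 * (w * x - y) * x  # d/dw of the squared error (w*x - y)^2
            ps.push(grad, lr)


def train(num_workers=2, epochs=20, lr=0.05, seed=0):
    # Toy data for y = 3x, assumed here purely for illustration.
    rng = random.Random(seed)
    data = [(x, 3.0 * x) for x in (rng.uniform(-1.0, 1.0) for _ in range(200))]
    # Data parallelism: each worker gets a disjoint slice of the data.
    shards = [data[i::num_workers] for i in range(num_workers)]
    ps = ParameterServer()
    threads = [
        threading.Thread(target=worker, args=(ps, shard, epochs, lr))
        for shard in shards
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return ps


if __name__ == "__main__":
    ps = train()
    print("learned w = %.3f after %d updates" % (ps.w, ps.updates))
```

Each worker reads the weight, computes a gradient on its own shard, and pushes it back independently, which is the asynchronous part; the lock only protects the individual read and write, not the read-compute-write sequence.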
@yaroslavvb
yaroslavvb / simple_barrier.py
Created December 16, 2016 06:03
Example of using shared counters to implement a barrier primitive
"""Example of barrier implementation using TensorFlow shared variables.
All workers synchronize on barrier, copy global parameters to local versions
and increment global parameter variable asynchronously. Should see something
like this:
bash> killall python
bash> python simple_barrier.py --num_workers=4
worker 0, local_param 4 global_param 5
worker 2, local_param 4 global_param 7
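The idea of building a barrier from a shared counter can be sketched with the Python standard library. This is a sketch under assumptions, not the gist's implementation: threads replace TensorFlow workers, a `threading.Condition` replaces shared variables, and `CounterBarrier` and `run` are hypothetical names. Unlike the description above, it adds a second barrier per round so that every worker copies the same value before anyone increments, which makes the result deterministic.

```python
import threading


class CounterBarrier:
    """Reusable barrier built from a shared arrival counter (hypothetical helper)."""

    def __init__(self, num_workers):
        self.num_workers = num_workers
        self._count = 0        # arrivals in the current round
        self._generation = 0   # bumped each time the barrier opens
        self._cond = threading.Condition()

    def wait(self):
        with self._cond:
            generation = self._generation
            self._count += 1
            if self._count == self.num_workers:
                # Last arrival resets the counter and releases everyone.
                self._count = 0
                self._generation += 1
                self._cond.notify_all()
            else:
                # Loop guards against spurious wakeups.
                while generation == self._generation:
                    self._cond.wait()


def run(num_workers=4, rounds=3):
    barrier = CounterBarrier(num_workers)
    global_param = [0]  # one-element list so worker threads can mutate it
    lock = threading.Lock()
    seen = [[] for _ in range(num_workers)]

    def worker(index):
        for _ in range(rounds):
            barrier.wait()                  # all increments of the previous round are done
            local_param = global_param[0]   # copy global parameter to a local version
            seen[index].append(local_param)
            barrier.wait()                  # everyone has copied before anyone increments
            with lock:
                global_param[0] += 1

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return seen, global_param[0]


if __name__ == "__main__":
    seen, final_value = run()
    for index, values in enumerate(seen):
        print("worker %d, local_param history %s" % (index, values))
    print("final global_param %d" % final_value)
```

With a single barrier per round, as in the description above, the copied and incremented values can differ between workers depending on scheduling; the second barrier trades that asynchrony for reproducible values.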
@manuzhang
manuzhang / simple_barrier.py
Last active March 27, 2020 07:12 — forked from yaroslavvb/simple_barrier.py
TensorFlow in-graph replication example
"""
This example is adapted from https://gist.github.com/yaroslavvb/ef407a599f0f549f62d91c3a00dcfb6c
Example of barrier implementation using TensorFlow shared variables.
All workers synchronize on barrier, copy global parameters to local versions
and increment global parameter variable asynchronously. Should see something
like this:
python simple_barrier.py --wk "node13-1:21393,node13-1:21395"
Creating session