How to simulate a Ray cluster on a single machine
Sometimes you need to test a multi-node Ray script in CI without standing up a real multi-node cluster on every run. In some cases, it's sufficient to simulate a multi-node cluster by starting multiple Ray processes on the same machine; Ray treats each process as if it were a separate node.
Here's how:
#!/usr/bin/env bash
# Enable local clusters on Windows and macOS
export RAY_ENABLE_WINDOWS_OR_OSX_CLUSTER=1
cleanup() {
  ray stop >/dev/null 2>&1 || true
  echo "Cluster stopped. Done."
}
trap cleanup EXIT
echo "Starting head node..."
ray start --head --port=6379 --ray-client-server-port=10001 \
--node-manager-port=63000 --object-manager-port=63001 \
--min-worker-port=30000 --max-worker-port=30099 --num-cpus=0 \
--temp-dir=/tmp/ray/head-node
echo "Head node started"
echo "Starting worker node A..."
ray start --address=127.0.0.1:6379 \
--node-manager-port=63010 --object-manager-port=63011 \
--min-worker-port=30100 --max-worker-port=30199 --num-cpus=1
echo "Worker node A started"
echo "Starting worker node B..."
ray start --address=127.0.0.1:6379 \
--node-manager-port=63020 --object-manager-port=63021 \
--min-worker-port=30200 --max-worker-port=30299 --num-cpus=1
echo "Worker node B started"
echo "Cluster started!!"
echo
echo "Testing the cluster..."
python - <<'PY'
import ray
ray.init(address="127.0.0.1:6379")
print(ray.nodes())
PY
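
The last step of the script only prints the raw output of `ray.nodes()`. In CI, you may prefer a check that fails when the cluster has the wrong shape. The following is a minimal sketch, not part of the original script: the helper names `summarize_nodes` and `check_cluster` are hypothetical, and it assumes each entry returned by `ray.nodes()` carries an `"Alive"` flag and a `"Resources"` dictionary, which you should confirm against your Ray version.

```python
def summarize_nodes(nodes):
    """Return (alive_count, total_cpus) for a list shaped like ray.nodes() output.

    Assumption: each node is a dict with an "Alive" flag and a "Resources" dict.
    """
    alive = [n for n in nodes if n.get("Alive")]
    # A node started with --num-cpus=0 may omit the "CPU" key, so default to 0.
    total_cpus = sum(n.get("Resources", {}).get("CPU", 0) for n in alive)
    return len(alive), total_cpus


def check_cluster(address="127.0.0.1:6379", expected_nodes=3, expected_cpus=2):
    """Connect to the running cluster and fail if its shape is unexpected."""
    import ray  # imported lazily so summarize_nodes works without Ray installed

    ray.init(address=address)
    try:
        alive, cpus = summarize_nodes(ray.nodes())
        assert alive == expected_nodes, f"expected {expected_nodes} nodes, got {alive}"
        assert cpus == expected_cpus, f"expected {expected_cpus} CPUs, got {cpus}"
    finally:
        ray.shutdown()
```

With the script above, calling `check_cluster()` in place of `print(ray.nodes())` should pass when the head node (0 CPUs) and both workers (1 CPU each) are alive.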
In this script, we start a "head node" on port 6379 and then connect two "worker nodes" to that head node. The trick is to give each process its own non-overlapping port ranges.
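
If you need more than two workers, you can generate the port assignments instead of writing them by hand. The sketch below is one possible way to do that. The function `worker_start_command` is hypothetical, and its numbering scheme simply extends the pattern used in the script above (63010/63011 and 30100-30199 for the first worker, 63020/63021 and 30200-30299 for the second). It uses only the `ray start` flags that already appear in the script and is meant for a handful of workers, not hundreds.

```python
def worker_start_command(index, head_address="127.0.0.1:6379", num_cpus=1):
    """Build a `ray start` command for worker `index` (0-based) with its own ports.

    Assumed scheme (mirrors the script above): each worker gets a block of
    10 ports for the node/object managers and a block of 100 worker ports.
    """
    node_manager_port = 63010 + 10 * index
    min_worker_port = 30100 + 100 * index
    return [
        "ray", "start",
        f"--address={head_address}",
        f"--node-manager-port={node_manager_port}",
        f"--object-manager-port={node_manager_port + 1}",
        f"--min-worker-port={min_worker_port}",
        f"--max-worker-port={min_worker_port + 99}",
        f"--num-cpus={num_cpus}",
    ]


if __name__ == "__main__":
    # Print the commands for three workers; run them after starting the head node.
    for i in range(3):
        print(" ".join(worker_start_command(i)))
```

Returning the command as a list makes it easy to pass directly to `subprocess.run` from a test fixture, or to print and paste into a shell script.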