@chenchun
Last active September 11, 2018 07:45
# bgp


Getting started

docker run -d --name=b1 --privileged chenchun/gobgp
docker run -d --name=b2 --privileged chenchun/gobgp

docker exec -it b1 bash
cat <<EOF >conf
[global.config]
  as = 1
  router-id = "172.17.0.2"
EOF

cat <<EOF >start.sh
kill -9 \$(ps aux | grep gobgpd | grep -v grep | awk '{print \$2}')
gobgpd -f conf >> log &
EOF
chmod +x start.sh
./start.sh
exit

docker exec -it b2 bash
cat <<EOF >conf
[global.config]
  as = 2
  router-id = "172.17.0.3"
EOF

cat <<EOF >start.sh
kill -9 \$(ps aux | grep gobgpd | grep -v grep | awk '{print \$2}')
gobgpd -f conf >> log &
EOF
chmod +x start.sh
./start.sh
exit

docker exec b1 gobgp neighbor add 172.17.0.3 as 2
docker exec b2 gobgp neighbor add 172.17.0.2 as 1

docker exec b1 gobgp neigh
root@e126c8afbf38:~# gobgp neigh
Peer       AS  Up/Down State       |#Received  Accepted
172.17.0.3  2 00:00:24 Establ      |        0         0

Add routes to the RIB

root@e126c8afbf38:~# gobgp global rib
Network not in table

This is because you need to add routes to it or enable the zebra feature; see osrg/gobgp#1493

root@e126c8afbf38:~# gobgp global rib add 10.33.0.0/16 -a ipv4
root@e126c8afbf38:~# 
root@e126c8afbf38:~# gobgp global rib 
   Network              Next Hop             AS_PATH              Age        Attrs
*> 10.33.0.0/16         0.0.0.0                                   00:00:03   [{Origin: ?}]

Enable zebra

cat <<EOF >conf
[global.config]
  as = 2
  router-id = "172.17.0.3"

[zebra]
    [zebra.config]
        enabled = true
        url = "unix:/var/run/quagga/zserv.api"
        redistribute-route-type-list = ["connect"]
        version = 2
EOF
root@99ad08ff8f30:~# gobgp global rib 
   Network              Next Hop             AS_PATH              Age        Attrs
*> 172.17.0.0/16        0.0.0.0                                   00:00:21   [{Origin: i} {Med: 0}]
*  172.17.0.0/16        172.17.0.2           1                    00:00:08   [{Origin: i} {Med: 0}]

Peer group

With a peer group, you can apply the same configuration to multiple peers. https://github.com/osrg/gobgp/blob/master/docs/sources/peer-group.md

# c1 configuration
[global.config]                                    
  as = 1                                           
  router-id = "172.17.0.2"                         

[zebra]                                            
    [zebra.config]                                 
        enabled = true                             
        url = "unix:/var/run/quagga/zserv.api"     
        redistribute-route-type-list = ["connect"] 
        version = 2                                

[[peer-groups]]                                    
  [peer-groups.config]                             
    peer-group-name = "sample-group"               
    peer-as = 2                                    
  [[peer-groups.afi-safis]]                        
    [peer-groups.afi-safis.config]                 
      afi-safi-name = "ipv4-unicast"               
  [[peer-groups.afi-safis]]                        
    [peer-groups.afi-safis.config]                 
      afi-safi-name = "ipv4-flowspec"              

[[neighbors]]                                      
  [neighbors.config]                               
    neighbor-address = "172.17.0.3"                
    peer-group = "sample-group"                    
  [neighbors.timers.config]                        
    hold-time = 99
    
# c2 configuration
[global.config]                                    
  as = 2                                           
  router-id = "172.17.0.3"                         

[zebra]                                            
    [zebra.config]                                 
        enabled = true                             
        url = "unix:/var/run/quagga/zserv.api"     
        redistribute-route-type-list = ["connect"] 
        version = 2                                

[[neighbors]]                                      
  [neighbors.config]                               
    neighbor-address = "172.17.0.2"

Dynamic Neighbor

Dynamic Neighbor enables GoBGP to accept connections from peers within a specific prefix. Note that GoBGP acts in passive mode toward dynamic neighbors, so if both peers treat each other as dynamic neighbors, the connection will never be established.

# r1

[global.config]
  as = 1
  router-id = "172.17.0.2"

[[peer-groups]]
  [peer-groups.config]
    peer-group-name = "sample-group"
    peer-as = 2
  [[peer-groups.afi-safis]]
    [peer-groups.afi-safis.config]
      afi-safi-name = "ipv4-unicast"
  [[peer-groups.afi-safis]]
    [peer-groups.afi-safis.config]
      afi-safi-name = "ipv4-flowspec"

[[dynamic-neighbors]]
  [dynamic-neighbors.config]
    prefix = "172.17.0.0/16"
    peer-group = "sample-group"

# r2
[global.config]                                    
  as = 2                                           
  router-id = "172.17.0.3"                         

[[neighbors]]                                      
  [neighbors.config]                               
    neighbor-address = "172.17.0.2"

Equal Cost Multipath Routing with Zebra

How GoBGP handles Equal Cost Multipath (ECMP) routes with the Zebra daemon included in Quagga.

docker run -d --name=r1 --privileged --hostname=r1 chenchun/gobgp #172.17.0.2/16
docker run -d --name=r2 --privileged --hostname=r2 chenchun/gobgp #172.17.0.3/16
docker run -d --name=r3 --privileged --hostname=r3 chenchun/gobgp #172.17.0.4/16

docker network create -d bridge bridge2
docker network ls
NETWORK ID          NAME                DRIVER              SCOPE
6f9ccd10f670        bridge              bridge              local
94136127c737        bridge2             bridge              local
f277d3e5024a        host                host                local
9ca05ce2e370        none                null                local

docker network connect bridge2 r1
docker network connect bridge2 r3
docker network disconnect bridge r3

R1: GoBGP + Zebra
R2: GoBGP
R3: GoBGP

    +-------------+                     +-------------+
    | R1          | .2/16         .3/16 | R2          |
    | ID: 1.1.1.1 |---------------------| ID: 2.2.2.2 |
    | AS: 65000   |   172.17.0.0/16     | AS: 65000   |
    +-------------+                     +-------------+
        | .2/16
        |
        | 172.18.0.0/16
        |
        | .3/16
    +-------------+
    | R3          |
    | ID: 3.3.3.3 |
    | AS: 65000   |
    +-------------+

r1-r3 conf

# gobgpd.toml on R1                                

[global.config]                                    
  as = 65000                                       
  router-id = "1.1.1.1"                            

[global.use-multiple-paths.config]                 
  enabled = true                                   

[[neighbors]]                                      
  [neighbors.config]                               
    neighbor-address = "172.17.0.3"                
    peer-as = 65000                                
  [[neighbors.afi-safis]]                          
    [neighbors.afi-safis.config]                   
      afi-safi-name = "ipv4-unicast"               
  [[neighbors.afi-safis]]                          
    [neighbors.afi-safis.config]                   
      afi-safi-name = "ipv6-unicast"               

[[neighbors]]                                      
  [neighbors.config]                               
    neighbor-address = "172.18.0.3"                
    peer-as = 65000                                
  [[neighbors.afi-safis]]                          
    [neighbors.afi-safis.config]                   
      afi-safi-name = "ipv4-unicast"               
  [[neighbors.afi-safis]]                          
    [neighbors.afi-safis.config]                   
      afi-safi-name = "ipv6-unicast"               

[zebra.config]                                     
  enabled = true                                   
  url = "unix:/var/run/quagga/zserv.api"           
  redistribute-route-type-list = ["connect"]       
  version = 2

# gobgpd.toml on R2

[global.config]                                    
  as = 65000                                       
  router-id = "2.2.2.2"                            

[[neighbors]]                                      
  [neighbors.config]                               
    neighbor-address = "172.17.0.2"                
    peer-as = 65000                                
  [[neighbors.afi-safis]]                          
    [neighbors.afi-safis.config]                   
      afi-safi-name = "ipv4-unicast"               
  [[neighbors.afi-safis]]                          
    [neighbors.afi-safis.config]                   
      afi-safi-name = "ipv6-unicast"

# gobgpd.toml on R3                                

[global.config]                                    
  as = 65000                                       
  router-id = "3.3.3.3"                            

[[neighbors]]                                      
  [neighbors.config]                               
    neighbor-address = "172.18.0.2"                
    peer-as = 65000                                
  [[neighbors.afi-safis]]                          
    [neighbors.afi-safis.config]                   
      afi-safi-name = "ipv4-unicast"               
  [[neighbors.afi-safis]]                          
    [neighbors.afi-safis.config]                   
      afi-safi-name = "ipv6-unicast"

Start gobgpd on each router, wait for the sessions to reach Established, then check the RIB and the Linux routing table:

root@r1:~# gobgp global rib -a ipv4
   Network              Next Hop             AS_PATH              Age        Attrs
*> 172.17.0.0/16        0.0.0.0                                   00:01:32   [{Origin: i} {Med: 0}]
*> 172.18.0.0/16        0.0.0.0                                   00:01:32   [{Origin: i} {Med: 0}]
root@r1:~# ip route
default via 172.17.0.1 dev eth0 
172.17.0.0/16 dev eth0  proto kernel  scope link  src 172.17.0.2 
172.18.0.0/16 dev eth1  proto kernel  scope link  src 172.18.0.2 

root@r2:~# gobgp global rib -a ipv4
   Network              Next Hop             AS_PATH              Age        Attrs
*> 172.17.0.0/16        172.17.0.2                                00:01:21   [{Origin: i} {Med: 0} {LocalPref: 100}]
*> 172.18.0.0/16        172.17.0.2                                00:01:21   [{Origin: i} {Med: 0} {LocalPref: 100}]
root@r2:~# ip route
default via 172.17.0.1 dev eth0 
172.17.0.0/16 dev eth0  proto kernel  scope link  src 172.17.0.3 

root@r3:~# gobgp global rib -a ipv4
   Network              Next Hop             AS_PATH              Age        Attrs
*> 172.17.0.0/16        172.18.0.2                                00:01:21   [{Origin: i} {Med: 0} {LocalPref: 100}]
*> 172.18.0.0/16        172.18.0.2                                00:01:21   [{Origin: i} {Med: 0} {LocalPref: 100}]
root@r3:~# ip route
default via 172.18.0.1 dev eth1 
172.18.0.0/16 dev eth1  proto kernel  scope link  src 172.18.0.3 

Add a route on r2, then check the RIB and the Linux routing table:

root@r2:~# gobgp global rib -a ipv4 add 10.23.1.0/24

root@r1:~# gobgp global rib 
   Network              Next Hop             AS_PATH              Age        Attrs
*> 10.23.1.0/24         172.17.0.3                                00:01:14   [{Origin: ?} {LocalPref: 100}]
*> 172.17.0.0/16        0.0.0.0                                   00:05:49   [{Origin: i} {Med: 0}]
*> 172.18.0.0/16        0.0.0.0                                   00:05:49   [{Origin: i} {Med: 0}]
root@r1:~# ip r
default via 172.17.0.1 dev eth0 
10.23.1.0/24 via 172.17.0.3 dev eth0  proto zebra 
172.17.0.0/16 dev eth0  proto kernel  scope link  src 172.17.0.2 
172.18.0.0/16 dev eth1  proto kernel  scope link  src 172.18.0.2 

root@r2:~# gobgp global rib 
   Network              Next Hop             AS_PATH              Age        Attrs
*> 10.23.1.0/24         0.0.0.0                                   00:01:14   [{Origin: ?}]
*> 172.17.0.0/16        172.17.0.2                                00:05:38   [{Origin: i} {Med: 0} {LocalPref: 100}]
*> 172.18.0.0/16        172.17.0.2                                00:05:38   [{Origin: i} {Med: 0} {LocalPref: 100}]
root@r2:~# ip r
default via 172.17.0.1 dev eth0 
172.17.0.0/16 dev eth0  proto kernel  scope link  src 172.17.0.3 

root@r3:~# gobgp global rib 
   Network              Next Hop             AS_PATH              Age        Attrs
*> 172.17.0.0/16        172.18.0.2                                00:05:38   [{Origin: i} {Med: 0} {LocalPref: 100}]
*> 172.18.0.0/16        172.18.0.2                                00:05:38   [{Origin: i} {Med: 0} {LocalPref: 100}]
root@r3:~# ip r
default via 172.18.0.1 dev eth1 
172.18.0.0/16 dev eth1  proto kernel  scope link  src 172.18.0.3

Add the same route on r3, then check the RIB and the Linux routing table. GoBGP on R1 receives both routes and installs them into R1's kernel routing table via Zebra. The output below shows that traffic to 10.23.1.0/24 is forwarded through interface eth0 (next hop R2) or interface eth1 (next hop R3) with equal weight.

root@r3:~# gobgp global rib -a ipv4 add 10.23.1.0/24

root@r1:~# gobgp global rib 
   Network              Next Hop             AS_PATH              Age        Attrs
*> 10.23.1.0/24         172.17.0.3                                00:02:41   [{Origin: ?} {LocalPref: 100}]
*  10.23.1.0/24         172.18.0.3                                00:00:20   [{Origin: ?} {LocalPref: 100}]
*> 172.17.0.0/16        0.0.0.0                                   00:07:16   [{Origin: i} {Med: 0}]
*> 172.18.0.0/16        0.0.0.0                                   00:07:16   [{Origin: i} {Med: 0}]
root@r1:~# ip r
default via 172.17.0.1 dev eth0 
10.23.1.0/24  proto zebra 
        nexthop via 172.17.0.3  dev eth0 weight 1
        nexthop via 172.18.0.3  dev eth1 weight 1
172.17.0.0/16 dev eth0  proto kernel  scope link  src 172.17.0.2 
172.18.0.0/16 dev eth1  proto kernel  scope link  src 172.18.0.2 

root@r2:~# gobgp global rib 
   Network              Next Hop             AS_PATH              Age        Attrs
*> 10.23.1.0/24         0.0.0.0                                   00:02:41   [{Origin: ?}]
*> 172.17.0.0/16        172.17.0.2                                00:07:05   [{Origin: i} {Med: 0} {LocalPref: 100}]
*> 172.18.0.0/16        172.17.0.2                                00:07:05   [{Origin: i} {Med: 0} {LocalPref: 100}]
root@r2:~# ip r
default via 172.17.0.1 dev eth0 
172.17.0.0/16 dev eth0  proto kernel  scope link  src 172.17.0.3 

root@r3:~# gobgp global rib 
   Network              Next Hop             AS_PATH              Age        Attrs
*> 10.23.1.0/24         0.0.0.0                                   00:00:20   [{Origin: ?}]
*> 172.17.0.0/16        172.18.0.2                                00:07:05   [{Origin: i} {Med: 0} {LocalPref: 100}]
*> 172.18.0.0/16        172.18.0.2                                00:07:05   [{Origin: i} {Med: 0} {LocalPref: 100}]
root@r3:~# ip r
default via 172.18.0.1 dev eth1 
172.18.0.0/16 dev eth1  proto kernel  scope link  src 172.18.0.3

Unnumbered BGP

BGP is not only for the Internet. Due to its proven scalability and configuration flexibility, large data center operators use BGP for their data center networking [ietf-rtgwg-bgp-routing-large-dc].

In the typical case the network topology is a Clos network, which offers multiple equal-cost paths to the ToR switches. Each ToR switch runs a BGP daemon and peers with the uplink switches over P2P links.

In this case, since all the switches are operated by a single administrator and trusted, we can skip tedious neighbor configuration (specifying the neighbor address or the neighbor AS number) by using the unnumbered BGP feature.

Unnumbered BGP uses the IPv6 link-local address to decide automatically whom to connect to. You also don't need to specify the neighbor's AS number; GoBGP will accept any AS number in the neighbor's OPEN message.

Prerequisites

To use the unnumbered BGP feature, make sure the link between the two BGP daemons is P2P and that IPv6 is enabled on the interfaces attached to it.

Also check that the neighbor's IPv6 link-local address is present in Linux's neighbor table.

$ ip -6 neigh show
fe80::42:acff:fe11:5 dev eth0 lladdr 02:42:ac:11:00:05 REACHABLE

If the neighbor's address is missing, the easiest way to populate the table is ping6. Try the command below:

$ ping6 -c 1 ff02::1%eth0
PING ff02::1%eth0 (ff02::1%eth0): 56 data bytes
64 bytes from fe80::42:acff:fe11:5%eth0: icmp_seq=0 ttl=64 time=0.312 ms
--- ff02::1%eth0 ping statistics ---
1 packets transmitted, 1 packets received, 0% packet loss
round-trip min/avg/max/stddev = 0.312/0.312/0.312/0.000 ms

A more reliable method is to run radvd or zebra to send router advertisements periodically.
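For example, a minimal radvd configuration that keeps advertising on the link could look like the sketch below (the interface name eth0 and the intervals are assumptions, not taken from this setup):

```
# /etc/radvd.conf -- minimal sketch, assuming the P2P link is on eth0
interface eth0 {
    AdvSendAdvert on;        # enable sending router advertisements
    MinRtrAdvInterval 3;     # lower bound, seconds between unsolicited RAs
    MaxRtrAdvInterval 10;    # upper bound, seconds between unsolicited RAs
};
```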

docker run --name=r1 -d --hostname=r1 --privileged --net=none gobgp
docker run --name=r2 -d --hostname=r2 --privileged --net=none gobgp
ip link add v1 type veth peer name v2
pid1=$(docker inspect -f {{.State.Pid}} r1)
pid2=$(docker inspect -f {{.State.Pid}} r2)                                                                                 
mkdir -p /var/run/netns/
touch /var/run/netns/r1
touch /var/run/netns/r2
mount --bind /proc/$pid1/ns/net /var/run/netns/r1
mount --bind /proc/$pid2/ns/net /var/run/netns/r2
ip link set v1 netns r1
ip link set v2 netns r2
ip netns exec r1 ip link set v1 name eth0
ip netns exec r2 ip link set v2 name eth0
ip netns exec r1 ip link set eth0 up
ip netns exec r2 ip link set eth0 up
ip netns exec r1 ip ad add 192.168.0.1/24 dev eth0
ip netns exec r2 ip ad add 192.168.0.2/24 dev eth0
docker exec -it r1 bash
echo 0 > /proc/sys/net/ipv6/conf/all/disable_ipv6
docker exec -it r2 bash
echo 0 > /proc/sys/net/ipv6/conf/all/disable_ipv6
docker exec -it r1 bash
ping6 -c 1 ff02::1%eth0
ip -6 neigh show


root@r1:~# cat conf
[global.config]
  as = 1
  router-id = "192.168.0.1"

[[neighbors]]
  [neighbors.config]
    neighbor-interface = "eth0"

root@r2:~# cat conf
[global.config]
  as = 2
  router-id = "192.168.0.2"

[[neighbors]]
  [neighbors.config]
    neighbor-interface = "eth0"

Route reflector

for i in $(seq 1 5); do docker run -d --name=r$i --privileged --hostname=r$i chenchun/gobgp; done

root@ramichen:/home/ramichen/project/go/src/github.com/osrg/gobgp# cat setup.sh 
cat <<EOF >start.sh
#!/bin/bash
kill -9 \$(ps aux | grep gobgpd | grep -v grep | awk '{print \$2}') 2>/dev/null
nohup /go/bin/gobgpd -f conf >> log
EOF
chmod +x start.sh

cat <<EOF >conf1
[global.config]
  router-id = "172.17.0.2"
  as = 65000

[[neighbors]]
  [neighbors.config]
    neighbor-address = "172.17.0.3"
    peer-as = 65000
  [neighbors.route-reflector.config]
    route-reflector-client = true
    route-reflector-cluster-id = "172.17.0.2"
[[neighbors]]
  [neighbors.config]
    neighbor-address = "172.17.0.4"
    peer-as = 65000
  [neighbors.route-reflector.config]
    route-reflector-client = true
    route-reflector-cluster-id = "172.17.0.2"

[[neighbors]]
  [neighbors.config]
    neighbor-address = "172.17.0.5"
    peer-as = 65000
[[neighbors]]
  [neighbors.config]
    neighbor-address = "172.17.0.6"
    peer-as = 65000
EOF

for i in $(seq 2 5); do
        cat <<EOF >conf$i
[global.config]
  as = 65000
  router-id = "172.17.0.$((i + 1))"
  
[[neighbors]]
  [neighbors.config]
    neighbor-address = "172.17.0.2"
    peer-as = 65000
EOF
done

for i in $(seq 1 5); do
        echo starting r$i
        docker cp conf$i r$i:/root/conf
        docker cp start.sh r$i:/root
        rm conf$i
        docker exec -d r$i bash -c /root/start.sh
done

sleep 2
docker exec r1 gobgp neigh

# setup.sh file end -----------

for i in $(seq 1 5); do docker exec r$i gobgp global rib add 10.0.$((i + 1)).0/24 -a ipv4; done
root@ramichen:/home/ramichen/project/go/src/github.com/osrg/gobgp# docker exec r1 gobgp neigh 172.17.0.5 adj-out
   ID  Network              Next Hop             AS_PATH              Attrs
   1   10.0.2.0/24          172.17.0.2                                [{Origin: ?} {LocalPref: 100}]
   1   10.0.3.0/24          172.17.0.3                                [{Origin: ?} {LocalPref: 100}]
   1   10.0.4.0/24          172.17.0.4                                [{Origin: ?} {LocalPref: 100}]
root@ramichen:/home/ramichen/project/go/src/github.com/osrg/gobgp# docker exec r1 gobgp neigh 172.17.0.4 adj-out
   ID  Network              Next Hop             AS_PATH              Attrs
   1   10.0.2.0/24          172.17.0.2                                [{Origin: ?} {LocalPref: 100} {Originator: 172.17.0.2} {ClusterList: [172.17.0.2]}]
   1   10.0.3.0/24          172.17.0.3                                [{Origin: ?} {LocalPref: 100} {Originator: 172.17.0.3} {ClusterList: [172.17.0.2]}]
   1   10.0.5.0/24          172.17.0.5                                [{Origin: ?} {LocalPref: 100} {Originator: 172.17.0.5} {ClusterList: [172.17.0.2]}]
   1   10.0.6.0/24          172.17.0.6                                [{Origin: ?} {LocalPref: 100} {Originator: 172.17.0.6} {ClusterList: [172.17.0.2]}]

# clean up
for i in $(seq 1 5); do docker rm -vf r$i; done

  • GoBGP is just a BGP daemon and does not itself contain any functionality to modify the routing table
  • If you'd like to use GoBGP as a component of a software router and do packet forwarding, you need to implement that yourself
  • There are two options for FIB manipulation with GoBGP:
    • Use the built-in zebra integration
    • Write your own code using the gRPC API
device  g1               g2               g3
lo      192.168.0.1/32   192.168.0.2/32   192.168.0.3/32
eth1    192.168.10.2/24  192.168.10.3/24  192.168.10.4/24
     ------------------------------
     |                            |
   ------        ------        ------
   | g1 |--------| g2 |--------| g3 |
   ------        ------        ------
   /   \          /   \        /    \
  /     \        /     \      /      \
------ ------ ------ ------ ------ ------
| h1 | | j1 | | h2 | | j2 | | h3 | | j3 |
------ ------ ------ ------ ------ ------


FIB

The FIB (Forwarding Information Base) in the Linux routing system stores three kinds of routing-related data. The first is the neighbour table, neigh_table{}, which holds address information for hosts physically attached to this machine. The second is the routing rule table, fib_table{}, which holds the data used to decide which route a given network address should take. The third is the route cache, rtcache, a cache of recently looked-up destination addresses, made up of rtable{} nodes.

(image: the Linux FIB architecture)

In the figure above, the FIB mainly interacts with user space, while the RT cache mainly interacts with the protocol stack; except for maintenance purposes, the stack does not access the FIB directly. In a basically stable system, the red arrows represent the flow of user operations and the blue arrows represent route lookups for packets in the lower layers. Only in special cases does a packet's route lookup fall through to the FIB, on the same principle as a CPU going to main memory only after a cache miss.

A FIB table is simply one of the tables referenced by ip rule.

A closely related concept is the RIB (Routing Information Base). The FIB is the routing table used for forwarding; the RIB is the table used for route management. The distinction really only matters once dynamic routing protocols are involved. RIP, OSPF, BGP and IS-IS are all dynamic routing protocols, and the routes they learn are first announced into the RIB. The RIB aggregates the routes learned by all routing protocols, runs best-path selection, and installs the selected routes into the FIB for forwarding. The FIB is therefore a subset of the RIB.

For example, in BGP there may be multiple paths to the same network through different ASes; the RIB stores all of those paths, while the FIB may hold only one of them.
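On a Linux host, these tables can be inspected directly with iproute2 (read-only queries, no root needed):

```shell
# Policy rules that select a routing table (ip rule points at the FIB tables):
ip rule show
# The main table, i.e. what a plain `ip route` prints:
ip route show table main
# The neighbor (ARP/NDP) table:
ip neigh show
```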

VRF RD

Each VRF can be viewed as a virtual router, comprising:

  • an independent routing table, with its own address space
  • a set of interfaces that belong to this VRF
  • a set of routing protocols used only within this VRF
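Linux itself can model this with VRF devices. A sketch (requires root and kernel VRF support; the device name, table number, and interface are arbitrary examples, not taken from this setup):

```shell
# Create a VRF device bound to routing table 10 -- its own routing table
ip link add vrf-blue type vrf table 10
ip link set vrf-blue up
# Enslave an interface -- the set of interfaces belonging to this VRF
ip link set eth1 master vrf-blue
# Routes on eth1 now live in table 10, separate from the main table
ip route show vrf vrf-blue
```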

A VPNv4 prefix is 12 bytes (96 bits): an 8-byte identifier, called the RD (Route Distinguisher), is prepended to the 4-byte IPv4 address.

RD(Route-Distinguisher)

Format: Type = 0, Administrator Subfield = the iBGP AS number, Assigned Number Subfield = a number unique to each VPN; this is the most commonly used form. Alternatively, Type = 1, Administrator Subfield = an IP address, Assigned Number Subfield = a number unique to each VPN.

Function: together with the 32-bit IPv4 prefix it forms the 96-bit VPNv4 prefix. If different VPN customers use overlapping IPv4 address space, assigning different RD values keeps the prefixes unique.

Essence: it is just a number carrying no real information; it distinguishes routes that share the same IPv4 prefix so that the BGP process does not treat them as the same route.
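As a concrete illustration (the AS number 65000, the VPN id 100, and the prefix are made-up values), the textual form of a VPNv4 route is just the RD prepended to the IPv4 prefix:

```shell
RD="65000:100"          # Type 0 RD: 2-byte AS number : 4-byte per-VPN number
PREFIX="10.1.1.0/24"    # the customer's IPv4 prefix
echo "${RD}:${PREFIX}"  # prints 65000:100:10.1.1.0/24, the VPNv4 route
```

Two customers both announcing 10.1.1.0/24 under different RDs thus produce two distinct VPNv4 prefixes.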

https://blog.csdn.net/qq_32822927/article/details/78299611

EVPN

EVPN stands for Ethernet VPN. It was first defined in RFC 7432, whose full title is "BGP MPLS-Based Ethernet VPN"; as the name suggests, it is an L2 VPN built on BGP and MPLS. Although the RFC was only assigned a number in 2015, many vendors had already implemented EVPN while it was still a draft.

https://www.sdnlab.com/19650.html

IBGP, EBGP, IGP

https://blog.csdn.net/zhouwei1221q/article/details/45420223

IGP

https://zh.wikipedia.org/wiki/%E8%B7%AF%E7%94%B1%E5%8D%8F%E8%AE%AE https://zh.wikipedia.org/wiki/%E5%86%85%E9%83%A8%E7%BD%91%E5%85%B3%E5%8D%8F%E8%AE%AE An Interior Gateway Protocol (IGP) is a routing protocol used within a single autonomous system (AS). IGPs fall into three classes:

  • Distance-vector routing protocols
    • Routing Information Protocol (RIP)
    • Interior Gateway Routing Protocol (IGRP) (note: do not confuse IGP, the class of protocols described here, with IGRP, which is one specific routing protocol)
  • Link-state routing protocols
    • Open Shortest Path First (OSPF)
    • Intermediate System to Intermediate System (IS-IS)
  • Advanced distance-vector routing protocols
    • Enhanced Interior Gateway Routing Protocol (EIGRP) (an enhanced, Cisco-proprietary version of IGRP)

BGP/IGP example

The following is from https://www.tigera.io/bgp-unumbered/

The original genesis of BGP came from the need to scalably pass large numbers of routes between Internet Service Providers. At the time BGP was developed, the existing routing protocols (e.g., OSPF and IS-IS) performed two functions, map an exact route to a given destination, and detect faults along the route. Combining these two functions meant that those protocols were (and still are) not scalable for large numbers of routes. The need for carrying large numbers of routes (all the ISP’s customer routes, and all of the routes of all the other Internet routes – currently the Internet routing table has more than 725K routes), led to the creation of BGP.

To reach the scale necessary, BGP behaves a bit differently than other routing protocols. To allow for scale, each network on the Internet is assigned a unique ID, called an Autonomous System Number (ASN). You can think of these as equivalent to country codes in the phone network, or zip/postal codes for postal mail. Two BGP routers that are communicating either are doing so within a given ASN (called iBGP) or between two ASNs (called eBGP). Each router also has one or more IP addresses that it uses as a next hop when it exchanges routes.

When a router learns about routes over an eBGP connection, it changes the next hop value of the route(s) to be one of its address(es). In effect, it is saying that it is the path to the given destination.

When that router then advertises those routes to the other routers in the AS (via iBGP sessions), those routers do not change the next hop value, basically saying that the first router in the ASN that learned the route is the gateway for that route. A picture might make this easier to understand:

(image: the next hop for the route to X changing at the AS boundary)

The diagram shows that next hop for the route to X changes when it crosses an AS boundary, but not when it is passed between routers in each AS. You can think of it that BGP tells you where you want to go, but not necessarily how to get there. In a large, diffuse network, that task is still carried out by an Interior Gateway Protocol (IGP) like OSPF or IS-IS. This can be seen in the following diagram (same network, but at the intra-AS routing layer).

(image: the same network at the intra-AS (IGP) routing layer)

So, when a router wants to reach a specific destination, it uses BGP to find which router is the gateway to the desired destination itself, or at least the gateway to the next AS in the chain. It then (usually) uses an IGP to find how to get to that BGP router's next-hop address.

Clos network architecture

A brief history of Tencent's network architecture: http://km.oa.com/news/post/119354 http://km.oa.com/group/16327/articles/show/96784

Now for cloud networking. This is the traditional data center topology that Cisco long advocated: three tiers, with the access layer at the bottom, the distribution layer in the middle, and the core layer at the top. The problem is that many access switches connect up to each distribution switch, and many distribution switches to each core switch. Servers on the same access switch get 1G of bandwidth between them, while servers on different access switches get less than 1G. Programmers therefore had to think about whether two servers sat on the same access switch: within one rack a program could count on 1G, but across racks, through switches in higher tiers, it might get only tens of megabits. Programs written this way cannot migrate freely or be deployed dynamically, because the network topology is baked into them; and without dynamic deployment, resource utilization stays low, since no application sees high load 24 hours a day.

What we want instead is a flat topology, deployed as a Clos network, an idea proposed by a computer scientist back in the 1950s. The first tier has 128 switches, each with 40 downlink 1G ports and 40 uplinks. The second tier has 4 switches, each with 128 10G downlink ports connecting to tier 1. The cluster holds 5120 servers, and any two of them have 1G of bandwidth between them, so programmers writing distributed applications no longer need to think about where a program runs, because all communication gets 1G of bandwidth.

Many of Tencent's applications interact with one another. Even leaving dynamic deployment aside and assuming static deployment, with each application pinned to a fixed server and never migrated, Tencent runs many different services: mobile versions of many products, Weibo and Qzone, games, and server-side online services. Many of them touch the QQ friend relationship chain, so different applications interact heavily. With this topology, all of those interactions get enough bandwidth, and congestion is avoided.

Google built this kind of topology in 2008: a three-tier Clos network.

Tier 1: 512 switches, each with 40 downlink 1G ports and 40 uplinks, 64 switches per partition, for 8 partitions in total.

Tier 2: each core switch has at least 128 10G ports, with 64 10G downlinks and 64 uplinks, 4 switches per partition.

Tier 3: 16 switches, each with 32 downlink 10G ports.

The cluster holds 20480 servers, and any two of them have 1G of bandwidth between them.
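The server counts quoted for the two designs follow directly from the port arithmetic (using the figures given in the text):

```shell
# Two-tier design: 128 tier-1 switches x 40 downlink 1G ports each
echo $((128 * 40))   # prints 5120
# Three-tier design: 512 tier-1 switches x 40 downlink 1G ports each
echo $((512 * 40))   # prints 20480
```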

Others

  • NLRI - Network Layer Reachability Information
  • RIB - Routing Information Base
  • Loc-RIB - Local RIB
  • AS - Autonomous System (identified by an AS number)
  • VRF - Virtual Routing and Forwarding instance
  • PE - Provider Edge router