VXLAN stands for Virtual eXtensible Local Area Network. It was developed jointly by several companies, Cisco among them.
VXLAN is an overlay technology that carries Layer-2 traffic over a Layer-3 network.
One major reason for VXLAN [in my view, at least] is the growing importance of virtualization and cloud computing in recent years. Isolating a huge number of customers in a cloud environment becomes extremely challenging when only 4,096 VLANs are available [and some of those cannot be used].
Although a VLAN tag adds a 4-byte header to a frame, only 12 of those bits carry the VLAN ID. VXLAN doubles this to 24 bits, which scales the number of segments from 4,096 to 16,777,216 [roughly 16 million].
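To make the scaling concrete, here is a small Python sketch of the ID-space arithmetic. The variable names are mine and only illustrate the 12-bit versus 24-bit comparison.

```python
# ID-space arithmetic: 12-bit VLAN ID versus 24-bit VNI.
VLAN_ID_BITS = 12
VNI_BITS = 24

vlan_ids = 2 ** VLAN_ID_BITS
vnis = 2 ** VNI_BITS

print(f"VLAN IDs: {vlan_ids:,}")            # VLAN IDs: 4,096
print(f"VNIs:     {vnis:,}")                # VNIs:     16,777,216
print(f"Ratio:    {vnis // vlan_ids:,}x")   # Ratio:    4,096x
```

Doubling the number of bits squares the number of IDs, which is why the jump is so large.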
What VXLAN essentially does is very simple: it encapsulates the original Layer-2 frame inside a UDP datagram [Cisco uses port 8472 here]. Since this is an encapsulation solution, it adds an overhead of about 50 bytes. The first packet sent out by the router performing VXLAN encapsulation [called the VTEP router] is a multicast packet. Subsequent packets are unicast. It's worth mentioning that only broadcast frames and unknown unicasts are flooded using multicast. Once a MAC address is known, traffic to it is no longer flooded.
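As a rough illustration of where the 50 bytes come from, the sketch below adds up the outer headers and packs the 8-byte VXLAN header with Python's struct module. I am assuming an untagged outer Ethernet header and an IPv4 header without options. The function name vxlan_header is my own.

```python
import struct

OUTER_ETHERNET = 14  # destination MAC + source MAC + EtherType, no 802.1Q tag
OUTER_IPV4 = 20      # IPv4 header without options
OUTER_UDP = 8
VXLAN_HEADER = 8

def vxlan_header(vni: int) -> bytes:
    """Pack the 8-byte VXLAN header: flags(8) reserved(24) VNI(24) reserved(8)."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    flags = 0x08  # "I" bit set: the VNI field is valid
    return struct.pack("!II", flags << 24, vni << 8)

overhead = OUTER_ETHERNET + OUTER_IPV4 + OUTER_UDP + VXLAN_HEADER
header = vxlan_header(4096)

print(overhead)      # 50
print(header.hex())  # 0800000000100000
```

The VNI sits in bytes 4 to 6 of the header, so VNI 4096 [0x001000] shows up as 00 10 00 in the hex dump.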
Before we try out a basic VXLAN configuration, it helps to keep the following terms in mind:
VTEP [Virtual Tunnel End-Point] - In our case this will be the CSR1000V [Cisco's Cloud Services Router]. This router encapsulates the Layer-2 frames on the source side and strips the encapsulation on the receiver side.
VNI [VXLAN Network Identifier] - Each of the 16 million VXLAN IDs at our disposal is known as a VNI. VNIs work very similarly to our legacy VLANs and isolate the traffic of one VNI from another.
The multicast mode used is PIM-BIDIR, which is a variant of PIM sparse-mode.
Let's now move forward and try out our very own VXLAN configuration:
With the current implementation [XE-3.11], VXLAN on Cisco [CSR1000v, to be specific] supports only multicast mode.
VXLAN - Unicast: http://stayinginit.blogspot.in/2014/10/vxlan-unicast.html
Topology:
- We have Router1 and Router2 [both CSR1000V] and we have a core router [ASR1K]
- It is worth keeping in mind that Router1 and Router2 are virtual routers spawned on a UCS server [I am using ESXi 5.5]
- The VMs are also part of the same UCS server
- Another point of interest: to configure VXLAN-related commands, your CSR1000V must have the premium license
Configurations:
Before we start VXLAN configuration, let us first ensure the basic routing and multicast related configurations are in place:
Router1:
ip multicast-routing distributed
!
router ospf 100
 router-id 1.1.1.1
!
interface GigabitEthernet2
 description "connected to core"
 ip address 10.1.1.1 255.255.255.0
 ip pim sparse-mode
 no shutdown
 ip ospf 100 area 100
!
ip pim bidir-enable
ip pim rp-address 100.100.100.100 bidir
CORE:
ip multicast-routing distributed
!
router ospf 100
 router-id 2.2.2.2
!
interface GigabitEthernet0/0/0
 description "connected to Router1"
 ip address 10.1.1.2 255.255.255.0
 ip pim sparse-mode
 no shutdown
 ip ospf 100 area 100
!
interface GigabitEthernet0/0/1
 description "connected to Router2"
 ip address 11.1.1.2 255.255.255.0
 ip pim sparse-mode
 no shutdown
 ip ospf 100 area 100
!
interface Loopback100
 ip address 100.100.100.100 255.255.255.255
 ip pim sparse-mode
 ip ospf 100 area 100
!
ip pim bidir-enable
ip pim rp-address 100.100.100.100 bidir
Router2:
ip multicast-routing distributed
!
router ospf 100
 router-id 3.3.3.3
!
interface GigabitEthernet2
 description "connected to core"
 ip address 11.1.1.1 255.255.255.0
 ip pim sparse-mode
 ip ospf 100 area 100
 no shutdown
!
ip pim bidir-enable
ip pim rp-address 100.100.100.100 bidir
With the above configurations, we should be able to reach all the routers in our topology. Now let us go onto the next phase of configuration, that is, VXLAN:
Router1:
interface Loopback100
 ip address 10.10.10.10 255.255.255.255
 ip pim sparse-mode
 ip ospf 100 area 100
!
interface nve1
 no shutdown
 source-interface Loopback100
!
interface GigabitEthernet3
 description "connected to VM1"
 no shutdown
 service instance 10 ethernet
  encapsulation dot1q 10
  rewrite ingress tag pop 1 symmetric
!
bridge-domain 10
 member GigabitEthernet3 service-instance 10
- Interface NVE [network virtualization endpoint] is where we configure the VNI and multicast-group mapping
- We cannot assign an IP address to this interface, so we use a Loopback interface as its source-interface [once assigned, this immediately creates a tunnel interface, the VTEP]
- The Loopback IP address should be reachable
- Finally, we have Ethernet virtual circuit [EVC] configuration on GigabitEthernet3 to make it a Layer-2 interface
- With the above configuration, we expect to receive traffic tagged with VLAN 10. "rewrite ingress tag pop 1 symmetric" removes the tag from ingress traffic and re-adds it to traffic going out of this interface
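The effect of "rewrite ingress tag pop 1 symmetric" can be pictured with a small Python sketch that pops one 802.1Q tag from a frame and pushes it back. This is only a model of the idea, not how the router implements it. The MAC addresses, the payload and the function names are made up for illustration.

```python
import struct
from typing import Tuple

TPID_DOT1Q = 0x8100  # EtherType value that marks an 802.1Q tag

def pop_tag(frame: bytes) -> Tuple[int, bytes]:
    """Remove one 802.1Q tag; return (vlan_id, untagged_frame)."""
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID_DOT1Q:
        raise ValueError("frame is not 802.1Q tagged")
    return tci & 0x0FFF, frame[:12] + frame[16:]

def push_tag(frame: bytes, vlan_id: int) -> bytes:
    """Insert an 802.1Q tag (priority 0) after the two MAC addresses."""
    tag = struct.pack("!HH", TPID_DOT1Q, vlan_id & 0x0FFF)
    return frame[:12] + tag + frame[12:]

# Hypothetical frame: made-up MACs, VLAN 10, EtherType IPv4, dummy payload.
dst = bytes.fromhex("00005e005301")
src = bytes.fromhex("00005e005302")
tagged = dst + src + struct.pack("!HH", TPID_DOT1Q, 10) + b"\x08\x00" + b"payload"

vlan, untagged = pop_tag(tagged)      # ingress direction: tag removed
restored = push_tag(untagged, vlan)   # egress direction: tag re-added

print(vlan, len(tagged) - len(untagged), restored == tagged)  # 10 4 True
```

The frame that enters the bridge-domain is therefore 4 bytes shorter than the one received from the VM, and the tag comes back on the way out.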
The next step is the binding configuration, shown here as it appears on the router:
Router1(config)#interface nve 1
Router1(config-if)#member vni ?
WORD VNI range or instance between 4096-16777215 example: 6010-6030 or 7115
Router1(config-if)#member vni 4096 ?
mcast-group Configure multicast group for vni(s)
Router1(config-if)#member vni 4096 mcast-group ?
A.B.C.D Starting Multicast Group IPv4 Address
Router1(config-if)#member vni 4096 mcast-group 225.1.1.1
Router1(config-if)#exit
The range of VNI is from 4096 to 16777215. After the above configuration, the final configuration that needs to be done is binding this VNI to our bridge-domain:
bridge-domain 10
 member vni 4096
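If you generate such configuration from a script, it helps to check the VNI before pushing it. Below is a minimal Python sketch that accepts the two forms shown in the CLI help [a single VNI or a range] and enforces the 4096-16777215 bounds. The function name parse_vni_spec is hypothetical and not part of any Cisco tooling.

```python
VNI_MIN, VNI_MAX = 4096, 16777215  # range shown in the CLI help above

def parse_vni_spec(spec: str) -> range:
    """Parse '7115' or '6010-6030' into a range of VNIs, validating the bounds."""
    first, sep, last = spec.partition("-")
    low = int(first)
    high = int(last) if sep else low
    if not (VNI_MIN <= low <= high <= VNI_MAX):
        raise ValueError(f"VNI spec {spec!r} is outside {VNI_MIN}-{VNI_MAX}")
    return range(low, high + 1)

print(len(parse_vni_spec("4096")))       # 1
print(len(parse_vni_spec("6010-6030")))  # 21
```

A value such as "100" raises ValueError, mirroring the fact that the router does not offer VNIs below 4096.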
A very similar configuration repeats on Router2:
Router2:
interface Loopback100
 ip address 11.11.11.11 255.255.255.255
 ip pim sparse-mode
 ip ospf 100 area 100
!
interface nve1
 no shutdown
 source-interface Loopback100
 member vni 4096 mcast-group 225.1.1.1
!
interface GigabitEthernet3
 description "connected to VM2"
 no shutdown
 service instance 10 ethernet
  encapsulation dot1q 10
  rewrite ingress tag pop 1 symmetric
  exit ! in case you are pasting this configuration
 exit ! in case you are pasting this configuration
!
bridge-domain 10
 member vni 4096
 member GigabitEthernet3 service-instance 10
That ends our configuration. Now, let us send our favorite ping traffic from VM1 to VM2.
But before we do this, let me set up a quick EPC [Embedded Packet Capture] to see what kind of packets leave Router1.
The configuration details [a very basic EPC]:
Router1#monitor capture vxlan interface gigabitEthernet 2 both match any buffer size 200
Router1#monitor capture vxlan start
Now, ping from VM1 to VM2:
[root@localhost ~]# ping 172.16.11.111 -c 5
PING 172.16.11.111 (172.16.11.111) 56(84) bytes of data.
64 bytes from 172.16.11.111: icmp_seq=1 ttl=64 time=4.33 ms
64 bytes from 172.16.11.111: icmp_seq=2 ttl=64 time=1.52 ms
64 bytes from 172.16.11.111: icmp_seq=3 ttl=64 time=1.75 ms
64 bytes from 172.16.11.111: icmp_seq=4 ttl=64 time=2.05 ms
64 bytes from 172.16.11.111: icmp_seq=5 ttl=64 time=1.79 ms
--- 172.16.11.111 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4001ms
rtt min/avg/max/mdev = 1.526/2.293/4.337/1.035 ms
[root@localhost ~]#
Then, stop the EPC capture:
Router1#monitor capture vxlan stop
Check the capture contents on the router, or export them to a machine if you prefer to view them in Wireshark.
To export, use this command:
Router1#monitor capture vxlan export tftp://<IP_address>/<location>/<file_name>.pcap
!!Exported Successfully
Router1#
You will see the following if viewed on Router1:
Router1#show monitor capture vxlan buffer brief
-------------------------------------------------------------
# size timestamp source destination protocol
-------------------------------------------------------------
2 110 4.497975 10.10.10.10 -> 225.1.1.1 UDP
3 110 4.498982 11.11.11.11 -> 10.10.10.10 UDP
4 148 4.498982 10.10.10.10 -> 11.11.11.11 UDP
5 148 4.499974 11.11.11.11 -> 10.10.10.10 UDP
6 148 5.496984 10.10.10.10 -> 11.11.11.11 UDP
7 148 5.497975 11.11.11.11 -> 10.10.10.10 UDP
8 148 6.496984 10.10.10.10 -> 11.11.11.11 UDP
9 148 6.498982 11.11.11.11 -> 10.10.10.10 UDP
10 148 7.497975 10.10.10.10 -> 11.11.11.11 UDP
11 148 7.499974 11.11.11.11 -> 10.10.10.10 UDP
13 148 8.496984 10.10.10.10 -> 11.11.11.11 UDP
14 148 8.498982 11.11.11.11 -> 10.10.10.10 UDP
17 110 9.498982 11.11.11.11 -> 10.10.10.10 UDP
18 110 9.499974 10.10.10.10 -> 11.11.11.11 UDP
Router1#
I have removed the OSPF and PIM packets from the capture. Apart from those, the ping traffic from VM1 to VM2 triggers our first UDP packet, with source 10.10.10.10 and destination 225.1.1.1.
As mentioned earlier, this is the first message sent out from Router1 [our VTEP]. Both the source and destination port of this packet are 8472, the default VXLAN port.
The subsequent UDP messages, however, are unicast packets sent with arbitrary source ports and the destination port retained as 8472. The source and destination addresses we observe are the Loopback addresses of Router1 and Router2.
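The multicast-then-unicast pattern is easy to confirm with Python's ipaddress module. The sketch below classifies the destinations of the first four capture entries. The helper name delivery is my own.

```python
import ipaddress

# (source, destination) pairs taken from the first four capture entries above.
flows = [
    ("10.10.10.10", "225.1.1.1"),
    ("11.11.11.11", "10.10.10.10"),
    ("10.10.10.10", "11.11.11.11"),
    ("11.11.11.11", "10.10.10.10"),
]

def delivery(dst: str) -> str:
    """Classify a destination address as multicast or unicast."""
    return "multicast" if ipaddress.ip_address(dst).is_multicast else "unicast"

kinds = [delivery(dst) for _, dst in flows]
print(kinds)  # ['multicast', 'unicast', 'unicast', 'unicast']
```

Only the very first packet goes to the group 225.1.1.1. Everything after it is exchanged directly between the two Loopback addresses.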
The command below shows the number of packets sent and received via the NVE interface:
Router1#show nve interface nve 1 detail
Interface: nve1, State: Admin Up, Oper Up Encapsulation: Vxlan
source-interface: Loopback100 (primary:10.10.10.10 vrf:0)
Pkts In Bytes In Pkts Out Bytes Out
7 666 7 666
Router1#
From the above, 5 are ping packets and the remaining 2 [the 110-byte entries in the EPC capture] are related to VXLAN.
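As a cross-check, the capture entries can be tallied per direction and size in Python and compared with the 7 in / 7 out counters. The size arithmetic at the end assumes the inner frame is untagged [the tag is popped on ingress] and that the encapsulation adds 50 bytes.

```python
from collections import Counter

VTEP = "10.10.10.10"  # Router1's Loopback100

# (size, source) for every entry in the capture above, in order.
entries = (
    [(110, "10.10.10.10"), (110, "11.11.11.11")]
    + [(148, "10.10.10.10"), (148, "11.11.11.11")] * 5
    + [(110, "11.11.11.11"), (110, "10.10.10.10")]
)

sent = Counter(size for size, src in entries if src == VTEP)
received = Counter(size for size, src in entries if src != VTEP)

print(dict(sent), sum(sent.values()))          # {110: 2, 148: 5} 7
print(dict(received), sum(received.values()))  # {110: 2, 148: 5} 7

# Ping size check: 84-byte IP packet + 14-byte inner Ethernet header
# + 50 bytes of VXLAN encapsulation.
ping_on_wire = 84 + 14 + 50
print(ping_on_wire)  # 148
```

The five 148-byte packets in each direction match the five pings, which leaves the two 110-byte packets as the non-ping traffic.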
Router1#show nve peers
Interface Peer-IP VNI Up Time
nve1 11.11.11.11 4096 -
Router1#
The above is the peer binding information. However, this entry expires after some time if the traffic flow is inactive.
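To picture the learn-then-expire behaviour, here is a toy Python table that maps a MAC address to a remote VTEP and falls back to the multicast group when the entry is unknown or stale. This is not Cisco's implementation. The class name, the MAC address and the 300-second timeout are assumptions chosen only for the example.

```python
class PeerTable:
    """Toy MAC-to-VTEP table with idle aging (illustrative only)."""

    def __init__(self, idle_timeout: float):
        self.idle_timeout = idle_timeout
        self._entries = {}  # mac -> (peer_ip, last_seen)

    def learn(self, mac: str, peer_ip: str, now: float) -> None:
        """Record (or refresh) the VTEP behind which this MAC was seen."""
        self._entries[mac] = (peer_ip, now)

    def lookup(self, mac: str, now: float, flood_group: str) -> str:
        """Return the peer IP if known and fresh, else the multicast group."""
        entry = self._entries.get(mac)
        if entry is None or now - entry[1] > self.idle_timeout:
            self._entries.pop(mac, None)  # drop the stale entry, if any
            return flood_group
        return entry[0]

table = PeerTable(idle_timeout=300.0)  # hypothetical timeout in seconds
mac_vm2 = "00:00:5e:00:53:02"          # made-up MAC address for VM2

print(table.lookup(mac_vm2, now=0.0, flood_group="225.1.1.1"))    # 225.1.1.1
table.learn(mac_vm2, "11.11.11.11", now=1.0)
print(table.lookup(mac_vm2, now=2.0, flood_group="225.1.1.1"))    # 11.11.11.11
print(table.lookup(mac_vm2, now=400.0, flood_group="225.1.1.1"))  # 225.1.1.1
```

The third lookup floods again because the entry has been idle for longer than the timeout, which is the same reason the peer disappears from the router when traffic stops.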
Next, a small bit of VXLAN UDP port tweaking. VXLAN uses port 8472 by default, and we use the "vxlan udp port" command in global config-mode to change it.
Using it is very simple, but it has to be configured on both VTEP routers.
The port being used by VXLAN can be viewed with the show command "show platform software vxlan F0 udp-port", as displayed below:
Router1#show platform software vxlan F0 udp-port
VXLAN UDP Port: 8472
Router1#
Now, let's go ahead and make the port change:
Router1(config)#vxlan udp port ?
<1024-65535> Port number
Router1(config)#vxlan udp port 1025 ?
<cr>
Router1(config)#
Let's run the show command again to see the change in the VXLAN UDP port:
Router1#show platform software vxlan F0 udp-port
VXLAN UDP Port: 1025
Router1#
If we fail to configure Router2 with the same port, we end up seeing 'OverlayBadPkt' drops on Router2. This can be verified after sending traffic from VM1 to VM2.
So, let us configure the same port on Router2 as well:
Router2(config)#vxlan udp port 1025
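Why both sides must agree can be shown with a toy Python model: a receiver only treats a datagram as VXLAN if it arrives on the port the receiver itself is configured with. The function names are hypothetical, and the model says nothing about how the router counts the resulting drops.

```python
PORT_MIN, PORT_MAX = 1024, 65535  # range offered by the CLI help above

def validate_port(port: int) -> int:
    """Reject ports outside the range the CLI accepts."""
    if not PORT_MIN <= port <= PORT_MAX:
        raise ValueError(f"port {port} is outside {PORT_MIN}-{PORT_MAX}")
    return port

def accepts(local_port: int, packet_dst_port: int) -> bool:
    """Toy receiver check: the datagram must target the locally configured port."""
    return packet_dst_port == local_port

router1_port = validate_port(1025)  # Router1 now sends to UDP 1025

print(accepts(8472, router1_port))  # False: Router2 still on the default port
print(accepts(1025, router1_port))  # True: Router2 reconfigured to match
```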
Now, let me send the ping traffic again:
[root@localhost ~]# ping 172.16.11.111 -c 5
PING 172.16.11.111 (172.16.11.111) 56(84) bytes of data.
64 bytes from 172.16.11.111: icmp_seq=1 ttl=64 time=4.48 ms
64 bytes from 172.16.11.111: icmp_seq=2 ttl=64 time=1.73 ms
64 bytes from 172.16.11.111: icmp_seq=3 ttl=64 time=1.69 ms
64 bytes from 172.16.11.111: icmp_seq=4 ttl=64 time=1.81 ms
64 bytes from 172.16.11.111: icmp_seq=5 ttl=64 time=1.56 ms
--- 172.16.11.111 ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4003ms
rtt min/avg/max/mdev = 1.566/2.258/4.480/1.114 ms
[root@localhost ~]#
Details can be captured via EPC and viewed in Wireshark.
Hope you found this post informative.