Plume Superpod All Ethernet Network Backhaul Performance and Stability Solution
- This is for prosumer / small-headcount SMB network configurations; it isn't a corporate-office-level solution
- This probably applies to all consumer mesh network hardware (Plume, Eero, Google WiFi) at this point, whenever it interacts with enterprise or small-business-grade L2/L3 managed switches in an Ethernet backhaul configuration
Network Configuration: 4x Plume Superpod Mesh WiFi AP's with All Ethernet Switched Backhaul
- Problem: A Plume Superpod or switch randomly loses network, internet, or both, and everything on your network dies, or at least important devices (Apple TV, Fire TV, smart TV, PS4, Xbox, iPhone, Apple Watch, Google Nexus phone, etc.) randomly lose internet and local network access. The causes include network storms and loops created by incorrect topology resolution. Additional problem: an Apple Watch on WiFi, when connected to an iPhone via Bluetooth, causes intermittent loss of all internet access on the iPhone over WiFi. The only way to restore it is to turn off the Apple Watch or turn off Bluetooth on the iPhone; either method immediately restores internet access over WiFi on the iPhone.
- Solution: A combination of managed network switch firmware, configuration, and iteration over a solid year, until a firmware release on the Superpods finally supported IGMP properly. That allows the switches to take over all the 'brains' and leaves the pods as dumb WiFi mesh network hardware.
- Result: Rock solid network stability through power outages, temporary ISP downtime for router maintenance, restarts, Superpod firmware updates, etc.
First Plume Superpod firmware that finally works as intended, and is confirmed to work here: 3.0.1-20 : https://support.plume.com/hc/en-us/articles/115001797928-Pod-Device-Release-Notes
Notice: 3.0.0-46 IGMP/MLD proxy support added
Topology:
fiber (1g) <-> router <-> fiber (10g) <-> l3 managed switch <-> fiber (10g) <-> l3 managed edge switch * 4 <-> ethernet (1g) <-> plume superpod * 4
Note: Each downstream l3 managed edge switch hosts 1 plume superpod
Router services: NAT, NATv6, DHCP, DHCPv6, (DNS configured to upstream 3rd party non-ISP provider or local pi-hole appliance)
Switch services: All the normal L3 managed switch things, but configuration and 'magic' is below...
L3 switch critically important configuration elements:
STP: disabled (Enabling this will ultimately only lead to pain and suffering on Eero, Plume, or any other mesh network device)
- You WILL have no end of random failure fun here if you play with this
- If you really need this, go full MSTP with an enterprise grade switch and don't look back. Also forget the non-enterprise access points and go with fully managed wireless access points
Forward BPDU while STP Disabled: enabled (critical for mesh APs, Sonos, etc. to be able to feel out the network and find their paths; some switches call this BPDU flood enable)
QoS: disabled (Enabling this is pointless unless you are tagging packets with QoS and have fully managed MPLS circuits anyways)
- This is a feel good setting that only makes sense on provider backhaul / backbone networks that are routinely saturated fully at a fabric level
- Also, most low-end L3 switches / routers actually perform worse with QoS enabled because of their low-end 400MHz single-core processors and limited memory buffering (it raises queue latency at the interface port level and typically turns off hardware offload processing, killing your latency, jitter, and overall network performance)
- Don't bother with any of this unless you are running 1GHz+ multi-core routers and switches with GBs of memory and 10Gb+ wire-speed routing capability
Storm control: disabled (normally you use MAC address level and 802.1X authentication to protect here, and only certified, authorized, managed, and fully configured devices are allowed on the network)
- You WILL have no end of random failure fun here if you play with this
DDoS protection: disabled (normally you use MAC address level and 802.1X authentication to protect here, and only certified, authorized, managed, and fully configured devices are allowed on the network)
- You WILL PROBABLY have no end of random failure fun here if you play with this
- This is something a product manager dreamt up and is useless in the real world
Bonjour mDNS proxy / daemon / multicast helper: enabled (enable this if your switch has it, otherwise your Airplay, Airprint, Cast traffic will simply be filtered by the IGMP / MLD snooping and dropped)
L2 Loop protection: disabled everywhere on everything (this just causes annoying and nearly impossible to debug random failures, and only protects you from complete and utter failure due to creating network loops; it isn't necessary, as nobody is plugging and unplugging your top level L3 switch to multiple downstream switches in a non-corporate environment. Also, this only works on point-to-point device links, and doesn't work at all for Sonos, mesh WiFi, etc. that have multiple non-switch paths available)
Flow control: disabled (enabling this is pointless unless all hardware supports flow control, and most mesh / consumer grade APs barf on pause frames; this really only works in the real world on all-Cisco / all-Juniper networks with their proprietary flow control extensions)
IGMP Snooping: enabled
IGMP IP header validation: enabled (this only works if every device on your network is discovered and reporting igmp properly, and to IETF specifications)
IGMP Snooping Interfaces: all interfaces enabled
Host timeout: 60, Host response timeout: 1, Mrouter timeout: 60, Fast leave: enabled
IGMP Snooping VLAN:
Data vlan (vlan all your Airprint, Airplay, Google Cast, Sonos et al devices are connected on)
Host timeout: 60, Host response timeout: 1, Mrouter timeout: 60, Fast leave: enabled
Report suppression: enabled, Querier mode: enabled, Querier interval: 1
IGMP Querier: enabled
Querier address: Data vlan interface subnet or management IP, or 0.0.0.0 (note lowest IP in subnet wins querier election, so make sure your top level l3 switch has the lowest subnet IP here)
IGMP version: 2 or 3 (just make sure it's the same on all your switches)
Query interval: 1
Querier expiry interval: 60
IGMP Querier VLAN:
Data vlan (vlan all your Airprint, Airplay, Google Cast, Sonos et al devices are connected on)
Querier election participate: enabled
Querier VLAN Address: Data vlan interface subnet IP, or 0.0.0.0 (note lowest IPv4 IP in subnet wins querier election, so make sure your top level l3 switch has the lowest subnet IP here)
IGMP MRouter configuration: all interfaces enabled
IGMP MRouter configuration VLAN: data vlan all interfaces enabled (no need for other vlan's unless you have your air / cast devices split up on separate vlans and they need to find each other)
Note: All other vlan's disable MRouter interface, and disable Querier
MLD Snooping: enabled
MLD Snooping Interfaces: all interfaces enabled
Membership Interval: 60, Max Response Time: 1, MRouter timeout: 600, Fast Leave: enabled
MLD Snooping VLAN:
Data vlan (vlan all your Airprint, Airplay, Google Cast, Sonos et al devices are connected on)
Membership Interval: 60, Max Response Time: 1, MRouter timeout: 600, Fast Leave: enabled
MLD MRouter configuration: all interfaces enabled
MLD MRouter configuration VLAN: data vlan all interfaces enabled (no need for other vlan's unless you have your air / cast devices split up on separate vlans and they need to find each other)
MLD Querier: enabled
Querier address: Data vlan interface subnet or management IP, or :: (note lowest IPv6 IP in subnet wins querier election, so make sure your top level l3 switch has the lowest subnet IP here)
MLD version: 1 or 2 (just make sure it's the same on all your switches, Cisco for example uses both)
Query interval: 1
Querier expiry interval: 60
MLD Querier VLAN:
Data vlan: enabled
All other vlan: disabled
Management vlan
Disable everything IGMP and MLD related unless you just like having unnecessary broadcast and multicast traffic here
Video vlan
Disable everything IGMP and MLD related unless you have some dedicated video streaming hardware or camera hardware here
Voice vlan
Disable everything IGMP and MLD related unless you have some dedicated VOIP hardware here that requires multicast for some very strange reason
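The querier election rule noted above for both IGMP and MLD (lowest address wins) is easy to get wrong when eyeballing dotted addresses, so here is a minimal Python sketch of it. The switch names and addresses are hypothetical, and the sketch assumes that a querier configured with 0.0.0.0 or :: does not take part in the election; check how your own switch firmware treats that case.

```python
import ipaddress


def elect_querier(candidates):
    """Return the name of the switch expected to win querier election.

    candidates maps a (hypothetical) switch name to its configured querier
    address. Addresses are compared numerically, lowest wins. Unspecified
    addresses (0.0.0.0 or ::) are skipped; that is an assumption of this
    sketch, not something every vendor guarantees.
    """
    parsed = {name: ipaddress.ip_address(addr) for name, addr in candidates.items()}
    active = {name: ip for name, ip in parsed.items() if not ip.is_unspecified}
    if not active:
        return None
    if len({ip.version for ip in active.values()}) != 1:
        raise ValueError("compare IGMP (IPv4) and MLD (IPv6) queriers separately")
    return min(active, key=active.get)


if __name__ == "__main__":
    igmp = {"core": "192.168.10.2", "edge1": "192.168.10.11", "edge2": "0.0.0.0"}
    mld = {"core": "fe80::1", "edge1": "fe80::a"}
    print("IGMP querier:", elect_querier(igmp))
    print("MLD querier:", elect_querier(mld))
```

Note that the comparison is numeric, not alphabetical: 192.168.10.2 beats 192.168.10.11 even though it sorts later as a string. Give the top level L3 switch the lowest address on the data vlan and it should win.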
If you've made it this far, and you've restarted all the things front to back, top to bottom, and have a fully working network; congrats... profit, you've got one heck of an awesome low latency high performance network.
Performance items...
Switch port interface configuration:
Autonegotiation: auto
Speed: auto
Duplex Mode: auto
Link Trap: enabled
Frame Size: 9198 (!!! this is for jumbo frames and everyone should configure this for local network performance; however NOTE: this should be set to the lowest number supported by the weakest link in your network switch chain and, when possible, unified across the network to prevent unnecessary packet fragmentation. If your switch doesn't support a frame size of at least 9018/9022 you should probably throw it away and get something better, as it will not pass typical 9000 MTU jumbo frames unfragmented !!!)
- Ever wonder why large file transfers on 1G+ networks don't max out the interface link when you have sufficient IO and memory / CPU bandwidth? It's because you're not using 9000 MTU jumbo frames, and the hardware interrupt and TCP overhead are vampirically siphoning off your link throughput
- To maximize jumbo frame benefits, you should configure your router's LAN interface MTU, and all operating system ethernet interface MTUs, to 9000. This is the only way to get 10G speeds and feeds with iSCSI, FCoE, SMB, NFS, AFS, etc.
- WAN and WiFi interfaces are nearly always restricted to 1500 MTU, so there is nothing to be done here until the IETF, providers, vendors, and most of the internet decide that 1500 MTU is legacy 1990's garbage and needs to be updated to something more modern
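For anyone wondering where the 9018/9022 figures come from, a rough Python sketch of the arithmetic follows. It assumes standard Ethernet framing (14 byte header, 4 byte FCS, optional 4 byte 802.1Q tag, 8 bytes of preamble plus 12 bytes of inter-frame gap on the wire) and plain IPv4 + TCP headers with no options; the function names are just for illustration.

```python
ETH_HEADER = 14       # destination MAC + source MAC + EtherType
FCS = 4               # frame check sequence
VLAN_TAG = 4          # optional 802.1Q tag
PREAMBLE_SFD = 8      # preamble + start frame delimiter (on the wire only)
INTERFRAME_GAP = 12   # minimum idle time between frames, counted in bytes
IPV4_HEADER = 20      # no IP options assumed
TCP_HEADER = 20       # no TCP options assumed


def frame_size(mtu, vlan_tagged=False):
    """Frame size a switch port must accept to pass a packet of this MTU."""
    return mtu + ETH_HEADER + FCS + (VLAN_TAG if vlan_tagged else 0)


def wire_bytes(mtu, vlan_tagged=False):
    """Bytes of link time one full-size frame occupies."""
    return frame_size(mtu, vlan_tagged) + PREAMBLE_SFD + INTERFRAME_GAP


def tcp_efficiency(mtu):
    """Fraction of link bandwidth left for TCP payload at this MTU."""
    return (mtu - IPV4_HEADER - TCP_HEADER) / wire_bytes(mtu)


def frames_per_second(mtu, link_bps):
    """Full-size frames per second needed to saturate the link."""
    return link_bps / 8 / wire_bytes(mtu)


if __name__ == "__main__":
    for mtu in (1500, 9000):
        print(
            f"MTU {mtu}: frame {frame_size(mtu)} bytes "
            f"({frame_size(mtu, vlan_tagged=True)} tagged), "
            f"payload efficiency {tcp_efficiency(mtu):.1%}, "
            f"{frames_per_second(mtu, 10e9):,.0f} frames/s at 10G"
        )
```

Under these assumptions a 9000 MTU packet needs a 9018 byte frame (9022 with a vlan tag). Payload efficiency goes from roughly 94.9% to 99.1%, and the number of frames per second that hosts and switches have to process at 10G drops from about 813,000 to about 138,000, which is where most of the interrupt and per-packet overhead savings come from.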
Further notes:
- This should work with nearly all mature L3 capable network switches from vendors using generic Broadcom based hardware (D-Link, TP-Link, Dell, some Netgear, etc.), as well as Cisco, Juniper, Brocade, HP, Mikrotik, Ubiquiti, etc.
- If you want the best chance of success, make sure all your network gear shares the same basic chipset, OS vendor, and capabilities. Mixing and matching network OS's is bound to cause you unnecessary heartache and grief, because network vendors rarely if ever test interoperability of 'all the things'; they do just basic IETF level certifications, confirm that it works with their own gear, and do some basic plug level testing with other gear.
- One last note... there's tons of buggy and poorly tested network software out there, so if you want maximum reliability, wait until network gear has been out for a year and people are reporting good success with it. Unlike dumb L1/L2 switches with no management, managed switches boot an OS like Linux, and have to go through dozens of iterations until they iron out the bugs introduced by product feature requirements and slick UI's in lieu of basic functionality. You're better off staying away from fully cloud / web managed switches here, unless you want to be on support tickets for months with no real progress beyond trying default settings here and there that don't actually work for anything other than the vendor's own full stack WiFi solution. The reason for this is simple: what sells is GUI's and slick mobile apps, not stable, reliable CLI's and performance (unless you're a cloud provider yourself, or a corporate network engineer).
- Some day, hopefully soon, the idea of a fully programmatically configured network will become a reality and you will be able to pick your raw network switch vendor, load InsertSuperAwesomeLinuxNetworkSwitchOSHere on the flash, boot it up, drop a known working config file on it, and profit... Still dreaming, but this is how the network world 'should' work. We shouldn't be pretending that the OS that Broadcom or 'insert random chipset vendor here' provides to the generic switch wrapping companies is any more special than any other switch with the same exact hardware under the hood.
Random config notes:
DNS Servers (Cloudflare, faster and better privacy claims than most other alternatives):
- 1.1.1.1
- 1.0.0.1
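- 2606:4700:4700::64 (note: this and the next address are Cloudflare's DNS64 resolvers, meant for networks running NAT64; the standard IPv6 resolvers are 2606:4700:4700::1111 and 2606:4700:4700::1001)
- 2606:4700:4700::6400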
SNTP Servers (Google, supports transparent leap second smear):
- time1.google.com
- time2.google.com
- time3.google.com
SNTP Client Mode: unicast
SNTP Client Version: 4
SNTP Port: 123
SNTP Unicast Poll Interval: 10
SNTP Broadcast Poll Interval: 10
SNTP Unicast Poll Timeout: 1
SNTP Unicast Poll Retry: 10
SNTP Time Zone Name: UTC
SNTP Offset Hours: 0
SNTP Offset Minutes: 0
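To sanity check that the SNTP servers above answer from your network before you commit the settings to a switch, a minimal unicast SNTP v4 client can be sketched in Python as below. The function names are made up for this example, and the 1 second timeout assumes the "Unicast Poll Timeout" above is in seconds; your switch may use different units.

```python
import socket
import struct

NTP_UNIX_EPOCH_DELTA = 2208988800  # seconds between 1900-01-01 and 1970-01-01


def build_request(version=4):
    """Build a 48 byte SNTP client request (LI=0, given version, mode 3 = client)."""
    first_byte = (0 << 6) | (version << 3) | 3
    return bytes([first_byte]) + bytes(47)


def parse_transmit_time(response):
    """Return the server transmit timestamp as Unix time in seconds (UTC)."""
    if len(response) < 48:
        raise ValueError("SNTP response too short")
    seconds, fraction = struct.unpack("!II", response[40:48])
    return seconds - NTP_UNIX_EPOCH_DELTA + fraction / 2**32


def query(server, port=123, timeout=1.0):
    """Send one unicast request and return the server time as Unix seconds."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_request(), (server, port))
        response, _ = sock.recvfrom(512)
    return parse_transmit_time(response)


if __name__ == "__main__":
    for host in ("time1.google.com", "time2.google.com", "time3.google.com"):
        try:
            print(host, query(host))
        except OSError as exc:
            print(host, "no answer:", exc)
```

If a server times out here, it will time out for the switch too, so fix routing / firewall rules for UDP port 123 first.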
Hardware reliability note: Make sure you put your router and parent L3 switch on a UPS. If you can PoE power your downstream L3/L2 switches, or UPS power each of them, you will have a better time maintaining consistent network reliability through storms, outages, etc. Even without that, as long as your core router / L3 switch stays up, or comes up before the downstream hardware does, you will have a much better time with network stability. With STP enabled you can be nearly certain of very strange and bad network performance after a power event. I've confirmed this with half a dozen managed switch vendors, and with both Eero and Plume (Eero fixed most of their bad behavior with RSTP 2+ years back, but it does still happen from time to time; Plume still obliterates networks due to their automatic fallback to WiFi backhaul, with no way to disable this bad behavior).
- If someone from Plume eventually reads all of this and gets here, please give us the ability to disable WiFi backhaul, either from the mobile app or directly on the device itself, for those of us who never want a non-ethernet backhauled WiFi access point or mesh device on our network!
I will update this if and when Plume releases their teased WiFi 6 (802.11ax) hardware, with updated configurations where needed...