Not sure where to start - Wifi Interference

Trying to help out a customer of mine. Customer has a facility with 90 machines equipped with WiFi boards that communicate wirelessly with three Raspberry Pi gateways (GWs) on 2.4 GHz. The GWs are hardwired over Ethernet to an Allo fiber-based modem. 95% of the time life is great. A couple of times per week, always on weekend days, the GWs lose communication with all the WiFi boards. We have to reboot everything, including the boards. It’s a PITA. Yes, we’ve tried changing channels.

Pretty sure it’s interference, and possibly intentional. Question is the best reasonable way to track it down. I’d prefer not to spend $1k on Chanalyzer, but worst case I will. Would one of the software options work? Other suggestions? Thanks for the help!

Could be many things, from the Raspberry Pis to an electrical issue. After interference clears, the radios should resume normal operation on their own. Go there, pinpoint the time, and do a packet capture and spectrum analysis when the issue occurs.
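To pinpoint the time without sitting on site all weekend, something like the rough Python sketch below could run on any wired machine and log when devices stop answering. The addresses and port in TARGETS are placeholders I made up, and it assumes the gateways or boards accept a TCP connection on some port; swap in whatever actually answers on that network.

```python
import socket
import time
from datetime import datetime

# Hypothetical names, addresses and port: replace with devices that really answer.
TARGETS = {
    "gw1": ("192.168.1.11", 22),
    "gw2": ("192.168.1.12", 22),
    "gw3": ("192.168.1.13", 22),
}


def is_reachable(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def outage_windows(samples):
    """Collapse (timestamp, ok) samples into (first_bad, last_bad) windows."""
    windows = []
    start = None
    last_bad = None
    for ts, ok in samples:
        if not ok:
            if start is None:
                start = ts
            last_bad = ts
        elif start is not None:
            windows.append((start, last_bad))
            start = None
    if start is not None:
        windows.append((start, last_bad))
    return windows


def monitor(interval=30.0, logfile="reachability.log"):
    """Append one timestamped up/DOWN line per target every `interval` seconds."""
    while True:
        stamp = datetime.now().isoformat(timespec="seconds")
        with open(logfile, "a") as out:
            for name, (host, port) in TARGETS.items():
                state = "up" if is_reachable(host, port) else "DOWN"
                out.write(f"{stamp} {name} {state}\n")
        time.sleep(interval)
```

Call monitor() to start logging. Once the log shows when the DOWN lines begin, the packet capture and spectrum analysis can be scheduled around that window.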

Thank you. Anything you recommend for the packet capture?

Wow, they are using Raspberry Pis as access points in an enterprise operation? Perhaps they need to analyze cost vs. risk. Raspberry Pis are good little pieces of tech, but they are limited by the software loaded on them and by their hardware, memory in particular. First, you have 90 clients talking to 3 access points; at best that’s 30 clients per AP. That’s in a perfect world, and we don’t live there.

You won’t find this easily, but I actually ran the testing: an AP can usually only handle 7-12 transactions/second, given a 40-80K transaction size. Failures get logged for retry/retransmit. So if one AP gets congested and a transaction doesn’t get through, it logs it in RAM to resend, but in typical processing fashion these entries don’t get cleared; they are kept as error logs and will eventually fill all the RAM, which causes the AP to stop. Only a reboot will clear this. So if the clients tax one AP to capacity until it stops, they will move to the next available one and overload it, and the cascade takes out all 3. They used to call this a memory leak or memory drip, because it usually causes the operating portion of RAM to get overwritten, which creates the failure. This was blatantly prevalent in older APs and even some lower-cost APs. Everything networked suffers from this and needs periodic reboots, even cell phones.
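One way to check the RAM theory before buying anything is to log available memory on each gateway and see whether it drains toward zero ahead of a weekend failure. A minimal Python sketch, assuming the Pis run Linux with a /proc/meminfo that includes MemAvailable (the log file name is just a placeholder):

```python
import time
from datetime import datetime


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer value."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def available_percent(fields):
    """Percent of total RAM the kernel reports as available."""
    return 100.0 * fields["MemAvailable"] / fields["MemTotal"]


def log_memory(interval=60.0, logfile="gw_memory.log"):
    """Append a timestamped memory reading every `interval` seconds."""
    while True:
        with open("/proc/meminfo") as f:
            fields = parse_meminfo(f.read())
        stamp = datetime.now().isoformat(timespec="seconds")
        with open(logfile, "a") as out:
            out.write(
                f"{stamp} MemAvailable={fields['MemAvailable']}kB "
                f"({available_percent(fields):.1f}% of total)\n"
            )
        time.sleep(interval)
```

Call log_memory() on each gateway. If the percentage stays flat right up to the moment the boards drop off, memory exhaustion is probably not the cause; a steady decline would support it.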

Weekly reboots will help, as will increasing the SD card capacity, but I would add APs to reduce the client-to-AP ratio. I’m sure they are proud of the fact that they created their own APs, but the code isn’t up to their task requirements. This is the reason WiFi systems have a controller: to take the load off the individual APs’ resources. This won’t improve by using home-office APs from Best Buy or Amazon. Linksys, TP etc. don’t play well in groups.

Want to prove it is or isn’t interference? Download a WiFi analyzer app to your phone. It will let you see any signal sources within the facility, including SSID and MAC address. But my bet is on the hardware they are using and the way it is implemented.