Trying to help out a customer of mine. The customer has a facility with 90 machines equipped with Wi-Fi boards that communicate wirelessly with three Raspberry Pi gateways (GWs) on 2.4 GHz. The GWs are hard-wired over Ethernet to an Allo fiber modem. 95% of the time life is great. A couple of times per week, always on weekend days, the GWs lose communication with all the Wi-Fi boards, and we have to reboot everything, including the boards. It's a PITA. Yes, we've tried changing channels.
Pretty sure it's interference, and possibly intentional. The question is the best reasonable way to track it down. I'd prefer not to spend $1k on Chanalyzer, but worst case I will. Would one of the software options work? Other suggestions? Thanks for the help!
Could be many things, from the Raspberry Pis themselves to an electrical issue. Once interference stops, the radios should resume normal operation on their own. Go there, pinpoint the time it happens, and do a packet capture and spectrum analysis while the issue is occurring.
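If pinning down the timing is the first hurdle, something as simple as a reachability logger on one of the gateway Pis will timestamp exactly when the boards drop off. This is just a minimal Python sketch; the board IP addresses, log path, and one-minute interval are placeholders, not anything from the actual setup.

```python
#!/usr/bin/env python3
"""Log when the Wi-Fi boards stop answering, to pinpoint outage times.
Assumes the boards have known IPs on the gateway's subnet (the
192.168.1.x addresses below are placeholders)."""
import subprocess
import time
from datetime import datetime

# Placeholder list; replace with the real board addresses.
BOARDS = [f"192.168.1.{i}" for i in range(101, 111)]
LOG_FILE = "/var/log/board_reachability.log"
INTERVAL_S = 60  # check once a minute


def is_reachable(ip: str) -> bool:
    """One ICMP echo with a short timeout; returns True on a reply."""
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "2", ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def main() -> None:
    while True:
        down = [ip for ip in BOARDS if not is_reachable(ip)]
        if down:
            stamp = datetime.now().isoformat(timespec="seconds")
            with open(LOG_FILE, "a") as log:
                log.write(f"{stamp} unreachable: {', '.join(down)}\n")
        time.sleep(INTERVAL_S)


if __name__ == "__main__":
    main()
```

Run it under cron or as a systemd service and check the log after the next weekend failure; that gives you the window to capture in.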
Wow, they are using Raspberry Pis as access points in an enterprise operation? Perhaps they need to analyze cost vs. risk. Raspberry Pis are good little pieces of tech, but they are limited by the software loaded on them and by their hardware, especially memory. First, you have 90 clients talking to 3 access points; at best that's 30 clients per AP, and that's in a perfect world, which we don't live in.
You won't find this published easily, but from testing I actually ran, an AP can usually only handle about 7-12 transactions per second at a 40-80 KB transaction size. Failures get logged for retry/retransmit. So if one AP gets congested and a transaction doesn't get through, the AP logs it in RAM to resend; in typical fashion these entries don't get cleared, they're kept as error logs, and they eventually fill the available RAM, which causes the AP to stop. Only a reboot clears it. So if the clients tax one AP to capacity until it stops, they move to the next available AP and overload it, and the cascade takes out all three. This used to be called a memory leak (or memory drip) because it usually causes the operational portion of RAM to get overwritten, creating the failure. It was blatantly prevalent in older APs and even some lower-cost ones. Everything networked suffers from this to some degree and needs periodic reboots, even cell phones.
Weekly reboots will help, and increasing the SD card capacity may too, but I would add APs to reduce the client-to-AP ratio. I'm sure they're proud of the fact that they built their own APs, but the code isn't up to their task requirements. This is the reason Wi-Fi systems have a controller: to take the load off the individual APs' resources. It won't improve by using home-office APs from Best Buy or Amazon; Linksys, TP-Link, etc. don't play well in groups.
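If the memory-leak theory is worth testing before buying more hardware, a tiny logger on each gateway Pi would show whether available RAM actually trends downward heading into the weekend failures. A rough sketch, assuming the gateways run standard Raspberry Pi OS (Linux); the log path and interval are arbitrary.

```python
#!/usr/bin/env python3
"""Log available memory on a gateway Pi once a minute so a slow leak,
if that is really what is happening, shows up as a downward trend
leading into the weekend failures."""
import time
from datetime import datetime

LOG_FILE = "/var/log/gw_memory.log"
INTERVAL_S = 60


def mem_available_kb() -> int:
    """Read MemAvailable from /proc/meminfo (Linux only)."""
    with open("/proc/meminfo") as f:
        for line in f:
            if line.startswith("MemAvailable:"):
                return int(line.split()[1])
    raise RuntimeError("MemAvailable not found in /proc/meminfo")


def main() -> None:
    while True:
        stamp = datetime.now().isoformat(timespec="seconds")
        with open(LOG_FILE, "a") as log:
            log.write(f"{stamp} MemAvailable_kB={mem_available_kb()}\n")
        time.sleep(INTERVAL_S)


if __name__ == "__main__":
    main()
```

If the numbers stay flat right up to the moment the gateways die, that points away from memory exhaustion and back toward something in the RF environment.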
Want to prove whether or not it's interference? Download a Wi-Fi analyzer app onto your phone. It will let you see any signal sources within the facility, including SSID and MAC address. But my bet is on what they're using and how it's implemented.
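If nobody wants to walk the floor with a phone, roughly the same survey can be run from a gateway Pi itself. Here's a sketch that shells out to iwlist and lists nearby BSSIDs, SSIDs, and signal levels; the wlan0 interface name is an assumption, and it needs to run as root.

```python
#!/usr/bin/env python3
"""Quick survey of nearby networks from a gateway Pi, as an alternative
to a phone analyzer app. Parses `iwlist` scan output; `wlan0` is an
assumed interface name and the script must be run with sudo."""
import re
import subprocess

INTERFACE = "wlan0"  # adjust to the Pi's actual wireless interface


def scan(interface: str) -> list[dict]:
    """Run a scan and pull out BSSID, SSID, and signal level per cell."""
    output = subprocess.run(
        ["iwlist", interface, "scan"],
        capture_output=True, text=True, check=True,
    ).stdout

    cells = []
    for block in output.split("Cell ")[1:]:
        bssid = re.search(r"Address: ([0-9A-F:]{17})", block)
        essid = re.search(r'ESSID:"(.*)"', block)
        signal = re.search(r"Signal level=(-?\d+) dBm", block)
        cells.append({
            "bssid": bssid.group(1) if bssid else "?",
            "ssid": essid.group(1) if essid else "?",
            "signal_dbm": signal.group(1) if signal else "?",
        })
    return cells


if __name__ == "__main__":
    for cell in scan(INTERFACE):
        print(f'{cell["bssid"]}  {cell["signal_dbm"]} dBm  {cell["ssid"]}')
```

It won't show non-Wi-Fi interferers (that's what a spectrum analyzer is for), but it will flag any rogue SSIDs or unusually strong neighbors that only show up on weekends.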
Very well said, rthompson, and I echo what you said here! They are good little things, but they belong in a lab as a project, not in the daily operation of a network; sorry, but that's mad! If they had a laptop, they could download Wireshark (free and pretty awesome), connect to any of their APs, and log it all. I tend to write to an external hard drive so I can keep a day or two if need be. It is a lot to go through, but having a time reference cuts that down, and it lets me compare normal operation against the times when things go wrong. Again, very well said and totally on point.
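For the "log it all to an external drive" idea, a rolling capture keeps disk use bounded while still covering a day or two. A rough Python/scapy sketch rather than the Wireshark workflow described above; the interface, mount point, chunk length, and retention count are all placeholders.

```python
#!/usr/bin/env python3
"""Rolling packet capture to an external drive, so a day or two of
traffic can be kept and lined up against the failure times. Uses
scapy and needs root; interface and mount point are assumptions."""
import os
import time
from scapy.all import sniff, wrpcap

INTERFACE = "eth0"             # capture on the gateway's wired side
CAPTURE_DIR = "/mnt/usbdrive"  # external drive mount point
CHUNK_SECONDS = 900            # one file per 15 minutes
MAX_FILES = 200                # prune the oldest beyond this count


def prune_old_files() -> None:
    """Delete the oldest capture files once the cap is exceeded."""
    files = sorted(
        f for f in os.listdir(CAPTURE_DIR) if f.endswith(".pcap")
    )
    for name in files[:-MAX_FILES]:
        os.remove(os.path.join(CAPTURE_DIR, name))


def main() -> None:
    while True:
        # Collect one chunk of traffic, then write it out with a timestamp.
        packets = sniff(iface=INTERFACE, timeout=CHUNK_SECONDS, store=True)
        stamp = time.strftime("%Y%m%d-%H%M%S")
        wrpcap(os.path.join(CAPTURE_DIR, f"gw-{stamp}.pcap"), packets)
        prune_old_files()


if __name__ == "__main__":
    main()
```

The resulting pcap files open straight in Wireshark, so the time-reference trick mentioned above still applies: jump to the file whose timestamp brackets the outage and work outward from there.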
I cut my teeth on the original 802.11 at a whopping 2 Mbps back in 1999. Back then you had new firmware every week or so to address the bugs and oversights the technology encountered.
Wireshark is a great tool; it will tell you loads about what is going on in the network and will surprise you with things you never expected.
Oh, for sure, I've had a few surprises from using it. I've been using that program for a lot of years. I remember a tech trying to re-image a lab full of computers and having a ton of issues. I ran Wireshark while connected to that lab's switch and found it was the remote-control software (don't recall its name, but it was like Remote Desktop) generating so much traffic on that subnet that the Ghost software couldn't even "see" the computers. It was Wireshark that got me to that point. We uninstalled the remote software and Ghost worked as it should. I predate Wi-Fi and Ethernet, and to a certain extent the daisy-chained coaxial network; I was one of the first in the UK to install 10 networked PCs in a lab over coaxial cable, and even hooked up a shared printer… magic… LOL… so yup, I've been around too.