[Rack] [Noisebridge-discuss] network down this afternoon, an interesting guide for people who want to help when the network goes down
superq at gmail.com
Tue Jun 5 10:19:43 PDT 2012
To me it sounds like there was a rogue DHCP server or some kind of
DHCP problem going on, and that it had nothing to do with the links.
One good way we can fix this is to put DHCP server controls at the
switch and wifi network level. I don't know if the equipment we have
is capable of doing this. :/
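For illustration, here is a minimal sketch of the kind of check a DHCP server control performs: compare the server address on each observed DHCP offer against an allowlist. The allowlist address and the function name are hypothetical, not our actual config, and the offers are assumed to have been parsed out of a packet capture already.

```python
from collections import Counter

# Hypothetical allowlist; NOT the real address of any Noisebridge server.
TRUSTED_DHCP_SERVERS = {"10.0.0.1"}


def find_rogue_servers(offers, trusted=TRUSTED_DHCP_SERVERS):
    """Return {server_ip: offer_count} for offers sent by untrusted servers.

    `offers` is an iterable of (server_ip, offered_client_ip) tuples,
    for example extracted from captured DHCPOFFER packets.
    """
    counts = Counter(server for server, _ in offers if server not in trusted)
    return dict(counts)


if __name__ == "__main__":
    seen = [
        ("10.0.0.1", "10.0.0.50"),        # offer from the trusted server
        ("192.168.1.1", "192.168.1.20"),  # offer from an unknown server
        ("192.168.1.1", "192.168.1.21"),
    ]
    print(find_rogue_servers(seen))
```

A switch that supports this kind of control would drop the untrusted offers instead of just counting them; the sketch only shows the comparison.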
On Mon, Jun 4, 2012 at 9:28 PM, Danny O'Brien <danny at spesh.com> wrote:
> On Mon, Jun 4, 2012 at 9:14 PM, John Adams <jna at retina.net> wrote:
>> I wrote a pile of code that does exactly this sort of thing. If you need to
>> reset hardware when the internet goes down, this might help.
>> First, find/buy an el-cheapo baytech power controller (like the RPC-3). It's
>> a power controller that you can control over the network. I see these
>> things on eBay all the
>> time: http://www.ebay.com/sch/i.html?_trksid=p5197.m570.l1313&_nkw=baytech+rpc-3&_sacat=0
>> Then, download the software from my Github repo and configure the power
>> controller. Using my check_dsl script, you'll have a tool for instantly restarting
>> the DSL modem should you lose network connectivity. That's here:
>> WRT DHCP, configure two DHCP servers with DHCP load balancing. Then, if
>> you lose one server, you'll still be able to service the
>> pool: http://www.ipamworldwide.com/dhcp-failover-a-load-balancing/dhcp-load-balancing.html
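As a rough sketch of the load-balancing idea: each client is assigned to one of two servers by hashing its MAC address, and if the preferred server is down the survivor answers instead. The server names are hypothetical, and SHA-256 is only a stand-in here; real DHCP load balancing (RFC 3074) defines its own hash, which this does not reproduce.

```python
import hashlib

# Hypothetical server names, for illustration only.
SERVERS = ("dhcp-a", "dhcp-b")


def pick_server(mac, servers=SERVERS):
    """Deterministically assign a client MAC address to one server."""
    normalized = mac.lower().replace("-", ":")
    digest = hashlib.sha256(normalized.encode("ascii")).digest()
    # The first digest byte splits clients roughly evenly across servers.
    return servers[digest[0] % len(servers)]


def server_for(mac, alive, servers=SERVERS):
    """Return the preferred server if it is alive, else any survivor.

    `alive` is the set of server names currently responding.
    Returns None when no server is alive.
    """
    preferred = pick_server(mac, servers)
    if preferred in alive:
        return preferred
    survivors = [s for s in servers if s in alive]
    return survivors[0] if survivors else None
```

With both servers up, each handles about half the clients; with one down, `server_for` hands every client to the other, so the pool keeps being served.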
> We actually have a bunch of failover stuff (to go between
> monkeybrains/sonic), but there's been a fair bit of recent work on it,
> and I suspect what we're seeing is some intermittent hardware issue
> that's interfering with that. I defer to the ladies and gentlemen of
> rack to actually describe what we have right now.
> DHCP load balancing sounds like a good addition, but I wonder if we might
> be better off thinking about how to simplify the network so that
> it's fixable by humans when it breaks in obvious ways.
>> On Mon, Jun 4, 2012 at 9:02 PM, Danny O'Brien <dannyobrien at gmail.com> wrote:
>>> Summary: my suspicion is that our DHCP server is still on the fritz,
>>> but we got sidetracked. See
>>> https://www.noisebridge.net/pipermail/rack/2012-May/001543.html for
>>> how to reboot the DHCP server.
>>> 1. Somebody says to me "oh the Internet is down"
>>> 2. I say, in a friendly tone, "Do I look like I'm in charge?"
>>> 3. Person says, also friendly, "No-one is in charge, but you look like
>>> somebody who might know someone who can get the Internet back up" <--
>>> 10 anarchist hackerspace points!
>>> 4. I loudly say "Is the Internet down for everyone?"
>>> 5. Everyone mutters yes.
>>> 6. I loudly say "who wants to help fix it?"
>>> 7. General roar of acclaim, followed by hesitancy
>>> 8. Me and (Paula?) start loudly troubleshooting the Wall of Tubes,
>>> surrounded by people standing around trying to work out how to be useful.
>>> 9. I fail to make everyone useful. Minus ten anarchist hackerspace points to me.
>>> 10. Me and Paula decide to try powercycling the DSL modem, followed by
>>> resetting the DSL modem, because it only has 3 lights green and the
>>> Internet light is unlit
>>> 11. This turns out to be the wrong thing to do. Much of what I do next is
>>> based on getting us back out of this error.
>>> 11.5 I bug SuperQ about various things. He has a cold and is working,
>>> but is helpful.
>>> 12. Meanwhile, Hal turns up and notes that he can't get onto the
>>> network via an AP.
>>> 13. We work out that this is because he isn't getting an IP. Wired
>>> devices are also not getting IPs.
>>> 14. Hal learns about netmasks and /24s
>>> 15. I call up Sonic and they are awesome. They explain how to fix DSL
>>> modem. I leave note on modem instructing future noisebridgers not to
>>> reset DSL modem, and that 3 lights is fine.
>>> 16. Once that is fixed, people get on Internet again. I suspect DHCP
>>> server magically resolved itself.
>>> I will try and think of ways we can detect when stuff goes down. It's
>>> very hard for people in the space to know where to start, especially
>>> when all the docs about the network are a) only on the internets, and
>>> b) a bit out of date (are they out of date?). It's also a bit unfair
>>> on both SuperQ and Jof to depend on them to troubleshoot this stuff.
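On detecting when stuff goes down and knowing where to start: a minimal sketch of a layered check that separates "no IP" (DHCP) from "no uplink" (DSL), which is the distinction that got lost in the timeline above. The function names and messages are made up for illustration, and the hosts and ports to probe are left to whoever runs it.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds in time."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


def diagnose(has_ip, gateway_ok, internet_ok):
    """Map three layered checks to a hint about where to start looking."""
    if not has_ip:
        return "no IP address: suspect DHCP, not the uplink"
    if not gateway_ok:
        return "have an IP but cannot reach the gateway: suspect local switch, AP or router"
    if not internet_ok:
        return "gateway reachable but nothing beyond it: suspect the uplink"
    return "network looks fine from here"
```

`diagnose` takes plain booleans so it can be fed from `can_connect` results or from someone simply reading values off a laptop. It only reports; it deliberately resets nothing.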