October 01, 2004 (technical)
Fixing Obsidian's Network Problem
Came in on Monday the 27th to find that much of the cluster seemed to be messed up.
The workstations couldn't ping frontend-0, and frontend-0 didn't appear to have its routing or network addresses set up correctly.
Fix:
- halt all the machines
From frontend-0: "cluster-fork halt -t now"
- fix the frontend-0 network
From frontend-0: modified /etc/sysconfig/network-scripts/ifcfg-eth1,
replacing the DHCP option with a static IP option
New network settings:
IP: 172.16.2.240
Netmask: 255.255.0.0
This also automatically fixed the routing tables: now any 172.16.x.x traffic was going through eth1
- reboot the comp-pvfs nodes
- to aid in Daniel's quest, I then turned dhcpd off, since it was giving him dramas. (After resetting frontend-0, dhcpd will be started again.)
*nb: the dhcpd.conf on frontend-0 only gives out IPs to nodes that were assimilated via "insert-ethers", which meant that other machines requesting IPs from frontend-0 were hanging.
I considered modifying frontend-0's dhcpd.conf to allow it to give out IPs, but because the file is automatically updated via "insert-ethers", I thought it better not to touch it.
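For reference, a rough sketch of what the static ifcfg-eth1 on frontend-0 might look like after the change. Only the IP and netmask come from the notes above; the DEVICE, BOOTPROTO and ONBOOT lines are assumptions based on the usual Red Hat style ifcfg layout, not a copy of the actual file.

```shell
# /etc/sysconfig/network-scripts/ifcfg-eth1 (sketch, not the real file)
DEVICE=eth1            # assumed: matches the interface in the file name
BOOTPROTO=static       # assumed: replaces the previous DHCP setting
IPADDR=172.16.2.240    # from the notes above
NETMASK=255.255.0.0    # from the notes above
ONBOOT=yes             # assumed: bring the interface up at boot
```

With a 255.255.0.0 netmask, the interface covers the whole 172.16.0.0 network, which is consistent with 172.16.x.x traffic going out through eth1 once the address was set.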