Difference between revisions of "SAC:Admin and Troubleshooting"

m (+Category: Infrastructure)
(Emergency plans moved here)
Revision as of 11:38, 27 August 2007

Troubleshooting

NFS Mount Missing

The /home NFS mount on the various servers is often lost, for reasons that have not been identified.
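Before attempting any fix, it may help to confirm that the mount is actually missing. The following is a minimal diagnostic sketch, not a documented procedure: it assumes a Linux host where /proc/mounts is available, and the function name is_mounted is hypothetical. It only reports the state of the mount and changes nothing.

```shell
#!/bin/sh
# is_mounted DIR [MOUNTS_FILE]
# Hypothetical helper: returns 0 if DIR is listed as a mount point
# in MOUNTS_FILE (assumed default: /proc/mounts on Linux).
is_mounted() {
    dir=$1
    mounts=${2:-/proc/mounts}
    # The second field of each line in /proc/mounts is the mount point.
    awk -v d="$dir" '$2 == d { found = 1 } END { exit !found }' "$mounts"
}

# Report only; this does not remount anything.
if is_mounted /home; then
    echo "/home is mounted"
else
    echo "/home is NOT mounted"
fi
```

The optional second argument exists so the check can be pointed at a saved copy of a mount table instead of the live one.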

To fix...

LDAP Server Down

The LDAP server runs on .220. To restart it, run the following command as root:

/opt/fedora/slapd-ldapt/start-slapd

Peer1 LDAP Server Hanging

After a power outage, like the one on 2-20-07, slapd's database needs to be recovered:

sudo /usr/sbin/slapd_db_recover -h /var/lib/ldap/osgeo2

MySQL Cleanup for Drupal

If a Drupal report says that a table is crashed and needs repair, log into MySQL and repair that table. For example, for the cache table:

REPAIR TABLE cache QUICK;
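If more than one table is reported as crashed, the same statement can be generated per table. The sketch below is only an illustration: the helper name repair_sql and the second table name (sessions) are assumptions, not taken from this page. It prints the statements and does not connect to MySQL.

```shell
#!/bin/sh
# repair_sql TABLE
# Hypothetical helper: print a REPAIR TABLE ... QUICK statement for TABLE.
# Rejects names containing anything other than letters, digits, or "_".
repair_sql() {
    case $1 in
        ''|*[!A-Za-z0-9_]*)
            echo "invalid table name: $1" >&2
            return 1 ;;
    esac
    printf 'REPAIR TABLE %s QUICK;\n' "$1"
}

# Example table names; "sessions" is an assumed second table.
for t in cache sessions; do
    repair_sql "$t"
done
```

The printed statements can be pasted into the mysql prompt, or piped to the mysql client against whichever database holds the Drupal tables (its name is not recorded here).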

Entire www.osgeo.org down

  • ISP/DNS problem: what to do? Do we need to call anyone?
  • Hardware reset: Shawn Barnes (+1 613.565.5056 - Ottawa business hours), Howard Butler, Tyler Mitchell, Frank Warmerdam (+1 613.754.2041 - anytime). One option is to power cycle osgeo.org through the UPS, using the "Reboot Immediate" item on the UPS.

TODO: Define rescue plan with responsible people

Contact User with Shell Access

If services or the operating system need restarting, or something else needs emergency attention, contact one of the following people with shell access directly:

  • Tyler Mitchell - +1-250-277-1621 - tmitchell at osgeo.org - timezone GMT-7
  • who else...?

PEER 1 Trouble Ticket Process

TODO: add details here or point to elsewhere?