
Wednesday, January 19, 2011

Unix Systems Administration - Server Health Checks

A.     Maintaining the System Running Environment
# top
Check the node to make sure that no single process is consuming all of the available memory.

Check the system's free physical memory and confirm that the free swap space meets the system's requirements.

The screen lists the 15 most active processes that are currently running on the node.

For a single process, the CPU usage should be less than 40%. The total CPU usage should be less than 70%.

The memory utilization (real - free)/real should be less than 70%.
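
The memory formula above can be sketched as a small helper. This is only an illustration: the function names `mem_util` and `mem_ok` are made up for this example, and the two values have to be read off the `top` header by hand, in the same unit.

```shell
# mem_util REAL FREE: print (real - free) / real as a whole-number percentage.
# REAL and FREE must use the same unit (for example MB).
mem_util() {
    awk -v real="$1" -v free="$2" 'BEGIN {
        if (real <= 0) exit 1          # avoid dividing by zero
        printf "%d\n", (real - free) * 100 / real
    }'
}

# mem_ok REAL FREE: succeed only while utilization is below the 70% guideline.
mem_ok() {
    [ "$(mem_util "$1" "$2")" -lt 70 ]
}

# Example: 8192 MB real, 4096 MB free
#   mem_util 8192 4096
#   mem_ok 8192 4096 && echo "memory OK"
```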
# df -h
Check the free space on each of the system's file systems.
The usage of the file system space should be less than 80%.
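
The 80% rule can be checked automatically. The sketch below assumes the six-column layout that POSIX `df -P` prints (capacity in column 5, mount point in column 6) and mount points without spaces; the function name `over_limit` is made up for this example.

```shell
# over_limit LIMIT: read df output on stdin and print the mount point and
# usage of every file system at or above LIMIT percent.
over_limit() {
    awk -v limit="$1" 'NR > 1 {        # NR > 1 skips the header line
        use = $5
        sub(/%/, "", use)              # "85%" -> "85"
        if (use + 0 >= limit + 0) print $6, $5
    }'
}

# Typical use:
#   df -P | over_limit 80
```

No output means every file system is below the limit.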
# ntpq -p
Check the time synchronization.
A table of time sources (peers) appears. The line for the source that the node is currently synchronized to is marked with an asterisk (*).
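
Looking for the asterisk can be scripted. A minimal sketch, assuming the marker is the first character of the line as in the usual `ntpq -p` layout; the function name `sync_peer` is made up for this example.

```shell
# sync_peer: read `ntpq -p` output on stdin and print the source marked "*".
# The exit status is non-zero when no line is marked, which suggests the
# node is not currently synchronized.
sync_peer() {
    awk '/^\*/ { print substr($1, 2); found = 1 }
         END   { exit !found }'
}

# Typical use:
#   ntpq -p | sync_peer || echo "no synchronized time source"
```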
B.    Maintaining the Unix System Logs
# more /var/adm/messages
Check this file for error messages.

Make sure the log file does not contain any of the following keywords, which indicate abnormal conditions:
warning, panic, error, fail, exception, fatal
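
Paging through a large file with `more` is slow, so the keyword check can be sketched with `grep`. This assumes a `grep` that supports the POSIX `-E` and `-i` options; the function name `scan_log` is made up for this example.

```shell
# scan_log FILE: print every line of FILE that contains one of the keywords,
# ignoring case, prefixed with its line number.
scan_log() {
    grep -n -i -E 'warning|panic|error|fail|exception|fatal' "$1"
}

# Typical use:
#   scan_log /var/adm/messages | more
```

Because the match is a substring match, "fail" also catches "failed" and "failure".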
# cd /var/log/
# ls -ltr
# more syslog.x

Here “x” is the number of the most recent syslog file; ls -ltr lists the newest file last.
Find the most recent system log file and analyze it.
Make sure the log file does not contain any of the following keywords, which indicate abnormal conditions:
warning, panic, error, fail, exception, fatal, reject, alarm, etc.
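
Picking the newest syslog file can also be scripted instead of read off the listing. A small sketch, assuming the files are named `syslog*` as above; the function name `latest_log` is made up for this example.

```shell
# latest_log DIR: print the most recently modified syslog* file in DIR.
# ls -t sorts newest first; -d keeps ls from descending into directories.
latest_log() {
    ls -td "$1"/syslog* 2>/dev/null | head -n 1
}

# Typical use, with the extended keyword list:
#   grep -n -i -E 'warning|panic|error|fail|exception|fatal|reject|alarm' \
#       "$(latest_log /var/log)"
```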
C.    Maintaining the Unix System Processes
# ps -ef

Check which processes are running on the server, which user started each process (UID), its process ID (PID), and the command that started it (CMD).
A list of all processes currently running on the system appears. The PID is needed when you wish to “kill” a process, and the CMD shows the command to use to start the process again.
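
On a busy server the full listing is long, so a per-user summary can help spot an account that is running an unusual number of processes. A minimal sketch, relying only on UID being the first column of `ps -ef`; the function name `procs_per_user` is made up for this example.

```shell
# procs_per_user: read `ps -ef` output on stdin and print the number of
# processes per UID, busiest user first.
procs_per_user() {
    awk 'NR > 1 { n[$1]++ }                       # NR > 1 skips the header
         END    { for (u in n) print n[u], u }' | sort -rn
}

# Typical use:
#   ps -ef | procs_per_user
```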
# kill -9 PID_X
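Here PID_X is the process ID of the process to kill.
Kill the process. The -9 option sends SIGKILL, which the process cannot catch or ignore, so it gets no chance to shut down cleanly.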

Note the command used to START the process (CMD) before you kill it; you might want to START it again.

Use this command to kill processes that are exhausting system resources, as identified with the top command.

After killing the process in question, run top again to confirm that system resources have been freed.
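
Noting the CMD before killing is easy to forget, so the two steps can be combined. This is only a sketch: the function name `kill_noting_cmd` and the default log path `/tmp/killed_cmds.log` are made up for this example, and it assumes a `ps` that accepts the POSIX `-o args=` and `-p` options.

```shell
# kill_noting_cmd PID [LOGFILE]: append the PID and its command line to
# LOGFILE (hypothetical default: /tmp/killed_cmds.log), then kill -9 it.
kill_noting_cmd() {
    pid=$1
    log=${2:-/tmp/killed_cmds.log}
    # args= prints the full command line without a header line
    cmd=$(ps -o args= -p "$pid" 2>/dev/null) || cmd="unknown"
    printf '%s %s\n' "$pid" "$cmd" >> "$log"
    kill -9 "$pid"
}

# Typical use:
#   kill_noting_cmd PID_X
#   tail -1 /tmp/killed_cmds.log     # the command needed to start it again
```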
D.    Maintaining the TCP/IP Ports
# netstat -an | grep port_x

Here port_x is the port number on which the application listens or communicates with external applications.
Verify that the TCP/IP connection is in the “LISTEN” or “ESTABLISHED” state.
If the port is functioning properly, the output shows LISTEN or ESTABLISHED. A hung connection typically shows CLOSE_WAIT.
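
A plain `grep port_x` also matches longer port numbers that contain the same digits, and many lines of output are tedious to read. The sketch below counts connections per state for one exact port. It assumes the state is the last field on each TCP line and that addresses end in `:port` or `.port`; the function name `port_states` is made up for this example.

```shell
# port_states PORT: read `netstat -an` output on stdin and print each TCP
# state seen for PORT with the number of lines in that state.
port_states() {
    awk -v port="$1" '{
        hit = 0
        # look for PORT at the end of any address field (not the last field)
        for (i = 1; i < NF; i++)
            if ($i ~ ("[.:]" port "$")) hit = 1
        # count only lines whose last field looks like a TCP state
        if (hit && $NF ~ /^[A-Z_0-9]+$/) n[$NF]++
    }
    END { for (s in n) print s, n[s] }' | sort
}

# Typical use:
#   netstat -an | port_states port_x
```

A count of CLOSE_WAIT lines that keeps growing is worth investigating.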