README (April 2010)

README.icol
VERSION = '4.15'

The collector package consists of the following files:

  - icol.exe         The data collector executable.
  - icol.rc          The default collector configuration file.
  - icol.rc.sample   Configuration example.

REQUIREMENTS:

  - Operating System:
      AIX v5 TL5 or higher.
      Linux (2.6 recommended).
      Sun Solaris (5.8/5.9/5.10).
      HP-UX.
  - Disk Space:
      At least 128MB of free space, recommended 250-500MB.
  - Configuration:
      The collector requires a configuration file (by default this is
      icol.rc).
  - Software:
      Requires SSH (ssh/scp). Recommended: OpenSSH 5.1 or higher.

SUPPORTED HARDWARE:

  Specific data collection for:

  - pSeries HMC (tested on v3-v7)
  - Disk Subsystems:
      HP EVA
      IBM ESS
      EMC Clariion/Symmetrix

PURPOSE:

  The data collector stores operating system information in a single file
  that can be analysed remotely using a combination of automated and manual
  methods.

  The collector output file stores important system state and configuration
  information, useful for health checks, trend analysis, audits and disaster
  recoveries.

  The output is usually analysed by Glacier Consulting's expert team, using
  software that analyses the current state of the system and identifies
  potential issues.

BASIC MODES OF OPERATION:

  Parallel collection via IDIST (recommended):

  - IDIST can be used to run the collector on multiple remote systems in
    parallel (see README.idist), as well as to retrieve the collected
    information to a local directory.

  Agentless:

  - It is possible to run the collector against a remote client machine
    without actually installing the collector software on it.
  - This is typically used on pSeries HMC systems, where root access is
    restricted.
  - This feature is fully supported under IDIST.

ICOL FEATURES:

  - Runlevels
      Collection operates according to specific runlevels, so information can
      be gathered for a healthcheck or for a more complete audit of the
      server. It also supports custom runlevels, so that snapshots can be
      taken (e.g.
      capturing all network-related information). Common runlevels are 1 for
      healthchecks and 2 for audits, denoted L1 and L2.

  - Versioning
      The collector program is able to maintain a number of versions of
      collected data, and provides archiving/expiration functions for
      previously collected output.

  - Customization
      The collector can be customized to collect ad-hoc information.

  - Timeouts / Deadlines
      It supports deadlines, a maximum allowed period of time during which
      each command must complete, in case the operating system is
      misbehaving.

  - Compression
      The collector will automatically determine the most suitable
      compression method for the output on each target system, in order to
      reduce output file sizes to a minimum. In addition, the collector can
      enforce size limits for files and output that it collects.

Safety features:

  - Maximum output limits
      A maximum size limit can be set for file and command output to prevent
      excessively sized output files. In addition, it is possible to specify
      a head/tail truncation method for data that exceeds output limits.

  - Locking
      The collector prevents multiple instances of itself from running on the
      same client machine at the same time.

  - Agentless operation mode
      The -s parameter of icol.exe allows the user to collect data from a
      remote host without requiring any type of collector agent installation.

USAGE GUIDELINES:

  1. Install the collector package (see INSTALL).

     On AIX systems, the collector is installed to /usr/local/eye.
     On HMC systems, the collector should be installed on the main monitoring
     server where the EYE utilities are to be installed.

  2. Configure the collector.

     - The collector configuration file icol.rc is stored in the installation
       directory.
     - The EYE collector installation will prepare a generic icol.rc upon
       installation.
     - The collector will be set up with the following defaults:

         * output directory: /var/eye (COLDIR = /var/eye)
         * time-limit per command: 60 seconds (CMD_TIMEOUT = 60)
         * maximum command output size: 2.5 MB (MAXCMDSIZE = 2621440)
         * maximum file output size: 2.5 MB (MAXFILESIZE = 2621440)
         * directory for lock-file: /tmp/.eye.lock.UNAME (LOCKDIR = /tmp)
         * directory for log: /tmp/icol.log (LOGDIR = /tmp)
         * compress the output file (COMPRESS = yes)

       These defaults can be overridden on each client machine, using the
       settings described in the icol.rc.sample file.

  3. Run the collector (to gather healthcheck data):

     Distributive Method (parallel execution):

       In this method, IDIST is used to collect/retrieve data in parallel
       against all configured client machines, after SSH trusted keys have
       been set up.

       - Log in to the main central monitoring server (where IDIST is
         installed).
       - To collect health-check data, use idist.exe collect RUNLEVEL
         GROUPNAME:

             ./idist.exe collect prod

       - To collect data for a specified runlevel on all configured client
         machines:

             ./idist.exe collect RUNLEVEL GROUP

         e.g.

             ./idist.exe collect 2 all

         where
             (Runlevel 1 = healthcheck)
             (Runlevel 2 = audit)

       GROUP must correspond to a group that has been configured in the EYE
       database using iconfig. If a group of "all" is specified, the
       collector will be run on all active servers.

     Manual Collection on client machine:

       In this method, data is collected and stored on a remote client
       machine. The stored collector file will have to be retrieved or copied
       at a later stage.

       1. Log in as root on the client machine.

          Note: as of v4.15, it is possible to run the collector as a
          non-root user. Please refer to the INSTALL document ("non-root
          installation" section) for instructions on how to configure this.

       2. Run:

             /usr/local/eye/icol.exe -r 1

     Manual Agentless Collection:

       In this method, data is collected and stored by the ICOL tool on the
       central monitoring server, without storing any program or data on the
       remote client machine.

       1. Ensure the collector is installed and configured on the main
          monitoring server.

       2. To run the collector against one or more client machines:

          Command:
             icol.exe -s SSHUSER@HOSTNAME

          - Example:
             icol.exe -s hscroot@SERVERNAME

          Note: The -u option can be used in case the name used in the
          locally saved collector file needs to be different from the client
          machine host-name. Refer to README.iconfig for more information on
          hostname override options.

          - Example:
             icol.exe -s hscroot@SERVERNAMELONG -u MYSERVERNAME

       In agentless collection, output is directed to the local directory
       specified by the COLDIR setting in the local icol.rc file, and no data
       is stored on the remote host. Retrieval is therefore not required.
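
       The agentless steps above can be sketched as a small wrapper that
       walks a list of targets one at a time. This is only an illustration,
       not part of the collector package: the function names (icol_cmd,
       collect_all), the RUN variable and the example.com host names are
       hypothetical, and the icol.exe path assumes the AIX installation
       directory mentioned earlier. Only the -s and -u options described in
       this README are used. By default the sketch prints the commands
       instead of running them.

```shell
#!/bin/sh
# Assumed install location (AIX default from this README); override via ICOL.
ICOL="${ICOL:-/usr/local/eye/icol.exe}"

# Print the icol.exe command line for one agentless target.
#   $1 = SSHUSER@HOSTNAME
#   $2 = optional name override for the saved collector file (passed via -u)
icol_cmd() {
    if [ -n "${2:-}" ]; then
        printf '%s -s %s -u %s\n' "$ICOL" "$1" "$2"
    else
        printf '%s -s %s\n' "$ICOL" "$1"
    fi
}

# Dry run by default: print one command per target.
# Set RUN=1 (hypothetical switch for this sketch) to execute the commands.
collect_all() {
    for target in "$@"; do
        cmd=$(icol_cmd "$target")
        if [ "${RUN:-0}" = "1" ]; then
            # Unquoted on purpose: split the command string into words.
            $cmd || echo "collection failed for $target" >&2
        else
            echo "$cmd"
        fi
    done
}

# Placeholder targets; replace with real SSHUSER@HOSTNAME values.
collect_all hscroot@hmc1.example.com hscroot@hmc2.example.com
```

       Keeping the command construction in its own function makes it easy to
       review the exact command lines before anything is executed; output
       still lands in the directory given by COLDIR in the local icol.rc.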