
Veritas Cluster (VCS) Cheat Sheet

Overview

Veritas Cluster Server (VCS) is a high-availability system from Symantec that combines multiple servers connected to shared storage devices. VCS links commodity hardware with intelligent software to provide application failover and control. If a node or application fails, VCS takes predefined actions to keep the services running within the cluster. VCS monitors the systems and their services, and the systems in the cluster communicate over a private network.

A switchover is an orderly shutdown of an application (or of the operating system of a cluster machine) and its supporting resources on one server, followed by a controlled startup on another server under VCS.

A failover is a situation where applications and resources stop abruptly. An ordered shutdown of the applications on the original node may not be possible, so the services are started on another node.

The process of starting the application on the node is identical in a failover or switchover.

CLUSTER COMPONENTS:

  • Resources
  • Resource Dependencies
  • Resource Categories
  • Service Groups
  • Agents
  • High-Availability Daemon (HAD)
  • Low Latency Transport (LLT)
  • Traffic Distribution
  • Heartbeat
  • Group Membership Services/Atomic Broadcast (GAB)
  • Cluster Membership
  • Cluster Communications

LLT and GAB files

/etc/llthosts The file is a database, containing one entry per system, that links the LLT system ID with the host name. The file is identical on each server in the cluster.
/etc/llttab The file contains information that is derived during installation and is used by the utility lltconfig.
/etc/gabtab The file contains the information needed to configure the GAB driver. This file is used by the gabconfig utility.
/etc/VRTSvcs/conf/config/main.cf The VCS configuration file. The file contains the information that defines the cluster and its systems.
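
As a rough sketch of what two of these files usually hold: /etc/llthosts has one "<node-id> <hostname>" pair per line, and /etc/gabtab typically holds a single gabconfig call that seeds the cluster once the expected number of nodes has joined. The helper names make_llthosts and make_gabtab and the host names sun1/sun2 are illustrative, not part of VCS, and the node IDs are assumed to start at 0 in argument order.

```shell
# Print llthosts-style lines ("<node-id> <hostname>") for the given hosts.
# Assumption: node IDs start at 0 and follow the argument order.
make_llthosts() {
    id=0
    for host in "$@"; do
        printf '%s %s\n' "$id" "$host"
        id=$((id + 1))
    done
}

# Print a gabtab-style line that seeds GAB once <n> nodes have joined.
make_gabtab() {
    printf '/sbin/gabconfig -c -n%s\n' "$1"
}

# Example (prints to stdout only; review before copying into /etc by hand):
#   make_llthosts sun1 sun2
#   make_gabtab 2
```

The functions only print text, so the result can be compared against the existing files on each node before anything is changed.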

LLT Commands

Verify that links are active for LLT lltstat -n
Verbose output of the lltstat command lltstat -nvv | more
Open ports for LLT lltstat -p
Display the values of LLT configuration directives lltstat -c
List information about each configured LLT link lltstat -l
List all MAC addresses in the cluster lltconfig -a list
Stop LLT lltconfig -U
Start LLT lltconfig -c
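
A minimal sketch for checking the lltstat -n output in a script. It assumes each data row ends in "<hostname> <STATE> <link-count>", which is how the command usually prints its node table; the function name llt_degraded_nodes is made up for this example.

```shell
# Read `lltstat -n` output on stdin and print every node that is not OPEN
# or that has fewer links than expected ($1).
# Assumption: data rows end in "<hostname> <STATE> <link-count>".
llt_degraded_nodes() {
    awk -v want="$1" '
        # Header lines do not end in a number, so they are skipped.
        NF >= 4 && $NF ~ /^[0-9]+$/ && $(NF-1) ~ /^[A-Z]+$/ {
            if ($(NF-1) != "OPEN" || $NF + 0 < want + 0)
                print $(NF-2), $(NF-1), $NF
        }'
}

# Usage on a cluster node with two private links:
#   lltstat -n | llt_degraded_nodes 2
```

No output means every listed node is OPEN with at least the expected number of links.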

GAB Commands

Verify that GAB is operating gabconfig -a
Stop GAB gabconfig -U
Start GAB gabconfig -c -n <number of nodes>
Override the seed values in the gabtab file gabconfig -c -x

GAB Port Membership

List Membership gabconfig -a
Unregister port f /opt/VRTS/bin/fsclustadm cfsdeinit
Port – Function
a – GAB driver
b – I/O fencing (designed to guarantee data integrity)
d – ODM (Oracle Disk Manager)
f – CFS (Cluster File System)
h – VCS (VERITAS Cluster Server: high availability daemon)
o – VCSMM driver (kernel module needed for Oracle and VCS interface)
q – QuickLog daemon
v – CVM (Cluster Volume Manager)
w – vxconfigd (module for cvm)
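
The port letters above can be pulled out of gabconfig -a with a short filter. This is a sketch that assumes membership lines have the form "Port <letter> gen <id> membership <nodes>"; the function name gab_port_members is hypothetical.

```shell
# Read `gabconfig -a` output on stdin and print "<port> <member-node-ids>"
# for every port that reports a membership.
# Assumption: lines look like "Port a gen a36e0003 membership 01".
gab_port_members() {
    awk '$1 == "Port" && $3 == "gen" && $5 == "membership" { print $2, $6 }'
}

# Usage on a cluster node:
#   gabconfig -a | gab_port_members
```

On a healthy two-node cluster running only VCS, ports a and h would be expected to appear, each with both node IDs in the membership column.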

Cluster daemons

High Availability Daemon had
Companion Daemon hashadow
Resource Agent daemon <resource>Agent
Web Console cluster management daemon CmdServer

Cluster Log Files

Log Directory /var/VRTSvcs/log
Primary log file (engine log file) /var/VRTSvcs/log/engine_A.log

Starting Cluster

Start cluster with local config in ‘stale’ state hastart -stale
Start cluster with stale config in ‘valid’ state hastart -force
Bring the cluster into running mode from a stale state using the configuration file from a particular server hasys -force <server_name>

Stopping Cluster

Stop the cluster on the local server but leave the application/s running, do not fail over the application/s hastop -local -force
Stop cluster on local server but evacuate (fail over) the application/s to another node within the cluster hastop -local -evacuate
Stop the cluster on all nodes but leave the application/s running hastop -all -force

Cluster Status

Display cluster summary hastatus -summary
Continually monitor cluster hastatus
Verify the cluster is operating hasys -display
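
A sketch for turning hastatus -summary into a quick health check. It assumes the group rows start with "B", followed by the group name and the system, and end with the state, which is the usual summary layout; the function name groups_not_online is invented for this example.

```shell
# Read `hastatus -summary` output on stdin and print each service group
# that is not ONLINE on any system.
# Assumption: group rows look like "B <group> <system> <probed> <autodisabled> <state>".
groups_not_online() {
    awk '
        $1 == "B" { seen[$2] = 1; if ($NF == "ONLINE") online[$2] = 1 }
        END { for (g in seen) if (!(g in online)) print g }
    ' | sort
}

# Usage on a cluster node:
#   hastatus -summary | groups_not_online
```

An empty result means every group listed in the summary is online somewhere in the cluster.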

Cluster Details

Information about a cluster haclus -display
Value for a specific cluster attribute haclus -value <attribute>
Modify a cluster attribute haclus -modify <attribute name> <new>
Enable LinkMonitoring haclus -enable LinkMonitoring
Disable LinkMonitoring haclus -disable LinkMonitoring

System Operations

Add a user hauser -add <username>
Modify a user hauser -update <username>
Delete a user hauser -delete <username>
Display all users hauser -display
Add a system to the cluster hasys -add <sys>
Delete a system from the cluster hasys -delete <sys>
Modify a system's attributes hasys -modify <sys> <modify options>
List a system state hasys -state
Force a system to start hasys -force <sys>
Display a system's attributes hasys -display [-sys <sys>]
List all the systems in the cluster hasys -list
Change the load attribute of a system hasys -load <system> <value>
Display the value of a system's node ID (/etc/llthosts) hasys -nodeid
Freeze a system (no offlining of the system, no onlining of groups) hasys -freeze [-persistent] [-evacuate] <sys>
Unfreeze a system (re-enable bringing groups and resources online) hasys -unfreeze [-persistent] <sys>

Dynamic Configuration

Change configuration to read/write mode haconf -makerw
Change configuration to read-only mode haconf -dump -makero
Check what mode the cluster is running in haclus -display | grep -i 'readonly'
  • 0 = write mode
  • 1 = read-only mode
Check the configuration file hacf -verify /etc/VRTSvcs/conf/config
Convert a main.cf file into cluster commands hacf -cftocmd /etc/VRTSvcs/conf/config -dest /tmp
Convert a command file into a main.cf file hacf -cmdtocf /tmp -dest /etc/VRTSvcs/conf/config
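
Most configuration changes follow the same pattern: open the configuration, make the change, then save it and return to read-only mode. The wrapper below is one possible way to script that pattern. The name with_rw and the RUN switch are assumptions for this sketch; by default it only prints the commands.

```shell
# RUN is a hypothetical switch: leave it as "echo" to print the commands,
# or set RUN="" on a cluster node to actually execute them.
RUN=${RUN-echo}

# Run one configuration command between haconf -makerw and haconf -dump -makero.
with_rw() {
    $RUN haconf -makerw || return 1
    $RUN "$@"
    status=$?
    # Save and return to read-only even if the change itself failed.
    $RUN haconf -dump -makero
    return "$status"
}

# Example (dry run, prints three commands):
#   with_rw hagrp -modify groupw AutoStartList sun1 sun2
```

Keeping the dry run as the default makes it possible to review the exact command sequence before touching a live cluster.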

Service Groups

Add a service group haconf -makerw
hagrp -add <group>
hagrp -modify <group> SystemList sun1 1 sun2 2
hagrp -autoenable <group> -sys sun1
haconf -dump -makero
Delete a service group haconf -makerw
hagrp -delete <group>
haconf -dump -makero
Change a service group haconf -makerw
hagrp -modify <group> SystemList sun1 1 sun2 2 sun3 3
haconf -dump -makero
List the service groups hagrp -list
List the groups dependencies hagrp -dep <group>
List the parameters of a group hagrp -display <group>
Display a service group’s resource hagrp -resources <group>
Display the current state of the service group hagrp -state <group>
Clear a faulted non-persistent resource in a specific group hagrp -clear <group> [-sys <sys>]
Change the system list in a cluster hagrp -modify <group> SystemList -delete <hostname>
hagrp -modify <group> SystemList -add <hostname> 1
hagrp -modify <group> AutoStartList <host> <host>
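
The SystemList attribute takes alternating system and priority values. The sketch below prints the "add a service group" sequence for any number of systems so it can be reviewed before use. The function name print_add_group and the names groupw, sun1 and sun2 are illustrative, and the priorities are assumed to start at 1, as in the SystemList entries above.

```shell
# Print (not run) the command sequence that adds a service group.
# $1 is the group name, the remaining arguments are the systems in priority order.
print_add_group() {
    group=$1
    shift
    first=$1
    pairs=""
    prio=1    # assumption: priorities start at 1
    for sys in "$@"; do
        pairs="$pairs $sys $prio"
        prio=$((prio + 1))
    done
    echo "haconf -makerw"
    echo "hagrp -add $group"
    echo "hagrp -modify $group SystemList$pairs"
    echo "hagrp -autoenable $group -sys $first"
    echo "haconf -dump -makero"
}

# Example:
#   print_add_group groupw sun1 sun2
```

Because the function only prints, its output can be saved to a file, checked, and then run by hand on a cluster node.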

Service Group Operations

Start a service group and bring its resources online hagrp -online <group> -sys <sys>
Stop a service group and take its resources offline hagrp -offline <group> -sys <sys>
Switch a service group from one system to another hagrp -switch <group> -to <sys>
Enable all the resources in a group hagrp -enableresources <group>
Disable all the resources in a group hagrp -disableresources <group>
Freeze a service group (disable online and offline) hagrp -freeze <group> [-persistent]
Unfreeze a service group (enable online and offline) hagrp -unfreeze <group> [-persistent]
Enable a service group (only enabled groups can be brought online) haconf -makerw
hagrp -enable <group> [-sys <sys>]
haconf -dump -makero
Disable a service group (prevents it from being brought online) haconf -makerw
hagrp -disable <group> [-sys <sys>]
haconf -dump -makero
Flush a service group and enable corrective action. hagrp -flush <group> -sys <system>

Resources

Add a resource haconf -makerw
hares -add <resource> DiskGroup <group>
hares -modify <resource> Enabled 1
hares -modify <resource> DiskGroup <diskgroup-name>
hares -modify <resource> StartVolumes 0
haconf -dump -makero
Delete a resource haconf -makerw
hares -delete <resource>
haconf -dump -makero
Change a resource haconf -makerw
hares -modify <resource> Enabled 1
haconf -dump -makero
Make a resource attribute global (one value for all systems) hares -global <resource> <attribute> <value>
Make a resource attribute local (a separate value per system) hares -local <resource> <attribute>
List the parameters of a resource hares -display <resource>
List the resources hares -list
List the resource dependencies hares -dep

Resource Operations

Online a resource hares -online <resource> -sys <sys>
Offline a resource hares -offline <resource> -sys <sys>
Display the state of a resource (offline, online, etc.) hares -state
Display the parameters of a resource hares -display <resource>
Offline a resource and propagate the command to its children hares -offprop <resource> -sys <sys>
Cause a resource agent to immediately monitor the resource hares -probe <resource> -sys <sys>
Clear a faulted resource (automatically initiates the onlining) hares -clear <resource> [-sys <sys>]

Resource Types

Add a resource type hatype -add <type>
Remove a resource type hatype -delete <type>
List all resource types hatype -list
Display a resource type hatype -display <type>
List the resources of a particular type hatype -resources <type>
Display the value of a resource type's attribute hatype -value <type> <attr>
Change a resource type's attribute hatype -modify <type> <attr> <value>

Resource Agents

Add an agent pkgadd -d . <agent package>
Remove an agent pkgrm <agent package>
Change an agent N/A
List all ha agents haagent -list
Display an agent's run-time information (e.g. has it started, is it running?) haagent -display <agent_name>
Display agent faults haagent -display | grep Faults
Start an agent haagent -start <agent_name> -sys <sys>
Stop an agent haagent -stop <agent_name> -sys <sys>
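
The grep for Faults shows every agent, including those with a count of zero. A small filter can narrow that down. This sketch assumes haagent -display prints rows of the form "<agent> <attribute> <value>"; the function name agents_with_faults is hypothetical.

```shell
# Read `haagent -display` output on stdin and print "<agent> <fault-count>"
# for every agent whose Faults value is greater than zero.
# Assumption: rows look like "DiskGroup Faults 0".
agents_with_faults() {
    awk '$2 == "Faults" && $3 + 0 > 0 { print $1, $3 }'
}

# Usage on a cluster node:
#   haagent -display | agents_with_faults
```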