FAQ

What does a typical SAS Enterprise Session Monitor deployment look like?

A deployment consists of one or more SAS Enterprise Session Monitor agents, a single instance of the SAS Enterprise Session Monitor server application, and any number of client machines that access the SAS Enterprise Session Monitor application interface through the thin-client, browser-based front end.

A SAS Enterprise Session Monitor agent runs on each node that runs analytic workloads (for example, a SAS Application Server). The agent collects process and server events and metrics and submits them to the SAS Enterprise Session Monitor server application. The server application collects metric data from every agent and stores it in the SAS Enterprise Session Monitor database. It also serves the web-based front end for the SAS Enterprise Session Monitor application interface and performs a minimal level of scheduled database maintenance.

Figure: SAS Enterprise Session Monitor Overview

How does SAS Enterprise Session Monitor get its metrics from SAS?

To maximise compatibility and stability, and to eliminate any possibility of interfering with the user processes being monitored, the integration of SAS Enterprise Session Monitor with Base SAS sessions relies on the SAS DATA step and a common filesystem location for inter-process communication.

Each SAS Enterprise Session Monitor agent instance is configured with its own events directory, which it continually monitors for event files written by processes that need to communicate with it. As soon as an event file appears in this location, the agent reads it and, if the data it contains is valid, removes the file. The agent then interprets the information or instruction in the event file and acts on it, either by starting to monitor a new process or by forwarding the data in the event file to the SAS Enterprise Session Monitor server.
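The read, validate, remove and act cycle described above can be sketched as follows. This is a minimal illustration, not the agent's actual implementation: the on-disk format of event files is not stated here, so JSON is assumed, and the function name process_events_once and the handle callback are hypothetical.

```python
import json
from pathlib import Path


def process_events_once(events_dir, handle):
    """Scan events_dir once; remove each valid event file and pass its data to handle().

    Assumption: each event file holds one JSON object. The real file format
    is not documented in this FAQ.
    """
    handled = 0
    for path in sorted(Path(events_dir).iterdir()):
        if not path.is_file():
            continue
        try:
            event = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            # Unreadable or invalid content: leave the file where it is.
            continue
        if not isinstance(event, dict):
            continue
        path.unlink()   # the data is valid, so the file is removed
        handle(event)   # then the agent acts on the event
        handled += 1
    return handled
```

A real agent would call a function like this repeatedly (or use filesystem notifications), and handle would either begin monitoring a process or forward the data to the server.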

How are events communicated to the SAS Enterprise Session Monitor server?

Deploying the SAS Enterprise Session Monitor agent onto a shared filesystem allows an agent instance to be started on each node in the cluster without requiring multiple installations. To avoid conflicts that could arise from non-unique PIDs, each node has its own dedicated events directory. In such a multi-node or GRID installation, where the filesystem is shared across multiple nodes, the layout of the event directories looks something like this:

opt
   ESM
     esm-agent
       events
         node1_hostname
         node2_hostname
           esm_eventfile_1
           esm_eventfile_2
           esm_eventfile_3
         node3_hostname
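As a rough sketch of how a process might locate the events directory for its own node in the layout above: the base path is read off the example tree (assumed to be rooted at /opt), the helper name node_events_dir is hypothetical, and falling back to the system hostname when ESMNODENAME is unset is an assumption rather than documented behaviour.

```python
import os
import socket
from pathlib import PurePosixPath


def node_events_dir(base="/opt/ESM/esm-agent/events", env=None):
    """Return the per-node events directory under a shared base directory."""
    env = os.environ if env is None else env
    # Assumption: use ESMNODENAME when set, otherwise the system hostname.
    node = env.get("ESMNODENAME") or socket.gethostname()
    return PurePosixPath(base) / node


# Example with an explicit environment mapping:
print(node_events_dir(env={"ESMNODENAME": "node2_hostname"}))
```

Because every node writes only to its own subdirectory, two processes with the same PID on different nodes never produce conflicting event files.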

Multiple types of events are supported by SAS Enterprise Session Monitor.

new process events

A new event tells SAS Enterprise Session Monitor to begin monitoring a process and communicates the relevant attributes of that process. The following properties are typically communicated:

  • pid (required) - the ID of the process to be monitored
  • hostname (required) - the hostname of the machine (must match configured agent hostname / ESMNODENAME environment variable)
  • owner (required) - the name of the user that the process can be attributed to
  • sasUuid - a unique identifier for this 'session'. Session-provided UUIDs are useful when this value needs to be propagated to sub-sessions as an environment variable for the purposes of reconciliation with parent jobs/sessions. If no value is provided, the agent generates one automatically
  • queue - the name of the queue that this session / job belongs to
  • jobName - the identifier of the session. For jobs this is typically the job name.
  • workFolder - the temporary directory attributed to the session as transient WORK storage (SAS specific). Can be an array of directory locations*.
  • utilFolder - the temporary directory attributed to the session as transient UTIL storage (SAS specific). Can be an array of directory locations*.
  • logFile - the logfile to be attributed to the session and parsed in real time for events.
  • logs - a list of logfiles, for jobs that generate more than one logfile or require more than one log to be followed
  • esmType (required) - the 'session type' is a SAS Enterprise Session Monitor attribute. Typically it is one of WS, PWS, STP, Batch, GRID, LASR, JVM or SYS, but categories can be added dynamically. SYS sessions are not shown by default, and cannot be acted on (terminated) by users.
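The property list above could be assembled and checked along these lines. This is a sketch under stated assumptions: JSON serialisation is assumed, the helper name build_new_event is hypothetical, and how the event type ("new") is marked in the file is not covered because this FAQ does not describe it. Only the four properties marked "required" above are enforced.

```python
import json

# The properties marked "(required)" in the list above.
REQUIRED_NEW_EVENT_FIELDS = ("pid", "hostname", "owner", "esmType")


def build_new_event(**props):
    """Validate the required properties and return the event as a JSON string.

    Assumption: JSON is used for illustration only.
    """
    missing = [k for k in REQUIRED_NEW_EVENT_FIELDS if props.get(k) in (None, "")]
    if missing:
        raise ValueError("missing required properties: " + ", ".join(missing))
    return json.dumps(props, sort_keys=True)


# Example values are illustrative only.
payload = build_new_event(
    pid=12345,
    hostname="node2_hostname",
    owner="sasdemo",
    esmType="Batch",
    jobName="nightly_load",
)
print(payload)
```

Optional properties such as sasUuid, queue, workFolder or logFile can be passed as additional keyword arguments.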

tag events

A tag event is a basic event that is attributed to a process at a given time and contains contextually relevant information. It is intended to help programmers identify progress between code blocks or functions, but it can be used for any purpose where overlaying contextual flags on the time series is beneficial.

A basic tag event has the following properties:

  • text - the title of the tag event, searchable from the Tag Search
  • tooltip - detailed information about the event, shown when the user hovers over the flag to display the tooltip
  • color - the colour of the tag flag, in HTML colour notation

highlightStart and highlightEnd events

Highlights are called highlightStart and highlightEnd for legacy reasons and are better described as jobStart and jobEnd events. These are a special type of tag event and are used for communicating information specific to jobs, such as job return codes and job flow information.

A highlightStart event requires the following:

  • pid - the process ID of the job in question
  • hostname - the hostname of the machine the job is executing on (must match configured agent hostname / ESMNODENAME environment variable)
  • uuid - the code-generated unique ID for the job in question. The purpose of this ID is to reconcile the data communicated in the highlightEnd tag with the PID of the job

A highlightEnd event requires the following:

  • hostname - the hostname of the machine the job is executing on (must match configured agent hostname / ESMNODENAME environment variable)
  • uuid - the code-generated unique ID for the job in question. The purpose of this ID is to reconcile the data communicated in the highlightEnd tag with the PID of the job
  • text - the identifier for the job, typically the job name matching the jobName identifier in the new event
  • returnCode - the exit code, or completion status, with which the job terminated (i.e. 0 = success, 1 = warning, 2+ = error). Return codes of 3 and 6 (ABORT exits and internal errors) are treated as errors
  • flow - a colon-separated string of identifiers containing the job's position within the LSF flow hierarchy. This expects the verbatim value of the LSB_JOBNAME environment variable, from which superfluous variables such as user name or LSF job ID are stripped
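The returnCode convention above (0 = success, 1 = warning, 2 and above = error) can be expressed as a small helper. The function name classify_return_code is hypothetical, and rejecting negative values is an assumption, since the text does not say how they are treated.

```python
def classify_return_code(code):
    """Map a job return code to the status described above."""
    if code < 0:
        # Assumption: negative codes are outside the documented convention.
        raise ValueError("return code must be non-negative")
    if code == 0:
        return "success"
    if code == 1:
        return "warning"
    return "error"  # 2 and above, including 3 (ABORT) and 6 (internal error)


for rc in (0, 1, 3, 6):
    print(rc, classify_return_code(rc))
```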

A note on UUIDs and highlight tags

The highlight tag mechanism may appear convoluted, but it exists to reconcile job PIDs with return (exit) codes. When a SAS 'Job' is launched, the executing script (for example, sasbatch.sh) spawns a SAS process, and when the job finishes, that script collects the return code of the SAS subprocess. To ensure a unique relationship between the monitored session and the exit code returned on job termination, the uuid must be generated and exported by the parent context (the sasbatch.sh process). The highlightStart tag, which the job process generates at startup once the subprocess ID is known, can then be linked to the exit code reported back to the parent process.
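The parent-side half of this handshake might look like the sketch below. It is written in Python purely for illustration (the text above refers to a shell script, sasbatch.sh), and the environment variable name ESM_JOB_UUID and the function run_job are hypothetical. The parent generates the uuid, exports it to the child's environment, and pairs it with the return code once the child exits.

```python
import os
import subprocess
import sys
import uuid


def run_job(command, env_var="ESM_JOB_UUID"):
    """Launch a job with a parent-generated uuid and collect its return code.

    Assumption: env_var is a hypothetical name. The child would read it
    and use the same uuid in its highlightStart event.
    """
    job_uuid = str(uuid.uuid4())
    child_env = dict(os.environ)
    child_env[env_var] = job_uuid          # exported by the parent context
    completed = subprocess.run(command, env=child_env)
    # These two values are what a highlightEnd event needs to tie the
    # exit code back to the session that emitted highlightStart.
    return {"uuid": job_uuid, "returnCode": completed.returncode}


# Stand-in for a SAS job: a child process that exits with code 1 (warning).
result = run_job([sys.executable, "-c", "import sys; sys.exit(1)"])
print(result["returnCode"])
```

Generating the uuid in the parent rather than in the job means both sides know it before the job's PID exists, which is what allows the two events to be matched afterwards.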