Enterprise Session Monitor for SAS: User Guide
What does ESM do?
Enterprise Session Monitor for SAS keeps an eye on the server-side operation and performance of a modern client/server based SAS environment. By giving users of the environment full visibility of server activity, it helps everyone concerned better understand the system's capacity, its relative day-to-day performance and the impact of individual users' actions on overall performance.
What is a Session?
When a user signs on to a desktop SAS application such as Enterprise Guide or DI Studio, SAS starts a new supporting process on the SAS server on behalf of that newly started desktop application. This counterpart process (the server-side 'session') exists in an idle state most of the time, performing computational work as and when instructed by the user via the desktop application. When a user request completes, the session returns any resulting data back to the user. When the desktop SAS application is closed the server-side session also closes.
These server sessions are fundamental to the operation of a server-based SAS environment. Depending on its size, the number of concurrent sessions active in a deployment can range from ten or twenty to a few thousand.
Why it is useful to Monitor Sessions
Computing resource is limited. As data volumes and teams of users grow, the increased workload can cause the performance of a shared environment to degrade. Desktop SAS applications can also become disconnected from their server counterparts, for example when a user forcibly closes the desktop application or their network connection drops. In these cases a 'zombie' session can linger, consuming resource indefinitely.
Providing users with full visibility of the sessions that are active on the server and highlighting which sessions and users are consuming the majority of resource can prove invaluable in maintaining a stable, predictable multi-user SAS environment.
How ESM works
When a server-side SAS session is started (such as when a user logs on to a desktop application or when a batch job executes), ESM detects its key attributes and monitors its resource consumption until the session has finished. ESM lets users see the measurements as they are being collected in real time, as well as storing them in its database. The stored measurements are then available to users for later retrospective analysis and investigation, done via the ESM web interface.
How to access ESM
The ESM interface is a thin-client web application which requires no desktop components to be installed. Once ESM is deployed and configured, your Administrator should send you a link which will let you access ESM via your web browser. The URL may contain a port number at the end, and look something like this:
Opening the URL in the browser will load the ESM web interface. It is recommended that you use a reasonably recent version of Google Chrome or Mozilla Firefox. Microsoft Internet Explorer 9 is supported but not advised.
Basic Concepts in ESM
ESM is a web-based application, with a Graphical User Interface that aims to reproduce the look and feel of a traditional desktop-based application. Instead of a menu system, navigation in ESM is facilitated by a large toolbar that sits across the top of the application window.
This documentation refers to this toolbar as the ESM Ribbon. Most functionality in ESM is accessed using the Ribbon.
Data in ESM is presented using a number of Views. Views are opened by clicking on their buttons on the ESM Ribbon. Each view serves a different purpose:
- a Live View presents the data as it is collected, constantly updated in real time
- a Historical View explores the data from the perspective of a segment of time across all activity
- a Job View investigates the data from the perspective of a Batch Job name or job name pattern.
In addition to these, two 'Visual Searches' are available:
- the Job Search presents resulting Batch Job runtimes as a Gantt chart
- the Tag Search visualises the occurrences of a given Tag or Tag Pattern as a scatter graph.
Both the Views and Searches are discussed in more detail in the next chapter of this document.
To make the exploration of data as intuitive and easy as possible, ESM uses a tabbed layout. Every time a View or Search is opened, it appears in a new Tab, allowing multiple data views to remain open at the same time.
This allows the user to explore the data non-destructively, allowing them to keep certain steps in their investigation open for later comparison or reference. Tabs can also be renamed, where more meaning is required or desired.
Most of the performance data presented through the ESM interface is organised into 'Portlets'. Portlets aim to emulate the look and feel of traditional desktop application windows, and some can be resized, rearranged or closed by the user. The different types of portlet used by ESM are explained in more detail in later chapters of this document.
Sessions in ESM
A 'Session' in ESM is any remote (server-side) process which has been started by a SAS server. The most familiar type of this would be a 'Workspace Session', which is a remote process started by SAS, for example when a user uses SAS Enterprise Guide to connect to an application server context such as SASApp to execute their project code. Workspace sessions are just one type of server-side process managed by SAS; common session types include Pooled Workspace sessions, SAS/CONNECT sessions, Stored Process sessions and Batch jobs, among others. While they serve different purposes, all of these different types of remote session execute SAS code and compete for resource in the same way.
Each SAS Session that starts is assigned a Process ID number (PID) by the server's operating system. This ID uniquely identifies that process on that server. If the session was started on a multi-server distributed SAS environment, SAS may have picked one of a number of servers (Nodes) to start the session on. All nodes are uniquely identifiable by their hostname.
On the most basic level, this is how all SAS sessions are identified - by a hostname (the name given to the node the session is executing on) and a PID (the ID assigned to the session by that node). However, these alone tell us little about what those sessions are doing and which SAS user they belong to, which is why ESM also uses some additional attributes to help identify and track each session.
One of the main attributes that sessions in ESM can be grouped or filtered by is Session Type, discussed above. ESM also identifies sessions by their User. For Metadata aware session types such as Workspace Sessions, the user attribute always corresponds to the Metadata Username of the user who started the session; for most other session types the user attribute corresponds to the Operating System user under which that process is executing.
Finally, ESM introduces the concept of a Session Name. This is a customisable attribute which can be configured in various ways to help sessions be identified more easily. By default, Workspace and Pooled Workspace session names contain the name of the application server context they are executing on, Batch session names contain the name of the executing batch job, and Stored Process session names contain the number of the port that they are listening on. As ESM allows session names to be customised by the administrator, your environment may use a different naming convention to the defaults described here.
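The identification scheme described above can be pictured as a small record keyed by hostname and PID. The following Python sketch is purely illustrative: the class name, field names and sample values are hypothetical and are not part of ESM.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Session:
    """Illustrative model of the attributes used to identify a session."""
    hostname: str      # node the session is executing on
    pid: int           # process ID assigned by that node's operating system
    session_type: str  # Session Type, for example a Workspace Session
    user: str          # Metadata username or operating system user
    name: str          # customisable Session Name

    @property
    def key(self) -> tuple[str, int]:
        # A PID is only unique within one node, so the pair forms the identity.
        return (self.hostname, self.pid)


# Hypothetical sample data: the same PID appearing on two different nodes.
a = Session("node1", 4242, "Workspace", "alice", "SASApp")
b = Session("node2", 4242, "Workspace", "bob", "SASApp")
sessions = {s.key: s for s in (a, b)}
print(len(sessions))  # 2
```

Keying on the pair rather than on the PID alone keeps sessions from different nodes apart even when their operating systems happen to assign the same number.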
In ESM, a Job is a special, extended type of session that is generally reserved for SAS Batch Jobs, i.e. deployed programs which are executed by a scheduler. In ESM, Jobs differ from standard sessions in that:
- Jobs have a Return Code attribute, which denotes whether the job completed successfully, with warnings, or with errors.
- The performance data shown by a Session portlet contains extra highlighting when displaying a Job, in order to make the graphs easier to read.
- Job logs are tracked in real time by ESM, and program Warnings and Errors are displayed as part of the performance data.
Jobs can be searched separately from other sessions and have a dedicated pane on the ESM ribbon. For more information, see the Job View section.
Another feature of ESM is the ESM Tag. Tags are small, labelled flags which are displayed on the time-series graph at the time that they occur. Tags can also display extended data when hovered over by a mouse cursor, and their colour can be customised by the user.
Although they are used by ESM with other session types, tags are primarily intended as a programming and debugging tool to be used by SAS developers in place of the %PUT statement. Because tags appear on the graph in real time, they can be used to separate different steps in a program when evaluating or comparing performance. They can also be used to communicate runtime information back to the developer, without requiring them to read through the program logs after each execution, or having to wait for the program to finish executing.
Tags can be written to ESM by calling the %esmtag() macro. The macro accepts three parameters: The flag label, the text to be shown on mouseover, and the colour of the tag. Tags accept some basic HTML tags for line breaks and text formatting, and the colour parameter is expected in HTML format. Only the first parameter is required.
Some example syntax:
%if &sqlObs. > 0 %then %esmtag(SQL OK, The number of rows read from CLASS was &sqlObs., #AAFFAA);
%else %esmtag(SQL EMPTY, The step produced no data, #DDAAAA);
In this example, the two calls to the %esmtag macro would produce a green flag and a red flag on the session graph respectively. The following screenshot shows what the text in the first call looks like when hovered over with the mouse cursor:
ESM tags are also easily searchable from the ESM front-end. They are designed to be used to quickly locate the performance data for previous executions of a particular project or piece of code. For more information see the section titled Tag View.
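Where tag calls are generated by a script rather than typed by hand, the parameter rules for %esmtag() can be checked before the call is written into a program. The Python helper below is a hypothetical sketch, not something shipped with ESM. It assumes that colours are given as #RRGGBB, that trailing optional parameters may simply be left off, and that a colour is never supplied without a mouseover text. It rejects commas because an unmasked comma in a positional macro parameter is read by SAS as a parameter separator.

```python
import re
from typing import Optional

# Assumption: only six-digit hexadecimal HTML colours are accepted here.
HEX_COLOUR = re.compile(r"^#[0-9A-Fa-f]{6}$")


def esmtag_call(label: str, text: Optional[str] = None,
                colour: Optional[str] = None) -> str:
    """Build the text of a %esmtag() call (hypothetical helper)."""
    if not label:
        raise ValueError("the flag label is required")
    if colour is not None and text is None:
        # Assumption: this sketch never emits an empty positional parameter.
        raise ValueError("a colour must follow a mouseover text")
    for value in (label, text):
        if value is not None and "," in value:
            raise ValueError("an unmasked comma would split the parameter")
    if colour is not None and not HEX_COLOUR.match(colour):
        raise ValueError("this sketch only accepts #RRGGBB colours")
    args = [arg for arg in (label, text, colour) if arg is not None]
    return "%esmtag(" + ", ".join(args) + ");"


print(esmtag_call("SQL OK", "Rows read: &sqlObs.", "#AAFFAA"))
print(esmtag_call("STEP 2"))
```

The first call prints `%esmtag(SQL OK, Rows read: &sqlObs., #AAFFAA);` and the second prints `%esmtag(STEP 2);`, relying on only the label being required.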
Exploring the Data
Clicking on the Live View button in the Ribbon opens the Live View in a new tab. If a Live View tab already exists, clicking on this button sets the Live View tab as the active tab.
The Live View is ESM's real-time view. It lists all sessions currently executing on every node of the SAS environment being monitored. Selecting one of the listed sessions loads the last five minutes of performance data for that session. This data is then continually updated, at a default interval of every two seconds.
The Live View consists of the Live Sessions portlet on the left, a Session portlet on the top right, and a Node portlet on the bottom right. The List portlet is used to navigate the list of available sessions, while the Session and Node portlets display the data for the session selected.
The Session Portlet and Node Portlet in ESM are permanently synchronised, meaning that both portlets will always display data for the same time window, including when one of the portlets is zoomed. The Node portlet will also always display data for the node that the selected Session executed on. In cases where multiple Session portlets are showing (for example as part of a History View), the Node portlet will always correspond to the most recently opened Session portlet.
The Live Sessions portlet displays a list of all active or recently active sessions. Session filtering and navigation is performed using this List portlet on the left side of the Live View.
All currently active sessions are listed in the list portlet tree, by default grouped by the node they are currently executing on. The data in the tree is split into four columns (1) showing the following session attributes:
- the user column shows the username under which the session is executing. In addition, the icon represents the session type for that session.
- the name column shows the identifier for that session. This value can vary between session types and depends on chosen configuration options.
- the pid column shows the OS Process ID of the session on the server.
- the type column shows the session type for a given session. Session types are displayed in abbreviated form: PWS stands for Pooled Workspace Session, WS stands for Workspace Session, CS stands for Connect Session, and so on. The sessions shown can vary according to the system type and configuration.
Note: For more information about Session Attributes, see the Session Attributes section of this document.
The List Portlet contains additional panes which facilitate the grouping and filtering of the Session List:
- The Group Sessions By pane (2) lets the list of sessions be grouped by type rather than executing node.
- The Filter Sessions By pane (3) can apply a 'team filter' to the session list, so that only sessions owned by certain users are shown. This filter is configured via the 'User Settings' dialog.
- The Display Session Type pane (4) lets the list of sessions be filtered by session type. By default, all session types apart from System sessions are shown here.
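The combined effect of the three panes can be sketched in a few lines of Python. This only illustrates the grouping and filtering logic described above and is not ESM code: the row layout, the sample data and the "SYSTEM" type label are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical rows: (node, user, name, pid, type)
rows = [
    ("node1", "alice", "SASApp", 101, "WS"),
    ("node1", "carol", "connect", 102, "CS"),
    ("node2", "bob", "SASApp", 101, "PWS"),
    ("node2", "sas", "spawner", 7, "SYSTEM"),  # assumed label for System sessions
]


def visible(rows, group_by="node", team=None, hidden_types=frozenset({"SYSTEM"})):
    """Return {group: [rows]} after the type filter and optional team filter."""
    index = {"node": 0, "type": 4}[group_by]
    groups = defaultdict(list)
    for row in rows:
        user, session_type = row[1], row[4]
        if session_type in hidden_types:
            continue  # System sessions are hidden by default
        if team is not None and user not in team:
            continue  # the team filter keeps only the listed users
        groups[row[index]].append(row)
    return dict(groups)


print(sorted(visible(rows, group_by="type")))  # ['CS', 'PWS', 'WS']
```

Passing `team={"alice"}` would leave only the single session owned by that user, whichever grouping is chosen.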
The list of recently active sessions updates automatically every 30 seconds. To force an update, click the Refresh button (5) on the right side of the portlet title bar.
Clicking on the down arrow that becomes visible when hovering over the list grid column headers allows the user to show or hide columns and to configure advanced sorting options.
To see how the My Team filter works, refer to the User Settings section.
The Session Portlet uses a time-series line graph to visualise the performance data of a session.
The graph contained in the Session portlet shows the CPU, Memory and Disk space usage for the session chosen from the List portlet. The CPU and Memory measurements are updated at a default interval of two seconds, while the SASWORK and SASUTIL directory sizes are measured at a default interval of 10 seconds.
The data is displayed on the Session Graph as follows:
- CPU consumption is represented by the red area of the graph, with its axis shown to the left of the graph. 100% of CPU usage represents one fully saturated CPU thread. See (section on threads and usage) for more details
- Memory measurements are represented by an orange line, the axis for which is shown to the right of the graph.
- Disk usage measurements are represented by bars on the graph, the axis for which is shown to the right of the Memory axis. Blue bars represent the size of the SASWORK directory, and green bars represent the size of the SASUTIL directory.
It is possible to 'zoom' in on the graph by using the mouse to click and drag along the chart to select the timespan of interest. When this is done, the live graph will stop scrolling and the smaller 'Navigator' graph immediately below the main graph will show the area of the graph that has been zoomed into. The handles on the navigator graph can subsequently be used to adjust the zoom level, and the selected area of the navigator graph can be dragged to move the chosen time window while maintaining the same zoom level.
The individual elements of the graph can be hidden by clicking on the corresponding label in the graph legend. The icons on the top right of the portlet also allow it to be maximised or closed.
The 'Last Tag' button at the top right of the portlet will search for the last occurrence of a Tag on the selected session and, if one is found, will open a Historical View showing the last occurring tag on that session. For more information about tags, see the section on Tags above.
The Node Portlet uses a time-series line graph to visualise the overall performance of a chosen node.
The Node Graph shows the total CPU and Memory consumption for all processes executing on the selected node. It also shows the read and write performance for a chosen storage device. Like the Session graph, this graph is updated in real time, and any data displayed by a session graph is always synchronised to its corresponding node graph.
Data is displayed in the Node graph as follows:
- CPU consumption is represented by the red area (1) of the graph, the axis for which is shown to the left of the graph. 100% of CPU usage represents one fully saturated CPU thread. See (section on threads and usage) for more details.
- Memory measurements are represented by an orange line (2), the axis for which is shown to the right of the graph.
- When a disk device is chosen from the dropdown menu, Disk usage measurements are represented by bars (3) on the graph, the axis for which is shown to the right of the Memory axis. Blue bars represent the read speed of the selected disk device, and green bars represent its write speed.
Disk devices are chosen using the dropdown menu in the top right of the Node portlet. If no disk device is chosen, the default view does not show any disk device metrics.
A Historical View is a data exploration that allows the user to investigate performance from the perspective of a time window. Data is visualised in an almost identical manner to the Live View, with the few important differences outlined below. The Session and Node portlets are covered in full detail in the Live view section above.
A Historical View requires the inputs of a start time, an end time and a node. A Historical View tab can be opened in one of three ways:
The first method is to define a time window manually by entering the beginning and end time in the boxes in the ribbon (1). Setting the 'From' time will automatically set the 'To' time to one hour later. Clicking the Calendar button (2) will then open a Historical View for the selected timespan.
The second method is to take the timespan from a current data exploration: if a Session or Node portlet is visible and displaying data, the Graph button (3) will open a new History View for that node using the visible timeframe as input.
The third method is to open an initial History View from the Home Screen, by selecting a time period in the '24 hour performance' graph and clicking the Graph button.
Note: The time range chosen in the History View is subject to a maximum timespan, configurable by your ESM Administrator. The default value for this is 6 hours.
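The two timing rules above (a 'To' time that defaults to one hour after 'From', and a maximum timespan that defaults to 6 hours) amount to simple date arithmetic. The Python sketch below illustrates them; the function names and the example date are hypothetical, and how ESM itself responds to an over-long window is not specified here.

```python
from datetime import datetime, timedelta

DEFAULT_SPAN = timedelta(hours=1)  # 'To' defaults to one hour after 'From'
MAX_SPAN = timedelta(hours=6)      # default maximum, configurable by the administrator


def default_to(start: datetime) -> datetime:
    """Return the 'To' time implied by a newly set 'From' time."""
    return start + DEFAULT_SPAN


def within_limit(start: datetime, end: datetime,
                 max_span: timedelta = MAX_SPAN) -> bool:
    """True if the window is non-empty and no longer than the maximum timespan."""
    return timedelta(0) < end - start <= max_span


start = datetime(2024, 1, 15, 9, 0)  # arbitrary example date
print(default_to(start))                                # 2024-01-15 10:00:00
print(within_limit(start, start + timedelta(hours=8)))  # False
```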
Resource Breakdown (List Portlet)
When a Historical View is opened for a given timeframe, ESM calculates the CPU and SASWORK share of each session active during that period. This data is displayed in the 'Resource Breakdown' pane of the History View List Portlet:
This pie chart shows how the consumption of CPU and SASWORK resources is split among the sessions that were active during the time period being viewed. Clicking on either the CPU (1) or SASWORK (2) buttons at the top of the pane will change the data shown by the chart accordingly. Clicking on any of the slices in the chart (3) will open the data for that session in a new Session chart, as will clicking on a session in the Session List above it.
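A resource share of this kind is each session's consumption divided by the total for all sessions in the window. The Python sketch below shows that arithmetic only; it assumes each session's consumption has already been reduced to a single total, and the session labels and figures are invented for illustration. It does not describe how ESM aggregates its measurements internally.

```python
from __future__ import annotations


def shares(totals: dict[str, float]) -> dict[str, float]:
    """Convert per-session totals into percentage shares of the overall total."""
    overall = sum(totals.values())
    if overall == 0:
        # No consumption at all in the window: every slice is zero.
        return {session: 0.0 for session in totals}
    return {session: 100.0 * value / overall for session, value in totals.items()}


# Hypothetical CPU totals for three sessions active in the chosen window.
cpu_totals = {"node1:101": 30.0, "node1:102": 10.0, "node1:103": 60.0}
cpu_shares = shares(cpu_totals)
print(cpu_shares)
```

The same function applies unchanged to SASWORK sizes, which is what switching between the two buttons on the pane corresponds to in this model.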
If the History View was opened from an existing exploration (using the second method described above), it will already be displaying Session and Node data. By default that session will be 'pinned', and clicking on another session will open a second Session portlet alongside it for easier comparison.
Further sessions can be 'pinned' by clicking on this icon, on the right side of the Session portlet title bar:
Where sessions from different nodes are pinned (possible with an Environment Wide History View), the Node portlet will show data for the node corresponding to the most recently opened Session.
Environment Wide Historical View (All Nodes)
Where resources are shared across nodes in a distributed environment, the user may wish to view how the consumption of resources, especially SASWORK, is shared across all sessions irrespective of the node they executed on. This view can be opened by clicking the down arrow next to the Graph button on the Ribbon, and selecting 'Overview for all nodes':
A 'Resource Breakdown' pie chart will appear. Resource breakdown charts for All-Node History Views are not labelled with a hostname as the data they show represents multiple nodes.
Job History Search
This button appears on the right side of a Session Portlet title bar, unless the portlet is shown as part of a Live View.
When clicked, it performs a History Search for a Job with the same name. For more information on Jobs, including the Job Search, see the next section, titled Job View.
Note: The Jobs section of this document explains how a 'Job' in ESM is different to a standard SAS session.
The Job View lets the user look at the performance data of individual job executions, or runs, from the starting position of a Batch Job name or job name pattern. The view can be opened either directly, by entering a job name pattern in the Job name box (1) and then clicking the List button (2), or by entering a pattern in the Job name box (1) and first using the Search button (3) to visualise the search results.
When the view is opened directly, all past runs of jobs which match the entered pattern are displayed as a list in the List Portlet. Clicking an item in the list loads the performance data in a similar manner to the Live and History views. The Job View only shows instances of job execution for which the detail data is still retained by the system.
When the Search button (3) is clicked, the results are visualised in a Gantt-style bar chart, showing the start and end time, run time and success status of each job execution that matches the search pattern.
The Job Search results are aligned to an X axis that shows a 'time of day' in order to compare day-on-day job runtime and/or start time. The Process ID and day of execution is shown along the left side of the graph, and the time of day is shown across the top. Runs which ran successfully are presented as Green; those that finished with a Warning are presented as Yellow, and any erroring runs are shown in Red. Hovering over a run with the mouse shows extended information about that run, and clicking on it will open that run in a Job View.
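Aligning runs to a 'time of day' axis means plotting each run by its offset from midnight on its own day, so that runs from different days line up under one another. The Python sketch below illustrates that mapping together with the colour coding described above. The function name, status strings and dictionary keys are hypothetical and are not taken from ESM.

```python
from datetime import datetime

# Colours as described for the Job Search chart; the status names are assumed.
STATUS_COLOUR = {"success": "green", "warning": "yellow", "error": "red"}


def gantt_row(pid: int, start: datetime, end: datetime, status: str) -> dict:
    """Describe one run as a bar positioned by time of day."""
    midnight = start.replace(hour=0, minute=0, second=0, microsecond=0)
    return {
        "label": f"{pid} {start:%Y-%m-%d}",                # Process ID and day of execution
        "start_s": (start - midnight).total_seconds(),     # position on the time-of-day axis
        "length_s": (end - start).total_seconds(),         # run time
        "colour": STATUS_COLOUR[status],
    }


# Two hypothetical runs of the same job on consecutive days.
monday = gantt_row(3101, datetime(2024, 1, 15, 2, 0), datetime(2024, 1, 15, 2, 30), "success")
tuesday = gantt_row(3388, datetime(2024, 1, 16, 2, 15), datetime(2024, 1, 16, 3, 5), "warning")
print(tuesday["start_s"] - monday["start_s"])  # 900.0 seconds later than the previous day
```

Because both bars share the same axis, the later start and the longer run on the second day are visible at a glance, which is the day-on-day comparison the chart is intended to support.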
Jobs that exited with a status of Warning or Error can also be specifically searched for.
Clicking the down arrow (4) next to the Job Search button will present two more options for the Job Search: "Search for Warnings" and "Search for Errors". These options will search for runs with Warnings and Errors respectively; unlike a standard Job Search, a Name Pattern is not required for these options.
Unlike the Job View, the Job Search also shows information about historical job executions for which the data has been archived. Clicking on a bar displays the detail data for that run only if it is recent enough for the data to still be available.
Note: The Tags section of this document fully describes the concept of 'Tags' in ESM.
The Tag Search allows for the searching of individual instances of Tags within ESM. In the search results, instances of tags are visualised as dots against a time axis and list of individual sessions:
Clicking on one of the dots opens a History View that shows the instance of that Tag within the context of the 5 minute period surrounding it. The same History View can be reached by clicking on the 'Last Tag' button on the Session Portlet.
Some elements of the ESM interface are configurable by the user via the User Settings panel. These can be accessed by clicking the User Settings button on the ESM Ribbon.
Any settings configured here are saved to the local storage of the Browser being used. Clearing the cache of the browser will reset the settings to default.
My Team filter
The 'My Team' filter available in the List Portlet is configured via the 'Users in My Team' pane.
When the 'My Team' filter is enabled, only sessions owned by users explicitly defined in this list are visible.
To add a user to the filter, click the Add User (1) button. Enter the desired username in the popup that appears and click OK. The user will be added to the list of users.
Other User Settings
These settings control the following aspects of ESM:
Max number of flags to display in search
This setting controls the number of flags retrieved by the Tag Search.
The default value for this setting is 200. The maximum is 500.
Place flags on chart line
This is a cosmetic setting that dictates whether the flags are drawn on the X axis of the Session Portlet graph, or on the line itself.
Session Max Axes Settings
These two settings, Session Max Mem Axis and Server Max Disk Axis, control the scale of the respective axes on the Session and Server graphs. These can be set manually to aid in visual comparison of scenarios, but can usually be left unchanged.