Overview

Collected objects

Relationships

Understanding the interdependencies between components is crucial for effective monitoring and troubleshooting. The Agent Management Addon establishes a structured relationship model within VVF & VCF Cloud Operations, linking the Agent Control Center, Agent Node Manager, applications, and their associated virtual machines. This lets administrators visualize and navigate the interactions between infrastructure components and applications, which supports better-informed decisions and faster issue resolution.

Relations between collected objects

The relationship diagram depicts a hierarchy that starts with the 'Agent Management Adapter Instance', which acts as the central element responsible for integrating data from the platform. From there, the hierarchy flows as follows:

  • Agent Control Center serves as the central management and orchestration component of the platform, managing the lifecycle of applications and coordinating resources.
  • Agent Node Manager is responsible for managing individual nodes within the platform, handling tasks such as node provisioning, configuration, and monitoring.
  • Application Agents are the workloads or services deployed and managed by the Agent Control Center; they can be monitored through the Agent Management Addon.

All components (Agent Control Center, Agent Node Manager, and Applications) are associated with Virtual Machines (VMware Aria Operations section), providing a comprehensive view of resource utilization and performance metrics across the entire environment.
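The hierarchy described above can be pictured as a small tree. The following is a minimal sketch in Python, not the addon's actual data model: the `Resource` class, the instance names (`adapter-01`, `acc-01`, and so on), the VM names, and the exact parent/child nesting are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Resource:
    """A node in the relationship tree (hypothetical model, not the addon's API)."""
    kind: str
    name: str
    vm: Optional[str] = None  # associated virtual machine, if any
    children: List["Resource"] = field(default_factory=list)

    def add(self, child: "Resource") -> "Resource":
        """Attach a child resource and return it for chaining."""
        self.children.append(child)
        return child

    def walk(self, depth: int = 0):
        """Yield (depth, resource) pairs in top-down order."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


# Assumed nesting: adapter instance -> ACC -> ANM -> application agent.
adapter = Resource("Agent Management Adapter Instance", "adapter-01")
acc = adapter.add(Resource("Agent Control Center", "acc-01", vm="vm-acc-01"))
anm = acc.add(Resource("Agent Node Manager", "anm-01", vm="vm-node-01"))
app = anm.add(Resource("Application Agent", "app-01", vm="vm-node-01"))

# Print the tree with indentation reflecting the hierarchy level.
for depth, res in adapter.walk():
    vm = f" (VM: {res.vm})" if res.vm else ""
    print("  " * depth + f"{res.kind}: {res.name}{vm}")
```

Walking the tree from the adapter instance mirrors how an administrator would navigate from the adapter down to an individual application and its virtual machine.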

Agent Control Center & Agent Node Manager

Identifiers

Name | Description
ID | Component ID

Properties & Metrics

Name | Type | Description
API Port | Property | Port used for CDTP (Command Data Transfer Protocol) communication between the Agent Control Center (ACC) and the Agent Node Manager (ANM)
Update date | Property | Date when the component was last updated
Create date | Property | Date when the component was created
Process ID | Property | Main process ID
Status | Metric | Component status (0-stopped, 1-running)
Configuration | Group | Extended information
Agent Process info | Group | Agent process information
OS Info | Group | OS global information
Application templates | Group | Application templates

Configuration

Name | Type | Description
Name | Property | Unique component name
ID | Property | Unique component ID
FQDN | Property | FQDN (Fully Qualified Domain Name)
OS Type | Property | Type of operating system
SSL Mode | Property | Defines whether SSL is enabled or disabled
Available IP addresses | Group | List of publicly available IP addresses

Agent Process info

Name | Type | Description
Start time | Property | Start date and time of the process
User | Property | User that started the process
Group | Property | User’s group for the process
Memory usage | Metric | Percentage of memory used by the component’s main process
Processor usage | Metric | Percentage of processor used by the component’s main process
Thread count | Metric | Current thread count for the process
Memory used | Metric | Resident Set Size (RSS). Memory in RAM (in bytes) used by the process (stack, heap, shared libraries). Excludes swapped-out memory
Virtual size | Metric | Virtual memory size (VSZ), including swap and shared libraries

OS Info

Name | Type | Description
Total memory | Metric | Total RAM in bytes
Available memory | Metric | Free RAM in bytes
Cores amount | Metric | Total CPU cores
Processors amount | Metric | CPU count
CPU usage | Metric | Current peak CPU usage percentage
Disks free summary | Metric | Total free disk space (bytes)
Disks total summary | Metric | Total disk space (bytes)
Disks using summary | Metric | Total used disk space (bytes)
Disks | Group | Disk list with statistics

Disks

Name | Type | Description
Free | Metric | Available free space in bytes
Using | Metric | Space used on the disk in bytes
Total | Metric | Total size of the disk in bytes
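To illustrate how the per-disk metrics relate to the "Disks ... summary" metrics under OS Info, here is a minimal sketch in Python. It assumes the summary values are plain sums of the per-disk values, which the source does not state explicitly; the `Disk` class, the `summarize` function, and the sample byte counts are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass
class Disk:
    """Per-disk statistics, all in bytes (hypothetical container)."""
    free: int
    using: int
    total: int


def summarize(disks: Iterable[Disk]) -> Dict[str, int]:
    """Aggregate per-disk values, assuming each summary metric is a simple sum."""
    disks = list(disks)
    return {
        "Disks free summary": sum(d.free for d in disks),
        "Disks using summary": sum(d.using for d in disks),
        "Disks total summary": sum(d.total for d in disks),
    }


# Illustrative values only; real values come from the monitored host.
sample = [
    Disk(free=40_000_000_000, using=60_000_000_000, total=100_000_000_000),
    Disk(free=10_000_000_000, using=90_000_000_000, total=100_000_000_000),
]
for name, value in summarize(sample).items():
    print(f"{name}: {value} bytes")
```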

Application templates

Name | Type | Description
ID | Property | Application ID
Name | Property | Component-standard application name
Short name | Property | Application short name
Version | Property | Version of application
Application state | Metric & Property | Current application instance state (0-NON_EXISTS, 1-DISTRIBUTED, 2-CREATED, 3-REMOVED)

Application Agent

Identifiers

Name | Description
ID | Resource ID
ONM ID | Agent Node Manager ID

Properties & Metrics

Name | Type | Description
Status | Metric | Current application instance state (0-NON_EXISTS, 1-CREATED, 2-RUNNING, 3-STOPPED, 4-REMOVED)
Configuration | Group | Extended information
Utilization | Group | Utilization metrics
Process Summary | Group | Agent process information
OS Service | Group | Information about the OS service monitored by the application agent, including its version and OS process statuses
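Because the Status metric is reported as a number, dashboards and scripts often need to translate it back into a state name. The sketch below does this in Python using the codes listed for the Application Agent Status metric; the `AgentStatus` enum and the `describe_status` helper are hypothetical names, not part of the addon.

```python
from enum import IntEnum


class AgentStatus(IntEnum):
    """Application Agent status codes as listed in the Status metric description."""
    NON_EXISTS = 0
    CREATED = 1
    RUNNING = 2
    STOPPED = 3
    REMOVED = 4


def describe_status(value: int) -> str:
    """Return the state name for a status code, or a marker for unknown codes."""
    try:
        return AgentStatus(value).name
    except ValueError:
        return f"UNKNOWN({value})"


print(describe_status(2))
print(describe_status(9))
```

Note that the Application state of an application template uses a different numbering (0-NON_EXISTS, 1-DISTRIBUTED, 2-CREATED, 3-REMOVED), so the two sets of codes should not share one lookup table.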

Configuration

Name | Type | Description
Name | Property | Application name
ID | Property | Application ID
Path | Property | Filesystem path to the application runtime directory
Short name | Property | Application short name
Version | Property | Version of application
TCP Ports | Group | List of TCP ports used by the application

Utilization

Name | Type | Description
Memory usage | Metric | Memory usage by the process in percent
Processor usage | Metric | CPU usage percentage for the process
Memory used | Metric | Resident Set Size (RSS), memory allocated to the process in RAM (stack, heap, shared libs, excludes swapped-out memory)
Virtual size | Metric | Virtual memory size (VSZ), including swap and shared libraries

Process Summary

Name | Type | Description
Process ID | Property | Application main process ID
Start time | Property | Start date and time of the process
User | Property | User who started the process
Group | Property | Group of the user who started the process
Path | Property | Path to the directory from which the executable was started
Current working directory | Property | Current working directory of the process (e.g., start script location)
Thread count | Property | Number of currently running threads in the process
Children process IDs | Group | Array of child processes spawned by this process

OS Service

Name | Type | Description
Version | Property | Version of the OS service monitored by the application agent
Status | Property | OS service status: 0 if any of the service processes is stopped, otherwise 1
Processes | Group | List of processes run by the OS service, with their statuses (0-stopped, 1-running)
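The rule for the OS Service Status property (0 if any service process is stopped, otherwise 1) can be sketched in Python as follows. The function name `os_service_status` and the process names in the example are hypothetical; only the 0/1 rule comes from the table above.

```python
from typing import Dict


def os_service_status(process_statuses: Dict[str, int]) -> int:
    """Derive the service status from per-process statuses (0-stopped, 1-running).

    Returns 0 if any process is stopped, otherwise 1. With no processes
    listed, nothing is stopped, so this sketch returns 1.
    """
    return 0 if any(status == 0 for status in process_statuses.values()) else 1


# Hypothetical process names, used only for illustration.
print(os_service_status({"worker": 1, "scheduler": 1}))
print(os_service_status({"worker": 1, "scheduler": 0}))
```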

Alerts

Alerts are generated in VVF & VCF Cloud Operations based on events received from the cluster. The platform distinguishes four event types:

  • ERROR: Mapped as a critical-level alert in Aria Operations.
  • PROCESS_PID_IS_NOT_ALIVE: Also mapped as a critical-level alert.
  • APP_STOPPED_AFTER: Mapped as a warning-level alert.
  • APP_STARTED_AFTER: Mapped as a warning-level alert.

These mappings ensure that administrators receive timely notifications corresponding to the severity of each event.
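The event-to-severity mapping listed above can be expressed as a simple lookup table. This is a minimal sketch in Python: the `EVENT_SEVERITY` dictionary and the `alert_severity` function are hypothetical names, and rejecting unknown event types with an error is a design assumption, since the source does not say how such events are handled.

```python
# Mapping of platform event types to alert severity, as listed above.
EVENT_SEVERITY = {
    "ERROR": "critical",
    "PROCESS_PID_IS_NOT_ALIVE": "critical",
    "APP_STOPPED_AFTER": "warning",
    "APP_STARTED_AFTER": "warning",
}


def alert_severity(event_type: str) -> str:
    """Return the alert severity for a known event type.

    Raises ValueError for event types outside the four documented ones
    (an assumption of this sketch, not documented behavior).
    """
    try:
        return EVENT_SEVERITY[event_type]
    except KeyError:
        raise ValueError(f"Unknown event type: {event_type!r}") from None


for event in EVENT_SEVERITY:
    print(f"{event} -> {alert_severity(event)}")
```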