As with most SNMP network management software, LoriotPro offers several ways to collect SNMP status and performance indicators from monitored devices. This article explores these approaches and their advantages and disadvantages. LoriotPro is primarily an SNMP manager in the strict sense of the SNMP standard, but it can also perform collections of any type using other protocols, databases, files, web access, XML, etc. This document mainly covers SNMP collections, but other types of data can be collected using the same modes.
The choice of an SNMP collection mode (method) is directly related to the following questions:
Based on this information it will be easier to determine:
Passive mode versus active mode
In passive mode, the SNMP manager (LoriotPro) waits to receive SNMP packets informing it of a change of status or an anomaly. In the SNMP standard these packets are called Traps or Notifications. On receipt, the packets are analyzed and, if they carry values judged outside the thresholds, alarms are generated and the monitoring dashboards are updated.
In active mode, it is the SNMP manager (LoriotPro) that regularly polls the equipment. If a device does not respond, or responds with values judged outside the thresholds, alarms are generated and the monitoring dashboards are updated.
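The difference between the two modes can be sketched in a few lines of Python. This is only an illustration, not LoriotPro code: the function names (`evaluate`, `on_trap`, `poll_once`) and the `fetch` callable that stands in for an SNMP request are assumptions made for the example.

```python
from typing import Callable, Optional


def evaluate(value: float, low: float, high: float) -> Optional[str]:
    """Return an alarm message when value is outside [low, high], else None."""
    if value < low or value > high:
        return f"value {value} outside [{low}, {high}]"
    return None


def on_trap(value: float, low: float, high: float) -> Optional[str]:
    """Passive mode: runs only when a device sends a trap or notification."""
    return evaluate(value, low, high)


def poll_once(fetch: Callable[[], Optional[float]],
              low: float, high: float) -> Optional[str]:
    """Active mode: the manager queries the device itself.

    fetch is a hypothetical stand-in for an SNMP request; it returns
    None when the device does not answer before the timeout.
    """
    value = fetch()
    if value is None:
        return "no response"
    return evaluate(value, low, high)
```

In passive mode a silent device produces no alarm at all, whereas in active mode the absence of a response is itself detected.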
Advantages of the passive mode:
Drawbacks of the passive mode:
Benefits of active mode:
Drawbacks of active mode:
Once a collection mode, active or passive, has been chosen, the volume of collections, their frequency, and the expected reactivity guide the choice of collection tools. The following chapters explore the different collection tools.
LoriotPro collects the SNMP Traps or Notifications sent by devices into a trap receiver. Traps are displayed in the order they arrive. Configuring a filter for each type of trap makes it possible to count them. The counters can then be used to change the status of graphic objects in ActiveView dashboards. Filters can also be used to send alarms to administrators (sound, mail, SMS, etc.).
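The idea of per-filter trap counters driving a dashboard object can be sketched as follows. The trap representation (a dictionary with an `oid` key), the filter names, and the red/green rule are assumptions for illustration; the two OIDs are the standard linkDown and linkUp notification identifiers.

```python
from collections import Counter

# Hypothetical filters: name -> predicate applied to a trap (a dict here).
FILTERS = {
    "link_down": lambda trap: trap.get("oid") == "1.3.6.1.6.3.1.1.5.3",
    "link_up": lambda trap: trap.get("oid") == "1.3.6.1.6.3.1.1.5.4",
}


def count_traps(traps, filters=FILTERS):
    """Count how many received traps match each filter."""
    counters = Counter()
    for trap in traps:
        for name, matches in filters.items():
            if matches(trap):
                counters[name] += 1
    return counters


def object_color(counters, name):
    """Assumed rule: a graphic object turns red once its counter is non-zero."""
    return "red" if counters[name] > 0 else "green"
```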
LoriotPro runs a background process (Poller Process) that checks, at regular intervals, the connections with the IP devices declared in its directory. The status color changes according to the responses returned by the devices: green indicates that the device responds to SNMP requests, blue that it responds to PING, and yellow and red that the connection is lost. This process is asynchronous and has two separate threads, one for sending requests and the other for parsing all responses. Additional collections of SNMP data on various objects, specific to each monitored device, can also be performed during polling; the returned values are recorded in files for use in other modules of the software.
Batch configuration is possible through a dedicated module (Bulk Configuration).
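The color logic described above can be summarized in a small function. The text does not say what distinguishes yellow from red, so the sketch assumes that yellow means a recently lost connection and red a connection lost for several consecutive polls; the `missed_polls` and `red_after` parameters are hypothetical.

```python
def device_status(snmp_ok: bool, ping_ok: bool,
                  missed_polls: int = 0, red_after: int = 3) -> str:
    """Map poll results to a status color.

    green  -> the device answers SNMP requests
    blue   -> the device answers PING only
    yellow -> connection lost (assumption: fewer than red_after missed polls)
    red    -> connection lost (assumption: red_after or more missed polls)
    """
    if snmp_ok:
        return "green"
    if ping_ok:
        return "blue"
    return "red" if missed_polls >= red_after else "yellow"
```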
This module (plugin) synchronously collects SNMP objects, one after another, on one or more devices. The collected values are then compared to thresholds to trigger alarms (called EVENT in the LoriotPro software). The Event is then used to alert an administrator to the anomaly by various means: dashboard, sound, email, SMS, etc.
The threshold comparison can be performed on SNMP objects of type integer, gauge, counter, and string.
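How the comparison is done for each type is not detailed here. A common approach, shown below as an assumption rather than as LoriotPro's actual behavior, is to compare integers and gauges directly, to compare counters by their increase between two samples (a 32-bit counter wraps around at 2^32), and to compare strings against an expected value.

```python
def counter_delta(previous: int, current: int, bits: int = 32) -> int:
    """Increase of a wrapping counter between two samples."""
    return (current - previous) % (1 << bits)


def breaches(value, threshold) -> bool:
    """Assumed rule: strings must equal the expected value,
    numeric values must not exceed the threshold."""
    if isinstance(value, str):
        return value != threshold
    return value > threshold
```

For a counter, the value passed to `breaches` would be the result of `counter_delta`, not the raw counter reading.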
ActiveView monitoring dashboards are LoriotPro modules (plugins) that can be used to display network topology maps, operating availability status, front and rear device views, maps and layouts, and functional synoptics. ActiveViews are created manually and individually, or from templates. They can also be created dynamically by scripts in the LUA language.
ActiveViews contain graphic objects whose visual appearance, mainly the background color, depends on the return value of an expression. This expression can be an SNMP collection, an error counter or alarm report, a LUA script, and so on.
ActiveViews collect the values associated with each graphical object sequentially. The more objects a visual contains, the more time it takes to collect values and update the objects. ActiveViews have two threads associated with two processing queues. By default, all collections are placed in the first processing queue. If a device processed in the first queue does not respond to requests (timeout), it delays the collection process and is therefore moved to the second queue. It is put back into the first queue when it answers queries again.
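The two-queue behavior can be illustrated with a small single-threaded model. In LoriotPro each queue has its own thread; the sketch below ignores threading and only shows how devices migrate between the queues. The class name and the `responds` callable are hypothetical.

```python
from collections import deque


class TwoQueueCollector:
    """Responsive devices stay in the first queue, silent ones in the second."""

    def __init__(self, devices):
        self.first = deque(devices)   # default queue
        self.second = deque()         # devices that timed out

    def run_cycle(self, responds):
        """Poll every queued device once; responds(device) -> bool."""
        n_first, n_second = len(self.first), len(self.second)
        for _ in range(n_first):
            device = self.first.popleft()
            # A timeout moves the device to the second queue.
            (self.first if responds(device) else self.second).append(device)
        for _ in range(n_second):
            device = self.second.popleft()
            # A device that answers again returns to the first queue.
            (self.first if responds(device) else self.second).append(device)
```

Isolating unresponsive devices in this way keeps their timeouts from slowing down the refresh of all the other objects.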
Audits are LoriotPro plugins that are attached to IP devices declared in the directory. They use LUA scripts to perform collections. The scripts can carry out SNMP collections, but also collections of other types using other protocols, databases, files, web access, etc.
Each audit runs in a thread of its own. The number of threads that can run concurrently on a system depends on its CPU model and power. In addition, a 64-bit Windows operating system is able to manage hundreds of threads on a multi-core processor.
Unlike other LoriotPro plugin modules, however, audit threads are not assigned statically and permanently but dynamically, from a pool (Thread_pool). When LoriotPro starts, the threads of the pool share the collection and processing (LUA script) of all the audit modules declared in the directory.
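The thread pool principle can be sketched with Python's standard `concurrent.futures` module. This is a generic illustration, not LoriotPro's implementation: plain Python callables stand in for the LUA audit scripts, and `run_audits` and `pool_size` are names chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def run_audits(audits, pool_size=4):
    """Run every audit on a shared pool of threads.

    audits maps an audit name to a callable (a stand-in for a LUA script).
    Returns a dict mapping each name to the value returned by its callable.
    """
    with ThreadPoolExecutor(max_workers=pool_size) as pool:
        # Submit everything first so the pool threads share the work.
        futures = {name: pool.submit(func) for name, func in audits.items()}
        return {name: future.result() for name, future in futures.items()}
```

With a pool, the number of threads stays fixed at `pool_size` however many audits are declared; the audits simply wait their turn when all threads are busy.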
The principle is the same as that presented in the previous chapter, with the addition of a feature available in the Broadcast Edition of LoriotPro. This feature, called Global Object, stores the values of the collected SNMP objects and makes them accessible from the other LoriotPro modules.
As a reminder, data collection on the devices to be monitored is mainly performed with the SNMP protocol. As mentioned earlier, this protocol is used to retrieve status and performance indicators through SNMP agents on devices and systems. Agent response times are quite unpredictable, so we cannot really predict how long a LoriotPro collection process will need to get a response. Each collection process has a maximum timeout beyond which it considers that the agent is not responding. If a single process is responsible for these collections and therefore performs them sequentially, performance may fall short, given that a hundred collections can take anywhere from a few seconds to several minutes.
Let's summarize the context. The collections, which we call "tasks", have highly variable execution times, ranging from a few milliseconds to several seconds. In addition, we want to carry out collections periodically and at short intervals (polling period), of the order of one second for certain performance indicators.
Principle implemented: all the tasks (collections) to be carried out are grouped together in a single program. This program details, for each task, the type of collection to be performed (for example, an SNMP GET on a MIB object). We use SNMP objects as the example, but other types of collection are possible: extracting log files, querying SQL databases, reading TRAP counters, etc. Note that collections can also draw on global variables already in memory, which makes correlation processing possible.
A variable number of processes can be attached to these tasks. In principle, the greater the number of tasks and the higher the retry rate, the more processes are required. The processes in question (LoriotPro's Plugin Audit Process) are instantiated as many times as necessary to provide almost parallel processing.
To simplify the configuration, an Audit process (902) is provided. One or more Audits can be responsible for processing the global objects defined in the same group.
Here is a simplified example with two processes in charge of three collections. The collections are carried out at different time intervals, and their durations are also assumed to vary. Both processes (audit processes) take on collections according to their availability. As soon as a process has finished its job, it goes through the list of collections to be made and claims the first one whose polling period has expired and which is not already allocated.
In this example, the ratio between processing time and polling period is exaggerated. Usually it is about 1 to 100: when an SNMP agent is working properly, a collection takes of the order of a few tens of milliseconds, and polling intervals are between 1 and 15 seconds. Processing delays may occur if collection times increase or if the number of collections grows.
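The claiming rule can be sketched as follows. The `Collection` class and the `claim_next` and `release` functions are hypothetical names, and the sketch assumes that the next due time is computed from the moment the collection started. Real concurrent processes would also need a lock around the claim.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Collection:
    name: str
    period: float            # polling period, in seconds
    next_due: float = 0.0    # time at which the collection is due again
    claimed: bool = False    # True while a process is working on it


def claim_next(collections: List[Collection], now: float) -> Optional[Collection]:
    """Claim the first collection that is due and not already allocated."""
    for collection in collections:
        if not collection.claimed and now >= collection.next_due:
            collection.claimed = True
            return collection
    return None


def release(collection: Collection, started_at: float) -> None:
    """Free a finished collection and schedule its next run (assumption:
    the period is counted from the start of the previous run)."""
    collection.claimed = False
    collection.next_due = started_at + collection.period
```

A process that gets `None` back from `claim_next` has nothing to do and simply retries a little later.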
Ideally, the sum of the ratios between the execution time and the polling period of all the collections should be less than the number of collection processes available for their processing.
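This sizing rule translates directly into a short calculation. The function names and the sample figures are illustrative only.

```python
def required_load(collections):
    """Sum of execution_time / polling_period over all collections.

    collections is an iterable of (execution_time_s, polling_period_s) pairs.
    """
    return sum(execution / period for execution, period in collections)


def has_capacity(collections, processes):
    """True when the load stays below the number of collection processes."""
    return required_load(collections) < processes


# Three collections: 50 ms every 1 s, 0.5 s every 5 s, 2 s every 2 s.
sample = [(0.05, 1.0), (0.5, 5.0), (2.0, 2.0)]
```

For `sample` the load is 0.05 + 0.1 + 1.0 = 1.15, so one process is not enough while two processes are.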
The collected values are then stored in a block of global objects directly in memory. These objects are accessible from everywhere within LoriotPro, and in particular from an ActiveView visual.
From version 8 of LoriotPro, a task planner is available. It takes the form of a calendar-style graphical interface in which collection tasks can be scheduled. Collections can thus be planned days or even months in advance and can recur daily, weekly, monthly, etc. Collections based on LUA scripts can be simple and involve a few SNMP objects, or complex and integrate many collections on many devices and/or correlation.
The choice of a collection mode for monitoring an IP equipment infrastructure is dictated by technical constraints. Two main ones stand out: the volume of data to be collected and the frequency of the collections. LoriotPro offers several collection modes to adapt to these constraints, whether they are light or heavy.
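Conceptually, such a store behaves like a shared in-memory table that collection processes write to and other modules read from. The sketch below is a generic thread-safe model of that idea, not LoriotPro's API; the class and method names are assumptions.

```python
import threading


class GlobalObjects:
    """In-memory store shared between writers (collectors) and readers."""

    def __init__(self):
        self._lock = threading.Lock()
        self._values = {}

    def set(self, group, name, value):
        """Record the latest collected value for (group, name)."""
        with self._lock:
            self._values[(group, name)] = value

    def get(self, group, name, default=None):
        """Read the latest value, or default if nothing was collected yet."""
        with self._lock:
            return self._values.get((group, name), default)
```

A dashboard object would call `get` to read a value that a collection process wrote earlier with `set`, without triggering a new SNMP request itself.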