Platform Archives - Acure AIOps Platform https://acure.io/blog/category/platform/ Tue, 04 Apr 2023 20:43:14 +0000
Monitoring of Metrics and Other Features of Acure 2.2 https://acure.io/blog/metrics-acure-2-2/ Thu, 26 Jan 2023 06:31:18 +0000

Why It Is Important to Collect Metrics

Metrics are an essential part of data monitoring. They are used to measure, track and analyze the performance of a system, process or activity. Because they are quantitative, metrics give us a more objective and accurate view of that performance, which helps us identify and address potential problems and find areas where improvements or changes can be made.

What do Metrics Show 🔍

  • First, metrics help to define what data needs to be monitored. By setting KPIs and metrics, businesses can identify which data points are most important and focus their resources on gathering and analyzing that data. Without metrics, businesses may find themselves gathering and analyzing data that is not useful or relevant.
  • Second, metrics provide an indication of progress. By tracking certain metrics, businesses can assess their progress toward their goals and objectives. This allows them to adjust their strategies and better focus their resources.
  • Third, metrics can help identify areas of improvement. By tracking key metrics, businesses can identify areas that need improvement, such as customer service times or lead generation. This can result in more efficient and cost-effective operations.

Metrics are useful for many applications, including software development, customer service and operational performance.

For software development, metrics can be used to monitor the progress of a project, how quickly tasks are being completed and how defects are being addressed. Metrics can also be used to compare a project’s performance to that of similar projects or to identify areas where additional effort is needed.

Categories of Metrics

Metrics can be divided into two main categories: quantitative and qualitative.

Quantitative metrics measure the performance of a data system in terms of numbers or data points. Examples of quantitative metrics include throughput, latency and error rates.

Qualitative metrics measure the performance of a data system in terms of user experience or customer satisfaction. Examples of qualitative metrics include customer feedback, user engagement and ease of use.

There are a variety of metrics that can be used to measure performance, depending on the data and the goals of the organization. Some of the most common metrics include:

1. Speed: How quickly data is processed, stored and read from a system. Speed is typically measured in either megabytes per second (MBps) or megabits per second (Mbps).

2. Availability: A measure of how often a system is available and how quickly it can respond to requests. This metric is usually expressed as a percentage.

3. Response time: The amount of time it takes for a system to respond to a user’s request. Response time is usually measured in milliseconds.

4. Latency: The delay between when a request is made and when it is fulfilled by the system. Latency is typically measured in milliseconds.

5. Throughput: Measures the amount of work that can be performed by the system within a specific period of time. Throughput is typically measured in number of transactions per second (TPS).

6. Accuracy: Measures the accuracy of information received from the system. Accuracy is typically expressed as an error rate: the percentage of cases in which a problem occurs.

7. Reliability: Measures the probability of the system being available for use when needed. Reliability is typically measured as uptime: the percentage of time that a system is available for use.
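
To make the units above concrete, here is a minimal Python sketch of how three of these metrics could be derived from raw counters. The function names and the sample numbers are illustrative assumptions, not part of Acure.

```python
def availability_percent(uptime_seconds: float, total_seconds: float) -> float:
    """Share of the observation window during which the system was available."""
    return 100.0 * uptime_seconds / total_seconds


def throughput_tps(transactions: int, window_seconds: float) -> float:
    """Transactions completed per second over the window."""
    return transactions / window_seconds


def error_rate_percent(failed: int, total: int) -> float:
    """Share of requests that ended in an error."""
    return 100.0 * failed / total


# Hypothetical one-hour window: 18 s of downtime, 7200 requests, 36 failures.
window = 3600
print(availability_percent(window - 18, window))  # 99.5
print(throughput_tps(7200, window))               # 2.0
print(error_rate_percent(36, 7200))               # 0.5
```

Percentages such as availability are only meaningful together with the window they were measured over, so it is worth recording the period alongside the value.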

Acure 2.2

The Collection of Metrics in the New Release 🗂

Of course, all of the above could not be ignored by Acure. Therefore, a notable update in version 2.2 is a basic implementation for collecting and storing metrics in the system.

Metrics are collected through a Data Stream.
A single data stream can receive event information (logs) and metric data at the same time.

For example, for Prometheus to send metrics to the Data Stream, you will need an API key copied from the settings of the corresponding Data Stream.

Based on this, a full-fledged service for analyzing time series and creating rules for managing thresholds is coming soon. A UI for managing metrics in the system, creating a signal and linking it to a CI will be implemented in the next release.

Metrics UI in the Next Releases

The Statistics of the Data Stream (Including Metrics) 📊

In the new release, Data Streams got a Statistics tab, where users can access information about the events (logs) and metrics collected in the data stream.

Information is presented in the form of histograms with statistical indicators.

The Histogram of Events and Logs displays the amount of data received through the Data Stream for the selected period of time with the following indicators:

  • Amount of data for the selected period
  • Average amount of data for the selected period = amount of data for the period / number of timeslots of the period
  • Maximum amount of data for the selected period = the largest amount received in any single timeslot of the period
  • Minimum amount of data for the selected period = the smallest amount received in any single timeslot of the period
The Histogram of Events and Logs

The Histogram of Metrics displays the number of Metrics collected from the Data Stream for the selected period of time with the following indicators:

  • Quantity for the selected period
  • Average quantity for the selected period = quantity for the period / number of timeslots of the period
  • Maximum quantity for the selected period = the largest quantity collected in any single timeslot of the period
  • Minimum quantity for the selected period = the smallest quantity collected in any single timeslot of the period
The Histogram of Metrics
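
The four indicators shown next to each histogram follow directly from the per-timeslot counts. The Python sketch below restates the formulas from the lists above; the `SlotStats` and `summarize` names and the sample data are assumptions made for illustration and do not describe how Acure computes the values internally.

```python
from dataclasses import dataclass


@dataclass
class SlotStats:
    total: int
    average: float
    maximum: int
    minimum: int


def summarize(slot_counts: list[int]) -> SlotStats:
    """Compute the four indicators from per-timeslot counts."""
    if not slot_counts:
        raise ValueError("the selected period contains no timeslots")
    total = sum(slot_counts)
    return SlotStats(
        total=total,
        average=total / len(slot_counts),  # total / number of timeslots
        maximum=max(slot_counts),          # busiest timeslot
        minimum=min(slot_counts),          # quietest timeslot
    )


# Hypothetical period split into four timeslots.
stats = summarize([120, 80, 0, 200])
print(stats)
```

Note that an empty timeslot counts toward the average and can be the minimum, which is why the third slot in the sample pulls the minimum down to zero.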

What Else is in the Release

🔥 A new version of the Acure Agent with a built-in HTTP plugin was released. The plugin allows the Agent to send requests to the APIs of external information systems.

🔥 In the same release, a new functionality for managing CI types through the system interface has been implemented, which is a significant development of the CMDB service.

🔥 Last but not least, Acure added the functionality of providing access to Signals to other Workgroups, which already have access to Signal-related CIs.

💡 Find more information about Acure update 2.2 in the Changelog and try it yourself in Userspace.

💬 Discuss updates or ask any questions in our friendly community on Discord or on our Community page.

Acure Life Hacks: Local Function for Event Name Conversion https://acure.io/blog/event-name-conversion-function/ Thu, 22 Dec 2022 00:15:20 +0000

Acure allows you to connect data from a wide variety of monitoring tools. However, events from primary systems often have complex names that make it harder to analyze the state of the infrastructure. Time for decision-making is often in short supply, so you need to reduce the cognitive load and make the data easier to understand.

Let’s look at how to do this in Acure using regular expressions and a new local function.

Event name conversion examples

💡 Imagine that you receive a problematic event from the primary monitoring system with the name SomeHost: High CPU utilization (over 90% for 5m).

For this event, we can apply the following regular expression “High.CPU.util.*over.(\\d+)%.*” and open the Signal in a readable form, with the name “CPU > 90%”.

💡 There can be a lot of examples of such transformations, for example, here is another one for a data storage system:

  • Initial event name: “C:: Disk space is critically low (used > 90%)”
  • Regular expression: “(.*): Disk space.*used.>.(.*)%.*”
  • Resulting pattern: “$1 Storage Partition Usage > $2%”
  • Signal Name: “C: Storage Partition Usage > 90%”

✅ We have implemented this request in the automation script as a local function that instantly converts values according to the dictionary defined in the same function. Unlike hard-coded global functions, local functions are a playground where you can implement any of your ideas within a C# script.

How event name conversion works in Acure

👉 To repeat such conversions in Acure, you need to create a local function in the Automation script and transform the Signal name before opening the signal.

Conversion Function in Local Functions List

Function parameters:

🔵 Incoming pin – Input – a string containing the name of the event.

🔵 Outgoing pin – Result – a string containing the converted value.

Conversion Function in Low-code Scenario

📜 Function code:

// Dictionary of conversions: regular expression -> replacement pattern.
var regexDict = new Dictionary<string, string>()
{
    {"(.*): Disk space.*used.>.(.*)%.*", "$1 Storage Partition Usage > $2%"},
    {"High memory util.*>(\\d+)%.*", "RAM usage > $1%"},
    {"High.CPU.util.*over.(\\d+)%.*", "CPU Usage > $1%"},
};

foreach (var regex in regexDict)
{
    // Skip patterns that do not match the incoming event name.
    if (!System.Text.RegularExpressions.Regex.IsMatch(Input, regex.Key))
        continue;

    // Return the name rewritten by the first matching pattern.
    return System.Text.RegularExpressions.Regex.Replace(Input, regex.Key, regex.Value);
}

// No pattern matched: return the original name unchanged.
return Input;

***

⚙ If the function finds a regular expression in its dictionary that matches the original event name, this value will be converted according to the pattern specified in the dictionary. If there is no matching regular expression in the dictionary, the original value will be returned.

📖 Read more about the functionality of the automation engine in the corresponding section of the Documentation.

💬 Do not forget to share your cool ideas in our Community or Discord channel, and we will add them to our functionality in turn.

Acure 2.1: Manual Signals, Table CMDB and Many More https://acure.io/blog/acure-2-1/ Fri, 16 Dec 2022 11:45:02 +0000

Manual Creation of Signals 👨‍💻

In the last update, we replaced static triggers with a new dynamic feature, signals, which report deviations of any parameter from its regular state and are mainly intended for deduplication and correlation of primary events. The management of signals in Acure 2.0 was handled entirely by automation scenarios: you write several scripts on the low-code engine, and the platform itself opens a signal when something changes in the system, sorts incoming events and assigns them the appropriate status. All you have to do is hope that a line with the “Fatal” severity does not appear in the table.

But what if you found a problem, but the system did not notify you? What if you were informed about the unavailability of a service while you were happily assuming that there were no alerts? The script compiled successfully and no code errors were reported, yet the signal did not open. Of course, in this case you need to check the script, find out why it did not work and debug it, which can take time. But what do you do with the incident in the meantime, and how do you add it to the system so as not to lose information about it? Acure 2.1 provides the answer.

In the new version of Acure, in addition to the standard scenario management, signals can be added manually.

To create a Signal manually, go to the Signals management screen and click the Create – Signal button in the upper right corner.

Fill in the creation form values and press the “Create Signal” button.

After creating the Signal, select the required configuration item in the CI list. If the Signal was created without reference to the CI, set up the filter for displaying Signals without CI accordingly.

To view detailed information about a Signal, go to the Signal card by double-clicking on the corresponding signal in the table.

Signal card

Availability reports will also take this signal and add it to the statistics. Don’t forget to go back to the script and check why the automation didn’t work 😉

📖 Read more about signals in our documentation.

CMDB Table View 📋

Another hallmark of Acure is the auto-building of CMDB and the presentation of the entire IT infrastructure in the form of a single graph on one screen, which is very convenient, including for root cause analysis, when you can identify a problematic element by the links between configuration items.

However, this functionality was not enough for finding bottlenecks, analytics and reports. This is why Acure 2.1 introduces the CMDB table view.

Switching between SM views (split and table) is very simple and available using the corresponding buttons in the upper right corner of the SM Maps section. The table view displays CIs corresponding to the current CMDB filter or SM Map.

CMDB table view

This is a database of all your CIs with the corresponding parameters. Basic information about a CI includes: CI ID, CI name, CI type (name and icon), CI Status (name and color), CI Health (percentage and color), Open signals for CI, CI Owner.

You can also customize the composition of the table according to your needs and requirements. Just click on the “gear” icon in the upper right corner of the “SM Maps” screen.

Table view customization

In addition to the basic information, you can add other information related to the CI in the table view.

Interface Improvements and Other Features

In addition to the CMDB table view, the new release of Acure also includes interface improvements to optimize work with several hundred thousand CIs. Bulk operations are now available when working with CIs: after selecting multiple CIs, you can archive, unarchive or delete them.

Bulk operations with CIs

The automation engine has also been updated with new functions, allowing you to use new features of the SM API in automation scenarios. For example, there is a new function FilterConfigItemsExtended for high-performance scenarios.

💡 Find more information about Acure update 2.1 in the Changelog and try it yourself in Userspace.

💬 Discuss updates or ask any questions in our friendly community on Discord or on our Community page.

Low-code as a Future of Development and Its Realization in Acure https://acure.io/blog/low-code-in-acure/ Thu, 17 Nov 2022 12:51:38 +0000

What is Low-code?

Low-code is a development method that minimizes manual programming. Instead of hard coding, visual constructors are used for application modeling and ready-made scripts are used to solve typical tasks. For low-code development, the process involves moving blocks with ready-made code using the drag-and-drop principle and getting a product with the desired functionality. Ready-made modules in low-code speed up work with typical tasks and eliminate repetitive actions but code can be used for individual solutions, settings and personalization. Development in the platform takes place according to ready-made templates or freely. Integrations and built-in services are also supported.

The main value of low-coding is the ability to do without programmers when you need to create or change some kind of application, module or even product. To carry out the necessary work, the competencies of the platform administrator will be more than enough.

Benefits of Low-coding 👍

Low-code platforms require less development time and give more flexibility in setting up processes. There is no need to plan the architecture, create prototypes, analyze and develop the UI since it is assumed that this is all implemented in the low-code platform itself.

Low-code meme

Such platforms integrate with a wide range of systems and allow you to add new features to any application. In addition, vendors of low-code platforms claim that their products are more secure and more stable than custom-written components.

The main elements of low-code platforms are:

1. Visual modeling

2. Ready-made components, built-in services

3. Rapid deployment of applications, focus on DevOps

4. Pattern development or abstract development

Since the company’s IT specialists in this case no longer have to write a lot of code, the need for these competencies is reduced and, in turn, the ability of staff to build solutions from ready-made components is prioritized.

Low-code Automation in Acure

Acure Automation Service is a high-performance environment for launching and executing custom scripts. Scenarios can be both custom and supplied by the developers themselves as full-fledged services.

Automation scripts allow you to automatically discover new configuration items, the relationships between them and update the service map in real-time without any manual manipulation.

With the help of low-code scenarios, you can also create signals – special dynamic objects that allow you to correlate and deduplicate incoming events and alerts. Read more about this functionality in the article discussing Acure 2.0.

The low-code engine is used to create automatic scripts. Automation scripts in Acure help significantly expand the functionality of the system and create arbitrary event processing scenarios using visual blocks and establishing links between them.

Acure Automation pipeline

Of course, low-code, as described above, means a significant reduction in the use of complex hard coding, but it does not relieve you of the need to learn the logic of building scripts and to memorize functions and variables. And if you are now holding your breath in anticipation of a ton of complex information, calmly exhale. Acure's low-code instruments are no more complicated than the cheat codes in your favorite games, and they make life just as easy. You will see this for yourself below.

Low-code Instruments in Acure

Start events

Any script must start with a “startup”. Startup events are responsible for this – blocks that initiate the launch of the script and contain the event model. If a script contains multiple start blocks, it can run on any of them. The composition of the starting blocks is determined by the route map settings.

When the script is running, it is time for variables and functions.

Functions

There are two types of functions in Acure.

Functions

Impure functions

  • The function is executed every time the input is called from the previous block
  • The ArrayAddElement function requests all the data passed to it as input
  • This function only works once. To use the result of a function, it is not necessary to call it again

Pure functions

  • The function is executed each time its result is requested. Accordingly, to use the result, it will need to be called every time, like Batman.
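
The difference between the two kinds of functions can be modeled in a few lines of Python. This is only an analogy under stated assumptions: the `ImpureNode` and `PureNode` classes are hypothetical and do not reflect how the Acure engine is implemented. An impure function runs when the previous block calls it and keeps its result; a pure function is re-evaluated every time its result is read.

```python
from typing import Callable, Optional


class ImpureNode:
    """Runs when the previous block calls it; the result is then stored."""

    def __init__(self, func: Callable[[], int]) -> None:
        self._func = func
        self.result: Optional[int] = None
        self.runs = 0

    def execute(self) -> None:
        # Called once per activation by the previous block.
        self.result = self._func()
        self.runs += 1


class PureNode:
    """Has no call input; it is re-evaluated on every read of its result."""

    def __init__(self, func: Callable[[], int]) -> None:
        self._func = func
        self.runs = 0

    @property
    def result(self) -> int:
        self.runs += 1
        return self._func()


impure = ImpureNode(lambda: 2 + 3)
impure.execute()
first, second = impure.result, impure.result  # two reads, one run

pure = PureNode(lambda: 2 + 3)
third, fourth = pure.result, pure.result      # two reads, two runs

print(impure.runs, pure.runs)  # 1 2
```

In this model, reading a stored result is free, while each read of a pure function repeats the computation, which matters when the function is expensive.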

Variables

Variables are divided into two types.

Local

  • Initialized within the current scenario or manually by the user, or using the SET block
  • Can be called or initialized anywhere in the current script

System

  • Are providers of information about the script, owner, or current space
  • Not initialized by the user. They act only as a source
Variables

Now you will ask: what about the data types supported in Acure? Acure supports multiple data types, but you can only link pins of the same type.

The following values are possible for types:

  • Boolean: True / False;
  • Byte: integers from 0 to 255;
  • Char: a single Unicode character;
  • Double: ±5.0 × 10^−324 to ±1.7 × 10^308;
  • Dynamic: any object;
  • GUID: a value in the format 00000000-0000-0000-0000-000000000000;
  • Integer: -2 147 483 648 to 2 147 483 647;
  • Integer64: -9 223 372 036 854 775 808 to 9 223 372 036 854 775 807;
  • String: a Unicode character string.

The type can be either Single or Array.

Wildcard Pins And Connections

It is also worth noting that some functions can work with different data types. For convenience, in such cases, Wildcard pins are used. For Wildcard pins the type of connection is set either manually or when establishing a connection.

For WC pins, there are also requirements in the context of each function. More about this is written in the documentation when describing each type of function.

There are also certain requirements for establishing links. For example, for function call pins, one-to-many connections are prohibited, but many-to-one connections are allowed. For data transfer pins, the opposite is true.

Wildcard Pins And Connections
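
The linking rules described in this section can be summarized as a small validation routine. The Python sketch below is an assumption-laden model: the `Pin` class, the `"call"` and `"data"` kind labels, and the pin names are all hypothetical, and the real editor may apply additional per-function requirements for Wildcard pins.

```python
from dataclasses import dataclass, field

WILDCARD = "Wildcard"


@dataclass(eq=False)
class Pin:
    name: str
    kind: str        # "call" (function call pin) or "data" (data transfer pin)
    dtype: str = ""  # e.g. "String", "Integer"; unused for call pins
    links: list["Pin"] = field(default_factory=list)


def can_connect(source: Pin, target: Pin) -> bool:
    """Check whether an output pin (source) may be linked to an input pin (target)."""
    if source.kind != target.kind:
        return False
    if source.kind == "call":
        # One-to-many is prohibited: a call output may carry a single link.
        # Many-to-one is allowed, so the target is not checked.
        return not source.links
    # Data pins: many-to-one is prohibited, so the input must still be free.
    if target.links:
        return False
    # Types must match unless one side is a Wildcard pin.
    return WILDCARD in (source.dtype, target.dtype) or source.dtype == target.dtype


def connect(source: Pin, target: Pin) -> None:
    if not can_connect(source, target):
        raise ValueError(f"cannot link {source.name} -> {target.name}")
    source.links.append(target)
    target.links.append(source)


event_id = Pin("Event.StreamId", "data", "String")
filter_in = Pin("Filter.StreamId", "data", "String")
count_in = Pin("Counter.Value", "data", "Integer")
any_in = Pin("Print.Value", "data", WILDCARD)

print(can_connect(event_id, filter_in))  # True: same type
print(can_connect(event_id, count_in))   # False: String vs Integer
print(can_connect(event_id, any_in))     # True: Wildcard accepts any type
```

Keeping the check separate from `connect` lets an editor show whether a link is valid while the user is still dragging it.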

Function Categories

The main low-code functions are presented in the table below and divided into several categories.

Function categories

ℹ You can find more information about every function in the corresponding section of the Acure documentation.

In this article, let’s walk through building a simple Autodiscovery scenario.

Creating A Simple Autodiscovery Scenario

As mentioned above, automation scripts in Acure allow you to minimize manual actions, which is especially important when monitoring dynamic environments. After writing several scripts on the low-code engine, you no longer need to think about making changes to the service model yourself. A dynamic map of IT infrastructure links with all configuration items and links will be built and updated automatically.

✨ No shaman tambourines – all the magic happens on the scenario builder page.

By default, there is a start block that runs the script every time the corresponding event arrives.

OnLogEvent

First, you need to create a rule so that the sequence is executed on specific events. To do this, you need some functions in the form of blocks. You can add them from the context menu by right-clicking on an empty space.

Low-code part

Let’s build a simple rule that will receive only those events that came from a specific stream.

For that, add the FiltredByStreamId function and connect the sequence in such a way that when an event arrives in the system, the script checks the ID of the stream from which it came and, if the filtering is successful, the script will continue to run.

Low-code part

The sequence of execution of script functions is indicated by blue arrows — exact pins.

Low-code part

Note that in addition to exact pins, there are data pins. If the former is responsible for the sequence, then the latter is responsible for transmitting and receiving data.

Now let’s analyze our function. For it to be executed, it must be provided with input data. In our case, the function requests an incoming stream model and filtering parameters (stream id).

We must get the initial data from the primary event, i.e., take away the stream model. To do this, we decompose the original structure using the base function and establish a connection with our filter.

Now we need to specify the required parameter (we take it from the previously created stream) and copy-paste it into the FiltredByStreamId block.

FilterByStreamId

Done! The simple rule is ready. Now, further actions will be executed only if the event came from the stream we specified.
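
In text form, the rule assembled above behaves roughly like the following Python sketch. It is an illustration only: `LogEvent`, `filter_by_stream_id`, `on_log_event` and the stream ID value are hypothetical names chosen for this example, and the visual blocks in Acure are not written this way.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LogEvent:
    stream_id: str
    payload: dict


def filter_by_stream_id(event: LogEvent, stream_id: str) -> bool:
    """Return True only for events that arrived from the given stream."""
    return event.stream_id == stream_id


def on_log_event(event: LogEvent, stream_id: str,
                 next_step: Callable[[LogEvent], None]) -> bool:
    """Start block: run the rest of the scenario only if the filter passes."""
    if not filter_by_stream_id(event, stream_id):
        return False
    next_step(event)
    return True


# Hypothetical stream ID copied from the stream settings.
TARGET_STREAM = "stream-42"
handled: list[LogEvent] = []

on_log_event(LogEvent("stream-42", {"host": "web-1"}), TARGET_STREAM, handled.append)
on_log_event(LogEvent("stream-7", {"host": "db-1"}), TARGET_STREAM, handled.append)
print(len(handled))  # 1
```

Only the first event reaches the next step, because the second one came from a different stream.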

Let’s look at the other tools that are available in the editor as well.

The left panel contains the objects of the current scenario. Here you can create and manage local variables, structures and entire functions. From here, they can be added to the screen for use in a script or selected for further customization.

The settings are available in the right panel where we fill in all the required fields and, in the case of a local function, write the executable code.

Executable code

To show off your awesome script or make it easier for your team, you can export the script and share it with others. The recipient, using the import tool, creates an exact copy of this script.

Is Low-code the Future? 

The numbers say yes. The 2022 Mendix State of Low-Code study showed a rise in low-code adoption from 77% in 2021 to 94% in 2022, with four out of 10 companies now using low-code for mission-critical decisions. The study argues that the spread of low-code may soon lead to the overthrow of more “traditional” forms of operations. This report cites Gartner’s forecast that by 2025, low-code solutions will account for 70% of apps, up from 25% in 2020.

At the same time, the scope of low-code products will also be constantly expanding. This technology has already become a trend and subsequently, the entire market will be rebuilt under it.

All this suggests that the market will increasingly be oriented toward simple solutions when any mass user will be able to automate the solution of routine tasks and satisfy needs without deep programming knowledge. At the same time, the growing needs of users stimulate low-code technology to develop faster and improve functions. Thus, low-code systems will be able to solve more and more complex problems as they develop.

👨‍💻 Want to experience the benefits of low-code? Register in Acure and write your own automation scenarios.

ACURE 2.0 Is Officially Released https://acure.io/blog/acure-2-0/ Wed, 12 Oct 2022 08:46:07 +0000

We’re excited to bring you Acure 2.0! We are now even closer to the concept of “monitoring as code”: we removed synthetic triggers, replacing them with a dynamic correlation approach called signals, and made the interface more compact. But first things first.

What Is Monitoring As A Code And What Does Acure Have To Do With It?

The concept of “monitoring as code” appeared on the Internet relatively recently. If we google “monitoring as code”, we will find references to this approach in many popular monitoring tools. However, if you dig deeper, it becomes clear that the described methods are just agent deployment and export setup using configuration management tools such as Puppet, Terraform or Helm. For the most part, these tools do not cover all monitoring and are limited to a simple data collection configuration.

“Monitoring as code” is not just automatic installation and configuration of agents, plug-ins and exporters. It covers the entire lifecycle, including automated diagnostics, alerts, incident management and automated troubleshooting.

Meme: What if I told you in Acure monitoring is a code

From the very beginning, we at Acure have been guided by this concept. The automation scripts covered part of the monitoring cycle, but the 2.0 update is a milestone on the road to monitoring as code.

We Have Replaced Static Triggers With Dynamic Signals 🚦

In the previous version of Acure, the main tool for deduplication and correlation of primary events (alerts and logs) was synthetic triggers controlled by rules in the form of Lua scripts. Despite their flexibility, the triggers were static, which was inconvenient when managing large dynamic environments. For each event, the user had to create a separate trigger with its own rule. With a large amount of constantly changing data, the number of triggers could reach hundreds of thousands! All this generated additional labor costs when setting up and further working with the system.

As you already know, Acure easily syncs with other popular data monitoring systems. But static triggers gave rise to the problem of this constant synchronization – it was necessary to constantly monitor changes in synchronized systems. If a trigger was deleted or unlinked from a configuration item, the history was lost. In a dynamically changing environment, Acure would have a tough time overcoming these challenges.

This is where our favorite low-code engine saved the day! Thanks to it, we were able to replace synthetic triggers with signals driven by automation scripts.

A signal is similar to a task in a regular task tracker. Unlike a trigger, where only the status changes, a signal is a dynamic object: it opens on a specific event (or set of events), can attach other confirming alerts along the way, and closes on an alert or an event from the scheduler.

Find out more about the Acure 2.0 automation processes in the diagram below:

Scenario architecture in Acure 2.0

As you can see from this chain, the process architecture in Acure is event-driven :

1. The primary event in the form of raw data enters the system through Data Streams (push method) or with the help of Agents (pull method). 

The push method sends data through the REST HTTP API (Data Stream API). This is how Prometheus, Ntopng, Fluentd and Nagios Core are integrated.

The pull method connects to monitoring tools and collects data from them using Agents. An Agent is a special program that can be installed on a remote device to collect data and perform actions. The Agent receives tasks from the Acure server, executes them and transfers the collected information to the server over a secure network protocol. Integrations with Zabbix, SCOM, Nagios XI and vCenter are implemented through the pull method.

2. After entering the system the raw event is transformed to the corresponding Acure structure via ETL logs. We can see events and alerts in the system at this point.

3. The automation script determines what kind of event we received: a monitoring event (a breakdown of some infrastructure object or service unavailability) or a topology change event (for example, creating a new configuration item (CI) or changing its name). Each of these cases has its own scenario (or group of scenarios).

In the second case, we are dealing with auto-discovery and auto-building of the resource-service model: with the help of automation scripts, Acure builds the topology of all IT services without any manual labor. It is worth noting, however, that after a CI changes, the event correlated with this change is also included in the calculation of signals.

4. Now let’s move on to the main highlight of the recent update – signals.

Meme: Dynamic Signals vs. Static Triggers

A signal is a dynamic indicator of a change in the infrastructure, with a start time and an end time. It reports any deviation of a parameter from its normal state and is mainly intended for deduplication and correlation of primary events. With the help of signals, redundant copies are eliminated and events are correlated, so that incidents are assigned the appropriate priority. Thus, with the help of automation scripts, Acure not only saves the user time on routine manual tasks, but also protects against information noise.

5. After the signal is created, the status of the configuration item is calculated. The CI attached to the signal takes over its criticality. For example, if a signal with fatal criticality is attached to some CIs, those CIs are painted in the corresponding bright red color. Changing the status of a CI recalculates its health and triggers the appropriate auto-actions (for example, team notifications, e-mails to users, or auto-repair scripts). And all this happens automatically!
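Steps 3 and 5 of the chain above can be sketched in a few lines of Python. This is only an illustration, not Acure's low-code engine or its API: the event field `kind`, the criticality levels, and the rule that a CI with several open signals takes the highest criticality among them are all assumptions made for the example.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import List


class Criticality(IntEnum):
    # Hypothetical levels; the real scale in Acure may differ.
    OK = 0
    WARNING = 1
    CRITICAL = 2
    FATAL = 3


@dataclass
class Signal:
    ci_id: str                # configuration item the signal is bound to
    criticality: Criticality
    is_open: bool = True


def route_event(event: dict) -> str:
    """Step 3: pick the scenario group for an incoming event.

    The "kind" field is an assumed attribute of the transformed event.
    """
    if event.get("kind") == "topology_change":
        return "topology"
    return "monitoring"


def ci_status(ci_id: str, signals: List[Signal]) -> Criticality:
    """Step 5: the CI takes over the criticality of its signals.

    Assumption: with several open signals, the highest criticality wins.
    """
    levels = [s.criticality for s in signals if s.is_open and s.ci_id == ci_id]
    return max(levels, default=Criticality.OK)
```

In this sketch, closing a signal (setting `is_open` to `False`) and recalculating `ci_status` is what would bring the CI back to a healthy state.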

We Took A Course On Scenario Management 👨‍💻

Of course, there were scenarios in the case of triggers as well. Before binding (manually or via API) a trigger to the required configuration item, it was necessary to create a script for managing its statuses (manually or via a template), as well as an event prefilter. So a separate script was responsible for each individual trigger.

With signals in Acure 2.0, the approach to management has completely changed: signals are handled entirely by scripts. By analogy with the auto-building of the resource-service model, all processes related to signals are implemented inside a script written in low-code. The script opens and closes signals, determines the logic of binding them to configuration items, and attaches events (alerts) to them. In addition, signals can be controlled by a whole set of scripts: you can create one huge alert deduplication script or divide it into many small ones without any restrictions.

Part of an automation script (signal opening)

The configuration item is attached to the signal in the same script: directly from the script, you can access the CMDB functions, find the required CI by its attributes and bind it to the signal. The binding logic can be completely arbitrary and depends only on how the CMDB is structured and on the log source settings.
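As a rough illustration of binding by attributes, the following Python sketch looks up a CI in an in-memory list and attaches its identifier to a signal. The functions `find_ci` and `bind_signal`, the attribute names, and the dictionary layout are hypothetical; in Acure this logic lives in a low-code script that calls the CMDB functions.

```python
from typing import List, Optional


def find_ci(cmdb: List[dict], **attrs) -> Optional[dict]:
    """Return the first CI whose attributes all match, or None."""
    for ci in cmdb:
        if all(ci.get(name) == value for name, value in attrs.items()):
            return ci
    return None


def bind_signal(signal: dict, cmdb: List[dict], **attrs) -> dict:
    """Return a copy of the signal bound to the matching CI (if any)."""
    ci = find_ci(cmdb, **attrs)
    return {**signal, "ci_id": ci["id"] if ci else None}


# Usage with made-up data: bind by host name taken from the alert.
cmdb = [
    {"id": "ci-1", "host": "db-1", "type": "server"},
    {"id": "ci-2", "host": "web-1", "type": "server"},
]
bound = bind_signal({"title": "CPU load is high"}, cmdb, host="web-1")
```

Matching on other attributes (for example, a label supplied by the log source) only changes the keyword arguments, which is what makes the binding logic arbitrary.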

Signal screen

Let’s have a look at the concept of signals using a simple example of alert deduplication. Imagine that while a metric stays above its threshold, the source generates an alert every 5 minutes. In Acure, a signal is opened on the first event, and all subsequent confirmations are attached to this signal. When the metric returns to its original state and there are no new alerts for this metric within 30 minutes, Acure closes the signal. If the situation repeats, a new signal is generated, while the previously triggered and already closed signals remain unchanged.
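The deduplication example can be modelled with a small Python sketch. It is not the Acure script itself: the class names, the `key` that identifies a metric, and the `close_stale` method (standing in for an event from the scheduler) are assumptions; only the 30-minute quiet period comes from the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta
from typing import Dict, List, Optional

QUIET_PERIOD = timedelta(minutes=30)  # no new alerts for this long -> close


@dataclass
class Signal:
    key: str                       # identifies the metric (assumed field)
    opened_at: datetime
    last_alert_at: datetime
    alerts: list = field(default_factory=list)
    closed_at: Optional[datetime] = None


class Deduplicator:
    def __init__(self) -> None:
        self.open: Dict[str, Signal] = {}
        self.closed: List[Signal] = []

    def on_alert(self, key: str, ts: datetime) -> Signal:
        """Open a signal on the first alert, attach later ones to it."""
        sig = self.open.get(key)
        if sig is None:
            sig = Signal(key, opened_at=ts, last_alert_at=ts)
            self.open[key] = sig
        sig.alerts.append(ts)
        sig.last_alert_at = ts
        return sig

    def close_stale(self, now: datetime) -> None:
        """Scheduler tick: close signals that have been quiet long enough."""
        for key, sig in list(self.open.items()):
            if now - sig.last_alert_at >= QUIET_PERIOD:
                sig.closed_at = now
                self.closed.append(self.open.pop(key))
```

Closed signals are moved to a separate list and never modified again, so a repeated problem opens a new signal and the history of earlier ones stays intact.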

Such a dynamic, scenario-based approach to problem management greatly simplifies the monitoring process, which is especially important for systems with a dynamic environment (containers, microservice architecture, Kubernetes). Simply put, instead of setting up a lot of static triggers and constantly checking that they are still relevant, in the new version of Acure it is enough to write a few scripts and forget about the old monitoring process. The system will do the rest, in keeping with the motto “Monitoring as code”.

We Made “A Single Pane Of Glass” Even More Single 🖥

For effective monitoring, the presentation of data is just as important as automation. With this in mind, we revamped our approach to data visualization. Previously, an Acure user had to switch repeatedly between several screens: the main screen, the operational screen, the timeline and the resource-service model. The old way was not terribly inconvenient, but it increased the time needed to fix a failure.

Now all the functions of the four screens are collected in a single monitoring window, where the monitoring panel is combined with the Service Model graph.

Service Model Graph

The single window is visually divided into two parts. The left panel shows a list of filtered CIs with their health and status. The information in the right panel changes depending on the mode selected by the operator: Service Model graph, CI card, list of signals, service modes, Changelog. There are many cross-links and filters on the screen.

Maintenance mode
Changelog

Another new feature is transition points between maps. You can link maps to certain configuration items, after which transition points appear on these configuration items. This makes it possible to move from one map to another directly on the Service Model graph.

This “one-stop shop” concept is designed to speed up the work with monitored objects, and thus minimize the time to solve the problem.

💡 Find out more about the new Acure features in our documentation and try them yourself in Userspace.

💬 Discuss updates or ask any questions in our friendly community on Discord or on our Community page.

The post ACURE 2.0 Is Officially Released first appeared on Acure AIOps Platform.

]]>
https://acure.io/blog/acure-2-0/feed/ 0