Lab 5 - Event Analytics

Task Goal

In this laboratory session, you'll explore the Event Analytics feature and how to use it to effectively monitor your network, detecting changes in behavior by observing and correlating different event types, for both wired and wireless devices.

This lab task will guide you through the Event Analytics workflow:

- get an overview of the events, grouped by network domain and event type
- use the analytics features to isolate relevant events for interesting time periods
- easily access the individual events to reach the maximum level of detail

Benefits

Event Analytics is a newly introduced feature of Cisco Catalyst Center, available since version 2.3.7, designed to enhance network visibility and improve network management efficiency.

Event Analytics harnesses the power of network event data, enabling network admins to gain unparalleled insights into all kinds of situations that may arise on a network.

One of the key features of Event Analytics is its ability to function efficiently irrespective of whether an issue signature is defined or not. This is a significant shift from traditional network management systems that require specific issue signatures to identify and address network problems.

Instead, Event Analytics leverages machine learning and artificial intelligence algorithms, along with innovative data visualization techniques, to continuously process network event data, allowing network admins to detect anomalies and potential issues in real time.

In this lab we'll explore the Event Analytics workflow, showing how you can go from a global view, down to identifying specific network events, with just a few clicks.

Supported event types

Event type	Wired	Wireless
Syslog	X	X
Reachability	X	X
Radio events		X
Client events	*	X

Note

Wired client events will be added in a future release.

Syslog messages are collected by Cisco Catalyst Center from switches, routers and Wireless LAN Controllers (WLCs).
All messages available at the platform level will be visible using the Event Analytics UI.
By default, to comply with the privacy policy on Cisco AI Analytics, the Syslog events are exported to the Cisco AI Cloud without the text message (therefore only exporting metadata such as the message type, severity, mnemonic, etc.).
To benefit from the full Event Analytics functionality, the user has to explicitly agree to export the full text messages because, unlike the rest of the Cisco AI Analytics data sources, this data is exported in clear text.
Reachability events represent a change in the reachability status of network devices as seen by the Cisco Catalyst Center appliance, which constantly monitors the health of the managed network devices.
For Wireless Access Points (APs), the reachability events are reported by the Wireless LAN Controllers, and are triggered by JOIN and DISJOIN events.
The valid statuses for reachability events are:
- REACHABLE: the device is reachable from the network and it's fully manageable by Cisco Catalyst Center
- PING_REACHABLE: the device is reachable from the network, but it is not manageable by Cisco Catalyst Center
- UNREACHABLE: the device is completely unreachable and it's considered to be offline
Radio events include events such as channel changes or transmission power changes operated by the Radio Resource Management (RRM) algorithm, as well as Coverage hole detection events and radio resets.
Client Events are related to the onboarding and roaming events for wireless clients. A future release will also cover similar events for wired clients.
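
To make the three reachability statuses listed above more concrete, here is a minimal Python sketch that models them and counts status changes for one device. The `Reachability` enum and the `count_transitions` helper are illustrative names introduced for this guide, not part of any Cisco Catalyst Center API, and the sample history is made up.

```python
from enum import Enum


class Reachability(Enum):
    """The three reachability statuses described in this section."""
    REACHABLE = "REACHABLE"            # reachable and fully manageable
    PING_REACHABLE = "PING_REACHABLE"  # reachable, but not manageable
    UNREACHABLE = "UNREACHABLE"        # considered offline


def count_transitions(statuses):
    """Count how many times two consecutive statuses differ."""
    return sum(1 for prev, cur in zip(statuses, statuses[1:]) if prev != cur)


# Hypothetical status history for a single device (oldest first).
history = [
    Reachability.REACHABLE,
    Reachability.UNREACHABLE,
    Reachability.REACHABLE,
    Reachability.REACHABLE,
    Reachability.PING_REACHABLE,
]
print(count_transitions(history))  # 3
```

A device with many transitions in a short period is what we would informally call "flapping"; the threshold for that is a judgment call and is not defined by this guide.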

Use case workflow

Let's get started!

Access the Event Analytics dashboard by going to:

Menu > Assurance > Issue and Events > Event Analytics - Preview

AI-Driven issues - Hamburger menu

Select the Event Analytics - Preview tab from the top bar:

AI-Driven issues - Issues menu

Heatmap Overview

The Heatmap overview shows the volume of network events, by type and category.

Event Analytics - Heatmap overview

The default view focuses on the last 24 hours and it shows Syslog and Reachability events for all wired devices (switches and routers).

You can modify the view by applying a location filter, extending the time period (up to 60 days), or switching from the wired to the wireless view.

Navigate the events

The Heatmap shows the number of events over the selected time period. Areas with darker colors indicate a higher event volume by event type and category.

For instance, the Syslog Heatmap shows the event volume by severity, grouped by High (Sev. 1 & 2), Medium (3 & 4), and Low (5 & 6).
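
The severity grouping above can be sketched in a few lines of Python. This is only an illustration of the grouping as described in this guide: the `severity_group` function is a hypothetical helper, the sample data is made up, and the handling of severities 0 and 7 is an assumption because the grouping above does not mention them.

```python
from collections import Counter


def severity_group(severity: int) -> str:
    """Map a Syslog severity level to the heatmap group described above."""
    if severity in (1, 2):
        return "High"
    if severity in (3, 4):
        return "Medium"
    if severity in (5, 6):
        return "Low"
    # Assumption: levels 0 and 7 are not covered by the grouping in this guide.
    return "Other"


sample_severities = [2, 5, 5, 6, 3, 2, 4, 5]  # made-up sample
volume_by_group = Counter(severity_group(s) for s in sample_severities)
print(volume_by_group["High"], volume_by_group["Medium"], volume_by_group["Low"])  # 2 2 4
```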

As different event categories typically have very different event volumes, each color scale is specific to a given category; this makes it easier to spot trends for rare events, such as high severity Syslog messages.

Event Analytics - Heatmap scale
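
To see why a per-category color scale helps, consider the sketch below. It scales each category against its own peak, so a handful of high severity events stands out as clearly as hundreds of low severity ones. The linear scaling to the per-category maximum is an assumption made for illustration; the guide does not state how the product computes its color scale, and `per_category_intensity` is a hypothetical helper.

```python
def per_category_intensity(counts_by_category):
    """Scale each category's bucket counts to the 0.0-1.0 range independently."""
    result = {}
    for category, counts in counts_by_category.items():
        peak = max(counts, default=0)
        # An all-zero (or empty) category maps to zero intensity everywhere.
        result[category] = [c / peak if peak else 0.0 for c in counts]
    return result


counts = {
    "High": [0, 1, 4],        # rare events
    "Low": [200, 800, 400],   # frequent events
}
print(per_category_intensity(counts))
# {'High': [0.0, 0.25, 1.0], 'Low': [0.25, 1.0, 0.5]}
```

With a single shared scale, the peak of 4 high severity events would be almost invisible next to 800 low severity ones.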

An increase in high severity events calls for immediate investigation, as such events are likely related to issues. A volume increase for medium and low severity events can also indicate a potentially interesting behavior change that is worth investigating.

The first step in the investigation is to identify a time period characterized by a change in event volume.

On the lab setup, please follow the steps below:

- Observe the denser areas of Syslog messages happening around the highlighted timeframes
- Observe how there's a visible increase in network reachability events around the same time

Event Analytics - Correlation across different event types
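
The visual correlation you just performed can be approximated in code. The sketch below flags time buckets where the volume is well above the typical level for that event type, then intersects the results for two event types. The median baseline, the factor of 3, and the `busy_buckets` helper are assumptions chosen for illustration; they do not describe how Event Analytics works internally.

```python
from statistics import median


def busy_buckets(counts, factor=3.0):
    """Return the indexes of buckets whose count exceeds factor x the median.

    The baseline is floored at 1 so that mostly empty series still work.
    """
    baseline = max(median(counts), 1)
    return {i for i, c in enumerate(counts) if c > factor * baseline}


# Made-up per-bucket event counts for two event types.
syslog = [5, 4, 6, 40, 38, 5, 4, 6]
reachability = [0, 1, 0, 12, 9, 0, 1, 0]

print(sorted(busy_buckets(syslog) & busy_buckets(reachability)))  # [3, 4]
```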

Show Analytics

Note

Due to limitations of the lab/demo system, from this point onwards the Event Analytics flow can only be followed in this lab guide. We apologize for the inconvenience.

Time selection

The heatmap is a powerful tool to identify periods when the event volume changes significantly, as well as providing a first level of correlation across different event types.

We're now interested in knowing more about what happened around this time, as it seems that many devices changed their reachability state, becoming unreachable and/or flapping. The Syslog data will likely help us understand what caused the reachability issues.

The next step in identifying what happened at the time of interest is to restrict the analysis to that timeframe.

Clicking on the heatmap restricts the time selection to the specific time bucket; you can extend and modify the time selection by moving the selector bars.

In the example below we observe a period of 1 hour (in the 24-hour view, each block in the heatmap represents a 15-minute period):

Heatmap time selection
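
The bucket arithmetic is easy to verify: a 24-hour view split into 15-minute blocks gives 96 blocks, and a 1-hour selection spans 4 of them. The sketch below shows this, plus one way to assign an event timestamp to a block. The `bucket_index` helper and the sample dates are illustrative only.

```python
from collections import Counter
from datetime import datetime, timedelta

BUCKET = timedelta(minutes=15)  # block size in the 24-hour view


def bucket_index(ts, view_start):
    """Return the zero-based 15-minute block that a timestamp falls into."""
    return (ts - view_start) // BUCKET


print(timedelta(hours=24) // BUCKET)  # 96 blocks in the 24-hour view
print(timedelta(hours=1) // BUCKET)   # 4 blocks in a 1-hour selection

# Made-up event timestamps, expressed as minutes after the start of the view.
view_start = datetime(2024, 1, 1, 0, 0)
events = [view_start + timedelta(minutes=m) for m in (1, 14, 15, 44, 61)]
per_bucket = Counter(bucket_index(t, view_start) for t in events)
print(per_bucket[0], per_bucket[1], per_bucket[4])  # 2 1 1
```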

After clicking on the time bucket, the summary info below each event category will update to reflect the total event count by category over the selected timeframe.
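
Conceptually, the updated summary is a count of events per category restricted to the selected window. A minimal sketch, assuming events are available as `(timestamp, category)` pairs (a made-up representation, as is the `totals_in_selection` helper):

```python
from collections import Counter
from datetime import datetime


def totals_in_selection(events, start, end):
    """Count events per category with start <= timestamp < end."""
    return Counter(category for ts, category in events if start <= ts < end)


# Made-up events for illustration.
events = [
    (datetime(2024, 1, 1, 9, 50), "Syslog"),
    (datetime(2024, 1, 1, 10, 5), "Syslog"),
    (datetime(2024, 1, 1, 10, 20), "Reachability"),
    (datetime(2024, 1, 1, 10, 40), "Syslog"),
    (datetime(2024, 1, 1, 11, 0), "Reachability"),
]
selection = totals_in_selection(
    events, datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 11, 0)
)
print(selection["Syslog"], selection["Reachability"])  # 2 1
```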

Access the card view

The following step is to click on the Show Analytics tab, which will expand the view for a specific event type, showing cards summarizing the top events by different criteria.

While the heatmap view is similar across all event types, the card view is specific to each event type.

For instance, the Syslog cards are designed to highlight more than just the highest severity events or the ones with the highest volume: you can also find cards pointing out rare event types, as well as events that have experienced a volume increase or decrease during the selected period.

New events are the ones that only started occurring towards the end of the selected time frame.

Alternatively you can look for the top network devices based on event volume.

Heatmap card view
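
As an illustration of the "new events" idea described above, the sketch below reports event types whose first occurrence falls in the last part of the selected window. The 25% tail, the `new_event_types` helper, and the use of plain minute offsets as timestamps are all assumptions made for this example; the guide does not specify how the product defines "towards the end".

```python
def new_event_types(events, start, end, tail_fraction=0.25):
    """Return event types first seen in the last tail_fraction of [start, end)."""
    cutoff = start + (end - start) * (1 - tail_fraction)
    first_seen = {}
    for ts, event_type in events:
        if start <= ts < end:
            first_seen[event_type] = min(ts, first_seen.get(event_type, ts))
    return sorted(t for t, ts in first_seen.items() if ts >= cutoff)


# Made-up events as (minutes since window start, event type).
events = [
    (5, "LINK-3-UPDOWN"),
    (48, "SPANTREE-2-BLOCK_BPDUGUARD"),
    (50, "LINK-3-UPDOWN"),
    (55, "SPANTREE-2-BLOCK_BPDUGUARD"),
]
print(new_event_types(events, start=0, end=60))
# ['SPANTREE-2-BLOCK_BPDUGUARD']
```

`LINK-3-UPDOWN` is not reported because it already occurred early in the window.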

Detailed view

Identify a card of interest based on the displayed events; then, click on the View Details link to access the detailed view.

In our exploration we'll focus on Highest Severity Events, as we see a high number of BLOCK_BPDUGUARD messages that we want to investigate.

We're going to click on the View Details link on the Highest Severity Events card:

Highest severity card

The detailed view is structured as follows:

Detailed heatmap:
See the time evolution of individual event types over the selected timeframe. By default, the top 3 events are selected, but you can select up to 5. The events are sorted based on the criteria of the card used to reach the detailed view (in this example, the messages are sorted by severity), and you can access all events by clicking on the Show more link.

Detailed view - Events heatmap

Sankey diagram:
Analyze the impact of specific event types by site and network element.

Detailed view - Sankey

You can interact with the Sankey diagram to identify the distribution of events across sites and individual network devices. In this example, you can see how the spanning tree messages are coming from two specific network devices.

Detailed view - Sankey select event type
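
A Sankey diagram is essentially a set of weighted links between layers. The sketch below builds those links from individual events, assuming the layer order event type, then site, then device, as suggested by the description above. The `sankey_links` helper, the site names, and the device names are hypothetical.

```python
from collections import Counter


def sankey_links(events):
    """Aggregate events into weighted (source, target) links.

    Each event contributes to two links: event type -> site and site -> device.
    """
    links = Counter()
    for event_type, site, device in events:
        links[(event_type, site)] += 1
        links[(site, device)] += 1
    return links


# Made-up events as (event type, site, device).
events = [
    ("SPANTREE-2-BLOCK_BPDUGUARD", "Site-A", "switch-1"),
    ("SPANTREE-2-BLOCK_BPDUGUARD", "Site-A", "switch-1"),
    ("SPANTREE-2-BLOCK_BPDUGUARD", "Site-B", "switch-2"),
    ("PORT_SECURITY-2-PSECURE_VIOLATION", "Site-A", "switch-1"),
]
links = sankey_links(events)
print(links[("SPANTREE-2-BLOCK_BPDUGUARD", "Site-A")])  # 2
print(links[("Site-A", "switch-1")])                    # 3
```

The link weights correspond to the band thickness in the diagram, which is why the device with the most events is easy to spot.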

If you then select the network device with more events, you can see that during the same time, the top events are:

SPANTREE-2-BLOCK_BPDUGUARD
PORT_SECURITY-2-PSECURE_VIOLATION

Detailed view - Sankey select network device
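
Both event names above follow the usual Cisco Syslog identifier layout, FACILITY-SEVERITY-MNEMONIC, so the severity (2 in both cases) can be read directly from the name. A small parsing sketch follows; the `SyslogId` type and `parse_syslog_id` function are illustrative helpers, and the pattern deliberately covers only the simple three-part form (identifiers with a hyphenated sub-facility are not handled).

```python
import re
from typing import NamedTuple


class SyslogId(NamedTuple):
    facility: str
    severity: int
    mnemonic: str


# Optional leading '%', then FACILITY-SEVERITY-MNEMONIC.
_ID_RE = re.compile(
    r"^%?(?P<facility>[A-Z0-9_]+)-(?P<severity>[0-7])-(?P<mnemonic>[A-Z0-9_]+)$"
)


def parse_syslog_id(identifier: str) -> SyslogId:
    """Split a simple Syslog identifier into its three parts."""
    match = _ID_RE.match(identifier)
    if match is None:
        raise ValueError(f"unrecognized Syslog identifier: {identifier!r}")
    return SyslogId(match["facility"], int(match["severity"]), match["mnemonic"])


print(parse_syslog_id("SPANTREE-2-BLOCK_BPDUGUARD"))
print(parse_syslog_id("PORT_SECURITY-2-PSECURE_VIOLATION"))
```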

Event table:
See the individual events, for full details about the filtered events. The table by default shows the events related to the heatmap selection; you can interact with the Sankey diagram, restricting the events based on the event type, location or individual device.

In this example we want to focus on the network device producing the highest number of high severity events, which we isolated thanks to the heatmap and Sankey views.

Clicking on the device name in the Sankey diagram automatically applies a filter on the event table:

Table view - Select device

Scrolling down through the event table, we can observe how the spanning tree and port security messages are indeed related to two specific interfaces:

Gi1/0/37
Gi1/0/38

Table view - Analyze events
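
If you export the filtered events, the same observation can be made programmatically by extracting interface names from the message text. In the sketch below, the message texts are illustrative samples written for this guide (they are not copied from the lab system), and the regular expression only covers abbreviated names such as Gi1/0/37; `interfaces_by_volume` is a hypothetical helper.

```python
import re
from collections import Counter

# Abbreviated interface names such as Gi1/0/37, Te1/1/1 or Fa0/1.
INTERFACE_RE = re.compile(r"\b(?:Gi|Te|Fa)\d+(?:/\d+)+\b")

# Illustrative message texts, not taken from the lab system.
messages = [
    "%SPANTREE-2-BLOCK_BPDUGUARD: Received BPDU on port Gi1/0/37 "
    "with BPDU Guard enabled. Disabling port.",
    "%PORT_SECURITY-2-PSECURE_VIOLATION: Security violation occurred, "
    "caused by MAC address 0000.0000.0001 on port Gi1/0/38.",
    "%SPANTREE-2-BLOCK_BPDUGUARD: Received BPDU on port Gi1/0/38 "
    "with BPDU Guard enabled. Disabling port.",
]


def interfaces_by_volume(texts):
    """Count how many times each interface name appears across messages."""
    return Counter(name for text in texts for name in INTERFACE_RE.findall(text))


print(interfaces_by_volume(messages).most_common())
# [('Gi1/0/38', 2), ('Gi1/0/37', 1)]
```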

In about 3 clicks we went from the overview to a specific device and its specific interfaces affected by this issue involving spanning tree BPDU guard and port security.

Configure a user-defined issue

The Event Analytics feature lets you identify patterns and correlate events across different event types, and it's effective at isolating complex issues irrespective of whether the situation matches an issue signature or not.

If you want to be notified about new events matching the messages identified thanks to this workflow, you can create a user-defined issue directly from the event table:

User defined issue - link

After clicking on the user-defined issue link next to the message of interest, you'll get a prompt asking you to confirm your choice.

Clicking on Confirm takes you to the Issue Settings, where you can complete the user-defined issue creation:

User defined issue - create

You can specify the exact pattern to match, as well as issue priority and notification settings:

User defined issue - pattern

Key takeaways

Event Analytics is the latest addition to the Cisco AI Analytics feature suite on Cisco Catalyst Center. It provides visibility across network domains (covering both wired and wireless) and a variety of network event types.

The advanced analytics features let you quickly identify periods when significant events happen, irrespective of whether there's an issue trigger or not. Correlation across different event types is facilitated by the heatmap view, which exposes multiple data sources in a compact and easy-to-navigate way.

When significant events are identified thanks to Event Analytics, it's possible to configure user-defined issues directly from the Event Analytics workflow.

This concludes the exploration of the Event Analytics feature.
You can use the link below to proceed with the exploration of other use cases.

Click here to go back to the use-cases list