Session 4: Operations CTRL Monitoring Walk Through

Elastic Monitoring Basics
1. We’re focusing on Expedient’s Operations CTRL Monitoring product, which is built on Elastic’s observability platform.
2. As with all Elastic services, each agent running on a client VM needs the correct policy set in order to ingest the data that observability depends on. Expedient configures this for clients as part of the delivery process, but it can also be verified by checking individual agents under Management > Fleet, which was covered in the first module of this series.
Infrastructure > Hosts
1. The main page for seeing a snapshot of your environment is under Observability > Infrastructure > Hosts, or under the spaces shortcut for OperationsCTRL Monitoring.
2. The time filter in the upper right corner defaults to the past 15 minutes, but Expedient’s default configuration retains up to 30 days of history.
  1. The time period can be set either by clicking the calendar icon and choosing one of the commonly used shortcuts, or by clicking the currently shown interval and entering an exact time period.
  2. When viewing a graph or table that shows a time period, it’s also possible to click and drag to select a specific time range and see more detail.
3. A number of filters can be applied at the top. There are premade filters, such as operating system, and the search bar at the top accepts KQL queries that can filter the view to a very fine level of granularity. The easiest way to find variables to build filters with is usually to look at existing charts or graphs. For instance, hovering over the CPU usage chart shows that it’s based on the variable system.cpu.system.pct.
4. The bottom half of the page shows various metrics by default. Any filters applied at the top also affect these charts. Any individual chart can be opened in Lens, for a deeper dive or additional filtering, by clicking the three dots in its upper right-hand corner.
5. The logs tab shows all of the events used to generate the data, and these can be explored and filtered further from the logs view. There is a shortcut on the logs tab that opens the logs view with the exact filters currently applied. Alternatively, the logs module can be accessed from the sidebar under Observability > Logs > Stream.
6. The alerts tab similarly shows any alerts that match the current filter criteria.
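To make the KQL filtering described above more concrete, here is a minimal Python sketch that assembles a query string of the kind that could be pasted into the search bar. The helper names (kql_term, kql_range, kql_and) are hypothetical and not part of any Elastic library. The field host.os.type and the 0.8 threshold are assumptions chosen for illustration; only system.cpu.system.pct comes from the walkthrough, and it is assumed here to be reported as a fraction rather than a percentage.

```python
def kql_term(field, value):
    """Return a KQL `field: "value"` clause, escaping quotes in the value."""
    escaped = str(value).replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}: "{escaped}"'


def kql_range(field, op, number):
    """Return a KQL range clause such as `field > 0.8`."""
    if op not in (">", ">=", "<", "<="):
        raise ValueError(f"unsupported operator: {op}")
    return f"{field} {op} {number}"


def kql_and(*clauses):
    """Join clauses with `and`, so every clause must match."""
    return " and ".join(clauses)


# host.os.type and the 0.8 threshold are illustrative assumptions.
query = kql_and(
    kql_term("host.os.type", "linux"),
    kql_range("system.cpu.system.pct", ">", 0.8),
)
print(query)
# prints: host.os.type: "linux" and system.cpu.system.pct > 0.8
```

The printed string can be pasted into the search bar on the Hosts page or into Discover. Building it from small pieces is only a convenience for keeping longer filters readable.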
Alerts
1. Alerts has its own dedicated page under Observability on the sidebar.
2. The main alerts page allows you to view current and past alerts, and to filter them both by type and with KQL syntax.
3. Alerts are based on rules, which search all incoming data for anything that matches specific criteria and then create an alert. To view rules, click “manage rules” in the upper right-hand corner, or click the reason link for an alert to see the specific rule that generated it.
  1. Expedient has a preset list of rules that are visible in the default space. These rules have an integration that creates a ticket in Expedient’s Support Management Console (https://support.expedient.com/) for Expedient to investigate.
  2. Expedient recommends that any custom rules clients create are made under their OperationsCTRL monitoring space and not the default space.
  3. Clicking on a rule name brings up the details for that rule, including the history of alerts as well as the execution history, which shows whether each run of the rule and its search succeeded or failed.
  4. Clicking on actions > edit rule or the conditions for a rule will bring up the settings for that rule.
    1. From the edit panel you can edit two main things: the rule’s metric thresholds and the actions that are taken when the criteria are met.
    2. Metric thresholds are expressed in what is essentially KQL syntax and are easily readable as plain text. Any variable can be easily edited. Filters can also be added if the rule generates false positives.
    3. Actions allow notifications to be sent whenever the metric thresholds are met. SMC alert is Expedient’s webhook to our Support Management Console. Elastic has native support for a number of other connectors.
    4. Email is one of the simplest options. The email connector is automatically configured to use Expedient’s mail relay. Other email relays are not supported by Expedient.
    5. The subject and body of the email can be easily adjusted, and relevant variables can be added, such as the date or even the alerting host name.
4. If you’re trying to create a custom rule and can’t figure out the correct variables, the Discover tool (under Analytics on the hamburger menu in the upper left) allows you to test filters to make sure you’re getting the data you expect.
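The relationship between a rule’s threshold, its false-positive filters, and its notification action can be sketched in a few lines of Python. This is a conceptual illustration only and not how Elastic implements rules. The ThresholdRule class, the email_subject helper, the host names, and the 0.9 threshold are all hypothetical, and the sample documents are invented.

```python
from dataclasses import dataclass, field


@dataclass
class ThresholdRule:
    """Hypothetical stand-in for a metric threshold rule (not an Elastic API)."""
    name: str
    metric: str
    threshold: float
    # Hosts excluded by a filter added to suppress false positives.
    excluded_hosts: set = field(default_factory=set)

    def matches(self, doc):
        """True when the document breaches the threshold and is not filtered out."""
        if doc.get("host.name") in self.excluded_hosts:
            return False
        value = doc.get(self.metric)
        return value is not None and value > self.threshold


def email_subject(rule, doc):
    """Build a notification subject that includes the alerting host name."""
    return f"[{rule.name}] {doc['host.name']}: {rule.metric} = {doc[rule.metric]}"


# Sample documents; host names and values are invented for illustration.
docs = [
    {"host.name": "web01", "system.cpu.system.pct": 0.95},
    {"host.name": "web02", "system.cpu.system.pct": 0.40},
    {"host.name": "batch01", "system.cpu.system.pct": 0.99},
]

rule = ThresholdRule(
    name="High CPU",
    metric="system.cpu.system.pct",
    threshold=0.9,
    excluded_hosts={"batch01"},  # assumed noisy host, filtered out
)

subjects = [email_subject(rule, d) for d in docs if rule.matches(d)]
for subject in subjects:
    print(subject)
# prints: [High CPU] web01: system.cpu.system.pct = 0.95
```

Here web02 stays below the threshold and batch01 is excluded by the filter, so only web01 produces a notification. Trying out a threshold and filter against known data in this way is the same idea as testing a filter in Discover before saving a rule.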
Uptime Monitors
1. Uptime monitoring, or ping monitoring, is a feature Expedient configures for all OperationsCTRL monitoring clients. Uptime monitoring uses a heartbeat server located in the client’s environment (usually named CLIENTNAME-Mon01) to provide ping monitoring and to ingest other logs into the Elastic platform (e.g. logs from a Palo Alto firewall that doesn’t have a native Elastic agent).
2. If ping monitoring fails, Expedient will automatically generate a ticket and investigate. Additional rule actions can be created under “Alerts” if clients want to receive notifications using other methods.
3. Configuration changes to uptime monitors (such as adding or removing a specific device or IP) need to go through Expedient’s Operation Support Center (OSC) by creating a ticket in the SMC.
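For a rough idea of what an uptime check does, the Python sketch below tests whether a device accepts a TCP connection within a timeout. This is an assumption-laden stand-in and not Expedient’s or Elastic’s implementation: the walkthrough describes ping monitoring, and a TCP connection test is swapped in here because ICMP ping usually needs elevated privileges. The tcp_check function name is hypothetical, and the demonstration uses a throwaway local listener instead of a real device.

```python
import socket


def tcp_check(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Demonstrate against a temporary local listener so no real device is needed.
listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
listener.listen(1)
demo_port = listener.getsockname()[1]

status_while_listening = "up" if tcp_check("127.0.0.1", demo_port) else "down"
listener.close()
status_after_close = "up" if tcp_check("127.0.0.1", demo_port, timeout=0.5) else "down"

print(status_while_listening, status_after_close)
```

A check like this can be handy for confirming reachability from a workstation before opening a ticket, but changes to the actual uptime monitors still go through the OSC as described above.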