Monitor your Service
This document provides instructions on how to monitor your application services using the Volterra service mesh in the VoltConsole. Volterra service mesh provides observability for the application by presenting detailed information on the service interaction for your application. To know more about the service mesh concepts, see Service Mesh.
Using the instructions provided in this document, you can monitor various details such as metrics, API endpoints, and anomalies.
Note: If you do not have an account, see Create a VES Account.
One or more applications deployed on a Volterra site, with services configured.
- In the case of a standalone pod such as a traffic generator, set the ves.io/app_type annotation for both the pod and the service so that the service mesh displays an accurate service graph. For more information on annotations, see Resource Management for vK8s.
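As a rough illustration of the annotation requirement above, the following sketch sets the same ves.io/app_type annotation on a pod manifest and on a service manifest, both represented as plain Python dictionaries. The helper name `with_app_type`, the resource names, and the value `my-app` are hypothetical and are not taken from the Volterra documentation; only the annotation key comes from the text.

```python
import copy

APP_TYPE_ANNOTATION = "ves.io/app_type"  # annotation key named in this document


def with_app_type(manifest: dict, app_type: str) -> dict:
    """Return a copy of a Kubernetes-style manifest with the annotation set."""
    updated = copy.deepcopy(manifest)
    annotations = updated.setdefault("metadata", {}).setdefault("annotations", {})
    annotations[APP_TYPE_ANNOTATION] = app_type
    return updated


# Hypothetical standalone pod and the service that fronts it.
pod = {"apiVersion": "v1", "kind": "Pod", "metadata": {"name": "traffic-generator"}}
service = {"apiVersion": "v1", "kind": "Service", "metadata": {"name": "traffic-generator"}}

# "my-app" is a placeholder value chosen for illustration only.
pod = with_app_type(pod, "my-app")
service = with_app_type(service, "my-app")

print(pod["metadata"]["annotations"])
print(service["metadata"]["annotations"])
```

The point of the sketch is only that the pod and the service carry an identical annotation; how the manifests are applied to vK8s is outside its scope.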
Activities of Monitoring
Monitoring your services from VoltConsole includes the following activities:
|Inspect Service Graph||View the service interactions of your application as a graph of nodes (services) and edges (interactions).|
|Inspect API Endpoints||View the API endpoints of your services and the probability density of their request, error, latency, and throughput metrics.|
|Inspect Service Mesh Dashboard||View the dashboard to check overall and unhealthy services, active alerts, HTTP errors, and latency distribution.|
|Inspect Service Metrics||View the trends for request rate, error rate, latency, and throughput.|
|Inspect Virtual Services||View the virtual services for your application.|
|Inspect Service Alerts||View the active alerts or all alerts for your application services.|
|Inspect Service Requests||View the trend of sampled HTTP requests for your services.|
|Inspect Service Connections||TBD|
Navigate to Service Mesh
Log into VoltConsole and change to your namespace. Select Mesh from the configuration menu and Service Mesh from the options pane. A list of your applications is displayed with the overall application health, the total number of services, and the individual service health for each listed application.
Click on your application from the displayed list to load the service graph for it.
Inspect Service Graph
The service graph presents your application's service interactions as a graph, with services shown as nodes and interactions shown as edges. Arrows on the edges indicate the direction of the service request flow.
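The node-and-edge model described above can be sketched as a small data structure. This is a minimal illustration, not the way VoltConsole builds its graph: the function name `build_service_graph`, the service names, and the per-edge request count are all assumptions made for the example.

```python
from collections import defaultdict


def build_service_graph(requests):
    """Build nodes and directed edges from (source, destination) request pairs."""
    nodes = set()
    edges = defaultdict(int)  # (source, destination) -> number of requests seen
    for source, destination in requests:
        nodes.update((source, destination))
        edges[(source, destination)] += 1
    return nodes, dict(edges)


# Hypothetical observed requests between services.
observed = [
    ("frontend", "cart"),
    ("frontend", "catalog"),
    ("cart", "payments"),
    ("frontend", "cart"),
]

nodes, edges = build_service_graph(observed)
for (source, destination), count in sorted(edges.items()):
    # The arrow mirrors the direction of the request flow on an edge.
    print(f"{source} -> {destination}: {count} request(s)")
```

Edges are keyed by an ordered pair, so a request from cart to frontend would be a different edge from one in the opposite direction.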
Hover on any node or edge to display a snapshot of the overall application health and metrics.
This example shows the snapshot for an edge.
Click on any node or edge to display a quick information card for that service or interaction.
Click the tabs on the card, one of which is Health, to obtain quick information on the current status and health. From within those views, you can click Alerts for nodes or Endpoints for edges to load the alerts or endpoints view, respectively. This example shows the quick view for a node.
Display the service mesh for only one node of your application.
Click the Explore Service option on the quick view to display a filtered view of the service mesh for that service only.
You can also click the node or edge to display the quick view card for that service or interaction.
Set a time interval, refresh, or filters for your service graph view.
Click the Last 5 minutes dropdown on the upper right end of the view and select a time interval to inspect your service graph for that interval. The default is 5 minutes. You can also refresh the status periodically by setting a refresh value using the Refresh every field. Alternatively, click Refresh next to the Last 5 minutes option to manually refresh the status.
Inspect API Endpoints
The service endpoints view presents the API endpoints for the services of your application in a graph view, with the root and leaves representing the different API paths. You can also obtain the Probability Density Function (PDF) for the Request, Error, Latency, and Throughput (RELT) metrics for the API endpoints.
Load the endpoints monitor view.
Click the Endpoints tab to load the endpoints view.
Switch to PDF view for the endpoint metrics.
Click the Table option to load the PDF for the RELT metrics for all of the endpoints. The following metrics are displayed:
- Request size and response size
- Request rate and error rate
- Latency with and without data
- Response throughput
Hover over any metric to display its PDF value in terms of percentage, percentile, and mean values.
Note: You can also use the search option to display the PDF for a specific service or set of services.
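To make the percentage, percentile, and mean values concrete, here is a minimal sketch that derives them from a handful of latency samples. The source does not state how VoltConsole computes its PDF, so the bucketing approach, the function name `probability_density`, the bucket width, and the sample values are all assumptions for illustration.

```python
from collections import Counter
from statistics import mean, quantiles


def probability_density(samples, bucket_width):
    """Return {bucket_start: percentage of samples that fall in that bucket}."""
    counts = Counter((value // bucket_width) * bucket_width for value in samples)
    total = len(samples)
    return {bucket: 100.0 * count / total for bucket, count in sorted(counts.items())}


# Hypothetical latency samples for one API endpoint, in milliseconds.
latencies_ms = [12, 15, 18, 22, 25, 31, 35, 48, 52, 95]

density = probability_density(latencies_ms, bucket_width=10)
percentiles = quantiles(latencies_ms, n=100)  # 99 cut points: p1 through p99
p50, p90 = percentiles[49], percentiles[89]
average = mean(latencies_ms)

for bucket, share in density.items():
    print(f"{bucket}-{bucket + 10} ms: {share:.0f}% of samples")
print(f"p50={p50:.1f} ms, p90={p90:.1f} ms, mean={average:.1f} ms")
```

A coarse histogram like this is only an approximation of a density function; narrower buckets and more samples give a smoother picture.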
Display probability density for a specific metric of a specific endpoint.
Click on the PDF of any metric in the Table view to load its probability density trend in a graph.
Note: You can also switch to other metrics using the X Axis drop-down to display their graphs.
Inspect Service Mesh Dashboard
The application's dashboard presents an overall monitoring view for your application so that you can inspect overall services, unhealthy services, HTTP errors, latency distribution, and trends for service metrics. Monitor your service mesh dashboard as per the following guidelines:
Open the dashboard view of your service mesh.
Click the Dashboard tab to load the service dashboard.
Monitor the overall service snapshot and unhealthy services.
Check the Services part of the dashboard to get a count of the healthy, unhealthy, and total services. You can also click Services in this view to load the service graph view.
Check Top Unhealthy Services to obtain details on the unhealthy services.
Monitor active alerts from the dashboard view.
Check the Active Alerts part of the dashboard to get a count of the critical, major, minor, and total active alerts. You can also click Active Alerts in this view to load the alerts view.
Check Top Active Alerts by Service to obtain the active alerts per service.
Check Critical Active Alerts by Service to obtain the critical active alerts per service.
Monitor the HTTP errors from the dashboard.
Use the HTTP Errors by Service part of the dashboard to check the HTTP errors per service for both client and server. You can also display errors for specific HTTP codes using the filter option; one filter is selected by default.
Select HTTP Errors as Server to obtain the server errors for HTTP codes 4xx and 5xx.
Select HTTP Errors as Client to obtain the client errors for HTTP codes 4xx and 5xx.
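The per-service breakdown of 4xx and 5xx errors can be sketched as a simple aggregation. The sketch assumes that "as server" attributes an error to the service that received the request and "as client" attributes it to the service that sent it; that reading, the function names, and the sample data are assumptions and are not confirmed by the source.

```python
from collections import defaultdict


def error_class(status):
    """Map an HTTP status code to "4xx" or "5xx", or None if it is not an error."""
    if 400 <= status <= 499:
        return "4xx"
    if 500 <= status <= 599:
        return "5xx"
    return None


def http_errors_by_service(samples, role):
    """Count errors per service; role is "client" or "server"."""
    if role not in ("client", "server"):
        raise ValueError("role must be 'client' or 'server'")
    counts = defaultdict(lambda: {"4xx": 0, "5xx": 0})
    for client, server, status in samples:
        kind = error_class(status)
        if kind is not None:
            service = client if role == "client" else server
            counts[service][kind] += 1
    return dict(counts)


# Hypothetical samples: (client service, server service, HTTP status code).
samples = [
    ("frontend", "cart", 200),
    ("frontend", "cart", 503),
    ("frontend", "catalog", 404),
    ("cart", "payments", 500),
    ("cart", "payments", 502),
]

as_server = http_errors_by_service(samples, "server")
as_client = http_errors_by_service(samples, "client")
print("as server:", as_server)
print("as client:", as_client)
```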
Monitor latency, request rate, and throughput for services from the dashboard.
Use the Latency Distribution by Service part of the dashboard to check the latency per service.
Use the Request Rate, Latency and Throughput of Services part of the dashboard to check the graph of latency versus request rate. The dot size represents the relative throughput. Hover over a dot to see the latency, request rate, and throughput values for the service it represents.
Set a time interval or a refresh for your dashboard.
Click the Last 10 minutes dropdown on the upper right end of the dashboard and select a time interval to inspect your service dashboard for that interval. The default is 10 minutes. You can also refresh the status periodically by setting a refresh value using the Refresh every field. Alternatively, click Refresh next to the Last 10 minutes option to manually refresh the status.
Inspect Service Metrics
Click the Metrics tab to load the service metrics monitoring view.
The service metrics view presents detailed metrics for request rate, error rate, latency, and throughput. The metrics are displayed in graphs representing the trend over a period of time (1 hour by default). You can choose to display client or server trends using the toggle option. You can also select specific nodes or edges to display the metrics for that service or interaction.
Monitor request rate and error rate.
- Click the fields on the right side under the Rate section, including Error Rate 4XX and Error Rate 5XX, to display the rate trend. Each field has two graph bar fields to its left. Select an area of one field and an area of the second field to display a combined graph for the selected rates.
The Request Rate trend for the server is displayed by default.
Monitor the latency trend.
- Click the fields on the right side under the Latency section, including Server RTT and App Latency, to display the latency trend. Each field has two graph bar fields to its left. Select an area of one field and an area of the second field to display a combined graph for those metrics.
Monitor throughput trends.
- Click the fields on the right side under the Throughput section, including Downstream Throughput, to display the throughput trend. Each field has two graph bar fields to its left. Select an area of one field and an area of the second field to display a combined graph for them.
Obtain the metrics trend for a specific time interval.
Click the Last 1 hour dropdown on the upper right end of the view and select a time interval to inspect your service metrics for that interval. The default is 1 hour, and the maximum allowed interval is 24 hours. You can customize the interval by selecting the Custom option and choosing a date range. This can also be set graphically by adjusting the controls beneath the main graph.
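The interval rules above (a 1 hour default and a 24 hour maximum) can be expressed as a short check. This is a sketch of the stated limits only; the function name `resolve_interval` and its behavior on invalid input are assumptions, and the console may handle custom ranges differently.

```python
from datetime import datetime, timedelta

DEFAULT_INTERVAL = timedelta(hours=1)   # default stated in this document
MAX_INTERVAL = timedelta(hours=24)      # maximum stated in this document


def resolve_interval(now, start=None, end=None):
    """Return (start, end), defaulting to the last hour before `now`."""
    end = end or now
    start = start or end - DEFAULT_INTERVAL
    if start >= end:
        raise ValueError("start must be earlier than end")
    if end - start > MAX_INTERVAL:
        raise ValueError("interval exceeds the 24 hour maximum")
    return start, end


now = datetime(2024, 1, 1, 12, 0)  # fixed timestamp chosen for the example

default_range = resolve_interval(now)
custom_range = resolve_interval(now, start=datetime(2024, 1, 1, 0, 0))
print(default_range)
print(custom_range)
```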
Inspect Virtual Services
Monitor the virtual services for your application.
Click the 'Virtual Services' tab to load the virtual services view.
Inspect Service Alerts
The service alerts view provides monitoring of the alerts specific to the application services.
Monitor service alerts.
Click the Alerts tab to load the alerts view.
The active alerts are displayed by default. Use the toggle selection to load all alerts. You can also set a time interval in the active alerts view to display alerts over a specific period of time. Click > for any alert entry to display its details in JSON format.
Inspect Service Requests
The service requests view provides monitoring of the requests specific to the application services. This view presents the request trend for your services using sampled HTTP requests over a default or selected time period.
Monitor the requests trend for your services.
Click the Requests tab to load the view for the trend of sampled HTTP requests.
The requests are displayed in a graphical trend as well as in a list for the default or a specific time interval. Click > for any listed request to display detailed information in JSON format.
You can also apply filters to display the trend for specific HTTP codes. For example, de-select all and select only 2xx to display the requests for HTTP code 2XX.
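The effect of the HTTP code filter on the request trend can be sketched as follows. The record fields (`timestamp`, `path`, `status`), the one-minute grouping, and the function names are hypothetical; the source does not describe the format of the sampled requests or the granularity of the trend.

```python
from collections import Counter


def status_class(status):
    """Map an HTTP status code to its class label, for example 204 -> "2xx"."""
    return f"{status // 100}xx"


def filter_by_class(requests, selected):
    """Keep only the requests whose status class is in the selected set."""
    return [r for r in requests if status_class(r["status"]) in selected]


def trend_per_minute(requests):
    """Count requests per one-minute bucket, keyed by the bucket start in seconds."""
    counts = Counter(r["timestamp"] // 60 * 60 for r in requests)
    return dict(sorted(counts.items()))


# Hypothetical sampled requests; timestamps are seconds from an arbitrary origin.
sampled = [
    {"timestamp": 0, "path": "/cart", "status": 200},
    {"timestamp": 20, "path": "/cart", "status": 500},
    {"timestamp": 61, "path": "/catalog", "status": 204},
    {"timestamp": 75, "path": "/catalog", "status": 404},
    {"timestamp": 130, "path": "/pay", "status": 201},
]

only_2xx = filter_by_class(sampled, {"2xx"})
print(trend_per_minute(sampled))
print(trend_per_minute(only_2xx))
```

Selecting several classes at once is a matter of passing a larger set, such as `{"4xx", "5xx"}`.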