In the microservice world, many independent, isolated services run in parallel, and there is always the question of which instances of a service are running fine and which are not. In the monolithic world this was quite simple: for small to moderate traffic, three or four instances are sufficient to handle the load, and each instance can easily be monitored through APM, access logs, and sometimes even instance-level monitoring (as one VM runs only one application instance).
In the microservice world the complexity increases N-fold: each service starts its own embedded Tomcat, and each service exposes multiple API calls which are invoked either directly by clients or by other microservices.
A core principle of microservice architecture is that each service should be isolated and independent, and the same principle holds true for monitoring. Each service should be self-monitored and should take part in overall scaling, coordination, and orchestration.
Just as blood pressure, blood sugar, and pulse are the metrics that determine whether a person is healthy, there are various metrics each microservice should expose to determine the health of that particular service.
For monitoring a particular instance that runs a bunch of microservices, the following metrics are critically important:
- CPU Usage
- Memory Usage and Swap
- Bytes transferred across the network.
- Disk read/write speed.
For monitoring a particular microservice, the following metrics become critically important (see the JVM sketch after this list):
- CPU usage by the container.
- Free and used memory statistics in terms of heap memory.
- Garbage Collection performance.
- Classes loaded and threads.
- API response time.
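Most of these JVM-level figures can also be read directly through the standard java.lang.management MXBeans. Here is a minimal standalone sketch (illustrative only, not part of the sample application):

import java.lang.management.ClassLoadingMXBean;
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.ThreadMXBean;

public class JvmStats {
    public static void main(String[] args) {
        // Heap usage in bytes, the same numbers that back heap_* metrics
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        System.out.println("Heap used: " + memory.getHeapMemoryUsage().getUsed());
        System.out.println("Heap committed: " + memory.getHeapMemoryUsage().getCommitted());

        // Live thread count
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        System.out.println("Live threads: " + threads.getThreadCount());

        // Currently loaded classes
        ClassLoadingMXBean classes = ManagementFactory.getClassLoadingMXBean();
        System.out.println("Classes loaded: " + classes.getLoadedClassCount());

        // One bean per garbage collector (e.g. young and old generation)
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + " collections: " + gc.getCollectionCount());
        }
    }
}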
All of the above statistics can easily be monitored, alerted on, and visualised using the following tools:
- cAdvisor -- Monitoring the Docker Container and System.
- Spring Boot Actuator -- Getting application performance metrics.
- Prometheus -- Time Series DB
- Grafana -- Graphs and Monitoring Dashboards.
Let's start making all these things work together. For better understanding, let's take a use case:
An admin looks at a dashboard to monitor a single microservice that is deployed as a Docker container.
Spring Boot:
Let's start the real work: create a Spring Boot application and add the following dependencies:

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-actuator</artifactId>
</dependency>
<dependency>
    <groupId>io.prometheus</groupId>
    <artifactId>simpleclient_spring_boot</artifactId>
    <version>0.1.0</version>
</dependency>
<dependency>
    <groupId>io.prometheus</groupId>
    <artifactId>simpleclient</artifactId>
    <version>0.1.0</version>
</dependency>
<dependency>
    <groupId>io.prometheus</groupId>
    <artifactId>simpleclient_hotspot</artifactId>
    <version>0.1.0</version>
</dependency>
<dependency>
    <groupId>io.prometheus</groupId>
    <artifactId>simpleclient_servlet</artifactId>
    <version>0.1.0</version>
</dependency>
Create a configuration class (named PrometheusConfiguration here) and add the following:
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;

import org.springframework.boot.actuate.endpoint.PublicMetrics;
import org.springframework.boot.autoconfigure.condition.ConditionalOnMissingBean;
import org.springframework.boot.web.servlet.ServletRegistrationBean;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

import io.prometheus.client.Collector;
import io.prometheus.client.CollectorRegistry;
import io.prometheus.client.exporter.MetricsServlet;
import io.prometheus.client.hotspot.MemoryPoolsExports;
import io.prometheus.client.hotspot.StandardExports;
import io.prometheus.client.spring.boot.SpringBootMetricsCollector;

@Configuration
public class PrometheusConfiguration {

    @Bean
    @ConditionalOnMissingBean
    CollectorRegistry metricRegistry() {
        return CollectorRegistry.defaultRegistry;
    }

    @Bean
    ServletRegistrationBean registerPrometheusExporterServlet(CollectorRegistry metricRegistry) {
        // Expose every registered collector in Prometheus text format at /prometheus
        return new ServletRegistrationBean(new MetricsServlet(metricRegistry), "/prometheus");
    }

    @Bean
    public SpringBootMetricsCollector metricsCollector(Collection<PublicMetrics> publicMetrics) {
        // Register the JVM process and memory-pool exporters with the default registry
        List<Collector> collectors = new ArrayList<>();
        collectors.add(new StandardExports());
        collectors.add(new MemoryPoolsExports());
        collectors.forEach(Collector::register);

        // Bridge Spring Boot Actuator's PublicMetrics into Prometheus gauges
        SpringBootMetricsCollector springBootMetricsCollector =
                new SpringBootMetricsCollector(publicMetrics);
        springBootMetricsCollector.register();
        return springBootMetricsCollector;
    }
}
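Note that the URLs in this post use port 9099. Presumably that comes from the application's configuration; if you are following along, set it in application.properties (this value is an assumption, adjust to your own setup):

# application.properties -- assumed port so the URLs below resolve
server.port=9099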
Now we need to create a sample API:
// Inside a @RestController-annotated class:
@RequestMapping(value = "/postjson", method = RequestMethod.POST)
public String postjson(@RequestBody String json) throws IOException {
    System.out.println("Request: " + json);
    try {
        // Wait a random 5-10 seconds to simulate a slow, variable response
        Random ran = new Random();
        int x = ran.nextInt(6) + 5;
        TimeUnit.SECONDS.sleep(x);
    } catch (InterruptedException e) {
        e.printStackTrace();
    }
    return "Received:" + json;
}
The above API is a very simple one: upon each call it waits 5 to 10 seconds and then responds.
This will let us see ups and downs in the monitoring graphs.
Once you run the above application, you will be able to see the metrics in Prometheus format at the following URL:
http://localhost:9099/prometheus
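You can also fetch the same endpoint from a shell:

curl http://localhost:9099/prometheus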
The output will look like:
# HELP httpsessions_max httpsessions_max
# TYPE httpsessions_max gauge
httpsessions_max -1.0
# HELP httpsessions_active httpsessions_active
# TYPE httpsessions_active gauge
httpsessions_active 0.0
# HELP mem mem
# TYPE mem gauge
mem 362111.0
# HELP mem_free mem_free
# TYPE mem_free gauge
mem_free 125518.0
# HELP processors processors
# TYPE processors gauge
processors 8.0
# HELP instance_uptime instance_uptime
# TYPE instance_uptime gauge
instance_uptime 412918.0
# HELP uptime uptime
# TYPE uptime gauge
uptime 415762.0
# HELP systemload_average systemload_average
# TYPE systemload_average gauge
systemload_average 4.4248046875
# HELP heap_committed heap_committed
# TYPE heap_committed gauge
heap_committed 313344.0
# HELP heap_init heap_init
# TYPE heap_init gauge
heap_init 262144.0
...
# HELP gauge_response_postjson gauge_response_postjson
# TYPE gauge_response_postjson gauge
gauge_response_postjson 10032.0
Here we have created our custom metric gauge_response_postjson, which tells us the last response time of the postjson API. As Prometheus scrapes every five seconds, we get the last response time for each five-second interval, which we will plot as a graph.

Prometheus:

Let's start configuring Prometheus; we will run it as a Docker container. Point Prometheus at our endpoint by adding the following job under scrape_configs in prometheus.yml:

- job_name: 'finx_service'
  metrics_path: /prometheus
  scrape_interval: 5s
  static_configs:
    - targets: ['192.168.1.178:9099']

Start Prometheus:

docker run -d -p 9090:9090 \
  -v $PWD/prometheus.yml:/etc/prometheus/prometheus.yml \
  prom/prometheus

You should be able to access Prometheus at the following URL:

http://localhost:9090/

Make sure that the targets are working fine (state=up):

http://localhost:9090/targets

Grafana:

Start Grafana using Docker:

docker run -d --name=grafana -p 3000:3000 grafana/grafana

Log in to http://localhost:3000 as admin/admin.

In order to simulate load, we will fire the following command using curl multiple times.
curl -H "Content-Type: application/json" -X POST -d '{"username":"xyz","password":"xyz"}' http://localhost:9099/postjson
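To avoid typing it by hand, a small shell loop (a convenience sketch, not part of the original setup) can fire overlapping requests in the background:

# Fire 20 requests, one per second; each runs in the background so they overlap
for i in $(seq 1 20); do
  curl -s -H "Content-Type: application/json" -X POST \
       -d '{"username":"xyz","password":"xyz"}' \
       http://localhost:9099/postjson &
  sleep 1
done
wait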
Now configure a Grafana dashboard with Prometheus as the data source and a graph panel for our metric.
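For the graph panel, a query along these lines plots the API's last response time (the job label matches the scrape config above; the exact panel setup is a suggestion, not taken from the original dashboard):

gauge_response_postjson{job="finx_service"}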
System Monitoring:
Now let's monitor the Docker container and system statistics using cAdvisor.
Start cAdvisor as follows:
sudo docker run \
--volume=/var/run:/var/run:rw \
--volume=/sys:/sys:ro \
--volume=/var/lib/docker/:/var/lib/docker:ro \
--publish=8080:8080 \
--detach=true \
--name=cadvisor \
google/cadvisor:latest
Check the following URL to verify that metrics are being generated:
http://localhost:8080/metrics
Note: you might be wondering, since cAdvisor has its own dashboard at http://localhost:8080/, why we are moving its metrics from cAdvisor to Prometheus and plotting graphs in Grafana. The reason is that cAdvisor only keeps the last 15 minutes of data; anything older is overwritten, so if you want to look further back we have to send the metrics to Prometheus and use Grafana to visualise them.
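To pull cAdvisor's metrics into Prometheus, add another scrape job to prometheus.yml alongside the service job and restart the Prometheus container (the job name and target host here are illustrative; cAdvisor serves on Prometheus's default /metrics path, so no metrics_path is needed):

- job_name: 'cadvisor'
  scrape_interval: 5s
  static_configs:
    - targets: ['192.168.1.178:8080']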
You should be able to see the following targets in Prometheus:
Configure the Grafana dashboard by importing the free Docker and System dashboard (Docker Dashboard).
The above source code is available on GitHub.
In the next blog I will cover monitoring orchestration and auto-discovery.