Spring Boot Autoscaler

One of more important reasons we are deciding to use such a tools like Kubernetes, Pivotal Cloud Foundry or HashiCorp’s Nomad is an availability of auto-scaling our applications. Of course those tools provides many other useful mechanisms, but we can implement auto-scaling by ourselves. At first glance it seems to be difficult, but assuming we use Spring Boot as a framework for building our applications and Jenkins as a CI server, it finally does not require a lot of work. Today, I’m going to show you how to implement such a solutions using the following frameworks/tools:

  • Spring Boot
  • Spring Boot Actuator
  • Spring Cloud Netflix Eureka
  • Jenkins CI

How it works?

Every Spring Boot application, which contains Spring Boot Actuator library can expose metrics under endpoint /actuator/metrics. There are many valuable metrics that gives you the detailed information about an application status. Some of them may be especially important when talking about autoscaling: JVM, CPU metrics, a number of running threads and a number of incoming HTTP requests. There is dedicated Jenkins pipeline responsible for monitoring application’s metrics by polling endpoint /actuator/metrics periodically. If any monitored metrics is below or above target range it runs new instance or shutdown a running instance of application using another Actuator endpoint /actuator/shutdown. Before that, it needs to fetch the current list of running instances of a single application in order to get an address of existing application selected for shutting down or the address of server with the smallest number of running instances for a new instance of application..

spring-autoscaler-1

After discussing an architecture of our system we may proceed to the development. Our application needs to meet some requirements: it has to expose metrics and endpoint for graceful shutdown, it needs to register in Eureka after after startup and deregister on shutdown, and finally it also should dynamically allocate running port randomly from the pool of free ports. Thanks to Spring Boot we may easily implement all these mechanisms if five minutes 🙂

Dynamic port allocation

Since it is possible to run many instances of application on a single machine we have to guarantee that there won’t be conflicts in port numbers. Fortunately, Spring Boot provides such mechanisms for an application. We just need to set port number to 0 inside application.yml file using server.port property. Because our application registers itself in eureka it also needs to send unique instanceId, which is by default generated as a concatenation of fields spring.cloud.client.hostname, spring.application.name and server.port.
Here’s current configuration of our sample application. I have changed the template of instanceId field by replacing number of port to randomly generated number.

spring:
  application:
    name: example-service
server:
  port: ${PORT:0}
eureka:
  instance:
    instanceId: ${spring.cloud.client.hostname}:${spring.application.name}:${random.int[1,999999]}

Enabling Actuator metrics

To enable Spring Boot Actuator we need to include the following dependency to pom.xml.

<dependency>
	<groupId>org.springframework.boot</groupId>
	<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>

We also have to enable exposure of actuator endpoints via HTTP API by setting property management.endpoints.web.exposure.include to '*'. Now, the list of all available metric names is available under context path /actuator/metrics, while detailed information for each metric under path /actuator/metrics/{metricName}.

Graceful shutdown

Besides metrics Spring Boot Actuator also provides endpoint for shutting down an application. However, in contrast to other endpoints this endpoint is not available by default. We have to set property management.endpoint.shutdown.enabled to true. After that we will be to stop our application by sending POST request to /actuator/shutdown endpoint.
This method of stopping application guarantees that service will unregister itself from Eureka server before shutdown.

Enabling Eureka discovery

Eureka is the most popular discovery server used for building microservices-based architecture with Spring Cloud. So, if you already have microservices and want to provide auto-scaling mechanisms for them, Eureka would be a natural choice. It contains IP address and port number of every registered instance of application. To enable Eureka on the client side you just need to include the following dependency to your pom.xml.

<dependency>
	<groupId>org.springframework.cloud</groupId>
	<artifactId>spring-cloud-starter-netflix-eureka-client</artifactId>
</dependency>

As I have mentioned before we also have to guarantee an uniqueness of instanceId send to Eureka server by client-side application. It has been described in the step “Dynamic port allocation”.
The next step is to create application with embedded Eureka server. To achieve it we first need to include the following dependency into pom.xml.

<dependency>
	<groupId>org.springframework.cloud</groupId>
	<artifactId>spring-cloud-starter-netflix-eureka-server</artifactId>
</dependency>

The main class should be annotated with @EnableEurekaServer.

@SpringBootApplication
@EnableEurekaServer
public class DiscoveryApp {

    public static void main(String[] args) {
        new SpringApplicationBuilder(DiscoveryApp.class).run(args);
    }

}

Client-side applications by default tries to connect with Eureka server on localhost under port 8761. We only need single, standalone Eureka node, so we will disable registration and attempts to fetching list of services form another instances of server.

spring:
  application:
    name: discovery-service
server:
  port: ${PORT:8761}
eureka:
  instance:
    hostname: localhost
  client:
    registerWithEureka: false
    fetchRegistry: false
    serviceUrl:
      defaultZone: http://localhost:8761/eureka/

The tests of the sample autoscaling system will be performed using Docker containers, so we need to prepare and build image with Eureka server. Here’s Dockerfile with image definition. It can be built using command docker build -t piomin/discovery-server:2.0 ..

FROM openjdk:8-jre-alpine
ENV APP_FILE discovery-service-1.0-SNAPSHOT.jar
ENV APP_HOME /usr/apps
EXPOSE 8761
COPY target/$APP_FILE $APP_HOME/
WORKDIR $APP_HOME
ENTRYPOINT ["sh", "-c"]
CMD ["exec java -jar $APP_FILE"]

Building Jenkins pipeline for autoscaling

The first step is to prepare Jenkins pipeline responsible for autoscaling. We will create Jenkins Declarative Pipeline, which runs every minute. Periodical execution may be configured with the triggers directive, that defines the automated ways in which the pipeline should be re-triggered. Our pipeline will communicate with Eureka server and metrics endpoints exposed by every microservice using Spring Boot Actuator.
The test service name is EXAMPLE-SERVICE, which is equal to value (big letters) of property spring.application.name defined inside application.yml file. The monitored metric is the number of HTTP listener threads running on Tomcat container. These threads are responsible for processing incoming HTTP requests.

pipeline {
    agent any
    triggers {
        cron('* * * * *')
    }
    environment {
        SERVICE_NAME = "EXAMPLE-SERVICE"
        METRICS_ENDPOINT = "/actuator/metrics/tomcat.threads.busy?tag=name:http-nio-auto-1"
        SHUTDOWN_ENDPOINT = "/actuator/shutdown"
    }
    stages { ... }
}

Integrating Jenkins pipeline with Eureka

The first stage of our pipeline is responsible for fetching list of services registered in service discovery server. Eureka exposes HTTP API with several endpoints. One of them is GET /eureka/apps/{serviceName}, which returns list of all instances of application with given name. We are saving the number of running instances and the URL of metrics endpoint of every single instance. These values would be accessed during next stages of pipeline.
Here’s the fragment of pipeline responsible for fetching list of running instances of application. The name of stage is Calculate. We use HTTP Request Plugin for HTTP connections.

stage('Calculate') {
	steps {
		script {
			def response = httpRequest "http://192.168.99.100:8761/eureka/apps/${env.SERVICE_NAME}"
			def app = printXml(response.content)
			def index = 0
			env["INSTANCE_COUNT"] = app.instance.size()
			app.instance.each {
				if (it.status == 'UP') {
					def address = "http://${it.ipAddr}:${it.port}"
					env["INSTANCE_${index++}"] = address 
				}
			}
		}
	}
}

@NonCPS
def printXml(String text) {
    return new XmlSlurper(false, false).parseText(text)
}

Here’s a sample response from Eureka API for our microservice. The response content type is XML.

spring-autoscaler-2

Integrating Jenkins pipeline with Spring Boot Actuator metrics

Spring Boot Actuator exposes endpoint with metrics, which allows to find metric by name and optionally by tag. In the fragment of pipeline visible below I’m trying to find the instance with metric below or above a defined threshold. If there is such an instance we stop the loop in order to proceed to the next stage, which performs scaling down or up. The ip addresses of running applications are taken from pipeline environment variable with prefix INSTANCE_, which has been saved in the previous stage.

stage('Metrics') {
	steps {
		script {
			def count = env.INSTANCE_COUNT
			for(def i=0; i<count; i++) {
				def ip = env["INSTANCE_${i}"] + env.METRICS_ENDPOINT
				if (ip == null)
					break;
				def response = httpRequest ip
				def objRes = printJson(response.content)
				env.SCALE_TYPE = returnScaleType(objRes)
				if (env.SCALE_TYPE != "NONE")
					break
			}
		}
	}
}

@NonCPS
def printJson(String text) {
    return new JsonSlurper().parseText(text)
}

def returnScaleType(objRes) {
def value = objRes.measurements[0].value
if (value.toInteger() > 100)
		return "UP"
else if (value.toInteger() < 20)
		return "DOWN"
else
		return "NONE"
}

Shutdown application instance

In the last stage of our pipeline we will shutdown the running instance or start new instance depending on the result saved in the previous stage. Shutdown may be easily performed by calling Spring Boot Actuator endpoint. In the following fragment of pipeline we pick the instance returned by Eureka as first. Then we send POST request to that ip address.
If we need to scale up our application we call another pipeline responsible for build fat JAR and launch it on our machine.

stage('Scaling') {
	steps {
		script {
			if (env.SCALE_TYPE == 'DOWN') {
				def ip = env["INSTANCE_0"] + env.SHUTDOWN_ENDPOINT
				httpRequest url:ip, contentType:'APPLICATION_JSON', httpMode:'POST'
			} else if (env.SCALE_TYPE == 'UP') {
				build job:'spring-boot-run-pipeline'
			}
			currentBuild.description = env.SCALE_TYPE
		}
	}
}

Here’s a full definition of our pipeline spring-boot-run-pipeline responsible for starting new instance of application. It clones the repository with application source code, builds binaries using Maven commands, and finally runs the application using java -jar command passing address of Eureka server as a parameter.

pipeline {
    agent any
    tools {
        maven 'M3'
    }
    stages {
        stage('Checkout') {
            steps {
                git url: 'https://github.com/piomin/sample-spring-boot-autoscaler.git', credentialsId: 'github-piomin', branch: 'master'
            }
        }
        stage('Build') {
            steps {
                dir('example-service') {
                    sh 'mvn clean package'
                }
            }
        }
        stage('Run') {
            steps {
                dir('example-service') {
                    sh 'nohup java -jar -DEUREKA_URL=http://192.168.99.100:8761/eureka target/example-service-1.0-SNAPSHOT.jar 1>/dev/null 2>logs/runlog &'
                }
            }
        }
    }
}

Remote extension

The algorithm discussed in the previous sections will work fine only for microservices launched on the single machine. If we would like to extend it to work with many machines, we will have to modify our architecture as shown below. Each machine has Jenkins agent running and communicating with Jenkins master. If we would like to start new instance of microservices on the selected machine, we have to run pipeline using agent running on that machine. This agent is responsible only for building application from source code and launching it on the target machine. The shutdown of instance is still performed just by calling HTTP endpoint.

spring-autoscaler-3

You can find more information about running Jenkins agents and connecting them with Jenkins master via JNLP protocol in my article Jenkins nodes on Docker containers. Assuming we have successfully launched some agents on the target machines we need to parametrize our pipelines in order to be able to select agent (and therefore the target machine) dynamically.
When we are scaling up our application we have to pass agent label to the downstream pipeline.

build job:'spring-boot-run-pipeline', parameters:[string(name: 'agent', value:"slave-1")]

The calling pipeline will be ran by agent labelled with given parameter.

pipeline {
    agent {
        label "${params.agent}"
    }
    stages { ... }
}

If we have more than one agent connected to the master node we can map their addresses into the labels. Thanks to that you would be able to map IP address of microservice instance fetched from Eureka to the target machine with Jenkins agent.

pipeline {
    agent any
    triggers {
        cron('* * * * *')
    }
    environment {
        SERVICE_NAME = "EXAMPLE-SERVICE"
        METRICS_ENDPOINT = "/actuator/metrics/tomcat.threads.busy?tag=name:http-nio-auto-1"
        SHUTDOWN_ENDPOINT = "/actuator/shutdown"
        AGENT_192.168.99.102 = "slave-1"
        AGENT_192.168.99.103 = "slave-2"
    }
    stages { ... }
}

Summary

In this article I have demonstrated how to use Spring Boot Actuator metrics in order to scale up/scale down your Spring Boot application. Using basic mechanisms provided by Spring Boot together with Spring Cloud Netflix Eureka and Jenkins you can implement auto-scaling for your applications without getting any other third-party tools. The case described in this article assumes using Jenkins agents on the remote machines to launch there new instance of application, but you may as well use a tool like Ansible for that. If you would decide to run Ansible playbooks from Jenkins you will not have to launch Jenkins agents on remote machines. The source code with sample applications is available on GitHub: https://github.com/piomin/sample-spring-boot-autoscaler.git.

Advertisements

Chaos Monkey for Spring Boot Microservices

How many of you have never encountered a crash or a failure of your systems in production environment? Certainly, each one of you, sooner or later, has experienced it. If we are not able to avoid a failure, the solution seems to be maintaining our system in the state of permanent failure. This concept underpins the tool invented by Netflix to test the resilience of its IT infrastructure – Chaos Monkey. A few days ago I came across the solution, based on the idea behind Netflix’s tool, designed to test Spring Boot applications. Such a library has been implemented by Codecentric. Until now, I recognize them only as the authors of other interesting solution dedicated for Spring Boot ecosystem – Spring Boot Admin. I have already described this library in one of my previous articles Monitoring Microservices With Spring Boot Admin (https://piotrminkowski.wordpress.com/2017/06/26/monitoring-microservices-with-spring-boot-admin).
Today I’m going to show you how to include Codecentric’s Chaos Monkey in your Spring Boot application, and then implement chaos engineering in sample system consists of some microservices. The Chaos Monkey library can be used together with Spring Boot 2.0, and the current release version of it is 1.0.1. However, I’ll implement the sample using version 2.0.0-SNAPSHOT, because it has some new interesting features not available in earlier versions of this library. In order to be able to download SNAPSHOT version of Codecentric’s Chaos Monkey library you have to remember about including Maven repository https://oss.sonatype.org/content/repositories/snapshots to your repositories in pom.xml.

1. Enable Chaos Monkey for an application

There are two required steps for enabling Chaos Monkey for Spring Boot application. First, let’s add library chaos-monkey-spring-boot to the project’s dependencies.

<dependency>
	<groupId>de.codecentric</groupId>
	<artifactId>chaos-monkey-spring-boot</artifactId>
	<version>2.0.0-SNAPSHOT</version>
</dependency>

Then, we should activate profile chaos-monkey on application startup.

$ java -jar target/order-service-1.0-SNAPSHOT.jar --spring.profiles.active=chaos-monkey

2. Sample system architecture

Our sample system consists of three microservices, each started in two instances, and a service discovery server. Microservices registers themselves against a discovery server, and communicates with each other through HTTP API. Chaos Monkey library is included to every single instance of all running microservices, but not to the discovery server. Here’s the diagram that illustrates the architecture of our sample system.

chaos

The source code of sample applications is available on GitHub in repository sample-spring-chaosmonkey (https://github.com/piomin/sample-spring-chaosmonkey.git). After cloning this repository and building it using mnv clean install command, you should first run discovery-service. Then run two instances of every microservice on different ports by setting -Dserver.port property with an appropriate number. Here’s a set of my running commands.

$ java -jar target/discovery-service-1.0-SNAPSHOT.jar
$ java -jar target/order-service-1.0-SNAPSHOT.jar --spring.profiles.active=chaos-monkey
$ java -jar -Dserver.port=9091 target/order-service-1.0-SNAPSHOT.jar --spring.profiles.active=chaos-monkey
$ java -jar target/product-service-1.0-SNAPSHOT.jar --spring.profiles.active=chaos-monkey
$ java -jar -Dserver.port=9092 target/product-service-1.0-SNAPSHOT.jar --spring.profiles.active=chaos-monkey
$ java -jar target/customer-service-1.0-SNAPSHOT.jar --spring.profiles.active=chaos-monkey
$ java -jar -Dserver.port=9093 target/customer-service-1.0-SNAPSHOT.jar --spring.profiles.active=chaos-monkey

3. Process configuration

In version 2.0.0-SNAPSHOT of chaos-monkey-spring-boot library Chaos Monkey is by default enabled for applications that include it. You may disable it using property chaos.monkey.enabled. However, the only assault which is enabled by default is latency. This type of assault adds a random delay to the requests processed by the application in the range determined by properties chaos.monkey.assaults.latencyRangeStart and chaos.monkey.assaults.latencyRangeEnd. The number of attacked requests is dependent of property chaos.monkey.assaults.level, where 1 means each request and 10 means each 10th request. We can also enable exception and appKiller assault for our application. For simplicity, I set the configuration for all the microservices. Let’s take a look on settings provided in application.yml file.

chaos:
  monkey:
    assaults:
	  level: 8
	  latencyRangeStart: 1000
	  latencyRangeEnd: 10000
	  exceptionsActive: true
	  killApplicationActive: true
	watcher:
	  repository: true
      restController: true

In theory, the configuration visible above should enable all three available types of assaults. However, if you enable latency and exceptions, killApplication will never happen. Also, if you enable both latency and exceptions, all the requests send to application will be attacked, no matter which level is set with chaos.monkey.assaults.level property. It is important to remember about activating restController watcher, which is disabled by default.

4. Enable Spring Boot Actuator endpoints

Codecentric implements a new feature in the version 2.0 of their Chaos Monkey library – the endpoint for Spring Boot Actuator. To enable it for our applications we have to activate it following actuator convention – by setting property management.endpoint.chaosmonkey.enabled to true. Additionally, beginning from version 2.0 of Spring Boot we have to expose that HTTP endpoint to be available after application startup.

management:
  endpoint:
    chaosmonkey:
      enabled: true
  endpoints:
    web:
      exposure:
        include: health,info,chaosmonkey

The chaos-monkey-spring-boot provides several endpoints allowing you to check out and modify configuration. You can use method GET /chaosmonkey to fetch the whole configuration of library. Yo may also disable chaos monkey after starting application by calling method POST /chaosmonkey/disable. The full list of available endpoints is listed here: https://codecentric.github.io/chaos-monkey-spring-boot/2.0.0-SNAPSHOT/#endpoints.

5. Running applications

All the sample microservices stores data in MySQL. So, the first step is to run MySQL database locally using its Docker image. The Docker command visible below also creates database and user with password.

$ docker run -d --name mysql -e MYSQL_DATABASE=chaos -e MYSQL_USER=chaos -e MYSQL_PASSWORD=chaos123 -e MYSQL_ROOT_PASSWORD=123456 -p 33306:3306 mysql

After running all the sample applications, where all microservices are multiplied in two instances listening on different ports, our environment looks like in the figure below.

chaos-4

You will see the following information in your logs during application boot.

chaos-5

We may check out Chaos Monkey configuration settings for every running instance of application by calling the following actuator endpoint.

chaos-3

6. Testing the system

For the testing purposes, I used popular performance testing library – Gatling. It creates 20 simultaneous threads, which calls POST /orders and GET /order/{id} methods exposed by order-service via API gateway 500 times per each thread.

class ApiGatlingSimulationTest extends Simulation {

  val scn = scenario("AddAndFindOrders").repeat(500, "n") {
        exec(
          http("AddOrder-API")
            .post("http://localhost:8090/order-service/orders")
            .header("Content-Type", "application/json")
            .body(StringBody("""{"productId":""" + Random.nextInt(20) + ""","customerId":""" + Random.nextInt(20) + ""","productsCount":1,"price":1000,"status":"NEW"}"""))
            .check(status.is(200),  jsonPath("$.id").saveAs("orderId"))
        ).pause(Duration.apply(5, TimeUnit.MILLISECONDS))
        .
        exec(
          http("GetOrder-API")
            .get("http://localhost:8090/order-service/orders/${orderId}")
            .check(status.is(200))
        )
  }

  setUp(scn.inject(atOnceUsers(20))).maxDuration(FiniteDuration.apply(10, "minutes"))

}

POST endpoint is implemented inside OrderController in add(...) method. It calls find methods exposed by customer-service and product-service using OpenFeign clients. If customer has a sufficient funds and there are still products in stock, it accepts the order and performs changes for customer and product using PUT methods. Here’s the implementation of two methods tested by Gatling performance test.

@RestController
@RequestMapping("/orders")
public class OrderController {

	@Autowired
	OrderRepository repository;
	@Autowired
	CustomerClient customerClient;
	@Autowired
	ProductClient productClient;

	@PostMapping
	public Order add(@RequestBody Order order) {
		Product product = productClient.findById(order.getProductId());
		Customer customer = customerClient.findById(order.getCustomerId());
		int totalPrice = order.getProductsCount() * product.getPrice();
		if (customer != null && customer.getAvailableFunds() >= totalPrice && product.getCount() >= order.getProductsCount()) {
			order.setPrice(totalPrice);
			order.setStatus(OrderStatus.ACCEPTED);
			product.setCount(product.getCount() - order.getProductsCount());
			productClient.update(product);
			customer.setAvailableFunds(customer.getAvailableFunds() - totalPrice);
			customerClient.update(customer);
		} else {
			order.setStatus(OrderStatus.REJECTED);
		}
		return repository.save(order);
	}

	@GetMapping("/{id}")
	public Order findById(@PathVariable("id") Integer id) {
		Optional order = repository.findById(id);
		if (order.isPresent()) {
			Order o = order.get();
			Product product = productClient.findById(o.getProductId());
			o.setProductName(product.getName());
			Customer customer = customerClient.findById(o.getCustomerId());
			o.setCustomerName(customer.getName());
			return o;
		} else {
			return null;
		}
	}

	// ...

}

Chaos Monkey sets random latency between 1000 and 10000 milliseconds (as shown in the step 3). It is important to change default timeouts for Feign and Ribbon clients before starting a test. I decided to set readTimeout to 5000 milliseconds. It will cause some delayed requests to be timed out, while some will succeeded (around 50%-50%). Here’s timeouts configuration for Feign client.

feign:
  client:
    config:
      default:
        connectTimeout: 5000
        readTimeout: 5000
  hystrix:
    enabled: false

Here’s Ribbon client timeouts configuration for API gateway. We have also changed Hystrix settings to disable circuit breaker for Zuul.

ribbon:
  ConnectTimeout: 5000
  ReadTimeout: 5000

hystrix:
  command:
    default:
      execution:
        isolation:
          thread:
            timeoutInMilliseconds: 15000
      fallback:
        enabled: false
      circuitBreaker:
        enabled: false

To launch Gatling performance test go to performance-test directory and run gradle loadTest command. Here’s a result generated for the settings latency assaults. Of course, we can change this result by manipulating Chaos Monkey latency values or Ribbon and Feign timeout values.

chaos-5

Here’s Gatling graph with average response times. Results do not look good. However, we should remember that a single POST method from order-service calls two methods exposed by product-service and two methods exposed by customer-service.

chaos-6

Here’s the next Gatling result graph – this time it illustrates timeline with error and success responses. All HTML reports generated by Gatling during performance test are available under directory performance-test/build/gatling-results

chaos-7

Performance Testing with Gatling

How many of you have ever created automated performance tests before running application on production? Usually, developers attaches importance to the functional testing and tries to provide at least some unit and integration tests. However, sometimes a performance leak may turn out to be more serious than undetected business error, because it can affect the whole system, not the only the one business process.
Personally, I have been implementing performance tests for my application, but I have never run them as a part of the Continuous Integration process. Of course it took place some years, my knowledge and experience were a lot smaller… Anyway, recently I have became interested in topics related to performance testing, partly for the reasons of performance issues with the application in my organisation. As it happens, the key is to find the right tool. Probably many of you have heard about JMeter. Today I’m going to present the competitive solution – Gatling. I’ve read it generates rich and colorful reports with all the metrics collected during the test case. That feature seems to be better than in JMeter.
Before starting the discussion about Gatling let me say some words about theory. We can distinguish between two types of performance testing: load and stress testing. Load testing verifies how the system function under a heavy number of concurrent clients sending requests over a certain period of time. However, the main goal of that type of tests is to simulate the standard traffic similar to that, which may arise on production. Stress testing takes load testing and pushes your app to the limits to see how it handles an extremely heavy load.

What is Gatling?

Gatling is a powerful tool for load testing, written in Scala. It has a full support of HTTP protocols and can also be used for testing JDBC connections and JMS. When using Gatling you have to define test scenario as a Scala dsl code. It is worth to mention that it provides a comprehensive informative HTML load reports and has plugins for inteegration with Gradle, Maven and Jenkins.

Building sample application

Before we run any tests we need to have something for tests. Our sample application is really simple. Its source code is available as usual on GitHub. It exposes RESTful HTTP API with CRUD operations for adding and searching entity in the database. I use Postgres as a backend store for the application repository. The application is build on the top of Spring Boot framework. It also uses Spring Data project as a persistence layer implementation.

plugins {
    id 'org.springframework.boot' version '1.5.9.RELEASE'
}
dependencies {
	compile group: 'org.springframework.boot', name: 'spring-boot-starter-web'
	compile group: 'org.springframework.boot', name: 'spring-boot-starter-data-jpa'
	compile group: 'org.postgresql', name: 'postgresql', version: '42.1.4'
	testCompile group: 'org.springframework.boot', name: 'spring-boot-starter-test'
}

There is one entity Person which is mapped to the table person.

@Entity
@SequenceGenerator(name = "seq_person", initialValue = 1, allocationSize = 1)
public class Person {
	@Id
	@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "seq_person")
	private Long id;
	@Column(name = "first_name")
	private String firstName;
	@Column(name = "last_name")
	private String lastName;
	@Column(name = "birth_date")
	private Date birthDate;
	@Embedded
	private Address address;
	// ...
}

Database connection settings and hibernate properties are configured in application.yml file.

spring:
  application:
    name: gatling-service
  datasource:
    url: jdbc:postgresql://192.168.99.100:5432/gatling
    username: gatling
    password: gatling123
  jpa:
    properties:
      hibernate:
        hbm2ddl:
          auto: update

server:
  port: 8090

Like I have already mentioned the application exposes API methods for adding and searching persons in database. Here’s our Spring REST controller implementation.

@RestController
@RequestMapping("/persons")
public class PersonsController {

	private static final Logger LOGGER = LoggerFactory.getLogger(PersonsController.class);

	@Autowired
	PersonsRepository repository;

	@GetMapping
	public List<Person> findAll() {
		return (List<Person>) repository.findAll();
	}

	@PostMapping
	public Person add(@RequestBody Person person) {
		Person p = repository.save(person);
		LOGGER.info("add: {}", p.toString());
		return p;
	}

	@GetMapping("/{id}")
	public Person findById(@PathVariable("id") Long id) {
		LOGGER.info("findById: id={}", id);
		return repository.findOne(id);
	}

}

Running database

The next after the sample application development is to run the database. The most suitable way of running it for the purposes is by Docker image. Here’s a Docker command that start Postgres containerand initializes gatling user and database.

docker run -d --name postgres -e POSTGRES_DB=gatling -e POSTGRES_USER=gatling -e POSTGRES_PASSWORD=gatling123 -p 5432:5432 postgres

Providing test scenario

Every Gatling test suite should extends Simulation class. Inside it you may declare a list of scenarios using Gatling Scala DSL. Our goal is to run 30 clients which simultaneously sends requests 1000 times. First, the clients adds new person into the database by calling POST /persons method. Then they try to search person using its id by calling GET /persons/{id} method. So, totally 60k would be sent to the application: 30k to POST endpoint and 30k to GET method. Like you see on the code below the test scenario is quite simple. ApiGatlingSimulationTest is available under directory src/test/scala.

class ApiGatlingSimulationTest extends Simulation {

  val scn = scenario("AddAndFindPersons").repeat(1000, "n") {
        exec(
          http("AddPerson-API")
            .post("http://localhost:8090/persons")
            .header("Content-Type", "application/json")
            .body(StringBody("""{"firstName":"John${n}","lastName":"Smith${n}","birthDate":"1980-01-01", "address": {"country":"pl","city":"Warsaw","street":"Test${n}","postalCode":"02-200","houseNo":${n}}}"""))
            .check(status.is(200))
        ).pause(Duration.apply(5, TimeUnit.MILLISECONDS))
  }.repeat(1000, "n") {
        exec(
          http("GetPerson-API")
            .get("http://localhost:8090/persons/${n}")
            .check(status.is(200))
        )
  }

  setUp(scn.inject(atOnceUsers(30))).maxDuration(FiniteDuration.apply(10, "minutes"))

}

To enable Gatling framework for the project we should also define the following dependency in the Gradle build file.

testCompile group: 'io.gatling.highcharts', name: 'gatling-charts-highcharts', version: '2.3.0'

Running tests

There are some Gradle plugins available, which provides support for running tests during project build. However, we may also define simple gradle task that just run tests using io.gatling.app.Gatling class.

task loadTest(type: JavaExec) {
   dependsOn testClasses
   description = "Load Test With Gatling"
   group = "Load Test"
   classpath = sourceSets.test.runtimeClasspath
   jvmArgs = [
        "-Dgatling.core.directory.binaries=${sourceSets.test.output.classesDir.toString()}"
   ]
   main = "io.gatling.app.Gatling"
   args = [
           "--simulation", "pl.piomin.services.gatling.ApiGatlingSimulationTest",
           "--results-folder", "${buildDir}/gatling-results",
           "--binaries-folder", sourceSets.test.output.classesDir.toString(),
           "--bodies-folder", sourceSets.test.resources.srcDirs.toList().first().toString() + "/gatling/bodies",
   ]
}

The Gradle task defined above may be run with command gradle loadTest. Of course, before running tests you should launch the application. You may perform it from your IDE by starting the main class pl.piomin.services.gatling.ApiApplication or by running command java -jar build/libs/sample-load-test-gatling.jar.

Test reports

After test execution the report is printed in a text format.

================================================================================
---- Global Information --------------------------------------------------------
> request count                                      60000 (OK=60000  KO=0     )
> min response time                                      2 (OK=2      KO=-     )
> max response time                                   1338 (OK=1338   KO=-     )
> mean response time                                    80 (OK=80     KO=-     )
> std deviation                                        106 (OK=106    KO=-     )
> response time 50th percentile                         50 (OK=50     KO=-     )
> response time 75th percentile                         93 (OK=93     KO=-     )
> response time 95th percentile                        253 (OK=253    KO=-     )
> response time 99th percentile                        564 (OK=564    KO=-     )
> mean requests/sec                                319.149 (OK=319.149 KO=-     )
---- Response Time Distribution ------------------------------------------------
> t < 800 ms                                         59818 (100%) > 800 ms < t < 1200 ms                                 166 (  0%) > t > 1200 ms                                           16 (  0%)
> failed                                                 0 (  0%)
================================================================================

But that what is really cool in Gatling is an ability to generate reports in a graphical form. HTML reports are available under directory build/gatling-results. The first report shows global information with total number of requests and maximum response time by percentiles. For example, you may see that maximum response time in 95% of responses for GetPerson-API is 206 ms.

gatling-1

We may check out such report for all requests or filter them to see only those generated by selected API. In the picture below there is visualization only for GetPerson-API.

gatling-2

Here’s the graph with percentage of requests grouped by average response time.

gatling-3

Here’s the graph which ilustrates timeline with average response times. Additionally, that timeline also shows the statistics by percentiles.

gatling-4

Here’s the graph with number of requests processed succesfully by the application in a second.

gatling-5

Spring Cloud Apps Memory Management

Today’s topic is about memory management in Java as well as about microservices architecture. The inspiration to write this post was the situation when our available memory on test environment for the Spring Cloud based applications was exhausted. Without going into the details what was the cause of such a situation, the problem related with memory consumption by monolith based architecture in comparison with microservices is obvious. For example, supposing we have quite a large monolithic application, often 1GB or 2 GB of RAM will be enough for it, especially if we are talking about not a production environment. If we divide this application into 20 or 30 independent microservices it is hard to expect that the RAM will still remain around 1GB or 2GB. Especially if we use Spring Cloud 🙂

When running sample microservices I will use an earlier example prepared for the purpose of the one of previous articles, which is available on GitHub. I’m going to launch three microservices. First, for service discovery which uses Netflix Eureka server and two simple microservices which provide REST API, communicate with each other and register themselves in discovery server. I will not limit in any way the memory usage by those applications.

Like you see in the figure below those three running microservices have occupied about 1.5GB RAM memory on my computer. This is not the best message considering that we are dealing with very simple applications which do not even have a data persistence layer. The lowest RAM usage is for discovery service and the biggest for customer service which initializes declarative feign client for invoking account service API. Before making that screen I send some test requests to every microservice and run Eureka web console.

micro-ram-1

A lot about memory usage is shown on the charts visible below made using JProfiler. As we see most of such a memory usage is affected by heap, in comparison to non-heap it does take up much space.

micro-ram-2

micro-ram-3

Of course, the first obvious question is whether we need as much space on the heap to run our microservice application. The answer is no, we do not. Now, let’s take a brief look at how the memory management process takes place in Java 8.

We can devide JVM memory into two different parts: Heap and Non-Heap. I have already mentioned a little about Heap. As you could see on the graphs above the heap commited size for our microservices was really big (~600MB). In turn, JVM Memory consists of Young Generation and Old Generation. All the newly created objects are located in the Young Generation. When young generation is filled, garbage collection (Minor GC) is performed. To be more precise, those objects are located in the part of Young Generation which is called Eden Space. Minor GC moves all still used objects from Eden Space into Survivor 0. The same process is performed for Survivor 0 and Survivor 1 spaces. All objects that survived many cycles of GC, are moved to the Old Generation memory space. For removing objects from there is Major GC process is responsible. So, these are the most important information about Java Heap. In order to better understand the figure below. The memory limits for Java Heap can be set with the following parameters during running java -jar command:

  • -Xms – initial heap size when JVM starts
  • -Xmx – maximum heap size
  • -Xmn – size of the Young Generation, rest of the space goes for Old Generation

jvm memory

The second part of the JVM Memory, looking at the graphs above slightly less important from our point of view, is Non-Heap. Non-Heap consists of the following parts:

  • Thread Stacks – space for all running threads. The maximum thread size can be set using -Xss parameter.
  • Metaspace – it replaced PermGem, which was in Java 7 the part of JVM Heap. In Metaspace there are located all classes and methods load by application. Looking at the number of libraries included for Spring Cloud we won’t save much memory here. Metaspace size can be managed by setting -XX:MetaspaceSize and -XX:MaxMetaspaceSize parameters.
  • Code Cache – this is the space for native code (like JNI) or Java methods that are compiled into native code by JIT (just-in-time) compiler. The maximum size is determined by setting -XX:ReservedCodeCacheSize parameter.
  • Compressed Class Space – the maximum memory reserved for compressed class space is set with -XX:CompressedClassSpaceSize
  • Direct NIO Buffers

To put it more simply, Heap is for objects and Non-Heap is for classes. As you can imagine we can end up with the situation when non-heap is larger than heap for our application. First, let’s run our service discovery with the parameters below. In my opinion these are the lowest values if you are starting Eureka with embedded Tomcat on Spring Boot.

-Xms16m -Xmx32m -XX:MaxMetaspaceSize=48m -XX:CompressedClassSpaceSize=8m -Xss256k -Xmn8m -XX:InitialCodeCacheSize=4m -XX:ReservedCodeCacheSize=8m -XX:MaxDirectMemorySize=16m

If we are running microservice with REST API and Eureka, Feign and Ribbon clients we need to increase values a little.

-Xms16m -Xmx48m -XX:MaxMetaspaceSize=64m -XX:CompressedClassSpaceSize=8m -Xss256k -Xmn8m -XX:InitialCodeCacheSize=4m -XX:ReservedCodeCacheSize=8m -XX:MaxDirectMemorySize=16m

Here are charts from JProfiler for the settings above and Customer service. The difference is in starting and requests processing time. The application is working slower in comparison with earlier settings (or rather lack of them :)). Well, I wouldn’t set such a parameters in production mode. Treat them rather as a minimum requirements for service discovery and microservice apps.

micro-ram-5

micro-ram-6

The current total memory usage is as follows. It is still the biggest for Customer service and the lowest for Discovery.

micro-ram-4

I have also tried to run Discovery application using different web containers. You can easily change web container by including in your pom.xml file the dependencies visible below.

For Jetty.

<dependency>
	<groupId>org.springframework.boot</groupId>
	<artifactId>spring-boot-starter-jetty</artifactId>
</dependency>

For Undertow.

<dependency>
	<groupId>org.springframework.boot</groupId>
	<artifactId>spring-boot-starter-undertow</artifactId>
</dependency>

The best result was for Undertow (116MB), second place for Tomcat (122MB) and third for Jetty (128MB). This tests were performed only for Eureka server without registering there any microservices.

Generating large PDF files using JasperReports

During the last ‘Code Europe’ conference in Warsaw appeared many topics related to microservices architecture. Several times I heard the conclusion that the best candidate for separation from monolith is service that generates PDF reports. It’s usually quite independent from the other parts of application. I can see a similar approach in my organization, where first microservice running in production mode was the one that generates PDF reports. To my surprise, the vendor which developed that microservice had to increase maximum heap size to 1GB on each of its instances. This has forced me to take a closer look at the topic of PDF reports generation process.
The most popular Java library for creating PDF files is JasperReports. During generation process, this library by default stores all objects in RAM memory. If such reports are large, this could be a problem my vendor encountered. Their solution, as I have mentioned before, was to increase the maximum size of Java heap 🙂

This time, unlike usual, I’m going to start with the test implementation. Here’s simple JUnit test with 20 requests per second sending to service endpoint.

public class JasperApplicationTest {

	protected Logger logger = Logger.getLogger(JasperApplicationTest.class.getName());
	TestRestTemplate template = new TestRestTemplate();

	@Test
	public void testGetReport() throws InterruptedException {
		List<HttpStatus> responses = new ArrayList<>();
		Random r = new Random();
		int i = 0;
		for (; i < 20; i++) {
			new Thread(new Runnable() {
				@Override
				public void run() {
					int age = r.nextInt(99);
					long start = System.currentTimeMillis();
					ResponseEntity<InputStreamResource> res = template.getForEntity("http://localhost:2222/pdf/{age}", InputStreamResource.class, age);
					logger.info("Response (" +  (System.currentTimeMillis()-start) + "): " + res.getStatusCode());
					responses.add(res.getStatusCode());
					try {
						Thread.sleep(50);
					} catch (InterruptedException e) {
						e.printStackTrace();
					}
				}
			}).start();
		}

		while (responses.size() != i) {
			Thread.sleep(500);
		}
		logger.info("Test finished");
	}
}

In my test scenario I inserted about 1M records into the person table. Everything works fine during running test. Generated files had about 500kb size and 200 pages. All requests were succeeded and each of them had been processed about 8 seconds. In comparison with single request which had been processed 4 seconds it seems to be a good result. The situation with RAM is worse as you can see in the figure below. After generating 20 PDF reports allocated heap size increases to more than 1GB and used heap size was about 550MB. Also CPU usage during report generation increased to 100% usage. I could easily image generating files bigger than 500kb in the production mode…

jasper-1

In our situation we have two options. We can always add more RAM memory or … look for another choice 🙂 Jasper library comes with solution – Virtualizers. The virtualizer cuts the jasper report print into different files and save them on the hard drive and/or compress it. There are three types of virtualizers:
JRFileVirtualizer, JRSwapFileVirtualizer and JRGzipVirtualizer. You can read more about them here. Now, look at the figure below. Here’s illustration of memory and CPU usage for the test with JRFileVirtualizer. It looks a little better than the previous figure, but it does not knock us down 🙂 However, requests with the same overload as for the previous test take much longer – about 30 seconds. It’s not a good message, but at least the heap size allocation is not increases as fast as for previous sample.

jasper-2

Same test has been performed for JRSwapFileVirtualizer. The requests was average processed around 10 seconds. The graph illustrating CPU and memory usage is rather more similar to in memory test than JRFileVirtualizer test.

jasper-3

To see the difference between those three scenarios we have to run our application with maximum heap size set. For my tests I set -Xmx128m -Xms128m. For test with file virtualizers we receive HTTP responses with PDF reports, but for in memory tests the exception is thrown by the sample application: java.lang.OutOfMemoryError: GC overhead limit exceeded.

For testing purposes I created Spring Boot application. Sample source code is available as usual on GitHub. Here’s full list of Maven dependencies for that project.

<dependency>
	<groupId>net.sf.jasperreports</groupId>
	<artifactId>jasperreports</artifactId>
	<version>6.4.0</version>
</dependency>
<dependency>
	<groupId>org.springframework.boot</groupId>
	<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
	<groupId>org.springframework.boot</groupId>
	<artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>
<dependency>
	<groupId>org.springframework.boot</groupId>
	<artifactId>spring-boot-starter-test</artifactId>
	<scope>test</scope>
</dependency>
<dependency>
	<groupId>mysql</groupId>
	<artifactId>mysql-connector-java</artifactId>
	<scope>runtime</scope>
</dependency>

Here’s application main class. There are @Bean declarations of file virtualizers and JasperReport which is responsible for template compilation from .jrxml file. To run application for testing purposes type java -jar -Xms64m -Xmx128m -Ddirectory=C:\Users\minkowp\pdf sample-jasperreport-boot.jar.

@SpringBootApplication
public class JasperApplication {

	@Value("${directory}")
	private String directory;

	public static void main(String[] args) {
		SpringApplication.run(JasperApplication.class, args);
	}

	@Bean
	JasperReport report() throws JRException {
		JasperReport jr = null;
		File f = new File("personReport.jasper");
		if (f.exists()) {
			jr = (JasperReport) JRLoader.loadObject(f);
		} else {
			jr = JasperCompileManager.compileReport("src/main/resources/report.jrxml");
			JRSaver.saveObject(jr, "personReport.jasper");
		}
		return jr;
	}

	@Bean
	JRFileVirtualizer fileVirtualizer() {
		return new JRFileVirtualizer(100, directory);
	}

	@Bean
	JRSwapFileVirtualizer swapFileVirtualizer() {
		JRSwapFile sf = new JRSwapFile(directory, 1024, 100);
		return new JRSwapFileVirtualizer(20, sf, true);
	}

}

There are three endpoints exposed for the tests:
/pdf/{age} – in memory PDF generation
/pdf/fv/{age} – PDF generation with JRFileVirtualizer
/pdf/sfv/{age} – PDF generation with JRSwapFileVirtualizer

Here’s method generating PDF report. Report is generated in fillReport static method from JasperFillManager. It takes three parameters as input: JasperReport which encapsulates compiled .jrxml template file, JDBC connection object and map of parameters. Then report is ganerated and saved on disk as a PDF file. File is returned as an attachement in the response.

	private ResponseEntity<InputStreamResource> generateReport(String name, Map<String, Object> params) {
		FileInputStream st = null;
		Connection cc = null;
		try {
			cc = datasource.getConnection();
			JasperPrint p = JasperFillManager.fillReport(jasperReport, params, cc);
			JRPdfExporter exporter = new JRPdfExporter();
			SimpleOutputStreamExporterOutput c = new SimpleOutputStreamExporterOutput(name);
			exporter.setExporterInput(new SimpleExporterInput(p));
			exporter.setExporterOutput(c);
			exporter.exportReport();

			st = new FileInputStream(name);
			HttpHeaders responseHeaders = new HttpHeaders();
			responseHeaders.setContentType(MediaType.valueOf("application/pdf"));
			responseHeaders.setContentDispositionFormData("attachment", name);
			responseHeaders.setContentLength(st.available());
		    return new ResponseEntity<InputStreamResource>(new InputStreamResource(st), responseHeaders, HttpStatus.OK);
		} catch (Exception e) {
			e.printStackTrace();
		} finally {
			fv.cleanup();
			sfv.cleanup();
			if (cc != null)
				try {
					cc.close();
				} catch (SQLException e) {
					e.printStackTrace();
				}
		}
		return null;
	}

To enable virtualizer during report generation we only have to pass one parameter to the map of parameters – instance of virtualizer object.

	@Autowired
	JRFileVirtualizer fv;
	@Autowired
	JRSwapFileVirtualizer sfv;
	@Autowired
	DataSource datasource;
	@Autowired
	JasperReport jasperReport;

	@ResponseBody
	@RequestMapping(value = "/pdf/fv/{age}")
	public ResponseEntity<InputStreamResource> getReportFv(@PathVariable("age") int age) {
		logger.info("getReportFv(" + age + ")");
		Map<String, Object> m = new HashMap<>();
		m.put(JRParameter.REPORT_VIRTUALIZER, fv);
		m.put("age", age);
		String name = ++count + "personReport.pdf";
		return generateReport(name, m);
	}

Template file report.jrxml is available under /src/main/resources directory. Inside queryString tag there is SQL query which takes age parameter in WHERE statement. There are also five columns declared all taken from SQL query result.

<?xml version = "1.0" encoding = "UTF-8"?>
<!DOCTYPE jasperReport PUBLIC "//JasperReports//DTD Report Design//EN"    "http://jasperreports.sourceforge.net/dtds/jasperreport.dtd">

<jasperReport xmlns="http://jasperreports.sourceforge.net/jasperreports"               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"               xsi:schemaLocation="http://jasperreports.sourceforge.net/jasperreports    http://jasperreports.sourceforge.net/xsd/jasperreport.xsd"               name="report2" pageWidth="595" pageHeight="842"                columnWidth="555" leftMargin="20" rightMargin="20"               topMargin="20" bottomMargin="20">
    <parameter name="age" class="java.lang.Integer"/>
    <queryString>
        <![CDATA[SELECT * FROM person WHERE age = $P{age}]]>
    </queryString>
    <field name="id" class="java.lang.Integer" />
    <field name="first_name" class="java.lang.String" />
    <field name="last_name" class="java.lang.String" />
    <field name="age" class="java.lang.Integer" />
    <field name="pesel" class="java.lang.String" />

    <detail>
        <band height="15">

            <textField>
                <reportElement x="0" y="0" width="50" height="15" />

                <textElement textAlignment="Right" verticalAlignment="Middle"/>

                <textFieldExpression class="java.lang.Integer">
                    <![CDATA[$F{id}]]>
                </textFieldExpression>
            </textField>       

            <textField>
                <reportElement x="100" y="0" width="80" height="15" />

                <textElement textAlignment="Left" verticalAlignment="Middle"/>

                <textFieldExpression class="java.lang.String">
                    <![CDATA[$F{first_name}]]>
                </textFieldExpression>
            </textField> 

            <textField>
                <reportElement x="200" y="0" width="80" height="15" />

                <textElement textAlignment="Left" verticalAlignment="Middle"/>

                <textFieldExpression class="java.lang.String">
                    <![CDATA[$F{last_name}]]>
                </textFieldExpression>
            </textField>               

            <textField>
                <reportElement x="300" y="0" width="50" height="15"/>
                <textElement textAlignment="Right" verticalAlignment="Middle"/>

                <textFieldExpression class="java.lang.Integer">
                    <![CDATA[$F{age}]]>
                </textFieldExpression>
            </textField>

           <textField>
                <reportElement x="380" y="0" width="80" height="15" />

                <textElement textAlignment="Left" verticalAlignment="Middle"/>

                <textFieldExpression class="java.lang.String">
                    <![CDATA[$F{pesel}]]>
                </textFieldExpression>
            </textField>         

        </band>
    </detail>

</jasperReport>

And the last thing we have to do is to properly set database connection pool settings. A natural choice for Spring Boot application is Tomcat JDBC pool.

spring:
  application:
    name: jasper-service
  datasource:
    url: jdbc:mysql://192.168.99.100:33306/datagrid?useSSL=false
    username: datagrid
    password: datagrid
    tomcat:
      initial-size: 20
      max-active: 30

Final words

In this article I showed you how to avoid out of memory exception while generating large PDF reports with JasperReports. I compared three solutions: in memory generation and two methods based on cutting the jasper print into different files and save them on the hard drive. For me, the most interesting was the solution based on single swapped file with JRSwapFileVirtualizer. It is slower a little than in memory generation but works faster than similar tests for JRFileVirtualizer and in contrast to in memory generation didn’t avoid out of memory exception for files larger than 500kb with 20 requests per second.