Testing Microservices: Tools and Frameworks

There are some key challenges around microservices architecture testing that we are facing. The selection of right tools is one of that elements that helps us deal with the issues related to those challenges. First, let’s identify the most important elements involved into the process of microservices testing. These are some of them:

  • Teams coordination – with many independent teams managing their own microservices, it becomes very challenging to coordinate the overall process of software development and testing
  • Complexity – there are many microservices that communicate to each other. We need to ensure that every one of them is working properly and is resistant to the slow responses or failures from other microservices
  • Performance – since there are many independent services it is important to test the whole architecture under traffic close to the production

Let’s discuss some interesting frameworks helping that may help you in testing microservice-based architecture.

Components tests with Hoverfly

Hoverfly simulation mode may be especially useful for building component tests. During component tests we are verifying the whole microservice without communication over network with other microservices or external datastores. The following picture shows how such a test is performed for our sample microservice.

testing-microservices-1

Hoverfly provides simple DSL for creating simulations, and a JUnit integration for using it within JUnit tests. It may orchestrated via JUnit @Rule. We are simulating two services and then overriding Ribbon properties to resolve address of these services by client name. We should also disable communication with Eureka discovery by disabling registration after application boot or fetching list of services for Ribbon client. Hoverfly simulates responses for PUT and GET methods exposed by passenger-management and driver-management microservices. Controller is the main component that implements business logic in our application. It store data using in-memory repository component and communicates with other microservices through @FeignClient interfaces. By testing three methods implemented by the controller we are testing the whole business logic implemented inside trip-management service.

@SpringBootTest(properties = {
        "eureka.client.enabled=false",
        "ribbon.eureka.enable=false",
        "passenger-management.ribbon.listOfServers=passenger-management",
        "driver-management.ribbon.listOfServers=driver-management"
})
@RunWith(SpringRunner.class)
@AutoConfigureMockMvc
@FixMethodOrder(MethodSorters.NAME_ASCENDING)
public class TripComponentTests {

    ObjectMapper mapper = new ObjectMapper();

    @Autowired
    MockMvc mockMvc;

    @ClassRule
    public static HoverflyRule rule = HoverflyRule.inSimulationMode(SimulationSource.dsl(
            HoverflyDsl.service("passenger-management:80")
                    .get(HoverflyMatchers.startsWith("/passengers/login/"))
                    .willReturn(ResponseCreators.success(HttpBodyConverter.jsonWithSingleQuotes("{'id':1,'name':'John Walker'}")))
                    .put(HoverflyMatchers.startsWith("/passengers")).anyBody()
                    .willReturn(ResponseCreators.success(HttpBodyConverter.jsonWithSingleQuotes("{'id':1,'name':'John Walker'}"))),
            HoverflyDsl.service("driver-management:80")
                    .get(HoverflyMatchers.startsWith("/drivers/"))
                    .willReturn(ResponseCreators.success(HttpBodyConverter.jsonWithSingleQuotes("{'id':1,'name':'David Smith','currentLocationX': 15,'currentLocationY':25}")))
                    .put(HoverflyMatchers.startsWith("/drivers")).anyBody()
                    .willReturn(ResponseCreators.success(HttpBodyConverter.jsonWithSingleQuotes("{'id':1,'name':'David Smith','currentLocationX': 15,'currentLocationY':25}")))
    )).printSimulationData();

    @Test
    public void test1CreateNewTrip() throws Exception {
        TripInput ti = new TripInput("test", 10, 20, "walker");
        mockMvc.perform(MockMvcRequestBuilders.post("/trips")
                .contentType(MediaType.APPLICATION_JSON_UTF8)
                .content(mapper.writeValueAsString(ti)))
                .andExpect(MockMvcResultMatchers.status().isOk())
                .andExpect(MockMvcResultMatchers.jsonPath("$.id", Matchers.any(Integer.class)))
                .andExpect(MockMvcResultMatchers.jsonPath("$.status", Matchers.is("NEW")))
                .andExpect(MockMvcResultMatchers.jsonPath("$.driverId", Matchers.any(Integer.class)));
    }

    @Test
    public void test2CancelTrip() throws Exception {
        mockMvc.perform(MockMvcRequestBuilders.put("/trips/cancel/1")
                .contentType(MediaType.APPLICATION_JSON_UTF8)
                .content(mapper.writeValueAsString(new Trip())))
                .andExpect(MockMvcResultMatchers.status().isOk())
                .andExpect(MockMvcResultMatchers.jsonPath("$.id", Matchers.any(Integer.class)))
                .andExpect(MockMvcResultMatchers.jsonPath("$.status", Matchers.is("IN_PROGRESS")))
                .andExpect(MockMvcResultMatchers.jsonPath("$.driverId", Matchers.any(Integer.class)));
    }

    @Test
    public void test3PayTrip() throws Exception {
        mockMvc.perform(MockMvcRequestBuilders.put("/trips/payment/1")
                .contentType(MediaType.APPLICATION_JSON_UTF8)
                .content(mapper.writeValueAsString(new Trip())))
                .andExpect(MockMvcResultMatchers.status().isOk())
                .andExpect(MockMvcResultMatchers.jsonPath("$.id", Matchers.any(Integer.class)))
                .andExpect(MockMvcResultMatchers.jsonPath("$.status", Matchers.is("PAYED")));
    }

}

The tests visible above verify only positive scenarios. What about testing some unexpected behaviour like network delays or server errors? With Hoverfly we can easily simulate such a behaviour and define some negative scenarios. In the following fragment of code I have defined three scenarios. In the first of them target service has been delayed 2 seconds. In order to simulate timeout on the client side I had to change default readTimeout for Ribbon load balancer and then disabled Hystrix circuit breaker for Feign client. The second test simulates HTTP 500 response status from passenger-management service. The last scenario assumes empty response from method responsible for searching the nearest driver.

@SpringBootTest(properties = {
        "eureka.client.enabled=false",
        "ribbon.eureka.enable=false",
        "passenger-management.ribbon.listOfServers=passenger-management",
        "driver-management.ribbon.listOfServers=driver-management",
        "feign.hystrix.enabled=false",
        "ribbon.ReadTimeout=500"
})
@RunWith(SpringRunner.class)
@AutoConfigureMockMvc
public class TripNegativeComponentTests {

    private ObjectMapper mapper = new ObjectMapper();
    @Autowired
    private MockMvc mockMvc;

    @ClassRule
    public static HoverflyRule rule = HoverflyRule.inSimulationMode(SimulationSource.dsl(
            HoverflyDsl.service("passenger-management:80")
                    .get("/passengers/login/test1")
                    .willReturn(ResponseCreators.success(HttpBodyConverter.jsonWithSingleQuotes("{'id':1,'name':'John Smith'}")).withDelay(2000, TimeUnit.MILLISECONDS))
                    .get("/passengers/login/test2")
                    .willReturn(ResponseCreators.success(HttpBodyConverter.jsonWithSingleQuotes("{'id':1,'name':'John Smith'}")))
                    .get("/passengers/login/test3")
                    .willReturn(ResponseCreators.serverError()),
            HoverflyDsl.service("driver-management:80")
                    .get(HoverflyMatchers.startsWith("/drivers/"))
                    .willReturn(ResponseCreators.success().body("{}"))
            ));

    @Test
    public void testCreateTripWithTimeout() throws Exception {
        mockMvc.perform(MockMvcRequestBuilders.post("/trips").contentType(MediaType.APPLICATION_JSON).content(mapper.writeValueAsString(new TripInput("test", 15, 25, "test1"))))
                .andExpect(MockMvcResultMatchers.status().isOk())
                .andExpect(MockMvcResultMatchers.jsonPath("$.id", Matchers.nullValue()))
                .andExpect(MockMvcResultMatchers.jsonPath("$.status", Matchers.is("REJECTED")));
    }

    @Test
    public void testCreateTripWithError() throws Exception {
        mockMvc.perform(MockMvcRequestBuilders.post("/trips").contentType(MediaType.APPLICATION_JSON).content(mapper.writeValueAsString(new TripInput("test", 15, 25, "test3"))))
                .andExpect(MockMvcResultMatchers.status().isOk())
                .andExpect(MockMvcResultMatchers.jsonPath("$.id", Matchers.nullValue()))
                .andExpect(MockMvcResultMatchers.jsonPath("$.status", Matchers.is("REJECTED")));
    }

    @Test
    public void testCreateTripWithNoDrivers() throws Exception {
        mockMvc.perform(MockMvcRequestBuilders.post("/trips").contentType(MediaType.APPLICATION_JSON).content(mapper.writeValueAsString(new TripInput("test", 15, 25, "test2"))))
                .andExpect(MockMvcResultMatchers.status().isOk())
                .andExpect(MockMvcResultMatchers.jsonPath("$.id", Matchers.nullValue()))
                .andExpect(MockMvcResultMatchers.jsonPath("$.status", Matchers.is("REJECTED")));
    }

}

All the timeouts and errors in communication with external microservices are handled by the bean annotated with @ControllerAdvice. In such cases trip-management microservice should not return server error response, but 200 OK with JSON response containing field status equals to REJECTED.

@ControllerAdvice
public class TripControllerErrorHandler extends ResponseEntityExceptionHandler {

    @ExceptionHandler({RetryableException.class, FeignException.class})
    protected ResponseEntity handleFeignTimeout(RuntimeException ex, WebRequest request) {
        Trip trip = new Trip();
        trip.setStatus(TripStatus.REJECTED);
        return handleExceptionInternal(ex, trip, null, HttpStatus.OK, request);
    }

}

Contract tests with Pact

The next type of test strategy usually implemented for microservices-based architecture is consumer-driven contract testing. In fact, there are some tools especially dedicated for such type of tests. One of them is Pact. Contract testing is a way to ensure that services can communicate with each other without implementing integration tests. A contract is signed between two sides of communication: consumer and provider. Pact assumes that contract code is generated and published on the consumer side, and than verified by the provider.

Pact provides tool that can store and share the contracts between consumers and providers. It is called Pact Broker. It exposes a simple RESTful API for publishing and retrieving pacts, and embedded web dashboard for navigating the API. We can easily run Pact Broker on the local machine using its Docker image.

micro-testing-2

We will begin from running Pact Broker. Pact Broker requires running instance of postgresql, so first we have to launch it using Docker image, and then link our broker container with that container.

docker run -d --name postgres -p 5432:5432 -e POSTGRES_USER=oauth -e POSTGRES_PASSWORD=oauth123 -e POSTGRES_DB=oauth postgres
docker run -d --name pact-broker --link postgres:postgres -e PACT_BROKER_DATABASE_USERNAME=oauth -e PACT_BROKER_DATABASE_PASSWORD=oauth123 -e PACT_BROKER_DATABASE_HOST=postgres -e PACT_BROKER_DATABASE_NAME=oauth -p 9080:80 dius/pact-broker

The next step is to implement contract tests on the consumer side. We will use JVM implementation of Pact library for that. It provides PactProviderRuleMk2 object responsible for creating stubs of the provider service. We should annotate it with JUnit @Rule. Ribbon will forward all requests to passenger-management to the stub address – in that case localhost:8180. Pact JVM supports annotations and provides DSL for building test scenarios. Test method responsible for generating contract data should be annotated with @Pact. It is important to set fields state and provider, because then generated contract would be verified on the provider side using these names. Generated pacts are verified inside the same test class by the methods annotated with @PactVerification. Field fragment points to the name of the method responsible for generating pact inside the same test class. The contract is tested using PassengerManagementClient @FeignClient.

@RunWith(SpringRunner.class)
@SpringBootTest(properties = {
        "driver-management.ribbon.listOfServers=localhost:8190",
        "passenger-management.ribbon.listOfServers=localhost:8180",
        "ribbon.eureka.enabled=false",
        "eureka.client.enabled=false",
        "ribbon.ReadTimeout=5000"
})
public class PassengerManagementContractTests {

    @Rule
    public PactProviderRuleMk2 stubProvider = new PactProviderRuleMk2("passengerManagementProvider", "localhost", 8180, this);
    @Autowired
    private PassengerManagementClient passengerManagementClient;

    @Pact(state = "get-passenger", provider = "passengerManagementProvider", consumer = "passengerManagementClient")
    public RequestResponsePact callGetPassenger(PactDslWithProvider builder) {
        DslPart body = new PactDslJsonBody().integerType("id").stringType("name").numberType("balance").close();
        return builder.given("get-passenger").uponReceiving("test-get-passenger")
                .path("/passengers/login/test").method("GET").willRespondWith().status(200).body(body).toPact();
    }

    @Pact(state = "update-passenger", provider = "passengerManagementProvider", consumer = "passengerManagementClient")
    public RequestResponsePact callUpdatePassenger(PactDslWithProvider builder) {
        return builder.given("update-passenger").uponReceiving("test-update-passenger")
                .path("/passengers").method("PUT").bodyWithSingleQuotes("{'id':1,'amount':1000}", "application/json").willRespondWith().status(200)
                .bodyWithSingleQuotes("{'id':1,'name':'Adam Smith','balance':5000}", "application/json").toPact();
    }

    @Test
    @PactVerification(fragment = "callGetPassenger")
    public void verifyGetPassengerPact() {
        Passenger passenger = passengerManagementClient.getPassenger("test");
        Assert.assertNotNull(passenger);
        Assert.assertNotNull(passenger.getId());
    }

    @Test
    @PactVerification(fragment = "callUpdatePassenger")
    public void verifyUpdatePassengerPact() {
        Passenger passenger = passengerManagementClient.updatePassenger(new PassengerInput(1L, 1000));
        Assert.assertNotNull(passenger);
        Assert.assertNotNull(passenger.getId());
    }

}

Just running the tests is not enough. We also have to publish pacts generated during tests to Pact Broker. In order to achieve it we have to include the following Maven plugin to our pom.xml and then execute command mvn clean install pact:publish.

<plugin>
	<groupId>au.com.dius</groupId>
	<artifactId>pact-jvm-provider-maven_2.12</artifactId>
	<version>3.5.21</version>
	<configuration>
		<pactBrokerUrl>http://192.168.99.100:9080</pactBrokerUrl>
	</configuration>
</plugin>

Pact provides support for Spring on the provider side. Thanks to that we may use MockMvc controllers or inject properties from application.yml into the test class. Here’s dependency declaration that has to be included to our pom.xml

<dependency>
	<groupId>au.com.dius</groupId>
	<artifactId>pact-jvm-provider-spring_2.12</artifactId>
	<version>3.5.21</version>
	<scope>test</scope>
</dependency>

Now , the contract is being verified on the provider side. We need to pass provider name inside @Provider annotation and name of states for every verification test inside @State. These values has been during the tests on the consumer side inside @Pact annotation (fields state and provider).

@RunWith(SpringRestPactRunner.class)
@Provider("passengerManagementProvider")
@PactBroker
public class PassengerControllerContractTests {

    @InjectMocks
    private PassengerController controller = new PassengerController();
    @Mock
    private PassengerRepository repository;
    @TestTarget
    public final MockMvcTarget target = new MockMvcTarget();

    @Before
    public void before() {
        MockitoAnnotations.initMocks(this);
        target.setControllers(controller);
    }

    @State("get-passenger")
    public void testGetPassenger() {
        target.setRunTimes(3);
        Mockito.when(repository.findByLogin(Mockito.anyString()))
                .thenReturn(new Passenger(1L, "Adam Smith", "test", 4000))
                .thenReturn(new Passenger(3L, "Tom Hamilton", "hamilton", 400000))
                .thenReturn(new Passenger(5L, "John Scott", "scott", 222));
    }

    @State("update-passenger")
    public void testUpdatePassenger() {
        target.setRunTimes(1);
        Passenger passenger = new Passenger(1L, "Adam Smith", "test", 4000);
        Mockito.when(repository.findById(1L)).thenReturn(passenger);
        Mockito.when(repository.update(Mockito.any(Passenger.class)))
                .thenReturn(new Passenger(1L, "Adam Smith", "test", 5000));
    }
}

Pact Broker host and port are injected from application.yml file.

pactbroker:
  host: "192.168.99.100"
  port: "8090"

Performance tests with Gatling

An important step of testing microservices before deploying them on production is performance testing. One of interesting tools in this area is Gatling. It is highly capable load testing tool written in Scala. It means that we also have to use Scala DSL in order to build test scenarios. Let’s begin from adding required library to pom.xml file.

<dependency>
	<groupId>io.gatling.highcharts</groupId>
	<artifactId>gatling-charts-highcharts</artifactId>
	<version>2.3.1</version>
</dependency>

Now, we may proceed to the test. In the scenario visible above we are testing two endpoints exposed by trip-management: POST /trips and PUT /trips/payment/${tripId}. In fact, this scenario verifies the whole functionality of our sample system, where we are setting up trip and then pay for it after finish.
Every test class using Gatling needs to extend Simulation class. We are defining scenario using scenario method and then setting its name. We may define multiple executions inside single scenario. After every execution of POST /trips method test save generated id returned by the service. Then it inserts that id into the URL used for calling method PUT /trips/payment/${tripId}. Every single test expects response with 200 OK status.
Gatling provides two interesting features, which are worth mentioning. You can see how they are used in the following performance test. First of them is feeder. It is used for polling records and injecting their content into the test. Feed rPassengers selects one of five defined logins randomly. The final test result may be verified using Assertions API. It is responsible for verifying global statistics like response time or number of failed requests matches expectations for a whole simulation. In the scenario visible below the criterium is max response time that needs to be lower 100 milliseconds.

class CreateAndPayTripPerformanceTest extends Simulation {

  val rPassengers = Iterator.continually(Map("passenger" -> List("walker","smith","hamilton","scott","holmes").lift(Random.nextInt(5)).get))

  val scn = scenario("CreateAndPayTrip").feed(rPassengers).repeat(100, "n") {
    exec(http("CreateTrip-API")
      .post("http://localhost:8090/trips")
      .header("Content-Type", "application/json")
      .body(StringBody("""{"destination":"test${n}","locationX":${n},"locationY":${n},"username":"${passenger}"}"""))
      .check(status.is(200), jsonPath("$.id").saveAs("tripId"))
    ).exec(http("PayTrip-API")
      .put("http://localhost:8090/trips/payment/${tripId}")
      .header("Content-Type", "application/json")
      .check(status.is(200))
    )
  }

  setUp(scn.inject(atOnceUsers(20))).maxDuration(FiniteDuration.apply(5, TimeUnit.MINUTES))
    .assertions(global.responseTime.max.lt(100))

}

In order to run Gatling performance test you need to include the following Maven plugin to your pom.xml. You may run a single scenario or run multiple scenarios. After including the plugin you only need to execute command mvn clean gatling:test.

<plugin>
	<groupId>io.gatling</groupId>
	<artifactId>gatling-maven-plugin</artifactId>
	<version>2.2.4</version>
	<configuration>
		<simulationClass>pl.piomin.performance.tests.CreateAndPayTripPerformanceTest</simulationClass>
	</configuration>
</plugin>

Here are some diagrams illustrating result of performance tests for our microservice. Because maximum response time has been greater than set inside assertion (100ms), the test has failed.

microservices-testing-2

and …

microservices-testing-3

Summary

The right selection of tools is not the most important element phase of microservices testing. However, right tools can help you facing the key challenges related to it. Hoverfly allows to create full component tests that verifies if your microservice is able to handle delays or error from downstream services. Pact helps you to organize team by sharing and verifying contracts between independently developed microservices. Finally, Gatling can help you implementing load tests for selected scenarios, in order to verify an end-to-end performance of your system.
The source code used as a demo for this article is available on GitHub: https://github.com/piomin/sample-testing-microservices.git. If you find this article interesting for you you may be also interested in some other articles related to this subject:

Advertisements

Chaos Monkey for Spring Boot Microservices

How many of you have never encountered a crash or a failure of your systems in production environment? Certainly, each one of you, sooner or later, has experienced it. If we are not able to avoid a failure, the solution seems to be maintaining our system in the state of permanent failure. This concept underpins the tool invented by Netflix to test the resilience of its IT infrastructure – Chaos Monkey. A few days ago I came across the solution, based on the idea behind Netflix’s tool, designed to test Spring Boot applications. Such a library has been implemented by Codecentric. Until now, I recognize them only as the authors of other interesting solution dedicated for Spring Boot ecosystem – Spring Boot Admin. I have already described this library in one of my previous articles Monitoring Microservices With Spring Boot Admin (https://piotrminkowski.wordpress.com/2017/06/26/monitoring-microservices-with-spring-boot-admin).
Today I’m going to show you how to include Codecentric’s Chaos Monkey in your Spring Boot application, and then implement chaos engineering in sample system consists of some microservices. The Chaos Monkey library can be used together with Spring Boot 2.0, and the current release version of it is 1.0.1. However, I’ll implement the sample using version 2.0.0-SNAPSHOT, because it has some new interesting features not available in earlier versions of this library. In order to be able to download SNAPSHOT version of Codecentric’s Chaos Monkey library you have to remember about including Maven repository https://oss.sonatype.org/content/repositories/snapshots to your repositories in pom.xml.

1. Enable Chaos Monkey for an application

There are two required steps for enabling Chaos Monkey for Spring Boot application. First, let’s add library chaos-monkey-spring-boot to the project’s dependencies.

<dependency>
	<groupId>de.codecentric</groupId>
	<artifactId>chaos-monkey-spring-boot</artifactId>
	<version>2.0.0-SNAPSHOT</version>
</dependency>

Then, we should activate profile chaos-monkey on application startup.

$ java -jar target/order-service-1.0-SNAPSHOT.jar --spring.profiles.active=chaos-monkey

2. Sample system architecture

Our sample system consists of three microservices, each started in two instances, and a service discovery server. Microservices registers themselves against a discovery server, and communicates with each other through HTTP API. Chaos Monkey library is included to every single instance of all running microservices, but not to the discovery server. Here’s the diagram that illustrates the architecture of our sample system.

chaos

The source code of sample applications is available on GitHub in repository sample-spring-chaosmonkey (https://github.com/piomin/sample-spring-chaosmonkey.git). After cloning this repository and building it using mnv clean install command, you should first run discovery-service. Then run two instances of every microservice on different ports by setting -Dserver.port property with an appropriate number. Here’s a set of my running commands.

$ java -jar target/discovery-service-1.0-SNAPSHOT.jar
$ java -jar target/order-service-1.0-SNAPSHOT.jar --spring.profiles.active=chaos-monkey
$ java -jar -Dserver.port=9091 target/order-service-1.0-SNAPSHOT.jar --spring.profiles.active=chaos-monkey
$ java -jar target/product-service-1.0-SNAPSHOT.jar --spring.profiles.active=chaos-monkey
$ java -jar -Dserver.port=9092 target/product-service-1.0-SNAPSHOT.jar --spring.profiles.active=chaos-monkey
$ java -jar target/customer-service-1.0-SNAPSHOT.jar --spring.profiles.active=chaos-monkey
$ java -jar -Dserver.port=9093 target/customer-service-1.0-SNAPSHOT.jar --spring.profiles.active=chaos-monkey

3. Process configuration

In version 2.0.0-SNAPSHOT of chaos-monkey-spring-boot library Chaos Monkey is by default enabled for applications that include it. You may disable it using property chaos.monkey.enabled. However, the only assault which is enabled by default is latency. This type of assault adds a random delay to the requests processed by the application in the range determined by properties chaos.monkey.assaults.latencyRangeStart and chaos.monkey.assaults.latencyRangeEnd. The number of attacked requests is dependent of property chaos.monkey.assaults.level, where 1 means each request and 10 means each 10th request. We can also enable exception and appKiller assault for our application. For simplicity, I set the configuration for all the microservices. Let’s take a look on settings provided in application.yml file.

chaos:
  monkey:
    assaults:
	  level: 8
	  latencyRangeStart: 1000
	  latencyRangeEnd: 10000
	  exceptionsActive: true
	  killApplicationActive: true
	watcher:
	  repository: true
      restController: true

In theory, the configuration visible above should enable all three available types of assaults. However, if you enable latency and exceptions, killApplication will never happen. Also, if you enable both latency and exceptions, all the requests send to application will be attacked, no matter which level is set with chaos.monkey.assaults.level property. It is important to remember about activating restController watcher, which is disabled by default.

4. Enable Spring Boot Actuator endpoints

Codecentric implements a new feature in the version 2.0 of their Chaos Monkey library – the endpoint for Spring Boot Actuator. To enable it for our applications we have to activate it following actuator convention – by setting property management.endpoint.chaosmonkey.enabled to true. Additionally, beginning from version 2.0 of Spring Boot we have to expose that HTTP endpoint to be available after application startup.

management:
  endpoint:
    chaosmonkey:
      enabled: true
  endpoints:
    web:
      exposure:
        include: health,info,chaosmonkey

The chaos-monkey-spring-boot provides several endpoints allowing you to check out and modify configuration. You can use method GET /chaosmonkey to fetch the whole configuration of library. Yo may also disable chaos monkey after starting application by calling method POST /chaosmonkey/disable. The full list of available endpoints is listed here: https://codecentric.github.io/chaos-monkey-spring-boot/2.0.0-SNAPSHOT/#endpoints.

5. Running applications

All the sample microservices stores data in MySQL. So, the first step is to run MySQL database locally using its Docker image. The Docker command visible below also creates database and user with password.

$ docker run -d --name mysql -e MYSQL_DATABASE=chaos -e MYSQL_USER=chaos -e MYSQL_PASSWORD=chaos123 -e MYSQL_ROOT_PASSWORD=123456 -p 33306:3306 mysql

After running all the sample applications, where all microservices are multiplied in two instances listening on different ports, our environment looks like in the figure below.

chaos-4

You will see the following information in your logs during application boot.

chaos-5

We may check out Chaos Monkey configuration settings for every running instance of application by calling the following actuator endpoint.

chaos-3

6. Testing the system

For the testing purposes, I used popular performance testing library – Gatling. It creates 20 simultaneous threads, which calls POST /orders and GET /order/{id} methods exposed by order-service via API gateway 500 times per each thread.

class ApiGatlingSimulationTest extends Simulation {

  val scn = scenario("AddAndFindOrders").repeat(500, "n") {
        exec(
          http("AddOrder-API")
            .post("http://localhost:8090/order-service/orders")
            .header("Content-Type", "application/json")
            .body(StringBody("""{"productId":""" + Random.nextInt(20) + ""","customerId":""" + Random.nextInt(20) + ""","productsCount":1,"price":1000,"status":"NEW"}"""))
            .check(status.is(200),  jsonPath("$.id").saveAs("orderId"))
        ).pause(Duration.apply(5, TimeUnit.MILLISECONDS))
        .
        exec(
          http("GetOrder-API")
            .get("http://localhost:8090/order-service/orders/${orderId}")
            .check(status.is(200))
        )
  }

  setUp(scn.inject(atOnceUsers(20))).maxDuration(FiniteDuration.apply(10, "minutes"))

}

POST endpoint is implemented inside OrderController in add(...) method. It calls find methods exposed by customer-service and product-service using OpenFeign clients. If customer has a sufficient funds and there are still products in stock, it accepts the order and performs changes for customer and product using PUT methods. Here’s the implementation of two methods tested by Gatling performance test.

@RestController
@RequestMapping("/orders")
public class OrderController {

	@Autowired
	OrderRepository repository;
	@Autowired
	CustomerClient customerClient;
	@Autowired
	ProductClient productClient;

	@PostMapping
	public Order add(@RequestBody Order order) {
		Product product = productClient.findById(order.getProductId());
		Customer customer = customerClient.findById(order.getCustomerId());
		int totalPrice = order.getProductsCount() * product.getPrice();
		if (customer != null && customer.getAvailableFunds() >= totalPrice && product.getCount() >= order.getProductsCount()) {
			order.setPrice(totalPrice);
			order.setStatus(OrderStatus.ACCEPTED);
			product.setCount(product.getCount() - order.getProductsCount());
			productClient.update(product);
			customer.setAvailableFunds(customer.getAvailableFunds() - totalPrice);
			customerClient.update(customer);
		} else {
			order.setStatus(OrderStatus.REJECTED);
		}
		return repository.save(order);
	}

	@GetMapping("/{id}")
	public Order findById(@PathVariable("id") Integer id) {
		Optional order = repository.findById(id);
		if (order.isPresent()) {
			Order o = order.get();
			Product product = productClient.findById(o.getProductId());
			o.setProductName(product.getName());
			Customer customer = customerClient.findById(o.getCustomerId());
			o.setCustomerName(customer.getName());
			return o;
		} else {
			return null;
		}
	}

	// ...

}

Chaos Monkey sets random latency between 1000 and 10000 milliseconds (as shown in the step 3). It is important to change default timeouts for Feign and Ribbon clients before starting a test. I decided to set readTimeout to 5000 milliseconds. It will cause some delayed requests to be timed out, while some will succeeded (around 50%-50%). Here’s timeouts configuration for Feign client.

feign:
  client:
    config:
      default:
        connectTimeout: 5000
        readTimeout: 5000
  hystrix:
    enabled: false

Here’s Ribbon client timeouts configuration for API gateway. We have also changed Hystrix settings to disable circuit breaker for Zuul.

ribbon:
  ConnectTimeout: 5000
  ReadTimeout: 5000

hystrix:
  command:
    default:
      execution:
        isolation:
          thread:
            timeoutInMilliseconds: 15000
      fallback:
        enabled: false
      circuitBreaker:
        enabled: false

To launch Gatling performance test go to performance-test directory and run gradle loadTest command. Here’s a result generated for the settings latency assaults. Of course, we can change this result by manipulating Chaos Monkey latency values or Ribbon and Feign timeout values.

chaos-5

Here’s Gatling graph with average response times. Results do not look good. However, we should remember that a single POST method from order-service calls two methods exposed by product-service and two methods exposed by customer-service.

chaos-6

Here’s the next Gatling result graph – this time it illustrates timeline with error and success responses. All HTML reports generated by Gatling during performance test are available under directory performance-test/build/gatling-results

chaos-7

Performance Testing with Gatling

How many of you have ever created automated performance tests before running application on production? Usually, developers attaches importance to the functional testing and tries to provide at least some unit and integration tests. However, sometimes a performance leak may turn out to be more serious than undetected business error, because it can affect the whole system, not the only the one business process.
Personally, I have been implementing performance tests for my application, but I have never run them as a part of the Continuous Integration process. Of course it took place some years, my knowledge and experience were a lot smaller… Anyway, recently I have became interested in topics related to performance testing, partly for the reasons of performance issues with the application in my organisation. As it happens, the key is to find the right tool. Probably many of you have heard about JMeter. Today I’m going to present the competitive solution – Gatling. I’ve read it generates rich and colorful reports with all the metrics collected during the test case. That feature seems to be better than in JMeter.
Before starting the discussion about Gatling let me say some words about theory. We can distinguish between two types of performance testing: load and stress testing. Load testing verifies how the system function under a heavy number of concurrent clients sending requests over a certain period of time. However, the main goal of that type of tests is to simulate the standard traffic similar to that, which may arise on production. Stress testing takes load testing and pushes your app to the limits to see how it handles an extremely heavy load.

What is Gatling?

Gatling is a powerful tool for load testing, written in Scala. It has a full support of HTTP protocols and can also be used for testing JDBC connections and JMS. When using Gatling you have to define test scenario as a Scala dsl code. It is worth to mention that it provides a comprehensive informative HTML load reports and has plugins for inteegration with Gradle, Maven and Jenkins.

Building sample application

Before we run any tests we need to have something for tests. Our sample application is really simple. Its source code is available as usual on GitHub. It exposes RESTful HTTP API with CRUD operations for adding and searching entity in the database. I use Postgres as a backend store for the application repository. The application is build on the top of Spring Boot framework. It also uses Spring Data project as a persistence layer implementation.

plugins {
    id 'org.springframework.boot' version '1.5.9.RELEASE'
}
dependencies {
	compile group: 'org.springframework.boot', name: 'spring-boot-starter-web'
	compile group: 'org.springframework.boot', name: 'spring-boot-starter-data-jpa'
	compile group: 'org.postgresql', name: 'postgresql', version: '42.1.4'
	testCompile group: 'org.springframework.boot', name: 'spring-boot-starter-test'
}

There is one entity Person which is mapped to the table person.

@Entity
@SequenceGenerator(name = "seq_person", initialValue = 1, allocationSize = 1)
public class Person {
	@Id
	@GeneratedValue(strategy = GenerationType.SEQUENCE, generator = "seq_person")
	private Long id;
	@Column(name = "first_name")
	private String firstName;
	@Column(name = "last_name")
	private String lastName;
	@Column(name = "birth_date")
	private Date birthDate;
	@Embedded
	private Address address;
	// ...
}

Database connection settings and hibernate properties are configured in application.yml file.

spring:
  application:
    name: gatling-service
  datasource:
    url: jdbc:postgresql://192.168.99.100:5432/gatling
    username: gatling
    password: gatling123
  jpa:
    properties:
      hibernate:
        hbm2ddl:
          auto: update

server:
  port: 8090

Like I have already mentioned the application exposes API methods for adding and searching persons in database. Here’s our Spring REST controller implementation.

@RestController
@RequestMapping("/persons")
public class PersonsController {

	private static final Logger LOGGER = LoggerFactory.getLogger(PersonsController.class);

	@Autowired
	PersonsRepository repository;

	@GetMapping
	public List<Person> findAll() {
		return (List<Person>) repository.findAll();
	}

	@PostMapping
	public Person add(@RequestBody Person person) {
		Person p = repository.save(person);
		LOGGER.info("add: {}", p.toString());
		return p;
	}

	@GetMapping("/{id}")
	public Person findById(@PathVariable("id") Long id) {
		LOGGER.info("findById: id={}", id);
		return repository.findOne(id);
	}

}

Running database

The next after the sample application development is to run the database. The most suitable way of running it for the purposes is by Docker image. Here’s a Docker command that start Postgres containerand initializes gatling user and database.

docker run -d --name postgres -e POSTGRES_DB=gatling -e POSTGRES_USER=gatling -e POSTGRES_PASSWORD=gatling123 -p 5432:5432 postgres

Providing test scenario

Every Gatling test suite should extends Simulation class. Inside it you may declare a list of scenarios using Gatling Scala DSL. Our goal is to run 30 clients which simultaneously sends requests 1000 times. First, the clients adds new person into the database by calling POST /persons method. Then they try to search person using its id by calling GET /persons/{id} method. So, totally 60k would be sent to the application: 30k to POST endpoint and 30k to GET method. Like you see on the code below the test scenario is quite simple. ApiGatlingSimulationTest is available under directory src/test/scala.

class ApiGatlingSimulationTest extends Simulation {

  val scn = scenario("AddAndFindPersons").repeat(1000, "n") {
        exec(
          http("AddPerson-API")
            .post("http://localhost:8090/persons")
            .header("Content-Type", "application/json")
            .body(StringBody("""{"firstName":"John${n}","lastName":"Smith${n}","birthDate":"1980-01-01", "address": {"country":"pl","city":"Warsaw","street":"Test${n}","postalCode":"02-200","houseNo":${n}}}"""))
            .check(status.is(200))
        ).pause(Duration.apply(5, TimeUnit.MILLISECONDS))
  }.repeat(1000, "n") {
        exec(
          http("GetPerson-API")
            .get("http://localhost:8090/persons/${n}")
            .check(status.is(200))
        )
  }

  setUp(scn.inject(atOnceUsers(30))).maxDuration(FiniteDuration.apply(10, "minutes"))

}

To enable Gatling framework for the project we should also define the following dependency in the Gradle build file.

testCompile group: 'io.gatling.highcharts', name: 'gatling-charts-highcharts', version: '2.3.0'

Running tests

There are some Gradle plugins available, which provides support for running tests during project build. However, we may also define simple gradle task that just run tests using io.gatling.app.Gatling class.

task loadTest(type: JavaExec) {
   dependsOn testClasses
   description = "Load Test With Gatling"
   group = "Load Test"
   classpath = sourceSets.test.runtimeClasspath
   jvmArgs = [
        "-Dgatling.core.directory.binaries=${sourceSets.test.output.classesDir.toString()}"
   ]
   main = "io.gatling.app.Gatling"
   args = [
           "--simulation", "pl.piomin.services.gatling.ApiGatlingSimulationTest",
           "--results-folder", "${buildDir}/gatling-results",
           "--binaries-folder", sourceSets.test.output.classesDir.toString(),
           "--bodies-folder", sourceSets.test.resources.srcDirs.toList().first().toString() + "/gatling/bodies",
   ]
}

The Gradle task defined above may be run with command gradle loadTest. Of course, before running tests you should launch the application. You may perform it from your IDE by starting the main class pl.piomin.services.gatling.ApiApplication or by running command java -jar build/libs/sample-load-test-gatling.jar.

Test reports

After test execution the report is printed in a text format.

================================================================================
---- Global Information --------------------------------------------------------
> request count                                      60000 (OK=60000  KO=0     )
> min response time                                      2 (OK=2      KO=-     )
> max response time                                   1338 (OK=1338   KO=-     )
> mean response time                                    80 (OK=80     KO=-     )
> std deviation                                        106 (OK=106    KO=-     )
> response time 50th percentile                         50 (OK=50     KO=-     )
> response time 75th percentile                         93 (OK=93     KO=-     )
> response time 95th percentile                        253 (OK=253    KO=-     )
> response time 99th percentile                        564 (OK=564    KO=-     )
> mean requests/sec                                319.149 (OK=319.149 KO=-     )
---- Response Time Distribution ------------------------------------------------
> t < 800 ms                                         59818 (100%) > 800 ms < t < 1200 ms                                 166 (  0%) > t > 1200 ms                                           16 (  0%)
> failed                                                 0 (  0%)
================================================================================

But that what is really cool in Gatling is an ability to generate reports in a graphical form. HTML reports are available under directory build/gatling-results. The first report shows global information with total number of requests and maximum response time by percentiles. For example, you may see that maximum response time in 95% of responses for GetPerson-API is 206 ms.

gatling-1

We may check out such report for all requests or filter them to see only those generated by selected API. In the picture below there is visualization only for GetPerson-API.

gatling-2

Here’s the graph with percentage of requests grouped by average response time.

gatling-3

Here’s the graph which ilustrates timeline with average response times. Additionally, that timeline also shows the statistics by percentiles.

gatling-4

Here’s the graph with number of requests processed succesfully by the application in a second.

gatling-5