Generating large PDF files using JasperReports

During the last ‘Code Europe’ conference in Warsaw appeared many topics related to microservices architecture. Several times I heard the conclusion that the best candidate for separation from monolith is service that generates PDF reports. It’s usually quite independent from the other parts of application. I can see a similar approach in my organization, where first microservice running in production mode was the one that generates PDF reports. To my surprise, the vendor which developed that microservice had to increase maximum heap size to 1GB on each of its instances. This has forced me to take a closer look at the topic of PDF reports generation process.
The most popular Java library for creating PDF files is JasperReports. During generation process, this library by default stores all objects in RAM memory. If such reports are large, this could be a problem my vendor encountered. Their solution, as I have mentioned before, was to increase the maximum size of Java heap ūüôā

This time, unlike usual, I’m going to start with the test implementation. Here’s simple JUnit test with 20 requests per second sending to service endpoint.

public class JasperApplicationTest {

	protected Logger logger = Logger.getLogger(JasperApplicationTest.class.getName());
	TestRestTemplate template = new TestRestTemplate();

	@Test
	public void testGetReport() throws InterruptedException {
		List<HttpStatus> responses = new ArrayList<>();
		Random r = new Random();
		int i = 0;
		for (; i < 20; i++) {
			new Thread(new Runnable() {
				@Override
				public void run() {
					int age = r.nextInt(99);
					long start = System.currentTimeMillis();
					ResponseEntity<InputStreamResource> res = template.getForEntity("http://localhost:2222/pdf/{age}", InputStreamResource.class, age);
					logger.info("Response (" +  (System.currentTimeMillis()-start) + "): " + res.getStatusCode());
					responses.add(res.getStatusCode());
					try {
						Thread.sleep(50);
					} catch (InterruptedException e) {
						e.printStackTrace();
					}
				}
			}).start();
		}

		while (responses.size() != i) {
			Thread.sleep(500);
		}
		logger.info("Test finished");
	}
}

In my test scenario I inserted about 1M records into the person table. Everything works fine during running test. Generated files had about 500kb size and 200 pages. All requests were succeeded and each of them had been processed about 8 seconds. In comparison with single request which had been processed 4 seconds it seems to be a good result. The situation with RAM is worse as you can see in the figure below. After generating 20 PDF reports allocated heap size increases to more than 1GB and used heap size was about 550MB. Also CPU usage during report generation increased to 100% usage. I could easily image generating files bigger than 500kb in the production mode…

jasper-1

In our situation we have two options. We can always add more RAM memory or … look for another choice ūüôā Jasper library comes with solution – Virtualizers. The virtualizer cuts the jasper report print into different files and save them on the hard drive and/or compress it. There are three types of virtualizers:
JRFileVirtualizer, JRSwapFileVirtualizer and JRGzipVirtualizer. You can read more about them here. Now, look at the figure below. Here’s illustration of memory and CPU usage for the test with JRFileVirtualizer. It looks a little better than the previous figure, but it does not knock us down ūüôā However, requests with the same overload as for the previous test take much longer – about 30 seconds. It’s not a good message, but at least the heap size allocation is not increases as fast as for previous sample.

jasper-2

Same test has been performed for JRSwapFileVirtualizer. The requests was average processed around 10 seconds. The graph illustrating CPU and memory usage is rather more similar to in memory test than JRFileVirtualizer test.

jasper-3

To see the difference between those three scenarios we have to run our application with maximum heap size set. For my tests I set -Xmx128m -Xms128m. For test with file virtualizers we receive HTTP responses with PDF reports, but for in memory tests the exception is thrown by the sample application: java.lang.OutOfMemoryError: GC overhead limit exceeded.

For testing purposes I created Spring Boot application. Sample source code is available as usual on GitHub. Here’s full list of Maven dependencies for that project.

<dependency>
	<groupId>net.sf.jasperreports</groupId>
	<artifactId>jasperreports</artifactId>
	<version>6.4.0</version>
</dependency>
<dependency>
	<groupId>org.springframework.boot</groupId>
	<artifactId>spring-boot-starter-web</artifactId>
</dependency>
<dependency>
	<groupId>org.springframework.boot</groupId>
	<artifactId>spring-boot-starter-data-jpa</artifactId>
</dependency>
<dependency>
	<groupId>org.springframework.boot</groupId>
	<artifactId>spring-boot-starter-test</artifactId>
	<scope>test</scope>
</dependency>
<dependency>
	<groupId>mysql</groupId>
	<artifactId>mysql-connector-java</artifactId>
	<scope>runtime</scope>
</dependency>

Here’s application main class. There are @Bean declarations of file virtualizers and JasperReport which is responsible for template compilation from .jrxml file. To run application for testing purposes type java -jar -Xms64m -Xmx128m -Ddirectory=C:\Users\minkowp\pdf sample-jasperreport-boot.jar.

@SpringBootApplication
public class JasperApplication {

	@Value("${directory}")
	private String directory;

	public static void main(String[] args) {
		SpringApplication.run(JasperApplication.class, args);
	}

	@Bean
	JasperReport report() throws JRException {
		JasperReport jr = null;
		File f = new File("personReport.jasper");
		if (f.exists()) {
			jr = (JasperReport) JRLoader.loadObject(f);
		} else {
			jr = JasperCompileManager.compileReport("src/main/resources/report.jrxml");
			JRSaver.saveObject(jr, "personReport.jasper");
		}
		return jr;
	}

	@Bean
	JRFileVirtualizer fileVirtualizer() {
		return new JRFileVirtualizer(100, directory);
	}

	@Bean
	JRSwapFileVirtualizer swapFileVirtualizer() {
		JRSwapFile sf = new JRSwapFile(directory, 1024, 100);
		return new JRSwapFileVirtualizer(20, sf, true);
	}

}

There are three endpoints exposed for the tests:
/pdf/{age} – in memory PDF generation
/pdf/fv/{age} – PDF generation with JRFileVirtualizer
/pdf/sfv/{age} – PDF generation with JRSwapFileVirtualizer

Here’s method generating PDF report. Report is generated in fillReport static method from JasperFillManager. It takes three parameters as input: JasperReport which encapsulates compiled .jrxml template file, JDBC connection object and map of parameters. Then report is ganerated and saved on disk as a PDF file. File is returned as an attachement in the response.

	private ResponseEntity<InputStreamResource> generateReport(String name, Map<String, Object> params) {
		FileInputStream st = null;
		Connection cc = null;
		try {
			cc = datasource.getConnection();
			JasperPrint p = JasperFillManager.fillReport(jasperReport, params, cc);
			JRPdfExporter exporter = new JRPdfExporter();
			SimpleOutputStreamExporterOutput c = new SimpleOutputStreamExporterOutput(name);
			exporter.setExporterInput(new SimpleExporterInput(p));
			exporter.setExporterOutput(c);
			exporter.exportReport();

			st = new FileInputStream(name);
			HttpHeaders responseHeaders = new HttpHeaders();
			responseHeaders.setContentType(MediaType.valueOf("application/pdf"));
			responseHeaders.setContentDispositionFormData("attachment", name);
			responseHeaders.setContentLength(st.available());
		    return new ResponseEntity<InputStreamResource>(new InputStreamResource(st), responseHeaders, HttpStatus.OK);
		} catch (Exception e) {
			e.printStackTrace();
		} finally {
			fv.cleanup();
			sfv.cleanup();
			if (cc != null)
				try {
					cc.close();
				} catch (SQLException e) {
					e.printStackTrace();
				}
		}
		return null;
	}

To enable virtualizer during report generation we only have to pass one parameter to the map of parameters – instance of virtualizer object.

	@Autowired
	JRFileVirtualizer fv;
	@Autowired
	JRSwapFileVirtualizer sfv;
	@Autowired
	DataSource datasource;
	@Autowired
	JasperReport jasperReport;

	@ResponseBody
	@RequestMapping(value = "/pdf/fv/{age}")
	public ResponseEntity<InputStreamResource> getReportFv(@PathVariable("age") int age) {
		logger.info("getReportFv(" + age + ")");
		Map<String, Object> m = new HashMap<>();
		m.put(JRParameter.REPORT_VIRTUALIZER, fv);
		m.put("age", age);
		String name = ++count + "personReport.pdf";
		return generateReport(name, m);
	}

Template file report.jrxml is available under /src/main/resources directory. Inside queryString tag there is SQL query which takes age parameter in WHERE statement. There are also five columns declared all taken from SQL query result.

<?xml version = "1.0" encoding = "UTF-8"?>
<!DOCTYPE jasperReport PUBLIC "//JasperReports//DTD Report Design//EN"    "http://jasperreports.sourceforge.net/dtds/jasperreport.dtd">

<jasperReport xmlns="http://jasperreports.sourceforge.net/jasperreports"               xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"               xsi:schemaLocation="http://jasperreports.sourceforge.net/jasperreports    http://jasperreports.sourceforge.net/xsd/jasperreport.xsd"               name="report2" pageWidth="595" pageHeight="842"                columnWidth="555" leftMargin="20" rightMargin="20"               topMargin="20" bottomMargin="20">
    <parameter name="age" class="java.lang.Integer"/>
    <queryString>
        <![CDATA[SELECT * FROM person WHERE age = $P{age}]]>
    </queryString>
    <field name="id" class="java.lang.Integer" />
    <field name="first_name" class="java.lang.String" />
    <field name="last_name" class="java.lang.String" />
    <field name="age" class="java.lang.Integer" />
    <field name="pesel" class="java.lang.String" />

    <detail>
        <band height="15">

            <textField>
                <reportElement x="0" y="0" width="50" height="15" />

                <textElement textAlignment="Right" verticalAlignment="Middle"/>

                <textFieldExpression class="java.lang.Integer">
                    <![CDATA[$F{id}]]>
                </textFieldExpression>
            </textField>       

            <textField>
                <reportElement x="100" y="0" width="80" height="15" />

                <textElement textAlignment="Left" verticalAlignment="Middle"/>

                <textFieldExpression class="java.lang.String">
                    <![CDATA[$F{first_name}]]>
                </textFieldExpression>
            </textField> 

            <textField>
                <reportElement x="200" y="0" width="80" height="15" />

                <textElement textAlignment="Left" verticalAlignment="Middle"/>

                <textFieldExpression class="java.lang.String">
                    <![CDATA[$F{last_name}]]>
                </textFieldExpression>
            </textField>               

            <textField>
                <reportElement x="300" y="0" width="50" height="15"/>
                <textElement textAlignment="Right" verticalAlignment="Middle"/>

                <textFieldExpression class="java.lang.Integer">
                    <![CDATA[$F{age}]]>
                </textFieldExpression>
            </textField>

           <textField>
                <reportElement x="380" y="0" width="80" height="15" />

                <textElement textAlignment="Left" verticalAlignment="Middle"/>

                <textFieldExpression class="java.lang.String">
                    <![CDATA[$F{pesel}]]>
                </textFieldExpression>
            </textField>         

        </band>
    </detail>

</jasperReport>

And the last thing we have to do is to properly set database connection pool settings. A natural choice for Spring Boot application is Tomcat JDBC pool.

spring:
  application:
    name: jasper-service
  datasource:
    url: jdbc:mysql://192.168.99.100:33306/datagrid?useSSL=false
    username: datagrid
    password: datagrid
    tomcat:
      initial-size: 20
      max-active: 30

Final words

In this article I showed you how to avoid out of memory exception while generating large PDF reports with JasperReports. I compared three solutions: in memory generation and two methods based on cutting the jasper print into different files and save them on the hard drive. For me, the most interesting was the solution based on single swapped file with JRSwapFileVirtualizer. It is slower a little than in memory generation but works faster than similar tests for JRFileVirtualizer and in contrast to in memory generation didn’t avoid out of memory exception for files larger than 500kb with 20 requests per second.

Exposing Microservices over REST Protocol Buffers

Today exposing RESTful API with JSON protocol is the most common standard. We can find many articles describing advantages and disadvantages of JSON versus XML. Both these protocols exchange messages in text format. If an important aspect affecting to the choice of communication protocol in your systems is performance you should definitely pay attention to Protocol Buffers. It is a binary format created by Google as:

A language-neutral, platform-neutral, extensible way of serializing structured data for use in communications protocols, data storage, and more.

Protocol Buffers, which is sometimes referred as Protobuf is not only a message format but also a set of language rules that define the structure of messages. It is extremely useful in service to service communication what has been very well described in that article Beating JSON performance with Protobuf. In that example Protobuf was about 5 times faster than JSON for tests based on Spring Boot framework.

Introduction to Protocol Buffers can be found here. My sample is similar to previous samples from my weblog – it is based on two microservices account and customer which calls one of account’s endpoint. Let’s begin from message types definition provided inside .proto file. Place your .proto file in src/main/proto¬†directory. Here’s account.proto defined in account service. We set java_package and java_outer_classname to define package and name of Java generated class. Message definition syntax is pretty intuitive. Account object generated from that file¬†has three properties id, customerId and number. There is also Accounts object which wrappes list of Account objects.

syntax = "proto3";

package model;

option java_package = "pl.piomin.services.protobuf.account.model";
option java_outer_classname = "AccountProto";

message Accounts {
	repeated Account account = 1;
}

message Account {

	int32 id = 1;
	string number = 2;
	int32 customer_id = 3;

}

Here’s .proto file definition from customer service. It a little more complicated than the previous one from account service. In addition to its definitions it contains definitions of account service messages, because they are used by @Feign client.

syntax = "proto3";

package model;

option java_package = "pl.piomin.services.protobuf.customer.model";
option java_outer_classname = "CustomerProto";

message Accounts {
	repeated Account account = 1;
}

message Account {

	int32 id = 1;
	string number = 2;
	int32 customer_id = 3;

}

message Customers {
	repeated Customer customers = 1;
}

message Customer {

	int32 id = 1;
	string pesel = 2;
	string name = 3;
	CustomerType type = 4;
	repeated Account accounts = 5;

	enum CustomerType {
		INDIVIDUAL = 0;
		COMPANY = 1;
	}

}

We generate source code from the message definitions above by using protobuf-maven-plugin maven plugin. Plugin needs to have protocExecutable file location set. It can be downloaded from Google’s Protocol Buffer download site.

<plugin>
	<groupId>org.xolstice.maven.plugins</groupId>
	<artifactId>protobuf-maven-plugin</artifactId>
	<version>0.5.0</version>
	<executions>
		<execution>
			<id>protobuf-compile</id>
			<phase>generate-sources</phase>
			<goals>
				<goal>compile</goal>
			</goals>
			<configuration>
				<outputDirectory>src/main/generated</outputDirectory>
				<protocExecutable>${proto.executable}</protocExecutable>
			</configuration>
		</execution>
	</executions>
</plugin>

Protobuf classes are generated into src/main/generated output directory. Let’s add that source directory to maven sources with build-helper-maven-plugin.

<plugin>
	<groupId>org.codehaus.mojo</groupId>
	<artifactId>build-helper-maven-plugin</artifactId>
	<executions>
		<execution>
			<id>add-source</id>
			<phase>generate-sources</phase>
			<goals>
				<goal>add-source</goal>
			</goals>
			<configuration>
				<sources>
					<source>src/main/generated</source>
				</sources>
			</configuration>
		</execution>
	</executions>
</plugin>

Sample application source code is available on GitHub. Before proceeding to the next steps build application using mvn clean install command. Generated classes are available under src/main/generated and our microservices are ready to run. Now, let me describe some implementation details. We need two dependencies in maven pom.xml to use Protobuf.

<dependency>
	<groupId>com.google.protobuf</groupId>
	<artifactId>protobuf-java</artifactId>
	<version>3.3.1</version>
</dependency>
<dependency>
	<groupId>com.googlecode.protobuf-java-format</groupId>
	<artifactId>protobuf-java-format</artifactId>
	<version>1.4</version>
</dependency>

Then, we need to declare default HttpMessageConverter @Bean and inject it into RestTemplate @Bean.

    @Bean
    @Primary
    ProtobufHttpMessageConverter protobufHttpMessageConverter() {
        return new ProtobufHttpMessageConverter();
    }

    @Bean
    RestTemplate restTemplate(ProtobufHttpMessageConverter hmc) {
        return new RestTemplate(Arrays.asList(hmc));
    }

Here’s REST @Controller code. Account and Accounts from AccountProto generated class are returned as a response body in all three API methods visible below. All objects generated from .proto files have newBuilder method used for creating new object instances. I also set application/x-protobuf as default response content type.

@RestController
public class AccountController {

	@Autowired
	AccountRepository repository;

	protected Logger logger = Logger.getLogger(AccountController.class.getName());

	@RequestMapping(value = "/accounts/{number}", produces = "application/x-protobuf")
	public Account findByNumber(@PathVariable("number") String number) {
		logger.info(String.format("Account.findByNumber(%s)", number));
		return repository.findByNumber(number);
	}

	@RequestMapping(value = "/accounts/customer/{customer}", produces = "application/x-protobuf")
	public Accounts findByCustomer(@PathVariable("customer") Integer customerId) {
		logger.info(String.format("Account.findByCustomer(%s)", customerId));
		return Accounts.newBuilder().addAllAccount(repository.findByCustomer(customerId)).build();
	}

	@RequestMapping(value = "/accounts", produces = "application/x-protobuf")
	public Accounts findAll() {
		logger.info("Account.findAll()");
		return Accounts.newBuilder().addAllAccount(repository.findAll()).build();
	}

}

Method GET /accounts/customer/{customer} is called from customer service using @Feign client.

@FeignClient(value = "account-service")
public interface AccountClient {

    @RequestMapping(method = RequestMethod.GET, value = "/accounts/customer/{customerId}")
    Accounts getAccounts(@PathVariable("customerId") Integer customerId);

}

We can easily test described configuration using JUnit test class visible below.

@SpringBootTest(webEnvironment = WebEnvironment.RANDOM_PORT)
@RunWith(SpringRunner.class)
public class AccountApplicationTest {

	protected Logger logger = Logger.getLogger(AccountApplicationTest.class.getName());

	@Autowired
	TestRestTemplate template;

	@Test
	public void testFindByNumber() {
		Account a = this.template.getForObject("/accounts/{id}", Account.class, "111111");
		logger.info("Account[\n" + a + "]");
	}

	@Test
	public void testFindByCustomer() {
		Accounts a = this.template.getForObject("/accounts/customer/{customer}", Accounts.class, "2");
		logger.info("Accounts[\n" + a + "]");
	}

	@Test
	public void testFindAll() {
		Accounts a = this.template.getForObject("/accounts", Accounts.class);
		logger.info("Accounts[\n" + a + "]");
	}

	@TestConfiguration
	static class Config {

		@Bean
		public RestTemplateBuilder restTemplateBuilder() {
			return new RestTemplateBuilder().additionalMessageConverters(new ProtobufHttpMessageConverter());
		}

	}

}

Conclusion

This article shows how to enable Protocol Buffers for microservices project based on Spring Boot. Protocol Buffer is an alternative to text-based protocols like XML or JSON and surpasses them in terms of performance. Adapt to this protocol using in Spring Boot application is pretty simple. For microservices we can still uses Spring Cloud components like Feign or Ribbon in combination with Protocol Buffers same as with REST over JSON or XML.

Circuit Breaker, Fallback and Load Balancing with Apache Camel

Apache Camel has just released a new version of their framework Р2.19. In one of my previous articles on DZone I described details about microservices support which was released in the Camel 2.18 version. There are some new features in ServiceCall EIP component, which is responsible for microservice calls. You can see example source code which is based on the sample from my article on DZone. It is available on GitHub under new branch fallback.

In the code fragment below you can see DLS route’s configuration with support for Hystrix circuit breaker, Ribbon load balancer and Consul service discovery and registration. As a service discovery in the route definition you can also use some other solutions instead of Consul like etcd (etcServiceDiscovery) or Kubernetes (kubernetesServiceDiscovery).

from("direct:account")
	.to("bean:customerService?method=findById(${header.id})")
	.log("Msg: ${body}").enrich("direct:acc", new AggregationStrategyImpl());

from("direct:acc").setBody().constant(null)
	.hystrix()
		.hystrixConfiguration()
			.executionTimeoutInMilliseconds(2000)
		.end()
	.serviceCall()
		.name("account//account")
		.component("netty4-http")
		.ribbonLoadBalancer("ribbon-1")
		.consulServiceDiscovery("http://192.168.99.100:8500")
	.end()
	.unmarshal(format)
	.endHystrix()
	.onFallback()
	.to("bean:accountFallback?method=getAccounts");

We can easily configure all Hystrix’s parameters just by calling hystrixConfiguration method. In the sample above Hystrix waits max 2 seconds for the¬†response from remote service. In case of timeout fallback @Bean is called. Fallback @Bean implementation is really simple – it return empty list.

@Service
public class AccountFallback {

	public List<Account> getAccounts() {
		return new ArrayList<>();
	}

}

Alternatively, configuration can be implemented using object delarations. Here is service call configuration with Ribbon and Consul. Additionally, we can provide some parameters to Ribbon like client read timeout or max retry attempts. Unfortunately it seems they doesn’t work in this version of Apache Camel ūüôā (you can try to test it by yourself).¬†I hope this will be corrected soon.

ServiceCallConfigurationDefinition def = new ServiceCallConfigurationDefinition();

ConsulConfiguration config = new ConsulConfiguration();
config.setUrl("http://192.168.99.100:8500");
config.setComponent("netty4-http");
ConsulServiceDiscovery discovery = new ConsulServiceDiscovery(config);

RibbonConfiguration c = new RibbonConfiguration();
c.addProperty("MaxAutoRetries", "0");
c.addProperty("MaxAutoRetriesNextServer", "1");
c.addProperty("ReadTimeout", "1000");
c.setClientName("ribbon-1");
RibbonServiceLoadBalancer lb = new RibbonServiceLoadBalancer(c);
lb.setServiceDiscovery(discovery);

def.setComponent("netty4-http");
def.setLoadBalancer(lb);
def.setServiceDiscovery(discovery);
context.setServiceCallConfiguration(def);

I described similar case for Spring Cloud and Netflix OSS in one of my previous article. Just like in the example presented there, I also set here a delay inside account service, which depends on the port on which the microservice was started.

@Value("${port}")
private int port;

public List<Account> findByCustomerId(Integer customerId) {
	List<Account> l = new ArrayList<>();
	l.add(new Account(1, "1234567890", 4321, customerId));
	l.add(new Account(2, "1234567891", 12346, customerId));
	if (port%2 == 0) {
		try {
			Thread.sleep(5000);
		} catch (InterruptedException e) {
			e.printStackTrace();
		}
	}
	return l;
}

Results for Spring Cloud sample were much more satisfying. The introduced configuration parameters such as read timeout for Ribbon worked and in addition Hystrix was able to automatically redirect a much smaller number of requests to slow service – only 2% of the rest to the non-blocking thread instance for 5 seconds. This shows that Apache Camel still has a few things to improve if wants to compete in¬†microservice’s support with Sprint Cloud framework.

Spring Cloud Microservices at Pivotal Platform

Imagine you have multiple microservices running on different machines as multiple instances. It seems natural to think about the tools that helps you¬†in the process of monitoring and managing all of them.¬†If we add that our microservices are¬†created based on the Spring Cloud framework obviously seems we should look at the Pivotal platform. Here is figure with platform’s architecture download from the main Pivotal’s site.

PVDI-Microservices-Architecture

Although Pivotal Platform can run applications written in many languages it has the best support for Spring Cloud Services and Netflix OSS tools like you can see in the figure above. From the possibilities offered by Pivotal we can take advantage of three ways.

Pivotal Cloud Foundry – solution can be ran on public IaaS or private cloud like AWS, Google Cloud Platform, Microsoft Azure, VMware vSphere, OpenStack.

Pivotal Web Services – hosted cloud-native platform available at pivotal.io site.

PCF Dev Рthe instance which can be run locally as a a single virtual machine. It offers the opportunity to develop apps using an offline environment which basic services installed like Spring Cloud Services (SCS), MySQL, Redis databases and RabbitMQ broker. If you want to run it locally with SCS you need more than 6GB RAM free.

As a Spring Cloud Services there are available Circuit Breaker (Hystrix), Service Registry (Eureka) and standard Spring Configuration Server based on git configuration.

scs

That’s all I wanted to say about the theory. Let’s move on to practice. On the Pivotal website we have detailed materials on how to set it up, create and deploy a simple microservice based on Spring Cloud solutions. In this article I will try to present the essence collected from these descriptions based on one of my standard examples from the previous posts. As always sample source code is available on GitHub. If you are interested in detailed description of the sample application, microservices and Spring Cloud read my previous articles:

Part 1: Creating microservice using Spring Cloud, Eureka and Zuul

Part 3: Creating Microservices: Circuit Breaker, Fallback and Load Balancing with Spring Cloud

If you have a lot of free RAM you can install PCF Dev on your local workstation. You need to have Virtual Box installed. Then download and install Cloud Foundry Command Line Interface (CF CLI) and PCF Dev. All is described here. Finally you can run command below and take a small break for coffee. Virtual machine needs to downloaded and started.

cf dev start -s scs

For those who do not have RAM enough (like me) there is Pivotal Web Services platform. It is available here. Before use it you have to register on Pivotal site. The rest of the article is identical for both options.
In comparison to previous examples of Spring Cloud based microservices, we need to make some changes. There is one additional dependency inside every microservice’s pom.xml.

<properties>
	...
	<spring-cloud-services.version>1.4.1.RELEASE</spring-cloud-services.version>
	<spring-cloud.version>Dalston.RELEASE</spring-cloud.version>
</properties>

<dependencies>
	<dependency>
		<groupId>io.pivotal.spring.cloud</groupId>
		<artifactId>spring-cloud-services-starter-service-registry</artifactId>
	</dependency>
	...
</dependencies>

<dependencyManagement>
	<dependencies>
		<dependency>
			<groupId>org.springframework.cloud</groupId>
			<artifactId>spring-cloud-dependencies</artifactId>
			<version>${spring-cloud.version}</version>
			<type>pom</type>
			<scope>import</scope>
		</dependency>
		<dependency>
			<groupId>io.pivotal.spring.cloud</groupId>
			<artifactId>spring-cloud-services-dependencies</artifactId>
			<version>${spring-cloud-services.version}</version>
			<type>pom</type>
			<scope>import</scope>
		</dependency>
	</dependencies>
</dependencyManagement>

We also use Maven Cloud Foundry plugin cf-maven-plugin for application deployment on Pivotal platform. Here is sample for account-service. We run two instances of that microservice with max memory 512MB. Our application name is piomin-account-service.

<plugin>
	<groupId>org.cloudfoundry</groupId>
	<artifactId>cf-maven-plugin</artifactId>
	<version>1.1.3</version>
	<configuration>
		<target>http://api.run.pivotal.io</target>
		<org>piotrminkowski</org>
		<space>development</space>
		<appname>piomin-account-service</appname>
		<memory>512</memory>
		<instances>2</instances>
		<server>cloud-foundry-credentials</server>
	</configuration>
</plugin>

Don’t forget to add credentials configuration into Maven settings.xml file.

<server>
	<id>cloud-foundry-credentials</id>
	<username>piotr.minkowski@gmail.com</username>
	<password>***</password>
</server>

Now, when building sample application we to append cf:push command.

mvn clean install cf:push

Here is circuit breaker implementation inside customer-service.

@Service
public class AccountService {

	@Autowired
	private AccountClient client;

	@HystrixCommand(fallbackMethod = "getEmptyList")
	public List<Account> getAccounts(Integer customerId) {
		return client.getAccounts(customerId);
	}

	List<Account> getEmptyList(Integer customerId) {
		return new ArrayList<>();
	}

}

There is randomly generated delay on the account’s service side, so 25% of calls circuit breaker should be activated.

@RequestMapping("/accounts/customer/{customer}")
public List<Account> findByCustomer(@PathVariable("customer") Integer customerId) {
	logger.info(String.format("Account.findByCustomer(%s)", customerId));
	Random r = new Random();
	int rr = r.nextInt(4);
	if (rr == 1) {
		try {
			Thread.sleep(2000);
		} catch (InterruptedException e) {
			e.printStackTrace();
		}
	}
	return accounts.stream().filter(it -> it.getCustomerId().intValue() == customerId.intValue())
		.collect(Collectors.toList());
}

After successfully deploying application using Maven cf:push command we can go to Pivotal Web Services console available at https://console.run.pivotal.io/. Here are our two deployed services: two instances of piomin-account-service and one instance of piomin-customer-service.

pivotal-1

I have also activated Circuit Breaker and Service Registry from Marketplace.

pivotal-2

Every application need to be bound to service. To enable it select service, then expand Bound Apps overlap and select checkbox next to each service name.

pivotal-4

After this step applications needs to be restarted. It also can be be using web dashboard inside each service.

pivotal-5

Finally, all services are registered in Eureka and we can perform some tests using customer endpoint https://piomin-customer-service.cfapps.io/customers/{id}.

pivotal-4

Final words

With Pivotal solution we can easily deploy, scale and monitor our microservices. Deployment and scaling can be done using Maven plugin or via web dashboard. On Pivotal there are also available some services prepared especially for microservices needs like service registry, circuit breaker and configuration server. Pivotal is a competition for such solutions like Kubernetes which based on Docker containerization (more about this tools here). It is especially useful if you are creating a microservices based on Spring Boot and Spring Cloud frameworks.

Part 3: Creating Microservices: Circuit Breaker, Fallback and Load Balancing with Spring Cloud

Probably you read some articles about Hystrix and you know in what purpose it is used for. Today I would like to show you an example of exactly how to use it, which gives you the ability to combine with other tools from Netflix OSS stack like Feign and Ribbon. In this I assume that you have basic knowledge on topics such as microservices, load balancing, service discovery. If not I suggest you read some articles about it, for example my short introduction to microservices architecture available here: Part 1: Creating microservice using Spring Cloud, Eureka and Zuul. The code sample used in that article is also also used now. There is also sample source code available on GitHub. For the sample described now see hystrix branch, for basic sample master branch. 

Let’s look at some scenarios for using fallback and circuit breaker. We have Customer Service which calls API method from Account Service. There two running instances of Account Service. The requests to Account Service instances are load balanced by Ribbon client 50/50.

micro-details-1

Scenario 1

Hystrix is disabled for Feign client (1), auto retries mechanism is disabled for Ribbon client on local instance (2) and other instances (3). Ribbon read timeout is shorter than request max process time (4). This scenario also occurs with the default Spring Cloud configuration without Hystrix. When you call customer test method you sometimes receive full response and sometimes 500 HTTP error code (50/50).

ribbon:
  eureka:
    enabled: true
  MaxAutoRetries: 0 #(2)
  MaxAutoRetriesNextServer: 0 #(3)
  ReadTimeout: 1000 #(4)

feign:
  hystrix:
    enabled: false #(1)

Scenario 2

Hystrix is still disabled for Feign client (1), auto retries mechanism is disabled for Ribbon client on local instance (2) but enabled on other instances once (3). You always receive full response. If your request is received by instance with delayed response it is timed out after 1 second and then Ribbon calls another instance – in that case not delayed. You can always change MaxAutoRetries to positive value but gives us nothing in that sample.

ribbon:
  eureka:
    enabled: true
  MaxAutoRetries: 0 #(2)
  MaxAutoRetriesNextServer: 1 #(3)
  ReadTimeout: 1000 #(4)

feign:
  hystrix:
    enabled: false #(1)

Scenario 3

Here is not a very elegant solution to the problem. We set ReadTimeout on value bigger than delay inside API method (5000 ms).

ribbon:
  eureka:
    enabled: true
  MaxAutoRetries: 0
  MaxAutoRetriesNextServer: 0
  ReadTimeout: 10000

feign:
  hystrix:
    enabled: false

Generally configuration from Scenario 2 and 3 is right, you always get the full response. But in some cases you will wait more than 1 second (Scenario 2) or more than 5 seconds (Scenario 3) and delayed instance receives 50% requests from Ribbon client. But fortunately there is Hystrix – circuit breaker.

Scenario 4

Let’s enable Hystrix just by removing feign property. There is no auto retries for Ribbon client (1) and its read timeout (2) is bigger than Hystrix’s timeout (3). 1000ms is also default value for Hystrix timeoutInMilliseconds property. Hystrix circuit breaker and fallback will work for delayed instance of account service. For some first requests you receive fallback response from Hystrix. Then delayed instance will be cut off from requests, most of them will be directed to not delayed instance.

ribbon:
  eureka:
    enabled: true
  MaxAutoRetries: 0 #(1)
  MaxAutoRetriesNextServer: 0
  ReadTimeout: 2000 #(2)

hystrix:
  command:
    default:
      execution:
        isolation:
          thread:
            timeoutInMilliseconds: 1000 #(3)

Scenario 5

This scenario is a more advanced development of Scenario 4. Now Ribbon timeout (2) is lower than Hystrix timeout (3) and also auto retries mechanism is enabled (1) for local instance and for other instances (4). The result is same as for Scenario 2 and 3 – you receive full response, but Hystrix is enabled and it cuts off delayed instance from future requests.

ribbon:
  eureka:
    enabled: true
  MaxAutoRetries: 3 #(1)
  MaxAutoRetriesNextServer: 1 #(4)
  ReadTimeout: 1000 #(2)

hystrix:
  command:
    default:
      execution:
        isolation:
          thread:
            timeoutInMilliseconds: 10000 #(3)

I could imagine a few other scenarios. But the idea was just a show differences in circuit breaker and fallback when modifying configuration properties for Feign, Ribbon and Hystrix in application.yml.

Hystrix

Let’s take a closer look on standard Hystrix circuit breaker and ¬†usage described in Scenario 4. To enable Hystrix in your Spring Boot application you have to following dependencies to pom.xml. Second step is to add annotation @EnableCircuitBreaker to main application class and also @EnableHystrixDashboard if you would like to have UI dashboard available.

<dependency>
	<groupId>org.springframework.cloud</groupId>
	<artifactId>spring-cloud-starter-hystrix</artifactId>
</dependency>
<dependency>
	<groupId>org.springframework.cloud</groupId>
	<artifactId>spring-cloud-starter-hystrix-dashboard</artifactId>
</dependency>

Hystrix fallback is set on Feign client inside customer service.

@FeignClient(value = "account-service", fallback = AccountFallback.class)
public interface AccountClient {

    @RequestMapping(method = RequestMethod.GET, value = "/accounts/customer/{customerId}")
    List<Account> getAccounts(@PathVariable("customerId") Integer customerId);

}

Fallback implementation is really simple. In this case I just return empty list instead of customer’s account list received from account service.

@Component
public class AccountFallback implements AccountClient {

	@Override
	public List<Account> getAccounts(Integer customerId) {
		List<Account> acc = new ArrayList<Account>();
		return acc;
	}

}

Now, we can perform some tests. Let’s start discovery service, two instances of account service on different ports (-DPORT VM argument during startup) and customer service. Endpoint for tests is /customers/{id}. There is also JUnit test class which sends multiple requests to this enpoint available in customer-service module¬†pl.piomin.microservices.customer.ApiTest.

	@RequestMapping("/customers/{id}")
	public Customer findById(@PathVariable("id") Integer id) {
		logger.info(String.format("Customer.findById(%s)", id));
		Customer customer = customers.stream().filter(it -> it.getId().intValue()==id.intValue()).findFirst().get();
		List<Account> accounts =  accountClient.getAccounts(id);
		customer.setAccounts(accounts);
		return customer;
	}

I enabled Hystrix Dashboard on account-service main class. If you would like to access it call from your web browser http://localhost:2222/hystrix address and then type Hystrix’s stream address from customer-service http://localhost:3333/hystrix.stream. When I run test that sends 1000 requests to customer service¬†about 20 (2%) of them were forwarder to delayed instance of account service, remaining to not delayed instance. Hystrix dashboard during that test is visible below. For more advanced Hystrix configuration refer to its documentation available¬†here.

hystrix-1

In memory data grid with Hazelcast

In my previous article JPA caching with Hazelcast, Hibernate and Spring Boot I described an example illustrating Hazelcast usage as a solution for Hibernate 2nd level cache. One big disadvantage of that example was an ability by caching entities only by primary key. Some help was the opportunity to cache JPA queries by some other indices. But that did not solve the problem completely, because query could use already cached entities even if they matched the criteria. In that article I’m going to show you smart solution of that problem based on Hazelcast distributed queries.

hz1

Spring Boot has an build-in auto configuration for Hazelcast if such a library is available under application classpath and @Bean Config is declared.

	@Bean
	Config config() {
		Config c = new Config();
		c.setInstanceName("cache-1");
		c.getGroupConfig().setName("dev").setPassword("dev-pass");
		ManagementCenterConfig mcc = new ManagementCenterConfig().setUrl("http://192.168.99.100:38080/mancenter").setEnabled(true);
		c.setManagementCenterConfig(mcc);
		SerializerConfig sc = new SerializerConfig().setTypeClass(Employee.class).setClass(EmployeeSerializer.class);
		c.getSerializationConfig().addSerializerConfig(sc);
		return c;
	}

In the code fragment above we declared cluster name and password credentials, connection parameters to Hazelcast Management Center and entity serialization configuration. Entity is pretty simple – it has @Id and two fields for searching personId and company.

@Entity
public class Employee implements Serializable {

	private static final long serialVersionUID = 3214253910554454648L;

	@Id
	@GeneratedValue
	private Integer id;
	private Integer personId;
	private String company;

	public Integer getId() {
		return id;
	}

	public void setId(Integer id) {
		this.id = id;
	}

	public Integer getPersonId() {
		return personId;
	}

	public void setPersonId(Integer personId) {
		this.personId = personId;
	}

	public String getCompany() {
		return company;
	}

	public void setCompany(String company) {
		this.company = company;
	}

}

Every entity needs to have serializer declared if it is to be inserted and selected from cache. There are same default serializers available inside Hazelcast library, but I implemented the custom one for our sample. It is based on StreamSerializer and ObjectDataInput.

public class EmployeeSerializer implements StreamSerializer<Employee> {

	@Override
	public int getTypeId() {
		return 1;
	}

	@Override
	public void write(ObjectDataOutput out, Employee employee) throws IOException {
		out.writeInt(employee.getId());
		out.writeInt(employee.getPersonId());
		out.writeUTF(employee.getCompany());
	}

	@Override
	public Employee read(ObjectDataInput in) throws IOException {
		Employee e = new Employee();
		e.setId(in.readInt());
		e.setPersonId(in.readInt());
		e.setCompany(in.readUTF());
		return e;
	}

	@Override
	public void destroy() {
	}

}

There is also DAO interface for interacting with database. It has two searching methods and extends Spring Data CrudRepository.

public interface EmployeeRepository extends CrudRepository<Employee, Integer> {

	public Employee findByPersonId(Integer personId);
	public List<Employee> findByCompany(String company);

}

In this sample Hazelcast instance is embedded into the¬†application. When starting Spring Boot application we have to provide VM argument -DPORT which is used for exposing service REST API. Hazelcast automatically detect other running member instances and its port will be incremented out of the box. Here’s REST @Controller class with exposed API.

@RestController
public class EmployeeController {

	private Logger logger = Logger.getLogger(EmployeeController.class.getName());

	@Autowired
	EmployeeService service;

	@GetMapping("/employees/person/{id}")
	public Employee findByPersonId(@PathVariable("id") Integer personId) {
		logger.info(String.format("findByPersonId(%d)", personId));
		return service.findByPersonId(personId);
	}

	@GetMapping("/employees/company/{company}")
	public List<Employee> findByCompany(@PathVariable("company") String company) {
		logger.info(String.format("findByCompany(%s)", company));
		return service.findByCompany(company);
	}

	@GetMapping("/employees/{id}")
	public Employee findById(@PathVariable("id") Integer id) {
		logger.info(String.format("findById(%d)", id));
		return service.findById(id);
	}

	@PostMapping("/employees")
	public Employee add(@RequestBody Employee emp) {
		logger.info(String.format("add(%s)", emp));
		return service.add(emp);
	}

}

@Service is injected into the EmployeeController. Inside EmployeeService there is an simple implementation of switching between Hazelcast cache instance and Spring Data DAO @Repository. In every find method we are trying to find data in the cache and in case it’s not there we are searching it in database and then putting found entity into the cache.

@Service
public class EmployeeService {

	private Logger logger = Logger.getLogger(EmployeeService.class.getName());

	@Autowired
	EmployeeRepository repository;
	@Autowired
	HazelcastInstance instance;

	IMap<Integer, Employee> map;

	@PostConstruct
	public void init() {
		map = instance.getMap("employee");
		map.addIndex("company", true);
		logger.info("Employees cache: " + map.size());
	}

	@SuppressWarnings("rawtypes")
	public Employee findByPersonId(Integer personId) {
		Predicate predicate = Predicates.equal("personId", personId);
		logger.info("Employee cache find");
		Collection<Employee> ps = map.values(predicate);
		logger.info("Employee cached: " + ps);
		Optional<Employee> e = ps.stream().findFirst();
		if (e.isPresent())
			return e.get();
		logger.info("Employee cache find");
		Employee emp = repository.findByPersonId(personId);
		logger.info("Employee: " + emp);
		map.put(emp.getId(), emp);
		return emp;
	}

	@SuppressWarnings("rawtypes")
	public List<Employee> findByCompany(String company) {
		Predicate predicate = Predicates.equal("company", company);
		logger.info("Employees cache find");
		Collection<Employee> ps = map.values(predicate);
		logger.info("Employees cache size: " + ps.size());
		if (ps.size() > 0) {
			return ps.stream().collect(Collectors.toList());
		}
		logger.info("Employees find");
		List<Employee> e = repository.findByCompany(company);
		logger.info("Employees size: " + e.size());
		e.parallelStream().forEach(it -> {
			map.putIfAbsent(it.getId(), it);
		});
		return e;
	}

	public Employee findById(Integer id) {
		Employee e = map.get(id);
		if (e != null)
			return e;
		e = repository.findOne(id);
		map.put(id, e);
		return e;
	}

	public Employee add(Employee e) {
		e = repository.save(e);
		map.put(e.getId(), e);
		return e;
	}

}

If you are interested in running sample application you can clone my repository on GitHub. In person-service module there is an example for my previous article about Hibernate 2nd cache with Hazelcast, in employee-module there is an example for that article.

Testing

Let’s start three instances of employee service on different ports using VM argument -DPORT. In the first figure visible in the beginning of article these ports are 2222, 3333 and 4444. When starting last third service’s instance you should see the fragment visible below in the application logs. It means that Hazelcast cluster of three members has been set up.

2017-05-09 23:01:48.127  INFO 16432 --- [ration.thread-0] c.h.internal.cluster.ClusterService      : [192.168.1.101]:5703 [dev] [3.7.7] 

Members [3] {
	Member [192.168.1.101]:5701 - 7a8dbf3d-a488-4813-a312-569f0b9dc2ca
	Member [192.168.1.101]:5702 - 494fd1ac-341b-451c-b585-1ad58a280fac
	Member [192.168.1.101]:5703 - 9750bd3c-9cf8-48b8-a01f-b14c915937c3 this
}

Here is picture from Hazelcast Management Center for two running members (only two members are available in the freeware version of Hazelcast Management Center).

hz-1.png

Then run docker containers with MySQL and Hazelcast Management Center.

docker run -d --name mysql -p 33306:3306 mysql
docker run -d --name hazelcast-mgmt -p 38080:8080 hazelcast/management-center:latest

Now, you could try to call endpoint http://localhost:/employees/company/{company} on all of your services. You should see that data is cached in the cluster and even if you call endpoint on different service it find entities put into the cache by different service. After several attempts my service instances put about 100k entities into the cache. Distribution between two Hazelcast members is 50% to 50%.

hz-2

Final Words

Probably we could implement smarter solution for the problem described in that article, but I just wanted to show you the idea. I tried to use Spring Data Hazelcast for that, but I’ve got a problem to run it on¬†Spring Boot application. It has HazelcastRepository interface, which something similar to Spring Data CrudRepository but basing on cached entities in Hazelcast grid and also uses Spring Data KeyValue module. The project is not well document and like I said before it didn’t worked with Spring Boot so I decided to implement my simple solution ūüôā

In my local¬†environment, visualized in the beginning of the article, queries on cache were about 10 times faster than similar queries on database. I inserted 2M records into the employee table. Hazelcast data grid could not only be a 2nd level cache but even a middleware between your application and database. If your priority is a performance¬†of queries on large amounts of data and you have a lot of RAM im memory data grid is right solution for you ūüôā

JPA caching with Hazelcast, Hibernate and Spring Boot

Preface

In-Memory Data Grid is an in-memory distributed key-value store that enables caching data using distributed clusters. Do not confuse this solution with in-memory or nosql database. In most cases it is used for performance reasons – all data is stored in RAM not in the disk like in traditional databases. For the first time I had a touch with in-memory data grid while we considering moving to Oracle Coherence in one of organizations I had been working before. The solution really made me curious. Oracle Coherence is obviously a paid solution, but there are also some open source solutions among which the most interesting seem to be Apache Ignite and Hazelcast. Today I’m going to show you how to use Hazelcast for caching data stored in MySQL database accessed by Spring Data DAO objects. Here’s the figure illustrating architecture of presented solution.

hazelcast-1

Implementation

  • Starting Docker containers

We use three Docker containers. First with MySQL database, second with Hazelcast instance and third for Hazelcast Management Center РUI dashboard for monitoring Hazelcast cluster instances.

docker run -d --name mysql -p 33306:3306 mysql
docker run -d --name hazelcast -p 5701:5701 hazelcast/hazelcast
docker run -d --name hazelcast-mgmt -p 38080:8080 hazelcast/management-center:latest

If we would like to connect with Hazelcast Management Center from Hazelcast instance we need to place custom hazelcast.xml in /opt/hazelcast catalog inside Docker container. This can be done in two ways, by extending hazelcast base image or just by copying file to existing hazelcast container and restarting it.

docker run -d --name hazelcast -p 5701:5701 hazelcast/hazelcast
docker stop hazelcast
docker start hazelcast

Here’s the most important Hazelcast’s configuration file fragment.

<hazelcast xmlns="http://www.hazelcast.com/schema/config" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://www.hazelcast.com/schema/config http://www.hazelcast.com/schema/config/hazelcast-config-3.8.xsd">
     <group>
          <name>dev</name>
          <password>dev-pass</password>
     </group>
     <management-center enabled="true" update-interval="3">http://192.168.99.100:38080/mancenter</management-center>
...
</hazelcast>

Hazelcast Dashboard is available under http://192.168.99.100:38080/mancenter address. We can monitor there all running cluster members, maps and some other parameters.

hazelcast-mgmt-1

  • Maven configuration

Project is based on Spring Boot 1.5.3.RELEASE. We also need to add Spring Web and MySQL Java connector dependencies. Here’s root project pom.xml.


	<parent>
		<groupId>org.springframework.boot</groupId>
		<artifactId>spring-boot-starter-parent</artifactId>
		<version>1.5.3.RELEASE</version>
	</parent>
	...
	<dependencies>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-web</artifactId>
		</dependency>
		<dependency>
			<groupId>mysql</groupId>
			<artifactId>mysql-connector-java</artifactId>
			<scope>runtime</scope>
		</dependency>
	...
	</dependencies>

Inside person-service module we declared some other dependencies to Hazelcast artifacts and Spring Data JPA. I had to override managed hibernate-core version for Spring Boot 1.5.3.RELEASE, because Hazelcast didn’t worked properly with 5.0.12.Final. Hazelcast needs hibernate-core in 5.0.9.Final version. Otherwise, an exception occurs when starting application.

	<dependencies>
		<dependency>
			<groupId>org.springframework.boot</groupId>
			<artifactId>spring-boot-starter-data-jpa</artifactId>
		</dependency>
		<dependency>
			<groupId>com.hazelcast</groupId>
			<artifactId>hazelcast</artifactId>
		</dependency>
		<dependency>
			<groupId>com.hazelcast</groupId>
			<artifactId>hazelcast-client</artifactId>
		</dependency>
		<dependency>
			<groupId>com.hazelcast</groupId>
			<artifactId>hazelcast-hibernate5</artifactId>
		</dependency>
		<dependency>
			<groupId>org.hibernate</groupId>
			<artifactId>hibernate-core</artifactId>
			<version>5.0.9.Final</version>
		</dependency>
	</dependencies>
  • Hibernate Cache configuration

Probably you can configure it in several different ways, but for me the most suitable solution was inside application.yml. Here’s YAML configurarion file fragment. I enabled L2 Hibernate cache, set Hazelcast native client address, credentials and cache factory class HazelcastCacheRegionFactory. We can also set HazelcastLocalCacheRegionFactory. The differences between them are in performance – local factory is faster since its operations are handled as distributed calls. While if you use HazelcastCacheRegionFactory, you can see your maps on Management Center.

spring:
  application:
    name: person-service
  datasource:
    url: jdbc:mysql://192.168.99.100:33306/datagrid?useSSL=false
    username: datagrid
    password: datagrid
  jpa:
    properties:
      hibernate:
        show_sql: true
        cache:
          use_query_cache: true
          use_second_level_cache: true
          hazelcast:
            use_native_client: true
            native_client_address: 192.168.99.100:5701
            native_client_group: dev
            native_client_password: dev-pass
          region:
            factory_class: com.hazelcast.hibernate.HazelcastCacheRegionFactory
  • Application code

First, we need to enable caching for Person @Entity.

@Cache(usage = CacheConcurrencyStrategy.READ_WRITE)
@Entity
public class Person implements Serializable {

	private static final long serialVersionUID = 3214253910554454648L;

	@Id
	@GeneratedValue
	private Integer id;
	private String firstName;
	private String lastName;
	private String pesel;
	private int age;

	public Integer getId() {
		return id;
	}

	public void setId(Integer id) {
		this.id = id;
	}

	public String getFirstName() {
		return firstName;
	}

	public void setFirstName(String firstName) {
		this.firstName = firstName;
	}

	public String getLastName() {
		return lastName;
	}

	public void setLastName(String lastName) {
		this.lastName = lastName;
	}

	public String getPesel() {
		return pesel;
	}

	public void setPesel(String pesel) {
		this.pesel = pesel;
	}

	public int getAge() {
		return age;
	}

	public void setAge(int age) {
		this.age = age;
	}

	@Override
	public String toString() {
		return "Person [id=" + id + ", firstName=" + firstName + ", lastName=" + lastName + ", pesel=" + pesel + "]";
	}

}

DAO is implemented using Spring Data CrudRepository. Sample application source code is available on GitHub.

public interface PersonRepository extends CrudRepository<Person, Integer> {
	public List<Person> findByPesel(String pesel);
}

Testing

Let’s insert¬†a¬†little more data to the table. You can use my¬†AddPersonRepositoryTest for that. It will insert 1M rows into the person table. Finally, we can call enpoint http://localhost:2222/persons/{id} twice with the same id.¬†For me, it looks like below: 22ms for first call, 3ms for next call which is read from L2 cache. Entity¬†can be cached¬†only by primary key. If you call http://localhost:2222/persons/pesel/{pesel} entity will always be searched bypassing the L2 cache.

2017-05-05 17:07:27.360 DEBUG 9164 --- [nio-2222-exec-9] org.hibernate.SQL                        : select person0_.id as id1_0_0_, person0_.age as age2_0_0_, person0_.first_name as first_na3_0_0_, person0_.last_name as last_nam4_0_0_, person0_.pesel as pesel5_0_0_ from person person0_ where person0_.id=?
Hibernate: select person0_.id as id1_0_0_, person0_.age as age2_0_0_, person0_.first_name as first_na3_0_0_, person0_.last_name as last_nam4_0_0_, person0_.pesel as pesel5_0_0_ from person person0_ where person0_.id=?
2017-05-05 17:07:27.362 DEBUG 9164 --- [nio-2222-exec-9] o.h.l.p.e.p.i.ResultSetProcessorImpl     : Starting ResultSet row #0
2017-05-05 17:07:27.362 DEBUG 9164 --- [nio-2222-exec-9] l.p.e.p.i.EntityReferenceInitializerImpl : On call to EntityIdentifierReaderImpl#resolve, EntityKey was already known; should only happen on root returns with an optional identifier specified
2017-05-05 17:07:27.363 DEBUG 9164 --- [nio-2222-exec-9] o.h.engine.internal.TwoPhaseLoad         : Resolving associations for [pl.piomin.services.datagrid.person.model.Person#444]
2017-05-05 17:07:27.364 DEBUG 9164 --- [nio-2222-exec-9] o.h.engine.internal.TwoPhaseLoad         : Adding entity to second-level cache: [pl.piomin.services.datagrid.person.model.Person#444]
2017-05-05 17:07:27.373 DEBUG 9164 --- [nio-2222-exec-9] o.h.engine.internal.TwoPhaseLoad         : Done materializing entity [pl.piomin.services.datagrid.person.model.Person#444]
2017-05-05 17:07:27.373 DEBUG 9164 --- [nio-2222-exec-9] o.h.r.j.i.ResourceRegistryStandardImpl   : HHH000387: ResultSet's statement was not registered
2017-05-05 17:07:27.374 DEBUG 9164 --- [nio-2222-exec-9] .l.e.p.AbstractLoadPlanBasedEntityLoader : Done entity load : pl.piomin.services.datagrid.person.model.Person#444
2017-05-05 17:07:27.374 DEBUG 9164 --- [nio-2222-exec-9] o.h.e.t.internal.TransactionImpl         : committing
2017-05-05 17:07:30.168 DEBUG 9164 --- [nio-2222-exec-6] o.h.e.t.internal.TransactionImpl         : begin
2017-05-05 17:07:30.171 DEBUG 9164 --- [nio-2222-exec-6] o.h.e.t.internal.TransactionImpl         : committing

Query Cache

We can enable JPA query caching by marking repository method with @Cacheable annotation and adding @EnableCaching to main class definition.

public interface PersonRepository extends CrudRepository<Person, Integer> {

	@Cacheable("findByPesel")
	public List<Person> findByPesel(String pesel);

}

In addition to the @EnableCaching annotation we should declare HazelcastIntance and CacheManager beans. As a cache manager HazelcastCacheManager from hazelcast-spring library is used.

@SpringBootApplication
@EnableCaching
public class PersonApplication {

	public static void main(String[] args) {
		SpringApplication.run(PersonApplication.class, args);
	}

	@Bean
	HazelcastInstance hazelcastInstance() {
		ClientConfig config = new ClientConfig();
		config.getGroupConfig().setName("dev").setPassword("dev-pass");
		config.getNetworkConfig().addAddress("192.168.99.100");
		config.setInstanceName("cache-1");
		HazelcastInstance instance = HazelcastClient.newHazelcastClient(config);
		return instance;
	}

	@Bean
	CacheManager cacheManager() {
		return new HazelcastCacheManager(hazelcastInstance());
	}

}

Now, we should try find person by PESEL number by calling endpoint http://localhost:2222/persons/pesel/{pesel}. Cached query is stored as a map as you see in the picture below.

hazelcast-3

Clustering

Before final words let me say a little about clustering, what is the key functionality of Hazelcast in memory data grid. In the previous chapters we based on single Hazelcast instance. Let’s begin from running second container with Hazelcast exposed on different port.

docker run -d --name hazelcast2 -p 5702:5701 hazelcast/hazelcast

Now we should perform one change in hazelcast.xml configuration file. Because data grid is ran inside docker container the public address has to be set. For the first container it is 192.168.99.100:5701, and for second 192.168.99.100:5702, because it is exposed on 5702 port.

     <network>
        ...
	<public-address>192.168.99.100:5701</public-address>
        ...
     </network>

When starting person-service application you should see in the logs similar to visible below – connection with two cluster members.

Members [2] {
Member [192.168.99.100]:5702 - 04f790bc-6c2d-4c21-ba8f-7761a4a7422c
Member [192.168.99.100]:5701 - 2ca6e30d-a8a7-46f7-b1fa-37921aaa0e6b
}

All Hazelcast running instances are visible in Management Center.

hazelcast-2

Conclusion

Caching and clustering with Hazelcast are simple and fast. We can cache JPA entities and queries. Monitoring is realized via Hazelcast Management Center dashboard. One problem for me is that I’m able to cache entities only by primary key. If I would like to find entity by other index like PESEL number I had to cache findByPesel query. Even if entity was cached before by id query will not find it in the cache but perform SQL on database. Only next query call is cached. I’ll show you smart solution for that problem in my next article about that subject In memory data grid with Hazelcast.