Hazelcast Hot Cache with Striim

I previously introduced some articles about Hazelcast – an open source in memory data grid solution. In the first of them JPA caching with Hazelcast, Hibernate and Spring Boot I described how to set up 2nd level JPA cache with Hazelcast and MySQL. In the second In memory data grid with Hazelcast I showed more advanced sample how to use Hazelcast distributed queries to enable faster data access for Spring Boot application. Using Hazelcast as a cache level between your application and relational database is generally a very good solution under one condition – all changes are going across your application. If a data source is modified by other application which does not use your caching solution it causes problem with outdated data for your application. Did you have encountered this problem in your organization? In my organization we still use relational databases in almost all our systems. Sometimes it causes performance problems, even optimized queries are too slow for real time applications. Relational database is still required, so solutions like Hazelcast can help us.

Let’s return to the topic of outdated cache. That’s why we need Striim, a real-time data integration and streaming analytics software platform. The architecture of presented solution is visible on the figure below. We have two applications. The first one employee-service uses Hazelcast as a cache, the second one employee-app performs changes directly to the database. Without such a solution like Striim data changed by employee-app is not visible for employee-service. Striim enables real-time data integration without modifying or slowing down data source. It uses CDC (Change Data Capture) mechanisms for detecting changes performed on data source, by analizing binary logs. It has a support for the most popular transactional databases like Oracle, Microsoft SQL Server and MySQL. Striim has many interesting features, but also one serious drawback – it is not an open source. An alternative for the presented solution, especially when using Oracle database, can be Oracle In-Memory Data Grid with Golden Gate Hot Cache.

striim-figure-1

I prepared sample application for that article purpose, which is as usual available on GitHub under striim branch. The application employee-service is based on Spring Boot and has embedded Hazelcast client which connects to the cluster and Hazelcast Management Center. If data is not available in the cache the application connects to MySQL database.

1. Starting MySQL and enabling binary log

Let’s start MySQL database using docker.

docker run -d --name mysql -e MYSQL_DATABASE=hz -e MYSQL_USER=hz -e MYSQL_PASSWORD=hz123 -e MYSQL_ALLOW_EMPTY_PASSWORD=yes -p 33306:3306 mysql

Binary log is disabled by default. We have to enable it by including the following lines into mysqld.cnf. The configuration file is available on docker container under /etc/mysql/mysql.conf.d/mysqld.cnf.

log_bin			 = /var/log/mysql/binary.log
expire-logs-days = 14
max-binlog-size  = 500M
server-id        = 1

If you are running MySQL on Docker you should restart your container using docker restart mysql.

2. Starting Hazelcast Dashboard and Striim

Same as for MySQL, I also used Docker.

docker run -d --name striim -p 39080:9080 striim/evalversion
docker run -d --name hazelcast-mgmt -p 38080:8080 hazelcast/management-center:3.7.7

I selected 3.7.7 version of Hazelcast Management Center, because this version is included by default into the Spring Boot release I used in the sample application. Now, you should be able to login into Hazelcast Dashboard available under http://192.168.99.100:38080/mancenter/ and to the Striim Dashboard which is available under http://192.168.99.100:39080/ (admin/admin).

3. Starting sample application

Build sample application with mvn clean install and start using java -jar employee-service-1.0-SNAPSHOT.jar. You can test it by calling one of endpoint:
/employees/person/{id}
/employees/company/{company}
/employees/{id}

Before testing create table employee in MySQL and insert some test data (you can run my test class pl.piomin.services.datagrid.employee.data.AddEmployeeRepositoryTest).

4. Configure entity mapping in Striim

Before creating our first application in Striim we have to provide mapping configuration. The first step is to copy your entity ORM mapping file into docker container filesystem. You can perform it using Striim dashboard or with docker cp command. Here’s my orm.xml file – it is used by Striim HazelcastWriter while putting data into cache.

<entity-mappings xmlns="http://www.eclipse.org/eclipselink/xsds/persistence/orm" 	version="2.4">
	<entity name="employee" class="pl.piomin.services.datagrid.employee.model.Employee">
<table name="hz.employee" />
		<attributes>
			<id name="id" attribute-type="Integer">
				<column nullable="false" name="id" />
				<generated-value strategy="AUTO" />
			</id>
			<basic name="personId" attribute-type="Integer">
				<column nullable="false" name="person_id" />
			</basic>
			<basic name="company" attribute-type="String">
				<column name="company" />
			</basic>
		</attributes>
	</entity>
</entity-mappings>

We also have to provide jar with entity class. It should be placed under /opt/Striim/lib directory on Striim docker container. What is important, the fields are public – do not make them private with setters, because it won’t work for HazelcastWriter. After all changes restart your container and proceed to the next steps. For the sample application just build employee-model module and upload to Striim.

public class Employee implements Serializable {

	private static final long serialVersionUID = 3214253910554454648L;
	public Integer id;
	public Integer personId;
	public String company;

	public Employee() {

	}

	@Override
	public String toString() {
		return "Employee [id=" + id + ", personId=" + personId + ", company=" + company + "]";
	}

}

5. Configuring MySQL CDC connection on Striim

If all the previous steps are completed we can finally begin to create our application in Striim. When creating a new app select Start with Template, and then MySQL CDC to Hazelcast. Put your MySQL connection data, security credentials and proceed. In addition to connection validation Striim also checks if binary log is enabled.

Then select tables for synchronization with cache.

striim-3

6. Configuring Hazelcast on Striim

After starting employee-service application you should see the following fragment in the file logs.

Members [1] {
	Member [192.168.8.205]:5701 - d568a38a-7531-453a-b7f8-db2be4715132 this
}

This address should be provided as a Hazelcast Cluster URL. We should also put ORM mapping file location and cluster credentials (by default these are dev/dev-pass).

striim-5

In the next screen you will see ORM mapping visualization and input selection. Your input is MySQL server you defined in the fifth step.

striim-7

7. Deploy application on Striim

After finishing previous steps you see the flow diagram. I suggest you create log file where all input events will be stored as a JSON. My diagram is visible in the figure below. If your configuration is finished deploy and start application.  At this stage I had some problems. For example, if I deploy application after Striim restart I always have to change something and save, otherwise exception during deploy occurs. However, after a long struggle with Striim, I finally succeeded in running the application! So we can start testing.

striim-8

8. Checking out

I created JUnit test to illustrate cache refresh performed by Striim. Inside this test I invoke employees/company/{company} REST API method and collect entities. Then I modified entities with EmployeeRepository which commits changes directly to the database bypassing Hazelcast cache. I invoke REST API again and compare results with entities collected with previous invoke. Field personId should not be equal with value for previously invoked entity. You also can test it manually by calling REST API endpoint and change something in the database using the client like MySQL Workbench.

@SpringBootTest(webEnvironment = WebEnvironment.DEFINED_PORT)
@RunWith(SpringRunner.class)
public class CacheRefreshEmployeeTest {

	protected Logger logger = Logger.getLogger(CacheRefreshEmployeeTest.class.getName());

	@Autowired
	EmployeeRepository repository;

	TestRestTemplate template = new TestRestTemplate();

	@Test
	public void testModifyAndRefresh() {
		Employee[] e = template.getForObject("http://localhost:3333/employees/company/{company}", Employee[].class, "Test001");
		for (int i = 0; i < e.length; i++) {
			Employee eMod = repository.findOne(e[i].getId());
			eMod.setPersonId(eMod.getPersonId()+1);
			repository.save(eMod);
		}

		Employee[] e2 = template.getForObject("http://localhost:3333/employees/company/{company}", Employee[].class, "Test001");
		for (int i = 0; i < e2.length; i++) {
			Assert.assertNotEquals(e[i].getPersonId(), e2[i].getPersonId());
		}

	}
}

Here’s the picture with Striim dashboard monitor. We can check out how many events were processed, what is actual memory and CPU usage etc.

striim-1

Final Thoughts

I have no definite opinion about Striim. On the one hand it is an interesting solution with many integration possibilities and a nice dashboard for configuration and monitoring. But on the other hand it is not free from errors and bugs. My application crashed when an exception was thrown for the lack of a matching serializer for the entity in Hazelcast’s cache. This stopped processing any further events. It may be a deliberate action, but in my opinion subsequent events should be processed as they may affect other tables. The application management with web dashboard is not very comfortable at all. Every time I restarted the container, I had to change something in the configuration, because the application threw not intuitive exception on startup. From this type of application I would expect first of all reliability if the application would require updating of the data on the Hazelcast. However, despite some drawbacks, it is worth a closer look at Striim.

Advertisements

Author: Piotr Mińkowski

IT Architect, Java Software Developer

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s