1 Reply Latest reply on Nov 18, 2011 5:39 AM by sannegrinovero

Hibernate search - Infinispan - Jgroups optimization on Amazon clustering environment

dungleonhart Nov 18, 2011 4:17 AM

Hi Infinispan team,

I'm using Hibernate Search with Infinispan as the directory provider and JGroups as the synchronization backend for clustering on Amazon EC2.

My project is already live, and I would appreciate your advice on performance tuning.

Here are my configurations:

1. Spring bean

<props>
    <prop key="hibernate.dialect">org.hibernate.dialect.MySQLDialect</prop>
    <prop key="hibernate.search.default.directory_provider">infinispan</prop>
    <prop key="hibernate.search.infinispan.configuration_resourcename">hibernate-search-infinispan.xml</prop>
</props>
</property>
</bean>

2. hibernate-search-infinispan.xml

<?xml version="1.0" encoding="UTF-8"?>
<infinispan xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:infinispan:config:5.0 http://www.infinispan.org/schemas/infinispan-config-5.0.xsd"
    xmlns="urn:infinispan:config:5.0">
<!-- Duplicate domains are allowed so that multiple deployments with default
     configuration of Hibernate Search applications work - if possible it would
     be better to use JNDI to share the CacheManager across applications -->
<globalJmxStatistics enabled="false"
    cacheManagerName="HibernateSearch" allowDuplicateDomains="true" />
<transport clusterName="infinispan-hibernate-search-cluster"
    distributedSyncTimeout="60000"
    transportClass="org.infinispan.remoting.transport.jgroups.JGroupsTransport">
</properties>
</transport>
<!-- Used to register JVM shutdown hooks. hookBehavior: DEFAULT, REGISTER,
     DONT_REGISTER. Hibernate Search takes care to stop the CacheManager so registering
     is not needed -->
</global>
<locking lockAcquisitionTimeout="60000" writeSkewCheck="false"
    concurrencyLevel="500" useLockStriping="false" />
<!-- This element specifies that the cache is clustered. modes supported:
     distribution (d), replication (r) or invalidation (i). Don't use invalidation
     to store Lucene indexes (as with Hibernate Search DirectoryProvider). Replication
     is recommended for best performance of Lucene indexes, but make sure you
     have enough memory to store the index in your heap. Also distribution scales
     much better than replication on high number of nodes in the cluster. -->
<stateRetrieval timeout="60000" logFlushTimeout="60000"
    fetchInMemoryState="true" alwaysProvideInMemoryState="true" />
</clustering>
</default>
<!-- While default configuration happens to be fine with similar settings
     across the -->
<stateRetrieval fetchInMemoryState="true"
    logFlushTimeout="60000" />
</clustering>
</namedCache>
<stateRetrieval fetchInMemoryState="true"
    logFlushTimeout="60000" />
</clustering>
</namedCache>
<stateRetrieval fetchInMemoryState="true"
    logFlushTimeout="60000" />
</clustering>
</namedCache>
</infinispan>

3. jdbc_ping.xml

<config xmlns="urn:org:jgroups" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:org:jgroups JGroups-2.12.xsd">
    <TCP bind_port="${jgroups.tcp.port:7800}"
        loopback="true" port_range="30" recv_buf_size="20000000"
        send_buf_size="640000" discard_incompatible_packets="true"
        max_bundle_size="64000" max_bundle_timeout="30" enable_bundling="true"
        use_send_queues="true" sock_conn_timeout="300" enable_diagnostics="false"
        thread_pool.enabled="true" thread_pool.min_threads="2"
        thread_pool.max_threads="30" thread_pool.keep_alive_time="5000"
        thread_pool.queue_enabled="false" thread_pool.queue_max_size="100"
        thread_pool.rejection_policy="Discard" oob_thread_pool.enabled="true"
        oob_thread_pool.min_threads="2" oob_thread_pool.max_threads="30"
        oob_thread_pool.keep_alive_time="5000" oob_thread_pool.queue_enabled="false"
        oob_thread_pool.queue_max_size="100" oob_thread_pool.rejection_policy="Discard" />
    <JDBC_PING connection_driver="com.mysql.jdbc.Driver"
        connection_username="root" connection_password="root"
        connection_url="jdbc:mysql://localhost/clientdb2" level="debug" />
    <FD_SOCK />
    <VERIFY_SUSPECT timeout="1500" />
    <pbcast.NAKACK use_mcast_xmit="false" gc_lag="0"
        retransmit_timeout="300,600,1200,2400,4800" discard_delivered_msgs="false" />
    <pbcast.STABLE stability_delay="1000" desired_avg_gossip="50000"
        max_bytes="400000" />
    <pbcast.GMS print_local_addr="false" join_timeout="7000"
        view_bundling="true" />
    <pbcast.STREAMING_STATE_TRANSFER bind_port="7850"/>
</config>

---------------------

Several things concern me about these configurations:

1. When I modify the search data, changes are updated almost immediately to other nodes in the cluster. Should I add some latency to this process and how can I do that?

2. I see these lines spamming the logs:

2011-11-18 16:08:19,852 [Timer-2,infinispan-hibernate-search-cluster,HK6HZP1-29948] DEBUG org.jgroups.protocols.JDBC_PING - Removed bf97191e-13f4-b3f9-0e45-41e902f5131f for clustername infinispan-hibernate-search-cluster from database.
2011-11-18 16:08:19,855 [Timer-2,infinispan-hibernate-search-cluster,HK6HZP1-29948] DEBUG org.jgroups.protocols.JDBC_PING - Registered bf97191e-13f4-b3f9-0e45-41e902f5131f for clustername infinispan-hibernate-search-cluster into database.
2011-11-18 16:08:21,440 [Timer-4,infinispan-hibernate-search-cluster,HK6HZP1-29948] DEBUG org.jgroups.protocols.JDBC_PING - Removed bf97191e-13f4-b3f9-0e45-41e902f5131f for clustername infinispan-hibernate-search-cluster from database.
2011-11-18 16:08:21,443 [Timer-4,infinispan-hibernate-search-cluster,HK6HZP1-29948] DEBUG org.jgroups.protocols.JDBC_PING - Registered bf97191e-13f4-b3f9-0e45-41e902f5131f for clustername infinispan-hibernate-search-cluster into database.
2011-11-18 16:08:41,698 [Timer-4,infinispan-hibernate-search-cluster,HK6HZP1-29948] DEBUG org.jgroups.protocols.JDBC_PING - Removed bf97191e-13f4-b3f9-0e45-41e902f5131f for clustername infinispan-hibernate-search-cluster from database.
2011-11-18 16:08:41,701 [Timer-4,infinispan-hibernate-search-cluster,HK6HZP1-29948] DEBUG org.jgroups.protocols.JDBC_PING - Registered bf97191e-13f4-b3f9-0e45-41e902f5131f for clustername infinispan-hibernate-search-cluster into database.
2011-11-18 16:08:50,585 [Timer-5,infinispan-hibernate-search-cluster,HK6HZP1-29948] DEBUG org.jgroups.protocols.JDBC_PING - Removed bf97191e-13f4-b3f9-0e45-41e902f5131f for clustername infinispan-hibernate-search-cluster from database.

It seems each node continuously pings the DB to register itself with the cluster. Should I increase the ping interval, and how can I do that?

3. I have real difficulty understanding these configurations. Could you point me to some documentation to learn more about them?

Thanks a lot and Best regards,

Dung Ngo.

1. Re: Hibernate search - Infinispan - Jgroups optimization on Amazon clustering environment

sannegrinovero Nov 18, 2011 5:39 AM (in response to dungleonhart)

1. When I modify the search data, changes are updated almost immediately to other nodes in the cluster. Should I add some latency to this process and how can I do that?
Which version of Hibernate Search are you using? You could enable the exclusive_index_use option, use an asynchronous backend, or ideally use the JMS/JGroups backend as described in the Hibernate Search documentation.
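
As a rough sketch of the first two options, the following entries could be added to the Spring <props> block from the original post. The property names assume Hibernate Search 3.4.x; the version in use has not been confirmed in this thread, and later versions scope the worker settings per index (for example under hibernate.search.default.worker.*), so check the reference documentation for your version before copying this.

```xml
<props>
    <prop key="hibernate.dialect">org.hibernate.dialect.MySQLDialect</prop>
    <prop key="hibernate.search.default.directory_provider">infinispan</prop>
    <prop key="hibernate.search.infinispan.configuration_resourcename">hibernate-search-infinispan.xml</prop>

    <!-- Assumption: Hibernate Search 3.4.x property names. -->
    <!-- Keeps the index writer open between transactions instead of
         reopening it on every commit. Likely only safe when a single node
         writes to the index, which is why a master/slave backend is the
         preferred setup. -->
    <prop key="hibernate.search.default.exclusive_index_use">true</prop>

    <!-- Applies index updates asynchronously, so the transaction does not
         wait for the index write to finish. -->
    <prop key="hibernate.search.worker.execution">async</prop>
</props>
```

Neither setting adds a fixed delay to replication; they only change when and how the index is written. Routing all index writes through one node with the JMS or JGroups backend needs additional master and slave configuration that is not shown here.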

JDBC_PING should not register and deregister itself that often; I'm not sure why you are seeing that. Do you also see Infinispan messages in the log about frequent view changes?
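
Separately from the question of frequency, the DEBUG lines themselves may come from the level="debug" attribute on JDBC_PING in the posted jdbc_ping.xml. A minimal sketch, assuming that attribute is what raises this protocol's logger to DEBUG (the logging framework configuration can also affect this), is to drop it and keep everything else as posted:

```xml
<!-- Same JDBC_PING element as in the original post, minus level="debug".
     Assumption: without the attribute the protocol logs at whatever level
     the logging framework configures for org.jgroups.protocols.JDBC_PING. -->
<JDBC_PING connection_driver="com.mysql.jdbc.Driver"
    connection_username="root" connection_password="root"
    connection_url="jdbc:mysql://localhost/clientdb2" />
```

This only reduces log noise. It does not change how often a node removes and re-registers itself in the database.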

I have real difficulty understanding these configurations. Could you point me to some documentation to learn more about them?
Infinispan, JGroups and Hibernate Search each maintain their own documentation; please look at each project's website.