7 Replies Latest reply on Jun 30, 2011 6:06 AM by sannegrinovero

InfinispanDirectory: how to use it?

yangju Jun 25, 2011 10:27 PM

I have read the wiki https://docs.jboss.org/author/display/ISPN/Infinispan+as+a+Directory+for+Lucene and read the demo code like the following in DemoActions:

public void addNewDocument(String line) throws IOException {
    // Open a writer on the index
    IndexWriter iw = new IndexWriter(index, analyzer, MaxFieldLength.UNLIMITED);
    try {
        // Wrap the line in a single-field document and add it to the index
        Document doc = new Document();
        Field field = new Field(MAIN_FIELD, line, Store.YES, Index.ANALYZED);
        doc.add(field);
        iw.addDocument(doc);
        iw.commit();
    } finally {
        // Always close the writer, even if indexing fails
        iw.close();
    }
}

It is not quite clear to me: if I use InfinispanDirectory for Lucene indexing, do I have to manually add each cache entry to the index using the IndexWriter, or will Infinispan automatically update the index when an entry is added to the Infinispan cache?

Also, should I have a configuration like this (distributed)?

</properties>

</indexing>

Or should I leave the "hibernate.search.default.directory_provider" as default?

Please clarify for me.

Thanks.

1. Re: InfinispanDirectory: how to use it?

yangju Jun 25, 2011 11:57 PM (in response to yangju)

Specifically, in my code I have this:
Directory indexDir = new InfinispanDirectory(indexInfinispanCache, "myIndex");
Then how do I use this indexDir? Do I actually have to use this object directly when I add entries to my cache? Note that the class to be cached is annotated for indexing, such as
@Indexed
@ProvidedId
etc.

It is not very clear from the Infinispan Query and Infinispan Lucene Directory documentation how this indexDir is used once it is created.
Also, what should be the value for "hibernate.search.default.directory_provider" if I use InfinispanDirectory as lucene indexing dir?
2. Re: InfinispanDirectory: how to use it?

sannegrinovero Jun 26, 2011 12:45 PM (in response to yangju)
Hi,
I see we are missing an overall overview of the Lucene integrations; let me try to clarify this.
Apache Lucene is a very powerful and popular project for embedding full-text search in your applications, but its API is quite complex, as you've seen from the example code using the IndexWriter. This isn't necessarily a bad thing: Lucene is very powerful, and the complexity comes with that. Most users need only some basic operations, though, so Lucene is usually not exposed directly to the application but wrapped in a small abstraction. Some like to write their own; others reuse existing helpers like Infinispan Query or Hibernate Search.

So there are two different integration problems that Infinispan helps with when it comes to Lucene:
How to update and query the index (basically, how to use Lucene): Infinispan Query
Where to store the index, so that it can be distributed and updates are "seen" by each node: Infinispan Lucene Directory

These two projects are independent: you could use the Infinispan Lucene Directory just to store the index, replacing the storage of any existing Lucene-consuming application. Since it implements the same API, it's a simple drop-in replacement.
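To make the "drop-in replacement" idea concrete, here is a toy model in plain Java. It is only a sketch of the design principle: `IndexStorage`, `LocalStorage` and `SharedMapStorage` are hypothetical names invented for illustration, not Lucene or Infinispan classes. The interface stands in for the role Lucene's Directory plays, and the shared map stands in for a cache.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Toy model only: none of these types are Lucene or Infinispan classes.
final class DropInStorageSketch {

    /** Stand-in for the role a Lucene Directory plays: named chunks of index data. */
    interface IndexStorage {
        void write(String name, String data);
        String read(String name);
    }

    /** Keeps everything in a private map, like a purely local store. */
    static final class LocalStorage implements IndexStorage {
        private final Map<String, String> files = new ConcurrentHashMap<>();
        public void write(String name, String data) { files.put(name, data); }
        public String read(String name) { return files.get(name); }
    }

    /** Writes into a map supplied by the caller, standing in for a shared cache. */
    static final class SharedMapStorage implements IndexStorage {
        private final Map<String, String> shared;
        private final String indexName;

        SharedMapStorage(Map<String, String> shared, String indexName) {
            this.shared = shared;
            this.indexName = indexName;
        }

        // Entries are namespaced by index name so several indexes can share one map.
        public void write(String name, String data) { shared.put(indexName + "/" + name, data); }
        public String read(String name) { return shared.get(indexName + "/" + name); }
    }

    /** Application code depends only on the interface, so either store works unchanged. */
    static String indexAndReadBack(IndexStorage storage, String line) {
        storage.write("segment_1", line);
        return storage.read("segment_1");
    }

    public static void main(String[] args) {
        Map<String, String> grid = new ConcurrentHashMap<>();
        System.out.println(indexAndReadBack(new LocalStorage(), "hello"));
        System.out.println(indexAndReadBack(new SharedMapStorage(grid, "myIndex"), "hello"));
        System.out.println(grid.keySet());
    }
}
```

The point of the sketch is that `indexAndReadBack` never changes when the storage is swapped, which is the same reason existing Lucene code can keep working when only its Directory implementation is replaced.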

Infinispan Query is a totally different module: it doesn't care where the index is stored; it's meant to make the index easier to use by abstracting away index updates and queries. So if you annotate your values with @Indexed and @ProvidedId, and enable <indexing enabled="true" .../>, the index will be updated transparently; there is no need to use the IndexWriter directly. To query your objects you can stick with the query methods defined in the Query module, or access an IndexSearcher directly (the "low level" Lucene API) to perform whatever (read) operation you need on the index, in case some functionality is not exposed by Query or you're a hard-core Lucene expert and prefer to code that way.
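As a rough illustration of what "transparently updated" means, the following plain-Java toy keeps a tiny inverted index in sync on every put, so the caller never handles a writer. This is not the Infinispan Query API: the class name, the whitespace tokenizer and the term-to-keys map are all assumptions made for the sketch, and the real module is driven by the annotations and the indexing configuration instead.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Toy model only: not the Infinispan Query API.
final class TransparentIndexingSketch {
    private final Map<String, String> entries = new HashMap<>();
    private final Map<String, Set<String>> termToKeys = new HashMap<>();

    /** Storing a value also updates the index; the caller never touches a writer. */
    void put(String key, String text) {
        String previous = entries.put(key, text);
        if (previous != null) {
            unindex(key, previous); // drop stale terms of the replaced value
        }
        for (String term : tokenize(text)) {
            termToKeys.computeIfAbsent(term, t -> new TreeSet<>()).add(key);
        }
    }

    /** Read side: the keys whose current text contains the given term. */
    Set<String> search(String term) {
        Set<String> keys = termToKeys.get(term.toLowerCase(Locale.ROOT));
        return keys == null ? Collections.<String>emptySet() : Collections.unmodifiableSet(keys);
    }

    private void unindex(String key, String text) {
        for (String term : tokenize(text)) {
            Set<String> keys = termToKeys.get(term);
            if (keys != null) {
                keys.remove(key);
                if (keys.isEmpty()) {
                    termToKeys.remove(term);
                }
            }
        }
    }

    // Naive analysis step: lower-case and split on whitespace.
    private static String[] tokenize(String text) {
        return text.toLowerCase(Locale.ROOT).trim().split("\\s+");
    }

    public static void main(String[] args) {
        TransparentIndexingSketch cache = new TransparentIndexingSketch();
        cache.put("k1", "Infinispan stores the index");
        cache.put("k2", "Lucene builds the index");
        cache.put("k1", "Infinispan distributes data"); // replaces k1, index follows
        System.out.println(cache.search("index"));
        System.out.println(cache.search("infinispan"));
    }
}
```

Note that replacing a value also removes its old terms; an index that is not kept in step with the cache is one way a search can return more hits than there are entries.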

It is of course possible, and desirable, to use them together: Infinispan Query needs to store its index somewhere, and this can be your RAM, your local disk (the default), or the Infinispan Lucene Directory.
The configuration you've shown is correct; you need to set the directory provider to "infinispan" if you want to use it:
http://docs.jboss.org/hibernate/stable/search/reference/en-US/html_single/#d0e1752
You likely want to set the configuration property configuration_resourcename too, to specify a custom configuration for the Infinispan instance that is booted to store the index.
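For reference, the two settings discussed here can be collected like this. It is a minimal sketch using only java.util.Properties: the two property names are the ones that appear in this thread, while the resource name "my-index-cache.xml" and the class and method names are hypothetical. How the properties reach Hibernate Search depends on your setup (persistence.xml, or the <indexing> properties of the cache configuration), so treat this as a way to see the key/value pairs, not as the required wiring.

```java
import java.util.Properties;

final class DirectoryProviderSettings {

    /** Builds the settings that select the Infinispan-backed index storage. */
    static Properties infinispanIndexSettings(String configResource) {
        Properties settings = new Properties();
        // Store the index in Infinispan rather than in RAM or on the local disk.
        settings.setProperty("hibernate.search.default.directory_provider", "infinispan");
        // Custom configuration for the Infinispan instance holding the index.
        settings.setProperty("hibernate.search.infinispan.configuration_resourcename", configResource);
        return settings;
    }

    public static void main(String[] args) {
        // "my-index-cache.xml" is a made-up resource name for illustration.
        Properties settings = infinispanIndexSettings("my-index-cache.xml");
        for (String name : settings.stringPropertyNames()) {
            System.out.println(name + " = " + settings.getProperty(name));
        }
    }
}
```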
3. Re: InfinispanDirectory: how to use it?

yangju Jun 27, 2011 12:22 AM (in response to sannegrinovero)

Thanks for the explanation.
Does <property name="hibernate.search.default.directory_provider" value="infinispan" /> have to be defined in both persistence.xml (in case of JPA) and infinispan cache config files?

I got weird results with Infinispan Query (5.0.0.CR6) when using <property name="hibernate.search.default.directory_provider" value="infinispan" />.
It seems to return more search results than the entries I actually put into the cache. With <property name="hibernate.search.default.directory_provider" value="ram" /> I got the correct result. I have only one node.

Also, if I set hibernate.search.infinispan.cachemanager_jndiname in persistence.xml, I get an NPE on deployment:
Caused by: java.lang.NullPointerException
    at java.util.HashSet.<init>(Unknown Source)
    at org.infinispan.remoting.MembershipArithmetic.getMembersLeft(MembershipArithmetic.java:46)
    at org.infinispan.transaction.TransactionTable$StaleTransactionCleanup.onViewChange(TransactionTable.java:204)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.infinispan.notifications.AbstractListenerImpl$ListenerInvocation$1.run(AbstractListenerImpl.java:200)

If I set hibernate.search.infinispan.configuration_resourcename, the deployment appears to be OK, but the query result is wrong.

I would really appreciate it if a complete example of Infinispan with Hibernate Search, using Infinispan as the Lucene index directory, could be posted.

I use JBoss AS 6 Final.
4. Re: InfinispanDirectory: how to use it?

sannegrinovero Jun 27, 2011 5:57 AM (in response to yangju)

> Thanks for the explanation.
> Does <property name="hibernate.search.default.directory_provider" value="infinispan" /> have to be defined in both persistence.xml (in case of JPA) and infinispan cache config files?
It depends on what you use. Are you using Infinispan Query or JPA? If you use both, you'll have to set both options.

> I got weird results with Infinispan Query (5.0.0.CR6) when using <property name="hibernate.search.default.directory_provider" value="infinispan" />.
> It seems to return more search results than the entries I actually put into the cache. With <property name="hibernate.search.default.directory_provider" value="ram" /> I got the correct result. I have only one node.
There was an issue, ISPN-1179, but it is fixed in 5.0.0.CR6. Can you provide a test for this?

> Also, if I set hibernate.search.infinispan.cachemanager_jndiname in persistence.xml, I get an NPE on deployment:
> Caused by: java.lang.NullPointerException
>     at java.util.HashSet.<init>(Unknown Source)
>     at org.infinispan.remoting.MembershipArithmetic.getMembersLeft(MembershipArithmetic.java:46)
>     at org.infinispan.transaction.TransactionTable$StaleTransactionCleanup.onViewChange(TransactionTable.java:204)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
>     at java.lang.reflect.Method.invoke(Unknown Source)
>     at org.infinispan.notifications.AbstractListenerImpl$ListenerInvocation$1.run(AbstractListenerImpl.java:200)

Please report this on JIRA, and try adding as much context information as possible.
5. Re: InfinispanDirectory: how to use it?

yangju Jun 28, 2011 11:57 AM (in response to sannegrinovero)

Sanne:

I have tried the "hibernate.search.default.directory_provider" value="infinispan" option and replaced the default-hibernatesearch-infinispan.xml inside hibernate-search-infinispan-3.4.0.Final.jar with a 5.0 version (the default config is still for version 4.2).

However, it seems that the search result is always wrong the first time the query is run. After that, subsequent queries (same query) always give correct results. It is really weird that it behaves like this.

Also, even with "hibernate.search.default.directory_provider" value="ram", I get a warning each time I add an entry to the cache:

[org.hibernate.search.backend.impl.TransactionalWorker] It appears changes are being pushed to the index out of a transaction. Register the IndexWorkFlushEventListener listener on flush to correctly manage Collections!

Here is my cache config:
<infinispan-config name="epen_application" jndi-name="java:CacheManager/epen">
        <infinispan xmlns="urn:infinispan:config:5.0">
            <global>
                <transport clusterName="${jboss.partition.name:DefaultPartition}-epen"
                    distributedSyncTimeout="17500">
                    <properties>
                        <property name="stack" value="${jboss.default.jgroups.stack:udp}" />
                    </properties>
                </transport>
                <globalJmxStatistics enabled="true"
                    allowDuplicateDomains="true" />
                <shutdown hookBehavior="DONT_REGISTER" />
            </global>
            <default>
                <deadlockDetection enabled="true" spinDuration="1000" />
                <invocationBatching enabled="true" />
                <storeAsBinary enabled="true" />
                <locking isolationLevel="READ_COMMITTED"
                    lockAcquisitionTimeout="2000" writeSkewCheck="false"
                    concurrencyLevel="5000" useLockStriping="false" />
                <transaction syncRollbackPhase="false" syncCommitPhase="false"
                    useEagerLocking="true" eagerLockSingleNode="true" />
                <clustering mode="distribution">
                    <l1 enabled="false" lifespan="60000" />
                    <hash numOwners="1" rehashRpcTimeout="120000" />
                    <async />
                </clustering>

            </default>
            <namedCache name="response_record">
                <indexing enabled="true" indexLocalOnly="true">
                    <properties>
                        <property name="hibernate.search.default.directory_provider" value="ram" />
                    </properties>
                </indexing>
                <eviction maxEntries="100000" />
                <expiration lifespan="600000" />
            </namedCache>



        </infinispan>
    </infinispan-config>

Does this warning mean that indexing this entry failed, or is it just a warning? How can I get around it?

Sorry, I also posted this problem at http://community.jboss.org/message/596958#596958 but I am posting it here again as I am not sure you can see that post.

Thanks a lot for your time.
6. Re: InfinispanDirectory: how to use it?

yangju Jun 29, 2011 10:25 AM (in response to yangju)

Also, with the index set to "ram", I sometimes get this error when I add an entry to the cache:
16:25:14,261 ERROR [org.hibernate.search.exception.impl.LogErrorHandler] Exception occurred org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: SimpleFSLock@.\com.company.product.bean.MyRecordImpl\write.lock
Primary Failure:
    Entity com.pearson.epen.bean.ResponseRecordImpl Id S:5948_1 Work Type org.hibernate.search.backend.AddLuceneWork
: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: SimpleFSLock@.\com.company.product.bean.MyRecordImpl\write.lock
    at org.apache.lucene.store.Lock.obtain(Lock.java:84) [:3.2.0 1129474 - 2011-05-30 23:00:24]
    at org.apache.lucene.index.IndexWriter.<init>(IndexWriter.java:1114) [:3.2.0 1129474 - 2011-05-30 23:00:24]
    at org.hibernate.search.backend.Workspace.createNewIndexWriter(Workspace.java:202) [:3.4.0.Final]
    at org.hibernate.search.backend.Workspace.getIndexWriter(Workspace.java:180) [:3.4.0.Final]
    at org.hibernate.search.backend.impl.lucene.PerDPQueueProcessor.run(PerDPQueueProcessor.java:103) [:3.4.0.Final]
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [:1.6.0_24]
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source) [:1.6.0_24]
    at java.util.concurrent.FutureTask.run(Unknown Source) [:1.6.0_24]
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source) [:1.6.0_24]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [:1.6.0_24]
    at java.lang.Thread.run(Unknown Source) [:1.6.0_24]

16:25:14,265 ERROR [org.hibernate.search.backend.impl.lucene.PerDPQueueProcessor] Unexpected error in Lucene Backend: : java.lang.NullPointerException
    at org.hibernate.search.backend.impl.lucene.works.AddWorkDelegate.performWork(AddWorkDelegate.java:76) [:3.4.0.Final]
    at org.hibernate.search.backend.impl.lucene.PerDPQueueProcessor.run(PerDPQueueProcessor.java:106) [:3.4.0.Final]
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [:1.6.0_24]
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source) [:1.6.0_24]
    at java.util.concurrent.FutureTask.run(Unknown Source) [:1.6.0_24]
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source) [:1.6.0_24]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [:1.6.0_24]
    at java.lang.Thread.run(Unknown Source) [:1.6.0_24]

16:25:14,266 ERROR [org.hibernate.search.exception.impl.LogErrorHandler] Exception occurred java.lang.NullPointerException
Primary Failure:
    Entity com.pearson.epen.bean.ResponseRecordImpl Id S:5948_1 Work Type org.hibernate.search.backend.AddLuceneWork
: java.lang.NullPointerException
    at org.hibernate.search.backend.impl.lucene.works.AddWorkDelegate.performWork(AddWorkDelegate.java:76) [:3.4.0.Final]
    at org.hibernate.search.backend.impl.lucene.PerDPQueueProcessor.run(PerDPQueueProcessor.java:106) [:3.4.0.Final]
    at java.util.concurrent.Executors$RunnableAdapter.call(Unknown Source) [:1.6.0_24]
    at java.util.concurrent.FutureTask$Sync.innerRun(Unknown Source) [:1.6.0_24]
    at java.util.concurrent.FutureTask.run(Unknown Source) [:1.6.0_24]
    at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(Unknown Source) [:1.6.0_24]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source) [:1.6.0_24]
    at java.lang.Thread.run(Unknown Source) [:1.6.0_24]
7. Re: InfinispanDirectory: how to use it?

sannegrinovero Jun 30, 2011 6:06 AM (in response to yangju)

Hi,
the SimpleFSLock is not used by the RAMDirectory by default: either you have overridden the lock factory configuration, or the configuration is wrong, as it doesn't seem to be using the RAM directory.

Could you try an invalid value for "hibernate.search.default.directory_provider"? Try something like "nonram"; it should throw an exception. That way we can at least double-check that the parameter is being read properly (it works fine here and in all tests, but maybe there's something strange with your configuration/setup).