2 Replies Latest reply on Oct 11, 2007 8:51 AM by soshah

    Indexing binary documents / searching the cms

    frontline2

      I configured the Jackrabbit installation by adding index filters for Word, Excel, PDF, etc.
      But when I search for these documents, the search rarely returns any hits. Sometimes it does return documents that match on words in their content (not just the title), so the configuration seems to work at least partially.
      Has anybody gotten the search to work successfully on these types of documents?

      Is the Jackrabbit installation simply too old, or is there some other problem? As I recall, the newer Jackrabbit releases handle these kinds of searches fine.

        • 1. Re: Indexing binary documents / searching the cms
          frontline2

          Got this to work by adding the needed dependencies.

          But indexing PDF documents still doesn't work. I get this error:

          14:29:02,416 ERROR [STDERR] java.lang.Throwable: Warning: You did not close the PDF Document
          14:29:02,416 ERROR [STDERR] at org.pdfbox.cos.COSDocument.finalize(COSDocument.java:384)
          14:29:02,416 ERROR [STDERR] at java.lang.ref.Finalizer.invokeFinalizeMethod(Native Method)
          14:29:02,416 ERROR [STDERR] at java.lang.ref.Finalizer.runFinalizer(Finalizer.java:83)
          14:29:02,416 ERROR [STDERR] at java.lang.ref.Finalizer.access$100(Finalizer.java:14)
          14:29:02,416 ERROR [STDERR] at java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:160)
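
          The trace above comes from a finalizer that complains when a PDF document object is garbage-collected without having been closed. As a rough illustration of the usual remedy (closing the document in a finally block once text extraction is done), here is a minimal sketch. StubDocument and extractAndClose are hypothetical stand-ins invented for this example; they are not the PDFBox or Jackrabbit API, and the actual fix would belong in whichever filter class opens the PDF.

```java
import java.io.Closeable;

public class ClosePdfSketch {

    // Hypothetical stand-in for a PDF document handle; NOT the PDFBox API.
    static final class StubDocument implements Closeable {
        private boolean closed = false;

        // Pretends to extract the text content of the document.
        String extractText() {
            if (closed) {
                throw new IllegalStateException("document already closed");
            }
            return "sample text";
        }

        @Override
        public void close() {
            closed = true;
        }

        boolean isClosed() {
            return closed;
        }
    }

    // Extracts text and always closes the document, even if extraction throws.
    static String extractAndClose(StubDocument doc) {
        try {
            return doc.extractText();
        } finally {
            doc.close();
        }
    }

    public static void main(String[] args) {
        StubDocument doc = new StubDocument();
        String text = extractAndClose(doc);
        System.out.println(text + " / closed=" + doc.isClosed());
    }
}
```

          With a real PDF library the same shape would apply: whatever opens the document is responsible for closing it on every code path, so the finalizer never finds it still open.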


          BTW, what version of Jackrabbit is the portal using? I tried to use 1.2.3, but I get this error:
          14:10:42,856 WARN [org.jboss.system.ServiceController] Problem starting service portal:service=CMS
          java.lang.AbstractMethodError: org.jboss.portal.cms.hibernate.state.JBossCachePersistenceManager.init(Lorg/apache/jackrabbit/core/persistence/PMContext;)V
           at org.apache.jackrabbit.core.RepositoryImpl.createPersistenceManager(RepositoryImpl.java:1175)
          


          • 2. Re: Indexing binary documents / searching the cms
            soshah

            frontline-

            PortalCMS integrates version 1.1 of Jackrabbit.

            Thanks