4 Replies Latest reply on Feb 13, 2015 4:03 AM by hchiorean

    Possible to extend Index / IndexProvider SPI so that we can ask a Workspace about currently disabled indexes

    bes82

      In my application/setup, indexes might get damaged or simply deleted on purpose. If that happens, they are rebuilt or repaired on the next startup.

       

      During that time, ManagedIndex.enable is set to false so they won't be used for queries.

       

      During startup, all kinds of tasks run, all of which need a working query engine. The problem is that as long as indexes are disabled, every query has to fetch and search all nodes, which then blows up the whole server. I'm not sure exactly what is causing the blowup; normally I would expect the operations to take a long time, but not to blow up everything.

       

      Anyway, what I really want is to simply delay a query until I'm sure all indexes are up and running. So I check the query plan for every query, which costs only 0.1 ms, and if there are more access queries than indexes used, I trigger exponential backoff for that query.
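The delay-and-retry idea above could be sketched roughly as follows. This is only a minimal illustration, not ModeShape code: the class `QueryBackoff`, its method names, and the `ready` supplier (which stands in for the query-plan check described above) are all hypothetical.

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper: none of these names exist in ModeShape.
final class QueryBackoff {
    private QueryBackoff() {}

    /** Delay for a zero-based attempt: initialMillis * 2^attempt, capped at maxMillis. */
    static long delayMillis(int attempt, long initialMillis, long maxMillis) {
        long delay = initialMillis;
        for (int i = 0; i < attempt && delay < maxMillis; i++) {
            delay *= 2;
        }
        return Math.min(delay, maxMillis);
    }

    /**
     * Polls the readiness check, sleeping with exponential backoff between polls.
     * Returns true as soon as the check passes, or false if it still fails
     * after maxAttempts sleeps or if the thread is interrupted.
     */
    static boolean awaitReady(BooleanSupplier ready, long initialMillis,
                              long maxMillis, int maxAttempts) {
        for (int attempt = 0; attempt < maxAttempts; attempt++) {
            if (ready.getAsBoolean()) {
                return true;
            }
            try {
                Thread.sleep(delayMillis(attempt, initialMillis, maxMillis));
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve the interrupt for the caller
                return false;
            }
        }
        return ready.getAsBoolean();
    }
}
```

A caller would pass its own check as the supplier, for example `QueryBackoff.awaitReady(() -> planUsesIndexes(query), 100, 5000, 10)`, where `planUsesIndexes` is an assumed application method that inspects the query plan, and would run the real query only once the call returns true.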

       

      This all works fine; the only problem is that there doesn't seem to be a concept in ModeShape for asking how many indexes in a workspace are active/inactive. So currently I hacked that into my IndexProviders and ask them directly. This works, but as I said, it is a bit of a hack.

       

      Would it be possible to extend the IndexProvider SPI so that there is a callback to the ModeShape QueryManager whenever an index is enabled or disabled by the provider?

      Additionally there should then be a public method in the QueryManager to ask about disabled/enabled indexes.
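To make the request concrete, the callback plus status lookup might look something like the sketch below. Everything here is an assumption for illustration: `IndexStatusListener`, `IndexStatusRegistry`, and their methods are hypothetical names and are not part of the ModeShape API or SPI.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical callback an index provider would invoke on every status change.
interface IndexStatusListener {
    void indexStatusChanged(String workspace, String indexName, boolean enabled);
}

// Hypothetical registry that records the callbacks and answers status questions.
final class IndexStatusRegistry implements IndexStatusListener {
    // workspace name -> names of indexes currently disabled in that workspace
    private final Map<String, Set<String>> disabledByWorkspace = new ConcurrentHashMap<>();

    @Override
    public void indexStatusChanged(String workspace, String indexName, boolean enabled) {
        Set<String> disabled = disabledByWorkspace.computeIfAbsent(
                workspace, w -> ConcurrentHashMap.newKeySet());
        if (enabled) {
            disabled.remove(indexName);
        } else {
            disabled.add(indexName);
        }
    }

    /** Returns a sorted snapshot of the indexes currently disabled in the workspace. */
    Set<String> disabledIndexes(String workspace) {
        Set<String> disabled = disabledByWorkspace.get(workspace);
        return disabled == null ? new TreeSet<>() : new TreeSet<>(disabled);
    }

    /** True if no index in the workspace has been reported as disabled. */
    boolean allIndexesEnabled(String workspace) {
        return disabledIndexes(workspace).isEmpty();
    }
}
```

With something like this, the application could ask `allIndexesEnabled("default")` before running a query instead of querying each provider directly.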

        • 1. Re: Possible to extend Index / IndexProvider SPI so that we can ask a Workspace about currently disabled indexes
          hchiorean

          During startup, all kinds of tasks run, all of which need a working query engine. The problem is that as long as indexes are disabled, every query has to fetch and search all nodes, which then blows up the whole server. I'm not sure exactly what is causing the blowup; normally I would expect the operations to take a long time, but not to blow up everything.

           

          I'm not sure what your application is trying to do, but I'd suggest first investigating what is causing the "blowup". It is the intended/desired behavior that while no indexes are available/enabled and queries are executed, the query engine will default to scanning all the nodes.

          • 2. Re: Possible to extend Index / IndexProvider SPI so that we can ask a Workspace about currently disabled indexes
            bes82


            Sorry, but sometimes I really don't understand your answers. I'm reporting a real-world problem that doesn't happen in ModeShape's tests, along with a pragmatic solution. And all you're telling me is that my app is the culprit when all it is doing is querying a bunch of nodes:

             

            Even if my app needed only one single node, "the desired behaviour" would fetch all ~20M nodes (as long as the required index is not yet ready), for which there is not enough RAM, so Infinispan eviction starts and I have absolutely no idea what is happening behind the scenes. In fact, my app needs about 5 nodes or so to be fetched via query, all in parallel, so 5 parallel jobs will try to access all nodes.

             

            It is then not my app but the "desired behaviour" that is blowing up the server. Whether it is a PermGen space/garbage collector problem or something else, I don't know. But for me the simplest solution would, as always, be not to analyze the symptoms first but to remove the root of the problem, which means not triggering that "desired behaviour" at all, because it's not desirable at all.

             

            If it later happens that I enjoy the luxury of additional spare time, I might investigate what is actually happening in Infinispan/ModeShape/the JVM.

            • 3. Re: Possible to extend Index / IndexProvider SPI so that we can ask a Workspace about currently disabled indexes
              hchiorean

              The reason for my answer is that you're asking us to change/update the SPI because of a problem you're seeing. Since the indexing SPI/API is already extremely complicated, the only case in which I would consider this is if there's a very compelling reason, i.e. a real problem/deficiency in the SPI. And from my (limited) understanding of the problem you're describing, I'm not convinced that's the case. I do suspect that you're asking us to change/update (and therefore start maintaining) the SPI so that you can work around your particular problem, which you yourself don't fully understand. Since it's a performance issue, you should really profile extensively to determine where the bottleneck lies.

              Additionally there should then be a public method in the QueryManager to ask about disabled/enabled indexes.

              You need to take into account that at least to my knowledge (rhauch should correct me if I'm wrong) the only public part of the indexing API/SPI is around the IndexProvider/Index interfaces. The RepositoryQueryManager/QueryEngine/etc are all internal components which we do not want to expose. Using/referring to these classes/interfaces from 3rd party code is a hack IMO.

               

              EDIT: btw, I think by far the easiest way to evaluate the SPI changes that you're proposing is for you to open a PR which we can then look at.

              • 4. Re: Possible to extend Index / IndexProvider SPI so that we can ask a Workspace about currently disabled indexes
                hchiorean

                As long as the changes to the API mean only updating the IndexManager to offer information about the re-index status of a certain index (finished/ongoing - i.e. enabled/disabled), this should be doable. Feel free to open an enhancement JIRA for this so that we can track it. Thanks.