14 Replies Latest reply on Oct 3, 2007 7:30 AM by manik

    JBCACHE-1154 - Introduce ability to mark nodes as resident -

    mircea.markus

      Nodes marked resident would be ignored by the eviction algorithms both when checking whether to trigger the eviction and when proceeding with the actual eviction of nodes. E.g. if the eviction policy for a region is "keep LRU 10 nodes" - the resident nodes won't be counted within those 10 nodes, and also won't be evicted when the threshold is reached.

      How to specify whether a node is resident or not?
      1. statically only, within the config file. There would be a flag on each node indicating this: Node.isResident.
      2. dynamically - having a flag on node that can be set to true (default to false - enforced by backward compatibility). Node.setResident, Node.isResident
      3. a combination of 1 and 2

      Some notes on static configurations(1&3):
      The configuration module is about to change significantly, and we would like to keep the changes in this area to a minimum. Also, at the moment it is not possible to specify nodes using regexp-like notation (out of scope), so static configuration is a bit restrictive. Are there any critical requirements to have this implemented at this point?

      Other notes:
      1. When moving a node from one region to another, I think it would make sense to also transfer the value of the resident flag rather than reset it to false (this fits better with what moving means).
      2. Backward compatibility should be assured by having the flag's default value set to false.
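To make the "keep LRU 10 nodes" example concrete, here is a minimal sketch of an eviction pass that neither counts nor evicts resident nodes. `SketchNode` and `SketchLruRegion` are hypothetical stand-ins written for this illustration; they are not the JBoss Cache `Node` or `Region` classes, and the real eviction policies may be structured quite differently.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a cache node; not the JBoss Cache Node interface. */
class SketchNode {
    final String fqn;
    private boolean resident; // defaults to false, which preserves existing behaviour

    SketchNode(String fqn) { this.fqn = fqn; }

    boolean isResident() { return resident; }
    void setResident(boolean resident) { this.resident = resident; }
}

/** Hypothetical LRU region that keeps at most maxNodes non-resident nodes. */
class SketchLruRegion {
    private final int maxNodes;
    // accessOrder=true makes iteration run from least to most recently used
    private final LinkedHashMap<String, SketchNode> nodes = new LinkedHashMap<>(16, 0.75f, true);

    SketchLruRegion(int maxNodes) { this.maxNodes = maxNodes; }

    void put(SketchNode node) { nodes.put(node.fqn, node); }

    /** Records an access, moving the node to the most-recently-used end. */
    void visit(String fqn) { nodes.get(fqn); }

    boolean contains(String fqn) { return nodes.containsKey(fqn); }

    /** Evicts LRU non-resident nodes until the threshold is met; returns the evicted FQNs. */
    List<String> evict() {
        // Resident nodes are not counted when deciding whether to trigger eviction.
        long evictable = nodes.values().stream().filter(n -> !n.isResident()).count();
        List<String> evicted = new ArrayList<>();
        Iterator<Map.Entry<String, SketchNode>> it = nodes.entrySet().iterator();
        while (evictable > maxNodes && it.hasNext()) {
            SketchNode candidate = it.next().getValue();
            if (candidate.isResident()) {
                continue; // resident nodes are never evicted
            }
            it.remove();
            evicted.add(candidate.fqn);
            evictable--;
        }
        return evicted;
    }
}
```

With a threshold of 2, one resident node and three ordinary nodes, a single call to `evict()` removes only the least recently used ordinary node and leaves the resident one in place.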

        • 1. Re: JBCACHE-1154 - Introduce ability to mark nodes as reside
          galder.zamarreno

          I created a JIRA a few months ago to be able to define regions as regular expressions - http://jira.jboss.com/jira/browse/JBCACHE-1122

          +1 on the possibility of being able to mark nodes as resident for evictions (we had a customer that wanted to do this)

          • 2. Re: JBCACHE-1154 - Introduce ability to mark nodes as reside
            mircea.markus

            there is another JIRA that might be implemented based on resident nodes approach: Introduce a concept of structural nodes(http://jira.jboss.com/jira/browse/JBCACHE-1153)

            Structural nodes should not be taken into consideration when processing regions for eviction. Typically these are nodes that do not contain data (and are not intended to contain data), but are used to build the tree structure.
            (Please check JIRA for an eviction example with respect to structural nodes.)

            Structural node implementation based on resident nodes:
            - each such structural node would be defined by default as resident
            - when an operation is made on any such node (e.g. adding data) then it would be marked as non-structural(i.e. considered for eviction)

            Backward compatibility:
            - normally this should be an improvement in the way eviction framework works.
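A minimal sketch of the two rules above, assuming a node starts out structural (and therefore resident) and loses that status on the first data write. `StructuralSketchNode` is a hypothetical class for illustration only; how JBCACHE-1153 would actually be implemented is not settled in this thread.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical node illustrating the structural-node idea; not JBoss Cache code. */
class StructuralSketchNode {
    private final Map<String, Object> data = new HashMap<>();
    // Assumption: a node created without data is structural, hence resident by default.
    private boolean resident = true;

    boolean isResident() { return resident; }

    /** Adding data turns the node into an ordinary one that eviction may consider. */
    void put(String key, Object value) {
        data.put(key, value);
        resident = false;
    }

    Object get(String key) { return data.get(key); }
}
```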

            • 3. Re: JBCACHE-1154 - Introduce ability to mark nodes as reside
              genman

              I thought it might be worthwhile sometime back to introduce various attributes such as read-only or hidden to a Node by potentially utilizing something like the EnumSet. Having a flag mechanism accomplishes a few things: Reduces future changes of the Node interface (no need to add methods for a new feature), and reduces node memory size (no need to add new boolean variables).

              I really think that how POJO Cache manages internal information is ugly. Having FQN with names such as "jboss_internal" etc. and doing all sorts of FQN hackery is really suggestive there must be a better way.

              • 4. Re: JBCACHE-1154 - Introduce ability to mark nodes as reside
                mircea.markus

                 

                and reduces node memory size (no need to add new boolean variables)
                - a boolean is held in a bit. A RegularEnumSet aggregates a long (64 bits) plus some caching arrays (more bits as well). From a memory POV it is optimal to aggregate booleans directly (rather than EnumSets wrapping enums).

                Reduces future changes of the Node interface (no need to add methods for a new feature)
                - code would be less readable; also I don't think this is a proper use of enums, whose purpose is to define bounded types and so enhance readability (the boolean variables are not logically related)

                • 5. Re: JBCACHE-1154 - Introduce ability to mark nodes as reside
                  jason.greene

                   

                  "mircea.markus" wrote:
                  a boolean is held in a bit. A RegularEnumSet aggregates a long (64 bits) plus some caching arrays (more bits as well). From a memory POV it is optimal to aggregate booleans directly (rather than EnumSets wrapping enums).


                  A boolean is sizeof(unsigned char) which is a byte on every platform I know of. An array reference is the size of a pointer, so typically 32 or 64bit. There are 2. So when accounting for the reference to the EnumSet itself, on a 32 bit architecture you need more than 20 booleans to exceed the space utilization of an enumset with 64 or less Enums.

                  java.util.BitSet uses slightly less space than EnumSet (1 array ref + 1 32 bit value).

                  That said, any efficiency gain here is negligible, so we should be focusing on what best fits the design.

                  • 6. Re: JBCACHE-1154 - Introduce ability to mark nodes as reside
                    manik

                    I'd say focus on readability. There are much bigger memory gains to be had elsewhere, in areas that do need attention - I don't think aggregating booleans is one of them.

                    • 7. Re: JBCACHE-1154 - Introduce ability to mark nodes as reside
                      genman

                       

                      "mircea.markus" wrote:
                      and reduces node memory size (no need to add new boolean variables)
                      - a boolean is held in a bit. A RegularEnumSet aggregates a long (64 bits) plus some caching arrays (more bits as well). From a memory POV it is optimal to aggregate booleans directly (rather than EnumSets wrapping enums).

                      Reduces future changes of the Node interface (no need to add methods for a new feature)
                      - code would be less readable; also I don't think this is a proper use of enums, whose purpose is to define bounded types and so enhance readability (the boolean variables are not logically related)


                      I still don't know why my concept hasn't caught on.

                      I saw your latest patches, which add flags to the data map. Then, you wrote a bunch of code to clean up the data as seen by the client. Sure, the external interface is clean but:

                      1. The internal code looks like crap. I don't know why you didn't opt for a boolean flag like everything else, but so be it.

                      2. You added new methods to an interface that's potentially designed for clients to implement. Or, is this interface supposed to get 2 new methods every point release? Adrian Brock "yelled" at me for doing something like this on a fairly obscure internal interface, and I don't and didn't work for JBoss. Hasn't he knocked a few times wondering what's going on?

                      3. You add methods to a general interface that are specific to a non-general concern.

                      The point is, sure you can have "boolean get/set" methods. But why not add to the interface instead:
                      void setProperty(Property p);
                      boolean getProperty(Property p);
                      

                      The properties themselves could be an enum or Object with an ordinal (so users could register their own). You could conceivably implement the internals as an integer or series of booleans or EnumSet, or what have you. Then, when a bug fix or something arises in the future, you minimize change scope.

                      I just get the sense things are going to continue to degenerate until maybe 3.0 when you guys get a clue and fix the APIs again.

                      The main point was never to "aggregate booleans", it would just have been a nice side-effect. Saving memory is certainly nice, but next time I won't bring it up since it seems to just confuse people.
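For readers who want to see the shape of the property-style interface proposed in this post, here is a minimal sketch backed by an `EnumSet`. The names `NodeProperty` and `PropertySketchNode` are hypothetical, and the setter takes an extra boolean argument (an assumption, since the signature quoted above takes only the property) so that a property can also be cleared.

```java
import java.util.EnumSet;

/** Hypothetical node attributes; the names are illustrative only. */
enum NodeProperty { RESIDENT, READ_ONLY, HIDDEN }

/** Hypothetical node exposing one generic accessor pair instead of one pair per flag. */
class PropertySketchNode {
    // The internals could equally be an int bit mask or a series of booleans.
    private final EnumSet<NodeProperty> properties = EnumSet.noneOf(NodeProperty.class);

    void setProperty(NodeProperty p, boolean value) {
        if (value) {
            properties.add(p);
        } else {
            properties.remove(p);
        }
    }

    boolean getProperty(NodeProperty p) { return properties.contains(p); }
}
```

Adding a new attribute then means adding an enum constant rather than two new interface methods, which is the change-scope argument made above; the trade-off raised elsewhere in the thread is that call sites become less explicit than `isResident()`.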


                      • 8. Re: JBCACHE-1154 - Introduce ability to mark nodes as reside
                        mircea.markus

                         

                        1. The internal code looks like crap. I don't know why you didn't opt for a boolean flag like everything else, but so be it.

                        This flag also needs to be replicated within the cluster. The current implementation only replicates the node's underlying map; it does not replicate the Node's state. So the reason for placing the info in the node map is to achieve replication. The nice solution of replicating the metadata (i.e. node data that is not placed in the underlying map) requires significant API changes that are not acceptable within this release (e.g. the cache loaders would also need to propagate metadata). I'll add an implementation doc to state this.
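A minimal sketch of the workaround described in this reply: the flag lives in the node's data map under a reserved key, so whatever ships the map also ships the flag, and the client-facing view strips the key out. `MapBackedSketchNode` and the key name are hypothetical; the key and cleanup code used by the actual patch are not shown in this thread.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical node that piggybacks the resident flag on its data map. */
class MapBackedSketchNode {
    // Hypothetical reserved key; the key used by the real patch is not given in this thread.
    static final String RESIDENT_KEY = "__sketch_resident__";

    private final Map<String, Object> rawData = new HashMap<>();

    void put(String key, Object value) { rawData.put(key, value); }

    void setResident(boolean resident) {
        if (resident) {
            rawData.put(RESIDENT_KEY, Boolean.TRUE);
        } else {
            rawData.remove(RESIDENT_KEY);
        }
    }

    boolean isResident() { return Boolean.TRUE.equals(rawData.get(RESIDENT_KEY)); }

    /** What replication would ship: the raw map, flag included. */
    Map<String, Object> getRawData() { return Collections.unmodifiableMap(rawData); }

    /** What the client sees: a copy of the data with the reserved key stripped. */
    Map<String, Object> getData() {
        Map<String, Object> view = new HashMap<>(rawData);
        view.remove(RESIDENT_KEY);
        return Collections.unmodifiableMap(view);
    }
}
```

The filtering in `getData()` is the kind of cleanup code criticised earlier in the thread: the external interface stays clean, at the cost of every client-visible accessor having to know about the reserved key.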

                        2. You added new methods to an interface that's potentially designed for clients to implement.

                        See your point on this one - I don't think there is such a case, though.

                        3. You add methods to a general interface that are specific to a non-general concern.

                        I tend to agree that 'resident' is a non-general concern, as it has meaning only in the scope of eviction. Perhaps move it closer to the eviction layer? e.g.
                        Region.markNodeResident(Fqn, isResident);
                        Region.isNodeResident(Fqn);
                        

                        On the other hand this would make the API a bit cumbersome. IMO the info should be held there, still. Others?


                        The point is, sure you can have "boolean get/set" methods. But why not add to the interface instead:
                        Code:

                        void setProperty(Property p);
                        boolean getProperty(Property p);



                        The advantages of this approach would be a) ease of change and b) memory footprint.
                        a) This comes at the cost of making the code less readable and less strongly typed. I prefer the strongly typed approach.
                        b) As per a previous post, I don't reckon there is a memory gain.

                        • 9. Re: JBCACHE-1154 - Introduce ability to mark nodes as reside
                          manik

                           

                          "genman" wrote:

                          1. The internal code looks like crap. I don't know why you didn't opt for a boolean flag like everything else, but so be it.

                          See Mircea's answer.

                          "genman" wrote:

                          2. You added new methods to an interface that's potentially designed for clients to implement. Or, is this interface supposed to get 2 new methods every point release? Adrian Brock "yelled" at me for doing something like this on a fairly obscure internal interface, and I don't and didn't work for JBoss. Hasn't he knocked a few times wondering what's going on?



                          1) This isn't an external interface that clients implement
                          2) This isn't a point-release.

                          "genman" wrote:

                          The main point was never to "aggregate booleans", it would just have been a nice side-effect. Saving memory is certainly nice, but next time I won't bring it up since it seems to just confuse people.


                          Nothing to confuse, we just opted for code readability over potentially saving a few bytes.

                          Storing such metadata in a Properties object does make sense, but until we can implement sufficient internal changes to replicate and persist such metadata, it is pointless.

                          • 10. Re: JBCACHE-1154 - Introduce ability to mark nodes as reside
                            manik

                             

                            "mircea.markus" wrote:
                            3. You add methods to a general interface that are specific to a non-general concern.

                            I tend to agree that 'resident' is a non-general concern, as it has meaning only in the scope of eviction. Perhaps move it closer to the eviction layer? e.g.
                            Region.markNodeResident(Fqn, isResident);
                            Region.isNodeResident(Fqn);
                            

                            On the other hand this would make the API a bit cumbersome. IMO the info should be held there, still. Others?


                            Node metadata is the proper place to store this, since this flag would affect individual nodes, not regions of nodes. Considering where we're going with node and region design, with Regions becoming a fully configurable top-level construct in future, I don't want to clutter up the Region API anyway.



                            • 11. Re: JBCACHE-1154 - Introduce ability to mark nodes as reside
                              brian.stansberry

                              Hit submit button too early a sec ago; here's the real post:

                              "mircea.markus" wrote:

                              Quote:
                              1. The internal code looks like crap. I don't know why you didn't opt for a boolean flag like everything else, but so be it.

                              This flag also needs to be replicated within the cluster. The current implementation only replicates the node's underlying map; it does not replicate the Node's state. So the reason for placing the info in the node map is to achieve replication. The nice solution of replicating the metadata (i.e. node data that is not placed in the underlying map) requires significant API changes that are not acceptable within this release (e.g. the cache loaders would also need to propagate metadata). I'll add an implementation doc to state this.


                              For now, could we leave keeping this consistent around the cluster as an application-level concern? For the usages where I'm going to do this, these kinds of things are dealt with as part of application startup. There I would build the structural nodes I need, or, if they were state-transferred in, walk the tree and mark them. I think that kind of usage is probably the most common. More exotic cases can be handled by the app via a cache listener.
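The "walk the tree and mark them" step at application startup could look roughly like the sketch below. `TreeSketchNode` and `ResidentMarker` are hypothetical classes, and the rule used to recognise a structural node (no data, at least one child) is an assumption made for this example; a real application would apply its own criteria against the actual cache API.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical tree node; not the JBoss Cache Node interface. */
class TreeSketchNode {
    final String name;
    final Map<String, Object> data = new HashMap<>();
    final List<TreeSketchNode> children = new ArrayList<>();
    boolean resident;

    TreeSketchNode(String name) { this.name = name; }

    TreeSketchNode addChild(String childName) {
        TreeSketchNode child = new TreeSketchNode(childName);
        children.add(child);
        return child;
    }
}

/** Hypothetical startup helper that marks structural nodes as resident. */
class ResidentMarker {
    /** Marks every data-less node that has children as resident; returns how many were marked. */
    static int markStructuralNodes(TreeSketchNode root) {
        int marked = 0;
        Deque<TreeSketchNode> pending = new ArrayDeque<>();
        pending.push(root);
        while (!pending.isEmpty()) {
            TreeSketchNode node = pending.pop();
            // Assumption: "structural" means no data of its own but at least one child.
            if (node.data.isEmpty() && !node.children.isEmpty()) {
                node.resident = true;
                marked++;
            }
            node.children.forEach(pending::push);
        }
        return marked;
    }
}
```

Each cluster member would run the same walk after its own startup or state transfer, which is what keeps the flag consistent without replicating it.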

                              • 12. Re: JBCACHE-1154 - Introduce ability to mark nodes as reside
                                manik

                                Brian does have a point. Replication need not be dealt with until we have properly replicable metadata.

                                And as for cache loading, well, if a node is resident it won't be evicted, right? So it won't suffer the problem of having its resident status lost on cache loading/activation.

                                • 13. Re: JBCACHE-1154 - Introduce ability to mark nodes as reside
                                  mircea.markus

                                  fine with me, didn't like the workaround code anyway :). There still is the transaction part. Do such prop changes subscribe to the ongoing transaction policy? E.g.

                                  tx.start();
                                  aNode.setResident(true);
                                  tx.rollback();
                                  //what's the value of aNode.isResident() here?
                                  


                                  Generally speaking, we might want some metadata that subscribes to the transaction and some that doesn't(?), and likewise some that we want to replicate and some that we don't?

                                  • 14. Re: JBCACHE-1154 - Introduce ability to mark nodes as reside
                                    manik

                                    I don't think this has anything to do with tx scoping. This is more a configuration thing.