2 Replies Latest reply on Apr 13, 2018 5:07 AM by Tristan Tarrant

    How does the Infinispan single file store clean up duplicate keys that appear to persist if the keys and values are put periodically with a lifespan?

    Prateek Kumar Newbie

      (Query section below towards middle)



      Infinispan 9.13

      Embedded cache in a cluster with jgroups (but can see same behavior in local basiccache too)

      Single file store

      No explicit eviction nor passivation

      <persistence passivation="false">
          <file-store path="/path/to/file/store" max-entries="-1" purge="false"/>
      </persistence>

      <memory>
          <binary size="certain size here but never expect it to be met" eviction="MEMORY"/>
      </memory>




      At a fixed frequency (call it freq), a scheduled thread (say, via ScheduledExecutorService.scheduleAtFixedRate(...)) does a put(key, value, freq + buffer, TimeUnit.MINUTES) into the distributed cache.

      So at every interval, each key is put into the cache again, with the same value as before or a different one, and with lifespan = freq + buffer (the buffer is just for safety and is much smaller than freq).
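
The periodic put described above can be sketched roughly as follows. This is a minimal illustration, not our actual code: `PeriodicRefresher`, `LifespanPut`, and the `source` supplier are hypothetical names, and `LifespanPut` only mirrors the shape of the cache's put(key, value, lifespan, unit) call so the sketch runs without Infinispan on the classpath.

```java
import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical helper: re-puts every entry with lifespan = freq + buffer. */
class PeriodicRefresher {

    /** Stand-in with the same shape as put(key, value, lifespan, unit) on the cache. */
    @FunctionalInterface
    interface LifespanPut {
        Object put(String key, String value, long lifespan, TimeUnit unit);
    }

    private final LifespanPut cache;
    private final Supplier<Map<String, String>> source;
    private final long freqMinutes;
    private final long bufferMinutes;

    PeriodicRefresher(LifespanPut cache, Supplier<Map<String, String>> source,
                      long freqMinutes, long bufferMinutes) {
        this.cache = cache;
        this.source = source;
        this.freqMinutes = freqMinutes;
        this.bufferMinutes = bufferMinutes;
    }

    /** One refresh round: every key is written again, whether or not its value changed. */
    void refreshOnce() {
        long lifespan = freqMinutes + bufferMinutes;
        source.get().forEach((k, v) -> cache.put(k, v, lifespan, TimeUnit.MINUTES));
    }

    /** Runs refreshOnce() at a fixed rate; the caller is responsible for shutting the executor down. */
    ScheduledExecutorService start() {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(this::refreshOnce, 0, freqMinutes, TimeUnit.MINUTES);
        return scheduler;
    }
}
```

With a real String-keyed cache, the first constructor argument would presumably be a method reference such as `cache::put`; each round then overwrites k1, k2, k3 with a fresh lifespan, which is the write pattern in question.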

      For example, the insertion might put with lifespan=freq+buffer:

      k1, v1

      k2, v2

      k3, v3

      After freq has elapsed, we put again with the same lifespan:

      k1, v1

      k2, v4 (different value)

      k3, v3

      There is NO eviction happening, because the maximum memory size is never reached.



      In the single file store, the key-value pairs appear to be duplicated, even when we check well after the next freq + buffer has elapsed.

      So the single file store's .dat file might contain:





      (This is not exact, just a representation of what's in the file store.)

      Duplication is acceptable to us, since a programmatic get() always returns the updated, correct value (though we may never call get() on most keys).



      Does Infinispan clean up stale values from the single file store's .dat file, so that the file does not grow indefinitely with each put()? At each interval it seems to either append the key-value pairs at the end of the file or replace some of them in place (some key-value pairs are duplicated; of the duplicates, some are stale and some are current).


      Could you please point me to the documentation or mechanism that ensures this cleanup happens, so that the file store will not grow because of too many duplicate, stale, or expired entries?


      The documentation refers to an expiration reaper thread that wakes up periodically, but here we are not passing anything explicitly for its interval (there is no <expiration> entry in the XML file, and the lifespan is set programmatically via put()). If the expiration reaper thread IS doing this cleanup, I'd like to know, and to note its default wake-up interval.


      Notes from Infinispan documentation:


      Stale and duplicate entries seem to be expected behavior -


      "If you don’t use eviction, what’s in the persistent store is basically a copy of what’s in memory. If you do use eviction, what’s in the persistent store is basically a superset of what’s in memory (i.e. it includes entries that have been evicted from memory)." - http://infinispan.org/docs/stable/user_guide/user_guide.html#cache_loader_behavior_with_passivation_disabled_vs_enabled


      "If you have no eviction configured and and you let this time expire, it can look as Infinispan has not removed the entry." - http://infinispan.org/docs/stable/faqs/faqs.html#expiration_does_not_work_what_is_the_problem


      Is this what cleans up the duplicates in the single file store's .dat file? “When an entry expires it will reside in the data container or cache store until it is accessed again by a user request. There is also an optional expiration reaper that can run at a given configurable interval of milliseconds which will check for expired entries and remove them.” - Infinispan 9.2 User Guide
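
To make my reading of that quote concrete, here is a toy model of the two removal paths it describes: removal on access, and removal by a periodic reaper. It is only a sketch of how I understand the documentation, not Infinispan's implementation; `ToyExpiringStore` and all of its methods are hypothetical, and time is passed in explicitly to keep the example deterministic.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy model of the quoted behavior; NOT how Infinispan is actually implemented. */
class ToyExpiringStore {

    private static final class Entry {
        final String value;
        final long expiresAtMillis;

        Entry(String value, long expiresAtMillis) {
            this.value = value;
            this.expiresAtMillis = expiresAtMillis;
        }
    }

    private final Map<String, Entry> entries = new HashMap<>();

    /** Stores (or overwrites) a key with an absolute expiry time derived from its lifespan. */
    void put(String key, String value, long lifespanMillis, long nowMillis) {
        entries.put(key, new Entry(value, nowMillis + lifespanMillis));
    }

    /** Access path: an expired entry is dropped only when someone asks for it. */
    String get(String key, long nowMillis) {
        Entry e = entries.get(key);
        if (e == null) {
            return null;
        }
        if (e.expiresAtMillis <= nowMillis) {
            entries.remove(key);
            return null;
        }
        return e.value;
    }

    /** Reaper path: sweeps every expired entry and returns how many were removed. */
    int reap(long nowMillis) {
        int before = entries.size();
        entries.values().removeIf(e -> e.expiresAtMillis <= nowMillis);
        return before - entries.size();
    }

    /** Number of entries physically held, expired or not. */
    int storedCount() {
        return entries.size();
    }
}
```

In this model, keys that are never read stay stored after they expire until reap() runs, which is the situation I am asking about for the file store.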