2 Replies Latest reply on Apr 9, 2015 8:17 AM by prashant.thakur

    storeAsBinary, Serialization and Metadata information

    prashant.thakur

      We are facing issues with memory bloat in the Infinispan cache when storing entries as plain Java objects: consumption is roughly 10x the raw data size as estimated by Oracle's average row length or by simple data-size calculations. We tried the storeAsBinary option, but consumption is still approx 3x.

       

      Hence we tried writing our own serialization routine, which brings consumption down to approx 1.2x. While going through the documentation we found the following text, which we wish to confirm:

      "Internally, Infinispan uses an implementation of this Marshaller interface in order to marshall/unmarshall Java objects so that they’re sent other nodes in the grid, or so that they’re stored in a cache store, or even so to transform them into byte arrays for lazy deserialization."

       

      There is no mention of whether this is the same binary format used for storing data in binary form when we write our own Externalizer.

      Can the documentation be updated to confirm this? Memory utilization is a major issue for us, so confirmation here would be helpful.

      Please correct if we are mistaken.
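      For reference, here is a minimal sketch of the kind of hand-written serialization routine we mean. The Subscriber class and its fields are hypothetical stand-ins for our cached value type; the point is that writing only the raw fields avoids the class-descriptor and field-metadata overhead that default Java serialization carries, which is where most of the size difference comes from:

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;

public class CompactSerializationDemo {

    // Hypothetical value type standing in for the objects we cache.
    static class Subscriber implements Serializable {
        final long id;
        final int balance;
        Subscriber(long id, int balance) { this.id = id; this.balance = balance; }
    }

    // Default Java serialization: includes stream header, class descriptor,
    // and field metadata in addition to the actual field values.
    static byte[] javaSerialize(Subscriber s) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(bos)) {
            oos.writeObject(s);
        }
        return bos.toByteArray();
    }

    // Hand-written encoding: just the raw fields, 8 + 4 = 12 bytes total.
    static byte[] compactSerialize(Subscriber s) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        try (DataOutputStream dos = new DataOutputStream(bos)) {
            dos.writeLong(s.id);
            dos.writeInt(s.balance);
        }
        return bos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        Subscriber s = new Subscriber(42L, 1000);
        System.out.println("java=" + javaSerialize(s).length
                + " compact=" + compactSerialize(s).length);
    }
}
```

      The same field-by-field approach is what an Infinispan Externalizer's writeObject/readObject methods would contain.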

       

      Also, in case we don't use the VersionAwareMarshaller, would the InternalMetadata information still be passed during inter-node communication, or do we need to write the metadata explicitly for each serializer we implement?

        • 1. Re: storeAsBinary, Serialization and Metadata information
          nadirx

          The confusing phrase in the docs here is "or even so to transform them into byte arrays for lazy deserialization."

          The above really should be: "and to transform into byte arrays when using storeAsBinary".
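          For anyone finding this thread later: storeAsBinary is enabled per cache. A programmatic sketch, assuming the Infinispan 7.x configuration API (check the builder method names against your version):

```java
import org.infinispan.configuration.cache.Configuration;
import org.infinispan.configuration.cache.ConfigurationBuilder;

// Keep entries as byte arrays in the data container; keys and values
// can be toggled independently via the storeAsBinary builder.
Configuration cfg = new ConfigurationBuilder()
        .storeAsBinary().enable()
        .build();
```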

           

          Metadata is marshalled separately and you don't need to do it yourself.

           

          Tristan

          • 2. Re: storeAsBinary, Serialization and Metadata information
            prashant.thakur

            Thanks Tristan,

            We observed that Lucene indexes don't work when we enable storeAsBinary. We were able to figure this out from the code, but can it be documented as well?

            Is there a way, other than wrapping the DataContainer with our own implementation, to keep queries enabled? Storing data in binary format while still being able to query huge data sets, i.e. 100 million plus records, seems like a basic requirement.

            In our case this required approx 750GB of memory with replication, and to query it we would also need indexes, since multiple copies of the value part cannot be stored.

            We can bring the memory requirement down to 100GB, but the challenge now is how to use the Lucene indexes.

            It seems that the index builders use the DataContainer interface rather than the local cache interface, which seems logical, but it means changing the data container itself.