1 2 Previous Next 23 Replies Latest reply on May 12, 2015 2:55 AM by hchiorean

    org.modeshape.jcr.value.binary.BinaryStoreException

    mlateef

      Hi,

       

      We are running modeshape 4.1.0 and had some data and some files are no longer accessible,

       

      the error trace we are getting is below

       

      requested url = http://sharearchiver/archive/DownloadFile

      org.modeshape.jcr.value.binary.BinaryStoreException: Unable to find binary value with key "4f9aa35acbbdb5f4ffcea2dbdc33118d70

      2186b4" within binary store at "org.modeshape.jcr.value.binary.CompositeBinaryStore@e6d3c8"

              at org.modeshape.jcr.value.binary.CompositeBinaryStore.getInputStream(CompositeBinaryStore.java:224)

              at org.modeshape.jcr.value.binary.StoredBinaryValue.internalStream(StoredBinaryValue.java:70)

              at org.modeshape.jcr.value.binary.AbstractBinary.getStream(AbstractBinary.java:166)

              at com.virtualcode.servlets.DownloadFile.processRequest(DownloadFile.java:120)

              at com.virtualcode.servlets.DownloadFile.doGet(DownloadFile.java:217)

              at javax.servlet.http.HttpServlet.service(HttpServlet.java:618)

              at javax.servlet.http.HttpServlet.service(HttpServlet.java:725)

              at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:291)

              at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)

              at org.apache.tomcat.websocket.server.WsFilter.doFilter(WsFilter.java:52)

              at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)

              at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)

              at com.virtualcode.filters.CompanyTagFilter.doFilter(CompanyTagFilter.java:125)

              at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)

              at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)

              at org.springframework.orm.hibernate3.support.OpenSessionInViewFilter.doFilterInternal(OpenSessionInViewFilter.java:1

      98)

              at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:76)

              at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:239)

              at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)

              at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:219)

              at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:106)

              at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:501)

              at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:142)

              at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:79)

              at org.apache.catalina.valves.AbstractAccessLogValve.invoke(AbstractAccessLogValve.java:610)

              at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:88)

              at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:516)

              at org.apache.coyote.http11.AbstractHttp11Processor.process(AbstractHttp11Processor.java:1086)

              at org.apache.coyote.AbstractProtocol$AbstractConnectionHandler.process(AbstractProtocol.java:659)

              at org.apache.coyote.http11.Http11NioProtocol$Http11ConnectionHandler.process(Http11NioProtocol.java:223)

              at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.doRun(NioEndpoint.java:1558)

              at org.apache.tomcat.util.net.NioEndpoint$SocketProcessor.run(NioEndpoint.java:1515)

              at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)

              at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)

              at org.apache.tomcat.util.threads.TaskThread$WrappingRunnable.run(TaskThread.java:61)

              at java.lang.Thread.run(Thread.java:722)

      Error in tag filter: null

       

      This error is being generated for majority of the files, can somebody help with why this is so and more importantly how we may recover the data?

       

      There is a possibility the server may have suffered a sudden shutdown, the system starts as normal and does not report any other issues or corruption.

       

      Can somebody please assist?

       

      Thanks you

        • 1. Re: org.modeshape.jcr.value.binary.BinaryStoreException
          wesssel

          Is this by any chance a network mount that has become unavailable?

          • 2. Re: org.modeshape.jcr.value.binary.BinaryStoreException
            hchiorean

            You seem to be using a composite binary store, so has anything changed in *any* of the nested binary stores so that for example they aren't accessible anymore or there was some sort of FS corruption ? This would explain why binary values cannot be located anymore.

            Binary values are normally stored outside of ModeShape and referred to via SHA1 values. If the external stores aren't corrupted and are accessible there is no reason why ModeShape shouldn't retrieve the binary values.

            • 3. Re: org.modeshape.jcr.value.binary.BinaryStoreException
              mlateef
              Hi Horia/Wessel

               

               

              Thank you very much for your responses, yes we are using composite binary stores with a single store currently configured, all data is hosted locally on local disks. The odd thing is, some data is giving this error and some of the data is accessible just fine.
              With regards to OS corruption, as far I can see, there is no reason to suspect FS corruption, The FS is mounted as XFS, which should be very stable/reliable, besides the OS seems to be working well and the only issue appears to be the application returning some of these files.
              here is the mounted root partition where the data is hosted. The server has not been shutdown/powered off which may have caused corruption.

               

              root@sharearchiver:/opt/tomcat/logs# mount
              /dev/xvda1 on / type xfs (rw)

              proc on /proc type proc (rw,noexec,nosuid,nodev)

               

              Any further thoughts?

               

               

              Thank you in advance.

              • 4. Re: org.modeshape.jcr.value.binary.BinaryStoreException
                hchiorean

                Given that you're storing all the binaries on disk, one thing you can try is to manually look at the FS structure to see if a binary for which an exception is raised is there or not. To do this, take one binary key SHA1 (as reported by an exception) and considering the following code:

                https://github.com/ModeShape/modeshape/blob/modeshape-4.1.0.Final/modeshape-jcr/src/main/java/org/modeshape/jcr/value/binary/FileSystemBinaryStore.java#L293 you should be able to check if the actual content is on disk or not.

                 

                For example, given the key 4f9aa35acbbdb5f4ffcea2dbdc33118d702186b4, ModeShape will store the actual binary content on disk in the following substructure: binary_store_root / 4f / 9a / a3 / 4f9aa35acbbdb5f4ffcea2dbdc33118d702186b4

                If you check the FS and

                1. the content is there, it means that no data has actually been lost but rather there's a logical error (possible bug) which prevents ModeShape from locating this file. If that is the case you should first use the latest version of ModeShape (4.2.0.Final) and then debug the logic of the CompositeBinaryStore trying to figure out where the problem lies
                2. the content isn't there, it means that it has somehow been removed even though there are ModeShape nodes/properties still referencing it. If that is the case, I'm afraid there's nothing you can do except trying to write some custom code to remove/purge the properties which are referencing these phantom binary values.
                • 5. Re: org.modeshape.jcr.value.binary.BinaryStoreException
                  rhauch

                  After you confirm there is no data loss (see Horia's previous comment), then you can try something else. Since you're not really using the composite binary store's features, I'd suggest taking that out and directly using the file system binary store. If that configuration works, then the problem might be in the composite binary store. If that configuration does not work either, then the problems might be higher up.

                  • 6. Re: org.modeshape.jcr.value.binary.BinaryStoreException
                    mlateef

                    Hi Horia/Randall,


                    Again, thank you for your guidance on this, I have tested as suggested and below is the screen shot I am sharing, it seems the data is not there to begin with . it seems the 4f/2a/0d is also missing data, judging by the fact the folder structure is there but no binary data.

                     

                    The reason why we used composite store, is to offer the flexibility later on to add another logical volume to add an additional binary store rather than having to continue to extend the same vol forever, which brings me to another question, is it possible to reconfigure modeshape later on and change it to add a composite storage or does this have to be configured prior to adding any data to the repository?, further, can one change the composite storage name later e.g. from data1 to data2 and the data will still be accessible?

                    Is there a suggestion there is a possible bug with the composite repository implementation? - what is the most stable and tested configuration that I should use in production?

                    • 7. Re: org.modeshape.jcr.value.binary.BinaryStoreException
                      wesssel

                      We are running a composite binary store with file system stores that totals 2 TB+ with files <1 MB, on an NFS share, shared by 4 different machines all running Modeshape. No missing data so far. How sure are you that your filesystem is solid? Codebase is ModeShape 3.x for me btw.

                      • 8. Re: org.modeshape.jcr.value.binary.BinaryStoreException
                        rhauch

                        The reason why we used composite store, is to offer the flexibility later on to add another logical volume to add an additional binary store rather than having to continue to extend the same vol forever, which brings me to another question, is it possible to reconfigure modeshape later on and change it to add a composite storage or does this have to be configured prior to adding any data to the repository?, further, can one change the composite storage name later e.g. from data1 to data2 and the data will still be accessible?

                        You can start with a single binary store now, and then move to the composite binary store as long as it makes use of the original binary store. The composite binary store really just delegates to the others. Changing the config would require a restart.

                        • 9. Re: org.modeshape.jcr.value.binary.BinaryStoreException
                          mlateef

                          Hi Randall/Wessel,

                           

                          I have been doing some further investigation and been able to reproduce the issue, here are my findings..

                           

                          it seems the issue is may not be an issue after all with the modeshape / FS - seems to be some house keeping task which runs as a background task which we are trying to understand.. in order to investigate, I created a new repo and was taking a file count every minute of data/storage, with the repository I noticed at exactly midnight the number of files dropped from 520 to 416.. I had already taken a backup immediately after creating the repository so did have the original.

                           

                          I noticed there was a trash folder in the data1/storage/ folder which I didn't take much notice of originally.

                           

                          I compared the trash files in the backup original data to the trash folder in the live data (where data was removed) and found that there were 52 files in the backup trash folder that were no longer part of the live trash folder.

                           

                          I took the hash of one of the missing files and confirmed that it was in trash infact there were 2 instance of it, and both had been removed i..e from the data/storage/4f... and also from the data/storage/trash/4f.. so they seem to be correlating. looks like the file had been marked for deletion and hence it was copied /linked to trash folder.

                           

                          so just playing with the numbers - the files dropped from 520 to 416 = 104

                          52 files from trash and I am guessing the exact equivalent also dropped from the live data - so it seems the trash folder contains markers of files to be removed.

                           

                          my question is - is this some sort of retention implementation? why would data be in trash and what task may be kicking off to remove these files, we are still trying to trace the root cause I am not sure if this some modeshape internal task or some other task that we may have possibly created inadvertently. any guidance would be much appreciated.

                           

                          my colleagues are also looking into this.

                           

                          Many Thank in advance.

                          • 10. Re: org.modeshape.jcr.value.binary.BinaryStoreException
                            hchiorean

                            First, I would strongly recommend using the latest version - 4.2.0.Final - since we're always fixing bugs.

                             

                            Binary values which are unused - i.e. not referred anymore by any JCR node properties, are initially marked as such during each session.save(). ModeShape keeps reference counts for each binary value and when such a reference count becomes 0 a binary value "will be moved to trash" - which is a separate part on the FS - as you point out. The trash is not immediately emptied, allowing effectively for a binary value to be reused (i.e. moved out of the trash) if a node property which references it gets created.

                             

                            However, if "enough time passes", ModeShape will purge out (remove) the binary values from the trash which have not been used for at least a given amount of time (default is 1 hour) using a daemon thread which can be configured (by default runs at midnight each day).

                            You can see the relevant parts of the code here: https://docs.jboss.org/author/display/MODE40/Binary+values#Binaryvalues-Garbagecollection and here https://github.com/ModeShape/modeshape/blob/modeshape-4.2.0.Final/modeshape-jcr/src/main/java/org/modeshape/jcr/value/binary/FileSystemBinaryStore.java#L478. You can also read more about the 2-step binary GC here: https://docs.jboss.org/author/display/MODE40/Binary+values#Binaryvalues-Garbagecollection. You can also see what a configuration looks like here: https://github.com/ModeShape/modeshape/blob/modeshape-4.2.0.Final/modeshape-jcr/src/test/resources/config/repo-config-garbage-collection.json

                             

                            In general, if there is at least 1 property of node which references a binary value, this value *should never* be marked as unused and eventually removed. If this is not the case, it's most likely a bug somewhere which you'll have to help us reproduce (via a test-case). This is the reason why I was suggesting at the beginning that you run the latest codebase.

                            • 11. Re: org.modeshape.jcr.value.binary.BinaryStoreException
                              wesssel

                              Modeshape runs garbage collection of unused binaries every day at midnight by default. As to the 52 / 104 issue, AFAIK Modeshape stores the binary + a file which contains the mimetype in plain text with each binary. This explains why there are twice the number of files as there are binaries.

                              • 12. Re: org.modeshape.jcr.value.binary.BinaryStoreException
                                mlateef

                                Wessel,

                                 

                                thank you for your response,

                                 

                                how is a binary deemed unused? what process or action sets a binary to unused state? in this case the binary was in use and it was removed so just trying to understand the decision process behind the removal of data.

                                 

                                Thank you

                                • 13. Re: org.modeshape.jcr.value.binary.BinaryStoreException
                                  mlateef

                                  Hi Horia/Wessel,

                                   

                                  Many Thanks for your responses.

                                   

                                  Further to the recommendation to upgrade the modeshape version to 4.2.0 final - I have done so and retested the issue and found the issue still exists..

                                   

                                  I recreated a new small repository (rather than using the older backup), one thing to note is that immediately on creating and adding data,  some data is marked  for removal in the storage/trash from the get go please see below image,

                                   

                                  The data is accessible as normal when initially added and prior to the purge process - Therefore must be fully referenced, hence I am not sure why some of the data is marked for deletion due to no references.

                                   

                                  Question: Can we get around this issue by manually removing all references from the trash folder? if so how often do we need to check the existence of the trash folder and binary data marked for deletion.

                                   

                                  Any feedback on this would be much appreciated?

                                   

                                  Thanks in advance

                                   

                                  Mazhar

                                  • 14. Re: org.modeshape.jcr.value.binary.BinaryStoreException
                                    wesssel

                                    Do you have any code to test this with?


                                    Easiest way to prevent deletion would be to override the FileSystemBinaryStore to not actually delete anything from the FS on removal of binaries.

                                    1 2 Previous Next