We use ModeShape 5.4.1 in our product to store all kinds of data. We are using the FileSystemBinaryStorage, and the binaries are stored in a folder /binaries. I will first explain the whole situation and then ask some questions.
Running "du -sch" on the folder reports a size of about 230GB. We also have a tool that iterates over all nt:file nodes and reads the size from the jcr:data property. That calculation yields around 116GB (almost exactly half).
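To narrow down where the space goes, the on-disk side of this comparison can be broken down per top-level entry of the binaries folder. The following is only a minimal sketch using the standard Java file API, not part of ModeShape: the class name BinaryStoreUsage and its methods are hypothetical, and /binaries is just our default path. It sums apparent file sizes, whereas "du" without extra options reports allocated blocks, so the totals can differ when there are many small files.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Stream;

// Hypothetical helper, not a ModeShape class: reports bytes used per
// top-level entry of a binary store folder.
class BinaryStoreUsage {

    /** Sums the apparent sizes of all regular files below dir. */
    static long totalBytes(Path dir) throws IOException {
        try (Stream<Path> files = Files.walk(dir)) {
            return files.filter(Files::isRegularFile)
                        .mapToLong(p -> {
                            try {
                                return Files.size(p);
                            } catch (IOException e) {
                                throw new UncheckedIOException(e);
                            }
                        })
                        .sum();
        }
    }

    /** Returns the bytes used by each direct child of root, keyed by name. */
    static Map<String, Long> usageByTopLevelEntry(Path root) throws IOException {
        Map<String, Long> usage = new TreeMap<>();
        try (Stream<Path> children = Files.list(root)) {
            for (Path child : (Iterable<Path>) children::iterator) {
                long bytes = Files.isDirectory(child)
                        ? totalBytes(child)
                        : Files.size(child);
                usage.put(child.getFileName().toString(), bytes);
            }
        }
        return usage;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: the store lives under /binaries unless a path is given.
        Path root = Paths.get(args.length > 0 ? args[0] : "/binaries");
        usageByTopLevelEntry(root).forEach((name, bytes) ->
                System.out.printf("%-20s %,d bytes%n", name, bytes));
    }
}
```

If the store keeps files other than the live binaries in their own subfolder (we have not verified whether it does), that subfolder would show up as a separate line in this output.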
When we first installed the system, versioning was activated, but it was deactivated later. After deactivating versioning, we also deleted all versions except the current one. So right now each file has one nt:file node and one version with an nt:frozenNode, which, as far as we know, share the same binary. We also found versions in the version history whose binary is no longer used by any nt:file node (jcr:versionStorage/.../1.XX/jcr:frozenNode). Our tool would not count these binaries, but they would still take up storage.
1. What could cause versions to hold a binary that is not used by any nt:file node, or what could prevent these versions from being deleted?
2. When versioning is deactivated, how safe is it to delete versions that share a binary with an nt:file node? ("jcr:versionStorage/.../1.XX/jcr:frozenNode" and "/path/to/node/")
2.1 Or would it be safe to delete all versions, since they should not be used anyway?
3. Do you have any other ideas about what may be wasting our storage?
4. Is there a way to programmatically mark nodes to be cleaned up by the garbage collector?
Thanks in advance for your help.