6 Replies Latest reply on Sep 25, 2015 8:59 AM by hchiorean

node.remove() performance problem over cifs binary storage

ludopaquet Sep 24, 2015 4:52 AM

Hi all,

We use ModeShape 4.3 with a shared CIFS folder (/data/repo/data/binaries) for binary storage.

The Infinispan storage is local.

First, we had a performance issue when getting metadata. Setting minimumBinarySizeInBytes to 4096 solved the problem.

"storage" : {
    "cacheName" : "Repo",
    "cacheConfiguration" : "infinispan-config.xml",
    "binaryStorage" : {
        "type" : "file",
        "directory" : "/data/repo/data/binaries",
        "minimumBinarySizeInBytes" : 4096
    }
}

Then we noticed that node.remove() on a parent node with about 1600 child file nodes takes a very long time:

the code for removing plus session.save(): 760 ms

committing the JTA transaction: 2 min 30 s

Do you know what's wrong?

Many Thanks,

1. Re: node.remove() performance problem over cifs binary storage

hchiorean Sep 24, 2015 5:24 AM (in response to ludopaquet)

As per the JCR spec, when you remove a node, all its children have to be removed as well. This means that if you call node.remove() on a node which has lots of children, all those children will be loaded into memory as well (recursively), which could take time. The reason they are loaded into memory is that JCR requires all sorts of validations (e.g. updating strong references). Moreover, if the binary data is not shared with other nodes (i.e. the files are unique in the repository), they will be removed from the binary store as well, which means removing them via the network (CIFS).
Technically, removing from the binary store means moving them into the "trash" area of the store, so essentially this is a "move" operation from the FS perspective.

If loading all the children recursively in memory and then moving them to trash via CIFS is not the issue, you should profile and investigate where the problem is coming from.
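
To illustrate why a "remove" that is really a move still costs file system round trips, here is a minimal sketch using only the JDK. It is not ModeShape's code: the class name TrashMoveSketch, the method moveToTrash and the "trash" directory name are assumptions made for this example. Each call performs at least a directory creation check and a rename, and on a CIFS mount each of those goes over the network.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical illustration only; not ModeShape's actual implementation.
final class TrashMoveSketch {

    // Moves a stored binary file into a trash directory instead of deleting it.
    // Returns the new location of the file.
    static Path moveToTrash(Path binaryFile, Path trashDir) throws IOException {
        // Make sure the trash directory exists (one or more file system calls).
        Files.createDirectories(trashDir);
        Path target = trashDir.resolve(binaryFile.getFileName());
        // The "remove" is a rename from the file system's point of view.
        return Files.move(binaryFile, target, StandardCopyOption.REPLACE_EXISTING);
    }

    public static void main(String[] args) throws IOException {
        // Uses a temporary directory so the example is safe to run anywhere.
        Path root = Files.createTempDirectory("binaries");
        Path file = Files.write(root.resolve("example.bin"), new byte[] {1, 2, 3});
        Path trashed = moveToTrash(file, root.resolve("trash"));
        System.out.println("moved to " + trashed);
    }
}
```

With about 1600 unique binaries, a sequence like this runs once per file, so even a small per-call network latency adds up.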
2. Re: node.remove() performance problem over cifs binary storage

ludopaquet Sep 24, 2015 11:51 AM (in response to hchiorean)

Many thanks, Horia. VisualVM says that the culprit is findFile, called from getTrashFile.
3. Re: node.remove() performance problem over cifs binary storage

ludopaquet Sep 24, 2015 12:13 PM (in response to ludopaquet)

Maybe I can mount the trash folder locally? I see that this folder stays ridiculously small even after a node is removed.
4. Re: node.remove() performance problem over cifs binary storage

hchiorean Sep 25, 2015 1:23 AM (in response to ludopaquet)

That method is really simple: https://github.com/ModeShape/modeshape/blob/master/modeshape-jcr/src/main/java/org/modeshape/jcr/value/binary/FileSystemBinaryStore.java#L288
At most, it will create 3 directories representing the structure which will hold the SHA1 of a binary file, all using the standard java.io.File API. So I guess the performance problem comes from java.io.File having to do this over the network.

I suspect that googling the performance of java.io over CIFS will lead to more people reporting the same performance issues.
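
A rough sketch of the kind of layout described above, using the standard java.io.File API. The three two-character directory levels taken from the start of the SHA-1, and the names Sha1LayoutSketch, sha1Hex and fileFor, are assumptions for illustration; check the linked source for the actual logic.

```java
import java.io.File;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical illustration of a SHA-1 based directory layout.
final class Sha1LayoutSketch {

    // Returns the lowercase hex SHA-1 of the given content.
    static String sha1Hex(byte[] content) throws NoSuchAlgorithmException {
        byte[] digest = MessageDigest.getInstance("SHA-1").digest(content);
        StringBuilder hex = new StringBuilder(digest.length * 2);
        for (byte b : digest) {
            hex.append(String.format("%02x", b & 0xff));
        }
        return hex.toString();
    }

    // Builds root/ab/cd/ef/<sha1>, optionally creating the three parent directories.
    // The two-character segment width is an assumption made for this sketch.
    static File fileFor(File root, String sha1, boolean createParentDirs) {
        File first = new File(root, sha1.substring(0, 2));
        File second = new File(first, sha1.substring(2, 4));
        File third = new File(second, sha1.substring(4, 6));
        if (createParentDirs) {
            // Up to three directory creations; each is a network call on CIFS.
            third.mkdirs();
        }
        return new File(third, sha1);
    }

    public static void main(String[] args) throws Exception {
        File root = java.nio.file.Files.createTempDirectory("trash").toFile();
        String sha1 = sha1Hex("abc".getBytes(StandardCharsets.UTF_8));
        System.out.println(fileFor(root, sha1, true));
    }
}
```

Every existence check and mkdirs() on such a path is a separate file system operation, which is cheap locally but can be slow over a network share.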
5. Re: node.remove() performance problem over cifs binary storage

ludopaquet Sep 25, 2015 8:53 AM (in response to hchiorean)

Mounting the trash folder locally seems to do the trick. It's not large enough to require a large amount of disk space. I guess the problem is creating many small files.
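
One way to check that guess is to time the creation of many small files under a local directory and under the CIFS mount, then compare. The sketch below is a plain JDK micro-benchmark, not part of ModeShape; the class name SmallFileCreationTimer, the directory naming scheme and the default count of 1600 are assumptions chosen to mirror the case in this thread.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.concurrent.TimeUnit;

// Hypothetical micro-benchmark: creates many small files in nested directories.
final class SmallFileCreationTimer {

    // Creates `count` one-byte files, each in its own two-level directory,
    // and returns the elapsed wall-clock time in milliseconds.
    static long timeCreateMillis(Path root, int count) throws IOException {
        long start = System.nanoTime();
        for (int i = 0; i < count; i++) {
            Path dir = root.resolve(String.format("%02x", i % 256))
                           .resolve(String.format("%04x", i));
            Files.createDirectories(dir);
            Files.write(dir.resolve("file-" + i), new byte[] {0});
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    public static void main(String[] args) throws IOException {
        // Pass the directory to test as the first argument (e.g. a path on the
        // CIFS mount); without arguments a local temporary directory is used.
        Path root = args.length > 0
                ? Paths.get(args[0])
                : Files.createTempDirectory("small-files");
        long millis = timeCreateMillis(root, 1600);
        System.out.println("created 1600 files under " + root + " in " + millis + " ms");
    }
}
```

Running it once against a local path and once against a path on the share gives a rough comparison; results depend heavily on the network and on mount options.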

Maybe a future release of the file binary store could offer an option to map the trash folder to another path?

Many thanks for helping me.
6. Re: node.remove() performance problem over cifs binary storage

hchiorean Sep 25, 2015 8:59 AM (in response to ludopaquet)

Feel free to log an enhancement request.