-
1. Re: ModeShape suitability for medium sized binary files
rhauch Jan 30, 2013 9:41 AM (in response to gmlopezdev) ModeShape can indeed handle very large files, though how fast depends on several different things. If you haven't already, look at our documentation that describes how ModeShape handles Binary values. There are several options to choose from, based upon the topology you're looking for.
A non-clustered server could use the FileSystemBinaryStore, which really just forwards all calls directly to the underlying java.io.File objects managed in the binary store. All access is via buffered file input/output streams. Thus, this option will be fast and should handle files as large as the OS can handle directly.
Clustered topologies need a shared binary store, and we also have several options here as mentioned in the documentation. Most of them can handle very large files, but the need to have a shared, distributed store will increase overhead. Be sure to test performance on your own hardware.
(Sorry for the many updates. For some reason, I had a lot of trouble entering this post.)
-
2. Re: ModeShape suitability for medium sized binary files
gmlopezdev Jan 30, 2013 2:27 PM (in response to rhauch) Hi Randall, thank you very much for your reply! It is very helpful, as is the link you provided. As per the documentation, it appears that a Mongo data store could also be configured specifically as the binary store.
If you do not mind, could you please point me to some code samples for uploading/retrieving files from ModeShape?
Thanks again!
-
3. Re: ModeShape suitability for medium sized binary files
rhauch Jan 30, 2013 2:57 PM (in response to gmlopezdev) The key is that the content of a file would be stored as a Binary value in a property on some node:
// Create a buffered input stream for the file's contents ...
InputStream stream = new BufferedInputStream(new FileInputStream(file));
// Create a node where we'll store the content ...
Node node = parentNode.addNode("myfile", "nt:unstructured");
// Upload the file to that node ...
Binary binary = session.getValueFactory().createBinary(stream);
node.setProperty("content", binary);
// Save the session ...
session.save();
The two important lines, which deal with setting a Binary value, are just after the "Upload the file to that node" comment.
Getting the file's content back out is pretty easy:
// Get an input stream to the binary value ...
Binary content = node.getProperty("content").getBinary();
long size = content.getSize();
InputStream stream = content.getStream();
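If the goal is to write the retrieved content back to disk, a minimal sketch might look like the following. It uses only the JDK, so the stream passed in stands for whatever `content.getStream()` returned; the class name `BinaryDownload` and the method `saveTo` are hypothetical names chosen for illustration and are not part of ModeShape or JCR.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical helper, not part of ModeShape or the JCR API.
final class BinaryDownload {
    private BinaryDownload() {}

    /**
     * Copies everything from the stream into the target file, replacing any
     * existing file, and closes the stream when done (even on failure).
     * Returns the number of bytes written.
     */
    static long saveTo(InputStream stream, Path target) throws IOException {
        try (InputStream in = stream) {
            return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
        }
    }
}
```

With the JCR snippet above, the call would be something like `BinaryDownload.saveTo(content.getStream(), somePath)`. Once the application is finished with the value, JCR 2.0 also lets it call `content.dispose()` to release any resources the Binary holds.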
Obviously these examples do not show storing any other information about the file (e.g., no other metadata), and the resulting node is somewhat arbitrary in structure. You would likely add properties you find interesting, and design the node to suit your needs.
Now, the JCR specification actually pre-defines some node types that are expressly intended to be used in nodes that represent files and folders. Here's an example of code (note that the Binary part is largely the same):
Calendar lastModified = Calendar.getInstance();
lastModified.setTimeInMillis(file.lastModified());
// Create a buffered input stream for the file's contents ...
InputStream stream = new BufferedInputStream(new FileInputStream(file));
// Create an 'nt:file' node at the supplied path ...
Node fileNode = folder.addNode(file.getName(), "nt:file");
// Upload the file to that node ...
Node contentNode = fileNode.addNode("jcr:content", "nt:resource");
Binary binary = session.getValueFactory().createBinary(stream);
contentNode.setProperty("jcr:data", binary);
contentNode.setProperty("jcr:lastModified", lastModified);
// Save the session (which auto-creates the remaining properties) ...
session.save();
This code creates two nodes (one for the file itself, the other for its content), and it sets additional properties, including several (e.g., "jcr:mimeType", "jcr:created" and "jcr:createdBy") that are all set automatically upon save. Again, this is just one way to store files, but you're absolutely free to use whatever node structure you want. For example, some applications might want to store information about a patient, including scanned documents. I would imagine that the documents would appear *under* the patient node, and could be "nt:file" nodes or other custom node types (that may or may not subtype "nt:file").
I suggest these links to learn more about "nt:file" and "nt:folder":
-
4. Re: ModeShape suitability for medium sized binary files
gmlopezdev Jan 31, 2013 12:25 PM (in response to rhauch) Thanks again Randall!
Let me ask you one more question. The available documentation states that repositories are not good for storing large files, although it also states that they can be stored outside of the repository, without any further comment. I believe you when you say that ModeShape is suitable for that purpose, but the documentation is a bit contradictory. Could you please clarify for me?
-
5. Re: ModeShape suitability for medium sized binary files
rhauch Jan 31, 2013 12:49 PM (in response to gmlopezdev) That documentation says (emphasis mine):
JCR repositories are good at storing files, but binary values are accessed (via Java streams) and are thus less useful for storing very large files (e.g., GB in size)
This means that you can store files of any size, although very large files (I'd guess starting around 1GB) start to become more time-consuming to access simply because this requires the files to be processed with Java streams. Now, ModeShape might still be fine for very large files/content that are normally streamed to the client (e.g., videos). But if delivery of these large files must be as fast as possible, then perhaps storing them on the file system may reduce the overhead of accessing these files.
Hope this helps. I'll update the documentation to reflect this subtle distinction.
-
6. Re: ModeShape suitability for medium sized binary files
gmlopezdev Jan 31, 2013 1:19 PM (in response to rhauch) Very helpful again! Just to avoid misunderstandings, what do you actually mean by "storing them on the filesystem"? Do you mean not using a content repository at all, or using the FileSystemBinaryStore?
-
7. Re: ModeShape suitability for medium sized binary files
rhauch Jan 31, 2013 1:28 PM (in response to gmlopezdev) I suggest storing the files outside the repository only if your application has some more efficient way to read/write the files (e.g., copying the files using system functions). On the other hand, if you're just going to use Java file IO (or NIO) to read and write those files, then putting them inside the repository is perfectly fine, since ModeShape also uses Java file IO and will be just as efficient as your application.
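To make the distinction concrete, here is a rough JDK-only sketch of the two routes an application might take outside the repository: an ordinary buffered stream copy (the same style of access described above for the repository) and a channel-based copy via `FileChannel.transferTo`, which on some platforms can hand the work to the operating system. The class and method names are hypothetical, and whether the second route is actually faster depends on the OS, file system and file size, so it would need to be measured.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

// Hypothetical illustration; neither method is part of ModeShape.
final class CopyStrategies {
    private CopyStrategies() {}

    /** Plain Java stream IO: every byte passes through a buffer in the JVM. */
    static long copyWithStreams(Path source, Path target) throws IOException {
        long total = 0;
        byte[] buffer = new byte[8192];
        try (InputStream in = new BufferedInputStream(Files.newInputStream(source));
             OutputStream out = new BufferedOutputStream(Files.newOutputStream(target))) {
            int read;
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
                total += read;
            }
        }
        return total;
    }

    /** Channel transfer: the JVM may delegate the copy to the operating system. */
    static long copyWithChannels(Path source, Path target) throws IOException {
        try (FileChannel in = FileChannel.open(source, StandardOpenOption.READ);
             FileChannel out = FileChannel.open(target,
                     StandardOpenOption.CREATE,
                     StandardOpenOption.WRITE,
                     StandardOpenOption.TRUNCATE_EXISTING)) {
            long size = in.size();
            long position = 0;
            // transferTo may move fewer bytes than requested, so loop until done.
            while (position < size) {
                position += in.transferTo(position, size - position, out);
            }
            return position;
        }
    }
}
```

If an application would only ever use something like `copyWithStreams`, the advice above applies: keeping the files inside the repository should cost about the same.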