I have read the great article about Selecting the right connectors and have run a quick test of both types of connector. We are considering implementing our repository project with the Disk Connector instead of the Filesystem Connector, because queries just don't work properly with the Filesystem Connector in our setup, as highlighted here. However, we have some concerns about the migration plan:
1. Repository Structure
- Using the Filesystem Connector, we can basically just map the existing files and folders on disk as nt:file and nt:folder, and we can still add extra metadata using mixins. We understand the performance is not as good as with the Disk Connector, but we can still really "see" the files and folders with the underlying OS file explorer or any other application that can access the filesystem.
- Using the Disk Connector, we have no problem converting existing and new files and folders into a Disk Connector repository, and I believe we can still map them as nt:file and nt:folder. However, the Disk Connector is like a new filesystem that is specific to ModeShape and can't be mounted (in the Linux sense). It just does not feel as "secure" as the Filesystem Connector.
Concern: Is the Disk Connector stable and reliable? We would depend on it entirely to access the files and folders within the repository. Could the whole repository become corrupted and inaccessible because something goes wrong during a CRUD operation through the Disk Connector? If it really does crash, can we still recover the repository, fully or partially?
2. Query Performance
- Querying is our main reason for considering the Disk Connector because, as mentioned, it currently does not work for us with the Filesystem Connector.
Concern: Let's say we have millions of files (e.g. pictures or documents), each with 10 metadata properties (e.g. Tag, Author, Resolution...), all stored in a single repository managed by the Disk Connector. The repository has a multi-level directory structure in which the top-level directory indicates the owner, e.g. "[Repository Root]/User1/folder1/files...." or "[Repository Root]/User2/anyfolder/anyfiles....". Will there be any performance problem if, for example, we ask the Query object to return all JPG pictures with a Tag like "holiday" taken during the year 2010? *Assuming we can construct the Query statement properly*
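To make the question concrete, this is roughly the kind of JCR-SQL2 statement we have in mind. It is only a sketch under our own assumptions: the mixin acme:photo and its properties acme:tag and acme:taken are hypothetical names, PhotoQuerySketch is our own helper and not part of ModeShape, and we have not measured how such a query performs. The values are inlined here for illustration only.

```java
import java.util.Locale;

public class PhotoQuerySketch {

    // Doubles single quotes so a value can sit inside a JCR-SQL2 string literal.
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // Builds a JCR-SQL2 statement for "JPEG pictures under one owner's folder,
    // tagged like <tag>, taken during <year>".
    // [acme:photo], [acme:tag] and [acme:taken] are hypothetical names.
    static String build(String ownerPath, String tag, int year) {
        String from = String.format(Locale.ROOT, "%04d-01-01T00:00:00.000Z", year);
        String to = String.format(Locale.ROOT, "%04d-01-01T00:00:00.000Z", year + 1);
        return "SELECT photo.* FROM [acme:photo] AS photo"
             + " JOIN [nt:resource] AS content ON ISCHILDNODE(content, photo)"
             + " WHERE ISDESCENDANTNODE(photo, " + quote(ownerPath) + ")"
             + " AND content.[jcr:mimeType] = 'image/jpeg'"
             + " AND photo.[acme:tag] LIKE " + quote("%" + tag + "%")
             + " AND photo.[acme:taken] >= CAST(" + quote(from) + " AS DATE)"
             + " AND photo.[acme:taken] < CAST(" + quote(to) + " AS DATE)";
    }

    public static void main(String[] args) {
        // With a live javax.jcr.Session the string would be passed to
        // session.getWorkspace().getQueryManager()
        //        .createQuery(sql, javax.jcr.query.Query.JCR_SQL2).execute();
        System.out.println(build("/User1", "holiday", 2010));
    }
}
```

Restricting the query with ISDESCENDANTNODE to one owner's top-level folder is how we would expect to scope a search to a single user inside the one big repository.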
Concern: Alternatively, would it be better to split the one big repository into multiple repositories, each owned by a specific user (let's say we have thousands of concurrent users)? Would that cause thousands of repository instances to be created on the app server, consuming an extremely large amount of RAM and CPU? And would it cause further problems if we want to run a cross-repository query over a group of users?
Can anyone please help? Thanks in advance.