Connectors were designed as an interface to an external system (external to the repository, that is). It is therefore by design that the SPI does not provide access to the JCR session (which, among other things, is a "heavyweight" object). A connector instance does have access to global repository information such as the transaction manager, the execution context, the environment, etc.
IMO trying to retrieve JCR nodes from within the connector indicates that the connector is trying to do too much.
Yes, I agree from a design point of view.
But in a real case, the external system does need some information stored in a JCR repository in order to generate other data, which we would then get via a new connector under the ModeShape framework.
Without the connector, the external system could launch a separate JCR repository. But since the external system is integrated into the JCR repository, it seems natural (at least to me) for it to retrieve the information from the existing JCR repository in a simple way.
Further, this case becomes much more common if the required data in the JCR repository was simply migrated or imported from another data source, e.g. a DB.
Please let me know what your opinion is.
Of course, launching another JCR repository from within the connector is not a great idea/option.
That being said, in my opinion, if the connector needs some sort of additional data - additional to the external system itself - it should look at storing that data outside of a repository in the most lightweight & easy way possible (for example storing extra information on the filesystem). This kind of approach is used by the existing FileSystemConnector when storing additional information for files and folders.
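To make the "store it next to the file" idea concrete, here is a minimal sketch in Java. It is not the FileSystemConnector's actual code: the SidecarStore class and the ".extra.properties" suffix are hypothetical names chosen for illustration, and the sketch only uses java.util.Properties and java.nio.file from the JDK.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

/** Hypothetical helper: keeps extra properties for a file in a sibling "sidecar" file. */
class SidecarStore {
    // Assumed naming convention, for illustration only.
    private static final String SUFFIX = ".extra.properties";

    /** Returns the sidecar path that sits next to the given file. */
    static Path sidecarFor(Path file) {
        return file.resolveSibling(file.getFileName() + SUFFIX);
    }

    /** Writes the extra properties for the given file, replacing any previous sidecar. */
    static void save(Path file, Properties extra) throws IOException {
        try (OutputStream out = Files.newOutputStream(sidecarFor(file))) {
            extra.store(out, "extra properties for " + file.getFileName());
        }
    }

    /** Reads the extra properties, or returns an empty set if no sidecar exists. */
    static Properties load(Path file) throws IOException {
        Properties extra = new Properties();
        Path sidecar = sidecarFor(file);
        if (Files.exists(sidecar)) {
            try (InputStream in = Files.newInputStream(sidecar)) {
                extra.load(in);
            }
        }
        return extra;
    }
}
```

A connector could call something like SidecarStore.load(path) when it builds the document for a file and SidecarStore.save(path, props) when extra properties change, without ever touching a JCR session.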
If an external system needs to store information in a JCR repository, that sounds to me like a mix between a JCR/repository enhancement and an actual connector.
So we should not migrate a large amount of data from a database into a JCR repository if a connector (external system) needs any of that data?
Can you explain in a little more detail what kind of information the connector needs? So far, our objective is that the repository uses the connectors, but the connectors don't really know much about the repository or other repository content. We've never had an external system that was dependent upon the repository.
In my case, a GlacierConnector tries to get the archive list of a Vault in Glacier. This is invoked from methods like getDocumentById(), similar to listing files in a file system. But the connector can only get the result from Glacier about 4 hours after each request.
So my first thought was to store that archive list information in a repository, since we are already in a repository system.
Of course, I could put it in a DB too (not the file system, as there would be a large amount of data in the future). But that would increase the complexity of the system, not to mention the maintenance effort and cost.
While discussing this with Horia, I found another, more common case: migrating an old system to a JCR repository:
1) an old system has a large amount of data in a DB;
2) that data may be shared with other systems via a DB connection interface;
3) now all the data is migrated to a JCR repository;
4) a new interface is implemented in the old system (and the other systems) to retrieve the data from the new repository;
5) we cannot use the old system through a connector interface, because the connector cannot retrieve the data from the repository.
What do you think?
Okay, I think I understand what you're trying to do. You basically want ModeShape to contain information about the stuff you're storing in Glacier, and you want applications to access that stuff through ModeShape. However, the challenge with Glacier is the potentially significant latency - like you said the response might be delivered hours after you make the request to Glacier.
So IIUC you essentially want to use ModeShape as a cache of information stored in Glacier. Unfortunately, the connector framework was not really designed to store parts of a subtree in an external system; rather, it was intended to expose information in an external system as a single, complete subtree.
Have you thought about just having the code that talks to Glacier sit along side the repository (not underneath it) and simply store the results inside the ModeShape repository? This Glacier component could also modify and annotate the stored content as necessary. For example, you might store the URL to some archived content as a property, and anything that you've recently retrieved could be stored/cached inside the repository. Then, your application would look for the information inside ModeShape and, if everything is there (including the cached archived content) simply access it. Anything not cached would be directly accessed via the URLs in the content and then cached inside ModeShape.
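As a rough illustration of that flow, here is a minimal Java sketch of the "look in the repository first, otherwise fetch and cache" logic. Everything in it is an assumption for illustration: ArchiveCache is a hypothetical class, the in-memory map stands in for content stored in ModeShape (a real component would read and write JCR nodes through a session), and the slowFetch function stands in for whatever code actually talks to Glacier.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

/** Hypothetical cache-aside component that sits alongside the repository. */
class ArchiveCache {
    // Stand-in for content cached in the repository, keyed by archive id.
    // Not thread-safe; a real component would rely on the repository itself.
    private final Map<String, byte[]> repository = new HashMap<>();

    // Stand-in for the slow retrieval from the external archive (assumed signature).
    private final Function<String, byte[]> slowFetch;

    // Counts how often the slow path was taken, to make the caching visible.
    private int slowFetchCount = 0;

    ArchiveCache(Function<String, byte[]> slowFetch) {
        this.slowFetch = slowFetch;
    }

    /** Returns cached content if present; otherwise fetches it and caches the result. */
    byte[] get(String archiveId) {
        byte[] cached = repository.get(archiveId);
        if (cached != null) {
            return cached; // already in the repository, no external call needed
        }
        byte[] content = slowFetch.apply(archiveId); // potentially very slow
        slowFetchCount++;
        repository.put(archiveId, content); // cache for later readers
        return content;
    }

    int slowFetchCount() {
        return slowFetchCount;
    }
}
```

With this shape, calling get("archive-1") twice only hits the external system once; the second call is served from the cached copy. In practice the slow fetch would be asynchronous given the latency involved, which this sketch leaves out.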
This doesn't have quite the transparency that you might achieve with a connector-based approach, but it allows you to leverage the full capability of ModeShape. Thinking way outside the box, you could hide all of this with a second higher-level repository that uses a connector to talk to the first JCR repository. Personally, that seems like overkill, though you'll have to be the judge of that.
Hope this helps.
Thanks Randall and Horia.
It looks like those are the only options we have at the moment.