SAVARA identifies service repository usage requ...| JBoss.org Content Archive (Read Only)

1. Re: SAVARA identifies service repository usage requirements

objectiser Oct 14, 2009 4:35 PM (in response to jeffdelong)

Along with reviewing the requirements of SAVARA in terms of its use of the repository, I also wanted to return to a previous topic on this forum, that of the repository structure.

This is because one of the issues with SAVARA using the repository is that, although services are an important part of the lifecycle, SAVARA actually starts earlier in the development lifecycle, and therefore needs to deal with other types of artifacts (such as requirements and architectural models) that are defined prior to any services being created.

In the previous thread on the topic of repository structure, I believe a hierarchy approach was being explored, where services would be contained in a top level 'services' node, and other top level nodes for 'schemas', 'policies', etc.

However I see this as imposing a structure on an organisation, whereas they may want to group their services and artifacts in their own way - e.g. by business unit.

So I was wondering whether a tag or query based approach might be better - so we have the repository as unstructured, but define the structure we want to present to a particular type of user just in the view (i.e. console).

So for someone interested in services, that would be their main view - based on a query looking for services in the repository. Perhaps the GUI could enable the views to be customised for each user, or for the group the users belong to.

Similarly for architects, they may only be interested in the architectural models as being their main focus.

However, whatever artifacts/components (e.g. services) are returned from a query and presented in a user's view, it should be possible to easily navigate from those components to other related components. So in the case of a service, it should be easy to navigate to dependent contracts (wsdl), information models (ddl, xsd), etc.

It should also be possible to easily add new dependencies to components - so if a service, then it should be possible to query other artifacts in the repository that can be associated with a service (e.g. wsdl, xsd, etc). Appropriate rules could be used to determine what artifacts can be associated with each other, and validation rules to determine whether the components and their dependencies are correct.
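
As a rough illustration of the association and validation rules described above, here is a minimal Python sketch. The rule table, the `Repository` class and all artifact names are hypothetical and are not part of DNA, Guvnor or SAVARA; they only show how a rule could limit which artifacts may be linked to a service.

```python
# Hypothetical rule table: which artifact types a given type may depend on.
ALLOWED_DEPENDENCIES = {
    "service": {"wsdl", "xsd"},
    "wsdl": {"xsd"},
}


class Repository:
    """Toy in-memory repository; not a JCR or DNA API."""

    def __init__(self):
        self.types = {}         # artifact name -> artifact type
        self.dependencies = {}  # artifact name -> names it depends on

    def add(self, name, artifact_type):
        self.types[name] = artifact_type
        self.dependencies[name] = []

    def candidates(self, name):
        """Artifacts that the rules allow `name` to depend on."""
        allowed = ALLOWED_DEPENDENCIES.get(self.types[name], set())
        return [n for n, t in self.types.items() if t in allowed and n != name]

    def add_dependency(self, source, target):
        """Link source -> target, rejecting links the rule table does not allow."""
        if target not in self.candidates(source):
            raise ValueError(f"{source} may not depend on {target}")
        self.dependencies[source].append(target)


repo = Repository()
repo.add("OrderService", "service")
repo.add("order.wsdl", "wsdl")
repo.add("order.xsd", "xsd")
repo.add("security-policy", "policy")
repo.add_dependency("OrderService", "order.wsdl")
repo.add_dependency("order.wsdl", "order.xsd")
```

Here `candidates` plays the role of the query for artifacts that can be associated with a service, and `add_dependency` plays the role of the validation rule.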

So just to clarify the approach - for anyone who has used Google Mail - think about the way it works. There are no folders or hierarchies - each email just has one or more labels, and the user just decides which label they are interested in - and a particular email could have multiple labels and therefore be seen in multiple views.

The reason why I would prefer not to have any structure imposed on the repository itself is that it provides greater flexibility in using DNA as the repository implementation - as (for example) DNA may federate a number of existing repositories with different artifacts in each repository. We need to be able to reuse those artifacts without forcing them to be restructured into our predefined hierarchy.
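
To make the label idea concrete, here is a minimal Python sketch. The `Artifact` class, the label names and the sample artifacts are all made up for illustration; the point is only that one artifact can appear in several views without being stored in any hierarchy.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    name: str
    labels: set = field(default_factory=set)


def view(artifacts, label):
    """Names of all artifacts carrying the given label, in stored order."""
    return [a.name for a in artifacts if label in a.labels]


# A flat, unstructured repository: structure exists only in the labels.
repo = [
    Artifact("OrderService", {"service", "sales"}),
    Artifact("order.wsdl", {"contract", "sales"}),
    Artifact("PurchasingChoreography", {"architecture", "sales"}),
    Artifact("PayrollService", {"service", "hr"}),
]

services_view = view(repo, "service")  # what a service-oriented user sees
sales_view = view(repo, "sales")       # what the sales business unit sees
```

`OrderService` shows up in both views, in the same way that an email with two labels shows up under both.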

2. Re: SAVARA identifies service repository usage requirements

jeffdelong Oct 14, 2009 5:41 PM (in response to jeffdelong)

"objectiser" wrote:
In the previous thread on the topic of repository structure, I believe a hierarchy approach was being explored, where services would be contained in a top level 'services' node, and other top level nodes for 'schemas', 'policies', etc.

I have not looked at the previous forum posting recently, but I was always talking about what things looked like from the view of the user, i.e. the GUI application.

I expect Guvnor to provide a UI that exposes Services, similar to how Drools Guvnor provides a UI that exposes Rule Packages. However, as there are many more types of artifacts in a SOA context than in rules, I would expect to see other categories of content, which would include things like architectural models, schemas, etc. This is consistent with other repository products I have looked at.

I don't expect the UI to look like Google Mail, where as a user I would have to provide the organization of the content; I expect this organization to be provided by the product. I think that having structure provided by the product can still allow for flexibility within a deployment, particularly if this is planned for. E.g. the capability to use the business unit (or domain) as a way of partitioning what the user sees can be accounted for in the design.

As a user I am not that concerned about the underlying repository structure, although I think a JCR repository by its nature does provide some kind of hierarchical structure to the content, and I would suspect that a repository without any structure (i.e., no leaves on the tree) would be both inefficient and require more complex queries.

3. Re: SAVARA identifies service repository usage requirements

objectiser Oct 14, 2009 5:53 PM (in response to jeffdelong)

I agree about not wanting the user to have to define the structure - but I was using Google Mail to illustrate how the content can be organised based on some notion other than a hierarchical structure.

So I would expect an out of the box product to define a set of predefined 'views' that would present the relevant components of interest - although it should be customisable by an administrator to define new component types and rules governing their behaviour in the repository - and which user groups may be interested in them.

It would be interesting to see whether Jervis believes having an unstructured repository, with an appropriate query mechanism to present the relevant user view, would be less efficient - especially considering the benefits it could offer when using a federated DNA repository.

4. Re: SAVARA identifies service repository usage requirements

rhauch Oct 14, 2009 6:11 PM (in response to jeffdelong)

"objectiser" wrote:
The reason why I would prefer not to have any structure imposed on the repository itself is that it provides greater flexibility in using DNA as the repository implementation - as (for example) DNA may federate a number of existing repositories with different artifacts in each repository. We need to be able to reuse those artifacts without forcing them to be restructured into our predefined hierarchy.

Actually, DNA can project multiple branches from each source. For example, one source repository could project content under "/A/services" into "/services" area of the unified repository, and could also project "/A/policies" into the "/policies" area of the unified repository. Another source could project content under "/foo/services" into "/services" in the unified repository. Thus, "/services" in the unified repository would contain content that comes from both sources, while "/policies" only comes from the first source.

DNA uses a simple notation for projections:
{path_in_source} => {path_in_unified}

Thus, the two projections for the first source would be written:
/A/services => /services
/A/policies => /policies
while the second source's projection would be written:
/foo/services => /services
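
The effect of these projections can be sketched in a few lines of Python. This only mimics the notation shown above; it is not DNA's configuration format or API, and the function names and sample node names are invented for illustration.

```python
def parse_projection(rule):
    """Split a '{path_in_source} => {path_in_unified}' rule into its two paths."""
    source_path, unified_path = (part.strip() for part in rule.split("=>"))
    return source_path, unified_path


def project(projections, source_path):
    """Map a path in one source to its unified path, or None if it is not projected."""
    for source_prefix, unified_prefix in projections:
        if source_path == source_prefix or source_path.startswith(source_prefix + "/"):
            return unified_prefix + source_path[len(source_prefix):]
    return None


first_source = [
    parse_projection("/A/services => /services"),
    parse_projection("/A/policies => /policies"),
]
second_source = [parse_projection("/foo/services => /services")]

from_first = project(first_source, "/A/services/OrderService")
from_second = project(second_source, "/foo/services/PayrollService")
not_projected = project(second_source, "/foo/policies/audit")
```

Both sources contribute children to "/services" in the unified view, while anything under "/foo/policies" in the second source is simply not visible.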

5. Re: SAVARA identifies service repository usage requirements

rhauch Oct 14, 2009 6:23 PM (in response to jeffdelong)

"objectiser" wrote:
It would be interesting to see whether Jervis believes having an unstructured repository, with an appropriate query mechanism to present the relevant user view, would be less efficient - especially considering the benefits it could offer when using a federated DNA repository.

Any JCR repository is likely to retrieve the children of a node much faster than executing a query. In DNA, navigation is certainly much faster.

On the other hand, using queries is probably much more flexible.

There's a hybrid possibility. For quite some time, I've wanted to provide a feature in DNA that allows for defining a "virtual node" where the children are dynamically generated from a query (stored as a property on the "virtual node"). This means that the structure is dynamically generated based upon the queries. This technically is not possible in JCR 1.0, since nodes can only have one parent, but JCR 2.0 does introduce the notion of shared nodes.

I haven't thought this through too much, but I'm sure the DNA community would love to hear any thoughts or use cases on the topic.
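
A minimal Python sketch of the "virtual node" idea follows. It is purely hypothetical: the classes are not JCR or DNA types, and the stored query is stood in for by a Python predicate rather than a real query string.

```python
class Node:
    """An ordinary node stored at one physical path."""

    def __init__(self, path, node_type):
        self.path = path
        self.node_type = node_type


class VirtualNode:
    """A node whose children are computed from a stored query, not stored under it."""

    def __init__(self, path, query):
        self.path = path
        self.query = query  # stands in for the query held as a property

    def children(self, all_nodes):
        # Children are evaluated on demand, so they always reflect current content.
        return [n for n in all_nodes if self.query(n)]


nodes = [
    Node("/sales/OrderService", "service"),
    Node("/sales/order.wsdl", "wsdl"),
    Node("/hr/PayrollService", "service"),
]

services = VirtualNode("/views/services", lambda n: n.node_type == "service")
service_paths = [n.path for n in services.children(nodes)]
```

Each service keeps its single physical parent ("/sales" or "/hr") and only appears to be a child of "/views/services", which is why this sits awkwardly with the one-parent rule mentioned above.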

6. Re: SAVARA identifies service repository usage requirements

jeff.yuchang Oct 15, 2009 12:25 AM (in response to jeffdelong)

I like Gary's unstructured repository idea. We will provide an out-of-the-box perspective, and it would be great if we could also offer the flexibility to have other perspectives, say by using queries and labels.

-Jeff

7. Re: SAVARA identifies service repository usage requirements

objectiser Oct 15, 2009 4:14 AM (in response to jeffdelong)

"rhauch" wrote:

DNA uses a simple notation for projections:
{path_in_source} => {path_in_unified}

Thus, the two projections for the first source would be written:
/A/services => /services
/A/policies => /policies
while the second source's projection would be written:
/foo/services => /services

This would be ok if the federated repository organised its artifacts in a similar manner, but I don't think we could rely on this.

I like the idea of the "virtual node", although I think just having a "perspective" (as Jeff called it) based on a predefined query would avoid needing this additional capability in the repository, so it could work with a JCR 1.0 repo for now. Although I think it would be great if DNA could support a concept like "virtual nodes" :)

8. Re: SAVARA identifies service repository usage requirements

jervisliu Oct 18, 2009 10:45 AM (in response to jeffdelong)

First, let's agree on our requirements. I can see two requirements have been raised here:

1. The repository needs to deal with many different types of artifacts. These artifacts can be SOA specific, such as WSDL, XSD, JBoss ESB configuration, etc. They can also be domain specific, such as CDL files, or generic files, such as a requirements doc. It is likely that the SOA repository will be used to manage some unknown (unsupported) artifact types. When this happens, the repository needs to meet the following:

1.1 When the repository is used to handle an unknown artifact type, it should be able to manage the file as a generic type and provide the most common functions such as store, version, search/index, manage common metadata, etc. However, because this is an unknown type, the repository won't be able to provide a specialized editor in the GUI, won't be able to extract specific metadata, and won't be able to do any validation or policy enforcement.

1.2 The repository should be extensible or pluggable, to allow support for new types to be added easily. For example, to support a new artifact type called CDL file, a jar file would be dropped onto the SOA repository's classpath. This jar contains a specific GUI editor for CDL, an extended JCR node definition that is used to store the CDL file, and a corresponding remote API that is used to access the JCR node. This way, support for a new artifact type can be added without recompiling the SOA repository code, or even without restarting the SOA repository server.

2. Different types of users may want to view repository content differently. E.g., Admin/IT users want to view the repository in a service-centric view. Business users want to view the repository by business unit or by department. This requirement can be addressed by two slightly different approaches:

2.1. The SOA repository provides two fixed views. One is a service-centric view (organizing artifacts under services, schema, etc.). The other is a category-based view, where the categories are defined by users. This is similar to what Drools Guvnor does: it provides two views to access its content, per package or per category, with per package being the default view.

2.2. The SOA repository allows users to create customized views.

Personally I think #1.1, #1.2 and #2.1 are must-haves. #2.2 is nice to have, but not compulsory. The category-based view can already satisfy most cases if it is used properly.

Now let's come back to our real question: which repository structure is better for fulfilling requirements 1 and 2? BTW, a "hierarchy approach" can be found in [1].

I do not yet have a clear idea of how to implement #2.2; maybe some pre-stored queries or a "virtual node", as Randall suggested, is the answer. Both the "hierarchy approach" and the "unstructured approach" would have no problem implementing #2.1 and #2.2. For example, to implement #2.1 using the "hierarchy approach", we can add an extra attribute called "category" on each node. But I would argue that the hierarchy structure is better, because it is optimized for the main view, while the unstructured approach is not optimized for any view.

Both the "hierarchy approach" and the "unstructured approach" would have no problem implementing 1.1 and 1.2 as well. But again, performance is the main difference.

[1]. http://www.jboss.org/index.html?module=bb&op=viewtopic&t=155196

9. Re: SAVARA identifies service repository usage requirements

objectiser Oct 19, 2009 4:37 AM (in response to jeffdelong)

I agree 1.1 and 1.2 are must haves.

I think 2.1 should only be considered a short term solution working towards 2.2 - if this is a "faster route to market" - i.e. if the existing code base works this way, then use it for now. However 2.2 provides greater flexibility, and enables any organisation to view their repository in whatever way they want.

For example, we could have the concept of 'views' that contain 'categories', where the content of the categories is populated by a query. When the user selects a queried node in any of the categories, the 'node explorer' displays the node attributes, child nodes, link to parent node, dependent node refs, dependencies, etc. - allowing the user to navigate through the graph, and having a 'back' button to return. Completely generic, but can be delivered out of the box with a selection of views that could potentially be customised on a group/role/user basis.
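
A small Python sketch of this 'views contain categories populated by queries' idea, with a very reduced 'node explorer'. Everything here (the `VIEWS` table, role names, node fields and sample data) is hypothetical and only meant to show the shape of the approach.

```python
nodes = [
    {"name": "OrderService", "type": "service", "depends_on": ["order.wsdl"]},
    {"name": "order.wsdl", "type": "wsdl", "depends_on": []},
    {"name": "PurchasingChoreography", "type": "model", "depends_on": []},
]

# Each view is a set of named categories; each category is filled by a query.
VIEWS = {
    "developer": {
        "Services": lambda n: n["type"] == "service",
        "Contracts": lambda n: n["type"] == "wsdl",
    },
    "architect": {
        "Architectural Models": lambda n: n["type"] == "model",
    },
}


def render_view(role, nodes):
    """Populate every category of the role's view by running its query."""
    return {
        category: [n["name"] for n in nodes if query(n)]
        for category, query in VIEWS[role].items()
    }


def explore(name, nodes):
    """Node explorer: attributes plus links the user can navigate along."""
    node = next(n for n in nodes if n["name"] == name)
    dependents = [n["name"] for n in nodes if name in n["depends_on"]]
    return {
        "attributes": {"type": node["type"]},
        "dependencies": list(node["depends_on"]),
        "dependents": dependents,
    }


developer_view = render_view("developer", nodes)
architect_view = render_view("architect", nodes)
wsdl_details = explore("order.wsdl", nodes)
```

Customising a view for a group or role then amounts to editing the query table, with no change to how the nodes are stored.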

Structured vs Unstructured - this seems to be a case of performance vs flexibility - so I would be interested in understanding the performance difference and whether it would have any impact on user experience. System to system comms - performance is very important, the faster the better. Human to system comms - performance can only get so fast before it becomes irrelevant.

10. Re: SAVARA identifies service repository usage requirements

jervisliu Oct 21, 2009 9:55 PM (in response to jeffdelong)

"objectiser" wrote:
I think 2.1 should only be considered a short term solution working towards 2.2 - if this is a "faster route to market" - i.e. if the existing code base works this way, then use it for now. However 2.2 provides greater flexibility, and enables any organisation to view their repository in whatever way they want.

I do agree that 2.2 provides a lot more flexibility and is definitely a cool feature we want to have. However I am not sure about this sentence: "enables any organisation to view their repository in whatever way they want". Yes, it would be great if an organization could view the repository in whatever way they want. This is the space where customizable views or category-based views (like Drools Guvnor has) can play their part. However, we are talking about a SOA Repository product, not a generic repository product, aren't we? As a SOA Repository product, should it not have a primary view which is service-centric? This view is mandatory, and its structure is fixed, i.e., under the root node of the view, it always has sub-nodes called services, schema, docs, etc. This view can still be extended, though, i.e., more nodes can be added under the root node or sub-nodes.

If we can agree on the primary view (service-centric view) mentioned above, then I would say a correspondingly structured repo would be the most natural choice. Custom views can be built upon this structure using queries etc.

11. Re: SAVARA identifies service repository usage requirements

jervisliu Oct 22, 2009 10:55 PM (in response to jeffdelong)

"objectiser" wrote:
Structured vs Unstructured - this seems to be a case of performance vs flexibility - so I would be interested in understanding the performance difference and whether it would have any impact on user experience. System to system comms - performance is very important, the faster the better. Human to system comms - performance can only get so fast before it becomes irrelevant.

Drools Guvnor has similar performance problems. When there are over 1000 nodes at the same level, performance suffers.

12. Re: SAVARA identifies service repository usage requirements

jeff.yuchang Oct 22, 2009 11:35 PM (in response to jeffdelong)

I think for the tree widget, we show at most 30 (50?) nodes. For a large number of nodes, I would still prefer to have a 'list' widget (or what others call a grid widget) to get around the performance issue.
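
A sketch of that choice in Python, assuming a cut-off of 50 (the post leaves the exact number open) and a paged list for anything larger. The function names and the threshold constant are illustrative only.

```python
TREE_LIMIT = 50  # assumed cut-off; the exact value is still undecided


def choose_widget(child_count):
    """Use the tree for small levels and a paged list for large ones."""
    return "tree" if child_count <= TREE_LIMIT else "list"


def page(children, page_number, page_size=TREE_LIMIT):
    """Return one page of children so the UI never renders the full level."""
    start = page_number * page_size
    return children[start:start + page_size]


children = [f"node-{i}" for i in range(1200)]
widget = choose_widget(len(children))
second_page = page(children, 1)
```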

Regards
Jeff

13. Re: SAVARA identifies service repository usage requirements

objectiser Oct 26, 2009 6:08 AM (in response to jeffdelong)

To answer Jervis, I think when the repository is provided as an SOA Repository within SOA-P, then yes the main view should be service centric - however when considered in the context of SAVARA, it also needs to have business-centric views as well.

So it is possible to provide "out of the box" views that are appropriate for the context in which the repository is being used.

"jervisliu" wrote:
If we can agree on the primary view (service-centric view) mentioned above, then I would say a correspondingly structured repo would be the most natural choice. Custom views can be built upon this structure using queries etc.

I don't agree that service centric view is the primary view - I think it is an important view in the context of SOA-P. However if the repo is being used as part of SAVARA, then the initial set of artifacts are not service specific. In this context, the business requirements and architecture may be seen as the main view, with the services being the resulting implementation details (i.e. a secondary view).

In terms of the performance issue - enabling the tree structure to be more freely organised would mean that we don't get into situations where nodes have 1000+ child nodes. Consider how users organise their file systems: they very rarely have folders with large numbers of children, as it becomes unmanageable. If the repository is physically organised in the same way, it makes traversal of the hierarchy easy. We then provide canned and ad hoc query support for presenting views onto that data - and where the result set is too large, we offer search refinement to help manage the view.

14. Re: SAVARA identifies service repository usage requirements

jeffdelong Oct 26, 2009 10:40 AM (in response to jeffdelong)

"objectiser" wrote:
then the initial set of artifacts are not service specific.

While it is true that the initial artifacts, e.g., a choreography model, are not service specific, the intent of their creation is the identification of services. Though these initial artifacts may not be initially stored as part of a particular service, they would quickly be associated with a service.

"objectiser" wrote:
In this context, the business requirements and architecture may be seen as the main view,

I think it would be fine to have other top-level hierarchies / views than just services, e.g., Models, or Documents, or Business Processes; however, I don't see these as the main view. Services are still what the user is most interested in.

"objectiser" wrote:
with the services being the resulting implementation details (i.e. a secondary view).

Services being the resulting implementation details is a very narrow view. Services need to be defined by the user very early on in the life-cycle, and managed according to governance policies defined by the SOA repository product. I do think it might be useful in the repository to make a distinction between the service as a business entity and the service in terms of its implementation details (e.g., a web service). I.e., a business service has a service implementation.
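
The distinction between a business service and its implementation could be modelled roughly as below. This Python sketch is only an assumption about one possible shape; the class names, fields and sample values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class ServiceImplementation:
    kind: str      # e.g. "web service"
    contract: str  # e.g. the WSDL that describes it


@dataclass
class BusinessService:
    """The service as a business entity, defined early in the life-cycle."""

    name: str
    related_artifacts: list = field(default_factory=list)  # e.g. models that led to it
    implementations: list = field(default_factory=list)    # added later


# The business service exists before any implementation does.
order_handling = BusinessService("Order Handling")
order_handling.related_artifacts.append("PurchasingChoreography")
defined_early = len(order_handling.implementations) == 0

# Later, a business service has a service implementation.
order_handling.implementations.append(ServiceImplementation("web service", "order.wsdl"))
```

This keeps early artifacts such as a choreography model associated with the service from the start, while the implementation details are attached once they exist.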