5 Replies Latest reply on Sep 1, 2013 6:47 AM by kbachl

    Modeshape for ecommerce web application

    subhrajyotimoitra

      Hello Modeshapers,

       

      We are doing a feasibility study on developing our own ecommerce platform (SaaS, similar to the Demandware/hybris/WCM ecommerce suites) using ModeShape (JCR based). Really!!

       

      It will offer the usual ecommerce facilities: shopping cart, inventory management, multiple store fronts, multi-channel sales management, vendor/contractor management, a mini CRM, etc.

      Analytics and personalization will also be features of this platform. The platform will have to support anywhere from 1,000 to 5,000 products, which move out every 3 months to be replaced with new products.

      About 9-12 different product categories are to be supported. We expect traffic to reach around 5-6 million per month within 7 months of launch.

       

      The site is currently on Magento, and we have convinced management to invest in an in-house solution that will last at least 5-7 years of continuous operation before being replaced by something more mature and reliable.

       

      Is it a good idea to use JCR to develop ecommerce flows, both customer-facing (the web front, where orders are placed) and internal business process workflows (like fulfillment and supplier management)?

      Has anyone attempted this before using ModeShape or another JCR implementation?

      Customer-facing sites have availability and performance as priorities, while business processes have consistency as a priority. Can I address these priorities using ModeShape?

       

      I understand that one size won't fit all. However, can a JCR (hierarchical structure) be used to model the ecommerce domain and its processes?

      What problems might I face if I go ahead with ModeShape? Is ModeShape only meant to be used to "integrate" different data sources?

      What if I want to develop a PIM (product information management) system using ModeShape/JCR? Is this a good idea?

      What problems do ModeShape experts see with this?

       

      Are there ecommerce implementations already using JCR to store all of their product, customer, sales and inventory information?

       

      Any advice pertaining to this will be highly appreciated.

       

      Thanks a lot in advance,

      Cheers,

      Subhro.

        • 1. Re: Modeshape for ecommerce web application
          kbachl

          Hi,

           

          We use ModeShape as a JCR backend for our ecom app and I can really recommend it. However, you need to clearly separate the use cases and tools before you start on something.

           

          Some hints:

           

          1. Use ModeShape as CMS storage, where you store pages, templates and on-site config data for the shop instance;
          2. Don't use ModeShape as an RDBMS replacement, meaning: use a real DB for customer data, products etc., where you need to enforce data integrity and ACID compliance;
          3. Make yourself more familiar with JCR. You wrote, for example, "however, can a JCR (hierarchical structure)", which isn't true. JCR is NOT hierarchical by definition. It can be, but doesn't need to be. For example, you can store data as a graph of nodes beneath a parent, or you can store those nodes (if there aren't too many) at the root and just fetch what you need with one of the JCR 2 query languages;
          4. Have a look at http://www.day.com/specs/jcr/2.0/index.html for JCR 2, as well as the ModeShape 3 documentation. Besides that, take time to understand that ModeShape implements JCR 2 but covers even more use cases, as it supports things beyond the JCR spec, like federation;
          5. The traffic you estimated means you also need to take care of how you deliver data. Think about a separate CDN for static data (e.g. CloudFront, Akamai) to lower the pressure on the systems themselves. Don't query the DB itself for product data; use a distributed search and query facility like http://www.elasticsearch.org/ to get the data to deliver, taking pressure off the DB, while the search cluster can quite easily be run on many nodes;

           

          I think those should get you started. If you want to see a ModeShape ecom app online you can visit http://www.whiskyworld.de/ - the site is in German, however, and our traffic is also quite a bit lower than what you are aiming for.

           

          Best,

           

          KB

          • 2. Re: Modeshape for ecommerce web application
            subhrajyotimoitra

            Hello KB,

            Thanks a lot for taking time to answer the query.

             

            Re: How to categorize ModeShape in the world of NoSQL database?

             

            This post mentions that ModeShape is ACID compliant.

             

            I was actually going to use ModeShape as the underlying persistence store for all functionality, so the code only deals with JCR nodes.

            Using ModeShape's capability to persist across RDBMS, filesystem and NoSQL stores, the application code mostly has to deal with a single form of data access.

            You are right about #3; I think it depends on how one designs the data model.

             

            For #5 I was inclined to use Infinispan. We don't really need global CDNs, and I think Infinispan can take care of most of the caching issues.

             

            http://wiki.apache.org/jackrabbit/DavidsModel

             

            This link gives some best practices on the topic. I have a data model design like this:

             

            /ecomorganiztion/products/product1

            /ecomorganiztion/products/product1/skus/ (child nodes are actual sku items depending on whatever attributes this product has)

            /ecomorganiztion/products/product1/attributes/size (child nodes, stores the actual size values for this product)

            /ecomorganiztion/products/product1/attributes/color

            /ecomorganiztion/products/product1/attributes/pattern

             

            /ecomorganiztion/products/product1/images/img1.jpg

            /ecomorganiztion/products/product1/video/vid1.mkv

             

             

            /ecomorganiztion/customers/customer1

            /ecomorganiztion/customers/customer1/order11 (stores path to products)

            /ecomorganiztion/customers/customer1/order11/history (stores change history for the order)

            /ecomorganiztion/customers/customer1/order11/comments (order comments)

             

            /ecomorganiztion/suppliers/supplier1

             

            Any comments on this? I think one way to get a hold of this monster is to first collect the main use cases (currently there are some 700+ use cases for the entire solution) and build out this data model prior to development.

             

            Please comment.

             

            Thanks,

            Subhro.

            • 3. Re: Modeshape for ecommerce web application
              kbachl

              @ACID: Well, yes, ModeShape is ACID. However, you also need to take into account that other systems will need access to the data, and therefore you need a connector that supports those ACID guarantees. With an RDBMS this is easy, as every connector has it; with JCR it's quite hard, if not impossible. Think about it: by your estimate you will use the system for 5-7 years. You will need third-party tools on some occasions, and SQL and RDBMSs have been proven for data storage and integrity enforcement since the 1970s. Using an RDBMS also doesn't mean your code needs to connect to it via a different path, as ModeShape's federation can feed you that data.

               

              Of course you can store all in ModeShape - but I still wouldn't suggest this.

               

              @5: You are mixing up how to store something with how to deliver it cost-effectively and scalably! Imagine an average visitor sees 5 pages with 100 product images in total. Why serve those 100 requests from your own server, eating your bandwidth and driving up latency, when you can have it all for nearly no cost from e.g. AWS CloudFront (only the first request for each image per caching cycle goes to your server; the rest is delivered from their network)? IMHO this is a no-brainer: if you have 1,000 visitors in parallel, you'll need to serve 100,000 images from your server. Give each one 100 KB and you're at a total of 10,000 MB for those parallel visitors. Now compare that to http://aws.amazon.com/cloudfront/#common-use-cases (look at the media distribution use case!) and the prices they charge against the servers and hosting you would need instead...
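
              The arithmetic above can be reproduced with a small sketch. The class and method names are hypothetical, and the inputs are just the assumed figures from this post (1,000 parallel visitors, 100 images each, 100 KB per image), not measurements.

```java
// Back-of-the-envelope estimate of image payload served from your own server.
// All inputs are assumptions taken from the post, not measured values.
final class ImageTraffic {

    /** Total payload in MB, using 1 MB = 1000 KB as the post does. */
    static long totalMegabytes(long visitors, long imagesPerVisitor, long kbPerImage) {
        long totalKb = visitors * imagesPerVisitor * kbPerImage;
        return totalKb / 1000;
    }

    public static void main(String[] args) {
        // 1,000 visitors x 100 images x 100 KB each
        System.out.println(totalMegabytes(1_000, 100, 100) + " MB");
    }
}
```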

               

              @design-model:

               

              I see some design flaws in there:

               

              /ecomorganiztion/products/product1/skus/ (child nodes are actual sku items depending on whatever attributes this product has)

              /ecomorganiztion/products/product1/attributes/size (child nodes, stores the actual size values for this product)

              /ecomorganiztion/products/product1/attributes/color

              /ecomorganiztion/products/product1/attributes/pattern

               

              This might get you into trouble. Each item is not "one" set of data, but usually consists of two different sets.

              1st: Data you *require*: this has to be enforced! - Talking about ID, manufacturer, price, tax value, etc.

              2nd: Data you *want*: this is what enhances the item on the page: info for the customer, nice-to-know details, etc.

               

              Storing attributes outside the item itself will lead to stale data and chaos after some time, because invalid data will eventually get inserted no matter what you do, believe me! You need to make sure you enforce a fixed attribute set for #1. You can achieve this with custom node type definitions and by storing the data within the item itself!
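
              As an illustration of the required/optional split, here is a minimal plain-Java sketch. It is not ModeShape or JCR API; in a repository the same rule would be expressed through the custom node type definition. The class name and the required attribute names (taken from the examples in this post) are hypothetical.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Plain-Java illustration only: enforce a fixed set of required attributes
// and let everything else pass through as optional "nice to have" data.
final class ProductValidator {

    // Hypothetical required set, based on the examples named in the post.
    static final Set<String> REQUIRED = Set.of("id", "manufacturer", "price", "taxValue");

    /** Returns the required attribute names that are missing or blank. */
    static Set<String> missingRequired(Map<String, String> attributes) {
        Set<String> missing = new TreeSet<>();
        for (String name : REQUIRED) {
            String value = attributes.get(name);
            if (value == null || value.trim().isEmpty()) {
                missing.add(name);
            }
        }
        return missing;
    }
}
```

              A save routine would reject the item whenever missingRequired returns a non-empty set, while optional attributes such as color or size are never checked.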

               

              /ecomorganiztion/products/product1/images/img1.jpg

              /ecomorganiztion/products/product1/video/vid1.mkv

               

              Well, if you use federation to store it outside, it might work. If you instead store it in the same workspace: bad, bad idea. Think about backups and the size your workspace will reach. A video easily has 100 MB. With 2,500 products in the DB (half of the max), you store about 250 GB of data on video alone. By year 5 you have 15,000 items in the workspace (3,000 items exchanged each year) and your average video size has grown to 250 MB (hey, it's 2018 by now!), so you now have 3.75 TB on video alone, stored alongside your items. Sure this is a good idea?
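
              The storage estimate can be checked with a short sketch. The names are hypothetical, and it assumes one video per product with the sizes quoted in this post.

```java
// Rough workspace growth from video binaries, assuming one video per product.
// Figures are the assumptions from the post, not measurements.
final class VideoStorage {

    /** Total size in GB, using 1 GB = 1000 MB. */
    static long totalGigabytes(long products, long mbPerVideo) {
        return products * mbPerVideo / 1000;
    }

    public static void main(String[] args) {
        // Today: 2,500 products at 100 MB per video
        System.out.println(totalGigabytes(2_500, 100) + " GB");
        // Year 5: 15,000 accumulated items at 250 MB per video
        System.out.println(totalGigabytes(15_000, 250) + " GB");
    }
}
```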

               

              /ecomorganiztion/customers/customer1

              /ecomorganiztion/customers/customer1/order11 (stores path to products)

              /ecomorganiztion/customers/customer1/order11/history (stores change history for the order)

              /ecomorganiztion/customers/customer1/order11/comments (order comments)

               

              This part will *not* work!

               

              You can't just store the path to the products for orders! What if a customer buys item A for $5.00 and you change its price to $5.50 the next day? What if you change the product description a year later and need to reprint that invoice?

              You need to understand that the lines in orders and receipts may *NEVER* *EVER* change after they are created - legally, even the ordering of the lines is usually important. So you need to copy all the required data in and store it there "forever", enhanced with even more data depending on the legal requirements that apply.
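
              A minimal sketch of this idea in plain Java follows: the order line copies the product data at purchase time instead of keeping a reference to the product. All class and field names are hypothetical, and this is not ModeShape or JCR API.

```java
import java.math.BigDecimal;

// Sketch: an order line stores a copy of the product data, never a reference.
final class OrderSnapshot {

    /** Mutable catalog entry; its price and name may change at any time. */
    static final class Product {
        String name;
        BigDecimal price;

        Product(String name, BigDecimal price) {
            this.name = name;
            this.price = price;
        }
    }

    /** Immutable order line; all values are fixed when the line is created. */
    static final class OrderLine {
        private final String productId;
        private final String productName;
        private final BigDecimal unitPrice;
        private final int quantity;

        private OrderLine(String productId, String productName, BigDecimal unitPrice, int quantity) {
            this.productId = productId;
            this.productName = productName;
            this.unitPrice = unitPrice;
            this.quantity = quantity;
        }

        /** Copies the current product values into a new order line. */
        static OrderLine snapshotOf(String productId, Product product, int quantity) {
            return new OrderLine(productId, product.name, product.price, quantity);
        }

        String productId() { return productId; }
        String productName() { return productName; }
        BigDecimal unitPrice() { return unitPrice; }

        BigDecimal lineTotal() {
            return unitPrice.multiply(BigDecimal.valueOf(quantity));
        }
    }
}
```

              Changing the catalog price after the order was placed leaves the order line untouched, which is the property a reprinted invoice depends on.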

               

              Overall, I think you might want to get the source code of Magento or another big open source ecommerce solution to get a grasp of the data modeling requirements. For example, look at the ER schema of Oxid, a smaller ecommerce app: http://docu.oxid-esales.com/CE/dbdocumentation/OXID_eShop_CE_4.3.0_26948_DB_schema.png  and after that follow the Magento schema (but not before you understand why nobody really wants an EAV way of modelling things: http://stackoverflow.com/questions/4066463/should-i-use-eav-model ): http://www.magentocommerce.com/wiki/2_-_magento_concepts_and_architecture/magento_database_diagram

               

               

              Best,


              KB

               

              PS: You might now also understand why a good part of this should be in an RDBMS and not purely in JCR.

              • 4. Re: Modeshape for ecommerce web application
                subhrajyotimoitra

                Thanks again, KB, for your detailed replies and references.

                Below are some ways I could solve the issues you mention above.

                Replacing the RDBMS: I am keeping this question open for now.

                 

                Storing images:

                /ecomorganiztion/products/product1/images/img1.jpg

                This is a separate repo (filesystem connector). It has the same hierarchy as the product data repo, but stores only the binary content corresponding to each product.

                Basically, there are different repos for the actual domain data and for its related binary content. Will this work?


                Order/invoice items:

                You are right, these never change, so I would just copy all the "current" product data, the coupon applied, the tax applicable to the product etc. as properties of the order-item node (a child of the "order" node). The same applies to shipping and shipping items. Won't this work?

                Data integrity, as you say, is not enforced. I don't see this as a problem, since I can enforce referential integrity of nodes as required. I have often found key references to be a source of problems during data migration for databases with a lot of them.

                 

                Product/Attributes:

                 

                I am not too keen to follow an EAV kind of model for all product properties. A product will have a constant set of properties, like name, SKU, description, meta-x, price etc. On top of this it will also have an "attribute set"/"option set". For example, I can have sets like "footwear", "topwear" etc., which store only the variable part of the product: footwear has 7 sizes and topwear has 3 sizes, plus color or other attributes. Since this data won't change too frequently, I thought it would be adequate to just store it as child nodes of a product.

                What do you think, will this work? I am not sure I understand how stale data would get in. When a product is edited, a set of its node properties will change. When the attribute set changes, a new set of attribute nodes will be added to the product and the previous attribute nodes deleted.

                 

                Federation:

                I haven't completely understood how to leverage this, but if I develop an Oxid-like schema, can I expose it through ModeShape's federation? If so, isn't this an ideal solution? I would get to store things the RDBMS way with data integrity, and the code would only have to deal with nodes.

                 

                Let me know what you think.

                 

                Thanks,

                Subhro.

                • 5. Re: Modeshape for ecommerce web application
                  kbachl

                  Apologies for the late answer; I had a rough week and little time.

                   

                  @Storing images:

                   

                  Yes, that would work. However, you need to understand that in a JCR repo each item is identified by its UUID, and the path is just one way of querying for it. You could also use a different approach, like defining a special ImageNode type that has the product ID in a field and then querying on that field. A node can also have path aliases, meaning you can have more than one path to a node;

                   

                  @Order/ Invoices:

                   

                  Yes, you can copy those. Make sure you don't get into trouble with backward references and/or path aliases. I would still go the RDBMS way for that, as data enforcement and third-party access are vital here. The migration problems you mentioned might be hard to solve, but unlike those, incorrect data can't be fixed at all;

                   

                  @Product/Attributes

                   

                  Child nodes are a classic JCR principle and there is nothing wrong with having them. But remember that your ecom system also needs to know the metadata so that it can check that those nodes and their attributes are correct;

                   

                  @Federation

                  I don't understand what you mean here. If you use federation, you connect to another resource by leveraging the ModeShape system. However, the restrictions of the underlying data store still apply, meaning you can't turn an RDBMS into a NoSQL store just by setting up a federated connector. (That is how I understand it, AFAIK.)

                   

                  Best,

                   

                  KB
