18 Replies Latest reply on Aug 14, 2005 7:25 PM by Andrew Oliver

    Project planning/Roadmap

    Andrew Oliver Master

      So I'm not a huge believer in "project management" in the traditional sense or in fancy roadmaps. However, I'm trying to work out goals for the 1.0 release to ensure we're working on the must-haves for an enterprise-grade mail server.

      I'd like some feedback. This isn't to say that we won't do any of the things I've kicked out of the 1.0 release, but since 1.0 has a tentative release date of March or so, one must prioritize. Moreover, it's important that we aim for a production release as soon as feasible, as it is for every project.

      Please take a look at the roadmap:

      http://jira.jboss.com/jira/browse/JBMAIL

      Meanwhile here are a few decisions I made in the interest of time that I view as more controversial:

      1. No NIO until the 2.0 release: http://jira.jboss.com/jira/browse/JBMAIL-94 - while this probably limits the "per instance" or "per server" scale of the 1.0 release, it probably increases our performance on some platforms and reduces complexity dramatically. While I do think NIO is a good idea, and that not dedicating a thread per connection is a good idea (proven more or less by the Tomcat clan), I think we can achieve adequate performance and scale for 1.0 without it. I suspect we can make bigger gains in the area of datastore management and memory tradeoffs than this. Moreover, I think we'd do best to wait until the underlying OS and JVM support for this is more mature across all platforms.

      2. No "over HTTP" protocol stuff until the 2.0 release: http://jira.jboss.com/jira/browse/JBMAIL-42 - While Exchange 2003 actually has something like this as well and I think it is a good idea, it's not widely used now. It's something that will take time to do and probably won't be used widely anyhow. It does mean that we won't have as good support for organizations with peculiar firewall rules (port 80 is holy and nothing bad can happen there, right?), but I don't think it limits us enough to make it worth our while for 1.0.

      3. Exchange protocol port pushed to 2.0. We have a lot of road to travel and this will be a big effort. It's not just one protocol but several things that have to be matched. The Active Directory stuff alone will take time. Next, it requires a software investment, which means it is probably a "rich committers" and "employee-only" effort. I think once the project is more widespread we can probably get shared resources for things like this, but right now I don't think it is feasible. Sadly, it's the thing I most want to do!

      4. Aggregate message store stuff pushed to M5: http://jira.jboss.com/jira/browse/JBMAIL-37 - There is a lot of stuff in M4 and I want to get our milestones happening regularly again. I have only budgeted 1 month for M4 (pretty ambitious since I do have a week's trip to Japan in there). There might be some slippage but I'd rather move features to M5 than miss the date now.

      5. Basic IMAP for M4. I just mean the LIST command and maybe getting a mail or two from Tbird. It's probable that this will be neither a complete nor a workable implementation, just some basic support for Tbird's IMAP.

      There are a number of things I haven't detailed out yet:

      http://jira.jboss.com/jira/browse/JBMAIL-65

      These will mostly be aimed at M6+. I need to research them more heavily. It would also be helpful if experienced people and potential/existing users sounded off in those areas. Please distinguish between "nice to have", "must have", and "must have for 1.0".

        • 1. Re: Project planning/Roadmap
          Thorsten Kunz Novice

           

          "acoliver@jboss.org" wrote:
          5. Basic IMAP for M4. I just mean the LIST command and maybe getting a mail or two from Tbird. It's probable that this will be neither a complete nor a workable implementation, just some basic support for Tbird's IMAP.

          Hm, I have a skeleton for IMAP support in the making. It is derived from the POP3 implementation from M2. So far I have only invested a few hours, but a few commands work and I was about to implement things like literals and command continuation. But I wanted to wait before going any further and posting a patch, for several reasons:

          1. Redesign of mailboxes to support hierarchies and flags. I remember somebody mentioned a redesign, and I didn't want to build on a deprecated interface.

          2. IMAP supports fetching of partial octets of a message within the FETCH command. This may have a huge performance impact with database-backed mail storage as long as there is no way to tell the mailbox interface to fetch only the desired part of the mail in an efficient way. Otherwise a user could easily try to fetch 1 octet for all of his 50 x 100meg mails via IMAP, and since (if I am not mistaken) the Folder fetches a mail completely from the DB before it returns it, that keeps the DB backend quite busy. So there is a need for a more granular way to fetch partial message content, like gimmeMailOctets(offset, length) (maybe doing it as part of the redesign for aggregated mail storage, JBMAIL-37, would make sense?)

          3. Optimistic locking for folders would be a huge plus

          4. I am moving from L.A. to Germany this/next week so I have little to no time to spare at the moment :(

          Cheers, Thorsten

          • 2. Re: Project planning/Roadmap
            Michael Barker Apprentice

             

            2. IMAP supports fetching of partial octets


            Ahaa, you be using M2...

            The current head has support for streaming mail bodies from the database, so we can do partial body reads. I'm not sure if IMAP needs to read information offset into the body e.g. FETCH 20 bytes from byte position 500. I think we can support this reasonably as well. I might need to do some work to ensure that the implementation of the InputStream.skip() method is efficient. This should be quite easy to do.
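To make the "20 bytes from byte position 500" case concrete, here is a minimal sketch of a ranged read on top of a plain `InputStream`. The class and method names (`RangeReader`, `readRange`) are made up for illustration and are not part of the JBoss Mail Server code. It only assumes that the store hands back a stream and that `skip()` may skip fewer bytes than requested, which is why it loops.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not an existing JBoss Mail Server class.
final class RangeReader {

    /** Reads up to 'length' bytes starting at 'offset'; returns fewer if the stream ends early. */
    static byte[] readRange(InputStream in, long offset, int length) throws IOException {
        long remaining = offset;
        while (remaining > 0) {
            long skipped = in.skip(remaining);
            if (skipped > 0) {
                remaining -= skipped;
            } else if (in.read() == -1) {
                return new byte[0];      // offset lies beyond the end of the body
            } else {
                remaining--;             // skip() made no progress; read() consumed one byte
            }
        }
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buf = new byte[4096];
        int left = length;
        while (left > 0) {
            int n = in.read(buf, 0, Math.min(buf.length, left));
            if (n == -1) {
                break;                   // body ended before 'length' bytes were available
            }
            out.write(buf, 0, n);
            left -= n;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        byte[] body = "Hello, IMAP partial fetch".getBytes(StandardCharsets.US_ASCII);
        byte[] part = readRange(new ByteArrayInputStream(body), 7, 4);
        System.out.println(new String(part, StandardCharsets.US_ASCII)); // prints "IMAP"
    }
}
```

How cheap this is depends entirely on how the underlying stream implements `skip()`: a stream that reads and discards bytes still drags the skipped data out of the database, while one that can seek does not.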

            http://jira.jboss.com/jira/browse/JBMAIL-65


            One of the things on the list is the WebMail client. In my day job I have been working (18 months so far) on a rich thin client (AJAX before it was called that) project. So if we were going to put together a GUI like GMail, which avoids form submission and uses XMLHttp instead, I have a little experience in that area. Another technology that would be interesting to build a WebMail client with is this: http://www.openlaszlo.org/. However, if we were going to go down the rich thin client route, I think we would need some form of HTTP access to mail, probably XML based, e.g. REST.

            Mike

            • 3. Re: Project planning/Roadmap
              Andrew Oliver Master

              Ha, if we made our thing half as nice as Gmail for 1.0 and half as functional as Outlook webmail, I'd be very happy. In my view the Outlook web client sucks because it is slow, but it does support nearly everything that lookout does... Gmail is a great balance between the two. Functional, appealing, yet minimalistic. Still, after looking at that Laszlo thing...damn, that is nice. It worked great in Firefox (optimal), Safari and IE 5.2 (last Mac version). So if you think that can be pulled off for this release (and it looked like it wasn't HORRIBLY difficult to use) then I'm game to add back the server side of the REST stuff. One thing I would like is if the server-side communication didn't have to connect over a socket but, as an option, did direct MBean calls from the presumptive servlet, while still being ABLE to go over a socket (to another server, possibly on another machine).

              One thing to note is that we'll have a supplier-competitor as a result: http://www.laszlosystems.com/products/laszloMail/. Not that that is a deal breaker but we'll have to make a better mail client than they do as a result or they'll use it as an upsell strategy. Which I don't care particularly as long as they make us "one of the leading IMAP servers" that they support and don't talk folks into other mail servers as part of the deal. If we go that direction I'll get the business side to see about partnerships and crap to make sure that goes smoothly.

              Also an option and probably as functional as laszlo is Mozilla's XUL. http://xulplanet.com/. There is an IE plugin but I more or less consider this single browser.

              I like that Laszlo thing a lot though and flash is pretty much ubiquitous. How else would we play even more annoying banner ads? ha ha.

              • 4. Re: Project planning/Roadmap
                Andrew Oliver Master

                Thorsten,

                Definitely look at what mike did for mail stores. Should be a major improvement in memory consumption cross database.

                -Andy

                • 5. Re: Project planning/Roadmap
                  Andrew Oliver Master

                  More Laszlo up/downsides:

                  http://enthusiasm.cozy.org/archives/2004/10/open-laszlo/

                  Good points about flash and openness.

                  http://www.oliviertravers.com/archives/2004/10/10/coming-soon-laszlo-without-presentation-server/

                  Also lists alternatives. It would be nice not to have to deal with the presentation server.

                  http://opensource.org/licenses/cpl.php

                  Speaking of licensing, Laszlo is under the CPL, which is fine for us (LGPL). According to the FSF it isn't GPL compatible, but I'm kinda okay with that (given that this is an overly technical objection: the GPL doesn't yet incorporate patent terminations, although they like the idea... I freaking LOVE patent terminations -- they make me VERY happy)

                  ----

                  Other thoughts:

                  Our WebMail platform can be maybe "adequate but not particularly good" for 1.0 if we make TBird/JBMS combo able to do Mail over REST. This would make maybe 80% of users happy because a large reason for webmail is that you don't have access to port 25/110 and friends. What is a question in my mind is how we can do this securely without writing our own HTTP server of sorts. Meaning, you need something like STARTTLS support and to essentially commandeer the stream.

                  Out of scope for 1.0 but adding an outlook plugin for that would be kickass (though we could just support Exchange's version).

                  thoughts?

                  -Andy

                  • 6. Re: Project planning/Roadmap
                    Thorsten Kunz Novice

                     

                    I'm not sure if IMAP needs to read information offset into the body e.g. FETCH 20 bytes from byte position 500.

                    From my understanding this is exactly what IMAP supports. Here is the part of the RFC that describes it (section 6.4.5):
                    It is possible to fetch a substring of the designated text. This is done by appending an open angle bracket ("<"), the octet position of the first desired octet, a period, the maximum number of octets desired, and a close angle bracket (">") to the part specifier.

                    But as I mentioned, I did use M2, so I'll check out Mike's changes in HEAD as soon as I have the time.
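For reference, the partial syntax quoted from the RFC looks like `BODY[TEXT]<500.20>` on the wire (offset 500, at most 20 octets). Below is a rough sketch of pulling the section, offset and length out of such a fetch item. `PartialSpec` is a hypothetical name, and the regular expression only covers the `BODY[...]` and `BODY.PEEK[...]` forms, not the full FETCH grammar.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical value class for illustration; not part of any existing IMAP stack.
final class PartialSpec {
    // BODY[section] or BODY.PEEK[section], optionally followed by <offset.count>
    private static final Pattern ITEM = Pattern.compile(
            "BODY(?:\\.PEEK)?\\[([^\\]]*)\\](?:<(\\d+)\\.(\\d+)>)?",
            Pattern.CASE_INSENSITIVE);

    final String section;   // e.g. "TEXT" or "1.2"; empty string means the whole message
    final long offset;      // first desired octet, or -1 when no range was given
    final long maxOctets;   // maximum number of octets, or -1 when no range was given

    private PartialSpec(String section, long offset, long maxOctets) {
        this.section = section;
        this.offset = offset;
        this.maxOctets = maxOctets;
    }

    boolean hasRange() {
        return offset >= 0;
    }

    static PartialSpec parse(String item) {
        Matcher m = ITEM.matcher(item.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a BODY[...] fetch item: " + item);
        }
        if (m.group(2) == null) {
            return new PartialSpec(m.group(1), -1, -1);   // no <offset.count> suffix
        }
        return new PartialSpec(m.group(1),
                Long.parseLong(m.group(2)), Long.parseLong(m.group(3)));
    }

    public static void main(String[] args) {
        PartialSpec p = parse("BODY[TEXT]<500.20>");
        System.out.println(p.section + " " + p.offset + " " + p.maxOctets); // prints "TEXT 500 20"
    }
}
```

The offset and length parsed here are exactly the two numbers a mailbox-level call such as the proposed gimmeMailOctets(offset, length) would need.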
                    ---
                    Another question: how to handle the IMAP SEARCH command (RFC section 6.4.4)? It has quite a few options and also includes basic AND/OR/NEGATION operators to combine multiple options. I think the right place to implement search methods is the mailbox/folder and not the protocol stack. I don't think I want to fetch all messages from a mailbox into the protocol instance only so that I can run a text search on their bodies. Do you guys already have a plan for that? :)

                    Thorsten

                    • 7. Re: Project planning/Roadmap
                      Michael Barker Apprentice

                      Searching is quite complex and probably out of scope for M4 (actual scope decisions are Andy's domain). What we really want to be able to do is leverage the capabilities of the database if at all possible. E.g. Oracle has some very good text indexing/searching functionality. We also need to have a generic solution that will work on pretty much any database. I was thinking of using Lucene to index emails. We would probably have a mail listener on the local delivery chain that would index incoming mails.

                      Mike.

                      • 8. Re: Project planning/Roadmap
                        Andrew Oliver Master

                        Thorsten, I would like it if you'd send me (acoliver at jboss dot org) your patches for IMAP so far. That way, in the unlikely event we do much more than wrap up M3, we won't duplicate your work while you're flying around. That, or upload them to the JIRA: http://jira.jboss.com/jira/browse/JBMAIL-41.

                        No, scope is everyone's domain; it just needs to be settled at the beginning of each milestone (hence these threads that I open). I'm just a moderator and documenter of such things. Now, I may draw harder lines on the total release, but that's just to make sure the release happens.

                        I feel that we have to walk before we can run. I can't imagine that we'll get far enough on IMAP for this milestone (1 calendar month in scope) to implement search, so it's not planned. I could really use some help on scoping out what we should/can implement for this milestone.

                        Mailbox and security stuff is a must for IMAP. We may not finish some of the mailbox refactoring for M4 (meaning there may be more iterations of that), but it's good to start.

                        I rather like the idea of doing indexing on the incoming. I'm not sure that Lucene would be appropriate for that (given that it's a merge index and that would be a veritable recipe for fragmentation). I may be out of date there, but I doubt it. It would also be good if we could somehow store our index in the DB. I've not thought up how to pull that off yet...

                        • 9. Re: Project planning/Roadmap
                          Joe Cheng Newbie

                           

                          "SunFire" wrote:
                          2. IMAP supports fetching of partial octets of a message within the FETCH command. This may have a huge performance impact with database-backed mail storage as long as there is no way to tell the mailbox interface to fetch only the desired part of the mail in an efficient way. Otherwise a user could easily try to fetch 1 octet for all of his 50 x 100meg mails via IMAP, and since (if I am not mistaken) the Folder fetches a mail completely from the DB before it returns it, that keeps the DB backend quite busy. So there is a need for a more granular way to fetch partial message content, like gimmeMailOctets(offset, length) (maybe doing it as part of the redesign for aggregated mail storage, JBMAIL-37, would make sense?)


                          Actually it's even worse than this. IMAP supports fetching of partial octets of a particular MIME sub-part. So unless you keep the MIME-parsed representation in the database (or cache some hints), you're at least going to have to MIME decode from the beginning until you get to the bodypart you want, and then offset into that. If you use a non-streaming MIME parser (such as JavaMail) then I think you'll have no choice but to load the entire message.

                          I wouldn't worry much about the "fetch 1 octet for all of his 50 x 100meg mails" scenario; you have to optimize your design for scenarios that are possible with reasonable mail clients, not pathological edge cases. (It is very reasonable for a client to be selective about what MIME parts it retrieves, though!)

                          • 10. Re: Project planning/Roadmap
                            Joe Cheng Newbie

                             

                            "mikezzz" wrote:
                            Searching is quite complex and probably out of scope for M4 (actual scope decisions are Andy's domain). What we really want to be able to do is leverage the capabilities of the database if at all possible. E.g. Oracle has some very good text indexing/searching functionality. We also need to have a generic solution that will work on pretty much any database. I was thinking of using Lucene to index emails. We would probably have a mail listener on the local delivery chain that would index incoming mails.


                            Some things to keep in mind. Searching is more complicated than just string matching in a body. According to the author[1] of the spec, each textual MIME part must be content-transfer decoded before searching, and non-textual parts are specifically excluded.

                            Lucene is designed for token matching, whereas IMAP search is supposed to be substring matches (think grep-like).

                            I'm not clear on how much people actually use server-side IMAP search in the real world, or how fast they expect it to be. At least for v1.0 I'd think you would be fine doing a brute force implementation of SEARCH.
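As a rough illustration of the brute-force approach, the sketch below does a case-insensitive substring match over the textual parts of each message and skips everything else. The `Part` class is a made-up stand-in for an already content-transfer-decoded MIME part, and the whole thing ignores charsets, headers and the AND/OR/NOT search keys; it is only meant to show why substring matching differs from token matching.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Illustrative only: 'Part' is a hypothetical, already transfer-decoded MIME part.
final class BruteForceSearch {

    static final class Part {
        final String contentType;   // e.g. "text/plain", "application/pdf"
        final String decodedText;   // decoded content; ignored for non-text parts

        Part(String contentType, String decodedText) {
            this.contentType = contentType;
            this.decodedText = decodedText;
        }
    }

    /** True if any text/* part contains the needle as a case-insensitive substring. */
    static boolean bodyContains(List<Part> message, String needle) {
        String n = needle.toLowerCase(Locale.ROOT);
        for (Part p : message) {
            if (!p.contentType.toLowerCase(Locale.ROOT).startsWith("text/")) {
                continue;           // non-textual parts are excluded from the search
            }
            if (p.decodedText.toLowerCase(Locale.ROOT).contains(n)) {
                return true;
            }
        }
        return false;
    }

    /** Returns 1-based positions of matching messages, scanning the mailbox linearly. */
    static List<Integer> search(List<List<Part>> mailbox, String needle) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < mailbox.size(); i++) {
            if (bodyContains(mailbox.get(i), needle)) {
                hits.add(i + 1);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<Part> first = Arrays.asList(
                new Part("text/plain", "Quarterly Roadmap attached"),
                new Part("application/pdf", "roadmap"));
        List<Part> second = Arrays.asList(new Part("application/pdf", "roadmap"));
        List<List<Part>> mailbox = Arrays.asList(first, second);
        System.out.println(search(mailbox, "road")); // prints [1]
    }
}
```

Note that the query "road" matches inside "Roadmap" here, which is the grep-like behaviour described above; an index built on whole tokens would typically not return that hit without extra work such as wildcard queries.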

                            [1]http://groups.google.com/group/comp.mail.imap/msg/8ef8148d30647b15

                            • 11. Re: Project planning/Roadmap
                              Andrew Oliver Master

                              We're going to ditch JavaMail by the final release because it just sucks altogether. I think Joe is right on the "brute force", but "seamless integration with Thunderbird" is an objective that comes not only from my "pragmatism" streak but also from the business side, so that has to be an objective. We need not worry about byzantine features of IMAP for 1.0 unless TBird needs them for whatever reason. For M4 I'll be happy if we can list your mailboxes and get even 1 mail out! That is the M4 objective as I see it (skeletal support), along with mailbox refactoring preliminary to M5.

                              • 12. Re: Project planning/Roadmap
                                Michael Barker Apprentice

                                 

                                whereas IMAP search is supposed to be substring matches (think grep-like).


                                I suppose I should probably read the RFC ;-). You are probably right; a token-based index will probably not help. We do intend to store a mail's attachments separately (http://jira.jboss.org/jira/browse/JBMAIL-37) and will need to attach the MIME type to each of these. Once that is done, excluding non-text MIME types from search will be almost trivial. An interesting approach may be to store text MIME data in a column type other than a BLOB (e.g. a CLOB) so that the database can scan it more efficiently. But for 1.0, scanning only the text parts in memory should be enough.

                                Mike.

                                • 13. Re: Project planning/Roadmap
                                  Andrew Oliver Master

                                  Mike, how are you handling the blobs with regard to encoding ATM? I have done little to figure out why, but we're presently dropping Unicode down to gibberish (could be T-Bird though). It seems to me we'll have to make sure this stuff is encoded when writing to the DB, as many DBs are goofy about handling encoding themselves... Not sure how that will affect performance...

                                  Overall this release seems faster, even on MySQL...

                                  • 14. Re: Project planning/Roadmap
                                    Michael Barker Apprentice

                                    I am surprised if it is corrupting stuff. I simply read in bytes and write out bytes. The only translation I do is to handle dot stuffing. I did test with some large binary attachments (mostly PDFs and Excel files) and they worked fine. I will add some unicode test cases to the set of unit tests. Normally encoding is set when the input stream is retrieved. I will have a dig around.
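For readers unfamiliar with dot stuffing: in SMTP a line consisting of a single "." ends the message data, so the sender prepends an extra "." to any line that starts with one and the receiver strips it again. The sketch below shows that translation done purely on bytes, which is why it should not care about character encoding. `DotStuffing` is a hypothetical class written for this illustration, not the code in HEAD, and it assumes CRLF line endings.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative sketch; assumes lines are terminated by CRLF.
final class DotStuffing {

    /** Sender side: doubles a '.' that appears at the start of a line. */
    static byte[] stuff(byte[] message) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(message.length + 16);
        boolean lineStart = true;
        int prev = -1;
        for (byte b : message) {
            if (lineStart && b == '.') {
                out.write('.');                 // extra dot before the original one
            }
            out.write(b);
            lineStart = (prev == '\r' && b == '\n');
            prev = b;
        }
        return out.toByteArray();
    }

    /** Receiver side: drops the first '.' of any line that starts with one. */
    static byte[] unstuff(byte[] wire) {
        ByteArrayOutputStream out = new ByteArrayOutputStream(wire.length);
        boolean lineStart = true;
        int prev = -1;
        for (byte b : wire) {
            if (lineStart && b == '.') {
                lineStart = false;              // swallow the stuffed dot, keep the rest
                prev = b;
                continue;
            }
            out.write(b);
            lineStart = (prev == '\r' && b == '\n');
            prev = b;
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        // Non-ASCII text encoded as UTF-8; the multi-byte sequences pass through untouched.
        byte[] original = "caf\u00e9\r\n.hidden line\r\n..two dots\r\n"
                .getBytes(StandardCharsets.UTF_8);
        byte[] roundTrip = unstuff(stuff(original));
        System.out.println(Arrays.equals(original, roundTrip)); // prints "true"
    }
}
```

Since only the bytes `.`, CR and LF are inspected, a byte-level pass like this leaves UTF-8 sequences alone; if Unicode is turning into gibberish, the more likely place to look is wherever bytes get converted to or from `String` without an explicit charset.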

                                    Mike.
