My Store stuff is fairly close. I need to do some work to finish off the handling of transactions and I should be able to commit around the end of this week.
However, there is a task assigned to me (JBMAIL-25) that recommends that the store stuff I have done be reworked to implement org.jboss.mail.msgstore interfaces. I have looked at the interfaces and I am quite confused by them. There are no methods that support doing partial reads/writes (this is the most important requirement for the store), either via a stream or any other mechanism. There are a number of other small quirks of the interfaces that seem to be counter to your requirements (e.g. store should generate ids, not be passed them).
On the message side, I was thinking of separating the body from the message and making it a separate class. The body of the message would handle the complexity of whether it is stored or in memory. It could be fronted by an interface, and an MBean could generate concrete instances of it. The body would get saved as a (small) blob in the existing folder structure. If the body was actually a reference to an object in the store, then the folder would be storing a persistent handle/proxy.
"There is no methods that support doing partial reads/writes":
You are correct. The current implementation uses Hibernate for persistence, with no support for partial reads afaik. The mailstore interface should be adapted to allow it if a store supports it. It would probably mean adding members such as isStreamable, getInputStream and getOutputStream.
"store should generate ids, not be passed them":
The interface does not allow generation of external ids. When storing a message, only the sender address and message data are accepted. The message is stored in some way depending on the implementation, and a StoredMessage object is returned containing the generated message id.
Future references to the stored message are made through this id.
Whether the store should care about the sender address is debatable; it could be handled at mailbox level.
yeah...you guys figure out how to integrate and work together.... I kinda favor the sender being in the mailbox too. However, what is more important is that you guys establish a working relationship and communicate... That's more important to me strategically (goes for everyone).
I have just committed the latest update to my store code. The JDBC3 and PostgreSQL modules play nicely with the JBoss Client Transaction Manager now. Currently it has similar behaviour to the EJB "Required" transaction type, mainly because the Client Transaction Manager doesn't support nested transactions. (Given the transaction requirements for these 2 modules perhaps they should be EJBs....) The BDB module is also there and works, but from what I have garnered from the sleepycat mailing list, it looks like BDB doesn't actually support partial reads/writes, so I may have to deprecate this module in the near future.
I have retested with the latest MySQL JDBC Connector (3.1.7 final), and I am going to eat my words now. It is a lot better. The MySQL bug 8096 is now fixed and performance is much improved (Bug 7745 will still cause problems http://bugs.mysql.com/bug.php?id=7745).
The main question for now is where we go from here. We have 2 different implementations covering the same area. The decision is how we merge the 2 and produce something useful. Below is my take on it (I could be wrong, and I am quite willing to do something completely different if it is the better approach).
Partial read/write support. This IMO is the most important feature of the Message Store. Without this we are better off simply holding the message in memory. Where does that leave the Hibernate implementation, which doesn't support streamable blobs? One option is to still use Hibernate's ORM features, but build an object model that supports the partial read/write requirement. This could be done by making the StoredMessage object aggregate a number of StoredPage objects of a fixed maximum size. On top of this model we would need Paged[Input/Output]Stream, which would fetch or flush a page whenever the current page is underrun or overrun respectively. This would provide a simple generic implementation that would be usable across multiple databases.
User-defined metadata. This is information such as the sender address. Currently the store implementation doesn't support this, but it can if required. For starters a simple string will suffice, but eventually name/value pair style properties would be preferred.
Where does it sit (AKA what's in a name). When I wrote the current store implementation I didn't want to tie it to any particular domain, i.e. I avoided the use of terms such as Mail and Message. It was designed as a store for any type of big thing that we may want to stuff in it (e.g. Calendar appointments with attachments). Is this something that is important? If so, then the Hibernate-based implementation should be moved to org.jboss.mail.store. If it should be specifically a message store, then the reverse should happen.
The store interface. Currently neither of the 2 implementations has an interface that covers all the bases. Who specs it and what it is based on will be largely decided by the outcome of the above paragraph. Just something that needs to be done 'tis all. One specific on the interface is that we need to support different store implementations, which will use different data types for their ids. This suggests the type of the id needs to be an Object or a Serializable rather than a String.
This is what needs to be tied up before seriously integrating it into the main execution path. I am happy to do any, all or none of the above. I didn't want to go stomping all over other people's code and making decisions without fostering a little discussion first.
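A minimal sketch of the paging idea, in plain Java with no Hibernate involved. PageStore, InMemoryPageStore, PagedOutputStream and PagedInputStream are hypothetical names used only for illustration; a Hibernate version would presumably back PageStore with StoredPage rows belonging to a StoredMessage.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical page-level storage for one stored message. */
interface PageStore {
    void appendPage(byte[] page);
    byte[] fetchPage(int index);   // returns null past the last page
}

/** In-memory stand-in, present only to make the sketch runnable. */
class InMemoryPageStore implements PageStore {
    private final List<byte[]> pages = new ArrayList<>();
    public void appendPage(byte[] page) { pages.add(page); }
    public byte[] fetchPage(int index) {
        return index < pages.size() ? pages.get(index) : null;
    }
    int pageCount() { return pages.size(); }
}

/** Holds at most one page in memory and flushes it when it is overrun. */
class PagedOutputStream extends OutputStream {
    private final PageStore store;
    private final byte[] buffer;
    private int used;

    PagedOutputStream(PageStore store, int pageSize) {
        this.store = store;
        this.buffer = new byte[pageSize];
    }

    @Override public void write(int b) {
        if (used == buffer.length) flushPage();   // current page is full
        buffer[used++] = (byte) b;
    }

    private void flushPage() {
        if (used > 0) {
            store.appendPage(Arrays.copyOf(buffer, used));
            used = 0;
        }
    }

    /** Flushes the final, possibly short, page. */
    @Override public void close() { flushPage(); }
}

/** Fetches the next page whenever the current one is underrun. */
class PagedInputStream extends InputStream {
    private final PageStore store;
    private byte[] current = new byte[0];
    private int pos;
    private int nextPage;

    PagedInputStream(PageStore store) { this.store = store; }

    @Override public int read() {
        while (current != null && pos == current.length) {
            current = store.fetchPage(nextPage++);
            pos = 0;
        }
        if (current == null) return -1;           // no more pages
        return current[pos++] & 0xFF;
    }
}
```

Only one page is held in memory per stream at a time, so memory use is bounded by the page size rather than the message size.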
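To illustrate the point about id types, here is a rough sketch in which the store generates the id and callers treat it as an opaque Serializable. The Store interface and both implementations are hypothetical, invented for this example; they are not the existing org.jboss.mail.store or org.jboss.mail.msgstore types.

```java
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

/** Hypothetical store contract: the store hands out the id, callers never pick one. */
interface Store {
    Serializable store(byte[] data);
    byte[] load(Serializable id);      // returns null for an unknown id
}

/** Uses Long ids, as a sequence-backed database table might. */
class SequenceStore implements Store {
    private final Map<Long, byte[]> rows = new HashMap<>();
    private long next = 1;

    public Serializable store(byte[] data) {
        Long id = next++;
        rows.put(id, data.clone());
        return id;
    }

    public byte[] load(Serializable id) { return rows.get(id); }
}

/** Uses String ids, as a file- or key-based store might. */
class UuidStore implements Store {
    private final Map<String, byte[]> rows = new HashMap<>();

    public Serializable store(byte[] data) {
        String id = UUID.randomUUID().toString();
        rows.put(id, data.clone());
        return id;
    }

    public byte[] load(Serializable id) { return rows.get(id); }
}
```

Code that keeps the id (a mailbox entry, say) only needs to persist a Serializable and hand it back later; it never has to know whether the underlying store uses numbers or strings.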
Ultimate desired behavior:
When DATA is coming in (via the CmdDATA) we start creating a Mail (via Mail.java or whatever). If the size exceeds N bytes we instead create a mail store. The Mail now points to the mail store.
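One way the "spill over once the size exceeds N bytes" behavior could look, as a sketch only. SpillingOutputStream is a hypothetical name, and the storeSink parameter stands in for whatever stream the store would hand out; nothing here reflects the actual CmdDATA or Mail code.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;

/** Buffers in memory up to a threshold, then moves everything into the store sink. */
class SpillingOutputStream extends OutputStream {
    private final int thresholdBytes;
    private final OutputStream storeSink;            // hypothetical stream into the store
    private ByteArrayOutputStream memory = new ByteArrayOutputStream();

    SpillingOutputStream(int thresholdBytes, OutputStream storeSink) {
        this.thresholdBytes = thresholdBytes;
        this.storeSink = storeSink;
    }

    @Override public void write(int b) throws IOException {
        // This byte would push the body past the threshold: spill what we have so far.
        if (memory != null && memory.size() >= thresholdBytes) {
            memory.writeTo(storeSink);
            memory = null;
        }
        if (memory != null) {
            memory.write(b);
        } else {
            storeSink.write(b);
        }
    }

    /** True once the body has been handed over to the store. */
    boolean isInStore() { return memory == null; }
}
```

A body of exactly N bytes stays in memory; the store is only touched when byte N+1 arrives.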
Acceptable behavior for M3:
When DATA is coming in, it is immediately directed to the MailStore. The Hibernate implementation just keeps a pointer to it.
If I were a dicktator:
Immediately add this in the execution path. Immediately change the Hibernate stuff to use this for the main message body.
We aren't even close to starting the Calendar stuff. IMAP is really more important to get under way in the short term. Security is more important as the IMAP stuff fleshes out. We can always refactor and repackage stuff later so don't worry about it. It's just Java. We have CVS. We can always change the Java in CVS.
The two of you are avoiding the hard work of collaborating. The more important outcome is for you two to start collaborating in code and in communication. Writing Java is easy because working with computers is easy. It's working with people that is hard (I'm an expert on making it harder ;-) ).
However, part of working together is to give up this excessive notion of "ownership" and start thinking of it as all our communal code. If you make it work better then dawie shouldn't mind. If you make it work worse...then everyone will mind. If he doesn't like what you do to it, he can fix it. You both are being too timid. Let the commits flow! Let the discussion come with it. However, be optimistic!! If you think we may need to go back, just use CVS tags.
To be clear...the Mailbox should Point to the message store. The Body should go in the store, the Mailbox should use the store. We don't "trust" what is in the store to be valid as a mail or that we'll even not throw it away. Eventually we'll even want code to reap the unattached bodies (morbid).
The idea behind the design is to be a portable implementation that should run as is on almost any database, based on the services provided by Hibernate.
I have tested it on blobs of 10MB, and it performs pretty well on Postgres. It does however consume large amounts of memory since it is not a streaming implementation.
I think what we need is a flag on the MessageStore implementation to indicate if it supports streaming, and getInputStream/getOutputStream methods in the StoredMessage interface.
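Roughly what that could look like, as a sketch. The method names follow the ones floated in this thread (isStreamable, getInputStream, getOutputStream), but the interface shapes below are assumptions made for illustration, not the current org.jboss.mail.msgstore signatures, and InMemoryMessageStore exists only to make the example runnable.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.io.OutputStream;

/** Assumed shape: a stored message exposes its content as streams. */
interface StoredMessage {
    InputStream getInputStream();
    OutputStream getOutputStream();
}

/** Assumed shape: the store advertises whether it can really stream. */
interface MessageStore {
    boolean isStreamable();
    StoredMessage createMessage();
}

/** Toy implementation; a real store would stream to and from the database. */
class InMemoryMessageStore implements MessageStore {
    public boolean isStreamable() { return true; }

    public StoredMessage createMessage() {
        final ByteArrayOutputStream data = new ByteArrayOutputStream();
        return new StoredMessage() {
            public OutputStream getOutputStream() { return data; }
            public InputStream getInputStream() {
                // Snapshot of whatever has been written so far.
                return new ByteArrayInputStream(data.toByteArray());
            }
        };
    }
}
```

A caller would check isStreamable() first and decide whether it is worth routing a large body through the store or whether the store will just buffer it all in memory anyway.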
I also agree that the MessageStore should not be concerned with the actual content type, since we will most probably use it later on to store other types of data, e.g. newsgroup messages, calendar events etc.
Mike, you are welcome to go ahead and modify the Mailstore interface as you see fit. I'd like to get this issue stabilised so that I can continue with the mailbox implementation.
Ok. I will get to work on integrating the message store into the main line. The body of the Mail will become a separate object (currently it is a List of byte arrays) which will be a proxy to the Store.
When I have that done I will modify the Hibernate implementation such that it implements a simple paging structure to do partial reads/writes.
I will aim to have the first of these done around the end of this week.
Postgres's latest drivers seem to support streaming. They do so in 4k blocks, but that's pretty okay.
There is another reason to use the store :-). JBossMQ...it...well makes multiple copies of the body.
Ultimately we will need BOTH behaviors (Store and No Store). For this release Store-only is fine. This is a bit of a feature regression in that there are going to be people who might send the messages over JMS to other servers with other DBs. However, I can live with that for now because those people....well....they are a little cracked (that *won't* perform well). For the present you'll be limited to one DB on the same server as JBossMQ.
Eventually there should be:
1. Always use store
2. Never use store
3. Use store when the message is > N kb
So while I'm only asking for #1 for M3, don't code yourself into a corner with #2, #3 for M4 :-)
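One way to avoid that corner is to keep the decision behind a small policy type from the start, even if only the first option is wired up for M3. A sketch, with StorePolicy and useStore as hypothetical names and the threshold passed in bytes:

```java
/** Hypothetical policy covering the three modes listed above. */
enum StorePolicy {
    ALWAYS,            // 1. always use store
    NEVER,             // 2. never use store
    ABOVE_THRESHOLD;   // 3. use store when the message is larger than the threshold

    /** Decides whether a message of the given size should go to the store. */
    boolean useStore(long messageBytes, long thresholdBytes) {
        switch (this) {
            case ALWAYS: return true;
            case NEVER:  return false;
            default:     return messageBytes > thresholdBytes;
        }
    }
}
```

For M3 the configuration would simply be fixed at StorePolicy.ALWAYS; the callers already ask the policy, so enabling the other two later should not require touching them.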
I also think this is great! Glad you guys are talking! This all looks like great work!
Postgres's latest drivers seem to support streaming.
Sort of. Streaming is supported via JDBC2 for inputting into a byte array field, but not for retrieving data. Also, the JDBC3 Blob is not supported, and our implementation uses it (I sent a patch a couple of weeks ago but have not seen any response to it; maybe they don't like my code, sniff, sniff). However, that is not a problem, as we have an implementation of the store that talks directly to the LargeObjectAPI. Check out: org.jboss.mail.store.postgresql.PostgreSQLStore.