
Most distributed systems choose a UUID as the object identifier. But a UUID is 36 characters long, which in Java takes 72 bytes to hold as characters, and its value is random and non-sequential, so it hurts search and lookup within storage. Some systems do provide sequential UUID support, but it is tied to that particular system. So here I want to propose a different approach to replace it:

 

The new ID consists of the following parts:

1. Server identifier

2. Component identifier

3. Thread identifier    (consider thread groups and thread pools as well)

4. Time          (the resolution depends on your system load: seconds, milliseconds, etc.)

 

Note: the time should be synchronized centrally. Otherwise the ID is not universally unique. Please refer to:

http://community.jboss.org/people/andy.song/blog/2010/12/09/the-time-of-your-machine-can-be-trust

 

For example:

1               2                   111         6183640443687

Server          Component           Thread      NanoTime

 

So the ID is about 19 digits in total (18 in this example), and it is a purely numeric value. In general it is also sequential, so using this kind of ID can improve your system's performance considerably.
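The composition above can be sketched in Java as follows. This is only an illustration: the class name `CompositeId`, the decimal field widths (1, 1, 3, and 13 digits, taken from the example values), and the range checks are assumptions, not part of the original proposal.

```java
// Sketch only: class name, field widths, and range checks are assumptions.
final class CompositeId {
    private static final long THREAD_LIMIT = 1_000L;              // 3 decimal digits
    private static final long TIME_LIMIT = 10_000_000_000_000L;   // 13 decimal digits

    // Packs the four parts into one long, most significant part first,
    // so IDs from the same server/component/thread sort by time.
    static long compose(int server, int component, int thread, long time) {
        if (server < 1 || server > 9) throw new IllegalArgumentException("server must be 1..9");
        if (component < 0 || component > 9) throw new IllegalArgumentException("component must be 0..9");
        if (thread < 0 || thread >= THREAD_LIMIT) throw new IllegalArgumentException("thread must be 0..999");
        if (time < 0 || time >= TIME_LIMIT) throw new IllegalArgumentException("time must fit in 13 digits");
        return ((server * 10L + component) * THREAD_LIMIT + thread) * TIME_LIMIT + time;
    }

    // Recovers the time part from an ID built by compose().
    static long timeOf(long id) {
        return id % TIME_LIMIT;
    }
}
```

With the example values, `CompositeId.compose(1, 2, 111, 6183640443687L)` gives `121116183640443687`, which fits in a 64-bit `long`. Note that two IDs created by the same thread within the same time tick would collide, so a real generator would also need a per-thread sequence or would have to wait for the next tick.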

 

This is not new: Twitter currently uses the same idea. One of their engineers open-sourced the library (snowflake) at:

https://github.com/twitter/snowflake/tree/1cd0af14db9efa7972a9ed605661a7b70962914a/src

The OS provides the time for you automatically, so you are used to relying on it. But what about distributed computing scenarios, where multiple servers work together on a large volume of requests: can you trust each machine's time? The answer is "no". Each machine's clock depends on its own electronics, so some run faster and some run slower. That may introduce unexpected behavior if you don't handle it.

 

So from the machine perspective, time synchronization comes onto the stage, for example the Windows Time Service.

 

But what about the software perspective? The answer is a "centralized or distributed time service".

 

[Figure: Time Service.PNG]

 

The challenges will be:

* Network communication latency

* The resolution you want to achieve (seconds, milliseconds, nanoseconds, etc.)

* Machine-level time service coordination is still needed, for example if you want geographical distribution

* Is a SPOF (single point of failure) involved? Maybe, but you can overcome it with some other design.
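To illustrate the network latency challenge, here is a minimal sketch of how a client could estimate its offset from a central time service using one request/response exchange. It uses Cristian's algorithm, which is a technique swapped in for illustration and not something the original text prescribes. The class and parameter names are hypothetical, and the estimate assumes the network delay is the same in both directions.

```java
// Sketch only: names are hypothetical; assumes symmetric network delay.
final class ClockOffset {
    // localSend:    local clock when the request was sent
    // serverTime:   time reported by the central time service
    // localReceive: local clock when the response arrived
    // Returns the estimated (server clock - local clock) in the same unit.
    static long estimateOffset(long localSend, long serverTime, long localReceive) {
        long roundTrip = localReceive - localSend;
        // The server's reading is roughly half a round trip old on arrival.
        long estimatedServerNow = serverTime + roundTrip / 2;
        return estimatedServerNow - localReceive;
    }
}
```

For example, `ClockOffset.estimateOffset(1000, 5020, 1040)` returns `4000`: the round trip took 40 units, so the server clock is estimated at 5040 when the local clock reads 1040. The error of such an estimate is bounded by half the round trip, which is why the achievable resolution depends on the network.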

1.  The OS restriction on the ephemeral port range when designing long-running Queue|Topic consumers

2.  The impact of TCP TIME_WAIT and CLOSE_WAIT on creating new connections to the messaging server

3.  Because producers and consumers are clients from the TCP/IP point of view, closing a TCP/IP connection is not an active close on the messaging server side. That will hurt the messaging server's capacity.

4.  Equal message size or free message size? Equal message sizes reduce persistence fragmentation on the messaging server, which brings more benefit when many messages flow in and out concurrently. Compressed or uncompressed? Compression adds some overhead on the client side but brings more benefit on the messaging server side, so it is worth compressing messages whenever you can.

5.  A general message header needs to be considered. Recommended fields:

     *   Source

     *   Destination

     *   Version

     *   SendTime

     *   ReceiveTime

     *   TransactionId or CorrelationId

6.  Message compatibility.
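The header fields recommended in item 5 could be carried in a small immutable class like the one below. This is a sketch under assumptions: the class name, the field types, and the two helper methods are hypothetical and do not come from any particular messaging API.

```java
// Sketch only: class name, field types, and helpers are assumptions.
final class MessageHeader {
    final String source;
    final String destination;
    final int version;             // lets receivers handle older or newer formats
    final long sendTimeMillis;
    final long receiveTimeMillis;  // 0 until the receiver stamps the message
    final String correlationId;    // TransactionId or CorrelationId

    MessageHeader(String source, String destination, int version,
                  long sendTimeMillis, long receiveTimeMillis, String correlationId) {
        this.source = source;
        this.destination = destination;
        this.version = version;
        this.sendTimeMillis = sendTimeMillis;
        this.receiveTimeMillis = receiveTimeMillis;
        this.correlationId = correlationId;
    }

    // Returns a copy stamped with the receive time; the original is unchanged.
    MessageHeader received(long receiveTimeMillis) {
        return new MessageHeader(source, destination, version,
                sendTimeMillis, receiveTimeMillis, correlationId);
    }

    // Transit time; only meaningful if sender and receiver clocks are synchronized.
    long transitMillis() {
        return receiveTimeMillis - sendTimeMillis;
    }
}
```

A receiver would call `received(...)` on arrival and could then log `transitMillis()` together with `correlationId` to trace a message across systems.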

 

Do you have any more advice?
