10 Replies Latest reply on Feb 23, 2009 2:14 PM by clebert.suconic

    Strings Experiments...

    clebert.suconic

      I have done some experiments with Strings & Encoding today, and this thread is a summary of what I have found. We can talk about this at our daily iRC meeting tomorrow.

      I'm reusing a buffer (ChannelBufferWrapper, same buffer used by our Netty channel), and doing the process for about 1.000.000 times.

      I have compared a few different ways of serializing strings: putUTF, putSimpleString, putNewSimpleString, putString, putStringNewWay.

      putNewSimpleString: I"m instantiating a new SimpleString on every write. Just because it wasn't fair to the other methods).

      putSimpleString : Aways reusing the same SimpleString.

      putStringNewWay: This is getting the bytes from the String, in the same way as SimpleString is doing. I'm doing that just to measure a possible optimization.


      And these are the results I got:

      
      putUTF = 2928 milliseconds
      
      putNewSimpleString = 1497 milliseconds
      
      putSimpleString = 98 milliseconds
      
      putString = 2999 milliseconds
      
      putStringNewWay = 1468 milliseconds
      
      



      We are being able to persist 1 million Strings in 3 seconds with the UTF. We could optimize it to probably around 50%.

      We could for sure optimize putString, using the same idea Tim used to extract bytes on SimpleString.



      This is the test I'm using:


      
      import junit.framework.TestCase;
      
      import org.jboss.messaging.integration.transports.netty.ChannelBufferWrapper;
      import org.jboss.messaging.util.SimpleString;
      
      public class UTF8Test extends TestCase
      {
      
       private final String str = "abcdef&^*&!^ghijkl\uB5E2\uCAC7\uB2BB\uB7DD\uB7C7\uB3A3\uBCE4\uB5A5" + "abcdef&^*&!^ghijkl\uB5E2\uCAC7\uB2BB\uB7DD\uB7C7\uB3A3\uBCE4\uB5A5"
       + "abcdef&^*&!^ghijkl\uB5E2\uCAC7\uB2BB\uB7DD\uB7C7\uB3A3\uBCE4\uB5A5"
       + "abcdef&^*&!^ghijkl\uB5E2\uCAC7\uB2BB\uB7DD\uB7C7\uB3A3\uBCE4\uB5A5"
       + "abcdef&^*&!^ghijkl\uB5E2\uCAC7\uB2BB\uB7DD\uB7C7\uB3A3\uBCE4\uB5A5"
       + "abcdef&^*&!^ghijkl\uB5E2\uCAC7\uB2BB\uB7DD\uB7C7\uB3A3\uBCE4\uB5A5"
       + "abcdef&^*&!^ghijkl\uB5E2\uCAC7\uB2BB\uB7DD\uB7C7\uB3A3\uBCE4\uB5A5"
       + "abcdef&^*&!^ghijkl\uB5E2\uCAC7\uB2BB\uB7DD\uB7C7\uB3A3\uBCE4\uB5A5"
       + "abcdef&^*&!^ghijkl\uB5E2\uCAC7\uB2BB\uB7DD\uB7C7\uB3A3\uBCE4\uB5A5";
      
       private final SimpleString simpleStr = new SimpleString(str);
      
       int TIMES = 5;
      
       // Attributes ----------------------------------------------------
      
       // Static --------------------------------------------------------
      
       // Constructors --------------------------------------------------
      
       // Public --------------------------------------------------------
      
       long numberOfIteractions = 1000000;
      
       public void testUTFOnBufferWrapper() throws Exception
       {
       ChannelBufferWrapper buffer = new ChannelBufferWrapper(10 * 1024);
      
       long start = System.currentTimeMillis();
      
       for (int c = 0; c < TIMES; c++)
       {
       for (long i = 0; i < numberOfIteractions; i++)
       {
       if (i == 10000)
       {
       start = System.currentTimeMillis();
       }
      
       buffer.rewind();
       buffer.putUTF(str);
       }
      
       long spentTime = System.currentTimeMillis() - start;
      
       System.out.println("spentTime UTF = " + spentTime);
       }
      
       }
      
       public void testPutNewSimpleString() throws Exception
       {
       ChannelBufferWrapper buffer = new ChannelBufferWrapper(10 * 1024);
      
       for (int c = 0; c < TIMES; c++)
       {
       long start = System.currentTimeMillis();
      
       for (int i = 0; i < numberOfIteractions; i++)
       {
       if (i == 10000)
       {
       start = System.currentTimeMillis();
       }
       buffer.rewind();
       buffer.putSimpleString(new SimpleString(str + i));
       }
      
       long spentTime = System.currentTimeMillis() - start;
      
       System.out.println("spentTime PutNewSimpleString = " + spentTime);
       }
      
       }
      
       public void testPutSimpleString() throws Exception
       {
       ChannelBufferWrapper buffer = new ChannelBufferWrapper(10 * 1024);
      
       for (int c = 0; c < TIMES; c++)
       {
       long start = System.currentTimeMillis();
      
       for (int i = 0; i < numberOfIteractions; i++)
       {
       if (i == 10000)
       {
       start = System.currentTimeMillis();
       }
       buffer.rewind();
       buffer.putSimpleString(simpleStr);
       }
      
       long spentTime = System.currentTimeMillis() - start;
      
       System.out.println("spentTime PutSimpleString = " + spentTime);
       }
      
       }
      
       public void testPutString() throws Exception
       {
       ChannelBufferWrapper buffer = new ChannelBufferWrapper(10 * 1024);
      
       for (int c = 0; c < TIMES; c++)
       {
       long start = System.currentTimeMillis();
      
       for (int i = 0; i < numberOfIteractions; i++)
       {
       if (i == 10000)
       {
       start = System.currentTimeMillis();
       }
       buffer.rewind();
       buffer.putString(str + i);
       }
      
       long spentTime = System.currentTimeMillis() - start;
      
       System.out.println("spentTime putString = " + spentTime);
       }
      
       }
      
       public void testPutStringNewWay() throws Exception
       {
       ChannelBufferWrapper buffer = new ChannelBufferWrapper(10 * 1024);
      
       for (int c = 0; c < TIMES; c++)
       {
       long start = System.currentTimeMillis();
      
       for (int i = 0; i < numberOfIteractions; i++)
       {
       if (i == 10000)
       {
       start = System.currentTimeMillis();
       }
       buffer.rewind();
       buffer.putStringNewWay(str + i);
       }
      
       long spentTime = System.currentTimeMillis() - start;
      
       System.out.println("spentTime putStringNewWay = " + spentTime);
       }
      
       }
      }
      
      




      And this is the proposal for putString:

      (one funny thing, If I placed this loop on a method, I would increase the total time for about 0.5 second for the 1 million messages).


       public void putStringNewWay(final String nullableString)
       {
       flip();
       int len = nullableString.length();
      
       byte[] data = new byte[len << 1];
      
       int j = 0;
      
       for (int i = 0; i < len; i++)
       {
       char c = nullableString.charAt(i);
      
       byte low = (byte)(c & 0xFF); // low byte
      
       data[j++] = low;
      
       byte high = (byte)(c >> 8 & 0xFF); // high byte
      
       data[j++] = high;
       }
      
      
       buffer.writeInt(data.length);
       buffer.writeBytes(data);
       buffer.readerIndex(buffer.writerIndex());
       }
      
      
      




      We can talk about these numbers on the IRC meeting tomorrow.

        • 1. Re: Strings Experiments...
          timfox

          Your conmparison is invalid:

          public void testPutString() throws Exception
           {
           ChannelBufferWrapper buffer = new ChannelBufferWrapper(10 * 1024);
          
           for (int c = 0; c < TIMES; c++)
           {
           long start = System.currentTimeMillis();
          
           for (int i = 0; i < numberOfIteractions; i++)
           {
           if (i == 10000)
           {
           start = System.currentTimeMillis();
           }
           buffer.rewind();
           buffer.putString(str + i);
           }
          
           long spentTime = System.currentTimeMillis() - start;
          
           System.out.println("spentTime putString = " + spentTime);
           }
          
           }
          


          You are adding the integer i to the string in some of the tests. These will cause a new string and a copy. You're not doing that when you're testing with SimpleString

          • 2. Re: Strings Experiments...
            timfox

            Removing the + i gives:

            spentTime UTF = 3696
            spentTime UTF = 3587
            spentTime UTF = 3699
            spentTime UTF = 3672
            spentTime UTF = 3806
            spentTime PutSimpleString = 161
            spentTime PutSimpleString = 142
            spentTime PutSimpleString = 133
            spentTime PutSimpleString = 148
            spentTime PutSimpleString = 135
            spentTime putString = 2972
            spentTime putString = 2922
            spentTime putString = 2967
            spentTime putString = 2942
            spentTime putString = 2905
            spentTime putStringNewWay = 974
            spentTime putStringNewWay = 937
            spentTime putStringNewWay = 977
            spentTime putStringNewWay = 955
            spentTime putStringNewWay = 960
            


            I also remove putNewString() since I didn't see the point of that.

            Also you need to test the perf of reading. There's no point writing stuff you never read!

            The point that putString is slower than putting the shorts in one by one, implies to me that the overhead is in the Netty ChannelBufferWrapper implementation.

            If you look at the Netty ChannelBufferWrapper implementations you will see they do a lot of stuff on every put. So we need to minimise the amount of puts.

            • 3. Re: Strings Experiments...
              trustin

              This the the microbenchmark code I'm using:

              import java.nio.ByteBuffer;
              
              import org.jboss.netty.buffer.ChannelBuffer;
              import org.jboss.netty.buffer.ChannelBuffers;
              
              public class Test {
               public static void main(String[] args) throws Exception {
               long startTime, endTime;
              
               ChannelBuffer buf = ChannelBuffers.dynamicBuffer(1048576);
               byte[] data = new byte[256];
              
               startTime = System.nanoTime();
               for (int j = 0; j < 100; j ++) {
               buf.clear();
               for (int i = 0; i < 1048576 / data.length; i ++) {
               buf.writeBytes(data);
               }
               }
               endTime = System.nanoTime();
              
               System.out.println(endTime - startTime);
              
               startTime = System.nanoTime();
               for (int j = 0; j < 100; j ++) {
               buf.clear();
               for (int i = 0; i < 1048576 / 2; i ++) {
               buf.writeShort((short) 0);
               }
               }
               endTime = System.nanoTime();
              
               System.out.println(endTime - startTime);
              
               ByteBuffer buf2 = ByteBuffer.allocate(1048576);
               startTime = System.nanoTime();
               for (int j = 0; j < 100; j ++) {
               buf2.clear();
               for (int i = 0; i < 1048576 / 2; i ++) {
               buf2.putShort((short) 0);
               }
               }
               endTime = System.nanoTime();
              
               System.out.println(endTime - startTime);
              
               ChannelBuffer buf3 = ChannelBuffers.buffer(1048576);
               startTime = System.nanoTime();
               for (int j = 0; j < 100; j ++) {
               buf3.clear();
               for (int i = 0; i < 1048576 / 2; i ++) {
               buf3.writeShort((short) 0);
               }
               }
               endTime = System.nanoTime();
              
               System.out.println(endTime - startTime);
               }
              }

              The following is my result:
              41097167
              348230930
              444482977
              73107347

              It seems like the boundary checkers in Netty's DynamicChannelBuffer and ByteBuffer are slowing down the put operation. DynamicChannelBuffer uses a byte array rather than a ByteBuffer as its internal data store, and that's why DynamicChannelBuffer does not perform worse than ByteBuffer. It's interesting that ByteBuffer performs even worse than DynamicChannelBuffer which dynamically increases its capacity on demand, while ByteBuffer does nothing much in this test case.

              If a non-dynamic ChannelBuffer, just a wrapper of a byte array, is used, no boundary check is done, and therefore it performs pretty well and the only overhead left seems like increasing the buffer position.

              I think it is difficult to remove the existing boundary checker in DynamicChannelBuffer because it can't expand itself without it. I would suggest you to use non-dynamic ChannelBuffer whenever possible if the length of the buffer is known.

              • 4. Re: Strings Experiments...
                clebert.suconic

                I'm not saying it's impossible, but I'm not being successful at improving the performance on the UTF.

                I could save probably 1% of the time by writing directly on the OutputBuffer, with the introduction of ThreadLocals to reuse buffers on the string compressions. (Something that is probably being done at the JDK) (read it as complexity). So.. I don't think it's worth the effort.

                Think on the UTF as a compressed string:

                Characters between 1 and 0x7f will use 1 byte on the stream, characters > 0x800 will use 2 and anything between 0x7f and 0x799 will use 3 bytes on the streaming.

                BTW: the putString was very fast if not using the dynamicBuffer, making putStringNewWay moot. As Tim told me that we are changing this to use RegularBuffers, I'm not changing that.

                • 5. Re: Strings Experiments...
                  clebert.suconic

                  I just placed a class called UTFUtil (not being used for now), and a test on timing validating the performance.

                  I'm getting these numbers with org.jboss.messaging.tests.timing.util.UTF8Test:

                  Time WriteUTF = 3271
                  Time WriteUTF = 3061
                  Time WriteUTF = 3058
                  Time WriteUTF = 3044
                  Time WriteUTF = 3055
                  Time readUTF = 4166
                  Time readUTF = 4137
                  Time readUTF = 4153
                  Time readUTF = 4144
                  Time readUTF = 4151
                  time NewUTF = 3003
                  time NewUTF = 2849
                  time NewUTF = 2834
                  time NewUTF = 2832
                  time NewUTF = 2837
                  spentTime readUTFNew = 3207
                  spentTime readUTFNew = 3179
                  spentTime readUTFNew = 3179
                  spentTime readUTFNew = 3180
                  spentTime readUTFNew = 3181
                  



                  And these numbers after changing ChannelBufferWrapper to:

                  public ChannelBufferWrapper(final int size)
                   {
                   buffer = ChannelBuffers.buffer(size);
                   buffer.writerIndex(buffer.capacity());
                   }
                  
                  


                  Time WriteUTF = 3338
                  Time WriteUTF = 3054
                  Time WriteUTF = 3077
                  Time WriteUTF = 3094
                  Time WriteUTF = 3076
                  Time readUTF = 4181
                  Time readUTF = 4158
                  Time readUTF = 4236
                  Time readUTF = 4270
                  Time readUTF = 4276
                  time NewUTF = 2961
                  time NewUTF = 2988
                  time NewUTF = 2974
                  time NewUTF = 2959
                  time NewUTF = 2977
                  spentTime readUTFNew = 3093
                  spentTime readUTFNew = 3126
                  spentTime readUTFNew = 3144
                  spentTime readUTFNew = 3121
                  spentTime readUTFNew = 3134
                  
                  



                  If we decide to use this UTF code, it is as simple as:


                   public void putUTF(final String str) throws Exception
                   {
                   flip();
                   UTFUtil.saveUTF(this, str);
                   buffer.readerIndex(buffer.writerIndex());
                   }
                  
                  



                  I will let you guys (Tim) to decide if we should use it or not.

                  • 6. Re: Strings Experiments...
                    timfox

                    Maybe you should explain what those numbers mean?

                    Just posting a list of numbers doesn't really help us.

                    • 7. Re: Strings Experiments...
                      clebert.suconic

                       

                      "timfox" wrote:
                      Maybe you should explain what those numbers mean?

                      Just posting a list of numbers doesn't really help us.


                      Test very similar to the initial post:



                      WriteUTF = Time writing UTF as they are now

                      ReadUTF = Time reading UTFs as they are now

                      time NewUTF = Time writing UTF with the new code (UTFUtil)

                      spentTime readUTFNew = Time reading UTF with the new code (UTFUtil)



                      Time to complete reading or writing 1.000.000 strings. (ignoring the first 10K)

                      • 8. Re: Strings Experiments...
                        clebert.suconic

                        I just did one quick, changing the size of the string locally.

                        private final String str = "This is a standard size string";

                        Time WriteUTF = 463
                        Time WriteUTF = 356
                        Time WriteUTF = 316
                        Time WriteUTF = 350
                        Time WriteUTF = 288
                        Time readUTF = 580
                        Time readUTF = 534
                        Time readUTF = 541
                        Time readUTF = 537
                        Time readUTF = 536
                        time NewUTF = 241
                        time NewUTF = 237
                        time NewUTF = 245
                        time NewUTF = 245
                        time NewUTF = 244
                        spentTime readUTFNew = 340
                        spentTime readUTFNew = 335
                        spentTime readUTFNew = 334
                        spentTime readUTFNew = 333
                        spentTime readUTFNew = 337
                        


                        • 9. Re: Strings Experiments...
                          timfox

                          If it's faster, then let's use it

                          • 10. Re: Strings Experiments...
                            clebert.suconic

                            Just to complete this task / thread, after some simplification on the UTF8Util class, I could get better numbers

                            Time WriteUTF = 2193
                            Time WriteUTF = 2174
                            Time WriteUTF = 2192
                            Time WriteUTF = 2177
                            Time WriteUTF = 2198
                            
                            Time readUTF = 2375
                            Time readUTF = 2304
                            Time readUTF = 2309
                            Time readUTF = 2305
                            Time readUTF = 2311