[Zope-dev] Re: [Zope-Annce] ZEO 2.0 beta 1 released

Toby Dickenson tdickenson@geminidataloggers.com
Thu, 29 Aug 2002 10:44:27 +0100


--------------Boundary-00=_32NLVJP3UKKREIAPFMQ8
Content-Type: text/plain;
  charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

On Wednesday 28 Aug 2002 2:19 pm, Jeremy Hylton wrote:

> I'd like to change this bit of code in handle_write():
>
>             if n < len(v):
>                 # XXX It's unfortunate that we end up making many
>                 # slices of a large string.
>                 output.insert(0, v[n:])
>                 break # we can't write any more
>
> If we send a very large message, which could be caused by a big pickle
> or just a transaction that touches a lot of objects, the message gets
> sent in little chunks.  The current implementation uses string slicing
> to save the rest of the string, but that ends up making many copies of
> the large string -- an O(n^2) proposition.
>
> A possible solution is to store an index into the first string in
> __output and just increment the index.  The logic will be a bit tricky
> to get right.

There is an O(n) option with easy logic... fragment big strings inside=20
message_output.

diff attached.


[hmmm. Its use of a python list as a fifo queue is O(n2) worst case, but =
I=20
doubt we will be affected by that in practice]

--------------Boundary-00=_32NLVJP3UKKREIAPFMQ8
Content-Type: text/x-diff;
  charset="iso-8859-1";
  name="bigoutput.diff"
Content-Transfer-Encoding: 7bit
Content-Disposition: attachment; filename="bigoutput.diff"

Index: smac.py
===================================================================
RCS file: /cvs-repository/Packages/ZEO/smac.py,v
retrieving revision 1.19
diff -c -4 -r1.19 smac.py
*** smac.py	22 Aug 2002 18:29:13 -0000	1.19
--- smac.py	29 Aug 2002 09:40:44 -0000
***************
*** 133,140 ****
--- 133,145 ----
              return 0
          else:
              return 1
  
+     # We chose 60000 as the socket limit by looking at the
+     # largest strings that we could pass to send() without
+     # blocking.
+     _write_chunk_size = 60000
+ 
      def handle_write(self):
          output = self.__output
          while output:
              # Accumulate output into a single string so that we avoid
***************
*** 144,159 ****
              # unfortunate interaction between the Nagle algorithm and
              # delayed acks.  If we send a very large string, only a
              # portion of it will actually be delivered at a time.
  
-             # We chose 60000 as the socket limit by looking at the
-             # largest strings that we could pass to send() without
-             # blocking.
- 
              l = 0
              for i in range(len(output)):
                  l += len(output[i])
!                 if l > 60000:
                      break
  
              i += 1
              # It is very unlikely that i will be 1.
--- 149,160 ----
              # unfortunate interaction between the Nagle algorithm and
              # delayed acks.  If we send a very large string, only a
              # portion of it will actually be delivered at a time.
  
              l = 0
              for i in range(len(output)):
                  l += len(output[i])
!                 if l >= self._write_chunk_size:
                      break
  
              i += 1
              # It is very unlikely that i will be 1.
***************
*** 189,198 ****
                  "This action is temporarily unavailable."
                  "<p>"
                  )
          # do two separate appends to avoid copying the message string
!         self.__output.append(struct.pack(">i", len(message)))
!         self.__output.append(message)
  
      def close(self):
          if self.__closed is None:
              self.__closed = 1
--- 190,206 ----
                  "This action is temporarily unavailable."
                  "<p>"
                  )
          # do two separate appends to avoid copying the message string
!         l = len(message)
!         self.__output.append(struct.pack(">i", l))
!         if l <= self._write_chunk_size:
!             self.__output.append(message)
!         else:
!             # fragment large strings now, to avoid O(n2) behavior when
!             # dealing with huge strings.
!             for start in range(0,l,self._write_chunk_size):
!                 self.__output.append(message[start:start+self._write_chunk_size])
  
      def close(self):
          if self.__closed is None:
              self.__closed = 1

--------------Boundary-00=_32NLVJP3UKKREIAPFMQ8--