roxen.lists.pike.general

Subject: Re: Startup times Pike vs. Perl
Author: Stephen R. van den Berg <srb[at]cuci[dot]nl>
Date: 17-08-2009
Stephen R. van den Berg wrote:
>Pike and Perl straces included below (I tried creating ltraces, but that
>is a bit timeconsuming (at least in the Pike case it is)):

I did a partial ltrace.  It is attached (lzma-compressed).
It contains gems such as long runs of:

  0.000251 memcpy(0x08a327bc, "gpike/src/iterators.cmod", 24) = 0x08a327bc
  0.000490 memset(0x08a374e8, '\000', 212)       = 0x08a374e8
  0.000305 gettimeofday(0x08a37500, NULL <unfinished ...>
  0.000165 SYS_gettimeofday(0x08a37500, NULL)    = 0
  0.000199 <... gettimeofday resumed> )          = 0
  0.000093 malloc(108)                           = 0x08a446f0
  0.000264 memset(0x08a446f4, '\000', 4)         = 0x08a446f4
  0.000301 memset(0x08a446f8, '\000', 4)         = 0x08a446f8
  0.000302 memset(0x08a446fc, '\000', 4)         = 0x08a446fc
  0.000299 memset(0x08a44700, '\000', 4)         = 0x08a44700
  0.000300 memset(0x08a44704, '\000', 4)         = 0x08a44704
  0.000299 memset(0x08a44708, '\000', 4)         = 0x08a44708
  0.000301 memset(0x08a4470c, '\000', 4)         = 0x08a4470c
  0.000299 memset(0x08a44710, '\000', 4)         = 0x08a44710
  0.000300 memset(0x08a44714, '\000', 4)         = 0x08a44714

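That seems a rather cumbersome way to initialise an array to zero: nine
consecutive 4-byte memset calls on adjacent addresses in this excerpt alone,
where a single call over the whole range would do.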
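
A minimal sketch of the difference, in C.  The struct and both function
names are made up for illustration and are not taken from the Pike source;
the nine 4-byte slots simply mirror the nine memset calls in the excerpt.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the block being cleared: nine adjacent
 * 4-byte slots, matching the nine memset calls in the excerpt. */
struct slots {
    uint32_t field[9];
};

/* What the trace suggests: one library call per 4-byte slot. */
void clear_per_field(struct slots *s)
{
    for (size_t i = 0; i < 9; i++)
        memset(&s->field[i], 0, sizeof s->field[i]);
}

/* Same result with a single call over the whole range. */
void clear_at_once(struct slots *s)
{
    memset(s, 0, sizeof *s);
}
```

Both functions leave the same bytes in memory; the second needs one call
instead of nine.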

And we seem to be doing a lot of:
  0.000374 strlen("error_name")                  = 10
  0.000338 memcmp(0x8a3068c, 0x819e552, 10, 0, 0x8a2f310) = 0
  0.000361 _setjmp(0xbfdf5fa4, 0xbfdf5fa0, 0, 0xbfdf6140, 0x8a3033c) = 0
  0.000361 strlen("is_cpp_error")                = 12
  0.000361 memcpy(0x08a30308, "is_cpp_error", 12) = 0x08a30308
  0.000393 _setjmp(0xbfdf5fc8, 0xbfdf5fc4, 0, 0xbfdf6140, 0) = 0

It seems that for every (static) string entering Pike's hashed string
collection, we do a strlen *and* a memcpy.  I'd have guessed that
calculating the hash and determining the length could be combined into a
single pass over the string, and that the memcpy could then be dropped
because the string is const?  (I haven't looked at the actual Pike source yet.)
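
As a rough sketch of the combined pass, in C: the function below walks the
string once and returns both its length and a hash.  The names are
hypothetical, and FNV-1a is only a placeholder hash chosen for brevity; it
is not claimed to be what Pike uses.

```c
#include <stddef.h>
#include <stdint.h>

/* Result of a single walk over a NUL-terminated string. */
struct hash_len {
    uint32_t hash;
    size_t len;
};

/* Computes length and hash together.  FNV-1a is used only as a
 * placeholder; it is not claimed to be Pike's hash function. */
struct hash_len hash_and_len(const char *s)
{
    struct hash_len r = { 2166136261u, 0 };   /* FNV-1a 32-bit offset basis */

    while (s[r.len] != '\0') {
        r.hash ^= (unsigned char)s[r.len];
        r.hash *= 16777619u;                  /* FNV-1a 32-bit prime */
        r.len++;
    }
    return r;
}
```

This removes the separate strlen.  Whether the memcpy can go as well depends
on whether the string table may keep a pointer to storage it does not own,
which this sketch does not address.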

Another odd case is:

  0.000248 strlen("__register_new_program")      = 22
  0.000455 malloc(13328 <unfinished ...>
  0.000154 SYS_brk(0x08a8e000)                   = 0x08a8e000
  0.000221 <... malloc resumed> )                = 0x08a6a8a8
  0.000112 memcpy(0x08a6dc98, "", 6)             = 0x08a6dc98
  0.000334 malloc(21520)                         = 0x08a6dcc0
  0.000306 memcmp(0x8a6dc98, 0x8a6dc64, 6, 1, 0x8a2ee74) = 0
  0.000375 memcpy(0xbfdf611f, "\002", 1)         = 0xbfdf611f
  0.000342 memcpy(0xbfdf611f, "\001", 1)         = 0xbfdf611f
  0.000308 memcpy(0xbfdf611f, "\003", 1)         = 0xbfdf611f
  0.000306 memcpy(0xbfdf611f, "\r", 1)           = 0xbfdf611f
  0.000304 memcpy(0xbfdf611f, "\004", 1)         = 0xbfdf611f
  0.000474 memcpy(0xbfdf611f, "\004", 1)         = 0xbfdf611f

followed by many more of these silly one-byte memcpys.  I'm not quite sure
where in the code this happens, but at the very least these could be plain
one-character assignments instead of calls to memcpy.  And since they all
go to the same address, the character might even just be kept in a
register.  In case this is an artifact of my gcc playing tricks
because of the particular optimisation settings I'm using here, could
someone verify this on their system as well?
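
For comparison, a small C sketch of the two forms.  Both helper names are
hypothetical; the destination stands in for the stack slot at 0xbfdf611f
in the trace.

```c
#include <string.h>

/* Copies one byte through the library routine, as seen in the trace. */
void put_byte_memcpy(char *dst, const char *src)
{
    memcpy(dst, src, 1);
}

/* Same effect as a plain store, with no function call involved. */
void put_byte_direct(char *dst, const char *src)
{
    *dst = *src;
}
```

With optimisation enabled, gcc typically expands a constant-size memcpy
like the first one inline, in which case ltrace has no call to show.  Seeing
the calls in the trace therefore suggests either a length that is not a
compile-time constant or a build in which that expansion is disabled.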
-- 
Sincerely,
           Stephen R. van den Berg.

"God... root...  what's the difference?..."