roxen.lists.pike.general

Subject: Re: Startup times Pike vs. Perl
Author: Henrik Grubbström <grubba[at]roxen[dot]com>
Date: 17-08-2009
On Mon, 17 Aug 2009, Stephen R. van den Berg wrote:

> Stephen R. van den Berg wrote:
>> Pike and Perl straces included below (I tried creating ltraces, but that
>> is a bit time-consuming (at least in the Pike case it is)):
>
> I did a partial ltrace.  It is attached (lzma-compressed).
> It contains gems like these long strains of:
>
>  0.000251 memcpy(0x08a327bc, "gpike/src/iterators.cmod", 24) = 0x08a327bc
>  0.000490 memset(0x08a374e8, '\000', 212)    = 0x08a374e8
>  0.000305 gettimeofday(0x08a37500, NULL <unfinished ...>
>  0.000165 SYS_gettimeofday(0x08a37500, NULL)    = 0
>  0.000199 <... gettimeofday resumed> )          = 0
>  0.000093 malloc(108)                           = 0x08a446f0
>  0.000264 memset(0x08a446f4, '\000', 4)      = 0x08a446f4
[...]
>  0.000300 memset(0x08a44714, '\000', 4)      = 0x08a44714

The above looks like it's probably compilation.h in PUSH mode.

Fix on the way...

> Which seems to be a rather cumbersome way to initialise an array to zero.
>
> And we seem to be doing a lot of:
>  0.000374 strlen("error_name")                  = 10
>  0.000338 memcmp(0x8a3068c, 0x819e552, 10, 0, 0x8a2f310) = 0
>  0.000361 _setjmp(0xbfdf5fa4, 0xbfdf5fa0, 0, 0xbfdf6140, 0x8a3033c) = 0
>  0.000361 strlen("is_cpp_error")                = 12
>  0.000361 memcpy(0x08a30308, "is_cpp_error", 12) = 0x08a30308
>  0.000393 _setjmp(0xbfdf5fc8, 0xbfdf5fc4, 0, 0xbfdf6140, 0) = 0
>
> It seems that for every (static) string entering the Pike
> hashed-string-collection, we do a strlen *and* a memcpy.
> I'd have guessed that calculating the hash *and* determining the length
> could be combined, and then the memcpy is not needed anymore because the
> string is const?  (I haven't looked at the actual Pike source yet).

The reason for this is that currently the strings have to be reallocated 
to the data segment, since they are prefixed with writeable data 
(hash-value, hash-table links, the length, etc). I can think of ways to 
avoid this, but the question is how happy the C-compilers would be...
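
A minimal sketch of the layout described above, assuming a hypothetical
struct shared_string (not Pike's actual definition) and an FNV-1a hash
picked purely for illustration. It shows why the memcpy remains (the
writable header sits directly in front of the characters), while the
separate strlen could in principle be folded into the hashing pass:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layout: writable bookkeeping fields directly followed by
 * the character data. Field names are assumptions for illustration. */
struct shared_string {
    struct shared_string *next;  /* hash-table chain link (writable) */
    size_t refs;                 /* reference count (writable) */
    size_t hash;                 /* cached hash value */
    size_t len;                  /* length in bytes, excluding the NUL */
    char str[];                  /* the characters themselves */
};

/* Walk a NUL-terminated string once, producing both its hash and its
 * length. FNV-1a is an assumption; any byte-wise hash would do. */
static size_t hash_and_length(const char *s, size_t *len_out)
{
    size_t h = 2166136261u;
    size_t n = 0;
    while (s[n] != '\0') {
        h ^= (unsigned char)s[n];
        h *= 16777619u;
        n++;
    }
    *len_out = n;
    return h;
}

/* The literal lives in read-only memory without a header in front of
 * it, so it is copied into a freshly allocated, writable block. */
static struct shared_string *make_shared_string(const char *literal)
{
    size_t len;
    size_t hash = hash_and_length(literal, &len);
    struct shared_string *s = malloc(sizeof *s + len + 1);
    if (s == NULL)
        return NULL;
    s->next = NULL;
    s->refs = 1;
    s->hash = hash;
    s->len = len;
    memcpy(s->str, literal, len + 1);  /* copy includes the NUL */
    return s;
}
```

Avoiding the copy altogether would mean emitting the header as static,
writable data in front of each literal, which is where the C compilers
come in.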

> Another odd case is:
>
>  0.000248 strlen("__register_new_program")      = 22
>  0.000455 malloc(13328 <unfinished ...>
>  0.000154 SYS_brk(0x08a8e000)                   = 0x08a8e000
>  0.000221 <... malloc resumed> )                = 0x08a6a8a8
>  0.000112 memcpy(0x08a6dc98, "", 6)             = 0x08a6dc98
>  0.000334 malloc(21520)                         = 0x08a6dcc0
>  0.000306 memcmp(0x8a6dc98, 0x8a6dc64, 6, 1, 0x8a2ee74) = 0
>  0.000375 memcpy(0xbfdf611f, "2", 1)         = 0xbfdf611f
[...]
>  0.000474 memcpy(0xbfdf611f, "4", 1)         = 0xbfdf611f

I haven't identified this one yet.

> followed by numerous silly one-byte memcpys.  Not quite sure where
> this is, but these could obviously (at least) be optimised as simple
> one-character copies instead of calling memcpy.  And since they're all
> going into the same spot, it might even be a case of loading a character
> into a register?  Just in case this is an artifact of my gcc playing tricks
> because of the particular optimisation settings I'm using here, could
> someone verify this on their system as well?
> -- 
> Sincerely,
>           Stephen R. van den Berg.
>
> "God... root...  what's the difference?..."
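
For reference, the one-character copy suggested above could look roughly
like the following. This is only a sketch: copy_bytes is a hypothetical
helper, not something in the Pike source, and the call site behind the
trace has not been identified. A compiler will often inline a memcpy
whose size is a compile-time constant on its own, so calls showing up in
ltrace suggest the length is only known at run time there:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy n bytes, skipping the library call in the
 * single-byte case that dominates the trace. */
static void copy_bytes(char *dst, const char *src, size_t n)
{
    if (n == 1) {
        *dst = *src;        /* plain one-character assignment */
        return;
    }
    memcpy(dst, src, n);    /* general case */
}
```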

--
Henrik Grubbström					<grubba[at]roxen.com>
Roxen Internet Software AB