roxen.lists.pike.general

Subject Author Date
RE: Arrays, multisets, mappings: avoiding unnecessary memory Henrik_Grubbström <grubba[at]roxen[dot]com> 07-09-2009
 (re)assigns
On Mon, 7 Sep 2009, Arjan van Staalduijnen wrote:

> Thanks for the answer. In short let me check if I got things right.
>
>> In various pieces of code I am bluntly doing frequent operations such
>> as:
>>
>> 	ids -= (< identifier >);
>> or
>> 	ids |= ({ identifier });
>> or
>> 	m_delete(ids, identifier);
>
> The temporary arrays or multisets used in the examples are cause for 
> overhead memory assignments on their own. Avoiding these temporary 
> assignments by first checking through a has_value (using these examples) 
> would be better.
>
> For multisets there is an easy alternative using the 0 or 1. It avoids 
> a destructive modification all together? Has_value() not necessary in 
> that case, because there's no downside to the 0/1-method.

I think you've gotten the terminology confused; -=, |= et al are 
non-destructive operations (ie they copy the data during operation), while 
m_delete() and `[]=() are destructive operations (ie no copying of data is 
performed).

The possible downsides with using indexing is that you lose the "multi"
of multiset, and that it is a destructive operation so you need to know
that the multiset isn't in use for some other purpose.

> For arrays the '|=' operation is always cause for a memory copy, even 
> if 'identifier' was already available in the array, due to the copy of 
> the temporary array?

Actually it's even more expensive than so; from array.c:

  * Merge two arrays and retain their order. This is done by arranging them
  * into ordered sets, merging them as sets and then rearranging the zipper
  * before zipping the sets together.

So the data is essentially copied thrice; once during sorting, once during 
zipping, and once during reshuffling.

Anyway for single items code that does

   if (!has_value(arr, item)) arr += ({ item });

should typically be faster than using |= (up until a few minutes ago that 
is, since I've added an optimization of this case to Pike 7.8).

> Arjan

--
Henrik Grubbström					<grubba[at]roxen.com>
Roxen Internet Software AB