roxen.lists.roxen.general

Subject: Re: Lots of patches that can be integrated into Roxen 5.0 in a heartbeat
Author: Martin Stjernholm <mast[at]roxen[dot]com>
Date: 17-01-2009
"Stephen R. van den Berg" <srb[at]cuci.nl> wrote:

>>/.../ the caching gets too defensive - practically no rxml code gets
>>cached in the protocol cache. Instead one has to use the
>><set-max-cache> tag at the end to raise the cache time.
>
> Which turns out to be messy at best. I either got too little effect,
> or too much effect, it was never "just right". It's the reason I
> started adding the set-expr-cache (but haven't finished that).

It's the reason we in our CMS cache the compiled rxml pages in a
database, so that the <cache> tags can be used to fine-control the
caching inside a page. Of course, the protocol cache is still
important to use whenever possible - it's a lot faster.

It'd be fairly simple to add p-code caching to rxmlparse.pike as well
(actually got a half-baked patch for that).
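
To make the idea concrete, here is a minimal sketch of caching the
compiled form of a page so that it is only recompiled when its source
changes. It is Python rather than Pike and is not taken from the CMS or
from any patch; CompiledPageCache, compile_fn and the in-memory dict
standing in for the database are all hypothetical.

```python
import hashlib


class CompiledPageCache:
    """Caches the compiled form of a page, keyed by path and source digest."""

    def __init__(self, compile_fn):
        # compile_fn is a hypothetical stand-in for the real page compiler.
        self._compile = compile_fn
        # A plain dict stands in for the database table (assumption).
        self._store = {}
        self.compilations = 0

    def get(self, path, source):
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        hit = self._store.get(path)
        if hit is not None and hit[0] == digest:
            # Source unchanged: reuse the stored compiled form.
            return hit[1]
        # Source is new or has changed: compile and remember the result.
        compiled = self._compile(source)
        self.compilations += 1
        self._store[path] = (digest, compiled)
        return compiled
```

The point is only that the compile step is skipped while the source
digest stays the same; how the compiled form is serialized is left out.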

> I see.  Well, I never run without thread support...  Except, when running
> under gdb.  Debugging a threaded program still is more painful than
> debugging a single threaded one.

Is it? I've never felt bothered by it (unless, of course, it's a
thread problem I'm trying to debug). So you have a pike compiled with
--without-threads for debugging?

> So in order to simplify debugging I'd wish the no-threading support
> to not completely die; /.../

Well, if your patches are what's required for that then it's ok. But
they need a bit of cleanup first.

> Fine with me, however, in that same file, there are "global" variables and
> class-local variables with the same name; that is confusing (at least it
> was to me while I was trying to understand the code).  Please rename
> the old or new ones.

Well, I'd rather not, actually. They're after all exposed functions and
variables so there _could_ theoretically be code out there that
breaks. Maybe I'm overly paranoid, but otoh hidden identifiers aren't
exactly unheard of - one could also say that you better just get used
to them. ;>

> The reasons it would be good to do it the new way are:
> a. It would save a *lot* of cycles in the hottest path, as long as you match
>    exactly.

I doubt it's a _lot_ of cycles, but it's some. If one's into that, I
guess a bit more could be shaved off by splitting the matching so that
the host is matched separately from the path (which I believe almost
never is a pattern).

And it is after all possible to retain the order even with this: Just
keep a flag for every entry to tell whether it's a glob or not. (Not
that it'd necessarily be an unbending requirement to keep compat here,
though.)
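
A rough sketch of what I mean, in Python rather than Pike and not
based on the actual matching code: exact entries go into a hash, and
glob entries are remembered together with their position, so the first
configured match still wins. UrlMatcher and its entry format are made
up for the example, and only "*" and "?" are assumed to mark a glob.

```python
from fnmatch import fnmatchcase


class UrlMatcher:
    """First configured match wins; exact entries are found by hash lookup."""

    def __init__(self, entries):
        # entries: list of (pattern, target) in configured priority order.
        self.targets = [target for _, target in entries]
        self.exact = {}   # pattern -> index of its first occurrence
        self.globs = []   # (index, pattern) for entries flagged as globs
        for idx, (pattern, _) in enumerate(entries):
            is_glob = "*" in pattern or "?" in pattern
            if is_glob:
                self.globs.append((idx, pattern))
            else:
                self.exact.setdefault(pattern, idx)

    def match(self, url):
        exact_idx = self.exact.get(url)
        # Only globs configured before the exact hit can override it.
        for idx, pattern in self.globs:
            if exact_idx is not None and idx > exact_idx:
                break
            if fnmatchcase(url, pattern):
                return self.targets[idx]
        return None if exact_idx is None else self.targets[exact_idx]
```

With no globs configured ahead of an exact entry the lookup is a single
hash probe; the loop only costs something when globs come first.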

> b. The above-mentioned case where it breaks backward compatibility is rare
>    enough that it is rather unlikely that anyone constructed something like
>    that, or is it?

Indeed, it looks like a very dubious configuration.

> c. I have personally experienced numerous occasions where the resolving
>    order of the servers and/or redirects changed over the years in Roxen,
>    usually a lot more destructive than this change (I know, weak reason,
>    consider it a late-complaint ;-).

Oh yes, we've got quite a history of change-first-think-later,
unfortunately. :P But old sins shouldn't be used to motivate new
ones.

> If Roxen one-way hashed the username/password into the generated URL,
> the adversary cannot know the url (unless he knows the password).
> (I'm not sure that it does, currently, though).

I think the relevant case to consider is that the adversary has found
the actual links through some other weakness (say by looking in a
browser cache or by getting hold of an access log), not that (s)he
tries to guess the hashes.

> I'll look into this once more (also because my fix is not conclusive, it
> seems), and come up with a better researched and worded fix/patch, we'll
> see how well that can be digested.

Ok, thank you. Maybe making it configurable is a good idea. Then the
risk assessment is in the hands of the admin.

> b. What I do/want/did is put a symlink in the tree which points to the
>    real local.
> c. During checkout operations in git, this symlink is consistently being
>    replaced by a directory and populated with the files from the repository.
> d. Even if I'd move the real local directory into the source tree (which I
>    want to avoid, since I have multiple Roxen trees lying around, which all
>    share the same local tree), there is the problem that during a checkout -f
>    git wipes out all other files which are not inside the
>    repository.

I'm inclined to consider all these weaknesses in git. One can hope it
gets refined over time to handle such things better.

I've several times been a bit hampered by the general all-or-nothing
approach in git. E.g. a working tree has to be completely clean to
switch branches (while cvs just patches the changes over into the new
tree), and one can't switch branch just in a subdirectory to quickly
check something. I hope git will get a bit more pragmatic in cases
like that for the sake of convenience.

Anyway, this is currently a non-issue for us, and implementing a
directory move would make it one. 5.0 is long overdue and we are
actually working hard to push it out the door, so we're not looking
for more trouble right now. Sorry.

> /.../ or we fix (a), and make sure that there is (preferably one) a
> place where the local path can be configured.

There is some support for a $LOCALDIR "variable" in paths. Maybe it
just needs to be extended?

> Hmmm...
> in my CMS, once I find that a certain object has changed, I force all
> upstream caches which contain (part of) this object to flush and recache.
> That might be the cause then.

On-demand invalidation of entries in <cache> tag caches is on the
wishlist. Are you using timeout tricks to achieve that?
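
For comparison, on-demand invalidation without timeouts could look
roughly like the sketch below: each cache entry records which objects
it was built from, and changing an object drops every entry that
depends on it. This is Python, purely illustrative, and not how the
<cache> tag works today; DependencyCache and its methods are
hypothetical names.

```python
from collections import defaultdict


class DependencyCache:
    """Cache whose entries can be dropped when an object they use changes."""

    def __init__(self):
        self._entries = {}
        self._dependents = defaultdict(set)   # object id -> cache keys

    def put(self, key, value, depends_on):
        self._entries[key] = value
        for obj in depends_on:
            self._dependents[obj].add(key)

    def get(self, key):
        return self._entries.get(key)

    def invalidate(self, obj):
        # Drop every entry that was built from the changed object.
        # Stale dependency links can cause extra drops, never missed ones.
        keys = self._dependents.pop(obj, set())
        for key in keys:
            self._entries.pop(key, None)
        return keys
```

Entries that do not depend on the changed object stay cached, which is
the difference from flushing everything or waiting for a timeout.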