Roxen Mailing List Mirror

roxen.lists.roxen.general

Subject	Author	Date
RE: database timeouts?	Henrik_Grubbström <grubba[at]roxen[dot]com>	17-02-2006
On Fri, 17 Feb 2006, Graeme Davis wrote:

> It hangs the whole server and I have a script that will kill -USR1 backtrace
> & restart Roxen when it's hung.  Looking at some of the debug logs, it seems
> like most threads are hung on destroy() calls.   But it's hung on calls to
> my local MySQL dbs which I know are up, so something is causing everything
> to hang.... Could it be the create() call that locks stuff?

The locking is probably done in DBManager.pmod, which means that the
lock is held for too long.

> Background: "PERPT" is the shoddy Oracle DB that goes down a lot.
>
> 15:51:00  : __builtin.mutex: lock()
> 14m38.7s  : base_server/roxenloader.pike:1413: SQL( "mysql;//u:p@h/etms:-"
> )->destroy()
>          : base_server/emit_object.pike:61: get_row()
>
>          : ### Thread 11:
>          : __builtin.mutex: lock()
>          : base_server/roxenloader.pike:1413: SQL( "local:rw" )->destroy()
>          : base_server/roxen.pike:5038:
> roxen->compile_security_pattern("",RoxenModule(CCARE/email#0))
>
>          : ### Thread 14:
>          : pike/lib/pike/modules/Sql.pmod/oracle.pike:
> create("PERPT","","u","p")
>          : pike/lib/pike/modules/Sql.pmod/Sql.pike:223:
> create("PERPT",0,"u","p",0)
> 15:51:00  : etc/modules/DBManager.pmod:243:
> sql_cache_get("oracle://u:p@PERPT")

Ok, to me the above looks like thread 14 has taken the sq_cache_lock, and 
thread 11 (and others) hang waiting for it.

A possible workaround could be to start the server with -DNO_DB_REUSE.

A proper fix would probably involve letting DBManager.sql_cache_get() 
release the sq_cache_lock during the call to get_sql_handler().
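
To illustrate the idea (this is not the actual DBManager.pmod code, just a
rough Python sketch of the locking pattern; cache_lock, cache, slow_connect,
cache_get and cache_put are all made-up names), the lock is held only while
the shared cache is touched and is released before the potentially slow
connection attempt:

```python
import threading
import time

cache_lock = threading.Lock()  # hypothetical stand-in for sq_cache_lock
cache = {}                     # db url -> list of idle connections


def slow_connect(db_url):
    """Hypothetical stand-in for get_sql_handler(); may block for a long time."""
    if db_url.startswith("oracle://"):
        time.sleep(0.5)  # simulate a database that is slow to answer
    return {"url": db_url}


def cache_get(db_url):
    # Hold the lock only while looking at the shared cache.
    with cache_lock:
        idle = cache.get(db_url)
        if idle:
            return idle.pop()
    # The lock is released here, so a slow connect blocks only this thread.
    return slow_connect(db_url)


def cache_put(db_url, conn):
    # Returning a connection is a short operation, so holding the lock is fine.
    with cache_lock:
        cache.setdefault(db_url, []).append(conn)
```

With this layout a thread stuck connecting to the unresponsive database no
longer keeps other threads from fetching or returning handles for the local
databases. The cost is that two threads may open connections to the same
database at the same time, which should be harmless for a connection cache.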

> Hope this provides more info on potential solutions =)
>
> Thanks a lot,
>
> Graeme

--
Henrik Grubbström					<grubba[at]roxen.com>
Roxen Internet Software AB
