Ticket #232 (closed defect: fixed)
httpd threads stuck, all doing EACCELERATOR_LOCK_RW for ever
| Reported by: | Tarkbark | Owned by: | somebody |
|---|---|---|---|
| Priority: | critical | Milestone: | |
| Component: | eAccelerator | Version: | 0.9.4 |
| Keywords: | | Cc: | |
Description (last modified by bart)
Hi,
We are running Apache 2.0.59, PHP 4.4.4 and eAccelerator 0.9.4. A few times a month Apache ends up with all threads running the same portion of code over and over again. If more info is needed, send an email to hakan@….
Here is the loop that is running over and over again:
(gdb)
117 EACCELERATOR_LOCK_RW ();
(gdb)
118 p = &eaccelerator_mm_instance->locks;
(gdb)
119 while ((*p) != NULL) {
(gdb)
120 if (strcmp ((*p)->key, x->key) == 0) {
(gdb)
131 p = &(*p)->next;
(gdb)
120 if (strcmp ((*p)->key, x->key) == 0) {
(gdb)
124 if (x->pid == (*p)->pid) {
(gdb)
133 if ((*p) == NULL) {
(gdb)
137 EACCELERATOR_UNLOCK_RW ();
(gdb)
138 if (ok) {
(gdb)
149 t.tv_sec = 0;
(gdb)
151 select (0, NULL, NULL, NULL, &t);
(gdb)
150 t.tv_usec = 100;
(gdb)
151 select (0, NULL, NULL, NULL, &t);
(gdb)
117 EACCELERATOR_LOCK_RW ();
(gdb)
118 p = &eaccelerator_mm_instance->locks;
(gdb)
119 while ((*p) != NULL) {
(gdb)
120 if (strcmp ((*p)->key, x->key) == 0) {
(gdb)
131 p = &(*p)->next;
(gdb)
120 if (strcmp ((*p)->key, x->key) == 0) {
(gdb)
124 if (x->pid == (*p)->pid) {
(gdb)
133 if ((*p) == NULL) {
(gdb)
137 EACCELERATOR_UNLOCK_RW ();
(gdb)
138 if (ok) {
(gdb)
149 t.tv_sec = 0;
(gdb)
151 select (0, NULL, NULL, NULL, &t);
(gdb)
150 t.tv_usec = 100;
(gdb)
151 select (0, NULL, NULL, NULL, &t);
(gdb)
117 EACCELERATOR_LOCK_RW ();
(gdb)
118 p = &eaccelerator_mm_instance->locks;
(gdb)
119 while ((*p) != NULL) {
(gdb)
120 if (strcmp ((*p)->key, x->key) == 0) {
(gdb)
131 p = &(*p)->next;
(gdb)
120 if (strcmp ((*p)->key, x->key) == 0) {
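For readers following the trace, here is a minimal model of the kind of polling loop it suggests: scan a list of lock entries for a matching key, and if another pid holds it, sleep 100 microseconds with select() and retry. This is a sketch, not the eAccelerator source. The names struct lock_entry, key_is_available, wait_for_key and max_attempts are hypothetical, and the real code additionally holds EACCELERATOR_LOCK_RW around the scan and inserts its own entry when the key is free. The retry bound is an addition for illustration: without it, a holder that never removes its entry keeps every waiter polling indefinitely.

```c
#include <stddef.h>
#include <string.h>
#include <sys/select.h>
#include <sys/time.h>

/* Hypothetical stand-in for one entry in the shared lock list. */
struct lock_entry {
    const char *key;
    int pid;
    struct lock_entry *next;
};

/* Returns 1 if `pid` may take `key`: either no entry has that key, or the
 * matching entry already belongs to `pid`. Returns 0 if another pid holds it. */
static int key_is_available(const struct lock_entry *head, const char *key, int pid)
{
    const struct lock_entry *p;
    for (p = head; p != NULL; p = p->next) {
        if (strcmp(p->key, key) == 0) {
            return p->pid == pid;
        }
    }
    return 1;
}

/* Polls like the traced loop, but gives up after max_attempts.
 * Returns the attempt number that succeeded, or -1 after giving up. */
static int wait_for_key(const struct lock_entry *head, const char *key, int pid,
                        int max_attempts)
{
    int attempt;
    for (attempt = 1; attempt <= max_attempts; attempt++) {
        struct timeval t;
        if (key_is_available(head, key, pid)) {
            return attempt;
        }
        t.tv_sec = 0;
        t.tv_usec = 100;                 /* same 100 microsecond pause as in the trace */
        select(0, NULL, NULL, NULL, &t); /* portable sub-second sleep */
    }
    return -1;
}
```

Calling wait_for_key with a key held by a different pid returns -1 once the bound is reached, whereas an unbounded version of the same loop would never return.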
Attachments
Change History
comment:2 Changed 4 years ago by Tarkbark
Hi, I was asked to attach some info from config.h:
#define MM_SEM_SPINLOCK 1
#define MM_SHM_IPC 1
comment:3 Changed 3 years ago by JonathanO
- Priority changed from major to critical
I suspect that this is probably a dup of #224. It's causing us serious problems, so I'm upping the priority.
comment:5 Changed 3 years ago by terrysduncan
I am occasionally seeing something very similar using lighttpd 1.4.11, EA 0.9.5 and PHP 5.2.2 on an ARM processor. The PHP processes go nuts. I have debugged it a bit and discovered that the free list in the shared memory segment is corrupt: it shows one node on the free list, and that node points back to itself, which causes the PHP processes to spin.
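A self-referencing node like the one described here can be detected without risking an endless walk. The sketch below uses Floyd's tortoise-and-hare check on a hypothetical struct free_node. The struct layout and the function name free_list_has_cycle are assumptions for illustration and do not reflect the actual structures in mm.c, which may well link blocks by offsets within the shared segment rather than by raw pointers.

```c
#include <stddef.h>

/* Hypothetical free-list node; not the layout used by eAccelerator's mm.c. */
struct free_node {
    size_t size;
    struct free_node *next;
};

/* Floyd's tortoise-and-hare: returns 1 if following `next` from `head`
 * ever loops back on itself, 0 if the list terminates in NULL. */
static int free_list_has_cycle(const struct free_node *head)
{
    const struct free_node *slow = head;
    const struct free_node *fast = head;

    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;         /* advances one node */
        fast = fast->next->next;   /* advances two nodes */
        if (slow == fast) {
            return 1;              /* the pointers can only meet inside a cycle */
        }
    }
    return 0;
}
```

A single node whose next pointer refers to itself is reported as a cycle on the first iteration, so a debugging build could run this check before walking the list.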
I have also noticed that the semaphore value is hosed. With an IPC semaphore, the value should always be 1 unless some process has it locked, but I am seeing an increasing value there. It follows that if the locking mechanism is not working, it leaves open the possibility of corrupting the free list.
I have a lot of testing to do before I can declare victory, but I believe the problem is that there is a call to mm_unlock() in eaccelerator_clean_request() without a matching mm_lock(). Can anyone tell me why that call is there? Is it going to hose something up if I remove it? terry dot s dot duncan at intel dot com
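The increasing semaphore value described in this comment can be reproduced in isolation with a System V semaphore. The sketch below is a standalone demonstration, not eAccelerator code: sem_adjust, value_after_unmatched_unlocks and union semun_arg are hypothetical names, it assumes a Linux-style semctl() where the caller defines the union, and it uses IPC_NOWAIT instead of SEM_UNDO so that a mistake cannot block the demo.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>

/* On Linux the caller must define the union passed to semctl(). */
union semun_arg {
    int val;
    struct semid_ds *buf;
    unsigned short *array;
};

/* Adds `delta` to semaphore 0 without blocking; 0 on success, -1 on error. */
static int sem_adjust(int semid, short delta)
{
    struct sembuf op;
    op.sem_num = 0;
    op.sem_op = delta;
    op.sem_flg = IPC_NOWAIT;  /* SEM_UNDO left out to keep the demo minimal */
    return semop(semid, &op, 1);
}

/* Creates a private semaphore initialised to 1 (unlocked), performs
 * `extra_unlocks` unlocks that have no matching lock, then one matched
 * lock/unlock pair. Returns the final value, or -1 on error.
 * The semaphore is removed before returning. */
static int value_after_unmatched_unlocks(int extra_unlocks)
{
    union semun_arg arg;
    int i, value;
    int semid = semget(IPC_PRIVATE, 1, IPC_CREAT | 0600);

    if (semid < 0) {
        return -1;
    }
    arg.val = 1;
    if (semctl(semid, 0, SETVAL, arg) < 0) {
        semctl(semid, 0, IPC_RMID);
        return -1;
    }
    for (i = 0; i < extra_unlocks; i++) {
        sem_adjust(semid, 1);   /* unlock with no preceding lock */
    }
    sem_adjust(semid, -1);      /* matched lock */
    sem_adjust(semid, 1);       /* matched unlock */
    value = semctl(semid, 0, GETVAL);
    semctl(semid, 0, IPC_RMID);
    return value;
}
```

Each unmatched unlock leaves the value one higher than before. Once the value exceeds 1, more than one process can decrement it without blocking, so the semaphore no longer provides mutual exclusion.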
comment:6 Changed 3 years ago by terrysduncan
Here is a patch that seems to address this problem. I have seen no side effects from removing the unlock call, and I have not seen the spinning PHP issue since implementing it. The mm.c change below is not necessary, but it would prevent other potential unmatched lock/unlock calls from causing problems.
--- eaccelerator.c.orig 2007-05-16 12:07:31.000000000 -0700
+++ eaccelerator.c 2007-12-10 13:41:23.000000000 -0800
@@ -1752,7 +1752,6 @@
   mm_used_entry *p = (mm_used_entry*)EAG(used_entries);
   if (eaccelerator_mm_instance != NULL) {
     EACCELERATOR_UNPROTECT();
-    mm_unlock(eaccelerator_mm_instance->mm);
     if (p != NULL || eaccelerator_mm_instance->locks != NULL) {
       EACCELERATOR_LOCK_RW();
       while (p != NULL) {
--- mm.c.orig 2006-10-11 05:45:52.000000000 -0700
+++ mm.c 2007-12-07 16:14:44.000000000 -0800
@@ -357,10 +357,18 @@
   return 1;
 }
 
+static int locked = 0;
+
 static int mm_do_lock(mm_mutex* lock, int kind) {
   int rc;
   struct sembuf op;
 
+  if (locked)
+  {
+    ea_debug_log("eAccelerator: attempted double lock: %u\n", getpid());
+    return 1;
+  }
+  locked++;
   op.sem_num = 0;
   op.sem_op = -1;
   op.sem_flg = SEM_UNDO;
@@ -374,6 +382,12 @@
   int rc;
   struct sembuf op;
 
+  if (!locked)
+  {
+    ea_debug_log("eAccelerator: attempted double unlock: %u\n", getpid());
+    return 1;
+  }
+  locked--;
   op.sem_num = 0;
   op.sem_op = 1;
   op.sem_flg = SEM_UNDO;
Changed 3 years ago by terrysduncan: attachment eaccelerator-lockbug.patch added (Shared memory locking patch)
comment:7 Changed 10 months ago by hans
- Status changed from new to closed
- Resolution set to fixed
Bart already seems to have fixed this in rev @342
Thanks for your input on this!