Ticket #174 (closed defect: invalid)
possible performance/lock issues under SMP
| Reported by: | oliver@… | Owned by: | bart |
|---|---|---|---|
| Priority: | major | Milestone: | 0.9.5 |
| Component: | eAccelerator | Version: | 0.9.5 |
| Keywords: | | Cc: | |
Description
Situation:
eAccelerator 0.9.5-rc1 [shm:mmap_anon sem:spinlock] PHP 5.1.6 [ZE 2.1.0]
Using apache2handler on FreeBSD server 6.1-RELEASE-p3 FreeBSD 6.1-RELEASE-p3 #2: Wed Sep 13 14:56:33 BST 2006 root@server:/usr/src/sys/amd64/compile/MISSION amd64
2x Dual Core AMD Opterons
4GB RAM.

[eaccelerator]
zend_extension="/usr/local/lib/php/20050922/eaccelerator.so"
eaccelerator.shm_size="128"
eaccelerator.cache_dir="/tmp/eaccelerator"
eaccelerator.enable="1"
eaccelerator.optimizer="1"
eaccelerator.check_mtime="1"
eaccelerator.debug="0"
eaccelerator.filter=""
eaccelerator.shm_max="0"
eaccelerator.shm_ttl="10"
eaccelerator.shm_prune_period="5"
; don't use disk cache right now, as it causes segfaults
; eaccelerator.shm_only="0"
eaccelerator.shm_only="1"
;eaccelerator.keys = "shm_and_disk"
;eaccelerator.sessions = "shm_and_disk"
;eaccelerator.content = "shm_and_disk"
eaccelerator.keys = "shm"
eaccelerator.sessions = "shm"
eaccelerator.content = "shm"
eaccelerator.compress="1"
eaccelerator.compress_level="9"
eaccelerator.allowed_admin_path="/usr/local/www/mission.realtsp.com/server/control.php"

Caching enabled: yes
Optimizer enabled: yes
Memory usage: 75.32% (80.95MB/ 128.00MB)
Free memory: 28MB
Cached scripts: 1296
Number of apache processes: ~50
Problem:
We are getting very high "system load" in top (vs user load which is relatively low):
last pid: 26059; load averages: 20.47, 15.48, 15.53 up 0+06:04:09 13:31:37
CPU states: 18.0% user, 0.0% nice, 75.6% system, 0.2% interrupt, 7.3% idle
When we profile the PHP application, it turns out that 75% of total parse time is spent on doing all the "requires". When server load is low the requires take 100ms. When load grows the requires take 1.3s, while the rest of the application code grows from 200ms to 500ms. I.e. eA is getting 13x slower at providing the cached code while the rest of the app is only 2.5x slower.
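The ratios quoted above can be checked with a few lines of arithmetic. This is only a sketch: it assumes the "requires" time and the "rest of the application" time are the two additive parts of one request, which the report does not state explicitly, and the variable and function names are made up for illustration.

```python
# Timings quoted in the report, in milliseconds.
REQUIRES_LOW_MS, REQUIRES_HIGH_MS = 100, 1300  # time spent in the "requires"
REST_LOW_MS, REST_HIGH_MS = 200, 500           # rest of the application code


def slowdown(low_ms, high_ms):
    """Factor by which a phase gets slower between low and high load."""
    return high_ms / low_ms


def share(part_ms, other_ms):
    """Fraction of the request spent in `part`, assuming the two phases add up."""
    return part_ms / (part_ms + other_ms)


requires_slowdown = slowdown(REQUIRES_LOW_MS, REQUIRES_HIGH_MS)
rest_slowdown = slowdown(REST_LOW_MS, REST_HIGH_MS)
share_low = share(REQUIRES_LOW_MS, REST_LOW_MS)
share_high = share(REQUIRES_HIGH_MS, REST_HIGH_MS)

if __name__ == "__main__":
    print(f"requires slowdown: {requires_slowdown:.1f}x")
    print(f"rest slowdown:     {rest_slowdown:.1f}x")
    print(f"requires share at low load:  {share_low:.0%}")
    print(f"requires share at high load: {share_high:.0%}")
```

Under that additive assumption the requires account for about a third of the request at low load and a little over 70% under load, which is roughly in line with the 75% figure from profiling.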
It seems like the apache processes are maybe "locking each other out" from retrieving the cache. Each process has to "require" about 500 files = 30MB on each request. Should there even be a lock when they are only reading the cache?
There is lots of memory on this machine. Having eA @ 128MB gives the worst result. If we reduce it to 32MB (which produces some swapping in eA memory) it runs much, much better!
We have also changed our application code to use autoload so we only require what we need. That obviously helps enormously. Previously we never bothered with autoload because eA was so fast at serving the compiled code. Now, on this 4 CPU-core machine with high load, it is a very different story. 70% of the CPU is eaten up by waiting for eA to serve the compiled code. Turning eA off altogether only slows it down a further 10%.
We have contained the situation for now by "requiring" less by using autoload and using a smaller (weird!) eA memory.
We are hoping that there is a better long-term solution.
Should we try SEM=IPC or shm=sysv? How can we compile with those options?
Thanks
Attachments
Change History
comment:3 Changed 4 years ago by bart
I've been doing some benchmarking and pthread locking seems to be almost as fast as spinlocks, but that is on a UP (uniprocessor) machine. I think it will be even faster than spinlocks on SMP machines because, like you say, sometimes eA will just be spinning and waiting. I recommend you try pthread locking. I haven't done much testing, but my first tests show that it's stable and doesn't produce any problems.
comment:4 Changed 4 years ago by bart
- Version set to 0.9.5
- Milestone set to 0.9.5
I talked to someone who had the same problem once. He contributed the attached patch to mmcache; it got included and then, right before Dmitry left, it was removed again without reason. Can you test this one? It should help the FreeBSD scheduler when using spinlocks.
comment:5 Changed 4 years ago by oliver
Thanks for the contributed patch. The good news is that it patched fine against 0.9.5-rc1, compiled perfectly (configure detected sched.h correctly) and the resulting eaccelerator.so worked perfectly first time when I restarted apache. :-)
The disappointing news is that I didn't get any measurable improvement in eaccelerator performance under the same test conditions we had before. :-( It still seems to get all locked up when the load increases.
My current theory is that 4 CPUs trying to load 500 PHP files 10 times per second is just an awful lot and will always result in locking/performance issues.
As an alternative we are now investigating "packaging" the many many class files we have into larger files which always get loaded together. We should be able to reduce the total number of cached files from ~500 to < 100, which will "solve" the real world problem we have.
I still wonder why it is necessary to "lock" at all when accessing the cache for "reading". Or at least why a "read-lock" should ever have to wait unless another process is currently writing compiled code into the cache, which will happen only after a restart or file update. Why does eaccelerator give "exclusive read locks" (if I have understood correctly what it does)?
Oliver
comment:6 Changed 4 years ago by oliver
bart tells me that: "eA lock asks a RW lock everytime, because it tries to do some garbage collection when searching a bucket"
and "it locks the whole memory block when it loops through the hashtable bucket, just in case there is a bucket that needs to be removed"
which possibly explains some of the locking issues, as it would be better to just use read-only-locks for >99% of cases.
bart says: "I'm working on improving that whole part in 0.9.6 but it's quite some work"
In the meantime we are investigating further with sysvipc locks etc...
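To make the difference between an exclusive lock and a read lock concrete, here is a minimal sketch of a readers-writer lock next to a plain exclusive lock. It is not eAccelerator's code: eA coordinates separate Apache processes through shared memory, while this sketch uses threads inside one Python process, and every name in it (`ReadWriteLock`, `ExclusiveLock`, `readers_overlap`) is hypothetical. It only shows that several readers can hold a read lock at the same moment, whereas an exclusive lock serializes them.

```python
import threading


class ReadWriteLock:
    """Minimal readers-writer lock: many concurrent readers OR one writer."""

    def __init__(self):
        self._cond = threading.Condition()
        self._readers = 0
        self._writer = False

    def acquire_read(self):
        with self._cond:
            while self._writer:          # readers only wait for a writer
                self._cond.wait()
            self._readers += 1

    def release_read(self):
        with self._cond:
            self._readers -= 1
            if self._readers == 0:
                self._cond.notify_all()  # wake a waiting writer, if any

    def acquire_write(self):
        with self._cond:
            while self._writer or self._readers > 0:
                self._cond.wait()
            self._writer = True

    def release_write(self):
        with self._cond:
            self._writer = False
            self._cond.notify_all()


class ExclusiveLock:
    """Every access, even a read, takes the same exclusive lock."""

    def __init__(self):
        self._lock = threading.Lock()

    def acquire_read(self):
        self._lock.acquire()

    def release_read(self):
        self._lock.release()


def readers_overlap(lock, n=4, timeout=2.0):
    """Return True if n readers can all hold the read lock at the same time."""
    barrier = threading.Barrier(n)
    results = []

    def reader():
        lock.acquire_read()
        try:
            # The barrier only opens if all n readers are inside the lock at once.
            barrier.wait(timeout=timeout)
            results.append(True)
        except threading.BrokenBarrierError:
            results.append(False)
        finally:
            lock.release_read()

    threads = [threading.Thread(target=reader) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(results) == n and all(results)


if __name__ == "__main__":
    print("rwlock readers overlap:   ", readers_overlap(ReadWriteLock()))
    print("exclusive readers overlap:", readers_overlap(ExclusiveLock(), timeout=0.2))
```

With the exclusive lock the first reader times out at the barrier because no other reader can get in, which is the "locking each other out" behaviour suspected earlier in this ticket. A cache that is written only after a restart or file update would spend nearly all of its time on the read path.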
comment:7 Changed 4 years ago by oliver
- Status changed from new to closed
- Resolution set to invalid
OK, I was on the wrong track: the problem was *_once, not eA (or APC, for that matter). See here for more info: http://pecl.php.net/bugs/bug.php?id=8765
and here for detailed findings of alternatives to *_once.
http://propel.tigris.org/servlets/ReadMsg?list=dev&msgNo=1782
