Darryl Dixon - Winterhouse Consulting-2
Hi All,
I oversee an environment that runs a Zope/ZEO cluster on some very fast HP EVA storage (50+ disks in a RAID 1+0 array), attached to some very fast HP blade servers (2x dual-core Opteron @ 3GHz with 14GB RAM). Multiple Plone sites are served by the Zope instances running in this cluster. Several of the tasks that get run on the Zope instances are very long-running and stress the storage layer considerably (e.g., load and store every content object in the database).

I have had the opportunity over the last few days to compare, apples-to-apples, how the performance of these tasks is affected by the CPU speed of the machines in question, as one of the machines in the cluster has been replaced by a 2x quad-core Opteron blade @ 2.3GHz (instead of 3GHz). So, more CPU cores, but each one slower. We run fewer instances than the max CPUs regardless, so there's no contention there.

Anyhow, I have been running a few of these long-running tasks, and the results have somewhat surprised me. A certain task that consistently takes ~20 minutes to run consistently takes ~28 minutes when the *ZEO* server is run on the quad-core machines (2.3GHz). It doesn't matter if the Zope instance running the job is on 3GHz or 2.3GHz; the job takes about 28 minutes regardless. If the ZEO server is moved to the 3GHz machine, the job drops back to taking ~20 minutes like usual. Once again, it doesn't matter if the Zope instance running the job is on 2.3GHz or 3GHz.

So my observation is simply that raw CPU speed for the ZEO server has a direct, measurable impact on the overall performance of the cluster. This was a little surprising to me, as I have always understood the 'received wisdom' in the Zope community to be that 'The Zope instances are CPU bound, the ZEO server is IO bound' [1]. Now, to be fair, these are highly spec'd machines running the database on a very fast fibre-channel SAN, but I'm still surprised to see CPU speed for object load/store turn into a measurable bottleneck.
So I guess in terms of questions for the list:

* Is this surprising to anyone else?
* Is there anyone out there running their ZEO server on some super-fast hardware such as IBM Power6? [2]

Best regards,
Darryl Dixon
Winterhouse Consulting Ltd
http://www.winterhouseconsulting.com

[1] e.g., see under Scalability: http://plone.org/documentation/tutorial/introduction-to-the-zodb/an-introduction-to-the-zodb
[2] http://www-03.ibm.com/press/us/en/pressrelease/21580.wss

_______________________________________________
Enterprise mailing list
[hidden email]
http://lists.plone.org/mailman/listinfo/enterprise
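As a quick sanity check on the figures in the post above, the job slowdown can be set against the clock-speed ratio of the two machines. This is only a minimal sketch in Python using the numbers Darryl reports (~20 vs ~28 minutes, 3GHz vs 2.3GHz); the function names are illustrative and not part of any Zope or ZEO API.

```python
def slowdown(baseline_minutes: float, observed_minutes: float) -> float:
    """Return how many times longer the observed run took than the baseline."""
    return observed_minutes / baseline_minutes


def clock_ratio(fast_ghz: float, slow_ghz: float) -> float:
    """Return the ratio of the faster clock speed to the slower one."""
    return fast_ghz / slow_ghz


# Figures taken from the post: ~20 min with ZEO on 3GHz, ~28 min on 2.3GHz.
job = slowdown(20, 28)
clock = clock_ratio(3.0, 2.3)
print(f"job slowdown: {job:.2f}x, clock ratio: {clock:.2f}x")
```

The two ratios come out close to each other, which is at least consistent with the ZEO server being CPU-limited for this job. The timings are approximate and the two blades differ in more than clock speed, so this is suggestive rather than conclusive.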
ajung
On 26.03.2009 at 21:12, Darryl Dixon - Winterhouse Consulting wrote:

> So my observation is simply that raw CPU speed for the ZEO server directly
> impacts in a measurable way on the overall performance of the cluster.
> This was a little surprising to me, as I have always understood the
> 'received wisdom' in the Zope community to be that 'The Zope instances are
> CPU bound, the ZEO server is IO bound' [1]. Now, to be fair, these are
> highly spec'd machines running the database on a very fast fibre-channel
> SAN, but I'm still surprised to see CPU speed for object load/store turn
> in to a measurable bottleneck.

Did you encounter or have you measured IO contention?

-aj
ctxlken
In reply to this post
by Darryl Dixon - Winterhouse Consulting-2
Darryl,

Just generally, if I've got a Zope/Plone application that gets many concurrent requests, then I'll want numerous ZEO clients, multiple CPUs and preferably multiple machines to spread the load across.

But if I've got long-running individual processes (Python scripts in my Zope instance that are perhaps kicked off by a cron job), then I'd rather have a single CPU with some serious horsepower. Reason being? Your long-running Python script/process isn't going to be able to leverage more than one CPU at any given point in time, so make it count (get the fastest CPU you can).

On a lot of the more serious multi-CPU machines these days, the hardware defaults aren't necessarily tuned for such long-running processes. If you can disable hyperthreading, you should for such an application. Otherwise, since your Python/Zope process is pegged to a single CPU at any time, hyperthreading virtually 'doubles' your CPUs but cuts the max CPU available to a single process in half.

Case in point: stop everything else on your server. Start a single Zope instance and run your long-running process. If you had a single CPU and no hyperthreading, that process would be able to show that it's pegging your machine at 99%-100% CPU utilization. Since you more than likely have 2x quad-core with hyperthreading, it'll appear to you that you have 8 CPUs, so the max CPU on the machine that your long-running process will ever get to utilize will be 12.5%.

So, you've got this new server with all these nice processors available, but you're only using 13% CPU. Serious waste of resources. You're better off with faster CPUs and fewer of them for your scenario, it seems.

Best of luck with your future benchmarking!

Ken Wasetis
Contextual Corp.
ken <dot> wasetis <at> contextualcorp <dot> com
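The 12.5% figure in the post above is just 100% divided by the number of CPUs the operating system reports. A minimal sketch in Python of that arithmetic, assuming a process that can only keep one logical CPU busy at a time; `max_single_thread_share` is a hypothetical helper name, not an existing API.

```python
import os


def max_single_thread_share(logical_cpus: int) -> float:
    """Percent of whole-machine CPU that one single-threaded process can use."""
    if logical_cpus < 1:
        raise ValueError("logical_cpus must be at least 1")
    return 100.0 / logical_cpus


# The case described in the post: the OS reports 8 CPUs.
print(max_single_thread_share(8))

# The same calculation for whatever machine this script runs on.
cpus = os.cpu_count() or 1
print(f"{cpus} logical CPUs -> at most {max_single_thread_share(cpus):.1f}% per process")
```

A whole-machine utilization figure this low therefore does not mean the process has headroom; the one core it runs on may already be saturated.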
Darryl Dixon - Winterhouse Consulting-2
In reply to this post
by ajung
Hi Andreas,
> On 26.03.2009 at 21:12, Darryl Dixon - Winterhouse Consulting wrote:
>
>> So my observation is simply that raw CPU speed for the ZEO server
>> directly impacts in a measurable way on the overall performance of the
>> cluster. This was a little surprising to me, as I have always understood
>> the 'received wisdom' in the Zope community to be that 'The Zope instances
>> are CPU bound, the ZEO server is IO bound' [1]. Now, to be fair, these are
>> highly spec'd machines running the database on a very fast fibre-channel
>> SAN, but I'm still surprised to see CPU speed for object load/store turn
>> in to a measurable bottleneck.
>
> Did you encounter or have you measured IO contention?

The EVA is shared storage, so there is always some background IO being performed by other users of the array, but this is very low volume and the overall throughput while these jobs run is well below the IOPS the EVA is capable of. Actual throughput volume is also below the maximum capabilities (100+ MB/sec on 2 Gbit/s FC HBAs). Also, I/O wait time on the machines in question is basically non-existent while this job is in progress.

Interestingly, the flip side is also true: while this job runs, the ZEO process consistently sits anywhere from 30-70% of a 3GHz core, so it is definitely using a pretty serious amount of CPU time.

regards,
Darryl Dixon
Winterhouse Consulting Ltd
http://www.winterhouseconsulting.com
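The post above does not say how I/O wait was measured. One possible way to get a comparable number on a Linux host is to sample the aggregate `cpu` line of `/proc/stat` twice and look at how much of the elapsed CPU time was spent in `iowait`. The sketch below is an assumption about tooling, not the method used in the thread, and the sample lines are made-up values for illustration.

```python
def parse_cpu_line(line: str) -> list:
    """Parse the aggregate 'cpu' line of /proc/stat into a list of tick counts."""
    fields = line.split()
    if not fields or fields[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line from /proc/stat")
    return [int(value) for value in fields[1:]]


def iowait_percent(before: str, after: str) -> float:
    """Percent of CPU time spent in iowait between two 'cpu' line samples."""
    first = parse_cpu_line(before)
    second = parse_cpu_line(after)
    deltas = [b - a for a, b in zip(first, second)]
    # Fields are: user nice system idle iowait irq softirq steal [guest ...].
    # Guest time is already counted in user, so only the first eight are summed.
    total = sum(deltas[:8])
    if total <= 0:
        return 0.0
    return 100.0 * deltas[4] / total


# Made-up samples, as if taken a few seconds apart during the job.
sample_before = "cpu  1000 0 500 8000 100 0 0 0"
sample_after = "cpu  1600 0 700 8690 110 0 0 0"
print(f"iowait: {iowait_percent(sample_before, sample_after):.2f}%")
```

On a real host the two strings would be the first line of `/proc/stat` read at the start and end of a sampling interval. A value near zero while the job runs would match the "basically non-existent" I/O wait described above.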
Darryl Dixon - Winterhouse Consulting-2
In reply to this post
by ctxlken
> Darryl,
>
> Just generally, if I've got a Zope/Plone application that gets many
> concurrent requests, then I'll want numerous ZEO clients, multiple CPUs
> and preferrably multiple machines to spread the load across.

Absolutely agree.

> But if I've got long-running individual processes (Python scripts in my
> Zope instance that are perhaps kicked off by a cron job), then I'd rather
> have a single CPU with some serious horse power.

Yes, definitely.

> On a lot of the more serious multi-CPU machines these days, the hardware
> defaults aren't necessarily tweaked to defaults that lend themselves as
> well to such long-running processes. If you could disable hyperthreading,
> you should for such an application, otherwise, since your Python/Zope
> process is pegged to a single CPU at any time, since you're hyperthreaded,
> your CPUs are virtually 'doubled', but your max CPU available for a single
> process is cut in half.

The point is well made, but in this case it is moot: the CPUs in question are AMD Opterons with true multi-core.

> [...] you're only using 13% CPU.
>
> You're better off with faster CPU and fewer
> of them for your scenario, it seems.

Indeed I agree, but the interesting thing seems to be that the serious CPU would be so useful for the ZEO server, which I found quite unexpected.

> Best of luck with your future benchmarking!

Thanks :)

Darryl Dixon
Winterhouse Consulting Ltd
http://www.winterhouseconsulting.com
ctxlken
In reply to this post
by Darryl Dixon - Winterhouse Consulting-2
Darryl,

Sorry, I didn't catch the part about this being related to the ZEO storage server process in the first place (versus the ZEO client(s)). That is odd.

You mentioned that you're using networked storage. Have you tried comparing performance using local storage? Even if that's not your end desire/requirement, it might be helpful in trying to identify where the bottleneck is.

The ZEO server obviously doesn't have business logic running in it (your Python script code), so it shouldn't be using a lot of CPU consistently, if things are configured well. Even if the SAN storage is reporting decent ms seek/write times, I wouldn't be surprised if you see a gain using local storage.

Ken
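One rough way to try the local-versus-networked comparison suggested above is to time small synced writes in a directory on each kind of storage. The sketch below is a generic probe in Python, not a ZEO benchmark: it does not reproduce the ZODB access pattern, and `time_synced_writes` and the probe file name are made up for illustration.

```python
import os
import tempfile
import time


def time_synced_writes(directory: str, count: int = 50, size: int = 4096) -> float:
    """Return the average seconds per write+fsync of `size` bytes in `directory`."""
    payload = os.urandom(size)
    path = os.path.join(directory, "fsync_probe.tmp")
    start = time.perf_counter()
    with open(path, "wb") as handle:
        for _ in range(count):
            handle.write(payload)
            handle.flush()
            os.fsync(handle.fileno())  # force the data out to the storage device
    elapsed = time.perf_counter() - start
    os.remove(path)
    return elapsed / count


# Demonstration against a temporary directory; for a real comparison, pass
# one directory on the SAN volume and one on a local disk.
with tempfile.TemporaryDirectory() as scratch:
    per_write = time_synced_writes(scratch)
    print(f"average synced write: {per_write * 1000:.3f} ms")
```

If both locations give similar latencies while the job time still tracks the ZEO server's CPU speed, that would point away from storage as the bottleneck.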
alan runyan-2
In reply to this post
by Darryl Dixon - Winterhouse Consulting-2
Darryl,
So what does a ZEO server do?

- It loads and stores objects from disk
- It performs conflict resolution if two stores occur simultaneously
- It invalidates client storages when an object store occurs
- Anything else?

What version of ZODB are you running? Are you using authentication in ZEO?

Are there any metrics you can gather from the ZEO server?

- zeoserverlog: analyzing some log files may produce some information. http://svn.zope.org/ZODB/trunk/src/ZEO/scripts

NOTE: This question may be worth rephrasing and asking on zodb-dev. It could very well be a Plone (application-level) issue, i.e. thrashing the ZEO server, like a bad application beating up an RDBMS. I'm sure zodb-dev would be interested in getting some metrics around your usage.

cheers
alan

--
Alan Runyan
Enfold Systems, Inc.
http://www.enfoldsystems.com/
phone: +1.713.942.2377x111
fax: +1.832.201.8856