How to run Dytran applications under Grid Engine

Posted by chris Wed, 28 May 2008 15:44:29 GMT

Gerhard Venter asked the users list for assistance in getting Dytran to run under Grid Engine. Once his issues were resolved, Gerhard was kind enough to write up a Wiki Entry on Dytran/SGE integration.

The wiki page is here:
http://wiki.gridengine.info/wiki/index.php/Dytran

Thanks Gerhard!

Java DRMAA binding via JavaScript

Posted by chris Wed, 28 May 2008 15:27:59 GMT

%  jrunscript -cp $SGE_ROOT/lib/drmaa.jar -f drmaa.js

Job 2 submitted
Job 2 has ended
Job terminated abnormally
%

Richard Hierlmeier seems to have joined the ranks of Sun Bloggers and has a fascinating post up documenting how he used the JavaScript engine that ships with Java 6 to bind to drmaa.jar.

The post is here:

http://blogs.sun.com/rhierlmeier/entry/java_drmaa_binding_with_javascript

Creating Hadoop PE under Grid Engine

Posted by chris Fri, 23 May 2008 14:13:24 GMT

Dan has found a great Sun blog post by Ravi Chandra Nallan on integrating Hadoop into SGE via a parallel environment.


Image source: http://hadoop.apache.org/core/

mpiblast, SGE and MPICH2 integration

Posted by chris Mon, 21 Apr 2008 15:49:28 GMT

Matthias Neder has posted a quick summary of a tight MPICH2 integration that successfully handles his mpiblast application.

The summarized solution can be found here:
http://gridengine.sunsource.net/servlets/ReadMsg?listName=users&msgNo=24204

Screenshots of enhanced Olesen FLEXlm tools in action

Posted by chris Thu, 06 Mar 2008 14:21:00 GMT

In a follow-up post to Mark's recent announcement we've gotten our hands on some screenshots from Mark showing his tools in use. The screenshots show the results of using XSLT transformations to turn Grid Engine XML data into XHTML form suitable for web pages. The benefit includes web-based visibility into current resource (and software license!) usage. This is exactly the approach that I tried out with the xml-qstat project. Mark is pretty familiar with that effort and will be merging his improvements and enhancements into xml-qstat's SVN repository. Speaking personally as a "scratch an itch" programmer with no real software engineering skill or talent I'm pretty excited to have a real coder take a look at xml-qstat. Related to that I already owe a debt to Petr Jung from Sun who contributed the Java based CommandGenerator code that finally allows xml-qstat to be a 100% Java/Cocoon web application that does not require external perl daemons to cache XML state data.

Before the screen captures, I'd like to ask a favor of people who read this blog. I filed bug Issue #2335 back in July of 2007 and it has not received much love (or even a targeted milestone date for a fix). The bug is a simple one -- "qstat -f -xml" no longer reports load average data which (a) makes xml-qstat a whole lot less useful and (b) breaks the SGE developer philosophy of ensuring that command output returns the same information regardless of output format. Until that bug is fixed it does not make sense for xml-qstat to have its long overdue "1.0" release. If you have a user account over on http://gridengine.sunsource.net I'd appreciate it if you can cast one of your "votes" for Issue 2335. Thanks!

And now the screenshots (edited to mask out personal/company information). Click on each image for a larger version.

qhost overview

qstat full view (a)

qstat full view (b)

qstat queue summary

qstat resource summary

qstat view

Olesen FLEXlm integration tools updated

Posted by chris Tue, 04 Mar 2008 16:31:31 GMT

Mark has posted a significant update to his most excellent FLEXlm license management integration tools. Key changes include:

  • XML configuration files
  • XML status output
  • XSLT stylesheets to transform monitoring information into web pages
  • Ability to integrate with xml-qstat

Mark further explains the updates and new Wiki-based documentation in his post to the mailing list.

tight MPICH2 integration broken with mpich2-1.0.6p1

Posted by chris Fri, 25 Jan 2008 13:53:56 GMT

If you are interested in tightly integrated MPICH2 environments, keep an eye on this mailing list thread. It seems that a current release (mpich2-1.0.6p1) has changed behavior in a way that breaks the existing methods for tight integration as documented in the HOWTO.

Older versions of the mpich2 code (version 1.0.4p1) still seem to integrate without error.

Wildcard PEs for threaded app optimization on multicore systems

Posted by chris Fri, 02 Nov 2007 15:49:29 GMT

This is from an old mailing list thread I had kept flagged in my inbox ...

In this interesting mailing list thread from back in August, John Coldrick is looking for advice on how to maximize the power of his render farm.

John has a threaded (non-parallel) application that must run within a single execution host, some hosts having up to 8 cores available for jobs. The application's thread usage can be dialed up or down depending on how many CPU cores are available. What John is basically trying to do is:

  1. Sort available hosts to find the one with the most CPU cores available
  2. Reserve or otherwise tell the SGE scheduler that those CPU cores are all going to be used by a single application
  3. Tell the application itself how many cores it has been granted so that it can dial its own thread usage up or down appropriately

The solution suggested by Dan combines some old admin magic from the SGE 5.x days (using PEs as a nifty hack to lock out multiple job slots in use by threaded non-parallel applications) with some newer SGE 6.x features (using wildcard '*' selectors when making a request for a parallel environment) to arrive at a nifty solution.

After creating a PE on each of his execution hosts, John can submit his render job requesting a range of CPU slots ([1-8] in his case) while also using a wildcard selector to ask for any parallel environment. The end result is that:

  • The SGE scheduler will find the system with the most available slots/cores automatically
  • Within the parallel environment SGE understands the job will consume more than 1 job slot
  • John's application script can just query the environment variable $NSLOTS to learn how many CPUs it was granted and then adjust its thread usage accordingly
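
A minimal sketch of what such a job script might look like. The renderer command ("render" and its --threads option) and the scene file name are hypothetical stand-ins, not John's actual application, and the script only prints the command it would run. It assumes each per-host PE was created with an allocation rule of $pe_slots so that all granted slots land on one host.

```shell
#!/bin/sh
# render_job.sh: hypothetical job script for a threaded, single-host renderer.
#
# Example submission using a wildcard PE selector and a slot range
# (quote the '*' so the shell does not expand it):
#   qsub -pe '*' 1-8 render_job.sh

# Print the number of threads to use: the slot count granted by SGE,
# or 1 when NSLOTS is unset (for example when run outside Grid Engine).
thread_count() {
    echo "${NSLOTS:-1}"
}

# Build the command line for the (hypothetical) renderer.
# This is a dry run: the command is printed, not executed.
render_command() {
    echo "render --threads $(thread_count) scene.file"
}

render_command
```

Defaulting to a single thread when $NSLOTS is missing keeps the script usable for quick tests outside the scheduler.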

Related post:
"Grouping jobs to nodes via wildcard PE's"

Preview release: powerful new array task interdependency features

Posted by chris Tue, 11 Sep 2007 09:24:20 GMT

Today the Grid Engine team announced the availability of new developer preview snapshot binaries showcasing major new functionality.

Full announcement: "Rising Sun Pictures Adds Array Job Interdependencies to GE"

This is a success on many levels - a major feature gain for Grid Engine that comes from users of the open source Grid Engine product. The architects and drivers of this new functionality all work for Rising Sun Pictures, a visual effects house based in Australia.

Simply put, Rising Sun's rendering workflow involves serious use of Array Jobs and the complexity of their effects work required a more powerful implementation of how job dependencies are handled. What RSP has done is bring job dependency behavior down to the level of individual sub-tasks within Array jobs.

The best explanation of the requirements and the implementation can be found here (a very interesting read): http://open.rsp.com.au/?page_id=11.
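
To make the idea concrete, here is a hedged sketch of a two-stage array pipeline. Later Grid Engine releases expose array task dependencies through the qsub option -hold_jid_ad; whether the preview binaries use the same option name is an assumption here, and the job names and script names are made up for illustration. The sketch is a dry run by default and only prints the qsub commands.

```shell
#!/bin/sh
# Dry run by default: the commands are echoed instead of submitted.
# Set QSUB=qsub in the environment to really submit.
QSUB="${QSUB:-echo qsub}"

submit_pipeline() {
    # Stage 1: render frames 1-100 as an array job named "render".
    $QSUB -N render -t 1-100 render_frame.sh

    # Stage 2: an array job of the same size. With an array dependency,
    # task N of "composite" waits only for task N of "render",
    # not for the whole "render" job to finish.
    $QSUB -N composite -t 1-100 -hold_jid_ad render composite_frame.sh
}

submit_pipeline
```

With a plain -hold_jid dependency the second stage would instead stay on hold until every render task had completed.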

OpenMPI Support

Posted by chris Wed, 27 Sep 2006 02:30:04 GMT

Olli-Pekka Lehto reports two interesting things in this thread on OpenMPI support for Grid Engine:

  • A tight parallel environment integration module exists in the OpenMPI development tree and there is a chance it may be released at the November 11-17 SC06 meeting.
  • Not wanting to wait, Olli-Pekka took the module from a nightly snapshot and compiled it into an OpenMPI-1.1.1 installation

People interested in doing what Olli-Pekka has done can visit http://staff.csc.fi/~oplehto/openmpi-gridengine/. Basic implementation notes are offered in this mailing list post.

DRMAA bindings for Ruby

Posted by chris Thu, 08 Jun 2006 18:34:00 GMT


Update: 13 June 2006
Andreas has started a formal Ruby-DRMAA project:

…the DL based Ruby wrapper for shared DRMAA C bindings now has its own home page and CVS repository under

http://drmaa4ruby.sunsource.net/

Andreas writes:

  http://gridengine.sunsource.net/files/documents/7/89/drmaa_ruby.tar.gz

my first version of a DRMAA binding for Ruby. Thanks to Ruby/DL it was
not necessary to use SWIG. Open issues are

  * assumes DRMAA 0.95 instead of DRMAA 1.0
  * memory leak due to JobTemplate destructor not yet functioning
  * a number of functions are not yet supported (drmaa_ps_job(),
    drmaa_control(), ...)

In related news, it appears that people are using and beginning to extend the Ruby-based SGE log-file analysis script mentioned previously on this site.
