revision 1.19 (1/27/2006)

2 About LSF
- Platform LSF (Load Sharing Facility) software enables efficient use of resources through one common interface.
- LSF acts as the glue that makes license resources, hosts, CPUs, operating systems, memory, and other resources available for jobs.
- Without LSF, everyone would have to manually coordinate who gets access to which resources.
- LSF acts as a master scheduler and coordinates access to resources based on management-defined policy.
- LSF balances the workload across the compute pools.
- Users submit jobs to the LSF system, which uses priorities, job requirements, and resource availability to determine where to place each job.

3 About LSF
QCT runs LSF to:
- Dynamically add resources (hosts, memory, etc.) based on customer need.
- Allow projects with aggressive tapeout schedules to get priority access to resources.
- Provide a seamless compute environment to QCT engineers.
- Track resource usage.
Additional future uses:
- High availability: restart failed jobs automatically.
- Manage licenses between groups.
- Use idle resources at remote sites.

4 What LSF Does
[Diagram labels: LSF; Workstations (Solaris, HPUX, Linux); Servers (Unix, HPUX, Linux); License Availability; Free Memory; Idle Time; Number of CPUs; Host Status; Disk I/O Rate; Available Disk Space]
Collects information on the status of all resources in the cluster (a resource can be any hardware or software entity).

5 What LSF Does
[Diagram labels: LSF; Workstations (Unix, HPUX, Linux); Servers (Unix, HPUX, Linux); vsim; BuildGates; Calibre; Verilog; vcs; Design Compiler; -SDDR; -High Road; -Fast I.O.]
Chooses the ideal, available, and appropriate resource to process the computational job (i.e., specific hardware, O/S, and software license).
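The two "What LSF Does" slides (collect resource status, then choose an appropriate host) can be pictured with a toy sketch. This is not LSF's scheduling algorithm; the `Host` fields, the host names, and the "most free memory wins" rule are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Host:
    """Hypothetical snapshot of the status information collected for one host."""
    name: str
    host_type: str      # e.g. "SPARC" or "LINUX"
    free_mem_mb: int    # currently free memory
    free_slots: int     # open job slots
    status_ok: bool     # host is accepting jobs


def pick_host(hosts: Sequence[Host], host_type: str, mem_mb: int) -> Optional[Host]:
    """Return the eligible host with the most free memory, or None if none qualifies."""
    candidates = [
        h for h in hosts
        if h.status_ok
        and h.free_slots > 0
        and h.host_type == host_type
        and h.free_mem_mb >= mem_mb
    ]
    if not candidates:
        return None  # in LSF terms, the job would stay pending
    return max(candidates, key=lambda h: h.free_mem_mb)


# Made-up cluster state for illustration only.
EXAMPLE_HOSTS = [
    Host("hostA", "SPARC", 2000, 1, True),
    Host("hostB", "LINUX", 8000, 2, True),
    Host("hostC", "LINUX", 16000, 0, True),   # no open job slots
    Host("hostD", "LINUX", 4000, 1, False),   # not accepting jobs
]

chosen = pick_host(EXAMPLE_HOSTS, "LINUX", 4000)
print(chosen.name if chosen else "job would pend")
```

The point of the sketch is only that a request is first filtered against host state and then ranked; a request that no host can satisfy returns nothing, which corresponds to a job that keeps pending.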
6 QCT USA LSF Architecture
- Campbell: 75 hosts
- San Diego: 1,800 hosts
- Raleigh: 200 hosts
- Austin: 50 hosts

7 Overview of LSF Job Flow
[Diagram labels: Submit Job; Gather Job Data; Analyze Output; LSF Cluster; E-Mail; App Log; LSF Queue; Monitor Job; Job Done; Command Wrapper; bsub]

8 Basic Job Flow
- Gather job requirements.
- Submit the job into LSF.
- The job will PEND in a queue until resources for it are available.
- When resources are available, the job is dispatched to the best available execution host(s).
- LSF re-creates your environment on the execution host(s) exactly as it was when the job was submitted, then starts the job.
- You can monitor the job while it is running.
- When the job is complete, results are sent to the user or written to an output file.

9 LSF Terminology
- Server: host where jobs run. Can submit, query, and execute jobs.
- Client: does NOT execute or run LSF jobs. Can only submit and query jobs (login servers, sunray servers).
- Queue: container for jobs (like checkout lines at a market).
- Cluster name: ICEng for San Diego, CAM for Campbell, RTP for Raleigh, AUS for Austin, TEST for the San Diego test cluster.
- Job: command submitted to LSF for execution. LSF schedules, controls, and tracks the job according to configured policy.
- Job slot: bucket into which a single unit of work is assigned. Each host has a configured number of job slots. (We configure 1 job slot per physical processor.)

10 Gathering Data
Before submitting jobs to LSF, gather data about the program you want to submit.
- Which operating system does it run best on? (Linux, Linux 64-bit, Solaris)
- Which project is this job for?
- How much memory does the job require?
- Which license features does your job need to run?
- Can your job run in parallel across multiple machines / processors?
- Can your job be broken into logical, smaller chunks for faster processing?
- Check the QCT tool dependency database: http://qctweb1.qualcomm.com/Resources/Dependency
- Many wrappers may already code job submission with the proper resource requests.

11 Submitting to LSF: Which Queue?
You submit jobs into an LSF queue, which specifies user access and priorities to resources.
- app: specialized queue for SunRay and MFU users
- priority/priority_linux: urgent or final-tapeout-related jobs
- normal/linux: regular jobs
- regression_queue: regression jobs
- A few application queues (hspice, smartest, etc.) also exist.
Use bqueues -w for a listing of queues and data, or bqueues -l {queue_name} for details on a queue and its list of administrators.

12 Submitting to LSF: Which Queue?
On a sunray or mfu server (must submit all jobs through LSF):
- Interactive design work, interactive simulations: use the app queue
- Priority queues (priority, priority_linux; to be combined)
- Regular queues (normal, linux, idle)
On a desktop workstation (recommended to submit all jobs through LSF):
- Regular queues (normal, linux, idle, etc.)
Find authorized queues:
  bqueues -u myusername

13 LSF Queues
- app: interactive jobs for users with a sunray or mfu session
- priority, priority_linux: urgent solaris or linux jobs, or final-tapeout-related work
- normal/linux: regular queues
- idle: desktop solaris hosts only (this is the default queue, lowest priority)
- short: jobs > 30 minutes will be killed automatically (sparc only)
- hspice: hspice jobs only
- smartest: special queues for the teradyne tester group
- night: jobs submitted in this queue only run at night
- regression_queue: regression jobs

14 Access to resources
- Access to LSF resources is determined by Engineering Management. QCT Engineering Services only implements the policy given to us.
  - normal, linux, idle, and night queues are accessible to all QCT engineers.
  - The app queue is only accessible from sunray or MFU servers.
- For project queues, ask the queue administrator for access: run bqueues -l {queue_name} and look for the ADMINISTRATORS line.

15 Submitting your job - bsub
The bsub command submits your job to LSF for processing.
  your_host> bsub -q {queue name} {resource / options} {command}
With bsub, you define the resources and options needed for your job to run.
- Many wrappers produce the desired bsub command with appropriate resource options.
- You can also specify job run options, such as output files and suppressing LSF e-mail notification.
- LSF allows for parallel job processing.
- LSF dispatches the job to the host that best matches your request. The more restrictive your resource request, the longer it will take to place the job.
- LSF dispatches to hosts until the job slots are full or resources are exhausted (CPU loaded, no available application licenses, etc.).

16 QCT LSF Compute Policy
- All compute-intensive jobs that use licenses from EDA tools use LSF.
- No camping out on LSF compute servers.
- Jobs submitted through LSF must specify the necessary license resource.
http://qctes.qualcomm.com/twiki/bin/view/QCTES/UsagePolicy

17 Submitting your job
The bsub command is broken into several sections:
- LSF options:
  - -q {queue_name} to specify the queue to dispatch the job to.
  - -o to designate the output file for job output.
  - -n for processor spanning / parallel process jobs.
  - -I | -Ip | -Is for interactive jobs (-Ip with a pseudo-terminal, -Is with a pseudo-terminal and shell support).
  - Many other options are described in the product documentation.
- Resource request strings:
  - -R specifies the select and rusage statements.
  - select specifies the characteristics a host must have to be considered a potential execution host.
  - rusage states which resources the job requires in order to run on the host once it is selected.
- Path and command to run.

18 bsub examples
  bsub -q eagle_queue -Ip myjobscript
  bsub -q dove_queue -R "select[(realpower >=1) && (type==SPARC)] rusage[mem=1000, realpower=1:duration=1]" myjob
  bsub -R "select[select-string] rusage[rusage-string]" ...
- Specify the requirements your job needs to run and let LSF place the job.
- If you don't select any resources, your job will be dispatched to any available host in the queue you specify.
- Queues are mixed architecture. Specify the resources needed for your application.
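The three-part bsub structure described above (LSF options, a resource request string, then the command) can be sketched as a small wrapper-style helper. This is a minimal illustration, not one of the existing QCT wrappers: `build_bsub` and its parameter names are hypothetical, and it only uses the -q, -n, -o, and -R options named on these slides.

```python
import shlex
from typing import List, Optional


def build_bsub(command: List[str],
               queue: Optional[str] = None,
               select: Optional[str] = None,
               rusage: Optional[str] = None,
               ncpus: Optional[int] = None,
               output: Optional[str] = None) -> List[str]:
    """Assemble a bsub argument list: options, then -R string, then the command."""
    args = ["bsub"]
    if queue:
        args += ["-q", queue]
    if ncpus is not None:
        args += ["-n", str(ncpus)]
    if output:
        args += ["-o", output]
    # select[...] and rusage[...] travel together as one -R argument.
    sections = []
    if select:
        sections.append(f"select[{select}]")
    if rusage:
        sections.append(f"rusage[{rusage}]")
    if sections:
        args += ["-R", " ".join(sections)]
    return args + list(command)


# Rebuilds the dove_queue example from the slide as an argument list.
argv = build_bsub(
    ["myjob"],
    queue="dove_queue",
    select="(realpower >=1) && (type==SPARC)",
    rusage="mem=1000, realpower=1:duration=1",
)
print(shlex.join(argv))  # shell-quoted form, suitable for logging
```

Keeping the command as a list and passing it to something like `subprocess.run(argv)` avoids shell-quoting problems with the brackets, parentheses, and `&&` inside the resource string.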
For multi-processor jobs:
  bsub -n 4 (4 cpus required)
  bsub -n 4 -q eagle_queue myjob

19 Common Resource Select Options
  bsub -q saber_queue -R "select[compute]" myjob
    Run job only on a solaris compute server (host in the computer server room). Use compute when you don't want your job to suspend.
  bsub -q idle -R "select[desktop]" myjob
    Run job only on a desktop workstation (may be suspended).
  bsub -R "select[type==SPARC]" myjob
    Run job on a solaris host.
  bsub -R "select[(type==SPARC) && (os_version==2.8)]" myjob
    Run job only on a solaris 2.8 host.
  bsub -R "select[type==LINUX]" myjob
    Run job on an x86 (32 bit) host running linux.
  bsub -R "select[(type==LINUX) || (type==SPARC)]" myjob
    Run job on either solaris or linux.

20 Common Resource Select Options
  Old      New                                   Meaning
  os2_8    (type==SPARC) && (os_version==2.8)    solaris 2.8
  linux    (type==LINUX)                         linux (x86_32)
  linux    (type==LINUX64)                       opteron (x86_64)
  linux    (type==LINUXIA64)                     linux (ia64)

  Resource   Meaning
  desktop    solaris sun workstation in office
  compute    solaris sun server in server room
  mem        free memory (dynamic; changes depending on allocated memory)
  maxmem     total memory installed on host (static; does not change)
  cpuf       Do not use this resource

21 Common Resource Errors
- Do NOT use the compute resource when submitting to linux queues.
- Do NOT use the compute resource when submitting to the app queue.
- Do NOT use the os2_8 resource when submitting to linux queues.
- In the future we may have a script in place to monitor for users submitting jobs with invalid resource combinations.
- We are also working with Platform to prevent submission of jobs with invalid resource combinations.

22 Rusage Options
rusage virtually reserves the specified resources for your job.
  rusage[xsim=1:duration=1, mem=10000]
  rusage[xsim=1:duration=1, ysim=1:duration=1, zsim=1:duration=1]
- LSF does not check out a license from the license server; only your application can do that.
- The LSF scheduler virtually reserves the license resource for the specified duration to give your application time to start and check out the license itself.
  This prevents new jobs from being dispatched and taking the license.
- LSF will monitor your job and decrement the reserved memory by the actual memory used by the job.

23 Rusage: Memory and Licenses
- Ensure your job gets the necessary memory with rusage.
    bsub -R "rusage[mem=16000]" myjob
- All jobs requiring large amounts of memory should use rusage; otherwise another job could be placed on the host and cause your job to crash.
- Be sure to use mem in your rusage (NOT maxmem).

24 Job submission
- Submit with -Is (interactive shell support): creates a pseudo-terminal with shell mode support (handles CTRL-C and CTRL-Z properly) and sends output to the terminal.
- (type==SPARC) || (type==LINUX): candidate execution hosts can be either solaris or linux.
- Shell limits: for long-running simulations, your wrapper may need to unlimit settings from the shell.
    unlimit cputime
    unlimit datasize

25 Prohibited Job Submission
All jobs should be submitted through the LSF batch system via bsub.
- Not allowed: submitting xterms or shells to project queues or compute servers. (Only allowed in app queues.)
- Never allowed: use of lsrun, lsgrun, lslogin, or ch.
- Users running xterms or shells (tcsh/csh/bash) on compute hosts may cause your jobs or other jobs to crash. Any process launched from the shell may steal CPU, memory, etc. from another process.

26 Checking job status
Once your job is submitted, you can check its status with the bjobs command.
- bjobs by itself shows your jobs.
- bjobs -u all shows all jobs in the system.
- bjobs -u all -q {queue_name} shows all jobs in that queue.
- bjobs -u all -m {hostname / host group} shows all user jobs on a single host or host group.
A job can be in several states:
  PEND    not yet started
  RUN     job is running
  USUSP   suspended by user or admin
  SSUSP   suspended by the system (threshold, run window)
  DONE    job completed
  EXIT    job completed with non-zero status

27 Checking job status
If your job is pending, there can be any number of causes.
- Run bjobs -lp {job_ID} for a detailed description of why it's pending.
- The more restrictive your select and rusage statements, the more time it will take LSF to find a candidate host for you.
- LSF does not validate the logic in your select or rusage statements. So, if you request more memory than is available on any LSF host, your job will pend indefinitely.
- Your job may also pend because a license feature is not available for the job.
- Your job may also pend because you have reached your limit on a queue, and LSF has throttled the number of jobs it can dispatch for you.

28 Job status commands
  bjobs
  JOBID    USER   STAT  QUEUE       FROM_HOST   EXEC_HOST    JOB_NAME    SUBMIT_TIME
  1869442  user1  RUN   phoenix_qu  kenvil      compute-spa  xterm       Sep 15 03:42
  1979970  user2  RUN   devo_queue  bridger     compute-spa  *c_new1.do  Sep 17 10:42
  1879711  user3  RUN   dora65_que  olgly       compute-x86  *04.12-SP3  Sep 15 09:20
  1895128  user4  RUN   raven_queu  oreana      compute-x86  *libre_drv  Sep 15 19:30
  1880282  user5  RUN   phoenix_qu  kamoro      compute-x86  *aPlace.sc  Sep 15 10:05
  1888906  user6  RUN   dora65_que  sr-san-08   compute-x86  *no_ar.run  Sep 15 15:48
  1977333  user7  RUN   conan_queu  sr-san-18   compute-x86  *n_sta.tcl  Sep 17 07:58
  1893440  user8  RUN   phoenix_qu  kamoro      compute-x86  *ace_topo   Sep 15 18:23
  2025726  user9  RUN   phoenix_qu  mfu-san-02  compute-x86  *n SUSE.64  Sep 18 23:01

29 Job status commands
  bjobs -u all -q eagle_queue
  JOBID   USER   STAT  QUEUE       FROM_HOST  EXEC_HOST  JOB_NAME    SUBMIT_TIME
  591912  user1  RUN   eagle_queu  lisbon     strontium  */eagle.pt  Feb 15 16:47
  632827  user2  RUN   eagle_queu  adelaide   strontium  xterm       Feb 17 10:18
  645065  user3  RUN   eagle_queu  adelaide   strontium  xterm       Feb 17 15:07
  645769  user4  RUN   eagle_queu  olgly      strontium  *_9_CLK $*  Feb 17 15:20
  633299  user5  RUN   eagle_queu  lisbon     rubidium   *_oen_dout  Feb 17 11:10
  633343  user6  RUN   eagle_queu  harper     tsuchiura  realpowerX  Feb 17 11:20
  645642  user7  RUN   eagle_queu  lisbon     evaki      *_clk_dout  Feb 17 15:1

  bjobs -lp {job_ID}
  Job <636830>, User <johndoe>, Project <default>, Status <PEND>,
  Queue <linux>, Job Priority <50>, Command <./gpssim-fast-linux ../Phoenix
      _BW.cfg GPS_N20_M176_C12.20_101.cfg>
  Tue Feb 17 12:47:41: Submitted from host <ackerly>, CWD </prj/gpsone/systems/si
      ms/egps_roc/test_unif_pfa>, Output File </dev/null>;
  PENDING REASONS:
  User has reached the per-user job slot limit of the queue;

30 Job output
- Your job may complete with either a DONE or EXIT state.
  - DONE indicates the job ran to successful completion.
  - EXIT indicates it exited for a given reason / exit code.
  - Exit codes can be LSF or application specific.
  - Common reasons for exit are failure to obtain a license (license resource not specified in rusage) or problems with the LSF execution host.
  - Re-submit your job or, if the exit code is unknown, notify vlsi.unix.help.
- When your job completes, you will receive e-mail notification from LSF.
  - It contains a summary of job output and statistics.
  - E-mail notification is disabled by specifying an output file (-o), including an output file of /dev/null.
- Check your application log for further output.

31 LSF User Fairshare
- Some queues have fairshare enabled (linux, hspice, regression_queue).
- Dynamic priority is based upon recent job history and other similar pending jobs.
- Jobs in fairshare queues are not dispatched first-come, first-served (FCFS).
- Use bqueues -l to see user priority in a queue. It updates as jobs are submitted and completed.
- Allows users who submit a small number of jobs to go ahead of others.

32 LSF Host Status
To check the status of LSF hosts, you can run any of the following commands:
- bhosts -w
  Shows the general LSF state of the host, whether it's accepting jobs, and what job slots / jobs are running on the host.
- lshosts -w
  Shows the architecture type / model of the hosts within LSF, the CPU factors, number of CPUs, memory, and defined LSF resources (to use with a select statement).
- Both commands accept -l {hostname} for detailed status.
  EX) bhosts -l strontium
Note: Even with the -w option, lsload will still truncate long hostnames.
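As a rough sketch of putting the bhosts -w listing to work, the helper below picks out hosts that are in the ok state and still have open job slots. It assumes the nine-column layout (HOST_NAME STATUS JL/U MAX NJOBS RUN SSUSP USUSP RSV) used in the bhosts -w sample in this deck; `open_hosts` is a hypothetical name, and the sample rows are copied from that slide rather than from a live cluster.

```python
from typing import List, Tuple

# Sample bhosts -w output, copied from the LSF Host Status slide.
SAMPLE_BHOSTS = """\
HOST_NAME   STATUS       JL/U  MAX  NJOBS  RUN  SSUSP  USUSP  RSV
abbotsford  closed_Adm   -     1    0      0    0      0      0
ashtabula   ok           -     1    0      0    0      0      0
beni        closed_Full  -     1    1      1    0      0      0
beryllium   ok           -     4    3      3    0      0      0
bray        unavail      -     1    0      0    0      0      0
jakarta     closed_Busy  -     2    1      0    1      0      0
"""


def open_hosts(bhosts_output: str) -> List[Tuple[str, int]]:
    """Return (host, free job slots) for hosts that are ok and not full."""
    rows = [line.split() for line in bhosts_output.strip().splitlines()]
    if not rows:
        return []
    header, data = rows[0], rows[1:]
    col = {name: index for index, name in enumerate(header)}
    result = []
    for fields in data:
        if len(fields) != len(header):
            continue  # skip lines that do not match the expected layout
        if fields[col["STATUS"]] != "ok":
            continue
        max_slots = fields[col["MAX"]]
        if not max_slots.isdigit():
            continue  # assumption: a non-numeric MAX means no usable slot count
        free = int(max_slots) - int(fields[col["NJOBS"]])
        if free > 0:
            result.append((fields[col["HOST_NAME"]], free))
    return result


for host, free in open_hosts(SAMPLE_BHOSTS):
    print(f"{host}: {free} free slot(s)")
```

In practice the text would come from running `bhosts -w` (for example through `subprocess.run`), which is left out here so the sketch runs without an LSF installation.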
33 LSF Host Status
Check on the status of LSF hosts by using the bhosts -w command.
  HOST_NAME   STATUS       JL/U  MAX  NJOBS  RUN  SSUSP  USUSP  RSV
  abbotsford  closed_Adm   -     1    0      0    0      0      0
  ashtabula   ok           -     1    0      0    0      0      0
  beni        closed_Full  -     1    1      1    0      0      0
  beryllium   ok           -     4    3      3    0      0      0
  bray        unavail      -     1    0      0    0      0      0
  jakarta     closed_Busy  -     2    1      0    1      0      0
The numbers indicate the maximum number of jobs a host can have, the number of jobs it is running, and the number of jobs in each particular state.
A host may be in several states:
- ok: everything on the host is OK and it's ready for jobs.
- closed_Adm: the host is undergoing sys admin maintenance.
- closed_Full | closed_Busy: the host has no open job slots to dispatch jobs to, or a threshold has been exceeded.
- unavail: LSF cannot communicate with the host.

34 LSF Host Status
  lshosts -w
  lshosts -w {hostname}
  lshosts -w -R {resource}
  HOST_NAME             type     model             cpuf   ncpus  maxmem  maxswp  server  RESOURCES
  abbotsford            SPARC    Ultra_60_1        12.0   1      512M    839M    Yes     (desktop os2_8)
  ashtabula             SPARC    Ultra_60_1        12.0   1      1024M   3667M   Yes     (ikos os2_8)
  beryllium             SPARC    Sun_Fire_480R_4   40.0   4      32768M  4038M   Yes     (compute os2_8)
  compute-ia64-san-001  LINUX64  Itanium2          100.0  4      48777M  8691M   Yes     ()
  mottelson             LINUX    XEON_2            99.0   2      5940M   4000M   Yes     (linux linux_2_4)
  schwarzes023          LINUX    Opteron_246_2     100.0  2      8028M   1027M   Yes     (linux linux_2_4)
Use any of the information in the output to help correctly format your bsub select statement.

35 Job Status Commands
After your job is submitted to LSF, you can run a series of tools to control your job.
- The bkill command terminates your job. If you kill a running job, you will still receive LSF output, with an exit code indicating it was terminated.
  EX) bkill {job_ID}
- The bmod command allows you to modify your job submission. It's best used to modify a job that is pending. While the job is pending, all bsub options can be modified.
  Only certain options can be modified for a running job, including the resource requirement string, CPU and memory limits, and job output options.
  EX) bmod {bsub options} {job_ID}

36 Job History
- LSF keeps a log of all submitted jobs. Several LSF commands pull job information and summary statistics.
- The bhist command shows a detailed history of jobs. It can be used to get bjobs-like status information in a brief format, or more detailed information about jobs.
  - You can specify one or multiple job IDs to report on.
  - You can specify queues, users, projects, hosts, or host groups that jobs ran in/on.
  - You can specify date ranges for when jobs completed, were dispatched, or started.
- LSF keeps job information in memory for 1 hour. If your job is older than that, you'll need to tell LSF to check all log files with the -n 0 option.
- The bhist command can be submitted to LSF just like other jobs, and you'll get the output in e-mail.

37 bhist
  bhist {job_ID}
  bhist -l {job_ID}
    Shows summary job information or detailed job status (by time) for that job.
  bhist -q {queue_name} -u {username} -P {project identifier} -l
    Shows similar output for selected queues, users, and/or projects.
To specify a timeframe, use the time format YYYY/MM/DD/HH:MM (24-hour clock) with the following options:
- Dispatched during a specified time: -D{time0},{time1}
- Completed or exited during a specified time: -C{time0},{time1}
- Started during a specified time: -S{time0},{time1}
- All jobs (all states) during a specified time: -T{time0},{time1}
In most cases, the -n 0 option will need to be used.
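Building the time-window argument by hand is error-prone, so here is a minimal sketch of a formatter. It assumes the year-first layout seen in the bacct example later in this deck (a window from 2004/01/01/00:00 to 2004/01/02/00:00); `time_window` is a hypothetical helper name, and only the -D, -C, -S, and -T options listed above are accepted.

```python
from datetime import datetime

# Assumed layout: year/month/day/hour:minute, 24-hour clock.
LSF_TIME_FORMAT = "%Y/%m/%d/%H:%M"

# Window options named on the bhist slide.
WINDOW_FLAGS = {"-D", "-C", "-S", "-T"}


def time_window(flag: str, start: datetime, end: datetime) -> str:
    """Format one '<flag>{time0},{time1}' argument for bhist."""
    if flag not in WINDOW_FLAGS:
        raise ValueError(f"unsupported window option: {flag}")
    if end < start:
        raise ValueError("end of window is before its start")
    return f"{flag}{start.strftime(LSF_TIME_FORMAT)},{end.strftime(LSF_TIME_FORMAT)}"


# Jobs completed or exited on 1 January 2004, searching all log files (-n 0).
argv = ["bhist", "-n", "0",
        time_window("-C", datetime(2004, 1, 1), datetime(2004, 1, 2))]
print(" ".join(argv))
```

Formatting from `datetime` objects keeps the zero padding and field order consistent, which matters because LSF does not validate that a window makes sense for your jobs.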
38 bhist output
  JOBID   USER     JOB_NAME  PEND  PSUSP  RUN    USUSP  SSUSP  UNKWN  TOTAL
  678129  johndoe  *M_FIXED  11    0      60788  0      0      0      60799
  678133  johndoe  *M_FIXED  22    0      60770  0      0      0      60792
  678220  johndoe  *M_FIXED  51    0      60183  0      495    0      60729
  678233  johndoe  *M_FIXED  88    0      51132  0      9497   0      60717
  678247  johndoe  *M_FIXED  81    0      23972  0      36657  0      60710

  Job <678129>, User <johndoe>, Project <default>, Command <#!/bin/bash;/prj/vlsi/ch
      eetah/systems/bin/CHEETAH_SIM_FIXED_7.1-linux -ref 2>&1 > results.0217.232007>
  Tue Feb 17 23:20:12: Submitted from host <pauling>, to Queue <linux>, CWD </usr
      /local/projects/vlsi/saber/systems/users/nyee/CSIM_PERFORM
      ANCE/DL/cfg_cheetah_NM_sttd/test20805_EcIor-12>, Requested Resources <type=any>;
  Tue Feb 17 23:20:23: Dispatched to <skou>;
  Tue Feb 17 23:20:23: Starting (Pid 1307);
  Tue Feb 17 23:20:23: Running with execution home </usr2/johndoe>, Execution CWD </
      usr/local/projects/vlsi/saber/systems/users/johndoe/CSIM_PERF
      ORMANCE/DL/cfg_cheetah_NM_sttd/test20805_EcIor-12>, Execut
      ion Pid <1307>;
  Summary of time in seconds spent in various states by Wed Feb 18 16:17:03
  PEND  PSUSP  RUN    USUSP  SSUSP  UNKWN  TOTAL
  11    0      61000  0      0      0      61011

39 Job History
- Another command is bacct, which shows summary statistics as well as job output (like bhist).
- It takes the same options as bhist, allowing you to report by:
  - One or multiple job IDs.
  - Queues, users, projects, hosts, or host groups that jobs ran in/on.
  - Date ranges for when jobs completed, were dispatched, or started.
- Unlike bhist, bacct automatically searches through all LSF accounting files.
- The bacct command can be submitted to LSF just like other jobs, and you'll get the output in e-mail.

40 bacct command
  bacct {job_ID}
  bacct -l {job_ID}
    Shows summary job information or detailed job status (by time) for that job.
  bacct -q {queue_name} -u {username} -P {project identifier} -l
    Shows similar output for selected queues, users, and/or projects.
To specify a timeframe, use the time format YYYY/MM/DD/HH:MM (24-hour clock) with the following options:
- Dispatched during a specified time: -D{time0},{time1}
- Completed or exited during a specified time: -C{time0},{time1}
- Started during a specified time: -S{time0},{time1}

41 bacct command
  Accounting information about jobs that are:
    - submitted by all users.
    - accounted on all projects.
    - completed normally or exited
    - completed between Thu Jan 1 00:00:00 2004 and Fri Jan 2 00:00:00 2004
    - executed on all hosts.
    - submitted to all queues.
  ------------------------------------------------------------------------------
  SUMMARY: ( time unit: second )
  Total number of done jobs:    14443      Total number of exited jobs:  1892
  Total CPU time consumed: 12806144.0      Average CPU time consumed:   784.0
  Maximum CPU time of a job: 401677.1      Minimum CPU time of a job:     0.0
  Total wait time in queues: 32101168.0
  Average wait time in queue:  1965.2
  Maximum wait time in queue: 746718.0     Minimum wait time in queue:    0.0
  Average turnaround time:       3206 (seconds/job)
  Maximum turnaround time:     875735      Minimum turnaround time:         3
  Average hog factor of a job:   0.74 ( cpu time / turnaround time )
  Maximum hog factor of a job:   1.88      Minimum hog factor of a job:  0.00
  Total throughput:            680.95 (jobs/hour) during 23.99 hours
  Beginning time:        Jan 1 00:00       Ending time:        Jan 2 00:00

  bacct -u all -C2004/01/01/00:00,2004/01/02/00:00

42 LSF Analytics
- QCT-ES has installed Platform Analytics, which captures and reports LSF and license information.
- It provides LSF data reporting as well as license usage reporting.
- QCT-IT uses this information to:
  - Predict future usage.
  - Analyze and fix resource bottlenecks.
  - Purchase host resources to add to the cluster to make job submission faster.
  - Understand license usage patterns in order to purchase or re-mix license features.
- Data is available to all QCT employees who are interested.
  - E-mail analytics for report requests.
  - Portal accessible from http://lsf.qualcomm.com
- QCT-IT provides customized reporting services for CAD management and chip leads.

43 LSF Analytics - Architecture
[Diagram labels: LSF Cluster; LSF Cluster; License Servers; License usage; Agent & data file; Agent & data file; DB (Oracle); FLEX lmstat; Cognos / Web Reports; San Diego; Other sites; San Diego; ETL]

44 LSF Analytics Sample Report

45 Recent Job Memory Trends
[Chart: Jobs using 4 - 8G of memory (8/2003-8/15/2004). Y axis: Number of jobs, 0 to 160. X axis: dates from 7/27/2003 to 8/8/2004 in two-week steps.]

46 QCT-ES Role in LSF
QCT-ES provides help with your LSF job needs.
- We troubleshoot issues you have with running LSF.
- We monitor the cluster for performance and improve LSF performance when needed.
- We make modifications to LSF through host adjustments, queue changes, user changes, and LSF settings.
- We physically maintain the hosts within the LSF cluster.
- We work with engineers to understand their compute needs and plan appropriate resources to meet those needs.
- We install and monitor license keys / resources.
- We work with engineers on their job wrapper needs.

47 LSF Resources
- QCT-ES LSF web site, with documentation, vendor docs, and FAQs: http://lsf.qualcomm.com
- LSF man pages: man {lsf_command}
- Platform LSF documentation (pdf and html)
- Additional LSF "slice of knowledge" training sessions.
- PERL AVL: http://qctes.qualcomm.com/twiki/bin/view/QCTES/QCTESAvlDocs
- On the QCT-ES web site:
  - FAQ: LSF questions.
  - Introduction to LSF (pdf and html) guide by Platform Computing
  - Much more

48 Common problems
- Batch system not responding
  - Reconfiguration in progress
  - LSF event file rotation in progress
- Failed simulation
  - No license, out of disk space, linux automounter/nfs, host out of memory
  - Disk quota (home directory)
  - linux x86 kernel file size limit
- Long pend times
  - All hosts busy
  - Invalid resource options
  - Out of licenses
  - Queue limit

49 How to Get Help
E-mail vlsi.unix.help for any issues.
When submitting a ticket, please include as much information about your job as possible. For example:
- Provide the job ID and the output of bjobs -lp {job_ID} so we can see how the job is run.
- Provide the host / queue you're having problems with.
- Provide the license resource that's unavailable that you're trying to use.
- Provide any LSF or application exit codes you received.
- Provide the output of your program log for additional support.
- Be descriptive and explain your issue.
http://lsf.qualcomm.com
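The checklist above can be turned into a small template so that no item is forgotten when writing to vlsi.unix.help. This is only a sketch: `help_request` and its field labels are hypothetical, not an existing QCT-ES tool, and it formats text without sending anything.

```python
from typing import Optional


def help_request(job_id: int,
                 bjobs_lp_output: str,
                 description: str,
                 host: Optional[str] = None,
                 queue: Optional[str] = None,
                 license_resource: Optional[str] = None,
                 exit_code: Optional[int] = None,
                 log_excerpt: Optional[str] = None) -> str:
    """Collect the items from the help checklist into one plain-text message body."""
    lines = [f"Job ID: {job_id}"]
    # Optional items are included only when the caller supplies them.
    if host:
        lines.append(f"Host: {host}")
    if queue:
        lines.append(f"Queue: {queue}")
    if license_resource:
        lines.append(f"License resource: {license_resource}")
    if exit_code is not None:
        lines.append(f"Exit code: {exit_code}")
    lines += ["", "Description:", description.strip()]
    lines += ["", "Output of bjobs -lp:", bjobs_lp_output.strip()]
    if log_excerpt:
        lines += ["", "Program log:", log_excerpt.strip()]
    return "\n".join(lines)


# Example values are placeholders for illustration.
body = help_request(
    job_id=636830,
    bjobs_lp_output="User has reached the per-user job slot limit of the queue;",
    description="Job has been pending in the linux queue since submission.",
    queue="linux",
)
print(body)
```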