mirror of the now-defunct rocklinux.org

Building ROCK Linux on a cluster
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

1. Basics
=========

I'm assuming you have read the BUILD file and know how to make a 'normal'
build of ROCK Linux. I'm also assuming that you know how to use a Linux
cluster (since you are reading this, you might have one). I'm now going to
explain how to build ROCK Linux on a cluster. The techniques described here
can also be used to build ROCK Linux on an SMP machine to get the best
performance out of all CPUs.

ROCK Linux can be built on a simple cluster of workstations connected with
a normal LAN (Ethernet, etc.). No low-latency or high-bandwidth network is
needed to build ROCK Linux on a cluster with good performance.

ROCK Linux has its own job scheduler to distribute jobs over the cluster
nodes, but you can also use any job scheduler you have already installed on
your cluster to do the job.

When building ROCK Linux in parallel (cluster) mode, the build scripts
decide, based on the package dependencies, which packages may be built in
parallel and build them that way where possible (instead of serially, which
is the default behavior).

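As a rough illustration of that decision (this is not the actual logic of the
ROCK Linux build scripts), the sketch below picks every package whose
dependencies have all been built. The input format, the function name
'ready_packages' and the package list are assumptions made up for this example.

```shell
# ready_packages BUILT < DEPS
# DEPS (stdin): one line per package, "package dep1 dep2 ...".
# BUILT: space-separated list of packages that are already built.
# Prints every unbuilt package whose dependencies are all built, i.e. the
# set of packages that could be handed to the nodes at the same time.
ready_packages() {
    awk -v built="$1" '
        BEGIN {
            n = split(built, b, " ")
            for (i = 1; i <= n; i++) done[b[i]] = 1
        }
        {
            if ($1 in done) next            # already built
            ok = 1
            for (i = 2; i <= NF; i++)
                if (!($i in done)) ok = 0   # a dependency is still missing
            if (ok) print $1
        }'
}

# Hypothetical dependency list, for illustration only.
deps='glibc
gcc glibc
bash glibc
kdelibs gcc bash'

printf '%s\n' "$deps" | ready_packages ""          # only glibc is ready
printf '%s\n' "$deps" | ready_packages "glibc"     # gcc and bash are ready
```

With nothing built, only one job is available. Once the base package is done,
two jobs can run side by side.
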
You always have to be root to build ROCK Linux. That doesn't change
when you are building on a cluster. The 'Abort when a package-build fails'
config option is not available when making a parallel (cluster) build.

2. Amdahl's law
===============

In a famous paper, Amdahl observed that one must consider an entire
application when considering the level of available parallelism. If only one
percent of a problem fails to parallelize, then no matter how much
parallelism is available for the rest, the problem can never be solved more
than one hundred times faster than in the sequential case.

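Written as a formula, with s the fraction of the work that cannot be
parallelized and N the number of nodes, the speedup is at most
1 / (s + (1 - s) / N). The small helper below just evaluates this bound; the
function name 'amdahl_speedup' is made up for this example.

```shell
# amdahl_speedup SERIAL_FRACTION NODES
# Prints the upper bound on the speedup according to Amdahl's law:
#   1 / (s + (1 - s) / N)
amdahl_speedup() {
    LC_ALL=C awk -v s="$1" -v n="$2" \
        'BEGIN { printf "%.2f\n", 1 / (s + (1 - s) / n) }'
}

amdahl_speedup 0.01 16         # 1% serial work on 16 nodes: prints 13.91
amdahl_speedup 0.01 1000000    # approaches, but never exceeds, 100
```
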
Almost every package in ROCK Linux depends on a few very basic packages like
the C library, the C compiler and the shell. So it's not possible to make use
of the power of your cluster in the early phase of the build where these
essential packages are built. Later in the build there are almost always
considerably more packages which can be built in parallel (100 packages is
very common after the base packages have been built).

The tool './scripts/Create-ParaSim' can be used to "simulate" a parallel
build. Just configure your build and run './scripts/Create-ParaSim'. The
output is a graph showing how many parallel jobs are available for building
in each phase of the build. It looks like this:

    [ASCII graph. Y-axis: "Parallel Jobs", from 1 to 181. X-axis: "Number of
    Jobs build so far", from 1 to 424. Only a few jobs are available at the
    start; the curve then climbs to a peak of 181 and falls off again toward
    the end of the build.]

You can see that the build doesn't parallelize very well in the early phase
but soon reaches a state where over 100 jobs can be built at the same time.
It is normal that the number of available jobs goes down on the right side of
the graph. When, e.g., 400 of 424 jobs are already built, there are only 24
jobs left and so it's no longer possible to have 100 parallel jobs.

Note that the X-axis is the number of jobs already built, not the time.
So the graph tells you something about the level of parallelism which is
possible in your selected configuration in general, but it does not provide
exact numbers for how much faster the build would be, e.g., on a 16-node
cluster.

You can pass the option '-jobs N' to ./scripts/Create-ParaSim to get a
simulation of the build on a cluster with N nodes. The script assumes that the
cluster nodes are as fast as the system which has done the reference build. If
your cluster nodes are, e.g., about 20% faster, your build will be completed
correspondingly sooner than printed in the statistics. You can even compare
builds: e.g. "-jobs 1,2,8" would compare a "normal" single-node build with a
build on a 2-node cluster and an 8-node cluster:

    [ASCII graph. Three stacked panels for 8, 2 and 1 parallel jobs, each
    plotting the number of running jobs over a common "Time" axis from 00:00
    to 14:41. The single-job panel fills the whole time axis; the 2-job panel
    ends after roughly half of it and the 8-job panel after roughly a
    quarter.]

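The speed assumption mentioned above can be turned into simple arithmetic: if
every node is f times as fast as the reference system, the simulated time is
divided by f. A sketch, assuming the times in the statistics are in HH:MM
format and that build time scales linearly with node speed; the function name
'scale_build_time' is made up for this example.

```shell
# scale_build_time HH:MM FACTOR
# Prints the simulated build time divided by the speed FACTOR of the nodes.
scale_build_time() {
    LC_ALL=C awk -v t="$1" -v f="$2" 'BEGIN {
        split(t, p, ":")
        minutes = int((p[1] * 60 + p[2]) / f + 0.5)   # round to whole minutes
        printf "%02d:%02d\n", int(minutes / 60), minutes % 60
    }'
}

scale_build_time 14:41 1.2    # nodes about 20% faster: prints 12:14
```
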
If you have 'gnuplot' installed and $DISPLAY set, you can also pass the option
'-x11' to ./scripts/Create-ParaSim so it will use the program 'gnuplot' to
graph the results. A screenshot of the '-x11' mode of ./scripts/Create-ParaSim
can be found at http://www.rocklinux.net/pics/screenshot_parasim.jpg.

3. Setting up the master
========================

Extract the ROCK Linux source somewhere and export this directory read-write
to all nodes using NFS. In many cases there will already be a directory on
your cluster which is shared between all nodes (e.g. /home). I will assume
the directory name /home/rock-master in this document.

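A minimal sketch of such an export, assuming the Linux kernel NFS server and
a hypothetical cluster subnet 192.168.1.0/24. The 'no_root_squash' option is
included on the assumption that the build, which runs as root on the nodes,
has to write to the shared directory; adjust the options to your site's
policy. The function name 'exports_line' is made up for this example.

```shell
# exports_line DIR CLIENTS
# Prints an /etc/exports entry that exports DIR read-write to CLIENTS.
exports_line() {
    printf '%s %s(rw,sync,no_root_squash)\n' "$1" "$2"
}

# Print the entry. Once it looks right, append it to /etc/exports on the
# master and reload the export table with 'exportfs -ra'.
exports_line /home/rock-master 192.168.1.0/24
```
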
Configure your build as usual. Enable the config option 'Make a parallel
(cluster) build'. The config option 'Maximum size of job queue' should have
a value which is higher than the maximum number of jobs which will be built
at the same time on your cluster. Set this config option to '0' (unlimited)
when building on a big cluster.

The option 'Command for adding jobs' will be explained in section 6 (Building
with an external job scheduler) and can be left blank if you are using the
built-in job scheduler.

You also might want to enable the 'Always clean up src dirs (even on pkg
fail)' option so the local disks of your cluster nodes are not filled up
with the src dirs of broken packages.

Download the required source packages as usual (if you don't already have them
all downloaded).

4. Setting up the nodes
=======================

The following has to be done on every node. If you have many nodes in your
cluster you might want to use 'prsh' from http://www.cacr.caltech.edu/beowulf/,
the "Send input to all tabs" feature of KDE-Konsole, or even multissh, which
is available at oss.linbit.at, to perform the following steps on all nodes.

You need to create a local build directory on every cluster node (building
the packages on the NFS share would cost too much performance). In many cases
there will already be a directory on the cluster for this (e.g. /scratch). I
will assume the directory name /scratch/rock-node in this document.

Set up the /scratch/rock-node directory using the commands:

    # mkdir -p /scratch/rock-node
    # cd /home/rock-master
    # ./scripts/Create-Links -config -build /scratch/rock-node

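If you have none of the tools mentioned above, the same steps can be pushed
to the nodes with a plain loop. This is only a sketch: it assumes
passwordless root ssh to hypothetical node names such as node01, and by
default it only prints the commands instead of running them. The function
names and the DRY_RUN variable are made up for this example.

```shell
# Print the per-node setup commands from this section as one command line.
node_setup_script() {
    master_dir=$1
    node_dir=$2
    printf 'mkdir -p %s && cd %s && ./scripts/Create-Links -config -build %s\n' \
        "$node_dir" "$master_dir" "$node_dir"
}

# setup_nodes NODE...
# Runs the setup on each node via ssh, or just prints the ssh command lines
# while DRY_RUN is 1 (the default in this sketch).
setup_nodes() {
    for node in "$@"; do
        cmd=$(node_setup_script /home/rock-master /scratch/rock-node)
        if [ "${DRY_RUN:-1}" = 1 ]; then
            printf 'ssh root@%s %s\n' "$node" "'$cmd'"
        else
            ssh "root@$node" "$cmd"
        fi
    done
}

setup_nodes node01 node02 node03
```

Set DRY_RUN=0 to actually execute the commands once the printed lines look
right.
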
Now your cluster is ready for building ROCK Linux.

5. Building with the built-in job scheduler
===========================================

Run './scripts/Build-Target' in /home/rock-master on the master. Instead of
building the packages, the master will create a job queue and add those
packages to the queue which can be built next.

Run './scripts/Build-Job -daemon' in /scratch/rock-node on the nodes. Again,
you might want to use 'prsh'/'multissh' to do this on all nodes. If you want
to build multiple packages in parallel on one cluster node (e.g. because it
has two CPUs), you need to run './scripts/Build-Job -daemon' once for each
job you want to run on the node at the same time.

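Starting several daemons on one node can be wrapped in a small loop. A
sketch, meant to be run in /scratch/rock-node; the function name
'start_daemons' and the DRY_RUN variable are made up for this example, and by
default the commands are only printed.

```shell
# start_daemons COUNT
# Starts COUNT Build-Job daemons on this node, or just prints the commands
# while DRY_RUN is 1 (the default in this sketch).
start_daemons() {
    count=$1
    i=0
    while [ "$i" -lt "$count" ]; do
        if [ "${DRY_RUN:-1}" = 1 ]; then
            echo "./scripts/Build-Job -daemon"
        else
            ./scripts/Build-Job -daemon
        fi
        i=$((i + 1))
    done
}

start_daemons 2    # e.g. one daemon per CPU on a two-CPU node
```

On systems with GNU coreutils, 'start_daemons "$(nproc)"' would start one
daemon per available CPU.
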
"Build-Target" on the master will show you what's going on. You can view
the current status of your build from every console using the tool
'./scripts/Create-ParaStatus'. The output of the script looks like this:

    18:41 2002-05-08: --- current status ---
    Build-Job (daemon mode) running on node01 with PID 18452
    Build-Job (daemon mode) running on node02 with PID 18665
    Build-Job (daemon mode) running on node03 with PID 19618
    Job 3-kdenetwork node02 (18665) since 18:32 2002-5-08
    Job 3-kdeutils node03 (19618) since 18:41 2002-5-08
    Job 3-kdevelop node01 (18452) since 18:30 2002-5-08
    Job 3-kdebindings waiting in the job queue (priority 2)
    Job 3-kdeadmin waiting in the job queue (priority 1)
    Job 3-kde-i18n-fr waiting in the job queue (priority 1)
    Job 3-kde-i18n-es waiting in the job queue (priority 1)
    Job 3-kde-i18n-de waiting in the job queue (priority 1)
    Job 3-kdeartwork waiting in the job queue (priority 0)
    Job 3-kdeaddons waiting in the job queue (priority 0)
    18:41 2002-05-08: ----------------------

"Build-Job -daemon" on the nodes forks into the background, only printing a
one-line message with the filename of the logfile which contains the output
of the script. This logfile is in the build/ directory, which is shared
between all nodes, so you can view all logs from the master node.

6. Building with an external job scheduler
==========================================

Let's say the command for adding jobs in your job scheduler is 'addjob' and
it takes only one parameter: the command to execute. You would set the config
option 'Command for adding jobs' to the value

    addjob 'cd /scratch/rock-node ; {}'

The text {} will automatically be replaced with the Build-Job invocation for
building the package and is always in the form:

    ./scripts/Build-Job -cfg <config-name> <stagelevel>-<package-name>

So if you want to do some intelligent job scheduling (e.g. building large
packages on a faster node), you can also pass {} to another script and
have the command in $*, the config name in $3 and the stagelevel and
package name in $4.

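A sketch of the parsing step of such a script, assuming the stagelevel itself
contains no dash. The function name 'classify_job', the class names and the
list of "large" packages are made up for this example and are not part of
ROCK Linux.

```shell
# classify_job ./scripts/Build-Job -cfg <config-name> <stagelevel>-<package-name>
# Prints "<class> <config-name> <stagelevel> <package-name>".
classify_job() {
    config=$3                  # config name
    job=$4                     # <stagelevel>-<package-name>
    stage=${job%%-*}           # text before the first dash
    package=${job#*-}          # text after the first dash
    case $package in
        gcc*|glibc*|kde*) class=large ;;   # hypothetical list of big packages
        *)                class=small ;;
    esac
    echo "$class $config $stage $package"
}

classify_job ./scripts/Build-Job -cfg default 3-kde-i18n-fr
# prints: large default 3 kde-i18n-fr
```

A real wrapper would then submit "cd /scratch/rock-node ; $*" to a node
chosen according to the class, using whatever options your job scheduler
offers for that.
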
If not all jobs can be executed at once, the job scheduler should prefer
those jobs which have been submitted first. This is important to ensure that
multiple packages can always be built in parallel.

Note that './scripts/Build-Job -daemon' does not work if the 'Command for
adding jobs' config option is set. The './scripts/Create-ParaStatus' script
works as usual.